Cing
发布于 2023-05-07 / 194 阅读
0

简单 OLAP 场景数据库性能对比

一、准备环境

使用 docker 命令安装 MySQL,PostgreSQL,Clickhouse

docker run -d -p 3306:3306 --name mysql -v /docker/mysql:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=123456 mysql
docker run -d -p 5432:5432 --name postgres -v /docker/postgresql:/var/lib/postgresql/data -e POSTGRES_PASSWORD=123456 postgres
docker run -d \
    -p 8123:8123 \
    -v /docker/clickhouse:/var/lib/clickhouse/ \
    -v /docker/clickhouse/log:/var/log/clickhouse-server/ \
    --name clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server

二、建表

MySQL

create table employees
(
    emp_no     int             not null
        primary key,
    birth_date date            not null,
    first_name varchar(14)     not null,
    last_name  varchar(16)     not null,
    gender     enum ('M', 'F') not null,
    hire_date  date            not null
);

create table salaries
(
    emp_no    int  not null,
    salary    int  not null,
    from_date date not null,
    to_date   date not null,
    primary key (emp_no, from_date)
);

PostgreSQL

create table employees
(
    emp_no     integer not null
        constraint employees_pk
            primary key,
    birth_date date,
    first_name varchar(14),
    last_name  varchar(16),
    gender     char,
    hire_date  date
);

create table salaries
(
    emp_no    integer not null,
    salary    integer,
    from_date date    not null,
    to_date   date,
    constraint salaries_pk
        primary key (emp_no, from_date)
);

Clickhouse

create table employees
(
    emp_no     Int32,
    birth_date Date32,
    first_name String,
    last_name  String,
    gender     String,
    hire_date  Date32
)
    engine = MergeTree()
        primary key (emp_no);

create table salaries
(
    emp_no    Int32,
    salary    Int32,
    from_date Date32,
    to_date   Date32
)
    engine = MergeTree()
        primary key (emp_no, from_date);

select sum(salary) from salaries;

with sums as (select emp_no, sum(salary) ss from salaries group by emp_no)
select employees.emp_no, first_name, ss
from employees
         join sums on sums.emp_no = employees.emp_no
order by emp_no;

三、数据导入

数据来自 MySQL 官方的数据集,仅导入需要的数据 employees 和 salaries 即可

Release test_db 1.0.7 · datacharmer/test_db (github.com)

四、性能对比

主要测试 OLAP 的性能

测试 SQL 如下

# SQL 1
select sum(salary) from salaries;

# SQL 2
with sums as (select emp_no, sum(salary) ss from salaries group by emp_no)
select employees.emp_no, first_name, ss
from employees
         join sums on sums.emp_no = employees.emp_no
order by emp_no;

测试前均重启了 docker 容器,避免缓存提速

每条语句在每种 DB 下均跑 5 次看看平均值

另外 MySQL PostgreSQL SQL 2 均是使用 explain analyze 跑的,因为 PG 会利用缓存使 SQL 2 结果在 10ms 左右返回

MySQL

SQL 1

670 ms (execution: 638 ms, fetching: 32 ms)
477 ms (execution: 474 ms, fetching: 3 ms)
489 ms (execution: 486 ms, fetching: 3 ms)
463 ms (execution: 460 ms, fetching: 3 ms)
461 ms (execution: 457 ms, fetching: 4 ms)

SQL 2

1 s 923 ms (execution: 1 s 920 ms, fetching: 3 ms)
1 s 913 ms (execution: 1 s 909 ms, fetching: 4 ms)
1 s 906 ms (execution: 1 s 903 ms, fetching: 3 ms)
1 s 908 ms (execution: 1 s 904 ms, fetching: 4 ms)
1 s 899 ms (execution: 1 s 896 ms, fetching: 3 ms)

PostgreSQL

# SQL 1

225 ms (execution: 199 ms, fetching: 26 ms)
180 ms (execution: 177 ms, fetching: 3 ms)
188 ms (execution: 184 ms, fetching: 4 ms)
178 ms (execution: 174 ms, fetching: 4 ms)
197 ms (execution: 194 ms, fetching: 3 ms)

# SQL 2

679 ms (execution: 675 ms, fetching: 4 ms)
668 ms (execution: 664 ms, fetching: 4 ms)
680 ms (execution: 676 ms, fetching: 4 ms)
676 ms (execution: 672 ms, fetching: 4 ms)
679 ms (execution: 676 ms, fetching: 3 ms)

Clickhouse

# SQL 1

71 ms (execution: 20 ms, fetching: 51 ms)
15 ms (execution: 11 ms, fetching: 4 ms)
15 ms (execution: 12 ms, fetching: 3 ms)
13 ms (execution: 10 ms, fetching: 3 ms)
14 ms (execution: 11 ms, fetching: 3 ms)

# SQL 2

175 ms (execution: 141 ms, fetching: 34 ms)
165 ms (execution: 153 ms, fetching: 12 ms)
140 ms (execution: 128 ms, fetching: 12 ms)
142 ms (execution: 129 ms, fetching: 13 ms)
155 ms (execution: 137 ms, fetching: 18 ms)

五、结论

很明显在 OLAP 场景下 ClickHouse > PostgreSQL > MySQL