龙空技术网

又一个选择PostgreSQL的理由:单表1亿性能对比

迷路的架构师 3569

前言:

现时大家对“单表上亿数据用什么数据库”大约比较珍视,朋友们都想要分析一些“单表上亿数据用什么数据库”的相关资讯。那么小编在网络上收集了一些对于“单表上亿数据用什么数据库””的相关文章,希望咱们能喜欢,同学们一起来学习一下吧!

今天测试一下 1 亿条数据,MySQL 和 PostgreSQL 的性能表现。说明下,只是做一些基本的测试,并没有用一些数据库 Benchmark 工具进行测试。

准备

建表语句:

CREATE TABLE user_mysql / user_postgresql (                                 id SERIAL PRIMARY KEY,                                 username VARCHAR(50),                                 email VARCHAR(100),                                 password VARCHAR(100),                                 first_name VARCHAR(50),                                 last_name VARCHAR(50),                                 address VARCHAR(200),                                 city VARCHAR(50),                                 state VARCHAR(50),                                 zip_code VARCHAR(10),                                 country VARCHAR(50),                                 phone_number VARCHAR(50),                                 date_of_birth DATE,                                 gender VARCHAR(10),                                 occupation VARCHAR(100),                                 education_level VARCHAR(50),                                 registration_date TIMESTAMP,                                 last_login TIMESTAMP,                                 is_active BOOLEAN,                                 is_admin BOOLEAN,                                 additional_field1 VARCHAR(100),                                 additional_field2 VARCHAR(100));

接下来记录一下相关数据。

1.插入耗时MySQL:≈ 67分钟PostgreSQL:≈ 55分钟2.count(*)耗时

MySQL:45 s 877 ms,明细如下:

mydatabase> select count(*) from user_mysql[2023-09-26 22:22:24] 1 row retrieved starting from 1 in 45 s 877 ms (execution: 45 s 767 ms, fetching: 110 ms)

PostgreSQL:8 s 169 ms,明细如下:

postgres.public> select count(*) from user_postgresql[2023-09-26 22:24:08] 1 row retrieved starting from 1 in 8 s 169 ms (execution: 8 s 133 ms, fetching: 36 ms)

1亿数据量

3.根据主键查询数据

MySQL:47 ms,明细如下:

mydatabase> select * from user_mysql where id = 19279833[2023-09-26 22:28:10] 1 row retrieved starting from 1 in 47 ms (execution: 16 ms, fetching: 31 ms)

PostgreSQL:46 ms,明细如下:

postgres.public> select * from user_postgresql where id = 19279833[2023-09-26 22:29:51] 1 row retrieved starting from 1 in 46 ms (execution: 15 ms, fetching: 31 ms)
4.根据username查询(无索引)

MySQL:1 m 56 s 986 ms,明细如下:

// 查询第99279833行数据mydatabase> select * from user_mysql where username = '10190439674'[2023-09-26 22:36:09] 1 row retrieved starting from 1 in 1 m 56 s 986 ms (execution: 1 m 56 s 939 ms, fetching: 47 ms)

PostgreSQL:38 s 73 ms,明细如下:

// 同样查询第99279833行数据postgres.public> select * from user_postgresql where username = '14998727834'[2023-09-26 22:38:25] 1 row retrieved starting from 1 in 38 s 73 ms (execution: 38 s 18 ms, fetching: 55 ms)
5.创建索引耗时

MySQL创建B+TREE索引:5 m 31 s 276 ms,明细如下:

mydatabase> ALTER TABLE user_mysql ADD INDEX idx_name (username)[2023-09-26 22:47:37] completed in 5 m 31 s 276 ms

PostgreSQL创建B-TREE索引:9 m 20 s 847 ms,明细如下:

postgres.public> CREATE INDEX idx_name ON user_postgresql (username)[2023-09-26 22:57:59] completed in 9 m 20 s 847 ms
6.根据username查询(有索引)

MySQL:93 ms,明细如下:

// 查询第99279833行数据mydatabase> select * from user_mysql where username = '10190439674'[2023-09-26 23:01:48] 1 row retrieved starting from 1 in 93 ms (execution: 0 ms, fetching: 93 ms)

PostgreSQL:63 ms,明细如下:

// 同样查询第99279833行数据postgres.public> select * from user_postgresql where username = '14998727834'[2023-09-26 23:00:07] 1 row retrieved starting from 1 in 63 ms (execution: 0 ms, fetching: 63 ms)
7.根据username修改(有索引)

MySQL:16 ms,明细如下:

mydatabase> update user_mysql set email='myemail' where username = '10190439674'[2023-09-26 23:06:05] 1 row affected in 16 ms

PostgreSQL:15 ms,明细如下:

postgres.public> update user_postgresql set email='myemail' where username = '14998727834'[2023-09-26 23:07:13] 1 row affected in 15 ms
8.分页查询(不加条件)

MySQL:1 m 40 s 265 ms,明细如下:

mydatabase> select * from user_mysql limit 89999980, 20[2023-09-26 23:10:54] 20 rows retrieved starting from 1 in 1 m 40 s 265 ms (execution: 1 m 40 s 234 ms, fetching: 31 ms)

PostgreSQL:27 s 750 ms,明细如下:

postgres.public> select * from user_postgresql limit 20 offset 89999980[2023-09-26 23:12:32] 20 rows retrieved starting from 1 in 27 s 750 ms (execution: 27 s 688 ms, fetching: 62 ms)
9.分页查询(加条件,条件为索引)

MySQL:94 ms,明细如下:

mydatabase> select * from user_mysql where id >= 89999980 limit 20[2023-09-26 23:13:34] 20 rows retrieved starting from 1 in 94 ms (execution: 0 ms, fetching: 94 ms)

PostgreSQL:78 ms,明细如下:

postgres.public> select * from user_postgresql where id >= 89999980 limit 20[2023-09-26 23:14:12] 20 rows retrieved starting from 1 in 78 ms (execution: 0 ms, fetching: 78 ms)
总结

在数据量达到1亿时,数据库操作的开销都会比较大,尤其是不走索引的操作和DDL操作等。因此在生产环境时,不建议数据量太大,数据库特别大的情况下,建议使用更强大的数据库,不建议分表分库。对大表进行DDL操作时也需要谨慎操作。

声明:这些数据均为本机测试,并未用专业测试软件测试,仅供参考。

标签: #单表上亿数据用什么数据库