In this blog post, we'll look at whether the Meltdown fix affects the performance of MySQL on bare metal servers.
Since the news of the Meltdown bug broke, there have been many reports on the performance hit from the proposed fixes. We have looked at how the fix affects MySQL (Percona Server for MySQL) under a sysbench workload.
In this case, we used bare metal boxes with the following specifications:
- Two-socket Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz (in total 56 entries in /proc/cpuinfo)
- Ubuntu 16.04
- Memory: 256GB
- Storage: Samsung SM863 1.9TB SATA SSD
- Percona Server for MySQL 5.7.20
- Kernel (vulnerable) 4.13.0-21
- Kernel (with Meltdown fix) 4.13.0-25
Please note, the current kernel for Ubuntu 16.04 contains only a Meltdown fix, and not one for Spectre.
We performed the validation with the https://github.com/speed47/spectre-meltdown-checker tool. The database size is 100GB, in a sysbench workload with 100 tables of 4 million rows each, using a Pareto distribution.
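For reference, here is a sketch of how the status of the running kernel can be checked with that tool (the exact output depends on the checker version and kernel build):

```
# fetch and run the checker against the running kernel (root gives the most detail)
git clone https://github.com/speed47/spectre-meltdown-checker.git
cd spectre-meltdown-checker
sudo ./spectre-meltdown-checker.sh

# on a kernel with the KPTI patch, dmesg typically reports the page table isolation status
uname -r
dmesg | grep -i isolation
```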
We used both a UNIX socket connection and a TCP host connection, to measure the possible overhead from the TCP network connection. We also ran read-write and read-only benchmarks.
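The dataset itself can be created with a sysbench prepare step along these lines (a sketch; the thread count is illustrative, and the actual run commands we used are listed at the end of the post):

```
# create 100 tables of 4,000,000 rows each (~100GB) in the sbtest100t4M database
sysbench oltp_read_write.lua --mysql-socket=/tmp/mysql.sock --mysql-user=root \
  --mysql-db=sbtest100t4M --tables=100 --table-size=4000000 --threads=56 prepare
```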
The results are below for various numbers of threads:
Where:
- nokpti: kernel without the KPTI patch (4.13.0-21)
- pti: kernel with the KPTI patch (4.13.0-25), with PTI enabled
- nopti: kernel with the KPTI patch (4.13.0-25), with PTI disabled (see the snippet after this list for how PTI can be toggled)
- bp: whether the dataset fits into the InnoDB buffer pool (in-memory) or not (io-bound)
- nopti_pct, pti_pct: the nokpti result relative to the nopti and pti results, respectively, in percent
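For the nopti runs, PTI can be switched off on the patched kernel via a boot parameter; a minimal sketch, assuming a standard Ubuntu GRUB setup:

```
# add "nopti" (or "pti=off") to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub,
# then regenerate the GRUB config and reboot into the same patched kernel
sudo update-grub
sudo reboot

# after reboot, dmesg indicates whether page table isolation is active
dmesg | grep -i isolation
```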
row | testname | bp | socket | threads | pti | nopti | nokpti | nopti_pct | pti_pct |
---|---|---|---|---|---|---|---|---|---|
1 | OLTP_RO | in-memory | tcp_socket | 1 | 709.93 | 718.47 | 699.50 | -2.64 | -1.47 |
4 | OLTP_RO | in-memory | tcp_socket | 8 | 5473.05 | 5500.08 | 5483.40 | -0.30 | 0.19 |
3 | OLTP_RO | in-memory | tcp_socket | 64 | 21716.18 | 22036.98 | 21548.46 | -2.22 | -0.77 |
2 | OLTP_RO | in-memory | tcp_socket | 128 | 21606.02 | 22010.36 | 21548.62 | -2.10 | -0.27 |
5 | OLTP_RO | in-memory | unix_socket | 1 | 750.41 | 759.33 | 776.88 | 2.31 | 3.53 |
8 | OLTP_RO | in-memory | unix_socket | 8 | 5851.80 | 5896.86 | 5986.89 | 1.53 | 2.31 |
7 | OLTP_RO | in-memory | unix_socket | 64 | 23052.10 | 23552.26 | 23191.48 | -1.53 | 0.60 |
6 | OLTP_RO | in-memory | unix_socket | 128 | 23215.38 | 23602.64 | 23146.42 | -1.93 | -0.30 |
9 | OLTP_RO | io-bound | tcp_socket | 1 | 364.03 | 369.68 | 370.51 | 0.22 | 1.78 |
12 | OLTP_RO | io-bound | tcp_socket | 8 | 3205.05 | 3225.21 | 3210.63 | -0.45 | 0.17 |
11 | OLTP_RO | io-bound | tcp_socket | 64 | 15324.66 | 15456.44 | 15364.25 | -0.60 | 0.26 |
10 | OLTP_RO | io-bound | tcp_socket | 128 | 17705.29 | 18007.45 | 17748.70 | -1.44 | 0.25 |
13 | OLTP_RO | io-bound | unix_socket | 1 | 421.74 | 430.10 | 432.88 | 0.65 | 2.64 |
16 | OLTP_RO | io-bound | unix_socket | 8 | 3322.19 | 3367.46 | 3367.34 | -0.00 | 1.36 |
15 | OLTP_RO | io-bound | unix_socket | 64 | 15977.28 | 16186.59 | 16248.42 | 0.38 | 1.70 |
14 | OLTP_RO | io-bound | unix_socket | 128 | 18729.10 | 19111.55 | 18962.02 | -0.78 | 1.24 |
17 | OLTP_RW | in-memory | tcp_socket | 1 | 490.76 | 495.21 | 489.49 | -1.16 | -0.26 |
20 | OLTP_RW | in-memory | tcp_socket | 8 | 3445.66 | 3459.16 | 3414.36 | -1.30 | -0.91 |
19 | OLTP_RW | in-memory | tcp_socket | 64 | 11165.77 | 11167.44 | 10861.44 | -2.74 | -2.73 |
18 | OLTP_RW | in-memory | tcp_socket | 128 | 12176.96 | 12226.17 | 12204.85 | -0.17 | 0.23 |
21 | OLTP_RW | in-memory | unix_socket | 1 | 530.08 | 534.98 | 540.27 | 0.99 | 1.92 |
24 | OLTP_RW | in-memory | unix_socket | 8 | 3734.93 | 3757.98 | 3772.17 | 0.38 | 1.00 |
23 | OLTP_RW | in-memory | unix_socket | 64 | 12042.27 | 12160.86 | 12138.01 | -0.19 | 0.80 |
22 | OLTP_RW | in-memory | unix_socket | 128 | 12930.34 | 12939.02 | 12844.78 | -0.73 | -0.66 |
25 | OLTP_RW | io-bound | tcp_socket | 1 | 268.08 | 270.51 | 270.71 | 0.07 | 0.98 |
28 | OLTP_RW | io-bound | tcp_socket | 8 | 1585.39 | 1589.30 | 1557.58 | -2.00 | -1.75 |
27 | OLTP_RW | io-bound | tcp_socket | 64 | 4828.30 | 4782.42 | 4620.57 | -3.38 | -4.30 |
26 | OLTP_RW | io-bound | tcp_socket | 128 | 5158.66 | 5172.82 | 5321.03 | 2.87 | 3.15 |
29 | OLTP_RW | io-bound | unix_socket | 1 | 280.54 | 282.06 | 282.35 | 0.10 | 0.65 |
32 | OLTP_RW | io-bound | unix_socket | 8 | 1582.69 | 1584.52 | 1601.26 | 1.06 | 1.17 |
31 | OLTP_RW | io-bound | unix_socket | 64 | 4519.45 | 4485.72 | 4515.28 | 0.66 | -0.09 |
30 | OLTP_RW | io-bound | unix_socket | 128 | 5524.28 | 5460.03 | 5275.53 | -3.38 | -4.50 |
As you can see, there is very little difference between runs (within a 3-4% range), which falls within the run-to-run variance of the test.
Similar experiments were done on different servers and workloads:
- https://github.com/Percona-Lab-results/201701-meltdown-OLTP-ro/blob/master/smperf-ubuntu-results.md
- https://github.com/Percona-Lab-results/201701-meltdown-OLTP-ro/blob/master/centos-res.md
There, too, we see a negligible difference that falls within the measurement variance.
Overhead analysis
To understand why we do not see much effect in MySQL (InnoDB workloads), let's take a look at where we expect to see the overhead from the proposed fix.
The main overhead is expected from system calls, so let's test syscall execution on the kernel before and after the fix (thanks to Alexey Kopytov for the idea of how to test this with sysbench).
We will use the following script syscall.lua:
    -- syscall.lua: each sysbench event executes 10,000 cheap system calls
    -- (sysbench's LuaJIT environment may already expose ffi; requiring it explicitly is harmless)
    local ffi = require("ffi")

    -- declare the libc syscall() wrapper so it can be called directly
    ffi.cdef[[long syscall(long, long, long, long);]]

    function event()
       for i = 1, 10000 do
          -- syscall 0 (read() on x86_64) with zero-length arguments returns immediately,
          -- so this measures little more than the user/kernel round trip
          ffi.C.syscall(0, 0, 0, 0)
       end
    end
Basically, we measure the time for executing 10000 system calls (this will be one event).
To run the benchmark:
sysbench syscall.lua --time=60 --report-interval=1 run
The results are as follows:
- On the kernel without the fix (4.13.0-21): 455 events/sec
- On the kernel with the fix (4.13.0-26): 250 events/sec
This means the time to execute 10,000 system calls increased from roughly 2.2ms (1000ms / 455 events) to 4ms (1000ms / 250 events).
While this increase looks significant, it does not have much effect on MySQL (with the InnoDB engine), where most system calls are made for IO or network communication.
We can assume that executing 10,000 IO events on fast storage takes about 1000ms, so adding an extra ~2ms for the system calls adds roughly 0.2% overhead (which is practically invisible in MySQL workloads).
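As a quick sanity check, here is the same arithmetic with the numbers from the syscall benchmark above (a rough estimate, not a measurement):

```
# back-of-the-envelope: per-event syscall cost before/after the fix, and overhead vs ~1000ms of IO
awk 'BEGIN {
  before = 1000 / 455;  # ms per 10,000 syscalls on the unpatched kernel
  after  = 1000 / 250;  # ms per 10,000 syscalls on the patched kernel
  printf "10,000 syscalls: %.2f ms -> %.2f ms (+%.2f ms)\n", before, after, after - before
  printf "overhead against ~1000 ms of IO time: %.2f%%\n", (after - before) / 1000 * 100
}'
```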
I expect the effect to be much more visible if we work with MyISAM tables cached in OS memory: with MyISAM, even data cached by the OS is still accessed through system calls, so the syscall overhead would show up much more clearly when accessing data in memory.
Conclusion:
From our results, we do not see a measurable effect from the KPTI patches (which mitigate the Meltdown vulnerability) when running on bare metal servers with Ubuntu 16.04 and the 4.13 kernel series.
Reference commands and configs:
RO:
sysbench oltp_read_only.lua {--mysql-socket=/tmp/mysql.sock|--mysql-host=127.0.0.1} --mysql-user=root --mysql-db=sbtest100t4M --rand-type=pareto --tables=100 --table-size=4000000 --num-threads=$threads --report-interval=1 --max-time=180 --max-requests=0 run
RW:
sysbench oltp_read_write.lua {--mysql-socket=/tmp/mysql.sock|--mysql-host=127.0.0.1} --mysql-user=root --mysql-db=sbtest100t4M --rand-type=pareto --tables=100 --table-size=4000000 --num-threads=$threads --report-interval=1 --max-time=180 --max-requests=0 run
mysqld:
Percona Server 5.7.20-19
numactl --physcpubind=all --interleave=all /usr/bin/env LD_PRELOAD=/data/opt/alexey.s/bin64_5720.ps/lib/mysql/libjemalloc.so.1 ./bin/mysqld --defaults-file=/data/opt/alexey.s/my-perf57.cnf --basedir=. --datadir=/data/sam/sbtest100t4M --user=root --innodb_flush_log_at_trx_commit=1 --innodb-buffer-pool-size=150GB --innodb-log-file-size=10G --innodb-buffer-pool-instances=8 --innodb-io-capacity-max=20000 --innodb-io-capacity=10000 --loose-innodb-page-cleaners=8 --ssl=0
My.cnf file:
    [mysqld]
    user=root
    port=3306
    innodb_status_file=0
    innodb_data_file_path=ibdata1:100M:autoextend
    innodb_flush_log_at_trx_commit = 1
    innodb_file_per_table = true
    innodb_log_buffer_size = 128M
    innodb_log_file_size = 10G
    innodb_log_files_in_group = 2
    innodb_write_io_threads=8
    innodb_read_io_threads=8
    innodb_io_capacity=15000
    innodb_io_capacity_max=25000
    innodb_lru_scan_depth=8192
    #innodb_buffer_pool_size=${BP}G
    innodb_doublewrite=1
    innodb_thread_concurrency=0
    innodb-checksum-algorithm=strict_crc32
    innodb_flush_method=O_DIRECT_NO_FSYNC
    innodb_purge_threads=8
    loose-innodb-page-cleaners=8
    innodb_buffer_pool_instances=8
    innodb_change_buffering=none
    innodb_adaptive_hash_index=OFF
    sync_binlog=0
    max_connections=5000
    table_open_cache=5000
    query_cache_type=OFF
    thread_cache_size=16
    back_log=2000
    connect_timeout=15
    skip-grant-tables
    sort_buffer_size=262144
    key_buffer_size=8388608
    join_buffer_size=262144
    server-id=1
    max_connections=50000
    skip_name_resolve=ON
    max_prepared_stmt_count=1048560
    performance_schema=OFF
    performance-schema-instrument='wait/synch/%=ON'
    innodb_monitor_enable=all
    innodb_flush_neighbors=0
    metadata_locks_hash_instances=256
    table_open_cache_instances=64