In this blog, I will cover important aspects which you need to test when benchmarking replication setup. MySQL has great tools that could be used to test its performance. They include:
sysbench – https://github.com/akopytov/sysbench
BMK-kit – http://dimitrik.free.fr/blog/posts/mysql-perf-bmk-kit.html
mysqlslap – https://dev.mysql.com/doc/refman/8.0/en/mysqlslap.html
LinkBench – https://github.com/facebookarchive/linkbench
I will not describe how to use them here, as you can find instructions on the provided links or in the Percona blog by browsing tags sysbench, benchmark, benchmarks, and the category benchmarks.
However, most of the instructions take into account standalone MySQL servers. If you want to test how your replication setup performs, you may want to perform additional checks.
Can the replica catch up to the source server?
To answer this question, you need to run a load on the source machine and monitor the replica delay. You can do it by examining the
SHOW REPLICA STATUS
(
SHOW SLAVE STATUS
) output, particularly the value of the
Seconds_Behind_Source
(
Seconds_Behind_Master
) parameter. The same parameter is monitored by the “MySQL Replication Lag” graph under the “MySQL Replication Summary” dashboard in Percona Monitoring and Management (PMM).
The more advanced check would be to compare
Exec_Source_Log_Pos
(
Exec_Master_Log_Pos
),
Relay_Log_File
with
Read_Source_Log_Pos
(
Read_Master_Log_Pos
),
Source_Log_File
pairs since
Seconds_Behind_Source
may be affected by long-running commands and return wrong values.
You may tune your replica server by adding parallelization if you see increasing lag. Check also how option
binlog_transaction_dependency_tracking
Can the replica run queries while applying updates from the source server?
Once you are happy with the replication lag, you may start adding the load you plan to run on this replica and test its performance. For example, if you plan to use the replica for handling read-only queries, you can run a read-only benchmark on it. You need to watch the performance of the read-only benchmark and replication lag.
If you plan to run analytical queries on the replica: test them. If you plan to use this replica for taking backups: take the backup and measure how it will affect the performance of the replication.
Synchronous replication
Synchronous replication solutions such as Galera, Percona XtraDB Cluster (PXC), and Group Replication are affected by the flow control paused time. You can use PMM to monitor it.
For PXC/Galera clusters, watch graph “Flow Control Paused Time” under “PXC/Galera Cluster Summary” dashboard.
Or you can watch the value of the
wsrep_flow_control_paused
variable. See Galera Flow Control in Percona XtraDB Cluster for MySQL for more details.
For Group Replication, you need to check the value of the
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE
column in the table
performance_schema.replication_group_member_stats
. That shows how many transactions are waiting in the queue on the secondary node to apply.
You may also watch “Replication Lag” graph under “MySQL Group Replication” dashboard in PMM
Your best test is your production
As said by Dimitri Kravtchuk at Fosdem 2020:
While it is hard to create such a benchmark on the server that accepts writes from users, for the replica, it is comparatively easy. All you need to do is to create a replica of your production server and not make it available for the application yet. First, watch if it can handle replication lag while idle, then start adding load to it. You may also use the mirroring feature of ProxySQL Note; however, you need to check if some of the limitations described in this post, still apply.
How fast will the replica catch up?
One more thing you may want to test is how fast the replica will recover after a delay in replicating updates from the source server. To perform this test, stop the replica while running the source server’s load, restart it after a certain amount of time, and watch replication lag. If you are testing a replica that serves read-only load, let the load test continue.
It is better to perform stop, wait, and start in one series of commands, so you will have control over how long the replica was stopped. For example, to stop replica for one hour, use the following commands:
STOP REPLICA; SELECT SLEEP(3600); START REPLICA;
Then wait until the replica is running:
SELECT SERVICE_STATE FROM performance_schema.replication_applier_status;
Once this query returns ON, start monitoring the value of
Seconds_Behind_Source
in the
SHOW REPLICA STATUS
output until it becomes 0. Once it happens, record the time. This way, you will know how much time will be needed to apply one hour of changes.
Conclusion
You can use the same benchmarking tools that are recommended for standalone MySQL servers to perform performance tests against replication setups. You only need to monitor replication lag together with other performance characteristics.