As per the documentation, pt-table-checksum is a tool to perform online replication consistency checks by executing checksum queries on the master, which produces different results on replicas that are inconsistent with the master.
The master and each slave insert checksums into the percona.checksums
table, and these are later compared for differences. It’s fairly obvious that the checksums need to be determined independently on each node, and so these inserts must be replicated as STATEMENT and not ROW. Otherwise, the slaves would just insert the same checksum as the master and not calculate it independently.
The tool only requires
binlog_format=STATEMENT
for its own session. It sets this itself on the master, but will error if this isn’t already set on each slave node. The reason for the error is that the statement to change the session’s binlog_format
will not be replicated. So if a slave has binlog_format=ROW
then the slave itself will execute the checksum correctly, but the results will be written as a ROW. Any slaves of this slave in the chain will just insert the same result. See bug 899415.
This is only a problem if we have chained replication, and the error can be skipped with --no-check-binlog-format
so for simple replication setups with ROW or MIXED replication we can still use the tool. If we do not have a slave-of-slave style chained topology, then there’s no need to worry about this.
However, there is one caveat to be aware of if you’re using replication filters: when a slave isn’t replicating a particular database due to binlog-ignore-db
, this setting behaves differently with ROW based replication (RBR) vs. STATEMENT based.
Per the documentation, with RBR,
binlog-ignore-db=testing
will cause all updates to testing.* to be skipped. With STATEMENT-based replication it will cause all updates after
USE test_database;
to be ignored (regardless of where the updates were being written to).
pt-table-checksum operates in the following way:
use `testing`/*!*/; SET TIMESTAMP=1541583280/*!*/; REPLACE INTO `percona`.`checksums` (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT ‘testing', 'testing', '1', NULL, NULL, NULL, COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `id`, CONCAT(ISNULL(`id`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `testing`.`testing` /*checksum table*/
Due to the use testing
the slave will then skip these statements with no errors, and simply not write into percona.checksums
.
As per the documentation:
The tool monitors replicas continually. If any replica falls too far behind in replication, pt-table-checksum pauses to allow it to catch up. If any replica has an error, or replication stops, pt-table-checksum pauses and waits.
In this case, you will see the tool continually wait, with the following debug output:
# pt_table_checksum:12398 10967 SELECT MAX(chunk) FROM `percona`.`checksums` WHERE db=‘testing’ AND tbl=‘testing’ AND master_crc IS NOT NULL # pt_table_checksum:12416 10967 Getting last checksum on slave1 # pt_table_checksum:12419 10967 slave1 max chunk: undef # pt_table_checksum:12472 10967 Sleep 0.25 waiting for chunks # pt_table_checksum:12416 10967 Getting last checksum on slave1 # pt_table_checksum:12419 10967 slave1 max chunk: undef # pt_table_checksum:12472 10967 Sleep 0.5 waiting for chunks # pt_table_checksum:12416 10967 Getting last checksum on slave1 # pt_table_checksum:12419 10967 slave1 max chunk: undef
We don’t recommend using the tool with replication filters in place, but if --no-check-replication-filters
is specified you should be aware of the differences in how different binlog formats handle these filters.
One workaround would be to replace
binlog-ignore-db=testing
With the following which will just ignore writes to that database:
binlog-wild-ignore-table=testing.%
More resources
You can read more about pt-table-checksum in this blog post MySQL Replication Primer with pt-table-checksum and pt-table-sync
The latest version of Percona Toolkit can be downloaded from our website. All Percona software is open source and free.
Photo by Black ice from Pexels