Percona Operator for MySQL, based on Percona XtraDB Cluster (PXC), provides configuration to expose the ProxySQL, HAProxy, and the PXC instances themselves so that you can use Percona Toolkit to connect to MySQL from these endpoints externally. However, you can also create Kubernetes pods to run Percona Toolkit operations there instead if you do not […]
Online Schema Changes on Tables with Foreign Keys in MySQL
pt-online-schema-change is an amazing tool for assisting in table modifications in cases where ONLINE ALTER is not an option. But if you have foreign keys, this could be an interesting and important read for you. Tables with foreign keys are always complicated, and they have to be handled with care. The use case I am […]
eu-stack Support and Other Improvements in pt-pmp of Percona Toolkit
pt-pmp is a profiler tool that creates and summarizes full stack traces of processes on Linux. It was inspired by http://poormansprofiler.org and helped Percona Support resolve many performance issues. In this blog post, I will present an improved pt-pmp that can collect stack traces with minimal impact on the production environment. TLDR; Starting from Percona Toolkit […]
Partial Data Archiving and Schema Change
Starting from Percona Toolkit 3.6.0, pt-online-schema-change supports the option --where, so you can not only alter your table definition live but also copy only the rows that satisfy certain criteria. You may not need to copy all the data when changing a table definition. For example, if the table is too big and you […]
Resume Your Failed pt-online-schema-change Job
Starting from Percona Toolkit 3.6.0, you can resume pt-online-schema-change if it was interrupted. This blog describes the prerequisites and usage of the new --resume option. To restart the job, you need to know where it failed. This is why the first option you must use is --history. It instructs pt-online-schema-change to store its progress in […]
What’s New in Percona Toolkit 3.6.0
Percona Toolkit 3.6.0 was released on June 12, 2024. The most important updates in this version are:
- The possibility to resume pt-online-schema-change if it is interrupted
- eu-stack support in pt-pmp that significantly improves this tool's performance and decreases the load it causes on production servers
- New tool pt-eustack-resolver
- Packages for Ubuntu 24.04 (Noble Numbat) […]
Seamless Table Modifications: Leveraging pt-online-schema-change for Online Alterations
Table modifications are a routine task for database administrators. The blog post Using Percona Toolkit to Alter Database Tables Online: A Controlled Approach provides insights into the process of altering tables online in a controlled manner, ensuring uninterrupted access for application users and preventing application downtime. We will focus here on utilizing the powerful “pt-online-schema-change” […]
What’s New in Percona Toolkit 3.5.6
Percona Toolkit 3.5.6 was released on December 21, 2023. This blog post covers the main changes in this release.
New tool: pt-galera-log-explainer
We continue adding more tools for Support teams.
pt-galera-log-explainer was written by my Percona Support colleague Yoann La Cancellera.
pt-galera-log-explainer filters, aggregates, and summarizes multiple Galera library logs together. The purpose of this tool is to help find useful information in Percona XtraDB Cluster/Galera library logs.
pt-galera-log-explainer takes raw log files, which are usually very verbose, like the ones available in the regression test suite, and makes a short summary out of them:
$ ./bin/pt-galera-log-explainer list --all src/go/pt-galera-log-explainer/tests/logs/merge_rotated_daily/node1.20230315.log
identifier                        node1
current path                      .../tests/logs/merge_rotated_daily/node1.20230315.log
last known ip
last known name                   node1
mysql version

2023-03-15T20:10:57.784904+02:00  node2 joined
2023-03-15T20:10:57.785568+02:00  node3 left
2023-03-15T20:10:57.791959+02:00  node3 left
2023-03-15T20:10:57.797221+02:00  PRIMARY(n=2)
2023-03-15T20:20:12.714291+02:00  node2 joined
2023-03-15T20:20:12.714331+02:00  node3 joined
2023-03-15T20:20:13.776977+02:00  PRIMARY(n=3)
2023-03-15T20:20:14.839684+02:00  local node will resync node3
2023-03-15T20:20:14.839723+02:00  SYNCED -> DONOR
2023-03-15T20:20:15.799020+02:00  IST will be used
2023-03-15T20:20:16.850525+02:00  finished sending IST to node3
2023-03-15T20:20:16.850549+02:00  DESYNCED -> JOINED
2023-03-15T20:20:16.865312+02:00  JOINED -> SYNCED
The tool can process logs from multiple nodes and draw a timeline. You can find a usage example with sample output in the user reference manual. You can also filter events to have a more compact output.
pt-galera-log-explainer is an excellent tool for analyzing the large log files produced by the Galera library.
Better macOS and ARM support
This release contains a few improvements for platforms and operating systems that Percona does not officially support.
PR-516, contributed by Ivan Kruglov, makes pt-online-schema-change and other tools respect case-insensitive lookup on Windows and macOS: the default option for these operating systems.
PR-720 simplifies the build process for Percona Toolkit. Historically, Percona Toolkit was written in the Perl and Shell programming languages. Build instructions for all code were simple:
perl Makefile.PL
make
make test     (optional)
make install
Since the first tool written in the Go programming language was introduced, package maintainers had to perform one extra step:
cd src/go
make <PLATFORM such as linux-amd64>
Now, this extra step is optional because the top-level Makefile has instructions for building Go tools on the current platform.
PR-712 adds the darwin-arm64 platform to the list of platforms for which Go binaries can be built. While we do not officially support ARM and macOS, you can build the tools yourself without extra effort.
To create macOS ARM binaries on the same platform, simply run the following from the top-level directory.
perl Makefile.PL
make
If you want to create binaries on another platform, change the directory to src/go, then run
make darwin-arm64
We also have unofficial ARM packages for Percona Toolkit and PMM Dump in our labs at https://github.com/Percona-Lab/percona-on-arm.
Tools improvements
Explain output for slow queries in the JSON report of pt-query-digest
Earlier, pt-query-digest did not print EXPLAIN output with the option --output=json. PR-471, sent by Ayush Goyal, adds this possibility.
Hook before_die for pt-online-schema-change
PR-509, sent by Ilaria Migliozzi, introduces a new hook for pt-online-schema-change. If pt-online-schema-change stops ungracefully, this hook lets you print diagnostic information about the failing operation. You can find an example plugin on GitHub.
Option --unstop for pt-archiver
pt-archiver supports the option --stop, which terminates running instances by creating a sentinel file. However, there was no option to remove this file and restart pt-archiver. PR-429 by fraff closes this gap by adding the option --unstop. If pt-archiver is called with this option, it removes the sentinel file and restarts the operation.
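The sentinel mechanism itself is just file-based signaling between processes. A minimal shell sketch of what --stop and --unstop amount to (an illustration only, not the tool's code; the sentinel path used here is hypothetical, while pt-archiver's actual path is controlled by its --sentinel option):

```shell
# Illustration of sentinel-file signaling (not pt-archiver itself).
SENTINEL="${TMPDIR:-/tmp}/pt-archiver-sentinel"

touch "$SENTINEL"            # what --stop does: create the sentinel file
if [ -f "$SENTINEL" ]; then
    # a running archiver checks for this file between chunks and exits
    echo "sentinel present: running archiver instances will stop"
fi

rm -f "$SENTINEL"            # what --unstop does: remove the sentinel file
if [ ! -f "$SENTINEL" ]; then
    echo "sentinel removed: the archiver can be started again"
fi
```

Because the sentinel is a plain file, --stop can terminate instances started elsewhere, and --unstop simply clears the flag before restarting the job.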
Quality improvements
In my last Percona Toolkit release blog, What’s New in Percona Toolkit 3.5.5, I wrote about how we plan to make regression tests part of the release process. This partially happened, and we are testing Percona Toolkit with Percona Server for MySQL 5.7 and 8.0 on all supported platforms now. There are still a few issues with tests, as reported at PT-2295. We are planning to fix them and then add support for other products.
Viktor Szépe continued to contribute to the quality of the Percona Toolkit source code. He introduced EditorConfig for our GitHub repository, fixed typos, and made style improvements for Go code.
In addition to Kushal Haldar’s reports about vulnerabilities in Go, we enabled better automation for our GitHub repository. This release was built with the latest version of Go and fixes all known module vulnerabilities.
Percona Toolkit Docker images
We released an official Docker image for the Percona Toolkit at https://hub.docker.com/r/percona/percona-toolkit. It can be installed using the following command.
docker pull percona/percona-toolkit
To call any tool using Docker, run the following.
docker run percona/percona-toolkit <TOOL NAME> <OPTIONS>
For example:
$ docker run --network="host" percona/percona-toolkit pt-online-schema-change \
> h=127.0.0.1,P=12345,u=msandbox,p=msandbox,D=test,t=t1 --alter='ADD COLUMN f2 INT' \
> --execute
Found 2 slaves:
s76 -> 127.0.0.1:12346
s76 -> 127.0.0.1:12347
Will check slave lag on:
s76 -> 127.0.0.1:12346
s76 -> 127.0.0.1:12347
Operation, tries, wait:
  analyze_table, 10, 1
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Altering `test`.`t1`...
Creating new table...
Created new table test._t1_new OK.
Altering new table...
Altered `test`.`_t1_new` OK.
2023-12-21T15:29:22 Creating triggers...
2023-12-21T15:29:22 Created triggers OK.
2023-12-21T15:29:22 Copying approximately 1 rows...
2023-12-21T15:29:22 Copied rows OK.
2023-12-21T15:29:22 Analyzing new table...
2023-12-21T15:29:22 Swapping tables...
2023-12-21T15:29:22 Swapped original and new tables OK.
2023-12-21T15:29:22 Dropping old table...
2023-12-21T15:29:22 Dropped old table `test`.`_t1_old` OK.
2023-12-21T15:29:22 Dropping triggers...
2023-12-21T15:29:22 Dropped triggers OK.
Successfully altered `test`.`t1`
Docker images are not only convenient for those who prefer to install software via Docker; they will also allow us to easily ship Percona Toolkit together with other software, such as Percona Monitoring and Management or Percona Kubernetes Operators.
Community contributions
This release includes many contributions from community members. We want to thank:
- Ivan Kruglov for fixing case-sensitivity issues in pt-online-schema-change and other tools
- Chrys Swingler for fixing a bug with the option --skip-check-slave-lag in pt-online-schema-change and pt-table-checksum
- Ayush Goyal for adding EXPLAIN output for slow queries in the JSON report of pt-query-digest
- Ilaria Migliozzi for the new hook before_die for pt-online-schema-change
- Jakob for fixing pt-archiver bug PT-2064
- Viktor Szépe for introducing EditorConfig and fixing typos and issues in the Go code
- fraff for the option --unstop for pt-archiver
- Zongzhi Chen for fixing pt-pmp bug PT-2211
- Jason Ng for fixing pt-visual-explain bug PT-2277
- Kushal Haldar for his reports about vulnerabilities in Go-based tools
Percona Toolkit: Free your DBAs with advanced open source command-line tools.
How to Use Percona Toolkit’s pt-table-sync for Replica Tables With Triggers in MySQL
In Percona Managed Services, we manage Percona Server for MySQL, Community MySQL, and MariaDB. Sometimes, the replica server might have replication errors, and the replica might be out of sync with the primary. In this case, we can use Percona Toolkit's pt-table-checksum and pt-table-sync to check the data drift between primary and replica servers and bring the replica back in sync with the primary. This blog gives you some ideas on using pt-table-sync for replica tables with triggers.
In my lab, we have two test nodes with replication set up; both servers run Debian 11 and have Percona Server for MySQL 8.0.33 (with Percona Toolkit) installed.
The PRIMARY server is deb11m8 (IP: 192.168.56.188), and the REPLICA server is deb11m8s (IP: 192.168.56.189).
1. Creating the test tables and the AFTER INSERT trigger
Create the table and trigger below on the PRIMARY; they will replicate down to the REPLICA. We have two tables: test_tab and test_tab_log. When a new row is inserted into test_tab, the trigger fires and writes the data, along with the user who did the insert, into the test_tab_log table.
CREATE DATABASE testdb;
USE testdb;
CREATE TABLE test_tab (id bigint NOT NULL, test_data varchar(50) NOT NULL, op_time TIMESTAMP NOT NULL, PRIMARY KEY (id, op_time));
CREATE TABLE test_tab_log (id bigint NOT NULL, test_data varchar(50) NOT NULL, op_user varchar(60) NOT NULL, op_time TIMESTAMP NOT NULL, PRIMARY KEY (id, op_time));

DELIMITER $$
CREATE DEFINER=`larry`@`%` TRIGGER after_test_tab_insert
AFTER INSERT ON test_tab
FOR EACH ROW
BEGIN
  INSERT INTO test_tab_log(id, test_data, op_user, op_time)
  VALUES(NEW.id, NEW.test_data, USER(), NOW());
END$$
DELIMITER ;
2. Let’s fill in some test data
We do an insert as a root user. You can see that after data is inserted, the trigger fires as expected.
mysql> insert into test_tab (id,test_data,op_time) values(1,'lt1',now());
Query OK, 1 row affected (0.01 sec)

mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
+----+-----------+---------------------+
1 row in set (0.00 sec)

+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
+----+-----------+----------------+---------------------+
1 row in set (0.00 sec)

We insert another row:

mysql> insert into test_tab (id,test_data,op_time) values(2,'lt2',now());

mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
2 rows in set (0.00 sec)

+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
2 rows in set (0.00 sec)
3. Let’s get percona.dsns ready for pt-table-checksum and pt-table-sync
CREATE TABLE percona.dsns (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `parent_id` int(11) DEFAULT NULL,
  `dsn` varchar(255) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO percona.dsns (dsn) VALUES ('h=192.168.56.190');
4. Simulate out-of-sync data on 192.168.56.190 by removing one row (id=1) from test_tab
mysql> use testdb;
Database changed
mysql> set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)
mysql> delete from test_tab where id=1;
Query OK, 1 row affected (0.00 sec)
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
1 row in set (0.00 sec)

+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
2 rows in set (0.00 sec)
5. Run pt-table-checksum to report the difference
root@deb11m8:~/test_pt_trigger# pt-table-checksum h=192.168.56.189 --port=3306 --no-check-binlog-format --no-check-replication-filters --replicate percona.checksums_test_tab --recursion-method=dsn=D=percona,t=dsns --tables testdb.test_tab --max-load Threads_running=50 --max-lag=10 --pause-file /tmp/checksums_test_tab
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS  ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
11-26T10:02:58      0      1     2          1       1       0   4.148 testdb.test_tab
root@deb11m8:~/test_pt_trigger#

On the REPLICA, deb11m8s, we can see the checksum reports the difference.

mysql> SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
    -> FROM percona.checksums_test_tab
    -> WHERE (
    -> master_cnt <> this_cnt
    -> OR master_crc <> this_crc
    -> OR ISNULL(master_crc) <> ISNULL(this_crc))
    -> GROUP BY db, tbl;
+--------+----------+------------+--------+
| db     | tbl      | total_rows | chunks |
+--------+----------+------------+--------+
| testdb | test_tab |          1 |      1 |
+--------+----------+------------+--------+
1 row in set (0.00 sec)
6. Let’s try pt-table-sync to fix it; we will run pt-table-sync under user ‘larry’@’%’
pt-table-sync reports that triggers are defined on the table and will not continue to fix it.
root@deb11m8:~/test_pt_trigger# pt-table-sync h=192.168.56.190,P=3306 --sync-to-master --replicate percona.checksums_test_tab --tables=testdb.test_tab --verbose --print
# Syncing via replication P=3306,h=192.168.56.190
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
Triggers are defined on the table at /usr/bin/pt-table-sync line 11306.  while doing testdb.test_tab on 192.168.56.190
#      0       0      0      0 0         10:03:31 10:03:31 1    testdb.test_tab
pt-table-sync has the option --[no]check-triggers, which skips trigger checking. With it, the --print result looks good.
root@deb11m8:~/test_pt_trigger# pt-table-sync --user=larry --ask-pass h=192.168.56.190,P=3306 --sync-to-master --nocheck-triggers --replicate percona.checksums_test_tab --tables=testdb.test_tab --verbose --print
# Syncing via replication P=3306,h=192.168.56.190
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
REPLACE INTO `testdb`.`test_tab`(`id`, `test_data`, `op_time`) VALUES ('1', 'lt1', '2023-11-26 09:59:19') /*percona-toolkit src_db:testdb src_tbl:test_tab src_dsn:P=3306,h=192.168.56.189 dst_db:testdb dst_tbl:test_tab dst_dsn:P=3306,h=192.168.56.190 lock:1 transaction:1 changing_src:percona.checksums_test_tab replicate:percona.checksums_test_tab bidirectional:0 pid:4169 user:root host:deb11m8*/;
#      0       1      0      0 Nibble    10:03:54 10:03:55 2    testdb.test_tab
When we run pt-table-sync with –execute under user ‘larry’@’%’:
root@deb11m8:~/test_pt_trigger# pt-table-sync --user=larry --ask-pass h=192.168.56.190,P=3306 --sync-to-master --nocheck-triggers --replicate percona.checksums_test_tab --tables=testdb.test_tab --verbose --execute
# Syncing via replication P=3306,h=192.168.56.190
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
#      0       1      0      0 Nibble    10:05:26 10:05:26 2    testdb.test_tab

------- PRIMARY -------
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
2 rows in set (0.00 sec)

+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  1 | lt1       | larry@deb11m8  | 2023-11-26 10:05:26 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
3 rows in set (0.00 sec)

------- REPLICA -------
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
2 rows in set (0.00 sec)

+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  1 | lt1       |                | 2023-11-26 10:05:26 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
3 rows in set (0.00 sec)
We can see a new row inserted into the test_tab_log table. The reason is that the trigger fired on the primary, and the change replicated to the REPLICA when we ran pt-table-sync.
7. If we do not want that to happen (a new row inserted into the test_tab_log table)
Option 1: Run pt-table-checksum/pt-table-sync for the test_tab_log table again. This might fix the issue.
Option 2: We might need to do some work on the trigger, like below (or there might be a better way).
Let's recreate the trigger as below; the trigger will check whether it is being run by 'larry'.
DROP TRIGGER after_test_tab_insert;

DELIMITER $$
CREATE DEFINER=`larry`@`%` TRIGGER after_test_tab_insert
AFTER INSERT ON test_tab
FOR EACH ROW
BEGIN
  IF LEFT(USER(),5) <> 'larry' AND TRIM(LEFT(USER(),5)) <> '' THEN
    INSERT INTO test_tab_log(id, test_data, op_user, op_time)
    VALUES(NEW.id, NEW.test_data, USER(), NOW());
  END IF;
END$$
DELIMITER ;
And restore the data to its original out-of-sync state.
The PRIMARY
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
2 rows in set (0.00 sec)

+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
The REPLICA
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
1 row in set (0.00 sec)

+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
2 rows in set (0.00 sec)
Run pt-table-sync under user ‘larry’@’%’.
root@deb11m8s:~# pt-table-sync --user=larry --ask-pass h=192.168.56.190,P=3306 --sync-to-master --nocheck-triggers --replicate percona.checksums_test_tab --tables=testdb.test_tab --verbose --execute
Enter password for 192.168.56.190:
# Syncing via replication P=3306,h=192.168.56.190,p=...,u=larry
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
#      0       1      0      0 Nibble    21:02:26 21:02:27 2    testdb.test_tab
pt-table-sync fixes the data drift for us, and the trigger does not fire when pt-table-sync is run under user larry.
--- The PRIMARY
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
2 rows in set (0.00 sec)

+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
2 rows in set (0.00 sec)

--- The REPLICA
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
2 rows in set (0.00 sec)

+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
2 rows in set (0.00 sec)
8. If we insert other data into the table test_tab under another user (e.g., root@localhost), the trigger will still fire
mysql> select user();
+----------------+
| user()         |
+----------------+
| root@localhost |
+----------------+
1 row in set (0.00 sec)

mysql> insert into test_tab (id,test_data,op_time) values(3,'lt3',now());
Query OK, 1 row affected (0.01 sec)

--- The PRIMARY
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
|  3 | lt3       | 2023-11-26 21:04:26 |
+----+-----------+---------------------+
3 rows in set (0.00 sec)

+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
|  3 | lt3       | root@localhost | 2023-11-26 21:04:26 |
+----+-----------+----------------+---------------------+
3 rows in set (0.00 sec)

--- The REPLICA
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
|  3 | lt3       | 2023-11-26 21:04:26 |
+----+-----------+---------------------+
3 rows in set (0.00 sec)

+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
|  3 | lt3       | root@localhost | 2023-11-26 21:04:26 |
+----+-----------+----------------+---------------------+
3 rows in set (0.00 sec)
In our test case, we covered only one AFTER INSERT trigger. In a live production system, there might be more complex scenarios (e.g., many different types of triggers defined on the table you are going to run pt-table-sync against, auto-increment values, foreign key constraints on the table, etc.). It is better to test in a test environment before going to production, and make sure you have a valid backup before making a system change.
I hope this gives you some ideas on using pt-table-sync with tables that have triggers.
Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!
Resolving Data Drift in a Dual-Primary Topology With Replica in MySQL/MariaDB
Hello friends,
In Managed Services, we have the opportunity to see different technologies and various topologies, which makes the work fascinating and challenging at the same time.
This time, I’m going to tell you about a particular case: a client with a dual-primary topology plus a replica, as detailed below:
PS-primary-1=192.168.0.14 [RW]
  |___ PS-primary-2=192.168.0.59 [RW] (Slave_delay: 0)
         |___ PS-replica-1=192.168.0.99 [R] (Slave_delay: 0)
[RW] means Read/Write access.
[R] means Read Only access.
We received an alert of this kind on PS-primary-2:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Update_rows event on table foo.persons; Can't find record in 'persons', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000360, end_log_pos 58177
Seconds later, we noticed that someone (not us) resolved the replication issue, possibly by skipping the error in some way (using pt-slave-restart) or simply executing something like:
mysql> STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE;
Or by injecting an empty transaction.
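With GTID-based replication, that empty-transaction approach looks roughly like this (a sketch only; the UUID and sequence number are placeholders for the GTID of the failing transaction):

```sql
STOP SLAVE;
-- Claim the failing transaction's GTID for an empty transaction:
SET GTID_NEXT='<source_server_uuid>:<failing_sequence_number>';
BEGIN; COMMIT;
SET GTID_NEXT='AUTOMATIC';
START SLAVE;
```

Either way, the offending event is skipped rather than applied, which is exactly how the data drift described below is born.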
(Disclaimer: We won’t deeply analyze the root cause here, but rather the surprise we found during the analysis.)
Having seen this, the most advisable action is to run pt-table-checksum, and if any differences appear, use pt-table-sync to fix the data drift.
Therefore, we executed the pt-table-checksum in the following way:
$ pt-table-checksum h=192.168.0.14 --port=3306 --no-check-binlog-format --no-check-replication-filters --replicate percona.checksums --recursion-method=hosts --max-load Threads_running=50 --max-lag=100 --databases=foo --tables=persons
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS  ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
07-28T12:40:31      0      1     6          1       1       0   0.193 foo.persons
(Second disclaimer: We have populated the table with a small amount of data to simplify and improve visibility in this case. With larger volumes, the same scenario also applies.)
(Credentials, for security reasons, are stored in the file /home/user/.my.cnf, which is why you don’t see them in any executed command. If you want to know how to create it, here’s an example)
At a glance, there seem to be differences. Let’s check the other primary instance and the replica as well.
PS-primary-2> SELECT max(ts), db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks FROM percona.checksums WHERE ( master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) GROUP BY db, tbl;
+---------------------+-----+---------+------------+--------+
| max(ts)             | db  | tbl     | total_rows | chunks |
+---------------------+-----+---------+------------+--------+
| 2023-07-28 08:40:31 | foo | persons |          5 |      1 |
+---------------------+-----+---------+------------+--------+
1 row in set (0.00 sec)

PS-replica-1> SELECT max(ts), db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks FROM percona.checksums WHERE ( master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) GROUP BY db, tbl;
+---------------------+-----+---------+------------+--------+
| max(ts)             | db  | tbl     | total_rows | chunks |
+---------------------+-----+---------+------------+--------+
| 2023-07-28 09:53:10 | foo | persons |          4 |      1 |
+---------------------+-----+---------+------------+--------+
1 row in set (0.00 sec)
The next step is to execute pt-table-sync with the –print option to review what the utility will actually do. The following output appears:
$ pt-table-sync --print h=192.168.0.59 --sync-to-master --database=foo --table=persons --replicate percona.checksums
DELETE FROM `foo`.`persons` WHERE `personid`='3' LIMIT 1 /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.14 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.59 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:1519 user:percona host:localhost.localdomain*/;
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10') /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.14 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.59 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:1519 user:percona host:localhost.localdomain*/;
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10') /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.14 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.59 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:1519 user:percona host:localhost.localdomain*/;
A DELETE statement has appeared. How is it possible?
As this is a dual-primary topology, it is advisable to perform a checksum on the other primary server as well, which is PS-primary-2. This will help identify any discrepancies and ensure data consistency between both primary servers.
$ pt-table-checksum h=192.168.0.59 --port=3306 --no-check-binlog-format --no-check-replication-filters --replicate percona.checksums --recursion-method=hosts --max-load Threads_running=50 --max-lag=100 --databases=foo --tables=persons
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS  ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
07-28T13:08:12      0      1     5          1       1       0   0.147 foo.persons
That’s when we notice that one server has six rows and another has four. How is this possible? Let’s review:
PS-primary-1 (none)> SELECT max(ts), db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks FROM percona.checksums WHERE (master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) GROUP BY db, tbl;
+---------------------+-----+---------+------------+--------+
| max(ts)             | db  | tbl     | total_rows | chunks |
+---------------------+-----+---------+------------+--------+
| 2023-07-28 10:03:00 | foo | persons |          6 |      1 |
+---------------------+-----+---------+------------+--------+

PS-replica-1 (none)> SELECT max(ts), db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks FROM percona.checksums WHERE (master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) GROUP BY db, tbl;
+---------------------+-----+---------+------------+--------+
| max(ts)             | db  | tbl     | total_rows | chunks |
+---------------------+-----+---------+------------+--------+
| 2023-07-28 10:03:00 | foo | persons |          6 |      1 |
+---------------------+-----+---------+------------+--------+
Let’s see what data the table currently contains:
$ for SERVE in 192.168.0.14 192.168.0.59 192.168.0.99; do echo ; echo $SERVE ; mysql -h $SERVE -e "select * from foo.persons"; done

192.168.0.14
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        1 | Iglesias  | Par       | IND     | Jaipur     | 2023-07-27 07:10:10 |
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
|        7 | Ewd       | Woo       | CAN     | Vancouver  | 2023-07-28 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

192.168.0.59
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        3 | Colonel   | Vil       | PER     | Lima       | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

192.168.0.99
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+
As you can see, there is data drift everywhere.
And how can this be possible? Data drift and the integrity of replica data are complex topics. Let’s outline some common ways this can happen; this is by no means an exhaustive list:
- A statement is executed on a primary with: SET SESSION sql_log_bin = OFF
- A statement was executed directly on the replica
- This can happen if the replica was not in super_read_only and a user with SUPER privileges executed a write
- This can happen if the replica was not in read_only
- A statement was executed on a replica, and the replica was later promoted to a primary without GTID in place
- A primary server is not configured for full ACID compliance, and it crashed
- At some point, the primary was not configured for row-based replication (even briefly)
These are the most common ways that data drift occurs, but certainly not the only ways that it can happen. More exotic cases can involve bugs, engine differences, version differences, etc.
Coming back to the issue, if we run pt-table-sync targeting the second primary instance, we will see the following result:
$ pt-table-sync --print h=192.168.0.14 --sync-to-master --database=foo --table=persons --replicate percona.checksums
DELETE FROM `foo`.`persons` WHERE `personid`='1' LIMIT 1 /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.59 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.14 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:5732 user:percona host:localhost.localdomain*/;
DELETE FROM `foo`.`persons` WHERE `personid`='7' LIMIT 1 /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.59 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.14 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:5732 user:percona host:localhost.localdomain*/;
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10') /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.59 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.14 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:5732 user:percona host:localhost.localdomain*/;
Here, the output is less promising: it suggests deleting the rows with ID 1 and 7, and performing a REPLACE on ID 3.
Note that it asks us to delete rows, which is not what we want. So, what should we do? After a discussion with the customer (we shared with them everything we had gathered), we agreed on the following:
- We need to add the row with ID=3 to the instances where it’s missing.
- We need to add the row with ID=1 to the instances where it’s missing.
- We need to add the row with ID=7 to the instances where it’s missing.
- We DO NOT need to delete any rows; no DELETEs are necessary.
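The agreed plan amounts to taking the union of the rows across all instances and inserting whatever each instance is missing, deleting nothing. Here is a minimal Python sketch of that idea; the instance names and rows mirror the example above, but in practice the rows would come from a SELECT against each host, and this assumes rows sharing a PersonID hold identical data:

```python
# Sketch: reconcile by union, never delete. Each dict maps PersonID -> row data.
def missing_rows(instances):
    """Return, per instance, the rows present elsewhere but missing locally."""
    union = {}
    for rows in instances.values():
        union.update(rows)  # assumes rows with the same ID are identical
    return {name: {pid: row for pid, row in union.items() if pid not in rows}
            for name, rows in instances.items()}

# Simplified view of the drifted table (LastName only, for brevity).
instances = {
    "PS-primary-1": {1: "Iglesias", 2: "Rooster", 7: "Ewd"},
    "PS-primary-2": {2: "Rooster", 3: "Colonel"},
    "PS-replica-1": {2: "Rooster"},
}

todo = missing_rows(instances)
print(todo["PS-primary-1"])  # -> {3: 'Colonel'}: only an insert, no deletes
```

Each entry in `todo` translates directly into the REPLACE statements to run on that instance; no instance ever loses a row.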
Important note: The client informed us that the data that exists on only one instance and not on the others is necessary, and they verified that the actions we proposed were correct. Remember: we, as DBAs, are the guardians of the data and its integrity, but not necessarily its owners; for ethical reasons, we may not even know its content.
How do we achieve this, then?
Unfortunately, in this case, using pt-table-sync is not recommended since, as we have seen with the --print option, it would perform DELETEs in both cases, regardless of which primary instance we consider the source of truth.
We also cannot simply execute the REPLACE statements on PS-primary-1, because REPLACE follows a specific logic, explained here:
MySQL uses the following algorithm for REPLACE (and LOAD DATA … REPLACE):
- Try to insert the new row into the table
- While the insertion fails because a duplicate-key error occurs for a primary key or unique index:
- Delete from the table the conflicting row that has the duplicate key value
- Try again to insert the new row into the table
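This delete-then-insert behavior can be observed with a quick local sketch. The example below uses Python's sqlite3 module purely for illustration, since SQLite implements comparable REPLACE INTO semantics; it is not MySQL itself, and the table is a simplified stand-in for foo.persons:

```python
import sqlite3

# Illustrate REPLACE semantics: a REPLACE on an existing primary key
# removes the conflicting row and inserts the new one in its place.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (personid INTEGER PRIMARY KEY, lastname TEXT)")
conn.execute("INSERT INTO persons VALUES (3, 'Colonel')")

# Same key, new data: the old row is deleted, the new row is inserted.
conn.execute("REPLACE INTO persons VALUES (3, 'Kernel')")
rows = conn.execute("SELECT * FROM persons").fetchall()
print(rows)  # -> [(3, 'Kernel')]: one row remains, with the replaced values
```

The key point is that a REPLACE on a conflicting key is not a no-op: it generates a delete plus an insert, which matters once the statement replicates to a server where the row already exists.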
That means that if we execute a REPLACE on PS-primary-1 for a row that does not exist there but does exist on PS-primary-2 (the row with ID 3, in our case), the statement will succeed on PS-primary-1, but the replicated event will fail on PS-primary-2 with a duplicate key error, as the row already exists there. I’ll show you below:
PS-primary-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 09:20:13');
Query OK, 1 row affected (0.01 sec)
We connected to PS-primary-2, and indeed, we found the duplicate key error:
PS-primary-2 (none)> PAGER grep "Last_Err"; SHOW SLAVE STATUS\G
PAGER set to 'grep "Last_Err"'
Last_Errno: 1062
Last_Error: Could not execute Write_rows event on table foo.persons; Duplicate entry '3' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000359, end_log_pos 26840
1 row in set (0.01 sec)
The commands we should execute would be the following:
set sql_log_bin=0;
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10');
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10');
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10');
However, as we observed, the first one would fail on replication due to a duplicate key. So, how do we resolve this? The safest option, in my humble opinion, is to execute the three REPLACE statements on each of the instances individually, first disabling binary logging for the session with set sql_log_bin=0:
PS-primary-1 > set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)

PS-primary-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-primary-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-primary-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-primary-1 > Bye

PS-primary-2 > set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)

PS-primary-2 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.00 sec)

PS-primary-2 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-primary-2 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-primary-2 > Bye

PS-replica-1 > set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)

PS-replica-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-replica-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-replica-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-replica-1 > set global super_read_only=1; set global read_only=1;
Query OK, 0 rows affected (0.00 sec)
Finally, we verify that all instances have the same data, either with the SELECT statement or the pt-table-checksum command.
$ for SERVE in 192.168.0.14 192.168.0.59 192.168.0.99; do echo ; echo $SERVE ; mysql -h $SERVE -e "select * from foo.persons"; done

192.168.0.14
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        1 | Iglesias  | Par       | IND     | Jaipur     | 2023-07-27 07:10:10 |
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        3 | Colonel   | Vil       | PER     | Lima       | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
|        7 | Ewd       | Woo       | CAN     | Vancouver  | 2023-07-28 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

192.168.0.59
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        1 | Iglesias  | Par       | IND     | Jaipur     | 2023-07-27 07:10:10 |
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        3 | Colonel   | Vil       | PER     | Lima       | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
|        7 | Ewd       | Woo       | CAN     | Vancouver  | 2023-07-28 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

192.168.0.99
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        1 | Iglesias  | Par       | IND     | Jaipur     | 2023-07-27 07:10:10 |
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        3 | Colonel   | Vil       | PER     | Lima       | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
|        7 | Ewd       | Woo       | CAN     | Vancouver  | 2023-07-28 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

$ pt-table-checksum h=192.168.0.14 --port=3306 --no-check-binlog-format --no-check-replication-filters --replicate percona.checksums --recursion-method=hosts --max-load Threads_running=50 --max-lag=100 --databases=foo --tables=persons
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
07-28T14:12:32      0      0        7          0       1       0   0.185 foo.persons

$ pt-table-checksum h=192.168.0.59 --port=3306 --no-check-binlog-format --no-check-replication-filters --replicate percona.checksums --recursion-method=hosts --max-load Threads_running=50 --max-lag=100 --databases=foo --tables=persons
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
07-28T14:12:50      0      0        7          0       1       0   0.148 foo.persons
Conclusion
Maintaining this type of topology (dual primaries) can be a headache; such setups are prone to problems like this one, and resolving them is not always as simple as it may seem. The tools pt-table-checksum and pt-table-sync are essential both for resolving these issues in most cases and for efficient verification, as in this case.
Remember that if you have any questions or concerns, you can always contact us.
Lastly, don’t forget to run pt-table-sync with the --print option to verify, and to share with the client or data owner, whether the data to be corrected and the proposed corrections are appropriate. Always.
Let’s stay synchronized, and I’ll see you in the next blog!