Aug 20, 2024

How to Run Percona Toolkit in Kubernetes

Percona Operator for MySQL, based on Percona XtraDB Cluster (PXC), provides configuration to expose the ProxySQL, HAProxy, and the PXC instances themselves so that you can use Percona Toolkit to connect to MySQL from these endpoints externally. However, you can also create Kubernetes pods to run Percona Toolkit operations there instead if you do not […]
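For instance, a throwaway pod can run a tool directly against a cluster endpoint; a minimal sketch, assuming the percona/percona-toolkit image from Docker Hub (the service name and credentials are hypothetical):

kubectl run pt-client --rm -it --restart=Never --image=percona/percona-toolkit -- \
  pt-mysql-summary --host cluster1-haproxy --user root --password secret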

Jul 22, 2024

Online Schema Changes on Tables with Foreign Keys in MySQL

pt-online-schema-change is an amazing tool for assisting in table modifications in cases where ONLINE ALTER is not an option. But if you have foreign keys, this could be an interesting and important read for you. Tables with foreign keys are always complicated, and they have to be handled with care. The use case I am […]
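As a preview, foreign key handling in pt-online-schema-change is selected with --alter-foreign-keys-method; a hedged sketch (the database, table, and column are hypothetical):

pt-online-schema-change --alter "ADD COLUMN note VARCHAR(20)" \
  --alter-foreign-keys-method=rebuild_constraints \
  D=test,t=parent_table --execute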

Jun 19, 2024

eu-stack Support and Other Improvements in pt-pmp of Percona Toolkit

pt-pmp is a profiler tool that creates and summarizes full stack traces of processes on Linux. It was inspired by http://poormansprofiler.org and helped Percona Support resolve many performance issues. In this blog post, I will present an improved pt-pmp that can collect stack traces with minimal impact on the production environment. TL;DR: Starting from Percona Toolkit […]
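A typical invocation looks like this (a sketch; the sampling parameters are illustrative):

pt-pmp --pid $(pidof mysqld) --iterations 2 --interval 5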

Jun 18, 2024

Partial Data Archiving and Schema Change

Starting from Percona Toolkit 3.6.0, pt-online-schema-change supports the option --where, so you can not only alter your table definition live but also copy only the rows that satisfy certain criteria. You may not need to copy all the data when changing a table definition. For example, if the table is too big and you […]
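A hedged sketch of what this enables (the table and predicate are hypothetical):

pt-online-schema-change --alter "ENGINE=InnoDB" \
  --where "created_at > NOW() - INTERVAL 90 DAY" \
  D=test,t=big_table --execute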

Jun 14, 2024

Resume Your Failed pt-online-schema-change Job

Starting from Percona Toolkit 3.6.0, you can resume pt-online-schema-change if it was interrupted. This blog describes the prerequisites and usage of the new --resume option. To restart the job, you need to know where it failed. This is why the first option you must use is --history. It instructs pt-online-schema-change to store its progress in […]
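A sketch of the workflow, assuming --resume accepts the job id that --history records (check the option reference for the exact syntax):

pt-online-schema-change --history --alter "ADD COLUMN f2 INT" D=test,t=t1 --execute
# if the run above is interrupted, resume it by the recorded job id:
pt-online-schema-change --history --resume 1 --alter "ADD COLUMN f2 INT" D=test,t=t1 --execute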

Jun 12, 2024

What’s New in Percona Toolkit 3.6.0

Percona Toolkit 3.6.0 was released on June 12, 2024. The most important updates in this version are: the possibility to resume pt-online-schema-change if it is interrupted; eu-stack support in pt-pmp, which significantly improves this tool’s performance and decreases the load it causes on production servers; the new tool pt-eustack-resolver; packages for Ubuntu 24.04 (Noble Numbat) […]

May 13, 2024

Seamless Table Modifications: Leveraging pt-online-schema-change for Online Alterations

Table modifications are a routine task for database administrators. The blog post Using Percona Toolkit to Alter Database Tables Online: A Controlled Approach provides insights into the process of altering tables online in a controlled manner, ensuring uninterrupted access for application users and preventing application downtime. We will focus here on utilizing the powerful “pt-online-schema-change” […]

Dec 22, 2023

What’s New in Percona Toolkit 3.5.6


Percona Toolkit 3.5.6 was released on December 21, 2023. This blog post covers the main changes in this release.

New tool: pt-galera-log-explainer

We continue adding more tools for the Support teams.
pt-galera-log-explainer was written by my Percona Support colleague Yoann La Cancellera.
pt-galera-log-explainer filters, aggregates, and summarizes multiple Galera library logs together. The purpose of this tool is to help find useful information in Percona XtraDB Cluster/Galera library logs.

pt-galera-log-explainer takes raw log files, which are usually very verbose, like the ones available in the regression test suite, and makes a short summary out of them:

$ ./bin/pt-galera-log-explainer list --all src/go/pt-galera-log-explainer/tests/logs/merge_rotated_daily/node1.20230315.log
identifier                     node1                                               
current path                   .../tests/logs/merge_rotated_daily/node1.20230315.log   
last known ip                                                                          
last known name                node1                                               
mysql version                                                                          
                                                                                        
2023-03-15T20:10:57.784904+02:00   node2 joined                                        
2023-03-15T20:10:57.785568+02:00   node3 left                                          
2023-03-15T20:10:57.791959+02:00   node3 left                                          
2023-03-15T20:10:57.797221+02:00   PRIMARY(n=2)                                        
2023-03-15T20:20:12.714291+02:00   node2 joined                                        
2023-03-15T20:20:12.714331+02:00   node3 joined                                        
2023-03-15T20:20:13.776977+02:00   PRIMARY(n=3)                                        
2023-03-15T20:20:14.839684+02:00   local node will resync node3                        
2023-03-15T20:20:14.839723+02:00   SYNCED -> DONOR                                     
2023-03-15T20:20:15.799020+02:00   IST will be used                                    
2023-03-15T20:20:16.850525+02:00   finished sending IST to node3                       
2023-03-15T20:20:16.850549+02:00   DESYNCED -> JOINED                                  
2023-03-15T20:20:16.865312+02:00   JOINED -> SYNCED

The tool can process logs from multiple nodes and draw a timeline. You can find a usage example with sample output in the user reference manual. You can also filter events to have a more compact output.
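For example, to draw a timeline across several nodes, pass multiple log files at once (a hedged sketch; the log paths are illustrative):

pt-galera-log-explainer list --all node1.log node2.log node3.log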

pt-galera-log-explainer is an excellent tool for analyzing the large log files that the Galera library produces.

Better macOS and ARM support

This release contains a few improvements for platforms and operating systems that Percona does not officially support.

PR-516, contributed by Ivan Kruglov, makes pt-online-schema-change and other tools respect case-insensitive lookup on Windows and macOS, the default for these operating systems.

PR-720 simplifies the build process for the Percona Toolkit. Historically, Percona Toolkit was written in Perl and Shell programming languages. Build instructions for all code were simple:

perl Makefile.PL
make
(optionally) make test
make install

Since the first tool, written in the Go programming language, was introduced, package maintainers had to perform one extra step:

cd src/go
make <PLATFORM such as linux-amd64>

Now, this extra step is optional because the top-level Makefile has instructions for building Go tools on the current platform.

PR-712 adds the darwin-arm64 platform to the list of platforms for which Go binaries can be built. While we do not officially support ARM and macOS, you can build the tools yourself without extra effort.

To create macOS ARM binaries on the same platform, simply run the following from the top-level directory.

perl Makefile.PL
make

If you want to create binaries on another platform, change the directory to
src/go, then run

make darwin-arm64

We also have unofficial ARM packages for Percona Toolkit and PMM Dump in our labs at https://github.com/Percona-Lab/percona-on-arm.

Tool improvements

EXPLAIN output for slow queries in the JSON report for pt-query-digest

Earlier, pt-query-digest did not print EXPLAIN output with the option --output=json. PR-471, sent by Ayush Goyal, adds this possibility.
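For example (a sketch, assuming a local server and a slow log path):

pt-query-digest --output=json --explain h=localhost,u=msandbox,p=msandbox slow.log > report.json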

Hook before_die for pt-online-schema-change

PR-509, sent by Ilaria Migliozzi, introduces a new hook for pt-online-schema-change. If pt-online-schema-change stops ungracefully, this hook lets you print diagnostic information about the failing operation. You can find an example plugin on GitHub.
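Plugins are attached on the command line; a minimal sketch (the plugin path is hypothetical):

pt-online-schema-change --plugin=/path/to/my_plugin.pm \
  --alter "ADD COLUMN f2 INT" D=test,t=t1 --execute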

Option --unstop for pt-archiver

pt-archiver supports the option --stop, which terminates running instances by creating a sentinel file. However, there was no option to remove this file and restart pt-archiver. PR-429 by fraff resolves this gap by adding the option --unstop. If pt-archiver is called with this option, it removes the sentinel file and restarts the operation.
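A hedged sketch of the stop/restart cycle (the sentinel path is the documented default; the DSN and WHERE clause are illustrative):

# ask running pt-archiver instances to exit by creating the sentinel file
pt-archiver --stop --sentinel /tmp/pt-archiver-sentinel
# remove the sentinel and restart the same job (Percona Toolkit 3.5.6 and later)
pt-archiver --unstop --sentinel /tmp/pt-archiver-sentinel \
  --source h=localhost,D=test,t=t1 --where "ts < NOW() - INTERVAL 30 DAY" --purge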

Quality improvements

In my last Percona Toolkit release blog post, What’s New in Percona Toolkit 3.5.5, I wrote about how we plan to make regression tests part of the release process. This has partially happened: we now test Percona Toolkit with Percona Server for MySQL 5.7 and 8.0 on all supported platforms. There are still a few issues with tests, as reported in PT-2295. We plan to fix them and then add support for other products.

Viktor Szépe continued to contribute to the quality of the Percona Toolkit source code. He introduced EditorConfig for our GitHub repository, fixed typos, and made style improvements to the Go code.

In addition to Kushal Haldar’s reports about vulnerabilities in Go, we enabled better automation for our GitHub repository. This release was built with the latest version of Go and fixes all known module vulnerabilities.

Percona Toolkit Docker images

We released an official Docker image for Percona Toolkit at https://hub.docker.com/r/percona/percona-toolkit. It can be pulled using the following command.

docker pull percona/percona-toolkit

To call any tool using Docker, run the following.

docker run percona/percona-toolkit <TOOL NAME> <OPTIONS>

For example:

$ docker run --network="host" percona/percona-toolkit pt-online-schema-change \
  h=127.0.0.1,P=12345,u=msandbox,p=msandbox,D=test,t=t1 --alter='ADD COLUMN f2 INT' \
  --execute
Found 2 slaves:
s76 -> 127.0.0.1:12346
s76 -> 127.0.0.1:12347
Will check slave lag on:
s76 -> 127.0.0.1:12346
s76 -> 127.0.0.1:12347
Operation, tries, wait:
analyze_table, 10, 1
copy_rows, 10, 0.25
create_triggers, 10, 1
drop_triggers, 10, 1
swap_tables, 10, 1
update_foreign_keys, 10, 1
Altering `test`.`t1`...
Creating new table...
Created new table test._t1_new OK.
Altering new table...
Altered `test`.`_t1_new` OK.
2023-12-21T15:29:22 Creating triggers...
2023-12-21T15:29:22 Created triggers OK.
2023-12-21T15:29:22 Copying approximately 1 rows...
2023-12-21T15:29:22 Copied rows OK.
2023-12-21T15:29:22 Analyzing new table...
2023-12-21T15:29:22 Swapping tables...
2023-12-21T15:29:22 Swapped original and new tables OK.
2023-12-21T15:29:22 Dropping old table...
2023-12-21T15:29:22 Dropped old table `test`.`_t1_old` OK.
2023-12-21T15:29:22 Dropping triggers...
2023-12-21T15:29:22 Dropped triggers OK.
Successfully altered `test`.`t1`

Docker images are not only for those who prefer to install software via Docker; they also allow us to easily ship Percona Toolkit together with other software, such as Percona Monitoring and Management or Percona Kubernetes Operators.

Community contributions

This release includes many contributions from community members. We want to thank:

  • Ivan Kruglov for fixing case-sensitivity issues in pt-online-schema-change and other tools
  • Chrys Swingler for fixing a bug with the option --skip-check-slave-lag in pt-online-schema-change and pt-table-checksum
  • Ayush Goyal for adding EXPLAIN output for slow queries in the JSON report for pt-query-digest
  • Ilaria Migliozzi for the new hook before_die for pt-online-schema-change
  • Jakob for fixing pt-archiver bug PT-2064
  • Viktor Szépe for introducing EditorConfig, fixing typos, and fixing issues with the Go code
  • fraff for the option --unstop for pt-archiver
  • Zongzhi Chen for fixing pt-pmp bug PT-2211
  • Jason Ng for fixing pt-visual-explain bug PT-2277
  • Kushal Haldar for his reports about vulnerabilities in Go-based tools

Percona Toolkit: Free your DBAs with advanced open source command-line tools.

 


Dec 12, 2023

How to Use Percona Toolkit’s pt-table-sync for Replica Tables With Triggers in MySQL


In Percona Managed Services, we manage Percona Server for MySQL, Community MySQL, and MariaDB. Sometimes, the replica server might have replication errors, and the replica might be out of sync with the primary. In this case, we can use Percona Toolkit’s pt-table-checksum and pt-table-sync to check the data drift between primary and replica servers and bring the replica back in sync with the primary. This blog gives you some ideas on using pt-table-sync for replica tables with triggers.

In my lab, we have two test nodes with replication set up; both servers run Debian 11 and have Percona Server for MySQL 8.0.33 (with Percona Toolkit) installed.

The PRIMARY server is deb11m8 (IP: 192.168.56.189), and the REPLICA server is deb11m8s (IP: 192.168.56.190).

1. Creating the test tables and the AFTER INSERT trigger

Create the tables and trigger below on the PRIMARY; they will replicate down to the REPLICA. We have two tables: test_tab and test_tab_log. When a new row is inserted into test_tab, the trigger fires and records the inserted data and the user who performed the insert in the test_tab_log table.

CREATE DATABASE testdb;
USE testdb;
CREATE TABLE test_tab (id bigint NOT NULL, test_data varchar(50) NOT NULL, op_time TIMESTAMP NOT NULL, PRIMARY KEY (id,op_time));
CREATE TABLE test_tab_log (id bigint NOT NULL, test_data varchar(50) NOT NULL, op_user varchar(60) NOT NULL, op_time TIMESTAMP NOT NULL, PRIMARY KEY (id,op_time));
DELIMITER $$
CREATE DEFINER=`larry`@`%` TRIGGER after_test_tab_insert  AFTER INSERT
ON test_tab FOR EACH ROW
BEGIN
   INSERT INTO test_tab_log(id,test_data,op_user,op_time) VALUES(new.id, NEW.test_data, USER(),NOW());
END$$
DELIMITER ;

2. Let’s fill in some test data

We insert a row as the root user. You can see that after the data is inserted, the trigger fires as expected.

mysql> insert into test_tab (id,test_data,op_time) values(1,'lt1',now());
Query OK, 1 row affected (0.01 sec)
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
+----+-----------+---------------------+
1 row in set (0.00 sec)
+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
+----+-----------+----------------+---------------------+
1 row in set (0.00 sec)
We insert another row:
mysql> insert into test_tab (id,test_data,op_time) values(2,'lt2',now());
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
2 rows in set (0.00 sec)
+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
2 rows in set (0.00 sec)

3. Let’s get percona.dsns ready for pt-table-checksum and pt-table-sync

CREATE TABLE percona.dsns (`id` int(11) NOT NULL AUTO_INCREMENT,`parent_id` int(11) DEFAULT NULL, `dsn` varchar(255) NOT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO percona.dsns (dsn) VALUES ('h=192.168.56.190');

4. Simulate out-of-sync data on 192.168.56.190 by removing one row (id=1) from test_tab

mysql> use testdb;
Database changed
mysql> set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)
mysql> delete from test_tab where id=1;
Query OK, 1 row affected (0.00 sec)
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
1 row in set (0.00 sec)
+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
2 rows in set (0.00 sec)

5. Run pt-table-checksum to report the difference

root@deb11m8:~/test_pt_trigger# pt-table-checksum h=192.168.56.189 --port=3306 --no-check-binlog-format \
--no-check-replication-filters --replicate percona.checksums_test_tab \
--recursion-method=dsn=D=percona,t=dsns \
--tables testdb.test_tab \
--max-load Threads_running=50 \
--max-lag=10 --pause-file /tmp/checksums_test_tab
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
11-26T10:02:58      0      1        2          1       1       0   4.148 testdb.test_tab
root@deb11m8:~/test_pt_trigger#

On the REPLICA, deb11m8s, we can see that the checksum reports the difference.

mysql> SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
    -> FROM percona.checksums_test_tab
    -> WHERE (
    -> master_cnt <> this_cnt
    -> OR master_crc <> this_crc
    -> OR ISNULL(master_crc) <> ISNULL(this_crc))
    -> GROUP BY db, tbl;
+--------+----------+------------+--------+
| db     | tbl      | total_rows | chunks |
+--------+----------+------------+--------+
| testdb | test_tab |          1 |      1 |
+--------+----------+------------+--------+
1 row in set (0.00 sec)

6. Let’s try pt-table-sync to fix it; we will run pt-table-sync under user ‘larry’@’%’

pt-table-sync reports that triggers are defined on the table and will not continue to fix it.

root@deb11m8:~/test_pt_trigger# pt-table-sync h=192.168.56.190,P=3306 --sync-to-master \
--replicate percona.checksums_test_tab \
--tables=testdb.test_tab \
--verbose --print
# Syncing via replication P=3306,h=192.168.56.190
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
Triggers are defined on the table at /usr/bin/pt-table-sync line 11306.  while doing testdb.test_tab on 192.168.56.190
#      0       0      0      0 0         10:03:31 10:03:31 1    testdb.test_tab

pt-table-sync has an option, --[no]check-triggers, that skips the trigger check. With --nocheck-triggers, the --print result looks good.

root@deb11m8:~/test_pt_trigger# pt-table-sync --user=larry --ask-pass h=192.168.56.190,P=3306 --sync-to-master --nocheck-triggers \
--replicate percona.checksums_test_tab \
--tables=testdb.test_tab \
--verbose --print
# Syncing via replication P=3306,h=192.168.56.190
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
REPLACE INTO `testdb`.`test_tab`(`id`, `test_data`, `op_time`) VALUES ('1', 'lt1', '2023-11-26 09:59:19') /*percona-toolkit src_db:testdb src_tbl:test_tab src_dsn:P=3306,h=192.168.56.189 dst_db:testdb dst_tbl:test_tab dst_dsn:P=3306,h=192.168.56.190 lock:1 transaction:1 changing_src:percona.checksums_test_tab replicate:percona.checksums_test_tab bidirectional:0 pid:4169 user:root host:deb11m8*/;
#      0       1      0      0 Nibble    10:03:54 10:03:55 2    testdb.test_tab

When we run pt-table-sync with --execute under user ‘larry’@’%’:

root@deb11m8:~/test_pt_trigger# pt-table-sync --user=larry --ask-pass h=192.168.56.190,P=3306 --sync-to-master --nocheck-triggers \
--replicate percona.checksums_test_tab \
--tables=testdb.test_tab \
--verbose --execute
# Syncing via replication P=3306,h=192.168.56.190
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
#      0       1      0      0 Nibble    10:05:26 10:05:26 2    testdb.test_tab
-------PRIMARY -------
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
2 rows in set (0.00 sec)
+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  1 | lt1       | larry@deb11m8  | 2023-11-26 10:05:26 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
3 rows in set (0.00 sec)
-----REPLICA
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
2 rows in set (0.00 sec)
+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  1 | lt1       |                | 2023-11-26 10:05:26 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
3 rows in set (0.00 sec)

We can see a new row inserted into the test_tab_log table. The reason is that the trigger fired on the primary when we ran pt-table-sync, and the insert replicated to the REPLICA.

7. If we do not want that to happen (a new row inserted into the test_tab_log table)

Option 1: Run pt-table-checksum/pt-table-sync for the test_tab_log table again. This might fix the issue.

Option 2: We might need to do some work on the trigger, like below (or there might be another, better way).

Let’s recreate the trigger as below; the trigger will check whether it is being run by ‘larry’.

DROP TRIGGER after_test_tab_insert;
DELIMITER $$
CREATE  DEFINER=`larry`@`%`  TRIGGER after_test_tab_insert
AFTER INSERT
ON test_tab FOR EACH ROW
BEGIN
   IF left(USER(),5) <> 'larry' and trim(left(USER(),5)) <>'' THEN
     INSERT INTO test_tab_log(id,test_data, op_user,op_time)
     VALUES(new.id, NEW.test_data, USER(),NOW());
   END IF;
END$$
DELIMITER ;

And restore the data to its original out-of-sync state.
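A hedged sketch of that cleanup on the REPLICA (the row ids match this lab; adjust to your data):

mysql> set sql_log_bin=0;
mysql> delete from test_tab where id=1;
mysql> delete from test_tab_log where op_user <> 'root@localhost';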

The PRIMARY

mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
2 rows in set (0.00 sec)
+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+

The REPLICA

mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
1 row in set (0.00 sec)
+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
2 rows in set (0.00 sec)

Run pt-table-sync under user ‘larry’@’%’.  

root@deb11m8s:~# pt-table-sync --user=larry --ask-pass h=192.168.56.190,P=3306 --sync-to-master --nocheck-triggers --replicate percona.checksums_test_tab --tables=testdb.test_tab --verbose --execute
Enter password for 192.168.56.190: 
# Syncing via replication P=3306,h=192.168.56.190,p=...,u=larry
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
#      0       1      0      0 Nibble    21:02:26 21:02:27 2    testdb.test_tab

pt-table-sync fixed the data drift for us, and the trigger did not fire because pt-table-sync ran under user larry.


--- The PRIMARY
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
2 rows in set (0.00 sec)
+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
2 rows in set (0.00 sec)

--- The REPLICA
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
+----+-----------+---------------------+
2 rows in set (0.00 sec)
+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
+----+-----------+----------------+---------------------+
2 rows in set (0.00 sec)

8. If we insert other data into the table test_tab under another user (e.g., root@localhost), the trigger will still fire

mysql> select user();
+----------------+
| user()         |
+----------------+
| root@localhost |
+----------------+
1 row in set (0.00 sec)
mysql> insert into test_tab (id,test_data,op_time) values(3,'lt3',now());
Query OK, 1 row affected (0.01 sec)
--- The PRIMARY
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
|  3 | lt3       | 2023-11-26 21:04:26 |
+----+-----------+---------------------+
3 rows in set (0.00 sec)
+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
|  3 | lt3       | root@localhost | 2023-11-26 21:04:26 |
+----+-----------+----------------+---------------------+
3 rows in set (0.00 sec)

--- The REPLICA
mysql> select * from test_tab; select * from test_tab_log;
+----+-----------+---------------------+
| id | test_data | op_time             |
+----+-----------+---------------------+
|  1 | lt1       | 2023-11-26 09:59:19 |
|  2 | lt2       | 2023-11-26 10:01:30 |
|  3 | lt3       | 2023-11-26 21:04:26 |
+----+-----------+---------------------+
3 rows in set (0.00 sec)

+----+-----------+----------------+---------------------+
| id | test_data | op_user        | op_time             |
+----+-----------+----------------+---------------------+
|  1 | lt1       | root@localhost | 2023-11-26 09:59:19 |
|  2 | lt2       | root@localhost | 2023-11-26 10:01:30 |
|  3 | lt3       | root@localhost | 2023-11-26 21:04:26 |
+----+-----------+----------------+---------------------+
3 rows in set (0.00 sec)

In our test case, we cover just one AFTER INSERT trigger. In a live production system, there might be more complex scenarios (e.g., many different types of triggers defined on the table you are going to run pt-table-sync against, auto-increment values, foreign key constraints on the table, etc.). It is better to test in a test environment before you go to production, and make sure you have a valid backup before making a system change.

I hope this gives you some ideas on using pt-table-sync on a table with triggers.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

 


Oct 31, 2023

Resolving Data Drift in a Dual-Primary Topology With Replica in MySQL/MariaDB


Hello friends,

In Managed Services, we have the opportunity to see different technologies and various topologies, which makes the work fascinating and challenging at the same time.

This time, I’m going to tell you about a particular case: a client with a dual-primary topology plus a replica, as detailed below:

PS-primary-1=192.168.0.14 [RW]
  |___ PS-primary-2=192.168.0.59 [RW] (Slave_delay: 0)
  |___ PS-replica-1=192.168.0.99 [R] (Slave_delay: 0)

[RW] means Read/Write access. 

[R] means Read Only access.

We received an alert of this kind on PS-primary-2:

Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Update_rows event on table foo.persons; Can't find record in 'persons', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000360, end_log_pos 58177

Seconds later, we noticed that someone (not us) resolved the replication issue, possibly by skipping the error in some way (using pt-slave-restart) or simply executing something like:

mysql> STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE;

Or by injecting an empty transaction.
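With GTID-based replication, injecting an empty transaction typically looks like this (a sketch; the UUID and transaction number are placeholders):

mysql> STOP SLAVE;
mysql> SET GTID_NEXT='aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:123';
mysql> BEGIN; COMMIT;
mysql> SET GTID_NEXT='AUTOMATIC';
mysql> START SLAVE;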

(Disclaimer: We won’t deeply analyze the root cause here, but rather the surprise we found during the analysis.)

Having seen this, the most advisable action is to run pt-table-checksum, and if any differences appear, use pt-table-sync to fix the data drift.

Therefore, we executed the pt-table-checksum in the following way:

$ pt-table-checksum h=192.168.0.14 --port=3306 --no-check-binlog-format --no-check-replication-filters --replicate percona.checksums --recursion-method=hosts --max-load Threads_running=50 --max-lag=100 --databases=foo --tables=persons

Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
07-28T12:40:31      0      1        6          1       1       0   0.193 foo.persons

(Second disclaimer: We have populated the table with a small amount of data to simplify and improve visibility in this case. With larger volumes, the same scenario also applies.)

(Credentials, for security reasons, are stored in the file /home/user/.my.cnf, which is why you don’t see them in any executed command. If you want to know how to create it, here’s an example)
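A minimal sketch of such a file (the values are placeholders; restrict it with chmod 600):

[client]
user=percona
password=s3cret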

At a glance, there seem to be differences. Let’s check the other primary instance and the replica as well.

PS-primary-2> SELECT max(ts), db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks FROM percona.checksums WHERE ( master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) GROUP BY db, tbl;
+---------------------+-----+---------+------------+--------+
| max(ts)             | db  | tbl     | total_rows | chunks |
+---------------------+-----+---------+------------+--------+
| 2023-07-28 08:40:31 | foo | persons |          5 |      1 |
+---------------------+-----+---------+------------+--------+
1 row in set (0.00 sec)

PS-replica-1> SELECT max(ts), db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks FROM percona.checksums WHERE ( master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) GROUP BY db, tbl;
+---------------------+-----+---------+------------+--------+
| max(ts)             | db  | tbl     | total_rows | chunks |
+---------------------+-----+---------+------------+--------+
| 2023-07-28 09:53:10 | foo | persons |          4 |      1 |
+---------------------+-----+---------+------------+--------+
1 row in set (0.00 sec)

The next step is to execute pt-table-sync with the –print option to review what the utility will actually do. The following output appears:

$ pt-table-sync --print h=192.168.0.59 --sync-to-master --database=foo --table=persons --replicate percona.checksums

DELETE FROM `foo`.`persons` WHERE `personid`='3' LIMIT 1 /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.14 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.59 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:1519 user:percona host:localhost.localdomain*/;
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10') /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.14 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.59 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:1519 user:percona host:localhost.localdomain*/;
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10') /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.14 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.59 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:1519 user:percona host:localhost.localdomain*/;

A DELETE statement has appeared. How is this possible?

As this is a dual-primary topology, it is advisable to perform a checksum on the other primary server as well, which is PS-primary-2. This will help identify any discrepancies and ensure data consistency between both primary servers.

$ pt-table-checksum h=192.168.0.59 --port=3306 --no-check-binlog-format --no-check-replication-filters --replicate percona.checksums --recursion-method=hosts --max-load Threads_running=50 --max-lag=100 --databases=foo --tables=persons

Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
07-28T13:08:12      0      1        5          1       1       0   0.147 foo.persons

That’s when we notice that one server has six rows, and the other has four. How is this possible? Let’s review:

PS-primary-1 (none)> SELECT max(ts), db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks FROM percona.checksums WHERE ( master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) GROUP BY db, tbl;
+---------------------+-----+---------+------------+--------+
| max(ts)             | db  | tbl     | total_rows | chunks |
+---------------------+-----+---------+------------+--------+
| 2023-07-28 10:03:00 | foo | persons |          6 |      1 |
+---------------------+-----+---------+------------+--------+

PS-replica-1 (none)> SELECT max(ts), db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks FROM percona.checksums WHERE ( master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) GROUP BY db, tbl;
+---------------------+-----+---------+------------+--------+
| max(ts)             | db  | tbl     | total_rows | chunks |
+---------------------+-----+---------+------------+--------+
| 2023-07-28 10:03:00 | foo | persons |          6 |      1 |
+---------------------+-----+---------+------------+--------+

Let’s see what data the table currently contains:

$ for SERVE in 192.168.0.14 192.168.0.59 192.168.0.99; do echo ; echo $SERVE ; mysql -h $SERVE -e "select * from foo.persons"; done

192.168.0.14
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        1 | Iglesias  | Par       | IND     | Jaipur     | 2023-07-27 07:10:10 |
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
|        7 | Ewd       | Woo       | CAN     | Vancouver  | 2023-07-28 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

192.168.0.59
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        3 | Colonel   | Vil       | PER     | Lima       | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

192.168.0.99
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

As you can see, there is data drift everywhere. 

And how can this be possible? I’ll enumerate several possibilities, though this is not an exhaustive list.

Data drift and the integrity of replica data is a complex topic. Let’s outline some common ways this can happen (generally speaking):

  • A statement is executed on a primary with SET SESSION sql_log_bin = OFF
  • A statement was executed directly on the replica; this can happen if:
    • the replica was not in super_read_only and a super user executed it
    • the replica was not in read_only
  • A statement was executed on a replica that was later promoted to a primary without GTID in place
  • A primary server was not configured for full ACID compliance, and it crashed
  • At some point, the primary was not configured for row-based replication (even briefly)

These are the most common ways that data drift occurs, but certainly not the only ways that it can happen. More exotic cases can involve bugs, engine differences, version differences, etc.

Coming back to the issue, if we run pt-table-sync targeting the second primary instance, we will see the following result:

$ pt-table-sync --print h=192.168.0.14 --sync-to-master --database=foo --table=persons --replicate percona.checksums

DELETE FROM `foo`.`persons` WHERE `personid`='1' LIMIT 1 /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.59 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.14 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:5732 user:percona host:localhost.localdomain*/;
DELETE FROM `foo`.`persons` WHERE `personid`='7' LIMIT 1 /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.59 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.14 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:5732 user:percona host:localhost.localdomain*/;
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10') /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.59 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.14 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:5732 user:percona host:localhost.localdomain*/;

Here, the output is less promising: it suggests deleting the rows with ID 1 and 7, and performing a REPLACE on ID 3.

Therefore, note that it asks us to delete rows, which is not correct. So, what should we do? After a discussion with the customer (we shared with them everything we gathered), we agreed on the following:

  • We need to add the row with ID=3 to the instances where it’s missing.
  • We need to add the row with ID=1 to the instances where it’s missing.
  • We need to add the row with ID=7 to the instances where it’s missing.
  • We DO NOT need to delete any rows; no DELETES are necessary.

Important note: The client informed us that the data that exists only in one instance and not in the others is necessary. The client also verified that the actions we proposed were correct. Remember: we, as DBAs, are the guardians of the data and its integrity, but not necessarily its owners, and, for ethical reasons, we may not even know its content.

How do we achieve this, then?

Unfortunately, in this case, using pt-table-sync is not recommended since, as we have seen with the --print option, it would perform DELETEs in both cases, regardless of which primary instance we consider the source of truth.

We also cannot execute the REPLACE statements on PS-primary-1 alone because the REPLACE command follows a specific logic, explained here:

MySQL uses the following algorithm for REPLACE (and LOAD DATA … REPLACE):

  1. Try to insert the new row into the table
  2. While the insertion fails because a duplicate-key error occurs for a primary key or unique index:
    1. Delete from the table the conflicting row that has the duplicate key value
    2. Try again to insert the new row into the table

That means that if we execute a REPLACE on PS-primary-1 for a row that does not exist there (ID 3) but does exist on PS-primary-2, it will succeed locally but cause a duplicate key error on PS-primary-2 when it replicates, which I’ll show you below (as the row already exists there):

PS-primary-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 09:20:13');
Query OK, 1 row affected (0.01 sec)

We connected to PS-primary-2, and indeed, we found the duplicate key error:

PS-primary-2 (none)> PAGER grep "Last_Err"; SHOW SLAVE STATUS\G
PAGER set to 'grep "Last_Err"'
                   Last_Errno: 1062
                   Last_Error: Could not execute Write_rows event on table foo.persons; Duplicate entry '3' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000359, end_log_pos 26840
1 row in set (0.01 sec)

The commands we should execute would be the following:

set sql_log_bin=0;
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10');
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10');
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10');

However, as we observed, the first one breaks replication with a duplicate key error when it replicates. So, how do we resolve this? The safest option, in my humble opinion, is to execute the three REPLACE statements on each of the instances, first running this command in each session:

PS-primary-1 > set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)

PS-primary-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-primary-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-primary-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10');
Query OK, 1 row affected (0.01 sec)


PS-primary-1 > Bye


PS-primary-2 > set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)

PS-primary-2 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.00 sec)

PS-primary-2 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-primary-2 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10');
Query OK, 1 row affected (0.01 sec)


PS-primary-2 > Bye


PS-replica-1 > set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)

PS-replica-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-replica-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-replica-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-replica-1 > set global super_read_only=1; set global read_only=1;
Query OK, 0 rows affected (0.00 sec)

Finally, we verify that all instances have the same data, either with the SELECT statement or with pt-table-checksum.

$ for SERVE in 192.168.0.14 192.168.0.59 192.168.0.99; do echo ; echo $SERVE ; mysql -h $SERVE -e "select * from foo.persons"; done

192.168.0.14
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        1 | Iglesias  | Par       | IND     | Jaipur     | 2023-07-27 07:10:10 |
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        3 | Colonel   | Vil       | PER     | Lima       | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
|        7 | Ewd       | Woo       | CAN     | Vancouver  | 2023-07-28 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

192.168.0.59
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        1 | Iglesias  | Par       | IND     | Jaipur     | 2023-07-27 07:10:10 |
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        3 | Colonel   | Vil       | PER     | Lima       | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
|        7 | Ewd       | Woo       | CAN     | Vancouver  | 2023-07-28 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

192.168.0.99
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        1 | Iglesias  | Par       | IND     | Jaipur     | 2023-07-27 07:10:10 |
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        3 | Colonel   | Vil       | PER     | Lima       | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
|        7 | Ewd       | Woo       | CAN     | Vancouver  | 2023-07-28 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

$ pt-table-checksum h=192.168.0.14 --port=3306 --no-check-binlog-format --no-check-replication-filters --replicate percona.checksums --recursion-method=hosts --max-load Threads_running=50 --max-lag=100 --databases=foo --tables=persons

Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
07-28T14:12:32      0      0        7          0       1       0   0.185 foo.persons

$ pt-table-checksum h=192.168.0.59 --port=3306 --no-check-binlog-format --no-check-replication-filters --replicate percona.checksums --recursion-method=hosts --max-load Threads_running=50 --max-lag=100 --databases=foo --tables=persons
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
07-28T14:12:50      0      0        7          0       1       0   0.148 foo.persons

Conclusion

Maintaining this type of topology (dual primaries) can be a headache, and such setups are prone to these problems, so resolving them is not always as simple as it may seem. The tools pt-table-checksum and pt-table-sync are essential for resolving these issues in most cases and for efficient verification, as in this case.

Remember that if you have any questions or concerns, you can always contact us.

Lastly, don’t forget to run pt-table-sync with the --print option to verify, and to share with the client or data owner, that the data to be corrected and the proposed corrections seem appropriate. Always.

Let’s stay synchronized, and I’ll see you in the next blog!

 
