Jun
25
2024
--

MySQL Replication: How To Deal With the ‘Got Fatal Error 1236’ or MY-013114 Error

Replication has been the core functionality enabling high availability in MySQL for decades. However, you may still encounter replication errors that keep you awake at night. One of the most common and most challenging to deal with starts with: “Got fatal error 1236 from source when reading data from binary log“. This blog post is […]

Nov
05
2018
--

How to Quickly Add a Node to an InnoDB Cluster or Group Replication

In this blog, we’ll look at how to quickly add a node to an InnoDB Cluster or Group Replication using Percona XtraBackup.

Adding nodes to a Group Replication cluster can be easy (documented here), but it only works if the existing nodes have retained all the binary logs since the creation of the cluster. Obviously, this is possible if you create a new cluster from scratch. The nodes rotate old logs after some time, however. Technically, if the gtid_purged set is non-empty, you will need another method to add a new node to the cluster. You also need a different method if data becomes inconsistent across cluster nodes for any reason. For example, you might hit something similar to this bug, or fall prey to human error.

Hot Backup to the Rescue

The quick and simple method I’ll present here requires the Percona XtraBackup tool to be installed, as well as some additional small tools for convenience. I tested my example on CentOS 7, but it works similarly on other Linux distributions. First of all, you will need the Percona repository installed:

# yum install http://www.percona.com/downloads/percona-release/redhat/0.1-6/percona-release-0.1-6.noarch.rpm -y -q

Then, install Percona XtraBackup and the additional tools. You might need to enable the EPEL repo for the additional tools, and the experimental Percona repo for XtraBackup 8.0, which works with MySQL 8.0. (Note: XtraBackup 8.0 was still not GA at the time of writing, and we do NOT recommend or advise that you install XtraBackup 8.0 into a production environment until it is GA.) For MySQL 5.7, XtraBackup 2.4 from the regular repo is what you are looking for:

# grep -A3 percona-experimental-\$basearch /etc/yum.repos.d/percona-release.repo
[percona-experimental-$basearch]
name = Percona-Experimental YUM repository - $basearch
baseurl = http://repo.percona.com/experimental/$releasever/RPMS/$basearch
enabled = 1

# yum install pv pigz nmap-ncat percona-xtrabackup-80 -q
==============================================================================================================================================
 Package                             Arch                 Version                             Repository                                 Size
==============================================================================================================================================
Installing:
 nmap-ncat                           x86_64               2:6.40-13.el7                       base                                      205 k
 percona-xtrabackup-80               x86_64               8.0.1-2.alpha2.el7                  percona-experimental-x86_64                13 M
 pigz                                x86_64               2.3.4-1.el7                         epel                                       81 k
 pv                                  x86_64               1.4.6-1.el7                         epel                                       47 k
Installing for dependencies:
 perl-DBD-MySQL                      x86_64               4.023-6.el7                         base                                      140 k
Transaction Summary
==============================================================================================================================================
Install  4 Packages (+1 Dependent package)
Is this ok [y/d/N]: y
#

You need to do it on both the source and destination nodes. Now, my existing cluster node (I will call it a donor) – gr01 looks like this:

gr01 > select * from performance_schema.replication_group_members\G
*************************** 1. row ***************************
  CHANNEL_NAME: group_replication_applier
     MEMBER_ID: 76df8268-c95e-11e8-b55d-525400cae48b
   MEMBER_HOST: gr01
   MEMBER_PORT: 3306
  MEMBER_STATE: ONLINE
   MEMBER_ROLE: PRIMARY
MEMBER_VERSION: 8.0.13
1 row in set (0.00 sec)
gr01 > show global variables like 'gtid%';
+----------------------------------+-----------------------------------------------+
| Variable_name                    | Value                                         |
+----------------------------------+-----------------------------------------------+
| gtid_executed                    | aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-302662 |
| gtid_executed_compression_period | 1000                                          |
| gtid_mode                        | ON                                            |
| gtid_owned                       |                                               |
| gtid_purged                      | aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-295538 |
+----------------------------------+-----------------------------------------------+
5 rows in set (0.01 sec)

The new node candidate (I will call it a joiner) – gr02 – has no data but has the same MySQL version installed. It also has the required settings in place, like the existing node address in group_replication_group_seeds, etc. The next step is to stop the MySQL service on the joiner (if already running) and wipe out its datadir:

[root@gr02 ~]# rm -fr /var/lib/mysql/*

and start the “listener” process that waits to receive the data snapshot (remember to open the TCP port if you have a firewall):

[root@gr02 ~]# nc -l -p 4444 |pv| unpigz -c | xbstream -x -C /var/lib/mysql
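
If firewalld is active on the joiner (the CentOS 7 default), a quick way to open the port just for the duration of the transfer could look like the example below. This is a hedged sketch assuming the default public zone; the rule is runtime-only and disappears on the next firewalld reload:

[root@gr02 ~]# firewall-cmd --zone=public --add-port=4444/tcp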

Then, start the backup job on the donor:

[root@gr01 ~]# xtrabackup --user=root --password=*** --backup --parallel=4 --stream=xbstream --target-dir=./ 2> backup.log |pv|pigz -c --fast| nc -w 2 192.168.56.98 4444
240MiB 0:00:02 [81.4MiB/s] [ <=>

On the joiner side, we will see:

[root@gr02 ~]# nc -l -p 4444 |pv| unpigz -c | xbstream -x -C /var/lib/mysql
21.2MiB 0:03:30 [ 103kiB/s] [ <=> ]
[root@gr02 ~]# du -hs /var/lib/mysql
241M /var/lib/mysql

BTW, if you noticed the difference in transfer rate between the two, note that on the donor side I put |pv| before the compressor, while on the joiner it sits before the decompressor. This way, I can monitor the compression ratio at the same time!

The next step will be to prepare the backup on the joiner:

[root@gr02 ~]# xtrabackup --use-memory=1G --prepare --target-dir=/var/lib/mysql 2>prepare.log
[root@gr02 ~]# tail -1 prepare.log
181019 19:18:56 completed OK!

and fix the files ownership:

[root@gr02 ~]# chown -R mysql:mysql /var/lib/mysql

Now we should verify the GTID position information and restart the joiner (I have group_replication_start_on_boot=off in my.cnf):

[root@gr02 ~]# cat /var/lib/mysql/xtrabackup_binlog_info
binlog.000023 893 aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-302662
[root@gr02 ~]# systemctl restart mysqld

Now, let’s check if the position reported by the node is consistent with the above:

gr02 > select @@GLOBAL.gtid_executed;
+-----------------------------------------------+
| @@GLOBAL.gtid_executed                        |
+-----------------------------------------------+
| aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-302660 |
+-----------------------------------------------+
1 row in set (0.00 sec)

No, it is not. We have to correct it:

gr02 > reset master; set global gtid_purged="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-302662";
Query OK, 0 rows affected (0.05 sec)
Query OK, 0 rows affected (0.00 sec)
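
Before starting replication, you can re-run the earlier check to confirm the correction took effect; it should now report the same set we saw in xtrabackup_binlog_info (ending at 302662):

gr02 > select @@GLOBAL.gtid_executed;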

Finally, start the replication:

gr02 > START GROUP_REPLICATION;
Query OK, 0 rows affected (3.91 sec)

Let’s check the cluster status again:

gr01 > select * from performance_schema.replication_group_members\G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
MEMBER_ID: 76df8268-c95e-11e8-b55d-525400cae48b
MEMBER_HOST: gr01
MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
MEMBER_ROLE: PRIMARY
MEMBER_VERSION: 8.0.13
*************************** 2. row ***************************
CHANNEL_NAME: group_replication_applier
MEMBER_ID: a60a4124-d3d4-11e8-8ef2-525400cae48b
MEMBER_HOST: gr02
MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
MEMBER_ROLE: SECONDARY
MEMBER_VERSION: 8.0.13
2 rows in set (0.00 sec)
gr01 > select * from performance_schema.replication_group_member_stats\G
*************************** 1. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 15399708149765074:4
                                 MEMBER_ID: 76df8268-c95e-11e8-b55d-525400cae48b
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 3
                  COUNT_CONFLICTS_DETECTED: 0
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 0
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-302666
            LAST_CONFLICT_FREE_TRANSACTION: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:302665
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 2
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 3
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
*************************** 2. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 15399708149765074:4
                                 MEMBER_ID: a60a4124-d3d4-11e8-8ef2-525400cae48b
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 0
                  COUNT_CONFLICTS_DETECTED: 0
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 0
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-302666
            LAST_CONFLICT_FREE_TRANSACTION:
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 0
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 0
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
2 rows in set (0.00 sec)

OK, our cluster is consistent! The new node joined successfully as a secondary. We can proceed to add more nodes!

Oct
01
2018
--

How To Fix MySQL Replication After an Incompatible DDL Command

MySQL supports replicating to a slave that is one release higher. This allows us to easily upgrade our MySQL setup to a new version by promoting the slave and pointing the application to it. However, though unsupported, there are times when the deployed slave’s MySQL version is one release lower. In this scenario, if your application performed much better on an older version of MySQL, you would like a convenient option to downgrade. You can simply promote the slave to get the old performance back.

The MySQL manual says that ROW-based replication can be used to replicate to a lower version, provided that none of the replicated DDL statements are incompatible with the slave. One such incompatible command is ALTER USER, which is a new feature in MySQL 5.7 and not available on 5.6:

ALTER USER 'testuser'@'localhost' IDENTIFIED BY 'testuser';

Executing that command would break replication. Here is an example of a broken slave in non-GTID replication:

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 5723
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 36915649
               Relay_Log_File: mysql_sandbox5641-relay-bin.000006
                Relay_Log_Pos: 36174552
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
*** redacted ***
                   Last_Errno: 1064
                   Last_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8''
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 36174373
              Relay_Log_Space: 36916179
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
*** redacted ***
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1064
               Last_SQL_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8''
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 00005723-0000-0000-0000-000000005723
*** redacted ***
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 180918 22:03:40
*** redacted ***
                Auto_Position: 0
1 row in set (0.00 sec)

Skipping the statement does not resume replication:

mysql> STOP SLAVE;
Query OK, 0 rows affected (0.02 sec)
mysql> SET GLOBAL sql_slave_skip_counter=1;
Query OK, 0 rows affected (0.00 sec)
mysql> START SLAVE;
Query OK, 0 rows affected (0.01 sec)
mysql> SHOW SLAVE STATUS\G

Fixing non-GTID replication

When you check slave status, replication still isn’t fixed. To fix it, you must manually skip to the next binary log position. The current binary log (Relay_Master_Log_File) and position (Exec_Master_Log_Pos) executed are mysql-bin.000002 and 36174373 respectively. We can use mysqlbinlog on the master to determine the next position:

mysqlbinlog -v --base64-output=DECODE-ROWS --start-position=36174373 /ssd/sandboxes/msb_5_7_23/data/mysql-bin.000002 | head -n 30
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 36174373
#180918 22:03:40 server id 1  end_log_pos 36174438 CRC32 0xc7e1e553 	Anonymous_GTID	last_committed=19273	sequence_number=19277	rbr_only=no
SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
# at 36174438
#180918 22:03:40 server id 1  end_log_pos 36174621 CRC32 0x2e5bb235 	Query	thread_id=563	exec_time=0	error_code=0
SET TIMESTAMP=1537279420/*!*/;
SET @@session.pseudo_thread_id=563/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1436549152/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8'
/*!*/;
# at 36174621
#180918 22:03:40 server id 1  end_log_pos 36174686 CRC32 0x86756b3f 	Anonymous_GTID	last_committed=19275	sequence_number=19278	rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
# at 36174686
#180918 22:03:40 server id 1  end_log_pos 36174760 CRC32 0x30e663f9 	Query	thread_id=529	exec_time=0	error_code=0
SET TIMESTAMP=1537279420/*!*/;
BEGIN
/*!*/;
# at 36174760
#180918 22:03:40 server id 1  end_log_pos 36174819 CRC32 0x48054daf 	Table_map: `sbtest`.`sbtest1` mapped to number 226

Based on the output above, the next binary log position is 36174621. To fix the slave, run:

STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=36174621;
START SLAVE;

Verify if the slave threads are now running by executing SHOW SLAVE STATUS\G

Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 5723
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 306841423
               Relay_Log_File: mysql_sandbox5641-relay-bin.000002
                Relay_Log_Pos: 190785290
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
*** redacted ***
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 226959625
              Relay_Log_Space: 270667273
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
*** redacted ***
        Seconds_Behind_Master: 383
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 00005723-0000-0000-0000-000000005723
             Master_Info_File: /ssd/sandboxes/msb_5_6_41/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Opening tables
           Master_Retry_Count: 86400
*** redacted ***
                Auto_Position: 0

To make the slave consistent with the master, execute the compatible query on the slave.

SET SESSION sql_log_bin = 0;
GRANT USAGE ON *.* TO 'testuser'@'localhost' IDENTIFIED BY 'testuser';
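
If you plan to keep using the same session afterwards, you may want to re-enable binary logging for it. Since sql_log_bin was changed only at the session level, this is a small optional follow-up:

SET SESSION sql_log_bin = 1;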

Done.

GTID replication

For GTID replication, in addition to injecting an empty transaction for the offending statement, you’ll need to skip it using the non-GTID solution provided above. Once replication is running again, flip it back to GTID.

Here’s an example of a broken GTID slave:

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 5723
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000003
          Read_Master_Log_Pos: 14364967
               Relay_Log_File: mysql_sandbox5641-relay-bin.000002
                Relay_Log_Pos: 8630318
        Relay_Master_Log_File: mysql-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
*** redacted ***
                   Last_Errno: 1064
                   Last_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8''
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 12468343
              Relay_Log_Space: 10527158
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
*** redacted ***
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1064
               Last_SQL_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8''
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 00005723-0000-0000-0000-000000005723
             Master_Info_File: /ssd/sandboxes/msb_5_6_41/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 180918 22:32:28
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 00005723-0000-0000-0000-000000005723:2280-8530
            Executed_Gtid_Set: 00005723-0000-0000-0000-000000005723:1-7403
                Auto_Position: 1
1 row in set (0.00 sec)
mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
+---------------+---------------------------------------------+
| Variable_name | Value                                       |
+---------------+---------------------------------------------+
| gtid_executed | 00005723-0000-0000-0000-000000005723:1-7403 |
+---------------+---------------------------------------------+
1 row in set (0.00 sec)

Since the last sequence executed is 7403, you’ll need to create an empty transaction for the offending sequence, 7404.

STOP SLAVE;
SET GTID_NEXT='00005723-0000-0000-0000-000000005723:7404';
BEGIN;
COMMIT;
SET GTID_NEXT=AUTOMATIC;
START SLAVE;

Note: If you have MTS enabled, you can also get the offending GTID coordinates from Last_SQL_Error of SHOW SLAVE STATUS\G
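
For example, a quick way to pull just the relevant fields out of the slave status from the shell might look like this (assuming client credentials are already configured, e.g. in ~/.my.cnf):

mysql -e "SHOW SLAVE STATUS\G" | grep -E 'Last_SQL_Error|Retrieved_Gtid_Set|Executed_Gtid_Set'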

The next step is to find the next binary log position. The current binary log (Relay_Master_Log_File) and position (Exec_Master_Log_Pos) executed are mysql-bin.000003 and 12468343 respectively. We can again use mysqlbinlog on the master to determine the next position:

mysqlbinlog -v --base64-output=DECODE-ROWS --start-position=12468343 /ssd/sandboxes/msb_5_7_23/data/mysql-bin.000003 | head -n 30
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 12468343
#180918 22:32:19 server id 1  end_log_pos 12468408 CRC32 0x259ee085 	GTID	last_committed=7400	sequence_number=7404	rbr_only=no
SET @@SESSION.GTID_NEXT= '00005723-0000-0000-0000-000000005723:7404'/*!*/;
# at 12468408
#180918 22:32:19 server id 1  end_log_pos 12468591 CRC32 0xb349ad80 	Query	thread_id=142	exec_time=0	error_code=0
SET TIMESTAMP=1537281139/*!*/;
SET @@session.pseudo_thread_id=142/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1436549152/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8'
/*!*/;
# at 12468591
#180918 22:32:19 server id 1  end_log_pos 12468656 CRC32 0xb2019f3f 	GTID	last_committed=7400	sequence_number=7405	rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= '00005723-0000-0000-0000-000000005723:7405'/*!*/;
# at 12468656
#180918 22:32:19 server id 1  end_log_pos 12468730 CRC32 0x76b5ea6c 	Query	thread_id=97	exec_time=0	error_code=0
SET TIMESTAMP=1537281139/*!*/;
BEGIN
/*!*/;
# at 12468730
#180918 22:32:19 server id 1  end_log_pos 12468789 CRC32 0x48f0ba6d 	Table_map: `sbtest`.`sbtest8` mapped to number 115

Based on the output above, the next binary log position is 12468591. To fix the slave, run:

STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=12468591, MASTER_AUTO_POSITION=0;
START SLAVE;

Notice that I added MASTER_AUTO_POSITION=0 above to disable GTID replication for now. You can run SHOW SLAVE STATUS\G to verify that replication is running fine:

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 5723
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000003
          Read_Master_Log_Pos: 446194575
               Relay_Log_File: mysql_sandbox5641-relay-bin.000002
                Relay_Log_Pos: 12704248
        Relay_Master_Log_File: mysql-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
*** redacted ***
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 25172522
              Relay_Log_Space: 433726939
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
*** redacted ***
        Seconds_Behind_Master: 2018
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 00005723-0000-0000-0000-000000005723
             Master_Info_File: /ssd/sandboxes/msb_5_6_41/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Reading event from the relay log
           Master_Retry_Count: 86400
*** redacted ***
           Retrieved_Gtid_Set: 00005723-0000-0000-0000-000000005723:7405-264930
            Executed_Gtid_Set: 00005723-0000-0000-0000-000000005723:1-14947
                Auto_Position: 0

Since it’s running fine, you can now revert to GTID replication:

STOP SLAVE;
CHANGE MASTER TO MASTER_AUTO_POSITION=1;
START SLAVE;

Finally, to make the slave consistent with the master, execute the compatible query on the slave.

SET SESSION sql_log_bin = 0;
GRANT USAGE ON *.* TO 'testuser'@'localhost' IDENTIFIED BY 'testuser';

Summary

In this article, I’ve shared how to fix replication when it breaks due to an incompatible command being replicated to the slave. So far, I’ve only identified ALTER USER as an incompatible command for 5.6. If there are other incompatible commands, please share them in the comments section. Thanks in advance.

Update:
I filed a bug at https://bugs.mysql.com/bug.php?id=92629 to verify whether the errors I’ve encountered here are a bug or undocumented behavior.

Jul
02
2018
--

Fixing ER_MASTER_HAS_PURGED_REQUIRED_GTIDS when pointing a slave to a different master

GTID replication has made it convenient to set up and maintain MySQL replication. You need not worry about the binary log file and position, thanks to GTID and auto-positioning. However, things can go wrong when pointing a slave to a different master. Consider a situation where the new master has executed transactions that haven’t been executed on the old master. If the corresponding binary logs have been purged already, how do you point the slave to the new master?

The scenario

Based on technical requirements and architectural change, there is a need to point the slave to a different master by

  1. Pointing it to another node in a PXC cluster
  2. Pointing it to another master in master/master replication
  3. Pointing it to another slave of a master
  4. Pointing it to the slave of a slave of the master … and so on and so forth.

Theoretically, pointing to a new master with GTID replication is easy. All you have to do is run:

STOP SLAVE;
CHANGE MASTER TO MASTER_HOST='new_master_ip';
START SLAVE;
SHOW SLAVE STATUS\G

Alas, in some cases, replication breaks due to missing binary logs:

*************************** 1. row ***************************
Slave_IO_State:
Master_Host: pxc_57_5
Master_User: repl
Master_Port: 3306
**redacted**
Slave_IO_Running: No
Slave_SQL_Running: Yes
** redacted **
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
** redacted **
Master_Server_Id: 1
Master_UUID: 4998aaaa-6ed5-11e8-948c-0242ac120007
Master_Info_File: /var/lib/mysql/master.info
** redacted **
Last_IO_Error_Timestamp: 180613 08:08:20
Last_SQL_Error_Timestamp:
** redacted **
Retrieved_Gtid_Set:
Executed_Gtid_Set: 1904cf31-912b-ee17-4906-7dae335b4bfc:1-3
Auto_Position: 1

The strange issue here is that if you point the slave back to the old master, replication works just fine. The error says that there are missing binary logs in the new master that the slave needs. If there’s no problem with replication performance and the slave can easily catch up, then it looks like there are transactions executed in the new master that have not been executed in the old master but are recorded in the missing binary logs. The binary logs are most likely lost due to manually purging with PURGE BINARY LOGS or automatic purging if expire_logs_days is set.
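
A quick way to check whether automatic purging could be the culprit on the new master is to look at the retention setting and the binary logs still on disk (expire_logs_days exists on MySQL 5.6/5.7; MySQL 8.0 also exposes binlog_expire_logs_seconds):

mysql> SHOW GLOBAL VARIABLES LIKE 'expire_logs_days';
mysql> SHOW BINARY LOGS;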

At this point, it would be prudent to check and sync old master and new master with tools such as pt-table-checksum and pt-table-sync. However, if a consistency check has been performed and no differences have been found, or there’s confidence that the new master is a good copy—such as another node in the PXC cluster—you can follow the steps below to resolve the problem.
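
As a rough sketch of what such a consistency check could look like, assuming the two servers are still connected by replication so the checksum queries can propagate (host and user names below are hypothetical):

pt-table-checksum h=new_master,u=checksum_user,p=secret --no-check-binlog-format
# Preview, without applying, any differences recorded in the checksums table:
pt-table-sync --print --replicate=percona.checksums h=new_master,u=checksum_user,p=secret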

Solution

To solve the problem, the slave needs to execute the missing transactions. But since these transactions have been purged, the steps below provide the workaround.

Step 1: Find the GTID sequences that were purged from the new master and are needed by the slave

To identify which GTID sequences are missing, run SHOW GLOBAL VARIABLES LIKE 'gtid_purged'; and SHOW MASTER STATUS; on the new master and SHOW GLOBAL VARIABLES LIKE 'gtid_executed'; on the slave:

New Master:

mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_purged';
+---------------+-------------------------------------------------------------------------------------+
| Variable_name | Value |
+---------------+-------------------------------------------------------------------------------------+
| gtid_purged | 1904cf31-912b-ee17-4906-7dae335b4bfc:1-2,
4998aaaa-6ed5-11e8-948c-0242ac120007:1-11 |
+---------------+-------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+-------------------------------------------------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------------------------------------------------------------------------+
| mysql-bin.000004 | 741 | | | 1904cf31-912b-ee17-4906-7dae335b4bfc:1-6,
4998aaaa-6ed5-11e8-948c-0242ac120007:1-11 |
+------------------+----------+--------------+------------------+-------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Slave:

mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
+---------------+------------------------------------------+
| Variable_name | Value |
+---------------+------------------------------------------+
| gtid_executed | 1904cf31-912b-ee17-4906-7dae335b4bfc:1-3 |
+---------------+------------------------------------------+
1 row in set (0.00 sec)

Take note that 1904cf31-912b-ee17-4906-7dae335b4bfc and 4998aaaa-6ed5-11e8-948c-0242ac120007 are UUIDs, each referring to the MySQL instance where the transaction originated.

Based on the output:

  • The slave has executed 1904cf31-912b-ee17-4906-7dae335b4bfc:1-3
  • The new master has executed 1904cf31-912b-ee17-4906-7dae335b4bfc:1-6 and 4998aaaa-6ed5-11e8-948c-0242ac120007:1-11
  • The new master has purged 1904cf31-912b-ee17-4906-7dae335b4bfc:1-2 and 4998aaaa-6ed5-11e8-948c-0242ac120007:1-11

This means that the slave has no issue with 1904cf31-912b-ee17-4906-7dae335b4bfc: it requires sequences 4-6, and sequences 3-6 are still available on the master. However, the slave cannot fetch sequences 1-11 of 4998aaaa-6ed5-11e8-948c-0242ac120007, because these have been purged from the master.

To summarize, the missing GTID sequences are 4998aaaa-6ed5-11e8-948c-0242ac120007:1-11.

Step 2: Identify where the purged GTID sequences came from

The SHOW SLAVE STATUS output in the introduction section shows that the Master_UUID is 4998aaaa-6ed5-11e8-948c-0242ac120007, which means the new master is the source of the missing transactions. You can also verify the new master’s UUID by running SHOW GLOBAL VARIABLES LIKE 'server_uuid';

mysql> SHOW GLOBAL VARIABLES LIKE 'server_uuid';
+---------------+--------------------------------------+
| Variable_name | Value |
+---------------+--------------------------------------+
| server_uuid | 4998aaaa-6ed5-11e8-948c-0242ac120007 |
+---------------+--------------------------------------+
1 row in set (0.00 sec)

If the new master’s UUID does not match the missing GTID, it is most likely that the missing sequences came from its old master, another master higher up the chain, or another PXC node. If that other master still exists, you can run the same query on it to check.

The missing sequences here are a small range, 1-11. Typically, transactions executed locally like this come from performing maintenance directly on that server: for example, creating users, fixing privileges or updating passwords. However, you have no guarantee that this is the reason, since the binary logs have already been purged. If you still want to point the slave to the new master, proceed to step 3 or step 4.

Step 3. Injecting the missing transactions on the slave with empty transactions

The workaround is to pretend that those missing GTID sequences have been executed on the slave by injecting 11 empty transactions as instructed here by running:

SET GTID_NEXT='UUID:SEQUENCE_NO';
BEGIN;COMMIT;
SET GTID_NEXT='AUTOMATIC';

It looks tedious, but a simple script can automate this:

cat empty_transaction_generator.sh
#!/bin/bash
# Emit one empty transaction (SET GTID_NEXT + BEGIN;COMMIT) per sequence in the given range.
uuid=$1
first_sequence_no=$2
last_sequence_no=$3
while [ "$first_sequence_no" -le "$last_sequence_no" ]
do
  echo "SET GTID_NEXT='$uuid:$first_sequence_no';"
  echo "BEGIN;COMMIT;"
  first_sequence_no=`expr $first_sequence_no + 1`
done
echo "SET GTID_NEXT='AUTOMATIC';"
bash empty_transaction_generator.sh 4998aaaa-6ed5-11e8-948c-0242ac120007 1 11
SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:1';
BEGIN;COMMIT;
SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:2';
BEGIN;COMMIT;
SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:3';
BEGIN;COMMIT;
SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:4';
BEGIN;COMMIT;
SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:5';
BEGIN;COMMIT;
SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:6';
BEGIN;COMMIT;
SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:7';
BEGIN;COMMIT;
SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:8';
BEGIN;COMMIT;
SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:9';
BEGIN;COMMIT;
SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:10';
BEGIN;COMMIT;
SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:11';
BEGIN;COMMIT;
SET GTID_NEXT='AUTOMATIC';

Before executing the generated output on the slave, stop replication first:

mysql> STOP SLAVE;
Query OK, 0 rows affected (0.00 sec)
mysql> SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:1';
Query OK, 0 rows affected (0.00 sec)
mysql> BEGIN;COMMIT;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
mysql> SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:2';
Query OK, 0 rows affected (0.00 sec)
mysql> BEGIN;COMMIT;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
mysql> SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:3';
Query OK, 0 rows affected (0.00 sec)
mysql> BEGIN;COMMIT;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
mysql> SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:4';
Query OK, 0 rows affected (0.00 sec)
mysql> BEGIN;COMMIT;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
mysql> SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:5';
Query OK, 0 rows affected (0.00 sec)
mysql> BEGIN;COMMIT;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
mysql> SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:6';
Query OK, 0 rows affected (0.00 sec)
mysql> BEGIN;COMMIT;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
mysql> SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:7';
Query OK, 0 rows affected (0.00 sec)
mysql> BEGIN;COMMIT;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
mysql> SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:8';
Query OK, 0 rows affected (0.00 sec)
mysql> BEGIN;COMMIT;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
mysql> SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:9';
Query OK, 0 rows affected (0.00 sec)
mysql> BEGIN;COMMIT;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
mysql> SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:10';
Query OK, 0 rows affected (0.00 sec)
mysql> BEGIN;COMMIT;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
mysql> SET GTID_NEXT='4998aaaa-6ed5-11e8-948c-0242ac120007:11';
Query OK, 0 rows affected (0.00 sec)
mysql> BEGIN;COMMIT;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
mysql> SET GTID_NEXT='AUTOMATIC';
Query OK, 0 rows affected (0.00 sec)

There’s also an even easier way to inject the empty transactions: the mysqlslavetrx tool from MySQL Utilities. Stop the slave first, then run mysqlslavetrx --gtid-set=4998aaaa-6ed5-11e8-948c-0242ac120007:1-11 --slaves=root:password@:3306 to achieve the same result as above.

By running SHOW GLOBAL VARIABLES LIKE 'gtid_executed'; on the slave you can see that sequences 4998aaaa-6ed5-11e8-948c-0242ac120007:1-11 have been executed already:

mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
+---------------+-------------------------------------------------------------------------------------+
| Variable_name | Value |
+---------------+-------------------------------------------------------------------------------------+
| gtid_executed | 1904cf31-912b-ee17-4906-7dae335b4bfc:1-3,
4998aaaa-6ed5-11e8-948c-0242ac120007:1-11 |
+---------------+-------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

Resume replication and check if replication is healthy by running START SLAVE; and SHOW SLAVE STATUS\G

mysql> START SLAVE;
Query OK, 0 rows affected (0.01 sec)
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: pxc_57_5
Master_User: repl
Master_Port: 3306
** redacted **
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
** redacted **
Seconds_Behind_Master: 0
** redacted **
Master_Server_Id: 1
Master_UUID: 4998aaaa-6ed5-11e8-948c-0242ac120007
** redacted **
Retrieved_Gtid_Set: 1904cf31-912b-ee17-4906-7dae335b4bfc:4-6
Executed_Gtid_Set: 1904cf31-912b-ee17-4906-7dae335b4bfc:1-6,
4998aaaa-6ed5-11e8-948c-0242ac120007:1-11
Auto_Position: 1
** redacted **
1 row in set (0.00 sec)

At this point, we have already solved the problem. However, there’s another way to restore the slave much faster but at the cost of erasing all the existing binary logs on the slave as mentioned in this article. If you want to do this, proceed to step 4.

Step 4. Add the missing sequences to GTID_EXECUTED by modifying GTID_PURGED.

CRITICAL NOTE:
If you followed the steps in Step 3, you do not need to perform Step 4!

To add the missing transactions, you’ll need to stop the slave, reset the master, and place the original value of gtid_executed plus the missing sequences in the gtid_purged variable. A word of caution on using this method: it will purge the existing binary logs of the slave.

mysql> STOP SLAVE;
Query OK, 0 rows affected (0.02 sec)
mysql> RESET MASTER;
Query OK, 0 rows affected (0.02 sec)
mysql> SET GLOBAL gtid_purged="1904cf31-912b-ee17-4906-7dae335b4bfc:1-3,4998aaaa-6ed5-11e8-948c-0242ac120007:1-11";
Query OK, 0 rows affected (0.02 sec)

Similar to Step 3, running SHOW GLOBAL VARIABLES LIKE 'gtid_executed'; on the slave shows that sequences 4998aaaa-6ed5-11e8-948c-0242ac120007:1-11 have been executed already:

mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
+---------------+-------------------------------------------------------------------------------------+
| Variable_name | Value |
+---------------+-------------------------------------------------------------------------------------+
| gtid_executed | 1904cf31-912b-ee17-4906-7dae335b4bfc:1-3,
4998aaaa-6ed5-11e8-948c-0242ac120007:1-11 |
+---------------+-------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

Run START SLAVE; and SHOW SLAVE STATUS\G to resume replication and check if replication is healthy:

mysql> START SLAVE;
Query OK, 0 rows affected (0.01 sec)
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: pxc_57_5
Master_User: repl
Master_Port: 3306
** redacted **
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
** redacted **
Seconds_Behind_Master: 0
** redacted **
Master_Server_Id: 1
Master_UUID: 4998aaaa-6ed5-11e8-948c-0242ac120007
** redacted **
Retrieved_Gtid_Set: 1904cf31-912b-ee17-4906-7dae335b4bfc:4-6
Executed_Gtid_Set: 1904cf31-912b-ee17-4906-7dae335b4bfc:1-6,
4998aaaa-6ed5-11e8-948c-0242ac120007:1-11
Auto_Position: 1
** redacted **
1 row in set (0.00 sec)

Step 5. Done

Summary

In this article, I demonstrated how to point the slave to a new master even if that master is missing some of the binary logs the slave needs to execute. Although it is possible to do so with the workarounds shared above, it is prudent to check the consistency of the old and new master before switching the slave to the new master.

Mar
13
2018
--

The Multi-Source GTID Replication Maze

In this blog post, we’ll look at how to navigate some of the complexities of multi-source GTID replication.

GTID replication is often a real challenge for DBAs, especially if this has to do with multi-source GTID replication. A while back, I came across a really interesting customer environment with shards where multi-master, multi-source, multi-threaded MySQL 5.6 MIXED replication was active. This is a highly complex environment that has both pros and cons, introducing risks as a trade-off for specific customer requirements.

This is the setup of part of this environment:

I started looking into this setup when a statement broke replication between db1 and db10. Replication broke due to a statement executed on a schema that was not present on db10. This also meant that changes originating from db1 were no longer being pushed down to db100 via db10, since we had stopped the replication thread on db10 (for the db1 channel).

On the other hand, replication was not stopped on db2 because the schema in question was present on db2. Replication between db2 and db20 was broken as well because the schema was not present in db20.

In order to fix db1->db10 replication, four GTID sets were injected in db10.

Here are some interesting blog posts regarding how to handle/fix GTID replication issues:

After injecting the GTID sets, we started replication again and everything ran fine.

 

After that, we had to check the db2->db20 replication, which, as I’ve already said, was broken as well. In this case, injecting only the first GTID trx into db20 instead of all of those causing issues on db10 was enough!

You may wonder how this is possible. Right? The answer is that the rest of them were replicated from db10 to db20, although the channel was not the same.

Another strange thing is the fact that although the replication thread for the db2->db20 channel was stopped (broken), checking the slave status on db20 showed that Executed_Gtid_Set was moving for all channels even though Retrieved_Gtid_Set for the broken one was stopped! So what was happening there?

This raised my curiosity, so I decided to do some further investigation and created scenarios regarding other strange things that could happen. An interesting one was about the replication filters. In our case, I thought “What would happen in the following scenario … ?”

Let’s say we write a row from db1 to db123.table789. This row is replicated to db10 (let’s say using channel 1) and to db2 (let’s say using channel 2). On channel 1, we filter out the db123.% tables; on channel 2 we don’t. db1 writes the row and the entry to the binary log. db2 writes the row after reading the entry from the binary log, subsequently writes the entry to its own binary log, and replicates this change to db20. This change is also replicated to db10. So now, on db10 (depending on which channel finds the GTID first), the transaction either gets filtered on channel 1 and written to its own binlog as just a start/commit with any actual DDL/DML removed, or, if it is read first on channel 2 (db1->db2 and then db20->db10), it is NOT filtered out and is executed instead. Is this correct? It definitely ISN’T!

Points of interest

You can find answers to the above questions in the points of interest listed below. Although it’s not really clear from the official documentation, this is what happens with GTID replication and multi-source GTID replication:

  • As we know, GTID sets are unique across all nodes in a given cluster. In multi-source replication, Executed_Gtid_Set is common for all channels. This means that regardless of the originating channel, when a GTID transaction is executed it is recorded in all channels’ Executed_Gtid_Set (you can inspect this per channel, as shown in the example after this list). Although it’s logical (each database is unique, so if a trx is going to affect a database it shouldn’t be tied to a single channel regardless of the channel it uses), the documentation doesn’t provide much info about this.
  • When we have multi-source, multi-level replication, there are cases where the GTID sets originating from one master can end up on one slave via different replication paths. It’s not clear if any special algorithm applies (although it doesn’t seem that there could be one), but the preferred method seems to be FIFO. The fastest wins! This means that GTID sets can travel to the slave via different channels, and it’s related to how fast the upper-level slaves can commit changes. In fact, the path doesn’t really matter, as each GTID trx is executed only once.
  • Replication filters are global regardless of the channel. This means each filter applies to all channels. This is normal, as we can’t define a replication filter per channel. In order to be able to debug such cases, adding a small replication delay per channel seems a good idea.
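
To observe this behavior on a multi-source slave, you can compare the per-channel received sets against the single global executed set. A hedged example for MySQL 5.7 or newer, where named channels exist (the channel name 'db1' is hypothetical):

mysql> SELECT CHANNEL_NAME, RECEIVED_TRANSACTION_SET FROM performance_schema.replication_connection_status\G
mysql> SHOW SLAVE STATUS FOR CHANNEL 'db1'\G
mysql> SELECT @@GLOBAL.gtid_executed\G
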
Oct
23
2017
--

MySQL Point in Time Recovery the Right Way

In this blog, I’ll look at how to do MySQL point in time recovery (PITR) correctly.

Sometimes we need to restore from a backup, and then replay the transactions that happened after the backup was taken. This is a common procedure in most disaster recovery plans, when for example you accidentally drop a table/database or run an update/delete without the “where” clause and lose data.

The usual way is to get a copy of your binlogs and use mysqlbinlog to replay those transactions. But this approach has many pitfalls that can make the whole PITR process a nightmare. Some examples:

  • You need to make sure to run a single mysqlbinlog command with all related binlogs and pipe them to mysql at once. Otherwise, if binlog.000001 creates a temporary table and binlog.000002 requires that temporary table, it will not be present, because each invocation of the mysql client creates a new connection:
shell> mysqlbinlog binlog.000001 | mysql -u root -p # Creates tmp table X
shell> mysqlbinlog binlog.000002 | mysql -u root -p # Uses tmp table X

  • We can say that it has to be an atomic operation. If it fails halfway through, it will be very difficult to know where it failed and even more difficult to resume from that point forward. There are many reasons for it to fail: an InnoDB lock wait timeout / deadlock with some concurrent transaction, the server and client having different max_allowed_packet values so that you get a Lost connection to MySQL server during query error, and so on.

So how can we overcome those limitations and have a reliable way to do Point In Time Recovery?

We can restore the backup on the desired server, build a second server with just the minimal data MySQL requires, and move all the binary logs to this “fake” server’s datadir. Then we configure the server where we want the PITR to happen as a slave of the fake server. From that point forward, it’s pure MySQL replication doing the work.

To illustrate it, I create a Docker container on the machine. I have Percona Server for MySQL running on the box listening on 3306, and have already restored the backup on it. There is a tarball there with all binlogs required. The saved positions for PITR are as follows:

[root@localhost ~]# cat /var/lib/mysql/xtrabackup_binlog_info
master-bin.000007	1518932

I create a folder to store the Docker MySQL datadir:

mkdir /tmp/pitr
chown -R 1001 /tmp/pitr

I start the Docker container. As we can see from xtrabackup_binlog_info, my binlogs are named master-bin, and I’ll be setting the same server-id as the original master:

docker run --name ps_pitr -v /tmp/pitr:/var/lib/mysql \
  -p 3307:3306 -e MYSQL_ROOT_PASSWORD=secret \
  -d percona/percona-server:5.7.18 \
  --log_bin=master-bin --server-id=10

If you want to make use of GTID, append --gtid-mode=ON --enforce_gtid_consistency=ON to the end of the Docker command.
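
For reference, the full command with the GTID options appended would then look like this (same container name, volume and image as above):

docker run --name ps_pitr -v /tmp/pitr:/var/lib/mysql \
  -p 3307:3306 -e MYSQL_ROOT_PASSWORD=secret \
  -d percona/percona-server:5.7.18 \
  --log_bin=master-bin --server-id=10 \
  --gtid-mode=ON --enforce_gtid_consistency=ON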

The command above starts a MySQL instance, invokes mysqld --initialize, sets the root password to secret, and maps the container’s port 3306 back to my local port 3307. Now I’ll stop it, remove the binlogs that it created, uncompress and move all the required binlogs to its mapped datadir folder, and start it again:

docker stop ps_pitr
rm /tmp/pitr/master-bin.*
tar -zxf binlogs.tgz -C /tmp/pitr
chown -R 1001 /tmp/pitr/master-bin.*
docker start ps_pitr

If it all worked correctly, at this point we can see the full list of binary logs on the Docker container by connecting on port 3307:

mysql -u root -psecret -P 3307 --protocol=TCP -e "SHOW BINARY LOGS"
mysql: [Warning] Using a password on the command line interface can be insecure.
+-------------------+-----------+
| Log_name          | File_size |
+-------------------+-----------+
| master-bin.000005 |  26216208 |
| master-bin.000006 |  26214614 |
| master-bin.000007 |  26214902 |
. . .
| master-bin.000074 |       154 |
+-------------------+-----------+

Now, all we need to do is connect to our server, which has the backup restored, and configure it as a slave from 3307:

mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.7.18-16 Percona Server (GPL), Release 16, Revision d7301f8
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=3307, MASTER_USER='root', MASTER_PASSWORD='secret', MASTER_LOG_FILE='master-bin.000007', MASTER_LOG_POS=1518932;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> START SLAVE;
Query OK, 0 rows affected (0.01 sec)
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 3307
                Connect_Retry: 60
              Master_Log_File: master-bin.000008
          Read_Master_Log_Pos: 449696
               Relay_Log_File: localhost-relay-bin.000002
                Relay_Log_Pos: 28957
        Relay_Master_Log_File: master-bin.000007
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 15217950
              Relay_Log_Space: 11476311
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 4382
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 10
                  Master_UUID: 80b9fe26-a945-11e7-aa1d-0242ac110002
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Opening tables
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.17 sec)
. . .
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 3307
                Connect_Retry: 60
              Master_Log_File: master-bin.000074
          Read_Master_Log_Pos: 154
               Relay_Log_File: localhost-relay-bin.000133
                Relay_Log_Pos: 381
        Relay_Master_Log_File: master-bin.000074
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 154
              Relay_Log_Space: 819
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 10
                  Master_UUID: 80b9fe26-a945-11e7-aa1d-0242ac110002
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.01 sec)

If you want to apply logs up to a particular point in time, you can use mysqlbinlog to verify the last position / GTID that should be applied, and then use START SLAVE UNTIL MASTER_LOG_FILE = 'log_name', MASTER_LOG_POS = log_pos or START SLAVE SQL_THREAD UNTIL SQL_AFTER_GTIDS = 3E11FA47-71CA-11E1-9E33-C80AA9429562:11-56.
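
As a minimal sketch of that approach (the binary log name and position below are taken from the status output above purely for illustration, and the stop datetime is a placeholder; substitute your own values):

# Inspect the binary log up to the desired point in time and note the last position/GTID
mysqlbinlog --base64-output=decode-rows --verbose \
    --stop-datetime="2017-08-01 12:00:00" master-bin.000074 | tail -n 40

mysql> START SLAVE SQL_THREAD UNTIL MASTER_LOG_FILE = 'master-bin.000074', MASTER_LOG_POS = 15217950;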

Special thanks to Marcos Albe, who originally showed me this MySQL point in time recovery approach.

Aug
01
2017
--

Group Replication: the Sweet and the Sour

Group Replication

In this blog, we’ll look at group replication and how it deals with flow control (FC) and replication lag. 

Overview

In the last few months, we had two main actors in the MySQL ecosystem: ProxySQL and Group-Replication (with the evolution to InnoDB Cluster). 

While I have extensively covered the first, my last serious work on Group Replication dates back to a lab version from years past.

Given Oracle’s decision to declare it GA, and Percona’s decision to provide some level of Group Replication support, I decided it was time for me to take a look at it again.

We’ve already seen a lot of coverage of many Group Replication topics. There are articles about Group Replication and performance, Group Replication and basic functionalities (or lack of them, like automatic node provisioning), Group Replication and ProxySQL, and so on.

But one question kept coming up over and over in my mind. If Group Replication and InnoDB Cluster have to work as an alternative to other (virtually) synchronous replication mechanisms, what changes do our customers need to consider if they want to move from one to the other?

Solutions using Galera (like Percona XtraDB Cluster) must take into account a central concept: clusters are data-centric. What matters is the data and the data state. Both must be the same on each node at any given time (commit/apply). To guarantee this, Percona XtraDB Cluster (and other solutions) use a set of data validation and Flow Control processes that work to ensure a consistent cluster data set on each node.

The upshot of this principle is that an application can query ANY node in a Percona XtraDB Cluster and get the same data, or write to ANY node and know that the data is visible everywhere in the cluster at (virtually) the same time.

Last but not least, inconsistent nodes should be excluded and either rebuilt or fixed before rejoining the cluster.

If you think about it, this is very useful. Guaranteeing consistency across nodes allows you to transparently split write/read operations, failover from one node to another with very few issues, and more.

When I conceived of this blog on Group Replication (or InnoDB Cluster), I put myself in the customer’s shoes. I asked myself: “Aside from all the other things we know (see above), what is the real impact of moving from Percona XtraDB Cluster to Group Replication/InnoDB Cluster for my application? Since Group Replication still (basically) uses replication with binlogs and relay logs, is there also a Flow Control mechanism?” An alarm bell started to ring in my mind.

My answer is: “Let’s do a proof of concept (PoC), and see what is really going on.”

The POC

I set up a simple set of servers using Group Replication, with a very basic application performing writes on a single writer node and (eventually) reads on the other nodes. 

You can find the schema definition here. Mainly I used the four tables from my windmills test suite — nothing special or specifically designed for Group Replication. I’ve used this test a lot for Percona XtraDB Cluster in the past, so it was a perfect fit.

Test Definition

The application will do very simple work, and I wanted to test four main cases:

  1. One thread performing one insert at each transaction
  2. One thread performing 50 batched inserts at each transaction
  3. Eight threads performing one insert at each transaction
  4. Eight threads performing 50 batched inserts at each transaction

As you can see, a pretty simple set of operations. Then I decided to test it using the following four conditions on the servers:

  1. One slave worker FC as default
  2. One slave worker FC set to 25
  3. Eight slave workers FC as default
  4. Eight slave workers FC set to 25

Again nothing weird or strange from my point of view. I used four nodes:

  1. Gr1 Writer
  2. Gr2 Reader
  3. Gr3 Reader minimal latency (~10ms)
  4. Gr4 Reader minimal latency (~10ms)

Finally, I had to be sure I measured the lag in a way that allowed me to reference it consistently on all nodes. 

I think we can safely say that the incoming GTID set (Received_transaction_set from performance_schema.replication_connection_status) is the last change applied on the master that the slave node knows about. More recent changes could have occurred, but network delay can prevent them from being “received.” The other point of reference is GTID_EXECUTED, which refers to the latest GTID processed on the node itself.

The closest query that can track this distance is:

select @last_exec:=SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(@@global.GTID_EXECUTED,':',-2),':',1),'-',-1) last_executed;

select @last_rec:=SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(Received_transaction_set,':',-2),':',1),'-',-1) last_received
  FROM performance_schema.replication_connection_status
 WHERE Channel_name = 'group_replication_applier';

select (@last_rec - @last_exec) as real_lag;

Or in the case of a single worker:

select @last_exec:=SUBSTRING_INDEX(SUBSTRING_INDEX(@@global.GTID_EXECUTED,':',-1),'-',-1) last_executed;

select @last_rec:=SUBSTRING_INDEX(SUBSTRING_INDEX(Received_transaction_set,':',-1),'-',-1) last_received
  FROM performance_schema.replication_connection_status
 WHERE Channel_name = 'group_replication_applier';

select (@last_rec - @last_exec) as real_lag;

The result will be something like this:

+---------------+
| last_executed |
+---------------+
| 23607         |
+---------------+
+---------------+
| last_received |
+---------------+
| 23607         |
+---------------+
+----------+
| real_lag |
+----------+
|        0 |
+----------+

The whole set of tests can be found here, with all the commands you need to run the application (you can find it here) and replicate the tests. I will focus on the results (otherwise this blog post would be far too long), but I invite you to see the details.

The Results

Efficiency on Writer by Execution Time and Rows/Sec

Using the raw data from the tests (Excel spreadsheet available here), I was interested in identifying if and how the Writer is affected by the use of Group Replication and flow control.

Reviewing the graph, we can see that the Writer has a linear increase in the execution time (when using default flow control) that matches the increase in the load. Nothing there is concerning, and all-in-all we see what is expected if the load is light. The volume of rows at the end justifies the execution time.

It’s a different scenario if we use flow control. The execution time increases significantly in both cases (single worker/multiple workers). In the worst case (eight threads, 50 inserts batch) it becomes four times higher than the same load without flow control.

What happens to the inserted rows? In the application, I traced the rows inserted/sec. It is easy to see what is going on there:

We can see that the Writer with flow control activated inserts less than a third of the rows it processes without flow control. 

We can definitely say that flow control has a significant impact on the Writer performance. To clarify, let’s look at this graph:

Without flow control, the Writer processes a high volume of rows in a limited amount of time (results from the test of eight workers, eight threads, 50 insert batch). With flow control, the situation changes drastically. The Writer takes a long time processing a significantly smaller number of rows/sec. In short, performance drops significantly.

But hey, I’m OK with that if it means having a consistent data set across all nodes. In the end, Percona XtraDB Cluster and similar solutions pay a significant performance price to match the data-centric principle. 

Let’s see what happens on the other nodes.

Entries Lag

Well, this scenario is not so good:

When NOT using flow control, the nodes lag behind the writer significantly. Remember that by default, the flow control thresholds in Group Replication are set to 25000 entries (yes, 25K entries!).
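
For reference, here is a minimal sketch of how those thresholds can be inspected and lowered to the value used in this PoC (these are the standard Group Replication flow control threshold variables; adjust the value to your own workload):

mysql> SHOW GLOBAL VARIABLES LIKE 'group_replication_flow_control%';
mysql> SET GLOBAL group_replication_flow_control_applier_threshold   = 25;
mysql> SET GLOBAL group_replication_flow_control_certifier_threshold = 25;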

What happens is that as soon as I put some salt (see load) on the Writer, the slave nodes start to lag. When using the default single worker, that has a significant impact. When using multiple workers, we see that the lag happens mainly on the node(s) with minimal (10ms) network latency. The sad thing is that the lag does not really go down compared to the single-threaded worker, indicating that a minimal latency of 10ms is enough to affect replication.

Time to activate the flow control and have no lag:

Unfortunately, this is not the case. As we can see, the lag of the single worker remains high on Gr2 (154 entries). When using multiple workers, the Gr3/Gr4 nodes perform much better, with significantly less lag (but still high, at ~1K entries).

It is important to remember that at this time the Writer is processing one-third or less of the rows it normally can. It is also important to note that I set the flow control entry limit to 25, and the Gr3 (and Gr4) nodes are still lagging more than 1K entries behind.

To clarify, let’s check the two graphs below:

Using the Writer (Master) as a baseline in entry #N, without flow control, the nodes (slaves) using Group Replication start to significantly lag behind the writer (even with a light load).

The distance in this PoC ranged from very minimal (with 58 entries), up to much higher loads (3849 entries):

Using flow control, the Writer (Master) diverges less, as expected. But even though it takes a significant drop in performance (to one-third or less), the nodes still lag. The worst case is up to 1363 entries. 

I need to underline here that we have no further way (that I am aware of, anyway) to tune the lag and prevent it from happening.

This means an application cannot transparently split writes/reads and expect consistency. The gap is too high.

A Graph That Tells Us a Story

I used Percona Monitoring and Management (PMM) to keep an eye on the nodes while doing the tests. One of the graphs really showed me that Group Replication still has some “limits” as the replication mechanism for a cluster:

This graph shows the MySQL queries executed on all four nodes, in the test using eight threads with 50-insert batches and flow control. 

As you can see, the Gr1 (Writer) is the first one to take off, followed by Gr2. Nodes Gr3 and Gr4 require a bit more time, given the binlog transmission (and 10ms delay). Once the data is there, they match (inconsistently) the Gr2 node. This is an effect of flow control asking the Master to slow down. But as previously seen, the nodes never catch up with the Writer. When the load test is over, the nodes continue to process the queue for an additional ~130 seconds. Considering that the whole load takes 420 seconds on the Writer, this means roughly one-third of the Writer’s total time is spent syncing the slaves AFTERWARDS.

The above graph shows the same test without flow control. It is interesting to see how the Writer moved above 300 queries/sec, while Gr2 stayed around 200 and Gr3/4 far below. The Writer was able to process the whole load in ~120 seconds instead of 420, while Gr3/4 continued to process the load for an additional ~360 seconds.

This means that without flow control set, the nodes lag around 360 seconds behind the Master. With flow control set to 25, they lag 130 seconds.

This is a significant gap.

Conclusions

Going back to the reason I started this PoC: it looks like my application(s) are not a good fit for Group Replication, given that I rely on Percona XtraDB Cluster to scale out reads and to efficiently move my writer to another node when I need to. 

Group Replication is still based on asynchronous replication (as my colleague Kenny said). It makes sense in many other cases, but it doesn’t compare to solutions based on virtually synchronous replication. It still requires a lot of refinement.

On the other hand, for applications that can afford to have a significant gap between writers and readers it is probably fine. But … doesn’t standard replication already cover that? 

Reviewing the Oracle documentation (https://dev.mysql.com/doc/refman/5.7/en/group-replication-background.html), I can see why Group Replication as part of InnoDB Cluster could help improve high availability when compared to standard replication. 

But I also think it is important to understand that Group Replication (and derived solutions like InnoDB Cluster) are not comparable to, or a replacement for, data-centric solutions such as Percona XtraDB Cluster. At least not yet.

Good MySQL to everyone.

Jun
20
2017
--

The MySQL High Availability Landscape in 2017 (The Elders)

High Availability

In this blog, we’ll look at different MySQL high availability options.

The MySQL ecosystem is dynamic, with many technologies built around MySQL evolving rapidly. This is especially true for the technologies involved with the high availability (HA) aspects of MySQL. When I joined Percona back in 2009, some of these HA technologies were very popular – but have since been almost forgotten. During the same interval, new technologies have emerged. In order to give some perspective to the reader, and hopefully help to make better choices, I’ll review the MySQL HA landscape as it is in 2017. This review will be in three parts. The first part (this post) will cover the technologies that have been around for a long time: the elders. The second part will focus on the technologies that are very popular today: the adults. Finally, the last part will try to extrapolate which technologies could become popular in the upcoming years: the babies.

Quick disclaimer: I am reporting on the technologies I see the most. There are likely many other solutions not covered here, but I can’t talk about technologies I have barely or never used. Apart from the RDS-related technologies, all the technologies covered are open source. The target audience for this post is people relatively new to MySQL.

The Elders

Let’s define the technologies in the elders group. These are technologies that anyone involved with MySQL for the last ten years is sure to be aware of. I could have called this group the “classics”. I include the following technologies in this group:

  • Replication
  • Shared storage
  • NDB cluster

Let’s review these technologies in the following sections.

Replication

Simple replication topology

 

MySQL replication is very well known. It is one of the main features behind the wide adoption of MySQL. Replication gets used almost everywhere. The reasons for that are numerous:

  • Replication is simple to set up. There are tons of how-to guides and scripts available to add a slave to a MySQL server. With Amazon RDS, adding a slave is just a few clicks.
  • Slaves allow you to easily scale reads. The slaves are accessible and can be used for reads. This is the most common way of scaling up a MySQL database.
  • Slaves have little impact on the master. Apart from the added network traffic, the presence of slaves does not impact the master performance significantly.
  • It is well known. No surprises here.
  • Used for failover. Your master died, promote a slave and use it as your new master.
  • Used for backups. You don’t want to overload your master with the backups, run them off a slave.

Of course, replication also has some issues:

  • Replication can lag. Replication used to be single-threaded. That means a master with a concurrent load could easily outpace a slave. MySQL 5.6 and MariaDB 10.0 have introduced some parallelism to the slave. Newer versions have further improved to a point where today’s slaves are many times faster than they were.
  • Slaves can diverge. When you modify data on the master, the slave must perform the exact same update. That seems easy, but there are many ways an update can be non-deterministic with statement-based replication. Many of those issues have been fixed, and the introduction of row-based replication was another big step forward. Still, if you write directly to a slave you are asking for trouble. There is a read_only setting, but if the MySQL user has the “SUPER” privilege it is simply ignored. That’s why there is now the “super_read_only” setting (a minimal sketch follows after this list). Tools like pt-table-checksum and pt-table-sync from the Percona Toolkit exist to detect and fix this problem.
  • Replication can impact the master. I wrote above that the presence of slaves does not affect the master, but logging changes is more problematic. The most common issue is the InnoDB table-level locking for auto_increment values with statement-based replication. Only one thread can insert new rows at a time. You can avoid this issue with row-based replication and by configuring the auto-increment lock mode appropriately.
  • Data gets lost. Replication is asynchronous. That means the master will reply “done” after a commit statement even though the slaves have not received updates yet. Some transactions can get lost if the master crashes.
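
As referenced in the divergence item above, here is a minimal sketch of the read-only safeguards on a slave (super_read_only is available in MySQL 5.7+ and in Percona Server):

mysql> SET GLOBAL read_only = ON;        -- rejects writes from users without the SUPER privilege
mysql> SET GLOBAL super_read_only = ON;  -- also rejects writes from users with the SUPER privilege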

Although an old technology, a lot of work has been done on replication. It is miles away from the replication implementation of 5.0.x. Here’s a list, likely incomplete, of the evolution of replication:

  • Row based replication (since 5.1). The binary internal representation of the rows is sent instead of the SQL statements. This makes replication more robust against slave divergence.
  • Global transaction ID (since 5.6). Transactions are uniquely identified. Replication can be set up without knowing the binlog file and offset (see the sketch after this list).
  • Checksum (since 5.6). Binlog events have checksum values to validate their integrity.
  • Semi-sync replication (since 5.5). An addition to the replication protocol to make the master aware of the reception of events by the slaves. This helps to avoid losing data when a master crashes.
  • Multi-source replication (since 5.7). Allows a slave to have more than one master.
  • Multi-threaded replication (since 5.6). Allows a slave to use multiple threads. This helps to limit the slave lag.
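
As referenced in the GTID item above, here is a minimal sketch of pointing a slave at a master with auto-positioning (it assumes gtid_mode=ON on both servers; the host, user, and password are placeholders):

mysql> CHANGE MASTER TO
    ->   MASTER_HOST = '192.168.0.10',
    ->   MASTER_USER = 'repl',
    ->   MASTER_PASSWORD = '...',
    ->   MASTER_AUTO_POSITION = 1;
mysql> START SLAVE;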

Managing replication is a tedious job. The community has written many tools to manage replication:

  • MMM. An old Perl tool that used to be quite popular, but had many issues. Now rarely used.
  • MHA. The most popular tool to manage replication. It excels at reconfiguring replication without losing data, and does a decent job at handling failover. It is also simple. No wonder it is popular.
  • PRM. A Pacemaker-based solution developed to replace MMM. It’s quite good at failover, but not as good as MHA at reconfiguring replication. It’s also quite complex, thanks to Pacemaker. Not used much.
  • Orchestrator. The new cool tool. It can manage complex topologies and has a nice web-based interface to monitor and control the topology.

 

Shared Storage

Simple shared storage topology

 

Back when I was working for MySQL ten years ago, shared storage HA setups were very common. A shared storage HA cluster shares a single copy of the database files between two servers. One server is active, the other one is passive. In order to be shared, the database files reside on a device that can be mounted by both servers. The device can be physical (like a SAN), or logical (like a Linux DRBD device). On top of that, you need a cluster manager (like Pacemaker) to handle the resources and failovers. This solution is very popular because it allows for failover without losing any transactions.

The main drawback of this setup is the need for an idle standby server. The standby server cannot have any other assigned duties since it must always be ready to take over the MySQL server. A shared storage solution is also obviously not resilient to file-level corruption (but that situation is exceptional). Finally, it doesn’t play well with a cloud-based environment.

Today, newly-deployed shared storage HA setups are rare. The only ones I encountered over the last year were either old implementations needing support, or new setups that were deployed because of existing corporate technology stacks. That should tell you about the technology’s loss of popularity.

NDB Cluster

A simple NDB Cluster topology

 

An NDB Cluster is a distributed clustering solution that has been around for a long time. I personally started working with this technology back in 2008. An NDB Cluster has three types of nodes: SQL, management and data. A full HA cluster requires a minimum of four nodes.

An NDB Cluster is not a general purpose database due to its distributed nature. For suitable workloads, it is extraordinarily good. For unsuitable workloads, it is miserable. A suitable workload for an NDB Cluster involves high concurrency, with a high rate of small, primary-key-oriented transactions. Reaching one million trx/s on an NDB Cluster is nothing exceptional.

At the other end of the spectrum, a poor workload for an NDB Cluster is a single-threaded report query on a star-like schema. I have seen some extreme cases where just the network time of a reporting query amounted to more than 20 minutes.

Although NDB Clusters have improved, and are still improving, their usage has been pushed toward niche-type applications. Overall, the technology is losing ground and is now mostly used by telcos and for online gaming applications.

Dec
01
2016
--

Managing Replication with Percona XtraDB Cluster

Managing Replication

This blog post discusses managing replication with Percona XtraDB Cluster.

Recently a customer asked me to set up replication between two distinct Percona XtraDB Clusters located in geographically separate data centers. The customer’s goal was to use one of the clusters only in case of disaster recovery. They tried extending the cluster, but because of the WAN latency impact on their writes and the requirement of a node in a third data center for quorum, they walked away from that setup. Since they were not concerned about losing a few transactions in case of a major disaster, they were OK with regular MySQL replication using GTIDs.

Easy enough, right? Both clusters are cloud-based, of course, and the provider can stop/restart any node on short notice. This setup caused some concern for the customer around how to handle replication. Since they don’t have dedicated personnel to monitor replication, or at least handle alerts, they asked if we could find a way to automate the process. So, here we go!

We all try to solve the problems with the tools we know. In my case, I like Pacemaker a lot. So using Pacemaker was my first thought. In a cloud environment, a Pacemaker setup is not easy (wouldn’t that be a cluster in a cluster… a bit heavy). But wait! Percona XtraDB Cluster with Galera replication is already handling quorum, and it provides a means of exchanging information between the nodes. Why not use that?

We can detect quorum status the same way the clustercheck scripts do. To exchange messages, why not simply write to a table? Galera replication will update the other nodes. I went ahead and wrote a bash script that is called by cron every minute. The script monitors the node state and the content of the table. If all is right, it updates the table to report its presence (and whether it is acting as a slave or not). The script validates the presence of a slave in the cluster. If no reporting slave is found, the script proceeds to the “election” of a new slave, based on the

wsrep_local_index

 value. Basically, the script is a big bunch of “if” statements. The script is here, and the basic documentation on how to set it up is here.
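
To make the idea more concrete, here is a rough sketch of what the state table and the reporting query could look like. This is not the actual script; the schema, table name, and columns are hypothetical, shown only to illustrate the concept:

-- Hypothetical state table, replicated by Galera to every node in the cluster
CREATE TABLE IF NOT EXISTS percona.replication_manager (
  node_name VARCHAR(64) NOT NULL PRIMARY KEY,
  is_slave  TINYINT     NOT NULL DEFAULT 0,
  last_seen TIMESTAMP   NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);

-- Every minute, each node reports its presence (is_slave set to 1 only on the node running the slave threads)
REPLACE INTO percona.replication_manager (node_name, is_slave) VALUES (@@hostname, 0);

-- During the "election", each candidate checks its own position in the cluster
SHOW GLOBAL STATUS LIKE 'wsrep_local_index';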

Of course, if it works for one cluster, it can work for two. I have configured my customer’s two Percona XtraDB Clusters in a master-to-master relationship using this script. I ran through a bunch of failure scenario cases. The script survived all of them! But of course, this is new. If you are going to implement this solution, run your own set of tests! If you find any problem, file an issue on GitHub. I’ll be happy to fix it!

Dec
01
2016
--

Database Daily Ops Series: GTID Replication and Binary Logs Purge

GTID replication

This blog continues the ongoing series on daily operations and GTID replication.

In this blog, I’m going to investigate why the error below has been appearing in a special environment I’ve been working with over the last few days:

Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log:
'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the
master has purged binary logs containing GTIDs that the slave requires.'

The error provides the right message and explains what is going on. But sometimes it can be a bit tricky to solve this issue: you need additional information, discovered after some tests and reading. We try to keep Managed Services scripted, in the sense that our advice and best practices are repeatable and consistent. However, some additional features and practices can be added depending on the customer’s situation.

Some time ago one of our customer’s database servers presented the above message. At that point, we could see the binary log files in a compressed form on master (gzipped). Of course, MySQL can’t identify a compressed file with a .gz extension as a binary log. We uncompressed the file, but replication presented the same problem – even after uncompressing the file and making sure the UUID of the current master and the TRX_ID were there. Obviously, I needed to go and investigate the problem to see what was going on.

After some reading, I re-read the below:

When the server starts, the global value of gtid_purged, which was called before as gtid_lost, is initialized to the set of GTIDs contained by the Previous_gtid_log_event of the oldest binary log. When a binary log is purged, gtid_purged is re-read from the binary log that has now become the oldest one.

=> https://dev.mysql.com/doc/refman/5.6/en/replication-options-gtids.html#sysvar_gtid_purged

That makes me think: if something is compressing binlogs on the master without purging them as expected by the GTID mechanism, the server is not going to be able to re-read the existing GTIDs on disk. When the slave replication threads restart, or the DBA issues commands like reset slave and reset master (to clean out the accumulated GTID sets shown in Executed_Gtid_Set of SHOW SLAVE STATUS, for example), this error can occur. But if I compress the file:

  • Will the slave get lost and not find all the needed GTIDs on the master after a reset slave/reset master?
  • If I purge the logs correctly, using PURGE BINARY LOGS, will the slave be OK when restarting replication threads?

Test 1: Compressing the oldest binary log file on master, restarting slave threads

I would like to test this very methodically. We’ll create one GTID per binary log, and then I will compress the oldest binary log file in order to make it unavailable for the slaves. I’m working with three virtual machines, one master and two slaves. On the second slave, I’m going to run the following sequence: stop slave, reset slave, reset master, start slave, and then, check the results. Let’s see what happens.

On master (tool01):

tool01 [(none)]:> show master logs;
+-------------------+-----------+
| Log_name          | File_size |
+-------------------+-----------+
| mysqld-bin.000001 |       341 |
| mysqld-bin.000002 |       381 |
| mysqld-bin.000003 |       333 |
+-------------------+-----------+
3 rows in set (0.00 sec)
tool01 [(none)]:> show binlog events in 'mysqld-bin.000001';
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| Log_name          | Pos | Event_type     | Server_id | End_log_pos | Info                                                              |
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| mysqld-bin.000001 |   4 | Format_desc    |         1 |         120 | Server ver: 5.6.32-log, Binlog ver: 4                             |
| mysqld-bin.000001 | 120 | Previous_gtids |         1 |         151 |                                                                   |
| mysqld-bin.000001 | 151 | Gtid           |         1 |         199 | SET @@SESSION.GTID_NEXT= '4fbe2d57-5843-11e6-9268-0800274fb806:1' |
| mysqld-bin.000001 | 199 | Query          |         1 |         293 | create database wb01                                              |
| mysqld-bin.000001 | 293 | Rotate         |         1 |         341 | mysqld-bin.000002;pos=4                                           |
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
5 rows in set (0.00 sec)
tool01 [(none)]:> show binlog events in 'mysqld-bin.000002';
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| Log_name          | Pos | Event_type     | Server_id | End_log_pos | Info                                                              |
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| mysqld-bin.000002 |   4 | Format_desc    |         1 |         120 | Server ver: 5.6.32-log, Binlog ver: 4                             |
| mysqld-bin.000002 | 120 | Previous_gtids |         1 |         191 | 4fbe2d57-5843-11e6-9268-0800274fb806:1                            |
| mysqld-bin.000002 | 191 | Gtid           |         1 |         239 | SET @@SESSION.GTID_NEXT= '4fbe2d57-5843-11e6-9268-0800274fb806:2' |
| mysqld-bin.000002 | 239 | Query          |         1 |         333 | create database wb02                                              |
| mysqld-bin.000002 | 333 | Rotate         |         1 |         381 | mysqld-bin.000003;pos=4                                           |
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
5 rows in set (0.00 sec)
tool01 [(none)]:> show binlog events in 'mysqld-bin.000003';
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| Log_name          | Pos | Event_type     | Server_id | End_log_pos | Info                                                              |
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| mysqld-bin.000003 |   4 | Format_desc    |         1 |         120 | Server ver: 5.6.32-log, Binlog ver: 4                             |
| mysqld-bin.000003 | 120 | Previous_gtids |         1 |         191 | 4fbe2d57-5843-11e6-9268-0800274fb806:1-2                          |
| mysqld-bin.000003 | 191 | Gtid           |         1 |         239 | SET @@SESSION.GTID_NEXT= '4fbe2d57-5843-11e6-9268-0800274fb806:3' |
| mysqld-bin.000003 | 239 | Query          |         1 |         333 | create database wb03                                              |
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
4 rows in set (0.00 sec)

Here we see that each existing binary log file has just one transaction. That makes it easy to compress the oldest binary log and make part of the existing GTIDs disappear. When a slave connects to a master, it first sends its Executed_Gtid_Set, and then the master sends all the missing GTIDs to the slave. As Stephane Combaudon said, we will force this to happen! Both slave database servers are currently in the same position:

tool02 [(none)]:> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.10
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysqld-bin.000003
          Read_Master_Log_Pos: 333
               Relay_Log_File: mysqld-relay-bin.000006
                Relay_Log_Pos: 545
        Relay_Master_Log_File: mysqld-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
            ...
           Retrieved_Gtid_Set: 4fbe2d57-5843-11e6-9268-0800274fb806:1-3
            Executed_Gtid_Set: 4fbe2d57-5843-11e6-9268-0800274fb806:1-3
            
tool03 [(none)]:> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.10
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysqld-bin.000003
          Read_Master_Log_Pos: 333
               Relay_Log_File: mysqld-relay-bin.000008
                Relay_Log_Pos: 451
        Relay_Master_Log_File: mysqld-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...
           Retrieved_Gtid_Set: 4fbe2d57-5843-11e6-9268-0800274fb806:1-3
            Executed_Gtid_Set: 4fbe2d57-5843-11e6-9268-0800274fb806:1-3

Now, we’ll compress the oldest binary log on master:

[root@tool01 mysql]# ls -lh | grep mysqld-bin.
-rw-rw---- 1 mysql mysql  262 Nov 11 13:55 mysqld-bin.000001.gz #: this is the file containing 4fbe2d57-5843-11e6-9268-0800274fb806:1
-rw-rw---- 1 mysql mysql  381 Nov 11 13:55 mysqld-bin.000002
-rw-rw---- 1 mysql mysql  333 Nov 11 13:55 mysqld-bin.000003
-rw-rw---- 1 mysql mysql   60 Nov 11 13:55 mysqld-bin.index

On tool03, the database server we will use for this test, we will reload replication:

tool03 [(none)]:> stop slave; reset slave; reset master; start slave;
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.03 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.02 sec)
tool03 [(none)]:> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: 192.168.0.10
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File:
          Read_Master_Log_Pos: 4
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 4
        Relay_Master_Log_File:
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 0
              Relay_Log_Space: 151
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 4fbe2d57-5843-11e6-9268-0800274fb806
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp: 161111 14:47:13
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 1
1 row in set (0.00 sec)

Bingo! We broke the replication stream on the slave. Now we know that the GTID missing on the master was inside the compressed file, and so could not be passed along to the connecting slave during their negotiation. Additionally, @@GTID_PURGED was not reloaded in the way the online manual describes. The test is done and we confirmed the theory (if you have additional comments, enter them at the end of the blog).

Test 2: Purge the oldest file on master and reload replication on slave

Let’s make it as straightforward as possible. The purge can be done manually using the PURGE BINARY LOGS command, which does it the proper way: the binary log index file is considered part of the purge operation as well (the file name entry is removed from the index together with the log file on disk). I’m going to execute the same steps as before, but purge the file with the mentioned command instead of compressing it.

tool01 [(none)]:> show master logs;
+-------------------+-----------+
| Log_name          | File_size |
+-------------------+-----------+
| mysqld-bin.000001 |       341 |
| mysqld-bin.000002 |       381 |
| mysqld-bin.000003 |       333 |
+-------------------+-----------+
3 rows in set (0.00 sec)
tool01 [(none)]:> purge binary logs to 'mysqld-bin.000002';
Query OK, 0 rows affected (0.01 sec)
tool01 [(none)]:> show master logs;
+-------------------+-----------+
| Log_name          | File_size |
+-------------------+-----------+
| mysqld-bin.000002 |       381 |
| mysqld-bin.000003 |       333 |
+-------------------+-----------+
2 rows in set (0.00 sec)

Now, we’ll execute the commands to check how it goes:

tool03 [(none)]:> stop slave; reset slave; reset master; start slave;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.02 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.02 sec)
tool03 [(none)]:> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: 192.168.0.10
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File:
          Read_Master_Log_Pos: 4
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 4
        Relay_Master_Log_File:
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 0
              Relay_Log_Space: 151
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 4fbe2d57-5843-11e6-9268-0800274fb806
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp: 161111 16:35:02
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 1
1 row in set (0.00 sec)

The GTID on the purged file is needed by the slave. In both cases, we can set the @@GTID_PURGED as below with the transaction that we know was purged, and move forward with replication:

tool03 [(none)]:> stop slave; set global gtid_purged='4fbe2d57-5843-11e6-9268-0800274fb806:1';
Query OK, 0 rows affected, 1 warning (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
tool03 [(none)]:> start slave;
Query OK, 0 rows affected (0.01 sec)

The above sets @@GTID_PURGED to the transaction(s) that no longer exist on the master, so the slave treats them as already handled and only requests the GTIDs that are still available. In other words, everything up to the oldest existing GTID minus one is marked as purged, so the slave starts replicating from the oldest existing GTID. In our scenario, the replica resumes replication from 4fbe2d57-5843-11e6-9268-0800274fb806:2, which lives in binary log file mysqld-bin.000002. Replication is fixed, and its threads can resume processing the data stream coming from the master.

You will need to execute additional checksum and sync steps for the set of transactions that were skipped when setting a new value for @@GTID_PURGED. If replication continues to break after restarting, I advise you to rebuild the slave (possibly the subject of a future blog).
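
As a rough sketch of those checksum and sync steps with the Percona Toolkit (the host, user, and password are placeholders; always review the --print output before letting pt-table-sync execute anything):

# On the master: checksum the tables and store the results in percona.checksums
pt-table-checksum --replicate=percona.checksums h=192.168.0.10,u=percona,p=secret

# On the master: print the statements needed to bring the slaves back in sync
pt-table-sync --replicate=percona.checksums --print h=192.168.0.10,u=percona,p=secret

# Once reviewed, replace --print with --execute to apply the changes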

Good explanations about this can be found in the bugs below, reported by the Facebook guys and by Laurynas Biveinis, the Percona Server Lead (who clarified the issue):

  • MySQL Bugs: #72635: Data inconsistencies when master has truncated binary log with GTID after crash;
  • MySQL Bugs: #73032: Setting gtid_purged may break auto_position and thus slaves;

Conclusion

Be careful when purging or doing anything manual with binary logs, because @@GTID_PURGED needs to be updated automatically when binary logs are purged. That seems to happen only when

expire_logs_days

 is set to purge binary logs. Yet you need to be careful when trusting this variable, because it doesn’t consider fractions of days; depending on the number of writes on a database server, disks can fill up in minutes. This blog showed that both a housekeeping script (compressing binlogs) and the PURGE BINARY LOGS command were able to make it happen.
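
For reference, a sketch of the retention settings discussed above (the values are examples only; expire_logs_days applies to MySQL 5.6/5.7, while MySQL 8.0 prefers binlog_expire_logs_seconds):

mysql> SET GLOBAL expire_logs_days = 7;                  -- 5.6/5.7: purge binlogs older than 7 days
mysql> SET GLOBAL binlog_expire_logs_seconds = 604800;   -- 8.0 equivalent of 7 days
mysql> PURGE BINARY LOGS BEFORE NOW() - INTERVAL 7 DAY;  -- manual purge through the server; gtid_purged is re-read from the oldest remaining binlog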
