May
26
2022
--

Is MySQL Statement-Based / Mixed Replication Really Safe?

MySQL Statement-Based

MySQL Statement-BasedThe binary logging format in MySQL has been ROW by default since MySQL 5.7, yet there are still many users sticking with STATEMENT or MIXED formats for various reasons. In some cases, there is just simple hesitation from changing something that has worked for years on legacy applications. But in others, there may be serious blockers, most typically missing primary keys in badly designed schemas, which would lead to serious performance issues on the replicas.

As a Support Engineer, I can still see quite a few customers using STATEMENT or MIXED formats, even if they are already on MySQL 8.0. In many cases this is OK, but recently I had to deal with a pretty nasty case, where not using ROW format was found to cause the replicas to silently lose data updates, without raising any replication errors! Was it some really rare edge use case? Not at all! Let me demonstrate a very simple test case below to illustrate how easy it is to end up in such a bad situation.

— source 

mysql> select @@binlog_format,@@system_time_zone;
+-----------------+--------------------+
| @@binlog_format | @@system_time_zone |
+-----------------+--------------------+
| STATEMENT       | BST                |
+-----------------+--------------------+
1 row in set (0.00 sec)

mysql> CREATE TABLE `test1` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   `d` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    ->   `a` varchar(30) NOT NULL,
    ->   `name` varchar(25) DEFAULT NULL,
    ->   PRIMARY KEY (`a`),
    ->   UNIQUE KEY `id` (`id`)
    -> ) ENGINE=InnoDB;
Query OK, 0 rows affected, 1 warning (0.02 sec)

mysql> insert into test1 values (null,now(),"test1",0);
Query OK, 1 row affected (0.00 sec)

mysql> insert into test1 values (null,now(),"test2",0);
Query OK, 1 row affected (0.01 sec)

mysql> insert into test1 values (null,now(),"test3",0);
Query OK, 1 row affected (0.01 sec)

mysql> select * from test1;
+----+---------------------+-------+------+
| id | d                   | a     | name |
+----+---------------------+-------+------+
|  1 | 2022-05-22 10:13:37 | test1 | 0    |
|  2 | 2022-05-22 10:13:37 | test2 | 0    |
|  3 | 2022-05-22 10:13:38 | test3 | 0    |
+----+---------------------+-------+------+
3 rows in set (0.00 sec)

— replica

mysql> select @@system_time_zone;
+--------------------+
| @@system_time_zone |
+--------------------+
| UTC                |
+--------------------+
1 row in set (0.00 sec)

mysql> select * from db1.test1;
+----+---------------------+-------+------+
| id | d                   | a     | name |
+----+---------------------+-------+------+
|  1 | 2022-05-22 09:13:37 | test1 | 0    |
|  2 | 2022-05-22 09:13:37 | test2 | 0    |
|  3 | 2022-05-22 09:13:38 | test3 | 0    |
+----+---------------------+-------+------+
3 rows in set (0.00 sec)

— source 

mysql> UPDATE test1 SET name = 'foobar', d = CURRENT_TIMESTAMP WHERE a = 'test1' AND d = '2022-05-22 10:13:37';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select * from test1;
+----+---------------------+-------+--------+
| id | d                   | a     | name   |
+----+---------------------+-------+--------+
|  1 | 2022-05-22 10:16:15 | test1 | foobar |
|  2 | 2022-05-22 10:13:37 | test2 | 0      |
|  3 | 2022-05-22 10:13:38 | test3 | 0      |
+----+---------------------+-------+--------+
3 rows in set (0.00 sec)

— replica

mysql> select * from db1.test1;
+----+---------------------+-------+------+
| id | d                   | a     | name |
+----+---------------------+-------+------+
|  1 | 2022-05-22 09:13:37 | test1 | 0    |
|  2 | 2022-05-22 09:13:37 | test2 | 0    |
|  3 | 2022-05-22 09:13:38 | test3 | 0    |
+----+---------------------+-------+------+
3 rows in set (0.00 sec)

mysql> pager egrep "Running|SQL_Error"
PAGER set to 'egrep "Running|SQL_Error"'

mysql > show replica status\G
           Replica_IO_Running: Yes
          Replica_SQL_Running: Yes
               Last_SQL_Error: 
    Replica_SQL_Running_State: Replica has read all relay log; waiting for more updates
     Last_SQL_Error_Timestamp: 
1 row in set (0.00 sec)

 

Another test, using UTC_TIME() compared against a column that was populated using the CURRENT_TIMESTAMP function:

— source 

mysql> select * from test1 WHERE TIME(d) > DATE_SUB(UTC_TIME(), INTERVAL 11 HOUR) AND id=3;
+----+---------------------+-------+------+
| id | d                   | a     | name |
+----+---------------------+-------+------+
|  3 | 2022-05-22 10:13:38 | test3 | 0    |
+----+---------------------+-------+------+
1 row in set (0.00 sec)

— replica

mysql> select * from test1 WHERE TIME(d) > DATE_SUB(UTC_TIME(), INTERVAL 11 HOUR) AND id=3;
Empty set (0.00 sec)

Therefore, when a similar condition is used for update:

— source

mysql> update test1 set name="bar" WHERE TIME(d) > DATE_SUB(UTC_TIME(), INTERVAL 11 HOUR) AND id=3;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select * from test1 where id=3;
+----+---------------------+-------+------+
| id | d                   | a     | name |
+----+---------------------+-------+------+
|  3 | 2022-05-22 22:12:15 | test3 | bar  |
+----+---------------------+-------+------+
1 row in set (0.01 sec)

— replica

mysql> select * from test1 where id=3;
+----+---------------------+-------+------+
| id | d                   | a     | name |
+----+---------------------+-------+------+
|  3 | 2022-05-22 09:13:38 | test3 | 0    |
+----+---------------------+-------+------+
1 row in set (0.01 sec)

mysql > show replica status\G
           Replica_IO_Running: Yes
          Replica_SQL_Running: Yes
               Last_SQL_Error: 
    Replica_SQL_Running_State: Replica has read all relay log; waiting for more updates
     Last_SQL_Error_Timestamp: 
1 row in set (0.00 sec)

Again replica ignored the update while there is no replication error reported. This particular scenario can be expected to happen really often in geographically-distributed database environments.

As the functions used here are not considered unsafe for replication, two usual safety actions are not performed:

  • no warning is printed in the error log when the STATEMENT format used
  • replication event is not logged in RBR format when the MIXED format is used but rather left the original query as it is

I find this potentially dangerous and hence reported it: https://bugs.mysql.com/bug.php?id=107293

Summary

ROW-based replication has become the standard in MySQL and is the most reliable one. It is also the only one permitted for virtually synchronous replication solutions like Percona XtraDB Cluster/Galera and MySQL Group Replication.

At the same time, STATEMENT or even MIXED format, may lead to data consistency issues, which can be undetected for a long time, making it very difficult to investigate when finally replication error happens as a result.

If there is anything that has prevented you from switching to ROW format yet, the sooner you deal with it the better.

Jun
24
2013
--

pt-online-schema-change and binlog_format

Statement-based or row-based, or mixed?  We’ve all seen this discussed at length so I’m not trying to rehash tired arguments.  At a high level, the difference is simple:

  1. Statement based replication (SBR) replicates the SQL statements to the slave to be replayed
  2. Row based replication (RBR) replicates the actual rows changed to the slave to be replayed
  3. Mixed mode uses RBR in the event of a non-deterministic statement, otherwise uses SBR

Recently, I worked with a client to optimize their use of pt-online-schema-change and keep replication delay to a minimum.  We found that using RBR in conjunction with a smaller chunk-time was the best result in their environment due to reduced IO on the slave, but I wanted to recreate the test locally as well to see how it looked in the generic sense (sysbench for data/load).

Here was my local setup:

  1. 3 VMs (CentOS 6.4, 512M RAM, Macbook Pro host)
  2. 1 VM was for sysbench and pt-osc
  3. 1 master, 1 slave (Percona Server 5.5.30, 256M buffer pool)

And here is the base test that I ran:

  1. Populate the db with four, 1 million row tables
  2. Restart MySQL to ensure an empty buffer pool (no LRU dump/restore)
  3. Run pt-osc against the master and capture diagnostics on slave
  4. Repeat with both SBR and RBR

Visually, the most telling difference between the two baselines comes from comparing the slave IOPs when using SBR vs RBR:

Statement Based - IOPs (iostat)

Statement Based – IOPs (iostat -mx 1)

Row Based - IOPs (iostat)

Row Based – IOPs (iostat -mx 1)

 

While the write operations look similar in both, you can see the dramatic difference in the read operations in that they are almost negligible when using row based. I had assumed there would be high read IOPs as the buffer pool was empty, so I also verified that the Innodb_buffer_pool_reads (reads that missed the buffer pool and went to disk) on both:

SBR - Buffer Pool Reads (from disk)

SBR – Buffer Pool Reads (from disk)

RBR - Buffer Pool Reads (from disk)

RBR – Buffer Pool Reads (from disk)

 

Looking at how pt-osc and the buffer pool operate, these results make sense for this workload.  Here is the basic process for pt-osc:

  1. Create a new, shadow table based on the original table
  2. Apply the alters to the new shadow table
  3. Add triggers to the original table to track changes
  4. Run INSERT … SELECT CHUNK against the original table to populate the shadow table
  5. Rename the tables after all records are copied

In the case of SBR, the actual INSERT … SELECT CHUNK is replayed on the slave verbatim, meaning that the chunk will need to be read from disk if not in the BP in order to insert it into the new table.  However, when using RBR, you simply send the rows to the slave.  As this is a new table, there is nothing to read from disk so InnoDB simply writes the new page and it eventually gets flushed to disk.

This is where the –set-vars switch can come in handy with pt-online-schema-change.  Picture this scenario:

  1. The buffer pool is undersized (due to lack of memory compared to data size)
  2. You are altering a table that mostly archive (i.e. not hot data)
  3. The slave is nearly IO bound already as it is actively serving read traffic

In this case, adding extra read IOPs to the slave could be a performance killer.  So assuming you have binlog_format=mixed on the slave, you can use the –set-vars like this to run the alter using RBR to save on IOPs:

pt-online-schema-change –alter=”ENGINE=InnoDB” –set-vars=”binlog_format=row” –execute h=master,D=db,t=tbl

Keep in mind, RBR isn’t going to give you the same results in the course of normal replication.  When replicating changes to the slave, the page is requested from the buffer pool and read from disk if not present so that the changes can be applied to it.  This can still be a win (think of a query that is very complicated that returns only a few rows), but you won’t see the dramatic difference as you do when using this approach with pt-osc.

The post pt-online-schema-change and binlog_format appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com