This blog post discusses the options newly added in Percona XtraBackup 2.4.8 and how they can impact your database backups.
To avoid issues with MySQL 5.7 skipping the redo log for DDL, Percona XtraBackup has implemented three new options (xtrabackup --lock-ddl, xtrabackup --lock-ddl-timeout and xtrabackup --lock-ddl-per-table) that can be used to place MDL locks on tables while they are copied.
So why do we need these options? Let’s walk through how we got here.
Originally, we found problems when DDL statements ran during a backup: Percona XtraBackup produced corrupted backups, as described in two reports:
After experimenting, it was clear that the root cause of those failures was MySQL 5.7 skipping redo logging for some DDL operations. This comes from a feature added in MySQL 5.7 called Sorted Index Builds. You can read more at the following links:
To prevent this, we introduced a fix: when Percona XtraBackup detects that the redo log has been skipped, it aborts the backup rather than produce a corrupted one.
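To make the redo-log skipping concrete: in MySQL 5.7, index creation uses a sorted (bulk) index build, and the pages of the newly built index are written without redo logging. A minimal sketch, using a hypothetical table t1 and hypothetical index names:

-- Hypothetical table, just for illustration
CREATE TABLE t1 (id INT PRIMARY KEY, c CHAR(120));

-- Both statements below build the new index with a sorted (bulk) index build,
-- so the pages of the new index are written without redo logging:
CREATE INDEX ix_c ON t1 (c);
ALTER TABLE t1 ADD INDEX ix_c2 (c);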
The scary error message you get with this fix is:
[FATAL] InnoDB: An optimized (without redo logging) DDL operation has been performed. All modified pages may not have been flushed to the disk yet.
Percona XtraBackup will not be able to take a consistent backup. Retry the backup operation
We need to avoid aborting the backup with this message. So how do we do that? Let’s create a test case first and reproduce the issue.
Prepare two test tables (sbtest1 in both db1 and db2) with sysbench:
sysbench /usr/share/sysbench/oltp_insert.lua --db-driver=mysql --mysql-db=db1 --mysql-user=msandbox --mysql-password=msandbox --table-size=2000000 --mysql-socket=/tmp/mysql_sandbox20393.sock prepare
sysbench /usr/share/sysbench/oltp_insert.lua --db-driver=mysql --mysql-db=db2 --mysql-user=msandbox --mysql-password=msandbox --table-size=2000000 --mysql-socket=/tmp/mysql_sandbox20393.sock prepare
Create a test.sh file and place it in the sandbox:
#!/bin/bash
echo "drop table if exists db1.sb1"|./use
echo "create table sb1 as select id,c from sbtest1 where id < 150000;"|./use db1
echo "create unique index ix on sb1 (id)"|./use db1
sleep 1
echo "drop table if exists db2.sb1"|./use
echo "create table sb1 as select id,c from sbtest1 where id < 150000;"|./use db2
echo "create unique index ix on sb1 (id)"|./use db2
Run the script in a loop while the backup is taken:
$ while true; do bash test.sh; done
Try to take a backup:
xtrabackup --defaults-file=/home/shahriyar.rzaev/sandboxes/rsandbox_Percona-Server-5_7_18/master/my.sandbox.cnf --user=msandbox --password='msandbox' --target-dir=/home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-11-45 --backup --host=127.0.0.1 --port=20393 --binlog-info=AUTO --galera-info --parallel 4 --check-privileges --no-version-check
You will likely get something like:
InnoDB: An optimized (without redo logging) DDL operation has been performed. All modified pages may not have been flushed to the disk yet. PXB will not be able take a consistent backup. Retry the backup operation
OK, now we have reproduced the error. To avoid it, XtraBackup provides the new options mentioned above.
Using --lock-ddl:
xtrabackup --defaults-file=/home/shahriyar.rzaev/sandboxes/rsandbox_Percona-Server-5_7_18/master/my.sandbox.cnf --user=msandbox --password='msandbox' --target-dir=/home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-16-56 --backup --host=127.0.0.1 --port=20393 --binlog-info=AUTO --galera-info --parallel 4 --check-privileges --no-version-check --lock-ddl
The new thing you should notice is:
170726 11:16:56 Executing LOCK TABLES FOR BACKUP...
And the backup status:
xtrabackup: Transaction log of lsn (2808294311) to (2808304872) was copied.
170726 11:20:42 completed OK!
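Under the hood, --lock-ddl issues Percona Server’s LOCK TABLES FOR BACKUP, visible in the log line above. Roughly speaking, the backup lock blocks DDL (and writes to non-InnoDB tables) for the duration of the backup while normal InnoDB DML keeps running. A rough sketch of what it looks like if you run the same lock manually from a client session:

-- Percona Server backup lock; rejected by servers without backup locks
-- (see the MariaDB example later in this post)
LOCK TABLES FOR BACKUP;

-- While the backup lock is held, InnoDB reads and writes continue,
-- but DDL such as CREATE INDEX has to wait.

UNLOCK TABLES;  -- releases the backup lock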
Another new option is --lock-ddl-per-table:
xtrabackup --defaults-file=/home/shahriyar.rzaev/sandboxes/rsandbox_Percona-Server-5_7_18/master/my.sandbox.cnf --user=msandbox --password='msandbox' --target-dir=/home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-31-56 --backup --host=127.0.0.1 --port=20393 --binlog-info=AUTO --galera-info --parallel 4 --check-privileges --no-version-check --lock-ddl-per-table
The new output will look like this:
170726 11:32:33 [01] Copying ./ibdata1 to /home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-31-56/ibdata1
170726 11:32:33 Locking MDL for db1.sb1
170726 11:32:33 [02] Copying ./db1/sb1.ibd to /home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-31-56/db1/sb1.ibd
170726 11:32:33 Locking MDL for db1.sbtest1
170726 11:32:33 Locking MDL for db2.sb1
170726 11:32:33 [03] Copying ./db1/sbtest1.ibd to /home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-31-56/db1/sbtest1.ibd
170726 11:32:33 [04] Copying ./db2/sb1.ibd to /home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-26_11-31-56/db2/sb1.ibd
170726 11:32:33 [04] ...done
170726 11:32:33 >> log scanned up to (2892754398)
170726 11:32:34 Locking MDL for db2.sbtest1
The result of the backup:
170726 11:35:45 Unlocking MDL for all tables
xtrabackup: Transaction log of lsn (2871333326) to (2892754764) was copied.
170726 11:35:45 completed OK!
Another thing I should mention is using --lock-ddl with servers other than Percona Server for MySQL. For example, using it with MariaDB:
2017-07-26 12:08:32 ERROR FULL BACKUP FAILED!
2017-07-26 12:08:37 ERROR
170726 12:08:32 Connecting to MySQL server host: 127.0.0.1, user: msandbox, password: set, port: 10207, socket: /tmp/mysql_sandbox10207.sock
Using server version 10.2.7-MariaDB
170726 12:08:32 Error: LOCK TABLES FOR BACKUP is not supported.
But you can use --lock-ddl-per-table with any server. Use --lock-ddl-per-table with caution, though: under some circumstances it can block updates to tables on highly loaded servers. Let’s explore one such scenario:
Table:
CREATE TABLE t1 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT);

Cases:
connection 1:
- BEGIN; SELECT * FROM sb1 LIMIT 1; <--- MDL
connection 2:
- UPDATE sb1 SET c = '288' WHERE id = 34; <--- completes OK
connection 3:
- CREATE INDEX sb1_1 ON sb1 (c(10)); <--- WAITING for MDL
connection 2:
- UPDATE sb1 SET c = '288' WHERE id = 34; <--- WAITING for MDL
connection 1:
- COMMIT;

connection 2 and 3 are able to complete now
If one connection holds an MDL lock on a table and another connection runs ALTER TABLE against it (CREATE INDEX is mapped to an ALTER TABLE statement internally), the ALTER waits for the MDL lock, and subsequent updates to that table queue up behind the waiting ALTER until the lock is released.
Testing this with the backup process is quite easy:
Sample table:
CREATE TABLE `sb1` (
  `id` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  UNIQUE KEY `ix` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

select count(*) from sb1;
+----------+
| count(*) |
+----------+
|   149999 |
+----------+

select * from sb1 limit 3;
+----+-------------------------------------------------------------------------------------------------------------------------+
| id | c                                                                                                                       |
+----+-------------------------------------------------------------------------------------------------------------------------+
|  1 | 83868641912-28773972837-60736120486-75162659906-27563526494-20381887404-41576422241-93426793964-56405065102-33518432330 |
|  2 | 38014276128-25250245652-62722561801-27818678124-24890218270-18312424692-92565570600-36243745486-21199862476-38576014630 |
|  3 | 33973744704-80540844748-72700647445-87330233173-87249600839-07301471459-22846777364-58808996678-64607045326-48799346817 |
+----+-------------------------------------------------------------------------------------------------------------------------+
So our “connection 1” is an xtrabackup command:
xtrabackup --defaults-file=/home/shahriyar.rzaev/sandboxes/rsandbox_Percona-Server-5_7_18/master/my.sandbox.cnf --user=msandbox --password='msandbox' --target-dir=/home/shahriyar.rzaev/backup_dir/ps_5.7_master/full/2017-07-28_07-55-30 --backup --host=127.0.0.1 --port=20393 --binlog-info=AUTO --galera-info --parallel 4 --check-privileges --no-version-check --lock-ddl-per-table
So after running the backup command and doing the same steps for “connection 2” and “connection 3,” the result in SHOW PROCESSLIST looks something like this:
show processlist;
+----+----------+-----------------+------+---------+------+---------------------------------+----------------------------------------+-----------+---------------+
| Id | User     | Host            | db   | Command | Time | State                           | Info                                   | Rows_sent | Rows_examined |
+----+----------+-----------------+------+---------+------+---------------------------------+----------------------------------------+-----------+---------------+
|  4 | root     | localhost       | db1  | Sleep   |   28 |                                 | NULL                                   |         0 |             1 |
| 10 | root     | localhost       | db1  | Query   |   26 | Waiting for table metadata lock | CREATE INDEX sb1_1 ON sb1 (c(10))      |         0 |             0 |
| 11 | root     | localhost       | db1  | Query   |    6 | Waiting for table metadata lock | UPDATE sb1 SET c = '288' WHERE id = 34 |         0 |             0 |
| 12 | root     | localhost       | NULL | Query   |    0 | starting                        | show processlist                       |         0 |             0 |
| 13 | msandbox | localhost:36546 | NULL | Sleep   |   31 |                                 | NULL                                   |         1 |           116 |
| 14 | msandbox | localhost:36550 | NULL | Sleep   |   17 |                                 | NULL                                   |         1 |             1 |
+----+----------+-----------------+------+---------+------+---------------------------------+----------------------------------------+-----------+---------------+
6 rows in set (0.00 sec)
Updates only complete after the backup finishes, as described. It should be clear now why this option should be used with caution.
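If you want to see which session actually holds the blocking metadata lock (SHOW PROCESSLIST only shows the waiters), MySQL 5.7 exposes this in performance_schema.metadata_locks. A quick sketch; note that the MDL instrument is disabled by default and has to be enabled first:

-- Enable MDL instrumentation (disabled by default in MySQL 5.7)
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME = 'wait/lock/metadata/sql/mdl';

-- List holders and waiters of metadata locks on our table
SELECT OBJECT_SCHEMA, OBJECT_NAME, LOCK_TYPE, LOCK_STATUS, OWNER_THREAD_ID
FROM performance_schema.metadata_locks
WHERE OBJECT_SCHEMA = 'db1' AND OBJECT_NAME = 'sb1';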
The last thing we should discuss: if you do not want to use any “hacks” with xtrabackup, you can handle this on the MySQL side instead, for example by:
- Avoiding the redo-skipping DDL operations described above while a backup is running
- Enabling old_alter_table. When this variable is enabled, the server does not use the optimized method of processing an ALTER TABLE operation. It reverts to using a temporary table, copying over the data and then renaming the temporary table to the original, as MySQL 5.0 and earlier did. In other words, ALTER TABLE statements effectively use ALGORITHM=COPY, as sketched after this list.
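A minimal sketch of that second option, using the sb1 table from the test above and hypothetical index names (ix_c, ix_c2); whether you set old_alter_table globally or per session depends on your environment:

-- Make subsequent ALTERs in this session use the table-copy method
SET SESSION old_alter_table = ON;
ALTER TABLE db1.sb1 ADD INDEX ix_c (c);

-- Or request the copy algorithm explicitly for a single statement
ALTER TABLE db1.sb1 ADD INDEX ix_c2 (c), ALGORITHM=COPY;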
In conclusion, there is no panacea for this issue, but you can use some of the tricks described here to work around the problem. Thanks for reading!