Apr
02
2021
--

Percona XtraBackup Point-In-Time Recovery for the Single Database

Percona XtraBackup Point-In-Time Recovery

Percona XtraBackup Point-In-Time RecoveryRecovering to a particular time in the past is called Point-In-Time Recovery (PITR). With PITR you can rollback unwanted DELETE without WHERE clause or any other harmful command.

PITR with Percona XtraBackup is pretty straightforward and perfectly described in the user manual. You need to restore the data from the backup, then apply all binary logs created or updated after the backup was taken, but skip harmful event(s).

However, if your data set is large you may want to recover only the affected database or table. This is possible but you need to be smart when filtering events from the binary log. In this post, I will show how to perform such a partial recovery using Percona XtraBackup, mysql command-line client, and mysqlbinlog programs only. There is an alternative approach that involves creating a fake source server, that is described in MySQL Point in Time Recovery the Right Way. You may consider it, especially if you need to apply changes to a single table.

Percona XtraBackup Point-In-Time Recovery

For our example we will create data first, then run DROP and DELETE commands on two different tables. Then we will rollback these commands.

First, let’s assume we have a server with two databases: test and sbtest. We are using GTIDs and row-based binary log format. We also run the server with the option innodb_file_per_table=1 and all our InnoDB tables use individual tablespaces. Otherwise, the individual restore method would not work.

mysql> show tables from sbtest;
+------------------+
| Tables_in_sbtest |
+------------------+
| sbtest1          |
| sbtest2          |
| sbtest3          |
| sbtest4          |
| sbtest5          |
| sbtest6          |
| sbtest7          |
| sbtest8          |
+------------------+
8 rows in set (0.00 sec)

mysql> show tables from test;
+----------------+
| Tables_in_test |
+----------------+
| bar            |
| baz            |
| foo            |
+----------------+
3 rows in set (0.00 sec)

We will experiment with tables foo and bar. We assume that at the time of our first backup, each of the tables contained five rows. Tables in the database sbtest also contain data, but it does not really matter for our experiment.

mysql> select count(*) from foo;
+----------+
| count(*) |
+----------+
|        5 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from bar;
+----------+
| count(*) |
+----------+
| 5        |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from baz;
+----------+
| count(*) |
+----------+
| 0        |
+----------+
1 row in set (0.00 sec)

Since we want to restore individual tables, we need to make a preparation before taking a backup: store database structure. We will do it with help of the mysqldump command. In this example, I store structure per database to make partial PITR easier, but you are free to use the option --all-databases.

mysqldump --no-data --set-gtid-purged=OFF --triggers --routines --events test > test_structure.sql
mysqldump --no-data --set-gtid-purged=OFF --triggers --routines --events sbtest > sbtest_structure.sql

Then we are ready to take the backup.

xtrabackup --parallel=8 --target-dir=./full_backup --backup

I am using the option --parallel to speed up the backup process.

Now let’s do some testing. First, let’s update rows in the table foo.

mysql> update foo set f1=f1*2;
Query OK, 5 rows affected (0.01 sec)
Rows matched: 5 Changed: 5 Warnings: 0

mysql> select * from foo;
+----+------+
| id | f1   |
+----+------+
|  1 |    2 |
|  2 |    4 |
|  3 |    6 |
|  4 |    8 |
|  5 |   10 |
+----+------+
5 rows in set (0.00 sec)

And then drop it and delete all rows from the table bar.

mysql> drop table foo;
Query OK, 0 rows affected (0.02 sec)

mysql> delete from bar;
Query OK, 5 rows affected (0.01 sec)

Finally, let’s insert a few rows into the tables bar and baz.

mysql> insert into bar(f1) values(6),(7),(8),(9),(10);
Query OK, 5 rows affected (0.01 sec)
Records: 5 Duplicates: 0 Warnings: 0

mysql> insert into baz(f1) values(1),(2),(3),(4),(5);
Query OK, 5 rows affected (0.01 sec)
Records: 5 Duplicates: 0 Warnings: 0

Assume that the DROP TABLE and DELETE command was an accident and we want to restore the state of the tables foo and bar as they were before these unlucky statements.

First, we need to prepare the backup.

Since we are interested in restoring only tables in the database test we need to prepare the backup with a special option --export that exports tablespaces in a way that they could be later imported:

xtrabackup --prepare --export --target-dir=./full_backup

Now the directory for the database test contains not only table definition files (.frm, only before 8.0) and tablespace files (.ibd) but also configuration files (.cfg).

Since we want all changes that happened after backup and before the problematic DROP TABLE and DELETE statements were applied, we need to identify which binary log and position were actual at the backup time. We can find it in the xtrabackup_binlog_info file:

$ cat full_backup/xtrabackup_binlog_info
master-bin.000004 1601 0ec00eed-87f3-11eb-acd9-98af65266957:1-56

Now we are ready to perform restore.

First, let’s restore the table foo from the backup. Restoring individual tablespaces requires the ALTER TABLE ... IMPORT TABLESPACE command. This command assumes that the table exists in the server. However, in our case, it was dropped and therefore we need to re-create it.

We will recreate the full database test from the file test_structure.sql

Since we do not want these administrative tasks to be re-applied, I suggest disabling binary logging for the session which will recreate the database structure.

mysql> set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)

mysql> source test_structure.sql
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)
....

Once tables are recreated discard their tablespaces. I will show an example for the table foo. Adjust the code for the rest of the tables.

mysql> alter table foo discard tablespace;
Query OK, 0 rows affected (0.01 sec)

Then, in another terminal, copy the tablespace and configuration files from the backup to the database directory:

cp full_backup/test/foo.{ibd,cfg} var/mysqld.1/data/test/

And, finally, import the tablespace:

mysql> alter table foo import tablespace;
Query OK, 0 rows affected (0.05 sec)

Repeat for the other tables in the database test.

Now you can enable binary logging back.

You can do the same task in a script. For example:

for table in `mysql test --skip-column-names --silent -e "show tables"`
> do
>   mysql test -e "set sql_log_bin=0; alter table $table discard tablespace"
>   cp full_backup/test/$table.{ibd,cfg} var/mysqld.1/data/test/
>   mysql test -e "set sql_log_bin=0; alter table $table import tablespace"
> done

Our tables are recovered but do not have the updates made after the backup.

mysql> select * from foo;
+----+------+
| id | f1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    4 |
|  5 |    5 |
+----+------+
5 rows in set (0.00 sec)

mysql> select * from bar;
+----+------+
| id | f1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    4 |
|  5 |    5 |
+----+------+
5 rows in set (0.00 sec)

mysql> select * from baz;
Empty set (0.00 sec)

Therefore, we need to restore data from the binary logs.

To do it we first need to identify the GTID of the disaster event. It can be done if we dump all binary logs updated and created after backup into a dump file and then search for the DROP TABLE and DELETE statements and skipping them.

First, let’s check which binary logs do we have.

mysql> show binary logs;
+-------------------+-----------+
| Log_name          | File_size |
+-------------------+-----------+
| master-bin.000001 |   1527476 |
| master-bin.000002 |      3035 |
| master-bin.000003 |      1987 |
| master-bin.000004 |      2466 |
| master-bin.000005 |       784 |
+-------------------+-----------+
5 rows in set (0.00 sec)

So we need to parse them, starting from the log master-bin.000004 and position 1601:

mysqlbinlog --start-position=1601 -vvv --base64-output=decode-rows --database=test master-bin.000004 master-bin.000005 > binlog_test.sql

I used options -vvv that prints SQL representation of row events, so we can find the one which we want to skip and --base64-output=decode-rows to not print row events at all. We will not use this file for the restore, only for searching the DROP TABLE and DELETE events.

Here they are, at the positions 2007 and 2123, with GTID 0ec00eed-87f3-11eb-acd9-98af65266957:58 and 0ec00eed-87f3-11eb-acd9-98af65266957:59

# at 2007
#210321 13:29:58 server id 1 end_log_pos 2123 CRC32 0xd1eb9854 Query thread_id=138 exec_time=0 error_code=0
use `test`/*!*/;
SET TIMESTAMP=1616322598/*!*/;
DROP TABLE `foo` /* generated by server */
/*!*/;
# at 2123
#210321 13:30:08 server id 1 end_log_pos 2188 CRC32 0xfc9b2088 GTID last_committed=7 sequence_number=8 rbr_only=yes original_committed_timestamp=0 immediate_commit_timestamp=0 transaction_length=0
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
# original_commit_timestamp=0 (1970-01-01 02:00:00.000000 EET)
# immediate_commit_timestamp=0 (1970-01-01 02:00:00.000000 EET)
/*!80001 SET @@session.original_commit_timestamp=0*//*!*/;
/*!80014 SET @@session.original_server_version=0*//*!*/;
/*!80014 SET @@session.immediate_server_version=0*//*!*/;
SET @@SESSION.GTID_NEXT= '0ec00eed-87f3-11eb-acd9-98af65266957:59'/*!*/;
# at 2188
#210321 13:30:08 server id 1 end_log_pos 2260 CRC32 0x1d525b11 Query thread_id=138 exec_time=0 error_code=0
SET TIMESTAMP=1616322608/*!*/;
BEGIN
/*!*/;
# at 2260
#210321 13:30:08 server id 1 end_log_pos 2307 CRC32 0xb57ecb73 Table_map: `test`.`bar` mapped to number 226
# at 2307
#210321 13:30:08 server id 1 end_log_pos 2387 CRC32 0x6770a7e2 Delete_rows: table id 226 flags: STMT_END_F
### DELETE FROM `test`.`bar`
### WHERE
### @1=1 /* INT meta=0 nullable=0 is_null=0 */
### @2=1 /* INT meta=0 nullable=1 is_null=0 */
### DELETE FROM `test`.`bar`
### WHERE
...

Note that decoded row event contains a DELETE command for each affected row.

We may also find to which binary log this event belongs if search for the "Rotate to" event. In our case “Rotate to master-bin.000005” happened after the found positions, so we only need file master-bin.000004 In your case, you may need to skip events from the previous log files too.

So to restore the data we need to run mysqlbinlog one more time, this time with parameters:

mysqlbinlog  --start-position=1601 --exclude-gtids=0ec00eed-87f3-11eb-acd9-98af65266957:58-59 --database=test --skip-gtids=true master-bin.000004 master-bin.000005 > binlog_restore.sql

I removed options -vvvbecause we are not going to examine this restore file and option --base64-output=decode-rows because we need row events to present in the resulting file. I also used option --exclude-gtids=0ec00eed-87f3-11eb-acd9-98af65266957:58-59 to exclude GTIDs that we do not want to re-apply. We also need to use --skip-gtids=true because otherwise updates will be skipped since such GTIDs already exist on the server.

Now binlog_restore.sql contains all updates to the database test made after the backup and before the DROP statement. Let’s restore it.

mysql test < binlog_restore.sql

Restore went successfully. Our tables have all past updates.

mysql> select * from foo;
+----+------+
| id | f1   |
+----+------+
|  1 |    2 |
|  2 |    4 |
|  3 |    6 |
|  4 |    8 |
|  5 |   10 |
+----+------+
5 rows in set (0.01 sec)

mysql> select count(*) from bar;
+----------+
| count(*) |
+----------+
|       10 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from baz;
+----------+
| count(*) |
+----------+
|        5 |
+----------+
1 row in set (0.00 sec)

Conclusion

You may save the time required for PITR if use the per-database restore method. However, you need to take into account the following considerations:

  • mysqlbinlog does not support filtering per table, therefore you either need to restore the full database or use a fake server method, described in MySQL Point in Time Recovery the Right Way.
  • Per-database filters depend on the USE statement in the statement-based binary log format. Therefore option --database can only be considered safe with a row-based format.
  • If you do not use GTID you still can use this method. You will need to combine options --start-position and --stop-position to skip the event.

Percona XtraBackup is a free, open source database backup solution for Percona Server for MySQL and MySQL.

Jun
28
2018
--

Faster Point In Time Recovery (PITR) in PostgreSQL Using a Delayed Standby

PostgreSQL Point in Time Recovery

PostgreSQL Point in Time RecoveryThe need to recover a database back to a certain point in time can be a nerve-racking task for DBAs and for businesses. Can this be simplified? Could it be made to work faster? Can we recover to a given point in time with zero loss of transactions/records? Fortunately, the answer to these questions is yes. PostgreSQL Point in Time Recovery (PITR) is an important facility. It offers DBAs the ability to restore a PostgreSQL database simply, quickly and without the loss of transactions or data.

In this post, we’ll help you to understand how this can be achieved, and reduce the potential for pain in the event of panic situations where you need to perform a PITR.

Before proceeding further, let us understand what could force us to perform a PITR.

  1. Someone has accidentally dropped or truncated a table.
  2. A failed deployment has made changes to the database that are difficult to reverse.
  3. You accidentally deleted or modified a lot of data, and as a consequence you cannot run your applications.

In such scenarios, you would immediately look for the latest full backup and the relevant transaction logs (aka WALs in PostgreSQL) to recover up to a known point in the past, before the error occurred. But what if your backup is corrupt and not valid?

Well, it is very important to perform a backup and recovery validation to ensure that the backups are always recoverable—we will address this in a future post. But, if the backup that you are looking at is corrupt, that can be a nightmare. One such unlucky incident for GitLab, where there was a backup restoration failure, caused a major outage followed by a data loss after recovery.

https://about.gitlab.com/2017/02/01/gitlab-dot-com-database-incident/

Even the best of plans can be hard to realize in practice.

It may be that our backups are intact and recoverable. Can we afford to wait until we copy/download the backup and recover it to another disk or server? What if the database size is several hundreds of GBs or several TBs like GitLab’s?

The solution to the problem is: add another standby that is always delayed by a few hours or a day.

This is one of the great features available in PostgreSQL. If you have migrated from Oracle RDBMS to PostgreSQL, you can think of it as an equivalent to FLASHBACK DATABASE in Oracle. Flashback database helps you to rewind data back in time. However, the technique does not work if you have dropped a data file. In fact, this is the case for both Oracle RDBMS and PostgreSQL PITR. ?

https://docs.oracle.com/cd/E11882_01/backup.112/e10642/flashdb.htm#BRADV71000

Adding a Delayed Standby in PostgreSQL

It is important that we use features like streaming replication to achieve high availability in PostgreSQL. Most of the environments have 1 master with 1 or more slaves (standby), either in the same data centre or geographically distributed. To save the time needed for PITR, you can add another slave that can always be delayed by a certain amount of time—this could be hours or days.

For example, if I know that my deployment is determined to be successful when no issues are observed in the first 12 hours, then I might delay one of the standbys by 12 hours.

To delay a standby, once you have setup streaming replication between your PostgreSQL master and slave, the following parameter needs to be added to the recovery.conf file of the slave, followed by a restart.

recovery_min_apply_delay = '12h' # or '1min' or 1d'

Now, let’s consider an example where you have inserted 10000 records at 10:27:34 AM and you have accidentally deleted 5000 records at 10:28:43 AM. Let’s say that you have a standby that is delayed by 1 hour. The steps to perform PITR using the delayed standby through until 10:27:34 AM look like this:

Steps to perform PostgreSQL Point in Time Recovery using a delayed standby


Step 1

Stop the slave (delayed standby) immediately, as soon as you have noticed that an accidental change has happened. If you know that the change has been already applied on the slave, then you cannot perform the point in time recovery using this method.

$ pg_ctl -D $PGDATA stop -mf

Step 2

Rename the recovery.conf file in your standby to another name.

$ mv $PGDATA/recovery.conf $PGDATA/recovery.conf.old

Step 3

Create a new recovery.conf file with the required parameters for PITR.

# recovery.conf file always exists in the Data Directory of Slave
recovery_target_time = '2018-06-07 10:27:34 EDT'
restore_command = 'sh /var/lib/pgsql/scripts/restore_command_script.sh %p %f'
recovery_target_action = 'pause'
recovery_target_inclusive = 'false'

recovery_target_time

Specifies the timestamp up to which you wish to recover your database.

restore_command

Shell command that can be used by PostgreSQL to fetch the required Transaction Logs (WALs) for recovery.
PostgreSQL sends the arguments %p (path to WAL file) and % f (WAL file name) to this shell command. These arguments can be used in the script you use to copy your WALs.

Here is an example script for your reference. This example relies on rsync. The script connects to the backup server to fetch the WALs requested by PostgreSQL. (We’ll cover the procedure to archive these WALs in another blog post soon: this could be a good time to subscribe to the Percona blog mailing list!)

$ cat /var/lib/pgsql/scripts/restore_command_script.sh
#!/bin/bash
# Enable passwordless ssh to Backup Server
# $1 is %p substituted by postgres as the path to WAL File
# $2 is the %f substituted by postgres as the WAL File Name
LOG=/var/lib/pgsql/scripts/restore_command.log
Backup_Server=192.168.0.12
ArchiveDir='/archives'
#
wal=$2
wal_with_path=$1
rsync --no-motd -ave ssh ${Backup_Server}:${ArchiveDir}/${wal} ${wal_with_path} >>$LOG 2>&1
if [ "$?" -ne "0" ]
then
echo "Restore Failed for WAL : $wal" >> $LOG
exit 1
fi
#

recovery_target_action

This is the action that needs to be performed after recovering the instance up to the recovery_target_time. Setting this to pause would let you modify the recovery_target_time after recovery, if you need to. You can then replay the transactions at a slow pace until your desired recovery target is reached. For example, you can recover until 2018-06-07 10:26:34 EDT and then modify recovery_target_time to 2018-06-07 10:27:34 EDT when using pause.

When you know that all the data you are looking for has been recovered, you can issue the following command to stop the recovery process, change the timeline and open the database for writes.

select pg_wal_replay_resume();

Other possible settings for this parameter are promote and shutdown. These do not allow you to replay a few more future transactions after the recovery, as you can with pause.

recovery_target_inclusive

Whether to stop recovery just after the specified recovery_target_time(true) or before(false).

Step 4

Start PostgreSQL using pg_ctl. Now, it should read the parameters in recovery.conf and perform the recovery until the time you set in the recovery_target_time.

$ pg_ctl -D $PGDATA start

Step 5

Here is how the log appears. It says that has performed point-in-time-recovery and has reached a consistent state as requested.

2018-06-07 10:43:22.303 EDT [1423] LOG: starting point-in-time recovery to 2018-06-07 10:27:34-04
2018-06-07 10:43:22.607 EDT [1423] LOG: redo starts at 0/40005B8
2018-06-07 10:43:22.607 EDT [1423] LOG: consistent recovery state reached at 0/40156B0
2018-06-07 10:43:22.608 EDT [1421] LOG: database system is ready to accept read only connections
2018-06-07 10:43:22.626 EDT [1423] LOG: recovery stopping before commit of transaction 570, time 2018-06-07 10:28:59.645685-04
2018-06-07 10:43:22.626 EDT [1423] LOG: recovery has paused
2018-06-07 10:43:22.626 EDT [1423] HINT: Execute pg_wal_replay_resume() to continue.

Step 6

You can now stop recovery and open the database for writes after PITR.

Before executing the next command, you may want to verify that you have got all the desired data by connecting to the database and executing some SQL’s. You can still perform reads before you stop recovery. If you notice that you need another few minutes (or hours) of transactions, then modify the parameter recovery_target_time and go back to step 4. Otherwise, you can stop the recovery by running the following command.

$ psql
select pg_wal_replay_resume();

Summing up

Using PostgreSQL Point in time Recovery is the most simple of procedures that does not involve any effort in identifying the latest backups, transaction logs and space or server to restore in a database emergency. These things happen! Also, it could save a lot of time because the replay of WALs is much faster than rebuilding an entire instance using backups, especially when you have a huge database.

Important post script: I tested and recorded these steps using PostgreSQL 10.4. It is possible with PostgreSQL 9.x versions, however, the parameters could change slightly and you should refer to the PostgreSQL documentation for the correct syntax.

The post Faster Point In Time Recovery (PITR) in PostgreSQL Using a Delayed Standby appeared first on Percona Database Performance Blog.

Oct
23
2017
--

MySQL Point in Time Recovery the Right Way

MySQL Point In Time Recovery

MySQL Point In Time RecoveryIn this blog, I’ll look at how to do MySQL point in time recovery (PITR) correctly.

Sometimes we need to restore from a backup, and then replay the transactions that happened after the backup was taken. This is a common procedure in most disaster recovery plans, when for example you accidentally drop a table/database or run an update/delete without the “where” clause and lose data.

The usual way is to get a copy of your binlogs and use mysqlbinlog to replay those transactions. But this approach has many pitfalls that can make the whole PITR process a nightmare. Some examples:

  • You need to make sure to run a single mysqldump command with all related binlogs, and pipe them to mysql at once. Otherwise, if binlog.000001 creates a temporary table, and binlog.000002 requires that temporary table, it will not be present. Each execution of MySQL creates a new connection:
shell> mysqlbinlog binlog.000001 | mysql -u root -p # Creates tmp table X
shell> mysqlbinlog binlog.000002 | mysql -u root -p # Uses tmp table X

  • We can say that it has to be an atomic operation. If it fails halfway through, it will be very difficult to know where it failed and even more difficult to resume from that point forward. There are many reasons for it to fail: InnoDB lock wait timeout / deadlock with some concurrent transaction, server and client have different
    max_allowed_packet

     and you get a Lost connection to MySQL server during query error, and so on.

So how can we overcome those limitations and have a reliable way to do Point In Time Recovery?

We can restore the backup on the desired server, build a second server with just the minimal MySQL required data and move the all binary logs to this “fake” server datadir. Then we need to configure the server where we want the PITR to happen as a slave of the fake server. From this point forward, it’s going to be pure MySQL replication happening.

To illustrate it, I create a Docker container on the machine. I have Percona Server for MySQL running on the box listening on 3306, and have already restored the backup on it. There is a tarball there with all binlogs required. The saved positions for PITR are as follows:

[root@localhost ~]# cat /var/lib/mysql/xtrabackup_binlog_info
master-bin.000007	1518932

I create a folder to store the Docker MySQL datadir:

mkdir /tmp/pitr
chown -R 1001 /tmp/pitr

I start the Docker container. As we can see from xtrabackup_binlog_info, my binlogs are named master-bin and I’ll be setting the same server-id as original master:

docker run --name ps_pitr -v /tmp/pitr:/var/lib/mysql
-p 3307:3306 -e MYSQL_ROOT_PASSWORD=secret
-d percona/percona-server:5.7.18
--log_bin=master-bin --server-id=10

In case you want to make usage of GTID, append --gtid-mode=ON --enforce_gtid_consistency=ON to the end of the Docker command.

The command above starts a MySQL instance, invokes mysqld –initialize, sets the root password to secret and it’s port 3306 is mapped back to my local 3307 port. Now I’ll stop it, remove the binlogs that it created, uncompress and move all required binlogs to its datadir mapped folder and start it again:

docker stop ps_pitr
rm /tmp/pitr/master-bin.*
tar -zxf binlogs.tgz -C /tmp/pitr
chown -R 1001 /tmp/pitr/master-bin.*
docker start ps_pitr

If it all worked correctly, at this point we can see the full list of binary logs on the Docker container by connecting on port 3307:

mysql -u root -psecret -P 3307 --protocol=TCP -e "SHOW BINARY LOGS"
mysql: [Warning] Using a password on the command line interface can be insecure.
+-------------------+-----------+
| Log_name          | File_size |
+-------------------+-----------+
| master-bin.000005 |  26216208 |
| master-bin.000006 |  26214614 |
| master-bin.000007 |  26214902 |
. . .
| master-bin.000074 |       154 |
+-------------------+-----------+

Now, all we need to do is connect to our server, which has the backup restored, and configure it as a slave from 3307:

mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or g.
Your MySQL connection id is 6
Server version: 5.7.18-16 Percona Server (GPL), Release 16, Revision d7301f8
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.
mysql> CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=3307, MASTER_USER='root', MASTER_PASSWORD='secret', MASTER_LOG_FILE='master-bin.000007', MASTER_LOG_POS=1518932;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> START SLAVE;
Query OK, 0 rows affected (0.01 sec)
mysql> SHOW SLAVE STATUSG
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 3307
                Connect_Retry: 60
              Master_Log_File: master-bin.000008
          Read_Master_Log_Pos: 449696
               Relay_Log_File: localhost-relay-bin.000002
                Relay_Log_Pos: 28957
        Relay_Master_Log_File: master-bin.000007
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 15217950
              Relay_Log_Space: 11476311
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 4382
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 10
                  Master_UUID: 80b9fe26-a945-11e7-aa1d-0242ac110002
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Opening tables
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.17 sec)
. . .
mysql> SHOW SLAVE STATUSG
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 3307
                Connect_Retry: 60
              Master_Log_File: master-bin.000074
          Read_Master_Log_Pos: 154
               Relay_Log_File: localhost-relay-bin.000133
                Relay_Log_Pos: 381
        Relay_Master_Log_File: master-bin.000074
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 154
              Relay_Log_Space: 819
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 10
                  Master_UUID: 80b9fe26-a945-11e7-aa1d-0242ac110002
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.01 sec)

If you want to apply logs up to a particular time you can make use of mysqlbinlog to verify what the last position / GTID it should apply, and use START SLAVE UNTIL MASTER_LOG_FILE = 'log_name', MASTER_LOG_POS = log_pos or START SLAVE SQL_THREAD UNTIL SQL_AFTER_GTIDS = 3E11FA47-71CA-11E1-9E33-C80AA9429562:11-56.

Special thanks to Marcos Albe, who originally showed me this MySQL point in time recovery approach.

Feb
16
2017
--

MySQL Bug 72804 Workaround: “BINLOG statement can no longer be used to apply query events”

MySQL Bug 72804

MySQL Bug 72804In this blog post, we’ll look at a workaround for MySQL bug 72804.

Recently I worked on a ticket where a customer performed a point-in-time recovery PITR using a large set of binary logs. Normally we handle this by applying the last backup, then re-applying all binary logs created since the last backup. In the middle of the procedure, their new server crashed. We identified the binary log position and tried to restart the PITR from there. However, using the option

--start-position

, the restore failed with the error “The BINLOG statement of type Table_map was not preceded by a format description BINLOG statement.” This is a known bug and is reported as MySQL Bug #72804: “BINLOG statement can no longer be used to apply Query events.”

I created a small test to demonstrate a workaround that we implemented (and worked).

First, I ran a large import process that created several binary logs. I used a small value in 

max_binlog_size

 and tested using the database “employees” (a standard database used for testing).Then I dropped the database.

mysql> set sql_log_bin=0;
Query OK, 0 rows affected (0.33 sec)
mysql> drop database employees;
Query OK, 8 rows affected (1.25 sec)

To demonstrate the recovery process, I joined all the binary log files into one SQL file and started an import.

sveta@Thinkie:~/build/ps-5.7/mysql-test$ ../bin/mysqlbinlog var/mysqld.1/data/master.000001 var/mysqld.1/data/master.000002 var/mysqld.1/data/master.000003 var/mysqld.1/data/master.000004 var/mysqld.1/data/master.000005 > binlogs.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ GENERATE_ERROR.sh binlogs.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysql < binlogs.sql
ERROR 1064 (42000) at line 9020: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'inserting error

I intentionally generated a syntax error in the resulting file with the help of the GENERATE_ERROR.sh script (which just inserts a bogus SQL statement in a random row). The error message clearly showed where the import stopped: line 9020. I then created a file that cropped out the part that had already been imported (lines 1- 9020), and tried to import this new file.

sveta@Thinkie:~/build/ps-5.7/mysql-test$ tail -n +9021 binlogs.sql >binlogs_rest.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysql < binlogs_rest.sql
ERROR 1609 (HY000) at line 134: The BINLOG statement of type `Table_map` was not preceded by a format description BINLOG statement.

Again, the import failed with exactly the same error as the customer. The reason for this error is that the BINLOG statement – which applies changes from the binary log – expects that the format description event gets run in the same session as the binary log import, but before it. The format description existed initially at the start of the import that failed at line 9020. The later import (from line 9021 on) doesn’t contain this format statement.

Fortunately, this format is the same for the same version! We can simply take it from the beginning the SQL log file (or the original binary file) and put into the file created after the crash without lines 1-9020.

With MySQL versions 5.6 and 5.7, this event is located in the first 11 rows:

sveta@Thinkie:~/build/ps-5.7/mysql-test$ head -n 11 binlogs.sql | cat -n
     1	/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
     2	/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
     3	DELIMITER /*!*/;
     4	# at 4
     5	#170128 17:58:11 server id 1  end_log_pos 123 CRC32 0xccda074a 	Start: binlog v 4, server v 5.7.16-9-debug-log created 170128 17:58:11 at startup
     6	ROLLBACK/*!*/;
     7	BINLOG '
     8	g7GMWA8BAAAAdwAAAHsAAAAAAAQANS43LjE2LTktZGVidWctbG9nAAAAAAAAAAAAAAAAAAAAAAAA
     9	AAAAAAAAAAAAAAAAAACDsYxYEzgNAAgAEgAEBAQEEgAAXwAEGggAAAAICAgCAAAACgoKKioAEjQA
    10	AUoH2sw=
    11	'/*!*/;

The first six rows are meta information, and rows 6-11 are the format event itself. The only thing we need to export into our resulting file is these 11 lines:

sveta@Thinkie:~/build/ps-5.7/mysql-test$ head -n 11 binlogs.sql > binlogs_rest_with_format.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ cat binlogs_rest.sql >> binlogs_rest_with_format.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysql < binlogs_rest_with_format.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$

After this, the import succeeded!

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com