Jan 05, 2023

Tale of a MySQL 8 Upgrade and Implications on Backup

Recently, we performed a major version upgrade of the database engine in one of our customers’ environments, from MySQL 5.7.26 to 8.0.27. After this upgrade, we experienced issues with backups and replication on one of the nodes.

In this article, I will explain these issues in detail and recommend a way to take backups from a replication environment.

To begin with, we upgraded all the database nodes from 5.7.26 to 8.0.27, and, as recommended, backups are taken from one of the replica nodes. Physical backups are taken using Percona XtraBackup (PXB), so the database is not locked during the backup.

With MySQL 5.7, backups were taken using PXB 2.4. Because MySQL 8.0 introduces a new data dictionary and new redo log and undo log formats, we also upgraded PXB to 8.0.27 after upgrading the database, to avoid compatibility issues.

Before we discuss our original issue in detail, let us look at some of the options that should be used with PXB when taking backups from a replication environment to ensure consistency. These options were in use in this case as well.

--slave-info

It prints and stores the binary log position of the source server, which is useful for setting up new replicas of the source.

--safe-slave-backup

This option handles temporary tables on replicas: when used, it stops the SQL thread and waits until there are no open temporary tables before taking the backup.

--lock-ddl

This option blocks all DDL operations on the replica while the backup is running, so DDL events cannot corrupt the backup. DML events continue to occur; only DDL is blocked.
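Putting these options together, a typical backup invocation from the replica looks like the sketch below (user, password, and target directory are placeholders):

# Illustrative PXB 8.0 backup command run on a replica node.
xtrabackup --backup \
  --user=backup_user --password='...' \
  --slave-info --safe-slave-backup --lock-ddl \
  --target-dir=/backups/$(date +%F)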

Now, coming back to our original issue: after the database was successfully upgraded to 8.0.27, we noticed several issues related to backups. One of the major ones was a replication failure on the replica node during the backup.

We verified the following to make sure that nothing had changed after the upgrade:

  1. No change in the backup command
  2. All the tables were InnoDB as before and no engine change was performed

Further, we noticed that whenever PXB started taking a backup, replication stopped on the backup server. Upon checking further, we saw that only the SQL thread was stopped when the backup started, while the IO thread kept running. This pointed us to the fact that the replication break was caused by PXB running with the --safe-slave-backup option, which, as mentioned earlier, is the recommended way of taking backups from a replica node.
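During the backup window, this state is easy to confirm from the replica (an illustrative check; on 5.7 the equivalent is SHOW SLAVE STATUS with the Slave_*_Running fields):

# Show only the two replication thread states (MySQL 8.0.22+ field names).
mysql -e "SHOW REPLICA STATUS\G" | grep -E 'Replica_(IO|SQL)_Running:'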

However, in the 5.7 environment the backup had also been running with the --safe-slave-backup option, and at that time backups ran fine and replication had no issues. So with this major version upgrade, the only backup-related change we made was upgrading the PXB packages from 2.4.20 to 8.0.27.

To get more insight, I ran a simple test and performed the backup using both PXB 8.0.27 and PXB 2.4.26. During this test, I noticed that with PXB 8.0, --safe-slave-backup stops the SQL thread right after the backup starts (even before copying the InnoDB files). In this test scenario, a complete backup takes around 20 seconds to finish, and the SQL thread was stopped for that entire duration.

2022-09-09T11:15:00.668080-00:00 0 [Note] [MY-011825] [Xtrabackup] Using server version 8.0.27

**2022-09-09T11:15:00.717674-00:00 0 [Note] [MY-011825] [Xtrabackup] Slave open temp tables: 0

2022-09-09T11:15:00.720502-00:00 0 [Note] [MY-011825] [Xtrabackup] Slave is safe to backup.**

2022-09-09T11:15:00.720573-00:00 0 [Note] [MY-011825] [Xtrabackup] Executing LOCK INSTANCE FOR BACKUP 

...

...

...

2022-09-09T11:15:01.248442-00:00 2 [Note] [MY-011825] [Xtrabackup] Copying ./db/t.ibd to /root/backups/db/t.ibd

2022-09-09T11:15:01.980691-00:00 1 [Note] [MY-011825] [Xtrabackup] >> log scanned up to (64948804922)

...

...

**2022-09-09T11:15:19.589799-00:00 0 [Note] [MY-011825] [Xtrabackup] Executing FLUSH TABLES WITH READ LOCK...**

**2022-09-09T11:15:19.590675-00:00 0 [Note] [MY-011825] [Xtrabackup] Starting to backup non-InnoDB tables and files**

...

...

2022-09-09T11:15:20.610477-00:00 0 [Note] [MY-011825] [Xtrabackup] Finished backing up non-InnoDB tables and files

2022-09-09T11:15:20.676140-00:00 0 [Note] [MY-011825] [Xtrabackup] Stopping log copying thread at LSN 64948804922

2022-09-09T11:15:20.728679-00:00 0 [Note] [MY-011825] [Xtrabackup] Executing UNLOCK INSTANCE

2022-09-09T11:15:20.729034-00:00 0 [Note] [MY-011825] [Xtrabackup] Executing UNLOCK TABLES

2022-09-09T11:15:20.729254-00:00 0 [Note] [MY-011825] [Xtrabackup] All tables unlocked

**2022-09-09T11:15:20.729268-00:00 0 [Note] [MY-011825] [Xtrabackup] Starting slave SQL thread**

...

...

2022-09-09T11:15:22.099206-00:00 0 [Note] [MY-011825] [Xtrabackup] completed OK!

With PXB 2.4, however, the SQL thread is stopped only while the non-InnoDB files are being copied; it keeps running while the InnoDB files are backed up. So out of the 20-second backup, the SQL thread was stopped for only about four seconds.

Using server version 5.7.26

/usr/bin/xtrabackup version 2.4.26 based on MySQL server 5.7.35 Linux (x86_64) (revision id: 19de43b)

...

...

220909 11:20:21 [01] Copying ./db/t.ibd to /root/backups/db/t.ibd

220909 11:20:22 >> log scanned up to (3991557882)

...

220909 11:20:35 >> log scanned up to (3991557882)

220909 11:20:35 [01]        ...done

...

...

**220909 11:20:36 Slave open temp tables: 0

220909 11:20:36 Slave is safe to backup.**

220909 11:20:36 Executing FLUSH NO_WRITE_TO_BINLOG TABLES...

220909 11:20:36 Executing FLUSH TABLES WITH READ LOCK...

**220909 11:20:36 Starting to backup non-InnoDB tables and files**

...

...

220909 11:20:40 Finished backing up non-InnoDB tables and files

...

...

**220909 11:20:40 Executing UNLOCK TABLES

220909 11:20:40 All tables unlocked

Starting slave SQL thread**

...

...

220909 11:20:40 completed OK!

The PXB --safe-slave-backup option is designed to ensure consistent backups of a database replica. In versions prior to 8.0.22, this option stopped the SQL replica thread after backing up the InnoDB tables and before copying the non-InnoDB data files. However, this could result in a corrupt backup if a replicated DDL (Data Definition Language) statement was executed on the backup replica during the backup process.

To address this issue, the behavior of the safe-slave-backup option was changed in version 8.0.22. Now, when this option is specified, the “STOP SLAVE SQL_THREAD” command is issued at the beginning of the backup process. This prevents any replicated DDL statements from being executed during the backup, ensuring a consistent and reliable backup.

This is why, after upgrading the PXB packages to 8.0, replication stopped as soon as the backup process started: PXB stopped the SQL thread even though all the tables were InnoDB.
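Put side by side, the two logs above boil down to the following order of operations for --safe-slave-backup on PXB 8.0.22 and later (a simplified reconstruction from the log output, not PXB's literal statement list):

STOP SLAVE SQL_THREAD;        -- issued at the very start of the backup
LOCK INSTANCE FOR BACKUP;     -- DDL blocked; the IO thread keeps filling the relay log
-- ... copy InnoDB files (the bulk of the backup time) ...
FLUSH TABLES WITH READ LOCK;  -- copy non-InnoDB tables and files
UNLOCK INSTANCE;
UNLOCK TABLES;
START SLAVE SQL_THREAD;       -- replication resumes only after the backup completes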

Conclusion

As of Percona XtraBackup 8.0.22-15.0, the --safe-slave-backup option stops the SQL replica thread before copying the InnoDB files. This behavior prevents any replicated DDL statements from being executed during the backup, ensuring a consistent and reliable backup.

Percona XtraBackup is a free, online, open source, complete database backup solution for all versions of Percona Server for MySQL and MySQL.


Jul 25, 2016

MongoDB Consistent Backups

In this post, I’m going to discuss MongoDB consistent backups, and how to achieve them.

You might have read before that MongoDB backups are not consistent. But what if I told you there is a tool that could make them consistent? What if this tool would also make them cluster-wide consistent, automatically compress the backup, serve as a first step toward continuous incremental backups, notify your monitoring system, and upload the backup to cloud storage for you?

It’s all TRUE!

Recently, Percona-Labs created a new repository aimed at exactly these issues. We hope it will eventually grow into something that becomes part of the officially supported tools (like Percona Toolkit and Percona XtraBackup). Before we get into how it works, let's talk about why we need it and its key highlights. Then (for all the engineering types reading this) we can discuss what it does and why.

Why do we need a consistent backup tool?

The first thing to note is that you absolutely can't have a consistent backup of a working system unless your node is part of a replica set. (A single-node replica set counts for this purpose.) Why? Consistency requires an operations log that records what changed between the first and last points of the backup; this lets us ensure the backup is consistent as of its end timestamp. MongoDB cannot take a “snapshot” of the data and preserve it while other changes occur, so it has no ACID-like isolation to make the backup consistent to its start time. It can, however, be made consistent to the backup's end point by applying the oplog deltas at the end of the restore process.

You might say, “but mongodump already provides --oplog for this feature.” You are right: it does, and it works great if you only have a single replica set to back up; a minimal example follows below. When we bring sharding into the mix, however, things get vastly more complicated: the flag is ignored, and the dump hits your primaries, as the first diagram below shows.
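For the single-replica-set case, capture and replay with the stock tools looks like this (host and paths are placeholders):

# Dump the data with the oplog captured alongside it...
mongodump --host rs0/db1.example.com:27017 --oplog --out /backups/rs0
# ...then, on restore, replay that oplog up to the dump's end timestamp.
mongorestore --oplogReplay /backups/rs0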

[Diagram: per-shard dump and oplog recording windows ending at different times, leaving no common consistency point]

In the diagram above, you can see that the backup and oplog recording for the first shard ended long before the second shard's. As a result, the needed consistency point is nowhere close to being covered by the red line. Even if all your shards were the same size, there would be some variance due to network, disk, CPU, and memory speeds. The new tool helps here by keeping track of the dumps and by running a thread that records the oplog for all shards until the last shard finishes. This ensures that every shard can be synced to the point in time at which the last shard finished; at that moment, we have a consistent backup across all the shards. As you can see below, the oplog watcher keeps following both shards until the last shard's dump finishes, so on recovery they remain in sync.

[Diagram: with the tool, oplog tailing on both shards continues until the last shard's dump finishes, covering a common consistency point]
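Mechanically, the trick reduces to a small amount of coordination. Below is a minimal Python sketch of the idea (not the tool's actual code; hosts, paths, and the record_new_oplog_entries() stub are all illustrative):

import subprocess
import threading

SHARDS = {"shard1": "shard1.example.com:27018",
          "shard2": "shard2.example.com:27018"}
dumps_done = threading.Event()

def dump_shard(name, host):
    # Plain mongodump of one shard; the slowest dump defines the
    # cluster-wide consistency point.
    subprocess.run(["mongodump", "--host", host,
                    "--out", "/backups/cluster1/" + name], check=True)

def record_new_oplog_entries(name, host):
    # Stub: the real work is reading new entries from local.oplog.rs on the
    # shard and appending them to the backup directory.
    pass

def tail_oplog(name, host):
    # Keep capturing oplog entries until every shard's dump is complete, so
    # all shards can later be replayed up to the same end timestamp.
    while not dumps_done.is_set():
        record_new_oplog_entries(name, host)
        dumps_done.wait(timeout=1)

dumpers = [threading.Thread(target=dump_shard, args=item) for item in SHARDS.items()]
tailers = [threading.Thread(target=tail_oplog, args=item) for item in SHARDS.items()]
for t in dumpers + tailers:
    t.start()
for t in dumpers:
    t.join()        # wait for the slowest shard dump...
dumps_done.set()    # ...then let every tailer stop at a common point
for t in tailers:
    t.join()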

You might ask, “well, what about the metadata stored in the config servers?” This is a great question, as the tool's behavior differs depending on whether you're using MongoDB 3.2's new config-servers-as-a-replica-set feature or the legacy config server approach.

In legacy mode, we fsyncAndLock the config servers just long enough to record a dump of the config data. Then we stop the oplog tailer threads for all the shards, and only after those tailers finish do we unlock the config server. This removes the race conditions that could occur if closing an oplog cursor took longer than expected. If we run in 3.2 mode, however, the config servers act just like another shard: they get dumped at the same time, and their oplog is tailed until the data shard dumps complete. The newest features available in MongoDB Community, MongoDB Enterprise, and Percona Server for MongoDB 3.2 make the process much simpler.
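For the legacy path, the lock-dump-unlock sequence is conceptually this (a simplified sketch using the mongo shell helpers, not the tool's actual code; the real tool holds the lock until the oplog tailers have stopped):

mongo --host cfg1.example.com --eval 'db.fsyncLock()'     # block writes on the config server
mongodump --host cfg1.example.com --db config --out /backups/cluster1/config
mongo --host cfg1.example.com --eval 'db.fsyncUnlock()'   # resume writes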

Key takeaways from the new tool

  1. Not yet an official Percona tool, but already in use, as it is essentially a wrapper that runs multiple mongodumps for you.
  2. If you run make setup, it outputs a single binary file that needs only Python 2.7 installed on your database system, even though under the hood it runs many Python modules in a virtualenv.
  3. Dumps all shards in parallel and keeps tailing the oplog until all dumps are complete.
  4. Handles backing up metadata for both old and new config server topologies.
  5. Can currently upload to S3; support for more cloud storage providers is coming.
  6. Backups are compressed by default.
  7. Uses the cluster name, timestamp, and shard name to build backup paths like /cluster1/<timestamp>/shard1.tgz, helping you keep things organized and letting you remove old backups by timestamp and cluster name (see the cleanup sketch after this list).
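As an example of that last point, age-based pruning becomes a one-liner (an illustrative sketch; BACKUP_ROOT and the 14-day retention window are assumptions):

# Remove cluster1 backups older than 14 days, relying on the
# /<cluster>/<timestamp>/ path layout described above.
BACKUP_ROOT=/backups
find "$BACKUP_ROOT/cluster1" -mindepth 1 -maxdepth 1 -type d -mtime +14 -exec rm -rf {} +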

Desired Roadmap

  • Mature into an officially supported Percona product, like XtraBackup
  • Fully open source, welcoming community improvements
  • Extend uploading to Rackspace CloudFiles, Azure ZRS, Google Cloud Storage, and more
  • Complementary documentation on restores (though you can also just use the native mongorestore tool)
  • Modular backup methods extending to mongodump, LVM snapshots, iSCSI, EBS snapshots, MongoDB commands, and more
  • Encryption before saving to disk
  • Partial backups and restores limited to specific databases and collections
  • Offline backup querying

Please be sure to check out the GitHub repository, mongodb_consistent_backup, and log any issues or feature requests.

Feel free to reach out to me on Twitter @dbmurphy_data or @percona with any questions or suggestions as well.
