Sep
17
2025
--

MySQL with Diagrams Part Three: The Life Story of the Writing Process

How MySQL writes work: When you run a simple write, it may look simple, but under the hood, MySQL’s InnoDB engine kicks off a pretty complex sequence to ensure your data stays safe, consistent, and crash-recoverable. In the top-left corner of the diagram, we see exactly where this begins — the moment the query is executed. The log […]

May
29
2025
--

How to Safely Upgrade InnoDB Cluster From MySQL 8.0 to 8.4

Upgrade InnoDB Cluster: In this blog, we continue from where we left off in the previous post, InnoDB Cluster Setup: Building a 3-Node High Availability Architecture, where we demonstrated how to set up a MySQL InnoDB Cluster with three nodes to achieve high availability. Here, we walk through the step-by-step process of performing a rolling upgrade of that […]

May
20
2025
--

InnoDB Cluster: Set Up Router and Validate Failover

Setting up an InnoDB Cluster requires three key components: Group Replication, MySQL Shell, and MySQL Router. In the previous post, we covered the process of building a 3-node InnoDB Cluster. In this post, we shift our focus to configuring MySQL Router and validating failover functionality. Environment overview: We are using three InnoDB Cluster nodes along […]

Mar
24
2025
--

Percona XtraBackup 8.4 Pro: Reduce Server Locking by up to 4300X

Percona XtraBackup 8.4 Pro locking: When performing backups, reducing the amount of time your server is locked can significantly improve performance and minimize disruptions. Percona XtraBackup 8.4 Pro introduces improvements in how DDL (Data Definition Language) locks (aka Backup Locks) are managed, allowing for reduced locking during backups. In this post, we’ll explore the impact of these enhancements. TL;DR (Summary) […]

Dec
03
2024
--

Who Ate My MySQL Table Rows?

ALTER TABLE and OPTIMIZE TABLE on an InnoDB table: TL;DR: Running ALTER TABLE or OPTIMIZE TABLE on an InnoDB table in a way that rebuilds the table without blocking concurrent changes to it (i.e., executed using the INPLACE algorithm), combined with concurrent DML or purge activity on the table, can occasionally lead to two significant problems: ALTER/OPTIMIZE TABLE failing with an unnecessary duplicate key error (even though there are no […]

Sep
25
2024
--

How Network Splits/Partitions Impact Group Replication in MySQL

Network Splits/Partition on Group Replication: In this blog post, we will explore how network partitions impact group replication and the way it detects and responds to failures. In case you haven’t checked out my previous blog post about group replication recovery strategies, please have a look at it for some insight. Topology: Scenario 1: One of the GR nodes […]

Aug
30
2024
--

When Warnings Deceive: The Curious Case of InnoDB’s Row Size Limitation

Mysterious warning: Recently, I was involved in an investigation whose goal was to find out the reason for a warning message like this: The message looks clear, doesn’t it? Well, the problem was that this particular table had not been changed for years, and so no DDL (ALTER) query was involved here. Moreover, there […]

Jan
18
2024
--

Syscalls Analysis in MySQL When Using innodb_flush_method and innodb_use_fdatasync

MySQL When Using innodb_flush_method and innodb_use_fdatasync: In this blog post, we will discuss how to validate, at the operating system level, the effects of changing innodb_flush_method to values other than the default (particularly O_DIRECT, which is most commonly used) and the use of innodb_use_fdatasync. Introduction: First, let’s define what the innodb_flush_method parameter does. It dictates how InnoDB manages the flushing of data […]

Sep
19
2023
--

Use Physical Backups With MySQL InnoDB Redo Log Archiving

InnoDB Redo Log Archiving

In the world of data backup and security, physical backups play an extremely important role. Physical backup methods are faster than logical because they involve only file copying without conversion. This type of backup is suitable for large, important databases that need to be recovered quickly when problems occur.

Physical backups consist of raw copies of the directories and files that store database contents. In addition to the databases themselves, the backup can include any related files, such as log or configuration files. Since backup speed and compactness are important for busy, important databases, Percona’s open source physical backup solution, Percona XtraBackup (PXB), takes all of these aspects into account and brings these capabilities to the MySQL world.

This blog post will walk you through how PXB uses MySQL’s InnoDB redo log archiving to keep up with the redo log files on the busiest systems, and looks at a new feature released in PXB version 8.0.34-29 that lets you set the redo log archive directory while performing a backup.

InnoDB redo log archiving

Backup utilities that copy redo log records may sometimes fail to keep pace with redo log generation while a backup operation is in progress, resulting in lost redo log records due to overwritten records. This issue most often occurs when there is significant MySQL server activity during the backup operation, and the redo log file storage media operates faster than the backup storage media. The redo log archiving feature, introduced in MySQL 8.0.17, addresses this issue by sequentially writing redo log records to an archive file and the redo log files. Backup utilities can copy redo log records from the archive file as necessary, thereby avoiding the potential loss of data.

Enabling redo log archiving on the server requires setting a value for the innodb_redo_log_archive_dirs system variable. The value is specified as a semicolon-separated list of labeled redo log archive directories. The label:directory pair is separated by a colon (:). For example:

mysql> SET GLOBAL innodb_redo_log_archive_dirs='label1:directory_path1[;label2:directory_path2;…]';
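For instance, a value with two labeled directories could look like the following (the paths here are purely illustrative):

mysql> SET GLOBAL innodb_redo_log_archive_dirs='backup1:/var/lib/mysql-redo-archive;backup2:/mnt/redo-archive2';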

First, we will demonstrate how we tested the new InnoDB redo log archiving on Percona Server for MySQL 8.0.33-25 (GPL), using MySQL Shell (mysqlsh) to enable it.

  1. Create new directories for storing redo log archives.
[root@ip-xxx-xx-xx-xx ~]# mkdir -p /var/lib/mysql-redo-archive/backup1
[root@ip-xxx-xx-xx-xx ~]# chown mysql. -R /var/lib/mysql-redo-archive
[root@ip-xxx-xx-xx-xx ~]# chmod -R 700 /var/lib/mysql-redo-archive/

  2. Log in to mysqlsh as the root user and check the variable innodb_redo_log_archive_dirs.
[root@ip-xxx-xx-xx-xx ~]# mysqlsh -uroot -p
MySQL Shell 8.0.33
Copyright (c) 2016, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.
Type 'help' or '?' for help; 'quit' to exit.
WARNING: Using a password on the command line interface can be insecure.
Creating a session to 'root@localhost'
Fetching schema names for auto-completion... Press ^C to stop.
Your MySQL connection id is 10 (X protocol)
Server version: 8.0.33-25 Percona Server (GPL), Release 25, Revision 60c9e2c5
No default schema selected; type use <schema> to set one.
MySQL localhost:33060+ ssl  SQL >
MySQL localhost:33060+ ssl JS > \sql
Switching to SQL mode... Commands end with ;
Fetching global names for auto-completion... Press ^C to stop.

MySQL localhost:33060+ ssl SQL > show global variables like 'innodb_redo_log_ar%';
+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| innodb_redo_log_archive_dirs |       |
+------------------------------+-------+
1 row in set (0.0031 sec)

  3. Set the value of innodb_redo_log_archive_dirs to point to the directory we created for the redo log archives, as shown below.
MySQL  localhost:33060+ ssl  SQL > set persist innodb_redo_log_archive_dirs='backup1:/var/lib/mysql-redo-archive/';
Query OK, 0 rows affected (0.0019 sec)
MySQL  localhost:33060+ ssl  SQL >  show global variables like 'innodb_redo_log_ar%';
+------------------------------+--------------------------------------+
| Variable_name                | Value                                |
+------------------------------+--------------------------------------+
| innodb_redo_log_archive_dirs | backup1:/var/lib/mysql-redo-archive/ |
+------------------------------+--------------------------------------+
1 row in set (0.0025 sec)

  4. At this point, redo log archiving is configured but not yet active. It becomes active when the session that initializes the backup invokes innodb_redo_log_archive_start(label, subdirectory), where the label matches one configured in innodb_redo_log_archive_dirs and the optional second argument names a subdirectory under that path (both are 'backup1' in the call below).
MySQL  localhost:33060+ ssl  SQL > select innodb_redo_log_archive_start('backup1', 'backup1');
+-----------------------------------------------------+
| innodb_redo_log_archive_start('backup1', 'backup1') |
+-----------------------------------------------------+
|                                                   0 |
+-----------------------------------------------------+
1 row in set (0.0229 sec)

How to check if the redo log archiving is active

MySQL  localhost:33060+ ssl  SQL > select * from performance_schema.file_instances where event_name like '%::redo_log_archive_file'\G
*************************** 1. row ***************************
FILE_NAME: /var/lib/mysql-redo-archive/backup1/archive.f255a32d-2fb4-11ee-889e-0242ac110005.000001.log
EVENT_NAME: wait/io/file/innodb/meb::redo_log_archive_file
OPEN_COUNT: 1
1 row in set (0.0015 sec)

However, this alone is not enough to ensure that redo log archiving is active. We can also check whether the archiving thread is running with this query:

MySQL  localhost:33060+ ssl  SQL > select thread_id, name, type from performance_schema.threads  where name like '%redo_log_archive%';
+-----------+-----------------------------------------------------+------------+
| thread_id | name                                                | type       |
+-----------+-----------------------------------------------------+------------+
|        48 | thread/innodb/meb::redo_log_archive_consumer_thread | BACKGROUND |
+-----------+-----------------------------------------------------+------------+
1 row in set (0.0021 sec)

If a row is returned, it means that the redo log archiving is enabled and active.

If, instead of reverting, you want the setting to persist across server restarts, you can also add it to my.cnf under [mysqld]:

[mysqld]
innodb_redo_log_archive_dirs = redo-archived1:/mysql/archived-redo1

How to stop redo log archiving

MySQL  localhost:33060+ ssl  SQL > select innodb_redo_log_archive_stop();
+--------------------------------+
| innodb_redo_log_archive_stop() |
+--------------------------------+
|                              0 |
+--------------------------------+
1 row in set (0.0009 sec)

[root@ip-xxx-xx-xx-xx ~]# ls -ltrh /var/lib/mysql-redo-archive/backup1/
total 24K
-r--r-----. 1 mysql mysql 24K Aug 21 11:22 archive.94f4ab58-3d1c-11ee-9e4f-0af1fd7c44c9.000001.log

Reverting the changes

To revert and disable redo log archiving:

mysql> SET GLOBAL innodb_redo_log_archive_dirs='';
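Since the demonstration above used SET PERSIST, the value is also stored in mysqld-auto.cnf; clearing the persisted value as well (standard MySQL 8.0 syntax) keeps it from reappearing after a restart:

mysql> RESET PERSIST innodb_redo_log_archive_dirs;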

How does Percona XtraBackup deal with redo log files?

When a backup is performed, PXB records the current log sequence number (LSN) and starts copying away the data files. The backup operation on InnoDB is non-blocking, so the .ibd files are allowed to change while the backup process is running. This means that while PXB is copying the files, each file reflects the state of the database at a different point in time. PXB therefore needs the redo log records for every change made to the data files since it began execution.

PXB runs a background process that watches the redo log files and copies changes from them. It needs to do this continually because InnoDB writes redo logs to disk in a circular fashion, so they are overwritten after a while. Prior to PXB version 8.0.34-29, if InnoDB redo log archiving was not enabled and the redo logs were written faster than PXB could copy them, it was possible to receive the message below:

xtrabackup: error: log block numbers mismatch:
xtrabackup: error: expected log block no. X, but got no. Y from the log file.
xtrabackup: error: it looks like InnoDB log has wrapped around before xtrabackup could process all records due to either log copying being too slow, or log files being too small.
xtrabackup: Error: xtrabackup_copy_logfile() failed.

Before the redo log archiving feature was introduced in MySQL 8.0.17, the potential workarounds for the above problem were:

  • Increasing the size of the redo logs so they do not wrap around while PXB is working (see the example after this list).
  • Improving read speed; reads that are too slow are usually a sign of I/O congestion.
  • Improving write speed; writes that are too slow usually point to I/O or network congestion.
  • Taking the backup off-peak, while system saturation is lower.
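As an illustration of the first workaround, on MySQL 8.0.30 and later the redo log size can be inspected and enlarged dynamically through innodb_redo_log_capacity (older 8.0 releases use innodb_log_file_size and innodb_log_files_in_group, which require a restart); the 8 GiB value below is only an example:

mysql> SELECT @@innodb_redo_log_capacity;
mysql> SET GLOBAL innodb_redo_log_capacity = 8589934592; -- 8 GiB, example value only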

Now, with the redo log archiving feature introduced in MySQL 8.0.17, you can overcome this limitation and efficiently perform the backup even during peak hours.

Also, PXB version 8.0.34-29 introduced a new option, --redo-log-arch-dir, that can be used to set the redo log archiving directory if it is not already set in MySQL.

Using MySQL redo log archiving for Percona XtraBackup

So now we have a redo log file archived in the folder we created in the demonstration above.

[root@ip-xxx-xx-xx-xx ~]# ls -ltrh /var/lib/mysql-redo-archive/backup1/
total 24K
-r--r-----. 1 mysql mysql 24K Aug 21 11:22 archive.94f4ab58-3d1c-11ee-9e4f-0af1fd7c44c9.000001.log

Test with Percona XtraBackup

  1. Run PXB as the owner of the mysqld process, typically the mysql OS user. We will be using version 8.0.34-29.
[root@ip-xxx-xx-xx-xx ~]# xtrabackup --version
2023-09-11T16:07:17.696628-00:00 0 [Note] [MY-011825] [Xtrabackup] recognized server arguments: --datadir=/var/lib/mysql
xtrabackup version 8.0.34-29 based on MySQL server 8.0.34 Linux (x86_64) (revision id: 5ba706ee)

This is required for PXB to use the InnoDB archiving feature. Otherwise, it will encounter a “Permission denied” error and will not use archiving as below:

2023-08-21T13:27:59.293520-00:00 0 [Note] [MY-011825] [Xtrabackup] xtrabackup redo_log_arch_dir is set to backup1:/var/lib/mysql-redo-archive/
2023-08-21T13:27:59.295379-00:00 0 [ERROR] [MY-011825] [Xtrabackup] failed to fetch query result select innodb_redo_log_archive_start('backup1', '1692710879293') : Cannot create redo log archive file '/var/lib/mysql-redo-archive/1692710879293/archive.94f4ab58-3d1c-11ee-9e4f-0af1fd7c44c9.000001.log' (OS errno: 13 - Permission denied)
2023-08-21T13:27:59.295429-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo Log Archiving is not used.

PXB will create a temporary directory for the archive file. Since redo log archiving is performed by the MySQL server process, the directory needs to be owned by the mysql OS user so the server can write to it.

1.1) Option 1: Log in as the mysql OS user:

shell> sudo -H -u mysql bash

Execute:

shell> xtrabackup --no-lock=1 --compress --parallel=4 --host=localhost --user=root --password='password_string' --backup=1 --target-dir=/Backup/21Aug 2> /tmp/b0-with-redo-archiving-as-mysql-os-user.log

1.2) Option 2: Execute directly as the root OS user on behalf of the mysql user.

You must add -c to the command, like:

shell> sudo -H -u mysql bash -c '<COMMAND>' 2> /tmp/backup.log

Execute:

shell> sudo -H -u mysql bash -c 'xtrabackup --no-lock=1 --compress --parallel=4 --host=localhost --user=root --password="password_string" --backup=1 --target-dir=/Backup/21Aug' 2> /tmp/b0-with-redo-archiving-as-mysql-os-user.log

2) Verify archiving is used:

shell> less /tmp/b0-with-redo-archiving-as-mysql-os-user.log 
2023-08-22T13:36:02.773345-00:00 0 [Note] [MY-011825] [Xtrabackup] recognized server arguments: --datadir=/var/lib/mysql  
2023-08-22T13:36:02.773543-00:00 0 [Note] [MY-011825] [Xtrabackup] recognized client arguments: --no-lock=1 --compress --parallel=4 --host=localhost --user=root --password=* --backup=1 --target-dir=/Backup/22Aug
xtrabackup version 8.0.34-29 based on MySQL server 8.0.34 Linux (x86_64) (revision id: 5ba706ee)
230822 13:36:02  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;host=localhost' as 'root'  (using password: YES).
230822 13:36:02  version_check Connected to MySQL server
230822 13:36:02  version_check Executing a version check against the server...
230822 13:36:02  version_check Done.
2023-08-22T13:36:02.899397-00:00 0 [Note] [MY-011825] [Xtrabackup] Connecting to MySQL server host: localhost, user: root, password: set, port: not set, socket: not set
2023-08-22T13:36:02.906059-00:00 0 [Note] [MY-011825] [Xtrabackup] Using server version 8.0.34
.... 
.... 
2023-08-22T13:36:02.980089-00:00 0 [Note] [MY-011825] [Xtrabackup] Connecting to MySQL server host: localhost, user: root, password: set, port: not set, socket: not set 
2023-08-22T13:36:02.986698-00:00 0 [Note] [MY-011825] [Xtrabackup] xtrabackup redo_log_arch_dir is set to backup1:/var/lib/mysql-redo-archive/ 
2023-08-22T13:36:03.015207-00:00 0 [Note] [MY-011825] [Xtrabackup] Waiting for archive file '/var/lib/mysql-redo-archive//1692711362986/archive.94f4ab58-3d1c-11ee-9e4f-0af1fd7c44c9.000001.log' 
2023-08-22T13:36:03.091572-00:00 1 [Note] [MY-011825] [Xtrabackup] >> log scanned up to (19630018) 
.... 
.... 
2023-08-22T13:36:04.569714-00:00 0 [Note] [MY-011825] [Xtrabackup] Done: Compressing file /Backup/22Aug/xtrabackup_info.zst 
2023-08-22T13:36:05.583075-00:00 0 [Note] [MY-011825] [Xtrabackup] Transaction log of lsn (19630018) to (19630028) was copied. 
2023-08-22T13:36:05.790293-00:00 0 [Note] [MY-011825] [Xtrabackup] completed OK!
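To quickly confirm from the captured PXB log that the archive file was actually used, a simple grep over the same log file works (messages as shown in the output above):

shell> grep -i "archive" /tmp/b0-with-redo-archiving-as-mysql-os-user.log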

How to set --redo-log-arch-dir=name in the PXB command

PXB 8.0.34-29 introduces a new option, --redo-log-arch-dir=name. It sets the redo log archive directory if that directory is not already set on the server. Here is a short example demonstrating how the --redo-log-arch-dir parameter works.

Suppose innodb_redo_log_archive_dirs is NULL, meaning the redo log archive directory is not initially set in MySQL, as shown below.

mysql>  select @@innodb_redo_log_archive_dirs;
+--------------------------------+
| @@innodb_redo_log_archive_dirs |
+--------------------------------+
| NULL                           |
+--------------------------------+
1 row in set (0.01 sec)

Create a redo log file directory and give the appropriate permissions

[root@ip-xxx-xx-xx-xx ~]# cd /var/lib/mysql-redo-archive/
[root@ip-xxx-xx-xx-xx ~]# mkdir redo_arch_backup

### Check mysql-redo-archive folder Permissions
[root@ip-xxx-xx-xx-xx ~]# ls -ltrh /var/lib/ |grep archive 
drwx------.  4 mysql  mysql     51 Sep 11 17:16 mysql-redo-archive

As you can see, PXB is all set to use the new redo archive directory; we simply pass the archive directory path to the --redo-log-arch-dir parameter:

--redo-log-arch-dir=redo_arch_backup:/var/lib/mysql-redo-archive/

[root@ip-xxx-xx-xx-xx ~]# sudo -H -u mysql bash
bash-5.1$ xtrabackup --no-lock=1 --compress --parallel=4 --host=localhost --user=root --password='password_string' --backup=1 --target-dir=/Backup/11Sept23_bkp --redo-log-arch-dir=redo_arch_backup:/var/lib/mysql-redo-archive/ 2> /tmp/a1-with-redo-archiving-as-mysql-os-user.log

Verifying the PXB logs:

[root@ip-xxx-xx-xx-xx ~]# cat /tmp/a1-with-redo-archiving-as-mysql-os-user.log
2023-09-11T17:16:47.836951-00:00 0 [Note] [MY-011825] [Xtrabackup] recognized server arguments: --datadir=/var/lib/mysql
2023-09-11T17:16:47.837105-00:00 0 [Note] [MY-011825] [Xtrabackup] recognized client arguments: --no-lock=1 --compress --parallel=4 --host=localhost --user=root --password=* --backup=1 --target-dir=/Backup/11Sept23_bkp --redo_log_arch_dir=redo_arch_backup:/var/lib/mysql-redo-archive/
xtrabackup version 8.0.34-29 based on MySQL server 8.0.34 Linux (x86_64) (revision id: 5ba706ee)
230911 17:16:47  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;host=localhost' as 'root'  (using password: YES).
230911 17:16:47  version_check Connected to MySQL server
230911 17:16:47  version_check Executing a version check against the server...
230911 17:16:47  version_check Done.
2023-09-11T17:16:47.956847-00:00 0 [Note] [MY-011825] [Xtrabackup] Connecting to MySQL server host: localhost, user: root, password: set, port: not set, socket: not set
....
....
2023-09-11T17:16:48.028506-00:00 0 [Note] [MY-011825] [Xtrabackup] Connecting to MySQL server host: localhost, user: root, password: set, port: not set, socket: not set
2023-09-11T17:16:48.034515-00:00 0 [Note] [MY-011825] [Xtrabackup] xtrabackup redo_log_arch_dir is set to redo_arch_backup:/var/lib/mysql-redo-archive/
2023-09-11T17:16:48.056504-00:00 0 [Note] [MY-011825] [Xtrabackup] Waiting for archive file '/var/lib/mysql-redo-archive//1694452608034/archive.94f4ab58-3d1c-11ee-9e4f-0af1fd7c44c9.000001.log'
2023-09-11T17:16:48.137941-00:00 1 [Note] [MY-011825] [Xtrabackup] >> log scanned up to (19816218)
2023-09-11T17:16:48.139886-00:00 0 [Note] [MY-012953] [InnoDB] Disabling background ibuf IO read threads.
2023-09-11T17:16:48.343117-00:00 0 [Note] [MY-011825] [Xtrabackup] Generating a list of tablespaces
....
....
2023-09-11T17:16:49.496740-00:00 0 [Note] [MY-011825] [Xtrabackup] Compressing /Backup/11Sept23_bkp/backup-my.cnf.zst
2023-09-11T17:16:49.496789-00:00 0 [Note] [MY-011825] [Xtrabackup] Done: Compressing file /Backup/11Sept23_bkp/backup-my.cnf.zst
2023-09-11T17:16:49.498483-00:00 0 [Note] [MY-011825] [Xtrabackup] Compressing /Backup/11Sept23_bkp/xtrabackup_info.zst
2023-09-11T17:16:49.498521-00:00 0 [Note] [MY-011825] [Xtrabackup] Done: Compressing file /Backup/11Sept23_bkp/xtrabackup_info.zst
2023-09-11T17:16:50.501818-00:00 0 [Note] [MY-011825] [Xtrabackup] Transaction log of lsn (19816218) to (19816228) was copied.
2023-09-11T17:16:50.708435-00:00 0 [Note] [MY-011825] [Xtrabackup] completed OK!

mysql>  select @@innodb_redo_log_archive_dirs;
+-----------------------------------------------+
| @@innodb_redo_log_archive_dirs                |
+-----------------------------------------------+
| redo_arch_backup:/var/lib/mysql-redo-archive/ |
+-----------------------------------------------+
1 row in set (0.00 sec)

This means that if innodb_redo_log_archive_dirs is not set in MySQL, PXB 8.0.34-29 will set it through the --redo-log-arch-dir option and start using the redo log archive when a backup is initiated. When the option is not set explicitly, PXB will, by default, first check innodb_redo_log_archive_dirs. If this variable is set in MySQL, as demonstrated previously, it will use the redo log archiving directory assigned in innodb_redo_log_archive_dirs while performing the backup.

Conversely, if innodb_redo_log_archive_dirs is NULL and you did not explicitly specify --redo-log-arch-dir while taking the backup, PXB will not use the redo log archiving feature.

Conclusion

This feature is crucial for heavy workloads when the backup storage does not have the same capabilities as the production storage and cannot keep up with the speed of the writes.

When the DBA enables it, Percona XtraBackup detects the archived redo log files and uses them automatically. To verify that the redo log archiving process was started and is still active, you can check the performance_schema.threads table as illustrated above. Alternatively, you can explicitly pass --redo-log-arch-dir=name to use this feature with Percona XtraBackup.

Percona XtraBackup is a free, open source, complete online backup solution for all versions of Percona Server for MySQL and MySQL.

 

Try Percona XtraBackup today!

 

Related links:

https://docs.percona.com/percona-xtrabackup/8.0/release-notes/8.0/8.0.34-29.0.html

https://docs.percona.com/percona-xtrabackup/8.0/xtrabackup-option-reference.html#-redo-log-arch-dirname

https://dev.mysql.com/doc/refman/8.0/en/innodb-redo-log.html#innodb-redo-log-archiving

https://www.percona.com/blog/innodb-redo-log-archiving/

 

Aug
25
2023
--

Failover and Recovery Scenarios in InnoDB Cluster and ClusterSet

Failover and Recovery Scenarios in InnoDB Cluster

This blog post will focus on failover and recovery scenarios inside the InnoDB Cluster and ClusterSet environment. To learn more about deploying these topologies, you can refer to the manuals: InnoDB Cluster and InnoDB ClusterSet setup.

In the snippet below, we have two clusters (cluster1 and cluster2) connected via an asynchronous replication channel; combined, they are known as a ClusterSet topology. We are going to use this topology in all of our cases.

MySQL  127.0.0.1:3308 ssl  JS > myclusterset.status({extended: 1})
{
    "clusters": {
        "Cluster1": {
            "clusterRole": "PRIMARY", 
            "globalStatus": "OK", 
            "primary": "127.0.0.1:3308", 
            "status": "OK", 
            "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
            "topology": {
                "127.0.0.1:3308": {
                    "address": "127.0.0.1:3308", 
                    "memberRole": "PRIMARY", 
                    "mode": "R/W", 
                    "status": "ONLINE", 
                    "version": "8.0.31"
                }, 
                "127.0.0.1:3309": {
                    "address": "127.0.0.1:3309", 
                    "memberRole": "SECONDARY", 
                    "mode": "R/O", 
                    "replicationLagFromImmediateSource": "", 
                    "replicationLagFromOriginalSource": "", 
                    "status": "ONLINE", 
                    "version": "8.0.31"
                }, 
                "127.0.0.1:3310": {
                    "address": "127.0.0.1:3310", 
                    "memberRole": "SECONDARY", 
                    "mode": "R/O", 
                    "replicationLagFromImmediateSource": "", 
                    "replicationLagFromOriginalSource": "", 
                    "status": "ONLINE", 
                    "version": "8.0.31"
                }
            }, 
            "transactionSet": "39ec1328-3909-11ee-a0f8-5254004d77d3:1-4,b3a79c40-3909-11ee-805d-5254004d77d3:1-124,b3a7a659-3909-11ee-805d-5254004d77d3:1-5"
        }, 
        "cluster2": {
            "clusterRole": "REPLICA", 
            "clusterSetReplication": {
                "applierStatus": "APPLIED_ALL", 
                "applierThreadState": "Waiting for an event from Coordinator", 
                "applierWorkerThreads": 4, 
                "receiver": "127.0.0.1:3311", 
                "receiverStatus": "ON", 
                "receiverThreadState": "Waiting for source to send event", 
                "source": "127.0.0.1:3308"
            }, 
            "clusterSetReplicationStatus": "OK", 
            "globalStatus": "OK", 
            "status": "OK", 
            "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
            "topology": {
                "127.0.0.1:3311": {
                    "address": "127.0.0.1:3311", 
                    "memberRole": "PRIMARY", 
                    "mode": "R/O", 
                    "replicationLagFromImmediateSource": "", 
                    "replicationLagFromOriginalSource": "", 
                    "status": "ONLINE", 
                    "version": "8.0.31"
                }, 
                "127.0.0.1:3312": {
                    "address": "127.0.0.1:3312", 
                    "memberRole": "SECONDARY", 
                    "mode": "R/O", 
                    "replicationLagFromImmediateSource": "", 
                    "replicationLagFromOriginalSource": "", 
                    "status": "ONLINE", 
                    "version": "8.0.31"
                }, 
                "127.0.0.1:3313": {
                    "address": "127.0.0.1:3313", 
                    "memberRole": "SECONDARY", 
                    "mode": "R/O", 
                    "replicationLagFromImmediateSource": "", 
                    "replicationLagFromOriginalSource": "", 
                    "status": "ONLINE", 
                    "version": "8.0.31"
                }
            }, 
            "transactionSet": "39ec1328-3909-11ee-a0f8-5254004d77d3:1-4,3db995e4-390a-11ee-b678-5254004d77d3:1-5,b3a79c40-3909-11ee-805d-5254004d77d3:1-124,b3a7a659-3909-11ee-805d-5254004d77d3:1-5", 
            "transactionSetConsistencyStatus": "OK", 
            "transactionSetErrantGtidSet": "", 
            "transactionSetMissingGtidSet": ""
        }
    }, 
    "domainName": "firstclusterset", 
    "globalPrimaryInstance": "127.0.0.1:3308", 
    "metadataServer": "127.0.0.1:3308", 
    "primaryCluster": "Cluster1", 
    "status": "HEALTHY", 
    "statusText": "All Clusters available."
}

How failover happens inside a single InnoDB Cluster

  • Connect to any node of the first cluster (“cluster1”) via MySQL Shell and fetch the details.
    MySQL  127.0.0.1:3308 ssl  JS > \c root@localhost:3308
    MySQL  localhost:3308 ssl  JS > cluster1=dba.getCluster()

    MySQL  localhost:3308 ssl  JS > cluster1.status()
    {
        "clusterName": "Cluster1", 
        "clusterRole": "PRIMARY", 
        "defaultReplicaSet": {
            "name": "default", 
            "primary": "127.0.0.1:3308", 
            "ssl": "REQUIRED", 
            "status": "OK", 
            "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
            "topology": {
                "127.0.0.1:3308": {
                    "address": "127.0.0.1:3308", 
                    "memberRole": "PRIMARY", 
                    "mode": "R/W", 
                    "readReplicas": {}, 
                    "role": "HA", 
                    "status": "ONLINE", 
                    "version": "8.0.31"
                }, 
                "127.0.0.1:3309": {
                    "address": "127.0.0.1:3309", 
                    "memberRole": "SECONDARY", 
                    "mode": "R/O", 
                    "readReplicas": {}, 
                    "replicationLagFromImmediateSource": "", 
                    "replicationLagFromOriginalSource": "", 
                    "role": "HA", 
                    "status": "ONLINE", 
                    "version": "8.0.31"
                }, 
                "127.0.0.1:3310": {
                    "address": "127.0.0.1:3310", 
                    "memberRole": "SECONDARY", 
                    "mode": "R/O", 
                    "readReplicas": {}, 
                    "replicationLagFromImmediateSource": "", 
                    "replicationLagFromOriginalSource": "", 
                    "role": "HA", 
                    "status": "ONLINE", 
                    "version": "8.0.31"
                }
            }, 
            "topologyMode": "Single-Primary"
        }, 
        "domainName": "firstclusterset", 
        "groupInformationSourceMember": "127.0.0.1:3308"
    }

     

  • Now perform a primary switchover from the instance (“127.0.0.1:3308”) to (“127.0.0.1:3309”).
MySQL  localhost:3308 ssl  JS > cluster1.setPrimaryInstance("root@127.0.0.1:3309")

Output:

Setting instance '127.0.0.1:3309' as the primary instance of cluster 'Cluster1'...

Instance '127.0.0.1:3308' was switched from PRIMARY to SECONDARY.
Instance '127.0.0.1:3309' was switched from SECONDARY to PRIMARY.
Instance '127.0.0.1:3310' remains SECONDARY.

The instance '127.0.0.1:3309' was successfully elected as primary.

  • Finally, the instance (“127.0.0.1:3309”) will show the status as primary.
MySQL localhost:3308 ssl JS > cluster1.status()
...
"127.0.0.1:3309": {
"address": "127.0.0.1:3309",
"memberRole": "PRIMARY",
"mode": "R/W",
"readReplicas": {},
"role": "HA",
"status": "ONLINE",
"version": "8.0.31"
...

How to rejoin a lost instance

If, for some reason, an instance leaves the cluster or loses its connection and cannot automatically rejoin, we may need to rejoin it by issuing the “Cluster.rejoinInstance(instance)” command. Here is a small example that demonstrates the usage of this command.

  • Create a blocking condition by stopping group replication on the instance (“127.0.0.1:3310”).
MySQL localhost:3310 ssl SQL > stop group_replication;

  • Looking over the information below, we can see the instance (“127.0.0.1:3310”) is showing “MISSING” status.
MySQL  localhost:3310 ssl  JS > cluster1.status()
...
"127.0.0.1:3310": {
                "address": "127.0.0.1:3310", 
                "instanceErrors": [
                    "NOTE: group_replication is stopped."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "OFFLINE", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "8.0.31"
            }
...

  • Now, add the instance (“127.0.0.1:3310”) again with the rejoinInstance() command.
MySQL  localhost:3310 ssl  JS > cluster1.rejoinInstance('127.0.0.1:3310')

...
Validating instance configuration at 127.0.0.1:3310...
NOTE: Instance detected as a sandbox.
Please note that sandbox instances are only suitable for deploying test clusters for use within the same host.

This instance reports its own address as 127.0.0.1:3310

Instance configuration is suitable.
Rejoining instance '127.0.0.1:3310' to cluster 'Cluster1'...

Re-creating recovery account...
NOTE: User 'mysql_innodb_cluster_2558498413'@'%' already existed at instance '127.0.0.1:3309'. It will be deleted and created again with a new password.

* Waiting for the Cluster to synchronize with the PRIMARY Cluster...
** Transactions replicated  ############################################################  100% 

The instance '127.0.0.1:3310' was successfully rejoined to the cluster.
...

After the above operation, the instance is “ONLINE” again.

"127.0.0.1:3310": {
"address": "127.0.0.1:3310",
"memberRole": "SECONDARY",
"mode": "R/O",
"readReplicas": {},
"replicationLagFromImmediateSource": "",
"replicationLagFromOriginalSource": "",
"role": "HA",
"status": "ONLINE",
"version": "8.0.31"
}

How to recover a cluster from a quorum or vote loss

There are situations when running instances fail and a cluster loses its quorum (the ability to vote) because too few members remain. This can happen when a majority of the instances that make up the cluster fail and can no longer vote on Group Replication operations. When a cluster loses quorum, it can no longer process write transactions or change the cluster’s topology.

However, if any online instance contains the latest InnoDB Cluster metadata, it is possible to restore the cluster’s quorum.
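Before forcing the quorum, it can also help to check what each reachable member reports about group membership; this is a standard Group Replication view, not part of the original walkthrough:

mysql> SELECT member_host, member_port, member_state, member_role FROM performance_schema.replication_group_members;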

Let’s see how we can use the feature “forceQuorumUsingPartitionOf” to recover the cluster again with minimal member votes.

  • First, we will try to fail the majority of nodes with a simple “KILL” operation.
root     30281     1  1 07:10 ?        00:02:03 /root/mysql-sandboxes/3308/bin/mysqld --defaults-file=/root/mysql-sandboxes/3308/my.cnf --user=root
root     30788     1  1 07:14 ?        00:01:53 /root/mysql-sandboxes/3309/bin/mysqld --defaults-file=/root/mysql-sandboxes/3309/my.cnf --user=root
root     30912     1  2 07:14 ?        00:02:48 /root/mysql-sandboxes/3310/bin/mysqld --defaults-file=/root/mysql-sandboxes/3310/my.cnf --user=root

[root@localhost ~]# kill -9 30281 30912

  • Now check the cluster1 status again. The cluster has lost quorum, and no write activity is allowed.
MySQL  localhost:3309 ssl  JS > cluster1.status()
{
    "clusterName": "Cluster1", 
    "clusterRole": "PRIMARY", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "127.0.0.1:3309", 
        "ssl": "REQUIRED", 
        "status": "NO_QUORUM", 
        "statusText": "Cluster has no quorum as visible from '127.0.0.1:3309' and cannot process write transactions. 2 members are not active.", 
        "topology": {
            "127.0.0.1:3308": {
                "address": "127.0.0.1:3308", 
                "memberRole": "SECONDARY", 
                "memberState": "(MISSING)", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "shellConnectError": "MySQL Error 2003: Could not open connection to '127.0.0.1:3308': Can't connect to MySQL server on '127.0.0.1:3308' (111)", 
                "status": "UNREACHABLE", 
                "version": "8.0.31"
            }, 
            "127.0.0.1:3309": {
                "address": "127.0.0.1:3309", 
                "memberRole": "PRIMARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }, 
            "127.0.0.1:3310": {
                "address": "127.0.0.1:3310", 
                "memberRole": "SECONDARY", 
                "memberState": "(MISSING)", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "shellConnectError": "MySQL Error 2003: Could not open connection to '127.0.0.1:3310': Can't connect to MySQL server on '127.0.0.1:3310' (111)", 
                "status": "UNREACHABLE", 
                "version": "8.0.31"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "domainName": "firstclusterset", 
    "groupInformationSourceMember": "127.0.0.1:3309"
}

  • To fix this situation, we can connect to the available instance (“127.0.0.1:3309”) and restore the quorum with the command below.
MySQL  localhost:3309 ssl  JS > cluster1.forceQuorumUsingPartitionOf("root@localhost:3309")

...

Restoring cluster 'Cluster1' from loss of quorum, by using the partition composed of [127.0.0.1:3309]

Restoring the InnoDB cluster ...

The InnoDB cluster was successfully restored using the partition from the instance 'root@localhost:3309'.

WARNING: To avoid a split-brain scenario, ensure that all other members of the cluster are removed or joined back to the group that was restored.

...

  • The instances on ports 3308 and 3310 are still down/failed. However, we recovered the cluster quorum with a single primary member (“127.0.0.1:3309”), so we can now perform write activity on the cluster without any issues.
MySQL  localhost:3309 ssl  JS > cluster1.status()
{
    "clusterName": "Cluster1", 
    "clusterRole": "PRIMARY", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "127.0.0.1:3309", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE_PARTIAL", 
        "statusText": "Cluster is NOT tolerant to any failures. 2 members are not active.", 
        "topology": {
            "127.0.0.1:3308": {
                "address": "127.0.0.1:3308", 
                "memberRole": "SECONDARY", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "shellConnectError": "MySQL Error 2003: Could not open connection to '127.0.0.1:3308': Can't connect to MySQL server on '127.0.0.1:3308' (111)", 
                "status": "(MISSING)"
            }, 
            "127.0.0.1:3309": {
                "address": "127.0.0.1:3309", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }, 
            "127.0.0.1:3310": {
                "address": "127.0.0.1:3310", 
                "memberRole": "SECONDARY", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "shellConnectError": "MySQL Error 2003: Could not open connection to '127.0.0.1:3310': Can't connect to MySQL server on '127.0.0.1:3310' (111)", 
                "status": "(MISSING)"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "domainName": "firstclusterset", 
    "groupInformationSourceMember": "127.0.0.1:3309"
}


MySQL  localhost:3309 ssl  SQL > create database sbtest2;
Query OK, 1 row affected (0.0055 sec)

Later on, we can fix the failed instances, and they will join cluster1 again. Sometimes, we might need to perform the “rejoinInstance()” operation to add the nodes back, as sketched below.
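Since this demo uses sandbox instances, a minimal sketch of bringing the killed members back and rejoining them could look like this (ports taken from the topology above; rejoinInstance() is only needed if a member does not rejoin automatically):

MySQL  localhost:3309 ssl  JS > dba.startSandboxInstance(3308)
MySQL  localhost:3309 ssl  JS > dba.startSandboxInstance(3310)
MySQL  localhost:3309 ssl  JS > cluster1.rejoinInstance('127.0.0.1:3308')
MySQL  localhost:3309 ssl  JS > cluster1.rejoinInstance('127.0.0.1:3310')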

How to recover a complete cluster from a major outage

Sometimes, even if all the nodes are up, internal issues such as group replication getting stuck or a networking problem can cause a complete outage, leaving you unable to perform any writes or other activity on the cluster.

In such circumstances, you can use any one node and its metadata to recover the cluster. You need to connect to the most up-to-date instance; otherwise, you may lose data or end up with inconsistencies among the cluster nodes.

Let’s see how we can introduce such a situation manually and then try to fix that.

  • First, stop the group replication on all three instances.
MySQL localhost:3308 ssl SQL > stop group_replication;
MySQL localhost:3310 ssl SQL > stop group_replication;
MySQL localhost:3309 ssl SQL > stop group_replication;

127.0.0.1:3308": {
                "address": "127.0.0.1:3308", 
                "instanceErrors": [
                    "NOTE: group_replication is stopped."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "OFFLINE", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "8.0.31"
            }, 
            "127.0.0.1:3309": {
                "address": "127.0.0.1:3309", 
                "instanceErrors": [
                    "NOTE: group_replication is stopped."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "OFFLINE", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "8.0.31"
            }, 
            "127.0.0.1:3310": {
                "address": "127.0.0.1:3310", 
                "instanceErrors": [
                    "NOTE: group_replication is stopped."
                ], 
                "memberRole": "SECONDARY", 
                "memberState": "OFFLINE", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)", 
                "version": "8.0.31"
            }

Now we are completely stuck, and even if we try to perform any writes on the last primary member, we see the error below, since --super-read-only is enabled in order to protect the transactions on the database.

MySQL  localhost:3309 ssl  SQL > create database sbtest3;
ERROR: 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement

  • Now, we will fix the issue with the below set of steps.

1. Connect to the instance with the most executed GTIDs. We can confirm this by connecting to each instance and comparing the GTID sets with the commands below.

mysql> SELECT @@GLOBAL.GTID_EXECUTED;
mysql> SELECT received_transaction_set FROM performance_schema.replication_connection_status WHERE channel_name="group_replication_applier";
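If the sets differ between nodes, the GTID_SUBSET() function can tell you whether one node's set is fully contained in another's; the literal GTID set below is just a placeholder you would copy from the other node:

mysql> SELECT GTID_SUBSET('39ec1328-3909-11ee-a0f8-5254004d77d3:1-4', @@GLOBAL.GTID_EXECUTED) AS other_node_contained;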

2. In our case, it is the same in all three instances, so we can choose any of the nodes. Let’s choose the last primary (“127.0.0.1:3309”).

MySQL  localhost:3309 ssl  SQL > SELECT @@GLOBAL.GTID_EXECUTED;
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| @@GLOBAL.GTID_EXECUTED                                                                                                                          |
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| 39ec1328-3909-11ee-a0f8-5254004d77d3:1-4,
b3a79c40-3909-11ee-805d-5254004d77d3:1-152:1000071-1000079,
b3a7a659-3909-11ee-805d-5254004d77d3:1-12 |
+-------------------------------------------------------------------------------------------------------------------------------------------------+

MySQL  localhost:3309 ssl  SQL > SELECT received_transaction_set FROM performance_schema.replication_connection_status WHERE channel_name="group_replication_applier";
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| received_transaction_set                                                                                                                        |
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| 39ec1328-3909-11ee-a0f8-5254004d77d3:1-4,
b3a79c40-3909-11ee-805d-5254004d77d3:1-152:1000071-1000079,
b3a7a659-3909-11ee-805d-5254004d77d3:1-12 |
+-------------------------------------------------------------------------------------------------------------------------------------------------+

3. Finally, connect to the instance (“127.0.0.1:3309”) and execute the below command to recover from the outage/failure.

MySQL  localhost:3309 ssl  JS > \c root@localhost:3309
MySQL  localhost:3309 ssl  JS > mycluster1 = dba.rebootClusterFromCompleteOutage("cluster1",{force: true})

...

Restoring the Cluster 'Cluster1' from complete outage...

Cluster instances: '127.0.0.1:3308' (OFFLINE), '127.0.0.1:3309' (OFFLINE), '127.0.0.1:3310' (OFFLINE)
Validating instance configuration at localhost:3309...

This instance reports its own address as 127.0.0.1:3309

Instance configuration is suitable.
* Waiting for seed instance to become ONLINE...
127.0.0.1:3309 was restored.
Validating instance configuration at 127.0.0.1:3308...

This instance reports its own address as 127.0.0.1:3308

Instance configuration is suitable.
Rejoining instance '127.0.0.1:3308' to cluster 'Cluster1'...

Re-creating recovery account...
NOTE: User 'mysql_innodb_cluster_3676500949'@'%' already existed at instance '127.0.0.1:3309'. It will be deleted and created again with a new password.

* Waiting for the Cluster to synchronize with the PRIMARY Cluster...
** Transactions replicated  ############################################################  100% 

The instance '127.0.0.1:3308' was successfully rejoined to the cluster.

Validating instance configuration at 127.0.0.1:3310...

This instance reports its own address as 127.0.0.1:3310

Instance configuration is suitable.
Rejoining instance '127.0.0.1:3310' to cluster 'Cluster1'...

Re-creating recovery account...
NOTE: User 'mysql_innodb_cluster_2558498413'@'%' already existed at instance '127.0.0.1:3309'. It will be deleted and created again with a new password.

* Waiting for the Cluster to synchronize with the PRIMARY Cluster...
** Transactions replicated  ############################################################  100% 

The instance '127.0.0.1:3310' was successfully rejoined to the cluster.

The Cluster was successfully rebooted.

<Cluster:Cluster1>

...

Note: Be careful with the force option, as it bypasses important checks such as GTID set consistency and instance reachability.

Now, if we check the status again, we see that all three nodes are up. The recovery command fixed the problems (group replication was started) and rejoined the instances.

MySQL  localhost:3309 ssl  JS > cluster1=dba.getCluster()
MySQL  localhost:3309 ssl  JS > cluster1.status()  
{
    "clusterName": "Cluster1", 
    "clusterRole": "PRIMARY", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "127.0.0.1:3309", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "127.0.0.1:3308": {
                "address": "127.0.0.1:3308", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLagFromImmediateSource": "", 
                "replicationLagFromOriginalSource": "", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }, 
            "127.0.0.1:3309": {
                "address": "127.0.0.1:3309", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }, 
            "127.0.0.1:3310": {
                "address": "127.0.0.1:3310", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLagFromImmediateSource": "", 
                "replicationLagFromOriginalSource": "", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "domainName": "firstclusterset", 
    "groupInformationSourceMember": "127.0.0.1:3309"
}

How to perform switchover/failover from one cluster to another in a ClusterSet

Sometimes, we need to perform maintenance and update activities on the database instances. To avoid long downtime or a major impact on production, we perform a controlled switchover so that the node in question can go offline without impacting the running workload.

Inside a ClusterSet, we can do this by switching the clusterRole of one of the running clusters from “REPLICA” to “PRIMARY”.

Let’s see the exact scenario below.

  • Fetch the ClusterSet information.
MySQL  127.0.0.1:3308 ssl  JS > myclusterset=dba.getClusterSet()
<ClusterSet:firstclusterset>

MySQL  127.0.0.1:3308 ssl  JS > myclusterset.status({extended: 1})

...

    "clusters": {
        "Cluster1": {
            "clusterRole": "PRIMARY", 
            "globalStatus": "OK", 
            "primary": "127.0.0.1:3309", 
            "status": "OK", 
            "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
            "topology": {

      "cluster2": {
            "clusterRole": "REPLICA", 
            "clusterSetReplication": {
                "applierStatus": "APPLIED_ALL", 
                "applierThreadState": "Waiting for an event from Coordinator", 
                "applierWorkerThreads": 4, 
                "receiver": "127.0.0.1:3311", 
                "receiverStatus": "ON", 
                "receiverThreadState": "Waiting for source to send event", 
                "source": "127.0.0.1:3309"
...

  • Performing a switchover from “cluster1” to “cluster2”.
MySQL  127.0.0.1:3308 ssl  JS > myclusterset.setPrimaryCluster('cluster2')

...

Switching the primary cluster of the clusterset to 'cluster2'
* Verifying clusterset status
** Checking cluster Cluster1
  Cluster 'Cluster1' is available
** Checking cluster cluster2
  Cluster 'cluster2' is available

* Reconciling 5 internally generated GTIDs

* Refreshing replication account of demoted cluster
* Synchronizing transaction backlog at 127.0.0.1:3311
** Transactions replicated  ############################################################  100% 


* Updating metadata

* Updating topology
** Changing replication source of 127.0.0.1:3308 to 127.0.0.1:3311
** Changing replication source of 127.0.0.1:3310 to 127.0.0.1:3311
** Changing replication source of 127.0.0.1:3309 to 127.0.0.1:3311
* Acquiring locks in replicaset instances
** Pre-synchronizing SECONDARIES
** Acquiring global lock at PRIMARY
** Acquiring global lock at SECONDARIES

* Synchronizing remaining transactions at promoted primary
** Transactions replicated  ############################################################  100% 


* Updating replica clusters
Cluster 'cluster2' was promoted to PRIMARY of the clusterset. The PRIMARY instance is '127.0.0.1:3311'
...

  • Now, if we see the status again, we can observe the change in the ClusterRole.
"clusters": {
        "Cluster1": {
            "clusterRole": "REPLICA", 
            "clusterSetReplication": {
                "applierStatus": "APPLIED_ALL", 
                "applierThreadState": "Waiting for an event from Coordinator", 
                "applierWorkerThreads": 4, 
                "receiver": "127.0.0.1:3309", 
                "receiverStatus": "ON", 
                "receiverThreadState": "Waiting for source to send event", 
                "source": "127.0.0.1:3311"
            }, 

"cluster2": {
            "clusterRole": "PRIMARY", 
            "globalStatus": "OK", 
            "primary": "127.0.0.1:3311", 
            "status": "OK", 
            "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
            "topology": {
                "127.0.0.1:3311": {
                    "address": "127.0.0.1:3311", 
                    "memberRole": "PRIMARY", 
                    "mode": "R/W", 
                    "status": "ONLINE", 
                    "version": "8.0.31"
                },

In some unlucky situations, when one of the clusters (“cluster1”) is completely down and we have no other choice, we might need to perform an emergency failover. The command below can be handy in those cases.

MySQL  127.0.0.1:3311 ssl  JS > myclusterset.forcePrimaryCluster("cluster2")

If you are using the MySQL Router interface for routing traffic in the ClusterSet, you also need to change the router option with (“setRoutingOption”) so that traffic can be diverted to the new primary cluster (“cluster2”); otherwise, the application will fail and be unable to communicate.

In the next scenario, we will see the usage of “setRoutingOption” in more detail.

How to change traffic routes with MySQL Router

In the case of a Cluster:

Within a single cluster such as “cluster1”, writes are routed to the primary, and reads are balanced across the secondaries. The router automatically detects failovers and follows the new primary.
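As a quick illustration (not from the original post), you can observe this routing by connecting through the router's read-write and read-only ports, which listRouters() reports below as 6446 and 6447, and checking which instance answers:

shell> mysql -h127.0.0.1 -P6446 -uroot -p -e "SELECT @@port;"   # read-write port, answered by the primary
shell> mysql -h127.0.0.1 -P6447 -uroot -p -e "SELECT @@port;"   # read-only port, answered by a secondary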

In the case of ClusterSet:

We can switch the primary routing target from “cluster1” to “cluster2” with the steps below.

  • Verifying the current configurations.
MySQL localhost:3309 ssl JS > myclusterset=dba.getClusterSet() 
MySQL  localhost:3309 ssl  JS > myclusterset.listRouters()
 
{
    "domainName": "firstclusterset", 
    "routers": {
        "localhost.localdomain::Router1": {
            "hostname": "localhost.localdomain", 
            "lastCheckIn": "2023-08-12 10:13:22", 
            "roPort": "6447", 
            "roXPort": "6449", 
            "rwPort": "6446", 
            "rwXPort": "6448", 
            "targetCluster": null, 
            "version": "8.0.31"
        }
    }
}

MySQL  localhost:3309 ssl  JS > myclusterset.routingOptions()

Output:

    
    {
    "domainName": "firstclusterset", 
    "global": {
        "invalidated_cluster_policy": "drop_all", 
        "stats_updates_frequency": 0, 
        "target_cluster": "primary"
    }, 
    "routers": {
        "localhost.localdomain::Router1": {}
    }
}

  • Now, switching the target cluster from “cluster1” to “cluster2”.
MySQL  localhost:3309 ssl  JS > myclusterset.setRoutingOption('localhost.localdomain::Router1', 'target_cluster', 'cluster2')
Routing option 'target_cluster' successfully updated in router 'localhost.localdomain::Router1'.

  • Checking the status again, we can see the target cluster is now changed to “cluster2”.
MySQL  localhost:3309 ssl  JS > myclusterset.routingOptions()
 
{
    "domainName": "firstclusterset", 
    "global": {
        "invalidated_cluster_policy": "drop_all", 
        "stats_updates_frequency": 0, 
        "target_cluster": "primary"
    }, 
    "routers": {
        "localhost.localdomain::Router1": {
            "target_cluster": "cluster2"
        }
    }
}
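To make the router follow whichever cluster is currently primary again (the global default shown above), set the same option back to 'primary':

MySQL  localhost:3309 ssl  JS > myclusterset.setRoutingOption('localhost.localdomain::Router1', 'target_cluster', 'primary')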

How to rejoin a lost cluster in the ClusterSet

There are situations when a cluster has been marked as invalidated, or there is an issue with its replication channel. To fix that, we might need to perform the “rejoinCluster()” operation in order to add it back to the ClusterSet.

We can simulate the same by using the below steps.

  • Stop the asynchronous replication channel on the primary instance (“127.0.0.1:3311”) of cluster2, which is replicating from the instance (“127.0.0.1:3309”) of cluster1.
MySQL  localhost:3311 ssl  SQL > stop slave;

...
               Slave_IO_State: 
                  Master_Host: 127.0.0.1
                  Master_User: mysql_innodb_cs_63c324c0
                  Master_Port: 3309
                Connect_Retry: 3
              Master_Log_File: localhost-bin.000005
          Read_Master_Log_Pos: 2248
               Relay_Log_File: localhost-relay-bin-clusterset_replication.000002
                Relay_Log_Pos: 432
        Relay_Master_Log_File: localhost-bin.000005
             Slave_IO_Running: No
            Slave_SQL_Running: No
...

  • If we check the status again, we see the replication is stopped now on the cluster2 instance.
MySQL  localhost:3311 ssl  JS > myclusterset.status({extended:1})

"clusterRole": "REPLICA", 
            "clusterSetReplication": {
                "applierStatus": "OFF", 
                "applierThreadState": "", 
                "applierWorkerThreads": 4, 
                "receiver": "127.0.0.1:3311", 
                "receiverStatus": "OFF", 
                "receiverThreadState": "", 
                "source": "127.0.0.1:3309"
            }, 
            "clusterSetReplicationStatus": "STOPPED", 
            "globalStatus": "OK_NOT_REPLICATING", 
            "status": "OK", 
            "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",

  • Now, we will fix the same by running the below rejoin command.
MySQL  localhost:3311 ssl  JS > myclusterset.rejoinCluster('cluster2')

...

Rejoining cluster 'cluster2' to the clusterset
NOTE: Cluster 'cluster2' is not invalidated
* Refreshing replication settings
** Changing replication source of 127.0.0.1:3312 to 127.0.0.1:3309
** Changing replication source of 127.0.0.1:3313 to 127.0.0.1:3309
** Changing replication source of 127.0.0.1:3311 to 127.0.0.1:3309

Cluster 'cluster2' was rejoined to the clusterset

...

After the rejoin, the stopped replication channel was started automatically and is synced with the main cluster.

cluster2": 
            "clusterRole": "REPLICA", 
            "clusterSetReplication": {
                "applierStatus": "APPLIED_ALL", 
                "applierThreadState": "Waiting for an event from Coordinator", 
                "applierWorkerThreads": 4, 
                "receiver": "127.0.0.1:3311", 
                "receiverStatus": "ON", 
                "receiverThreadState": "Waiting for source to send event", 
                "source": "127.0.0.1:3309"
            }, 
            "clusterSetReplicationStatus": "OK", 
            "globalStatus": "OK", 
            "status": "OK", 
            "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",

Summary

Here, we have discussed a few scenarios and methodologies that can be very useful for recovering cluster nodes and performing manual failovers in the topology. The options discussed above apply to both InnoDB Cluster and InnoDB ClusterSet environments.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

 

Try Percona Distribution for MySQL today!
