A separate disaster recovery (DR) cluster for production databases is a necessity for any business that relies heavily on its database systems. Setting up such a [DC -> DR] topology for Percona XtraDB Cluster (PXC), which is a virtually synchronous cluster, can be challenging in a complex Kubernetes environment.
This is where Percona Operator for MySQL comes in handy: it needs only a few steps to configure such a topology, which provides a remote backup site or a disaster recovery solution.
Let’s see how the overall setup and configuration look from a practical standpoint.

DC Configuration
1) Here we have a three-node PXC cluster running on the DC side.
shell> kubectl get pods -n pxc
NAME                                               READY   STATUS      RESTARTS   AGE
cluster1-haproxy-0                                 2/2     Running     0          23h
cluster1-haproxy-1                                 2/2     Running     0          23h
cluster1-haproxy-2                                 2/2     Running     0          23h
cluster1-pxc-0                                     3/3     Running     0          23h
cluster1-pxc-1                                     3/3     Running     0          7h37m
cluster1-pxc-2                                     3/3     Running     0          7h18m
percona-xtradb-cluster-operator-6756dbf588-vxjxt   1/1     Running     0          24h
xb-backup1-hlz2p                                   0/1     Completed   0          21h
xb-cron-cluster1-fs-pvc-2026480026-372f8-2gfhr     0/1     Completed   0          13h
2) Some configuration options have to be enabled in the custom resource file [cr.yaml] to allow cross-site replication.
- Expose all source PXC nodes (in the pxc section) so that they can be reached from outside the cluster, that is, from the DR cluster.
expose:
  enabled: true
  type: LoadBalancer
- Define a dedicated replication channel (also in the pxc section) and mark this cluster as the source.
replicationChannels:
- name: pxc1_to_pxc2
  isSource: true
- Finally, apply the custom resource changes.
shell> kubectl apply -f cr.yaml
3) Each PXC node now has an “EXTERNAL-IP”. These are the endpoints that the DR node [cluster1-pxc-0] will use to connect to the DC.
shell> kubectl get svc
NAME                              TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)                                          AGE
cluster1-haproxy                  ClusterIP      34.118.227.249   <none>          3306/TCP,3309/TCP,33062/TCP,33060/TCP,8404/TCP   4h1m
cluster1-haproxy-replicas         ClusterIP      34.118.225.41    <none>          3306/TCP                                         4h1m
cluster1-pxc                      ClusterIP      None             <none>          3306/TCP,33062/TCP,33060/TCP                     4h1m
cluster1-pxc-0                    LoadBalancer   34.118.234.140   34.29.145.138   3306:30425/TCP                                   4h1m
cluster1-pxc-1                    LoadBalancer   34.118.239.132   34.30.233.0     3306:31340/TCP                                   4h1m
cluster1-pxc-2                    LoadBalancer   34.118.236.64    35.225.0.19     3306:30642/TCP                                   4h1m
cluster1-pxc-unready              ClusterIP      None             <none>          3306/TCP,33062/TCP,33060/TCP                     4h1m
percona-xtradb-cluster-operator   ClusterIP      34.118.235.168   <none>          443/TCP                                          4h11m
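Since the EXTERNAL-IP values are needed again on the DR side, it can help to pull them out of the service listing rather than copy them by hand. The helper below is only a sketch: the function name pxc_external_ips is hypothetical, and it assumes the default tabular `kubectl get svc` output and per-node service names ending in -pxc-<number>, as in the listing above.

```shell
# Print the EXTERNAL-IP column for every LoadBalancer service whose name
# ends in -pxc-<number> (for example cluster1-pxc-0).
# Input: the tabular output of `kubectl get svc`, read from stdin.
pxc_external_ips() {
  awk '$2 == "LoadBalancer" && $1 ~ /-pxc-[0-9]+$/ { print $4 }'
}

# Typical use against a live cluster:
#   kubectl get svc | pxc_external_ips
```

A service whose load balancer has not been provisioned yet shows <pending> in that column, so it is worth checking the output before using it.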
At this point, the DC setup is done. Next, we will take a backup from the source, which we will later use to build the DR.
Backup
- Define the access key and secret used to connect to the GCP/S3 bucket.
shell> cat backup-secret-s3.yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-cluster-name-backup-s3
type: Opaque
data:
  AWS_ACCESS_KEY_ID: <KEY>
  AWS_SECRET_ACCESS_KEY: <SECRET>
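Values placed under the data: key of a Kubernetes Secret must be base64-encoded, so <KEY> and <SECRET> above stand for encoded strings. A minimal sketch of the encoding step follows; the function name encode_secret_value is hypothetical and the sample value in the comment is a placeholder, not a real credential.

```shell
# Encode one credential for the data: section of a Kubernetes Secret.
# printf is used instead of echo so that no trailing newline gets encoded,
# and tr removes the line wrapping that some base64 builds add to long input.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example with a placeholder value:
#   encode_secret_value 'EXAMPLEKEYID'
```

The same helper applies to both AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.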
- In the custom resource file [cr.yaml], we also need to define the bucket, the credentials secret, and the endpoint/region details.
backup:
  storages:
    s3-us-west:
      type: s3
      verifyTLS: true
      s3:
        bucket: <bucket>
        credentialsSecret: my-cluster-name-backup-s3
        region: us-west-2
        endpointUrl: https://storage.googleapis.com
…
shell> kubectl apply -f cr.yaml
- Finally, we can take the backup by creating a [backup.yaml] file with the details below and applying it.
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterBackup
metadata:
#  finalizers:
#    - percona.com/delete-backup
  name: backup1
spec:
  pxcCluster: cluster1
  storageName: s3-us-west
…
shell> kubectl apply -f backup.yaml
- We can verify that the backup succeeded as follows.
shell> kubectl get pxc-backup
NAME      CLUSTER    STORAGE      DESTINATION                                       STATUS      COMPLETED   AGE
backup1   cluster1   s3-us-west   s3://<bucket>/cluster1-2026-04-07-15:55:46-full   Succeeded   125m        127m
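In a scripted setup, the same check can be polled until the backup finishes. The sketch below is illustrative only: the function name wait_for_backup is hypothetical, and it assumes the STATUS column shown above is backed by the .status.state field of the backup object.

```shell
# Poll a PerconaXtraDBClusterBackup until it succeeds or fails.
# Arguments: backup name, max attempts (default 60),
#            seconds to sleep between attempts (default 10).
wait_for_backup() {
  name=$1
  attempts=${2:-60}
  interval=${3:-10}
  while [ "$attempts" -gt 0 ]; do
    # Assumption: the STATUS column is read from .status.state
    state=$(kubectl get pxc-backup "$name" -o jsonpath='{.status.state}')
    case "$state" in
      Succeeded) return 0 ;;
      Failed)    echo "backup $name failed" >&2; return 1 ;;
    esac
    attempts=$(( attempts - 1 ))
    sleep "$interval"
  done
  echo "timed out waiting for backup $name" >&2
  return 1
}

# Example: wait up to 10 minutes for the backup created above.
#   wait_for_backup backup1 60 10
```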
With the backup ready, we can now move on to the DR setup.
DR Configuration
Below is a PXC setup similar to the one in the DC, running in a separate Kubernetes cluster.
shell> kubectl get pods -n pxc-dr
NAME                                               READY   STATUS      RESTARTS   AGE
cluster1-haproxy-0                                 2/2     Running     0          35h
cluster1-haproxy-1                                 2/2     Running     0          35h
cluster1-haproxy-2                                 2/2     Running     0          35h
cluster1-pxc-0                                     3/3     Running     0          35h
cluster1-pxc-1                                     3/3     Running     0          35h
cluster1-pxc-2                                     3/3     Running     0          35h
percona-xtradb-cluster-operator-6756dbf588-2wc5m   1/1     Running     0          38h
prepare-job-restore1-cluster1-8h4vn                0/1     Completed   0          35h
restore-job-restore1-cluster1-trfg6                0/1     Completed   0          35h
xb-cron-cluster1-fs-pvc-2026480025-372f8-wv6bt     0/1     Completed   0          28h
xb-cron-cluster1-fs-pvc-2026490025-372f8-gxd59     0/1     Completed   0          4h48m
First, we need to restore the backup on the DR cluster.
Data Restoration
- Here we create the [backup-secret-s3.yaml] file, which contains the GCP/S3 credentials.
apiVersion: v1
kind: Secret
metadata:
  name: my-cluster-name-backup-s3
type: Opaque
data:
  AWS_ACCESS_KEY_ID: <KEY>
  AWS_SECRET_ACCESS_KEY: <SECRET>
…
shell> kubectl apply -f backup-secret-s3.yaml
- Next, we create a [restore.yaml] file that specifies the backup source and other required details.
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterRestore
metadata:
  name: restore1
#  annotations:
#    percona.com/headless-service: "true"
spec:
  pxcCluster: cluster1
  backupSource:
#    verifyTLS: true
    destination: s3://<bucket>/cluster1-2026-04-07-15:55:46-full
    s3:
      bucket: <bucket>
      credentialsSecret: my-cluster-name-backup-s3
      endpointUrl: https://storage.googleapis.com/
…
shell> kubectl apply -f restore.yaml
- Once the restore finishes successfully, we will see the status below.
shell> kubectl get pxc-restore
NAME CLUSTER STATUS COMPLETED AGE
restore1 cluster1 Succeeded 27m
Now we can make the remaining DR changes in the custom resource file [cr.yaml]. Basically, we need to add the replication channel and all source EXTERNAL-IPs. This cross-DC replication supports the Automatic Asynchronous Replication Connection Failover feature, so if any DC node goes down, the replica can connect to another available DC node and resume replication from there.
replicationChannels:
- name: pxc1_to_pxc2
  isSource: false
  sourcesList:
  - host: 34.29.145.138
    port: 3306
    weight: 100
  - host: 34.30.233.0
    port: 3306
    weight: 100
  - host: 35.225.0.19
    port: 3306
    weight: 100
…
shell> kubectl apply -f cr.yaml
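Because the source EXTERNAL-IPs can change when the LoadBalancer services are recreated, it may be convenient to generate the sourcesList entries instead of typing them. The following is a sketch with a hypothetical helper name, render_sources_list; it hard-codes port 3306 and weight 100 to mirror the values used above, and the indentation assumes the output is pasted under a replicationChannels entry.

```shell
# Print a sourcesList YAML fragment with one entry per IP given as argument.
# Port and weight are fixed to the values used in this post.
render_sources_list() {
  printf '  sourcesList:\n'
  for ip in "$@"; do
    printf '  - host: %s\n    port: 3306\n    weight: 100\n' "$ip"
  done
}

# Example:
#   render_sources_list 34.29.145.138 34.30.233.0 35.225.0.19
```

Giving every source the same weight, as here, leaves the choice among them to the failover mechanism; distinct weights can be used to prefer one DC node over the others.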
For more details on backup and restore with the PXC operator, see the manuals below.
- https://docs.percona.com/percona-operator-for-mysql/pxc/backups-ondemand.html
- https://docs.percona.com/percona-operator-for-mysql/pxc/backups-restore-to-new-cluster.html
Replication
Initially, when we check the replication status, we see the following error. This happens because [caching_sha2_password] authentication requires either a secure SSL/TLS connection or an RSA key pair-based password exchange. The latter is enabled with SOURCE_PUBLIC_KEY_PATH or GET_SOURCE_PUBLIC_KEY, which requests the public key from the source.
shell> kubectl exec -it cluster1-pxc-0 -- sh
shell> mysql -uroot -p
mysql> show replica status\G;
*************************** 1. row ***************************
Replica_IO_State: Connecting to source
Source_Host: 35.225.0.19
Source_User: replication
Source_Port: 3306
Connect_Retry: 60
Source_Log_File:
Read_Source_Log_Pos: 4
Relay_Log_File: cluster1-pxc-0-relay-bin-pxc1_to_pxc2.000001
Relay_Log_Pos: 4
Relay_Source_Log_File:
Replica_IO_Running: Connecting
Replica_SQL_Running: Yes
...
Error:
Last_IO_Error: Error connecting to source 'replication@35.225.0.19:3306'. This was attempt 2/3, with a delay of 60 seconds between attempts. Message: Access denied for user 'replication'@'35.225.0.19.' (using password: YES)
Once we pass “GET_SOURCE_PUBLIC_KEY” in the “CHANGE REPLICATION SOURCE TO” statement, the error is resolved and the DR is able to communicate with the DC.
mysql> STOP REPLICA;
mysql> STOP REPLICA IO_THREAD FOR CHANNEL 'pxc1_to_pxc2';
mysql> CHANGE REPLICATION SOURCE TO SOURCE_USER='replication', SOURCE_PASSWORD='password', GET_SOURCE_PUBLIC_KEY=1 FOR CHANNEL 'pxc1_to_pxc2';
mysql> START REPLICA;
Note – The replication user is created automatically on the DC node. The command below retrieves the decoded password for the “replication” user.
shell> kubectl get secret cluster1-secrets -o jsonpath="{.data.replication}" | base64 --decode
mysql> show replica status\G;
*************************** 1. row ***************************
Replica_IO_State: Waiting for source to send event
Source_Host: 35.225.0.19
Source_User: replication
Source_Port: 3306
Connect_Retry: 60
Source_Log_File: binlog.000006
Read_Source_Log_Pos: 3047027
Relay_Log_File: cluster1-pxc-0-relay-bin-pxc1_to_pxc2.000001
Relay_Log_Pos: 150132
Relay_Source_Log_File: binlog.000006
Replica_IO_Running: Yes
Replica_SQL_Running: Yes
...
The other PXC nodes in the DR cluster sync as usual through the Galera synchronous replication process.
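For monitoring, the two thread states can be checked from a script rather than read by eye. The helper below is a sketch with a hypothetical name, replica_healthy; it only parses the vertical (\G) output of SHOW REPLICA STATUS passed on stdin, and if several channels are listed it reflects the last one. The ROOT_PASSWORD variable in the usage comment is likewise an assumption.

```shell
# Exit 0 only when both replication threads report "Yes" in the
# vertical (\G) output of SHOW REPLICA STATUS read from stdin.
replica_healthy() {
  awk -F': *' '
    # Trim the left padding of the field name and any trailing
    # whitespace or carriage return on the value.
    { sub(/^[ \t]+/, "", $1); sub(/[ \t\r]+$/, "", $2) }
    $1 == "Replica_IO_Running"  { io  = $2 }
    $1 == "Replica_SQL_Running" { sql = $2 }
    END { exit !(io == "Yes" && sql == "Yes") }
  '
}

# Example (ROOT_PASSWORD is a hypothetical variable holding the root password):
#   kubectl exec cluster1-pxc-0 -- mysql -uroot -p"$ROOT_PASSWORD" \
#     -e 'SHOW REPLICA STATUS\G' | replica_healthy && echo "replication OK"
```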
Source Failover
Asynchronous connection failover is already enabled on the DR, since we defined the source list in the custom resource file. The “External IPs” shown here differ from the earlier ones because they changed during this test.
mysql> select * from performance_schema.replication_asynchronous_connection_failover;
+--------------+---------------+------+-------------------+--------+--------------+
| CHANNEL_NAME | HOST          | PORT | NETWORK_NAMESPACE | WEIGHT | MANAGED_NAME |
+--------------+---------------+------+-------------------+--------+--------------+
| pxc1_to_pxc2 | 34.29.145.138 | 3306 |                   |    100 |              |
| pxc1_to_pxc2 | 34.45.151.96  | 3306 |                   |    100 |              |
| pxc1_to_pxc2 | 34.71.57.38   | 3306 |                   |    100 |              |
+--------------+---------------+------+-------------------+--------+--------------+
3 rows in set (0.00 sec)
Now, if the current source DC node [cluster1-pxc-2] goes down, the DR will connect to one of the other available DC nodes based on the “Weight” and chronological order [pxc-2, pxc-1, pxc-0, etc.].
- Here, we temporarily take down the source DC node [cluster1-pxc-2].
shell> kubectl get pods -n pxc
NAME                                               READY   STATUS      RESTARTS     AGE
cluster1-haproxy-0                                 2/2     Running     0            2d3h
cluster1-haproxy-1                                 2/2     Running     0            2d3h
cluster1-haproxy-2                                 2/2     Running     0            2d3h
cluster1-pxc-0                                     3/3     Running     0            2d3h
cluster1-pxc-1                                     3/3     Running     0            35h
cluster1-pxc-2                                     2/3     Running     1 (6s ago)   34h
percona-xtradb-cluster-operator-6756dbf588-vxjxt   1/1     Running     0            2d3h
xb-backup1-hlz2p                                   0/1     Completed   0            2d1h
xb-cron-cluster1-fs-pvc-2026480026-372f8-2gfhr     0/1     Completed   0            41h
xb-cron-cluster1-fs-pvc-2026490026-372f8-mgfpv     0/1     Completed   0            17h
- The DR replication breaks because it can no longer reach the DC node [cluster1-pxc-2].
mysql> show replica status\G;
*************************** 1. row ***************************
Replica_IO_State: Reconnecting after a failed source event read
Source_Host: 34.71.57.38
Source_User: replication
Source_Port: 3306
Connect_Retry: 60
Source_Log_File: binlog.000012
Read_Source_Log_Pos: 198
Relay_Log_File: cluster1-pxc-0-relay-bin-pxc1_to_pxc2.000002
Relay_Log_Pos: 369
Relay_Source_Log_File: binlog.000012
Replica_IO_Running: Connecting
Replica_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Source_Log_Pos: 198
Relay_Log_Space: 602
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Source_SSL_Allowed: No
Source_SSL_CA_File:
Source_SSL_CA_Path:
Source_SSL_Cert:
Source_SSL_Cipher:
Source_SSL_Key:
Seconds_Behind_Source: NULL
Source_SSL_Verify_Server_Cert: Yes
Last_IO_Errno: 2003
Last_IO_Error: Error reconnecting to source 'replication@34.71.57.38:3306'. This was attempt 2/3, with a delay of 60 seconds between attempts. Message: Can't connect to MySQL server on '34.71.57.38:3306' (111)
- Once the retries defined by “source_retry_count” and “source_connect_retry” are exhausted, the replica connects to another source DC node [cluster1-pxc-1].
mysql> show replica status\G;
*************************** 1. row ***************************
Replica_IO_State: Waiting for source to send event
Source_Host: 34.45.151.96
Source_User: replication
Source_Port: 3306
Connect_Retry: 60
Source_Log_File: binlog.000007
Read_Source_Log_Pos: 198
Relay_Log_File: cluster1-pxc-0-relay-bin-pxc1_to_pxc2.000003
Relay_Log_Pos: 369
Relay_Source_Log_File: binlog.000007
Replica_IO_Running: Yes
Replica_SQL_Running: Yes
...
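As a rough guide to how long the replica stays on the failed source before switching, the two settings can be multiplied. This is only an approximation under the values visible in the error message above (3 attempts, 60 seconds apart); it ignores connection timeouts and the time needed to detect the lost connection, and the function name is hypothetical.

```shell
# Approximate upper bound, in seconds, on the time spent retrying the
# failed source before asynchronous connection failover selects another one.
failover_delay_seconds() {
  retry_count=$1      # SOURCE_RETRY_COUNT
  connect_retry=$2    # SOURCE_CONNECT_RETRY, in seconds
  echo $(( retry_count * connect_retry ))
}

failover_delay_seconds 3 60   # prints 180
```

Lowering either value shortens the switch-over, at the cost of failing over on shorter network blips.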
Quick Summary
In this blog post, we walked through the steps to configure cross-site replication in the Percona Operator for MySQL (PXC). Although we used the operator’s native XtraBackup-based backup to seed the DR through the restore process, logical backup tools such as mysqldump or mydumper can accomplish the same goal.
Syncing the DR through asynchronous replication can lead to replication lag, both because of how asynchronous replication works and, more importantly, because network latency is a big factor when working across data centres. However, adding the DR (PXC) nodes to the DC (PXC) cluster directly via synchronous replication could have a bigger impact and lead to flow control issues if any DR node struggles or runs into performance/saturation problems. So, it is important to weigh all of these aspects and challenges before deploying in production.
The post Deploying Cross-Site Replication in Percona Operator for MySQL (PXC) appeared first on Percona.