September 17, 2021

Migration of a MySQL Database to a Kubernetes Cluster Using Asynchronous Replication

Nowadays, more and more companies are thinking about migrating their infrastructure to Kubernetes. Databases are no exception. A lot of k8s operators have been created to simplify different types of deployments and to perform routine day-to-day tasks like making backups, renewing certificates, and so on. If a few years ago nobody even wanted to hear about running databases in Kubernetes, everything has changed now.

At Percona, we have created a few feature-rich k8s operators for Percona Server for MongoDB, PostgreSQL, and MySQL databases. Today we will talk about using cross-site replication – a new feature that was added to the latest release of Percona Distribution for MySQL Operator. This feature is based on the MySQL asynchronous connection failover mechanism.
The cross-site replication feature involves configuring one Percona XtraDB Cluster or a single/several MySQL servers as the Source, and another Percona XtraDB Cluster (PXC) as a replica, to allow asynchronous replication between them. If the operator has several sources in the custom resource (CR), it will automatically handle a connection failure of the source DB.
This cross-site replication feature is supported only for MySQL 8.0.23 and later, but you can read about migrating earlier MySQL versions in this blog post.

The Goal

Migrate the MySQL database, which is deployed on-prem or in the cloud, to the Percona Distribution for MySQL Operator using asynchronous replication. This approach helps you reduce downtime and data loss for your application.

So, we have the following setup:

Migration of MySQL database to Kubernetes cluster using asynchronous replication

The following components are used:

1. A MySQL 8.0.23 database (in my case it is Percona Server for MySQL) deployed in DO (as the Source), and Percona XtraBackup for the backup. In my test deployment, I use only one server as the Source to simplify the setup. Depending on your DB deployment topology, you can use several servers so that the asynchronous connection failover mechanism is used on the operator's end.

2. Google Kubernetes Engine (GKE) cluster where Percona Distribution for MySQL Operator is deployed with PXC cluster (as a target).

3. An AWS S3 bucket is used to store the backup of the MySQL DB and then to restore it into the PXC cluster in k8s.

The following steps should be done to perform the migration procedure:

1. Make the MySQL database backup using Percona XtraBackup and upload it to the S3 bucket using xbcloud.

2. Perform the restore of the MySQL database from the S3 bucket into the PXC cluster which is deployed in k8s.

3. Configure asynchronous replication between MySQL server and PXC cluster managed by k8s operator.

As a result, we have asynchronous replication between the MySQL server and the PXC cluster in k8s, which is in read-only mode.

Migration

Configure the target PXC cluster managed by k8s operator:

1. Deploy Percona Distribution for MySQL Operator on Kubernetes (I have used GKE 1.20).

# clone the git repository
git clone -b v1.9.0 https://github.com/percona/percona-xtradb-cluster-operator
cd percona-xtradb-cluster-operator

# deploy the operator
kubectl apply -f deploy/bundle.yaml

2. Create PXC cluster using the default custom resource manifest (CR).

# create my-cluster-secrets secret (do not use default passwords for production systems)
kubectl apply -f deploy/secrets.yaml

# create the cluster; by default it will be PXC 8.0.23
kubectl apply -f deploy/cr.yaml
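
It can take a few minutes for all pods to come up. A quick way to check progress (this assumes the pxc short name is registered by the operator's CRD; kubectl get perconaxtradbcluster works as well):

# wait until all cluster pods are Ready
kubectl get pods

# check the cluster status as reported by the operator
kubectl get pxc cluster1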

3. Create a secret with the credentials for the AWS S3 bucket, which will be used to access the bucket during the restore procedure.

# create an S3-secret.yaml file with the following content, using your base64-encoded credentials instead of XXXXXX

apiVersion: v1
kind: Secret
metadata:
  name: aws-s3-secret
type: Opaque
data:
  AWS_ACCESS_KEY_ID: XXXXXX
  AWS_SECRET_ACCESS_KEY: XXXXXX
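
Note that the values under data in a Kubernetes Secret must be base64-encoded. A quick way to produce them before filling in the manifest (the key strings below are placeholders, not real credentials):

# base64-encode the AWS credentials for use in S3-secret.yaml
echo -n 'YOUR_ACCESS_KEY_ID' | base64
echo -n 'YOUR_SECRET_ACCESS_KEY' | base64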

# create secret
kubectl apply -f S3-secret.yaml

Configure the Source MySQL Server

1. Install Percona Server for MySQL 8.0.23 and Percona XtraBackup for the backup. Refer to the Installing Percona Server for MySQL and Installing Percona XtraBackup chapters in the documentation for installation instructions.


NOTE:
You need to add the following options to my.cnf to enable GTID support; otherwise, replication will not work because GTID-based replication is used by the PXC cluster by default.

[mysqld]
enforce_gtid_consistency=ON
gtid_mode=ON
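
After restarting mysqld, you can quickly confirm that GTIDs are enabled, for example:

# verify GTID settings on the source server
mysql -uroot -p -e "SELECT @@gtid_mode, @@enforce_gtid_consistency;"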

2. Create all the users that will be used by the k8s operator. The passwords should be the same as in deploy/secrets.yaml. Also, please note that the password for the root user should be the same as in the deploy/secrets.yaml file used for the k8s secret. In my case, I used our default passwords from the deploy/secrets.yaml file.

CREATE USER 'monitor'@'%' IDENTIFIED BY 'monitory' WITH MAX_USER_CONNECTIONS 100;
GRANT SELECT, PROCESS, SUPER, REPLICATION CLIENT, RELOAD ON *.* TO 'monitor'@'%';
GRANT SERVICE_CONNECTION_ADMIN ON *.* TO 'monitor'@'%';

CREATE USER 'operator'@'%' IDENTIFIED BY 'operatoradmin';
GRANT ALL ON *.* TO 'operator'@'%' WITH GRANT OPTION;

CREATE USER 'xtrabackup'@'%' IDENTIFIED BY 'backup_password';
GRANT ALL ON *.* TO 'xtrabackup'@'%';

CREATE USER 'replication'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* to 'replication'@'%';
FLUSH PRIVILEGES;

3. Make a backup of the MySQL database using the XtraBackup tool and upload it to the S3 bucket.

# export aws credentials
export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXX

# make the backup and stream it to S3
xtrabackup --backup --stream=xbstream --target-dir=/tmp/backups/ --extra-lsndir=/tmp/backups/ --password=root_password | xbcloud put --storage=s3 --parallel=10 --md5 --s3-bucket="mysql-testing-bucket" "db-test-1"
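
Before moving on, it may be worth confirming that the backup chunks actually reached the bucket; for example, with the AWS CLI (assuming it is installed and configured with the same credentials):

# list the uploaded backup chunks
aws s3 ls s3://mysql-testing-bucket/db-test-1/ --recursive | head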

Now, everything is ready to perform the restore of the backup on the target database. So, let’s get back to our k8s cluster.

Configure the Asynchronous Replication to the Target PXC Cluster

If you have a completely clean source database (without any data), you can skip the steps related to backing up and restoring the database. Otherwise, do the following:

1. Restore the backup from the S3 bucket using the following manifest:

# create a restore.yml file with the following content

apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterRestore
metadata:
  name: restore1
spec:
  pxcCluster: cluster1
  backupSource:
    destination: s3://mysql-testing-bucket/db-test-1
    s3:
      credentialsSecret: aws-s3-secret
      region: us-east-1

# trigger the restoration procedure
kubectl apply -f restore.yml
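
The restore runs asynchronously, so watch its status before moving on; a minimal check using the restore object created above:

# follow the restore status and the cluster pods
kubectl get perconaxtradbclusterrestore restore1
kubectl get pods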

As a result, you will have a PXC cluster with data from the source DB. Now everything is ready to configure the replication.

2. Edit the custom resource manifest deploy/cr.yaml to configure the spec.pxc.replicationChannels section.

spec:
  ...
  pxc:
    ...
    replicationChannels:
    - name: ps_to_pxc1
      isSource: false
      sourcesList:
        - host: <source_ip>
          port: 3306
          weight: 100

# apply CR
kubectl apply -f deploy/cr.yaml


Verify the Replication 

In order to check the replication, you need to connect to a PXC node and run the following command:

kubectl exec -it cluster1-pxc-0 -c pxc -- mysql -uroot -proot_password -e "show replica status \G"
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: <ip-of-source-db>
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000004
          Read_Master_Log_Pos: 529
               Relay_Log_File: cluster1-pxc-0-relay-bin-ps_to_pxc1.000002
                Relay_Log_Pos: 738
        Relay_Master_Log_File: binlog.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 529
              Relay_Log_Space: 969
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 9741945e-148d-11ec-89e9-5ee1a3cf433f
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 3
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 9741945e-148d-11ec-89e9-5ee1a3cf433f:1-2
            Executed_Gtid_Set: 93f1e7bf-1495-11ec-80b2-06e6016a7c3d:1,
9647dc03-1495-11ec-a385-7e3b2511dacb:1-7,
9741945e-148d-11ec-89e9-5ee1a3cf433f:1-2
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: ps_to_pxc1
           Master_TLS_Version:
       Master_public_key_path:
        Get_master_public_key: 0
            Network_Namespace:

Also, you can verify the replication by checking that the data is changing.
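
For example, a quick end-to-end check could look like this (the repl_test database and table are made up purely for illustration):

# on the source server: create a throwaway table and insert a row
mysql -uroot -p -e "CREATE DATABASE IF NOT EXISTS repl_test; CREATE TABLE IF NOT EXISTS repl_test.t (id INT AUTO_INCREMENT PRIMARY KEY, ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP); INSERT INTO repl_test.t () VALUES ();"

# on the PXC side: check that the row has been replicated
kubectl exec -it cluster1-pxc-0 -c pxc -- mysql -uroot -proot_password -e "SELECT * FROM repl_test.t;"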

Promote the PXC Cluster as a Primary

As soon as you are ready to stop the replication and promote the PXC cluster in k8s to be the primary DB (that is, your application has been reconfigured and is ready to work with the new DB), you need to perform the following simple steps:

1. Edit deploy/cr.yaml and comment out the replicationChannels section:

spec:
  ...
  pxc:
    ...
    #replicationChannels:
    #- name: ps_to_pxc1
    #  isSource: false
    #  sourcesList:
    #    - host: <source_ip>
    #      port: 3306
    #      weight: 100

2. Stop the mysqld service on the source server to make sure that no new data is written.

 systemctl stop mysqld

3. Apply a new CR for k8s operator.

# apply CR
kubectl apply -f deploy/cr.yaml

As a result, replication is stopped and the read-only mode is disabled for the PXC cluster.
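
A quick way to confirm this on the PXC node, for example:

# read_only and super_read_only should both be 0 now
kubectl exec -it cluster1-pxc-0 -c pxc -- mysql -uroot -proot_password -e "SELECT @@read_only, @@super_read_only;"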

Conclusion

Technologies are changing so fast that a migration to a k8s cluster, which seems very complex at first sight, turns out to be neither so difficult nor so time-consuming. But you need to keep in mind that significant changes are involved: you are migrating the database to a PXC cluster, which has its own peculiarities, and, of course, to Kubernetes itself. If you are ready, you can start the journey to Kubernetes right now.

The Percona team is ready to guide you during this journey. If you have any questions,  please raise the topic in the community forum.

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

Learn More About Percona Kubernetes Operators

September 16, 2021

Repoint Replica Servers in MySQL/Percona Server for MySQL 8.0

When doing migrations or failovers in MySQL, there is usually a need to do a topology change and repoint replica servers to obtain replication data from a different server.

For example, given servers {A, B, and C} and the following topology:

MySQL Topology

If you need to repoint C to be a replica of B, i.e.:

repoint mysql

You can follow the next steps:

Note: log_replica_updates should be enabled on the soon-to-be primary as it is a prerequisite for chain replication.

Note: It is assumed that both replicas only stream from Server A and there are no conflicting replication filters in place that might break replication later on.

If Using File/Position-Based Replication:

1) Stop replication on B and C:

STOP REPLICA;

2) If replicas are multi-threaded, correct MTS gaps and make them single-threaded until all changes are applied. To do so, execute the following commands on BOTH nodes:

START REPLICA UNTIL SQL_AFTER_MTS_GAPS;
SHOW REPLICA STATUS\G -- repeat this until you see "Replica_SQL_Running: No"
STOP REPLICA;
SELECT @@global.replica_parallel_workers; -- take note to restore later
SET GLOBAL replica_parallel_workers=0; -- disable MTS during the operations

3) Then check which node is more up to date by looking at Relay_Source_Log_File and Exec_Source_Log_Pos. Run on BOTH nodes:

SHOW REPLICA STATUS\G
# Take note of Relay_Source_Log_File/Exec_Source_Log_Pos from the most up to date node.

4) Sync the replicas with UNTIL. Run on the most delayed node, using the values noted above:

START REPLICA UNTIL SOURCE_LOG_FILE='<Relay_Source_Log_File>', SOURCE_LOG_POS=<Exec_Source_Log_Pos>;
SHOW REPLICA STATUS\G -- repeat this until you see "Replica_SQL_Running: No"

5) If you followed the above steps, at this point both replicas should have the exact same data set and be in sync at the same point in time.
# Double-check that both replicas are stopped and at the same coordinates, since doing topology changes while replication is ongoing, or with different coordinates, can cause inconsistencies (a small command to pull just these fields is shown after the checklist):

SHOW REPLICA STATUS\G
# Replica_IO_Running must be “NO” in both replicas
# Replica_SQL_Running must be “NO” in both replicas
# Relay_Source_Log_File must match in both replicas
# Exec_Source_Log_Pos must match in both replicas
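
If you prefer not to eyeball the full output, a small command like this pulls just the fields that must match (run it on both B and C, adding connection options as needed):

# show only the coordinates and thread states that must match on both replicas
mysql -e "SHOW REPLICA STATUS\G" | grep -E "Replica_IO_Running|Replica_SQL_Running|Relay_Source_Log_File|Exec_Source_Log_Pos"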

6) Get current coordinates from B (new intermediate primary). Execute on B:

SHOW MASTER STATUS \G
# Take note of File and Position

7) Repoint C to B. Execute on C with coords from previous step:

CHANGE REPLICATION SOURCE TO SOURCE_HOST='<ip-address-of-B>', SOURCE_LOG_FILE='<File>', SOURCE_LOG_POS=<Position>;

8) If you had disabled MTS, re-enable it here on both B and C:

SET GLOBAL replica_parallel_workers=X; -- see output of step 2 for correct value

9) Restart replication normally. Run on both nodes:

START REPLICA;

If Using GTID-Based Replication:

1) Stop B and C:
STOP REPLICA;

2) If replicas are multi-threaded, correct MTS gaps and make them single-threaded until all changes are applied. Run on BOTH nodes:

START REPLICA UNTIL SQL_AFTER_MTS_GAPS;
SHOW REPLICA STATUS\G -- repeat this until you see "Replica_SQL_Running: No"
STOP REPLICA;
SELECT @@global.replica_parallel_workers; -- take note to restore later
SET GLOBAL replica_parallel_workers=0; -- disable MTS during the operations

3) Then check which node is more up to date by looking at the sequence numbers in Executed_Gtid_Set. Run on BOTH nodes:

SHOW REPLICA STATUS\G
# Take note of the Executed_Gtid_Set with the largest sequence number. If there is a mismatch in the GTID sets, it means there were either local writes or writes coming from some other server. In that case, you should check data consistency between the servers, for example with pt-table-checksum. Then you need to fix the GTID differences by either restoring the replica from scratch or fixing the errant transactions as explained in this other blog post.

4) Bring all nodes up to the same point in time. Run on the node with the smallest GTID sequence number:

START REPLICA UNTIL SQL_AFTER_GTIDS='<Executed_Gtid_Set>';
SHOW REPLICA STATUS\G -- repeat this until you see "Replica_SQL_Running: No"

5) If you followed the above steps, at this point both replicas should have the exact same data set and be in sync at the same point in time.
# Double-check that both replicas are stopped and at the same coordinates, since doing topology changes while replication is ongoing, or with different coordinates, can cause inconsistencies:

SHOW REPLICA STATUS\G
# Replica_IO_Running must be “NO” in both replicas
# Replica_SQL_Running must be “NO” in both replicas
# Executed_Gtid_Set must match in both replicas

6) Now both replicas have identical data, so you can repoint C to replicate from B. Run on C:

CHANGE REPLICATION SOURCE TO SOURCE_HOST='<ip-address-of-B>';

7) If you had disabled MTS, re-enable it here on both B and C:

SET GLOBAL replica_parallel_workers=X; -- see output of step 2 for correct value

8) Restart replication normally. Run on both nodes:

START REPLICA;

Doing the opposite change, from chain replication (A->B->C) back to one primary with two replicas, should be simpler:

If Using File/Position-Based Replication:

1) Stop replication on B and make sure B is not receiving any write activity:

STOP REPLICA;

2) Check current binary log position on B:

SHOW MASTER STATUS \G

3) On C, check replication until C catches up with B. On C:

SHOW REPLICA STATUS \G

# For C to have caught up with B, the following conditions should be met:
# “File” from B in step 2) should match Relay_Source_Log_File from step 3)
# “Position” from B in step 2) should match Exec_Source_Log_Pos from step 3)

# After catchup, both servers will be in sync with the same data set.

4) Check current replication coords from B:

SHOW REPLICA STATUS \G
# Write down Relay_Source_Log_File and Exec_Source_Log_Pos from B, as we will be using these coordinates on C

5) Repoint C to replicate from A. The file and position used should be the ones taken from B in the last step:

CHANGE REPLICATION SOURCE TO SOURCE_HOST='<ip-address-of-A>', SOURCE_LOG_FILE='<File>', SOURCE_LOG_POS=<Position>;

6) Restart replication normally. Run on both nodes:

START REPLICA;

If Using GTID-Based Replication:

1) Stop replication on B and make sure B is not receiving any write activity:

STOP REPLICA;

2) Check current binary log position on B:

SHOW MASTER STATUS \G

3) On C, check replication until C catches up with B. On C:

SHOW REPLICA STATUS \G

# For C to have caught up with B, the following conditions should be met:
# Executed_Gtid_Set from B in step 2) should match Executed_Gtid_Set from step 3)
# After catchup, both servers will be in sync with the same data set.

4) Repoint C to replicate from A:

CHANGE REPLICATION SOURCE TO SOURCE_HOST='<ip-address-of-A>';

5) Restart replication normally. Run on both nodes:

START REPLICA;

Conclusion:

Doing topology changes might seem hard at first, but with the above procedures it should be easy and error-free! If you do not want to take the manual approach, you can consider using tools like Orchestrator, which allows for automatic failover and promotions.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure, open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Download Percona Distribution for MySQL Today

September 13, 2021

Percona Distribution for PostgreSQL Updates, Improvements to Percona XtraBackup: Release Roundup September 13, 2021

It’s release roundup time again here at Percona!

Percona is a leading provider of unbiased open source database solutions that allow organizations to easily, securely, and affordably maintain business agility, minimize risks, and stay competitive.

Our Release Roundups showcase the latest Percona software updates, tools, and features to help you manage and deploy our software. They offer highlights and critical information, as well as links to the full release notes and direct links to the software or service itself to download.

Today’s post includes those releases and updates that have come out since August 30, 2021. Take a look!

 

Percona Distribution for PostgreSQL (11.13, 12.8 and 13.4)

On September 9, 2021, we released updates of Percona Distribution for PostgreSQL.  It provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together. Patroni, pgBackRest, pg_repack, and pgaudit are amongst the innovative components we utilize.

Percona Distribution for PostgreSQL 11.13 Release Notes
Download Percona Distribution for PostgreSQL 11.13

Percona Distribution for PostgreSQL 12.8 Release Notes
Download Percona Distribution for PostgreSQL 12.8

Percona Distribution for PostgreSQL 13.4 Release Notes
Download Percona Distribution for PostgreSQL 13.4

 

Percona XtraBackup 8.0.26-18.0

On September 2, 2021, we released Percona XtraBackup 8.0.26-18.0. It enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Along with some bug fixes, this release features some improvements, including xbcloud now retrying on errors with incremental backoff (thanks to Baptiste Mille-Mathias for reporting this issue) and the removal of the obsolete LOCKLESS binary log functionality, since the performance_schema.log_status table is now used to get the log information on all the storages without locking them.

Download Percona XtraBackup 8.0.26-18.0

 

That’s it for this roundup, and be sure to follow us on Twitter to stay up-to-date on the most recent releases! Percona is a leader in providing best-of-breed enterprise-class support, consulting, managed services, training, and software for MySQL, MongoDB, PostgreSQL, MariaDB, and other open source databases in on-premises and cloud environments.

September 9, 2021

Q&A on Webinar “Using Open Source Software to Optimize and Troubleshoot Your MySQL Environment”

Thanks to everyone who attended last week’s webinar on Using Open Source Software to Optimize and Troubleshoot Your MySQL Environment; hopefully you’ve found the time we spent in Percona Monitoring and Management (PMM) useful.

We had a record-breaking number of questions during the talk and unfortunately weren’t able to answer them all live, so we decided to answer them separately. Also, there were several requests for best practices around installation and configuration. This is something we are considering for the next webinar in this series, so stay tuned!

If you weren’t able to attend, the recording is available for viewing. But now, without further ado, here are the questions that we didn’t have time to cover during the presentation.

 

Q: Can PMM also be used for a web hosting server (cPanel, DirectAdmin, etc.)?

PMM by default can monitor a node to provide vital statistics on the health of the host.  From there, you can use external exporters to monitor other applications and send the data to PMM to visualize and create alerts.

 

Q: Does it provide any query optimization suggestions if my query is bad? 

Not at present; that is planned for the future query advisor.

 

Q: How soon will we be able to use the alerting manager in production?

We are looking at late Sept to early Oct. When it’s ready, you will hear about it!

 

Q: Capturing query data for performance checking can be costly, and some monitoring systems capture data every few seconds. At what level is data captured and analyzed here on live systems with lots of database traffic? What percentage (all of it, every 2 seconds, every 1 second, etc.)?

We adhere to ‘do no harm’, so the impact of PMM is typically 1-4% on the busiest systems. We offer custom resolutions to adjust the scrape frequency, balancing the need for information with the need for performance.

 

Q: Are long-running queries that potentially slow down the system over time captured and shown as a graph/alert? Also, can you see whether more than one instance of these query types is being run repeatedly by a user?

This is something we are going to include in our Alerting capabilities (coming soon, see above).

 

Q: Can more than one of the metrics be compared against each other to gain more insight into a problem in graphical form? Can you in effect play with these graphs?

Yes, you can, this is in fact how most of the dashboards are designed, where we connect different metric series together to drive graphs that explain system performance.  While you may be able to edit the existing graphs, Percona recommends that you instead make a copy of the dashboard you’d like to modify and make your changes on the copy.  The reason for this is if you modify a dashboard distributed by PMM, it will be overwritten on the next upgrade, and you’ll lose your changes.

 

Q: Could you list what can be monitored using PMM? And explain what recommended plugins are available and what they are used for? 

Natively, any Linux system and pretty much all flavors of MySQL, MariaDB, MongoDB, and PostgreSQL. You can use external exporters to gather even more data than the default, and using Grafana as the basis for PMM's visualization allows you to create custom dashboards and use a wealth of community plugins.

 

Q: Can you choose to monitor a particular set of users? Set of queries? Set of schema? 

You can filter the view down by username or a particular schema, and then filter those results by particular query strings. We can monitor as much or as little about your database as the user you define to pull the data allows.

 

Q: How can we work on optimization when using cloud-based services like RDS where we have limited access?

PMM can monitor RDS instances and has simplified the connection and selection process in its remote monitoring capabilities. We can provide nearly the same data as for an on-prem database; however, we don't have access to the node-level statistics.

 

Q: For Oracle MySQL 5.7.29, if you have many tables/objects in the database, will PMM query information_schema and load the DB?

We have a predefined limit of 1,000 tables above which polling of information_schema is disabled, but you can configure this to your liking, both with the client and with remote monitoring. This CAN have a more significant impact on your system, though, especially with large table and row counts.
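
If you need a different limit, it can be set when adding the instance. A sketch, assuming a recent PMM 2 client (check pmm-admin add mysql --help for the exact flag on your version; the service name and address below are placeholders):

# raise the table statistics limit for a server with many tables
pmm-admin add mysql --username=pmm --password=<password> \
  --query-source=perfschema --tablestats-group-table-limit=5000 \
  my-mysql 127.0.0.1:3306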

 

Q: At what point do I know I’ve done enough optimization? 

HA! It's a never-ending game of cat and mouse considering the sheer volume of variables in play. It is in these times that monitoring data from before and after becomes vital.

 

Q: Can a database monitoring package be the source of database performance issues? In particular, mysqld_exporter is installed as a docker container, as I’m seeing “out of resources” on a trace on mysqld_exporter.

Of course, there are plenty of ways to generate database performance issues, and it's possible monitoring can result in some overhead. For an extreme example, here's one way to replicate some overhead: start the pmm-client on a MySQL database and restore a blank DB from mysqldump. A few million rows at a time should generate LOTS of chaos and load between QAN and the exporters. Our pmm-client runs the exporter natively, so there is no need to use a container.

 

Q: Is the query analytics somehow slowing down the database server as well? Or is it safe to enable/use it without further impact?

The impact is minimal. Most of the Query Analytics processing is done on the PMM Server; the only impact on the client is retrieving the queries from the slow log or performance schema, so this can have a bigger impact for the most extremely active DBs but should still remain below a 5% CPU hit.

 

Q: Did I understand correctly that PMM is not for RDS users and that AWS tools are available?

PMM certainly is for RDS! Since RDS is managed by AWS, PMM cannot collect CPU/Disk/Memory metrics but all MySQL metrics are still available even in RDS.

 

Q: Do you have any instructions/steps to install PMM to monitor MySQL RDS? 

  • Gear icon → PMM Inventory → Add Instance
  • Choose AWS/RDS Add Remote Instance
  • Use your AWS credentials to view your available RDS & Aurora nodes
  • Ensure that performance_schema is enabled

 

Watch the Recording

September 2, 2021

Introducing xbcloud: Exponential Backoff Feature in Percona XtraBackup

Storing your data locally can impose security and availability risks. Major cloud providers have object storage services available to allow you to upload and distribute data across different regions using various retention and restore policies.

Percona XtraBackup delivers the xbcloud binary – an auxiliary tool to allow users to upload backups to different cloud providers directly.

Today we are glad to announce the introduction of the Exponential Backoff feature to xbcloud.

In short, this new feature will allow your backup upload/download to work better with unstable network connections by retrying each chunk and adding an exponential wait time in between retries, increasing the chances of completion in case of an unstable connection or network glitch.

This new functionality is available on today’s release of Percona XtraBackup 8.0.26 and will be available in Percona XtraBackup 2.4.24.

How it Works – in General

Whenever a chunk upload or download fails to complete its operation, xbcloud will check the reason for the failure. It can be either a CURL/HTTP error or a client-specific error. If the error is listed as retriable (more about that later in this post), xbcloud will back off/sleep for a certain amount of time before trying again. It will retry the same chunk 10 times before aborting the whole process. 10 is the default number of retries and can be configured via the --max-retries parameter.

How it Works – Backoff Algorithm

Network glitches/instabilities usually happen for a short period of time. To make the xbcloud tool more reliable and increase the chances of a backup upload/download completing during those instabilities, we pause for a certain period of time before retrying the same chunk. The algorithm chosen is known as exponential backoff.

In the case of a retry, we calculate a power of two, using the number of retries we have already done for that specific chunk as the exponent. Since xbcloud performs multiple asynchronous requests in parallel, we also add a random number of milliseconds between 1 and 1000 to each chunk's wait. This avoids all asynchronous requests backing off for the same amount of time and retrying at once, which could cause network congestion.

The backoff time will keep increasing as the same chunk keeps failing to upload/download. Taking, for example, the default --max-retries of 10, the last backoff would be around 17 minutes.

To overcome this, we have implemented the --max-backoff parameter. This parameter defines the maximum time, in milliseconds, the program can sleep between chunk retries; it defaults to 300000 (5 minutes).
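
To make the arithmetic concrete, here is a small shell sketch of the schedule described above (an illustration only, not xbcloud's actual code): the wait before retry n is roughly 2^n seconds plus a random 1-1000 ms jitter, capped by --max-backoff.

# simulate the backoff (in milliseconds) for 10 retries with the default 300000 ms cap
max_backoff=300000
for n in $(seq 1 10); do
  backoff=$(( (2 ** n) * 1000 + (RANDOM % 1000) + 1 ))
  [ "$backoff" -gt "$max_backoff" ] && backoff=$max_backoff
  echo "retry $n: sleep ${backoff} ms"
done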

How it Works – Retriable Errors

We have a set of errors for which we know we should retry the operation. For CURL, we retry on:

CURLE_GOT_NOTHING
CURLE_OPERATION_TIMEDOUT
CURLE_RECV_ERROR
CURLE_SEND_ERROR
CURLE_SEND_FAIL_REWIND
CURLE_PARTIAL_FILE
CURLE_SSL_CONNECT_ERROR

For HTTP, we retry the operation in case of the following status codes:

503
500
504
408

Each cloud provider might return a different CURL or HTTP error depending on the issue. To let users extend this list without having to wait for a new version of xbcloud, we created a mechanism for adding to it.

One can add new errors by setting --curl-retriable-errors / --http-retriable-errors respectively.

On top of that, we have enhanced the error handling when using --verbose output to specify on which error xbcloud failed and what parameter a user has to add to retry on this error. Here is one example:

210701 14:34:23 /work/pxb/ins/8.0/bin/xbcloud: Operation failed. Error: Server returned nothing (no headers, no data)
210701 14:34:23 /work/pxb/ins/8.0/bin/xbcloud: Curl error (52) Server returned nothing (no headers, no data) is not configured as retriable. You can allow it by adding --curl-retriable-errors=52 parameter

Those options accept a comma-separated list of error codes.
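
For example, following the hint from the log above; multiple codes can be given as a comma-separated list, and the HTTP code 429 here is added purely to illustrate the syntax:

# treat CURL error 52 and HTTP 429 as retriable during an upload
xtrabackup --backup --stream=xbstream --target-dir=/tmp/backups/ | \
  xbcloud put --storage=s3 --parallel=10 --s3-bucket="mysql-testing-bucket" \
  --curl-retriable-errors=52 --http-retriable-errors=429 "db-test-1"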

Example

Below is one example of xbcloud exponential backoff in practice, used with --max-retries=5 --max-backoff=10000:

210702 10:07:05 /work/pxb/ins/8.0/bin/xbcloud: Operation failed. Error: Server returned nothing (no headers, no data)
210702 10:07:05 /work/pxb/ins/8.0/bin/xbcloud: Sleeping for 2384 ms before retrying backup3/xtrabackup_logfile.00000000000000000006 [1]

. . .

210702 10:07:23 /work/pxb/ins/8.0/bin/xbcloud: Operation failed. Error: Server returned nothing (no headers, no data)
210702 10:07:23 /work/pxb/ins/8.0/bin/xbcloud: Sleeping for 4387 ms before retrying backup3/xtrabackup_logfile.00000000000000000006 [2]

. . .

210702 10:07:52 /work/pxb/ins/8.0/bin/xbcloud: Operation failed. Error: Failed sending data to the peer
210702 10:07:52 /work/pxb/ins/8.0/bin/xbcloud: Sleeping for 8691 ms before retrying backup3/xtrabackup_logfile.00000000000000000006 [3]

. . .

210702 10:08:47 /work/pxb/ins/8.0/bin/xbcloud: Operation failed. Error: Failed sending data to the peer
210702 10:08:47 /work/pxb/ins/8.0/bin/xbcloud: Sleeping for 10000 ms before retrying backup3/xtrabackup_logfile.00000000000000000006 [4]

. . .

210702 10:10:12 /work/pxb/ins/8.0/bin/xbcloud: successfully uploaded chunk: backup3/xtrabackup_logfile.00000000000000000006, size: 8388660

Let's analyze the log snippet above:

  1. Chunk xtrabackup_logfile.00000000000000000006 failed to upload for the first time (as seen in the [1] above) and slept for 2384 milliseconds.
  2. Then the same chunk failed for the second time (as seen by the number within [ ]), with the sleep time increasing exponentially by a factor of two.
  3. When the chunk failed for the third time, the sleep time continued to increase exponentially, to around 8 seconds.
  4. On the fourth failure, the sleep time would normally increase to around 16 seconds; however, we used --max-backoff=10000, which is the maximum sleep time between retries, so the program waited 10 seconds before trying the same chunk again.
  5. In the end, it successfully uploaded the chunk xtrabackup_logfile.00000000000000000006.

Summary

Best practices recommend distributing your backups to different locations. Cloud providers have dedicated services for this purpose. Using xbcloud alongside Percona XtraBackup gives you the tools to meet this requirement for MySQL backups. On the other hand, we know that network connectivity can be unstable at the worst times. The new version of xbcloud won't stop you from completing your backups, as it is more resilient to those instabilities and offers a variety of options to tune the network transfer.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure, open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Download Percona Distribution for MySQL Today

September 1, 2021

Congratulating Marcelo Altmann on his Promotion to Oracle ACE!

We’re excited to share that Marcelo Altmann from the Percona Server Engineering Team has just been promoted from Oracle ACE Associate to Oracle ACE.

Congratulations!

The Oracle ACE Program recognizes and rewards community members for their technical contributions to the Oracle community.

Marcelo initially joined Percona as a senior support engineer in our global services organization, where he helped customers with running their MySQL-based environments. In early 2020, he joined our Server Engineering team and has been actively involved in the development of Percona Server for MySQL and Percona XtraBackup since then.

Marcelo’s contributions to the MySQL ecosystem are countless – he’s an active blogger, he regularly submits bug reports to the MySQL team, organizes local and virtual meetups, and also contributes patches and bug fixes.

Congratulations again, Marcelo, we’re proud to have you on our team!

August 31, 2021

My Favorite Percona Monitoring and Management Additional Dashboards

Percona Monitoring and Management Dashboards

Percona Monitoring and Management (PMM) has dashboards that cover a lot of ground, yet PMM Superpowers come from the fact you do not need to stick to dashboards that are included with the product! You also can easily install additional dashboards provided by the Community, as well as implement your own.

In this blog post, we will cover some of the additional dashboards which I find particularly helpful.

Node Processes Dashboard


Get insights into the processes on the system to better understand resource usage by your database server vs. everything else running on the system. Unexpected resource-hogging processes are a quite common cause of downtime and performance issues. More information is in the Understanding Processes on your Linux Host blog post.

MySQL Memory Usage Details


Ever wondered where MySQL memory usage comes from? This dashboard can shed light on that dark place, showing the top global memory consumers as well as which users and client hosts contribute to memory usage. More details are in the Understanding MySQL Memory Usage with Performance Schema blog post.

MySQL Query Performance Troubleshooting


Want to understand which queries are responsible for CPU, Disk, Memory, or Network Usage and get some other advanced MySQL Query Troubleshooting tools? Check out this dashboard.  Read more about it in the  MySQL Query Performance Troubleshooting blog post.

RED Method for MySQL Dashboard


Want to apply the RED (Rate-Errors-Duration)  method to MySQL?  Check out this dashboard, and check out RED Method for MySQL Performance Analyses for more details.

OK, so let's say you're convinced and want to get those dashboards into your PMM installation, but manual installation does not excite you. Here is how you can use custom dashboard provisioning to install all of them:

curl -LJOs https://github.com/Percona-Lab/pmm-dashboards/raw/main/misc/import-dashboard-grafana-cloud.sh --output import-dashboard-grafana-cloud.sh
curl -LJOs https://github.com/Percona-Lab/pmm-dashboards/raw/main/misc/cleanup-dash.py --output cleanup-dash.py

chmod a+x import-dashboard-grafana-cloud.sh
chmod a+x cleanup-dash.py

./import-dashboard-grafana-cloud.sh -s <PMM_SERVER_IP> -u admin:<ADMIN_PASSWORD> -f Custom -d 13266 -d 12630 -d 12470 -d 14239

Note: The Node Processes and MySQL Memory Usage Details dashboards also require additional configuration on the client side. Check out the blog posts mentioned above for specifics.

Enjoy!

Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.

Download Percona Monitoring and Management Today

August 30, 2021

Percona Server for MongoDB 5.0.2-1 Release Candidate, Updated Percona Backup for MongoDB: Release Roundup August 30, 2021

It’s release roundup time again here at Percona!

Percona is a leading provider of unbiased open source database solutions that allow organizations to easily, securely, and affordably maintain business agility, minimize risks, and stay competitive.

Our Release Roundups showcase the latest Percona software updates, tools, and features to help you manage and deploy our software. They offer highlights and critical information, as well as links to the full release notes and direct links to the software or service itself to download.

Today’s post includes those releases and updates that have come out since August 16, 2021. Take a look!

 

Percona Distribution for MongoDB 4.4.8

Percona Distribution for MongoDB 4.4.8 was released on August 16, 2021. It is a freely available MongoDB database alternative, giving you a single solution that combines enterprise components from the open source community, designed and tested to work together. Percona Distribution for MongoDB includes Percona Server for MongoDB and Percona Backup for MongoDB, enabling you to run and operate your MongoDB efficiently with the data being consistently backed up.

Download Percona Distribution for MongoDB 4.4.8

 

Percona Server for MongoDB 5.0.2-1 (RC)

We are pleased to announce the release candidate of Percona Server for MongoDB 5.0.2-1 (RC) on August 16, 2021. It is an enhanced, source-available, and highly scalable database that is a fully compatible drop-in replacement for MongoDB 5.0.2 Community Edition. It includes all the features of MongoDB 5.0.2 Community Edition and fully supports its protocols and drivers, requiring no changes to MongoDB applications or code.

Note: With a lot of new features and modifications introduced, we recommend using this release candidate in testing environments only.

Download Percona Server for MongoDB 5.0.2-1 (RC)

 

Percona Server for MongoDB 4.4.8-9

On August 16, 2021, Percona Server for MongoDB 4.4.8-9 was released. It’s a fully compatible, drop-in replacement for MongoDB 4.4.8 Community Edition, supporting MongoDB 4.4.8 protocols and drivers. Along with some bug fixes, there are a few improvements in this release, including the ability to view the status of a hot backup using the mongo shell, the removal of excessive log messages, and improved error messages for various backup cases.

Download Percona Server for MongoDB 4.4.8-9

 

Percona Backup for MongoDB 1.6.0

On August 16, 2021, we released Percona Backup for MongoDB 1.6.0, a distributed, low-impact solution for consistent backups of MongoDB sharded clusters and replica sets. This is a tool for creating consistent backups across a MongoDB sharded cluster (or a single replica set), and for restoring those backups to a specific point in time.

Release highlights include:

  • Support for Percona Server for MongoDB and MongoDB Community 5.0
  • Point-in-time recovery enhancements: ability to restore from any previous snapshot and configurable span of oplog events
  • JSON output for PBM commands to simplify interfacing PBM with applications

Download Percona Backup for MongoDB 1.6.0

 

Percona Server for MySQL 5.7.35-38

August 18, 2021, saw the release of Percona Server for MySQL 5.7.35-38, a free, fully compatible, enhanced, and open source drop-in replacement for any MySQL database.  It includes all the features and bug fixes available in MySQL 5.7.35 Community Edition in addition to enterprise-grade features developed by Percona.

Download Percona Server for MySQL 5.7.35-38

 

That’s it for this roundup, and be sure to follow us on Twitter to stay up-to-date on the most recent releases! Percona is a leader in providing best-of-breed enterprise-class support, consulting, managed services, training, and software for MySQL, MongoDB, PostgreSQL, MariaDB, and other open source databases in on-premises and cloud environments.

August 24, 2021

PostgreSQL Custom Dashboards Ported to Percona Monitoring and Management 2

Custom Dashboards for PostgreSQL

At the recent Percona Live, Agustín Gallego and I did a presentation “Optimizing and Troubleshooting PostgreSQL with PMM“. While preparing for this talk, I’ve ported some of the older PostgreSQL custom dashboards that we published over the years over at grafana.com. Initially made for Percona Monitoring and Management (PMM) 1, they are now capable of being added to PMM 2 easily. In this short blog post, I will show you how to set up two of these dashboards on PMM 2: “PostgreSQL Tuple Statistics” and “PostgreSQL DB/Table Size Details”. Technically, the DB/Table Size one is for PMM 2, but there’s an easier way to set it up in recent PMM versions.

You can also check out the original blog post covering the installation of the Tuple Statistics dashboard for PMM 1: “Adding PostgreSQL Tuple Statistics Dashboard to the PMM Plugin“.

Disclaimer About PMM Versions

PostgreSQL support is continuously improving in PMM, and the engineering team is busy with QAN and exporter improvements. In PMM release 2.16.0, postgres_exporter was updated to the community version 0.8.0, which among other things significantly improves the way custom queries are executed. In this post, I assume that you’re using PMM version 2.16.0 or above. If you’re using an older version of PMM 2, you will have to use the dblink approach proposed in the original blog post. Note that some parts of the new dashboards may not work properly with the older PMM versions.

PostgreSQL Custom Dashboards Overview

The two dashboards that were ported are “PostgreSQL Tuple Statistics” and “PostgreSQL DB/Table Size Details“, both adding important missing pieces of information to PMM. They were initially implemented by Vadim Yalovets.

“PostgreSQL Tuple Statistics”, in addition to the breakdown of tuple operations, provides an overview of the dead/live tuples ratio and details on the history of vacuum executions. New addition with this port to PMM2 is that the dashboard now gives some insight into database age and current progress towards the “vacuum to prevent wraparound.” Having a view of operations happening on a per-table basis can highlight the hottest objects in the databases, and the vacuum details are a welcome addition to any monitoring system that works with PostgreSQL.

The “PostgreSQL DB/Table Size Details” dashboard is somewhat simpler and narrower in its scope. This dashboard gives an overview of database and table sizes and their rate of growth. Helpful when you want to understand where all the disk space went.

Setting up Custom Queries

Once you have set up the PMM client to monitor your PostgreSQL instance, you will find three directories where you can put custom queries:

# cd /usr/local/percona/pmm2/collectors/custom-queries/postgresql/
# ls -l
drwxr-xr-x. 2 pmm-agent pmm-agent  69 Jul 12 21:16 high-resolution
drwxr-xr-x. 2 pmm-agent pmm-agent 126 Jul 12 21:24 low-resolution
drwxr-xr-x. 2 pmm-agent pmm-agent  34 Jul 12 21:16 medium-resolution

Technical details can be found in the “Running Custom MySQL Queries in Percona Monitoring and Management” blog post. However, all you need to do is to create two files. Download or otherwise copy the following files to the low-resolution directory:

Once that’s done, restart the pmm-agent or kill the postgres_exporter process:

# systemctl restart pmm-agent

I recommend using the low resolution for these particular queries as, frankly, it doesn't make a lot of sense to check database sizes and the number of updates every 1 or 5 seconds.
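
For reference, a postgres_exporter custom query file is simply a YAML mapping of a metric prefix to a query and its column metadata. Here is a minimal illustrative sketch (this is not one of the dashboards' actual query files, and the file name is made up):

# create an example custom query file in the low-resolution directory
cat > /usr/local/percona/pmm2/collectors/custom-queries/postgresql/low-resolution/example-db-size.yml <<'EOF'
pg_example_database:
  query: "SELECT datname, pg_database_size(datname) AS size_bytes FROM pg_database"
  metrics:
    - datname:
        usage: "LABEL"
        description: "Database name"
    - size_bytes:
        usage: "GAUGE"
        description: "Database size in bytes"
EOF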

Importing the Dashboards

Once the custom queries are prepared and the exporter is restarted, you can go ahead and import new dashboards in Grafana! See the official documentation on how to do that: Dashboards/Export and import.

You can pick up the dashboards on grafana.com: “PostgreSQL Tuple Statistics (Designed for PMM2)“, “PostgreSQL DB/Table Size Details (Designed for PMM2)“, or find raw sources over at my GitHub.

If everything worked as expected, you should see your new dashboards showing the data:

PMM Dashboard: PostgreSQL Tuple Details pt1

PMM Dashboard: PostgreSQL Tuple Details pt2

PMM Dashboard: PostgreSQL Tuple Details pt3

PMM Dashboard: PostgreSQL DB/Table Size Details

Enjoy monitoring with these PostgreSQL custom dashboards!

August 23, 2021

August Is Hot, but Not as Hot as Our Next Community Engineering Meeting!

The engineering marvel that is Percona Monitoring and Management (PMM) has to be, for me, one of the best examples out there of how a large number of vastly different community projects can successfully combine into something whose value and features far exceed the sum of its open-source component parts.

That’s a lot of words for a simple thing: it’s a metrics mover—it’s PMM!

PMM has exporters that suck system stats from a server or database.

The numbers are safely shuffled through agents across a crowded network toward the sanctuary of the PMM Server.

That’s where your valuable system values are collected and collated, charted and tabled, and where they emerge as fabled knowledge, logically organized and neatly arranged into dashboards.

Dashboards (What you see)

PMM’s dashboards are based on Grafana, and PMM relies on many community-based projects.

Exporters (What you don’t see)

Exporters are specialized programs that run and extract metrics data from a node or database. Every database is different, and so every exporter is too.

Community Meetings (What you may not know)

I tell you what you may already know simply as a prelude to an announcement of what you may not yet know.

We’ve been running monthly community meetings where our Engineering Gurus “talk technical” to share their knowledge of PMM and all its parts.

The format is fairly informal. We use our Discord server’s voice channel. We grab a refreshing region-sensitive hot or cold drink, a very loose agenda, and we talk.

All we need now is someone to listen.

The topics in this month’s meeting:

  • Exporters: Focusing this month on node_exporter, and how and why we’re reverting our fork of the Prometheus one.

  • Dashboards: How Percona’s differ from the community’s, and how to migrate them using the scripts in here. Here’s a brief preview.

    • Convert a dashboard from PMM


      This script converts a PMM dashboard so it can be used in an external Prometheus + Grafana installation. It doesn’t need any input from you. It replaces PMM2 labels (node_name, service_name) that are used in variables with default labels (instance).

    • Convert a dashboard to PMM

    This one renames labels and variables interactively because it can’t know in advance the names of the labels you want to use. You must select names for renaming and provide a PMM2 label (node_name, service_name). The script collects a list of used variables and asks which have to be renamed in variables and expressions.

    PMM2 labels can be checked in the VM configuration files.

As we’ve already tweeted, our Gurus are not ashamed of showing their consoles. Some are bare and some are full, but they’re always fascinating to watch when they use them with such style.

Why not join us and see for yourself?
