Feb
24
2021
--

Point-In-Time Recovery in Kubernetes Operator for Percona XtraDB Cluster – Architecture Decisions

Point-In-Time Recovery (PITR) for MySQL databases is an essential feature that covers common use cases, such as recovering to the latest possible transaction or rolling the database back to a specific point before some bad query was executed. Percona Kubernetes Operator for Percona XtraDB Cluster (PXC) added support for PITR in version 1.7, and in this blog post we are going to look into the technical details and the decisions we made to implement this feature.

Architecture Decisions

Store Binary Logs on Object Storage

MySQL uses binary logs to perform point-in-time recovery. Usually, they are stored locally along with the data, but that is not an option for us:

  • We run a cluster, and we cannot rely on a single node’s local storage.
  • The cloud-native world lives in an ephemeral dimension, where nodes and pods can be terminated and S3-compatible storage is a de facto standard to store data.
  • We should be able to recover the data to another Kubernetes cluster in case of a disaster.

We have decided to add a new Binlog Uploader Pod, which connects to an available PXC member and uploads binary logs to S3. Under the hood, it relies on the mysqlbinlog utility.
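
Conceptually, the Uploader performs a continuous remote binlog pull along these lines (a sketch only; the host, user, and starting binlog name are illustrative, and the real Pod wraps this kind of invocation with its own credentials and retry handling):

mysqlbinlog --read-from-remote-server --host=cluster1-pxc-0 --port=3306 \
    --user=binlog-uploader --password="$UPLOADER_PASSWORD" \
    --raw --stop-never binlog.000001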

Use Global Transaction ID

Binary logs on the clustered nodes are not synced and can have different names and contents. This becomes a problem for the Uploader, as it can connect to different PXC nodes for various reasons.

To solve this problem, we decided to rely on the Global Transaction ID (GTID). It is a transaction identifier that is unique not only to the server on which it originated but across all servers in a given replication topology. With GTIDs captured in the binary logs, we can identify any transaction regardless of the binlog filename or its contents. This allows us to continue streaming binlogs from any PXC member at any moment.

User-Defined Functions

We have a unique identifier for every transaction, but the mysqlbinlog utility still doesn’t have the functionality to determine which binary log file contains which GTIDs. We decided to extend MySQL with a few User-Defined Functions and added them to Percona Server for MySQL and Percona XtraDB Cluster starting with version 8.0.21.

get_gtid_set_by_binlog()

This function returns all GTIDs that are stored inside the given binlog file. We write this GTID set to a new file stored next to the binary log on S3.

get_binlog_by_gtid_set()

This function takes a GTID set as input and returns the name of the locally stored binlog file that contains it. We use it to figure out which GTIDs are already uploaded and which binlog to upload next.
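
For illustration, both functions can be called from any client session; the host and binlog name below are placeholders, and the GTID set is the one from this post’s own example:

mysql -h cluster1-pxc-0 -uroot -p"$ROOT_PASSWORD" -e \
    "SELECT get_gtid_set_by_binlog('binlog.000027');"
mysql -h cluster1-pxc-0 -uroot -p"$ROOT_PASSWORD" -e \
    "SELECT get_binlog_by_gtid_set('a8e657ab-5a47-11eb-bea2-f3554c9e5a8d:15');"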

Find the Node with the Oldest Binary Log

Our quality assurance team caught a bug before the release that can happen only in a cluster:

  • Add a new node to the Percona XtraDB Cluster (for example scale up from 3 to 5 nodes).
  • Binlog Uploader Pod tries to execute get_binlog_by_gtid_set on the new node but gets an error:
2021/01/19 11:23:19 ERROR: collect binlog files: get last uploaded binlog name by gtid set: scan binlog: sql: Scan error on column index 0, name "get_binlog_by_gtid_set('a8e657ab-5a47-11eb-bea2-f3554c9e5a8d:15')": converting NULL to string is unsupported

The error is valid: the node is new, and it has no binary log files containing the GTID set that the Uploader got from S3. If you look into this pull request, the quick patch is to always pick the oldest node in the array, in other words, the node that most likely has the binary logs we need. In the next release of the Operator, we will add more sophisticated logic to reliably discover the node that actually has the oldest binary logs.

Storageless Binlog Uploader

The size of binlogs depends on the cluster usage patterns, so it is hard to predict how much storage or memory they will require. We decided to take this complexity away by making our Binary Log Uploader Pod completely storageless. The mysqlbinlog utility can store remote binlogs only in files, but we need to put them on S3. To get there, we decided to use a named pipe, or FIFO special file: mysqlbinlog writes the binary log into the named pipe, and our Uploader reads from it and streams the data directly to S3.
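
A minimal sketch of this idea, with the Operator’s own uploader replaced by the aws CLI purely for illustration (bucket name, credentials, and the binlog name are assumptions; the actual Pod implements the pipe handling in Go):

# Create a FIFO with the name mysqlbinlog will "write" to
mkfifo binlog.000027
# Reader end: stream whatever arrives in the FIFO straight to S3
cat binlog.000027 | aws s3 cp - s3://my-pitr-bucket/binlogs/binlog.000027 &
# Writer end: mysqlbinlog fetches the remote binlog in raw format into the FIFO
mysqlbinlog --read-from-remote-server --host=cluster1-pxc-0 \
    --user=binlog-uploader --password="$UPLOADER_PASSWORD" --raw binlog.000027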

Also, the storageless design means that we never store any state between Uploader restarts. State is not really needed: we only need to know which GTIDs are already uploaded, and that data is already in the remote S3 bucket. This design enables a continuous upload flow of binlogs.

Binlog Upload Delay

The S3 protocol expects a file to be uploaded completely; if the upload is interrupted (say, the Uploader Pod is evicted), the file will not be accessible or visible on S3 at all. We could potentially lose many hours of binary logs because of such interruptions. That is why we split the binlog stream into files and upload them separately.

One of the options that users can configure when enabling point-in-time recovery in the Percona XtraDB Cluster Operator is timeBetweenUploads. It sets the number of seconds between uploads for the Binlog Uploader Pod. By default, we set it to 60 seconds, but it can go down to one second. We do not recommend setting it too low, as every invocation of the Uploader leads to a FLUSH BINARY LOGS command being executed on the PXC node. We need to flush the logs to close the binary log file so it can be uploaded to external storage, but doing that too frequently may negatively affect IO and, as a result, database performance.
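
In the Custom Resource (deploy/cr.yaml), the point-in-time recovery settings live under the backup section; roughly as sketched below (key names per the 1.7 documentation; the storage name is an example and must reference one of the S3 storages you have configured):

backup:
  pitr:
    enabled: true
    storageName: s3-us-west
    timeBetweenUploads: 60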

Recovery

In the end, it is all about recovery, and it has two steps:

  1. Recover the cluster from a full backup
  2. Apply binary logs

We already have the functionality to restore from a full backup (see here), so let’s get to applying the binary logs.

First, we need to figure out from which GTID set we should start applying binary logs – in other words, where do we start? As we rely on the Percona XtraBackup utility to take full MySQL backups, all we need to do is read the xtrabackup_info file, which contains lots of useful metadata. We already have this file on S3 next to the full backup.
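
For reference, the relevant line of xtrabackup_info looks roughly like this (all values here are illustrative):

binlog_pos = filename 'binlog.000027', position '196850993', GTID of the last change 'a8e657ab-5a47-11eb-bea2-f3554c9e5a8d:1-15'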

Second, we find the binlog that contains the GTID set we need. As you remember, we already store a file with each binlog’s GTID set on S3, so it boils down to reading those files.

Third, we download the binary logs and apply them. Here we rely on mysqlbinlog as well, which has the flags we need, like --stop-datetime, which stops recovery when an event with the specified timestamp is reached in the log.
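
Applying the downloaded binlogs up to a chosen moment boils down to something like the following (hostname, credentials, file names, and the timestamp are illustrative):

mysqlbinlog --stop-datetime="2021-02-20 14:05:00" binlog.000025 binlog.000026 | \
    mysql -h cluster1-pxc-0 -uroot -p"$ROOT_PASSWORD"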

Conclusion

MySQL is more than 25 years old and has a great tooling ecosystem established around it, but as we saw in this blog post, not all of these tools are cloud-native ready. Percona engineering teams are committed to providing users with the same features across various environments, whether it is a bare-metal installation in a data center or cutting-edge Kubernetes in the cloud.

Nov
17
2020
--

Wondering How to Run Percona XtraDB Cluster on Kubernetes? Try Our Operator!

Kubernetes has been a big trend for a while now, and it is particularly well-suited for microservices. Running your main databases on Kubernetes is probably NOT what you are looking for. However, there’s a niche market for them. My colleague Stephen Thorn did a great job explaining this in The Criticality of a Kubernetes Operator for Databases. If you are considering running your database on Kubernetes, have a look at it first. And, if after reading it you start wondering how the Operator works, Stephen also wrote an Introduction to Percona Kubernetes Operator for Percona XtraDB Cluster (PXC), which presents the Kubernetes architecture and how the Percona Operator simplifies the deployment of a full HA PXC cluster in this environment, proxies included!

Now, if you are curious about how it actually works in practice but are afraid the entry barrier is too high, I can help you with that. In fact, this technology is widespread now, with most cloud providers offering a dedicated Kubernetes engine. In this blog post, I’ll walk you through the steps to deploy a Percona XtraDB Cluster (PXC) using the Percona Operator for Kubernetes on Google Cloud Platform (GCP).

Creating a Virtual Environment to Run Kubernetes on GCP

Google Cloud Platform includes among its products the Google Kubernetes Engine (GKE). We can take advantage of their trial offer to create our test cluster there: https://cloud.google.com.

After you sign up, you can access all the bells and whistles in their web interface. Note that the Kubernetes Engine API is not enabled by default; you need to enable it by visiting the Kubernetes Engine section in the left menu, under COMPUTE.

For the purpose of deploying our environment, we should install their SDK and work from the command line: see https://cloud.google.com/sdk/docs/install and follow the respective installation instructions for your OS (you will probably want to install the SDK on your personal computer).

With the SDK installed, we can initialize our environment, which requires authenticating to the Google Cloud account:

gcloud init

You will be prompted to choose a cloud project to use: there’s one created by default when the account is activated, named “My First Project”. It will receive a unique id, which you can verify in the Google Cloud interface, but usually, it is displayed as the first option presented in the prompt.

Alternatively, you can use gcloud config set to configure your default project and zone, among other settings.
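
For example (the project id below is a placeholder; use the one shown for your account):

gcloud config set project my-first-project-123456
gcloud config set compute/zone us-central1-b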

For this exercise, we will be creating a 3-node cluster named k8-test-cluster with n1-standard-4 instances in the us-central1-b zone:

gcloud container clusters create --machine-type n1-standard-4 --num-nodes 3 --zone us-central1-b --cluster-version latest k8-test-cluster

If the command above was successful, you should see your newly created cluster in the list returned by:

gcloud container clusters list

Getting Ready to Work with Kubernetes

Besides the Google Cloud SDK that is used to manage the cloud instances, we also need the Kubernetes command-line tool, kubectl, to manage the Kubernetes cluster. One way to install it is through gcloud itself:

gcloud components install kubectl

This method won’t work for everyone though, as the Cloud SDK component manager is disabled for certain kinds of installation, such as through apt or yum in Linux. I find myself in this group, using Ubuntu, but the failed attempt to install kubectl through gcloud suggested another approach that worked for me:

sudo apt install kubectl

Deploying a PXC Cluster Using the Percona Kubernetes Operator

The Percona operators are available on GitHub. The most straightforward way to obtain a copy is by cloning the operator’s repository. The latest version of the PXC operator is 1.6.0, and we can clone it with the following command:

git clone -b v1.6.0 https://github.com/percona/percona-xtradb-cluster-operator

Move inside the created directory:

cd percona-xtradb-cluster-operator

and run the following sequence of commands:

  1. Define the Custom Resource Definitions for PXC:

    kubectl apply -f deploy/crd.yaml
  2. Create a namespace on Kubernetes and associate it to your current context:

    kubectl create namespace pxc
    kubectl config set-context $(kubectl config current-context) --namespace=pxc
  3. Define Role-Based Access Control (RBAC) for PXC:

    kubectl apply -f deploy/rbac.yaml
  4. Start the operator within Kubernetes:

    kubectl apply -f deploy/operator.yaml
  5. Configure PXC users and their credentials:

    kubectl apply -f deploy/secrets.yaml
  6. Finally, deploy the cluster:

    kubectl apply -f deploy/cr.yaml

You can find a more detailed explanation of each of these steps, as well as how to customize your installation, in the Percona Kubernetes Operator for Percona XtraDB Cluster online documentation, which includes a quickstart guide for GKE.

Now, it is a matter of waiting for the deployment to complete, which you can monitor with:

kubectl get pods

A successful deployment will show output for the above command similar to:

NAME                                               READY   STATUS    RESTARTS   AGE
cluster1-haproxy-0                                 2/2     Running   0          4m21s
cluster1-haproxy-1                                 2/2     Running   0          2m47s
cluster1-haproxy-2                                 2/2     Running   0          2m21s
cluster1-pxc-0                                     1/1     Running   0          4m22s
cluster1-pxc-1                                     1/1     Running   0          2m52s
cluster1-pxc-2                                     1/1     Running   0          111s
percona-xtradb-cluster-operator-79d786dcfb-9lthw   1/1     Running   0          4m37s

As you can see above, the operator will deploy seven pods with the default settings, and those are distributed across the three GKE n1-standard-4 machines we created at first:

kubectl get nodes
NAME                                             STATUS   ROLES    AGE    VERSION
gke-k8-test-cluster-default-pool-02c370e1-gvfg   Ready    <none>   152m   v1.17.13-gke.1400
gke-k8-test-cluster-default-pool-02c370e1-lvh7   Ready    <none>   152m   v1.17.13-gke.1400
gke-k8-test-cluster-default-pool-02c370e1-qn3p   Ready    <none>   152m   v1.17.13-gke.1400

Accessing the Cluster

One way to access the cluster is by creating an interactive shell in the Kubernetes cluster:

kubectl run -i --rm --tty percona-client --image=percona:8.0 --restart=Never -- bash -il

From there, we can access MySQL through the cluster’s HAproxy writer node:

mysql -h cluster1-haproxy -uroot -proot_password

Note that the hostname used above is an alias, with the connection being routed to one of the HAproxy servers available in the cluster. It is also possible to connect to a specific node by modifying the host option -h with the node’s name:

mysql -h cluster1-pxc-0 -uroot -proot_password

This is where all the fun and experimentation starts: you can test and break things without worrying too much as you can easily and quickly start again from scratch.

Destroying the Cluster and Deleting the Test Environment

Once you are done playing with your Kubernetes cluster, you can destroy it with:

gcloud container clusters delete --zone=us-central1-b k8-test-cluster

It’s important to note the command above will not discard the persistent disk volumes that were created and used by the nodes, which you can check with the command:

gcloud compute disks list

A final purging command is required to remove those as well:

gcloud compute disks delete <disk_name_1> <disk_name_2> <disk_name_3> --zone=us-central1-b
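
If you are unsure which disks belong to the test cluster, one way to narrow the list down is to filter by the cluster name prefix (the filter pattern is an assumption based on GKE’s default disk naming):

gcloud compute disks list --filter="name~gke-k8-test-cluster" --format="value(name)"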

If you are feeling overzealous, you can double-check that all has been deleted:

gcloud container clusters list

gcloud compute disks list

Learn More About Percona Kubernetes Operator for Percona XtraDB Cluster

Interested In Hands-On Learning?

Be sure to get in touch with Percona’s Training Department to schedule your PXC Kubernetes training engagement. Our expert trainers will guide your team first through the basics, cover all the configuration noted above (and then some), and then dive deeper into how the operator functions, along with High-Availability exercises, disaster recovery scenarios, backups, restores, and much more.

Oct
30
2020
--

CVE-2020-15180 – Affects Percona XtraDB Cluster

Galera replication technology, a key component of Percona XtraDB Cluster, suffered from a remote code execution vulnerability. Percona has been working with the vendor since early September on this issue and has made releases available to address the problem.

Applicability

To exploit this vulnerability, a malicious party needs access to the WSREP service port (4567/TCP) as well as prior knowledge of the configured Galera cluster name. Successful exploitation leads to remote code execution via the WSREP protocol.

Fixes are available in Percona XtraDB Cluster versions:

>= 8.0.20-11.2

>= 5.7.31-31.45.2

>= 5.6.49-28.42.2

Credits

Percona would like to thank all the Percona staff involved in the resolution of this issue.

More Information

Release notes

Oct
08
2020
--

The Criticality of a Kubernetes Operator for Databases

As a Solutions Engineer at Percona, one of my responsibilities is to support our customers as they investigate new and emerging technologies. This affords me the opportunity to speak to many current and new customers who partner with Percona. The topic of Kubernetes is becoming more popular as companies are investigating and adopting this technology. The issue most companies are encountering is architecting a stateful database that doesn’t fall victim to an environment tuned for ephemeral workloads. This obviously introduces a level of complexity as to how to run a stateful database in an inherently stateless world, as databases are not natively designed for that.

To make your life easier, as a part of the Percona Cloud-Native Autonomous Database Initiative, our engineering teams have built two Kubernetes Operators: Percona Kubernetes Operator for Percona XtraDB Cluster and Percona Kubernetes Operator for Percona Server for MongoDB, which allow Kubernetes Pods to be destroyed, moved, or created with no impact to the application. For an overview of Kubernetes, you can read my previous blog, Introduction to Percona Kubernetes Operator for Percona XtraDB Cluster, which covers this topic. It’s common for companies new to Kubernetes to attempt to run their databases in Kubernetes the same way they would in a traditional environment, but this is not advised, as it introduces the possibility of data loss, and it is not recommended for production workloads. Why is this dangerous, and how has Percona solved it?

Appropriate Workloads for Kubernetes

Kubernetes is not the answer for everyone. It’s not even the answer for most people. Do not be misled into thinking that moving a database into Kubernetes is going to solve any of your problems. Before you consider moving your database into Kubernetes, ensure the rest of your application is cloud-native and can be used with Kubernetes. Moving your database to Kubernetes should happen after you have started scaling both vertically and horizontally and need to orchestrate that elasticity to control costs.

As more companies move to Kubernetes, something has to happen to the legacy workloads. Oftentimes we see a lift-and-shift mentality into Kubernetes, which can be dangerous or cause more work than expected. We have seen two primary ideal use cases for moving database workloads to Kubernetes: microservices and a Unified Abstraction Layer.

Monolithic, large datasets can undermine some of Kubernetes’ strong points: self-healing and availability. This can be an issue due to the time it takes to physically transmit data to a new Pod instance as it joins the database cluster. If your dataset is too large, this process is slow due to physical limitations and hurts the performance and availability of your database. Microservices are a great fit due to their relatively smaller datasets, which allows Kubernetes automation to work well with the dataset size.

Companies looking to take full advantage of cloud-native applications and databases can be a really good fit for Kubernetes as well. If you truly want the ability to deploy and run your databases anywhere utilizing the concept of a Unified Abstraction Layer, Kubernetes is a great option. You can move your databases to anywhere that is running Kubernetes and know it will work.

We talked about large unsharded datasets and the limitations Kubernetes presents when handling them, but we should mention a few more workloads better suited for traditional platforms. Applications with throughput sensitivity may not do well on Kubernetes, or they may not be cost-effective there. Kubernetes is fundamentally designed for container orchestration and is not designed to handle highly performant databases that require low latency. This may be possible to achieve, but at what cost? This applies to highly performant distributed applications as well. Lowest latency across all nodes is not a core tenet of Kubernetes, so ensure you have planned and tested against this before you move everything over to Kubernetes.

Pods Are Cattle, Not Pets

If you’re not familiar with Pets vs Cattle, it’s a DevOps concept that differentiates deployment methodologies of unique servers that require attention when issues arise (pets) versus the ability to replace a server with a copy if issues arise (cattle). Due to the nature of how Kubernetes operates, Pods can be destroyed, spun up, and moved at any time due to factors outside of the application’s control, much like how cattle are treated. Kubernetes uses a scheduler which, by design, can destroy and recreate Pods to meet the configuration needs of your Kubernetes Cluster. This is great for stateless applications, as any failure in the application will result in the Pod containing the application being destroyed and recreated, eliminating the need for human interaction and greatly speeding up the time to resolution. This isn’t ideal for databases, as you don’t want your database to suddenly stop working, halt the application, and introduce the potential for lost or corrupted data. One of the tools Kubernetes provides to help combat this is the StatefulSet. It helps by keeping a Pod’s identity assigned to it as it is destroyed and re-created. This helps facilitate stateful workloads, but how does this come into play with high availability and utilizing the automation aspects of Kubernetes?

Databases Are Pets, Not Cattle

Databases by design need to keep their identity, information, and most importantly their data safe and accessible at all times. They are the backbone of the application, as they are the source of truth an application relies on for normal processing. Any errors in their operation will quickly stop an application from functioning. They are important, to say the least. How can we safely run databases in Kubernetes and still ensure we have highly available database deployments? By using StatefulSets and Persistent Volumes we can maintain data integrity, but we need an additional set of hands to take on database administrator tasks such as ensuring that failover happens and that database members are recovered and rejoin the highly available architecture, along with other technology-specific functions. Fortunately, Kubernetes is extensible and has Operators, which aim to automate the key tasks of a human operator who is managing a service or set of services.

Automation, Automation, Automation

We know the complexities of running a database (safely) in Kubernetes and some of the concepts used to help bridge the gaps between automation and traditional human functions. With the help of Percona’s Kubernetes Operators, we can safely run databases the way they were intended to run. Percona’s Kubernetes Operators are able to automate tasks that are usually done by a database administrator such as:

  • Fully automated deployments with strict consistency and no single point of failure
  • Automated scaling with the ability to change the size parameter to add or remove members of a Cluster or Replica-Set
  • Fully automated backups and restores
  • Fully automated self-healing by automatically recovering from the failure of a single Cluster or Replica-Set member.
  • Automated management of system users, including password rotation
  • Simplified updates

Always Use a Kubernetes Operator

With the complexities of running a highly available database environment and the inherent dangers introduced by using the dynamic Kubernetes environment, an Operator should always be used when deploying databases in Kubernetes. Fortunately, Percona has already solved this by providing Percona Kubernetes Operator for Percona XtraDB Cluster and Percona Kubernetes Operator for Percona Server for MongoDB. Percona provides full support for databases running in Kubernetes with the Percona Operators. If you are interested in learning more or obtaining support or professional services to maximize your database deployments, please reach out to us.

Oct
06
2020
--

Various Ways to Perform Schema Upgrades with Percona XtraDB Cluster

Schema changes are a big challenge in Galera replication, so everyone who uses Percona XtraDB Cluster (PXC)/Galera clusters should understand how schema change operations behave. In this blog, I am going to explain the operation and impact of the various schema change methods used in PXC/Galera clusters.

  • Schema changes with “wsrep_OSU_method = TOI”
  • Schema changes with “wsrep_OSU_method = RSU”
  • Schema changes with “ONLINE ALGORITHMS”
  • Schema changes with “pt-osc”
  • Schema changes with “gh-ost”

For testing:

  • I have configured a 3-node Percona XtraDB Cluster (8.0.19).
  • I am executing a read/write load using sysbench.
mysql> select @@wsrep_cluster_address\G
*************************** 1. row ***************************
@@wsrep_cluster_address: gcomm://pxc81,pxc82,pxc83
1 row in set (0.00 sec)

mysql> select @@version, @@version_comment\G
*************************** 1. row ***************************
        @@version: 8.0.19-10
@@version_comment: Percona XtraDB Cluster (GPL), Release rel10, Revision 727f180, WSREP version 26.4.3
1 row in set (0.00 sec)

What is the Impact of Schema Changes in Clusters?

  • By default (TOI), all the nodes in the cluster will be paused during the ALTER process, because the ALTER needs to be replicated on all the nodes. If the ALTER is big, it will affect performance and could be a cause of downtime.
  • Rollback is not possible for a schema upgrade.
  • You can’t kill the ALTER query immediately during the operation, so your application may need to wait until the ALTER completes.
mysql> pager grep alter
PAGER set to 'grep alter'

mysql> show processlist;
| 19 | root            | localhost | schema_changes | Query   |   18 | altering table           | alter table sbtest1 add index idx_c(c) |         0 |             0 |
7 rows in set (0.00 sec)

mysql> kill 19;
ERROR 1095 (HY000): You are not owner of thread 19

  • Metadata locks (MDLs) are set only on one node, not across all the nodes in the cluster, so you need additional control over this.

Schema Changes with “wsrep_OSU_method = TOI”

TOI: Total Order Isolation

  • TOI is the default method ( wsrep_OSU_method = TOI ) for schema changes.
  • DDL statements are processed in the same order with regard to other transactions in each node. 
  • The full cluster will be blocked/locked during the DDL operation. 
  • This guarantees data consistency.
mysql> select @@wsrep_OSU_method\G
*************************** 1. row ***************************
@@wsrep_OSU_method: TOI
1 row in set (0.00 sec)

Example:

I am going to run the below ALTER on “pxc81”.

alter table sbtest1 add index idx_c(c)

After initiating the ALTER on pxc81, my processlist shows that the COMMIT and UPDATE (from sysbench) statements are paused. Only the ALTER is in progress. The COMMIT and UPDATE will resume once the ALTER is completed.

| 17 | root            | localhost | schema_changes | Execute |      15 | closing tables                           | COMMIT                                 |         0 |             0 |
| 17 | root            | localhost | schema_changes | Execute |      15 | updating                                 | UPDATE sbtest1 SET c='91668836759-30934071579-18064439108-53267873872-79461377960-32104006456-143369 |         0 |             1 |
| 24 | root            | localhost | schema_changes | Query   |      15 | altering table                           | alter table sbtest1 add index idx_c(c) |         0 |             0 |

But a SELECT statement can still run as long as “wsrep_sync_wait != 1”, because “wsrep_sync_wait = 1” requires causality checks from the other nodes; with it enabled, the SELECT will fail, as shown below (a workaround sketch follows the output).

SELECT with “wsrep_sync_wait=1”

|  1 | system user     |           | schema_changes | Query   |     0 | altering table         | alter table sbtest1 add index idx_c(c) |         0 |             0 |
| 15 | root            | localhost | schema_changes | Query   |    40 | starting               | select * from sbtest1 where id=1       |         0 |             0 |

mysql> select * from sbtest1 where id=1;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
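
For illustration, relaxing the causality check for the session lets the read proceed while the DDL runs; this is only a sketch and trades read-your-writes consistency for availability (host and credentials are placeholders):

mysql -h pxc82 -uroot -p"$ROOT_PASSWORD" -e \
    "SET SESSION wsrep_sync_wait=0; SELECT * FROM schema_changes.sbtest1 WHERE id=1;"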

TOI can be the right choice for quick operations.

  • CREATE STATEMENTS
  • RENAME INDEX
  • RENAME TABLE
  • DROP INDEX
  • ALGORITHM=INSTANT

Schema Changes with “wsrep_OSU_method = RSU”

RSU – Rolling Schema Upgrade

  • In this method, DDL statements are not replicated across the cluster nodes; you need to execute the DDL individually on every node.
  • The node that is executing the DDL desyncs from the cluster group. The other nodes in the cluster remain operational and receive the application connections.
  • Once the node finishes the DDL, it starts to apply the missing writesets.
  • In this method, the important thing is that WRITEs should not be performed on that particular table until the schema upgrade completes on all the nodes. Users should be very clear on this, because a failure will break the cluster and the data may be unrecoverable.
  • The Gcache should be large enough to store the writesets.

Example:

On pxc82, I am going to execute the ALTER.

Session 1: (setting up the value to RSU – session-level) 

mysql> set wsrep_OSU_method=RSU;
Query OK, 0 rows affected (0.09 sec)

Session 2: (checking the node status)

mysql> show global status where Variable_name like 'wsrep_local_recv_queue' or Variable_name like 'wsrep_local_state_comment';
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_recv_queue    | 0      |
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
2 rows in set (0.00 sec)

Session 1: (executing the ALTER )

mysql> alter table sbtest1 add index idx_c(c);

Session 2: (checking again the node status )

Here the node went to the Donor/Desynced state once the ALTER started. You can also see the receive queue keeps increasing.

mysql> nopager;  show global status where Variable_name like 'wsrep_local_recv_queue' or Variable_name like 'wsrep_local_state_comment';
PAGER set to stdout
+---------------------------+----------------+
| Variable_name             | Value          |
+---------------------------+----------------+
| wsrep_local_recv_queue    | 2053           |
| wsrep_local_state_comment | Donor/Desynced |
+---------------------------+----------------+
2 rows in set (0.21 sec)

Session 1: (ALTER completed)

mysql> alter table sbtest1 add index idx_c(c);
Query OK, 0 rows affected (2 min 6.52 sec)
Records: 0  Duplicates: 0  Warnings: 0

Session 2: (Node synced to cluster)

mysql> show global status where Variable_name like 'wsrep_local_recv_queue' or Variable_name like 'wsrep_local_state_comment';
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_recv_queue    | 0      |
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
2 rows in set (0.00 sec)

This step needs to be executed on pxc81 and pxc83 as well. After it completes on all nodes, we are good to allow WRITEs to that table again.

The RSU method is not truly disruption-free, as there are a few bugs reported regarding RSU. Users should be very clear and careful about using RSU for schema updates:

https://jira.percona.com/browse/PXC-2620

https://jira.percona.com/browse/PXC-2293

https://jira.percona.com/browse/PXC-1980

Schema Changes with “ONLINE ALGORITHMS”

So far, we have three algorithms:

  • INPLACE
  • COPY
  • INSTANT

With TOI:

“ALGORITHM = INPLACE / COPY” still pauses the cluster during the operation. Galera doesn’t allow transactions when an ALTER TABLE statement is run. So if you are using TOI, any ALTER TABLE will block all transactions on all nodes.

| 17 | root            | localhost | schema_changes | Execute |      12 | closing tables                           | COMMIT                                                               |         0 |             0 |
| 18 | root            | localhost | schema_changes | Execute |      12 | closing tables                           | COMMIT                                                               |         0 |             0 |
| 32 | root            | localhost | schema_changes | Query   |      13 | altering table                           | alter table sbtest1 add index idx_c(c), algorithm=inplace,

“ALGORITHM=INSTANT” is supported and faster in TOI.

mysql> alter table sbtest1 add column test_Ins int , algorithm=instant;
Query OK, 0 rows affected (0.24 sec)
Records: 0  Duplicates: 0  Warnings: 0

With RSU:

“ALGORITHM = INPLACE/COPY” is still not beneficial with RSU. It pauses Galera replication and takes the node to the Desynced state. 

mysql> show processlist;
| 62 | root            | localhost | schema_changes | Query   |    51 | altering table                       | alter table sbtest1 add index idx_c(c), algorithm=inplace, lock=none |         0 |             0 |
5 rows in set (0.06 sec)

mysql> nopager;  show global status where Variable_name like 'wsrep_local_recv_queue' or Variable_name like 'wsrep_local_state_comment';
PAGER set to stdout
+---------------------------+----------------+
| Variable_name             | Value          |
+---------------------------+----------------+
| wsrep_local_recv_queue    | 7335           |
| wsrep_local_state_comment | Donor/Desynced |
+---------------------------+----------------+
2 rows in set (0.03 sec)

“ALGORITHM=INSTANT” is supported and faster in RSU. But, still, you can use TOI to avoid the additional work.

mysql> alter table sbtest1 add column test_Inss int , algorithm=instant;
Query OK, 0 rows affected (0.19 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select @@wsrep_OSU_method;
+--------------------+
| @@wsrep_OSU_method |
+--------------------+
| RSU                |
+--------------------+
1 row in set (0.02 sec)

I would suggest using “ALGORITHM=INSTANT” with TOI wherever you can, but make sure you are on MySQL 8.x+. Unfortunately, “ALGORITHM=INSTANT” currently only supports adding new columns.

Schema Changes with “pt-osc”

pt-osc: pt-online-schema-change (from Percona Toolkit)

Personally, I like this approach very much and use it mostly in production environments. Pt-osc provides non-blocking schema upgrades on all nodes in one shot. It should be used with the TOI method. The action flow is as follows:

  • Creates a new table “_tablename_new” with the required modification
  • Creates triggers to capture the modified rows (insert/update/delete)
  • Copies the records from the original table to the new table in chunks
  • Once the copy is completed, it swaps the tables (original → _old, _new → original) and drops the triggers and the old table. Direct DDLs (RENAME TABLE, DROP TABLE) are used for this operation (wsrep_OSU_method=TOI).

For the below ALTER, 

alter table schema_changes.sbtest1 add index idx_test_Ins(test_Ins);

Pt-osc flow in SQL syntax:

Creating new table...
CREATE TABLE `schema_changes`.`_sbtest1_new` (
  `id` int NOT NULL AUTO_INCREMENT,
  `k` int NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  `test_Ins` int DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=20400335 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
Created new table schema_changes._sbtest1_new OK.
Altering new table...
ALTER TABLE `schema_changes`.`_sbtest1_new` add index idx_test_Ins(test_Ins)
Altered `schema_changes`.`_sbtest1_new` OK.
Not creating triggers because this is a dry run.
Not copying rows because this is a dry run.
INSERT LOW_PRIORITY IGNORE INTO `schema_changes`.`_sbtest1_new` (`id`, `k`, `c`, `pad`, `test_ins`) SELECT `id`, `k`, `c`, `pad`, `test_ins` FROM `schema_changes`.`sbtest1` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= ?)) AND ((`id` <= ?)) LOCK IN SHARE MODE /*pt-online-schema-change 9253 copy nibble*/
SELECT /*!40001 SQL_NO_CACHE */ `id` FROM `schema_changes`.`sbtest1` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= ?)) ORDER BY `id` LIMIT ?, 2 /*next chunk boundary*/
Not swapping tables because this is a dry run.
Not dropping old table because this is a dry run.
Not dropping triggers because this is a dry run.
DROP TRIGGER IF EXISTS `schema_changes`.`pt_osc_schema_changes_sbtest1_del`
DROP TRIGGER IF EXISTS `schema_changes`.`pt_osc_schema_changes_sbtest1_upd`
DROP TRIGGER IF EXISTS `schema_changes`.`pt_osc_schema_changes_sbtest1_ins`
2020-09-30T08:31:17 Dropping new table...
DROP TABLE IF EXISTS `schema_changes`.`_sbtest1_new`;
2020-09-30T08:31:17 Dropped new table OK.

Pt-osc provides several options to perform the operation efficiently. You can control the connections, active threads, load, chunk size, etc.

For Galera, we have the option “--max-flow-ctl”. The option will check the average time the cluster spent paused for flow control and make the tool pause if it goes over the percentage indicated in the option. By default, the tool will not check the FC.

[root@pxc81 log]# less /bin/pt-online-schema-change  | grep -i pausing
         print STDERR "Pausing because PXC Flow Control is active\n";
         print STDERR "Pausing because "

To make schema changes on tables with FOREIGN KEYs, I would suggest using “--alter-foreign-keys-method=rebuild_constraints”. This helps to maintain the consistency of the schema and its relations. In this approach, before dropping the old table, pt-osc runs ALTER on all the child tables to drop the existing FKs and re-add new FK constraints that point to the columns of the new table. Again, adding and dropping the FOREIGN KEYs will be direct ALTERs using TOI.
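
Putting it together, a possible invocation for the index example above could look like this (host, credentials, and thresholds are illustrative; running with --dry-run first produces output like the one shown earlier):

pt-online-schema-change --alter "add index idx_test_Ins(test_Ins)" \
    D=schema_changes,t=sbtest1 --host=pxc81 --user=root --ask-pass \
    --max-flow-ctl 25 --chunk-size 1000 \
    --alter-foreign-keys-method=rebuild_constraints --execute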

Schema Changes with “gh-ost”

Gh-ost takes an approach similar to pt-osc. It also performs non-blocking ALTERs on all cluster nodes in one shot. The main difference is that gh-ost is triggerless: it uses the binary log to track the changes. So you need the following settings for a gh-ost operation.

log-bin=sakthi-bin
binlog-format=ROW
log-slave-updates=ON

The flow is as follows:

  • Creates a ghost table with the required modifications
  • Copies the records from the original table to the new table in chunks
  • Applies the live changes by reading the DML events from the binary logs
  • Once the binary log events are applied, it swaps the tables (original → _old, ghost → original) and drops the old table.

Example:

[root@pxc81 schema_changes]# gh-ost --alter="add index idx_test_Inss(test_Ins)" --database=schema_changes --table=sbtest1 --user=root --password=Jesus@7sakthI --allow-on-master --execute
[2020/09/30 09:40:56] [info] binlogsyncer.go:133 create BinlogSyncer with config {99999 mysql 127.0.0.1 3306 root    false false <nil> false UTC true 0 0s 0s 0 false}
[2020/09/30 09:40:56] [info] binlogsyncer.go:354 begin to sync binlog from position (binlog.000027, 196850993)
[2020/09/30 09:40:56] [info] binlogsyncer.go:203 register slave for master server 127.0.0.1:3306
[2020/09/30 09:40:56] [info] binlogsyncer.go:723 rotate to (binlog.000027, 196850993)
# Migrating `schema_changes`.`sbtest1`; Ghost table is `schema_changes`.`_sbtest1_gho`
# Migrating pxc81:3306; inspecting pxc81:3306; executing on pxc81
# Migration started at Wed Sep 30 09:40:56 +0000 2020
# chunk-size: 1000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle 
# Serving on unix socket: /tmp/gh-ost.schema_changes.sbtest1.sock
Copy: 0/6563240 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: binlog.000027:196853401; Lag: 0.02s, State: migrating; ETA: N/A
Copy: 0/6563240 0.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 1s(copy); streamer: binlog.000027:196858195; Lag: 0.01s, State: migrating; ETA: N/A
Copy: 22000/6563240 0.3%; Applied: 0; Backlog: 0/1000; Time: 2s(total), 2s(copy); streamer: binlog.000027:201067135; Lag: 0.01s, State: migrating; ETA: 9m58s

.......

Copy: 5682000/6563240 86.6%; Applied: 0; Backlog: 0/1000; Time: 16m10s(total), 16m10s(copy); streamer: binlog.000028:213168607; Lag: 0.01s, State: migrating; ETA: 2m30s
Copy: 6563000/6563240 100.0%; Applied: 0; Backlog: 0/1000; Time: 20m20s(total), 20m20s(copy); streamer: binlog.000028:382677405; Lag: 0.01s, State: migrating; ETA: 0s

Gh-ost also provides several options to perform the operation efficiently. You can control the connections, active threads, load, chunk size, etc.

But unfortunately, the “--max-flow-ctl” option is not available in gh-ost. 

Conclusion

So, finally, I would say:

  • Always use the direct ALTER with TOI for the metadata changes and INSTANT ALTERs.
  • Use pt-online-schema-change with TOI and use the optimal flow control thresholds for InnoDB tables.
  • Schedule pt-online-schema-change operation in off-peak hours for FOREIGN KEY tables.
  • If you use RSU, never forget that you need to execute the ALTER on all nodes individually, and you should block WRITEs to that particular table. Make sure your Gcache size is good enough to hold the writesets.
  • If you are concerned about triggers, you can use gh-ost for the ALTERs.

Oct
23
2019
--

ProxySQL 2.0.7 and proxysql-admin Tool Now Available

ProxySQL 2.0.7, released by ProxySQL, is now available for download in the Percona Repository along with Percona’s proxysql-admin tool.

ProxySQL is a high-performance proxy, currently for MySQL and database servers in the MySQL ecosystem (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. René Cannaò created ProxySQL for DBAs as a means of solving complex replication topology issues.

This release includes ProxySQL 2.0.7 which introduces many new features and enhancements and also fixes a number of bugs.

The proxysql-admin tool now supports MariaDB 10.4.

New Features

Improvements

  • PSQLADM-195: A new option --with-stats-reset has been added to the proxysql-status script to display the  *_reset tables from the stats database. If this option is not specified, these tables are not displayed by default.

Bugs Fixed

  • PSQLADM-157: In some cases, the proxysql-status script used the cat command to display a file without checking if the file existed and was readable.
  • PSQLADM-181: When run with --update-cluster --write-node=<node_name>, the proxysql-admin tool now verifies that the writer nodes are not read-only.

The ProxySQL 2.0.7 source and binary packages available from the Percona download page for ProxySQL include ProxySQL Admin – a tool developed by Percona to configure Percona XtraDB Cluster nodes into ProxySQL. Docker images for release 2.0.7 are available as well. You can download the original ProxySQL from GitHub. GitHub hosts the documentation in the wiki format.

ProxySQL is available under Open Source license GPLv3.

Oct
07
2019
--

Achieving Disaster Recovery with Percona XtraDB Cluster

One thing that comes up often from working with a variety of clients at Percona is “How can I achieve a Disaster Recovery (DR) solution with Percona XtraDB Cluster (PXC)?”  Unfortunately, decisions with far-reaching consequences are sometimes made by individuals who do not fully understand the architecture and its limitations.  As a Technical Account Manager (TAM), I am often engaged to help clients look for better solutions, or at least to mitigate as many issues as possible.  Clearly, in a perfect world, we would like to get the right experts involved in these types of discussions to ensure more appropriate solutions, but we all know this is not a perfect world.

One such example involves the idea that if we take a PXC cluster and split it into two datacenters with two nodes in a primary datacenter and one node in a separate datacenter, we will have a hot standby node at all times.  In this case, the application can be pointed to the third node in the event of something catastrophic.  This sounds great…in theory.  The problem is latency.

Latency can cripple a PXC cluster

By design, PXC is meant to work with nodes that can communicate with one another quickly.  The underlying cluster technology, known as Galera, is considered “virtually synchronous” in nature.  In this architecture, writesets are replicated to all active nodes in the cluster at transaction commit and go into a queue.  Next, each node performs a certification of the writeset, which is deterministic in nature.  Bugs notwithstanding, each node will either accept or reject the writeset in the same manner.  So, either the writeset is applied on all nodes or it is rolled back by all nodes.  What matters in this discussion is the write queue.

As writes come in on one of the nodes, the writesets are replicated to each of the other nodes.  In the above three-node cluster, the writes are certified quickly with the two nodes in the same datacenter.  However, the third node is located in a different datacenter some distance away.  In this case, the writeset must travel across the WAN and will go into a queue (wsrep_local_recv_queue).

So, what’s the problem?

To ensure that one node does not get too far behind the rest of the cluster, any node can send a flow control message to the cluster.  This instructs the cluster to stop replicating new events until the slow node catches up to within some number of writesets as defined by gcs.fc_limit in the configuration.  Essentially, when the number of transactions in the queue exceeds the gcs.fc_limit, flow control messages will be sent and the cluster will stop replicating new writesets.  Unless you have changed it, this will be 16 writesets.
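
For reference, this limit is set through the Galera provider options in my.cnf; the values below are purely illustrative and should be tuned for your workload (raising the limit only masks, and does not remove, WAN latency):

wsrep_provider_options="gcs.fc_limit=256; gcs.fc_factor=0.99"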

Remember that PXC is virtually synchronous

When replication stops, all nodes stop accepting writes momentarily.  In this event, the system seems to stall until the local recv queue makes some space for new writesets, at which point replication will continue.  This can appear as a stall to the application and leads to huge performance issues.

So, what is a better solution to the above situation?  It is preferable to utilize an asynchronous Slave server replicating from the PXC cluster for failover.  This is the standard replication built into Percona Server, not Galera.  While this may mean adding another server, the cost could be mitigated by the use of garbd (Galera Arbitrator).  This lightweight process acts as an arbitrator to maintain quorum of the cluster, and thus lets you decrease the number of data nodes in PXC by running garbd on an app server or some other server in the environment.  This keeps the server count down in cases that are cost-sensitive.

The asynchronous nature means no sending of flow controls from the node in the remote datacenter.  Replication will lag and catch up as needed with the PXC cluster none the wiser.  Because all nodes in the PXC cluster are local to one another, ideally latency is minimal and writesets can be applied much more quickly and stalls minimized.  Of course, this does come with a few challenges.

One challenge is that due to the nature of asynchronous replication, there can be significant lag in the DR node.  This is not always an issue, however, as the only time you are using this server is during a disaster.  The writes will be sent over immediately by the Master, so it is reasonable to expect that the DR node will eventually catch up and you hope to not lose any data, although there are no guarantees.

This brings us to another concern.  Simple asynchronous replication has no guarantee of consistency like PXC provides.  In PXC, there are controls in place to guarantee consistency, but asynchronous replication provides none.  There are, therefore, cases where data drift can occur between Master and Slave.  To mitigate this risk, you can use pt-table-checksum from the Percona Toolkit to detect inconsistency between the Master node of the PXC cluster and the Slave and rectify it with pt-table-sync from the same toolkit.  This, of course, requires that you run this process often.  If it is done as part of an automated process, it should also be monitored to ensure it is being done and whether or not data drift is occurring.
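
A hedged sketch of such a periodic check (host names, credentials, and the database name are placeholders; review both tools’ documentation before running them against production):

pt-table-checksum h=pxc-node1,u=checksum_user,p=secret --databases=app_db --no-check-binlog-format
pt-table-sync --print --sync-to-master h=dr-slave,u=checksum_user,p=secret --databases=app_db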

You will also want to monitor that the Master node does not go down, as there is no built-in process of failing the DR node over to a new Master within the PXC cluster.  Our very own Yves Trudeau wrote a utility to manage this, and more information can be found here: https://github.com/y-trudeau/Mysql-tools/tree/master/PXC

Improving Performance

While this solution presents some additional complexity, it does provide for a more performant PXC cluster.  As a TAM, I have seen geographically-distributed PXC result in countless incidents of a system down in production.  Even when the system doesn’t go down, it often slows down due to the latency issues.  Why take that performance impact on every transaction for a DR solution that you hope to never use, when there is an alternative solution as proposed here?  Instead, you could benefit from an alternative approach that provides an acceptable failover solution while improving performance day in and day out.

Oct
04
2019
--

Percona XtraDB Cluster 8.0 (experimental release) : SST Improvements

Starting with the experimental release of Percona XtraDB Cluster 8.0, we have made changes to the SST process to make the process more robust and easier to use.

  • mysqldump and rsync are no longer supported SST methods.

    Support for mysqldump was deprecated starting with PXC 5.7 and has now been completely removed.

    MySQL 8.0 introduced a new Redo Log format that limited the use of rsync while upgrading from PXC 5.7 to 8.0. In addition, the new Galera-4 also introduced changes that further limit the use of rsync.

    The only supported SST method is xtrabackup-v2.

  • A separate Percona XtraBackup installation is no longer required.

    The required Percona XtraBackup (PXB) binaries are now shipped as part of PXC 8.0, but they are not installed for general use. So if you want to use PXB outside of an SST, you will have to install PXB separately.

  • SST logging now uses MySQL error logging

    Previously, the SST script would write directly to the error log file. Now, the SST script uses MySQL error logging. A side effect of this change is that the SST logs are not immediately visible. This is due to the logging subsystem being initialized after the SST has completed.

  • The wsrep_sst_auth variable has been removed.

    PXC 8.0 now creates an internal user (mysql.pxc.sst.user) with a random password for use by PXB to take the backup. The cleartext of the password is not saved and the user is deleted after the SST has completed.

    (This feature is still in development and may change before PXC 8.0 GA)

  • PXC SST auto-upgrade

    When PXC 8.0 detects that the SST came from a lower version, mysql_upgrade is automatically invoked. Also, “RESET SLAVE ALL” is run on the new node if needed. This is invoked when receiving an SST from either PXC 5.7 or PXC 8.0.

    (This feature is still in development and may change before PXC 8.0 GA)

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Oct
03
2019
--

Percona XtraDB Cluster 8.0 New Feature: wsrep_sst_auth Removal

The problem

In PXC 5.6 and 5.7, when using xtrabackup-v2 as the SST method, the DBA must create a user with the appropriate privileges for use by Percona XtraBackup (PXB). The username and password of this backup user are specified in the wsrep_sst_auth variable.

This is a problem because the username and password were stored in plaintext, which required the configuration file to be secured.

The PXC 8.0 solution

(This feature is still under development and may change before PXC 8.0 GA)

Because the wsrep_sst_auth is only needed on the donor side to take a backup, PXC 8.0 uses an internal user (created specifically for use by PXC) with a randomly generated password. Since this user is only needed on the donor, the plaintext password is not needed on the joiner node.

This password consists of 32 characters generated at random. A new password is generated for each SST request. The plaintext of the password is never saved and never leaves the node. The username/password is sent to the SST script via unnamed pipes (stdin).

New PXC internal user accounts

mysql.pxc.internal.session

The mysql.pxc.internal.session user account provides the appropriate security context to create and set up the other PXC accounts. This account has a limited set of privileges, just enough to create the mysql.pxc.sst.user account.

This account is locked and cannot be used to login (the password field will not allow login).

mysql.pxc.sst.user

The mysql.pxc.sst.user account is used by XtraBackup to perform the backup. This account has the full set of privileges needed by XtraBackup. This account is created for an SST and is dropped at the end of the SST and also when the PXC node is shut down. The creation/provisioning of this user account is not written to the binlog and is not replicated to other nodes. However, this account is sent with the backup to the joiner node, so the joiner node also has to drop this user after the SST has finished.

mysql.pxc.sst.role

The mysql.pxc.sst.role is the MySQL role that provides the privileges needed for XtraBackup. This allows for easy addition/removal of privileges needed for an SST.

The experimental release of PXC is based on MySQL 8.0.15, and we have not implemented the role-based support due to issues found with MySQL 8.0.15. This will be revisited in future versions of PXC 8.0.

Program flow

  1. DONOR node receives SST request from the JOINER
  2. DONOR node generates a random password and creates the internal SST user
    SET SESSION sql_log_bin = OFF;
    DROP USER IF EXISTS 'mysql.pxc.sst.user'@localhost;
    CREATE USER 'mysql.pxc.sst.user'@localhost IDENTIFIED WITH 'mysql_native_password' BY 'XXXXXXXX' ACCOUNT LOCK;
    GRANT 'mysql.pxc.sst.role'@localhost TO 'mysql.pxc.sst.user'@localhost;
    SET DEFAULT ROLE 'mysql.pxc.sst.role'@localhost to 'mysql.pxc.sst.user'@localhost;
    ALTER USER 'mysql.pxc.sst.user'@localhost ACCOUNT UNLOCK;

    The code that uses the role is not used in the current release due to issues with MySQL 8.0.15. Currently, we create the user with all the needed permissions explicitly.

  3. Launch the SST script (passing the username/password via stdin)
  4. SST uses the username/password to perform the backup
  5. SST script exits
  6. The DONOR node drops the user.
  7. The JOINER node receives the backup and drops the user. Note that the JOINER node also contains the internal SST user!

As a precaution, the user is also dropped when the server is shutdown.

Oct
01
2019
--

Experimental Binary of Percona XtraDB Cluster 8.0

Percona is happy to announce the first experimental binary of Percona XtraDB Cluster 8.0 on October 1, 2019. This is a major step for tuning Percona XtraDB Cluster to be more cloud- and user-friendly. This release combines the updated and feature-rich Galera 4 with substantial improvements made by our development team.

Improvements and New Features

Galera 4, included in Percona XtraDB Cluster 8.0, has many new features. Here is a list of the most essential improvements:

  • Streaming replication supports large transactions
  • The synchronization functions allow action coordination (wsrep_last_seen_gtid, wsrep_last_written_gtid, wsrep_sync_wait_upto_gtid)
  • More granular and improved error logging. wsrep_debug is now a multi-valued variable to assist in controlling the logging, and logging messages have been significantly improved.
  • Some DML and DDL errors on a replicating node can either be ignored or suppressed. Use the wsrep_ignore_apply_errors variable to configure.
  • Multiple system tables help find out more about the state of the cluster.
  • The wsrep infrastructure of Galera 4 is more robust than that of Galera 3. It features a faster execution of code with better state handling, improved predictability, and error handling.

Percona XtraDB Cluster 8.0 has been reworked in order to improve security and reliability as well as to provide more information about your cluster:

  • There is no need to create a backup user or maintain the credentials in plain text (a security flaw). An internal SST user is created, with a random password for making a backup, and this user is discarded immediately once the backup is done.
  • Percona XtraDB Cluster 8.0 now automatically launches the upgrade as needed (even for minor releases). This avoids manual intervention and simplifies the operation in the cloud.
  • SST (State Snapshot Transfer) rolls back or fixes an unwanted action. It is no longer just “a copy only block” but a smart operation that makes the best use of the copy phase.
  • Additional visibility statistics are introduced in order to obtain more information about Galera internal objects. This enables easy tracking of the state of execution and flow control.

Installation

You can only install this release from a tarball and it, therefore, cannot be installed through a package management system, such as apt or yum. Note that this release is not ready for use in any production environment.

Percona XtraDB Cluster 8.0 is based on the following:

Please be aware that this release will not be supported in the future, and as such, neither the upgrade to this release nor the downgrade from higher versions is supported.

This release is also packaged with Percona XtraBackup 8.0.5. All Percona software is open-source and free.

In order to experiment with Percona XtraDB Cluster 8.0 in your environment, download and unpack the tarball for your platform.

Note

Be sure to check your system and make sure that the packages are installed which Percona XtraDB Cluster 8.0 depends on.

For Debian or Ubuntu:

$ sudo apt-get install -y \
socat libdbd-mysql-perl \
rsync libaio1 libc6 libcurl3 libev4 libgcc1 libgcrypt20 \
libgpg-error0 libssl1.1 libstdc++6 zlib1g libatomic1

For Red Hat Enterprise Linux or CentOS:

$ sudo yum install -y openssl socat  \
procps-ng chkconfig procps-ng coreutils shadow-utils \
grep libaio libev libcurl perl-DBD-MySQL perl-Digest-MD5 \
libgcc rsync libstdc++ libgcrypt libgpg-error zlib glibc openssl-libs

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!
