Nov
17
2020
--

Wondering How to Run Percona XtraDB Cluster on Kubernetes? Try Our Operator!

Run Percona XtraDB Cluster on Kubernetes

Run Percona XtraDB Cluster on KubernetesKubernetes has been a big trend for a while now, particularly well-suited for microservices. Running your main databases on Kubernetes is probably NOT what you are looking for. However, there’s a niche market for them. My colleague Stephen Thorn did a great job explaining this in The Criticality of a Kubernetes Operator for Databases. If you are considering running your database on Kubernetes, have a look at it first. And, if after reading it you start wondering how the Operator works, Stephen also wrote an Introduction to Percona Kubernetes Operator for Percona XtraDB Cluster (PXC), which presents the Kubernetes architecture and how the Percona Operator simplifies the deployment of a full HA PXC cluster in this environment, proxies included!

Now, if you are curious about how it actually works in practice but are afraid the entry barrier is too high, I can help you with that. In fact, this technology is widespread now, with most cloud providers offering a dedicated Kubernetes engine. In this blog post, I’ll walk you over the steps on how to deploy a Percona XtraDB Cluster (PXC) using the Percona Operator for Kubernetes on Google Cloud Platform (GCP).

Creating a Virtual Environment to Run Kubernetes on GCP

Google Cloud Platform includes among its products the Google Kubernetes Engine (GKE). We can take advantage of their trial offer to create our test cluster there: https://cloud.google.com.

After you sign up, you can access all the bells and whistles in their web interface. Note the Kubernetes Engine API is not enabled by default, you need to do it by visiting the Kubernetes Engine section in the left menu, under COMPUTE.

For the purpose of deploying our environment, we should install their SDK and work from the command line: see https://cloud.google.com/sdk/docs/install and follow the respective installation instructions for your OS (you will probably want to install the SDK on your personal computer).

With the SDK installed, we can initialize our environment, which requires authenticating to the Google Cloud account:

gcloud init

You will be prompted to choose a cloud project to use: there’s one created by default when the account is activated, named “My First Project”. It will receive a unique id, which you can verify in the Google Cloud interface, but usually, it is displayed as the first option presented in the prompt.

Alternatively, you can use gcloud config set to configure your default project and zone, among other settings.

For this exercise, we will be creating a 3-node cluster named k8-test-cluster with n1-standard-4 instances in the us-central1-b zone:

gcloud container clusters create --machine-type n1-standard-4 --num-nodes 3 --zone us-central1-b --cluster-version latest k8-test-cluster

If the command above was successful, you should see your newly created cluster in the list returned by:

gcloud container clusters list

Getting Ready to Work with Kubernetes

Besides the Google Cloud SDK that is used to manage the cloud instances, we also need the Kubernetes command-line tool, kubectl, to manage the Kubernetes cluster. One way to install it is through gcloud itself:

gcloud components install kubectl

This method won’t work for everyone though, as the Cloud SDK component manager is disabled for certain kinds of installation, such as through apt or yum in Linux. I find myself in this group, using Ubuntu, but the failed attempt to install kubectl through gcloud suggested another approach that worked for me:

sudo apt install kubectl

Deploying a PXC Cluster Using the Percona Kubernetes Operator

The Percona operators are available on Github. The most straightforward way to obtain a copy is by cloning the operator’s repository. The latest version of the PXC operator is 1.6.0 and we can clone it with the following command:

git clone -b v1.6.0 https://github.com/percona/percona-xtradb-cluster-operator

Move inside the created directory:

cd percona-xtradb-cluster-operator

and run the following sequence of commands:

  1. Define the Custom Resource Definitions for PXC:

    kubectl apply -f deploy/crd.yaml
  2. Create a namespace on Kubernetes and associate it to your current context:

    kubectl create namespace pxc
    kubectl config set-context $(kubectl config current-context) --namespace=pxc
  3. Define Role-Based Access Control (RBAC) for PXC:

    kubectl apply -f deploy/rbac.yaml
  4. Start the operator within Kubernetes:

    kubectl apply -f deploy/operator.yaml
  5. Configure PXC users and their credentials:

    kubectl apply -f deploy/secrets.yaml
  6. Finally, deploy the cluster:

    kubectl apply -f deploy/cr.yaml

You can find a more detailed explanation of each of these steps, as well as how to customize your installation, in the Percona Kubernetes Operator for Percona XtraDB Cluster online documentation, which includes a quickstart guide for GKE.

Now, it is a matter of waiting for the deployment to complete, which you can monitor with:

kubectl get pods

A successful deployment will show output for the above command similar to:

NAME                                               READY   STATUS    RESTARTS   AGE
cluster1-haproxy-0                                 2/2     Running   0          4m21s
cluster1-haproxy-1                                 2/2     Running   0          2m47s
cluster1-haproxy-2                                 2/2     Running   0          2m21s
cluster1-pxc-0                                     1/1     Running   0          4m22s
cluster1-pxc-1                                     1/1     Running   0          2m52s
cluster1-pxc-2                                     1/1     Running   0          111s
percona-xtradb-cluster-operator-79d786dcfb-9lthw   1/1     Running   0          4m37s

As you can see above, the operator will deploy seven pods with the default settings, and those are distributed across the three GKE n1-standard-4 machines we created at first:

kubectl get nodes
NAME                                             STATUS   ROLES    AGE    VERSION
gke-k8-test-cluster-default-pool-02c370e1-gvfg   Ready    <none>   152m   v1.17.13-gke.1400
gke-k8-test-cluster-default-pool-02c370e1-lvh7   Ready    <none>   152m   v1.17.13-gke.1400
gke-k8-test-cluster-default-pool-02c370e1-qn3p   Ready    <none>   152m   v1.17.13-gke.1400

Accessing the Cluster

One way to access the cluster is by creating an interactive shell in the Kubernetes cluster:

kubectl run -i --rm --tty percona-client --image=percona:8.0 --restart=Never -- bash -il

From there, we can access MySQL through the cluster’s HAproxy writer node:

mysql -h cluster1-haproxy -uroot -proot_password

Note the hostname used above is an alias, the connection being routed to one of the HAproxy servers available in the cluster. It is also possible to connect to a specific node by modifying the host option -h with the node’s name:

mysql -h cluster1-pxc-0 -uroot -proot_password

This is where all the fun and experimentation starts: you can test and break things without worrying too much as you can easily and quickly start again from scratch.

Destroying the Cluster and Deleting the Test Environment

Once you are done playing with your Kubernetes cluster, you can destroy it with:

gcloud container clusters delete --zone=us-central1-b k8-test-cluster

It’s important to note the command above will not discard the persistent disk volumes that were created and used by the nodes, which you can check with the command:

gcloud compute disks list

A final purging command is required to remove those as well:

gcloud compute disks delete <disk_name_1> <disk_name_2> <disk_name_3> --zone=us-central1-b

If you are feeling overzealous, you can double-check that all has been deleted:

gcloud container clusters list

gcloud compute disks list

Learn More About Percona Kubernetes Operator for Percona XtraDB Cluster

Interested In Hands-On Learning?

Be sure to get in touch with Percona’s Training Department to schedule your PXC Kubernetes training engagement. Our expert trainers will guide your team firstly through the basics, cover all the configuration noted above (and then some), and then dive deeper into how the operator functions along with High-Availability exercises, disaster recovery scenarios, backups, restore, and much more.

Oct
30
2020
--

CVE-2020-15180 – Affects Percona XtraDB Cluster

CVE-2020-15180

CVE-2020-15180Galera replication technology, a key component of Percona XtraDB Cluster, suffered from a remote code execution vulnerability. Percona has been working with the vendor since early September on this issue and has made releases available to address the problem.

Applicability

A malicious party with access to the WSREP service port (4567/TCP) as well as prerequisite knowledge of the configuration of the Galera cluster name is required in order to exploit this vulnerability, which leads to remote code execution via the WSREP protocol. 

Fixes are available in Percona XtraDB Cluster versions:

>= 8.0.20-11.2

>= 5.7.31-31.45.2

>= 5.6.49-28.42.2

Credits

Percona would like to thank all the Percona staff involved in the resolution of this issue.

More Information

Release notes

Oct
08
2020
--

The Criticality of a Kubernetes Operator for Databases

Importance of Kubernetes Operators for Databases

Importance of Kubernetes Operators for DatabasesAs a Solutions Engineer at Percona, one of my responsibilities is to support our customers as they investigate new and emerging technologies. This affords me the opportunity to speak to many current and new customers who partner with Percona. The topic of Kubernetes is becoming more popular as companies are investigating and adopting this technology. The issue most companies are encountering is architecting a stateful database that doesn’t fall victim to an environment tuned for ephemeral workloads. This obviously introduces a level of complexity as to how to run a stateful database in an inherently stateless world, as databases are not natively designed for that.

To make your life easier, as a part of the Percona Cloud-Native Autonomous Database Initiative, our engineering teams have built two Kubernetes Operators: Percona Kubernetes Operator for Percona XtraDB Cluster and Percona Kubernetes Operator for Percona Server for MongoDB, which allows for Kubernetes Pods to be destroyed, moved, or created with no impact to the application. To see an overview of Kubernetes, you can read this previous blog of mine Introduction to Percona Kubernetes Operator for Percona XtraDB Cluster that covers this topic. It’s common for companies new to Kubernetes to attempt to run their databases in Kubernetes the same way they would in a traditional environment. But, this is not advised as it introduces the possibility of data loss and it is not recommended for production workloads. Why is this dangerous and how has Percona solved this?

Appropriate Workloads for Kubernetes

Kubernetes is not the answer for everyone. It’s even not the answer for most people. Do not be misled into thinking that moving a database into Kubernetes is going to solve any of your problems. Before you consider moving your database into Kubernetes, ensure the rest of your application is cloud-native and can be used with Kubernetes. Moving your database to Kubernetes should happen after you have started both elastic vertical and horizontal scale and need to orchestrate it to control costs.

As more companies are moving to Kubernetes something has to happen to the legacy workloads. Oftentimes we see a lift and shift mentality into Kubernetes, which can be dangerous or cause more work than expected. We have seen two primary ideal use cases for moving database workloads to Kubernetes: Microservices and Unified Abstraction Layer.

Monolithic, large datasets can prohibit some of Kubernetes’ strong points: self-healing and availability. This can be an issue due to the time it takes to physically transmit data to a new Pod instance as it joins the database cluster. If your dataset is too large, this process is slow due to physical limitations and prohibits performance and the availability of your database. Microservices are a great fit due to the relatively smaller datasets, which allows Kubernetes automation to work well with the dataset size.

Companies looking to take full advantage of cloud-native applications and databases can be a really good fit for Kubernetes as well. If you truly want the ability to deploy and run your databases anywhere utilizing the concept of a Unified Abstraction Layer, Kubernetes is a great option. You can move your databases to anywhere that is running Kubernetes and know it will work.

We talked about large unsharded datasets and the limitations Kubernetes presents when handling them, but we should mention a few more workloads better suited for traditional platforms. Applications with a throughput sensitivity may not do well on Kubernetes, or they may not be cost-effective to do so. Kubernetes is fundamentally designed for container orchestration and is not designed to handle highly performant databases that require low latency. This may be possible to achieve, but at what cost? This applies to highly performant distributed applications as well. Lowest latency across all nodes is not a core tenant of Kubernetes, so ensure you have planned and tested against this before you move everything over to Kubernetes.

Pods Are Cattle, Not Pets

If you’re not familiar with Pets vs Cattle, it’s a DevOps concept that differentiates deployment methodologies of unique servers that require attention when issues arise (pets) versus the ability to replace a server with a copy if issues arise (cattle). Due to the nature of how Kubernetes operates, Pods can be destroyed, spun up, and moved at any time due to factors outside of the application’s control, much like how cattle are treated. Kubernetes uses a scheduler, which by design, can destroy and recreate Pods to meet the configuration needs of your Kubernetes Cluster. This is great for stateless applications as any failure in the application will result in a Pod containing the application being destroyed and recreated, eliminating the need for human interaction, and greatly speeding up the process to a resolution. This isn’t ideal for databases as you don’t want your database to suddenly stop working, halt the application, and introduce the potential for lost or corrupted data. One of the tools Kubernetes can utilize to help combat this is called Stateful Sets. These help by keeping a Pod’s identity assigned to it as it is destroyed and re-created. This helps facilitate stateful workloads, but how does this come into play with high availability and utilizing the automation aspects of Kubernetes?

Databases Are Pets, Not Cattle

Databases by design need to keep their identity, information, and most importantly their data safe and accessible at all times. They are the backbones of the application as they are the source of truth an application relies on for normal processing. Any errors in their operations will quickly stop an application from functioning. They are important, to say the least. How can we safely run databases in Kubernetes and still ensure we have highly available database deployments? By using Stateful Sets and Persistent Volumes we can maintain data integrity, but we need an additional set of hands to take on database administrator tasks such as ensuring failover happens, database members are recovered, and re-join the highly available architecture, along with other technology-specific functions. Fortunately, Kubernetes is extensible and has Operators, which aims to automate the key task of a human operator who is managing a service or set of services.

Automation, Automation, Automation

We know the complexities of running a database (safely) in Kubernetes and some of the concepts used to help bridge the gaps between automation and traditional human functions. With the help of Percona’s Kubernetes Operators, we can safely run databases the way they were intended to run. Percona’s Kubernetes Operators are able to automate tasks that are usually done by a database administrator such as:

  • Fully automated deployments with strict consistency and no single point of failure
  • Automated scaling with the ability to change the size parameter to add or remove members of a Cluster or Replica-Set
  • Fully automated backups and restores
  • Fully automated self-healing by automatically recovering from the failure of a single Cluster or Replica-Set member.
  • Automatically manage system users on password rotation
  • Simplified updates

Always Use a Kubernetes Operator

With the complexities of running a highly available database environment and the inherent dangers introduced by using the dynamic Kubernetes environment, an Operator should always be used when deploying databases in Kubernetes. Fortunately, Percona has already solved this by providing Percona Kubernetes Operator for Percona XtraDB Cluster and Percona Kubernetes Operator for Percona Server for MongoDB. Percona provides full support for databases running in Kubernetes with the Percona Operators. If you are interested in learning more or obtaining support or professional services to maximize your database deployments, please reach out to us.

Oct
06
2020
--

Various Ways to Perform Schema Upgrades with Percona XtraDB Cluster

Perform Schema Upgrades with Percona XtraDB Cluster

Perform Schema Upgrades with Percona XtraDB ClusterSchema changes are the big challenges in Galera replication. So, it is recommended to understand the schema changes operation for everyone who uses the Percona XtraDB Cluster (PXB)/Galera clusters. In this blog, I am going to explain the operation and impact of the various schema changes methods used in the PXB/Galera cluster.

  • Schema changes with “wsrep_OSU_method = TOI”
  • Schema changes with “wsrep_OSU_method = RSU”
  • Schema changes with “ONLINE ALGORITHMS”
  • Schema changes with “pt-osc”
  • Schema changes with “gh-ost”

For testing:

  • I have configured the 3-node Percona Xtradb Cluster (8.0.19).
  • Executing read/write load using the sysbench.
mysql> select @@wsrep_cluster_address\G
*************************** 1. row ***************************
@@wsrep_cluster_address: gcomm://pxc81,pxc82,pxc83
1 row in set (0.00 sec)

mysql> select @@version, @@version_comment\G
*************************** 1. row ***************************
        @@version: 8.0.19-10
@@version_comment: Percona XtraDB Cluster (GPL), Release rel10, Revision 727f180, WSREP version 26.4.3
1 row in set (0.00 sec)

What is the Impact of Schema Changes in Clusters?

  • By default (TOI), all the nodes in the cluster will be pause during the ALTER process. Because the ALTER needs to be replicated on all the nodes. If the ALTER is big it will affect the performance and could be the cause of the downtime.
  • Rollback is not possible on schema upgrade. 
  • You can’t kill the ALTER query immediately during the operation. So, your application may need to wait until the ALTER completion. 
mysql> pager grep alter
PAGER set to 'grep alter'

mysql> show processlist;
| 19 | root            | localhost | schema_changes | Query   |   18 | altering table           | alter table sbtest1 add index idx_c(c) |         0 |             0 |
7 rows in set (0.00 sec)

mysql> kill 19;
ERROR 1095 (HY000): You are not owner of thread 19

  • MDLs are set only on one node. Not across all the nodes in the cluster. So, you need additional control over this. 

Schema Changes with “wsrep_OSU_method = TOI”

TOI: Total Order Isolation

  • TOI is the default method ( wsrep_OSU_method = TOI ) for schema changes.
  • DDL statements are processed in the same order with regard to other transactions in each node. 
  • The full cluster will be blocked/locked during the DDL operation. 
  • This guarantees data consistency.
mysql> select @@wsrep_OSU_method\G
*************************** 1. row ***************************
@@wsrep_OSU_method: TOI
1 row in set (0.00 sec)

Example:

I am going to run the below ALTER on “pxc81”.

alter table sbtest1 add index idx_c(c)

After initiating the ALTER on pxc81, My processlist states the COMMIT and UPDATE ( from sysbench ) statements are paused. Only ALTER is in progress. The COMMIT and UPDATE will be resumed once the ALTER is completed.

| 17 | root            | localhost | schema_changes | Execute |      15 | closing tables                           | COMMIT                                 |         0 |             0 |
| 17 | root            | localhost | schema_changes | Execute |      15 | updating                                 | UPDATE sbtest1 SET c='91668836759-30934071579-18064439108-53267873872-79461377960-32104006456-143369 |         0 |             1 |
| 24 | root            | localhost | schema_changes | Query   |      15 | altering table                           | alter table sbtest1 add index idx_c(c) |         0 |             0 |

But, still, the SELECT statement can be run with “wsrep_sync_wait != 1” because “wsrep_sync_wait = 1” needs the casualty checks from other nodes. So, it will fail.  

SELECT with “wsrep_sync_wait=1”

|  1 | system user     |           | schema_changes | Query   |     0 | altering table         | alter table sbtest1 add index idx_c(c) |         0 |             0 |
| 15 | root            | localhost | schema_changes | Query   |    40 | starting               | select * from sbtest1 where id=1       |         0 |             0 |

mysql> select * from sbtest1 where id=1;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

TOI can be the right choice for quick operations.

  • CREATE STATEMENTS
  • RENAME INDEX
  • RENAME TABLE
  • DROP INDEX
  • ALGORITHM=INSTANT

Schema Changes with “wsrep_OSU_method = RSU”

RSU – Rolling Schema Upgrade

  • In this method, DDL statements will not replicate across the cluster nodes. Need to execute the DDL individually on all nodes.
  • The node which is executing the DDL will desync from the cluster group. The other nodes in the cluster are still operational and receive the application connections.
  • Once the node executes the DDL, it will start to apply the missing writesets.
  • In this method, the important thing is the WRITEs should not be performed on that particular table until the schema upgrade completes on all the nodes. Users should be very clear on this because the failure will break the cluster and the data may be unrecoverable. 
  • Gcache should be good enough to store the writesets.

Example:

At pxc82, I am going to execute the ALTER.

Session 1: (setting up the value to RSU – session-level) 

mysql> set wsrep_OSU_method=RSU;
Query OK, 0 rows affected (0.09 sec)

Session 2: (checking the node status)

mysql> show global status where Variable_name like 'wsrep_local_recv_queue' or Variable_name like 'wsrep_local_state_comment';
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_recv_queue    | 0      |
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
2 rows in set (0.00 sec)

Session 1: (executing the ALTER )

mysql> alter table sbtest1 add index idx_c(c);

Session 2: (checking again the node status )

Here the node went to Donor/Desynced state once the ALTER started. You can see the queue also keeps increasing.

mysql> nopager;  show global status where Variable_name like 'wsrep_local_recv_queue' or Variable_name like 'wsrep_local_state_comment';
PAGER set to stdout
+---------------------------+----------------+
| Variable_name             | Value          |
+---------------------------+----------------+
| wsrep_local_recv_queue    | 2053           |
| wsrep_local_state_comment | Donor/Desynced |
+---------------------------+----------------+
2 rows in set (0.21 sec)

Session 1: (ALTER completed)

mysql> alter table sbtest1 add index idx_c(c);
Query OK, 0 rows affected (2 min 6.52 sec)
Records: 0  Duplicates: 0  Warnings: 0

Session 2: (Node synced to cluster)

mysql> show global status where Variable_name like 'wsrep_local_recv_queue' or Variable_name like 'wsrep_local_state_comment';
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_recv_queue    | 0      |
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
2 rows in set (0.00 sec)

This step needs to be executed in pxc81 and pxc83 as well. After completing on all nodes, we are good to allow the WRITEs for that table. 

The RSU method is not truly disruption-free, as there are few bugs reported regarding RSU. Users should be very clear and careful about executing the RSU for schema updates:

https://jira.percona.com/browse/PXC-2620

https://jira.percona.com/browse/PXC-2293

https://jira.percona.com/browse/PXC-1980

Schema Changes with “ONLINE ALGORITHMS”

So far, we have 3 algorithms,

  • INPLACE
  • COPY
  • INSTANT

With TOI:

“ALGORITHM = INPLACE / COPY” still pauses the cluster during the operation. Galera doesn’t allow transactions when an ALTER TABLE statement is run. So if you are using TOI, any ALTER TABLE will block all transactions on all nodes.

| 17 | root            | localhost | schema_changes | Execute |      12 | closing tables                           | COMMIT                                                               |         0 |             0 |
| 18 | root            | localhost | schema_changes | Execute |      12 | closing tables                           | COMMIT                                                               |         0 |             0 |
| 32 | root            | localhost | schema_changes | Query   |      13 | altering table                           | alter table sbtest1 add index idx_c(c), algorithm=inplace,

“ALGORITHM=INSTANT” is supported and faster in TOI.

mysql> alter table sbtest1 add column test_Ins int , algorithm=instant;
Query OK, 0 rows affected (0.24 sec)
Records: 0  Duplicates: 0  Warnings: 0 lock=none

With RSU:

“ALGORITHM = INPLACE/COPY” is still not beneficial on RSU. It pauses the Galera replication and takes the node to Desync. 

mysql> show processlist;
| 62 | root            | localhost | schema_changes | Query   |    51 | altering table                       | alter table sbtest1 add index idx_c(c), algorithm=inplace, lock=none |         0 |             0 |
5 rows in set (0.06 sec)

mysql> nopager;  show global status where Variable_name like 'wsrep_local_recv_queue' or Variable_name like 'wsrep_local_state_comment';
PAGER set to stdout
+---------------------------+----------------+
| Variable_name             | Value          |
+---------------------------+----------------+
| wsrep_local_recv_queue    | 7335           |
| wsrep_local_state_comment | Donor/Desynced |
+---------------------------+----------------+
2 rows in set (0.03 sec)

“ALGORITHM=INSTANT” is supported and faster in RSU. But, still, you can use TOI to avoid the additional work.

mysql> alter table sbtest1 add column test_Inss int , algorithm=instant;
Query OK, 0 rows affected (0.19 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select @@wsrep_OSU_method;
+--------------------+
| @@wsrep_OSU_method |
+--------------------+
| RSU                |
+--------------------+
1 row in set (0.02 sec)

I would suggest using the “ALGORITHM = INSTANT ” with TOI wherever you can. But, make sure you have the MySQL 8.x + version. Unfortunately, “ALGORITHM=INSTANT” currently only supports adding new columns. 

Schema Changes with “pt-osc”

pt-osc : Percona-online-schema-change

Personally, I like this approach very much and use this mostly in production environments. Pt-osc provides non-blocking schema upgrades on all nodes in one shot. This should be used with the TOI method. The action flow will be like this:

  • Create a new table “_tablename_new” with the required modification
  • Creates triggers for update the modified rows (insert / update / delete)
  • Copy the records from the original table to the new table using chunk operation.
  • Once the copy is completed, it will swap the table ( original ? _old, _new ? original ) and drop the triggers and old table. Direct DDLs ( RENAME TABLE, DROP TABLE ) will be used for this operation ( wsrep_OSU_method=TOI ). 

For the below ALTER, 

alter table schema_changes.sbtest1 add index idx_test_Ins(test_Ins);

Pt-osc flow in SQL syntax:

Creating new table...
CREATE TABLE `schema_changes`.`_sbtest1_new` (
  `id` int NOT NULL AUTO_INCREMENT,
  `k` int NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  `test_Ins` int DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=20400335 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
Created new table schema_changes._sbtest1_new OK.
Altering new table...
ALTER TABLE `schema_changes`.`_sbtest1_new` add index idx_test_Ins(test_Ins)
Altered `schema_changes`.`_sbtest1_new` OK.
Not creating triggers because this is a dry run.
Not copying rows because this is a dry run.
INSERT LOW_PRIORITY IGNORE INTO `schema_changes`.`_sbtest1_new` (`id`, `k`, `c`, `pad`, `test_ins`) SELECT `id`, `k`, `c`, `pad`, `test_ins` FROM `schema_changes`.`sbtest1` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= ?)) AND ((`id` <= ?)) LOCK IN SHARE MODE /*pt-online-schema-change 9253 copy nibble*/
SELECT /*!40001 SQL_NO_CACHE */ `id` FROM `schema_changes`.`sbtest1` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= ?)) ORDER BY `id` LIMIT ?, 2 /*next chunk boundary*/
Not swapping tables because this is a dry run.
Not dropping old table because this is a dry run.
Not dropping triggers because this is a dry run.
DROP TRIGGER IF EXISTS `schema_changes`.`pt_osc_schema_changes_sbtest1_del`
DROP TRIGGER IF EXISTS `schema_changes`.`pt_osc_schema_changes_sbtest1_upd`
DROP TRIGGER IF EXISTS `schema_changes`.`pt_osc_schema_changes_sbtest1_ins`
2020-09-30T08:31:17 Dropping new table...
DROP TABLE IF EXISTS `schema_changes`.`_sbtest1_new`;
2020-09-30T08:31:17 Dropped new table OK.

Pt-osc provides several options to perform the effective operations.  You can control the connections, active threads, load, chunk size etc ..

For Galera, we have the option “–max-flow-ctrl”. The option will check the average time cluster spent pausing for FC and make the tool pause if it goes over the percentage indicated in the option. By default, the tool will not check the FC.

[root@pxc81 log]# less /bin/pt-online-schema-change  | grep -i pausing
         print STDERR "Pausing because PXC Flow Control is active\n";
         print STDERR "Pausing because "

To make the schema changes on FOREIGN KEY tables, I would suggest using the “alter-foreign-keys-method = rebuild_constraints”. This helps to maintain the consistency of the schema and its relations. In this approach, before dropping the old table, it runs ALTER on all the child tables to drop existing FK and re-add new FK constraints that points to the columns from the new table. Again, adding and dropping the FOREIGN KEY will be the direct ALTER using TOI.

Schema changes with “gh-ost”

Gh-ost is doing a similar approach like “pt-osc”. It also helps to do the non-blocking ALTERs on all cluster nodes in one shot. The main difference is gh-ost is triggerless. Gh-ost uses the binary log to track the changes. So you need the following variables and thresholds to perform the gh-ost operation. 

log-bin=sakthi-bin
binlog-format=ROW
log-slave-updates=ON

The flow will be like,

  • Creates gh-ost table with the required modifications
  • Copy the records from the original table to the new table using chunk operation.
  • Apply the live changes by reading the DML events from binary logs.
  • Once the binary log events are applied, it will swap the tables ( original –> _old, gh-ost –> original ) and drop the old table.

Example:

[root@pxc81 schema_changes]# gh-ost --alter="add index idx_test_Inss(test_Ins)" --database=schema_changes --table=sbtest1 --user=root --password=Jesus@7sakthI --allow-on-master --execute
[2020/09/30 09:40:56] [info] binlogsyncer.go:133 create BinlogSyncer with config {99999 mysql 127.0.0.1 3306 root    false false <nil> false UTC true 0 0s 0s 0 false}
[2020/09/30 09:40:56] [info] binlogsyncer.go:354 begin to sync binlog from position (binlog.000027, 196850993)
[2020/09/30 09:40:56] [info] binlogsyncer.go:203 register slave for master server 127.0.0.1:3306
[2020/09/30 09:40:56] [info] binlogsyncer.go:723 rotate to (binlog.000027, 196850993)
# Migrating `schema_changes`.`sbtest1`; Ghost table is `schema_changes`.`_sbtest1_gho`
# Migrating pxc81:3306; inspecting pxc81:3306; executing on pxc81
# Migration started at Wed Sep 30 09:40:56 +0000 2020
# chunk-size: 1000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle 
# Serving on unix socket: /tmp/gh-ost.schema_changes.sbtest1.sock
Copy: 0/6563240 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: binlog.000027:196853401; Lag: 0.02s, State: migrating; ETA: N/A
Copy: 0/6563240 0.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 1s(copy); streamer: binlog.000027:196858195; Lag: 0.01s, State: migrating; ETA: N/A
Copy: 22000/6563240 0.3%; Applied: 0; Backlog: 0/1000; Time: 2s(total), 2s(copy); streamer: binlog.000027:201067135; Lag: 0.01s, State: migrating; ETA: 9m58s

.......

Copy: 5682000/6563240 86.6%; Applied: 0; Backlog: 0/1000; Time: 16m10s(total), 16m10s(copy); streamer: binlog.000028:213168607; Lag: 0.01s, State: migrating; ETA: 2m30s
Copy: 6563000/6563240 100.0%; Applied: 0; Backlog: 0/1000; Time: 20m20s(total), 20m20s(copy); streamer: binlog.000028:382677405; Lag: 0.01s, State: migrating; ETA: 0s

Gh-ost also provides several options to perform effective operations.  You can control the connections, active threads, load, chunk size, etc.

But unfortunately, “–max-flow-ctl” option is not available in gh-ost. 

Conclusion

So, finally, I would say,

  • Always use the direct ALTER with TOI for the metadata changes and INSTANT ALTERs.
  • Use pt-online-schema-change with TOI and use the optimal flow control thresholds for InnoDB tables.
  • Schedule pt-online-schema-change operation in off-peak hours for FOREIGN KEY tables.
  • If you use RSU, never forget that you need to execute the ALTER on all nodes individually and you should block the WRITEs for that particular table. Make sure, your Gcache size is good enough to hold the writesets.
  • If you are concerned with triggers, you can use the gh-ost to make the ALTERs.
Oct
23
2019
--

ProxySQL 2.0.7 and proxysql-admin Tool Now Available

ProxySQL

ProxySQLProxySQL 2.0.7, released by ProxySQL, is now available for download in the Percona Repository along with Percona’s proxysql-admin tool.

ProxySQL is a high-performance proxy, currently for MySQL and database servers in the MySQL ecosystem (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. René Cannaò created ProxySQL for DBAs as a means of solving complex replication topology issues.

This release includes ProxySQL 2.0.7 which introduces many new features and enhancements and also fixes a number of bugs.

The proxysql-admin tool now supports MariaDB 10.4.

New Features

Improvements

  • PSQLADM-195: A new option --with-stats-reset has been added to the proxysql-status script to display the  *_reset tables from the stats database. If this option is not specified, these tables are not displayed by default.

Bugs Fixed

  • PSQLADM-157: In some cases, the proxysql-status script used the cat command to display a file without checking if the file existed and was readable.
  • PSQLADM-181: When run with --update-cluster --write-node=<node_name>, the proxysql-admin tool now verifies that the writer nodes are not read-only.

The ProxySQL 2.0.7 source and binary packages available from the Percona download page for ProxySQL include ProxySQL Admin – a tool developed by Percona to configure Percona XtraDB Cluster nodes into ProxySQL. Docker images for release 2.0.7 are available as well. You can download the original ProxySQL from GitHub. GitHub hosts the documentation in the wiki format.

ProxySQL is available under Open Source license GPLv3.

Oct
07
2019
--

Achieving Disaster Recovery with Percona XtraDB Cluster

Disaster Recovery with Percona XtraDB Cluster

Disaster Recovery with Percona XtraDB ClusterOne thing that comes up often from working with a variety of clients at Percona is “How can I achieve a Disaster Recovery (DR) solution with Percona XtraDB Cluster (PXC)?”  Unfortunately, decisions are sometimes made with far-reaching consequences by individuals who often do not well understand the architecture and its limitations.  As a Technical Account Manager (TAM), I am often engaged to help clients look for better solutions, or at least try to help them with it by mitigating as many issues as possible.  Clearly, in a perfect world, we would like to get the right experts involved in these types of discussions to ensure more appropriate solutions, but we all know this is not a perfect world.

One such example involves the idea that if we take a PXC cluster and split it into two datacenters with two nodes in a primary datacenter and one node in a separate datacenter, we will have a hot standby node at all times.  In this case, the application can be pointed to the third node in the event of something catastrophic.  This sounds great…in theory.  The problem is latency.

Latency can cripple a PXC cluster

By design, PXC is meant to work with nodes that can communicate with one another quickly.  The underlying cluster technology, known as Galera, is considered “virtually synchronous” in nature.  In this architecture, writesets are replicated to all active nodes in the cluster at transaction commit and will go into a queue.  Next, each node performs a certification of the writeset which is deterministic in nature.  A bug, notwithstanding, each node will either accept or reject the certification in the same manner.  So, either the writeset is applied on all nodes or it is rolled back by all nodes.  What matters in this discussion about the above is the write queue.

As writes come in on one of the nodes, the writesets are replicated to each of the other nodes.  In the above three-node cluster, the writes are certified quickly with the two nodes in the same datacenter.  However, the third node is located in a different datacenter some distance away.  In this case, the writeset must travel across the WAN and will go into a queue (wsrep_local_recv_queue).

So, what’s the problem?

To ensure that one node does not get too far behind the rest of the cluster, any node can send a flow control message to the cluster.  This instructs the cluster to stop replicating new events until the slow node catches up to within some number of writesets as defined by gcs.fc_limit in the configuration.  Essentially, when the number of transactions in the queue exceeds the gcs.fc_limit, flow control messages will be sent and the cluster will stop replicating new writesets.  Unless you have changed it, this will be 16 writesets.

Remember that PXC is virtually synchronous

When replication stops, all nodes stop accepting writes momentarily.  In this event, the system seems to stall until the local recv queue makes some space for new writesets, at which point replication will continue.  This can appear as a stall to the application and leads to huge performance issues.

So, what is a better solution to the above situation?  It is preferable to utilize an asynchronous Slave server replicating from the PXC cluster for failover.  This is the same replication that is built into Percona Server and not Galera.  While this may include adding another server, this could be mitigated by the use of garbd. This process will act as an arbitrator to maintain quorum of the cluster and thus decrease the number of data nodes in PXC by running the lightweight garbd process on an app server or some other server in the environment.  This keeps server count down in cases that are cost-sensitive.

The asynchronous nature means no sending of flow controls from the node in the remote datacenter.  Replication will lag and catch up as needed with the PXC cluster none the wiser.  Because all nodes in the PXC cluster are local to one another, ideally latency is minimal and writesets can be applied much more quickly and stalls minimized.  Of course, this does come with a few challenges.

One challenge is that due to the nature of asynchronous replication, there can be significant lag in the DR node.  This is not always an issue, however, as the only time you are using this server is during a disaster.  The writes will be sent over immediately by the Master, so it is reasonable to expect that the DR node will eventually catch up and you hope to not lose any data, although there are no guarantees.

This brings us to another concern.  Simple asynchronous replication has no guarantee of consistency like PXC provides.  In PXC, there are controls in place to guarantee consistency, but asynchronous replication provides none.  There are, therefore, cases where data drift can occur between Master and Slave.  To mitigate this risk, you can use pt-table-checksum from the Percona Toolkit to detect inconsistency between the Master node of the PXC cluster and the Slave and rectify it with pt-table-sync from the same toolkit.  This, of course, requires that you run this process often.  If it is done as part of an automated process, it should also be monitored to ensure it is being done and whether or not data drift is occurring.

You will also want to monitor that the Master node does not go down, as there is no built-in process of failing the DR node over to a new Master within the PXC cluster.  Our very own Yves Trudeau wrote a utility to manage this, and more information can be found here: https://github.com/y-trudeau/Mysql-tools/tree/master/PXC

Improving Performance

While this solution presents some additional complexity, it does provide for a more performant PXC cluster.  As a TAM, I have seen geographically-distributed PXC result in countless incidents of a system down in production.  Even when the system doesn’t go down, it often slows down due to the latency issues.  Why take that performance impact on every transaction for a DR solution that you hope to never use when there is an alternative solution as has been proposed here?  Instead, you maybe could benefit from an alternative approach that provides an acceptable failover solution while improving performance day in and day out.

Oct
04
2019
--

Percona XtraDB Cluster 8.0 (experimental release) : SST Improvements

xtradb sst improvements

xtradb sst improvementsStarting with the experimental release of Percona XtraDB Cluster 8.0, we have made changes to the SST process to make the process more robust and easier to use.

  • mysqldump and rsync are no longer supported SST methods.

    Support for mysqldump was deprecated starting with PXC 5.7 and has now been completely removed.

    MySQL 8.0 introduced a new Redo Log format that limited the use of rsync while upgrading from PXC 5.7 to 8.0. In addition, the new Galera-4 also introduced changes that further limits the use of rsync.

    The only supported SST method is xtrabackup-v2.

  • A separate Percona XtraBackup installation is no longer required.

    The required Percona XtraBackup (PXB) binaries are now shipped as part of PXC 8.0, they are not installed for general use. So if you want to use PXB outside of an SST, you will have to install PXB separately.

  • SST logging now uses MySQL error logging

    Previously, the SST script would write directly to the error log file. Now, the SST script uses MySQL error logging. A side effect of this change is that the SST logs are not immediately visible. This is due to the logging subsystem being initialized after the SST has completed.

  • The wsrep_sst_auth variable has been removed.

    PXC 8.0 now creates an internal user (mysql.pxc.sst.user) with a random password for use by PXB to take the backup. The cleartext of the password is not saved and the user is deleted after the SST has completed.

    (This feature is still in development and may change before PXC 8.0 GA)

  • PXC SST auto-upgrade

    When PXC 8.0 detects that the SST came from a lower version, mysql_upgrade is automatically invoked. Also “RESET SLAVE ALL” is run on the new node if needed. This is invoked when receiving an SST from PXC 5.7 and PXC 8.0.

    (This feature is still in development and may change before PXC 8.0 GA)

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Oct
03
2019
--

Percona XtraDB Cluster 8.0 New Feature: wsrep_sst_auth Removal

Experimental Binary XtraDB 8.0

Experimental Binary XtraDB 8.0The problem

In PXC 5.6 and 5.7, when using xtrabackup-v2 as the SST method, the DBA must create a user with the appropriate privileges for use by Percona XtraBackup (PXB). The username and password of this backup user are specified in the wsrep_sst_auth variable.

This is a problem because this username and password was being stored in plaintext and required that the configuration file be secured.

The PXC 8.0 solution

(This feature is still under development and may change before PXC 8.0 GA)

Because the wsrep_sst_auth is only needed on the donor side to take a backup, PXC 8.0 uses an internal user (created specifically for use by PXC) with a randomly generated password. Since this user is only needed on the donor, the plaintext password is not needed on the joiner node.

This password consists of 32 characters generated at random. A new password is generated for each SST request. The plaintext of the password is never saved and never leaves the node. The username/password is sent to the SST script via unnamed pipes (stdin).

New PXC internal user accounts

mysql.pxc.internal.session

The mysql.pxc.internal.session user account provides the appropriate security context to create and set up the other PXC accounts. This account has a limited set of privileges, enough needed to create the mysql.pxc.sst.user??.

This account is locked and cannot be used to login (the password field will not allow login).

mysql.pxc.sst.user

The mysql.pxc.sst.user is used by XtraBackup to perform the backup. This account has the full set of privileges needed by XtraBackup.?? This account is created for an SST and is dropped at the end of an SST and also when the PXC node is shutdown. The creation/provisioning of this user account is not written to the binlog and is not replicated to other nodes. However, this account is sent with the backup to the joiner node. So the joiner node also has to drop this user after the SST has finished.

mysql.pxc.sst.role

The mysql.pxc.sst.role is the MySQL role that provides the privileges needed for XtraBackup. This allows for easy addition/removal of privileges needed for an SST.

The experimental release of PXC is based on MySQL 8.0.15, and we have not implemented the role-based support due to issues found with MySQL 8.0.15. This will be revisited in future versions of PXC 8.0.

Program flow

  1. DONOR node receives SST request from the JOINER
  2. DONOR node generates a random password and creates the internal SST user
    SET SESSION sql_log_bin = OFF;
    DROP USER IF EXISTS 'mysql.pxc.sst.user'@localhost;
    CREATE USER 'mysql.pxc.sst.user'@localhost IDENTIFIED WITH 'mysql_native_password' BY 'XXXXXXXX' ACCOUNT LOCK;
    GRANT 'mysql.pxc.sst.role'@localhost TO 'mysql.pxc.sst.user'@localhost;
    SET DEFAULT ROLE 'mysql.pxc.sst.role'@localhost to 'mysql.pxc.sst.user'@localhost;
    ALTER USER 'mysql.pxc.sst.user'@localhost ACCOUNT UNLOCK;

    The code that uses role is not being used in the current release due to issues with MySQL 8.0.15. Currently, we create the user with all the permissions needed explicitly.

  3. Launch the SST script (passing the username/password via stdin)
  4. SST uses the username/password to perform the backup
  5. SST script exits
  6. The DONOR node drops the user.
  7. The JOINER node receives the backup and drops the user. Note that the JOINER node also contains the internal SST user!

As a precaution, the user is also dropped when the server is shutdown.

Oct
01
2019
--

Experimental Binary of Percona XtraDB Cluster 8.0

Experimental Binary XtraDB 8.0

Experimental Binary XtraDB 8.0Percona is happy to announce the first experimental binary of Percona XtraDB Cluster 8.0 on October 1, 2019. This is a major step for tuning Percona XtraDB Cluster to be more cloud- and user-friendly. This release combines the updated and feature-rich Galera 4, with substantial improvements made by our development team.

Improvements and New Features

Galera 4, included in Percona XtraDB Cluster 8.0, has many new features. Here is a list of the most essential improvements:

  • Streaming replication supports large transactions
  • The synchronization functions allow action coordination (wsrep_last_seen_gtid, wsrep_last_written_gtid, wsrep_sync_wait_upto_gtid)
  • More granular and improved error logging. wsrep_debug is now a multi-valued variable to assist in controlling the logging, and logging messages have been significantly improved.
  • Some DML and DDL errors on a replicating node can either be ignored or suppressed. Use the wsrep_ignore_apply_errors variable to configure.
  • Multiple system tables help find out more about the state of the cluster state.
  • The wsrep infrastructure of Galera 4 is more robust than that of Galera 3. It features a faster execution of code with better state handling, improved predictability, and error handling.

Percona XtraDB Cluster 8.0 has been reworked in order to improve security and reliability as well as to provide more information about your cluster:

  • There is no need to create a backup user or maintain the credentials in plain text (a security flaw). An internal SST user is created, with a random password for making a backup, and this user is discarded immediately once the backup is done.
  • Percona XtraDB Cluster 8.0 now automatically launches the upgrade as needed (even for minor releases). This avoids manual intervention and simplifies the operation in the cloud.
  • SST (State Snapshot Transfer) rolls back or fixes an unwanted action. It is no more “a copy only block” but a smart operation to make the best use of the copy-phase.
  • Additional visibility statistics are introduced in order to obtain more information about Galera internal objects. This enables easy tracking of the state of execution and flow control.

Installation

You can only install this release from a tarball and it, therefore, cannot be installed through a package management system, such as apt or yum. Note that this release is not ready for use in any production environment.

Percona XtraDB Cluster 8.0 is based on the following:

Please be aware that this release will not be supported in the future, and as such, neither the upgrade to this release nor the downgrade from higher versions is supported.

This release is also packaged with Percona XtraBackup 8.0.5. All Percona software is open-source and free.

In order to experiment with Percona XtraDB Cluster 8.0 in your environment, download and unpack the tarball for your platform.

Note

Be sure to check your system and make sure that the packages are installed which Percona XtraDB Cluster 8.0 depends on.

For Debian or Ubuntu:

$ sudo apt-get install -y \
socat libdbd-mysql-perl \
rsync libaio1 libc6 libcurl3 libev4 libgcc1 libgcrypt20 \
libgpg-error0 libssl1.1 libstdc++6 zlib1g libatomic1

For Red Hat Enterprise Linux or CentOS:

$ sudo yum install -y openssl socat  \
procps-ng chkconfig procps-ng coreutils shadow-utils \
grep libaio libev libcurl perl-DBD-MySQL perl-Digest-MD5 \
libgcc rsync libstdc++ libgcrypt libgpg-error zlib glibc openssl-libs

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Sep
26
2019
--

Running Percona XtraDB Cluster on Raspberry PI 3

Percona XtraDB Cluster on Raspberry PI 3

Percona XtraDB Cluster on Raspberry PI 3In a previous post, I showed you how to compile Percona Mysql 5.7 on Raspberry PI 3. Now, I’ll show you how to compile and run the latest version of Percona XtraDB Cluster 5.7.26.

We will need at least 3 RaspberryPi 3 boards, and I recommend you use an external SSD drive to compile and use as MySQL’s “datadir” to avoid the stalls associated with the microSD card, which will cause PXC to run slow.

In this post, we are going to run many OS commands and configure PXC. I recommend having minimal knowledge about PXC and Linux commands.

How to install CentOS

Download the centos image from this link http://mirror.ufro.cl/centos-altarch/7.6.1810/isos/armhfp/CentOS-Userland-7-armv7hl-RaspberryPI-Minimal-1810-sda.raw.xz

I’m using this open source software https://www.balena.io/etcher/ to flash/burn the microSD card because it is very simple and works fine. I recommend using a 16GB card or larger.

If the previous process finished ok, take out the microSD card and put it into the raspberry and boot it. Wait until the boot process finishes and log in with this user and password

user: root
password: centos

By default, the root partition is very small and you need to expand it. Run the next command, this process is fast.

/usr/bin/rootfs-expand

Finally, to finish the installation process it is necessary to install many packages to compile PXC and percona-xtrabackup,

yum install -y cmake wget automake bzr gcc make telnet gcc-c++ vim libcurl-devel readline-devel ncurses-devel screen zlib-devel bison locate libaio libaio-devel autoconf socat libtool  libgcrypt-devel libev-devel vim-common check-devel openssl-devel boost-devel glib2-devel

Repeat all the previous steps over all those 3 dbnodes because some libs are needed to start mysql and to work with percona-xtrabackup.

How to compile PXC + Galera + Xtrabackup

I am using an external SSD disk because it’s faster than the microSD which I mounted on /mnt directory, so we will use this partition to download, compile, and create the datadir.

For the latest PXC version, download and decompress it using the following commands:

cd /mnt
wget https://www.percona.com/downloads/Percona-XtraDB-Cluster-57/Percona-XtraDB-Cluster-5.7.26-31.37/source/tarball/Percona-XtraDB-Cluster-5.7.26-31.37.tar.gz
tar zxf Percona-XtraDB-Cluster-5.7.26-31.37.tar.gz
cd Percona-XtraDB-Cluster-5.7.26-31.37

RaspberryPi3’s are not x86_64 architecture; it uses an ARMv7 which forces us to compile PXC from scratch. For me, this process took 6-ish hours. I recommend running the next commands in a screen session.

screen -SDRL compile_pxc
cmake -DINSTALL_LAYOUT=STANDALONE -DCMAKE_INSTALL_PREFIX=/mnt/pxc5726 -DDOWNLOAD_BOOST=ON -DWITH_BOOST=/mnt/sda1/my_boost -DWITH_WSREP=1 .
make
make install

Create a new OS user to be used by the mysql process.

useradd --no-create-home mysql

Now to continue it is necessary to create the datadir directory. I recommend to use the SSD disk attached and set the mysql permissions to the directory:

mkdir /mnt/pxc5726/data/
chown mysql.mysql /mnt/pxc5726/data/

Configure my.cnf with the minimal params to start mysql.

vim /etc/my.cnf

[client]
socket=/mnt/pxc5726/data/mysql.sock

[mysqld]
datadir = /mnt/pxc5726/data
binlog-format = row
log_bin = /mnt/pxc5726/data/binlog
innodb_buffer_pool_size = 128M
socket=/mnt/pxc5726/data/mysql.sock
symbolic-links=0

wsrep_provider=/mnt/pxc5726/libgalera_smm.so

wsrep_on=ON
wsrep_cluster_name="pxc-cluster"
wsrep_cluster_address="gcomm://192.168.88.134"

wsrep_node_name="pxc1"
wsrep_node_address=192.168.88.134

wsrep_debug=1

wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:passw0rd

pxc_strict_mode=ENFORCING

[mysqld_safe]
log-error=/mnt/pxc5726/data/mysqld.log
pid-file=/mnt/pxc5726/data/mysqld.pid

So far we’ve only compiled PXC. Before we can start the mysql process, we also need to compile the Galera library, as, without this library, mysql will start as a standalone db-node and will not have any “cluster” capabilities.

To build libgalera_smm.so you need to install scons.

Download using the next command:

cd /mnt
wget http://prdownloads.sourceforge.net/scons/scons-3.1.0.tar.gz
tar zxf scons-3.1.0.tar.gz
cd scons-3.1.0
python setup.py install

Now we will proceed to compile libgalera. The source code for this library exists in the Percona-XtraDB-Cluster-57 directory download:

cd /mnt/Percona-XtraDB-Cluster-5.7.26-31.37/percona-xtradb-cluster-galera
scons -j4 libgalera_smm.so debug=3 psi=1 BOOSTINCLUDE=/mnt/my_boost/boost_1_59_0/boost BOOST_LIBS=/mnt/my_boost/boost_1_59_0/libs

If the compile process finishes ok, it will create a new file like this:

/mnt/Percona-XtraDB-Cluster-5.7.26-31.37/percona-xtradb-cluster-galera/libgalera_smm.so

I recommend copying libgalera_smm.so to the installed PXC directory to avoid deleting, in that case, remove the source directory.

cp /mnt/Percona-XtraDB-Cluster-5.7.26-31.37/percona-xtradb-cluster-galera/libgalera_smm.so /mnt/pxc5726

Also, this is useful because after installing and compiling all the packages we can create a compressed file and copy to the rest of the db-nodes to avoid having to compile these packages again.

The last step is to download and compile percona-xtrabackup. I used the latest version:

screen -SDRL compile_xtrabackup

cd /mnt
wget https://www.percona.com/downloads/Percona-XtraBackup-2.4/Percona-XtraBackup-2.4.15/source/tarball/percona-xtrabackup-2.4.15.tar.gz
tar zxf percona-xtrabackup-2.4.15.tar.gz
cd percona-xtrabackup-2.4.15
cmake -DBUILD_CONFIG=xtrabackup_release -DWITH_MAN_PAGES=OFF -DCMAKE_INSTALL_PREFIX=/mnt/xtrabackup .
make
make install

If all the previous steps were successful, we will add those binary directories in the PATH variable:

vim /etc/profile.d/percona.sh

export PATH=$PATH:/mnt/xtrabackup/bin:/mnt/pxc5726/bin

Save and exit and run the “export” command manually to set it in the active session:

export PATH=$PATH:/mnt/xtrabackup/bin:/mnt/pxc5726/bin

Well, so far we have PXC, Galera-lib, and percona-xtrabackup compiled for the ARMv7 architecture, which is all that is needed to work with PXC, so now we can start playing.

We will call the first host where we compiled “pxc1”. We will proceed to set the hostname:

$ vim /etc/hostname
pxc1

$ hostname pxc1
$ exit

It is necessary to connect again to ssh to refresh the hostname and you will see something like this:

$ ssh ip
[root@pxc1 ~]#

Now we are ready to open the ports needed by PXC;  xtrabackup, galera, and mysql.

$ firewall-cmd --permanent --add-port=3306/tcp
$ firewall-cmd --permanent --add-port=4567/tcp
$ firewall-cmd --permanent --add-port=4444/tcp

$ firewall-cmd --reload

$ firewall-cmd --list-ports
22/tcp 3306/tcp 4567/tcp 4444/tcp

Repeat the above procedures to set the hostname and open ports in the other db nodes.

Finally, to start the first node you’ll need to initialize the system databases and then launch the mysql process with the –wsrep-new-cluster option. This will bootstrap the cluster so that it starts with 1 node. Run the following from the command-line:

$ mysqld --initialize-insecure --user=mysql --basedir=/mnt/pxc5726 --datadir=/mnt/pxc5726/data
$ mysqld_safe --defaults-file=/etc/my.cnf --user=mysql --wsrep-new-cluster &

Check the mysql error log to see any errors:

$ tailf /mnt/pxc5726/data/mysqld.log

...
2019-09-08T23:23:17.415991Z 0 [Note] /mnt/pxc5726/bin/mysqld: ready for connections.
...

Create the new mysql user, and this will be used for xtrabackup to sync another db node on IST or SST process.

$ mysql

CREATE USER 'sstuser'@'localhost' IDENTIFIED BY 'passw0rd';
GRANT PROCESS, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost';
FLUSH PRIVILEGES;

Lastly, I recommend checking if the galera library was loaded successfully:

$ mysql

show global status where Variable_name='wsrep_provider_name' 
or Variable_name='wsrep_provider_name' 
or Variable_name='wsrep_provider_vendor' 
or Variable_name='wsrep_local_state' 
or Variable_name='wsrep_local_state_comment' 
or Variable_name='wsrep_cluster_size' 
or Variable_name='wsrep_cluster_status' 
or Variable_name='wsrep_connected' 
or Variable_name='wsrep_ready' 
or Variable_name='wsrep_evs_state' 
or Variable_name like 'wsrep_flow_control%';

You will see something like this:

+----------------------------------+-----------------------------------+
| Variable_name                    | Value                             |
+----------------------------------+-----------------------------------+
| wsrep_flow_control_paused_ns     | 0                                 |
| wsrep_flow_control_paused        | 0.000000                          |
| wsrep_flow_control_sent          | 0                                 |
| wsrep_flow_control_recv          | 0                                 |
| wsrep_flow_control_interval      | [ 100, 100 ]                      |
| wsrep_flow_control_interval_low  | 100                               |
| wsrep_flow_control_interval_high | 100                               |
| wsrep_flow_control_status        | OFF                               |
| wsrep_local_state                | 4                                 |
| wsrep_local_state_comment        | Synced                            |
| wsrep_evs_state                  | OPERATIONAL                       |
| wsrep_cluster_size               | 1                                 | <----
| wsrep_cluster_status             | Primary                           |
| wsrep_connected                  | ON                                |
| wsrep_provider_name              | Galera                            |
| wsrep_provider_vendor            | Codership Oy <info@codership.com> |
| wsrep_ready                      | ON                                |
+----------------------------------+-----------------------------------+

As you can see this is the first node and the cluster size is 1.

How to copy the previous compiled source code to the rest of the db nodes

You don’t need to compile again over all the rest of the db nodes, just compress PXC and Xtrabackup directories and copy to the rest of the db nodes, nothing else.

In the first step, we compiled and installed on the next directories:

/mnt/pxc5726
/mnt/xtrabackup

We’re going to proceed to compress both directories and copy to the other servers (pxc2 and pxc3)

$ cd /mnt
$ tar czf pxc5726.tgz pxc5726
$ tar czf xtrabackup.tgz xtrabackup

Let’s copy to each db node:

$ cd /mnt
$ scp pxc5726.tgz xtrabackup.tgz IP_PXC2:/mnt

Now connect to each db node and start decompressing, configure my.cnf, other stuff, and start mysql.

$ ssh IP_PXC2
$ cd /mnt
$ tar zxf pxc5726.tgz
$ tar zxf xtrabackup.tgz

From here we are going to repeat several steps that we did previously.

Connect to IP_PXC2.

$ ssh IP_PXC2

Add the next directories in the PATH variable:

vim /etc/profile.d/percona.sh

export PATH=$PATH:/mnt/xtrabackup/bin:/mnt/pxc5726/bin

Save and exit and run the “export” command manually to set it in the active session:

export PATH=$PATH:/mnt/xtrabackup/bin:/mnt/pxc5726/bin

Create a new OS user to be used by mysql process:

useradd --no-create-home mysql

Now to continue it’s necessary to create the datadir directory. I recommend using the SSD disk attached and set the mysql permissions to the directory.

mkdir /mnt/pxc5726/data/
chown mysql.mysql /mnt/pxc5726/data/

Configure my.cnf with the minimal params to start mysql.

vim /etc/my.cnf

[client]
socket=/mnt/pxc5726/data/mysql.sock

[mysqld]
datadir = /mnt/pxc5726/data
binlog-format = row
log_bin = /mnt/pxc5726/data/binlog
innodb_buffer_pool_size = 128M
socket=/mnt/pxc5726/data/mysql.sock
symbolic-links=0

wsrep_provider=/mnt/pxc5726/libgalera_smm.so

wsrep_on=ON
wsrep_cluster_name="pxc-cluster"
wsrep_cluster_address="gcomm://IP_PXC1,IP_PXC2"

wsrep_node_name="pxc2"
wsrep_node_address=IP_PXC2

wsrep_debug=1

wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:passw0rd

pxc_strict_mode=ENFORCING

[mysqld_safe]
log-error=/mnt/pxc5726/data/mysqld.log
pid-file=/mnt/pxc5726/data/mysqld.pid

Now we need to start the second db node. This time you don’t need to add “–wsrep-new-cluster” param (from the second and next nodes it’s not needed), just start mysql and run the next command:

$ mysqld_safe --defaults-file=/etc/my.cnf --user=mysql &

Check the mysql error log to see any error:

$ tailf /mnt/pxc5726/data/mysqld.log

...
2019-09-08T23:53:17.415991Z 0 [Note] /mnt/pxc5726/bin/mysqld: ready for connections.
...

Check if the galera library was loaded successfully:

$ mysql

show global status where Variable_name='wsrep_provider_name' 
or Variable_name='wsrep_provider_name' 
or Variable_name='wsrep_provider_vendor' 
or Variable_name='wsrep_local_state' 
or Variable_name='wsrep_local_state_comment' 
or Variable_name='wsrep_cluster_size' 
or Variable_name='wsrep_cluster_status' 
or Variable_name='wsrep_connected' 
or Variable_name='wsrep_ready' 
or Variable_name='wsrep_evs_state' 
or Variable_name like 'wsrep_flow_control%';

You will see something like this:

+----------------------------------+-----------------------------------+
| Variable_name                    | Value                             |
+----------------------------------+-----------------------------------+
| wsrep_flow_control_paused_ns     | 0                                 |
| wsrep_flow_control_paused        | 0.000000                          |
| wsrep_flow_control_sent          | 0                                 |
| wsrep_flow_control_recv          | 0                                 |
| wsrep_flow_control_interval      | [ 100, 100 ]                      |
| wsrep_flow_control_interval_low  | 100                               |
| wsrep_flow_control_interval_high | 100                               |
| wsrep_flow_control_status        | OFF                               |
| wsrep_local_state                | 4                                 |
| wsrep_local_state_comment        | Synced                            |
| wsrep_evs_state                  | OPERATIONAL                       |
| wsrep_cluster_size               | 2                                 | <---
| wsrep_cluster_status             | Primary                           |
| wsrep_connected                  | ON                                |
| wsrep_provider_name              | Galera                            |
| wsrep_provider_vendor            | Codership Oy <info@codership.com> |
| wsrep_ready                      | ON                                |
+----------------------------------+-----------------------------------+

Excellent, we have the second node up, now is it’s part of the cluster and the cluster size is 2.

Repeat the previous steps to configure IP_PXC3 and nexts.

Summary

We are ready to start playing and doing a lot of tests. I recommend running sysbench to test how many transactions this environment supports, kill some nodes to test SST and IST, and I recommend using pt-online-schema-change and check this blog “How to Perform Compatible Schema Changes in Percona XtraDB Cluster” to learn how this tool works in PXC.

I hope you enjoyed this guide, and if you want to compile other PXC 5.7.X versions, please go ahead, the steps will be very similar.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com