Oct
08
2020
--

The Criticality of a Kubernetes Operator for Databases

Importance of Kubernetes Operators for Databases

As a Solutions Engineer at Percona, one of my responsibilities is to support our customers as they investigate new and emerging technologies. This affords me the opportunity to speak to many current and new customers who partner with Percona. The topic of Kubernetes is becoming more popular as companies investigate and adopt this technology. The issue most companies encounter is architecting a stateful database that doesn’t fall victim to an environment tuned for ephemeral workloads. This introduces a level of complexity, as databases are not natively designed to run in an inherently stateless world.

To make your life easier, as a part of the Percona Cloud-Native Autonomous Database Initiative, our engineering teams have built two Kubernetes Operators: Percona Kubernetes Operator for Percona XtraDB Cluster and Percona Kubernetes Operator for Percona Server for MongoDB, which allow Kubernetes Pods to be destroyed, moved, or created with no impact on the application. For an overview of Kubernetes, you can read my previous blog, Introduction to Percona Kubernetes Operator for Percona XtraDB Cluster, which covers this topic. It’s common for companies new to Kubernetes to attempt to run their databases in Kubernetes the same way they would in a traditional environment. But this is not advised, as it introduces the possibility of data loss and is not recommended for production workloads. Why is this dangerous, and how has Percona solved it?

Appropriate Workloads for Kubernetes

Kubernetes is not the answer for everyone. It’s not even the answer for most people. Do not be misled into thinking that moving a database into Kubernetes is going to solve any of your problems. Before you consider moving your database into Kubernetes, ensure the rest of your application is cloud-native and can be used with Kubernetes. Moving your database to Kubernetes should happen only after you have adopted both elastic vertical and horizontal scaling and need to orchestrate it to control costs.

As more companies move to Kubernetes, something has to happen to the legacy workloads. Oftentimes we see a lift-and-shift mentality into Kubernetes, which can be dangerous or cause more work than expected. We have seen two primary ideal use cases for moving database workloads to Kubernetes: microservices and a Unified Abstraction Layer.

Monolithic, large datasets can undermine some of Kubernetes’ strong points: self-healing and availability. This is because of the time it takes to physically transmit data to a new Pod instance as it joins the database cluster. If your dataset is too large, this process is slow due to physical limitations and hurts both the performance and the availability of your database. Microservices are a great fit because their relatively small datasets allow Kubernetes automation to work well.

Companies looking to take full advantage of cloud-native applications and databases can be a really good fit for Kubernetes as well. If you truly want the ability to deploy and run your databases anywhere utilizing the concept of a Unified Abstraction Layer, Kubernetes is a great option. You can move your databases to anywhere that is running Kubernetes and know it will work.

We talked about large unsharded datasets and the limitations Kubernetes presents when handling them, but we should mention a few more workloads better suited for traditional platforms. Applications with throughput sensitivity may not do well on Kubernetes, or it may not be cost-effective to run them there. Kubernetes is fundamentally designed for container orchestration and is not designed to handle highly performant databases that require low latency. This may be possible to achieve, but at what cost? This applies to highly performant distributed applications as well. Lowest latency across all nodes is not a core tenet of Kubernetes, so ensure you have planned and tested against this before you move everything over to Kubernetes.

Pods Are Cattle, Not Pets

If you’re not familiar with Pets vs Cattle, it’s a DevOps concept that differentiates deployment methodologies: unique servers that require attention when issues arise (pets) versus servers that can simply be replaced with a copy if issues arise (cattle). Due to the nature of how Kubernetes operates, Pods can be destroyed, spun up, and moved at any time due to factors outside of the application’s control, much like how cattle are treated. Kubernetes uses a scheduler which, by design, can destroy and recreate Pods to meet the configuration needs of your Kubernetes Cluster. This is great for stateless applications, as any failure in the application will result in the Pod containing the application being destroyed and recreated, eliminating the need for human interaction and greatly speeding up resolution. This isn’t ideal for databases, as you don’t want your database to suddenly stop working, halt the application, and introduce the potential for lost or corrupted data. One of the tools Kubernetes offers to help combat this is StatefulSets, which keep a Pod’s identity assigned to it as it is destroyed and re-created. This helps facilitate stateful workloads, but how does this come into play with high availability and utilizing the automation aspects of Kubernetes?

Databases Are Pets, Not Cattle

Databases by design need to keep their identity, information, and most importantly their data safe and accessible at all times. They are the backbone of the application, as they are the source of truth an application relies on for normal processing. Any errors in their operation will quickly stop an application from functioning. They are important, to say the least. How can we safely run databases in Kubernetes and still ensure we have highly available database deployments? By using StatefulSets and Persistent Volumes we can maintain data integrity, but we need an additional set of hands to take on database administrator tasks such as ensuring failover happens and that database members are recovered and re-join the highly available architecture, along with other technology-specific functions. Fortunately, Kubernetes is extensible and has Operators, which aim to automate the key tasks of a human operator managing a service or set of services.

Automation, Automation, Automation

We know the complexities of running a database (safely) in Kubernetes and some of the concepts used to help bridge the gaps between automation and traditional human functions. With the help of Percona’s Kubernetes Operators, we can safely run databases the way they were intended to run. Percona’s Kubernetes Operators are able to automate tasks that are usually done by a database administrator such as:

  • Fully automated deployments with strict consistency and no single point of failure
  • Automated scaling, with the ability to change the size parameter to add or remove members of a Cluster or Replica Set (see the sketch after this list)
  • Fully automated backups and restores
  • Fully automated self-healing, automatically recovering from the failure of a single Cluster or Replica Set member
  • Automated management of system users on password rotation
  • Simplified updates
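
As a rough sketch of the scaling item above, the snippet below changes the size field of the Operator’s custom resource and lets the Operator reconcile the cluster to the new member count. The resource short name (pxc), the custom resource name (cluster1), and the label selector are illustrative and may differ in your deployment.

# Scale the cluster from 3 to 5 members by patching the "size" field
# of the custom resource; the Operator handles provisioning and joining.
kubectl patch pxc cluster1 --type=merge -p '{"spec":{"pxc":{"size":5}}}'

# Watch the Operator bring up the new Pods
kubectl get pods -l app.kubernetes.io/instance=cluster1 -w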

Always Use a Kubernetes Operator

With the complexities of running a highly available database environment and the inherent dangers introduced by using the dynamic Kubernetes environment, an Operator should always be used when deploying databases in Kubernetes. Fortunately, Percona has already solved this by providing Percona Kubernetes Operator for Percona XtraDB Cluster and Percona Kubernetes Operator for Percona Server for MongoDB. Percona provides full support for databases running in Kubernetes with the Percona Operators. If you are interested in learning more or obtaining support or professional services to maximize your database deployments, please reach out to us.

Oct
06
2020
--

Various Ways to Perform Schema Upgrades with Percona XtraDB Cluster

Perform Schema Upgrades with Percona XtraDB Cluster

Schema changes are one of the big challenges in Galera replication, so everyone who uses Percona XtraDB Cluster (PXC)/Galera clusters should understand how schema change operations work. In this blog, I am going to explain the operation and impact of the various schema change methods used in PXC/Galera clusters.

  • Schema changes with “wsrep_OSU_method = TOI”
  • Schema changes with “wsrep_OSU_method = RSU”
  • Schema changes with “ONLINE ALGORITHMS”
  • Schema changes with “pt-osc”
  • Schema changes with “gh-ost”

For testing:

  • I have configured a 3-node Percona XtraDB Cluster (8.0.19).
  • I am executing read/write load using sysbench.
mysql> select @@wsrep_cluster_address\G
*************************** 1. row ***************************
@@wsrep_cluster_address: gcomm://pxc81,pxc82,pxc83
1 row in set (0.00 sec)

mysql> select @@version, @@version_comment\G
*************************** 1. row ***************************
        @@version: 8.0.19-10
@@version_comment: Percona XtraDB Cluster (GPL), Release rel10, Revision 727f180, WSREP version 26.4.3
1 row in set (0.00 sec)

What is the Impact of Schema Changes in Clusters?

  • By default (TOI), all the nodes in the cluster will be paused during the ALTER process, because the ALTER needs to be replicated on all the nodes. If the ALTER is big, it will affect performance and could be a cause of downtime.
  • Rollback is not possible for a schema upgrade.
  • You can’t kill the ALTER query immediately during the operation, so your application may need to wait until the ALTER completes.
mysql> pager grep alter
PAGER set to 'grep alter'

mysql> show processlist;
| 19 | root            | localhost | schema_changes | Query   |   18 | altering table           | alter table sbtest1 add index idx_c(c) |         0 |             0 |
7 rows in set (0.00 sec)

mysql> kill 19;
ERROR 1095 (HY000): You are not owner of thread 19

  • MDLs (metadata locks) are set only on one node, not across all the nodes in the cluster, so you need additional control over this (see the sketch below).
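
As a sketch, assuming the default MySQL 8.0 performance_schema instrumentation, you can check which metadata locks a given node actually holds; running this on each node shows that only the node executing the DDL holds the MDL:

mysql> SELECT object_schema, object_name, lock_type, lock_status
    ->   FROM performance_schema.metadata_locks
    ->  WHERE object_schema = 'schema_changes';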

Schema Changes with “wsrep_OSU_method = TOI”

TOI: Total Order Isolation

  • TOI is the default method ( wsrep_OSU_method = TOI ) for schema changes.
  • DDL statements are processed in the same order with regard to other transactions in each node. 
  • The full cluster will be blocked/locked during the DDL operation. 
  • This guarantees data consistency.
mysql> select @@wsrep_OSU_method\G
*************************** 1. row ***************************
@@wsrep_OSU_method: TOI
1 row in set (0.00 sec)

Example:

I am going to run the below ALTER on “pxc81”.

alter table sbtest1 add index idx_c(c)

After initiating the ALTER on pxc81, my processlist shows that the COMMIT and UPDATE (from sysbench) statements are paused; only the ALTER is in progress. The COMMIT and UPDATE will resume once the ALTER is completed.

| 17 | root            | localhost | schema_changes | Execute |      15 | closing tables                           | COMMIT                                 |         0 |             0 |
| 17 | root            | localhost | schema_changes | Execute |      15 | updating                                 | UPDATE sbtest1 SET c='91668836759-30934071579-18064439108-53267873872-79461377960-32104006456-143369 |         0 |             1 |
| 24 | root            | localhost | schema_changes | Query   |      15 | altering table                           | alter table sbtest1 add index idx_c(c) |         0 |             0 |

But a SELECT statement can still be run with “wsrep_sync_wait != 1”; with “wsrep_sync_wait = 1” the SELECT needs the causality checks from the other nodes, so it will fail.

SELECT with “wsrep_sync_wait=1”

|  1 | system user     |           | schema_changes | Query   |     0 | altering table         | alter table sbtest1 add index idx_c(c) |         0 |             0 |
| 15 | root            | localhost | schema_changes | Query   |    40 | starting               | select * from sbtest1 where id=1       |         0 |             0 |

mysql> select * from sbtest1 where id=1;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

TOI can be the right choice for quick operations, such as:

  • CREATE STATEMENTS
  • RENAME INDEX
  • RENAME TABLE
  • DROP INDEX
  • ALGORITHM=INSTANT

Schema Changes with “wsrep_OSU_method = RSU”

RSU – Rolling Schema Upgrade

  • In this method, DDL statements are not replicated across the cluster nodes. You need to execute the DDL individually on each node.
  • The node which is executing the DDL will desync from the cluster group. The other nodes in the cluster remain operational and keep receiving the application connections.
  • Once the node has executed the DDL, it will start to apply the missing writesets.
  • In this method, the important thing is that WRITEs should not be performed on that particular table until the schema upgrade completes on all the nodes. Users should be very clear on this, because a failure will break the cluster and the data may be unrecoverable.
  • The Gcache should be large enough to store the writesets generated during the operation (see the sketch after this list).
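
A minimal sketch of sizing the Gcache ahead of an RSU run, assuming you can set it in my.cnf and restart the node (the 2G value is purely illustrative; the provider default is 128M):

[mysqld]
# The Gcache must be able to hold the writesets generated while this node
# is desynced for the ALTER; size it for your write rate and ALTER duration.
wsrep_provider_options = "gcache.size=2G"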

Example:

At pxc82, I am going to execute the ALTER.

Session 1: (setting up the value to RSU – session-level) 

mysql> set wsrep_OSU_method=RSU;
Query OK, 0 rows affected (0.09 sec)

Session 2: (checking the node status)

mysql> show global status where Variable_name like 'wsrep_local_recv_queue' or Variable_name like 'wsrep_local_state_comment';
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_recv_queue    | 0      |
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
2 rows in set (0.00 sec)

Session 1: (executing the ALTER )

mysql> alter table sbtest1 add index idx_c(c);

Session 2: (checking again the node status )

Here, the node went into the Donor/Desynced state once the ALTER started. You can see that the queue keeps increasing as well.

mysql> nopager;  show global status where Variable_name like 'wsrep_local_recv_queue' or Variable_name like 'wsrep_local_state_comment';
PAGER set to stdout
+---------------------------+----------------+
| Variable_name             | Value          |
+---------------------------+----------------+
| wsrep_local_recv_queue    | 2053           |
| wsrep_local_state_comment | Donor/Desynced |
+---------------------------+----------------+
2 rows in set (0.21 sec)

Session 1: (ALTER completed)

mysql> alter table sbtest1 add index idx_c(c);
Query OK, 0 rows affected (2 min 6.52 sec)
Records: 0  Duplicates: 0  Warnings: 0

Session 2: (Node synced to cluster)

mysql> show global status where Variable_name like 'wsrep_local_recv_queue' or Variable_name like 'wsrep_local_state_comment';
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_recv_queue    | 0      |
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
2 rows in set (0.00 sec)

This step needs to be executed on pxc81 and pxc83 as well. After it has completed on all nodes, we are good to allow WRITEs to that table.

The RSU method is not truly disruption-free, as there are a few bugs reported regarding RSU. Users should be very clear on these and careful about using RSU for schema upgrades:

https://jira.percona.com/browse/PXC-2620

https://jira.percona.com/browse/PXC-2293

https://jira.percona.com/browse/PXC-1980

Schema Changes with “ONLINE ALGORITHMS”

So far, we have three algorithms:

  • INPLACE
  • COPY
  • INSTANT

With TOI:

“ALGORITHM = INPLACE / COPY” still pauses the cluster during the operation. Galera doesn’t allow transactions when an ALTER TABLE statement is run. So if you are using TOI, any ALTER TABLE will block all transactions on all nodes.

| 17 | root            | localhost | schema_changes | Execute |      12 | closing tables                           | COMMIT                                                               |         0 |             0 |
| 18 | root            | localhost | schema_changes | Execute |      12 | closing tables                           | COMMIT                                                               |         0 |             0 |
| 32 | root            | localhost | schema_changes | Query   |      13 | altering table                           | alter table sbtest1 add index idx_c(c), algorithm=inplace,

“ALGORITHM=INSTANT” is supported and faster in TOI.

mysql> alter table sbtest1 add column test_Ins int , algorithm=instant;
Query OK, 0 rows affected (0.24 sec)
Records: 0  Duplicates: 0  Warnings: 0

With RSU:

“ALGORITHM = INPLACE/COPY” is still not beneficial with RSU. It pauses Galera replication and takes the node to the Desynced state.

mysql> show processlist;
| 62 | root            | localhost | schema_changes | Query   |    51 | altering table                       | alter table sbtest1 add index idx_c(c), algorithm=inplace, lock=none |         0 |             0 |
5 rows in set (0.06 sec)

mysql> nopager;  show global status where Variable_name like 'wsrep_local_recv_queue' or Variable_name like 'wsrep_local_state_comment';
PAGER set to stdout
+---------------------------+----------------+
| Variable_name             | Value          |
+---------------------------+----------------+
| wsrep_local_recv_queue    | 7335           |
| wsrep_local_state_comment | Donor/Desynced |
+---------------------------+----------------+
2 rows in set (0.03 sec)

“ALGORITHM=INSTANT” is supported and faster in RSU as well. But you can still use TOI and avoid the additional work.

mysql> alter table sbtest1 add column test_Inss int , algorithm=instant;
Query OK, 0 rows affected (0.19 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select @@wsrep_OSU_method;
+--------------------+
| @@wsrep_OSU_method |
+--------------------+
| RSU                |
+--------------------+
1 row in set (0.02 sec)

I would suggest using “ALGORITHM=INSTANT” with TOI wherever you can, but make sure you are on MySQL 8.x or later. Unfortunately, “ALGORITHM=INSTANT” currently only supports adding new columns.

Schema Changes with “pt-osc”

pt-osc : Percona-online-schema-change

Personally, I like this approach very much and use it mostly in production environments. Pt-osc provides non-blocking schema upgrades on all nodes in one shot. It should be used with the TOI method. The action flow is as follows:

  • Create a new table “_tablename_new” with the required modification
  • Create triggers to keep the modified rows in sync (insert / update / delete)
  • Copy the records from the original table to the new table using chunked operations
  • Once the copy is completed, it swaps the tables ( original -> _old, _new -> original ) and drops the triggers and the old table. Direct DDLs ( RENAME TABLE, DROP TABLE ) are used for this operation ( wsrep_OSU_method=TOI ).

For the below ALTER, 

alter table schema_changes.sbtest1 add index idx_test_Ins(test_Ins);
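
as a sketch, the pt-online-schema-change invocation driving this change could look like the following (the connection options, chunk size, and flow-control threshold are illustrative; run with --dry-run first and switch to --execute once it looks good):

pt-online-schema-change \
  --alter="add index idx_test_Ins(test_Ins)" \
  D=schema_changes,t=sbtest1 \
  --chunk-size=1000 \
  --max-flow-ctl=50 \
  --dry-run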

Pt-osc flow in SQL syntax:

Creating new table...
CREATE TABLE `schema_changes`.`_sbtest1_new` (
  `id` int NOT NULL AUTO_INCREMENT,
  `k` int NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  `test_Ins` int DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=20400335 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
Created new table schema_changes._sbtest1_new OK.
Altering new table...
ALTER TABLE `schema_changes`.`_sbtest1_new` add index idx_test_Ins(test_Ins)
Altered `schema_changes`.`_sbtest1_new` OK.
Not creating triggers because this is a dry run.
Not copying rows because this is a dry run.
INSERT LOW_PRIORITY IGNORE INTO `schema_changes`.`_sbtest1_new` (`id`, `k`, `c`, `pad`, `test_ins`) SELECT `id`, `k`, `c`, `pad`, `test_ins` FROM `schema_changes`.`sbtest1` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= ?)) AND ((`id` <= ?)) LOCK IN SHARE MODE /*pt-online-schema-change 9253 copy nibble*/
SELECT /*!40001 SQL_NO_CACHE */ `id` FROM `schema_changes`.`sbtest1` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= ?)) ORDER BY `id` LIMIT ?, 2 /*next chunk boundary*/
Not swapping tables because this is a dry run.
Not dropping old table because this is a dry run.
Not dropping triggers because this is a dry run.
DROP TRIGGER IF EXISTS `schema_changes`.`pt_osc_schema_changes_sbtest1_del`
DROP TRIGGER IF EXISTS `schema_changes`.`pt_osc_schema_changes_sbtest1_upd`
DROP TRIGGER IF EXISTS `schema_changes`.`pt_osc_schema_changes_sbtest1_ins`
2020-09-30T08:31:17 Dropping new table...
DROP TABLE IF EXISTS `schema_changes`.`_sbtest1_new`;
2020-09-30T08:31:17 Dropped new table OK.

Pt-osc provides several options to perform the operation effectively. You can control the connections, active threads, load, chunk size, etc.

For Galera, we have the “--max-flow-ctl” option. It checks the average time the cluster spent paused for flow control (FC) and makes the tool pause if that goes over the percentage indicated in the option. By default, the tool does not check the FC.

[root@pxc81 log]# less /bin/pt-online-schema-change  | grep -i pausing
         print STDERR "Pausing because PXC Flow Control is active\n";
         print STDERR "Pausing because "

To make schema changes on tables with FOREIGN KEYs, I would suggest using “--alter-foreign-keys-method=rebuild_constraints”. This helps to maintain the consistency of the schema and its relations. In this approach, before dropping the old table, pt-osc runs an ALTER on each of the child tables to drop the existing FK and re-add a new FK constraint that points to the columns of the new table. Again, adding and dropping the FOREIGN KEYs are direct ALTERs using TOI.
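
A minimal sketch of that invocation against a hypothetical parent table that has child tables referencing it (table and column names are illustrative):

pt-online-schema-change \
  --alter="add column created_at datetime" \
  --alter-foreign-keys-method=rebuild_constraints \
  D=schema_changes,t=parent_table \
  --execute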

Schema changes with “gh-ost”

Gh-ost takes a similar approach to “pt-osc”. It also performs non-blocking ALTERs on all cluster nodes in one shot. The main difference is that gh-ost is triggerless; it uses the binary log to track the changes. So, you need the following variables set to perform the gh-ost operation.

log-bin=sakthi-bin
binlog-format=ROW
log-slave-updates=ON

The flow is as follows:

  • Creates a gh-ost table with the required modifications
  • Copies the records from the original table to the new table using chunked operations
  • Applies the live changes by reading the DML events from the binary logs
  • Once the binary log events are applied, it swaps the tables ( original -> _old, gh-ost -> original ) and drops the old table

Example:

[root@pxc81 schema_changes]# gh-ost --alter="add index idx_test_Inss(test_Ins)" --database=schema_changes --table=sbtest1 --user=root --password=Jesus@7sakthI --allow-on-master --execute
[2020/09/30 09:40:56] [info] binlogsyncer.go:133 create BinlogSyncer with config {99999 mysql 127.0.0.1 3306 root    false false <nil> false UTC true 0 0s 0s 0 false}
[2020/09/30 09:40:56] [info] binlogsyncer.go:354 begin to sync binlog from position (binlog.000027, 196850993)
[2020/09/30 09:40:56] [info] binlogsyncer.go:203 register slave for master server 127.0.0.1:3306
[2020/09/30 09:40:56] [info] binlogsyncer.go:723 rotate to (binlog.000027, 196850993)
# Migrating `schema_changes`.`sbtest1`; Ghost table is `schema_changes`.`_sbtest1_gho`
# Migrating pxc81:3306; inspecting pxc81:3306; executing on pxc81
# Migration started at Wed Sep 30 09:40:56 +0000 2020
# chunk-size: 1000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle 
# Serving on unix socket: /tmp/gh-ost.schema_changes.sbtest1.sock
Copy: 0/6563240 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: binlog.000027:196853401; Lag: 0.02s, State: migrating; ETA: N/A
Copy: 0/6563240 0.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 1s(copy); streamer: binlog.000027:196858195; Lag: 0.01s, State: migrating; ETA: N/A
Copy: 22000/6563240 0.3%; Applied: 0; Backlog: 0/1000; Time: 2s(total), 2s(copy); streamer: binlog.000027:201067135; Lag: 0.01s, State: migrating; ETA: 9m58s

.......

Copy: 5682000/6563240 86.6%; Applied: 0; Backlog: 0/1000; Time: 16m10s(total), 16m10s(copy); streamer: binlog.000028:213168607; Lag: 0.01s, State: migrating; ETA: 2m30s
Copy: 6563000/6563240 100.0%; Applied: 0; Backlog: 0/1000; Time: 20m20s(total), 20m20s(copy); streamer: binlog.000028:382677405; Lag: 0.01s, State: migrating; ETA: 0s

Gh-ost also provides several options to perform effective operations.  You can control the connections, active threads, load, chunk size, etc.

But unfortunately, the “--max-flow-ctl” option is not available in gh-ost.

Conclusion

So, finally, I would say:

  • Always use direct ALTERs with TOI for metadata changes and INSTANT ALTERs.
  • Use pt-online-schema-change with TOI, with optimal flow control thresholds, for InnoDB tables.
  • Schedule pt-online-schema-change operations during off-peak hours for tables with FOREIGN KEYs.
  • If you use RSU, never forget that you need to execute the ALTER on all nodes individually and that you should block WRITEs to that particular table. Make sure your Gcache size is large enough to hold the writesets.
  • If you are concerned about triggers, you can use gh-ost to make the ALTERs.

Oct
23
2019
--

ProxySQL 2.0.7 and proxysql-admin Tool Now Available

ProxySQL

ProxySQL 2.0.7, released by ProxySQL, is now available for download in the Percona Repository along with Percona’s proxysql-admin tool.

ProxySQL is a high-performance proxy, currently for MySQL and database servers in the MySQL ecosystem (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. René Cannaò created ProxySQL for DBAs as a means of solving complex replication topology issues.

This release includes ProxySQL 2.0.7 which introduces many new features and enhancements and also fixes a number of bugs.

The proxysql-admin tool now supports MariaDB 10.4.

New Features

Improvements

  • PSQLADM-195: A new option --with-stats-reset has been added to the proxysql-status script to display the  *_reset tables from the stats database. If this option is not specified, these tables are not displayed by default.

Bugs Fixed

  • PSQLADM-157: In some cases, the proxysql-status script used the cat command to display a file without checking if the file existed and was readable.
  • PSQLADM-181: When run with --update-cluster --write-node=<node_name>, the proxysql-admin tool now verifies that the writer nodes are not read-only.

The ProxySQL 2.0.7 source and binary packages available from the Percona download page for ProxySQL include ProxySQL Admin – a tool developed by Percona to configure Percona XtraDB Cluster nodes into ProxySQL. Docker images for release 2.0.7 are available as well. You can download the original ProxySQL from GitHub. GitHub hosts the documentation in the wiki format.

ProxySQL is available under Open Source license GPLv3.

Oct
07
2019
--

Achieving Disaster Recovery with Percona XtraDB Cluster

Disaster Recovery with Percona XtraDB Cluster

One thing that comes up often from working with a variety of clients at Percona is “How can I achieve a Disaster Recovery (DR) solution with Percona XtraDB Cluster (PXC)?”  Unfortunately, decisions with far-reaching consequences are sometimes made by individuals who do not fully understand the architecture and its limitations.  As a Technical Account Manager (TAM), I am often engaged to help clients look for better solutions, or at least to mitigate as many issues as possible.  Clearly, in a perfect world, we would like to get the right experts involved in these types of discussions to ensure more appropriate solutions, but we all know this is not a perfect world.

One such example involves the idea that if we take a PXC cluster and split it into two datacenters with two nodes in a primary datacenter and one node in a separate datacenter, we will have a hot standby node at all times.  In this case, the application can be pointed to the third node in the event of something catastrophic.  This sounds great…in theory.  The problem is latency.

Latency can cripple a PXC cluster

By design, PXC is meant to work with nodes that can communicate with one another quickly.  The underlying cluster technology, known as Galera, is considered “virtually synchronous” in nature.  In this architecture, writesets are replicated to all active nodes in the cluster at transaction commit and go into a queue.  Next, each node performs a certification of the writeset, which is deterministic in nature.  A bug notwithstanding, each node will either accept or reject the certification in the same manner.  So, either the writeset is applied on all nodes or it is rolled back by all nodes.  What matters in this discussion is the write queue.

As writes come in on one of the nodes, the writesets are replicated to each of the other nodes.  In the above three-node cluster, the writes are certified quickly with the two nodes in the same datacenter.  However, the third node is located in a different datacenter some distance away.  In this case, the writeset must travel across the WAN and will go into a queue (wsrep_local_recv_queue).

So, what’s the problem?

To ensure that one node does not get too far behind the rest of the cluster, any node can send a flow control message to the cluster.  This instructs the cluster to stop replicating new events until the slow node catches up to within some number of writesets as defined by gcs.fc_limit in the configuration.  Essentially, when the number of transactions in the queue exceeds the gcs.fc_limit, flow control messages will be sent and the cluster will stop replicating new writesets.  Unless you have changed it, this will be 16 writesets.
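
As a sketch, you can see how often flow control has kicked in and, if needed, raise the limit at runtime (the value of 100 is illustrative; raising it only delays flow control, it does not remove the WAN latency):

mysql> SHOW GLOBAL STATUS LIKE 'wsrep_flow_control%';
mysql> SET GLOBAL wsrep_provider_options = 'gcs.fc_limit=100';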

Remember that PXC is virtually synchronous

When replication stops, all nodes stop accepting writes momentarily.  In this event, the system seems to stall until the local recv queue makes some space for new writesets, at which point replication will continue.  This can appear as a stall to the application and leads to huge performance issues.

So, what is a better solution to the above situation?  It is preferable to utilize an asynchronous Slave server replicating from the PXC cluster for failover.  This is the same replication that is built into Percona Server and not Galera.  While this may include adding another server, this could be mitigated by the use of garbd. This process will act as an arbitrator to maintain quorum of the cluster and thus decrease the number of data nodes in PXC by running the lightweight garbd process on an app server or some other server in the environment.  This keeps server count down in cases that are cost-sensitive.
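
If you go the garbd route, a minimal sketch of running the arbitrator on an app server looks like this (the cluster name and node addresses are illustrative):

# garbd joins the cluster as a voting member but stores no data
garbd --group=pxc-cluster \
      --address="gcomm://pxc-node1:4567,pxc-node2:4567" \
      --log=/var/log/garbd.log --daemon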

The asynchronous nature means no sending of flow controls from the node in the remote datacenter.  Replication will lag and catch up as needed with the PXC cluster none the wiser.  Because all nodes in the PXC cluster are local to one another, ideally latency is minimal and writesets can be applied much more quickly and stalls minimized.  Of course, this does come with a few challenges.

One challenge is that due to the nature of asynchronous replication, there can be significant lag in the DR node.  This is not always an issue, however, as the only time you are using this server is during a disaster.  The writes will be sent over immediately by the Master, so it is reasonable to expect that the DR node will eventually catch up and you hope to not lose any data, although there are no guarantees.

This brings us to another concern.  Simple asynchronous replication has no guarantee of consistency like PXC provides.  In PXC, there are controls in place to guarantee consistency, but asynchronous replication provides none.  There are, therefore, cases where data drift can occur between Master and Slave.  To mitigate this risk, you can use pt-table-checksum from the Percona Toolkit to detect inconsistency between the Master node of the PXC cluster and the Slave and rectify it with pt-table-sync from the same toolkit.  This, of course, requires that you run this process often.  If it is done as part of an automated process, it should also be monitored to ensure it is being done and whether or not data drift is occurring.
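
A rough sketch of that check, assuming illustrative host names and a user with the required privileges (credentials are read from the usual option files here):

# Run on the PXC node acting as the async Master
pt-table-checksum h=pxc-master,u=checksum_user --databases=app_db

# Review the differences and sync the DR Slave back to its Master
pt-table-sync --print --sync-to-master h=dr-slave,u=checksum_user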

You will also want to monitor that the Master node does not go down, as there is no built-in process of failing the DR node over to a new Master within the PXC cluster.  Our very own Yves Trudeau wrote a utility to manage this, and more information can be found here: https://github.com/y-trudeau/Mysql-tools/tree/master/PXC

Improving Performance

While this solution presents some additional complexity, it does provide for a more performant PXC cluster.  As a TAM, I have seen geographically-distributed PXC result in countless incidents of a system down in production.  Even when the system doesn’t go down, it often slows down due to the latency issues.  Why take that performance impact on every transaction for a DR solution that you hope never to use, when there is an alternative?  Instead, you could benefit from an approach like the one proposed here, which provides an acceptable failover solution while improving performance day in and day out.

Oct
04
2019
--

Percona XtraDB Cluster 8.0 (experimental release) : SST Improvements

xtradb sst improvements

Starting with the experimental release of Percona XtraDB Cluster 8.0, we have made changes to the SST process to make it more robust and easier to use.

  • mysqldump and rsync are no longer supported SST methods.

    Support for mysqldump was deprecated starting with PXC 5.7 and has now been completely removed.

    MySQL 8.0 introduced a new Redo Log format that limited the use of rsync while upgrading from PXC 5.7 to 8.0. In addition, the new Galera-4 also introduced changes that further limit the use of rsync.

    The only supported SST method is xtrabackup-v2.

  • A separate Percona XtraBackup installation is no longer required.

    The required Percona XtraBackup (PXB) binaries are now shipped as part of PXC 8.0; they are not installed for general use. So if you want to use PXB outside of an SST, you will have to install PXB separately.

  • SST logging now uses MySQL error logging

    Previously, the SST script would write directly to the error log file. Now, the SST script uses MySQL error logging. A side effect of this change is that the SST logs are not immediately visible. This is due to the logging subsystem being initialized after the SST has completed.

  • The wsrep_sst_auth variable has been removed.

    PXC 8.0 now creates an internal user (mysql.pxc.sst.user) with a random password for use by PXB to take the backup. The cleartext of the password is not saved and the user is deleted after the SST has completed.

    (This feature is still in development and may change before PXC 8.0 GA)

  • PXC SST auto-upgrade

    When PXC 8.0 detects that the SST came from a lower version, mysql_upgrade is automatically invoked. Also, “RESET SLAVE ALL” is run on the new node if needed. This is invoked when receiving an SST from either PXC 5.7 or PXC 8.0.

    (This feature is still in development and may change before PXC 8.0 GA)

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Oct
03
2019
--

Percona XtraDB Cluster 8.0 New Feature: wsrep_sst_auth Removal

Experimental Binary XtraDB 8.0

The problem

In PXC 5.6 and 5.7, when using xtrabackup-v2 as the SST method, the DBA must create a user with the appropriate privileges for use by Percona XtraBackup (PXB). The username and password of this backup user are specified in the wsrep_sst_auth variable.

This is a problem because the username and password were stored in plaintext, which required that the configuration file be secured.

The PXC 8.0 solution

(This feature is still under development and may change before PXC 8.0 GA)

Because the wsrep_sst_auth is only needed on the donor side to take a backup, PXC 8.0 uses an internal user (created specifically for use by PXC) with a randomly generated password. Since this user is only needed on the donor, the plaintext password is not needed on the joiner node.

This password consists of 32 characters generated at random. A new password is generated for each SST request. The plaintext of the password is never saved and never leaves the node. The username/password is sent to the SST script via unnamed pipes (stdin).

New PXC internal user accounts

mysql.pxc.internal.session

The mysql.pxc.internal.session user account provides the appropriate security context to create and set up the other PXC accounts. This account has a limited set of privileges, just enough to create the mysql.pxc.sst.user.

This account is locked and cannot be used to login (the password field will not allow login).

mysql.pxc.sst.user

The mysql.pxc.sst.user is used by XtraBackup to perform the backup. This account has the full set of privileges needed by XtraBackup. This account is created for an SST and is dropped at the end of the SST and also when the PXC node is shut down. The creation/provisioning of this user account is not written to the binlog and is not replicated to other nodes. However, this account is sent with the backup to the joiner node, so the joiner node also has to drop this user after the SST has finished.

mysql.pxc.sst.role

The mysql.pxc.sst.role is the MySQL role that provides the privileges needed for XtraBackup. This allows for easy addition/removal of privileges needed for an SST.

The experimental release of PXC is based on MySQL 8.0.15, and we have not implemented the role-based support due to issues found with MySQL 8.0.15. This will be revisited in future versions of PXC 8.0.

Program flow

  1. DONOR node receives SST request from the JOINER
  2. DONOR node generates a random password and creates the internal SST user
    SET SESSION sql_log_bin = OFF;
    DROP USER IF EXISTS 'mysql.pxc.sst.user'@localhost;
    CREATE USER 'mysql.pxc.sst.user'@localhost IDENTIFIED WITH 'mysql_native_password' BY 'XXXXXXXX' ACCOUNT LOCK;
    GRANT 'mysql.pxc.sst.role'@localhost TO 'mysql.pxc.sst.user'@localhost;
    SET DEFAULT ROLE 'mysql.pxc.sst.role'@localhost to 'mysql.pxc.sst.user'@localhost;
    ALTER USER 'mysql.pxc.sst.user'@localhost ACCOUNT UNLOCK;

    The code that uses role is not being used in the current release due to issues with MySQL 8.0.15. Currently, we create the user with all the permissions needed explicitly.

  3. Launch the SST script (passing the username/password via stdin)
  4. SST uses the username/password to perform the backup
  5. SST script exits
  6. The DONOR node drops the user.
  7. The JOINER node receives the backup and drops the user. Note that the JOINER node also contains the internal SST user!

As a precaution, the user is also dropped when the server is shutdown.

Oct
01
2019
--

Experimental Binary of Percona XtraDB Cluster 8.0

Experimental Binary XtraDB 8.0

Percona is happy to announce the first experimental binary of Percona XtraDB Cluster 8.0 on October 1, 2019. This is a major step for tuning Percona XtraDB Cluster to be more cloud- and user-friendly. This release combines the updated and feature-rich Galera 4, with substantial improvements made by our development team.

Improvements and New Features

Galera 4, included in Percona XtraDB Cluster 8.0, has many new features. Here is a list of the most essential improvements:

  • Streaming replication supports large transactions
  • The synchronization functions allow action coordination (wsrep_last_seen_gtid, wsrep_last_written_gtid, wsrep_sync_wait_upto_gtid)
  • More granular and improved error logging. wsrep_debug is now a multi-valued variable to assist in controlling the logging, and logging messages have been significantly improved.
  • Some DML and DDL errors on a replicating node can either be ignored or suppressed. Use the wsrep_ignore_apply_errors variable to configure.
  • Multiple system tables help you find out more about the state of the cluster.
  • The wsrep infrastructure of Galera 4 is more robust than that of Galera 3. It features faster execution of code with better state handling, improved predictability, and error handling.

Percona XtraDB Cluster 8.0 has been reworked in order to improve security and reliability as well as to provide more information about your cluster:

  • There is no need to create a backup user or maintain the credentials in plain text (a security flaw). An internal SST user is created, with a random password for making a backup, and this user is discarded immediately once the backup is done.
  • Percona XtraDB Cluster 8.0 now automatically launches the upgrade as needed (even for minor releases). This avoids manual intervention and simplifies the operation in the cloud.
  • SST (State Snapshot Transfer) rolls back or fixes an unwanted action. It is no longer “a copy only block” but a smart operation that makes the best use of the copy phase.
  • Additional visibility statistics are introduced in order to obtain more information about Galera internal objects. This enables easy tracking of the state of execution and flow control.

Installation

You can only install this release from a tarball; it cannot, therefore, be installed through a package management system such as apt or yum. Note that this release is not ready for use in any production environment.

Percona XtraDB Cluster 8.0 is based on the following:

Please be aware that this release will not be supported in the future, and as such, neither the upgrade to this release nor the downgrade from higher versions is supported.

This release is also packaged with Percona XtraBackup 8.0.5. All Percona software is open-source and free.

In order to experiment with Percona XtraDB Cluster 8.0 in your environment, download and unpack the tarball for your platform.

Note

Be sure to check your system and make sure that the packages are installed which Percona XtraDB Cluster 8.0 depends on.

For Debian or Ubuntu:

$ sudo apt-get install -y \
socat libdbd-mysql-perl \
rsync libaio1 libc6 libcurl3 libev4 libgcc1 libgcrypt20 \
libgpg-error0 libssl1.1 libstdc++6 zlib1g libatomic1

For Red Hat Enterprise Linux or CentOS:

$ sudo yum install -y openssl socat  \
procps-ng chkconfig procps-ng coreutils shadow-utils \
grep libaio libev libcurl perl-DBD-MySQL perl-Digest-MD5 \
libgcc rsync libstdc++ libgcrypt libgpg-error zlib glibc openssl-libs

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Sep
26
2019
--

Running Percona XtraDB Cluster on Raspberry PI 3

Percona XtraDB Cluster on Raspberry PI 3

In a previous post, I showed you how to compile Percona MySQL 5.7 on Raspberry Pi 3. Now, I’ll show you how to compile and run the latest version of Percona XtraDB Cluster, 5.7.26.

We will need at least 3 Raspberry Pi 3 boards, and I recommend using an external SSD drive to compile on and to use as MySQL’s “datadir”, to avoid the stalls associated with the microSD card, which will cause PXC to run slowly.

In this post, we are going to run many OS commands and configure PXC, so I recommend having at least basic knowledge of PXC and Linux commands.

How to install CentOS

Download the CentOS image from this link: http://mirror.ufro.cl/centos-altarch/7.6.1810/isos/armhfp/CentOS-Userland-7-armv7hl-RaspberryPI-Minimal-1810-sda.raw.xz

I’m using this open source software https://www.balena.io/etcher/ to flash/burn the microSD card because it is very simple and works fine. I recommend using a 16GB card or larger.

If the previous process finished OK, take out the microSD card, put it into the Raspberry Pi, and boot it. Wait until the boot process finishes and log in with this user and password:

user: root
password: centos

By default, the root partition is very small and you need to expand it. Run the next command; this process is fast.

/usr/bin/rootfs-expand

Finally, to finish the installation process, it is necessary to install the many packages needed to compile PXC and percona-xtrabackup:

yum install -y cmake wget automake bzr gcc make telnet gcc-c++ vim libcurl-devel readline-devel ncurses-devel screen zlib-devel bison locate libaio libaio-devel autoconf socat libtool  libgcrypt-devel libev-devel vim-common check-devel openssl-devel boost-devel glib2-devel

Repeat all the previous steps on all three db nodes, because some libs are needed to start mysql and to work with percona-xtrabackup.

How to compile PXC + Galera + Xtrabackup

I am using an external SSD disk because it’s faster than the microSD card. I mounted it on the /mnt directory, and we will use this partition to download, compile, and create the datadir.
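
A minimal sketch of preparing and mounting the SSD, assuming it shows up as /dev/sda1 (check with lsblk first; the device name and filesystem are illustrative):

mkfs.ext4 /dev/sda1
mount /dev/sda1 /mnt
# make the mount persistent across reboots
echo '/dev/sda1 /mnt ext4 defaults,noatime 0 0' >> /etc/fstab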

For the latest PXC version, download and decompress it using the following commands:

cd /mnt
wget https://www.percona.com/downloads/Percona-XtraDB-Cluster-57/Percona-XtraDB-Cluster-5.7.26-31.37/source/tarball/Percona-XtraDB-Cluster-5.7.26-31.37.tar.gz
tar zxf Percona-XtraDB-Cluster-5.7.26-31.37.tar.gz
cd Percona-XtraDB-Cluster-5.7.26-31.37

The Raspberry Pi 3 is not an x86_64 architecture; it uses ARMv7, which forces us to compile PXC from scratch. For me, this process took about 6 hours. I recommend running the next commands in a screen session.

screen -SDRL compile_pxc
cmake -DINSTALL_LAYOUT=STANDALONE -DCMAKE_INSTALL_PREFIX=/mnt/pxc5726 -DDOWNLOAD_BOOST=ON -DWITH_BOOST=/mnt/sda1/my_boost -DWITH_WSREP=1 .
make
make install

Create a new OS user to be used by the mysql process.

useradd --no-create-home mysql

To continue, it is necessary to create the datadir directory. I recommend using the attached SSD disk and setting the mysql permissions on the directory:

mkdir /mnt/pxc5726/data/
chown mysql.mysql /mnt/pxc5726/data/

Configure my.cnf with the minimal params to start mysql.

vim /etc/my.cnf

[client]
socket=/mnt/pxc5726/data/mysql.sock

[mysqld]
datadir = /mnt/pxc5726/data
binlog-format = row
log_bin = /mnt/pxc5726/data/binlog
innodb_buffer_pool_size = 128M
socket=/mnt/pxc5726/data/mysql.sock
symbolic-links=0

wsrep_provider=/mnt/pxc5726/libgalera_smm.so

wsrep_on=ON
wsrep_cluster_name="pxc-cluster"
wsrep_cluster_address="gcomm://192.168.88.134"

wsrep_node_name="pxc1"
wsrep_node_address=192.168.88.134

wsrep_debug=1

wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:passw0rd

pxc_strict_mode=ENFORCING

[mysqld_safe]
log-error=/mnt/pxc5726/data/mysqld.log
pid-file=/mnt/pxc5726/data/mysqld.pid

So far we’ve only compiled PXC. Before we can start the mysql process, we also need to compile the Galera library, as without this library mysql will start as a standalone db node and will not have any “cluster” capabilities.

To build libgalera_smm.so you need to install scons.

Download using the next command:

cd /mnt
wget http://prdownloads.sourceforge.net/scons/scons-3.1.0.tar.gz
tar zxf scons-3.1.0.tar.gz
cd scons-3.1.0
python setup.py install

Now we will proceed to compile libgalera. The source code for this library is included in the Percona XtraDB Cluster source directory we downloaded:

cd /mnt/Percona-XtraDB-Cluster-5.7.26-31.37/percona-xtradb-cluster-galera
scons -j4 libgalera_smm.so debug=3 psi=1 BOOSTINCLUDE=/mnt/my_boost/boost_1_59_0/boost BOOST_LIBS=/mnt/my_boost/boost_1_59_0/libs

If the compile process finishes ok, it will create a new file like this:

/mnt/Percona-XtraDB-Cluster-5.7.26-31.37/percona-xtradb-cluster-galera/libgalera_smm.so

I recommend copying libgalera_smm.so to the installed PXC directory so it is not lost if you later remove the source directory.

cp /mnt/Percona-XtraDB-Cluster-5.7.26-31.37/percona-xtradb-cluster-galera/libgalera_smm.so /mnt/pxc5726

Also, this is useful because, after compiling and installing all the packages, we can create a compressed file and copy it to the rest of the db nodes to avoid having to compile these packages again.

The last step is to download and compile percona-xtrabackup. I used the latest version:

screen -SDRL compile_xtrabackup

cd /mnt
wget https://www.percona.com/downloads/Percona-XtraBackup-2.4/Percona-XtraBackup-2.4.15/source/tarball/percona-xtrabackup-2.4.15.tar.gz
tar zxf percona-xtrabackup-2.4.15.tar.gz
cd percona-xtrabackup-2.4.15
cmake -DBUILD_CONFIG=xtrabackup_release -DWITH_MAN_PAGES=OFF -DCMAKE_INSTALL_PREFIX=/mnt/xtrabackup .
make
make install

If all the previous steps were successful, we will add those binary directories in the PATH variable:

vim /etc/profile.d/percona.sh

export PATH=$PATH:/mnt/xtrabackup/bin:/mnt/pxc5726/bin

Save and exit and run the “export” command manually to set it in the active session:

export PATH=$PATH:/mnt/xtrabackup/bin:/mnt/pxc5726/bin

Well, so far we have PXC, Galera-lib, and percona-xtrabackup compiled for the ARMv7 architecture, which is all that is needed to work with PXC, so now we can start playing.

We will call the first host where we compiled “pxc1”. We will proceed to set the hostname:

$ vim /etc/hostname
pxc1

$ hostname pxc1
$ exit

It is necessary to reconnect via ssh for the hostname change to take effect, and you will see something like this:

$ ssh ip
[root@pxc1 ~]#

Now we are ready to open the ports needed by PXC: xtrabackup, Galera, and mysql.

$ firewall-cmd --permanent --add-port=3306/tcp
$ firewall-cmd --permanent --add-port=4567/tcp
$ firewall-cmd --permanent --add-port=4444/tcp

$ firewall-cmd --reload

$ firewall-cmd --list-ports
22/tcp 3306/tcp 4567/tcp 4444/tcp

Repeat the above procedures to set the hostname and open ports in the other db nodes.

Finally, to start the first node you’ll need to initialize the system databases and then launch the mysql process with the --wsrep-new-cluster option. This will bootstrap the cluster so that it starts with 1 node. Run the following from the command line:

$ mysqld --initialize-insecure --user=mysql --basedir=/mnt/pxc5726 --datadir=/mnt/pxc5726/data
$ mysqld_safe --defaults-file=/etc/my.cnf --user=mysql --wsrep-new-cluster &

Check the mysql error log to see any errors:

$ tailf /mnt/pxc5726/data/mysqld.log

...
2019-09-08T23:23:17.415991Z 0 [Note] /mnt/pxc5726/bin/mysqld: ready for connections.
...

Create the new mysql user; this will be used by xtrabackup to sync another db node during the IST or SST process.

$ mysql

CREATE USER 'sstuser'@'localhost' IDENTIFIED BY 'passw0rd';
GRANT PROCESS, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost';
FLUSH PRIVILEGES;

Lastly, I recommend checking if the galera library was loaded successfully:

$ mysql

show global status where Variable_name='wsrep_provider_name' 
or Variable_name='wsrep_provider_name' 
or Variable_name='wsrep_provider_vendor' 
or Variable_name='wsrep_local_state' 
or Variable_name='wsrep_local_state_comment' 
or Variable_name='wsrep_cluster_size' 
or Variable_name='wsrep_cluster_status' 
or Variable_name='wsrep_connected' 
or Variable_name='wsrep_ready' 
or Variable_name='wsrep_evs_state' 
or Variable_name like 'wsrep_flow_control%';

You will see something like this:

+----------------------------------+-----------------------------------+
| Variable_name                    | Value                             |
+----------------------------------+-----------------------------------+
| wsrep_flow_control_paused_ns     | 0                                 |
| wsrep_flow_control_paused        | 0.000000                          |
| wsrep_flow_control_sent          | 0                                 |
| wsrep_flow_control_recv          | 0                                 |
| wsrep_flow_control_interval      | [ 100, 100 ]                      |
| wsrep_flow_control_interval_low  | 100                               |
| wsrep_flow_control_interval_high | 100                               |
| wsrep_flow_control_status        | OFF                               |
| wsrep_local_state                | 4                                 |
| wsrep_local_state_comment        | Synced                            |
| wsrep_evs_state                  | OPERATIONAL                       |
| wsrep_cluster_size               | 1                                 | <----
| wsrep_cluster_status             | Primary                           |
| wsrep_connected                  | ON                                |
| wsrep_provider_name              | Galera                            |
| wsrep_provider_vendor            | Codership Oy <info@codership.com> |
| wsrep_ready                      | ON                                |
+----------------------------------+-----------------------------------+

As you can see this is the first node and the cluster size is 1.

How to copy the previously compiled code to the rest of the db nodes

You don’t need to compile again on the rest of the db nodes; just compress the PXC and XtraBackup directories and copy them to the rest of the db nodes, nothing else.

In the first step, we compiled and installed into the following directories:

/mnt/pxc5726
/mnt/xtrabackup

We’re going to proceed to compress both directories and copy them to the other servers (pxc2 and pxc3):

$ cd /mnt
$ tar czf pxc5726.tgz pxc5726
$ tar czf xtrabackup.tgz xtrabackup

Let’s copy to each db node:

$ cd /mnt
$ scp pxc5726.tgz xtrabackup.tgz IP_PXC2:/mnt

Now connect to each db node, decompress the archives, configure my.cnf and the other settings, and start mysql.

$ ssh IP_PXC2
$ cd /mnt
$ tar zxf pxc5726.tgz
$ tar zxf xtrabackup.tgz

From here we are going to repeat several steps that we did previously.

Connect to IP_PXC2.

$ ssh IP_PXC2

Add the next directories in the PATH variable:

vim /etc/profile.d/percona.sh

export PATH=$PATH:/mnt/xtrabackup/bin:/mnt/pxc5726/bin

Save and exit and run the “export” command manually to set it in the active session:

export PATH=$PATH:/mnt/xtrabackup/bin:/mnt/pxc5726/bin

Create a new OS user to be used by mysql process:

useradd --no-create-home mysql

To continue, it’s necessary to create the datadir directory. I recommend using the attached SSD disk and setting the mysql permissions on the directory.

mkdir /mnt/pxc5726/data/
chown mysql.mysql /mnt/pxc5726/data/

Configure my.cnf with the minimal params to start mysql.

vim /etc/my.cnf

[client]
socket=/mnt/pxc5726/data/mysql.sock

[mysqld]
datadir = /mnt/pxc5726/data
binlog-format = row
log_bin = /mnt/pxc5726/data/binlog
innodb_buffer_pool_size = 128M
socket=/mnt/pxc5726/data/mysql.sock
symbolic-links=0

wsrep_provider=/mnt/pxc5726/libgalera_smm.so

wsrep_on=ON
wsrep_cluster_name="pxc-cluster"
wsrep_cluster_address="gcomm://IP_PXC1,IP_PXC2"

wsrep_node_name="pxc2"
wsrep_node_address=IP_PXC2

wsrep_debug=1

wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:passw0rd

pxc_strict_mode=ENFORCING

[mysqld_safe]
log-error=/mnt/pxc5726/data/mysqld.log
pid-file=/mnt/pxc5726/data/mysqld.pid

Now we need to start the second db node. This time you don’t need to add the “--wsrep-new-cluster” param (it is not needed for the second and subsequent nodes); just start mysql by running the next command:

$ mysqld_safe --defaults-file=/etc/my.cnf --user=mysql &

Check the mysql error log to see any error:

$ tailf /mnt/pxc5726/data/mysqld.log

...
2019-09-08T23:53:17.415991Z 0 [Note] /mnt/pxc5726/bin/mysqld: ready for connections.
...

Check if the galera library was loaded successfully:

$ mysql

show global status where Variable_name='wsrep_provider_name' 
or Variable_name='wsrep_provider_name' 
or Variable_name='wsrep_provider_vendor' 
or Variable_name='wsrep_local_state' 
or Variable_name='wsrep_local_state_comment' 
or Variable_name='wsrep_cluster_size' 
or Variable_name='wsrep_cluster_status' 
or Variable_name='wsrep_connected' 
or Variable_name='wsrep_ready' 
or Variable_name='wsrep_evs_state' 
or Variable_name like 'wsrep_flow_control%';

You will see something like this:

+----------------------------------+-----------------------------------+
| Variable_name                    | Value                             |
+----------------------------------+-----------------------------------+
| wsrep_flow_control_paused_ns     | 0                                 |
| wsrep_flow_control_paused        | 0.000000                          |
| wsrep_flow_control_sent          | 0                                 |
| wsrep_flow_control_recv          | 0                                 |
| wsrep_flow_control_interval      | [ 100, 100 ]                      |
| wsrep_flow_control_interval_low  | 100                               |
| wsrep_flow_control_interval_high | 100                               |
| wsrep_flow_control_status        | OFF                               |
| wsrep_local_state                | 4                                 |
| wsrep_local_state_comment        | Synced                            |
| wsrep_evs_state                  | OPERATIONAL                       |
| wsrep_cluster_size               | 2                                 | <---
| wsrep_cluster_status             | Primary                           |
| wsrep_connected                  | ON                                |
| wsrep_provider_name              | Galera                            |
| wsrep_provider_vendor            | Codership Oy <info@codership.com> |
| wsrep_ready                      | ON                                |
+----------------------------------+-----------------------------------+

Excellent, we have the second node up; it’s now part of the cluster and the cluster size is 2.

Repeat the previous steps to configure IP_PXC3 and any additional nodes.

Summary

We are ready to start playing and doing a lot of tests. I recommend running sysbench to test how many transactions this environment supports, killing some nodes to test SST and IST, and using pt-online-schema-change; check this blog, “How to Perform Compatible Schema Changes in Percona XtraDB Cluster”, to learn how this tool works with PXC.
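
A rough sketch of such a sysbench run, assuming a test database and a user with enough privileges already exist (connection details, table counts, and durations are illustrative):

# prepare the test tables, then run a 5-minute read/write workload
sysbench oltp_read_write --mysql-host=192.168.88.134 --mysql-user=root \
  --mysql-password=secret --mysql-db=test --tables=4 --table-size=100000 prepare
sysbench oltp_read_write --mysql-host=192.168.88.134 --mysql-user=root \
  --mysql-password=secret --mysql-db=test --tables=4 --table-size=100000 \
  --threads=4 --time=300 run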

I hope you enjoyed this guide, and if you want to compile other PXC 5.7.X versions, please go ahead; the steps will be very similar.

Sep
18
2019
--

Percona XtraDB Cluster 5.7.27-31.39 Is Now Available

Percona XtraDB Cluster

Percona is happy to announce the release of Percona XtraDB Cluster 5.7.27-31.39 on September 18, 2019. Binaries are available from the downloads section or from our software repositories.

Percona XtraDB Cluster 5.7.27-31.39 is now the current release, based on the following:

All Percona software is open-source and free.

Bugs Fixed

  • PXC-2432: PXC was not updating the information_schema user/client statistics properly.
  • PXC-2555: SST initialization delay: fixed a bug where the SST process took too long to detect if a child process was running.
  • PXC-2557: Fixed a crash when a node goes NON-PRIMARY and SHOW STATUS is executed.
  • PXC-2592: PXC restarting automatically on data inconsistency.
  • PXC-2605: PXC could crash when log_slow_verbosity included InnoDB.  Fixed upstream PS-5820.
  • PXC-2639: Fixed an issue where a SQL admin command (like OPTIMIZE) could cause a deadlock.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Sep
17
2019
--

Percona XtraDB Cluster 5.6.45-28.36 Is Now Available

Percona XtraDB Cluster

Percona is glad to announce the release of Percona XtraDB Cluster 5.6.45-28.36 on September 17, 2019. Binaries are available from the downloads section or from our software repositories.

Percona XtraDB Cluster 5.6.45-28.36 is now the current release, based on the following:

All Percona software is open-source and free.

Bugs Fixed

  • PXC-2432: PXC was not updating the information schema user/client statistics properly.
  • PXC-2555: SST initialization delay: fixed a bug where the SST process took too long to detect if a child process was running.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!
