Apr
17
2018
--

Webinar Wednesday, April 18, 2018: Percona XtraDB Cluster 5.7 Tutorial

Percona XtraDB Cluster Tutorial

Please join Percona’s Architect, Tibi Köröcz, as he presents Percona XtraDB Cluster 5.7 Tutorial on Wednesday, April 18, 2018, at 7:00 am PDT (UTC-7) / 10:00 am EDT (UTC-4).

Never used Percona XtraDB Cluster before? Come join this 45-minute tutorial where we will introduce you to the concepts of a fully functional Percona XtraDB Cluster.

In this tutorial, we will show you how you can install Percona XtraDB Cluster with ProxySQL, and monitor it with Percona Monitoring and Management (PMM).

We will also cover topics like bootstrap, IST, SST, Certification, common-failure situations and online schema changes.

Register for the webinar now.

Tibor Köröcz, Senior Consultant

Tibi joined Percona in 2015 as a Consultant. Before joining Percona, among many other things, he worked at the world’s largest car hire booking service as a Senior Database Engineer. He enjoys trying and working with the latest technologies and applications that can help or work with MySQL. In his spare time, he likes to spend time with his friends, travel around the world and play ultimate frisbee.

The post Webinar Wednesday, April 18, 2018: Percona XtraDB Cluster 5.7 Tutorial appeared first on Percona Database Performance Blog.

Apr
16
2018
--

ProxySQL 1.4.7 and Updated proxysql-admin Tool Now in the Percona Repository

ProxySQL Admin

ProxySQL 1.4.7, released by ProxySQL, is now available for download in the Percona Repository along with an updated version of Percona’s proxysql-admin tool.

ProxySQL is a high-performance proxy, currently for MySQL and its forks (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. René Cannaò created ProxySQL for DBAs as a means of solving complex replication topology issues.

The ProxySQL 1.4.7 source and binary packages available at https://percona.com/downloads/proxysql include ProxySQL Admin – a tool developed by Percona to configure Percona XtraDB Cluster nodes into ProxySQL. Docker images for release 1.4.7 are available as well: https://hub.docker.com/r/percona/proxysql/. You can download the original ProxySQL from https://github.com/sysown/proxysql/releases.

This release fixes the following bugs in ProxySQL Admin:

Usability improvements:

  • Added the proxysql-status tool to dump ProxySQL configuration and statistics.

Bug fixes:

  • PSQLADM-2: The ProxySQL Galera checker script didn’t check whether another instance of itself was already running. While running more than one copy of proxysql_galera_checker in the same runtime environment at the same time is still not supported, the introduced fix is able to prevent duplicate script execution in most cases.
  • PSQLADM-40: The ProxySQL scheduler generated a lot of proxysql_galera_checker and proxysql_node_monitor processes when wrong ProxySQL credentials were given in the proxysql-admin.cnf file.
  • PSQLADM-41: Timeout error handling was improved with clear messages.
  • PSQLADM-42: An inconsistency of the date format in ProxySQL and scripts was fixed.
  • PSQLADM-43: proxysql_galera_checker didn’t take into account the possibility of special characters being present in mysql-monitor_password.
  • PSQLADM-44: proxysql_galera_checker generated unclear errors in the proxysql.log file if wrong credentials were passed.
  • PSQLADM-46: The proxysql_node_monitor script incorrectly split the hostname and the port number in URLs containing a hyphen character.

ProxySQL is available under OpenSource license GPLv3.

The post ProxySQL 1.4.7 and Updated proxysql-admin Tool Now in the Percona Repository appeared first on Percona Database Performance Blog.

Mar
28
2018
--

Percona XtraDB Cluster on Amazon GP2 Volumes

In this blog post, we look at the performance of Percona XtraDB Cluster on Amazon GP2 volumes.

In our overview blog post on Best Practices for Percona XtraDB Cluster on AWS, GP2 volumes did not show good results. However, we allocated only the size needed to fit the database (200GB volumes). Percona XtraDB Cluster did not show good performance on these volumes, which provided only limited IOPS.

After publishing our material, Amazon engineers pointed out that we should try GP2 volumes sized to provide 10000 IOPS: if we allocated volumes of 3.3 TiB or more, we should achieve 10000 IOPS.

It might not be initially clear what the benefit is of allocating 3.3 TiB volumes for a database that is only 100 GiB in size, but in reality GP2 volumes of this size are cheaper than IO provisioned volumes that provide 10000 IOPS. Below, we will show Percona XtraDB Cluster results on GP2 volumes 3.3 TB in size.
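As a rough sanity check of both the sizing and the cost claim (using the us-east-1 list prices and the 3 IOPS per GiB GP2 baseline in effect at the time of writing; treat the exact figures as assumptions and verify against current AWS pricing):

  • GP2 baseline performance: 3 IOPS per GiB, capped at 10000 IOPS, so 10000 / 3 ≈ 3334 GiB ≈ 3.3 TiB.
  • GP2, 3.3 TiB: roughly 3380 GiB * $0.10 per GiB-month ≈ $338/month.
  • IO1, 200 GiB with 10000 provisioned IOPS: 200 * $0.125 + 10000 * $0.065 ≈ $675/month.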

In the previous post, we used four different instance sizes: r4.large, r4.xlarge, r4.2xlarge and r4.4xlarge. In this case, with GP2 volumes of 3.3 TB in size, we will test only the r4.2xlarge and r4.4xlarge instances.

The dataset and workload are the same as in the previous post.

First, let’s review throughput and latency:

Percona XtraDB Cluster on Amazon GP2 Volumes

The legend:

  • r4/gp2 – the previous results (on GP2 volumes 200GB)
  • r4/gp2.3T – the results on GP2 volumes with 3.3 TB in size
  • r4/io1.10k – IO provisioned volumes 10000 IOPS
  • i3/nvme – i3 instances with NVMe storage.

The takeaway from these results is that 3.3 TB GP2 volumes greatly improve performance, and the results are comparable with IO provisioned volumes.

To compare the stability of latency on GP2 vs. IO1, we check the latency distribution (99% latency with 5-second interval resolution):

Percona XtraDB Cluster on Amazon GP2 Volumes 2

There is no major difference between these volume types.

With cloud resources, you should always consider cost. Let’s review the cost of the volumes themselves:

Percona XtraDB Cluster on Amazon GP2 Volumes 3

We can see that 3.3TB GP2 volumes are much more expensive than 200GB volumes, but still only about half the cost of IO provisioned volumes (when we add the cost of provisioned IOPS).

And to compare the full cost of resources, let’s review the cost of an instance (we will use 1-year reserved prices):

Percona XtraDB Cluster on Amazon GP2 Volumes 4

The points of interest:

  • The cost of an r4.2xlarge instance:
    • With 3.3TB GP2 volume: 78.5 cents/hour
    • With IO1 volume: 125.03 cents/hour
  • The cost of an r4.4xlarge instance
    • With 3.3TB GP2 volume: 109.25 cents/hour
    • With IO1 volume: 156.32 cents/hour

And given the near-identical throughput, it may be more economically feasible to use 3.3TB GP2 volumes instead of IO provisioned volumes.

Now we can compare the transactions per second cost of 3.3 TB GP2 volumes with other instances:

Percona XtraDB Cluster on Amazon GP2 Volumes 5

While i3 instances are still a clear winner, if you need to have the capabilities that EBS volumes provide, you might want to consider large GP2 volumes instead of IO provisioned volumes.

In general, large GP2 volumes provide a way to increase IO performance, and they seem to be a viable alternative to IO provisioned volumes.

The post Percona XtraDB Cluster on Amazon GP2 Volumes appeared first on Percona Database Performance Blog.

Mar
21
2018
--

FLUSH and LOCK Handling in Percona XtraDB Cluster

FLUSH and LOCK Handling

In this blog post, we’ll look at how Percona XtraDB Cluster (PXC) handles FLUSH and LOCK statements.

Introduction

Percona XtraDB Cluster is a multi-master solution that allows parallel execution of transactions on multiple nodes at the same point in time. Given these semantics, it is important to understand how Percona XtraDB Cluster executes FLUSH and LOCK statements (which operate at the node level).

The sections below list the different flavors of these statements and their PXC semantics.

FLUSH TABLES WITH READ LOCK
  • FTWRL is normally used for backup purposes.
  • Execution of this command establishes a global level read lock.
  • This read lock is non-preemptable by the background running applier thread.
  • PXC causes the node to move to DESYNC state (thereby blocking emission of flow-control) and also pauses the node.
2018-03-08T05:09:54.293991Z 0 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 1777)
2018-03-08T05:09:58.040809Z 5 [Note] WSREP: Provider paused at c7daf065-2285-11e8-a848-af3e3329ab8f:2002 (2047)
2018-03-08T05:14:20.508317Z 5 [Note] WSREP: resuming provider at 2047
2018-03-08T05:14:20.508350Z 5 [Note] WSREP: Provider resumed.
2018-03-08T05:14:20.508887Z 0 [Note] WSREP: Member 1.0 (n2) resyncs itself to group
2018-03-08T05:14:20.508900Z 0 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 29145)
2018-03-08T05:15:16.932759Z 0 [Note] WSREP: Member 1.0 (n2) synced with group.
2018-03-08T05:15:16.932782Z 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 29145)
2018-03-08T05:15:16.988029Z 2 [Note] WSREP: Synchronized with group, ready for connections
2018-03-08T05:15:16.988054Z 2 [Note] WSREP: Setting wsrep_ready to true

  • Other nodes of the cluster continue to process the workload.
  • The DESYNCed and paused node continues to see replication traffic. Though it doesn’t process the write-sets, they are appended to the Galera cache (gcache) for future processing.
  • Fallback: when FTWRL is released (through UNLOCK TABLES), and if the workload is active on other nodes of the cluster, the node that executed FTWRL may start emitting flow control while it covers the backlog. Check the details here.
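A minimal sketch of this sequence, run on the node you want to back up (the backup itself is whatever tool you use and is only indicated by a comment):

FLUSH TABLES WITH READ LOCK;  # node shifts to DONOR/DESYNCED and pauses the provider
# ... take the backup from this node ...
UNLOCK TABLES;                # node resumes, resyncs with the group and may emit flow control while it catches up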
FLUSH TABLE <tablename> (WITH READ LOCK|FOR EXPORT)
  • It is meant to take a read lock on the said table only (at the node level). This lock command is not replicated, so pxc_strict_mode = ENFORCING blocks execution of this command.
  • This read lock is non-preemptable by the background running applier thread.
  • Execution of this command will cause the node to pause.
  • If the node executing the flush command is the same as the node processing the workload, the node pauses immediately.
  • If the node executing the flush command is different from the node processing the workload, the write-sets are queued to the incoming queue and flow control causes the pause.
  • The end result is that the cluster stalls in both cases.
2018-03-07T06:40:00.143783Z 5 [Note] WSREP: Provider paused at 40de14ba-21be-11e8-8e3d-0ee226700bda:147682 (149032)
2018-03-07T06:40:00.144347Z 5 [Note] InnoDB: Sync to disk of `test`.`t` started.
2018-03-07T06:40:00.144365Z 5 [Note] InnoDB: Stopping purge
2018-03-07T06:40:00.144468Z 5 [Note] InnoDB: Writing table metadata to './test/t.cfg'
2018-03-07T06:40:00.144537Z 5 [Note] InnoDB: Table `test`.`t` flushed to disk
2018-03-07T06:40:01.855847Z 5 [Note] InnoDB: Deleting the meta-data file './test/t.cfg'
2018-03-07T06:40:01.855874Z 5 [Note] InnoDB: Resuming purge
2018-03-07T06:40:01.855955Z 5 [Note] WSREP: resuming provider at 149032
2018-03-07T06:40:01.855970Z 5 [Note] WSREP: Provider resumed.

  • Once the lock is released (through UNLOCK TABLES), node resumes apply of write-sets.
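A minimal sketch of the FOR EXPORT flavor. Because the statement is not replicated, pxc_strict_mode = ENFORCING blocks it, so this sketch assumes that temporarily relaxing the mode is acceptable in your environment:

USE test;
SET GLOBAL pxc_strict_mode = PERMISSIVE;  # ENFORCING would reject the non-replicated FLUSH below
FLUSH TABLES t FOR EXPORT;                # node pauses; copy t.ibd and t.cfg from the datadir now
UNLOCK TABLES;                            # node resumes applying write-sets
SET GLOBAL pxc_strict_mode = ENFORCING;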
LOCK TABLE <tablename> READ/WRITE
  • LOCK TABLE command is meant to lock the said table in the said mode.
  • Again, the lock established by this command is non-preemptable.
  • LOCK is taken at node level (command is not replicated) so pxc_strict_mode = ENFORCING blocks this command.
  • There is no state change in PXC on the execution of this command.
  • If the lock is taken on a table that is not being touched by the active workload, the workload can continue to progress. If the lock is taken on a table that is part of the workload, the transaction in the workload waits for the lock to be released, which in turn causes the complete workload to halt.
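For example (node-local, hypothetical table t, and blocked under pxc_strict_mode = ENFORCING):

LOCK TABLES t WRITE;  # local transactions touching t, and the applier thread writing to t, will wait
# ... do the work that needed the lock ...
UNLOCK TABLES;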
GET_LOCK
  • It is a named lock and follows the same semantics as LOCK TABLE for PXC. (The base MySQL semantics are slightly different; you can check them here.)
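A minimal named-lock example (the lock name is hypothetical):

SELECT GET_LOCK('maintenance_lock', 10);     # acquire the named lock, waiting up to 10 seconds
SELECT RELEASE_LOCK('maintenance_lock');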
LOCK TABLES FOR BACKUP
  • As the semantics go, this lock is specially meant for backups and blocks non-transactional changes (like updates to non-transactional engines such as MyISAM, and DDL changes).
  • PXC doesn’t have any special add-on semantics for this command.
LOCK BINLOG FOR BACKUP
  • This statement blocks writes to the binlog. PXC always generates a binlog (persisting it to disk is controlled by the log-bin setting). If you disable log-bin, then PXC enables emulation-based binlogging.
  • This effectively means this command can cause the cluster to stall.
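A typical backup-style sequence using these two statements (a sketch only; backup tools normally drive this for you):

LOCK TABLES FOR BACKUP;   # blocks non-transactional writes and DDL
LOCK BINLOG FOR BACKUP;   # blocks binlog writes; as noted above, this can stall the cluster
# ... copy the data ...
UNLOCK BINLOG;
UNLOCK TABLES;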

Tracking active lock/flush

  • If you have executed a flush or lock command and want to check for it, you can do so using the Com_% status counters. These counters are connection-specific, so execute these commands from the same client connection. Also, these counters are aggregate counters and increment-only.
mysql> show status like 'Com%lock%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Com_lock_tables            | 2     |
| Com_lock_tables_for_backup | 1     |
| Com_lock_binlog_for_backup | 1     |
| Com_unlock_binlog          | 1     |
| Com_unlock_tables          | 5     |
+----------------------------+-------+
5 rows in set (0.01 sec)
mysql> show status like '%flush%';
+--------------------------------------+---------+
| Variable_name                        | Value   |
+--------------------------------------+---------+
| Com_flush                            | 4       |
| Flush_commands                       | 3       |
+--------------------------------------+---------+
* Flush_commands is a global counter. Check MySQL documentation for more details.

Conclusion

By now, we can conclude that the user should be a bit more careful when executing local lock commands, and should understand their semantics and effects. Careful execution of these commands can help serve your purpose.

The post FLUSH and LOCK Handling in Percona XtraDB Cluster appeared first on Percona Database Performance Blog.

Mar
20
2018
--

Percona Blog Poll: What Percona Software Are You Using?

Percona Software

This blog post contains a poll that helps us find out what Percona software the open source database community is using.

Nearly 20 years ago, Netscape released the source code for its Netscape Communicator web browser. This marked one of the biggest moments in “open source” history. The formation of The Open Source Initiative happened shortly after that. Bruce Perens, one of the working group’s founders, adapted his Free Software Guidelines as the official Open Source Definition.

Since then, open source software has gone from being the exception in large projects and enterprises, to being a normal part of huge deployments and daily business activities. Open source software is used by some of the biggest online companies: Facebook, YouTube, Twitter, etc. Many of these companies depend on open source software as part of their business model.

Percona’s mission is to champion unbiased open source database solutions. As part of this mission, we provide open source software, completely free of charge and for reuse. We developed our Percona Server for MySQL and Percona Server for MongoDB solutions not only to be drop-in replacements for existing open source software, but also to often incorporate “enterprise” features from upstream.

We’ve also recognized a need for database clustering and backup solutions, and created Percona XtraDB Cluster and Percona XtraBackup to address those concerns.

Beyond database software, Percona has created management and monitoring tools like Percona Monitoring and Management that not only help DBAs with day-to-day tasks, but also use metrics to find out how best to configure, optimize and architect a database environment to best meet the needs of applications and websites.

What we’d like to know is which of our software products you are currently using in your database environment. Are you using just database software, just management and monitoring tools, or a combination of both? As Percona makes plans for the year, we’d like to know what the community is using, what they find helpful, and how we can best allocate our resources to address those needs. We are always looking for the best ways to invest in and grow the Percona software and tools people use.

Complete the survey below by selecting all the options that apply.

Note: There is a poll embedded within this post; please visit the site to participate in this post’s poll.

Thanks in advance for your responses – this helps us see which of our software is being deployed in the community.

The post Percona Blog Poll: What Percona Software Are You Using? appeared first on Percona Database Performance Blog.

Mar
20
2018
--

Webinar Thursday, March 22, 2018: Percona XtraDB Cluster 5.7 with ProxySQL for Your MySQL High Availability and Clustering Needs

MySQL high availability

Please join Percona’s Ramesh Sivaraman (QA Engineer) and Krunal Bauskar (Software Engineer, Percona XtraDB Cluster Lead) as they present Percona XtraDB Cluster 5.7 with ProxySQL for Your MySQL High Availability and Clustering Needs on Thursday, March 22, 2018 at 8:30 am PDT (UTC-7) / 11:30 am EDT (UTC-4).

Percona has developed Percona XtraDB Cluster (based on Galera Cluster) and integrated it with ProxySQL to address MySQL high availability and clustering. These two products working together provide a great out-of-the-box synchronous replication setup.

In this webinar, we’ll look at why this is a great solution, and what types of deployments you should consider using it in.

Register for the webinar now.

Krunal is Percona XtraDB Cluster lead at Percona. He is responsible for day-to-day Percona XtraDB Cluster development, what goes into Percona XtraDB Cluster, bug fixes, releases, etc. Before joining Percona, he worked as part of the InnoDB team at MySQL/Oracle. He authored most of the temporary table revamp work, undo log truncate, atomic truncate and a lot of other features. In the past, he was associated with Yahoo! Labs researching big data problems, and a database startup that is now part of Teradata. His interests mainly include data management at any scale, and he has been practicing it for more than a decade.

Ramesh joined the Percona QA Team in March 2014. Prior to joining Percona, he provided MySQL database support to various service- and product-based Internet companies. Ramesh’s professional interests include writing shell/Perl scripts to automate routine tasks, and new technology. Ramesh lives in Kerala, the southern part of India, close to his family.

The post Webinar Thursday, March 22, 2018: Percona XtraDB Cluster 5.7 with ProxySQL for Your MySQL High Availability and Clustering Needs appeared first on Percona Database Performance Blog.

Mar
19
2018
--

Percona XtraDB Cluster, MySQL Asynchronous Replication and log-slave-updates

Asynchronous Replication

Recently, I’ve been looking into issues with the interactions between MySQL asynchronous replication and Galera replication. In this blog post, I’d like to share what I’ve learned.

MySQL asynchronous replication and Galera replication interactions are complicated due to the number of factors involved (Galera replication vs. asynchronous replication, replication filters, and row-based vs. statement-based replication). So as a start, I’ll look at an issue that came up with setting up an asynchronous replication channel between two Percona XtraDB Cluster (PXC) clusters.

Here’s a view of the desired topology:

MySQL asynchronous replication

The Problem

We want to set up an asynchronous replication channel between two PXC clusters. We also set log-slave-updates on the async slave (PXC node 2a in the topology diagram).

This is an interesting configuration, and it results in unexpected behavior: whether data replicates to the second cluster depends on the node where the operation was performed. Let’s use CREATE TABLE as an example.

  • Run CREATE TABLE on Node 1a.  The table replicates to Node 1b, but not to the nodes in cluster 2.
  • Run CREATE TABLE on Node 1b. The table replicates to all nodes (both cluster 1 and cluster 2).

Some background information

Understanding the problem requires some knowledge of MySQL threads. However, as a simplification, I’ll group the threads into three groups:

  • Main MySQL Client Thread: This thread handles the requests for the client connection (here the client is an external entity).
  • Async Replication Threads: There are multiple threads in use here, some handle I/O, and some apply the updates, but we will view them as a single entity for simplicity.
  • Galera Threads: There are also multiple threads here, similar to the Async Replication Threads. (The name Galera here refers to the underlying replication technology used by PXC.)

For more information on MySQL threads, see
https://dev.mysql.com/doc/refman/5.7/en/mysql-threads.html

Why is the data not replicating?

In the first case (CREATE TABLE executed on Node 1a):

  • The table is replicated from Node 1a -> Node 1b via Galera replication.
  • The table is not replicated to cluster 2 because, on Node 1b (the async master), the change arrives via the Galera applier threads, so the async replication threads do not pick it up.

In the second case (CREATE TABLE executed on Node 1b):

  • The table is replicated from Node1b -> Node 1a via Galera replication.
  • The table is replicated from Node1b -> Node 2a via async replication. This differs from the first case because the statement is executed on the Main MySQL client thread.  The async replication threads pick up the changes and send them to Node 2a.
  • The table is replicated from Node 2a -> Node 2b via Galera replication because log-slave-updates has been enabled on Node2a.

That last part is the important bit. We can view the Galera replication threads as another set of asynchronous replication threads. So if data is coming in via async replication, it has to be made visible to Galera by log-slave-updates. This is true in the other direction also: log-slave-updates must be enabled for Galera to supply data to async replication.

This is very similar to chained replication
https://dev.mysql.com/doc/refman/5.7/en/replication-options-slave.html

The Solution

In this scenario, the answer is to set log-slave-updates on Node 1b (the async master) and on Node 2a (the async slave).

We set log-slave-updates on node 1b to allow the async threads to pick up the changes from the Galera threads.

We set log-slave-updates on node 2a to allow the Galera threads to pick up the changes from the async threads. Starting with PXC 5.7.17, calling START SLAVE on a PXC node will return an error unless log-slave-updates is enabled.

You must enable log-slave-updates on the node for data to be transferred between Galera and asynchronous replication.
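As a concrete illustration, here is a minimal sketch of the async slave side of this setup (run on Node 2a). It assumes log_slave_updates, log-bin and a unique server_id are already set in my.cnf on both Node 1b and Node 2a, and it uses hypothetical host and user names:

SHOW VARIABLES LIKE 'log_slave_updates';  # must be ON; from PXC 5.7.17, START SLAVE errors out otherwise
CHANGE MASTER TO
  MASTER_HOST = 'node1b.example.com',     # hypothetical async master (Node 1b)
  MASTER_USER = 'repl',                   # hypothetical replication user
  MASTER_PASSWORD = '********',
  MASTER_AUTO_POSITION = 1;               # assumes GTID-based replication; use file/position otherwise
START SLAVE;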

Recommendations/Best Practices

If you plan to use MySQL asynchronous replication with Percona XtraDB Cluster (either as async master or slave), we recommend that you enable log-slave-updates on all nodes within the cluster. This is (1) to ensure that any async replication connections to/from the cluster work correctly, and (2) to ensure that all the nodes within a cluster share the same configuration and behavior.

Recommended configuration diagram for the clusters:

MySQL asynchronous replication

The post Percona XtraDB Cluster, MySQL Asynchronous Replication and log-slave-updates appeared first on Percona Database Performance Blog.

Mar
12
2018
--

ProxySQL 1.4.6 and Updated proxysql-admin Tool Now in the Percona Repository


ProxySQL 1.4.6, released by ProxySQL, is now available for download in the Percona Repository along with an updated version of Percona’s proxysql-admin tool.

ProxySQL is a high-performance proxy, currently for MySQL and its forks (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. René Cannaò created ProxySQL for DBAs as a means of solving complex replication topology issues.

The ProxySQL 1.4.6 source and binary packages available at https://percona.com/downloads/proxysql include ProxySQL Admin – a tool developed by Percona to configure Percona XtraDB Cluster nodes into ProxySQL. Docker images for release 1.4.6 are available as well: https://hub.docker.com/r/percona/proxysql/. You can download the original ProxySQL from https://github.com/sysown/proxysql/releases.

This release fixes the following bugs in ProxySQL Admin:

Usability improvements:

  • #PSQLADM-32: The proxysql-admin script can now configure multiple clusters in ProxySQL, provided that each cluster has a unique name specified by the wsrep_cluster_name option, and that the proxysql-admin.cnf configuration contains a different ProxySQL READ/WRITE hostgroup and a different application user for each cluster. Currently, multiple-cluster support is not compatible with the host priority feature, which works only with a single cluster.
  • PR #81: The new version substantially increases the number of test cases in the ProxySQL Admin test-suite.

Bug fixes:

  • Fixed #PSQLADM-35: the proxysql_galera_checker monitoring script was unable to discover new writer nodes.
  • Fixed #PSQLADM-36: upgrade to ProxySQL 1.4.5 from the previous version was broken.
  • Fixed #79 by properly quoting the MONITOR_USERNAME environment variable in the admin script query.

ProxySQL is available under OpenSource license GPLv3.

Feb
15
2018
--

ProxySQL 1.4.5 and Updated proxysql-admin Tool Now in the Percona Repository

ProxySQL 1.4.5

ProxySQL 1.4.5, released by ProxySQL, is now available for download in the Percona Repository along with an updated version of Percona’s proxysql-admin tool.

ProxySQL is a high-performance proxy, currently for MySQL and its forks (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. René Cannaò created ProxySQL for DBAs as a means of solving complex replication topology issues.

The ProxySQL 1.4.5 source and binary packages available at https://percona.com/downloads/proxysql include ProxySQL Admin – a tool developed by Percona to configure Percona XtraDB Cluster nodes into ProxySQL. Docker images for release 1.4.5 are available as well: https://hub.docker.com/r/percona/proxysql/. You can download the original ProxySQL from https://github.com/sysown/proxysql/releases.

This release fixes the following bugs in ProxySQL Admin:

Usability improvements:

  • #PSQLADM-6: If the cluster node goes offline, the proxysql_node_monitor script now sets the node status as OFFLINE_HARD, and does not remove it from the ProxySQL database. Also, logging is consistent regardless of the cluster node online status.
  • #PSQLADM-30: Validation was added for the host priority file.
  • #PSQLADM-33: Added --proxysql-datadir option to run the proxysql-admin script with a custom ProxySQL data directory.
  • Also, a BATS test suite was added for proxysql-admin testing.

Bug fixes:

  • Fixed #PSQLADM-5: the PXC mode specified with the proxysql-admin --mode parameter was not persistent.
  • Fixed #PSQLADM-8: ProxySQL showed high CPU load when mysqld was hanging.

ProxySQL is available under OpenSource license GPLv3.

Feb
13
2018
--

Want IST Not SST for Node Rejoins? We Have a Solution!

IST Not SST for Node Rejoins

What if we told you that there is a sure way to get IST, not SST, for node rejoins? You can guarantee that a new node rejoins using IST. Sound interesting? Keep reading.

Normally when a node is taken out of the cluster for a short period of time (for maintenance or a shutdown), the gcache on the other nodes of the cluster helps donate the missing write-set(s) when the node rejoins. This approach works if you have configured a large enough gcache, or if the downtime is short enough. Neither option is ideal, especially for a production cluster. Also, keeping a larger gcache for the lifetime of the server means reserving more disk space, when the same job can be done with relatively less disk space.

Re-configuring the gcache on a potential DONOR node before downtime requires a node shutdown. (Dynamic resizing of the gcache is not possible, or rather not needed now.) Restoring it to its original size needs another shutdown. That adds up to three shutdowns for a single maintenance window: not acceptable for busy production clusters, and a recipe for more errors.

Introducing “gcache.freeze_purge_at_seqno”

Given this pain point, we are introducing gcache.freeze_purge_at_seqno in Percona XtraDB Cluster 5.7.20. It controls the purging of the gcache, thereby retaining more data to facilitate IST when the node rejoins.

All transactions in the Galera cluster world are assigned a unique global sequence number (seqno). Several status variables track progress using this seqno (wsrep_last_applied, wsrep_last_committed, wsrep_replicated, wsrep_local_cached_downto, etc.). wsrep_local_cached_downto represents the sequence number down to which the gcache has been purged. Say wsrep_local_cached_downto = N; then the gcache has data from [N, wsrep_replicated] and has purged data from [1, N).
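These are ordinary status variables, so you can check them (for example, on the node you plan to shut down and on the prospective DONOR) before choosing a value:

SHOW GLOBAL STATUS LIKE 'wsrep_last_applied';
SHOW GLOBAL STATUS LIKE 'wsrep_local_cached_downto';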

gcache.freeze_purge_at_seqno takes three values:

  1. -1 (default): no freeze, the purge operates as normal.
  2. x (should be a valid seqno in gcache): freeze purge of write-sets >= x. The best way to select x is to use the wsrep_last_applied value from the node that you plan to shut down as an indicator (for example, wsrep_last_applied * 0.9; retain this extra 10% to trick the safety gap heuristic algorithm of IST).
  3. now: freeze purge of write-sets >= smallest seqno currently in gcache. Instant freeze of gcache-purge. (If tracing x (above) is difficult, simply use “now” and you are good).

Set this on an existing node of the cluster (that will continue to be part of the cluster and can act as potential DONOR). This node continues to retain the write-sets, thereby allowing the restarting node to rejoin using IST. (You can feed the said node as a preferred DONOR through wsrep_sst_donor while restarting the said rejoining node.)

Remember to set it back to -1 once the node rejoins. This avoids hogging space on the DONOR beyond the said timeline. On the next purge cycle, all the old retained write-sets are freed as well (reclaiming the space back to original).

Note:

To find out the existing value of gcache.freeze_purge_at_seqno, query wsrep_provider_options:
select @@wsrep_provider_options;
To set gcache.freeze_purge_at_seqno:
set global wsrep_provider_options="gcache.freeze_purge_at_seqno = now";
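Putting it together, a minimal sketch of the whole workflow described in this post (node names are hypothetical):

# On a node that stays up and can act as DONOR (say node1):
SET GLOBAL wsrep_provider_options = "gcache.freeze_purge_at_seqno = now";
# Shut down node2, perform the maintenance, and restart it
# (optionally with wsrep_sst_donor=node1 so the retained gcache serves the IST).
# Once node2 has rejoined via IST, back on node1:
SET GLOBAL wsrep_provider_options = "gcache.freeze_purge_at_seqno = -1";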

Why should you use it?

  • The gcache grows dynamically (using the existing page store mechanism) and shrinks once the user sets the option back to -1. This means you only use disk space when needed.
  • No restart is needed. The user can concentrate on the node under maintenance only.
  • No complex math or understanding of seqno involved (simply use “now”).
  • Less prone to error, as SST is one of the major error-prone areas with the cluster.

So why wait? Give it a try! It is part of Percona XtraDB Cluster 5.7.20 onwards, and helps you get IST, not SST, for node rejoins.

Note: If you need more information about gcache, check here and here.
