Dec
19
2017
--

Webinar Q&A: Percona XtraDB Cluster 101

Percona XtraDB Cluster

In this blog, we will answer questions from our webinar on Percona XtraDB Cluster 101.

Recently (7 Dec 2017) I presented a webinar about Percona XtraDB Cluster 101. First, thanks to all the attendees: we had a great webinar with quite a few interesting questions and useful feedback.

Through this blog, I’ll answer most of the questions that were raised during the webinar.

Q. How does the need for the acknowledgment from other nodes affect the speed of writes?

A. There are two parts to replication: delivering a transaction (including acknowledgment) and applying the transaction. Generally, the first part is pretty quick and dictated by the network latency. The second part is time-consuming, but happens asynchronously. So acknowledging a transaction from other nodes is not that time-consuming.

Q. How can geo-distributed nodes affect the speed of writes?

A. The farthest (highest-latency) node dictates cluster performance in terms of latency. You can’t commit a write faster than the round-trip time to that node, so geo-distribution does affect write performance.

Q. Would you consider Master -> Slave replication in RDS a traditional replication of MySQL? And how easy is it to replicate from PXC to RDS, if it’s possible?

A. If an application doesn’t need high availability, then a user can explore MASTER-SLAVE replication. But I would argue that if I am going to spend time booting two servers (MASTER and SLAVE), then why not boot both as MASTER (through Percona XtraDB Cluster)? This ensures HA and write scalability. Percona XtraDB Cluster is flexible enough to fit all topologies, and can also act as an async master or async slave.

Q. Moving forward, is there a plan to deal with version control tools like Flyway that still uses Get locks?

A. Statements like GET_LOCK that establish locks local to the node where they are issued are not cluster-safe, so they are blocked with pxc_strict_mode=ENFORCING and are not recommended for use. That said, if the application uses these statements in a non-conflicting way (with the load directed to a single master), they could still work.
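As an illustration, here is a minimal sketch of how this behaves in a mysql session (the lock name is made up, and the exact error text varies by Percona XtraDB Cluster version):

mysql> SHOW VARIABLES LIKE 'pxc_strict_mode';
-- Expect: ENFORCING (the default in PXC 5.7)

mysql> SELECT GET_LOCK('app_migration_lock', 10);
-- With pxc_strict_mode=ENFORCING this statement is rejected with an error,
-- because the lock would only exist on the local node and is not cluster-safe.
-- With pxc_strict_mode=PERMISSIVE it is allowed, but a warning is logged.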

Q. With an ASYNC slave, can you use GTID? I know there is a bug(s) that prevent this currently from working 100% in MariaDB (though the bug is close to being fixed – MDEV-10715)?

A. Yes, you can use GTID with an async slave. MariaDB has a different implementation of GTID, so I am not in a position to comment on the latter part.

Q. Do tables need to have a primary key for the cluster to work?

A. Yes, all tables that you plan to use in the cluster should have a primary key (pxc_strict_mode=ENFORCING enforces this criterion). This is mainly needed for conflict resolution when the same conflicting workload is executed on multiple nodes.
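A minimal sketch of what this looks like in practice (assuming pxc_strict_mode=ENFORCING; the table and column names are made up for illustration):

mysql> CREATE TABLE orders_ok (id INT NOT NULL PRIMARY KEY, amount DECIMAL(10,2)) ENGINE=InnoDB;
mysql> INSERT INTO orders_ok VALUES (1, 9.99);   -- replicates normally

mysql> CREATE TABLE orders_nopk (amount DECIMAL(10,2)) ENGINE=InnoDB;
mysql> INSERT INTO orders_nopk VALUES (9.99);
-- With pxc_strict_mode=ENFORCING, DML on a table without an explicit primary key
-- is rejected with an error: without a primary key, row identity (and statements
-- like DELETE ... LIMIT) cannot be resolved consistently across nodes.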

Q. What is the best wsrep_sst_method? (for huge database)

A. Percona XtraDB Cluster recommends using XtraBackup. It doesn’t lock the tables for the complete SST lifetime, so you can continue to use the node while it is acting as DONOR.
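For reference, a quick way to check which SST method a node is configured to use (a sketch; the value is normally set in my.cnf under [mysqld] as wsrep_sst_method=xtrabackup-v2):

mysql> SHOW VARIABLES LIKE 'wsrep_sst_method';
-- Expect: xtrabackup-v2 for XtraBackup-based SST (the recommended method),
-- as opposed to rsync or mysqldump, which block the DONOR for much longer.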

Q. Is the “show processlist” node-specific? Is there an equivalent command to show the whole cluster process list?

A. Yes, show processlist is node-specific. There is currently no way to get a cluster-wide processlist.

Q. Does PXC support partitioned tables?

A. Yes, using InnoDB native partitioning.

Q. Are these (PXC) nodes API nodes or data nodes?

A. Data nodes.

Q. If the cluster went down, does a rejoining node follow SST/IST every time?

A. It depends. If there is a DONOR that still has the missing write-sets, then the node can rejoin through IST; otherwise it falls back to SST.

Q. Around how much time it will take to join the cluster?

A. The time a node takes to join back depends on the size of the data. Generally, the time for SST is longer than IST. The good part is with 5.7.17+ we have considerably reduced the time for IST, so that a node can join faster than before.

Q. How does IST (incremental state transfer) process affect cluster performance?

A. IST is asynchronous and doesn’t emit FLOW_CONTROL, so the cluster can continue to perform as normal. A small slice of DONOR bandwidth is used to send data to the JOINER, but it is not significant enough to affect overall cluster performance.

Q. How do you handle a situation when three simultaneous transactions try to insert auto_increment value?

A. Percona XtraDB Cluster has wsrep_auto_increment_control, which adjusts the auto-increment increment and offset on each node based on the number of nodes in the cluster. Please check this link for more details.
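A quick way to see this in action on a node (a sketch; the values shown are what you would typically see on the second node of a three-node cluster):

mysql> SHOW VARIABLES LIKE 'wsrep_auto_increment_control';   -- ON by default
mysql> SHOW VARIABLES LIKE 'auto_increment%';
-- auto_increment_increment = 3   (cluster size: every node steps by 3)
-- auto_increment_offset    = 2   (this node's position in the cluster)
-- Concurrent inserts therefore generate interleaved, non-conflicting values:
-- 1,4,7,... on node 1, 2,5,8,... on node 2, and 3,6,9,... on node 3.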

Q. Imagine that a table A has a trigger on insert that inserts data into another table B. And there are two concurrent transactions: TA inserts into table A (and the trigger makes an insert into B) and TB that inserts the same data directly into B. Will such a conflict – insert from TB and from trigger – be detected?

A. Yes. A transaction can touch multiple data objects, and when conflict resolution is done it checks all the objects the transaction plans to modify before certifying the transaction as safe to apply.

Q. How will PXC ensure data integrity with parallel processing?

A. Percona XtraDB Cluster has a conflict resolution protocol based on the FIRST COMMITTER WINS principle, which ensures that only the first transaction (from a group of conflicting transactions) commits to the cluster.

Q. I’ve created a three node cluster and replication is working. I’d like to copy our production data to the cluster since exporting and importing from MySQL takes a long time. Should I have waited to bootstrap the cluster until the data directory is transferred?

A. If you already have a cluster in place, then you are simply adding new tables to the cluster. You can start loading the tables, and they are immediately replicated to the other nodes of the cluster. An alternative is to start the first node of the cluster with the pre-loaded data, which then becomes the cluster state; other joining nodes copy it over through SST.

Q. Do we have an option to auto-spin-up the compute nodes in a cloud? Does PXC have that option, or do we manually need to spin up the instances and set up the replication?

A. You will have to manually configure it.

Q. Why does XtraBackup not work during bootstrapping, but works perfectly after bootstrapping? rsync works in both cases.

A. I’m not sure I fully understand the question, but XtraBackup works in all scenarios. If you are facing any issue, please log it on Launchpad.

Q. As per flow control, one node waits for the other node to be in sync. Won’t there be latency in writing the data?

A. A transaction originating on one node needs to be replicated to the other nodes of the cluster. This is the latency, and it is dictated by network latency. Flow control is mainly there to regulate the scenario where one node of the cluster falls way behind the others.

Q. Can we set up PXC using AWS EC2?

A. Yes.

Once again, thanks for taking time to attend the webinar. If you have more questions, then please post them to the Percona XtraDB Cluster forum here. Also, we have a lot of blogs about Percona XtraDB Cluster. Make sure you check them out here.

Dec
06
2017
--

Webinar Thursday, December 7, 2017: Percona XtraDB Cluster (PXC) 101

Percona XtraDB Cluster

Join Percona’s Software Engineer (PXC Lead), Krunal Bauskar as he presents Percona XtraDB Cluster 101 on Thursday, December 7, 2017, at 7:00 am PST / 10:00 am EST (UTC-8).

Tags: Percona XtraDB Cluster, MySQL, High Availability, Clustering

Experience Level: Beginner

Percona XtraDB Cluster (PXC) is a multi-master solution that offers virtually synchronous replication among cluster nodes. It is based on the Codership Galera replication library. In this session, we will explore some key features of Percona XtraDB Cluster that make it enterprise-ready, including some recently added 5.7-exclusive features.

This webinar is introductory and will cover the following topics:

  • ProxySQL load balancer
  • Multi-master replication
  • Synchronous replication
  • Data at rest encryption
  • Improved SST Security through simplified configuration
  • Easy-to-set-up encrypted communication between nodes
  • ProxySQL-assisted Percona XtraDB Cluster maintenance mode
  • Automatic node provisioning
  • Percona XtraDB Cluster “strict-mode”

Register for the webinar now.

Krunal Bauskar, C/C++ Engineer

Krunal joined Percona in September 2015. Before joining Percona, he worked as part of the InnoDB team at MySQL/Oracle, where he authored most of the temporary table revamp work, among many other features. In the past, he was associated with Yahoo! Labs, researching big data problems, and with a database startup that is now part of Teradata. His interests mainly include data management at any scale, which he has been practicing for more than a decade. He loves to spend time with his family or get involved in social work, unless he is out on a nearby exploration drive. He is based in Pune, India.

Nov
28
2017
--

Best Practices for Percona XtraDB Cluster on AWS

Percona XtraDB Cluster on AWS 2 small

In this blog post I’ll look at the performance of Percona XtraDB Cluster on AWS using different service instances, and recommend some best practices for maximizing performance.

You can use Percona XtraDB Cluster in AWS environments. We often get questions about how best to deploy it, and how to optimize both performance and spend when doing so. I decided to look into it with some benchmark testing.

For these benchmark tests, I used the following configuration:

  • Region: US East – 1 (N. Virginia), availability zones b, c and d
  • Sysbench 1.0.8
  • ProxySQL 1.4.3
  • 10 tables, 40 mln records – ~95GB dataset
  • Percona XtraDB Cluster 5.7.18
  • Amazon Linux AMI

We evaluated different AWS instances to provide the best recommendation for running Percona XtraDB Cluster. We used instances:

  • With General Purpose (GP2) storage volumes, 200GB each
  • With IO provisioned (IO1) volumes, 200GB, 10,000 IOPS
  • I3 instances with locally attached NVMe storage

We also used different instance sizes:

Instance      vCPU    Memory (GiB)
r4.large      2       15.25
r4.xlarge     4       30.5
r4.2xlarge    8       61
r4.4xlarge    16      122
i3.large      2       15.25
i3.xlarge     4       30.5
i3.2xlarge    8       61
i3.4xlarge    16      122

 

While I3 instances with NVMe storage do not provide the same functionality for handling shared storage and snapshots as General Purpose and IO provisioned volumes, since Percona XtraDB Cluster provides data duplication by itself we think it is still valid to include them in this comparison.

We ran benchmarks in the US East 1 (N. Virginia) Region, and we used different availability zones for each of the Percona XtraDB Cluster zones (mostly zones “b”, “c” and “d”):

Percona XtraDB Cluster on AWS 1

Clients connected both directly and through ProxySQL, so we were able to measure ProxySQL’s performance overhead as well.

ProxySQL is an advanced method to access Percona XtraDB Cluster. It can perform a health check of the nodes and route the traffic to the ONLINE node. It can also split read and write traffic and route read traffic to different nodes (although we didn’t use this capability in our benchmark).

In our benchmarks, we used 1, 4, 16, 64 and 256 user threads. For this detailed review, however, we’ll look at the 64-thread case.

Results

First, let’s review the average throughput (higher is better) and latency (lower is better) results (we measured the 99th percentile with one-second resolution):

Percona XtraDB Cluster on AWS 2

Results summary, raw performance:

The performance of Percona XtraDB Cluster running on GP2 volumes is often pretty slow, so it is hard to recommend this volume type for serious workloads.

IO provisioned volumes perform much better, and should be considered as the primary target for Percona XtraDB Cluster deployments. I3 instances show even better performance.

I3 instances use locally attached volumes and do not provide equal functionality as EBS IO provisioned volumes — although some of these limitations are covered by Percona XtraDB Cluster’s ability to keep copies of data on each node.

Results summary for jitter:

Along with average throughput and latency, it is important to take into account “jitter” — how stable is the performance during the runs?

Percona XtraDB Cluster on AWS 3

Latency variation for GP2 volumes is significant — practically not acceptable for serious usage. Let’s review the latency for only IO provisioning and NVMe volumes. The following chart provides better scale for just these two:

Percona XtraDB Cluster on AWS 4

At this scale, we see that NVMe provides a better 99th percentile response time and is more stable. There is still variation for IO provisioned volumes.

Results summary, cost

When speaking about instance and volume types, it would be impractical not to mention instance costs. We need to analyze how much we have to pay to achieve better performance, so we prepared data showing how much it costs to produce a throughput of 1,000 transactions per second.

We compare on-demand and reserved instances pricing (reserved for one year / all upfront / tenancy-default):

Percona XtraDB Cluster on AWS 5

Because IO provisioned instances give much better performance, their price/performance is comparable to, if not better than, GP2 instances.

I3 instances are a clear winner.

It is also interesting to compare the raw cost of benchmarked instances:

Percona XtraDB Cluster on AWS 6

We can see that IO provisioned instances are the most expensive, and using reserved instances does not provide much savings. To understand the reason for this, let’s take a look at how cost is calculated for components:

Percona XtraDB Cluster on AWS 7

So for IO provisioned volumes, the majority of the cost comes from IO provisioning (which is the same for both on-demand and reserved instances).

Percona XtraDB Cluster scalability

Another interesting exercise is looking at how Percona XtraDB Cluster performance scales with the instance size. As we double resources (both CPU and memory) by increasing the instance size, how does it affect Percona XtraDB Cluster?

So let’s take a look at throughput:

Percona XtraDB Cluster on AWS 8

Throughput improves as the instance size increases. Let’s calculate the speedup relative to the “large” instance for IO provisioned (IO1) and I3 instances:

Instance    IO1 speedup (x large)    i3 speedup (x large)
large       1                        1
xlarge      2.67                     2.11
2xlarge     5.38                     4.31
4xlarge     5.96                     7.83

 

Percona XtraDB Cluster can scale (improve performance) with increasing instance size. Keep in mind, however, that it depends significantly on the workload. You may not get the same performance speedup as in this benchmark.

ProxySQL overhead

As mentioned above, ProxySQL adds additional functionality to the cluster. It can also add overhead, however. We would like to understand the expected performance penalty, so we compared the throughput and latency with and without ProxySQL.

Out of the box, ProxySQL performance was not great and required additional tuning.

ProxySQL specific configuration:

  • Use a connection through a TCP/IP address, not through the local socket
  • Adjust the mysql-max_stmts_per_connection variable to an optimal value (default: 50; we used 1000; see the sketch after this list)
  • Ensure that the "monitor@<host>" user exists and has the needed permissions, as this is important for proper handling of prepared statements:
    • CREATE USER 'monitor'@'172.30.%.%' IDENTIFIED BY 'monitor';
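A sketch of how these adjustments can be applied through the ProxySQL admin interface (assuming the default admin port 6032; adjust the host, credentials, monitor subnet and grants to your environment):

-- On the ProxySQL admin interface (e.g. mysql -u admin -padmin -h 127.0.0.1 -P 6032):
UPDATE global_variables SET variable_value = '1000'
 WHERE variable_name = 'mysql-max_stmts_per_connection';
LOAD MYSQL VARIABLES TO RUNTIME;   -- apply without restarting ProxySQL
SAVE MYSQL VARIABLES TO DISK;      -- persist across restarts

-- On the Percona XtraDB Cluster nodes, make sure the monitor user exists:
CREATE USER 'monitor'@'172.30.%.%' IDENTIFIED BY 'monitor';
GRANT USAGE ON *.* TO 'monitor'@'172.30.%.%';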

Throughput:

Percona XtraDB Cluster on AWS 9

Response time:

Percona XtraDB Cluster on AWS 10

ProxySQL performance penalty in throughput

Instance    IO1      i3
large       0.97     0.98
xlarge      1.03     0.97
2xlarge     0.95     0.95
4xlarge     0.96     0.93

 

It appears that ProxySQL adds 3-7% overhead. I wouldn’t consider this a significant penalty for additional functionality.

Summary

Amazon instances

First, the results show that instances based on General Purpose volumes do not provide acceptable performance and should be avoided in general for serious production usage. The choice is between IO provisioned instances and NVMe based instances.

IO provisioned instances are more expensive, but offer much better performance than General Purpose volumes. If we also look at the price/performance metric, IO provisioned volumes are comparable with General Purpose volumes. You should use IO provisioned volumes if you need the functionality provided by EBS volumes.

If you do not need EBS volumes, however, then i3 instances with NVMe volumes are a better choice: they are cheaper and provide better performance than IO provisioned instances. Percona XtraDB Cluster provides data duplication on its own, which mitigates the need for EBS volumes to some extent.

ProxySQL overhead

We recommend using Percona XtraDB Cluster in combination with ProxySQL, as ProxySQL provides additional management and routing functionality. In general, the overhead for ProxySQL is not significant, but in our experience ProxySQL has to be properly tuned, otherwise the performance penalty can become a bottleneck.

Percona XtraDB Cluster scalability

AWS has great capability to increase the instance size (both CPU and memory) if we exceed the capacity of the current instance. From our experiments, we see that Percona XtraDB Cluster can scale along with and benefit from increased instance size.

Below is a chart showing the speedup in relation to the instance size:

Percona XtraDB Cluster on AWS 11

So increasing the instance size is a feasible strategy for improving Percona XtraDB Cluster performance in an AWS environment.

Thanks for reading this benchmark! Put any questions or thoughts in the comments below.

Nov
15
2017
--

Understanding how an IST donor is selected

IST donor cluster

In a clustering environment, we often see a node that needs to be taken down for maintenance. For the node to rejoin, it should re-sync with the cluster state. In PXC (Percona XtraDB Cluster), there are two ways for the rejoining node to re-sync: State Snapshot Transfer (SST) and Incremental State Transfer (IST). SST involves a full data transfer (which can be time consuming). IST is an incremental data transfer whereby only the missing write-sets are donated by a DONOR to the rejoining node (a.k.a. the JOINER).

In this article I will try to show how a DONOR for the IST process is selected.

Selecting an IST DONOR

First, a word about gcache. Each node retains some write-sets in a cache known as gcache. Once this gcache is full, it is purged to make room for new write-sets. Based on the gcache configuration, each node may retain a different span of write-sets. The wider the span, the greater the probability of the node acting as a prospective DONOR. The lowest seqno in gcache can be queried with show status like 'wsrep_local_cached_downto'.
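For example, you can compare these two status variables on each running node to see what span of write-sets it could donate (a sketch; the example values match the working example below):

mysql> SHOW STATUS LIKE 'wsrep_last_committed';       -- e.g. 1100: current cluster state seen by this node
mysql> SHOW STATUS LIKE 'wsrep_local_cached_downto';  -- e.g. 90: lowest seqno still kept in gcache
-- This node could serve IST for any joiner that needs write-sets from seqno 90 onward.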

Let’s understand the IST DONOR algorithm with a topology and working example:

  • Say we have a 3-node cluster: N1, N2, N3.
  • To start with, all 3 nodes are in sync (wsrep_last_committed is the same for all 3 nodes, let’s say 100).
  • N3 is scheduled for maintenance and is taken down.
  • In the meantime, N1 and N2 process the workload, moving wsrep_last_committed from 100 -> 1100.
  • N1 and N2 also purge their gcache. Let’s say wsrep_local_cached_downto for N1 and N2 is 110 and 90 respectively.
  • Now N3 is restarted and discovers that the cluster has made progress from 100 -> 1100, so it needs the write-sets from (101, 1100).
  • It starts looking for a prospective DONOR.
    • N1 can service data from (110, 1100) but the request is for (101, 1100), so N1 can’t act as DONOR.
    • N2 can service data from (90, 1100) and the request is for (101, 1100), so N2 can act as DONOR.

Safety gap and how it affects DONOR selection

So far so good. But can N2 reliably act as DONOR? While N3 is evaluating the prospective DONOR, what if N2 purges more data and wsrep_local_cached_downto on N2 becomes 105? To accommodate this, the algorithm adds a safety gap to the evaluation.

safety gap = (current state of the cluster – lowest available seqno from any of the existing nodes of the cluster) * 0.008

So the N2 range is considered to be (90 + (1100 – 90) * 0.008, 1100) = (98, 1100).

Can N2 now act as DONOR? Yes: its adjusted range (98, 1100) covers the requested range (101, 1100), since 98 < 101.

What if N2 had purged up to 95 and then N3 started looking for prospective DONOR?

In this case the N2 range would be (95 + (1100 – 95) * 0.008, 1100) = (103, 1100), which rules N2 out of the prospective DONOR list (103 > 101).

Twist at the end

Considering the latter case above (N2 purged up to 95), we established that N2 can’t act as the IST DONOR, and the only way for N3 to join is through SST.

What if I say that N3 still joins back using IST? CONFUSED?

Once N3 falls back from IST to SST, it will select an SST donor. This selection is done sequentially and nominates N1 as the first choice. N1 doesn’t have the required write-sets, so SST is forced.

But what if I configure wsrep_sst_donor=N2 on N3? This will cause N2 to be selected instead of N1. But wait: N2 doesn’t qualify either, as with the safety gap its range is (103, 1100).

That’s true. But the joiner sends a combined IST + SST request, so even though N3 ruled out N2 as the IST DONOR, the request is still sent to N2 for one last try. If N2 can service it using IST, it is allowed to do so; otherwise it falls back to SST.

Interesting! This is a well-thought-out algorithm from Codership: I applaud them for this and the many other important control functions that go on backstage in a Galera cluster.

Oct
30
2017
--

Percona XtraDB Cluster 5.7.19-29.22-3 is now available

Percona XtraDB Cluster 5.7

Percona announces the release of Percona XtraDB Cluster 5.7.19-29.22-3 on October 27, 2017. Binaries are available from the downloads section or our software repositories.

NOTE: You can also run Docker containers from the images in the Docker Hub repository.

Percona XtraDB Cluster 5.7.19-29.22-3 is now the current release, based on the following:

All Percona software is open-source and free.

Fixed Bugs

  • Added access checks for DDL commands to make sure they do not get replicated if they failed without proper permissions. Previously, when a user tried to perform certain DDL actions that failed locally due to lack of privileges, the command could still be replicated to other nodes, because access checks were performed after replication. This vulnerability is identified as CVE-2017-15365.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Oct
30
2017
--

Percona XtraDB Cluster 5.6.37-26.21-3 is Now Available

Percona XtraDB Cluster 5.7

Percona announces the release of Percona XtraDB Cluster 5.6.37-26.21-3 on October 27, 2017. Binaries are available from the downloads section or our software repositories.

Percona XtraDB Cluster 5.6.37-26.21-3 is now the current release, based on the following:

All Percona software is open-source and free.

Fixed Bugs

  • Added access checks for DDL commands to make sure they do not get replicated if they failed without proper permissions. Previously, when a user tried to perform certain DDL actions that failed locally due to lack of privileges, the command could still be replicated to other nodes, because access checks were performed after replication. This vulnerability is identified as CVE-2017-15365.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Sep
25
2017
--

Percona Live Europe: Tutorials Day

Percona Live Tutorials

Welcome to the first day of the Percona Live Open Source Database Conference Europe 2017: Tutorials day! Technically the first day of the conference, this day focused on providing hands-on tutorials for people interested in learning directly how to use open source tools and technologies.

Today attendees went to training sessions taught by open source database experts and got first-hand experience configuring, working with, and experimenting with various open source technologies and software.

The first full day (which includes opening keynote speakers and breakout sessions) starts Tuesday 9/26 at 9:15 am.

Some of the tutorial topics covered today were:

Monitoring MySQL Performance with Percona Monitoring and Management (PMM)

Michael Coburn, Percona

This was a hands-on tutorial covering how to set up monitoring for MySQL database servers using the Percona Monitoring and Management (PMM) platform. PMM is an open-source collection of tools for managing and monitoring MySQL and MongoDB performance. It provides thorough time-based analysis for database servers to ensure that they work as efficiently as possible.

We learned about:

  • The best practices on MySQL monitoring
  • Metrics and time series
  • Data collection, management and visualization tools
  • Monitoring deployment
  • How to use graphs to spot performance issues
  • Query analytics
  • Alerts
  • Trending and capacity planning
  • How to monitor HA

Hands-on ProxySQL

Rene Cannao, ProxySQL

ProxySQL is an open source proxy for MySQL that can provide HA and high performance with no changes in the application, using several built-in features and integration with clustering software. Those were only a few of the features we learned about in this hands-on tutorial.

MongoDB: Sharded Cluster Tutorial

Jason Terpko, ObjectRocket
Antonios Giannopoulos, ObjectRocket

This tutorial guided us through the many considerations when deploying a sharded cluster. It covered the services that make up a sharded cluster, configuration recommendations for these services, shard key selection, use cases, and how data is managed within a sharded cluster. Maintaining a sharded cluster also has its challenges. We reviewed these challenges and how you can prevent them with proper design or ways to resolve them if they exist today.

InnoDB Architecture and Performance Optimization

Peter Zaitsev, Percona

InnoDB is the most commonly used storage engine for MySQL and Percona Server for MySQL. It is the focus of most of the storage engine development by the MySQL and Percona Server for MySQL development teams.

In this tutorial, we looked at the InnoDB architecture, including new feature developments for InnoDB in MySQL 5.7 and Percona Server for MySQL 5.7. Peter explained how to use InnoDB in a database environment to get the best application performance and provided specific advice on server configuration, schema design, application architecture and hardware choices.

Peter updated this tutorial from previous versions to cover new MySQL 5.7 and Percona Server for MySQL 5.7 InnoDB features.

Join us tomorrow for the first full day of the Percona Live Open Source Database Conference Europe 2017!

Sep
22
2017
--

Percona XtraDB Cluster 5.7.19-29.22 is now available

Percona XtraDB Cluster 5.7

Percona announces the release of Percona XtraDB Cluster 5.7.19-29.22 on September 22, 2017. Binaries are available from the downloads section or our software repositories.

NOTE: You can also run Docker containers from the images in the Docker Hub repository.

Percona XtraDB Cluster 5.7.19-29.22 is now the current release, based on the following:

All Percona software is open-source and free.

Upgrade Instructions

After you upgrade each node to Percona XtraDB Cluster 5.7.19-29.22, run the following command on one of the nodes:

$ mysql -uroot -p < /usr/share/mysql/pxc_cluster_view.sql

Then restart all nodes, one at a time:

$ sudo service mysql restart

New Features

  • Introduced the pxc_cluster_view table to get a unified view of the cluster. This table is exposed through the performance schema.

    mysql> select * from pxc_cluster_view;
    -----------------------------------------------------------------------------
    HOST_NAME  UUID                                  STATUS  LOCAL_INDEX  SEGMENT
    -----------------------------------------------------------------------------
    n1         b25bfd59-93ad-11e7-99c7-7b26c63037a2  DONOR   0            0
    n2         be7eae92-93ad-11e7-88d8-92f8234d6ce2  JOINER  1            0
    -----------------------------------------------------------------------------
    2 rows in set (0.01 sec)
  • PXC-803: Added support for new features in Percona XtraBackup 2.4.7:

    • wsrep_debug enables debug logging
    • encrypt_threads specifies the number of threads that XtraBackup should use for encrypting data (when encrypt=1). This value is passed using the --encrypt-threads option in XtraBackup.
    • backup_threads specifies the number of threads that XtraBackup should use to create backups. See the --parallel option in XtraBackup.

Improvements

  • PXC-835: Limited wsrep_node_name to 64 bytes.
  • PXC-846: Improved logging to report reason of IST failure.
  • PXC-851: Added version compatibility check during SST with XtraBackup:
    • If a donor is 5.6 and a joiner is 5.7: A warning is printed to perform mysql_upgrade.
    • If a donor is 5.7 and a joiner is 5.6: An error is printed and SST is rejected.

Fixed Bugs

  • PXC-825: Fixed script for SST with XtraBackup (wsrep_sst_xtrabackup-v2) to include the --defaults-group-suffix when logging to syslog. For more information, see #1559498.
  • PXC-826: Fixed multi-source replication to PXC node slave. For more information, see #1676464.
  • PXC-827: Fixed handling of different binlog names between donor and joiner nodes when GTID is enabled. For more information, see #1690398.
  • PXC-830: Rejected the RESET MASTER operation when wsrep provider is enabled and gtid_mode is set to ON. For more information, see #1249284.
  • PXC-833: Fixed connection failure handling during SST by making the donor retry connection to joiner every second for a maximum of 30 retries. For more information, see #1696273.
  • PXC-839: Fixed GTID inconsistency when setting gtid_next.
  • PXC-840: Fixed typo in alias for systemd configuration.
  • PXC-841: Added check to avoid replication of DDL if sql_log_bin is disabled. For more information, see #1706820.
  • PXC-842: Fixed deadlocks during Load Data Infile (LDI) with log-bin disabled by ensuring that a new transaction (of 10 000 rows) starts only after the previous one is committed by both wsrep and InnoDB. For more information, see #1706514.
  • PXC-843: Fixed situation where the joiner hangs after SST has failed by dropping all transactions in the receive queue. For more information, see #1707633.
  • PXC-853: Fixed cluster recovery by enabling wsrep_ready whenever nodes become PRIMARY.
  • PXC-862: Fixed script for SST with XtraBackup (wsrep_sst_xtrabackup-v2) to use the ssl-dhparams value from the configuration file.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Sep
20
2017
--

Percona XtraDB Cluster 5.6.37-26.21 is Now Available

Percona XtraDB Cluster 5.7

Percona announces the release of Percona XtraDB Cluster 5.6.37-26.21 on September 20, 2017. Binaries are available from the downloads section or our software repositories.

Percona XtraDB Cluster 5.6.37-26.21 is now the current release, based on the following:

All Percona software is open-source and free.

Improvements

  • PXC-851: Added version compatibility check during SST with XtraBackup:
    • If donor is 5.6 and joiner is 5.7: A warning is printed to perform mysql_upgrade.
    • If donor is 5.7 and joiner is 5.6: An error is printed and SST is rejected.

Fixed Bugs

  • PXC-825: Fixed script for SST with XtraBackup (wsrep_sst_xtrabackup-v2) to include the --defaults-group-suffix when logging to syslog. For more information, see #1559498.
  • PXC-827: Fixed handling of different binlog names between donor and joiner nodes when GTID is enabled. For more information, see #1690398.
  • PXC-830: Rejected the RESET MASTER operation when wsrep provider is enabled and gtid_mode is set to ON. For more information, see #1249284.
  • PXC-833: Fixed connection failure handling during SST by making the donor retry connection to joiner every second for a maximum of 30 retries. For more information, see #1696273.
  • PXC-841: Added check to avoid replication of DDL if sql_log_bin is disabled. For more information, see #1706820.
  • PXC-853: Fixed cluster recovery by enabling wsrep_ready whenever nodes become PRIMARY.
  • PXC-862: Fixed script for SST with XtraBackup (wsrep_sst_xtrabackup-v2) to use the ssl-dhparams value from the configuration file.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Sep
15
2017
--

The MySQL High Availability Landscape in 2017 (the Babies)

MySQL High Availability

This post is the third of a series focusing on the MySQL high availability solutions available in 2017.

The first post looked at the elders, the technologies that have been around for more than ten years. The second post talked about the adults, the more recent and mature technologies. In this post, we will look at the emerging MySQL high availability solutions. The “baby” MySQL high availability solutions I chose for the blog are group replication, proxies and distributed storage.

Group replication

Group replication is the Oracle response to Galera. The term “InnoDB cluster” means a cluster using group replication. The goal is to offer similar functionality, especially the almost-synchronous replication.

At first glance, the group replication implementation appears to be rather elegant. The basis is the GTID replication mode. The nodes of an InnoDB cluster share a single UUID sequence. To control the replication lag, Oracle added a flow control layer. While Galera requires unanimity, group replication only requires a majority. The majority protocol in use is derived from Paxos. A majority protocol makes the cluster more resilient to a slow node.

Like Galera, when you add flow control you need queues. Group replication has two queues: one for the certification process and one for the appliers. What is interesting in the Oracle approach is the presence of a throttling mechanism. When flow control is requested by a node, instead of halting the processing of new transactions like Galera does, group replication throttles the rate of transactions. That can help meet strict timing SLAs.

Because the group replication logic is fairly similar to Galera’s, it suffers from the same limitations: large transactions, latency and hot rows. Group replication is recent. The first GA version is 5.7.17, from December 2016. It is natural, then, that it has a number of sharp edges. I won’t expand too much here, but if you are interested, read here and here. I am confident that over time group replication will get more polished. Some automation, like the Galera SST process, would also be welcome.

Given that the technology is recent, I know of no Percona customers using group replication in production.

Proxies

Intelligent proxies can be viewed as another type of upcoming MySQL high availability solution. It is not strictly MySQL. In fact, this solution is more of a mix of other solutions.

The principle is simple: you connect to a proxy, and the proxy directs you to a valid MySQL server. The proxy has to monitor the states of the back-end servers, and maybe even perform actions on them. Of course, the proxy layer must not become a single point of failure. There should be more than one proxy host for basic HA. If more than one proxy is used at the same time, they’ll have to agree on the state of the back-end servers. For example, on a cluster using MySQL async replication, if the proxies are not sending the write traffic to the same host, things will quickly become messy.

There are a few ways of achieving this. The simplest solution is an active-passive setup where only one proxy is active at a given time. You’ll need some kind of logic to determine whether the proxy host is available or not. Typical choices use tools like keepalived or Pacemaker.

A second option is to have the proxies agree on a deterministic way of identifying the writer node. For example, with a Galera-based cluster, the healthy back-end node with the lowest wsrep_local_index could be the writer node; see the sketch below.
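For example, a health-check script behind each proxy could run something like the following against every back-end node and elect as writer the synced node reporting the lowest index (a sketch of the idea, not any specific proxy’s built-in check):

mysql> SHOW STATUS LIKE 'wsrep_local_state_comment';  -- must be 'Synced' to be a candidate
mysql> SHOW STATUS LIKE 'wsrep_local_index';          -- node index; the membership ordering is consistent cluster-wide
-- All proxies that apply the same rule ("lowest wsrep_local_index among Synced nodes")
-- will independently pick the same writer, without needing to talk to each other.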

Finally, the proxies could talk to each other and coordinate. Such an approach is promising. It could allow a single proxy to perform the monitoring and inform its peers of the results. It would allow also coordinated actions on the cluster when a failure is detected.

Currently, there are a few options in terms of proxies:

  • ProxySQL: An open-source proxy that understands the MySQL protocol and can do R/W splitting, query caching, sharding, SQL firewalling, etc. A new alpha-level feature, mirroring, targets the inter-proxy communication need.
  • MaxScale: No longer fully open-source (BSL), but understands the MySQL protocol. Can do R/W splitting, sharding, binlog serving, SQL firewalling, etc.
  • MySQL Router: MySQL Router is an open-source proxy developed by Oracle for InnoDB Cluster (Group replication). It understands the MySQL protocol and also supports the new X protocol. It can do R/W splitting.
  • HAProxy: HAProxy is a popular open-source TCP-level proxy. It doesn’t understand the MySQL protocol. It needs helper scripts, responding to HTTP-type requests, to determine a node’s health.

To these open source proxies, there are two well-known commercial proxy-like solutions, Tungsten and ScaleArc. Both of these technologies are mature and are not “babies” in terms of age and traction. On top of these, there are also numerous hardware-based load balancer solutions.

The importance of proxies in MySQL high availability has led Percona to include ProxySQL in the latest releases of Percona XtraDB Cluster. In collaboration with the ProxySQL maintainer, René Cannaò, features have been added to make ProxySQL aware of the Percona XtraDB Cluster state.

Proxies are already often deployed in MySQL high availability solutions. Often proxies are only doing load balancing type work. We start to see deployment using proxies for more advanced things, like read/write splitting and sharding.

Distributed storage

Replication setup using distributed storage

 

This MySQL high availability solution is a project I am interested in. It is fair to say it is more a “fetus” than a real “baby,” since I know nobody using it in production. You can see this solution as a shared storage approach on steroids.

The simplest solution requires a three-node Ceph cluster. The nodes also run MySQL and the datadir is a Ceph RBD block device. Data in Ceph is automatically replicated to multiple hosts. This built-in data replication is an important component of the solution. Also, Ceph RBD supports snapshots and clones. A clone is a copy of the whole data set that consumes only the data that changed (delta) in terms of storage. Our three MySQL servers will thus not use three full copies of the dataset, but only one full copy and two deltas. As time passes, the deltas grow. When they are too large, we can simply generate new snapshots and clones and be back to day one. The generation of a new snapshot and clone takes a few seconds, and doesn’t require stopping MySQL.

The obvious use case for the distributed storage approach is a read-intensive workload on a very large dataset. The setup can handle a lot of writes. The higher the write load, the more frequently there will be a snapshot refresh. Keep in mind that refreshing a snapshot of a 10 TB data set takes barely more time than for a 1 GB data set.

For that purpose, I wrote an SST script for Percona XtraDB Cluster that works with Ceph. I blogged about it here. I also wrote a Ceph snapshot/clone backup script that can provision a slave from a master snapshot. I’ll blog about how to use this Ceph backup script in the near future.

Going further with distributed storage, multiple MySQL instances could use the same data pages. Ceph would be used as a distributed object store for InnoDB pages. This would allow building an open-source Aurora-like database. Coupled with Galera or group replication, you could have a highly-available MySQL cluster sharing a single copy of the dataset.

I started to modify MySQL, actually Percona Server for MySQL 5.7, to add support for Ceph/Rados. Rados is the object store protocol of Ceph. There is still a lot of effort needed to make it work. My primary job is not development, so progress is slow. My work can be found (here). The source compiles well but MySQL doesn’t fully start. I need to debug where things are going wrong.

Adding a feature to MySQL like that is an awesome way to learn the internals of MySQL. I would really appreciate any help if you are interested in this project.

Conclusion

Over the three articles in this series, we have covered the 2017 landscape of MySQL high availability solutions. The first focused on the old timers, “the elders”: replication, shared storage and NDB. The second article dealt with the solutions that are more recent and have good traction: Galera and RDS Aurora. The conclusion of the series is the current article, which looked at what could possibly be coming in terms of MySQL high availability solutions.

The main goal of this series is to help with planning the deployment of MySQL in a highly-available way. I hope it can be used for hints and pointers to get better and more efficient solutions.
