Oct
09
2020
--

Amazon Aurora Multi-Primary First Impression

Amazon Aurora Multi-Primary First Impression

Amazon Aurora Multi-Primary First ImpressionFor what reason should I use a real multi-primary setup?

To be clear, not a multi-writer solution where any node can become the active writer in case of needs, as for Percona XtraDB Cluster (PXC) or Percona Server for MySQL using Group_replication. No, we are talking about a multi-primary setup where I can write at the same time on multiple nodes. I want to insist on this “why?”.

After having excluded the possible solutions mentioned above, both covering the famous 99.995% availability, which is 26.30 minutes of downtime in a year, what is left?

Disaster Recovery? Well, that is something I would love to have, but to be a real DR solution we need to put several kilometers (miles for imperial) in the middle.

And we know (see here and here) that aside from some misleading advertising, we cannot have a tightly coupled cluster solution across geographical regions.

So, what is left? I may need more HA, ok, that is a valid reason. Or I may need to scale the number of writes, which is a valid reason as well. This means, in the end, that I am looking to a multi-primary because:

  • Scale writes (more nodes more writes)
    • Consistent reads (what I write on A must be visible on B)
  • Gives me 0 (zero) downtime, or close to that (5 nines is a maximum downtime of 864 milliseconds per day!!)
  • Allow me to shift the writer pointer at any time from A to B and vice versa, consistently.

Now, keeping myself bound to the MySQL ecosystem, my natural choice would be MySQL NDB cluster. But my (virtual) boss was at AWS re-invent and someone mentioned to him that Aurora Multi-Primary does what I was looking for. This (long) article is my voyage in discovering if that is true or … not.

Given I am focused on the behavior first, and NOT interested in absolute numbers to shock the audience with millions of QPS, I will use low-level Aurora instances. And will perform tests from two EC2 in the same VPC/region of the nodes.

You can find the details about the tests on GitHub here.

Finally, I will test:

  • Connection speed
  • Stale read
  • Write single node for baseline
  • Write on both node:
    • Scaling splitting the load by schema
    • Scaling same schema

Test Results

Let us start to have some real fun. The first test is …

Connection Speed

The purpose of this test is to evaluate the time taken in opening a new connection and time taken to close it. The action of the open/close connection can be a very expensive operation, especially if applications do not use a connection pool mechanism.

Amazon Aurora Multi-Primary


As we can see, ProxySQL results to be the most efficient way to deal with opening connections, which was expected given the way it is designed to reuse open connections towards the backend.


Different is the close connection operation, in which ProxySQL seems to take a little bit longer.

As a global observation, we can say that by using ProxySQL we have more consistent behavior. Of course, this test is a simplistic one, and we are not checking the scalability (from 1 to N connections) but it is good enough to give us the initial feeling. Specific connection tests will be the focus of the next blog on Aurora MM.

Stale Reads

Aurora multi-primary uses the same mechanism of the default Aurora to update the buffer pool:


Using the Page Cache update, just doing both ways. This means that the Buffer Pool of Node2 is updated with the modification performed in Node1 and vice versa.

To verify if an application would be really able to have consistent reads, I have run this test. This test is meant to measure if, and how many, stale reads we will have when writing on a node and reading from the other.

Amazon Aurora multi-primary has two consistency models:

Aurora consistency model
As an interesting fact, the result was that with the default consistency model (INSTANCE_RAW), we got a 100% stale read.
Given that I focused on identifying the level of the cost that exists when using the other consistency model (REGIONAL_RAW), that allows an application to have consistent reads.

The results indicate an increase of 44% in total execution time, and of 95% (22 times slower) in write execution.

It is interesting to note that the time taken is in some way predictable and consistent between the two consistency models.

The graph below shows in yellow how long the application must wait to see the correct data on the reader node. In blue is the amount of time the application waits to get back the same consistent read because it must wait for the commit on the writer.

lag time in nanoseconds

As you can see, the two are more or less aligned.

Given the performance cost imposed by using REGIONAL_RAW,  all the other tests are done with the default INSTANCE_RAW, unless explicitly stated.

Writing Tests

All tests run in this section were done using sysbench-tpcc with the following settings:

sysbench ./tpcc.lua --mysql-host=<> --mysql-port=3306 --mysql-user=<> --mysql-password=<> --mysql-db=tpcc --time=300 --threads=32 --report-interval=1 --tables=10 --scale=15  --mysql_table_options=" CHARSET=utf8 COLLATE=utf8_bin"  --db-driver=mysql prepare

 sysbench /opt/tools/sysbench-tpcc/tpcc.lua --mysql-host=$mysqlhost --mysql-port=$port --mysql-user=<> --mysql-password=<> --mysql-db=tpcc --db-driver=mysql --tables=10 --scale=15 --time=$time  --rand-type=zipfian --rand-zipfian-exp=0 --report-interval=1 --mysql-ignore-errors=all --histogram  --report_csv=yes --stats_format=csv --db-ps-mode=disable --threads=$threads run

Write Single Node (Baseline)

Before starting the comparative analysis, I was looking to define what was the “limit” of traffic/load for this platform.

baseline reads/writes

From the graph above, we can see that this setup scales up to 128 threads and after that, the performance remains more or less steady.

Amazon claims that we can mainly double the performance when using both nodes in write mode and use a different schema to avoid conflict.

aurora scalability

Once more, remember I am not interested in the absolute numbers here, but I am expecting the same behavior. Given that, our expectation is to see:

expected scalability

Write on Both Nodes, Different Schemas

So AWS recommend this as the scaling solution:


And I diligently follow the advice. I used two EC2 nodes in the same subnet of the Aurora Node, writing to a different schema (tpcc & tpcc2).

Overview

Let us make it short and go straight to the point. Did we get the expected scalability?

Well, no:


We just had a 26% increase, quite far to be the expected 100% Let us see what happened in detail (if not interested just skip and go to the next test).

Node 1

Schema read writes Aurora

Node 2


As you can see, Node1 was (more or less) keeping up with the expectations and being close to the expected performance. But Node2 was just not keeping up, and performances there were just terrible.

The graphs below show what happened.

While Node1 was (again more or less) scaling up to the baseline expectations (128 threads), Node2 collapsed on its knees at 16 threads. Node2 was never able to scale up.

Reads

Node 1


Node1 is scaling the reads as expected, but also here and there we can see performance deterioration.

Node 2


Node2 is not scaling Reads at all.

Writes

Node 1


Same as Read.

Node 2


Same as read.

Now someone may think I was making a mistake and I was writing on the same schema. I assure you I was not. Check the next test to see what happened if using the same schema.

Write on Both Nodes,  Same Schema

Overview

Now, now, Marco, this is unfair. You know this will cause contention. Yes, I do! But nonetheless, I was curious to see what was going to happen and how the platform would deal with that level of contention.

My expectations were to have a lot of performance degradation and an increased number of locks. About conflict I was not wrong, node2 after the test reported:

+-------------+---------+-------------------------+
| table       | index   | PHYSICAL_CONFLICTS_HIST |
+-------------+---------+-------------------------+
| district9   | PRIMARY |                    3450 |
| district6   | PRIMARY |                    3361 |
| district2   | PRIMARY |                    3356 |
| district8   | PRIMARY |                    3271 |
| district4   | PRIMARY |                    3237 |
| district10  | PRIMARY |                    3237 |
| district7   | PRIMARY |                    3237 |
| district3   | PRIMARY |                    3217 |
| district5   | PRIMARY |                    3156 |
| district1   | PRIMARY |                    3072 |
| warehouse2  | PRIMARY |                    1867 |
| warehouse10 | PRIMARY |                    1850 |
| warehouse6  | PRIMARY |                    1808 |
| warehouse5  | PRIMARY |                    1781 |
| warehouse3  | PRIMARY |                    1773 |
| warehouse9  | PRIMARY |                    1769 |
| warehouse4  | PRIMARY |                    1745 |
| warehouse7  | PRIMARY |                    1736 |
| warehouse1  | PRIMARY |                    1735 |
| warehouse8  | PRIMARY |                    1635 |
+-------------+---------+-------------------------+

Which is obviously a strong indication something was not working right. In terms of performance gain, if we compare ONLY the result with the 128 Threads:


Also with the high level of conflict, we still have 12% of performance gain.

The problem is that in general, we have the two nodes behaving quite badly. If you check the graph below you can see that the level of conflict is such to prevent the nodes not only to scale but to act consistently.

Node 1

Write on Both Nodes,  Same Schema

Node 2


Reads

In the following graphs, we can see how node1 had issues and it actually crashed three times, during tests with 32/64/512 threads. Node2 was always up but the performances were very low.

Node 1


Node 2


Writes

Node 1


Node 2


Recovery From Crashed Node

About recovery time, reading the AWS documentation and listening to presentations, I often heard that Aurora Multi-Primary is a 0 downtime solution. Or other statements like: “in applications where you can’t afford even brief downtime for database write operations, a multi-master cluster can help to avoid an outage when a writer instance becomes unavailable. The multi-master cluster doesn’t use the failover mechanism, because it doesn’t need to promote another DB instance to have read/write capability”

To achieve this the suggestion, the solution I found was to have applications pointing directly to the Nodes endpoint and not use the Cluster endpoint.

In this context, the solution pointing to the Nodes should be able to failover within a second or so, while the cluster endpoint:

Recovery From Crashed Node

Personally, I think that designing an architecture where the application is responsible for the connection to the database and failover is some kind of refuse from 2001. But if you feel this is the way, well, go for it.

What I did for testing is to use ProxySQL, as plain as possible with nothing else, and the basic monitor coming from the native monitor. I then compared the results with the tests using the Cluster endpoint. In this way, I adopt the advice of pointing directly at the nodes, but I was doing things in our time.

The results are below and they confirm (more or less) the data coming from Amazon.


A downtime of seven seconds is quite a long time nowadays, especially if I am targeting the 5 nines solution that I want to remember is 864 ms downtime per day. Using ProxySQL is going closer to that, but still too long to be called zero downtime.
I also have fail-back issues when using the AWS cluster endpoint, given it was not able to move the connection to the joining node seamlessly.

Last but not least, when using the consistency level INSTANCE_RAW, I had some data issue as well as PK conflict:
FATAL: mysql_drv_query() returned error 1062 (Duplicate entry ‘18828082’ for key ‘PRIMARY’) 

Conclusions

As state at the beginning of this long blog, the reasonable expectations to go for a multi-primary solution were:

  • Scale writes (more nodes more writes)
  • Gives me zero downtime, or close to that (5 nines is a maximum downtime of 864 milliseconds per day!!)
  • Allow me to shift the writer pointer at any time from A to B and vice versa, consistently.

Honestly, I feel we have completely failed the scaling point. Probably if I use the largest Aurora I will get much better absolute numbers, and it will take me more to encounter the same issues, but I will. In any case, if the multi-primary solution is designed to provide that scalability, and it should do that with any version.

I did not have zero downtime, but I was able to failover pretty quickly with ProxySQL.

Finally, unless the consistency model is REGIONAL_RAW, shifting from one node to the other is not prone to possible negative effects like stale reads. Given that I consider this requirement not satisfied in full.

Given all the above, I think this solution could eventually be valid only for High Availability (close to being 5 nines), but given it comes with some limitations I do not feel comfortable in preferring it over others just for that, at the end default Aurora is already good enough as a High available solution.

References

AWS re:Invent 2019: Amazon Aurora Multi-Master: Scaling out database write performance

Working with Aurora multi-master clusters

Improving enterprises ha and disaster recovery solutions reviewed

Robust ha solutions with proxysql

Limitations of multi-master clusters

Oct
07
2020
--

Webinar October 27: Disaster Recovery and High Availability – The Concepts, The Mistakes, and How To Properly Plan For Failure

Percona Disaster Recovery and High Availability

Percona Disaster Recovery and High AvailabilityAny good system must be built to expect the unexpected. None are perfect and at some point, something WILL happen to render the system non-operational causing failure.

Join Dimitri Vanoverbeke, Senior Percona Engineer, as he discusses the concepts of High Availability, Disaster Recovery, common missteps that happen along the way, and how to ultimately prepare for failure.

Please join Dimitri Vanoverbeke on Tuesday, October 27th, at 1 pm EDT for his webinar “Disaster Recovery and High Availability – The Concepts, The Mistakes, and How To Properly Plan For Failure“.

Register for Webinar

If you can’t attend, sign up anyway and we’ll send you the slides and recording afterward.

Oct
01
2019
--

Experimental Binary of Percona XtraDB Cluster 8.0

Experimental Binary XtraDB 8.0

Experimental Binary XtraDB 8.0Percona is happy to announce the first experimental binary of Percona XtraDB Cluster 8.0 on October 1, 2019. This is a major step for tuning Percona XtraDB Cluster to be more cloud- and user-friendly. This release combines the updated and feature-rich Galera 4, with substantial improvements made by our development team.

Improvements and New Features

Galera 4, included in Percona XtraDB Cluster 8.0, has many new features. Here is a list of the most essential improvements:

  • Streaming replication supports large transactions
  • The synchronization functions allow action coordination (wsrep_last_seen_gtid, wsrep_last_written_gtid, wsrep_sync_wait_upto_gtid)
  • More granular and improved error logging. wsrep_debug is now a multi-valued variable to assist in controlling the logging, and logging messages have been significantly improved.
  • Some DML and DDL errors on a replicating node can either be ignored or suppressed. Use the wsrep_ignore_apply_errors variable to configure.
  • Multiple system tables help find out more about the state of the cluster state.
  • The wsrep infrastructure of Galera 4 is more robust than that of Galera 3. It features a faster execution of code with better state handling, improved predictability, and error handling.

Percona XtraDB Cluster 8.0 has been reworked in order to improve security and reliability as well as to provide more information about your cluster:

  • There is no need to create a backup user or maintain the credentials in plain text (a security flaw). An internal SST user is created, with a random password for making a backup, and this user is discarded immediately once the backup is done.
  • Percona XtraDB Cluster 8.0 now automatically launches the upgrade as needed (even for minor releases). This avoids manual intervention and simplifies the operation in the cloud.
  • SST (State Snapshot Transfer) rolls back or fixes an unwanted action. It is no more “a copy only block” but a smart operation to make the best use of the copy-phase.
  • Additional visibility statistics are introduced in order to obtain more information about Galera internal objects. This enables easy tracking of the state of execution and flow control.

Installation

You can only install this release from a tarball and it, therefore, cannot be installed through a package management system, such as apt or yum. Note that this release is not ready for use in any production environment.

Percona XtraDB Cluster 8.0 is based on the following:

Please be aware that this release will not be supported in the future, and as such, neither the upgrade to this release nor the downgrade from higher versions is supported.

This release is also packaged with Percona XtraBackup 8.0.5. All Percona software is open-source and free.

In order to experiment with Percona XtraDB Cluster 8.0 in your environment, download and unpack the tarball for your platform.

Note

Be sure to check your system and make sure that the packages are installed which Percona XtraDB Cluster 8.0 depends on.

For Debian or Ubuntu:

$ sudo apt-get install -y \
socat libdbd-mysql-perl \
rsync libaio1 libc6 libcurl3 libev4 libgcc1 libgcrypt20 \
libgpg-error0 libssl1.1 libstdc++6 zlib1g libatomic1

For Red Hat Enterprise Linux or CentOS:

$ sudo yum install -y openssl socat  \
procps-ng chkconfig procps-ng coreutils shadow-utils \
grep libaio libev libcurl perl-DBD-MySQL perl-Digest-MD5 \
libgcc rsync libstdc++ libgcrypt libgpg-error zlib glibc openssl-libs

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Sep
18
2019
--

Percona XtraDB Cluster 5.7.27-31.39 Is Now Available

Percona XtraDB Cluster

Percona XtraDB ClusterPercona is happy to announce the release of Percona XtraDB Cluster 5.7.27-31.39 on September 18, 2019. Binaries are available from the downloads section or from our software repositories.

Percona XtraDB Cluster 5.7.27-31.39 is now the current release, based on the following:

All Percona software is open-source and free.

Bugs Fixed

  • PXC-2432: PXC was not updating the information_schema user/client statistics properly.
  • PXC-2555: SST initialization delay: fixed a bug where the SST process took too long to detect if a child process was running.
  • PXC-2557: Fixed a crash when a node goes NON-PRIMARY and SHOW STATUS is executed.
  • PXC-2592: PXC restarting automatically on data inconsistency.
  • PXC-2605: PXC could crash when log_slow_verbosity included InnoDB.  Fixed upstream PS-5820.
  • PXC-2639: Fixed an issue where a SQL admin command (like OPTIMIZE) could cause a deadlock.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Sep
17
2019
--

Percona XtraDB Cluster 5.6.45-28.36 Is Now Available

Percona XtraDB Cluster

Percona XtraDB Cluster

Percona is glad to announce the release of Percona XtraDB Cluster 5.6.45-28.36 on September 17, 2019. Binaries are available from the downloads section or from our software repositories.

Percona XtraDB Cluster 5.6.45-28.36 is now the current release, based on the following:

All Percona software is open-source and free.

Bugs Fixed

  • PXC-2432: PXC was not updating the information schema user/client statistics properly.
  • PXC-2555: SST initialization delay: fixed a bug where the SST process took too long to detect if a child process was running.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Jun
26
2019
--

Percona XtraDB Cluster 5.7.26-31.37 Is Now Available

Percona XtraDB Cluster 5.7

Percona XtraDB Cluster 5.7

Percona is glad to announce the release of Percona XtraDB Cluster 5.7.26-31.37 on June 26, 2019. Binaries are available from the downloads section or from our software repositories.

Percona XtraDB Cluster 5.7.26-31.37 is now the current release, based on the following:

All Percona software is open-source and free.

Bugs Fixed

  • PXC-2480: In some cases, Percona XtraDB Cluster could not replicate CURRENT_USER() used in the ALTER statement. USER() and CURRENT_USER() are no longer allowed in any ALTER statement since they fail when replicated.
  • PXC-2487: The case when a DDL or DML action was in progress from one client and the provider was updated
    from another client could result in a race condition.
  • PXC-2490: Percona XtraDB Cluster could crash when binlog_space_limit was set to a value other than zero during wsrep_recover mode.
  • PXC-2491: SST could fail if the donor had encrypted undo logs.
  • PXC-2497: The user can set the preferred donor by setting the wsrep_sst_donor variable. An IP address is not valid as the value of this variable. If the user still used an IP address, an error message was produced that did not provide sufficient information. The error message has been improved to suggest that the user check the value of the wsrep_sst_donor for an IP address.
  • PXC-2537: Nodes could crash after an attempt to set a password using mysqladmin

Other bugs fixedPXC-2276PXC-2292PXC-2476,  PXC-2560

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Jun
19
2019
--

Percona XtraDB Cluster 5.6.44-28.34 Is Now Available

Percona XtraDB Cluster 5.7

Percona XtraDB Cluster 5.7

Percona is glad to announce the release of Percona XtraDB Cluster 5.6.44-28.34 on June 19, 2019. Binaries are available from the downloads section or from our software repositories.

Percona XtraDB Cluster 5.6.44-28.34 is now the current release, based on the following:

All Percona software is open-source and free.

Bugs Fixed

  • PXC-2480: In some cases, Percona XtraDB Cluster could not replicate CURRENT_USER() used in the ALTER statement. USER() and CURRENT_USER() are no longer allowed in any ALTER statement since they fail when replicated.
  • PXC-2487: The case when a DDL or DML action was in progress from one client and the provider was updated
    from another client could result in a race condition.
  • PXC-2490: Percona XtraDB Cluster could crash when binlog_space_limit was set to a value other than zero during wsrep_recover mode.
  • PXC-2497: The user can set the preferred donor by setting the wsrep_sst_donor variable. An IP address is not valid as the value of this variable. If the user still used an IP address, an error message was produced that did not provide sufficient information. The error message has been improved to suggest that the user check the value of the wsrep_sst_donor for an IP address.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Apr
26
2019
--

Load Balanced ProxySQL in Google Cloud

GCP ProxySQL LB

There are three different ways ProxySQL can direct traffic between your application and the backend MySQL services.

  1. Locally, on the MySQL servers.
  2. Between the MySQL servers and the application.
  3. Colocated on the application servers themselves.

Without going through too much detail – each has its own limitations. In the first form, the application needs to know about all MySQL servers at any given point in time. With the third form, a large number of application servers, especially in the age of Kubernetes, where apps can simply recycle easily or be scaled up and down, backend connections can increase exponentially leading to issues.

In the second form, load balancing between a pool of ProxySQL servers is normally the challenge. Do you load balance the load balancers? While there are approaches like balancing from the application, similar to how the MongoDB drivers works, the application still needs to know and maintain a list of healthy backend proxies.

Google Cloud Platform’s (GCP) internal load balancer is a software based managed service which is implemented via virtual networking. This means, unlike physical load balancers, it is designed not to be a single point of failure.

We have played with Internal Load Balancers (ILB) and ProxySQL using the architecture below. There’s a few steps and items involved to be explained.

VM Instance Group

An instance group will be created to run the ProxySQL services. This needs to be a managed instance group so they are distributed between multiple zones. A managed instance group can auto scale, which you might or might not want. For example, a problematic ProxySQL instance can easily be replaced with a templatized VM instance.

Health Check

Health checks are the tricky part. GCP’s internal load balancer supports HTTP(S), SSL and TCP health checks. In this case, as long as ProxySQL is responding on the service port or admin port the service is up, right? Yes, but this is not necessarily enough since a port may respond but the instance can be misconfigured and return errors.

With ProxySQL you have to treat it like an actual MySQL instance (i.e. login and issue a query). On the other hand, the load balancer should be agnostic and does not necessarily need to know which backends do or do not work. The availability of backends should be left to ProxySQL as much as possible.

One way to achieve this is to use dummy rewrite rules inside ProxySQL. In the example below, we’ve configured an account called

percona

that is assigned to a non-existent hostgroup. What we are doing is simply rewriting

SELECT 1

  queries to return an OK result.

mysql> INSERT INTO mysql_query_rules (active, username, match_pattern, OK_msg)
 - > VALUES (1, 'percona', 'SELECT 1', '1');
Query OK, 1 row affected (0.00 sec)
mysql> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)
mysql> SAVE MYSQL QUERY RULES TO DISK;
Query OK, 0 rows affected (0.01 sec)

[root@west1-proxy-group-9bgs ~]# mysql -upercona -ppassword -P3306 -h127.1
...
mysql> SELECT 1;
Query OK, 0 rows affected (0.00 sec)
1

It does not solve the problem though where ILB only supports primitive TCP check and HTTP checks. We still need a layer where the response from ProxySQL will be properly translated to ILB in a form it will understand. My personal preference is to expose an HTTP service that queries ProxySQL and responds to ILB HTTP based health check. It provides additional flexibility like being able to check specific or all backends.

Firewall Rules

Health checks to ProxySQL instances comes from a specific set of IP ranges. In our case, these would be 130.211.0.0/22 and 35.191.0.0/16. Firewall ports needs to be open from these ranges to either the HTTP or TCP ports in the ProxySQL instances.

In our next post, we will use Orchestrator to manage cross region replication for high availability.

Apr
10
2019
--

How to Add More Nodes to an Existing ProxySQL Cluster

proxysql on application instances

In my previous post, some time ago, I wrote about the new cluster feature of ProxySQL. For that post, we were working with three nodes, now we’ll work with even more! If you’ve installed one ProxySQL per application instance and would like to work up to more, then this post is for you. If this is new to you, though, read my earlier post first for more context.

Check the image below to understand the structure of “one ProxySQL per application”. This means you have ProxySQL installed, and your application (Java, PHP, Apache server etc) in the same VM (virtual machine).

Schematic of a ProxySQL cluster

Having taken a look at that you probably have a few questions, such as:

  • What happens if you have 20 nodes synced and you now need to add 100 or more nodes?
  • How can I sync the new nodes without introducing errors?

Don’t be scared, it’s a simple process.

Remember there are only four tables which can be synced over the cluster. These tables are:

  • mysql_query_rules
  • mysql_servers
  • mysql_users
  • proxysql_servers

From a new proxysql cluster installation

After a fresh installation all those four tables are empty. That means if we configure it as a new node in a cluster, all rows in those tables will be copied instantly.

Generally the installation is straightforward.

Now ProxySQL is up and all the configuration are in default settings

Connect to the ProxySQL console:

mysql -uadmin -padmin -h127.0.0.1 -P6032 --prompt='\u (\d)>'

Now there are two steps remaining:

  • Configure global_variables table
  • Configure proxysql_servers table

How to configure the global_variables table

Below is an example showing the minimal parameters to set – you can change the username and password according to your needs.

You can copy and paste the username and passwords from the current cluster and monitoring process by running the next command in a node of the current cluster:

select * from global_variables where variable_name in ('admin-admin_credentials', 'admin-cluster_password', 'mysql-monitor_password', 'admin-cluster_username', 'mysql-monitor_username');

You can update the parameters of the current node by using this as a template:

update global_variables set variable_value='<REPLACE-HERE>' where variable_name='admin-admin_credentials';
update global_variables set variable_value='<REPLACE-HERE>' where variable_name='admin-cluster_username';
update global_variables set variable_value='<REPLACE-HERE>' where variable_name='admin-cluster_password';
update global_variables set variable_value='<REPLACE-HERE>' where variable_name='mysql-monitor_username';
update global_variables set variable_value='<REPLACE-HERE>' where variable_name='mysql-monitor_password';
update global_variables set variable_value=1000 where variable_name='admin-cluster_check_interval_ms';
update global_variables set variable_value=10 where variable_name='admin-cluster_check_status_frequency';
update global_variables set variable_value='true' where variable_name='admin-cluster_mysql_query_rules_save_to_disk';
update global_variables set variable_value='true' where variable_name='admin-cluster_mysql_servers_save_to_disk';
update global_variables set variable_value='true' where variable_name='admin-cluster_mysql_users_save_to_disk';
update global_variables set variable_value='true' where variable_name='admin-cluster_proxysql_servers_save_to_disk';
update global_variables set variable_value=3 where variable_name='admin-cluster_mysql_query_rules_diffs_before_sync';
update global_variables set variable_value=3 where variable_name='admin-cluster_mysql_servers_diffs_before_sync';
update global_variables set variable_value=3 where variable_name='admin-cluster_mysql_users_diffs_before_sync';
update global_variables set variable_value=3 where variable_name='admin-cluster_proxysql_servers_diffs_before_sync';
load admin variables to RUNTIME;
save admin variables to disk;

Configure “proxysql_servers” table

At this point you need keep in mind that you will need to INSERT into this table “all” the IPs from the other ProxySQL nodes.

Why so? Because this table will have a new epoch time and this process will overwrite the rest of the nodes listed by its last table update.

In our example, let’s assume the IP of the new node is 10.0.1.3 (i.e. node 3, below)

INSERT INTO proxysql_servers (hostname,port,weight,comment) VALUES ('10.0.1.1',6032,0,'p1');
INSERT INTO proxysql_servers (hostname,port,weight,comment) VALUES ('10.0.1.2',6032,0,'p2');
INSERT INTO proxysql_servers (hostname,port,weight,comment) VALUES ('10.0.1.3',6032,0,'p3');
LOAD PROXYSQL SERVERS TO RUNTIME;
SAVE PROXYSQL SERVERS TO DISK;

If you already have many ProxySQL servers in the cluster, you can run mysqldump as this will help speed up this process.

If that’s the case, you need to find the most up to date node and use mysqldump to export the data from the proxysql_servers table.

How would you find this node? There are a few stats tables in ProxySQL, and in this case we can use two of these to help identify the right node.

SELECT stats_proxysql_servers_checksums.hostname, stats_proxysql_servers_metrics.Uptime_s, stats_proxysql_servers_checksums.port, stats_proxysql_servers_checksums.name, stats_proxysql_servers_checksums.version, FROM_UNIXTIME(stats_proxysql_servers_checksums.epoch) epoch, stats_proxysql_servers_checksums.checksum, stats_proxysql_servers_checksums.diff_check FROM stats_proxysql_servers_metrics JOIN stats_proxysql_servers_checksums ON stats_proxysql_servers_checksums.hostname = stats_proxysql_servers_metrics.hostname WHERE stats_proxysql_servers_metrics.Uptime_s > 0 ORDER BY epoch DESC

Here’s an example output

+------------+----------+------+-------------------+---------+---------------------+--------------------+------------+
| hostname   | Uptime_s | port | name              | version | epoch               | checksum           | diff_check |
+------------+----------+------+-------------------+---------+---------------------+--------------------+------------+
| 10.0.1.1   | 1190     | 6032 | mysql_users       | 2       | 2019-04-04 12:04:21 | 0xDB07AC7A298E1690 | 0          |
| 10.0.1.2   | 2210     | 6032 | mysql_users       | 2       | 2019-04-04 12:04:18 | 0xDB07AC7A298E1690 | 0          |
| 10.0.1.1   | 1190     | 6032 | mysql_query_rules | 1       | 2019-04-04 12:00:07 | 0xBC63D734643857A5 | 0          |
| 10.0.1.1   | 1190     | 6032 | mysql_servers     | 1       | 2019-04-04 12:00:07 | 0x0000000000000000 | 0          |
| 10.0.1.1   | 1190     | 6032 | proxysql_servers  | 1       | 2019-04-04 12:00:07 | 0x233638C097DE6190 | 0          |
| 10.0.1.2   | 2210     | 6032 | mysql_query_rules | 1       | 2019-04-04 11:43:13 | 0xBC63D734643857A5 | 0          |
| 10.0.1.2   | 2210     | 6032 | mysql_servers     | 1       | 2019-04-04 11:43:13 | 0x0000000000000000 | 0          |
| 10.0.1.2   | 2210     | 6032 | proxysql_servers  | 1       | 2019-04-04 11:43:13 | 0x233638C097DE6190 | 0          |
| 10.0.1.2   | 2210     | 6032 | admin_variables   | 0       | 1970-01-01 00:00:00 |                    | 0          |
| 10.0.1.2   | 2210     | 6032 | mysql_variables   | 0       | 1970-01-01 00:00:00 |                    | 0          |
| 10.0.1.1   | 1190     | 6032 | admin_variables   | 0       | 1970-01-01 00:00:00 |                    | 0          |
| 10.0.1.1   | 1190     | 6032 | mysql_variables   | 0       | 1970-01-01 00:00:00 |                    | 0          |
+------------+----------+------+-------------------+---------+---------------------+--------------------+------------+

For each table, we can see different versions, each related to table changes.  In this case, we need to look for the latest epoch time for the table “proxysql_servers“. In this example , above, we can see that the server with the IP address of  10.0.1.1 is the latest version.  We can now run the next command to get a backup of all the IP data from the current cluster

mysqldump --host=127.0.0.1 --port=6032 --skip-opt --no-create-info --no-tablespaces --skip-triggers --skip-events main proxysql_servers > proxysql_servers.sql

Now, copy the output to the new node and import into proxysql_servers table. Here’s an example: the data has been exported to the file proxysql_servers.sql which we’ll now load to the new node:

source proxysql_servers.sql
LOAD PROXYSQL SERVERS TO RUNTIME;
SAVE PROXYSQL SERVERS TO DISK;

You can run the next SELECTs to verify that the new node has the data from the current cluster, and in doing so ensure that the nodes are in sync as expected:

select * from mysql_query_rules;
select * from mysql_servers;
select * from mysql_users ;
select * from proxysql_servers;

How can we check if there are errors in the synchronization process?

Run the next command to fetch data from the table stats_proxysql_servers_checksums:

SELECT hostname, port, name, version, FROM_UNIXTIME(epoch) epoch, checksum, diff_check FROM stats_proxysql_servers_checksums  ORDER BY epoch;

Using our example, here’s the data saved in stats_proxysql_servers_checksums for the proxysql_servers table

admin ((none))>SELECT hostname, port, name, version, FROM_UNIXTIME(epoch) epoch, checksum, diff_check FROM stats_proxysql_servers_checksums  ORDER BY epoch;
+------------+------+-------------------+---------+---------------------+--------------------+------------+---------------------+
| hostname   | port | name              | version | epoch               | checksum           | diff_check | DATETIME('NOW')     |
+------------+------+-------------------+---------+---------------------+--------------------+------------+---------------------+
...
| 10.0.1.1   | 6032 | proxysql_servers  | 2       | 2019-03-25 13:36:17 | 0xC7D7443B96FC2A94 | 92         | 2019-03-25 13:38:31 |
| 10.0.1.2   | 6032 | proxysql_servers  | 2       | 2019-03-25 13:36:34 | 0xC7D7443B96FC2A94 | 92         | 2019-03-25 13:38:31 |
...
| 10.0.1.3   | 6032 | proxysql_servers  | 2       | 2019-03-25 13:37:00 | 0x233638C097DE6190 | 0          | 2019-03-25 13:38:31 |
...
+------------+------+-------------------+---------+---------------------+--------------------+------------+---------------------+

As we can see in the column “diff” there are some differences in the checksum of the other nodes: our new node (IP 10.0.1.3) has the checksum 0x233638C097DE6190 where as nodes 1 and 2 both show 0xC7D7443B96FC2A94

This indicates that these nodes have different data. Is this correct? Well, yes, in this case, since in our example the nodes 1 and 2 were created on the current ProxySQL cluster

How do you fix this?

Well, you connect to the node 1 and INSERT the data for node 3 (the new node).

This change will propagate the changes over the current cluster – node 2 and node 3 – and will update the table proxysql_servers.

INSERT INTO proxysql_servers (hostname,port,weight,comment) VALUES ('10.0.1.3',6032,0,'p3');
LOAD PROXYSQL SERVERS TO RUNTIME;

What happens if the new nodes already exist in the proxysql_servers table of the current cluster?

You can add the new ProxySQL node IPs to the current cluster before you start installing and configuring the new nodes.

In fact, this works fine and without any issues. The steps I went through, above, are the same, but when you check for differences in the table stats_proxysql_servers_checksums you should find there are none, and that the “diff” column should be displayed as 0.

How does ProxySQL monitor other ProxySQL nodes to sync locally?

Here’s a link to the explanation from the ProxySQL wiki page https://github.com/sysown/proxysql/wiki/ProxySQL-Cluster

Proxies monitor each other, so they immediately recognize that when a checksum of a configuration changes, the configuration has changed. It’s possible that the remote peer’s configuration and its own configuration were changed at the same time, or within a short period of time. So the proxy must also check its own status.

When proxies find differences:

  • If its own version is 1 , find the peer with version > 1 and with the highest epoch, and sync immediately
  • If its own version is greater than 1, starts counting for how many checks they differ
    • when the number of checks in which they differ is greater than cluster__name___diffs_before_sync and cluster__name__diffs_before_sync itself is greater than 0, find the peer with version > 1 and with the highest epoch, and sync immediately.

Note: it is possible that a difference is detected against one node, but the sync is performed against a different node. Since ProxySQL bases the sync on the node with the highest epoch, it is expected that all the nodes will converge.

To perform the syncing process:

These next steps run automatically when other nodes detect a change: this is an example for the table mysql_users

The same connection that’s used to perform the health check is used to execute a series of SELECTstatements in the form of SELECT _list_of_columns_ FROM runtime_module

SELECT username, password, active, use_ssl, default_hostgroup, default_schema, schema_locked, transaction_persistent, fast_forward, backend, frontend, max_connections FROM runtime_mysql_users;

In the node where it detected a difference in the checksum, it will run the next statements automatically:

DELETE FROM mysql_servers;
INSERT INTO mysql_users(username, password, active, use_ssl, default_hostgroup, default_schema, schema_locked, transaction_persistent, fast_forward, backend, frontend, max_connections) VALUES ('user_app','secret_password', 1, 0, 10, '', 0, 1, 0, 0, 1, 1000);
LOAD MYSQL USERS TO RUNTIME;
SAVE MYSQL USERS TO DISK;

Be careful:

After you’ve added the new ProxySQL node to the cluster, all four tables will sync immediately using the node with the highest epoch time as the basis for sync. Any changes to the tables previously added to the cluster, will propagate and overwrite the cluster

Summary

ProxySQL cluster is a powerful tool, and it’s being used more and more in production environments. If you haven’t tried it before, I’d recommend that you start testing ProxySQL cluster with a view to improving your application’s service. Hope you found this post helpful!

 

Apr
02
2019
--

Percona XtraDB Cluster Operator 0.3.0 Early Access Release Is Now Available

Percona XtraDB Cluster Operator

Percona announces the release of Percona XtraDB Cluster Operator 0.3.0 early access.

The Percona XtraDB Cluster Operator simplifies the deployment and management of Percona XtraDB Cluster in a Kubernetes or OpenShift environment. It extends the Kubernetes API with a new custom resource for deploying, configuring and managing the application through the whole life cycle.Percona XtraDB Cluster Operator

You can install the Percona XtraDB Cluster Operator on Kubernetes or OpenShift. While the operator does not support all the Percona XtraDB Cluster features in this early access release, instructions on how to install and configure it are already available along with the operator source code, hosted in our Github repository.

The Percona XtraDB Cluster Operator is an early access release. Percona doesn’t recommend it for production environments.

New features

Improvements

Fixed Bugs

  • CLOUD-148: Pod Disruption Budget code caused the wrong configuration to be applied for ProxySQL and had lack of multiple availability zones support.
  • CLOUD-138: The restore-backup.sh script was exiting with an error because its code was not taking into account images version numbers.
  • CLOUD-118: The backup recovery job was unable to start if Persistent Volume for backup and Persistent Volume for Pod-0 were placed in different availability zones.

Percona XtraDB Cluster is an open source, cost-effective and robust clustering solution for businesses. It integrates Percona Server for MySQL with the Galera replication library to produce a highly-available and scalable MySQL® cluster complete with synchronous multi-master replication, zero data loss and automatic node provisioning using Percona XtraBackup.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com