Comparisons of Proxies for MySQL
With a special focus on Percona Operator for MySQL
Overview
HAProxy, ProxySQL, MySQL Router (AKA MySQL Proxy): in the last few years, I have had to answer many times the question of which proxy to use and in which scenario. When designing an architecture, many components need to be considered before deciding on the best solution.
When deciding what to pick, there is a lot to weigh: where the proxy needs to sit, whether it “just” needs to redirect connections or must also provide features such as caching and filtering, and whether it needs to integrate with some MySQL-embedded automation.
Given that, there has never been a single straight answer. Instead, an analysis needs to be done. Only after gaining a better understanding of the environment, the needs, and the evolution the platform must achieve is it possible to decide what the better choice will be.
However, recently we have seen an increase in the usage of MySQL on Kubernetes, especially with the adoption of Percona Operator for MySQL. In this case, we have a quite well-defined scenario that can resemble the image below:
In this scenario, the proxies must sit inside Pods, balancing the incoming traffic from the Service LoadBalancer connecting with the active data nodes.
Their role is merely to be sure that any incoming connection is redirected to nodes that can serve them, which includes having a separation between Read/Write and Read Only traffic, a separation that can be achieved, at the service level, with automatic recognition or with two separate entry points.
In this scenario, it is also crucial to be efficient in resource utilization and to scale with frugality. In this context, features like filtering, firewalling, or caching are redundant and may consume resources that could be allocated to scaling. Those features also work better outside the K8s/Operator cluster: the closer to the application they are located, the better they serve.
About that, we must always remember the concept that each K8s/Operator cluster needs to be seen as a single service, not as a real cluster. In short, each cluster is, in reality, a single database with high availability and other functionalities built in.
Anyhow, we are here to talk about proxies. With that single clear mandate in mind, we need to identify which product allows our K8s/Operator solution to:
- Scale the number of incoming connections as much as possible
- Serve the requests with the highest efficiency
- Consume as few resources as possible
The environment
To identify the above points, I have simulated a possible K8s/Operator environment, creating:
- One powerful application node, where I ran sysbench read-only tests, scaling from 2 to 4096 threads (Type c5.4xlarge)
- Three mid-size data nodes, each holding several gigabytes of data, running MySQL with Group Replication (Type m5.xlarge)
- One proxy node running on a resource-limited box (Type t2.micro)
The tests
The test cases are very simple. The first one is meant to define the baseline, identifying the moment when the first level of saturation is reached due to the number of connections. In this case, the number of connections increases while the number of operations stays low.
The second test will define how well the increasing load is served inside the previously identified range.
For documentation, the sysbench commands are:
Test1
sysbench ./src/lua/windmills/oltp_read.lua --db-driver=mysql --tables=200 --table_size=1000000 --rand-type=zipfian --rand-zipfian-exp=0 --skip_trx=true --report-interval=1 --mysql-ignore-errors=all --mysql_storage_engine=innodb --auto_inc=off --histogram --stats_format=csv --db-ps-mode=disable --point-selects=50 --reconnect=10 --range-selects=true --rate=100 --threads=<#Threads from 2 to 4096> --time=1200 run
Test2
sysbench ./src/lua/windmills/oltp_read.lua --mysql-host=<host> --mysql-port=<port> --mysql-user=<user> --mysql-password=<pw> --mysql-db=<schema> --db-driver=mysql --tables=200 --table_size=1000000 --rand-type=zipfian --rand-zipfian-exp=0 --skip_trx=true --report-interval=1 --mysql-ignore-errors=all --mysql_storage_engine=innodb --auto_inc=off --histogram --table_name=<tablename> --stats_format=csv --db-ps-mode=disable --point-selects=50 --reconnect=10 --range-selects=true --threads=<#Threads from 2 to 4096> --time=1200 run
Results
Test 1
As indicated above, I was looking to identify when the first proxy would reach a level it could no longer manage. The load here is all about creating and serving the connections, while the number of operations is capped at 100.
As you can see, and as I was expecting, the three proxies behaved more or less the same, serving the same number of operations (they were capped, so why not) until they weren’t.
MySQL Router, after 2048 connections, could not serve anything more.
NOTE: MySQL Router actually stopped working at 1024 threads, but with version 8.0.32 I enabled the connection_sharing feature, which allows it to go a bit further.
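For reference, this is roughly what enabling that option looks like in mysqlrouter.conf. Treat it as a sketch only: the section names, port, and cluster name are illustrative, and connection sharing also relies on the Router connection pool being enabled (check the 8.0.32 documentation for the exact options of your version):

[connection_pool]
# connection sharing needs pooled server connections it can reuse
max_idle_server_connections = 64

[routing:rw]
bind_address = 0.0.0.0
bind_port = 6446
destinations = metadata-cache://mycluster/?role=PRIMARY
routing_strategy = first-available
# 8.0.32+: allow multiple client connections to share server connections
connection_sharing = 1
connection_sharing_delay = 1.0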
Let us also take a look at the latency:
Here the situation starts to be a little more complicated. MySQL Router is the one with the highest latency no matter what. However, HAProxy and ProxySQL show interesting behavior: HAProxy performs better with a low number of connections, while ProxySQL performs better when a high number of connections is in place.
This is due to the multiplexing and the very efficient way ProxySQL uses to deal with high load.
Everything has a cost:
HAProxy is definitely using fewer user CPU resources than ProxySQL or MySQL Router…
…we can also notice that HAProxy barely reaches, on average, a 1.5 CPU load, while ProxySQL is at 2.50 and MySQL Router is around 2.
To be honest, I was expecting something like this, given ProxySQL’s need to handle the connections and do the basic routing. What was instead a surprise was MySQL Router: why does it have a higher load?
Brief summary
This test highlights that HAProxy and ProxySQL can reach a number of connections higher than the slowest runner in the game (MySQL Router). It is also clear that traffic under a high number of connections is better served by ProxySQL, but it requires more resources.
Test 2
When the going gets tough, the tough get going
Let’s remove the --rate limitation and see what will happen.
With load, the scenario changes drastically. We can see how HAProxy keeps serving the connections and allows the execution of more operations for the whole test. ProxySQL comes immediately after it and behaves quite well up to 128 threads, then it just collapses.
MySQL Router never takes off; it always stays below 1k reads/second, while HAProxy served 8.2k and ProxySQL 6.6k.
Looking at the latency, we can see that HAProxy’s latency increased gradually, as expected, while ProxySQL’s and MySQL Router’s shot up from 256 threads on. Note that both ProxySQL and MySQL Router could not complete the tests with 4096 threads.
Why? HAProxy always stays below 50% CPU, no matter the increasing number of threads/connections, scaling the load very efficiently. MySQL Router reached the saturation point almost immediately, being affected by both the number of threads/connections and the number of operations. That was unexpected, given that we do not have Level 7 capabilities in MySQL Router.
Finally, ProxySQL, which was working fine up to a certain limit, reached the saturation point and could not serve the load. I say load because ProxySQL is a Level 7 proxy and is aware of the content of the load. Given that, on top of multiplexing, additional resource consumption was expected.
Here we just have a clear confirmation of what was already said above, with 100% CPU utilization reached by MySQL Router with just 16 threads, and ProxySQL way after at 256 threads.
Brief summary
HAProxy comes out as the champion in this test; there is no doubt that it could scale with the increasing number of connections without being significantly affected by the load generated by the requests. The lower resource consumption also indicates possible room for even more scaling.
ProxySQL was penalized by the limited resources, but that was the game: we had to get the most out of the little available. This test indicates that ProxySQL is not the optimal choice inside the Operator; it is the wrong choice if low resource consumption and scalability are a must.
MySQL Router was never in the game. Unless it undergoes a serious refactoring, MySQL Router is designed for very limited scalability; as such, the only way to adopt it is to run many of them at the application-node level. Using it close to the data nodes in a centralized position is a mistake.
Conclusions
I started by showing an image of how the MySQL service is organized, and I want to close by showing the variation that, for me, should be considered the default approach:
This highlights that we must always choose the right tool for the job.
The proxy in architectures involving MySQL/Percona Server for MySQL/Percona XtraDB Cluster is a crucial element for the scalability of the cluster, whether using K8s or not. Choosing the one that serves us best is important, and that can sometimes be ProxySQL over HAProxy.
However, when talking about K8s and Operators, we must recognize the need to optimize resource usage for the specific service. In that context, there is no discussion: HAProxy is the best solution and the one we should go with.
My final observation is about MySQL Router (aka MySQL Proxy).
Unless there is a significant refactoring of the product, at the moment it is not even close to what the other two can do. From the tests done so far, it requires a complete reshaping, starting by identifying why it is affected so much more by the load coming from the queries than by the load coming from the connections.
Great MySQL to everyone.
Full Read Consistency Within Percona Kubernetes Operator for Percona XtraDB Cluster
The aim of Percona Kubernetes Operator for Percona XtraDB Cluster is to be a special type of controller introduced to simplify complex deployments. The Operator extends the Kubernetes API with custom resources. The Operator solution uses Percona XtraDB Cluster (PXC) under the hood to provide a highly available, resilient, and scalable MySQL service in the Kubernetes space.
This solution comes with all the advantages/disadvantages provided by Kubernetes, plus some advantages of its own like the capacity to scale reads on the nodes that are not Primary.
Of course, there are some limitations, like the way PXC handles DDLs, which may impact the service, but there is always a cost to pay to get something; expecting to have it all for free is unreasonable.
In this context, we need to cover what full read consistency is in this solution and why it is important to understand the role it plays.
Stale Reads
When using Kubernetes we should talk about the service and not about the technology/product used to deliver such service.
In our case, the Percona Operator is there to deliver a MySQL service. We should then see it as a whole, as a single object. To be clearer, what we must consider is NOT the fact that we have a cluster behind the service, but that we have a service that, in order to be resilient and highly available, uses a cluster.
We should not care if a node/pod goes down unless the service is discontinued.
What we have as a plus in the Percona Operator solution is a certain level of READ scalability. This is achieved by optimizing the use of the non-PRIMARY nodes: instead of having them sit there applying only replicated data, the Percona Operator provides access to them to scale the reads.
But… there is always a but, right?
Let us start with an image:
By design, the apply and commit finalize in Galera (PXC) may have (and has) a delay between nodes. This means that, if using defaults, applications may have inconsistent reads if trying to access the data from different nodes than the Primary.
The Operator provides access using two different solutions:
- Using HAProxy (default)
- Using ProxySQL
When using HAProxy you will have 2 entry points:
- cluster1-haproxy, which will point to the Primary ONLY, for reads and writes. This is the default entry point for the applications to the MySQL database.
- cluster1-haproxy-replicas, which will point to all three nodes and is supposed to be used for READS only. This is the PLUS you can use if your application has READ/WRITE separation.
Please note that at the moment there is nothing preventing an application from using cluster1-haproxy-replicas for writes as well, but that is dangerous and wrong: it will generate a lot of certification conflicts and BF aborts, because it distributes the writes all over the cluster, impacting performance as well (and not giving you any write scaling):
[marcotusa@instance-1 ~]$ for i in `seq 1 100`; do mysql -h cluster1-haproxy-replicas -e "insert into test.iamwritingto values(null,@@hostname)";done

+----------------+-------------+
| host           | count(host) |
+----------------+-------------+
| cluster1-pxc-1 |          34 |
| cluster1-pxc-2 |          33 |
| cluster1-pxc-0 |          33 |
+----------------+-------------+
When using ProxySQL the entry point is a single one, but you may define query rules to automatically split the R/W requests coming from the application. This is the preferred method when an application has no way to separate the READS from the writes.
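As a rough idea of what such rules look like, here is a minimal sketch issued from the ProxySQL admin interface; the hostgroup numbers (10 for the writer, 20 for the readers) are assumptions that depend on how the backend servers were configured:

-- send SELECT ... FOR UPDATE to the writer hostgroup, plain SELECTs to the readers
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
VALUES (100, 1, '^SELECT.*FOR UPDATE', 10, 1);
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
VALUES (200, 1, '^SELECT', 20, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;

Any query that matches no rule goes to the default hostgroup of the connecting user, which should be the writer.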
Here I have done a comparison of the two methods, HAProxy and ProxySQL.
Now, as mentioned above, by default PXC (or any Galera-based solution) comes with some relaxed settings for performance purposes. This is normally fine in many standard cases, but if you use the Percona Operator and take advantage of the PLUS of scaling reads through the second access point with HAProxy, or through query rules with ProxySQL, you should NOT have stale reads, given that the service must provide consistent data, as if you were acting on a single node.
To achieve that, you can change the default and set the PXC parameter wsrep_sync_wait.
When changing the parameter wsrep_sync_wait as explained in the documentation, the node initiates a causality check, blocking incoming queries while it catches up with the cluster.
Once all data on the node receiving the READ request is commit_finalized, the node performs the read.
But this has a performance impact, as said before.
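As a minimal sketch of how to apply it with the Operator (assuming your version of the Operator exposes the pxc.configuration block in cr.yaml, as recent versions do), the setting goes into the PXC configuration section:

spec:
  pxc:
    configuration: |
      [mysqld]
      wsrep_sync_wait=3

Alternatively, if only part of the traffic needs consistent reads, the application can issue SET SESSION wsrep_sync_wait=3; for those sessions only and leave everything else on the defaults.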
What Is The Impact?
To test the performance impact, I used a cluster deployed in GKE with these characteristics:
- 3 Main nodes n2-standard-8 (8 vCPUs, 32 GB memory)
- 1 App node n2-standard-8 (8 vCPUs, 32 GB memory)
- PXC pods using:
- 25GB of the 32 available
- 6 CPU of the 8 available
- HAProxy:
- 600m CPU
- 1GB RAM
- PMM agent
- 500m CPU
- 500 MB Ram
On the application node, I ran two sysbench instances, one in read/write mode and the other doing reads only. Finally, to test stale reads, I used the stale-read test from my test suite.
Given I was looking for results with a moderate load, I just used 68/96/128 threads per sysbench instance.
Results
Marco, did we have or not have stale reads? Yes, we did:
I had from 0% (with a very light load) up to 37% stale reads with a MODERATE load, where moderate means sysbench running with 128 threads.
Setting wsrep_sync_wait=3, I of course had full consistency. But I also had a performance loss:
As you can see, I had an average loss of 11% in case of READS:
While for writes the average loss was 16%.
Conclusions
At this point, we need to stop and think about what is worth doing. If my application is READ-heavy and needs READ scaling, it is probably worth enabling full synchronicity, given that scaling out to the additional nodes allows me to serve 2x or more READs.
If instead my application is write-critical, losing ~16% performance is probably not acceptable.
Finally, if my application is tolerant of stale reads, I will just go with the defaults and get all the benefits without penalties.
Also keep in mind that Percona Kubernetes Operator for Percona XtraDB Cluster is designed to offer a MySQL service, so the state of a single node is not as critical as in a default PXC installation: Pods are by nature ephemeral objects, while the service is resilient.
References
Percona Kubernetes Operator for Percona XtraDB Cluster
https://github.com/Tusamarco/testsuite
https://en.wikipedia.org/wiki/Isolation_(database_systems)#Dirty_reads
https://galeracluster.com/library/documentation/mysql-wsrep-options.html#wsrep-sync-wait
https://www.slideshare.net/lefred.descamps/galera-replication-demystified-how-does-it-work
Proxy Protocol and Percona XtraDB Cluster: A Quick Guide
On September 21st, we released Percona XtraDB Cluster 5.6.25. This is the first PXC release supporting proxy protocol, a feature that has been included in Percona Server since 5.6.25-73.0.
With this blog post, I want to promote a new feature that you may have overlooked.
Let’s start with a description of the use case:
Architecture Overview:
HAProxy is configured like this:
listen 3307-active-passive-writes 0.0.0.0:3307
    mode tcp
    balance leastconn
    option httpchk
    server pxc1 pxc1:3306 check port 8000 inter 1000 rise 3 fall 3
    server pxc2 pxc2:3306 check port 8000 inter 1000 rise 3 fall 3 backup
    server pxc3 pxc3:3306 check port 8000 inter 1000 rise 3 fall 3 backup
So until now, when we connected from any machine to MySQL through HAProxy, the connection appeared, from MySQL’s point of view, to come from HAProxy’s host.
[root@app1 ~]# mysql -h haproxy -P 3307 -utest -ptest -e "select @@wsrep_node_name, sleep(20)";
[root@app2 ~]# mysql -h haproxy -P 3307 -utest -ptest -e "select @@wsrep_node_name, sleep(20)";
In the processlist we could see it like this:
pxc1 mysql> SELECT PROCESSLIST_ID AS id, PROCESSLIST_USER AS user, PROCESSLIST_HOST AS host, PROCESSLIST_INFO FROM performance_schema.threads WHERE PROCESSLIST_INFO LIKE 'select @% sleep%';
+------+------+---------+-------------------------------------+
| id   | user | host    | PROCESSLIST_INFO                    |
+------+------+---------+-------------------------------------+
|  961 | test | haproxy | select @@wsrep_node_name, sleep(20) |
|  963 | test | haproxy | select @@wsrep_node_name, sleep(20) |
+------+------+---------+-------------------------------------+
Shown like this, it’s impossible to know which connection is made from app1 and which from app2, as they all come from the proxy. Therefore this is also the host that’s required in GRANTS.
And it is particularly for this reason that we decided to support proxy protocol.
Let’s add the required setting in my.cnf:
proxy_protocol_networks = *
Then we need to also modify HAProxy’s config to be like this:
listen 3307-active-passive-writes 0.0.0.0:3307
    mode tcp
    balance leastconn
    option httpchk
    server pxc1 pxc1:3306 send-proxy check port 8000 inter 1000 rise 3 fall 3
    server pxc2 pxc2:3306 send-proxy check port 8000 inter 1000 rise 3 fall 3 backup
    server pxc3 pxc3:3306 send-proxy check port 8000 inter 1000 rise 3 fall 3 backup
send-proxy-v2 is also supported.
From HA Proxy’s Manual:
Version 1 senders MAY only produce the human-readable header format. Version 2
senders MAY only produce the binary header format. Version 1 receivers MUST at
least implement the human-readable header format. Version 2 receivers MUST at
least implement the binary header format, and it is recommended that they also
implement the human-readable header format for better interoperability and ease
of upgrade when facing version 1 senders.
So now, when MySQL and HAProxy are restarted, we can see in the processlist the origin of the connection:
+------+------+------+-------------------------------------+
| id   | user | host | PROCESSLIST_INFO                    |
+------+------+------+-------------------------------------+
|  283 | test | app1 | select @@wsrep_node_name, sleep(20) |
|  284 | test | app2 | select @@wsrep_node_name, sleep(20) |
+------+------+------+-------------------------------------+
Is everything perfect then? Not really…
One disadvantage of this proxy_protocol_networks setting is that you are now not able to connect to MySQL at all if you don’t send the proxy headers:
[root@app1 ~]# mysql -h pxc1 -P 3306 -utest -ptest -e "select @@wsrep_node_name, sleep(20)";
This connection seems stalled… no answer, no error…
In the MySQL processlist we can see:
pxc1 mysql> show full processlist;
+-----+----------------------+-----------------+------+---------+------+--------------------+-----------------------+-----------+---------------+
| Id  | User                 | Host            | db   | Command | Time | State              | Info                  | Rows_sent | Rows_examined |
+-----+----------------------+-----------------+------+---------+------+--------------------+-----------------------+-----------+---------------+
|   1 | system user          |                 | NULL | Sleep   | 1922 | NULL               | NULL                  |         0 |             0 |
|   2 | system user          |                 | NULL | Sleep   | 1922 | wsrep aborter idle | NULL                  |         0 |             0 |
|   9 | root                 | localhost       | NULL | Query   |    0 | init               | show full processlist |         0 |             0 |
| 360 | unauthenticated user | connecting host | NULL | Connect | NULL | login              | NULL                  |         0 |             0 |
+-----+----------------------+-----------------+------+---------+------+--------------------+-----------------------+-----------+---------------+
This means you can connect only via HAProxy or the socket… Of course this limits you a lot, not only for DBAs or developers but also for things like replication slaves, for example.
And you can also imagine how fast you could reach max_connections, as every connection attempt will be stalled in that state:
pxc1 mysql> show full processlist;
+-----+----------------------+-----------------+------+---------+------+--------------------+-----------------------+-----------+---------------+
| Id  | User                 | Host            | db   | Command | Time | State              | Info                  | Rows_sent | Rows_examined |
+-----+----------------------+-----------------+------+---------+------+--------------------+-----------------------+-----------+---------------+
|   1 | system user          |                 | NULL | Sleep   | 1969 | NULL               | NULL                  |         0 |             0 |
|   2 | system user          |                 | NULL | Sleep   | 1969 | wsrep aborter idle | NULL                  |         0 |             0 |
|   9 | root                 | localhost       | NULL | Query   |    0 | init               | show full processlist |         0 |             0 |
| 990 | unauthenticated user | connecting host | NULL | Connect | NULL | login              | NULL                  |         0 |             0 |
| 992 | unauthenticated user | connecting host | NULL | Connect | NULL | login              | NULL                  |         0 |             0 |
| 993 | unauthenticated user | connecting host | NULL | Connect | NULL | login              | NULL                  |         0 |             0 |
| 994 | unauthenticated user | connecting host | NULL | Connect | NULL | login              | NULL                  |         0 |             0 |
| 996 | unauthenticated user | connecting host | NULL | Connect | NULL | login              | NULL                  |         0 |             0 |
| 997 | unauthenticated user | connecting host | NULL | Connect | NULL | login              | NULL                  |         0 |             0 |
+-----+----------------------+-----------------+------+---------+------+--------------------+-----------------------+-----------+---------------+
The usual MySQL variable connect_timeout doesn’t work.
To avoid this, and also to be able to connect directly to MySQL, you need to specify from which network MySQL expects those proxy headers. Therefore, proxy_protocol_networks (ppn) should be set to the proxy’s address (or to a network range if your proxies use a dedicated one):
proxy_protocol_networks = 192.168.56.5
Even though this is the recommended solution, it won’t work on its own, because it is also mandatory to bind MySQL to listen on its IPv4 address:
bind_address=192.168.56.2 (on pxc1)
This is the error you would face if you don’t bind MySQL:
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading authorization packet', system error: 0
When this is implemented in the configuration and MySQL is restarted, it’s now possible to connect to MySQL from the proxy or directly:
[root@app1 ~]# mysql -h haproxy -P 3307 -utest -ptest -e "select @@wsrep_node_name, sleep(20)"
+-------------------+-----------+
| @@wsrep_node_name | sleep(20) |
+-------------------+-----------+
| pxc1              |         0 |
+-------------------+-----------+

[root@app1 ~]# mysql -h pxc1 -P 3306 -utest -ptest -e "select @@wsrep_node_name, sleep(20)"
+-------------------+-----------+
| @@wsrep_node_name | sleep(20) |
+-------------------+-----------+
| pxc1              |         0 |
+-------------------+-----------+
I tried to completely disable IPv6 instead of binding MySQL to a specific IP (something I had heard might work), but it didn’t help:
[root@pxc2 ~]# mysql -h haproxy -P 3307 -utest -ptest -e "select @@wsrep_node_name, sleep(20)"
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading authorization packet', system error: 0
To summarize, if you want MySQL’s processlist and GRANTS (for authentication) to show the real client (origin) instead of HAProxy’s IP, you need to (a consolidated example follows the list):
- add the proxy protocol (send-proxy) to the server lines in HAProxy
- add proxy_protocol_networks to my.cnf, set to the proxy’s IP
- bind MySQL to its IPv4 IP
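Putting the three pieces together, this is a consolidated sketch of the working combination, using the same addresses as above (192.168.56.2 is pxc1’s IP, 192.168.56.5 is HAProxy’s):

# my.cnf on pxc1
[mysqld]
bind_address            = 192.168.56.2
proxy_protocol_networks = 192.168.56.5

# haproxy.cfg
listen 3307-active-passive-writes 0.0.0.0:3307
    mode tcp
    balance leastconn
    option httpchk
    server pxc1 pxc1:3306 send-proxy check port 8000 inter 1000 rise 3 fall 3
    server pxc2 pxc2:3306 send-proxy check port 8000 inter 1000 rise 3 fall 3 backup
    server pxc3 pxc3:3306 send-proxy check port 8000 inter 1000 rise 3 fall 3 backup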
This is also a summary of settings I’ve tried and their results:
| MySQL bind | proxy_protocol_networks | HAProxy protocol option | Result |
|---|---|---|---|
| n/a | n/a | n/a | We see HAProxy's IP/host in MySQL as the source of the connection |
| n/a | n/a | send-proxy | ERROR 2013 (HY000): Lost connection to MySQL server at 'reading authorization packet', system error: 0 |
| n/a | ppn=* | send-proxy | We see the APP as the source of the connection when connecting through the proxy. We can't connect directly to MySQL |
| n/a | ppn=HAProxy's IP | send-proxy | We can connect directly. We can't connect using the proxy: ERROR 2013 (HY000): Lost connection to MySQL server at 'reading authorization packet', system error: 0 |
| IP | ppn=HAProxy's IP | send-proxy | We see the APP as the source of the connection when using the proxy. We can connect directly |
| IP | ppn=MySQL's IP | send-proxy | Cannot connect using the proxy: ERROR 2013 (HY000): Lost connection to MySQL server at 'reading authorization packet', system error: 0 |
| IP | ppn=network range (192.168.56.0/24) | send-proxy | We can connect using the proxy and see the APP as the source. We cannot connect directly |
These two bugs were reported while writing this blog post:
- When using proxy protocol, mysql needs to bind to an address instead of 0.0.0.0
- proxy-protocol doesn’t take in consideration connect_timeout, possible DOS
Q&A: High availability when using MySQL in the cloud
Last week I hosted a webinar on using MySQL in the cloud for High Availability (HA) alongside 451 Research analyst Jason Stamper. You can watch the recording and also download the slides (free) here. Just click the “Register” button at the end of that page.
We had several excellent questions and we didn’t have time to get to several of them in the allotted time. I’m posting them here along with the answers. Feel free to ask follow-up questions in the comments below.
Q: Can the TokuDB engine be used in a PXC environment?
A: No, TokuDB cannot currently be used in a PXC environment; the only supported engine in Percona XtraDB Cluster 5.6 is InnoDB.
—
Q: With Galera replication (PXC), is the load balanced across the nodes?
A: No, you need to implement your own load balancing and HA layer between your clients and the Percona XtraDB Cluster server. Examples mentioned in the webinar include HAProxy and F5 BigIP.
—
Q: What’s the best version of Percona XtraDB Cluster regarding InnoDB performance?
A: In general for best performance you should be using the latest release of Percona XtraDB Cluster 5.6, which is currently 5.6.24, released on June 3rd, 2015.
—
Q: Can I redirect my writes in Percona XtraDB Cluster to multiple nodes using HAProxy? While trying with sysbench, I can see that writes go only to the first node in PXC while reads do go to multiple nodes.
A: Yes you can configure HAProxy to distribute both reads and writes across all of your nodes in a Percona XtraDB Cluster environment. Perhaps SysBench created only one database connection for all writes, and so haproxy kept those confined to only one host. You may want to experiment with parallel_prepare.lua.
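As a rough starting point, a minimal haproxy.cfg sketch that spreads both reads and writes across all three nodes could look like this (node names, IPs, and the clustercheck port 9200 are placeholders to adapt to your environment):

listen pxc-rw 0.0.0.0:3306
    mode tcp
    balance leastconn
    option httpchk
    server node1 10.0.0.1:3306 check port 9200 inter 2000 rise 3 fall 3
    server node2 10.0.0.2:3306 check port 9200 inter 2000 rise 3 fall 3
    server node3 10.0.0.3:3306 check port 9200 inter 2000 rise 3 fall 3

Keep in mind that writing to all nodes at once increases the chance of certification conflicts on hot rows, which is why many deployments keep a single writer with the other nodes as backup.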
—
Q: What’s the optimal HA for small datasets (db is less than 10gb)?
A: The optimal HA deployment for small datasets would be dependent on your level of recovery required (tolerance for loss of transactions) and time that you can be in an unavailable state (seconds, minutes, hours?). Unfortunately there isn’t a single answer to your question, however, if you are interested in further discussion on this point Percona would be happy to coordinate a time to speak. Please feel free to contact me directly and we can continue the conversation at michael.coburn@percona.com.
—
Q: Is there a concept of local master vs. remote master with PXC?
A: No there is no concept of local vs remote master. All nodes in a Percona XtraDB Cluster can now be classified as Master, regardless of their proximity to the clients.
—
Q: Are there any concerns when considering AWS RDS or AURORA DB for MySQL HA in the Cloud?
A: Regarding AWS RDS, yes, this is a good option for MySQL HA in the Cloud. I unfortunately haven’t worked with Aurora DB that much yet, so I don’t have an opinion on its suitability for HA in the Cloud.
—
Q: We tried out PXC a while back and it used to lock everything whenever any DDL was done. Has that changed?
A: We would have to look at the specifics of your environment, however, there have been numerous improvements in the 1½ years of development since Percona XtraDB Cluster went Generally Available (GA) on January 30th, 2014 in version 5.6.15.
—
Q: Is using the arbitrator a must?
A: No the arbitrator role via the garbd daemon is generally only used when operating in a minimal environment of two nodes that contain the data and you need a third node for quorum but don’t want to store the data a third time.
—
Q: Can we do a cluster across different zones?
A: Yes you can. However be aware that the latency incurred for all cluster certification operations will be impacted by the round trip time between nodes.
—
Q: Does PXC also support the MyISAM database?
A: No, Percona XtraDB Cluster does not support any storage engine other than InnoDB as of PXC 5.6.
—
Q: How do load balancers affect the throughput in a Galera-based setup given that the write would be limited by the slowest node?
A: Load balancers will introduce some measure of additional latency in the form of CPU time in the load balancer layer as it evaluates its own ruleset, and also in network time due to additional hop via load balancer. Otherwise there should be no perceptible difference in the write throughput of a Percona XtraDB Cluster with and without a load balancer as it relates to the “slowest node” factor.
—
Q: Have you used MaxScale yet? If so, what are your thoughts?
A: Unfortunately I haven’t used MaxScale however Yves Trudeau, Percona Principal Architect, has recently written about MaxScale in this blog post.
—
Q: How do you configure timeout and maintain persistent connection to HAProxy?
A: I would encourage you to refer to the HAProxy Documentation.
Making HAProxy 1.5 replication lag aware in MySQL
HAProxy is frequently used as a software load balancer in the MySQL world. Peter Boros, in a past post, explained how to set it up with Percona XtraDB Cluster (PXC) so that it only sends queries to available nodes. The same approach can be used in a regular master-slave setup to spread the read load across multiple slaves. However, with MySQL replication, another factor comes into play: replication lag. In this case the approach mentioned for Percona XtraDB Cluster does not work as well, because the check we presented only returns ‘up’ or ‘down’. We would like to be able to tune the weight of a replica inside HAProxy depending on its replication lag. This is what we will do in this post using HAProxy 1.5.
Agent checks in HAProxy
HAProxy 1.5 allows us to run an agent check, which is a check that can be added to a regular health check. The benefit of agent checks is that the return value can be ‘up’ or ‘down’, but also a weight.
What is an agent? It is simply a program that can be accessed from a TCP connection on a given port. So if we want to run an agent on a MySQL server that will:
- Mark the server as down in HAProxy if replication is not working
- Set the weight to 100% if the replication lag is < 10s
- Set the weight to 50% if the replication lag is >= 10s and < 60s
- Set the weight to 5% in all other situations
We can use a script like this:
$ less agent.php
<?php
// TCP port the agent listens on and MySQL port, both passed on the command line
$port       = $argv[1];
$mysql_port = $argv[2];
// MySQL connection settings (the haproxy user is created below)
$mysql    = "/usr/bin/mysql";
$user     = "haproxy";
$password = "haproxy_pwd";
$query    = "show slave status";

function set_weight($lag) {
    // no value or NULL means replication is not working: mark the server as down
    if (!isset($lag) || $lag === '' || $lag === 'NULL') {
        return "down";
    } elseif ($lag < 10) {
        return "up 100%";
    } elseif ($lag >= 10 && $lag < 60) {
        return "up 50%";
    } else
        return "up 5%";
}

set_time_limit(0);
$socket = stream_socket_server("tcp://127.0.0.1:$port", $errno, $errstr);
if (!$socket) {
    echo "$errstr ($errno)\n";
} else {
    while ($conn = stream_socket_accept($socket, 9999999999999)) {
        $cmd = "$mysql -h127.0.0.1 -u$user -p$password -P$mysql_port -Ee \"$query\" | grep Seconds_Behind_Master | cut -d ':' -f2 | tr -d ' '";
        exec("$cmd", $lag);
        $weight = set_weight($lag[0]);
        unset($lag);
        fputs($conn, $weight);
        fclose($conn);
    }
    fclose($socket);
}
?>
If you want the script to be accessible from port 6789 and connect to a MySQL instance running on port 3306, run:
$ php agent.php 6789 3306
You will also need a dedicated MySQL user:
mysql> GRANT REPLICATION CLIENT ON *.* TO 'haproxy'@'127.0.0.1' IDENTIFIED BY 'haproxy_pwd';
When the agent is started, you can check that it is working properly:
# telnet 127.0.0.1 6789
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
up 100%
Connection closed by foreign host.
Assuming it is run locally on the app server, that 2 replicas are available (192.168.10.2 and 192.168.10.3) and that the application will send all reads on port 3307, you will define a frontend and a backend in your HAProxy configuration like this:
frontend read_only-front
    bind *:3307
    mode tcp
    option tcplog
    log global
    default_backend read_only-back

backend read_only-back
    mode tcp
    balance leastconn
    server slave1 192.168.10.2 weight 100 check agent-check agent-port 6789 inter 1000 rise 1 fall 1 on-marked-down shutdown-sessions
    server slave2 192.168.10.3 weight 100 check agent-check agent-port 6789 inter 1000 rise 1 fall 1 on-marked-down shutdown-sessions
Demo
Now that everything is set up, let’s see how HAProxy can dynamically change the weight of the servers depending on the replication lag.
No lag
# Slave1
$ mysql -Ee "show slave status" | grep Seconds_Behind_Master
        Seconds_Behind_Master: 0

# Slave2
$ mysql -Ee "show slave status" | grep Seconds_Behind_Master
        Seconds_Behind_Master: 0

# HAProxy
$ echo "show stat" | socat stdio /run/haproxy/admin.sock | cut -d ',' -f1,2,18,19
# pxname,svname,status,weight
read_only-front,FRONTEND,OPEN,
read_only-back,slave1,UP,100
read_only-back,slave2,UP,100
read_only-back,BACKEND,UP,200
Slave1 lagging
# Slave1
$ mysql -Ee "show slave status" | grep Seconds_Behind_Master
        Seconds_Behind_Master: 25

# Slave2
$ mysql -Ee "show slave status" | grep Seconds_Behind_Master
        Seconds_Behind_Master: 0

# echo "show stat" | socat stdio /run/haproxy/admin.sock | cut -d ',' -f1,2,18,19
# pxname,svname,status,weight
read_only-front,FRONTEND,OPEN,
read_only-back,slave1,UP,50
read_only-back,slave2,UP,100
read_only-back,BACKEND,UP,150
Slave2 down
# Slave1
$ mysql -Ee "show slave status" | grep Seconds_Behind_Master
        Seconds_Behind_Master: 0

# Slave2
$ mysql -Ee "show slave status" | grep Seconds_Behind_Master
        Seconds_Behind_Master: NULL

# echo "show stat" | socat stdio /run/haproxy/admin.sock | cut -d ',' -f1,2,18,19
# pxname,svname,status,weight
read_only-front,FRONTEND,OPEN,
read_only-back,slave1,UP,100
read_only-back,slave2,DOWN (agent),100
read_only-back,BACKEND,UP,100
Conclusion
Agent checks are a nice addition in HAProxy 1.5. The setup presented above is a bit simplistic, though: for instance, if HAProxy fails to connect to the agent, it will not mark the corresponding server as down. It is therefore recommended to keep a regular health check along with the agent check.
Astute readers will also notice that in this configuration, if replication is broken on all nodes, HAProxy will stop sending reads. This may not be the best solution. Possible options are: stop the agent and mark the servers as UP using the stats socket or add the master as a backup server.
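For the latter option, the backend shown earlier could be extended like this (the master’s address, 192.168.10.1, is an assumption; only the last server line is new):

backend read_only-back
    mode tcp
    balance leastconn
    server slave1 192.168.10.2 weight 100 check agent-check agent-port 6789 inter 1000 rise 1 fall 1 on-marked-down shutdown-sessions
    server slave2 192.168.10.3 weight 100 check agent-check agent-port 6789 inter 1000 rise 1 fall 1 on-marked-down shutdown-sessions
    # only receives traffic when both replicas are marked down
    server master 192.168.10.1 check backup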
And as a final note, you can edit the code of the agent so that replication lag is measured with Percona Toolkit’s pt-heartbeat instead of Seconds_Behind_Master.
HAProxy: Give me some logs on CentOS 6.5!
HAProxy is frequently used as a load-balancer in front of a Galera cluster. While diagnosing an issue with HAProxy configuration, I realized that logging doesn’t work out of the box on CentOS 6.5. Here is a simple recipe to fix the issue.
If you look at the top of /etc/haproxy/haproxy.cfg, you will see something like:
global
    log 127.0.0.1 local2
    [...]
This means that HAProxy will send its messages to rsyslog on 127.0.0.1. But by default, rsyslog doesn’t listen on any address, hence the issue.
Let’s edit /etc/rsyslog.conf and uncomment these lines:
$ModLoad imudp
$UDPServerRun 514
This will make rsyslog listen on UDP port 514 for all IP addresses. Optionally you can limit to 127.0.0.1 by adding:
$UDPServerAddress 127.0.0.1
Now create a /etc/rsyslog.d/haproxy.conf file containing:
local2.* /var/log/haproxy.log
You can of course be more specific and create separate log files according to the level of messages:
local2.=info      /var/log/haproxy-info.log
local2.notice     /var/log/haproxy-allbutinfo.log
Then restart rsyslog and see that log files are created:
# service rsyslog restart
Shutting down system logger:                               [  OK  ]
Starting system logger:                                    [  OK  ]

# ls -l /var/log | grep haproxy
-rw-------. 1 root root 131  3 oct. 10:43 haproxy-allbutinfo.log
-rw-------. 1 root root 106  3 oct. 10:42 haproxy-info.log
Now you can start your debugging session!
Using keepalived for HA on top of Percona XtraDB Cluster
Percona XtraDB Cluster (PXC) itself manages quorum and node failure. Minorities of nodes in a network partition situation will move themselves into a Non-primary state and not allow any DB activity. Nodes in such a state will be easily detectable via SHOW GLOBAL STATUS variables.
It’s common to use HAProxy with PXC for load balancing purposes, but what if you are planning to just send traffic to a single node? We would normally use keepalived to provide HA for HAProxy itself, and keepalived supports track_scripts that can monitor whatever we want, so why not just monitor PXC directly?
If we have clustercheck working on all hosts:
mysql> GRANT USAGE ON *.* TO 'clustercheck'@'localhost' IDENTIFIED BY PASSWORD '*2470C0C06DEE42FD1618BB99005ADCA2EC9D1E19';

[root@node1 ~]# /usr/bin/clustercheck clustercheck password 0; echo $?
HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 40

Percona XtraDB Cluster Node is synced.
0
Then we can just install keepalived and put this config on all nodes:
vrrp_script chk_pxc {
    script "/usr/bin/clustercheck clustercheck password 0"
    interval 1
}
vrrp_instance PXC {
    state MASTER
    interface eth1
    virtual_router_id 51
    priority 100
    nopreempt
    virtual_ipaddress {
        192.168.70.100
    }
    track_script {
        chk_pxc
    }
    notify_master "/bin/echo 'now master' > /tmp/keepalived.state"
    notify_backup "/bin/echo 'now backup' > /tmp/keepalived.state"
    notify_fault  "/bin/echo 'now fault' > /tmp/keepalived.state"
}
And start the keepalived service. The virtual IP above will be brought up on an active node in the cluster and moved around if clustercheck fails.
[root@node1 ~]# cat /tmp/keepalived.state
now backup
[root@node2 ~]# cat /tmp/keepalived.state
now master
[root@node3 ~]# cat /tmp/keepalived.state
now backup

[root@node2 ~]# ip a l | grep 192.168.70.100
    inet 192.168.70.100/32 scope global eth1

[root@node3 ~]# mysql -h 192.168.70.100 -u test -ptest test -e "show global variables like 'wsrep_node_name'"
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| wsrep_node_name | node2 |
+-----------------+-------+
If I shutdown PXC on node2:
[root@node2 keepalived]# service mysql stop
Shutting down MySQL (Percona XtraDB Cluster)....... SUCCESS!

[root@node2 ~]# /usr/bin/clustercheck clustercheck password 0; echo $?
HTTP/1.1 503 Service Unavailable
Content-Type: text/plain
Connection: close
Content-Length: 44

Percona XtraDB Cluster Node is not synced.
1

[root@node1 ~]# cat /tmp/keepalived.state
now master
[root@node2 ~]# cat /tmp/keepalived.state
now fault
[root@node3 ~]# cat /tmp/keepalived.state
now backup

[root@node1 ~]# ip a l | grep 192.168.70.100
    inet 192.168.70.100/32 scope global eth1
[root@node2 ~]# ip a l | grep 192.168.70.100
[root@node2 ~]#
[root@node3 ~]# ip a l | grep 192.168.70.100
[root@node3 ~]#
We can see node2 moves to a FAULT state and the VIP moves to node1 instead. This provides us with a very simple way to do Application to PXC high availability.
A few additional notes:
- You can disqualify donors (i.e., make clustercheck fail on Donor/Desynced nodes) by setting the 3rd argument to clustercheck to 0. Setting this to 1 means Donors can retain the VIP.
- Each keepalived instance monitors its own state only, hence the @localhost GRANT. This is much cleaner than exposing clustercheck as a web port via xinetd.
- It’s possible to do more complex things with keepalived, like multiple VIPs, node weighting, etc.
- Keepalived can track over multiple network interfaces (in this example, just eth1) for better reliability.
Percona XtraDB Cluster (PXC) in the real world: Share your use cases!
The aim of this post is to enumerate real-world usage of Percona XtraDB Cluster (PXC), and also to solicit use cases from the readers. One of the prominent usages in a production environment that we have come across (and that our Percona consultants have assisted with) is that of HP Cloud. There is a post about it here by Patrick Galbraith of HP. It describes their deployment of PXC for HP Cloud DNS and focuses on the key aspects of a synchronous replication setup with high-availability guarantees like split-brain immunity.
Nobody likes to debug async replication while it’s broken, or to do a master-master/master-slave switchover when the master is dying/dead. Yes, there are wrappers/scripts around this to make life easier; however, wouldn’t it be nice if this were built into the system itself? PXC, based on Galera, strives to provide that. Scaling makes sense only when the addition/removal of hosts from a cluster or an HA setup is simple and uncomplicated.
Their post focuses on the following aspects:
- Initial setup
- Setup of other nodes with SST (Xtrabackup SST)
- Integration of chef with PXC
- Finally, integration of HAProxy as a loadbalancer.
To elucidate, their initial setup goes into bootstrapping the first node. Note that in the cloud environment the other nodes are not known until they are brought up, hence bootstrapping with an empty gcomm:// is done for the first node by chef. The second node is then added, which SSTs from node1 (based on the gcomm://node1 setting of node2) through Xtrabackup SST (state snapshot transfer). Node3 subsequently joins the cluster with node1 and node2 in its gcomm:// (since by this time node1 and node2 are up). After this, a subsequent run of chef-client is done to update the cnf files with the IP addresses of the members (excluding the node itself). The rationale behind this is that when a node is restarted (and others are up when it comes back) it joins the cluster seamlessly. I would like to note here that we are adding a bootstrap parameter to PXC so that later modifications like these to the cnf files are not required and the membership can be preset during cluster startup itself. The only caveat is that the node information – IP address or hostname – should be known in advance (the node itself needn’t be up), which may not be feasible in a cloud environment.
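To illustrate that bootstrap sequence, the wsrep_cluster_address values would look roughly like this (node1/node2/node3 are the generic names used above; the exact cnf layout depends on their chef templates):

# node1, first start only (bootstrap)
wsrep_cluster_address = gcomm://

# node2, joining and taking an SST from node1
wsrep_cluster_address = gcomm://node1

# node3, joining once node1 and node2 are up
wsrep_cluster_address = gcomm://node1,node2

# after the subsequent chef-client run, each node lists the other members,
# e.g., on node1:
wsrep_cluster_address = gcomm://node2,node3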
Next, the SST. Xtrabackup SST is used there. SST matters a lot, because not only is it used during initial node setup, it is also required when a node has been down for a while and IST (incremental state transfer) is not feasible. It also helps when node data integrity is compromised. So, naturally, the duration of SST is paramount. We recommend Xtrabackup SST for its reduced locking period (which means the donor is blocked for a shorter time). By using Xtrabackup for SST, you also get its benefits like compression, parallel streaming, encryption, and compact backups, which can be used for SST. (Note: the wsrep_sst_xtrabackup in 5.5.30 can’t do those except parallel streaming; the one in 5.5.31 will handle them all; also, XtraBackup 2.1 is required for most of them.)
Finally, the HAProxy. HAProxy is one of the load balancers recommended for use with PXC. The other one is glb. HAProxy is used with xinetd on the node along with a script which checks PXC for its sync status. As referenced in that post, you can refer to a post by Peter Boros (“Percona XtraDB Cluster reference architecture with HaProxy“) for details. In their setup they have automated this with an HAProxy in each AZ (Availability Zone) for the API server. To add, we are looking at reducing the overhead here, through steps like replacing xinetd and clustercheck with a single serving process (we are adding one in 5.5.31), looking for optimizations with HAProxy to account for high connection rates, and using pacemaker with PXC. The goal is to reduce the overhead of status checks, mainly on the node. You can also look at this PLMCE talk for HAProxy deployment strategies with PXC.
To conclude, it is interesting to note that they have been able to manage this with a small team. That strongly implies scalability of resources – you scale more with less, and that is how it should be. We would like to hear from you about your architectural setup around PXC – any challenges you faced (and horror stories, if any), any special deployment methodologies you employed (Puppet, Chef, Salt, Ansible, etc.), and finally any suggestions.