Jan
19
2017
--

Setup ProxySQL for High Availability (not a Single Point of Failure)

ProxySQL for High Availability

In this blog post, we’ll look at how to set up ProxySQL for high availability.

During the last few months, we’ve had a lot of opportunities to present and discuss a very powerful tool that will become more and more used in the architectures supporting ProxySQL for High AvailabilityMySQL: ProxySQL.

ProxySQL is becoming more flexible, solid, performant and used every day (http://www.proxysql.com/ and recent http://www.proxysql.com/compare). You can use ProxySQL for high availability.

The tool is a winner when compared to similar ones, and we should all have a clear(er) idea of how to integrate it in our architectures in order to achieve the best results.

The first thing to keep in mind is that ProxySQL doesn’t natively support any high availability solution. We can setup a cluster of MySQL(s) and achieve four or even five nines of HA. But if we include ProxySQL as it is, and as a single block, our HA has a single point of failure (SPOF) that can drag us down in the case of a crash.

ProxySQL for High AvailabilityTo solve this, the most common solution is setting up ProxySQL as part of a tile architecture, where Application/ProxySQL are deployed together.

This is a good solution for some cases, and it for sure reduces the network hops, but it might be less than practical when our architecture has a very high number of tiles (hundreds of application servers, which is not so unusual nowadays).
In that case, managing ProxySQL is challenging. But more problematically, ProxySQL must perform several checks on the destination servers (MySQL). If we have 400 instances of ProxySQL, we end up keeping our databases busy just performing the checks.

In short, this is not a smart move.

ProxySQL for High AvailabilityAnother possible approach is to have two layers of ProxySQL, one close to the application and another in the middle to connect to the database.

I personally don’t like this approach for many reasons. The most relevant reasons are that this approach creates additional complexity in the management of the platform, and it adds network hops.

So what can be done?

I love the KISS principle because I am lazy and don’t want to reinvent a wheel someone else has already invented. I also like it when my customers don’t need to depend on me or any other colleagues after I am done and gone. They must be able to manage, understand and fix their environment by themselves.

To keep things simple, here is my checklist:

  • Exclude the cases where a tile (application/ProxySQL) makes sense
  • Exclude the “in the cloud” cases where tools like ELB (Elastic Load Balancer) exist
  • Exclude architecture that already includes a balancer

What can I use for the remaining cases?

The answer comes with combining existing solutions and existing blocks: KeepAlived + ProxySQl + MySQL.

For an explanation of KeepAlived, visit http://www.keepalived.org/.

Short description
“Keepalived is a routing software written in C. The main goal of this project is to provide simple and robust facilities for load balancing and high-availability to Linux system and Linux-based infrastructures. The load balancing framework relies on well-known and widely used Linux Virtual Server (IPVS) kernel module providing Layer4 load balancing. Keepalived implements a set of checkers to dynamically and adaptively maintain and manage load-balanced server pool according to their health. On the other hand, high-availability is achieved by VRRP protocol. VRRP is a fundamental brick for router failover. Also, Keepalived implements a set of hooks to the VRRP finite state machine providing low-level and high-speed protocol interactions. Keepalived frameworks can be used independently or all together to provide resilient infrastructures.”

Bingo! This is exactly what we need for our ProxySQL setup.

Below, I will explain how to set up:

  • A simple solution based on a single VIP
  • A more complex solution using multiple VIPs
  • An even more complex solution using virtual VIPs and virtual servers

All we want to do is to prevent ProxySQL from becoming a SPOF. And while doing that, we need to reduce network hops as much as possible (keeping the solution SIMPLE).

Another important concept to keep in mind is that ProxySQL (re)start takes place in less than a second. This means that if it crashes, assuming it can be restarted by the angel process, doing that is much more efficient than any kind of failover mechanism. As such, your solution plan should keep in mind the ~1 second time of ProxySQL restart as a baseline.

Ready? Let’s go.

Setup

Choose three machines to host the combination of Keepalived and ProxySQL.

In the following example, I will use three machines for ProxySQL and Keepalived, and three hosting Percona XtraDB Cluster. You can have the Keepalived+ProxySQL whenever you like (even on the same Percona XtraDB Cluster box).

For the following examples, we will have:

PXC
node1 192.168.0.5 galera1h1n5
node2 192.168.0.21 galera2h2n21
node3 192.168.0.231 galera1h3n31
 
ProxySQL-Keepalived
test1 192.168.0.11
test2 192.168.0.12
test3 192.168.0.235
 
VIP 192.168.0.88 /89/90
 

To check, I will use this table (please create it in your MySQL server):

DROP TABLE  test.`testtable2`;
 CREATE TABLE test.`testtable2` (
  `autoInc` bigint(11) NOT NULL AUTO_INCREMENT,
  `a` varchar(100) COLLATE utf8_bin NOT NULL,
  `b` varchar(100) COLLATE utf8_bin NOT NULL,
  `host` varchar(100) COLLATE utf8_bin NOT NULL,
  `userhost` varchar(100) COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`autoInc`)
) ENGINE=InnoDB ROW_FORMAT=DYNAMIC;

And this bash TEST command to use later:

while [ 1 ];do export mydate=$(date +'%Y-%m-%d %H:%M:%S.%6N'); mysql --defaults-extra-file=./my.cnf -h 192.168.0.88 -P 3311  --skip-column-names -b -e "BEGIN;set @userHost='a';select concat(user,'_', host) into @userHost from information_schema.processlist  where user = 'load_RW' limit 1;insert into test.testtable2 values(NULL,'$mydate',SYSDATE(6),@@hostname,@userHost);commit;select * from test.testtable2 order by 1 DESC limit 1" ; sleep 1;done
  1. Install ProxySQL (refer to https://github.com/sysown/proxysql/wiki#installation)
  2. Install Keepalived (yum install keepalived; apt-get install keepalived)
  3. Setup ProxySQL users and servers

Once you have your ProxySQL up (run the same configuration on all ProxySQL nodes, it is much simpler), connect to the Admin interface and execute the following:

DELETE FROM mysql_replication_hostgroups WHERE writer_hostgroup=500 ;
DELETE FROM mysql_servers WHERE hostgroup_id IN (500,501);
 
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.0.5',500,3306,1000000000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.0.5',501,3306,100);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.0.21',500,3306,1000000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.0.21',501,3306,1000000000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.0.231',500,3306,100);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.0.231',501,3306,1000000000);
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
 
DELETE FROM mysql_users WHERE username='load_RW';
INSERT INTO mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) VALUES ('load_RW','test',1,500,'test',1);
LOAD MYSQL USERS TO RUNTIME;SAVE MYSQL USERS TO DISK;
 
DELETE FROM mysql_query_rules WHERE rule_id IN (200,201);
INSERT INTO mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply) VALUES(200,'load_RW',501,1,3,'^SELECT.*FOR UPDATE',1);
INSERT INTO mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply) VALUES(201,'load_RW',501,1,3,'^SELECT ',1);
 
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;

Create a my.cnf file in your default dir with:

[mysql]
user=load_RW
password=test

Simple Setup using a single VIP, 3 ProxySQL and 3 Galera nodes

ProxySQL for High Availability

First, setup the Keepalived configuration file (/etc/keepalived/keepalived.conf):

global_defs {
  # Keepalived process identifier
  lvs_id proxy_HA
}
# Script used to check if Proxy is running
vrrp_script check_proxy {
  script "killall -0 proxysql"
  interval 2
  weight 2
}
# Virtual interface
# The priority specifies the order in which the assigned interface to take over in a failover
vrrp_instance VI_01 {
  state MASTER
  interface em1
  virtual_router_id 51
  priority <calculate on the WEIGHT for each node>
 
  # The virtual ip address shared between the two loadbalancers
  virtual_ipaddress {
    192.168.0.88 dev em1
  }
  track_script {
    check_proxy
  }
}

Given the above, and given I want to have TEST1 as the main priority, we will set as:

test1 = 101
test2 = 100
test3 = 99

Modify the config in each node with the above values and (re)start Keepalived.

If all is set correctly, you will see the following in the system log of the TEST1 machine:

Jan 10 17:56:56 mysqlt1 systemd: Started LVS and VRRP High Availability Monitor.
Jan 10 17:56:56 mysqlt1 Keepalived_healthcheckers[6183]: Configuration is using : 6436 Bytes
Jan 10 17:56:56 mysqlt1 Keepalived_healthcheckers[6183]: Using LinkWatch kernel netlink reflector...
Jan 10 17:56:56 mysqlt1 Keepalived_vrrp[6184]: Configuration is using : 63090 Bytes
Jan 10 17:56:56 mysqlt1 Keepalived_vrrp[6184]: Using LinkWatch kernel netlink reflector...
Jan 10 17:56:56 mysqlt1 Keepalived_vrrp[6184]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Jan 10 17:56:56 mysqlt1 Keepalived_vrrp[6184]: VRRP_Script(check_proxy) succeeded
Jan 10 17:56:57 mysqlt1 Keepalived_vrrp[6184]: VRRP_Instance(VI_01) Transition to MASTER STATE
Jan 10 17:56:57 mysqlt1 Keepalived_vrrp[6184]: VRRP_Instance(VI_01) Received lower prio advert, forcing new election
Jan 10 17:56:57 mysqlt1 Keepalived_vrrp[6184]: VRRP_Instance(VI_01) Received higher prio advert
Jan 10 17:56:57 mysqlt1 Keepalived_vrrp[6184]: VRRP_Instance(VI_01) Entering BACKUP STATE
Jan 10 17:56:58 mysqlt1 Keepalived_vrrp[6184]: VRRP_Instance(VI_01) forcing a new MASTER election
...
Jan 10 17:57:00 mysqlt1 Keepalived_vrrp[6184]: VRRP_Instance(VI_01) Transition to MASTER STATE
Jan 10 17:57:01 mysqlt1 Keepalived_vrrp[6184]: VRRP_Instance(VI_01) Entering MASTER STATE <-- MASTER
Jan 10 17:57:01 mysqlt1 Keepalived_vrrp[6184]: VRRP_Instance(VI_01) setting protocol VIPs.
Jan 10 17:57:01 mysqlt1 Keepalived_healthcheckers[6183]: Netlink reflector reports IP 192.168.0.88 added
Jan 10 17:57:01 mysqlt1 avahi-daemon[937]: Registering new address record for 192.168.0.88 on em1.IPv4.
Jan 10 17:57:01 mysqlt1 Keepalived_vrrp[6184]: VRRP_Instance(VI_01) Sending gratuitous ARPs on em1 for 192.168.0.88

In the other two machines:

Jan 10 17:56:59 mysqlt2 Keepalived_vrrp[13107]: VRRP_Instance(VI_01) Entering BACKUP STATE <--- 

Which means the node is there as a Backup.  ?

Now it’s time to test our connection to our ProxySQL pool. From an application node, or just from your laptop, open three terminals. In each one:

watch -n 1 'mysql -h <IP OF THE REAL PROXY (test1|test2|test3)> -P 3310 -uadmin -padmin -t -e "select * from stats_mysql_connection_pool where hostgroup in (500,501,9500,9501) order by hostgroup,srv_host ;" -e " select srv_host,command,avg(time_ms), count(ThreadID) from stats_mysql_processlist group by srv_host,command;" -e "select * from stats_mysql_commands_counters where  Total_Time_us > 0;"'

Unless you are already sending queries to proxies, the proxies are doing nothing. Time to start the test bash as I indicated above. If everything is working correctly, you will see the bash command reporting this:

+----+----------------------------+----------------------------+-------------+----------------------------+
| 49 | 2017-01-10 18:12:07.739152 | 2017-01-10 18:12:07.733282 | galera1h1n5 | load_RW_192.168.0.11:33273 |
+----+----------------------------+----------------------------+-------------+----------------------------+
  ID    execution time in the bash   exec time inside mysql     node hostname   user and where the connection is coming from

The other three running bash commands will show that ONLY the ProxySQL in TEST1 is currently getting/serving requests, because it is the one with the VIP:

+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| hostgroup | srv_host      | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 500       | 192.168.0.21  | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 629        |
| 500       | 192.168.0.231 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 510        |
| 500       | 192.168.0.5   | 3306     | ONLINE | 0        | 0        | 3      | 0       | 18      | 882             | 303             | 502        |
| 501       | 192.168.0.21  | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 629        |
| 501       | 192.168.0.231 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 510        |
| 501       | 192.168.0.5   | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0               | 502        |
+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
+---------+---------------+-----------+-----------+-----------+---------+---------+----------+----------+-----------+-----------+--------+--------+---------+----------+
| Command | Total_Time_us | Total_cnt | cnt_100us | cnt_500us | cnt_1ms | cnt_5ms | cnt_10ms | cnt_50ms | cnt_100ms | cnt_500ms | cnt_1s | cnt_5s | cnt_10s | cnt_INFs |
+---------+---------------+-----------+-----------+-----------+---------+---------+----------+----------+-----------+-----------+--------+--------+---------+----------+
| BEGIN   | 9051          | 3         | 0         | 0         | 0       | 3       | 0        | 0        | 0         | 0         | 0      | 0      | 0       | 0        |
| COMMIT  | 47853         | 3         | 0         | 0         | 0       | 0       | 0        | 3        | 0         | 0         | 0      | 0      | 0       | 0        |
| INSERT  | 3032          | 3         | 0         | 0         | 1       | 2       | 0        | 0        | 0         | 0         | 0      | 0      | 0       | 0        |
| SELECT  | 8216          | 9         | 3         | 0         | 3       | 3       | 0        | 0        | 0         | 0         | 0      | 0      | 0       | 0        |
| SET     | 2154          | 3         | 0         | 0         | 3       | 0       | 0        | 0        | 0         | 0         | 0      | 0      | 0       | 0        |
+---------+---------------+-----------+-----------+-----------+---------+---------+----------+----------+-----------+-----------+--------+--------+---------+----------+

This is all as expected. Time to see if the failover-failback works along the chain.

Let us kill the ProxySQL on TEST1 while the test bash command is running.

killall -9 proxysql

Here is what you will get:

+----+----------------------------+----------------------------+-------------+----------------------------+
| 91 | 2017-01-10 18:19:06.188233 | 2017-01-10 18:19:06.183327 | galera1h1n5 | load_RW_192.168.0.11:33964 |
+----+----------------------------+----------------------------+-------------+----------------------------+
ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.88' (111)
+----+----------------------------+----------------------------+-------------+----------------------------+
| 94 | 2017-01-10 18:19:08.250093 | 2017-01-10 18:19:11.250927 | galera1h1n5 | load_RW_192.168.0.12:39635 | <-- note
+----+----------------------------+----------------------------+-------------+----------------------------+

The source changed, but not the Percona XtraDB Cluster node. If you check the system log for TEST1:

Jan 10 18:19:06 mysqlt1 Keepalived_vrrp[6184]: VRRP_Script(check_proxy) failed
Jan 10 18:19:07 mysqlt1 Keepalived_vrrp[6184]: VRRP_Instance(VI_01) Received higher prio advert
Jan 10 18:19:07 mysqlt1 Keepalived_vrrp[6184]: VRRP_Instance(VI_01) Entering BACKUP STATE
Jan 10 18:19:07 mysqlt1 Keepalived_vrrp[6184]: VRRP_Instance(VI_01) removing protocol VIPs.
Jan 10 18:19:07 mysqlt1 Keepalived_healthcheckers[6183]: Netlink reflector reports IP 192.168.0.88 removed

While on TEST2:

Jan 10 18:19:08 mysqlt2 Keepalived_vrrp[13107]: VRRP_Instance(VI_01) Transition to MASTER STATE
Jan 10 18:19:09 mysqlt2 Keepalived_vrrp[13107]: VRRP_Instance(VI_01) Entering MASTER STATE
Jan 10 18:19:09 mysqlt2 Keepalived_vrrp[13107]: VRRP_Instance(VI_01) setting protocol VIPs.
Jan 10 18:19:09 mysqlt2 Keepalived_healthcheckers[13106]: Netlink reflector reports IP 192.168.0.88 added
Jan 10 18:19:09 mysqlt2 Keepalived_vrrp[13107]: VRRP_Instance(VI_01) Sending gratuitous ARPs on em1 for 192.168.0.88

Simple, and elegant. No need to re-invent the wheel to get a smoothly working process.

The total time for the recovery for the ProxySQL crash is about 5.06 seconds, considering the wider window (last application start, last recovery in Percona XtraDB Cluster 2017-01-10 18:19:06.188233|2017-01-10 18:19:11.250927).

As such this is the worse scenario, keeping in mind that we run the check for the ProxySQL every two seconds (real recovery max window 5-2=3 sec).

OK, what about fail-back?

Let us restart the proxysql service:

/etc/init.d/proxysql start (or systemctl)

Here’s the output:

+-----+----------------------------+----------------------------+-------------+----------------------------+
| 403 | 2017-01-10 18:29:34.550304 | 2017-01-10 18:29:34.545970 | galera1h1n5 | load_RW_192.168.0.12:40330 |
+-----+----------------------------+----------------------------+-------------+----------------------------+
+-----+----------------------------+----------------------------+-------------+----------------------------+
| 406 | 2017-01-10 18:29:35.597984 | 2017-01-10 18:29:38.599496 | galera1h1n5 | load_RW_192.168.0.11:34640 |
+-----+----------------------------+----------------------------+-------------+----------------------------+

The worst recovery time is 4.04 seconds, of which 2 seconds are a delay due to the check interval.

Of course, the test is running every second and is running a single operation. As such, the impact is minimal (no error in fail-back) and the recovery longer.

Let’s check another thing: is the failover working as expected? Test1 -> 2 -> 3? Let’s kill 1 and 2 and see:

Kill Test1 :
+-----+----------------------------+----------------------------+-------------+----------------------------+
| 448 | 2017-01-10 18:35:43.092878 | 2017-01-10 18:35:43.086484 | galera1h1n5 | load_RW_192.168.0.11:35240 |
+-----+----------------------------+----------------------------+-------------+----------------------------+
+-----+----------------------------+----------------------------+-------------+----------------------------+
| 451 | 2017-01-10 18:35:47.188307 | 2017-01-10 18:35:50.191465 | galera1h1n5 | load_RW_192.168.0.12:40935 |
+-----+----------------------------+----------------------------+-------------+----------------------------+
...
Kill Test2
+-----+----------------------------+----------------------------+-------------+----------------------------+
| 463 | 2017-01-10 18:35:54.379280 | 2017-01-10 18:35:54.373331 | galera1h1n5 | load_RW_192.168.0.12:40948 |
+-----+----------------------------+----------------------------+-------------+----------------------------+
+-----+----------------------------+----------------------------+-------------+-----------------------------+
| 466 | 2017-01-10 18:36:08.603754 | 2017-01-10 18:36:09.602075 | galera1h1n5 | load_RW_192.168.0.235:33268 |
+-----+----------------------------+----------------------------+-------------+-----------------------------+

This image is where you should be at the end:

ProxySQL for High Availability

Where the last server is Test3. In this case, I have killed one server immediately after the other. Keepalived had to take a bit longer failing over, but it still did it correctly and following the planned chain.

Fail-back is smooth as usual:

+-----+----------------------------+----------------------------+-------------+-----------------------------+
| 502 | 2017-01-10 18:39:18.749687 | 2017-01-10 18:39:18.749688 | galera1h1n5 | load_RW_192.168.0.235:33738 |
+-----+----------------------------+----------------------------+-------------+-----------------------------+
+-----+----------------------------+----------------------------+-------------+----------------------------+
| 505 | 2017-01-10 18:39:19.794888 | 2017-01-10 18:39:22.800800 | galera1h1n5 | load_RW_192.168.0.11:35476 |
+-----+----------------------------+----------------------------+-------------+----------------------------+

Let’s see another case

The case above is simple. But as a caveat, I can only access one ProxySQL a time. This might be good or not. In any case, it might be nice to have the possibility to choose the ProxySQL. With Keepalived, you can. We can actually set an X number of VIPs and associate them to each test box.

The result will be that each server hosting ProxySQL will also host a VIP, and will eventually be able to fail-over to any of the other two servers.

Failing-over/back is fully managed by Keepalived, checking as we did before if ProxySQL is running. An example of the configuration of one node can be seen below:

global_defs {
  # Keepalived process identifier
  lvs_id proxy_HA
}
# Script used to check if Proxy is running
vrrp_script check_proxy {
  script "killall -0 proxysql"
  interval 2
  weight 3
}
 
# Virtual interface 1
# The priority specifies the order in which the assigned interface to take over in a failover
vrrp_instance VI_01 {
  state MASTER
  interface em1
  virtual_router_id 51
  priority 102
 
  # The virtual ip address shared between the two loadbalancers
  virtual_ipaddress {
    192.168.0.88 dev em1
  }
  track_script {
    check_proxy
  }
}
 
# Virtual interface 2
# The priority specifies the order in which the assigned interface to take over in a failover
vrrp_instance VI_02 {
  state MASTER
  interface em1
  virtual_router_id 52
  priority 100
 
  # The virtual ip address shared between the two loadbalancers
  virtual_ipaddress {
    192.168.0.89 dev em1
  }
  track_script {
    check_proxy
  }
}
 
# Virtual interface 3
# The priority specifies the order in which the assigned interface to take over in a failover
vrrp_instance VI_03 {
  state MASTER
  interface em1
  virtual_router_id 53
  priority 99
 
  # The virtual ip address shared between the two loadbalancers
  virtual_ipaddress {
    192.168.0.90 dev em1
  }
  track_script {
    check_proxy
  }
}

The tricky part, in this case, is to play with the PRIORITY for each VIP and each server, such that you will NOT assign the same IP twice. The whole set of configs can be found at GitHub here: https://github.com/Tusamarco/proxysql_ha.

Performing a check with the test bash, we have:

Test 1 crash
+-----+----------------------------+----------------------------+-------------+----------------------------+
| 422 | 2017-01-11 18:30:14.411668 | 2017-01-11 18:30:14.344009 | galera1h1n5 | load_RW_192.168.0.11:55962 |
+-----+----------------------------+----------------------------+-------------+----------------------------+
ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.88' (111)
+-----+----------------------------+----------------------------+-------------+----------------------------+
| 426 | 2017-01-11 18:30:18.531279 | 2017-01-11 18:30:21.473536 | galera1h1n5 | load_RW_192.168.0.12:49728 | <-- new server
+-----+----------------------------+----------------------------+-------------+----------------------------+
....
Test 2 crash
+-----+----------------------------+----------------------------+-------------+----------------------------+
| 450 | 2017-01-11 18:30:27.885213 | 2017-01-11 18:30:27.819432 | galera1h1n5 | load_RW_192.168.0.12:49745 |
+-----+----------------------------+----------------------------+-------------+----------------------------+
ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.88' (111)
+-----+----------------------------+----------------------------+-------------+-----------------------------+
| 454 | 2017-01-11 18:30:30.971708 | 2017-01-11 18:30:37.916263 | galera1h1n5 | load_RW_192.168.0.235:33336 | <-- new server
+-----+----------------------------+----------------------------+-------------+-----------------------------+

The final state of IPs on Test3:

enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:c2:16:3f brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.235/24 brd 192.168.0.255 scope global enp0s8   <-- Real IP
       valid_lft forever preferred_lft forever
    inet 192.168.0.90/32 scope global enp0s8    <--- VIP 3
       valid_lft forever preferred_lft forever
    inet 192.168.0.89/32 scope global enp0s8    <--- VIP 2
       valid_lft forever preferred_lft forever
    inet 192.168.0.88/32 scope global enp0s8    <--- VIP 1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fec2:163f/64 scope link
       valid_lft forever preferred_lft forever

And this is the image:

ProxySQL for High Availability

 

Recovery times:

test 1 crash = 7.06 sec (worse case scenario)
  test 2 crash = 10.03 sec (worse case scenario)

Conclusions

In this example, I used a test that checks the process. But a check can be anything reporting 0|1. The limit is defined only by what you need.

The failover times can be significantly shorter by reducing the check time, and only counting the time taken to move the VIP. I preferred to show the worse case scenario, using an application with a one-second interval. This is a pessimistic view of what normally happens with real traffic.

I was looking for a simple, simple, simple way to add HA to ProxySQL – something that can be easily integrated with automation, and that is actually also well-established and maintained. In my opinion, using Keepalived is a good solution because it matches all the above expectations.

Implementing a set of ProxySQL nodes, and having Keepalived manage the failover between them, is pretty easy. But you can expand the usage (and the complexity) if you need to, counting on tools that are already part of the Linux stack. There is no need to re-invent the wheel with a crazy mechanism.

If you want to have fun doing crazy things, at least start with something that helps you to go beyond the basics. For instance, I was also playing a bit with Keepalived and a virtual server, creating a set of redundant ProxySQL with load balancers and . . . but that is another story (blog).

?

Great MySQL and ProxySQL to all!

Jan
11
2017
--

How to Replace MySQL with Percona Server on a CPanel, WHM VPS or Dedicated Server

Replace MySQL with Percona Server

Replace MySQL with Percona ServerIn this blog post, we’ll look at how to replace MySQL with Percona Server for MySQL on a CPanel, WHM VPS or dedicated server.

In general, CPanel and WHM have been leaning towards support of MariaDB over other flavors. This is partly due to the upstream repos replacing the MySQL package with MariaDB (for example, on CentOS).

MySQL 5.6 is still supported though, which means they are keeping support for core MySQL products. But if you want to get some extra performance enhancements or enterprise features for free, without getting too many bells and whistles, you might want to install Percona Server.

I’ve done this work on a new dedicated server with the latest WHM and CPanel on CentOS 7, with MySQL 5.6 installed. Besides the backup, this is a fairly quick process.

It’s pretty simple. From the Percona Server for MySQL 5.7 installation doc, we can get the YUM repo. (Run commands as root if you can, otherwise as sudo.)

yum install http://www.percona.com/downloads/percona-release/redhat/0.1-3/percona-release-0.1-3.noarch.rpm

Now that we have the repo, let’s install Percona XtraBackup in case we need to roll this back at any point:

yum install percona-xtrabackup

This server had a drive mounted at /backup, so I created the backup with the following commands:

xtrabackup --target-dir=/backup/xtrabackup --backup
xtrabackup --target-dir=/backup/xtrabackup --prepare

Now that we have a good backup, let’s remove MySQL:

service mysql stop
yum remove MySQL* mysql*

Depending on your dependency chain, this could remove Percona XtraBackup, but that can be fixed. Let’s accept this uninstall.

Let’s install Percona Server for MySQL 5.7 and Percona Toolkit:

yum install Percona-Server-server-57 percona-toolkit percona-xtrabackup

Now that it’s installed ensure the mysql service is running. If it isn’t, start it. Now let’s upgrade:

mysql_upgrade

NOTE. This works if you can log in as root without a password; if you can’t, you will need to specify the

-u

 and

-p

 flags.

Once you run the upgrade, restart the mysql service:

service mysql restart

And there you go, you are now running on Percona Server for MySQL 5.7. If your managed providers tell you it’s not supported, don’t worry! It works as long as CPanel supports MySQL 5.6.

If you have any issues, just restore the backup.

NOTE: One thing to keep in mind is that 5.7 breaks CPanel’s ability to create users in MySQL. I believe this is due to the changes to the mysql.user table. If this is an issue for you, you can always use Percona Server for MySQL 5.6.

Jan
10
2017
--

Webinar Thursday, January 12: Percona Software News and Roadmap Update for Q1 2017

Percona Software News and RoadmapPlease join Percona CEO Peter Zaitsev for a webinar on Thursday, January 12, 2017 at 11 am PST/ 2 pm EST (UTC-8) for a discussion on the Percona Software News and Roadmap Update for Q1 2017.

In this webinar, Peter will discuss what’s new in Percona open source software. This will include Percona Server for MySQL and MongoDB, Percona XtraBackup, Percona Toolkit, Percona XtraDB Cluster and Percona Monitoring and Management.

During this webinar Peter will talk about newly released features in Percona software, show a few quick demos and share with you highlights from the Percona open source software roadmap.

Peter will also talk about new developments in Percona commercial services and finish with a Q&A.

Register for the Percona Software News and Roadmap Update webinar here.

Percona Software News and RoadmapPeter Zaitsev co-founded Percona and assumed the role of CEO in 2006. As one of the foremost experts on MySQL strategy and optimization, Peter leveraged both his technical vision and entrepreneurial skills to grow Percona from a two-person shop to one of the most respected open source companies in the business. With over 150 professionals in 20 plus countries, Peter’s venture now serves over 3000 customers – including the “who’s who” of internet giants, large enterprises and many exciting startups.

Peter was an early employee at MySQL AB, eventually leading the company’s High Performance Group. A serial entrepreneur, Peter co-founded his first startup while attending Moscow State University where he majored in Computer Science. Peter is a co-author of High Performance MySQL: Optimization, Backups, and Replication, one of the most popular books on MySQL performance. Peter frequently speaks as an expert lecturer at MySQL and related conferences, and regularly posts on the Percona Data Performance Blog. He has also been tapped as a contributor to Fortune and DZone, and his recent ebook Practical MySQL Performance Optimization is one of Percona’s most popular downloads.

 

Dec
23
2016
--

Percona Server for MongoDB 3.4 Beta is now available

Percona Server for MongoDB

Percona is pleased to announce the release of Percona Server for MongoDB 3.4.0-1.0beta on December 23, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

NOTE: Beta packages are available from testing repository.

Percona Server for MongoDB is an enhanced, open source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.4 protocol and drivers. It extends MongoDB with Percona Memory Engine and MongoRocks storage engine, as well as adding features like external authentication, audit logging, and profiling rate limiting. Percona Server for MongoDB requires no changes to MongoDB applications or code.


This beta release is based on MongoDB 3.4.0 and includes the following additional changes:

  • Red Hat Enterprise Linux 5 and derivatives (including CentOS 5) are no longer supported.
  • MongoRocks is now based on RocksDB 4.11.
  • PerconaFT and TokuBackup were removed.
    As alternatives, we recommend using MongoRocks for write-heavy workloads and Hot Backup for physical data backups on a running server.

Percona Server for MongoDB 3.4.0-1.0beta release notes are available in the official documentation.

 

Dec
15
2016
--

Percona XtraDB Cluster 5.7.16-27.19 is now available

Percona XtraDB Cluster 5.7.16-27.19

Percona XtraDB Cluster 5.7.16-27.19

Percona announces the release of Percona XtraDB Cluster 5.7.16-27.19 on December 15, 2016. Binaries are available from the downloads section or our software repositories.

Percona XtraDB Cluster 5.7.16-27.19 is now the current release, based on the following:

All Percona software is open-source and free.

Deprecated

  • The following encryption modes are now deprecated:
    • encrypt=1
    • encrypt=2
    • encrypt=3

The default is encrypt=0 with encryption disabled. The recommended mode now is the new encrypt=4, which uses SSL files generated by MySQL.

For more information, see Encrypting PXC Traffic.

New Features

  • Added encrypt=4 mode for SST encryption that uses SSL files generated by MySQL. Modes 1, 2, and 3 are now deprecated.
  • ProxySQL assisted maintenance mode that enables you to take a node down without adjusting ProxySQL manually. The mode is controlled using the pxc_maint_mode variable, which can be set to one of the following values:
    • DISABLED: This is the default state that tells ProxySQL to route traffic to the node as usual.
    • SHUTDOWN: This state is set automatically when you initiate node shutdown.
    • MAINTENANCE: You can change to this state if you need to perform maintenance on a node without shutting it down.

For more information, see Assisted Maintenance Mode.

  • Simplified SSL configuration for Galera/SST traffic with pxc-encrypt-cluster-traffic option, which auto-configures SSL encryption.

For more information, see SSL Automatic Configuration.

  • Added the wsrep_flow_control_interval status variable that displays the lower and upper limits of the flow control system used for the Galera receive queue.

Fixed Bugs

  • Optimized IST donor selection logic to avoid SST. Child processes are now cleaned-up and node state is resumed if SST fails.
  • Added init.ok to the list of files that do not get removed during SST.
  • Fixed error with ASIO library not acknowledging an EPOLLIN event when building Galera.
  • Fixed stalling of DML workload on slave node caused by FLUSH TABLE executed on the master.
    For more information, see 1629296.
  • Fixed super_read_only to not apply to Galera replication applier.
    For more information, see 1634295.
  • Redirected netcat output to stdout to avoid it in the log.
    For more information, see 1625968.
  • Enabled replication of ALTER USER statements.
    For more information, see 1376269.
  • Changed the wsrep_max_ws_rows variable to ignore non-replicated write-sets generated by DML action on temporary tables (explict or implicit).
    For more information, see 1638138.
  • Fixed SST to fail with an error if SSL is not supported by socat, instead of switching to unencrypted mode.
  • Fixed SST with SSL to auto-generate a 2048-bit dhparams file for versions of socat before 1.7.3. These older versions use 512-bit dhparams file by default that gets rejected by newer clients with dh key too small error.
  • PXC-731: Changed the wsrep_cluster_name variable to read-only, because changing it dynamically leads to high overhead.
    For more information, see 1620439.
  • PXC-732: Improved error message when any of the SSL files required for SST are missing.
  • PXC-735: Fixed SST to fail with an error when netcat is used (transferfmt=nc) with SSL encryption (encrypt set to 2, 3 or 4), instead of silently switching to unencrypted mode.
  • Fixed faulty switch case that caused cluster to stall when the repl.commit_order variable was set to 2 (LOCAL_OOOC mode that should allow out-of-order committing for local transactions).

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Dec
14
2016
--

Percona XtraDB Cluster 5.6.34-26.19 is now available

Percona XtraDB Cluster 5.7.16-27.19

Percona XtraDB Cluster 5.6.34-26.19

Percona announces the release of Percona XtraDB Cluster 5.6.34-26.19 on December 14, 2016. Binaries are available from the downloads section or our software repositories.

Percona XtraDB Cluster 5.6.34-26.19 is now the current release, based on the following:

All Percona software is open-source and free. Details of this release can be found in the 5.6.34-26.19 milestone on Launchpad.

Deprecated

  • The following encryption modes are now deprecated:
    • encrypt=1
    • encrypt=2
    • encrypt=3

The default is encrypt=0 with encryption disabled. The recommended mode now is the new encrypt=4, which uses SSL files generated by MySQL.

For more information, see Encrypting PXC Traffic.

New Features

  • Added encrypt=4 mode for SST encryption that uses SSL files generated by MySQL. Modes 1, 2, and 3 are now deprecated.

Fixed Bugs

  • Optimized IST donor selection logic to avoid SST. Child processes are now cleaned-up and node state is resumed if SST fails.
  • Added init.ok to the list of files that do not get removed during SST.
  • Fixed error with ASIO library not acknowledging an EPOLLIN event when building Galera.
  • Fixed stalling of DML workload on slave node caused by FLUSH TABLE executed on the master.
    For more information, see 1629296.
  • Fixed super_read_only to not apply to Galera replication applier.
    For more information, see 1634295.
  • Redirected netcat output to stdout to avoid it in the log.
    For more information, see 1625968.
  • Enabled replication of ALTER USER statements.
    For more information, see 1376269.
  • Changed the wsrep_max_ws_rows variable to ignore non-replicated write-sets generated by DML action on temporary tables (explicit or implicit).
    For more information, see 1638138.
  • Fixed SST to fail with an error if SSL is not supported by socat, instead of switching to unencrypted mode.
  • Fixed SST with SSL to auto-generate a 2048-bit dhparams file for versions of socat before 1.7.3. These older versions use 512-bit dhparams file by default that gets rejected by newer clients with dh key too small error.
  • PXC-731: Changed the wsrep_cluster_name variable to read-only, because changing it dynamically leads to high overhead.
    For more information, see 1620439.
  • PXC-732: Improved error message when any of the SSL files required for SST are missing.
  • PXC-735: Fixed SST to fail with an error when netcat is used (transferfmt=nc) with SSL encryption (encrypt set to 2, 3 or 4), instead of silently switching to unencrypted mode.
  • Fixed faulty switch case that caused the cluster to stall when the repl.commit_order variable was set to 2 (LOCAL_OOOC mode that should allow out-of-order committing for local transactions).

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Nov
29
2016
--

Percona XtraBackup 2.3.6 is now available

Percona XtraBackup 2.4.5

Percona XtraBackup 2.3.6Percona announces the release of Percona XtraBackup 2.3.6 on November 29, 2016. Downloads are available from our download site or Percona Software Repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

This release is the current GA (Generally Available) stable release in the 2.3 series.

New Features
  • Percona XtraBackup now supports SHA256 passwords. Using the SHA256 algorithm requires either SSL encrypted connection, or using public key encryption for password exchange which is only available when both client and server are linked with OpenSSL.
  • Percona XtraBackup now supports Command Options for Secure Connections.
  • NOTE: Due to xbcrypt format changes, backups encrypted with this Percona XtraBackup version will not be recoverable by older versions.
Bugs Fixed:
  • Fixed intermittent assertion failures that were happening when Percona XtraBackup couldn’t correctly identify server version. Bug fixed #1568009.
  • Safe slave backup algorithm performed too short delays between retries which could cause backups to fail on a busy servers. Bug fixed #1624473.
  • Fixed new compilation warnings with GCC 6. Bug fixed #1641612.
  • xbcrypt was not setting the Initialization Vector (IV) correctly (and thus is was not using an IV). This was causing the same ciphertext to be generated across different runs (for the same message/same key). The IV provides the extra randomness to ensure that the same ciphertext is not generated across runs. Bug fixed #1643949.
  • Backup would still succeed even if XtraBackup would fail to write the metadata. Bug fixed #1623210.
  • xbcloud now supports EMC ECS Swift API Authorization requests. Bugs fixed #1638017 and #1638020 (Txomin Barturen).
  • Percona XtraBackup would fail to backup MariaDB 10.2 with the unsupported server version error message. Bug fixed #1602842.

Other bugs fixed: #1639764 and #1639767.

Release notes with all the bugfixes for Percona XtraBackup 2.3.6 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

Nov
28
2016
--

Percona Server for MongoDB 3.0.14-1.9 is now available

Percona Server for MongoDB 3.0.14-1.9

Percona Server for MongoDB 3.0.14-1.9Percona announces the release of Percona Server for MongoDB 3.0.14-1.9 on November 28, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB 3.0.14-1.9 is an enhanced, open source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.0 protocol and drivers. Based on MongoDB 3.0.14, it extends MongoDB with MongoRocks and PerconaFT storage engines, as well as features like external authentication and audit logging. Percona Server for MongoDB requires no changes to MongoDB applications or code.

NOTE: PerconaFT has been deprecated and will be removed in the future.

This release includes all changes from MongoDB 3.0.13 and MongoDB 3.0.14. We implemented no additional fixes or features.

You can find the release notes in the official documentation.

Nov
28
2016
--

Percona Server 5.7.16-10 is now available

percona server 5.7.16-10

percona server 5.7.16-10Percona announces the GA release of Percona Server 5.7.16-10 on November 28, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Based on MySQL 5.7.16, including all the bug fixes in it, Percona Server 5.7.16-10 is the current GA release in the Percona Server 5.7 series. Percona’s provides completely open-source and free software. Find release details in the 5.7.16-10 milestone at Launchpad.

Deprecated Features:
  • Metrics for scalability measurement feature is now deprecated. Users who have installed this plugin but are not using its capability are advised to uninstall the plugin due to known crashing bugs.
Bugs Fixed
  • When a stored routine would call an administrative command such as OPTIMIZE TABLE, ANALYZE TABLE, ALTER TABLE, CREATE/DROP INDEX, etc. the effective value of log_slow_sp_statements was overwritten by the value of log_slow_admin_statements. Bug fixed #719368.
  • Server wouldn’t start after crash with with innodb_force_recovery set to 6 if parallel doublewrite file existed. Bug fixed #1629879.
  • Thread Pool thread limit reached and failed to create thread messages are now printed on the first occurrence as well. Bug fixed #1636500.
  • INFORMATION_SCHEMA.TABLE_STATISTICS and INFORMATION_SCHEMA.INDEX_STATISTICS tables were not correctly updated for TokuDB. Bug fixed #1629448.

Other bugs fixed: #1633061, #1633430, and #1635184.

The release notes for Percona Server 5.7.16-10 are available in the online documentation. Please report any bugs on the launchpad bug tracker .

Nov
28
2016
--

Percona Server 5.6.34-79.1 is now available

percona server 5.6.34-79.1

percona server 5.6.34-79.1Percona announces the release of Percona Server 5.6.34-79.1 on November 28, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Based on MySQL 5.6.34, including all the bug fixes in it, Percona Server 5.6.34-79.1 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free – this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release are available in the 5.6.34-79.1 milestone on Launchpad.

Deprecated features:
  • Metrics for scalability measurement feature is now deprecated. Users who have installed this plugin but are not using its capability are advised to uninstall the plugin due to known crashing bugs.
Bugs fixed:
  • When a stored routine would call an administrative command such as OPTIMIZE TABLE, ANALYZE TABLE, ALTER TABLE, CREATE/DROP INDEX, etc. the effective value of log_slow_sp_statements was overwritten by the value of log_slow_admin_statements. Bug fixed #719368.
  • Thread Pool thread limit reached and failed to create thread messages are now printed on the first occurrence as well. Bug fixed #1636500.
  • INFORMATION_SCHEMA.TABLE_STATISTICS and INFORMATION_SCHEMA.INDEX_STATISTICS tables were not correctly updated for TokuDB. Bug fixed #1629448.

Other bugs fixed: #1633061, #1633430, and #1635184.

Release notes for Percona Server 5.6.34-79.1 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com