Jul
16
2014
--

High Availability with mysqlnd_ms on Percona XtraDB Cluster

This is the second part of my series on High Availability with mysqlnd_ms. In my first post, “Simple MySQL Master HA with mysqlnd_ms,” I showed a simple HA solution using asynchronous MySQL replication. This time we will see how to leverage an all-primary cluster where you can write to all nodes. In this post I used Percona XtraDB Cluster, but you should also be able to do the same with MySQL NDB Cluster or Tungsten Replicator.

To start with, here is the mysqlnd_ms configuration I used:

mysqlnd_ms_mm.ini

.  All of these files are available from my Github repository. Below, I have three Percona XtraDB Cluster nodes, all defined as masters and no slaves. I’ve configured a roundrobin filter where all connections will happen on the first node, in this case

192.168.56.44

 . In case the first node fails, the second node will be used and so forth until no more nodes are available. Another interesting configuration option here is the loop_before_master strategy whereby if connection or a statement to the current server fails, it will be retried silently on the remaining nodes before returning an error to the user, more on this below.

{
   "primary": {
     "master": {
       "master_1": {
         "host": "192.168.56.44",
         "port": "3306"
       },
       "master_2": {
         "host": "192.168.56.43",
         "port": "3306"
       },
       "master_3": {
         "host": "192.168.56.42",
         "port": "3306"
       }
     },
     "slave": { },
     "filters": { "roundrobin": [ ] },
     "failover": { "strategy": "loop_before_master", "remember_failed": true }
   }
 }

Similar to my previous post, I also used a custom INI file for PHP to use, this time aptly named

master-master.ini

 :

mysqlnd_ms.enable = 1
mysqlnd_ms.disable_rw_split = 1
mysqlnd_ms.multi_master = 1
mysqlnd_ms.force_config_usage = 1
mysqlnd_ms.config_file = /home/revin/git/demo-me/phpugph201407/mysqlnd_ms_mm.ini

A new addition to this configuration is

mysqlnd_ms.multi_master

 , when enabled it would allow you to use all nodes or just one and treat the others as passive. The PHP script I used this time is called

master-master.php

 , it is largely similar to

master-slave-ng.php

with a few differences:

  1. There is no need for 
    /tmp/PRIMARY_HAS_FAILED

      sentinel as all nodes were writable.

  2. There is no need for 
    /*ms=master*/

      SQL hint when validating a connection from connect_mysql function since all nodes acts as master.

So here is a quick test, first with roundrobin filter, after 4 INSERTs, I shutdown 

192.168.56.44

  which sends my connection to the next server in the configuration,

192.168.56.43

 . When I started back 

192.168.56.44

  again, the script resumed connections there. Pretty cool right?

[revin@forge phpugph201407]$ php -c master-master.ini master-master.php
Last value 3564 from host 192.168.56.44 via TCP/IP and thread id 19
Last value 3565 from host 192.168.56.44 via TCP/IP and thread id 20
Last value 3566 from host 192.168.56.44 via TCP/IP and thread id 21
Last value 3567 from host 192.168.56.44 via TCP/IP and thread id 22
Warning: connect_mysql(): MySQL server has gone away in /home/revin/git/demo-me/phpugph201407/master-master.php on line 63
Warning: connect_mysql(): Error while reading greeting packet. PID=23464 in /home/revin/git/demo-me/phpugph201407/master-master.php on line 63
ERRROR: 192.168.56.43 via TCP/IP [2006] MySQL server has gone away on line 30
Last value 0 from host  and thread id 0
Last value 3568 from host 192.168.56.43 via TCP/IP and thread id 1552
Last value 3569 from host 192.168.56.43 via TCP/IP and thread id 1553
[...]
Last value 3584 from host 192.168.56.43 via TCP/IP and thread id 1568
Last value 3585 from host 192.168.56.44 via TCP/IP and thread id 18

Here’s another test using the random filter which allows you to write to all nodes, on my mysqlnd_ms_mm.ini above, I just changed 

roundrobin

  to

random

 . As you can see, all three nodes were being used, of course in random, at the same time you will also see when I shutdown 

192.168.56.44

  around where the 

connect_mysql

  errors and then the server was used again near the bottom after a started it back up. Still pretty cool right?

[revin@forge phpugph201407]$ php -c master-master.ini master-master.php
Last value 3590 from host 192.168.56.42 via TCP/IP and thread id 2060
Last value 3591 from host 192.168.56.43 via TCP/IP and thread id 1569
Last value 3592 from host 192.168.56.43 via TCP/IP and thread id 1570
Warning: connect_mysql(): MySQL server has gone away in /home/revin/git/demo-me/phpugph201407/master-master.php on line 63
Warning: connect_mysql(): Error while reading greeting packet. PID=23919 in /home/revin/git/demo-me/phpugph201407/master-master.php on line 63
ERRROR: 192.168.56.43 via TCP/IP [2006] MySQL server has gone away on line 30
Last value 0 from host  and thread id 0
Last value 3593 from host 192.168.56.42 via TCP/IP and thread id 2061
Last value 3594 from host 192.168.56.42 via TCP/IP and thread id 2062
Last value 3595 from host 192.168.56.42 via TCP/IP and thread id 2063
Last value 3596 from host 192.168.56.42 via TCP/IP and thread id 2064
Last value 3597 from host 192.168.56.43 via TCP/IP and thread id 1576
Last value 3598 from host 192.168.56.43 via TCP/IP and thread id 1577
Last value 3599 from host 192.168.56.43 via TCP/IP and thread id 1578
Last value 3600 from host 192.168.56.43 via TCP/IP and thread id 1579
Last value 3601 from host 192.168.56.42 via TCP/IP and thread id 2065
Last value 3602 from host 192.168.56.43 via TCP/IP and thread id 1581
Last value 3603 from host 192.168.56.43 via TCP/IP and thread id 1582
Last value 3604 from host 192.168.56.42 via TCP/IP and thread id 2066
Last value 3605 from host 192.168.56.44 via TCP/IP and thread id 19
Last value 3606 from host 192.168.56.43 via TCP/IP and thread id 1583
Last value 3607 from host 192.168.56.44 via TCP/IP and thread id 21

So here are some issues I’ve observed during these tests:

  1. remember_failed

      during failover does not work as advertised. Supposedly, a failed node should not be used again for every connection request but in my test, this is not the case. See more from this bug. This means that if you have 2 out of 3 failed nodes in this scenario the overhead would be too big when testing both connections. Perhaps some sort of in memory shared TTL can be used to overcome this? I’m not sure.

  2. If you look closely around line 7 on my last output above the error displayed is kind of misleading. In particular it says
    ERRROR: 192.168.56.43 via TCP/IP

     , whereby it was not 

    192.168.56.43

      that failed, it was

    192.168.56.43

     . This is because under the hood, immediately after failure the next node will be cycled to, this is especially true since we have loop_before_master configured. I sure do have a bug on the script that should capture the 

    host_info

      properly, but this is something to always keep in mind so you don’t keep scratching your head.

So we’ve seen these two forms of possibilities and they definitely have use cases and advantages. On the other hand because of the issues we have found so far (I’ve reported 4 bugs on the PHP bugs database during the course of these tests including one crashing), I recommend to make sure you test seriously before putting this on production.

The post High Availability with mysqlnd_ms on Percona XtraDB Cluster appeared first on MySQL Performance Blog.

Jul
14
2014
--

Simple MySQL Master HA with mysqlnd_ms

I had the pleasure of presenting to the PHP Users Group Philippines a few days ago about mysqlnd_ms. The mysqlnd plugin, MySQL Master Slave, is a transparent layer on top of mysqlnd extension. This allows you to do read-write splitting and slave reads load balancing without needing to change anything from your application. But do you know you can also achieve a form of high availability with this plugin? I shared 2 forms on my presentation, using async MySQL replication either in master-slave configuration or master-master configuration, while the second form is having an all primary cluster where you can write to all nodes.

This first part is to demonstrate how you can achieve a simple HA solution using the first form. First, all the sample code here can be found on my GitHub repository. So, to use the mysqlnd_ms plugin, it uses an additional external configuration file in JSON format. This configuration file, will define your master and slave nodes, failover properties and any filters (connection selection method) you want to dictate how the algorithm will provide you the connection.

Let’s start with the mysqlnd_ms configuration I used,

mysqlnd_ms_ms.ini

 :

{
  "primary": {
    "master": {
      "master_1": {
        "host": "127.0.0.1",
        "port": "33001"
      }
    },
    "slave": {
    }
  },
  "standby": {
    "master": {
      "master_1": {
        "host": "127.0.0.1",
        "port": "33002"
      }
    },
    "slave": {
    }
  }
}

Here, I have two applications defined, one called “primary” and another called “standby”, I have not defined any slaves for simplicity. The two MySQL instances running on port 33001 and 33002 are in master-master configuration.

mysqlnd_ms.enable = 1
mysqlnd_ms.disable_rw_split = 1
mysqlnd_ms.force_config_usage = 1
mysqlnd_ms.config_file = /home/revin/git/demo-me/phpugph201407/mysqlnd_ms_ms.ini

This is the custom INI file I used for the tests,

master-slave.ini

 . The first line simply enables the plugin for use. The second line, mysqlnd_ms.disable_rw_split instructs the plugin that I should only send all queries to the master because I only have masters for this test.

As for the PHP script, the full copy can be found here, as it is a bit lengthy I will just explain the logic on what it does.

  1. To start the test, it bootstraps the test table via DROP and then CREATE queries.
  2. It then enters a for loop where it will execute an INSERT followed by a SELECT to validate the newly inserted row and additional information like the current active server id and the connection id.
  3. For every iteration of the loop, a new mysqli object is created to simulate non-persistent connections to the database server.
  4. To create the new connection, a call to the function 
    connect_mysql

      is made which returns a mysqli object when successful. An important thing to remember here is that mysqlnd_ms uses lazy connections by default, this means that when the mysqli object is created, it is not really connected yet to the server. One has to issue a statement like 

    'SELECT 1'

      to start the connection manually or call

    mysqli::real_connect

     . Not even 

    mysqli::ping

      does not work without the former, I’ve opened this bug.

  5. After the mysqli object is returned, the INSERT statement will trigger mysqlnd_ms to actually establish the connection and then execute the statement. This is where the good part is, if the connection cannot be made, the query_write_mysql function will know and will re-request the connection from connect_mysql, this time within the connect_mysql function, connection to the primary will be retried at least 10 times if the type of error from the previous failure is something related to a connection like error numbers 
    2002

      and

    2003

     . If the connection cannot be established after 10 retries, the application creates a sentinel file as 

    /tmp/PRIMARY_HAS_FAILED

      and will retry the connection to the secondary (slave or passive-master).

Here is an example run, my primary has a server id or 101 while my standby is 102:

[revin@forge phpugph201407]$ php -c master-slave.ini master-slave-ng.php
Last value 0001 from server id 101 thread id 7
Last value 0003 from server id 101 thread id 8
37: [2002] Connection refused
Connection to host 'primary' failed: [0] Connection refused, retrying (1 of 10) in 3 seconds
Connection to host 'primary' failed: [0] Connection refused, retrying (2 of 10) in 3 seconds
Connection to host 'primary' failed: [0] Connection refused, retrying (3 of 10) in 3 seconds
Connection to host 'primary' failed: [0] Connection refused, retrying (4 of 10) in 3 seconds
Connection to host 'primary' failed: [0] Connection refused, retrying (5 of 10) in 3 seconds
Connection to host 'primary' failed: [0] Connection refused, retrying (6 of 10) in 3 seconds
Connection to host 'primary' failed: [0] Connection refused, retrying (7 of 10) in 3 seconds
Connection to host 'primary' failed: [0] Connection refused, retrying (8 of 10) in 3 seconds
Connection to host 'primary' failed: [0] Connection refused, retrying (9 of 10) in 3 seconds
Connection to host 'primary' failed: [0] Connection refused, retrying (10 of 10) in 3 seconds
The primary host 'primary' has failed after 30 seconds, failing over to standby!
52: [2002] Connection refused
Last value 0004 from server id 102 thread id 635
Last value 0006 from server id 102 thread id 636
Last value 0008 from server id 102 thread id 637
[...]

This is not the perfect setup and there are a number of limitations, however it tells us that if you have a simple HA requirement like if you’re not running a very critical application but still do not want to be waken up at night but rather deal with issues in the morning, this might just fit. So here are some more notes:

    • If you have a master-slave configuration, you just basically shot your primary (master) in the foot during the failover. You may need to rebuild its data in the morning.
    • If instead you have master-master, you might just be able to bring the primary master back online, get it caught up in replication and then delete 
      /tmp/PRIMARY_HAS_FAILED

        file to switch your application back to it.

    • The use of 
      /tmp/PRIMARY_HAS_FAILED

        sentinel file is rudimentary, its not the only way. You should consider sending notifications to yourself when failover happens because this method requires human intervention to put back the primary master back in rotation.

The same effect can be achieved with a little more coding, but you can already take advantage of the plugin with less.

I’ve also tested the plugin on the second form where you can write to multiple masters using Percona XtraDB Cluster. I’ve found a few interesting issues there so stay tuned.

The post Simple MySQL Master HA with mysqlnd_ms appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com