May 16, 2014
--

Benchmark: SimpleHTTPServer vs pyclustercheck (twisted implementation)

GitHub user Adrianlzt provided a python-twisted alternative version of pyclustercheck, per discussion on issue 7.

Due to sporadic performance issues noted with the original SimpleHTTPServer implementation, I benchmarked both versions with the multi-mechanize library (the benchmark scripts are included as part of the project on GitHub; a sketch of such a script follows the list below). The test parameters were:

  • cache time 1 sec
  • 2 x 100 thread pools
  • 60s ramp up time
  • 600s total duration
  • testing simulated node failure (always returns 503, rechecks the MySQL node on cache expiry)
  • AMD FX(tm)-8350 Eight-Core Processor
  • Intel 330 SSD
  • local loopback test (127.0.0.1)
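For reference, a multi-mechanize transaction script for this kind of HTTP check endpoint looks roughly like the following. This is a minimal sketch only; the real scripts live in the GitHub repo, and the URL, port and timer name here are illustrative:

import time
import urllib2

class Transaction(object):
    def __init__(self):
        self.custom_timers = {}

    def run(self):
        start = time.time()
        try:
            # hit the pyclustercheck HTTP endpoint on the local loopback
            urllib2.urlopen('http://127.0.0.1:8000/').read()
        except urllib2.HTTPError:
            # a 503 (simulated node fail) still counts as a completed request
            pass
        self.custom_timers['clustercheck'] = time.time() - start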

The SimpleHTTPServer instance fared as follows:

[Graphs: All Transactions throughput, All Transactions response time intervals, All Transactions response times]

Right away we can see around 500 TPS throughput; however, as both response time graphs show, there are “outlying” transactions: something is causing SimpleHTTPServer’s response time to spike dramatically. How does the twisted alternative compare? (Note: the benchmarks are from the current HEAD, with caching and IPv6 support re-added to avoid regression.)

[Graphs: All Transactions throughput, All Transactions response time intervals, All Transactions response times]

 

Ouch! We appear to have taken a performance hit, at least in terms of TPS (roughly -19%); however, compare the response time graphs and you’ll find a much more consistent plot: we had outliers hitting nearly 70s with SimpleHTTPServer, whereas with twisted we’re always under 1s.

Great! So why isn’t this merged into the master branch as the main project and therefore bundled with Percona XtraDB Cluster (PXC)? The issue is the required version of python-twisted: IPv6 support was introduced in issue 8 by user Nibler999, and to avoid regression I re-added IPv6 support in this commit for twisted.

IPv6 support for python-twisted is not in the versions distributed with mainstream server OSes such as:

  • EL6: python-twisted 8.x
  • Ubuntu 10.04 LTS: python-twisted 11.x

What’s the issue here? Attempting to bind / listen on an IPv6 interface yields the following error:

twisted.internet.error.CannotListenError: Couldn't listen on :::8000: [Errno -9] Address family for hostname not supported.
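To put that error in context, here is a minimal sketch (not the pyclustercheck source) of binding the IPv6 wildcard with twisted’s reactor; on the older packages above, the listenTCP call below is what raises CannotListenError, while on twisted >= 12.x it succeeds:

from twisted.internet import reactor
from twisted.web import resource, server

class Check(resource.Resource):
    isLeaf = True
    def render_GET(self, request):
        return "Percona XtraDB Cluster Node is synced.\n"

# interface='::' requests the IPv6 wildcard; old twisted (8.x / 11.x)
# cannot resolve it and raises CannotListenError, twisted >= 12.x can
reactor.listenTCP(8000, server.Site(Check()), interface='::')
reactor.run()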

Due to this regression (breaking of IPv6 support) the twisted version cannot at this time be merged into master. As shown above, however, the twisted version is much more consistent, and if you have the “cycles” to implement it (e.g. install twisted from PyPI via pip / easy_install to get >= 12.x) and test it, it’s a promising alternative.

To illustrate this further, the benchmark was made more grueling:

  • 5 x 100 thread pools
  • 60s ramp up
  • 600s total duration

First the twisted results; note that the initial spike is due to a local python issue where it locked up while creating a new thread in multi-mechanize:

[Graphs: All Transactions response times, All Transactions response time intervals, All Transactions throughput]

Now the SimpleHTTPServer results:

[Graphs: All Transactions response times, All Transactions response time intervals, All Transactions throughput]

Oh dear, as the load increases we clearly see some stability issues inside SimpleHTTPServer…

Also worth noting are the timeouts:

  • twisted: grep 'timed out' results.csv | wc -l == 0
  • SimpleHTTPServer: grep 'timed out' results.csv | wc -l == 470

 

… in the case of increased load, the twisted model performs far more consistently under the same test conditions when compared against SimpleHTTPServer. I include the multi-mechanize scripts as part of the project on GitHub, so you can recreate these tests yourself and gauge the performance to see whether twisted or SimpleHTTPServer suits your needs.

The post Benchmark: SimpleHTTPServer vs pyclustercheck (twisted implementation) appeared first on MySQL Performance Blog.

Oct 15, 2013
--

Using keepalived for HA on top of Percona XtraDB Cluster

Percona XtraDB Cluster (PXC) itself manages quorum and node failure.  Nodes on the minority side of a network partition will move themselves into a Non-primary state and not allow any DB activity.  Nodes in such a state are easily detectable via SHOW GLOBAL STATUS variables, as shown below.
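For example (output shown is illustrative), a node in the primary component reports wsrep_cluster_status as Primary, while a partitioned minority reports non-Primary:

mysql> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';
+----------------------+---------+
| Variable_name        | Value   |
+----------------------+---------+
| wsrep_cluster_status | Primary |
+----------------------+---------+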

It’s common to use HAProxy with PXC for load-balancing purposes, but what if you are planning to just send traffic to a single node?  We would typically use keepalived to provide HA for HAProxy itself, and keepalived supports track_scripts that can monitor whatever we want, so why not just monitor PXC directly?

If we have clustercheck working on all hosts:

mysql> GRANT USAGE ON *.* TO 'clustercheck'@'localhost' IDENTIFIED BY PASSWORD '*2470C0C06DEE42FD1618BB99005ADCA2EC9D1E19';
[root@node1 ~]# /usr/bin/clustercheck clustercheck password 0; echo $?
HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 40
Percona XtraDB Cluster Node is synced.
0

Then we can just install keepalived and use this config on all nodes:

vrrp_script chk_pxc {
    script "/usr/bin/clustercheck clustercheck password 0"
    interval 1
}
vrrp_instance PXC {
    state MASTER
    interface eth1
    virtual_router_id 51
    priority 100
    nopreempt
    virtual_ipaddress {
        192.168.70.100
    }
    track_script {
        chk_pxc
    }
    notify_master "/bin/echo 'now master' > /tmp/keepalived.state"
    notify_backup "/bin/echo 'now backup' > /tmp/keepalived.state"
    notify_fault "/bin/echo 'now fault' > /tmp/keepalived.state"
}

And start the keepalived service. The virtual IP above will be brought up on an active node in the cluster and moved around if clustercheck fails.

[root@node1 ~]# cat /tmp/keepalived.state
now backup
[root@node2 ~]# cat /tmp/keepalived.state
now master
[root@node3 ~]# cat /tmp/keepalived.state
now backup
[root@node2 ~]# ip a l | grep 192.168.70.100
    inet 192.168.70.100/32 scope global eth1
[root@node3 ~]# mysql -h 192.168.70.100 -u test -ptest test -e "show global variables like 'wsrep_node_name'"
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| wsrep_node_name | node2 |
+-----------------+-------+

If I shutdown PXC on node2:

[root@node2 keepalived]# service mysql stop
Shutting down MySQL (Percona XtraDB Cluster)....... SUCCESS!
[root@node2 ~]# /usr/bin/clustercheck clustercheck password 0; echo $?
HTTP/1.1 503 Service Unavailable
Content-Type: text/plain
Connection: close
Content-Length: 44
Percona XtraDB Cluster Node is not synced.
1
[root@node1 ~]# cat /tmp/keepalived.state
now master
[root@node2 ~]# cat /tmp/keepalived.state
now fault
[root@node3 ~]# cat /tmp/keepalived.state
now backup
[root@node1 ~]# ip a l | grep 192.168.70.100
    inet 192.168.70.100/32 scope global eth1
[root@node2 ~]# ip a l | grep 192.168.70.100
[root@node2 ~]#
[root@node3 ~]# ip a l | grep 192.168.70.100
[root@node3 ~]#

We can see node2 moves to a FAULT state and the VIP moves to node1 instead.  This provides us with a very simple way to do Application to PXC high availability.

A few additional notes:

  • You can disqualify donors (i.e., make clustercheck fail on Donor/Desynced nodes) by setting the 3rd argument to clustercheck to 0.  Setting this to 1 means Donors can retain the VIP.
  • Each keepalived instance monitors its own state only, hence the @localhost GRANT.  This is much cleaner than exposing clustercheck as a web port via xinetd.
  • It’s possible to do more complex things with keepalived, like multiple VIPs, node weighting, etc.
  • Keepalived can track over multiple network interfaces (in this example, just eth1) for better reliability.

The post Using keepalived for HA on top of Percona XtraDB Cluster appeared first on MySQL Performance Blog.
