Dec
30
2013
--

Percona Monitoring Plugins 1.1.1 release (enterprise-grade MySQL monitoring and graphing)

Percona announces the release of Percona Monitoring Plugins 1.1.1 to address the critical bug that appears after the 1.1 upgrade.

Changelog:

* Cacti MySQL graphs stop working with data input field “server-id” after 1.1 upgrade (bug 1264814)
* Non-integer poller errors for MySQL Query Response Time (bug 1264353)

A new tarball is available from the downloads area, and RPM and DEB packages are available from our software repositories. The plugins are fully supported for customers with a Percona Support contract, and free installation services are provided as part of some contracts. In addition, installation and setup of these tools are included with Percona’s Remote DBA service. You can find links to the documentation, forums and more at the project homepage.

About Percona Monitoring Plugins
Percona Monitoring Plugins are high-quality components to add enterprise-grade MySQL monitoring and graphing capabilities to your existing in-house, on-premises monitoring solutions. The components are designed to integrate seamlessly with widely deployed solutions such as Nagios and Cacti and are delivered in the form of templates, plugins, and scripts which make it easy to monitor MySQL performance.

The post Percona Monitoring Plugins 1.1.1 release (enterprise-grade MySQL monitoring and graphing) appeared first on MySQL Performance Blog.

Dec
30
2013
--

[ERROR] mysqld: Sort aborted: Server shutdown in progress

Recently, one of our support customers faced this: “[ERROR] mysqld: Sort aborted: Server shutdown in progress.” At first glance it would appear this occurred because of a MySQL restart (i.e. the server restarted and the query was killed while mysqld was stopping). However, while debugging this problem I found no evidence of a MySQL server restart – which proves that what’s “apparent” is not always the case, so this error message was a bit misleading. Check this bug report for further details: http://bugs.mysql.com/bug.php?id=18256 (it was reported back in 2006).

I found that there are two possible reasons for this error: either the MySQL server restarts during execution of the query, or the query is killed forcefully during execution while it is using the “filesort” algorithm.

So, let’s try to reproduce the scenario for this particular error. I opened two mysql sessions, mysql1 and mysql2.

mysql1> EXPLAIN SELECT * FROM dummy WHERE user_id <> 245424 GROUP BY title ORDER BY group_id;
+----+-------------+-------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows   | Extra                                        |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
|  1 | SIMPLE      | dummy | range | PRIMARY       | PRIMARY | 4       | NULL | 580096 | Using where; Using temporary; Using filesort |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+----------------------------------------------+

Then, I executed the query in session1 (i.e. mysql1) and, while it was running, killed it from the other session, mysql2.

mysql2> show full processlist\G
*************************** 1. row ***************************
Id: 2
User: root
Host: localhost
db: test
Command: Query
Time: 1
State: Sorting result
Info: SELECT * FROM dummy WHERE user_id <> 245424 GROUP BY title ORDER BY group_id
Rows_sent: 0
Rows_examined: 0
mysql2> kill 2;
Query OK, 0 rows affected (0.00 sec)
mysql2> show full processlist\G
*************************** 1. row ***************************
Id: 2
User: root
Host: localhost
db: test
Command: Killed
Time: 8
State: Creating sort index
Info: SELECT * FROM dummy WHERE user_id <> 245424 GROUP BY title ORDER BY group_id
Rows_sent: 0
Rows_examined: 0
mysql1> SELECT * FROM dummy WHERE user_id <> 245424 GROUP BY title ORDER BY group_id;
ERROR 2013 (HY000): Lost connection to MySQL server during query

In the error log I found…

2013-12-16 00:05:42 3746 [ERROR] /usr/local/mysql/bin/mysqld: Sort aborted: Server shutdown in progress.

This looks confusing because of the error message, “Server shutdown in progress”, when no shutdown occurred. I then restarted the MySQL server during execution of this query and the same error was logged in the error log – which does make sense in that case, since mysqld really was shutting down during query execution.

2013-12-16 00:05:46 3746 [ERROR] /usr/local/mysql/bin/mysqld: Sort aborted: Server shutdown in progress

So, basically, when the MySQL server is shutting down it kills all connected threads, so the “Server shutdown in progress” error message is correct in that context. But this error is recorded not only when the MySQL server restarts during query execution (for queries that use ORDER BY on non-indexed columns), but also when that particular query is explicitly killed during its execution.

Conclusion:
The MySQL error “Server shutdown in progress” is confusing. It pops up when you kill a query explicitly, making it appear as if the MySQL server was restarted – which is not the case in that scenario. This bug is quite old and was finally fixed in MySQL 5.5.35 and 5.6.15.

The post [ERROR] mysqld: Sort aborted: Server shutdown in progress appeared first on MySQL Performance Blog.

Dec
28
2013
--

Just Do it

I trust everyone had a super holiday, and we still have New Year’s to come. I’ve been sick since Christmas Eve and have achieved very little except wallowing in self-pity. Tamara says that men make the worst patients; I won’t disagree. It has given me plenty of time to catch up on TV (including my yearly ritual of watching the Everest series) and, more importantly, to think.

This is the time when we all make resolutions for the coming year. We dream big and hope we can actually pursue those dreams through the end of January let alone the whole year. Most of us fail. I applaud those of you that succeed. After studying goal motivation theory for even an hour, it is easy to see what makes a good resolution. We usually succeed when it is something we genuinely desire, rather than something we feel obligated to do. This is why diet and exercise resolutions rarely work. If we had really wanted to get in shape or lose 50 lbs we wouldn’t need to wait until the New Year to resolve to do it next year. My last year’s resolution to write more was easy – I was super excited to do that and it required very little motivation.

Sometimes, though, we do have a genuine desire but procrastinate. That’s me – I’ve been a big procrastinator all my life (but I’ll fix that next year :)). There are many causes of this condition, and a common one is fear: the fear of trying and failing, or falling short. It is easier not to try at all. One single resource almost cured my tendency to procrastinate: “Feel the Fear and Do It Anyway.” A great read. I caught myself bowing out of several things during 2013 due to fear, most of them things I had really wanted to do, such as an author speaking engagement. I lost opportunities at work too (though others might not have noticed as keenly as I did). Clearly it is time to refresh my “cure”.

I’m not picking a resolution for 2014, I’m picking a mantra. I intend to post it at my desk and in my home office and assert it to myself whenever fear or procrastination sets in. Theoretically I can achieve a dozen resolutions with this simple mantra:

Just do it!

 

What are your resolutions or mantras for 2014?

 

Dec
26
2013
--

2 cases for MySQL server overload

Your MySQL server is overloaded. You see hundreds of running queries in SHOW PROCESSLIST taking many seconds to run, or you can’t connect at all because all connection slots are busy. If you have worked with MySQL long enough you have surely seen it, probably more than once. This is what I would call “MySQL server overload” – having more work than the server can possibly handle. At this point I often see people jumping to the conclusion that something went wrong with MySQL and focusing all their effort on that belief. This is also often how questions are framed when they are filed with our Support or Emergency Consulting teams.

In fact there are two very distinct causes for such a situation – and to find the resolution most effectively you need to understand what you’re dealing with. These are:

1. MySQL gets more work than normal. More work may mean more queries or a different query mix – an increase in complicated queries will increase the amount of work even if the query rate goes down. The reasons vary: it might be a load spike due to a variety of external factors, a Google bot starting to index heavy and uncached pages, or memcached (or another cache layer) expiring entries or going down, among many other causes.

2. MySQL processes the usual amount of work more slowly. There are several reasons why MySQL may start acting slower: an increased level of concurrency, reduced buffer pool (caching) efficiency, changed query plans, accumulated unpurged data in InnoDB tables, as well as a variety of factors external to MySQL – such as a concurrent backup running, the battery on a RAID volume going bad, or the effects of co-tenants in virtualized environments.

For both of the points above I’ve listed some examples of WHY it could happen. The list is far from complete, but my main point is that both conditions have many possible causes, and typically both need to be considered.

Now to the question: “How do you know which case you’re dealing with?”

To answer that, you need some historical data to serve as a baseline. If you’re running Percona Monitoring Plugins you can look at the graphs for queries (of different kinds), the number of rows traversed, and the IO happening. In many cases this will be enough to see that MySQL is getting much more load, because there will be a spike in the number of SELECTs it is handling with a matching increase in rows traversed and IO.
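
To make the baseline idea concrete, here is a minimal sketch of turning two counter samples into per-second rates; the counter values below are invented for illustration (in practice they would come from SHOW GLOBAL STATUS counters such as Com_select and Innodb_rows_read):

```shell
# Two status samples taken 60 seconds apart (values are made up).
sample1_selects=1200000; sample1_rows=45000000
sample2_selects=1230000; sample2_rows=52000000
interval=60

# Per-second rates from the counter deltas.
selects_per_sec=$(( (sample2_selects - sample1_selects) / interval ))
rows_per_sec=$(( (sample2_rows - sample1_rows) / interval ))
echo "SELECTs/sec: $selects_per_sec  Rows read/sec: $rows_per_sec"
```

If SELECTs/sec roughly matches your baseline but rows read/sec has jumped, the same workload has become more expensive; if both jumped, MySQL is simply getting more work.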

Sometimes, though, indicators may point in different directions, and both cases are possible. For example, a lower number of SELECT queries together with a higher number of rows traversed per second may mean either that some queries in your normal workload changed to worse execution plans, or that the workload has changed and you’re getting much more complicated queries.

To really understand this you need to look at historical information on a per-query basis. For a long time you could do this with the query history feature of Percona Toolkit’s pt-query-digest, though few people go through the process of setting up regular log parsing to build such a history.

A more user-friendly tool that can help you answer this question is Percona Cloud Tools’ Query Analytics tool (currently in beta). It analyzes your queries and keeps regular workload summaries over time. You can look at the queries the server was running at the time of the incident and compare them to the historical data. Are you running more queries or fewer? Have some queries become much more frequent? Started taking more time? You can find exact answers to all of those questions. If you find queries that are taking longer than before, you will be able to see where the increase in response time comes from – examining many more rows than usual (often pointing to changed plans), locking, or increased disk IO.

Summary: Whatever tool you use, make sure you find out which case of MySQL overload you are really dealing with, as it will help you focus your efforts and get the best result with the least effort. And just because more load caused the overload, and possibly downtime, it does not mean you should focus only on reducing the workload back to its previous level. Often an increase in work is a wonderful thing, coming from growing service popularity, and in that case you need to focus on making MySQL handle more work better. Even then, it is much better to know that nothing “broke” compared to yesterday, but rather that there is more optimization work to be done.

The post 2 cases for MySQL server overload appeared first on MySQL Performance Blog.

Dec
24
2013
--

Renaming database schema in MySQL

One of the routine tasks for a DBA is renaming database schemas, and for that purpose MySQL once added a “RENAME DATABASE <old_name> TO <new_name>” command. However, this command made it through only a few minor releases before being discontinued (it existed from MySQL 5.1.7 to 5.1.23). Here’s a link to the reference manual regarding the command: http://dev.mysql.com/doc/refman/5.1/en/rename-database.html. Vadim wrote a MySQL Performance Blog post about this a few years ago where he mentioned the dangerous nature of this command – that post was appropriately headlined, “Dangerous Command.” Today we will look at the ways in which a database schema can be renamed, and which of them is the quickest.

Method 1: A well-known method for renaming a database schema is dumping the schema with mysqldump, restoring it into another schema, and then dropping the old schema (if needed).

[root@percona ~]# mysqldump emp > emp.out
[root@percona ~]# mysql -e "CREATE DATABASE employees;"
[root@percona ~]# mysql employees < emp.out
[root@percona ~]# mysql -e "DROP DATABASE emp;"

Although the above method is easy, it is time- and space-consuming. What if the schema is more than 100GB? There are ways to pipe the above commands together to save space, but that will not save time.

To remedy such situations, there is another, quicker method to rename schemas; however, some care must be taken while doing it.

Method 2: MySQL has a very good feature for renaming tables that even works across different schemas. The rename operation is atomic and no one else can access the table while it’s being renamed. It completes quickly, since changing a table’s name or its schema is only a metadata change. Here is a procedural approach to the rename:

  • a) Create the new database schema with the desired name.
  • b) Rename the tables from old schema to new schema, using MySQL’s “RENAME TABLE” command.
  • c) Drop the old database schema.
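
As a rough sketch of step b), the RENAME TABLE statements can be generated from a table list like this (the schema and table names here are examples; in practice the list would come from the information_schema query shown below):

```shell
# Emit one RENAME TABLE statement per base table, moving it from the
# old schema to the new one. The table list is hard-coded for the sketch.
old_db=emp
new_db=emp_test
tables="departments dept_emp titles"
for t in $tables; do
    echo "RENAME TABLE ${old_db}.${t} TO ${new_db}.${t};"
done
```

The emitted statements can then be piped straight into the mysql client.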

If there are views, triggers, functions or stored procedures in the schema, those will need to be recreated too. MySQL’s “RENAME TABLE” fails if triggers exist on the tables. To remedy this, we can do the following:

1) Dump the triggers, events and stored routines to a separate file. This is done using the -E and -R flags (in addition to -t and -d, which suppress the table data and definitions, leaving the triggers, routines and events) with the mysqldump command. Once the triggers are dumped, we need to drop them from the schema for the RENAME TABLE command to work.

$ mysqldump <old_schema_name> -d -t -R -E > stored_routines_triggers_events.out

2) Generate a list of only the “BASE” tables. These can be found with a query against the information_schema.TABLES table.

mysql> select TABLE_NAME from information_schema.tables where table_schema='<old_schema_name>' and TABLE_TYPE='BASE TABLE';

3) Dump the views to an out file. Views can be found with a query against the same information_schema.TABLES table.

mysql> select TABLE_NAME from information_schema.tables where table_schema='<old_schema_name>' and TABLE_TYPE='VIEW';
$ mysqldump <database> <view1> <view2> … > views.out

4) Drop the triggers on the tables in the old schema.

mysql> DROP TRIGGER <trigger_name>;
...

5) Once all the “BASE” tables found in step #2 are renamed, restore the dump files from above.

mysql> RENAME TABLE <old_schema>.table_name TO <new_schema>.table_name;
...
$ mysql <new_schema> < views.out
$ mysql <new_schema> < stored_routines_triggers_events.out

 

Intricacies with the above method:

  • We may need to update the GRANTs for users so that they match the correct schema name. These can be fixed with a simple UPDATE against the mysql.columns_priv, mysql.procs_priv, mysql.tables_priv and mysql.db tables, changing the old schema name to the new one, followed by “FLUSH PRIVILEGES;”.

Although “method 2” seems a bit more complicated than “method 1”, it is totally scriptable. A simple bash script that carries out the above steps in the proper sequence can help you save space and time the next time you rename a database schema.

We on the Percona Remote DBA team have written a script called “rename_db” that works in the following way:

[root@percona ~]# /tmp/rename_db
rename_db <server> <database> <new_database>

To demonstrate the use of this script, we used a sample schema “emp” and created test triggers and stored routines on it. We will rename the schema using the script, which takes under a second to complete, as opposed to the time-consuming dump/restore method.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| emp                |
| mysql              |
| performance_schema |
| test               |
+--------------------+

[root@percona ~]# time /tmp/rename_db localhost emp emp_test
create database emp_test DEFAULT CHARACTER SET latin1
drop trigger salary_trigger
rename table emp.__emp_new to emp_test.__emp_new
rename table emp._emp_new to emp_test._emp_new
rename table emp.departments to emp_test.departments
rename table emp.dept to emp_test.dept
rename table emp.dept_emp to emp_test.dept_emp
rename table emp.dept_manager to emp_test.dept_manager
rename table emp.emp to emp_test.emp
rename table emp.employees to emp_test.employees
rename table emp.salaries_temp to emp_test.salaries_temp
rename table emp.titles to emp_test.titles
loading views
loading triggers, routines and events
Dropping database emp
real    0m0.643s
user    0m0.053s
sys     0m0.131s

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| emp_test           |
| mysql              |
| performance_schema |
| test               |
+--------------------+

As you can see in the above output the database schema “emp” was renamed to “emp_test” in less than a second.

Lastly, we are happy to share the script we used above for “method 2”.

#!/bin/bash
# Copyright 2013 Percona LLC and/or its affiliates
set -e
if [ -z "$3" ]; then
    echo "rename_db <server> <database> <new_database>"
    exit 1
fi
db_exists=`mysql -h $1 -e "show databases like '$3'" -sss`
if [ -n "$db_exists" ]; then
    echo "ERROR: New database already exists $3"
    exit 1
fi
TIMESTAMP=`date +%s`
character_set=`mysql -h $1 -e "show create database $2\G" -sss | grep ^Create | awk -F'CHARACTER SET ' '{print $2}' | awk '{print $1}'`
TABLES=`mysql -h $1 -e "select TABLE_NAME from information_schema.tables where table_schema='$2' and TABLE_TYPE='BASE TABLE'" -sss`
STATUS=$?
if [ "$STATUS" != 0 ] || [ -z "$TABLES" ]; then
    echo "Error retrieving tables from $2"
    exit 1
fi
echo "create database $3 DEFAULT CHARACTER SET $character_set"
mysql -h $1 -e "create database $3 DEFAULT CHARACTER SET $character_set"
TRIGGERS=`mysql -h $1 $2 -e "show triggers\G" | grep Trigger: | awk '{print $2}'`
VIEWS=`mysql -h $1 -e "select TABLE_NAME from information_schema.tables where table_schema='$2' and TABLE_TYPE='VIEW'" -sss`
if [ -n "$VIEWS" ]; then
    mysqldump -h $1 $2 $VIEWS > /tmp/${2}_views${TIMESTAMP}.dump
fi
mysqldump -h $1 $2 -d -t -R -E > /tmp/${2}_triggers${TIMESTAMP}.dump
for TRIGGER in $TRIGGERS; do
    echo "drop trigger $TRIGGER"
    mysql -h $1 $2 -e "drop trigger $TRIGGER"
done
for TABLE in $TABLES; do
    echo "rename table $2.$TABLE to $3.$TABLE"
    mysql -h $1 $2 -e "SET FOREIGN_KEY_CHECKS=0; rename table $2.$TABLE to $3.$TABLE"
done
if [ -n "$VIEWS" ]; then
    echo "loading views"
    mysql -h $1 $3 < /tmp/${2}_views${TIMESTAMP}.dump
fi
echo "loading triggers, routines and events"
mysql -h $1 $3 < /tmp/${2}_triggers${TIMESTAMP}.dump
TABLES=`mysql -h $1 -e "select TABLE_NAME from information_schema.tables where table_schema='$2' and TABLE_TYPE='BASE TABLE'" -sss`
if [ -z "$TABLES" ]; then
    echo "Dropping database $2"
    mysql -h $1 $2 -e "drop database $2"
fi
if [ `mysql -h $1 -e "select count(*) from mysql.columns_priv where db='$2'" -sss` -gt 0 ]; then
    COLUMNS_PRIV="    UPDATE mysql.columns_priv set db='$3' WHERE db='$2';"
fi
if [ `mysql -h $1 -e "select count(*) from mysql.procs_priv where db='$2'" -sss` -gt 0 ]; then
    PROCS_PRIV="    UPDATE mysql.procs_priv set db='$3' WHERE db='$2';"
fi
if [ `mysql -h $1 -e "select count(*) from mysql.tables_priv where db='$2'" -sss` -gt 0 ]; then
    TABLES_PRIV="    UPDATE mysql.tables_priv set db='$3' WHERE db='$2';"
fi
if [ `mysql -h $1 -e "select count(*) from mysql.db where db='$2'" -sss` -gt 0 ]; then
    DB_PRIV="    UPDATE mysql.db set db='$3' WHERE db='$2';"
fi
if [ -n "$COLUMNS_PRIV" ] || [ -n "$PROCS_PRIV" ] || [ -n "$TABLES_PRIV" ] || [ -n "$DB_PRIV" ]; then
    echo "IF YOU WANT TO RENAME the GRANTS YOU NEED TO RUN ALL OUTPUT BELOW:"
    if [ -n "$COLUMNS_PRIV" ]; then echo "$COLUMNS_PRIV"; fi
    if [ -n "$PROCS_PRIV" ]; then echo "$PROCS_PRIV"; fi
    if [ -n "$TABLES_PRIV" ]; then echo "$TABLES_PRIV"; fi
    if [ -n "$DB_PRIV" ]; then echo "$DB_PRIV"; fi
    echo "    flush privileges;"
fi

 

The post Renaming database schema in MySQL appeared first on MySQL Performance Blog.

Dec
23
2013
--

Talking Drupal #029 – Drupal in Education

Show Topics

– Statistics 

– Where does Drupal fit in Education

– Why is Drupal a good fit

– Examples of effective use of Drupal

– Business model for development shops

Module of the Week

– Book Made Simple – https://drupal.org/project/book_made_simple

Drupal 8 Topic of the Week

– Discovering Changes in D8 – how to deal with issues.

Links

– http://funnymonkey.com/

– https://www.acquia.com/drupal-education

– https://drupal.org/project/julio

– http://funnymonkey.com/content/sally

– http://w3techs.com/technologies/segmentation/tld-edu-/content_management

– http://www.schoolyard.com/

Hosts

– Stephen Cross – www.ParallaxInfoTech.com @stephencross

– Jason Pamental – www.hwdesignco.com @jpamental

– John Picozzi – www.RubicDesign.com @johnpicozzi 

– Nic Laflin – www.nLightened.net @nicxvan

Guest

– Bill Fitzgerald – @funnymonkey

Dec
23
2013
--

MySQL 5.7.3 milestone release fixes some of my pet peeves

It is wonderful to see some of my original pet peeves fixed in MySQL 5.7.3! It has not even taken 10 years :)

I remember when starting my work at MySQL Support that I would recommend using UNION ALL instead of plain UNION because it would not require duplicate elimination, and as such, would not require the creation of a temporary table in basic cases. Guess what? MySQL actually did it anyway. The learning experience from this incident was to never use common sense alone when thinking about MySQL – test and validate your assumptions because for one reason or another things may work differently from what “obviously makes sense.” I do not want to single out MySQL – this approach has helped me to be successful with problem analyses of all kinds of software.

The multi-column IN optimization is another fix that was long overdue. Multi-column IN was a wonderful feature when it was first introduced, making multi-key lookup queries much easier to write in applications… so I was quite surprised to learn that it works differently from the equivalent form written with OR:

SELECT * FROM T WHERE (A,B) IN ((1,1),(2,2))

would work differently from

SELECT * FROM T WHERE (A=1 AND B=1) OR (A=2 AND B=2)

These are just two syntaxes for expressing the same logical condition, and it was very strange to see MySQL unable to optimize the first form. It was clearly a half-baked feature, and I’m glad the MySQL team is getting around to fixing it for MySQL 5.7.

There are other impressive things in this Milestone as well:

Support for persistent connections and quick reconnects: I’m really glad work is going on to support both application development patterns – maintaining a high number of connections, and the alternative style where you connect and disconnect all the time so the number of maintained connections stays low. 50K connects/disconnects per second is a very impressive number, and when it comes to the number of connections, you can get to tens of thousands with the Thread Pool (part of MySQL Enterprise Edition, and also available in Percona Server as an open source alternative).

Metadata lock instrumentation is another very important feature. As of MySQL 5.6, metadata locking patterns can be pretty complicated while there is very limited visibility into what is really happening – which locks threads are waiting for and who is holding them. The small concern I have here is that it is done as part of Performance Schema, which means additional overhead, even though the lock wait information is already always available internally; the current lock waits could probably be displayed without extra overhead, similar to the way the INNODB_LOCKS table exposes internal InnoDB locks.

MySQL 5.6 brought a lot of impressive changes, and there is a lot of exciting stuff coming with the MySQL 5.7 milestone releases we have seen so far.

The post MySQL 5.7.3 milestone release fixes some of my pet peeves appeared first on MySQL Performance Blog.

Dec
19
2013
--

Automatic replication relaying in Galera 3.x (available with PXC 5.6)

A decade ago MySQL folks were in love with the concept of a relay slave for MySQL high availability across data centers.  A relay is a single slave in a remote data center that receives replication from the global master and, in turn, replicates to all the other local slaves in that data center.  This saved a lot of bandwidth, especially back in the days before memcached when scaling reads meant lots of slaves.  Sending 20 copies of your replication stream cross-WAN gets expensive.

In Galera and Percona XtraDB Cluster (PXC), by default, when a transaction commits on a given node it is sent from that node to every other node in the cluster. That is, the actual writeset payload (the RBR events) is sent over the network to every other node, so the bandwidth needed to replicate is roughly:

<writeset size> * (<number of nodes> - 1)

If any of your nodes happen to be in a remote data center, the replication is still duplicated for each remote node, much like a master-slave topology without a relay.
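
Plugging illustrative numbers into that formula (the writeset size and node counts below are invented for the example, not measured):

```shell
# Rough bandwidth estimate per committed transaction, using the formula
# <writeset size> * (<number of nodes> - 1). Values are illustrative.
writeset_bytes=4096
nodes=3
remote_nodes=2   # nodes sitting in the remote data center

echo "Total replicated per commit: $(( writeset_bytes * (nodes - 1) )) bytes"
# Without a relay, the WAN link carries one full copy per remote node:
echo "Cross-WAN bytes per commit:  $(( writeset_bytes * remote_nodes ))"
```

With many remote nodes, the cross-WAN term dominates, which is exactly the cost the relay concept attacks.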

Replication traffic with default Galera tuning (and pre-3.x)

To illustrate this I set up a 3-node PXC 5.6 cluster test environment (it would work the same on PXC 5.5 and Galera 2.x):

[Diagram: replication topology without segments]

 

This isn’t the best design for HA, but let’s assume nodes 2 and 3 are in a remote data center. Using some simple iptables ACCEPT rules in the OUTPUT chain, I can easily track how much bandwidth replication uses on each node during a simple 1-minute sysbench update-only test that writes only on node1:

pkts bytes target     prot opt in     out     source               destination
node1:
	24689   18M ACCEPT     tcp  --  any    eth1    192.168.70.2         192.168.70.3
	24389   18M ACCEPT     tcp  --  any    eth1    192.168.70.2         192.168.70.4
node2:
	24802 2977K ACCEPT     tcp  --  any    eth1    192.168.70.3         192.168.70.2
	20758 2767K ACCEPT     tcp  --  any    eth1    192.168.70.3         192.168.70.4
node3:
	22764 2871K ACCEPT     tcp  --  any    eth1    192.168.70.4         192.168.70.2
	20872 2772K ACCEPT     tcp  --  any    eth1    192.168.70.4         192.168.70.3

We can see that node1 sends a full 18M of data to both node2 and node3.  The traffic from nodes 2 and 3 between each other and back to node1 is group communication, you can think of it as replication acknowledgements and other cluster communication.

Replication traffic with Galera 3 WAN segments configured

Galera 3 (available with PXC 5.6) introduces a new feature called WAN segments that basically implements the relay-slave concept, but in a more elegant way.  To enable this, we simply assign each node in a given data center a common gmcast.segment integer in wsrep_provider_options.  Each data center must have a distinct identifier and each node in that data center should have the same segment.
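
As a sketch, the relevant my.cnf fragments might look like the following; the segment numbers match the example below, but on a real cluster the wsrep_provider_options line may carry other settings as well:

```ini
# my.cnf on node1 (data center 1):
[mysqld]
wsrep_provider_options="gmcast.segment=1"

# my.cnf on node2 and node3 (data center 2):
[mysqld]
wsrep_provider_options="gmcast.segment=2"
```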

If we apply this configuration to our above environment where node1 is in gmcast.segment=1 and nodes 2 and 3 are in gmcast.segment=2, we get the following network throughput from the same 1 minute test:

pkts bytes target     prot opt in     out     source               destination
node1:
	20642   15M ACCEPT     tcp  --  any    eth1    192.168.70.2         192.168.70.3
	 6088  317K ACCEPT     tcp  --  any    eth1    192.168.70.2         192.168.70.4
node2:
	19045 2368K ACCEPT     tcp  --  any    eth1    192.168.70.3         192.168.70.2
	33652   17M ACCEPT     tcp  --  any    eth1    192.168.70.3         192.168.70.4
node3:
	14682 2144K ACCEPT     tcp  --  any    eth1    192.168.70.4         192.168.70.2
	21974 2522K ACCEPT     tcp  --  any    eth1    192.168.70.4         192.168.70.3

We can now clearly see that our replication is following this path, using node2 as a relay:

[Diagram: replication topology with segments, relayed through node2]

 

So our hypothetical WAN link here between segment 1 and segment 2 only needs a single copy of the replication stream instead of one per remote node.

But why is this better than a regular old async relay slave? It’s better because node2 was chosen dynamically to be the relay – I did not configure anything special besides the segment designation. The cluster could just as easily have chosen node3. If node2 failed, node3 would simply take over relay responsibilities (assuming there were more nodes).

Further, as I understand the feature, there’s nothing forcing all replication to get relayed through a single node in each segment.  Any given transaction from any given node in the cluster might use any node in a given segment as a relay.  The relaying is actually per-transaction and fully dynamic.  No fuss, no muss.

What about commit latency?

Astute readers know that node1 still must ultimately get acknowledgement from all other nodes before responding to the client.  When we are using segment relays, this should add some latency to commit time.

In my testing I was on a single virtual LAN, and my commit latency averages came out pretty close. I also set up a WAN environment on AWS where node1 was in us-east-1 and nodes 2 and 3 were in us-west-1, and the difference in commit latency was effectively nil.

[Chart: commit latency, with vs. without segments]

The additional latency is about 1ms in the LAN test case; these are 3 VMs on the same physical host, so there is probably some extra overhead in play. In a true WAN case, the high latency between the data centers fully masks the relaying overhead.

Here are the raw results from the WAN tests:

No Segments

Sysbench run
sysbench 0.5:  multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 8
Random number generator seed is 0 and will be ignored
Threads started!
OLTP test statistics:
    queries performed:
        read:                            0
        write:                           3954
        other:                           0
        total:                           3954
    transactions:                        0      (0.00 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 3954   (65.80 per sec.)
    other operations:                    0      (0.00 per sec.)
General statistics:
    total time:                          60.0952s
    total number of events:              3954
    total time taken by event execution: 480.4790s
    response time:
         min:                                 83.20ms
         avg:                                121.52ms
         max:                                321.30ms
         approx.  95 percentile:             169.67ms
Threads fairness:
    events (avg/stddev):           494.2500/1.85
    execution time (avg/stddev):   60.0599/0.03

With Segments

Sysbench run
sysbench 0.5:  multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 8
Initializing random number generator from seed (1).
Threads started!
OLTP test statistics:
    queries performed:
        read:                            0
        write:                           3944
        other:                           0
        total:                           3944
    transactions:                        0      (0.00 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 3944   (65.63 per sec.)
    other operations:                    0      (0.00 per sec.)
General statistics:
    total time:                          60.0957s
    total number of events:              3944
    total time taken by event execution: 480.1212s
    response time:
         min:                                 82.96ms
         avg:                                121.73ms
         max:                                226.33ms
         approx.  95 percentile:             166.85ms
Threads fairness:
    events (avg/stddev):           493.0000/1.58
    execution time (avg/stddev):   60.0151/0.03

 

Test for yourself

I built my test environment on both local VMs and in AWS using an open source Vagrant environment you can find here: https://github.com/jayjanssen/pxc_testing/tree/5_6_segments (check the run_segments.sh script as well as the README.md and documentation for the submodule).

We’ve also released Percona XtraDB Cluster 5.6 RC1 with Galera 3.2; the above Vagrant environment should pull the latest 5.6 build in automatically.

The post Automatic replication relaying in Galera 3.x (available with PXC 5.6) appeared first on MySQL Performance Blog.

Dec
18
2013
--

Talking Drupal #028 OTC v2

The Talking Drupal hosts don’t have a set agenda in “Off the Cuff” episodes. Today they talk about the characteristics of a good Drupal developer and a good Drupal website.

Hosts

Stephen Cross – www.ParallaxInfoTech.com @stephencross

Jason Pamental – www.hwdesignco.com @jpamental

John Picozzi – www.RubicDesign.com @johnpicozzi 

Nic Laflin – www.nLightened.net @nicxvan
