Worrying about the ‘InnoDB: detected cycle in LRU for buffer pool (…)’ message?

If you use Percona Server 5.5 and you have configured it to use multiple buffer pool instances than sooner or later you’ll see the following lines on the server’s error log and chances are you’ll be worried about them:

InnoDB: detected cycle in LRU for buffer pool 5, skipping to next buffer pool.
InnoDB: detected cycle in LRU for buffer pool 3, skipping to next buffer pool.
InnoDB: detected cycle in LRU for buffer pool 7, skipping to next buffer pool.

Worry not as this is mostly harmless. It’s becoming a February tradition for me (Fernando) to face a question about this subject (ok, it’s maybe a coincidence) and this time I’ve teamed up with my dear colleague and software engineer George Lorch to provide you the most complete blog post ever published on this topic(with a belated thank you! to Ernie Souhrada, with whom I’ve also discussed this same matter one year ago).

InnoDB internals: what is “LRU” ?

There’s a short and to-the-point section of the MySQL manual that explains in a clear way what is the InnoDB buffer pool, how it operates and why it plays such an important role in MySQL performance. If you’re interested in understanding InnoDB internals then that page is a good start. In this section we’ll refrain ourselves to explain what the “LRU” that shows in our subject message is so we’ll only slightly dig into InnoDB internals, enough to make for some context. Here’s a quick introduction to the buffer pool, quoting from the above manual page:

InnoDB maintains a storage area called the buffer pool for caching data and indexes in memory. (…) Ideally, you set the size of the buffer pool to as large a value as practical, leaving enough memory for other processes on the server to run without excessive paging. The larger the buffer pool, the more InnoDB acts like an in-memory database, reading data from disk once and then accessing the data from memory during subsequent reads.

In practice, however, we can rarely fit our whole dataset inside the InnoDB buffer pool so there must be a process to manage this limited pool of memory pages:

InnoDB manages the pool as a list, using a variation of the least recently used (LRU) algorithm. When room is needed to add a new block to the pool, InnoDB evicts the least recently used block and adds the new block to the middle of the list.

There you go, InnoDB employs a variation of the Least Recently Used algorithm called midpoint insertion strategy to manage the pages within the buffer pool. We should mention it does makes exceptions, such as during a full table scan, when it knows the loaded pages might end up being read only a single time.

Dumping and reloading the buffer pool

Before we can get to the main point of this article lets first examine why would you want to dump the buffer pool to disk, which is at the core of the matter here: that’s when those warning messages we’re discussing may appear.

When you start a MySQL server the buffer pool is empty by default. Performance is at it’s worse at this point because no data can be found in memory so in practice each request for data results in an I/O operation to retrieve the data in the disk and bring it to memory. With time the buffer pool gets filled and performance improves – more and more data can now be found in memory. With yet more time we reach a peek performance state: the buffer pool not only is full but it is filled with the most popular data. The time between the start of the server and reaching this optimum state in the buffer pool is called server warm up. How long it takes depends mostly on two things: the size of the buffer pool and the level of activity of the server – the less busy it is the less requests it will get and thus more time is needed until the popular data is fully loaded.

Now, there could be a shortcut: what if before we save the buffer pool on a disk file before we stop MySQL? We could later use it to reload the buffer pool to an optimum state when we restart the server, thus decreasing the warm up period dramatically.

Percona was a pioneer in this field related to other MySQL distributions and implemented this functionality in Percona Server 5.5. Later on, MySQL 5.6 was released with a similar functionality which also allowed preloading the buffer pool for a faster warm up. Percona Server 5.6 incorporates this upstream feature, effectively replacing its own implementation. However, while in Percona Server 5.5 we could periodically dump the buffer pool in MySQL and Percona Server 5.6 it is only dumped at shutdown or at request.

“Detected cycle in LRU”

In the section above we introduced a functionality that allows to dump a fingerprint of the buffer pool to disk so we can later reload it at server restart (note that even though the buffer pool might be very large the fingerprint will be small enough to make this practical). What we didn’t mention was that this is yet most useful outside of maintenance time and planned shutdows – that is, when the server crashes. When a crash happens it’s that more important to bring it back to a warm up state soon, so it can resume providing data fast enough. And giving we cannot predict a crash the only way we can arrange to have the latest buffer pool on disk is by flushing it often.

While the buffer pool is divided into pages for efficiency of high-volume read operations it is implemented as a linked list of pages, for efficiency of cache management. During the process of dumping the buffer pool to disk a mutex is acquired on the LRU list. However, this mutex is not hold for the duration of the process – it is periodically released to prevent stalling of the system. The problem is: in between the release of the mutex and the moment it is acquired again the list may get reshuffled. Since the dump keeps a pointer to its position across the mutex boundry, the dump can get put into some artificial cycling.

Lets consider a linked list:

A > B > C > D > E

where each letter corresponds to a memory page. Now lets say the initial dump was partially taken and covered the first three pages, “A > B > C”, placing a pointer on “C” before releasing the mutex. Once the mutex is reacquired the list has been reshuffled:  “A > C > B > D > E”. The resulting junction of the partial list we have already copied and the reshuffled list now includes a loop, which would incur in a cycle: “(A > B > C) > B > D > E”. When the dumping process detects a cycle on the LRU list it stops copying from the actual buffer pool, throws in a warning message, and moves on to the next buffer pool instance – otherwise it would keep dumping in an infinite loop.

How harmless are those messages ?

It is fairly harmless except for the fact you will only have a partial LRU list dump for that buffer pool instance – that is, until the next dump occurs. If the server crashes or is shutdown before the next dump takes place the existing one won’t be totally up to date for the server warm up to complete – it will still be used and will still provide a partially filled, somewhat “warm” buffer pool, just not as optimal as it could have been if the last dump had been taken fully.

The post Worrying about the ‘InnoDB: detected cycle in LRU for buffer pool (…)’ message? appeared first on MySQL Performance Blog.


Multi-Valued INSERTs, AUTO_INCREMENT & Percona XtraDB Cluster

A common migration path from standalone MySQL/Percona Server to a Percona XtraDB Cluster (PXC) environment involves some measure of time where one node in the new cluster has been configured as a slave of the production master that the cluster is slated to replace. In this way, the new cluster acts as a slave of the production environment – traditional replication takes care of getting the data into the cluster, and then Galera replication handles the intra-cluster traffic. This often works without issue, although there is one case that I’ve encountered recently where special care must be taken to properly configure the stream to ensure that replication does not break. If you use multi-valued inserts with auto-increment columns, then this post is for you.

For purposes of our discussion, assume that we have a basic 3-node PXC cluster that we’ve set up using the PXC Reference Architecture document, and that we’re replicating from an asynchronous master (call it “server A”) into one of the PXC nodes. Without loss of generality, we’ll pick PXC03. Also, for purposes of our discussion, we’ll be working with the following table definition:

serverA> show create table auto_inc_test;
CREATE TABLE `auto_inc_test` (
  `stuff` varchar(20) DEFAULT NULL,
1 row in set (0.00 sec)
serverA> SELECT * FROM auto_inc_test;
Empty set (0.00 sec)

If we insert rows into this table one at a time, we have no issues.

serverA> INSERT INTO auto_inc_test(stuff) VALUES ('first row');
serverA> INSERT INTO auto_inc_test(stuff) VALUES ('second row');
serverA> INSERT INTO auto_inc_test(stuff) VALUES ('third row');
serverA> SELECT * FROM auto_inc_test;
| i | stuff      |
| 1 | first row  |
| 2 | second row |
| 3 | third row  |
PXC03> SELECT * FROM auto_inc_test;
| i | stuff      |
| 1 | first row  |
| 2 | second row |
| 3 | third row  |

But if we start doing multi-valued inserts, we can run into a problem.

serverA> INSERT INTO auto_inc_test(stuff) VALUES('first row'),('second row'),('third row');
serverA> INSERT INTO auto_inc_test(stuff) VALUES('fourth row'),('fifth row');
serverA> SELECT * FROM auto_inc_test;
| i | stuff      |
| 1 | first row  |
| 2 | second row |
| 3 | third row  |
| 4 | fourth row |
| 5 | fifth row  |
PXC03> SELECT * FROM auto_inc_test;
| i | stuff      |
| 1 | first row  |
| 2 | second row |
| 5 | third row  |
... output elided ...
Last_SQL_Errno: 1062
Last_SQL_Error: Error 'Duplicate entry '5' for key 'PRIMARY'' on query. Default database: 'test'. Query: 'INSERT INTO auto_inc_test (stuff) VALUES ('fourth row'),('fifth row')'
... output elided ...

Uh oh. Replication is broken and our data is now inconsistent. So why does this happen and how can we prevent it?


The astute observer will note that I have not yet said anything about the binary log format on the master. If the binary log format on the master is already set to ROW, then the above error will not occur. RBR will properly replicate multi-valued INSERTs to the PXC cluster without issue, and the data will be consistent. Problem solved. However, there may be reasons that the master is not using or cannot use RBR, such as disk space or IOPS limitations, and thus it’s running in MIXED or STATEMENT mode. In that case, we need to look elsewhere….


When set to ON (the default), this variable has the effect of automatically specifying values for auto_increment_increment and auto_increment_offset based on the cluster size. The idea behind it is to help prevent auto-increment value conflicts when writing to multiple nodes. However, what it also means is that in a multi-node cluster, the auto-increment values generated by any given node will never be consecutive and the “next” auto-increment value on the slave cluster node will always be higher than what the master believes it should be. For example:

serverA> INSERT INTO auto_inc_test (stuff) VALUES ('first row'),('second row'),('third row');
serverA> SHOW CREATE TABLE auto_inc_test;
CREATE TABLE `auto_inc_test` (
  `stuff` varchar(20) DEFAULT NULL,
PXC03> SHOW CREATE TABLE auto_inc_test;
CREATE TABLE `auto_inc_test` (
  `stuff` varchar(20) DEFAULT NULL,

Hmm, that’s not good.

The above scenario can be avoided by setting wsrep_auto_increment_control to OFF on the node that’s acting as a slave while the cluster is still receiving asynchronous replication traffic. This can be configured in /etc/my.cnf or set dynamically from the MySQL command line with:

SET GLOBAL wsrep_auto_increment_control='OFF';

In the testing that I’ve done, this appears to be sufficient to deal with the problem, even if the original master has non-standard values specified for the auto_increment_* variables (e.g., because it’s part of a master-master pair where the odd ID numbers are generated by one server and the even ID numbers from the other one).

In fact, if the cluster is always going to be used in single-writer mode, there’s a compelling argument to be made for setting this variable to OFF on all of the cluster nodes even when going into production: it will prevent the cluster from burning through N auto-increment IDs every time a single row is inserted.

Fixing it when it’s broken

There’s no secret magic here. Percona Toolkit’s pt-table-checksum and pt-table-sync can be used to check and repair the data divergence between the master and the slave cluster node, just as if PXC were not involved, although you may find that it’s just faster to rebuild the cluster, depending upon how many diffs are found. Sometimes this issue happens right away when the master-to-cluster replication is started, and fixing it involves just one or two rows; other times I have seen it not occur for days with lots of rows out of sync once it finally breaks.

The tl;dr version

In sum, if you’re doing traditional replication into a PXC cluster (or any other flavor of MySQL/MariaDB + Galera), you may have issues with multi-valued INSERT statements, and this can be prevented with a configuration change on either side of the replication stream:

  • On the master, set binlog_format=ROW
  • On the PXC slave node, set wsrep_auto_increment_control=OFF

The post Multi-Valued INSERTs, AUTO_INCREMENT & Percona XtraDB Cluster appeared first on MySQL Performance Blog.


TIMESTAMP Columns, Amazon RDS 5.6, and You

This comes from an issue that I worked on recently, wherein a customer reported that their application was working fine under stock MySQL 5.6 but producing erroneous results when they tried running it on Amazon RDS 5.6. They had a table which, on the working server, contained two TIMESTAMP columns, one which defaulted to CURRENT_TIMESTAMP and the other which defaulted to ’0000-00-00 00:00:00′, like so:

CREATE TABLE mysql56 (
  ts2 TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',

However, under Amazon RDS, the same table looked like this:


They mentioned that their schema contains TIMESTAMP column definitions without any modifiers for nullability or default values. In other words, they were doing something like this:

    ts1 TIMESTAMP,

It’s a known issue (or change, or difference, whatever we choose to call it) that MySQL is deprecating defaults for TIMESTAMP columns that don’t have any nullability or default-value specifiers; this is covered in the 5.6 documentation. However, the docs also mention that the default value for this setting is OFF – i.e., if you create a table with TIMESTAMP columns without any defaults, it will fill them in for you, similarly to what I’ve described above.

As it turns out, the RDS default for this setting is ON, hence the “NULL DEFAULT NULL” modifiers when creating the table under RDS. We changed the parameter group, restarted the instance (note that this variable is NOT dynamic), and their schema-creation script created the tables in the proper way.

So, what have we learned here?

  • Migrating from standalone MySQL to Amazon RDS sometimes has hidden pitfalls that aren’t always readily apparent. Many times it will “just work” – but sometimes it doesn’t. Percona is, of course, happy to help review your configurations and assist with any Amazon RDS implementation plans you might have.
  • When in doubt, fully-specify your TIMESTAMP columns. If you want them NOT NULL, say so. If you want a default value or an on-updated value, set it. Even the configuration variable explicit_defaults_for_timestamp is deprecated and slated for removal in a future version, so eventually it won’t be possible to get the old pre-5.6 behavior at all.

The post TIMESTAMP Columns, Amazon RDS 5.6, and You appeared first on MySQL Performance Blog.


Heartbleed: Separating FAQ From FUD

If you’ve been following this blog (my colleague, David Busby, posted about it yesterday) or any tech news outlet in the past few days, you’ve probably seen some mention of the “Heartbleed” vulnerability in certain versions of the OpenSSL library.

So what is ‘Heartbleed’, really?

In short, Heartbleed is an information-leak issue. An attacker can exploit this bug to retrieve the contents of a server’s memory without any need for local access. According to the researchers that discovered it, this can be done without leaving any trace of compromise on the system. In other words, if you’re vulnerable, they can steal your keys and you won’t even notice that they’ve gone missing. I use the word “keys” literally here; by being able to access the contents of the impacted service’s memory, the attacker is able to retrieve, among other things, private encryption keys for SSL/TLS-based services, which means that the attacker could be able to decrypt communications, impersonate other users (see here, for example, for a session hijacking attack based on this bug), and generally gain access to data which is otherwise believed to be secure. This is a big deal. It isn’t often that bugs have their own dedicated websites and domain names, but this one does: http://www.heartbleed.com

Why is it such a big deal?

One, because it has apparently been in existence since at least 2012. Two, because SSL encryption is widespread across the Internet. And three, because there’s no way to know if your keys have been compromised. The best detection that currently exists for this are some Snort rules, but if you’re not using Snort or some other IDS, then you’re basically in the dark.

What kinds of services are impacted?

Any software that uses the SSL/TLS features of a vulnerable version of OpenSSL. This means Apache, NGINX, Percona Server, MariaDB, the commercial version of MySQL 5.6.6+, Dovecot, Postfix, SSL/TLS VPN software (OpenVPN, for example), instant-messaging clients, and many more. Also, software packages that bundle their own vulnerable version of SSL rather than relying on the system’s version, which might be patched. In other words, it’s probably easier to explain what isn’t affected.

What’s NOT impacted?

SSH does not use SSL/TLS, so you’re OK there. If you downloaded a binary installation of MySQL community from Oracle, you’re also OK, because the community builds use yaSSL, which is not known to be vulnerable to this bug. In fact, anything that’s been built against yaSSL (or some other SSL library that isn’t OpenSSL) should be fine. Obviously, any service which doesn’t use SSL/TLS is not going to be vulnerable, either, since the salient code paths aren’t going to be executed. So, for example, if you don’t use SSL for your MySQL connections, then this bug isn’t going to affect your database server, although it probably still impacts you in other ways (e.g., your web servers).

What about Amazon cloud services?

According to Amazon’s security bulletin on the issue, all vulnerable services have been patched, but they still recommend that you rotate your SSL certificates.

Do I need to upgrade Percona Server, MySQL, NGINX, Apache, or other server software?

No, not unless you built any of these and statically-linked them with a vulnerable version of OpenSSL. This is not common. 99% of affected users can fix this issue by upgrading their OpenSSL libraries and cycling their keys/certificates.

What about client-level tools, like Percona Toolkit or XtraBackup?

Again, no. The client sides of Percona Toolkit, Percona XtraBackup, and Percona Cloud Tools (PCT) are not impacted. Moreover, the PCT website has already been patched. Encrypted backups are still secure.

There are some conflicting reports out there about exactly how much information leakage this bug allows. What’s the deal?

Some of the news articles and blogs claim that with this bug, any piece of the server’s memory can be accessed. Others have stated that the disclosure is limited to memory space owned by processes using a vulnerable version of OpenSSL. As far as we are aware, and as reported in CERT’s Vulnerability Notes Database, the impact of the bug is the latter; i.e., it is NOT possible for an attacker to use this exploit to retrieve arbitrary bits of your server’s memory, only bits of memory from your vulnerable services. That said, your vulnerable services could still leak information that attackers could exploit in other ways.

How do I know if I’m affected?

You can test your web server at http://filippo.io/Heartbleed/ – enter your site’s URL and see what it says. If you have Go installed, you can also get a command-line tool from Github and test from the privacy of your own workstation. There’s also a Python implementation. You can also check the version of OpenSSL that’s installed on your servers. If you’re running OpenSSL 1.0.1 through 1.0.1f or 1.0.2-beta, you’re vulnerable. (Side note here: some distributions, such as RHEL/CentOS, have patched the issue without actually updating the OpenSSL version number; on RPM-based systems you can view the changelog to see if it’s been fixed, for example):

rpm -q --changelog openssl | head -2
* Mon Apr 07 2014 Tomáš Mráz <tmraz@redhat.com> 1.0.1e-16.7
- fix CVE-2014-0160 - information disclosure in TLS heartbeat extension

Also, note that versions of OpenSSL prior to 1.0.1 are not known to be vulnerable. If you’re still unsure, we would be happy to assist you in determining the extent of the issue in your environment and taking any required corrective action. Just give us a call.

How can I know if I’ve been compromised?

If you’ve already been compromised, you have no way of knowing. However, if you use Snort as an IDS, you can use some rules developed by Fox-IT to detect and defer new attacks; other IDS/IPS vendors may have similar rule updates available. Without some sort of IDS in place, however, attacks can and will go unnoticed.

Are there any exploits for this currently out there?

Currently, yes, there are some proofs-of-concept floating around out there, although before this week, that was uncertain. But given that this is likely a 2-year-old bug, I would be quite surprised if someone, somewhere (you came to my talk at Percona Live last week, didn’t you? Remember what I said about assuming that you’re already owned? There are No Secrets Anymore.) didn’t have a solid, viable exploit.

So, then, what should I do?

Ubuntu, RedHat/CentOS, Amazon, and Fedora have already released patched versions of the OpenSSL library. Upgrade now. Do it now, as in right now. If you’ve compiled your own version of OpenSSL from source, upgrade to 1.0.1g or rebuild your existing source with the -DOPENSSL_NO_HEARTBEATS flag.

Once that’s been done, stop any certificate-using services, regenerate the private keys for any services that use SSL (Apache, MySQL, whatever), and then obtain a new SSL certificate from whatever certificate authority you’re using. Once the new certificates are installed, restart your services. You can also, of course, do the private key regeneration and certificate cycling on a separate machine, and only bring the service down long enough to install the new keys and certificates. Yes, you will need to restart MySQL. Or Apache. Or whatever. Full stop, full start, no SIGHUP (or equivalent).

Unfortunately, that’s still not all. Keeping in mind the nature of this bug, you should also change / reset any user passwords and/or user sessions that were in effect prior to patching your system and recycling your keys. See, for example, the session hijacking exploit referenced above. Also note that Google services were, prior to their patching of the bug, vulnerable to this issue, which means that it’s entirely possible that your Gmail login (or any other Google-related login) could have been compromised.

Can I get away with just upgrading OpenSSL?

NO. At a bare minimum, you will need to restart your services, but in order to really be sure you’ve fixed the problem, you also need to cycle your keys and certificates (and revoke your old ones, if possible). This is the time-consuming part, but since you have no way of knowing whether or not someone has previously compromised your private keys (and you can bet that now that the bug is public, there will be a lot of would-be miscreants out there looking for servers to abuse), the only safe thing to do is cycle them. You wouldn’t leave your locks unchanged after being burgled, would you?

Also note that once you do upgrade OpenSSL, you can get a quick list of the services that need to be restarted by running the following:

sudo lsof | grep ssl | grep DEL

Where can I get help and/or more information?

In addition to the assorted links already mentioned, you can read up on the nuts and bolts of this bug, or various news articles such as this or this. There are a lot of articles out there right now, most of which are characterizing this as a major issue. It is. Test your vulnerability, upgrade your OpenSSL and rotate your private keys and certificates, and then change your passwords.

The post Heartbleed: Separating FAQ From FUD appeared first on MySQL Performance Blog.


MySQL encryption performance, revisited

This is part two on a two-part series on the performance implications of in-flight data encryption with MySQL. In the first part, I focused specifically on the impact of using MySQL’s built-in SSL support with some rather surprising results. Certainly it was expected that query throughput would be lower with SSL than without, but I was rather surprised by the magnitude of the performance hit incurred at connection setup time. These results naturally lended themselves to some further investigation; in particular, I wanted to compare performance differences between MySQL’s built-in SSL encryption facilities and external encryption technologies, such as SSH tunneling. I’ll also be using this post to address a couple of questions posed in the comments on my original article. So, without further ado….

Test Environment

The various tests discussed in this post involved a total of four different machines:

  • Machine A: m1.xlarge EC2 instance (4 vCPU/15GB RAM/Amazon Linux) in US-West-2/Oregon
  • Machine B: m1.xlarge EC2 instance (4 vCPU/15GB RAM/Amazon Linux) in EU-West/Ireland
  • Machine C: Intel Core i7-2600K 3.4GHz (8 HT cores/32GB RAM/CentOS 6.4)
  • Machine D: Intel Core i3-550 3.2GHz (4 HT cores/16GB RAM/CentOS 6.4)

Some tests were conducted with MySQL 5.6.13-community, and other tests were conducted with Percona Server 5.6.13.

External Encryption Technology

For this test, I looked at one of the most common ways of establishing a site-to-site link in the absence of a “true” VPN – the good old-fashioned SSH tunnel. I don’t have ready access to sufficient gear to set up a hardware-accelerated VPN, but this is enough to illustrate the point. Recall that the default SSL cipher suite used by MySQL/SSL is DHE-RSA-AES256-SHA; if we unpack this a bit, it tells us that we’re using Diffie-Hellman key exchange with SHA1 as our hashing function, RSA for authentication, and encryption with 256-bit AES (in CBC mode, according to the OpenSSL docs). Although perhaps not immediately obvious, this same cipher suite is pretty easy to emulate with OpenSSH. The SSH version 2 protocol uses DHE/RSA/SHA1 by default, so then all we need to do is explicitly specify the AES256-CBC cipher when we’re setting up our tunnel, and we should be, for all intents and purposes, comparing encrypted apples to encrypted apples. For the sake of curiosity, we also try our SSH tunnel with AES256 in CTR mode, which should theoretically be a bit faster due to its ability to encrypt blocks in parallel, but as it turns out, at least for this test, the difference is marginal.

The machines used for this test were machines C (server) and D (client), which are on the same VLAN of a gigabit Ethernet link, and the test script in use was similar to that from part 1, wherein we attempt to create 100 connections as fast as possible. Each test configuration was run 10 times, with the averages and standard deviations for each test listed in the table below, and the numbers are given in connections per second. Also, note that for this particular test, all keys used are 4096 bits and tests were run against Percona Server 5.6.13 only.


1001.33 (59.26) 22.23 (0.1392) 476.52 (11.87) 482.02 (13.42)


Or, for those of you who like graphics, I think the chart speaks for itself.


Obviously, not using encryption is still the fastest game in town, but connection throughput over an SSH tunnel doesn’t obliterate performance to anywhere near the same level as using MySQL’s native SSL capability. Going from 1000 cps to 22 cps is potentially unusable, but I’d venture a guess that 470-480 cps in a single thread would still provide serviceable results for most people.

Connection Performance over High-Latency Links

Pursuit of data for this test came out of a question left in the comments on my original post; specifically, how the SSL connection penalty is impacted by increasing network latency. We can see from the results above that on a low-latency link, using SSL dramatically impacts performance, but what if we’re doing that connection over a WAN link? It might be the case that if we’re already paying a latency penalty due to simple round-trip time, maybe throwing encryption into the mix won’t hurt that much, even if we do it with MySQL’s built-in SSL support. Thus, for this test, I spun up two different Amazon EC2 instances (machines A and B, described above). Machine C, loacated in Northern California, was used as the client, and this test was conducted both with Community MySQL as well as Percona Server, and with key sizes ranging from zero to 4096 bits. The SSL cipher suite used was the default, and the test script the same as before – run 10 trials, create 100 connections as fast as we can, and report results in connections per second. However, for this test, the raw numbers are somewhat secondary; we’re really just looking to see the impact of network latency on SSL connection performance.

First, from C to B (Northern California to Ireland):
--- ec2-54-220-212-210.eu-west-1.compute.amazonaws.com ping statistics ---
50 packets transmitted, 50 received, 0% packet loss, time 49228ms
rtt min/avg/max/mdev = 167.851/170.126/177.433/2.289 ms


And next, from C to A (Northern California to Oregon):
--- ec2-54-212-134-221.us-west-2.compute.amazonaws.com ping statistics ---
50 packets transmitted, 50 received, 0% packet loss, time 49108ms
rtt min/avg/max/mdev = 42.543/44.648/59.994/3.194 ms


As expected, obviously the raw numbers are much lower for the transcontinental test than they are when connecting to servers that, at least geographically, are only a few hundred miles away, but as it turns out, with the exception of how Community MySQL behaves, which I’ll talk about in a minute, the percentage decrease in performance actually doesn’t really vary by that much. The table below compares connections from C to B with connections from C to A.

MySQL 5.6.13 US->EU MySQL 5.6.13 US->US PS 5.6.13 US->EU PS 5.6.13 US->US PS 5.6.13-static US->EU PS 5.6.13-static US->US
1024-bit 34.39% 36.13% 34.59% 35.23% 33.44% 36.31%
2048-bit 37.04% 45.07% 33.91% 38.35% 34.30% 35.40%
4096-bit 51.85% 71.66% 37.06% 43.17% 37.64% 41.66%


A few comments on the above. First, at 1024 bits of SSL encryption, it doesn’t make that much difference if you’re 40ms away or 170ms away. Second, as latency decreases, the connection-throughput performance lost due to SSL encryption overhead increases. That makes sense, particularly when we consider that in the base cases (two servers on the same LAN, or connections over a TCP socket to the same server), connection throughput performance is dominated by the use (or not) of SSL. However, the one thing in all of this that doesn’t make sense to me is just how badly community MySQL fares with 4096-bit encryption compared to Percona Server. What’s so special about 4096-bit encryption that tanks the performance of Community MySQL but doesn’t seem to impact Percona Server nearly as much? I don’t have a good answer for this nor even a good hypothesis; if it hadn’t happened on both tests I’d probably have claimed a PEBCAT, but now I’m just not sure. So, if anyone else tries this test, I’d be curious to know if you get the same results.

Parting Thoughts

Regardless of the anomaly with MySQL 5.6.13 and 4096-bit SSL, I think the primary takeaway from both this post and its predecessor is pretty clear: if you need end-to-end encryption of your MySQL traffic and you’re using replication or a connection-pooling sort of workload, MySQL’s built-in SSL support will probably serve you just fine, but if your application is of the type that requires setting up and tearing down large numbers of connections frequently, you might want to look at offloading that encryption work elsewhere, even if “elsewhere” just means running it through an SSH tunnel.

The post MySQL encryption performance, revisited appeared first on MySQL Performance Blog.


SSL Performance Overhead in MySQL

NOTE: This is part 1 of what will be a two-part series on the performance implications of using in-flight data encryption.

Some of you may recall my security webinar from back in mid-August; one of the follow-up questions that I was asked was about the performance impact of enabling SSL connections. My answer was 25%, based on some 2011 data that I had seen over on yaSSL’s website, but I included the caveat that it is workload-dependent, because the most expensive part of using SSL is establishing the connection. Not long thereafter, I received a request to conduct some more specific benchmarks surrounding SSL usage in MySQL, and today I’m going to show the results.

First, the testing environment. All tests were performed on an Intel Core i7-2600K 3.4GHz CPU (8 cores, HT included) with 32GB of RAM and CentOS 6.4. The disk subsystem is a 2-disk RAID-0 of Samsung 830 SSDs, although since we’re only concerned with measuring the overhead added by using SSL connections, we’ll only be conducting read-only tests with a dataset that fits completely in the buffer pool. The version of MySQL used for this experiment is Community Edition 5.6.13, and the testing tools are sysbench 0.5 and Perl. We conduct two tests, each one designed to simulate one of the most common MySQL usage patterns. First, we examine connection pooling, often seen in the Java world, where some small set of connections are established by, for example, the servlet container and then just passed around to the application as needed, and one-request-per-connection, typical in the LAMP world, where the script that displays a given page might connect to the database, run a couple of queries, and then disconnect.

Test 1: Connection Pool

For the first test, I ran sysbench in read-only mode at concurrency levels of 1, 2, 4, 8, 16, and 32 threads, first with no encryption and then with SSL enabled and key lengths of 1024, 2048, and 4096 bits. 8 sysbench tables were prepared, each containing 100,000 rows, resulting in a total data size of approximately 256MB. The size of my InnoDB buffer pool was 4GB, and before conducting each official measurement run, I ran a warm-up run to prime the buffer pool. Each official test run lasted 10 minutes; this might seem short, but unlike, say, a PCIe flash storage device, I would not expect the variable under observation to really change that much over time or need time to stabilize. The basic sysbench syntax used is shown below.

for SSL in on off ;
  for threads in 1 2 4 8 16 32 ;
    sysbench --test=/usr/share/sysbench/oltp.lua --mysql-user=msandbox$SSL --mysql-password=msandbox \
       --mysql-host= --mysql-port=5613 --mysql-db=sbtest --mysql-ssl=$SSL \
       --oltp-tables-count=8 --num-threads=$threads --oltp-dist-type=uniform --oltp-read-only=on \
       --report-interval=10 --max-time=600 --max-requests=0 run > sb-ssl_${SSL}-threads-${threads}.out

If you’re not familiar with sysbench, the important thing to know about it for our purposes is that it does not connect and disconnect after each query or after each transaction. It establishes N connections to the database (where N is the number of threads) and runs queries though them until the test is over. This behavior provides our connection-pool simulation. The assumption, given what we know about where SSL is the slowest, is that the performance penalty here should be the lowest. First, let’s look at raw throughput, measured in queries per second:


The average throughput and standard deviation (both measured in queries per second) for each test configuration is shown below in tabular format:



# of threads
SSL key size
1 2 4 8 16 32
SSL OFF 9250.18 (1005.82) 18297.61 (689.22) 33910.31 (446.02) 50077.60 (1525.37) 49844.49 (934.86) 49651.09 (498.68)
1024-bit 2406.53 (288.53) 4650.56 (558.58) 9183.33 (1565.41) 26007.11 (345.79) 25959.61 (343.55) 25913.69 (192.90)
2048-bit 2448.43 (290.02) 4641.61 (510.91) 8951.67 (1043.99) 26143.25 (360.84) 25872.10 (324.48) 25764.48 (370.33)
4096-bit 2427.95 (289.00) 4641.32 (547.57) 8991.37 (1005.89) 26058.09 (432.86) 25990.13 (439.53) 26041.27 (780.71)





So, given that this is an 8-core machine and IO isn’t a factor, we would expect throughput to max out at 8 threads, so the levelling-off of performance is expected. What we also see is that it doesn’t seem to make much difference what key length is used, which is also largely expected. However, I definitely didn’t think the encryption overhead would be so high.

The next graph here is 95th-percentile latency from the same test:


And in tabular format, the raw numbers (average and standard deviation):




# of threads
SSL key size
1 2 4 8 16 32
SSL OFF 1.882 (0.522) 1.728 (0.167) 1.764 (0.145) 2.459 (0.523) 6.616 (0.251) 27.307 (0.817)
1024-bit 6.151 (0.241) 6.442 (0.180) 6.677 (0.289) 4.535 (0.507) 11.481 (1.403) 37.152 (0.393)
2048-bit 6.083 (0.277) 6.510 (0.081) 6.693 (0.043) 4.498 (0.503) 11.222 (1.502) 37.387 (0.393)
4096-bit 6.120 (0.268) 6.454 (0.119) 6.690 (0.043) 4.571 (0.727) 11.194 (1.395) 37.26 (0.307)





With the exception of 8 and 32 threads, the latency introduced by the use of SSL is constant at right around 5ms, regardless of the key length or the number of threads. I’m not surprised that there’s a large jump in latency at 32 threads, but I don’t have an immediate explanation for the improvement in the SSL latency numbers at 8 threads.

Test 2: Connection Time

For the second test, I wrote a simple Perl script to just connect and disconnect from the database as fast as possible. We know that it’s the connection setup which is the slowest part of SSL, and the previous test already shows us roughly what we can expect for SSL encryption overhead for sending data once the connection has been established, so let’s see just how much overhead SSL adds to connection time. The basic script to do this is quite simple (non-SSL version shown):

use DBI;
use Time::HiRes qw(time);
$start = time;
for (my $i=0; $i<100; $i++) {
  my $dbh = DBI->connect("dbi:mysql:host=;port=5613",
  undef $dbh;
printf "%.6f\n", time - $start;

As with test #1, I ran test #2 with no encryption and SSL encryption of 1024, 2048, and 4098 bits, and I conducted 10 trials of each configuration. Then I took the elapsed time for each test and converted it to connections per second. The graph below shows the results from each run:

Here are the averages and standard deviations:



Encryption Average connections per second Standard deviation
None 2701.75 165.54
1024-bit 77.04 6.14
2048-bit 28.183 1.713
4096-bit 5.45 0.015



Yes, that’s right, 4096-bit SSL connections are 3 orders of magnitude slower to establish than unencrypted connections. Really, the connection overhead for any level of SSL usage is quite high when compared to the unencrypted test, and it’s certainly much higher than my original quoted number of 25%.



Analysis and Parting Thoughts

So, what do we take away from this? The first thing is, of course, is that SSL overhead is a lot higher than 25%, particularly if your application uses anything close to the one-connection-per-request pattern. For a system which establishes and maintains long-running connections, the initial connection overhead becomes a non-factor, regardless of the encryption strength, but there’s still a rather large performance penalty compared to the unencrypted connection.

This leads directly into the second point, which is that connection pooling is by far a more efficient method of using SSL if your application can support it.

But what if connection pooling isn’t an option, MySQL’s SSL performance is insufficient, and you still need full encryption of data in-flight? Run the encryption component of your system at a lower layer – a VPN with hardware crypto would be the fastest approach, but even something as simple as an SSH tunnel or OpenVPN *might* be faster than SSL within MySQL. I’ll be exploring some of these solutions in a follow-up post.

And finally… when in doubt, run your own benchmarks. I don’t have an explanation for why the yaSSL numbers are so different from these (maybe yaSSL is a faster SSL library than openSSL, or maybe they used a different cipher – if you’re curious, the original 25% number came from slides 56-58 of this presentation), but in any event, this does illustrate why it’s important to run tests on your own hardware and with your own workload when you’re interested in finding out how well something will perform rather than taking someone else’s word for it.

The post SSL Performance Overhead in MySQL appeared first on MySQL Performance Blog.


MySQL Security Webinar: Follow-up Q&A

MySQL Security Webinar: Follow-up Q&AThanks to everyone who attended last week’s webinar on MySQL security; hopefully you’ve all gone out and set SELinux to enforcing mode if you weren’t already running that way. If you weren’t able to attend, the recording and slides are available for viewing/download. But now, without further ado, here are the questions which we didn’t have time to cover during the presentation.

Q: Do you have a favorite software firewall you recommend that I can run on an EC2 instance in front of my MySQL server?
A: I’d probably just do this with iptables. Any of the other Linux-based software firewall packages are all going to be wrappers around iptables anyway. However, if your MySQL server is already in EC2, you’re going to be better served by Amazon security groups as your front-line firewall. You can also run inside a VPC for some additional isolation.

Q: What do you use for disk encryption in the cloud environment?
A: I might use GPG if I need to encrypt specific files, or just encrypt data in the database via the MySQL functions or in my application, and there’s also no reason why you can’t use dm-crypt/LUKS in the cloud if you’re willing to accept that mounting the device will require some manual intervention, but for the most part, I think disk encryption for servers is pretty useless, because once you enter the key to unlock and mount the volume, anyone that can get access to that machine has full access to the data. I’ve heard of Gazzang being a possible solution here, but I’ve not personally looked at it, so I can’t speak to its performance or suitability.

Q: How much overhead does SSL put on a heavily-loaded MySQL box?
A: That depends on a lot of factors, such as the cipher in use and the kinds of load you’re running. The slowest part of SSL is the connection setup, so you’re going to see a lot less overhead for replication, connection pooling, or long-running transactions than you would from an application with rapidly connects to the database and then disconnects. There’s a graph on the yaSSL site which shows roughly a 25% penalty at 32 threads, but their benchmarks are from 2011 and they were run on someone’s laptop, so I have my doubts as to how well that translates to modern server-grade hardware.

Q: Can MySQL do LDAP/Kerberos login?
A: You can use the PAM authentication plugin (either the commercial one from Oracle or Percona’s open source version) to authenticate against an LDAP database.

Thanks again for attending and submitting your questions; security is one of those massive topics where it’s only possible to scratch the surface in a one-hour webinar. Later this year I may do a follow-on presentation wherein we skip over all of the system/network/application details and do a deeper dive only on MySQL security-related tweaks and best/worst practices, so stay tuned!

The post MySQL Security Webinar: Follow-up Q&A appeared first on MySQL Performance Blog.


MySQL Security: Armoring Your Dolphin

MySQL Security: Armoring Your DolphinMy colleague and teammate Ernie Souhrada will be presenting a webinar on Wednesday, August 21, 2013 at 10 a.m. PDT titled “MySQL Security: Armoring Your Dolphin.”

This is a popular topic with news breaking routinely that yet another Internet company has leaked private data of one form or another. Ernie’s webinar will be a great overview of security MySQL from top to bottom, including changes related to security in the 5.6 release.

Topics to be covered include:

  • Basic security concepts
  • Security above the MySQL layer (network, hardware, OS, etc.)
  • Tips for application design
  • A more secure MySQL configuration
  • Security-related changes in MySQL 5.6

Attendees will leave this presentation knowing where to start when identifying vulnerability in their systems.

Be sure to register for the webinar in advance!

The post MySQL Security: Armoring Your Dolphin appeared first on MySQL Performance Blog.


InnoDB Full-text Search in MySQL 5.6: Part 3, Performance

This is part 3 of a 3 part series covering the new InnoDB full-text search features in MySQL 5.6. To catch up on the previous parts, see part 1 or part 2

Some of you may recall a few months ago that I promised a third part in my InnoDB full-text search (FTS) series, in which I’d actually take a look at the performance of InnoDB FTS in MySQL 5.6 versus traditional MyISAM FTS. I hadn’t planned on quite such a gap between part 2 and part 3, but as they say, better late than never. Recall that we have been working with two data sets, one which I call SEO (8000-keyword-stuffed web pages) and the other which I call DIR (800K directory records), and we are comparing MyISAM FTS in MySQL 5.5.30 versus InnoDB FTS in MySQL 5.6.10.

For reference, although this is not really what I would call a benchmark run, the platform I’m using here is a Core i7-2600 3.4GHz, 32GiB of RAM, and 2 Samsung 256GB 830 SSDs in RAID-0. The OS is CentOS 6.4, and the filesystem is XFS with dm-crypt/LUKS. All MySQL settings are their respective defaults, except for innodb_ft_min_token_size, which is set to 4 (instead of the default of 3) to match MyISAM’s default ft_min_word_len.

Also, recall that the table definition for the DIR data set is:

CREATE TABLE dir_test (
  full_name VARCHAR(100),
  details TEXT

The table definition for the SEO data set is:

CREATE TABLE seo_test (
 title VARCHAR(255),

Table Load / Index Creation

First, let’s try loading data and creating our FT indexes in one pass – i.e., we’ll create the FT indexes as part of the original table definition itself. In particular, this means adding “FULLTEXT KEY (full_name, details)” to our DIR tables and adding “FULLTEXT KEY (title, body)” to the SEO tables. We’ll then drop these tables, drop our file cache, restart MySQL, and try the same process in two passes: first we’ll load the table, and then we’ll do an ALTER to add the FT indexes. All times in seconds.

Engine Data Set one-pass (load) two-pass (load, alter)
MyISAM SEO 3.91 3.96 (0.76, 3.20)
InnoDB SEO 3.777 7.32 (1.53, 5.79)
MyISAM DIR 43.159 44.93 (6.99, 37.94)
InnoDB DIR 330.76 56.99 (12.70, 44.29)

Interesting. For MyISAM, we might say that it really doesn’t make too much difference which way you proceed, as the numbers from the one-pass load and the two-pass load are within a few percent of each other, but for InnoDB, we have mixed behavior. With the smaller SEO data set, it makes more sense to do it in a one-pass process, but with the larger DIR data set, the two-pass load is much faster.

Recall that when adding the first FT index to an InnoDB table, the table itself has to be rebuilt to add the FTS_DOC_ID column, so I suspect that the size of the table when it gets rebuilt has a lot to do with the performance difference on the smaller data set. The SEO data set fits completely into the buffer pool, the DIR data set does not. That also suggests that it’s worth comparing the time required to add a second FT index (this time we will just index each table’s TEXT/MEDIUMTEXT field). While we’re at it, let’s look at the time required to drop the second FT index as well. Again, all times in seconds.

Engine Data Set FT Index Create Time FT Index Drop Time
MyISAM SEO 6.34 3.17
InnoDB SEO 3.26 0.01
MyISAM DIR 74.96 37.82
InnoDB DIR 24.59 0.01

InnoDB wins this second test all around. I’d attribute InnoDB’s win here partially to not having to rebuild the whole table with second (and subsequent) indexes, but also to the fact that at least some the InnoDB data was already in the buffer pool from when the first FT index was created. Also, we know that InnoDB generally drops indexes extremely quickly, whereas MyISAM requires a rebuild of the .MYI file, so InnoDB’s win on the drop test isn’t surprising.

Query Performance

Recall the queries that were used in the previous post from this series:

1. SELECT id, title, MATCH(title, body) AGAINST ('arizona business records'
   IN NATURAL LANGUAGE MODE) AS score FROM seo_test_{myisam,innodb} ORDER BY 3
2. SELECT id, title, MATCH(title, body) AGAINST ('corporation commission forms'
   IN NATURAL LANGUAGE MODE) AS score FROM seo_test_{myisam,innodb} ORDER BY 3 DESC
   LIMIT 5;
3. SELECT id, full_name, MATCH(full_name, details) AGAINST ('+james +peterson +arizona'
   IN BOOLEAN MODE) AS score FROM dir_test_{myisam,innodb} ORDER BY 3 DESC LIMIT 5;
4. SELECT id, full_name, MATCH(full_name, details) AGAINST ('+james +peterson arizona'
   IN BOOLEAN MODE) AS score FROM dir_test_{myisam,innodb} ORDER BY 3 DESC LIMIT 5;
5. SELECT id, full_name, MATCH(full_name, details) AGAINST ('"Thomas B Smith"'
   IN BOOLEAN MODE) AS score FROM dir_test_{myisam,innodb} ORDER BY 3 DESC LIMIT 1;

The queries were run consecutively from top to bottom, a total of 10 times each. Here are the results in tabular format:

Query # Engine Min. Execution Time Avg. Execution Time Max. Execution Time
1 MyISAM 0.007953 0.008102 0.008409
1 InnoDB 0.014986 0.015331 0.016243
2 MyISAM 0.001815 0.001893 0.001998
2 InnoDB 0.001987 0.002077 0.002156
3 MyISAM 0.000748 0.000817 0.000871
3 InnoDB 0.670110 0.676540 0.684837
4 MyISAM 0.001199 0.001283 0.001372
4 InnoDB 0.055479 0.056256 0.060985
5 MyISAM 0.008471 0.008597 0.008817
5 InnoDB 0.624305 0.630959 0.641415

Not a lot of variance in execution times for a given query, so that’s good, but InnoDB is always coming back slower than MyISAM. In general, I’m not that surprised that MyISAM tends to be faster; this is a simple single-threaded, read-only test, so none of the areas where InnoDB shines (e.g., concurrent read/write access) are being exercised here, but I am quite surprised by queries #3 and #5, where InnoDB is just getting smoked.

I ran both versions of query 5 with profiling enabled, and for the most part, the time spent in each query state was identical between the InnoDB and MyISAM versions of the query, with one exception.

InnoDB: | Creating sort index | 0.626529 |
MyISAM: | Creating sort index | 0.014588 |

That’s where the bulk of the execution time is. According to the docs, this thread state means that the thread is processing a SELECT which required an internal temporary table. Ok, sure, that makes sense, but it doesn’t really explain why InnoDB is taking so much longer, and here’s where things get a bit interesting. If you recall part 2 in this series, query 5 actually returned 0 results when run against InnoDB with the default configuration because of the middle initial “B”, and I had to set innodb_ft_min_token_size to 1 in order to get results back. For the sake of completeness, I did that again here, then restarted the server and recreated my FT index. The results? Execution time dropped by 50% and ‘Creating sort index’ didn’t even appear in the query profile:

mysql [localhost] {msandbox} (test): SELECT id, full_name, MATCH(full_name, details) AGAINST
('"Thomas B Smith"' IN BOOLEAN MODE) AS score FROM dir_test_innodb ORDER BY 3 DESC LIMIT 1;
| id    | full_name      | score             |
| 62633 | Thomas B Smith | 32.89915466308594 |
1 row in set (0.31 sec)
mysql [localhost] {msandbox} (test): show profile;
| Status                  | Duration |
| starting                | 0.000090 |
| checking permissions    | 0.000007 |
| Opening tables          | 0.000017 |
| init                    | 0.000034 |
| System lock             | 0.000012 |
| optimizing              | 0.000008 |
| statistics              | 0.000027 |
| preparing               | 0.000012 |
| FULLTEXT initialization | 0.304933 |
| executing               | 0.000008 |
| Sending data            | 0.000684 |
| end                     | 0.000006 |
| query end               | 0.000006 |
| closing tables          | 0.000011 |
| freeing items           | 0.000019 |
| cleaning up             | 0.000003 |

Hm. It’s still slower than MyISAM by quite a bit, but much faster than before. The reason it’s faster is because it found an exact match and I only asked for one row, but if I change LIMIT 1 to LIMIT 2 (or limit N>1), then ‘Creating sort index’ returns to the tune of roughly 0.5 to 0.6 seconds, and ‘FULLTEXT initialization’ remains at 0.3 seconds. So this answers another lingering question: there is a significant performance impact to using a lower innodb_ft_min_token_size (ifmts), and it can work for you or against you, depending upon your queries and how many rows you’re searching for. The time spent in “Creating sort index” doesn’t vary too much (maybe 0.05s) between ifmts=1 and ifmts=4, but the time spent in FULLTEXT initialization with ifmts=4 was typically only a few milliseconds, as opposed to the 300ms seen here.

Finally, I tried experimenting with different buffer pool sizes, temporary table sizes, per-thread buffer sizes, and I also tried changing from Antelope (ROW_FORMAT=COMPACT) to Barracuda (ROW_FORMAT=DYNAMIC) and switching character sets from utf8 to latin1, but none of these made any difference. The only thing which seemed to provide a bit of a performance improvement was upgrading to 5.6.12. The execution times for the InnoDB FTS queries under 5.6.12 were about 5-10 percent faster than with 5.6.10, and query #2 actually performed a bit better under InnoDB than MyISAM (average execution time 0.00075 seconds faster), but other than that, MyISAM still wins on raw SELECT performance.

Three blog posts later, then, what’s my overall take on InnoDB FTS in MySQL 5.6? I don’t think it’s great, but it’s serviceable. The performance for BOOLEAN MODE queries definitely leaves something to be desired, but I think InnoDB FTS fills a need for those people who want the features and capabilities of InnoDB but can’t modify their existing applications or who just don’t have enough FTS traffic to justify building out a Sphinx/Solr/Lucene-based solution.

The post InnoDB Full-text Search in MySQL 5.6: Part 3, Performance appeared first on MySQL Performance Blog.


Choosing a MySQL HA Solution – Post-Webinar Q&A

Percona MySQL webinar Q&AThanks to everyone who was in attendance on 05 June 2013 for my “Choosing a MySQL HA Solution” webinar. If you weren’t able to make it but are interested in listening to the presentation, it’s currently up and available for viewing over at percona.com.

My apologies if we weren’t able to get to your question during the initial session, so I’ll address those lingering questions in this post, along with providing a bit more detail on some of the questions that I did cover during the session.

Q: What is the reason that I recommended DRBD be used only on physical hardware and not on virtual machines?
A: I covered this a bit during the session, but to provide a bit more commentary. There are really two main reasons that I don’t like DRBD with virtual machines. First is the disk IO performance hit that comes with DRBD. When you run a virtual machine, in most cases your virtual disk is basically nothing more than a file on the host filesystem rather than an actual physical volume. So, in effect, you’re adding performance hit to performance hit. The other reason is because virtual machines tend to handle process scheduling and timing much differently than traditional hardware, and I’ve seen situations where even under just a moderate load on the host system, DRBD can end up in a split-brain situation. Typically DRBD is run in combination with heartbeat or pacemaker or something similar, and if one of the servers in the DRBD pair is having issues receiving packets from the other side, things start to degrade.

If you’re just setting up a test environment to get familiar with the technology, certainly there’s nothing wrong with a virtual-machine approach, but for the optimal production deployment you want two identical physical boxes with at least 4 NIC ports. Why 4? Two of those NICs should be directly connected from one machine to another and configured with Linux network interface bonding in balance-rr mode; this is the only NIC bonding mode which will allow you to stripe a TCP connection over multiple ports, and with a two-NIC bonded pair you’ll get roughly 1.67x the throughput of a single port. [Don’t try more than 2; the additional work that the kernel has to do in reordering TCP packets can actually result in performance that’s worse than a single NIC.] The other two NIC ports should be bonded with active-backup or LACP (depending on your switching infrastructure). In essence, you’re looking for two of everything.

Q: Is it possible to replicate only some tables from master to slave, and if so, how?
A: Yes. There are a few of ways to do it. One way is to set up replication filters on the slave, as described in the MySQL manual. You can configure the slave to only replicate a specific database, a set of databases, or specific tables; you can also configure the slave to replicate everything EXCEPT a specified set of databases and tables. With this method, every event is still written to the binary log on the master, but the slave determines what to do with it. The other approach is to filter what gets written to the binary log on the master. This can be useful if you’re trying to limit the amount of data sent over the wire, but it comes at the cost of having binary logs which are no longer complete or useful for point-in-time recovery. Generally if you’re considering the binary log filtering approach, I think it’s safer to set up a relay master (it can be on the same machine if you use the BLACKHOLE storage engine) in between the actual master and the slaves at the bottom of the replication topology. On the top-level master, you leave all filters disabled; on the relay master you add binary log filters; this ensures that the top-level master and its binary logs are fully intact, and then the binary log filters are executed on the intermediate server, thus resulting in less binary log data being sent down to the lower-level slaves. You can also use the slave_compressed_protocol option in /etc/my.cnf if bandwidth is a concern.

If you’re thinking about employing a filtering solution of any sort and it’s something you’re not that familiar with, I’d suggest reading over the manual’s description of how replication filtering rules are processed. There are some informative flowcharts in addition to the textual explanation.

Q: What is my recommendation for MySQL 5.6? Is MHA ready for MySQL 5.6?
A: My personal recommendation for MySQL 5.6, based on what I have seen so far (admittedly, not that much – I have personally only worked with one customer that’s using 5.6), is that I question its readiness. There have been some unfortunate performance regressions compared to 5.5, such as bug #69258, and bug #69318, and I think it might take one or two more point releases before we can consider it truly production-ready. I suppose I should state an obvious disclaimer here in that the aforementioned statements reflect only my opinion and not any official Percona position, and I will look forward to being able to retract them.

MHA should work fine with 5.6 if you’re not using GTID-based replication. If you are using GTID-based replication, it looks like you’ll need to wait for the next release of MHA.

Q: How do you monitor MySQL to determine when it’s not responsive?
A: The easiest way to do this is to simply connect to it and attempt to run a simple query. I’d suggest doing this as a user that does NOT have the SUPER privilege; a user with SUPER will always be able to connect, and that may not accurately report what your application servers are seeing. But, if you can connect and run a simple query and receive a response back in a reasonable (what is reasonable is determined by your environment), it’s a safe bet that MySQL is up and running.

That said, there are plenty of other MySQL and OS-level status variables that you might want to keep an eye on to prevent a problem before it occurs. For example, if you see a large number of long-running queries in the output of SHOW PROCESSLIST that might be a sign that trouble is brewing. Or if you watch your MySQL server’s memory usage and you see that it’s starting to swap, that might indicate that something is up. FWIW, solutions like MHA and PRM simply attempt to connect and run a simple query, but it’s always possible to build something more involved if that’s what you need.

Q: What MySQL HA solution is most similar to Oracle RAC?
A: Oracle RAC is a “shared everything” system; there really isn’t anything quite like that in the MySQL world, except maybe MyISAM files on top of a clustered filesystem with an external lock manager. You might say that MySQL/NDB Cluster is also somewhat similar to Oracle RAC; all of the SQL nodes in a MySQL Cluster installation are going to be talking to the same set of data nodes on the back end, but I think that’s probably where the similarities end.

Q: What kind of replication issues can you get with storage solutions like GlusterFS?
A: I have to admit that this was probably the most interesting question asked; my initial reaction was, honestly, why would you ever want to use MySQL on top of something like that? I can’t imagine the performance being all that fantastic, and I could see network latency wreaking havoc with MySQL’s internal understanding of what’s happening on the system below. Having never experimented with GlusterFS, though, I decided to give it a shot. I set up a 3 node cluster in AWS, got the volume mounted, and went to try to install MySQL. It failed miserably on the mysql_install_db process.

Every time I tried, I got some variant of this message:
130606 2:22:41 [ERROR] /usr/libexec/mysqld: Incorrect information in file: ‘./mysql/servers.frm’
ERROR: 1033 Incorrect information in file: ‘./mysql/servers.frm’

Running mysql_install_db under strace, I saw that it was getting a lot of “bad file descriptor” errors. I was able to get the server up with –skip-grant-tables and insert a few rows into an InnoDB table, but trying to run a simple mysqlslap just hung on me, so that’s where I left it. Maybe I’ll try messing around with this again at some point in the future, but I’m not optimistic that this is a viable use case. If someone can prove me wrong, I’d be interested in knowing how you’ve set it up.

The post Choosing a MySQL HA Solution – Post-Webinar Q&A appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com