Feb
27
2012
--

Announcing Percona XtraBackup 1.9.1

Percona is glad to announce the second beta release towards Percona XtraBackup 2.0! We call this version 1.9.1 as we are using the 1.9.x version numbers for betas of 2.0 to clearly differentiate beta releases from the final 2.0 release. We do not recommend using Beta releases in production. Link to BETA version and Release notes.

If you are looking for a stable release for production use, check out our latest stable release (which was 1.6.5 at the time of writing).

This beta adds some really exciting features and bug fixes.

New features include:

  • Backup and restore of LRU dump.
    For servers that support  dump and restore of the XtraDB buffer pool (LRU), namely Percona Server, this will enable a server restored from backup to give excellent performance without a lengthy “warm up” time. If provisioning a slave, this means your new slave will end up with a warmed up buffer pool to mimic that of the master.
  • Streaming incremental backups.
    It is now possible to stream incremental backups, lifting a limitation from previous versions where it was only possible to stream full backups.
  • Parallel compression
    Percona XtraBackup can now use multiple CPUs to perform a compressed backup, significantly reducing the amount of time needed.
  • Parallel compression (‘--compress-threads‘) can be used together with parallel file copying (‘--parallel‘). For example, ‘--parallel=4 --compress --compress-threads=2‘ will create 4 IO threads that will read the data and pipe it to 2 compression threads.
  • New streaming format to work with parallel compression.
    The xbstream format is now supported along with the tar format. The new xbstream format works with parallel compression to (without caching) stream several compressed files at once.
  • tar4ibd functionality is replaced by functionality in the xtrabackup binary itself.
  • The xbstream utility is added
    This utility has a tar like interface to the xbstream format.
Feb
25
2012
--

Introducing new type of benchmark

Traditionally the most benchmarks are focusing on throughput. We all get used to that, and in fact in our benchmarks, sysbench and tpcc-mysql, the final result is also represents the throughput (transactions per second in sysbench; NewOrder transactions Per Minute in tpcc-mysql). However, like Mark Callaghan mentioned in comments, response time is way more important metric to compare.

I want to pretend that we pioneered (not invented, but started to use widely) a benchmark methodology when we measure not the final throughput, but rather periodic probes (i.e. every 10 sec).
It allows us to draw “stability” graphs, like this one

where we can see not only a final result, but how the system behaves in dynamic.

What’s wrong with existing benchmarks?

Well, all benchmarks are lie, and focusing on throughput does not get any closer to reality.

Benchmarks, like sysbench or tpcc-mysql, start N threads and try to push the database as much as possible, bombarding the system with queries with no pause.

That rarely happens in real life. There are no systems that are pushed to 100% load all time.

So, how we can model it? There are different theories, and the one which describes user’s behavior, is Queueing theory. In short we can assume that users send requests with some arrival rate (which can be different in the different part of day/week/month/year though). And what is important for an end user is response time, that is how long the user has to wait on results. E.g. when you go to your Facebook profile or Wikipedia page, you expect to get response within second or two.

How we should change the benchmark to base on this model ?
There are my working ideas:

  • Benchmark starts N working threads, but they all are idle until asked to handle a request
  • Benchmarks generates transactions with a given rate, i.e. M transactions per second and puts into a queue. The interval between arrivals is not uniform, but rather distributed by Exponential distribution, with ? = M. That how it goes if to believe to the Poisson process.
    For example, if our target is arrival rate 2 queries per second, then exponential distribution will give us following intervals (in sec) between events: 0.61, 0.43, 1.55, 0.18, 0.01, 0.76, 0.09, 1.26, …

    Or if we represent graphically (point means even arrival):

    As you see interval is far from being strict 0.5 sec, but 0.5 is the mean of this random generation function. On the graph you see 20 events arrived within 9 seconds.

  • Transactions from the queue are handled by one of free threads, or are waiting in the queue until one of threads are ready. The time waiting in the queue is added to a total response time.
  • As a result we measure 95% or 99% response times.

What does it give to us? It allows to see:

  • What is the response time we may expect having a given arrival rate
  • What is the optimal number of working threads (the one that provides best response times)

When it is useful?

At this moment I am looking to answer on questions like:
– When we add additional node to a cluster (e.g. XtraDB Cluster), how does it affect the response time ?
– When we put a load to two nodes instead of three nodes, will it help to improve the response time ?
– Do we need to increase number of working threads when we add nodes ?

Beside cluster testing, it will also help to see an affect of having a side on the server. For example, the famous problem with DROP TABLE performance. Does DROP TABLE, running in separate session, affect a response time of queries that handle user load ? The same for mysqldump, how does it affect short user queries ?

In fact I have a prototype based on sysbench. It is there lp:~vadim-tk/sysbench/inj-rate/. It works like a regular sysbench, but you need to specify the additional parameter tx-rate, which defines an expected arrival rate (in transactions per second).

There are some results from my early experiments. Assume we have an usual sysbench setup, and we target an arrival rate as 3000 transactions per second (regular sysbench transactions). We vary working threads from 1 to 128.

There are results for 16-128 threads (the result is 99% response time, taken every 10 sec. the less is better)

We can see that 16 threads give best 99% response time (15.74ms final), 32 threads: 16.75 ms, 64 threads: 25.14ms.
And with 128 threads we have pretty terrible unstable response times, with 1579.91ms final.
That means that 16-32 threads is probably best number of concurrently working threads (for this kind of workload and this arrival rate).

Ok, but what happens if we have not enough working threads? You can see it from following graph (1-8 threads):

The queue piles up, waiting time grows, and the final response time grows linearly up to ~30 sec, where benchmark stops, because the queue is full.

I am looking for your comments, do you find it useful?


Feb
24
2012
--

Percona XtraDB Cluster Events

We are on our way to delivery a production ready release of Percona XtraDB Cluster by the end March-2012.
We will do couple events to cover features and architecture of XtraDB Cluster.

So if you interested to hear or read more on this topic, there is good deal: Register for conference before 1-Mar-2012 with code PL-Book, and you will receive our book for free (in the additional to ridiculously low early-bird prices).


Feb
23
2012
--

Faster Point In Time Recovery with LVM2 Snaphots and Binary Logs

LVM snapshots is one powerful way of taking a consistent backup of your MySQL databases – but did you know that you can now restore directly from a snapshot (and binary logs for point in time recovery) in case of that ‘Oops’ moment? Let me show you quickly how.

This howto assumes that you already have a decent know how of LVM and snaphots and using LVM2 >= 2.02.58 (January 2010) which is when the --merge option was made available to the lvconvert binary. Base installs of Ubuntu 11.04 (Natty) and CentOS 5.7 includes packages with this feature, previous releases might also include them via supplemental repositories i.e. updates on CentOS. If you are using InnoDB, it is also important that your transaction logs (ib_logfile*) are on the same logical volume, if not, you could potentially trigger crash recovery when an LSN mismatch occurs and still end up with inconsistent data.

Now, assuming I have the following logical volumes – mysql-data for my datadir, and mysql-logs for my binary logs. I also have  the latest snapshot of the mysql-data logical volume taken named ‘mysql-data-201202230157‘ using a script* I put together to make sure I have a consistent snapshot of the MySQL data files. Restoring snapshot alone may not be enough since there can be a lot more events from the time of the snapshots until you discover the problem, so it is really important that you have your binary logs on a different LV or copied it someplace else if they are on the same LV before we restore.

[root@sb logs]# /root/bin/lvmsnap.sh snapshot
Taking a new snapshot ..
done

Trimming excess snapshots ..
  Logical volume "mysql-data-201202230135" successfully removed
done

  LV                      VG       Attr   LSize  Origin     Snap%  Move Log Copy%  Convert
  lv_root                 VolGroup -wi-ao  5.54g                                          
  lv_swap                 VolGroup -wi-ao  1.97g                                          
  mysql-data              sb       owi-ao 20.00g                                          
  mysql-data-201202230150 sb       swi-a-  5.00g mysql-data   0.00                        
  mysql-data-201202230153 sb       swi-a-  5.00g mysql-data   0.00                        
  mysql-data-201202230155 sb       swi-a-  5.00g mysql-data   0.00                        
  mysql-data-201202230157 sb       swi-a-  5.00g mysql-data   0.00                        
  mysql-logs              sb       -wi-ao 10.00g

Suppose sometime after the snapshot, I accidentally dropped the salaries tables from the employees database! (Oops, I can have an angry mob of employees who may not get their salaries on time!).

mysql> delete from salaries where emp_no = 10001;
Query OK, 17 rows affected (0.15 sec)

mysql> drop table salaries; -- Ooops!
Query OK, 0 rows affected (0.49 sec)

mysql> alter table employees add column age smallint unsigned not null default 0;
Query OK, 300024 rows affected (13.28 sec)
Records: 300024  Duplicates: 0  Warnings: 0

mysql> update employees set age = ((YEAR(CURDATE())-YEAR(birth_date)) - (RIGHT(CURDATE(),5)<RIGHT(birth_date,5)));
Query OK, 300024 rows affected (6.51 sec)
Rows matched: 300024  Changed: 300024  Warnings: 0

With the last snapshot I have above, I should be able to restore up to before the first DELETE statement above. If you use my LVM snapshot script*, it also saves the binary log coordinates when the snapshot was taken and saves it into a file specified as variable on the script. Below is the binary log coordinates for when the last snapshot on my list above.

[root@sb logs]# cat mysql-data-201202230157-binlog-info
File	Position	Binlog_Do_DB	Binlog_Ignore_DB
mysql-bin.000022	336796712

Using the coordinates above, I can start searching for the position of the DELETE statement so we can skip that after the snapshot restart. Using the below command and some inline searches, I was able to pinpoint the position of the delete statement, it is at 336797160.

[root@sb logs]# mysqlbinlog --verbose --base64-output=decode-rows --start-position 336796712 mysql-bin.000022
...
# at 336797160
#120223  1:59:55 server id 1  end_log_pos 336797275 	Query	thread_id=47	exec_time=1	error_code=0
SET TIMESTAMP=1329980395/*!*/;
DROP TABLE `salaries` /* generated by server */
...

Now, let’s restore our data from the snapshot, under the hood restore is really a “rollback” to the snapshot state when it was taken. We will do this using lvconvert’s –merge option, to merge the state of the snapshot to the original LV.

[root@sb logs]# /etc/init.d/mysql stop
Shutting down MySQL (Percona Server).....                  [  OK  ]
[root@sb logs]# umount /mysql/data
[root@sb logs]# lvconvert --merge /dev/sb/mysql-data-201202230157
  Merging of volume mysql-data-201202230157 started.
  mysql-data: Merged: 2.3%
  mysql-data: Merged: 0.2%
  mysql-data: Merged: 0.0%
  Merge of snapshot into logical volume mysql-data has finished.
  Logical volume "mysql-data-201202230157" successfully removed
[root@sb logs]# mount /mysql/data
[root@sb logs]# /etc/init.d/mysql start
Starting MySQL (Percona Server)                            [  OK  ]

You should shutdown MySQL first, then unmount the logical volume holding the MySQL data. This way you don’t have to deactivate/activate the original logical volume to start the merging. So let’s see if our salaries table is restored.

mysql> SHOW TABLES;
+---------------------+
| Tables_in_employees |
+---------------------+
...
| salaries            |
...
+---------------------+
6 rows in set (0.05 sec)

Success! Your salaries data is back. But, we still have to apply the data after the snapshot, skipping the DROP statement. You should take another snapshot now – in case you missed to skip the DROP statement! I know for a fact that for every MySQL restart, the logs are flushed and a new binary log is created, looking at the current binary logs after restoring the snapshot, I know I have to apply mysql-bin.00022 only starting from position 336796712 and skipping the DROP statement at position 336797160:

mysql> show binary logs;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
...
| mysql-bin.000022 | 336797725 |
| mysql-bin.000023 |       107 |
+------------------+-----------+
23 rows in set (0.00 sec)

mysql> select count(*) from salaries where emp_no = 10001;
+----------+
| count(*) |
+----------+
|       17 |
+----------+
1 row in set (0.00 sec)

[root@sb logs]# mysqlbinlog --verbose --base64-output=decode-rows --start-position 336796712 --stop-position 336797133 mysql-bin.000022 | mysql
[root@sb logs]# mysqlbinlog --verbose --base64-output=decode-rows --start-position 336797275 mysql-bin.000022 | mysql

mysql> select count(*) from salaries where emp_no = 10001;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

mysql> show tables;
+---------------------+
| Tables_in_employees |
+---------------------+
...
| salaries            |
...
+---------------------+
6 rows in set (0.00 sec)

So, as you can see, I now have a consistent data and still have my salaries table back.

Because leaving a production server with active snapshots can affect performance, this is not really an advisable backup solution. If your server somehow blew up in flames, hardware problems or encounter and LVM bug, your snapshots are useless. Also, you cannot test restore your snapshots – they are one time use!

However, if you can tolerate the extra IO overhead i.e. development or staging server , then this is still a valid backup method of course on top of your regularly tested (offsite) backups.

Another ideal use case for this method is when you are planning to execute a long running ALTER or server upgrade, this method can be good quick rollback procedure in case something fails during the operation.

* While writing this blog I hacked a quick shell script to create snapshots and uploaded it here. By no means it is perfect, you can use `lvmsnap.sh snapshot` to create snapshots. The restore functionality is not finished but you can use it for quick testing :) i.e. `lvmsnap.sh restore <snapshot-timestamp>`

Feb
23
2012
--

Percona XtraDB Cluster 5.5.20 – Beta release

I am happy to announce the availability of beta release of our new product Percona XtraDB Cluster.

Percona XtraDB Cluster is High Availability and Scalability solution for MySQL Users and the beta release is based on Percona Server 5.5.20 and the recently released Galera 2.0 GA

The main focus in this release:

  • Incremental State Transfer, especially useful for WAN deployments
  • Support of Percona XtraBackup for Snapshot State Transfer. This feature is available in pre-release series for XtraBackup 2.0

Despite Galera 2.0 has status “production ready”, we still name our release “beta”, as
we want to give an additional attention on an integration of Galera with Percona Server features and
on using XtraBackup for SST. Please test it if you are going to try XtraDB Cluster.

Percona XtraDB Cluster provides:

  • Synchronous replication. Transaction either commited on all nodes or none.
  • Multi-master replication. You can write to any node.
  • Parallel applying events on slave. Real “parallel replication”.
  • Automatic node provisioning.
  • Data consistency. No more unsyncronised slaves.

Percona XtraDB Cluster is fully compatible with MySQL or Percona Server in the following meaning:

  • Data compatibility. Percona XtraDB Cluster works with databases created in MySQL / Percona Server
  • Application compatibility. There is no or minimal application changes required to start work with Percona XtraDB Cluster

Please note, this is a beta release, not suitable for production yet. Bugs and some rough edges are expected. We are encouraging you to try and play with it in testing environment and report us your feedback and experience.

We expect to have a production-ready release by the end of March 2012.

Links:

Previous posts on this topic:


Feb
23
2012
--

Black-Box MySQL Performance Analysis with TCP Traffic

For about the past year I’ve been formulating a series of tools and practices that can provide deep insight into system performance simply by looking at TCP packet headers, and when they arrive and depart from a system. This works for MySQL as well as a lot of other types of systems, because it doesn’t require any of the contents of the packet. Thus, it works without knowledge of what the server and client are conversing about. Packet headers contain only information that’s usually regarded as non-sensitive (IP address, port, TCP flags, etc), so it’s also very easy to get access to this data even in highly secure environments.

I’ve finally written up a paper that shows some of my techniques for detecting problems in a system, which can be an easy way to answer questions such as “is there something we should look into more deeply?” without launching a full-blown analysis project first. It’s available from the white paper section of our website: MySQL Performance Analysis with Percona Toolkit and TCP/IP Network Traffic

Feb
23
2012
--

Percona Server vs MySQL on Intel 320 SSD

If you are terrified by the stability of the results in MySQL in my previous post, I am going to show what we can get with Percona Server. This is also to address the results presented there Benchmarking MariaDB-5.3.4

The initial benchmark is described in Benchmarks of Intel 320 SSD 600GB, and the result for MySQL 5.5.20 in case with 4 (46GB of data) and 16 tables (184GB of data) you can see in my experiments with R graphics.

How do we solve it in Percona Server ? There is whole set of improvement we made, like:

  • Big log files
  • Tuned flushing algorithm
  • Disable flushing of neighbor pages

and the configuration to provide better experience on SSD is :

innodb_flush_neighbor_pages = 0
innodb_adaptive_flushing_method  = keep_average
innodb_log_block_size = 4096
innodb_log_file_size = 4G

Versions: MySQL 5.5.20, Percona Server 5.5.19

With these settings we have following results:

As you see with Percona Server we have stable and predictable lines.

Now, how to compare these results ?
If we draw next boxplot:

and compare the average (middle line inside box) for whole 1h run, we may get impression that average throughput for Percona Server is worse, because averages for 16 tables are:

  • MySQL: 3658 tps
  • Percona Server: 3487 tps

and now if you draw a column plot with these results, you will get something like:

One, looking on this graph, may come to the conclusion: wow, there is a regression in Percona Server.

But if we cut of first 1800 sec, to exclude warmup period, the average will be different:

  • MySQL: 3746 tps
  • Percona Server: 3704 tps

And for comparison, average throughput for 4 tables:

  • MySQL: 3882 tps
  • Percona Server: 6735 tps

The Percona Server is still slower, but you say me, would you rather prefer a stable throughput or sporadic jumps ? Furthermore, there is a way to improve throughput in Percona Server: increase innodb_log_file_size.

There are stability timeline for Percona Server with innodb_log_file_size=8GB

And to aggregate results and provide final numbers, jitter (after initial warmup 1800 sec)

So, in the conclusion, you can see that with a proper tuning, Percona Server/XtraDB outperforms MySQL, and provides a more stable throughput. Of course if a tuning is too hard to figure it out, you always can fall back to the vanilla InnoDB-plugin, like MariaDB suggests in Benchmarking MariaDB-5.3.4.

Raw results and scripts are on Benchmarks Launchpad


Feb
22
2012
--

Some fun with R visualization

My previous post I finished with the graph with unstable results.

There I won’t analyze causes, but rather I want to show some different ways to present results.

I enjoy working with R, and though I am not even close to be proficient in it, I want to share some graphs you can build with R + ggplot2.

The conditions of the benchmark are the same as in the previous post, with difference there are results for 4 and 16 tables cases running MySQL 5.5.20.

Let me remind how I do measurements. I run benchmark for 1 hours, with measurements every 10 seconds.
So we have 360 points – metrics.

If we draw them all, it will look like:

I will also show my R code how to make it

m <- ggplot(dv.ver,
            aes(x = sec, Throughput, color=factor(Tables)))
m + geom_point()

The previous graph is not very representative, so we may add some lines to see a trend.

m + geom_point() + geom_line()

This looks better, but still you may have hard time answering: which case shows the better throughput? what number we should take as the final result?

Jitter graph may help:

m <- ggplot(dv.ver,
            aes(x = factor(Tables), Throughput, color=factor(Tables)))
m + geom_jitter(alpha=0.75)

With jitter we see some dense areas, which shows "most likely" throughput.

So let's build density graphs:

m <- ggplot(dd,
            aes(x = Throughput,fill=factor(Tables)))
m+geom_density(alpha = 0.7)

or

m+geom_density(alpha = 0.7)+facet_wrap(~Tables,ncol=1)

In these graphs Axe X is Throughput and Axe Y represents density of hitting given Throughput.
That may give you an idea how to compare both results, and that the biggest density is around 3600-3800 tps.

And we are moving to numbers, we can build boxplots:

m <- ggplot(dd,
            aes(x = factor(Tables),y=Throughput,fill=factor(Tables)))
m+geom_boxplot()

That may be not easy to read if you never saw boxplots. There is good reading on this way to represent data. In short - the middle line inside a box is median (line that divides top 50% and bottom 50%),
the line that limits the top of a box - 75% quantile (divides 75% bottom and 25% top results), and correspondingly
- the line at the bottom of a box - 25% quantile (you should have an idea already what does that mean).
You may decide what measurements you want to take to compare the results - median, 75%, etc.

And finally we can combine jitter and boxplot to get:

m <- ggplot(dd,
            aes(x = factor(Tables),y=Throughput,color=factor(Tables)))
m+geom_boxplot()+geom_jitter()

That's it for today.
The full script sysbench-4-16.R with data you can get on benchmarks launchpad

If you want to see more visualizations idea, you may check out Brendan's blog:

And, yes, if you wonder what to do with such unstable results in MySQL - stay tuned. There is a solution.


Feb
22
2012
--

Benchmarks of Intel 320 SSD 600GB

I have a chance to test a system with Intel 320 SSD drives (NewRelic provided me with an access to the server), and compare performance with SAS hard drives.

System specification

  • Dell PowerEdge R610
  • Memory: 48GB
  • CPU: Intel(R) Xeon(R) CPU X5650
  • RAID controller: Perc H800
  • RAID configuration: RAID 5 over 11 disks + 1 hot spare. RAID 5 is chosen for space purposes. In this configuration using 600GB disk, we can get 5.5T of useful space
  • Intel drives: Intel 320 SSD 600GB
  • HDD drives: Seagate Cheetah 15K 600GB 16MB Cache SAS
  • Filesystem: XFS, mkfs.xfs -s size=4096, mount -o nobarrier

Benchmark:
For the benchmark I took a sysbench uniform oltp rw workload. 256 tables, 50mil rows each, which gives in total 3T of data.
To vary a ratio memory/data I will vary an amount of tables from 256 (3TB) to 32 (375GB).
As a backend database I use Percona Server 5.5.19.

I should mention that on these datasizes, sysbench workload is pretty nasty, MySQL will mostly reads and writes pages from buffer pool (replacing pages in buffer pool). This however allows us to see the best possible scenario for SSD running under MySQL, the final result will show the best possible gain.
I do measurements every 10 sec to see stability of results.

Graphical result:

Tabular:

Tables HDD SDD Ratio
32   1226 1644 1.340946
64    140  571 4.078571
96    101  506 5.009901
128    89  486 5.460674
192    79  484 6.126582
256    75  495 6.600000

As you can see, on the big datasizes we have 5-6x improvement. However on 32 tables (375GB of data), the result became unstable.

There is a graph with time series with 10 sec measurements.

It looks like we are having symptoms of the flushing problem. This is to investigate later.

The scripts and raw results are on Benchmarks Launchpad.


Feb
22
2012
--

How to Monitor MySQL with Percona’s Nagios Plugins

In this post, I’ll cover the new MySQL monitoring plugins we created for Nagios, and explain their features and intended purpose.

I want to add a little context. What problem were we trying to solve with these plugins? Why yet another set of MySQL monitoring plugins?

The typical problem with Nagios monitoring (and indeed with most other monitoring systems) is the alert spam. I’ve spoken to a lot of users with monitoring systems over the years, and everyone says that they get a lot of false alarms, and that they have email filters set up for alerts. Recently I spoke to one person who gets about thirty thousand alerts per week (!!!) in his email. I’m sure you can see that this makes the monitoring system much less effective. The usual cause of such noisy monitoring is when you alert on things that don’t really indicate problems.

To avoid this problem as much as possible, I suggest the following:

  1. Alert only on things that reliably indicate a real problem. Do not alert on a threshold that merely indicates something is abnormal; abnormal things happen constantly. It’s useful to spend time investigating abnormalities, but in general, you should try to make sure that doesn’t creep into your fault-detection alerts.
  2. Alert only on things that are actionable. If you get an alert and you can’t do anything about it, that’s a certain recipe for a /dev/null email filter.
  3. Avoid ratios and thresholds as much as possible, because they are rarely actionable or reliable indicators of problems, and it’s difficult or impossible to find the right threshold for all situations. This suggestion is a consequence of the previous two, but it’s worth calling out separately in my opinion. I’ll highlight a couple of threshold-based alerts later in this blog post, and explain the need to be careful with them.

Although we’re not finished with the plugins, we have a usable starting point, and we’ll build on it as time passes. The plugins we created are based on our analysis of several hundred customer downtime incidents, and their causes and preventions. You can read more about that research in our white papers, here and here.

If you read those white papers, you’ll see a pretty extensive list of suggestions for what to monitor in MySQL. Those are only suggestions; any given deployment might want to monitor only some of them, as appropriate, and might need to include others as well. Here’s how you can use our new monitoring plugins to check for some of the conditions that caused downtime for our customers:

  • Failed or full LVM snapshots. LVM backup strategies can fail to delete snapshots after finishing, sometimes due to the backup script failing before completion. Additionally, LVM snapshots that don’t have enough copy-on-write space can fail before the backup completes. The pmp-check-lvm-snapshots plugin will alert when a snapshot is failed or overly full.
  • Too many deadlocks. Deadlocks (and lock wait timeouts, covered in another plugin) are frequent causes of serious errors in applications because the applications often don’t handle deadlocks correctly, and in many cases also don’t log the resulting errors. The best place to fix these problems is in the application, but the DBA really needs to be monitoring it as well, because an application error can creep in easily and be deployed, and you will not find it anywhere but in production. The pmp-check-mysql-deadlocks plugin can help you detect such problems, in conjunction with Percona Toolkit’s pt-deadlock-logger tool.
  • Open but deleted filehandles. It’s quite easy for a bad logrotate script (such as the one that ships by default with many server installations) or a system administration mistake to delete one of MySQL’s log or data files from disk. The database server will continue to function, however, because the file isn’t really deleted. The MySQL server keeps an open filehandle to the file, and it won’t be deleted until the server restarts, so the problem remains undetected while it becomes more and more severe. This can cause two serious types of problems: the disk can fill up with no apparent cause or solution, or you can restart the server and lose your data permanently. The pmp-check-mysql-deleted-files plugin will alert when files have been deleted but remain open via a filehandle.
  • Wrong filesystem privileges. It’s very easy for a file to creep into MySQL’s data directory with wrong permissions, without causing problems until a later date. For example, if a database isn’t owned by MySQL, you won’t have any problems until you try to create or drop a table, or the database itself. Likewise, if a system administrator leaves a temporary file in the database directory and then you try to drop the database, it will fail. These things tend to happen at inconvenient times. The pmp-check-mysql-file-privs plugin checks whether MySQL owns every file and directory within its data directory.
  • InnoDB is getting stuck, has long-running transactions, or other problems. The pmp-check-mysql-innodb plugin checks for several indicators of severe internal problems within InnoDB, such as contention that crosses a tipping point and cripples the server. Sometimes such conditions are like credit card debt: easy to start, impossible to dig out afterwards. Another very common cause of serious incidents is application misbehavior, such as opening a transaction but never committing it, which creates a lot of bloat or locking in the database server. This plugin can alert on that and a couple other types of common problems, too.
  • Missing PID file. A variety of problems can delete MySQL’s PID file, such as a buggy init script that attempts to start the server when it’s already running, or executing the server by hand instead of through the init script. Overly zealous cron jobs to delete old files in /tmp can do the same thing. If the PID file is missing, then many init scripts will behave very badly. Sample problems include starting two instances of MySQL on the same files in the data directory, and refusing to shut down MySQL so that it is killed forcibly (i.e. crashed) instead of gracefully shutting down for a system reboot. The pmp-check-mysql-pidfile plugin verifies that MySQL’s PID file exists.
  • Evidence of contention in the processlist. The pmp-check-mysql-processlist plugin checks SHOW FULL PROCESSLIST for patterns that indicate similar problems to those discussed within InnoDB, but it can find evidence of problems in the server as well as the storage engine. For example, a pile-up of connections all trying to open or close tables can indicate LOCK_open contention.
  • Replication is delayed or stopped. Our plugins are structured to monitor as few things as possible, but as many as needed. Replication is one thing that needs to be monitored in two dimensions, which shouldn’t be combined: delay (Seconds_behind_master or the delay indicated by the pt-heartbeat tool, if you choose), and actual failure. The pmp-check-mysql-replication-delay plugin checks for excessive delay, and pmp-check-mysql-replication-running checks that replication is functioning correctly.
  • Replicas have different data than their master. In addition to delay or failure, replication can fail silently and data can diverge or drift on a replica. The pt-table-checksum tool from Percona Toolkit can generate table checksums, and the pmp-check-pt-table-checksum plugin can alert when those checksums indicate that the replica is not logically identical to its master.
  • The server is allocating too much memory. A runaway server process, or poorly optimized queries, can consume too much memory and cause the server to swap until it is unresponsive and has to be power cycled. In addition, other processes can be runaways, too, such as a poorly written script or daemon; it is not unheard of for a monitoring agent or RAID utility software to do this. The pmp-check-unix-memory plugin checks for a single process consuming too much memory and/or for too little free memory in general. This plugin requires careful configuration because it uses a threshold for which there is no good one-size-fits-all value. This plugin, perhaps more than any other, should only be used judiciously.
  • The server has been restarted recently. It is surprising how often a server crashes, perhaps even cycling through crashes, and is not detected until sometime later when the problem has grown much more serious. You can use the pmp-check-mysql-status plugin to alert when the server’s Uptime variable is too small.
  • The server is approaching max_connections. Overflowing the max_connections setting is a bit like a lack of disk space: it is possible for it to be slowly approached over time, and then to begin causing problems when connections are rejected. Similarly to deadlocks and timeouts, applications often don’t handle or log this error appropriately. The pmp-check-mysql-status plugin can be configured to alert when Threads_connected approaches max_connections too closely, which can give the system administrator time to address the situation before it becomes a problem. This check requires careful configuration, or it too can become a noise generator. However, if your application environment makes it difficult to handle max_connections problems (the use of fixed-size connection pooling, for example, which can sometimes require application downtime to resolve), then it might be worth considering to let you deal with the problem when it’s convenient instead of when it’s urgent.

In addition to the checks in that list, there are several others that can be performed with existing plugins, so there was no need to write new ones. For example, the plugins that ship with Nagios are suitable for checking that the server is alive and responds to a query.

The pmp-check-mysql-status is also very flexible and can be used to trigger an alert on almost anything from SHOW VARIABLES and SHOW STATUS that you want. If you want to monitor ratios and rates, for example, you can certainly do so; it also supports an incremental mode of operation, so you can (for example) get the incremental increase in a variable such as Queries and alert on a queries-per-second threshold. (I don’t suggest doing this; I’m only listing an example of what you can do. The documentation has several other examples.)

In future releases we will include more plugins to monitor more types of problems, but the current plugins already cover some of the most common and severe conditions that we’ve observed to cause actual production downtime. Please download and try out the plugins, and help us improve them. If you are a customer and you need help configuring any of the plugins, we are happy to assist. In addition, please ask our sales staff to include a free monitoring setup in a new support contract. I believe they will be happy to negotiate this with you.