Mar
23
2012
--

Best kept MySQLDump Secret

Many people use mysqldump –single-transaction to get consistent backup for their Innodb tables without making database read only. In most cases it works, but did you know there are some cases when you can get table entirely missing from the backup if you use this technique ?

The problem comes from the fact how MySQL’s Transactions work with DDL, In particular ALTER TABLE. When ALTER TABLE is Performed in many cases it will Create temporary table with modified structure, copy data to that table and when drop original table and rename such temporary table to original name.

How does data visibility works in this case ? DDLs are not transactional and as such the running transaction will not see the contents of old table once it is dropped, transaction also will see the new table which was created after transaction was started, including table created by ALTER TABLE statement. Transactions however apply to DATA which is stored in this table and so data which was inserted after start of transaction (by ALTER TABLE statement) will not be visible. In the end we will get new structure in the dump but no data.

Here is example:

SESSION1:
mysql> show tables;
+--------------------+
| Tables_in_dumptest |
+--------------------+
| A                  |
| B                  |
| C                  |
+--------------------+
3 rows in set (0.00 sec)

mysql> select count(*) from A;
+----------+
| count(*) |
+----------+
|  2359296 |
+----------+
1 row in set (1.73 sec)

mysql> select * from C;
+------+
| t    |
+------+
| test |
+------+
1 row in set (0.00 sec)

SESSION2:
root@ubuntu:~/dump# mysqldump --single-transaction dumptest > dump.sql

SESSION1:  (before dump has completed)
mysql> alter table C add i int not null;
Query OK, 1 row affected (0.65 sec)
Records: 1  Duplicates: 0  Warnings: 0

SESSION2:

root@ubuntu:~/dump# tail -29 dump.sql

DROP TABLE IF EXISTS `C`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `C` (
  `t` char(255) NOT NULL,
  `i` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `C`
--

LOCK TABLES `C` WRITE;
/*!40000 ALTER TABLE `C` DISABLE KEYS */;
/*!40000 ALTER TABLE `C` ENABLE KEYS */;
UNLOCK TABLES;
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

-- Dump completed on 2012-03-23  7:27:18

SESSION1:

mysql> select * from C;
+------+---+
| t    | i |
+------+---+
| test | 0 |
+------+---+
1 row in set (0.00 sec)

As you can see as we altered table C at the same time as mysqldump was running we got table empty table with new structure in mysqldump instead of table with valuable data.

This is a pretty edge case scenario neither the less it can be problem for some workloads which run ALTER TABLE regularly during normal operation. I also hope if you get some empty
tables in your mysqldump –single-transaction backups you will know the potential cause for it.

What are potential solutions for this problem ? you can use mysqldump –lock-all-tables instead which does not have this problem at the cost of having database read only for the
duration of operation. You can also use Percona Xtrabackup, LVM or other database backup approach which does not relay on transaction visibility.

Mar
22
2012
--

BOFs and Lightning Talks Announced for Percona Live MySQL Conference & Expo

The Percona Live MySQL Conference & Expo is going to be awesome! Great speakers, an A-list of sponsors, countless opportunities to engage with the community, and an enthusiastic crowd of MySQL users ensure this is going to be a great event. The conference features 72 breakout sessions, keynotes by leading industry luminaries, an optional day of 16 tutorial sessions, a bustling exhibit hall, and numerous opportunities to connect with other community members.

I am pleased to announce the conference Birds of a Feather sessions and Lightning Talks. Birds of a Feather sessions will be Tuesday and Wednesday nights following the evening receptions. Lightning Talks will be Wednesday evening in the Exhibit Hall during the Community Networking Reception. Check the conference schedule for more details on times and locations.

The conference room block at the Hyatt is already sold out and the tutorial sessions, which require preregistration, are filling rapidly. Don’t miss out on this great opportunity to connect with the MySQL community at the largest community gathering in the world.

To sweeten the deal and help convince you to register now rather than two weeks from now, we are pleased to make a special, limited time offer. Register between Thursday March 22nd and midnight PDT on Thursday March 29th and you will be entered in a drawing for:

  • One of three new Kindle Fires
  • One of five copies of the new book High Performance MySQL, 3rd Edition

Don’t miss this special offer – register now at the conference registration website!

Mar
21
2012
--

Multi Range Read (MRR) in MySQL 5.6 and MariaDB 5.5

This is the second blog post in the series of blog posts leading up to the talk comparing the optimizer enhancements in MySQL 5.6 and MariaDB 5.5. This blog post is aimed at the optimizer enhancement Multi Range Read (MRR). Its available in both MySQL 5.6 and MariaDB 5.5

Now let’s take a look at what this optimization actually is and what benefits it brings.

Multi Range Read

With traditional secondary index lookups, if the columns that are being fetched do not belong to the secondary index definition (and hence covering index optimization is not used), then primary key lookups have to be performed for each secondary key entry fetched. This means that secondary key lookups for column values that do not belong to the secondary index definition can result in a lot of Random I/O. The purpose of MRR is to reduce this Random I/O and make it more sequential, by having a buffer in between where secondary key tuples are buffered and then sorted by the primary key values, and then instead of point primary key lookups, a range lookup is performed on the primary key by using the sorted primary key values.

Let me give you a simple example. Suppose you have the following query executed on the InnoDB table:

SELECT non_key_column FROM tbl WHERE key_column=x

This query will roughly be evaluated in following steps, without MRR:

  1. SELECT key_column, pk_column FROM tbl WHERE key_column=x ORDER BY key_column
    (Note that secondary keys in InnoDB contain primary key columns)
  2. For each pk_column value in step 1 do:
    SELECT non_key_column FROM tbl WHERE pk_column=val

As you can see that the values returned from Step 1 are sorted by the secondary key column ‘key_column’, and then for each value of ‘pk_column’ which is a part of the secondary key tuple, a point primary key lookup is made against base table, the number of these point primary key lookups will be depend on the number of rows that match the condition ‘key_column=x’. You can see that there are a lot of random primary key lookups made.

With MRR, then steps above are changed to the following:

  1. SELECT key_column, pk_column FROM tbl WHERE key_column=x ORDER BY key_column
    (Note that secondary keys in InnoDB contain primary key columns)
  2. Buffer each pk_column value fetched from step 1, and when the buffer is full sort them by pk_column, and do a range primary key lookup as follows:
    SELECT non_key_column from tbl WHERE pk_column IN (…)

As you can see by utilizing the buffer for sorting the secondary key tuples by pk_column, we have converted a lot of point primary key lookups to one or more range primary key lookup. Thereby, converting Random access to one or more sequential access. There is one another interesting thing that has come up here, and that is the importance of the size of the buffer used for sorting the secondary key tuples. If the buffer size is large enough only a single range lookup will be needed, however if the buffer size is small as compared to the combined size of the secondary key tuples fetched, then the number of range lookups will be:

CEIL(S/N)
where,
S is the combined size of the secondary key tuples fetched, and
N is the buffer size.

In MySQL 5.6 the buffer size used by MRR can be controlled by the variable read_rnd_buffer_size, while MariaDB introduces a different variable to control the MRR buffer size mrr_buffer_size. Both buffer sizes default to 256K in MySQL 5.6 and MariaDB 5.5 respectively, which might be low depending on your scenario.

You can read more about the MRR optimization available in MySQL 5.6 here:
http://dev.mysql.com/doc/refman/5.6/en/mrr-optimization.html
and as available in MariaDB 5.5 here:
http://kb.askmonty.org/en/multi-range-read-optimization

Now let’s move on to the benchmarks, to see the difference in numbers.

Benchmark results

For the purpose of this benchmark, I have used TPC-H Query #10 and ran it on TPC-H dataset (InnoDB tables) with a Scale Factor of 2 (InnoDB dataset size ~5G). I did not use Scale Factor of 40 (InnoDB dataset size ~95G), because the query was taking far too long to execute, ~11 hours in case of MySQL 5.5 and ~5 hours in case of MySQL 5.6 and MariaDB 5.5. Note that query cache is disabled during these benchmark runs and that the disks are 4 5.4K disks in Software RAID5.

Also note that the following changes were made in the MySQL config:
optimizer_switch=’index_condition_pushdown=off’
optimizer_switch=’mrr=on’
optimizer_switch=’mrr_sort_keys=on’ (only on MariaDB 5.5)
optimizer_switch=’mrr_cost_based=off’
read_rnd_buffer_size=4M (only on MySQL 5.6)
mrr_buffer_size=4M (only on MariaDB 5.5)

We have turned off ICP optimization for the purpose of this particular benchmark, because we want to see the individual affect of an optimization (where possible). Also note that we have turned off mrr_cost_based, this is because the cost based algorithm used to calculate the cost of MRR when the optimizer is choosing the query execution plan, is not sufficiently tuned and it is recommended to turn this off.

The query used is:

select
        c_custkey,
        c_name,
        sum(l_extendedprice * (1 - l_discount)) as revenue,
        c_acctbal,
        n_name,
        c_address,
        c_phone,
        c_comment
from
        customer,
        orders,
        lineitem,
        nation
where
        c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and o_orderdate >= '1993-08-01'
        and o_orderdate < date_add( '1993-08-01' ,interval '3' month)
        and l_returnflag = 'R'
        and c_nationkey = n_nationkey
group by
        c_custkey,
        c_name,
        c_acctbal,
        c_phone,
        n_name,
        c_address,
        c_comment
order by
        revenue desc
LIMIT 20;

In-memory workload

Now let's see how effective is MRR when the workload fits entirely in memory. For the purpose of benchmarking in-memory workload, the InnoDB buffer pool size is set to 6G and the buffer pool was warmed up, so that the relevant pages were already loaded in the buffer pool. Note that as mentioned at the start of the benchmark results section, the InnoDB dataset size is ~5G. Ok so now let's take a look at the graph:

MRR doesn't really make any positive difference to the query times for MySQL 5.6, when the workload fits entirely in memory, because there is no extra cost for memory access at random locations versus memory access at sequential locations. In fact there is extra cost added by the buffering step introduced by MRR, and hence, there is a slight increase in query time for MySQL 5.6, increase of 0.02s. But the query times for MariaDB 5.5 are greater than both MySQL 5.5 and MySQL 5.6

IO bound workload

Now let's see how effective is MRR when the workload is IO bound. For the purpose of benchmarking IO bound workload, the InnoDB buffer pool size is set to 1G and the buffer pool was not warmed up, so that it does not have the relevant pages loaded up already:

MRR does make a lot of difference when the workload is IO bound, the query time is decreased from ~11min to under a minute. The query time is reduced further when the buffer size is set to 4M. Note also that query time for MariaDB is still a little higher by a couple of seconds, when compared to MySQL 5.6.

Now let's take a look at the status counters.

MySQL Status Counters

These status counters were captured when performing the benchmark on IO bound workload, mentioned above.

Counter Name MySQL 5.5 MySQL 5.6 MySQL 5.6 w/ read_rnd_bufer_size=4M MariaDB 5.5 MariaDB 5.5 w/ mrr_buffer_size=4M
Created_tmp_disk_tables 1 1 1 1 1
Created_tmp_tables 1 1 1 1 1
Handler_mrr_init N/A 0 0 1 1
Handler_mrr_rowid_refills N/A N/A N/A 1 0
Handler_read_key 508833 623738 622184 508913 507516
Handler_read_next 574320 574320 572889 574320 572889
Handler_read_rnd_next 136077 136094 136366 136163 136435
Innodb_buffer_pool_read_ahead 0 20920 23669 20920 23734
Innodb_buffer_pool_read_requests 1361851 1264739 1235472 1263290 1235781
Innodb_buffer_pool_reads 120548 102948 76882 102672 76832
Innodb_data_read 1.84G 1.89G 1.53G 1.89G 1.53G
Innodb_data_reads 120552 123872 100551 103011 77213
Innodb_pages_read 120548 123868 100551 123592 100566
Innodb_rows_read 799239 914146 912318 914146 912318
Select_scan 1 1 1 1 1
Sort_scan 1 1 1 1 1
  • As you can see from the status counters above that both MySQL 5.6 and MariaDB 5.5 are reporting high numbers for Innodb_buffer_pool_read_ahead which shows that the access pattern was sequential and hence InnoDB decided to do read_ahead, while in MySQL 5.5 no read_ahead was done because the access pattern was not sequential. Another thing to note is that more read_ahead is done when the buffer size used by MRR, is set to 4M, which obviously means that the more index tuples that can fit in the buffer the more sequential the access pattern will be.
  • There is one MRR related variable introduced in MySQL 5.6 and MariaDB 5.5 Handler_mrr_init and another additional one introduced in MariaDB 5.5 Handler_mrr_rowid_refills. Handler_mrr_init is incremented when a MRR range scan is performed, but we can see its only incremented in MariaDB 5.5 and not in MySQL 5.6, is that because of a bug in MySQL 5.6 code? As MRR was used in both MySQL 5.6 and MariaDB 5.5. Handler_mrr_rowid_refills counts how many times the buffer used by MRR had to be reinitialized, because the buffer was small and not all index tuples could fit in the buffer. If this is > 0 then it means Handler_mrr_rowid_refills + 1 MRR range scans had to be performed. As in the table above you can with default buffer size of 256K, MariaDB 5.5 shows that Handler_mrr_rowid_refills = 1, which means the buffer is small and there were 2 MRR range scans needed. But with a buffer size of 4M, we can see that Handler_mrr_rowid_refills = 0, which means that the buffer was big enough and only 1 MRR range scan was needed, which is as efficient as it can be. This is also evident in the query times, which is lower by a couple of seconds when buffer size of 4M is used.
  • Another interesting thing to note is that MySQL 5.6 and MariaDB 5.5 are both reading more rows than MySQL 5.5, as can be seen by the numbers reported for the status counter Innodb_rows_read. While MySQL 5.6 is also reporting increased numbers for the counter Handler_read_key. This is because of how status counter values are incremented when index lookup is performed. As I explained at the start of the post that traditional index lookup (for non-index-only columns) involves, reading an index record, and then using the PK column value in the index record to make a lookup in the PK. Both these lookups are performed in a single call to the storage engine and the counters Handler_read_key and Innodb_rows_read are incremented by ONE. However, when MRR is used then there are two separate calls made to the storage engine to perform the index record read and then to perform the MRR range scan on the PK. This causes the counters Handler_read_key and Innodb_rows_read to be incremented by TWO. It does not actually mean that queries with MRR are performing badly. The interesting thing is that though both MariaDB and MySQL 5.6 are reporting high numbers for Innodb_rows_read, which is completely in line with how the counters behave with MRR, but the value for counter Handler_read_key is more or less the same for MariaDB 5.5 when compared to MySQL 5.5, and this does not make sense to me. Probably its due to a bug in how counter is calculated inside MariaDB?

Other Observations

Sometimes both for MariaDB 5.5 and MySQL 5.6, the optimizer chooses the wrong query execution plan. Let's take a look at what are the good and bad query execution plans.

a. Bad Plan

id      select_type     table   type    possible_keys   key     key_len ref     rows    filtered        Extra
1       SIMPLE  nation  ALL     PRIMARY NULL    NULL    NULL    25      100.00  Using temporary; Using filesort
1       SIMPLE  customer        ref     PRIMARY,i_c_nationkey   i_c_nationkey   5       dbt3.nation.n_nationkey 2123    100.00
1       SIMPLE  orders  ref     PRIMARY,i_o_orderdate,i_o_custkey       i_o_custkey     5       dbt3.customer.c_custkey 7       100.00  Using where
1       SIMPLE  lineitem        ref     PRIMARY,i_l_orderkey,i_l_orderkey_quantity      PRIMARY 4       dbt3.orders.o_orderkey  1       100.00  Using where

b. Good Plan

id      select_type     table   type    possible_keys   key     key_len ref     rows    filtered        Extra
1       SIMPLE  orders  range   PRIMARY,i_o_orderdate,i_o_custkey       i_o_orderdate   4       NULL    232722  100.00  Using where; Rowid-ordered scan; Using temporary; Using filesort
1       SIMPLE  customer        eq_ref  PRIMARY,i_c_nationkey   PRIMARY 4       dbt3.orders.o_custkey   1       100.00  Using where
1       SIMPLE  nation  eq_ref  PRIMARY PRIMARY 4       dbt3.customer.c_nationkey       1       100.00
1       SIMPLE  lineitem        ref     PRIMARY,i_l_orderkey,i_l_orderkey_quantity      PRIMARY 4       dbt3.orders.o_orderkey  2       100.00  Using where

So during cold query runs the optimizer would switch to using plan 'a', which does not involve MRR, and the query time for MySQL 5.6 and MariaDB 5.5 jumps to ~11min (this is the query time for MySQL 5.5) While when it sticks to plan 'b' for MySQL 5.6 and MariaDB 5.5, then query times remain under a minute. So when the correct query execution plan is not used, there is no difference in query times between MySQL 5.5 and MySQL 5.6/MariaDB 5.5 This is another area of improvement in the optimizer, as it is clearly a part of the optimizer's job to select the best query execution plan. I had noted a similar thing when benchmarking ICP, the optimizer made a wrong choice. It looks like that there is still improvement and changes needed in the optimizer's cost estimation algorithm.

MariaDB 5.5 expands the concept of MRR to improve the performance of secondary key lookups as well. But this works only with joins and specifically with Block Access Join Algorithms. So I am not going to cover it here, but will cover it in my next post which will be on Block Access Join Algorithms.

Conclusion

There is a huge speedup when the workload is IO bound, the query time goes down from ~11min to under a minute. The query time is reduced further when buffer size is set large enough so that the index tuples fit in the buffer. But there is no performance improvement when the workload is in-memory, in fact MRR adds extra sorting overhead which means that the queries are just a bit slower as compared to MySQL 5.5 MRR clearly changes the access pattern to sequential, and hence InnoDB is able to do many read_aheads. Another thing to take away is that MariaDB is just a bit slower as compared to MySQL 5.6, may be something for the MariaDB guys to look at.

Mar
21
2012
--

Troubleshooting MySQL Memory Usage

One of the most painful troubleshooting tasks with MySQL is troubleshooting memory usage. The problem usually starts like this – you have configured MySQL to use reasonable global buffers, such as
innodb_buffer_size, key_buffer_size etc, you have reasonable amount of connections but yet MySQL takes much more memory than you would expect, causing swapping or other problems.

This simple problem on the surface becomes challenge with MySQL because there are no clear resource usage metrics available, and so in most cases you do not know where exactly memory is allocated. This was not much of the problem in MySQL 3.23 when there would only handful of places where memory could be allocated but it is a lot larger problems with MySQL 5.5 with addition of user variables, stored procedures, prepared statements etc which can be a memory hog.

My intent with this post is dual. I would encourage MySQL Team at Oracle, MariaDB team or Drizzle team to take a look into solving this problem. You will be thanked by a lot of people running MySQL in wild. I also wanted to share some troubleshooting techniques I use.

Plot Memory Usage

First I would like to see MySQL memory consumption plotted. I use “VSZ” columns from “ps” output on Linux. It helps me to understand how this memory allocation happens – does it grows slowly and never ending which would correspond to memory leak or resource leak, does it spikes at certain times which I may be able to correlate to some events ?

If you can’t get fancy graph quickly you can use this simple script:

while true
do
  date >> ps.log
  ps aux | grep mysqld >> ps.log
  sleep 60
done

Check for Table Cache Related Allocations

There are cases when MySQL will allocate a lot of memory for table cache, especially if you’re using large blobs. It is easy to check though. Run “FLUSH TABLES” and see whenever memory usage goes down. Note though because of how memory is allocated from OS you might not see “VSZ” going down. What you might see instead is flushing tables regularly or reducing table cache reduces memory consumption to be withing the reason.

Connection Related Allocations
Another set of buffers correspond to connections – orphaned prepared statements, user variables, huge network buffer (can grow up to max_packet_size per connection) are all connection buffers and so if you close connection MySQL can clean them up. Killing connections (or stopping related applications) and observing whenever memory usage shrinks can be very helpful in diagnostics. If you have many applications it might make sense to shut them down one by one so it is easier to understand which of the applications were responsible for memory allocation.

If you figured out it is related to connections and identified application which causes excessive memory usage you might look at how it uses MySQL to identify potential causes. Is it working with large blobs ? Using user variables ? Prepared Statements ? memory tables ? In a lot of cases you have to guess and test, you can’t get information of how much memory connection uses and for which purposes it is allocated. Reducing various per connection variables is how you can test it, though it does not cover everything. For prepared statements you might want to look at Prepared_stmt_count to see how many prepared statements are allocated on server and Com_stmt_send_long_data to see whenever sending long data feature is used as it can increase amount of resources prepared statements take on server. There is no comparable variables of how many user variables are allocated (and how much memory they use).

Memory Tables

MEMORY tables can take memory. There are implicit MEMORY tables which are allocated for query execution, which size can be controlled by tmp_table_size and which also only exist for duration of query execution so it is usually easy to catch them. There are also explicit MEMORY tables you can create both as permanent and temporary. There is a max_heap_table_size variable which allows you to limit size of MEMORY tables (the limit applies both to implicit and explicit ones) but as there is no control of how many tables application can create it does not really allows to restrict memory usage. For Permanent tables it is easy. We can look at information_schema to see how much memory is being used by current MEMORY tables:

mysql> select sum(data_length+index_length) from information_schema.tables where engine='memory';
+-------------------------------+
| sum(data_length+index_length) |
+-------------------------------+
|                        126984 |
+-------------------------------+
1 row in set (0.98 sec)

This however will not cover TEMPORARY tables some of your connections might have created and forgot to remove (or still use for processing). Of course you will see these tables going away if you close connection. In Percona Server you can do better as you can query temporary tables too:

mysql> select sum(data_length+index_length) from information_schema.global_temporary_tables where engine='memory';
+-------------------------------+
| sum(data_length+index_length) |
+-------------------------------+
|                        126984 |
+-------------------------------+
1 row in set (0.00 sec)

You can even go deeper to check which sessions have created which temporary tables (both in memory and not):

mysql> select * from information_schema.global_temporary_tables \G
*************************** 1. row ***************************
    SESSION_ID: 7234
  TABLE_SCHEMA: test
    TABLE_NAME: my
        ENGINE: InnoDB
          NAME: #sql516_1c42_2
    TABLE_ROWS: 0
AVG_ROW_LENGTH: 0
   DATA_LENGTH: 16384
  INDEX_LENGTH: 0
   CREATE_TIME: NULL
   UPDATE_TIME: NULL
*************************** 2. row ***************************
    SESSION_ID: 7234
  TABLE_SCHEMA: test
    TABLE_NAME: tmp
        ENGINE: MEMORY
          NAME: #sql516_1c42_1
    TABLE_ROWS: 2
AVG_ROW_LENGTH: 257
   DATA_LENGTH: 126984
  INDEX_LENGTH: 0
   CREATE_TIME: 2012-03-21 05:52:02
   UPDATE_TIME: NULL
*************************** 3. row ***************************
    SESSION_ID: 7231
  TABLE_SCHEMA: test
    TABLE_NAME: z
        ENGINE: InnoDB
          NAME: #sql516_1c3f_0
    TABLE_ROWS: 0
AVG_ROW_LENGTH: 0
   DATA_LENGTH: 16384
  INDEX_LENGTH: 0
   CREATE_TIME: NULL
   UPDATE_TIME: NULL
3 rows in set (0.00 sec)

Innodb Memory Usage

Finally it is often helpful to check how much memory Innodb has allocated. In fact this is often one of the first things I do as it is least intrusive. Run SHOW ENGINE INNODB STATUS and look for memory information block, which can use like this:

----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 132183490560; in additional pool allocated 0
Internal hash tables (constant factor + variable factor)
    Adaptive hash index 4422068288      (2039977928 + 2382090360)
    Page hash           127499384
    Dictionary cache    512619219       (509995888 + 2623331)
    File system         294352  (82672 + 211680)
    Lock system         318875832       (318747272 + 128560)
    Recovery system     0       (0 + 0)
    Threads             425080  (406936 + 18144)
Dictionary memory allocated 2623331
Buffer pool size        7864319
Buffer pool size, bytes 128849002496
Free buffers            1
Database pages          8252672
Old database pages      3046376
Modified db pages       23419

I’m using the information from Percona Server again which provides a lot more insight about how memory is used. I would note though the output is a bit confusing as “Total Memory Allocated” is really not the total any more as Innodb has moved allocating memory from operation system directly not from addition memory pool (hence 0). Such allocations are not seen in “Total memory allocated” line yet they are seen in one of the lines below. The most likely cause for Innodb run away memory allocation is Dictionary cache though there could be other reasons too.

Memory Leaks
There are many kinds of memory leaks and I would say these are rather rare in MySQL. Most suspected memory leaks end up being some run away resource usage, though these can happen. Some memory leaks might happen per connection and they will be gone when connection is closed, other correspond to global memory allocation and will result in increased memory allocation until server is restarted. I would suspect memory leak when you see memory usage growing which can’t be connected to any known resource use. For example for global memory leaks you would see memory usage continues to grow even if you close connections and tables regularly. Another common thing about memory leaks is because it is memory which is allocated and forgotten about, unless it is very small blocks, it should be just swapped out and when never needed again. So if you see swap file used space gradually growing and there are “swap outs” but a lot less “swap ins” chances are it is caused by memory leak.

Dealing with memory leaks is rather complicated as good tools to detect memory leaks like valgrind are too slow to run in production. So the best thing to do in this case is see whenever you can create isolated repeatable test cases based on your application, which can illustrate memory leak and when it can be found and fixed. This is where your MySQL Support contract can be handy.

Conclusion
Understanding where MySQL can allocate memory can help us to find the cause in most cases. It is not as straightforward as it should be and I’m very hopeful future releases of MySQL, MariaDB or Drizzle bring improvements in this space allowing us to see directly for what purpose memory is allocated and so detect all kinds of memory usage problems easier.

Mar
13
2012
--

Webinar “How to Turbocharge Your MySQL Performance Using Flash Storage”

Next Wednesday, March-21, 11:00am Pacific Time, Baron and me will be co-presenting with Virident webinar: “How to Turbocharge Your MySQL Performance Using Flash Storage” (From Virident side: Shridar Subramanian and Shirish Jamthe).

Running MySQL on SSD in interesting topic and on the webinar we will cover:

  • Configuration and optimization techniques to fully leverage flash-based storage solutions in MySQL environments
  • Evaluation criteria and techniques for selecting the suitable flash-storage technology for the relevant MySQL workloads
  • Price/performance advantages (ROI) when flash storage is used appropriately for MySQL workloads
  • Approaches for scaling MySQL instances on fewer servers while delivering optimal performance using flash drives

The registration is free and available there. Hope to see you on the webinar!

Mar
12
2012
--

Index Condition Pushdown in MySQL 5.6 and MariaDB 5.5 and its performance impact

I have been working with Peter in preparation for the talk comparing the optimizer enhancements in MySQL 5.6 and MariaDB 5.5. We are taking a look at and benchmarking optimizer enhancements one by one. So in the same way this blog post is aimed at a new optimizer enhancement Index Condition Pushdown (ICP). Its available in both MySQL 5.6 and MariaDB 5.5

Now let’s take a look briefly at what this enhancement actually is, and what is it aimed at.

Index Condition Pushdown

Traditional B-Tree index lookups have some limitations in cases such as range scans, where index parts after the part on which range condition is applied cannot be used for filtering records. For example, suppose you have a key defined as:

KEY `i_l_partkey` (`l_partkey`,`l_quantity`,`l_shipmode`,`l_shipinstruct`)

and the WHERE condition defined as:

l_partkey = x
and l_quantity >= 1 and l_quantity <= 1+10
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'

Then MySQL will use the key as if its only defined as including columns l_partkey and l_quantity and will not filter using the columns l_shipmode and l_shipinstruct. And so all rows matching condition l_partkey = x and and l_quantity >= 1 and l_quantity <= 1+10 will be fetched from the Primary Key and returned to MySQL server which will then in turn apply the remaining parts of the WHERE clause l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON'. So clearly if you have thousands of rows that match l_partkey and l_quantity but only a few hundred that match all the condition, then obviously you would be reading a lot of unnecessary data.

This is where ICP comes into play. With ICP, the server pushes down all conditions of the WHERE clause that match the key definition to the storage engine and then filtering is done in two steps:

  • Filter by the prefix of the index using traditional B-Tree index search
  • Filter by applying the where condition on the index entries fetched

You can read more about index condition pushdown as available in MySQL 5.6 here:
http://dev.mysql.com/doc/refman/5.6/en/index-condition-pushdown-optimization.html
and as available in MariaDB 5.5 here:
http://kb.askmonty.org/en/index-condition-pushdown

Now let's take a look at the benchmarks to see how much difference does this really make.

Benchmark results

For the purpose of this benchmark I used TPC-H Query #19 and ran it on TPC-H dataset (InnoDB tables) with scale factors of 2 (InnoDB dataset size ~5G) and 40 (InnoDB dataset size ~95G), so that I could see the speedup in case of in-memory and IO bound workloads. For the purpose of these benchmarks query cache was disabled and the buffer pool size was set to 6G.

The query is:

select
        sum(l_extendedprice* (1 - l_discount)) as revenue
from
        lineitem force index(i_l_partkey),
        part
where
        (
                p_partkey = l_partkey
                and p_brand = 'Brand#45'
                and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
                and l_quantity >= 1 and l_quantity <= 1+10
                and p_size between 1 and 5
                and l_shipmode in ('AIR', 'AIR REG')
                and l_shipinstruct = 'DELIVER IN PERSON'
        )
        or
        (
                p_partkey = l_partkey
                and p_brand = 'Brand#24'
                and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
                and l_quantity >= 14 and l_quantity <= 14+10
                and p_size between 1 and 10
                and l_shipmode in ('AIR', 'AIR REG')
                and l_shipinstruct = 'DELIVER IN PERSON'
        )
        or
        (
                p_partkey = l_partkey
                and p_brand = 'Brand#53'
                and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
                and l_quantity >= 28 and l_quantity <= 28+10
                and p_size between 1 and 15
                and l_shipmode in ('AIR', 'AIR REG')
                and l_shipinstruct = 'DELIVER IN PERSON'
        )

There are two changes that I made to the query and the TPC-H tables' structure.
- Added a new index: KEY `i_l_partkey` (`l_partkey`,`l_quantity`,`l_shipmode`,`l_shipinstruct`)
- Added an index hint in the query: lineitem force index(i_l_partkey)

The size of the buffer pool used for the benchmarks is 6G and the disks are 4 5.4K disks in Software RAID5.

In-memory workload

Now let's first take a look at how effective is ICP when the workload is in memory. For the purpose of benchmarking in-memory workload, I used a Scale Factor of 2 (dataset size ~5G), and the buffer pool was warmed up so that the relevant pages were already loaded in the buffer pool:

IO bound workload

And what about IO bound workload, when the dataset does not fit into memory. For the purpose of benchmarking IO bound workload, I used a Scale Factor of 40 (dataset size ~95G) and the buffer pool was not warmed up, so that it does not have the relevant pages loaded up already:

Now let's take a look at the status counters to see how much does this optimization make a difference.

MySQL Status Counters

These status counters are taken when performing the benchmark on IO bound workload, mentioned above.

Counter Name MySQL 5.5 MySQL 5.6 MariaDB 5.5
Created_tmp_disk_tables 0 0 0
Created_tmp_tables 0 0 0
Handler_icp_attempts N/A N/A 571312
Handler_icp_match N/A N/A 4541
Handler_read_key 19310 19192 18989
Handler_read_next 579426 4514 4541
Handler_read_rnd_next 8000332 8000349 8000416
Innodb_buffer_pool_read_ahead 81132 81067 81069
Innodb_buffer_pool_read_requests 4693757 1293628 1292675
Innodb_buffer_pool_reads 540805 28468 28205
Innodb_data_read 9.49G 1.67G 1.67G
Innodb_data_reads 621939 109537 29796
Innodb_pages_read 621937 109535 109274
Innodb_rows_read 8579426 8004514 8004541
Select_scan 1 1 1
  • We can see that values for Handler_read_key are more or less the same, because there is no change in the index lookup method, the index range scan is still done the same way as in MySQL 5.5, all the index pages that match the range condition on the l_quantity will still be fetched. Its the optimization afterwards that counts.
  • As we can see there is a huge reduction in the value of Handler_read_next, as with ICP the remaining parts of the WHERE clause are checked directly on the index pages fetched without having to fetch the entire rows from the PK, which means 128x less subsequent reads from the PK.
  • There are two status counters available in MariaDB 5.3/5.5 that are not available in MySQL, Handler_icp_attempts and Handler_icp_match, which show how many times the remaining WHERE condition was checked on the fetched index pages and how many times the index entries matched the condition. The value of Handler_icp_match directly co-relates to that of Handler_read_next. The smaller the ratio of Handler_icp_attempts to Handler_icp_match the better the filtering.
  • Another important thing is that there is 18x less IO page reads in the case of MySQL 5.6 and MariaDB 5.5 and 8x less amount of data read (1.67G compared to 9.49G). This just does not mean reduction in IO but means more free pages are available in the buffer pool to cater to other queries. For example, in the case of MySQL 5.5 (with a buffer pool of 6G), the query will be IO bound even with a warmed up buffer pool, because there is just too much data read. While in the case of MySQL 5.6 and MariaDB 5.5, the queries will have all the pages in-memory with warm caches and still 72% of the buffer pool will be empty to handle other queries.

Other Observations

There are two indexes defined on the table with similar prefixes:

  • KEY `i_l_suppkey_partkey` (`l_partkey`, `l_suppkey`)
  • KEY `i_l_partkey` (`l_partkey`,`l_quantity`,`l_shipmode`,`l_shipinstruct`)

Obviously, the key i_l_partkey is much more restrictive. Although it cannot filter more rows by using traditional B-Tree index lookup used by MySQL, when compared to the key i_l_suppkey_partkey. But after the B-Tree lookup step, the second key can filter a lot more data using ICP on the remaining conditions after l_quantity. Yet MySQL 5.6 and MariaDB 5.5 optimizer does not consider this logic when selecting the index. In all the runs of the benchmark, the optimizer chose the key i_l_suppkey_partkey because the optimizer estimates that both indexes will mean same no. of rows and i_l_suppkey_partkey has a smaller key size. To get around this limitation in the optimizer I had to use index hint (FORCE INDEX). Clearly, selecting the correct index is an important part of optimizer's job, and there is room for improvement there.

Another thing which I feel is that there is still further optimization possible, like moving ICP directly to the index lookup function so that the limitation in the optimizer which prevents other parts of the key after the first range condition are removed.

Conclusion

There is a huge speed up 78 minutes down to 2 minutes in case of completely IO bound workload, and upto 2x speedup in case when the workload is in-memory. There is not much difference here in terms of the performance gains on MariaDB 5.5 vs MySQL 5.6 and both are on-par. This is great for queries that suffer from the weakness of the current MySQL optimizer when it comes to doing multi-column range scans, and the ICP optimization will benefit a lot of your queries like the one I showed in my example. Not only that, because there will be a cut down in the number of data pages read into the buffer pool, it means better buffer pool utilization.

On a last note, I will be posting the benchmark scripts, table definitions, the configuration files for MySQL 5.5/5.6 and MariaDB 5.5 and the description of the hardware used.

Feb
23
2012
--

Faster Point In Time Recovery with LVM2 Snaphots and Binary Logs

LVM snapshots is one powerful way of taking a consistent backup of your MySQL databases – but did you know that you can now restore directly from a snapshot (and binary logs for point in time recovery) in case of that ‘Oops’ moment? Let me show you quickly how.

This howto assumes that you already have a decent know how of LVM and snaphots and using LVM2 >= 2.02.58 (January 2010) which is when the --merge option was made available to the lvconvert binary. Base installs of Ubuntu 11.04 (Natty) and CentOS 5.7 includes packages with this feature, previous releases might also include them via supplemental repositories i.e. updates on CentOS. If you are using InnoDB, it is also important that your transaction logs (ib_logfile*) are on the same logical volume, if not, you could potentially trigger crash recovery when an LSN mismatch occurs and still end up with inconsistent data.

Now, assuming I have the following logical volumes – mysql-data for my datadir, and mysql-logs for my binary logs. I also have  the latest snapshot of the mysql-data logical volume taken named ‘mysql-data-201202230157‘ using a script* I put together to make sure I have a consistent snapshot of the MySQL data files. Restoring snapshot alone may not be enough since there can be a lot more events from the time of the snapshots until you discover the problem, so it is really important that you have your binary logs on a different LV or copied it someplace else if they are on the same LV before we restore.

[root@sb logs]# /root/bin/lvmsnap.sh snapshot
Taking a new snapshot ..
done

Trimming excess snapshots ..
  Logical volume "mysql-data-201202230135" successfully removed
done

  LV                      VG       Attr   LSize  Origin     Snap%  Move Log Copy%  Convert
  lv_root                 VolGroup -wi-ao  5.54g                                          
  lv_swap                 VolGroup -wi-ao  1.97g                                          
  mysql-data              sb       owi-ao 20.00g                                          
  mysql-data-201202230150 sb       swi-a-  5.00g mysql-data   0.00                        
  mysql-data-201202230153 sb       swi-a-  5.00g mysql-data   0.00                        
  mysql-data-201202230155 sb       swi-a-  5.00g mysql-data   0.00                        
  mysql-data-201202230157 sb       swi-a-  5.00g mysql-data   0.00                        
  mysql-logs              sb       -wi-ao 10.00g

Suppose sometime after the snapshot, I accidentally dropped the salaries tables from the employees database! (Oops, I can have an angry mob of employees who may not get their salaries on time!).

mysql> delete from salaries where emp_no = 10001;
Query OK, 17 rows affected (0.15 sec)

mysql> drop table salaries; -- Ooops!
Query OK, 0 rows affected (0.49 sec)

mysql> alter table employees add column age smallint unsigned not null default 0;
Query OK, 300024 rows affected (13.28 sec)
Records: 300024  Duplicates: 0  Warnings: 0

mysql> update employees set age = ((YEAR(CURDATE())-YEAR(birth_date)) - (RIGHT(CURDATE(),5)<RIGHT(birth_date,5)));
Query OK, 300024 rows affected (6.51 sec)
Rows matched: 300024  Changed: 300024  Warnings: 0

With the last snapshot I have above, I should be able to restore up to before the first DELETE statement above. If you use my LVM snapshot script*, it also saves the binary log coordinates when the snapshot was taken and saves it into a file specified as variable on the script. Below is the binary log coordinates for when the last snapshot on my list above.

[root@sb logs]# cat mysql-data-201202230157-binlog-info
File	Position	Binlog_Do_DB	Binlog_Ignore_DB
mysql-bin.000022	336796712

Using the coordinates above, I can start searching for the position of the DELETE statement so we can skip that after the snapshot restart. Using the below command and some inline searches, I was able to pinpoint the position of the delete statement, it is at 336797160.

[root@sb logs]# mysqlbinlog --verbose --base64-output=decode-rows --start-position 336796712 mysql-bin.000022
...
# at 336797160
#120223  1:59:55 server id 1  end_log_pos 336797275 	Query	thread_id=47	exec_time=1	error_code=0
SET TIMESTAMP=1329980395/*!*/;
DROP TABLE `salaries` /* generated by server */
...

Now, let’s restore our data from the snapshot, under the hood restore is really a “rollback” to the snapshot state when it was taken. We will do this using lvconvert’s –merge option, to merge the state of the snapshot to the original LV.

[root@sb logs]# /etc/init.d/mysql stop
Shutting down MySQL (Percona Server).....                  [  OK  ]
[root@sb logs]# umount /mysql/data
[root@sb logs]# lvconvert --merge /dev/sb/mysql-data-201202230157
  Merging of volume mysql-data-201202230157 started.
  mysql-data: Merged: 2.3%
  mysql-data: Merged: 0.2%
  mysql-data: Merged: 0.0%
  Merge of snapshot into logical volume mysql-data has finished.
  Logical volume "mysql-data-201202230157" successfully removed
[root@sb logs]# mount /mysql/data
[root@sb logs]# /etc/init.d/mysql start
Starting MySQL (Percona Server)                            [  OK  ]

You should shutdown MySQL first, then unmount the logical volume holding the MySQL data. This way you don’t have to deactivate/activate the original logical volume to start the merging. So let’s see if our salaries table is restored.

mysql> SHOW TABLES;
+---------------------+
| Tables_in_employees |
+---------------------+
...
| salaries            |
...
+---------------------+
6 rows in set (0.05 sec)

Success! Your salaries data is back. But, we still have to apply the data after the snapshot, skipping the DROP statement. You should take another snapshot now – in case you missed to skip the DROP statement! I know for a fact that for every MySQL restart, the logs are flushed and a new binary log is created, looking at the current binary logs after restoring the snapshot, I know I have to apply mysql-bin.00022 only starting from position 336796712 and skipping the DROP statement at position 336797160:

mysql> show binary logs;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
...
| mysql-bin.000022 | 336797725 |
| mysql-bin.000023 |       107 |
+------------------+-----------+
23 rows in set (0.00 sec)

mysql> select count(*) from salaries where emp_no = 10001;
+----------+
| count(*) |
+----------+
|       17 |
+----------+
1 row in set (0.00 sec)

[root@sb logs]# mysqlbinlog --verbose --base64-output=decode-rows --start-position 336796712 --stop-position 336797133 mysql-bin.000022 | mysql
[root@sb logs]# mysqlbinlog --verbose --base64-output=decode-rows --start-position 336797275 mysql-bin.000022 | mysql

mysql> select count(*) from salaries where emp_no = 10001;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

mysql> show tables;
+---------------------+
| Tables_in_employees |
+---------------------+
...
| salaries            |
...
+---------------------+
6 rows in set (0.00 sec)

So, as you can see, I now have a consistent data and still have my salaries table back.

Because leaving a production server with active snapshots can affect performance, this is not really an advisable backup solution. If your server somehow blew up in flames, hardware problems or encounter and LVM bug, your snapshots are useless. Also, you cannot test restore your snapshots – they are one time use!

However, if you can tolerate the extra IO overhead i.e. development or staging server , then this is still a valid backup method of course on top of your regularly tested (offsite) backups.

Another ideal use case for this method is when you are planning to execute a long running ALTER or server upgrade, this method can be good quick rollback procedure in case something fails during the operation.

* While writing this blog I hacked a quick shell script to create snapshots and uploaded it here. By no means it is perfect, you can use `lvmsnap.sh snapshot` to create snapshots. The restore functionality is not finished but you can use it for quick testing :) i.e. `lvmsnap.sh restore <snapshot-timestamp>`

Feb
23
2012
--

Black-Box MySQL Performance Analysis with TCP Traffic

For about the past year I’ve been formulating a series of tools and practices that can provide deep insight into system performance simply by looking at TCP packet headers, and when they arrive and depart from a system. This works for MySQL as well as a lot of other types of systems, because it doesn’t require any of the contents of the packet. Thus, it works without knowledge of what the server and client are conversing about. Packet headers contain only information that’s usually regarded as non-sensitive (IP address, port, TCP flags, etc), so it’s also very easy to get access to this data even in highly secure environments.

I’ve finally written up a paper that shows some of my techniques for detecting problems in a system, which can be an easy way to answer questions such as “is there something we should look into more deeply?” without launching a full-blown analysis project first. It’s available from the white paper section of our website: MySQL Performance Analysis with Percona Toolkit and TCP/IP Network Traffic

Feb
22
2012
--

How to Monitor MySQL with Percona’s Nagios Plugins

In this post, I’ll cover the new MySQL monitoring plugins we created for Nagios, and explain their features and intended purpose.

I want to add a little context. What problem were we trying to solve with these plugins? Why yet another set of MySQL monitoring plugins?

The typical problem with Nagios monitoring (and indeed with most other monitoring systems) is the alert spam. I’ve spoken to a lot of users with monitoring systems over the years, and everyone says that they get a lot of false alarms, and that they have email filters set up for alerts. Recently I spoke to one person who gets about thirty thousand alerts per week (!!!) in his email. I’m sure you can see that this makes the monitoring system much less effective. The usual cause of such noisy monitoring is when you alert on things that don’t really indicate problems.

To avoid this problem as much as possible, I suggest the following:

  1. Alert only on things that reliably indicate a real problem. Do not alert on a threshold that merely indicates something is abnormal; abnormal things happen constantly. It’s useful to spend time investigating abnormalities, but in general, you should try to make sure that doesn’t creep into your fault-detection alerts.
  2. Alert only on things that are actionable. If you get an alert and you can’t do anything about it, that’s a certain recipe for a /dev/null email filter.
  3. Avoid ratios and thresholds as much as possible, because they are rarely actionable or reliable indicators of problems, and it’s difficult or impossible to find the right threshold for all situations. This suggestion is a consequence of the previous two, but it’s worth calling out separately in my opinion. I’ll highlight a couple of threshold-based alerts later in this blog post, and explain the need to be careful with them.

Although we’re not finished with the plugins, we have a usable starting point, and we’ll build on it as time passes. The plugins we created are based on our analysis of several hundred customer downtime incidents, and their causes and preventions. You can read more about that research in our white papers, here and here.

If you read those white papers, you’ll see a pretty extensive list of suggestions for what to monitor in MySQL. Those are only suggestions; any given deployment might want to monitor only some of them, as appropriate, and might need to include others as well. Here’s how you can use our new monitoring plugins to check for some of the conditions that caused downtime for our customers:

  • Failed or full LVM snapshots. LVM backup strategies can fail to delete snapshots after finishing, sometimes due to the backup script failing before completion. Additionally, LVM snapshots that don’t have enough copy-on-write space can fail before the backup completes. The pmp-check-lvm-snapshots plugin will alert when a snapshot is failed or overly full.
  • Too many deadlocks. Deadlocks (and lock wait timeouts, covered in another plugin) are frequent causes of serious errors in applications because the applications often don’t handle deadlocks correctly, and in many cases also don’t log the resulting errors. The best place to fix these problems is in the application, but the DBA really needs to be monitoring it as well, because an application error can creep in easily and be deployed, and you will not find it anywhere but in production. The pmp-check-mysql-deadlocks plugin can help you detect such problems, in conjunction with Percona Toolkit’s pt-deadlock-logger tool.
  • Open but deleted filehandles. It’s quite easy for a bad logrotate script (such as the one that ships by default with many server installations) or a system administration mistake to delete one of MySQL’s log or data files from disk. The database server will continue to function, however, because the file isn’t really deleted. The MySQL server keeps an open filehandle to the file, and it won’t be deleted until the server restarts, so the problem remains undetected while it becomes more and more severe. This can cause two serious types of problems: the disk can fill up with no apparent cause or solution, or you can restart the server and lose your data permanently. The pmp-check-mysql-deleted-files plugin will alert when files have been deleted but remain open via a filehandle.
  • Wrong filesystem privileges. It’s very easy for a file to creep into MySQL’s data directory with wrong permissions, without causing problems until a later date. For example, if a database isn’t owned by MySQL, you won’t have any problems until you try to create or drop a table, or the database itself. Likewise, if a system administrator leaves a temporary file in the database directory and then you try to drop the database, it will fail. These things tend to happen at inconvenient times. The pmp-check-mysql-file-privs plugin checks whether MySQL owns every file and directory within its data directory.
  • InnoDB is getting stuck, has long-running transactions, or other problems. The pmp-check-mysql-innodb plugin checks for several indicators of severe internal problems within InnoDB, such as contention that crosses a tipping point and cripples the server. Sometimes such conditions are like credit card debt: easy to start, impossible to dig out afterwards. Another very common cause of serious incidents is application misbehavior, such as opening a transaction but never committing it, which creates a lot of bloat or locking in the database server. This plugin can alert on that and a couple other types of common problems, too.
  • Missing PID file. A variety of problems can delete MySQL’s PID file, such as a buggy init script that attempts to start the server when it’s already running, or executing the server by hand instead of through the init script. Overly zealous cron jobs to delete old files in /tmp can do the same thing. If the PID file is missing, then many init scripts will behave very badly. Sample problems include starting two instances of MySQL on the same files in the data directory, and refusing to shut down MySQL so that it is killed forcibly (i.e. crashed) instead of gracefully shutting down for a system reboot. The pmp-check-mysql-pidfile plugin verifies that MySQL’s PID file exists.
  • Evidence of contention in the processlist. The pmp-check-mysql-processlist plugin checks SHOW FULL PROCESSLIST for patterns that indicate similar problems to those discussed within InnoDB, but it can find evidence of problems in the server as well as the storage engine. For example, a pile-up of connections all trying to open or close tables can indicate LOCK_open contention.
  • Replication is delayed or stopped. Our plugins are structured to monitor as few things as possible, but as many as needed. Replication is one thing that needs to be monitored in two dimensions, which shouldn’t be combined: delay (Seconds_behind_master or the delay indicated by the pt-heartbeat tool, if you choose), and actual failure. The pmp-check-mysql-replication-delay plugin checks for excessive delay, and pmp-check-mysql-replication-running checks that replication is functioning correctly.
  • Replicas have different data than their master. In addition to delay or failure, replication can fail silently and data can diverge or drift on a replica. The pt-table-checksum tool from Percona Toolkit can generate table checksums, and the pmp-check-pt-table-checksum plugin can alert when those checksums indicate that the replica is not logically identical to its master.
  • The server is allocating too much memory. A runaway server process, or poorly optimized queries, can consume too much memory and cause the server to swap until it is unresponsive and has to be power cycled. In addition, other processes can be runaways, too, such as a poorly written script or daemon; it is not unheard of for a monitoring agent or RAID utility software to do this. The pmp-check-unix-memory plugin checks for a single process consuming too much memory and/or for too little free memory in general. This plugin requires careful configuration because it uses a threshold for which there is no good one-size-fits-all value. This plugin, perhaps more than any other, should only be used judiciously.
  • The server has been restarted recently. It is surprising how often a server crashes, perhaps even cycling through crashes, and is not detected until sometime later when the problem has grown much more serious. You can use the pmp-check-mysql-status plugin to alert when the server’s Uptime variable is too small.
  • The server is approaching max_connections. Overflowing the max_connections setting is a bit like a lack of disk space: it is possible for it to be slowly approached over time, and then to begin causing problems when connections are rejected. Similarly to deadlocks and timeouts, applications often don’t handle or log this error appropriately. The pmp-check-mysql-status plugin can be configured to alert when Threads_connected approaches max_connections too closely, which can give the system administrator time to address the situation before it becomes a problem. This check requires careful configuration, or it too can become a noise generator. However, if your application environment makes it difficult to handle max_connections problems (the use of fixed-size connection pooling, for example, which can sometimes require application downtime to resolve), then it might be worth considering to let you deal with the problem when it’s convenient instead of when it’s urgent.

In addition to the checks in that list, there are several others that can be performed with existing plugins, so there was no need to write new ones. For example, the plugins that ship with Nagios are suitable for checking that the server is alive and responds to a query.

The pmp-check-mysql-status is also very flexible and can be used to trigger an alert on almost anything from SHOW VARIABLES and SHOW STATUS that you want. If you want to monitor ratios and rates, for example, you can certainly do so; it also supports an incremental mode of operation, so you can (for example) get the incremental increase in a variable such as Queries and alert on a queries-per-second threshold. (I don’t suggest doing this; I’m only listing an example of what you can do. The documentation has several other examples.)

In future releases we will include more plugins to monitor more types of problems, but the current plugins already cover some of the most common and severe conditions that we’ve observed to cause actual production downtime. Please download and try out the plugins, and help us improve them. If you are a customer and you need help configuring any of the plugins, we are happy to assist. In addition, please ask our sales staff to include a free monitoring setup in a new support contract. I believe they will be happy to negotiate this with you.

Feb
21
2012
--

DBD::mysql 4.014 breaks pt-table-checksum 2.0

DBD::mysql 4.014 breaks pt-table-checksum 2.0.  The cause is unknown, but the effect is a lot of errors like:

DBD::mysql::st execute failed: called with 2 bind variables when 6 are needed [for Statement "..." with ParamValues: ...] at ./pt-table-checksum line 7216.

The fix is simple: upgrade (or even downgrade) DBD::mysql to any version except 4.014. To see which version of DBD::mysql a system has, execute:

perl -MDBD::mysql -e 'print $DBD::mysql::VERSION, "\n";'

This bug may affect other Percona Toolkit tools, but currently pt-table-checksum 2.0 is the only victim. This bug does not affect pt-table-checksum 1.0, and it cannot be worked around in pt-table-checksum 2.0 because the bug in in DBD::mysql.

This bug affects pt-table-checksum 2.0 but not 1.0 because the newer version uses prepared statements with parameter values, whereas the older version does not. It seems, although I have not verified this, that DBD::mysql 4.014 has some sort of caching mechanism which causes it to use the wrong prepared statement.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com