Mar
28
2011
--

Maatkit’s mk-query-digest filters

Have you ever seen BIG weird numbers in mk-query-digest report that just seem wrong? I have! Here’s one report I got today:

...
# Attribute          total     min     max     avg     95%  stddev  median
# ============     ======= ======= ======= ======= ======= ======= =======
# Exec time          5088s     1us    171s     2ms   467us   104ms    28us
# Lock time            76s       0      3s    26us    69us     3ms       0
# Rows sent          9.80M       0   1.05M    3.50    0.99  642.32       0
# Rows examine       5.59G       0  82.56M   2.00k    0.99  97.41k       0
# Rows affecte     457.30k       0   2.62k    0.16    0.99    1.68       0
# Rows read          2.16G       0  82.56M  788.53   21.45  82.91k    0.99
# Bytes sent         2.14T       0   4.00G 781.27k   3.52k  47.84M   84.10
# Merge passes     273.47G       0   4.00G  97.69k       0  10.35M       0
# Tmp tables       225.85G       0   4.00G  80.67k       0   7.89M       0
# Tmp disk tbl     381.88G       0   4.00G 136.41k       0  14.08M       0
# Tmp tbl size     255.54G       0   2.39G  91.28k       0   8.59M       0
# Query size       418.38M       6 257.39k  149.45  563.87   1.49k   42.48
# InnoDB:
# IO r bytes       272.92G       0   1.58G 479.74k       0  18.19M       0
# IO r ops         328.05G       0   2.00G 576.65k       0  23.08M       0
# IO r wait        237964272912s       0 3377771733s 398921s       0 22888946s       0
# pages distin     156.50G       0   2.39G 275.11k   27.38  14.48M    4.96
# queue wait       143150489533s       0 3377769328s 239976s       0 16014027s       0
# rec lock wai     216352062699s       0 4085510331s 362690s       0 27625029s       0
...

That can’t be right! Apparently there are couple bugs in the slowlog patch that aren’t really critical as the numbers are only wrong for the administrator commands. Yet when aggregated it kind of messes up the whole mk-query-digest output.

There’s a simple solution to that though — mk-query-digest –filter. With just a small change it will completely ignore administrator commands:

mk-query-digest --filter='$event->{arg} !~ m/administrator command/' in > out

If these bugs are really bugging you, feel free to sponsor the fix. Otherwise enjoy the power of the almighty mk-query-digest!

Nov
29
2010
--

Data Corruption, DRBD and story of bug

Working with customer, I faced pretty nasty bug, which is actually not rare situation , but in this particular there are some lessons I would like to share.

The case is pretty much described in bug 55981, or
in pastebin.

Everything below is related to InnoDB-plugin/XtraDB, but not to regular InnoDB ( i.e in MySQL 5.0)

In short, if you use big BLOBS ( TEXT, MEDIUMBLOB, etc) (that allocated in external segment in InnoDB), you can get your database in trash state just executing update on row with blob and rolling back transaction twice ( on the same row)

The keywords that diagnose you hit this bug is

InnoDB: Serious error! InnoDB is trying to free page N
InnoDB: though it is already marked as free in the tablespace!
InnoDB: The tablespace free space info is corrupt.
InnoDB: You may need to dump your InnoDB tables and recreate the whole
InnoDB: database!

Trash state means that InnoDB won’t start, and you need to use innodb_force_recovery=3 and mysqldump your data. What makes problem even worse is that InnoDB does not report tablename, so you are pretty much blind and need to dump whole dataset, which can be long process.

The moment where DRBD come in play, is if you use DRBD for HA purposes ( as is in the case I worked with), you screwed,
as DRBD mirroring physical data, and MySQL keeping crashing on both instances – active and passive.

So DRBD can’t be considered fully reliable HA solution if there is risk that application corrupts data by itself

Now to bug 55981. It has MySQL version 5.6, but the problem exists in MySQL 5.1.50 or below and in MySQL 5.5, and
corresponding bug is 55543, which you actually can’t see, as
“You do not have access to bug #55543.”, because it is marked as “Security problem”.

And I actually tend to agree that bug can be considered as “security”.
If you running public hosting or your public users can execute direct SQL statements, I strongly recommend to upgrade to
MySQL version 5.1.51+ .

Now another interesting point – how can you be sure that 5.1.51 works.

The bug 55543 is not mentioned in ChangeLog for 5.1.51 nor 5.1.52. However if you look into source code and revision history, you can see that bug 55543 is fixed in MySQL 5.1.51. I assume it is technical problem with ChangeLog process and it will be fixed soon, but I reported it so it is not lost

At the end let me reiterate my points:
- if you have BLOB/TEXT fields in your InnoDB-plugin schema, it is recommended to upgrade to 5.1.51+
- if you provide public access to MySQL instance ( hosting provider, etc) with InnoDB-plugin installed – it is STRONGLY recommended to upgrade to 5.1.51+
- Review your HA schema. DRBD by itself (without additional solutions) can’t guaranty decent level of High Availability. And just to be clear, it is not DRBD problem, DRBD basically can’t help if there is possibility that application corrupts data by itself. For this case regular Master-Slave setup (in addition to DRBD) would protect from such problem.


Entry posted by Vadim |
6 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Jun
29
2010
--

Recall of Percona Server 5.1.47-11.0

Percona Server release 11.0 which we announced few days ago unfortunately was released with a bug introduced while implementing stripping comments in query cache which could cause server crash with certain types of queries if query cache is enabled. We have released Percona Server release 11.1 which includes a fix for this issue. If you can’t perform upgrade promptly you can disable query cache until you can do this.

We’re sorry for shipping release with such bug.

The issue with this bug in more details is as follows:
MySQL Query Cache works by checking incoming queries if they may be found in query cache – this is done by performing simple check if query is starting from SEL or comment. After Query is parsed and executed MySQL stores results for SELECT queries in Query Cache. There are cases when queries will not be considered as cachable query during first check, however result for them will be stored in the query cache. This rare case was not handled properly in the new feature added and it was causing MySQL to crash.

The most typical case for queries causing issues would be queries starting with brackets and having space before select such as ( select * from t1 ) Queries starting with brackets are often used with UNION queries.


Entry posted by peter |
9 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Feb
15
2010
--

Missleading Innodb message on recovery

As I wrote about 2 years ago the feature of Innodb to store copy of master’s position in Slave’s Innodb tablespace got broken. There is a lot of discussions at the corresponding bug report while outcome of the fix remained uncertain for me (the bug is market duplicate while the bugs it seems to be duplicate for describe different issues).
Anyway. The customer came to me today having the following message in the error log after Slave crash while running MySQL 5.1.41 (extra virgin version without XtraDB or Percona Patches)

InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 0 10000000, file name mysql-bin.005000
InnoDB: Last MySQL binlog file position 0 20000000, file name ./mysql-bin.003000

He has tried to restart replication from mysql-bin.005000 position 10000000 which failed with “Could not find first log file name in binary log index file” error message

Looking at the Master for this slave I could see its binary log files going only to about 2000, so the binary log file mentioned did not ever exist on this master. What is going on ?
The thing is Innodb does not update this information any more however if it is stored in the tablespace the code is still there to print it. This database was running older MySQL version a while back which was updated to MySQL 5.1 months ago, moved to the new hardware by physical copy and the log file numbers restarted back from 1 but tablespace still contained the ancient data.

I reported the bug on this which should be easy to fix. Otherwise it is easy mistake to make. We also have a patch which restores this information and uses it on slave crash recovery.


Entry posted by peter |
2 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Jan
09
2010
--

Getting around optimizer limitations with an IN() list

There was a discussion on LinkedIn one month ago that caught my eye:

Database search by “within x number of miles” radius?

Anyone out there created a zipcode database and created a “search within x numer of miles” function ?
Thankful for any tips you can throw my way..

J

A few people commented that some solutions wouldn’t scale. To understand why these sorts of geographic search queries are problematic in MySQL, it’s best to show some execution plans on dummy data:

SQL:

  1. EXPLAIN SELECT * FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40
  2. AND y BETWEEN 50 AND 60 AND col_a = ‘set1′;
  3. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  4. | id | select_type | TABLE       | type  | possible_keys | KEY       | key_len | ref  | rows | Extra       |
  5. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  6. 1 | SIMPLE      | coordinates | range | x_y_col_a     | x_y_col_a | 38      | NULL | 4032 | USING WHERE |
  7. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  8. 1 row IN SET (0.00 sec)
  9.  
  10. EXPLAIN SELECT * FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40;
  11. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  12. | id | select_type | TABLE       | type  | possible_keys | KEY       | key_len | ref  | rows | Extra       |
  13. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  14. 1 | SIMPLE      | coordinates | range | x_y_col_a     | x_y_col_a | 3       | NULL | 4032 | USING WHERE |
  15. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  16. 1 row IN SET (0.01 sec)
  17.  
  18. SELECT count(*) FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40
  19. AND y BETWEEN 50 AND 60 AND col_a = ‘set1′;
  20. +———-+
  21. | count(*) |
  22. +———-+
  23. |        1 |
  24. +———-+
  25. 1 row IN SET (0.00 sec)
  26.  
  27. SELECT count(*) FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40;
  28. +———-+
  29. | count(*) |
  30. +———-+
  31. |     1664 |
  32. +———-+
  33. 1 row IN SET (0.01 sec)

Did you notice that we estimate just as many rows on the first EXPLAIN as the second one? That doesn’t make any sense! The index covers x,y and col_a and should be eliminating a lot of searching, since there is only one row which meets this condition!

The reason for this is simply a missing feature of the MySQL optimizer – and it has to do with using x BETWEEN 30 and 40 (and it’s also true with x >= 30 AND x <= 40). Using a range like this prevents us from using the rest of the index. There is a workaround, but it’s not pretty:

SQL:

  1. EXPLAIN SELECT * FROM coordinates WHERE x IN
  2. (30.30,30.61,31.18,31.26,31.72,32.11,32.25,32.30,32.42,32.91,33.27,
  3. 33.69,33.79,33.93,34.62,34.78,35.10,35.41,36.62,36.93,37.17,38.93,39.20,
  4. 39.56,39.84,39.87) AND y IN (59.58,56.81,57.27,54.14,56.43,51.87,54.59,
  5. 59.56,57.42,54.13,56.79,59.45) AND col_a = ‘set1′;
  6. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  7. | id | select_type | TABLE       | type  | possible_keys | KEY       | key_len | ref  | rows | Extra       |
  8. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  9. 1 | SIMPLE      | coordinates | range | x_y_col_a     | x_y_col_a | 38      | NULL312 | USING WHERE |
  10. +—-+————-+————-+——-+—————+———–+———+——+——+————-+
  11. 1 row IN SET (0.00 sec)

The ugliest thing about this, is that in real life you wouldn’t know all your possible values of X or Y, and so you may end up with a very big IN list. The workaround to this, is to create steppings of the value X and Y that we can use for indexes:

SQL:

  1. ALTER TABLE coordinates ADD x_floor INT NOT NULL, ADD y_floor INT NOT NULL, DROP INDEX x_y_col_a,
  2. ADD INDEX x_floor_y_floor_col_a (x_floor, y_floor, col_a);
  3.  
  4. UPDATE coordinates SET x_floor = FLOOR(x), y_floor = FLOOR(y);
  5.  
  6. EXPLAIN SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  7. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60) AND col_a = ‘set1′\G
  8. *************************** 1. row ***************************
  9.            id: 1
  10.   select_type: SIMPLE
  11.         TABLE: coordinates
  12.          type: range
  13. possible_keys: x_floor_y_floor_col_a
  14.           KEY: x_floor_y_floor_col_a
  15.       key_len: 40
  16.           ref: NULL
  17.          rows: 121
  18.         Extra: USING WHERE
  19. 1 row IN SET (0.00 sec)

Fantastic! The only remaining problem with this query is that it’s not quite identical to our original. In this query 60.79 will be floored to 60 (and erroneously included in our results):

SQL:

  1. SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  2. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60) AND col_a = ‘set1′;
  3. +—–+——-+——-+——-+——-+———+———+
  4. | id  | x     | y     | col_a | col_b | x_floor | y_floor |
  5. +—–+——-+——-+——-+——-+———+———+
  6. | 144 | 33.79 | 54.59 | set1  | NULL  |      33 |      54 |
  7. 38 | 39.20 | 60.79 | set1  | NULL  |      39 |      60 |
  8. +—–+——-+——-+——-+——-+———+———+
  9. 2 rows IN SET (0.00 sec)

However, this is a quick fix by re-including the original WHERE conditions (we are just no longer using an index on them):

SQL:

  1. EXPLAIN SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  2. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60)
  3. AND col_a = ‘set1′ AND x BETWEEN 30 AND 40 AND y BETWEEN 50 AND 60\G
  4. *************************** 1. row ***************************
  5.            id: 1
  6.   select_type: SIMPLE
  7.         TABLE: coordinates
  8.          type: range
  9. possible_keys: x_floor_y_floor_col_a
  10.           KEY: x_floor_y_floor_col_a
  11.       key_len: 40
  12.           ref: NULL
  13.          rows: 121
  14.         Extra: USING WHERE
  15. 1 row IN SET (0.00 sec)
  16.  
  17. SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  18. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60)
  19. AND col_a = ‘set1′ AND x BETWEEN 30 AND 40 AND y BETWEEN 50 AND 60;
  20. +—–+——-+——-+——-+——-+———+———+
  21. | id  | x     | y     | col_a | col_b | x_floor | y_floor |
  22. +—–+——-+——-+——-+——-+———+———+
  23. | 144 | 33.79 | 54.59 | set1  | NULL  |      33 |      54 |
  24. +—–+——-+——-+——-+——-+———+———+
  25. 1 row IN SET (0.00 sec)

Conclusion:
My examples were only on a small amount of data (16 000 rows) that fitted in memory, but the original query would have full table scanned if I didn’t use a FORCE INDEX hint. Add more data, and if X can’t filter out enough rows by itself this can create a real problem.

Workarounds are all very good, but they also make applications more difficult to maintain. If you really want to do these types of queries, you should give Sphinx a try.


Entry posted by Morgan Tocker |
6 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Dec
05
2009
--

How many partitions can you have ?

I had an interesting case recently. The customer dealing with large MySQL data warehouse had the table which was had data merged into it with INSERT ON DUPLICATE KEY UPDATE statements. The performance was extremely slow. I turned out it is caused by hundreds of daily partitions created for this table. What is the most interesting (and surprising) not every statement is affected equally as you can see from the benchmarks above:

I got the following test table created:

SQL:

  1. CREATE TABLE `p10` (
  2.     `id` int(10) UNSIGNED NOT NULL,
  3.     `c`  int(10) UNSIGNED NOT NULL,
  4.      PRIMARY KEY (`id`),
  5.      KEY(c)
  6.      ) ENGINE=InnoDB
  7.      PARTITION BY RANGE(id) (
  8.      PARTITION p100000 VALUES LESS THAN(100001),
  9. PARTITION p200000 VALUES LESS THAN(200001),
  10. PARTITION p300000 VALUES LESS THAN(300001),
  11. PARTITION p400000 VALUES LESS THAN(400001),
  12. PARTITION p500000 VALUES LESS THAN(500001),
  13. PARTITION p600000 VALUES LESS THAN(600001),
  14. PARTITION p700000 VALUES LESS THAN(700001),
  15. PARTITION p800000 VALUES LESS THAN(800001),
  16. PARTITION p900000 VALUES LESS THAN(900001),
  17. PARTITION p1000000 VALUES LESS THAN(1000001)
  18.      );

I used MySQL 5.1.41 for my tests increasing buffer pool to 1G and log file size to 128M so all data is comfortably in memory.

I ranged number of partitions from 1 to 1000 and loaded the table with 1000000 of sequential values from 1 to 1million (the C column was set same as ID column) using bulk insert statements – 1000 rows in each.

Loading the data was taking 9 seconds for 1 partitions, 10 seconds for 10 partitions, 16 seconds for 100 partitions and 23 seconds for 1000 partitions, which means it slows down 2.5 times as number of partitions increases to 1000.

This regression is somethat surprising as in reality only 1 (max 2) partitions got data inserted to them per insert statement.

As I tested the UPDATE path on INSERT OF DUPLICATE KEY UPDATE (adding ON DUPLICATE KEY UPDATE set c=c+1 to my bulk inserts) the regression became even larger.

1 partition insert took 50 seconds to complete, 10 partitions 52, 100 partitions 72, 1000 partitions 290 seconds. The performance loss with large number of partitions is about 6 times in this case and the pattern is a bit different – the slowdown becomes drastic as we go from 100 to 1000 partitions – about 4 times compared to only 1.5 times slow down for the same case with just insert.

The difference grows even larger if we remove index on column C – the “UPDATE” part of INSERT ON DUPLICATE KEY UPDATE completes in 22 seconds for 1 partition and 250 for 1000 partitions – which is over 10x difference.

Both MyISAM and Innodb are affected by this issue. The Update path for MyISAM without indexes took 10 seconds for 1 partition vs 52 seconds for 1000 partitions (the table_cache was warm for this test)

I had couple of suspects in this case – is this the “setup” overhead of statement opening all partitions for execution or is this “per row” overhead. To check this I ran the statements with different number of rows in the batch. With 100 rows per batch and with 10000 rows per batch performance was not significantly different from 1000 rows per batch so I conclude this is per row overhead.

This test gives other rather interesting data point – it looks like updates in Innodb for CPU bound workloads can be as much as 5 times slower than inserts – which is quite a difference.

CONCLUSION: This is interesting topic and I should do more tests on this. In the meanwhile the data point is simple – be careful with number of partitions you use. Creating unused partitions for future use may cost you. From this benchmarks the performance remains reasonable with about 100 partitions (which is reasonable number for most applications) though I would expect numbers to be significantly workload dependent.


Entry posted by peter |
15 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Nov
19
2009
--

Rare evil MySQL Bug

There is the rare bug which I ran into every so often. Last time I’ve seen it about 3 years ago on MySQL 4.1 and I hoped it is long fixed since… but it looks like it is not. I now get to see MySQL 5.4.2 in the funny state.

When you see bug happening you would see MySQL log flooded with error messages like this:

091119 23:03:34 [ERROR] Error in accept: Resource temporarily unavailable
091119 23:03:34 [ERROR] Error in accept: Resource temporarily unavailable
091119 23:03:34 [ERROR] Error in accept: Resource temporarily unavailable
091119 23:03:34 [ERROR] Error in accept: Resource temporarily unavailable

filling out disk space

Depending on the case you may be able to connect to MySQL through Unix Socket or TCP/IP or neither.
It also looks like there is a correlation between having a lot of tables and such condition.

Previously I was unlucky with seeing this issue in production so we had to restart MySQL quickly currently I have a test MySQL showing some behavior.

Here is what strace tells me:

[percona@test9 msb_5_4_2]$ strace -f -p 19229
Process 19229 attached with 23 threads – interrupt to quit
[pid 19286] rt_sigtimedwait([HUP QUIT ALRM TERM TSTP],
[pid 19285] futex(0x165962ec, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19284] select(0, NULL, NULL, NULL, {0, 765000}
[pid 19283] select(0, NULL, NULL, NULL, {0, 412000}
[pid 19248] futex(0x1781193c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19247] futex(0x178118bc, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19246] futex(0x1781183c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19245] futex(0x178117bc, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19244] futex(0x1781173c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19232] futex(0x1781113c, FUTEX_WAIT_PRIVATE, 165, NULL
[pid 19229] accept(16392,
[pid 19241] futex(0x178115bc, FUTEX_WAIT_PRIVATE, 319, NULL
[pid 19243] futex(0x178116bc, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19242] futex(0x1781163c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19240] futex(0x1781153c, FUTEX_WAIT_PRIVATE, 3, NULL
[pid 19239] futex(0x178114bc, FUTEX_WAIT_PRIVATE, 7, NULL
[pid 19238] futex(0x1781143c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19237] futex(0x178113bc, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19236] futex(0x1781133c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19235] futex(0x178112bc, FUTEX_WAIT_PRIVATE, 7, NULL
[pid 19234] futex(0x1781123c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19233] futex(0x178111bc, FUTEX_WAIT_PRIVATE, 207, NULL
[pid 19231] futex(0x178110bc, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19229] <... accept resumed> 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] fcntl(16392, F_SETFL, O_RDWR) = 0
[pid 19229] fcntl(16392, F_SETFL, O_RDWR) = 0
[pid 19229] select(16394, [1025 1040 1042 1044
....
8 9709 9714 9716 9717 15368 15369], NULL, NULL, NULL*** buffer overflow detected ***: strace terminated
======= Backtrace: =========
/lib64/libc.so.6(__chk_fail+0x2f)[0x35f36e6aff]
/lib64/libc.so.6[0x35f36e5ad3]
strace[0x408b60]

So as you can see accept gets pretty high socket number – probably because of large innodb_open_files I tested with in this case – which all can be used during recovery.

Note the process gets accept on the socket to fail with EAGAIN (which is not well described in the manual) and later getting call to select call with 16384 sockets. This does not seems to be the healthy number to work with SELECT call and It is quite possible something goes wrong because of it.

If you have any ideas why could be going wrong in this case.


Entry posted by peter |
7 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Written by in: bugs,MySQL,Zend Developer |

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com