Apr
09
2015
--

InnoDB locks and deadlocks with or without indexes for different isolation levels

Recently, I was working on an issue related to locks and deadlocks with InnoDB tables, and I found some very interesting details about how InnoDB locks and deadlocks work with or without an index under different isolation levels.

Here, I would like to describe a small test case showing how SELECT ... FOR UPDATE (with and without LIMIT) behaves with INSERT/UPDATE under the READ-COMMITTED and REPEATABLE-READ isolation levels. I'm creating a small table data_col with a few records. This test case was originally written by Bill Karwin to explain the details to a customer; here I have used a slightly modified version.

CREATE TABLE data_col (dataname VARCHAR(10), period INT, expires DATE, host VARCHAR(10));

INSERT INTO data_col VALUES ('med1', 1, '2014-01-01 00:00:00', 'server1');
INSERT INTO data_col VALUES ('med2', 1, '2014-02-15 00:00:00', 'server2');
INSERT INTO data_col VALUES ('med3', 1, '2014-03-20 00:00:00', 'server3');
INSERT INTO data_col VALUES ('med4', 1, '2014-04-10 00:00:00', 'server4');
INSERT INTO data_col VALUES ('med5', 1, '2014-05-01 00:00:00', 'server5');

Case 1: No index on expires, tx_isolation=READ-COMMITTED.

Session 1: SET tx_isolation='READ-COMMITTED'; START TRANSACTION;
Session 2: SET tx_isolation='READ-COMMITTED'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < '2014-03-01' ORDER BY expires LIMIT 1 FOR UPDATE;
Session 2: INSERT INTO data_col VALUES ('med6', 1, '2014-06-03 00:00:00', 'server6'); /* success */
Session 1: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE; /* hangs */
Session 2: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE; /* deadlocks */

------------------------
LATEST DETECTED DEADLOCK
------------------------
2015-04-04 12:35:45 7f3f0a084700
*** (1) TRANSACTION:
TRANSACTION 28960, ACTIVE 24 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 360, 7 row lock(s)
MySQL thread id 39, OS thread handle 0x7f3f0a0b5700, query id 158 localhost root Creating sort index
SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 14 page no 3 n bits 80 index `GEN_CLUST_INDEX` of table `nil`.`data_col` trx id 28960 lock_mode X locks rec but not gap waiting
*** (2) TRANSACTION:
TRANSACTION 28961, ACTIVE 17 sec starting index read
mysql tables in use 1, locked 1
3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
MySQL thread id 40, OS thread handle 0x7f3f0a084700, query id 159 localhost root Creating sort index
SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 14 page no 3 n bits 80 index `GEN_CLUST_INDEX` of table `nil`.`data_col` trx id 28961 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 14 page no 3 n bits 80 index `GEN_CLUST_INDEX` of table `nil`.`data_col` trx id 28961 lock_mode X locks rec but not gap waiting
*** WE ROLL BACK TRANSACTION (1)

With READ-COMMITTED, even though Session 1 locks the records matching the condition "expires < '2014-03-01'", Session 2 can insert a record because Session 1 is not using gap locks (lock_mode X locks rec but not gap), so we can insert/update records outside the set Session 1 examined. But when Session 1 tries to acquire locks on a higher range ("expires < '2014-07-01'"), it hangs, and if we run the same statement from Session 2, it turns into a deadlock.
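
The deadlock report above (and the transaction lists shown later in this post) come from the InnoDB monitor output, which you can print from any session:

-- Prints the InnoDB monitor, including the LATEST DETECTED DEADLOCK
-- and TRANSACTIONS sections quoted in this post
SHOW ENGINE INNODB STATUS\G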

Here, when there is no primary key, InnoDB creates its own internal clustered index, which is GEN_CLUST_INDEX.
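
If you want to see this hidden index, the InnoDB data dictionary exposes it; here is a quick check (a sketch, assuming MySQL 5.6/5.7, where the INNODB_SYS_* tables live in INFORMATION_SCHEMA; in 8.0 they became INNODB_TABLES/INNODB_INDEXES):

-- Lists the indexes of data_col; without an explicit PRIMARY KEY (or a
-- UNIQUE NOT NULL key), the internal GEN_CLUST_INDEX shows up here
SELECT i.NAME AS index_name
FROM information_schema.INNODB_SYS_INDEXES i
JOIN information_schema.INNODB_SYS_TABLES t ON i.TABLE_ID = t.TABLE_ID
WHERE t.NAME = 'nil/data_col';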

Case 2: No index on expires, tx_isolation=REPEATABLE-READ.

Session 1: SET tx_isolation='REPEATABLE-READ'; START TRANSACTION;
Session 2: SET tx_isolation='REPEATABLE-READ'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < '2014-03-01' ORDER BY expires LIMIT 1 FOR UPDATE;
Session 2: INSERT INTO data_col VALUES ('med6', 1, '2014-06-03 00:00:00', 'server6'); /* hangs */

------------
TRANSACTIONS
------------
Trx id counter 29502
Purge done for trx's n:o < 29500 undo n:o < 0 state: running but idle
History list length 86
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 41, OS thread handle 0x7f2e901f5700, query id 175 localhost root init
show engine innodb status
---TRANSACTION 29501, ACTIVE 4 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s)
MySQL thread id 40, OS thread handle 0x7f2e90226700, query id 174 localhost root update
INSERT INTO data_col VALUES ('med6', 1,'2014-06-03 00:00:00', 'server6')
------- TRX HAS BEEN WAITING 4 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 16 page no 3 n bits 80 index `GEN_CLUST_INDEX` of table `nil`.`data_col` trx id 29501 lock_mode X insert intention waiting
------------------
---TRANSACTION 29500, ACTIVE 10 sec
2 lock struct(s), heap size 360, 13 row lock(s)
MySQL thread id 39, OS thread handle 0x7f2e90257700, query id 173 localhost root cleaning up

With REPEATABLE-READ, we can see that when Session 1 locks the records matching the condition "expires < '2014-03-01'", Session 2 can't insert a record and waits for Session 1 to release the lock (lock_mode X insert intention waiting), because Session 1 is using gap locks. Here, insert intention is a type of gap lock. This lock signals the intent to insert in such a way that multiple transactions inserting into the same index gap need not wait for each other if they are not inserting at the same position within the gap.
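
While Session 2 is waiting, you can also inspect who blocks whom through INFORMATION_SCHEMA (a sketch, assuming MySQL 5.5-5.7, where the INNODB_TRX and INNODB_LOCK_WAITS tables exist; in 8.0 the lock-wait tables moved to performance_schema):

-- Shows each blocked statement and the transaction blocking it
SELECT r.trx_mysql_thread_id AS waiting_thread,
       r.trx_query           AS waiting_query,
       b.trx_mysql_thread_id AS blocking_thread,
       b.trx_query           AS blocking_query
FROM information_schema.INNODB_LOCK_WAITS w
JOIN information_schema.INNODB_TRX r ON r.trx_id = w.requesting_trx_id
JOIN information_schema.INNODB_TRX b ON b.trx_id = w.blocking_trx_id;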

This is why, for some scenarios and business logic, REPEATABLE-READ is the better isolation level to prevent deadlocks: it uses more row locks (including gap locks).

Case 3: Added Primary Key on dataname and Index on expires, tx_isolation=READ-COMMITTED.
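
The DDL isn't shown in the original test, but from the case description and the index names in the status output below (`PRIMARY` and `expires`), it would be something like:

-- Assumed schema change for Cases 3 and 4; MySQL names the unnamed
-- secondary index after its first column, i.e. `expires`
ALTER TABLE data_col
  ADD PRIMARY KEY (dataname),
  ADD INDEX (expires);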

Session 1: SET tx_isolation='READ-COMMITTED'; START TRANSACTION;
Session 2: SET tx_isolation='READ-COMMITTED'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < '2014-03-01' ORDER BY expires LIMIT 1 FOR UPDATE;
Session 2: INSERT INTO data_col VALUES ('med13', 1, '2014-06-13 00:00:00', 'server13'); /* success */
Session 1: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE; /* success */
Session 2: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE; /* hangs */

------------
TRANSACTIONS
------------
Trx id counter 29452
Purge done for trx's n:o < 29450 undo n:o < 0 state: running but idle
History list length 68
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 41, OS thread handle 0x7f2e901f5700, query id 140 localhost root init
show engine innodb status
---TRANSACTION 29451, ACTIVE 23 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s), undo log entries 1
MySQL thread id 40, OS thread handle 0x7f2e90226700, query id 139 localhost root Sending data
SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE
------- TRX HAS BEEN WAITING 8 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 15 page no 4 n bits 80 index `expires` of table `nil`.`data_col` trx id 29451 lock_mode X locks rec but not gap waiting
------------------
---TRANSACTION 29450, ACTIVE 29 sec
3 lock struct(s), heap size 360, 2 row lock(s)
MySQL thread id 39, OS thread handle 0x7f2e90257700, query id 138 localhost root cleaning up

If there is an index on the "expires" column, which we use to filter the records, the behavior with READ-COMMITTED is different. In Case 1 we were not able to acquire locks on the higher range ("expires < '2014-07-01'"), while here we can, thanks to the index on expires. So when Session 1 has already locked the rows, Session 2 can't acquire the same locks and waits for Session 1 to release them.
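
You can confirm that the execution (and therefore locking) path changed by checking the query plan; the exact output depends on your version and statistics, but roughly:

EXPLAIN SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE;
-- Expect key: expires, type: range here, versus type: ALL plus a filesort
-- in Case 1, which is why only the matching index records get locked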

But if we remove LIMIT 1 from the SELECT ... FOR UPDATE, it behaves the same as Case 1.

----------------------- Just removed LIMIT 1 -----------------------

Session 1: SET tx_isolation='READ-COMMITTED'; START TRANSACTION;
Session 2: SET tx_isolation='READ-COMMITTED'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < '2014-03-01' ORDER BY expires FOR UPDATE;
Session 2: INSERT INTO data_col VALUES ('med14', 1, '2014-06-04 00:00:00', 'server14'); /* success */
Session 1: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires FOR UPDATE; /* hangs */
Session 2: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires FOR UPDATE; /* deadlocks */

------------------------
LATEST DETECTED DEADLOCK
------------------------
2015-04-04 14:22:49 7f3f0a084700
*** (1) TRANSACTION:
TRANSACTION 29051, ACTIVE 52 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1184, 9 row lock(s)
MySQL thread id 39, OS thread handle 0x7f3f0a0b5700, query id 251 localhost root Creating sort index
SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires FOR UPDATE
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 15 page no 3 n bits 80 index `PRIMARY` of table `nil`.`data_col` trx id 29051 lock_mode X locks rec but not gap waiting
*** (2) TRANSACTION:
TRANSACTION 29052, ACTIVE 13 sec starting index read
mysql tables in use 1, locked 1
3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
MySQL thread id 40, OS thread handle 0x7f3f0a084700, query id 252 localhost root Creating sort index
SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires FOR UPDATE
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 15 page no 3 n bits 80 index `PRIMARY` of table `nil`.`data_col` trx id 29052 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 15 page no 3 n bits 80 index `PRIMARY` of table `nil`.`data_col` trx id 29052 lock_mode X locks rec but not gap waiting
*** WE ROLL BACK TRANSACTION (2)
------------

Case 4: Added Primary Key on dataname and Index on expires, tx_isolation=REPEATABLE-READ.

Session 1: SET tx_isolation='REPEATABLE-READ'; START TRANSACTION;
Session 2: SET tx_isolation='REPEATABLE-READ'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE;
Session 2: INSERT INTO data_col VALUES ('med15', 1, '2014-06-10 00:00:00', 'server15'); /* success */

Here, unlike Case 2, we are able to INSERT a record from Session 2 because Session 1 is not using a gap lock.

-------------------- The INSERT succeeded, but if we remove LIMIT 1: --------------------

Session 1: SET tx_isolation='REPEATABLE-READ'; START TRANSACTION;
Session 2: SET tx_isolation='REPEATABLE-READ'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires FOR UPDATE;
Session 2: INSERT INTO data_col VALUES ('med11', 1, '2014-06-11 00:00:00', 'server11'); /* hangs */

------------
TRANSACTIONS
------------
Trx id counter 29454
Purge done for trx's n:o < 29450 undo n:o < 0 state: running but idle
History list length 68
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 41, OS thread handle 0x7f2e901f5700, query id 149 localhost root init
show engine innodb status
---TRANSACTION 29453, ACTIVE 6 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s)
MySQL thread id 40, OS thread handle 0x7f2e90226700, query id 148 localhost root update
INSERT INTO data_col VALUES ('med11', 1,'2014-06-11 00:00:00', 'server11')
------- TRX HAS BEEN WAITING 6 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 15 page no 3 n bits 80 index `PRIMARY` of table `nil`.`data_col` trx id 29453 lock_mode X locks gap before rec insert intention waiting
------------------
---TRANSACTION 29452, ACTIVE 18 sec
2 lock struct(s), heap size 360, 12 row lock(s)
MySQL thread id 39, OS thread handle 0x7f2e90257700, query id 147 localhost root cleaning up
--------

But here, Session 1 uses gap locks, so Session 2 can't insert the record until the lock is released. To learn more about deadlocks, I would suggest reading the blog post "How to deal with MySQL deadlocks" by my colleague Peiran Song.

There are plenty of blog posts describing InnoDB locks, but a few things are still missing from the manual; some of them have been filed as documentation bugs.

Conclusion: We can see from the above test cases that to prevent deadlocks, sometimes we have to use indexes, sometimes a higher isolation level helps (even if that seems counter-intuitive), and sometimes changes to the application or the table schema can help.

If you are coming to the Percona Live MySQL Conference and Expo 2015, then come see my talk: "Understanding InnoDB locks and deadlocks" (16 April 3:00PM-3:50PM @ Ballroom A).

The post InnoDB locks and deadlocks with or without indexes for different isolation levels appeared first on MySQL Performance Blog.

Sep
02
2014
--

Using sysbench 0.5 for performing MySQL benchmarks

Given the recent excitement and interest around OpenStack, I wanted to make sure I was ready to conduct appropriate evaluations of system performance. I generally turn to sysbench, since it comes with a variety of different tests (accessed via the --test= option), including:

  • fileio – File I/O test
  • cpu – CPU performance test
  • memory – Memory functions speed test
  • threads – Threads subsystem performance test
  • mutex – Mutex performance test

As you can see, sysbench lets you stress many of the fundamental components of your hardware and infrastructure, such as your disk subsystem, along with your CPUs and memory. An additional test type exists that is designed to perform synthetic stress testing of MySQL, and I was surprised when I didn't see it in the above list on version 0.5, as it used to show up as "oltp – OLTP test". What happened to --test=oltp?

This list is from the latest release of sysbench, which is 0.5. You're only going to be on this version if you build it yourself or if you use the package provided courtesy of Frederic Descamps (thanks lefred!). If you're using the version from EPEL, Ubuntu 14.04, or Debian 7, you're still on version 0.4.12 (check with sysbench --version). One thing you'll notice is that the OLTP test type doesn't show up anymore. What gives? I was scratching my head until I asked on Percona IRC and found out that in 0.5 the standard OLTP test type was replaced with a different syntax: instead of passing parameters to sysbench, you reference scripts written in lua. The advantage is that you now have an interface for writing your own specific load tests (provided you know lua, but it isn't hard). The pre-canned load tests still exist, but you have to have them as part of the RPM install or otherwise copied to your system.
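
For example, a custom test is just a lua file that defines an event() function. Here is a minimal sketch (hypothetical file name, reusing the helpers from the bundled common.lua, and assuming the standard sbtest1 table created by the prepare step) that fires point SELECTs:

-- my_select.lua: minimal custom sysbench 0.5 test (illustrative sketch)
pathtest = string.match(test, "(.*/)") or ""
dofile(pathtest .. "common.lua")   -- pulls in set_vars(), sb_rand(), db_query(), ...
function thread_init(thread_id)
   set_vars()                      -- reads the --oltp-* options into globals
end
function event(thread_id)
   -- sysbench calls event() in a loop on every thread; one call = one request
   local id = sb_rand(1, oltp_table_size)
   rs = db_query("SELECT c FROM sbtest1 WHERE id=" .. id)
end

You would then run it with sysbench --test=/path/to/my_select.lua (plus the usual --mysql-* options) run.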

Fortunately, if you use the package provided by lefred, you'll find these lua scripts here (this is on an Amazon AMI as of August 4th, 2014):

[root@pxc-control ~]# ls -l /usr/share/doc/sysbench/tests/db/
total 44
-rw-r--r-- 1 root root 3585 Sep 7 2012 common.lua
-rw-r--r-- 1 root root 340 Sep 7 2012 delete.lua
-rw-r--r-- 1 root root 830 Sep 7 2012 insert.lua
-rw-r--r-- 1 root root 2925 Sep 7 2012 oltp.lua
-rw-r--r-- 1 root root 342 Sep 7 2012 oltp_simple.lua
-rw-r--r-- 1 root root 425 Sep 7 2012 parallel_prepare.lua
-rw-r--r-- 1 root root 343 Sep 7 2012 select.lua
-rw-r--r-- 1 root root 3964 Sep 7 2012 select_random_points.lua
-rw-r--r-- 1 root root 4066 Sep 7 2012 select_random_ranges.lua
-rw-r--r-- 1 root root 343 Sep 7 2012 update_index.lua
-rw-r--r-- 1 root root 552 Sep 7 2012 update_non_index.lua

So the trick (if you want to call it that) is that instead of passing a single word to the –test directive, instead you pass the full path to the lua script.

This is the old way (sysbench 0.4.12 from EPEL repo):

--test=oltp --oltp-test-mode=complex

This is the new way (sysbench 0.5):

--test=/usr/share/doc/sysbench/tests/db/insert.lua

Here is an example of a test I’m running through haproxy against a 3-node PXC cluster doing the INSERT-only test type so you can see the full syntax I pass to sysbench:

[root@pxc-control ~]# cat sys_haproxy.sh
#!/bin/bash
sysbench \
--test=/usr/share/doc/sysbench/tests/db/insert.lua \
--mysql-host=pxc-control \
--mysql-port=9999 \
--mysql-user=sysbench-haproxy \
--mysql-password=sysbench-haproxy \
--mysql-db=sbtest \
--mysql-table-type=innodb \
--oltp-test-mode=complex \
--oltp-read-only=off \
--oltp-reconnect=on \
--oltp-table-size=1000000 \
--max-requests=100000000 \
--num-threads=3 \
--report-interval=1 \
--report-checkpoints=10 \
--tx-rate=24 \
$1
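
The trailing $1 simply forwards the script's first argument to sysbench, so the same wrapper handles the whole test lifecycle:

[root@pxc-control ~]# ./sys_haproxy.sh prepare   # creates and loads the sbtest table(s)
[root@pxc-control ~]# ./sys_haproxy.sh run       # runs the INSERT workload
[root@pxc-control ~]# ./sys_haproxy.sh cleanup   # drops the test table(s)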

And here’s what the insert.lua script looks like:

[root@pxc-control ~]# cat /usr/share/doc/sysbench/tests/db/insert.lua
pathtest = string.match(test, "(.*/)") or ""
dofile(pathtest .. "common.lua")
function thread_init(thread_id)
   set_vars()
end
function event(thread_id)
   local table_name
   local i
   local c_val
   local k_val
   local pad_val
   table_name = "sbtest".. sb_rand_uniform(1, oltp_tables_count)
   if (oltp_auto_inc) then
      i = 0
   else
      i = sb_rand_uniq(1, oltp_table_size)
   end
   k_val = sb_rand(1, oltp_table_size)
   c_val = sb_rand_str([[
###########-###########-###########-###########-###########-###########-###########-###########-###########-###########]])
   pad_val = sb_rand_str([[
###########-###########-###########-###########-###########]])
   rs = db_query("INSERT INTO " .. table_name ..  " (id, k, c, pad) VALUES " .. string.format("(%d, %d, '%s', '%s')",i, k_val, c_val, pad_val))
end

The thing I like most about sysbench 0.5 (beyond the lua interface, of course!) is that it now comes with a --report-interval option (which I generally set to 1) so that you get output while the script is running. No more waiting until the end of the test to get feedback! Here's a sample of sysbench 0.5 in action, running the INSERT test through a local haproxy instance and writing to three nodes in a PXC cluster, such as OpenStack Trove might do:

[root@pxc-control ~]# ./sys_haproxy.sh run
sysbench 0.5:  multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 3
Report intermediate results every 1 second(s)
Random number generator seed is 0 and will be ignored
Threads started!
[   1s] threads: 3, tps: 0.00, reads/s: 0.00, writes/s: 1099.28, response time: 9.86ms (95%)
[   2s] threads: 3, tps: 0.00, reads/s: 0.00, writes/s: 973.02, response time: 10.77ms (95%)
[   3s] threads: 3, tps: 0.00, reads/s: 0.00, writes/s: 1181.01, response time: 6.23ms (95%)
[   4s] threads: 3, tps: 0.00, reads/s: 0.00, writes/s: 1103.00, response time: 6.77ms (95%)

I would also like to call your attention to a blog post by Nilnandan Joshi from Percona’s Support team where he describes a method to build sysbench 0.5 on Debian 7.  Thanks Nil for pointing this out!

I hope that helps others out there who upgrade to sysbench 0.5 and then wonder where --test=oltp went. I'd love to hear your own sysbench use cases, and whether anyone else is publishing lua scripts for their own load testing!

The post Using sysbench 0.5 for performing MySQL benchmarks appeared first on MySQL Performance Blog.

Mar
19
2014
--

How to log slow queries on Slave in MySQL 5.0 with pt-query-digest

Working as a Percona Support Engineer, every day we see lots of issues related to MySQL replication. One very common issue is slave lag. There are many reasons for slave lag, but a common one is that queries take more time on the slave than on the master. How do you check for and log those long-running queries? In MySQL 5.1, the log-slow-slave-statements option was introduced, which you can enable on the slave to log its slow queries. But what if you want to log slow queries on a slave in an earlier version, like MySQL 5.0? There is a good workaround: pt-query-digest. How? Let's take a look.
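
(For reference, on 5.1 and later the server-side approach looks like this; a sketch of the my.cnf settings, noting that on 5.1/5.5 log-slow-slave-statements is not settable at runtime, so it needs a restart:)

# my.cnf on the slave: log slow statements executed by the replication SQL thread
[mysqld]
slow_query_log = 1
long_query_time = 1
log-slow-slave-statements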

If you want to log all queries that are running on the slave (including those run by the SQL thread), you can use pt-query-digest with the --processlist and --print (in pt-query-digest 2.1.9) or --output (in pt-query-digest 2.2.7) options and log all queries to a specific file. I have tested this in my local environment and it works.

You can start pt-query-digest like this on the slave:

nil@Dell:~$ /percona-toolkit-2.1.9/bin/pt-query-digest --processlist u=msandbox,p=msandbox,S=/tmp/mysql_sandbox34498.sock --print --no-report
OR
nil@Dell:~$ /percona-toolkit-2.2.7/bin/pt-query-digest --processlist u=msandbox,p=msandbox,S=/tmp/mysql_sandbox34498.sock --no-report --output=slowlog

Then run some long-running queries on the master:

nil@Dell:~$ mysql -umsandbox -p --socket=/tmp/mysql_sandbox34497.sock
Enter password:
mysql> use percona
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> delete from test limit 5000000;
Query OK, 5000000 rows affected (1 min 54.33 sec)
mysql> delete from test limit 5000000;
Query OK, 5000000 rows affected (1 min 56.42 sec)

mysql>

and you'll see output like this on the slave:

nil@Dell:~/Downloads/percona-toolkit-2.1.9/bin$ ./pt-query-digest --processlist u=msandbox,p=msandbox,S=/tmp/mysql_sandbox34498.sock --print --no-report
# Time: 2014-03-18T12:10:57
# User@Host: system user[system user] @ []
# Query_time: 114.000000 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
use percona;
delete from test limit 5000000;
nil@Dell:~/Downloads/percona-toolkit-2.2.7/bin$ pt-query-digest --processlist u=msandbox,p=msandbox,S=/tmp/mysql_sandbox34498.sock --no-report --output=slowlog
# Time: 2014-03-18T12:21:05
# User@Host: system user[system user] @ []
# Query_time: 117.000000 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
use percona;
delete from test limit 5000000;

You can also run pt-query-digest in the background like a daemon, sending the output to a specific file such as slow.log for later review:

i.e. /percona-toolkit-2.1.9/bin/pt-query-digest --processlist u=msandbox,p=msandbox,S=/tmp/mysql_sandbox34498.sock --print --no-report > slow.log 2>&1

OR

i.e. /percona-toolkit-2.2.7/bin/pt-query-digest --processlist u=msandbox,p=msandbox,S=/tmp/mysql_sandbox34498.sock --no-report --output=slowlog > slow.log 2>&1

Here, the default output is just like the slow query log. If we have master-master replication, where every master is also a slave, and we want to log only those statements executed by the SQL thread, the --filter option can be used like this:

pt-query-digest --filter '$event->user eq "system user"' --no-report --output=slowlog

Since pt-query-digest --processlist polls 10 times per second (the --interval option), it's not reliable for collecting complete query logs, because quick queries can fall between the polling intervals. And in any case, it won't measure query time with precision better than 1/10th of a second. But if the goal is to identify very long-running queries, it should be adequate.

The post How to log slow queries on Slave in MySQL 5.0 with pt-query-digest appeared first on MySQL Performance Blog.

Aug
12
2013
--

Want to archive tables? Use Percona Toolkit’s pt-archiver

Percona Toolkit's pt-archiver is one of the best utilities for archiving records from large tables to other tables or files. One interesting thing is that pt-archiver is a read-write tool: it deletes data from the source by default, so after archiving you don't need to delete it separately.

Since deleting is the default, you should take care before actually running it on a production server. You can test your archiving jobs with --dry-run, or use the --no-delete option if you're not sure. The purpose of this tool is mainly to archive old data from a table without impacting OLTP queries, inserting the data into another table on the same or a different server, or into a file in a format suitable for LOAD DATA INFILE.
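
For example, --dry-run makes pt-archiver print the statements it would execute and exit without touching any data (a sketch, using the same table as the examples below):

pt-archiver --dry-run --source h=localhost,D=nil,t=test --where "name='nil'" --limit=1000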

How does pt-archiver select records to archive? 

pt-archiver uses an index to select records from the table. The index is used to optimize repeated accesses to the table: pt-archiver remembers the last row it retrieved from each SELECT statement and uses it to construct the WHERE clause of the next one, using the columns in the specified index, so that MySQL can start the next SELECT where the last one ended rather than potentially scanning from the beginning of the table with each successive SELECT.
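
To illustrate the idea, the fetch statements follow roughly this pattern (an illustrative sketch against a hypothetical table with an id primary key, not the tool's literal output):

-- first chunk: scan from the start of the chosen index
SELECT id, name FROM `nil`.`test` FORCE INDEX(`PRIMARY`) WHERE (name='nil') ORDER BY id LIMIT 1000;
-- subsequent chunks: resume after the last row already archived
SELECT id, name FROM `nil`.`test` FORCE INDEX(`PRIMARY`) WHERE (name='nil') AND (id > 1000) ORDER BY id LIMIT 1000;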

If you want to run pt-archiver with a specific index, you can use the "i" option in the --source DSN. The "i" option tells pt-archiver which index to scan while archiving; it appears in a FORCE INDEX or USE INDEX hint in the SELECT statements used to fetch rows. If you don't specify anything, pt-archiver auto-discovers a good index, preferring the PRIMARY KEY if one exists. Most of the time, pt-archiver works well without the "i" option.
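
For example, forcing the scan through a hypothetical secondary index named idx_name (combined with --dry-run so you can see the resulting FORCE INDEX hint without archiving anything):

pt-archiver --dry-run --source h=localhost,D=nil,t=test,i=idx_name --where "name='nil'" --limit=1000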

How to run pt-archiver?

To archive records into a file, you can run something like:

pt-archiver --source h=localhost,D=nil,t=test --file '/home/nilnandan/%Y-%m-%d-tabname' --where "name='nil'" --limit=1000

To archive records from one table to another, on the same or a different server, you can run something like:

pt-archiver --source h=localhost,D=nil,t=test --dest h=fedora.vm --where "name='nil'" --limit=1000

Please check the documentation before you use the default file option (-F) in --source: http://www.percona.com/doc/percona-toolkit/2.1/pt-archiver.html#cmdoption-pt-archiver--dest

Archiving in a replication environment:

In a replication environment, it's really important that the slave doesn't lag for a long time. To that end, there are two options we can use while archiving to control slave lag:

--check-slave-lag: Pause archiving until the specified DSN's slave lag is less than --max-lag. With this option you give the connection details of the slave to check (e.g. --check-slave-lag h=localhost,S=/tmp/mysql_sandbox29784.sock).

--max-lag: Pause archiving if the slave given by --check-slave-lag lags.

These options cause pt-archiver to look at the slave every time it's about to fetch another row. If the slave's lag is greater than the option's value, or if the slave isn't running (so its lag is NULL), pt-archiver sleeps for --check-interval seconds and then looks at the lag again. It repeats this until the slave catches up, then proceeds to fetch and archive the row.

Some useful options for pt-archiver:

--for-update / --share-lock: Adds the FOR UPDATE / LOCK IN SHARE MODE modifier to SELECT statements.

--no-delete: Do not delete archived rows.

--plugin: Perl module name to use as a generic plugin.

--progress: Print progress information every X rows.

--statistics: Collect and print timing statistics.

--where: WHERE clause to limit which rows to archive (required).

nilnandan@nil:~$ pt-archiver --source h=localhost,D=nil,t=test,S=/tmp/mysql_sandbox29783.sock --file '/home/nilnandan/%Y-%m-%d-tabname' --where "name='nilnandan'" --limit=50000 --progress=50000 --txn-size=50000 --statistics --bulk-delete --max-lag=1 --check-interval=15 --check-slave-lag h=localhost,S=/tmp/mysql_sandbox29784.sock
TIME ELAPSED COUNT
2013-08-08T10:08:39 0 0
2013-08-08T10:09:25 46 50000
2013-08-08T10:10:32 113 100000
2013-08-08T10:11:41 182 148576
Started at 2013-08-08T10:08:39, ended at 2013-08-08T10:11:59
Source: D=nil,S=/tmp/mysql_sandbox29783.sock,h=localhost,t=test
SELECT 148576
INSERT 0
DELETE 148576
Action Count Time Pct
print_file 148576 18.2674 9.12
bulk_deleting 3 8.9535 4.47
select 4 2.9204 1.46
commit 3 0.0005 0.00
other 0 170.0719 84.95
nilnandan@nil:~$

Percona Toolkit’s pt-archiver works with Percona XtraDB Cluster (PXC) 5.5.28-23.7 and newer, but there are three limitations you should consider before archiving on a cluster. You can get more information here.

pt-archiver is extensible via a plugin mechanism. You can inject your own code to add advanced archiving logic that could be useful for archiving dependent data, applying complex business rules, or building a data warehouse during the archiving process. Follow this URL for more info on that.

Bugs related to pt-archiver: https://bugs.launchpad.net/percona-toolkit/+bugs?field.tag=pt-archiver

More details about pt-archiver: http://www.percona.com/doc/percona-toolkit/2.2/pt-archiver.html

The post Want to archive tables? Use Percona Toolkit’s pt-archiver appeared first on MySQL Performance Blog.

Jun
26
2013
--

Limited disk space? Compact backups with Percona XtraBackup 2.1

One very interesting feature, "compact backup," was introduced in Percona XtraBackup 2.1. You can take compact backups with the --compact option, which is very useful for those who have limited disk space for database backups. Let's first understand how it works. When we use the --compact option with innobackupex, it omits the secondary index pages. This makes the backups more compact, so they take less space on disk, but the downside is that the prepare process takes longer, because those secondary index pages are recreated while preparing the backup. There are a couple of things to consider before using it:

  1. Compact backups are not supported for the system tablespace, so for them to work correctly the innodb-file-per-table option should be enabled (see the check below this list).
  2. The difference in backup size depends on the size of the secondary indexes, so you won't see a drastic change in backup size if the database has few secondary indexes.
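
You can check the first requirement up front; note that only tables created (or rebuilt) after enabling the option get their own .ibd file:

-- should be ON before taking a compact backup
SHOW GLOBAL VARIABLES LIKE 'innodb_file_per_table';
-- dynamic in MySQL/Percona Server 5.5+, so it can be turned on without a restart
SET GLOBAL innodb_file_per_table = ON;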

Taking a compact backup: We can use the --compact option with innobackupex, like sudo innobackupex --compact /home/X_Backup/. As it looked like an interesting feature, I tried to test it with some scenarios, which I would like to share here. I created a table like this:

Create Table: CREATE TABLE `compact_test` (
  `id` int(11) DEFAULT NULL,
  `name` varchar(50) DEFAULT NULL,
  `city` varchar(25) DEFAULT NULL,
  `pin` int(11) DEFAULT NULL,
  `phone` bigint(20) DEFAULT NULL,
  `mobile` bigint(20) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1

I added around 10M records and, without any indexes, took backups with and without the --compact option. I found the results below.

3.9G 2013-06-19_10-21-58 - without --compact option, without indexes - Time: 3 minutes

3.9G 2013-06-19_10-48-44 - with --compact option, without indexes - Time: 3 minutes

1.4G backup.xbstream - compressed backup with xbstream - Time: 2.5 minutes. Total time 5.5 minutes (backup + compress)

For further testing, I added indexes on the compact_test table on the name, city, pin, phone and mobile columns, and found the results below.

4.7G 2013-06-19_11-42-07 - without --compact option, with indexes - Time: around 3 minutes

3.9G 2013-06-19_11-55-13 - with --compact option, with indexes - Time: around 3 minutes

1.4G backup_with_indexes.xbstream - compressed the above backup with xbstream - Time: 3 minutes. Total time 6 minutes (backup + compress)

So it's now clear that --compact only pays off if you have many indexes on your tables; otherwise it won't be that useful. Compressing the backup with xbstream afterwards takes more time, but it looks like it's worth it. One more thing: with the --compact option you can only save the space occupied by the indexes. You can see in the example above that with --compact, the backup size (3.9G) is the same as the backup of the table without indexes.

If you want to check from the backup dir whether a backup was taken with --compact or not, you can simply check the xtrabackup_checkpoints file; the compact value will be 0 if the --compact option was not used:

backup_type = full-backuped
from_lsn = 0
to_lsn = 9023692002
last_lsn = 9023692002
compact = 1

Restoring Compact Backup:

  • Prepare the backup

While preparing both backups with --apply-log, I found that it takes around 13 minutes for the compact backup, while it takes about 14 seconds for the non-compact backup. I also tried the --use-memory option to give extra memory to the prepare operation, but it didn't seem to affect the time (http://www.percona.com/doc/percona-xtrabackup/2.1/innobackupex/innobackupex_option_reference.html#cmdoption-innobackupex--use-memory). There is also a --rebuild-threads option where you can specify the number of threads to rebuild indexes in parallel. I tried it with 3 tables, but it made no difference to the process time; it's possible that more tables would (http://www.percona.com/doc/percona-xtrabackup/2.1/innobackupex/innobackupex_option_reference.html#cmdoption-innobackupex--rebuild-threads).

4.7G 2013-06-19_11-42-07 - without --compact, prepared the backup - Time: 15 seconds

3.9G 2013-06-19_11-55-13 - with --compact, prepared the backup - Time: 13 minutes

3.9G 2013-06-19_13-08-05 - with --compact and --use-memory=1G, prepared the backup - Time: 13 minutes

As I said, the downside of compact backups is that they take much longer to prepare. With a compact backup, the --apply-log output also looks slightly different, i.e.:

nilnandan@nil:~$ date && sudo innobackupex --apply-log --rebuild-indexes --use-memory=1GB /home/nilnandan/X_Backup/2013-06-19_13-08-05 && date
Wed Jun 19 14:04:13 IST 2013 [sudo] password for nilnandan:

InnoDB Backup Utility v1.5.1-xtrabackup; Copyright 2003, 2009 Innobase Oy
and Percona Ireland Ltd 2009-2012. All Rights Reserved.

130619 14:04:17 innobackupex: Starting ibbackup with command: xtrabackup_55 --defaults-file="/home/nilnandan/X_Backup/2013-06-19_13-08-05/backup-my.cnf" --defaults-group="mysqld" --prepare --target-dir=/home/nilnandan/X_Backup/2013-06-19_13-08-05 --use-memory=1GB --tmpdir=/tmp --rebuild-indexes

Starting to expand compacted .ibd files.
130619 14:04:18 InnoDB: Warning: allocated tablespace 14, old maximum was 9
Expanding ./test/compact_test.ibd

130619 14:05:00 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files

130619 14:05:02 InnoDB: Waiting for the background threads to start
[01] Checking if there are indexes to rebuild in table test/compact_test (space id: 14)
[01] Found index name
[01] Found index city
[01] Found index phone
[01] Found index mobile
[01] Rebuilding 4 index(es).
130619 14:17:28 Percona XtraDB (http://www.percona.com) 1.1.8-20.1 started; log sequence number 11016623144

xtrabackup: starting shutdown with innodb_fast_shutdown = 1
130619 14:17:37 InnoDB: Starting shutdown

130619 14:17:42 InnoDB: Shutdown completed; log sequence number 11764348940
130619 14:17:42 innobackupex: completed OK!
Wed Jun 19 14:17:42 IST 2013
nilnandan@nil:~$

  • Restore the backup to the data dir

Restoring a compact backup is very simple and just like any other innobackupex backup: use the --copy-back option with innobackupex to restore the prepared backup into the database's data directory.

nilnandan@nil:~/X_Backup$ sudo innobackupex --copy-back /home/nilnandan/X_Backup/2013-06-19_13-08-05

It will copy all the data-related files back to the server's datadir, as determined by the server's my.cnf configuration file. I would suggest checking the last line of the output for a success message, i.e.:

130619 14:17:42 innobackupex: completed OK!

Conclusion: As we can see, compact backups help save disk space, but they also slow down the prepare process. For people who care more about disk space than recovery time, compact + xbstream (archive) can be the best solution. Really, it's just a matter of need.

The post Limited disk space? Compact backups with Percona XtraBackup 2.1 appeared first on MySQL Performance Blog.
