MySQL replication primer with pt-table-checksum and pt-table-sync

MySQL replication is a process that allows you to easily maintain multiple copies of MySQL data by having them copied automatically from a master to a slave database.

It’s essential to make sure the slave servers have the same set of data as the master to ensure data is consistent within the replication stream. MySQL slave server data can drift from the master for many reasons – e.g. replication errors, accidental direct updates on slave, etc.

Here at Percona Support we highly recommend that our customers periodically run the pt-table-checksum tool to verify data consistency within replication streams. Specifically, after fixing replication errors on slave servers to ensure that the slave has identical data as its master. As you don’t want to put yourself in a situation where you need to failover to a slave server for some reason and you find different data on that slave server.

In this post, I will examine the pt-table-checksum and pt-table-sync tools usage from Percona Toolkit on different replication topologies. We often receive queries from customers about how to run these tools and I hope this post will help.

Percona Toolkit is a free collection of advanced command-line tools to perform a variety of MySQL server and system tasks that are too difficult or complex to perform manually.

One of those tools is pt-table-checksum, which works by dividing table rows into chunks of rows. The size of a chunk changes dynamically during the operation to avoid overloading the server. pt-table-checksum has many safeguards including variation into chunk size to make sure queries run in a desired amount of time.

pt-table-checksum verifies chunk size by running EXPLAIN query on each chunk. It also monitors slave server’s continuously in order to make sure replicas not falls too far behind and in this case tool pauses itself to allow slave to catch up. Along with that there are many other safeguards builtin and you can find all the details in this documentation

In my first example case, I am going to run pt-table-checksum against pair of replication servers – i.e. master having only one slave in replication topology. We will run pt-table-checksum tool on master server to verify data integrity on slave and in case If differences found by pt-table-checksum tool we will sync those changes on slave server via pt-table-sync tool.

I have created a dummy table under test database and inserted 10 records on master server as below:

mysql-master> create table dummy (id int(11) not null auto_increment primary key, name char(5)) engine=innodb;
Query OK, 0 rows affected (0.08 sec)
mysql-master> insert into dummy VALUES (1,'a'), (2,'b'), (3,'c'), (4,'d'), (5,'e'), (6,'f'), (7,'g'), (8,'h'), (9,'i'), (10,'j');
Query OK, 10 rows affected (0.00 sec)
Records: 10  Duplicates: 0  Warnings: 0
mysql-master> select * from dummy;
| id   | name |
|    1 | a    |
|    2 | b    |
|    3 | c    |
|    4 | d    |
|    5 | e    |
|    6 | f    |
|    7 | g    |
|    8 | h    |
|    9 | i    |
|   10 | j    |
10 rows in set (0.00 sec)

Then I intentionally deleted a few records from the slave server to make it inconsistent with the master for the purpose of this post.

mysql-slave> delete from dummy where id>5;
Query OK, 5 rows affected (0.03 sec)
mysql-slave> select * from dummy;
| id | name |
|  1 | a    |
|  2 | b    |
|  3 | c    |
|  4 | d    |
|  5 | e    |
5 rows in set (0.00 sec)

Now, in this case the master server has 10 records on the dummy table while the slave server has only 5 records missing records from id>5 – we will run pt-table-checksum at this point on the master server to see if the pt-table-checksum tool catches those differences.

[root@master]# pt-table-checksum --replicate=percona.checksums --ignore-databases mysql h=localhost,u=checksum_user,p=checksum_password
07-11T18:30:13      0      1       10       1       0   1.044 test.dummy

This needs to be executed on the master. The user and password you specify will be used to not only connect to the master but the slaves as well. You need the following privileges for the pt-table-checksum mysql user:

mysql-master> GRANT REPLICATION SLAVE,PROCESS,SUPER, SELECT ON *.* TO `checksum_user`@'%' IDENTIFIED BY 'checksum_password';
mysql-master> GRANT ALL PRIVILEGES ON percona.* TO `checksum_user`@'%';

Earlier, in pt-table-checksum command, I used –replicate option which writes replication queries to mentioned table percona.checksums. Next I passed  –ignore-databases option which accepts comma separated list of databases to ignore. Moreover, –create-replicate-table and —empty-replicate-table options are “Yes” by default and you can specify both options explicitly if you want to create database table different then percona.checksums.

pt-table-checksum reported 1 DIFF which is number of chunks which are different from master on one or more slaves. You can find details about tabular columns e.g. TS, ERRORS and so on on documentation of pt-table-checksum. After that, I ran next command to identify which table has difference on slave.

[root@master]# pt-table-checksum --replicate=percona.checksums --replicate-check-only --ignore-databases mysql h=localhost,u=checksum_user,p=checksum_password
Differences on slave
test.dummy 1 -5 1

In this command I used –replicate-check-only option which only reports the tables with having differences vice versa only checksum differences on detected replicas are printed. It doesn’t checksum any tables. It checks replicas for differences found by previous checksumming, and then exits.

You may also login to the slave and also execute below query to find out which tables have inconsistencies.

mysql-slave> SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
FROM percona.checksums
master_cnt <> this_cnt
OR master_crc <> this_crc
OR ISNULL(master_crc) <> ISNULL(this_crc))
GROUP BY db, tbl;

pt-table-checksum identified test.dummy table is different on slave now we are going to use pt-table-sync tool to synchronize table data between MySQL servers.

[root@slave]# pt-table-sync --print --replicate=percona.checksums --sync-to-master h=localhost,u=checksum_user,p=checksum_password
REPLACE INTO `test`.`dummy`(`id`, `name`) VALUES ('6', 'f') /*percona-toolkit src_db:test src_tbl:dummy src_dsn:P=3306,h=,p=...,u=checksum_user dst_db:test dst_tbl:dummy dst_dsn:h=localhost,p=...,u=checksum_user lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:24683 user:root host:slave*/;
REPLACE INTO `test`.`dummy`(`id`, `name`) VALUES ('7', 'g') /*percona-toolkit src_db:test src_tbl:dummy src_dsn:P=3306,h=,p=...,u=checksum_user dst_db:test dst_tbl:dummy dst_dsn:h=localhost,p=...,u=checksum_user lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:24683 user:root host:slave*/;
REPLACE INTO `test`.`dummy`(`id`, `name`) VALUES ('8', 'h') /*percona-toolkit src_db:test src_tbl:dummy src_dsn:P=3306,h=,p=...,u=checksum_user dst_db:test dst_tbl:dummy dst_dsn:h=localhost,p=...,u=checksum_user lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:24683 user:root host:slave*/;
REPLACE INTO `test`.`dummy`(`id`, `name`) VALUES ('9', 'i') /*percona-toolkit src_db:test src_tbl:dummy src_dsn:P=3306,h=,p=...,u=checksum_user dst_db:test dst_tbl:dummy dst_dsn:h=localhost,p=...,u=checksum_user lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:24683 user:root host:slave*/;
REPLACE INTO `test`.`dummy`(`id`, `name`) VALUES ('10', 'j') /*percona-toolkit src_db:test src_tbl:dummy src_dsn:P=3306,h=,p=...,u=checksum_user dst_db:test dst_tbl:dummy dst_dsn:h=localhost,p=...,u=checksum_user lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:24683 user:root host:slave*/;

I ran the pt-table-sync tool from an opposite host this time i.e. from the slave as I used the –sync-to-master option which treats DSN as slave and syncs to master. Again, pt-table-sync will use the mysql username and password you specify to connect to the slave as well as to its master. –replicate option here examines the specified table to find out the data differences and –print just prints the SQL  (REPLACE queries) not actually executes it.

You may audit the queries before executing to sync data between master/slave.  You may see it printed only missing records on the slave. Once you are happy with the results, you can substitute –print with –execute to do actual synchronization.

As a reminder, these queries always executed on the master as this is the only safe way to do the changes on slave. However, on the master it’s no-op changes as these records already exists on master but then falls to slave via replication stream to sync it with master.

If you find lots of differences on your slave server it may lag during synchronization of those changes. As I mentioned earlier, you can use –print option to go through your queries which are going to be executed to sync slave with master server. I found this post useful if you see a huge difference in the table between master/slave(s).

Note, you may use the –dry-run option initially which only analyzes print information about the sync algorithm and then exits. It shows verbose output; it doesn’t do any changes though. –dry-run parameter will basically instruct pt-table-sync to not actually do the sync, but just perform some checks.

Let me present another replication topology, where the master has two slaves where slave2 is running on non-default port 3307 while master and slave1 running on port 3306. Further, slave2 is out of sync with master and I will show you how to sync slave2 which running on port 3307 with master.

mysql-master> SELECT * FROM dummy;
| id | name |
|  1 | a    |
|  2 | b    |
|  3 | c    |
|  4 | d    |
|  5 | e    |
5 rows in set (0.00 sec)
mysql-slave1> SELECT * FROM test.dummy;
| id | name |
|  1 | a    |
|  2 | b    |
|  3 | c    |
|  4 | d    |
|  5 | e    |
5 rows in set (0.00 sec)
mysql-slave2> SELECT * FROM test.dummy;
| id | name |
|  1 | a    |
|  2 | b    |
|  3 | c    |

Let’s run pt-table-checksum tool on master database server.

[root@master]# pt-table-checksum --replicate percona.checksums --ignore-databases=mysql h=,u=checksum_user,p=checksum_password --recursion-method=dsn=D=percona,t=dsns
07-23T13:57:39      0      0        2       1       0   0.310 percona.dsns
07-23T13:57:39      0      1        5       1       0   0.036 test.dummy

I used –recursion-method parameter this time which is method to use find slaves in replication stream and it’s pretty useful when your servers run on non-standard port i.e. other than 3306. I created dsns table under percona database with following entries. You may find dsns table structure in documentation.

mysql> SELECT * FROM dsns;
| id | parent_id | dsn                                                        |
|  1 |         1 | h=,u=checksum_user,p=checksum_password,P=3306 |
|  2 |         2 | h=,u=checksum_user,p=checksum_password,P=3307 |

Next I ran below pt-table-checksum command to identify which slave server has differences on test.dummy table.

[root@master]# pt-table-checksum --replicate=percona.checksums --replicate-check-only --ignore-databases=mysql h=,u=checksum_user,p=checksum_password --recursion-method=dsn=D=percona,t=dsns
Differences on slave2
test.dummy 1 -2 1

This shows that slave2 has different data on test.dummy table as compared to the master. Now let’s run pt-table-sync tool to sync those differences and make slave2 identical as the master.

[root@slave2] ./pt-table-sync --print --replicate=percona.checksums --sync-to-master h=,u=checksum_user,p=checksum_password
REPLACE INTO `test`.`dummy`(`id`, `name`) VALUES ('4', 'd') /*percona-toolkit src_db:test src_tbl:dummy src_dsn:P=3306,h=,p=...,u=checksum dst_db:test dst_tbl:dummy dst_dsn:h=,p=...,u=checksum lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:1514 user:root host:slave2*/;
REPLACE INTO `test`.`dummy`(`id`, `name`) VALUES ('5', 'e') /*percona-toolkit src_db:test src_tbl:dummy src_dsn:P=3306,h=,p=...,u=checksum dst_db:test dst_tbl:dummy dst_dsn:h=,p=...,u=checksum lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:1514 user:root host:slave2*/;

It shows 2 rows are different on slave2. Substituting –print with –execute synchronized the differences on slave2 and re-running pt-table-checksum tool shows no more differences.

pt-table-checksum and pt-table-sync are the finest tools from Percona Toolkit to validate data between master/slave(s). With the help of these tools you can easily identify data drifts and fix them. I mentioned a couple of replication topologies above about how to check replication consistency and how to fix it in case of data drift. You may script pt-table-checksum / pt-table-sync steps and cron checksum script to periodically check the data consistency within replication stream.

This procedure is only safe for a single level master-slave(s) hierarchy. I will discuss the procedure for other topologies in future posts – i.e. I will describe more complex scenarios on how to use these tools in chain replication i.e. master -> slave1 -> slave2 pair and in Percona XtraDB Cluster setup.

The post MySQL replication primer with pt-table-checksum and pt-table-sync appeared first on MySQL Performance Blog.


Deep dive into MySQL’s innochecksum tool

Percona XtraBackupOne of our Percona Support customers recently reported that Percona XtraBackup failed with a page corruption error on an InnoDB table. The customer thought it was a problem or bug in the Percona XtraBackup tool. After investigation we found that an InnoDB page was actually corrupted and a Percona XtraBackup tool caught the error as expected and hence the backup job failed.

I thought this would be an interesting topic and worthy of a blog post. In this article I will describe the innochecksum tool, when and how to use it and what are the possible fixes if an InnoDB table suffers from page corruption.

The innochecksum tool is an offline tool that prints checksums for InnoDB files. This tool reads an InnoDB tablespace file, calculates the checksum for each page and reports mismatches, if any. A checksum mismatch is an indication of corrupt pages. Being as offline tool, innochecksum can’t be used on tablespace file that a MySQL server is currently using,  hence you need to shutdown the server prior to running the innochecksum tool. If you try to run the innochecksum tool on a running MySQL server, then there is a chance that innochecksum crashes or reports a bad checksum for a good page, resulting false positives results. There is chance when you run innochecksum on a tablespace file that is opened by mysqld, that pages are dirty and not checksummed yet by the InnoDB storage engine itself.

The point: don’t run innochecksum against a running server.

InnoDB corruption can be caused by many factors (e.g. power lost, faulty hardware, bugs).  The InnoDB storage engine validates calculated checksum while reading pages from a tablespace on disk to the stored checksum in the page. In case, InnoDB finds page checksum mismatch it will force down the MySQL server.

Let me show you a page corruption error identified by Percona XtraBackup during a backup run in which the backup failed afterward.

[01] xtrabackup: Database page corruption detected at page 25413, retrying...
[01] xtrabackup: Database page corruption detected at page 25413, retrying...
[01] xtrabackup: Database page corruption detected at page 25413, retrying...

First, we need to identify if the tablespace is really corrupted for that particular table. I do that with the help of the innochecksum utility as shown below. As I mentioned earlier, make sure to shut down MySQL before using the innochecksum tool.

$ innochecksum -p 25413 /path/to/datadir/database_name/table_name.ibd

I passed the -p (page) flag for innochecksum to only check the specific pages that were reported corrupt by Percona XtraBackup. Without passing any option to the innochecksum tool, it will check entire tablespace for corruption which will required additional server downtime. The innochecksum tool also supports the -d (debug) option to print the checksum for each page and the -v (verbose) parameter to print a progress indicator. You can find more details in the manual. If the tool reports page corruption then database table is really corrupted as below.

page 25413 invalid (fails log sequence number check)

In order to fix this issue, the first thing you should try is to mysqldump the corrupted table and If mysqldump succeeded then problem exists in secondary indexes for that tablespace. This is because the mysqldump utility doesn’t touch indexes as indexes are created after all rows are inserted.

If mysqldump succeeds then the problem is associated with indexes. I would suggest following options to fix the corruption.

— Execute OPTIMIZE TABLE on that table which rebuilds indexes. The table will be locked during the operation prior to MySQL 5.6.17. Since MySQL 5.6.17 OPTIMIZE TABLE is an online operation.
— Rebuild table with the pt-online-schema-change tool from Percona Toolkit. This will give the same result as OPTIMIZE TABLE a non-blocking way as the pt-online-schema=change tool is online schema change tool.
— Drop all secondary indexes and then recreate them. The table will be locked during that operation for writes only. Again, you can use pt-online-schema-change tool for this purpose without sacrificing read/writes ability on the table during the drop and create indexes operation.

Finally, I would suggest to re-run the innochecksum tool to verify the tables integrity again as this will make sure there is no more page corruption. In this case we found that the table was actually corrupted and fixing table corruption through the backup/reload table fixed the problem and Percona XtraBackup ran fine during the next run.

It is possible that mysqldump crashes a MySQL server for a corrupted table. Fortunately, Percona Server contains innodb_corrupt_table_action which you can enable. The configuration variable is dynamic in nature, this means enabling it doesn’t requires a MySQL server restart. Prior to Percona Server 5.6 innodb_corrupt_table_action was known as innodb_pass_corrupt_table. Once you enable this option, you can try mysqldump again. If you are using Oracle MySQL then I would suggest to try this with innodb_force_recovery in case mysqldump fails to dump the table contents.

As a side note, if your backup is successful without any errors while performing a backup with Percona Xtrabackup, this means your InnoDB tables don’t have any page checksum mismatch or corruption. Percona XtraBackup can validate page checksums and in case of errors it  logs error and exists as I mentioned above.

There is also a modified version of the innochecksum made available by Facebook’s Mark Callaghan and can be found in this bug report which provides extra stats on tablespace undo blocks. There is another tool made by Google’s Jeremy Cole known as the InnoDB Ruby Tool to examine the internals of InnoDB.


  • Innochecksum is an offline InnoDB checksum tool. This means you must stop MySQL server. Otherwise it produces “Unable to lock file” error since MySQL 5.7.
  • Old versions of innochecksum only supports files up to 2GB in size. However, since MySQL 5.6 innochecksum supports files greater than 2GB in size.
  • Percona Server variable innodb_corrupt_table_action is supported on tables existing in their tablespace (i.e. innodb_file_per_table).
  • If you are using compressed tables (ROW_FORMAT=COMPRESSED) , then you must use innochecksum from MySQL 5.7.2 or greater, as earlier versions of innochecksum don’t support compressed tables. Check this bug for details.

New Features for the innochecksum tool from MySQL 5.7:

  • As I mentioned above, since MySQL 5.7 innochecksum supports file sizes greater than 2GB.
  • Since MySQL 5.7 you can log the output with the –log option.
  • –page-type-summary option added for page type summaries.
  • MySQL 5.7 also includes another nice option –page-type-dump which dumps the details of each page to standard output (STDOUT) or standard error (STDERR).
  • Since MySQL 5.7 innochecksum can be executed on multiple user-defined system tablespace files.
  • Since MySQL 5.7 innochecksum can be executed on multiple system tablespace files.

You can read more about this is in the MySQL 5.7 manual page of innochecksum.

In this post, we identified InnoDB page corruption using the logs generated by Percona XtraBackup and fixed  them by using the mysqldump tool. But again, unfortunately, there are chances that Percona XtraBackup will not always fail in the same way when it finds corruption. So in some cases, it’s not easy to tell whether Percona XtraBackup has failed due to a bad checksum or a bug of its own. But in most cases, page corruption is the culprit if Percona XtraBackup fails to complete.

To summarize, I would say that Percona XtraBackup is a good way of verifying whether or not InnoDB pages are corrupted – and you can also verify the same thing via the mysqldump utility.

The post Deep dive into MySQL’s innochecksum tool appeared first on MySQL Performance Blog.


3 handy tools to remove problematic MySQL processes

3 handy tools to remove problematic MySQL processesDBAs often encounter situations where they need to kill queries to ensure there are no long-running queries on a MySQL server that would impact performance. Long-running queries can be the result of many factors. Fortunately, Percona Server contains some handy tools to remove problematic MySQL processes. I will highlight all of the tools via some examples in this post.

There have been some good posts on this blog about the pt-kill tool, like this one by Arunjith Aravindan titled “How a set of queries can be killed in MySQL using Percona Toolkit’s pt-kill.” Let’s dive into pt-kill a bit further with a few more examples. What does pt-kill do? It kills MySQL connections. Say you wanted to run pt-kill from a cronjob and then get an email on every killed process/query. Here is typical example for that.

$ pt-kill --interval 1 --run-time 1 --busy-time 5 --log /path/to/kill_long_running_thread.log --match-info "^(select|SELECT|Select)" --kill --print --user=xxxxxx --password=xxxxxxxxxx

Assume this is running from a cronjob, When pt-kill executes, it will kill queries longer than 5 seconds. By default, pt-kill runs forever –run-time option tells how long pt-kill to run before exiting If –interval and –busy-time parameters are used together then the explicit –interval value is used. Otherwise the default interval is 30 seconds. Note: this will only kill all read queries as per the –match-info parameter.

The above command will log all killed queries in the file referenced with the –log option. If you need to be notified via email for every killed query, the command below will do it. Off-course, you need to have the system configured to send e-mail.

tail -n 0 -F /path/to/kill_long_running_thread.log | while read LOG
echo "$LOG" | mail -s "pt-kill alert"

You can execute this shell script in the background within screen or with the nohup utility continuously to monitor the log file. It will send an email whenever any new killed query arrives to the referenced log file in the pt-kill command. Unfortunately, there is no option to notify-by-email in pt-kill at the moment, so this is sort of a workaround.

In order to log all killed queries into a database table you will need to use the –log-dsn option as per the example below.

$ pt-kill --interval 1 --busy-time 1 --create-log-table --log-dsn=h=localhost,D=percona,t=kill_log --daemonize --match-info "^(select|SELECT|Select)" --kill

All killed queries will be logged into percona.kill_log table. The –daemonize option will run this command in the background forever and will kill all SELECT queries running longer than 1 second (–busy-time 1). The –interval option instructs pt-kill to scan processes every 1 second (–interval 1).

mysql> select * from kill_log;
| kill_id | server_id | timestamp           | reason                  | kill_error | Id    | User | Host      | db   | Command | Time | State      | Info            | Time_ms |
|      17 |         1 | 2015-01-10 08:38:33 | Query matches Info spec |            | 35146 | root | localhost | NULL | Query   |    0 | User sleep | SELECT SLEEP(5) |    NULL |
|      20 |         1 | 2015-01-10 08:38:34 | Query matches Info spec |            | 35223 | root | localhost | NULL | Query   |    0 | User sleep | SELECT SLEEP(5) |    NULL |

With the help of logging killed queries into a database tables. You can easily get all the trends/and /statistics on killed queries via SQL.

By default the tool kills the oldest of the queries that would have been running for more than a given –busy-time.  If you need to kill all the threads that have been busy for more than a specified –busy-time, then this will do it:

$ pt-kill --victims=all --busy-time=60

Statement Timeout in Percona Server:
The max-statement-time feature is ported from the Twitter patches. This feature can be used to limit the query execution time by specifying the timeout value in the max_statement_time variable. When the specified number of milliseconds is reached the server aborts the statement and returns the error below to the client.

ERROR 1877 (70101): Query execution was interrupted, max_statement_time exceeded

Let me demonstrate this through another example:

mysql [localhost] {msandbox} (world) > SET max_statement_time=1;
Query OK, 0 rows affected (0.00 sec)
mysql [localhost] {msandbox} (world) > show variables like 'max_statement_time';
| Variable_name      | Value |
| max_statement_time | 1     |
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (world) > SELECT * FROM City WHERE District = 'abcd';
ERROR 1885 (70101): Query execution was interrupted, max_statement_time exceeded
mysql [localhost] {msandbox} (world) > UPDATE City SET District='abcd' WHERE ID = 2001;
ERROR 1885 (70101): Query execution was interrupted, max_statement_time exceeded
mysql [localhost] {msandbox} (world) > ALTER TABLE City ADD INDEX district_idx (district);
ERROR 1885 (70101): Query execution was interrupted, max_statement_time exceeded

As you can see from this example statement, the timeout feature works for all statements including SELECT/DML/DDL queries.

mysql [localhost] {msandbox} (world) > show status like 'Max_statement%';
| Variable_name                 | Value |
| Max_statement_time_exceeded   | 3     |
| Max_statement_time_set        | 19    |
| Max_statement_time_set_failed | 0     |
3 rows in set (0.00 sec)

The above mentioned status variables are stats for a statement timeout feature. Max_statement_time_exceeded will inform you that the total number of statements exceeded the defined timeout. Max_statement_time_set defines the number of statements for which execution time limit was set. You can find more details in this documentation. The statement timeout feature was introduced in Percona Server 5.6. You can check if your specific version of Percona Server supports this feature or not via the have_statement_timeout variable.

mysql [localhost] {msandbox} (world) > show global variables like 'have_statement_timeout';
| Variable_name          | Value |
| have_statement_timeout | YES   |
1 row in set (0.00 sec)

Bugs you should be aware of: -> This affects how the feature interacts with stored procedures. If you use stored procedures, max_statement_time might not behave as you expect. -> This is documentation bug. Percona Server timeouts might not be safe for some statements like DDL and should not be used with such queries, The documentation does not reflect this. You should be very careful if you set a global statement timeout, It affects data changing queries as well. For best results set the max_statement_time variable in a session before running queries that you want to be killed if they execute too long, instead of using a global variable. -> This affects the statement timeout feature on the query level. You must set max_statement_time in a session or globally instead however, this bug is fixed in latest version i.e. Percona Server 5.6.22-72.0

InnoDB Kill Idle Transactions:
This feature was introduced in Percona Server 5.5. It limits the age of idle XtraDB transactions and will kill idle transactions longer than a specified threshold for innodb_kill_idle_transaction. This feature is useful when autocommit is disabled on the server side and you are relying on the application to commit transactions and want to avoid long running transactions that are uncommitted. Application logic errors sometimes leaves transactions uncommitted. Let me demonstrate it quickly through one example:

mysql [localhost] {msandbox} (world) > show variables like 'autocommit';
| Variable_name | Value |
| autocommit    | OFF   |
mysql [localhost] {msandbox} (world) > show global variables like 'innodb_kill_idle_transaction';
| Variable_name                | Value |
| innodb_kill_idle_transaction | 10    |
mysql [localhost] {msandbox} (world) > START TRANSACTION; SELECT NOW(); INSERT INTO City_backup (Name,CountryCode,District,Population) VALUES ('Karachi','PK','Sindh','1000000');
Query OK, 0 rows affected (0.00 sec)
| NOW()               |
| 2015-01-31 07:11:39 |
1 row in set (0.00 sec)
Query OK, 1 row affected (0.01 sec)
mysql [localhost] {msandbox} (world) > SHOW ENGINE INNODB STATUSG
*************************** 1. row ***************************
---TRANSACTION 173076, ACTIVE 10 sec
1 lock struct(s), heap size 360, 0 row lock(s), undo log entries 1
MySQL thread id 15, OS thread handle 0x7f862e9bf700, query id 176 localhost msandbox init
TABLE LOCK table `world`.`City_backup` trx id 173076 lock mode IX
mysql [localhost] {msandbox} (world) > SELECT NOW(); SELECT * FROM City_backup;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    16
Current database: world
| NOW()               |
| 2015-01-31 07:12:06 |
1 row in set (0.01 sec)
Empty set (0.00 sec)

In this post, I shared some tools that can help you get rid of long-running transactions to help ensure that you don’t run into performance issues. This is one of the many good reasons to use Percona Server, which has some extra features on top of vanilla MySQL server.


The post 3 handy tools to remove problematic MySQL processes appeared first on MySQL Performance Blog.


MySQL 5.6 Transportable Tablespaces best practices

In MySQL 5.6 Oracle introduced a Transportable Tablespace feature (copying tablespaces to another server) and Percona Server adopted it for partial backups which means you can now take individual database or table backups and your destination server can be a vanilla MySQL server. Moreover, since Percona Server 5.6, innodb_import_table_from_xtrabackup is obsolete as Percona Server also implemented Oracle MySQL’s transportable tablespaces feature which as I mentioned gives you the ability to copy tablespace (table.ibd) between servers. Let me demonstrate this through one example where I am going to take partial backup of selective tables instead of an entire MySQL server and restore it on a running MySQL server on destination without taking it offline.

MySQL 5.6 Transportable Tablespaces with Percona XtraBackup
MySQL 5.6 Transportable Tablespaces with Percona XtraBackup
Percona XtraBackup is open source, free MySQL hot backup software that performs non-blocking backups for InnoDB and XtraDB databases. You can backup up your databases without sacrificing read/write ability and, on top of that, Percona XtraBackup supports partial backup schemas that correspond to backup-only specific databases or tables instead of taking backups of the entire database server.

For partial backups, your source server from where you taking the backup must have the innodb_file_per_table option enabled and the importing server should have innodb_file_per_table and innodb_expand_import enabled  – or innodb_import_table_from_xtrabackup (only supported for Percona Server) depends on Percona Server version for the the last option for restoring the database tables. This is all valid till Percona Server version 5.5 and you can find further details about partial backups here. Percona CTO Vadim Tkachenko wrote nice post on it about how to copy InnoDB tables between servers on Percona Server prior to version 5.6.

I am going to use Percona Server 5.6 as it uses the feature of Transportable Tablespace. There are two tables under database irfan named as “test” and “dummy”. I am going to take backup of only test table as partial backup instead taking backup of entire database server.

mysql> show tables;
| Tables_in_irfan|
| dummy          |
| test           |
mysql> show create table test;
| Table | Create Table                                                                                                             |
| test  | CREATE TABLE `test` (
  `id` int(11) DEFAULT NULL,
  `name` char(15) DEFAULT NULL
1 row in set (0.00 sec)
mysql> SELECT COUNT(*) FROM test;
| COUNT(*) |
|     1000 |
1 row in set (0.00 sec)

I am going to use latest version of Percona XtraBackup which supports multiple ways to take partial backups that are –include option, –tables-file option and –databases option. I am going to use –tables-file option to take backup of specific database table.

irfan@source$ xtrabackup --version
xtrabackup version 2.2.5 based on MySQL server 5.6.21 Linux (x86_64) (revision id: )
irfan@source$ mysql --skip-column-names -e "SELECT CONCAT(table_schema,'.',table_name) FROM information_schema.tables WHERE table_schema IN ('irfan') AND TABLE_NAME = 'test';" > /root/tables_to_backup.txt
irfan@source$ cat tables_to_backup.txt

Now as below you need to take backup of irfan.test database table. Note, how –tables-file option passed to backup only irfan.test database table which takes backup of only specified database table in tables_to_backup.txt file.

irfan@source$ innobackupex --no-timestamp --tables-file=/root/tables_to_backup.txt /root/partial_backup/ > /root/xtrabackup.log 2>&1
irfan@source$ cat /root/xtrabackup.log
>> log scanned up to (2453801809)
>> log scanned up to (2453801809)
[01]        ...done
[01] Copying ./irfan/test.ibd to /root/partial_backup/irfan/test.ibd
[01]        ...done
xtrabackup: Creating suspend file '/root/partial_backup/xtrabackup_suspended_2' with pid '14442'
141101 12:37:27  innobackupex: Starting to backup non-InnoDB tables and files
innobackupex: in subdirectories of '/var/lib/mysql'
innobackupex: Backing up file '/var/lib/mysql/irfan/test.frm'
141101 12:37:27  innobackupex: Finished backing up non-InnoDB tables and files
141101 12:37:27  innobackupex: Executing LOCK BINLOG FOR BACKUP...
141101 12:37:27  innobackupex: Executing FLUSH ENGINE LOGS...
141101 12:37:27  innobackupex: Waiting for log copying to finish
xtrabackup: Creating suspend file '/root/partial_backup/xtrabackup_log_copied' with pid '14442'
xtrabackup: Transaction log of lsn (2453801809) to (2453801809) was copied.
141101 12:37:28  innobackupex: Executing UNLOCK BINLOG
141101 12:37:28  innobackupex: Executing UNLOCK TABLES
141101 12:37:28  innobackupex: All tables unlocked
141101 12:37:28  innobackupex: completed OK!

Successful backup with show you “completed OK” at the end of the backup. If you scripted the backup for automation you can also check backup status to see whether it succeeded or failed by checking exit status of the backup script.

For the next step, we need to prepare the backup because there might be uncomitted transactions that needs to rollback or transactions in the log to be replayed to backup. You need to mention –export option specifically to prepare backup in order to create table.exp and table.cfg files (prior to MySQL/PS 5.6). You can read more on it in documentation. You can prepare the backup as below.

irfan@source$ innobackupex --apply-log --export /root/partial_backup/ > /root/xtrabackup-prepare.log 2>&1
irfan@source$ cat /root/xtrabackup-prepare.log
xtrabackup version 2.2.5 based on MySQL server 5.6.21 Linux (x86_64) (revision id: )
xtrabackup: auto-enabling --innodb-file-per-table due to the --export option
xtrabackup: cd to /root/partial_backup
xtrabackup: This target seems to be already prepared.
xtrabackup: export option is specified.
xtrabackup: export metadata of table 'irfan/test' to file `./irfan/test.exp` (1 indexes)
xtrabackup:     name=GEN_CLUST_INDEX, id.low=5043, page=3
xtrabackup: starting shutdown with innodb_fast_shutdown = 0
InnoDB: FTS optimize thread exiting.
InnoDB: Starting shutdown...
InnoDB: Shutdown completed; log sequence number 2453817454
141102 11:25:13  innobackupex: completed OK!

Again a successfully prepared backup should show you “completed OK” at end. Once the backup is prepared it means it’s usable and ready to restore. For that first create the same table structure on destination as source.

mysql [localhost] {msandbox} (irfan) > CREATE TABLE `test` (
    ->   `id` int(11) DEFAULT NULL,
    ->   `name` char(15) DEFAULT NULL
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.12 sec)
mysql [localhost] {msandbox} (irfan) > show tables;
| Tables_in_irfan |
| test           |
1 row in set (0.00 sec)
irfan@destination$ ls -la data/irfan/
total 116
drwx------ 2 root root  4096 Nov  2 16:29 .
drwx------ 6 root root  4096 Nov  2 16:23 ..
-rw-rw---- 1 root root  8586 Nov  2 16:29 test.frm
-rw-rw---- 1 root root 98304 Nov  2 16:29 test.ibd

Now discard the existing tablespace and copy the test.ibd and test.cfg which Percona XtraBackup produced after preparing the backup and import tablespace back from source to destination as illustrated below. As a side note, you may need to change the ownership of test.cfg file and test.ibd file to mysql in order to make it accessible from MySQL server.

mysql [localhost] {msandbox} (irfan) > ALTER TABLE test DISCARD TABLESPACE;
Query OK, 0 rows affected (0.10 sec)
irfan@source$ cd /root/partial_backup/irfan/
irfan@source$ scp test.cfg test.ibd root@destination:/root/sandboxes/msb_PS-5_6_21/data/irfan/
mysql [localhost] {msandbox} (irfan) > ALTER TABLE test IMPORT TABLESPACE;
Query OK, 0 rows affected (0.10 sec)
irfan@destination$ tail -30 data/msandbox.err
2014-11-02 16:32:53 2037 [Note] InnoDB: Importing tablespace for table 'irfan/test' that was exported from host 'Hostname unknown'
2014-11-02 16:32:53 2037 [Note] InnoDB: Phase I - Update all pages
2014-11-02 16:32:53 2037 [Note] InnoDB: Sync to disk
2014-11-02 16:32:53 2037 [Note] InnoDB: Sync to disk - done!
2014-11-02 16:32:53 2037 [Note] InnoDB: Phase III - Flush changes to disk
2014-11-02 16:32:53 2037 [Note] InnoDB: Phase IV - Flush complete
mysql [localhost] {msandbox} (irfan) > SELECt COUNT(*) FROM test;
| COUNT(*) |
|     1000 |

NOTE:  The .cfg file contains the InnoDB dictionary dump in special(binary) format for corresponding table.The .cfg file is not required to import a tablespace to MySQL 5.6 or Percona Server 5.6. A tablespace can be imported successfully even if it is from another server, but innodb will do schema validation if the corresponding .cfg file is present in the same directory.

Now, as you can see from the error log, that table is imported successfully on test database and changes to innodb tablespace completed correctly. My colleague Miguel Angel Nieto wrote a related post on this titled, “How to recover a single InnoDB table from a Full Backup.”

There is another method to copy a table from a running MySQL instance to another running MySQL server which is described in the MySQL manual. For completeness let me describe to you the procedure quickly.

MySQL 5.6 Transportable Tablespaces with FLUSH TABLES FOR EXPORT
On source server, I have table named “dummy” which i will copy to destination server.

mysql [localhost] {msandbox} (irfan) > show tables;
| Tables_in_test |
| dummy          |
| test           |
mysql [localhost] {msandbox} (irfan) > SELECT COUNT(*) FROM dummy;
| COUNT(*) |
|      100 |

First, create the same table structure on destination server.

mysql [localhost] {msandbox} (irfan) > CREATE TABLE `dummy` (
`id` int(11) DEFAULT NULL,
`dummy` varchar(10) DEFAULT NULL
Query OK, 0 rows affected (0.07 sec)

On the destination server, discard the existing tablespace before importing tablespace from source server.

mysql [localhost] {msandbox} (irfan) > ALTER TABLE dummy DISCARD TABLESPACE;
Query OK, 0 rows affected (0.00 sec)

Run FLUSH TABLES FOR EXPORT on the source server to ensure all table changes flushed to the disk. It will block write transactions to the named table while only allowing read-only operations. This can be a problem on high-write workload systems as a table can be blocked for a longer period of time if your table is huge in size. While making backups with Percona XtraBackup you can manage this in a non-blocking way and it will also save binary log coordinates that can be useful in replication and disaster recovery scenario.

FLUSH TABLES FOR EXPORT is only applicable to Oracle MySQL 5.6/Percona Server 5.6 FLUSH TABLES FOR EXPORT and will produce table metadata file .cfg and tablespace file .ibd. Make sure you copy both table.cfg and table.ibd files before releasing lock. It’s worth mentioning that you shouldn’t logout from the mysql server before copying table cfg and table.ibd file otherwise it will release the lock. After copying table metadata file (.cfg) and tablespace (.ibd) file release the lock on source server.

# Don't terminate session after acquiring lock otherwise lock will be released.
mysql [localhost] {msandbox} (irfan) > FLUSH TABLES dummy FOR EXPORT;
Query OK, 0 rows affected (0.00 sec)
# open another terminal Need another session to copy table files.
irfan@source$ scp dummy.cfg dummy.ibd root@destination-server:/var/lib/mysql/irfan/
# Go back to first session.
mysql [localhost] {msandbox} (irfan) > UNLOCK TABLES;

On destination server import the tablespace and verify from mysql error log as below.

mysql [localhost] {msandbox} (test) > ALTER TABLE dummy IMPORT TABLESPACE;
Query OK, 0 rows affected (0.04 sec)
$ tail -f data/msandbox.err
2014-11-04 13:12:13 2061 [Note] InnoDB: Phase I - Update all pages
2014-11-04 13:12:13 2061 [Note] InnoDB: Sync to disk
2014-11-04 13:12:13 2061 [Note] InnoDB: Sync to disk - done!
2014-11-04 13:12:13 2061 [Note] InnoDB: Phase III - Flush changes to disk
2014-11-04 13:12:13 2061 [Note] InnoDB: Phase IV - Flush complete
mysql [localhost] {msandbox} (irfan) > SELECT COUNT(*) FROM dummy;
| COUNT(*) |
|      100 |

Take into account that there are some LIMITATIONS when copying tablespace between servers which are briefly outlined in the MySQL manual

Transportable Tablespace is nice feature introduced in MySQL 5.6 and I described the use cases for that in this post. Prior to MySQL 5.6, you could backup/restore specific database tables via Percona XtraBackup’s partial backup feature (destination server should be Percona Server for this case).  Comments are welcome :)

The post MySQL 5.6 Transportable Tablespaces best practices appeared first on MySQL Performance Blog.


MySQL Replication: ‘Got fatal error 1236? causes and cures

MySQL Replication: 'Got fatal error 1236' causes and curesMySQL replication is a core process for maintaining multiple copies of data – and replication is a very important aspect in database administration. In order to synchronize data between master and slaves you need to make sure that data transfers smoothly, and to do so you need to act promptly regarding replication errors to continue data synchronization. Here on the Percona Support team, we often help customers with replication broken-related issues. In this post I’ll highlight the top most critical replication error code 1236 along with the causes and cure. MySQL replication error “Got fatal error 1236” can be triggered by multiple reasons and I will try to cover all of them.

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: ‘log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master; the first event ‘binlog.000201′ at 5480571

This is a typical error on the slave(s) server. It reflects the problem around max_allowed_packet size. max_allowed_packet refers to single SQL statement sent to the MySQL server as binary log event from master to slave. This error usually occurs when you have a different size of max_allowed_packet on the master and slave (i.e. master max_allowed_packet size is greater then slave server). When the MySQL master server tries to send a bigger packet than defined on the slave server,  the slave server then fails to accept it and hence the error. In order to alleviate this issue please make sure to have the same value for max_allowed_packet on both slave and master. You can read more about max_allowed_packet here.

This error usually occurs when updating a huge number of rows on the master and it doesn’t fit into the value of slave max_allowed_packet size because slave max_allowed_packet size is lower then the master. This usually happens with queries “LOAD DATA INFILE” or “INSERT .. SELECT” queries. As per my experience, this can also be caused by application logic that can generate a huge INSERT with junk data. Take into account, that one new variable introduced in MySQL 5.6.6 and later slave_max_allowed_packet_size which controls the maximum packet size for the replication threads. It overrides the max_allowed_packet variable on slave and it’s default value is 1 GB. In this post, “max_allowed_packet and binary log corruption in MySQL,”my colleague Miguel Angel Nieto explains this error in detail.

Got fatal error 1236 from master when reading data from binary log: ‘Could not find first log file name in binary log index file’

This error occurs when the slave server required binary log for replication no longer exists on the master database server. In one of the scenarios for this, your slave server is stopped for some reason for a few hours/days and when you resume replication on the slave it fails with above error.

When you investigate you will find that the master server is no longer requesting binary logs which the slave server needs to pull in order to synchronize data. Possible reasons for this include the master server expired binary logs via system variable expire_logs_days – or someone manually deleted binary logs from master via PURGE BINARY LOGS command or via ‘rm -f’ command or may be you have some cronjob which archives older binary logs to claim disk space, etc. So, make sure you always have the required binary logs exists on the master server and you can update your procedures to keep binary logs that the slave server requires by monitoring the “Relay_master_log_file” variable from SHOW SLAVE STATUS output. Moreover, if you have set expire_log_days in my.cnf old binlogs expire automatically and are removed. This means when MySQL opens a new binlog file, it checks the older binlogs, and purges any that are older than the value of expire_logs_days (in days). Percona Server added a feature to expire logs based on total number of files used instead of the age of the binlog files. So in that configuration, if you get a spike of traffic, it could cause binlogs to disappear sooner than you expect. For more information check Restricting the number of binlog files.

In order to resolve this problem, the only clean solution I can think of is to re-create the slave server from a master server backup or from other slave in replication topology.

– Got fatal error 1236 from master when reading data from binary log: ‘binlog truncated in the middle of event; consider out of disk space on master; the first event ‘mysql-bin.000525′ at 175770780, the last event read from ‘/data/mysql/repl/mysql-bin.000525′ at 175770780, the last byte read from ‘/data/mysql/repl/mysql-bin.000525′ at 175771648.’

Usually, this caused by sync_binlog <>1 on the master server which means binary log events may not be synchronized on the disk. There might be a committed SQL statement or row change (depending on your replication format) on the master that did not make it to the slave because the event is truncated. The solution would be to move the slave thread to the next available binary log and initialize slave thread with the first available position on binary log as below:

mysql> CHANGE MASTE R TO MASTER_LOG_FILE='mysql-bin.000526', MASTER_LOG_POS=4;

– [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: ‘Client requested master to start replication from impossible position; the first event ‘mysql-bin.010711′ at 55212580, the last event read from ‘/var/lib/mysql/log/mysql-bin.000711′ at 4, the last byte read from ‘/var/lib/mysql/log/mysql-bin.010711′ at 4.’, Error_code: 1236

I foresee master server crashed or rebooted and hence binary log events not synchronized on disk. This usually happens when sync_binlog != 1 on the master. You can investigate it as inspecting binary log contents as below:

$ mysqlbinlog --base64-output=decode-rows --verbose --verbose --start-position=55212580 mysql-bin.010711

You will find this is the last position of binary log and end of binary log file. This issue can usually be fixed by moving the slave to the next binary log. In this case it would be:


This will resume replication.

To avoid corrupted binlogs on the master, enabling sync_binlog=1 on master helps in most cases. sync_binlog=1 will synchronize the binary log to disk after every commit. sync_binlog makes MySQL perform on fsync on the binary log in addition to the fsync by InnoDB. As a reminder, it has some cost impact as it will synchronize the write-to-binary log on disk after every commit. On the other hand, sync_binlog=1 overhead can be very minimal or negligible if the disk subsystem is SSD along with battery-backed cache (BBU). You can read more about this here in the manual.

sync_binlog is a dynamic option that you can enable on the fly. Here’s how:

mysql-master> SET GLOBAL sync_binlog=1;

To make the change persistent across reboot, you can add this parameter in my.cnf.

As a side note, along with replication fixes, it is always a better option to make sure your replica is in the master and to validate data between master/slaves. Fortunately, Percona Toolkit has tools for this purpose: pt-table-checksum & pt-table-sync. Before checking for replication consistency, be sure to check the replication environment and then, later, to sync any differences.

The post MySQL Replication: ‘Got fatal error 1236′ causes and cures appeared first on MySQL Performance Blog.


MySQL upgrade best practices

MySQL upgrade best practicesMySQL upgrades are necessary tasks and we field a variety of questions here at Percona Support regarding MySQL upgrade best practices. This post highlights recommended ways to upgrade MySQL in different scenarios.

Why are MySQL upgrades needed? The reasons are many and include: Access to new features, performance benefits, bug fixes…. However, MySQL upgrades can be risky if not tested extensively beforehand with your application because the process might break it, prevent the application from functioning properly – or performance issues could arise following the upgrade. Moreover, I suggest keeping an eye on new releases of MySQL and Percona Server – check what has changed in the most recent version. Perhaps the latest release has a fix for an issue that you have been experiencing.

Upgrading one major version via SQL Dump:

Upgrading between one major version covers upgrading from Percona Server 5.1 to 5.5 or Percona Server 5.5 to 5.6 and the same implies to Oracle MySQL.

First of all, upgrading between one major version is neither straightforward nor risk-free. Initially you should read “Upgrading from Previous Series” documentation here and here. In that documentation, please place special attention to all of the sections marked “Incompatible Change” and check whether you may be affected by those changes. There might be configuration changes as well as variables renamed, a few older variables obsoleted and new variables introduced – so make sure that you adjust your my.cnf accordingly. For Percona Server specific changes please refer here and here for Percona Server 5.5 & Percona Server 5.6, respectively.

Now there are several possible approaches you may take, where one may be more feasible than the other depending on the current replication topology and total data size – and one might also be safer than another. Let me show you an upgrade procedure… an example upgrading from Percona Server 5.5 to Percona Server 5.6.

In general, there are two types of MySQL upgrades:

  • In place, where you use existing datadir against the new MySQL major version, with just running mysql_upgrade after binaries are upgraded,
  • SQL dump on an old version and then restore it on a new version (using mysqldump utility or alternatives, like mydumper).

Also in general the second type is safer, but as you may expect a much slower MySQL upgrade process.

Theoretically, the safest scenario is:

Here’s a basic procedure (you should stop application writes before starting).

1) Capture users and permissions information. This will backup all your existing user privileges.

$ wget;
$ perl pt-show-grants --user=root --ask-pass --flush > /root/grants.sql

2) Produce a logical dump of the 5.5 instance, excluding the mysql, information_schema and performance_schema databases:

$ mysql -BNe "SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME NOT IN ('mysql', 'performance_schema', 'information_schema')" | tr 'n' ' ' > /root/dbs-to-dump.sql
$ mysqldump --routines --events --single-transaction --databases $(cat /root/dbs-to-dump.sql) > /root/full-data-dump.sql

3) Stop 5.5 instance.

$ service mysql stop
$ /etc/init.d/mysql stop

4) Move old datadir (assuming /var/lib/mysql; edit accordingly to your setup):

$ mv /var/lib/mysql/ /var/lib/mysql-55

5) Install 5.6 (simply as you would do when not upgrading). If you don’t use a package manager (yum/apt-get) then is likely that you need to run mysql_install_db and mysql_upgrade.

6) Load the users back to new upgraded version of MySQL.

$ mysql -uroot < /root/grants.sql

7) Load the data back to new version of MySQL.

$ mysql -e "SET GLOBAL max_allowed_packet=1024*1024*1024";
$ mysql -uroot -p --max-allowed-packet=1G < /root/full-data-dump.sql;

At this point all tables have been re-created and loaded in MySQL 5.6 so every binary representation is native to MySQL 5.6. You’ve also completed the cleanest/most-stable upgrade path and your application can resume service – and for that reason it’s worth mentioning that this upgrade path is the same with either upgrading vanila MySQL or Percona Server. Further, you may upgrade from Oracle MySQL to Percona Server, for example,  upgrading Oracle MySQL 5.5 to Percona Server 5.6. Again, the MySQL upgrade path as described would be the same as Percona Server, which is a drop-in replacement of Oracle MySQL.

“SQL dump” is also known as a logical backup. It is safer in the sense that when restoring, all tables will be created using the format of the new MySQL binaries you’re using, which bypasses compatibility issues in general. Still for large data like data in terabytes, gigabytes… this may be a very time-consuming approach. On the other hand, by dumping/reloading such large data sets, it is possible that you will be able to recover a lot of free space on the disk as the InnoDB table spaces will be re-created from scratch, thus optimized and defragmented. If the data was often updated/deleted, the benefits may be significant.

Minor version MySQL upgrade within the same major version via In-Place Upgrade:

This implies to upgrading within the same series e.g. MySQL 5.5.35 to MySQL 5.5.38 or Percona Server 5.6.14 to latest Percona Server 5.6.20.

This is known as an in-place upgrade, where you just install a newer binary package and then run mysql_upgrade script, which checks and updates system tables if necessary. Still, with the in-place upgrade we highly recommend checking release notes for new features, bug fixes, etc. For Percona Server 5.5 and Percona Server 5.6, release notes can be found here and here respectively.

For Percona Server we have additional documents describing some details when it comes to upgrading Percona Server with a focus on Percona-specific features that can be found here and here. This also covers complete In-Place Upgrade procedure with the yum/apt package manager.

Also, to be on safe side you can do the upgrade with a logical dump using the earlier described procedure via mysqldump or mydumper program – where the former does parallel backups and restore and logical backup – and is the safest approach for the upgrade.

MySQL Upgrade directly to the latest version by skipping one major version in between:

This includes upgrading from MySQL 5.0 to MySQL 5.5 by skipping version 5.1 in between or upgrading MySQL 5.1 to MySQL 5.6 by skipping version 5.5 in between. Further, this also includes upgrading to MySQL 5.6 directly from MySQL 5.0 although there should be very few users still using MySQL version 5.0. This also implies to Percona Server.

For the topic, we would assume upgrading from Oracle MySQL or Percona Server 5.1 directly to version 5.6 by skipping one major version 5.5 in between.

Before anything, this is a serious upgrade, and a huge step over one major MySQL version. That is, it’s risky. Upgrading by using just binaries update is not supported and it’s not safe skipping major versions in between, so you should never do this from 5.0->5.5, 5.1->5.6, and surely not for 5.0->5.6. One problem is that not all changes in MySQL versions are backwards compatible. Some differences were introduced that may affect both how the data is handled, but also how the server behaves including both SQL language and MySQL server and storage engines internals. Another thing is that between MySQL 5.0 and 5.6 versions, a number of default setting variables were changed, which may result in completely different, unexpected behavior. For example since MySQL 5.5 the default storage engine is InnoDB and since MySQL 5.6 by default InnoDB will use a separate tablespace for each table and GTID replication was also introduced. But there are many more details which I won’t list here. All of those changes are described in “Upgrading from Previous Series” documentation as described above.

It’s worth mentioning that upgrading by skipping one major version is highly not recommended. Upgrading from MySQL 5.1 to 5.6 shouldn’t be done in one shot. Instead, I would suggest upgrading from version 5.1 to 5.5 and then from version 5.5 to 5.6 and running mysql_upgrade at each step. That will cope with the changes in formats as explained in the manual.

MySQL Upgrade Precautions:

MySQL upgrade precautions are an essential part of the upgrade itself. Before you upgrade make sure you have thoroughly tested all application parts with the desired version of MySQL. This is especially needed for an upgrade between major versions or if you are upgrading by skipping one major version in-between (e.g. upgrade from MySQL 5.1 to MySQL 5.6).

Make sure you read release notes carefully and that you are aware of all the changes. You can find Oracle MySQL 5.5 and 5.6 release notes as follows:

While Percona Server specific release notes can be found below for same versions as described above.

If you are planning to upgrade to Oracle MySQL 5.6 or Percona Server 5.6 I would recommend first checking for existing critical bugs. Bugs you should aware of:

Upgrade Hierarchy:

This is yet another important aspect of any MySQL upgrade. You should plan your upgrade along with an upgrade hierarchy. This is always recommend: upgrade your dev/QA servers first, then staging server’s before moving to production. In fact, you can spare upgraded instances where you have desired upgraded versions of MySQL and then test your application extensively.

Once you are happy with the MySQL upgrade on your test servers, staging servers, etc., then you can begin the MySQL upgrade on your production servers. In replication environments we highly recommend upgrading the MySQL slaves first (one by one) and then finally upgrading the MySQL master. In reality,  you can upgrade one of the slaves first and run it for few days to be on safe side – all the while closely monitoring its performance. If you don’t have a replication setup it may be worth creating a replica to test the newer version of MySQL on it first. Once you are happy with the results you can upgrade remaining the slaves and finally the master.

 How Percona software helps you in a MySQL upgrade:

In any MySQL upgrade, Percona Toolkit comes to the rescue. Percona Tookit contains a number of tools that help a great deal.

pt-upgrade is one of such tool. It allows you to test whether the new MySQL instance handles some specific queries at least as fast as old version. There may be some substantial differences as the MySQL query optimizer has changed a lot between versions 5.1 and 5.6 and also data statistics may be refreshed, hence the query plan may change. You can check further in the manual about optimizer changes.

pt-query-digest is another great tool that might help you in the upgrade. You can replay your slow query log against existing and new desired MySQL versions for before and after query performance validation.

You can also benefit from Percona Cloud Tools for MySQL which is a hosted service providing access to query performance insights for all MySQL uses. You can signup for free now because this service is in public beta. Percona Cloud Tools, among other things, allows you to visually check your queries performance after a MySQL upgrade.

It’s highly recommended to backup your data before your MySQL upgrade. Percona XtraBackup is free and open source (like all Percona software). It’s a hot backup tool which backs-up your data online without scarifying read/write ability from the database and it will backup your data with minor impact.

Last but not least, You will find this post pretty useful, too: “Upgrading MySQL.” It’s a few years old but still very relevant. And also take a look at this informative webinar, “Upgrading to MySQL 5.6: Best Practices.” Both are from Percona CEO Peter Zaitsev.

A MySQL upgrade might look like a simple task –  but actually it’s not. I’ve tried to cover most of the MySQL upgrade scenarios in this post that you will encounter. Again, I recommend to briefly test your application parts before pushing it “live,” otherwise it may break your application or part of it – or may minimize performance instead of a performance gain. Finally, I recommend having a downgrade plan in place before the MySQL upgrade just in case something goes wrong. Planning a proper downgrade procedure will minimize your app downtime when things go wrong. I’m looking forward to your comments and questions below.

The post MySQL upgrade best practices appeared first on MySQL Performance Blog.


MySQL 101: Monitor Disk I/O with pt-diskstats

MySQL 101: Monitor Disk I/O with pt-diskstatsHere on the Percona Support team we often ask customers to retrieve disk stats to monitor disk IO and to measure block devices iops and latency. There are a number of tools available to monitor IO on Linux. iostat is one of the popular tools and Percona Toolkit, which is free, contains the pt-diskstats tool for this purpose. The pt-diskstats tool is similar to iostat but it’s more interactive and contains extended information. pt-diskstats reports current disk activity and shows the statistics for the last second (which by default is 1 second) and will continue until interrupted. The pt-diskstats tool collects samples of /proc/diskstats.

In this post, I will share some examples about how to monitor and check to see if the IO subsystem is performing properly or if any disks are a limiting factor – all this by using the pt-diskstats tool.

pt-diskstats output consists on number of columns and in order to interpret pt-diskstats output we need to know what each column represents.

  • rd_s tells about number of reads per second while wr_s represents number of writes per second.
  • rd_rt and wr_rt shows average response time in milliseconds for reads & writes respectively, which is similar to iostat tool output await column but pt-diskstats shows individual response time for reads and writes at disk level. Just a note, modern iostat splits read and write latency out, but most distros don’t have the latest iostat in their systat (or equivalent) package.
  • rd_mrg and wr_mrg are other two important columns in pt-diskstats output.  *_mrg is telling us how many of the original operations the IO elevator (disk scheduler) was able to merge to reduce IOPS, so *_mrg is telling us a quite important thing by letting us know that the IO scheduler was able to consolidate many or few operations. If rd_mrg/wr_mrg is high% then the IO workload is sequential on the other hand, If rd_mrg/wr_mrg is a low% then IO workload is all random. Binary logs, redo logs (aka ib_logfile*), undo log and doublewrite buffer all need sequential writes.
  • qtime and stime are last two columns in pt-diskstats output where qtime reflects to time spent in disk scheduler queue i.e. average queue time before sending it to physical device and on the other hand stime is average service time which is time accumulated to process the physical device request. Note, that qtime is not discriminated between reads and writes and you can check if response time is higher for qtime than it signal towards disk scheduler. Also note that service time (stime field and svctm field in in pt-diskstats & iostat output respectively) is not reliable on Linux. If you read the iostat manual you will see it is deprecated.

Along with that, there are many other parameters for pt-diskstats – you can found full documentation here. Below is an example of pt-disktats in action. I used the  –devices-regex option which prints only device information that matches this Perl regex.

$ pt-diskstats --devices-regex=sd --interval 5
#ts device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt busy in_prg io_s qtime stime
1.1 sda   21.6   22.8    0.5      45%    1.2  29.4 275.5   4.0    1.1      0%   40.0  145.1   65%   158 297.1 155.0   2.1
1.1 sdb   15.0   21.0    0.3      33%    0.1   5.2   0.0   0.0    0.0      0%    0.0    0.0   11%     1  15.0   0.5   4.7
1.1 sdc    5.6   10.0    0.1       0%    0.0   5.2   1.9   6.0    0.0     33%    0.0    2.0    3%     0   7.5   0.4   3.6
1.1 sdd    0.0    0.0    0.0       0%    0.0   0.0   0.0   0.0    0.0      0%    0.0    0.0    0%     0   0.0   0.0   0.0
5.0 sda   17.0   14.8    0.2      64%    3.1  66.7 404.9   4.6    1.8     14%  140.9  298.5  100%   111 421.9 277.6   1.9
5.0 sdb   14.0   19.9    0.3      48%    0.1   5.5   0.4 174.0    0.1     98%    0.0    0.0   11%     0  14.4   0.9   2.4
5.0 sdc    3.6   27.1    0.1      61%    0.0   3.5   2.8   5.7    0.0     30%    0.0    2.0    3%     0   6.4   0.7   2.4
5.0 sdd    0.0    0.0    0.0       0%    0.0   0.0   0.0   0.0    0.0      0%    0.0    0.0    0%     0   0.0   0.0   0.0

These are the stats from 7200 RPM SATA disks. As you can see, the write-response time is very high and most of that is made up of IO queue time. This shows the problem exactly. The problem is that the IO subsystem is not able to handle the write workload because the amount of writes that are being performed are way beyond what it can handle. It means the disks cannot service every request concurrently. The workload would actually depend a lot on where the hot data is stored and as we can see in this particular case the workload only hits a single disk out of the 4 disks. A single 7.2K RPM disk can only do about 100 random writes per second which is not a lot considering heavy workload.

It’s not particularly a hardware issue but a hardware capacity issue. The kind of workload that is present and the amount of writes that are performed per second are not something that the IO subsystem is able to handle in an efficient manner. Mostly writes are generated on this server as can be seen by the disk stats.

Let me show you a second example. Here you can see read latency. rd_rt is consistently between 10ms-30ms. It depends on how fast the disks are spinning and the number of disks. To deal with it possible solutions would be to optimize queries to avoid table scans, use memcached where possible, use SSD’s as it can provide good I/O performance with high concurrency. You will find this post useful on SSD’s from our CEO, Peter Zaitsev.

#ts device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt busy in_prg io_s qtime stime
1.0 sdb    33.0  29.1      0.9     0%    1.1  34.7   7.0   10.3    0.1     61%    0.0   0.4   99%   1    40.0  2.2  19.5
1.0 sdb1    0.0   0.0      0.0     0%    0.0   0.0   7.0   10.3    0.1     61%    0.0   0.4    1%   0     7.0  0.0   0.4
1.0 sdb2   33.0  29.1      0.9     0%    1.1  34.7   0.0    0.0    0.0      0%    0.0   0.0   99%   1    33.0  3.5  30.2
1.0 sdb    81.9  28.5      2.3     0%    1.1  14.0   0.0    0.0    0.0      0%    0.0   0.0   99%   1    81.9  2.0  12.0
1.0 sdb1    0.0   0.0      0.0     0%    0.0   0.0   0.0    0.0    0.0      0%    0.0   0.0    0%   0     0.0  0.0   0.0
1.0 sdb2   81.9  28.5      2.3     0%    1.1  14.0   0.0    0.0    0.0      0%    0.0   0.0   99%   1    81.9  2.0  12.0
1.0 sdb    50.0  25.7      1.3     0%    1.3  25.1  13.0   11.7    0.1     66%    0.0   0.7   99%   1    63.0  3.4  11.3
1.0 sdb1   25.0  21.3      0.5     0%    0.6  25.2  13.0   11.7    0.1     66%    0.0   0.7   46%   1    38.0  3.2   7.3
1.0 sdb2   25.0  30.1      0.7     0%    0.6  25.0   0.0    0.0    0.0      0%    0.0   0.0   56%   0    25.0  3.6  22.2

From the below diskstats output it seems that IO is saturated between both reads and writes. This can be noticed with high value for columns rd_s and wr_s. In this particular case, consider having disks in either RAID 5 (better for read only workload) or RAID 10 array is good option along with battery-backed write cache (BBWC) as single disk can really be bad for performance when you are IO bound.

device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt wr_s  wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt  busy in_prg io_s  qtime stime
sdb1  362.0  27.4    9.7     0%    2.7    7.5   525.2  20.2    10.3    35%    6.4    8.0   100%   0    887.2  7.0   0.9
sdb1  439.9  26.5   11.4     0%    3.4    7.7   545.7  20.8    11.1    34%    9.8   11.9   100%   0    985.6  9.6   0.8
sdb1  576.6  26.5   14.9     0%    4.5    7.8   400.2  19.9     7.8    34%    6.7   10.9   100%   0    976.8  8.6   0.8
sdb1  410.8  24.2    9.7     0%    2.9    7.1   403.1  18.3     7.2    34%   10.8   17.7   100%   0    813.9 12.5   1.0
sdb1  378.4  24.6    9.1     0%    2.7    7.3   506.1  16.5     8.2    33%    5.7    7.6   100%   0    884.4  6.6   0.9
sdb1  572.8  26.1   14.6     0%    4.8    8.4   422.6  17.2     7.1    30%    1.7    2.8   100%   0    995.4  4.7   0.8
sdb1  429.2  23.0    9.6     0%    3.2    7.4   511.9  14.5     7.2    31%    1.2    1.7   100%   0    941.2  3.6   0.9

The following example reflects write heavy activity but write-response time is very good, under 1ms, which shows disks are healthy and capable of handling high number of IOPS.

#ts device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt busy in_prg io_s qtime stime
1.0  dm-0  530.8 16.0    8.3     0%    0.3    0.5  6124.0  5.1    30.7    0%     1.7    0.3  86%    2   6654.8  0.2  0.1
2.0  dm-0  633.1 16.1    10.0    0%    0.3    0.5  6173.0  6.1    36.6    0%     1.7    0.3  88%    1   6806.1  0.2  0.1
3.0  dm-0  731.8 16.0    11.5    0%    0.4    0.5  6064.2  5.8    34.1    0%     1.9    0.3  90%    2   6795.9  0.2  0.1
4.0  dm-0  711.1 16.0    11.1    0%    0.3    0.5  6448.5  5.4    34.3    0%     1.8    0.3  92%    2   7159.6  0.2  0.1
5.0  dm-0  700.1 16.0    10.9    0%    0.4    0.5  5689.4  5.8    32.2    0%     1.9    0.3  88%    0   6389.5  0.2  0.1
6.0  dm-0  774.1 16.0    12.1    0%    0.3    0.4  6409.5  5.5    34.2    0%     1.7    0.3  86%    0   7183.5  0.2  0.1
7.0  dm-0  849.6 16.0    13.3    0%    0.4    0.5  6151.2  5.4    32.3    0%     1.9    0.3  88%    3   7000.8  0.2  0.1
8.0  dm-0  664.2 16.0    10.4    0%    0.3    0.5  6349.2  5.7    35.1    0%     2.0    0.3  90%    2   7013.4  0.2  0.1
9.0  dm-0  951.0 16.0    14.9    0%    0.4    0.4  5807.0  5.3    29.9    0%     1.8    0.3  90%    3   6758.0  0.2  0.1
10.0 dm-0  742.0 16.0    11.6    0%    0.3    0.5  6461.1  5.1    32.2    0%     1.7    0.3  87%    1   7203.2  0.2  0.1

Let me show you a final example. I used –interval and –iterations parameters for pt-diskstats which tells us to wait for a number of seconds before printing the next disk stats and to limit the number of samples respectively. If you notice, you will see in 3rd iteration high latency (rd_rt, wr_rt) mostly for reads. Also, you can notice a high value for queue time (qtime) and service time (stime) where qtime is related to disk IO scheduler settings. For MySQL database servers we usually recommends noop/deadline instead of default cfq.

$ pt-diskstats --interval=20 --iterations=3
#ts device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt busy in_prg io_s qtime stime
10.4 hda   11.7  4.0     0.0     0%     0.0    1.1  40.7   11.7    0.5    26%    0.1    2.1  10%    0    52.5  0.4   1.5
10.4 hda2   0.0  0.0     0.0     0%     0.0    0.0   0.4    7.0    0.0    43%    0.0    0.1   0%    0     0.4  0.0   0.1
10.4 hda3   0.0  0.0     0.0     0%     0.0    0.0   0.4  107.0    0.0    96%    0.0    0.2   0%    0     0.4  0.0   0.2
10.4 hda5   0.0  0.0     0.0     0%     0.0    0.0   0.7   20.0    0.0    80%    0.0    0.3   0%    0     0.7  0.1   0.2
10.4 hda6   0.0  0.0     0.0     0%     0.0    0.0   0.1    4.0    0.0     0%    0.0    4.0   0%    0     0.1  0.0   4.0
10.4 hda9  11.7  4.0     0.0     0%     0.0    1.1  39.2   10.7    0.4     3%    0.1    2.7   9%    0    50.9  0.5   1.8
10.4 drbd1 11.7  4.0     0.0     0%     0.0    1.1  39.1   10.7    0.4     0%    0.1    2.8   9%    0    50.8  0.5   1.7
20.0 hda   14.6  4.0     0.1     0%     0.0    1.4  39.5   12.3    0.5    26%    0.3    6.4  18%    0    54.1  2.6   2.7
20.0 hda2   0.0  0.0     0.0     0%     0.0    0.0   0.4    9.1    0.0    56%    0.0   42.0   3%    0     0.4  0.0  42.0
20.0 hda3   0.0  0.0     0.0     0%     0.0    0.0   1.5   22.3    0.0    82%    0.0    1.5   0%    0     1.5  1.2   0.3
20.0 hda5   0.0  0.0     0.0     0%     0.0    0.0   1.1   18.9    0.0    79%    0.1   21.4  11%    0     1.1  0.1  21.3
20.0 hda6   0.0  0.0     0.0     0%     0.0    0.0   0.8   10.4    0.0    62%    0.0    1.5   0%    0     0.8  1.3   0.2
20.0 hda9  14.6  4.0     0.1     0%     0.0    1.4  35.8   11.7    0.4     3%    0.2    4.9  18%    0    50.4  0.5   3.5
20.0 drbd1 14.6  4.0     0.1     0%     0.0    1.4  36.4   11.6    0.4     0%    0.2    5.1  17%    0    51.0  0.5   3.4
20.0 hda    0.9  4.0     0.0     0%     0.2   251.9  28.8  61.8    1.7    92%    4.5   13.1  31%    2    29.6 12.8   0.9
20.0 hda2   0.0  0.0     0.0     0%     0.0     0.0   0.6   8.3    0.0    52%    0.1   98.2   6%    0     0.6 48.9  49.3
20.0 hda3   0.0  0.0     0.0     0%     0.0     0.0   2.0  23.2    0.0    83%    0.0    1.4   0%    0     2.0  1.2   0.3
20.0 hda5   0.0  0.0     0.0     0%     0.0     0.0   4.9 249.4    1.2    98%    4.0   13.2   9%    0     4.9 12.9   0.3
20.0 hda6   0.0  0.0     0.0     0%     0.0     0.0   0.0   0.0    0.0     0%    0.0    0.0   0%    0     0.0  0.0   0.0
20.0 hda9   0.9  4.0     0.0     0%     0.2   251.9  21.3  24.2    0.5    32%    0.4   12.9  31%    2    22.2 10.2   9.7
20.0 drbd1  0.9  4.0     0.0     0%     0.2   251.9  30.6  17.0    0.5     0%    0.7   24.1  30%    5    31.4 21.0   9.5

You can see the busy column in pt-diskstats output which is the same as the util column in iostat – which points to utilization. Actually, pt-diskstats is quite similar to the iostat tool but pt-diskstats is more interactive and has more information. The busy percentage is only telling us for how long the IO subsystem was busy, but is not indicating capacity. So the only time you care about %busy is when it’s 100% and at the same time latency (await in iostat and rd_rt/wr_rt in diskstats output) increases over -say- 5ms. You can estimate capacity of your IO subsystem and then look at the IOPS being consumed (r/s + w/s columns). Also, the system can process more than one request in parallel (in case of RAID) so %busy can go beyond 100% in pt-diskstats output.

If you need to check disk throughput, block device IOPS run the following to capture metrics from your IO subsystem and see if utilization matches other worrisome symptoms. I would suggest capturing disk stats during peak load. Output can be grouped by sample or by disk using the –group-by option. You can use the sysbench benchmark tool for this purpose to measure database server performance. You will find this link useful for sysbench tool details.

$ pt-diskstats --group-by=all --iterations=7200 > /tmp/pt-diskstats.out;


pt-diskstats is one of the finest tools from Percona Toolkit. By using this tool you can easily spot disk bottlenecks, measure the IO subsystem and identify how much IOPS your drive can handle (i.e. disk capacity).

The post MySQL 101: Monitor Disk I/O with pt-diskstats appeared first on MySQL Performance Blog.


Tools and tips for analysis of MySQL’s Slow Query Log

MySQL's Slow Query LogMySQL has a nice feature, slow query log, which allows you to log all queries that exceed a predefined about of time to execute. Peter Zaitsev first wrote about this back in 2006 – there have been a few other posts here on the MySQL Performance Blog since then (check this and this, too) but I wanted to revisit his original subject in today’s post.

Query optimization is essential for good database server performance and usually DBAs need to ensure the top performance possible for all queries. In MySQL, the desirable way is to generate a query log for all running queries within a specific time period and then run a query analysis tool to identify the bad queries. Percona Toolkit’s pt-query-digest is one of the most powerful tools for SQL analysis. That’s because pt-query-digest can generate a very comprehensive report that spots problematic queries very efficiently. It works equally well with Oracle MySQL server. This post will focus mainly on pt-query-digest.

Slow query log is great at spotting really slow queries that are good candidates for optimization. Beginning with MySQL 5.1.21, the minimum value is 0 for long_query_time, and the value can be specified to a resolution of microseconds. In Percona Server additional statistics may be output to the slow query log. You can find the full details here. For our clients, we often need to identify queries that impact an application the most. It does not always have to be the slowest queries – queries that runs more frequently with lower execution time per call put more load on a server than queries running with lower frequency. We of course want to get rid of really slow queries but to really optimize application throughput, we also need to investigate queries that generate most of the load. Further, if you enable option log_queries_not_using_indexes  then MySQL will log queries doing full table scans which doesn’t always reflect that the query is slow, because in some situations the query optimizer chooses full table scan rather than using any available index or probably showing all records from a small table.

Our usual recommendation is to generate the slow log with long_query_time=0. This will record all the traffic but this will be I/O intensive and will eat up disk space very quickly depending on your workload. So beware of running with long_query_time=0 for only a specific period of time and revert it back to logging only very slow queries. In Percona Server there is nice option where you can limit the rate of logging, log_slow_rate_limit is the option to handle it. Filtering slow query log is very helpful too in some cases e.g. if we know the main performance issue is table scans we can log queries only doing full table scans or if we see I/O is bottleneck we can collect queries doing full scans and queries creating on disk temporary tables. Again, this is only possible in Percona Server with the log_slow_filter option. Also, you may want to collect everything on slow query log and then filter with pt-query-digest. Depending on I/O capacity, you might prefer one or another way, as collecting everything in slow query log allows us to investigate other queries too if needed. Finally, use pt-query-digest to generate an aggregate report over slow query log which highlights the problematic part very efficiently. Again, pt-query-digest can bring up server load high so our usual recommendation on it is to move slow query log to some staging/dev server and run pt-query-digest over there to generate the report.

Note: changing the long_query_time parameter value only affects newly created connections to log queries exceeds long_query_time threshold. In Percona Server there is feature which changes variable scope to global instead of local. Enabling slow_query_log_use_global_control  log queries for connected sessions too after changing long_query_time parameter threshold. You can read more about this patch here.

I am not going to show you a detailed report of pt-query-digest and explain each part of it here, because it is well defined already by my colleague Ovais Tariq in this post. However, I will show you some of the other aspects of pt-query-digest tool here.

Let me show you code snippets that enable slow query log for only a specific time period with long_query_time=0 and log_slow_verbosity to ‘full’. log_slow_verbosity is a Percona Server variable which logs extra stats such as information on query cache, Filesort, temporary tables, InnoDB statistics etc. Once you are done collecting logs, revert back the values for long_query_time to the previous value, and finally run pt-query-digest on the log to generate report. Note: run the below code in same MySQL session.

-- Save previous settings
mysql> SELECT @@global.log_slow_verbosity INTO @__log_slow_verbosity;
mysql> SELECT @@global.long_query_time INTO @__long_query_time;
mysql> SELECT @@global.slow_query_log INTO @__slow_query_log;
mysql> SELECT @@global.log_slow_slave_statements INTO @__log_slow_slave_statements;
-- Keep this in safe place, we'll need to run pt-query-digest
mysql> SELECT NOW() AS "Time Since";
-- Set values to enable query collection
mysql> SET GLOBAL slow_query_log_use_global_control='log_slow_verbosity,long_query_time';
mysql> SET GLOBAL log_slow_verbosity='full';
mysql> SET GLOBAL slow_query_log=1;
mysql> SET GLOBAL long_query_time=0;
mysql> SET GLOBAL log_slow_slave_statements=1;
-- Verify settings are OK
mysql> SELECT @@global.long_query_time, @@global.slow_query_log, @@global.log_slow_verbosity;
-- wait for 30 - 60 minutes
-- Keep this one too, also for pt-query-digest
mysql> SELECT NOW() AS "Time Until";
-- Revert to previous values
mysql> SET GLOBAL slow_query_log=@__slow_query_log;
mysql> SET GLOBAL long_query_time=@__long_query_time;
mysql> SET GLOBAL log_slow_verbosity=@__log_slow_verbosity; -- if percona server
mysql> SET GLOBAL log_slow_slave_statements=@__log_slow_slave_statements;
-- Verify settings are back to previous values
mysql> SELECT @@global.long_query_time, @@global.slow_query_log, @@global.log_slow_verbosity, @@global.slow_query_log_file;
-- Then with pt-query-digest run like (replace values for time-since, time-until and log name)
$ pt-query-digest --since='<time-since>' --until='<time-until>' --limit=100% /path/to/slow_query_log_file.log > /path/to/report.out
-- If you're not using Percona Server then you need to remove all references to log_slow_verbosity, slow_query_log_use_global_control and log_slow_slave_statements (priot MySQL 5.6).

My colleague Bill Karwin wrote bash script that does almost the same as the above code. You can find the script to collect slow logs here. This script doesn’t hold connection to the database session while you wait for logs to accumulate and it sets all the variables back to the state they were before. For full documentation view this.

Further, you can also get explain output into the report from the pt-query-digest tool. For that you need to use –explain parameter similar to as follows.

$ pt-query-digest --explain u=<user>,p=<password>,h=<hostname> /path/to/slow.log > /path/to/report.out

Explain output in query report will get you all the information for query execution plan and explain output signal towards how that particular query going to be executed. Note that, if you execute pt-query-digest over slow query log other than originated server of slow query log as I mentioned above e.g. staging/dev you may get different execution path for the query in the report or lower number of rows to examined, etc., because usually staging/dev servers has different data distribution, different MySQL versions, or different indexes. MySQL explain adds overhead as queries needs to be prepared on the server to generate intended query execution path. For this reason, you may want to run pt-query-digest with –explain on a production replica.

It’s worth mentioning that logging queries with log_slow_verbosity in Percona Server is really handy as it shows lots of additional statistics and it is more helpful in situations when the explain plan reports a different execution path than when the query is executed. On that particular topic, you may want to check this nice post.

pt-query-digest also supports filters. You can read more about it here. Let me show you an example. The following command will discard everything apart from insert/update/delete queries in pt-query-digest output report.

$ pt-query-digest --filter '$event->{arg} =~ m/^(insert|update|delete)/i' --since='<time-since>' --until='<time-until>' --limit=100% /path/to/slow_query_log_file.log > /path/to/report.out

If you’re looking for some GUI tools for pt-query-digest then I would recommend reading this nice blogpost from my colleague Roman. Further, our CEO Peter Zaitsev also wrote a post recently where he shows the comparison between performance_schema and slow query log. Check here for details.

In related new, Percona recently announced Percona Cloud Tools, the next generation of tools for MySQL. It runs a client-side agent (pt-agent) which runs pt-query-digest on the server with some intervals and uploads the aggregated data to the Percona Cloud Tools API which process it further.  Query Analytics is one tool from the Percona Cloud Tools that provides advanced query metrics. It  is a nice visualization tool. You may be interested to learn more about it here, and it’s also worth viewing this related webinar about Percona Cloud Tools from our CTO Vadim Tkachenko.

pt-query-digest from Percona Toolkit is a versatile (and free) tool for slow query log analysis. It provides good insight about every individual query, especially in Percona Server with log_slow_verbosity enabled, e.g. log queries with microsecond precision, log information about the query’s execution plan. On top of that, Percona Cloud Tools includes Query Analytics which provides you with good visuals about query performance and also provides a view of historical data.

The post Tools and tips for analysis of MySQL’s Slow Query Log appeared first on MySQL Performance Blog.


[ERROR] mysqld: Sort aborted: Server shutdown in progress

Recently, one of our support customers faced this: “[ERROR] mysqld: Sort aborted: Server shutdown in progress.” At first it would appear this occurred because of a mysql restart (i.e. the MySQL server restarted and the query got killed during the stopping of mysql). However, while debugging this problem I found no evidence of a MySQL server restart – which proves that what’s “apparent” is not always the case, so this error message was a bit misleading. Check this bug report for further details (it was reported back in 2006).

I found that there are two possible reasons for this error: Either the MySQL server restarts during execution of the query, or the query got killed forcefully during execution which utilizes the “Using filesort” algorithm.

So, let’s try to reproduce the scenario for this particular error. I opened two mysql sessions, mysql1 and mysql2.

mysql1> EXPLAIN SELECT * FROM dummy WHERE user_id <> 245424 GROUP BY title ORDER BY group_id;
| user_id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows   | Extra                                        |
|  1 | SIMPLE      | dummy | range | PRIMARY       | PRIMARY | 4       | NULL | 580096 | Using where; Using temporary; Using filesort |

Then, I executed a query in session1 (i.e. mysql1) and while the query is running I killed it by logging into other session, mysql2.

mysql2> show full processlist\G
*************************** 1. row ***************************
Id: 2
User: root
Host: localhost
db: test
Command: Query
Time: 1
State: Sorting result
Info: SELECT * FROM dummy WHERE id <> 245424 GROUP BY title ORDER BY group_id
Rows_sent: 0
Rows_examined: 0
mysql2> kill 2;
Query OK, 0 rows affected (0.00 sec)
mysql2> show full processlist\G
*************************** 1. row ***************************
User_id: 2
User: root
Host: localhost
db: test
Command: Killed
Time: 8
State: Creating sort index
Info: SELECT * FROM dummy WHERE user_id <> 245424 GROUP BY title ORDER BY group_id
Rows_sent: 0
Rows_examined: 0
mysql1> SELECT * FROM dummy WHERE user_id <> 245424 GROUP BY title ORDER BY group_id;
ERROR 2013 (HY000): Lost connection to MySQL server during query

In the error log I found…

2013-12-16 00:05:42 3746 [ERROR] /usr/local/mysql/bin/mysqld: Sort aborted: Server shutdown in progress.

This looks confusing because of the error message, “Server shutdown in progress”. Then I restarted the MySQL server during execution of this query and the same error logged in error log which seems correct as mysql is shutdown during query execution.

2013-12-16 00:05:46 3746 [ERROR] /usr/local/mysql/bin/mysqld: Sort aborted: Server shutdown in progress

So, basically when the MySQL server is shutting down it kills all connected threads so “server shutdown in progress” error message is correct in this context. That means this error is not only recorded when the MySQL server restarts during query execution (which uses ORDER BY on non-index column’s) but also when that particular query is explicitly killed during it’s execution.

The MySQL error “Server shutdown in progress” is confusing. It pops up when you kill a query explicitly and it then appears like the MySQL server is restarted – which is not the case in that scenario. This bug is quite old and was finally fixed in MySQL 5-5-35 and 5-6-15.

The post [ERROR] mysqld: Sort aborted: Server shutdown in progress appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by