Apr
22
2013
--

Percona XtraBackup 2.1.0 for MySQL beta now available

Percona is glad to announce the release of Percona XtraBackup 2.1.0 for MySQL beta on April 22, 2013. Downloads are available from our download site here. For this BETA release, we will not be making APT and YUM repositories available, just base deb and RPM packages.

This is a BETA quality release and is not intended for production. If you want a high-quality, generally available release, use the current stable version (2.0.6 in the 2.0 series at the time of writing).

This release contains all of the features and bug fixes in Percona XtraBackup for MySQL 2.0.6, plus the following:

New Features:

  • Percona XtraBackup has implemented basic support for MySQL 5.6, Percona Server 5.6 and MariaDB 10.0. Basic support means that these versions are recognized by XtraBackup, and that backup/restore works as long as no 5.6-specific features are used (such as GTID, remote/transportable tablespaces, separate undo tablespace, or 5.6-style buffer pool dump files).
  • Percona XtraBackup can now use the XtraDB changed page tracking feature to perform incremental backups (a sketch follows this list).
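
A minimal invocation sketch (the backup directory and timestamped base directory are hypothetical; consult the documentation for the exact options in your version):

./innobackupex /data/backups
./innobackupex --incremental --incremental-basedir=/data/backups/2013-04-22_10-00-00 /data/backups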

Bugs Fixed:

  • Fixed a couple of warnings found in innobackupex when all warnings were made FATAL. Bug fixed #1116177.
  • innobackupex uses SHOW MASTER STATUS to obtain the binlog file and position. This could trigger a bug if the server being backed up was a standalone server (neither master nor slave in replication) and binlog information wasn’t available. Fixed by not creating the xtrabackup_binlog_info file when the binlog isn’t available. Bug fixed #1168513.
  • Fixed a typo in the innobackupex error output. Bug fixed #1157225.
  • Redundant code has been removed from xtrabackup.cc. Bug fixed #1162765.

Other bugs fixed: bug fixed #1158154 and bug fixed #1166713.

Release notes with all the bugfixes for Percona XtraBackup for MySQL 2.1.0-beta1 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

The post Percona XtraBackup 2.1.0 for MySQL beta now available appeared first on MySQL Performance Blog.

Apr
22
2013
--

I need your help!

Hi!

Would you please spare me a few minutes and help me a little bit?

As you might know—or maybe not—I’m making my living as an independent trainer and consultant. Up till now I’ve only delivered on-site training at the clients’ sites, but I thought it makes sense to offer open trainings as well so that single participants can also join. For that I’d need to know how many people would like to join such a training, where they are physically located, and which database they are using. So, I’ve set up a short survey:

http://winand.at/services/sql-performance-training/survey

It’s just one form and won’t take much time. Naturally, you are under no obligation if you fill out the form, and as a way of saying thank you for your time, I’ll be drawing three participants to receive a DON’T PANIC towel—the perfect preparation for the upcoming Towel Day on 25th May. So take a minute to complete the survey now because the deadline is 26th of April!

Thanks,

-markus

Original title and author: “I need your help!” by Markus Winand.

Apr
22
2013
--

How to recover table structure from InnoDB dictionary

To recover a dropped or corrupt table with the Percona Data Recovery Tool for InnoDB you need two things: media with the records (ibdata1, *.ibd, a disk image, etc.) and the table structure. Indeed, there is no information about the table structure in an InnoDB page. Normally we either recover the structure from .frm files or take it from some old backup.

A new tool sys_parser can recover the table structure from InnoDB dictionary.

Why do we need a new tool anyway? It is absolutely critical to have an accurate table definition to ensure a successful recovery. Even an inconspicuous difference like NULL vs. NOT NULL can shift all values by a byte and thus spoil the result. That’s why I prefer the structure from .frm files over one taken from a backup. But in some cases even .frm files are not an option:

  • The table was dropped and innodb_file_per_table is ON
  • The .frm file is corrupt, zeroed out or lost, or SHOW CREATE TABLE crashes MySQL

There is yet another source of information about the table structure – the InnoDB dictionary. Let’s review the dictionary tables and see what they can give us. We will need four of them:

  • SYS_TABLES
  • SYS_INDEXES
  • SYS_COLUMNS
  • SYS_FIELDS

SYS_TABLES
Here InnoDB keeps the correspondence between human-readable table names and their internal identifiers.

CREATE TABLE `SYS_TABLES` (
  `NAME` varchar(255) NOT NULL DEFAULT '',
  `ID` bigint(20) unsigned NOT NULL DEFAULT '0',
  `N_COLS` int(10) DEFAULT NULL,
  `TYPE` int(10) unsigned DEFAULT NULL,
  `MIX_ID` bigint(20) unsigned DEFAULT NULL,
  `MIX_LEN` int(10) unsigned DEFAULT NULL,
  `CLUSTER_NAME` varchar(255) DEFAULT NULL,
  `SPACE` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

NAME is a human-readable table name in the form database_name/table_name, e.g. sakila/actor. ID is a table identifier. We will need the table id to find the indexes of the table.

mysql> select * from SYS_TABLES WHERE NAME='sakila/actor';
+--------------+-----+--------+------+--------+---------+--------------+-------+
| NAME         | ID  | N_COLS | TYPE | MIX_ID | MIX_LEN | CLUSTER_NAME | SPACE |
+--------------+-----+--------+------+--------+---------+--------------+-------+
| sakila/actor | 741 |      4 |    1 |      0 |       0 |              |   738 |
+--------------+-----+--------+------+--------+---------+--------------+-------+

SYS_INDEXES
This table lists all indexes a table has, the secondary ones as well as the primary.

CREATE TABLE `SYS_INDEXES` (
  `TABLE_ID` bigint(20) unsigned NOT NULL DEFAULT '0',
  `ID` bigint(20) unsigned NOT NULL DEFAULT '0',
  `NAME` varchar(120) DEFAULT NULL,
  `N_FIELDS` int(10) unsigned DEFAULT NULL,
  `TYPE` int(10) unsigned DEFAULT NULL,
  `SPACE` int(10) unsigned DEFAULT NULL,
  `PAGE_NO` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`TABLE_ID`,`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

So, TABLE_ID is our table id. ID here is the index identifier. We need it to find the InnoDB pages that belong to the table’s index. Which one? A table can have many secondary indexes, but only the primary index contains all fields. It must exist for any InnoDB table. If explicitly defined, its NAME is PRIMARY. If the primary key is not defined, InnoDB will use a unique secondary index as the primary. If there is no unique index either, InnoDB will create one implicitly; its name will be GEN_CLUST_INDEX.
No matter how the primary index gets created, it will have the minimal ID among the indexes of the table.

mysql> select * from SYS_INDEXES WHERE TABLE_ID=741;
+----------+------+---------------------+----------+------+-------+---------+
| TABLE_ID | ID   | NAME                | N_FIELDS | TYPE | SPACE | PAGE_NO |
+----------+------+---------------------+----------+------+-------+---------+
|      741 | 1679 | PRIMARY             |        1 |    3 |   738 |       3 |
|      741 | 1680 | idx_actor_last_name |        1 |    0 |   738 |       4 |
+----------+------+---------------------+----------+------+-------+---------+
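
Since the primary index always has the minimal ID, once the dictionary tables are loaded into a MySQL server it can be picked directly, for example:

mysql> SELECT ID FROM SYS_INDEXES WHERE TABLE_ID = 741 ORDER BY ID LIMIT 1;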

SYS_COLUMNS
Table SYS_COLUMNS stores the field names and type information of a table.

CREATE TABLE `SYS_COLUMNS` (
  `TABLE_ID` bigint(20) unsigned NOT NULL,
  `POS` int(10) unsigned NOT NULL,
  `NAME` varchar(255) DEFAULT NULL,
  `MTYPE` int(10) unsigned DEFAULT NULL,
  `PRTYPE` int(10) unsigned DEFAULT NULL,
  `LEN` int(10) unsigned DEFAULT NULL,
  `PREC` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`TABLE_ID`,`POS`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Here TABLE_ID is the well-known table identifier and POS is the position of a field in the table. NAME is the name of a field; MTYPE and PRTYPE store information about the field type, encoding, NULL/NOT NULL properties, etc.
LEN is the maximum number of bytes a field uses to store a value. I’m not sure what PREC is used for. It sounds like a shorthand for “precision”, but at least for the DECIMAL type, where it would make sense, it is still zero. If you know how InnoDB uses PREC, please let me know.

mysql> select * from SYS_COLUMNS WHERE TABLE_ID=741;
+----------+-----+-------------+-------+---------+------+------+
| TABLE_ID | POS | NAME        | MTYPE | PRTYPE  | LEN  | PREC |
+----------+-----+-------------+-------+---------+------+------+
|      741 |   0 | actor_id    |     6 |    1794 |    2 |    0 |
|      741 |   1 | first_name  |    12 | 2162959 |  135 |    0 |
|      741 |   2 | last_name   |    12 | 2162959 |  135 |    0 |
|      741 |   3 | last_update |     6 |    1799 |    4 |    0 |
+----------+-----+-------------+-------+---------+------+------+

So, we know all fields of the table and we can get their types. Is this enough for the recovery? No.

SYS_FIELDS
We need to know which fields form the primary key. The point is that regardless of the position at which the primary key fields are defined in the CREATE TABLE statement, internally they always go first in a record. The second issue we should take into account is the internal fields DB_TRX_ID and DB_ROLL_PTR. These two fields always reside between the primary key fields and the rest of the fields.
SYS_FIELDS lists fields of all indexes, including the primary.

CREATE TABLE `SYS_FIELDS` (
  `INDEX_ID` bigint(20) unsigned NOT NULL,
  `POS` int(10) unsigned NOT NULL,
  `COL_NAME` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`INDEX_ID`,`POS`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

The field names explain their content.
Index id in our example is 1679:

mysql> SELECT * FROM SYS_FIELDS WHERE INDEX_ID = 1679;
+----------+-----+----------+
| INDEX_ID | POS | COL_NAME |
+----------+-----+----------+
|     1679 |   0 | actor_id |
+----------+-----+----------+

This means the primary key of the table is the single field actor_id.

Now we have all necessary information to generate CREATE TABLE statement.

0. Download the latest revision of the recovery tool

bzr branch lp:percona-data-recovery-tool-for-innodb

1. Compile the dictionary parsers

make dict_parsers

2. Split ibdata1 with page_parser

./page_parser -f /var/lib/mysql/ibdata1

3. Recover SYS_TABLES, SYS_INDEXES, SYS_COLUMNS and SYS_FIELDS from indexes 0-1, 0-3, 0-2 and 0-4 respectively.

./bin/constraints_parser.SYS_FIELDS -4f pages-ibdata1/FIL_PAGE_INDEX/<index_id>
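
Spelled out for all four dictionary tables (a sketch; the redirection targets are illustrative file names):

./bin/constraints_parser.SYS_TABLES  -4f pages-ibdata1/FIL_PAGE_INDEX/0-1 > SYS_TABLES.dump
./bin/constraints_parser.SYS_INDEXES -4f pages-ibdata1/FIL_PAGE_INDEX/0-3 > SYS_INDEXES.dump
./bin/constraints_parser.SYS_COLUMNS -4f pages-ibdata1/FIL_PAGE_INDEX/0-2 > SYS_COLUMNS.dump
./bin/constraints_parser.SYS_FIELDS  -4f pages-ibdata1/FIL_PAGE_INDEX/0-4 > SYS_FIELDS.dump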

4. Load the dumps of the dictionary tables into some MySQL server, using the LOAD DATA INFILE statements that constraints_parser generates

mysql> LOAD DATA INFILE '/path/to/SYS_FIELDS' REPLACE INTO TABLE `SYS_FIELDS` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'SYS_FIELDS\t' (`INDEX_ID`, `POS`, `COL_NAME`);

5. Now everything is ready to generate a CREATE TABLE statement for a table:

./sys_parser -u root sakila/actor
CREATE TABLE `actor`(
        `actor_id` SMALLINT UNSIGNED NOT NULL,
        `first_name` VARCHAR(45) CHARACTER SET 'utf8' COLLATE 'utf8_general_ci' NOT NULL,
        `last_name` VARCHAR(45) CHARACTER SET 'utf8' COLLATE 'utf8_general_ci' NOT NULL,
        `last_update` TIMESTAMP NOT NULL,
        PRIMARY KEY (`actor_id`)
) ENGINE=InnoDB;

Two final notes:

  • The generated structure differs from the original structure of the table, but it is good enough for the recovery.
  • DECIMAL fields are not fully supported. If a field was of type DECIMAL(5,2), sys_parser will generate DECIMAL(5,0); it has to be corrected manually.

The post How to recover table structure from InnoDB dictionary appeared first on MySQL Performance Blog.

Apr
19
2013
--

MySQL alternative Percona Server 5.1.68-14.6 now available

Percona Server

Percona Server for MySQL version 5.1.68-14.6

Percona is glad to announce the release of Percona Server 5.1.68-14.6 on April 19, 2013 (downloads are available here and from the Percona Software Repositories). Based on MySQL 5.1.68, including all the bug fixes in it, Percona Server 5.1.68-14.6, a MySQL alternative, is now the current stable release in the 5.1 series. All of Percona’s software is open source and free; all the details of the release can be found in the 5.1.68-14.6 milestone at Launchpad.

Bugs Fixed:

  • Fixed yum dependencies that were causing conflicts in CentOS 6.3 during installation. Bugs fixed #1031427 and #1051874.
  • When mysqldump was used with --innodb-optimize-keys option it produced invalid SQL for cases when there was an explicitly named foreign key constraint which implied an implicit secondary index with the same name. Fixed by detecting such cases and omitting the corresponding secondary keys from deferred key creation optimization. Bug fixed #1081016.
  • When mysqldump was used with --innodb-optimize-keys and --no-data options, all secondary key definitions would be lost. Bug fixed #989253.
  • Percona Server was built with YaSSL which could cause some of the programs that use it to crash. Fixed by building packages with OpenSSL support rather than the bundled YaSSL library. Bug fixed #1104977.
  • The fix for bug #1070856 introduced a regression in Percona Server 5.1.66-14.2 which could cause a server to hang when the binary log is enabled. Bug fixed #1162085.
  • Percona Server would re-create the test database when using rpm on server upgrade, even if the database was previously removed. Bug fixed #710799.
  • Debian packages included the old version of innotop. Fixed by removing innotop and its InnoDBParser Perl package from source and Debian installation. Bug fixed #1032139.
  • Percona Server was missing help texts in the MySQL client because the help tables were missing. Bug fixed #1041981.

Other bug fixes: bug fixed #1154962, bug fixed #1154959, bug fixed #1154957, bug fixed #1154954, bug fixed #1144059, bug fixed #1050536.

Bugs can be reported on the launchpad bug tracker. Release notes for Percona Server 5.1.68-14.6 are available in our online documentation.

The post MySQL alternative Percona Server 5.1.68-14.6 now available appeared first on MySQL Performance Blog.

Apr
19
2013
--

The write cache: Swap insanity tome III

Swapping has always been bad for MySQL performance, but it is even more important to avoid on HA systems. It is so important that NDB Cluster basically forbids calling malloc after the startup phase, hence its rather complex configuration.

Probably most readers of this blog know (or should know) about the Linux swappiness setting, which controls how strongly Linux favors the file cache over application memory. Since the file cache is not important with InnoDB, we add “vm.swappiness = 0” to “/etc/sysctl.conf”, run “sysctl -p”, and we are done.
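
Concretely, those two steps look like this (run as root):

echo "vm.swappiness = 0" >> /etc/sysctl.conf   # persist the setting
sysctl -p                                      # apply it immediately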

Swappiness solves part of the swapping issue, but not all of it. On NUMA systems the picture is more complex, and swapping can occur because of a memory imbalance between the physical CPUs (the sockets, not the cores). Jeremy Cole explained this here and here. In summary, you need to interleave the allocation of memory for the MySQL process using the numactl utility, drop the file cache, and pre-allocate the InnoDB buffer pool with the innodb_buffer_pool_populate option, as sketched below.
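
A sketch of that recipe (the mysqld path and init-script integration are illustrative; innodb_buffer_pool_populate requires a server that implements it, such as Percona Server):

# drop the file cache before starting mysqld
sync
sysctl -w vm.drop_caches=3

# start mysqld with its memory interleaved across NUMA nodes
numactl --interleave=all /usr/sbin/mysqld --defaults-file=/etc/my.cnf &

# and in my.cnf, pre-allocate the buffer pool at startup:
# [mysqld]
# innodb_buffer_pool_populate = 1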

That solves most of the swapping issues but… I recently ended up in a situation where, after having done all that, a server was still swapping episodically for a few minutes, enough to cause serious issues to the Pacemaker setup the server is one of the nodes of. That caused resource failovers. Resource failover when there’s no fencing and the servers are unresponsive because of swapping is pretty bad for a cluster, and we often ended up in strange, unexpected states. What was I missing?

In order to figure it out, I started gathering metrics: pt-stalk, vmstat, top, iostat, etc., all saving to disk with the current timestamp in front of each line. I quickly found that the issues happened during the backups, but also during the push of the backups to S3. Why? The push to S3 was especially puzzling since MySQL was not involved. The server was running with nearly 7GB of free memory, but the push to S3 involves a split operation to create files of 5GB, since the backup size is about 28GB and the file upload size limit is 5GB. So, about 28GB was written to disk in a short while. Here’s the vmstat output during the split process; the cron job started at 10am.


Fri Mar 22 10:03:22 UTC 2013 procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
Fri Mar 22 10:03:22 UTC 2013 r b swpd free buff cache si so bi bo in cs us sy id wa st
Fri Mar 22 10:03:22 UTC 2013 2 1 291992 224892 138848 7254284 0 0 148504 476 24190 24233 8 6 81 5 0
Fri Mar 22 10:03:24 UTC 2013 2 1 291992 222228 138848 7258064 0 0 194040 96 22443 20425 7 6 83 5 0
Fri Mar 22 10:03:26 UTC 2013 3 1 291992 224088 138852 7260196 0 0 172386 156 27143 27637 8 7 80 5 0
Fri Mar 22 10:03:28 UTC 2013 1 4 291992 219104 138832 7262820 0 0 174854 160 18893 17002 6 5 83 6 0
Fri Mar 22 10:03:30 UTC 2013 1 2 291992 324640 138836 7153440 0 0 143736 132 19318 17425 7 5 76 12 0
Fri Mar 22 10:03:32 UTC 2013 0 2 291984 292988 138840 7183556 0 0 138480 1206 19126 16359 3 5 81 12 0
Fri Mar 22 10:03:34 UTC 2013 4 0 291984 216932 138832 7255856 0 0 169072 936 20541 16555 3 7 83 8 0
Fri Mar 22 10:03:36 UTC 2013 3 7 300144 242052 138836 7216444 0 4080 53400 53714 18422 16234 2 7 72 18 0
Fri Mar 22 10:03:38 UTC 2013 0 4 395888 355136 138844 7197992 0 47872 3350 125712 24148 21633 3 4 70 23 0
Fri Mar 22 10:03:40 UTC 2013 4 4 495212 450516 138864 7208188 0 49664 2532 164356 30539 21120 2 3 81 13 0

Note the free column going down and the cache column going up, and then swapping starting at 10:03:36. Also, the server is reading a lot (~150MB/s) but doing very few physical writes. The writes were cached, and the write cache was causing the problem. It turned out that vm.dirty_ratio was set to 20 on a 48GB server, allowing nearly 10GB of “write cache”. A quick survey of the servers I have access to showed that the common values are 20 and 40. I simply set dirty_ratio to 2 (about 1GB of write cache) and the issue has been gone since.

So, add to your verification list the point of checking the Linux settings for dirty pages. You can set either “vm.dirty_ratio” or “vm.dirty_bytes”, but keep in mind that only one is used; setting one to a non-zero value sets the other to zero. A sketch follows.
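
For example, the change that fixed this server (2% of 48GB is roughly the 1GB write cache mentioned above):

echo "vm.dirty_ratio = 2" >> /etc/sysctl.conf
sysctl -p

# or use an absolute cap instead (this zeroes vm.dirty_ratio):
# sysctl -w vm.dirty_bytes=1073741824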

The post The write cache: Swap insanity tome III appeared first on MySQL Performance Blog.

Apr
18
2013
--

Rotating MySQL slow logs safely

This blog post is part two of two. Like part one, published Wednesday, this is a cross-post from Groupon’s engineering blog. Thanks again to Kyle Oppenheim at Groupon. And one more reminder that I’ll be at the Percona Live MySQL Conference and Expo next week in Santa Clara, California, so look for me there. You can check out the sessions I’ll be leading here.

In my last post, I described a solution for keeping the caches of a MySQL standby server hot using MySQL slow logs with long_query_time set to 0. Here are some lessons we learned when logging a high volume of queries to the slow log.

Do not use copytruncate

Logrotate offers two techniques for log rotation (your log rotation scheme likely offers similar options with a different name):

  1. copytruncate – Copies the file to a new name, and then truncates the original file.
  2. no copytruncate – Uses the rename() system call to move the file to a new name, and then expects the daemon to be signaled to reopen its log file.

MySQL has a mutex for slow log writes. Truncation can block MySQL because the OS serializes access to the inode during the truncate operation. This problem is particularly evident when using the ext3 file system (instead of xfs).

Use FLUSH LOGS instead of sending SIGHUP

When copytruncate is disabled, MySQL must be told to reopen the slow log file. There are two options for signaling:

  1. Send a HUP signal to the mysqld process.
  2. Use the mysql console or mysqladmin utility to FLUSH LOGS;

These options should be equivalent, but MySQL bug 65481 explains that the HUP signal also flushes tables in addition to logs. Flushing tables can impact running queries.

Disable MySQL slow logs during rotation

Flushing logs takes time. Meanwhile, queries are still being executed. To prevent MySQL from filling the slow log buffer, we disable the MySQL slow logs temporarily during log rotation, as sketched below.
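
A minimal sketch of the idea (slow_query_log is a dynamic variable in MySQL 5.1 and later; the production configuration below uses a long_query_time variant of the same trick):

mysql> SET GLOBAL slow_query_log = OFF;
mysql> FLUSH LOGS;
mysql> SET GLOBAL slow_query_log = ON;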

Putting it all together

Here is a logrotate configuration file for a slow log that illustrates these best practices:

/var/mysql/slow_query.log {
    nocompress
    create 660 mysql mysql
    size 1G
    dateext
    missingok
    notifempty
    sharedscripts
    postrotate
       /usr/local/bin/mysql -e 'select @@global.long_query_time into @lqt_save; set global long_query_time=2000; select sleep(2); FLUSH LOGS; select sleep(2); set global long_query_time=@lqt_save;'
    endscript
    rotate 150
}

The post Rotating MySQL slow logs safely appeared first on MySQL Performance Blog.

Apr
17
2013
--

Disconnecting a replication slave is easier with MySQL 5.5+ (RESET SLAVE vs. RESET SLAVE ALL)

It’s not uncommon to promote a server from slave to master. One of the key things to protect your data integrity is to make sure that the promoted slave is permanently disconnected from its old master. If not, it may get writes from the old master, which can cause all kinds of data corruption. MySQL provides the handy RESET SLAVE command. But as we’ll see, its behavior has changed along with the MySQL versions and it’s easy to shoot yourself in the foot if you use it incorrectly. So how do you safely disconnect a replication slave?

In short

  • For MySQL 5.0 and 5.1, run STOP SLAVE, CHANGE MASTER TO MASTER_HOST='' and then RESET SLAVE.
  • For MySQL 5.5 and 5.6, run STOP SLAVE and then RESET SLAVE ALL.
  • For all versions, ban the master-user, master-host and master-password settings from my.cnf; they can cause huge problems (and they are no longer supported as of MySQL 5.5).

If you want to know more details, please read on!

MySQL 5.0/5.1

First let’s consider MySQL 5.0 and 5.1. RESET SLAVE will remove the master.info and relay-log.info files as well as all the relay log files. This looks great, but does it ensure the replica is disconnected from its master?
Let’s try:

mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)
mysql> reset slave;
Query OK, 0 rows affected (0.03 sec)
mysql> show slave status\G
*************************** 1. row ***************************
             Slave_IO_State:
                Master_Host: 127.0.0.1
                Master_User: test
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File:
        Read_Master_Log_Pos: 4
             Relay_Log_File: mysql_sandbox35302-relay-bin.000001
              Relay_Log_Pos: 4
      Relay_Master_Log_File:
           Slave_IO_Running: No
          Slave_SQL_Running: No
          		[...]

This is not expected: instead of removing all settings, some of them are reset to default values. This means that if you run START SLAVE (or if it’s done automatically, for instance when restarting the server without the skip-slave-start option), replication may start again. But as the master position has been deleted, replication will restart at the beginning of the first available binary log, which is very likely to corrupt your data by re-executing some queries.

Here’s a trick to make RESET SLAVE work as expected: use CHANGE MASTER TO MASTER_HOST='':

mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)
mysql> change master to master_host='';
Query OK, 0 rows affected (0.02 sec)
mysql> reset slave;
Query OK, 0 rows affected (0.04 sec)
mysql> show slave status\G
Empty set (0.00 sec)
mysql> start slave;
ERROR 1200 (HY000): The server is not configured as slave; fix in config file or with CHANGE MASTER TO

Much better! If we try to restart replication, it fails. However, I don’t like the error message, specifically the ‘fix in config file’ part. What happens if we specify the master-user, master-password, master-host and master-port in the my.cnf file?

# cat my.cnf
[...]
master-user=rsandbox
master-password=rsandbox
master-host=127.0.0.1
master-port=35301
[...]

Let’s disconnect the slave:

mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)
mysql> change master to master_host='';
Query OK, 0 rows affected (0.03 sec)
mysql> reset slave;
Query OK, 0 rows affected (0.03 sec)
mysql> show slave status\G
*************************** 1. row ***************************
             Slave_IO_State:
                Master_Host: 127.0.0.1
                Master_User: rsandbox
                Master_Port: 35301
              Connect_Retry: 60
            Master_Log_File:
        Read_Master_Log_Pos: 4
             Relay_Log_File: mysql_sandbox35302-relay-bin.000001
              Relay_Log_Pos: 4
      Relay_Master_Log_File:
           Slave_IO_Running: No
          Slave_SQL_Running: No
          [...]

Connection settings are automatically restored, which makes disconnecting the replica impossible. And again, if you restart replication, it will read events from the first available binary log file on the master, which is probably not what you want. So never set master-xxx variables in my.cnf!

From MySQL 5.5

Starting with MySQL 5.5, the situation has slightly changed. First the master-xxx variables are no longer supported, which is a great improvement. But the RESET SLAVE statement also behaves differently:

mysql> stop slave;
Query OK, 0 rows affected (0,01 sec)
mysql > reset slave;
Query OK, 0 rows affected (0,11 sec)
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: 127.0.0.1
                  Master_User: rsandbox
                  Master_Port: 18675
                Connect_Retry: 60
              Master_Log_File:
          Read_Master_Log_Pos: 4
               Relay_Log_File: mysql_sandbox18676-relay-bin.000001
                Relay_Log_Pos: 4
        [...]

As stated in the documentation, the connection parameters are still held in memory. So you will still be able to restart replication, but again, as no replication coordinates are specified, replication will start at the beginning of the first available binary log file, with all the nasty consequences we can imagine.

Even worse, the CHANGE MASTER TO MASTER_HOST='' trick no longer works:

mysql> stop slave;
Query OK, 0 rows affected (0,01 sec)
mysql> change master to master_host='';
ERROR 1210 (HY000): Incorrect arguments to MASTER_HOST

Fortunately, the documentation also specifies that we can use RESET SLAVE ALL to remove all replication-related configuration:

mysql> stop slave;
Query OK, 0 rows affected (0,00 sec)
mysql> reset slave all;
Query OK, 0 rows affected (0,04 sec)
mysql> show slave status\G
Empty set (0,00 sec)

Very good! The command does work as expected without any additional tricks. As soon as you are aware of the difference between RESET SLAVE and RESET SLAVE ALL, disconnecting a replication slave is much easier with MySQL 5.5+.

The post Disconnecting a replication slave is easier with MySQL 5.5+ (RESET SLAVE vs. RESET SLAVE ALL) appeared first on MySQL Performance Blog.

Apr
16
2013
--

Percona XtraDB Cluster 5.5.30-23.7.4 for MySQL now available

Percona is glad to announce the release of Percona XtraDB Cluster 5.5.30-23.7.4 for MySQL on April 17, 2013. Binaries are available from the downloads area or from our software repositories.

New Features:

  • Percona XtraDB Cluster now has an initial implementation of weighted quorum. The weight for a node can be assigned via the pc.weight option in the wsrep_provider_options variable (a configuration sketch follows this list). Accepted values are in the range [0, 255] (inclusive). Quorum is computed using a weighted sum over the group members.
  • The Percona XtraDB Cluster binary will now be bundled with the libjemalloc library. For RPM/deb packages, this library will be available for download from our repositories. A benchmark showing the impact of memory allocators on MySQL performance can be found in this blogpost.
  • This release of Percona XtraDB Cluster has fixed a number of foreign key and packaging bugs.
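
For example, a node can be given extra quorum weight in my.cnf (a sketch; the weight value 2 is illustrative):

[mysqld]
wsrep_provider_options = "pc.weight=2"

The option can also be changed at runtime with SET GLOBAL wsrep_provider_options = 'pc.weight=2';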

Bug Fixes:

  • Fixed yum dependencies that were causing conflicts in CentOS 6.3 during installation. Bug fixed #1031427 (Ignacio Nin).
  • In case the Percona XtraDB Cluster was built from the source rpm, wsrep revision information would be missing. Bug fixed #1128906 (Alexey Bychko).
  • The method of generating md5 digest over tuples in a table with no primary key was not deterministic which could lead to a node failure. Bug fixed #1019473 (Seppo Jaakola).
  • Percona XtraDB Cluster was built with YaSSL which could cause some of the programs that use it to crash. Fixed by building packages with OpenSSL support rather than the bundled YaSSL library. Bug fixed #1104977 (Raghavendra D Prabhu).
  • The clustercheck script would hang in case the MySQL server on a node was hung, and as a consequence it would never fail over that server. Bug fixed #1035927 (Raghavendra D Prabhu).
  • High values in variables evs.send_window and evs.user_send_window could trigger cluster crash under high load. Bug fixed #1080539 (Teemu Ollakka).
  • Standard MySQL port would be used when port number wasn’t explicitly defined in the wsrep_node_incoming_address. Bug fixed #1082406 (Alex Yurchenko).
  • Dropping a non-existing temporary table would be replicated when TOI was used as the wsrep_OSU_method. This bug was fixed for the case when the DROP TEMPORARY TABLE statement is used, but it will still replicate in case the DROP TABLE statement is used on a temporary table. Bug fixed #1084702 (Seppo Jaakola).
  • In case two nodes in a 3-node cluster had to abort due to inconsistency, one wouldn’t correctly notify the surviving node, which would lead the surviving node to lose the primary component and cause subsequent downtime. Bug fixed #1108165 (Alex Yurchenko).
  • In some cases non-uniform foreign key reference could cause a slave crash. Fixed by using primary key of the child table when appending exclusive key for cascading delete operation. Bug fixed #1089490 (Seppo Jaakola).
  • Parallel applying would fail in case mixed CHAR and VARCHAR columns would be used in foreign key definitions. Bug fixed #1100496 (Seppo Jaakola).
  • Debian packages included the old version of innotop. Fixed by removing innotop and its InnoDBParser Perl package from source and Debian installation. Bug fixed #1032139 (Alexey Bychko).
  • The mysqld_safe script would fail to retrieve the Galera replication position on Ubuntu 10.04, because the different shell was used. Bug fixed #1108431 (Alex Yurchenko).
  • Cascading foreign key constraint could lead to unresolved replication conflict and leave a slave hanging. Bug fixed #1130888 (Seppo Jaakola).
  • If MySQL replication threads were started before running wsrep recovery, this would lead to memory corruption and server crash. Bug fixed #1132974 (Seppo Jaakola).
  • Conflicting prepared statements in multi-master use case could cause node to hang. This was happening due to prepared statement execution loop, which does not honor wsrep status codes correctly. Bug fixed #1144911 (Seppo Jaakola).
  • State Snapshot Transfer with Xtrabackup would fail if the tmpdir was specified more than once in the MySQL configuration file (my.cnf). Bugs fixed #1160047 and #1086978 (Raghavendra D Prabhu).
  • Donor node would run XtraBackup indefinitely when xtrabackup tmpdir was set up on tmpfs. Bug fixed #1086978 (Alex Yurchenko).
  • Issues with compiling Galera on the ARM architecture have been fixed. Bug fixed #1133047 (Alex Yurchenko).
  • Upstream bugfix for bug #59354 triggered a regression that could cause transaction conflicts. Bug fixed #1158221 (Seppo Jaakola).
  • Galera builds would fail when they were built with the new boost library. Bug fixed #1131736 (Alex Yurchenko).
  • The lost+found folder wasn’t included in the rsync SST exclusion filter, which caused SST failures due to insufficient privileges. Fixed by excluding the lost+found folder if found. Bug fixed #1154095 (Alex Yurchenko).
  • If the variable innodb_thread_concurrency has been defined to throttle InnoDB access, and the workload contained DDL statements, a cluster node could remain hanging due to an unresolved MDL conflict. Fixed by adding a new method to cancel a thread waiting for InnoDB concurrency. Bug fixed #1155183 (Seppo Jaakola).
  • Handling of the network issues in Galera has been improved. Bug fixed #1153727 (Teemu Ollakka).
  • Fixed the wrong path in the /etc/xinetd.d/mysqlchk script. Bugs fixed #1000761 and #1132934 (Raghavendra D Prabhu).
  • When upgrading the Percona-XtraDB-Cluster-server package, /usr/bin/clustercheck script would get overwritten, and any changes (such as username and password) would be lost. Bug fixed #1158443 (Raghavendra D Prabhu).
  • In case a CREATE TABLE AS SELECT statement was running in parallel with a DDL statement on the selected table, in some cases the first statement could be left hanging. Bug fixed #1164893 (Seppo Jaakola).
  • Galera builds would fail when gcc 4.8 was used. Bug fixed #1164992 (Alex Yurchenko).
  • Percona-XtraDB-Cluster-galera package version number didn’t match the wsrep_provider_version one. Bug fixed #1111672 (Alexey Bychko).
  • Only rpm debug build was available for Percona XtraDB Cluster, fixed by providing the deb debug build as well. Bug fixed #1096123 (Ignacio Nin).

Other bug fixes: bug fixed #1162421 (Seppo Jaakola), bug fixed #1093054 (Alex Yurchenko), bug fixed #1166060 (Teemu Ollakka), bug fixed #1166065 (Teemu Ollakka).

Based on Percona Server 5.5.30-30.2 including all the bug fixes in it and on Codership wsrep API 5.5.30-23.7.4, Percona XtraDB Cluster 5.5.30-23.7.4 is now the current stable release. All of Percona’s software is open-source and free. Release notes for Percona XtraDB Cluster 5.5.30-23.7.4 are available in our online documentation.

We did our best to eliminate bugs and problems, but this is software, so bugs are expected. If you encounter them, please report them to our bug tracking system.

UPDATE[18-04-2013]: There was a RPM packaging regression introduced with the fix for bug #710799. This regression only affected clean RPM installations and not upgrades. We have pushed the fixed packages to the repositories. Bug fixed #1170024.

The post Percona XtraDB Cluster 5.5.30-23.7.4 for MySQL now available appeared first on MySQL Performance Blog.

Apr
16
2013
--

Is your MySQL buffer pool warm? Make it sweat!

Today’s blog post diving into the waters of the MySQL buffer pool is a cross-post from Groupon’s engineering blog, and is Part 1 of 2. Thank you to Kyle Oppenheim at Groupon for contributing to this project and post. We’ll be posting Part 2 on Thursday. I’ll be at the Percona Live MySQL Conference and Expo next week in Santa Clara, California so look for me there – I’d love to connect and talk more about MySQL buffer pools or anything else that’s on your mind!

There are numerous solutions for MySQL high availability. Many rely on MySQL’s asynchronous replication to maintain a warm standby server which is flipped into service if the active master server has an issue. At Groupon, our standard MySQL database configuration follows this active/passive pattern. There is a virtual IP that points to the active server of a pair. The passive server has mysqld running and replicating from the active master. In theory, failing over is the simple matter of moving the virtual IP. In practice, it’s slightly more complicated. To manage this process, we use a tool co-developed with Percona.

“Warm standby server”? Did you catch that? What does that mean? In most of these high-availability solutions it means that mysqld is running on the standby server and that replication is not lagging. Unfortunately, that’s often not sufficient for failover during peak traffic. The standby server does not process query traffic except for replication. The buffer pool and adaptive hash index on the standby server will not have recently accessed pages. When it starts handling queries after failover, the lower cache hit rates can lead to outages. Specifically, at Groupon, our servers would be heavily I/O bound after failover as the buffer pool pages were loaded from disk.

Replaying Queries

Working with Groupon, we have developed a solution to keep the standby server’s caches hot. (See my Fosdem 2013 slides for information about discarded designs and benchmarks.)

First, we set long_query_time to 0 in order to log every query. (See part two for handling massive slow log volume.) The slow logs are served, via HTTP, by mysql_slowlogd. This daemon is similar to running tail -f slow.log, except that it knows how to follow the log stream across log rotation events. On the standby server, the logs are replayed with Percona Playback by streaming the slow log from the active server.
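
On the logging side, a minimal configuration sketch for the active server (the file path is an example) is:

# my.cnf on the active master
[mysqld]
slow_query_log      = 1
slow_query_log_file = /var/mysql/slow_query.log
long_query_time     = 0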

wget -q -O - http://master_server:3307/slow | percona-playback --mysql-host 127.0.0.1 --mysql-username playback --mysql-password PaSSwOrd --mysql-schema schema_name --query-log-stdin --dispatcher-plugin thread-pool --thread-pool-threads-count 100 --session-init-query \"set innodb_fake_changes=1\" > /var/log/playback.log 2>&1 &

Our awesome development team added a few features to Percona Playback to make it work better for this use case. You will need version 0.6 or later to get these features. Be aware that the playback output is really verbose; in production you most likely want to redirect it to /dev/null and only keep a log file for debugging purposes.

  1. Streaming logs from stdin. Percona Playback now supports the --query-log-stdin command-line option for accepting a never-ending stream of queries to play back.
  2. Read-only playback. Using the --session-init-query command-line option, we set the option innodb_fake_changes to prevent INSERTs, UPDATEs, and DELETEs from corrupting the data on the standby server. You will need Percona Server in order to use innodb_fake_changes.
  3. Thread pool. Percona Playback added a connection pool option via --dispatcher-plugin-thread-pool that allows connection reuse. This is necessary when running a large stream of queries.

Benchmarks

We benchmarked with slow query logs captured from our production systems. We restored a production database backup to our test database so that our test database was consistent before applying the captured query traffic. This is an important step because update statements that match no rows or insert statements that have duplicate key errors may be faster than an actual database write.

The slow logs were split into chunks, each containing roughly 1M queries. We warmed the cold database with the first chunk and replayed the second chunk after the warmup.

Disk read I/O for chunk 1 followed by 2 - MySQL buffer pool

The y axis is logarithmic, so the difference in I/O usage is two orders of magnitude. All graphs looked like this (we did 39 measurements); the next graph shows chunk 4’s workload warmed up with chunk 3.

Disk read IO for chunk 3 followed by 4 - MySQL buffer pool

The result is similar for every single graph: each chunk warmed up the buffer pool for the next one.

Disk read IO for chunk 1 followed by 1 - MySQL buffer pool

As an additional experiment we tried replaying the same chunk again. We expected everything to be cached if we warmed the cache with the exact same data. All the graphs from such self-warming experiments look like this one. The green part of the graph lines up with the blue part.

Check back Thursday for Part 2!

The post Is your MySQL buffer pool warm? Make it sweat! appeared first on MySQL Performance Blog.

Apr
16
2013
--

Testing the Micron P320h

The Micron P320h SSD is an SLC-based PCIe solid-state storage device which claims to provide the highest read throughput of any server-grade SSD. At Micron’s request, I recently took some time to put the card through its paces, and the numbers are indeed quite impressive.

For reference, the benchmarks for this device were performed primarily on a Dell R720 with 192GB of RAM and two Xeon E5-2660 processors that yield a total of 32 virtual cores. This is the same machine which was used in my previous benchmark run. A small handful of additional tests were also performed using the Cisco UCS C250. The operating system in use was CentOS 6.3, and for the sysbench fileIO tests, the EXT4 filesystem was used. The card itself is the 700GB model.

So let’s take a look at the data.

With the sysbench fileIO test in asynchronous mode, read performance is an extremely steady 3202MiB/sec with almost no deviation. Write performance is also both very strong and very steady, coming in at a bit over 1730MiB/sec with a standard deviation of a bit less than 13MiB/sec.

realssd-asyncIO

 

When we factor in that the block size in use here is 16KiB, these numbers equate to over 110,000 write IOPS and almost 205,000 read IOPS.
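
(As a quick check of the arithmetic: 3202 MiB/s × 1024 KiB/MiB ÷ 16 KiB per operation ≈ 204,900 read IOPS, and 1730 × 1024 ÷ 16 ≈ 110,700 write IOPS.)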

When we switch over to synchronous IO, we find that the card is quite capable of matching the asynchronous performance:

syncIO-throughput

Synchronous read reaches peak capacity somewhere between 32 and 64 threads, and synchronous write tops out somewhere between 64 and 128 threads. The latency numbers are equally impressive; the next two graphs show 95th and 99th-percentile response time, but there really isn’t much difference between the two.

syncIO-latency

At 64 read threads, we reach peak performance with latency of roughly 0.5 milliseconds; and at 128 write threads we have maximum throughput with latency just over 3ms.

How well does it perform with MySQL? Exact results vary, depending upon the usual factors (read/write ratio, working set size, buffer pool size, etc.) but overall the card is extremely quick and handily outperforms the other cards that it was tested against. For example, in the graph below we compare the performance of the P320h on a standard TPCC-MySQL test to the original FusionIO and the Intel i910 with assorted buffer pool sizes:

tpcc-mysql-devicecompare

 

And in this graph we look at the card’s performance on sysbench OLTP:

sysbench-oltp-ext4xfs

It is worth noting here that EXT4 outperforms XFS by a fairly significant margin. The approximate raw numbers, in tabular format, are:

Buffer pool size   EXT4    XFS
13GiB             22000   7500
25GiB             17000   9000
50GiB             21000  11000
75GiB             25000  15000
100GiB            31000  19000
125GiB            36000  25000

In the final analysis, there may or may not be faster cards out there, but the Micron P320h is the fastest one that I have personally seen to date.

The post Testing the Micron P320h appeared first on MySQL Performance Blog.
