Jul 08, 2014
--

TIMESTAMP Columns, Amazon RDS 5.6, and You

This comes from an issue that I worked on recently, wherein a customer reported that their application was working fine under stock MySQL 5.6 but producing erroneous results when they tried running it on Amazon RDS 5.6. They had a table which, on the working server, contained two TIMESTAMP columns, one of which defaulted to CURRENT_TIMESTAMP and the other to '0000-00-00 00:00:00', like so:

CREATE TABLE mysql56 (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  ts1 TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  ts2 TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00'
);

However, under Amazon RDS, the same table looked like this:

CREATE TABLE rds56 (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  ts1 TIMESTAMP NULL DEFAULT NULL,
  ts2 TIMESTAMP NULL DEFAULT NULL
);

They mentioned that their schema contains TIMESTAMP column definitions without any modifiers for nullability or default values. In other words, they were doing something like this:

CREATE TABLE foo56 (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ts1 TIMESTAMP,
    ts2 TIMESTAMP
);

It’s a known issue (or change, or difference, whatever we choose to call it) that MySQL is deprecating the implicit defaults for TIMESTAMP columns that don’t carry any nullability or default-value specifiers; this is covered in the 5.6 documentation. However, the docs also mention that the default value for this setting (explicit_defaults_for_timestamp) is OFF – i.e., if you create a table with TIMESTAMP columns without any defaults, the server will fill them in for you, as in the mysql56 table above.

As it turns out, the RDS default for this setting is ON, hence the “NULL DEFAULT NULL” modifiers when creating the table under RDS. We changed the parameter group, restarted the instance (note that this variable is NOT dynamic), and their schema-creation script created the tables in the proper way.
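A quick sanity check on any server is to read the variable directly (it is not dynamic, so changing it requires a restart – via the parameter group on RDS). A minimal sketch; the behavior comments reflect the expansion shown in the tables above:

```sql
-- Check which TIMESTAMP-default behavior this server uses:
SELECT @@global.explicit_defaults_for_timestamp;
-- 0 (OFF): the first bare TIMESTAMP column is silently expanded to
--          NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
--          and later ones to NOT NULL DEFAULT '0000-00-00 00:00:00'
-- 1 (ON):  bare TIMESTAMP columns become NULL DEFAULT NULL
```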

So, what have we learned here?

  • Migrating from standalone MySQL to Amazon RDS sometimes has hidden pitfalls that aren’t always readily apparent. Many times it will “just work” – but sometimes it doesn’t. Percona is, of course, happy to help review your configurations and assist with any Amazon RDS implementation plans you might have.
  • When in doubt, fully specify your TIMESTAMP columns. If you want them NOT NULL, say so. If you want a default value or an ON UPDATE value, set it. Even the configuration variable explicit_defaults_for_timestamp is deprecated and slated for removal in a future version, so eventually it won’t be possible to get the old pre-5.6 behavior at all.
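For example, a fully-specified variant of the foo56 table above (the table name here is hypothetical) behaves identically under either setting of explicit_defaults_for_timestamp:

```sql
CREATE TABLE foo56_explicit (
  id  INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  -- Nullability, default, and ON UPDATE are all spelled out, so the server
  -- has nothing to fill in on our behalf:
  ts1 TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  ts2 TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00'
);
```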

The post TIMESTAMP Columns, Amazon RDS 5.6, and You appeared first on MySQL Performance Blog.

Nov 05, 2013
--

How to recover an orphaned .ibd file with MySQL 5.6

A few years ago Yves Trudeau and Aleksandr Kuzminsky wrote posts about different ways of recovering orphaned .ibd files.

Today I want to show you an easier and quicker way to do that. In my example I’ll restore a “payment.ibd” file (the payment table) from the Sakila DB on a server with MySQL 5.5 (but with help from MySQL 5.6 and a sandbox).

In my case the OS is CentOS. So I needed to:
install MySQL Sandbox (check the installation instructions there)
download the latest Percona Server 5.6:

wget http://www.percona.com/redir/downloads/Percona-Server-5.6/LATEST/release-5.6.14-62.0/483/binary/linux/i686/Percona-Server-5.6.14-rel62.0-483.Linux.i686.tar.gz

create sandbox

make_sandbox Percona-Server-5.6.14-rel62.0-483.Linux.i686.tar.gz

test it

mysql -umsandbox -pmsandbox --host=127.0.0.1 --port=5614 -e "select @@version\G"
*************************** 1. row ***************************
@@version: 5.6.14-rel62.0

It Works!

check the datadir and whether the innodb_file_per_table option is enabled (this is a requirement)

mysql -umsandbox -pmsandbox --host=127.0.0.1 --port=5614 -e "show variables like 'datadir'"
+---------------+---------------------------------------+
| Variable_name | Value                                 |
+---------------+---------------------------------------+
| datadir       | /home/mixa/sandboxes/msb_5_6_14/data/ |
+---------------+---------------------------------------+

mysql -umsandbox -pmsandbox --host=127.0.0.1 --port=5614 -e "show variables like '%per_table'"
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_file_per_table | ON    |
+-----------------------+-------+

If it’s not enabled, then you’ll need to enable it:

mysql -umsandbox -pmsandbox --host=127.0.0.1 --port=5614 -e "SET GLOBAL innodb_file_per_table=1"

create empty payment table on 5.6 sandbox

mysql -umsandbox -pmsandbox --host=127.0.0.1 --port=5614 test < payment_table.sql

payment_table.sql is a file containing the “SHOW CREATE TABLE” statement for the payment table. The table structure must be the same.

cat payment_table.sql
CREATE TABLE `payment` (
  `payment_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
  `customer_id` smallint(5) unsigned NOT NULL,
  `staff_id` tinyint(3) unsigned NOT NULL,
  `rental_id` int(11) DEFAULT NULL,
  `amount` decimal(5,2) NOT NULL,
  `payment_date` datetime NOT NULL,
  `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`payment_id`),
  KEY `idx_fk_staff_id` (`staff_id`),
  KEY `idx_fk_customer_id` (`customer_id`),
  KEY `fk_payment_rental` (`rental_id`)
) ENGINE=InnoDB

stop sandbox

./sandboxes/msb_5_6_14/stop

replace .ibd file (in my case the correct copy of it is located in my homedir)

cp ~/payment.ibd ~/sandboxes/msb_5_6_14/data/test/ -f

make sure permissions are ok for .ibd file

sudo chmod 660 ~/sandboxes/msb_5_6_14/data/test/payment.ibd
sudo chown : ~/sandboxes/msb_5_6_14/data/test/payment.ibd

start sandbox

./sandboxes/msb_5_6_14/start

At this point, if you try to select something from the table, you’ll get an error:

select from table

mysql -umsandbox -pmsandbox --host=127.0.0.1 --port=5614 -e "select count(*) from test.payment"
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
mysql -umsandbox -pmsandbox --host=127.0.0.1 --port=5614 -e "select count(*) from test.payment"
ERROR 1146 (42S02) at line 1: Table 'test.payment' doesn't exist

error log

2013-11-02 14:36:34 b7eff990  InnoDB: Error: table 'test/payment'
InnoDB: in InnoDB data dictionary has tablespace id 7,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name test/payment and id 10, though. Have
InnoDB: you deleted or moved .ibd files?
...  ...
2013-11-02 14:36:36 11640 [ERROR] InnoDB: Failed to find tablespace for table '"test"."payment"' in the cache. Attempting to load the tablespace with space id 7.
2013-11-02 14:36:36 11640 [ERROR] InnoDB: In file './test/payment.ibd', tablespace id and flags are 10 and 0, but in the InnoDB data dictionary they are 7 and 0. Have you moved InnoDB .ibd files around without using the commands DISCARD TABLESPACE and IMPORT TABLESPACE? Please refer to http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
2013-11-02 14:36:36 a31a2b90  InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
2013-11-02 14:36:36 11640 [ERROR] InnoDB: Could not find a valid tablespace file for 'test/payment'. See http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
2013-11-02 14:36:36 a31a2b90 InnoDB: cannot calculate statistics for table "test"."payment" because the .ibd file is missing. For help, please refer to http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting.html

How to fix it? In 5.6 tablespace management is much improved, so the only things needed are “ALTER TABLE .. DISCARD TABLESPACE” and “ALTER TABLE .. IMPORT TABLESPACE”.

Please also check the limitations: Tablespace Copying Limitations

Look at the example:

Discard tablespace

mysql -umsandbox -pmsandbox --host=127.0.0.1 --port=5614 -e "alter table test.payment discard tablespace; show warnings;"
+---------+------+--------------------------------------------------------+
| Level   | Code | Message                                                |
+---------+------+--------------------------------------------------------+
| Warning | 1812 | InnoDB: Tablespace is missing for table 'test/payment' |
| Warning | 1812 | InnoDB: Tablespace is missing for table 'payment'      |
+---------+------+--------------------------------------------------------+

Import tablespace

mysql -umsandbox -pmsandbox --host=127.0.0.1 --port=5614 -e "alter table test.payment import tablespace; show warnings"
+---------+------+----------------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                                      |
+---------+------+----------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1810 | InnoDB: IO Read error: (2, No such file or directory) Error opening './test/payment.cfg', will attempt to import without schema verification |
+---------+------+----------------------------------------------------------------------------------------------------------------------------------------------+

That’s it – the data is recovered and the payment table is accessible on the 5.6 sandbox.

Now check whether the data exists in the payment table on the sandbox:

mysql -umsandbox -pmsandbox --host=127.0.0.1 --port=5614 -e "select count(*) from test.payment"
+----------+
| count(*) |
+----------+
|    16049 |
+----------+
mysql -umsandbox -pmsandbox --host=127.0.0.1 --port=5614 -e "select * from test.payment limit 1\G"
*************************** 1. row ***************************
  payment_id: 1
 customer_id: 1
    staff_id: 1
   rental_id: 76
      amount: 2.99
payment_date: 0000-00-09 03:49:32
 last_update: 2028-02-08 12:32:35

Exists.

So dump it from the sandbox and restore it on 5.5:

dump from 5.6

mysqldump -umsandbox -pmsandbox --host=127.0.0.1 --port=5614 --add-drop-table test payment > ~/payment_dump.sql

restore to 5.5

mysql -u user -p  < ~/payment_dump.sql

Check whether the data exists on 5.5:

mysql -u root -e "select * from test.payment limit 3;"
+------------+-------------+----------+-----------+--------+---------------------+---------------------+
| payment_id | customer_id | staff_id | rental_id | amount | payment_date        | last_update         |
+------------+-------------+----------+-----------+--------+---------------------+---------------------+
|          1 |           1 |        1 |        76 |   2.99 | 0000-00-09 03:49:32 | 2028-02-08 12:32:35 |
|          2 |           1 |        1 |       573 |   0.99 | 0000-00-09 03:49:32 | 0000-00-00 00:00:00 |
|          3 |           1 |        1 |      1185 |   5.99 | 0000-00-09 03:49:37 | 0000-00-00 00:00:00 |
+------------+-------------+----------+-----------+--------+---------------------+---------------------+

While working on this case I got into a situation in which dropping the payment table on 5.5 wasn’t possible because the payment.ibd file there wasn’t correct – the server crashed each time I tried to access the table. The workaround is:
– stop the server
– remove the .ibd file
– start the server
– drop the table as usual with the DROP TABLE command
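A rough sketch of those steps in shell form (the service name, datadir path, and credentials are assumptions based on a default CentOS layout, not taken from the original post):

```shell
# Stop the server so InnoDB releases the files:
sudo service mysql stop
# Remove the broken tablespace file for test.payment:
sudo rm /var/lib/mysql/test/payment.ibd
# Start the server again; the table definition (.frm) is still present:
sudo service mysql start
# Now the table can be dropped normally:
mysql -u root -p -e "DROP TABLE test.payment"
```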

The post How to recover an orphaned .ibd file with MySQL 5.6 appeared first on MySQL Performance Blog.

Jul 05, 2013
--

Schema changes – what’s new in MySQL 5.6?

Among the many improvements you can enjoy in MySQL 5.6, there is one that addresses a huge operational problem that most DBAs and system administrators encounter in their life: schema changes.

While it is usually not a problem for small tables, or those in the early stages of a product’s life cycle, schema changes become a huge pain once your tables accumulate a significant amount of data. Planning for maintenance becomes more and more difficult: your worldwide users want the service up and running 24/7, while on the other hand your developers want to introduce schema changes every week.

PITA

But what is the real problem here? Let me illustrate a very typical case:

Session1> ALTER TABLE revision ADD COLUMN mycol tinyint;
Query OK, 1611193 rows affected (1 min 5.74 sec)
Records: 1611193  Duplicates: 0  Warnings: 0
Session2> INSERT INTO revision SET rev_page=3,rev_comment="test",rev_text_id=1;
Query OK, 1 row affected (48.30 sec)
Session3 > show processlist;
+----+------+-----------+------+---------+------+---------------------------------+----------------------------------------------------------------------+
| Id | User | Host      | db   | Command | Time | State                           | Info                                                                 |
+----+------+-----------+------+---------+------+---------------------------------+----------------------------------------------------------------------+
|  1 | root | localhost | test | Query   |   47 | copy to tmp table               | ALTER TABLE revision ADD COLUMN mycol tinyint                        |
|  2 | root | localhost | test | Query   |   30 | Waiting for table metadata lock | INSERT INTO revision SET rev_page=3,rev_comment="test",rev_text_id=1 |
|  3 | root | localhost | NULL | Query   |    0 | init                            | show processlist                                                     |
+----+------+-----------+------+---------+------+---------------------------------+----------------------------------------------------------------------+
3 rows in set (0.01 sec)

What you see above is how changing a table’s structure works in all pre-5.6 versions of MySQL. Normally the example INSERT statement completes in a matter of microseconds. But once a DBA runs this ALTER TABLE (session 1), the application (session 2) has to wait for the ALTER to complete before the INSERT and other DML statements can succeed. As a result, the application will stall on writes to this table until the ALTER TABLE completes.

WORKAROUNDS

There are various more or less complicated approaches to the ALTER TABLE problem in MySQL, ranging from a simple, well-planned downtime window, through master/slave switching techniques, to advanced tools that let you make the change in the least intrusive way possible. These tools have been out there for years, yet it appears that many MySQL users are still not aware of them. Let me just name pt-online-schema-change from Percona Toolkit, oak-online-alter-table from Openark Kit, and Facebook OSC.

LONG STORY

Historically, MySQL had to perform a full table copy for all DDL (Data Definition Language) operations, and you could not write to a table being altered while the operation was in progress. The problem is even more painful because the waiting writes can in turn block subsequent reads.

An important, though small, step forward was made for the InnoDB storage engine when a new InnoDB version (a.k.a. the InnoDB plugin) was introduced for MySQL 5.1. Since then, you can create and drop indexes without copying the whole table’s contents. So at least the operation practiced most often for query optimization can be done much, much quicker.

The next significant improvement (or perhaps I should say fix) in the area of DDL operations was introduced in MySQL 5.5 as Metadata Locking. Basically, from then on, schema changes work properly and consistently with transactions. You will find this explained in detail here.

ONLINE(!) DDL in MySQL 5.6

Fortunately, this is not where the much-desired evolution in this area stopped! With MySQL 5.6, a huge step forward was made: from now on, most ALTER types won’t block writes to the table being changed!

Another improvement is that, in addition to the existing instant ALTER implementations (like changing the default value for a column), you can now also perform the following operations without a full table copy:

  • Change the auto-increment value for a column
  • Rename a column (if the data type remains the same)*
  • Add/Drop a foreign key constraint

As said before, though, the main improvement in MySQL 5.6 is that during an ALTER operation the table being changed can still be fully accessed by clients, so both reads and writes (with a few exceptions) are allowed! The complete statement matrix can be found here.

Let’s see some examples in practice

Example 1 – reset auto-increment value for a column

It can happen that a wrong explicit insert sets the value of an auto-increment column higher than necessary. It may even be close to the data type’s limit, and we want the table to go back to using lower values. We realize that and delete the high-value row, but…

Session1> SELECT t.TABLE_NAME,c.COLUMN_NAME,c.COLUMN_KEY,t.AUTO_INCREMENT FROM information_schema.tables t JOIN information_schema.columns c USING (TABLE_SCHEMA,TABLE_NAME) WHERE t.TABLE_SCHEMA="test" AND t.TABLE_NAME="revision" AND c.EXTRA="auto_increment";
+------------+-------------+------------+----------------+
| TABLE_NAME | COLUMN_NAME | COLUMN_KEY | AUTO_INCREMENT |
+------------+-------------+------------+----------------+
| revision   | rev_id      | PRI        |        2000002 |
+------------+-------------+------------+----------------+
1 row in set (0.03 sec)
Session1> select max(rev_id) from revision;
+-------------+
| max(rev_id) |
+-------------+
| 1700000     |
+-------------+
1 row in set (0.00 sec)
Session1> insert into revision set rev_page=3,rev_comment="test",rev_text_id=1;
Query OK, 1 row affected (0.02 sec)
Session1> select max(rev_id) from revision;
+-------------+
| max(rev_id) |
+-------------+
| 2000002     |
+-------------+
1 row in set (0.00 sec)

That’s not possible, since the table’s next AUTO_INCREMENT value is still 2000002. So this ALTER is our only help:

Session1> ALTER TABLE revision AUTO_INCREMENT=1700001;
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Session1> insert into revision set rev_page=4,rev_comment="test",rev_text_id=1;
Query OK, 1 row affected (0.01 sec)
Session1> select max(rev_id) from revision;
+-------------+
| max(rev_id) |
+-------------+
|     1700001 |
+-------------+

Finally, this operation is instant in MySQL 5.6! In previous MySQL versions such an ALTER caused a full table rewrite and blocked the table for writes for the whole duration of the process. In version 5.5.31 the same ALTER on the same hardware looks like this:

Session1> ALTER TABLE revision AUTO_INCREMENT=1700001;
Query OK, 1611226 rows affected (1 min 3.42 sec)
Records: 1611226  Duplicates: 0  Warnings: 0

Example 2 – DROP COLUMN

Session1> ALTER TABLE revision DROP COLUMN rev_sha1;
Query OK, 0 rows affected (1 min 39.24 sec)
Records: 0 Duplicates: 0 Warnings: 0

While the ALTER is in progress:

Session2> insert into revision set rev_page=3,rev_comment="test",rev_text_id=1,rev_sha1=2;
Query OK, 1 row affected (0.01 sec)

After the ALTER finished:

Session2> insert into revision set rev_page=3,rev_comment="test",rev_text_id=1,rev_sha1=2;
ERROR 1054 (42S22): Unknown column 'rev_sha1' in 'field list'

Great! The column drop was non-blocking – we could use the table without interruption.

Example 3 – RENAME COLUMN

Original column definition was:
rev_deleted tinyint(1) unsigned NOT NULL DEFAULT '0',

Session1> ALTER TABLE revision CHANGE COLUMN rev_deleted rev_deleted1 tinyint(1) unsigned NOT NULL DEFAULT '0';
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

(The same column rename in MySQL 5.5 or earlier copies the whole table.)

But let’s try another column:
rev_timestamp char(14) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL DEFAULT '',

Session1> ALTER TABLE revision CHANGE COLUMN rev_timestamp rev_date char(14) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL DEFAULT '';
Query OK, 1611226 rows affected (1 min 43.34 sec)
Records: 1611226  Duplicates: 0  Warnings: 0

What the heck?

Hmm, let’s try another one:
rev_comment tinyblob NOT NULL,

Session1> ALTER TABLE revision CHANGE COLUMN rev_comment rev_comment1 tinyblob NOT NULL;
Query OK, 1611226 rows affected (2 min 7.91 sec)
Records: 1611226  Duplicates: 0  Warnings: 0

So, in the course of writing this blog post I identified a new bug and reported it here: http://bugs.mysql.com/bug.php?id=69580
In short, Online DDL does not work as expected when you rename a column of a binary data type, and also for a char-type column with a binary collation. The bug not only leads to a full table copy but also blocks the table for writes.

Example 4 – NEW ALTER TABLE OPTIONS

In case you are not sure whether an ALTER TABLE will copy and/or block the table, and you want to make sure your DDL statement won’t cause such problems, you can add two new alter specifications to the statement: ALGORITHM and LOCK.

Session1> ALTER TABLE revision ENGINE=InnoDB, LOCK=NONE;
ERROR 1846 (0A000): LOCK=NONE is not supported. Reason: COPY algorithm requires a lock. Try LOCK=SHARED.

As we can see, a full table rebuild still needs locking.

Session1> ALTER TABLE revision ADD COLUMN mycol3 tinyint, ALGORITHM=COPY, LOCK=NONE;
ERROR 1846 (0A000): LOCK=NONE is not supported. Reason: COPY algorithm requires a lock. Try LOCK=SHARED.

So, ALGORITHM=COPY and LOCK=NONE are mutually exclusive.

Session1> ALTER TABLE revision MODIFY COLUMN rev_id bigint(8) unsigned NOT NULL AUTO_INCREMENT, ALGORITHM=INPLACE;
ERROR 1846 (0A000): ALGORITHM=INPLACE is not supported. Reason: Cannot change column type INPLACE. Try ALGORITHM=COPY.

Changing a column’s data type is both locking and requires a full table copy.

Session1> ALTER TABLE revision ADD COLUMN mycol3 tinyint, ALGORITHM=INPLACE, LOCK=NONE;
Query OK, 0 rows affected (1 min 38.84 sec)
Records: 0  Duplicates: 0  Warnings: 0

The above example is pretty interesting: the new in-place method is used, so no full table copy takes place; however, the whole operation is still about as expensive as the old way. This is because significant data reorganization is needed inside the existing .ibd tablespace.

The ALGORITHM and LOCK options also allow you to force the old method, with a full table copy, by adding ALGORITHM=COPY, or to protect a table from writes with LOCK=SHARED. The same effect can be achieved with SET old_alter_table=1 before the ALTER.
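For instance, a sketch of forcing the old behavior on the example table from this post (the new column names are hypothetical):

```sql
-- Explicitly request the pre-5.6 table-copy method; the shared lock blocks writes:
ALTER TABLE revision ADD COLUMN mycol5 tinyint, ALGORITHM=COPY, LOCK=SHARED;

-- Equivalent effect via the legacy variable:
SET old_alter_table=1;
ALTER TABLE revision ADD COLUMN mycol6 tinyint;
```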

NEW DIAGNOSTICS

I would like to mention that MySQL 5.6 offers new monitoring tools that we can use to see more details about ongoing ALTERs.

This is how it looks while an ALTER adding a column is in progress:

Session3> SELECT * FROM information_schema.innodb_metrics WHERE name LIKE 'ddl%'\G
*************************** 1. row ***************************
NAME: ddl_background_drop_indexes
SUBSYSTEM: ddl
COUNT: 0
MAX_COUNT: NULL
MIN_COUNT: NULL
AVG_COUNT: 0
COUNT_RESET: 0
MAX_COUNT_RESET: NULL
MIN_COUNT_RESET: NULL
AVG_COUNT_RESET: NULL
TIME_ENABLED: 2013-06-04 09:45:36
TIME_DISABLED: NULL
TIME_ELAPSED: 175015
TIME_RESET: NULL
STATUS: enabled
TYPE: counter
COMMENT: Number of indexes waiting to be dropped after failed index creation
*************************** 2. row ***************************
NAME: ddl_background_drop_tables
SUBSYSTEM: ddl
COUNT: 0
MAX_COUNT: NULL
MIN_COUNT: NULL
AVG_COUNT: 0
COUNT_RESET: 0
MAX_COUNT_RESET: NULL
MIN_COUNT_RESET: NULL
AVG_COUNT_RESET: NULL
TIME_ENABLED: 2013-06-04 09:45:36
TIME_DISABLED: NULL
TIME_ELAPSED: 175015
TIME_RESET: NULL
STATUS: enabled
TYPE: counter
COMMENT: Number of tables in background drop table list
*************************** 3. row ***************************
NAME: ddl_online_create_index
SUBSYSTEM: ddl
COUNT: 1
MAX_COUNT: NULL
MIN_COUNT: 0
AVG_COUNT: 0.000005713795960346256
COUNT_RESET: 1
MAX_COUNT_RESET: NULL
MIN_COUNT_RESET: 0
AVG_COUNT_RESET: NULL
TIME_ENABLED: 2013-06-04 09:45:36
TIME_DISABLED: NULL
TIME_ELAPSED: 175015
TIME_RESET: NULL
STATUS: enabled
TYPE: counter
COMMENT: Number of indexes being created online
*************************** 4. row ***************************
NAME: ddl_pending_alter_table
SUBSYSTEM: ddl
COUNT: 1
MAX_COUNT: NULL
MIN_COUNT: 0
AVG_COUNT: 0.000005713795960346256
COUNT_RESET: 1
MAX_COUNT_RESET: NULL
MIN_COUNT_RESET: 0
AVG_COUNT_RESET: NULL
TIME_ENABLED: 2013-06-04 09:45:36
TIME_DISABLED: NULL
TIME_ELAPSED: 175015
TIME_RESET: NULL
STATUS: enabled
TYPE: counter
COMMENT: Number of ALTER TABLE, CREATE INDEX, DROP INDEX in progress
4 rows in set (0.01 sec)

To enable those counters, simply do:

Session3> set global innodb_monitor_enable = module_ddl;
Query OK, 0 rows affected (0.00 sec)

IS ONLINE DDL GOOD ENOUGH IN MySQL 5.6?

I would not say that. Although schema changes are now a lot friendlier, there still seems to be room for improvement in terms of becoming even more “online”. And there is another aspect – we can’t really control how Online DDL behaves in terms of server I/O load and replication. It is often the case that a full table rewrite generates high enough disk utilization to cause performance problems, yet we cannot do anything about this with MySQL’s Online DDL – it will copy the table at full available speed whether or not it saturates I/O throughput. Here tools like pt-online-schema-change have a big advantage: the tool divides the table copy into chunks, automatically checks a given status variable to see whether the load on the server is too high, and pauses the copy process if so.

When we speak about replication: with Online DDL each ALTER is a single operation, so once the master finishes it, it gets replicated to the slaves. This causes replication lag while each slave applies the long ALTER, as well as increased I/O activity on all the slaves at the same time. In this regard, too, pt-online-schema-change’s chunked copy model has a huge advantage: the schema change is replicated to the slaves as it proceeds, and the tool checks how far the slaves are lagging and pauses the copy process if the lag gets too big. The size of the data chunks is also adjusted automatically to fit within a given time limit.

I was curious to compare the time required for the same ALTER with Online DDL versus pt-online-schema-change:

[root@centos6-2 ~]# time pt-online-schema-change D=test,t=test --alter "ADD COLUMN mycol4 bigint" --execute
Operation, tries, wait:
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Altering `test`.`test`...
Creating new table...
Created new table test._test_new OK.
Altering new table...
Altered `test`.`_test_new` OK.
Creating triggers...
Created triggers OK.
Copying approximately 952180 rows...
Copied rows OK.
Swapping tables...
Swapped original and new tables OK.
Dropping old table...
Dropped old table `test`.`_test_old` OK.
Dropping triggers...
Dropped triggers OK.
Successfully altered `test`.`test`.
real    0m22.531s
user    0m0.342s
sys     0m0.095s

versus:

Session1> ALTER TABLE test ADD COLUMN mycol4 bigint, ALGORITHM=INPLACE, LOCK=NONE;
Query OK, 0 rows affected (30.64 sec)
Records: 0  Duplicates: 0  Warnings: 0

pt-online-schema-change wins here clearly, but it’s just one simple example out of many possible scenarios!

OVERHEAD

The fact that a table accepts writes while Online DDL is in progress does not mean those writes will be performed at the same speed.

I ran a simple sysbench test to compare the overall throughput of the MySQL server when the server is idle versus when an online ALTER is in progress. Before the test I warmed up the InnoDB buffers and ran the sysbench test several times. All the data for the sysbench table fit in memory.

Then I started an ALTER statement (on an unrelated table):

Session1> ALTER TABLE revision ADD COLUMN mycol4 tinyint, ALGORITHM=INPLACE, LOCK=SHARED;
Query OK, 0 rows affected (1 min 54.73 sec)
Records: 0  Duplicates: 0  Warnings: 0

And just after that, the same sysbench command:

[root@centos6-2 ~]# sysbench --num-threads=3 --max-requests=5000 --db-driver=mysql --test=oltp --mysql-table-engine=InnoDB --mysql-db=test --mysql-user=root run
sysbench 0.4.12:  multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 3
Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Maximum number of requests for OLTP test is limited to 5000
Threads started!
...

Below we can see concurrent threads:

Session3 > show processlist;
+----+------+-----------+------+---------+------+----------------+--------------------------------------------------------------------------------+
| Id | User | Host      | db   | Command | Time | State          | Info                                                                           |
+----+------+-----------+------+---------+------+----------------+--------------------------------------------------------------------------------+
|  1 | root | localhost | test | Sleep   |  282 |                | NULL                                                                           |
|  2 | root | localhost | test | Query   |   29 | altering table | ALTER TABLE revision ADD COLUMN mycol4 tinyint, ALGORITHM=INPLACE, LOCK=SHARED |
|  5 | root | localhost | NULL | Query   |    0 | init           | show processlist                                                               |
| 39 | root | localhost | test | Execute |    0 | Writing to net | SELECT c from sbtest where id=?                                                |
| 40 | root | localhost | test | Execute |    0 | Writing to net | DELETE from sbtest where id=?                                                  |
| 41 | root | localhost | test | Execute |    0 | Writing to net | SELECT c from sbtest where id=?                                                |
+----+------+-----------+------+---------+------+----------------+--------------------------------------------------------------------------------+
6 rows in set (0.00 sec)

To be clear, sysbench was started after the ALTER began, and finished before the ALTER was done.

Result? On an idle MySQL instance the sysbench test scores around 270 transactions per second, with minimal variation across repeated runs.
While the Online DDL on the unrelated table was in progress, the same test scored an average of 110 transactions per second, again with minimal variation across repeated runs.

In the next test I altered the sbtest table – the same one used by the sysbench test – hence there was the additional overhead of recording all the changes made to that table during the ALTER.

Session1> ALTER TABLE sbtest ADD COLUMN mycol4 tinyint, ALGORITHM=INPLACE, LOCK=NONE;
Query OK, 0 rows affected (1 min 5.88 sec)
Records: 0  Duplicates: 0  Warnings: 0

Again, sysbench finished before the ALTER, and this time the score was 90 transactions per second.
Note, though, that this was only a very simple test of one example use case.

BUGS

At the end I’d like to mention bugs regarding Online DDL that, in my opinion, are important to be aware of. Fortunately, there are not many of them. The following were active as of version 5.6.12:

http://bugs.mysql.com/bug.php?id=69151 – in some circumstances, statements needing a table copy (like adding a column) create the temporary table in tmpdir instead of the database directory

http://bugs.mysql.com/bug.php?id=69444 – Replication is not crash safe with DDL statements

http://bugs.mysql.com/bug.php?id=69580 – the bug around column rename that I just reported and mentioned above

CONCLUSION

I think schema changes in MySQL 5.6 are really much less painful out of the box, plus we can still choose external tools if needed. Having said that, MySQL operators are in much better shape now. It is really good to see MySQL’s evolution continuing in the right direction.

The post Schema changes – what’s new in MySQL 5.6? appeared first on MySQL Performance Blog.

May 31, 2013
--

The small improvements of MySQL 5.6: Duplicate Index Detection

Here at the MySQL Performance Blog, we’ve been discussing the several new features that MySQL 5.6 brought: GTID-based replication, InnoDB Fulltext, Memcached integration, a more complete Performance Schema, Online DDL, and several other InnoDB and query optimizer improvements. However, I plan to write a series of posts focusing on the small but handy improvements – changes and bug corrections – in MySQL 5.6 that can make our lives easier and that have passed almost unnoticed by most (not all) DBAs.

Duplicate Index Detection

I commented on this in my last webinar, but did not have time to analyze it in depth. If you try to do something like this in MySQL 5.5, you will succeed without errors or warnings:

mysql> ALTER TABLE test ADD INDEX (col2);
Query OK, 0 rows affected (0.22 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> ALTER TABLE test ADD INDEX (col2);
Query OK, 0 rows affected (0.25 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> SHOW CREATE TABLE test\G
*************************** 1. row ***************************
       Table: test
Create Table: CREATE TABLE `test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `col2` int(11) DEFAULT NULL,
  `col3` varchar(200) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `col2` (`col2`),
  KEY `col2_2` (`col2`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

In previous versions of MySQL, you can create two indexes with the same columns (in the same order) and the server will not complain.

If we execute the same statements in MySQL 5.6, the second ALTER will also succeed (and the index will be created), but we will get a warning (a note, to be exact):

mysql> ALTER TABLE test ADD INDEX (col2);
Query OK, 0 rows affected (0.36 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> ALTER TABLE test ADD INDEX (col2);
Query OK, 0 rows affected, 1 warning (0.56 sec)
Records: 0  Duplicates: 0  Warnings: 1
mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1831
Message: Duplicate index 'col2_2' defined on the table 'test.test'. This is deprecated and will be disallowed in a future release.
1 row in set (0.00 sec)

As the message correctly points out, this is a human mistake: it wastes resources that could potentially impact performance, and it should be avoided. By the way, a good practice to avoid this is to always name your keys with a consistent pattern. This new behavior was introduced in 5.6.7 with the closing of this bug (although this was initially accepted as a bug as early as 2005!).
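To illustrate that last point: when indexes carry explicit, consistent names, an accidental duplicate fails outright with a duplicate-key-name error instead of silently creating a col2_2 index (index name below is just an example of such a pattern):

mysql> ALTER TABLE test ADD INDEX idx_test_col2 (col2);
Query OK, 0 rows affected (0.30 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> ALTER TABLE test ADD INDEX idx_test_col2 (col2);
ERROR 1061 (42000): Duplicate key name 'idx_test_col2'

This protection works on any MySQL version, not just 5.6.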

The report explains in more detail what the “will be disallowed in a future release” means. In MySQL 5.7 the checks will be stricter: in the default SQL mode, a duplicate index will throw a warning instead of a note; in strict mode, it will throw an error and the second ALTER will fail, preventing the creation of the duplicate index.

Does this mean that tools like pt-duplicate-key-checker will no longer be necessary for MySQL 5.6? Let’s have a look at the code implementing this feature. The warning is only thrown if the index was not created automatically, is not a foreign key, and has exactly the same column definition in the same order. In other words, it checks for duplicate keys, but not for redundant ones. What is the difference? Let’s see an example. If we execute:

mysql> ALTER TABLE test ADD INDEX (col2);
Query OK, 0 rows affected (0.34 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> ALTER TABLE test ADD INDEX (col2, col3);
Query OK, 0 rows affected (0.39 sec)
Records: 0  Duplicates: 0  Warnings: 0

We get no warnings and no errors in 5.6, as the indexes are different. But as you may know, we can use the second index not only for filtering on both columns, but also for filtering by just the first one. Why doesn’t MySQL enforce this kind of constraint? For several reasons: first, because it would break 99% of the applications out there that use MySQL, in which we at Percona routinely find redundant indexes. And second, because in some cases what at first seems to be a redundant index effectively is not – for example, if one of the two indexes is unique or backs a foreign key.
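We can see the left-prefix rule in action with EXPLAIN: after dropping the single-column index, the optimizer simply switches to the composite one for the same query (output abridged; key names follow the example above):

mysql> ALTER TABLE test DROP INDEX col2;
mysql> EXPLAIN SELECT * FROM test WHERE col2 = 5\G
*************************** 1. row ***************************
...
possible_keys: col2_2
          key: col2_2
...

This is why dropping the shorter index costs us nothing: the (col2, col3) index serves the same queries.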

This is the output of our tool when run on the same table, correctly identifying the redundancy:

$ pt-duplicate-key-checker --tables test.test
# ########################################################################
# test.test
# ########################################################################
# col2 is a left-prefix of col2_2
# Key definitions:
#   KEY `col2` (`col2`),
#   KEY `col2_2` (`col2`,`col3`)
# Column types:
#         `col2` int(11) default null
#         `col3` varchar(200) default null
# To remove this duplicate index, execute:
ALTER TABLE `test`.`test` DROP INDEX `col2`;
# ########################################################################
# Summary of indexes
# ########################################################################
# Size Duplicate Indexes   5
# Total Duplicate Indexes  1
# Total Indexes            3

Additionally, pt-duplicate-key-checker will detect subtle redundancies that are engine-dependent, like redundant suffixes for secondary keys in InnoDB. As some of these redundancies could be necessary, depending on the query optimizer and the MySQL version, we always recommend manually reviewing the optimizations proposed by Percona Toolkit. The MySQL server, of course, cannot risk blocking all such cases outright.

Oracle recently introduced a set of MySQL utilities that includes mysqlindexcheck, a tool similar to pt-duplicate-key-checker, but it does not detect all cases. For example:

mysql> alter table test add index redundant (col2, id);
Query OK, 0 rows affected (1.57 sec)
Records: 0  Duplicates: 0  Warnings: 0
$ mysqlindexcheck --server=user:pass@localhost test.test
# Source on localhost: ... connected.
$ pt-duplicate-key-checker --tables test.test
# ########################################################################
# test.test
# ########################################################################
# Key redundant ends with a prefix of the clustered index
# Key definitions:
#   KEY `redundant` (`col2`,`id`)
#   PRIMARY KEY (`id`),
# Column types:
#         `col2` int(11) default null
#         `id` int(11) not null auto_increment
# To shorten this duplicate clustered index, execute:
ALTER TABLE `test`.`test` DROP INDEX `redundant`, ADD INDEX `redundant` (`col2`);
# ########################################################################
# Summary of indexes
# ########################################################################
# Size Duplicate Indexes   9
# Total Duplicate Indexes  1
# Total Indexes            2

By the way, if you want to get more familiar with this and other particularities of the latest MySQL GA release, have a look at our upcoming sessions for the “Moving to 5.6” training course in America (Austin, San Jose) and Europe (Manchester, Utrecht).

The post The small improvements of MySQL 5.6: Duplicate Index Detection appeared first on MySQL Performance Blog.
