Sep
09
2013
--

How InnoDB promotes UNIQUE constraints

The other day I was running pt-duplicate-key-checker on behalf of a customer and noticed some peculiar recommendations on an InnoDB table with an odd structure (no PRIMARY key, but multiple UNIQUE constraints). This got me thinking about how InnoDB promotes UNIQUE constraints to the role of PRIMARY KEYs. The documentation is pretty clear:

[DOCS]
When you define a PRIMARY KEY on your table, InnoDB uses it as the clustered index. Define a primary key for each table that you create. If there is no logical unique and non-null column or set of columns, add a new auto-increment column, whose values are filled in automatically.

If you do not define a PRIMARY KEY for your table, MySQL locates the first UNIQUE index where all the key columns are NOT NULL and InnoDB uses it as the clustered index.

If the table has no PRIMARY KEY or suitable UNIQUE index, InnoDB internally generates a hidden clustered index on a synthetic column containing row ID values. The rows are ordered by the ID that InnoDB assigns to the rows in such a table. The row ID is a 6-byte field that increases monotonically as new rows are inserted. Thus, the rows ordered by the row ID are physically in insertion order.
[/DOCS]

But it really leaves a number of outstanding questions, the most pressing of which (in my mind) is whether or not the column used in the clustered index can change over time or if it is set and remains at table creation time.

Let’s take the following example, using MySQL 5.6.11:


mysql> CREATE TABLE t1 (
-> c1 char(10),
-> c2 char(10),
-> KEY (c2)
-> ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.14 sec)

mysql> INSERT INTO t1 VALUES (‘abcdefghij’,’klmnopqrst’);
Query OK, 1 row affected (0.00 sec)

mysql> EXPLAIN SELECT c1, c2 FROM t1 WHERE c2='klmnopqrst'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t1
type: ref
possible_keys: c2
key: c2
key_len: 11
ref: const
rows: 1
Extra: Using index condition
1 row in set (0.00 sec)

We can clearly confirm here that there is no candidate column to use as the clustered index. So, if we add a UNIQUE constraint, will it be used?


mysql> ALTER TABLE t1 ADD UNIQUE (c1);
Query OK, 0 rows affected (0.15 sec)
Records: 0 Duplicates: 0 Warnings: 0

 

mysql> EXPLAIN SELECT c1, c2 FROM t1 WHERE c2='klmnopqrst'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t1
type: ref
possible_keys: c2
key: c2
key_len: 11
ref: const
rows: 1
Extra: Using index condition
1 row in set (0.00 sec)

No. Because InnoDB will only use it as the clustered index if all of the key columns are NOT NULL. So, let’s see if that will change:


mysql> ALTER TABLE t1 CHANGE COLUMN c1 c1 CHAR(10) NOT NULL;
Query OK, 0 rows affected, 1 warning (0.23 sec)
Records: 0 Duplicates: 0 Warnings: 1

 

mysql> EXPLAIN SELECT c1, c2 FROM t1 WHERE c2='klmnopqrst'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t1
type: ref
possible_keys: c2
key: c2
key_len: 11
ref: const
rows: 1
Extra: Using where; Using index
1 row in set (0.00 sec)

Fantastic. InnoDB will seemingly re-evaluate candidates for clustered index as the result of each alter statement. What happens if we add another candidate? Will InnoDB keep the original choice or update again?


mysql> ALTER TABLE t1 ADD COLUMN c3 char(10) not null;
Query OK, 0 rows affected, 1 warning (0.21 sec)
Records: 0 Duplicates: 0 Warnings: 1

mysql> UPDATE t1 SET c3=’uvwxyz1234′;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0

mysql> ALTER TABLE t1 ADD UNIQUE (c3);
Query OK, 0 rows affected (0.24 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> EXPLAIN SELECT c1, c2 FROM t1 WHERE c2=’klmnopqrst’\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t1
type: ref
possible_keys: c2
key: c2
key_len: 11
ref: const
rows: 1
Extra: Using where; Using index
1 row in set (0.00 sec)

mysql> EXPLAIN SELECT c3, c2 FROM t1 WHERE c2='klmnopqrst'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t1
type: ref
possible_keys: c2
key: c2
key_len: 11
ref: const
rows: 1
Extra: Using index condition
1 row in set (0.00 sec)

It seems that InnoDB kept the original clustered index. This is expected because MySQL will choose the first index that fits it’s criteria (UNIQUE, NOT NULL) and c1 is the first column declared in the table definition. Let’s change that:


mysql> ALTER TABLE t1 DROP COLUMN c3;
Query OK, 0 rows affected (0.27 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> ALTER TABLE t1 ADD COLUMN c3 char(10) not null FIRST;
Query OK, 0 rows affected (0.26 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> UPDATE t1 SET c3=’uvwxyz1234′;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

mysql> ALTER TABLE t1 ADD UNIQUE (c3);
Query OK, 0 rows affected (0.18 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> EXPLAIN SELECT c1, c2 FROM t1 WHERE c2=’klmnopqrst’\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t1
type: ref
possible_keys: c2
key: c2
key_len: 11
ref: const
rows: 1
Extra: Using where; Using index
1 row in set (0.00 sec)

mysql> EXPLAIN SELECT c3, c2 FROM t1 WHERE c2=’klmnopqrst’\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t1
type: ref
possible_keys: c2
key: c2
key_len: 11
ref: const
rows: 1
Extra: Using index condition
1 row in set (0.00 sec)

mysql> SHOW CREATE TABLE t1\G
*************************** 1. row ***************************
Table: t1
Create Table: CREATE TABLE `t1` (
`c3` char(10) NOT NULL,
`c1` char(10) NOT NULL,
`c2` char(10) DEFAULT NULL,
UNIQUE KEY `c1` (`c1`),
UNIQUE KEY `c3` (`c3`),
KEY `c2` (`c2`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

So in this case, InnoDB kept using c1 as the clustered index even though c3 comes first in the table definition. If you look at the order of UNIQUE KEYs, you see that c1 remains first. You can always tell the index that is used as the clustered index because it appears first in the list of UNIQUE KEYs. Here is an ALTER statement verifying, by changing c1 to default to NULL, it moves down in the UNIQUE index list:


mysql> show create table t1\G
*************************** 1. row ***************************
Table: t1
Create Table: CREATE TABLE `t1` (
`c1` char(10) NOT NULL,
`c2` char(10) DEFAULT NULL,
`c3` char(10) NOT NULL,
UNIQUE KEY `c1` (`c1`),
UNIQUE KEY `c3` (`c3`),
KEY `c2` (`c2`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> ALTER TABLE t1 CHANGE COLUMN c1 c1 char(10);
Query OK, 0 rows affected (0.32 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> show create table t1\G
*************************** 1. row ***************************
Table: t1
Create Table: CREATE TABLE `t1` (
`c1` char(10) DEFAULT NULL,
`c2` char(10) DEFAULT NULL,
`c3` char(10) NOT NULL,
UNIQUE KEY `c3` (`c3`),
UNIQUE KEY `c1` (`c1`),
KEY `c2` (`c2`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

So, the moral of the story is that InnoDB will choose the first created NOT NULL UNIQUE indexed column for clustered index (although this is not “exposed” anywhere aside from the SHOW CREATE TABLE order or EXPLAIN), and that is re-evaluated at each ALTER statement (indeed, the decision about promoting a UNIQUE index to the clustered index is not even stored anywhere on disk; it is re-evaluated each time InnoDB loads the table into the data dictionary cache, even if the table structure has not been changed by ALTER TABLE [via Alexey Kopytov]). Even though it behaves deterministically, it is still much clearer to declare a column as the PRIMARY KEY instead of relying on this behavior.

The post How InnoDB promotes UNIQUE constraints appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com