Feb
24
2015
--

Is MySQL’s innodb_file_per_table slowing you down?

MySQL’s innodb_file_per_table is a wonderful thing – most of the time. Having every table use its own .ibd file allows you to easily reclaim space when dropping or truncating tables. But in some use cases, it may cause significant performance issues.

Many of you in the audience are responsible for running automated tests on your codebase before deploying to production. If you are, then one of your goals is having tests run as quickly as possible so you can run them as frequently as possible. Often times you can change specific settings in your test environment that don’t affect the outcome of the test, but do improve throughput. This post discusses how innodb_file_per_table is one of those settings.

I recently spoke with a customer whose use case involved creating hundreds of tables on up to 16 schemas concurrently as part of a Jenkins testing environment. This was not in production, so performance was far more important than durability. They’d run their tests, and then drop the schemas. This process took close to 20 minutes. They asked “How can we make this faster?”

Due to the number of tables involved innodb_file_per_table seemed a likely culprit.

It’s been noted here on the MySQL Performance Blog that innodb_file_per_table can cause table creation and drops to slow down. But what exactly is the performance hit? We wanted to find out.

The innodb_file_per_table Test:

On a test server running CentOS release 6.5, xfs filesystem, and 5.6.22-71.0-log Percona Server, I ran the following homemade benchmark bash script:

[root@host ~]# time $(for db in {1..16};
do mysql -e "create database bench$db";
$(for tb in {1..500}; do $(mysql bench$db -e "create table tab${tb} (i int) engine=innodb"); done) & done)

If you open the mysql client in another screen or terminal, you should see something like this:

...
+-------+------+-----------+---------+---------+------+----------------+------------------------------------------+-----------+---------------+         
| Id    | User | Host      | db      | Command | Time | State          | Info                                     | Rows_sent | Rows_examined |         
+-------+------+-----------+---------+---------+------+----------------+------------------------------------------+-----------+---------------+         
| 80013 | root | localhost | NULL    | Query   |    0 | init           | show processlist                         |         0 |             0 |         
| 89462 | root | localhost | bench5  | Query   |    0 | creating table | create table tab95 (i int) engine=innodb |         0 |             0 |         
| 89466 | root | localhost | bench8  | Query   |    0 | creating table | create table tab81 (i int) engine=innodb |         0 |             0 |         
| 89467 | root | localhost | bench1  | Query   |    0 | creating table | create table tab91 (i int) engine=innodb |         0 |             0 |         
| 89468 | root | localhost | bench13 | Query   |    0 | creating table | create table tab90 (i int) engine=innodb |         0 |             0 |         
| 89469 | root | localhost | bench15 | Query   |    0 | creating table | create table tab90 (i int) engine=innodb |         0 |             0 |         
| 89472 | root | localhost | bench9  | Query   |    0 | creating table | create table tab86 (i int) engine=innodb |         0 |             0 |         
| 89473 | root | localhost | bench10 | Query   |    0 | creating table | create table tab94 (i int) engine=innodb |         0 |             0 |         
| 89474 | root | localhost | bench11 | Query   |    0 | creating table | create table tab80 (i int) engine=innodb |         0 |             0 |         
| 89475 | root | localhost | bench3  | Query   |    0 | creating table | create table tab80 (i int) engine=innodb |         0 |             0 |         
| 89476 | root | localhost | bench2  | Query   |    0 | creating table | create table tab82 (i int) engine=innodb |         0 |             0 |         
| 89478 | root | localhost | bench4  | Query   |    0 | creating table | create table tab91 (i int) engine=innodb |         0 |             0 |         
| 89479 | root | localhost | bench16 | Query   |    0 | creating table | create table tab88 (i int) engine=innodb |         0 |             0 |         
| 89481 | root | localhost | bench12 | Query   |    0 | creating table | create table tab90 (i int) engine=innodb |         0 |             0 |         
| 89483 | root | localhost | bench6  | Query   |    0 | creating table | create table tab96 (i int) engine=innodb |         0 |             0 |         
| 89484 | root | localhost | bench14 | Query   |    0 | creating table | create table tab95 (i int) engine=innodb |         0 |             0 |         
+-------+------+-----------+---------+---------+------+----------------+------------------------------------------+-----------+---------------+         
...        

After creating the tables, I dropped all schemas concurrently:

[root@host ~]# time $(for db in {1..16};
do mysql -e "drop database bench${db}" & done)

So what was the difference with innodb_file_per_table ON vs OFF?

  • With innodb_file_per_table=ON
    • Schema and table creation = 1m54.852s
    • Schema drops = 1m21.682s
  • With innodb_file_per_table=OFF
    • Schema and table creation = 0m59.968s
    • Schema drops = 0m54.870s

So creation time decreased by 48%, drop time decreased by 33%.

I think its worth noting that this benchmark creates and drops empty tables. Dropping InnoDB tables created with innodb_file_per_table=ON can take much longer if they have large amounts of data.

Please also be aware that there are always trade-offs when modifying your InnoDB settings. That is outside the scope of this post, so please research and test before making changes. The MySQL documentation discusses that here.  In 5.6.6 and up, innodb_file_per_table is ON by default. MySQL 5.6 will also create temp tables as InnoDB, as noted here.

So there you have it. If your primary goal is to improve create and drop table time, turning OFF innodb_file_per_table will save significant amounts of time.

The post Is MySQL’s innodb_file_per_table slowing you down? appeared first on MySQL Performance Blog.

Sep
24
2013
--

How to reclaim space in InnoDB when innodb_file_per_table is ON

When innodb_file_per_table is OFF and all data is going to be stored in ibdata files. If you drop some tables of delete some data then there is no any other way to reclaim that unused disk space except dump/reload method.

When Innodb_file_per_table is ON, each table stores data and indexes in it’s own tablespace file. However, the shared tablespace-ibdata1 can still grow and you can check more information here about why it grows and what are the solutions.

http://www.mysqlperformanceblog.com/2013/08/20/why-is-the-ibdata1-file-continuously-growing-in-mysql/

Following the recent blog post from Miguel Angel Nieto titled “Why is the ibdata1 file continuously growing in MySQL?“, and since this is a very common question for Percona Support, this post covers how to reclaim the space when we are using innodb_file_per_table. Also, I will show you how to do it without causing performance or availability problems with the help of our Percona Toolkit.

When you remove rows, they are just marked as deleted on disk but space will be consumed by InnoDB files which can be re-used later when you insert/update more rows but it will never shrink. Very old MySQL bug : http://bugs.mysql.com/bug.php?id=1341

But, if you are using innodb_file_per_table then you can reclaim the space by running OPTIMIZE TABLE on that table. OPTIMIZE TABLE will create a new identical empty table. Then it will copy row by row data from old table to the new one. In this process a new .ibd tablespace will be created and the space will be reclaimed

mysql> select count(*) From test;
+----------+
| count(*) |
+----------+
| 3145728 |
+----------+
root@nil:/var/lib/mysql/mydb# ll -h
...
-rw-rw---- 1 mysql mysql 168M Sep 5 11:52 test.ibd
mysql> delete from test limit 2000000;
mysql> select count(*) From test;
+----------+
| count(*) |
+----------+
| 1145728 |
+----------+
root@nil:/var/lib/mysql/mydb# ll -h
...
-rw-rw---- 1 mysql mysql 168M Sep 5 11:52 test.ibd

You can see that after deleting 2M records, the test.ibd size was 168M.

mysql> optimize table test;
+-----------+----------+----------+-------------------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-----------+----------+----------+-------------------------------------------------------------------+
| mydb.test | optimize | note | Table does not support optimize, doing recreate + analyze instead |
| mydb.test | optimize | status | OK |
+-----------+----------+----------+-------------------------------------------------------------------+
root@nil:/var/lib/mysql/mydb# ll -h
...
-rw-rw---- 1 mysql mysql 68M Sep 5 12:47 test.ibd

After OPTIMIZE, you will be able to reclaim the space. As you can see, test.ibd file size is decreased from 168M to 68M.

I would like to mention here that during that process the table will be locked.(Table locked for just Writes) Which can affect to the performance when you’ll have large table. So If you don’t want to lock the table then you can use one of the best utility by Percona, pt-online-schema-change. It can ALTER without locking tables. You can run ALTER TABLE with ENGINE=INNODB which will re-create the table and reclaim the space.

(It’s always recommended to use latest version of pt-* utilities)

mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
| 2991456 |
+----------+
mysql> delete from test limit 2000000;
mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
| 991456 |
+----------+
root@nil:/var/lib/mysql/mydb# ll -h
...
-rw-rw---- 1 mysql mysql 157M Sep 6 11:52 test.ibd
nilnandan@nil:~$ pt-online-schema-change --alter "ENGINE=InnoDB" D=mydb,t=test --execute
Operation, tries, wait:
copy_rows, 10, 0.25
create_triggers, 10, 1
drop_triggers, 10, 1
swap_tables, 10, 1
update_foreign_keys, 10, 1
Altering `mydb`.`test`...
Creating new table...
Created new table mydb._test_new OK.
Altering new table...
Altered `mydb`.`_test_new` OK.
2013-09-06T15:37:46 Creating triggers...
2013-09-06T15:37:47 Created triggers OK.
2013-09-06T15:37:47 Copying approximately 991800 rows...
Copying `mydb`.`test`: 96% 00:01 remain
2013-09-06T15:38:17 Copied rows OK.
2013-09-06T15:38:17 Swapping tables...
2013-09-06T15:38:18 Swapped original and new tables OK.
2013-09-06T15:38:18 Dropping old table...
2013-09-06T15:38:18 Dropped old table `mydb`.`_test_old` OK.
2013-09-06T15:38:18 Dropping triggers...
2013-09-06T15:38:18 Dropped triggers OK.
Successfully altered `mydb`.`test`.
root@nil:/var/lib/mysql/mydb# ll -h
...
-rw-rw---- 1 mysql mysql 56M Sep 6 15:38 test.ibd

Same here, you can notice that test.ibd file size decreased from 157M to 56M.

NOTE: Please make sure that you have ample space before you run pt-online-schema-change because it will create a temporary table that contains roughly the size of the original table. By running this on the primary node, the changes will be replicated to your slaves.

The post How to reclaim space in InnoDB when innodb_file_per_table is ON appeared first on MySQL Performance Blog.

May
12
2011
--

Connecting orphaned .ibd files

There are two ways InnoDB can organize tablespaces. First is when all data, indexes and system buffers are stored in a single tablespace. This is typicaly one or several ibdata files. A well known innodb_file_per_table option brings the second one. Tables and system areas are split into different files. Usually system tablespace is located in ibdata1 file and every InnoDB table has two files e.g.  actor.frm and actor.ibd.

The annoying thing about .ibd files you can’t easily copy the an .ibd file to another MySQL server. If you try to very often you’ll get an error in the log:

InnoDB: Error: tablespace id is 10 in the data dictionary
InnoDB: but in file ./sakila/actor.ibd it is 15!

However sometimes you have to connect the .ibd file to an alien ibdata1.

There are several situation when you have to:

1. ibdata1 is erroneously removed

2. ibdata1 is heavily corrupted and innodb_force_recovery doesn’t help

Chris Calender suggests two methods. The first is create/drop the table many times until space_id in InnoDB dictionary and .ibd file match. The second is to edit space_id inside .ibd file with a hex editor.

I would like to elaborate the second method.

But let’s understand first what’s going on and why InnoDB refuses to use suggested .ibd file.

There is an InnoDB dictionary. It consists of several internal tables. For our topic only SYS_TABLES and SYS_INDEXES are relevant. These are usual InnoDB tables, but they’re hidden from a user(you can see them in information_scheme database in Percona Server though).

The structure of these tables is following:

SYS_TABLES:

CREATE TABLE `SYS_TABLES` (
`NAME` varchar(255) NOT NULL default '',
`ID` bigint(20) unsigned NOT NULL default '0',
`N_COLS` int(10) default NULL,
`TYPE` int(10) unsigned default NULL,
`MIX_ID` bigint(20) unsigned default NULL,
`MIX_LEN` int(10) unsigned default NULL,
`CLUSTER_NAME` varchar(255) default NULL,
`SPACE` int(10) unsigned default NULL,
PRIMARY KEY  (`NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

SYS_INDEXES:

CREATE TABLE `SYS_INDEXES` (
`TABLE_ID` bigint(20) unsigned NOT NULL default '0',
`ID` bigint(20) unsigned NOT NULL default '0',
`NAME` varchar(120) default NULL,
`N_FIELDS` int(10) unsigned default NULL,
`TYPE` int(10) unsigned default NULL,
`SPACE` int(10) unsigned default NULL,
`PAGE_NO` int(10) unsigned default NULL,
PRIMARY KEY  (`TABLE_ID`,`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Please note field SPACE. For table actor it is equal to 15:

mysql> select * from `INNODB_SYS_TABLES` where `SCHEMA` = 'sakila' AND `NAME` = 'actor'\G
*************************** 1. row ***************************
SCHEMA: sakila
NAME: actor
ID: 13
N_COLS: 2147483652
TYPE: 1
MIX_ID: 0
MIX_LEN: 0
CLUSTER_NAME:
SPACE: 15
1 row in set (0.00 sec)

SPACE is equal to 15 in all actor’s indexes:

mysql> select * from INNODB_SYS_INDEXES WHERE TABLE_ID = 13\G
*************************** 1. row ***************************
TABLE_ID: 13
ID: 15
NAME: PRIMARY
N_FIELDS: 1
TYPE: 3
SPACE: 15
PAGE_NO: 3
*************************** 2. row ***************************
TABLE_ID: 13
ID: 16
NAME: idx_actor_last_name
N_FIELDS: 1
TYPE: 0
SPACE: 15
PAGE_NO: 4
2 rows in set (0.00 sec)

In InnoDB world actor.ibd is a tablespace. It has space_id and it is equal to 15 for this particular table at this particular server.

As you can see secondary indexes are stored in actor.ibd as well.

But where is space_id in actor.ibd?

Like any other tablespace actor.ibd costsists of a set of InnoDB pages. A page is 16k long (UNIV_PAGE_SIZE in the source code).

Let’s take a look at the page header:

InnoDB Page Header
Name Size Description
FIL_PAGE_SPACE_OR_CHKSUM 4 /* in < MySQL-4.0.14 space id the
page belongs to (== 0) but in later
versions the ‘new’ checksum of the
page */
FIL_PAGE_OFFSET 4 ordinal page number from start of space
FIL_PAGE_PREV 4 offset of previous page in key order
FIL_PAGE_NEXT 4 offset of next page in key order
FIL_PAGE_LSN 8 log serial number of page’s latest log record
FIL_PAGE_TYPE 2 current defined types are: FIL_PAGE_INDEX, FIL_PAGE_UNDO_LOG, FIL_PAGE_INODE, FIL_PAGE_IBUF_FREE_LIST
FIL_PAGE_FILE_FLUSH_LSN 8 “the file has been flushed to disk at least up to this lsn” (log serial number), valid only on the first page of the file
FIL_PAGE_ARCH_LOG_NO_OR_SPACE_NO 4 /* starting from 4.1.x this
contains the space id of the page */

So, space_id is 4 bytes written to every InnoDB page. An .ibd file can be huge, while ibdata1 is usually smaller. Thus, it is easier to modify space_id in InnoDB dictionary once than in every InnoDB page.

How let’s connect actor.ibd from sakila database taken from some MySQL server.

0. Create empty InnoDB tablespace.

1. Create the table:

mysql>CREATE TABLE actor (
actor_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
first_name VARCHAR(45) NOT NULL,
last_name VARCHAR(45) NOT NULL,
last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY  (actor_id),
KEY idx_actor_last_name (last_name)
)ENGINE=InnoDB DEFAULT CHARSET=utf8;

This command will create respective records in SYS_TABLES and SYS_INDEXES.

2. Now let’s modify SPACE in InnoDB dictionary. MySQL must be stopped at this point. There is a tool ibdconnect in Percona InnoDB Recovery Tool. Make sure you’re using the latest version from the trunk.

It reads space_id from an .ibd file and updates the dictionary in ibdata1.

# ./ibdconnect -o /var/lib/mysql/ibdata1 -f /var/lib/mysql/sakila/actor.ibd -d sakila -t actor
actor.ibd
actor.ibd belongs to space #15
... Skipped output...
SYS_TABLES is updated successfully
... Skipped output...
SYS_INDEXES is updated successfully

It is possible that space_id from actor.ibd is already used by some other table.
In this case if ibdata was updated MySQL will fail to start with error:

InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Error: trying to add tablespace 15 of name './sakila/customer.ibd'
InnoDB: to the tablespace memory cache, but tablespace
InnoDB: 15 of name './sakila/actor.ibd' already exists in the tablespace
InnoDB: memory cache!

To refrain from such error ibdconnect does check if the space_id is already used.
It will refuse to update ibdata1:

$ ./ibdconnect -o ibdata1 -f t2.ibd -d sakila -t actor
...
Error: Space id: 12 is already used in InnoDB dictionary for table test/t2

In this case you need to drop table `test`.`t2` and create it again. InnoDB will assign other space_id, thus 12 will be freed.

3. Now SPACE is modified in the dictionary, but checksums are bad. To regenerate them use innochecksum from the same toolset. Run it two times:

# ./innochecksum -f /var/lib/mysql/ibdata1
page 8 invalid (fails new style checksum)
page 8: new style: calculated = 0x B7C5C82C; recorded = 0x BFE71C21
fixing new checksum of page 8
page 11 invalid (fails new style checksum)
page 11: new style: calculated = 0x E4189B9B; recorded = 0x C168689B
fixing new checksum of page 11
#
# ./innochecksum -f /var/lib/mysql/ibdata1
page 8 invalid (fails old style checksum)
page 8: old style: calculated = 0x 8195646B; recorded = 0x DA79A2EE
fixing old checksum of page 8
page 8 invalid (fails new style checksum)
page 8: new style: calculated = 0x 119FD630; recorded = 0x B7C5C82C
fixing new checksum of page 8
page 11 invalid (fails old style checksum)
page 11: old style: calculated = 0x 908297E7; recorded = 0x 6536CEE8
fixing old checksum of page 11
page 11 invalid (fails new style checksum)
page 11: new style: calculated = 0x D5DC3269; recorded = 0x E4189B9B
fixing new checksum of page 11

4. The third time to be sure ibdata1 has valid checksums:

#./innochecksum /var/lib/mysql/ibdata1
#

5. Now you can start MySQL and take a dump from the table. The table is accessible, but due to obvious reason it shold not be used in production.

The tool ibdconnect was tested on MySQL 5.1 on CentOS 5.6 x86_64. However it expected to work on all versions of MySQL/InnoDB.

Dec
25
2010
--

Spreading .ibd files across multiple disks; the optimization that isn’t

Inspired by Baron’s earlier post, here is one I hear quite frequently –

“If you enable innodb_file_per_table, each table is it’s own .ibd file.  You can then relocate the heavy hit tables to a different location and create symlinks to the original location.”

There are a few things wrong with this advice:

  1. InnoDB does not support these symlinks.  If you run an ALTER TABLE command, what you will find is that a new temporary table is created (in the original location!), the symlink is destroyed, and the temporary table is renamed.  Your “optimization” is lost.
  2. Striping (with RAID) is usually a far better optimization.  Striping a table across multiple disks effectively balances the  ‘heavy hit’ access across many more disks.  With 1 disk/table you are more likely to have the unbalance one disk overloaded, and many idle.
  3. You restrict your backup methods.  You can’t LVM snapshot across logical volumes.

Another common claim with this recommendation is that it allows you to quickly add space when running out.  LVM actually allows you to add physical volumes, and increase the size of logical volumes ;)   This is much easier to do than more one large table around.


Entry posted by Morgan Tocker |
22 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com