Jul
30
2014
--

Examining the TokuDB MySQL storage engine file structure

As we know different storage engines in MySQL have different file structures. Every table in MySQL 5.6 must have a .frm file in the database directory matching the table name. But where the rest of the data resides depends on the storage engine.

For MyISAM we have .MYI and .MYD files in the database directory (unless special settings are in place); for InnoDB we might have data stored in the single table space (typically ibdata1 in the database directory) or as file per table (or better said file per partition) producing a single file with .ibd extension for each table/partition. TokuDB as of this version (7.1.7) has its own innovative approach to storing the table contents.

I have created the table in the database test having the following structure:

CREATE TABLE `mytable` (
  `id` int(10) unsigned NOT NULL,
  `c` varchar(15) NOT NULL,
  `d` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `c` (`c`),
  KEY `d` (`d`,`c`),
  KEY `d_2` (`d`)
) ENGINE=TokuDB DEFAULT CHARSET=latin1

No files appear in the “test” database directory besides mytable.frm, however few files are created in the database directory:

-rwxrwx--x  1 mysql mysql       40960 Jul 29 21:01 _test_mytable_key_c_22f19b0_3_19_B_0.tokudb
-rwxrwx--x  1 mysql mysql       16896 Jul 29 21:02 _test_mytable_key_d_2_22f223b_3_19_B_0.tokudb
-rwxrwx--x  1 mysql mysql       16896 Jul 29 21:01 _test_mytable_key_d_22f1c9a_3_19_B_0.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:01 _test_mytable_main_22f1818_2_19.tokudb
-rwxrwx--x  1 mysql mysql       65536 Jul 29 21:02 _test_mytable_status_22f1818_1_19.tokudb

As you can see the table is presented by a series of files – the “status” file, the “main” table which contains clustered fractal tree index (primary key) plus each index is stored in its own file. Note how files are named – to include the database name, file name and the key name (the name you give to the key, not the columns involved). This is followed by something like “22f1818_1_19″ which I assume is kind of internal TokuDB object identifier.

Note also (at least in my system) files are created with executable bit set. I see no reason for this and this is probably just a minor bug.

Another minor bug (or intended design limitation?) seems to be TokuDB might loose the actual table name in its file name when you alter the table. For example as I altered the table to drop one of the keys and add another one named “superkey” I see the “mytable” name is replaced with “sql_390c_247″ which looks very much like the temporary table which was used to rebuild the table:

-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:15 _test_sql_390c_247_key_c_22f6f7d_3_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:15 _test_sql_390c_247_key_d_22f6f7d_5_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:15 _test_sql_390c_247_key_superkey_22f6f7d_4_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:14 _test_sql_390c_247_main_22f6f7d_2_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:14 _test_sql_390c_247_status_22f6f7d_1_19.tokudb

I like the approach of storing different indexes in the different files as this makes it much easier to drop the index as well as potentially allows the placement of indexes onto different storage if it is desired for some reason. However putting all tables in the database root is a bad idea – having substantial amount of tables, especially with few indexes, each producing huge amounts of files, makes it inconvenient to work with the database directory (which often contains other files – log files, binary logs etc.) plus it might push file systems to their limits or performance limits dealing with huge amounts of files in the single directory.

Many also like having files in the data directory as it allows, in basic configurations, to use simple Unix tools such as du to see how much space given database physically takes.

Same as InnoDB and MyISAM, TokuDB will create a separate set of files for each partition, placing it in the same directory, having same table partitioned by HASH on primary key with 4 partitions I observe:

-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p0_key_c_22f9f35_3_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p0_key_d_22f9f35_5_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p0_key_superkey_22f9f35_4_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p0_main_22f9f35_2_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p0_status_22f9f35_1_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p1_key_c_22f9f8e_3_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p1_key_d_22f9f8e_5_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p1_key_superkey_22f9f8e_4_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p1_main_22f9f8e_2_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p1_status_22f9f8e_1_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p2_key_c_22f9fe1_3_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p2_key_d_22f9fe1_5_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p2_key_superkey_22f9fe1_4_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p2_main_22f9fe1_2_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p2_status_22f9fe1_1_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p3_key_c_22f9ffb_3_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p3_key_d_22f9ffb_5_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p3_key_superkey_22f9ffb_4_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p3_main_22f9ffb_2_19.tokudb
-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:22 _test_sql_390c_25b_P_p3_status_22f9ffb_1_19.tokudb

As you can see “P_p1″ to “P_p2″ suffixes added to each of the files.

What other files exist beyond those which come from TokuDB tables?

There are few system files

-rwxrwx--x  1 mysql mysql       32768 Jul 29 21:16 tokudb.directory
-rwxrwx--x  1 mysql mysql       16384 Jul 17 19:09 tokudb.environment
-rwxrwx--x  1 mysql mysql     1048576 Jul 29 21:22 tokudb.rollback

Which as their name say contain “directory” – metadata about tables and indexes in the system, rollback – contains data which is needed for transaction rollback and environment contains some kind of information about environment. I did not dig into this – I just see this file is not changed after I have started the instance unlike other 2 which are changed when table structure is changed (tokudb.directory) or when database is modified (tokudb.rollback)

Next you will see

-rw-------  1 mysql mysql           0 Jul 17 19:09 __tokudb_lock_dont_delete_me_data
-rw-------  1 mysql mysql           0 Jul 17 19:09 __tokudb_lock_dont_delete_me_environment
-rw-------  1 mysql mysql           0 Jul 17 19:09 __tokudb_lock_dont_delete_me_logs
-rw-------  1 mysql mysql           0 Jul 17 19:09 __tokudb_lock_dont_delete_me_recovery
-rw-------  1 mysql mysql           0 Jul 17 19:09 __tokudb_lock_dont_delete_me_temp

I see TokuDB really tries to lock the database instance preventing concurrent access to the directory with multiple files. I think InnoDB does it more clean way placing the lock on system tablespace which does not require extra files… though the TokuDB team might have some specific reason to do it this way. Might be Oracle holds the software patent on preventing concurrent database operation by locking the file?

Finally there is the transaction log file:

-rwx------  1 mysql mysql    37499084 Jul 29 21:34 log000000002593.tokulog25

TokuDB transaction log files are not pre-allocated like InnoDB’s but they look more similar to MySQL binary logs – they have sequentially increasing file numbers which will increment as new files are created, file itself will grow as new data is written to the log. As I understand log rotation happens during checkpoint and you would typically see only one log file.

There is a lot more for me to learn when it comes to TokuDB file layout and purpose of individual files, yet I hope this provides you with good basic overview of the TokuDB MySQL storage engine.

Further Reading: Rich Prohaska from TokuDB team was kind enough to share some more good reading with me, for those extra curious: TokuDB Files and File Descriptors, Separate Database Directories Design

The post Examining the TokuDB MySQL storage engine file structure appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com