This blog post discusses a new TokuDB and PerconaFT database file management feature in two Percona Server releases.
By now you have hopefully read through Peter’s post and my two prior posts on the TokuDB/PerconaFT file set. If you have not, it is probably a good idea to run through them now before we get into the details of this new feature.
We introduced a new server option beginning in Percona Server 5.6.33-79.0 and Percona Server 5.7.15-9, called tokudb_dir_per_db, that addresses two shortcomings within the current TokuDB implementation:
- The renaming of data files on table/index rename
- The ability to group data files together within a directory that represents a single database.
The new option is disabled by default in 5.6.33-79.0, but will be enabled by default beginning in 5.7.15-9
New table renaming functionality
When you rename a TokuDB table via SQL, the data files on disk keep their original names. Only the mapping in the PerconaFT directory file is changed to map the new dictionary name to the original internal file names. This makes it difficult to quickly match database/table/index names to their actual files on disk, requiring you to use the INFORMATION_SCHEMA.TOKUDB_FILE_MAP to cross reference.
When tokudb_dir_per_db is enabled, this is no longer the case. When you rename a table, the mapping in the PerconaFT directory file will be updated, and the files will be renamed on disk to reflect the new table name. This was a lot more difficult to implement than we originally thought due to the non-transactional nature of the underlying file systems. We had to “invent” a transactional file system rename functionality within PerconaFT to ensure that a crash during a file system mv src dst was recoverable.
New directory layout functionality
Many users have had issues with managing the huge volume of individual files used by TokuDB and PerconaFT. We are beginning to take some steps to help improve the manageability of these files, and potentially even reduce the number of files present.
When you enable tokudb_dir_per_db, all new tables and indices are placed within their corresponding database directory within the tokudb_data_dir or system datadir. Existing table files will not be automatically relocated to their corresponding database directory.
You can easily move a table’s data files into the new scheme and proper database directory with a few steps:
mysql> SET GLOBAL tokudb_dir_per_db=true;
mysql> RENAME TABLE <table> TO <tmp_table>;
mysql> RENAME TABLE <tmp_table> TO <table>;
Two renames are needed because MySQL will not allow you to rename a table to itself. The first rename renames the table to the temporary name and mv‘s the tables files into the owning database directory. The second rename sets the table name back to the original name. Tables can, of course, also be renamed/moved across databases and will be placed correctly into the corresponding database directory.
You must be careful with renaming tables! If you have used any tricks to create symlinks of the database directories on different storage volumes, the mv is not a simple directory mv on the same volume, but a physical copy across volumes. This can take quite some time and prevents access to the table being moved during the copy.
NOTE: If you have tokudb_data_dir set to something other than the system datadir, TokuDB creates a directory matching the name of the database. Upon dropping of the database, this directory remains behind.
Example:
While running Percona Server 5.7.15-9 with tokudb_dir_per_db=false to illustrate the old behavior, create a table t1, show the file map, and list the data directory:
mysql> SET GLOBAL tokudb_dir_per_db=false;
Query OK, 0 rows affected (0.00 sec)
mysql> DROP DATABASE IF EXISTS test; CREATE DATABASE test; USE test;
Query OK, 0 rows affected (0.00 sec)
Query OK, 1 row affected (0.00 sec)
Database changed
mysql> CREATE TABLE t1(a INT PRIMARY KEY, b INT, c VARCHAR(200), KEY kab(a, b)) ENGINE=TOKUDB;
Query OK, 0 rows affected (0.07 sec)
mysql> SELECT * FROM INFORMATION_SCHEMA.TOKUDB_FILE_MAP;
*************************** 1. row ***************************
dictionary_name: ./test/t1-key-kab
internal_file_name: ./_test_t1_key_kab_de_3_1d.tokudb
table_schema: test
table_name: t1
table_dictionary_name: key-kab
*************************** 2. row ***************************
dictionary_name: ./test/t1-main
internal_file_name: ./_test_t1_main_de_2_1d.tokudb
table_schema: test
table_name: t1
table_dictionary_name: main
*************************** 3. row ***************************
dictionary_name: ./test/t1-status
internal_file_name: ./_test_t1_status_de_1_1d.tokudb
table_schema: test
table_name: t1
table_dictionary_name: status
3 rows in set (0.00 sec)
$ ls -1 data/*.tokudb
data/_test_t1_key_kab_de_3_1d.tokudb
data/_test_t1_main_de_2_1d.tokudb
data/_test_t1_status_de_1_1d.tokudb
We see the data files for our table t1 as the three files _test_t1_*
Rename t1 to all_the_kings_horses, show the file map again, and another listing of the data directory:
mysql> RENAME TABLE t1 TO all_the_kings_horses;
Query OK, 0 rows affected (0.01 sec)
mysql> SELECT * FROM INFORMATION_SCHEMA.TOKUDB_FILE_MAP;
*************************** 1. row ***************************
dictionary_name: ./test/all_the_kings_horses-key-kab
internal_file_name: ./_test_t1_key_kab_de_3_1d.tokudb
table_schema: test
table_name: all_the_kings_horses
table_dictionary_name: key-kab
*************************** 2. row ***************************
dictionary_name: ./test/all_the_kings_horses-main
internal_file_name: ./_test_t1_main_de_2_1d.tokudb
table_schema: test
table_name: all_the_kings_horses
table_dictionary_name: main
*************************** 3. row ***************************
dictionary_name: ./test/all_the_kings_horses-status
internal_file_name: ./_test_t1_status_de_1_1d.tokudb
table_schema: test
table_name: all_the_kings_horses
table_dictionary_name: status
3 rows in set (0.00 sec)
$ ls -1 data/*.tokudb
data/_test_t1_key_kab_de_3_1d.tokudb
data/_test_t1_main_de_2_1d.tokudb
data/_test_t1_status_de_1_1d.tokudb
The file names remained the same as the original table, but the file map has changed to reflect the new table/dictionary names.
Let us inject a little confusion by adding another index to the table:
mysql> alter table all_the_kings_horses add index kac(a,c);
Query OK, 0 rows affected (0.08 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM INFORMATION_SCHEMA.TOKUDB_FILE_MAP;
*************************** 1. row ***************************
dictionary_name: ./test/all_the_kings_horses-key-kab
internal_file_name: ./_test_t1_key_kab_de_3_1d.tokudb
table_schema: test
table_name: all_the_kings_horses
table_dictionary_name: key-kab
*************************** 2. row ***************************
dictionary_name: ./test/all_the_kings_horses-key-kac
internal_file_name: ./_test_all_the_kings_horses_key_kac_e3_3_1d_B_0.tokudb
table_schema: test
table_name: all_the_kings_horses
table_dictionary_name: key-kac
*************************** 3. row ***************************
dictionary_name: ./test/all_the_kings_horses-main
internal_file_name: ./_test_t1_main_de_2_1d.tokudb
table_schema: test
table_name: all_the_kings_horses
table_dictionary_name: main
*************************** 4. row ***************************
dictionary_name: ./test/all_the_kings_horses-status
internal_file_name: ./_test_t1_status_de_1_1d.tokudb
table_schema: test
table_name: all_the_kings_horses
table_dictionary_name: status
4 rows in set (0.00 sec)
$ ls -1 data/*.tokudb
data/_test_all_the_kings_horses_key_kac_e3_3_1d_B_0.tokudb
data/_test_t1_key_kab_de_3_1d.tokudb
data/_test_t1_main_de_2_1d.tokudb
data/_test_t1_status_de_1_1d.tokudb
The file for the new index kac was created with the current table name, not the original.
Now we move on to the new behavior. First make sure that tokudb_dir_per_db=true, then rename the table again, show the file map, and do another directory listing:
mysql> SET GLOBAL tokudb_dir_per_db=true;
Query OK, 0 rows affected (0.00 sec)
mysql> RENAME TABLE all_the_kings_horses TO all_the_kings_men;
Query OK, 0 rows affected (0.02 sec)
mysql> SELECT * FROM INFORMATION_SCHEMA.TOKUDB_FILE_MAP;
*************************** 1. row ***************************
dictionary_name: ./test/all_the_kings_men-key-kab
internal_file_name: ./test/all_the_kings_men_key_kab_ea_2_1d.tokudb
table_schema: test
table_name: all_the_kings_men
table_dictionary_name: key-kab
*************************** 2. row ***************************
dictionary_name: ./test/all_the_kings_men-key-kac
internal_file_name: ./test/all_the_kings_men_key_kac_ea_3_1d.tokudb
table_schema: test
table_name: all_the_kings_men
table_dictionary_name: key-kac
*************************** 3. row ***************************
dictionary_name: ./test/all_the_kings_men-main
internal_file_name: ./test/all_the_kings_men_main_ea_4_1d.tokudb
table_schema: test
table_name: all_the_kings_men
table_dictionary_name: main
*************************** 4. row ***************************
dictionary_name: ./test/all_the_kings_men-status
internal_file_name: ./test/all_the_kings_men_status_ea_5_1d.tokudb
table_schema: test
table_name: all_the_kings_men
table_dictionary_name: status
4 rows in set (0.00 sec)
$ ls -1 data/test/*.tokudb
data/test/all_the_kings_men_key_kab_ea_2_1d.tokudb
data/test/all_the_kings_men_key_kac_ea_3_1d.tokudb
data/test/all_the_kings_men_main_ea_4_1d.tokudb
data/test/all_the_kings_men_status_ea_5_1d.tokudb
The database files have now been renamed to properly match the name of the database, table and keys, and they have been moved into the data/test directory.
Now let us watch all that action with an alternate tokudb_data_dir. Rather than showing how to move files around as mentioned in the previous blog posts, we will just reset our TokuDB installation and start the server with a different tokudb_data_dir that is a sibling to the server datadir, called tokudb_data.
mysql> SET GLOBAL tokudb_dir_per_db=true;
Query OK, 0 rows affected (0.00 sec)
mysql> DROP DATABASE IF EXISTS test; CREATE DATABASE test; USE test;
Query OK, 0 rows affected, 1 warning (0.00 sec)
Query OK, 1 row affected (0.00 sec)
Database changed
mysql> CREATE TABLE t1(a INT PRIMARY KEY, b INT, c VARCHAR(200), KEY kab(a, b)) ENGINE=TOKUDB;
Query OK, 0 rows affected (0.15 sec)
mysql> SELECT * FROM INFORMATION_SCHEMA.TOKUDB_FILE_MAP;
*************************** 1. row ***************************
dictionary_name: ./test/t1-key-kab
internal_file_name: /ssd/toku/DB-295/percona-server-install-5.7/tokudb_data/test/t1_key_kab_d9_3_1d.tokudb
table_schema: test
table_name: t1
table_dictionary_name: key-kab
*************************** 2. row ***************************
dictionary_name: ./test/t1-main
internal_file_name: /ssd/toku/DB-295/percona-server-install-5.7/tokudb_data/test/t1_main_d9_2_1d.tokudb
table_schema: test
table_name: t1
table_dictionary_name: main
*************************** 3. row ***************************
dictionary_name: ./test/t1-status
internal_file_name: /ssd/toku/DB-295/percona-server-install-5.7/tokudb_data/test/t1_status_d9_1_1d.tokudb
table_schema: test
table_name: t1
table_dictionary_name: status
3 rows in set (0.00 sec)
$ ls -1 tokudb_data/
test
__tokudb_lock_dont_delete_me_data
__tokudb_lock_dont_delete_me_temp
$ ls -1 tokudb_data/test/
t1_key_kab_d9_3_1d.tokudb
t1_main_d9_2_1d.tokudb
t1_status_d9_1_1d.tokudb
This shows us TokuDB now putting everything over in the directory we specified in our tokudb_data_dir, and is following the tokudb_dir_per_db paradigm by creating the directory called test before creating the table.
What happens when we drop that database?
mysql> drop database test;
Query OK, 1 row affected (0.02 sec)
$ ls -1 tokudb_data/
test
__tokudb_lock_dont_delete_me_data
__tokudb_lock_dont_delete_me_temp
$ ls -1 tokudb_data/test/
All of the tables files have been removed, but as mentioned above, the test still exists and needs to be removed manually.
Thanks for reading! We hope that you have found this blog series useful and look forward to hearing your experience with this new feature.
In the future we will investigate implementing the CREATE TABLE … DATA|INDEX DIRECTORY=… feature for TokuDB, which builds on top of this work.
We need to give a shout out to Vladislav Lesin, who took the lead on this issue and fought several battles in his attempt to ensure that this is fully crash safe and recoverable.