Jul
29
2015
--

Why base64-output=DECODE-ROWS does not print row events in MySQL binary logs

Lately I saw many cases when users specified option

--base64-output=DECODE-ROWS

  to print out a statement representation of row events in MySQL binary logs just to get nothing. Reason for this is obvious: option

--base64-output=DECODE-ROWS

  does not convert row events into its string representation, this is job of option

--verbose

. But why users mix these two options so often? This blog post is result of my investigations.

There are already two great blog posts about printing row events on the Percona blog: “Debugging problems with row based replication” by Justin Swanhart and “Identifying useful info from MySQL row-based binary logs” by Alok Pathak.

Both authors run

mysqlbinlog

  with options 

–base64-output=decode-rows -vv

  and demonstrate how a combination of them can produce human-readable output of row events. However, one thing which is not clear yet is what the differences are between these options. I want to underline the differences in this post.

Let’s check the user manual first.

–base64-output=value

This option determines when events should be displayed encoded as base-64 strings using BINLOG statements. The option has these permissible values (not case sensitive):

    AUTO (“automatic”) or UNSPEC (“unspecified”) displays BINLOG statements automatically when necessary (that is, for format description events and row events). If no –base64-output option is given, the effect is the same as –base64-output=AUTO.
    Note

    Automatic BINLOG display is the only safe behavior if you intend to use the output of mysqlbinlog to re-execute binary log file contents. The other option values are intended only for debugging or testing purposes because they may produce output that does not include all events in executable form.

    NEVER causes BINLOG statements not to be displayed. mysqlbinlog exits with an error if a row event is found that must be displayed using BINLOG.

    DECODE-ROWS specifies to mysqlbinlog that you intend for row events to be decoded and displayed as commented SQL statements by also specifying the –verbose option. Like NEVER, DECODE-ROWS suppresses display of BINLOG statements, but unlike NEVER, it does not exit with an error if a row event is found.

For examples that show the effect of –base64-output and –verbose on row event output, see Section 4.6.8.2, “mysqlbinlog Row Event Display”.

Literally

--base64-output=DECODE-ROWS

  just suppresses

BINLOG

  statement and does not print anything.

To test its effect I run command

insert into t values (2, 'bar');

on an InnoDB table while binary log uses ROW format. As expected if I specify no option I will receive unreadable output:

$mysqlbinlog var/mysqld.1/data/master-bin.000002
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#150720 15:19:15 server id 1  end_log_pos 120 CRC32 0x3d52aee2  Start: binlog v 4, server v 5.6.25-73.1-debug-log created 150720 15:19:15
BINLOG '
Q+esVQ8BAAAAdAAAAHgAAAAAAAQANS42LjI1LTczLjEtZGVidWctbG9nAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAEzgNAAgAEgAEBAQEEgAAXAAEGggAAAAICAgCAAAACgoKGRkAAeKu
Uj0=
'/*!*/;
# at 120
#150720 15:19:21 server id 1  end_log_pos 192 CRC32 0xbebac59d  Query   thread_id=2     exec_time=0     error_code=0
SET TIMESTAMP=1437394761/*!*/;
SET @@session.pseudo_thread_id=2/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1073741824/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 192
#150720 15:19:21 server id 1  end_log_pos 239 CRC32 0xe143838b  Table_map: `test`.`t` mapped to number 70
# at 239
#150720 15:19:21 server id 1  end_log_pos 283 CRC32 0x75523a2d  Write_rows: table id 70 flags: STMT_END_F
BINLOG '
SeesVRMBAAAALwAAAO8AAAAAAEYAAAAAAAEABHRlc3QAAXQAAgMPAv8AA4uDQ+E=
SeesVR4BAAAALAAAABsBAAAAAEYAAAAAAAEAAgAC//wCAAAAA2Jhci06UnU=
'/*!*/;
# at 283
#150720 15:19:21 server id 1  end_log_pos 314 CRC32 0xd183c769  Xid = 14
COMMIT/*!*/;
# at 314
#150720 15:19:22 server id 1  end_log_pos 362 CRC32 0x892fe43b  Rotate to master-bin.000003  pos: 4
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

The

INSERT

  is here:

BINLOG '
SeesVRMBAAAALwAAAO8AAAAAAEYAAAAAAAEABHRlc3QAAXQAAgMPAv8AA4uDQ+E=
SeesVR4BAAAALAAAABsBAAAAAEYAAAAAAAEAAgAC//wCAAAAA2Jhci06UnU=
'/*!*/;

But this string is not for humans.

What will happen if I add option

--base64-output=DECODE-ROWS

 ?

$mysqlbinlog var/mysqld.1/data/master-bin.000002 --base64-output=DECODE-ROWS
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#150720 15:19:15 server id 1  end_log_pos 120 CRC32 0x3d52aee2  Start: binlog v 4, server v 5.6.25-73.1-debug-log created 150720 15:19:15
# at 120
#150720 15:19:21 server id 1  end_log_pos 192 CRC32 0xbebac59d  Query   thread_id=2     exec_time=0     error_code=0
SET TIMESTAMP=1437394761/*!*/;
SET @@session.pseudo_thread_id=2/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1073741824/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 192
#150720 15:19:21 server id 1  end_log_pos 239 CRC32 0xe143838b  Table_map: `test`.`t` mapped to number 70
# at 239
#150720 15:19:21 server id 1  end_log_pos 283 CRC32 0x75523a2d  Write_rows: table id 70 flags: STMT_END_F
# at 283
#150720 15:19:21 server id 1  end_log_pos 314 CRC32 0xd183c769  Xid = 14
COMMIT/*!*/;
# at 314
#150720 15:19:22 server id 1  end_log_pos 362 CRC32 0x892fe43b  Rotate to master-bin.000003  pos: 4
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

Row event was just suppressed!

Lets now check option verbose:

–verbose, -v

Reconstruct row events and display them as commented SQL statements. If this option is given twice, the output includes comments to indicate column data types and some metadata.

For examples that show the effect of –base64-output and –verbose on row event output, see Section 4.6.8.2, “mysqlbinlog Row Event Display”.

Surprisingly

--base64-output=DECODE-ROWS

  is not needed!:

$mysqlbinlog var/mysqld.1/data/master-bin.000002 --verbose
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#150720 15:19:15 server id 1  end_log_pos 120 CRC32 0x3d52aee2  Start: binlog v 4, server v 5.6.25-73.1-debug-log created 150720 15:19:15
BINLOG '
Q+esVQ8BAAAAdAAAAHgAAAAAAAQANS42LjI1LTczLjEtZGVidWctbG9nAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAEzgNAAgAEgAEBAQEEgAAXAAEGggAAAAICAgCAAAACgoKGRkAAeKu
Uj0=
'/*!*/;
# at 120
#150720 15:19:21 server id 1  end_log_pos 192 CRC32 0xbebac59d  Query   thread_id=2     exec_time=0     error_code=0
SET TIMESTAMP=1437394761/*!*/;
SET @@session.pseudo_thread_id=2/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1073741824/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 192
#150720 15:19:21 server id 1  end_log_pos 239 CRC32 0xe143838b  Table_map: `test`.`t` mapped to number 70
# at 239
#150720 15:19:21 server id 1  end_log_pos 283 CRC32 0x75523a2d  Write_rows: table id 70 flags: STMT_END_F
BINLOG '
SeesVRMBAAAALwAAAO8AAAAAAEYAAAAAAAEABHRlc3QAAXQAAgMPAv8AA4uDQ+E=
SeesVR4BAAAALAAAABsBAAAAAEYAAAAAAAEAAgAC//wCAAAAA2Jhci06UnU=
'/*!*/;
### INSERT INTO `test`.`t`
### SET
###   @1=2
###   @2='bar'
# at 283
#150720 15:19:21 server id 1  end_log_pos 314 CRC32 0xd183c769  Xid = 14
COMMIT/*!*/;
# at 314
#150720 15:19:22 server id 1  end_log_pos 362 CRC32 0x892fe43b  Rotate to master-bin.000003  pos: 4
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

INSERT statement successfully restored as:

### INSERT INTO `test`.`t`
### SET
###   @1=2
###   @2='bar'
# at 283

Why do the bloggers mentioned above suggest to use

--base64-output=DECODE-ROWS

 ? Lets try to use both options:

$mysqlbinlog var/mysqld.1/data/master-bin.000002 --base64-output=DECODE-ROWS --verbose
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#150720 15:19:15 server id 1  end_log_pos 120 CRC32 0x3d52aee2  Start: binlog v 4, server v 5.6.25-73.1-debug-log created 150720 15:19:15
# at 120
#150720 15:19:21 server id 1  end_log_pos 192 CRC32 0xbebac59d  Query   thread_id=2     exec_time=0     error_code=0
SET TIMESTAMP=1437394761/*!*/;
SET @@session.pseudo_thread_id=2/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1073741824/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 192
#150720 15:19:21 server id 1  end_log_pos 239 CRC32 0xe143838b  Table_map: `test`.`t` mapped to number 70
# at 239
#150720 15:19:21 server id 1  end_log_pos 283 CRC32 0x75523a2d  Write_rows: table id 70 flags: STMT_END_F
### INSERT INTO `test`.`t`
### SET
###   @1=2
###   @2='bar'
# at 283
#150720 15:19:21 server id 1  end_log_pos 314 CRC32 0xd183c769  Xid = 14
COMMIT/*!*/;
# at 314
#150720 15:19:22 server id 1  end_log_pos 362 CRC32 0x892fe43b  Rotate to master-bin.000003  pos: 4
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

In this case row event was suppressed and statement is printed. Also, the resulting file cannot be used to re-apply events, because statements are commented out. This is very useful when binary log is big and you just need to investigate what it contains, not re-apply events.

This is not main purpose of this post, but you can also find information about columns metadata if specify option

--verbose

  twice:

$mysqlbinlog var/mysqld.1/data/master-bin.000002 --base64-output=DECODE-ROWS --verbose --verbose
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#150720 15:19:15 server id 1  end_log_pos 120 CRC32 0x3d52aee2  Start: binlog v 4, server v 5.6.25-73.1-debug-log created 150720 15:19:15
# at 120
#150720 15:19:21 server id 1  end_log_pos 192 CRC32 0xbebac59d  Query   thread_id=2     exec_time=0     error_code=0
SET TIMESTAMP=1437394761/*!*/;
SET @@session.pseudo_thread_id=2/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1073741824/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 192
#150720 15:19:21 server id 1  end_log_pos 239 CRC32 0xe143838b  Table_map: `test`.`t` mapped to number 70
# at 239
#150720 15:19:21 server id 1  end_log_pos 283 CRC32 0x75523a2d  Write_rows: table id 70 flags: STMT_END_F
### INSERT INTO `test`.`t`
### SET
###   @1=2 /* INT meta=0 nullable=1 is_null=0 */
###   @2='bar' /* VARSTRING(255) meta=255 nullable=1 is_null=0 */
# at 283
#150720 15:19:21 server id 1  end_log_pos 314 CRC32 0xd183c769  Xid = 14
COMMIT/*!*/;
# at 314
#150720 15:19:22 server id 1  end_log_pos 362 CRC32 0x892fe43b  Rotate to master-bin.000003  pos: 4
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

Note, this is, again, job of

--verbose

 , not

--base64-output=DECODE-ROWS

To conclude:

If you want to see statement representation of row events use option

--verbose (-v)


If you want to see metadata of columns specify

--verbose

  twice:

--verbose --verbose

  or

-vv


If you want to suppress output of row events specify option

--base64-output=DECODE-ROWS

The post Why base64-output=DECODE-ROWS does not print row events in MySQL binary logs appeared first on MySQL Performance Blog.

Aug
22
2014
--

When (and how) to move an InnoDB table outside the shared tablespace

In my last post, “A closer look at the MySQL ibdata1 disk space issue and big tables,” I looked at the growing ibdata1 problem under the perspective of having big tables residing inside the so-called shared tablespace. In the particular case that motivated that post, we had a customer running out of disk space in his server who was looking for a way to make the ibdata1 file shrink. As you may know, that file (or, as explained there, the set of ibdata files composing the shared tablespace) stores all InnoDB tables created when innodb_file_per_table is disabled, but also other InnoDB structures, such as undo logs and data dictionary.

For example, when you run a transaction involving InnoDB tables, MySQL will first write all the changes it triggers in an undo log, for the case you later decide to “roll them back”. Long standing, uncommited transactions are one of the causes for a growing ibdata file. Of course, if you have innodb_file_per_table disabled then all your InnoDB tables live inside it. That was what happened in that case.

So, how do you move a table outside the shared tablespace and change the storage engine it relies on? As importantly, how does that affects disk space use? I’ve explored some of the options presented in the previous post and now share my findings with you below.

The experiment

I created a very simple InnoDB table inside the shared tablespace of a fresh installed Percona Server 5.5.37-rel35.1 with support for TokuDB and configured it with a 1GB buffer pool. I’ve used a 50G partition to host the ‘datadir’ and another one for ‘tmpdir’:

Filesystem                 Size Used Avail Use% Mounted on
/dev/mapper/vg0-lvFernando1 50G 110M  47G   1%  /media/lvFernando1  # datadir
/dev/mapper/vg0-lvFernando2 50G  52M  47G   1%  /media/lvFernando2  # tmpdir

Here’s the table structure:

CREATE TABLE `joinit` (
  `i` int(11) NOT NULL AUTO_INCREMENT,
  `s` varchar(64) DEFAULT NULL,
  `t` time NOT NULL,
  `g` int(11) NOT NULL,
  PRIMARY KEY (`i`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1;

I populated it with 134 million rows using the following routine:

INSERT INTO joinit VALUES (NULL, uuid(), time(now()),  (FLOOR( 1 + RAND( ) *60 )));
INSERT INTO joinit SELECT NULL, uuid(), time(now()),  (FLOOR( 1 + RAND( ) *60 )) FROM joinit;  # repeat until you reach your target number of rows

which resulted in the table below:

mysql> show table status from test like 'joinit'G
*************************** 1. row ***************************
Name: joinit
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 134217909
Avg_row_length: 72
Data_length: 9783214080
Max_data_length: 0
Index_length: 0
Data_free: 1013972992
Auto_increment: 134872552
Create_time: 2014-07-30 20:42:42
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.00 sec)

The resulting ibdata1 file was showing to have 11G, which accounted in practice for 100% of the datadir partition use then. What follows next is a few experiences I did by converting that table to use a different storage engine, moving it outside the shared tablespace, compressing it, and dumping and restoring the database back to see the effects in disk space use. I haven’t timed how long running each command took and focused mostly on the generated files size. As a bonus, I’ve also looked at how to extend the shared table space by adding an extra ibdata file.

#1) Converting to MyISAM

Technical characteristics and features apart, MyISAM tables are know to occupy less disk space than InnoDB’s ones. How much less depends on the actual table structure. Here I made the conversion in the most simplest way:

mysql> ALTER TABLE test.joinit ENGINE=MYISAM;

which created the following files (the .frm file already existed):

$ ls -lh /media/lvFernando1/data/test/
total 8.3G
-rw-rw---- 1 fernando.laudares admin 8.5K Jul 31 16:21 joinit.frm
-rw-rw---- 1 fernando.laudares admin 7.0G Jul 31 16:27 joinit.MYD
-rw-rw---- 1 fernando.laudares admin 1.3G Jul 31 16:27 joinit.MYI

The resulting MyISAM files amounted for an additional 8.3G of disk space use:

/dev/mapper/vg0-lvFernando1 50G 19G 29G 40% /media/lvFernando1

I was expecting smaller files but, of course, the result depends largely on the data types of the columns composing the table. The problem (or the consequence) is that we end up with close to the double of the initial disk space being used:

As it happens with the other solutions presented in this section that migrate the target table outside the shared tablespace, the common/safest way to reclaim the freed (unused) space inside the ibdata1 file back to the operating system is by doing a dump & restore of the full database.

There’s an alternative approach with MyISAM though, which doesn’t involve dump & restore and only requires a MySQL restart. However, you need to convert all InnoDB tables to MyISAM, stop MySQL, delete all ib* files (there should be no remaining .ibd files after you’ve converted all InnoDB tables to MyISAM), and then restart MySQL again. Upon MySQL restart, ibdata1 will be re-created with it’s default initial size (more on this below). You can then convert the MyISAM tables back to InnoDB and if you have innodb_file_per_table enabled this time then the tables will be created with their own private tablespace file.

#2) Exporting the table to a private tablespace

Once you have innodb_file_per_table enabled you can move a table residing inside ibdata1 to it’s private tablespace (it’s own .ibd file) by either running ALTER TABLE or OPTIMIZE TABLE. Both commands create a “temporary” (though InnoDB, not MyISAM) table with it’s own tablespace file inside the database directory (and not in the tmpdir, as I believed it would happen), the rows from the target table being copied over there.

Here’s showing the temporary table (#sql-4f10_1) that was created while the process was still ongoing:

$ ls -lh /media/lvFernando1/data/test
total 2.2G
-rw-rw---- 1 fernando.laudares admin 8.5K Jul 30 20:42 joinit.frm
-rw-rw---- 1 fernando.laudares admin 8.5K Jul 30 23:05 #sql-4f10_1.frm
-rw-rw---- 1 fernando.laudares admin 2.2G Jul 30 23:12 #sql-4f10_1.ibd

and the resulting .ibd file from when the process completed:

$ ls -lh /media/lvFernando1/data/test
total 9.3G
-rw-rw---- 1 fernando.laudares admin 8.5K Jul 30 23:05 joinit.frm
-rw-rw---- 1 fernando.laudares admin 9.3G Jul 30 23:35 joinit.ibd

Note that the new joinit.ibd file is about 9.3G. Again, the process resulted in the use of extra disk space:

/dev/mapper/vg0-lvFernando1 50G 20G 28G 42% /media/lvFernando1

#3) Dump and restore: looking at the disk space use

As pointed, the way to reclaim unused disk space inside ibdata1 back to the file system (and consequently making it shrink) is by dumping the full database to a text file and then restoring it back. I’ve started by doing a simple full mysqldump encompassing all databases:

$ mysqldump -S /tmp/mysql_sandbox5537.sock --user=msandbox --password=msandbox --all-databases > dump.sql

which created the following file:

-rw-r--r-- 1 fernando.laudares admin 8.1G Jul 31 00:02 dump.sql

I’ve then stopped MySQL, wiped out the full datadir and used the script mysql_install_db to (re-)create the system tables (though that’s not needed, I did it to level comparisons; once you have all InnoDB tables out of the system tablespace you can simply delete all ib* files along any .ibd and .frm files for related InnoDB tables), started MySQL again, and finally restored the backup:

$ mysql -S /tmp/mysql_sandbox5537.sock --user=msandbox --password=msandbox < dump.sql

This resulted in:

/dev/mapper/vg0-lvFernando1 50G 9.4G 38G 21% /media/lvFernando1

and:

-rw-rw---- 1 fernando.laudares admin 18M Jul 31 05:05 /media/lvFernando1/data/ibdata1

ibdata1 did shrink in the process (returning to the default* initial size) and we recovered back a couple of gigabytes that were not being used by the single InnoDB table that used to live inside it.

* The default value for innodb_data_file_path for the Percona Server release we used on the tests is “ibdata1:10M:autoextend”. However, the manual states that “the default behavior is to create a single auto-extending data file, slightly larger than 10MB“. Given that innodb_autoextend_increment is set to 8M by default what we get in practice is an initialized ibdata1 file of 18M, as seen above.

#4) Compressing the table

Back to the original scenario, with the test table still living in the shared tablespace, I’ve enabled innodb_file_per_table, set Barracuda as the innodb_file_format and then ran:

mysql> ALTER TABLE test.joinit ENGINE=INNODB ROW_FORMAT=Compressed;

This resulted in an .ibd file with half the size of an uncompressed one:

-rw-rw---- 1 fernando.laudares admin 4.7G Jul 31 13:59 joinit.ibd

The process was a lengthy one but resulted in considerable less disk space use than converting the table to MyISAM:

/dev/mapper/vg0-lvFernando1 50G 20G 28G 42% /media/lvFernando1

#5) Converting the table to TokuDB

I’ve used TokuDB version 7.1.6 (not the latest one), which was bundled in Percona Server 5.5.37-rel35.1.

mysql> ALTER TABLE test.joinit ENGINE=TOKUDB;

As was the case with converting the InnoDB table to it’s own tablespace, a temporary table was created to intermediate the process, with the table definition residing in the test database directory:

$ ls -lh /media/lvFernando1/data/test/
total 24K
-rw-r----- 1 fernando.laudares admin 8.5K Jul 31 14:22 joinit.frm
-rw-rw---- 1 fernando.laudares admin 8.5K Jul 31 14:28 #sql-7f51_1.frm

However, TokuDB created a temporary file in the main datadir to copy the data into, show here at some point during the process:

-rwxrwx--x 1 fernando.laudares admin 32K Jul 31 14:28 _test_sql_7f51_1_main_a_2_19.tokudb
-rwxrwx--x 1 fernando.laudares admin 16K Jul 31 14:28 _test_sql_7f51_1_status_a_1_19.tokudb
-rw------- 1 fernando.laudares admin 528M Jul 31 14:29 tokuldNk5W4v

Once the process completed the file tokuldNk5W4v disappeared and the data ended up in the _test_sql_7f51_1_main_a_2_19.tokudb file:

-rwxrwx--x 1 fernando.laudares admin 1.1G Jul 31 14:32 _test_sql_7f51_1_main_d_1_19_B_0.tokudb

The new .tokudb file is so small compared to the other files obtained in the previous approaches that it almost goes unnoticed when looking at the disk space use:

/dev/mapper/vg0-lvFernando1 50G 12G 36G 25% /media/lvFernando1

Curiously, that file retained the name/reference of the temporary table it used, even though the temporary table file description (#sql-7f51_1.frm) disappeared as well; only joinit.frm remained under the test database directory.

I decided to do a dump & restore of the whole database to see what that would result in this case:

$ mysqldump -S /tmp/mysql_sandbox5537.sock --user=msandbox --password=msandbox --all-databases > dump2.sql
$ ls -lh dump2.sql
-rw-r--r-- 1 fernando.laudares admin 8.1G Jul 31 15:07 /home/fernando.laudares/dump2.sql

Once it was restored we found the ibdata1 file back to its initial default size and the disk space used by the datadir was down to a scant 1.1G. Curiously (again), the dump & restore procedure did fixed the name of the TokuDB table file:

-rwxrwx--x 1 fernando.laudares admin 980M Jul 31 15:36 _test_joinit_main_21_1_19_B_0.tokudb

The process of converting the resident InnoDB table to TokuDB was faster than compressing it with ROW_FORMAT=Compressed and resulted in a much smaller file. This is not to say that using TokuDB is the best solution but to point that the process can be accomplished in less time and make use of lesser extra disk space than InnoDB does, which can simply come up handy if you don’t have much space left. Also, remember the test was done with a not very large (and simple in structure) test table; you may need to check if yours can be converted to TokuDB and what changes to indexes you might need to do (if any).

#6) Expanding the shared tablespace

As previously mentioned, I’ve been using the default value of  “ibdata1:10M:autoextend” for innodb_data_file_path for my tests. If you have another partition with unused space available and you look only for a solution to the “running out of disk space” problem and don’t mind keeping your big table inside the shared tablespace, then you can expand it. That can be done by means of adding a second ibdata file to the tablespace definition. Of course, that only works if you create it outside the partition already hosting ibdata1.

To do so, you need to stop MySQL and verify the size of the ibdata1 file you have (list size in bytes as “ls -lh” will round it here):

$ ls -l /media/lvFernando1/data/ibdata1
-rw-r----- 1 fernando.laudares admin 10957619200 Aug 1 16:20 /media/lvFernando1/data/ibdata1

I had available space on the /media/lvFernando2 partition so I decided to have ibdata2 created there. To do so, I’ve added the following 2 lines to my.cnf:

innodb_data_home_dir =
innodb_data_file_path=/media/lvFernando1/data/ibdata1:10957619200;/media/lvFernando2/tmp/ibdata2:10M:autoextend

Then I restarted MySQL and confirmed that ibdata2 was created there upon initialization:

$ ls -lh /media/lvFernando2/tmp/
total 10M
-rw-rw---- 1 fernando.laudares admin 10M Aug 1 16:30 ibdata2

Three important things to note on this approach:

  1. There can be only one ibdata file listed in innodb_data_file_path configured with “autoextend” – the last one in the list.
  2. You need to redefine ibdata1 in innodb_data_file_path using it’s current size. Any other size won’t work.
  3. From the moment you redefined ibdata1 with it’s current size and added ibdata2 configured with “autoexend”, ibdata1 won’t grow any bigger; the common tablespace will be increased through ibdata2.

What the 3rd point really means in practice is that you need to plan this maneuver accordingly: if you still have space available in the partition that hosts ibdata1 and you want to use it first then you need to delay these changes until you get to that point. From the moment you add ibdata2 to the common tablespace definition new data will start to be added into it.

Conclusion

The little experiment I did helped explain how some of the solutions proposed in my previous post for moving an InnoDB table outside the shared tablespace would work in practice and, most importantly, how much more disk space would be needed along the process. It was interesting to see that none of them made use of the tmpdir when they created temporary tables to intermediate the table conversion process; those were always created on the main datadir. The test table I’ve used had only a bit more than 10G, far away from a 1TB-sized table, so the results I’ve got may differ from what you would find in a larger environment.

As complementary information, MySQL 5.6 allows for the creation of InnoDB tables having their private tablespaces living outside the datadir (but that must be defined at table creation time and can’t be changed by means of an ALTER TABLE) as well as transportable tablespaces (which allow for the copy of private tablespaces from one database server to another).

The post When (and how) to move an InnoDB table outside the shared tablespace appeared first on MySQL Performance Blog.

Aug
22
2014
--

When (and how) to move an InnoDB table outside the shared tablespace

In my last post, “A closer look at the MySQL ibdata1 disk space issue and big tables,” I looked at the growing ibdata1 problem under the perspective of having big tables residing inside the so-called shared tablespace. In the particular case that motivated that post, we had a customer running out of disk space in his server who was looking for a way to make the ibdata1 file shrink. As you may know, that file (or, as explained there, the set of ibdata files composing the shared tablespace) stores all InnoDB tables created when innodb_file_per_table is disabled, but also other InnoDB structures, such as undo logs and data dictionary.

For example, when you run a transaction involving InnoDB tables, MySQL will first write all the changes it triggers in an undo log, for the case you later decide to “roll them back”. Long standing, uncommited transactions are one of the causes for a growing ibdata file. Of course, if you have innodb_file_per_table disabled then all your InnoDB tables live inside it. That was what happened in that case.

So, how do you move a table outside the shared tablespace and change the storage engine it relies on? As importantly, how does that affects disk space use? I’ve explored some of the options presented in the previous post and now share my findings with you below.

The experiment

I created a very simple InnoDB table inside the shared tablespace of a fresh installed Percona Server 5.5.37-rel35.1 with support for TokuDB and configured it with a 1GB buffer pool. I’ve used a 50G partition to host the ‘datadir’ and another one for ‘tmpdir’:

Filesystem                 Size Used Avail Use% Mounted on
/dev/mapper/vg0-lvFernando1 50G 110M  47G   1%  /media/lvFernando1  # datadir
/dev/mapper/vg0-lvFernando2 50G  52M  47G   1%  /media/lvFernando2  # tmpdir

Here’s the table structure:

CREATE TABLE `joinit` (
  `i` int(11) NOT NULL AUTO_INCREMENT,
  `s` varchar(64) DEFAULT NULL,
  `t` time NOT NULL,
  `g` int(11) NOT NULL,
  PRIMARY KEY (`i`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1;

I populated it with 134 million rows using the following routine:

INSERT INTO joinit VALUES (NULL, uuid(), time(now()),  (FLOOR( 1 + RAND( ) *60 )));
INSERT INTO joinit SELECT NULL, uuid(), time(now()),  (FLOOR( 1 + RAND( ) *60 )) FROM joinit;  # repeat until you reach your target number of rows

which resulted in the table below:

mysql> show table status from test like 'joinit'G
*************************** 1. row ***************************
Name: joinit
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 134217909
Avg_row_length: 72
Data_length: 9783214080
Max_data_length: 0
Index_length: 0
Data_free: 1013972992
Auto_increment: 134872552
Create_time: 2014-07-30 20:42:42
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.00 sec)

The resulting ibdata1 file was showing to have 11G, which accounted in practice for 100% of the datadir partition use then. What follows next is a few experiences I did by converting that table to use a different storage engine, moving it outside the shared tablespace, compressing it, and dumping and restoring the database back to see the effects in disk space use. I haven’t timed how long running each command took and focused mostly on the generated files size. As a bonus, I’ve also looked at how to extend the shared table space by adding an extra ibdata file.

#1) Converting to MyISAM

Technical characteristics and features apart, MyISAM tables are know to occupy less disk space than InnoDB’s ones. How much less depends on the actual table structure. Here I made the conversion in the most simplest way:

mysql> ALTER TABLE test.joinit ENGINE=MYISAM;

which created the following files (the .frm file already existed):

$ ls -lh /media/lvFernando1/data/test/
total 8.3G
-rw-rw---- 1 fernando.laudares admin 8.5K Jul 31 16:21 joinit.frm
-rw-rw---- 1 fernando.laudares admin 7.0G Jul 31 16:27 joinit.MYD
-rw-rw---- 1 fernando.laudares admin 1.3G Jul 31 16:27 joinit.MYI

The resulting MyISAM files amounted for an additional 8.3G of disk space use:

/dev/mapper/vg0-lvFernando1 50G 19G 29G 40% /media/lvFernando1

I was expecting smaller files but, of course, the result depends largely on the data types of the columns composing the table. The problem (or the consequence) is that we end up with close to the double of the initial disk space being used:

As it happens with the other solutions presented in this section that migrate the target table outside the shared tablespace, the common/safest way to reclaim the freed (unused) space inside the ibdata1 file back to the operating system is by doing a dump & restore of the full database.

There’s an alternative approach with MyISAM though, which doesn’t involve dump & restore and only requires a MySQL restart. However, you need to convert all InnoDB tables to MyISAM, stop MySQL, delete all ib* files (there should be no remaining .ibd files after you’ve converted all InnoDB tables to MyISAM), and then restart MySQL again. Upon MySQL restart, ibdata1 will be re-created with it’s default initial size (more on this below). You can then convert the MyISAM tables back to InnoDB and if you have innodb_file_per_table enabled this time then the tables will be created with their own private tablespace file.

#2) Exporting the table to a private tablespace

Once you have innodb_file_per_table enabled you can move a table residing inside ibdata1 to it’s private tablespace (it’s own .ibd file) by either running ALTER TABLE or OPTIMIZE TABLE. Both commands create a “temporary” (though InnoDB, not MyISAM) table with it’s own tablespace file inside the database directory (and not in the tmpdir, as I believed it would happen), the rows from the target table being copied over there.

Here’s showing the temporary table (#sql-4f10_1) that was created while the process was still ongoing:

$ ls -lh /media/lvFernando1/data/test
total 2.2G
-rw-rw---- 1 fernando.laudares admin 8.5K Jul 30 20:42 joinit.frm
-rw-rw---- 1 fernando.laudares admin 8.5K Jul 30 23:05 #sql-4f10_1.frm
-rw-rw---- 1 fernando.laudares admin 2.2G Jul 30 23:12 #sql-4f10_1.ibd

and the resulting .ibd file from when the process completed:

$ ls -lh /media/lvFernando1/data/test
total 9.3G
-rw-rw---- 1 fernando.laudares admin 8.5K Jul 30 23:05 joinit.frm
-rw-rw---- 1 fernando.laudares admin 9.3G Jul 30 23:35 joinit.ibd

Note that the new joinit.ibd file is about 9.3G. Again, the process resulted in the use of extra disk space:

/dev/mapper/vg0-lvFernando1 50G 20G 28G 42% /media/lvFernando1

#3) Dump and restore: looking at the disk space use

As pointed, the way to reclaim unused disk space inside ibdata1 back to the file system (and consequently making it shrink) is by dumping the full database to a text file and then restoring it back. I’ve started by doing a simple full mysqldump encompassing all databases:

$ mysqldump -S /tmp/mysql_sandbox5537.sock --user=msandbox --password=msandbox --all-databases > dump.sql

which created the following file:

-rw-r--r-- 1 fernando.laudares admin 8.1G Jul 31 00:02 dump.sql

I’ve then stopped MySQL, wiped out the full datadir and used the script mysql_install_db to (re-)create the system tables (though that’s not needed, I did it to level comparisons; once you have all InnoDB tables out of the system tablespace you can simply delete all ib* files along any .ibd and .frm files for related InnoDB tables), started MySQL again, and finally restored the backup:

$ mysql -S /tmp/mysql_sandbox5537.sock --user=msandbox --password=msandbox < dump.sql

This resulted in:

/dev/mapper/vg0-lvFernando1 50G 9.4G 38G 21% /media/lvFernando1

and:

-rw-rw---- 1 fernando.laudares admin 18M Jul 31 05:05 /media/lvFernando1/data/ibdata1

ibdata1 did shrink in the process (returning to the default* initial size) and we recovered back a couple of gigabytes that were not being used by the single InnoDB table that used to live inside it.

* The default value for innodb_data_file_path for the Percona Server release we used on the tests is “ibdata1:10M:autoextend”. However, the manual states that “the default behavior is to create a single auto-extending data file, slightly larger than 10MB“. Given that innodb_autoextend_increment is set to 8M by default what we get in practice is an initialized ibdata1 file of 18M, as seen above.

#4) Compressing the table

Back to the original scenario, with the test table still living in the shared tablespace, I’ve enabled innodb_file_per_table, set Barracuda as the innodb_file_format and then ran:

mysql> ALTER TABLE test.joinit ENGINE=INNODB ROW_FORMAT=Compressed;

This resulted in an .ibd file with half the size of an uncompressed one:

-rw-rw---- 1 fernando.laudares admin 4.7G Jul 31 13:59 joinit.ibd

The process was a lengthy one but resulted in considerable less disk space use than converting the table to MyISAM:

/dev/mapper/vg0-lvFernando1 50G 20G 28G 42% /media/lvFernando1

#5) Converting the table to TokuDB

I’ve used TokuDB version 7.1.6 (not the latest one), which was bundled in Percona Server 5.5.37-rel35.1.

mysql> ALTER TABLE test.joinit ENGINE=TOKUDB;

As was the case with converting the InnoDB table to it’s own tablespace, a temporary table was created to intermediate the process, with the table definition residing in the test database directory:

$ ls -lh /media/lvFernando1/data/test/
total 24K
-rw-r----- 1 fernando.laudares admin 8.5K Jul 31 14:22 joinit.frm
-rw-rw---- 1 fernando.laudares admin 8.5K Jul 31 14:28 #sql-7f51_1.frm

However, TokuDB created a temporary file in the main datadir to copy the data into, show here at some point during the process:

-rwxrwx--x 1 fernando.laudares admin 32K Jul 31 14:28 _test_sql_7f51_1_main_a_2_19.tokudb
-rwxrwx--x 1 fernando.laudares admin 16K Jul 31 14:28 _test_sql_7f51_1_status_a_1_19.tokudb
-rw------- 1 fernando.laudares admin 528M Jul 31 14:29 tokuldNk5W4v

Once the process completed the file tokuldNk5W4v disappeared and the data ended up in the _test_sql_7f51_1_main_a_2_19.tokudb file:

-rwxrwx--x 1 fernando.laudares admin 1.1G Jul 31 14:32 _test_sql_7f51_1_main_d_1_19_B_0.tokudb

The new .tokudb file is so small compared to the other files obtained in the previous approaches that it almost goes unnoticed when looking at the disk space use:

/dev/mapper/vg0-lvFernando1 50G 12G 36G 25% /media/lvFernando1

Curiously, that file retained the name/reference of the temporary table it used, even though the temporary table file description (#sql-7f51_1.frm) disappeared as well; only joinit.frm remained under the test database directory.

I decided to do a dump & restore of the whole database to see what that would result in this case:

$ mysqldump -S /tmp/mysql_sandbox5537.sock --user=msandbox --password=msandbox --all-databases > dump2.sql
$ ls -lh dump2.sql
-rw-r--r-- 1 fernando.laudares admin 8.1G Jul 31 15:07 /home/fernando.laudares/dump2.sql

Once it was restored we found the ibdata1 file back to its initial default size and the disk space used by the datadir was down to a scant 1.1G. Curiously (again), the dump & restore procedure did fixed the name of the TokuDB table file:

-rwxrwx--x 1 fernando.laudares admin 980M Jul 31 15:36 _test_joinit_main_21_1_19_B_0.tokudb

The process of converting the resident InnoDB table to TokuDB was faster than compressing it with ROW_FORMAT=Compressed and resulted in a much smaller file. This is not to say that using TokuDB is the best solution but to point that the process can be accomplished in less time and make use of lesser extra disk space than InnoDB does, which can simply come up handy if you don’t have much space left. Also, remember the test was done with a not very large (and simple in structure) test table; you may need to check if yours can be converted to TokuDB and what changes to indexes you might need to do (if any).

#6) Expanding the shared tablespace

As previously mentioned, I’ve been using the default value of  “ibdata1:10M:autoextend” for innodb_data_file_path for my tests. If you have another partition with unused space available and you look only for a solution to the “running out of disk space” problem and don’t mind keeping your big table inside the shared tablespace, then you can expand it. That can be done by means of adding a second ibdata file to the tablespace definition. Of course, that only works if you create it outside the partition already hosting ibdata1.

To do so, you need to stop MySQL and verify the size of the ibdata1 file you have (list size in bytes as “ls -lh” will round it here):

$ ls -l /media/lvFernando1/data/ibdata1
-rw-r----- 1 fernando.laudares admin 10957619200 Aug 1 16:20 /media/lvFernando1/data/ibdata1

I had available space on the /media/lvFernando2 partition so I decided to have ibdata2 created there. To do so, I’ve added the following 2 lines to my.cnf:

innodb_data_home_dir =
innodb_data_file_path=/media/lvFernando1/data/ibdata1:10957619200;/media/lvFernando2/tmp/ibdata2:10M:autoextend

Then I restarted MySQL and confirmed that ibdata2 was created there upon initialization:

$ ls -lh /media/lvFernando2/tmp/
total 10M
-rw-rw---- 1 fernando.laudares admin 10M Aug 1 16:30 ibdata2

Three important things to note on this approach:

  1. There can be only one ibdata file listed in innodb_data_file_path configured with “autoextend” – the last one in the list.
  2. You need to redefine ibdata1 in innodb_data_file_path using it’s current size. Any other size won’t work.
  3. From the moment you redefined ibdata1 with it’s current size and added ibdata2 configured with “autoexend”, ibdata1 won’t grow any bigger; the common tablespace will be increased through ibdata2.

What the 3rd point really means in practice is that you need to plan this maneuver accordingly: if you still have space available in the partition that hosts ibdata1 and you want to use it first then you need to delay these changes until you get to that point. From the moment you add ibdata2 to the common tablespace definition new data will start to be added into it.

Conclusion

The little experiment I did helped explain how some of the solutions proposed in my previous post for moving an InnoDB table outside the shared tablespace would work in practice and, most importantly, how much more disk space would be needed along the process. It was interesting to see that none of them made use of the tmpdir when they created temporary tables to intermediate the table conversion process; those were always created on the main datadir. The test table I’ve used had only a bit more than 10G, far away from a 1TB-sized table, so the results I’ve got may differ from what you would find in a larger environment.

As complementary information, MySQL 5.6 allows for the creation of InnoDB tables having their private tablespaces living outside the datadir (but that must be defined at table creation time and can’t be changed by means of an ALTER TABLE) as well as transportable tablespaces (which allow for the copy of private tablespaces from one database server to another).

The post When (and how) to move an InnoDB table outside the shared tablespace appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com