Apr
12
2023
--

Don’t Auto pt-online-schema-change for Tables With Foreign Keys

pt-online-schema-change

During the early stages of my career, I was captivated by the theories and concepts surrounding foreign keys and how they empowered us to maintain data integrity. However, in practical application, I have found them to be quite challenging to manage. I am sure you’re a champion DBA, but I keep my distance from foreign keys.

With that short story as a background, this post aims to address the dilemma that arises when utilizing the pt-online-schema-change tool on tables that contain foreign keys.

We already know what one of the most used Percona tools pt-online-schema-change is and how pt-online-schema-change handles foreign keys.

When utilizing the pt-online-schema-change tool to alter a table, such as renaming the original table to be replaced as a new one, it’s important to consider how foreign keys are affected. The tool offers two options, rebuild_constraints, and drop_swap, to ensure that foreign keys continue to reference the correct table during the alteration process. Here are the high-level steps involved:

  1. Drop and re-add foreign key constraints (rebuild_constraints)
  • Rename Current Table T1 to T1_old (FKs are pointing to T1_old now)
  • Rename New table with changes, _T1_new to T1 (T1 doesn’t have any tables referencing it)
  • ALTER on child tables to adjust the foreign key to point to T1 rather than T1_old.
  1. Drop the original table and rename the new table as original (drop_swap)
  • Disable foreign key checks for the session (FOREIGN_KEY_CHECKS=0)
  • Drop the original table
  • Rename the new table _T1_new to T1

It’s worth noting that when using the “drop_swap” option, toward the end of the process, the child table’s foreign key will be pointing to the updated table. However, this approach carries some risks due to the following facts:

  • For a short time between dropping the original table and renaming the temporary table, the table to be altered simply does not exist.
  • In the event of an error during the rename operation, rollback is impossible, as the original table has already been dropped. This can result in data loss or other unintended consequences.

The pt-online-schema-change tool also provides a third option, “auto.” When selecting “auto,” the tool makes a decision between “rebuild_constraints” and “drop_swap” based on the size of the child table involved in the alteration process. This allows for flexibility in choosing the appropriate option based on the specific scenario and the size of the affected child table.

Here, pt-online-schema-change knows the time taken for the row copy operation from the old table to the new table. It can estimate if the child table can be altered within –chunk-time (that’s considered fast). If child table ALTERs are fast, pt-online-schema-change uses rebuild_constraints; otherwise, use the drop_swap approach.

To error is human, and hence I could be wrong. Luckily the Percona tools are open source. I encourage you to review the source at the pt-online-schema-change’s git. Following is the code block that describes the “auto” behavior and then calls the determine_alter_fk_method.

# XXX Auto-choose the alter fk method BEFORE swapping/renaming tables
# else everything will break because if drop_swap is chosen, then we
# most NOT rename tables or drop the old table.
if ( $alter_fk_method eq 'auto' ) {
# If chunk time is set, then use the average rate of rows/s
# from copying the orig table to determine the max size of
# a child table that can be altered within one chunk time.
# The limit is a fudge factor. Chunk time won't be set if
# the user specified --chunk-size=N on the cmd line, in which
# case the max child table size is their specified chunk size
# times the fudge factor.
my $max_rows = $o->get('dry-run') ? $o->get('chunk-size') * $limit
: $chunk_time && $avg_rate ? $avg_rate * $chunk_time * $limit
: $o->get('chunk-size') * $limit;
PTDEBUG && _d('Max allowed child table size:', $max_rows);

$alter_fk_method = determine_alter_fk_method(
child_tables => $child_tables,
max_rows => $max_rows,
Cxn => $cxn,
OptionParser => $o,
);

Now, the tool will use drop_swap for the child tables which are larger in data size or a number of rows. 

I want to bring something important to your attention, and that is rebuilding the constraints will be slower for large tables unless it is done “ONLINE.” The online operation will not really need the data copy, and it will be faster.

The documentation reads:

    The INPLACE algorithm is supported when foreign_key_checks is disabled. Otherwise, only the COPY algorithm is supported.

This has already been identified and improved in the pt-online-schema-change version 3.3.1, as noted in the bug report https://jira.percona.com/browse/PT-1327

Thus, in my opinion, while using pt-online-schema-change, it is both safe and efficient to use “rebuild_constraints” as long as the tool is version 3.3.1 or higher. Avoid using “drop_swap,” and surely it makes less sense to use “auto.” Don’t “auto” the pt-online-schema-change for tables with foreign keys; understand the options and take control in your hands. That said, it is really easy to upgrade to the latest version of Percona Toolkit.

Percona Toolkit

Apr
05
2022
--

Hidden Cost of Foreign Key Constraints in MySQL

Hidden Cost of Foreign Key Constraints in MySQL

Hidden Cost of Foreign Key Constraints in MySQLDo you wonder if MySQL tells you the truth about writes to tables that have foreign key constraints? The situation is complex, and getting visibility on what is really happening can be a problem.

I found this issue intriguing and decided to share and highlight some examples.

Query Plan

Let us take this example table:

CREATE TABLE `product` (
  `category` int NOT NULL,
  `id` int NOT NULL,
  `price` decimal(10,0) DEFAULT NULL,
  PRIMARY KEY (`category`,`id`)
) ENGINE=InnoDB;

We want to know how costly an example UPDATE against this table will be:

mysql > EXPLAIN update product set id=id+1 where id=65032158 and category=3741760\G
*************************** 1. row ***************************
           id: 1
  select_type: UPDATE
        table: product
   partitions: NULL
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: const,const
         rows: 1
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

This shows that just one table and one row will be considered. This looks right as we’re using a primary key lookup, and one row matches:

mysql > select * from product where id=65032158;
+----------+----------+-------+
| category | id       | price |
+----------+----------+-------+
|  3741760 | 65032158 |     2 |
+----------+----------+-------+
1 row in set (0.02 sec)

In our case, this was not the entire truth as the table has FK relationships:

mysql > SELECT TABLE_NAME,COLUMN_NAME,CONSTRAINT_NAME, REFERENCED_TABLE_NAME,REFERENCED_COLUMN_NAME from INFORMATION_SCHEMA.KEY_COLUMN_USAGE where REFERENCED_TABLE_NAME='product' AND REFERENCED_TABLE_SCHEMA='db1'\G
*************************** 1. row ***************************
            TABLE_NAME: product_order
           COLUMN_NAME: product_category
       CONSTRAINT_NAME: product_order_ibfk_1
 REFERENCED_TABLE_NAME: product
REFERENCED_COLUMN_NAME: category
*************************** 2. row ***************************
            TABLE_NAME: product_order
           COLUMN_NAME: product_id
       CONSTRAINT_NAME: product_order_ibfk_1
 REFERENCED_TABLE_NAME: product
REFERENCED_COLUMN_NAME: id
2 rows in set (0.01 sec)

The related table has a defined ON UPDATE CASCADE action linked to our table:

CREATE TABLE `product_order` (
  `no` int NOT NULL AUTO_INCREMENT,
  `product_category` int NOT NULL,
  `product_id` int NOT NULL,
  `customer_id` int NOT NULL,
  PRIMARY KEY (`no`),
  KEY `product_category` (`product_category`,`product_id`),
  KEY `customer_id` (`customer_id`),
  CONSTRAINT `product_order_ibfk_1` FOREIGN KEY (`product_category`, `product_id`) REFERENCES `product` (`category`, `id`) ON DELETE RESTRICT ON UPDATE CASCADE,
  CONSTRAINT `product_order_ibfk_2` FOREIGN KEY (`customer_id`) REFERENCES `customer` (`id`)
) ENGINE=InnoDB;

Therefore, the EXPLAIN plan completely fails to recognize this fact, and the plan tries to convince us that such an update will only change one row in our database.

Another typical method to analyze slow queries is checking the per-session status handlers. In this case, it looks like this:

mysql > flush status; update product set id=id+1 where id=65032158 and category=3741760\G
Query OK, 0 rows affected (0.00 sec)

Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql > show status like 'handler%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 2     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 2     |
| Handler_mrr_init           | 0     |
| Handler_prepare            | 2     |
| Handler_read_first         | 0     |
| Handler_read_key           | 1     |
| Handler_read_last          | 0     |
| Handler_read_next          | 0     |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_next      | 0     |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_update             | 1     |
| Handler_write              | 0     |
+----------------------------+-------+
18 rows in set (0.01 sec)

Handler_update, as well as the Rows Changed information in the query outcome output, are not taking changes in referencing product_order table into account either. Without additional checks, we don’t even know that additional work has been done! 

Monitoring

Let’s see how foreign constraints can impact monitoring database activities.

We already know monitoring Handler_update won’t work as expected. Let’s check the InnoDB engine-related counter (on an idle MySQL instance as this is a global only counter):

mysql > show status like 'Innodb_rows_updated';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| Innodb_rows_updated | 21369 |
+---------------------+-------+
1 row in set (0.00 sec)

Here is our UPDATE and how many data rows it really changes:

mysql > select product_id,count(*) from product_order where product_id in (65032159,65032160) group by product_id;
+------------+----------+
| product_id | count(*) |
+------------+----------+
|   65032159 |      897 |
+------------+----------+
1 row in set (0.02 sec)

mysql > update product set id=id+1 where id=65032159 and category=3741760\G
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql > select product_id,count(*) from product_order where product_id in (65032159,65032160) group by product_id;
+------------+----------+
| product_id | count(*) |
+------------+----------+
|   65032160 |      897 |
+------------+----------+
1 row in set (0.01 sec)

mysql > show status like 'Innodb_rows_updated';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| Innodb_rows_updated | 22267 |
+---------------------+-------+
1 row in set (0.00 sec)

mysql > select 22267-21369;
+-------------+
| 22267-21369 |
+-------------+
|         898 |
+-------------+
1 row in set (0.00 sec)

This confirms that InnoDB exposes the real number of rows updated correctly here.

Child table locks are also exposed for active transactions with SHOW ENGINE INNODB STATUS (after enabling innodb_status_output_locks).

What about Performance Schema, another popular method to monitor the database? 

mysql > truncate `performance_schema`.`table_io_waits_summary_by_table`;
Query OK, 0 rows affected (0.00 sec)

mysql > select OBJECT_NAME,COUNT_UPDATE,COUNT_WRITE,COUNT_READ,COUNT_DELETE,COUNT_INSERT from `performance_schema`.`table_io_waits_summary_by_table` where OBJECT_NAME like "prod%";
+---------------+--------------+-------------+------------+--------------+--------------+
| OBJECT_NAME   | COUNT_UPDATE | COUNT_WRITE | COUNT_READ | COUNT_DELETE | COUNT_INSERT |
+---------------+--------------+-------------+------------+--------------+--------------+
| product       |            0 |           0 |          0 |            0 |            0 |
| product_order |            0 |           0 |          0 |            0 |            0 |
+---------------+--------------+-------------+------------+--------------+--------------+
2 rows in set (0.00 sec)

mysql > update product set id=id+1 where id=65032159 and category=3741760\G
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql > select OBJECT_NAME,COUNT_UPDATE,COUNT_WRITE,COUNT_READ,COUNT_DELETE,COUNT_INSERT from `performance_schema`.`table_io_waits_summary_by_table` where OBJECT_NAME like "prod%";
+---------------+--------------+-------------+------------+--------------+--------------+
| OBJECT_NAME   | COUNT_UPDATE | COUNT_WRITE | COUNT_READ | COUNT_DELETE | COUNT_INSERT |
+---------------+--------------+-------------+------------+--------------+--------------+
| product       |            1 |           1 |          1 |            0 |            0 |
| product_order |            0 |           0 |          0 |            0 |            0 |
+---------------+--------------+-------------+------------+--------------+--------------+
2 rows in set (0.00 sec)

mysql > select product_id,count(*) from product_order where product_id in (65032159,65032160) group by product_id;
+------------+----------+
| product_id | count(*) |
+------------+----------+
|   65032160 |      897 |
+------------+----------+
1 row in set (0.02 sec)

mysql > select OBJECT_NAME,COUNT_UPDATE,COUNT_WRITE,COUNT_READ,COUNT_DELETE,COUNT_INSERT from `performance_schema`.`table_io_waits_summary_by_table` where OBJECT_NAME like "prod%";
+---------------+--------------+-------------+------------+--------------+--------------+
| OBJECT_NAME   | COUNT_UPDATE | COUNT_WRITE | COUNT_READ | COUNT_DELETE | COUNT_INSERT |
+---------------+--------------+-------------+------------+--------------+--------------+
| product       |            1 |           1 |          1 |            0 |            0 |
| product_order |            0 |           0 |      54028 |            0 |            0 |
+---------------+--------------+-------------+------------+--------------+--------------+
2 rows in set (0.00 sec)

Unfortunately, Performance Schema completely missed what happened due to the Foreign Key constraint! I think it is a serious issue and so I reported it here: https://bugs.mysql.com/bug.php?id=106012

Summary

I hope I was able to draw your attention to the need to exercise caution when investigating DML queries and system load when Foreign Key constraints are used! Maybe you were surprised that a simple single row update or delete required so much time? It could be that under the hood MySQL changed thousands of rows and hid this fact from you!

Mar
21
2017
--

Dropping the Foreign Key Constraint Using pt-online-schema-change

pt-online-schema-change

Foreign KeyIn this blog post, we’ll look at how to get rid of the unused Foreign Key (FK) constraint and/or related columns/keys with the help of pt-online-schema-change and the power of its plugins.

Before we proceed, here is a useful blog post written by Peter Zaitsev on Hijacking Innodb Foreign Keys.

If you are trying to get rid of an unused foreign key (FK) constraint and related columns from versions older than MySQL 5.6, or tables that cannot be executed with

ALTER TABLE ... ALGORITHM=INPLACE

 because of limitations mentioned here (specifically, tables with 5.5 TIMESTAMP formats), you can use

pt-online-schema-change

.

For DROP FOREIGN KEY

constraint_name

  with

pt-online-schema-change

 requires specifying

_constraint_name

 rather than the real

constraint_name

. This is due to a limitation in MySQL:

pt-online-schema-change

 adds a leading underscore to foreign key constraint names when creating the new table. Here’s is a simple example of one such case:

CREATE TABLE `test3` (
  `Id` int(11) NOT NULL DEFAULT '0',
  `Firstname` varchar(32) DEFAULT NULL,
  `City` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`Id`),
  CONSTRAINT `FKID` FOREIGN KEY (`Id`) REFERENCES `test4` (`Id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=latin1

To drop the constraint, we are supposed to add an underscore prior to

constraint_name

 FKID:

[root@siddhant ~]# pt-online-schema-change --user=root --execute --set-vars=foreign_key_checks=0  --alter-foreign-keys-method=rebuild_constraints --alter="DROP FOREIGN KEY _FKID" D=apps02,t=test3 --socket=/tmp/mysql-master5520.sock
Operation, tries, wait:
analyze_table, 10, 1
copy_rows, 10, 0.25
……...Altering `apps02`.`test3`...
Creating new table...
Created new table apps02._test3_new OK.
Altering new table….... …….
2017-02-11T12:45:12 Dropped old table `apps02`.`_test3_old` OK.
2017-02-11T12:45:12 Dropping triggers...
2017-02-11T12:45:12 Dropped triggers OK.
Successfully altered `apps02`.`test3`.

Below is one case where if, for some reason, you already have an FK constraint with an underscore the above method of adding an additional underscore to already underscored _FK will fail with an error while dropping it:

Error altering new table `apps02`.`_test3_new`: DBD::mysql::db do failed: Error on rename of './apps02/_test3_new' to './apps02/#sql2-697-19' (errno: 152) [for Statement "ALTER TABLE `apps02`.`_test3_new` DROP FOREIGN KEY ___FKID"] at /usr/bin/pt-online-schema-change line 9069.

In such cases, we will have to make use of the

--plugin

  option used along with a file that calls the 

pt_online_schema_change_plugin

 class and a hook

after_alter_new_table

 to drop the FK constraint. For example, a table with the FK constraint with an underscore is:

CREATE TABLE `test` (
  `Id` int(11) NOT NULL DEFAULT '0',
  `Firstname` varchar(32) DEFAULT NULL,
  `City` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`Id`),
  CONSTRAINT `___fkId` FOREIGN KEY (`Id`) REFERENCES `test2` (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Here we have a table with foreign key

___fkid

 using three underscores. Our plugin for dropping the constraint should be as follows:

[root@siddhant ~]# cat ptosc_plugin_drop_fk.pl
package pt_online_schema_change_plugin;
use strict;
sub new {
   my ($class, %args) = @_;
   my $self = { %args };
   return bless $self, $class;
}
sub after_alter_new_table {
   my ($self, %args) = @_;
   my $new_tbl = $args{new_tbl};
   my $dbh     = $self->{cxn}->dbh;
   my $sth = $dbh->prepare("ALTER TABLE $new_tbl->{name} DROP FOREIGN KEY __fkId");
   $sth->execute();
}
1;

NOTE: DROP FOREIGN KEY CONSTRAINT in the plugin has one underscore less than original foreign key constraint, 

__fkId

 vs.

___fkId

. Also, the alter statement will be NOOP alter (i.e., 

--alter ="ENGINE=INNODB"

).

Here is the

pt-online-schema-change

 execution example with the plugin.

[root@siddhant ~]#  pt-online-schema-change --user=root --execute  --set-vars=foreign_key_checks=0  --alter-foreign-keys-method=rebuild_constraints --alter="ENGINE=INNODB" --plugin=/root/ptosc_plugin_drop_fk.pl  D=apps01,t=test --socket=/tmp/mysql-master5520.sock
Created plugin from /root/ptosc_plugin_drop_fk.pl.
Operation, tries, wait:
  analyze_table, 10, 1
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Altering `apps01`.`test`...
Creating new table...
Created new table apps01._test_new OK.
Altering new table...
Altered `apps01`.`_test_new` OK.
2017-02-11T11:26:14 Creating triggers...
2017-02-11T11:26:14 Created triggers OK.
2017-02-11T11:26:14 Copied rows OK.
2017-02-11T11:26:14 Swapping tables...
2017-02-11T11:26:14 Swapped original and new tables OK.
2017-02-11T11:26:14 Dropping old table...
2017-02-11T11:26:14 Dropped old table `apps01`.`_test_old` OK.
2017-02-11T11:26:14 Dropping triggers...
2017-02-11T11:26:14 Dropped triggers OK.
Successfully altered `apps01`.`test`.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com