Mar
14
2018
--

Basic Internal Troubleshooting Tools for MySQL Server Webinar: Q & A

In this blog post, I will provide answers to the Q&A from the Basic Internal Troubleshooting Tools for MySQL Server webinar.

First, I want to thank everybody for attending my February 15, 2018, webinar on troubleshooting tools for MySQL. The recording and slides for the webinar are available here. Below is the list of your questions that I was unable to answer fully during the webinar.

Q: How do we prevent the schema prefix from appearing in SHOW CREATE VIEW? This is causing an issue with restores on another server with a different DB. See the issue and a reproducible test case here: https://gist.github.com/anonymous/30cb138b598fec46be762789397796b6

A: I shortened the example in order to fit it in this blog:

mysql> create table t1(f1 int);
Query OK, 0 rows affected (3.47 sec)
mysql> create view v1 as select * from t1;
Query OK, 0 rows affected (0.21 sec)
mysql> show create view v1\G
*************************** 1. row ***************************
                View: v1
         Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `v1` AS select `t1`.`f1` AS `f1` from `t1`
character_set_client: utf8
collation_connection: utf8_general_ci
1 row in set (0.00 sec)
mysql> select * from information_schema.views where table_schema='test'\G
*************************** 1. row ***************************
       TABLE_CATALOG: def
        TABLE_SCHEMA: test
          TABLE_NAME: v1
     VIEW_DEFINITION: select `test`.`t1`.`f1` AS `f1` from `test`.`t1`
        CHECK_OPTION: NONE
        IS_UPDATABLE: YES
             DEFINER: root@localhost
       SECURITY_TYPE: DEFINER
CHARACTER_SET_CLIENT: utf8
COLLATION_CONNECTION: utf8_general_ci
1 row in set (0.00 sec)

The issue you experienced happened because even if you create a view as SELECT foo FROM table1;, it is stored as SELECT foo FROM your_schema.table1;. You can see it if you examine the view's *.frm file:

sveta@Thinkie:~/build/ps-5.7/mysql-test$ cat var/mysqld.1/data/test/v1.frm
TYPE=VIEW
query=select `test`.`t1`.`f1` AS `f1` from `test`.`t1`
md5=5840f59d1287385629fcb9b948e53d96
updatable=1
algorithm=0
definer_user=root
definer_host=localhost
suid=2
with_check_option=0
timestamp=2018-02-24 10:27:45
create-version=1
source=select * from t1
client_cs_name=utf8
connection_cl_name=utf8_general_ci
view_body_utf8=select `test`.`t1`.`f1` AS `f1` from `test`.`t1`

You cannot prevent the schema prefix from being stored. If you restore the view on a different server with a different database name, you should edit the view definition manually. If you already restored the view so that it points to a non-existent schema, just recreate it. A VIEW is metadata only and does not hold any data, so this operation is non-blocking and runs momentarily.
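For example, if the dump was restored into a database named test2 (a hypothetical name), you can simply recreate the view there:

mysql> use test2;
mysql> drop view if exists v1;
mysql> create view v1 as select * from t1;

Because an unqualified table name is expanded against the current default database, the stored definition now reads select `test2`.`t1`.`f1` AS `f1` from `test2`.`t1`.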

Q: What is thread/sql/compress_gtid_table in performance_schema.threads?

A: thread/sql/compress_gtid_table is the name of the instrument. You can read this and other instrument names as follows:

  • thread/ is a group of instruments. In this case, it is the group of instruments that are visible in the THREADS table.

  • thread/sql/ is the group of instruments that are part of the server kernel code. If you are not familiar with the MySQL source tree, download the source code tarball and check its content. The main components are:

    • sql – the server kernel

    • storage – where the storage engine code is located (storage/innobase is the InnoDB code, storage/myisam is the MyISAM code, and so on)

    • vio – input-output functions

    • mysys – code shared between all parts of the server

    • client – the client library and utilities

    • strings – functions to work with strings

    This is not a full list. For more information, consult the MySQL Internals Manual.

  • thread/sql/compress_gtid_table is the name of the particular instrument.

Unfortunately, there is no link to the source code for instrumented threads in the THREADS table, but we can easily find them in the sql directory. The function compress_gtid_table is defined in sql/rpl_gtid_persist.cc, and we can check the comments to find out what it is doing:

/**
  The main function of the compression thread.
  - compress the gtid_executed table when get a compression signal.

  @param p_thd   Thread requesting to compress the table

  @return
    @retval 0    OK. always, the compression thread will swallow any error
                 for going to wait for next compression signal until
                 it is terminated.
*/
extern "C" {
static void *compress_gtid_table(void *p_thd)
{

You can also find the description of mysql.gtid_executed compression in the User Reference Manual.

You can follow the same actions to find out what other MySQL threads are doing.
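For example, to check whether this particular thread is running on your server, you can query the THREADS table by instrument name (a minimal sketch):

SELECT THREAD_ID, NAME, TYPE, PROCESSLIST_STATE
  FROM performance_schema.threads
 WHERE NAME = 'thread/sql/compress_gtid_table';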

Q: How does a novice learn the core basics of MySQL? The documentation is very vast and surpasses my understanding right now. Are there any good intro books you can recommend for a system admin?

A: I learned MySQL a long time ago, and the book I can recommend was written for version 5.0: “MySQL 5.0 Certification Study Guide” by Paul DuBois, Stefan Hinz and Carsten Pedersen. The book is in two parts: one is devoted to SQL developers and explains how to run and tune queries; the second part is for DBAs and describes how to tune the MySQL server. I asked my colleagues to suggest more modern books for you, and this one is still on the list for many. It is an awesome book for beginners in any case; just note that MySQL has changed a lot since 5.0, and you will need to deepen your knowledge after you finish reading it.

Another book that was recommended is “MySQL” by Paul DuBois. It is written for beginners and has plenty of content. Paul DuBois has been working on (and continues to work on) the official MySQL documentation for many years, and knows MySQL in great detail.

Another book is “Murach’s MySQL” by Joel Murach, which is used as a course book in many colleges for “Introduction into Databases” type classes.

For System Administrators, you can read “Systems Performance: Enterprise and the Cloud” by Brendan Gregg. This book talks about how to tune operating systems for performance. This is one of the consistent tasks we have to do when administering MySQL. I also recommend that you study Brendan Gregg’s website, which is a great source of information for everyone who is interested in operating system performance tuning.

After you finish the books for novices, you can check out “High Performance MySQL, 3rd Edition” by Peter Zaitsev, Vadim Tkachenko, Baron Schwartz and “MySQL Troubleshooting” by Sveta Smirnova (yours truly =) ). These two books require at least basic MySQL knowledge, however.

Q: Does database migration go the same way? Do these tools work for migration as well?

A: The tools I discussed in this webinar are available for any version of MySQL/Percona/MariaDB server, and you may use them for migration. For example, it is always useful to compare the configuration (SHOW GLOBAL VARIABLES) on both the “old” and “new” servers; it helps if you observe performance drops on the “new” server. You can also check table definitions before and after migration. There are many more uses for these tools during the migration process.
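For example, one quick way to compare configurations is to dump SHOW GLOBAL VARIABLES from both servers into files and diff them (the host names below are placeholders):

mysql -h old_server -e "SHOW GLOBAL VARIABLES" > old_vars.txt
mysql -h new_server -e "SHOW GLOBAL VARIABLES" > new_vars.txt
diff old_vars.txt new_vars.txt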

Q: How can we take a backup of a single schema from a MySQL AWS instance without affecting the performance of applications? An AWS RDS instance, to be more clear. We cannot use mysqldump on the RDS instance in the current scenario.

A: You can connect to your RDS instance with mysqldump from your local machine, exactly like your other MySQL clients connect to it. Then you can collect a dump of a single database or table, or even use the --where option to limit the resulting set to only a portion of a table. Note that by default mysqldump is blocking, but if you back up solely transactional tables (InnoDB, TokuDB, MyRocks) you can run mysqldump with the --single-transaction option, which starts a transaction at the beginning of the backup job.
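A single-schema dump from RDS might look like this (the endpoint, user and database name are placeholders):

mysqldump -h myinstance.abc123.us-east-1.rds.amazonaws.com -u admin -p \
  --single-transaction mydatabase > mydatabase.sql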

Alternatively, you can use AWS Database Migration Service, which allows you to replicate your databases. Then you can take a backup of a single schema using whatever method you like.

Q: Why do some sites suggest turning off the Information Schema and Performance Schema? Is it important to keep them on, or turn them off?

A: You cannot turn off the Information Schema; it is always available.

Performance Schema in earlier versions (before 5.6.14) was resource-consuming even when idle. These limitations were fixed a long time ago, and you don’t need to keep it off, at least unless you hit some new bug.

Q: How do we handle a storage-level threshold if a data file grows and reaches the maximum threshold unnoticed? Can you please help with this question?

A: Do you mean what happens if the data file grows until the filesystem has no space left? In this case, clients receive the error "OS error code 28: No space left on device" until space is freed and mysqld can start functioning normally again. If mysqld can still write to the error log file (for example, if it is located on a different disk), you will see messages about error 28 in the error log file too.

Q: What are the performance bottlenecks when enabling performance_schema? Is there any benchmark we can have?

A: Just enabling Performance Schema in version 5.6 and up does not cause any performance issues. In version 5.7, it can also start with almost zero allocated memory, so it won’t affect your other buffers. Performance Schema causes an impact when you enable particular instruments, most of them being the ones whose names start with events_waits_*. I performed benchmarks on the effects of particular Performance Schema instruments and published them in this post.
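For reference, the events_waits_* names show up as consumers in the Performance Schema setup tables, and you can inspect or change their state at runtime (a sketch):

SELECT NAME, ENABLED
  FROM performance_schema.setup_consumers
 WHERE NAME LIKE 'events_waits%';

UPDATE performance_schema.setup_consumers
   SET ENABLED = 'NO'
 WHERE NAME LIKE 'events_waits%';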

Q: Can you suggest some tips for creating real-time dashboards, since we have a replication environment? It would be great if you could help us with building business-level dashboards.

A: This is a topic for yet another webinar or, better still, a tutorial. For starters, I recommend you check out the “MySQL Replication” dashboard in PMM and extend it using the metrics that you need.

Thanks for attending the webinar on internal troubleshooting tools for MySQL.

Apr
15
2015
--

Checking table definition consistency with mysqldiff

Data inconsistencies in replication environments are pretty common. There are lots of posts that explain how to fix them using pt-table-checksum and pt-table-sync. Usually we only care about the data, but from time to time we receive this question in support:

How can I check the table definition consistency between servers?

Replication also allows us to have different table definitions on the master and slaves. For example, there are cases where you need some indexes on slaves for querying purposes that are not really needed on the master. In other cases, those differences are just a mistake that needs to be fixed.

mysqldiff, included in Oracle’s MySQL Utilities, can help us find those differences and get the information we need to fix them. In this post I’m going to show you how to use it with an example.

Find table definition inconsistencies

mysqldiff allows us to find those inconsistencies by checking the differences between tables on the same server (different databases) or on different servers (also possibly in different databases). In this example I’m going to search for differences in table definitions between two different servers, server1 and server2.

The command line is pretty simple. This compares the tables in the “test” database:

mysqldiff --server1=user@host1 --server2=user@host2 test:test

If the database name is different:

mysqldiff --server1=user@host1 --server2=user@host2 testdb:anotherdb

If the table name is different:

mysqldiff --server1=user@host1 --server2=user@host2 testdb.table1:anotherdb.anothertable

Now I want to check the table definition consistency between two servers. The database’s name is “employees”:

# mysqldiff --force --server1=root:msandbox@127.0.0.1:21489 --server2=root:msandbox@127.0.0.1:21490 employees:employees
# WARNING: Using a password on the command line interface can be insecure.
# server1 on 127.0.0.1: ... connected.
# server2 on 127.0.0.1: ... connected.
# Comparing `employees` to `employees`                             [PASS]
# Comparing `employees`.`departments` to `employees`.`departments`   [FAIL]
# Object definitions differ. (--changes-for=server1)
#
--- `employees`.`departments`
+++ `employees`.`departments`
@@ -1,6 +1,6 @@
 CREATE TABLE `departments` (
   `dept_no` char(4) NOT NULL,
-  `dept_name` varchar(40) NOT NULL,
+  `dept_name` varchar(256) DEFAULT NULL,
   PRIMARY KEY (`dept_no`),
   UNIQUE KEY `dept_name` (`dept_name`)
 ) ENGINE=InnoDB DEFAULT CHARSET=latin1
# Comparing `employees`.`dept_emp` to `employees`.`dept_emp`       [PASS]
# Comparing `employees`.`dept_manager` to `employees`.`dept_manager`   [PASS]
# Comparing `employees`.`employees` to `employees`.`employees`     [FAIL]
# Object definitions differ. (--changes-for=server1)
#
--- `employees`.`employees`
+++ `employees`.`employees`
@@ -5,5 +5,6 @@
   `last_name` varchar(16) NOT NULL,
   `gender` enum('M','F') NOT NULL,
   `hire_date` date NOT NULL,
-  PRIMARY KEY (`emp_no`)
+  PRIMARY KEY (`emp_no`),
+  KEY `last_name` (`last_name`,`first_name`)
 ) ENGINE=InnoDB DEFAULT CHARSET=latin1
# Comparing `employees`.`salaries` to `employees`.`salaries`       [PASS]
# Comparing `employees`.`titles` to `employees`.`titles`           [PASS]
Compare failed. One or more differences found.

There are at least two differences: one in the departments table and another in the employees table. The output is similar to diff. By default the tool stops after finding the first difference; that’s why we use --force, to tell the tool to continue checking all the tables.

It shows us that in departments, dept_name is varchar(40) on server1 and varchar(256) on server2. The employees table has a KEY (last_name, first_name) on server2 that is not present on server1. Why is it taking server2 as the reference? Because of this line:

# Object definitions differ. (--changes-for=server1)

So, the changes shown in the diff are for server1. If you want server2 to be the one changed, with server1 used as the reference, then --changes-for=server2 would be needed.
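For example (a sketch based on the command above):

mysqldiff --force --changes-for=server2 --server1=root:msandbox@127.0.0.1:21489 --server2=root:msandbox@127.0.0.1:21490 employees:employees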

In some cases the diff output is not really useful; we actually need SQL statements to make the changes on the server. For that, we just need to add --difftype=sql to the command line:

# mysqldiff --force --difftype=sql --server1=root:msandbox@127.0.0.1:21489 --server2=root:msandbox@127.0.0.1:21490 employees:employees
[...]
# Comparing `employees`.`departments` to `employees`.`departments`   [FAIL]
# Transformation for --changes-for=server1:
ALTER TABLE `employees`.`departments`
  DROP INDEX dept_name,
  ADD UNIQUE INDEX dept_name (dept_name),
  CHANGE COLUMN dept_name dept_name varchar(256) NULL;
[...]
# Comparing `employees`.`employees` to `employees`.`employees`     [FAIL]
# Transformation for --changes-for=server1:
#
ALTER TABLE `employees`.`employees`
  DROP PRIMARY KEY,
  ADD PRIMARY KEY(`emp_no`),
  ADD INDEX last_name (last_name,first_name);

As we can see, the tool is not perfect. There are two problems here:

1. On the departments table, it drops a UNIQUE key that is present on both servers only to add it again: a waste of time and resources.

2. On the employees table, it drops and recreates the PRIMARY KEY, again something that is not needed at all.

I have created a bug report, but this also teaches us a good lesson: don’t just copy and paste commands without double-checking them first.
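In this particular case, hand-written statements that apply only the actual differences to server1 would look something like this (a sketch; verify against your own schemas before running anything):

ALTER TABLE `employees`.`departments`
  MODIFY COLUMN dept_name varchar(256) NULL;
ALTER TABLE `employees`.`employees`
  ADD INDEX last_name (last_name, first_name);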

What does mysqldiff run under the hood?

Mostly queries on INFORMATION_SCHEMA. These are the ones used to check for inconsistencies in the departments table:

SHOW CREATE TABLE `departments`;
SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE, AUTO_INCREMENT, AVG_ROW_LENGTH, CHECKSUM, TABLE_COLLATION, TABLE_COMMENT, ROW_FORMAT, CREATE_OPTIONS
  FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';
SELECT ORDINAL_POSITION, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE,
         COLUMN_DEFAULT, EXTRA, COLUMN_COMMENT, COLUMN_KEY
  FROM INFORMATION_SCHEMA.COLUMNS
  WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';
SELECT PARTITION_NAME, SUBPARTITION_NAME, PARTITION_ORDINAL_POSITION,
         SUBPARTITION_ORDINAL_POSITION, PARTITION_METHOD, SUBPARTITION_METHOD,
         PARTITION_EXPRESSION, SUBPARTITION_EXPRESSION, PARTITION_DESCRIPTION
  FROM INFORMATION_SCHEMA.PARTITIONS
  WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';
SELECT CONSTRAINT_NAME, COLUMN_NAME, REFERENCED_TABLE_SCHEMA,
         REFERENCED_TABLE_NAME, REFERENCED_COLUMN_NAME
  FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE
  WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments' AND
        REFERENCED_TABLE_SCHEMA IS NOT NULL;
SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE, AUTO_INCREMENT, AVG_ROW_LENGTH, CHECKSUM, TABLE_COLLATION, TABLE_COMMENT, ROW_FORMAT, CREATE_OPTIONS
  FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';
SELECT ORDINAL_POSITION, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE,
         COLUMN_DEFAULT, EXTRA, COLUMN_COMMENT, COLUMN_KEY
  FROM INFORMATION_SCHEMA.COLUMNS
  WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';
SELECT PARTITION_NAME, SUBPARTITION_NAME, PARTITION_ORDINAL_POSITION,
         SUBPARTITION_ORDINAL_POSITION, PARTITION_METHOD, SUBPARTITION_METHOD,
         PARTITION_EXPRESSION, SUBPARTITION_EXPRESSION, PARTITION_DESCRIPTION
  FROM INFORMATION_SCHEMA.PARTITIONS
  WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

As a summary, it checks partitions, row format, collation, constraints and so on.

Conclusion

There are different tools for different purposes. We can check data consistency with pt-table-checksum/pt-table-sync, and also the table definitions with mysqldiff.

The post Checking table definition consistency with mysqldiff appeared first on MySQL Performance Blog.

Dec
29
2014
--

How well does your table fit in the InnoDB buffer pool in MySQL 5.6+?

Some time ago, Peter Zaitsev posted a blog titled “How well does your table fits in innodb buffer pool?” He used some special INFORMATION_SCHEMA tables developed for Percona Server 5.1 to report how much of each InnoDB table and index resides in your buffer pool.

As Peter pointed out, you can use this view into the buffer pool to watch a buffer pool warm up with pages as you run queries. You can also use it for capacity planning. If you expect some tables need to be fully loaded in the buffer pool to be used efficiently, but the buffer pool isn’t large enough to hold them, then it’s time to increase the size of the buffer pool.

The problem, however, was that system tables change from version to version. Specifically, the INNODB_BUFFER_POOL_PAGES_INDEX table no longer exists in Percona Server 5.6, and the INNODB_INDEX_STATS table changed some column names in Percona Server 5.5.8, and the table no longer exists in Percona Server 5.6. So many of the comments on Peter’s blog rightly pointed out that the example query didn’t work on subsequent versions of Percona Server. And MySQL Community Edition at the time didn’t have the feature at all. They asked for an update to the blog post.

So here’s an updated, simplified query to report the content of your buffer pool, tested on the most recent versions.

Percona Server 5.1 and 5.5:

USE information_schema;
SET @page_size = @@innodb_page_size;
SET @bp_pages = @@innodb_buffer_pool_size/@page_size;
SELECT P.TABLE_NAME, P.PAGE_TYPE,
CASE WHEN P.INDEX_NAME IS NULL THEN NULL WHEN P.TABLE_NAME LIKE '`SYS_%' THEN P.INDEX_NAME WHEN P.INDEX_NAME <> 'PRIMARY' THEN 'SECONDARY' ELSE 'PRIMARY' END AS INDEX_TYPE,
COUNT(DISTINCT P.PAGE_NUMBER) AS PAGES,
ROUND(100*COUNT(DISTINCT P.PAGE_NUMBER)/@bp_pages,2) AS PCT_OF_BUFFER_POOL,
CASE WHEN P.TABLE_NAME IS NULL THEN NULL WHEN P.TABLE_NAME LIKE 'SYS_%' THEN NULL ELSE ROUND(100*COUNT(DISTINCT P.PAGE_NUMBER)/CASE P.INDEX_NAME WHEN 'PRIMARY' THEN TS.DATA_LENGTH/@page_size ELSE TS.INDEX_LENGTH/@page_size END, 2) END AS PCT_OF_INDEX
FROM INNODB_BUFFER_PAGE AS P
JOIN INNODB_SYS_TABLES AS T ON P.SPACE = T.SPACE
JOIN TABLES AS TS ON (T.SCHEMA, T.NAME) = (TS.TABLE_SCHEMA, TS.TABLE_NAME)
WHERE TS.TABLE_SCHEMA <> 'mysql'
GROUP BY TABLE_NAME, PAGE_TYPE, INDEX_TYPE;

MySQL 5.6 and 5.7 (this also works on Percona Server 5.6):

USE information_schema;
SET @page_size = @@innodb_page_size;
SET @bp_pages = @@innodb_buffer_pool_size/@page_size;
SELECT P.TABLE_NAME, P.PAGE_TYPE,
CASE WHEN P.INDEX_NAME IS NULL THEN NULL WHEN P.TABLE_NAME LIKE '`SYS_%' THEN P.INDEX_NAME WHEN P.INDEX_NAME <> 'PRIMARY' THEN 'SECONDARY' ELSE 'PRIMARY' END AS INDEX_TYPE,
COUNT(DISTINCT P.PAGE_NUMBER) AS PAGES,
ROUND(100*COUNT(DISTINCT P.PAGE_NUMBER)/@bp_pages,2) AS PCT_OF_BUFFER_POOL,
CASE WHEN P.TABLE_NAME IS NULL THEN NULL WHEN P.TABLE_NAME LIKE 'SYS_%' THEN NULL ELSE ROUND(100*COUNT(DISTINCT P.PAGE_NUMBER)/CASE P.INDEX_NAME WHEN 'PRIMARY' THEN TS.DATA_LENGTH/@page_size ELSE TS.INDEX_LENGTH/@page_size END, 2) END AS PCT_OF_INDEX
FROM INNODB_BUFFER_PAGE AS P
JOIN INNODB_SYS_TABLES AS T ON P.SPACE = T.SPACE
JOIN TABLES AS TS ON T.NAME = CONCAT(TS.TABLE_SCHEMA, '/', TS.TABLE_NAME)
WHERE TS.TABLE_SCHEMA <> 'mysql'
GROUP BY TABLE_NAME, PAGE_TYPE, INDEX_TYPE;

In both cases, the output looks something like the following (if I have read from a single table called test.foo):

+--------------+-------------------+------------+-------+--------------------+--------------+
| TABLE_NAME   | PAGE_TYPE         | INDEX_TYPE | PAGES | PCT_OF_BUFFER_POOL | PCT_OF_INDEX |
+--------------+-------------------+------------+-------+--------------------+--------------+
| NULL         | FILE_SPACE_HEADER | NULL       |     1 |               0.00 |         NULL |
| NULL         | IBUF_BITMAP       | NULL       |     1 |               0.00 |         NULL |
| NULL         | INODE             | NULL       |     1 |               0.00 |         NULL |
| `test`.`foo` | INDEX             | PRIMARY    |  2176 |               3.32 |        98.37 |
| `test`.`foo` | INDEX             | SECONDARY  |  2893 |               4.41 |        88.47 |
+--------------+-------------------+------------+-------+--------------------+--------------+

Unfortunately, the INFORMATION_SCHEMA tables report the total size of all secondary indexes for a table, but not the size of each index individually. Therefore this query shows the percent of index only for the primary index (which is also the clustered index, i.e. the table itself), and then for all other secondary indexes grouped together.

PERFORMANCE_SCHEMA solution

The PERFORMANCE_SCHEMA also includes some information about the contents of the buffer pool. The MySQL SYS Schema makes it easy to query this. But this view doesn’t calculate the percentage of each table in the buffer pool, nor the percentage of the buffer pool occupied by each table.

mysql> SELECT * FROM sys.innodb_buffer_stats_by_table;
+---------------+----------------------+------------+------------+-------+--------------+-----------+-------------+
| object_schema | object_name          | allocated  | data       | pages | pages_hashed | pages_old | rows_cached |
+---------------+----------------------+------------+------------+-------+--------------+-----------+-------------+
| test          | foo                  | 149.64 MiB | 106.19 MiB |  9577 |         9577 |      9577 |     1050490 |
| InnoDB System | SYS_TABLES           | 160.00 KiB | 91.24 KiB  |    10 |           10 |        10 |         594 |
| InnoDB System | SYS_INDEXES          | 128.00 KiB | 93.59 KiB  |     8 |            8 |         8 |        1345 |
| InnoDB System | SYS_COLUMNS          | 80.00 KiB  | 47.13 KiB  |     5 |            5 |         5 |         761 |
| InnoDB System | SYS_DATAFILES        | 48.00 KiB  | 16.40 KiB  |     3 |            3 |         3 |         246 |
| InnoDB System | SYS_FIELDS           | 48.00 KiB  | 16.02 KiB  |     3 |            3 |         3 |         377 |
| InnoDB System | SYS_FOREIGN          | 48.00 KiB  | 0 bytes    |     3 |            3 |         3 |           0 |
| InnoDB System | SYS_TABLESPACES      | 48.00 KiB  | 15.83 KiB  |     3 |            3 |         3 |         242 |
| InnoDB System | SYS_FOREIGN_COLS     | 16.00 KiB  | 0 bytes    |     1 |            1 |         1 |           0 |
. . .
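The sys view does not report percentages, but you can compute them yourself from its pages column, for example (a sketch; it derives the buffer pool page count from @@innodb_buffer_pool_size and @@innodb_page_size):

SELECT object_schema, object_name, pages,
       ROUND(100 * pages / (@@innodb_buffer_pool_size / @@innodb_page_size), 2) AS pct_of_buffer_pool
  FROM sys.innodb_buffer_stats_by_table
 ORDER BY pages DESC
 LIMIT 10;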

The post How well does your table fit in the InnoDB buffer pool in MySQL 5.6+? appeared first on MySQL Performance Blog.

Jan
14
2014
--

InnoDB file formats: Here is one pitfall to avoid

UPDATED: explaining the role of innodb_strict_mode and correcting introduction of innodb_file_format

Compressed tables are an example of an InnoDB feature that became available with the Barracuda file format, introduced with the InnoDB plugin. They can bring significant gains in raw performance and scalability: since the data is stored in a compressed format, the amount of memory and disk space necessary to hold it and move it around (disk/memory) is lower, making compressed tables attractive for servers equipped with SSD drives of smaller capacity.

The notion of “file formats” (defined by the variable innodb_file_format) was first introduced when InnoDB was still a plugin. The evolution of InnoDB has led to the development of new features, and some of them required the support of new on-disk data structures. That means those particular features (like compressed tables) only work with the newer file format. To make things clear, and to help manage compatibility issues when upgrading and (especially) downgrading MySQL, the original file format started being referred to as Antelope.

The default file format in MySQL 5.6 and the latest 5.5 releases is Antelope. Note this can be a bit confusing, as the first releases of 5.5 (until 5.5.7) shipped with the new file format as the default, a decision that was later reversed to assure maximum compatibility in replication configurations comprised of servers running different versions of MySQL. To be sure which file format is set as the default on your server, you can issue:

mysql> SHOW VARIABLES LIKE 'innodb_file_format';

The important lesson here, and what motivated me to write this post, is that the file format can only be defined for tablespaces, not tables in general. This is documented in the manual but maybe not entirely clear:

innodb_file_format: The file format to use for new InnoDB tables. Currently, Antelope and Barracuda are supported. This applies only for tables that have their own tablespace, so for it to have an effect, innodb_file_per_table must be enabled. The Barracuda file format is required for certain InnoDB features such as table compression.

Even if you configure your server with innodb_file_format=Barracuda and recreate the datadir and basic tables with the mysql_install_db script, the common tablespace will always use Antelope. So, to create tables in the new file format, it is imperative that you use innodb_file_per_table. Although this requirement is documented, what might be misleading here is the fact that no error is issued if you set the file format to Barracuda and create a new compressed table without having innodb_file_per_table set, only a couple of warnings, and only if you pay close attention. Here’s an example:

mysql> SET GLOBAL innodb_file_format=Barracuda;
Query OK, 0 rows affected (0.00 sec)

mysql> create table test.testA (id int) row_format=Compressed;
Query OK, 0 rows affected, 2 warnings (0.96 sec)

If you do choose to check the warnings, you’ll find:

mysql> show WARNINGS;
+---------+------+---------------------------------------------------------------+
| Level | Code | Message |
+---------+------+---------------------------------------------------------------+
| Warning | 1478 | InnoDB: ROW_FORMAT=COMPRESSED requires innodb_file_per_table. |
| Warning | 1478 | InnoDB: assuming ROW_FORMAT=COMPACT. |
+---------+------+---------------------------------------------------------------+
2 rows in set (0.00 sec)

This happens when innodb_strict_mode is turned OFF, as it usually is. If it were turned ON, the table creation would fail with the following error:

mysql> create table test.testA (id int) row_format=Compressed;
ERROR 1031 (HY000): Table storage engine for 'testA' doesn't have this option

Now, let’s take a look at what the INFORMATION_SCHEMA tells us about this table:

mysql> SELECT * FROM information_schema.tables WHERE table_schema='test' and table_name='testA'\G
*************************** 1. row ***************************
TABLE_CATALOG: def
TABLE_SCHEMA: test
TABLE_NAME: testA
TABLE_TYPE: BASE TABLE
ENGINE: InnoDB
VERSION: 10
ROW_FORMAT: Compact
TABLE_ROWS: 0
AVG_ROW_LENGTH: 0
DATA_LENGTH: 16384
MAX_DATA_LENGTH: 0
INDEX_LENGTH: 0
DATA_FREE: 0
AUTO_INCREMENT: NULL
CREATE_TIME: 2014-01-07 14:21:05
UPDATE_TIME: NULL
CHECK_TIME: NULL
TABLE_COLLATION: latin1_swedish_ci
CHECKSUM: NULL
CREATE_OPTIONS: row_format=COMPRESSED
TABLE_COMMENT:
1 row in set (0.00 sec)

There are two fields here that look contradictory at first:

  •  “ROW_FORMAT” says the table is using the Compact format, while
  •  “CREATE_OPTIONS” indicates “row_format=COMPRESSED” was used when creating the table

The one to consider is ROW_FORMAT. CREATE_OPTIONS is used to store the options given at the moment the table was created, and it is used by the SHOW CREATE TABLE statement to “reconstruct” the table definition:

mysql> show create table test.testA;
*************************** 1. row ***************************
Table: testA
Create Table: CREATE TABLE `testA` (
`id` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPRESSED
1 row in set (0.00 sec)
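For comparison, with innodb_file_per_table enabled the same CREATE TABLE statement really creates a compressed table (a sketch; testB is a hypothetical table name):

mysql> SET GLOBAL innodb_file_per_table=ON;
mysql> SET GLOBAL innodb_file_format=Barracuda;
mysql> create table test.testB (id int) row_format=Compressed;

This time INFORMATION_SCHEMA.TABLES reports ROW_FORMAT: Compressed for the new table.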

Conclusion

A customer contacted us asking how he could get a list of the tables using the compression format, which we can obtain by interrogating INFORMATION_SCHEMA:

mysql> SELECT TABLE_NAME, ROW_FORMAT FROM INFORMATION_SCHEMA.TABLES WHERE ROW_FORMAT='Compressed';

To their surprise, this statement returned an empty set. We verified that the tables they created specified ROW_FORMAT=Compressed, but as shown in this article, that method is not to be trusted: ask the INFORMATION_SCHEMA instead.

The post InnoDB file formats: Here is one pitfall to avoid appeared first on MySQL Performance Blog.

Dec
03
2013
--

innodb_stats_on_metadata and slow queries on INFORMATION_SCHEMA

INFORMATION_SCHEMA is usually the place to go when you want to get facts about a system (how many tables do we have? what are the 10 largest tables? what is the data size and index size for table t? etc.). However, it is also quite common for such queries to be very slow and create lots of I/O load. Here is a tip to avoid these hassles: set innodb_stats_on_metadata to OFF.

This is a topic we have already talked about, but given the number of systems suffering from INFORMATION_SCHEMA slowness, I think it is good to bring innodb_stats_on_metadata back on the table.

The problem

Let’s look at a system I’ve seen recently: MySQL 5.5, a working set fitting in memory but not the whole dataset, and around 4000 InnoDB tables.

The I/O load is very light as the server is an idle replica. You can see the I/O load from this partial pt-diskstats output:

#ts device    rd_s rd_avkb    wr_s wr_avkb  busy in_prg
1.0 sda2       0.0     0.0     0.0     0.0    0%      0
1.0 sda2       0.0     0.0    16.0     9.2    0%      0
1.0 sda2       0.0     0.0     0.0     0.0    0%      0
1.0 sda2       0.0     0.0     0.0     0.0    0%      0
1.0 sda2       0.0     0.0     2.0     4.0    0%      0

The customer wanted to know what could be improved in the schema, so we started by finding the 10 largest tables:


mysql> SELECT table_schema as 'DB',
table_name as 'TABLE',
CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 * 1024 ), 2), 'G') 'TOTAL'
FROM information_schema.TABLES
ORDER BY data_length + index_length DESC
LIMIT 10;
[...]
10 rows in set (1 min 32.23 sec)

1 min 32 s is slow, but it’s not really a problem. The I/O load triggered by this query, however, IS a problem:

#ts device    rd_s rd_avkb    wr_s wr_avkb busy in_prg
1.0 sda2       0.0     0.0    18.0     8.7   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2     100.0    16.0     0.0     0.0  79%      1
1.0 sda2     184.0    16.0     5.0     4.8  96%      1
1.0 sda2      97.0    16.0     0.0     0.0  98%      1
1.0 sda2     140.0    16.0     0.0     0.0  98%      1
1.0 sda2     122.0    16.0    17.0     4.0  98%      1
1.0 sda2     147.0    16.0     0.0     0.0  98%      1
[...]
1.0 sda2     136.0    16.0     0.0     0.0  98%      1
1.0 sda2     139.0    16.0     0.0     0.0  98%      1
1.0 sda2     149.0    16.0     0.0     0.0  98%      1
1.0 sda2     114.0    16.0     0.0     0.0  98%      1
1.0 sda2     147.0    16.0     8.0     4.0  96%      1
1.0 sda2     192.0    16.0     0.0     0.0  97%      1
1.0 sda2     141.0    16.0     0.0     0.0  98%      1
1.0 sda2     167.0    16.0     0.0     0.0  98%      1
1.0 sda2      15.0    16.0     0.0     0.0   6%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0    16.0     4.0   0%      0

The disks are 100% busy reading InnoDB pages for our query. No doubt that if the server was running queries from the application, they would have been negatively impacted.

Now let’s execute the same query with innodb_stats_on_metadata = OFF:


mysql> SET GLOBAL innodb_stats_on_metadata = OFF;

 

mysql> SELECT [...]
10 rows in set (0.45 sec)

And let’s look at pt-diskstats:

#ts device    rd_s rd_avkb    wr_s wr_avkb busy in_prg
1.0 sda2       0.0     0.0    16.0     9.2   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0    16.0     4.0   1%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0

No read load this time (and a much faster query!).

What is innodb_stats_on_metadata?

When the option is set to ON, InnoDB index statistics are updated when running SHOW TABLE STATUS or SHOW INDEX, or when querying INFORMATION_SCHEMA.TABLES or INFORMATION_SCHEMA.STATISTICS. These statistics include the cardinality and the number of entries; they are used by the optimizer to find an optimal execution plan.

So even though SELECT statements cannot change the data, querying these metadata tables makes MySQL update the statistics for InnoDB tables. This is counterintuitive.

Is it useful? Not really, because InnoDB always computes statistics when you open a table for the first time and when significant portions of the table have been modified (and when you run ANALYZE TABLE).

Now why did we have such a high read load when innodb_stats_on_metadata was set to ON? For InnoDB, statistics are estimated from random index dives, which translate to random reads.

The problem was magnified in this example because the whole dataset did not fit in memory, the number of tables was high, and the I/O subsystem was not very powerful.

Conclusion

It’s worth mentioning that the default value is now OFF with MySQL 5.6. So if you’re using MySQL 5.6, there’s no need to change anything. If you’re using MySQL 5.1 or 5.5, set innodb_stats_on_metadata to OFF and show your boss how you were able to get a 200x performance boost on some queries! And if you’re using MySQL 5.0 or below, you’ve just found another reason to upgrade!
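To make the setting survive a restart, put it in the configuration file as well (a minimal sketch of the relevant my.cnf section); SET GLOBAL takes effect immediately but is lost when mysqld restarts:

[mysqld]
innodb_stats_on_metadata = OFF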

The post innodb_stats_on_metadata and slow queries on INFORMATION_SCHEMA appeared first on MySQL Performance Blog.
