Dec
03
2013
--

innodb_stats_on_metadata and slow queries on INFORMATION_SCHEMA

INFORMATION_SCHEMA is usually the place to go when you want to get facts about a system (how many tables do we have? what are the 10 largest tables? What is data size and index size for table t?, etc). However it is also quite common that such queries are very slow and create lots of I/O load. Here is a tip to avoid theses hassles: set innodb_stats_on_metadata to OFF.

This is a topic we already talked about, but given the number of systems suffering from INFORMATION_SCHEMA slowness, I think it is good to bring innodb_stats_on_metadata back on the table.

The problem

Let’s look at a system I’ve seen recently: MySQL 5.5, working set fitting in memory but not the whole dataset, around 4000 InnoDB tables.

The I/O load is very light as the server is an idle replica. You can see the I/O load from this partial pt-diskstats output:

#ts device    rd_s rd_avkb    wr_s wr_avkb  busy in_prg
1.0 sda2       0.0     0.0     0.0     0.0    0%      0
1.0 sda2       0.0     0.0    16.0     9.2    0%      0
1.0 sda2       0.0     0.0     0.0     0.0    0%      0
1.0 sda2       0.0     0.0     0.0     0.0    0%      0
1.0 sda2       0.0     0.0     2.0     4.0    0%      0

The customer wanted to know what could be improved from the schema so we started by finding the 10 largest tables:


mysql> SELECT table_schema as 'DB',
table_name as 'TABLE',
CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 * 1024 ), 2), 'G') 'TOTAL'
FROM information_schema.TABLES
ORDER BY data_length + index_length DESC
LIMIT 10;
[...]
10 rows in set (1 min 32.23 sec)

1mn32s is slow, but it’s not really a problem. But the I/O load triggered by this query IS a problem:

#ts device    rd_s rd_avkb    wr_s wr_avkb busy in_prg
1.0 sda2       0.0     0.0    18.0     8.7   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2     100.0    16.0     0.0     0.0  79%      1
1.0 sda2     184.0    16.0     5.0     4.8  96%      1
1.0 sda2      97.0    16.0     0.0     0.0  98%      1
1.0 sda2     140.0    16.0     0.0     0.0  98%      1
1.0 sda2     122.0    16.0    17.0     4.0  98%      1
1.0 sda2     147.0    16.0     0.0     0.0  98%      1
[...]
1.0 sda2     136.0    16.0     0.0     0.0  98%      1
1.0 sda2     139.0    16.0     0.0     0.0  98%      1
1.0 sda2     149.0    16.0     0.0     0.0  98%      1
1.0 sda2     114.0    16.0     0.0     0.0  98%      1
1.0 sda2     147.0    16.0     8.0     4.0  96%      1
1.0 sda2     192.0    16.0     0.0     0.0  97%      1
1.0 sda2     141.0    16.0     0.0     0.0  98%      1
1.0 sda2     167.0    16.0     0.0     0.0  98%      1
1.0 sda2      15.0    16.0     0.0     0.0   6%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0    16.0     4.0   0%      0

The disks are 100% busy reading InnoDB pages for our query. No doubt that if the server was running queries from the application, they would have been negatively impacted.

Now let’s execute the same query with innodb_stats_on_metadata = OFF;


mysql> SET GLOBAL innodb_stats_on_metadata = OFF;

 

mysql> SELECT [...]
10 rows in set (0.45 sec)

And let’s look at pt-diskstats:

#ts device    rd_s rd_avkb    wr_s wr_avkb busy in_prg
1.0 sda2       0.0     0.0    16.0     9.2   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0    16.0     4.0   1%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0
1.0 sda2       0.0     0.0     0.0     0.0   0%      0

No read load this time (and a much faster query!).

What is innodb_stats_on_metadata?

When the option is set to ON, InnoDB index statistics are updated when running SHOW TABLE STATUS, SHOW INDEX or when querying INFORMATION_SCHEMA.TABLES or INFORMATION_SCHEMA.STATISTICS. These statistics include the cardinality and the number of entries, they are used by the optimizer to find an optimal execution plan.

So even if SELECT statements cannot change the real statistics, MySQL updates the statistics for InnoDB tables. This is counterintuitive.

Is it useful? Not really, because InnoDB will always compute statistics when you open a table for the first time and when significant portions of the table have been modified (and when you run ANALYZE TABLE).

Now why did we have such a high read load when innodb_stats_on_metadata was set to ON? For InnoDB, statistics are estimated from random index dives, which translates to random reads.

The problem was magnified in the example because the whole dataset was not fitting in memory, the number of tables was high and the I/O subsystem was not very powerful.

Conclusion

It’s worth mentioning that the default value is now OFF with MySQL 5.6. So if you’re using MySQL 5.6, there’s no need to change anything. If you’re using MySQL 5.1 or 5.5, set innodb_stats_on_metadata to OFF and show your boss how you were able to get a 200x performance boost on some queries! And if you’re using MySQL 5.0 or below, you’ve just found another reason to upgrade!

The post innodb_stats_on_metadata and slow queries on INFORMATION_SCHEMA appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com