More on Checkpoints in InnoDB MySQL 8

Recently I posted about checkpointing in MySQL, where MySQL showed interesting “wave” behavior.

Soon after Dimitri posted a solution with how to fix “waves,” and I would like to dig a little more into proposed suggestions, as there are some materials to process.

This post will be very heavy on InnoDB configuration, so let’s start with the basic configuration for MySQL, but before that some initial environment.

I use MySQL version 8.0.21 on the hardware as described here

As for the storage, I am not using some “old dusty SSD”, but production available Enterprise-Grade Intel SATA SSD D3-S4510. This SSD is able to handle the throughput of 468MiB/sec of random writes or 30000 IOPS of random writes of 16KiB blocks.

So initial configuration for my test was:

datadir= /data/mysql8-8.0.21
bind_address =



server_id = 7

# general
table_open_cache = 200000


# files

# buffers
innodb_buffer_pool_size= 140G


innodb_flush_log_at_trx_commit  = 1
innodb_doublewrite= 1
innodb_flush_method             = O_DIRECT
innodb_file_per_table           = 1




There is a lot of parameters, so let’s highlight the most relevant for this test:

innodb_buffer_pool_size= 140G

Buffer pool size is enough to fit all data, which is about 100GB in size


Adaptive hash index is enabled (as it comes in default InnoDB config)


This is what defaults provide, but I will increase it, following my previous post. 



These parameters define the limit of 20GB for our redo logs, and this is important, as our workload will be “redo-log” bounded, as we will see from the results



You may ask, why do I use 2000 and 4000, while the storage can handle 30000 IOPS.

This is a valid point, and as we can see later, these parameters are not high enough for this workload, but also it does not mean we should use them all the way up to 30000, as we will see from the results.

MySQL Manual says the following about innodb_io_capacity:

“The innodb_io_capacity variable defines the overall I/O capacity available to InnoDB. It should be set to approximately the number of I/O operations that the system can perform per second (IOPS). When innodb_io_capacity is set, InnoDB estimates the I/O bandwidth available for background tasks based on the set value.” 

From this, you may get the impression that if you set innodb_io_capacity to I/O bandwidth of your storage, you should be fine. Though this part does not say what you should take as I/O operations. For example, if your storage can perform 500MB/sec, then if you do 4KB block IO operations it will be 125000 IO per second, and if you do 16KB IO, then it will be 33000 IO per second. 

MySQL manual leaves it up to your imagination, but as InnoDB typical page size is 16KB, let’s assume we do 16KB blocks IO.

However later on that page, we can read:

“Ideally, keep the setting as low as practical, but not so low that background activities fall behind. If the value is too high, data is removed from the buffer pool and change buffer too quickly for caching to provide a significant benefit. For busy systems capable of higher I/O rates, you can set a higher value to help the server handle the background maintenance work associated with a high rate of row changes”


“Consider write workload when tuning innodb_io_capacity. Systems with large write workloads are likely to benefit from a higher setting. A lower setting may be sufficient for systems with a small write workload.”

I do not see that the manual provides much guidance about what value I should use, so we will test it.

Initial results

So if we benchmark with initial parameters, we can see the “wave” pattern.


As for why this is happening, let’s check Percona Monitoring and Management “InnoDB Checkpoint Age” chart:

Actually InnoDB Flushing by Type in PMM does not show sync flushing yet, so I had to modify chart a little to show “sync flushing” in orange line:

And we immediately see that Uncheckpointed Bytes exceed Max Checkpoint Age in 16.61GiB, which is defined by 20GiB of innodb log files. 16.61GiB is less than 20GB, because InnoDB reserves some cushion for the cases exactly like this, so even if we exceed 16.61GiB, InnoDB still has an opportunity to flush data.

Also, we see that before Uncheckpointed Bytes exceed Max Checkpoint Age, InnoDB flushes pages with the rate 4000 IOPS, just as defined by innodb_io_capacity_max.

We should try to avoid the case when Uncheckpointed Bytes exceed Max Checkpoint Age, because when it happens, InnoDB gets into “emergency” flushing mode, and in fact, this is what causes the waves we see. I should have detected this in my previous post, mea culpa.

So the first conclusion we can make – if InnoDB does not flush fast enough, what if we increase innodb_io_capacity_max ? Sure, let’s see. And for the simplification, for the next experiments, I will use

Innodb_io_capacity = innodb_io_capacity_max, unless specified otherwise.

Next run with Innodb_io_capacity = innodb_io_capacity_max = 7000

Not much improvement and this also confirmed by InnoDB Checkpoint ge chart

InnoDB tries to flush more pages per second up to 5600 pages/sec, but it is not enough to avoid exceeding Max Checkpoint Age.

Why is this the case? The answer is a double write buffer.

Even though MySQL improved the doublewrite buffer in MySQL 8.0.20, it does not perform well enough with proposed defaults. 

Well, at least the problem was solved because previous Oracle ran benchmarks with disabled doublewrite, just to hide and totally ignore the issue with doublewrite. For the example check this.

But let’s get back to our 8.0.21 and fixed doublewrite.

Dimiti mentions:

“the main config options for DBLWR in MySQL 8.0 are:

  • innodb_doublewrite_files = N
    innodb_doublewrite_pages = M”

Let’s check the manual again:

“The innodb_doublewrite_files variable is intended for advanced performance tuning. The default setting should be suitable for most users.


The innodb_doublewrite_pages variable (introduced in MySQL 8.0.20) controls the number of maximum number of doublewrite pages per thread. If no value is specified, innodb_doublewrite_pages is set to the innodb_write_io_threads value. This variable is intended for advanced performance tuning. The default value should be suitable for most users.

Was it wrong to assume that innodb_doublewrite_files and  innodb_doublewrite_pages provides the value suitable for our use case?

But let’s try with the values Dmitri recommended to look into, I will use

innodb_doublewrite_files=2 and innodb_doublewrite_pages=128

Results with innodb_doublewrite_files=2 and innodb_doublewrite_pages=128

The problem with waves is fixed! 

And InnoDB Checkpoint Age chart:

Now we are able to keep Uncheckpointed Bytes under Max Checkpoint Age, and this is what fixed “waves” pattern.

We can say that parallel doublewrite is a new welcomed improvement, but the fact that one has to change innodb_doublewrite_pages in order to get improved performance is the design flaw in my opinion.

But there are still a lot of variations in 1 sec resolution and small drops. Before we get to them, let’s take a look at another suggestion: use –innodb_adaptive_hash_index=0 ( that is to disable Adaptive Hash Index). I will use AHI=0 on the charts to mark this setting.

Let’s take a look at the results with improved settings and with –innodb_adaptive_hash_index=0

Results with –innodb_adaptive_hash_index=0

To see what is the real improvement with –innodb_adaptive_hash_index=0 , let’s compare barcharts:

Or in numeric form:

settings Avg tps, last 2000 sec
io_cap_max=7000,doublewrite=opt 7578.69
io_cap_max=7000,doublewrite=opt,AHI=0 7996.33

So –innodb_adaptive_hash_index=0 really brings some improvements, about 5.5%, so I will use  –innodb_adaptive_hash_index=0 for further experiments.

Let’s see if increased innodb_buffer_pool_instances=32 will help to smooth periodical variance.

Results with innodb_buffer_pool_instances=32

So indeed using innodb_buffer_pool_instances=32 gets us less variations, keeping overall throughput about the same. It is 7936.28 tps for this case.

Now let’s review the parameter innodb_change_buffering=none, which Dmitri also suggests.

Results with innodb_change_buffering=none

There is NO practical difference if we disable innodb_change_buffer.

And if we take a look at PMM change buffer chart:

We can see there is NO Change Buffer activity outside of the initial 20 mins. I am not sure why Dimitri suggested disabling it. In fact, Change Buffer can be quite useful, and I will show it in my benchmark for the different workloads.

Now let’s take a look at suggested settings with Innodb_io_capacity = innodb_io_capacity_max = 8000. That will INCREASE innodb_io_capacity_max , and compare to results with innodb_io_capacity_max = 7000.

Or in tabular form:

settings Avg tps, last 2000 sec
io_cap_max=7000,doublewrite=opt,AHI=0,BPI=32 7936.28
io_cap_max=8000,doublewrite=opt,AHI=0,BPI=32 7693.08

Actually with innodb_io_capacity_max=8000 the throughput is LESS than with  innodb_io_capacity_max=7000

Can you guess why? 

Let’s compare InnoDB Checkpoint Age.

This is for innodb_io_capacity_max=8000 :

And this is for innodb_io_capacity_max=7000 

This is like a child’s game: Find the difference.

The difference is that with  innodb_io_capacity_max=7000
Uncheckpointed Bytes is 13.66 GiB,
and with innodb_io_capacity_max=8000
Uncheckpointed Bytes is 12.51 GiB

What does it mean? It means that with innodb_io_capacity_max=7000 HAS to flush LESS pages and still keep within Max Checkpoint Age.

In fact, if we try to push even further, and use innodb_io_capacity_max=innodb_io_capacity=6500 we will get InnoDB Checkpoint Age chart as:

Where Uncheckpointed Bytes are 15.47 GiB. Does it improve throughput? Absolutely!

settings Avg tps, last 2000 sec
io_cap_max=6500,doublewrite=opt,AHI=0,BPI=32 8233.628
io_cap_max=7000,doublewrite=opt,AHI=0,BPI=32 7936.283
io_cap_max=8000,io_cap_max=8000,doublewrite=opt,AHI=0,BPI=32 7693.084

The difference between innodb_io_capacity_max=6500 and innodb_io_capacity_max=8000 is 7%

This now becomes clear what Manual means in the part where it says:

“Ideally, keep the setting as low as practical, but not so low that background activities fall behind”

So we really need to increase innodb_io_capacity_max to the level that Uncheckpointed Bytes stays under Max Checkpoint Age, but not by much, otherwise InnoDB will do more work then it is needed and it will affect the throughput.

In my opinion, this is a serious design flaw in InnoDB Adaptive Flushing, that you actually need to wiggle innodb_io_capacity_max to achieve appropriate results.

Inverse relationship between innodb_io_capacity_max and innodb_log_file_size

To show an even more complicated relation between innodb_io_capacity_max and innodb_log_file_size, let consider the following experiment.

We will increase innodb_log_file_size from 10GB to 20GB, effectively doubling our redo-log capacity.

And now let’s check InnoDB Checkpoint Age with innodb_io_capacity_max=7000:

We can see there is a lot of space in InnoDB logs which InnoDB does not use. There is only 22.58GiB of Uncheckpointed Bytes, while 33.24 GiB are available.

So what happens if we increase innodb_io_capacity_max to 4500

 InnoDB Checkpoint Age with innodb_io_capacity_max=4500:

In this setup, We can push Uncheckpointed Bytes to 29.80 GiB, and it has a positive effect on the throughput.

Let’s compare throughput :

settings Avg tps, last 2000 sec
io_cap_max=4500,log_size=40GB,doublewrite=opt,AHI=0,BPI=32 9865.308
io_cap_max=7000,log_size=40GB,doublewrite=opt,AHI=0,BPI=32 9374.121

So by decreasing innodb_io_capacity_max from 7000 to 4500 we can gain 5.2% in the throughput.

Please note that we can’t continue to decrease innodb_io_capacity_max, because in this case Uncheckpointed Bytes risks to exceed Max Checkpoint Age, and this will lead to the negative effect of emergency flushing.

So again, in order to improve throughput, we should be DECREASING innodb_io_capacity_max, but only to a certain threshold. We should not be setting innodb_io_capacity_max to 30000, to what really SATA SSD can provide.

Again, for me, this is a major design flaw in the current InnoDB Adaptive Flushing. Please note this was a static workload. If your workload changes during the day, it is practically impossible to come up with optimal value. 


Trying to summarize all of the above, I want to highlight:

  • To fix “wave” pattern we need to tune innodb_io_capacity_max and innodb_doublewrite_pages 
  • InnoDB parallel doublewrite in MySQL 8.0.20 is a definitely positive improvement, but the default values seem chosen poorly, in contradiction with Manual. I wish Oracle/MySQL shipped features that work out of the box for most users.
  • InnoDB Adaptive Hash index is not helping here, and you get better performance by disabling it. I also observed that in other workloads, the InnoDB Adaptive Hash index might be another broken subsystem, which Oracle ignores to fix and just disables it in its benchmarks.
  • InnoDB Change Buffer has no effect on this workload, so you may or may not disable it — there is no difference. But I saw a positive effect from InnoDB Change Buffer in other workloads, so I do not recommend blindly disabling it.
  • Now about InnoDB Adaptive Flushing. In my opinion, InnoDB Adaptive Flushing relies too much on manual tuning of innodb_io_capacity_max , which in fact has nothing to do with the real storage IO capacity. In fact, often you need to lower innodb_io_capacity_max  to get better performance, but not make it too low, because at some point it will hurt the performance. The best way to monitor it is to check InnoDB Checkpoint Age chart in PMM
  • I would encourage Oracle to fix the broken design of InnoDB Adaptive Flushing, where it would detect IO capacity automatically and to not flush aggressively, but to keep  Uncheckpointed Bytes just under Max Checkpoint Age. Let’s hope Oracle faster than doublewrite buffer because history shows that to force Oracle to make improvements in InnoDB IO subsystem, we need to do it first in Percona Server for MySQL like we did with parallel doublewrite buffer.  For the reference parallel doublewrite was implemented first in Percona Server for MySQL 5.7.11-4 which was released March 15th, 2016. Oracle implemented (with not optimal default settings ) parallel doublewrite in MySQL 8.0.20, which was released 4 years later after Percona Server, on April 4th, 2020.

Webinar Wed 8/29: Databases in the Hosted Cloud


databases-in-the-cloudPlease join Percona’s Chief Evangelist, Colin Charles on Wednesday, August 29th, 2018, as he presents Databases in the Hosted Cloud at 7:00 AM PDT (UTC-7) / 10:00 AM EDT (UTC-4).


Nearly everyone today uses some form of database in the hosted cloud. You can use hosted MySQL, MariaDB, Percona Server, and PostgreSQL in several cloud providers as a database as a service (DBaaS).

In this webinar, Colin Charles explores how to efficiently deploy a cloud database configured for optimal performance, with a particular focus on MySQL.

You’ll learn the differences between the various public cloud offerings for Amazon RDS including Aurora, Google Cloud SQL, Rackspace OpenStack DBaaS, Microsoft Azure, and Alibaba Cloud, as well as the access methods and the level of control you have. Hosting in the cloud can be a challenge but after today’s webinar, we’ll make sure you walk away with a better understanding of how you can leverage the cloud for your business needs.

Topics include:

  • Backup strategies
  • Planning multiple data centers for availability
  • Where to host your application
  • How to get the most performance out of the solution
  • Cost
  • Monitoring
  • Moving from one DBaaS to another
  • Moving from a DBaaS to your own hosted platform

Register Now.

The post Webinar Wed 8/29: Databases in the Hosted Cloud appeared first on Percona Database Performance Blog.


Webinar Wed 7/18: MariaDB 10.3 vs. MySQL 8.0

MariaDB 10.3 vs MySQL 8.0

MariaDB 10.3 vs MySQL 8.0Please join Percona’s Chief Evangelist, Colin Charles as he presents as he presents MariaDB 10.3 vs. MySQL 8.0 on Wednesday, July 18th, 2018, at 9:00 AM PDT (UTC-7) / 12:00 PM EDT (UTC-4).


Technical considerations

Are they syntactically similar? Where do these two databases differ? Why would I use one over the other?

MariaDB 10.3 is on the path of gradually diverging from MySQL 8.0. One obvious example is the internal data dictionary currently under development for MySQL 8.0. This is a major change to the way metadata is stored and used within the server, and MariaDB doesn’t have an equivalent feature. Implementing this feature could mark the end of datafile-level compatibility between MySQL and MariaDB.

Non-technical considerations

There are also non-technical differences between MySQL 8.0 and MariaDB 10.3, including:

Licensing: MySQL offers their code as open-source under the GPL, and provides the option of non-GPL commercial distribution in the form of MySQL Enterprise. MariaDB can only use the GPL, because their work is derived from the MySQL source code under the terms of that license.

Support services: Oracle provides technical support, training, certification and consulting for MySQL, while MariaDB has their own support services. Some people will prefer working with smaller companies, as traditionally it affords them more leverage as a customer.

Community contributions: MariaDB touts the fact that they accept more community contributions than Oracle. Part of the reason for this disparity is that developers like to contribute features, bug fixes and other code without a lot of paperwork overhead (and they complain about the Oracle Contributor Agreement). However, MariaDB has its own MariaDB Contributor Agreement — which more or less serves the same purpose.

Colin will take a look at some of the differences between MariaDB 10.3 and MySQL 8.0 and help answer some of the common questions our Database Performance Experts get about the two databases.

Register Now

The post Webinar Wed 7/18: MariaDB 10.3 vs. MySQL 8.0 appeared first on Percona Database Performance Blog.


MySQL 8.0


MySQL 8.0

If you haven’t heard the news yet, MySQL 8.0 is apparently the next release of the world-famous database server.

Obviously abandoning plans to name the next release 5.8, Percona Server’s upstream provider relabelled all 5.8-related bugs to 8.0 as follows:

Reported version value updated to reflect release name change from 5.8 to 8.0

What will MySQL 8.0 bring to the world?

While lossless RBR has been suggested by Simon Mudd (for example), the actual feature list (except a Boost 1.60.0 upgrade!) remains a secret.

As far as bug and feature requests go, a smart google query revealed which bugs are likely to be fixed in (or are feature requests for) MySQL 8.0.

Here is the full list:

  • MySQL Bug #79380: Upgrade to Boost 1.60.0
  • MySQL Bug #79037: get rid of dynamic_array in st_mysql_options
  • MySQL Bug #80793: EXTEND EXPLAIN to cover ALTER TABLE
  • MySQL Bug #79812: JSON_ARRAY and JSON_OBJECT return …
  • MySQL Bug #79666: fix errors reported by ubsan
  • MySQL Bug #79463: Improve P_S configuration behaviour
  • MySQL Bug #79939: default_password_lifetime > 0 should print …
  • MySQL Bug #79330: DROP TABLESPACE fails for missing general …
  • MySQL Bug #80772: Excessive memory used in memory/innodb …
  • MySQL Bug #80481: Accesses to new data-dictionary add confusing …
  • MySQL Bug #77712: mysql_real_query does not report an error for …
  • MySQL Bug #79813: Boolean values are returned inconsistently with …
  • MySQL Bug #79073: Optimizer hint to disallow full scan
  • MySQL Bug #77732: REGRESSION: replication fails for insufficient …
  • MySQL Bug #79076: make hostname a dynamic variable
  • MySQL Bug #78978: Add microseconds support to UNIX_TIMESTAMP
  • MySQL Bug #77600: Bump major version of libmysqlclient in 8.0
  • MySQL Bug #79182: main.help_verbose failing on freebsd
  • MySQL Bug #80627: incorrect function referenced in spatial error …
  • MySQL Bug #80372: Built-in mysql functions are case sensitive …
  • MySQL Bug #79150: InnoDB: Remove runtime checks for 32-bit file …
  • MySQL Bug #76918: Unhelpful error for mysql_ssl_rsa_setup when …
  • MySQL Bug #80523: current_memory in sys.session can go negative!
  • MySQL Bug #78210: SHUTDOWN command should have an option …
  • MySQL Bug #80823: sys should have a mdl session oriented view
  • MySQL Bug #78374: “CREATE USER IF NOT EXISTS” reports an error
  • MySQL Bug #79522: can mysqldump print the fully qualified table …
  • MySQL Bug #78457: Use gettext and .po(t) files for translations
  • MySQL Bug #78593: mysqlpump creates incorrect ALTER TABLE …
  • MySQL Bug #78041: GROUP_CONCAT() truncation should be an …
  • MySQL Bug #76927: Duplicate UK values in READ-COMMITTED …
  • MySQL Bug #77997: Automatic mysql_upgrade
  • MySQL Bug #78495: Table mysql.gtid_executed cannot be opened.
  • MySQL Bug #78698: Simple delete query causes InnoDB: Failing …
  • MySQL Bug #76392: Assume that index_id is unique within a …
  • MySQL Bug #76671: InnoDB: Assertion failure in thread 19 in file …
  • MySQL Bug #76803: InnoDB: Unlock row could not find a 2 mode …
  • MySQL Bug #78527: incomplete support and/or documentation of …
  • MySQL Bug #78732: InnoDB: Failing assertion: *mbmaxlen < 5 in file …
  • MySQL Bug #76356: Reduce header file dependencies for …
  • MySQL Bug #77056: There is no clear error message if …
  • MySQL Bug #76329: COLLATE option not accepted in generated …
  • MySQL Bug #79500: InnoDB: Assertion failure in thread …
  • MySQL Bug #72284: please use better options to …
  • MySQL Bug #78397: Subquery Materialization on DELETE WHERE …
  • MySQL Bug #76552: Cannot shutdown MySQL using JDBC driver
  • MySQL Bug #76532: MySQL calls exit(MYSQLD_ABORT_EXIT …
  • MySQL Bug #76432: handle_fatal_signal (sig=11) in …
  • MySQL Bug #41925: Warning 1366 Incorrect string value: … for …
  • MySQL Bug #78452: Alter table add virtual index hits assert in …
  • MySQL Bug #77097: InnoDB Online DDL should support change …
  • MySQL Bug #77149: sys should possibly offer user threads …

Powered by WordPress | Theme: Aeros 2.0 by