Adaptive Hash Index on AWS Aurora

Recently I had a case where queries against an Aurora Reader node were 2-3 times slower than on the Writer node. In this blog post, we are going to discuss why.

I am not going to go into the details of how Aurora works, as there are other blog posts discussing that. Here I am only going to focus on one part.

The Problem

My customer reported a huge performance difference between the Reader and the Writer node when running simple SELECT queries. I was a bit surprised: the queries should run locally on the Reader node, the dataset could fit easily in memory, there were no reads at the disk level, and everything looked fine.

I was trying to rule out every option when one of my colleagues suggested I have a look at innodb_adaptive_hash_index. He was right: it was disabled on the Reader nodes, and I could see that on the console.

Let’s enable the adaptive hash index

I opened the control panel and checked the parameter groups, but the adaptive hash index was already enabled there. I thought I might have made a mistake, so I double-checked many times, and it was true: adaptive hash was disabled on the console but enabled in the control panel. That means the AWS control panel is lying!

I restarted the nodes multiple times, created new test clusters, and so on, but I was not able to enable the adaptive hash index on the Reader node. It was enabled, and working, on the Writer node.

Is this causing the performance difference?

Because I was able to enable or disable the adaptive hash index on the Writer node, I continued my tests there, and I could confirm that when I disabled it, the queries got slower: the same speed as on the Reader node. When I enabled AHI again, the queries got faster.

In general, with AHI enabled on the Writer node, the customer's queries ran twice as fast.

AHI can help many workloads, but not all of them, so you should test your queries/workload both with and without AHI.

Why is it disabled on the Reader?

I have to be honest: I am not an AWS engineer and I do not know the Aurora code, so I am only guessing here and I might be wrong.

Why can I change it in the parameter group?

We can modify the adaptive hash index in the parameter groups, but it has no effect on the Reader nodes at all. Many customers could think they have AHI enabled when they actually don't. I think this is bad practice: if we cannot enable it on the Reader node, we should not be able to change it in the control panel.

Is this causing any performance problems for me?

If you are using the Reader node for SELECT queries based on secondary keys, you are probably affected, but whether it impacts your performance depends on your workload. In my customer's case, queries were twice as slow without AHI.

But I want fast queries!

If your queries heavily benefit from AHI, you should run them on the Writer node or even on an async replica, have a look at AWS RDS (which does not have this limitation), or use EC2 instances. You could also check out the Query Cache in Aurora.

Query Cache

In Aurora, they reworked the Query Cache so that it does not have the limitations of the Community Edition or Percona Server implementations. In stock MySQL, cacheable queries take an exclusive lock on the query cache. In the real world, that means only one query can use the Query Cache at a time, and all the other queries have to wait on the mutex. (In MySQL 8.0, the Query Cache was removed entirely.)
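To make the mutex discussion concrete, here is a minimal, illustrative Python sketch of the sharding idea: instead of one global lock serializing every cache access, each shard has its own lock, so unrelated queries rarely contend. This is only a toy model of the locking concept, not Aurora's actual implementation.

```python
import threading

class ShardedCache:
    """Toy sharded cache: per-shard locks instead of one global mutex."""

    def __init__(self, shards=16):
        # Each shard is an independent dict protected by its own lock.
        self.shards = [({}, threading.Lock()) for _ in range(shards)]

    def _shard(self, key):
        return self.shards[hash(key) % len(self.shards)]

    def get(self, key):
        data, lock = self._shard(key)
        with lock:  # only this shard is serialized, not the whole cache
            return data.get(key)

    def put(self, key, value):
        data, lock = self._shard(key)
        with lock:
            data[key] = value

cache = ShardedCache()
cache.put("SELECT 1", [(1,)])
print(cache.get("SELECT 1"))
```

Two threads caching results for different queries will usually hit different shards and never block each other, which is the property a single global mutex cannot provide.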

But in Aurora they redesigned it and removed this limitation: there is no single global mutex on the Query Cache anymore. I think one of the reasons could be that they knew Adaptive Hash would not work.

Does AWS know about this?

I created a ticket with the AWS engineers to get some feedback, and they verified my findings and confirmed that the Adaptive Hash Index cannot be enabled on the Reader nodes. They are looking into why it can be modified in the control panel.


I would recommend checking your queries on your Reader nodes to make sure they perform well, and comparing the performance with the Writer node. At the moment, we cannot enable AHI on Reader nodes, and I am not sure if that will change any time soon. But it can definitely impact performance in some cases.


Is Adaptive Hash Index in InnoDB right for my workload?

This blog post will discuss what the Adaptive Hash Index in InnoDB is used for, and whether it is a good fit for your workload.

Adaptive Hash Index (AHI) is one of the least understood features in InnoDB. In theory, it magically determines when it is worth supplementing InnoDB B-Tree-based indexes with fast hash lookup tables and then builds them automatically without a prompt from the user.

Since AHI is supposed to work “like magic,” it has very little configuration available. In the early versions there were no configuration options at all. Later versions added innodb_adaptive_hash_index to disable AHI if required (by setting it to “0” or “OFF”). MySQL 5.7 added the ability to partition AHI by setting innodb_adaptive_hash_index_parts. (FYI, this feature existed in Percona Server as innodb_adaptive_hash_index_partitions since version 5.5.)

To understand AHI’s impact on performance, think about it as if it were a cache. If an AHI “Hit” happens, we have much better lookup performance; if it is an AHI “Miss,” then performance gets slightly worse (as checking a hash table for matches is fast, but not free).
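The cache analogy can be sketched in a few lines of Python: a hash map sits in front of a binary-searched sorted array (standing in for the B-Tree), hits are O(1), and misses fall back to the ordinary index and pay a small maintenance cost. This is a toy model of the idea, not InnoDB's actual code.

```python
import bisect

class ToyAHI:
    """Toy model: a hash-map cache in front of a sorted-array 'B-Tree'."""

    def __init__(self, keys):
        self.btree = sorted(keys)   # stand-in for the B-Tree index
        self.hash_cache = {}        # stand-in for the adaptive hash index
        self.hits = 0
        self.misses = 0

    def lookup(self, key):
        if key in self.hash_cache:  # AHI hit: O(1) lookup
            self.hits += 1
            return self.hash_cache[key]
        self.misses += 1            # AHI miss: fall back to the B-Tree
        i = bisect.bisect_left(self.btree, key)
        if i < len(self.btree) and self.btree[i] == key:
            self.hash_cache[key] = i  # maintenance: add the row to AHI
            return i
        return None

idx = ToyAHI(range(0, 1000, 2))
idx.lookup(10)   # first lookup misses and populates the hash cache
idx.lookup(10)   # second lookup is a hash hit
print(idx.hits, idx.misses)
```

The same trade-off as real AHI is visible: repeated lookups of the same keys are cheap, but every miss both pays the B-Tree cost and does maintenance work.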

This is not the only part of the equation, though. In addition to the lookup cost, there is also the cost of AHI maintenance. We can compare maintenance costs (seen as rows added to and removed from AHI) to successful lookups. A high ratio of successful lookups to maintenance operations means a lot of lookups were sped up at low cost. A low ratio means the opposite: we're probably paying too much maintenance cost for little benefit.

Finally, there is also the cost of added contention. If your workload consists of lookups to a large number of indexes or tables, you can probably reduce the impact by setting innodb_adaptive_hash_index_parts appropriately. If there is a single hot index, however, AHI could become a bottleneck at high concurrency and might need to be disabled.

To determine whether AHI is likely to help your workload, verify that the AHI hit ratio and the ratio of successful lookups to maintenance operations are as high as possible.

Let’s investigate what really happens for some simple workloads. I will use a basic Sysbench lookup by primary key, the simplest workload possible. Even in this case, we’ll see a number of different behaviors.

For this test, I am using MySQL 5.7.11 with a 16GB buffer pool. The base command line for sysbench is:

sysbench --test=/usr/share/doc/sysbench/tests/db/select.lua   --report-interval=1 --oltp-table-size=1 --max-time=0 --oltp-read-only=off --max-requests=0 --num-threads=1 --rand-type=uniform --db-driver=mysql --mysql-password=password --mysql-db=test_innodb  run

Looking up a single row



Note the --oltp-table-size=1 setting in the command line above; this is not a mistake, but tests how AHI behaves in a very basic case:

[Graph: AHI metrics with oltp table size = 1]

And it works perfectly: there is a 100% hit ratio with no AHI maintenance operations to speak of.

10000 rows in the table

When we change the setting to --oltp-table-size=10000, we get the following picture:

[Graphs: AHI metrics with oltp table size = 10k]

Again, we see almost no overhead. There is a rare incident of 16 rows or so being added to AHI (probably due to an AHI hash collision). Otherwise, it’s almost perfect.

10M rows in the table

If we change the setting to --oltp-table-size=10000000, we now have more data (but still much less than the buffer pool size):

[Graphs: AHI metrics with oltp table size = 10m]

In this case, there is clearly a warm-up period before we get close to the 100% hit ratio – and it never quite hits 100% (even after a longer run). In this case, maintenance operations appear to keep going without showing signs of asymptotically reaching zero. My take on this is that with 10M rows there is a higher chance of hash collisions – causing more AHI rebuilding.
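The collision intuition can be checked with the standard birthday approximation: when hashing n keys uniformly into m cells, the expected number of inserts that land in an already-occupied cell grows quickly with n. The cell count below is an arbitrary assumption for illustration; AHI's real hash table sizing depends on the buffer pool.

```python
def expected_colliding_inserts(n, m):
    """Expected number of inserts that hit an already-occupied cell when
    hashing n keys uniformly into m cells (birthday approximation):
    n minus the expected number of occupied cells, m * (1 - (1-1/m)^n)."""
    return n - m * (1 - (1 - 1 / m) ** n)

CELLS = 2**22  # assumed hash table size, purely for illustration

# With 10,000 keys, collisions are negligible relative to the table size...
small = expected_colliding_inserts(10_000, CELLS)
# ...but with 10M keys in the same table, collisions become routine.
large = expected_colliding_inserts(10_000_000, CELLS)
print(small, large)
```

This matches the observed behavior: the 10k-row run shows almost no AHI churn, while the 10M-row run keeps doing maintenance work.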

500M rows in the table, uniform distribution

Let’s now set --oltp-table-size=500000000. This will push the data size beyond the InnoDB buffer pool size.

[Graphs: AHI metrics with oltp table size = 500m]

Here we see a lot of buffer pool misses, causing a very poor AHI hit ratio (never reaching 1%). We can also see the large overhead of tens of thousands of rows being added to and removed from AHI. Obviously, AHI is not adding any value in this case.
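A rough back-of-the-envelope calculation explains the ceiling: AHI can only help for pages that are resident in the buffer pool, so under uniform access the hit ratio is bounded by the pool-to-data size ratio (the observed ratio is far lower still, because of constant churn). The ~100GB figure is my rough estimate for 500M sysbench rows.

```python
def uniform_bp_hit_bound(buffer_pool_gb, data_gb):
    """Upper bound on the useful-lookup fraction under uniform access:
    only pages resident in the buffer pool can benefit from an AHI entry,
    so at most pool_size / data_size of lookups can hit."""
    return min(1.0, buffer_pool_gb / data_gb)

# 16GB buffer pool vs a ~100GB table (rough size of 500M sysbench rows):
print(uniform_bp_hit_bound(16, 100))
```

Even this optimistic 16% bound ignores the cost of continually rebuilding AHI entries as pages are evicted, which is why the measured hit ratio stays below 1%.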

500M rows, Pareto distribution

Finally, let’s keep --oltp-table-size=500000000 and add --rand-type=pareto. The pareto setting enables a skewed distribution, a more typical scenario for many real-life data access patterns.

[Graphs: AHI metrics with oltp table size = 500m, Pareto distribution]

In this case, we see the AHI hit ratio gradually improving, reaching close to 50%. The AHI maintenance overhead goes down too, but never reaches a level that suggests it is worth it.

It is important to note that in both this and the previous case, AHI has not reached a “steady state” yet. In a steady state, the number of rows added and removed becomes close to equal.
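This steady-state check can be expressed as a simple comparison of the rows-added and rows-removed deltas over a sampling interval (the 10% tolerance here is an arbitrary choice for illustration):

```python
def is_steady_state(rows_added, rows_removed, tolerance=0.1):
    """AHI is near steady state when rows added and removed over an
    interval roughly balance (within `tolerance` of their mean)."""
    if rows_added == rows_removed == 0:
        return True
    mean = (rows_added + rows_removed) / 2
    return abs(rows_added - rows_removed) <= tolerance * mean

print(is_steady_state(10_000, 9_800))   # balanced churn: near steady state
print(is_steady_state(50_000, 5_000))   # AHI still being built: not steady
```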

As you can see from the numbers in the workloads shown above, the Adaptive Hash Index “magic” doesn’t always happen! There are cases when AHI is indeed helpful, and others when it adds a lot of data-structure maintenance overhead and takes memory away from the buffer pool, not to mention the contention overhead. In those cases, it’s better to disable AHI.

Unfortunately, AHI does not seem to have the logic built-in to detect if there is too much “churn” going on to make maintaining AHI worthwhile.

I suggest using these numbers as a general guide to decide whether AHI is likely to benefit your workload. Make sure to run a test/benchmark to be sure.

Interested in learning more about other InnoDB Internals? Please join me for the Innodb Architecture and Performance Optimization Tutorial at Percona Live!
