Attack No-PK Replication Lag with MySQL/Percona Server 8 Invisible Columns!

no primary key replication lag mysql

no primary key replication lag mysqlThe most common issue when using row-based replication (RBR) is replication lag due to the lack of Primary keys.

The problem is that any replicated DML will do a full table scan for each modified row on the replica. This bug report explains it more in-depth: https://bugs.mysql.com/bug.php?id=53375

For example, if a delete is executed on the following table definition:

CREATE TABLE `joinit` (
  `i` int NOT NULL,
  `s` varchar(64) DEFAULT NULL,
  `t` time NOT NULL,
  `g` int NOT NULL


With this amount of rows:

mysql> select count(*) from joinit;
| count(*) |
|  1048576 |


The delete being:

mysql> flush status ;

mysql> delete from joinit where i > 5 and i < 150;
Query OK, 88 rows affected (0.04 sec)

mysql> show status like '%handler%';
| Variable_name              | Value   |
| Handler_commit             | 2       |
| Handler_delete             | 1       |
| Handler_read_rnd_next      | 1048577 |

It can be seen that the delete on the Primary requires a full table scan (Handler_read_rnd_next matches row amount + 1) to delete 88 rows.

The additional problem is that each of the rows being deleted will be recorded in the binary log individually like this:

#220112 18:29:05 server id 1  end_log_pos 3248339 CRC32 0xdd9d1cb2 Delete_rows: table id 106 flags: STMT_END_F
### DELETE FROM `test2`.`joinit`
###   @1=6
###   @2='764d302b-73d5-11ec-afc8-00163ef3b519'
###   @3='18:28:39'
###   @4=27
### DELETE FROM `test2`.`joinit`
###   @1=7
###   @2='764d30bc-73d5-11ec-afc8-00163ef3b519'
###   @3='18:28:39'
###   @4=5
{88 items}

Which will result in 88 full table scans on the replica, and hence the performance degradation.

For these cases, the recommendation is to add a primary key to the table, but sometimes adding a PK might not be easy because:

  • There are no existing columns that could be considered a PK.
  • Or adding a new column (as the PK) is not possible as it might impact queries from a 3rd party tool that we have no control over (or too complex to fix with query rewrite plugin).

The solution is to use MySQL/Percona Server for MySQL 8 and add an invisible column

Adding a new column (named “newc”) invisible as a primary key can be done with the following line:


Note, adding a PK is an expensive operation that requires a table rebuild as shown here.

After adding an invisible PK, the table will look like this:

CREATE TABLE `joinit` (
  `newc` int NOT NULL AUTO_INCREMENT /*!80023 INVISIBLE */,
  `i` int NOT NULL,
  `s` varchar(64) DEFAULT NULL,
  `t` time NOT NULL,
  `g` int NOT NULL,
  PRIMARY KEY (`newc`)


Deleting a row now will be recorded in the binary log like this:

### DELETE FROM `test`.`joinit`
###   @1=1048577
###   @2=1
###   @3='string'
###   @4='17:23:04'
###   @5=5
# at 430
#220112 17:24:56 server id 1  end_log_pos 461 CRC32 0x826f3af6 Xid = 71

Where @1 is the first column ( the PK in this case) which the replica can use to find the matching row without having to do a full table scan.

The operation executed on the replica would be similar to the following which requires only one scan to find the matching row:

mysql> flush status ;
Query OK, 0 rows affected (0.01 sec)

mysql> delete from joinit where newc = 1048578;
Query OK, 1 row affected (0.00 sec)

mysql> show status like '%handler%';
| Variable_name              | Value |
| Handler_commit             | 2     |
| Handler_read_key           | 1     |
| Handler_read_rnd_next      | 0     |


Also as the name suggests, an invisible column won’t show nor it needs to be referenced when doing operations over the table, i.e:

mysql> select * from joinit limit 2; 
| i | s                                    | t        | g  |
| 2 | ecc6cbed-73c9-11ec-afc8-00163ef3b519 | 17:06:03 | 58 |
| 3 | ecc7d9bb-73c9-11ec-afc8-00163ef3b519 | 17:06:03 | 56 |
2 rows in set (0.00 sec)

mysql> insert into joinit values (4, "string", now(), 5);
Query OK, 1 row affected (0.01 sec)


But if needed, the new column (newc) can be fetched if explicitly queried:

mysql> select newc, i, s, t, g from joinit limit 2; 
| newc | i | s                                    | t        | g  |
|    1 | 2 | ecc6cbed-73c9-11ec-afc8-00163ef3b519 | 17:06:03 | 58 |
|    2 | 3 | ecc7d9bb-73c9-11ec-afc8-00163ef3b519 | 17:06:03 | 56 |
2 rows in set (0.00 sec)


What If…?

What if MySQL automatically detects that the PK is missing for InnoDB tables and adds the invisible PK?

Taking into account that an internal six bytes PK is already added when the PK is missing, it might be a good idea to allow the possibility of making the PK visible if you need to. 

This means that when you execute this CREATE TABLE statement:

CREATE TABLE `joinit` (
  `i` int NOT NULL,
  `s` varchar(64) DEFAULT NULL,
  `t` time NOT NULL,
  `g` int NOT NULL,
  PRIMARY KEY (`newc`)


Will be automatically translated to:

CREATE TABLE `joinit` (
  `newc` int NOT NULL AUTO_INCREMENT /*!80023 INVISIBLE */,
  `i` int NOT NULL,
  `s` varchar(64) DEFAULT NULL,
  `t` time NOT NULL,
  `g` int NOT NULL,
  PRIMARY KEY (`newc`)


And then we can execute this command:


To make it visible.


Missing primary keys is a problem for scaling databases, as replication will require a full table scan for each updated/delete row, and the more data the more lag.

Adding a PK might not be always possible because of 3rd party tools or restrictions, but adding an invisible primary key will do the trick and have the benefits of adding a PK without impacting syntax and operations from 3rd party clients/tools. What will be awesome is to make MySQL able to detect the missing PK, add it automatically, and change it to visible if you need to.


Comparing Graviton (ARM) Performance to Intel and AMD for MySQL (Part 3)

Comparing Graviton (ARM) Performance to Intel and AMD for MySQL

Comparing Graviton (ARM) Performance to Intel and AMD for MySQLRecently we published the first part (m5, m5a, m6g) and the second part (C5, C5a, C6g) of research regarding comparing Graviton ARM with AMD and Intel CPU on AWS. We selected general-purpose EC2 instances with the same configurations (amount of vCPU in the first part). In the second part, we compared compute-optimized EC2 instances with the same conditions. The main goal was to see the trend and make a general comparison of CPU types on the AWS platform only for MySQL. We didn’t set the goal to compare the performance of different CPU types. Our expertise is in MySQL performance tuning. We share research “as is” with all scripts, and anyone interested could rerun and reproduce it.
All scripts, raw logs and additional plots are available on GitHub: 

(arm_cpu_comparison_m5, csv_file_with_all_data_m5,


arm_cpu_comparison_m6, csv_file_with_all_data_m6). 

We were happy to see the reactions from our Percona Blog readers to our research. And we are open to any feedback. If anyone has any ideas on updating our methodology, we would be happy to correct it. 

This post is a continuation of research based on our interest in general-purpose EC2 (and, of course, because we saw that our audience wanted to see it). The main inspiration for this research was the feedback of our readers that we compared different generations of instances, especially old AMD instances (m5a.*), and compared it with the latest Graviton instances (m6g.*).  Additionally, we also decided to use the latest Intels instances (m6i.*) too.

Today, we will talk about (AWS) the latest general-purpose EC2: M6i, M6a, M6g (complete list in appendix). 

Short Conclusion:

  1. In most cases for m6i, m6g, and m6a instances, Intel shows better performance in throughput for MySQL read transactions. However, AMD instances are pretty close to Intel’s results.
  2. Sometimes Intel could show a significant advantage — more than almost 200k rps (almost 45% better) than Graviton. However, AMD’s gap wasn’t as significant as in previous results.
    Unfortunately, we compared Graviton with others. So we didn’t concentrate on comparing AMD with Intel. 
  3. If we could say in a few words: m6i instances (with Intel)  are better in their class than other m6a, m6g instances (in performance for MySql). And this advantage starts from 5%-10% and could be up to 45% compared with other CPUs.
  4. But Gravitons instances are still cheaper

Details, or How We Got Our Short Conclusion:


  1. Tests were run  on M6i.* (Intel) , M6a.* (AMD),  M6g.*(Graviton) EC2 instances in the US-EAST-1 region. (List of EC2 see in the appendix). It was selected using only the same class of instances without additional upgrades. The main goal is to take the same instances with only differences in CPU types and identify their performance for MySQL.
  2. Monitoring was done with Percona Monitoring and Management (PMM).
  3. OS: Ubuntu 20.04 LTS 
  4. Load tool (sysbench) and target DB (MySQL) installed on the same EC2 instance.
  5. Oracle MySQL Community Server — 8.0.26-0 — installed from official packages (it was installed from Ubuntu repositories).
  6. Load tool: sysbench —  1.0.18
  7. innodb_buffer_pool_size=80% of available RAM
  8. Test duration is five minutes for each thread and then 90 seconds cool down before the next iteration. 
  9. Tests were run four times independently (to smooth outliers / to have more reproducible results). Then results were averaged for graphs. Also, graphs show min and max values that were during the test, which shows the range of variance. 
  10. We are going to use the “high-concurrency” scenario definition for scenarios when the number of threads would be bigger than the number of vCPU. And “low-concurrent” scenario definition with scenarios where the number of threads would be less or equal to a number of vCPU on EC2.
  11. We are comparing MySQL behavior on the same class of EC2, not CPU performance.
  12. We got some feedback regarding our methodology, and we would update it in the next iteration, with a different configuration, but for this particular research we leave previous to have possibility compare “apples to apples”.
  13. The post is not sponsored by any external company. It was produced using only Percona resources. We do not control what AWS uses as CPU in their instances, we only operate with what they offer. 

Test Case:


To use only CPU (without disk and network) we decided to use only read queries from memory. To do this we did the following actions. 

1. Create DB with 10 tables with 10 000 000 rows each table

sysbench oltp_read_only --threads=10 --mysql-user=sbtest --mysql-password=sbtest --table-size=10000000 --tables=10 --db-driver=mysql --mysql-db=sbtest prepare

2. Load all data to LOAD_buffer 

sysbench oltp_read_only --time=300 --threads=10 --table-size=1000000 --mysql-user=sbtest --mysql-password=sbtest --db-driver=mysql --mysql-db=sbtest run


3. Run in a loop for same scenario but  different concurrency THREAD (1,2,4,8,16,32,64,128) on each EC2 

sysbench oltp_read_only --time=300 --threads=${THREAD} --table-size=100000 --mysql-user=sbtest --mysql-password=sbtest --db-driver=mysql --mysql-db=sbtest run


Result reviewing was split into four parts:

  1. For “small” EC2 with 2, 4, and 8 vCPU
  2. For “medium” EC2 with 16  and 32 vCPU
  3. For  “large” EC2 with 48 and 64 vCPU
  4. For all scenarios to see the overall picture.

There would be four graphs for each test:

  1. Throughput (queries per second) that EC2 could perform for each scenario (number of threads)
  2. Latency 95 percentile that  EC2 could perform for each scenario, (number of threads)
  3. Relative comparing Graviton and Intel, Graviton, and AMD
  4. Absolute comparing Graviton and Intel, Graviton and AMD

Validation that all load goes to the CPU, not to DISK I/O or network, was done also using PMM (Percona Monitoring and Management). 

perona monitoring and management

pic 0.1. OS monitoring during all test stages


Result for EC2 with 2, 4, and 8 vCPU:

Result for EC2 with 2, 4, and 8 vCPU

Plot 1.1.  Throughput (queries per second) for EC2 with 2, 4 and 8 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 1.2.  Latencies (95 percentile) during the test for EC2 with 2, 4 and 8 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 1.3.1 Percentage comparison Graviton and Intel CPU in throughput (queries per second) for EC2 with 2, 4 and 8 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 1.3.2  Percentage comparison Graviton and AMD CPU in throughput (queries per second) for EC2 with 2, 4 and 8 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 1.4.1. Numbers comparison Graviton and Intel CPU in throughput (queries per second) for EC2 with 2, 4 and 8 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 1.4.2. Numbers comparison Graviton and Intel CPU in throughput (queries per second) for EC2 with 2, 4 and 8 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads



  1. Based on plot 1.1. We could say that EC2 with Intel hasn’t an absolute advantage compared with Graviton and AMD. 
  2. Especially Intel and AMD, showing an advantage a little bit over – 20% over Graviton.
  3. In numbers, it is over five thousand and more requests per second. 
  4. AMD showed better results for two vCPU instances. 
  5. And it looks like in M6 class of Gravitons CPUs show the worst result compared with others.

Result for EC2 with 16 and 32 vCPU:

Result for EC2 with 16 and 32 vCPU

Plot 2.1. Throughput (queries per second) for EC2 with 16 and 32 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 2.2. Latencies (95 percentile) during the test for EC2 with 16 and 32 vCPU for scenarios with 1,2 4,8,16,32,64,128 threads


Plot 2.3.1 Percentage comparison Graviton and Intel CPU in throughput (queries per second) for EC2 with 16 and 32 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 2.3.2  Percentage comparison Graviton and AMD CPU in throughput (queries per second) for EC2 with 16 and 32 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 2.4.1. Numbers comparison Graviton and Intel CPU in throughput (queries per second) for EC2 with 16 and 32 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 2.4.2. Numbers comparison Graviton and AMD CPU in throughput (queries per second) for EC2 with 16 and 32 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads



  1. Plot 2.1 shows that Intel vCPU is more performance efficient. AMD is the second, and Graviton is the third. 
  2. According to plots 2.3.1 and 2.3.2, Intel is better than Graviton up to 30 % and AMD is better than Graviton up to 20%. Graviton has some exceptional performance advantage over  AMD in some scenarios. But with this configuration and this instance classes, it is an exception according to the plot 2.3.2 scenarios for 8 and 16 concurrent threads. 
  3. In real numbers, Intel could execute up to 140 k read transactions more than Graviton CPUs, and AMD could read more than 70 k read transactions than Graviton. (plot 2.1. , plot 2.4.1.)
  4. In most cases, AMD and Intel are better than Graviton EC2 instances (plot 2.1, plot 2.3.2, plot 2.4.2).


Result for EC2 with 48 and 64 vCPU:

Result for EC2 with 48 and 64 vCPU

Plot 3.1. Throughput (queries per second) for EC2 with 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 3.2. Latencies (95 percentile) during the test for EC2 with 48 and 64 vCPU for scenarios with 1,2 4,8,16,32,64,128 threads


Plot 3.3.1 Percentage comparison Graviton and Intel CPU in throughput (queries per second) for EC2 with 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 3.3.2  Percentage comparison Graviton and AMD CPU in throughput (queries per second) for EC2 with 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 3.4.1. Numbers comparison Graviton and Intel CPU in throughput (queries per second) for EC2 with 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 3.4.2. Numbers comparison Graviton and AMD CPU in throughput (queries per second) for EC2 with 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


  1. For “Large” instances, Intel is still better than other vCPU. AMD was still in second place, except when Graviton stole some cases. (plot 3.1.)
  2. According to plot 3.3.1. Intel showed an advantage over Graviton up to 45%. On the other hand, AMD showed an advantage over Graviton up to 20% in the same case.
  3. There were two cases when Graviton showed some better results, but it is an exception. 
  4. In real numbers: Intel could generate over 150k-200k read transactions more than Graviton. And AMD could execute more than 70k – 130k read transactions than Graviton.


Full Result Overview:

Plot 4.1.1. Throughput (queries per second) – bar plot for EC2 with 2, 4, 8, 16, 32, 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 4.1.2. Throughput (queries per second) – line plot for EC2 with 2, 4, 8, 16, 32, 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 4.2.1. Latencies (95 percentile) during the test – bar plot for EC2 with 2, 4, 8, 16, 32, 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 4.2.2. Latencies (95 percentile) during the test – line plot for EC2 with 2, 4, 8, 16, 32, 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 4.3.1 Percentage comparison Graviton and Intel CPU in throughput (queries per second) for EC2 with 2, 4, 8, 16, 32, 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 4.3.2 Percentage comparison Graviton and AMD CPU in throughput (queries per second) for EC2 with 2, 4, 8, 16, 32, 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 4.4.1. Numbers comparison Graviton and Intel CPU in throughput (queries per second) for EC2 with 2, 4, 8, 16, 32, 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 4.4.2. Numbers comparison Graviton and AMD CPU in throughput (queries per second) for EC2 with 2, 4, 8, 16, 32, 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 4.5.1. Percentage comparison INTEL and AMD CPU in throughput (queries per second) for EC2 with 2, 4, 8, 16, 32, 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads


Plot 4.5.2. Numbers comparison INTEL and AMD CPU in throughput (queries per second) for EC2 with 2, 4, 8, 16, 32, 48 and 64 vCPU for scenarios with 1,2,4,8,16,32,64,128 threads

Final Thoughts

  1. We compare general-purpose EC2 (m6i, m6a, m6g) instances from the AWS platform and their behavior for MySQL.  
  2. In these competitions, Graviton instances (m6g) do not provide any competitive results for MySQL. 
  3. There was some strange behavior. AMD and Intel showed their best performance when loaded (in the number of threads) are equal numbers of vCPU. According to plot 4.1.2. We could see some jump in performance when the load becomes the same as the amount of vCPU. This point was hard to see on the bar chart. But this is very interesting. However, Graviton worked more slightly without any “jumps”, and that’s why it showed exceptionally better results in some scenarios with AMD.
  4. Last point. Everyone wants to see an AMD vs Intel comparison. Plot 4.5.1 and 4.5.2.  The result – Intel is better in most cases.  And AMD was better only in one case with 2 vCPU. So the advantage of Intel compared with AMD could rise up to 96% for “large instances” (in some cases). It is unbelievable. But in most cases, this advantage is that Intel could run in 30% more MySql read transactions than AMD.
  5. It is still an open question regarding the economic efficiency of all this EC2. We would research this topic and answer this question a little bit later.


List of EC2 used in research:

CPU type Cpu info:

Model name

EC2 Memory GB Amount vCPU EC2 price per hour (USD)
AMD AMD EPYC 7R13 Processor 2650 MHz m6a.large 8 2 $0.0864
AMD m6a.xlarge 16 4 $0.1728
AMD m6a.2xlarge 32 8 $0.3456
AMD m6a.4xlarge 64 16 $0.6912
AMD m6a.8xlarge 128 32 $1.3824
AMD m6a.12xlarge 192 48 $2.0736
AMD m6a.16xlarge 256 64 $2.7648
Graviton ARMv8 AWS Graviton2 2500 MHz m6g.large 8 2 $0.077 
Graviton m6g.xlarge 16 4 $0.154
Graviton m6g.2xlarge 32 8 $0.308
Graviton m6g.4xlarge 64 16 $0.616
Graviton m6g.8xlarge 128 32 $1.232
Graviton m6g.12xlarge 192 48 $1.848
Graviton m6g.16xlarge 256 64 $2.464
Intel Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz m6i.large 8 2 $0.096000
Intel m6i.xlarge 16 4 $0.192000
Intel m6i.2xlarge 32 8 $0.384000 
Intel m6i.4xlarge 64 16 $0.768000
Intel m6i.8xlarge 128 32 $1.536000
Intel m6i.12xlarge 192 48 $2.304000
Intel m6i.16xlarge 256 64 $3.072000







server_id = 7

# general

table_open_cache = 200000






# files





# buffers






innodb_flush_log_at_trx_commit  = 1

innodb_doublewrite= 1

innodb_flush_method= O_DIRECT

innodb_file_per_table= 1





bind_address =




Online DDL With Group Replication in MySQL 8.0.27

Online DDL With Group Replication in MySQL 8.0.27

Online DDL With Group Replication in MySQL 8.0.27In April 2021, I wrote an article about Online DDL and Group Replication. At that time we were dealing with MySQL 8.0.23 and also opened a bug report which did not have the right answer to the case presented. 

Anyhow, in that article I have shown how an online DDL was de facto locking the whole cluster for a very long time even when using the consistency level set to EVENTUAL.

This article is to give justice to the work done by the MySQL/Oracle engineers to correct that annoying inconvenience. 

Before going ahead, let us remember how an Online DDL was propagated in a group replication cluster, and identify the differences with what happens now, all with the consistency level set to EVENTUAL (see).

In MySQL 8.0.23 we were having:

While in MySQL 8.0.27 we have:

As you can see from the images we have three different phases. Phase one is the same between version 8.0.23 and version 8.0.27. 

Phases two and three, instead, are quite different. In MySQL 8.0.23 after the DDL is applied on the Primary, it is propagated to the other nodes, but a metalock was also acquired and the control was NOT returned. The result was that not only the session executing the DDL was kept on hold, but also all the other sessions performing modifications. 

Only when the operation was over on all secondaries, the DDL was pushed to Binlog and disseminated for Asynchronous replication, lock raised and operation can restart.

Instead, in MySQL 8.0.27,  once the operation is over on the primary the DDL is pushed to binlog, disseminated to the secondaries and control returned. The result is that the write operations on primary have no interruption whatsoever and the DDL is distributed to secondary and Asynchronous replication at the same time. 

This is a fantastic improvement, available only with consistency level EVENTUAL, but still, fantastic.

Let’s See Some Numbers

To test the operation, I have used the same approach used in the previous tests in the article mentioned above.

Connection 1:
    ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=NONE;
    ALTER TABLE windmills_test drop INDEX idx_1, ALGORITHM=INPLACE;
Connection 2:
 while [ 1 = 1 ];do da=$(date +'%s.%3N');/opt/mysql_templates/mysql-8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_large -e "insert into windmills_test  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmill7 limit 1;" -e "select count(*) from windmills_large.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done

Connection 3:
 while [ 1 = 1 ];do da=$(date +'%s.%3N');/opt/mysql_templates/mysql-8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_large -e "insert into windmill8  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmill7 limit 1;" -e "select count(*) from windmills_large.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done

Connections 4-5:
     while [ 1 = 1 ];do echo "$(date +'%T.%3N')";/opt/mysql_templates/mysql-8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_large -e "show full processlist;"|egrep -i -e "(windmills_test|windmills_large)"|grep -i -v localhost;sleep 1;done

Modifying a table with ~5 million rows:

node1-DC1 (root@localhost) [windmills_large]>select count(*) from  windmills_test;
| count(*) |
|  5002909 |

The numbers below represent the time second/milliseconds taken by the operation to complete. While I was also catching the state of the ALTER on the other node I am not reporting it here given it is not relevant. 

EVENTUAL (on the primary only)
Node 1 same table:
.186 <--- no locking during alter on the same node
.217 <--- moment of commit

Node 1 another table :
.198 <--- no locking during alter on the same node
.211  <--- moment of commit

As you can see there is just a very small delay at the moment of commit, but other impacts.

Now if we compare this with the recent tests I have done for Percona XtraDB Cluster (PXC) Non-Blocking operation (see A Look Into Percona XtraDB Cluster Non-Blocking Operation for Online Schema Upgrade) with the same number of rows and same kind of table/data:

Action Group Replication PXC (NBO)
Time on hold for insert for altering table ~ 0.217 sec ~ 120 sec
Time on hold for insert for another table ~ 0.211 sec ~ 25 sec

However, yes there is a however, PXC was maintaining consistency between the different nodes during the DDL execution, while MySQL 8.0.27 with Group Replication was postponing consistency on the secondaries, thus Primary and Secondary were not in sync until full DDL finalization on the secondaries.


MySQL 8.0.27 comes with this nice fix that significantly reduces the impact of an online DDL operation on a busy server. But we can still observe a significant misalignment of the data between the nodes when a DDL is executing. 

On the other hand, PXC with NBO is a bit more “expensive” in time, but nodes remain aligned all the time.

In the end, is what is more important for you to choose one or the other solution, consistency vs. operational impact.

Great MySQL to all.


MySQL 8.0 Functional Indexes

MySQL 8.0 Functional Indexes

MySQL 8.0 Functional IndexesWorking with hundreds of different customers I often face similar problems around running queries. One very common problem when trying to optimize a database environment is index usage. A query that cannot use an index is usually a long-running one, consuming more memory or triggering more disk iops.

A very common case is when a query uses a filter condition against a column that is involved in some kind of functional expression. An index on that column can not be used.

Starting from MySQL 8.0.13 functional indexes are supported. In this article, I’m going to show what they are and how they work.

The Well-Known Problem

As already mentioned, a very common problem about index usage is when you have a filter condition against one or more columns involved in some kind of functional expression.

Let’s see a simple example.

You have a table called products containing the details of your products, including a create_time TIMESTAMP column. If you would like to calculate the average price of your products on a specific month you could do the following:

mysql> SELECT AVG(price) FROM products WHERE MONTH(create_time)=10;
| AVG(price) |
| 202.982582 |

The query returns the right value, but take a look at the EXPLAIN:

mysql> EXPLAIN SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 99015
     filtered: 100.00
        Extra: Using where


The query triggers a full scan of the table. Let’s create an index on create_time and check again:

mysql> ALTER TABLE products ADD INDEX(create_time);
Query OK, 0 rows affected (0.71 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 99015
     filtered: 100.00
        Extra: Using where


A full scan again. The index we have created is not effective. Indeed any time an indexed column is involved in a function the index can not be used.

To optimize the query the workaround is rewriting it differently in order to isolate the indexed column from the function.

Let’s test the following equivalent query:

mysql> SELECT AVG(price) FROM products WHERE create_time BETWEEN '2019-10-01' AND '2019-11-01';
| AVG(price) |
| 202.982582 |

mysql> EXPLAIN SELECT AVG(price) FROM products WHERE create_time BETWEEN '2019-10-01' AND '2019-11-01'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: range
possible_keys: create_time
          key: create_time
      key_len: 5
          ref: NULL
         rows: 182
     filtered: 100.00
        Extra: Using index condition


Cool, now the index is used. Then rewriting the query was the typical suggestion.

Quite a simple solution, but not all the times it was possible to change the application code for many valid reasons. So, what to do then?


MySQL 8.0 Functional Indexes

Starting from version 8.0.13, MySQL supports functional indexes. Instead of indexing a simple column, you can create the index on the result of any function applied to a column or multiple columns.

Long story short, now you can do the following:

mysql> ALTER TABLE products ADD INDEX((MONTH(create_time)));
Query OK, 0 rows affected (0.74 sec)
Records: 0  Duplicates: 0  Warnings: 0

Be aware of the double parentheses. The syntax is correct since the expression must be enclosed within parentheses to distinguish it from columns or column prefixes.

Indeed the following returns an error:

mysql> ALTER TABLE products ADD INDEX(MONTH(create_time));
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'create_time))' at line 1

Let’s check now our original query and see what happens to the EXPLAIN

mysql> SELECT AVG(price) FROM products WHERE MONTH(create_time)=10;
| AVG(price) |
| 202.982582 |


mysql> EXPLAIN SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: ref
possible_keys: functional_index
          key: functional_index
      key_len: 5
          ref: const
         rows: 182
     filtered: 100.00
        Extra: NULL


The query is no longer a full scan and runs faster. The functional_index has been used, with only 182 rows examined. Awesome.

Thanks to the functional index we are no longer forced to rewrite the query.

Which Functional Indexes are Permitted

We have seen an example involving a simple function applied to a column, but you are granted to create more complex indexes.

A functional index may contain any kind of expressions, not only a single function. The following patterns are valid functional indexes:

INDEX( ( col1 + col2 ) )
INDEX( ( FUNC(col1) + col2 – col3 ) )

You can use ASC or DESC as well:

INDEX( ( MONTH(col1) ) DESC )

You can have multiple functional parts, each one included in parentheses:

INDEX( ( col1 + col2 ), ( FUNC(col2) ) )

You can mix functional with nonfunctional parts:

INDEX( (FUNC(col1)), col2, (col2 + col3), col4 )

There are also limitations you should be aware of:

  • A functional key can not contain a single column. The following is not permitted:
    INDEX( (col1), (col2) )
  • The primary key can not include a functional key part
  • The foreign key can not include a functional key part
  • SPATIAL and FULLTEXT indexes can not include functional key parts
  • A functional key part can not refer to a column prefix

At last, remember that the functional index is useful only to optimize the query that uses the exact same expression. An index created with nonfunctional parts can be used instead to solve multiple different queries.

For example, the following conditions can not rely on the functional index we have created:

WHERE YEAR(create_time) = 2019

WHERE create_time > ‘2019-10-01’

WHERE create_time BETWEEN ‘2019-10-01’ AND ‘2019-11-01’


All these will trigger a full scan.

Functional Index Internal

The functional indexes are implemented as hidden virtual generated columns. For this reason, you can emulate the same behavior even on MySQL 5.7 by explicitly creating the virtual column. We can test this, starting by dropping the indexes we have created so far.

mysql> SHOW CREATE TABLE products\G
*************************** 1. row ***************************
       Table: products
Create Table: CREATE TABLE `products` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `description` longtext,
  `price` decimal(8,2) DEFAULT NULL,
  `create_time` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `create_time` (`create_time`),
  KEY `functional_index` ((month(`create_time`)))
) ENGINE=InnoDB AUTO_INCREMENT=149960 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci


mysql> ALTER TABLE products DROP INDEX `create_time`, DROP INDEX `functional_index`;
Query OK, 0 rows affected (0.03 sec)

We can try now to create the virtual generated column:

Query OK, 0 rows affected (0.04 sec)

Create the index on the virtual column:

mysql> ALTER TABLE products ADD INDEX(create_month);
Query OK, 0 rows affected (0.55 sec)


mysql> SHOW CREATE TABLE products\G
*************************** 1. row ***************************
       Table: products
Create Table: CREATE TABLE `products` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `description` longtext,
  `price` decimal(8,2) DEFAULT NULL,
  `create_time` timestamp NULL DEFAULT NULL,
  `create_month` tinyint GENERATED ALWAYS AS (month(`create_time`)) VIRTUAL,
  PRIMARY KEY (`id`),
  KEY `create_month` (`create_month`)
) ENGINE=InnoDB AUTO_INCREMENT=149960 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci


We can now try our original query. We expect to see the same behavior as the functional index.

mysql> SELECT AVG(price) FROM products WHERE MONTH(create_time)=10;
| AVG(price) |
| 202.982582 |

mysql> EXPLAIN SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: ref
possible_keys: create_month
          key: create_month
      key_len: 2
          ref: const
         rows: 182
     filtered: 100.00
        Extra: NULL


Indeed, the behavior is the same. The index on the virtual column can be used and the query is optimized.

The good news is that you can use this workaround to emulate a functional index even on 5.7, getting the same benefits. The advantage of MySQL 8.0 is that it is completely transparent, no need to create the virtual column.

Since the functional index is implemented as a hidden virtual column, there is no additional space needed for the data, only the index space will be added to the table.

By the way, this is the same technique used for creating indexes on JSON documents’ fields.


The functional index support is an interesting improvement you can find in MySQL 8.0. Some of the queries that required rewriting to get optimized don’t require that anymore. Just remember that only the queries having the same filter pattern can rely on the functional index. Then you need to create additional indexes or other functional indexes to improve other search patterns.

The same feature can be implemented on MySQL 5.7 with the explicit creation of a virtual generated column and the index.

For more detailed information, read the following page:



In Application and Database Design, Small Things Can Have a Big Impact

Application and Database Design

Application and Database DesignWith modern application design, systems are becoming more diverse, varied and have more components than ever before. Developers are often forced to become master chefs adding the ingredients from dozens of different technologies and blending them together to create something tasty and amazing. But with so many different ingredients, it is often difficult to understand how the individual ingredients interact with each other. The more diverse the application, the more likely it is that some seemingly insignificant combination of technology may cause cascading effects.

Many people I talk to have hundreds if not thousands of different libraries, APIs, components, and services making up the systems they support. In this type of environment, it is very difficult to know what small thing could add up to something much bigger. Look at some of the more recent big cloud or application outages, they often have their root cause in something rather small.

Street Fight: Python 3.10 -vs- 3.9.7

Let me give you an example of something I ran across recently. I noticed that performance on two different nodes running the same hardware/application code was performing drastically different than one another. The app server was running close to 100% CPU on one server while the other was around 40%. Each had the same workload and the same database, etc. It turned out that one server was using Python 3.10.0 and the other was running 3.9.7. This combined with the MySQL Connector/Python lead to almost a 50% reduction in database throughput (the 3.10.0 release saw a regression). However, this performance change was not seen either in my PostgreSQL testing or in testing the mysqlclient connector. It happened only when running the pure python version of the MySQL connector. See the results below:

Python 3.10 -vs- 3.9.7

Note: This workload was using a warmed BP, with workload running for over 24 hours before the test. I cycled the application server making the change to either the version of python or the MySQL library. These tests are repeatable regardless of the length of the run. All data fits into memory here. I am not trying to make an authoritative statement on a specific technology, merely pointing out the complexity of layers of technology.

Looking at this purely from the user perspective, this particular benchmark simulates certain types of users.  Let’s look at the number of users who could complete their actions per second.  I will also add another pure python MySQL driver to the mix.

Note: There appears to be a significant regression in 3.10. The application server was significantly busier when using Python 3.10 and one of the pure python drivers than when running the same test in 3.9 or earlier.

The main difference between the MySQL Connector and mysqlclient is the mysqlclient is using the C libmysqlclient. Oddly the official MySQL Connector says it should switch between the pure python and C version if available, but I was not seeing that behavior ( so I have to look into it ). This resulted in the page load time for the app in Python 3.10.0 taking 0.05 seconds up from 0.03 seconds in Python 3.9.7. However, the key thing I wanted to highlight is that sometimes seemingly small or insignificant changes can lead to a drastic difference in performance and stability. You could be running along fine for months or even years before something upgrades to something that you would not think would drastically impact performance.

Can This Be Fixed With Better Testing?

You may think this is a poster child for testing before upgrading components, and while that is a requirement, it won’t necessarily prevent this type of issue. While technology combinations, upgrades, and releases can often have some odd side effects, oftentimes issues they introduce remain hidden and don’t manifest until some outside influence pops up. Note: These generally happen at the worst time possible, like during a significant marketing campaign, event, etc. The most common way I see nasty bugs get exposed is not through a release or in testing but often with a change in workload. Workload changes can hide or raise bottlenecks at the worst times. Let’s take a look at the above combination of Python version and different connectors with a different workload:

Here the combination of reporting and read/write workload push all the app nodes and database nodes to the redline. These look fairly similar in terms of performance, but the workload is hiding the above issues I mentioned. A system pushed to the red will behave differently than it will in the real world. If you ended up testing upgrading python on your app servers to 3.10.0 by pushing your systems to the max, you may see the above small regression as within acceptable limits. In reality, however, the upgrade could net you seeing a 50% decrease in throughput when moved to production.

Do We Care?

Depending on how your application is built many people won’t notice the above-mentioned performance regression after the upgrade happens. First, most people do not run their servers even close to 100% load, adding more load on the boxes may not immediately impact their user’s performance. Adding .02 seconds of load time to a user may be imperceptible unless under heavy load ( which would increase that load time). The practical impact is to speed up the point at which you need to either add more nodes or upgrade their instances sooner.

Second, scaling application nodes automatically is almost a requirement in most modern cloud-native environments. Reaching a point where you need to add more nodes and more processing power will come with increases in users on your application, so it is easily explained away.

Suppose users won’t immediately notice and the system will automatically expand as needed ( preventing you from knowing or getting involved ). In that case, do we, or should we care about adding more nodes or servers? Adding nodes is cheap; it is not free.

First, there is a direct cost to you. Take your hosting costs for your application servers and double them in the case above. What is that? $10K, $100K, $1M a year? That is money that is wasted. Look no further than the recent news lamenting the ever-increasing costs of the cloud i.e.:

Second, there is a bigger cost that comes with complexity. Observability is such a huge topic because everything we end up doing in modern environments is done in mass. The more nodes and servers you have the more potential exists for issues or one node behaving badly. While our goal is to create a system where everything is replaceable and can be torn down and rebuilt to overcome problems, this is often not the reality. Instead, we end up replicating bad code, underlying bottlenecks, and making a single problem a problem at 100x the scale.

We need to care. Application workload is a living entity that grows, shrinks, and expands with users. While modern systems need to scale quickly up and down to meet the demand, that should not preclude us from having to look out for hidden issues and bottlenecks. It is vitally important that we understand our applications workloads and look for deviations in the normal patterns.  We need to ask why something changed and dig in to find the answer.  Just because we can build automation to mitigate the problem does not mean we should get complacent and lazy about fixing and optimizing our systems.


Comparing AMD EPYC Performance with Intel Xeon in GCP

AMD EPYC Intel Xeon

AMD EPYC Intel XeonRecently we were asked to check the performance of the new family of AMD EPYC processors when using MySQL in Google Cloud Virtual Machines. This was motivated by a user running MySQL in the N1 machines family and willing to upgrade to N2D generation considering the potential cost savings using the new AMD family. 

The idea behind the analysis is to do a side-by-side comparison of performance considering some factors: 

  • EPYC processors have demonstrated better performance in purely CPU-based operations according to published benchmarks. 
  • EPYC platform has lower costs compared to the Intel Xeon platform. 

The goal of this analysis is to check if cost reductions by upgrading from N1 to N2D are worth the change to avoid suffering from performance problems and eventually reduce the machine size from the current 64 cores based (N1 n1-highmem-64 – Intel Haswell) to either N2D 64 cores (n2d-highmem-64 – AMD Rome) or even to 48 cores (n2d-highmem-48 – AMD Rome), to provide some extra context we included N2 (the new generation of Intel machines) into the analysis. 

In order to do a purely CPU performance comparison we created 4 different VMs:

NAME: n1-64
MACHINE_TYPE: n1-highmem-64
Intel Haswell – Xeon 2.30GHz
*This VM corresponds to the same type as the type we use in Production.

NAME: n2-64
MACHINE_TYPE: n2-highmem-64
Intel Cascade Lake – Xeon 2.80GHz

NAME: n2d-48
MACHINE_TYPE: n2d-highmem-48
AMD Epyc Rome – 2.25Ghz

NAME: n2d-64
MACHINE_TYPE: n2d-highmem-64
AMD Epyc Rome – 2.25Ghz

For the analysis, we used MySQL Community Server 5.7.35-log and this is the basic configuration:

datadir   = /var/lib/mysql
socket = /var/lib/mysql/mysql.sock
log-error   = /var/lib/mysql/mysqld.err
pid-file = /var/run/mysqld/mysqld.pid
server_id                       = 100
binlog_format                   = ROW
sync_binlog                     = 1000
expire_logs_days                = 2

innodb_buffer_pool_size         = 350G
innodb_buffer_pool_instances    = 32
innodb_concurrency_tickets      = 5000
innodb_thread_concurrency       = 128
innodb_write_io_threads         = 16
innodb_read_io_threads          = 16
innodb_flush_log_at_trx_commit  = 1
innodb_flush_method             = O_DIRECT
innodb_log_file_size            = 8G
innodb_file_per_table           = 1
innodb_autoinc_lock_mode        = 2
innodb_buffer_pool_dump_at_shutdown = 1
innodb_buffer_pool_load_at_startup  = 1

table_open_cache                = 5000
thread_cache_size               = 2000
query_cache_size                = 0
query_cache_type                = 0

In all cases, we placed a 1TB balanced persistent drive so we get enough IO performance for the tests. We wanted to normalize all the specs so we can focus on the CPU performance, so don’t pay too much attention to the chances for improving performance for IO operations and so. 

The analysis is based on sysbench oltp read-only workload with an in-memory dataset, the reason for this is that we want to generate traffic that can saturate CPU while not being affected by IO or Memory. 

The approach for the benchmark was also simple, we executed RO OLTP work for 16, 32, 64, 128, and 256 threads with a one-minute wait between runs. Scripts and results from tests can be found here

Let’s jump into the analysis, these are the number of Queries that instances are capable to run: 

MySQL Queries

The maximum amount of TPS by Instance Type by the number of threads:

Threads/Instance N1-64 N2-64 N2D-48 N2D-64


164k 230k 144k



265k 347k 252k 268k


415k 598k 345k



398k 591k 335k


256 381k 554k 328k


Some observations: 

  • In all cases we reached the maximum TPS at 64 threads, this is somehow expected as we are not generating CPU context switches. 
  • Roughly we get a maximum of 598k tps in n2-highmem-64 and 444k tps in n2d-highmem-64 instance types which are the bigger ones. While this is expected Intel-based architecture outperforms AMD by a 35% 
  • Maximum tps seems to be reached with 64 threads, this is expected considering the number of CPU threads we can use in parallel. 
  • While n1-highmem-64 (Intel Xeon) and n2d-highmem-48 (AMD Epyc) seems to start suffering performance issues when the amount of threads exceeds the max number of cores the bigger instances running with 64 cores are capable to sustain the throughput a bit better, these instances start to be impacted when we reach 4x the amount of CPU cores. 

Let’s have a look at the CPU utilization on each node:

CPU utilization on each node

Additional observations: 

  • n1-highmem-64 and n2d-highmem-48 are reaching 100% utilization at 64 threads running. 
  • With 64 threads running n2-highmem-64 reaches 100% utilization while n2d-highmem-64 is still below. Although Intel provides better throughput overall probably by having a faster CPU clock (2.8Ghz vs 2.25Ghz) 
  • For 128 and 256 threads all CPUs show similar utilization. 

For the sake of analysis this is the estimated costs of each of used machines (at the moment of writing the post):
n1-highmem-64 $2,035.49/month = $0.000785297/second
n2-highmem-64 $2,549.39/month = $0.000983561/second
n2d-highmem-48 $1,698.54/month = $0.000655301/second
n2d-highmem-64 $2,231.06/month = $0.000860748/second

Costs above will give us roughly at peaks of TPS:
n1-highmem-64 costs are $0.0000000019/trx
n2-highmem-64 costs are $0.0000000016/trx
n2d-highmem-48 costs are $0.0000000019/trx
n2d-highmem-64 costs are $0.0000000019/trx


While this is not a super exhaustive analysis of all implications of CPU performance for MySQL workload we get a very good understanding of cost vs performance analysis. 

  • n1 family, currently used in production, shows very similar performance to n2d family (AMD) when running with the same amount of cores. This changes a lot when we move into the n2 family (Intel) which outperforms all other instances. 
  • While the cut in costs for moving into n2d-highmem-48 will represent ~$4k/year the performance penalty is close to 20%.
  • Comparing the costs per trx at peaks of loads we can see that both n2-64 and n2d-64 are pretty much the same but n2-64 will give us 35% more throughput, this is definitely something to consider if we plan to squeeze the CPU power.   
  • If the consideration is to go with n2 generation then definitely the n2d-highmem-64 is a very good choice to balance performance and costs but n2-highmem-64 will give much better performance per dollar spent. 

Percona Server for MySQL Encryption Options and Choices

Percona Server for MySQL Encryption

Percona Server for MySQL EncryptionSecurity will always be a main focal point of a company’s data. A common question I get from clients is, “how do I enable encryption?” Like every good consulting answer, it depends on what you are trying to encrypt. This post is a high-level summary of the different options available for encryption in Percona Server for MySQL.

Different certifications require different levels of encryption. For example, PCI requires both encryptions of data at rest and in transit. Here are the main facets of encryption for MySQL:

  • Data at Rest
    • Full disk encryption (at the OS level)
    • Transparent Data Encryption – TDE
    • Column/field-level encryption
  • Data in Transit
    • TLS Connections

Data at Rest

Data at rest is frequently the most asked about part of encryption. Data at rest encryption has multiple components, but at the core is simply ensuring that the data is encrypted at some level when stored. Here are the primary ways we can look at the encryption of data at rest.

Full Disk Encryption (FDE)

This is the easiest and most portable method of encrypting data at rest. When using full disk encryption, the main goal is to protect the hard drives in the event they are compromised. If a disk is removed from the server or the server is removed from a rack, the disk isn’t readable without the encryption key.

This can be managed in different ways, but the infrastructure team generally handles it. Frequently, enterprises already have disk encryption as part of the infrastructure stack. This makes FDE a relatively easy option for data at rest encryption. It also has the advantage of being portable. Regardless of which database technology you use, the encryption is managed at the server level.

The main disadvantage of FDE is that when the server is running, and the disk is mounted, all data is readable. It offers no protection against an attack on a running server once mounted.

Transparent Data Encryption (TDE)

Moving up the chain, the next option for data at rest encryption is Transparent Data Encryption (TDE). In contrast to FDE, this method encrypts the actual InnoDB data and log files. The main difference with database TDE is that the encryption is managed through the database, not at the server level. With this approach, the data and log files are encrypted on disk by the database. As data is read by MySQL/queries, the encrypted pages are read from disk and decrypted to be loaded into InnoDB’s buffer pool for execution.

For this method, the encryption keys are managed either through local files or a remote KMS (such as Hashicorp Vault) with the keyring_plugin. While this approach helps prevent any OS user from simply copying data files, the decrypted data does reside in memory which could be susceptible to a clever hacker. We must rely on OS-level memory protections for further assurance. It also adds a level of complexity for key management and backups that is now shifted to the DBA team.

Column Level Encryption

While the prior methods of at-rest encryption can help to meet various compliance requirements, both are limited when it comes to a running system. In either case, if a running system is compromised, the data stored is fully readable. Column level encryption works to protect the data in a running system without a key. Without a key, the data in the encrypted column is unreadable.

While this method protects selected data in a running system, it often requires application-level changes. Inserts are done with a specific encryption function (AES_ENCRYPT in MySQL, for example). To read the data, AES_DECRYPT with the specified key is required. The main risk with this approach is sending the plaintext values as part of the query. This can be sniffed if not using TLS or potentially leaked through log files. The better approach is to encrypt the data in the application BEFORE sending it to MySQL to ensure no plaintext is ever passed between systems.

In some cases, you can use a shared key for the entire application. Other approaches would be to use an envelope method and store a unique key alongside each encrypted value (protected by a separate master key).

Either way, it is important to understand one of the primary downsides to this approach – indexes and sort order can and will be impacted. For example, if you are encrypting the SSN number, you won’t be able to sort by SSN within MySQL. You would be able to look up a row using the SSN number but would need to pass the encrypted value.

Data in Transit

Now that we’ve discussed the different types of data-at-rest encryption, it is important to encrypt traffic to and from the database. Connecting to the server via TLS ensures that any sensitive sent to or from the server is encrypted. This can prevent data from leaking over the wire or via man-in-the-middle attacks.

This is a straightforward way to secure communication, and when combined with some at-rest encryption, serves to check a few more boxes towards various compliances.


Overall, there are several aspects of encryption in MySQL. This makes it possible to meet many common compliance requirements for different types of regulations. Security is a critical piece of the database tier, and these discussions are needed across teams in an organization. Ensuring that security, infrastructure, and the database team are on the same page is essential, especially during the design phase. Let our Professional Services team help you implement the approach that is best suited for your requirements – we are here to help!

Percona Distribution for MySQL is the most complete, stable, scalable, and secure, open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Download Percona Distribution for MySQL Today


Taking a Look at BTRFS for MySQL


BTRFS for MySQLFollowing my post MySQL/ZFS Performance Update, a few people have suggested I should take a look at BTRFS (“butter-FS”, “b-tree FS”) with MySQL. BTRFS is a filesystem with an architecture and a set of features that are similar to ZFS and with a GPL license. It is a copy-on-write (CoW) filesystem supporting snapshots, RAID, and data compression. These are compelling features for a database server so let’s have a look.

Many years ago, in 2012, Vadim wrote a blog post about BTRFS and the results were disappointing. Needless to say that since 2012, a lot of work and effort has been invested in BTRFS. So, this post will examine the BTRFS version that comes with the latest Ubuntu LTS, 20.04. It is not bleeding edge but it is likely the most recent release and Linux kernel I see in production environments. Ubuntu 20.04 LTS is based on the Linux kernel 5.4.0.

Test Environment

Doing benchmarks is not my core duty at Percona, as, before all, I am a consultant working with our customers. I didn’t want to have a cloud-based instance running mostly idle for months. Instead, I used a KVM instance in my personal lab and made sure the host could easily provide the required resources. The test instance has the following characteristics:

  • 4 CPU
  • 4 GB of RAM
  • 50 GB of storage space throttle at 500 IOPS and 8MB/s

The storage bandwidth limitation is to mimic the AWS EBS behavior capping the IO size to 16KB. With KVM, I used the following iotune section:


These IO limitations are very important to consider in respect to the following results and analysis.

For these benchmarks, I use Percona Server for MySQL 8.0.22-13 and unless stated otherwise, the relevant configuration variables are:

datadir = /var/lib/mysql/data
innodb_log_group_home_dir = /var/lib/mysql/log
innodb_lru_scan_depth = 256 # given the bp size, makes more sense
innodb_flush_neighbors = 0
innodb_fast_shutdown = 2 # skip flushing
innodb_flush_method = O_DIRECT # ZFS uses fsync
performance_schema = off

Benchmark Procedure

For this post, I used the sysbench implementation of the TPCC. The table parameter was set to 10 and the size parameter was set to 20. Those settings yielded a MySQL dataset size of approximately 22GB uncompressed.

All benchmarks used eight threads and lasted two hours. For simplicity, I am only reporting here the total number of events over the duration of the benchmark. TPCC results usually report only one type of event, New order transactions. Keep this in mind if you intend to compare my results with other TPCC benchmarks.

Finally, the dataset is refreshed for every run either by restoring a tar archive or by creating the dataset using the sysbench prepare option.


BTRFS was created and mounted using:

mkfs.btrfs /dev/vdb -f
mount -t btrfs /dev/vdb /var/lib/mysql -o compress-force=zstd:1,noatime,autodefrag
mkdir /var/lib/mysql/data; mkdir /var/lib/mysql/log;
chown -R mysql:mysql /var/lib/mysql


ZFS was created and configured using:

zpool create -f bench /dev/vdb
zfs set compression=lz4 atime=off logbias=throughput bench
zfs create -o mountpoint=/var/lib/mysql/data -o recordsize=16k -o primarycache=metadata bench/data
zfs create -o mountpoint=/var/lib/mysql/log bench/log
echo 2147483648 > /sys/module/zfs/parameters/zfs_arc_max
echo 0 > /sys/module/zfs/parameters/zfs_arc_min

Since the ZFS file cache, the ARC, is compressed, an attempt was made with 3GB of ARC and only 256MB of buffer pool. I called this configuration “ARC bias”.


mkfs.ext4 -F /dev/vdb"
mount -t ext4 /dev/vdb /var/lib/mysql -o noatime,norelatime"


TPCC events

The performance results are presented below. I must admit my surprise at the low results of BTRFS. Over the two-hour period, btrfs wasn’t able to reach 20k events. The “btrfs tar” configuration restored the dataset from a tar file instead of doing a “prepare” using the database. This helped BTRFS somewhat (see the filesystem size results below) but it was clearly insufficient to really make a difference. I really wonder if it is a misconfiguration on my part, contact me in the comments if you think I made a mistake.

The ZFS performance is more than three times higher, reaching almost 67k events. By squeezing more data in the ARC, the ARC bias configuration even managed to execute more events than ext4, about 97k versus 94k.

TPCC performance comparison

TPCC performance comparison, ext4, BTRFS and ZFS

Filesystem Size

Performance, although an important aspect, is not the only consideration behind the decision of using a filesystem like BTRFS or ZFS. The impacts of data compression on filesystem size is also important. The resulting filesystem sizes are presented below.

TPCC dataset size comparison, ext4, BTRFS and ZFS

TPCC dataset size comparison, ext4, BTRFS and ZFS

The uncompressed TPCC dataset size with ext4 is 21GB. Surprisingly, when the dataset is created by the database, with small random disk operations, BTRFS appears to not compress the data at all. This is in stark contrast to the behavior when the dataset is restored from a tar archive. The restore of the tar archive causes large sequential disk operations which trigger compression. The resulting filesystem size is 4.2GB, a fifth of the original size.

Although this is a great compression ratio, the fact that normal database operations yield no compression with BTRFS is really problematic. Using ZFS with lz4 compression, the filesystem size is 6GB. Also, the ZFS compression ratio is not significantly affected by the method of restoring the data.

The performance issues of BTRFS have also been observed by Phoronix; their SQLite and PostgreSQL results are pretty significant and inline with the results presented in this post. It seems that BTRFS is not optimized for the small random IO type of operations requested by database engines. Phonorix recently published an update for the kernel 5.14 and the situation may have improved. The BTRFS random IO write operation results still seems to be quite low.


Although I have been pleased with the ease of installation and configuration of BTRFS, a database workload seems to be far from optimal for it. BTRFS struggles with small random IO operations and doesn’t compress the small blocks. So until these shortcomings are addressed, I will not consider BTRFS as a prime contender for database workloads.


Percona Support Team New Year Greetings!

Happy New Year from Percona Support

In Percona Support we spend a few hours per week doing Labs: short collective projects, not directly related to the tickets.
One of our last labs was about to get ready for the winter holiday season which means preparing the tree and decorating it. To do it we used our favorite tools: MySQL, and ASCII art (https://en.wikipedia.org/wiki/ASCII_art). As a result, we created a function that prints into the terminal a New Year Tree (https://en.wikipedia.org/wiki/New_Year_tree) and decorates it with symbols we want.
For example, to create a three-level tree, decorated with starts, call the function as follow:

$ mysql ny_tree -e "select ny_tree(3, '*')" --vertical --skip-column-names -s
*************************** 1. row ***************************

  /* \
 /  * \

Since MySQL supports Unicode you can use any symbol, having it has the same width as the space symbol. For example, you can decorate your tree with wax candles without any risk of firing up your home:
$ mysql ny_tree -e "select ny_tree(4, '?')" --vertical --skip-column-names -s
*************************** 1. row ***************************

   /? \
  /?  ?\
 /?? ?  \
You can change your terminal colors to those that look more celebrating, align the size of the terminal with the number of levels in the tree, and call the function in a loop to have an animated picture.
For example, animation at the beginning of this post was created by a few calls like:
$ for i in `seq 1 1 1000`; do mysql ny_tree -e "select '\\nHappy New 2022 Year\!', ny_tree(12, '?')" --vertical --skip-column-names -s; sleep 1; done
I only adjusted the number of levels for each terminal size.
Code for the


  and two helper functions is located at https://github.com/Percona-Lab/ny-tree

We, Percona Support Team, wish you Happy Holidays and Happy New Year without serious database issues!

Percona Monitoring and Management 2 Test Drive Using VirtualBox and SSH Tunnels

PMM using VirtualBox and SSH

PMM using VirtualBox and SSHPercona Monitoring and Management 2 (PMM2) is the database monitoring suite assembled and developed by Percona. It is based on standard open source components and custom-made software integrations. PMM2 helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.

This blog post will describe a method to test PMM2 using your laptop’s VirtualBox, ssh tunnels, and without installing any agents on the database servers. This is a testing and evaluation environment, not intended for production. If you want to perform a full-fledged test of PMM2, we recommend an environment as similar as possible to your final production setup: enterprise virtualization, docker containers, or AWS. We assume that your laptop doesn’t have direct connectivity to the databases, this is why we use ssh tunnels.

PMM2 architecture consists of 2+1 components:PMM2 High Level Architecture
Two components run on your infrastructure: PMM Agents and PMM Server. The agents gather the metrics at the database and operating system levels. PMM Server takes care of processing, storing, grouping, and displaying these metrics. It can also perform additional operations like capturing serverless databases metrics, backups, and sending alerts (the last two features are in technical preview as of this writing). The other component to complete the formula is the Percona Platform, which adds more features to PMM, from advisors to DBaaS. Disclaimer: Percona Platform is in preview release with limited functionality – suitable for early adopters, development, and testing. Besides the extended features added to PMM, the Percona Platform brings together distributions of MySQL, PostgreSQL, and MongoDB including a range of open-source tools for data backup, availability, and management. You can learn more about the Platform here.

To make setup easier, PMM2 Server can be run either as a docker container or importing an OVA image, executed using VMWare VSphere, VirtualBox, or any other hypervisor. If you run your infrastructure in AWS, you can deploy PMM from the AWS Marketplace.

To run the agents, you need a Linux box. We recommend running the agents and the database on the same node. PMM can also gather the metrics using a direct connection to a server-less database or running an operating system that does not support the agent.

Often, installing the agents is a stopper for some DBAs who would like to test PMM2. Also, while containers are frequent in large organizations, we find virtualization and containers relegated to development and quality assurance environments. These environments usually don’t have direct access to production databases.

TCP Forwarding Across SSH Connections

AllowTcpForwarding is the ssh daemon configuration option that allows forwarding TCP ports across the ssh connection. At first sight, this may seem a security risk, but as the ssh documentation states: “disabling TCP forwarding does not improve security unless users are also denied shell access, as they can always install their forwarders.”

If your system administrators do not allow TCP forwarding, other options available to accomplish the same results are socat or netcat. But we will not cover them here.

If your laptop has direct access to the databases, you can skip all the ssh tunnels and use the direct access method described later in this post.

Install PMM 2 Ova

Download the Open Virtualization Format compatible image from https://www.percona.com/downloads/pmm2/ or use the command line:

$ wget https://www.percona.com/downloads/pmm2/2.25.0/ova/pmm-server-2.25.0.ova

You can import the OVA file using the UI, with the import option from the file menu, or using the command line:

$ VBoxManage import pmm-server-2.25.0.ova --vsys 0 --vmname "PMM Testing"
Interpreting /Users/pep/Downloads/pmm-server-2.25.0.ova...
vmdisk1 42949672960 -1 http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized PMM2-Server-2021-12-13-1012-disk001.vmdk -1 -1
vmdisk2 429496729600 -1 http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized PMM2-Server-2021-12-13-1012-disk002.vmdk -1 -1

Virtual system 0:
0: Suggested OS type: "RedHat_64"
(change with "--vsys 0 --ostype "; use "list ostypes" to list all possible values)
1: VM name specified with --vmname: "PMM Testing"
2: Suggested VM group "/"
(change with "--vsys 0 --group ")
3: Suggested VM settings file name "/Users/Pep/VirtualBox VMs/PMM2-Server-2021-12-13-1012/PMM2-Server-2021-12-13-1012.vbox"
(change with "--vsys 0 --settingsfile ")
4: Suggested VM base folder "/Users/Pep/VirtualBox VMs"
(change with "--vsys 0 --basefolder ")
5: Product (ignored): Percona Monitoring and Management
6: Vendor (ignored): Percona
7: Version (ignored): 2021-12-13
8: ProductUrl (ignored): https://www.percona.com/software/database-tools/percona-monitoring-and-management
9: VendorUrl (ignored): https://www.percona.com
10: Description "Percona Monitoring and Management (PMM) is an open-source platform for managing and monitoring MySQL and MongoDB performance"
(change with "--vsys 0 --description ")
11: Number of CPUs: 1
(change with "--vsys 0 --cpus ")
12: Guest memory: 4096 MB
(change with "--vsys 0 --memory ")
13: Network adapter: orig NAT, config 3, extra slot=0;type=NAT
14: SCSI controller, type LsiLogic
(change with "--vsys 0 --unit 14 --scsitype {BusLogic|LsiLogic}";
disable with "--vsys 0 --unit 14 --ignore")
15: Hard disk image: source image=PMM2-Server-2021-12-13-1012-disk001.vmdk, target path=PMM2-Server-2021-12-13-1012-disk001.vmdk, controller=14;channel=0
(change target path with "--vsys 0 --unit 15 --disk path";
disable with "--vsys 0 --unit 15 --ignore")
16: Hard disk image: source image=PMM2-Server-2021-12-13-1012-disk002.vmdk, target path=PMM2-Server-2021-12-13-1012-disk002.vmdk, controller=14;channel=1
(change target path with "--vsys 0 --unit 16 --disk path";
disable with "--vsys 0 --unit 16 --ignore")
Successfully imported the appliance.

Once the machine is imported, we will connect it to a host-only network. This network restricts network traffic only between the host and the virtual machines. But first, let’s find a suitable network:

$ VBoxManage list hostonlyifs
Name: vboxnet0
GUID: 786f6276-656e-4074-8000-0a0027000000
DHCP: Disabled
IPV6NetworkMaskPrefixLength: 0
HardwareAddress: 0a:00:27:00:00:00
MediumType: Ethernet
Wireless: No
Status: Up
VBoxNetworkName: HostInterfaceNetworking-vboxnet0

Select the first one that has Status up, write down the name and IP address. Then make sure that there is a DHCP server assigned to that interface:

$ VBoxManage list dhcpservers
NetworkName: HostInterfaceNetworking-vboxnet0
Dhcpd IP:
Enabled: Yes
Global Configuration:
minLeaseTime: default
defaultLeaseTime: default
maxLeaseTime: default
Forced options: None
Suppressed opts.: None
Groups: None
Individual Configs: None

Now we will assign two network interfaces to our PMM virtual machine. One is allocated to the internal network, and the other uses NAT to connect to the internet and, for example, check for upgrades.

$ VBoxManage modifyvm "PMM Testing" --nic1 hostonly --hostonlyadapter1 vboxnet0
$ VBoxManage modifyvm "PMM Testing" --nic2 natnetwork

Once networking is configured, we may start the virtual machine.

$ VBoxManage startvm "PMM Testing"

The next step is to retrieve the IP address assigned to our PMM box. First, we will obtain the MAC address of the network card we recently added:

$ VBoxManage showvminfo "PMM Testing" | grep -i vboxnet0
NIC 1: MAC: 08002772600D, Attachment: Host-only Interface 'vboxnet0', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none

Using the retrieved MAC address we can look for the DHCP leases:

$ VBoxManage dhcpserver findlease --interface=vboxnet0 --mac-address=08002772600D
IP Address:
MAC Address: 08:00:27:72:60:0d
State: acked
Issued: 2021-12-21T22:15:54Z (1640124954)
Expire: 2021-12-21T22:25:54Z (1640125554)
TTL: 600 sec, currently 444 sec left

This is the IP address we will use to access the PMM server. Open a browser to connect to with the default credentials: admin/admin.

PMM2 Login window
The following step configures the tunnels to connect to the databases we monitor.

Set Up the SSH Tunnels

This is the topology of our network:

Private network topology
We will open two ssh connections per server we want to access from PMM. Open a terminal session and execute the following command, replacing it with the username you normally use to connect to your jump host:

$ ssh -L @

This creates a tunnel that connects the MySQL Server port 3306 with our local internal address in the same port. If you want to connect to more than one MySQL instance, you must use different ports. To open the tunnel for the MongoDB server, use the following command:

$ ssh -L @

Test the tunnel connectivity to the MySQL host using netcat:

$ nc -z 3306
Connection to port 3306 [tcp/mysql] succeeded!

And also test the connectivity to the MongoDB host:

$ nc -z 27017
Connection to port 27017 [tcp/*] succeeded!

This is the topology of our network including the ssh tunnels.
SSH tunnels

Configure Accounts

Follow the PMM documentation and create a MySQL account (or use an already existing account) with the required privileges:


Note that we need to use the internal IP address for the jump host. If you don’t know the IP address, use the wildcard ‘%’.

Add also the credentials for MongoDB, run this in a Mongo session:

role: "explainRole",
privileges: [{
resource: {
db: "",
collection: ""
actions: [

user: "pmm_mongodb",
pwd: "",
roles: [
{ role: "explainRole", db: "admin" },
{ role: "clusterMonitor", db: "admin" },
{ role: "read", db: "local" }

Add the Services to PMM

We can’t install the agents because we don’t have access to our PMM testing environment from the database servers. Instead, we will configure both services as remote instances. Go to the “Configuration” menu , select “PMM Inventory” , then “Add instance” . Then choose MySQL Add a remote instance.
Complete the following fields:
Hostname: (This is the internal Host-Only VirtualBox address)
Service name: MySQL8
Port: 3306
Username: pmm
Password: <password>

And press the button. It will check the connectivity and, if everything is correct, the MySQL service will be added to the inventory. If there is an error, double-check that the ssh connection is still open and you entered the correct credentials. Make sure that the host that you specified to create the MySQL user is correct.

We will use a similar process for MongoDB:

These are the fields you have to complete with the correct information:
Hostname: (Again, the internal Host-Only VirtualBox address)
Service name: MongoDB
Port: 27017
Username: pmm_mongodb
Password: <password>

And press the button. It will check the connectivity and, if everything is correct, the MongoDB service will be added to the inventory. If there is an error, double-check again that the ssh connection is open and you entered the correct credentials. You can use also the MongoDB client application to check access.

Once you added both services, you just need to wait for a few minutes to give time to collect data and start testing PMM2.

PMM2 Query Analyzer

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com