InnoDB Performance Optimization Basics

This blog updates our previous ‘InnoDB Performance Optimization Basics’ posts from 2007 and 2013. Although there have been many blogs about adjusting MySQL variables for better performance since then, this topic deserves a refresh: the last update was a decade ago, and MySQL 5.7 and 8.0 have been released since then with some major changes.

These guidelines work well for a wide range of applications, though the optimal settings, of course, depend on the workload.



Memory

The amount of RAM to be provisioned for database servers can vary greatly depending on the size of the database and the specific requirements of the company. Some servers may need a few GBs of RAM, while others may need hundreds of GBs or even terabytes of RAM. Factors that can affect the amount of RAM needed by a database server include the total size of the database, the number of concurrent users, and the complexity of the database queries. As datasets continue to grow in size, the amount of RAM required to store and process these datasets also increases. By caching hot datasets, indexes, and ongoing changes, InnoDB can provide faster response times and utilize disk IO in a much more optimal way.


CPU

From a CPU standpoint, faster processors with many cores provide better throughput. CPUs with 32/64 cores are still common, and we see some large clients with 96 cores, and the latest MySQL versions can utilize them much better than before. However, it is worth noting that simply adding more CPU cores does not always result in improved performance. CPU core usage will also depend on the specific workload of the application, such as the number of concurrent users or the complexity of the queries being run.


Storage

The type of storage and disk used for database servers can have a significant impact on performance and reliability. Nowadays, solid-state drives (SSDs) or non-volatile memory express (NVMe) drives are preferred over traditional hard disk drives (HDDs) for database servers due to their faster read and write speeds, lower latency, and improved reliability. While NVMe or SSDs are generally more expensive than HDDs, the increased performance and reliability that they offer make them a cost-effective choice for database servers that require fast access to data and minimal downtime. RAID 10 is still the recommended level for most workloads, but make sure your RAID controller can utilize the SSD drive’s performance and will not become the actual bottleneck.

Operating system

Linux is the most common operating system for high-performance MySQL servers. Make sure to use modern filesystems, like EXT4, XFS, or ZFS on Linux, combined with the most recent kernel. Each of them has its own limits and advantages: for example, XFS is fast in deleting large files, while EXT4 can provide better performance on fast SSD drives, and ZFS on Linux has progressed a lot. Benchmark before you decide.

For database servers, we usually recommend our clients have:

  1. Jemalloc installed and enabled for MySQL.
  2. Transparent huge pages (THP) disabled.
  3. Swappiness set to 1 (vm.swappiness=1), lowering the tendency to swap.
  4. oom_score_adj set to -800 for the mysqld process.
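The list above can be captured in a small script to review before running as root. This is only a sketch: the THP sysfs paths and the pidof lookup are distro-dependent assumptions, and jemalloc is usually enabled via a package plus the service unit rather than a script.

```shell
# Sketch of the OS-level tuning above. Writes a script for review before
# running as root; paths and the pidof lookup are distro-dependent assumptions.
cat <<'EOF' > mysql-os-tuning.sh
#!/bin/sh
# 1. Jemalloc is typically enabled via the distro package and an
#    LD_PRELOAD/Environment line in the mysqld service unit, not here.
# 2. Disable transparent huge pages (sysfs path varies by distro/kernel).
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
# 3. Lower the tendency to swap.
sysctl -w vm.swappiness=1
# 4. Make the OOM killer strongly prefer other processes over mysqld.
echo -800 > /proc/$(pidof mysqld)/oom_score_adj
EOF
chmod +x mysql-os-tuning.sh
```

Remember that sysctl and oom_score_adj changes do not survive a reboot, so persist them via /etc/sysctl.d/ and the service unit.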


Cloud

Different cloud providers offer a range of instance types and sizes, each with varying amounts of CPU, memory, and storage. Some cloud providers also offer specialized instances for database workloads, which may provide additional features and optimizations for performance and scalability. One of the benefits of cloud-based database servers is the ability to scale resources up or down as needed. It’s important to consider the potential need for scaling and select an instance type and size to accommodate future growth. Some cloud providers also offer auto-scaling features that can automatically adjust the number of instances based on workload demand.

MySQL InnoDB settings

(Dynamic) – Does not require MySQL restart for change.

(Static) – Requires MySQL restart for change.

innodb_buffer_pool_size (Dynamic) – InnoDB relies heavily on the buffer pool, so it should be sized correctly. Typically, a good value is 70%-80% of available memory. Also, refer to innodb_buffer_pool_chunk_size mentioned below.

innodb_buffer_pool_instances (Static) – Setting this to a value greater than one is useful in highly concurrent workloads, as it may reduce contention on the global mutexes. The optimal value can be decided after testing multiple settings; starting from eight is a good choice.

innodb_buffer_pool_chunk_size  (Static) – Defines the chunk size by which the buffer pool is enlarged or reduced. This variable is not dynamic, and if it is incorrectly configured, it could lead to undesired situations.  Refer to InnoDB Buffer Pool Resizing: Chunk Change for more details on configuration.

innodb_log_file_size (Static) – Large enough InnoDB transaction logs are crucial for good, stable write performance. However, larger log files also mean a slower recovery process in case of a crash. This variable has been deprecated since 8.0.30; refer to innodb_redo_log_capacity below.

innodb_redo_log_capacity (Dynamic) – Introduced in 8.0.30, this defines the amount of disk space occupied by redo log files. This variable supersedes the innodb_log_files_in_group and innodb_log_file_size variables. When this setting is defined, the innodb_log_files_in_group and innodb_log_file_size settings are ignored (those two variables are now deprecated since 8.0.30).
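A common way to size the redo capacity is to sample the InnoDB log sequence number (from SHOW ENGINE INNODB STATUS) twice, one minute apart, and provision roughly an hour's worth of redo writes. A sketch with hypothetical LSN samples:

```shell
# Hypothetical LSN values sampled 60 seconds apart, e.g., from the
# "Log sequence number" line of SHOW ENGINE INNODB STATUS.
lsn1=8201412341
lsn2=8254632145
bytes_per_min=$((lsn2 - lsn1))          # redo bytes written per minute
redo_capacity=$((bytes_per_min * 60))   # rule of thumb: hold ~1 hour of redo
echo "suggested innodb_redo_log_capacity: $redo_capacity bytes"
```

The one-hour figure is a starting heuristic, not a hard rule; sample during peak write load and round to a sensible value.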

innodb_log_buffer_size (Dynamic) – InnoDB writes changed data records into its log buffer, which is kept in memory, and it saves disk I/O for large transactions as it does not need to write the log of changes to disk before transaction commit. If you have transactions that update, insert, or delete many rows, making the log buffer larger saves disk I/O. 

innodb_flush_log_at_trx_commit (Dynamic) – The default value of ‘1’ gives the most durability (ACID compliance) at a cost of increased filesystem writes/syncs. Setting the value to ‘0’ or ‘2’ will give more performance but less durability. At a minimum, transactions are flushed once per second.

innodb_thread_concurrency (Dynamic) – With improvements to the InnoDB engine, it is recommended to allow the engine to control the concurrency by keeping it to the default value (which is zero). If you see concurrency issues, you can tune this variable. A recommended value is two times the number of CPUs plus the number of disks. 

innodb_flush_method (Static) – Setting this to O_DIRECT will avoid a performance penalty from double buffering; this means InnoDB bypasses the operating system’s file cache and writes data directly to disk (reducing the number of I/O operations required).
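Putting several of the settings above together, a starting-point configuration fragment might look like the following. Every value here is an illustrative assumption for a dedicated 64 GB server and should be benchmarked against your workload, not copied blindly.

```shell
# Write a sample my.cnf fragment for review. Values assume a dedicated
# 64 GB MySQL 8.0.30+ server and are starting points, not recommendations.
cat <<'EOF' > innodb-tuning.cnf
[mysqld]
innodb_buffer_pool_size        = 48G      # ~75% of 64 GB RAM
innodb_buffer_pool_instances   = 8
innodb_redo_log_capacity       = 8G       # MySQL >= 8.0.30
innodb_log_buffer_size         = 64M
innodb_flush_log_at_trx_commit = 1        # full durability
innodb_flush_method            = O_DIRECT # avoid double buffering
EOF
```

Drop the fragment into your include directory (e.g., /etc/my.cnf.d/) and restart MySQL, since several of these variables are static.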

innodb_online_alter_log_max_size (Dynamic) – The upper limit in bytes on the size of the temporary log files used during online DDL operations for InnoDB tables. If a temporary log file exceeds the upper size limit, the ALTER TABLE operation fails, and all uncommitted concurrent DML operations are rolled back. Thus, a large value for this option allows more DML to happen during an online DDL operation but also extends the period of time at the end of the DDL operation when the table is locked to apply the data from the log.

innodb_numa_interleave (Static) – For NUMA-enabled systems with large amounts of memory (i.e., > 128 GB), we recommend turning on NUMA interleaving. Enabling this parameter configures memory allocation to be ‘interleaved’ across the various CPU-memory channels. This helps “even out” memory allocations so that one CPU does not become a memory bottleneck.

innodb_buffer_pool_dump_at_shutdown/innodb_buffer_pool_load_at_startup (Dynamic/Static respectively) – These variables allow you to dump the contents of the InnoDB buffer pool to disk at shutdown and load it back at startup, which will pre-warm the buffer pool so that you don’t have to start with a cold buffer pool after a restart.

innodb_buffer_pool_dump_pct (Dynamic) – The option defines the percentage of most recently used buffer pool pages to dump. By default, MySQL only saves 25% of the most actively accessed pages, which should be reasonable for most use cases; the dump can then be loaded faster than if you tried to load every page in the buffer pool (100%), many of which might not be necessary for a general workload. You can increase this percentage if needed for your use case.

innodb_io_capacity (Dynamic) – It defines the number of I/O operations per second (IOPS) available to InnoDB background tasks, such as flushing pages from the buffer pool and merging data from the change buffer. Ideally, keep the setting as low as practical but not so low that background activities fall behind. Refer to this for more information on configuration.

innodb_io_capacity_max (Dynamic) – If the flushing activity falls behind, InnoDB can flush more aggressively, at a higher rate than innodb_io_capacity. innodb_io_capacity_max defines the maximum number of IOPS performed by InnoDB background tasks in such situations. Refer to Give Love to Your SSDs – Reduce innodb_io_capacity_max! for more information on configuration.

innodb_autoinc_lock_mode (Static) – Setting the value to ‘2’ (interleaved mode) can remove the need for an auto-inc lock (at the table level) and can increase performance when using multi-row insert statements to insert values into a table with an auto-increment primary key. Note that this requires either ROW or MIXED binlog format. (The default setting is 2 as of MySQL 8.0)

innodb_temp_data_file_path (Static) – Defines the relative path, name, size, and attributes of InnoDB temporary tablespace data files. If you do not specify a value for innodb_temp_data_file_path, the default behavior is to create a single, auto-extending data file named ibtmp1 in the MySQL data directory. For 5.7, it is recommended to set a max value to avoid the risk of the datadir partition filling up due to a heavy or bad query. 8.0 introduced session temporary tablespaces; temporary tables and the internal optimizer tables no longer use ‘ibtmp1’.

innodb_stats_on_metadata (Dynamic) – The default setting of “OFF” avoids unnecessary updating of InnoDB statistics and can greatly improve read speeds.

innodb_page_cleaners (Static) – InnoDB supports multiple page cleaner threads for flushing dirty pages from buffer pool instances. If your workload is write-IO bound when flushing dirty pages from buffer pool instances to data files, and if your system hardware has available capacity, increasing the number of page cleaner threads may help improve write-IO throughput.

innodb_deadlock_detect (Dynamic) – This option can be used to disable deadlock detection. On high-concurrency systems, deadlock detection can cause a slowdown when numerous threads wait for the same lock.

Application tuning for InnoDB

Make sure your application is prepared to handle deadlocks that may happen. Review your table structure and see how you can take advantage of InnoDB properties – clustering by primary key, having a primary key in all indexes (so keep primary key short), and fast lookups by primary keys (try to use it in joins).


There are many other options you may want to tune, but here we’ve covered the important InnoDB parameters, OS-related tweaking, and hardware for optimal MySQL server performance. I hope this helps!

If all of this seems overwhelming or difficult to understand, our world-class MySQL Training is here to help. Our two-day Intensive Scaling and Optimizations for MySQL class walks you through much of what you need to know to optimize your MySQL performance.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!


Try Percona Distribution for MySQL today!




Why PostgreSQL Is a Top Choice for Enterprise-level Databases


When it comes to enterprise-level databases, there are several options available in the market, but PostgreSQL stands out as one of the most popular and reliable choices. PostgreSQL is a free and open source object-relational database management system (ORDBMS) that has existed since the mid-1990s.

Over the years, it has evolved into a robust and feature-rich database that offers several advantages over other database management systems. In this blog post, we will explore some of the reasons why PostgreSQL is a top choice for enterprise-level databases.

Open source and free

  • PostgreSQL is an open source database, which means it is free to use, distribute, and modify.
  • This makes it an attractive option for businesses of all sizes, especially for startups and small businesses that may not have the budget to invest in expensive database management systems.
  • PostgreSQL is backed by a large community of developers contributing to its development, support, and documentation.

Advanced features

PostgreSQL offers a wide range of advanced features that make it a top choice for enterprise-level databases. Some of the notable features include:

  • Support for JSON and XML data types
  • Full-text search capabilities
  • Built-in replication and high availability
  • Extensible architecture with support for custom data types, functions, and operators
  • Support for foreign keys and referential integrity constraints
  • Transactions with ACID (Atomicity, Consistency, Isolation, Durability) compliance


Scalability

  • PostgreSQL is designed to scale with growing businesses. It can handle large datasets and complex queries with ease, and it can be easily scaled horizontally or vertically.
  • PostgreSQL supports sharding, which allows data to be distributed across multiple servers, making it ideal for high-traffic websites and applications.


Reliability

  • PostgreSQL is known for its reliability and stability. It has a robust transactional system that ensures data integrity and consistency, even in high-transaction environments.
  • It has a proven track record of handling large volumes of data and high-traffic websites. PostgreSQL uses a multi-version concurrency control (MVCC) system, ensuring multiple users can access the same data simultaneously without conflicts or data loss.
  • It also supports multiple concurrent users and can handle complex database operations without downtime or data loss. This makes it an excellent choice for mission-critical applications that require high availability.


Flexibility

PostgreSQL offers a high degree of flexibility, allowing developers to customize it to their specific needs. It supports a wide range of data types, including JSON, and can handle complex data structures. It also offers a range of extensions and plugins that can be used to enhance its functionality, making it a versatile choice for production-ready databases.


Security

PostgreSQL offers robust security features to protect data from unauthorized access. It supports SSL encryption for secure communication and offers various authentication methods, including LDAP, Kerberos, and GSSAPI. It also supports role-based access control, which allows developers to control access to data at the user level. This makes it an ideal choice for businesses that handle sensitive data.

Integration with other tools and technologies

PostgreSQL integrates seamlessly with other tools and technologies commonly used in enterprise-level applications. It has connectors for programming languages such as Java, Python, and PHP, as well as integrations with popular data visualization tools such as Tableau and Power BI.


PostgreSQL is a top choice for production-ready databases due to its scalability, reliability, flexibility, security, and community support. Its features make it an ideal choice for businesses that require a robust and versatile database system. If you are looking for a database system for your enterprise-level application, PostgreSQL is definitely worth considering.

Percona provides managed PostgreSQL services that include database monitoring, performance tuning, and backup and recovery services. Percona also provides PostgreSQL Support Services for businesses and organizations that use PostgreSQL as their primary database technology.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.

Download Percona Distribution for PostgreSQL Today!


Join the Percona PostgreSQL Community


Deploy Percona Monitoring and Management on Amazon EKS With eksctl and Helm


One of the installation methods we support for our database software is through Helm. We have a collection of Helm charts in this repository for several Percona software products.

Through this blog post, you will learn how to install Percona Monitoring and Management in a Kubernetes cluster on Amazon EKS using Helm and eksctl.


Prerequisites

For the installation of PMM with Helm, you will need:

  • Kubernetes 1.22+
  • Helm 3.2.0+
  • PV (Persistent Volume) provisioner support in the underlying infrastructure

If you want to install PMM in a Kubernetes cluster on Amazon EKS, you can use eksctl to create the required cluster. I published this blog post Creating a Kubernetes cluster on Amazon EKS with eksctl, in the Percona Community blog, where I explain how to use this tool.

For an easy way to deploy the Kubernetes cluster, check Percona My Database as a Service (MyDBaaS), a Percona Labs project. I also recommend checking this article Percona Labs Presents: Infrastructure Generator for Percona Database as a Service (DBaaS) on our blog, where the process of creating the cluster is described.

MyDBaaS will help you with cluster creation. It will generate the configuration file needed for eksctl, or it can deploy the cluster to AWS.

To use eksctl you must:

  • Install kubectl
  • Create a user with minimum IAM policies
  • Create an access key ID and secret access key for the user previously created
  • Install AWS CLI and configure authentication (aws configure, from the command line)

  • Install eksctl

Create a Kubernetes cluster

To create the cluster, you need to generate the configuration file for eksctl. Go to the MyDBaaS site and fill out the details of your cluster.

Figure 1: MyDBaaS – Kubernetes Cluster Configuration

  1. Give your cluster a name. MyDbaas is the default value.
  2. Select the number of nodes. The default value is three, but you can create a cluster of up to five nodes.
  3. Write the instance type.
  4. Select the region. The default is us-west-2.

If you don’t know what instance type to use, go to the Instance Type Selector and select:

  • Number of CPUs
  • Memory size
  • Region

Figure 2: MyDBaaS – Instance Type Selector

As stated on the website, by default this tool only returns the configuration file needed for eksctl, but you can also provide your AWS credentials to have the tool deploy the EKS cluster for you.

After filling out the details, click on Submit and you will get the configuration file that will look like this:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: MyDbaas
  region: us-east-1
addons:
- name: aws-ebs-csi-driver
nodeGroups:
- name: ng-1
  instanceType: m5.large
  desiredCapacity: 2
  minSize: 1
  maxSize: 5
  volumeSize: 100
  volumeType: gp2
  iam:
    withAddonPolicies:
      ebs: true
      efs: true
      fsx: true
  preBootstrapCommands:
  - echo 'OPTIONS="--default-ulimit nofile=1048576:1048576"' >> /etc/sysconfig/docker
  - systemctl restart docker

Then, create the cluster by running the following command:

eksctl create cluster -f cluster.yaml

While running, eksctl will create the cluster and all the necessary resources. It will take a few minutes to complete.

Figure 3: eksctl Running

Cluster credentials can be found in the kubeconfig file that eksctl writes. Try

kubectl get nodes

to verify that this file is valid, as suggested by eksctl.

Install PMM with Helm

Once the cluster has been created and configured, you can install PMM using Helm.

  1. Add the repository:

helm repo add percona

  2. Install PMM:

helm install pmm --set service.type="LoadBalancer" percona/pmm

Once the PMM server runs, you must get its IP address. Run this command to get this value.

kubectl get services monitoring-service

You will get an output similar to the following.

NAME                 TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)                      AGE
monitoring-service   LoadBalancer   ...          ...           443:30252/TCP,80:31201/TCP   100s

The EXTERNAL-IP value is the one you need to access PMM from the browser.

Before accessing the dashboard of PMM, get the password.

export ADMIN_PASS=$(kubectl get secret pmm-secret --namespace default -o jsonpath='{.data.PMM_ADMIN_PASSWORD}' | base64 --decode)

The value of ADMIN_PASS is the password you need to log into the dashboard. The default user is admin.

Go to the browser and paste the external IP in the address bar.

Figure 4: PMM running on Kubernetes

Now you have PMM running in the cloud on Amazon EKS.

I recommend you check this article Percona Monitoring and Management in Kubernetes is now in Tech Preview on our blog for more information about PMM in Kubernetes using Helm.


Through this blog post, you learned how to create a Kubernetes cluster with eksctl and deploy PMM using Helm with the help of Percona MyDBaaS. The process would be the same for any of the Percona software in the collection of Helm charts.

Learn more about Percona MyDBaaS!


Increase the Ability to Securely Replicate Your Data and Restrict Replication To Row-based Events in MySQL


In this blog, I’ll discuss a replication use case: improving the ability to replicate data securely and restricting replication to row-based events in scenarios where we do not have control over the source(s).

Prior to MySQL 8.0.18, the replica had no checking capabilities when processing replicated transactions; it simply carried out all instructions from its upstream. Because changes may get past the security barrier separating the source and replica in some configurations, the replica must be able to impose data access limitations on the replicated stream. In such situations, applying the upstream changes in a more constrained security context is beneficial to organizations that need privilege-controlled aggregate data from multiple separate databases.

In MySQL 8.0.18, a new feature, PRIVILEGE_CHECKS_USER, was introduced for replication channels. When a PRIVILEGE_CHECKS_USER account is used, a replication channel is better protected from unauthorized or unintentional use of privileged or undesirable actions.

By default, the replication threads use the privileges of the MySQL user account configured for replication. This can pose a security risk if the user account has privileges not necessary for replication, such as the SUPER privilege or the ability to execute arbitrary SQL statements. You can reduce the risk of unauthorized access or malicious attacks by setting PRIVILEGE_CHECKS_USER to a separate user account with limited privileges. The replication threads will use the limited privileges of the PRIVILEGE_CHECKS_USER account, which can help prevent the execution of unauthorized SQL statements or the manipulation of sensitive data.

PRIVILEGE_CHECKS_USER is also helpful in multi-source replication to carefully gather data from various sources.

Configure user on replica


mysql> SET SQL_LOG_BIN=0;
mysql> CREATE USER repl;
mysql> SET SQL_LOG_BIN=1;


  • The SESSION_VARIABLES_ADMIN privilege is a subset of the SYSTEM_VARIABLES_ADMIN and SUPER privileges. SESSION_VARIABLES_ADMIN might be a security risk if the replication provider is unreliable.
  • Minimal performance issues may be noticed.
  • Messages are recorded in the error file if the channel terminates due to failed privilege checks.

Another feature, REQUIRE ROW FORMAT, was added in MySQL 8.0.19 to limit a channel to row-based replication events. If REQUIRE ROW FORMAT is set, the replication channel will only accept row-based events, and row-based binary logging (binlog_format=ROW) is required on the source server.
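As a sketch, both features can be applied to a channel together. The account name priv_repl, the channel name ch1, the password placeholder, and the grant list below are hypothetical; the exact privileges needed depend on what the replicated workload actually touches.

```shell
# Generate the SQL for review before applying it on the replica.
# Account, channel, and grants are hypothetical placeholders.
cat <<'EOF' > secure_channel.sql
CREATE USER 'priv_repl'@'localhost' IDENTIFIED BY '<strong-password>';
GRANT REPLICATION_APPLIER ON *.* TO 'priv_repl'@'localhost';
-- Grant only what the replicated workload actually needs:
GRANT INSERT, UPDATE, DELETE ON app_db.* TO 'priv_repl'@'localhost';
STOP REPLICA FOR CHANNEL 'ch1';
-- MySQL 8.0.23+ syntax; earlier 8.0 releases use CHANGE MASTER TO.
CHANGE REPLICATION SOURCE TO
  PRIVILEGE_CHECKS_USER = 'priv_repl'@'localhost',
  REQUIRE_ROW_FORMAT = 1
  FOR CHANNEL 'ch1';
START REPLICA FOR CHANNEL 'ch1';
EOF
```

Review the generated file, then feed it to the mysql client on the replica.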






All transactions received and applied after this feature has been enabled on a replication channel are checked, and the following are not permitted:

(Upon encountering any of these events, replication shall fail and stop.)

  • DML events that are logged for statement-based replication
  • Creation or deletion of temp tables: The reason for disallowing the replication of temporary tables is that they may contain sensitive data that is not intended to be persisted beyond their lifetime. For example, a temporary table may contain sensitive customer data used to calculate a tax or fee, but this data may be deleted as soon as the calculation is complete. If this temporary data is replicated to the replica, it may be vulnerable to exposure or manipulation.
  • LOAD DATA events: The reason for disallowing the replication of LOAD DATA instructions is that the data loaded from the file may be in plaintext and not encrypted. When the data is loaded into the database using SQL statements, it can be encrypted before insertion to protect it from unauthorized access or disclosure. However, when the data is loaded using LOAD DATA, it is not automatically encrypted and may be vulnerable to tampering or interception.


Using PRIVILEGE_CHECKS_USER and require_row_format can further increase the security of a MySQL replication environment.

As mentioned earlier, PRIVILEGE_CHECKS_USER can limit the privileges required by the replication threads. You can reduce the risk of unauthorized access by setting PRIVILEGE_CHECKS_USER to a separate user account with limited privileges.

require_row_format, on the other hand, can be used to ensure that the channel only accepts replication events in row format, which is the most secure option for replication. When require_row_format is set to ON, MySQL rejects statement-based events on that channel. This can help ensure that replication events are processed correctly and can help prevent data inconsistencies.

By combining PRIVILEGE_CHECKS_USER and require_row_format, you can create a more secure replication environment. The replication threads will use the limited privileges of the PRIVILEGE_CHECKS_USER account, while the replicated stream is restricted to the secure ROW format. This can help prevent unauthorized access to sensitive data or operations and reduce the risk of data inconsistencies during replication.

It’s important to note that using PRIVILEGE_CHECKS_USER and require_row_format should be done carefully, as changing these variables can have implications for the performance and functionality of the replication environment. Testing any changes in a development or test environment before implementing them in a production environment is recommended.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Try Percona Distribution for MySQL today!


Multi-tenants and Branches in Neon Serverless PostgreSQL



I will be speaking at Percona Live 2023 about serverless PostgreSQL. Join us at this event if you are interested!


Recently, Percona introduced Percona Builds for Neon (Introducing Percona Builds for Serverless PostgreSQL), which makes it easy to install and experiment with serverless PostgreSQL. And I followed it with how you can run easy experimentations with Neon using Docker (Using Docker To Deploy Neon Serverless PostgreSQL).

Before getting into more details about Neon functionality, we need to introduce two important concepts Neon operates with: Tenants and Data Branches.


Tenants

Multi-tenancy is a software architecture pattern where a single instance of a software application serves multiple tenants, allowing them to share resources like storage, processing power, and memory while maintaining separate, secure access to their respective data.

In the context of databases, a tenant’s data can be logically separated from other tenants’ data in the same database so that each tenant can only access and manage their own data. This is particularly useful in cloud-based applications, where multiple customers or organizations share the same underlying infrastructure to reduce costs and improve scalability.

The key aspect of multi-tenancy is the isolation of tenant data, ensuring that each tenant’s data remains secure, private, and separate from other tenants’ data.

In the case of Neon, the single pageserver already comes with multi-tenant functionality, and at the very least, you must have one tenant to create PostgreSQL compute node instances.

pageserver multi-tenant

The requirement is that each tenant will operate with its own compute node.

tenant compute-mode

Compute Nodes can be created/destroyed on demand; they are not required if there is no current need to access data for a given tenant.

Data branches

Neon enables you to seamlessly create branches of your Postgres database, enhancing your development workflow. With each code deployment, you can establish a separate branch for your testing environments. This way, developers and testers can obtain a data copy for testing or experimentation without impacting the main data.

Utilizing the “copy on write” method, branches are virtually cost-free. Neon refers to data branches as “timelines,” and these terms are often used interchangeably.

Neon serverless postgres

You can think (at least I do) of timelines as Git branches. Just keep in mind that data branches can’t be merged back.

In the example above:

  • Timeline 1 is branched from the Main timeline after two commits.
  • Timeline 2 is branched from the Main timeline after four commits.
  • Timeline 3 is branched from Timeline 1 after two more commits to Timeline 1.

And to expand on compute node requirements: each timeline requires a separate compute node to operate (and you can shut down the compute node if you do not need to work with the timeline right now).

There is a restriction to be aware of: each tenant/timeline can have only one compute node.

Schematically it looks like this:

postgres compute node

I will show the practical work with tenants/timelines in the following blog post, so subscribe for all updates.

Interested in serverless PostgreSQL?

You are welcome to experiment with Neon using our Docker images.

For feedback and questions, please use our Forums: (Latest Percona Labs/Percona Build for Serverless PostgreSQL topics – Percona Community Forum)

Follow our blog for more information about setups, performance, and unique features of serverless PostgreSQL.

If you want to talk with us about builds for serverless PostgreSQL or stay in touch for future updates – leave your contact details below.

Contact form

Also, if you are interested in possible Support for serverless PostgreSQL, fill out the Contact form, and I will get in touch with you.


Talking Drupal #391 – Building Your Career

Today we are talking about Building Your Career with Mike Anello.

For show notes visit:


  • How we started our careers
  • Broad career opportunities
  • Mentorship
  • Roles
  • First step after graduating
  • First step in switching
  • Common hurdles
  • Resources like Drupal Easy
  • Value of a career in Drupal
  • How do you find jobs
  • How do you build and maintain your Drupal career
  • How about your Drupal resume
  • Any advice


if (class_exists('Kint')) { // Set the max_depth to prevent out-of-memory. \Kint::$max_depth = 4; }

Guests

Mike Anello – Drupal Easy @ultimike


Nic Laflin – @nicxvan
John Picozzi – @johnpicozzi
Jacob Rockowitz – @jrockowitz

MOTW Correspondent

Martin Anderson-Clutz – @mandclu

Devel Debug Log: Allows developers to inspect the contents of variables. If those are classes, you can inspect nested properties and all available methods.


Comparisons of Proxies for MySQL

mysql proxy

With a special focus on Percona Operator for MySQL


HAProxy, ProxySQL, MySQL Router (AKA MySQL Proxy): in the last few years, I have been asked many times which proxy to use and in what scenario. When designing an architecture, many components need to be considered before deciding on the best solution.

When deciding what to pick, there are many things to consider, like where the proxy needs to sit, whether it “just” needs to redirect connections or whether more features are required, such as caching and filtering, or whether it needs to integrate with some MySQL embedded automation.

Given that, there never was a single straight answer. Instead, an analysis needs to be done. Only after a better understanding of the environment, the needs, and the evolution that the platform needs to achieve is it possible to decide what will be the better choice.

However, recently we have seen an increase in the usage of MySQL on Kubernetes, especially with the adoption of Percona Operator for MySQL. In this case, we have a quite well-defined scenario that can resemble the image below:

MySQL on Kubernetes

In this scenario, the proxies must sit inside Pods, balancing the incoming traffic from the Service LoadBalancer connecting with the active data nodes.

Their role is merely to ensure that any incoming connection is redirected to nodes that can serve it. This includes separating Read/Write from Read Only traffic, a separation that can be achieved at the service level, either with automatic recognition or with two separate entry points.
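As an illustration only, a hypothetical haproxy.cfg fragment for the two-entry-point approach (ports, backend names, and node addresses are made-up examples, not the Operator’s actual configuration) could look like:

```text
frontend mysql_rw
    mode tcp
    bind *:3306
    default_backend mysql_writers

frontend mysql_ro
    mode tcp
    bind *:3307
    default_backend mysql_readers

backend mysql_writers
    mode tcp
    server node1 10.0.0.1:3306 check

backend mysql_readers
    mode tcp
    balance roundrobin
    server node2 10.0.0.2:3306 check
    server node3 10.0.0.3:3306 check
```

Applications would then point their write traffic at port 3306 and read-only traffic at 3307.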

In this scenario, it is also crucial to be efficient in resource utilization and to scale with frugality. In this context, features like filtering, firewalling, or caching are redundant and may consume resources that could be allocated to scaling. Those features also work better outside the K8s/Operator cluster, given that the closer they are located to the application, the better they serve it.

About that, we must always remember the concept that each K8s/Operator cluster needs to be seen as a single service, not as a real cluster. In short, each cluster is, in reality, a single database with high availability and other functionalities built in.

Anyhow, we are here to talk about proxies. With one clear mandate in mind, we need to identify which product allows our K8s/Operator solution to:

  • Scale the number of incoming connections to the maximum
  • Serve requests with the highest efficiency
  • Consume as few resources as possible

The environment

To identify the above points, I have simulated a possible K8s/Operator environment, creating:

  • One powerful application node, where I ran sysbench read-only tests, scaling from two to 4096 threads (Type c5.4xlarge)
  • Three mid-size data nodes with several gigabytes of data, running MySQL with Group Replication (Type m5.xlarge)
  • One proxy node running on a resource-limited box (Type t2.micro)

The tests

We will have very simple test cases. The first is meant to establish the baseline, identifying the moment we hit the first level of saturation due to the number of connections. In this case, we will increase the number of connections while keeping the number of operations low.

The second test will define how well the increasing load is served inside the previously identified range. 

For documentation, the sysbench commands are:


sysbench ./src/lua/windmills/oltp_read.lua  --db-driver=mysql --tables=200 --table_size=1000000 
 --rand-type=zipfian --rand-zipfian-exp=0 --skip_trx=true  --report-interval=1 --mysql-ignore-errors=all 
--mysql_storage_engine=innodb --auto_inc=off --histogram  --stats_format=csv --db-ps-mode=disable --point-selects=50 
--reconnect=10 --range-selects=true --rate=100 --threads=<#Threads from 2 to 4096> --time=1200 run


sysbench ./src/lua/windmills/oltp_read.lua  --mysql-host=<host> --mysql-port=<port> --mysql-user=<user> 
--mysql-password=<pw> --mysql-db=<schema> --db-driver=mysql --tables=200 --table_size=1000000  --rand-type=zipfian 
--rand-zipfian-exp=0 --skip_trx=true  --report-interval=1 --mysql-ignore-errors=all --mysql_storage_engine=innodb 
--auto_inc=off --histogram --table_name=<tablename>  --stats_format=csv --db-ps-mode=disable --point-selects=50 
--reconnect=10 --range-selects=true --threads=<#Threads from 2 to 4096> --time=1200 run


Test 1

As indicated above, I was looking to identify the point at which each proxy would first become unable to manage the load. The load is all in creating and serving the connections, while the number of operations is capped at 100.

As you can see, and as I was expecting, the three proxies behaved more or less the same, serving the same number of operations (they were capped, so why not), until they weren’t.

MySQL Router could not serve anything beyond 2048 connections.

NOTE: MySQL Router actually stopped working at 1024 threads, but with version 8.0.32, I enabled the connection_sharing feature, which allows it to go a bit further.

Let us also take a look at the latency:

latency threads

Here the situation starts to get a little more complicated. MySQL Router has the highest latency no matter what. However, HAProxy and ProxySQL show interesting behavior: HAProxy performs better with a low number of connections, while ProxySQL performs better when a high number of connections is in place.

This is due to multiplexing and the very efficient way ProxySQL deals with high load.

Everything has a cost:

HAProxy definitely uses fewer user CPU resources than ProxySQL or MySQL Router…


…we can also notice that HAProxy barely reaches, on average, a 1.5 CPU load, while ProxySQL is at 2.50 and MySQL Router is around 2.

To be honest, I was expecting something like this, given ProxySQL’s need to handle the connections and other basic routing. What was a surprise instead was MySQL Router: why does it have a higher load?

Brief summary

This test highlights that HAProxy and ProxySQL can handle a number of connections higher than the slowest runner in the game (MySQL Router). It is also clear that traffic is better served under a high number of connections by ProxySQL, but it requires more resources.

Test 2

When the going gets tough, the tough get going

Let’s remove the --rate limitation and see what happens.

mysql events

The scenario changes drastically under load. We can see how HAProxy serves the connections and allows the execution of more operations for the whole test. ProxySQL comes immediately after and behaves quite well up to 128 threads, then it just collapses.

MySQL Router never takes off; it always stays below the 1k reads/second, while HAProxy served 8.2k and ProxySQL 6.6k.

mysql latency

Looking at the latency, we can see that HAProxy’s increased gradually, as expected, while ProxySQL’s and MySQL Router’s just shot up from 256 threads on.

Note that both ProxySQL and MySQL Router could not complete the tests with 4096 threads.

ProxySQL and MySQL Router

Why? HAProxy always stays below 50% CPU, no matter the increasing number of threads/connections, scaling the load very efficiently. MySQL Router reached its saturation point almost immediately, affected by both the number of threads/connections and the number of operations. That was unexpected, given that MySQL Router does not have Level 7 capabilities.

Finally, ProxySQL, which worked fine up to a certain limit, reached its saturation point and could not serve the load. I say load because ProxySQL is a Level 7 proxy and is aware of the content of the load. Given that, on top of multiplexing, additional resource consumption was expected.

proxysql usage

Here we have a clear confirmation of what was said above: MySQL Router reached 100% CPU utilization with just 16 threads, and ProxySQL much later, at 256 threads.

Brief summary

HAProxy comes out as the champion in this test; there is no doubt that it could scale the increasing number of connections without being significantly affected by the load generated by the requests. Its lower resource consumption also indicates room for even more scaling.

ProxySQL was penalized by the limited resources, but that was the game: we had to get the most out of the few available. This test indicates that ProxySQL is not optimal inside the Operator; it is the wrong choice if low resource usage and scalability are a must.

MySQL Router was never in the game. Unless it undergoes a serious refactoring, MySQL Router is limited to very low scalability; as such, the only way to adopt it is to run many instances at the application node level. Using it close to the data nodes in a centralized position is a mistake.


I started by showing an image of how the MySQL service is organized, and I want to close by showing the variation that, for me, should be considered the default approach:

MySQL service is organized

This highlights that we must always choose the right tool for the job. 

The proxy in architectures involving MySQL/Percona Server for MySQL/Percona XtraDB Cluster is a crucial element for the scalability of the cluster, whether using K8s or not. It is important to choose the one that serves us best, which can sometimes be ProxySQL over HAProxy.

However, when talking about K8s and Operators, we must recognize the need to optimize resource usage for the specific service. In that context, there is no question: HAProxy is the best solution and the one we should go with.

My final observation is about MySQL Router (aka MySQL Proxy). 

Unless there is a significant refactoring of the product, at the moment it is not even close to what the other two can do. The tests done so far suggest it needs a complete reshaping, starting with identifying why it is more affected by the load coming from the queries than by the load coming from the connections.

Great MySQL to everyone. 



Percona Monitoring and Management 2 Scaling and Capacity Planning

2022 was an exciting year for Percona Monitoring and Management (PMM). We’ve added and improved many features, including Alerting and Backup Management. These updates are designed to keep databases running at peak performance and simplify database operations. But as companies grow and see more demand for their databases, we need to ensure that PMM also remains scalable so you don’t need to worry about its performance while tending to the rest of your environment.

PMM2 uses VictoriaMetrics (VM) as its metrics storage engine. Percona co-founder Peter Zaitsev wrote a detailed post about the migration from Prometheus to VictoriaMetrics. One of the most significant performance differences in PMM2 comes from the use of VM, which can also be seen in the performance comparison of node_exporter metrics between Prometheus and VictoriaMetrics.

Planning for resources of a PMM Server host instance can be tricky because the numbers can change depending on the DB instances being monitored by PMM. For example, a higher number of data samples ingested per second or a monitoring database with a huge number of tables (1000+) can affect performance; similarly, the configuration for exporters or a custom metric resolution can also have an impact on the performance of a PMM server host. The point is that scaling up PMM isn’t linear, and this post is only meant to give you a general idea and serve our users as a good starting point when planning to set up PMM2.

The VictoriaMetrics team has also published some best practices, which can also be referred to while planning for resources for setting up PMM2.

Home Dashboard for PMM2
We have tested PMM version 2.33.0 with its default configuration, and it can monitor more than 1,000 MySQL services, with the databases running the default sysbench Read-Write workload. We observed that the overall performance of PMM monitoring 1,000 database services was good, and no significant resource usage spikes were observed; this is a HUGE increase in performance and capacity over previous versions! Please note that the focus of these tests was standard metrics gathering and display; we’ll use a future blog post to benchmark some of the more intensive query analytics (QAN) performance numbers.

Capacity planning and setup details

We used a dedicated 32-core CPU and 64GB of RAM for our testing.

CPU Usage for PMM Server Host System

CPU usage averaged 24% utilization, as you can see in the picture above.


Memory Utilization for PMM Server Host System

Virtual memory utilization averaged 48 GB of RAM.

VictoriaMetrics maintains an in-memory cache for mapping active time series to internal series IDs. The cache size depends on the memory available to VictoriaMetrics on the host system; hence, planning for enough RAM on the host system is important for better performance and to avoid a high percentage of slow inserts.

Looking at overall disk usage for the instance monitoring 1,000 database services, the average disk usage per datapoint comes out to roughly 0.25 bytes, meaning you should plan for storage of roughly 500 GB to 1 TB for the default 30-day retention period.
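As a sanity check, this can be turned into a quick back-of-the-envelope calculation; the ingestion rate below is an assumed figure for illustration, not a measured one:

```shell
# Estimate disk usage for the retention window at ~0.25 bytes/datapoint.
rate=800000   # assumed datapoints ingested per second (illustrative)
days=30       # default retention period
bytes=$(( rate * 86400 * days / 4 ))   # 0.25 bytes/datapoint == divide by 4
echo "$(( bytes / 1024 / 1024 / 1024 )) GiB"   # prints "482 GiB"
```

At that assumed rate, the estimate lands near the lower end of the 500 GB to 1 TB range quoted above.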

Average Datapoints Size

Average datapoints size

Stats on Victoria Metrics

We recommend at least two GB of RAM and a two-core system for the PMM Server as a bare minimum requirement to set up monitoring of your database services. With this minimum recommended setup, you can monitor up to three databases comfortably, possibly more, depending on the already mentioned factors of your environment.

Based on our observations across various PMM setups, with a reasonably powerful pmm-server host system (eight+ GB RAM and eight+ cores), a good target is roughly 32 database services per core, or 16 database services per GB of RAM; keeping this rule of thumb in mind is helpful when planning resources for your monitoring setup.

Number of Database Services Monitored    Minimum Recommended Requirements
0-250 services                           8 cores, 16 GB RAM
250-500 services                         16 cores, 32 GB RAM
500-1000 services                        32 cores, 64 GB RAM
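The rule of thumb above can be sketched as a tiny shell helper (round-up integer division; the service count is just an example):

```shell
# Rough sizing from the ~32 services/core and ~16 services/GB rule of thumb.
services=500
cores=$(( (services + 31) / 32 ))    # round up to whole cores
ram_gb=$(( (services + 15) / 16 ))   # round up to whole GB
echo "${services} services -> ${cores} cores, ${ram_gb} GB RAM"
# prints "500 services -> 16 cores, 32 GB RAM", matching the table above
```

Remember this is a starting point, not a formula; actual needs depend on the workload.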

PMM scalability improved dramatically through UX and performance research

In earlier versions of PMM2, the Home Dashboard could not load with more than 400 DB services, resulting in a frustrating experience for users: interacting with UI elements such as filters and date pickers was impossible. We conducted thorough research to improve scalability and the user experience of our Home Dashboard for 1,000 database services. Our findings revealed that the design of the Home Dashboard heavily impacted scalability, and the poor UX resulted in unresponsive pages.

We redesigned the Home Dashboard as a solution, and the results were significant. The new dashboard provides a much better user experience, displays more critical information, and scales for environments of up to 1,000 DB services. The overall load time has improved dramatically, going from 50+ seconds to roughly 20 seconds, and there are no longer any unresponsive errors in the UI. Users can now interact with filters on other dashboards seamlessly as well!

There are still some limitations we’re working on addressing:

  • Instance Overview dashboards, which ship with PMM, do not work well with such a large number of instances, so it is recommended not to rely on them when that many databases are being monitored. They work well only up to a maximum of 400 database services.
  • There is a known issue with the “Request URI too Large” pop-up message, which appears because of some large query requests; this also causes problems when setting a big time range for observing metrics from a monitored database. Our team is planning to implement a fix for this soon.
  • QAN takes 50+ seconds to load when 400+ database services are monitored. The overall interaction with QAN also feels laggy when searching and applying filters across a big list of services/nodes. Our team is working on improving the overall user experience of the QAN app, which will be fixed in future releases of PMM.

Not a formula but a rule of thumb

Overall resource usage in PMM depends on the configuration and workload, and it may vary between setups, so it’s difficult to say, “for monitoring this number of DB services, you need a machine of that size.” This post is meant to show how the PMM Server scales and performs with the default setup and all database host nodes configured in the default metrics mode (push).

We also plan to work on a follow-up post on performance and scalability, where we will highlight results for different dashboards and QAN, showcasing the improvements we have made over the last few PMM releases.

Tell us what you think about PMM!

We’re excited about all of the improvements we’ve made, but we’re not done yet! Have some thoughts about how we can improve PMM, or want to ask questions? Come talk to us in the Percona Forums, and let us know what you’re thinking!

PMM Forum


Percona Labs Presents: Infrastructure Generator for Percona Database as a Service (DBaaS)

Percona MyDBaaS

Let’s look at how you can run Percona databases on Kubernetes, the easy way.

Chances are that if you are using the latest Percona Monitoring and Management (PMM) version, you have seen the availability of the new Percona Database as a Service (DBaaS). If not, go and get a glimpse of the fantastic feature with our docs on DBaaS – Percona Monitoring and Management.

Now, if you like it and wanna give it a try (yay!) but don’t wanna deal with Kubernetes (nay), no worries; we have a tool for you. Introducing the Percona DBaaS Infrastructure Creator, or Percona My Database as a Service (MyDBaaS).

My Database as a Service

First, a clarification: this tool is focused on running your DBaaS on Amazon Web Services (AWS).

percona My Database as a Service

At the time of writing this article, PMM DBaaS supported a limited number of cloud vendors, AWS being one of them. This tool creates the infrastructure in an AWS account.

The usage is pretty simple. You have three features:

  • An instance selector: In case you don’t know what instance type to use, but you do know how many CPUs and how much memory your DBs require.
  • A Cluster Creator: This is where you decide on some minimal and basic properties of the cluster and deploy it.
  • Deploy a Percona Operator: You can choose from our three Operators for Kubernetes to be deployed on the same infrastructure you are creating.

Instance selector

Currently, AWS has 270 different instance types available. Which one to use? The instance selector will help you with that. Just pass the number of vCPUs, the amount of memory, and the region you intend to use, and it will show a list of EKS-suitable EC2 instances.

percona my database Instance selector

Why ask for the region? Costs! Instance types have different costs depending on the region they are in, and cost is something the tool will show you:

Cluster creator

A very basic interface: you only need to pass the name of the cluster, the desired number of nodes, the instance type, and the region where you would like to run the cluster.

percona my database cluster

If you pass your AWS key/secret, the tool will take care of deploying the cluster, and once it is done, it will return the contents of the Kubeconfig file for you to use in the DBaaS of PMM.

A note on how we handle your AWS key/secret

Percona will never store this information. In the particular case of this tool, we do set the following environment variables:


After the creation of the cluster, the environment variables are unset.

Now, why is there an option not to pass the AWS credentials? For security, of course.

Under the hood, the cluster is created using EKSCTL, a “CloudFormation stack creator.” 

If you are proficient enough with eksctl, you can just copy/paste the generated YAML and run the eksctl command on your own server without ever sharing your credentials. For DBaaS, you still have to provide them, but that’s another topic.
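For reference, a minimal eksctl ClusterConfig of the kind the tool generates might look like this; the cluster name, region, and node sizes below are made-up examples:

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: percona-dbaas      # example name
  region: us-east-1        # example region
nodeGroups:
  - name: standard-workers
    instanceType: m5.large # pick one via the instance selector
    desiredCapacity: 3
```

You would then run `eksctl create cluster -f cluster.yaml` yourself, keeping your credentials local.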

using EKSCTL

If you choose to pass the AWS credentials, an EKS cluster will be deployed. The outcome will be the contents of the kubeconfig file:

kubeconfig file Percona

With that in place, you can now go to PMM, register the Kubernetes cluster, and deploy the DBaaS there.

But there’s more:

Percona Operators

Now, there’s an additional option: deploying a Percona Operator. Currently, PMM DBaaS focuses on Percona Operator for MySQL based on Percona XtraDB Cluster. With the MyDBaaS tool, you can choose from the three operators available.

To have the full deployment, you need to check both AWS credentials and deploy a Percona Operator.

deploy a Percona Operator

At the end of the installation, you will have the contents of the Kubeconfig file, plus:

  • The password of the root user 
  • Instructions on how to connect to it

installing percona operator for MySQL

After this, you can use a Percona database running on Kubernetes. Have fun!

About Percona Labs

Percona Labs is a place for the community to have a voice on how we approach solutions, curated by Percona’s CTO, Vadim Tkachenko. Feedback is always welcome:

Community Forum


Automating Physical Backups of MongoDB on Kubernetes


We at Percona talk a lot about how Kubernetes Operators automate the deployment and management of databases. Operators seamlessly handle lots of Kubernetes primitives and database configuration bits and pieces, all to remove toil from operations teams and provide a self-service experience for developers.

Today we want to take you backstage and show what is really happening under the hood. We will review technical decisions and obstacles we faced when implementing physical backup in Percona Operator for MongoDB version 1.14.0. The feature is now in technical preview.

The why

Percona Server for MongoDB can handle petabytes of data. The users of our Operators want to host huge datasets in Kubernetes, too. However, the logical restore recovery method takes a lot of time and can lead to SLA breaches.

Our Operator uses Percona Backup for MongoDB (PBM) to back up and restore databases. We have been leveraging logical backups for quite some time. Physical backups were introduced in PBM a few months ago, and with them, we saw a significant improvement in recovery time (read more in this blog post, Physical Backup Support in Percona Backup for MongoDB):

physical backup mongodb

So if we want to reduce the Recovery Time Objective (RTO) for big data sets, we must provide physical backups and restores in the Operator.

The how


When you enable backups in your cluster, the Operator adds a sidecar container to each replset pod (including the Config Server pods if sharding is enabled) to run pbm-agent. These agents listen for PBM commands and perform backups and restores on the pods. The Operator opens connections to the database to send PBM commands to its collections or to track the status of PBM operations.

kubernetes backup mongodb


Enabling physical backups was fairly straightforward. The Operator already knew how to start a logical backup, so we just passed a single field to the PBM command if the requested backup was physical.

Unlike logical backups, PBM needs direct access to mongod’s data directory, so we mounted the persistent volume claim (PVC) on the PBM sidecar containers. With these two simple changes, the Operator could perform physical backups. But whether you know it from theory or from experience (ouch!), a backup is useless if you cannot restore it.
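Run by hand, the flow the agents carry out corresponds roughly to these PBM CLI calls (shown only as orientation; the Operator drives PBM through its control collections rather than through this shell session):

```shell
pbm backup --type=physical   # request a physical backup
pbm status                   # check agents and running operations
pbm list                     # list completed backups
```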


Enabling physical restores wasn’t easy. PBM had two major limitations when it came to physical restores:

1. PBM stops the mongod process during the restore.

2. It runs a temporary mongod process after the backup files are copied to the data directory.

What makes physical restores a complex feature is dealing with the implications of these two constraints.

PBM kills the mongod process

PBM kills the mongod process in the container, which means the Operator can’t query PBM collections from the database during the restore to track its status. This forced us to use the PBM CLI tool to control PBM operations during the physical restore instead of opening a connection to the database.

Note: We are now considering using the same approach for every PBM operation in the Operator. Let us know what you think.

Another side effect of PBM killing mongod was the restart of the mongod container. The mongod process runs as PID 1 in the container, and when you kill it, the container restarts. But the PBM agent needs to keep working after mongod is killed to complete the physical restore. For this, we have created a special entrypoint to be used during the restore. This special entrypoint starts mongod and simply sleeps after mongod has finished.

Temporary mongod started by PBM

When you start a physical restore, PBM does approximately the following:

1. Kills the mongod process and wipes the data directory.

2. Copies the backup files to the data directory.

3. Starts a temporary mongod process listening on a random port to perform operations on oplog.

4. Kills the mongod and PBM agent processes.

For a complete walkthrough, see Physical Backup Support in Percona Backup for MongoDB.

From the Operator’s point of view, the problem with this temporary mongod process is its version. We support MongoDB 4.4, MongoDB 5.0, and now MongoDB 6.0 in the Operator, and the temporary mongod needs to be the same version as the MongoDB running in the cluster. This means we either include mongod binary in PBM docker images and create a separate image for each version combination or find another solution.

Well… we found one: we copy the PBM binaries into the mongod container before the restore. This way, the PBM agent can kill and start the mongod process as often as it wants, and it’s guaranteed to be the same version as the cluster.

Tying it all together

By now, it may seem like I have thrown random solutions to each problem. PBM, CLI, special entrypoint, copying PBM binaries… How does it all work?

When you start a physical restore, the Operator patches the StatefulSet of each replset to:

1. Switch the mongod container entrypoint to our new special entrypoint: It’ll start pbm-agent in the background and then start the mongod process. The script itself will run as PID 1, and after mongod is killed, it’ll sleep forever.

2. Inject a new init container: This init container will copy the PBM and pbm-agent binaries into the mongod container.

3. Remove the PBM sidecar: We must ensure that only one pbm-agent is running in each pod.

Until all StatefulSets (except mongos) are updated, the PerconaServerMongoDBRestore object is in a “waiting” state. Once they’re all updated, the Operator starts the restore, and the restore object goes into the requested state, then the running state, and finally the ready or error state.
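A minimal sketch of such an entrypoint, assuming only the behavior described in the steps above (this is an illustration, not the Operator’s actual script):

```shell
#!/bin/sh
# Illustrative restore entrypoint: runs as PID 1, starts pbm-agent in the
# background, then mongod in the foreground; when PBM kills mongod, the
# script keeps sleeping so the container does not restart.
pbm-agent &
mongod "$@"
# mongod exited (killed by PBM during the restore); stay alive so
# pbm-agent can finish its work.
while true; do sleep 3600; done
```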


Operators simplify the deployment and management of applications on Kubernetes. Physical backups and restores are quite a popular feature for MongoDB clusters. Restoring from physical backup with Percona Backup for MongoDB has multiple manual steps. This article shows what kind of complexity is hidden behind this seemingly simple step.


Try Percona Operator for MongoDB
