May 16, 2023

MySQL Connection Security With Connection Control Plugins


As a database administrator, have you ever been in a situation when your database confronted a brute force attack? A brute force attack can be launched against a user account in MySQL. MySQL replies with success or error based on supplied credentials, and the time required for the verification is almost the same in either case. Hence, an attacker can launch a brute force attack against a MySQL user account at a rapid rate and can try many different passwords.

In cryptography, a brute-force attack consists of an attacker trying many passwords or passphrases with the hope of eventually guessing correctly. The attacker systematically checks all possible passwords and passphrases until the correct one is found.

It’s not just brute force attacks, either; the IT industry has recently seen a steady increase in distributed denial of service (DDoS) attacks. Have you ever been targeted by such a flood of connections on port 3306?

Today we would like to walk you through a special kind of plugin: the connection_control plugin. It is available in MySQL 8.0 as well as in MySQL 5.7 and MySQL 5.6.

What are connection control plugins?

The connection control plugin library allows administrators to introduce an increasing delay in the server response to connections after a designated number of consecutive unsuccessful login attempts.

The idea behind using the connection control plugins is to configure a MySQL server so that it delays its response after failed login attempts. An unauthorized user or client does not know whether the password was correct until the server replies. Thus, if an attacker launches an attack by spawning multiple connection requests, those connections have to remain active until the server replies. Introducing a delay makes this harder for the attacker, because their resources are now tied up keeping those connection requests open. This technique can slow down brute force attacks against MySQL user accounts.

The plugin library contains two plugins:

  • CONNECTION_CONTROL checks incoming connection attempts and adds a delay to server responses as necessary. This plugin also exposes system variables that enable its operation to be configured and a status variable that provides rudimentary monitoring information.
  • CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS implements an INFORMATION_SCHEMA table that exposes more detailed monitoring information about failed connection attempts.

How to install connection control plugins

To load the plugins at runtime, use these statements, adjusting the .so suffix for your platform as necessary. Here I am going to test it with Percona Server for MySQL 5.7.36:

mysql> INSTALL PLUGIN CONNECTION_CONTROL SONAME 'connection_control.so';
INSTALL PLUGIN CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS SONAME 'connection_control.so';
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)

Or, you can install the plugin in my.cnf. Add these options under the [mysqld] option group in the MySQL configuration file (/etc/my.cnf):

[mysqld]

plugin-load-add=connection_control.so
connection-control=FORCE_PLUS_PERMANENT
connection-control-failed-login-attempts=FORCE_PLUS_PERMANENT

Now let’s take a deeper look at each of these configuration options:

  • plugin-load-add=connection_control.so
    – Loads the connection_control.so library each time the server is started.
  • connection-control=FORCE_PLUS_PERMANENT
    – Prevents the server from running without the CONNECTION_CONTROL plugin, and server startup fails if the plugin does not initialize successfully.
  • connection-control-failed-login-attempts=FORCE_PLUS_PERMANENT
    – Prevents the server from running without the CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS plugin, and server startup fails if the plugin does not initialize successfully.

To verify plugin installation, restart the server and examine the INFORMATION_SCHEMA.PLUGINS table or use the SHOW PLUGINS statement:

mysql> SELECT PLUGIN_NAME, PLUGIN_STATUS
      FROM INFORMATION_SCHEMA.PLUGINS
      WHERE PLUGIN_NAME LIKE 'connection%';
+------------------------------------------+---------------+
| PLUGIN_NAME                              | PLUGIN_STATUS |
+------------------------------------------+---------------+
| CONNECTION_CONTROL                       | ACTIVE        |
| CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS | ACTIVE        |
+------------------------------------------+---------------+

Configure connection control thresholds

Now, let’s configure the server response delay for failed connection attempts using these server parameters. We will set the threshold of consecutive failed connections to three, the minimum connection delay to one second, and the maximum delay to 90 seconds.

mysql> SET GLOBAL connection_control_failed_connections_threshold = 3;
SET GLOBAL connection_control_min_connection_delay = 1000;
SET GLOBAL connection_control_max_connection_delay = 90000;

Alternatively, to set and persist the variables at runtime, use these statements (note that SET PERSIST is available in MySQL 8.0 and later; on 5.7, make the values persistent in my.cnf instead):

mysql> SET PERSIST connection_control_failed_connections_threshold = 3;
SET PERSIST connection_control_min_connection_delay = 1000;

Also, you can add these options under the [mysqld] option group in the MySQL configuration file (/etc/my.cnf) to adjust them later as necessary.

[mysqld]

connection_control_failed_connections_threshold=3
connection_control_min_connection_delay=1000 
connection_control_max_connection_delay=2147483647

Let’s talk about each of these variables in more detail:

  • connection_control_failed_connections_threshold
    – The number of consecutive failed connection attempts permitted before the server starts adding a delay to its responses.
  • connection_control_min_connection_delay
    – The minimum delay, in milliseconds, added to a response once the threshold has been exceeded.
  • connection_control_max_connection_delay
    – The maximum delay, in milliseconds, the server will add; the delay stops growing once this limit is reached.

Testing and monitoring the connections

First terminal:

Note, here we will add a minimum one-second delay and set the failed connection threshold to three. The max connection delay is set to 90 seconds.

mysql> SET GLOBAL connection_control_failed_connections_threshold = 3;
SET GLOBAL connection_control_min_connection_delay = 1000;
set global connection_control_max_connection_delay=90000;

Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)

mysql> show variables like '%connection_control%';
+-------------------------------------------------+-------+
| Variable_name                                   | Value |
+-------------------------------------------------+-------+
| connection_control_failed_connections_threshold | 3     |
| connection_control_max_connection_delay         | 90000 |
| connection_control_min_connection_delay         | 1000  |
+-------------------------------------------------+-------+
3 rows in set (0.00 sec)

Let’s check the initial contents of the monitoring table and the status counter:

mysql> select * from information_schema.connection_control_failed_login_attempts;
Empty set (0.00 sec)

mysql> show global status like 'connection_control_%';
+------------------------------------+-------+
| Variable_name                      | Value |
+------------------------------------+-------+
| Connection_control_delay_generated | 0     |
+------------------------------------+-------+
1 row in set (0.00 sec)

Initially, there are no failed connection attempts and no connection delay has been generated, since we have only just configured the parameters. That is why information_schema.connection_control_failed_login_attempts returns an empty set and Connection_control_delay_generated is zero.

Second terminal:

With the above settings in place, I simulated a brute force attack with 53 failed connection attempts.

Open another terminal and perform these incorrect connections as a root user, specifying a wrong password each time.

[root@ip-xxx-xx-xx-xx ~]# for i in `seq 1 53`; do time mysql mysql -uroot -p"try_an_incorrect_password" -h xxx.xx.x.x 2>&1 >/dev/null | grep meh ; done
0.093
0.092
0.093
1.093
2.093
3.105
4.093
5.093
…
…
45.092
46.093
47.093
48.093
49.092
50.093

What’s happening with these connections?

  • In the MySQL processlist, each connection will be in the state “Waiting in connection_control plugin.”
  • Each connection experiences a small but noticeable delay starting with the attempt after the threshold (the fourth, in our case), and the delay keeps growing with every subsequent failed attempt. With each consecutive failure the delay increases by roughly one second, until connection_control_max_connection_delay is reached: in our test the fourth attempt waited about one second, the fifth about two seconds, and the 53rd close to 50 seconds, as sketched below. As a result, automated brute force attack tools become far less useful, since every attempt faces an ever-growing delay.
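To make the pattern concrete, here is a rough illustrative sketch (not part of the plugin, just arithmetic matching the timings observed above) of the expected delay per attempt, assuming threshold = 3, a minimum delay of 1,000 ms, and a maximum of 90,000 ms:

#!/bin/bash
# Illustrative only: approximate delay applied to the Nth consecutive failed attempt,
# based on the behavior observed in the test above (delay grows by ~1 second per
# failure past the threshold, bounded by the min/max delay settings).
threshold=3; min_ms=1000; max_ms=90000
for n in 1 2 3 4 5 10 53 120; do
  if (( n <= threshold )); then
    delay_ms=0                           # below the threshold: no delay
  else
    delay_ms=$(( (n - threshold) * 1000 ))
    (( delay_ms < min_ms )) && delay_ms=$min_ms
    (( delay_ms > max_ms )) && delay_ms=$max_ms
  fi
  echo "attempt $n -> delay ${delay_ms} ms"
done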

First terminal

Now switch back to the first terminal and recheck the values of the variables.

The connection_control starts monitoring all failed connection attempts and keeps track of consecutive failed connection attempts for each user.

Until the number of consecutive failed attempts is below the threshold (three, in our case), the user does not experience any delay. This avoids delaying genuine users who simply mistyped their password.

Here you can notice that the status of Connection_control_delay_generated is now 50, and CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS is 53.

mysql> show global status like 'connection_control_%';
+------------------------------------+-------+
| Variable_name                      | Value |
+------------------------------------+-------+
| Connection_control_delay_generated | 50    |
+------------------------------------+-------+
1 row in set (0.00 sec)

mysql> SELECT FAILED_ATTEMPTS FROM INFORMATION_SCHEMA.CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS;
+-----------------+
| FAILED_ATTEMPTS |
+-----------------+
|              53 |
+-----------------+
1 row in set (0.00 sec)

What happens when you want to make a successful/genuine login?

Please note the server will continue to introduce this delay for all subsequent failed connections and for the first successful connection. Hence, the delay generated for the first successful login attempt after 53 unsuccessful logins is about 53 seconds. If MySQL did not add any delay to the first successful connection, an attacker could infer that a delayed response means a wrong password, and free up pending connections after waiting a specific amount of time. So, if you make one successful connection after N unsuccessful attempts, you will experience a delay of roughly N seconds on that first successful login.

[root@ip-xxx-xx-xx-xx ~]# date; mysql -uroot -p'correct_password' -hxxx.xx.x.x -e "select now();";date
Tue Apr 18 06:27:36 PM UTC 2023
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------------+
| now()               |
+---------------------+
| 2023-04-18 18:28:29 |
+---------------------+
Tue Apr 18 06:28:29 PM UTC 2023

Which user caused this brute force attack?

You can also determine from which user or host these failed connection attempts were made.

mysql> select * from information_schema.connection_control_failed_login_attempts;
+-----------------------+-----------------+
| USERHOST              | FAILED_ATTEMPTS |
+-----------------------+-----------------+
| 'root'@'xxx-xx-xx-xx' |              53 |
+-----------------------+-----------------+
1 row in set (0.00 sec)

How to reset the failed connection threshold

In case you want to reset these counters, you just have to assign a value to the connection_control_failed_connections_threshold variable again:

mysql> SET GLOBAL connection_control_failed_connections_threshold = 4;
Query OK, 0 rows affected (0.00 sec)

# Now you can see the values are reset!
mysql> select * from information_schema.connection_control_failed_login_attempts;
Empty set (0.00 sec)

mysql> show global status like 'connection_control_%';
+------------------------------------+-------+
| Variable_name                      | Value |
+------------------------------------+-------+
| Connection_control_delay_generated | 0     |
+------------------------------------+-------+
1 row in set (0.00 sec)

Conclusion

The MySQL connection control plugins can be very useful for limiting the damage of a brute force attack or a flood of unwanted TCP connections. Percona Server for MySQL also supports these plugins. They are not enabled by default, but you can perform the same steps to enable them and secure connections in your environment.

To summarize, the plugin provides the following:

  • A way to configure a threshold after which an increasing delay is triggered
  • Ability to configure a delay in the server’s reply with minimum and maximum limits
  • An information schema view to see monitoring information related to failed connection attempts

Please check our latest blogs on how you can keep your database secure:

Keep Your Database Secure With Percona Advisors

Improving MySQL Password Security with Validation Plugin

MySQL 8: Account Locking

Brute-Force MySQL Password From a Hash

Version 2 Advisor Checks for PMM 2.28 and Newer

May 16, 2023

Backup and Restore with MyDumper on Docker


At the end of 2021, I pushed the first Docker image to hub.docker.com. This was the first official image, and since then, we have been improving our testing and packaging procedures based on Docker, CircleCI, and GitHub Actions. However, when I’m coding, I don’t test in Docker. But a couple of weeks ago, while reviewing an issue, I came across some interesting Docker use cases that I want to share.

Common use case

First, we are going to review how to take a simple backup with MyDumper to warm you up:

docker run --name mydumper \
     --rm \
     -v ${backups}:/backups \
     mydumper/mydumper:v0.14.4-7 \
     sh -c "rm -rf /backups/data; \
          mydumper -h 172.17.0.5 \
               -o /backups/data \
               -B test \
               -v 3 \
               -r 1000 \
               -L /backups/mydumper.log"

You will find the backup files and the log in ${backups}. Then you can restore it using:

docker run --name mydumper \
     --rm \
     -v ${backups}:/backups \
     mydumper/mydumper:v0.14.4-7 \
     sh -c "myloader -h 172.17.0.4 \
               -d /backups/data \
               -B test \
               -v 3 \
               -o \
               -L /backups/myloader.log"

And if you want to do it faster, you can do it all at once:

docker run --name mydumper \
     --rm \
     -v ${backups}:/backups \
     mydumper/mydumper:v0.14.4-7 \
     sh -c "rm -rf /backups/data; \
          mydumper -h 172.17.0.5 \
               -o /backups/data \
               -B test \
               -v 3 \
               -r 1000 \
               -L /backups/mydumper.log ; \
          myloader -h 172.17.0.4 \
               -d /backups/data \
               -B test \
               -v 3 \
               -o \
               -L /backups/myloader.log"

We can remove the option to mount a volume (-v ${backups}:/backups), as the data will reside inside the container.
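As a rough illustration (same hosts, database, and image tag as above; the explicit mkdir is just a precaution since no volume is mounted), the all-in-one run without the host volume could look like this:

docker run --name mydumper \
     --rm \
     mydumper/mydumper:v0.14.4-7 \
     sh -c "mkdir -p /backups; \
          mydumper -h 172.17.0.5 \
               -o /backups/data \
               -B test \
               -v 3 \
               -r 1000 \
               -L /backups/mydumper.log ; \
          myloader -h 172.17.0.4 \
               -d /backups/data \
               -B test \
               -v 3 \
               -o \
               -L /backups/myloader.log"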

Advanced use case

Since version 0.14.4-7, I have built the Docker image with ZSTD instead of GZIP because it is faster. Other options that are always useful are --rows/-r and --chunk-filesize/-F. On the latest releases, you can pass '100:1000:0' to -r, which means:

  • 100 is the minimal chunk size
  • 1000 is the starting chunk size
  • 0 means there is no maximum limit

And in this case, where we want small files to be sent to myloader as soon as possible, and because we do not care about the number of files either, -F will be set to 1.

In the next use case, we are going to stream the backup from mydumper to myloader through stdout, without sharing the backup directory:

docker run --name mydumper \
     --rm \
     -v ${backups}:/backups \
     mydumper/mydumper:v0.14.4-7 \
     sh -c "rm -rf /backups/data; \
          mydumper -h 172.17.0.5 \
               -o /backups/data \
               -B test \
               -v 3 \
               -r 100:1000:0 \
               -L /backups/mydumper.log \
               -F 1 \
               --stream \
               -c \
        | myloader -h 172.17.0.4 \
               -d /backups/data_tmp \
               -B test \
               -v 3 \
               -o \
               -L /backups/myloader.log \
               --stream"

In this case, backup files will be created on /backups/data, sent through the pipeline, and stored on /backups/data_tmp until myloader imports that backup file, and then it will remove it.

To optimize this procedure, we can now share the backup directory and set --stream to NO_STREAM_AND_NO_DELETE, which does not stream the content of the files but streams only the filenames, and does not delete them, as we want the files to be shared with myloader:

docker run --name mydumper \
     --rm \
     -v ${backups}:/backups \
     mydumper/mydumper:v0.14.4-7 \
     sh -c "rm -rf /backups/data; \
          mydumper -h 172.17.0.5 \
               -o /backups/data \
               -B test \
               -v 3 \
               -r 100:1000:0 \
               -L /backups/mydumper.log \
               -F 1 \
               --stream=NO_STREAM_AND_NO_DELETE \
               -c \
        | myloader -h 172.17.0.4 \
               -d /backups/data \
               -B test \
               -v 3 \
               -o \
               -L /backups/myloader.log \
               --stream"

As you can see, the directory is the same. Myloader will delete the files after importing them, but if you want to keep the backup files, you should use --stream=NO_DELETE.
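For instance, a minimal variation of the previous pipeline that keeps the backup files on disk could look like the sketch below; only the --stream mode changes, everything else (hosts, image tag, options) is as above:

docker run --name mydumper \
     --rm \
     -v ${backups}:/backups \
     mydumper/mydumper:v0.14.4-7 \
     sh -c "rm -rf /backups/data; \
          mydumper -h 172.17.0.5 \
               -o /backups/data \
               -B test \
               -v 3 \
               -r 100:1000:0 \
               -F 1 \
               -c \
               --stream=NO_DELETE \
               -L /backups/mydumper.log \
        | myloader -h 172.17.0.4 \
               -d /backups/data \
               -B test \
               -v 3 \
               -o \
               --stream \
               -L /backups/myloader.log"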

The performance gain will vary depending on the database size and number of tables. This can also be combined with another MyDumper feature, masquerade your backups, which allows you to build safer QA/Testing environments.

Conclusion

MyDumper, which already has proven that it is the fastest logical backup solution, now offers a simple and powerful way to migrate data in a dockerized environment.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

 

Try Percona Distribution for MySQL today!

May 15, 2023

Proof of Concept: Horizontal Write Scaling for MySQL With Kubernetes Operator


Historically, MySQL has been great at horizontal READ scaling. That scaling is provided by adding Replica nodes, no matter if you use standard asynchronous replication or synchronous replication.

However, those solutions do not offer the same level of scaling for write operations.

Why? Because these solutions still rely on writing to one single node that acts as the Primary. And in the case of multi-Primary, the writes are only distributed by transaction. In both cases, when using virtually-synchronous replication, the process requires certification from each node and a local (per node) write; as such, the writes are NOT distributed across multiple nodes but duplicated.

The main reason behind this is that MySQL is a relational database management system (RDBMS), and any data written to it must respect the RDBMS rules. In short, any data that is written must be consistent with the data already present. To achieve that, the incoming data needs to be checked against the existing data through the defined relations and constraints. This can affect very large datasets and be very expensive. Think about updating a table with millions of rows that refers to another table with another million rows.

An image may help:

data model for ecommerce

Every time I insert an order, I must be sure that all the related elements are in place and consistent.

This operation is quite expensive but our database can run it in a few milliseconds or less, thanks to several optimizations that allow the node to execute most of them in memory with no or little access to mass storage.

The key factor is that the whole data structure resides in the same location (node), facilitating the operations.

Once we have understood that, it also becomes clear why we cannot simply split relational data across multiple nodes and distribute the writes by table. If I have one node that manages only the items, another one the orders, and another one the payments, my solution needs to handle distributed transactions, each of which needs to certify and verify data on the other nodes.

This level of distribution seriously affects the efficiency of the operation and increases the response time significantly. Nothing is impossible; however, performance would be so impacted that each operation could take seconds instead of milliseconds (or fractions of a millisecond), unless we lift some of the rules and break the relational model.

MySQL, like other RDBMSs, is designed to work respecting the model and cannot scale by fragmenting and distributing a schema, so what can be done to scale?

The alternative is to split a consistent set of data into fragments. What is a consistent set of data? It all depends on the kind of information we are dealing with. Keeping in mind the example above, where we have a shop online serving multiple customers, we need to identify which is the most effective way to split the data.

For instance, if we try to split the data by Product Type (Books, CD/DVD, etc.), we will have a huge duplication of data related to customers/orders/shipments and so on, and all this data is also quite dynamic given I will have customers constantly ordering things.

Why duplicate the data? Because if I do not duplicate that data, I will not know if a customer has already bought or not that specific item, or I will have to ask again about the shipment address and so on. This also means that whenever a customer buys something or puts something on the wish list, I have to reconcile the data in all my nodes/clusters.

On the other hand, if I choose to split my data by country of customer’s residence, the only data I will have to duplicate and keep in sync is the one related to the products, of which the most dynamic one will be the number of items in stock. This, of course, is unless I can organize my products by country as well, which is a bit unusual nowadays but not impossible.

Another possible case is if I am a health organization and I manage several hospitals. As for the example above, it will be easier to split my data by hospital, given most of the data related to patients is bound to the hospital itself, as well as treatments and any other element related to hospital management. In contrast, it will make no sense to split by patient’s country of residence.

This technique of splitting the data into smaller pieces is called sharding and is currently the only way we have to scale RDBMS horizontally. 

In the MySQL open source ecosystem, we have only two consolidated ways to perform sharding — Vitess and ProxySQL. The first one is a complete solution that takes ownership of your database and manages almost any aspect of its operations in a sharded environment and includes a lot of specific features for DBAs to deal with daily operations like table modifications, backup, and more.

While this may look great, it also has some strings attached, including the complexity and proprietary environment. That makes Vitess a good fit for “complex” sharding scenarios where other solutions may not be enough.

ProxySQL does not have a sharding mechanism “per se,” but given the way it works and the features it has, it allows us to build simple sharding solutions.

It is important to note that most DBA operations will still fall on the DBA to execute, with added complexity given the sharded environment.

There is a third option which is application-aware sharding.

This solution sees the application aware of the need to split the data into smaller fragments and internally point the data to different “connectors” that are connected to multiple data sources.

In this case, the application is aware of a customer’s country and will redirect all the operations related to him to the datasource responsible for the specific fragment.

Normally this solution requires a full code redesign and could be quite difficult to achieve when it is injected after the initial code architecture definition.

On the other hand, if done at design, it is probably the best solution because it will allow the application to define the sharding rules and can also optimize the different data sources using different technologies for different uses.

One example could be using an RDBMS for most of the Online transaction processing (OLTP) data shared by country and having the products as distributed memory cache with a different technology. At the same time, all the data related to orders, payments, and customer history can be consolidated in a data warehouse used to generate reporting.    

As said, the last one is probably the most powerful and scalable, but also the most difficult to design, and unfortunately, it probably represents less than 5% of the solutions currently deployed.

Likewise, very few cases really need a full system/solution to provide scalability with sharding.

In my experience, most of the needs for horizontal scaling fall into the simple scenario, where there is the need to achieve sharding and data separation, very often with a shared-nothing architecture. In shared-nothing, each shard can live in a totally separate logical schema instance / physical database server / data center / continent. There is no ongoing need to retain shared access (between shards) to the other unpartitioned tables in other shards.

The POC

Why this POC?

Over the years, I have faced a lot of customers talking about scaling their database solution and looking at very complex sharding solutions such as Vitess as the first and only way to go.

This is often without even considering whether their needs really drive them in that direction.

In my experience, and in talking with several colleagues, I am not alone in analyzing the real needs. After discussing with all the parties impacted, only a very small percentage of customers were in real need of complex solutions. Most of the others were just trying to avoid a project that would implement a simple shared-nothing solution. Why? Because, apparently, it is simpler to migrate data to a platform that does everything for you than to accept a bit of additional work and challenge at the beginning while keeping a simple approach. Also, going for the latest shiny thing always has its magic.

On top of that, with the rise of Kubernetes and MySQL Operators, a lot of confusion started to circulate, most of it generated by a total lack of understanding that a database and a relational database are two separate things. That lack of understanding of the difference, and of the real problems attached to an RDBMS, has led some to talk about horizontal scaling for databases with a concerning superficiality and without clarifying whether they were talking about an RDBMS or not. As such, some clarification is long overdue, as is putting the KISS principle back as the main focus.

Given that, I thought that refreshing how ProxySQL could help in building a simple sharding solution may help to clarify the issues, reset the expectations and show how we can do things in a simpler way.  (See my old post, MySQL Sharding with ProxySQL.)

To do so, I built a simple POC that illustrates how you can use Percona Operator for MySQL (POM) and ProxySQL to build a sharded environment with a good level of automation for some standard operations like backup/restore software upgrade and resource scaling.

Why ProxySQL?

In the following example, we mimic a case where we need a simple sharding solution, which means we just need to redirect the data to different data containers, keeping the database maintenance operations on us. In this common case, we do not need to implement a full sharding system such as Vitess.  

As illustrated above, ProxySQL allows us to set up a common entry point for the application and then redirect the traffic on the base of identified sharding keys. It will also allow us to redirect read/write traffic to the primary and read-only traffic to all secondaries. 

The other interesting thing is that we can have ProxySQL as part of the application pod, or as an independent service. Best practices indicate that having ProxySQL closer to the application will be more efficient, especially if we decide to activate the caching feature.  

Why POM?

Percona Operator for MySQL has three main solutions: Percona Operator for Percona XtraDB Cluster, Percona Operator for MySQL Group Replication, and Percona Operator for Percona Server for MySQL. The first two are based on virtually-synchronous replication and allow the cluster to keep the data state consistent across all pods, guaranteeing that the service will always offer consistent data. In the K8s context, we can see POM as a single service with native horizontal scalability for reads, while for writes, we will adopt the mentioned sharding approach. 

The other important aspect of using a POM-based solution is the automation it comes with. Deploying POM, you will be able to set automation for backups, software updates, monitoring (using Percona Monitoring and Management (PMM)), and last but not least, the possibility to scale UP or DOWN just by changing the needed resources. 

The elements used

kubernetes sharding

In our POC, I will use a modified version of sysbench (https://github.com/Tusamarco/sysbench) that has an additional field continent and I will use that as a sharding key. At the moment, and for the purpose of this simple POC, I will only have two shards.

As the diagram above illustrates here, we have a simple deployment but good enough to illustrate the sharding approach.

We have:

  • The application(s) node(s) — it is really up to you whether you want to test with one application node or more. Nothing changes, and the same goes for the ProxySQL nodes; just keep in mind that if you use more ProxySQL nodes, it is better to activate the internal cluster support or use Consul to synchronize them.
  • Shard one is based on POM with PXC; it has the following:
    • Load balancer for service entry point
      • Entry point for r/w
      • Entry point for read-only
    • Three Pods for HAProxy
      • HAProxy container
      • PMM agent container
    • Three Pods with data nodes (PXC)
      • PXC cluster node container
      • Log streaming
      • PMM container
    • Backup/restore service
  • Shard two is based on POM for Percona Server for MySQL and Group Replication (technical preview); it has the following:
    • Load balancer for service entry point
      • Entry point for r/w
      • Entry point for read-only
    • Three Pods for MySQL Router (testing)
      • MySQL Router container
    • Three Pods with data nodes (PS with GR)
      • PS-GR cluster node container
      • Log streaming
      • PMM container
    • Backup/restore on scheduler

Now, you may have noticed that the nodes are drawn in different sizes; this is not a drawing mistake. It indicates that I have allocated more resources (CPU and memory) to shard1 than to shard2. Why? Because I can, and because I am simulating a situation where shard2 gets less traffic, at least temporarily; as such, I do not want to give it the same resources as shard1. I will eventually increase them if I see the need.

The settings

Data layer

Let us start with the easy one, the data layer configuration. Configuring the environment correctly is the key, and to do so, I am using a tool that I wrote specifically to calculate the needed configuration in a K8s POM environment. You can find it here (https://github.com/Tusamarco/mysqloperatorcalculator). 

Once you have compiled it and run it, you can simply ask what “dimensions” are supported, or you can define a custom level of resources, but you will still need to indicate the expected load level. In any case, please refer to the README in the repository with all the instructions.

The full cr.yaml for PXC shard1 is here, while the one for PS-GR is here

For Shard1, I asked for resources to cover traffic of type 2 (Light OLTP), configuration type 5 (2XLarge), and 1,000 connections.

For Shard2, I asked for resources to cover traffic of type 2 (Light OLTP), configuration type 2 (Small), and 100 connections.

Once you have the CRs defined, you can follow the official guidelines to set the environment up.

It is time now to see the ProxySQL settings.

ProxySQL and sharding rules

As mentioned before, we will test the load sharding by continent, and we know that ProxySQL will not provide additional help to automatically manage the sharded environment. 

Given that, one way to do it is to create a DBA account per shard, or to inject the shard information into the commands while executing them. I will use the less comfortable approach, separate DBA accounts, just to prove that it works.

We will have two shards: the sharding key is the continent field, and the continents will be grouped as follows:

  • Shard one:
    • Asia
    • Africa
    • Antarctica
    • Europe
    • North America
  • Shard two:
    • Oceania
    • South America

The DBAs users:

  • dba_g1
  • dba_g2

The application user:

  • app_test

The host groups will be:

  • Shard one
    • 100 Read and Write
    • 101 Read only
  • Shard two
    • 200 Read and Write
    • 201 Read only

Once that is defined, we need to identify which query rules will serve us and how. What we want is to redirect all the incoming queries for:

  • Asia, Africa, Antarctica, Europe, and North America to shard1.
  • Oceania and South America to shard2
  • Split the queries in R/W and Read only
  • Prevent the execution of any query that does not have a shard key
  • Backup data at regular intervals and store it in a safe place


Given the above, we first define the rules for the DBA accounts.

We set the hostgroup for each DBA account, and then, if the query matches a sharding rule, we redirect it to the proper shard. Otherwise, the HG remains as set.

This allows us to execute queries like CREATE/DROP TABLE on our shard without a problem, while still sending the data where it needs to go.
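Purely as an illustrative sketch (this is not the complete rule set used in the POC; the admin credentials, the flagIN value, and the simplified match patterns are assumptions), rules of this shape can be loaded through the ProxySQL admin interface, with the rule IDs and hostgroups mirroring the stats output shown later in this post:

mysql -h 127.0.0.1 -P 6032 -u admin -padmin <<'SQL'
-- Queries coming from dba_g1 (shard1 default, flagIN assumed to be 500 here):
-- continents belonging to shard1 go to hostgroup 100,
-- continents belonging to shard2 go to hostgroup 200.
INSERT INTO mysql_query_rules (rule_id, active, flagIN, match_pattern, destination_hostgroup, apply)
VALUES (31, 1, 500, 'continent\s*(=|like)\s*.?(Asia|Africa|Antarctica|Europe|North America)', 100, 1);
INSERT INTO mysql_query_rules (rule_id, active, flagIN, match_pattern, destination_hostgroup, apply)
VALUES (32, 1, 500, 'continent\s*(=|like)\s*.?(Oceania|South America)', 200, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
SQL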

For instance, below is one of the queries that sysbench will run.

Prepare:

INSERT INTO windmills_test1 /*  continent=Asia */ (uuid,millid,kwatts_s,date,location,continent,active,strrecordtype) VALUES(UUID(), 79, 3949999,NOW(),'mr18n2L9K88eMlGn7CcctT9RwKSB1FebW397','Asia',0,'quq')

In this case, I have the application simply injecting a comment into the INSERT SQL declaring the shard key. Given that I am using the dba_g1 account to create/prepare the schemas, the continent-matching rules (31/32) apply, and since they have apply=1, ProxySQL exits the query-rules parsing and sends the command to the relevant hostgroup.

Run:

SELECT id, millid, date,continent,active,kwatts_s FROM windmills_test1 WHERE id BETWEEN ? AND ? AND continent='South America'

SELECT SUM(kwatts_s) FROM windmills_test1 WHERE id BETWEEN ? AND ?  and active=1  AND continent='Asia'
SELECT id, millid, date,continent,active,kwatts_s  FROM windmills_test1 WHERE id BETWEEN ? AND ?  AND continent='Oceania' ORDER BY millid

SELECT DISTINCT millid,continent,active,kwatts_s   FROM windmills_test1 WHERE id BETWEEN ? AND ? AND active =1  AND continent='Oceania' ORDER BY millid

UPDATE windmills_test1 SET active=? WHERE id=?  AND continent='Asia'
UPDATE windmills_test1 SET strrecordtype=? WHERE id=?  AND continent='North America'

DELETE FROM windmills_test1 WHERE id=?  AND continent='Antarctica'

INSERT INTO windmills_test1 /* continent=Antarctica */ (id,uuid,millid,kwatts_s,date,location,continent,active,strrecordtype) VALUES (?, UUID(), ?, ?, NOW(), ?, ?, ?,?) ON DUPLICATE KEY UPDATE kwatts_s=kwatts_s+1

The above are executed during the tests.  In all of them, the sharding key is present, either in the WHERE clause OR as a comment. 

Of course, if I execute one of them without the sharding key, the firewall rule will stop the query execution, i.e.:

mysql> SELECT id, millid, date,continent,active,kwatts_s FROM windmills_test1 WHERE id BETWEEN ? AND ?;
ERROR 1148 (42000): It is impossible to redirect this command to a defined shard. Please be sure you Have the Continent definition in your query, or that you use a defined DBA account (dba_g{1/2})


Check here for the full command list.
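The blocking behavior relies on a catch-all rule that returns an error message when no shard-specific rule has matched. A minimal sketch of such a rule follows (the rule ID and match_digest mirror the stats output shown later, the message text is the one returned above, and the admin credentials and rule ordering are assumptions):

mysql -h 127.0.0.1 -P 6032 -u admin -padmin <<'SQL'
-- Refuse any query that reached this point without matching a continent rule.
INSERT INTO mysql_query_rules (rule_id, active, match_digest, error_msg, apply)
VALUES (2000, 1, '.', 'It is impossible to redirect this command to a defined shard. Please be sure you Have the Continent definition in your query, or that you use a defined DBA account (dba_g{1/2})', 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
SQL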

Setting up the dataset

Once the rules are set, it is time to set up the schemas and the data using sysbench (https://github.com/Tusamarco/sysbench). Remember to use windmills_sharding tests.  

The first operation is to build the schema on SHARD2 without filling it with data. This is a DBA action; as such, we will execute it using the dba_g2 account:

sysbench ./src/lua/windmills_sharding/oltp_read.lua  --mysql-host=10.0.1.96  --mysql-port=6033 --mysql-user=dba_g2 --mysql-password=xxx --mysql-db=windmills_large --mysql_storage_engine=innodb --db-driver=mysql --tables=4 --table_size=0 --table_name=windmills --mysql-ignore-errors=all --threads=1  prepare

Setting table_size to zero and pointing to the ProxySQL IP/port will do, and I will have the following:

mysql> select current_user(), @@hostname;
+----------------+-------------------+
| current_user() | @@hostname        |
+----------------+-------------------+
| dba_g2@%       | ps-mysql1-mysql-0 |
+----------------+-------------------+
1 row in set (0.01 sec)

mysql> use windmills_large;
Database changed
mysql> show tables;
+---------------------------+
| Tables_in_windmills_large |
+---------------------------+
| windmills1                |
| windmills2                |
| windmills3                |
| windmills4                |
+---------------------------+
4 rows in set (0.01 sec)

mysql> select count(*) from windmills1;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.09 sec)

All set but empty.

Now let us do the same but with the other DBA user:

sysbench ./src/lua/windmills_sharding/oltp_read.lua  --mysql-host=10.0.1.96  --mysql-port=6033 --mysql-user=dba_g1 --mysql-password=xxx --mysql-db=windmills_large --mysql_storage_engine=innodb --db-driver=mysql --tables=4 --table_size=400 --table_name=windmills --mysql-ignore-errors=all --threads=1  prepare

If I now run the select above with user dba_g2:

mysql> select current_user(), @@hostname;select count(*) from windmills1;
+----------------+-------------------+
| current_user() | @@hostname        |
+----------------+-------------------+
| dba_g2@%       | ps-mysql1-mysql-0 |
+----------------+-------------------+
1 row in set (0.00 sec)

+----------+
| count(*) |
+----------+
|      113 |
+----------+
1 row in set (0.00 sec)

While if I reconnect and use dba_g1:

mysql> select current_user(), @@hostname;select count(*) from windmills1;
+----------------+--------------------+
| current_user() | @@hostname         |
+----------------+--------------------+
| dba_g1@%       | mt-cluster-1-pxc-0 |
+----------------+--------------------+
1 row in set (0.00 sec)

+----------+
| count(*) |
+----------+
|      287 |
+----------+
1 row in set (0.01 sec)

I can also check on ProxySQL to see which rules were utilized:

select active,hits,destination_hostgroup, mysql_query_rules.rule_id, match_digest, match_pattern, replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;

+------+-----------------------+---------+---------------------+----------------------------------------------------------------------------+-------+--------+---------+
| hits | destination_hostgroup | rule_id | match_digest        | match_pattern                                                              | apply | flagIN | flagOUT |
+------+-----------------------+---------+---------------------+----------------------------------------------------------------------------+-------+--------+---------+
| 3261 | 100                   | 20      | NULL                | NULL                                                                       | 0     | 0      | 500     |
| 51   | 200                   | 21      | NULL                | NULL                                                                       | 0     | 0      | 600     |
| 2320 | 100                   | 31      | NULL                | \scontinent\s*(=|like)\s*'*(Asia|Africa|Antarctica|Europe|North America)'* | 1     | 500    | 0       |
| 880  | 200                   | 32      | NULL                | \scontinent\s*(=|like)\s*'*(Oceania|South America)'*                       | 1     | 500    | 0       |
| 0    | 100                   | 34      | NULL                | \scontinent\s*(=|like)\s*'*(Asia|Africa|Antarctica|Europe|North America)'* | 1     | 600    | 0       |
| 0    | 200                   | 35      | NULL                | \scontinent\s*(=|like)\s*'*(Oceania|South America)'*                       | 1     | 600    | 0       |
| 2    | 100                   | 51      | NULL                | \scontinent\s*(=|like)\s*'*(Asia|Africa|Antarctica|Europe|North America)'* | 0     | 0      | 1001    |
| 0    | 200                   | 54      | NULL                | \scontinent\s*(=|like)\s*'*(Oceania|South America)'*                       | 0     | 0      | 1002    |
| 0    | 100                   | 60      | NULL                | NULL                                                                       | 0     | 50     | 1001    |
| 0    | 200                   | 62      | NULL                | NULL                                                                       | 0     | 60     | 1002    |
| 7    | NULL                  | 2000    | .                   | NULL                                                                       | 1     | 0      | NULL    |
| 0    | 100                   | 2040    | ^SELECT.*FOR UPDATE | NULL                                                                       | 1     | 1001   | NULL    |
| 2    | 101                   | 2041    | ^SELECT.*$          | NULL                                                                       | 1     | 1001   | NULL    |
| 0    | 200                   | 2050    | ^SELECT.*FOR UPDATE | NULL                                                                       | 1     | 1002   | NULL    |
| 0    | 201                   | 2051    | ^SELECT.*$          | NULL                                                                       | 1     | 1002   | NULL    |
+------+-----------------------+---------+---------------------+----------------------------------------------------------------------------+-------+--------+---------+

Running the application

Now that the data load test was successful, let us do the real load following the same indications as above, but with 80 tables and a few more records, around 20,000 rows per table, nothing huge.

Once the data is loaded, we will have the two shards with different numbers of records. If all went well, shard2 should hold roughly one-quarter of the total (two continents out of seven) and shard1 roughly three-quarters (five out of seven).

When the load is over, I have, as expected:

mysql> select current_user(), @@hostname;select count(*) as shard1 from windmills_large.windmills80;select /* continent=shard2 */ count(*) as shard2 from windmills_large.windmills80;
+----------------+--------------------+
| current_user() | @@hostname         |
+----------------+--------------------+
| dba_g1@%       | mt-cluster-1-pxc-0 |
+----------------+--------------------+
1 row in set (0.00 sec)

+--------+
| shard1 |
+--------+
|  14272 | ← Table windmills80 in SHARD1
+--------+
+--------+
| shard2 |
+--------+
|   5728 | ← Table windmills80 in SHARD2
+--------+
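As a quick sanity check on the numbers above: 14,272 + 5,728 = 20,000 rows for table windmills80, and 5,728 / 20,000 ≈ 29%, which lines up with the two continents out of seven (2/7 ≈ 28.6%) that were assigned to shard2.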

As you may have already noticed, I used a trick to query the other shard with the dba_g1 user: I just passed the shard2 definition as a comment in the query. That is all we need.

Let us execute the run command for writes in sysbench and see what happens.

The first thing we can notice while doing writes is the query distribution:

+--------+-----------+----------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+
| weight | hostgroup | srv_host                                                                   | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries |
+--------+-----------+----------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+
| 10000  | 100       | ac966f7d46c04400fb92a3603f0e2634-193113472.eu-central-1.elb.amazonaws.com  | 3306     | ONLINE | 24	     | 0        | 138    | 66      | 25          | 1309353 |
| 100    | 101       | a5c8836b7c05b41928ca84f2beb48aee-1936458168.eu-central-1.elb.amazonaws.com | 3306     | ONLINE | 0	     | 0        | 0      | 0       | 0           |       0 |
| 10000  | 200       | a039ab70e9f564f5e879d5e1374d9ffa-1769267689.eu-central-1.elb.amazonaws.com | 3306     | ONLINE | 24	     | 1        | 129    | 66      | 25          |  516407 |
| 10000  | 201       | a039ab70e9f564f5e879d5e1374d9ffa-1769267689.eu-central-1.elb.amazonaws.com | 6447     | ONLINE | 0	     | 0        | 0      | 0       | 0           |       0 |
+--------+-----------+----------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+

Here we can see that the connections are evenly distributed, while the query load mainly goes to shard1, as expected, given that we have an unbalanced sharding by design.

At the MySQL level, we had:

(Charts: MySQL Questions and MySQL Com Type)

The final point is, what is the gain of using this sharding approach?

Well, we still need to consider the fact we are testing on a very small set of data. However, if we can already identify some benefits here, that will be an interesting result. 

Let’s see the write operations with 24 and 64 threads:

(Charts: MySQL writes and MySQL latency at 24 and 64 threads)

We get a gain of ~33% just by using sharding, while for latency we pay no cost. On the contrary, even with a small load increase, we can see that the sharded solution performs better. Of course, we are still talking about a low number of rows and running threads, but the gain is there.

Backup

The backup and restore operation when using POM is completely managed by the operator (see instructions in the POM documentation https://docs.percona.com/percona-operator-for-mysql/pxc/backups.html and https://docs.percona.com/percona-operator-for-mysql/ps/backups.html). 

The interesting part is that we can have multiple kinds of backup solutions, like:

  • On-demand
  • Scheduled 
  • Full Point in time recovery with log streaming

Automation will allow us to set a schedule as simple as this:

schedule:
  - name: "sat-night-backup"
    schedule: "0 0 * * 6"
    keep: 3
    storageName: s3-eu-west
  - name: "daily-backup"
    schedule: "0 3 * * *"
    keep: 7
    storageName: s3-eu-west

Or, if you want to run the on-demand:

kubectl apply -f backup.yaml

Where the backup.yaml file has very simple information:

apiVersion: ps.percona.com/v1alpha1
kind: PerconaServerMySQLBackup
metadata:
  name: ps-gr-sharding-test-2nd-of-may
#  finalizers:
#    - delete-backup
spec:
  clusterName: ps-mysql1
  storageName: s3-ondemand

Using both methods, we will be able to soon have a good set of backups like:

POM (PXC)

cron-mt-cluster-1-s3-eu-west-20234293010-3vsve   mt-cluster-1   s3-eu-west    s3://mt-bucket-backup-tl/scheduled/mt-cluster-1-2023-04-29-03:00:10-full   Succeeded   3d9h        3d9h
cron-mt-cluster-1-s3-eu-west-20234303010-3vsve   mt-cluster-1   s3-eu-west    s3://mt-bucket-backup-tl/scheduled/mt-cluster-1-2023-04-30-03:00:10-full   Succeeded   2d9h        2d9h
cron-mt-cluster-1-s3-eu-west-2023513010-3vsve    mt-cluster-1   s3-eu-west    s3://mt-bucket-backup-tl/scheduled/mt-cluster-1-2023-05-01-03:00:10-full   Succeeded   33h         33h
cron-mt-cluster-1-s3-eu-west-2023523010-3vsve    mt-cluster-1   s3-eu-west    s3://mt-bucket-backup-tl/scheduled/mt-cluster-1-2023-05-02-03:00:10-full   Succeeded   9h          9h

POM (PS) *

NAME                             STORAGE       DESTINATION                                                                     STATE       COMPLETED   AGE
ps-gr-sharding-test              s3-ondemand   s3://mt-bucket-backup-tl/ondemand/ondemand/ps-mysql1-2023-05-01-15:10:04-full   Succeeded   21h         21h
ps-gr-sharding-test-2nd-of-may   s3-ondemand   s3://mt-bucket-backup-tl/ondemand/ondemand/ps-mysql1-2023-05-02-12:22:24-full   Succeeded   27m         27m
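For reference, listings like the two above can be pulled with kubectl against each operator's backup custom resources; a rough sketch (the short resource names are assumptions and may differ between operator versions):

# Shard1 (PXC-based operator)
kubectl get pxc-backup
# Shard2 (PS/GR-based operator, technical preview)
kubectl get ps-backup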

Note that as DBA, we still need to validate the backups with a restore procedure. That part is not automated (yet). 

*Note that Backup for POM PS is available only on demand, given the solution is still in technical preview.

When will this solution fit in?

As mentioned multiple times, this solution can cover simple cases of sharding; better if you have shared-nothing. 

It also requires work from the DBA side in case of DDL operations or resharding. 

You also need to be able to change some SQL code to make sure the sharding key/information is present in any SQL executed.

When will this solution not fit in?

Several things could prevent you from using this solution. The most common ones are:

  • You need to query multiple shards at the same time. This is not possible with ProxySQL.
  • You do not have a DBA to perform administrative work and need to rely on an automated system.
  • You need distributed transactions across shards.
  • You have no access to the SQL code.

Conclusions

We do not have a Hamlet-like dilemma about sharding or not sharding.

When using an RDBMS like MySQL, if you need horizontal scalability, you need to shard. 

The point is that there is no magic wand or solution; moving to sharding is an expensive and impactful operation. If you choose it at the beginning, before doing any application development, the effort can be significantly less.

Doing it sooner will also allow you to test proper solutions, where proper means a KISS solution. Always go for the less complex option, because in two years you will be very happy about your decision.

If, instead, you must convert a current solution, then prepare for bloodshed, or at least for a long journey. 

In any case, we need to keep in mind a few key points:

  • Do not believe most of the articles on the internet that promise you infinite scalability for your database. If there is no distinction in the article between a simple database and an RDBMS, run away. 
  • Do not go for the latest shiny thing just because it shines. Test it and evaluate IF it makes sense for you. Better to spend a quarter now testing a few solutions than to fight for years with something that you do not fully comprehend.
  • Using containers/operators/Kubernetes does not scale per se; you must find a mechanism to make the solution scale. There is absolutely NO difference from on-premises deployments in that respect. What you may get is a good level of automation. However, that comes with a good level of complexity, and it is up to you to evaluate whether it makes sense or not.

As said at the beginning, for MySQL, the choice is limited. Vitess is the full complete solution, with a lot of coding to provide you with a complete platform to deal with your scaling needs.

However, do not be so fast to exclude ProxySQL as a possible solution. There are already many out there using it for sharding as well.

This small POC used a synthetic case, but it also shows that with just four rules, you can achieve a decent solution. A real scenario could be a bit more complex … or not. 

References

Vitess (https://vitess.io/docs/)

ProxySQL (https://proxysql.com/documentation/)

Firewalling with ProxySQL (https://www.tusacentral.com/joomla/index.php/mysql-blogs/197-proxysql-firewalling)

Sharding:

 

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

 

Learn More About Percona Kubernetes Operators

May 2, 2023

How To Use pt-secure-collect for Capturing Data in a Secure Way From the OS and Database System


Sometimes crucial data sharing is avoided because of compliance rules, organizational policies, or security concerns. Common use cases involve sharing pt-mysql-summary, pt-stalk, and other OS-related details to help Support Engineers or other third-party teams troubleshoot database-related issues.

In this context, pt-secure-collect is a very important utility from Percona, which helps capture the required information securely and also provides aid in masking the existing information.

Pt-secure-collect helps in collecting, sanitizing, and encrypting data from various sources. By default, this utility collects the output with the help of pt-stalk, pt-summary, and pt-mysql-summary.

Let’s see how this tool works.

Installation

The tool can be installed via the Percona official repositories:

sudo yum install percona-toolkit

Alternatively, you can download pt-secure-collect either as part of the Percona Toolkit package or as a standalone binary.

shell> sudo wget https://downloads.percona.com/downloads/percona-toolkit/3.5.2/binary/redhat/7/x86_64/percona-toolkit-3.5.2-2.el7.x86_64.rpm 
shell> sudo yum install percona-toolkit-3.5.2-2.el7.x86_64.rpm

OR

shell> sudo wget percona.com/get/pt-secure-collect 
shell> sudo chmod +x pt-secure-collect

 Now, let’s run our first command to capture the OS/Database-related details from the tool.

shell> ./pt-secure-collect collect --bin-dir=/usr/bin/ --temp-dir=/home/vagrant/pt/ --mysql-host=localhost --mysql-port=3306 --mysql-user=root --mysql-password=Root@1234
Encryption password

Output:

INFO[2023-04-22 06:54:10] Temp directory is "/home/vagrant/pt/"
INFO[2023-04-22 06:54:10] Creating output file "/home/vagrant/pt/pt-stalk_2023-04-22_06_54_10.out"  
INFO[2023-04-22 06:54:10] Running pt-stalk --no-stalk --iterations=2 --sleep=30 --host=localhost --dest=/home/vagrant/pt/ --port=3306 --user=root --password=********  
INFO[2023-04-22 06:55:42] Creating output file "/home/vagrant/pt/pt-summary_2023-04-22_06_55_42.out"  
INFO[2023-04-22 06:55:42] Running pt-summary                            
INFO[2023-04-22 06:55:48] Creating output file "/home/vagrant/pt/pt-mysql-summary_2023-04-22_06_55_48.out"  
INFO[2023-04-22 06:55:48] Running pt-mysql-summary --host=localhost --port=3306 --user=root --password=********  
INFO[2023-04-22 06:56:01] Sanitizing output collected data              
INFO[2023-04-22 06:56:17] Creating tar file "/home/vagrant/pt/pt.tar.gz"  
INFO[2023-04-22 06:56:17] Encrypting "/home/vagrant/pt/pt.tar.gz" file into "/home/vagrant/pt/pt.tar.gz.aes"  
INFO[2023-04-22 06:56:17] Skipping encrypted file "pt.tar.gz.aes"   

So, the above command collected the data from the “pt*” tools securely. By default, it encrypts the data and asks for an encryption password as well. However, we can skip that part with the --no-encrypt option.
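For example, a run that skips the encryption step might look like this (same connection options as the earlier command; treat it as a sketch):

shell> ./pt-secure-collect collect --bin-dir=/usr/bin/ --temp-dir=/home/vagrant/pt/ \
        --mysql-host=localhost --mysql-port=3306 --mysql-user=root \
        --mysql-password=Root@1234 --no-encrypt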

Options:

--bin-dir => Directory containing the Percona Toolkit binaries (pt* tools).
--temp-dir => Temporary directory used for the data collection.

Note – In order to run the command successfully, all prerequisite binaries (pt-stalk, pt-summary, and pt-mysql-summary) must be present in the --bin-dir location specified in the command.

Let’s decrypt the file and observe the captured details:

shell> ./pt-secure-collect decrypt /home/vagrant/pt/pt.tar.gz.aes  --outfile=/home/vagrant/pt/pt.tar.gz
Encryption password:
INFO[2023-04-22 07:01:55] Decrypting file "/home/vagrant/pt/pt.tar.gz.aes" into "/home/vagrant/pt/pt.tar.gz" 

Note – Here, we need to provide the password which we used at the time of encryption.

--outfile => Write the output to this file. If omitted, the output file name will be the same as the input file, removing the .aes extension.

Now, inside the path, we can see the unencrypted file. Followed by this, we can uncompress the file to see the contents.

shell> ls -lh /home/vagrant/pt/
-rw-------. 1 vagrant vagrant 500K Apr 22 07:01 pt.tar.gz

shell> tar -xzvf pt.tar.gz

Let’s look at a couple of examples where the sensitive data has been altered or masked.

  • With pt-secure-collect:
Hostname | hostname 
log_error | /var/log/hostname 
Config File | /etc/hostname 
pid-file        = /var/run/mysqld/hostname 
log-error     = /var/log/hostname 
socket        = /var/lib/mysql/hostname

  • Without pt-secure-collect:
Hostname | localhost.localdomain 
log_error | /var/log/mysqld.log 
Config File | /etc/my.cnf 
pid-file       = /var/run/mysqld/mysqld.pid 
log-error     = /var/log/mysqld.log 
socket        = /var/lib/mysql/mysql.sock

Note – We can clearly see some differences between the two types of output. With pt-secure-collect, the above information was simply replaced with a placeholder value (“hostname”).

Now, let’s see how we can sanitize an existing file, “pt-mysql-summary.out”, and mask the critical information, which produces the output section below.

shell> ./pt-secure-collect sanitize --input-file=/home/vagrant/pt-mysql-summary.out > /home/vagrant/pt-mysql-summary_sanitize.out

Output:

Hostname | hostname 
Pidfile | /var/run/mysqld/hostname (exists) 
log_error | /var/log/hostname 
Config File | /etc/hostname 
pid-file        = /var/run/mysqld/hostname 
log-error     = /var/log/hostname 
socket        = /var/lib/mysql/hostname 
log-error     = /var/log/mariadb/hostname
pid-file        = /var/run/mariadb/hostname

You can also control which information is skipped from masking with the --no-sanitize-hostnames and --no-sanitize-queries options.

Here, we see one more example where the critical information, such as “hostname” details inside the OS log file (“/var/log/messages”), is masked/replaced by some other value.

shell> sudo ./pt-secure-collect sanitize --input-file=/var/log/messages > /home/vagrant/messages_sanitize.out

 

Output (without pt-secure-collect):

Apr 23 03:37:13 localhost pmm-agent: #033[31mERRO#033[0m[2023-04-23T03:37:13.547+00:00] time="2023-04-23T03:37:13Z" level=error msg="Error opening connection to ProxySQL: dial tcp 127.0.0.1:6032: connect: connection refused" source="exporter.go:169"  #033[31magentID#033[0m=/agent_id/04dd6ad8-5c2e-4c52-a624-eb3bc7357651 #033[31mcomponent#033[0m=agent-process #033[31mtype#033[0m=proxysql_exporter

Output (with pt-secure-collect):

Apr 23 03:37:13 localhost pmm-agent: #033[31mERRO#033[0m[2023-04-23T03:37:13.547+00:00] time="2023-04-23T03:37:13Z" level=error msg="Error opening connection to ProxySQL: dial tcp hostname:6032: connect: connection refused" source="hostname:169"  #033[31magentID#033[0m=/agent_id/04dd6ad8-5c2e-4c52-a624-eb3bc7357651 #033[31mcomponent#033[0m=agent-process #033[31mtype#033[0m=proxysql_exporter

 

Summary

With the help of this tool, both OS- and database-level information/logs can be encrypted or masked with placeholder values to hide sensitive data. This tool comes in handy when sharing critical troubleshooting data with third-party stakeholders and also helps maintain security/compliance-related practices.

Percona Toolkit is a collection of advanced open source command-line tools, developed and used by the Percona technical staff, that are engineered to perform a variety of MySQL, MariaDB, MongoDB, and PostgreSQL server and system tasks that are too difficult or complex to perform manually.

 

Learn more about Percona Toolkit

May
02
2023
--

ChatGPT vs. MySQL DBA Challenge

ChatGPT vs MySQL

Given the increasing popularity of ChatGPT in answering humankind’s questions, I decided to take a closer look at the current state of AI. At the same time that I see database engineers relying on the tool, sites such as StackOverflow are banning ChatGPT. As the question comes up from time to time: is my profession at risk of going extinct due to AI? As my mentor said when I started working in tech, the answer is: it depends. I say this because, at this moment in time, my answer is a big No. As we will see shortly, the answers are quite generic, but they are at least in the right direction. Nothing, however, prevents future AI models from being smarter and providing precise answers.

Let’s look at some questions I asked that a MySQL DBA usually needs to answer in an interview process.

Questions

Q: I have a MySQL server with 500 GB of RAM; my data set is 100 GB. How large my InnoDB buffer pool needs to be?

ChatGPT: The InnoDB buffer pool is used by MySQL to cache frequently accessed data in memory. It is recommended to set the buffer pool size to a value that is large enough to hold your entire dataset in memory. Since your dataset is 100 GB and you have 500 GB of RAM, you can allocate a significant portion of the available memory to the InnoDB buffer pool. A commonly recommended value for the buffer pool size is between 50% to 75% of the available memory. In your case, a buffer pool size of around 375 GB (75% of 500 GB) should be a good starting point. However, the actual value that you should use will depend on the specific characteristics of your workload. Keep in mind that setting the buffer pool size too high may result in other processes on your server competing for memory, which can impact performance. It is also important to monitor your server’s memory usage regularly to ensure that it is not being exhausted by the buffer pool.

So this answer was inaccurate and evasive. Since my data set is 100GB, I would like to see ChatGPT explicitly mention that a good starting point would be 100GB. If we expand on the cache concept, the buffer pool could be even smaller if the working set (hot data) is smaller.

Q: I have a MySQL server, and my application is writing at a rate of 100 MB/hour in my redo logs. How large does my innodb_redo_log_capacity need to be?

ChatGPT: The innodb_redo_log_capacity parameter specifies the maximum size of the InnoDB redo log buffer, which is used to store changes made to the database before they are written to disk. The size of the redo log buffer should be large enough to handle the maximum amount of changes that can occur within the time it takes to write the changes to disk. To calculate the required size of your redo log buffer, you can use the following formula:

innodb_redo_log_capacity = (max_rate_of_change * max_redo_log_write_time) / redo_log_block_size

In your case, you have provided the maximum rate of change, which is 100 MB/hour. You will need to estimate the maximum time it takes to write the changes to disk and the redo log block size. Assuming a redo log block size of 512 bytes (which is the default value for innodb_log_block_size), and a maximum time of 10 minutes (0.16667 hours) to write the changes to disk, the calculation would be as follows: […]

I didn’t paste the full answer on purpose because it was very wrong. The innodb_redo_log_capacity is a setting from the MySQL server, while the innodb_log_block_size is a setting from MariaDB. Also, the redo log capacity is not the same as the redo log buffer, so it mixes different technologies and concepts. If you want to know more about how to calculate a good redo log size, check out this blog post:

MySQL 8.0 Dynamic Redo Log Sizing
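
For context, a rough sketch of one common way to estimate the redo write rate is to sample the Innodb_os_log_written status counter twice over a known interval; the difference is the redo volume written during that period:

mysql> SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';
-- wait for a representative period (e.g., one hour), then sample again
mysql> SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';
-- the delta between the two values is the redo written in that hour,
-- which gives a starting point for sizing the redo log capacity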

Q: I have the following table in my MySQL server:

CREATE TABLE `joinit` (
  `i` int(11) NOT NULL AUTO_INCREMENT,
  `s` varchar(64) DEFAULT NULL,
  `t` time NOT NULL,
  `g` int(11) NOT NULL,
  PRIMARY KEY (`i`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1;

Q: How do I optimize the following query:

SELECT * FROM joinit WHERE g = 10;

ChatGPT: To optimize the query SELECT * FROM joinit WHERE g = 10;, you can follow these steps:

Add an index to the g column: Since the query is filtering by the g column, adding an index on that column will improve the query’s performance. You can add the index using the following command:

ALTER TABLE joinit ADD INDEX (g);

[…]

The answer had a few more comments, but the fact that ChatGPT’s first suggestion was the “ADD INDEX” command is impressive.

Q: I have a MySQL server with 10 CPUs. What is the ideal number for my innodb_thread_concurrency?

ChatGPT: The ideal value for the innodb_thread_concurrency setting in MySQL depends on several factors, including the workload on the server, the number of CPU cores, and the amount of available memory. In general, the recommended value for innodb_thread_concurrency on a server with 10 CPUs is typically between 16 and 32. However, it’s important to note that the optimal value may vary depending on your specific workload and hardware configuration.  To determine the ideal value for your server, you can start by setting the innodb_thread_concurrency value to a moderate number (e.g., 16) and monitoring the server’s performance. You can then gradually increase or decrease the value to optimize the server’s performance based on your workload.[…]

The answer here follows a general rule of thumb. If you are interested in more details about this setting, you can check the following blog post:

InnoDB Thread Concurrency

Also, during the review of this blog post, one of my colleagues was not convinced by the suggestion (which is great!). He said:

Indeed but 16 to 32 in a 10 core server may lead to context switch contention so the number IMO is not right, a good starting point is setting a number smaller than the amount of CPUs. So the actual answer is highly dependent on the workload (i.e. concurrent threads running, duration of queries, etc)

As we can see, the decision is not based only on a formula or documentation. Much of the knowledge relies on experience and lessons learned from the problems faced. Also, debating with other engineers allows you to gather other points of view, which improves the solution to a point that ChatGPT is nowhere near reaching.

Let’s make it harder now.

Q: How can I refresh statistics for my spatial indexes in MySQL?

ChatGPT: In MySQL, you can refresh the statistics for your spatial indexes by using the ANALYZE TABLE statement. […]

This answer is wrong. The statistics for spatial indexes are refreshed only when the table is rebuilt. Also, there is a documented bug:

OPTIMIZE TABLE does not sort R-tree indexes, such as spatial indexes on POINT columns. (Bug #23578)
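
For reference, since a rebuild is what refreshes those statistics, a minimal sketch (the table name is hypothetical) is to force a rebuild with a null ALTER:

mysql> ALTER TABLE geo_data ENGINE=InnoDB;  -- rebuilds the table and its indexes, refreshing statistics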

Q: I have installed MySQL 8 in Ubuntu 22.04 and moved my datadir to another location. I adjusted my datadir in my.cnf file, but my database is not starting. What could it be?

ChatGPT: If you have moved the MySQL data directory to another location and updated the datadir parameter in the my.cnf configuration file, but MySQL is still not starting, there might be other configuration parameters that need to be updated.

[…]

The answer here is generic. However, if I modify my question to make it more accurate:

Q: I have installed MySQL 8 in Ubuntu 22.04 and moved my datadir to another location. I adjusted my datadir in my.cnf file, but my database is not starting. The datadir is correct. What could it be?

Then I get a more plausible answer:

ChatGPT: If the data directory is correct and there are no errors in the MySQL error log file, it’s possible that the file permissions are not set correctly for the new data directory.[…]
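
To illustrate the permissions angle, here is a hedged sketch of the usual checks on Ubuntu (the new datadir path /data/mysql is hypothetical); besides file ownership, AppArmor also confines mysqld to the default datadir and needs an alias for the new location:

shell> sudo chown -R mysql:mysql /data/mysql
shell> echo 'alias /var/lib/mysql/ -> /data/mysql/,' | sudo tee -a /etc/apparmor.d/tunables/alias
shell> sudo systemctl restart apparmor
shell> sudo systemctl start mysql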

Q: I have the following iostat sample:

2020_10_06_14_22_25-iostat:Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
2020_10_06_14_21_55-iostat:sda               2.32     0.84  386.08 1004.05  5570.21  7514.67    18.83     4.45    12.05    0.79    1.14   0.49  8.16
2020_10_06_14_21_55-iostat:sda               0.00     0.00  113.00 1535.00  1808.00 13727.50    18.85     1.61    10.98    0.93    0.98   0.42  8.40
2020_10_06_14_21_55-iostat:sda               0.00     1.00   79.00 1494.00  1264.00  7203.00    10.77     2.04    20.66    0.65    0.67   0.45  1.30
2020_10_06_14_21_55-iostat:sda               0.00     0.00  132.00 1500.00  2112.00 14508.50    20.37     5.87    30.15    0.68    1.19   0.47  6.10

Q: Am I having I/O issues?

ChatGPT: To determine if you are having I/O issues, you need to analyze the %util column. This column shows the percentage of time that the disk was busy during the sampling period. A high %util value indicates that the disk is busy for a significant portion of the time, which could indicate an I/O bottleneck. In your sample, the %util values range from 1.30 to 8.40, which are relatively low values and may not necessarily indicate I/O issues.

The answer could be better. It does not consider the queue depth or latency in the sample, which could indicate a disk with issues. Basically, ChatGPT is drawing its conclusion from the %util column alone.

Q: I have a server with 2 NUMA cores. Which setting can I use in my Percona Server to have better memory usage?

ChatGPT: If you have a server with 2 NUMA cores, you can use the numa-interleave option in Percona Server to improve memory usage.

[…]

This answer applies only to older versions of Percona Server. Nowadays, you can use innodb_numa_interleave. I found it relevant to include here because, as we can see, the data set used to feed the AI needs to be refreshed constantly to provide relevant answers.
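
For reference, a minimal my.cnf sketch using the current variable (assuming a NUMA-capable host and a recent Percona Server for MySQL or MySQL build) would be:

[mysqld]
innodb_numa_interleave = ON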

Conclusion

The answers provided by ChatGPT are generally in the right direction, but there is too much generic information that does not help with specific issues. Also, the data set used by ChatGPT is very important because we might receive outdated or misleading information. We cannot take the answers blindly, and the need to verify them requires prior knowledge of the area.

Still, ChatGPT can provide a good starting point, especially for those starting with MySQL and Linux.

Many may say that I need to provide precise questions to obtain my desired answer. That is correct. However, when working with our customers, most of the time during an emergency, the only information they have is that the application is slow or not connecting to the database. So, in the real world, asking a precise question is difficult if you are not a database expert. Knowing the right question to ask is half of the path to finding the right answer. And to finish this post, I asked ChatGPT one last question:

Q: Is ChatGPT going to replace a MySQL DBA job?

Answer: No, ChatGPT is an AI language model developed by OpenAI and is not designed to replace a MySQL DBA job. While ChatGPT can provide some level of assistance and guidance for MySQL-related questions, it cannot replace the knowledge, expertise, and decision-making abilities of a human MySQL DBA. A MySQL DBA is responsible for maintaining, securing, and optimizing MySQL database systems, which involves a range of complex tasks and decision-making processes that require human expertise and experience.

So, as for now, we are safe 😀 !

Jokes and fears aside, I don’t believe technology is designed to replace us but rather to help us be more effective and dedicate our time to tasks that really bring value to our customers. I’m positive and excited to see what is coming in the next few years! And if ChatGPT does not give you a proper answer, feel free to contact us on social media or our forum.


May
01
2023
--

Save Money in AWS RDS: Don’t Trust the Defaults

aws rds

Default settings can help you get started quickly, but they can also cost you performance and leave you with a higher cloud bill at the end of the month. Want to save money on your AWS RDS bill? I’ll show you some MySQL settings to tune for better performance and cost savings with AWS RDS.

Recently I was engaged in a MySQL Performance Audit for a customer to help troubleshoot performance issues that led to downtime during periods of high traffic on their AWS RDS MySQL instances. During heavy loads, they would see messages about their InnoDB settings in the error logs:

[Note] InnoDB: page_cleaner: 1000ms intended loop took 4460ms. The settings might not be optimal. (flushed=140, during the time.)

This message is normally a side effect of a storage subsystem that is not capable of keeping up with the number of writes (e.g., IOPS) required by MySQL. It is the storage saying, “Hey MySQL, try to write less. I can’t keep up,” which is a common situation when innodb_io_capacity_max is set too high.

After receiving these messages for some time, they would eventually hit performance issues to the point that the server became unresponsive for a few minutes. After that, things went back to normal.

Let’s look at the problem and try to gather some context information.

Investigating AWS RDS performance issues

We had a db.m5.8xlarge instance type (32vCPU – 128GB of RAM) with a gp2 storage of 5TB, which should provide up to 10000 IOPS (this is the maximum capacity allowed by gp2), running MySQL 5.7. This is a pretty decent setup, and I don’t see many customers needing to write this many sustained IOPS.

The innodb_io_capacity_max parameter was set to 2000, so the hardware should be able to deliver that many IOPS without major issues. However, gp2 suffers from a tricky way of calculating credits and usage that may drive erroneous conclusions about the real capacity of the storage. Reviewing the CloudWatch graphics, we only had roughly 8-9k IOPS (reads and writes) used during spikes.

(CloudWatch graphs: AWS RDS MySQL read and write IOPS during the load spikes)

While the IO utilization was quite high, there should be some room to get more IOPS, but we were still seeing errors. What caught my attention was the self-healing condition shown by MySQL after a few minutes.

The usual solution, which was actually discussed during our kick-off call, was, “Well, there is always the chance to move to Provisioned IOPS, but that is quite expensive.” Yes, this is true: io2 volumes are expensive, and honestly, I think they should be used only where really high IO capacity at predictable latencies is required, and this didn’t seem to be the case.

Otherwise, most environments can adapt to gp2/gp3 volumes; for that, you need to provision a big enough volume to get enough IOPS.

Finding the “smoking gun” with pt-mysql-summary

Not too long ago, my colleague Yves Trudeau and I worked on a series of posts debating how to configure an instance for write-intensive workloads. A quick look at the pt-mysql-summary output, taken outside of the busy load period, shows something really interesting:

# InnoDB #####################################################
                  Version | 5.7.38
         Buffer Pool Size | 93.0G
         Buffer Pool Fill | 100%
        Buffer Pool Dirty | 1%
           File Per Table | ON
                Page Size | 16k
            Log File Size | 2 * 128.0M = 256.0M
          Log Buffer Size | 8M
             Flush Method | O_DIRECT
      Flush Log At Commit | 1
               XA Support | ON
                Checksums | ON
              Doublewrite | ON
          R/W I/O Threads | 4 4
             I/O Capacity | 200
       Thread Concurrency | 0
      Concurrency Tickets | 5000
       Commit Concurrency | 0
      Txn Isolation Level | REPEATABLE-READ
        Adaptive Flushing | ON
      Adaptive Checkpoint | 
           Checkpoint Age | 78M
             InnoDB Queue | 0 queries inside InnoDB, 0 queries in queue

 

Wait, what? 256M of redo logs and a Checkpoint Age of only 78M? That is quite conservative, considering a 93GB buffer pool size. We would expect much bigger redo logs for such a big buffer pool. Bingo! We have a smoking gun here.

Additionally, full ACID durability was enabled, that is, innodb_flush_log_at_trx_commit=1 and sync_binlog=1, which adds a lot of write overhead to every operation because, during the commit stage, everything is flushed to disk (or to gp2 in this case).

Considering a spike of load running a lot of writing queries, hitting the max checkpoint age in this setup is a very likely situation.

Basically, MySQL will perform flushing operations at a certain rate depending on several factors. This rate is normally close to innodb_io_capacity (200 by default); if the amount of writes starts to approach the max checkpoint age, then the adaptive flushing algorithm will push flushing up toward innodb_io_capacity_max (2000 by default) to try to keep the free space in the redo logs far from the max checkpoint age limit.

If we keep pushing, we can eventually reach the max checkpoint age, which drives the system into a synchronous state, meaning that furious flushing will happen beyond innodb_io_capacity_max and all write operations will be paused (freezing writes) until there is free room in the redo logs to keep writing.
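
As a rough way to see how close a server is getting to that limit, the checkpoint age can be read from the LOG section of SHOW ENGINE INNODB STATUS:

mysql> SHOW ENGINE INNODB STATUS\G
-- in the LOG section, the checkpoint age is (Log sequence number - Last checkpoint at);
-- compare it against the total redo capacity (innodb_log_file_size * innodb_log_files_in_group)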

This was exactly what was happening on this server. We calculated roughly how many writes were being performed per hour, and then we recommended increasing the size of redo log files to 2x2GB each (4GB total). In practical terms, it was 3.7G due to some rounding that RDS does, so we got:

# InnoDB #####################################################
                  Version | 5.7.38
         Buffer Pool Size | 92.0G
         Buffer Pool Fill | 100%
        Buffer Pool Dirty | 2%
           File Per Table | ON
                Page Size | 16k
            Log File Size | 2 * 1.9G = 3.7G
          Log Buffer Size | 8M
             Flush Method | O_DIRECT

 

Then we also increased innodb_io_capacity_max to 4000, giving the adaptive flushing algorithm more room to increase writes. The results in CloudWatch show we were right:

 

(CloudWatch graph: AWS RDS MySQL IOPS after the configuration change)

The reduction over the last couple of weeks is more than 50% of IOPS, which is pretty decent, and we haven’t changed the hardware at all. Actually, it was even possible to reduce the storage size to 3TB and avoid moving to expensive io2 (Provisioned IOPS) storage.

Conclusions

RDS normally works very well out of the box; most of the configuration is properly set for the type of instance provisioned. Still, I find it odd that the RDS default redo log size is this small, and people using a fully managed solution would expect not to have to worry about such common tuning.

MySQL 8.0 implemented innodb_dedicated_server, which auto-sizes innodb_log_file_size and innodb_log_files_in_group (now replaced by innodb_redo_log_capacity) as a function of the InnoDB buffer pool size using a pretty simple, but effective, algorithm, and I guess it shouldn’t be hard for the AWS team to implement it. We’ve done some research, and it seems RDS is not pushing this logic into the 8.0 versions, which makes such a small default for innodb_redo_log_capacity seem strange.

In the meantime, reviewing how RDS MySQL is configured with default parameters is something we should all do to avoid the typical “throw more hardware at it” solution and, by extension, spending more money.

Percona Consultants have decades of experience solving complex database performance issues and design challenges. They’ll work with you to understand your goals and objectives and provide the best, unbiased solutions for your database environment.

 

Learn more about Percona Consulting

 

A personalized Percona Database Performance Audit will help uncover potential performance killers in your current configuration.

 

Get your personalized audit

Apr
24
2023
--

How to Persist a Hashed Format Password Inside ProxySQL

persist a hashed password ProxySQL

In this blog post, we will see how to persist passwords inside the ProxySQL mysql_users table in hashed format only. We will also see how easily passwords that were stored in clear text can be converted into the hashed format.

Here we are highlighting one scenario from work on a client environment where we noticed that the ProxySQL mysql_users table had more than 100 user entries, but some of them were inserted with clear text passwords, whereas others were properly inserted as hashed entries.

Before explaining the simple commands that were used to quickly convert those clear text entries into hashed ones, let’s review some more information about the ProxySQL mysql_users table and the password formats.

Password formats inside ProxySQL

ProxySQL is capable of storing passwords in two different formats within the mysql_users.password field, whether in-memory or on-disk. These formats include plain text and hashed passwords.

Plain text passwords are extremely vulnerable to unauthorized access, as anyone with access to the database or configuration files can easily read them. While storing these files in a secure location can mitigate some security concerns, there is still a risk of data breaches. Hashed passwords, on the other hand, are stored in the same format as passwords in the MySQL server’s “mysql.user.password” (before MySQL 8.0 version) or “mysql.user.authentication_string” column (since MySQL 8.0 version using the mysql_native_password plugin), providing an added layer of security.

In ProxySQL, any password that begins with an asterisk (*) is considered to be a hashed password.

The Admin interface of ProxySQL lacks a PASSWORD() function. Therefore, any passwords stored within ProxySQL are preserved in the format in which they were originally inserted. This format may either be plain text or a hashed value.

Note: In general, ProxySQL doesn’t support users created with the caching_sha2_password plugin when the corresponding mysql.user.authentication_string value is stored inside the mysql_users.password column. There is a workaround for such accounts, which is to insert the clear text password into the ProxySQL mysql_users.password column, but keeping clear text password entries inside ProxySQL is not recommended by security best practices. Hence, we could say that ProxySQL-to-MySQL communication works best with users that are created with the mysql_native_password plugin inside the database.

 

For more details, please check the blog post ProxySQL Support for MySQL caching_sha2_password and the official ProxySQL documentation, Information about MySQL 8.0 – ProxySQL.

So, to explain this scenario, here we created four different test DB user accounts inside the database with the mysql_native_password plugin.

From a database node: 

Username | Password (in clear text)
test1    | test1
test2    | test2
test3    | test3
test4    | test4
mysql [localhost:8028] {msandbox} ((none)) > select user,host,authentication_string,plugin from mysql.user where user like 'test%'\G
*************************** 1. row ***************************
                 user: test1
                 host: localhost
authentication_string: *06C0BF5B64ECE2F648B5F048A71903906BA08E5C
               plugin: mysql_native_password
*************************** 2. row ***************************
                 user: test2
                 host: localhost
authentication_string: *7CEB3FDE5F7A9C4CE5FBE610D7D8EDA62EBE5F4E
               plugin: mysql_native_password
*************************** 3. row ***************************
                 user: test3
                 host: localhost
authentication_string: *F357E78CABAD76FD3F1018EF85D78499B6ACC431
               plugin: mysql_native_password
*************************** 4. row ***************************
                 user: test4
                 host: localhost
authentication_string: *D159BBDA31273BE3F4F00715B4A439925C6A0F2D
               plugin: mysql_native_password
4 rows in set (0.00 sec)

From ProxySQL: 

Here we will insert the user accounts into the mysql_users tables in mixed clear text format as well as in hash format.

ProxySQL_Admin> INSERT INTO mysql_users(username,password) VALUES ('test1','test1'), ('test2','*7CEB3FDE5F7A9C4CE5FBE610D7D8EDA62EBE5F4E'),('test3','test3'), ('test4','*D159BBDA31273BE3F4F00715B4A439925C6A0F2D');
Query OK, 4 rows affected (0.00 sec)


ProxySQL_Admin> select distinct username,password from mysql_users where username like 'test%';
+----------+-------------------------------------------+
| username | password                                  |
+----------+-------------------------------------------+
| test1    | test1                                     |
| test2    | *7CEB3FDE5F7A9C4CE5FBE610D7D8EDA62EBE5F4E |
| test3    | test3                                     |
| test4    | *D159BBDA31273BE3F4F00715B4A439925C6A0F2D |
+----------+-------------------------------------------+
4 rows in set (0.00 sec)


ProxySQL_Admin> LOAD MYSQL USERS TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)


ProxySQL_Admin> SAVE MYSQL USERS TO DISK;
Query OK, 0 rows affected (0.01 sec)


ProxySQL_Admin> select distinct username,password from mysql_users where username like 'test%';
+----------+-------------------------------------------+
| username | password                                  |
+----------+-------------------------------------------+
| test1    | test1                                     |
| test2    | *7CEB3FDE5F7A9C4CE5FBE610D7D8EDA62EBE5F4E |
| test3    | test3                                     |
| test4    | *D159BBDA31273BE3F4F00715B4A439925C6A0F2D |
+----------+-------------------------------------------+
4 rows in set (0.00 sec)

Below are some other ProxySQL configurations used to verify that the passwords in the mysql_users table work for establishing connections to the MySQL database.

ProxySQL_Admin> select hostgroup_id,hostname,port,status from mysql_servers;
+--------------+-----------+------+--------+
| hostgroup_id | hostname  | port | status |
+--------------+-----------+------+--------+
| 10           | localhost | 8028 | ONLINE |
+--------------+-----------+------+--------+
1 row in set (0.00 sec)


ProxySQL_Admin> select rule_id,active,proxy_port,match_digest,destination_hostgroup,retries,apply from mysql_query_rules\G
*************************** 1. row ***************************
              rule_id: 1048
               active: 1
           proxy_port: 6033
         match_digest: ^SELECT.*FOR UPDATE
destination_hostgroup: 10
              retries: 3
                apply: 1
*************************** 2. row ***************************
              rule_id: 1050
               active: 1
           proxy_port: 6033
         match_digest: ^SELECT.*$
destination_hostgroup: 10
              retries: 3
                apply: 1
2 rows in set (0.00 sec)

Let’s check the database connectivity via ProxySQL using these DB user accounts.

for i in {1..4}; do mysql -h 127.0.0.1 -utest$i -ptest$i -P6033 -e"select current_user(),version();";done
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----------------+-----------+
| current_user()  | version() |
+-----------------+-----------+
| test1@localhost | 8.0.28    |
+-----------------+-----------+
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----------------+-----------+
| current_user()  | version() |
+-----------------+-----------+
| test2@localhost | 8.0.28    |
+-----------------+-----------+
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----------------+-----------+
| current_user()  | version() |
+-----------------+-----------+
| test3@localhost | 8.0.28    |
+-----------------+-----------+
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----------------+-----------+
| current_user()  | version() |
+-----------------+-----------+
| test4@localhost | 8.0.28    |
+-----------------+-----------+

As seen above, the main problem is that our mysql_users table has plain text passwords visible for some user accounts (test1 and test3), which we don’t want to keep in clear text format. Instead, all password entries should be stored in the hashed format.

One way to fix this problem is to drop the clear text password entries for those user accounts, generate the password hashes from the MySQL database prompt (for example, with the PASSWORD() function on versions that still provide it), and then insert the hashed values into the mysql_users table.
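
As a hedged sketch of that manual approach (reusing the test1 account from this example), the mysql_native_password hash can also be computed directly, which is handy on MySQL 8.0, where the PASSWORD() function no longer exists:

mysql> SELECT CONCAT('*', UPPER(SHA1(UNHEX(SHA1('test1'))))) AS hash;
+-------------------------------------------+
| hash                                      |
+-------------------------------------------+
| *06C0BF5B64ECE2F648B5F048A71903906BA08E5C |
+-------------------------------------------+

ProxySQL_Admin> UPDATE mysql_users SET password='*06C0BF5B64ECE2F648B5F048A71903906BA08E5C' WHERE username='test1';
ProxySQL_Admin> LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK;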

But as mentioned earlier, when the mysql_users table has a lot of entries (>100), fixing those passwords manually can be a tedious job.

Note: Here it is assumed we are not using the percona-scheduler-admin client, which has the feature to sync your user accounts directly with the database nodes in the ProxySQL mysql_users table.

For this case, let’s move to the next section, where we will see how the admin-hash_passwords variable helps us solve this problem and persist only hashed password entries inside ProxySQL’s mysql_users table.

ProxySQL’s admin-hash_passwords variable

ProxySQL version 1.2.3 has included a new global boolean variable called admin-hash_passwords, which is enabled by default to support hashed passwords. If admin-hash_passwords=true, passwords will be automatically hashed during runtime when executing the LOAD MYSQL USERS TO RUNTIME command. However, passwords stored in the mysql_users tables will not be automatically hashed.

Nevertheless, it is possible to hash these passwords both in-memory and on-disk by copying users back from RUNTIME with SAVE MYSQL USERS FROM RUNTIME after executing LOAD MYSQL USERS TO RUNTIME, and then saving the updated information with SAVE MYSQL USERS TO DISK.

Let’s persist the hashed password inside ProxySQL

ProxySQL_Admin> select @@admin-hash_passwords;
+------------------------+
| @@admin-hash_passwords |
+------------------------+
| true                   |
+------------------------+
1 row in set (0.00 sec)


ProxySQL_Admin> select distinct username,password from mysql_users where username like 'test%';
+----------+-------------------------------------------+
| username | password                                  |
+----------+-------------------------------------------+
| test1    | test1                                     |
| test2    | *7CEB3FDE5F7A9C4CE5FBE610D7D8EDA62EBE5F4E |
| test3    | test3                                     |
| test4    | *D159BBDA31273BE3F4F00715B4A439925C6A0F2D |
+----------+-------------------------------------------+
4 rows in set (0.00 sec)


ProxySQL_Admin> LOAD MYSQL USERS TO RUNTIME;
Query OK, 0 rows affected (0.01 sec)


ProxySQL_Admin> select distinct username,password from mysql_users where username like 'test%';
+----------+-------------------------------------------+
| username | password                                  |
+----------+-------------------------------------------+
| test1    | test1                                     |
| test2    | *7CEB3FDE5F7A9C4CE5FBE610D7D8EDA62EBE5F4E |
| test3    | test3                                     |
| test4    | *D159BBDA31273BE3F4F00715B4A439925C6A0F2D |
+----------+-------------------------------------------+
4 rows in set (0.00 sec)

Currently, passwords are hashed at RUNTIME, but they are not hashed on the mysql_users table. To hash them inside the mysql_users table as well, we need to run the SAVE MYSQL USERS FROM RUNTIME command.

ProxySQL_Admin> SAVE MYSQL USERS FROM RUNTIME;
Query OK, 0 rows affected (0.00 sec)


ProxySQL_Admin> select distinct username,password from mysql_users where username like 'test%';
+----------+-------------------------------------------+
| username | password                                  |
+----------+-------------------------------------------+
| test1    | *06C0BF5B64ECE2F648B5F048A71903906BA08E5C |
| test2    | *7CEB3FDE5F7A9C4CE5FBE610D7D8EDA62EBE5F4E |
| test3    | *F357E78CABAD76FD3F1018EF85D78499B6ACC431 |
| test4    | *D159BBDA31273BE3F4F00715B4A439925C6A0F2D |
+----------+-------------------------------------------+
4 rows in set (0.00 sec)

The command SAVE MYSQL USERS TO DISK can now be used to store/persist the hashed passwords on the disk.

ProxySQL_Admin> SAVE MYSQL USERS TO DISK;
Query OK, 0 rows affected (0.01 sec)


ProxySQL_Admin> select distinct username,password from mysql_users where username like 'test%';
+----------+-------------------------------------------+
| username | password                                  |
+----------+-------------------------------------------+
| test1    | *06C0BF5B64ECE2F648B5F048A71903906BA08E5C |
| test2    | *7CEB3FDE5F7A9C4CE5FBE610D7D8EDA62EBE5F4E |
| test3    | *F357E78CABAD76FD3F1018EF85D78499B6ACC431 |
| test4    | *D159BBDA31273BE3F4F00715B4A439925C6A0F2D |
+----------+-------------------------------------------+
4 rows in set (0.00 sec)

Let’s verify the database connectivity via ProxySQL using these DB user accounts.

for i in {1..4}; do mysql -h 127.0.0.1 -utest$i -ptest$i -P6033 -e"select current_user(),version();";done
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----------------+-----------+
| current_user()  | version() |
+-----------------+-----------+
| test1@localhost | 8.0.28    |
+-----------------+-----------+
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----------------+-----------+
| current_user()  | version() |
+-----------------+-----------+
| test2@localhost | 8.0.28    |
+-----------------+-----------+
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----------------+-----------+
| current_user()  | version() |
+-----------------+-----------+
| test3@localhost | 8.0.28    |
+-----------------+-----------+
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----------------+-----------+
| current_user()  | version() |
+-----------------+-----------+
| test4@localhost | 8.0.28    |
+-----------------+-----------+

Conclusion

Using the admin-hash_passwords feature can be extremely beneficial when there are mixed-format password entries in the mysql_users table. By saving the RUNTIME entries, which consist of hashed passwords, to disk and persisting only these entries in the mysql_users table of ProxySQL, we can easily simplify the management of hashed password entries. Furthermore, to ensure that only hashed password entries are stored within ProxySQL, it is imperative to create database user accounts using the mysql_native_password plugin.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

 

Try Percona Distribution for MySQL today!

Apr
21
2023
--

Fixing Errant GTID With Orchestrator: The Easy Way Out

Fixing Errant GTID With Orchestrator

In this article, we will discuss errant transactions/GTIDs and how we can solve them with the Orchestrator tool.

Orchestrator is a MySQL high availability and replication management tool that runs as a service and provides command line access, HTTP API, and Web interface. I will not go into the details of the Orchestrator but will explore one of the features that can help us solve the errant GTID in a replication topology.

What are errant transactions?

Simply stated, they are transactions executed directly on a replica. Thus they only exist on a specific replica. This could result from a mistake (the application wrote to a replica instead of writing to the source) or by design (you need additional tables for reports).

What problem can errant transactions cause?

The major problem shows up during a planned change in a MySQL replication topology: the errant transaction is not present in the source’s binlogs, so if the replica holding it is promoted, the other servers will request a transaction that cannot be sent over, which causes a replication error.
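
For background, the same check Orchestrator automates can be done by hand with the GTID functions: subtract the source’s gtid_executed set from the replica’s, and anything left over is errant. A minimal sketch, run on the replica (the placeholder is the gtid_executed value taken from the source):

mysql> SELECT GTID_SUBTRACT(@@GLOBAL.gtid_executed, '<source_gtid_executed>') AS errant_transactions;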

So let’s jump into generating and fixing an errant transaction. Below is my current topology:

[root@monitor ~]# orchestrator-client -c topology-tabulated -alias testcluster | tr '|' '\t'
192.168.56.10:3306  0s ok 5.7.41-44-log rw ROW GTID
+ 192.168.56.20:3306 0s ok 5.7.41-44-log ro ROW GTID
+ 192.168.56.30:3306 0s ok 5.7.41-44-log ro ROW GTID

Now let’s make some changes on any of the replicas, which will generate an errant transaction. On 192.168.56.20:3306, I created a test database:

mysql> create database test;
Query OK, 1 row affected (0.00 sec)

This will result in an errant transaction, so let’s see how the Orchestrator will show the topology.

[root@monitor ~]# orchestrator-client -c topology-tabulated -alias testcluster | tr '|' '\t'
192.168.56.10:3306   0s ok 5.7.41-44-log  rw ROW   GTID
+ 192.168.56.20:3306 0s ok 5.7.41-44-log  ro ROW   GTID:errant
+ 192.168.56.30:3306 0s ok 5.7.41-44-log  ro ROW   GTID

Now you can see we have an errant transaction; we can check in more detail by using the Orchestrator API as below:

[root@monitor ~]# orchestrator-client -c which-gtid-errant -i 192.168.56.20:3306
a71a855a-dcdc-11ed-99d7-080027e6334b:1

To know which binlogs have this errant transaction, you check with the below command:

[root@monitor ~]# orchestrator-client -c locate-gtid-errant -i 192.168.56.20:3306
mysqlbinlog.000001

Checking the binlogs is very important: we should know what changes were made on the replica, and we can inspect that binlog for the specific GTID.

We can get the output from the replication analysis, and you can use this API feature in your custom code if you want to monitor the topology for errant transactions:

[root@monitor ~]# orchestrator-client -c api -path replication-analysis | jq . | grep -A2 -B2 "StructureAnalysis"
      "Analysis": "NoProblem",
      "Description": "",
      "StructureAnalysis": [
        "ErrantGTIDStructureWarning"

There is a more detailed way to compare the ExecutedGtidSet and GtidErrant on the whole topology. So let me show you below:

[root@monitor ~]# sudo orchestrator-client -c api -path cluster/testcluster | jq -C '.[] | {Port: .Key.Port, Hostname: .Key.Hostname,ServerUUID: .ServerUUID, ExecutedGtidSet: .ExecutedGtidSet, GtidErrant:.GtidErrant}'
{
  "Port": 3306,
  "Hostname": "192.168.56.10",
  "ServerUUID": "3b678bc9-dcdc-11ed-b9fc-080027e6334b",
  "ExecutedGtidSet": "3b678bc9-dcdc-11ed-b9fc-080027e6334b:1-10",
  "GtidErrant": ""
}
{
  "Port": 3306,
  "Hostname": "192.168.56.20",
  "ServerUUID": "a71a855a-dcdc-11ed-99d7-080027e6334b",
  "ExecutedGtidSet": "3b678bc9-dcdc-11ed-b9fc-080027e6334b:1-10,na71a855a-dcdc-11ed-99d7-080027e6334b:1",
  "GtidErrant": "a71a855a-dcdc-11ed-99d7-080027e6334b:1"
}
{
  "Port": 3306,
  "Hostname": "192.168.56.30",
  "ServerUUID": "ea6c6af9-dcdc-11ed-9e09-080027e6334b",
  "ExecutedGtidSet": "3b678bc9-dcdc-11ed-b9fc-080027e6334b:1-10",
  "GtidErrant": ""
}

Now that we know about the issue, let’s fix it with the Orchestrator.

The first way to fix it is to inject an empty transaction, which can be done as below:

[root@monitor ~]# orchestrator-client -c gtid-errant-inject-empty -i 192.168.56.20:3306
192.168.56.20:3306

[root@monitor ~]# orchestrator-client -c topology-tabulated -alias testcluster | tr '|' '\t'
192.168.56.10:3306   0s ok 5.7.41-44-log  rw ROW   GTID
+ 192.168.56.20:3306 0s ok 5.7.41-44-log  ro ROW   GTID
+ 192.168.56.30:3306 0s ok 5.7.41-44-log  ro ROW   GTID

The gtid-errant-inject-empty command injects empty transactions to reconcile Global Transaction Identifiers (GTIDs) across a MySQL replication topology. GTIDs are a way to uniquely identify transactions in a MySQL cluster, and ensuring their consistency is critical for maintaining data integrity.

When injecting an empty transaction, the Orchestrator injects it at the top of the topology, it replicates down to the bottom, and that GTID is ignored by the replica server, which already has it. Now you can see that the gtid_executed set has changed, and it contains the GTID with the UUID from the replica 192.168.56.20:3306.

[root@monitor ~]# sudo orchestrator-client -c api -path cluster/testcluster | jq -C '.[] | {Port: .Key.Port, Hostname: .Key.Hostname,ServerUUID: .ServerUUID, ExecutedGtidSet: .ExecutedGtidSet, GtidErrant: .GtidErrant}'
{
  "Port": 3306,
  "Hostname": "192.168.56.10",
  "ServerUUID": "3b678bc9-dcdc-11ed-b9fc-080027e6334b",
  "ExecutedGtidSet": "3b678bc9-dcdc-11ed-b9fc-080027e6334b:1-10,na71a855a-dcdc-11ed-99d7-080027e6334b:1",
  "GtidErrant": ""
}
{
  "Port": 3306,
  "Hostname": "192.168.56.20",
  "ServerUUID": "a71a855a-dcdc-11ed-99d7-080027e6334b",
  "ExecutedGtidSet": "3b678bc9-dcdc-11ed-b9fc-080027e6334b:1-10,na71a855a-dcdc-11ed-99d7-080027e6334b:1",
  "GtidErrant": ""
}
{
  "Port": 3306,
  "Hostname": "192.168.56.30",
  "ServerUUID": "ea6c6af9-dcdc-11ed-9e09-080027e6334b",
  "ExecutedGtidSet": "3b678bc9-dcdc-11ed-b9fc-080027e6334b:1-10,na71a855a-dcdc-11ed-99d7-080027e6334b:1",
  "GtidErrant": ""
}
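
For completeness, what the empty-transaction fix amounts to under the hood is roughly the following, executed by hand on the source (a minimal sketch using the errant GTID from this example):

mysql> SET gtid_next='a71a855a-dcdc-11ed-99d7-080027e6334b:1';
mysql> BEGIN; COMMIT;
mysql> SET gtid_next='AUTOMATIC';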

Another, DANGEROUS, way to fix this is to reset the master.

Orchestrator has a command gtid-errant-reset-master, applied on an instance:

This command “fixes” errant GTID transactions via RESET MASTER; SET GLOBAL gtid_purged…

This command is, of course, destructive to the server’s binary logs. If binary logs are assumed to enable incremental restore, then this command is dangerous.

So an example to fix an errant transaction is:

[root@monitor ~]# orchestrator-client -c topology-tabulated -alias testcluster | tr '|' '\t'
192.168.56.10:3306   0s ok 5.7.41-44-log  rw ROW   GTID
+ 192.168.56.20:3306 0s ok 5.7.41-44-log  ro ROW   GTID:errant
+ 192.168.56.30:3306 0s ok 5.7.41-44-log  ro ROW   GTID

[root@monitor ~]# orchestrator-client -c which-gtid-errant -i 192.168.56.20:3306
a71a855a-dcdc-11ed-99d7-080027e6334b:2

This is how it looks:

{
  "Port": 3306,
  "Hostname": "192.168.56.20",
  "ServerUUID": "a71a855a-dcdc-11ed-99d7-080027e6334b",
  "ExecutedGtidSet": "3b678bc9-dcdc-11ed-b9fc-080027e6334b:1-10,na71a855a-dcdc-11ed-99d7-080027e6334b:1-2",
  "GtidErrant": "a71a855a-dcdc-11ed-99d7-080027e6334b:2"
}

Let’s reset the master.

[root@monitor ~]# orchestrator-client -c gtid-errant-reset-master -i 192.168.56.20:3306
192.168.56.20:3306

Now you can see that the ExecutedGtidSet is synced with the Source ExecutedGtidSet.

[root@monitor ~]# sudo orchestrator-client -c api -path cluster/testcluster | jq -C '.[] | {Port: .Key.Port, Hostname: .Key.Hostname,ServerUUID: .ServerUUID, ExecutedGtidSet: .ExecutedGtidSet, GtidErrant: .GtidErrant}'
{
  "Port": 3306,
  "Hostname": "192.168.56.10",
  "ServerUUID": "3b678bc9-dcdc-11ed-b9fc-080027e6334b",
  "ExecutedGtidSet": "3b678bc9-dcdc-11ed-b9fc-080027e6334b:1-10,na71a855a-dcdc-11ed-99d7-080027e6334b:1",
  "GtidErrant": ""
}
{
  "Port": 3306,
  "Hostname": "192.168.56.20",
  "ServerUUID": "a71a855a-dcdc-11ed-99d7-080027e6334b",
  "ExecutedGtidSet": "3b678bc9-dcdc-11ed-b9fc-080027e6334b:1-10,na71a855a-dcdc-11ed-99d7-080027e6334b:1",
  "GtidErrant": ""
}
{
  "Port": 3306,
  "Hostname": "192.168.56.30",
  "ServerUUID": "ea6c6af9-dcdc-11ed-9e09-080027e6334b",
  "ExecutedGtidSet": "3b678bc9-dcdc-11ed-b9fc-080027e6334b:1-10,na71a855a-dcdc-11ed-99d7-080027e6334b:1",
  "GtidErrant": ""
}

But this option is risky because the command actually purges the binlogs, and if any app is tailing the logs or if the binary logs are assumed to enable incremental restore, then this command is dangerous and not recommended. It’s better to use gtid-errant-inject-empty; if you still want to use gtid-errant-reset-master on a busy replica, stop the replication first, make sure to wait two or three minutes, and then run gtid-errant-reset-master.

Conclusion

If you want to switch to GTID-based replication, make sure to check for errant transactions before any planned or unplanned replication topology change. And specifically, be careful if you use a tool that reconfigures replication for you. It is always recommended to use pt-table-checksum and pt-table-sync if you ever get into a situation where changes were made directly on a replica.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

 

Try Percona Distribution for MySQL today!

Apr
20
2023
--

Upgrading to MySQL 8: Tools That Can Help

upgrading to MySQL 8

As we approach the end of life for MySQL 5.7 later this year, many businesses are currently working towards upgrading to MySQL 8. Such major version upgrades are rarely simple, but thankfully there are tools that can help smooth the process and ensure a successful upgrade.

It should be noted that while the technical aspects of the upgrade process are beyond the scope of this blog post, it is crucial to create a testing environment to verify the upgrade before proceeding to upgrade your production servers, particularly with MySQL 8. 

As there is no procedure for downgrading from MySQL 8 other than restoring a backup, testing and validation are more critical than for previous major version upgrades. 

With that disclaimer out of the way, let’s examine some of the tools that are available to simplify the upgrade process.

Percona Toolkit

Percona Toolkit is a collection of advanced open source command-line tools developed and used by the Percona technical staff that are engineered to perform a variety of MySQL server and system tasks that are too difficult or complex to perform manually. 

Several of these tools can help with upgrade planning, making the entire process much easier and less prone to downtime or issues. 

    • pt-upgrade
      • The pt-upgrade tool helps you run application SELECT queries and generates reports on how each query pattern performs on servers running different versions of MySQL (a minimal invocation sketch follows this list).
    • pt-query-digest
      • As best practice dictates gathering and testing all application queries by activating the slow log for a period of time, most companies will end up with an enormous amount of slow log data. The pt-query-digest tool can assist in query digest preparation for your upgrade testing. 
    • pt-config-diff
      • The pt-config-diff tool helps determine the differences in MySQL settings between files and server variables. This allows a comparison of the upgraded version to the previous version, allowing validation of configuration differences. 
    • pt-show-grants
      • The pt-show-grants tool extracts, orders, and then prints grants for MySQL user accounts. This can help to export and back up your MySQL grants before an upgrade or allow you to easily replicate users from one server to another by simply extracting the grants from the first server and piping the output directly into another server.
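
As referenced in the pt-upgrade item above, a minimal invocation sketch (host names and log file path are illustrative) compares how the captured queries behave on a 5.7 server versus an 8.0 server:

shell> pt-upgrade slow.log h=mysql57-host h=mysql80-host > pt-upgrade-report.txt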

Early last year, my colleague Arunjith Aravindan wrote a great blog post covering these Percona tools while also detailing the pt-upgrade process itself. You can read his post here: Percona Utilities That Make Major MySQL Version Upgrades Easier.

For more information or to download the Percona Toolkit, please visit: https://docs.percona.com/percona-toolkit/

MySQL Shell Upgrade Checker

In recent years, MySQL has continued to evolve and introduce new features to enhance its capabilities. One of the more relevant additions to the MySQL toolset is the MySQL Shell Upgrade Checker, which is a built-in upgrade tool that enables users to check whether their MySQL instances are compatible with a specific version before upgrading.

The MySQL Shell Upgrade Checker is designed to help users avoid potential issues that may arise during the upgrade process. This tool works by analyzing the structure and content of the existing database and then comparing it to the requirements of the new version. In doing so, it can detect any potential issues, such as deprecated syntax or incompatible data types, and provide guidance on resolving them.

The MySQL 8 Shell Upgrade checker will check for compatibility of the dataset, looking for things such as:

    • Use of old temporal types
    • Use of database objects which conflict with reserved words
    • Removed system variables

All in all, the MySQL Shell Upgrade Checker contains around 21 total checks and will produce a detailed report of any errors, warnings, or notices.
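
A quick, hedged example of invoking it from the command line (connection details and target version are illustrative):

shell> mysqlsh -- util check-for-server-upgrade root@localhost:3306 --target-version=8.0.33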

MySQL 8 & MySQL Shell Upgrade Checker vs. pt-upgrade

Don’t confuse the MySQL Shell Upgrade Checker Utility with the pt-upgrade tool since they are used for very different kinds of major version upgrade testing. 

The MySQL Upgrade Checker Utility performs a variety of tests on a selected MySQL server to ascertain whether the upgrade will be successful. This tool, however, does NOT confirm whether the upgrade will be compatible with the application queries or routines, which is where pt-upgrade would come in. 

For more information on the MySQL 8 Shell Upgrade Checker, please refer to the following:

https://dev.mysql.com/doc/mysql-shell/8.0/en/mysql-shell-utilities-upgrade.html

Conclusion

As always, with anything database related, one of the most important tasks to complete prior to any upgrade is to test extensively. This limits the risk of unforeseen circumstances and allows time to address any unfavorable results before the upgrade. Downgrading a major version is not very feasible once writes are sent to the new MySQL 8 source, so proper planning beforehand is critical.

The tools listed in this post can assist with this planning and help negate any nasty surprises before they become a problem. Percona is always here to help as well, so if you’d like to discuss our Support or Professional Service options, please feel free to contact us!

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

 

Try Percona Distribution for MySQL today!

Apr
19
2023
--

MySQL 8.0.33 – A Quick Peek

mysql

MySQL released version 8.0.33 on April 18th, and it includes some attention-catching features. This blog post is a quick review of the release notes, looking for the exciting items; comments in italics are solely my own.

User-defined collations are now deprecated and will be removed in a future version. This is probably not a show-stopper for most, but probably a scary situation for those dependent on them, as there may not be an alternative. Hopefully, UTF8MB4 is good enough.

The Performance Schema now has a Server Telemetry Traces service. This interface provides plugins and components with a way to retrieve notifications related to the lifetime of SQL statements. We are directed to the Server telemetry traces service section in the MySQL Source Code documentation.

The SSL library has been updated to OpenSSL version 1.1.1t. 

Inclusive language updates take time

More cleanup: the terms “master,” “slave,” and “MTS” are now replaced in error messages relating to MySQL Replication by “source,” “replica,” and “MTA,” respectively. Many folks incorrectly thought this would be a simple search-and-replace operation, but it takes a while to get this work done. This work started in 2020 and hopefully will be finished soon.

Increased mysqlbinlog values

Finally, mysqlbinlog --start-position now accepts values up to 18446744073709551615, unless the --read-from-remote-server or --read-from-remote-source option is also used, in which case the maximum is 4294967295. Thank goodness; how many of you also got stuck when trying to start at 18446744073709551614?

Trying to use a generated column to set a default value? Using a generated column with DEFAULT(col_name) to specify the default value for a named column is no longer permitted and now emits an error message.
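
To illustrate (a hypothetical table definition), something like the following, where column c takes its default from the generated column b, is now rejected with an error:

CREATE TABLE t1 (
  a INT,
  b INT GENERATED ALWAYS AS (a + 1),
  c INT DEFAULT (b)  -- DEFAULT referencing the generated column b now emits an error
);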

Bugs closed in MySQL 8.0.33

There are 141 bug fixes included.  Thank you, contributors and MySQL Engineering.

Did you notice a difference in query plans between 5.7 and 8.0? Well, here is an interesting bug explanation:

When the MySQL 5.7 Optimizer has two choices for an index to filter rows, one primary and one secondary, it picks a range scan on the secondary index because it uses more key parts. MySQL 8.0 did not use this logic, instead choosing the primary index to filter rows with WHERE clause filtering. Primary key use is not suitable in such cases due to the presence of LIMIT and due to the nature of data distribution. The secondary index was not considered while resolving order by due to constant elimination. This resulted in much different query plans in MySQL 5.7 and MySQL 8.0 for the same query.

We solve this issue in MySQL 8.0 by skipping the constant key parts of the index during order-by evaluation only if the query is constant-optimized, which can be done at this time, but not during LIMIT analysis. (Bug #34291261)

Conclusion

This is an interesting release just because of the sheer volume of over one hundred and forty bug fixes.  Is there anything ‘earth shaking’ in ’33?  Not for most people. Those with questions about poor query plan generation between 5.7 and 8.0 may want to see if the above fix resolves those issues.

Now, instead of upgrading to MySQL Community Server 8.0.33, you might want to look at Percona Server for MySQL. You get enterprise features like data encryption, data masking, RocksDB, and auditing, and you get them for free.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

 

Try Percona Distribution for MySQL today!
