Jun
13
2024
--

Sakila, Where Are You Going?

At Percona, we monitor our users’ needs and try to accommodate them. One aspect we monitor is the MySQL version distribution/utilization. Observing that, we identified a very interesting trend: the lack of migration from 5.7 to 8.x, or better yet, the need of many to remain on 5.7.  That observation has triggered several actions from […]

Feb
01
2024
--

Simplify the Use of ENV Variables in Percona Monitoring and Management AMI

The Percona Monitoring and Management (PMM) Amazon Machine Image (AMI) currently lacks native support for ENV variables. In this guide, we’ll walk through a straightforward workaround that simplifies the process of using ENV variables in PMM AMI and reapplying them after an upgrade. Step one: Adding ENV variables to /srv/.env. Begin by consolidating your ENV variables in […]

Jan
05
2024
--

Configuring Keyring for Encryption Using AWS Key Management Service in Percona Server for MySQL

The AWS KMS component is now available in Percona Server for MySQL starting from version 8.0.30. This addition enables data-at-rest encryption by utilizing the AWS KMS component, providing the functionality to create and manage cryptographic keys across AWS services. How do we set up encryption using AWS KMS? You should only load a keyring component with a […]

May
01
2023
--

Save Money in AWS RDS: Don’t Trust the Defaults

Default settings can help you get started quickly – but they can also cost you performance and leave you with a higher cloud bill at the end of the month. Want to save money on your AWS RDS bill? I’ll show you some MySQL settings to tune for better performance and cost savings with AWS RDS.

Recently I was engaged in a MySQL Performance Audit for a customer to help troubleshoot performance issues that led to downtime during periods of high traffic on their AWS RDS MySQL instances. During heavy loads, they would see messages about their InnoDB settings in the error logs:

[Note] InnoDB: page_cleaner: 1000ms intended loop took 4460ms. The settings might not be optimal. (flushed=140, during the time.)

This message is normally a side effect of a storage subsystem that cannot keep up with the number of writes (IOPS) required by MySQL. In effect, the storage is saying, “Hey MySQL, try to write less. I can’t keep up,” which is a common situation when innodb_io_capacity_max is set too high.
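
To see what InnoDB is currently allowed to flush on a given instance, a quick check could look like this (a minimal sketch; it assumes the connection details are stored in a local ./my.cnf, as in the command examples later in this archive):

# Inspect the flushing-related settings and the redo log configuration
# (sketch; assumes credentials in ./my.cnf)
mysql --defaults-file=./my.cnf -e "SHOW GLOBAL VARIABLES LIKE 'innodb_io_capacity%'; SHOW GLOBAL VARIABLES LIKE 'innodb_log_file%';"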

After receiving these messages for some time, they would eventually hit performance issues to the point that the server became unresponsive for a few minutes. After that, things went back to normal.

Let’s look at the problem and try to gather some context information.

Investigating AWS RDS performance issues

We had a db.m5.8xlarge instance (32 vCPU, 128GB of RAM) with 5TB of gp2 storage, which should provide up to 10,000 IOPS (the maximum allowed by gp2), running MySQL 5.7. This is a pretty decent setup, and I don’t see many customers needing to write this many sustained IOPS.

The innodb_io_capacity_max parameter was set to 2000, so the hardware should be able to deliver that many IOPS without major issues. However, gp2 has a tricky way of calculating credits and usage that can lead to erroneous conclusions about the real capacity of the storage. Reviewing the CloudWatch graphs, we only saw roughly 8-9k IOPS (reads and writes) used during spikes.

(CloudWatch graphs: read and write IOPS during the load spikes)

While the IO utilization was quite high, there should be some room to get more IOPS, but we were still seeing errors. What caught my attention was the self-healing condition shown by MySQL after a few minutes.

The usual suggestion, which actually came up during our kick-off call, was: “Well, there is always the option of moving to Provisioned IOPS, but that is quite expensive.” Yes, this is true: io2 volumes are expensive, and honestly, I think they should be used only where really high IO capacity with predictable latencies is required, and this didn’t seem to be the case.

Most environments can adapt to gp2/gp3 volumes; you just need to provision a volume big enough to get the IOPS you need.

Finding the “smoking gun” with pt-mysql-summary

Not too long ago, my colleague Yves Trudeau and I worked on a series of posts debating how to configure an instance for write-intensive workloads. A quick look at the pt-mysql-summary output, taken outside the busy load period, showed something really interesting:

# InnoDB #####################################################
                  Version | 5.7.38
         Buffer Pool Size | 93.0G
         Buffer Pool Fill | 100%
        Buffer Pool Dirty | 1%
           File Per Table | ON
                Page Size | 16k
            Log File Size | 2 * 128.0M = 256.0M
          Log Buffer Size | 8M
             Flush Method | O_DIRECT
      Flush Log At Commit | 1
               XA Support | ON
                Checksums | ON
              Doublewrite | ON
          R/W I/O Threads | 4 4
             I/O Capacity | 200
       Thread Concurrency | 0
      Concurrency Tickets | 5000
       Commit Concurrency | 0
      Txn Isolation Level | REPEATABLE-READ
        Adaptive Flushing | ON
      Adaptive Checkpoint | 
           Checkpoint Age | 78M
             InnoDB Queue | 0 queries inside InnoDB, 0 queries in queue

Wait, what? 256M of redo logs and a Checkpoint Age of only 78M? That is quite conservative considering the 93GB buffer pool size; we would expect much bigger redo logs for such a large buffer pool. Bingo! We have a smoking gun here.

Additionally, full ACID durability was enabled, that is, innodb_flush_log_at_trx_commit=1 and sync_binlog=1, which adds a lot of write overhead to every operation because, during the commit stage, everything is flushed to disk (or to gp2 in this case).

With a spike of load running a lot of write queries, hitting the max checkpoint age in this setup is very likely.

Basically, MySQL performs flushing operations at a rate that depends on several factors. This rate is normally close to innodb_io_capacity (200 by default); if the amount of redo starts to approach the max checkpoint age, the adaptive flushing algorithm pushes the rate up toward innodb_io_capacity_max (2000 by default) to try to keep the free space in the redo logs far from the max checkpoint age limit.

If we keep pushing, we can eventually reach the max checkpoint age, which drives the system into a synchronous state: furious flushing beyond innodb_io_capacity_max, with all write operations paused (freezing writes) until there is enough free room in the redo logs to keep writing.
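
A common rule of thumb for sizing the redo logs is to measure how much redo InnoDB writes over a representative period (roughly an hour) and size the files to hold that amount. A minimal sketch, assuming the connection details live in a local ./my.cnf:

# Measure redo bytes written over one hour (sketch; shorten the interval
# and extrapolate if a full hour is impractical)
B1=$(mysql --defaults-file=./my.cnf -N -e "SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';" | awk '{print $2}')
sleep 3600
B2=$(mysql --defaults-file=./my.cnf -N -e "SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';" | awk '{print $2}')
echo "Redo written per hour: $(( (B2 - B1) / 1024 / 1024 )) MB"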

This was exactly what was happening on this server. We calculated roughly how many writes were being performed per hour and recommended increasing the redo log files to 2 x 2GB (4GB total). In practical terms, it ended up being 3.7GB due to some rounding that RDS does, so we got:

# InnoDB #####################################################
                  Version | 5.7.38
         Buffer Pool Size | 92.0G
         Buffer Pool Fill | 100%
        Buffer Pool Dirty | 2%
           File Per Table | ON
                Page Size | 16k
            Log File Size | 2 * 1.9G = 3.7G
          Log Buffer Size | 8M
             Flush Method | O_DIRECT

We also increased innodb_io_capacity_max to 4000, giving the adaptive flushing algorithm more room to increase writes. The results in CloudWatch show we were right:

(CloudWatch graph: IOPS after the configuration changes)

The reduction over the last couple of weeks is more than 50% of IOPS, which is pretty decent, and we haven’t changed the hardware at all. It was actually possible to reduce the storage size to 3TB and avoid moving to expensive io2 (provisioned IOPS) storage.
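
For reference, on RDS these settings are typically applied through a custom DB parameter group rather than SET GLOBAL. A hedged sketch with the AWS CLI (the parameter group name is hypothetical; innodb_log_file_size is static and only takes effect after a reboot, while innodb_io_capacity_max is dynamic):

# Sketch: "custom-mysql57" is a hypothetical parameter group already attached to the instance
aws rds modify-db-parameter-group \
  --db-parameter-group-name custom-mysql57 \
  --parameters "ParameterName=innodb_log_file_size,ParameterValue=2147483648,ApplyMethod=pending-reboot" \
               "ParameterName=innodb_io_capacity_max,ParameterValue=4000,ApplyMethod=immediate"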

Conclusions

RDS normally works very well out of the box; most of the configuration is properly set for the type of instance provisioned. Still, keeping the redo logs this small by default seems silly, and people using a fully managed solution would expect not to have to worry about this kind of common tuning.

MySQL 8.0 introduced innodb_dedicated_server, which automatically sizes innodb_log_file_size and innodb_log_files_in_group (now replaced by innodb_redo_log_capacity) as a function of the InnoDB buffer pool size, using a pretty simple but effective algorithm, and it shouldn’t be hard for the AWS team to implement something similar. From our research, it seems RDS is not applying this logic to the 8.0 versions either, which makes such a small default for innodb_redo_log_capacity seem strange.

In the meantime, checking how RDS MySQL is configured with the default parameters is something we should all do to avoid the typical “throw more hardware at it” solution and, by extension, spending more money.
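
As a starting point, the engine defaults can be reviewed straight from the AWS CLI (a sketch; adjust the parameter group name to the one attached to your instance):

# List the defaults for the parameters discussed in this post
aws rds describe-db-parameters \
  --db-parameter-group-name default.mysql5.7 \
  --query "Parameters[?ParameterName=='innodb_log_file_size' || ParameterName=='innodb_io_capacity_max'].[ParameterName,ParameterValue,Source]" \
  --output table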

Percona Consultants have decades of experience solving complex database performance issues and design challenges. They’ll work with you to understand your goals and objectives and provide the best, unbiased solutions for your database environment.

Learn more about Percona Consulting

A personalized Percona Database Performance Audit will help uncover potential performance killers in your current configuration.

Get your personalized audit

Apr
28
2023
--

Add More Security to Your Percona Server for MongoDB With AWS IAM integration!

Did you notice that Percona Server for MongoDB 6.0.5-4 was released just a few days ago? This time around, it introduced improvements to the way we handle master key rotation for data at rest encryption as well as AWS IAM integration.

One key to rule them all — improvements to master key rotation

With the improvements introduced in Percona Server for MongoDB 6.0.5-4, one key path can be used for all servers in a clustered environment. This allows us to use one vault key namespace for all nodes in a deployment while at the same time preserving key versions and allowing each node to perform key rotation without impact to the other nodes.

Changes introduced with Percona Server for MongoDB 6.0.5-4 now allow using the same key for all the members of a replica set if the user chooses so, without impact on functionality.

Why should you care about AWS IAM integration?

With all the systems users need to access daily, password management becomes a more pressing issue. The introduction of IAM systems has become somewhat of a security standard in large enterprises.

Our users approached us about integration with AWS IAM, commonly used in their organizations. It’s an integration missing from MongoDB Community Edition (CE) that is important for compliance with enterprise security policies of many companies. Integration with AWS IAM allows:

(Image: MongoDB AWS IAM integration)

To set up this integration, follow the steps outlined in our documentation, and configure either the user or the role authentication. This will allow AWS Security Token Service (STS) to play the part of Identity Provider (IDP) in a SAML 2.0-based federation.
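
Once the IAM user or role is mapped in MongoDB, connecting with IAM credentials through the standard MONGODB-AWS mechanism looks roughly like this (a sketch, not taken from the Percona documentation; the host is a placeholder, and temporary credentials would also require a session token):

# Authenticate against $external using AWS IAM credentials (sketch; host is a placeholder)
mongosh "mongodb://psmdb.example.com:27017/?authSource=\$external&authMechanism=MONGODB-AWS" \
  --username "$AWS_ACCESS_KEY_ID" \
  --password "$AWS_SECRET_ACCESS_KEY"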

Your feedback matters!

We take pride in being open to feedback in Percona. Please do not hesitate to contact us via the community forums or this contact form.

What’s next

We are looking into the problems affecting large datastores that are a pain point for our users. Please let us know if there are any particular issues you are struggling with in MongoDB; we are always open to suggestions!

Learn more about Percona Server for MongoDB

Mar
22
2023
--

Deploy Percona Monitoring and Management on Amazon EKS With eksctl and Helm

One of the installation methods we support for our database software is through Helm. We have a collection of Helm charts in this repository for several Percona software products.

Through this blog post, you will learn how to install Percona Monitoring and Management in a Kubernetes cluster on Amazon EKS using Helm and eksctl.

Requirements

For the installation of PMM with Helm, you will need:

  • Kubernetes 1.22+
  • Helm 3.2.0+
  • PV (Persistent Volume) provisioner support in the underlying infrastructure

If you want to install PMM in a Kubernetes cluster on Amazon EKS, you can use eksctl to create the required cluster. I published this blog post Creating a Kubernetes cluster on Amazon EKS with eksctl, in the Percona Community blog, where I explain how to use this tool.

For an easy way to deploy the Kubernetes cluster, check Percona My Database as a Service (MyDBaaS), a Percona Labs project. I also recommend checking this article Percona Labs Presents: Infrastructure Generator for Percona Database as a Service (DBaaS) on our blog, where the process of creating the cluster is described.

MyDBaaS will help you with cluster creation. It will generate the configuration file needed for eksctl, or it can deploy the cluster to AWS.

To use eksctl you must:

  • Install kubectl
  • Create a user with minimum IAM policies
  • Create an access key ID and secret access key for the user previously created
  • Install AWS CLI and configure authentication (aws configure, from the command line)
  • Install eksctl
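
A quick way to sanity-check these prerequisites before creating the cluster (a minimal sketch):

aws configure                 # set the access key, secret key, and default region
aws sts get-caller-identity   # confirm the CLI is authenticated as the expected user
kubectl version --client
eksctl version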

Create a Kubernetes cluster

To create the cluster, you need to generate the configuration file for eksctl. Go to https://mydbaas.labs.percona.com/ and fill out the details of your cluster.

Figure 1: MyDBaaS – Kubernetes Cluster Configuration

  1. Give your cluster a name. MyDbaas is the default value.
  2. Select the number of nodes. The default is three, but you can create a cluster of up to five nodes.
  3. Enter the instance type.
  4. Select the region. The default is us-west-2.

If you don’t know what instance type to use, go to the Instance Type Selector and select:

  • Number of CPUs
  • Memory size
  • Region

Figure 2: MyDBaaS – Instance Type Selector

As stated on the website, by default this tool only returns the configuration file needed for eksctl, but you can also provide your AWS credentials and have it deploy the EKS cluster for you.

After filling out the details, click on Submit and you will get the configuration file that will look like this:

addons:
- name: aws-ebs-csi-driver
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: MyDbaas
  region: us-east-1
nodeGroups:
- desiredCapacity: 2
  iam:
    withAddonPolicies:
      ebs: true
      efs: true
      fsx: true
  instanceType: m5.large
  maxSize: 5
  minSize: 1
  name: ng-1
  preBootstrapCommands:
  - echo 'OPTIONS="--default-ulimit nofile=1048576:1048576"' >> /etc/sysconfig/docker
  - systemctl restart docker
  volumeSize: 100
  volumeType: gp2

Then, create the cluster by running the following command:

eksctl create cluster -f cluster.yaml

While running, eksctl will create the cluster and all the necessary resources. It will take a few minutes to complete.

Figure 3: eksctl Running

Cluster credentials can be found in ~/.kube/config. Try kubectl get nodes to verify that this file is valid, as suggested by eksctl.

Install PMM with Helm

Once the cluster has been created and configured, you can install PMM using Helm.

  1. Add the repository
helm repo add percona https://percona.github.io/percona-helm-charts/

  2. Install PMM
helm install pmm --set service.type="LoadBalancer" percona/pmm

Once the PMM server is running, you must get its external address. Run the following command to get this value:

kubectl get services monitoring-service

You will get an output similar to the following.

NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP                                                              PORT(S)                      AGE
monitoring-service   LoadBalancer   10.100.17.121   a57d50410ca2f4c9d9b029da8f44f73f-254463297.us-east-1.elb.amazonaws.com   443:30252/TCP,80:31201/TCP   100s

The value under EXTERNAL-IP (a57d50410ca2f4c9d9b029da8f44f73f-254463297.us-east-1.elb.amazonaws.com) is the address you need to access PMM from the browser.

Before accessing the PMM dashboard, get the admin password:

export ADMIN_PASS=$(kubectl get secret pmm-secret --namespace default -o jsonpath='{.data.PMM_ADMIN_PASSWORD}' | base64 --decode)
echo $ADMIN_PASS

The value of $ADMIN_PASS is the password you need to log in to the dashboard. The default user is admin.

Go to the browser and paste the external address into the address bar.

Figure 4: PMM running on Kubernetes

Now you have PMM running in the cloud on Amazon EKS.

I recommend you check this article Percona Monitoring and Management in Kubernetes is now in Tech Preview on our blog for more information about PMM in Kubernetes using Helm.

Conclusion

Through this blog post, you learned how to create a Kubernetes cluster with eksctl and deploy PMM using Helm with the help of Percona MyDBaaS. The process would be the same for any of the Percona software in the collection of Helm charts.

Learn more about Percona MyDBaaS!

Feb
13
2023
--

Differing MySQL Client Versions Causing Broken Replication and Collations in Aurora

Recently, I was working with my colleagues Edwin Wang and Taras Onishchuk and found an interesting edge case: replication broke on a replica running Percona Server for MySQL 5.7, external to an AWS Aurora instance version 2.10.2 (5.7-compatible). I recreated the issue in my lab with a simple CREATE DATABASE statement, as you will see below.

Error 'Character set '#255' is not a compiled character set and is not specified in the '/usr/share/percona-server/charsets/Index.xml' file' on query. Default database: 'lab'. Query: 'create database test'

The interesting thing to note here is character set ‘#255’. You won’t find it if you check the list of available collations for the utf8mb4 character set in Percona Server for MySQL 5.7.

mysql> SHOW COLLATION WHERE Charset = 'utf8mb4' and id = 255; 
Empty set (0.01 sec)

But you will see this is available in Aurora.

mysql> SHOW COLLATION WHERE Charset = 'utf8mb4' and id = 255;
+--------------------+---------+-----+---------+----------+---------+
| Collation          | Charset | Id  | Default | Compiled | Sortlen |
+--------------------+---------+-----+---------+----------+---------+
| utf8mb4_0900_ai_ci | utf8mb4 | 255 |         | Yes      |       0 |
+--------------------+---------+-----+---------+----------+---------+
1 row in set (0.04 sec)

So we know the problem is that the CREATE DATABASE command was processed using the utf8mb4_0900_ai_ci character set/collation, which was recorded in the binary log on the Aurora instance. When the statement was executed on the external replica, it failed because this collation is not compiled into the Percona Server for MySQL 5.7 binary.

The first question to ask is this: why was utf8mb4_0900_ai_ci used at all when, as you can see above, utf8mb4_general_ci is the default collation on both instances? Something on the Aurora instance had to change the collation to utf8mb4_0900_ai_ci before the CREATE DATABASE command ran. We were told this change was not made manually by the user, so we checked the system and discovered that the default collation behavior had been overridden.

mysql> show variables like 'col%';
+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_0900_ai_ci |
| collation_database   | latin1_swedish_ci  |
| collation_server     | latin1_swedish_ci  |
+----------------------+--------------------+
3 rows in set (0.05 sec)

This was pretty strange, considering that if you check the parameter group options for the Aurora 5.7 family, you cannot set the collation_connection variable to utf8mb4_0900_ai_ci. The only other way to set this would be via the init_connect variable, but we discovered that this was blank.

mysql> show variables like 'init_con%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| init_connect  |       |
+---------------+-------+
1 row in set (0.04 sec)

So it wasn’t set by the server, but the setting change had to come from somewhere. After extensive searching, we finally discovered that the problem was the MySQL client used to connect to Aurora. Specifically, if you connect to Aurora using the MySQL 8.0 client, it will set the collation_connection variable to utf8mb4_0900_ai_ci.

[root@ip-172-31-92-215 ~]# mysql --version
mysql  Ver 8.0.22-13 for Linux on x86_64 (Percona Server (GPL), Release 13, Revision 6f7822f)
[root@ip-172-31-92-215 ~]# mysql -h sylvester-repl-break.cluster-c5wumyjzkfnn.us-east-1.rds.amazonaws.com
...

mysql> show variables like 'col%';
+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_0900_ai_ci |
| collation_database   | latin1_swedish_ci  |
| collation_server     | latin1_swedish_ci  |
+----------------------+--------------------+
3 rows in set (0.04 sec)

But if you connect using the MySQL 5.7 client, it sets the collation_connection variable to utf8_general_ci.

[root@ip-172-31-92-215 ~]# mysql --version
mysql  Ver 14.14 Distrib 5.7.40-43, for Linux (x86_64) using  6.2
[root@ip-172-31-92-215 ~]# mysql -h sylvester-repl-break.cluster-c5wumyjzkfnn.us-east-1.rds.amazonaws.com
...
mysql> show variables like 'col%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | utf8_general_ci   |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)

This likely has to do with the fact that in MySQL 8.0, the default collation was changed to utf8mb4_0900_ai_ci.

In conclusion, if you are running a 5.7-compatible version of Aurora with a 5.7 external replica, avoid using the MySQL 8.0 client, or make sure you change the collation_connection variable to utf8_general_ci (or whatever is appropriate for your session) before executing commands on the instance.
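
If switching to a 5.7 client is not practical, the collation can also be forced from the 8.0 client session itself. A minimal sketch (the Aurora endpoint is a placeholder, and --default-character-set is simply an alternative way to get a 5.7-compatible connection collation):

# Option 1: ask the 8.0 client to negotiate utf8 instead of utf8mb4 (sketch)
mysql --default-character-set=utf8 -h aurora-cluster.example.us-east-1.rds.amazonaws.com \
  -e "SELECT @@collation_connection;"

# Option 2: override the session collation explicitly before running DDL (sketch)
mysql -h aurora-cluster.example.us-east-1.rds.amazonaws.com \
  -e "SET collation_connection = utf8_general_ci; SELECT @@collation_connection;"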

I want to thank my colleagues Taras Onishchuk and Edwin Wang for their contributions to helping solve this problem!

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Download Percona Distribution for MySQL Today

Feb
10
2023
--

Storage Autoscaling With Percona Operator for MongoDB

Previously, deploying and maintaining a database usually meant many burdensome chores and repetitive tasks to ensure proper functioning. In the cloud era, however, developers and operations engineers started fully embracing automation tools that make their job significantly easier. Of those tools, Kubernetes operators are certainly among the most prominent, as they are oftentimes used as a building block for DBaaS.

ViewBlock, a blockchain explorer, uses the Percona Operator for MongoDB to store critical data. Today along with their team, we will see how pvc-autoresizer can automate storage scaling for MongoDB clusters on Kubernetes.

Reasoning and goal

Nobody enjoys waking up in the middle of the night from disk usage alerts or, worse, a downed cluster due to a lack of free space on a volume that wasn’t properly set up to send proper warnings to its stakeholders.

Our goal is to automate storage scaling when our disk reaches a certain threshold of use and simultaneously reduce the amount of alert noise related to that. Specifically for our Operator, the volumes used by replicaset nodes are defined by their Persistent Volume Claims (PVCs), which are our targets to enable autoscaling.

Let’s go

Prerequisites

The currently supported Kubernetes versions for pvc-autoresizer are 1.23 – 1.25. In addition to pvc-autoresizer, which you can install through its Helm chart, Prometheus also needs to be running in your cluster to provide the resizer with the metrics it needs to determine whether a PVC should be scaled.

Your CSI driver needs to support Volume Expansion and NodeGetVolumeStats, and the Storage Class used by your PVCs also requires the Volume Expansion feature to be active.

In our lab we will use AWS EKS with a standard storage class.
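
To confirm that the class actually allows expansion before wiring up the autoresizer, a quick check might look like this (a sketch; gp2 is the class name assumed from the lab setup above):

# allowVolumeExpansion must be true for the resize to work (sketch)
kubectl get storageclass gp2 -o jsonpath='{.allowVolumeExpansion}{"\n"}'

# Enable it if it is not already set
kubectl patch storageclass gp2 -p '{"allowVolumeExpansion": true}'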

In action

You first need to add the following annotation to your Storage Class:

resize.topolvm.io/enabled=true
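
With kubectl, and assuming the gp2 class from the lab above, that could look like:

kubectl annotate storageclass gp2 resize.topolvm.io/enabled=true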

From that point on, your only requirement is to properly annotate the PVCs created by the Operator.

Assuming a simple 3-node Percona Server for MongoDB cluster without sharding and nothing else in your current namespace, you should be able to run the following commands to allow for automatic storage resizing of all the PVCs.

kubectl annotate pvc --all resize.topolvm.io/storage_limit="100Gi"
kubectl annotate pvc --all resize.topolvm.io/increase="20Gi"
kubectl annotate pvc --all resize.topolvm.io/threshold="20%"
kubectl annotate pvc --all resize.topolvm.io/enabled="true"

This configuration tells pvc-autoresizer that your PVCs should grow by 20Gi when only 20% of space is left on them, up to a maximum size of 100Gi, beyond which they will not increase automatically.

Note: If you use version 1.14.0 of the Operator that’s scheduled for release this month, it will be possible to annotate the PVCs directly from your CR configuration file, for example:

spec:
  replsets:
  - name: rs0
    volumeSpec:
      persistentVolumeClaim:
        annotations:
          resize.topolvm.io/storage_limit: 100Gi
          resize.topolvm.io/increase: 20Gi
          resize.topolvm.io/threshold: 20%
          resize.topolvm.io/enabled: "true"

Limitations

Manual downscaling

It is possible to increase the storage automatically, but decreasing it is a manual process requiring replacing the volumes entirely. For example, AWS EBS volumes cannot be downsized and you’d need to delete the volumes before creating new ones with a smaller size. In Change Storage Class on Kubernetes on the Fly, we described how to change the storage class on the fly — a similar process will apply to decrease the size of the volumes.

Percentage threshold

As our increase thresholds are specified in percentages, the space available upon resize will inherently grow along with the size of the volume. When dealing with “big” disks, say 1TB, it is advised to set a low threshold to avoid triggering a rescale when a lot of space is still available.

Scaling quotas

Cloud providers, for example, AWS, have scaling quotas for volumes. For EBS, you can resize a volume once every six hours. If your data ingestion exceeds the increase amount you set and happens in less than that time, the resizing will fail. It is essential to consider your ingest rate and disk growth so that you can set autoresizer configurations that suit your needs. Failure to do so might result in unwanted alerts and the need to transfer data to new PVCs, which you can learn about in Percona Operator Volume Expansion Without Downtime.

Statefulset and custom resource synchronization

When you provision new nodes or recreate one, their PVC will get bootstrapped with the volume request it gets from the immutable StatefulSet created at the initialization of your cluster, not with its latest size. You will need to set the appropriate annotations for those new PVCs to ensure that pvc-autoresizer scales them enough to allow for the replication to have enough space to proceed and that it doesn’t need to scale more than what your cloud provider permits. It is generally recommended to make sure the storage specs in StatefulSets and Custom Resources are in sync with real volumes.

Conclusion

And there you have it, a Percona Operator for MongoDB configuration that automatically scales its storage based on your needs!

It is still advised to have, at the very least, a minimal alerting setup in case you’re close to hitting the storage limit specified in the annotations since pvc-autoresizer requires it to be set.

ViewBlock is a blockchain-agnostic explorer allowing anyone to inspect blocks, transactions, address history, advanced statistics & much more.

Percona Operator for MongoDB automates deployment and management of replica sets and sharded clusters on Kubernetes.

Apr
22
2022
--

Zero Impact on Index Creation with Amazon Aurora 3

In the last quarter of 2021, AWS released Aurora version 3. This new version aligns Aurora with the latest MySQL 8 version, porting many of the advantages MySQL 8 has over previous versions.

While this brings a lot of interesting new features to Aurora, what we are going to cover here is how DDLs behave when using the ONLINE option, with a quick comparison against standard MySQL 8 and Group Replication.

Tests

All tests were run on an Aurora r6g.large instance with a secondary availability zone. The test was composed of four connections:

    • #1 to perform the DDL
    • #2 to insert data into the table I am altering
    • #3 to insert data into a different table
    • #4 to check the operations on the other node

In the Aurora instance, a sysbench schema with 10 tables and five million rows was created, just to generate a bit of traffic. The test table, also with five million rows, was:

CREATE TABLE `windmills_test` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `uuid` char(36) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `millid` smallint NOT NULL,
  `kwatts_s` int NOT NULL,
  `date` date NOT NULL,
  `location` varchar(50) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `active` tinyint NOT NULL DEFAULT '1',
  `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `strrecordtype` char(3) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`id`),
  KEY `IDX_millid` (`millid`,`active`),
  KEY `IDX_active` (`id`,`active`),
  KEY `kuuid_x` (`uuid`),
  KEY `millid_x` (`millid`),
  KEY `active_x` (`active`),
  KEY `idx_1` (`uuid`,`active`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8mb3 COLLATE=utf8_bin ROW_FORMAT=DYNAMIC
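
For the background sysbench traffic mentioned above, the schema can be prepared with something like the following (a sketch; the endpoint, credentials, and schema name are placeholders, and the standard oltp_read_write script is assumed):

# Create 10 tables of five million rows to generate background load (sketch)
sysbench oltp_read_write \
  --mysql-host=<aurora-endpoint> --mysql-user=<user> --mysql-password=<password> \
  --mysql-db=sbtest --tables=10 --table-size=5000000 \
  prepare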

The executed commands:

Connection 1:
    ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=NONE;
    ALTER TABLE windmills_test drop INDEX idx_1, ALGORITHM=INPLACE;
    
Connection 2:
 while [ 1 = 1 ];do da=$(date +'%s.%3N');mysql --defaults-file=./my.cnf -D windmills_large -e "insert into windmills_test  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmills4 limit 1;" -e "select count(*) from windmills_large.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done

Connection 3:
 while [ 1 = 1 ];do da=$(date +'%s.%3N');mysql --defaults-file=./my.cnf -D windmills_large -e "insert into windmills3  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmills4 limit 1;" -e "select count(*) from windmills_large.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done

Connection 4:
     while [ 1 = 1 ];do echo "$(date +'%T.%3N')";mysql --defaults-file=./my.cnf -h <secondary aurora instance> -D windmills_large -e "show full processlist;"|egrep -i -e "(windmills_test|windmills_large)"|grep -i -v localhost;sleep 1;done

Operations:
1) Start the inserts from connections 2 and 3
2) Start the check command in connection 4 on the other node
3) Execute: DC1-1(root@localhost) [windmills_large]>ALTER TABLE windmills_test ADD INDEX idx_1 (uuid,active), ALGORITHM=INPLACE, LOCK=NONE;

With this, what I wanted to capture is the impact of a common operation such as creating an index. My expectation is to have no impact when performing operations declared “ONLINE”, such as creating an index, as well as data consistency between the nodes.

Let us see what happened…

Results

While the inserts were running against the same table, I performed the alter:

mysql>  ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=NONE;
Query OK, 0 rows affected (16.51 sec)
Records: 0  Duplicates: 0  Warnings: 0

It did NOT block operations on the same table or on any other table in the Aurora instance.

We can only identify a minimal performance impact:

[root@ip-10-0-0-11 tmp]# while [ 1 = 1 ];do da=$(date +'%s.%3N');mysql --defaults-file=./my.cnf -D windmills_large -e "insert into windmills_test  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmills4 limit 1;" -e "select count(*) from windmills_large.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done
.347
.283
.278
.297
.291
.317
.686  <-- start
<Snip>
.512  <-- end
.278
.284
.279

The secondary node was not affected at all, because Aurora manages data replication at the storage level. There is no apply-from-relay-log step as there is with standard MySQL asynchronous replication or with Group Replication.

The result is that in Aurora 3, we can have zero-impact index creation (or any other ONLINE/INSTANT operation), and that includes the data replicated to the other instances for high availability.

If we compare this with Group replication (see blog):

                                               GR            Aurora 3
Time on hold for insert on the altered table   ~0.217 sec    ~0 sec
Time on hold for insert on another table       ~0.211 sec    ~0 sec

However, keep in mind that MySQL with Group Replication will still need to apply the data on the Secondaries. This means that if your alter was taking 10 hours to build the index, the Secondary nodes will be misaligned with the Source for approximately another 10 hours. 

With Aurora 3 or with Percona XtraDB Cluster (PXC), the changes are present on all nodes as soon as the Source has completed the operation.

What about PXC? Well, we have a different scenario:

                                               PXC (NBO)     Aurora 3
Time on hold for insert on the altered table   ~120 sec      ~0 sec
Time on hold for insert on another table       ~25 sec       ~0 sec

We will have a higher impact while doing the Alter operation, but the data will be on all nodes at the same time maintaining a high level of consistency in the cluster. 

Conclusion

Aurora is not for all uses, and not for all budgets. However, it has some very good aspects like the one we have just seen. The difference between standard MySQL and Aurora is not in the hold/lock time (i.e., the operation impact) but in the HA aspects. If my data and structure are on all my Secondaries at the same time as on the Source, I feel much more comfortable than having to wait an additional amount of time.

This is why PXC is, in that case, a better alternative if you can afford the locking time. If not, Aurora 3 is your solution; just do your math properly and be conservative with the instance resources.
