Large tables can pose challenges for many operations when working with a database. Occasionally, we may need to modify the table definition. Since standard RDS replication is not asynchronous, the typical switchover procedure is not feasible. However, the Blue/Green feature of RDS utilizes asynchronous replication, which allows us to update the table […]
Using Blue/Green Deployment For (near) Zero-Downtime Primary Key Updates in RDS MySQL
Maximizing Performance of AWS RDS for MySQL with Dedicated Log Volumes
A quick configuration change may do the trick in improving the performance of your AWS RDS for MySQL instance. Here, we will discuss a notable new feature in Amazon RDS, the Dedicated Log Volume (DLV), that has been introduced to boost database performance. While this discussion primarily targets MySQL instances, the principles are also relevant to PostgreSQL and MariaDB instances.
What is a Dedicated Log Volume (DLV)?
A Dedicated Log Volume (DLV) is a specialized storage volume designed to house database transaction logs separately from the volume containing the database tables. This separation aims to streamline transaction write logging, improving efficiency and consistency. DLVs are particularly advantageous for databases with large allocated storage, high I/O per second (IOPS) requirements, or latency-sensitive workloads.
Who can benefit from DLV?
DLVs are currently supported for Provisioned IOPS (PIOPS) storage, with a fixed size of 1,000 GiB and 3,000 Provisioned IOPS. Amazon RDS extends support for DLVs across various database engines:
- MariaDB: 10.6.7 and later v10 versions
- MySQL: 8.0.28 and later v8 versions
- PostgreSQL: 13.10 and later v13 versions, 14.7 and later v14 versions, and 15.2 and later v15 versions
Cost of enabling Dedicated Log Volumes (DLV) in RDS
The documentation doesn’t say much about additional charges for the Dedicated Log Volumes, but I reached out to AWS support, who responded exactly as follows:
Please note that there are no additional costs for enabling a dedicated log volume (DLV) on Amazon RDS. By default, to enable DLV, you must be using PIOPS storage, sized at 1,000 GiB with 3,000 IOPS, and you will be priced according to the storage type.
Are DLVs effective for your RDS instance?
Implementing dedicated mounts for components such as binlogs and the datadir is a recommended standard practice. Isolating logs and data on dedicated mounts makes the setup more manageable and efficient: it facilitates optimized I/O operations, prevents potential bottlenecks, and enhances overall system performance. Adopting this practice promotes a structured and efficient storage strategy, fostering better performance, manageability, and, ultimately, a more robust database environment.
Thus, although Dedicated Log Volumes (DLVs) are new in AWS RDS, they follow a long-recommended best practice and are a welcome setup improvement for your RDS instance.
We performed a standard benchmarking test using the sysbench tool to compare the performance of a DLV instance vs a standard RDS MySQL instance, as shared in the following section.
Benchmarking AWS RDS DLV setup
Setup
| | 2 RDS Single DB instances (Regular and DLV enabled) | 1 EC2 instance (sysbench) |
|---|---|---|
| Instance type | db.m6i.2xlarge | c5.2xlarge |
| Software | MySQL 8.0.31 | CentOS 7 |
| CPU / memory | 8 cores / 32 GB | 8 cores / 16 GB |
| Data size | 32 GB | – |
– The default RDS configuration was used, with binlogs enabled and full ACID compliance settings.
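In MySQL terms, "full ACID compliance" generally means fully durable binlog and redo settings; a minimal sanity check you can run against either instance (a sketch, assuming sync_binlog = 1 and innodb_flush_log_at_trx_commit = 1 are the settings meant here):

-- Confirm binlogs are enabled and the durability settings are at their strictest
SHOW GLOBAL VARIABLES
WHERE Variable_name IN ('log_bin', 'sync_binlog', 'innodb_flush_log_at_trx_commit');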
Benchmark results for DLV-enabled instance vs. standard instance
The benchmark charts compare QPS at increasing thread counts for write-only, read-write, and read-only traffic.
Benchmarking analysis
- For both read-only and read-write traffic, there is a consistent improvement in the QPS counters as the number of threads increases.
- For write-only traffic, the QPS counters match the performance of the standard RDS instance at lower thread counts, though at higher thread counts there is a drastic improvement.
- The DLV, of course, affects the WRITE operations the most, and hence, the write-only test should be given the most consideration for the comparison of the DLV configuration vs. standard RDS.
Benchmarking outcome
Based on the sysbench benchmark results in the specified environment, it is strongly advised to employ DLV for a standard RDS instance. DLV demonstrates superior performance across most sysbench workloads, particularly showcasing notable enhancements in write-intensive scenarios.
Implementation considerations
When opting for DLVs, it’s crucial to be aware of the following considerations:
- DLV activation requires a reboot: After modifying the DLV setting for a DB instance, a reboot is mandatory for the changes to take effect.
- Recommended for larger configurations: While DLVs offer advantages across various scenarios, they are particularly recommended for database configurations of five TiB or greater. This recommendation underscores DLV’s effectiveness in handling substantial storage volumes.
- Benchmark and test: It is always recommended to test and review performance with your own application traffic rather than relying solely on standard benchmarks driven by synthetic load.
DLV in Multi-AZ deployments
Amazon RDS seamlessly integrates DLVs with Multi-AZ deployments. Whether you’re modifying an existing Multi-AZ instance or creating a new one, DLVs are automatically created for both the primary and secondary instances. This ensures that the advantages of DLV extend to enhanced availability and reliability in Multi-AZ configurations.
DLV with read replicas
DLV support extends to read replicas. If the primary DB instance has DLV enabled, all read replicas created after DLV activation inherit this feature. However, it’s important to note that read replicas created before DLV activation will not have it enabled by default. Explicit modification is required for pre-existing read replicas to leverage DLV benefits.
Conclusion
Dedicated Log Volumes have emerged as a strong option for optimizing Amazon RDS performance. By segregating transaction logs and harnessing the power of dedicated storage, DLVs contribute to enhanced efficiency and consistency. Integrating DLVs into your database strategy will support your efforts to achieve peak performance and reliability.
How Percona can help
Percona is a trusted partner for many industry-leading organizations across the globe that rely on us for help in fully utilizing their AWS RDS environment. Here’s how Percona can enhance your AWS RDS experience:
Expert configuration: RDS works well out of the box, but having Percona’s expertise ensures optimal performance. Our consultants will configure your AWS RDS instances for the best possible performance, ensuring minimal TCO.
Decades of experience: Our consultants bring decades of experience in solving complex database performance issues. They understand your goals and objectives, providing unbiased solutions for your database environment.
Blog resources: Percona experts are actively contributing to the community through knowledge sharing via forums and blogs. For example, here are two blogs on this subject:
- When Should I Use Amazon Aurora, and When Should I Use RDS MySQL?
- Save Money in AWS RDS: Don’t Trust the Defaults, for more insights into optimizing your AWS RDS setup.
Discover how our expert support, services, and enterprise-grade open source database software can make your business run better.
AWS RDS Backups: What’s the True Cost?
You have your database instance deployed with AWS, and you are using AWS RDS for MySQL. Everything works smoothly in terms of satisfying queries for your application and delivering reliable uptime and performance. Now you need to take care of your backup strategy. The business has defined the following retention policy:
- 7 daily full backups
- 4 weekly backups
- 12 monthly backups
Plus the ability to do point-in-time recovery (PITR) for the last 24 hours since the last full backup was taken.
The cloud vendor solution
This is a piece of cake. The daily backups? Just set the backup retention period to six days. Done. This also has the plus that PITR is already covered, since RDS uploads the transaction logs to S3 every five minutes and stores them in the Parquet format, making them smaller than regular text files. All amazing, right?
Now, the weekly and monthly retention policies are more tricky. But no problem since there is a backup service in AWS called…..AWS Backup. Without going into too much detail, you can create a backup plan for your RDS instance with the desired frequency that suits your retention policies, copy backups between regions for business continuity compliance, etc.
All this comes with a price tag. Let’s do some numbers using the AWS calculator, starting with the snapshots:
According to the documentation “There is no additional charge for backup storage up to 100% of your total database storage for a region.” In other words, this means that one daily snapshot is free. What about the remaining six?
For this example, I will assume the following: a single MySQL RDS instance using m6g.large (the smallest Graviton type, which is cheaper), Single-AZ, a reserved instance with a three-year term, paid upfront. Now, for the disk: a gp2 volume (cheaper than io1) with a storage capacity of 500GB, and I will assume that only half is used.
From the AWS calculator, the monthly cost of the additional backup storage (250GB x 5 days) is $118.75 USD per month, or $1,425 USD annually. But that number is not static. Your data will most likely grow. How much will it cost when my data is no longer 250GB but 400GB? The monthly storage cost for the backups could increase to $218 USD, or $2,622 USD annually.
For PITR, RDS uploads transaction logs to S3 every five minutes. It is not clear whether this is charged or not. What is clear is that in order to restore a DB to a specific point in time, one needs to create a completely new DB instance; you cannot do a PITR over the existing DB, so more costs.
If you are required to distribute your backups for disaster recovery plans or just for business continuity, copying snapshots to a different region comes with a cost that depends on the destination region. For example, copying to US-EAST-2 (Ohio) costs $0.01 per GB. Copying the most recent full daily backup (400GB in our case) will cost $4 USD, which annually means $1,460 USD.
So far we have only discussed the daily backups made by the RDS snapshot feature, and they add up to $4,082 USD yearly for only 400GB.
Moving on, we mentioned above that the weekly and monthly backups can be handled by the AWS Backup service. We are not going to dive deep into these costs, primarily because the AWS calculator does not provide a means to use it for that service. However, it is documented that the storage pricing is $0.095 per GB-month. Now, assuming the unrealistic scenario where your data size is static at 400GB, at the end of the first 12 months the cost will be $6,400 USD. Adding that to the cost of the snapshots, the grand total is:
$10,482 USD annually
And that is for just one RDS instance, with a fairly small data size (400GB), and it does not include the cost of the instance itself, which depends on the type of instance and its size. Multiply that by the number of clusters in your environment!
Now, to provide some context on how much money that is, consider that an RDS Graviton MySQL instance of size db.m6g.4xlarge costs a bit more than $6,000 a year, and a db.m6g.2xlarge around $3,000 a year. In other words: the annual cost of your backups is similar to running, for a whole year, both a server with 8 vCPUs and 32GiB of memory and a server with 16 vCPUs and 64GiB of memory.
There are additional costs that we haven’t mentioned, such as moving your backups to cold storage. Currently, the AWS Backup service doesn’t support cold storage for RDS backups, meaning that you need to take care of that yourself and pay for it (in addition to the storage cost). And if you want to store your data encrypted with your own KMS key, that comes with an additional cost as well.
Managing your backups like an expert
Percona is prepared to manage the complete lifecycle of your backups for a fraction of the cost, with additional features and capabilities: PITR at the item level (you don’t need to restore the whole backup to get just the needed table or rows), and it can be done over the same instance; encrypted backups for the same price; cold storage as part of the lifecycle of your data; and many more features.
Percona Managed Services is a solution from Percona to consider!
MySQL query digest with Performance Schema
Query analysis is a fantastic path in the pursuit to achieve high performance. It’s also probably the most repeated part of a DBA’s daily adventure. For most of us, the weapon of choice is definitely pt-query-digest, which is one of the best tools for slow query analysis out there.
Why not use pt-query-digest? Well, sometimes getting the slow log can be a challenge, such as with RDS instances or when your database is running as part of a DBaaS, which is a common practice in certain organizations.
In those cases it’s good to have an alternative. And in this case, the chosen one is the Performance Schema. We have already talked about the events_statements_* tables; however, this is the moment for the events_statements_summary_by_digest. In this table each row summarizes events for given schema/digest values (note that before MySQL 5.6.9, there is no SCHEMA_NAME column and grouping is based on DIGEST values only).
For MySQL to start aggregating information in summary tables, you must verify that the statements_digest consumer is enabled.
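For example, a quick check (and fix, if needed) looks like this:

-- Verify the statements_digest consumer is enabled
SELECT NAME, ENABLED FROM performance_schema.setup_consumers WHERE NAME = 'statements_digest';
-- Enable it if it is not
UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME = 'statements_digest';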
Now, the most straightforward way to get data is to simply query the table, like this:
SELECT SCHEMA_NAME, digest, digest_text, round(sum_timer_wait/1000000000000, 6), count_star
FROM performance_schema.events_statements_summary_by_digest
ORDER BY sum_timer_wait DESC
LIMIT 10;
This will show you the picture of volume and frequency of SQL statements in your server. As simple as that. But there are some caveats:
- Statements are normalized to a digest text. Instead of seeing a query like SELECT age FROM population WHERE id BETWEEN 153 AND 153+69 you will have the fingerprint version: SELECT age FROM population WHERE id BETWEEN ? AND ? + ?
- The events_statements_summary_by_digest table has a limited maximum number of rows (200 by default; since MySQL 5.6.5, it can be modified with the performance_schema_digests_size variable). As a consequence, when the table is full, statement digest values that do not already have a row will be added to a special “catch-all” row with DIGEST = NULL. In plain English: you won’t have meaningful info for those statements (a quick check follows below).
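To see whether you are already hitting this limit, you can compare the configured size with the contents of the catch-all row; a minimal check:

-- How many digest rows can be stored
SHOW GLOBAL VARIABLES LIKE 'performance_schema_digests_size';
-- If this returns a row with a non-zero COUNT_STAR, statements are already
-- being lumped into the catch-all bucket
SELECT COUNT_STAR, SUM_TIMER_WAIT
FROM performance_schema.events_statements_summary_by_digest
WHERE DIGEST IS NULL;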
To solve the first issue, we can use the events_statements_history table to get complete queries for all the digests. I chose not to use events_statements_current because of the short life rows have in that table. With history, there are better chances of capturing more queries in the same amount of time.
Now, when using pt-query-digest, the first step is always to collect a representative amount of data, commonly from the slow log, and then process. With Performance Schema, let’s collect a representative amount of complete queries so we can have examples for every aggregated statement.
To address the second issue, we just need to TRUNCATE the events_statements_summary_by_digest table. This way the summary will start from scratch.
Since Performance Schema is available on all the platforms supported by MySQL, I chose to run the tests on an Amazon RDS MySQL instance. The only thing I didn’t like is that P_S is disabled by default on RDS, and enabling it requires an instance reboot. Other than that, everything is the same as on a regular instance.
The steps are:
- Enable the events_statements_history consumer
- Create a MEMORY table to hold the data
- Truncate tables to have a fresh start
- Create a MySQL EVENT that will fill the table with data
- Once the event has ended, get the query digest.
The table schema is the following:
CREATE TABLE IF NOT EXISTS percona.digest_seen (
  schema_name varchar(64) DEFAULT NULL,
  digest varchar(32) DEFAULT NULL,
  sql_text varchar(1024) DEFAULT NULL,
  PRIMARY KEY USING BTREE (schema_name, digest)
) ENGINE=MEMORY;
The original SQL_TEXT field in the events_statements_history table is defined as longtext, but unless you are using Percona Server (5.5+), you won’t be able to use longtext in a MEMORY table. This is possible in Percona Server because the Improved Memory Engine permits the use of Blob and Text fields in the Memory storage engine. The workaround is to define that field as varchar(1024). Why 1024? That’s another limitation of the table: SQL_TEXT is fixed at 1024 characters. It’s only after MySQL 5.7.6 that the maximum number of bytes to display can be modified by changing the performance_schema_max_sql_text_length system variable at server startup.
Also, since we are going to use EVENTS on RDS, the “event_scheduler” variable has to be set to ON. Luckily, it is a dynamic variable, so there’s no need to reboot the instance after modifying the Parameter Group. If you are not on RDS, it’s enough to execute “SET GLOBAL event_scheduler = ON;”
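Either way, it is worth confirming that the scheduler is actually running before creating the event; for example:

-- Confirm the event scheduler is ON (it also shows up as its own process)
SHOW GLOBAL VARIABLES LIKE 'event_scheduler';
SELECT ID, USER, COMMAND, STATE FROM information_schema.PROCESSLIST WHERE USER = 'event_scheduler';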
Here is the complete list of steps:
UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME = 'events_statements_history';
SET SESSION max_heap_table_size = 1024*1024;
CREATE DATABASE IF NOT EXISTS percona;
USE percona;
CREATE TABLE IF NOT EXISTS percona.digest_seen (
  schema_name varchar(64) DEFAULT NULL,
  digest varchar(32) DEFAULT NULL,
  sql_text varchar(1024) DEFAULT NULL,
  PRIMARY KEY USING BTREE (schema_name, digest)
) ENGINE=MEMORY;
TRUNCATE TABLE percona.digest_seen;
TRUNCATE performance_schema.events_statements_summary_by_digest;
TRUNCATE performance_schema.events_statements_history;
CREATE EVENT IF NOT EXISTS getDigest
  ON SCHEDULE EVERY 1 SECOND
  ENDS CURRENT_TIMESTAMP + INTERVAL 5 MINUTE
  ON COMPLETION NOT PRESERVE
  DO
    INSERT IGNORE INTO percona.digest_seen
    SELECT CURRENT_SCHEMA, DIGEST, SQL_TEXT
    FROM performance_schema.events_statements_history
    WHERE DIGEST IS NOT NULL
    GROUP BY current_schema, digest
    LIMIT 50;
The event is defined to be run immediately, once per second, for 5 minutes. After the event is complete, it will be deleted.
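If you want to verify whether the event is still scheduled, you can check information_schema; for example:

-- The event disappears from this view once it completes (ON COMPLETION NOT PRESERVE)
SELECT EVENT_NAME, STATUS, LAST_EXECUTED, ENDS
FROM information_schema.EVENTS
WHERE EVENT_SCHEMA = 'percona' AND EVENT_NAME = 'getDigest';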
When the event is done, we are in position to get the query digest. Simply execute this query:
SELECT s.SCHEMA_NAME, s.SQL_TEXT,
       ROUND(d.SUM_TIMER_WAIT / 1000000000000, 6) AS EXECUTION_TIME,
       ROUND(d.AVG_TIMER_WAIT / 1000000000000, 6) AS AVERAGE_TIME,
       COUNT_STAR
FROM performance_schema.events_statements_summary_by_digest d
LEFT JOIN percona.digest_seen s USING (digest)
WHERE s.SCHEMA_NAME IS NOT NULL
GROUP BY s.digest
ORDER BY EXECUTION_TIME DESC
LIMIT 10;
The ORDER BY is similar to the one pt-query-digest uses by default, but it could be any ordering you want.
The output is:
| SCHEMA_NAME | SQL_TEXT | EXECUTION_TIME | AVERAGE_TIME | COUNT_STAR |
| percona | UPDATE population SET age=age+1 WHERE id=148 | 202304.145758 | 1.949487 | 103773 |
| percona | SELECT age FROM population WHERE id BETWEEN 153 AND 153+69 | 488.572609 | 0.000176 | 2771352 |
| percona | SELECT sex,age,estimate2012 FROM population WHERE id BETWEEN 174 AND 174+69 ORDER BY sex | 108.841575 | 0.000236 | 461412 |
| percona | SELECT census2010 FROM population WHERE id=153 | 62.742239 | 0.000090 | 693526 |
| percona | SELECT SUM(estimate2014) FROM population WHERE id BETWEEN 154 AND 154+69 | 44.940020 | 0.000195 | 230810 |
| percona | SELECT DISTINCT base2010 FROM population WHERE id BETWEEN 154 AND 154+69 ORDER BY base2010 | 33.909593 | 0.000294 | 115338 |
| percona | UPDATE population SET estimate2011='52906609184-39278192019-93190587310-78276160274-48170779146-66415569224-40310027367-70054020251-87998206812-01032761541' WHERE id=154 | 8.231353 | 0.000303 | 27210 |
| percona | COMMIT | 2.630153 | 0.002900 | 907 |
| percona | BEGIN | 0.705435 | 0.000031 | 23127 |
| | SELECT 1 | 0.422626 | 0.000102 | 4155 |
10 rows in set (0.10 sec)
Finally, you can do some cleanup:
DROP EVENT IF EXISTS getDigest;
DROP TABLE IF EXISTS percona.digest_seen;
SET SESSION max_heap_table_size = @@max_heap_table_size;
UPDATE performance_schema.setup_consumers SET ENABLED = 'NO' WHERE NAME = 'events_statements_history';
Summary: Performance Schema is doing the query digest already for you. It is just a matter of how to access the data in a way that suits your requirements.
RDS for Aurora unveiled at AWS re:Invent
One of the big announcements at the Amazon Web Services re:Invent 2014 conference this week was the unveiling of Aurora. The result of years of internal work, Aurora, currently in preview, is a MySQL 5.6-compatible option that “combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases” on the AWS infrastructure. The Aurora database engine will be made available through the Amazon RDS for Aurora service. This new database option is another example of the vibrant innovation coming from the MySQL ecosystem and key role that relational databases play in applications of today and tomorrow.
Integration with other AWS components
Since the Aurora relational database engine will only be made available as a service via RDS, Amazon Web Services can do some interesting new things including:
- Fully leveraging available compute, memory and networking resources
- Automatically growing the size of database volumes as storage needs grow
- Tighter integration with AWS monitoring and restore capabilities
- Moving from RDS for MySQL to RDS for Aurora will be enabled by a migration tool
Aurora performance
In the keynote, Amazon Web Services shared that Aurora delivers up to five times the performance of stock MySQL 5.6 without requiring any changes to most MySQL applications. Their FAQ reveals some insights into their testing methodology: “Amazon Aurora delivers over 500,000 SELECTs/sec and 100,000 updates/sec, five times higher than MySQL running the same benchmark on the same hardware.”
More RDS for Aurora details coming soon
We’re very excited about Amazon pushing the boundaries in relational databases for the enterprise and look forward to seeing what we, as a MySQL community, can learn from their new cloud-centric and cross-service approach.
Like many, Percona is looking forward to trying out Aurora. We’ll definitely do some benchmark testing to compare it against additional database options, such as a tuned Percona Server 5.6. We’ll also share the expert advice and ideas for improvement that we uncover during our testing of RDS for Aurora with the team at AWS so the service becomes better for all.
Logical MySQL backup tool mydumper 0.6.2 now available
We are pleased to announce the third release in the 0.6 series of mydumper, a tool for performing logical MySQL backups. In this release, we focused on simplifying compiling the code and added new features for making logical backups. These new features include enhanced AWS RDS support and extended TokuDB support.
Due to recent (or not so recent) changes in the MySQL libs, it became impossible to compile mydumper without the complete MySQL source code. To simplify this, we had to disable the binlog functionality by default, as it was the part affected by the MySQL libs changes. Now you should be able to compile against any MySQL version without issues. If you still want the binlog feature, it is still there, and you can enable it with:
cmake . -DWITH_BINLOG=ON
The one caveat is that you will need to compile against a mysql version greater than 5.5.34. It’s also possible to compile with binlog enabled against the latest 5.6 versions. In this case, you will need the source code to make some changes. You can find related information about this issue at these locations:
Download mydumper-0.6.2 source code here.
Bugs Fixed:
- #1347392 Last row of table not dumped if it brings statement over statement_size
- #1157113 Compilation of latest branch fails on CentOS 6.3 64bit
- #1326368 Can’t make against Percona-Server-devel-55 headers
- #1282862 Unknown type name ‘HASH’
- #1336860 k is used twice
- #913307 Can’t compile – missing libs crypto and ssl
- #1364393 Rows chunks doesn’t increase non innodb jobs
New MySQL Backup Features:
- --lock-all-tables: Use LOCK TABLE for all tables instead of FLUSH TABLES WITH READ LOCK. With this option you will be able to back up RDS instances and also get the binlog coordinates (5.6).
- TokuDB support: TokuDB tables are now dumped within the consistent snapshot instead of being locked like MyISAM.
- Support for dumping tables from different schemas
mydumper has two arguments, --database (-B) and --tables-list (-T), so until now you were able to run:
-B db1
or
-B db1 -T t1,t2,tn
to dump a whole database or a list of tables from one database, respectively. In 0.6.2 you can list tables from different databases in this way:
-T db1.t1,db1.t2,db2.t1,db2.t1
NOTE: You should not use -B here because mydumper will take the -T list as table names.