Sep
11
2020
--

Data Consistency for RDS for MySQL: The 8.0 Version

In a previous blog post on Data Consistency for RDS for MySQL, we presented a workaround to run pt-table-checksum on RDS instances. However, if your instance is running a MySQL 8.0.x version, there's a simpler way to check data consistency.

Starting with 8.0.1, MySQL introduced something called “Dynamic Privileges”, a way to grant more granular privileges to users instead of the almighty SUPER privilege.

So what was the issue with pt-table-checksum and RDS again? Since no RDS user can have the SUPER privilege, there was no way for the tool to change binlog_format to STATEMENT… but not anymore.

The solution when using 8.0 is to grant a privilege called SYSTEM_VARIABLES_ADMIN; a user holding it can execute “set global binlog_format = STATEMENT” without being rejected.

Hands-On

Before going through the steps: my setup is an RDS 8.0.20 primary with a read replica on the same version, and a table called “inconsistency” with, well, an inconsistency introduced.

Primary:

mysql> select * from dani.inconsistency;
+----+--------------+---------------+
| id | string_field | numeric_field |
+----+--------------+---------------+
|  1 | casa         |             1 |
|  2 | caza         |             2 |
|  3 | auto         |             3 |
|  4 | auto         |             3 |
|  5 | auto         |             4 |
|  6 | auto         |             5 |
|  7 | autos        |             5 |
|  8 | autos        |             6 |
|  9 | pepe         |             1 |
+----+--------------+---------------+
9 rows in set (0.00 sec)

Replica:

mysql> select * from dani.inconsistency;
+----+--------------+---------------+
| id | string_field | numeric_field |
+----+--------------+---------------+
|  1 | casa         |             1 |
|  2 | caza         |             2 |
|  3 | auto         |             3 |
|  4 | auto         |             3 |
|  5 | auto         |             4 |
|  6 | auto         |             5 |
|  7 | autos        |             5 |
|  8 | autos        |             6 |
|  9 | papa         |             1 |
+----+--------------+---------------+
9 rows in set (0.00 sec)

Now to the actual check. The first step is to grant the privilege. Here I have my “percona” user:

mysql> grant system_variables_admin on *.* to percona;
Query OK, 0 rows affected (0.01 sec)
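
If you want to double-check before running the tool, SHOW GRANTS should now list the new privilege (output abbreviated; your user will have other grants as well):

mysql> show grants for percona;
...
GRANT SYSTEM_VARIABLES_ADMIN ON *.* TO `percona`@`%`
...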

The second and final step is to execute pt-table-checksum. That’s it!

[root@ip-192-168-1-200 ~]# pt-table-checksum --host=dgb-pmm2.can0nprz8rtd.us-east-1.rds.amazonaws.com --user=percona --password=xxxxxxx --databases=dani --recursion-method dsn=h=dgb-pmm2.can0nprz8rtd.us-east-1.rds.amazonaws.com,D=percona,t=dsns --no-check-binlog-format --no-check-replication-filters --chunk-size=3
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
09-10T20:20:17      0      1        9          0       5       0   0.175 dani.inconsistency
[root@ip-192-168-1-200 ~]# echo $?
16

So we can see here that there is 1 DIFF reported. The exit status of 16 confirms it (16 means “at least one diff was found”).

The difference between the two instances is in the 3rd chunk, the one spanning id=7 to id=9:

mysql> SELECT * FROM percona.checksums WHERE master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc) GROUP BY db, tbl\G
*************************** 1. row ***************************
            db: dani
           tbl: inconsistency
         chunk: 3
    chunk_time: 0.00951
   chunk_index: PRIMARY
lower_boundary: 7
upper_boundary: 9
      this_crc: 32ab17eb
      this_cnt: 3
    master_crc: 2d705b07
    master_cnt: 3
            ts: 2020-09-10 20:20:17
1 row in set (0.00 sec)

That's the 9th row, where the primary says “pepe” and the replica says “papa”.

So if you are running RDS for MySQL on the 8.0 series, pt-table-checksum is back to being something you can use, thanks to dynamic privileges. Yet another reason to upgrade to MySQL 8.0!

Sep
08
2020
--

Checking Data Consistency for RDS for MySQL

RDS for MySQL, and DBaaS offerings in general, are environments tightly controlled by the vendor, which means some things are missing, like a SUPER grant for the root user (and for any user in general). This has implications for operations, one of them being the impossibility of running pt-table-checksum to verify data consistency between a primary and its replicas.

However, there’s a workaround that might overcome this situation and involves three things:

  • The pt-table-checksum itself
  • A way to collect executed queries
  • And the last one, which can be controversial: removing read-only from the replica and using a maintenance window to stop traffic to the database while pt-table-checksum runs.

The problem with RDS is that you cannot change binlog_format to STATEMENT, which is one of the requirements for pt-table-checksum to run.
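
Trying it by hand shows the problem; this is the standard error for a user lacking SUPER:

mysql> set global binlog_format = STATEMENT;
ERROR 1227 (42000): Access denied; you need (at least one of) the SUPER privilege(s) for this operation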

The workaround consists of capturing the executed queries and replaying them on the replica. There are several ways to collect the queries: one is using the Performance Schema, in a similar way as explained in this blog post (https://www.percona.com/blog/2015/10/01/capture-database-traffic-using-performance-schema/). Another is simply using the slow log with long_query_time = 0. By default on RDS the log output is set to TABLE, so with a simple query against mysql.slow_log you can get the queries. Another option, which we prefer to avoid, is pt-query-digest's processlist feature, since it might miss some queries.
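
If you go the slow log route, here is a minimal sketch of pulling the tool's queries out of the table. It assumes log_output=TABLE, long_query_time set to 0 through the parameter group, and that pt-table-checksum connects as the percona user:

SELECT CONVERT(sql_text USING utf8) AS query
FROM mysql.slow_log
WHERE user_host LIKE 'percona%'
ORDER BY start_time;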

Queries look like this:

# Time: 2020-09-01T15:20:34
# User@Host: percona[percona] @ 192.168.1.200:59646 []
# Query_time: 0.007615  Lock_time: 0.000000  Rows_sent: 0  Rows_examined: 0
use dani;
REPLACE INTO `percona`.`checksums` (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT 'dani', 'inconsistency', '5', 'PRIMARY', '9', NULL, COUNT(*), '0' FROM `dani`.`inconsistency` FORCE INDEX(`PRIMARY`) WHERE ((`id` > '9')) ORDER BY `id` /*past upper chunk*/;
# Time: 2020-09-01T15:20:34
# User@Host: percona[percona] @ 192.168.1.200:59646 []
# Query_time: 0.009266  Lock_time: 0.000000  Rows_sent: 0  Rows_examined: 0
use dani;
UPDATE `percona`.`checksums` SET chunk_time = '0.008633', master_crc = '0', master_cnt = '0' WHERE db = 'dani' AND tbl = 'inconsistency' AND chunk = '5';

The next step is to send those queries to the replicas as soon as possible, so we can somehow guarantee that the point in time for comparison is the same for tables on both primary and secondary. That's the reason why one needs to change the read-only value on the replicas to 0. The change can be rolled back immediately after the pt-table-checksum process ends.
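
On RDS you can't SET GLOBAL read_only directly either; the flip is done through the replica's parameter group. A minimal sketch with the AWS CLI follows (the group name replica-params is hypothetical, and the replica must be attached to its own parameter group):

aws rds modify-db-parameter-group \
    --db-parameter-group-name replica-params \
    --parameters "ParameterName=read_only,ParameterValue=0,ApplyMethod=immediate"
# ... replay the captured queries and run pt-table-checksum ...
aws rds modify-db-parameter-group \
    --db-parameter-group-name replica-params \
    --parameters "ParameterName=read_only,ParameterValue=1,ApplyMethod=immediate"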

The Proof of Concept

I have created an RDS primary/secondary environment and added a table with an inconsistency introduced on purpose.

Primary values:

mysql> select * from inconsistency;
+----+--------------+---------------+
| id | string_field | numeric_field |
+----+--------------+---------------+
|  1 | casa         |             1 |
|  2 | caza         |             2 |
|  3 | auto         |             3 |
|  4 | auto         |             3 |
|  5 | auto         |             4 |
|  6 | auto         |             5 |
|  7 | autos        |             5 |
|  8 | autos        |             6 |
|  9 | pepe         |             1 |
+----+--------------+---------------+
9 rows in set (0.09 sec)

And replica values:

mysql> select * from inconsistency;
+----+--------------+---------------+
| id | string_field | numeric_field |
+----+--------------+---------------+
|  1 | casa         |             1 |
|  2 | caza         |             2 |
|  3 | auto         |             3 |
|  4 | auto         |             3 |
|  5 | auto         |             4 |
|  6 | auto         |             5 |
|  7 | autos        |             5 |
|  8 | autos        |             6 |
|  9 | papa         |             1 |
+----+--------------+---------------+
9 rows in set (0.08 sec)

Can you spot the difference :)? It's the last row: while on the primary the string_field says “pepe”, on the replica it says “papa”.

So are we ready to run pt-table-checksum? Not quite. The tool will complain about not being able to change the binlog_format and will end the execution. Unfortunately, there's currently no way to avoid that other than modifying the code. The change is to add a return to the following conditional:

      if ( VersionParser->new($dbh) >= '5.1.5' ) {
         $sql = 'SELECT @@binlog_format';

With the return:

      if ( VersionParser->new($dbh) >= '5.1.5' ) {
         return;
         $sql = 'SELECT @@binlog_format';

In pt-table-checksum version 3.2.1, that is line 10181:
https://github.com/percona/percona-toolkit/blob/release-3.2.1/bin/pt-table-checksum#L10181

Now we are ready! Let's see if we can find that difference using the tool. To send the queries to the replicas, execute the previously captured queries.
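
Here is a sketch of one way to do it, pulling the captured queries out of mysql.slow_log on the primary and feeding them to the replica (hostnames are illustrative):

mysql -h dgb-primary.<endpoint>.us-east-1.rds.amazonaws.com -u percona -p -N -B -e \
  "SELECT CONCAT(CONVERT(sql_text USING utf8), ';') FROM mysql.slow_log WHERE user_host LIKE 'percona%' ORDER BY start_time" \
  > checksum_queries.sql
mysql -h dgb-replica.<endpoint>.us-east-1.rds.amazonaws.com -u percona -p < checksum_queries.sql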

And finally, the actual pt-table-checksum command:

pt-table-checksum --host=dgb-primary --user=percona --password=xxxxx --no-check-binlog-format --no-check-slave-tables --databases=dani --recursion-method=none --chunk-size=3

The output won’t report any difference and is expected to happen like that, so don’t panic. So, how do we check the reality? By querying the checksums table in the replica:

mysql> select * from percona.checksums;
+------+---------------+-------+------------+-------------+----------------+----------------+----------+----------+------------+------------+---------------------+
| db   | tbl           | chunk | chunk_time | chunk_index | lower_boundary | upper_boundary | this_crc | this_cnt | master_crc | master_cnt | ts                  |
+------+---------------+-------+------------+-------------+----------------+----------------+----------+----------+------------+------------+---------------------+
| dani | inconsistency |     1 |    0.00877 | PRIMARY     | 1              | 3              | ae8eafc4 |        3 | ae8eafc4   |          3 | 2020-09-01 16:48:04 |
| dani | inconsistency |     2 |   0.008754 | PRIMARY     | 4              | 6              | 374d887b |        3 | 374d887b   |          3 | 2020-09-01 16:48:04 |
| dani | inconsistency |     3 |   0.008737 | PRIMARY     | 7              | 9              | 25680fb9 |        3 | d7e101a5   |          3 | 2020-09-01 16:48:04 |
| dani | inconsistency |     4 |   0.008944 | PRIMARY     | NULL           | 1              | 0        |        0 | 0          |          0 | 2020-09-01 16:48:04 |
| dani | inconsistency |     5 |   0.008905 | PRIMARY     | 9              | NULL           | 0        |        0 | 0          |          0 | 2020-09-01 16:48:04 |
+------+---------------+-------+------------+-------------+----------------+----------------+----------+----------+------------+------------+---------------------+

See the difference? It's chunk number 3: this_crc and master_crc differ. Hard to spot, right? Let's add some filters to the query:

mysql> SELECT * FROM percona.checksums WHERE (  master_cnt <> this_cnt  OR master_crc <> this_crc  OR ISNULL(master_crc) <> ISNULL(this_crc)) GROUP BY db, tbl;
+------+---------------+-------+------------+-------------+----------------+----------------+----------+----------+------------+------------+---------------------+
| db   | tbl           | chunk | chunk_time | chunk_index | lower_boundary | upper_boundary | this_crc | this_cnt | master_crc | master_cnt | ts                  |
+------+---------------+-------+------------+-------------+----------------+----------------+----------+----------+------------+------------+---------------------+
| dani | inconsistency |     3 |   0.008602 | PRIMARY     | 7              | 9              | 25680fb9 |        3 | d7e101a5   |          3 | 2020-09-01 16:56:16 |
+------+---------------+-------+------------+-------------+----------------+----------------+----------+----------+------------+------------+---------------------+
1 row in set (0.08 sec)

There you go, data inconsistency detected.

Working with MySQL 8.0? There’s an even easier way to check data consistency!

Fine print

Some things to consider:

  • Replicas should be up to date – if there's lag between primary and secondary, you will get spurious differences reported (a quick lag check is sketched after this list).
  • The read-only setting itself: it's kind of ironic that to check data consistency you have to disable the one thing that guarantees data consistency. However, the change is temporary, and it is highly important to revert to read_only=ON once the process is done.
  • Traffic to the database must be stopped in order to guarantee, 100%, that the data being checked is at the same point in time, meaning no changes happened in between.
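
A quick way to confirm the replica is caught up before trusting the results (run on the replica):

mysql> show slave status\G
...
        Seconds_Behind_Master: 0
...
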
Jul
17
2018
--

When Should I Use Amazon Aurora and When Should I use RDS MySQL?

Now that Database-as-a-Service (DBaaS) is in high demand, there is one question regarding AWS services that cannot always be answered easily: when should I use Aurora, and when should I use RDS MySQL?

DBaaS cloud services allow users to use databases without configuring physical hardware and infrastructure, and without installing software. I’m not sure if there is a straightforward answer, but when trying to find out which solution best fits an organization there are multiple factors that should be taken into consideration. These may be performance, high availability, operational cost, management, capacity planning, scalability, security, monitoring, etc.

There are also cases where although the workload and operational needs seem to best fit to one solution, there are other limiting factors which may be blockers (or at least need special handling).

In this blog post, I will try to provide some general rules of thumb but let’s first try to give a short description of these products.

What we should really compare is the MySQL and Aurora database engines provided by Amazon RDS.

An introduction to Amazon RDS

Amazon Relational Database Service (Amazon RDS) is a hosted database service which provides multiple database products to choose from, including Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server. We will focus on MySQL and Aurora.

With regard to systems administration, both solutions are time-saving. You get an environment ready to deploy your application, and if there are no dedicated DBAs, RDS gives you great flexibility for operations like upgrades or backups. For both products, Amazon applies required updates and the latest patches without any downtime. You can define maintenance windows, and automated patching (if enabled) will occur within them. Data is continuously backed up to S3 in real time, with no performance impact. This eliminates the need for backup windows and other scripted procedures, complex or otherwise. Although this sounds great, the risk of vendor lock-in and the challenges of enforced updates and client-side optimizations are still there.

So, Aurora or RDS MySQL?

Amazon Aurora is a relational, proprietary, closed-source database engine, with all that that implies.

RDS MySQL is 5.5, 5.6 and 5.7 compatible and offers the option to select among minor releases. While RDS MySQL supports multiple storage engines with varying capabilities, not all of them are optimized for crash recovery and data durability. Until recently, Aurora had the limitation of only being compatible with MySQL 5.6, but it is now compatible with both 5.6 and 5.7.

So, in most cases, no significant application changes are required for either product. Keep in mind that certain MySQL features like the MyISAM storage engine are not available with Amazon Aurora. Migration to RDS can be performed using Percona XtraBackup.

For RDS products, shell access to the underlying operating system is disabled, and access to MySQL user accounts with the “SUPER” privilege isn't allowed. To configure MySQL variables or manage users, Amazon RDS provides specific parameter groups, APIs and other special system procedures which can be used. If you need to enable remote access, this article will help you do so: https://www.percona.com/blog/2018/05/08/how-to-enable-amazon-rds-remote-access/

Performance considerations

Although Amazon RDS uses SSDs to achieve better I/O throughput for all its database services, Amazon claims that Aurora is able to achieve a 5x performance boost over standard MySQL and provides reliability out of the box. In general, Aurora seems to be faster, but not always.

For example, Aurora must disable the InnoDB change buffer (this is one of the keys to its distributed storage engine), which means updates to secondary indexes have to be written through immediately. MySQL normally relies on the change buffer to defer and merge secondary index updates, so workloads with heavy writes that update secondary indexes pay a big performance penalty. If your application performs a high rate of updates against tables with secondary indexes, Aurora performance may be poor. In any case, you should always keep in mind that performance depends on schema design. Before deciding to migrate, performance should be evaluated against an application-specific workload. Extensive benchmarks will be the subject of a future blog post.

Capacity Planning

Talking about underlying storage, another important consideration is that with Aurora there is no need for capacity planning. Aurora storage grows automatically, from a minimum of 10 GB up to 64 TiB, in 10 GB increments, with no impact on database performance. The table size is constrained only by the size of the Aurora cluster volume, so the maximum size of a table in an Aurora database is 64 TiB. For RDS MySQL, the maximum provisioned storage limit constrains a table to a maximum size of 16 TB when using InnoDB file-per-table tablespaces.

Replication

Replication is a really powerful feature of MySQL(-like) products. With Aurora, you can provision up to fifteen replicas, compared to just five in RDS MySQL. All Aurora replicas share the same underlying volume with the primary instance, which means replication can be performed in milliseconds, as updates made by the primary instance are instantly available to all Aurora replicas. On Amazon Aurora, failover is automatic with no data loss, and the replicas' failover priority can be set.

An explanatory description of Amazon Aurora’s architecture can be found in Vadim’s post written a couple of years ago https://www.percona.com/blog/2015/11/16/amazon-aurora-looking-deeper/

The architecture used and the way replication works on the two products reveal a really significant difference between them. Aurora is a High Availability (HA) solution: you only need to attach a reader and it automatically becomes Multi-AZ available. Aurora replicates data to six storage nodes across Multi-AZs to withstand the loss of an entire AZ (Availability Zone), or of two storage nodes, without any availability impact on the client's applications.

On the other hand, RDS MySQL allows only up to five replicas, and the replication process is slower than Aurora's. Failover is a manual process and may result in last-minute data loss. RDS for MySQL is not an HA solution on its own: you have to mark the master as Multi-AZ and attach the endpoints.

Monitoring

Both products can be monitored with a variety of monitoring tools. You can enable automated monitoring and you can define the log types to publish to Amazon CloudWatch. Percona Monitoring and Management (PMM) can also be used to gather metrics.

Be aware that for Aurora there is a limitation on T2 instances: enabling the Performance Schema can cause the host to run out of memory.

Costs

Aurora instances will cost you ~20% more than RDS MySQL. If you create Aurora read replicas, the cost of your Aurora cluster will double. Aurora is only available on certain RDS instance sizes. Instance pricing details can be found on the AWS pricing pages for RDS MySQL and for Aurora.

Storage pricing may be a bit tricky. Keep in mind that pricing for Aurora differs from that for RDS MySQL. For RDS MySQL, you have to select the type and size of the EBS volume, and you have to be sure that provisioned EBS IOPS can be supported by your instance type, as EBS IOPS are restricted by the instance type's capabilities. Unless you watch for this, you may end up with EBS IOPS that cannot really be used by your instance.

For Aurora, IOPS are only limited by the instance type. This means that if you want to increase IOPS performance on Aurora, you should upgrade the instance type. In any case, Amazon will charge you based on the dataset size and the requests per second.

That said, although with Aurora you pay only for the data you really use, in 10 GB increments, you still have to select the right instance if you want high performance. For Aurora, regardless of the instance type, you get billed $0.10 per GB-month and $0.20 per 1 million requests, so if you need high performance the cost may end up even higher than RDS MySQL. For RDS MySQL, storage costs are based on the EBS type and size.
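
As a hypothetical illustration using the rates quoted above (real bills vary by region and exclude instance-hours): a 500 GB Aurora dataset serving 50 million requests per month would run 500 × $0.10 = $50 for storage plus 50 × $0.20 = $10 for I/O, around $60/month before the cost of the instances themselves.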

Percona provides support for RDS services, and you might be interested in our case studies. When a more fully customized solution is required, most of our customers usually prefer the use of AWS EC2 instances supported by our managed services offering.

TL;DR
  • If you are looking for a native HA solution then you should use Aurora
  • For a read-intensive workload within an HA environment, Aurora is a perfect match. Combined with ProxySQL for RDS, you get high flexibility
  • Aurora performance is great, but not as good as you might expect for write-intensive workloads when secondary indexes exist. In any case, you should benchmark both RDS MySQL and Aurora before deciding to migrate. Performance depends greatly on workload and schema design
  • By choosing Amazon Aurora you are fully dependent on Amazon for bug fixes or upgrades
  • If you need to use MySQL plugins you should use RDS MySQL
  • Aurora only supports InnoDB. If you need other engines, e.g. MyISAM, RDS MySQL is the only option
  • With RDS MySQL you can use specific MySQL releases
  • Aurora is not included in the AWS free tier and costs a bit more than RDS MySQL. If you only need a managed solution to deploy services inexpensively, and out-of-the-box high availability is not your main concern, RDS MySQL is what you need
  • If for any reason Performance Schema must be ON, you should not enable this on Amazon Aurora MySQL T2 instances. With the Performance Schema enabled, the T2 instance may run out of memory
  • For both products, you should carefully examine the known issues and limitations listed here https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/MySQL.KnownIssuesAndLimitations.html and here https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.AuroraMySQL.html

Sep
21
2017
--

Percona Support with Amazon RDS

This blog post will give a brief overview of Amazon RDS capabilities and limitations, and how Percona Support can help you succeed in your Amazon RDS deployments.

One of the common questions that we get from customers and prospective customers is about Percona Support with Amazon RDS. As many companies have shifted to the cloud, or are considering how to do so, it’s natural to try to understand the limitations inherent in different deployment strategies.

Why Use Amazon RDS?

As more companies move to using the cloud, we’ve seen a shift towards work models in technical teams that require software developers to take on more operational duties than they have traditionally. This makes it essential to abstract infrastructure so it can be interacted with as code, whether through automation or APIs. Amazon RDS presents a compelling DBaaS product with significant flexibility while maintaining ease of deployment.

Use Cases Where RDS Isn’t a Fit

There are a number of use cases where the inherent limitations of RDS make it not a good fit. With RDS, you are trading off the flexibility to deploy complex environment topologies for the ease of deploying with the push of a button, or a simple API call. RDS eliminates most of the operational overhead of running a database in your environment by abstracting away the physical or virtual hardware and the operating system, networking and replication configuration. This, however, means that you can’t get too fancy with replication, networking or the underlying operating system or hardware.

When Using RDS, Which Engine is Right For Me?

Amazon’s RDS has numerous database engines available, each suited to a specific use case. The three RDS database engines we’ll be discussing briefly here are MySQL, MariaDB and Aurora.

Use MySQL when you have an application tuned for MySQL, you need to use MySQL plug-ins or you wish to maintain compatibility to support external replicas in EC2. MySQL with RDS has support for Memcached, including plug-in support and 5.7 compatible query optimizer improvements. Unfortunately, thread pooling and similar features that are available in Percona Server for MySQL are not currently available in the MySQL engine on RDS.

Use MariaDB when you have an application that requires features available in this engine but not in the others. Currently, MariaDB engines in RDS support thread pooling, table elimination, user roles and virtual columns; neither MySQL nor Aurora supports these. MariaDB engines in RDS support global transaction IDs (GTIDs), but they are based on the MariaDB implementation and are not compatible with MySQL GTIDs. This can affect replication or migrations in the future.

Use Aurora when you want a simple-to-setup solution with strong availability guarantees and minimal configuration. This RDS database engine is cloud-native, built with elasticity and the vagaries of running in a distributed infrastructure in mind. While it does limit your configuration and optimization capabilities more than other RDS database engines, it handles a lot of things for you – including ensuring availability. Aurora automatically detects database crashes and restarts without the need for crash recovery or to rebuild the database cache. If the entire instance fails, Aurora automatically fails over to one of up to 15 read replicas.

So If RDS Handles Operations, Why Do I Need Support?

Generally speaking, properly using a database implies four quadrants of tasks. RDS only covers one of these four quadrants: the operational piece. Your existing staff (or another provider such as Percona) must cover each of the remaining quadrants.

[Figure: the four quadrants of database tasks]

The areas where people run into trouble are slow queries, database performance not meeting expectations, and other such issues. In these cases, they often contact Amazon's support line. The AWS Support Engineers are trained and focused on addressing issues specific to the AWS environment, however. They're not DBAs and do not have the database expertise necessary to fully troubleshoot your database issues in depth. Often, when an RDS user encounters a performance issue, the first instinct is to increase the size of their AWS deployment because it's a simple solution. A better path would be investigating performance tuning: more hardware is not necessarily the best solution, and you often end up spending far more on your monthly cloud hosting bill than necessary by ignoring unoptimized configurations and queries.

As noted above, when using MariaDB or MySQL RDS database engines you can make use of plug-ins and inject additional configuration options that aren’t available in Aurora. This includes the ability to replicate to external instances, such as in an EC2 environment. This provides more configuration flexibility for performance optimization – but does require expertise to make use of it.

Outside support vendors (like Percona) can still help you even when you eliminate the operational elements by lending the expertise to your technical teams and educating them on tuning and optimization strategies.

Aug
29
2017
--

Percona Live Europe Featured Talks: Migrating To and Living on RDS/Aurora with Balazs Pocze

Colin Charles

Welcome to another post in our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Balazs Pocze, Senior Datastore Engineer at Gizmodo. His talk is titled Migrating To and Living on RDS/Aurora. Gizmodo migrated their platform (Kinja) from a datacenter-based approach to AWS, including the migration of standalone MySQL hosts to RDS/Aurora. In our conversation, we discussed how they achieved this migration:

Percona: How did you get into database technology? What do you love about it?

Balazs: I worked as an Operations/DevOps guy for years before I started working with databases. I guess it happened because, at the company where I worked at the time, I was the person who dared to deal with the database when something strange happened. Somebody had to hold the hot potato.

I love that being a DBA is like being a bass player in a rock band. When you do your job perfectly, no one ever notices you are there – but the entire show depends on your work.

Percona: You’re presenting a session called “Migrating To and Living on RDS/Aurora”. What reasons were crucial in the decision to migrate to a cloud platform? Performance? Less management? Database demands?

Balazs: Actually, we migrated the entire Kinja (our platform) to the cloud, so migrating the database wasn't a question for a second. We moved to the cloud because we didn't want to deal with hardware anymore; we needed flexibility. In the data center days, we had to size the DCs to handle all of our traffic at any given moment, which meant we had to burn a lot of money on underutilized machines. In the cloud, we can spin up machines when we need more computing power. In addition, our hardware had just gotten old enough that it made sense to consider which was the better idea: buying lots of expensive hardware, keeping it running and dealing with it (the majority of the ops team lives on a different continent than our servers!), or simply migrating everything to the cloud. The latter was simpler and safer.

But we didn’t just migrate to the cloud, we also migrated to RDS – managed database service instead of servers with a database on them. The reason to start using RDS was that I didn’t want to re-implement all of the automation stacks we had on the data centers. That seemed like too much work with too many points of failure. When I checked how to fix those failure points, the entire project started to look like the Deathstar. The original database stack was growing organically in the given data center scenario, and reimplementing it for the cloud seemed unsafe.

Percona: How smooth was the transition, and did you hit unexpected complications? How did you overcome them?

Balazs: The transition was smooth and, from our readers' view, unnoticeable. Since the majority of my talk will be about those complications and the ways we solved them, I think it would be best if I answer this question during my session.

But there’s a non-exhaustive list: we had to switch back from GTID to old-fashioned replication, we had to set up SSL proxies to connect securely the data center and the cloud environment, and after we had to debug a lot of packet loss and TCP overload on the VPN channel. It was fun, actually.

Percona: What do you want attendees to take away from your session? Why should they attend?

Balazs: This session will be about how we had to change our view of the database, and the differences we encountered in the cloud compared to the hardware world. If somebody plans to migrate to the cloud (especially AWS/RDS), I recommend they check out my talk, because some of the paths we walked down were dead ends. I'll share what we found so you don't have to make the same mistakes we did. It will spare you some time.

Percona: What are you most looking forward to at Percona Live Europe 2017?

Balazs: Three things: hearing about new technologies, learning best practices, and most importantly meeting up with the people I always meet at Percona conferences. There is a really good community with lots of great people. I am always looking forward to seeing them again.

Want to find out more about Balazs and RDS migration? Register for Percona Live Europe 2017, and see his talk Migrating To and Living on RDS/Aurora. Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

Aug
09
2017
--

How to Configure Aurora RDS Parameters

In this blog post, we'll look at some tips on how to configure Aurora RDS parameters.

I was recently deploying a few Aurora RDS instances, a process very similar to configuring a regular RDS instance. I noticed a few minor differences in the way you configure Aurora RDS parameters, and found very few articles on how the commands should be structured (for RDS as well as Aurora). The only real literature available is the official Amazon RDS documentation.

This blog provides a concise “how-to” guide to quickly change Aurora RDS parameters using the AWS CLI. Aurora retains the parameter group model introduced with RDS, with new instances getting the default parameter groups, which are read-only. For a new instance, you need to create and allocate a new parameter group (this requires a DB reboot). After that, you can apply changes to dynamic variables immediately. In other words, the first time you attach a DB parameter group you'll need to reboot, even if the variable you are configuring is dynamic. It's best to create a new DB parameter group when initializing your clusters. Nothing stops you from adding more than one host to the same DB parameter group rather than creating one per instance.

In addition to the DB Parameter Group, each instance is also allocated a DB Cluster Parameter Group. The DB Parameter Group is used for instance-level parameters, while the DB Cluster Parameter Group is used for cluster-level parameters (and applies to all instances in a cluster). You’ll find some of the MySQL engine variables can only be found in the DB Cluster Parameter Group. Here you will find a handy reference of all the DB cluster and DB instance parameters that are viewable or configurable for Aurora instances.

To run these commands, you’ll need to have the “aws” cli tool installed and configured. Note that the force-failover option used for RDS instances doesn’t apply to Aurora. You should perform either a controlled failover or let Aurora handle this. Also, the group family to use for Aurora is “oscar5.6”. The commands to set this up are as follows:

aws rds create-db-parameter-group \
    --db-parameter-group-name percona-opt \
    --db-parameter-group-family oscar5.6 \
    --description "Percona Optimizations"
aws rds modify-db-parameter-group \
    --db-parameter-group-name percona-opt \
    --parameters "ParameterName=max_connections,ParameterValue=5000,ApplyMethod=immediate"
# For each instance-name:
aws rds modify-db-instance \
    --db-instance-identifier <instance-name> \
    --db-parameter-group-name percona-opt
aws rds reboot-db-instance \
    --db-instance-identifier <instance-name>

Once you create the initial DB parameter group, configure the variables as follows:

aws rds modify-db-parameter-group \
    --db-parameter-group-name percona-opt \
    --parameters "ParameterName=max_connect_errors,ParameterValue=999999,ApplyMethod=immediate"
## Verifying the change:
aws rds describe-db-parameters \
    --db-parameter-group-name percona-opt \
    | grep -B7 -A2 'max_connect_errors'

Please keep in mind that it can take a few seconds for changes to propagate to the nodes. Give it a moment before checking the values with “show global variables”. You can configure the DB Cluster Parameter Group similarly, for example:

# Create a new db cluster parameter group
aws rds create-db-cluster-parameter-group --db-cluster-parameter-group-name percona-cluster --db-parameter-group-family oscar5.6 --description "new cluster group"
# Tune a variable on the db cluster parameter group
aws rds modify-db-cluster-parameter-group --db-cluster-parameter-group-name percona-cluster --parameters "ParameterName=innodb_flush_log_at_trx_commit,ParameterValue=2,ApplyMethod=immediate"
# Allocate the new db cluster parameter to your cluster
aws rds modify-db-cluster --db-cluster-identifier <cluster_identifier> --db-cluster-parameter-group-name=percona-cluster
# And of course, for viewing the cluster parameters
aws rds describe-db-cluster-parameters --db-cluster-parameter-group-name=percona-cluster
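
Once the cluster parameter group is attached, a quick sanity check from any node confirms the change has propagated (endpoint and user are placeholders):

mysql -h <instance-endpoint> -u <user> -p -e "show global variables like 'innodb_flush_log_at_trx_commit'"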

I hope you find this article useful. Please make sure to share it with the community!

Apr
26
2017
--

Percona Live 2017: A Deep-Dive Into What’s New in Amazon Aurora

Percona Live 2017 is rolling along, and now that everybody has gotten through lunch, we're all recharged and ready for the afternoon. Let's start it out with Amazon Aurora.

One of the best-attended sessions was Sailesh Krishnamurthy's (Senior Engineering Manager at Amazon Web Services) talk: a deep dive into what is new in Amazon Aurora. Amazon Aurora is a fully managed relational database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. It is purpose-built for the cloud, using a new architectural model and distributed systems techniques to provide far higher performance, availability and durability than previously possible with conventional monolithic database architectures.

Amazon Aurora packs a lot of innovations in the engine and storage layers. In this session, Sailesh looked at some of the key innovations behind Amazon Aurora, new improvements to Aurora’s performance, availability and cost-effectiveness and discussed best practices and optimal configurations.

Don’t miss any of tomorrow’s talks! You can find Thursday’s (4/27) session schedule here.

Nov
11
2016
--

Amazon AWS Service Tiers

This blog post discusses the differences between the Amazon AWS service tiers.

Many people want to move to an Amazon environment but are unsure which AWS service makes the most sense (EC2, RDS, Aurora). For database services, the tiering at Amazon starts with EC2, then moves up to RDS, and on to Aurora. As you move up the tiers, Amazon takes on more of the implementation and management of the database, which limits your optimization options. Obviously, moving up the tiers also increases base costs, but there are tradeoffs at each level to consider.

  • EC2 (Elastic Compute Cloud) is a basic cloud platform. It provides the user with complete control of the compute environment, while reducing your need to monitor and manage hardware. From a database perspective, you can do almost anything in EC2 that you could do running a database on your own hardware. You can tweak OS and database settings, plus do all of the normal database optimization work you would do in a bare metal environment. In EC2, you can run a single server, master/slave, or a cluster, and you can use MySQL, MongoDB, or any other product. You can use AWS Snapshot Manager to take backups, or you can use another backup tool. This option is ideal if you want all the flexibility of running your own hardware without the hassles of daily hardware maintenance.
  • RDS (Relational Database Service) makes it easy to set up a relational database in the cloud. It offers similar resizing capabilities to EC2, but also automates a lot of tasks. RDS supports Aurora (more on that later), Postgres, MySQL, MariaDB, Oracle, and MSSQL. RDS simplifies deployment and automates some maintenance tasks, but this means you are limited in the tweaks you can implement at the OS and database configuration level, so you will focus on query and schema changes to optimize a database in this environment. RDS also includes automated backups and provides options for read replicas that you can spread across multiple availability zones. In the EC2 world, you must consider and manage all of these items yourself. This choice is great if you are looking to implement a database but don't want (or don't know how) to take on tasks, such as backups and replication setup, that are needed for a stable and highly available environment.
  • Aurora is one of the database options available through RDS. You might hear people refer to it either as Aurora or RDS Aurora (they're the same thing). With Aurora, Amazon takes on even more of the configuration and management, which limits your optimization capabilities further. It also means there are far fewer things to worry about, since Amazon handles so much of the administration. Aurora is MySQL-compatible, and is great if you want the power and convenience of MySQL with a minimum of effort on the hardware side. Aurora is designed to automatically detect database crashes and restart without the need for crash recovery or rebuilding the database cache. If the entire instance fails, Aurora will automatically fail over to one of up to 15 read replicas.

With data in the cloud, security becomes a bigger concern. You continue to govern access to your content, platform, applications, systems, and networks, just as you would with data stored in your own datacenter. Amazon's cloud offerings also support highly secure environments, such as those requiring HIPAA and PCI compliance. They have designed the cloud environment to be a secure database environment while maintaining the necessary access for use and administration, even in these more regulated environments.

Storing data in the cloud is becoming more common. Amazon offers multiple platform options and allows for easy scalability, availability, and reliability.

Nov
02
2015
--

A first look at RDS Aurora

Recently, I happened to have an onsite engagement whose goal was to move a database service to RDS Aurora. Like probably most of you, I knew the service by name but couldn't say much about it, so I Googled, I listened to talks, and I read about it. Now that my onsite engagement is over, here's my first impression of Aurora.

First, let’s describe the service itself. It is part of RDS and, at first glance, very similar to a regular RDS instance. In order to setup an Aurora instance, you go to the RDS console and you either launch a new instance choosing Aurora as type or you create a snapshot of a RDS 5.6 instance and migrate it to Aurora. While with a regular MySQL RDS instance you can create slaves, with Aurora you can add reader nodes to an existing cluster. An Aurora cluster minimally consists of a writer node but you can add up to 15 reader nodes (only one writer though). It is at the storage level that things become interesting. Aurora doesn’t rely on a filesystem type storage, at least not from a database standpoint, it has its own special storage service that is replicated locally and to two other AZ automatically for a total of 6 copies. Furthermore, you pay only for what you use and the storage grows/shrinks automatically in increments of 10 GB, which is pretty cool. You can have up to 64 TB in an Aurora cluster.

Now, all that is fine, but what are the benefits of using Aurora? I must say I barely used Aurora; one week is not a field proven experience. These are claims by Amazon, but, as we will discuss, there are some good arguments in favor of these claims.

The first claim is that write capacity is increased by up to 4x. So, even though only a single instance is used as the writer in Aurora, you get up to 400% of the write capacity of a normal MySQL instance. That's quite huge and amazing, but it basically means replication is asynchronous at the storage level, at least for the multi-AZ part, since the latency would otherwise be a performance killer. Locally, Aurora uses a quorum-based approach with the storage nodes, and given that the object store is a separate service with its own high availability configuration, that is a reasonable trade-off. By comparison, Galera-based clustering solutions like Percona XtraDB Cluster typically lower the write capacity, since all nodes must synchronize on commit. Other claims are that reader performance is unaffected by the clustering and that the readers have almost no lag relative to the writer. Furthermore, as if that weren't enough, readers can't diverge from the master. Finally, since there's no lag, any reader can replace the writer very quickly, so in terms of failover, all is right.

That seems almost too good to be true; how can it be possible? I happen to be interested in object stores, Ceph especially, and I was toying with the idea of using Ceph to store InnoDB pages. It appears that the Amazon team did a great job at putting an object store under InnoDB, and they went way further than what I was thinking. Here I may be speculating a bit, and I would be happy to be found wrong: the writer never writes dirty pages back to the store… it only writes fragments of the InnoDB log to the object store as objects, one per transaction, and notifies the readers of the set of pages that have been updated by this log fragment object. Just have a look at the show global status of an Aurora instance and you'll see what I mean… Said otherwise, it is like having an infinitely large set of InnoDB log files; you can't reach the max checkpoint age. Also, if the object store supports atomic operations, there's no need for the doublewrite buffer, a big source of contention in MySQL. Those two aspects alone are enough, in my opinion, to explain the up-to-4x write capacity claim; and considering that the log fragments are a kind of binary diff, that's usually much less to write than whole pages.

Something is needed to remove the log fragment objects since, over time, their accumulation and the need to apply them would impact performance, a phenomenon called log amplification. With Aurora, that seems to be handled at the storage level: the storage system is wise enough to know that a requested page is dirty and applies the log fragments before sending it back to a reader. The shared object store also explains why the readers have almost no lag and why they can't diverge. The only lag the readers can have is the notification time, which should be short within the same AZ.

So, how does Aurora compare to a technology like Galera?

Pros:

  • Higher write capacity, writer is unaffected by the other nodes
  • Simpler logic, no need for certification
  • No need for an SST to provision a new node
  • Can’t diverge
  • Scales IOPS tremendously
  • Fast failover
  • No need for quorum (handled by the object store)
  • Simple to deploy

Cons:

  • Likely asynchronous at the storage level
  • Only one node is writable
  • Not open source

Aurora is a mind shift in terms of databases and a jewel in the hands of Amazon. OpenStack currently has no database service that can offer similar features. I wonder how hard it would be to produce an equivalent solution using well-known open source components, like Ceph for the object store and Corosync, ZooKeeper, ZeroMQ or something else for the communication layer. Also, would there be a use case?

Oct
13
2015
--

MySQL query digest with Performance Schema

Query analysis is a fantastic path in the pursuit of high performance. It's also probably the most repeated part of a DBA's daily adventure. For most of us, the weapon of choice is definitely pt-query-digest, which is one of the best tools for slow query analysis out there.

So why wouldn't you use pt-query-digest? Well, sometimes getting the slow log can be a challenge, such as with RDS instances, or when your database is running as part of a DBaaS, which is a common practice in certain organizations.

In those cases it’s good to have an alternative. And in this case, the chosen one is the Performance Schema. We have already talked about the events_statements_* tables; however, this is the moment for the events_statements_summary_by_digest. In this table each row summarizes events for given schema/digest values (note that before MySQL 5.6.9, there is no SCHEMA_NAME column and grouping is based on DIGEST values only).

In order for MySQL to start aggregating information in the summary tables, you must verify that the statements_digest consumer is enabled.
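
Checking, and if needed enabling, the consumer is a one-liner each (it is enabled by default on recent versions):

SELECT NAME, ENABLED FROM performance_schema.setup_consumers WHERE NAME = 'statements_digest';
UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME = 'statements_digest';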

Now, the most straightforward way to get data is to simply query the table, like this:

SELECT
SCHEMA_NAME,
digest,
digest_text,
round(sum_timer_wait/ 1000000000000, 6),
count_star
FROM performance_schema.events_statements_summary_by_digest
ORDER BY sum_timer_wait DESC LIMIT 10;

This will show you the picture of volume and frequency of SQL statements in your server. As simple as that. But there are some caveats:

  • Statements are normalized to a digest text. Instead of seeing a query like SELECT age FROM population WHERE id BETWEEN 153 AND 153+69 you will see the fingerprint version: SELECT age FROM population WHERE id BETWEEN ? AND ? + ?
  • The events_statements_summary_by_digest table has a limited maximum number of rows (200 by default; since MySQL 5.6.5 it can be modified with the performance_schema_digests_size variable). As a consequence, once the table is full, statements whose digest has no existing row are added to a special “catch-all” row with DIGEST = NULL. In plain English: you won't have meaningful info for those statements (a quick check for this is sketched after this list).
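
To see whether you are already losing statements to that catch-all row, a quick check:

SELECT COUNT_STAR FROM performance_schema.events_statements_summary_by_digest WHERE DIGEST IS NULL;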

To solve the first issue, we can use the events_statements_history table to get complete queries for all the digests. I chose not to use events_statements_current because of the short life rows have in that table. With history, there are more chances to get more queries in the same amount of time.

Now, when using pt-query-digest, the first step is always to collect a representative amount of data, commonly from the slow log, and then process. With Performance Schema, let’s collect a representative amount of complete queries so we can have examples for every aggregated statement.

To address the second issue, we just need to TRUNCATE the events_statements_summary_by_digest table. This way the summary will start from scratch.

Since Performance Schema is available on all the platforms supported by MySQL, I chose to run the tests on an Amazon RDS MySQL instance. The only thing I didn't like is that P_S is disabled by default on RDS, and enabling it requires an instance reboot. Other than that, everything is the same as on a regular instance.

The steps are:

  1. Enable the events_statements_history consumer
  2. Create a MEMORY table to hold the data
  3. Truncate tables to have a fresh start
  4. Create a MySQL EVENT that will fill the table with data
  5. Once the event has ended, get the query digest.

The table schema is the following:

CREATE TABLE IF NOT EXISTS percona.digest_seen
(schema_name varchar(64) DEFAULT NULL,
digest varchar(32) DEFAULT NULL,
sql_text varchar(1024) DEFAULT NULL,
PRIMARY KEY USING BTREE (schema_name,digest)) engine=memory;

The original SQL_TEXT field from the events_statements_history table is defined as longtext, but unless you are using Percona Server (5.5+), you won't be able to use longtext on a MEMORY table. This is possible in Percona Server because the Improved Memory Engine permits the use of Blob and Text fields on the Memory Storage Engine. The workaround is to define that field as varchar(1024). Why 1024? Because that's what Performance Schema gives us: the SQL_TEXT column is fixed at 1024 chars. It's only as of MySQL 5.7.6 that the maximum number of bytes to display can be modified, by changing the performance_schema_max_sql_text_length system variable at server startup.

Also, since we are going to use EVENTS on RDS, the event_scheduler variable has to be set to ON. Luckily, it is a dynamic variable, so there's no need to reboot the instance after modifying the Parameter Group. If using a non-RDS instance, it's enough to execute “SET GLOBAL event_scheduler = ON;”

Here is the complete list of steps:

UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME = 'events_statements_history';
SET SESSION max_heap_table_size = 1024*1024;
CREATE DATABASE IF NOT EXISTS percona;
USE percona;
CREATE TABLE IF NOT EXISTS percona.digest_seen (schema_name varchar(64) DEFAULT NULL, digest varchar(32) DEFAULT NULL, sql_text varchar(1024) DEFAULT NULL, PRIMARY KEY USING BTREE (schema_name,digest)) engine=memory;
TRUNCATE TABLE percona.digest_seen;
TRUNCATE performance_schema.events_statements_summary_by_digest;
TRUNCATE performance_schema.events_statements_history;
CREATE EVENT IF NOT EXISTS getDigest
ON SCHEDULE EVERY 1 SECOND
ENDS CURRENT_TIMESTAMP + INTERVAL 5 MINUTE ON COMPLETION NOT PRESERVE
DO
INSERT IGNORE INTO percona.digest_seen SELECT CURRENT_SCHEMA, DIGEST, SQL_TEXT FROM performance_schema.events_statements_history WHERE DIGEST IS NOT NULL GROUP BY current_schema, digest LIMIT 50;

The event is defined to run immediately, once per second, for 5 minutes; thanks to ON COMPLETION NOT PRESERVE, it drops itself once it ends.

When the event is done, we are in position to get the query digest. Simply execute this query:

SELECT
s.SCHEMA_NAME,
s.SQL_TEXT,
ROUND(d.SUM_TIMER_WAIT / 1000000000000, 6) as EXECUTION_TIME,
ROUND(d.AVG_TIMER_WAIT / 1000000000000, 6) as AVERAGE_TIME,
COUNT_STAR
FROM performance_schema.events_statements_summary_by_digest d
LEFT JOIN percona.digest_seen s USING (digest)
WHERE s.SCHEMA_NAME IS NOT NULL
GROUP BY s.digest
ORDER BY EXECUTION_TIME DESC LIMIT 10;

The ORDER BY is similar to the one pt-query-digest uses by default, but it could be any one you want.

The output is:

+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------+--------------+------------+
| SCHEMA_NAME | SQL_TEXT                                                                                                                                                                  | EXECUTION_TIME | AVERAGE_TIME | COUNT_STAR |
+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------+--------------+------------+
| percona     | UPDATE population SET age=age+1 WHERE id=148                                                                                                                              |  202304.145758 |     1.949487 |     103773 |
| percona     | SELECT age FROM population WHERE id BETWEEN 153 AND 153+69                                                                                                                |     488.572609 |     0.000176 |    2771352 |
| percona     | SELECT sex,age,estimate2012 FROM population WHERE id BETWEEN 174 AND 174+69 ORDER BY sex                                                                                  |     108.841575 |     0.000236 |     461412 |
| percona     | SELECT census2010 FROM population WHERE id=153                                                                                                                            |      62.742239 |     0.000090 |     693526 |
| percona     | SELECT SUM(estimate2014) FROM population WHERE id BETWEEN 154 AND 154+69                                                                                                  |      44.940020 |     0.000195 |     230810 |
| percona     | SELECT DISTINCT base2010 FROM population WHERE id BETWEEN 154 AND 154+69 ORDER BY base2010                                                                                |      33.909593 |     0.000294 |     115338 |
| percona     | UPDATE population SET estimate2011='52906609184-39278192019-93190587310-78276160274-48170779146-66415569224-40310027367-70054020251-87998206812-01032761541' WHERE id=154 |       8.231353 |     0.000303 |      27210 |
| percona     | COMMIT                                                                                                                                                                    |       2.630153 |     0.002900 |        907 |
| percona     | BEGIN                                                                                                                                                                     |       0.705435 |     0.000031 |      23127 |
|             | SELECT 1                                                                                                                                                                  |       0.422626 |     0.000102 |       4155 |
+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------+--------------+------------+
10 rows in set (0.10 sec)

Finally, you can do some cleanup:

DROP event IF EXISTS getDigest;
DROP TABLE IF EXISTS percona.digest_seen;
SET SESSION max_heap_table_size = @@global.max_heap_table_size;
UPDATE performance_schema.setup_consumers SET ENABLED = 'NO' WHERE NAME = 'events_statements_history';

Summary: Performance Schema is doing the query digest already for you. It is just a matter of how to access the data in a way that suits your requirements.
