Jul
04
2018
--

How to Set Up Replication Between AWS Aurora and an External MySQL Instance

Amazon RDS Aurora replication to external server

Amazon RDS Aurora replication to external serverAmazon RDS Aurora (MySQL) provides its own low latency replication. Nevertheless, there are cases where it can be beneficial to set up replication from Aurora to an external MySQL server, as Amazon RDS Aurora is based on MySQL and supports native MySQL replication. Here are some examples of when replicating from Amazon RDS Aurora to an external MySQL server can make good sense:

  • Replicating to another cloud or datacenter (for added redundancy)
  • Need to use an independent reporting slave
  • Need to have an additional physical backup
  • Need to use another MySQL flavor or fork
  • Need to failover to another cloud and back

In this blog post I will share simple step by step instructions on how to do it.

Steps to setup MySQL replication from AWS RDS Aurora to MySQL server

  1. Enable binary logs in the option group in Aurora (Binlog format = mixed). This will require a restart.
  2. Create a snapshot and restore it (create a new instance from a snapshot). This is only needed to make a consistent copy with mysqldump. As Aurora does not allow “super” privileges, running
    mysqldump --master-data

      is not possible. The snapshot is the only way to get a consistent backup with the specific binary log position.

  3. Get the binary log information from the snapshot. In the console, look for the “Alarms and Recent Events” for the restored snapshot instance. We should see something like:
    Binlog position from crash recovery is mysql-bin-changelog.000708 31278857
  4. Install MySQL 5.6 (i.e. Percona Server 5.6) on a separate EC2 instance (for Aurora 5.6 – note that you should use MySQL 5.7 for Aurora 5.7). After MySQL is up and running, import the timezones:
    # mysql_tzinfo_to_sql /usr/share/zoneinfo/|mysql

    Sample config:

    [mysqld]
    log-bin=log-bin
    log-slave-updates
    binlog-format=MIXED
    server-id=1000
    relay-log=relay-bin
    innodb_log_file_size=1G
    innodb_buffer_pool_size=2G
    innodb_flush_method=O_DIRECT
    innodb_flush_log_at_trx_commit=0 # as this is replication slave
  5. From now on we will make all backups from the restored snapshot. First get all users and import those to the new instance:
    pt-show-grants -h myhost...amazonaws.com -u percona > grants.sql

    # check that grants are valid and upload to MySQL

    mysql -f < grants.sql

    Make a backup of all schemas except for the “mysql” system tables as Aurora using different format of those (make sure we connect to the snapshot):

    host="my-snapshot...amazonaws.com"
    mysqldump --single-transaction -h $host -u percona
    --triggers --routines
    --databases `mysql -u percona -h $host -NBe
    "select group_concat(schema_name separator ' ') from information_schema.schemata where schema_name not in ('mysql', 'information_schema', 'performance_schema')"` > all.sql
  6. Restore to the local database:
    mysql -h localhost < all.sql
  7. Restore users again (some users may fail to create where there are missing databases):
    mysql -f < grants.sql
  8. Download the RDS/Aurora SSL certificate:
    # cd /etc/ssl
    # wget 'https://s3.amazonaws.com/rds-downloads/rds-combined-ca-bundle.pem'
    # chown mysql.mysql rds-combined-ca-bundle.pem
  9. Configure MySQL replication. Take the values for the binary log name and position from #3 above. Please note: now we connect to the actual instance, not a snapshot:
    # mysql -h localhost
    ...
    mysql> CHANGE MASTER TO
    MASTER_HOST='dev01-aws-1...',
    MASTER_USER='awsreplication',
    MASTER_PASSWORD='<pass>',
    MASTER_LOG_FILE = 'mysql-bin-changelog.000708',
    MASTER_LOG_POS = 31278857,
    MASTER_SSL_CA = '/etc/ssl/rds-combined-ca-bundle.pem',
    MASTER_SSL_CAPATH = '',
    MASTER_SSL_VERIFY_SERVER_CERT=1;
    mysql> start slave;
  10. Verify that the slave is working. Optionally add the SQL_Delay option to the CHANGE MASTER TO (or anytime) and specify the slave delay in seconds.

I hope those steps will be helpful for setting up an external MySQL replica.

The post How to Set Up Replication Between AWS Aurora and an External MySQL Instance appeared first on Percona Database Performance Blog.

Jun
15
2018
--

This Week in Data with Colin Charles 42: Security Focus on Redis and Docker a Timely Reminder to Stay Alert

Colin Charles

Colin CharlesJoin Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

Much of last week, there was a lot of talk around this article: New research shows 75% of ‘open’ Redis servers infected. It turns out, it helps that one should always read beyond the headlines because they tend to be more sensationalist than you would expect. From the author of Redis, I highly recommend reading Clarifications on the Incapsula Redis security report, because it turns out that in this case, it is beyond the headline. The content is also suspect. Antirez had to write this to help the press (we totally need to help keep reportage accurate).

Not to depart from the Redis world just yet, but Antirez also had some collaboration with the Apple Information Security Team with regards to the Redis Lua subsystem. The details are pretty interesting as documented in Redis Lua scripting: several security vulnerabilities fixed because you’ll note that the Alibaba team also found some other issues. Antirez also ensured that the Redis cloud providers (notably: Redis Labs, Amazon, Alibaba, Microsoft, Google, Heroku, Open Redis and Redis Green) got notified first (and in the comments, compose.io was missing, but now added to the list). I do not know if Linux distributions were also informed, but they will probably be rolling out updates soon.

In the “be careful where you get your software” department: some criminals have figured out they could host some crypto-currency mining software that you would get pre-installed if you used their Docker containers. They’ve apparently made over $90,000. It is good to note that the Backdoored images downloaded 5 million times finally removed from Docker Hub. This, however, was up on the Docker Hub for ten months and they managed to get over 5 million downloads across 17 images. Know what images you are pulling. Maybe this is again more reason for software providers to run their own registries?

James Turnbull is out with a new book: Monitoring with Prometheus. It just got released, I’ve grabbed it, but a review will come shortly. He’s managed all this while pulling off what seems to be yet another great O’Reilly Velocity San Jose Conference.

Releases

A quiet week on this front.

Link List

  • INPLACE upgrade from MySQL 5.7 to MySQL 8.0
  • PostgreSQL relevant: What’s is the difference between streaming replication vs hot standby vs warm standby ?
  • A new paper on Amazon Aurora is out: Amazon Aurora: On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes. It was presented at SIGMOD 2018, and an abstract: “One of the more novel differences between Aurora and other relational databases is how it pushes redo processing to a multi-tenant scale-out storage service, purpose-built for Aurora. Doing so reduces networking traffic, avoids checkpoints and crash recovery, enables failovers to replicas without loss of data, and enables fault-tolerant storage that heals without database involvement. Traditional implementations that leverage distributed storage would use distributed consensus algorithms for commits, reads, replication, and membership changes and amplify cost of underlying storage.” Aurora, as you know, avoids distributed consensus under most circumstances. Short 8-page read.
  • Dormando is blogging again, and this was of particular interest — Caching beyond RAM: the case for NVMe. This is done in the context of memcached, which I am certain many use.
  • It is particularly heartening to note that not only does MongoDB use Linkbench for some of their performance testing, they’re also contributing to making it better via a pull request.

Industry Updates

Trying something new here… To cover fundraising, and people on the move in the database industry.

  • Kenny Gorman — who has been on the program committee for several Percona Live conferences, and spoken at the event multiple times before — is the founder and CEO of Eventador, a stream-processing as a service company built on Apache Kafka and Apache Flink, has just raised $3.8 million in funding to fuel their growth. They are also naturally spending this on hiring. The full press release.
  • Jimmy Guerrero (formerly of MySQL and InfluxDB) is now VP Marketing & Community at YugaByte DB. YugaByte was covered in column 13 as having raised $8 million in November 2017.

Upcoming appearances

  • DataOps Barcelona – Barcelona, Spain – June 21-22, 2018 – code dataopsbcn50 gets you a discount
  • OSCON – Portland, Oregon, USA – July 16-19, 2018
  • Percona webinar on Maria Server 10.3 – June 26, 2018

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

The post This Week in Data with Colin Charles 42: Security Focus on Redis and Docker a Timely Reminder to Stay Alert appeared first on Percona Database Performance Blog.

May
11
2018
--

This Week in Data with Colin Charles 39: a valuable time spent at rootconf.in

Colin Charles

Colin CharlesJoin Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

rootconf.in 2018 just ended, and it was very enjoyable to be in Bangalore for the conference. The audience was large, the conversations were great, and overall I think this is a rather important conference if you’re into the “DevOps” movement (or are a sysadmin!). From the data store world, Oracle MySQL was a sponsor, as was MyDBOPS (blog), and Elastic. There were plenty more, including Digital Ocean/GoJek/Walmart Labs — many MySQL users.

I took a handful of pictures with people, and here are some of the MyDBOPS team and myself.  They have over 20 employees, and serve the Indian market at rates that would be more palatable than straight up USD rates. Traveling through Asia, many businesses always do find local partners and offer local pricing; this really becomes more complex in the SaaS space (everyone pays the same rate generally) and also the services space.

Colin at Rootconf with Oracle
Some of the Oracle MySQL team who were exhibiting were very happy they got a good amount of traffic to the booth based on stuff discussed at the talk and BOF.

From a talk standpoint, I did a keynote for an hour and also a BoF session for another hour (great discussion, lots of blog post ideas from there), and we had a Q&A session for about 15 minutes. There were plenty of good conversations in the hallway track.

A quick observation that I notice happens everywhere: many people don’t realize features that have existed in MySQL since 5.6/5.7.  So they are truly surprised with stuff in 8.0 as well. It is clear there is a huge market that would thrive around education. Not just around feature checklists, but also around how to use features. Sometimes, this feels like the MySQL of the mid-2000’s — getting apps to also use new features, would be a great thing.

Releases

This seems to have been a quiet week on the releases front.

Are you a user of Amazon Aurora MySQL? There is now the Amazon Aurora Backtrack feature, which allows you to go back in time. It is described to work as:

Aurora uses a distributed, log-structured storage system (read Design Considerations for High Throughput Cloud-Native Relational Databases to learn a lot more); each change to your database generates a new log record, identified by a Log Sequence Number (LSN). Enabling the backtrack feature provisions a FIFO buffer in the cluster for storage of LSNs. This allows for quick access and recovery times measured in seconds.

Link List

Upcoming appearances

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

 

The post This Week in Data with Colin Charles 39: a valuable time spent at rootconf.in appeared first on Percona Database Performance Blog.

Apr
16
2018
--

Webinar Tuesday April 17, 2018: Which Amazon Cloud Technology Should You Chose? RDS? Aurora? Roll Your Own?

Amazon Cloud Technology

Amazon Cloud TechnologyPlease join Percona’s Senior Technical Operations Engineer, Daniel Kowalewski as he presents Which Amazon Cloud Technology Should You Chose? RDS? Aurora? Roll Your Own? on Tuesday, April 17, 2018, at 10:00 am PDT (UTC-7) / 1:00 pm EDT (UTC-4).

Are you running on Amazon, or planning to migrate there? In this talk, we are going to cover the different technologies for running databases on Amazon Cloud environments.

We will focus on the operational aspects, benefits and limitations for each of them.

Register for the webinar now.

Amazon Cloud TechnologyDaniel Kowalewski, Senior Technical Operations Engineer

Daniel joined Percona in August of 2015. Previously, he earned a B.S. in Computer Science from the University of Colorado in 2006 and was a DBA there until he joined Percona. In addition to MySQL, Daniel also has experience with Oracle and Microsoft SQL Server, but he much prefers to stay in the MySQL world. Daniel lives near Denver, CO with his wife, two-year-old son, and dog. If you can’t reach him, he’s probably in the mountains hiking, camping, or trying to get lost.

The post Webinar Tuesday April 17, 2018: Which Amazon Cloud Technology Should You Chose? RDS? Aurora? Roll Your Own? appeared first on Percona Database Performance Blog.

Apr
13
2018
--

This Week in Data with Colin Charles 35: Percona Live 18 final countdown and a roundup of recent news

Colin Charles

Colin CharlesJoin Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

Percona Live is just over a week away — there’s an awesome keynote lineup, and you really should register. Also don’t forget to save the date as Percona Live goes to Frankfurt, Germany November 5-7 2018! Prost!

In acquisitions, we have seen MariaDB acquire MammothDB and Idera acquire Webyog.

Some interesting Amazon notes: Amazon Aurora Continues its Torrid Growth, More than Doubling the Number of Active Customers in the Last Year (not sure I’d describe it as torrid but this is great for MySQL and PostgreSQL), comes with a handful of customer mentions. In addition, there have already been 65,000 database migrations on AWS. For context, in late November 2017, it was 40,000 database migrations.

Releases

Link List

Upcoming appearances

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

 

The post This Week in Data with Colin Charles 35: Percona Live 18 final countdown and a roundup of recent news appeared first on Percona Database Performance Blog.

Apr
11
2018
--

Calling All Polyglots: Percona Live 2018 Keynote Schedule Now Available!

Percona Live 2018 Keynotes

Percona Live 2018 KeynotesWe’ve posted the Percona Live 2018 keynote addresses for the seventh annual Percona Live Open Source Database Conference 2018, taking place April 23-25, 2018 at the Santa Clara Convention Center in Santa Clara, CA. 

This year’s keynotes explore topics ranging from how cloud and open source database adoption accelerates business growth, to leading-edge emerging technologies, to the importance of MySQL 8.0, to the growing popularity of PostgreSQL.

We’re excited by the great lineup of speakers, including our friends at Alibaba Cloud, Grafana, Microsoft, Oracle, Upwork and VividCortex, the innovative leaders on the Cool Technologies panel, and Brendan Gregg from Netflix, who will discuss how to get the most out of your database on a Linux OS, using his experiences at Netflix to highlight examples.  

With the theme of “Championing Open Source Databases,” the conference will feature multiple tracks, including MySQL, MongoDB, Cloud, PostgreSQL, Containers and Automation, Monitoring and Ops, and Database Security. Once again, Percona will be offering a low-cost database 101 track for beginning users who want to learn how to use and operate open source databases.

The Percona Live 2018 keynotes include:

Tuesday, April 24, 2018

  • Open Source for the Modern Business – Peter Zaitsev of Percona will discuss how open source database adoption continues to grow in enterprise organizations, the expectations and definitions of what constitutes success continue to change. A single technology for everything is no longer an option; welcome to the polyglot world. The talk will include several compelling open source projects and trends of interest to the open source database community and will be followed by a round of lightning talks taking a closer look at some of those projects.
  • Cool Technologies Showcase – Four industry leaders will introduce key emerging industry developments. Andy Pavlo of Carnegie Mellon University will discuss the requirements for enabling autonomous database optimizations. Nikolay Samokhvalov of PostgreSQL.org will discuss new PostgreSQL tools. Sugu Sougoumarane of PlanetScale Data will explore how Vitess became a high-performance, scalable and available MySQL clustering cloud solution in line with today’s NewSQL storage systems. Shuhao Wu of Shopify explains how to use Ghostferry as a data migration tool for incompatible cloud platforms.
  • State of the Dolphin 8.0 – Tomas Ulin of Oracle will discuss the focus, strategy, investments and innovations that are evolving MySQL to power next-generation web, mobile, cloud and embedded applications – and why MySQL 8.0 is the most significant MySQL release in its history.
  • Linux Performance 2018 – Brendan Gregg of Netflix will summarize recent performance features to help users get the most out of their Linux systems, whether they are databases or application servers. Topics include the KPTI patches for Meltdown, eBPF for performance observability, Kyber for disk I/O scheduling, BBR for TCP congestion control, and more.

Wednesday, April 25, 2018

  • Panel Discussion: Database Evolution in the Cloud – An expert panel of industry leaders, including Lixun Peng of Alibaba, Sunil Kamath of Microsoft, and Baron Schwartz of VividCortex, will discuss the rapid changes occurring with databases deployed in the cloud and what that means for the future of databases, management and monitoring and the role of the DBA and developer.
  • Future Perfect: The New Shape of the Data Tier – Baron Schwartz of VividCortex will discuss the impact of macro trends such as cloud computing, microservices, containerization, and serverless applications. He will explore where these trends are headed, touching on topics such as whether we are about to see basic administrative tasks become more automated, the role of open source and free software, and whether databases as we know them today are headed for extinction.
  • MongoDB at Upwork – Scott Simpson of Upwork, the largest freelancing website for connecting clients and freelancers, will discuss how MongoDB is used at Upwork, how the company chose the database, and how Percona helps make the company successful.

We will also present the Percona Live 2018 Community Awards and Lightning Talks on Monday, April 23, 2018, during the Opening Night Reception. Don’t miss the first day of tutorials and Opening Night Reception!

Register for the conference on the Percona Live Open Source Database Conference 2018 website.

Sponsorships

Limited Sponsorship opportunities for Percona Live 2018 Open Source Database Conference are still available, and offer the opportunity to interact with the DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solution vendors, and entrepreneurs who typically attend the event. Contact live@percona.com for sponsorship details.

  • Diamond Sponsors – Percona, VividCortex
  • Platinum – Alibaba Cloud, Microsoft
  • Gold Sponsors – Facebook, Grafana
  • Bronze Sponsors – Altinity, BlazingDB, Box, Dynimize, ObjectRocket, Pingcap, Shannon Systems, SolarWinds, TimescaleDB, TwinDB, Yelp
  • Contributing Sponsors – cPanel, Github, Google Cloud, NaviCat
  • Media Sponsors – Database Trends & Applications, Datanami, EnterpriseTech, HPCWire, ODBMS.org, Packt

The post Calling All Polyglots: Percona Live 2018 Keynote Schedule Now Available! appeared first on Percona Database Performance Blog.

Apr
03
2018
--

Leveraging ProxySQL with AWS Aurora to Improve Performance, Or How ProxySQL Out-performs Native Aurora Cluster Endpoints

ProxySQL with AWS Aurora

In this blog post, I’ll look at how you can use ProxySQL with AWS Aurora to further leverage database performance.

My previous article described how easy is to replace the native Aurora connector with ProxySQL. In this article, you will see WHY you should do that.

It is important to understand that aside from the basic optimization in the connectivity and connection management, ProxySQL also provides you with a new set of features that currently are not available in Aurora.

Just think:

  • Better caching
  • Query filtering
  • Sharding
  • Query substitution
  • Firewalling
  • … and more

We will cover areas like scalability, security and performance. In short, I think is more than worth it to spend some time and give ProxySQL with AWS Aurora a try.

The tests

I will show you the results from two different kinds of tests. One is sysbench-oriented, the other simulates a more complex application using Java, data object utilization and a Hikari connection pool in the middle as well. 

For the EC2 and Aurora platform I used:

  • Application/ProxySQL T2.xlarge eu-central-1a
  • 2 Aurora MySQL 5.7.12 db.t2.medium eu-central-1a
  • 1 Aurora MySQL 5.7.12 db.t2.medium eu-central-1b for AZ redundancy

The code for the application is available here, and for sysbench tests here. All the data and configurations for the application are available here.

I ran three tests using both bench apps, obviously with Aurora as it comes and with ProxySQL. For the ProxySQL configuration see my previous article.
The tests were read_only / Write_only / read_write.

For Aurora, I only increased the number of connections and kept the how it comes out of the box approach. Note each test was run at least three times at different moments of the day, and on a different day. The data reported as final is the BEST performing result for each one.

The Results

For the impatient among us, here is a summary table of the tests:

Sysbench:

Java App:


Now if this is enough for you, you can go to the conclusion and start to use ProxySQL with AWS Aurora. If you would like to know a bit more, continue reading.

Aside from any discussion on the benchmark tool and settings, I really focused on identifying the differences between the two “connectors”. Given the layer below was exactly the same, any difference is due to the simple substitution of the endpoint.

Sysbench

Read Only

The first image reports the number of events achieved at the time of the test. It is quite clear that when using ProxySQL, sysbench ran more events.

In this graph, higher is better:

In this graph, lower is better:

As we can see, the latency when using an Aurora cluster entry point is higher. True, we are talking about milliseconds, but it is not just the value that matters, but also the distribution:

Aurora cluster endpoint ProxySQL
   

An image is worth a thousand words!

We can see, the behavior stays constant in analyzing the READS executed, with ProxySQL performing better.

In this graph, higher is better:


In this graph, higher is better:

Closing with the number of total queries performed, in which ProxySQL surpassed the Cluster endpoint by ~ 4K queries.

Write Only

For writing, things go a bit different. We see that all lines intersect, and the values are very close one to the other. I will let the images speak for themselves:

In this graph, higher is better:

In this graph, lower is better:

Latency spiked in each ProxySQL test, and it may require additional investigation and tuning.  

In this graph, higher is better:

While the rates of writes/sec intersect with each other frequently, in the end ProxySQL resulted in more writes than the native endpoint.

In this graph, higher is better:

In the end, a difference exists and is consistent across the different test iterations, but is minimal. We are talking a range of 25 – 50 entries in total. This result is not surprising, and it will be clear why later in the article.  

Read and Write

As expected in the read and write test, we see a different situation. ProxySQL is still performing better than the default entry point, but not by such a big margin as in read-only tests.

In this graph, higher is better:

In this graph, lower is better

Latency and events are following the expected trend, where read operations are executed more efficiently with ProxySQL and writes are close, but NOT the same as in the write only test.

As a result, the number of queries in ProxySQL is approximately 13% better than the default entry point.

Java Application Tests

What about the Java application? First of all, we need to remember that the application used a connection pool mechanism (Hikari), and the connection pool was present in all cases (for both Aurora cluster endpoint or ProxySQL). Given that a small delay in establishing the first connection was expected, you can easily see this in the MAX value of the connection latency.    

In this graph, lower is better.

The connection latency reported here is expressed in nanoseconds and is the measure of the time taken by the connection provider to return an active connection to the application from the moment the application requested it. In other words, how long the HikariCP is taking to choose/check/return an open connection. As you can see, the MAX value is drastically higher, and this was expected since it is the connection initialization. While not really interesting in terms of performance, this value is interesting because it gives us the dimension of the cost in the CP to open a new connection, which in the worse case is 25 milliseconds.

As the graphs show, ProxySQL manages both cases (first call, reassignment) more efficiently.

In this graph, higher is better.

In the CRUD summary table, we can see the number of SQL commands executed per second for each CRUD action and for each test. Once more we can see that when using ProxySQL, the application performed much better and significantly executed more operations (especially in the R/W test).

In this graph, higher is better.

This graph represents the total number of events run at the time of the test. An event is a full application cycle, which sees the application generate the data needed to fill the SQL (no matter if it is for read/write), create the SQL, request the connection, push the SQL, get and read the resultset returned and give back the connection.

Once more, ProxySQL shows better performance.

In this graph, lower is better.

The execution time reported in this graph is the time taken by the application to run a whole event.

This is it, execution time is the time of a full cycle. The faster the cycle is executed, the better the application is performing. The time is express in milliseconds and it goes from a very fast read, which probably accesses the cache in Aurora, to almost two seconds taken to insert a batch of rows.

Needless to say, the tests using ProxySQL performed better.

But Why?

Why do the tests using ProxySQL perform better? After all, it is just an additional step in the middle, which also has a cost in intercepting the queries and managing the connections. So why the better performance?

The answer is simple and can be found in the Aurora manual: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Overview.html#Aurora.Overview.Endpoints.

The Cluster endpoint is an endpoint for an Aurora DB cluster that connects to the current primary instance for that DB cluster. Each Aurora DB cluster has a cluster endpoint and one primary instance.

That endpoint receives the read and write request and sends them to the same instance.The main use for it is to perform failover if needed.

At the same time, the Reader endpoint is an endpoint for an Aurora DB cluster that connects to one of the available Aurora Replicas for that DB cluster. Each Aurora DB cluster has a reader endpoint. If there is more than one Aurora Replica, the reader endpoint directs each connection request to one of the Aurora Replicas.

The reader endpoint only load balances connections to available Aurora Replicas in an Aurora DB cluster. It does not load balance specific queries. If you want to load balance queries to distribute the read workload for a DB cluster, you need to manage that in your application and use instance endpoints to connect directly to Aurora Replicas to balance the load.

This means that to perform a Read/Write split, your application must manage two entry points and you will NOT have much control over how the queries are handled or to which replica instance they are directed. This could lead to unexpected results and delays.

Needless to say, ProxySQL does all that by default (as described in my previous article).

Now that we’ve clarified how Aurora entry points behave, let’s see about the performance difference.

How do we read this graph? From left to right:

  • read_only test with an Aurora cluster endpoint
  • read_only test with ProxySQL
  • write_only with an Aurora cluster endpoint
  • write_only with ProxySQL
  • read and write with an Aurora cluster endpoint
  • read and write with ProxySQL

Here we go! As we can see, the tests with ProxySQL used the two configured instances, splitting R/W without the need to do anything on the application side. I purposely avoided the AZ replica because I previously identified it as having higher latency, so I can exclude it and use it ONLY in the case of an emergency.

The effects are clear in the next graph.

When using the cluster endpoint, given all the load was on a single instance, the CPU utilization is higher and that became a bottleneck. When using ProxySQL, the load is spread across the different instances, allowing real read scalability. This has immediate benefits in read and read/write operations, allowing better load distribution that results in better performance.

Conclusions

Aurora is a very interesting technology and can be a very good solution for read scaling. But at the moment, the way AWS offers data connectivity with the Cluster endpoints and Reader endpoints can negatively affect performance.

The lack of configuration and the limitation of using different endpoints lead to confusion and less optimized utilization.

The introduction of ProxySQL, which now supports (from version 2) Aurora, allows an architect, SA or DBA to properly configure the environment. You can very granularly choose how to use each instance, without the need to have the application modify how it works. This helps keep the data layer solution separate from the application layer.

Even better, this additional set of flexibility does not come with a cost. On the contrary, it improves resource utilization and brings higher performance using less powerful instances. Given the cost of Aurora, this is not a secondary benefit.   

I suggest you try installing ProxySQL v2 (or higher) in front of your Aurora cluster. If you don’t feel confident and prefer to have us help you, contact us and we will be more than happy to support you!

The post Leveraging ProxySQL with AWS Aurora to Improve Performance, Or How ProxySQL Out-performs Native Aurora Cluster Endpoints appeared first on Percona Database Performance Blog.

Apr
03
2018
--

How to Implement ProxySQL with AWS Aurora

ProxySQL with AWS Aurora

ProxySQL with AWS AuroraIn this post, we’ll look at how to implement ProxySQL with AWS Aurora.

Recently, there have been a few discussions and customer requests that focused on AWS Aurora and how to make the various architectures and solutions more flexible.

Flexible how, you may ask? Well, there are the usual expectations:

  • How do you improve resource utilization?
  • How can I filter (or block) things?
  • Can I shard with Aurora?
  • What is the best way to implement query caching?
  • … and more.

The inclusion of ProxySQL solves many of the points above. We in Consulting design the solutions for our customers by applying the different functionalities to better match customers needs. Whenever we deal with Aurora, we’ve had to exclude ProxySQL because of some limitations in the software.

Now, however, ProxySQL 2.0 supports Aurora, and it does it amazingly well.

This article shows you how to implement ProxySQL with AWS Aurora. The the next article Leveraging ProxySQL with AWS Aurora to Improve Performance will show you WHY.

The Problem

ProxySQL has two different ways to deal with backend servers. One is using replication mechanisms, like standard Async replication and Group Replication. The other is to use the scheduler, as in the case of Percona XtraDB Cluster, MariaDB Cluster, etc.

While we can use the scheduler as a solution for Aurora, it is not as immediate and well-integrated as the embedded support for replication, given that we normally opted not to use it in this specific case (Aurora).

But what WAS the problem with Aurora? An Aurora cluster bases its definition of Writer vs. Readers using the innodb_read_only variable. So, where is the problem? Well actually no problem at all, just that ProxySQL up to version 2 for replication only supported the generic variable READ_ONLY. As such, it was not able to correctly identify the Writer/Readers set.

The Solution

In October 2017, this issue was opened (https://github.com/sysown/proxysql/issues/1195 )and the result was, as usual, a quite simple and flexible solution.

Brainstorming, a possible solution could be to add another column in mysql_replication_hostgroups to specify what needs to be checked, either read_only or innodb_read_only, or even super_read_only

This lead to the ProxySQL team delivering (“commit fe2f16d6df15252f0107a6a224dad7b1efdb13f6”):

Added support for innodb_read_only and super_read_only  

MYHGM_MYSQL_REPLICATION_HOSTGROUPS "CREATE TABLE mysql_replication_hostgroups
(writer_hostgroup INT CHECK (writer_hostgroup>=0) NOT NULL PRIMARY KEY ,
reader_hostgroup INT NOT NULL CHECK (reader_hostgroup<>writer_hostgroup AND reader_hostgroup>=0) ,
check_type VARCHAR CHECK (LOWER(check_type) IN ('read_only','innodb_read_only','super_read_only')) NOT NULL DEFAULT 'read_only' ,
comment VARCHAR NOT NULL DEFAULT '' , UNIQUE (reader_hostgroup))"

Which in short means they added a new column to the mysql_replication_hostgroup table. ProxySQL continues to behave exactly the same and manages the servers and the replication groups as usual. No need for scripts or other crazy stuff.

Implementation

Here we are, the HOW TO part. The first thing to keep in mind is that when you implement a new Aurora cluster, you should always consider having at least two instances in the same AZ and another instance in a remote AZ.

To implement ProxySQL, you should refer directly to the instances, NOT to the cluster entry-point. To be clear, you must take this for each instance:

The information is available in the web-admin interface, under the instance or using the command:

aws rds describe-db-instances

And filter the result for:

"Endpoint": {
                "Port": 3306,
                "Address": "proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com"
            },

To run ProxySQL with RDS in general, you need to install it on an intermediate server or on the application box.

Once you decide which one fits your setup better, you must download or git clone ProxySQL v2.0+.

DO NOT use v1.4.x, as it does not contain these new features and will not work as expected.

Once you have all the Aurora instances up, it is time to configure ProxySQL. Below is an example of all the commands used during the installation:

grant usage, replication client on *.* to monitor@'%' identified by 'monitor';
delete from mysql_servers where hostgroup_id in (70,71);
delete from mysql_replication_hostgroups where writer_hostgroup=70;
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',70,3306,1000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',71,3306,1000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb2.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',71,3306,1000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb-eu-central-1b.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',71,3306,1,2000);
INSERT INTO mysql_replication_hostgroups(writer_hostgroup,reader_hostgroup,comment,check_type) VALUES (70,71,'aws-aurora','innodb_read_only');
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
delete from mysql_query_rules where rule_id in (50,51,52);
insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) values(50,6033,'m8_test',70,0,3,'.',1);
insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) values(51,6033,'m8_test',70,1,3,'^SELECT.*FOR UPDATE',1);
insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) values(52,6033,'m8_test',71,1,3,'^SELECT.*$',1);
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;
delete from mysql_users where username='m8_test';
insert into mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) values ('m8_test','test',1,70,'mysql',1);
LOAD MYSQL USERS TO RUNTIME;SAVE MYSQL USERS TO DISK;
update global_variables set variable_value="67108864" where variable_name='mysql-max_allowed_packet';
update global_variables set Variable_Value=0  where Variable_name='mysql-hostgroup_manager_verbose';
load mysql variables to run;save mysql variables to disk;

The above will give you a ready-to-go ProxySQL setup that supports Aurora cluster, performing all the usual operations ProxySQL does, including proper W/R split and more for a user named ‘m8_test’.

The key is in passing the value ‘innodb_read_only’ for the column check_type in the table mysql_replication_hostgroups.  

To check the status of your ProxySQL, you can use this command (which gives you a snapshot of what is going to happen):

watch -n 1 'mysql --defaults-file=~/.my.cnf -h 127.0.0.1 -P 6032 -t -e "select b.weight, c.* from stats_mysql_connection_pool c left JOIN runtime_mysql_servers b ON  c.hostgroup=b.hostgroup_id and c.srv_host=b.hostname and c.srv_port = b.port where hostgroup in( 50,52,70,71) order by hostgroup,srv_host desc;" -e " select srv_host,command,avg(time_ms), count(ThreadID) from stats_mysql_processlist group by srv_host,command;" -e "select * from stats_mysql_users;";mysql  --defaults-file=~/.my.cnf -h 127.0.0.1 -P 6032  -t -e "select * from stats_mysql_global "|egrep -i  "(mirror|memory|stmt|processor)"'
+--------+-----------+--------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| weight | hostgroup | srv_host                                                                 | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries | Queries_GTID_sync | Bytes_data_sent | Bytes_data_recv | Latency_us |
+--------+-----------+--------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| 1000   | 70        | proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com               | 3306     | ONLINE | 0        | 0        | 0	     | 0       | 0           | 0       | 0                 | 0               | 0               | 5491       |
| 1000   | 71        | proxysqltestdb2.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com              | 3306     | ONLINE | 0        | 5        | 5	     | 0       | 5           | 73      | 0                 | 5483            | 28442           | 881        |
| 1000   | 71        | proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com               | 3306     | ONLINE | 0        | 5        | 5	     | 0       | 5           | 82      | 0                 | 6203            | 32217           | 5491       |
| 1	 | 71        | proxysqltestdb-eu-central-1b.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com | 3306     | ONLINE | 0        | 0        | 0	     | 0       | 0           | 0       | 0                 | 0               | 0               | 1593       |
+--------+-----------+--------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
+----------+----------------------+--------------------------+
| username | frontend_connections | frontend_max_connections |
+----------+----------------------+--------------------------+
| m8_test  | 0                    | 10000                    |
+----------+----------------------+--------------------------+
| Query_Processor_time_nsec    | 0              |
| Com_backend_stmt_prepare     | 0              |
| Com_backend_stmt_execute     | 0              |
| Com_backend_stmt_close       | 0              |
| Com_frontend_stmt_prepare    | 0              |
| Com_frontend_stmt_execute    | 0              |
| Com_frontend_stmt_close      | 0              |
| Mirror_concurrency           | 0              |
| Mirror_queue_length          | 0              |
| SQLite3_memory_bytes         | 2652288        |
| ConnPool_memory_bytes        | 712720         |
| Stmt_Client_Active_Total     | 0              |
| Stmt_Client_Active_Unique    | 0              |
| Stmt_Server_Active_Total     | 0              |
| Stmt_Server_Active_Unique    | 0              |
| Stmt_Max_Stmt_id             | 1              |
| Stmt_Cached                  | 0              |
| Query_Cache_Memory_bytes     | 0              |

At this point, you can connect your application and see how ProxySQL allows you to perform much better than the native cluster entry point.

This will be expanded in the next article: Leverage AWS Aurora performance.

Conclusions

I had my first issue with the native Aurora connector a long time ago, but I had nothing to replace it. ProxySQL is a very good alternative to standard cluster access, with more options/controls and it also allows us to perform close-to-application caching, which is much more efficient than the remote MySQL one (http://www.proxysql.com/blog/scaling-with-proxysql-query-cache).

In the next article I will illustrate how, in a simple setup, ProxySQL can help in achieving better results than using the default Aurora cluster endpoint.

The post How to Implement ProxySQL with AWS Aurora appeared first on Percona Database Performance Blog.

Mar
14
2018
--

Saw Percona at SCaLE 16x? Come See Even More at Percona Live 2018!

Did you see Percona at SCaLE 16x? I spent a couple of days there learning about open source software, databases, and other interesting topics. You can get even more open source database information at Percona Live 2018.

SCaLE is the largest community-run open-source and free software conference in North America. It is held annually in the greater Los Angeles area. This year’s event took place on March 8-11, 2018, at the Pasadena Convention Center. SCaLE 16X hosted 150 exhibitors this year, along with nearly 130 sessions, tutorials and special events.

Percona has been attending now for a number of years, and this year was no exception. Besides our booth in the Exhibit Hall, we had two speakers giving three different talks:

Percona at Scale 16xPeter Zaitsev, Percona CEO and Founder

Using MySQL for Distributed Database Architectures

In modern data architectures, we’re increasingly moving from single node design systems to distributed architectures using multiple nodes – often spread across multiple databases and multiple continents. Such architectures bring many benefits (such as scalability and resiliency), but can also bring a lot of pain if not correctly architected.

In this presentation, we looked into how we can use MySQL to engineer such systems. Firstly, we looked into the application data requirements that can shape which distributed architectures will work for an application, and what are their benefits and tradeoffs. Then we looked into how to implement the architectures with MySQL, using conventional and proven options such as MySQL Replication, as well as newer options such as:

    • MySQL Multi-Source Replication
    • MySQL Group Replication
    • Percona XtraDB Cluster and Galera
    • Application-driven replication using Kafka

Finally, since a common cause of production problems is a misunderstanding of how distributed systems are designed to behave during failure, we examined what can commonly happen to cause architecture scenarios to fail.

Why We’re Excited About MySQL 8.0

There are many great new features in MySQL 8.0, but how exactly can they help your applications? This session took a practical look at MySQL 8.0 features and improvements. We looked at the bugs, issues and limitations of previous MySQL versions and how MySQL 8.0 addresses them. It also covered what you can do with MySQL 8.0 that you couldn’t before.

Percona at Scale 16x 2Colin Charles

Capacity Planning for your Data Stores

Imagine a ticket sales website that does normal events like an M2M concert, but also occasionally sells tickets to the very popular play Harry Potter and the Cursed Child. This is a perfect capacity planning example. Selling tickets requires that you never sell more tickets than you actually have. You want to load-balance your queries, to shard your data stores and split reads and writes. You need to determine where the system bottlenecks, so you need a baseline for your regular traffic. The website must be able to handle the increased load for extremely popular performances, but you don’t want to buy servers that aren’t doing anything for much of the time. (This is also why the cloud is so popular today.)

Colin Charles explored storage capacity planning for OLTP and data warehousing uses and explains how metrics collection helps you plan your requirements. Coupled with the elastic nature of clouds, you should never have an error establishing database connection. Along the way, Colin also covered tools such as Box Anemometer, innotop, the slow query log, Percona Toolkit (pt-query-digest), vmstat, Facebook’s Prophet, and Percona Monitoring and Management (PMM).

Liked SCaLE 16x? Come to Percona Live 2018!

If you attended the SCaLE 16x conference, there was a multitude of excellent talks on many different open source topics. Many of these same speakers, companies, sponsors and attendees will also be at the Percona Live 2018 Open Source Database Conference in Santa Clara, CA, on April 23 – 25, 2018.

Join the open source database community in Santa Clara, California, to learn about the core topics in MySQL, MongoDB and other open source databases. Get briefed on the hottest topics, learn about building and maintaining high-performing deployments and listen to technical experts and top industry leaders. The Percona Live 2018 – Open Source Database Conference is a great event for users of any level exploring open source database technologies.

Some of these speakers and companies attending include:

. . . and many more.

Hurry and register before the event sells out!

Feb
23
2018
--

Webinar Tuesday February 27, 2018: Monitoring Amazon RDS with Percona Monitoring and Management (PMM)

Monitoring Amazon RDS

Monitoring Amazon RDSPlease join Percona’s Build / Release Engineer, Mykola Marzhan, as he presents Monitoring Amazon RDS with Percona Monitoring and Management on February 27, 2018, at 7:00 am PDT (UTC-8) / 10:00 am EDT (UTC-5).


Are you concerned about how you are monitoring your AWS environment? Keeping track of what is happening in your Amazon RDS deployment is key to guaranteeing the performance and availability of your database for your critical applications and services.

Did you know that Percona Monitoring and Management (PMM) ships with support for MySQL on Amazon RDS and Amazon Aurora out of the box? It does!

Percona Monitoring and Management (PMM) is a free and open-source platform for managing and monitoring MySQL, Percona Server for MySQL MariaDB, MongoDB, Percona Server for MongoDB performance both on-premise and in the cloud.

In this session we’ll discuss:

  • Configuring PMM (metrics and queries) against Amazon RDS MySQL and Amazon Aurora using an EC2 instance
  • Configuring PMM against CloudWatch metrics
  • Setting configuration parameters for AWS for maximum PMM visibility

Register for the webinar now.

mykolaMykola Marzhan, Release Engineer

Mykola joined Percona in 2016 as a release engineer. He has been developing monitoring systems since 2004, and has been working as Release Engineer/Release Manager/DevOps for ten years. Recently, Mykola achieved an AWS Certified Solutions Architect (Professional) authentication.

 

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com