May
11
2018
--

This Week in Data with Colin Charles 39: a valuable time spent at rootconf.in

Colin Charles

Colin CharlesJoin Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

rootconf.in 2018 just ended, and it was very enjoyable to be in Bangalore for the conference. The audience was large, the conversations were great, and overall I think this is a rather important conference if you’re into the “DevOps” movement (or are a sysadmin!). From the data store world, Oracle MySQL was a sponsor, as was MyDBOPS (blog), and Elastic. There were plenty more, including Digital Ocean/GoJek/Walmart Labs — many MySQL users.

I took a handful of pictures with people, and here are some of the MyDBOPS team and myself.  They have over 20 employees, and serve the Indian market at rates that would be more palatable than straight up USD rates. Traveling through Asia, many businesses always do find local partners and offer local pricing; this really becomes more complex in the SaaS space (everyone pays the same rate generally) and also the services space.

Colin at Rootconf with Oracle
Some of the Oracle MySQL team who were exhibiting were very happy they got a good amount of traffic to the booth based on stuff discussed at the talk and BOF.

From a talk standpoint, I did a keynote for an hour and also a BoF session for another hour (great discussion, lots of blog post ideas from there), and we had a Q&A session for about 15 minutes. There were plenty of good conversations in the hallway track.

A quick observation that I notice happens everywhere: many people don’t realize features that have existed in MySQL since 5.6/5.7.  So they are truly surprised with stuff in 8.0 as well. It is clear there is a huge market that would thrive around education. Not just around feature checklists, but also around how to use features. Sometimes, this feels like the MySQL of the mid-2000’s — getting apps to also use new features, would be a great thing.

Releases

This seems to have been a quiet week on the releases front.

Are you a user of Amazon Aurora MySQL? There is now the Amazon Aurora Backtrack feature, which allows you to go back in time. It is described to work as:

Aurora uses a distributed, log-structured storage system (read Design Considerations for High Throughput Cloud-Native Relational Databases to learn a lot more); each change to your database generates a new log record, identified by a Log Sequence Number (LSN). Enabling the backtrack feature provisions a FIFO buffer in the cluster for storage of LSNs. This allows for quick access and recovery times measured in seconds.

Link List

Upcoming appearances

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

 

The post This Week in Data with Colin Charles 39: a valuable time spent at rootconf.in appeared first on Percona Database Performance Blog.

May
08
2018
--

How to Enable Amazon RDS Remote Access

Amazon RDS remote access

It’s easy to enable Amazon RDS remote access when launching an Amazon RDS instance, but there can be many issues. I created this blog as a guide describing the various issues/configurations we might encounter.

As the first step, we need to select a VPC where we will launch our Amazon RDS instance. The default VPC has all the required settings to make the instance remotely available; we just have to enable it by selecting “Yes” at Public accessibility.

For this example, we used the Default VPC and asked AWS to create a new security group.

Once the instance is created, we can connect to the “Endpoint” address:

[root@server1 ~]# mysql -h publicdb.cbnuzwwzlcf1.eu-west-3.rds.amazonaws.com -u dbuser -p
Enter password: XXXXXX
mysql> s
--------------
mysql  Ver 14.14 Distrib 5.7.19-17, for Linux (x86_64) using  6.2
Connection id: 14
Current database:
Current user: dbuser@server1.hostname.com
SSL: Cipher in use is AES256-SHA
Current pager: stdout
Using outfile: ''
Using delimiter: ;
Server version: 5.6.37 MySQL Community Server (GPL)
Protocol version: 10
Connection: publicdb.cbnuzwwzlcf1.eu-west-3.rds.amazonaws.com via TCP/IP
Server characterset: latin1
Db     characterset: latin1
Client characterset: utf8
Conn.  characterset: utf8
TCP port: 3306
Uptime: 1 min 56 sec
Threads: 2  Questions: 9986  Slow queries: 0 Opens: 319  Flush tables: 1 Open tables: 80  Queries per second avg: 86.086
--------------
mysql>

When AWS creates the security group after we select the option to make it publicly accessible, it appears that AWS takes care of everything. But what if we check the created security groups?

It created a rule to enable incoming traffic, as security group works as a whitelist (it denies everything except the matching rules). 

As we can see here, AWS only created the inbound rule for my current IP address, which means once we change IPs or try to connect from another server, it will fail. To get around that, we need to add another rule:

Adding the 0.0.0.0/0 rule opens the port for the world. This is dangerous! Since anyone can try connecting, it’s much better if we can supply a list of IPs or ranges we want enabled for remote access, even from outside of AWS.

Running remotely accessible RDS in custom VPC

To run RDS in a new VPC or in an existing VPC, we need to ensure a couple of things. 

The VPC needs to have at least two subnets. We believe this is something Amazon asks so that the VPC is ready if you choose to move to a Multi-AZ master, or to simply spread the read-only instances across multiple AZ for higher availability.

If you want to make the RDS cluster remotely available, we need to attach an IGW (Internet Gateway) to the VPC. If you don’t, it isn’t able to communicate with the outside world. To do that, go to VPC -> Internet gateways and hit “Create Internet Gateway”:

Once it’s created, select “Attach to VPC” and select your VPC. 

Still, you won’t be able to reach the internet as we need to add route towards the newly attached internet gateway. 

To do that, go to “Route Tables” and select our VPC, and add the following route (0.0.0.0/0 means it’s going to be the default gateway, and all non-internal traffic needs to be routed towards it):


Hit Save. Now the VPC has Internet access, just like AWS’s Default VPC.

The post How to Enable Amazon RDS Remote Access appeared first on Percona Database Performance Blog.

Apr
20
2018
--

Percona Monitoring and Management (PMM) 1.10.0 Is Now Available

Percona Monitoring and Management

Percona Monitoring and Management (PMM) is a free and open-source platform for managing and monitoring MySQL® and MongoDB® performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL® and MongoDB® servers to ensure that your data works as efficiently as possible.

Percona Monitoring and ManagementWe focused mainly on two features in 1.10.0, but there are also several notable improvements worth highlighting:

  • Annotations – Record and display Application Events as Annotations using pmm-admin annotate
  • Grafana 5.0 – Improved visualization effects
  • Switching between Dashboards – Restored functionality to preserve host when switching dashboards
  • New Percona XtraDB Cluster Overview graphs – Added Galera Replication Latency graphs on Percona XtraDB Cluster Overview dashboard with consistent colors

The issues in the release include four new features & improvements, and eight bugs fixed.

Annotations

Application events are one of the contributors to changes in database performance characteristics, and in this release PMM now supports receiving events and displaying them as Annotations using the new command pmm-admin annotate. A recent Percona survey reveals that Database and DevOps Engineers highly value visibility into the Application layer.  By displaying Application Events on top of your PMM graphs, Engineers can now correlate Application Events (common cases: Application Deploys, Outages, and Upgrades) against Database and System level metric changes.

Usage

For example, you have just completed an Application deployment to version 1.2, which is relevant to UI only, so you want to set tags for the version and interface impacted:

pmm-admin annotate "Application deploy v1.2" --tags "UI, v1.2"

Using the optional --tags allows you to filter which Annotations are displayed on the dashboard via a toggle option.  Read more about Annotations utilization in the Documentation.

Grafana 5.0

We’re extremely pleased to see Grafana ship 5.0 and we were fortunate enough to be at Grafanacon, including Percona’s very own Dimitri Vanoverbeke (Dim0) who presented What we Learned Integrating Grafana and Prometheus!

 

 

Included in Grafana 5.0 are a number of dramatic improvements, which in future Percona Monitoring and Management releases we plan to extend our usage of each feature, and the one we like best is the virtually unlimited way you can size and shape graphs.  No longer are you bound by panel constraints to keep all objects at the same fixed height!  This improvement indirectly addresses the visualization error in PMM Server where some graphs would appear to be on two lines and ended up wasting screen space.

Switching between Dashboards

PMM now allows you to navigate between dashboards while maintaining the same host under observation, so that for example you can start on MySQL Overview looking at host serverA, switch to MySQL InnoDB Advanced dashboard and continue looking at serverA, thus saving you a few clicks in the interface.

New Percona XtraDB Cluster Galera Replication Latency Graphs

We have added new Percona XtraDB Cluster Replication Latency graphs on our Percona XtraDB Cluster Galera Cluster Overview dashboard so that you can compare latency across all members in a cluster in one view.

Issues in this release

New Features & Improvements

  • PMM-2330Application Annotations DOC Update
  • PMM-2332Grafana 5 DOC Update
  • PMM-2293Add Galera Replication Latency Graph to Dashboard PXC/Galera Cluster Overview RC Ready
  • PMM-2295Improve color selection on Dashboard PXC/Galera Cluster Overview RC Ready

Bugs fixed

  • PMM-2311Fix misalignment in Query Analytics Metrics table RC Ready
  • PMM-2341Typo in text on password page of OVF RC Ready
  • PMM-2359Trim leading and trailing whitespaces for all fields on AWS/OVF Installation wizard RC Ready
  • PMM-2360Include a “What’s new?” link for Update widget RC Ready
  • PMM-2346Arithmetic on InnoDB AHI Graphs are invalid DOC Update
  • PMM-2364QPS are wrong in QAN RC Ready
  • PMM-2388Query Analytics does not render fingerprint section in some cases DOC Update
  • PMM-2371Pass host when switching between Dashboards

How to get PMM

PMM is available for installation using three methods:

Help us improve our software quality by reporting any Percona Monitoring and Management bugs you encounter using our bug tracking system.

The post Percona Monitoring and Management (PMM) 1.10.0 Is Now Available appeared first on Percona Database Performance Blog.

Apr
16
2018
--

Webinar Tuesday April 17, 2018: Which Amazon Cloud Technology Should You Chose? RDS? Aurora? Roll Your Own?

Amazon Cloud Technology

Amazon Cloud TechnologyPlease join Percona’s Senior Technical Operations Engineer, Daniel Kowalewski as he presents Which Amazon Cloud Technology Should You Chose? RDS? Aurora? Roll Your Own? on Tuesday, April 17, 2018, at 10:00 am PDT (UTC-7) / 1:00 pm EDT (UTC-4).

Are you running on Amazon, or planning to migrate there? In this talk, we are going to cover the different technologies for running databases on Amazon Cloud environments.

We will focus on the operational aspects, benefits and limitations for each of them.

Register for the webinar now.

Amazon Cloud TechnologyDaniel Kowalewski, Senior Technical Operations Engineer

Daniel joined Percona in August of 2015. Previously, he earned a B.S. in Computer Science from the University of Colorado in 2006 and was a DBA there until he joined Percona. In addition to MySQL, Daniel also has experience with Oracle and Microsoft SQL Server, but he much prefers to stay in the MySQL world. Daniel lives near Denver, CO with his wife, two-year-old son, and dog. If you can’t reach him, he’s probably in the mountains hiking, camping, or trying to get lost.

The post Webinar Tuesday April 17, 2018: Which Amazon Cloud Technology Should You Chose? RDS? Aurora? Roll Your Own? appeared first on Percona Database Performance Blog.

Apr
11
2018
--

Calling All Polyglots: Percona Live 2018 Keynote Schedule Now Available!

Percona Live 2018 Keynotes

Percona Live 2018 KeynotesWe’ve posted the Percona Live 2018 keynote addresses for the seventh annual Percona Live Open Source Database Conference 2018, taking place April 23-25, 2018 at the Santa Clara Convention Center in Santa Clara, CA. 

This year’s keynotes explore topics ranging from how cloud and open source database adoption accelerates business growth, to leading-edge emerging technologies, to the importance of MySQL 8.0, to the growing popularity of PostgreSQL.

We’re excited by the great lineup of speakers, including our friends at Alibaba Cloud, Grafana, Microsoft, Oracle, Upwork and VividCortex, the innovative leaders on the Cool Technologies panel, and Brendan Gregg from Netflix, who will discuss how to get the most out of your database on a Linux OS, using his experiences at Netflix to highlight examples.  

With the theme of “Championing Open Source Databases,” the conference will feature multiple tracks, including MySQL, MongoDB, Cloud, PostgreSQL, Containers and Automation, Monitoring and Ops, and Database Security. Once again, Percona will be offering a low-cost database 101 track for beginning users who want to learn how to use and operate open source databases.

The Percona Live 2018 keynotes include:

Tuesday, April 24, 2018

  • Open Source for the Modern Business – Peter Zaitsev of Percona will discuss how open source database adoption continues to grow in enterprise organizations, the expectations and definitions of what constitutes success continue to change. A single technology for everything is no longer an option; welcome to the polyglot world. The talk will include several compelling open source projects and trends of interest to the open source database community and will be followed by a round of lightning talks taking a closer look at some of those projects.
  • Cool Technologies Showcase – Four industry leaders will introduce key emerging industry developments. Andy Pavlo of Carnegie Mellon University will discuss the requirements for enabling autonomous database optimizations. Nikolay Samokhvalov of PostgreSQL.org will discuss new PostgreSQL tools. Sugu Sougoumarane of PlanetScale Data will explore how Vitess became a high-performance, scalable and available MySQL clustering cloud solution in line with today’s NewSQL storage systems. Shuhao Wu of Shopify explains how to use Ghostferry as a data migration tool for incompatible cloud platforms.
  • State of the Dolphin 8.0 – Tomas Ulin of Oracle will discuss the focus, strategy, investments and innovations that are evolving MySQL to power next-generation web, mobile, cloud and embedded applications – and why MySQL 8.0 is the most significant MySQL release in its history.
  • Linux Performance 2018 – Brendan Gregg of Netflix will summarize recent performance features to help users get the most out of their Linux systems, whether they are databases or application servers. Topics include the KPTI patches for Meltdown, eBPF for performance observability, Kyber for disk I/O scheduling, BBR for TCP congestion control, and more.

Wednesday, April 25, 2018

  • Panel Discussion: Database Evolution in the Cloud – An expert panel of industry leaders, including Lixun Peng of Alibaba, Sunil Kamath of Microsoft, and Baron Schwartz of VividCortex, will discuss the rapid changes occurring with databases deployed in the cloud and what that means for the future of databases, management and monitoring and the role of the DBA and developer.
  • Future Perfect: The New Shape of the Data Tier – Baron Schwartz of VividCortex will discuss the impact of macro trends such as cloud computing, microservices, containerization, and serverless applications. He will explore where these trends are headed, touching on topics such as whether we are about to see basic administrative tasks become more automated, the role of open source and free software, and whether databases as we know them today are headed for extinction.
  • MongoDB at Upwork – Scott Simpson of Upwork, the largest freelancing website for connecting clients and freelancers, will discuss how MongoDB is used at Upwork, how the company chose the database, and how Percona helps make the company successful.

We will also present the Percona Live 2018 Community Awards and Lightning Talks on Monday, April 23, 2018, during the Opening Night Reception. Don’t miss the first day of tutorials and Opening Night Reception!

Register for the conference on the Percona Live Open Source Database Conference 2018 website.

Sponsorships

Limited Sponsorship opportunities for Percona Live 2018 Open Source Database Conference are still available, and offer the opportunity to interact with the DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solution vendors, and entrepreneurs who typically attend the event. Contact live@percona.com for sponsorship details.

  • Diamond Sponsors – Percona, VividCortex
  • Platinum – Alibaba Cloud, Microsoft
  • Gold Sponsors – Facebook, Grafana
  • Bronze Sponsors – Altinity, BlazingDB, Box, Dynimize, ObjectRocket, Pingcap, Shannon Systems, SolarWinds, TimescaleDB, TwinDB, Yelp
  • Contributing Sponsors – cPanel, Github, Google Cloud, NaviCat
  • Media Sponsors – Database Trends & Applications, Datanami, EnterpriseTech, HPCWire, ODBMS.org, Packt

The post Calling All Polyglots: Percona Live 2018 Keynote Schedule Now Available! appeared first on Percona Database Performance Blog.

Apr
04
2018
--

Percona Monitoring and Management 1.9.0 Is Now Available

Percona Monitoring and Management

PMM (Percona Monitoring and Management) is a free and open-source platform for managing and monitoring MySQL® and MongoDB® performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL® and MongoDB® servers to ensure that your data works as efficiently as possible.

Percona Monitoring and ManagementThere are a number of significant updates in 1.9.0 that we hope you will like, some of the key highlights include:

  • Faster loading of the index page: We have enabled performance optimizations using gzip and HTTP2.
  • AWS improvements: We have added metrics from CloudWatch RDS to 6 dashboards, as well as changed our AWS add instance workflow, and made some changes to credentials handling.
  • Percona Snapshot Server: If you are a Percona customer you can now securely share your dashboards with Percona Engineers.
  • Exporting PMM Server logs: Retrieve logs from PMM Server for troubleshooting using single button-click, avoiding the need to log in manually to the docker container.
  • Low RAM support: We have reduced the memory requirement so PMM Server will run on systems with 512MB
  • Dashboard improvements: We have changed MongoDB instance identification for MongoDB graphs, and set maximum graph Y-axis on Prometheus Exporter Status dashboard

AWS Improvements

CloudWatch RDS metrics

Since we are already consuming Amazon Cloudwatch metrics and persisting them in Prometheus, we have improved six node-specific dashboards to now display Amazon RDS node-level metrics:

  • Cross_Server (Network Traffic)
  • Disk Performance (Disk Latency)
  • Home Dashboard (Network IO)
  • MySQL Overview (Disk Latency, Network traffic)
  • Summary Dashboard (Network Traffic)
  • System Overview (Network Traffic)

AWS Add Instance changes

We have changed our AWS add instance interface and workflow to be more clear on information needed to add an Amazon Aurora MySQL or Amazon RDS MySQL instance. We have provided some clarity on how to locate your AWS credentials.

AWS Settings

We have improved our documentation to highlight connectivity best practices, and authentication options – IAM Role or IAM User Access Key.

Enabling Enhanced Monitoring

Credentials Screen

Low RAM Support

You can now run PMM Server on instances with memory as low as 512MB RAM, which means you can deploy to the free tier of many cloud providers if you want to experiment with PMM. Our memory calculation is now:

METRICS_MEMORY_MULTIPLIED=$(( (${MEMORY_AVAIABLE} - 256*1024*1024) / 100 * 40 ))
if [[ $METRICS_MEMORY_MULTIPLIED < $((128*1024*1024)) ]]; then
    METRICS_MEMORY_MULTIPLIED=$((128*1024*1024))
fi

Percona Snapshot Server

Snapshots are a way of sharing PMM dashboards via a link to individuals who do not normally have access to your PMM Server. If you are a Percona customer you can now securely share your dashboards with Percona Engineers. We have replaced the button for sharing to the Grafana publicly hosted platform onto one administered by Percona. Your dashboard will be written to Percona snapshots and only Percona Engineers will be able to retrieve the data. We will be expiring old snapshots automatically at 90 days, but when sharing you will have the option to configure a shorter retention period.

Export of PMM Server Logs

In this release, the logs from PMM Server can be exported using single button-click, avoiding the need to log in manually to the docker container. This simplifies the troubleshooting process of a PMM Server, and especially for Percona customers, this feature will provide a more consistent data gathering task that you will perform on behalf of requests from Percona Engineers.

Faster Loading of the Index Page

In version 1.8.0, the index page was redesigned to reveal more useful information about the performance of your hosts as well an immediate access to essential components of PMM, however the index page had to load much data dynamically resulting in a noticeably longer load time. In this release we enabled gzip and HTTP2 to improve the load time of the index page. The following screenshots demonstrate the results of our tests on webpagetest.org where we reduce page load time by half. We will continue to look for opportunities to improve the performance of the index page and expect that when we upgrade to Prometheus 2 we will see another improvement.

The load time of the index page of PMM version 1.8.0

The load time of the index page of PMM version 1.9.0

Issues in this release¶

New Features

  • PMM-781: Plot new PXC 5.7.17, 5.7.18 status variables on new graphs for PXC Galera, PXC Overview dashboards
  • PMM-1274: Export PMM Server logs as zip file to the browser
  • PMM-2058: Percona Snapshot Server

Improvements

  • PMM-1587: Use mongodb_up variable for the MongoDB Overview dashboard to identify if a host is MongoDB.
  • PMM-1788: AWS Credentials form changes
  • PMM-1823: AWS Install wizard improvements
  • PMM-2010: System dashboards update to be compatible with RDS nodes
  • PMM-2118: Update grafana config for metric series that will not go above 1.0
  • PMM-2215: PMM Web speed improvements
  • PMM-2216: PMM can now be started on systems without memory limit capabilities in the kernel
  • PMM-2217: PMM Server can now run in Docker with 512 Mb memory
  • PMM-2252: Better handling of variables in the navigation menu

Bug fixes

  • PMM-605: pt-mysql-summary requires additional configuration
  • PMM-941: ParseSocketFromNetstat finds an incorrect socket
  • PMM-948: Wrong load reported by QAN due to mis-alignment of time intervals
  • PMM-1486: MySQL passwords containing the dollar sign ($) were not processed properly.
  • PMM-1905: In QAN, the Explain command could fail in some cases.
  • PMM-2090: Minor formatting issues in QAN
  • PMM-2214: Setting Send real query examples for Query Analytic OFF still shows the real query in example.
  • PMM-2221: no Rate of Scrapes for MySQL & MySQL Errors
  • PMM-2224: Exporter CPU Usage glitches
  • PMM-2227: Auto Refresh for dashboards
  • PMM-2243: Long host names in Grafana dashboards are not displayed correctly
  • PMM-2257: PXC/galera cluster overview Flow control paused time has a percentage glitch
  • PMM-2282: No data is displayed on dashboards for OVA images
  • PMM-2296: The mysql:metrics service will not start on Ubuntu LTS 16.04

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system.

The post Percona Monitoring and Management 1.9.0 Is Now Available appeared first on Percona Database Performance Blog.

Apr
03
2018
--

Leveraging ProxySQL with AWS Aurora to Improve Performance, Or How ProxySQL Out-performs Native Aurora Cluster Endpoints

ProxySQL with AWS Aurora

In this blog post, I’ll look at how you can use ProxySQL with AWS Aurora to further leverage database performance.

My previous article described how easy is to replace the native Aurora connector with ProxySQL. In this article, you will see WHY you should do that.

It is important to understand that aside from the basic optimization in the connectivity and connection management, ProxySQL also provides you with a new set of features that currently are not available in Aurora.

Just think:

  • Better caching
  • Query filtering
  • Sharding
  • Query substitution
  • Firewalling
  • … and more

We will cover areas like scalability, security and performance. In short, I think is more than worth it to spend some time and give ProxySQL with AWS Aurora a try.

The tests

I will show you the results from two different kinds of tests. One is sysbench-oriented, the other simulates a more complex application using Java, data object utilization and a Hikari connection pool in the middle as well. 

For the EC2 and Aurora platform I used:

  • Application/ProxySQL T2.xlarge eu-central-1a
  • 2 Aurora MySQL 5.7.12 db.t2.medium eu-central-1a
  • 1 Aurora MySQL 5.7.12 db.t2.medium eu-central-1b for AZ redundancy

The code for the application is available here, and for sysbench tests here. All the data and configurations for the application are available here.

I ran three tests using both bench apps, obviously with Aurora as it comes and with ProxySQL. For the ProxySQL configuration see my previous article.
The tests were read_only / Write_only / read_write.

For Aurora, I only increased the number of connections and kept the how it comes out of the box approach. Note each test was run at least three times at different moments of the day, and on a different day. The data reported as final is the BEST performing result for each one.

The Results

For the impatient among us, here is a summary table of the tests:

Sysbench:

Java App:


Now if this is enough for you, you can go to the conclusion and start to use ProxySQL with AWS Aurora. If you would like to know a bit more, continue reading.

Aside from any discussion on the benchmark tool and settings, I really focused on identifying the differences between the two “connectors”. Given the layer below was exactly the same, any difference is due to the simple substitution of the endpoint.

Sysbench

Read Only

The first image reports the number of events achieved at the time of the test. It is quite clear that when using ProxySQL, sysbench ran more events.

In this graph, higher is better:

In this graph, lower is better:

As we can see, the latency when using an Aurora cluster entry point is higher. True, we are talking about milliseconds, but it is not just the value that matters, but also the distribution:

Aurora cluster endpoint ProxySQL
   

An image is worth a thousand words!

We can see, the behavior stays constant in analyzing the READS executed, with ProxySQL performing better.

In this graph, higher is better:


In this graph, higher is better:

Closing with the number of total queries performed, in which ProxySQL surpassed the Cluster endpoint by ~ 4K queries.

Write Only

For writing, things go a bit different. We see that all lines intersect, and the values are very close one to the other. I will let the images speak for themselves:

In this graph, higher is better:

In this graph, lower is better:

Latency spiked in each ProxySQL test, and it may require additional investigation and tuning.  

In this graph, higher is better:

While the rates of writes/sec intersect with each other frequently, in the end ProxySQL resulted in more writes than the native endpoint.

In this graph, higher is better:

In the end, a difference exists and is consistent across the different test iterations, but is minimal. We are talking a range of 25 – 50 entries in total. This result is not surprising, and it will be clear why later in the article.  

Read and Write

As expected in the read and write test, we see a different situation. ProxySQL is still performing better than the default entry point, but not by such a big margin as in read-only tests.

In this graph, higher is better:

In this graph, lower is better

Latency and events are following the expected trend, where read operations are executed more efficiently with ProxySQL and writes are close, but NOT the same as in the write only test.

As a result, the number of queries in ProxySQL is approximately 13% better than the default entry point.

Java Application Tests

What about the Java application? First of all, we need to remember that the application used a connection pool mechanism (Hikari), and the connection pool was present in all cases (for both Aurora cluster endpoint or ProxySQL). Given that a small delay in establishing the first connection was expected, you can easily see this in the MAX value of the connection latency.    

In this graph, lower is better.

The connection latency reported here is expressed in nanoseconds and is the measure of the time taken by the connection provider to return an active connection to the application from the moment the application requested it. In other words, how long the HikariCP is taking to choose/check/return an open connection. As you can see, the MAX value is drastically higher, and this was expected since it is the connection initialization. While not really interesting in terms of performance, this value is interesting because it gives us the dimension of the cost in the CP to open a new connection, which in the worse case is 25 milliseconds.

As the graphs show, ProxySQL manages both cases (first call, reassignment) more efficiently.

In this graph, higher is better.

In the CRUD summary table, we can see the number of SQL commands executed per second for each CRUD action and for each test. Once more we can see that when using ProxySQL, the application performed much better and significantly executed more operations (especially in the R/W test).

In this graph, higher is better.

This graph represents the total number of events run at the time of the test. An event is a full application cycle, which sees the application generate the data needed to fill the SQL (no matter if it is for read/write), create the SQL, request the connection, push the SQL, get and read the resultset returned and give back the connection.

Once more, ProxySQL shows better performance.

In this graph, lower is better.

The execution time reported in this graph is the time taken by the application to run a whole event.

This is it, execution time is the time of a full cycle. The faster the cycle is executed, the better the application is performing. The time is express in milliseconds and it goes from a very fast read, which probably accesses the cache in Aurora, to almost two seconds taken to insert a batch of rows.

Needless to say, the tests using ProxySQL performed better.

But Why?

Why do the tests using ProxySQL perform better? After all, it is just an additional step in the middle, which also has a cost in intercepting the queries and managing the connections. So why the better performance?

The answer is simple and can be found in the Aurora manual: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Overview.html#Aurora.Overview.Endpoints.

The Cluster endpoint is an endpoint for an Aurora DB cluster that connects to the current primary instance for that DB cluster. Each Aurora DB cluster has a cluster endpoint and one primary instance.

That endpoint receives the read and write request and sends them to the same instance.The main use for it is to perform failover if needed.

At the same time, the Reader endpoint is an endpoint for an Aurora DB cluster that connects to one of the available Aurora Replicas for that DB cluster. Each Aurora DB cluster has a reader endpoint. If there is more than one Aurora Replica, the reader endpoint directs each connection request to one of the Aurora Replicas.

The reader endpoint only load balances connections to available Aurora Replicas in an Aurora DB cluster. It does not load balance specific queries. If you want to load balance queries to distribute the read workload for a DB cluster, you need to manage that in your application and use instance endpoints to connect directly to Aurora Replicas to balance the load.

This means that to perform a Read/Write split, your application must manage two entry points and you will NOT have much control over how the queries are handled or to which replica instance they are directed. This could lead to unexpected results and delays.

Needless to say, ProxySQL does all that by default (as described in my previous article).

Now that we’ve clarified how Aurora entry points behave, let’s see about the performance difference.

How do we read this graph? From left to right:

  • read_only test with an Aurora cluster endpoint
  • read_only test with ProxySQL
  • write_only with an Aurora cluster endpoint
  • write_only with ProxySQL
  • read and write with an Aurora cluster endpoint
  • read and write with ProxySQL

Here we go! As we can see, the tests with ProxySQL used the two configured instances, splitting R/W without the need to do anything on the application side. I purposely avoided the AZ replica because I previously identified it as having higher latency, so I can exclude it and use it ONLY in the case of an emergency.

The effects are clear in the next graph.

When using the cluster endpoint, given all the load was on a single instance, the CPU utilization is higher and that became a bottleneck. When using ProxySQL, the load is spread across the different instances, allowing real read scalability. This has immediate benefits in read and read/write operations, allowing better load distribution that results in better performance.

Conclusions

Aurora is a very interesting technology and can be a very good solution for read scaling. But at the moment, the way AWS offers data connectivity with the Cluster endpoints and Reader endpoints can negatively affect performance.

The lack of configuration and the limitation of using different endpoints lead to confusion and less optimized utilization.

The introduction of ProxySQL, which now supports (from version 2) Aurora, allows an architect, SA or DBA to properly configure the environment. You can very granularly choose how to use each instance, without the need to have the application modify how it works. This helps keep the data layer solution separate from the application layer.

Even better, this additional set of flexibility does not come with a cost. On the contrary, it improves resource utilization and brings higher performance using less powerful instances. Given the cost of Aurora, this is not a secondary benefit.   

I suggest you try installing ProxySQL v2 (or higher) in front of your Aurora cluster. If you don’t feel confident and prefer to have us help you, contact us and we will be more than happy to support you!

The post Leveraging ProxySQL with AWS Aurora to Improve Performance, Or How ProxySQL Out-performs Native Aurora Cluster Endpoints appeared first on Percona Database Performance Blog.

Apr
02
2018
--

Migrate to Amazon RDS Using Percona Xtrabackup

Migrate to Amazon RDS

In this blog post, we’ll look at how to migrate to Amazon RDS using Percona XtraBackup.

Until recently, there was only one way to migrate your data from an existing MySQL instance into a new RDS MySQL instance: take and restore a logical backup with mysqldump or mydumper. This can be slow and error-prone. When Amazon introduced Amazon Aurora MySQL, you could use Percona XtraBackup to take an online physical backup of your database and restore that into a new Aurora instance. This feature is now available for RDS MySQL as well. Using Percona XtraBackup instead of a logical backup can save a lot of time, especially with a large dataset.

There are many caveats and limitations listed in Amazon’s documentation, but the most important ones are:

  • Source and destination databases must be MySQL 5.6. Earlier and later major versions are not supported at this time.
  • You can’t restore into an existing RDS instance using this method.
  • The total data size is limited to 6 TB.
  • User accounts, functions, and stored procedures are not imported automatically.
  • You can’t choose which databases and tables to migrate this way — migrate the whole instance. (You can’t use Percona Xtrabackup’s partial backup feature when migrating to RDS.)

If those limitations don’t apply to your use case, read on to learn how to migrate to Amazon RDS using Percona XtraBackup and restoring it into RDS.

Demonstration

For this demonstration, I created a Percona Server for MySQL 5.6 instance on EC2 with the sakila sample database and an extra InnoDB table. I filled the table with junk data to make the total data size about 13.5 GB. Then I installed the latest percona-xtrabackup-24  (2.3 would also have worked) and the AWS CLI tools. I took a backup from the EC2 instance with this command, using gzip to create a compressed backup:

sudo xtrabackup --backup --stream=tar | gzip -c > /data/backups/xtrabackup.tar.gz

Note that Amazon prepares the backup, so there’s no need to run

xtrabackup --prepare

 yourself.

For comparison, I took a mysqldump backup as well:

mysqldump --all-databases --triggers --events --routines --master-data=1 --single-transaction | gzip -c > /data/backups/mysqldump.sql.gz

I could have used mydumper to make this process multi-threaded, but to reduce complexity I did not. I then uploaded the backup to an S3 bucket (setting up credentials beforehand):

sudo aws s3 cp /data/backups/xtrabackup.tar.gz s3://dankow/

After that, I navigated to Relational Database Service in the AWS Console, and instead of clicking Launch DB Instance, I clicked Restore from S3. After that, the process is almost identical to creating a normal RDS MySQL or Amazon Aurora MySQL instance, with the addition of this box on Step 2:

I chose a db.m4.xlarge instance with 1000 Provisioned IOPS for this test. After I configured all the other options, I clicked “Launch DB Instance” and waited for my backup to decompress, prepare and restore into a new RDS instance.

For time comparison, I imported the backup I took with mysqldump, ignoring all the expected errors about privileges because they don’t affect the tables that we’re really interested in:

gunzip -c /data/backups/mysqldump.sql.gz | mysql --defaults-file=rds.cnf --force

Replication

If you’re planning on migrating a non-RDS instance to RDS, you might want to make your new RDS instance an async replica of the source instance. If there is a network path between the two instances, this is simple. Use the binary log coordinates from the xtrabackup_binlog_info (RDS does not support master_auto_position with GTID replication), and use them as arguments to the RDS external replication stored procedures, like this:
CALL mysql.rds_set_external_master (
"<host_name>",
 <host_port>,
 "<replication_user_name",
 "replication_password",
 "<mysql_binary_log_file_name>",
 mysql_binary_log_file_position,
 0
);
CALL mysql.rds_start_replication;

Currently, there is no way to make this connection use SSL. If the source instance is not in the same VPC as the RDS instance, set up a VPN connection between the two networks in order to protect the replication traffic.

Time Comparison

The time to back up was close: 8 minutes for Percona XtraBackup, and 7.5 minutes for mysqldump. Add the time to copy the backup to S3 (37 seconds), and the two methods are almost identical.The difference comes with restore time. The mysqldump backup took 22.5 minutes to restore, and Amazon took 10 minutes and 50 seconds to create the RDS instance from the backup. Some part of that is the normal overhead of creating an RDS instance, which always takes a few minutes.

Although my test dataset was small (13.5 GB) compared to most production databases, it was large enough to show a significant difference between physical (Percona XtraBackup) and logical (mysqldump) backups. The XtraBackup method was about 60% faster than mysqldump. If your dataset is larger, you will see even more of a difference.

Conclusion

When you migrate to Amazon RDS using a physical backup, it can be much faster than using a logical backup — but it’s not the right option for every use case. If your InnoDB tablespaces have significant fragmentation, or if you’re not currently using innodb_file_per_table, you may want to perform a logical migration to fix those issues. If you normally create RDS instances programmatically, the AWS CLI does not currently support creating an RDS instance from a physical backup. Any corruption in the InnoDB files transfers over to the RDS instance if you use a physical backup, but a logical backup will fail and allow you to fix the corruption before it gets to RDS.

For many use cases, however, building an RDS instance from Percona XtraBackup is a convenient way to get your data into RDS MySQL or Aurora relatively quickly. In this one small-scale test, migrating using XtraBackup was 60% faster than using mysqldump.

The post Migrate to Amazon RDS Using Percona Xtrabackup appeared first on Percona Database Performance Blog.

Mar
21
2018
--

Percona Live 2018 Featured Talk: Deep Dive into the RDS PostgreSQL Universe with Jignesh Shah

Percona Live 2018 Jignesh AWS

Percona Live 2018 Featured TalkWelcome to another interview blog for the rapidly-approaching Percona Live 2018. Each post in this series highlights a Percona Live 2018 featured talk at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post highlights Jignesh Shah, Senior Product Manager at Amazon Web Services. His talk is titled Deep Dive into the RDS PostgreSQL Universe. PostgreSQL is a very popular relational database gaining traction in Amazon’s RDS cloud environment. In our conversation, we discussed the features, uses, and benchmarks for PostgreSQL in AWS RDS:

Percona: Who are you, and how did you get into databases? What was your path to your current responsibilities?

I am a Senior Product Manager for Amazon Relational Database Service. I first started learning about databases with dBase III in 1994, followed by FoxPro, Progress 4GL, IBM DB2. I started learning about open source databases in the 2000s including PostgreSQL and MySQL, with a focus on database performance tuning on Sun Solaris Systems, and got closely involved in PostgreSQL-related benchmarks. One thing led to another, and I ended up building databases, virtual machines, and application lifecycle management products.

Percona: Your talk is titled “Deep Dive into the RDS PostgreSQL Universe”. How popular is PostgreSQL in Amazon RDS? 

PostgreSQL engines – including Amazon RDS for PostgreSQL and Amazon Aurora with PostgreSQL compatibility – are very popular and fast growing. Customers love the flexibility of PostgreSQL and ease of operations provided by Amazon RDS. Customers are excited by the innovations happening here, and love to give feedback on features and capabilities we can add to PostgreSQL in Amazon RDS. Most of our features are driven by customer requests, and customers are excited when they see their requested features available in the service.

Percona Live 2018 Jignesh AWSPercona: Why would you use PostgreSQL in Amazon RDS as opposed to other databases?

PostgreSQL offers good performance out of the box, with transactional semantics very similar to those of Oracle and SQL Server. PostgreSQL is object-oriented and ANSI SQL:2008 compatible, which makes it easy for customers to migrate applications from other relational database platforms. PostgreSQL also has very strong support for geospatial capabilities with the PostGIS extension and supports stored procedures in many languages, including PL/pgSQL (which is very similar to Oracle’s PL/SQL).

Percona: What PostgreSQL features are especially useful?

Every major release of PostgreSQL comes with new interesting features. Features like JSONB to handle JSON data types, spatial features with PostGIS for developing location-based services, foreign database wrappers to do federated queries, and replication features to replicate data are very useful. Especially for modern application development where speed and operational readiness are required for startups and enterprises.

Percona: Why should people attend your talk? What do you hope people will take away from it?

PostgreSQL and Amazon RDS together solve many developer needs and make operational lives easier for administrators – saving them time, resources and cost. Come and learn what is new in Amazon RDS for PostgreSQL, and look under the hood of how some of the capabilities work behind the scenes!

Percona: What are you looking forward to at Percona Live (besides your talk)?

I look forward to hearing from customers about their experiences with PostgreSQL, and learning more about the latest developments in open source databases at Percona Live.

Want to find out more about this Percona Live 2018 featured talk, and PostgreSQL in AWS RDS? Register for Percona Live 2018, and see Jignesh’s talk Deep Dive into the RDS PostgreSQL Universe. Register now to get the best price! Use the discount code SeeMeSpeakPL18 for 10% off.

Percona Live Open Source Database Conference 2018 is the premier open source event for the data performance ecosystem. It is the place to be for the open source community. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference will be April 23-25, 2018 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.

The post Percona Live 2018 Featured Talk: Deep Dive into the RDS PostgreSQL Universe with Jignesh Shah appeared first on Percona Database Performance Blog.

Mar
14
2018
--

Saw Percona at SCaLE 16x? Come See Even More at Percona Live 2018!

Did you see Percona at SCaLE 16x? I spent a couple of days there learning about open source software, databases, and other interesting topics. You can get even more open source database information at Percona Live 2018.

SCaLE is the largest community-run open-source and free software conference in North America. It is held annually in the greater Los Angeles area. This year’s event took place on March 8-11, 2018, at the Pasadena Convention Center. SCaLE 16X hosted 150 exhibitors this year, along with nearly 130 sessions, tutorials and special events.

Percona has been attending now for a number of years, and this year was no exception. Besides our booth in the Exhibit Hall, we had two speakers giving three different talks:

Percona at Scale 16xPeter Zaitsev, Percona CEO and Founder

Using MySQL for Distributed Database Architectures

In modern data architectures, we’re increasingly moving from single node design systems to distributed architectures using multiple nodes – often spread across multiple databases and multiple continents. Such architectures bring many benefits (such as scalability and resiliency), but can also bring a lot of pain if not correctly architected.

In this presentation, we looked into how we can use MySQL to engineer such systems. Firstly, we looked into the application data requirements that can shape which distributed architectures will work for an application, and what are their benefits and tradeoffs. Then we looked into how to implement the architectures with MySQL, using conventional and proven options such as MySQL Replication, as well as newer options such as:

    • MySQL Multi-Source Replication
    • MySQL Group Replication
    • Percona XtraDB Cluster and Galera
    • Application-driven replication using Kafka

Finally, since a common cause of production problems is a misunderstanding of how distributed systems are designed to behave during failure, we examined what can commonly happen to cause architecture scenarios to fail.

Why We’re Excited About MySQL 8.0

There are many great new features in MySQL 8.0, but how exactly can they help your applications? This session took a practical look at MySQL 8.0 features and improvements. We looked at the bugs, issues and limitations of previous MySQL versions and how MySQL 8.0 addresses them. It also covered what you can do with MySQL 8.0 that you couldn’t before.

Percona at Scale 16x 2Colin Charles

Capacity Planning for your Data Stores

Imagine a ticket sales website that does normal events like an M2M concert, but also occasionally sells tickets to the very popular play Harry Potter and the Cursed Child. This is a perfect capacity planning example. Selling tickets requires that you never sell more tickets than you actually have. You want to load-balance your queries, to shard your data stores and split reads and writes. You need to determine where the system bottlenecks, so you need a baseline for your regular traffic. The website must be able to handle the increased load for extremely popular performances, but you don’t want to buy servers that aren’t doing anything for much of the time. (This is also why the cloud is so popular today.)

Colin Charles explored storage capacity planning for OLTP and data warehousing uses and explains how metrics collection helps you plan your requirements. Coupled with the elastic nature of clouds, you should never have an error establishing database connection. Along the way, Colin also covered tools such as Box Anemometer, innotop, the slow query log, Percona Toolkit (pt-query-digest), vmstat, Facebook’s Prophet, and Percona Monitoring and Management (PMM).

Liked SCaLE 16x? Come to Percona Live 2018!

If you attended the SCaLE 16x conference, there was a multitude of excellent talks on many different open source topics. Many of these same speakers, companies, sponsors and attendees will also be at the Percona Live 2018 Open Source Database Conference in Santa Clara, CA, on April 23 – 25, 2018.

Join the open source database community in Santa Clara, California, to learn about the core topics in MySQL, MongoDB and other open source databases. Get briefed on the hottest topics, learn about building and maintaining high-performing deployments and listen to technical experts and top industry leaders. The Percona Live 2018 – Open Source Database Conference is a great event for users of any level exploring open source database technologies.

Some of these speakers and companies attending include:

. . . and many more.

Hurry and register before the event sells out!

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com