Apr
16
2018
--

Webinar Tuesday April 17, 2018: Which Amazon Cloud Technology Should You Chose? RDS? Aurora? Roll Your Own?

Amazon Cloud Technology

Amazon Cloud TechnologyPlease join Percona’s Senior Technical Operations Engineer, Daniel Kowalewski as he presents Which Amazon Cloud Technology Should You Chose? RDS? Aurora? Roll Your Own? on Tuesday, April 17, 2018, at 10:00 am PDT (UTC-7) / 1:00 pm EDT (UTC-4).

Are you running on Amazon, or planning to migrate there? In this talk, we are going to cover the different technologies for running databases on Amazon Cloud environments.

We will focus on the operational aspects, benefits and limitations for each of them.

Register for the webinar now.

Amazon Cloud TechnologyDaniel Kowalewski, Senior Technical Operations Engineer

Daniel joined Percona in August of 2015. Previously, he earned a B.S. in Computer Science from the University of Colorado in 2006 and was a DBA there until he joined Percona. In addition to MySQL, Daniel also has experience with Oracle and Microsoft SQL Server, but he much prefers to stay in the MySQL world. Daniel lives near Denver, CO with his wife, two-year-old son, and dog. If you can’t reach him, he’s probably in the mountains hiking, camping, or trying to get lost.

The post Webinar Tuesday April 17, 2018: Which Amazon Cloud Technology Should You Chose? RDS? Aurora? Roll Your Own? appeared first on Percona Database Performance Blog.

Apr
13
2018
--

This Week in Data with Colin Charles 35: Percona Live 18 final countdown and a roundup of recent news

Colin Charles

Colin CharlesJoin Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

Percona Live is just over a week away — there’s an awesome keynote lineup, and you really should register. Also don’t forget to save the date as Percona Live goes to Frankfurt, Germany November 5-7 2018! Prost!

In acquisitions, we have seen MariaDB acquire MammothDB and Idera acquire Webyog.

Some interesting Amazon notes: Amazon Aurora Continues its Torrid Growth, More than Doubling the Number of Active Customers in the Last Year (not sure I’d describe it as torrid but this is great for MySQL and PostgreSQL), comes with a handful of customer mentions. In addition, there have already been 65,000 database migrations on AWS. For context, in late November 2017, it was 40,000 database migrations.

Releases

Link List

Upcoming appearances

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

 

The post This Week in Data with Colin Charles 35: Percona Live 18 final countdown and a roundup of recent news appeared first on Percona Database Performance Blog.

Apr
11
2018
--

Calling All Polyglots: Percona Live 2018 Keynote Schedule Now Available!

Percona Live 2018 Keynotes

Percona Live 2018 KeynotesWe’ve posted the Percona Live 2018 keynote addresses for the seventh annual Percona Live Open Source Database Conference 2018, taking place April 23-25, 2018 at the Santa Clara Convention Center in Santa Clara, CA. 

This year’s keynotes explore topics ranging from how cloud and open source database adoption accelerates business growth, to leading-edge emerging technologies, to the importance of MySQL 8.0, to the growing popularity of PostgreSQL.

We’re excited by the great lineup of speakers, including our friends at Alibaba Cloud, Grafana, Microsoft, Oracle, Upwork and VividCortex, the innovative leaders on the Cool Technologies panel, and Brendan Gregg from Netflix, who will discuss how to get the most out of your database on a Linux OS, using his experiences at Netflix to highlight examples.  

With the theme of “Championing Open Source Databases,” the conference will feature multiple tracks, including MySQL, MongoDB, Cloud, PostgreSQL, Containers and Automation, Monitoring and Ops, and Database Security. Once again, Percona will be offering a low-cost database 101 track for beginning users who want to learn how to use and operate open source databases.

The Percona Live 2018 keynotes include:

Tuesday, April 24, 2018

  • Open Source for the Modern Business – Peter Zaitsev of Percona will discuss how open source database adoption continues to grow in enterprise organizations, the expectations and definitions of what constitutes success continue to change. A single technology for everything is no longer an option; welcome to the polyglot world. The talk will include several compelling open source projects and trends of interest to the open source database community and will be followed by a round of lightning talks taking a closer look at some of those projects.
  • Cool Technologies Showcase – Four industry leaders will introduce key emerging industry developments. Andy Pavlo of Carnegie Mellon University will discuss the requirements for enabling autonomous database optimizations. Nikolay Samokhvalov of PostgreSQL.org will discuss new PostgreSQL tools. Sugu Sougoumarane of PlanetScale Data will explore how Vitess became a high-performance, scalable and available MySQL clustering cloud solution in line with today’s NewSQL storage systems. Shuhao Wu of Shopify explains how to use Ghostferry as a data migration tool for incompatible cloud platforms.
  • State of the Dolphin 8.0 – Tomas Ulin of Oracle will discuss the focus, strategy, investments and innovations that are evolving MySQL to power next-generation web, mobile, cloud and embedded applications – and why MySQL 8.0 is the most significant MySQL release in its history.
  • Linux Performance 2018 – Brendan Gregg of Netflix will summarize recent performance features to help users get the most out of their Linux systems, whether they are databases or application servers. Topics include the KPTI patches for Meltdown, eBPF for performance observability, Kyber for disk I/O scheduling, BBR for TCP congestion control, and more.

Wednesday, April 25, 2018

  • Panel Discussion: Database Evolution in the Cloud – An expert panel of industry leaders, including Lixun Peng of Alibaba, Sunil Kamath of Microsoft, and Baron Schwartz of VividCortex, will discuss the rapid changes occurring with databases deployed in the cloud and what that means for the future of databases, management and monitoring and the role of the DBA and developer.
  • Future Perfect: The New Shape of the Data Tier – Baron Schwartz of VividCortex will discuss the impact of macro trends such as cloud computing, microservices, containerization, and serverless applications. He will explore where these trends are headed, touching on topics such as whether we are about to see basic administrative tasks become more automated, the role of open source and free software, and whether databases as we know them today are headed for extinction.
  • MongoDB at Upwork – Scott Simpson of Upwork, the largest freelancing website for connecting clients and freelancers, will discuss how MongoDB is used at Upwork, how the company chose the database, and how Percona helps make the company successful.

We will also present the Percona Live 2018 Community Awards and Lightning Talks on Monday, April 23, 2018, during the Opening Night Reception. Don’t miss the first day of tutorials and Opening Night Reception!

Register for the conference on the Percona Live Open Source Database Conference 2018 website.

Sponsorships

Limited Sponsorship opportunities for Percona Live 2018 Open Source Database Conference are still available, and offer the opportunity to interact with the DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solution vendors, and entrepreneurs who typically attend the event. Contact live@percona.com for sponsorship details.

  • Diamond Sponsors – Percona, VividCortex
  • Platinum – Alibaba Cloud, Microsoft
  • Gold Sponsors – Facebook, Grafana
  • Bronze Sponsors – Altinity, BlazingDB, Box, Dynimize, ObjectRocket, Pingcap, Shannon Systems, SolarWinds, TimescaleDB, TwinDB, Yelp
  • Contributing Sponsors – cPanel, Github, Google Cloud, NaviCat
  • Media Sponsors – Database Trends & Applications, Datanami, EnterpriseTech, HPCWire, ODBMS.org, Packt

The post Calling All Polyglots: Percona Live 2018 Keynote Schedule Now Available! appeared first on Percona Database Performance Blog.

Apr
03
2018
--

Leveraging ProxySQL with AWS Aurora to Improve Performance, Or How ProxySQL Out-performs Native Aurora Cluster Endpoints

ProxySQL with AWS Aurora

In this blog post, I’ll look at how you can use ProxySQL with AWS Aurora to further leverage database performance.

My previous article described how easy is to replace the native Aurora connector with ProxySQL. In this article, you will see WHY you should do that.

It is important to understand that aside from the basic optimization in the connectivity and connection management, ProxySQL also provides you with a new set of features that currently are not available in Aurora.

Just think:

  • Better caching
  • Query filtering
  • Sharding
  • Query substitution
  • Firewalling
  • … and more

We will cover areas like scalability, security and performance. In short, I think is more than worth it to spend some time and give ProxySQL with AWS Aurora a try.

The tests

I will show you the results from two different kinds of tests. One is sysbench-oriented, the other simulates a more complex application using Java, data object utilization and a Hikari connection pool in the middle as well. 

For the EC2 and Aurora platform I used:

  • Application/ProxySQL T2.xlarge eu-central-1a
  • 2 Aurora MySQL 5.7.12 db.t2.medium eu-central-1a
  • 1 Aurora MySQL 5.7.12 db.t2.medium eu-central-1b for AZ redundancy

The code for the application is available here, and for sysbench tests here. All the data and configurations for the application are available here.

I ran three tests using both bench apps, obviously with Aurora as it comes and with ProxySQL. For the ProxySQL configuration see my previous article.
The tests were read_only / Write_only / read_write.

For Aurora, I only increased the number of connections and kept the how it comes out of the box approach. Note each test was run at least three times at different moments of the day, and on a different day. The data reported as final is the BEST performing result for each one.

The Results

For the impatient among us, here is a summary table of the tests:

Sysbench:

Java App:


Now if this is enough for you, you can go to the conclusion and start to use ProxySQL with AWS Aurora. If you would like to know a bit more, continue reading.

Aside from any discussion on the benchmark tool and settings, I really focused on identifying the differences between the two “connectors”. Given the layer below was exactly the same, any difference is due to the simple substitution of the endpoint.

Sysbench

Read Only

The first image reports the number of events achieved at the time of the test. It is quite clear that when using ProxySQL, sysbench ran more events.

In this graph, higher is better:

In this graph, lower is better:

As we can see, the latency when using an Aurora cluster entry point is higher. True, we are talking about milliseconds, but it is not just the value that matters, but also the distribution:

Aurora cluster endpoint ProxySQL
   

An image is worth a thousand words!

We can see, the behavior stays constant in analyzing the READS executed, with ProxySQL performing better.

In this graph, higher is better:


In this graph, higher is better:

Closing with the number of total queries performed, in which ProxySQL surpassed the Cluster endpoint by ~ 4K queries.

Write Only

For writing, things go a bit different. We see that all lines intersect, and the values are very close one to the other. I will let the images speak for themselves:

In this graph, higher is better:

In this graph, lower is better:

Latency spiked in each ProxySQL test, and it may require additional investigation and tuning.  

In this graph, higher is better:

While the rates of writes/sec intersect with each other frequently, in the end ProxySQL resulted in more writes than the native endpoint.

In this graph, higher is better:

In the end, a difference exists and is consistent across the different test iterations, but is minimal. We are talking a range of 25 – 50 entries in total. This result is not surprising, and it will be clear why later in the article.  

Read and Write

As expected in the read and write test, we see a different situation. ProxySQL is still performing better than the default entry point, but not by such a big margin as in read-only tests.

In this graph, higher is better:

In this graph, lower is better

Latency and events are following the expected trend, where read operations are executed more efficiently with ProxySQL and writes are close, but NOT the same as in the write only test.

As a result, the number of queries in ProxySQL is approximately 13% better than the default entry point.

Java Application Tests

What about the Java application? First of all, we need to remember that the application used a connection pool mechanism (Hikari), and the connection pool was present in all cases (for both Aurora cluster endpoint or ProxySQL). Given that a small delay in establishing the first connection was expected, you can easily see this in the MAX value of the connection latency.    

In this graph, lower is better.

The connection latency reported here is expressed in nanoseconds and is the measure of the time taken by the connection provider to return an active connection to the application from the moment the application requested it. In other words, how long the HikariCP is taking to choose/check/return an open connection. As you can see, the MAX value is drastically higher, and this was expected since it is the connection initialization. While not really interesting in terms of performance, this value is interesting because it gives us the dimension of the cost in the CP to open a new connection, which in the worse case is 25 milliseconds.

As the graphs show, ProxySQL manages both cases (first call, reassignment) more efficiently.

In this graph, higher is better.

In the CRUD summary table, we can see the number of SQL commands executed per second for each CRUD action and for each test. Once more we can see that when using ProxySQL, the application performed much better and significantly executed more operations (especially in the R/W test).

In this graph, higher is better.

This graph represents the total number of events run at the time of the test. An event is a full application cycle, which sees the application generate the data needed to fill the SQL (no matter if it is for read/write), create the SQL, request the connection, push the SQL, get and read the resultset returned and give back the connection.

Once more, ProxySQL shows better performance.

In this graph, lower is better.

The execution time reported in this graph is the time taken by the application to run a whole event.

This is it, execution time is the time of a full cycle. The faster the cycle is executed, the better the application is performing. The time is express in milliseconds and it goes from a very fast read, which probably accesses the cache in Aurora, to almost two seconds taken to insert a batch of rows.

Needless to say, the tests using ProxySQL performed better.

But Why?

Why do the tests using ProxySQL perform better? After all, it is just an additional step in the middle, which also has a cost in intercepting the queries and managing the connections. So why the better performance?

The answer is simple and can be found in the Aurora manual: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Overview.html#Aurora.Overview.Endpoints.

The Cluster endpoint is an endpoint for an Aurora DB cluster that connects to the current primary instance for that DB cluster. Each Aurora DB cluster has a cluster endpoint and one primary instance.

That endpoint receives the read and write request and sends them to the same instance.The main use for it is to perform failover if needed.

At the same time, the Reader endpoint is an endpoint for an Aurora DB cluster that connects to one of the available Aurora Replicas for that DB cluster. Each Aurora DB cluster has a reader endpoint. If there is more than one Aurora Replica, the reader endpoint directs each connection request to one of the Aurora Replicas.

The reader endpoint only load balances connections to available Aurora Replicas in an Aurora DB cluster. It does not load balance specific queries. If you want to load balance queries to distribute the read workload for a DB cluster, you need to manage that in your application and use instance endpoints to connect directly to Aurora Replicas to balance the load.

This means that to perform a Read/Write split, your application must manage two entry points and you will NOT have much control over how the queries are handled or to which replica instance they are directed. This could lead to unexpected results and delays.

Needless to say, ProxySQL does all that by default (as described in my previous article).

Now that we’ve clarified how Aurora entry points behave, let’s see about the performance difference.

How do we read this graph? From left to right:

  • read_only test with an Aurora cluster endpoint
  • read_only test with ProxySQL
  • write_only with an Aurora cluster endpoint
  • write_only with ProxySQL
  • read and write with an Aurora cluster endpoint
  • read and write with ProxySQL

Here we go! As we can see, the tests with ProxySQL used the two configured instances, splitting R/W without the need to do anything on the application side. I purposely avoided the AZ replica because I previously identified it as having higher latency, so I can exclude it and use it ONLY in the case of an emergency.

The effects are clear in the next graph.

When using the cluster endpoint, given all the load was on a single instance, the CPU utilization is higher and that became a bottleneck. When using ProxySQL, the load is spread across the different instances, allowing real read scalability. This has immediate benefits in read and read/write operations, allowing better load distribution that results in better performance.

Conclusions

Aurora is a very interesting technology and can be a very good solution for read scaling. But at the moment, the way AWS offers data connectivity with the Cluster endpoints and Reader endpoints can negatively affect performance.

The lack of configuration and the limitation of using different endpoints lead to confusion and less optimized utilization.

The introduction of ProxySQL, which now supports (from version 2) Aurora, allows an architect, SA or DBA to properly configure the environment. You can very granularly choose how to use each instance, without the need to have the application modify how it works. This helps keep the data layer solution separate from the application layer.

Even better, this additional set of flexibility does not come with a cost. On the contrary, it improves resource utilization and brings higher performance using less powerful instances. Given the cost of Aurora, this is not a secondary benefit.   

I suggest you try installing ProxySQL v2 (or higher) in front of your Aurora cluster. If you don’t feel confident and prefer to have us help you, contact us and we will be more than happy to support you!

The post Leveraging ProxySQL with AWS Aurora to Improve Performance, Or How ProxySQL Out-performs Native Aurora Cluster Endpoints appeared first on Percona Database Performance Blog.

Apr
03
2018
--

How to Implement ProxySQL with AWS Aurora

ProxySQL with AWS Aurora

ProxySQL with AWS AuroraIn this post, we’ll look at how to implement ProxySQL with AWS Aurora.

Recently, there have been a few discussions and customer requests that focused on AWS Aurora and how to make the various architectures and solutions more flexible.

Flexible how, you may ask? Well, there are the usual expectations:

  • How do you improve resource utilization?
  • How can I filter (or block) things?
  • Can I shard with Aurora?
  • What is the best way to implement query caching?
  • … and more.

The inclusion of ProxySQL solves many of the points above. We in Consulting design the solutions for our customers by applying the different functionalities to better match customers needs. Whenever we deal with Aurora, we’ve had to exclude ProxySQL because of some limitations in the software.

Now, however, ProxySQL 2.0 supports Aurora, and it does it amazingly well.

This article shows you how to implement ProxySQL with AWS Aurora. The the next article Leveraging ProxySQL with AWS Aurora to Improve Performance will show you WHY.

The Problem

ProxySQL has two different ways to deal with backend servers. One is using replication mechanisms, like standard Async replication and Group Replication. The other is to use the scheduler, as in the case of Percona XtraDB Cluster, MariaDB Cluster, etc.

While we can use the scheduler as a solution for Aurora, it is not as immediate and well-integrated as the embedded support for replication, given that we normally opted not to use it in this specific case (Aurora).

But what WAS the problem with Aurora? An Aurora cluster bases its definition of Writer vs. Readers using the innodb_read_only variable. So, where is the problem? Well actually no problem at all, just that ProxySQL up to version 2 for replication only supported the generic variable READ_ONLY. As such, it was not able to correctly identify the Writer/Readers set.

The Solution

In October 2017, this issue was opened (https://github.com/sysown/proxysql/issues/1195 )and the result was, as usual, a quite simple and flexible solution.

Brainstorming, a possible solution could be to add another column in mysql_replication_hostgroups to specify what needs to be checked, either read_only or innodb_read_only, or even super_read_only

This lead to the ProxySQL team delivering (“commit fe2f16d6df15252f0107a6a224dad7b1efdb13f6”):

Added support for innodb_read_only and super_read_only  

MYHGM_MYSQL_REPLICATION_HOSTGROUPS "CREATE TABLE mysql_replication_hostgroups
(writer_hostgroup INT CHECK (writer_hostgroup>=0) NOT NULL PRIMARY KEY ,
reader_hostgroup INT NOT NULL CHECK (reader_hostgroup<>writer_hostgroup AND reader_hostgroup>=0) ,
check_type VARCHAR CHECK (LOWER(check_type) IN ('read_only','innodb_read_only','super_read_only')) NOT NULL DEFAULT 'read_only' ,
comment VARCHAR NOT NULL DEFAULT '' , UNIQUE (reader_hostgroup))"

Which in short means they added a new column to the mysql_replication_hostgroup table. ProxySQL continues to behave exactly the same and manages the servers and the replication groups as usual. No need for scripts or other crazy stuff.

Implementation

Here we are, the HOW TO part. The first thing to keep in mind is that when you implement a new Aurora cluster, you should always consider having at least two instances in the same AZ and another instance in a remote AZ.

To implement ProxySQL, you should refer directly to the instances, NOT to the cluster entry-point. To be clear, you must take this for each instance:

The information is available in the web-admin interface, under the instance or using the command:

aws rds describe-db-instances

And filter the result for:

"Endpoint": {
                "Port": 3306,
                "Address": "proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com"
            },

To run ProxySQL with RDS in general, you need to install it on an intermediate server or on the application box.

Once you decide which one fits your setup better, you must download or git clone ProxySQL v2.0+.

DO NOT use v1.4.x, as it does not contain these new features and will not work as expected.

Once you have all the Aurora instances up, it is time to configure ProxySQL. Below is an example of all the commands used during the installation:

grant usage, replication client on *.* to monitor@'%' identified by 'monitor';
delete from mysql_servers where hostgroup_id in (70,71);
delete from mysql_replication_hostgroups where writer_hostgroup=70;
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',70,3306,1000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',71,3306,1000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb2.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',71,3306,1000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb-eu-central-1b.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',71,3306,1,2000);
INSERT INTO mysql_replication_hostgroups(writer_hostgroup,reader_hostgroup,comment,check_type) VALUES (70,71,'aws-aurora','innodb_read_only');
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
delete from mysql_query_rules where rule_id in (50,51,52);
insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) values(50,6033,'m8_test',70,0,3,'.',1);
insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) values(51,6033,'m8_test',70,1,3,'^SELECT.*FOR UPDATE',1);
insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) values(52,6033,'m8_test',71,1,3,'^SELECT.*$',1);
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;
delete from mysql_users where username='m8_test';
insert into mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) values ('m8_test','test',1,70,'mysql',1);
LOAD MYSQL USERS TO RUNTIME;SAVE MYSQL USERS TO DISK;
update global_variables set variable_value="67108864" where variable_name='mysql-max_allowed_packet';
update global_variables set Variable_Value=0  where Variable_name='mysql-hostgroup_manager_verbose';
load mysql variables to run;save mysql variables to disk;

The above will give you a ready-to-go ProxySQL setup that supports Aurora cluster, performing all the usual operations ProxySQL does, including proper W/R split and more for a user named ‘m8_test’.

The key is in passing the value ‘innodb_read_only’ for the column check_type in the table mysql_replication_hostgroups.  

To check the status of your ProxySQL, you can use this command (which gives you a snapshot of what is going to happen):

watch -n 1 'mysql --defaults-file=~/.my.cnf -h 127.0.0.1 -P 6032 -t -e "select b.weight, c.* from stats_mysql_connection_pool c left JOIN runtime_mysql_servers b ON  c.hostgroup=b.hostgroup_id and c.srv_host=b.hostname and c.srv_port = b.port where hostgroup in( 50,52,70,71) order by hostgroup,srv_host desc;" -e " select srv_host,command,avg(time_ms), count(ThreadID) from stats_mysql_processlist group by srv_host,command;" -e "select * from stats_mysql_users;";mysql  --defaults-file=~/.my.cnf -h 127.0.0.1 -P 6032  -t -e "select * from stats_mysql_global "|egrep -i  "(mirror|memory|stmt|processor)"'
+--------+-----------+--------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| weight | hostgroup | srv_host                                                                 | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries | Queries_GTID_sync | Bytes_data_sent | Bytes_data_recv | Latency_us |
+--------+-----------+--------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| 1000   | 70        | proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com               | 3306     | ONLINE | 0        | 0        | 0	     | 0       | 0           | 0       | 0                 | 0               | 0               | 5491       |
| 1000   | 71        | proxysqltestdb2.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com              | 3306     | ONLINE | 0        | 5        | 5	     | 0       | 5           | 73      | 0                 | 5483            | 28442           | 881        |
| 1000   | 71        | proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com               | 3306     | ONLINE | 0        | 5        | 5	     | 0       | 5           | 82      | 0                 | 6203            | 32217           | 5491       |
| 1	 | 71        | proxysqltestdb-eu-central-1b.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com | 3306     | ONLINE | 0        | 0        | 0	     | 0       | 0           | 0       | 0                 | 0               | 0               | 1593       |
+--------+-----------+--------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
+----------+----------------------+--------------------------+
| username | frontend_connections | frontend_max_connections |
+----------+----------------------+--------------------------+
| m8_test  | 0                    | 10000                    |
+----------+----------------------+--------------------------+
| Query_Processor_time_nsec    | 0              |
| Com_backend_stmt_prepare     | 0              |
| Com_backend_stmt_execute     | 0              |
| Com_backend_stmt_close       | 0              |
| Com_frontend_stmt_prepare    | 0              |
| Com_frontend_stmt_execute    | 0              |
| Com_frontend_stmt_close      | 0              |
| Mirror_concurrency           | 0              |
| Mirror_queue_length          | 0              |
| SQLite3_memory_bytes         | 2652288        |
| ConnPool_memory_bytes        | 712720         |
| Stmt_Client_Active_Total     | 0              |
| Stmt_Client_Active_Unique    | 0              |
| Stmt_Server_Active_Total     | 0              |
| Stmt_Server_Active_Unique    | 0              |
| Stmt_Max_Stmt_id             | 1              |
| Stmt_Cached                  | 0              |
| Query_Cache_Memory_bytes     | 0              |

At this point, you can connect your application and see how ProxySQL allows you to perform much better than the native cluster entry point.

This will be expanded in the next article: Leverage AWS Aurora performance.

Conclusions

I had my first issue with the native Aurora connector a long time ago, but I had nothing to replace it. ProxySQL is a very good alternative to standard cluster access, with more options/controls and it also allows us to perform close-to-application caching, which is much more efficient than the remote MySQL one (http://www.proxysql.com/blog/scaling-with-proxysql-query-cache).

In the next article I will illustrate how, in a simple setup, ProxySQL can help in achieving better results than using the default Aurora cluster endpoint.

The post How to Implement ProxySQL with AWS Aurora appeared first on Percona Database Performance Blog.

Mar
14
2018
--

Saw Percona at SCaLE 16x? Come See Even More at Percona Live 2018!

Did you see Percona at SCaLE 16x? I spent a couple of days there learning about open source software, databases, and other interesting topics. You can get even more open source database information at Percona Live 2018.

SCaLE is the largest community-run open-source and free software conference in North America. It is held annually in the greater Los Angeles area. This year’s event took place on March 8-11, 2018, at the Pasadena Convention Center. SCaLE 16X hosted 150 exhibitors this year, along with nearly 130 sessions, tutorials and special events.

Percona has been attending now for a number of years, and this year was no exception. Besides our booth in the Exhibit Hall, we had two speakers giving three different talks:

Percona at Scale 16xPeter Zaitsev, Percona CEO and Founder

Using MySQL for Distributed Database Architectures

In modern data architectures, we’re increasingly moving from single node design systems to distributed architectures using multiple nodes – often spread across multiple databases and multiple continents. Such architectures bring many benefits (such as scalability and resiliency), but can also bring a lot of pain if not correctly architected.

In this presentation, we looked into how we can use MySQL to engineer such systems. Firstly, we looked into the application data requirements that can shape which distributed architectures will work for an application, and what are their benefits and tradeoffs. Then we looked into how to implement the architectures with MySQL, using conventional and proven options such as MySQL Replication, as well as newer options such as:

    • MySQL Multi-Source Replication
    • MySQL Group Replication
    • Percona XtraDB Cluster and Galera
    • Application-driven replication using Kafka

Finally, since a common cause of production problems is a misunderstanding of how distributed systems are designed to behave during failure, we examined what can commonly happen to cause architecture scenarios to fail.

Why We’re Excited About MySQL 8.0

There are many great new features in MySQL 8.0, but how exactly can they help your applications? This session took a practical look at MySQL 8.0 features and improvements. We looked at the bugs, issues and limitations of previous MySQL versions and how MySQL 8.0 addresses them. It also covered what you can do with MySQL 8.0 that you couldn’t before.

Percona at Scale 16x 2Colin Charles

Capacity Planning for your Data Stores

Imagine a ticket sales website that does normal events like an M2M concert, but also occasionally sells tickets to the very popular play Harry Potter and the Cursed Child. This is a perfect capacity planning example. Selling tickets requires that you never sell more tickets than you actually have. You want to load-balance your queries, to shard your data stores and split reads and writes. You need to determine where the system bottlenecks, so you need a baseline for your regular traffic. The website must be able to handle the increased load for extremely popular performances, but you don’t want to buy servers that aren’t doing anything for much of the time. (This is also why the cloud is so popular today.)

Colin Charles explored storage capacity planning for OLTP and data warehousing uses and explains how metrics collection helps you plan your requirements. Coupled with the elastic nature of clouds, you should never have an error establishing database connection. Along the way, Colin also covered tools such as Box Anemometer, innotop, the slow query log, Percona Toolkit (pt-query-digest), vmstat, Facebook’s Prophet, and Percona Monitoring and Management (PMM).

Liked SCaLE 16x? Come to Percona Live 2018!

If you attended the SCaLE 16x conference, there was a multitude of excellent talks on many different open source topics. Many of these same speakers, companies, sponsors and attendees will also be at the Percona Live 2018 Open Source Database Conference in Santa Clara, CA, on April 23 – 25, 2018.

Join the open source database community in Santa Clara, California, to learn about the core topics in MySQL, MongoDB and other open source databases. Get briefed on the hottest topics, learn about building and maintaining high-performing deployments and listen to technical experts and top industry leaders. The Percona Live 2018 – Open Source Database Conference is a great event for users of any level exploring open source database technologies.

Some of these speakers and companies attending include:

. . . and many more.

Hurry and register before the event sells out!

Feb
23
2018
--

Webinar Tuesday February 27, 2018: Monitoring Amazon RDS with Percona Monitoring and Management (PMM)

Monitoring Amazon RDS

Monitoring Amazon RDSPlease join Percona’s Build / Release Engineer, Mykola Marzhan, as he presents Monitoring Amazon RDS with Percona Monitoring and Management on February 27, 2018, at 7:00 am PDT (UTC-8) / 10:00 am EDT (UTC-5).


Are you concerned about how you are monitoring your AWS environment? Keeping track of what is happening in your Amazon RDS deployment is key to guaranteeing the performance and availability of your database for your critical applications and services.

Did you know that Percona Monitoring and Management (PMM) ships with support for MySQL on Amazon RDS and Amazon Aurora out of the box? It does!

Percona Monitoring and Management (PMM) is a free and open-source platform for managing and monitoring MySQL, Percona Server for MySQL MariaDB, MongoDB, Percona Server for MongoDB performance both on-premise and in the cloud.

In this session we’ll discuss:

  • Configuring PMM (metrics and queries) against Amazon RDS MySQL and Amazon Aurora using an EC2 instance
  • Configuring PMM against CloudWatch metrics
  • Setting configuration parameters for AWS for maximum PMM visibility

Register for the webinar now.

mykolaMykola Marzhan, Release Engineer

Mykola joined Percona in 2016 as a release engineer. He has been developing monitoring systems since 2004, and has been working as Release Engineer/Release Manager/DevOps for ten years. Recently, Mykola achieved an AWS Certified Solutions Architect (Professional) authentication.

 

Feb
21
2018
--

Percona Live 2018 Open Source Database Conference Full Schedule Now Available

Percona Live 2018 Featured Talk

Percona Live 2018The conference session schedule for the seventh annual Percona Live 2018 Open Source Database Conference, taking place April 23-25 at the Santa Clara Convention Center in Santa Clara, CA is now live and available for review! Advance Registration Discounts can be purchased through March 4, 2018, 11:30 p.m. PST.

Percona Live Open Source Database Conference 2018 is the premier open source database event. With a theme of “Championing Open Source Databases,” the conference will feature multiple tracks, including MySQL, MongoDB, Cloud, PostgreSQL, Containers and Automation, Monitoring and Ops, and Database Security. Once again, Percona will be offering a low-cost database 101 track for beginning users who want to start learning how to use and operate open source databases.

Major areas of focus at the conference include:

  • Database operations and automation at scale, featuring speakers from Facebook, Slack, Github and more
  • Databases in the cloud – how database-as-a-service (DBaaS) is changing the DB Landscape, featuring speakers from AWS, Microsoft, Alibaba and more
  • Security and compliance – how GDPR and other government regulations are changing the way we manage databases, featuring speakers from Fastly, Facebook, Pythian, Percona and more
  • Bridging the gap between developers and DBAs – finding common ground, featuring speakers from Square, Oracle, Percona and more

Conference Session Schedule

Conference sessions take place April 24-25 and will feature 90+ in-depth talks by industry experts related to each of the key areas. Several sessions from Oracle and Percona will focus on how the new features and enhancements in the upcoming release of MySQL 8.0 will impact businesses. Conference session examples include:

Sponsorships

Sponsorship opportunities for Percona Live Open Source Database Conference 2018 are available and offer the opportunity to interact with the DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solution vendors and entrepreneurs who typically attend the event. Contact live@percona.com for sponsorship details.

  • Diamond Sponsors – Continuent, VividCortex
  • Platinum – Microsoft
  • Gold Sponsors – Facebook, Grafana
  • Bronze Sponsors – Altinity, BlazingDB, SolarWinds, Timescale, TwinDB, Yelp
  • Other Sponsors – cPanel
  • Media Sponsors – Database Trends & Applications, Datanami, EnterpriseTech, HPCWire, ODBMS.org, Packt

Hyatt Regency Santa Clara & The Santa Clara Convention Center

Percona Live 2018 Open Source Database Conference is held at the Hyatt Regency Santa Clara & The Santa Clara Convention Center, at 5101 Great America Parkway Santa Clara, CA 95054.

The Hyatt Regency Santa Clara & The Santa Clara Convention Center is a prime location in the heart of the Silicon Valley. Enjoy this spacious venue with complimentary wifi, on-site expert staff and three great restaurants. You can reserve a room by booking through the Hyatt’s dedicated Percona Live reservation site.

Book your hotel using Percona’s special room block rate!

Jan
31
2018
--

Aurora Hash Join Optimization (with a Gentle Reminder on Lab Features)

Aurora Hash Join Lab Mode

Aurora Hash Join Lab ModeThe Aurora hash join feature for relational databases has been around for a while now. But unlike MySQL Block Nested Loop algorithm, an Aurora hash join only caters to a specific number of use cases. When implemented with the optimizer properly, they can provide great benefits with certain workloads. Below we’ll see a brief example of a quick win.

This new feature is available in Aurora lab mode version 1.16. Because this is a lab feature, it’s important to make sure to test your queries before upgrading, especially if you are looking to scale up to the new R4 instances before the Superbowl to avoid hitting the same problem I discuss below.

When lab mode is enabled and

hash_join

  is ON, you can verify the optimizer feature from the

optimizer_switch

 variable:

mysql> SELECT @@aurora_version, @@aurora_lab_mode, @@optimizer_switch G
*************************** 1. row ***************************
  @@aurora_version: 1.16
 @@aurora_lab_mode: 1
@@optimizer_switch: index_merge=on,...,hash_join=on,hash_join_cost_based=on

Hash joins work well when joining large result sets because – unlike block nested loop in the same query – the optimizer scans the larger table and matches it against the hashed smaller table instead of the other way around. Consider the tables and query below:

+----------+----------+
| tbl      | rows     |
+----------+----------+
| branches |    55143 |
| users    |   103949 |
| history  | 27168887 |
+----------+----------+
EXPLAIN
SELECT SQL_NO_CACHE COUNT(*)
FROM branches b
   INNER JOIN users u ON (b.u_id = u.u_id)
   INNER JOIN history h ON (u.u_id = h.u_id);

With hash joins enabled, we can see from the Extra column in the EXPLAIN output how it builds the join conditions:

mysql> EXPLAIN
    -> SELECT SQL_NO_CACHE COUNT(*)
    -> FROM branches b
    ->    INNER JOIN users u ON (b.u_id = u.u_id)
    ->    INNER JOIN history h ON (u.u_id = h.u_id);
+----+-------------+-------+-------+---------------+---------+---------+------+----------+----------------------------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows     | Extra                                                    |
+----+-------------+-------+-------+---------------+---------+---------+------+----------+----------------------------------------------------------+
|  1 | SIMPLE      | u     | index | PRIMARY       | PRIMARY | 4       | NULL |   103342 | Using index                                              |
|  1 | SIMPLE      | h     | ALL   | NULL          | NULL    | NULL    | NULL | 24619023 | Using join buffer (Hash Join Outer table h)              |
|  1 | SIMPLE      | b     | index | user_id       | user_id | 4       | NULL |    54129 | Using index; Using join buffer (Hash Join Inner table b) |
+----+-------------+-------+-------+---------------+---------+---------+------+----------+----------------------------------------------------------+

Without hash joins, it’s a straightforward Cartesian (almost) product of all three tables:

mysql> SET optimizer_switch='hash_join=off';
Query OK, 0 rows affected (0.02 sec)
mysql> EXPLAIN
    -> SELECT SQL_NO_CACHE COUNT(*)
    -> FROM branches b
    ->    INNER JOIN users u ON (b.u_id = u.u_id)
    ->    INNER JOIN history h ON (u.u_id = h.u_id);
+----+-------------+-------+--------+---------------+---------+---------+----------------+----------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref            | rows     | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+----------------+----------+-------------+
|  1 | SIMPLE      | h     | ALL    | NULL          | NULL    | NULL    | NULL           | 24619023 | NULL        |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY       | PRIMARY | 4       | percona.h.u_id |        1 | Using index |
|  1 | SIMPLE      | b     | ref    | user_id       | user_id | 4       | percona.h.u_id |        7 | Using index |
+----+-------------+-------+--------+---------------+---------+---------+----------------+----------+-------------+

Now, the execution times without hash joins enabled:

mysql> SELECT SQL_NO_CACHE COUNT(*)
    -> FROM branches b
    ->    INNER JOIN users u ON (b.u_id = u.u_id)
    ->    INNER JOIN history h ON (u.u_id = h.u_id);
+-----------+
| COUNT(*)  |
+-----------+
| 128815553 |
+-----------+
1 row in set (1 min 6.95 sec)
mysql> SET optimizer_switch='hash_join=off';
Query OK, 0 rows affected (0.01 sec)
mysql> SELECT SQL_NO_CACHE COUNT(*)
    -> FROM branches b
    ->    INNER JOIN users u ON (b.u_id = u.u_id)
    ->    INNER JOIN history h ON (u.u_id = h.u_id);
+-----------+
| COUNT(*)  |
+-----------+
| 128815553 |
+-----------+
1 row in set (2 min 28.27 sec)

Clearly with this optimization enabled, we have more than a 50% gain from the example query.

Now while this type of query might be rare, most of us know we need to avoid really large JOINs as they are not scalable. But at some point, we find some that take advantage of the feature. Here is an excerpt from an actual production query I’ve recently worked on. It shows the good execution plan versus the one using hash joins.

This particular EXPLAIN output only differs in the row where without a hash join, it uses an index, and the query executes normally. With the hash join enabled, the optimizer thought it was better to use it instead:

...
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
         type: eq_ref
possible_keys: PRIMARY,r_type_id_ix,r_id_r_type_id_dt_ix
          key: PRIMARY
      key_len: 4
          ref: db.x.p_id
         rows: 1
        Extra: Using where
...
...
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
         type: index
possible_keys: PRIMARY,r_type_id_ix,r_id_r_type_id_dt_ix
          key: r_id_r_type_id_dt_ix
      key_len: 18
          ref: NULL
         rows: 715568233
        Extra: Using where; Using index; Using join buffer (Hash Join Inner table t)
...

Needless to say, it caused problems. Unfortunately, a bug on Aurora 1.16 exists where hash joins cannot be turned off selectively (it is enabled by default) from the parameter group. If you try this, you get an error “Error saving: Invalid parameter value: hash_join=off for: optimizer_switch”. The only way to disable the feature is to turn off

lab_mode

, which requires an instance restart. An alternative is to simply add

SET optimizer_switch='hash_join=off';

 from the application, especially if you rely on some of the other lab mode features in Aurora.

To summarize, the new hash join feature is a great addition. But as it’s a lab feature, be careful when upgrading!

Jan
15
2018
--

Sneak Peek of the Percona Live 2018 Open Source Database Conference Breakout Sessions!

Percona Live 2018

Percona Live 2018Take a look at the sneak peek of the breakout sessions for the Percona Live 2018 Open Source Database Conference, taking place April 23-25, 2018 at the Santa Clara Convention Center in Santa Clara, California. Early Bird registration discounts are available until February 4, 2018, and sponsorship opportunities are still available.

Conference breakout sessions will feature a range of in-depth talks related to each of the key areas. Breakout session examples include:

  • Database Security as a Function: Scaling to Your Organization’s Needs – Laine Campbell, Fastly
  • How to Use JSON in MySQL Wrong – Bill Karwin, Square
  • Scaling a High Traffic Database: Moving Tables Across Clusters – Bryana Knight, GitHub
  • MySQL: How to Save Bandwidth – Georgi Kodinov, Oracle
  • MyRocks Roadmaps and Production Deployment at Facebook – Yoshinori Matsunobu, Facebook
  • Securing Your Data on PostgreSQL – Payal Singh, OmniTI Computer Consulting, Inc.
  • The Accidental DBA – Jenni Snyder, Yelp
  • How Microsoft Built MySQL, PostgreSQL and MariaDB for the Cloud – Jun Su, Microsoft
  • MongoDB Cluster Topology, Management and Optimization – Steven Wang, Tesla
  • Ghostferry: A Data Migration Tool for Incompatible Cloud Platforms – Shuhao Wu, Shopify, Inc.

Percona Live Open Source Database Conference 2018 is the premier open source database event. The theme for the upcoming conference is “Championing Open Source Databases,” with a range of topics on MySQL, MongoDB and other open source databases, including time series databases, PostgreSQL and RocksDB. Session tracks include Developers, Operations and Business/Case Studies. A major conference focus will be providing strategies to help attendees meet their business goals by deploying the right mix of database solutions to obtain the performance they need while managing complexity.

Hyatt Regency Santa Clara & The Santa Clara Convention Center

Percona Live 2018 Open Source Database Conference 2018 will be held at the Hyatt Regency Santa Clara & The Santa Clara Convention Center, at 5101 Great America Parkway Santa Clara, CA 95054.

The Hyatt Regency Santa Clara & The Santa Clara Convention Center is a prime location in the heart of the Silicon Valley. Enjoy this spacious venue with complimentary wifi, on-site expert staff and three great restaurants offering Tuscan cuisine, classic American or tantalizing Sushi. Staying for a couple of extra days? Take time to enjoy the Bay Area and enjoy a day in San Francisco located only an hour away. You can reserve a room by booking through the Hyatt’s dedicated Percona Live reservation site.

Book your hotel using Percona’s special room block rate!

Sponsorships

Sponsorship opportunities for Percona Live 2018 Open Source Database Conference 2018 are available and offer the opportunity to interact with the DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solution vendors, and entrepreneurs who typically attend the event. Contact live@percona.com for sponsorship details.

 

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com