Apr
26
2019
--

Load Balanced ProxySQL in Google Cloud

GCP ProxySQL LB

There are three different ways ProxySQL can direct traffic between your application and the backend MySQL services.

  1. Locally, on the MySQL servers.
  2. Between the MySQL servers and the application.
  3. Colocated on the application servers themselves.

Without going into too much detail, each has its own limitations. In the first form, the application needs to know about all MySQL servers at any given point in time. In the third form, with a large number of application servers, especially in the age of Kubernetes where apps recycle easily or scale up and down, backend connections can multiply quickly and lead to issues.

In the second form, load balancing between a pool of ProxySQL servers is normally the challenge. Do you load balance the load balancers? While there are approaches like balancing from the application, similar to how the MongoDB drivers work, the application still needs to know and maintain a list of healthy backend proxies.

Google Cloud Platform's (GCP) internal load balancer is a software-based managed service implemented via virtual networking. This means that, unlike physical load balancers, it is designed not to be a single point of failure.

We have experimented with the Internal Load Balancer (ILB) and ProxySQL using the architecture below. There are a few steps and components involved, which we explain here.

VM Instance Group

An instance group will be created to run the ProxySQL services. This needs to be a managed instance group so that the instances are distributed across multiple zones. A managed instance group can also auto scale, which you might or might not want, and because instances are built from a template, a problematic ProxySQL instance can easily be replaced with a fresh templatized VM instance.
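
A minimal sketch of creating such a group with gcloud, assuming hypothetical names (proxysql-template, proxysql-group), an arbitrary machine type and image, and the us-west1 region:

# Hypothetical sketch: create an instance template for the ProxySQL nodes,
# then a regional managed instance group spread across that region's zones.
gcloud compute instance-templates create proxysql-template \
    --machine-type=n1-standard-2 \
    --image-family=centos-7 --image-project=centos-cloud \
    --tags=proxysql

gcloud compute instance-groups managed create proxysql-group \
    --region=us-west1 \
    --template=proxysql-template \
    --size=3

The --tags value is only there so that the firewall rules described later can target the ProxySQL instances.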

Health Check

Health checks are the tricky part. GCP's internal load balancer supports HTTP(S), SSL, and TCP health checks. In this case, as long as ProxySQL is responding on the service port or admin port, the service is up, right? Yes, but this is not necessarily enough, since a port may respond while the instance is misconfigured and returning errors.

With ProxySQL, you have to treat it like an actual MySQL instance (i.e. log in and issue a query). On the other hand, the load balancer should stay agnostic: it does not necessarily need to know which MySQL backends do or do not work. The availability of backends should be left to ProxySQL as much as possible.

One way to achieve this is to use a dummy rewrite rule inside ProxySQL. In the example below, we've configured an account called percona that is assigned to a non-existent hostgroup. What we are doing is simply rewriting SELECT 1 queries to return an OK result.

mysql> INSERT INTO mysql_query_rules (active, username, match_pattern, OK_msg)
    -> VALUES (1, 'percona', 'SELECT 1', '1');
Query OK, 1 row affected (0.00 sec)
mysql> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)
mysql> SAVE MYSQL QUERY RULES TO DISK;
Query OK, 0 rows affected (0.01 sec)
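
For the rewrite rule to apply, the percona account referenced above also has to exist in ProxySQL's mysql_users table, pointed at a hostgroup that has no real backend servers. A minimal sketch, where hostgroup 999 and the password are placeholders:

mysql> INSERT INTO mysql_users (username, password, default_hostgroup)
    -> VALUES ('percona', 'password', 999);
mysql> LOAD MYSQL USERS TO RUNTIME;
mysql> SAVE MYSQL USERS TO DISK;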

[root@west1-proxy-group-9bgs ~]# mysql -upercona -ppassword -P3306 -h127.1
...
mysql> SELECT 1;
Query OK, 0 rows affected (0.00 sec)
1

This does not solve the whole problem, though, since the ILB only supports primitive TCP and HTTP(S) checks. We still need a layer that translates the response from ProxySQL into a form the ILB understands. My personal preference is to expose an HTTP service that queries ProxySQL and responds to the ILB's HTTP-based health check. It provides additional flexibility, such as being able to check specific backends or all of them. A sketch of such a service follows.
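
As one possible sketch of that layer, a small script can perform the real login-and-query check and speak just enough HTTP for the ILB; it can be exposed on a port (8080 here) via xinetd or systemd socket activation. The script name, port, and credentials are all assumptions to adapt to your environment:

#!/bin/bash
# proxysql-healthcheck.sh (hypothetical name)
# Logs in to the local ProxySQL service port and runs the rewritten SELECT 1,
# then prints a minimal HTTP response on stdout so the script can be served
# by xinetd or systemd socket activation and consumed by the ILB HTTP check.

PROXYSQL_HOST="127.0.0.1"
PROXYSQL_PORT="3306"
CHECK_USER="percona"
CHECK_PASS="password"

if mysql -u"$CHECK_USER" -p"$CHECK_PASS" -h"$PROXYSQL_HOST" -P"$PROXYSQL_PORT" \
         -NB -e "SELECT 1" >/dev/null 2>&1; then
    STATUS="200 OK"
    BODY="ProxySQL is healthy"
else
    STATUS="503 Service Unavailable"
    BODY="ProxySQL check failed"
fi

printf 'HTTP/1.1 %s\r\n' "$STATUS"
printf 'Content-Type: text/plain\r\n'
printf 'Content-Length: %d\r\n' "${#BODY}"
printf 'Connection: close\r\n\r\n'
printf '%s' "$BODY"

A 200 then means ProxySQL accepted a real login and query; a 503 means it did not, and the ILB takes the instance out of rotation.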

Firewall Rules

Health checks to the ProxySQL instances come from a specific set of IP ranges. In our case, these would be 130.211.0.0/22 and 35.191.0.0/16. Firewall rules need to allow traffic from these ranges to either the HTTP or TCP ports on the ProxySQL instances.
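
Assuming the hypothetical proxysql network tag and the 8080 health check port used in the sketches above, the rule could look roughly like this:

gcloud compute firewall-rules create allow-proxysql-health-checks \
    --network=default \
    --allow=tcp:8080 \
    --source-ranges=130.211.0.0/22,35.191.0.0/16 \
    --target-tags=proxysql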

In our next post, we will use Orchestrator to manage cross region replication for high availability.

Mar
14
2019
--

Percona’s Open Source Data Management Software Survey

PerconaSurvey

Click Here to Complete our New Survey!

Last year we informally surveyed the open source community and our conference attendees.
The results revealed that:

  • 48% of those in the cloud choose to self-manage their databases, but 52% were comfortable relying on the DBaaS offering of their cloud vendor.
  • 49% of people said “performance issues” when asked, “what keeps you up at night?”
  • The major decision influence for buying services was price, with 42% of respondents keen to make the most of their money.

We found this information so interesting that we wanted to find out more! As a result, we are pleased to announce the launch of our first annual Open Source Data Management Software Survey.

The final results will be 100% anonymous and will be made freely available under a Creative Commons license.

How Will This Survey Help The Community?

Unlimited access to accurate market data is important. Millions of open source projects are in play, and most are dependent on databases. Accurate market data helps you track the popularity of different databases, as well as seeing how and where these databases are run. This helps us all build better software and take advantage of shifting trends.

Thousands of vendors are focused on helping SysAdmins, DBAs, and Developers get the most out of their database infrastructure. Insightful market data enables them to create better tools that meet current demands and grow the open source database market.

We want to assist companies who are still deciding what, how, and where to run their systems. This information will help them understand the industry direction and allow them to make an informed decision on the software and services they choose.

How Can You Help Make This Survey A Success?

Firstly, please share your insight into current trends and new developments in open source data management software.

Secondly, please share this survey with other people who work in the industry, and encourage them to contribute.

The more responses we receive, the more useful this will be to the whole open source community. If we missed anything, or you would like to ask other questions in future, let us know!

So tell us: who are the big fish, and which minnows are nibbling at their tails?! Is the cloud giving you altitude sickness, or are you flying high? What is the next big thing and is everyone on board, or is your company lagging behind?

Preliminary results will be presented at our annual Percona Live Conference in Austin, Texas (May 28-30, 2019) by our CEO, Peter Zaitsev, and released to the open source community when finalized.

Click Here to Have Your Say!

Mar
04
2019
--

Upcoming Webinar Wed 3/6: MySQL High Availability and Disaster Recovery

MySQL High Availability and Disaster Recovery Webinar

Join Percona CEO Peter Zaitsev as he presents MySQL High Availability and Disaster Recovery on Wednesday, March 6th, 2019, at 11:00 AM PST (UTC-8) / 2:00 PM EST (UTC-5).

Register Now

In this hour-long webinar, Peter describes the differences between high availability (HA) and disaster recovery (DR). Afterward, Peter will go through scenarios detailing how each is handled manually and in Amazon RDS.

He will review the pros and cons of managing HA and DR in the traditional database environment as well in the cloud. Having full control of these areas is daunting. However, Amazon RDS makes meeting these needs easier and more efficient.

Regardless of which path you choose, monitoring your environment is vital. Peter’s talk will make that message clear. A discussion of metrics you should regularly review to keep your environment working correctly and performing optimally concludes the webinar.

In order to learn more, register for Peter's webinar on MySQL High Availability and Disaster Recovery.

Jan
11
2019
--

AWS Aurora MySQL – HA, DR, and Durability Explained in Simple Terms

It’s a few weeks after AWS re:Invent 2018 and my head is still spinning from all of the information released at this year’s conference. This year I was able to enjoy a few sessions focused on Aurora deep dives. In fact, I walked away from the conference realizing that my own understanding of High Availability (HA), Disaster Recovery (DR), and Durability in Aurora had been off for quite a while. Consequently, I decided to put this blog out there, both to collect the ideas in one place for myself, and to share them in general. Unlike some of our previous blogs, I’m not focused on analyzing Aurora performance or examining the architecture behind Aurora. Instead, I want to focus on how HA, DR, and Durability are defined and implemented within the Aurora ecosystem.  We’ll get just deep enough into the weeds to be able to examine these capabilities alone.

Introducing the Aurora storage engine

Aurora MySQL – What is it?

We’ll start with a simplified discussion of what Aurora is from a very high level.  In its simplest description, Aurora MySQL is made up of a MySQL-compatible compute layer and a multi-AZ (multi availability zone) storage layer. In the context of an HA discussion, it is important to start at this level, so we understand the redundancy that is built into the platform versus what is optional, or configurable.

Aurora Storage

The Aurora Storage layer presents a volume to the compute layer. This volume is built out in 10GB increments called protection groups.  Each protection group is built from six storage nodes, two from each of three availability zones (AZs).  These are represented in the diagram above in green.  When the compute layer—represented in blue—sends a write I/O to the storage layer, the data gets replicated six times across three AZs.

Durable by Default

In addition to the six-way replication, Aurora employs a 4-of-6 quorum for all write operations. This means that for each commit that happens at the database compute layer, the database node waits until it receives write acknowledgment from at least four out of six storage nodes. By receiving acknowledgment from four storage nodes, we know that the write has been saved in at least two AZs.  The storage layer itself has intelligence built-in to ensure that each of the six storage nodes has a copy of the data. This does not require any interaction with the compute tier. By ensuring that there are always at least four copies of data, across at least two datacenters (AZs), and ensuring that the storage nodes are self-healing and always maintain six copies, it can be said that the Aurora Storage platform has the characteristic of Durable by Default.  The Aurora storage architecture is the same no matter how large or small your Aurora compute architecture is.
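
The Aurora design material AWS has published describes the quorum arithmetic behind this; as a rough sketch (the write quorum of four and read quorum of three are part of that published design, not something you configure):

\[ V = 6 \ (\text{total copies}), \qquad V_w = 4, \qquad V_r = 3 \]
\[ V_r + V_w = 3 + 4 = 7 > 6 = V \;\Rightarrow\; \text{any read quorum overlaps the most recent write quorum} \]
\[ 2V_w = 8 > 6 = V \;\Rightarrow\; \text{two conflicting write quorums cannot both succeed} \]

And because the six copies are placed two per AZ, any four acknowledgments must come from at least two different AZs, which is the basis for the two-AZ guarantee mentioned above.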

One might think that waiting to receive four acknowledgments represents a lot of I/O time and is therefore an expensive write operation.  However, Aurora database nodes do not behave the way a typical MySQL database instance would. Some of the round-trip execution time is mitigated by the way in which Aurora MySQL nodes write transactions to disk. For more information on exactly how this works, check out Amazon Senior Engineering Manager, Kamal Gupta’s deep-dive into Aurora MySQL from AWS re:Invent 2018.

HA and DR Options

While durability can be said to be a default characteristic to the platform, HA and DR are configurable capabilities. Let’s take a look at some of the HA and DR options available. Aurora databases are deployed as members of an Aurora DB Cluster. The cluster configuration is fairly flexible. Database nodes are given the roles of either Writer or Reader. In most cases, there will only be one Writer node. The Reader nodes are known as Aurora Replicas. A single Aurora Cluster may contain up to 15 Aurora Replicas. We’ll discuss a few common configurations and the associated levels of HA and DR which they provide. This is only a sample of possible configurations: it is not meant to represent an exhaustive list of the possible configuration options available on the Aurora platform.

Single-AZ, Single Instance Deployment

Great durability with Aurora, but DR and HA less so

The most basic implementation of Aurora is a single compute instance in a single availability zone. The compute instance is monitored by the Aurora Cluster service and will be restarted if the database instance or compute VM has a failure. In this architecture, there is no redundancy at the compute level. Therefore, there is no database level HA or DR. The storage tier provides the same high level of durability described in the sections above. The image below is a view of what this configuration looks like in the AWS Console.

Single-AZ, Multi-Instance

Introducing HA into an Amazon Aurora solution

HA can be added to a basic Aurora implementation by adding an Aurora Replica.  We increase our HA level by adding Aurora Replicas within the same AZ. If desired, the Aurora Replicas can be used to also service some of the read traffic for the Aurora Cluster. This configuration cannot be said to provide DR because there are no database nodes outside the single datacenter or AZ. If that datacenter were to fail, then database availability would be lost until it was manually restored in another datacenter (AZ). It’s important to note that while Aurora has a lot of built-in automation, you will only benefit from that automation if your base configuration facilitates a path for the automation to follow. If you have a single-AZ base deployment, then you will not have the benefit of automated Multi-AZ availability. However, as in the previous case, durability remains the same. Again, durability is a characteristic of the storage layer. The image below is a view of what this configuration looks like in the AWS Console. Note that the Writer and Reader are in the same AZ.

Multi-AZ Options

Partial disaster recovery with Amazon Aurora

Building on our previous example, we can increase our level of HA and add partial DR capabilities to the configuration by adding more Aurora Replicas. At this point we will add one additional replica in the same AZ, bringing the local AZ replica count to three database instances. We will also add one replica in each of the two remaining regional AZs. Aurora provides the option to configure automated failover priority for the Aurora Replicas. Choosing your failover priority is best defined by the individual business needs. That said, one way to define the priority might be to set the first failover to the local-AZ replicas, and subsequent failover priority to the replicas in the other AZs. It is important to remember that AZs within a region are physical datacenters located within the same metro area. This configuration will provide protection for a disaster localized to the datacenter. It will not, however, provide protection for a city-wide disaster. The image below is a view of what this configuration looks like in the AWS Console. Note that we now have two Readers in the same AZ as the Writer and two Readers in two other AZs.

Cross-Region Options

The three configuration types we’ve discussed up to this point represent configuration options available within an AZ or metro area. There are also options available for cross-region replication in the form of both logical and physical replication.

Logical Replication

Aurora supports replication to up to five additional regions with logical replication.  It is important to note that, depending on the workload, logical replication across regions can be notably susceptible to replication lag.
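
If you do rely on cross-region logical replication, that lag is worth watching from the replica side. Assuming the replica exposes standard binlog replication status (a generic MySQL sketch, not an Aurora-specific command):

mysql> SHOW SLAVE STATUS\G

and keep an eye on Seconds_Behind_Master, bearing in mind that it is only an approximation.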

Physical Replication

Durability, High Availability and Disaster Recovery with Amazon Aurora

One of the many announcements to come out of re:Invent 2018 is a product called Aurora Global Database. This is Aurora’s implementation of cross-region physical replication. Amazon’s published details on the solution indicate that it is storage level replication implemented on dedicated cross-region infrastructure with sub-second latency. In general terms, the idea behind a cross-region architecture is that the second region could be an exact duplicate of the primary region. This means that the primary region can have up to 15 Aurora Replicas and the secondary region can also have up to 15 Aurora Replicas. There is one database instance in the secondary region in the role of writer for that region. This instance can be configured to take over as the master for both regions in the case of a regional failure. In this scenario the secondary region becomes primary, and the writer in that region becomes the primary database writer. This configuration provides protection in the case of a regional disaster. It’s going to take some time to test this, but at the moment this architecture appears to provide the most comprehensive combination of Durability, HA, and DR. The trade-offs have yet to be thoroughly explored.

Multi-Master Options

Amazon is in the process of building out a new capability called Aurora Multi-Master. Currently, this feature is in preview phase and has not been released for general availability. While there were a lot of talks at re:Invent 2018 which highlighted some of the components of this feature, there is still no affirmative date for release. Early analysis points to the feature being localized to the AZ. It is not known if cross-region Multi-Master will be supported, but it seems unlikely.

Summary

As a post re:Invent takeaway, what I learned was that there is an Aurora configuration to fit almost any workload that requires strong performance behind it. Not all heavy workloads also demand HA and DR. If this describes one of your workloads, then there is an Aurora configuration that fits your needs. On the flip side, it is also important to remember that while data durability is an intrinsic quality of Aurora, HA and DR are not. These are completely configurable. This means that the Aurora architect in your organization must put thought and due diligence into the way they design your Aurora deployment. While we all need to be conscious of costs, don’t let cost consciousness become a blinder to reality. Just because your environment is running in Aurora does not mean you automatically have HA and DR for your database. In Aurora, HA and DR are configuration options, and just like in the on-premises world, viable HA and DR have additional costs associated with them.


Nov
26
2018
--

Percona at AWS Re:Invent 2018!

AWS re:Invent

Come see Percona at AWS re:Invent from November 26-30, 2018 in booth 1605 in The Venetian Hotel Expo Hall.

Percona is a Bronze sponsor of AWS re:Invent in 2018 and will be there for the whole show! Drop by booth 1605 in The Venetian Expo Hall to discuss how Percona’s unbiased open source database experts can help you with your cloud database and DBaaS deployments!

Our CEO, Peter Zaitsev, will be presenting a keynote called MySQL High Availability and Disaster Recovery at AWS re:Invent!

  • When: 27 November at 1:45 PM – 2:45 PM
  • Where: Bellagio Hotel, Level 1, Gauguin 2

Check out our case study with Passportal on how Percona DBaaS expertise helps guarantee uptime for their AWS RDS environment.

Percona has a lot of great content on how we can help improve your AWS open source database deployment. Check out some of our resources below:

Blogs:

Case Studies:

White Papers:

Webinars:

Datasheets:

See you at the show!

Nov
16
2018
--

Percona on the AWS Database Blog

AWS Database Blog

Percona’s very own Michael Benshoof posted an article called Supercharge your Amazon RDS for MySQL deployment with ProxySQL and Percona Monitoring and Management on the AWS Database blog.

In this article, Michael looks at how to maximize the performance of your cloud database Amazon RDS deployment. He shows that you can enhance performance by using ProxySQL to handle load balancing and Percona Monitoring and Management to look at performance metrics.

Percona is an Advanced AWS Partner and provides support and managed cloud database services for AWS RDS and Amazon Aurora.

Check the article out!

Nov
14
2018
--

Migrating to Amazon Aurora: Reduce the Unknowns

Migrating to Amazon Aurora. Shutterstock.com

In this Checklist for Success series, we will discuss reducing unknowns when hosting in the cloud with, and migrating to, Amazon Aurora. These tips might also apply to other database as a service (DBaaS) offerings.

While DBaaS encapsulates a lot of the moving pieces, it also means relying on this approach for your long-term stability. This encapsulation is a double-edged sword: it takes away your visibility into performance outside of the service layer.

Shine a Light on Bad Queries

Bad queries are one of the top causes of downtime, and Aurora doesn’t protect you against them. Performing a query review as part of a routine health check of your workload helps ensure that you do not miss looming issues. It also helps you predict the workload at specific times and events. For example, if you already know your top three queries tend to increase exponentially and are read bound, you can easily decide to increase the number of read replicas on your cluster.

Having historical query performance data helps make this task easier and less stressful. While historical data allows you to look backward, it’s also very valuable to have a tool that lets you look at active incident scenarios in progress. Knowing what queries are currently running when suffering from performance issues reduces guesswork and helps solve problems faster.

Pick Your Tool(s)

There are a number of ways you can achieve query performance excellence. Performance Insights is a built-in offering from AWS that is tightly integrated with RDS. It has a seven-day free retention period, with an extra cost beyond that. It is available for each instance in a cluster. Performance Insights takes most of its metrics from the Performance_Schema. It includes specific metrics from the operating system that may not be available from regular Cloudwatch metrics.

Query Analytics from Percona Monitoring and Management (PMM) also uses the same source as Performance Insights: the Performance Schema. Unlike Performance Insights though, PMM is deployed separately from the cluster. This means you can keep your metrics even if you keep recycling your cluster instances. With PMM, you can also consolidate your query reviews from a single location, and you can monitor your cluster instances from the same location – including an extensive list of performance metrics.
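
Both Performance Insights and PMM Query Analytics ultimately read from the Performance Schema, so you can also eyeball the same raw data directly on an instance (assuming the Performance Schema is enabled). A quick sketch of listing the top statements by cumulative latency (statement timers are reported in picoseconds, hence the division):

mysql> SELECT DIGEST_TEXT, COUNT_STAR,
    ->        ROUND(SUM_TIMER_WAIT/1e12, 2) AS total_latency_sec
    ->   FROM performance_schema.events_statements_summary_by_digest
    ->  ORDER BY SUM_TIMER_WAIT DESC
    ->  LIMIT 10;

This is no substitute for either tool, but it is handy when you need a quick answer on a single instance.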

You can enable Performance Insights and configure for the default seven-day retention period, and then combine with PMM for longer retention period across all your cluster instances. Note though that PMM may incur a cost for additional API calls to retrieve performance insight metrics.

Outside of the built-in and open source alternatives, VividCortex, NewRelic and Datadog are excellent tools that do everything we discussed above and more. NewRelic, for example, allows you to take a good view of database, application, and external request timing. This, in my opinion, is very valuable.

Bad queries are not the only potential unknowns. Deleted rows, dropped tables, crippling schema changes, and even AZ/Region failures are realities in the cloud. We will discuss them next! Stay “tuned” for part two: Migrating to Amazon Aurora: Optimize for Binary Log Replication.

Meanwhile, we’d like to hear your success stories in Amazon Aurora in the comments below!

Don’t forget to come by and see us at AWS re:Invent, November 26-30, 2018 in booth 1605! Percona CEO Peter Zaitsev will deliver a keynote on MySQL High Availability & Disaster Recovery, Tuesday, November 27 at 1:45 PM – 2:45 PM in the Bellagio Hotel, Level 1, Gauguin 2

Nov
02
2018
--

Maintenance Windows in the Cloud

maintenance windows cloud

Recently, I’ve been working with a customer to evaluate the different cloud solutions for MySQL. In this post I am going to focus on maintenance windows and requirements, and what the different cloud platforms offer.

Why is this important at all?

Maintenance windows are required so that the cloud provider can do the necessary updates, patches, and changes to our setup. But there are many questions like:

  • Is this going to impact our production traffic?
  • Is this going to cause any downtime?
  • How long does it take?
  • Any way to avoid it?

Let’s discuss the three most popular cloud providers: AWS, Google, and Microsoft. Each of them has a MySQL-based database service, so we can compare their maintenance settings.

AWS

When you create an instance you can define your maintenance window. It is a 30-minute block during which AWS can update and restart your instances, but the work might take longer; AWS does not guarantee that the update will be finished within 30 minutes. There are two different types of updates: Required and Available.

If you defer a required update, you receive a notice from Amazon RDS indicating when the update will be performed. Other updates are marked as available, and these you can defer indefinitely.

It is even possible to disable auto upgrade for minor versions, in which case you can decide when you want to do the maintenance.
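
As a sketch with a hypothetical instance identifier (the window is specified in UTC), both the window and the minor version upgrade behavior can be adjusted from the AWS CLI:

aws rds modify-db-instance \
    --db-instance-identifier mydb \
    --preferred-maintenance-window sun:03:00-sun:03:30 \
    --no-auto-minor-version-upgrade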

AWS separates OS updates from database engine updates.

OS Updates

OS updates require some downtime, but you can minimize it by using Multi-AZ deployments. First, the secondary instance is updated. Then AWS does a failover and updates the primary instance as well. This means a small outage during the failover.

DB Engine Updates

For DB maintenance, the updates are applied to both instances (primary and secondary) at the same time. That will cause some downtime.

More information: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_UpgradeDBInstance.Maintenance.html#USER_UpgradeDBInstance.Maintenance.Multi-AZ

Google CloudSQL

With CloudSQL you have to define an hour-long maintenance window, for example 01:00–02:00, and in that hour the instances can be restarted at any time. It is not guaranteed that the update will be finished within that hour. The primary and the secondary have the same maintenance window. The read replicas do not have any maintenance window; they can be stopped at any time.

CloudSQL does not differentiate between OS and DB engine updates, or between required and available upgrades. Because the failover replica has the same maintenance window, any upgrade might cause a database outage in that time frame.
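
For reference, the window itself is set on the instance; a sketch with a hypothetical instance name, picking Sunday at hour 1:

gcloud sql instances patch my-cloudsql-instance \
    --maintenance-window-day=SUN \
    --maintenance-window-hour=1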

More information: https://cloud.google.com/sql/docs/mysql/instance-settings

Microsoft Azure

Azure provides a service called Azure Database for MySQL. I read the documentation and did some research trying to find anything regarding maintenance windows, but I did not find anything.

I spun up an instance in Azure to see if there were any relevant settings, but I did not find any, so at this point I do not know how Azure performs OS or DB maintenance, or how that impacts production traffic.

If someone knows where I can find this information in the documentation, please let me know.

Conclusion

                                       AWS       CloudSQL   Azure
Maintenance Window                     30m       1h         Unknown
Maintenance Window for Read Replicas   No        No         Unknown
Separate OS and DB updates             Yes       No         Unknown
Outage during update                   Possible  Possible   Unknown
Postpone an update                     Possible  No         Unknown
Different priority for updates         Yes       No         Unknown

 

While I do not intend to favor or promote any of the providers, for this specific question AWS offers the most options and control over how we deal with maintenance.


Photo by Caitlin Oriel on Unsplash

Oct
03
2018
--

Percona Blog Poll: How Do You Run Your Database in the Cloud?

Cloud database poll

Percona’s latest blog poll asks: how do you run your database in the cloud?

Join in!

Are you using a fully managed service or are you self-managing your databases in the cloud? And which provider are you relying on? Perhaps you’re using more than one. Don’t worry, you can tick multiple boxes, so please choose up to four answers. If you don’t see your solution listed, use the comments box on this blog to share your thoughts.

If you’d like to, you’re also welcome to leave a comment to tell us about your choice, or maybe why you’re NOT planning on moving to a cloud solution in the near future. Likewise, if you want to share how you’ve found your cloud deployment so far, feel free to send a comment. Thanks!



Sep
07
2018
--

Taboola and Percona Host a Meetup in Tel Aviv, Oct 3

Percona Taboola Meetup in Tel Aviv

From time to time, Percona hosts and co-hosts events around the world, often at no cost for attendees. The common focus for these events is, of course, open source databases. On October 3, 2018, we are teaming up with Taboola for an evening of talks suitable for all levels of expertise, addressing both the how and the why of choosing a database technology. The event is to be held at the Taboola offices in Tel Aviv. Here’s the schedule:

  • 6:00 pm Welcome with food & drink served
  • 6:45 pm How to pick database technologies that cover your requirements (Dimitri Vanoverbeke, Percona)
  • 7:20 pm Lessons from the field, extremely heavy loads on Percona Server for MySQL (Ariel Pisetzky, Taboola)
  • 7:40 pm Migrating to the cloud, the why and the how. (Dimitri Vanoverbeke, Percona)
  • 8:00 pm Event close

If you would like to join us, please sign up here so that we can be sure to cater for you.

Meanwhile, if you have not yet bought your tickets for Percona Live Europe, from November 5 – 7 in Frankfurt, then hurry… I have extended Early Bird pricing by one week. We’ll be publishing the exciting, high-quality schedule soon, so watch out for it! (But don’t miss the Early Bird…)

Finally, if you would like to be sure not to miss our future events, please subscribe to our newsletter; we don’t necessarily blog about every opportunity.

