May
08
2024
--

Mastering Database Monitoring: Running PMM in High Availability Mode

Percona Monitoring and Management (PMM) has become a valuable tool for database professionals, providing comprehensive insights into database health and performance. A recent update (version 2.41.0) introduced a significant enhancement: the ability to run PMM in high availability (HA) mode. This feature, currently in technical preview, offers exciting possibilities for ensuring the reliability and robustness […]

Dec
21
2023
--

Percona Monitoring and Management High Availability – A Proof of Concept

Percona Monitoring and Management (PMM) is a state-of-the-art piece of software that exists in part thanks to great open source projects like VictoriaMetrics, PostgreSQL, and ClickHouse. The integration of those projects, plus the years of Percona expertise in the database space, makes PMM one of the best database monitoring solutions on the market.

Being composed of multiple different technologies can add complexity to a well-known concept: high availability (HA). Achieving HA for PMM as a whole has proved to be a challenge, but not an impossible task.

The easy part

Setting up the PMM cluster is the easy part. All it needs is a reverse proxy in front of two or more PMM instances. The go-to proxy is HAProxy, configured for an active/passive topology, that is, without load distribution.

For the purpose of the PoC, a single HAProxy instance is used (running as a Docker container). The configuration file looks like this:

global
    stats socket /var/run/api.sock user haproxy group haproxy mode 660 level admin expose-fd listeners
    log stdout format raw local0 info

defaults
    mode http
    timeout client 10s
    timeout connect 5s
    timeout server 10s
    timeout http-request 10s
    log global

frontend stats
    bind *:8404
    stats enable
    stats uri /
    stats refresh 10s

frontend pmmfrontend
    bind :80
    default_backend pmmservers

backend pmmservers
    option tcp-check
    server pmm1 172.31.12.174:80 check port 80
    server pmm2 172.31.11.132:80 check port 80 backup

The Docker container is run with this command:

docker run -d --name haproxy -v $(pwd):/usr/local/etc/haproxy:ro -p 80:80 -p 443:443 -p 8404:8404 haproxytech/haproxy-alpine:2.

The -v volume mount guarantees that the local copy of the haproxy.cfg file is the one used inside the container. Whenever you make a change to the cfg file, have the haproxy container pick it up by executing:

docker kill -s HUP haproxy

And to follow the haproxy logs:

docker logs -f haproxy

We have two frontends: one for the HAProxy stats and another for PMM itself. There is a single backend, where the “passive” PMM instance (the one that is a pure “read replica”) is marked as “backup” so that traffic is only routed there if the primary fails the health check.

For simplicity, the PMM instances are configured to listen on port 80 (HTTP) on their private IPs. This is done to avoid the need for SSL certificates, since everything goes through the same VPC (everything runs on EC2 instances in AWS). The health check, then, can be a simple tcp-check against port 80.


Stats are available via port 8404. With this, the easy part is done.
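You can also query the backend state directly over the admin socket enabled in the global section above (a sketch; it assumes socat is available inside the container):

docker exec haproxy sh -c 'echo "show servers state pmmservers" | socat stdio /var/run/api.sock'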

For this example, the PMM SERVER endpoint will be where the HAProxy frontend is listening, and that’s the one used when registering a new PMM CLIENT.


And you can always access PMM using the same endpoint.

The not-so-easy part (made easy)

The proxy is configured to be aware of two different PMM instances, pmm1 and pmm2 (using the private IPs 172.31.12.174 and 172.31.11.132 in this case), but we haven’t mentioned anything yet about those PMM instances themselves.

And here is the not-so-easy part: One has to deploy at least two PMM instances on at least two different servers AND set up replicas. How to do it? This is the actual Proof of Concept: Enter the PMM HA script: https://github.com/nethalo/pmmha

The script takes care of installing PMM (if you already have it, you can skip this step), preparing the Primary, and setting up the Secondary. Simple as that. PMM remains a black box.

The requirements are:

– Docker 23.0.3 and higher
– Docker installation will fail on Amazon Linux or RHEL9/EL9 (unless you are on an s390x architecture machine), since the Percona Easy-Install script relies on the “Get Docker” script (https://get.docker.com/) for the Docker install. You will need to install Docker on your own in those cases
– SSH Access to the host servers
– sudo capabilities
– Ports 443 and 9000 accessible from outside the Primary host machine

Steps to run the script:

git clone https://github.com/nethalo/pmmha.git
cd pmmha
bash pmm.sh

Failover

The failover is handled automatically by HAProxy when it detects that the current primary is no longer available. Traffic is routed to the backup server in the backend, which, if properly set up as a replica, will already have the historical data for metrics and QAN, and its inventory will be ready to continue the data scraping from the exporters.
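A simple way to rehearse the failover (a sketch; pmm-server is the container name the Easy-Install script uses, and <haproxy-host> is a placeholder):

# on the primary host: stop PMM to simulate a failure
docker stop pmm-server

# on the proxy host: watch the health check mark pmm1 down and switch to the backup
docker logs -f haproxy

# the endpoint keeps answering, now served by pmm2
curl -s -o /dev/null -w "%{http_code}\n" http://<haproxy-host>/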

Your feedback is key

We want to hear from you! Suggestions, feature requests, and comments in general are welcome, either in the comment section or via the GitHub repository.

Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.

 

Download Percona Monitoring and Management Today

Aug
22
2017
--

The MySQL High Availability Landscape in 2017 (The Adults)

In this blog post, we’ll look at some of the MySQL high availability solution options.

In the previous post of this series, we looked at the MySQL high availability (HA) solutions that have been around for a long time. I called these solutions “the elders.” Some of these solutions (like replication) are heavily used today and have been improved from release to release of MySQL.

This post focuses on the MySQL high availability solutions that have appeared over the last five years and gained a fair amount of traction in the community. I chose to include only two solutions in this group: Galera and RDS Aurora. I’ll use the term “Galera” generically: it covers Galera Cluster, MariaDB Cluster and Percona XtraDB Cluster. I debated for some time whether or not to include Aurora. I don’t like the fact that it uses closed source code. Given the tight integration with the AWS environment, what is the commercial risk of opening the source code? That question evades me, but I am not on the business side of technology.

Galera

When I say “Galera,” it means a replication protocol supported by a library provided by Codership, a Finnish company. The library needs hooks inside the MySQL source code for the replication protocol to work. In Percona XtraDB Cluster, I counted 66 .cc files where the word “wsrep” is present. As you can see, it is not a small task to add support for Galera to MySQL. Not all the implementations are similar. Percona, for example, focused more on stability and usability at the expense of new features.


Let’s start with a quick description of a Galera-based cluster. Unless you don’t care about split-brain, a Galera cluster needs at least three nodes. The Galera replication protocol is nearly synchronous, which is a huge gain compared to regular MySQL replication. It performs transactions almost simultaneously on all the nodes, and the protocol ensures the same commit order. The transactions are almost synchronous because there are incoming queues on each node to improve performance. The presence of these incoming queues forces an extra step: the certification. The certification compares an incoming transaction with the ones already queued. If there is a conflict, it returns a deadlock error.
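In practice, this means a client that writes to more than one node should be prepared to retry on certification failure. A minimal retry sketch (host, table, and row are illustrative):

# a conflicting write fails certification and returns a deadlock error; retry a few times
for attempt in 1 2 3; do
  mysql -h node2 -e "UPDATE t1 SET colA = colA + 1 WHERE id = 1;" && break
  sleep 0.1
done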

For performance reasons, the certification process must be quick so that the incoming queue stays in memory. Since the size of the queue is defined by the size of the transactions in it, the presence of large transactions uses a lot of memory. There are safeguards against memory overload, so be aware that transactions like:

update ATableWithMillionsRows set colA=1;

will likely fail. That’s the first important limitation of a Galera-based cluster: the size of transactions is limited.
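The usual workaround is to break such a statement into smaller transactions, for example by primary key range. A sketch (the table layout and host are illustrative):

for start in $(seq 0 10000 3990000); do
  end=$(( start + 10000 ))
  # each chunk is a small transaction that passes certification comfortably
  mysql -h node1 -e "UPDATE ATableWithMillionsRows SET colA = 1 WHERE id >= $start AND id < $end;"
done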

It is also critical to uniquely identify conflicting rows. The best way to achieve an efficient row comparison is to make sure all the tables have a primary key. In a Galera-based cluster, your tables need primary keys, otherwise you’ll run into trouble. That’s the second limitation of a Galera-based cluster: the need for primary keys. Personally, I think that a table should always have a primary key – but I have seen many oddities…
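Before migrating, it is worth auditing for tables without a primary key; information_schema has everything needed for that (a sketch):

mysql -e "SELECT t.table_schema, t.table_name
    FROM information_schema.tables t
    LEFT JOIN information_schema.table_constraints c
      ON c.table_schema = t.table_schema
     AND c.table_name = t.table_name
     AND c.constraint_type = 'PRIMARY KEY'
   WHERE t.table_type = 'BASE TABLE'
     AND c.constraint_name IS NULL
     AND t.table_schema NOT IN ('mysql','information_schema','performance_schema','sys');"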

Another design characteristic is the need for an acknowledgment by all the nodes when a transaction commits. That means the network link with the largest latency between two nodes will set the floor value of the transactional latency. It is an important factor to consider when deploying a Galera-based cluster over a WAN. Similarly, an overloaded node can slow down the cluster if it cannot acknowledge the transaction in a timely manner. In most cases, adding slave threads will allow you to overcome the throughput limitations imposed by the network latency. Each transaction will suffer from latency, but more of them will be able to run at the same time and maintain the throughput.

The exception here is when there are “hot rows.” A hot row is a row that is hammered by updates all the time. The transactions affecting hot rows cannot be executed in parallel and are thus limited by the network latency.
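The apply parallelism mentioned above is controlled in Galera and Percona XtraDB Cluster by the wsrep_slave_threads variable, which can be raised at runtime (a sketch; the value is illustrative):

mysql -e "SET GLOBAL wsrep_slave_threads = 16;"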

Since Galera-based clusters are very popular, they must also have some good points. The first and most obvious is the full-durability support. Even if the node on which you executed a transaction crashes a fraction of a second after the commit statement returned, the data is present in the other nodes’ incoming queues. In my opinion, this is the main reason for the demise of the shared storage solution. Before Galera-based clusters, the shared storage solution was the only other solution guaranteeing no data loss in case of a crash.

While the standby node is unusable with the shared storage solution, all the nodes of a Galera-based cluster are available and are almost in sync. All the nodes can be used for reads without stressing too much about replication lag. If you accept a higher risk of deadlock errors, you can even write on all nodes.

Finally, and not least, there is an automatic provisioning service for new nodes called SST. During the SST process, a joiner node asks the cluster for a donor. One of the existing nodes agrees to be the donor and initiates a full backup. The backup is streamed over the network to the joiner and restored there. When the backup completes, the joiner performs an IST to get the most recent updates and, once they are applied, joins the cluster. The most common SST method uses the Percona XtraBackup utility. When using XtraBackup for SST, the cluster is fully available during the transfer, although performance may degrade. This feature really simplifies the operational side of things.
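The SST method is chosen in the server configuration, and a status variable shows each node's role while a transfer is running (a sketch; the config location is illustrative):

# in the [mysqld] section of each node's my.cnf:
# wsrep_sst_method = xtrabackup-v2

# watch a node's state during SST (Synced, Donor/Desynced, Joined, ...)
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';"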

The technology is very popular. Of course, I am a bit biased since I work for Percona and one of our flagship products is Percona XtraDB Cluster – an implementation of the Galera protocol. Other than standard MySQL replication, it is by far the most common HA solution used by the customers I work with.

RDS Aurora

The second “adult” MySQL high availability solution is RDS Aurora. I hesitated to add Aurora here, mainly because it is not an open-source technology. I must also admit that I haven’t followed the latest developments around Aurora very closely. So, let’s describe Aurora.

There are two major parts in Aurora: the database servers (at least one of them, the writer node) and the storage.


What makes Aurora special is that the storage layer has its own processing logic. I don’t know if the processing logic is part of the writer node’s AWS instance or part of the storage service directly, since the source code is not available. Anyway, I’ll call that layer the appliers. The appliers’ role is to apply redo log fragments, which allows the writer node to write only the redo log fragments (normally written to the InnoDB log files). The appliers read those fragments and modify the pages in the storage. If a node requests a page that has pending redo fragments to be applied, they get applied before the page is returned.

From the writer node’s perspective, there are far fewer writes. There is also no direct upper bound on the number of fragments that can be queued, so it is a bit like having innodb_log_file_size set to an extremely large value. Also, since Aurora doesn’t need to flush pages, if the writer node needs to read from the storage and there are no free pages in the buffer pool, it can just discard one, even if it is “dirty.” Actually, there are no dirty pages in the buffer pool.

So far, that seems to be very good for high write loads with spikes. What about the reader nodes? These reader nodes receive the updates from the writer node. If an update concerns a page they have in their buffer pool, they can modify it in place or discard it and read it again from the storage. Again, without the source code, it is hard to tell how this is implemented. The point is, the readers have the same data as the master; they simply cannot lag behind.

Apart from the impossibility of any reader lag, the other good point of Aurora is the storage. InnoDB pages are saved as objects in an object store, not as regular files on a file system. That means you don’t need to over-provision your storage ahead of time. You pay for what you are using (actually, for the maximum you have ever used: InnoDB tablespaces do not shrink, even with Aurora).

Furthermore, if you have a 5TB dataset and your workload is such that you would need ten servers (one writer and nine readers), you still need only 5TB of storage if you are not replicating to other AZs. If we compare that with regular MySQL and replication, you would have one master and nine slaves, each with 5TB of storage, for a total of 50TB. On top of that, you’d have at least ten times the write IOPS.

So, storage-wise, we have something that could be very interesting for applications with large datasets and highly concurrent, read-heavy workloads. You ensure high availability with the ability to promote a reader to writer automatically. You access the primary or the readers through endpoints that automatically connect to the correct instances. Finally, you can replicate the storage to multiple availability zones for DR.

Of course, such an architecture comes with a trade-off. If you experiment with Aurora, you’ll quickly discover that the smallest instance types underperform while the largest ones perform in a more expected manner. It is also quite easy to overload the appliers. Just perform the following queries:

update ATableWithMillionsRows set colA=1;
select count(*) from ATableWithMillionsRows where colA=1;

given that the table ATableWithMillionsRows is larger than the buffer pool. The select will hang for a long time because the appliers are overloaded by the number of pages to update.

In terms of adoption, we have some customers at Percona using Aurora, but not that many. It could be that users of Aurora do not naturally go to Percona for services and support. I also wonder about the decision to keep the source code closed. It is certainly not a positive marketing factor in a community like the MySQL community. Since the Aurora technology seems tightly bound to the AWS ecosystem, is there really a risk of the technology being reused by a competitor? With a better understanding of the technology through open access to the source, Amazon could have received valuable contributions. It would also be much easier to understand, tune and recommend Aurora.


Jul
20
2017
--

Where Do I Put ProxySQL?


In this blog post, we’ll look at how to deploy ProxySQL.

ProxySQL is a high-performance proxy, currently for MySQL and its forks (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. It was created for DBAs by René Cannaò, as a means of solving complex replication topology issues. When bringing up ProxySQL with my clients, I always get questions about where it fits into the architecture. This post should clarify that.

Before continuing, you might want to know why you should use this software. The features that are of interest include:

  • MySQL firewall
  • Connection pooling
  • Shard lookup and automated routing
  • Ability to read/write split
  • Automatically switch to another master in case of active master failure
  • Query cache
  • Performance metrics
  • Other neat features!

Initial Configuration

In general, you install it on nodes that do not have a running MySQL database. You manage it via the MySQL command line on another port, usually 6032. Once it is started, the configuration in /etc is not used; you do everything within the CLI. The backend database is actually SQLite, and the db file is stored in /var/lib/proxysql.
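As a sketch of a first admin session (admin/admin are ProxySQL’s stock admin credentials; the backend IP and hostgroup are assumptions):

# register a backend MySQL server in hostgroup 1, then activate and persist the change
mysql -u admin -padmin -h 127.0.0.1 -P6032 -e "INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (1, '10.0.0.10', 3306);"
mysql -u admin -padmin -h 127.0.0.1 -P6032 -e "LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;"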

There are many guides out there on initializing and installing it, so I won’t cover those details here. It can be as simple as:

apt-get install proxysql

ProxySQL Architecture

While most first think to install ProxySQL on a standalone node between the application and database, this has the potential to affect query performance due to the additional latency from network hops.

 

[diagram: ProxySQL on a standalone node between the application and the database]

To have minimal impact on performance (and avoid the additional network hop), many recommend installing ProxySQL on the application servers. The application then connects to ProxySQL (acting as a MySQL server) on localhost, using a Unix domain socket and avoiding extra latency. ProxySQL then uses its routing rules to reach out and talk to the actual MySQL servers, with its own connection pooling. The application doesn’t have any idea what happens beyond its connection to ProxySQL.

[diagram: ProxySQL running on each application server]
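As a quick sketch of what that looks like from the application host (the user, password, and socket path are assumptions):

mysql -u appuser -papppass -S /tmp/proxysql.sock -e "SELECT @@hostname;"

The query returns the hostname of whichever backend ProxySQL routed it to, which is a handy way to verify the routing rules.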

Reducing Your Network Attack Surface

Another consideration is reducing your network attack surface. This means attempting to control all of the possible vulnerabilities in your network’s hardware and software that are accessible to unauthenticated users.

Percona generally suggests that you put a ProxySQL instance on each application host, like in the second image above. This suggestion is certainly valid for reducing latency in your database environment (by limiting network jumps). But while this is good for performance, it can be bad for security.

Every instance must be able to talk to:

  • Every master
  • Every slave

As you can imagine, this is a security nightmare. With every instance, you have x many more connections spanning your network. That’s x many more connections an attacker might exploit.

Instead, it can be better to have one or more ProxySQL instances that are between your application and MySQL servers (like the first image above). This provides a reasonable DMZ-type setup that prevents opening too many connections across the network.

That said, both architectures are valid production configurations – depending on your requirements.

Jun
23
2017
--

Percona XtraDB Cluster, Galera Cluster, MySQL Group Replication High Availability Webinar: Q & A


Thank you for attending the Wednesday, June 21, 2017 high availability webinar titled Percona XtraDB Cluster, Galera Cluster, MySQL Group Replication. In this blog, I will provide answers to the Q & A for that webinar.

You can find the slides and a recording of the webinar here.

Is there a minimum MySQL server version for Group Replication?

MySQL Group Replication has been GA since MySQL Community 5.7.17. This is the lowest version you should use for the Group Replication feature. Otherwise, you are using a beta version.

Since 5.7.17 was the GA release, it’s strongly recommended you use the latest 5.7 minor release. Bugs get fixed and features added in each of the minor releases (as can be seen in the Limitations section in the slide deck).

In MySQL 5.6 and earlier versions, Group Replication is not supported. Note that Percona Server for MySQL 5.7.17 and beyond also ships with Group Replication.

Can I use Percona XtraDB Cluster with MariaDB v10.2? or must I use Percona Server for MySQL?

Percona XtraDB Cluster is Percona Server for MySQL and Percona XtraBackup with the modified Galera library. You cannot run Percona XtraDB Cluster on MariaDB.

However, as Percona XtraDB Cluster is open source, it is possible that MariaDB/Codership implements our modifications into their codebase.

If Percona XtraDB Cluster only allows InnoDB tables, how do we typically deal with applications that need to use MyISAM tables?

You cannot use MyISAM with Percona XtraDB Cluster, Galera or Group Replication. There is, however, experimental MyISAM support in Galera/Percona XtraDB Cluster, but we strongly recommend that you don’t use it in production. It effectively executes all statements in Total Order Isolation, which results in bad performance.

What is a typical business use case for the Group Replication? I specifically like the writes order feature.

Typical use cases are:

  • Environments with strict durability requirements
  • Writing to multiple nodes simultaneously while keeping data consistent
  • Reducing failover time
  • Using other nodes for read-scaling, where reading stale data is a problem for the application (as opposed to standard asynchronous replication)

The use cases for Galera and Percona XtraDB Cluster are similar.
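If you want to poke at a running group, Group Replication exposes its membership through performance_schema (the host name here is an assumption):

mysql -h node1 -e "SELECT MEMBER_HOST, MEMBER_STATE FROM performance_schema.replication_group_members;"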

Where do you run ProxySQL, on a separate server? We are using HAProxy.

You can deploy ProxySQL in many different ways. One common method of installation is to run ProxySQL on a separate layer of servers (ensuring there is failover on this layer). Another commonly used method is to run a ProxySQL daemon on every application server.

Do you support KVM?

Yes, there are no limitations on virtualization solutions.

Can you give some examples of an “arbitrator”?

Some useful links:

What does Percona XtraDB add to make it more performant than InnoDB?

The scalability and performance improvements of Percona XtraDB are listed on the Percona Server for MySQL documentation page: https://www.percona.com/doc/percona-server/LATEST/index.html

How scalable is Percona XtraDB Cluster storage-wise? Do we have any limitations?

Storage happens through the storage engine (which is InnoDB). Percona XtraDB Cluster does not have any different limitations than Percona Server for MySQL or MySQL.

However, we need to also consider the practical side of things: the larger the cluster gets, the longer certain operations take. For example, when adding a new node to the cluster another node must be the donor and provide all the data. This will take substantially longer with larger datasets. Certain operational aspects might therefore become more complex.

Is there any development to add multiple nodes simultaneously?

No, at the moment only one node can join the cluster at a time. Other nodes automatically wait until it has finished before joining.

Why does Galera say we cannot use READ COMMITTED isolation for multimaster mode, even though we can start the cluster with READ-COMMITTED?

You can use READ-COMMITTED as the transaction isolation level. The limitation is that you cannot use SERIALIZABLE: http://galeracluster.com/documentation-webpages/isolationlevels.html.

Galera Cluster and MariaDB currently do not prevent a user from using this transaction isolation level. Percona XtraDB Cluster implements a strict mode to prevent these operations: https://www.percona.com/doc/percona-xtradb-cluster/LATEST/features/pxc-strict-mode.html#explicit-table-locking
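As a quick sketch (the host name is an assumption): the isolation level is a per-session setting, and you can check what strict mode is doing on a Percona XtraDB Cluster node:

mysql -h node1 -e "SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;"
mysql -h node1 -e "SHOW GLOBAL VARIABLES LIKE 'pxc_strict_mode';"

With pxc_strict_mode=ENFORCING, the cluster rejects operations it considers unsafe, the SERIALIZABLE isolation level among them.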

MariaDB 10.2 fixed the CHECK constraints issue. When will Percona fix this issue?

There are currently no plans to support CHECK constraints in Percona Server for MySQL (and therefore Percona XtraDB Cluster as well).

As Percona Server for MySQL is effectively a fully backwards-compatible (but modified) MySQL Community Server, CHECK constraints are a feature that would normally be implemented in MySQL Community first.

Can you share your performance benchmark git repository (if you have one)?

We don’t have the performance benchmark in a git repository. You can get detailed information about this benchmark in this blog post: Performance improvements in Percona XtraDB Cluster 5.7.17-29.20.

On your slide pointing to scalability charts, how many nodes did you run your test against?

We used a three-node cluster for this performance benchmark.

The product uses master-master replication. As such, what do you mean when you talk about failover in such a configuration?
Where do you maintain the cluster state?

All technologies automatically maintain the cluster state as you add and remove nodes.

What are the network/IP requirements for ProxySQL?

There are no specific requirements. More documentation about ProxySQL can be found here: https://github.com/sysown/proxysql/wiki.

Jun
20
2017
--

The MySQL High Availability Landscape in 2017 (The Elders)


In this blog, we’ll look at different MySQL high availability options.

The dynamic MySQL ecosystem is rapidly evolving, with many technologies built around MySQL. This is especially true for the technologies involved with the high availability (HA) aspects of MySQL. When I joined Percona back in 2009, some of these HA technologies were very popular – but have since been almost forgotten. During the same interval, new technologies have emerged. In order to give some perspective to the reader, and hopefully help you make better choices, I’ll review the MySQL HA landscape as it is in 2017. This review will be in three parts. The first part (this post) covers the technologies that have been around for a long time: the elders. The second part will focus on the technologies that are very popular today: the adults. Finally, the last part will try to extrapolate which technologies could become popular in the upcoming years: the babies.

A quick disclaimer: I am reporting on the technologies I see the most. There are likely many other solutions not covered here, but I can’t talk about technologies I have barely or never used. Apart from the RDS-related technologies, all the technologies covered are open source. The target audience for this post is people relatively new to MySQL.

The Elders

Let’s define the technologies in the elders group. These are technologies that anyone involved with MySQL for the last ten years is sure to be aware of. I could have called this group the “classics”. I include the following technologies in this group:

  • Replication
  • Shared storage
  • NDB cluster

Let’s review these technologies in the following sections.

Replication

Simple replication topology

 

MySQL replication is very well known. It is one of the main features behind the wide adoption of MySQL. Replication gets used almost everywhere. The reasons for that are numerous:

  • Replication is simple to set up. There are tons of how-to guides and scripts available to add a slave to a MySQL server. With Amazon RDS, adding a slave is just a few clicks.
  • Slaves allow you to easily scale reads. The slaves are accessible and can be used for reads. This is the most common way of scaling up a MySQL database.
  • Slaves have little impact on the master. Apart from the added network traffic, the presence of slaves does not impact the master performance significantly.
  • It is well known. No surprises here.
  • Used for failover. Your master died, promote a slave and use it as your new master.
  • Used for backups. You don’t want to overload your master with the backups, run them off a slave.

Of course, replication also has some issues:

  • Replication can lag. Replication used to be single-threaded. That means a master with a concurrent load could easily outpace a slave. MySQL 5.6 and MariaDB 10.0 introduced some parallelism to the slave, and newer versions have improved this further, to the point where today’s slaves are many times faster than they were.
  • Slaves can diverge. When you modify data on the master, the slave must perform the exact same update. That seems easy, but there are many ways an update can be non-deterministic with statement-based replication. Many of those issues have been fixed, and the introduction of row-based replication has been another big step forward. Still, if you write directly to a slave you are asking for trouble. There is a read_only setting, but if the MySQL user has the “SUPER” privilege it is just ignored. That’s why there is now the “super_read_only” setting. Tools like pt-table-checksum and pt-table-sync from the Percona Toolkit exist to solve this problem (see the sketch after this list).
  • Replication can impact the master. I wrote above that the presence of slaves does not affect the master, but logging changes is more problematic. The most common issue is the InnoDB table-level locking for auto_increment values with statement-based replication. Only one thread can insert new rows at a time. You can avoid this issue with row-based replication and properly configured settings.
  • Data gets lost. Replication is asynchronous. That means the master will reply “done” after a commit statement even though the slaves have not received updates yet. Some transactions can get lost if the master crashes.
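A minimal sketch of those divergence safeguards in use (host names and credentials are assumptions):

# on each slave: block writes even from users with the SUPER privilege
mysql -h slave1 -e "SET GLOBAL super_read_only = ON;"

# from the master: checksum all tables through replication (Percona Toolkit)
pt-table-checksum --replicate=percona.checksums h=master1,u=checksum_user,p=secret

pt-table-checksum writes per-chunk checksums that replicate to the slaves, so each slave can be compared against the master; pt-table-sync can then repair any differences found.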

Although an old technology, a lot of work has been done on replication. It is miles away from the replication implementation of 5.0.x. Here’s a list, likely incomplete, of the evolution of replication:

  • Row-based replication (since 5.1). The binary internal representation of the rows is sent instead of the SQL statements. This makes replication more robust against slave divergence.
  • Global transaction ID (since 5.6). Transactions are uniquely identified. Replication can be set up without knowing the binlog file and offset (see the sketch after this list).
  • Checksum (since 5.6). Binlog events have checksum values to validate their integrity.
  • Semi-sync replication (since 5.5). An addition to the replication protocol to make the master aware of the reception of events by the slaves. This helps to avoid losing data when a master crashes.
  • Multi-source replication (since 5.7). Allows a slave to have more than one master.
  • Multi-threaded replication (since 5.6). Allows a slave to use multiple threads. This helps to limit the slave lag.
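As an example of what GTIDs buy you operationally (host and credentials are assumptions; both servers need gtid_mode=ON and enforce_gtid_consistency=ON):

mysql -h slave1 -e "CHANGE MASTER TO MASTER_HOST='master1', MASTER_USER='repl', MASTER_PASSWORD='secret', MASTER_AUTO_POSITION=1; START SLAVE;"

No binlog file name or offset is needed; the slave negotiates the missing transactions from its own GTID set.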

Managing replication is a tedious job. The community has written many tools to manage replication:

  • MMM. An old Perl tool that used to be quite popular, but had many issues. Now rarely used.
  • MHA. The most popular tool to manage replication. It excels at reconfiguring replication without losing data, and does a decent job of handling failover. It is also simple. No wonder it is popular.
  • PRM. A Pacemaker-based solution developed to replace MMM. It’s quite good at failover, but not as good as MHA at reconfiguring replication. It’s also quite complex, thanks to Pacemaker. Not used much.
  • Orchestrator. The new cool tool. It can manage complex topologies and has a nice web-based interface to monitor and control the topology.

 

Shared Storage

Simple shared storage topology

 

Back when I was working for MySQL ten years ago, shared storage HA setups were very common. A shared storage HA cluster shares a single copy of the database files between two servers. One server is active, the other one is passive. In order to be shared, the database files reside on a device that can be mounted by both servers. The device can be physical (like a SAN) or logical (like a Linux DRBD device). On top of that, you need a cluster manager (like Pacemaker) to handle the resources and failovers. This solution is very popular because it allows for failover without losing any transactions.

The main drawback of this setup is the need for an idle standby server. The standby server cannot have any other assigned duties since it must always be ready to take over the MySQL server. A shared storage solution is also obviously not resilient to file-level corruption (but that situation is exceptional). Finally, it doesn’t play well with a cloud-based environment.

Today, newly-deployed shared storage HA setups are rare. The only ones I encountered over the last year were either old implementations needing support, or new setups deployed because of existing corporate technology stacks. That should tell you something about the technology’s loss of popularity.

NDB Cluster

A simple NDB Cluster topology

 

An NDB Cluster is a distributed clustering solution that has been around for a long time. I personally started working with this technology back in 2008. An NDB Cluster has three types of nodes: SQL, management and data. A full HA cluster requires a minimum of four nodes.

An NDB Cluster is not a general-purpose database due to its distributed nature. For suitable workloads, it is extraordinarily good. For unsuitable workloads, it is miserable. A suitable workload for an NDB Cluster has high concurrency and a high rate of small, primary-key-oriented transactions. Reaching one million trx/s on an NDB Cluster is nothing exceptional.

At the other end of the spectrum, a poor workload for an NDB Cluster is a single-threaded report query on a star-like schema. I have seen some extreme cases where just the network time of a reporting query amounted to more than 20 minutes.

Although NDB Clusters have improved, and are still improving, their usage has been pushed toward niche-type applications. Overall, the technology is losing ground and is now mostly used in telco and online gaming applications.

Apr
22
2017
--

Better Than Linear Scaling


In this blog, we’ll look at how to achieve better-than-linear scaling.

Scalability is the capability of a system, network or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth. For example, we consider a system scalable if it is capable of increasing its total output under an increased load when resources (typically hardware) are added: https://en.wikipedia.org/wiki/Scalability.

It is often accepted as a fact that systems (in particular databases) can’t scale better than linearly. By this I mean that when you double resources, the expected performance at best doubles (and often it is less than doubled).

We can attribute this assumption to Amdahl’s law (https://en.wikipedia.org/wiki/Amdahl%27s_law), and later to the Universal Scalability Law (http://www.perfdynamics.com/Manifesto/USLscalability.html). Both of these laws state that it is impossible to achieve better than linear scalability. To be totally precise, this is practically correct for single-server systems when the added resources are only CPU units.

Multi-node systems

However, I think database systems should no longer be seen as single-server systems. MongoDB and Cassandra have had multi-node auto-sharding capabilities for a long time. We are about to see the rise of strongly-consistent SQL-based multi-node systems. And even MySQL is frequently deployed with manual sharding across multiple nodes.

Products like Vitess (http://vitess.io/) propose auto-sharding for MySQL, and with ProxySQL (which I will use in my experiment) you can set up a basic sharding scheme.

I describe multi-node setups, because in this environment it is possible to achieve much better than linear scalability. I will show this below.

Why is this important?

Understanding scalability of multi-node systems is important for resource planning, and understanding how much of a potential performance gain we can expect when we add more nodes. This is especially interesting for cloud deployments.

How is it possible?

I’ve written about how the size of available memory (cache) affects performance. When we add nodes to a deployment, we effectively increase not only the CPU cores but also the memory that comes with each node (and we add extra IO capacity). So, by increasing the node count, we also increase the available memory (and cache). As we can see from these graphs, the effect of extra memory can be non-linear (and actually better than linear). Playing on this fact, we can achieve better-than-linear scaling in a sharded setup. I am going to show an experimental setup of how to achieve this.

Experimental setup

To show the sharded setup we will use ProxySQL in front of N MySQL servers (shards). We also will use sysbench with 60 tables (4 million rows each, uniform distribution).

  • For one shard, this shard contains all 60 tables
  • For two shards, each shard contains 30 tables
  • For three shards, each shard contains 20 tables
  • For six shards, each shard contains ten tables

So schematically, it looks like this:

One shard: [sharding diagram]

Two shards: [sharding diagram]

Six shards: [sharding diagram]

We want to measure how the performance (for both throughput and latency) changes when we go from 1 to 2, to 3, to 4, to 5 and to 6 shards.

For the single shard, I used a Google Cloud instance with eight virtual CPUs and 16GB of RAM, where 10GB is allocated for the innodb_buffer_pool_size.

The database size (for all 60 tables) is about 51GB for the data, and 7GB for indexes.

For this we will use a sysbench read-only uniform workload, with ProxySQL performing the query routing. We set up the sharding with ProxySQL query rules:

# wipe any existing rules, then create one routing rule per sysbench table
mysql -u admin -padmin -h 127.0.0.1 -P6032 -e "DELETE FROM mysql_query_rules"
shards=$1
for i in {1..60}
do
    # spread the 60 tables round-robin over the shard hostgroups (1-based)
    hg=$(( i % shards + 1 ))
    # route any query referencing table sbtest$i to hostgroup $hg; ${i} is braced so
    # the shell does not swallow the following \s, and the \s (whitespace in the
    # regex) keeps sbtest1 from also matching sbtest10..sbtest19
    mysql -u admin -padmin -h 127.0.0.1 -P6032 -e "INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,destination_hostgroup,apply) VALUES ($i,1,'root','sbtest${i}\s',$hg,1);"
done
mysql -u admin -padmin -h 127.0.0.1 -P6032 -e "LOAD MYSQL QUERY RULES TO RUNTIME;"
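If the snippet above is saved as, say, set_shards.sh (an illustrative name), installing the rules for a three-shard layout is just:

bash set_shards.sh 3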

Command line for sysbench 1.0.4:
sysbench oltp_read_only.lua --mysql-socket=/tmp/proxysql.sock --mysql-user=root --mysql-password=test --tables=60 --table-size=4000000 --threads=60 --report-interval=10 --time=900 --rand-type=pareto run

The results

Nodes   Throughput   Speedup vs. 1 node   Latency (ms)
1            245           1.00               244.88
2            682           2.78                87.95
3           1659           6.77                36.16
4           2748          11.22                21.83
5           3384          13.81                17.72
6           3514          14.34                17.07

As we can see, the performance improves by a factor much better than linear.

With five nodes, the improvement is 13.81 times compared to the single node.

The sixth node does not add much benefit, as by that point the data practically fits in memory (with five nodes, the total cache size is 50GB, compared to the 51GB data size).

Factors that affect multi-node scaling

How can we model/predict the performance gain? There are multiple factors to take into account: the size of the active working set, the available memory size and, importantly, the distribution of access to the working set (uniform distribution is the best-case scenario, while access concentrated on a single row is the opposite corner case, where speedup is impossible). We also need to keep network speed in mind: if we come close to using all the available network bandwidth, it will be impossible to get significant improvement.

Conclusion

In multi-node, auto-scaling, auto-sharding distributed systems, the traditional scalability models do not provide much help. We need to have a better framework to understand how multiple nodes affect performance.

Jan
26
2017
--

Percona Live Featured Tutorial with Frédéric Descamps — MySQL InnoDB Cluster & Group Replication in a Nutshell: Hands-On Tutorial


Welcome to another post in the series of Percona Live featured tutorial speakers blogs! In these blogs, we’ll highlight some of the tutorial speakers that will be at this year’s Percona Live conference. We’ll also discuss how these tutorials can help you improve your database environment. Make sure to read to the end to get a special Percona Live 2017 registration bonus!

In this Percona Live featured tutorial, we’ll meet Frédéric Descamps, MySQL Community Manager at Oracle. Frédéric is probably better known in the community as “LeFred” (Twitter: @lefred)! His tutorial is MySQL InnoDB Cluster and Group Replication in a Nutshell: Hands-On Tutorial (along with  Part 2). Frédéric is delivering this talk with Percona’s MySQL Practice Manager Kenny Gryp. Attendees will get their hands on virtual machines and migrate standard Master/Slave architecture to the new MySQL InnoDB Cluster (native Group Replication) with minimal downtime. I had a chance to speak with Frédéric and learn a bit more about InnoDB Cluster and Group Replication:

Percona: How did you get into database technology? What do you love about it?

Frédéric: I started with SQL on VMS and DBASE during my IT courses. I wrote my thesis on SQL Server replication. To be honest, the implementation at the time wasn’t particularly good. At the same time, I enjoyed hacking a Perl module that allowed you to run SQL queries against plain text files. Then I worked as a Progress developer for MFG/Pro (a big customized ERP). When I decided to work exclusively for an open source company, I was managing more and more customers’ databases in MySQL (3.23 and 4.x), so my colleagues and I took MySQL training. It was still MySQL AB at that time. I passed my MySQL 4.1 Core and Professional Certification, and I ended up delivering MySQL Training for MySQL AB too. Some might also remember that I worked for a company called Percona for a while, before moving on to Oracle!

Percona: Your and Kenny’s tutorial is called “MySQL InnoDB Cluster & Group Replication in a nutshell: hands-on tutorial.” Why would somebody want to migrate to this technology?

Frédéric: The main reason to migrate to MySQL InnoDB Cluster is to achieve high availability (HA) easily for your MySQL database, while still maintaining performance.

Thanks to MySQL Shell’s AdminAPI, you can create a cluster in less than five minutes. All the management can be done remotely, with just some shell commands. Additionally, for Group Replication our engineers have leveraged existing standard MySQL infrastructure (GTIDs, Multi-Source Replication, Multi-threaded Slave Applier, Binary Logs, etc.), which is already well known by the majority of MySQL DBAs.
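As a sketch of such a session (host names and the cluster name are illustrative; dba.createCluster() and Cluster.addInstance() are the documented AdminAPI calls):

mysqlsh --uri root@node1:3306

Then, in the shell’s JavaScript mode:

var cluster = dba.createCluster('testCluster')
cluster.addInstance('root@node2:3306')
cluster.addInstance('root@node3:3306')
cluster.status()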

Percona: How can moving to a high availability environment help businesses?

Frédéric: Time is money! Many businesses can’t afford downtime anymore. Having a reliable HA solution is crucial for businesses on the Internet. Expectations have changed: users want this to be natively supported inside the database technology versus externally. For example, when I started to work in IT, it was very common to have several maintenance windows during the night. But currently, and almost universally, the customer base is spread worldwide. Any time is during business hours somewhere in the world!

Percona: What do you want attendees to take away from your tutorial session? Why should they attend?

Frédéric: I think that current HA solutions are too complex to set up and manage. They use external tools. In our tutorial, Kenny and I will demonstrate how MySQL InnoDB Cluster, being an internal implementation, is extremely easy to use. We will also cover some scenarios where things go wrong, and how to deal with them. Performance and ease-of-use were two key considerations in the design of InnoDB Cluster.

Percona: What are you most looking forward to at Percona Live?

Frédéric: Like every time I’ve attended, my main goal is to bring to the audience all the new and amazing stuff we are implementing in MySQL. MySQL has evolved quickly these last few years, and we don’t really plan to slow down. I also really enjoy the feedback from users and other MySQL professionals. This helps focus us on what really matters for our users. And finally, it’s a great opportunity to re-connect with ex-colleagues and friends.

You can find out more about Frédéric Descamps and his work with InnoDB Cluster at his Twitter handle @lefred. Register for Percona Live Data Performance Conference 2017, and see Frédéric and Kenny present their MySQL InnoDB Cluster and Group Replication in a Nutshell: Hands-On Tutorial talk. Use the code FeaturedTalk and receive $30 off the current registration price!

Percona Live Data Performance Conference 2017 is the premier open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, NoSQL, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Data Performance Conference will be April 24-27, 2017 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.

Oct
10
2016
--

Take Percona’s One-Click 2017 Top Database Concerns Poll


Take Percona’s One-Click 2017 Top Database Concerns Poll.

With 2017 coming quickly around the corner, it’s time to start thinking about what next year is going to bring to the open source database community. We just finished Percona Live Europe 2016, and at the conference we looked at new technologies, new techniques, and new ideas. With all this change comes some uncertainty and concern: how is a changing ecosystem going to affect your database environment? Are you up to speed on the latest technologies and how they can impact your world?

Are your biggest concerns for 2017 scalability, performance, security, monitoring, updates and bugs, or staffing issues? What do you think is going to be your biggest issue in the coming year?

 

Please take a few seconds and answer the following poll. It will help the community get an idea of how the new year could impact their critical database environments.

If you’ve faced specific issues, feel free to comment below. We’ll post a follow-up blog with the results!


You can see the results of our last blog poll on security here.

Jun
20
2016
--

Webinar Thursday June 23: Choosing a MySQL High Availability Solution Today


Please join Percona Technical Account Manager Michael Patrick on Thursday, June 23, 2016 at 10 AM PDT (UTC-7) as he presents “Choosing a MySQL High Availability Solution Today.”

High availability (HA) is one of the solutions to improve performance, avoid data outages, and recover quickly from disasters. An HA environment helps guarantee that your database doesn’t have a single point of failure, accommodates rapid growth and exponentially increasing database size, and enables the applications that power your business.

Michael will discuss various topologies for achieving high availability with MySQL.

Topics include:

  • Percona XtraDB Cluster
  • DRBD
  • MHA
  • MySQL Orchestrator

Each solution has advantages and challenges. Attendees will gain a deeper understanding of how to choose the best solution for their needs while avoiding some of the pitfalls of making the wrong choices. Avoid the costly mistakes that commonly cause outages and lost revenue. Plus get the latest and greatest developments in the technologies!

Register now.

Michael Patrick
Technical Account Manager

Mike came to Percona in 2015 after working for a variety of large corporations running hundreds of MySQL and Percona XtraDB Clusters in production environments. He is skilled in performance tuning, server auditing, high availability, multi-data center replication, migration, and other MySQL-related activities. Mike holds a B.S. in Computer Science from East Tennessee State University. In his off time, he enjoys Martial Arts and Cave Exploration. He lives in East Tennessee with his wife and he has four children.
