Percona Replication Manager (PRM) now supporting 5.6 GTID

Over the last few days, I integrated the MySQL 5.6 GTID version of the Percona Replication Manager (PRM) work of Frédéric Descamps, a colleague at Percona. The agent supports the GTID replication mode of MySQL 5.6 and if the master suffers a hard crash, it picks the slave having applied the highest transaction ID from the dead master. Given the nature of GTID-based replication, that causes all the other slaves to resync appropriately to their new master which is pretty cool and must yet be matched by the regular PRM agent.

For now, it is part of a separate agent, mysql_prm56, which may be integrated with the regular agent in the future. To use it, download the agent with the link above, the pacemaker configuration is similar to the one of the regular PRM agent. If you start from scratch, have a look here and of course, replace “mysql_prm” with “mysql_prm56″. Keep in mind that although it successfully ran many tests, it is the first release and there’s no field experience. I invite you to send any issue or successful usage to PRM-discuss.

As a side note, dealing with GTID based replication is slightly different than regular replication. I invite to consult these posts for more details:

Replication in MySQL 5.6: GTIDs benefits and limitations – Part 1

How to create/restore a slave using GTID replication in MySQL 5.6
How to create a new (or repair a broken) GTID based slave with Percona XtraBackup
Repair MySQL 5.6 GTID replication by injecting empty transactions

The post Percona Replication Manager (PRM) now supporting 5.6 GTID appeared first on MySQL Performance Blog.


Q&A: Geographical disaster recovery with Percona Replication Manager

Q&A: Geographical disaster recovery with Percona Replication ManagerMy December 4 webinar, “Geographical disaster recovery with  Percona Replication Manager (PRM),”  gave rise to a few questions. The recording of the webinar and the slides are available here, and I’ve answered the questions I didn’t have time to address below.

Q1: Hi, I was wondering if corosync will work in cloud environment. As far as I know it is hard to implement because of no support of unicast or multicast.

A1: Corosync supports the udpu transport since somewhere in the 1.3.0 branch. udpu stands for udp unicast and it works in AWS for instance. Most recent distribution are using 1.4.x so it is easy to find.

Q2: For token wouldn’t it make sense to have a value higher than 3000 to account for any packetloss and the default retry of 3 seconds for tcp communication?

A2: Corosync uses udp, not tcp so the argument is somewhat irrelevant.

Q3: Is PRM supported with a Percona support contract?

A3: Yes, PRM is now supported with a Percona Gold or Platinum Support contract. It is not availabe with Silver support.

Q4: Is PRM working with GTID’s in 5.6?

A4: There’s a version in testing phases adapted by Frederic Descamps that works with 5.6/GTID. As soon as it is tested properly, I’ll release it. So far, it is very clean in term of logic.

Q5: Does Percona Replication Manager do anything different with replication over the built in mySQL replication to combat the single threaded nature?

A5: No

Q6: We agree that fencing always has to be configured (even with no shared resources , in cases of mysql stop failure for example) : What about MySQL and Percona’s PRM Ressource agent behavior when fencing? and recommendations concerning fencing? There was no fencing in the demo and there is no fencing configured on the pacemaker crm snippets examples provided in on Percona’s github repo.

A6: Fencing is independent of PRM and is well covered elsewhere on the web. I’ll see to add an example configuration with stonith devices. On real hardware, the most common stonith device are IPMI or ILO. These technologies comes with nearly all the server brands.

Q7: Is there any other Mysql HA setup supported by Percona’s mysql_prm pacemaker resource agent than the Mysql master slave replication ? Like Multi master setups ? In case yes , will you release some crm configurations snippets examples for other Mysql HA setups ?

A7: No, only a single master is supported. The main argument here is that multiple master doesn’t scale writes and are a big source of conflicts.

Q8: Why use Percona Replication Manager over XtraDB cluster (omitting ease of Geo-DR) on a write performance perspective , also considering HA and cost ?

A8: Percona XtraDB Cluster (PXC) is more advanced and capable but, some workloads, like large transactions, are not working well with PXC. Regular replication is well known and many customers are not willing to try a newer technology like (PXC) and prefer to rely on regular replication. Also, PRM can be mixed with PXC without problem. Example of such configuration will be published soon. In term of cost both are free, support is available from Percona (Gold and Platinium) but (PXC) has a premium.

The post Q&A: Geographical disaster recovery with Percona Replication Manager appeared first on MySQL Performance Blog.


High-availability options for MySQL, October 2013 update

The technologies allowing to build highly-available (HA) MySQL solutions are in constant evolution and they cover very different needs and use cases. In order to help people choose the best HA solution for their needs, we decided, Jay Janssen and I, to publish, on a regular basis (hopefully, this is the first), an update on the most common technologies and their state, with a focus on what type of workloads suite them best. We restricted ourselves to the open source solutions that provide automatic failover. Of course, don’t simply look at the number of Positives/Negatives items, they don’t have the same values. Should you pick any of these technologies, heavy testing is mandatory, HA is never beyond scenario that have been tested.

Percona XtraDB Cluster (PXC)

Percona XtraDB Cluster (PXC) is a version of Percona Server implementing the Galera replication protocol from Codeship.

Positive points Negative points
  • Almost synchronous replication, very small lag if any
  • Automatic failover
  • At best with small transactions
  • All nodes are writable
  • Very small read after write lag, usually no need to care about
  • Scale reads very well and to some extent, writes
  • New nodes are provisioned automatically through State Snapshot Transfer (SST)
  • Multi-threaded apply, greater write capacity than regular replication
  • Can do geographical disaster recovery (Geo DR)
  • More resilient to unresponsive nodes (swapping)
  • Can resolve split-brain situations by itself
  • Still under development, some rough edges
  • Large transactions like multi-statement transactions or large write operations cause issues and are usually not a good fit
  • For quorum reasons, 3 nodes are needed but one can be a lightweight arbitrator
  • SST can be heavy over a Wan
  • Commit are affected by the network latency, this impacts especially Geo DR
  • To achieve HA, a load balancer, like haproxy, is needed
  • Failover time is determined by the load balancer check frequency
  • Performance is affected by the weakest/busiest node
  • Foreign Keys are potential issues
  • MyISAM should be avoided
  • Can be mixed with regular async replication as master or slave but, slaves are not easy to reconfigure after a SST on their master
  • Require careful setup of the host, swapping can lead to node expulsion from the cluster
  • No manual failover mode
  • Debugging some Galera protocol issues isn’t trivial


Percona replication manager (PRM)

Percona replication manager (PRM) uses the Linux HA Pacemaker resource manager to manage MySQL and replication and provide high-availability. Information about PRM can be found here, the official page on the Percona web site is in the making.

Positive points Negative points
  • Nothing specific regarding the workload
  • Unlimited number of slaves
  • Slaves can have different roles
  • Typically VIP based access, typically 1 writer VIP and many reader VIPs
  • Also works without VIP (see the fake_mysql_novip agent)
  • Detects if slave lags too much and remove reader VIPs
  • All nodes are monitored
  • The best slaves is picked for master after failover
  • Geographical Disaster recovery possilbe with the lightweight booth protocol
  • Can be operated in manual failover mode
  • Graceful failover is quick, under 2s in normal conditions
  • Ungraceful failover under 30s
  • Distributed operation with Pacemaker, no single point of failure
  • Builtin pacemaker logic, stonith, etc. Very rich and flexible.
  • Still under development, some rough edges
  • Transaction maybe lost is master crashes (async replication)
  • For quorum reasons, 3 nodes are needed but one can be a lightweight arbitrator
  • Only one node is writable
  • Read after write may not be consistent (replication lag)
  • Only scales reads
  • Careful setup for the host, swapping can lead to node expulsion from the cluster
  • Data inconsistency can happen if the master crashes (fix coming)
  • Pacemaker is complex, logs are difficult to read and understand


MySQL master HA (MHA)

Like with PRM above, MySQL master HA (MHA), provides high-availability through replication. The approach is different, instead of relying on an HA framework like Pacemaker, it uses Perl scripts. Information about MHA can be found here.

Positive points Negative points
  • Mature
  • Nothing specific regarding the workload
  • No latency effects on writes
  • Can have many slaves and slaves can have different roles
  • Very good binlog/relaylog handling
  • Work pretty hard to minimise data loss
  • Can be operated in manual failover mode
  • Graceful failover is quick, under 5s in normal conditions
  • If the master crashes, slaves will be consistent
  • The logic is fairly easy to understand
  • Transaction maybe lost is master crashes (async replication)
  • Only one node is writable
  • Read after write may not be consistent (replication lag)
  • Only scales reads
  • Monitoring and logic are centralized, single-point of failure, a network partition can cause a split-brain
  • Custom fencing devices, custom VIP scripts, no reuse of other projects tools
  • Most of the deployments are using manual failover (at least at Percona)
  • Requires priviledged ssh access to read relay-logs, can be a security concern
  • No monitoring of the slave to invalidate it if it lags too much or if replication is broken, need to be done by external tool like HAProxy
  • Careful setup for the host, swapping can lead to node expulsion from the cluster


NDB Cluster

NDB cluster is the most high-end form of high-availability configuration for MySQL. It is a complete shared nothing architecture where the storage engine is distributed over multiple servers (data nodes). Probably the best starting point with NDB is the official document, here.

Positive points Negative points
  • Mature
  • Synchronous replication
  • Very good at small transactions
  • Very good at high concurrency (many client threads)
  • Huge transaction capacity, more than 1M trx/s are not uncommon
  • Failover can be ~1s
  • No single point of failure
  • Geographical disaster recovery capacity built-in
  • Strong at async replication, applying by batches gives multithreaded apply at the data node level
  • Can scale reads and writes, the framework implements sharding by hashes
  • Not a drop-in replacement for Innodb, you need to tune the schema and the queries
  • Not a general purpose database, some loads like reporting are just bad
  • Only the Read-commited isolation level is available
  • Hardware heavy, need 4 servers mininum for full HA
  • Memory (RAM) hungry, even with disk-based tables
  • Complex to operate, lots of parameters to adjust
  • Need a load balancer for failover
  • Very new foreign key support, field reports scarce on it


Shared storage/DRBD

Achieving high-availability use a shared storage medium is an old and well known method. It is used by nearly all the major databases. The share storage can be a DAS connected to two servers, a LUN on SAN accessible from 2 servers or a DRBD partition replicated synchronously over the network. DRBD is by bar the most common shared storage device used in the MySQL world.

Positive points Negative points
  • Mature
  • Synchronous replication (DRBD)
  • Automatic failover is easy to implement
  • VIP based access
  • Write capacity is impacted by network latency for DRBD
  • SANs are expensive
  • Only for InnoDB
  • Standby node, a big server doing nothing
  • Need a warmup period after failover to be fully operational
  • Disk corruption can spread


The post High-availability options for MySQL, October 2013 update appeared first on MySQL Performance Blog.


Choosing a MySQL HA Solution – MySQL Webinar: June 5

Choosing a MySQL HA Solution - MySQL Webinar: June 5Selecting the most appropriate solution for a MySQL HA infrastructure is as much a business and philosophical decision as it is a technical one, but often the choice is made without adequately considering all three perspectives.  When too much attention is paid to one of these aspects at the cost of the others, the resulting system may be over-engineered, poorly-performing, and/or various other flavors of suboptimal.

On Wednesday, June 5, at 10 a.m. PDT (1700 UTC), I will be presenting a webinar entitled, Choosing a MySQL HA Solution, in which we’ll explore the topic of MySQL HA from each of these perspectives.  The goal will be to motivate your thinking about HA in a holistic fashion and help guide you towards asking the right questions when considering a new or upgraded HA deployment.

This webinar will be both technical and non-technical in nature, beginning with a discussion of some general HA principles and some common misconceptions.  We will then explore some of the more well-known MySQL HA tools and technologies available today (largely grouped into those which use traditional MySQL replication, those which use some other MySQL-level replication, and those which replicate at some other layer of the system stack) and then conclude with some typical use cases where a given approach may be well-suited or particularly contraindicated.

If this topic interests you, then register today to reserve your spot.  I look forward to speaking with all of you next week.

The post Choosing a MySQL HA Solution – MySQL Webinar: June 5 appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com