Dec 14, 2016

Percona XtraDB Cluster 5.6.34-26.19 is now available


Percona announces the release of Percona XtraDB Cluster 5.6.34-26.19 on December 14, 2016. Binaries are available from the downloads section or our software repositories.

Percona XtraDB Cluster 5.6.34-26.19 is now the current release.

All Percona software is open-source and free. Details of this release can be found in the 5.6.34-26.19 milestone on Launchpad.

Deprecated

  • The following encryption modes are now deprecated:
    • encrypt=1
    • encrypt=2
    • encrypt=3

The default is encrypt=0 with encryption disabled. The recommended mode now is the new encrypt=4, which uses SSL files generated by MySQL.

For more information, see Encrypting PXC Traffic.
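As a rough sketch of what the recommended mode looks like in my.cnf (the file names and paths below are assumptions; point them at the SSL files your MySQL server actually uses):

[mysqld]
wsrep_sst_method = xtrabackup-v2
ssl-ca   = /var/lib/mysql/ca.pem
ssl-cert = /var/lib/mysql/server-cert.pem
ssl-key  = /var/lib/mysql/server-key.pem

[sst]
encrypt = 4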

New Features

  • Added encrypt=4 mode for SST encryption that uses SSL files generated by MySQL. Modes 1, 2, and 3 are now deprecated.

Fixed Bugs

  • Optimized IST donor selection logic to avoid SST. Child processes are now cleaned up and node state is resumed if SST fails.
  • Added init.ok to the list of files that do not get removed during SST.
  • Fixed error with ASIO library not acknowledging an EPOLLIN event when building Galera.
  • Fixed stalling of DML workload on slave node caused by FLUSH TABLE executed on the master.
    For more information, see 1629296.
  • Fixed super_read_only to not apply to Galera replication applier.
    For more information, see 1634295.
  • Redirected netcat output to stdout to avoid it in the log.
    For more information, see 1625968.
  • Enabled replication of ALTER USER statements.
    For more information, see 1376269.
  • Changed the wsrep_max_ws_rows variable to ignore non-replicated write-sets generated by DML action on temporary tables (explicit or implicit).
    For more information, see 1638138.
  • Fixed SST to fail with an error if SSL is not supported by socat, instead of switching to unencrypted mode.
  • Fixed SST with SSL to auto-generate a 2048-bit dhparams file for versions of socat before 1.7.3. These older versions use a 512-bit dhparams file by default, which newer clients reject with a “dh key too small” error.
  • PXC-731: Changed the wsrep_cluster_name variable to read-only, because changing it dynamically leads to high overhead.
    For more information, see 1620439.
  • PXC-732: Improved error message when any of the SSL files required for SST are missing.
  • PXC-735: Fixed SST to fail with an error when netcat is used (transferfmt=nc) with SSL encryption (encrypt set to 2, 3 or 4), instead of silently switching to unencrypted mode.
  • Fixed faulty switch case that caused the cluster to stall when the repl.commit_order variable was set to 2 (LOCAL_OOOC mode that should allow out-of-order committing for local transactions).

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Sep 03, 2014

Migrating to Percona XtraDB Cluster 2014 edition: Sept. 10 MySQL webinar

Join me online next week (September 10 at 10 a.m. PDT) for my live webinar on Migrating to Percona XtraDB Cluster. This was a popular webinar that I gave a few years ago, so I’m doing it again with updates for Percona XtraDB Cluster 5.6 (PXC) and all the latest in the Galera world.

This webinar will be really good for people interested in getting an overview of what PXC/Galera is, what it would take to adopt it for your application, and some of the differences and challenges it brings compared with a conventional MySQL Master/slave setup.  I’d highly suggest attending if you are considering Galera in your environment and want to get a better understanding of its uses and antipatterns.

Additionally, I’ll cover such questions as:

  • What are the requirements for running Percona XtraDB Cluster?
  • Will I have to reload all my tables?
  • How does configuration for the cluster differ from configuring a stand-alone InnoDB server?
  • How should my application interact with the Cluster?
  • Can I use Percona XtraDB Cluster if I only have two MySQL servers currently?
  • How can I move to the Cluster and keep downtime to a minimum?
  • How can I migrate to Percona XtraDB Cluster gradually?

I hope to see you next Wednesday. And please feel free to ask questions in advance in the comments section below. Next week’s live event, like all of our MySQL webinars, is free. Register here!

The post Migrating to Percona XtraDB Cluster 2014 edition: Sept. 10 MySQL webinar appeared first on MySQL Performance Blog.

May 25, 2014

Installing Percona XtraDB Cluster 5.6 with the Docker open-source engine

In my previous post, I blogged about using Percona Server with Docker and have shown you how fast and easy it was to create a virtual environment with just a few commands.

This time I will be showing you how to set up a three-node Percona XtraDB Cluster (PXC) 5.6 on the Docker open-source engine. Just to review, Docker “is an open-source engine that automates the deployment of any application as a lightweight, portable, self-sufficient container that will run virtually anywhere.”

In this case we will make use of a Dockerfile. Think of it as something like a Vagrantfile: a build script with a set of commands that automates the creation of a new Docker container.

For this setup, we will use the following Dockerfile contents, with Ubuntu 12.04 instead of CentOS 6.5 as the guest OS:

FROM ubuntu:precise
MAINTAINER Percona.com <info@percona.com>
# Upgrade
RUN apt-get update && apt-get upgrade -y && apt-get dist-upgrade -y
ENV DEBIAN_FRONTEND noninteractive
RUN echo "deb http://repo.percona.com/apt precise main" >> /etc/apt/sources.list.d/percona.list
RUN echo "deb-src http://repo.percona.com/apt precise main" >> /etc/apt/sources.list.d/percona.list
RUN apt-key adv --keyserver keys.gnupg.net --recv-keys 1C4CBDCDCD2EFD2A
RUN apt-get update
RUN apt-get install -y percona-xtradb-cluster-56 qpress xtrabackup
RUN apt-get install -y python-software-properties vim wget curl netcat

Create a my.cnf file and add the following:

[MYSQLD]
user = mysql
default_storage_engine = InnoDB
basedir = /usr
datadir = /var/lib/mysql
socket = /var/run/mysqld/mysqld.sock
port = 3306
innodb_autoinc_lock_mode = 2
log_queries_not_using_indexes = 1
max_allowed_packet = 128M
binlog_format = ROW
wsrep_provider = /usr/lib/libgalera_smm.so
wsrep_node_address ={node_IP}
wsrep_cluster_name="mydockerpxc"
wsrep_cluster_address = gcomm://{node1_ip},{node2_ip},{node3_ip}
wsrep_node_name = {node_name}
wsrep_slave_threads = 4
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth = pxcuser:pxcpass
[sst]
streamfmt = xbstream
[xtrabackup]
compress
compact
parallel = 2
compress_threads = 2
rebuild_threads = 2

Build an image from the Dockerfile we just created.

root@Perconallc-Support /home/jericho.rivera/docker-test # docker build -rm -t ubuntu_1204/percona:galera56 .

The ‘docker build’ command will create a new image from the Dockerfile build script. This will take a few minutes to complete. You can check if the new image was successfully built:

root@Perconallc-Support /home/jericho.rivera/docker-test # docker images | grep ubuntu
ubuntu_1204/percona   galera56            c2adc932aaec        9 minutes ago       669.4 MB
ubuntu                13.10               5e019ab7bf6d        13 days ago         180 MB
ubuntu                saucy               5e019ab7bf6d        13 days ago         180 MB
ubuntu                12.04               74fe38d11401        13 days ago         209.6 MB
ubuntu                precise             74fe38d11401        13 days ago         209.6 MB

Now we will launch three containers with Percona XtraDB Cluster using the new docker image we have just created.

root@Perconallc-Support /home/jericho.rivera/docker-test # for n in {1..3}; do docker run --name dockerpxc$n -i -t -d ubuntu_1204/percona:galera56 bash; done

Check if the new containers were created:

root@Perconallc-Support /home/jericho.rivera/docker-test # docker ps
CONTAINER ID        IMAGE                          COMMAND             CREATED             STATUS              PORTS               NAMES
dd5bafb99108        ubuntu_1204/percona:galera56   bash                5 minutes ago       Up 5 minutes                            dockerpxc3
01664cfcbeb7        ubuntu_1204/percona:galera56   bash                5 minutes ago       Up 5 minutes                            dockerpxc2
2e44d8baee99        ubuntu_1204/percona:galera56   bash                5 minutes ago       Up 5 minutes                            dockerpxc1

Get relevant information about each container with the ‘docker inspect’ command, which by default shows JSON-formatted output on the terminal. Since we only need the IP address of each container, just run the following commands:

root@Perconallc-Support /home/jericho.rivera/docker-test # docker inspect dockerpxc1 | grep IPAddress
        "IPAddress": "172.17.0.2",
root@Perconallc-Support /home/jericho.rivera/docker-test # docker inspect dockerpxc2 | grep IPAddress
        "IPAddress": "172.17.0.3",
root@Perconallc-Support /home/jericho.rivera/docker-test # docker inspect dockerpxc3 | grep IPAddress
        "IPAddress": "172.17.0.4",

Take note of the IP addresses because we will need them later. On dockerpxc1, edit my.cnf so that the cluster address lists all three nodes:

wsrep_cluster_address=gcomm://172.17.0.2,172.17.0.3,172.17.0.4

Do the same for the dockerpxc2 and dockerpxc3 nodes. To attach to a container, run ‘docker attach {node_name}’ (e.g., # docker attach dockerpxc1). To exit without stopping the container, hit CTRL+p followed by CTRL+q; an explicit ‘exit’ at the prompt will drop you out of the container and stop it as well, which we want to avoid. Also make sure to edit wsrep_node_name and wsrep_node_address accordingly on each node, as sketched below.
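A rough sketch of those per-node edits, assuming the my.cnf shown earlier lives at /etc/mysql/my.cnf inside each container (values below are for dockerpxc1; repeat with the right address and name on the other two nodes):

root@2e44d8baee99:/# sed -i 's/{node_IP}/172.17.0.2/' /etc/mysql/my.cnf
root@2e44d8baee99:/# sed -i 's/{node_name}/dockerpxc1/' /etc/mysql/my.cnf
root@2e44d8baee99:/# sed -i 's/{node1_ip},{node2_ip},{node3_ip}/172.17.0.2,172.17.0.3,172.17.0.4/' /etc/mysql/my.cnf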

Bootstrapping the Cluster

The next step is to start the first node in the cluster, also known as bootstrapping.

root@Perconallc-Support /home/jericho.rivera/docker-test # docker attach dockerpxc1
root@2e44d8baee99:# /etc/init.d/mysql bootstrap-pxc

After bootstrapping the first node, we can then prepare it to serve as an SST donor. That means we need to create the SST auth user matching wsrep_sst_auth, in this case pxcuser:pxcpass (a sketch follows below).
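A minimal sketch of creating that user on the bootstrapped node (the grants below are the ones an xtrabackup-based SST typically needs; adjust them to your own security requirements):

root@2e44d8baee99:/# mysql -e "CREATE USER 'pxcuser'@'localhost' IDENTIFIED BY 'pxcpass';"
root@2e44d8baee99:/# mysql -e "GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'pxcuser'@'localhost';"
root@2e44d8baee99:/# mysql -e "FLUSH PRIVILEGES;"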

After adding the SST auth user on the first node, the next step is to start dockerpxc2 and dockerpxc3:

root@Perconallc-Support /home/jericho.rivera/docker-test # docker attach dockerpxc2
root@01664cfcbeb7:/# /etc/init.d/mysql start

After starting all nodes, check the status of the entire cluster:

root@2e44d8baee99:/# mysql -e "show global status like 'wsrep%';"
+------------------------------+--------------------------------------+
| Variable_name                | Value                                |
+------------------------------+--------------------------------------+
| wsrep_local_state_uuid       | 7ae2a03e-d71e-11e3-9e00-2f7c2c79edda |
| wsrep_protocol_version       | 5                                    |
| wsrep_last_committed         | 1                                    |
| wsrep_replicated             | 1                                    |
| wsrep_replicated_bytes       | 270                                  |
| wsrep_repl_keys              | 1                                    |
| wsrep_repl_keys_bytes        | 31                                   |
| wsrep_repl_data_bytes        | 175                                  |
| wsrep_repl_other_bytes       | 0                                    |
| wsrep_received               | 18                                   |
| wsrep_received_bytes         | 2230                                 |
| wsrep_local_commits          | 0                                    |
| wsrep_local_cert_failures    | 0                                    |
| wsrep_local_replays          | 0                                    |
| wsrep_local_send_queue       | 0                                    |
| wsrep_local_send_queue_avg   | 0.333333                             |
| wsrep_local_recv_queue       | 0                                    |
| wsrep_local_recv_queue_avg   | 0.000000                             |
| wsrep_local_cached_downto    | 1                                    |
| wsrep_flow_control_paused_ns | 0                                    |
| wsrep_flow_control_paused    | 0.000000                             |
| wsrep_flow_control_sent      | 0                                    |
| wsrep_flow_control_recv      | 0                                    |
| wsrep_cert_deps_distance     | 1.000000                             |
| wsrep_apply_oooe             | 0.000000                             |
| wsrep_apply_oool             | 0.000000                             |
| wsrep_apply_window           | 1.000000                             |
| wsrep_commit_oooe            | 0.000000                             |
| wsrep_commit_oool            | 0.000000                             |
| wsrep_commit_window          | 1.000000                             |
| wsrep_local_state            | 4                                    |
| wsrep_local_state_comment    | Synced                               |
| wsrep_cert_index_size        | 1                                    |
| wsrep_causal_reads           | 0                                    |
| wsrep_cert_interval          | 0.000000                             |
| wsrep_incoming_addresses     | ,,                                   |
| wsrep_cluster_conf_id        | 11                                   |
| wsrep_cluster_size           | 3                                    |
| wsrep_cluster_state_uuid     | 7ae2a03e-d71e-11e3-9e00-2f7c2c79edda |
| wsrep_cluster_status         | Primary                              |
| wsrep_connected              | ON                                   |
| wsrep_local_bf_aborts        | 0                                    |
| wsrep_local_index            | 2                                    |
| wsrep_provider_name          | Galera                               |
| wsrep_provider_vendor        | Codership Oy <info@codership.com>    |
| wsrep_provider_version       | 3.5(r178)                            |
| wsrep_ready                  | ON                                   |
+------------------------------+--------------------------------------+

All three nodes are in the cluster!

Install net-tools so we can verify that Galera is listening on its default port (4567).
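The image we built above does not include net-tools, so install it inside the container first:

root@dd5bafb99108:/# apt-get update && apt-get install -y net-tools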

root@dd5bafb99108:/# netstat -ant
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:4567            0.0.0.0:*               LISTEN
tcp        0      0 172.17.0.4:33875        91.189.88.153:80        TIME_WAIT
tcp        0      0 172.17.0.4:56956        172.17.0.2:4567         ESTABLISHED
tcp        0      0 172.17.0.4:57475        172.17.0.3:4567         ESTABLISHED
tcp6       0      0 :::3306                 :::*                    LISTEN

Summary

I’ve shown you how to:

* Create a Dockerfile and my.cnf file
* Build a Docker image using the Dockerfile
* Make a few changes to the wsrep_ options in my.cnf
* Bootstrap the first node
* Start the other nodes of the cluster
* Check cluster status and confirm all nodes are in the cluster

There are other ways to set up Docker with Percona XtraDB Cluster 5.6, such as using Vagrant + Docker (which further automates the whole process) or shell scripts; this article, however, shows you the basics of how to accomplish the same task.

In my next post, I will show how to set up Percona ClusterControl in another Docker container and add this three-node PXC 5.6 cluster to it.

Read related posts here:

* Using MySQL Sandbox with Percona Server
* Testing Percona XtraDB Cluster with Vagrant
* Percona XtraDB Cluster: Setting up a simple cluster

The post Installing Percona XtraDB Cluster 5.6 with the Docker open-source engine appeared first on MySQL Performance Blog.

Feb 07, 2014

Followup questions to ‘What’s new in Percona XtraDB Cluster 5.6?’ webinar

Thanks to all who attended my webinar yesterday.  The slides and recording are available on the webinar’s page.  I was a bit overwhelmed with the number of questions that came in, and I’ll try to answer them the best I can here.

Q: Does Percona XtraDB Cluster support writing to multiple masters?

Yes, it does.  However, locks are not held on all nodes while a transaction is in progress, so there is a possibility for conflicts between simultaneous writes on different nodes.  Certification is the process that Galera uses to protect your data integrity, and the problem is pushed back to the application to handle in the form of deadlock errors.   As long as you understand this and the conditions where you may see more deadlocks (and your application can properly handle them), multi-node writing is fine.  See more detail on this in a blog post I did a while ago.

Q: Is there any limitation to scale writes?

Yes, there is no sharding in Percona XtraDB Cluster, all data goes to all nodes.  You are limited by replication throughput, or more specifically, transaction apply throughput, just like with standard MySQL replication.  However, PXC offers true parallel apply, so many transactions can be applied in parallel, which should generally give better throughput than conventional master/slave.

Q: Are the WAN segments feature only used to reduce network bandwidth? Would DB write performance be the same before Galera 3.x since each commit still has to be ack by all the servers across the WAN?

If you listened closely to something Alexey said during the webinar, it’s actually possible that WAN segments will improve commit times because the master node won’t have to spend extra time sending the transaction to each remote node.  I would expect overall write performance to vary somewhat compared to not using segments, but there are probably factors that may influence it either way including number of remote nodes, bandwidth, latency, etc.

With or without segments, all nodes must acknowledge the transaction, this is a vital part of Galera’s implementation and cannot be relaxed.

Q: What is the max number of cluster servers across a WAN recommended before you start seeing a diminishing return in performance b/c of sync replication?

Good question, I don’t know.  This would be fun to test and would make a good blog post.  I’ll put it on my list.  I’d bet that segments may increase such a limit, but it probably depends on a lot of factors.

Practically I haven’t seen a cluster with more than a half-dozen nodes, but that doesn’t mean that’s the limit.  I’d expect a big cluster would be around 10-12 nodes, but in reality that’s just a gut feeling more than any hard evidence.

Q: Should I be worried about the auto_increment bug you mentioned? I wasn’t planning to upgrade our cluster to Percona XtraDB Cluster 5.6 soon.

Not unless you regularly add AUTO_INCREMENT columns to existing tables using ALTER TABLE.  Note that the bug was also fixed in 5.5.34.

Q: Does Percona XtraDB Cluster support SphinxSE Storage Engine?

Nope.  Currently the Codership-mysql patches are for Innodb only.  Any other transactional storage engine can theoretically be supported provided you can implement prioritized transactions in it.

However SphinxSE is not transactional, so its support would be similar to MyISAM at best (which is very rudimentary and not likely to change).  It would be easy to add such support and it’s possible that the Maria Galera Cluster guys are already considering it since Maria ships with that storage engine.

UPDATE: It was pointed out to me that SphinxSE doesn’t store any actual data, but instead just acts as a client to a distinct Sphinx index (which I just wasn’t thinking of).  In that sense, it doesn’t need Galera support:  each node can just have a SphinxSE table, to which any DML will update the Sphinx index directly, no replication required by the cluster.  SphinxSE should work fine with PXC.

Q: To convert from mysql to Percona XtraDB Cluster 5.6, do you now recommend first upgrading to mysql 5.6 and then converting to PXC?

It depends, but I’d consider migrating to PXC to be at least equivalent to the risk of upgrading a major MySQL version and it should be tested thoroughly.  To limit your risk, breaking the upgrade into smaller pieces may be prudent, but it is not strictly necessary.  The further away you are from 5.6 (like say you are on 5.1 or 5.0), the more likely I’d recommend separate steps.

Q: Do “WAN segments” affect the quorum in any way?

No, group communication continues as before, it’s just that the routing of the actual writeset content is different.

Q: Since each galera node is identical, they all have the same storage footprint. What are best practices for expanding storage on galera nodes when we are low on free space?

On a cluster in steady state, it should be easy to do rolling changes.  That is, take one node out of the cluster, add some extra storage, and put it back in, repeat.  Ideally such rolling changes are done quickly enough and are non-destructive to the datadir so you can IST on restart.

Q: Is there any change to wsrep_retry_autocommit behavior in Percona XtraDB Cluster 5.6, or any plans to apply this setting to explicit transactions in order to avoid cluster “deadlocks”?

wsrep_retry_autocommit has not changed to my knowledge.  The reasoning behind not applying this methodology to explicit transactions was that explicit transactions may have application logic applied between the statements, so can we assume it is safe to simply retry the same transaction if the data changed underneath?  For example:

BEGIN;
SELECT * FROM USERS WHERE ID=100 FOR UPDATE;
# Application applies some business logic based on the row contents here???
UPDATE USERS SET AUTHORIZED=1 WHERE ID=100;
COMMIT;

If the UPDATE USERS was an autocommit, then wsrep_retry_autocommit would simply re-broadcast the same writeset (unmodified) with the new RBR row if there was a conflict.  It does not re-run the statement.  Would it be safe to do this if the explicit transaction got a cluster deadlock?  We don’t know.  If the user record was modified (which is what a cluster deadlock would indicate), should they still be authorized?

Q: I heard about xtrabackup-v2. What is the difference from the previous one?

This was actually released in 5.5.  The big changes are that the new SST method allows for encrypted SST transfers, compression, rate limiting, progress metering, and other goodness like that.  It is not backwards compatible with the old xtrabackup method, and both the donor and joiner must have the new method available (i.e., be running a PXC version that has both) for it to work.

Q: Can a cluster which uses rsync be switched to xtrabackup in a rolling fashion?

Yes, no problem.  The SST method is dictated by the JOINER, meaning whatever the joiner node’s wsrep_sst_method is, that’s what is used (the donor obeys this even if its wsrep_sst_method is different).  You can just go change the wsrep_sst_method in all the config files and it will be used next time an SST happens (since SST would only happen on a restart anyway).  Just be careful that you test xtrabackup first, since it requires proper mysql credentials to be set in wsrep_sst_auth.
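As a sketch, the relevant my.cnf pieces on each node would look something like this (the user name and password are placeholders; you must create and grant that MySQL user yourself):

[mysqld]
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth   = sstuser:sstpassword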

Q: Would you recommend Percona XtraDB Cluster 5.6 for production now, or should we wait for a while?

Now that 5.6 is GA, we’ve just passed a milestone where I’d start to recommend 5.6 as a starter for new Percona XtraDB Cluster deployments going forward.  Certainly 5.5 is more battle-tested and understood and there still may be good reasons to use it, but from now forward, I’m expecting that the need for 5.5 will only diminish, not increase.

For existing PXC 5.5 deployments, I’d probably recommend waiting a bit unless there is some great need for the new release, but I don’t have any specific bugs or issues I’m thinking of, just the general newness of 5.6.

Q: Are there any specific warnings or cautions to be aware of with PXC 5.6 when the db uses triggers and/or procedures, beyond the cautions in MySQL itself?

Nothing specific, these should work fine with Percona XtraDB Cluster.  In RBR these only run on the master node (the node taking the write) and the changes done by them are incorporated into the transaction’s writeset.

Q: So all the certifications come directly from the applying nodes back to the node that sent the data?  Or does it relay back through the relay node?

Actually, certification results are not shared on the cluster.  Certification is deterministic (or should be) and all nodes are expected to either pass or fail a transaction without any mixed results, hence there is no communication about pass/failure and only the master node (the node that originated the transaction) actually does anything about certification failure (i.e., it increments a local certification failure counter and returns a deadlock error to the client).   Past bugs in PXC have resulted in non-deterministic certification in some edge cases, and this can then obviously lead to node inconsistencies.

What is sent back is an acknowledgement of receipt of the transaction (which is much smaller) at the replication stage (pre-certification) and my understanding is that all nodes will reply back to the originating node more or less directly.  I tend to think of this communication as “out-and-back”, but in reality it’s more nuanced than this; for example an acknowledgement may be piggy-backed with a new transaction from that node.

The act of replication delivers the transaction payload to all nodes, and all nodes acknowledge the transaction to each other, AND within this process a consistent GTID for the transaction is established efficiently.  HOW precisely this happens is, as Alexey would say, an implementation detail. :)   Segments simply modify the method of transaction delivery, but I believe most of the other details are more or less the same.

Q: we are on Percona XtraDB Cluster 5.5 and have bursts of a large number of simultaneous updates to the same table which often triggers lock wait timeouts. Could binlog_row_image=minimal help reduce the frequency of these lock waits?

Lock wait timeouts in the cluster happen to transactions on a single node while they wait for other transactions on that same node to commit.   Any way that you can reduce the commit time should, in theory, reduce the lock waits.

Since part of that commit wait is the synchronous replication, it stands to reason that this is perhaps the bulk of the wait time.  I haven’t measured actual commit rates comparing full vs minimal row images, so I can’t tell you if this would decrease replication time or not.  I can imagine a scenario where a very large transaction would benefit from minimal row images (i.e., by being much smaller and thereby taking less time to transmit over the network), but I’d expect that when the transactions are already small and single-row to begin with, it would make less of an impact.

Are you guys using innodb_flush_log_at_trx_commit=1?  Relaxing that (and relying on the cluster replication for durability) may improve your commit times a lot.
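If you decide to relax it, the change is a single my.cnf setting per node (a sketch; whether trading local durability for cluster-level durability is acceptable is your call):

[mysqld]
# rely on Galera's synchronous replication rather than an fsync on every commit
innodb_flush_log_at_trx_commit = 2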

Q: Any Percona XtraDB Cluster 5.6 or Galera 3.x settings to increase write performance across the cluster as well as clusters in different geographic locations?

I’m not aware of anything that is necessarily Galera 3 specific that is tunable, though there are things baked into Galera 3 that may help, such as the certification improvements.   As I mentioned above, minimal row images may help by simply reducing the volume of data that needs transmitting.

The post Followup questions to ‘What’s new in Percona XtraDB Cluster 5.6?’ webinar appeared first on MySQL Performance Blog.

Jan 30, 2014

Percona XtraDB Cluster 5.6 GA release now available

Percona is pleased to announce the first General Availability release of the leading open source High Availability solution for MySQL, Percona XtraDB Cluster 5.6, on January 30, 2014. Binaries are available from the downloads area or from our software repositories.

Percona XtraDB Cluster 5.6
Percona XtraDB Cluster 5.6 is an active/active cluster solution for High Availability (HA) MySQL that delivers performance and MySQL-based cluster management features available nowhere else. This free, open source solution combines Percona Server (a drop-in MySQL replacement), Galera, wsrep API, and Percona XtraBackup, in a single package that’s easier to deploy and manage than the various approaches to MySQL replication. Applications that require high availability will benefit from this combination of the latest versions of these software solutions to deliver cost-effective, high performance based on the latest features and fixes available in Percona Server 5.6, Galera 3, wsrep API 25, and Percona XtraBackup 2.1.

Percona XtraDB Cluster expert Jay Janssen will present a webinar titled “What’s New in Percona XtraDB Cluster 5.6” on February 5 at 1PM EST/10AM PST to talk about some of the new features and upgrade strategies and to answer your questions. Alexey Yurchenko, solutions architect from Codership, will join Jay to contribute additional insights on Galera and the Percona XtraDB Cluster solution.

Percona Server 5.6
As our previous performance analysis demonstrates, MySQL 5.6 was a big step forward from MySQL 5.5. However, Percona Server 5.6 yields significantly better results than MySQL 5.6 in IO-bound cases. The ThreadPool feature in Percona Server 5.6 delivers performance advantages over MySQL 5.6 with as few as 150 concurrent threads and up to 4x better performance as concurrent threads grow into the 1,000s. ThreadPool efficiency improvements in Percona Server 5.6 have even shown to deliver similar gains over MariaDB 10.0.7 in transactional workload scenarios.

Other Percona Server 5.6 features that benefit Percona XtraDB Cluster 5.6 include:

  • Expanded diagnostics capability which reveals new insights into user activity with Performance Schema and User Stats features as well as query activity with the Slow Query Log
  • Support for greater control of the end-user experience with the integration of the Statement Timeout feature from Twitter’s MySQL contributions

Galera 3
Galera from Codership provides synchronous replication for greater data consistency. The latest version, Galera 3, includes new features and fixes to minimize replication traffic between WAN segments, improved memory performance, greater data integrity, and improved data throughput.

Percona XtraBackup 2.1
Adding new nodes and recovering failed nodes is an important part of managing a high availability MySQL cluster. Percona XtraDB Cluster 5.6 now defaults to use Percona XtraBackup 2.1 to transfer node state for these processes. Though Galera supports other choices, Percona XtraBackup 2.1 is the best option because it:

  • Has the fastest State Snapshot Transfers (SST) with parallel, compressed, and compacted options
  • Is highly secure with industry standard AES multi-threaded data encryption available

Drop-in Compatibility
Percona XtraDB Cluster has Percona Server at its core so it is drop-in compatible with MySQL. As a result, upgrades to Percona XtraDB Cluster 5.6 are straightforward from a variety of platforms including:

  • Percona XtraDB Cluster 5.5
  • Percona Server 5.6
  • Percona Server 5.5
  • MySQL 5.6
  • MySQL 5.5
  • MariaDB 5.5

Follow our upgrade steps for moving to Percona XtraDB Cluster 5.6 with no downtime.

Additional Help
You can download the operations manual now at the Percona XtraDB Cluster page on percona.com. Community-based help is available for Percona XtraDB Cluster in the Percona Community Forums.

Commercial help with Percona XtraDB Cluster 5.5 and 5.6 is available from the Percona Consulting team. The Percona Support team offers optional coverage for Percona XtraDB Cluster. The Percona Remote DBA team can provide outsourced management of Percona XtraDB Cluster deployments.

All of Percona’s software is open-source and free. Details of the release can be found in the 5.6.15-25.3 milestone at Launchpad and in the release notes.

New Features

  • New meta packages are now available in order to make the Percona XtraDB Cluster installation easier.
  • Debian/Ubuntu debug packages are now available for Galera and garbd.
  • xtrabackup-v2 SST now supports GTID replication.

Bugs Fixed

  • A node would get stuck and require a restart if DDL was performed after FLUSH TABLES WITH READ LOCK. Bug fixed #1265656.
  • Galera provider pause has been fixed to avoid potential deadlock with replicating threads.
  • The default value for the binlog_format variable is now ROW. This is done so that Percona XtraDB Cluster is not started with wrong defaults leading to non-deterministic outcomes such as a crash. Bug fixed #1243228.
  • During the installation of the percona-xtradb-cluster-garbd-3.x package, Debian tried to start it, but because the configuration was not set, it would fail to start and leave the installation in the iF state. Bug fixed #1262171.
  • Runtime checks have been added for dynamic variables which are Galera incompatible. Bug fixed #1262188.
  • During the upgrade process, parallel applying could hit an unresolvable conflict when events were replicated from Percona XtraDB Cluster 5.5 to Percona XtraDB Cluster 5.6. Bug fixed #1267494.
  • xtrabackup-v2 is now used as default SST method. Bug fixed #1268837.
  • FLUSH TABLES WITH READ LOCK behavior on the same connection was changed to conform to MySQL behavior. Bug fixed #1269085.
  • Read-only detection has been added in clustercheck, which can be helpful during major upgrades (this is used by xinetd for HAProxy etc.) Bug fixed #1269469.
  • The binary log directory is now cleaned up as part of the XtraBackup SST. Bug fixed #1273368.
  • First connection would hang after changing the wsrep_cluster_address variable. Bug fixed #1022250.
  • When the gmcast.listen_addr variable was set manually, it did not allow the node’s own address in the gcomm address list. Bug fixed #1099478.
  • GCache file allocation could fail if file size was a multiple of page size. Bug fixed #1259952.
  • Group remerge after a partitioning event has been fixed. Bug fixed #1232747.
  • Fixed multiple build bugs: #1262716, #1269063, #1269351, #1272723, #1272732, and #1253055.

Other bugs fixed: #1273101, #1272961, #1271264, and #1253055.

We did our best to eliminate bugs and problems during testing of this release, but this is software, so bugs are expected. If you encounter them, please report them to our bug tracking system.

The post Percona XtraDB Cluster 5.6 GA release now available appeared first on MySQL Performance Blog.

Jan 15, 2014

Upcoming Webinar: What’s new in Percona XtraDB Cluster 5.6

I’ve been blogging a lot about some of the new things you can expect with Percona XtraDB Cluster 5.6 and Galera 3.x – and GA is coming soon.

To get prepared, I’ll be giving a webinar on February 5th at 1PM EST/10AM PST to talk about some of the big new features, upgrading strategies, as well as answering your questions. Alexey Yurchenko, solutions architect from Codership, will be joining me for some extra brain power.

Topics to be covered include:

  • Galera 3 replication enhancements
  • WAN segments
  • Cluster integration with 5.6 Async replication GTIDs
  • Async to cluster inbound replication enhancements
  • Minimal replication images
  • Detecting Donor IST viability
  • Upgrade paths to Percona XtraDB Cluster 5.6

You can register for free right here!

The post Upcoming Webinar: What’s new in Percona XtraDB Cluster 5.6 appeared first on MySQL Performance Blog.

Jan 08, 2014

Finding a good IST donor in Percona XtraDB Cluster 5.6

Gcache and IST

The Gcache is a memory-based cache of recent Galera transactions that is local to each node in a cluster.  If a node leaves and rejoins the cluster, it can use the gcache from another node that stayed in the cluster (i.e., its donor node) to fetch the transactions it missed (IST) as opposed to doing a full state snapshot transfer (SST).  However, there are a few nuances that are not obvious to the beginner:

  • The Gcache is lost when a node restarts
  • The Gcache is fixed size and implemented as a LRU.  Once it is full, older transactions roll off.
  • Donor selection is made regardless of the gcache state
  • If the given donor for a restarting node doesn’t have all transactions needed, a full SST (read: full backup) is done instead
  • Until recent developments, there was no way to tell what, precisely, was in the Gcache.

So, with (somewhat) arbitrary donor selection, it was hard to be certain that a node restart would not trigger a SST.  For example:

  • A node crashed overnight or was otherwise down for some length of time.  How do you know if the gcache on any node is big enough to contain all the transactions necessary for IST?
  • If you brought down two nodes in your cluster at the same time, the second one you restart might select the first one as its donor and be forced to SST.
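Both situations are less likely to end in SST if the gcache is simply large enough to cover the expected downtime. Its size is set through wsrep_provider_options; a minimal my.cnf sketch (1G is an example value, not a recommendation):

[mysqld]
wsrep_provider_options = "gcache.size=1G"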

Along comes PXC 5.6.15 RC1

Astute readers of the PXC 5.6.15 release notes will have noticed this little tidbit:

New wsrep_local_cached_downto status variable has been introduced. This variable shows the lowest sequence number in gcache. This information can be helpful with determining IST and/or SST.

Until this release there was no visibility into any node’s Gcache and what was likely to happen when restarting a node.  You could make some assumptions, but now it is a bit easier to:

  1. Tell if a given node would be a suitable donor
  2. And hence select a donor manually using wsrep_sst_donor instead of leaving it to chance.

 

What it looks like

Suppose I have a 3 node cluster where load is hitting node1.  I execute the following in sequence:

  1. Shut down node2
  2. Shut down node3
  3. Restart node2

At step 3, node1 is the only viable donor for node2.  Because our restart was quick, we can have some reasonable assurance that node2 will IST correctly (and it does).

However, before we restart node3, let’s check the oldest transaction in the gcache on nodes 1 and 2:

[root@node1 ~]# mysql -e "show global status like 'wsrep_local_cached_downto';"
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_cached_downto | 889703 |
+---------------------------+--------+
[root@node2 mysql]# mysql -e "show global status like 'wsrep_local_cached_downto';"
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| wsrep_local_cached_downto | 1050151 |
+---------------------------+---------+

So we can see that node1 has a much more “complete” gcache than node2 does (i.e., a much smaller seqno). Node2’s gcache was wiped when it restarted, so it only has transactions from after its restart.

To check node3’s GTID, we can either check the grastate.dat, or (if it has crashed and the grastate is zeroed) use --wsrep-recover:

[root@node3 ~]# cat /var/lib/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid:    7206c8e4-7705-11e3-b175-922feecc92a0
seqno:   1039191
cert_index:
[root@node3 ~]# mysqld_safe --wsrep-recover
140107 16:18:37 mysqld_safe Logging to '/var/lib/mysql/error.log'.
140107 16:18:37 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140107 16:18:37 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.pIVkT4' --pid-file='/var/lib/mysql/node3-recover.pid'
140107 16:18:39 mysqld_safe WSREP: Recovered position 7206c8e4-7705-11e3-b175-922feecc92a0:1039191
140107 16:18:41 mysqld_safe mysqld from pid file /var/lib/mysql/node3.pid ended

So, armed with this information, we can tell what would happen to node3, depending on which donor was selected:

Donor selected | Donor’s gcache oldest seqno | Node3’s seqno | Result for node3
---------------+-----------------------------+---------------+-----------------
node2          | 1050151                     | 1039191       | SST
node1          | 889703                      | 1039191       | IST

So, we can instruct node3 to use node1 as its donor on restart with wsrep_sst_donor:

[root@node3 ~]# service mysql start --wsrep_sst_donor=node1

Note that passing mysqld options on the command line is only supported by the RPM packages; on Debian you need to put that setting in your my.cnf.  We can see from node3’s log that it does properly IST:

2014-01-07 16:23:26 19834 [Note] WSREP: Prepared IST receiver, listening at: tcp://192.168.70.4:4568
2014-01-07 16:23:26 19834 [Note] WSREP: Node 0.0 (node3) requested state transfer from 'node1'. Selected 2.0 (node1)(SYNCED) as donor.
...
2014-01-07 16:23:27 19834 [Note] WSREP: Receiving IST: 39359 writesets, seqnos 1039191-1078550
2014-01-07 16:23:27 19834 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.15-56'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Percona XtraDB Cluster (GPL), Release 25.2, Revision 645, wsrep_25.2.r4027
2014-01-07 16:23:41 19834 [Note] WSREP: IST received: 7206c8e4-7705-11e3-b175-922feecc92a0:1078550

Sometime in the future, this may be handled automatically on donor selection, but for now it is very useful that we can at least see the status of the gcache.
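On Debian, or to make the donor preference permanent, the same setting can go into my.cnf instead of the command line; a minimal sketch:

[mysqld]
wsrep_sst_donor = node1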

The post Finding a good IST donor in Percona XtraDB Cluster 5.6 appeared first on MySQL Performance Blog.

Dec 02, 2013

Useful MySQL 5.6 features you get for free in PXC 5.6

I get a lot of questions about Percona XtraDB Cluster 5.6 (PXC 5.6), specifically about whether such and such MySQL 5.6 Community Edition feature is in PXC 5.6.  The short answer is: yes, all features in community MySQL 5.6 are in Percona Server 5.6 and, in turn, are in PXC 5.6.  Whether or not the new feature is useful in 5.6 really depends on how useful it is in general with Galera.

I thought it would be useful to highlight a few features and try to show them working:

Innodb Fulltext Indexes

Yes, FTS works in Innodb in 5.6, so why wouldn’t it work in PXC 5.6?  To test this I used the Sakila database, which contains a single table with FULLTEXT.  In the sakila-schema.sql file, it is still designated a MyISAM table:

CREATE TABLE film_text (
  film_id SMALLINT NOT NULL,
  title VARCHAR(255) NOT NULL,
  description TEXT,
  PRIMARY KEY  (film_id),
  FULLTEXT KEY idx_title_description (title,description)
)ENGINE=MyISAM DEFAULT CHARSET=utf8;

I edited that file to change MyISAM to Innodb, loaded the schema and data into my 3 node cluster:

[root@node1 sakila-db]# mysql < sakila-schema.sql
[root@node1 sakila-db]# mysql < sakila-data.sql

and it works seamlessly:

node1 mysql> select title, description, match( title, description) against ('action saga' in natural language mode) as score from sakila.film_text order by score desc limit 5;
+-----------------+-----------------------------------------------------------------------------------------------------------+--------------------+
| title           | description                                                                                               | score              |
+-----------------+-----------------------------------------------------------------------------------------------------------+--------------------+
| FACTORY DRAGON  | A Action-Packed Saga of a Teacher And a Frisbee who must Escape a Lumberjack in The Sahara Desert         | 3.0801234245300293 |
| HIGHBALL POTTER | A Action-Packed Saga of a Husband And a Dog who must Redeem a Database Administrator in The Sahara Desert | 3.0801234245300293 |
| MATRIX SNOWMAN  | A Action-Packed Saga of a Womanizer And a Woman who must Overcome a Student in California                 | 3.0801234245300293 |
| REEF SALUTE     | A Action-Packed Saga of a Teacher And a Lumberjack who must Battle a Dentist in A Baloon                  | 3.0801234245300293 |
| SHANE DARKNESS  | A Action-Packed Saga of a Moose And a Lumberjack who must Find a Woman in Berlin                          | 3.0801234245300293 |
+-----------------+-----------------------------------------------------------------------------------------------------------+--------------------+
5 rows in set (0.00 sec)

Sure enough, I can run this query on any node and it works fine:

node3 mysql> select title, description, match( title, description) against ('action saga' in natural language mode) as score from sakila.film_text order by score desc limit 5;
+-----------------+-----------------------------------------------------------------------------------------------------------+--------------------+
| title           | description                                                                                               | score              |
+-----------------+-----------------------------------------------------------------------------------------------------------+--------------------+
| FACTORY DRAGON  | A Action-Packed Saga of a Teacher And a Frisbee who must Escape a Lumberjack in The Sahara Desert         | 3.0801234245300293 |
| HIGHBALL POTTER | A Action-Packed Saga of a Husband And a Dog who must Redeem a Database Administrator in The Sahara Desert | 3.0801234245300293 |
| MATRIX SNOWMAN  | A Action-Packed Saga of a Womanizer And a Woman who must Overcome a Student in California                 | 3.0801234245300293 |
| REEF SALUTE     | A Action-Packed Saga of a Teacher And a Lumberjack who must Battle a Dentist in A Baloon                  | 3.0801234245300293 |
| SHANE DARKNESS  | A Action-Packed Saga of a Moose And a Lumberjack who must Find a Woman in Berlin                          | 3.0801234245300293 |
+-----------------+-----------------------------------------------------------------------------------------------------------+--------------------+
5 rows in set (0.05 sec)
node3 mysql> show create table sakila.film_text\G
*************************** 1. row ***************************
       Table: film_text
Create Table: CREATE TABLE `film_text` (
  `film_id` smallint(6) NOT NULL,
  `title` varchar(255) NOT NULL,
  `description` text,
  PRIMARY KEY (`film_id`),
  FULLTEXT KEY `idx_title_description` (`title`,`description`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

There might be a few caveats and differences from how FTS works in Innodb vs MyISAM, but it is there.

Minimal replication images

Galera relies heavily on RBR events, but until 5.6 those were entire row copies, even if you only changed a single column in the table. In 5.6 you can change this to send only the updated data using the variable binlog_row_image=minimal.
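Enabling it is a small my.cnf change on each node followed by a rolling restart (a sketch; binlog_format should already be ROW in a PXC setup):

[mysqld]
binlog_format    = ROW
binlog_row_image = minimal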

Using a simple sysbench update test for 1 minute, I can determine the baseline size of the replicated data:

node3 mysql> show global status like 'wsrep_received%';
+----------------------+-----------+
| Variable_name        | Value     |
+----------------------+-----------+
| wsrep_received       | 703       |
| wsrep_received_bytes | 151875070 |
+----------------------+-----------+
2 rows in set (0.04 sec)
... test runs for 1 minute...
node3 mysql> show global status like 'wsrep_received%';
+----------------------+-----------+
| Variable_name        | Value     |
+----------------------+-----------+
| wsrep_received       | 38909     |
| wsrep_received_bytes | 167749809 |
+----------------------+-----------+
2 rows in set (0.17 sec)

This results in 62.3 MB of data replicated in this test.
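The exact sysbench invocation is not shown here; a hedged sketch of a comparable update-only run using sysbench 0.4-style options (database, table size, credentials, and host are assumptions):

sysbench --test=oltp --oltp-test-mode=nontrx --oltp-nontrx-mode=update_key \
  --oltp-table-size=1000000 --mysql-host=192.168.70.2 --mysql-user=test --mysql-password=test \
  --mysql-db=sbtest --num-threads=8 --max-time=60 --max-requests=0 run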

If I set binlog_row_image=minimal on all nodes and do a rolling restart, I can see how this changes:

node3 mysql> show global status like 'wsrep_received%';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| wsrep_received       | 3     |
| wsrep_received_bytes | 236   |
+----------------------+-------+
2 rows in set (0.07 sec)
... test runs for 1 minute...
node3 mysql> show global status like 'wsrep_received%';
+----------------------+----------+
| Variable_name        | Value    |
+----------------------+----------+
| wsrep_received       | 34005    |
| wsrep_received_bytes | 14122952 |
+----------------------+----------+
2 rows in set (0.13 sec)

This yields a mere 13.4 MB, which is about 80% smaller: quite a savings!  This benefit, of course, fully depends on the types of workloads you are doing.

Durable Memcache Cluster

It turns out this feature does not work properly with Galera; see below for an explanation:

5.6 introduces a Memcached interface for Innodb.  This means any standard memcache client can talk to our PXC nodes with the memcache protocol and the data is:

  • Replicated to all nodes
  • Durable across the cluster
  • Highly available
  • Easy to hash memcache clients across all servers for better cache coherency

To set this up, we need to simply load the innodb_memcache schema from the example and restart the daemon to get a listening memcached port:

[root@node1 ~]# mysql < /usr/share/mysql/innodb_memcached_config.sql
[root@node1 ~]# service mysql restart
Shutting down MySQL (Percona XtraDB Cluster)...... SUCCESS!
Starting MySQL (Percona XtraDB Cluster)...... SUCCESS!
[root@node1 ~]# lsof +p`pidof mysqld` | grep LISTEN
mysqld  31961 mysql   11u  IPv4             140592       0t0      TCP *:tram (LISTEN)
mysqld  31961 mysql   55u  IPv4             140639       0t0      TCP *:memcache (LISTEN)
mysqld  31961 mysql   56u  IPv6             140640       0t0      TCP *:memcache (LISTEN)
mysqld  31961 mysql   59u  IPv6             140647       0t0      TCP *:mysql (LISTEN)

This all appears to work and I can fetch the sample AA row from all the nodes with the memcached interface:

node1 mysql> select * from demo_test;
+-----+--------------+------+------+------+
| c1  | c2           | c3   | c4   | c5   |
+-----+--------------+------+------+------+
| AA  | HELLO, HELLO |    8 |    0 |    0 |
+-----+--------------+------+------+------+
[root@node3 ~]# telnet 127.0.0.1 11211
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
get AA
VALUE AA 8 12
HELLO, HELLO
END

However, if I try to update a row, it does not seem to replicate (even if I set innodb_api_enable_binlog):

[root@node3 ~]# telnet 127.0.0.1 11211
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
set DD 0 0 0
STORED
^]
telnet> quit
Connection closed.
node3 mysql> select * from demo_test;
+----+--------------+------+------+------+
| c1 | c2           | c3   | c4   | c5   |
+----+--------------+------+------+------+
| AA | HELLO, HELLO |    8 |    0 |    0 |
| DD |              |    0 |    1 |    0 |
+----+--------------+------+------+------+
2 rows in set (0.00 sec)
node1 mysql> select * from demo_test;
+-----+--------------+------+------+------+
| c1  | c2           | c3   | c4   | c5   |
+-----+--------------+------+------+------+
| AA  | HELLO, HELLO |    8 |    0 |    0 |
+-----+--------------+------+------+------+

So unfortunately the memcached plugin must use some backdoor to Innodb that Galera is unaware of. I’ve filed a bug on the issue, but it’s not clear if there will be an easy solution or if a whole lot of code will be necessary to make this work properly.

In the short-term, however, you can at least read data from all nodes with the memcached plugin as long as data is only written using the standard SQL interface.

Async replication GTID Integration

Async GTIDs were introduced in 5.6 in order to make CHANGE MASTER easier.  You have always been able to use async replication from any cluster node, but now with this new GTID support, it is much easier to failover to another node in the cluster as a new master.

We take one node out of our cluster to be a slave and enable GTID binary logging on the other two nodes by adding these settings:

server-id = ##
log-bin = cluster_log
log-slave-updates
gtid_mode = ON
enforce-gtid-consistency

If I generate some writes on the cluster, I can see GTIDs are working:

node1 mysql> show master status\G
*************************** 1. row ***************************
File: cluster_log.000001
Position: 573556
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set: e941e026-ac70-ee1c-6dc9-40f8d3b5db3f:1-1505
1 row in set (0.00 sec)
node2 mysql> show master status\G
*************************** 1. row ***************************
File: cluster_log.000001
Position: 560011
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set: e941e026-ac70-ee1c-6dc9-40f8d3b5db3f:1-1505
1 row in set (0.00 sec)

Notice that we’re at GTID 1505 on both nodes, even though the binary log position happens to be different.

I set up my slave to replicate from node1 (.70.2):

node3 mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.70.2
                  Master_User: slave
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...
           Retrieved_Gtid_Set: e941e026-ac70-ee1c-6dc9-40f8d3b5db3f:1-1506
            Executed_Gtid_Set: e941e026-ac70-ee1c-6dc9-40f8d3b5db3f:1-1506
                Auto_Position: 1
1 row in set (0.00 sec)
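The CHANGE MASTER statement used to point node3 at node1 is not shown above; a minimal sketch, assuming a ‘slave’ replication user already exists on the cluster (the password is a placeholder):

node3 mysql> change master to master_host='192.168.70.2', master_user='slave',
    ->   master_password='slavepass', master_auto_position=1;
node3 mysql> start slave;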

And it’s all caught up.  If I put some load on the cluster, I can easily change to node2 as my master without needing to stop writes:

node3 mysql> stop slave;
Query OK, 0 rows affected (0.09 sec)
node3 mysql> change master to master_host='192.168.70.3', master_auto_position=1;
Query OK, 0 rows affected (0.02 sec)
node3 mysql> start slave;
Query OK, 0 rows affected (0.02 sec)
node3 mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.70.3
                  Master_User: slave
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...
            Executed_Gtid_Set: e941e026-ac70-ee1c-6dc9-40f8d3b5db3f:1-3712
                Auto_Position: 1

So this seems to work pretty well. It does turn out there is a bit of a bug, but it’s actually with Xtrabackup — currently the binary logs are not copied in Xtrabackup SST and this can cause GTID inconsistencies between nodes in the cluster.  I would expect this to get fixed relatively quickly.

Conclusion

MySQL 5.6 introduces a lot of new interesting features that are even more compelling in the PXC/Galera world.  If you want to experiment for yourself, I pushed the Vagrant environment I used to Github at: https://github.com/jayjanssen/pxc_56_features

The post Useful MySQL 5.6 features you get for free in PXC 5.6 appeared first on MySQL Performance Blog.

Nov 22, 2013

New wsrep_provider_options in Galera 3.x and Percona XtraDB Cluster 5.6

Now that Percona XtraDB Cluster 5.6 is out in beta, I wanted to start a series talking about new features in Galera 3 and PXC 5.6.  On the surface, Galera 3 doesn’t reveal a lot of new features yet, but there has been a lot of refactoring of the system in preparation for great new features in the future.

Galera vs MySQL options

wsrep_provider_options is a semi-colon separated list of key => value configurations that set low-level Galera library configuration.  These tweak the actual cluster communication and replication in the group communication system.  By contrast, other PXC global variables (like ‘wsrep%’) are set like other mysqld options and generally have more to do with MySQL/Galera integration.   This post will cover the Galera options and mysql-level changes will have to wait for another post.

Here are the differences in the wsrep_provider_options between 5.5 and 5.6:

$ diff 5.5 5.6
38a39
> gmcast.segment = 0;
51,52c52,56
< replicator.causal_read_timeout = PT30S;
< replicator.commit_order = 3 |
---
> repl.causal_read_timeout = PT30S;
> repl.commit_order = 3;
> repl.key_format = FLAT8;
> repl.proto_max = 5;
> socket.checksum = 2

gmcast.segment=0

This is a new setting in 3.x and allows us to distinguish between nodes in different WAN segments.  For example, all nodes in a single datacenter would be configured with the same segment number, but each datacenter would have its own segment.

Segments are currently used in two main ways:

  1. Replication traffic between segments is minimized.  Writesets originating in one segment should be relayed through only one node in every other segment.  From those local relays replication is propagated to the rest of the nodes in each segment respectively.
  2. Segments are used in Donor-selection.  Yes, donors in the same segment are preferred, but not required.
2013-11-21 17:26:59 3853 [Warning] WSREP: There are no nodes in the same segment that will ever be able to become donors, yet there is a suitable donor outside. Will use that one.
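Assigning a node to a segment is done per node through wsrep_provider_options; a minimal sketch for a two-datacenter layout (the segment numbers are arbitrary, they just need to match within each datacenter):

# my.cnf on every node in datacenter A
wsrep_provider_options = "gmcast.segment=1"

# my.cnf on every node in datacenter B
wsrep_provider_options = "gmcast.segment=2"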

replicator -> repl

The older ‘replicator’ tag is now renamed to ‘repl’ and the causal_read_timeout and commit_order settings have moved there.  No news here really.

repl.key_format = FLAT8

Every writeset in Galera has associated keys.  These keys are effectively a list of primary, unique, and foreign keys associated with all rows modified in the writeset.  In Galera 2 these keys were replicated as literal values, but in Galera 3 they are hashed in either 8 or 16 byte values (FLAT8 vs FLAT16).  This should generally make the key sizes smaller, especially with large CHAR keys.

Because the keys are now hashed, there can be collisions where two distinct literal key values result in the same 8-byte hashed value.  This means practically that the places in Galera that rely on keys may falsely believe that there is a match between two writesets when there really is not.  This should be quite rare.  This false positive could affect:

  • Local certification failures (Deadlocks on commit) that are unnecessary.
  • Parallel apply – things could be done in a stricter order (i.e., less parallelization) than necessary

Neither case affects data consistency.  The tradeoff is more efficiency in keys and key operations generally making writesets smaller and certification faster.

repl.proto_max

Limits the Galera protocol version that can be used in the cluster.  Codership’s documentation states it is for debugging only.

socket.checksum = 2

This modifies the previous network packet checksum algorithm (CRC32) to support CRC32-C which is hardware accelerated on supported gear.  Packet checksums also can now be completely disabled (=0).

In the near future I’ll write some posts about WAN segments in more detail and about the other global and status variables introduced in PXC 5.6.

The post New wsrep_provider_options in Galera 3.x and Percona XtraDB Cluster 5.6 appeared first on MySQL Performance Blog.
