Dec 22, 2014
--

Testing backup locks during Xtrabackup SST on Percona XtraDB Cluster

Background on Backup Locks

I was very excited to see backup locks support in the release notes for the latest Percona XtraDB Cluster 5.6.21 release. For those who are not aware, backup locks offer an alternative to FLUSH TABLES WITH READ LOCK (FTWRL) in Xtrabackup. While Xtrabackup can hot-copy Innodb, everything else in MySQL must be locked (usually briefly) to get a consistent snapshot that lines up with Innodb. This includes all other storage engines, but also things like table schemas (even on Innodb) and async replication binary logs. You can skip this lock, but the resulting backup isn’t generally considered ‘safe’ in every case.

Until recently, Xtrabackup (like most other backup tools) used FTWRL to accomplish this. This worked great, but had the unfortunate side-effect of locking every single table, even the Innodb ones.  This functionally meant that even a hot-backup tool for Innodb had to take a (usually short) global lock to get a consistent backup with MySQL overall.

Backup locks change that by introducing a new locking command in Percona Server called ‘LOCK TABLES FOR BACKUP’.  This works by locking writes to non-transactional tables, as well as locking DDL on all tables (including Innodb).  If Xtrabackup (of a recent vintage) detects that it’s backing up a Percona Server (also of a recent vintage), it will automatically use LOCK TABLES FOR BACKUP instead of FLUSH TABLES WITH READ LOCK.

The TL;DR of this is that you can keep on modifying your Innodb data through the entire backup, since we don’t need to use FTWRL any longer.

This feature was introduced in Percona Server 5.6.16-64.0 and Percona XtraBackup 2.2.  I do not believe you will find it in any other MySQL variant, though I could be corrected.
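If you want to verify whether a given server build supports the feature, one rough check is the `have_backup_locks` variable that Percona Server exposes. A minimal sketch (the SHOW VARIABLES result is stubbed in as a dict here; wire it up to your own client code):

```python
def supports_backup_locks(variables):
    """Return True if a SHOW VARIABLES result (as a dict) indicates
    LOCK TABLES FOR BACKUP support.

    Percona Server exposes 'have_backup_locks' = 'YES' when the feature
    is available; stock MySQL has no such variable at all.
    """
    return variables.get("have_backup_locks", "NO").upper() == "YES"


# Example with stubbed SHOW VARIABLES output:
percona = {"have_backup_locks": "YES", "version": "5.6.21-70.1-56"}
stock = {"version": "5.6.21"}
print(supports_backup_locks(percona))  # True
print(supports_backup_locks(stock))    # False
```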

What this means for Percona XtraDB Cluster (PXC)

The most common (and logical) SST method for Percona XtraDB Cluster is using Xtrabackup. This latest release of PXC includes support for backup locks, meaning that Xtrabackup donor nodes will no longer need to get a global lock. Practically for PXC users, this means that your Donor nodes can stay in rotation without causing client interruptions due to FTWRL.

Seeing it in action

To test this out, I spun up a 3-node cluster on AWS and fired up a sysbench run on the first node. I then forced an SST on the node. Here is a snippet of the innobackup.backup.log (generated by all Xtrabackup donors in Percona XtraDB Cluster):

InnoDB Backup Utility v1.5.1-xtrabackup; Copyright 2003, 2009 Innobase Oy
and Percona LLC and/or its affiliates 2009-2013. All Rights Reserved.
This software is published under
the GNU GENERAL PUBLIC LICENSE Version 2, June 1991.
Get the latest version of Percona XtraBackup, documentation, and help resources:
http://www.percona.com/xb/p
141218 19:22:01 innobackupex: Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_file=/etc/my.cnf;mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock' as 'sst' (using password: YES).
141218 19:22:01 innobackupex: Connected to MySQL server
141218 19:22:01 innobackupex: Starting the backup operation
IMPORTANT: Please check that the backup run completes successfully.
 At the end of a successful backup run innobackupex
 prints "completed OK!".
innobackupex: Using server version 5.6.21-70.1-56
innobackupex: Created backup directory /tmp/tmp.Rm0qA740U3
141218 19:22:01 innobackupex: Starting ibbackup with command: xtrabackup --defaults-file="/etc/my.cnf" --defaults-group="mysqld" --backup --suspend-at-end --target-dir=/tmp/tmp.dM03LgPHFY --innodb_data_file_path="ibdata1:12M:autoextend" --tmpdir=/tmp/tmp.dM03LgPHFY --extra-lsndir='/tmp/tmp.dM03LgPHFY' --stream=xbstream
innobackupex: Waiting for ibbackup (pid=21892) to suspend
innobackupex: Suspend file '/tmp/tmp.dM03LgPHFY/xtrabackup_suspended_2'
xtrabackup version 2.2.7 based on MySQL server 5.6.21 Linux (x86_64) (revision id: )
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql
xtrabackup: open files limit requested 0, set to 5000
xtrabackup: using the following InnoDB configuration:
xtrabackup: innodb_data_home_dir = ./
xtrabackup: innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup: innodb_log_group_home_dir = ./
xtrabackup: innodb_log_files_in_group = 2
xtrabackup: innodb_log_file_size = 1073741824
xtrabackup: using O_DIRECT
>> log scanned up to (10525811040)
xtrabackup: Generating a list of tablespaces
[01] Streaming ./ibdata1
>> log scanned up to (10529368594)
>> log scanned up to (10532685942)
>> log scanned up to (10536422820)
>> log scanned up to (10539562039)
>> log scanned up to (10543077110)
[01] ...done
[01] Streaming ./mysql/innodb_table_stats.ibd
[01] ...done
[01] Streaming ./mysql/innodb_index_stats.ibd
[01] ...done
[01] Streaming ./mysql/slave_relay_log_info.ibd
[01] ...done
[01] Streaming ./mysql/slave_master_info.ibd
[01] ...done
[01] Streaming ./mysql/slave_worker_info.ibd
[01] ...done
[01] Streaming ./sbtest/sbtest1.ibd
>> log scanned up to (10546490256)
>> log scanned up to (10550321726)
>> log scanned up to (10553628936)
>> log scanned up to (10555422053)
[01] ...done
...
[01] Streaming ./sbtest/sbtest17.ibd
>> log scanned up to (10831343724)
>> log scanned up to (10834063832)
>> log scanned up to (10837100278)
>> log scanned up to (10840243171)
[01] ...done
xtrabackup: Creating suspend file '/tmp/tmp.dM03LgPHFY/xtrabackup_suspended_2' with pid '21892'
>> log scanned up to (10843312323)
141218 19:24:06 innobackupex: Continuing after ibbackup has suspended
141218 19:24:06 innobackupex: Executing LOCK TABLES FOR BACKUP...
141218 19:24:06 innobackupex: Backup tables lock acquired
141218 19:24:06 innobackupex: Starting to backup non-InnoDB tables and files
innobackupex: in subdirectories of '/var/lib/mysql/'
innobackupex: Backing up files '/var/lib/mysql//mysql/*.{frm,isl,MYD,MYI,MAD,MAI,MRG,TRG,TRN,ARM,ARZ,CSM,CSV,opt,par}' (74 files)
>> log scanned up to (10846683627)
>> log scanned up to (10847773504)
innobackupex: Backing up files '/var/lib/mysql//sbtest/*.{frm,isl,MYD,MYI,MAD,MAI,MRG,TRG,TRN,ARM,ARZ,CSM,CSV,opt,par}' (21 files)
innobackupex: Backing up file '/var/lib/mysql//test/db.opt'
innobackupex: Backing up files '/var/lib/mysql//performance_schema/*.{frm,isl,MYD,MYI,MAD,MAI,MRG,TRG,TRN,ARM,ARZ,CSM,CSV,opt,par}' (53 files)
>> log scanned up to (10852976291)
141218 19:24:09 innobackupex: Finished backing up non-InnoDB tables and files
141218 19:24:09 innobackupex: Executing LOCK BINLOG FOR BACKUP...
141218 19:24:09 innobackupex: Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
141218 19:24:09 innobackupex: Waiting for log copying to finish
>> log scanned up to (10856996124)
xtrabackup: The latest check point (for incremental): '9936050111'
xtrabackup: Stopping log copying thread.
.>> log scanned up to (10856996124)
xtrabackup: Creating suspend file '/tmp/tmp.dM03LgPHFY/xtrabackup_log_copied' with pid '21892'
141218 19:24:10 innobackupex: Executing UNLOCK BINLOG
141218 19:24:10 innobackupex: Executing UNLOCK TABLES
141218 19:24:10 innobackupex: All tables unlocked
141218 19:24:10 innobackupex: Waiting for ibbackup (pid=21892) to finish
xtrabackup: Transaction log of lsn (9420426891) to (10856996124) was copied.
innobackupex: Backup created in directory '/tmp/tmp.Rm0qA740U3'
141218 19:24:30 innobackupex: Connection to database server closed
141218 19:24:30 innobackupex: completed OK!

We can see the LOCK TABLES FOR BACKUP issued at 19:24:06 and unlocked at 19:24:10. Let’s see Galera apply stats from this node during that time:

mycluster / ip-10-228-128-220 (idx: 0) / Galera 3.8(rf6147dd)
Wsrep    Cluster  Node Repl  Queue     Ops       Bytes     Conflct   Gcache    Window        Flow
    time P cnf  # Stat Laten   Up   Dn   Up   Dn   Up   Dn  lcf  bfa  ist  idx dst appl comm  p_ms
19:23:55 P   5  3 Dono 698µs    0   72    0 5418  0.0 3.5M    0    0 187k   94  3k    3    2     0
19:23:56 P   5  3 Dono 701µs    0   58    0 5411  0.0 3.5M    0    0 188k  229  3k    3    2     0
19:23:57 P   5  3 Dono 701µs    0    2    0 5721  0.0 3.7M    0    0 188k  120  3k    3    2     0
19:23:58 P   5  3 Dono 689µs    0    5    0 5643  0.0 3.6M    0    0 188k   63  3k    3    2     0
19:23:59 P   5  3 Dono 679µs    0   55    0 5428  0.0 3.5M    0    0 188k  115  3k    3    2     0
19:24:01 P   5  3 Dono 681µs    0    1    0 4623  0.0 3.0M    0    0 188k  104  3k    3    2     0
19:24:02 P   5  3 Dono 690µs    0    0    0 4301  0.0 2.7M    0    0 188k  141  3k    3    2     0
19:24:03 P   5  3 Dono 688µs    0    2    0 4907  0.0 3.1M    0    0 188k  227  3k    3    2     0
19:24:04 P   5  3 Dono 692µs    0   44    0 4894  0.0 3.1M    0    0 188k  116  3k    3    2     0
19:24:05 P   5  3 Dono 706µs    0    0    0 5337  0.0 3.4M    0    0 188k   63  3k    3    2     0

Initially the node is keeping up OK with replication: the Down Queue (wsrep_local_recv_queue) is sticking around 0, and we’re applying 4-5k transactions per second (Ops Dn). When the backup lock kicks in, we do see an increase in the queue size, but note that transactions are still applying on this node:

19:24:06 P   5  3 Dono 696µs    0  170    0 5671  0.0 3.6M    0    0 187k  130  3k    3    2     0
19:24:07 P   5  3 Dono 695µs    0 2626    0 3175  0.0 2.0M    0    0 185k 2193  3k    3    2     0
19:24:08 P   5  3 Dono 692µs    0 1248    0 6782  0.0 4.3M    0    0 186k 1800  3k    3    2     0
19:24:09 P   5  3 Dono 693µs    0  611    0 6111  0.0 3.9M    0    0 187k  651  3k    3    2     0
19:24:10 P   5  3 Dono 708µs    0   93    0 5316  0.0 3.4M    0    0 187k  139  3k    3    2     0

So this node isn’t locked out of Innodb write transactions; it’s just under a bit of extra IO load while the backup finishes copying its files. After this, the backup finishes up and the node goes back to the Synced state pretty quickly:

19:24:11 P   5  3 Dono 720µs    0    1    0 4486  0.0 2.9M    0    0 188k   78  3k    3    2     0
19:24:12 P   5  3 Dono 715µs    0    0    0 3982  0.0 2.5M    0    0 188k  278  3k    3    2     0
19:24:13 P   5  3 Dono 1.2ms    0    0    0 4337  0.0 2.8M    0    0 188k  143  3k    3    2     0
19:24:14 P   5  3 Dono 1.2ms    0    1    0 4901  0.0 3.1M    0    0 188k  130  3k    3    2     0
19:24:16 P   5  3 Dono 1.1ms    0    0    0 5289  0.0 3.4M    0    0 188k   76  3k    3    2     0
19:24:17 P   5  3 Dono 1.1ms    0   42    0 4998  0.0 3.2M    0    0 188k  319  3k    3    2     0
19:24:18 P   5  3 Dono 1.1ms    0   15    0 3290  0.0 2.1M    0    0 188k   75  3k    3    2     0
19:24:19 P   5  3 Dono 1.1ms    0    0    0 4124  0.0 2.6M    0    0 188k  276  3k    3    2     0
19:24:20 P   5  3 Dono 1.1ms    0    4    0 1635  0.0 1.0M    0    0 188k   70  3k    3    2     0
19:24:21 P   5  3 Dono 1.1ms    0    0    0 5026  0.0 3.2M    0    0 188k  158  3k    3    2     0
19:24:22 P   5  3 Dono 1.1ms    0   20    0 4100  0.0 2.6M    0    0 188k  129  3k    3    2     0
19:24:23 P   5  3 Dono 1.1ms    0    0    0 5412  0.0 3.5M    0    0 188k  159  3k    3    2     0
19:24:24 P   5  3 Dono 1.1ms    0  315    0 4567  0.0 2.9M    0    0 187k  170  3k    3    2     0
19:24:25 P   5  3 Dono 1.0ms    0   24    0 5535  0.0 3.5M    0    0 188k  131  3k    3    2     0
19:24:26 P   5  3 Dono 1.0ms    0    0    0 5427  0.0 3.5M    0    0 188k   71  3k    3    2     0
19:24:27 P   5  3 Dono 1.0ms    0    1    0 5221  0.0 3.3M    0    0 188k  256  3k    3    2     0
19:24:28 P   5  3 Dono 1.0ms    0    0    0 5317  0.0 3.4M    0    0 188k  159  3k    3    2     0
19:24:29 P   5  3 Dono 1.0ms    0    1    0 5491  0.0 3.5M    0    0 188k  163  3k    3    2     0
19:24:30 P   5  3 Sync 1.0ms    0    0    0 5540  0.0 3.5M    0    0 188k  296  3k    3    2     0
19:24:31 P   5  3 Sync 992µs    0  106    0 5594  0.0 3.6M    0    0 187k  130  3k    3    2     0
19:24:33 P   5  3 Sync 984µs    0   19    0 5723  0.0 3.7M    0    0 188k  275  3k    3    2     0
19:24:34 P   5  3 Sync 976µs    0    0    0 5508  0.0 3.5M    0    0 188k  182  3k    3    2     0

Compared to Percona XtraDB Cluster 5.5

Backup locks are only a feature of Percona XtraDB Cluster 5.6, so if we repeat the experiment on 5.5, we can see a more severe lock:

141218 20:31:19  innobackupex: Executing FLUSH TABLES WITH READ LOCK...
141218 20:31:19  innobackupex: All tables locked and flushed to disk
141218 20:31:19  innobackupex: Starting to backup non-InnoDB tables and files
innobackupex: in subdirectories of '/var/lib/mysql/'
innobackupex: Backing up files '/var/lib/mysql//sbtest/*.{frm,isl,MYD,MYI,MAD,MAI,MRG,TRG,TRN,ARM,ARZ,CSM,CSV,opt,par}' (21 files)
innobackupex: Backing up files '/var/lib/mysql//mysql/*.{frm,isl,MYD,MYI,MAD,MAI,MRG,TRG,TRN,ARM,ARZ,CSM,CSV,opt,par}' (72 files)
>> log scanned up to (6633554484)
innobackupex: Backing up file '/var/lib/mysql//test/db.opt'
innobackupex: Backing up files '/var/lib/mysql//performance_schema/*.{frm,isl,MYD,MYI,MAD,MAI,MRG,TRG,TRN,ARM,ARZ,CSM,CSV,opt,par}' (18 files)
141218 20:31:21  innobackupex: Finished backing up non-InnoDB tables and files
141218 20:31:21  innobackupex: Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
141218 20:31:21  innobackupex: Waiting for log copying to finish
xtrabackup: The latest check point (for incremental): '5420681649'
xtrabackup: Stopping log copying thread.
.>> log scanned up to (6633560488)
xtrabackup: Creating suspend file '/tmp/tmp.Cq5JRZEFki/xtrabackup_log_copied' with pid '23130'
141218 20:31:22  innobackupex: All tables unlocked

Our lock lasts from 20:31:19 until 20:31:22, so it’s fairly short. Note that on larger databases with more schemas and tables, this lock can last quite a bit longer. Let’s see the effect on the apply rate for this node:

mycluster / ip-10-229-68-156 (idx: 0) / Galera 2.11(r318911d)
Wsrep    Cluster  Node Repl  Queue     Ops       Bytes     Conflct   Gcache    Window        Flow
    time P cnf  # Stat Laten   Up   Dn   Up   Dn   Up   Dn  lcf  bfa  ist  idx dst appl comm  p_ms
20:31:13 P   5  3 Dono   N/A    0   73    0 3493  0.0 1.8M    0    0 1.8m  832 746    2    2   0.0
20:31:14 P   5  3 Dono   N/A    0   29    0 3578  0.0 1.9M    0    0 1.8m  850 749    3    2   0.0
20:31:15 P   5  3 Dono   N/A    0    0    0 3513  0.0 1.8M    0    0 1.8m  735 743    2    2   0.0
20:31:16 P   5  3 Dono   N/A    0    0    0 3651  0.0 1.9M    0    0 1.8m  827 748    2    2   0.0
20:31:17 P   5  3 Dono   N/A    0   27    0 3642  0.0 1.9M    0    0 1.8m  840 762    2    2   0.0
20:31:18 P   5  3 Dono   N/A    0    0    0 3840  0.0 2.0M    0    0 1.8m  563 776    2    2   0.0
20:31:19 P   5  3 Dono   N/A    0    0    0 4368  0.0 2.3M    0    0 1.8m  823 745    2    1   0.0
20:31:20 P   5  3 Dono   N/A    0 3952    0  339  0.0 0.2M    0    0 1.8m  678 751    1    1   0.0
20:31:21 P   5  3 Dono   N/A    0 7883    0    0  0.0  0.0    0    0 1.8m  678 751    0    0   0.0
20:31:22 P   5  3 Dono   N/A    0 4917    0 5947  0.0 3.1M    0    0 1.8m 6034  3k    7    6   0.0
20:31:24 P   5  3 Dono   N/A    0   10    0 8238  0.0 4.3M    0    0 1.8m  991  1k    7    6   0.0
20:31:25 P   5  3 Dono   N/A    0    0    0 3016  0.0 1.6M    0    0 1.8m  914 754    2    1   0.0
20:31:26 P   5  3 Dono   N/A    0    0    0 3253  0.0 1.7M    0    0 1.8m  613 766    1    1   0.0
20:31:27 P   5  3 Dono   N/A    0    1    0 3600  0.0 1.9M    0    0 1.8m  583 777    2    1   0.0
20:31:28 P   5  3 Dono   N/A    0    0    0 3640  0.0 1.9M    0    0 1.8m  664 750    2    2   0.0

The drop here is more severe and the apply rate hits 0 (and stays there for the duration of the FTWRL).
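A quick way to quantify these lock windows from both logs is to diff the lock/unlock timestamps in innobackup.backup.log. A sketch (the line patterns are taken from the excerpts above and may differ across Xtrabackup versions):

```python
import re
from datetime import datetime

# Matches the lock begin/end lines seen in the innobackup.backup.log
# excerpts above; other Xtrabackup versions may word these differently.
LOCK_RE = re.compile(
    r"^(\d{6} \d{2}:\d{2}:\d{2})\s+innobackupex: "
    r"(Executing LOCK TABLES FOR BACKUP"
    r"|Executing FLUSH TABLES WITH READ LOCK"
    r"|All tables unlocked)"
)

def lock_window(log_lines):
    """Return the lock duration in seconds, or None if not found."""
    start = end = None
    for line in log_lines:
        m = LOCK_RE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%y%m%d %H:%M:%S")
        if m.group(2).startswith("Executing") and start is None:
            start = ts  # first lock statement seen
        elif m.group(2) == "All tables unlocked":
            end = ts    # lock released
    if start is not None and end is not None:
        return (end - start).total_seconds()
    return None

log = [
    "141218 20:31:19  innobackupex: Executing FLUSH TABLES WITH READ LOCK...",
    "141218 20:31:22  innobackupex: All tables unlocked",
]
print(lock_window(log))  # 3.0
```

Run against the two logs above, this gives roughly a 4-second backup-lock window on 5.6 versus a 3-second FTWRL on 5.5; the point is not the absolute numbers (this test database is tiny) but that the 5.6 lock never blocks Innodb writes.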

Implications

Obviously Xtrabackup running on a PXC node will cause some load on the node itself, so there still may be good reasons to keep a Donor node out of rotation from your application.  However, this is less of an issue than it was in the past, where writes would definitely stall on a Donor node and cause potentially intermittent stalls in the application.

Whether you allow applications to start using a Donor node automatically (or not) depends on how your HA layer between the application and the cluster is set up.  If you use HAProxy or similar with clustercheck, you can either modify the script itself or change a command-line argument. Below, the node is in the Donor/Desynced state:

[root@ip-10-229-64-35 ~]# /usr/bin/clustercheck clustercheckuser clustercheckpassword!
HTTP/1.1 503 Service Unavailable
Content-Type: text/plain
Connection: close
Content-Length: 44
Percona XtraDB Cluster Node is not synced.
[root@ip-10-229-64-35 ~]# /usr/bin/clustercheck clustercheckuser clustercheckpassword! 1
HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 40
Percona XtraDB Cluster Node is synced.
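For reference, the trailing ‘1’ above is clustercheck’s available_when_donor argument. A typical wiring behind HAProxy looks roughly like this (a sketch with placeholder node names and IPs; it assumes the common setup where xinetd serves clustercheck on port 9200 as the mysqlchk service):

```
# haproxy.cfg fragment (hypothetical addresses)
backend pxc_backend
    mode tcp
    option httpchk GET /
    server node1 10.0.0.1:3306 check port 9200 inter 2000
    server node2 10.0.0.2:3306 check port 9200 inter 2000
    server node3 10.0.0.3:3306 check port 9200 inter 2000
```

With available_when_donor set, the 503/200 responses shown above are what flips a Donor node out of or into the HAProxy rotation.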

For those doing their own custom health checking, you basically just need to pass nodes that have a wsrep_local_state_comment of either ‘Synced’ or ‘Donor/Desynced’.
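The gist of such a custom check is just a string comparison on that status value. A minimal sketch (the allow_donor toggle mirrors clustercheck’s available_when_donor argument; fetching wsrep_local_state_comment via SHOW STATUS is left to your client code):

```python
def node_usable(state_comment, allow_donor=True):
    """Decide whether a PXC node should receive application traffic,
    based on its wsrep_local_state_comment status value."""
    if state_comment == "Synced":
        return True
    # With backup locks, a Donor can usually keep serving writes:
    return allow_donor and state_comment == "Donor/Desynced"

print(node_usable("Synced"))                 # True
print(node_usable("Donor/Desynced"))         # True
print(node_usable("Donor/Desynced", False))  # False
print(node_usable("Joined"))                 # False
```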

The post Testing backup locks during Xtrabackup SST on Percona XtraDB Cluster appeared first on MySQL Performance Blog.

Jun 17, 2013
--

Percona XtraDB Cluster (PXC) in the real world: Share your use cases!

The aim of this post is to enumerate real-world usage of Percona XtraDB Cluster (PXC), and also to solicit use cases from the readers. One of the prominent production deployments we have come across (and that Percona consultants have assisted with) is HP Cloud. There is a post about it here by Patrick Galbraith of HP, covering their deployment of PXC for HP Cloud DNS and the key aspects of a synchronous replication setup with high-availability guarantees like split-brain immunity.

Nobody likes to debug async replication while it’s broken, or to do the master-master/master-slave switchover when the master is dying or dead. Yes, there are wrappers and scripts around this to make life easier; however, wouldn’t it be nice if this were built into the system itself? PXC, based on Galera, strives to provide that. Scaling makes sense only when the addition/removal of hosts from a cluster or an HA setup is simple and uncomplicated.

Their post focuses on the following aspects:

  • Initial setup
  • Setup of other nodes with SST (Xtrabackup SST)
  • Integration of chef with PXC
  • Finally, integration of HAProxy as a load balancer.

To elucidate, their initial setup goes into bootstrapping the first node. Note that in a cloud environment the other nodes are not known until they are brought up, hence Chef bootstraps the first node with an empty gcomm://. The second node is then added, and it SSTs from node1 (based on the gcomm://node1 in node2’s configuration) through Xtrabackup SST (state snapshot transfer). Node3 subsequently joins the cluster with node1 and node2 in its gcomm:// (since by this time node1 and node2 are up). After this, a subsequent run of chef-client updates the cnf files with the IP addresses of the members (excluding the node itself). The rationale is that when a node is restarted (and others are up when it comes back), it rejoins the cluster seamlessly. I would like to note here that we are adding a bootstrap parameter to PXC so that later modifications like these to cnf files are not required and the member list can be preset during cluster startup itself. The only caveat is that the node information (IP address or hostname) must be known in advance (the node itself needn’t be up), which may not be feasible in a cloud environment.
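In cnf terms, the two stages of that workflow look roughly like this (a sketch with placeholder hostnames; wsrep_cluster_address is the setting their Chef run rewrites):

```
# Stage 1: bootstrap the very first node with an empty member list
wsrep_cluster_address = gcomm://

# Stage 2: once the other members are known, chef-client rewrites it,
# e.g. on node1 (each node lists the members other than itself):
# wsrep_cluster_address = gcomm://node2,node3
```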

Next, the SST. Xtrabackup SST is used there. SST matters a lot because not only is it used during initial node setup, but it is also required when a node has been down for a while and IST (incremental state transfer) is not feasible. It also helps when a node’s data integrity is compromised. So, naturally, the duration of SST is paramount. We recommend Xtrabackup SST for its reduced locking period (which means the donor is blocked for a shorter while). By using Xtrabackup for SST, you also get its benefits like compression, parallel streaming, encryption, and compact backups, all of which can be used for SST. (Note: the wsrep_sst_xtrabackup in 5.5.30 can’t do those except parallel; the one in 5.5.31 will handle them all, and XtraBackup 2.1 is required for most.)

Finally, the HAProxy. HAProxy is one of the load balancers recommended for use with PXC; the other one is glb. HAProxy is used with xinetd on the node, along with a script which checks PXC for its sync status. As referenced in that post, you can refer to a post by Peter Boros (“Percona XtraDB Cluster reference architecture with HaProxy“) for details. In their setup they have automated this with an HAProxy in each AZ (Availability Zone) for the API server. To add, we are looking at reducing the overhead here, through steps like replacing xinetd and clustercheck with a single serving process (we are adding one in 5.5.31), looking for optimizations with HAProxy to account for high connection rates, and using Pacemaker with PXC. The goal is to reduce the overhead of status checks, mainly on the node. You can also look at this PLMCE talk for HAProxy deployment strategies with PXC.

To conclude, it is interesting to note that they have been able to manage this with a small team. That strongly implies scalability of resources: you scale more with less, and that is how it should be. We would like to hear from you about your architectural setup around PXC: any challenges you faced (and horror stories, if any), any special deployment methodologies you employed (Puppet, Chef, Salt, Ansible, etc.), and finally any suggestions.

The post Percona XtraDB Cluster (PXC) in the real world: Share your use cases! appeared first on MySQL Performance Blog.
