Jul
03
2018
--

Linux OS Tuning for MySQL Database Performance


In this post we will review the most important Linux settings to adjust for performance tuning and optimization of a MySQL database server. We’ll note how some of the Linux parameter settings used for OS tuning may vary according to different system types: physical, virtual or cloud. Other posts have addressed MySQL parameters, like Alexander’s blog MySQL 5.7 Performance Tuning Immediately After Installation. That post remains highly relevant for the latest versions of MySQL, 5.7 and 8.0. Here we will focus more on the Linux operating system parameters that can affect database performance.

Server and Operating System

Here are some Linux parameters that you should check and consider modifying if you need to improve database performance.

Kernel – vm.swappiness

This value represents the tendency of the kernel to swap out memory pages. On a database server with ample amounts of RAM, we should keep this value as low as possible: the extra I/O can slow down or even render the service unresponsive. A value of 0 disables swapping completely, while 1 causes the kernel to perform the minimum amount of swapping. In most cases the latter setting should be OK:

# Set the swappiness value as root
echo 1 > /proc/sys/vm/swappiness
# Alternatively, using sysctl
sysctl -w vm.swappiness=1
# Verify the change
cat /proc/sys/vm/swappiness
1
# Alternatively, using sysctl
sysctl vm.swappiness
vm.swappiness = 1

The change should also be persisted in /etc/sysctl.conf:

vm.swappiness = 1
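To apply the persisted value without a reboot, you can reload the kernel parameters. A minimal sketch, assuming the setting is added to /etc/sysctl.conf itself rather than to a drop-in file under /etc/sysctl.d:

# Append the setting and reload kernel parameters from the file
echo "vm.swappiness = 1" >> /etc/sysctl.conf
sysctl -p /etc/sysctl.conf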

Filesystems – XFS/ext4/ZFS
XFS

XFS is a high-performance, journaling file system designed for high scalability. It provides near-native I/O performance even when the file system spans multiple storage devices. XFS has features that make it suitable for very large file systems, supporting files up to 8EiB in size. It offers fast recovery, fast transactions, delayed allocation for reduced fragmentation, and near-raw I/O performance with direct I/O.

The default options for mkfs.xfs are good for optimal speed, so the simple command:

# Use default mkfs options
mkfs.xfs /dev/target_volume

will provide the best performance while ensuring data safety. Regarding mount options, the defaults should fit most cases. On some filesystems you can see a performance increase by adding the noatime mount option to /etc/fstab. For XFS filesystems the default atime behaviour is relatime, which has almost no overhead compared to noatime and still maintains sane atime values. If you create an XFS file system on a LUN that has a battery-backed, non-volatile cache, you can further increase the performance of the filesystem by disabling the write barrier with the mount option nobarrier. This helps you avoid flushing data more often than necessary. If a BBU (backup battery unit) is not present, however, or you are unsure about it, leave barriers on, otherwise you may jeopardize data consistency. With these options, the /etc/fstab entries should look like the ones below:

/dev/sda2              /datastore              xfs     defaults,nobarrier
/dev/sdb2              /binlog                 xfs     defaults,nobarrier
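To double-check which options an XFS filesystem is actually mounted with, you can query the mount table; a quick check, using the example mount point above:

# Show the active mount options for the data volume
findmnt -no OPTIONS /datastore
# or
grep ' /datastore ' /proc/mounts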

ext4

ext4 has been developed as the successor to ext3, with added performance improvements. It is a solid option that will fit most workloads. We should note here that it supports files up to 16TB in size, a smaller limit than XFS. This is something you should consider if extreme tablespace size/growth is a requirement. Regarding mount options, the same considerations apply. We recommend the defaults for a robust filesystem without risks to data consistency. However, if an enterprise storage controller with a BBU cache is present, the following mount options will provide the best performance:

/dev/sda2              /datastore              ext4     noatime,data=writeback,barrier=0,nobh,errors=remount-ro
/dev/sdb2              /binlog                 ext4     noatime,data=writeback,barrier=0,nobh,errors=remount-ro

Note: The data=writeback option results in only metadata being journaled, not actual file data. This carries the risk of corrupting recently modified files in the event of a sudden power loss, a risk which is minimised by the presence of a BBU-enabled controller. nobh only works with the data=writeback option enabled.

ZFS

ZFS is a combined filesystem and logical volume manager offering enterprise-grade protection against data corruption. There are certainly cases where the rich feature set of ZFS makes it an essential option to consider, most notably when advanced volume management is a requirement. ZFS tuning for MySQL can be a complex topic and falls outside the scope of this blog. For further reference, there is a dedicated blog post on the subject by Yves Trudeau.

Disk Subsystem – I/O scheduler 

Most modern Linux distributions come with the noop or deadline I/O schedulers by default, both providing better performance than the cfq and anticipatory ones. However, it is always good practice to check the scheduler for each device; if the value shown is different from noop or deadline, the policy can be changed without rebooting the server:

# View the I/O scheduler setting. The value in square brackets shows the running scheduler
cat /sys/block/sdb/queue/scheduler
noop deadline [cfq]
# Change the setting
echo noop | sudo tee /sys/block/sdb/queue/scheduler

To make the change persistent, you must modify the GRUB configuration file:

# Change the line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
# to:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash elevator=noop"
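After editing /etc/default/grub, the configuration has to be regenerated and the server rebooted for the new elevator= setting to take effect. A sketch, assuming a BIOS-booted RHEL/CentOS layout (the output path differs on EFI systems):

# Regenerate the GRUB configuration (RHEL/CentOS)
grub2-mkconfig -o /boot/grub2/grub.cfg
# On Debian/Ubuntu the equivalent is:
update-grub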

AWS Note: There are cases where the I/O scheduler has a value of none, most notably in AWS VM instance types where EBS volumes are exposed as NVMe block devices. This is because the setting has no use for modern PCIe/NVMe devices: they have a very large internal queue and bypass the I/O scheduler altogether. The value none is therefore optimal for such disks.
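For example, on such an instance you would typically see something like the following (the device name is illustrative and the exact output varies by kernel):

# NVMe-backed EBS volume: the scheduler reports as none
cat /sys/block/nvme0n1/queue/scheduler
none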

Disk Subsystem – Volume optimization

Ideally, different disk volumes should be used for the OS installation, binlog, data and the redo log, if this is possible. The separation of OS and data partitions, not just logically but physically, will improve database performance. The RAID level can also have an impact: RAID-5 should be avoided, as the parity calculation needed to ensure integrity is costly. The best performance without compromising redundancy is achieved by the use of an advanced controller with a battery-backed cache unit and, preferably, RAID-10 volumes spanned across multiple disks.
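As an illustration only (the mount points are hypothetical), this separation can be expressed in my.cnf by pointing the relevant variables at the dedicated volumes:

[mysqld]
datadir                   = /datastore/mysql
log_bin                   = /binlog/mysql-bin
innodb_log_group_home_dir = /redolog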

AWS Note: For further information about EBS volumes and AWS storage optimisation, Amazon has documentation at the following links:

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/nvme-ebs-volumes.html

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/storage-optimized-instances.html

Database settings

System Architecture – NUMA settings

Non-uniform memory access (NUMA) is a memory design in which a processor in an SMP system can access its own local memory faster than non-local memory (memory local to other CPUs). This may result in suboptimal database performance and potentially swapping. When the buffer pool memory allocation is larger than the amount of RAM available local to the node, and the default memory allocation policy is selected, swapping occurs. A NUMA-enabled server will report different node distances between CPU nodes; a uniform one will report a single distance:

# NUMA system
numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 65525 MB
node 0 free: 296 MB
node 1 cpus: 8 9 10 11 12 13 14 15
node 1 size: 65536 MB
node 1 free: 9538 MB
node 2 cpus: 16 17 18 19 20 21 22 23
node 2 size: 65536 MB
node 2 free: 12701 MB
node 3 cpus: 24 25 26 27 28 29 30 31
node 3 size: 65535 MB
node 3 free: 7166 MB
node distances:
node   0   1   2   3
  0:  10  20  20  20
  1:  20  10  20  20
  2:  20  20  10  20
  3:  20  20  20  10
# Uniformed system
numactl --hardware
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 64509 MB
node 0 free: 4870 MB
node distances:
node   0
  0:  10

In the case of a NUMA system, where numactl shows different distances across nodes, the MySQL variable innodb_numa_interleave should be enabled to ensure memory interleaving. Percona Server provides improved NUMA support by introducing the flush_caches variable. When enabled, it will help with allocation fairness across nodes. To determine whether or not allocation is equal across nodes, you can examine numa_maps for the mysqld process with this script:

# The perl script numa_maps.pl will report memory allocation per CPU node:
# 3595 is the pid of the mysqld process
perl numa_maps.pl < /proc/3595/numa_maps
N0        :     16010293 ( 61.07 GB)
N1        :     10465257 ( 39.92 GB)
N2        :     13036896 ( 49.73 GB)
N3        :     14508505 ( 55.35 GB)
active    :          438 (  0.00 GB)
anon      :     54018275 (206.06 GB)
dirty     :     54018275 (206.06 GB)
kernelpagesize_kB:         4680 (  0.02 GB)
mapmax    :          787 (  0.00 GB)
mapped    :         2731 (  0.01 GB)
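On such a NUMA host, enabling the interleaving itself is a single MySQL option; a minimal sketch of the my.cnf entry:

[mysqld]
# Interleave InnoDB buffer pool allocation across all NUMA nodes (MySQL/Percona Server 5.7)
innodb_numa_interleave = ON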

Conclusion

In this blog post we examined a few important OS related settings and explained how they can be tuned for better database performance.

While you are here …

You might also find value in this recorded webinar: Troubleshooting Best Practices: Monitoring the Production Database Without Killing Performance.

 


Jun
29
2018
--

Percona XtraDB Cluster 5.7.22-29.26 Is Now Available


Percona announces the release of Percona XtraDB Cluster 5.7.22-29.26 (PXC) on June 29, 2018. Binaries are available from the downloads section or our software repositories.

Percona XtraDB Cluster 5.7.22-29.26 is now the current release, based on the following:

Deprecated

The following variables are deprecated starting from this release:

  • wsrep-force-binlog-format
  • wsrep_sst_method = mysqldump

Since the use of binlog_format=ROW is enforced in 5.7, the wsrep_forced_binlog_format variable is much less significant. The same applies to mysqldump, as xtrabackup is now the recommended SST method.
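For reference, selecting the recommended SST method is a single my.cnf setting; a minimal sketch (xtrabackup-v2 is the SST script name used by PXC 5.7):

[mysqld]
wsrep_sst_method = xtrabackup-v2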

New features

  • PXC-907: The new variable wsrep_RSU_commit_timeout allows you to configure the timeout (in microseconds) that RSU waits for an active commit connection.
  • Percona XtraDB Cluster now supports the keyring_vault plugin, which allows you to store the master key in a vault server.
  • Percona XtraDB Cluster 5.7.22 depends on Percona XtraBackup 2.4.12 in order to fully support vault plugin functionality.

Fixed Bugs

  • PXC-2127: The Percona XtraDB Cluster shutdown process hung if the thread_handling option was set to pool-of-threads, due to a regression in 5.7.21.
  • PXC-2128: Duplicated auto-increment values were set for concurrent sessions on cluster reconfiguration, due to an erroneous readjustment.
  • PXC-2059: The error message about the SUPER privilege, shown when CREATE TRIGGER statements fail because WSREP is enabled, was made clearer.
  • PXC-2061: Wrong values could be read, depending on timing, when read causality was enforced with wsrep_sync_wait=1, because of waiting on the commit monitor to be flushed instead of waiting on the apply monitor.
  • PXC-2073: CREATE TABLE AS SELECT statements were not replicated if the result set was empty.
  • PXC-2087: The cluster entered a deadlock state if a table had a unique key and an INSERT ... ON DUPLICATE KEY UPDATE statement was executed.
  • PXC-2091: The check for the maximum number of rows that can be replicated as part of a single transaction (a Galera limit) was enforced even when replication was disabled with wsrep_on=OFF.
  • PXC-2103: Interruption of a local running transaction in a COMMIT state by a replicated background transaction, while waiting for the binlog backup protection, caused the commit to fail and, eventually, an assert in Galera.
  • PXC-2130: Percona XtraDB Cluster failed to build with Python 3.
  • PXC-2142: Replacing Percona Server with Percona XtraDB Cluster on CentOS 7 with the yum swap command produced a broken symlink in place of the /etc/my.cnf configuration file.
  • PXC-2154: rsync SST is now aborted with an error message if used on a node with the keyring_vault plugin configured, because it doesn’t support keyring_vault. Percona also doesn’t recommend using rsync-based SST for data-at-rest encryption with keyring.
  • PXB-1544: xtrabackup --copy-back didn’t read which encryption plugin to use from the plugin-load setting of the my.cnf configuration file.
  • PXB-1540: When it encountered a zero-sized keyring file, Percona XtraBackup removed and immediately recreated it, which could affect external software that noticed the file had undergone some manipulation.

Other bugs fixed:

PXC-2072: “flush table <table> for export should be blocked with mode=ENFORCING”.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!


Jun
25
2018
--

Running Percona XtraDB Cluster in Kubernetes/OpenShift

Diagram of Percona XtraDB Cluster / MySQL running in Kubernetes Open Shift

Kubernetes, and its most popular distribution OpenShift, receive a lot of interest as container orchestration platforms. However, databases remain a foreign entity, primarily because of their stateful nature, since container orchestration systems prefer stateless applications. That said, there has been good progress in support for StatefulSet applications and persistent storage, to the extent that it might already be comfortable to have a production database instance running in Kubernetes. With this in mind, we’ve been looking at running Percona XtraDB Cluster in Kubernetes/OpenShift.

While there are already many examples on the Internet of how to start a single MySQL instance in Kubernetes, for serious usage we need to provide:

  • High Availability: how can we guarantee availability when an instance (or Pod in Kubernetes terminology) crashes or becomes unresponsive?
  • Persistent storage: we do not want to lose our data in case of instance failure
  • Backup and recovery
  • Traffic routing: in the case of multiple instances, how do we direct an application to the correct one?
  • Monitoring

Percona XtraDB Cluster in Kubernetes/OpenShift

Schematically it looks like this:


Percona XtraDB Cluster in Kubernetes/OpenShift a possible configuration for a resilient solution

The picture highlights the components we are going to use.

Running this in Kubernetes assumes a high degree of automation and minimal manual intervention.

We provide our proof of concept in this project: https://github.com/Percona-Lab/percona-openshift. Please treat it as a source of ideas and as an alpha-quality project; it is in no way production ready.

Details

In our implementation we rely on Helm, the package manager for Kubernetes. Unfortunately, OpenShift does not officially support Helm out of the box, but there is a guide from Red Hat on how to make it work.

In the clustering setup, it is quite typical to use a service discovery software like Zookeeper, etcd or Consul. It may become necessary for our Percona XtraDB Cluster deployment, but for now, to simplify deployment, we are going to use the DNS service discovery mechanism provided by Kubernetes. It should be enough for our needs.

We also expect the Kubernetes deployment to provide Dynamic Storage Provisioning. The major cloud providers (like Google Cloud, Microsoft Azure or Amazon Web Services) should have it. However, it might not be easy to have Dynamic Storage Provisioning for on-premise deployments; you may need to set up GlusterFS or Ceph to provide it.
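A quick way to check whether a cluster offers dynamic provisioning is to list its storage classes; the class names and provisioners will differ per environment:

# A default StorageClass is needed for dynamic PVC provisioning
kubectl get storageclass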

The challenge with a distributed file system is how many copies of the data you end up having. Percona XtraDB Cluster by itself keeps three copies, and GlusterFS will also require at least two copies of the data, so in the end we will have six copies of the data. This is not good for write-intensive applications, and it’s also not good from a capacity standpoint.

One possible approach is to use local data copies for Percona XtraDB Cluster deployments. This will provide better performance and less impact on the network, but in the case of a big dataset (100GB+), a node failure will require an SST, with a big impact on the cluster and network. So the solution should be tailored to your workload and your requirements.

Now that we have a basic setup working, it would be good to understand the performance impact of running Percona XtraDB Cluster in Kubernetes. Is the network and storage overhead acceptable, or is it too big? We plan to look into this in the future.

Once again, our project is located at https://github.com/Percona-Lab/percona-openshift. We are looking for your feedback and for your experience of running databases in Kubernetes/OpenShift.

Before you leave …


If this article has interested you and you would like to know more about Percona XtraDB Cluster, you might enjoy our recent series of webinar tutorials that introduce this software and how to use it.


Jun
21
2018
--

Lock Down: Enforcing SELinux with Percona XtraDB Cluster

SELinux for PXC security


Why do I spend time blogging about security frameworks? Because, although there are some resources available on the Web, none apply to Percona XtraDB Cluster (PXC) directly. In fact, I rarely encounter a MySQL setup where SELinux is enforced, and never when Percona XtraDB Cluster (PXC) or another Galera replication implementation is used. As we’ll see, there are good reasons for that. I originally thought this post would be a simple “how to”, but it ended up with a pull request to modify the SST script and a few other surprises.

Some context

These days, with all the major security breaches of the last few years, the importance of security in IT cannot be highlighted enough. For that reason, security in MySQL has been progressively tightened from version to version, and the default parameters are much more restrictive than they used to be. That’s all good, but it applies only at the MySQL level: if there is a breach allowing access to MySQL, someone could in theory do everything the mysql user is allowed to do. To prevent such a situation, the operations that mysqld can perform should be limited to only what it really needs to do. SELinux’s purpose is exactly that. You’ll find SELinux on RedHat/CentOS and their derived distributions. Debian, Ubuntu and OpenSUSE use another framework, AppArmor, which is functionally similar to SELinux. I’ll talk about AppArmor in a future post; for now, let’s focus on SELinux.

The default behavior of many DBAs and Sysadmins appears to be: “if it doesn’t work, disable SELinux”. Sure enough, it often solves the issue but it also removes an important security layer. I believe disabling SELinux is the wrong cure so let’s walk through the steps of configuring a PXC cluster with SELinux enforced.

Starting point

As a starting point, I’ll assume you have a running PXC cluster operating with SELinux in permissive mode. That likely means the file “/etc/sysconfig/selinux” looks like this:

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=permissive
# SELINUXTYPE= can take one of three two values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted

For the purpose of writing this article, I created a three-node PXC cluster with the hosts BlogSELinux1, BlogSELinux2 and BlogSELinux3. On BlogSELinux1, I set SELinux to permissive mode and truncated the audit.log, since SELinux violations are logged in the audit.log file.

[root@BlogSELinux1 ~]# getenforce
Permissive
[root@BlogSELinux1 ~]# echo '' > /var/log/audit/audit.log

Let’s begin by covering the regular PXC operation items: start, stop, SST donor, SST joiner, IST donor and IST joiner. As we execute the steps in the list, the audit.log file will record the SELinux-related events.

Stop and start

Those are easy:

[root@BlogSELinux1 ~]# systemctl stop mysql
[root@BlogSELinux1 ~]# systemctl start mysql

SST Donor

On BlogSELinux3:

[root@BlogSELinux3 ~]# systemctl stop mysql

then on BlogSELinux2:

[root@BlogSELinux2 ~]# systemctl stop mysql
[root@BlogSELinux2 ~]# rm -f /var/lib/mysql/grastate.dat
[root@BlogSELinux2 ~]# systemctl start mysql

SST Joiner

With BlogSELinux1 and BlogSELinux2 up and running, we just do:

[root@BlogSELinux1 ~]# systemctl stop mysql
[root@BlogSELinux1 ~]# rm -f /var/lib/mysql/grastate.dat
[root@BlogSELinux1 ~]# systemctl start mysql

IST Donor

With BlogSELinux1 and BlogSELinux2 up and running, we just do:

[root@BlogSELinux2 ~]# systemctl stop mysql

Then on the first node:

[root@BlogSELinux1 ~]# mysql -e 'create database test;';
[root@BlogSELinux1 ~]# mysql -e 'create table test.testtable (id int not null, primary key (id)) engine=innodb;'
[root@BlogSELinux1 ~]# mysql -e 'insert into test.testtable (id) values (1);'

Those statements put some data in the gcache; now we just restart the second node:

[root@BlogSELinux2 ~]# systemctl start mysql

IST Joiner

With BlogSELinux1 and BlogSELinux2 up and running, we just do:

[root@BlogSELinux1 ~]# systemctl stop mysql

Then on the second node:

[root@BlogSELinux2 ~]# mysql -e 'insert into test.testtable (id) values (2);'

to insert some data in the gcache and we restart the first node:

[root@BlogSELinux1 ~]# systemctl start mysql

First run

Now that we have performed the basic operations of a cluster while recording the security violations in permissive mode, we can look at the audit.log file and start building the SELinux policy. Let’s begin by installing the tools needed to manipulate the SELinux audit log and policy files:

[root@BlogSELinux1 ~]# yum install policycoreutils-python.x86_64

Then, we’ll use the audit2allow tool to analyze the audit.log file:

[root@BlogSELinux1 ~]# grep -i denied /var/log/audit/audit.log | grep mysqld_t | audit2allow -M PXC
******************** IMPORTANT ***********************
To make this policy package active, execute:
semodule -i PXC.pp

We end up with two files, PXC.te and PXC.pp. The .pp file is a compiled version of the human-readable .te file. If we examine the content of the PXC.te file, at the beginning we find the require section listing all the involved SELinux types and classes:

module PXC 1.0;
require {
        type unconfined_t;
        type init_t;
        type auditd_t;
        type mysqld_t;
        type syslogd_t;
        type NetworkManager_t;
        type unconfined_service_t;
        type system_dbusd_t;
        type tuned_t;
        type tmp_t;
        type dhcpc_t;
        type sysctl_net_t;
        type kerberos_port_t;
        type kernel_t;
        type unreserved_port_t;
        type firewalld_t;
        type systemd_logind_t;
        type chronyd_t;
        type policykit_t;
        type udev_t;
        type mysqld_safe_t;
        type postfix_pickup_t;
        type sshd_t;
        type crond_t;
        type getty_t;
        type lvm_t;
        type postfix_qmgr_t;
        type postfix_master_t;
        class process { getattr setpgid };
        class unix_stream_socket connectto;
        class system module_request;
        class netlink_tcpdiag_socket { bind create getattr nlmsg_read setopt };
        class tcp_socket { name_bind name_connect };
        class file { getattr open read write };
        class dir search;
}

Then, using these types and classes, the policy file adds a series of generic allow rules matching the denied entries found in the audit.log file. Here’s what I got:

#============= mysqld_t ==============
allow mysqld_t NetworkManager_t:process getattr;
allow mysqld_t auditd_t:process getattr;
allow mysqld_t chronyd_t:process getattr;
allow mysqld_t crond_t:process getattr;
allow mysqld_t dhcpc_t:process getattr;
allow mysqld_t firewalld_t:process getattr;
allow mysqld_t getty_t:process getattr;
allow mysqld_t init_t:process getattr;
#!!!! This avc can be allowed using the boolean 'nis_enabled'
allow mysqld_t kerberos_port_t:tcp_socket name_bind;
allow mysqld_t kernel_t:process getattr;
#!!!! This avc can be allowed using the boolean 'domain_kernel_load_modules'
allow mysqld_t kernel_t:system module_request;
allow mysqld_t lvm_t:process getattr;
allow mysqld_t mysqld_safe_t:process getattr;
allow mysqld_t policykit_t:process getattr;
allow mysqld_t postfix_master_t:process getattr;
allow mysqld_t postfix_pickup_t:process getattr;
allow mysqld_t postfix_qmgr_t:process getattr;
allow mysqld_t sysctl_net_t:file { getattr open read };
allow mysqld_t syslogd_t:process getattr;
allow mysqld_t system_dbusd_t:process getattr;
allow mysqld_t systemd_logind_t:process getattr;
allow mysqld_t tuned_t:process getattr;
allow mysqld_t udev_t:process getattr;
allow mysqld_t unconfined_service_t:process getattr;
allow mysqld_t unconfined_t:process getattr;
allow mysqld_t tuned_t:process getattr;
allow mysqld_t udev_t:process getattr;
allow mysqld_t sshd_t:process getattr;
allow mysqld_t self:netlink_tcpdiag_socket { bind create getattr nlmsg_read setopt };
allow mysqld_t self:process { getattr setpgid };
#!!!! The file '/var/lib/mysql/mysql.sock' is mislabeled on your system.
#!!!! Fix with $ restorecon -R -v /var/lib/mysql/mysql.sock
#!!!! This avc can be allowed using the boolean 'daemons_enable_cluster_mode'
allow mysqld_t self:unix_stream_socket connectto;
allow mysqld_t sshd_t:process getattr;
allow mysqld_t sysctl_net_t:dir search;
allow mysqld_t sysctl_net_t:file { getattr open read };
allow mysqld_t syslogd_t:process getattr;
allow mysqld_t system_dbusd_t:process getattr;
allow mysqld_t systemd_logind_t:process getattr;
#!!!! WARNING 'mysqld_t' is not allowed to write or create to tmp_t.  Change the label to mysqld_tmp_t.
allow mysqld_t tmp_t:file write;
allow mysqld_t tuned_t:process getattr;
allow mysqld_t udev_t:process getattr;
allow mysqld_t unconfined_service_t:process getattr;
allow mysqld_t unconfined_t:process getattr;
#!!!! This avc can be allowed using one of the these booleans:
#     nis_enabled, mysql_connect_any
allow mysqld_t unreserved_port_t:tcp_socket { name_bind name_connect };

I can understand some of these rules. For example, one of the TCP ports used by Kerberos is 4444 and it is also used by PXC for the SST transfer. Similarly, MySQL needs to write to /tmp. But what about all the other rules?

Troubleshooting

We could load the PXC.pp module we got in the previous section and consider our job done. It will likely allow the PXC node to start and operate normally, but what exactly is happening? Why did MySQL, or one of its subprocesses, ask for the process attributes (getattr) of all the running processes, like sshd, syslogd and crond? Looking directly in the audit.log file, I found many entries like these:

type=AVC msg=audit(1527792830.989:136): avc:  denied  { getattr } for  pid=3683 comm="ss"
  scontext=system_u:system_r:mysqld_t:s0 tcontext=system_u:system_r:init_t:s0 tclass=process
type=AVC msg=audit(1527792830.990:137): avc:  denied  { getattr } for  pid=3683 comm="ss"
  scontext=system_u:system_r:mysqld_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=process
type=AVC msg=audit(1527792830.991:138): avc:  denied  { getattr } for  pid=3683 comm="ss"
  scontext=system_u:system_r:mysqld_t:s0 tcontext=system_u:system_r:syslogd_t:s0 tclass=process

So ss, a network utility tool, scans all the processes. That rang a bell… I knew where to look: the SST script. Here’s the source of the problem in the wsrep_sst_xtrabackup-v2 file:

wait_for_listen()
{
    local HOST=$1
    local PORT=$2
    local MODULE=$3
    for i in {1..300}
    do
        ss -p state listening "( sport = :$PORT )" | grep -qE 'socat|nc' && break
        sleep 0.2
    done
    echo "ready ${HOST}:${PORT}/${MODULE}//$sst_ver"
}

This bash function is used when the node is a joiner: it uses ss to check whether the TCP port used by socat or nc is open. The check is needed in order to avoid replying too early with the “ready” message. The code is functionally correct but wrong, security-wise. Instead of checking whether there is a socat or nc command running in the list of processes owned by the mysql user, it checks whether any process has opened the SST port, and only then does it check whether the name of the command is socat or nc. Since we don’t know which processes will be running on the server, we can’t write a good security profile. For example, in the future, one could add the ntpd daemon, causing PXC to fail to start yet again. To avoid that, the function needs to be modified like this:

wait_for_listen()
{
    local HOST=$1
    local PORT=$2
    local MODULE=$3
    for i in {1..300}
    do
        sleep 0.2
        # List only our (mysql user) processes to avoid triggering SELinux
        for cmd in $(ps -u $(id -u) -o pid,comm | sed 's/^\s*//g' | tr ' ' '|' | grep -E 'socat|nc')
        do
            pid=$(echo $cmd | cut -d'|' -f1)
            # List the sockets of the pid
            sockets=$(ls -l /proc/$pid/fd | grep socket | cut -d'[' -f2 | cut -d ']' -f1 | tr '\n' '|')
            if [[ -n $sockets ]]; then
                # Is one of these sockets listening on the SST port?
                # If so, we need to break from 2 loops
                grep -E "${sockets:0:-1}" /proc/$pid/net/tcp | \
                  grep "00000000:$(printf '%X' $PORT)" > /dev/null \
                  && break 2
            fi
        done
    done
    echo "ready ${HOST}:${PORT}/${MODULE}//$sst_ver"
}

The modified function removes many of the denied messages in the audit log file and greatly simplifies the content of PXC.te. I tested the above modification and made a pull request to PXC. Among the remaining items, we have:

allow mysqld_t self:process { getattr setpgid };

setpgid is often called after a fork to set the process group, usually through the setsid call. MySQL uses fork when it starts with the daemonize option, but our installation of Percona XtraDB Cluster uses mysqld_safe and does not directly run as a daemon. Another fork call is part of the wsrep source files and is used to launch processes like the SST script; it happens when mysqld is already running with reduced privileges. This latter invocation is certainly our culprit.

TCP ports

What about TCP ports? PXC uses quite a few. Of course there is the 3306/tcp port used to access MySQL. Galera also uses port 4567/tcp for replication, 4568/tcp for IST and 4444/tcp for SST. Let’s have a look at which ports SELinux allows PXC to use:

[root@BlogSELinux1 audit]# semanage port -l | grep mysql
mysqld_port_t                  tcp      1186, 3306, 63132-63164

No surprise, port 3306/tcp is authorized, but if you are new to MySQL you may wonder what uses 1186/tcp. It is the port used by NDB Cluster for inter-node communication (NDB API). Now, if we try to add the missing ports:

[root@BlogSELinux1 audit]# semanage port -a -t mysqld_port_t -p tcp 4567
ValueError: Port tcp/4567 already defined
[root@BlogSELinux1 audit]# semanage port -a -t mysqld_port_t -p tcp 4568
[root@BlogSELinux1 audit]# semanage port -a -t mysqld_port_t -p tcp 4444
ValueError: Port tcp/4444 already defined

4568/tcp was successfully added, but 4444/tcp and 4567/tcp failed because they are already assigned to another security context. For example, 4444/tcp belongs to the kerberos security context:

[root@BlogSELinux1 audit]# semanage port -l | grep kerberos_port
kerberos_port_t                tcp      88, 750, 4444
kerberos_port_t                udp      88, 750, 4444

A TCP port is not allowed by SELinux to belong to more than one security context. We have no choice other than to move the two missing ports to the mysqld_port_t security context:

[root@BlogSELinux1 audit]# semanage port -m -t mysqld_port_t -p tcp 4444
[root@BlogSELinux1 audit]# semanage port -m -t mysqld_port_t -p tcp 4567
[root@BlogSELinux1 audit]# semanage port -l | grep mysqld
mysqld_port_t                  tcp      4567, 4444, 4568, 1186, 3306, 63132-63164

If you plan to deploy a Kerberos server on the same servers, you may have to run PXC with a different port for Galera replication. In that case, and in the case where you want to run MySQL on a port other than 3306/tcp, you’ll need to add the port to the mysqld_port_t context as we just did above. Do not worry too much about port 4567/tcp: it is reserved for tram which, from what I found, is a remote access protocol for routers.

Non-default paths

It is very common to run MySQL with non-standard paths/directories. With SELinux, you don’t list the authorized paths in the security context; you add the security context labels to the paths. Adding a context label is a two-step process: basically, change and apply. For example, if you are using /data as the MySQL datadir, you need to do:

semanage fcontext -a -t mysqld_db_t "/data(/.*)?"
restorecon -R -v /data
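You can verify the result afterwards; a quick check, using the same example path:

# The directory and its content should now carry the mysqld_db_t label
ls -dZ /data
ls -Z /data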

On a RedHat/CentOS 7 server, the MySQL file contexts and their associated paths are:

[root@BlogSELinux1 ~]# bzcat /etc/selinux/targeted/active/modules/100/mysql/cil | grep filecon
(filecon "HOME_DIR/\.my\.cnf" file (system_u object_r mysqld_home_t ((s0) (s0))))
(filecon "/root/\.my\.cnf" file (system_u object_r mysqld_home_t ((s0) (s0))))
(filecon "/usr/lib/systemd/system/mysqld.*" file (system_u object_r mysqld_unit_file_t ((s0) (s0))))
(filecon "/usr/lib/systemd/system/mariadb.*" file (system_u object_r mysqld_unit_file_t ((s0) (s0))))
(filecon "/etc/my\.cnf" file (system_u object_r mysqld_etc_t ((s0) (s0))))
(filecon "/etc/mysql(/.*)?" any (system_u object_r mysqld_etc_t ((s0) (s0))))
(filecon "/etc/my\.cnf\.d(/.*)?" any (system_u object_r mysqld_etc_t ((s0) (s0))))
(filecon "/etc/rc\.d/init\.d/mysqld" file (system_u object_r mysqld_initrc_exec_t ((s0) (s0))))
(filecon "/etc/rc\.d/init\.d/mysqlmanager" file (system_u object_r mysqlmanagerd_initrc_exec_t ((s0) (s0))))
(filecon "/usr/bin/mysqld_safe" file (system_u object_r mysqld_safe_exec_t ((s0) (s0))))
(filecon "/usr/bin/mysql_upgrade" file (system_u object_r mysqld_exec_t ((s0) (s0))))
(filecon "/usr/libexec/mysqld" file (system_u object_r mysqld_exec_t ((s0) (s0))))
(filecon "/usr/libexec/mysqld_safe-scl-helper" file (system_u object_r mysqld_safe_exec_t ((s0) (s0))))
(filecon "/usr/sbin/mysqld(-max)?" file (system_u object_r mysqld_exec_t ((s0) (s0))))
(filecon "/usr/sbin/mysqlmanager" file (system_u object_r mysqlmanagerd_exec_t ((s0) (s0))))
(filecon "/usr/sbin/ndbd" file (system_u object_r mysqld_exec_t ((s0) (s0))))
(filecon "/var/lib/mysql(-files|-keyring)?(/.*)?" any (system_u object_r mysqld_db_t ((s0) (s0))))
(filecon "/var/lib/mysql/mysql\.sock" socket (system_u object_r mysqld_var_run_t ((s0) (s0))))
(filecon "/var/log/mariadb(/.*)?" any (system_u object_r mysqld_log_t ((s0) (s0))))
(filecon "/var/log/mysql.*" file (system_u object_r mysqld_log_t ((s0) (s0))))
(filecon "/var/run/mariadb(/.*)?" any (system_u object_r mysqld_var_run_t ((s0) (s0))))
(filecon "/var/run/mysqld(/.*)?" any (system_u object_r mysqld_var_run_t ((s0) (s0))))
(filecon "/var/run/mysqld/mysqlmanager.*" file (system_u object_r mysqlmanagerd_var_run_t ((s0) (s0))))

If you want to avoid security issues with SELinux, you should stay within those paths. A good example of offending paths is the PXC configuration file and its drop-in directory, which are not labeled correctly for SELinux:

[root@BlogSELinux1 ~]# ls -Z /etc/per*
-rw-r--r--. root root system_u:object_r:etc_t:s0       /etc/percona-xtradb-cluster.cnf
/etc/percona-xtradb-cluster.conf.d:
-rw-r--r--. root root system_u:object_r:etc_t:s0       mysqld.cnf
-rw-r--r--. root root system_u:object_r:etc_t:s0       mysqld_safe.cnf
-rw-r--r--. root root system_u:object_r:etc_t:s0       wsrep.cnf

I must admit that even though the security context labels on those files were not set, I got no audit messages and everything worked normally. Nevertheless, adding the labels is straightforward:

[root@BlogSELinux1 ~]# semanage fcontext -a -t mysqld_etc_t "/etc/percona-xtradb-cluster\.cnf"
[root@BlogSELinux1 ~]# semanage fcontext -a -t mysqld_etc_t "/etc/percona-xtradb-cluster\.conf\.d(/.*)?"
[root@BlogSELinux1 ~]# restorecon -v /etc/percona-xtradb-cluster.cnf
restorecon reset /etc/percona-xtradb-cluster.cnf context system_u:object_r:etc_t:s0->system_u:object_r:mysqld_etc_t:s0
[root@BlogSELinux1 ~]# restorecon -R -v /etc/percona-xtradb-cluster.conf.d/
restorecon reset /etc/percona-xtradb-cluster.conf.d context system_u:object_r:etc_t:s0->system_u:object_r:mysqld_etc_t:s0
restorecon reset /etc/percona-xtradb-cluster.conf.d/wsrep.cnf context system_u:object_r:etc_t:s0->system_u:object_r:mysqld_etc_t:s0
restorecon reset /etc/percona-xtradb-cluster.conf.d/mysqld.cnf context system_u:object_r:etc_t:s0->system_u:object_r:mysqld_etc_t:s0
restorecon reset /etc/percona-xtradb-cluster.conf.d/mysqld_safe.cnf context system_u:object_r:etc_t:s0->system_u:object_r:mysqld_etc_t:s0

Variables check list

Here is a list of all the variables you should check for paths used by MySQL

  • datadir, default is /var/lib/mysql, where MySQL stores its data
  • basedir, default is /usr, where binaries and libraries can be found
  • character_sets_dir, default is basedir/share/mysql/charsets, charsets used by MySQL
  • general_log_file, default is the datadir, where the general log is written
  • init_file, no default, sql file read and executed when the server starts
  • innodb_undo_directory, default is datadir, where InnoDB stores the undo files
  • innodb_tmpdir, default is tmpdir, where InnoDB creates temporary files
  • innodb_temp_data_file_path, default is in the datadir, where InnoDB creates the temporary tablespace
  • innodb_parallel_doublewrite_path, default is in the datadir, where InnoDB creates the parallel doublewrite buffer
  • innodb_log_group_home_dir, default is the datadir, where InnoDB writes its transactional log files
  • innodb_data_home_dir, default is the datadir, used as a default value for the InnoDB files
  • innodb_data_file_path, default is in the datadir, path of the system tablespace
  • innodb_buffer_pool_filename, default is in the datadir, where InnoDB writes the buffer pool dump information
  • lc_messages_dir, basedir/share/mysql
  • log_bin_basename, default is the datadir, where the binlogs are stored
  • log_bin_index, default is the datadir, where the binlog index file is stored
  • log_error, no default value, where the MySQL error log is stored
  • pid-file, no default value, where the MySQL pid file is stored
  • plugin_dir, default is basedir/lib/mysql/plugin, where the MySQL plugins are stored
  • relay_log_basename, default is the datadir, where the relay logs are stored
  • relay_log_info_file, default is the datadir, may include a path
  • slave_load_tmpdir, default is tmpdir, where the slave stores files coming from LOAD DATA INFILE statements
  • slow_query_log_file, default is in the datadir, where the slow queries are logged
  • socket, no defaults, where the Unix socket file is created
  • ssl_*, SSL/TLS related files
  • tmpdir, default is /tmp, where temporary files are stored
  • wsrep_data_home_dir, default is the datadir, where galera stores its files
  • wsrep_provider->base_dir, default is wsrep_data_home_dir
  • wsrep_provider->gcache_dir, default is wsrep_data_home_dir, where the gcache file is stored
  • wsrep_provider->socket.ssl_*, no defaults, where the SSL/TLS related files for the Galera protocol are stored

That’s quite a long list, and I may have missed some. If you use a non-standard path for any of these variables, you’ll need to adjust the context labels as we just did above.

All together

I would understand if you feel a bit lost; I am not an SELinux guru and it took me some time to understand decently how it works. Let’s recap how we can enable SELinux for PXC using what we learned in the previous sections.

1. Install the SELinux utilities

yum install policycoreutils-python.x86_64

2. Allow the TCP ports used by PXC

semanage port -a -t mysqld_port_t -p tcp 4568
semanage port -m -t mysqld_port_t -p tcp 4444
semanage port -m -t mysqld_port_t -p tcp 4567

3. Modify the SST script

Replace the wait_for_listen function in the /usr/bin/wsrep_sst_xtrabackup-v2 file with the version above. Hopefully, the next PXC release will include an SELinux-friendly wait_for_listen function.

4. Set the security context labels for the configuration files

This step seems optional, but for completeness:

semanage fcontext -a -t mysqld_etc_t "/etc/percona-xtradb-cluster\.cnf"
semanage fcontext -a -t mysqld_etc_t "/etc/percona-xtradb-cluster\.conf\.d(/.*)?"
restorecon -v /etc/percona-xtradb-cluster.cnf
restorecon -R -v /etc/percona-xtradb-cluster.conf.d/

5. Create the policy file PXC.te

Create the file PXC.te with this content:

module PXC 1.0;
require {
        type unconfined_t;
        type mysqld_t;
        type unconfined_service_t;
        type tmp_t;
        type sysctl_net_t;
        type kernel_t;
        type mysqld_safe_t;
        class process { getattr setpgid };
        class unix_stream_socket connectto;
        class system module_request;
        class file { getattr open read write };
        class dir search;
}
#============= mysqld_t ==============
allow mysqld_t kernel_t:system module_request;
allow mysqld_t self:process { getattr setpgid };
allow mysqld_t self:unix_stream_socket connectto;
allow mysqld_t sysctl_net_t:dir search;
allow mysqld_t sysctl_net_t:file { getattr open read };
allow mysqld_t tmp_t:file write;

6. Compile and load the policy module

checkmodule -M -m -o PXC.mod PXC.te
semodule_package -o PXC.pp -m PXC.mod
semodule -i PXC.pp

7. Run for a while in Permissive mode

Set SELinux to permissive mode in /etc/sysconfig/selinux and reboot. Validate that everything works fine in permissive mode, and check the audit.log for any denied messages. If there are denied messages, address them.

8. Enforce SELINUX

Last step, enforce SELinux:

setenforce 1
perl -pi -e 's/SELINUX=permissive/SELINUX=enforcing/g' /etc/sysconfig/selinux
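You can then confirm the running mode and the persisted configuration:

getenforce
grep ^SELINUX= /etc/sysconfig/selinux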

Conclusion

As we can see, enabling SELinux with PXC is not straightforward, but once the process is understood, it is not that hard either. In an IT world where security is more than ever a major concern, enabling SELinux with PXC is a nice step forward. In an upcoming post, we’ll look at the other security framework, AppArmor.


Jun
20
2018
--

Percona XtraDB Cluster 5.6.40-26.25 Is Now Available


Percona announces the release of Percona XtraDB Cluster 5.6.40-26.25 (PXC) on June 20, 2018. Binaries are available from the downloads section or our software repositories.

Percona XtraDB Cluster 5.6.40-26.25 is now the current release, based on the following:

All Percona software is open-source and free.

New feature

  • PXC-907: The new variable wsrep_RSU_commit_timeout allows you to configure the timeout (in microseconds) that RSU waits for an active commit connection.

Fixed Bugs

  • PXC-2128: Duplicated auto-increment values were set for concurrent sessions on cluster reconfiguration, due to an erroneous readjustment.
  • PXC-2059: The error message about the SUPER privilege, shown when CREATE TRIGGER statements fail because WSREP is enabled, was made clearer.
  • PXC-2091: The check for the maximum number of rows that can be replicated as part of a single transaction (a Galera limit) was enforced even when replication was disabled with wsrep_on=OFF.
  • PXC-2103: Interruption of a local running transaction in a COMMIT state by a replicated background transaction, while waiting for the binlog backup protection, caused the commit to fail and, eventually, an assert in Galera.
  • PXC-2130: Percona XtraDB Cluster failed to build with Python 3.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!


Jun
19
2018
--

Webinar Weds 20/6: Percona XtraDB Cluster 5.7 Tutorial Part 2


Including setting up Percona XtraDB Cluster with ProxySQL and PMM

Please join Percona’s Architect, Tibi Köröcz, as he presents Percona XtraDB Cluster 5.7 Tutorial Part 2 on Wednesday, June 20th, 2018, at 7:00 am PDT (UTC-7) / 10:00 am EDT (UTC-4).

 

Never used Percona XtraDB Cluster before? This is the webinar for you! In this 45-minute webinar, we will introduce you to a fully functional Percona XtraDB Cluster.

This webinar will show you how to install Percona XtraDB Cluster with ProxySQL, and monitor it with Percona Monitoring and Management (PMM).

We will also cover topics like bootstrap, IST, SST, certification, common-failure situations and online schema changes.

After this webinar, you will have enough knowledge to set up a working Percona XtraDB Cluster with ProxySQL, in order to meet your high availability requirements.

You can see part one of this series here: Percona XtraDB Cluster 5.7 Tutorial Part 1

Register Now!

Tibor Köröcz

Architect


Tibi joined Percona in 2015 as a Consultant. Before joining Percona, among many other things, he worked at the world’s largest car hire booking service as a Senior Database Engineer. He enjoys trying out and working with the latest technologies and applications that can help MySQL or work together with it. In his spare time he likes to spend time with his friends, travel around the world and play ultimate frisbee.

 


May
30
2018
--

MySQL Test Framework for Percona XtraDB Cluster


At my latest webinar, “MySQL Test Framework (MTR) for Troubleshooting”, I received an interesting question about MTR test cases for Percona XtraDB Cluster (PXC), particularly about testing SST and IST.

This post is intended to answer this question. It assumes you are familiar with MTR and can write tests for MySQL servers. If you are not, please watch the webinar recording first.

You can find example tests in any PXC tarball package. They are located in the directories mysql-test/suite/galera, mysql-test/suite/galera_3nodes and mysql-test/suite/wsrep, though that last directory only contains a configuration file.

If you simply try to run the tests in the galera suite, you will find they are all disabled because the environment variable WSREP_PROVIDER was not set:

sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ ./mtr --suite=galera
Logging: ./mtr --suite=galera
MySQL Version 5.7.19
Too long tmpdir path '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/tmp' creating a shorter one...
- using tmpdir: '/tmp/xYgQqOa5b7'
Checking supported features...
- SSL connections supported
- binaries built with wsrep patch
Using suites: galera
Collecting tests...
Checking leftover processes...
- found old pid 30624 in 'mysqld.3.pid', killing it...
process did not exist!
Removing old var directory...
Creating var directory '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var'...
Installing system database...
Using parallel: 1
==============================================================================
TEST RESULT TIME (ms) or COMMENT
--------------------------------------------------------------------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
galera.GAL-419 [ skipped ] Test needs 'big-test' option
...
galera.galera_binlog_checksum [ skipped ] Test requires wsrep provider library (libgalera_smm.so). Did you set $WSREP_PROVIDER?
galera.galera_binlog_event_max_size_min [ skipped ] Test requires wsrep provider library (libgalera_smm.so). Did you set $WSREP_PROVIDER?
galera.galera_flush_gtid [ skipped ] Test requires wsrep provider library (libgalera_smm.so). Did you set $WSREP_PROVIDER?
galera.galera_gtid [ skipped ] Test requires wsrep provider library (libgalera_smm.so). Did you set $WSREP_PROVIDER?
galera.lp1435482 [ skipped ] Test requires wsrep provider library (libgalera_smm.so). Did you set $WSREP_PROVIDER?
^Cmysql-test-run: *** ERROR: Got ^C signal

In order to run these tests you need to set this variable first.

I use the quite outdated 5.7.19 PXC package (the version does not matter for the purpose of this post) and run tests as:

WSREP_PROVIDER=/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/lib/libgalera_smm.so ./mtr --suite=galera

After the variable WSREP_PROVIDER is set, mtr can successfully run:

sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ WSREP_PROVIDER=/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/lib/libgalera_smm.so ./mtr --suite=galera
Logging: ./mtr --suite=galera
MySQL Version 5.7.19
Too long tmpdir path '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/tmp' creating a shorter one...
- using tmpdir: '/tmp/I6HfuqkwR1'
Checking supported features...
- SSL connections supported
- binaries built with wsrep patch
Using suites: galera
Collecting tests...
Checking leftover processes...
- found old pid 14271 in 'mysqld.1.pid', killing it...
process did not exist!
- found old pid 14273 in 'mysqld.2.pid', killing it...
process did not exist!
Removing old var directory...
Creating var directory '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var'...
Installing system database...
Using parallel: 1
==============================================================================
TEST RESULT TIME (ms) or COMMENT
--------------------------------------------------------------------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
galera.GAL-419 [ skipped ] Test needs 'big-test' option
...
worker[1] mysql-test-run: WARNING: Waited 60 seconds for /home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/run/mysqld.2.pid to be created, still waiting for 120 seconds...
galera.galera_binlog_checksum [ pass ] 2787
worker[1] mysql-test-run: WARNING: Waited 60 seconds for /home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/run/mysqld.2.pid to be created, still waiting for 120 seconds...
galera.galera_binlog_event_max_size_min [ pass ] 2200
...

Now we are ready to write our first PXC test. The easiest way to get started is to open any existing test and check how it is written. Then modify it so that it replays our own scenario.

Since the question was about testing IST and SST, I will use the test galera_ist_progress as an example. First, let’s check that it runs successfully and that it does not have any requirements that could prevent it from running with regular production binaries:

sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ WSREP_PROVIDER=/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/lib/libgalera_smm.so ./mtr --suite=galera galera_ist_progress
Logging: ./mtr --suite=galera galera_ist_progress
MySQL Version 5.7.19
Too long tmpdir path '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/tmp' creating a shorter one...
- using tmpdir: '/tmp/EodvOyCJwo'
Checking supported features...
- SSL connections supported
- binaries built with wsrep patch
Collecting tests...
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var'...
Installing system database...
Using parallel: 1
==============================================================================
TEST RESULT TIME (ms) or COMMENT
--------------------------------------------------------------------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
worker[1] mysql-test-run: WARNING: Waited 60 seconds for /home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/run/mysqld.2.pid to be created, still waiting for 120 seconds...
galera.galera_ist_progress [ pass ] 17970
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 17.970 of 218 seconds executing testcases
Completed: All 1 tests were successful.

Everything is fine. Now let’s look into the test itself.

First, this test has its own configuration file. Let’s check what’s in there:

$ cat suite/galera/t/galera_ist_progress.cnf
!include ../galera_2nodes.cnf
[mysqld.1]
wsrep_provider_options='base_port=@mysqld.1.#galera_port;pc.ignore_sb=true'

galera_2nodes.cnf is one of the standard configuration files in the galera suite. If we look into it, we may notice that wsrep_provider_options is already defined there, so overriding this option is not required for all tests.

We’ll continue our review. The test script includes the galera_cluster.inc file:

--source include/galera_cluster.inc

This file is located outside of galera suite and contains 2 lines:

--let $galera_cluster_size = 2
--source include/galera_init.inc

galera_init.inc, in its turn, creates as many nodes as defined by the galera_cluster_size variable and additionally creates a default connection for each of them.

Now let’s step out from galera_ist_progress and check whether this knowledge is enough to create our first PXC test.

I created a simple test based on a two-node setup which checks a few status and system variables, creates a table, inserts data into it, and ensures that the content is accessible on both nodes:

$ cat ~/src/tests/t/pxc.test
--source include/galera_cluster.inc
--connection node_1
--echo We are on node 1
select @@hostname, @@port;
show status like 'wsrep_cluster_size';
show status like 'wsrep_cluster_status';
show status like 'wsrep_connected';
create table t1(id int not null auto_increment primary key, f1 int) engine=innodb;
insert into t1(f1) values(1),(2),(3);
select * from t1;
--connection node_2
--echo We are on node 2
select @@hostname, @@port;
show status like 'wsrep_cluster_size';
show status like 'wsrep_cluster_status';
show status like 'wsrep_connected';
select * from t1;
insert into t1(f1) values(1),(2),(3);
select * from t1;
--connection node_1
--echo We are on node 1
select * from t1;
drop table t1;

However, if I run this test in the main suite, it will fail:

sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ export WSREP_PROVIDER=/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/lib/libgalera_smm.so
sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ do_test.sh -s ~/mysql_packages -b Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100
Logging: ./mysql-test-run.pl --record --force pxc
MySQL Version 5.7.19
Too long tmpdir path '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/tmp' creating a shorter one...
- using tmpdir: '/tmp/uUmBztSWUA'
Checking supported features...
- SSL connections supported
- binaries built with wsrep patch
Collecting tests...
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var'...
Installing system database...
Using parallel: 1
==============================================================================
TEST RESULT TIME (ms) or COMMENT
--------------------------------------------------------------------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
main.pxc [ skipped ] Test requires wsrep provider library (libgalera_smm.so). Did you set $WSREP_PROVIDER?
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 0.000 of 108 seconds executing testcases
Completed: All 0 tests were successful.
1 tests were skipped, 1 by the test itself.
=====Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100=====
=====pxc=====
sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ echo $WSREP_PROVIDER
/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/lib/libgalera_smm.so

The reason for this failure is that the galera suite has default option files that set the necessary variables. Let’s set those option files aside for now and simply run our test in the galera suite:

sveta@Thinkie:~/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test$ do_test.sh -s ~/mysql_packages -b Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100 -t galera
Logging: ./mysql-test-run.pl --record --force --suite=galera pxc
MySQL Version 5.7.19
Too long tmpdir path '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/tmp' creating a shorter one...
- using tmpdir: '/tmp/ytqEjnfM7i'
Checking supported features...
- SSL connections supported
- binaries built with wsrep patch
Collecting tests...
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var'...
Installing system database...
Using parallel: 1
==============================================================================
TEST RESULT TIME (ms) or COMMENT
--------------------------------------------------------------------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
worker[1] mysql-test-run: WARNING: Waited 60 seconds for /home/sveta/mysql_packages/Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100/mysql-test/var/run/mysqld.2.pid to be created, still waiting for 120 seconds...
galera.pxc [ pass ] 2420
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 2.420 of 208 seconds executing testcases
Completed: All 1 tests were successful.
pxc.result
=====Percona-XtraDB-Cluster-5.7.19-rel17-29.22.3.Linux.x86_64.ssl100=====
=====pxc=====
We are on node 1
select @@hostname, @@port;
@@hostname @@port
Thinkie 13000
show status like 'wsrep_cluster_size';
Variable_name Value
wsrep_cluster_size 2
show status like 'wsrep_cluster_status';
Variable_name Value
wsrep_cluster_status Primary
show status like 'wsrep_connected';
Variable_name Value
wsrep_connected ON
create table t1(id int not null auto_increment primary key, f1 int) engine=innodb;
insert into t1(f1) values(1),(2),(3);
select * from t1;
id f1
2 1
4 2
6 3
We are on node 2
select @@hostname, @@port;
@@hostname @@port
Thinkie 13004
show status like 'wsrep_cluster_size';
Variable_name Value
wsrep_cluster_size 2
show status like 'wsrep_cluster_status';
Variable_name Value
wsrep_cluster_status Primary
show status like 'wsrep_connected';
Variable_name Value
wsrep_connected ON
select * from t1;
id f1
2 1
4 2
6 3
insert into t1(f1) values(1),(2),(3);
select * from t1;
id f1
2 1
4 2
6 3
7 1
9 2
11 3
We are on node 1
select * from t1;
id f1
2 1
4 2
6 3
7 1
9 2
11 3
drop table t1;

You will see that the test reports that the two nodes run on different ports:

We are on node 1
select @@hostname, @@port;
@@hostname @@port
Thinkie 13000
...
We are on node 2
select @@hostname, @@port;
@@hostname @@port
Thinkie 13004

… and that PXC started:

show status like 'wsrep_cluster_size';
Variable_name Value
wsrep_cluster_size 2
show status like 'wsrep_cluster_status';
Variable_name Value
wsrep_cluster_status Primary
show status like 'wsrep_connected';
Variable_name Value
wsrep_connected ON

And we can also clearly see that each node sees the changes to our test table that were made by the other node.
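
If such a cross-node check ever turns out to be timing-sensitive, the same include/wait_condition.inc pattern used later in this post can be added on the second node before the SELECT, for example:

--connection node_2
--let $wait_condition = SELECT COUNT(*) = 3 FROM t1
--source include/wait_condition.inc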

Now let’s get back to the IST test, defined in galera_ist_progress.test.

In order to test IST, it first isolates node 2 so that writes made on the cluster no longer reach it:

# Isolate node #2
--connection node_2
SET GLOBAL wsrep_provider_options = 'gmcast.isolate = 1';

Then it connects to node 1 and waits until wsrep_cluster_size becomes 1:

--connection node_1
--let $wait_condition = SELECT VARIABLE_VALUE = 1 FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_size';
--source include/wait_condition.inc

Then it turns wsrep_on OFF on node 2:

--connection node_2
SET SESSION wsrep_on = OFF;
--let $wait_condition = SELECT VARIABLE_VALUE = 'non-Primary' FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_status';
--source include/wait_condition.inc
SET SESSION wsrep_on = ON;

Now node 2 is completely isolated and node 1 can be updated, so we can test IST when we bring node 2 back online.

--connection node_1
CREATE TABLE t1 (f1 INTEGER) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1);
INSERT INTO t1 VALUES (2);
INSERT INTO t1 VALUES (3);
INSERT INTO t1 VALUES (4);
INSERT INTO t1 VALUES (5);
INSERT INTO t1 VALUES (6);
INSERT INTO t1 VALUES (7);
INSERT INTO t1 VALUES (8);
INSERT INTO t1 VALUES (9);
INSERT INTO t1 VALUES (10);

After the update is done, node 2 is brought online:

--connection node_2
SET GLOBAL wsrep_provider_options = 'gmcast.isolate = 0';
--connection node_1
--let $wait_condition = SELECT VARIABLE_VALUE = 2 FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_size';
--source include/wait_condition.inc
--connection node_2
--let $wait_condition = SELECT VARIABLE_VALUE = 'Primary' FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_status';
--source include/wait_condition.inc

Once node 2 is back online, the test checks IST progress by grepping the error log of node 2, where messages about IST progress are printed:

#
# Grep for expected IST output in joiner log
#
--connection node_1
--let $assert_count = 1
--let $assert_file = $MYSQLTEST_VARDIR/log/mysqld.2.err
--let $assert_only_after = Need state transfer
--let $assert_text = Receiving IST: 11 writesets, seqnos
--let $assert_select = Receiving IST: 11 writesets, seqnos
--source include/assert_grep.inc
--let $assert_text = Receiving IST... 0.0% ( 0/11 events) complete
--let $assert_select = Receiving IST... 0.0% ( 0/11 events) complete
--source include/assert_grep.inc
--let $assert_text = Receiving IST...100.0% (11/11 events) complete
--let $assert_select = Receiving IST...100.0% (11/11 events) complete
--source include/assert_grep.inc

Here is the error log snippet from node 2 when it re-joined the cluster and initiated the state transfer.

2018-05-25T17:00:46.908569Z 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 13)
2018-05-25T17:00:46.908637Z 2 [Note] WSREP: State transfer required:
	Group state: f364a69b-603c-11e8-a632-ce5a4a7d5964:13
	Local state: f364a69b-603c-11e8-a632-ce5a4a7d5964:2
2018-05-25T17:00:46.908673Z 2 [Note] WSREP: New cluster view: global state: f364a69b-603c-11e8-a632-ce5a4a7d5964:13, view# 4: Primary, number of nodes: 2, my index: 1, protocol version 3
2018-05-25T17:00:46.908694Z 2 [Note] WSREP: Setting wsrep_ready to true
2018-05-25T17:00:46.908717Z 2 [Warning] WSREP: Gap in state sequence. Need state transfer.
2018-05-25T17:00:46.908737Z 2 [Note] WSREP: Setting wsrep_ready to false
2018-05-25T17:00:46.908757Z 2 [Note] WSREP: You have configured 'xtrabackup-v2' state snapshot transfer method which cannot be performed on a running server. Wsrep provider won't be able to fall back to it if other means of state transfer are unavailable. In that case you will need to restart the server.
2018-05-25T17:00:46.908777Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2018-05-25T17:00:46.908799Z 2 [Note] WSREP: REPL Protocols: 7 (3, 2)
2018-05-25T17:00:46.908831Z 2 [Note] WSREP: Assign initial position for certification: 13, protocol version: 3
2018-05-25T17:00:46.908886Z 0 [Note] WSREP: Service thread queue flushed.
2018-05-25T17:00:46.908934Z 2 [Note] WSREP: Check if state gap can be serviced using IST
2018-05-25T17:00:46.909062Z 2 [Note] WSREP: IST receiver addr using tcp://127.0.0.1:13006
2018-05-25T17:00:46.909232Z 2 [Note] WSREP: Prepared IST receiver, listening at: tcp://127.0.0.1:13006
2018-05-25T17:00:46.909267Z 2 [Note] WSREP: State gap can be likely serviced using IST. SST request though present would be void.
2018-05-25T17:00:46.909489Z 0 [Note] WSREP: Member 1.0 (Thinkie) requested state transfer from '*any*'. Selected 0.0 (Thinkie)(SYNCED) as donor.
2018-05-25T17:00:46.909513Z 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 13)
2018-05-25T17:00:46.909557Z 2 [Note] WSREP: Requesting state transfer: success, donor: 0
2018-05-25T17:00:46.909602Z 2 [Note] WSREP: GCache history reset: f364a69b-603c-11e8-a632-ce5a4a7d5964:2 -> f364a69b-603c-11e8-a632-ce5a4a7d5964:13
2018-05-25T17:00:46.910221Z 0 [Note] WSREP: 0.0 (Thinkie): State transfer to 1.0 (Thinkie) complete.
2018-05-25T17:00:46.910422Z 0 [Note] WSREP: Member 0.0 (Thinkie) synced with group.
2018-05-25T17:00:47.006802Z 2 [Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): full reset
2018-05-25T17:00:47.106423Z 2 [Note] WSREP: Receiving IST: 11 writesets, seqnos 2-13
2018-05-25T17:00:47.106764Z 0 [Note] WSREP: Receiving IST...  0.0% ( 0/11 events) complete.
2018-05-25T17:00:47.109740Z 0 [Note] WSREP: Receiving IST...100.0% (11/11 events) complete.
2018-05-25T17:00:47.110029Z 2 [Note] WSREP: IST received: f364a69b-603c-11e8-a632-ce5a4a7d5964:13
2018-05-25T17:00:47.110433Z 0 [Note] WSREP: 1.0 (Thinkie): State transfer from 0.0 (Thinkie) complete.
2018-05-25T17:00:47.110480Z 0 [Note] WSREP: SST leaving flow control
2018-05-25T17:00:47.110509Z 0 [Note] WSREP: Shifting JOINER -> JOINED (TO: 13)
2018-05-25T17:00:47.110778Z 0 [Note] WSREP: Member 1.0 (Thinkie) synced with group.
2018-05-25T17:00:47.110830Z 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 13)
2018-05-25T17:00:47.110890Z 2 [Note] WSREP: Synchronized with group, ready for connections

If you want to write your own tests for IST and SST operations you can use existing test cases as a baseline. You are not required to use grep, and can explore your own scenarios. The important parts of the code are:

  • The WSREP_PROVIDER environment variable must be set before the test run
  • The test should live in the galera suite; if you choose to use your own suite, you must copy the definitions from the galera suite default configuration file
  • The test should include the file include/galera_cluster.inc
  • To isolate the node from the cluster, run the following code:
# Isolate node #2
--connection node_2
SET GLOBAL wsrep_provider_options = 'gmcast.isolate = 1';
--connection node_1
--let $wait_condition = SELECT VARIABLE_VALUE = 1 FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_size';
--source include/wait_condition.inc
--connection node_2
SET SESSION wsrep_on = OFF;
--let $wait_condition = SELECT VARIABLE_VALUE = 'non-Primary' FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_status';
--source include/wait_condition.inc
SET SESSION wsrep_on = ON;

Replace the node numbers if needed.
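
For instance, isolating node 1 instead and watching the cluster size from node 2 is the same pattern with the connections swapped:

# Isolate node #1 (same pattern, node numbers swapped)
--connection node_1
SET GLOBAL wsrep_provider_options = 'gmcast.isolate = 1';
--connection node_2
--let $wait_condition = SELECT VARIABLE_VALUE = 1 FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_size';
--source include/wait_condition.inc
--connection node_1
SET SESSION wsrep_on = OFF;
--let $wait_condition = SELECT VARIABLE_VALUE = 'non-Primary' FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_status';
--source include/wait_condition.inc
SET SESSION wsrep_on = ON;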

To bring the node back to the cluster run the following code:

# Restore node #2, IST is performed
--connection node_2
SET GLOBAL wsrep_provider_options = 'gmcast.isolate = 0';
--connection node_1
--let $wait_condition = SELECT VARIABLE_VALUE = 2 FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_size';
--source include/wait_condition.inc
--connection node_2
--let $wait_condition = SELECT VARIABLE_VALUE = 'Primary' FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_cluster_status';
--source include/wait_condition.inc
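
Grepping the error log is only one way to verify the outcome. Once the node is back, you could instead (or additionally) confirm that the rows written while it was isolated arrived via the state transfer. A minimal sketch, reusing the t1 table from the IST test above:

--connection node_2
# wait until the transferred table exists on node 2, then until all rows are present
--let $wait_condition = SELECT COUNT(*) = 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 't1'
--source include/wait_condition.inc
--let $wait_condition = SELECT COUNT(*) = 10 FROM t1
--source include/wait_condition.inc
SELECT COUNT(*) FROM t1;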

Depending on the size of the updates and of the gcache, you can test either IST or SST in this way.
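
For example, to push such a test towards SST rather than IST, one option is to shrink the provider’s gcache in the test’s own .cnf file (and/or write more data while the node is isolated) so that the missing writesets can no longer be served from the cache. A hypothetical override, following the same conventions as the galera_ist_progress.cnf shown earlier; the option string is an assumption to adapt, not a drop-in file:

# Hypothetical sketch: small gcache on the donor node to encourage SST
!include ../galera_2nodes.cnf

[mysqld.1]
wsrep_provider_options='base_port=@mysqld.1.#galera_port;gcache.size=1M'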

The post MySQL Test Framework for Percona XtraDB Cluster appeared first on Percona Database Performance Blog.

May
25
2018
--

Webinar Tues, 5/29: MySQL, Percona XtraDB Cluster, ProxySQL, Kubernetes: How they work together

Please join Percona’s Principal Architect Alex Rubin as he presents MySQL, Percona XtraDB Cluster, ProxySQL, Kubernetes: How they work together to give you a highly available cluster database environment on Tuesday, May 29th at 7:00 AM PDT (UTC-7) / 10:00 AM EDT (UTC-4).

 

In this webinar, Alex will discuss how to deploy a highly available MySQL database environment on Kubernetes/Openshift using Percona XtraDB Cluster (PXC) together with MySQL Proxy to implement read/write splitting.

If you have never used Kubernetes and Openshift, or never used PXC / MySQL Proxy, Alex will do a quick introduction to these technologies. There will also be a demo where Alex sets up a PXC cluster with ProxySQL in Openshift Origin and tries to break it.

By the end of this webinar you will have a better understanding of:

  • How to deploy Percona XtraDB Cluster with ProxySQL for HA solutions
  • How to leverage Kubernetes/Openshift in your environments
  • How to troubleshoot performance issues

Register for the webinar

Alexander Rubin, Principal Consultant

Alexander joined Percona in 2013. He has worked with MySQL since 2000 as a DBA and application developer. Before joining Percona he did MySQL consulting as a principal consultant for over 7 years (starting with MySQL AB in 2006, then Sun Microsystems and then Oracle). He helped many customers design large, scalable and highly available MySQL systems and optimize MySQL performance. Alexander also helped customers design Big Data stores with Apache Hadoop and related technologies.

The post Webinar Tues, 5/29: MySQL, Percona XtraDB Cluster, ProxySQL, Kubernetes: How they work together appeared first on Percona Database Performance Blog.

May
22
2018
--

ProxySQL 1.4.8 and Updated proxysql-admin Tool Now in the Percona Repository

ProxySQL for Connection Pooling

ProxySQL 1.4.8, released by ProxySQL, is now available for download in the Percona Repository along with an updated version of Percona’s proxysql-admin tool.

ProxySQL is a high-performance proxy, currently for MySQL and its forks (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. René Cannaò created ProxySQL for DBAs as a means of solving complex replication topology issues.

The ProxySQL 1.4.8 source and binary packages available at https://percona.com/downloads/proxysql include ProxySQL Admin, a tool developed by Percona to configure Percona XtraDB Cluster nodes in ProxySQL. Docker images for release 1.4.8 are available as well: https://hub.docker.com/r/percona/proxysql/. You can download the original ProxySQL from https://github.com/sysown/proxysql/releases.
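
If you install from the Percona repository, a typical sequence on a yum-based system might look like the following (this assumes the repository is already configured, for example via percona-release, and that the package and service names are proxysql):

# install and start ProxySQL from the Percona repository (assumed package/service names)
sudo yum install proxysql
sudo service proxysql start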

This release fixes the following bugs in ProxySQL Admin:

Usability improvement:

  • PR #84: The proxysql-status tool now dumps host_priority and proxysql-admin.cnf. The output format was also changed.

Other improvements and bug fixes:

  • PR #66: The --syncusers option now makes proxysql-admin update the user’s password in the ProxySQL database if there is any password difference between the ProxySQL user and the MySQL user.
  • PSQLADM-45: It was unclear from the help screen that the --config-file option requires an argument.
  • PSQLADM-48: The ${PROXYSQL_DATADIR}/${CLUSTER_NAME}_mode file was not created at proxysql-admin upgrade (from 1.4.5 or earlier to 1.4.6 or later).
  • PSQLADM-52: The proxysql_galera_checker script was not checking empty query rules.
  • PSQLADM-54: proxysql_node_monitor did not change the OFFLINE_HARD status properly for nodes coming back online.

ProxySQL is available under the open source GPLv3 license.

The post ProxySQL 1.4.8 and Updated proxysql-admin Tool Now in the Percona Repository appeared first on Percona Database Performance Blog.

May
09
2018
--

Does the Version Number Matter?

ProxySQL

Yes, it does! In this blog post, I am going to share my recent experiences with ProxySQL and how important the database software version number can be.

Migration

I was working on a migration to Percona XtraDB Cluster (PXC) with ProxySQL, fortunately on a staging environment first so we could catch any issues (like this one).

We installed Percona XtraDB Cluster and ProxySQL on the staging environment and repointed the staging application to ProxySQL. At first, everything looked great. We were able to do some application tests and everything looked good. I advised the customer to do more testing to make sure everything works well.

Something is wrong, but what?

A few days later the customer noticed that their application was not working properly.

We started investigating. Everything seemed well-configured, and the only thing we could see in the application log was the following:

2018-04-20 11:28:31,169 [ default-threads - 42] ERROR Error in lifecycle management : org.hibernate.StaleStateException : Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1 {it.tasgroup.monetica.gt.lifecycle.LifeCycle:line 103} (method: error)
org.hibernate.StaleStateException: Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1
at org.hibernate.jdbc.Expectations$BasicExpectation.checkBatched(Expectations.java:85)
at org.hibernate.jdbc.Expectations$BasicExpectation.verifyOutcome(Expectations.java:70)

Based on this error, I still did not know what was wrong. Were some of the queries failing because of PXC, ProxySQL or some other setting?

We redirected the application to one of the PXC nodes, and everything worked fine. We tried HAProxy as well, and everything worked again. We knew something happening around ProxySQL was causing the problem, but we still could not find it: every query went through ProxySQL without any issue.

Debug log is our savior

The customer finally enabled the application debug logging so we could see which query was failing:

delete from TABLENAME where ID='11' and Timestamp ='2018-04-20 16:15:03';

I was confused at first: this is a fairly simple query, what could be wrong? Let’s investigate it on the cluster. When I tried to select the data on the cluster, it gave me back zero results. That’s OK, maybe the row was already deleted?

For this investigation, slow query logging was enabled and long_query_time was set to 0 to log all queries. I checked the slow query log looking for queries like this one. What I found helped me realize what the problem was:

delete from TABLENAME where ID=10 and Timestamp ='2018-04-20 11:17:22.35';
delete from TABLENAME where ID=24 and Timestamp ='2018-04-20 11:17:31.602';
delete from TABLENAME where ID=43 and Timestamp ='2018-04-20 11:18:13.2';
delete from TABLENAME where ID=22 and Timestamp ='2018-04-20 11:11:02.854';
delete from TABLENAME where ID=11 and Timestamp ='2018-04-20 11:21:57';
delete from TABLENAME where ID=64 and Timestamp ='2018-04-20 11:18:34';
delete from TABLENAME where ID=47 and Timestamp ='2018-04-20 10:38:35';
delete from TABLENAME where ID=23 and Timestamp ='2018-04-20 11:30:03';

I hope you see the difference! The first four lines have fractional seconds! At that time, the application was pointed to the cluster directly. So ProxySQL cut off the fractional seconds? That would be a nasty bug.

I checked the application log again with the debug information, and I could see that the application does not even use fractional seconds in its queries when it points to ProxySQL. This is why the query was failing (it did not delete any rows): in the table all the rows had fractional seconds, but the queries were not using them.
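
To make the failure mode concrete, here is a small, hypothetical reproduction (table name and values invented for illustration):

CREATE TABLE t (id INT, ts DATETIME(3));
INSERT INTO t VALUES (11, '2018-04-20 16:15:03.250');

-- without fractional seconds the literal is compared as ...03.000,
-- so nothing matches and the application sees 0 rows affected
DELETE FROM t WHERE id = 11 AND ts = '2018-04-20 16:15:03';

-- with the stored fractional seconds the row is deleted as expected
DELETE FROM t WHERE id = 11 AND ts = '2018-04-20 16:15:03.250';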

So why does the application not use fractional seconds with ProxySQL?

First of all, fractional seconds were introduced in MySQL 5.6.4. The application is a Java-based application using JBoss and Hibernate. I knew ProxySQL reports itself as MySQL 5.5. Maybe the application or connector reads the version number and makes decisions based on that?

It was quite easy to test this theory by just changing the version number in ProxySQL like this:

update global_variables set variable_value="5.7.21" where variable_name='mysql-server_version';
load mysql variables to run;save mysql variables to disk;

The application had to be restarted (probably it was caching the previous settings) but after that everything was working as expected.
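
If you want to double-check what is currently configured after such a change, you can query the ProxySQL admin interface (port 6032 by default); the version string that client libraries see is the one ProxySQL sends in the connection handshake, which is exactly what this variable controls:

-- on the ProxySQL admin interface
SELECT variable_value FROM global_variables
 WHERE variable_name = 'mysql-server_version';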

But be careful, now it will report 5.7.21 for all the hostgroups. What if you have multiple hostgroups with different MySQL versions? It would be nice if you could define this for every hostgroup.

Conclusion

The solution was very easy, but finding the source of the problem took a long time. If you are planning to use ProxySQL, I would always recommend changing mysql-server_version to match the underlying MySQL server version number, because who knows which connector or application checks the version and makes a decision based on that.

There is another example here where Marco Tusa had a very similar problem with a Java connector.

The post Does the Version Number Matter? appeared first on Percona Database Performance Blog.
