Aug
31
2017
--

Percona Live Europe Featured Talks: Orchestrating ProxySQL with Orchestrator and Consul with Avraham Apelbaum

Colin Charles

Welcome to another post in our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Avraham Apelbaum, DBA and DevOps at Wix.com. His talk is titled Orchestrating ProxySQL with Orchestrator and Consul. The combination of ProxySQL and Orchestrator solves many problems, but it still requires some manual labor when the configuration changes, for example after a network split (among other scenarios). In our conversation, we discussed using Consul to solve some of these issues:

Percona: How did you get into database technology? What do you love about it?

Avraham: On my first day as a soldier in a technology unit of the IDF, I received a HUGE Oracle 8 book and a very low-level design of a DB-based system. “You have one month,” they told me. I finished it all within ten days. Before that, I didn’t even know what a DB was. Today, I’m at Wix managing hundreds of databases that support 100M users!

Percona: You’re presenting a session called “Orchestrating ProxySQL with Orchestrator and Consul”. How do these technologies work together to help provide a high availability solution?

Avraham: ProxySQL is supposed to help you out with high availability (HA) and disaster recovery (DR) for MySQL servers, but it still requires some manual labor when the configuration changes – as a result of a network split, for example. Somehow all ProxySQL servers need to get the new MySQL cluster topology. So to automate all that, I added two more parts: a Consul KV store and a Consul template, which are responsible for updating ProxySQL on every architecture change in the MySQL cluster.
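
As a rough sketch of the idea (not necessarily Avraham’s exact setup), a Consul template can render ProxySQL’s server list from the Consul catalog and push it through the ProxySQL admin interface whenever the topology changes. The service name, hostgroup, file paths and credentials below are placeholders:

### Illustrative only: re-render ProxySQL's server list whenever the "mysql"
### service changes in Consul (service name, hostgroup and credentials are placeholders)
cat > /etc/consul-template/proxysql_servers.tpl <<'EOF'
DELETE FROM mysql_servers WHERE hostgroup_id = 10;
{{ range service "mysql" }}INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (10, '{{ .Address }}', {{ .Port }});
{{ end }}LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
EOF
### Each time the rendered file changes, pipe it into ProxySQL's admin interface
consul-template -template \
  "/etc/consul-template/proxysql_servers.tpl:/tmp/proxysql_servers.sql:mysql -h127.0.0.1 -P6032 -uadmin -padmin < /tmp/proxysql_servers.sql"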

Percona: What is special about this combination of products that works better than other solutions? Is it right all the time, or does it depend on the workload?

Avraham: As a DevOps engineer, I prefer not to do anything manually. What’s more, no one wants to wake up in the middle of the night just because one of our DB servers failed. Almost everyone, I guess, will have more than one ProxySQL server in their system at some point, so this solution can help them use ProxySQL and Orchestrator.

Percona: What do you want attendees to take away from your session? Why should they attend?

Avraham: I am hoping to help people automate their HA and DR solutions. If, as a result of my talk, someone shaves even one minute off their downtime, I’ll be happy.

Percona: What are you most looking forward to at Percona Live Europe 2017?

Avraham: In the DevOps and open source world, it’s all about sharing ideas. It was actually when I attended the talks by the creators of ProxySQL and Orchestrator that I thought of putting it all together to solve our own problem. So I am looking forward to sharing my idea with others, and getting input from the audience so that everyone can benefit.

Want to find out more about Avraham and orchestrating ProxySQL? Register for Percona Live Europe 2017, and see his talk Orchestrating ProxySQL with Orchestrator and Consul. Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

Oct
10
2016
--

Consul Architecture

In this blog post, I’ll provide my thoughts about Consul for ProxySQL service discovery and automation.

I approached Consul recently while looking for a service discovery and configuration automation solution for ProxySQL. My colleague Nik Vyzas wrote a great post on this topic, and I suggest you read it. I wrote this article to share my first impressions of Consul (for whomever it might interest).

Consul is a complete service discovery solution. In this respect it differs from its alternative etcd, which only provides a foundation to build such solutions.

Consul consists of a single, small binary (the Linux binary is 24MB). You just download it, edit the configuration file and start the program. It doesn’t need a package. The Consul binary does it all. You can start it as a server or as a client. It also provides a set of administrative commands, usable via the command line or the HTTP API.

But what is Consul about?

I mentioned service discovery, which is the primary purpose of Consul. But what is it?

Suppose that you have a Percona XtraDB Cluster. Applications query this cluster via ProxySQL (or another proxy), which distributes the workload among the running servers. The applications still need to know ProxySQL’s address and port, though. And what if the ProxySQL instance they point to becomes unreachable? This is what service discovery solves: a service discovery server tells applications the IP address and port of a running instance of the service they need. It can also store information about service configuration.

Let’s continue with our Percona XtraDB Cluster and ProxySQL example. Here is what Consul can do for us:

  • When a node is added, automatically discover other nodes in the cluster.
  • When a proxy is added, automatically discover all cluster nodes.
  • Automatically configure the proxy with users and other settings.
  • Even some basic monitoring, thanks to Consul health checks.

Now, let’s see how it does these things.

Interfaces

If you only want to test Consul interfaces from a developer point of view, you can start a stand-alone Consul instance in developer mode. This means that Consul will run in-memory, and will not write anything to disk.
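
For example (a quick sketch, assuming the consul binary is in your PATH), starting a development instance and poking at it looks like this:

### Start a throw-away, in-memory Consul agent (nothing is persisted to disk)
consul agent -dev
### From another terminal, the HTTP API answers on the default port 8500
curl http://127.0.0.1:8500/v1/agent/self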

Applications can query Consul in two ways. It can be queried like a DNS server, which is the most lightweight option. For example, an application can send a request for mysql.service.dc1.consul, which means “please find a running MySQL service, in the datacenter called dc1.” Consul will reply with an A or SRV record, with the IP and possibly the port of a running server.
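
For example, against a local agent (the “mysql” service name and the dc1 datacenter are illustrative; Consul’s DNS interface listens on port 8600 by default):

### Ask for a running "mysql" service in datacenter dc1;
### an SRV query also returns the port, not just the IP
dig @127.0.0.1 -p 8600 mysql.service.dc1.consul SRV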

You can make the same request via a REST API. The API can register or unregister services, add health checks, and so on.
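
As a sketch (the “mysql” service and its check are illustrative), registering a service and then listing what the local agent knows about looks like this:

### Register a service, together with a script-based health check
curl -X PUT -d '{
  "Name": "mysql",
  "Port": 3306,
  "Check": {"Script": "mysqladmin ping", "Interval": "10s"}
}' http://127.0.0.1:8500/v1/agent/service/register
### List the services registered on this agent
curl http://127.0.0.1:8500/v1/agent/services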

Consul performs health checks to find out which services are running. Consul expects to receive an integer representing success or error, just like Nagios. In fact, you can use Nagios plugins with Consul. You can even use Consul as a basis for a distributed monitoring system.
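
A check is simply an executable whose exit code follows the Nagios convention: 0 is passing, 1 is a warning, anything else is critical. A minimal sketch, with placeholder host and port:

#!/bin/bash
### Minimal Nagios-style check that Consul can run periodically:
### exit 0 = passing, exit 1 = warning, anything else = critical
if mysqladmin ping --host=127.0.0.1 --port=3306 --silent; then
    exit 0
else
    exit 2
fi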

The HTTP API also includes endpoints for a KV store. Under the hood, Consul includes BoltDB. This means you can use Consul for configuration automation. Endpoints are also provided to implement distributed semaphores and leader election.
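
Reading and writing the KV store is a couple of curl calls away (the key name below is just an example):

### Write a key ...
curl -X PUT -d '10' http://127.0.0.1:8500/v1/kv/proxysql/writer_hostgroup
### ... read it back as JSON (the value comes back base64-encoded) ...
curl http://127.0.0.1:8500/v1/kv/proxysql/writer_hostgroup
### ... or fetch just the raw value
curl http://127.0.0.1:8500/v1/kv/proxysql/writer_hostgroup?raw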

The Consul binary also provides an easy command-line interface, mainly used for administrative tasks: registering or unregistering services, adding new nodes to Consul, and so on. It also provides good diagnostic commands.
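
For example, a few of the commands you are likely to use most often:

### List the agents known to the cluster and their status
consul members
### Print diagnostic information: Raft state, gossip pools, runtime details
consul info
### Gracefully remove this agent from the cluster
consul leave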

Cluster

In production, Consul runs as a cluster. As mentioned above, each instance can be a server or a client. Clients have fewer responsibilities: when they receive queries (reads) or transactions (writes), they act as a proxy and forward them to a server. Each client also executes health checks against some services, and informs the servers about their health status.

Servers are one of two types: an elected leader, and its followers. The leader can change at any moment. When a follower receives a request from a client, it forwards it to the leader. If it is a transaction, the leader logs it locally and replicates it to the followers. When a majority of servers have accepted the change, the transaction is committed. The term “transaction” is a bit confusing: since version 0.7, think of a “transaction” as something that changes the state of the cluster.

Reads can have three different consistency levels, where stricter levels are slower. Followers forward queries to the leader by default, which in turn contacts other followers to check if it is still the leader. This mechanism guarantees that the applications (the users) never receive stale data. However, it requires a considerable amount of work. For this reason, less reliable but faster consistency levels are supported (depending on the use case).
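
The consistency level is chosen per request; on the HTTP API it is just a query parameter (the key name below is the example one used above):

### Default: the request is forwarded to the leader
curl http://127.0.0.1:8500/v1/kv/proxysql/writer_hostgroup
### "consistent": the leader also confirms its leadership with a quorum (strictest, slowest)
curl http://127.0.0.1:8500/v1/kv/proxysql/writer_hostgroup?consistent
### "stale": any server may answer, possibly with slightly old data (fastest)
curl http://127.0.0.1:8500/v1/kv/proxysql/writer_hostgroup?stale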

Therefore, we can say that having more servers improves reliability in case some nodes crash, but lowers performance because it implies more network communication. The recommended number of servers is five. Having a high number of clients makes the system more scalable, because the health-check and request-forwarding work is distributed over all clients.

Multi-cluster configurations are natively supported, for geographically distributed environments. Each cluster serves data about different services. Applications, however, can query any cluster. If necessary, Consul will forward the request to the proper cluster to retrieve the required information.
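
From a client’s point of view this is transparent. For instance, adding the datacenter name to a DNS query is enough (“dc2” below is a hypothetical second datacenter):

### Ask the local agent for a service running in another datacenter
dig @127.0.0.1 -p 8600 mysql.service.dc2.consul SRV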

Packages

Currently, most Linux distributions do not include Consul. However, the package is present in some releases that are not yet stable (like Debian Testing and Ubuntu 16.10).

Some community packages also exist. Before using them, you should test them to be sure that they are production-ready.

Consul in Docker

Consul’s official Docker image is based on Alpine Linux, which makes it very small. Alpine Linux is a distribution designed for embedded environments, and it has recently become quite popular in the Docker world. It is based on BusyBox, a tiny re-implementation of basic GNU tools.

The image is also very secure. Normally containers run a daemon as root; Consul runs as the consul user, via a sudo alternative called gosu.
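
A minimal sketch of running it (at the time of writing the official image is simply named “consul”; the container name and port mapping below are arbitrary):

### Run a single development-mode Consul container and expose the HTTP API/UI
docker run -d --name=consul-dev -p 8500:8500 consul agent -dev -client=0.0.0.0
### Follow its logs
docker logs -f consul-dev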

A Good Use Case

When we start a new container in a “dockerized” environment, we cannot predict its IP. This is a major pain when setting up a cluster: all nodes must be configured with the other nodes’ addresses, and optionally a proxy (like ProxySQL) must know the nodes’ addresses. The problem reappears every time we add a new node, a new slave or a new proxy. Consul is a great way to solve this problem. We will see this in depth in a future post.

Sep
16
2016
--

Consul, ProxySQL and MySQL HA

When it comes to “decision time” about which type of MySQL HA (high-availability) solution to implement, and how to architect the solution, many questions come to mind. The most important questions are:

  • “What are the best tools to provide HA and Load Balancing?”
  • “Should I be deploying this proxy tool on my application servers or on a standalone server?”.

Ultimately, the best tool really depends on the needs of your application and your environment. You might already be using specific tools such as Consul or MHA, or you might be looking to implement tools that provide richer features. The dilemma of deploying a proxy instance per application host versus a standalone proxy instance is usually a trade-off between “a less effective load balancing algorithm” and “a single point of failure.” Neither is desirable, but there are ways to implement a solution that balances all aspects.

In this article, we’ll go through a solution that is suitable for an application that has not been coded to split reads and writes over separate MySQL instances. An application like this relies on a proxy or third-party tool to split reads/writes, and preferably a solution that has high availability at the proxy layer. The solution described here is composed of ProxySQL, Consul and Master High Availability (MHA). Within this article, we’ll focus on the configuration required for ProxySQL and Consul, since there are many articles that cover MHA configuration (such as Miguel’s recent MHA Quick Start Guide blog post).

When deploying Consul in production, a minimum of three instances is recommended. In this example, the Consul agents run on the Application Server (appserver) as well as on the two “ProxySQL servers” mysql1 and mysql2 (which act as the HA proxy pair). This is not a hard requirement, and these instances can easily run on another host or Docker container. MySQL is deployed locally on mysql1 and mysql2; however, this could just as well be 1..n separate standalone DB server instances:

(Diagram: Consul and ProxySQL deployment across appserver, mysql1 and mysql2)

So let’s move on to the actual configuration of this HA solution, starting with Consul.

Installation of Consul:

Firstly, we’ll need to install the required packages, download the Consul archive and perform the initial configuration. We’ll need to perform the same installation on each of the nodes (i.e., appserver, mysql1 and mysql2).

### Install pre-requisite packages:
sudo yum -y install wget unzip bind-utils dnsmasq
### Install Consul:
sudo useradd consul
sudo mkdir -p /opt/consul /etc/consul.d
sudo touch /var/log/consul.log /etc/consul.d/proxysql.json
cd /opt/consul
sudo wget https://releases.hashicorp.com/consul/0.6.4/consul_0.6.4_linux_amd64.zip
sudo unzip consul_0.6.4_linux_amd64.zip
sudo ln -s /opt/consul/consul /usr/bin/consul
sudo chown consul:consul -R /etc/consul* /opt/consul* /var/log/consul.log

Configuration of Consul on Application Server (used as ‘bootstrap’ node):

Now that we’re done with the installation on each of the hosts, let’s continue with the configuration. In this example we’ll bootstrap the Consul cluster using “appserver”:

### Edit configuration files
$ sudo vi /etc/consul.conf
{
  "datacenter": "dc1",
  "data_dir": "/opt/consul/",
  "log_level": "INFO",
  "node_name": "agent1",
  "server": true,
  "ui": true,
  "bootstrap": true,
  "client_addr": "0.0.0.0",
  "advertise_addr": "192.168.1.119"  ## Add server IP here
}
######
$ sudo vi /etc/consul.d/proxysql.json
{"services": [
  {
   "id": "proxy1",
   "name": "proxysql",
   "address": "192.168.1.120",
   "tags": ["mysql"],
   "port": 6033,
   "check": {
     "script": "mysqladmin ping --host=192.168.1.120 --port=6033 --user=root --password=123",
     "interval": "3s"}
   },
  {
   "id": "proxy2",
   "name": "proxysql",
   "address": "192.168.1.121",
   "tags": ["mysql"],
   "port": 6033,
   "check": {
     "script": "mysqladmin ping --host=192.168.1.121 --port=6033 --user=root --password=123",
     "interval": "3s"}
   }
 ]
}
######
### Start Consul agent
$ sudo su - consul -c 'consul agent -config-file=/etc/consul.conf -config-dir=/etc/consul.d > /var/log/consul.log &'
### Setup DNSMASQ (as root)
echo "server=/consul/127.0.0.1#8600" > /etc/dnsmasq.d/10-consul
service dnsmasq restart
### Remember to add the localhost as a DNS server (this step can vary
### depending on how your DNS servers are managed... here I'm just
### adding the following line to resolv.conf:
sudo vi /etc/resolv.conf
#... snippet ...#
nameserver 127.0.0.1
#... snippet ...#
### Restart dnsmasq
sudo service dnsmasq restart

The service should now be started, and you can verify this in the logs in “/var/log/consul.log”.

Configuration of Consul on Proxy Servers:

The next item is to configure each of the proxy Consul agents. Note that the “agent name” and the “IP address” need to be updated for each host (values for both must be unique):

### Edit configuration files
$ sudo vi /etc/consul.conf
{
  "datacenter": "dc1",
  "data_dir": "/opt/consul/",
  "log_level": "INFO",
  "node_name": "agent2",  ### Agent node name must be unique
  "server": true,
  "ui": true,
  "bootstrap": false,   ### Disable bootstrap on joiner nodes
  "client_addr": "0.0.0.0",
  "advertise_addr": "192.168.1.xxx",  ### Set to local instance IP
  "dns_config": {
    "only_passing": true
  }
}
######
$ sudo vi /etc/consul.d/proxysql.json
{"services": [
  {
   "id": "proxy1",
   "name": "proxysql",
   "address": "192.168.1.120",
   "tags": ["mysql"],
   "port": 6033,
   "check": {
     "script": "mysqladmin ping --host=192.168.1.120 --port=6033 --user=root --password=123",
     "interval": "3s"}
   },
  {
   "id": "proxy2",
   "name": "proxysql",
   "address": "192.168.1.121",
   "tags": ["mysql"],
   "port": 6033,
   "check": {
     "script": "mysqladmin ping --host=192.168.1.121 --port=6033 --user=root --password=123",
     "interval": "3s"}
   }
 ]
}
######
### Start Consul agent:
$ sudo su - consul -c 'consul agent -config-file=/etc/consul.conf -config-dir=/etc/consul.d > /var/log/consul.log &'
### Join Consul cluster specifying 1st node IP e.g.
$ consul join 192.168.1.119
### Verify logs and look out for the following messages:
$ cat /var/log/consul.log
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
         Node name: 'agent2'
        Datacenter: 'dc1'
            Server: true (bootstrap: false)
       Client Addr: 0.0.0.0 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
      Cluster Addr: 192.168.1.120 (LAN: 8301, WAN: 8302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
             Atlas:
==> Log data will now stream in as it occurs:
# ... snippet ...
    2016/09/05 19:48:04 [INFO] agent: Synced service 'consul'
    2016/09/05 19:48:04 [INFO] agent: Synced check 'service:proxysql1'
    2016/09/05 19:48:04 [INFO] agent: Synced check 'service:proxysql2'
# ... snippet ...

At this point, we have Consul installed, configured and running on each of our hosts (appserver, mysql1 and mysql2). Now it’s time to install and configure ProxySQL on mysql1 and mysql2.

Installation & Configuration of ProxySQL:

The same procedure should be run on both mysql1 and mysql2 hosts:

### Install ProxySQL packages and initialise ProxySQL DB
sudo yum -y install https://github.com/sysown/proxysql/releases/download/v1.2.2/proxysql-1.2.2-1-centos7.x86_64.rpm
sudo service proxysql initial
sudo service proxysql stop
### Edit the ProxySQL configuration file to update username / password
vi /etc/proxysql.cnf
###
admin_variables=
{
    admin_credentials="admin:admin"
    mysql_ifaces="127.0.0.1:6032;/tmp/proxysql_admin.sock"
}
###
### Start ProxySQL
sudo service proxysql start
### Connect to ProxySQL and configure
mysql -P6032 -h127.0.0.1 -uadmin -padmin
### First we create a replication hostgroup:
mysql> INSERT INTO mysql_replication_hostgroups VALUES (10,11,'Standard Replication Groups');
### Add both nodes to the hostgroup 11 (ProxySQL will automatically put the writer node in hostgroup 10)
mysql> INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.120',11,3306,1000);
mysql> INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.121',11,3306,1000);
### Save server configuration
mysql> LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
### Add query rules for RW split
mysql> INSERT INTO mysql_query_rules (active, match_pattern, destination_hostgroup, cache_ttl, apply) VALUES (1, '^SELECT .* FOR UPDATE', 10, NULL, 1);
mysql> INSERT INTO mysql_query_rules (active, match_pattern, destination_hostgroup, cache_ttl, apply) VALUES (1, '^SELECT .*', 11, NULL, 1);
mysql> LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;
### Finally configure ProxySQL user and save configuration
mysql> INSERT INTO mysql_users (username,password,active,default_hostgroup,default_schema) VALUES ('root','123',1,10,'test');
mysql> LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK;
mysql> EXIT;

MySQL Configuration:

We also need to perform one configuration step on the MySQL servers in order to create a user for ProxySQL to monitor the instances:

### ProxySQL's monitor user on the master MySQL server (default username and password is monitor/monitor)
mysql -h192.168.1.120 -P3306 -uroot -p123 -e"GRANT USAGE ON *.* TO monitor@'%' IDENTIFIED BY 'monitor';"

We can view the configuration of the monitor user on the ProxySQL host by checking the global variables on the admin interface:

mysql> SHOW VARIABLES LIKE 'mysql-monitor%';
+----------------------------------------+---------+
| Variable_name                          | Value   |
+----------------------------------------+---------+
| mysql-monitor_enabled                  | true    |
| mysql-monitor_connect_timeout          | 200     |
| mysql-monitor_ping_max_failures        | 3       |
| mysql-monitor_ping_timeout             | 100     |
| mysql-monitor_replication_lag_interval | 10000   |
| mysql-monitor_replication_lag_timeout  | 1000    |
| mysql-monitor_username                 | monitor |
| mysql-monitor_password                 | monitor |
| mysql-monitor_query_interval           | 60000   |
| mysql-monitor_query_timeout            | 100     |
| mysql-monitor_slave_lag_when_null      | 60      |
| mysql-monitor_writer_is_also_reader    | true    |
| mysql-monitor_history                  | 600000  |
| mysql-monitor_connect_interval         | 60000   |
| mysql-monitor_ping_interval            | 10000   |
| mysql-monitor_read_only_interval       | 1500    |
| mysql-monitor_read_only_timeout        | 500     |
+----------------------------------------+---------+

Testing Consul:

Now that Consul and ProxySQL are configured we can do some tests from the “appserver”. First, we’ll verify that the hosts we’ve added are both reporting [OK] on our DNS requests:

$ dig @127.0.0.1 -p 53 proxysql.service.consul
; <<>> DiG 9.9.4-RedHat-9.9.4-29.el7_2.3 <<>> @127.0.0.1 -p 53 proxysql.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9975
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;proxysql.service.consul.	IN	A
;; ANSWER SECTION:
proxysql.service.consul. 0	IN	A	192.168.1.121
proxysql.service.consul. 0	IN	A	192.168.1.120
;; Query time: 1 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Sep 05 19:32:12 UTC 2016
;; MSG SIZE  rcvd: 158

As you can see from the output above, DNS is reporting both 192.168.1.120 and 192.168.1.121 as available for the ProxySQL service. As soon as a ProxySQL check fails, that node will no longer appear in the output above.

We can also view the status of our cluster and agents through the Consul Web GUI which runs on port 8500 of all the Consul servers in this configuration (e.g. http://192.168.1.120:8500/):

(Screenshot: the Consul Web GUI)

Testing ProxySQL:

So now that we have this configured we can also do some basic tests to see that ProxySQL is load balancing our connections:

[percona@appserver consul.d]$ mysql -hproxysql.service.consul -e"select @@hostname"
+--------------------+
| @@hostname         |
+--------------------+
| mysql1.localdomain |
+--------------------+
[percona@appserver consul.d]$ mysql -hproxysql.service.consul -e"select @@hostname"
+--------------------+
| @@hostname         |
+--------------------+
| mysql2.localdomain |
+--------------------+

Perfect! We’re ready to use the hostname “proxysql.service.consul” to connect to our MySQL instances using a round-robin load balancing and HA proxy solution. If one of the two ProxySQL instances fails, we’ll continue communicating with the database through the other. Of course, this configuration is not limited to just two hosts, so feel free to add as many as you need. Be aware that in this example the two hosts’ replication hierarchy is managed by MHA in order to allow for master/slave promotion. By performing an automatic or manual failover using MHA, ProxySQL automatically detects the change in replication topology and redirects writes to the newly promoted master instance.

To make this configuration more durable, we encourage you to create a more intelligent Consul check, i.e., one that checks more than just the availability of the MySQL service (for example, by selecting some data from a table, as sketched below). It is also recommended to fine-tune the interval of the check to suit the requirements of your application.
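
As a sketch of such a check, reusing the addresses and credentials from the earlier examples (the probed table is a placeholder you would replace with something meaningful for your application):

#!/bin/bash
### Smarter Consul check: run a real query through ProxySQL instead of just pinging it.
### Exit 0 = passing, anything else = critical (the node is removed from DNS answers).
RESULT=$(mysql --host=192.168.1.120 --port=6033 --user=root --password=123 \
               -N -s -e "SELECT 1 FROM test.heartbeat LIMIT 1" 2>/dev/null)
if [ "$RESULT" == "1" ]; then
    exit 0
else
    exit 2
fi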

Sep
14
2016
--

Webinar Thursday Sept. 15: Identifying and Solving Database Performance Issues with PMM

Please join Roman Vynar, Lead Platform Engineer, on Thursday, September 15, 2016 at 10 am PDT (UTC-7) for a webinar on Identifying and Solving Database Performance Issues with PMM.

Database performance is the key to high-performance applications. Gaining visibility into the database is the key to improving database performance. Percona Monitoring and Management (PMM) provides the insight you need into your database environment.

In this webinar, we will demonstrate how using PMM for query analytics, in combination with database and host performance metrics, can more efficiently drive tuning, issue management and application development. Using PMM can result in faster resolution times, more focused development and a more efficient IT team.

Register for the webinar here.

Roman Vynar, Lead Platform Engineer
Roman is a Lead Platform Engineer at Percona. He joined the company to establish and develop the Remote DBA service from scratch. Over time, the growing service successfully expanded to Managed Services. Roman develops monitoring tools, automated scripts, a backup solution, and a notification and incident tracking web system, and currently leads the Percona Monitoring and Management project.

Dec
05
2014
--

Streamlined Percona XtraDB Cluster (or anything) testing with Consul and Vagrant

Introducing Consul

I’m always interested in what Mitchell Hashimoto and HashiCorp are up to; I typically find their projects valuable. If you’ve heard of Vagrant, you know their work.

I recently became interested in a newer project of theirs called ‘Consul’. Consul is a bit hard to describe. It is (in part):

  • A highly consistent metadata store (a bit like ZooKeeper)
  • A monitoring system (a lightweight Nagios)
  • A service discovery system, both DNS- and HTTP-based (think of something like HAProxy, but instead of TCP load balancing, it provides DNS lookups that return only healthy services)

What this has to do with Percona XtraDB Cluster

I’ve had some more complex testing for Percona XtraDB Cluster (PXC) to do on my plate for quite a while, and I started to explore Consul as a tool to help with this.  I already have Vagrant setups for PXC, but ensuring all the nodes are healthy, kicking off tests, gathering results, etc. were still difficult.

So, my loose goals for Consul are:

  • A single dashboard to ensure my testing environment is healthy
  • Ability to adapt to any size environment — 3 node clusters up to 20+
  • Coordinate starting and stopping load tests running on any number of test clients
  • Have the ability to collect distributed test results

I’ve succeeded on some of these fronts with a Vagrant environment I’ve been working on. This spins up:

  • A Consul cluster (default is a single node)
  • Test server(s)
  • A PXC cluster

Additionally, it integrates the Test servers and PXC nodes with Consul such that:

  • The servers set up a Consul agent in client mode that joins the Consul cluster.
  • Additionally, they set up a local DNS forwarder that sends all DNS requests for the ‘.consul’ domain to the local agent, to be serviced by the Consul cluster.
  • The servers register services with Consul that run local health checks.
  • The test server(s) set up a ‘watch’ in Consul that waits for a Consul ‘event’ to kick off sysbench (see the sketch just below).
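
A minimal sketch of that watch (the handler script path is hypothetical; the event name matches the one fired later in this post):

### Run on each test server: invoke the handler whenever the named Consul event fires
consul watch -type=event -name=sysbench_update_index /usr/local/bin/run_sysbench.sh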

Seeing it in action

Once I run ‘vagrant up’, I get a Consul UI I can connect to on my localhost at port 8501:

Consul’s Node Overview

I can see all 5 of my nodes.  I can check the services and see that test1 is failing one health check because sysbench isn’t running yet:

Consul reporting sysbench is not running.

This is expected, because I haven’t started testing yet. I can see that my PXC cluster is healthy:

Health checks are using clustercheck from the PXC package

Involving Percona Cloud Tools in the system

So far, so good.  This Vagrant configuration (if I provide a PERCONA_AGENT_API_KEY in my environment) also registers my test servers with Percona Cloud Tools, so I can see data being reported there for my nodes:

Percona Cloud Tools’ Dashboard for a single node

So now I am ready to begin my test. To do so, I simply need to issue a Consul event from any of the nodes:

jayj@~/Src/pxc_consul [507]$ vagrant ssh consul1
Last login: Wed Nov 26 14:32:38 2014 from 10.0.2.2
[root@consul1 ~]# consul event -name='sysbench_update_index'
Event ID: 7c8aab42-fd2e-de6c-cb0c-1de31c02ce95

The pre-configured watcher on my test node knows what to do with that event and launches sysbench. Consul shows that sysbench is indeed running:

(Screenshot: the Consul UI showing the sysbench check passing)

And I can indeed see traffic start to come in on Percona Cloud Tools:

(Screenshot: query traffic appearing in Percona Cloud Tools)

I have testing traffic limited for my example, but that’s easily tunable via the Vagrantfile. To show something a little more impressive, here’s a 5-node cluster hitting around 2,500 tps of total throughput:

(Screenshot: around 2,500 tps of total throughput across a 5-node cluster)

So to summarize thus far:

  • I can spin up any size cluster I want and verify it is healthy with Consul’s UI
  • I can spin up any number of test servers and kick off sysbench on all of them simultaneously

Another big trick of Consul’s

So far so good, but let me point out a few things that may not be obvious. If you check the Vagrantfile, I use a Consul hostname in a few places. First, on the test servers:

# sysbench setup
            'tables' => 1,
            'rows' => 1000000,
            'threads' => 4 * pxc_nodes,
            'tx_rate' => 10,
            'mysql_host' => 'pxc.service.consul'

then again on the PXC server configuration:

# PXC setup
          "percona_server_version"  => pxc_version,
          'innodb_buffer_pool_size' => '1G',
          'innodb_log_file_size' => '1G',
          'innodb_flush_log_at_trx_commit' => '0',
          'pxc_bootstrap_node' => (i == 1 ? true : false ),
          'wsrep_cluster_address' => 'gcomm://pxc.service.consul',
          'wsrep_provider_options' => 'gcache.size=2G; gcs.fc_limit=1024',

Notice ‘pxc.service.consul’. This hostname is provided by Consul and resolves to the IPs of all servers that currently register the ‘pxc’ service and pass its health check:

[root@test1 ~]# host pxc.service.consul
pxc.service.consul has address 172.28.128.7
pxc.service.consul has address 172.28.128.6
pxc.service.consul has address 172.28.128.5

So I am using this to my advantage in two ways:

  1. My PXC cluster bootstraps the first node automatically, but all the other nodes use this hostname for their wsrep_cluster_address. This means: no specific hostnames or IPs in the my.cnf file, and this hostname will always be up to date with the nodes currently active in the cluster, which is precisely the list that should be in wsrep_cluster_address at any given moment.
  2. My test servers connect to this hostname, therefore they always know where to connect, and they will round-robin (if I have enough sysbench threads and PXC nodes) across different nodes based on the response of the DNS lookup, which returns three of the active nodes in a different order each time.

(Some of) The Issues

This is still a work in progress and there are many improvements that could be made:

  • I’m relying on PCT to collect my data, but it’d be nice to utilize Consul’s central key/value store to store results of the independent sysbench runs.
  • Consul’s leader election could be used to help the cluster determine which node should bootstrap on first startup. I am assuming node1 should bootstrap.
  • A variety of bugs in various software still makes this a bit clunky to manage at times. Here is a sample:
    • Consul events sometimes don’t fire in the current release (though it looks to be fixed soon)
    • PXC joining nodes sometimes get stuck, putting speed bumps into the automated deploy.
    • Automated installs of percona-agent (which sends data to Percona Cloud Tools) are straightforward, except when different cluster nodes clobber each other’s credentials.

So, in summary, I am happy with how easily Consul integrates and I’m already finding it useful for a product in its 0.4.1 release.
