With Kubernetes installed in part one (see Using Percona Kubernetes Operators With K3s Part 1: Installation), we will now install Percona Server for MySQL Operator into the running cluster.
I will borrow some ideas from Peter’s Minikube tutorial (see Exploring MySQL on Kubernetes with Minikube).
In this case, I will not use Percona XtraDB Cluster Operator, but regular Percona Server for MySQL with asynchronous replication.
We have recently released version 0.3.0 and it is still in the technical preview state, so we are actively looking for more feedback!
If we go with all defaults, then installation can be done in two steps.
Step 1. Install Operator
kubectl apply -f https://raw.githubusercontent.com/percona/percona-server-mysql-operator/v0.3.0/deploy/bundle.yaml
Step 2. Install Percona Server for MySQL in source->replica mode, with one source and two replicas
kubectl apply -f https://raw.githubusercontent.com/percona/percona-server-mysql-operator/v0.3.0/deploy/cr.yaml
And we can see the following pods running:
kubectl get pods -n mysql
NAME READY STATUS RESTARTS AGE
percona-server-mysql-operator-7bb68f7b6d-tsvbv 1/1 Running 0 30m
cluster1-orc-0 2/2 Running 0 28m
cluster1-orc-1 2/2 Running 0 28m
cluster1-mysql-0 3/3 Running 0 28m
cluster1-haproxy-0 2/2 Running 0 27m
cluster1-haproxy-1 2/2 Running 0 26m
cluster1-haproxy-2 2/2 Running 0 26m
cluster1-orc-2 2/2 Running 0 27m
cluster1-mysql-1 3/3 Running 2 (26m ago) 27m
cluster1-mysql-2 3/3 Running 2 (25m ago) 26m
There is a lot of stuff going on, but remember we just installed three Percona Server for MySQL servers combined in the replication setup.
What else is there? Orchestrator (three instances for HA) and HAProxy (also three instances).
We need Orchestrator to handle replication failover: if the primary MySQL node is killed or evicted, Orchestrator elects a replica and promotes it to primary.
HAProxy provides a single point of entry and directs traffic to the primary, no matter which pod is the primary right now.
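As a quick sanity check before connecting, the READY column of the pod listing above can be verified by script rather than by eye. The following is a minimal sketch, not part of the Operator: all_pods_ready is a hypothetical helper name, and it assumes the default kubectl get pods column layout, with READY in the second column.

```shell
# Hypothetical helper (not part of the Operator or kubectl): reads
# `kubectl get pods` output on stdin and succeeds only if every pod
# reports all of its containers ready (e.g. 3/3, not 2/3).
all_pods_ready() {
  awk '
    # Only consider rows whose second column looks like "ready/total".
    $2 ~ /^[0-9]+\/[0-9]+$/ {
      seen++
      split($2, r, "/")
      if (r[1] != r[2]) bad++
    }
    # Fail if any pod is not fully ready, or if no pod rows were seen.
    END { exit (bad > 0 || seen == 0) ? 1 : 0 }
  '
}

# Example usage against a live cluster (assumes the same namespace as above):
#   kubectl get pods -n mysql | all_pods_ready && echo "all pods ready"
```

The function only inspects text, so it can also be run against a saved copy of the listing.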
How to connect to MySQL?
For connectivity, Kubernetes offers service endpoints, which we can list with:
kubectl get svc
cluster1-mysql-primary ClusterIP 10.43.162.118 <none> 3306/TCP,33062/TCP,33060/TCP,6033/TCP 40m
cluster1-mysql-unready ClusterIP None <none> 3306/TCP,33062/TCP,33060/TCP,6033/TCP 40m
cluster1-mysql ClusterIP None <none> 3306/TCP,33062/TCP,33060/TCP,6033/TCP 40m
cluster1-orc ClusterIP None <none> 3000/TCP,10008/TCP 40m
cluster1-orc-0 ClusterIP 10.43.242.81 <none> 3000/TCP,10008/TCP 40m
cluster1-orc-1 ClusterIP 10.43.69.105 <none> 3000/TCP,10008/TCP 40m
cluster1-orc-2 ClusterIP 10.43.184.202 <none> 3000/TCP,10008/TCP 40m
cluster1-haproxy ClusterIP 10.43.150.69 <none> 3306/TCP,3307/TCP,3309/TCP 40m
Primarily we are looking for cluster1-haproxy if we have HAProxy enabled and cluster1-mysql-primary if we work without HAProxy. These are hostnames we can use to connect from inside Kubernetes, but to connect from outside of Kubernetes we will need to do some extra work: expose services.
Typically this is done using NodePort, but before exposing the primary MySQL node, let’s do a little undocumented hack and expose Orchestrator GUI so we can see the current replication topology:
kubectl expose service cluster1-orc --type=NodePort --port=3000 --name=orc
service/orc exposed
The new service now appears in the list of services:
orc NodePort 10.43.87.69 <none> 3000:30924/TCP 49s
The Orchestrator port is now exposed as 30924, and we can use the Kubernetes master IP address (10.30.2.34 in our case) with this port to open the Orchestrator GUI in a browser.
The topology shown there confirms that the primary MySQL node is cluster1-mysql-0.
Now we will try to connect to MySQL.
Step 1. Exposing HAProxy service:
kubectl expose service cluster1-haproxy --type=NodePort --port=3306 --name=mysql-ha
service/mysql-ha exposed
mysql-ha NodePort 10.43.21.250 <none> 3306:31687/TCP 57s
Step 2. Figuring out MySQL password:
kubectl get secrets cluster1-secrets -ojson | jq -r .data.root | base64 -d
root_password
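Kubernetes stores Secret values base64-encoded, which is why the command above ends with base64 -d. The sketch below illustrates the round trip with the placeholder password from this post and shows a jq-free variant in a comment. decode_secret_value is a hypothetical helper name, and the -d flag assumes GNU coreutils base64.

```shell
# Hypothetical helper: decode one base64-encoded Secret value.
decode_secret_value() {
  printf '%s' "$1" | base64 -d
}

# Round trip with the placeholder password used in this post.
encoded=$(printf '%s' 'root_password' | base64)
decode_secret_value "$encoded"   # prints root_password
echo

# jq-free variant against a live cluster (same secret and key as above):
#   kubectl get secret cluster1-secrets -o jsonpath='{.data.root}' | base64 -d
```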
Step 3. Creating a database for benchmark:
kubectl exec -it cluster1-mysql-0 -- mysql -uroot -proot_password
create database sbtest;
Step 4. Preparing sysbench database:
sysbench --db-driver=mysql --threads=4 --mysql-host=10.30.2.34 --mysql-port=31687 --mysql-user=root --mysql-password=root_password --mysql-db=sbtest /usr/share/sysbench/oltp_point_select.lua --report-interval=1 --table-size=1000000 prepare
Here are the connection parameters,
--mysql-host=10.30.2.34 --mysql-port=31687 --mysql-user=root --mysql-password=root_password
where 10.30.2.34 is our Kubernetes master node IP and 31687 is the exposed HAProxy port.
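The port in these parameters is assigned by Kubernetes when the service is exposed, so it will differ from cluster to cluster. As a rough sketch, it can be read by script instead of copied by hand. node_port_of is a hypothetical helper that parses the PORT(S) column text; the commented kubectl jsonpath query is an alternative that asks the API directly and assumes the service is named mysql-ha as above.

```shell
# Hypothetical helper: given a PORT(S) value such as "3306:31687/TCP",
# print the node port (the number between ":" and "/").
# Prints nothing if the value has no node port part.
node_port_of() {
  printf '%s\n' "$1" | sed -n -E 's|^[0-9]+:([0-9]+)/.*$|\1|p'
}

node_port_of "3306:31687/TCP"   # prints 31687

# Alternative against a live cluster:
#   kubectl get svc mysql-ha -o jsonpath='{.spec.ports[0].nodePort}'
```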
Now we can run a sysbench benchmark against the same sbtest database:
sysbench --db-driver=mysql --threads=4 --mysql-host=10.30.2.34 --mysql-port=31687 --mysql-user=root --mysql-password=root_password --mysql-db=sbtest /usr/share/sysbench/oltp_point_select.lua --report-interval=1 --table-size=1000000 --mysql-ignore-errors=all --time=3600 run
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 4
Report intermediate results every 1 second(s)
Initializing random number generator from current time
Initializing worker threads...
Threads started!
[ 1s ] thds: 4 tps: 10573.86 qps: 10573.86 (r/w/o: 10573.86/0.00/0.00) lat (ms,95%): 0.53 err/s: 0.00 reconn/s: 0.00
[ 2s ] thds: 4 tps: 11219.62 qps: 11219.62 (r/w/o: 11219.62/0.00/0.00) lat (ms,95%): 0.51 err/s: 0.00 reconn/s: 0.00
[ 3s ] thds: 4 tps: 11196.11 qps: 11196.11 (r/w/o: 11196.11/0.00/0.00) lat (ms,95%): 0.52 err/s: 0.00 reconn/s: 0.00
[ 4s ] thds: 4 tps: 11555.85 qps: 11555.85 (r/w/o: 11555.85/0.00/0.00) lat (ms,95%): 0.50 err/s: 0.00 reconn/s: 0.00
[ 5s ] thds: 4 tps: 11002.38 qps: 11002.38 (r/w/o: 11002.38/0.00/0.00) lat (ms,95%): 0.52 err/s: 0.00 reconn/s: 0.00
[ 6s ] thds: 4 tps: 11450.22 qps: 11450.22 (r/w/o: 11450.22/0.00/0.00) lat (ms,95%): 0.51 err/s: 0.00 reconn/s: 0.00
[ 7s ] thds: 4 tps: 11477.98 qps: 11477.98 (r/w/o: 11477.98/0.00/0.00) lat (ms,95%): 0.50 err/s: 0.00 reconn/s: 0.00
[ 8s ] thds: 4 tps: 11481.14 qps: 11481.14 (r/w/o: 11481.14/0.00/0.00) lat (ms,95%): 0.52 err/s: 0.00 reconn/s: 0.00
[ 9s ] thds: 4 tps: 11603.96 qps: 11603.96 (r/w/o: 11603.96/0.00/0.00) lat (ms,95%): 0.52 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 4 tps: 11554.07 qps: 11554.07 (r/w/o: 11554.07/0.00/0.00) lat (ms,95%): 0.51 err/s: 0.00 reconn/s: 0.00
Now we want to see how our cluster will handle the primary MySQL node failure.
For this, we will run sysbench and, while it is running, kill the primary pod:
kubectl delete pods cluster1-mysql-0 --grace-period=0 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "cluster1-mysql-0" force deleted
And this is what happened with sysbench:
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 4
Report intermediate results every 1 second(s)
Initializing random number generator from current time
Initializing worker threads...
Threads started!
[ 1s ] thds: 4 tps: 11904.89 qps: 11904.89 (r/w/o: 11904.89/0.00/0.00) lat (ms,95%): 0.39 err/s: 0.00 reconn/s: 0.00
[ 2s ] thds: 4 tps: 12179.00 qps: 12179.00 (r/w/o: 12179.00/0.00/0.00) lat (ms,95%): 0.37 err/s: 0.00 reconn/s: 0.00
[ 3s ] thds: 4 tps: 12344.97 qps: 12344.97 (r/w/o: 12344.97/0.00/0.00) lat (ms,95%): 0.37 err/s: 0.00 reconn/s: 0.00
[ 4s ] thds: 4 tps: 12583.93 qps: 12583.93 (r/w/o: 12583.93/0.00/0.00) lat (ms,95%): 0.35 err/s: 0.00 reconn/s: 0.00
[ 5s ] thds: 4 tps: 12288.16 qps: 12288.16 (r/w/o: 12288.16/0.00/0.00) lat (ms,95%): 0.37 err/s: 0.00 reconn/s: 0.00
[ 6s ] thds: 4 tps: 11970.54 qps: 11970.54 (r/w/o: 11970.54/0.00/0.00) lat (ms,95%): 0.37 err/s: 0.00 reconn/s: 0.00
[ 7s ] thds: 4 tps: 12247.29 qps: 12247.29 (r/w/o: 12247.29/0.00/0.00) lat (ms,95%): 0.36 err/s: 0.00 reconn/s: 0.00
[ 8s ] thds: 4 tps: 12364.22 qps: 12364.22 (r/w/o: 12364.22/0.00/0.00) lat (ms,95%): 0.36 err/s: 0.00 reconn/s: 0.00
[ 9s ] thds: 4 tps: 10705.93 qps: 10705.93 (r/w/o: 10705.93/0.00/0.00) lat (ms,95%): 0.37 err/s: 3.00 reconn/s: 0.00
[ 10s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 11s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 12s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 13s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 14s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 15s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 16s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 17s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 18s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 19s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 21s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 22s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 23s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 24s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 25s ] thds: 4 tps: 11970.08 qps: 11970.08 (r/w/o: 11970.08/0.00/0.00) lat (ms,95%): 0.39 err/s: 0.00 reconn/s: 4.00
[ 26s ] thds: 4 tps: 13008.25 qps: 13008.25 (r/w/o: 13008.25/0.00/0.00) lat (ms,95%): 0.38 err/s: 0.00 reconn/s: 0.00
[ 27s ] thds: 4 tps: 13099.60 qps: 13099.60 (r/w/o: 13099.60/0.00/0.00) lat (ms,95%): 0.36 err/s: 0.00 reconn/s: 0.00
[ 28s ] thds: 4 tps: 12875.61 qps: 12875.61 (r/w/o: 12875.61/0.00/0.00) lat (ms,95%): 0.37 err/s: 0.00 reconn/s: 0.00
[ 29s ] thds: 4 tps: 13019.67 qps: 13019.67 (r/w/o: 13019.67/0.00/0.00) lat (ms,95%): 0.37 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 4 tps: 12904.84 qps: 12904.84 (r/w/o: 12904.84/0.00/0.00) lat (ms,95%): 0.38 err/s: 0.00 reconn/s: 0.00
[ 31s ] thds: 4 tps: 12727.94 qps: 12727.94 (r/w/o: 12727.94/0.00/0.00) lat (ms,95%): 0.38 err/s: 0.00 reconn/s: 0.00
[ 32s ] thds: 4 tps: 12683.05 qps: 12683.05 (r/w/o: 12683.05/0.00/0.00) lat (ms,95%): 0.38 err/s: 0.00 reconn/s: 0.00
[ 33s ] thds: 4 tps: 12494.87 qps: 12494.87 (r/w/o: 12494.87/0.00/0.00) lat (ms,95%): 0.39 err/s: 0.00 reconn/s: 0.00
[ 34s ] thds: 4 tps: 12670.94 qps: 12670.94 (r/w/o: 12670.94/0.00/0.00) lat (ms,95%): 0.38 err/s: 0.00 reconn/s: 0.00
So we can observe about 15 seconds of downtime (the 10s through 24s intervals show zero throughput). During this time, Orchestrator promoted a working replica to primary and traffic was redirected to it, which we can also see in the Orchestrator GUI.
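The length of the outage can also be counted directly from the sysbench report instead of read off by eye. The sketch below assumes the report was saved to a file and that --report-interval=1 was used, so each zero-throughput line corresponds to one second. stalled_seconds is a hypothetical helper name.

```shell
# Hypothetical helper: count sysbench report lines with zero throughput.
# With --report-interval=1, each such line is one second of downtime.
stalled_seconds() {
  awk '/tps: 0\.00 / { n++ } END { print n + 0 }'
}

# Example usage, assuming the run output was saved to sysbench.log:
#   stalled_seconds < sysbench.log
```

For the run shown above, the zero-throughput lines are the 10s through 24s intervals, which is 15 lines.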
Immediately after the mysql-0 pod failure:
We can see the topology shrank to two nodes and mysql-2 was elected as primary.
After the mysql-0 pod healed:
mysql-0 rejoined the cluster, now as a replica of the mysql-2 primary.
Now let’s summarize what happened:
- The primary node became unavailable
- Sysbench traffic paused
- Orchestrator diagnosed the primary node failure, elected mysql-2 as the new primary, and reconfigured replication
- Sysbench was able to continue the workload
- The mysql-0 pod recovered and rejoined the cluster; Orchestrator attached it with a replica role
- The HAProxy endpoint (cluster1-haproxy) continued to serve as the single point of entry
All this was handled AUTOMATICALLY by Percona Server for MySQL Operator without human interaction!
Why not give Percona Operator for MySQL based on Percona Server for MySQL a try and share feedback on your experience?