Mar
15
2019
--

MySQL Ripple: The First Impression of a MySQL Binlog Server

MySQL Ripple

MySQL RippleJust about a month ago, Pavel Ivanov released Ripple under the Apache-2.0 license. Ripple is a MySQL binlog server: software which receives binary logs from MySQL or MariaDB servers and delivers them to another MySQL or MariaDB server. Practically ,this is an intermediary master which does not store any data, except the binary logs themselves, and does not apply events. This solution allows saving of a lot of resources on the server, which acts only as a middle-man between the master and its actual slave(s).

The intermediary server, keeping binary logs only and not doing any other job, is a prevalent use case which allows us to remove IO (binlog read) and network (binlog retrieval via network) load from the actual master and free its resources for updates. The intermediary master, which does not do any work, distributes binary logs to slaves connected to it. This way you can have an increased number of slaves, attached to such a server, without affecting the application, running updates.

Currently, users exploit the Blackhole storage engine to emulate similar behavior. But Blackhole is just a workaround: it still executes all the events in the binary logs, requires valid MySQL installation, and has a lot of issues. Such a pain!

Therefore a new product which can do the same job and is released with an open source license is something worth trying.

A simple test

For this blog, I did a simple test. First, I installed it as described in the README file. Instructions are pretty straightforward, and I successfully built the server on my Ubuntu 18.04.2 LTS laptop. Guidelines suggest to install

libmariadbclient-dev

, and I replaced

libmysqlclient-dev

which I had already on my machine. Probably this was not needed, but since the tool claims to support both MySQL and MariaDB binary log formats, I preferred to install the MariaDB client.

There is no manual of usage instructions. However, the tool supports

-help

  command, and it is, again, straightforward.

The server can be started with options:

$./bazel-bin/rippled -ripple_datadir=./data -ripple_master_address=127.0.0.1 -ripple_master_port=13001 -ripple_master_user=root -ripple_server_ports=15000

Where:

  • -ripple-datadir

     : datadir where Ripple stores binary logs

  • -ripple_master_address

     : master host

  • -ripple_master_port

     : master port

  • -ripple_master_user

     : replication user

  • -ripple_server_ports

     : comma-separated ports which Ripple will listen

I did not find an option for securing binary log retrieval. The slave can connect to the Ripple server with any credentials. Have this in mind when deploying Ripple in production.

Now, let’s run a simple test. I have two servers. Both running on localhost, one with port 13001 (master) and another one on port 13002 (slave). The command line which I used to start

rippled

 , points to the master. Binary logs are stored in the data directory:

$ ls -l data/
total 14920
-rw-rw-r-- 1 sveta sveta 15251024 Mar 6 01:43 binlog.000000
-rw-rw-r-- 1 sveta sveta 71 Mar 6 00:50 binlog.index

I pointed the slave to the Ripple server with the command

mysql> change master to master_host='127.0.0.1',master_port=15000, master_user='ripple';
Query OK, 0 rows affected, 1 warning (0.02 sec)

Then started the slave.

On the master, I created the database

sbtest

  and ran sysbench

oltp_read_write.lua

test for a single table. After some time, I stopped the load and checked the content of the table on master and slave:

master> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 10000 |
+----------+
1 row in set (0.08 sec)
master> checksum table sbtest1;
+----------------+------------+
| Table | Checksum |
+----------------+------------+
| sbtest.sbtest1 | 4162333567 |
+----------------+------------+
1 row in set (0.11 sec)
slave> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 10000 |
+----------+
1 row in set (0.40 sec)
slave> checksum table sbtest1;
+----------------+------------+
| Table | Checksum |
+----------------+------------+
| sbtest.sbtest1 | 1797645970 |
+----------------+------------+
1 row in set (0.13 sec)
slave> checksum table sbtest1;
+----------------+------------+
| Table | Checksum |
+----------------+------------+
| sbtest.sbtest1 | 4162333567 |
+----------------+------------+
1 row in set (0.10 sec)

It took some time for the slave to catch up, but everything was applied successfully.

Ripple has nice verbose logging:

$ ./bazel-bin/rippled -ripple_datadir=./data -ripple_master_address=127.0.0.1 -ripple_master_port=13001 -ripple_master_user=root -ripple_server_ports=15000
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0306 15:57:13.641451 27908 rippled.cc:48] InitPlugins
I0306 15:57:13.642007 27908 rippled.cc:60] Setup
I0306 15:57:13.642937 27908 binlog.cc:307] Starting binlog recovery
I0306 15:57:13.644090 27908 binlog.cc:350] Scanning binlog file: binlog.000000
I0306 15:57:13.872016 27908 binlog.cc:417] Binlog recovery complete
binlog file: binlog.000000, offset: 15251088, gtid: 6ddac507-3f90-11e9-8ee9-00163e000000:0-0-7192
I0306 15:57:13.872050 27908 rippled.cc:106] Recovered binlog
I0306 15:57:13.873811 27908 mysql_server_port_tcpip.cc:150] Listen on host: localhost, port: 15000
I0306 15:57:13.874282 27908 rippled.cc:62] Start
I0306 15:57:13.874511 27910 mysql_master_session.cc:181] Master session starting
I0306 15:57:13.882601 27910 mysql_client_connection.cc:148] connected to host: 127.0.0.1, port: 13001
I0306 15:57:13.895349 27910 mysql_master_session.cc:137] Connected to host: 127.0.0.1, port: 13001, server_id: 1, server_name:
W0306 15:57:13.898556 27910 mysql_master_session.cc:197] master does not support semi sync
I0306 15:57:13.898583 27910 mysql_master_session.cc:206] start replicating from '6ddac507-3f90-11e9-8ee9-00163e000000:0-0-7192'
I0306 15:57:13.899031 27910 mysql_master_session.cc:229] Master session entering main loop
I0306 15:57:13.899550 27910 binlog.cc:626] Update binlog position to end_pos: binlog.000000:15251152, gtid: 0-0-7192
I0306 15:57:13.899572 27910 binlog.cc:616] Skip writing event [ Previous_gtids len = 67 ]
I0306 15:57:13.899585 27910 binlog.cc:626] Update binlog position to end_pos: binlog.000000:15251152, gtid: 0-0-7192
...

Conclusion

it may be good to run more tests before using Ripple in production, and to explore its other options, but from a first view it seems to be a very nice and useful product.


Photo by Kishor on Unsplash

Jun
21
2013
--

Changing an async slave of a PXC cluster to a new Master

Async and PXC

A common question I get about Percona XtraDB Cluster is if you can mix it with asynchronous replication, and the answer is yes!  You can pick any node in your cluster and it can either be either a slave or a master just like any other regular MySQL standalone server (Just be sure to use log-slave-updates in both cases on the node in question!).  Consider this architecture:

Canvas 1

However, there are some caveats to be aware of.  If you slave from a cluster node, there is no built in mechanism to fail that slave over automatically to another master node in your cluster.  You cannot assume that the binary log positions are the same on all nodes in your cluster (even if they start binary logging at the same time), so you can’t issue a CHANGE MASTER without knowing the proper binary log position to start at.

Canvas 2

Until recently, I thought it was not possible to easily do this without scanning both the relay log on the slave and the binary log on the new master and trying to match up the (binary) replication event payloads somehow.  It turns out it’s actually quite easy to do without the old master being available and without pausing writes on your cluster.

Galera Seqnos and Binary log Xids

Given the above scenario, we have a 2 node cluster with node3 async replicating from node1.  Writes are going to node2.  If I stop writes briefly and look at node1′s binary log, I can see something interesting:

[root@node1 mysql]# mysqlbinlog node1.000004  | tail
MzkzODgyMjg3NDctMjQwNzE1NDU1MjEtNTkxOTQwMzk0MTYtMjU3MTQxNDkzMzI=
'/*!*/;
# at 442923
#130620 11:07:00 server id 2  end_log_pos 442950 	Xid = 86277
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

Notice the Xid field. Galera is modifying this field that exists in a standard binlog with its last commited seqno:

[root@node1 ~]# mysql -e "show global status like 'wsrep_last_committed'";
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| wsrep_last_committed | 86277 |
+----------------------+-------+

If I check node2, sure enough its latest binlog has the same Xid, even though it’s in a different file at a different position:

[root@node2 mysql]# mysqlbinlog node2.000002  | tail
MzkzODgyMjg3NDctMjQwNzE1NDU1MjEtNTkxOTQwMzk0MTYtMjU3MTQxNDkzMzI=
'/*!*/;
# at 162077
#130620 11:07:00 server id 2  end_log_pos 162104 	Xid = 86277
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

So, with both nodes doing log-bin (and both with log-slave-updates), even though they are on different log files, they still have the same Xid that is very easy to search for.

But, does our slave (node3) know about these Xids? If we check the latest relay log:

[root@node3 mysql]# mysqlbinlog node3-relay-bin.000002  | tail
MzkzODgyMjg3NDctMjQwNzE1NDU1MjEtNTkxOTQwMzk0MTYtMjU3MTQxNDkzMzI=
'/*!*/;
# at 442942
#130620 11:07:00 server id 2  end_log_pos 442950 	Xid = 86277
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

Yes, indeed! We’re assuming the SQL thread on this slave has processed everything in the relay log and, if it has, then we know that the Galera transaction seqno 86277 was the last thing written on the slave. If it hasn’t, you’d have to find the last position applied by the SQL thread using SHOW SLAVE STATUS and find the associated Xid with that position.

Testing it out

Now, let’s simulate a failure to see if we can use this information to our advantage. I have write load running against node2 and I kill node1:

root@node2 mysql]# myq_status wsrep
mycluster / node2 / Galera 2.5(r150)
Wsrep    Cluster  Node     Queue   Ops     Bytes     Flow    Conflct PApply        Commit
    time P cnf  #  cmt sta  Up  Dn  Up  Dn   Up   Dn pau snt lcf bfa dst oooe oool wind
11:13:33 P  24  2 Sync T/T   0   0  2k  11 2.8M  258 0.0   0   0   0   9    0    0    1
11:13:34 P  24  2 Sync T/T   0   0  10   0  16K    0 0.0   0   0   0   9    0    0    1
11:13:35 P  24  2 Sync T/T   0   0  14   0  22K    0 0.0   0   0   0   8    0    0    1
11:13:36 P  24  2 Sync T/T   0   0  10   0  16K    0 0.0   0   0   0   8    0    0    1
11:13:37 P  24  2 Sync T/T   0   0   7   0  11K    0 0.0   0   0   0   8    0    0    1
11:13:38 P  24  2 Sync T/T   0   0   5   0 7.5K    0 0.0   0   0   0   8    0    0    1
11:13:39 P  24  2 Sync T/T   0   0   6   0 9.3K    0 0.0   0   0   0   8    0    0    1
11:13:40 P  24  2 Sync T/T   0   0  10   0  15K    0 0.0   0   0   0   8    0    0    1
11:13:41 P  24  2 Sync T/T   0   0   8   0  12K    0 0.0   0   0   0   8    0    0    1
11:13:42 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:43 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:44 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:45 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:47 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:48 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
mycluster / node2 / Galera 2.5(r150)
Wsrep    Cluster  Node     Queue   Ops     Bytes     Flow    Conflct PApply        Commit
    time P cnf  #  cmt sta  Up  Dn  Up  Dn   Up   Dn pau snt lcf bfa dst oooe oool wind
11:13:49 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:50 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:51 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:52 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:53 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:54 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:55 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:56 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:57 P  24  2 Sync T/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:13:58 N 615  1 Init F/T   0   0   0   2    0  234 0.0   0   0   0   8    0    0    0
11:13:59 N 615  1 Init F/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0
11:14:00 N 615  1 Init F/T   0   0   0   0    0    0 0.0   0   0   0   8    0    0    0

The node failure takes a little while. This is because it was a 2 node cluster and we just lost quorum. Note the node goes to the Init state and it is also non-Primary. This also happened to kill our test client. If we check our slave, we can see it is indeed disconnected since node1 went away:

node3 mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Reconnecting after a failed master event read
                  Master_Host: node1
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node1.000004
          Read_Master_Log_Pos: 3247513
               Relay_Log_File: node3-relay-bin.000002
                Relay_Log_Pos: 3247532
        Relay_Master_Log_File: node1.000004
             Slave_IO_Running: Connecting
            Slave_SQL_Running: Yes
...

Since our remaining node is non-Primary, we need to bootstrap it first and restart our load. (Note: This is only because it was a 2 node cluster, this is not strictly necessary to do this procedure in general).

[root@node2 mysql]# mysql -e 'set global wsrep_provider_options="pc.bootstrap=true"'
[root@node2 mysql]# myq_status wsrep
mycluster / node2 / Galera 2.5(r150)
Wsrep    Cluster  Node     Queue   Ops     Bytes     Flow    Conflct PApply        Commit
    time P cnf  #  cmt sta  Up  Dn  Up  Dn   Up   Dn pau snt lcf bfa dst oooe oool wind
11:16:46 P  25  1 Sync T/T   0   0  2k  17 3.0M 1021 0.0   0   0   0   5    0    0    1
11:16:47 P  25  1 Sync T/T   0   0  10   0  16K    0 0.0   0   0   0   5    0    0    1
11:16:48 P  25  1 Sync T/T   0   0  13   0  20K    0 0.0   0   0   0   5    0    0    1
11:16:49 P  25  1 Sync T/T   0   0  10   0  15K    0 0.0   0   0   0   5    0    0    1
11:16:50 P  25  1 Sync T/T   0   0   8   0  11K    0 0.0   0   0   0   5    0    0    1
11:16:51 P  25  1 Sync T/T   0   0  15   0  23K    0 0.0   0   0   0   5    0    0    1
11:16:52 P  25  1 Sync T/T   0   0  11   0  17K    0 0.0   0   0   0   6    0    0    1
11:16:53 P  25  1 Sync T/T   0   0  10   0  15K    0 0.0   0   0   0   6    0    0    1

Ok, so our cluster of 1 is back up and taking writes. But, how can we CHANGE MASTER so node3 replicates from node1? First we need to find node3′s last received Xid:

[root@node3 mysql]# mysqlbinlog node3-relay-bin.000002 | tail
MDk5MjkxNDY3MTUtMDkwMzM3NTg2NjgtMTQ0NTI2MDQ5MDItNzgwOTA5MTcyNTc=
'/*!*/;
# at 3247505
#130620 11:13:41 server id 2  end_log_pos 3247513 	Xid = 88085
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

We can tell from SHOW SLAVE STATUS that all of the relay log was processed, so we know the last applied Xid was 88085. Checking node2, we can indeed tell the cluster has progressed further:

[root@node2 mysql]# mysql -e "show global status like 'wsrep_last_committed'"
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| wsrep_last_committed | 88568 |
+----------------------+-------+

It’s easy to grep for node3′s last Xid in node2′s binlog (if the slave has been disconnected a while, you may have to search a bit more through the binlogs):

[root@node2 mysql]# mysqlbinlog node2.000002 | grep "Xid = 88085"
#130620 11:13:41 server id 2  end_log_pos 2973899 	Xid = 88085

So, on node2, Xid 88085 is in binary log node2.000002, position 2973899.  2973899 is the “end” position, so that should be where we start from on the slave.  Back on node3 we can now issue a correct CHANGE MASTER:

node3 mysql> slave stop;
Query OK, 0 rows affected (0.01 sec)
node3 mysql> change master to master_host='node2', master_log_file='node2.000002', master_log_pos=2973899;
Query OK, 0 rows affected (0.02 sec)
node3 mysql> slave start;
Query OK, 0 rows affected (0.00 sec)
node3 mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: node2
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node2.000002
          Read_Master_Log_Pos: 7072423
               Relay_Log_File: node3-relay-bin.000002
                Relay_Log_Pos: 4098705
        Relay_Master_Log_File: node2.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
....

Of course, we are still scanning relay logs and binary logs, but matching Xid numbers is a lot easier than trying to match RBR payloads.

Summary

  • Galera writes the seqno (part of its GTID) to node’s binary logs (if enabled) in the Xid field
  • Xid is consistent across all nodes in a cluster, regardless of individual binary log position
  • This Xid field is propagated to slaves and can be matched up in their relay logs
  • Those slaves that are also binary logging with log-slave-updates will overwrite Xid, so you this isn’t a true GTID for async in 5.5.

The post Changing an async slave of a PXC cluster to a new Master appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com