Oct 29, 2015

MobileIron Posts Lower-Than-Expected FQ3 Loss Of $0.20 Per Share Off Revenue Of $38M

Following the bell, device management shop MobileIron announced its fiscal third quarter financial performance, including revenue of $38 million and an adjusted loss of $0.20 per share. Using normal accounting techniques, the company lost a stiffer $0.30 per share.
Investors had expected MobileIron to lose an adjusted $0.20 per share, off of $37.61 million in revenue. The company’s… Read More

Oct 29, 2015

Slack Introduces User Groups To Let Larger Teams Collaborate Way More Effectively

A great deal of Slack’s success has come from its ability to allow companies to communicate internally in a fluid and transparent manner. Now the enterprise messaging service is looking to enhance the experience of how larger organizations function through the introduction of User Groups. Read More

Oct 29, 2015

Microsoft Integrates Skype With Office Online, Rolls Out An Office Chrome Extension

Microsoft today announced its plans to extend the social features of Office by integrating Skype into Office Online, allowing users to communicate by voice and video chat within Word, Excel, PowerPoint and OneNote Online documents, as well as in Outlook.com. With the integration, users will be able to see and chat with Skype contacts right from within their file, which Microsoft says will… Read More

Oct 29, 2015

MySQL and Trojan.Chikdos.A

Symantec published a blog post yesterday regarding MySQL and Trojan.Chikdos.A, which can be seen here.

The Symantec post gives detail on the behavior of the Trojan and its effects on the Windows system registry, yet gives little detail as to how the required first stage (namely a malicious UDF) is injected, stating:

“In the latest Chikdos campaign that we observed, the attackers likely used an automated scanner or possibly a worm to compromise MySQL servers and install the UDF.”

I’m going to give my thoughts on the pre-requisites to infection here.

  1. The MySQL server has poor network isolation (i.e. it is likely accessible from anywhere; see the shodan.io examples)
  2. The MySQL user has poor (read: overly permissive) grants, e.g. “GRANT ALL”
  3. The MySQL @@GLOBAL.plugin_dir is writable (likely a poor file ACL and no MAC such as SELinux / AppArmor)
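A quick, hedged way to check points 2 and 3 from an existing session (point 1 is a network/firewall question); the shell one-liner and paths are illustrative only:

-- Which grants does the application user actually have? Look for GRANT ALL / FILE.
SHOW GRANTS FOR CURRENT_USER();
-- Where would a malicious UDF have to be written?
SELECT @@GLOBAL.plugin_dir;
# On the host: that directory should not be writable by the mysqld user
ls -ld $(mysql -NBe 'SELECT @@GLOBAL.plugin_dir')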

This is essentially the same method I discussed at PLUK13 and, in updated form, again at PLMCE14. There are YouTube videos embedded in each showing the attack against a Linux deployment. The above 3 pre-requisite points, however, should still hold true on Windows.

So what is the real takeaway here?

  1. Ensure you properly isolate your MySQL service on the network, allowing access only from known hosts that require it.
    1. There really is no reason at all to have the service accessible from everywhere on the Internet.
  2. Ensure your user grants follow the Principle of Least Privilege; that is, grant only the access required for the user to do the job it is supposed to do.
  3. Never EVER chmod 777 / setenforce 0

I can see that last point raising some arguments; please keep in mind we’re talking about the security of your MySQL data here. Simply turning off security measures “to make it work” is a recipe for disaster.
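To make point 2 concrete, a minimal sketch of the difference (the database, user and host names are made up for illustration):

-- Overly permissive: everything, everywhere, including the FILE privilege that UDF injection relies on
GRANT ALL PRIVILEGES ON *.* TO 'app'@'%';
-- Least privilege: only what the application actually uses, and only from known hosts
GRANT SELECT, INSERT, UPDATE, DELETE ON appdb.* TO 'app'@'10.0.0.%';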


The post MySQL and Trojan.Chikdos.A appeared first on MySQL Performance Blog.

Oct 29, 2015

Spacio Pro Plans To Supercharge Open House Showings With A Smart App

If software is eating the world, it’s also eating our houses. The real estate market changed radically when companies like Zillow unpacked the data around house sales. In theory you’d think there wasn’t much to do on house sales, but there remains a relatively untapped area. One of the next areas of change could be in the ‘open house’ market, where houses… Read More

Oct 29, 2015

MySQL 5.7 first impressions on group-replication

During the last few weeks I’ve been testing and playing a bit with the new group-replication plugin available for MySQL 5.7. Before continuing, I’d like to clarify some aspects: the plugin is only available in labs and is not yet ready for production. The current version is 0.6. I used 5.7.9 GA running in a 3-node Vagrant cluster on CentOS 7.
As an additional note, I tested the previous version of the plugin (0.5) against 5.7.8-rc, and there have been some good changes since then, so I recommend starting with the GA version.

For my tests I followed the instructions from this post. It’s not as straightforward as it looks; there were some issues that needed to be handled, but I finally managed to get a consistent 3-node cluster running:

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 80cc27ce-7cb4-11e5-9055-0800275ff74d | node2 | 3306 | ONLINE |
| group_replication_applier | a0ef74a1-7cb3-11e5-845e-0800275ff74d | node1 | 3306 | ONLINE |
| group_replication_applier | e5263489-7cb7-11e5-a8ee-0800275ff74d | node3 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)

My impressions about installation

1- The previous version relied only on Corosync for group communication, meaning you needed to install and configure an extra piece of software. For those who aren’t very familiar with it (like me), that can be a bit tricky and complex. As of version 0.6.0 for MySQL 5.7.9, a new communication engine called XCom was added. It is now the default communication framework and is included in the plugin, so no extra pieces of software are needed.
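For reference, here is a minimal single-node configuration sketch using the group_replication variables that appear later in this post; the plugin file name and the general replication settings (GTIDs, row-format binlogs, TABLE repositories) are my assumptions about what the labs build expects, so treat it as a starting point rather than a recipe:

[mysqld]
# assumption: the labs plugin ships as group_replication.so
plugin-load = group_replication.so
server-id = 2
gtid-mode = ON
enforce-gtid-consistency = ON
log-bin
log-slave-updates
binlog-format = ROW
master-info-repository = TABLE
relay-log-info-repository = TABLE
# group replication settings, same names as shown further down this post
loose-group_replication_group_name = "8a94f357-aab4-11df-86ab-c80aa9429562"
loose-group_replication_start_on_boot = 1
loose-group_replication_local_address = "192.168.70.3:10300"
loose-group_replication_peer_addresses = "192.168.70.2:10300,192.168.70.3:10300"
# set to 1 only on the very first start of the very first node
loose-group_replication_bootstrap_group = 0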

2- When you initialize the MySQL database (the mysqld --initialize function now replaces the mysql_install_db script) you need to disable binary logging in the configuration file; otherwise statements such as ‘create database mysql’ will be pushed to the binary logs and cause issues for nodes joining the cluster, due to errors like:
2015-10-21T20:18:52.059231Z 8 [Warning] Slave: Can't create database 'mysql'; database exists Error_code: 1007
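A minimal sketch of that workflow, with made-up paths; the only important detail is that log-bin is not enabled while --initialize runs:

# 1. initialize the datadir without reading the normal config, so log-bin stays off
mysqld --no-defaults --initialize-insecure --datadir=/data/mysql --user=mysql
# 2. start the server with the real configuration file, where log-bin is enabled again
mysqld --defaults-file=/etc/my.cnf --user=mysql &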

3- Group replication has no concept like SST (State Snapshot Transfer), which in Galera basically drops and recreates the datadir if it finds data discrepancies. With group replication you may end up having different datasets, and replication will continue working (more on this later in the post).

4- For Incremental State Transfer (a.k.a. IST in Galera), group replication relies on the binary logs present in any of the potential donors (at the moment the donor is selected randomly). So, if a node is disconnected, when it comes back online it requests binary logs from the donor using the same IO thread as regular replication. The problem here is that if the binary log has been purged on the donor, the joiner can’t be recovered and a full rebuild is needed. This is similar to the gcache in Galera, but when the gcache cannot provide the transactions needed for IST, an SST is performed instead. Group replication can’t do this (yet?).
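A hedged way to check whether a prospective donor can still serve a given joiner from its binary logs is to compare GTID sets; GTID_SUBSET() and the two variables below are standard 5.7 features, and the joiner’s set has to be pasted in by hand here:

-- On the joiner: what has already been applied locally?
SELECT @@GLOBAL.gtid_executed;
-- On the donor: what has already been purged from its binary logs?
SELECT @@GLOBAL.gtid_purged;
-- On the donor: if everything it purged is already in the joiner's executed set, recovery can proceed
SELECT GTID_SUBSET(@@GLOBAL.gtid_purged, '<paste the joiner gtid_executed set here>') AS donor_can_serve_joiner;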

These are some of the installation issues I faced. Now that we have the cluster running, what works? Well let’s try some samples.

Simple write tests
I tried running simple write operations like a few inserts, create tables and so on using sysbench like this:

sysbench --test=/vagrant/sysb/doc/sysbench/tests/db/update_index.lua --mysql-user=root --mysql-db=test --mysql-socket=/data/mysql.sock --oltp-table-size=25000 --oltp-tables-count=8 --num-threads=8 --report-interval=1 --max-requests=0 run

Then I checked the status on the other nodes. It does what it is supposed to do; the data and records are found on the rest of the nodes, just like this:
Node1:

mysql> select * from sbtest1 order by 1 limit 4;
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id | k | c | pad |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| 8 | 12410 | 05879259692-02024081996-76022818191-82933803603-81845875017-31928300264-16934042125-67052432228-92123768050-95121478647 | 79362588344-09017007031-35411714211-52054317597-26283585383 |
| 15 | 12593 | 86109785327-21666363106-71483869798-96435839840-61691707322-34771340621-77881281887-34657680322-46195586522-19795644808 | 74925946566-05246863347-88207617885-76849666822-72200234338 |
| 22 | 12544 | 57481185690-89398636500-16888148413-67987678267-15604944838-94210794401-18107184012-91338377776-83386272438-09451188763 | 35227182905-15234265621-59793845249-15413569710-23749555118 |
| 29 | 12436 | 29279855805-99348203463-85191104223-39587263726-81794340135-73817557808-54578801760-64404111877-55434439976-37212880746 | 59222897263-22759023974-22020489960-93434521232-77981152534 |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
4 rows in set (0.08 sec)

Node2:

mysql> select * from sbtest1 order by 1 limit 4;
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id | k | c | pad |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| 8 | 12410 | 05879259692-02024081996-76022818191-82933803603-81845875017-31928300264-16934042125-67052432228-92123768050-95121478647 | 79362588344-09017007031-35411714211-52054317597-26283585383 |
| 15 | 12593 | 86109785327-21666363106-71483869798-96435839840-61691707322-34771340621-77881281887-34657680322-46195586522-19795644808 | 74925946566-05246863347-88207617885-76849666822-72200234338 |
| 22 | 12544 | 57481185690-89398636500-16888148413-67987678267-15604944838-94210794401-18107184012-91338377776-83386272438-09451188763 | 35227182905-15234265621-59793845249-15413569710-23749555118 |
| 29 | 12436 | 29279855805-99348203463-85191104223-39587263726-81794340135-73817557808-54578801760-64404111877-55434439976-37212880746 | 59222897263-22759023974-22020489960-93434521232-77981152534 |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
4 rows in set (0.08 sec)

Well, we expected this, so, yay, we are fine.
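For a slightly stronger spot check than eyeballing a few rows, the same standard statements can be run on each node and compared (identical results suggest, though do not prove, identical data):

CHECKSUM TABLE test.sbtest1;
SELECT COUNT(*), SUM(k) FROM test.sbtest1;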

What about trying to write on 2 nodes at the same time? This should trigger conflict resolution during certification; in a nutshell, if we expect to use group replication to write on multiple nodes at the same time, we need a way to resolve data conflicts. These most commonly show up as PK/UK violations, in other words, 2 transactions trying to insert the same record/id. This is not recommended because it is not an approach that scales up writes (same as Galera), but it’s still possible to do.
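To picture the conflict path, a hand-made sketch (the row id and session layout are made up; the error is the same one sysbench hits below):

-- node1> BEGIN; UPDATE sbtest1 SET k=k+1 WHERE id=42;
-- node2> BEGIN; UPDATE sbtest1 SET k=k+1 WHERE id=42;
-- node1> COMMIT;  -- certifies first and wins
-- node2> COMMIT;  -- fails certification: ERROR 1180 (HY000): Got error 149 during COMMIT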

An easier way to test is to run sysbench in more than one member of a cluster and wait for a failure. As expected, it does what it is supposed to do:
Node1:

[root@node1 data]# sysbench --test=/vagrant/sysb/doc/sysbench/tests/db/update_index.lua --mysql-user=root --mysql-db=test --mysql-socket=/data/mysql.sock --oltp-table-size=25000 --oltp-tables-count=8 --num-threads=8 --report-interval=1 --max-requests=0 run |grep tps
[ 1s] threads: 8, tps: 0.00, reads: 0.00, writes: 1416.75, response time: 25.65ms (95%), errors: 0.00, reconnects: 0.00
[ 2s] threads: 8, tps: 0.00, reads: 0.00, writes: 1893.78, response time: 20.94ms (95%), errors: 0.00, reconnects: 0.00
[ 3s] threads: 8, tps: 0.00, reads: 0.00, writes: 1421.19, response time: 28.44ms (95%), errors: 0.00, reconnects: 0.00
[ 4s] threads: 8, tps: 0.00, reads: 0.00, writes: 1397.83, response time: 34.92ms (95%), errors: 0.00, reconnects: 0.00
[ 5s] threads: 8, tps: 0.00, reads: 0.00, writes: 1734.31, response time: 22.75ms (95%), errors: 0.00, reconnects: 0.00

Node2:

[root@node2 vagrant]# sysbench --test=/vagrant/sysb/doc/sysbench/tests/db/update_index.lua --mysql-user=root --mysql-db=test --mysql-socket=/data/mysql.sock --oltp-table-size=25000 --oltp-tables-count=8 --num-threads=8 --report-interval=1 --max-requests=0 run
WARNING: Both max-requests and max-time are 0, running endless test
sysbench 0.5: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 8
Report intermediate results every 1 second(s)
Random number generator seed is 0 and will be ignored
Threads started!
ALERT: failed to execute MySQL query: `UPDATE sbtest6 SET k=k+1 WHERE id=12608`:
ALERT: Error 1180 Got error 149 during COMMIT
FATAL: failed to execute function `event': (null)
ALERT: failed to execute MySQL query: `UPDATE sbtest6 SET k=k+1 WHERE id=12468`:
ALERT: Error 1180 Got error 149 during COMMIT
FATAL: failed to execute function `event': (null)

Node2 eventually failed, but what happened? Let’s check the error log to see what it reports:

2015-10-27T17:12:33.894759Z 3 [Note] InnoDB: Blocking transaction: ID: 223804 - Blocked transaction ID: 223820 - MySQL thread id 59, OS thread handle 139838350866176, query id 197913 localhost root query end
UPDATE sbtest5 SET k=k+1 WHERE id=12510

At the commit stage there was a conflict with an already committed transaction in Node1, so it forced a failure and a rollback of the operation. So far, so good.

What about a node going down?
One of the tests I ran was to kill one of the nodes during operations, to see if it resumes replication properly when it comes back to life. For this we need to set some variables in the configuration file, as follows:

loose-group_replication_group_name="8a94f357-aab4-11df-86ab-c80aa9429562"
loose-group_replication_start_on_boot=1
loose-group_replication_recovery_user='rpl_user'
loose-group_replication_recovery_password='rpl_pass'

Note: Interestingly, the replication credentials are not saved into a table (as is done with slave_master_info in regular replication). I guess this is still on the to-do list, but it’s something to keep in mind, since it implies a security risk.
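For completeness, a sketch of how such a recovery account might be created with only the privilege it needs (the user and password match the configuration above; the host mask is my assumption):

-- run on every member; REPLICATION SLAVE is the only privilege the recovery channel needs
CREATE USER 'rpl_user'@'192.168.70.%' IDENTIFIED BY 'rpl_pass';
GRANT REPLICATION SLAVE ON *.* TO 'rpl_user'@'192.168.70.%';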

Back to our test. I ran the regular sysbench command on one of my nodes, then went to node2 and killed the mysql daemon. After the regular crash recovery messages we can see:

[root@node2 data]# killall mysqld
[root@node2 data]# tail -500 error.log
2015-10-27T17:15:26.460674Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2015-10-27T17:15:26.460711Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2015-10-27T17:15:26.461001Z 0 [Note] InnoDB: Waiting for purge to start
2015-10-27T17:15:26.514015Z 0 [Note] InnoDB: 5.7.9 started; log sequence number 77370253
2015-10-27T17:15:26.515417Z 0 [Note] Plugin 'FEDERATED' is disabled.
2015-10-27T17:15:26.525466Z 0 [Note] InnoDB: not started
2015-10-27T17:15:26.525914Z 0 [Note] InnoDB: Loading buffer pool(s) from /data/ib_buffer_pool
[XCOM_BINDING_DEBUG] ::initialize_xcom():: Configuring Xcom group: XCom Group ID=1827295128 Name=8a94f357-aab4-11df-86ab-c80aa9429562
[XCOM_BINDING_DEBUG] ::initialize_xcom():: Configured Peer Nodes: 192.168.70.2:10300
[XCOM_BINDING_DEBUG] ::initialize_xcom():: Configured Peer Nodes: 192.168.70.3:10300
[XCOM_BINDING_DEBUG] ::initialize_xcom():: Configured Total number of peers: 2
[XCOM_BINDING_DEBUG] ::initialize_xcom():: Configured Local Node: 192.168.70.3:10300
[XCOM_BINDING_DEBUG] ::initialize_xcom():: Configured Bootstrap: false
2015-10-27T17:15:26.723392Z 0 [Note] InnoDB: Buffer pool(s) load completed at 151027 17:15:26
2015-10-27T17:15:27.135089Z 0 [Warning] Failed to set up SSL because of the following SSL library error: SSL context is not usable without certificate and private key
2015-10-27T17:15:27.136449Z 0 [Note] Server hostname (bind-address): '*'; port: 3306
2015-10-27T17:15:27.145198Z 0 [Note] IPv6 is available.
2015-10-27T17:15:27.145247Z 0 [Note] - '::' resolves to '::';
2015-10-27T17:15:27.145265Z 0 [Note] Server socket created on IP: '::'.
2015-10-27T17:15:27.171615Z 0 [Warning] 'user' entry 'root@localhost' ignored in --skip-name-resolve mode.
2015-10-27T17:15:27.171711Z 0 [Warning] 'user' entry 'mysql.sys@localhost' ignored in --skip-name-resolve mode.
2015-10-27T17:15:27.172447Z 0 [Warning] 'db' entry 'sys mysql.sys@localhost' ignored in --skip-name-resolve mode.
2015-10-27T17:15:27.173089Z 0 [Warning] 'proxies_priv' entry '@ root@localhost' ignored in --skip-name-resolve mode.
2015-10-27T17:15:27.192881Z 0 [Warning] 'tables_priv' entry 'sys_config mysql.sys@localhost' ignored in --skip-name-resolve mode.
2015-10-27T17:15:27.205764Z 0 [Warning] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=node2-relay-bin' to avoid this problem.
2015-10-27T17:15:27.676222Z 1 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='<NULL>', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2015-10-27T17:15:27.685374Z 3 [Note] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log './node2-relay-bin-group_replication_applier.000002' position: 51793711
2015-10-27T17:15:27.685985Z 0 [Note] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2015-10-27T17:15:27.686009Z 0 [Note] Plugin group_replication reported: 'auto_increment_increment is set to 7'
2015-10-27T17:15:27.686017Z 0 [Note] Plugin group_replication reported: 'auto_increment_offset is set to 2'
[XCOM BINDING DEBUG] ::join()
connecting to 192.168.70.3 10300
state 0 action xa_init
connecting to 192.168.70.3 10300
connected to 192.168.70.3 10300
connecting to 192.168.70.3 10300
connected to 192.168.70.3 10300
...
[XCOM BINDING DEBUG] ::join():: I am NOT the boot node.
[XCOM BINDING DEBUG] ::join():: xcom_client_open_connection to 192.168.70.2:10300
connecting to 192.168.70.2 10300
connected to 192.168.70.2 10300
[XCOM BINDING DEBUG] ::join():: Calling xcom_client_add_node
cli_err 0
state 3489 action xa_snapshot
new state x_recover
state 3505 action xa_complete
new state x_run
get_nodeno(get_site_def()) = 2 task_now() = 1445966128.920615 n = 0 median_time() = 1.000000 executed_msg={84bca5ce 12385 1}
 [XCOM BINDING DEBUG] ::join():: GCS_OK
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
2015-10-27T17:15:28.926806Z 0 [Note] Event Scheduler: Loaded 0 events
2015-10-27T17:15:28.928485Z 0 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.7.9-log' socket: '/data/mysql.sock' port: 3306 MySQL Community Server (GPL)
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
....
[XCOM_BINDING_DEBUG] ::process_possible_control_message():: Received a control message
[XCOM_BINDING_DEBUG] ::process_possible_control_message()::Install new view
[XCOM_BINDING_DEBUG] ::install_view():: Processing exchanged data
[XCOM_BINDING_DEBUG] ::install_view():: Processing exchanged data
[XCOM_BINDING_DEBUG] ::install_view():: Processing exchanged data
2015-10-27T17:15:30.084101Z 0 [Note] Plugin group_replication reported: 'Starting group replication recovery with view_id 46750407:7'
[XCOM_BINDING_DEBUG] ::install_view():: View delivered to client handler= 1382963399
2015-10-27T17:15:30.091414Z 5 [Note] Plugin group_replication reported: 'Establishing group recovery connection with a possible donor. Attempt 1/86400'
2015-10-27T17:15:30.107078Z 5 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='node3', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2015-10-27T17:15:30.117379Z 5 [Note] Plugin group_replication reported: 'Establishing connection to a group replication recovery donor e5263489-7cb7-11e5-a8ee-0800275ff74d at node3 port: 3306.'
2015-10-27T17:15:30.118109Z 6 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2015-10-27T17:15:30.130001Z 7 [Note] Slave SQL thread for channel 'group_replication_recovery' initialized, starting replication in log 'FIRST' at position 0, relay log './node2-relay-bin-group_replication_recovery.000001' position: 4
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12456 message_id.node= 0
[XCOM_BINDING_DEBUG] ::xcom_receive_data():: Delivered message to client handler= 684195134
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12456 message_id.node= 0
[XCOM_BINDING_DEBUG] ::xcom_receive_data():: Delivered message to client handler= 684195134
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12456 message_id.node= 0
[XCOM_BINDING_DEBUG] ::xcom_receive_data():: Delivered message to client handler= 684195134
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12456 message_id.node= 0
[XCOM_BINDING_DEBUG] ::xcom_receive_data():: Delivered message to client handler= 684195134
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12456 message_id.node= 0
[XCOM_BINDING_DEBUG] ::xcom_receive_data():: Delivered message to client handler= 684195134
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12456 message_id.node= 0
[XCOM_BINDING_DEBUG] ::xcom_receive_data():: Delivered message to client handler= 684195134
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12456 message_id.node= 0
[XCOM_BINDING_DEBUG] ::xcom_receive_data():: Delivered message to client handler= 684195134
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12456 message_id.node= 0
[XCOM_BINDING_DEBUG] ::xcom_receive_data():: Delivered message to client handler= 684195134
2015-10-27T17:15:30.169817Z 6 [ERROR] Slave I/O for channel 'group_replication_recovery': error connecting to master 'rpl_user@node3:3306' - retry-time: 60 retries: 1, Error_code: 1130
2015-10-27T17:15:30.169856Z 6 [Note] Slave I/O thread for channel 'group_replication_recovery' killed while connecting to master
2015-10-27T17:15:30.169862Z 6 [Note] Slave I/O thread exiting for channel 'group_replication_recovery', read up to log 'FIRST', position 4
2015-10-27T17:15:30.174955Z 5 [ERROR] Plugin group_replication reported: 'There was an error when connecting to the donor server. Check group replication recovery's connection credentials.'
2015-10-27T17:15:30.175573Z 5 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 2/86400'
2015-10-27T17:15:30.178016Z 5 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='node3', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='node1', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2015-10-27T17:15:30.189233Z 5 [Note] Plugin group_replication reported: 'Establishing connection to a group replication recovery donor a0ef74a1-7cb3-11e5-845e-0800275ff74d at node1 port: 3306.'
2015-10-27T17:15:30.190787Z 8 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2015-10-27T17:15:30.610531Z 8 [Note] Slave I/O thread for channel 'group_replication_recovery': connected to master 'rpl_user@node1:3306',replication started in log 'FIRST' at position 4
get_nodeno(get_site_def()) = 2 task_now() = 1445966131.000425 n = 74 (n - old_n) / (task_now() - old_t) = 11.136156

During this process we can check the status in any node as follows:

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 80cc27ce-7cb4-11e5-9055-0800275ff74d | node2 | 3306 | RECOVERING |
| group_replication_applier | a0ef74a1-7cb3-11e5-845e-0800275ff74d | node1 | 3306 | ONLINE |
| group_replication_applier | e5263489-7cb7-11e5-a8ee-0800275ff74d | node3 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 80cc27ce-7cb4-11e5-9055-0800275ff74d | node2 | 3306 | ONLINE |
| group_replication_applier | a0ef74a1-7cb3-11e5-845e-0800275ff74d | node1 | 3306 | ONLINE |
| group_replication_applier | e5263489-7cb7-11e5-a8ee-0800275ff74d | node3 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)

Again, as expected, the node connected to the cluster, requested binary logs from the latest executed GTID position, and applied the remaining changes to come back online.
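While a node sits in the RECOVERING state, a couple of hedged ways to watch its progress using standard 5.7 commands (the channel name is the one visible in the error log above):

-- on the recovering node: position and any errors of the recovery channel
SHOW SLAVE STATUS FOR CHANNEL 'group_replication_recovery'\G
-- run on the joiner and on an ONLINE member and compare, to gauge the remaining gap
SELECT @@GLOBAL.gtid_executed\G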

The final test I’ve done so far concerns data consistency. For example, what if I stop group replication on a node and make some data changes? When it rejoins the group, will it send those changes?

Let’s see a very simple example:
Node2:

mysql> select * from sbtest1 where id=15;
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id | k | c | pad |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| 15 | 12593 | 86109785327-21666363106-71483869798-96435839840-61691707322-34771340621-77881281887-34657680322-46195586522-19795644808 | 74925946566-05246863347-88207617885-76849666822-72200234338 |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
1 row in set (0.12 sec)

Node1:

mysql> select * from sbtest1 where id=15;
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id | k | c | pad |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| 15 | 12593 | 86109785327-21666363106-71483869798-96435839840-61691707322-34771340621-77881281887-34657680322-46195586522-19795644808 | 74925946566-05246863347-88207617885-76849666822-72200234338 |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> stop group_replication;
Query OK, 0 rows affected (0.03 sec)
mysql> delete from sbtest1 where id=15;
Query OK, 1 row affected (0.02 sec)
mysql> start group_replication;
Query OK, 0 rows affected (0.02 sec)
mysql> select * from sbtest1 where id=15;
Empty set (0.00 sec)

And now Node2 again:

mysql> select * from sbtest1 where id=15;
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id | k | c | pad |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| 15 | 12593 | 86109785327-21666363106-71483869798-96435839840-61691707322-34771340621-77881281887-34657680322-46195586522-19795644808 | 74925946566-05246863347-88207617885-76849666822-72200234338 |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
1 row in set (0.00 sec)

Hmmmm, not cool, what if I try to remove a row from Node2?

mysql> delete from sbtest1 where id=15;
Query OK, 1 row affected (0.01 sec)
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 80cc27ce-7cb4-11e5-9055-0800275ff74d | node2 | 3306 | ONLINE |
| group_replication_applier | a0ef74a1-7cb3-11e5-845e-0800275ff74d | node1 | 3306 | ONLINE |
| group_replication_applier | e5263489-7cb7-11e5-a8ee-0800275ff74d | node3 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)

Hmmmm, strange, everything seems to be working correctly. Is it? Let’s check node1 again:

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 80cc27ce-7cb4-11e5-9055-0800275ff74d | node2 | 3306 | ONLINE |
| group_replication_applier | a0ef74a1-7cb3-11e5-845e-0800275ff74d | node1 | 3306 | ONLINE |
| group_replication_applier | e5263489-7cb7-11e5-a8ee-0800275ff74d | node3 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.01 sec)
mysql> stop group_replication;
Query OK, 0 rows affected (4.01 sec)
mysql> start group_replication;
Query OK, 0 rows affected (2.41 sec)
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | a0ef74a1-7cb3-11e5-845e-0800275ff74d | node1 | 3306 | OFFLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
1 row in set (0.00 sec)
mysql> exit
Bye
[root@node1 data]# tail -100 error.log
2015-10-27T17:52:50.075274Z 15 [ERROR] Slave SQL for channel 'group_replication_applier': Could not execute Delete_rows event on table test.sbtest1; Can't find record in 'sbtest1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 346, Error_code: 1032
2015-10-27T17:52:50.075274Z 15 [Warning] Slave: Can't find record in 'sbtest1' Error_code: 1032
2015-10-27T17:52:50.075274Z 15 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'FIRST' position 0
2015-10-27T17:52:50.075294Z 2 [Note] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2015-10-27T17:52:50.075308Z 2 [Note] Plugin group_replication reported: 'auto_increment_increment is set to 7'
2015-10-27T17:52:50.075312Z 2 [Note] Plugin group_replication reported: 'auto_increment_offset is set to 1'
[XCOM BINDING DEBUG] ::join()
connecting to 192.168.70.2 10300
state 3489 action xa_init
connecting to 192.168.70.2 10300
....
XCOM BINDING DEBUG] ::join():: I am NOT the boot node.
[XCOM BINDING DEBUG] ::join():: Skipping own address.
[XCOM BINDING DEBUG] ::join():: xcom_client_open_connection to 192.168.70.3:10300
connecting to 192.168.70.3 10300
connected to 192.168.70.3 10300
[XCOM BINDING DEBUG] ::join():: Calling xcom_client_add_node
cli_err 0
state 3489 action xa_snapshot
new state x_recover
state 3505 action xa_complete
new state x_run
[XCOM BINDING DEBUG] ::join():: GCS_OK
[XCOM_BINDING_DEBUG] ::cb_xcom_receive_global_view():: message_id.group= 2226955726 message_id.msgno= 12585 message_id.node= 0
[XCOM_BINDING_DEBUG] ::cb_xcom_receive_global_view():: node set: peer: 0 flag: 1 peer: 1 flag: 1 peer: 2 flag: 1
[XCOM_BINDING_DEBUG] ::xcom_receive_global_view():: Processing new view on Handler
[XCOM_BINDING_DEBUG] ::xcom_receive_global_view():: My node_id is 2
[XCOM_BINDING_DEBUG] ::xcom_receive_global_view():: State Exchange started.
get_nodeno(get_site_def()) = 2 task_now() = 1445968372.450627 n = 0 median_time() = 1.000000 executed_msg={84bca5ce 12585 1}
 [XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12585 message_id.node= 1
[XCOM_BINDING_DEBUG] ::process_possible_control_message():: Received a control message
[XCOM_BINDING_DEBUG] ::process_possible_control_message()::Still waiting for more State Exchange messages
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12585 message_id.node= 2
[XCOM_BINDING_DEBUG] ::process_possible_control_message():: Received a control message
[XCOM_BINDING_DEBUG] ::process_possible_control_message()::Still waiting for more State Exchange messages
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12586 message_id.node= 0
[XCOM_BINDING_DEBUG] ::process_possible_control_message():: Received a control message
[XCOM_BINDING_DEBUG] ::process_possible_control_message()::Install new view
[XCOM_BINDING_DEBUG] ::install_view():: Processing exchanged data
[XCOM_BINDING_DEBUG] ::install_view():: Processing exchanged data
[XCOM_BINDING_DEBUG] ::install_view():: Processing exchanged data
2015-10-27T17:52:52.455340Z 0 [Note] Plugin group_replication reported: 'Starting group replication recovery with view_id 46750407:11'
2015-10-27T17:52:52.456474Z 16 [ERROR] Plugin group_replication reported: 'Can't evaluate the group replication applier execution status. Group replication recovery will shutdown to avoid data corruption.'
2015-10-27T17:52:52.456503Z 16 [ERROR] Plugin group_replication reported: 'Fatal error during the Recovery process of Group Replication. The server will leave the group.'
[XCOM BINDING DEBUG] ::leave()
[XCOM BINDING DEBUG] ::leave():: Skipping own address.
[XCOM BINDING DEBUG] ::leave():: xcom_client_open_connection to 192.168.70.3:10300
connecting to 192.168.70.3 10300
[XCOM_BINDING_DEBUG] ::install_view():: View delivered to client handler= 750756806
connected to 192.168.70.3 10300
[XCOM BINDING DEBUG] ::leave():: Calling xcom_client_remove_node
cli_err 0
handle_remove_node /export/home2/pb2/build/sb_0-16846472-1445524610.82/build/BUILD/mysql-server/plugin/group_replication/gcs/src/bindings/xcom/xcom/xcom_base.c:1987 nodes: 0x3d05fa8 nodes->node_list_len = 1 nodes->node_list_val: 0x3da7da0 node_address n.address: 0x3d238d0 192.168.70.2:10300
getstart group_id 84bca5ce
state 3551 action xa_terminate
new state x_start
state 3489 action xa_exit
Exiting xcom thread
new state x_start
[XCOM BINDING DEBUG] ::leave():: Installing Leave view
[XCOM_BINDING_DEBUG] ::install_view():: No exchanged data
[XCOM_BINDING_DEBUG] ::install_view():: View delivered to client handler= 750756806
[XCOM BINDING DEBUG] ::leave():: Exiting with error=GCS_OK

So it looks like a member that has data inconsistencies might erroneously be reported as ONLINE, but whenever group replication is restarted it will fail and won’t be able to join the cluster. It seems there should be better error handling when a data inconsistency is found.

What about the operational perspective?
It looks very limited: just a few variables and status counters, plus some status tables in performance_schema, as follows:

mysql> show global variables like '%group_repli%';
+---------------------------------------------------+---------------------------------------+
| Variable_name | Value |
+---------------------------------------------------+---------------------------------------+
| group_replication_allow_local_lower_version_join | OFF |
| group_replication_auto_increment_increment | 7 |
| group_replication_bootstrap_group | OFF |
| group_replication_components_stop_timeout | 31536000 |
| group_replication_gcs_engine | xcom |
| group_replication_group_name | 8a94f357-aab4-11df-86ab-c80aa9429562 |
| group_replication_local_address | 192.168.70.3:10300 |
| group_replication_peer_addresses | 192.168.70.2:10300,192.168.70.3:10300 |
| group_replication_pipeline_type_var | STANDARD |
| group_replication_recovery_complete_at | TRANSACTIONS_CERTIFIED |
| group_replication_recovery_password | |
| group_replication_recovery_reconnect_interval | 60 |
| group_replication_recovery_retry_count | 86400 |
| group_replication_recovery_ssl_ca | |
| group_replication_recovery_ssl_capath | |
| group_replication_recovery_ssl_cert | |
| group_replication_recovery_ssl_cipher | |
| group_replication_recovery_ssl_crl | |
| group_replication_recovery_ssl_crlpath | |
| group_replication_recovery_ssl_key | |
| group_replication_recovery_ssl_verify_server_cert | OFF |
| group_replication_recovery_use_ssl | OFF |
| group_replication_recovery_user | rpl_user |
| group_replication_start_on_boot | ON |
+---------------------------------------------------+---------------------------------------+
24 rows in set (0.00 sec)
mysql> show global status like '%group_repli%';
+-----------------------------+-------+
| Variable_name | Value |
+-----------------------------+-------+
| Com_group_replication_start | 8 |
| Com_group_replication_stop | 7 |
+-----------------------------+-------+
2 rows in set (0.01 sec)
mysql> show tables from performance_schema like '%group%';
+----------------------------------------+
| Tables_in_performance_schema (%group%) |
+----------------------------------------+
| replication_group_member_stats |
| replication_group_members |
+----------------------------------------+
2 rows in set (0.00 sec)

Most of the values above are self-descriptive. I still need to dig in a bit more to work out the purpose of some of them.
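Besides replication_group_members, the replication_group_member_stats table listed above is where the per-member certification counters live; a quick way to peek at it (the exact columns depend on the plugin version, so check rather than assume):

SELECT * FROM performance_schema.replication_group_member_stats\G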

Conclusions:

So far the work done on group replication is very impressive. Of course there is still a long road to travel, and it doesn’t yet seem fair to compare group replication against Galera, at least not side by side.

Even if I like the idea of reusing an existing component, I don’t like having to install and configure Corosync, because it’s another piece of software that could eventually fail. Fortunately this can be avoided with the newer version of the plugin, which can use the new XCom communication framework. I tested both versions, and XCom is far easier to set up and configure; however, the error log file can become very verbose, maybe too verbose in my opinion.
With regard to installation and configuration, it’s pretty easy once you find the proper way to do it. There are only a few variables to configure for a working cluster, and most of the settings work just fine by default (like group_replication_auto_increment_increment).

I would still like some automatic handling of data inconsistencies (like SST in Galera), but in my opinion this new feature can be a good approach to consider in the future when looking for high-availability solutions. A lot of testing still needs to be done, and I’d also like to see some benchmarks. These are just my first impressions, and we should wait some time before seeing this feature go GA. Paraphrasing that song: “it’s a long way to the top if you wanna rock ‘n’ roll.”

The post MySQL 5.7 first impressions on group-replication appeared first on MySQL Performance Blog.

Oct 29, 2015

Forget The 7-Day Forecast, WeatherPlanner Will Give You Year-Ahead Weather Predictions

Are you sick of not knowing what the weather will be like at your cousin’s best friend’s sister’s birthday party that is in six months? Well, now you can know exactly what the weather will be like that day…sort of. Recently featured in the TechCrunch Radio Pitch-Off, WeatherPlanner is a year-ahead weather service that will tell you what the weather will be like on… Read More

Oct 28, 2015

Microsoft Partners With ConsenSys To Use Ethereum To Provide Blockchain-As-A-Service

The Ethereum project and ConsenSys, the company created by one of the project’s co-creators, have received a huge vote of approval from one of the world’s biggest enterprise software providers — Microsoft. The company will be working with ConsenSys to provide development tools for Microsoft’s enterprise customers on the Azure platform. Read More

Oct 28, 2015

Alteryx Scores $85 Million As Growth Spurt Continues

Alteryx, which offers an analytics tool aimed at business users, announced an $85 million round today. Today’s deal comes on top of the $60 million deal last October and brings the total raised to an impressive $163 million to date. The round was led by Iconiq Capital and Insight Venture Partners. Meritech Capital Partners also participated. The company has been growing steadily and… Read More

Oct 28, 2015

IBM Will Acquire The Weather Company’s Digital Business

IBM announced this morning it’s acquiring The Weather Company, the parent company of The Weather Channel, as well as the company’s B2B, mobile and web properties. The deal is being valued at more than $2 billion, according to a report from The Wall St. Journal. The companies are not disclosing the financial terms, however. The deal will see The Weather Company licensing… Read More
