MuleSoft is the latest “unicorn” to file for an IPO. The company, which helps businesses like Netflix and Spotify with their APIs, has unveiled its financials to the public in an S-1 filing, suggesting that it is targeting a debut as soon as March. The size of the proposed IPO is $100 million, but that is subject to change. In the filing we see that MuleSoft had $187.7…
Recently I worked on a ticket where a customer performed a point-in-time recovery (PITR) using a large set of binary logs. Normally we handle this by applying the last backup, then re-applying all binary logs created since that backup. In the middle of the procedure, their new server crashed. We identified the binary log position and tried to restart the PITR from there. However, when restarting the restore from that position, it failed with the error “The BINLOG statement of type Table_map was not preceded by a format description BINLOG statement.” This is a known bug, reported as MySQL Bug #72804: “BINLOG statement can no longer be used to apply Query events.”
I created a small test to demonstrate the workaround we implemented, which resolved the issue.
First, I ran a large import process that created several binary logs. I used a small maximum binary log size so that several files would be generated, and tested using the database “employees” (a standard database used for testing). Then I dropped the database.
mysql> set sql_log_bin=0;
Query OK, 0 rows affected (0.33 sec)
mysql> drop database employees;
Query OK, 8 rows affected (1.25 sec)
To demonstrate the recovery process, I joined all the binary log files into one SQL file and started an import.
sveta@Thinkie:~/build/ps-5.7/mysql-test$ ../bin/mysqlbinlog var/mysqld.1/data/master.000001 var/mysqld.1/data/master.000002 var/mysqld.1/data/master.000003 var/mysqld.1/data/master.000004 var/mysqld.1/data/master.000005 > binlogs.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ GENERATE_ERROR.sh binlogs.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysql < binlogs.sql
ERROR 1064 (42000) at line 9020: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'inserting error
I intentionally generated a syntax error in the resulting file with the help of the GENERATE_ERROR.sh script (which simply inserts a bogus SQL statement at a random line). The error message clearly showed where the import stopped: line 9020. I then created a file that cropped out the part that had already been imported (lines 1-9020), and tried to import this new file.
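The GENERATE_ERROR.sh script itself isn't included in the post; here is a minimal sketch of what it might look like (the toy file name, its contents and the exact bogus statement are assumptions, and it relies on GNU sed's in-place editing):

```shell
# Hypothetical sketch of GENERATE_ERROR.sh: insert a bogus SQL statement
# at a pseudo-random line of a dump so that an import fails mid-way.
FILE=demo_dump.sql
printf '%s\n' 'SELECT 1;' 'SELECT 2;' 'SELECT 3;' > "$FILE"   # toy stand-in for binlogs.sql

LINES=$(wc -l < "$FILE")
# Pick a pseudo-random line number between 1 and $LINES
ROW=$(awk -v n="$LINES" 'BEGIN { srand(); print int(rand() * n) + 1 }')

# GNU sed: insert the bogus statement before line $ROW, in place
sed -i "${ROW}i inserting error" "$FILE"
```

Importing the mangled file then fails with a syntax error at the inserted line, which is exactly what we want for the demonstration.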
sveta@Thinkie:~/build/ps-5.7/mysql-test$ tail -n +9021 binlogs.sql > binlogs_rest.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysql < binlogs_rest.sql
ERROR 1609 (HY000) at line 134: The BINLOG statement of type `Table_map` was not preceded by a format description BINLOG statement.
Again, the import failed with exactly the same error the customer saw. The reason for this error is that the BINLOG statement (which applies changes from the binary log) expects the format description event to be run in the same session as the binary log import, before the events it describes. The format description event existed at the start of the original import that failed at line 9020; the later import (from line 9021 on) doesn’t contain this format statement.
Fortunately, the format description event is the same for a given server version. We can simply take it from the beginning of the SQL file (or the original binary log file) and prepend it to the file created after the crash, the one without lines 1-9020.
With MySQL versions 5.6 and 5.7, this event is located in the first 11 rows:
sveta@Thinkie:~/build/ps-5.7/mysql-test$ head -n 11 binlogs.sql | cat -n
     1  /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
     2  /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
     3  DELIMITER /*!*/;
     4  # at 4
     5  #170128 17:58:11 server id 1  end_log_pos 123 CRC32 0xccda074a  Start: binlog v 4, server v 5.7.16-9-debug-log created 170128 17:58:11 at startup
     6  ROLLBACK/*!*/;
     7  BINLOG '
     8  g7GMWA8BAAAAdwAAAHsAAAAAAAQANS43LjE2LTktZGVidWctbG9nAAAAAAAAAAAAAAAAAAAAAAAA
     9  AAAAAAAAAAAAAAAAAACDsYxYEzgNAAgAEgAEBAQEEgAAXwAEGggAAAAICAgCAAAACgoKKioAEjQA
    10  AUoH2sw=
    11  '/*!*/;
The first six rows are meta information, and rows 7-11 are the format description event itself. The only thing we need to copy to the top of our resulting file is these 11 lines:
sveta@Thinkie:~/build/ps-5.7/mysql-test$ head -n 11 binlogs.sql > binlogs_rest_with_format.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ cat binlogs_rest.sql >> binlogs_rest_with_format.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysql < binlogs_rest_with_format.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$
After this, the import succeeded!
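The crop-and-prepend steps can also be combined into a single pass with the same head/tail pattern. Here is a toy demonstration, where a 5-line stand-in file plays the role of binlogs.sql, 2 header lines stand in for the real 11, and line 4 stands in for the real resume point at line 9021 (all toy names and numbers are assumptions for illustration):

```shell
# Toy stand-in: two "format header" lines followed by three "event" lines
printf '%s\n' header1 header2 body1 body2 body3 > demo.sql

# Keep the header, then resume from the first not-yet-applied line.
# With the real files this would be:
#   { head -n 11 binlogs.sql; tail -n +9021 binlogs.sql; } > binlogs_rest_with_format.sql
{ head -n 2 demo.sql; tail -n +4 demo.sql; } > demo_with_format.sql
```

The resulting file starts with the format header and continues from the crash point, which is exactly the shape the BINLOG statement requires.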
Late last year we started a poll on which backend programming languages are being used by the open source community. The three components of the backend (server, application, and database) are what make a website or application work. Below are the results of Percona’s poll on backend programming languages in use by the community:
One of the best-known and earliest web service stacks is the LAMP stack, which stands for Linux, Apache, MySQL and PHP/Perl/Python. We can see that this early model is still popular when it comes to the backend.
PHP still remains a very common choice for a backend programming language, with Python moving up the list as well. Perl seems to be fading in popularity, despite being used a lot in the MySQL world.
Finally, Go is a language to look out for. Go is an open source programming language created by Google. It first appeared in 2009, and is already more popular than Perl or Ruby according to this poll.
Thanks to the community for participating in our poll. You can take our latest poll, on which database engine you use to store time series data, here.
YotaScale, a graduate of Alchemist’s enterprise accelerator, is announcing a $3.6 million venture round today from Engineering Capital, Pelion Ventures and angels Jocelyn Goldfein, Timothy Chou and Robert Dykes. The startup employs machine learning to help balance performance, availability and cost for enterprise cloud computing. Competitors CloudHealth Technologies and Cloudability…
The MariaDB Corporation is organizing a conference called M17 on the East Coast in April. Some Perconians (Peter Zaitsev, Vadim Tkachenko, Sveta Smirnova, Alex Rubin, Colin Charles) decided to submit some interesting talks for that conference. Percona also offered to sponsor the conference.
As of this post, the talks haven’t been accepted, and we were politely told that we couldn’t sponsor.
Some of the proposed talks were:
- MariaDB Backup with Percona XtraBackup (Vadim Tkachenko)
- Managing MariaDB Server operations with Percona Toolkit (Colin Charles)
- MariaDB Server Monitoring with Percona Monitoring and Management (Peter Zaitsev)
- Securing your MariaDB Server/MySQL data (Colin Charles, Ronald Bradford)
- Data Analytics with MySQL, Apache Spark and Apache Drill (Alexander Rubin)
- Performance Schema for MySQL and MariaDB Troubleshooting (Sveta Smirnova)
At Percona, we think MariaDB Server is an important part of the MySQL ecosystem. This is why the Percona Live Open Source Database Conference 2017 in Santa Clara has a MariaDB mini-track, consisting of talks from various Percona and MariaDB experts:
- Securing your MySQL/MariaDB data (Colin Charles, Ronald Bradford)
- MariaRocks: MyRocks in MariaDB (Sergei Petrunia)
- MySQL/MariaDB Parallel Replication: inventory, use cases and limitations (Jean-François Gagné)
- Histograms in MySQL and MariaDB (Sergei Petrunia)
- Common Table Expressions and Window Functions: simple, maintainable, fast queries (Vicentiu-Marian Ciorbaru)
- MariaDB Server 10.2: The Complete Guide (Colin Charles)
- …and more.
If any of these topics look enticing, come to the conference. We have MariaDB at Percona Live.
To make your decision easier, we’ve created a special promo code that gets you $75 off a full conference pass! Just use MariaDB@PL17 at checkout.
In the meantime, we will continue to write and discuss MariaDB, and any other open source database technologies. The power of the open source community is the free exchange of ideas, healthy competition and open dialog within the community.
Here are some more past presentations that are also relevant:
Cervin Ventures, under the direction of Preetish Nijhawan and Neeraj Gupta, is announcing its latest $56 million fund. Built on the success of previous angel and micro-venture portfolios, Cervin is targeting seed-stage startups across the enterprise stack — running the full gamut of infrastructure, data and software. Both Nijhawan and Gupta come from operational roles. Preetish…
Chances are that your current company is collecting a ton of data about your customers every second. It’s a goldmine and it often sits there in a corner, unused. Vize is a business intelligence tool that will let you learn more about your business and take advantage of all your data. The company is currently participating in Y Combinator’s winter batch. “The main use case…
With Oracle clearly entering the “open source high availability solutions” arena with the release of their brand new Group Replication solution, I believe it is time to review the quality of the first GA (production ready) release.
TL;DR: Having examined the technology, it is my conclusion that Oracle seems to have released the GA version of Group Replication too early. While the product is definitely “working prototype” quality, the release seems rushed and unfinished. I found a significant number of issues, and I would personally not recommend it for production use.
It is obvious that Oracle is trying hard to ship technology to compete with Percona XtraDB Cluster, which is probably why they rushed to claim Group Replication GA quality.
If you want to follow along and test Group Replication yourself, you can simplify the initial setup by using this Docker image, and we can review some of the issues you might face together.
For the record, I tested the version based on MySQL 5.7.17 release.
No automatic provisioning
First off, you’ll find that there is NO way to automatically set up a new node. If you need to set up a new node, or recover an existing node from a fatal failure, you’ll need to manually provision the slave.
Of course, you can clone a slave using Percona XtraBackup or LVM by employing some self-developed scripts. But given the high availability nature of the product, one would expect Group Replication to automatically re-provision any failed node.
Bug: stale reads on nodes
Please see this bug:
- https://bugs.mysql.com/bug.php?id=84900: getting inconsistent results on different nodes
One-line summary: while the secondary nodes are “catching up” to whatever happened on the first node (it takes time to apply changes on secondary nodes), reads on a secondary node can return stale data (as shown in the bug report).
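A minimal illustration of the symptom described in the bug report (the table name t1 is hypothetical; the exact behavior is timing-dependent):

```
node1 mysql> INSERT INTO t1 VALUES (42);
node2 mysql> SELECT COUNT(*) FROM t1;   -- may briefly return the count from before the INSERT
```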
This behavior brings us back to the traditional asynchronous replication slave behavior (i.e., Group Replication’s predecessor).
It also contradicts the Group Replication documentation, which states: “There is a built-in group membership service that keeps the view of the group consistent and available for all servers at any given point in time.” (See https://dev.mysql.com/doc/refman/5.7/en/group-replication.html.)
I might also mention here that Percona XtraDB Cluster prevents stale reads (see https://www.percona.com/doc/percona-xtradb-cluster/5.7/wsrep-system-index.html#wsrep_sync_wait).
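For reference, the Percona XtraDB Cluster safeguard documented at the link above is the wsrep_sync_wait variable, which enforces a causality check before reads. For example, enabling it for reads in a session:

```
mysql> SET SESSION wsrep_sync_wait = 1;
```

With this set, a SELECT waits until the node has applied all updates the cluster had seen at the time the query was issued, so the read cannot be stale.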
Bug: nodes become unusable after a big transaction, refusing to execute further transactions
There are two related bugs:
- https://bugs.mysql.com/bug.php?id=83218: DML operations in multiple sessions fail
- https://bugs.mysql.com/bug.php?id=84901: can’t execute transaction on the second node
One-line summary: after running a big transaction, the secondary nodes become unusable and refuse to perform any further transactions.
Obscure error messages
It is not uncommon to see cryptic error messages while testing Group Replication. For example:
mysql> commit;
ERROR 3100 (HY000): Error on observer while running replication hook 'before_commit'.
This message is fairly useless and provides little help until you check the mysqld error log, which offers a bit more information:
2017-02-09T02:05:36.996776Z 18 [ERROR] Plugin group_replication reported: '[GCS] Gcs_packet's payload is too big. Only the packets smaller than 2113929216 bytes can be compressed.'
The items highlighted above might not seem too bad at first, and you could assume that your workload won’t be affected. However, the stale reads and node dysfunctions basically prevented me from running a more comprehensive evaluation.
If you care about your data, then I recommend not using Group Replication in production. Currently, it looks like it might cause plenty of headaches, and it is easy to get inconsistent results.
For the moment, Group Replication looks like an advanced, but broken, form of traditional MySQL asynchronous replication.
I understand Oracle’s dilemma. Usually people are hesitant to test a product that is not GA, so in order to get feedback from users, Oracle needs to push the product to GA. Still, Oracle absolutely must solve the issues above during future QA cycles.
Our most recent release of Percona Server for MySQL (Percona Server for MySQL 5.7.17) comes with Group Replication plugins. Unfortunately, since this technology is very new, it requires some fairly complicated steps to set up and get running. To help with that process, I’ve prepared Docker images that simplify the setup procedure.
You can find the image here: https://hub.docker.com/r/perconalab/pgr-57/.
To start the first node (bootstrap the group):
docker run -d -p 3306 --net=clusternet -e MYSQL_ROOT_PASSWORD=passw0rd -e CLUSTER_NAME=cluster1 perconalab/pgr-57
To add nodes into the group after:
docker run -d -p 3306 --net=clusternet -e MYSQL_ROOT_PASSWORD=passw0rd -e CLUSTER_NAME=cluster1 -e CLUSTER_JOIN=CONTAINER_ID_FROM_THE_FIRST_STEP perconalab/pgr-57
You can also get a full script that starts an arbitrary number of nodes here: https://github.com/Percona-Lab/percona-docker/blob/master/pgr-57/start_node.sh
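Once the containers are up, you can verify from any member that the nodes joined the group; the performance_schema table below is part of the standard Group Replication plugin in MySQL 5.7:

```
mysql> SELECT member_host, member_state FROM performance_schema.replication_group_members;
```

All members should report the state ONLINE once they have joined and caught up with the group.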
IBM wants to bring machine learning to its traditional mainframe customers, and eventually to any technology with large data stores hidden behind a company firewall in what IBM calls a “private cloud.”
Yes, mainframes, those ginormous computing machines from an earlier age, are still running inside some of the world’s biggest companies, including banks and insurance companies…