Sep
01
2015
--

Booking.com’s Jean-François Gagné on Percona Live Amsterdam

Booking.com, one of the world’s leading e-commerce companies, helps travels book nearly 1 million rooms per night. Established in 1996, Booking.com B.V. guarantees the best prices for any type of property, from small, family-run bed and breakfasts to executive apartments and five-star luxury suites.

The travel website is also a dedicated contributor to the MySQL and Perl community. Other open source technologies include CentOS Linux, Nginx, python, puppet, Git and more.

A Diamond sponsor of Percona Live Amsterdam Sept. 21-23, you can meet the people who power Booking.com at booth 205. Enter promo code “BlogInterview” at registration to save €20!

In the meantime, meet Jean-François Gagné, a system engineer at Booking.com. He’ll be presenting a couple of talks: “Riding the Binlog: an in Deep Dissection of the Replication Stream” and “Binlog Servers at Booking.com.”


Tom: Hi Jean-François, in your session, “Riding the Binlog: an in Deep Dissection of the Replication Stream“, you talk about how we can think of the binary logs as a transport for a “Stream of Transactions”. What will be the top 3 things attendees will come away with following this 50-minute talk?

Jean-François: Hi Tom, thanks for this opportunity to give a sneak peak of my talk.  The most important subject that will be discussed is that the binary logs evolves: by the usage of “log-slave-updates”, the stream can grow, shrink or morph.  Said in another way: the binary logs of a slave can be very different from the binary logs of the master, and this should be taken into account when relying on those (including when replicating using intermediate master and when promoting a slave as a new master using GTIDs).  We will also explore how the binary logs can be decomposed in sub-streams, or viewed as the multiplexing of many streams.  We will also look for de-multiplexing functions and the new possibilities that are opened with that.

 

Tom: Percona Live, starting with this conference, has a new venue and a broader theme – now encompassing, in addition to MySQL, MongoDB, NoSQL and data in the cloud. Your thoughts? And what do think is missing – what would you change (if anything)?

Jean-François: I think you forget the best of all changes: going from a 2 day conference last year in London to a 3 day conference this year.  This will allow better knowledge exchange and I am very happy about that.  I think this event will be a success with a good balance of sessions focus on technologies and presentation about a specific use-case of those technologies.  If I had one wish: I would like to see more sessions about specific use-cases of NoSQL technologies with and in deep discussion about why they are a better choice than more traditional solutions: maybe more of those sessions will be submitted next year.

 

Tom: Which other session(s) are you most looking forward to besides your own?

Jean-François: I will definitely attend the Facebook session about Semi-Synchronous Replication: it is very close to my interest, especially as Booking.com is thinking about using loss-less semi-sync replication in the future, and I look forward to hear war stories about this feature.  All sessions dissecting internals of a technology (InnoDB, TokuDB, RocksDB, …) will also have my attention.  Finally, it is always interesting to hear about how large companies are using databases, so I plan to attend the MySQL@Wikimedia session.

 

Tom: As a resident of Amsterdam, what are some of the must-do activities/sightseeing for those visiting for Percona Live from out of town?

Jean-François: Seeing the city from a high point is impressive, and you will have the opportunity of enjoying that view from the Booking.com office at the Community Dinner.  Also, I recommend finding a bike and discover the city pedaling (there are many renting shops, just ask Google).  From the conference venue, you can do a 70 minutes ride crossing three nice parks: the Westerpark, the Rembrandtpark and the Vondelpark – https://goo.gl/P13Mc7 – and you can discover the first of third park in a shorter ride (45 minutes).  If you feel a little more adventurous, I recommend a 90 minute ride South following the Amstel: once out of Amsterdam, you will have the water on one side at the level of the road, and the fields (Polder) 3 meters below on the other side (https://goo.gl/OPDv5z).  This will allow you to see for yourself why this place is called the “Low Countries”.

The post Booking.com’s Jean-François Gagné on Percona Live Amsterdam appeared first on Percona Data Performance Blog.

Jun
08
2015
--

MaxScale: A new tool to solve your MySQL scalability problems

Ever since MySQL replication has existed, people have dreamed of a good solution to automatically split read from write operations, sending the writes to the MySQL master and load balancing the reads over a set of MySQL slaves. While if at first it seems easy to solve, the reality is far more complex.

First, the tool needs to make sure it parses and analyses correctly all the forms of SQL MySQL supports in order to sort writes from reads, something that is not as easy as it seems. Second, it needs to take into account if a session is in a transaction or not.

While in a transaction, the default transaction isolation level in InnoDB, Repeatable-read, and the MVCC framework insure that you’ll get a consistent view for the duration of the transaction. That means all statements executed inside a transaction must run on the master but, when the transaction commits or rollbacks, the following select statements on the session can be again load balanced to the slaves, if the session is in autocommit mode of course.

Then, what do you do with sessions that set variables? Do you restrict those sessions to the master or you replay them to the slave? If you replay the set variable commands, you need to associate the client connection to a set of MySQL backend connections, made of at least a master and a slave. What about temporary objects like with “create temporary table…”? How do you deal when a slave lags behind or what if worse, replication is broken? Those are just a few of the challenges you face when you want to build a tool to perform read/write splitting.

Over the last few years, a few products have tried to tackle the read/write split challenge. The MySQL_proxy was the first attempt I am aware of at solving this problem but it ended up with many limitations. ScaleARC does a much better job and is very usable but it stills has some limitations. The latest contender is MaxScale from MariaDB and this post is a road story of my first implementation of MaxScale for a customer.

Let me first introduce what is MaxScale exactly. MaxScale is an open source project, developed by MariaDB, that aims to be a modular proxy for MySQL. Most of the functionality in MaxScale is implemented as modules, which includes for example, modules for the MySQL protocol, client side and server side.

Other families of available modules are routers, monitors and filters. Routers are used to determine where to send a query, Read/Write splitting is accomplished by the readwritesplit router. The readwritesplit router uses an embedded MySQL server to parse the queries… quite clever and hard to beat in term of query parsing.

There are other routers available, the readconnrouter is basically a round-robin load balancer with optional weights, the schemarouter is a way to shard your data by schema and the binlog router is useful to manage a large number of slaves (have a look at Booking.com’s Jean-François Gagné’s talk at PLMCE15 to see how it can be used).

Monitors are modules that maintain information about the backend MySQL servers. There are monitors for a replicating setup, for Galera and for NDB cluster. Finally, the filters are modules that can be inserted in the software stack to manipulate the queries and the resultsets. All those modules have well defined APIs and thus, writing a custom module is rather easy, even for a non-developer like me, basic C skills are needed though. All event handling in MaxScale uses epoll and it supports multiple threads.

Over the last few months I worked with a customer having a challenging problem. On a PXC cluster, they have more than 30k queries/s and because of their write pattern and to avoid certification issues, they want to have the possibility to write to a single node and to load balance the reads. The application is not able to do the Read/Write splitting so, without a tool to do the splitting, only one node can be used for all the traffic. Of course, to make things easy, they use a lot of Java code that set tons of sessions variables. Furthermore, for ISO 27001 compliance, they want to be able to log all the queries for security analysis (and also for performance analysis, why not?). So, high query rate, Read/Write splitting and full query logging, like I said a challenging problem.

We experimented with a few solutions. One was a hardware load balancer that failed miserably – the implementation was just too simple, using only regular expressions. Another solution we tried was ScaleArc but it needed many rules to whitelist the set session variables and to repeat them to multiple servers. ScaleArc could have done the job but all the rules increases the CPU load and the cost is per CPU. The queries could have been sent to rsyslog and aggregated for analysis.

Finally, the HA implementation is rather minimalist and we had some issues with it. Then, we tried MaxScale. At the time, it was not GA and was (is still) young. Nevertheless, I wrote a query logging filter module to send all the queries to a Kafka cluster and we gave it a try. Kafka is extremely well suited to record a large flow of queries like that. In fact, at 30k qps, the 3 Kafka nodes are barely moving with cpu under 5% of one core. Although we encountered some issues, remember MaxScale is very young, it appeared to be the solution with the best potential and so we moved forward.

The folks at MariaDB behind MaxScale have been very responsive to the problems we encountered and we finally got to a very usable point and the test in the pilot environment was successful. The solution is now been deployed in the staging environment and if all goes well, it will be in production soon. The following figure is simplified view of the internals of MaxScale as configured for the customer:

MaxScale internals

The blocks in the figure are nearly all defined in the configuration file. We define a TCP listener using the MySQL protocol (client side) which is linked with a router, either the readwritesplit router or the readconn router.

The first step when routing a query is to assign the backends. This is where the read/write splitting decision is made. Also, as part of the steps required to route a query, 2 filters are called, regexp (optional) and Genlog. The regexp filter may be used to hot patch a query and the Genlog filter is the logging filter I wrote for them. The Genlog filter will send a json string containing about what can be found in the MySQL general query log plus the execution time.

Authentication attempts are also logged but the process is not illustrated in the figure. A key point to note, the authentication information is cached by MaxScale and is refreshed upon authentication failure, the refresh process is throttled to avoid overloading the backend servers. The servers are continuously monitored, the interval is adjustable, and the server status are used when the decision to assign a backend for a query is done.

In term of HA, I wrote a simple Pacemaker resource agent for MaxScale that does a few fancy things like load balancing with IPTables (I’ll talk about that in future post). With Pacemaker, we have a full fledge HA solution with quorum and fencing on which we can rely.

Performance wise, it is very good – a single core in a virtual environment was able to read/write split and log to Kafka about 10k queries per second. Although MaxScale supports multiple threads, we are still using a single thread per process, simply because it yields a slightly higher throughput and the custom Pacemaker agent deals with the use of a clone set of MaxScale instances. Remember we started early using MaxScale and the beta versions were not dealing gracefully with threads so we built around multiple single threaded instances.

So, since a conclusion is needed, MaxScale has proven to be a very useful and flexible tool that allows to elaborate solutions to problems that were very hard to tackle before. In particular, if you need to perform read/write splitting, then, try MaxScale, it is best solution for that purpose I have found so far. Keep in touch, I’ll surely write other posts about MaxScale in the near future.

The post MaxScale: A new tool to solve your MySQL scalability problems appeared first on MySQL Performance Blog.

Dec
04
2014
--

Sneak peek at the Percona Live MySQL Conference & Expo 2015

Sneak peek at the Percona Live MySQL Conference & Expo 2015You know you’ll be there so why not save some $$ by registering now for the Percona Live MySQL Conference & Expo 2015 (April 13-16 in Santa Clara, Calif.). Super Saver registration discounts are available through Dec. 14 at 11:30 p.m. PST. (That’s just 10 days away!)

What to expect this year? The Percona Live 2015 conference committee is putting together another fantastic event for the global MySQL community’s next meeting in April. The full conference agenda will be announced in January, but the initial roster includes:

  • Sunny Bains, Senior Engineering Manager at Oracle; “InnoDB 5.7- What’s New”
  • Yoshinori Matsunobu, Database Engineer at Facebook; “Fast Master Failover Without Data Loss”
  • Jeremy Cole, Senior Systems Engineer at Google, Inc.; “Exploring Your Data With InnoDB Explorer”
  • Tom Krouper, Staff Database Administrator at Twitter; “Upgrading to MySQL 5.6 @ Scale”
  • Jenni Snyder, Database Administrator at Yelp; “Schema changes multiple times a day? OK!”
  • Ike Walker, Database Architect at Flite; “Assembling the Perfect MySQL Toolbox”
  • Jean-François Gagné, Senior System Engineer/Architect at Booking.com; “Binlog Servers at Booking.com”
  • Jeremy Glick, Lead DBA at MyDBAteam, and Andrew Moore, MySQL DBA at Percona; “Using MySQL Audit Plugins and Elasticsearch ELK”
  • Tomáš Komenda, Team Lead and Database Specialist, and Lukáš Putna, Senior Developer and Database Specialist at Seznam.cz; “MySQL and HBase Ecosystem for Real-time Big Data Overviews”
  • Alexander Rubin, Principal Consultant at Percona; “Advanced MySQL Query Tuning”

And while the call for papers deadline has expired, there are still sponsorship opportunities available for the world’s largest annual MySQL event. Sponsors become a part of a dynamic and growing ecosystem and interact with more than 1,000 DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solutions vendors, and entrepreneurs who attend the event.

Current sponsors include:

  • Diamond Plus: VMware
  • Gold: Codership, Pythian
  • Silver: Box, SpringbokSQL, Yelp
  • Exhibit Only: FoundationDB, Severalnines, Tokutek, VividCortex
  • Other Sponsors: MailChimp
  • Media Sponsor: Database Trends & Applications , Datanami, InfoQ , Linux Journal, O’Reilly Media

Sneak peek at the Percona Live MySQL Conference & Expo 2015Percona Live 2015 will feature a variety of formal tracks and sessions related to High Availability, DevOps, Programming, Performance Optimization, Replication and Backup, MySQL in the Cloud, MySQL and NoSQL, MySQL Case Studies, Security, and What’s New in MySQL.

As usual the conference will be held in the heart of Silicon Valley at the Hyatt Regency Santa Clara and Santa Clara Convention Center. But this year Percona has also unveiled OpenStack Live 2015, a new conference that will run in parallel with Percona Live MySQL Conference & Expo 2015 on April 13 and 14.

And don’t forget, Super Saver registration discounts are available through Dec. 14 at 11:30 p.m. PST. I hope to see you in Santa Clara!

The post Sneak peek at the Percona Live MySQL Conference & Expo 2015 appeared first on MySQL Performance Blog.

Sep
15
2014
--

Percona Live London: Top Ten reasons to attend Nov. 3-4

Percona Live London 2014Percona Live London 2014 is fast approaching – November is just around the corner. This year’s conference, November 3-4, will be even bigger and better than last year thanks to the participation of leading MySQL experts the world over (including you!).

The Percona Live London MySQL Conference is a great event for users of any level using any of the major MySQL branches: MySQL, MariaDB or Percona Server. And this year we once again host a star-studded group of keynote speakers from industry-leading companies in the MySQL space.

We’ll also be welcoming leading MySQL practitioners from across the industry (and from all corners of the world) who will speak on topics that matter to you now  – see the full conference schedule here:

Monday starts early with a full day of tutorials and a fun evening at the community dinner.  Attendees will be arriving in true London style on a double-decker bus! Tuesday morning will kick-off with a series of keynotes followed by interactive breakout sessions – wrapping things up at the end of the day with a fun post-conference reception (a great chance to make new friends and reconnect with old ones).

Here’s a sneak peek at some of the must-see events this year:

To recap, here are the Top Ten reasons to attend Percona Live London this November 3-4:

10. Advanced Rate Pricing ends October 5th
9. Hear about the hottest current topics and trends.
8. Network! Meet face-to-face in the “hallway track” and make lasting connections.
7. Learn how to make MySQL work better for you – regardless of your expertise.
6. Have a blast at the community dinner!
5. Discuss your unique challenges with experts and discover options for solving them.
4. Engage with the sponsors at their tabletop exhibits.
3. Listen to top industry leaders describe the future of the MySQL ecosystem
2. Learn what works, and what doesn’t, from leading companies using MySQL
1. And the Number 1 reason to attend Percona Live London 2014: ALL of the above!

I look forward to seeing you in London this November and don’t forgot that Advanced Rate Pricing pricing ends October 5 so be sure to register now!

The post Percona Live London: Top Ten reasons to attend Nov. 3-4 appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com