Jan
13
2015
--

Percona Live 2015 conference sessions announced!

Today we announced the full conference sessions schedule for April’s Percona Live MySQL Conference & Expo 2015 and this year’s event, once again at the Hyatt Regency Santa Clara and Santa Clara Convention Center, looks to be the biggest yet with networking and learning opportunities for MySQL professionals and enthusiasts at all levels.

Conference sessions will run April 14-16 following each morning’s keynote addresses (the keynotes have yet to be announced). The 2015 conference features a variety of formal tracks and sessions related to high availability, DevOps, programming, performance optimization, replication and backup. They’ll also cover MySQL in the cloud, MySQL and NoSQL, MySQL case studies, security (a very hot topic), and “what’s new” in MySQL.

The sessions will be delivered by top MySQL practitioners at some of the world’s leading MySQL vendors and users, including Oracle, Facebook, Google, LinkedIn, Twitter, Yelp, Percona and MariaDB.

Percona Live 2014 conference attendees Sessions will include:

  • “Better DevOps with MySQL and Docker,” Sunny Gleason, Founder, SunnyCloud
  • “Big Transactions on Galera Cluster,” Seppo Jaakola, CEO, Codership
  • “Database Defense in Depth,” Geoffrey Anderson, Database Operations Engineer, Box, Inc.
  • “The Database is Down, Now What?” Jeremy Tinley, Senior MySQL Operations Engineer, Etsy.com
  • “Encrypting MySQL data at Google,” Jeremy Cole, Sr. Systems Engineer, and Jonas Oreland, Software Developer, Google
  • “High-Availability using MySQL Fabric,” Mats Kindahl, Senior Principal Software Developer, MySQL Group, Oracle
  • “High Performance MySQL choices in Amazon Web Services: Beyond RDS,” Andrew Shieh, Director of Operations, SmugMug
  • “How to Analyze and Tune MySQL Queries for Better Performance,” Øystein Grøvlen, Senior Principal Software Engineer, Oracle
  • “InnoDB: A journey to the core III,” Davi Arnaut, Sr. Software Engineer, LinkedIn, and Jeremy Cole, Sr. Systems Engineer, Google, Inc.
  • “Meet MariaDB 10.1,” Sergei Golubchik, Chief Architect, MariaDB
  • “MySQL 5.7 Performance: Scalability & Benchmarks,” Dimitri Kravtchuk, MySQL Performance Architect, Oracle
  • “MySQL at Twitter – 2015,” Calvin Sun, Sr. Engineering Manager, and Inaam Rana, Staff Software Engineer, Twitter
  • “MySQL Automation at Facebook Scale,” Shlomo Priymak, MySQL Database Engineer, Facebook
  • “MySQL Cluster Performance Tuning – The 7.4.x Talk,” Johan Andersson CTO and Alex Yu, Vice President of Products, Severalnines AB
  • “MySQL for Facebook Messenger,” Domas Mituzas, Database Engineer, Facebook
  • “MySQL Indexing, How Does It Really Work?” Tim Callaghan, Vice President of Engineering, Tokutek
  • “MySQL in the Hosted Cloud,” Colin Charles, Chief Evangelist, MariaDB
  • “MySQL Security Essentials,” Ronald Bradford, Founder & CEO, EffectiveMySQL
  • “Scaling MySQL in Amazon Web Services,” Mark Filipi, MySQL Team Lead, Pythian
  • “Online schema changes for maximizing uptime,” David Turner, DBA, Dropbox, and Ben Black, DBA, Tango
  • “Upgrading to MySQL 5.6 @ scale,” Tom Krouper, Staff Database Administrator , Twitter

Of course Percona Live 2015 will also include several hours of hands-on, intensive tutorials – led by some of the top minds in MySQL. We had a post talking about the tutorials in more detail last month. Since then we added two more: “MySQL devops: initiation on how to automate MySQL deployment” and “Deploying MySQL HA with Ansible and Vagrant.” And of course Dimitri Vanoverbeke, Liz van Dijk and Kenny Gryp will once again this year host the ever-popular “Operational DBA in a Nutshell! Hands On Tutorial!

Yahoo, VMWare, Box and Yelp are among the industry leaders sponsoring the event, and additional sponsorship opportunities are still available.

Percona Live 2014 world mapWorldwide interest in Percona Live continues to soar, and this year, for the first time, the conference will run in parallel with OpenStack Live 2015, a new Percona conference scheduled for April 13 and 14. That event will be a unique opportunity for OpenStack users and enthusiasts to learn from leading OpenStack experts in the field about top cloud strategies, improving overall cloud performance, and operational best practices for managing and optimizing OpenStack and its MySQL database core.

Best of all, your full Percona Live ticket gives you access to the OpenStack Live conference! So why not save some $$? Early Bird registration discounts are available through Feb. 1, 2015 at 11:30 p.m. PST.

I hope to see you in April!

The post Percona Live 2015 conference sessions announced! appeared first on MySQL Performance Blog.

Oct
29
2014
--

Facebook MySQL database engineers ready for Percona Live London 2014

With 1.28 billion active users, Facebook MySQL database engineers are active and extremely valuable contributors to the global MySQL community. So naturally they are also active participants of Percona Live MySQL conferences! And next week’s Percona Live London 2014 (Nov. 3-4) is no exception. (Register now and use the promotional code “Facebook” to save £30!)

I spoke with Facebook database engineers Yoshinori “Yoshi” Matsunobu and Shlomo Priymak about their upcoming sessions along with what’s new at Facebook since our last conversation back in April.


Percona Live London 2014Tom: Yoshi, last year Facebook deployed MySQL 5.6 on all production environments – what have you and your team learned since doing that? And do you have a few best practices you could share? I realize you’ll be going into detail during your session in London (MySQL 5.6 and WebScaleSQL at Facebook), but maybe a few words on a couple of the bigger ones?

Yoshi: MySQL 5.6 has excellent replication enhancements to use in large-scale deployments. For example, crash safe slave makes it possible to recover without rebuilding a slave instance on server crash. This can greatly minimize slave downtime, especially if your database size is large. There are many other new features such as GTID, multi-threaded slave, streaming mysqlbinlog and we actively use them in production.

For InnoDB, Online DDL is a good example to ease operations. Many MySQL users are doing schema changes by switching masters. This can minimize downtime but requires operational efforts. Online DDL made things much easier.

Tom: Facebook is an active and extremely valuable part of the overall MySQL community and ecosystem – what are some of the key features and improvements you’ve contributed in the past year since moving to MySQL 5.6?

Yoshi: For InnoDB, I think online defragmentation and faster full table scan are the most valuable contributions from Facebook in 5.6. I have received very positive feedback about faster InnoDB full table scan (Logical ReadAhead). My colleague Rongrong will speak about something interesting regarding online defragmentation at Percona Live London. For Replication, we have done many optimizations to make GTID and MTS work without pain. Semi-Synchronous mysqlbinlog and backported Loss-Less semisync from MySQL 5.7 are very useful when you use Semi-Synchronous replication.

Tom: Shlomo, your sesson, “MySQL Automation at Facebook Scale,” will be of great interest to DBAs at large and growing organizations considering that Facebook has one of the world’s largest MySQL database clusters. What are the two or three most significant things that you’ve learned as a database engineer operating a cluster of this size? And has anything surprised you along the way (so far)?

Shlomo: This is a great question! We like to speak of “10x” at Facebook when thinking of scaling. For example, what would you do differently if the number of servers you had was 10x more than what it is? This type of mental exercise is surprisingly useful when working with systems at scale. If you, or any of the readers, try to extrapolate this about systems you manage, there will be things you’ll be imagining about how a system like this would be – and you won’t be too far from our reality in many aspects.

You’d imagine that we automate much of the single units of work, like master/slave failover, upgrades and schema changes. You’d suspect we have automated fault detection, self managing systems, good alarming and self remediation. You’d presume that if you’re used to running a command on 100 machines, you’ll now be running it on 1000. At least that’s what I thought to myself, so these are not the things that surprised me. There are a few fundamental shifts in one’s thinking when you get to these sizes, which I didn’t foresee.

The first one is that there is absolutely no such thing as “one-off.” If there is a server somewhere that hits a problem every three years, and you have 1000 servers, this will be happening daily! Take it to 10,000 servers, and you can see absolutely nothing is a “one-off”. We can’t write things off as “worst case, I’ll get an SMS.” Whatever it is, we have to chase it down and fix it. Not just that – to deploy a fix at scale can require writing fairly large amounts of code, a fix that could be deployed manually by a DBA in smaller environments.

The second one is adapting to constraints which are very pragmatic and tangible. If you’re on AWS, you’re pretty much isolated from things like worrying where your servers are physically located, when they go over their lifetime, and if the firmware on the switch in the rack needs to be upgraded. If you’re a small shop and have a few racks up in a co-lo, hardware maintenance is just not as frequent, but it becomes more painful as you grow.

At Facebook, we run our own datacenters! We need to work around interesting challenges, such as running datacenters that have highly variable compositions of server hardware. Since we have so many servers, something is always going on. Racks of servers need to be moved. Whole clusters need to be rebuilt or refreshed, to be made better, faster, stronger.  New datacenters are constructed, others decommissioned.

Tom: And this is where automation comes into the picture, right?

Shlomo: We have had to build a lot of automation to make these operations seamless, and we work closely with the Site Ops teams on the ground to coordinate these logistically complicated processes.

Another thing my team does in this space is planning capacity and hardware purchases. Since we build our own servers, the turnaround time between ordering and getting machines is quite long, so proper planning is paramount. Buy too much, and you’ve wasted millions of dollars. Buy too few servers, and there won’t be space for user growth and upcoming projects. The sheer scale makes these decisions more complicated and involved.

These things have actually made my job much more interesting, and I think I’d find it hard to adjust to a smaller environment.

Tom: Last April Facebook announced a move to the newly created WebScaleSQL. Yoshi, do you have an update on where WebScaleSQL is today? And I know it’s early, but has there been any impact on Facebook yet?

Yoshi: WebscaleSQL is a collaboration among engineers from several companies that face similar challenges in running MySQL at scale. Collaboration is nothing new to the MySQL community. The intent is to make this collaboration more efficient.

We are based on the latest upstream (currently MySQL-5.6.21), and added many features. We added patches to improve InnoDB performance around compression LRU flushing, locking, NUMA Support, and doublewrite. We statically link Semi-Sync based on lessons learned at Facebook environments (plugin-lock caused hot mutex contentions). We have many upcoming features such as async clients.

We will continue to track the upstream branch that is the latest, production-ready release (currently MySQL 5.6). We are continuing to push the generally useful changes we have from all of the participants.  If you think you have something to contribute, get in touch!

Tom: I remember being surprised earlier this year when you told me there was usually just one MySQL Operations team member on call at any given time thanks to “robots.” How many robots did your team build and what do they do? Oh, and should rank-and-file DBAs around the world be worried about losing their day jobs to these robots? ;-)

Shlomo: Instead of becoming obsolete as some fear, our team is shifting its focus from smaller to larger problems, as we rise higher in the levels of abstraction. Our team has progressed with the requirements of the role. From being a team of DBAs that automate some of their work, we have become more like Production Engineers. We design, write and maintain MySQL/Facebook-specific automation that does our work for us.

While we build these software “robots” to do our work, we also have to maintain them. The job of the oncall is to fix these robots when they malfunction, and that can sometimes be difficult due to the size of our codebase.

In regards to employment concerns, I’d say our work has become more interesting, and the amount has increased. It definitely did not decrease, so if Facebook is indicative of other companies, jobs are not at risk just yet. Speaking of jobs – if what we’re doing sounds interesting, we’re hiring!

Oh, and as for details about these “robots” – that’s the topic of my talk next week in London. Come and hear me speak if you want to know more!

Tom: Yoshi, you also will host a session titled “Fast Master Failover without Data Loss.” I don’t want to give too much away, but how did you get failover to work at scale – across vast datacenters?

Yoshi: Master failure is a norm at Facebook, because of the large amount of servers. Without automation, it is not realistic for a limited number of people to manage. We have a very interesting infrastructure to automate failure handling at Facebook scale. To automate stuff, reliability is important. Unreliable automation makes engineers spend lots of time fixing things manually, and that increases downtime. It is also important to define what to automate and what we shouldn’t automate. Define failure scenarios and write good test cases and continuously integrate. There are multiple failure scenarios like the ones below and you’ll hear about each in detail at my session:

– mysqld crash
– mysqld stalls
– kernel panic and reboot
– error spikes caused by H/W failure
– error spikes caused by bad application logic
– rack switch down
– multiple rack switches down
– datacenter down

Tom: What other sessions, keynotes or events are you looking forward to at Percona Live London 2014? And are you guys planning on attending the MySQL Community Dinner?

Yoshi:MySQL 5.7: Performance and Scalability Benchmark(led by Oracle MySQL performance architect Dimitri Kravtchuk). And yes, we’re looking forward to meeting with people at MySQL Community Dinner!

Tom:  Thanks again Yoshi and Shlomo for taking the time to speak with me and I look forward to seeing you both in London next week!

Percona Live London 2014 MySQL Community DinnerAnd readers, I invite you to register now for Percona Live London using the promotional code “Facebook” to save £30. I also hope to see you at the MySQL Community Dinner next Monday (Nov. 3). Space is limited so be sure to reserve your spot now and join us aboard our private double-decker bus to the restaurant.

I’d also like to thank the Percona Live London 2014 Conference Committee for putting together a terrific event this year! The conference committee includes:

  • Dailymotion’s Cédric Peintre, conference chairman
  • Percona’s David Busby
  • MariaDB’s Colin Charles
  • ebay Classifieds Group’s Luis Motta Campos
  • Booking.com’s Nicolai Plum
  • Oracle’s Morgan Tocker
  • Spil Games’ Art van Scheppingen

The post Facebook MySQL database engineers ready for Percona Live London 2014 appeared first on MySQL Performance Blog.

Sep
15
2014
--

Percona Live London: Top Ten reasons to attend Nov. 3-4

Percona Live London 2014Percona Live London 2014 is fast approaching – November is just around the corner. This year’s conference, November 3-4, will be even bigger and better than last year thanks to the participation of leading MySQL experts the world over (including you!).

The Percona Live London MySQL Conference is a great event for users of any level using any of the major MySQL branches: MySQL, MariaDB or Percona Server. And this year we once again host a star-studded group of keynote speakers from industry-leading companies in the MySQL space.

We’ll also be welcoming leading MySQL practitioners from across the industry (and from all corners of the world) who will speak on topics that matter to you now  – see the full conference schedule here:

Monday starts early with a full day of tutorials and a fun evening at the community dinner.  Attendees will be arriving in true London style on a double-decker bus! Tuesday morning will kick-off with a series of keynotes followed by interactive breakout sessions – wrapping things up at the end of the day with a fun post-conference reception (a great chance to make new friends and reconnect with old ones).

Here’s a sneak peek at some of the must-see events this year:

To recap, here are the Top Ten reasons to attend Percona Live London this November 3-4:

10. Advanced Rate Pricing ends October 5th
9. Hear about the hottest current topics and trends.
8. Network! Meet face-to-face in the “hallway track” and make lasting connections.
7. Learn how to make MySQL work better for you – regardless of your expertise.
6. Have a blast at the community dinner!
5. Discuss your unique challenges with experts and discover options for solving them.
4. Engage with the sponsors at their tabletop exhibits.
3. Listen to top industry leaders describe the future of the MySQL ecosystem
2. Learn what works, and what doesn’t, from leading companies using MySQL
1. And the Number 1 reason to attend Percona Live London 2014: ALL of the above!

I look forward to seeing you in London this November and don’t forgot that Advanced Rate Pricing pricing ends October 5 so be sure to register now!

The post Percona Live London: Top Ten reasons to attend Nov. 3-4 appeared first on MySQL Performance Blog.

Mar
27
2014
--

A conversation with 5 Facebook MySQL gurus

A conversation with 6 Facebook MySQL gurusFacebook, the undisputed king of online social networks, has 1.23 billion monthly active users collectively contributing to an ocean of data-intensive tasks – making the company one of the world’s top MySQL users.

A small army of Facebook MySQL experts will be converging on Santa Clara, Calif. next week where several of them are leading sessions at the Percona Live MySQL Conference and Expo. I had the chance to chat virtually with four of them about their sessions: Steaphan Greene, Evan Elias, Shlomo Priymak and Yoshinori MatsunobuMark Callaghan, who spoke at Percona Live last year, also joined our conversation which included a discussion of Facebook’s use of MySQL and other open source technologies.


Tom: What’s Facebook’s view of Open Source?

Steaphan: Facebook was built on open source software, and we still invest heavily in open source today. We understand the power these communities have to drive innovation – they allow us to focus on new challenges, as opposed to reinventing the wheel over and over again. And contributing as much as possible back to open projects is in everyone’s best interest.


Tom: Why MySQL? Wouldn’t NoSQL databases, for example, be better suited for the massive workloads seen at Facebook?

Mark: MySQL is great for many of our important workloads. We make it even better with our expertise in MySQL operations and engineering, and by working with the community and learning from their experience.

Yoshinori: I have not been able to find a transactional NoSQL database better than InnoDB. And it’s easy to understand how MySQL Replication works, which makes much easier to fix problems in production.


 Tom: How does Facebook make MySQL scale?

Steaphan: Sharding, automation, monitoring, and heavy investment in operations and performance engineering.


 Tom: What other things help Facebook run smoothly?

Steaphan: Our completely open culture, and the freedom all engineers here have to try any idea they have.


Tom: What is the top scaling challenge(s) Facebook faces in 2014 – and beyond?

Mark: Our biggest challenge is to make things better (performance, efficiency, availability) in the future at the rate we made things better in the past.

Yoshinori: Availability has improved a lot so far for us. Come to my session at Percona Live to hear about that. For me personally, efficiency is the biggest challenge for 2014 and 2015. This includes reducing space and optimizing for newer-generation hardware.


Tom: Facebook deployed MySQL 5.6 last year – including on critical environments – long before many other large organizations. What prompted such a move so soon? And where there any major concerns?

Steaphan: The same thing that prompts most efforts on the Facebook infra team: We will consider any technology that will help us improve performance, efficiency, or reliability, and we’re willing to accept the risk that sometimes comes along with adopting things like 5.6 very early on. But that’s only half the story here. The other half is that Facebook encourages its engineers to go after big bets like these — in this case it was just one engineer who made this happen. And we had the MySQL engineering talent we needed to work with the Oracle team to get 5.6 ready for production at our scale.

Yoshinori: At Facebook, we have three MySQL teams — Operations, Performance and Engineering. Facebook is one of the very few MySQL users that has internal MySQL developers. We all worked hard to adapt 5.6 to our scale and ensure that it would be production-ready. We found some issues after production deployment, but in many cases we could fix the problem and deployed new MySQL binary within one or two days. When deploying in production, we expected that we encountered MySQL 5.6 specific issues, which was typical when releasing new software. We were just confident that we could fix issues immediately.

Our 5.6 deployment step was not all at once. At first rollout, we disabled most major 5.6 features, such as GTID and binlog checksum. We gradually enabled such features in production.


Tom: Where there any significant issues in that move to MySQL 5.6? Any lessons learned you like to share – along with best practices you’d like to share?

Yoshinori: Performance regression of the CPU intensive replication was a main blocker for some of our applications. I wrote a blog post about this last year. We have several design plans to fix the problem on MySQL side. One of the most effective plans is grouping multiple transactions into one, since the most expensive part is writing to InnoDB system table at transaction commit. This optimization would be done when writing binary log, or by SQL thread. It may take longer time to test and deploy in production. For existing applications, we optimized to group multiple transactions from application side to mitigate the problem.


Tom: Performance monitoring is usually challenging at any organization. How do you do that at Facebook, which has tens of thousands of MySQL instances?

Yoshinori: Top-N monitoring is very important for managing a huge number of instances. Average statistics (for example: average innodb_rows_read across all instances) is not always useful since ~1% of problematic instances won’t noticeably change average numbers. p99 gives better indicators, but in our environment we typically have fewer than 0.1% instances causing problems, in which case p99 is not helpful either. We have several graphical and command-line tools to efficiently list up top-N bad behaving hosts. After listing up bad instances, the way to investigate root cause is pretty straightforward, like what MySQL consultants usually do. Server failure is something we expect and plan for at Facebook. For example, typical MySQL DBA at small companies may not encounter master instance failure during employment, because recent mysqld and H/W are stable enough. At Facebook, master failure is a norm and something the system can accommodate.


Tom: Evan, you and Yoshinori will present on Global Transaction ID (GTID) at Facebook. GTID is very tricky to deploy to an existing large-scale environment – how, and why, did you decide on adopting it?

Evan: Our primary motivations for adopting GTID all relate to either failover or binlog backups. When a master fails, getting replicas in sync with GTID is substantially simpler, faster, and less error-prone than previous methods of diffing binlogs. For backups, GTID is a cornerstone in building cross-datacenter point-in-time recovery, without needing redundant binlog streams from every region.

The “how” question is a bit more involved, and we’ll be covering this in detail during the session. The GTID project was a joint effort between three of Facebook’s MySQL teams. Santosh added new functionality to the MySQL server to make online rollout possible, and Yoshinori improved MHA to seamlessly support GTID-based failover. I added GTID support to all of our other in-house automation, and also scripted the rollout procedure across our many thousands of replica sets. A lot of validation logic and monitoring functionality was involved to ensure the safety of the rollout.


Tom: Shlomo, your session is titled “Under the Hood – MySQL Pool Scanner (MPS).” As you point out in your talk, Facebook has one of the largest MySQL database clusters in the world, comprising thousands of servers across multiple data centers. You must have an army of DBAs – or is there some secret you’d like to share? 

Shlomo: We do have an army, yes — it’s an “army of one.” We have one person on call on the MySQL Operations team at a given time, and they don’t even need to do all that much most days. We built “robots” to do our day to day jobs. The largest and most complex robot we have is MPS, an automated system to do most of the work a DBA might otherwise spend time on, such as replacement of broken or overflowing servers. Among other things, MPS also allows a human to initiate complex bulk operations with a few keystrokes, and it will follow up and complete the operations over the course of days or weeks.

I’ll be describing some of the complex MySQL automation systems we have at Facebook, and how they fit together during my talk.


Tom: Shlomo, what does a typical day look like for you there at Facebook?

Shlomo: The team’s work mostly focuses on maintaining those robots I’ve mentioned, as well as developing new ways to improve the reliability of our databases for Facebook’s users. This year the team also spent a lot of time making sure the new MySQL features such as GTIDs and semi-sync are deeply integrated in our automation. Every day, we work hard to to make ourselves obsolete, but we haven’t gotten there just yet!

On a typical day, I probably spend much of the time coding, mainly in Python. I also spend a significant amount of time working on capacity-related projects, such as thinking of ways to optimize the way we distribute the data across our fleet of servers.
Even after 2.5 years at Facebook, I am still in awe of the number of servers we manage. The typical small-scale maintenance operation at Facebook probably involves more servers than all the companies I’ve previously worked for had, combined. It really is pretty amazing!


Tom:  What are you looking forward to the most at this year’s conference?

Evan: There are plenty of fascinating sessions this year. Just to mention a handful: Jeremy Cole and Davi Arnaut’s session on innodb_ruby, since it’s a very unique way to interactively learn about InnoDB’s internals. Baron Schwartz’s session on using Go with MySQL, as VividCortex is blazing the trail here. Peter Boros and Kenny Gryp’s talk on scalability and benchmarking, which I’m hoping will include recent developments of Percona Playback. Tom Christ’s session on my former project Jetpants, to see how it has evolved over the past year at Tumblr. And several talks by Oracle engineers about upcoming functionality in MySQL 5.7.

Steaphan: In addition to the conference sessions, I look forward to the birds of a feather session with the MySQL team.  Last year, it proved to be a valuable opportunity to engage with those upstream developers who make the changes we care about, and I expect the same this year.


Tom: If you could talk to a DBA or developer on the fence about attending this year’s conference, what would be your top 3-5 reasons for making it over to Santa Clara for this event?

Evan: I’m based in NYC, so I’m traveling a bit further than many of my colleagues, but I can still confidently say that Percona Live is well worth the trip. The MySQL ecosystem is very healthy and constantly evolving, and the conference is the best place to learn about ongoing developments across a wide spectrum of companies and contributors. It’s also a perfect opportunity to personally connect with all of the amazing engineers, DBAs, users, and vendors that make MySQL so unique and compelling.


 The Percona Live MySQL Conference and Expo 2014 runs April 1-4 in Santa Clara, Calif. Use the code “SeeMeSpeak” when registering and save 10 percent. The inaugural Open Source Appreciation Day is on March 31 – this full-day event is free but because space is limited I suggest registering now to reserve your spot.

The post A conversation with 5 Facebook MySQL gurus appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com