Jun
22
2022
--

Missed Percona Live 2022? Watch It On Demand Now

Couldn’t make it to Austin this year for Percona Live? Don’t worry! We’ve got you covered.

This year’s show included 80+ sessions by leading open source database experts alongside big names like Shopify, HubSpot, Venmo, Amazon, and Facebook. And every session is available right now, on demand.

Here are just a few of the sessions you can access right now. 

Database Incident Management: When Everything Goes Wrong

Speaker: Joshua Varner, Shopify

When database systems go down, the consequences are usually severe. And who’s the last line of defense protecting a company’s data? Database professionals. During this talk, Joshua Varner of Shopify discusses incident management practices that can help any DBA/DRE work as a team through the toughest of issues – even when the pressure is high. Joshua shares important tips that span the various stages of an incident – detection, alerting, response, postmortem, and prevention – as well as the unique challenges database teams face as compared to other disciplines.

Migrating Facebook/Meta to MySQL 8.0

Speakers: Herman Lee and Pradeep Nayak, Facebook/Meta

MySQL powers some of Facebook/Meta’s most important workloads. Because the Facebook/Meta team actively develops new features in MySQL to support their evolving requirements, migrating workloads to major versions requires significant time and effort. Their upgrade to version 5.6 took more than a year, and moving to 5.7 would have been no faster, as they were in the middle of developing their MyRocks storage engine. By MySQL 8.0, they had finished MyRocks and were ready to take advantage of that version’s compelling new features, like atomic DDL support. Even so, the migration would come with some tough challenges. In this talk, Herman Lee and Pradeep Nayak share how Facebook/Meta tackled their 8.0 migration project and some of the surprises they discovered along the way.

Percona Monitoring and Management (PMM) For Novices

Speaker: Dave Stokes, Percona

The use of open source databases comes with huge benefits like greater innovation, lower costs, liberal licensing, data portability, and the ability to extend the code. But how can you effectively monitor and manage your databases in the face of increasing complexity? Percona Monitoring and Management (PMM) is an open source database observability, monitoring, and management tool for MySQL, PostgreSQL, and MongoDB. In this session, open source database expert Dave Stokes walks you through how to install PMM, how to register databases with it, and how to use it to simplify operations, optimize performance, and improve the security of your database services. Tune into this session to see how Dave starts at zero to make you a hero at monitoring your databases. 

Simplifying the Top Complexities While Migrating From Oracle to PostgreSQL

Speaker: Avinash Vallarapu, MigOps

Migrating from Oracle to PostgreSQL is complex and can seem like a risky and daunting task. It doesn’t have to be. In this talk, Avinash Vallarapu shows you how to simplify some of the major complexities that may occur when moving from Oracle to PostgreSQL. Specifically, he discusses and demos:

  • Converting sysdate from Oracle to PostgreSQL. Among a variety of options available in Postgres, which one is correct?
  • Converting Hierarchical Queries from Oracle to PostgreSQL. Is Recursive CTE a correct alternative?
  • Understanding Internals of the available solutions.
  • Implementing Associative Arrays in PostgreSQL.
  • Understanding the correct alternatives for Oracle Array Indexes and other complexities.

Big Data…Or Just Too Much Data?

Speaker: Marcos Albe, Percona

Has your multi-terabyte MySQL instance become unmanageable? Are reporting queries taking longer and longer? Do backup storage costs keep increasing? Do simple queries take seconds to run? In this talk, Marcos Albe discusses the practical limits of MySQL, how to tame your data, and what the alternatives to MySQL are.

The Future of Percona Products

Speaker: Donnie Berkholz, Percona

Percona has long been a leader in the global open source community. As we head into the future, we’re excited to unveil our plans to help even more companies harness the benefits of open source databases. In this session, Donnie Berkholz shares the Percona product roadmap for the next few years and makes a major announcement about Percona Platform. Whether you’re already using Percona software or services or just considering it, you don’t want to miss this session.

MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)

Speaker: Jean-François Gagné, HubSpot

Parallel Replication is an effective feature for achieving faster replication and reducing lag. While MySQL implements Parallel Replication through LOGICAL_CLOCK, fully benefiting from this mechanism requires more than just enabling it. In this talk, Jean-François Gagné covers important things you need to know about MySQL 5.7 and 8.0 Parallel Replication. He explains in detail how LOGICAL_CLOCK works and provides expert advice on how to optimize Parallel Replication in MySQL. He also discusses changes made in MySQL 8.0, and back-ported in 5.7, that greatly improve the potential for parallel execution on replicas. 

Deploying MongoDB Sharded Clusters Easily With Terraform and Ansible

Speaker: Ivan Groenewold, Percona

Installing big clusters can be a time-consuming task. In this talk, you’ll learn how to develop a complete pipeline to deploy MongoDB sharded clusters. By combining Terraform for the hardware provisioning and Ansible for the software installation, we can save time and provide a standardized reusable solution.

Sign up to see all Percona Live sessions

Learn from the best and brightest in the open source database industry. Watch these and other Percona Live 2022 sessions now.

 

May
25
2022
--

Join a New Virtual Event This June – Percona Community Live!

Percona Community Live 2022

Percona Live 2022 in Austin has just ended, but we haven’t wasted any time: we’ve already prepared a new event for you. We understand that traveling can still be challenging for both speakers and attendees, which is why this event will be virtual. It will take place June 21–23 and will be streamed on YouTube, LinkedIn, and Twitch.

For this event, we invited speakers whose amazing talks we were not able to accommodate in the tight schedule of Percona Live, and also other fantastic speakers.

Good news – we are still accepting submissions from everyone, and we look forward to hearing from you! We welcome any talks related to open source technologies, MySQL, PostgreSQL, MongoDB, and MariaDB databases, as well as new and emerging technologies. The CFP closes on Monday, May 31, so don’t hesitate: send your abstracts here now!

Here are some of the amazing talks that are already on the schedule:

  • Flexible Indexing with Postgres by Bruce Momjian (Postgres Evangelist, EDB)
  • PMM on Kubernetes by Denys Kondratenko (Director of Software Engineering, Percona)
  • The Open-Source Distributed Graph Database: Nebula Graph by Wey Gu (Developer Advocate, Vesoft)
  • 10 Deadly PostgreSQL Sins by Matt Yonkovit (HOSS, Percona) and Barrett Chambers (Director, Solutions Engineering, Percona)
  • I Dropped My Database! Now What? – A Dive Into PostgreSQL Backup Using pgBackRest and How to Use it for PITR by Charly Batista (Postgres Tech Lead, Percona)
  • The MySQL Ecosystem in 2022 by Colin Charles (Codership)
  • The Ways of Performance Schema and How to Leverage it for Troubleshooting by Leonardo Bacchi Fernandes (Support Engineer, Percona)

And more!

If you are not ready to give a talk, we welcome you to join the event as an attendee. Just come to our streams on YouTube, LinkedIn, or Twitch. All the updates will be published on the event’s page. Also, follow us on Twitter so you don’t miss any important news.

Apr
15
2022
--

What Else Do You Need To Learn at Percona Live?

Learn at Percona Live

Percona Live is about one month away, and the schedule is already packed with content. But what are we missing?

Occasionally a speaker has to cancel at the last minute, or we discover a room with an unused time slot. If one of these speaking slots opens up, what would you want to see presented? What subject did we miss, or not emphasize enough, on the schedule?

This is your chance to tailor Percona Live to your needs.

Please let me know via email or in the comments before the show and I will pursue your suggestions.  And while I may not know someone who could present ‘Kubernetes Explained In Interpretive Dance’ or ‘GDPR Data Retention Issues for non-European Union Countries’, there is a chance that I can track down someone who does.

Do we need an hour (or so) on data normalization, scrubbing user input, ETL tools, MongoDB indexing methodology, what’s new in distributed lock management, SQL syntax, or hybrid cloud management? Sometimes a subject receives no submissions and therefore gets no coverage at the show, but this time we may be able to arrange coverage for something we missed.

Or are you in the ‘I missed the call for papers deadline and really want to present’ category?  Well, this just may be the last opportunity to speak in Austin for 2022.

We are also looking at presenting a virtual event in late June, where we will repeat some of the Austin content and have room for new material. So please let me know what is lacking, what you want to learn about, a speaker you would like to see, or an activity you would like added to Percona Live or the virtual event. Percona’s staff is pretty sharp at this database stuff, but we are not perfect at knowing your needs. This is your chance to tell us which general areas need more emphasis (more beginner subjects? more Kubernetes?) so we can serve our community better.

Register for Percona Live

Apr
12
2022
--

11 Sessions Not to Miss at Percona Live 2022

Percona Live 2022

Percona Live is a jam-packed few days. There are over 100 sessions, covering a range of open source topics and featuring several tactical, hands-on demos. Obviously, you can’t attend every one of them, and parsing through the agenda can take some time.

That’s why we’ve compiled a list of what we’re calling “hot topic” sessions. These are the ones that are top-of-mind in the open source community and feature heavy-hitting speakers from Meta/Facebook, Venmo, Amazon, and, yes, Percona. They’re also the only sessions that are available for live streaming.

Below, you’ll find the full list of “hot topic” sessions. And if you haven’t yet registered for Percona Live, the largest open source database conference in the world, head on over to our registration page. Just do it quickly – Percona Live is happening May 16th – 18th in Austin, TX. 

Tuesday, May 17, 2022

The Evolution of a MySQL Database: From a Single Instance to HA with Disaster Recovery, Frédéric Descamps, Oracle, 10:30 AM

This session discusses how to make the transition from a single MySQL instance to multi-site high availability, as well as which solutions are best suited to changing business requirements (RPO, RTO).  

MySQL Performance Diagnostic Using Percona PMM, Marcos Albe, Percona, 11:30 AM

In this session, we show how we use Percona Monitoring and Management (PMM) at our Support department to systematically find the root cause of performance issues.  We will showcase our bottom-up approach, based on the U.S.E. method by Brendan Gregg, and will diagnose a live workload that the attendees themselves will have a chance to tweak during the presentation.

MariaDB Scaling – From Single Instance to Multiple Clusters, Michal Kuchta, Seznam.cz, 1:30 PM

In this session, attendees will follow the story of the growing application with respect to database architecture – from a single-instance database on a developer’s machine to cross-DC multi-cluster HA setup using Galera, ProxySQL, and anycast routing.

Scaling Venmo Applications for Growth With Zero Database Downtime, Kushal Shah, Venmo (PayPal Inc.), 2:30 PM

This talk describes how Venmo used proxy solutions for MongoDB and Aurora MySQL to scale with traffic, the various alternatives they explored, and the driving factors that narrowed them down to a chosen approach. Venmo will share how they migrated traffic to the proxy with zero downtime, the challenges they faced, and the solutions they applied to achieve their goal.

Looking Ahead at PostgreSQL 15, Jonathan Katz, AWS, 4:00 PM 

This talk explores many of the new features that will be released in the next major version of PostgreSQL and explains how they can impact your workload. 

Data Consistency at Scale at Meta, Junyi Lu, Meta/Facebook, 5:00 PM

Meta/Facebook has gone through multiple large-scale MySQL rollouts like MyRocks and 8.0 migrations. To avoid the potential data inconsistency introduced by those migrations and various other small rollouts like minor version upgrades, they implemented several large-scale consistency checking tools to help surface issues at an early stage. In this talk, they will cover how these tools work under the hood and share some stories about how they have helped in large-scale rollouts.

Wednesday, May 18, 2022

Deploying Highly Available, Durable Amazon RDS for PostgreSQL and for MySQL Databases with Multi-AZ with Two Readable Standbys, Vijay Karumajji, AWS, 9:30 AM

Amazon Relational Database Service (Amazon RDS) for PostgreSQL and for MySQL now support a new Amazon RDS Multi-AZ deployment option with one primary and two readable standby database (DB) instances across three Availability Zones (AZs). With this new option, users can get up to 2x faster transaction commit latency compared to a Multi-AZ deployment with one standby, failovers that typically complete in under 35 seconds, and additional read capacity. In this session, learn about the new option, how it compares to other high availability options, and see a demo on how to get started.

Postgres and the Artificial Intelligence, Ibrar Ahmed, Percona LLC, 10:50 AM

Artificial intelligence, machine learning, and deep learning are intertwined capabilities that attempt to solve problems that defy traditional computational solutions — problems including fraud detection, voice recognition, and search result recommendations. While they elude simple computation, they are computationally expensive, involving the calculation of perhaps millions of probabilities and weights. These computations can be done outside the database, but there are specific advantages of doing machine learning inside it, close to where the data is stored. This presentation explains how to do machine learning inside the Postgres database.

Multi-Tenant Kubernetes Cluster With Percona Operators, Chetan Shivashanker, Percona, 11:50 AM

Percona Operators provide a great way to run databases on Kubernetes. But what if you want to run multiple databases on a single k8s cluster? In this talk, we will explore how this is possible and also walk through some of the best practices that make multi-tenancy of databases on k8s a smooth experience.

Efficient MySQL Performance, Daniel Nichter, Block, 2:00 PM

MySQL performance can be challenging for new software engineers: where does one even begin? Even experienced engineers can find MySQL performance challenging because it’s not their area of expertise. This session covers the path to learning and achieving better MySQL performance by focusing on the most important topics for software engineers using MySQL, not aspiring DBAs.

Database Resiliency, Ravikumar Buragapu, Adobe, 3:00 PM

This session covers cutting-edge strategies for handling scalability, resiliency, and high availability; building fault-tolerant database solutions; preventing global outages by stopping the propagation of failures; scaling a high-volume, high-transaction database solution; and safely performing complex database changes in distributed database solutions.

 

Register for Percona Live

Mar
10
2022
--

PowerOn Your Voice at Percona Live 2022

Percona Live 2022

Percona Live 2022 is comin’ in hot as May 16-18 draws near. This year, we will be supercharging open source conversations, community, and expertise by welcoming the best and brightest open source database users to Austin, Texas for three days of knowledge sharing.

Have a unique perspective on open source database tech? Percona Live attendees would love to hear it. Here’s the lowdown:

Three Different Session Types

  • Breakout Session – Broadly cover a technology area using specific examples. Breakout sessions are 50 minutes to include time for Q&A.
  • Tutorial Session – Blend the feel of a training class and a conference breakout session in a detailed and hands-on presentation on a technical topic. We encourage attendees to bring laptops to follow along. Tutorials will be three hours to include time for Q&A.
  • Lightning Talk – Share a five-minute crowd-pleaser about a new idea, successful project, cautionary tale, or quick tip. Focus on a key takeaway that would interest the open source community, and consider giving a snappy demonstration. Think technical, lighthearted, and entertaining. You have five minutes to shine.

Some tips to make your proposal unforgettable:

  • Offer a complete perspective in your proposal.
  • Demonstrate your unique insights with a case study, a personal experience, or technical knowledge.
  • Are there certain reasons that drive your use of open source databases?
  • Did you just embrace open source databases this year? What motivated that move, e.g. ROI?

Tracks

  • MySQL – Do you have an opinion on the latest in MySQL? With the release of MySQL 8.0, what new features are helping you solve business issues or make the deployment of applications and websites easier, faster, or more efficient? Did the new release influence you to choose or switch to MySQL? What’s been the biggest impact of the MySQL 8.0 release for you? Do you use MySQL in conjunction with other databases in your environment?
  • MariaDB – How have the latest features of MariaDB, MariaDB compatible databases, and related tools allowed you to optimize performance? What best practices have you adopted? Could you demonstrate with real production use cases and applications?
  • PostgreSQL – In what ways have you benchmarked or compared PostgreSQL against other types of databases, what prompted this, and what were your results? How has PostgreSQL won out over other SQL options? How does PostgreSQL help you with application performance or deployment? How do you use PostgreSQL in conjunction with other databases in your environment?
  • MongoDB – How has the 5.0 release improved your experience in application development or time-to-market? What new features make your database environment better and why? What is it about MongoDB 5.0 that excites you? What’s significant about your experience with Atlas? Have you moved to it, and has it lived up to its promises? Do you use MongoDB in conjunction with other databases in your environment?
  • Observability and Monitoring – How are you designing your database-powered applications for observability? What observability and monitoring tools and methods give you the best application and database insights for running your business? How are you using tools to troubleshoot issues and bottlenecks? What new patterns of database behavior have you identified using observability tools, and what makes the best tools stand out?
  • Kubernetes & Containers – How are you running open source databases on the Kubernetes, OpenShift, and other container platforms? What software helps you reach your objectives? What best practices and processes make containers a vital part of your business strategy?
  • Emerging Technologies – Talks on new technologies and products on the market. Not a product pitch, though: show us or teach us something cool.

Hot Topics

  • Cloud-native applications and the databases that support them
  • Open-source database deployments and technologies
  • Observability and troubleshooting of your database infrastructure
  • How to secure and protect your data infrastructure
  • Database development best practices, tips, and tricks
  • Managing databases at scale (or how to manage 1000s of databases across multiple sites)
  • Performance and optimization techniques, tricks, and strategies for optimizing your database deployments

Speakers

Call for Papers is open until March 14, 2022. Submit your session now! 

Not sure what to submit?  Check out our list of ideas.

If your proposal is selected for breakout or tutorial sessions, you will receive a complimentary full conference pass. 

Looking forward to seeing you there!

Mar
08
2022
--

Ideas, Topics, and Suggested Topics for Percona Live 2022

Percona Live 2022

I have had a lot of conversations with people who are interested in participating in Percona Live this year but are looking for ideas on what talks or tutorials to submit. I decided to put together a list of topics I think would make great sessions, along with topics I have heard the community ask about.

Generally speaking, talks that show real-world architectures, deployments, and use cases are always well received and well attended. People love learning about, and often gain inspiration from, how others are deploying and using their favorite databases. Talks like “How my company deployed <insert database> to do <insert something interesting>” are uniquely yours and will be well received.

Submit Your Talk

Also good advice: don’t be afraid to submit. Many people talk themselves out of submitting for fear that their session is too basic or may not be “good enough”. Often, 101 talks are very helpful not only for people at the conference but also for those starting out later on (slides/videos will be posted online). If you need some help, reach out to me at hoss@percona.com and I will be happy to help with slide review, ideas, or a walkthrough of your talk ahead of time.

If you are a company with a product and are looking to speak about it, keep in mind that the best talks avoid product pitches and instead focus on how to overcome or solve a particular problem. Show us something cool, teach us something new, or show us how others are using the software.

Here is Hoss’s big list of ideas broken into categories (this is not an exhaustive list, just off the top of my head):

PostgreSQL:

  • Kubernetes for the PostgreSQL DBAs
  • PostgreSQL for Oracle DBAs
  • Best Practices for PostgreSQL on AWS, Azure, or Google Cloud
  • PostgreSQL Schema Design
  • Securing PostgreSQL 
  • Performance Tuning PostgreSQL
  • Geo-Distributed PostgreSQL 
  • Using JSON Datatypes within PostgreSQL
  • Setting up and Optimizing Patroni 
  • PostgreSQL Deep Tuning Secrets: Most people know the easy options for tuning, but what are some hidden ones?
  • PostgreSQL Query Design Tips and Tricks
  • Setting up PostgreSQL for Analytics workloads, features, and tricks for analytics environments
  • Large Scale PG:  The challenges of running PostgreSQL at scale, tips and tricks at scale, sharding, etc
  • War Stories:  People love hearing about how people built their environments and what the setup looked like.  Use cases are awesome.  
  • PostgreSQL tuning walkthrough
  • The SRE guide to maintaining, optimizing, and fixing PostgreSQL
  • How to extend PostgreSQL ( Building an extension )
  • Finding, tracking, and fixing Disk IO bottlenecks in PostgreSQL
  • Secrets of the all-important vacuum 
  • PostgreSQL Indexing overview: which index types work best in what workloads
  • Migrating to PostgreSQL from SQL Server, Oracle, or other databases
  • Monitoring PostgreSQL 101:  What should you be monitoring, and what alerts should you set up
  • Backup best practices for PostgreSQL
  • Timeseries Data in PostgreSQL

MySQL:

  • Kubernetes for the MySQL DBAs
  • Best Practices for MySQL on AWS, Azure, or Google Cloud
  • MySQL Schema Design
  • Securing MySQL 
  • Performance Tuning MySQL
  • Geo-Distributed MySQL 
  • Using JSON Datatypes within MySQL
  • MySQL Deep Tuning Secrets: Most people know the easy options for tuning, but what are some hidden ones?
  • MySQL Query Design Tips and Tricks
  • Large Scale MySQL: The challenges of running MySQL at scale, tips and tricks at scale, sharding, etc
  • War Stories:  People love hearing about how people built their environments and what the setup looked like.  Use cases are awesome.  
  • MySQL tuning walkthrough
  • The SRE guide to maintaining, optimizing, and fixing MySQL
  • Finding, tracking, and fixing Disk IO bottlenecks in MySQL
  • Monitoring MySQL 101:  What should you be monitoring, and what alerts should you set up
  • Using the MySQL Shell – Tips, tricks, even an overview
  • War stories/Use Cases of deploying InnoDB Cluster or PXC (architectural overview, etc.)
  • Backup best practices for MySQL

MongoDB:

  • Kubernetes for the MongoDB DBAs & Developers
  • Topics on: MongoDB Sharding, Rebalancing, picking your shard key, etc
  • Walking through identifying and drilling into a slow down in MongoDB
  • MongoDB Schema Design/Validation 
  • Deep secrets of MongoDB ( things people don’t often use )
  • Securing MongoDB
  • MongoDB Backup and restore best practices
  • Monitoring MongoDB 101:  What should you be monitoring, and what alerts should you set up
  • Large Scale MongoDB: The challenges of running MongoDB at scale, tips and tricks at scale, sharding, etc
  • War Stories:  People love hearing about how people built their environments and what the setup looked like.  Use cases are awesome.  
  • The SRE guide to maintaining, optimizing, and fixing MongoDB

MariaDB:

  • Best Practices for MariaDB on AWS, Azure, or Google Cloud
  • MariaDB Schema Design
  • Securing MariaDB 
  • Performance Tuning MariaDB
  • Geo-Distributed MariaDB 
  • Using JSON Datatypes within MariaDB
  • MariaDB Deep Tuning Secrets: Most people know the easy options for tuning, but what are some hidden ones?
  • MariaDB Query Design Tips and Tricks
  • Large Scale MariaDB: The challenges of running MariaDB at scale, tips and tricks at scale, sharding, etc
  • War Stories:  People love hearing about how people built their environments and what the setup looked like.  Use cases are awesome.  
  • Finding, tracking, and fixing Disk IO bottlenecks in MariaDB
  • Monitoring MariaDB 101:  What should you be monitoring, and what alerts should you set up

Misc Databases:

(We would love to hear from other people using other databases not listed.)

  • Topics on using Redis, Elastic, OpenSearch, Cassandra, Couchbase, etc are all welcome
    • Multi-database topics at the conference generally are well received and often well attended.  Things like using Redis and PostgreSQL together are great topics.
  • There is a lot of Buzz around “NewSQL” solutions, Yugabyte, TiDB, Cockroach, etc.  Introduction talks as well as advanced use cases would be very interesting. 

Infrastructure:

  • Kubernetes 101 for DBAs – Let’s be honest, many DBAs may not be used to this yet
  • Topics on best practices for databases running on containers/Kubernetes are very timely and welcome
  • Linux Tuning Tips and tricks for database servers
  • Topics showing concrete tools/methods for the automation of infrastructure and backend tasks are really good as well.

Development:

  • Diving into ORMs and comparing the pros and cons of different ones is interesting
  • Talking about best practices for using your favorite database with different programming languages is also a good topic (Go, Python, PHP, etc.)
  • Fixing, Tuning, and Optimizing MySQL/PostgreSQL for your favorite 3rd party application (WordPress, Drupal, etc.)
  • Full Stack Tuning:  Tools/Methods for walking through issues from user pain, through code, and down to the database
  • CI/CD tips, tricks, best practices when dealing with databases
  • Schema Migration Best Practices
  • Effective Testing Code/Database Changes 
  • Database Error Handling 101
  • Connection Management 101
  • Connecting Your App to A Highly Distributed, Geo-Located Database
  • Storing Time Series Data
  • Accessing Data From Multiple Data Sources/Types

Miscellaneous/General:

  • Show us new features or extensions in your favorite database
  • Topics on deep tracing, troubleshooting, and internal debugging often gain a great audience (bpftrace, flame graphs, etc.)
  • How to test your backups (any database) when your data set is huge
  • Talks on new technologies and products out on the market, not a product pitch however, show us/teach us something cool

Note that all of these are merely suggestions and ideas. There may be plenty of super interesting talks/topics not listed here. If you don’t see something you like or think you can do, feel free to reach out; we can talk, and I can help you create a talk that is uniquely you. Speaker space is limited, so picking a topic above won’t guarantee a spot, but these are topics that are generally well received.

What do you think?  Did I miss something?  Do you have a topic you would like to see?  Drop it into the comments section below.  See you at Percona Live!

Jun
11
2021
--

PostgreSQL HA with Patroni: Your Turn to Test Failure Scenarios

PostgreSQL HA with Patroni

A couple of weeks ago, Jobin and I did a short presentation during Percona Live Online bearing a similar title to this post: “PostgreSQL HA With Patroni: Looking at Failure Scenarios and How the Cluster Recovers From Them”. We deployed a 3-node PostgreSQL environment with some recycled hardware we had lying around and set about “breaking” it in different ways: unplugging network and power cables, killing main processes, and attempting to saturate processors. All of this while continuously writing and reading data from PostgreSQL. The idea was to see how Patroni would handle the failures and manage the cluster to continue delivering service. It was a fun demo!

We promised a follow-up post explaining how we set up the environment, so you could give it a try yourselves, and this is it. We hope you also have fun attempting to reproduce our small experiment, but mostly that you use it as an opportunity to learn how a PostgreSQL HA environment managed by Patroni works in practice: there is nothing like a hands-on lab for this!

Initial Setup

We recycled three 10-year-old Intel Atom mini-computers for our experiment, but you could use virtual machines instead: even though you will miss the excitement of unplugging real cables, this can still be simulated with a VM. We installed the server version of Ubuntu 20.04 and configured the machines to know “each other” by hostname; here’s what the hosts file of the first node looked like:

$ cat /etc/hosts
127.0.0.1 localhost node1
192.168.1.11 node1
192.168.1.12 node2
192.168.1.13 node3

etcd

Patroni supports a myriad of systems as its Distributed Configuration Store (DCS), but etcd remains a popular choice. We installed the version available from the Ubuntu repository on all three nodes:

sudo apt-get install etcd

It is necessary to initialize the etcd cluster from one of the nodes and we did that from node1 using the following configuration file:

$ cat /etc/default/etcd
ETCD_NAME=node1
ETCD_INITIAL_CLUSTER="node1=http://192.168.1.11:2380"
ETCD_INITIAL_CLUSTER_TOKEN="devops_token"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.1.11:2380"
ETCD_DATA_DIR="/var/lib/etcd/postgresql"
ETCD_LISTEN_PEER_URLS="http://192.168.1.11:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.1.11:2379,http://localhost:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.1.11:2379"

Note how ETCD_INITIAL_CLUSTER_STATE is defined with “new”.

We then restarted the service:

sudo systemctl restart etcd

We can then move on to install etcd on node2. The configuration file follows the same structure as that of node1, except that we are adding node2 to an existing cluster so we should indicate the other node(s):

ETCD_NAME=node2
ETCD_INITIAL_CLUSTER="node1=http://192.168.1.11:2380,node2=http://192.168.1.12:2380"
ETCD_INITIAL_CLUSTER_TOKEN="devops_token"
ETCD_INITIAL_CLUSTER_STATE="existing"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.1.12:2380"
ETCD_DATA_DIR="/var/lib/etcd/postgresql"
ETCD_LISTEN_PEER_URLS="http://192.168.1.12:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.1.12:2379,http://localhost:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.1.12:2379"

Before we restart the service, we need to formally add node2 to the etcd cluster by running the following command on node1:

sudo etcdctl member add node2 http://192.168.1.12:2380

We can then restart the etcd service on node2:

sudo systemctl restart etcd

The configuration file for node3 looks like this:

ETCD_NAME=node3
ETCD_INITIAL_CLUSTER="node1=http://192.168.1.11:2380,node2=http://192.168.1.12:2380,node3=http://192.168.1.13:2380"
ETCD_INITIAL_CLUSTER_TOKEN="devops_token"
ETCD_INITIAL_CLUSTER_STATE="existing"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.1.13:2380"
ETCD_DATA_DIR="/var/lib/etcd/postgresql"
ETCD_LISTEN_PEER_URLS="http://192.168.1.13:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.1.13:2379,http://localhost:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.1.13:2379"

Remember we need to add node3 to the cluster by running the following command on node1:

sudo etcdctl member add node3 http://192.168.1.13:2380

before we can restart the service on node3:

sudo systemctl restart etcd

We can verify the cluster state to confirm it has been deployed successfully by running the following command from any of the nodes:

$ sudo etcdctl member list
2ed43136d81039b4: name=node3 peerURLs=http://192.168.1.13:2380 clientURLs=http://192.168.1.13:2379 isLeader=false
d571a1ada5a5afcf: name=node1 peerURLs=http://192.168.1.11:2380 clientURLs=http://192.168.1.11:2379 isLeader=true
ecec6c549ebb23bc: name=node2 peerURLs=http://192.168.1.12:2380 clientURLs=http://192.168.1.12:2379 isLeader=false

As we can see above, node1 is the leader at this point, which is expected since the etcd cluster has been bootstrapped from it. If you get a different result, check for etcd entries logged to /var/log/syslog on each node.
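
If you want a quicker overall health check in addition to the member list, the etcd client also has a dedicated command for it. This is purely an optional extra step, assuming the default (v2 API) etcdctl shipped with Ubuntu 20.04:

sudo etcdctl cluster-health

Every member should report as healthy; if one does not, the syslog entries mentioned above are the place to look.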

Watchdog

Quoting Patroni’s manual:

Watchdog devices are software or hardware mechanisms that will reset the whole system when they do not get a keepalive heartbeat within a specified timeframe. This adds an additional layer of fail safe in case usual Patroni split-brain protection mechanisms fail.

While the use of a watchdog mechanism with Patroni is optional, you shouldn’t really consider deploying a PostgreSQL HA environment in production without it.

For our tests, we used the standard software implementation for watchdog that is shipped with Ubuntu 20.04, a module called softdog. Here’s the procedure we used in all three nodes to configure the module to load:

sudo sh -c 'echo "softdog" >> /etc/modules'

Patroni will be the component interacting with the watchdog device. Since Patroni is run by the postgres user, we need to either set the permissions of the watchdog device open enough so the postgres user can write to it or make the device owned by postgres itself, which we consider a safer approach (as it is more restrictive):

sudo sh -c 'echo "KERNEL==\"watchdog\", OWNER=\"postgres\", GROUP=\"postgres\"" >> /etc/udev/rules.d/61-watchdog.rules'

These two steps looked like all that would be required for the watchdog to work, but to our surprise, the softdog module wasn’t loaded after restarting the servers. After spending quite some time digging around, we figured out the module was blacklisted by default and there was a stray file with such a directive still lingering around:

$ grep blacklist /lib/modprobe.d/* /etc/modprobe.d/* |grep softdog
/lib/modprobe.d/blacklist_linux_5.4.0-72-generic.conf:blacklist softdog

Editing that file in each of the nodes to remove the line above and restarting the servers did the trick:

$ lsmod | grep softdog
softdog                16384  0

$ ls -l /dev/watchdog*
crw-rw---- 1 postgres postgres  10, 130 May 21 21:30 /dev/watchdog
crw------- 1 root     root     245,   0 May 21 21:30 /dev/watchdog0
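
If you would rather not reboot between these steps, you can usually load the module and re-apply the udev rules by hand. This is a small optional shortcut we did not use in our runs, so treat it as an assumption rather than part of the tested procedure:

# Load softdog immediately and ask udev to re-evaluate its rules
sudo modprobe softdog
sudo udevadm trigger

Either way, confirm with lsmod and ls -l /dev/watchdog* as shown above before moving on.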

PostgreSQL

Percona Distribution for PostgreSQL can be easily installed from the Percona Repository in a few easy steps:

sudo apt-get update -y; sudo apt-get install -y wget gnupg2 lsb-release curl
wget https://repo.percona.com/apt/percona-release_latest.generic_all.deb
sudo dpkg -i percona-release_latest.generic_all.deb
sudo apt-get update
sudo percona-release setup ppg-12
sudo apt-get install percona-postgresql-12

An important concept to understand in a PostgreSQL HA environment like this one is that PostgreSQL should not be started automatically by systemd during the server initialization: we should leave it to Patroni to fully manage it, including the process of starting and stopping the server. Thus, we should disable the service:

sudo systemctl disable postgresql

For our tests, we want to start with a fresh new PostgreSQL setup and let Patroni bootstrap the cluster, so we stop the server and remove the data directory that has been created as part of the PostgreSQL installation:

sudo systemctl stop postgresql
sudo rm -fr /var/lib/postgresql/12/main

These steps should be repeated in nodes 2 and 3 as well.

Patroni

The Percona Repository also includes a package for Patroni so with it already configured in the nodes we can install Patroni with a simple:

sudo apt-get install percona-patroni

Here’s the configuration file we have used for node1:

$ cat /etc/patroni/config.yml
scope: stampede
name: node1

restapi:
  listen: 0.0.0.0:8008
  connect_address: node1:8008

etcd:
  host: node1:2379

bootstrap:
  # this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
#    master_start_timeout: 300
#    synchronous_mode: false
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        logging_collector: 'on'
        max_wal_senders: 5
        max_replication_slots: 5
        wal_log_hints: "on"
        #archive_mode: "on"
        #archive_timeout: 600
        #archive_command: "cp -f %p /home/postgres/archived/%f"
        #recovery_conf:
        #restore_command: cp /home/postgres/archived/%f %p

  # some desired options for 'initdb'
  initdb:  # Note: It needs to be a list (some options need values, others are switches)
  - encoding: UTF8
  - data-checksums

  pg_hba:  # Add following lines to pg_hba.conf after running 'initdb'
  - host replication replicator 192.168.1.1/24 md5
  - host replication replicator 127.0.0.1/32 trust
  - host all all 192.168.1.1/24 md5
  - host all all 0.0.0.0/0 md5
#  - hostssl all all 0.0.0.0/0 md5

  # Additional script to be launched after initial cluster creation (will be passed the connection URL as parameter)
# post_init: /usr/local/bin/setup_cluster.sh
  # Some additional users users which needs to be created after initializing new cluster
  users:
    admin:
      password: admin
      options:
        - createrole
        - createdb

postgresql:
  listen: 0.0.0.0:5432
  connect_address: node1:5432
  data_dir: "/var/lib/postgresql/12/main"
  bin_dir: "/usr/lib/postgresql/12/bin"
#  config_dir:
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: replicator
      password: vagrant
    superuser:
      username: postgres
      password: vagrant
  parameters:
    unix_socket_directories: '/var/run/postgresql'

watchdog:
  mode: required # Allowed values: off, automatic, required
  device: /dev/watchdog
  safety_margin: 5

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false

With the configuration file in place, and now that we already have the etcd cluster up, all that is required is to restart the Patroni service:

sudo systemctl restart patroni

When Patroni starts, it will take care of initializing PostgreSQL (because the service is not currently running and the data directory is empty) following the directives in the bootstrap section of Patroni’s configuration file. If everything went according to plan, you should be able to connect to PostgreSQL using the credentials in the configuration file (the password is vagrant):

$ psql -U postgres
psql (12.6 (Ubuntu 2:12.6-2.focal))
Type "help" for help.

postgres=#

Repeat the operation to install Patroni on nodes 2 and 3: the only difference is that you will need to replace the references to node1 in the configuration file (there are four of them) with the respective node name.
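
As a purely illustrative shortcut (not something we did ourselves), you could derive the configuration for the other nodes from node1’s file with a simple substitution, since the node name is the only thing that changes:

# Hypothetical helper: generate node2's Patroni config from node1's
sudo sed 's/node1/node2/g' /etc/patroni/config.yml > /tmp/config-node2.yml
# copy the result to node2 as /etc/patroni/config.yml, then repeat for node3

Review the generated file before using it: the scope, credentials, and bootstrap sections must remain identical across the three nodes.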

You can also check the state of the Patroni cluster we just created with:

$ sudo patronictl -c /etc/patroni/config.yml list
+----------+--------+-------+--------+---------+----+-----------+
| Cluster  | Member |  Host |  Role  |  State  | TL | Lag in MB |
+----------+--------+-------+--------+---------+----+-----------+
| stampede | node1  | node1 | Leader | running |  2 |           |
| stampede | node2  | node2 |        | running |  2 |         0 |
| stampede | node3  | node3 |        | running |  2 |         0 |
+----------+--------+-------+--------+---------+----+-----------+

node1 started the Patroni cluster so it was automatically made the leader – and thus the primary/master PostgreSQL server. Nodes 2 and 3 are configured as read replicas (as the hot_standby option was enabled in Patroni’s configuration file).

HAProxy

A common implementation of high availability in a PostgreSQL environment makes use of a proxy: instead of connecting directly to the database server, the application will be connecting to the proxy instead, which will forward the request to PostgreSQL. When HAproxy is used for this, it is also possible to route read requests to one or more replicas, for load balancing. However, this is not a transparent process: the application needs to be aware of this and split read-only from read-write traffic itself. With HAproxy, this is done by providing two different ports for the application to connect. We opted for the following setup:

  • Writes → 5000
  • Reads → 5001

HAproxy can be installed as an independent server (and you can have as many as you want) but it can also be installed on the application server or the database server itself – it is a light enough service. For our tests, we planned on using our own Linux workstations (which also run Ubuntu 20.04) to simulate application traffic so we installed HAproxy on them:

sudo apt-get install haproxy

With the software installed, we modified the main configuration file as follows:

$ cat /etc/haproxy/haproxy.cfg
global
    maxconn 100

defaults
    log    global
    mode    tcp
    retries 2
    timeout client 30m
    timeout connect 4s
    timeout server 30m
    timeout check 5s

listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /

listen primary
    bind *:5000
    option httpchk OPTIONS /master
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:5432 maxconn 100 check port 8008
    server node2 node2:5432 maxconn 100 check port 8008
    server node3 node3:5432 maxconn 100 check port 8008

listen standbys
    balance roundrobin
    bind *:5001
    option httpchk OPTIONS /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:5432 maxconn 100 check port 8008
    server node2 node2:5432 maxconn 100 check port 8008
    server node3 node3:5432 maxconn 100 check port 8008

Note there are two sections: primary, using port 5000, and standbys, using port 5001. All three nodes are included in both sections: that’s because they are all potential candidates to be either primary or secondary. For HAproxy to know which role each node currently has, it will send an HTTP request to port 8008 of the node: Patroni will answer. Patroni provides built-in REST API support for health-check monitoring that integrates perfectly with HAproxy for this:

$ curl -s http://node1:8008
{"state": "running", "postmaster_start_time": "2021-05-24 14:50:11.707 UTC", "role": "master", "server_version": 120006, "cluster_unlocked": false, "xlog": {"location": 25615248}, "timeline": 1, "database_system_identifier": "6965869170583425899", "patroni": {"version": "1.6.4", "scope": "stampede"}}

We configured the standbys group to balance read requests in a round-robin fashion, so each connection request (or reconnection) will alternate between the available replicas. We can test this in practice; first, let’s save the postgres user password in a file to facilitate the process:

echo "localhost:5000:postgres:postgres:vagrant" > ~/.pgpass
echo "localhost:5001:postgres:postgres:vagrant" >> ~/.pgpass
chmod 0600 ~/.pgpass

We can then execute two read-requests to verify the round-robin mechanism is working as intended:

$ psql -Upostgres -hlocalhost -p5001 -t -c "select inet_server_addr()"
 192.168.1.13

$ psql -Upostgres -hlocalhost -p5001 -t -c "select inet_server_addr()"
 192.168.1.12

as well as test the writer access:

$ psql -Upostgres -hlocalhost -p5000 -t -c "select inet_server_addr()"
 192.168.1.11

You can also check the state of HAproxy by visiting http://localhost:7000/ on your browser.

Workload

To best simulate a production environment for testing our failure scenarios, we wanted to have continuous reads and writes to the database. We could have used a benchmark tool such as Sysbench or Pgbench, but we were more interested in observing the switch of the source server upon a failure than in the load itself. Jobin wrote a simple Python script, HAtester, that is perfect for this. As was the case with HAproxy, we ran the script from our Linux workstation. Since it is a Python script, you need to have a PostgreSQL driver for Python installed to execute it:

sudo apt-get install python3-psycopg2
curl -LO https://raw.githubusercontent.com/jobinau/pgscripts/main/patroni/HAtester.py
chmod +x HAtester.py

Edit the script with the credentials to access the PostgreSQL servers (through HAproxy) if you are using different settings from ours. The only requirement for it to work is to have the target table created beforehand, so first connect to the postgres database (unless you are using a different target) in the Primary and run:

CREATE TABLE HATEST (TM TIMESTAMP);

You can then start two different sessions:

  1. One for writes:

    ./HAtester.py 5000
  2. One for reads:
    ./HAtester.py 5001

The idea is to observe what happens with database traffic when the environment experiences a failure; that is, how HAproxy will route reads and writes as Patroni adjusts the PostgreSQL cluster. You can continuously monitor Patroni from the point of view of the nodes by opening a session in each of them and running the following command:

sudo -u postgres watch patronictl -c /etc/patroni/config.yml list

To facilitate observability and better follow the changes in real-time, we used the terminal multiplexer Tmux to visualize all 5 sessions on the same screen:

  • On the left side, we have one session open for each of the 3 nodes, continuously running:

    sudo -u postgres watch patronictl -c /etc/patroni/config.yml list

    It’s better to have the Patroni view for each node independently because when you start the failure tests you will lose connection to a part of the cluster.

  • On the right side, we are executing the HAtester.py script from our workstation:
    • Sending writes through port 5000:

      ./HAtester.py 5000
    • and reads through port 5001:

      ./HAtester.py 5001

A couple of notes on the execution of the HAtester.py script:

  • Pressing Ctrl+C will break the connection, but the script will reconnect, this time to a different replica (in the case of reads), because the standbys group on HAproxy is configured with round-robin balancing.
  • When a switchover or failover takes place and the nodes are re-arranged in the cluster, you may temporarily see writes sent to a node that used to be a replica and was just promoted to primary, and reads sent to a node that used to be the primary and was demoted to secondary: that’s a limitation of the HAtester.py script, but “by design”; we favored faster reconnections and minimal checks on the node’s role for demonstration purposes. In a production application, this part ought to be implemented differently.

Testing Failure Scenarios

The fun part starts now! We leave it to you to test and play around to see what happens with the PostgreSQL cluster in practice following a failure. We leave as suggestions the tests we did in our presentation. For each failure scenario, observe how the cluster re-adjusts itself and the impact on read and write traffic.

1) Loss of Network Communication

  • Unplug the network cable from one of the nodes (or simulate this condition in your VM; one way to do that is sketched after this list):
    • First from a replica
    • Then from the primary
  • Unplug the network cable from one replica and the primary at the same time:
    • Does Patroni experience a split-brain situation?
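
If you are on VMs and cannot pull a physical cable, one rough way to simulate the same condition is to take the node’s network interface down and bring it back up afterwards. The interface name here is an assumption (adjust it to whatever your node actually uses), and run this from the machine’s console rather than over SSH, or you will lock yourself out:

# On the node whose "cable" you want to pull (assumes the interface is named eth0)
sudo ip link set eth0 down
# ...observe how the cluster reacts, then restore connectivity:
sudo ip link set eth0 up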

2) Power Outage

  • Unplug the power cable from the primary
  • Wait until the cluster is re-adjusted then plug the power cable back and start the node

3) SEGFAULT

Simulate an OOM/crash by killing the postmaster process in one of the nodes with kill -9.
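
A rough sketch of how you might do that, assuming the same data directory used earlier in this post (the first line of postmaster.pid holds the postmaster’s PID):

# Run on the node you want to "crash"
sudo kill -9 $(sudo head -1 /var/lib/postgresql/12/main/postmaster.pid)

Then watch the patronictl output: Patroni should notice that PostgreSQL is gone; observe whether it simply restarts it locally or a failover takes place.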

4) Killing Patroni

Remember that Patroni is managing PostgreSQL. What happens if the Patroni process (and not PostgreSQL) is killed?
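
A minimal way to try this, again just a sketch: send SIGKILL to the main Patroni process only, since PostgreSQL itself runs as a child of Patroni here and we want to leave it untouched:

# Kill only the Patroni agent, not the PostgreSQL processes it manages
sudo systemctl kill --kill-who=main --signal=SIGKILL patroni

Then observe whether the rest of the cluster reacts and whether the watchdog ends up resetting the node.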

5) CPU Saturation

Simulate CPU saturation with a benchmark tool such as Sysbench, for example:

sysbench cpu --threads=10 --time=0 run

This one is a bit tricky, as the reads and writes are each single-threaded operations. You may need to decrease the priority of the HAtester.py processes with renice, and possibly increase that of Sysbench’s.
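
For reference, here is a rough sketch of how that priority juggling might look; the process-matching patterns are assumptions, so adjust them to whatever ps shows on your workstation and on the node under test:

# On the workstation: lower the priority of the HAtester.py sessions (a higher nice value means lower priority)
pgrep -f HAtester.py | xargs -r sudo renice -n 10 -p
# On the node running the benchmark: raise sysbench's priority (negative nice values require root)
pgrep -x sysbench | xargs -r sudo renice -n -5 -p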

6) Manual Switchover

Patroni facilitates changes in the PostgreSQL hierarchy. Switchover operations can even be scheduled; the command below is interactive and will prompt you with options:

sudo -u postgres patronictl -c /etc/patroni/config.yml switchover

Alternatively, you can be specific and tell Patroni exactly what to do:

sudo -u postgres patronictl -c /etc/patroni/config.yml switchover --master node1 --candidate node2 --force


We hope you had fun with this hands-on lab! If you have questions or comments, leave us a note in the comments section below!

May
12
2021
--

New Survey Shows Enterprises Increasing Their Reliance on Open Source Software

changing face of open source

A survey of 200 IT decision-makers from medium and large enterprises was conducted in Q1 2021 by Vanson Bourne, sponsored by Percona, in advance of Percona Live ONLINE 2021, which started today!

Percona Live Online represents dozens of projects, communities, and tech companies, and features more than 150 expert speakers across 200 sessions. There’s still time to register and attend.

Register and Attend

In his keynote address on May 12 at Noon Eastern, Peter Zaitsev, CEO of Percona, addressed the findings of the survey in more detail, along with the impact of licensing changes and the importance of keeping open source truly open. 

The survey focused on the business perspective of open source software. 25% of respondents were from medium-sized enterprises (500-999 employees) and 75% were from large enterprises (over 1,000 employees). 

The Benefit of Open Source

Respondents came from a cross-section of industries and had knowledge of open source software. Their answers showed that enterprises have a deep appreciation for the value of open source software.

100% of information technology (IT) decision-makers said that “using open source provides benefits for their organization”.

78% of respondents reported that their use of open source software had increased over the disruptive past 12 months.

Cloud Adoption

Large enterprise respondents were most likely to have moved databases and applications to cloud services. Just 13% of large enterprises continue to have all their databases and applications running at their on-premises data center, compared with 29% of medium-size enterprises.

The transition to the cloud was accelerated by the worldwide pandemic and demand for flexible, fast, and reliable technology. However, it’s likely the increase in demand led to an increase in costs for many businesses. 68% of respondents said that cloud infrastructure has become more expensive in the past year.

The survey asked how public cloud providers can contribute back to open source. 59% said by providing better security, 48% said by encouraging open source collaboration, 43% said by improving existing code quality and 43% said by enabling open source to run on their cloud.

Licensing Changes

Nearly half of survey respondents indicated concerns over changing open source licenses, such as the Business Source License (BSL) and Server Side Public License (SSPL). They indicated that it will increase costs (44%); it encourages lock-in (37%); discourages engagement from the open source community (34%); and discourages growth in the open source market (26%).

Download the Full Survey Report

May
12
2021
--

Percona Live ONLINE: Percona Previews Open Source Database as a Service

percona open source dbaas

Percona Live ONLINE 2021 starts today!

Representing dozens of projects, communities, and tech companies, and featuring more than 150 expert speakers across 200 sessions, there’s still time to register and attend. 

Register and Attend

Percona’s latest product announcements focus on the preview of Percona’s open source DBaaS and new features for the Percona Kubernetes Operators.

During Percona Live ONLINE 2021, our experts will be discussing the preview of Percona’s 100% open source Database as a Service (DBaaS), which eliminates vendor lock-in and enables users to maintain control of their data. 

As an alternative to public cloud and large enterprise database vendor DBaaS offerings, this on-demand self-service option provides users with a convenient and simple way to deploy databases quickly. Using Percona Kubernetes Operators means it is possible to configure a database once and deploy it anywhere.

“The future of databases is in the cloud, an approach confirmed by the market and validated by our own customer research,” said Peter Zaitsev, co-founder and CEO of Percona. “We’re taking this one step further by enabling open source databases to be deployed wherever the customer wants them to run – on-premises, in the cloud, or in a hybrid environment. Companies want the flexibility of DBaaS, but they don’t want to be tied to their original decision for all time – as they grow or circumstances change, they want to be able to migrate without lock-in or huge additional expenses.”

The DBaaS supports Percona open source versions of MySQL, MongoDB, and PostgreSQL. 

Critical database management operations such as backup, recovery, and patching will be managed through the Percona Monitoring and Management (PMM) component of Percona DBaaS. 

PMM is completely open source and provides enhanced automation with monitoring and alerting to find, eliminate, and prevent outages, security issues, and slowdowns in performance across MySQL, MongoDB, PostgreSQL, and MariaDB databases.

Customer trials of Percona DBaaS will start this summer. Businesses interested in being part of this trial can register here.

Easy Deployment and Management with Kubernetes Operators from Percona

The Kubernetes Operator for Percona Distribution of PostgreSQL is now available in technical preview, making it easier than ever to deploy. This Operator streamlines the process of creating a database, so developers can gain access to resources faster, and it also simplifies ongoing lifecycle management.

There are also new capabilities available in the Kubernetes Operator for Percona Server for MongoDB, which support enterprise mission-critical deployments with features for advanced data recovery. It now includes support for multiple shards, which provides horizontal database scaling, and allows for distribution of data across multiple MongoDB Pods. This is useful for large data sets when a single machine’s overall processing speed or storage capacity is insufficient. 

This Operator also allows Point-in-Time Recovery, which enables users to roll back the cluster to a specific transaction and time, or even skip a specific transaction. This is important when data needs to be restored to reverse a problem transaction or ransomware attack.

The new Percona product announcements will be discussed in more detail at our annual Percona Live ONLINE Open Source Database Conference 2021 starting today.

We hope you’ll join us! Register today to attend Percona Live ONLINE for free.

Register and Attend

May
10
2021
--

War Stories and Learning From Others – Percona Live

percona live

Lessons Learned – Learning from those who blazed the trail!

Another cool thing I like about attending conferences is learning how other companies and people overcame problems, how they run their systems, and about problems I may run into in the future. Secretly, I also want to validate how I have done things and make sure I did not miss anything. Percona Live has a huge group of interesting speakers, and users, customers, and companies are sharing tons of interesting war stories and how-to’s, and explaining things they are very proud of! So what is the HOSS looking forward to hearing about?

SCALE!

Who is going to be sharing their tales of scale? 

Natarajan Chidhambharam & Miklos Szel will be sharing “Edmodo’s Pandemic: A Tale of 25x Growth in Three Weeks”. Edmodo provides online educational services and software, so when COVID hit, their entire platform’s traffic skyrocketed within days. This is a tale of massive growth, and of dealing with growth you could never imagine.

Another talk offering insight into unexpected growth comes from Art van Scheppingen at MessageBird, in his talk entitled “How to Cope With (Unexpected) Millupling of Your Workload?”.

While the pandemic caused load spikes for many, high load also occurs with regular frequency at many companies. Javi Santana from TinyBird is going to share his story of growth and scale in the talk “How We Processed 12 Trillion Rows During Black Friday”.

But one-time events, or even recurring events, are nothing compared to a constant flow of transactions. To learn how to handle that, you may want to see how a company like Adobe handles this workload. Adobe’s Yeshwanth Vijayakumar will be giving us the details in his talk “How Adobe Does Millions of Records Per Second Using Apache Spark”.

The team over at Venmo/PayPal (Kushal Shah, Neeraj Wadhwani, Van Pham, & Tianshi Wang) will also be sharing the secrets of how they scale in their talk “Scaling Venmo’s Payments”. While not all of us run at this level of scale, understanding the issues they faced and how they resolved them can help many as they build what’s next.

Migrations! There and Back Again

Who doesn’t love a good story about moving to a new town, a new state, or, in the case of companies, a new cloud provider? If you are looking for information and stories from others’ cloud migration adventures, look no further than the following:

BlaBlaCar’s Maxime Fouilleul is going to be delivering “Organize the Migration of a Hundred Database Clusters to the Cloud”.  Migrating a few databases is challenging, but hundreds or thousands of them is daunting.  

Sometimes the last part of a migration is the most challenging. Box’s Jordan Moldow is going to share the challenges Box faced in finishing one large migration in his talk “The Last Mile: Delivering the Last 10% of a Four-Year Migration”.

Groupon recently moved all of their systems to AWS. Groupon’s Mani Subramanian will be sharing the ins and outs of this journey in his talk “MySQL & PostgreSQL Migration to AWS at Groupon”.

Database Operations Best Practices From the Experts

Finding better ways to do our day-to-day jobs helps us save time and become more efficient. It’s a good thing so many great companies are willing to share what they have found works for them.

Upgrades, whether in the cloud or not, are sometimes challenging. Ashwin Nellore & Kushal Shah are going to share “Venmo’s Aurora Upgrades With Open Source Tools”, detailing how they approach and execute upgrades in an AWS environment.

Stephen Borg & Matthew Formosa (GiG) are going to talk about choosing the best tool (or in this case, database) for the job in their talk “Fun and Games: Why We Picked ClickHouse To Drive Gaming Analytics at GiG”. Gaming analytics (any analytics, really) remains a hot topic. Learning why ClickHouse fits their needs should be interesting.

When your website has millions of users and 24×7 requirements, you have to set things up to scale and survive multiple outages. Companies like LinkedIn have spent lots of money and many hours solving these scale and availability challenges. That is why listening in on Karthik Appigatla talking about “Multi-colo Async Replication at LinkedIn” is going to be very informative. Later on, Karthik is joined by Apoorv Purohit to discuss scaling LinkedIn with Vitess.

When you have large datasets and lots of individual databases, even something as simple as copying databases from one environment to the next can be a challenge. Nicolai Plum from Booking.com has some practical advice and tips in his talk “The Many Ways to Copy Your Database”.

Finally, keeping track of what’s happening on any one server when you have hundreds of them is a pain. That’s why I am excited to hear how Rappi’s Daniel Guzman Burgos & Rodrigo Cadaval did this in their environment. Their talk “Monitoring Hundreds of RDS PostgreSQL Instances with PMM: The Rappi Case” aims to provide us with practical ways to monitor those oversized database environments.

There is a lot to learn from those in the industry who are solving real problems at scale! These are just a few of the sessions and talks I am looking forward to learning from.

Not registered yet? There’s still time! Don’t miss it! 
