Aug 03, 2022

OS Platform End of Life (EOL) Announcement for Debian Linux 9

After having been in long-term support (LTS) for the past two years, Debian 9 (“Stretch”) has now reached End of Life (EOL) on 2022-06-30.

We generally align our OS platform end-of-life dates with the upstream platform vendor. We publish the platform end-of-life dates in advance on our website on the Percona Release Lifecycle Overview page.

When an OS platform has reached End of Life, we stop providing new packages, binary builds, hotfixes, or bug fixes for Percona software. Still, we will continue providing downloads of the existing packages.

Per our policies, we are announcing that this EOL went into effect for Percona software on Debian Linux 9 on 2022-07-01.

Each operating system vendor has different supported migration or upgrade paths to their next major release. Please contact us if you need assistance migrating your database to a different supported OS platform – we will be happy to assist you!

Jun 30, 2022

Database Corruption: An Overview

Though I am writing this post as a PostgreSQL DBA, anyone can read it because the concept of corruption is the same in every database.

After reading this blog post, one should understand what database corruption is and how it may occur.

As DBAs, corruption is something we never want to see in our systems; however, there are a number of systems where corruption recurs very frequently. Whenever it occurs in big databases, it becomes challenging for us to detect and repair it, as we may see no sign of it. In my 15 years of experience as a DBA, I have found corruption to be the toughest nut to crack because the apparent reasons for any corruption are not actually predictable. In other words, we may not know the actual cause of the issue; hence, it is quite difficult to get to the root cause analysis (RCA).

In this series of blogs, I am going to cover various types of corruption and methods to find and fix them.

This blog post will shed some light on the basics of database corruption.

Concept and analogy

To explain the concept simply, I will use an example from the Spanish language. There is some text in the image below.

The text above is in Spanish. Can anyone who does not understand Spanish read it? The straightforward answer is “No”.

A fair question is, “How could a person with no knowledge of Spanish read it?”. To end the curiosity, the text in the image means “Goodbye” in English.

The same thing happens with software products as well. All software is programmed to read and write data in its own pre-defined format, and it may not be able to do so with a format supported by some other software product. For example, Python code cannot be compiled, read, or executed as C, Java, or Perl.

In the case of databases, it is about the format of data being stored on a disk. When a database product, such as PostgreSQL, MySQL, or MongoDB, writes on a disk, it performs the operation by using some format.

While reading from a disk, the database product expects the same format there; any portion of data on disk that is not in an appropriate format is CORRUPTION.

To summarize, corruption is nothing but an improper format or sequence of data.

How does it happen?

As mentioned in the last section, corruption is the result of an unreadable data format. As we know, data is stored in the form of bits on a disk. In the case of integer or numeric data, the conversion is simple. But for characters, every machine is designed to convert data in the form of bytes (groups of eight bits), in such a way that every byte represents a character. There are 256 different combinations of a byte, from 00000000 (0) to 11111111 (255).

To read bytes in the form of characters, certain formats were designed, such as ASCII, EBCDIC, BCD, and so on. These are also known as encoding schemes. Of all these schemes, ASCII (American Standard Code for Information Interchange) is the most popular. In this format, every byte (all 256 combinations) is assigned a particular character.

For example:

01000001(65) – A

00101100(44) – ,

Below is the link to view all the ASCII codes.

https://www.rapidtables.com/code/text/ascii-table.html
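
To make the byte-to-character mapping concrete, here is a small illustrative snippet (Python is used here purely for demonstration; the same mapping applies in any language):

# Each byte value maps to a single character under an encoding scheme such as ASCII.
for value in (65, 44, 63):
    print(f"{value:08b} ({value}) -> {chr(value)!r}")
# 01000001 (65) -> 'A'
# 00101100 (44) -> ','
# 00111111 (63) -> '?'

# Decoding a whole sequence of bytes with the expected encoding yields readable text.
print(bytes([72, 105, 33]).decode("ascii"))   # Hi!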

Here, if any byte is stored with an unexpected sequence of bits, the machine will read a different character.

For example,

Let’s say the character A (65) is stored as 11000001 (193) instead of 01000001 (65); the machine then reads it as “Á” (not the English letter A).

Now, in these encoding schemes, some characters are human-readable and the rest are not. Another point to note is that not all software products are built to decipher all the characters. So, if a byte at any position gets changed, it is possible that the database may not be able to read the character, and hence the data. Such unreadable or non-parsable data is deemed corrupted data.

For example,

If How are you? is stored as How are you¿, the character “¿” is not available in English; hence, character sets that can only parse English may not be able to recognize that character. The database will then not be able to read that text and throws an error, marking it unreadable. Here, only one character is unrecognizable, but the whole text will be marked as corrupted data.
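
Here is a minimal sketch of that scenario in Python: the text is written as plain ASCII, one bit of the last byte is flipped, and a reader that still expects ASCII rejects the whole value.

original = "How are you?".encode("ascii")    # valid 7-bit ASCII bytes on "disk"

corrupted = bytearray(original)
corrupted[-1] ^= 0b10000000                  # flip the high bit: '?' (0x3F) becomes 0xBF

print(original.decode("ascii"))              # How are you?

try:
    bytes(corrupted).decode("ascii")         # the reader still expects plain ASCII...
except UnicodeDecodeError as err:
    print("corrupted value:", err)           # ...and rejects the whole value, not just one byte

# Interpreted under a different encoding (Latin-1), the damaged byte shows up as '¿'.
print(bytes(corrupted).decode("latin-1"))    # How are you¿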

Causes

Corruption is truly mysterious because we may never know its actual cause. As mentioned above, corruption comes down to changed bits or bytes, but it is very difficult to determine which process or thread caused that change. This is why corruption test cases are usually not reproducible. The only thing we can do is explore the possible causes.

Some of the possible causes are as below.

Hardware issue:

When RAID arrays are not designed properly or their controllers are faulty, data may not be written correctly to the disks.  With non-RAID disks, the mechanical devices must work properly as well, because faulty disks can also store bits incorrectly.

Corruption may also occur under heavy I/O.

Bug in the operating system:

On occasion, due to a buggy kernel or code, the OS encodes data wrongly, and that data is later written to the disk. The OS may also produce corrupted data when it cannot handle the flow of I/O efficiently.

Bug in the database product:

In many cases, the database product itself stores wrong data on the disk or, due to inefficient algorithms, saves data in the wrong format.

Types of corruption

Every database comprises different types of files, such as data files, WAL files, commit logs, and so on. These files contain data for various database objects, e.g., tables, indexes, materialized views, WAL records, etc. When these database files become corrupted, some queries retrieve wrong data or return errors, or some operations (e.g., recovery, replay) may not work as expected. As a DBA, one needs to identify which particular objects are affected by that corruption. For ease of understanding, corruption is categorized into different types; some of them are described below.

Index corruption:

In general, an index keeps a pointer (or pointers) for a particular value (or a value set) in a column. Whenever an index is asked to return pointers (ctid in PostgreSQL, rowid in Oracle), it fetches those pointers and returns them to the requestor.

In the case of corruption, a wrong pointer to a value gets saved on the disk due to faulty bits on the disk. Consequently, the index returns a wrong record.
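
As a concrete illustration, PostgreSQL ships the amcheck extension, which can detect many forms of B-tree index corruption. The following is only a minimal sketch (the connection string and index name are hypothetical; detection and repair will be covered properly later in this series):

import psycopg

with psycopg.connect("dbname=appdb") as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS amcheck")
    try:
        # Verifies structural invariants of the B-tree; raises an error on corruption.
        conn.execute("SELECT bt_index_check('orders_pkey')")
        print("index structure looks consistent")
    except psycopg.DatabaseError as err:
        print("possible index corruption:", err)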

Data corruption:

When data/TOAST pages store faulty data (in terms of format), that data may become unrecognizable when it is read back. Hence, the reads get errored out by the database.

WAL corruption:

WAL/Redo/Transaction log files store data in a particular format, and while reading them, WAL entries are parsed and applied. In the case of WAL corruption, WAL entries are not parsable, which affects the WAL replay operation.

Page header corruption:

The lowest unit of storage in a database is the block/page, which stores the actual records. To maintain data integrity, some information is stored in a separate section called the page header. Any improper information in a page header is page header corruption, which affects data integrity.

Summary

Corruption results from changes in bits/bytes while storing data on the disk. When a database product (e.g. MySQL, PostgreSQL) does not get the data in an expected format, it is corruption.

The data in the database may get corrupted due to various reasons, such as faulty hardware and buggy OS/kernel/database products. Owing to this, the data is accidentally changed before or while it is stored on the disk. Once data is stored incorrectly on disk, and hence in files, it affects various functions of the software product; to make it easier to understand which particular areas are affected, corruption is classified into various types, such as index corruption, data corruption, and so on.

This is the first post in the series of database corruption blogs; other blogs will be posted soon. Stay tuned!

May 31, 2022

Percona Platform First Steps


I am a big fan of Percona Monitoring and Management (PMM) and am happy to report that Percona Platform is just as easy to set up and offers a lot of value. Percona Platform reached GA status recently, and I think you will find it a handy addition to your infrastructure.

What is Percona Platform?

Percona Platform brings together enterprise-level distributions of MySQL, PostgreSQL, and MongoDB plus it includes a range of open source tools for data backup, availability, and management. The core is PMM which provides database management, monitoring, and automated insights, making it easier to manage database deployments. The number of sites with more than 100 separate databases has grown rapidly in the past few years.  Being able to have command and control of that many instances from a CLI has become impossible. Businesses need to move faster in increasingly complex environments which puts ever-increasing pressure on database administrators, developers, and everyone involved in database operations. The spiraling levels of demand make it harder to support, manage, and correct issues in database environments.

What Percona Platform provides is a unified view of the health of your entire database environment so you can quickly visually identify and remediate issues.  Developers can now self-service many of their database demands quickly and efficiently, provisioning and managing databases themselves across test and production instances. So you spend fewer resources and less time on the management of database complexity.

The two keys to Percona Platform are Query Analytics (QAN), which provides granular insights into database behavior and helps uncover new database performance patterns for in-depth troubleshooting and performance optimization, and Percona Advisors, which are automated insights, created by Percona Experts to identify important issues for remediation such as security vulnerabilities, misconfigurations, performance problems, policy compliance, and database design issues. Automated insights within Percona Monitoring and Management ensure your database performs at its best. The Advisors check for replication inconsistencies, durability issues, password-less users, insecure connections, unstable OS configuration, and search for available performance improvements among other functions. 

Percona Platform is a point of control for your database infrastructure, and PMM becomes even more intelligent when connected to it. By connecting PMM with Percona Platform, you get more advanced Advisors, centralized user account management, access to support tickets, private Database as a Service, Percona expertise with the fastest SLAs, and more.

Percona Platform architecture

So How Do I Install Percona Platform?

The first step is to install PMM by following the Quick Start Guide. You need version 2.27.0 or later.

Second, you will need a Percona Account.

Third, you will need to connect that account to PMM.

I will assume that you already have PMM installed.  Did I mention that PMM is free, open source software?

Percona Platform account signup

The signup form allows you to create a new account or use an existing account.

Now you can create a name for your organization.

Form for creating your organization and connecting PMM

After creating your username and password, create your organization.

Now log in to your PMM dashboard and select Settings / Percona Platform.  You will need to set your ‘Public Address’; the browser can populate the value for you if need be.

The PMM Server ID is automatically generated by PMM.  You will need to provide a name for your server, and you will need a second browser window to log in to Percona Platform to get the Percona Platform Access Token (this token has a thirty-minute lifetime, so be quick or regenerate another token).

Go back into PMM, paste the Access Token into the Percona Platform Access Token field, and click Connect.

On the Percona Platform page, you will see your PMM instances. Congratulations, you are using Percona Platform!

Advisor Checks

All checks are hosted on Percona Platform. PMM Server automatically downloads them from Percona Platform when the Advisors and Telemetry options are enabled in PMM under Configuration > Settings > Advanced Settings. Both options are enabled by default.

Depending on the entitlements available for your Percona Account, the set of advisor checks that PMM can download from Percona Platform differs in terms of complexity and functionality.

If your PMM instance is not connected to Percona Platform, PMM can only download the basic set of Anonymous advisor checks. As soon as you connect your PMM instance to Percona Platform, it gains access to additional checks, available only for Registered PMM instances.

If you are a Percona customer with a Percona Customer Portal account, you also get access to Paid checks, which offer more advanced database health information. A list is provided below.

Check Name Description Tier
MongoDB Active vs Available Connections Checks the ratio between Active and Available connections. Registered, Paid
MongoDB Authentication Warns if MongoDB authentication is disabled. Anonymous, Registered, Paid
MongoDB Security AuthMech Warns if MongoDB is not using the default SHA-256 hashing as SCRAM authentication method. Paid
MongoDB IP Bindings Warns if MongoDB network binding is not set as recommended. Anonymous, Registered, Paid
MongoDB CVE Version Shows an error if MongoDB or Percona Server for MongoDB version is not the latest one with CVE fixes. Anonymous, Registered, Paid
MongoDB Journal Check Warns if journal is disabled. Registered, Paid
MongoDB Localhost Authentication Bypass is Enabled Warns if MongoDB localhost bypass is enabled. Anonymous, Registered, Paid
MongoDB Non-Default Log Level Warns if MongoDB is not using the default log level. Paid
MongoDB Profiling Level Warns when the MongoDB profile level is set to collect data for all operations. Registered, Paid
MongoDB Read Tickets Warns if MongoDB is using more than 128 read tickets. Paid
MongoDB Replica Set Topology Warns if the Replica Set cluster has less than three members. Registered, Paid
MongoDB Version Warns if MongoDB or Percona Server for MongoDB version is not the latest one. Anonymous, Registered, Paid
MongoDB Write Tickets Warns if MongoDB network is using more than 128 write tickets. Paid
Check if Binaries are 32-bits Notifies if version_compatible_machine equals i686. Anonymous, Registered, Paid
MySQL Automatic User Expired Password Checks for automatic user password expiration settings. Registered, Paid
MySQL InnoDB flush method and File Format check Checks the following settings: innodb_file_format, innodb_file_format_max, innodb_flush_method, and innodb_data_file_path. Registered, Paid
MySQL Checks based on values of MySQL configuration variables Checks the following settings: innodb_file_format, innodb_file_format_max, innodb_flush_method, and innodb_data_file_path. Paid
MySQL Binary Logs checks, Local infile and SQL Mode checks Warns about non-optimal settings for Binary Log, Local Infile and SQL mode. Registered, Paid
MySQL Configuration Check Warns if parameters are not following Percona best practices, for infile, replication threads, and replica checksum. Paid
MySQL Users With Granted Public Networks Access Notifies about MySQL accounts allowed to be connected from public networks. Registered, Paid
MySQL User Check Runs a high-level check on user setup Registered, Paid
MySQL Advanced User Check Runs a detailed check on user setup Paid
MySQL Security Check Runs a detailed check on user setup Paid
MySQL Test Database This check returns a notice if there is a database with name ‘test’ or ‘test_%’. Registered, Paid
MySQL Version Warns if MySQL, Percona Server for MySQL, or MariaDB version is not the latest one. Anonymous, Registered, Paid
PostgreSQL Archiver is Failing Verifies if the archiver has failed. Paid
PostgreSQL Cache Hit Ratio Checks database hit ratio and complains when this is too low. Paid
PostgreSQL Configuration Change Requires Restart/Reload Warns when a configuration was changed and requires a server restart/reload Registered, Paid
PostgreSQL fsync is Disabled Shows an error if the fsync configuration is disabled, as this can result in unrecoverable data corruption. Anonymous, Registered, Paid
PostgreSQL Autovacuum Logging Is Disabled Notifies if autovacuum logging is disabled. Paid
PostgreSQL Checkpoints Logging Is Disabled Notifies if the log_checkpoints configuration option is not enabled. Paid
PostgreSQL Max_connections is too high Notifies if the max_connections setting is set above 300. Anonymous, Registered, Paid
PostgreSQL Stale Replication Slot Warns for stale replication slots since these can lead to WAL file accumulation and DB server outage. Paid
PostgreSQL Super Role Notifies if there are users with superuser privileges. Anonymous, Registered, Paid
PostgreSQL Version Check Warns if the PostgreSQL minor or major versions are not the latest, and shows an error if the major version is 9.4 or older. Anonymous, Registered, Paid

More Advisors will be added, and Percona always welcomes contributions. Check out Percona Platform today!

 

May 24, 2022

A Quick Peek at the PostgreSQL 15 Beta 1


PostgreSQL 15 Beta 1 was announced on May 15th, and to keep with the spirit of past peeks at MySQL and MariaDB products, this is a cursory glance at the release notes to see what is coming.  This is a preview of the features that will be in the Generally Available version when it is released this fall.  Beta 1 is not for production use, but the developers encourage you to test this code to find bugs or other problems before the GA release.  My comments are in italics, and I tend to highlight the stuff I find of note.

The TL;DR

If you are used to the once-a-quarter releases of MySQL or  MariaDB, this is an overwhelming amount of new goodies being delivered. PostgreSQL releases less frequently but packs plenty of solid engineering into each release.  The claim of being the most advanced open source relational database is well-founded.  My one-word summary is ‘wow’.  It is very evident that the code is delivered when it is ready and not due to a deadline.

For Developers

SQL gets MERGE from the SQL standard for conditionally performing writes (INSERT, UPDATE, or DELETE) when synchronizing two tables.  This could be emulated in previous releases by using INSERT .. ON CONFLICT, but MERGE is more batch-oriented.  This is also known as an upsert.  The “how do I combine two tables based on the values” type of question on Reddit and Stack Overflow can now be pointed here.
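
For illustration, here is a hedged sketch of what a MERGE might look like (the table and column names are hypothetical; psycopg is used only so the example is runnable from Python):

import psycopg

MERGE_SQL = """
MERGE INTO customer_account ca
USING recent_transactions t ON t.customer_id = ca.customer_id
WHEN MATCHED THEN
    UPDATE SET balance = ca.balance + t.amount
WHEN NOT MATCHED THEN
    INSERT (customer_id, balance) VALUES (t.customer_id, t.amount)
"""

with psycopg.connect("dbname=test") as conn:
    conn.execute(MERGE_SQL)   # one statement conditionally INSERTs or UPDATEs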

More JSON support including a very impressive JSON_TABLE function to temporarily turn unstructured JSON data into a relational table. Very handy!

And more regular expression functions – regexp_count, regexp_instr, regexp_like, regexp_substr, and range_agg.  
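
A quick hedged sketch of a few of these in action (run through psycopg here; the results shown in comments are what I would expect, not verified output):

import psycopg

with psycopg.connect("dbname=test") as conn:
    row = conn.execute("""
        SELECT regexp_count('abc a1b2 c3', '[0-9]'),       -- how many digits appear
               regexp_like('abc123', '^[a-z]+[0-9]+$'),    -- boolean match test
               regexp_substr('user@example.com', '@.*$')   -- first matching substring
    """).fetchone()
    print(row)   # expected: (3, True, '@example.com')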

In prior versions, NULL values were always indexed as distinct values but this can be changed by using UNIQUE NULLS NOT DISTINCT.
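
A hedged sketch of the new syntax (hypothetical table; with NULLS NOT DISTINCT, a second NULL in the column violates the constraint):

import psycopg

with psycopg.connect("dbname=test") as conn:
    conn.execute("""
        CREATE TABLE contacts (
            id    serial PRIMARY KEY,
            email text UNIQUE NULLS NOT DISTINCT  -- at most one row may have a NULL email
        )
    """)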

Performance

Reportedly there is a big gain for sorting large datasets, a parallelized SELECT DISTINCT, faster window functions (row_number(), rank(), and count()), and a \copy bulk loader built into psql. If you use the postgres_fdw foreign data wrapper, you can now have transactions committed in parallel. 

Compression

LZ4 and Zstd compression have been added to many components.  The WAL can now take advantage of those two compression tools, plus pg_basebackup can take advantage of those two plus gzip.

Logical Replication

Row filtering and column filtering are now offered, and there is also the ability to publish all the tables in a schema instead of just the tables in the database. Yes, MySQL-ites, there is a difference.

DBAs/SREs Get Goodies Too

Logging can now be done using the jsonlog format for consumption by programs that aggregate and analyze logs. Slow checkpoints and autovacuums are automatically logged.  You can now see configuration values by using \dconfig with psql, and by default it shows you the non-default values. This will be loved by anyone who has tried to figure out why two servers behave differently and got tired of using diff.   All server-level stats are stored in shared memory, and adios to the statistics collector process.  And there is now a pager option for \watch.

Views now can execute using the privileges of the user instead of the creator of the view. Anyone bit by this type of problem is now sighing with relief.

There is now the ability to get extended statistics for a parent and all children to augment the existing ability to record parent/child stats with regular statistics separately. 

Generic periodic log messages telling you the cause of the delay are generated during slow server starts. Finally, the answer to ‘why the #$@ is it not up yet?’ can be told. 

Conclusion

PostgreSQL 15, like all its predecessors, has an impressive amount of solid engineering in each release.  For someone like me coming from less featured open source relational databases, the amount of work being delivered is staggering.  My 10,000-foot view is that an awesome amount of work to make PostgreSQL more reliable, more useful, and keep its reputation for being on the cutting edge has been handed over for us to kick the tires, rev the engine, and take it for a spin.  I am sure a lot of backend database specialists will be muttering ‘gee, I could really use that’ as they pore over the items in the release notes.  

Apr 21, 2022

Some Things to Consider Before Moving Your Database to the Cloud


Before you transition your database environment to the cloud, there are a few things you should consider first.  Some of the touted benefits of the cloud also carry with them some risks or negative impacts.  Let’s take a look at just a few of these.

First, consider whether or not you will face vendor lock-in.  Many people choose open source databases precisely to avoid this.  The interesting fact, however, is that you can actually get locked in without realizing it.  Many cloud providers have their own versions of popular database platforms such as MySQL, PostgreSQL, MongoDB, etc.  These offerings may be heavily re-engineered versions of those database systems.

While it is often easy to migrate your data into these environments, applications are often modified to adapt to unique aspects of these database platforms.  In some cases, this means utilizing specialized features that were added; other times it can even mean dealing with a lack of features through extra development.  This can occur because the cloud versions may be based upon older codebases which may not contain all of the latest feature sets of the DBMS.  This makes migrating off the cloud more challenging, as it may require changes to the code to go back.  In this sense, you may pragmatically be locked into a particular cloud database without even realizing it.

Also, consider the additional cost of time and resources if you need to re-engineer your application to work with the cloud platform.  It may not be as simple as just migrating over.  Instead, you will need to do extensive testing and perhaps rewrite code to make it all work as it should.

A common reason many migrate to the cloud is to save cost.  Some cloud providers cite projected savings of 50% or more due to not needing so much infrastructure, staff, etc.  While this is certainly possible, it is also possible that your costs could rise.  With the ease of creating and configuring new servers, it is easy to spin up more instances very quickly.  Of course, each of these instances is increasing your costs.  Without proper oversight and someone managing the spend, that monthly bill could quickly cause some sticker shock!

Storage and networking are areas that can easily increase costs in addition to sheer server instance counts.  Although storage costs are relatively cheap nowadays, think about what happens when teams set up additional test servers and leave large backups and datasets lying around.  And, of course, you have to pay networking costs as these large data sets are transferred from server to server.  The inter-connected networking of servers that used to be essentially “free” in your on-prem Data Center, is now generating costs.  It can add up quickly.

Moreover, with the “cheap” storage costs, archiving becomes less of a concern.  This is a real double-edged sword as not only do your costs increase, but your performance decreases when data sets are not archived properly.  Databases typically lose performance querying these enormous data sets and the additional time to update indexes negatively impacts performance.

Also, consider the loss of control.  In your on-prem databases, you control the data entirely.  Security is completely your responsibility.  Without a doubt, cloud providers have built their systems around security controls and this can be a huge advantage.  The thing you have to consider is that you really don’t know who is managing the systems housing your data.  You lose insight into the human aspects and this cannot be discounted.

In addition, if you have components of your application that will be either in another cloud or will remain on-prem, you must think about the effects of network latency on your application.  No longer are the various components sitting in the same Data Center.  They may be geographically dispersed across the globe.  Again, this can be a benefit, but it also carries with it a cost in performance.

You also need to consider whether you will need to retrain your staff.  Certainly, today most people are familiar with the cloud, but it is almost certain you will have some holdouts who are not sold on the change in the way you manage your servers.  Without a doubt, the way they work day-to-day will change.  Done properly, this shift can prove beneficial by allowing your DBAs to focus more on database performance and less on setting up and configuring servers, managing backups, monitoring, etc.

Moving to the cloud is all the rage nowadays and this article is certainly not meant to dissuade you from doing so.  Instead, the idea is to consider some of the ramifications before making the move to determine if it is really in your best interest and help you avoid some of the pitfalls.

Apr 19, 2022

Make Database Management Easier With Percona Platform


Database management is hard. Not only must your database platform team work to provide consistent and reliable performance, availability, and security, but they also need to meet your developers’ ever-increasing demands to move faster. This strain has led to an environment of increasing complexity, in which you and your teams are tasked with managing hundreds or thousands of siloed databases and their supporting technologies. 

Then, of course, there’s the problem of vendor lock-in. Too often, you’re forced into non-portable platforms or environments, where it’s difficult to reclaim your data, massive and oppressive annual support obligations swell, and vendors are more than happy to hold you hostage over proprietary features. It can seem there’s little way out of the spiral. 

Fortunately, there’s a better way. 

Introducing Percona Platform 


Percona is different. Since day one, we’ve stood for Keeping Open Source Open. Whatever your environment, any cloud or on-premises, and whatever “enterprise” features you need, we offer ultimate freedom and flexibility. It’s what you’d expect from a company committed to the ideals of free and open source software and why many of you have trusted us for years.

Today, we’re proud to announce that we’re bringing together many of your favorite Percona offerings into a new product called Percona Platform. 

Percona Platform combines our software and services into a subscription-based offering that gives you all the tools and expertise you need to succeed: Whether that’s our fully open, performance-optimized database servers, our Percona-curated distributions, which include best-of-breed solutions for high availability, security, and backup, or Percona Monitoring and Management (PMM).  And we’re backing it up with our world-class services, delivered however you need them — whether that’s support for self-managed installations, or providing you with fully managed database environments.

Take a look at everything included:

Percona software and services

This commitment to giving our customers greater control of availability and access to their data and saving them time on managing databases is why RedMonk wrote: 

“There has been an explosion of data tools available to enterprises, which in turn has led to rampant fragmentation and highly complex, disjointed workflows. Organizations that prize velocity are looking for ways to integrate various data tools into their broader development workflows and monitor them.  This is the opportunity that Percona Platform is built for. The combination of software and services in this space is intended to help customers better manage their database deployments.” –   Rachel Stephens, Sr. Analyst, RedMonk

But you really don’t have to look any further than our customers, who love the combination of Percona software and services. It’s why more than two-thirds of our Support customers run Percona Software, and 58% of our Managed Services customers choose to run Percona databases in addition to their Percona Software.

Here’s what some of them had to say: 

“Percona’s drop-in database software replacement was a big factor as migrations can be tricky! … We have more control with Percona.” – Carpages.ca

“Percona has the best engineers in the world.” –  Appsuite 

“Percona have provided us with the best solutions and identified opportunities for improvement in many areas of MySQL, Postgres, and MongoDB.” – Solera

Percona Platform is designed based on decades of experience from Percona experts, and its production-ready database software is battle-tested in mission-critical environments with the most demanding scale and performance needs.

Discover how the Percona Platform can make database management easier for you. 

 

Mar 04, 2022

DBaaS in Under 20 Min!


My father always used the expression, “sometimes the juice ain’t worth the squeeze”. What he meant by it was, “what you put into something better be worth what you get out of it”. Why do I bring this up?  Well…we’ve got this GREAT feature for Percona Monitoring and Management (PMM) we’ve been banging away on: PMM DBaaS (Database as a Service).  Using it for only 30 minutes, you can see it has the potential to change the way teams think about providing database services while controlling cost and minimizing complexity.  But it’s a major pain in the ass to get all set up to first realize that value…and we want to change that!

TLDR: YES! I’ve been wanting to try out DBaaS, but have no desire to become a Kubernetes expert just to see it! Skip to the good stuff! 

Quick history.  Our DBaaS (Database as a Service) offering is designed to be resilient and performant…after all, we’re kind of known for being able to beat databases into high-performance submission.  So when considering the backends to help us deliver performance, scalability, reliability, and more, we settled on Kubernetes as the starting point thanks to its scalability, resiliency, and orchestration capabilities out of the box!  We released a preview release about a year ago and have been adding features and functionality to it ever since. 

Getting Past Setup Kubernetes

I’m lucky enough to get to talk to all kinds of users that are begging for a solution with the flexibility of your public cloud DBaaS but without racking up tens of thousands of dollars of bills a month, or that need to maintain tight control of their data, or who have moved a ton of workload to the cloud and have racks of servers just sitting there.  I tell them about what we’ve built and encourage them to try it out. All of them get excited to hear what it can do and are eager to give it a try!  So I give them some time and follow up a few weeks later…nothing.  I encourage them to make the time, follow up a few weeks later…nothing?  Challenge them as to why not when they admit they’re losing precious cycles on silly operations that users should just be able to do on their own and the number one response is “Kubernetes is too confusing for me and I could never get past Step 1: Setup Kubernetes”.  Not. Good!  

I’ve used our DBaaS on numerous occasions…mostly just on my laptop with minikube.  There’s a drawback with minikube; you must have a powerhouse of a machine to be able to use DBaaS and PMM on the same machine to play around with it; not to mention weights and chains to keep your laptop from flying away when the fans go nuts!  The best way to poke around DBaaS is with some cheap public cloud infrastructure!  So I figured I’d give it a try…our docs show what looks like “three easy steps”, but they fail to mention the prerequisite 20 steps if you don’t already have eksctl and other tools installed/configured. It was more work than I budgeted time for, but I decided to push through, determined to get an EKS cluster up and running!  I threw in the towel after about five hours one Saturday…defeated.  It wasn’t just getting the cluster up, it was all the hoops and tricks and rules and roles and permissions needed to do anything with that cluster.  That’s five hours of squeeze and still no juice!

So I did what all smart engineers do…found a smarter engineer!  Sergey and I decided there was a real opportunity to make DBaaS available to a wider range of users…those who were not AWS Level 8 Samurais with PhDs in Kubernetes and the goal was simple: “Be able to use PMM’s DBaaS in 10 minutes or less…starting from nothing!”  We have not quite hit the 10-minute mark, but we DID hit the 18-minute mark…and 16 of those 18 minutes are spent watching paint dry as the tiny robots of CloudFormation get you a suitable cluster up and running.  But when it’s done, there’s no messing with IAM roles or Load Balancers or VPCs…just copy/paste, and use!

Wanna Try DBaaS?

You’re going to need your AWS Access Key ID and your AWS Secret Access Key for a root user…so get those before you start the timer (here is a handy guide to getting them if you don’t already have them safely stored somewhere).  You will also need a Linux system to set up your PMM server on and make the needed calls to get your K8s cluster up and running (this has only been tested on CentOS and Ubuntu).  

As any user with sudo, run: 

curl -fsSL https://raw.githubusercontent.com/shoffman-percona/easyK8s/main/easyK8s-aws.sh | bash -s -- <AWS_ACCESS_KEY_ID> <AWS_SECRET_ACCESS_KEY>

You can optionally add an AWS region at the end of that if you want something other than us-east-2 (default).

While this is running (~16 min), you can go right to the PMM installation.  In a new tab/terminal window, run the following (as a user with sudo privileges):

curl -fsSL https://www.percona.com/get/pmm | /bin/bash

When the script is done, the CLI tool will print a list of IP-based URLs you can put in a browser to access the PMM UI.  Copy/paste one into your favorite browser.  You may receive a security warning, there are instructions in the script output on how to bypass if you don’t get a “proceed anyway” option in your browser.    

Log in to PMM’s UI, the default username/password is admin/admin and you’ll be prompted to change the password.  

To turn on DBaaS you’ll need to click the gear icon, followed by “Settings”.  On the PMM settings page, click on “Advanced Settings”, scroll down to the “Technical Preview features” section, and toggle DBaaS on.  While you’re here, fill in the Public Address using the “Get from Browser” button.  This makes automatic monitoring that much easier later. Click “Apply Changes” and you’ll see the screen refresh and a new database icon will appear. Click it to get to the DBaaS main page…but you’ll most likely be holding here as the infrastructure is probably still setting up.  Take advantage of the opportunity to stand up, stretch your legs, maybe grab a drink!

Once the cluster setup is completed, you can copy/paste from the ####BEGIN KUBECONFIG#### comment to the ####END KUBECONFIG#### comment. Switch over to the PMM DBaaS UI; on the Kubernetes Cluster tab, click “Register New Kubernetes Cluster”.  Name your new cluster and paste the config in the bottom window…it’ll take a second, and your PMM server will install both the Percona XtraDB Cluster and Percona Server for MongoDB operators and enable the DB Cluster tab where you can create and size DBs of your choosing!  

That’s it!  If all the complicated setup has held you back from taking DBaaS for a test drive, hopefully this will give you “more juice for your squeeze”!  We’d love to hear feedback on what we’ve built so far, so feel free to leave a comment here or offer an improvement in our Jira instance under the PMM project.  Our main objective is to take the complication out of getting a database up and running for your application development process and to let you create MySQL and MongoDB databases in one place (PostgreSQL coming soon).  When you’re done playing, you can unregister the Kubernetes cluster from PMM, then log in to your AWS account and delete both stacks (eksctl-pmmDBaaS-nodes-XXX and eksctl-pmmDBaaS-cluster) in the CloudFormation app for the region you chose (or us-east-2 if you left the default).  

Jan 31, 2022

How PostgreSQL Pipeline Mode Works


I’d like to introduce you to a very cool feature introduced in PostgreSQL: Pipeline Mode.

So just what exactly is Pipeline Mode? Pipeline Mode allows applications to send a query without having to read the result of the previously sent query. In short, it provides a significant performance boost by allowing multiple queries and results to be sent and received in a single network transaction.

As with all good ideas, there is precedent: one can emulate such behavior with a little application code wizardry. Alternatively known as “Batch Mode”, running asynchronous communications between a client and its server has been around for some time. There are a number of existing solutions batching multiple queries in an asynchronous fashion. For example, PgJDBC has supported batch mode for many years using the standard JDBC batch interface. And of course, there’s the old reliable standby dblink.

What distinguishes Pipeline Mode is that it provides an out-of-the-box solution greatly reducing the application code’s complexity handling the client-server session.

Traditional BATCH MODE Operations


 

Pipeline Mode


Although introduced in PostgreSQL 14, pipeline mode works against any currently supported version of postgres, as the enhancement is in libpq, which is used by the client and not the server itself!

And now for the bad news, of a sort; leveraging Pipeline Mode requires using “C” or an equivalent programming language capable of interfacing directly with LIBPQ. Unfortunately, there’s not too much out there yet in the way of ODBC development offering the requisite hooks taking advantage of this enhanced feature. Therefore, one is required to design and program the client-application session in the said programming language.

HINT: This is a great way for somebody to make a name for themselves and create a convenient interface to the LIBPQ Pipeline Mode.

How It Works

Now that I’ve issued the requisite caveat, let’s talk about how this mechanism works.

Keeping things simple

  1. The client first makes a connection to the postgres server.
  2. The client must then switch the connection to pipeline mode.
  3. Once in pipeline mode, SQL statements are sent to the server.
  4. Upon arrival at the server, the statements are immediately executed and the results are sent back to the client, i.e., client/server acknowledgments are not required.
  5. Because each SQL statement is sent sequentially, the application logic can either use a state machine or take advantage of what is obviously a FIFO queue in order to process the results.
  6. Once all asynchronous statements have been executed and returned, the client application explicitly terminates the pipeline mode and returns the connection to its default setting.
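
The write-up focuses on C and libpq, but the same client-side flow can be illustrated from Python, since psycopg 3 (3.1 and later) wraps libpq and exposes its pipeline mode. A minimal hedged sketch, with a hypothetical connection string and table, and exact API details treated as assumptions:

import psycopg

with psycopg.connect("dbname=test") as conn:          # 1. connect to the server
    with conn.cursor() as cur, conn.pipeline():       # 2. switch into pipeline mode
        # 3./4. statements are queued and sent without waiting on individual results
        cur.execute("INSERT INTO metrics (v) VALUES (%s)", (1,))
        cur.execute("INSERT INTO metrics (v) VALUES (%s)", (2,))
        cur.execute("SELECT count(*) FROM metrics")
    # 6. leaving the block syncs the pipeline and restores the default connection mode
    print(cur.fetchone())                              # 5. results are consumed in order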

Since each SQL statement is sent and executed independently, it is up to the client logic to make sense of the results. Sending SQL statements and pulling out results that have no relation with each other is one thing, but life gets more complicated when working with logical outcomes that have some level of interdependence.

It is possible to bundle asynchronous SQL statements as a single transaction. But as with all transactions, failure of any one of these asynchronously sent SQL statements will result in a rollback for all the SQL statements.

Of course, the API does provide error handling in the case of pipeline failures. In the case of a FATAL condition, when the pipeline itself fails, the client connection is informed of the error thus flagging the remaining queued operations as lost. Thereafter normal processing is resumed as if the pipeline was explicitly closed by the client, and the client connection remains active.

Getting Into The, UGH, Details …

For the C programmer at heart, here’s a couple of references that I can share with you:

Caveat

  • Pipeline Mode is designed expressly for asynchronous mode. Synchronous mode is therefore not possible, which kinda defeats the purpose of pipeline mode.
  • One can only send a single SQL command at a time i.e. multiple SQL commands are disallowed.
  • COPY is disallowed.
  • In the case of sending a transaction COMMIT: The client cannot assume the transaction is committed until it receives the corresponding result.
  • Leveraging Pipeline Mode requires programming in either C or a language that can access the libpq API.

Remember to check the postgres documentation, which has more to say on this.

Interesting References

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.

Download Percona Distribution for PostgreSQL Today!

Jan 14, 2022

DBaaS and the Enterprise


Install a database server. Give the application team an endpoint. Set up backups and monitor in perpetuity. This is a pattern I hear about regularly from DBAs with most of my enterprise clients. Rarely do they get to troubleshoot or work with application teams to tune queries or design schemas. This is what triggers the interest in a DBaaS platform from database administrators.

What is DBaaS?

DBaaS stands for “Database as a Service”. When this acronym is thrown out, the first thought is generally a cloud offering such as RDS. While this is a very commonly used service, a DBaaS is really just a managed database platform that offloads much of the operational burden from the DBA team. Tasks handled by the platform include:

  • Installing the database software
  • Configuring the database
  • Setting up backups
  • Managing upgrades
  • Handling failover scenarios

A common misconception is that a DBaaS is limited to the public cloud. As many enterprises already have large data centers and heavy investments in hardware, an on-premise DBaaS can also be quite appealing. Keeping the database in-house is often favored when the hardware and resources are already available. In addition, there are extra compliance and security concerns when looking at a public cloud offering.

DBaaS also represents a difference in mindset. In conventional deployments, systems and architecture are often designed in very exotic ways making automation a challenge. With a DBaaS, automation, standardization, and best practices are the priority. While this can be seen as limiting flexibility, this approach can lead to larger and more robust infrastructures that are much easier to manage and maintain.

Why is DBaaS Appealing?

From a DBA perspective (and being a former DBA myself), I always enjoyed working on more challenging issues. Mundane operations like launching servers and setting up backups make for a less-than-exciting daily work experience. When managing large fleets, these operations make up the majority of the work.

As applications grow more complex and data sets grow rapidly, it is much more interesting to work with the application teams to design and optimize the data tier. Query tuning, schema design, and workflow analysis are much more interesting (and often beneficial) when compared to the basic setup. DBAs are often skilled at quickly identifying issues and understanding design issues before they become problems.

When an enterprise adopts a DBaaS model, this can free up the DBAs to work on more complex problems. They are also able to better engage and understand the applications they are supporting. A common comment I get when discussing complex tickets with clients is: “well, I have no idea what the application is doing, but we have an issue with XYZ”. If this could be replaced with a detailed understanding from the design phase to the production deployment, these discussions would be very different.

From an application development perspective, a DBaaS is appealing because new servers can be launched much faster. Ideally, with development or production deployment options, an application team can have the resources they need ready in minutes rather than days. It greatly speeds up the development life cycle and makes developers much more self-reliant.

DBaaS Options

While this isn’t an exhaustive list, the main options when looking to move to a DBaaS are:

  • Public cloud
    • Amazon RDS, Microsoft Azure SQL, etc
  • Private/Internal cloud
    • Kubernetes (Percona DBaaS), VMWare, etc
  • Custom provisioning/operations on bare-metal

Looking at public cloud options for a DBaaS, security and compliance are generally the first concern. While they are incredibly easy to launch and generally offer some pay-as-you-go options, managing access is a major consideration.

Large enterprises with existing hardware investments often want to explore a private DBaaS. I’ve seen clients work to create their own tooling within their existing infrastructure. While this is a viable option, it can be very time-consuming and require many development cycles. Another alternative is to use an existing DBaaS solution. For example, Percona currently has a DBaaS deployment as part of Percona Monitoring and Management in technical preview. The Percona DBaaS automates PXC deployments and management tasks on Kubernetes through a user-friendly UI.

Finally, a custom deployment is just what it sounds like. I have some clients that manage fleets (1000s of servers) of bare metal servers with heavy automation and custom scripting. To the end-user, it can look just like a normal DBaaS (an endpoint with all operations hidden). On the backend, the DBA team spends significant time just supporting the infrastructure.

How Can Percona help?

Percona works to meet your business where you are. If that is supporting Percona Server for MySQL on bare metal or a fleet of RDS instances, we can help. If your organization is leveraging Kubernetes for the data tier, the Percona Private DBaaS is a great option to standardize and simplify your deployments while following best practices. We can help from the design phase through the entire life cycle. Let us know how we can help!

Jan 05, 2022

In Application and Database Design, Small Things Can Have a Big Impact


With modern application design, systems are becoming more diverse and varied, with more components than ever before. Developers are often forced to become master chefs, adding ingredients from dozens of different technologies and blending them together to create something tasty and amazing. But with so many different ingredients, it is often difficult to understand how the individual ingredients interact with each other. The more diverse the application, the more likely it is that some seemingly insignificant combination of technology may cause cascading effects.

Many people I talk to have hundreds if not thousands of different libraries, APIs, components, and services making up the systems they support. In this type of environment, it is very difficult to know what small thing could add up to something much bigger. Look at some of the more recent big cloud or application outages, they often have their root cause in something rather small.

Street Fight: Python 3.10 -vs- 3.9.7

Let me give you an example of something I ran across recently. I noticed that two different nodes running the same hardware/application code were performing drastically differently from one another. The app server was running close to 100% CPU on one server while the other was around 40%. Each had the same workload and the same database, etc. It turned out that one server was using Python 3.10.0 and the other was running 3.9.7. This, combined with the MySQL Connector/Python, led to almost a 50% reduction in database throughput (the 3.10.0 release saw a regression). However, this performance change was not seen either in my PostgreSQL testing or in testing the mysqlclient connector. It happened only when running the pure Python version of the MySQL connector. See the results below:

Python 3.10 -vs- 3.9.7

Note: This workload was using a warmed buffer pool, with the workload running for over 24 hours before the test. I cycled the application server when making the change to either the version of Python or the MySQL library. These tests are repeatable regardless of the length of the run. All data fits into memory here. I am not trying to make an authoritative statement on a specific technology, merely pointing out the complexity of layers of technology.

Looking at this purely from the user perspective, this particular benchmark simulates certain types of users.  Let’s look at the number of users who could complete their actions per second.  I will also add another pure python MySQL driver to the mix.

Note: There appears to be a significant regression in 3.10. The application server was significantly busier when using Python 3.10 and one of the pure python drivers than when running the same test in 3.9 or earlier.

The main difference between the MySQL Connector and mysqlclient is that mysqlclient uses the C libmysqlclient library. Oddly, the official MySQL Connector says it should switch between the pure Python and C versions if available, but I was not seeing that behavior (so I have to look into it). This resulted in the page load time for the app in Python 3.10.0 taking 0.05 seconds, up from 0.03 seconds in Python 3.9.7. However, the key thing I wanted to highlight is that sometimes seemingly small or insignificant changes can lead to a drastic difference in performance and stability. You could be running along fine for months or even years before something upgrades to something that you would not think would drastically impact performance.
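
If you want to verify which implementation your own application is getting, here is a small hedged sketch (connection details are hypothetical; use_pure and HAVE_CEXT are the Connector/Python switches as I understand them):

import mysql.connector

# Is the C extension (built on libmysqlclient) available in this installation?
print("C extension available:", mysql.connector.HAVE_CEXT)

conn = mysql.connector.connect(
    host="127.0.0.1",
    user="app",
    password="secret",
    database="test",
    use_pure=False,   # request the C implementation when present; True forces pure Python
)
# CMySQLConnection means the C extension is in use; MySQLConnection is the pure Python path.
print("Connection class:", type(conn).__name__)
conn.close()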

Can This Be Fixed With Better Testing?

You may think this is a poster child for testing before upgrading components, and while that is a requirement, it won’t necessarily prevent this type of issue. While technology combinations, upgrades, and releases can often have some odd side effects, oftentimes issues they introduce remain hidden and don’t manifest until some outside influence pops up. Note: These generally happen at the worst time possible, like during a significant marketing campaign, event, etc. The most common way I see nasty bugs get exposed is not through a release or in testing but often with a change in workload. Workload changes can hide or raise bottlenecks at the worst times. Let’s take a look at the above combination of Python version and different connectors with a different workload:

Here the combination of reporting and read/write workloads pushes all the app nodes and database nodes to the redline. These look fairly similar in terms of performance, but the workload is hiding the issues I mentioned above. A system pushed to the red will behave differently than it will in the real world. If you ended up testing the upgrade of Python on your app servers to 3.10.0 by pushing your systems to the max, you might see the above small regression as within acceptable limits. In reality, however, the upgrade could net you a 50% decrease in throughput when moved to production.

Do We Care?

Depending on how your application is built, many people won’t notice the above-mentioned performance regression after the upgrade happens. First, most people do not run their servers even close to 100% load, so adding more load on the boxes may not immediately impact their users’ performance. Adding 0.02 seconds of load time for a user may be imperceptible unless under heavy load (which would increase that load time). The practical impact is that you reach the point where you need to either add more nodes or upgrade your instances sooner.

Second, scaling application nodes automatically is almost a requirement in most modern cloud-native environments. Reaching a point where you need to add more nodes and more processing power will come with increases in users on your application, so it is easily explained away.

Suppose users won’t immediately notice and the system will automatically expand as needed (preventing you from knowing or getting involved). In that case, do we, or should we, care about adding more nodes or servers? Adding nodes may be cheap, but it is not free.

First, there is a direct cost to you. Take your hosting costs for your application servers and double them in the case above. What is that? $10K, $100K, $1M a year? That is money that is wasted. Look no further than the recent news lamenting the ever-increasing costs of the cloud.

Second, there is a bigger cost that comes with complexity. Observability is such a huge topic because everything we end up doing in modern environments is done in mass. The more nodes and servers you have the more potential exists for issues or one node behaving badly. While our goal is to create a system where everything is replaceable and can be torn down and rebuilt to overcome problems, this is often not the reality. Instead, we end up replicating bad code, underlying bottlenecks, and making a single problem a problem at 100x the scale.

We need to care. Application workload is a living entity that grows, shrinks, and expands with users. While modern systems need to scale quickly up and down to meet the demand, that should not preclude us from looking out for hidden issues and bottlenecks. It is vitally important that we understand our applications’ workloads and look for deviations in the normal patterns.  We need to ask why something changed and dig in to find the answer.  Just because we can build automation to mitigate the problem does not mean we should get complacent and lazy about fixing and optimizing our systems.
