Sep
06
2022
--

Percona Private DBaaS API in Action

Percona Private DBaaS API

Percona Monitoring and Management (PMM) comes with Database as a Service (DBaaS) functionality that allows you to deploy highly available databases through a simple user interface and API. PMM DBaaS stands out for several reasons:

  • It is fully open source and free
  • It runs on your premises – your data center or public cloud account
  • The databases are deployed on Kubernetes and you have full control over your data

PMM features a robust API, but while I have seen demos of the UI on the internet, I have never seen anything about the API. The PMM API is great for building tools that automate workflows and operations at scale. In this blog post, I’m going to impersonate a developer playing with PMM’s API to deploy and manage databases. I also created an experimental CLI tool in Python to showcase a possible integration.

Percona Private DBaaS API in Action

Preparation

At the end of this step, you should have the following:

  • Percona Monitoring and Management up and running
  • Kubernetes cluster 
  • PMM API token generated

First off, you need a PMM server installed and reachable from your environment. See the various installation ways in our documentation.

For DBaaS to work you would also need a Kubernetes cluster. You can use minikube, or leverage the recently released free Kubernetes capability (in this case it will take you ~2 minutes to set everything up).

To automate various workflows, you will need programmatic access to Percona Monitoring and Management. The recommended way is to use API tokens; to generate one, please follow the steps described here. Please keep in mind that, for now, you need admin-level privileges to use DBaaS.

Using API

I will use our API documentation for all the experiments here; DBaaS has a dedicated section. In each step, I will provide an example with the cURL command, but keep in mind that our documentation has examples for cURL, Python, Golang, and many more. My PMM server address is 34.171.88.159 and my token is eyJrIjoiNmJ4VENyb0p0NWg1ODlONXRLT1FwN1N6YkU2SW5XMmMiLCJuIjoiYWRtaW5rZXkiLCJpZCI6MX0=. Just replace these with yours.

In the demo below, you can see me playing with PMM API through percona-dbaas-cli tool that I created to demonstrate possible integration for your teams. The goal here is to deploy the database with API and connect to it.

Below are the basic steps describing the path from setting up PMM to deploying your first database.

Connection check

To quickly check if everything is configured correctly, let’s try to get the PMM version.

API endpoint: /v1/version

CLI tool:

percona-dbaas-cli pmm version

 

cURL:

curl -k --request GET \
     --url https://34.171.88.159/v1/version \
     --header 'Accept: application/json' \
     --header 'Authorization: Bearer eyJrIjoiNmJ4VENyb0p0NWg1ODlONXRLT1FwN1N6YkU2SW5XMmMiLCJuIjoiYWRtaW5rZXkiLCJpZCI6MX0='

It should return information about the PMM server. If you get an error instead, fix the connection or the token first, as none of the following steps will work without this.
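The same check is easy to script. Below is a minimal Python sketch using only the standard library; the address and token are the placeholders from this post, so substitute yours, and disabling certificate verification mirrors curl’s -k for a self-signed PMM certificate:

```python
import json
import ssl
import urllib.request

PMM_URL = "https://34.171.88.159"   # placeholder from this post
TOKEN = "<your-api-token>"          # placeholder

def auth_headers(token: str) -> dict:
    """Headers PMM expects for API-token access."""
    return {"Accept": "application/json",
            "Authorization": f"Bearer {token}"}

def pmm_version(url: str, token: str) -> str:
    req = urllib.request.Request(f"{url}/v1/version",
                                 headers=auth_headers(token))
    # Unverified context mirrors curl's -k (self-signed certificate).
    ctx = ssl._create_unverified_context()
    with urllib.request.urlopen(req, context=ctx) as resp:
        return json.load(resp)["version"]

if __name__ == "__main__":
    print(pmm_version(PMM_URL, TOKEN))
```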

In the CLI tool, if you have not configured access to the PMM, it will ask you to do it first.

Enable DBaaS in PMM

Database as a Service in PMM is in the technical preview stage at the time of writing this blog post, so we are going to enable it if you have not already done so during installation.

API endpoint: /v1/Settings/Change

CLI tool:

percona-dbaas-cli dbaas enable

 

cURL:

curl -k --request POST \
     --url https://34.171.88.159/v1/Settings/Change \
     --header 'Accept: application/json' \
     --header 'Authorization: Bearer eyJrIjoiNmJ4VENyb0p0NWg1ODlONXRLT1FwN1N6YkU2SW5XMmMiLCJuIjoiYWRtaW5rZXkiLCJpZCI6MX0=' \
     --data '{"enable_dbaas": true}'

Now you should see the DBaaS icon in your PMM user interface and we can proceed with further steps. 
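If you are scripting several of these calls, a small helper keeps them tidy. This is a hedged sketch, not part of any Percona SDK: build_request and pmm_post are hypothetical names, and the unverified TLS context again mirrors curl’s -k:

```python
import json
import ssl
import urllib.request

def build_request(url, token, endpoint, payload):
    """Assemble an authenticated JSON POST request for a PMM endpoint."""
    return urllib.request.Request(
        f"{url}{endpoint}",
        data=json.dumps(payload).encode(),
        method="POST",
        headers={"Accept": "application/json",
                 "Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"})

def pmm_post(url, token, endpoint, payload):
    """Send the request and return the parsed JSON reply."""
    ctx = ssl._create_unverified_context()  # curl -k equivalent
    req = build_request(url, token, endpoint, payload)
    with urllib.request.urlopen(req, context=ctx) as resp:
        return json.load(resp)

if __name__ == "__main__":
    pmm_post("https://34.171.88.159", "<your-api-token>",
             "/v1/Settings/Change", {"enable_dbaas": True})
```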

Register Kubernetes cluster

In this iteration, PMM DBaaS uses Percona Kubernetes Operators to run databases. You need to register the Kubernetes cluster in PMM by submitting its kubeconfig.

API endpoint: /v1/management/DBaaS/Kubernetes/Register

CLI tool:

percona-dbaas-cli dbaas kubernetes-register

 

Registering k8s with cURL requires some magic. First, you will need to put the kubeconfig into a variable, and it should all be on one line. We have an example in our documentation:

KUBECONFIG=$(kubectl config view --flatten --minify | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/\\n/g')

curl -k --request POST \
     --url "https://34.171.88.159/v1/management/DBaaS/Kubernetes/Register" \
     --header "accept: application/json" \
     --header "authorization: Bearer eyJrIjoiNmJ4VENyb0p0NWg1ODlONXRLT1FwN1N6YkU2SW5XMmMiLCJuIjoiYWRtaW5rZXkiLCJpZCI6MX0=" \
     --data "{ \"kubernetes_cluster_name\": \"my-k8s\", \"kube_auth\": { \"kubeconfig\": \"${KUBECONFIG}\" }}"

It is much more elegant in Python or other languages. We will think about how to simplify this in future iterations.
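For example, in Python the sed gymnastics disappear because json.dumps escapes newlines for you. A sketch (the kubeconfig path and cluster name are examples):

```python
import json
import pathlib

def register_payload(cluster_name: str, kubeconfig_text: str) -> str:
    """Build the JSON body for /v1/management/DBaaS/Kubernetes/Register.

    json.dumps escapes the newlines in the kubeconfig automatically.
    """
    return json.dumps({
        "kubernetes_cluster_name": cluster_name,
        "kube_auth": {"kubeconfig": kubeconfig_text},
    })

if __name__ == "__main__":
    kubeconfig = pathlib.Path("~/.kube/config").expanduser().read_text()
    body = register_payload("my-k8s", kubeconfig)
    # POST `body` to https://<pmm>/v1/management/DBaaS/Kubernetes/Register
    # with the same Bearer-token headers as the earlier examples.
```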

Once the Kubernetes cluster is registered, PMM does the following:

  1. Deploys Percona Operators for MySQL and for MongoDB
  2. Deploys the Victoria Metrics Operator, so that PMM can get monitoring data from Kubernetes

Get the list of Kubernetes clusters

This is mostly to check whether the cluster was added successfully and the Operators were installed.

API endpoint: /v1/management/DBaaS/Kubernetes/List

CLI tool:

percona-dbaas-cli dbaas kubernetes-list

 

cURL:

curl -k --request POST \
     --url https://34.171.88.159/v1/management/DBaaS/Kubernetes/List \
     --header 'Accept: application/json' \
     --header 'Authorization: Bearer eyJrIjoiNmJ4VENyb0p0NWg1ODlONXRLT1FwN1N6YkU2SW5XMmMiLCJuIjoiYWRtaW5rZXkiLCJpZCI6MX0='

In the CLI tool, I decided to have a nicely formatted list of the clusters, as it is possible to have many registered in a single PMM server.
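A formatted list like the CLI’s can be produced with a few lines of Python once you have the JSON reply; the field names below follow the Kubernetes/List response format:

```python
def format_clusters(response):
    """Render one 'name: status' line per registered Kubernetes cluster."""
    lines = []
    for cluster in response.get("kubernetes_clusters", []):
        name = cluster["kubernetes_cluster_name"]
        status = cluster.get("status", "UNKNOWN")
        lines.append(f"{name}: {status}")
    return lines

# Sample reply shaped like the Kubernetes/List response.
sample = {"kubernetes_clusters": [
    {"kubernetes_cluster_name": "my-k8s",
     "status": "KUBERNETES_CLUSTER_STATUS_OK"}]}

print("\n".join(format_clusters(sample)))
```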

Create the database

Right now our DBaaS solution supports MySQL (based on Percona XtraDB Cluster) and MongoDB, so there are two endpoints to create databases:

API endpoints: /v1/management/DBaaS/PXCCluster/Create (MySQL) and /v1/management/DBaaS/PSMDBCluster/Create (MongoDB)

CLI tool:

percona-dbaas-cli dbaas databases-create

 

cURL: 

curl -k --request POST \
     --url https://34.171.88.159/v1/management/DBaaS/PSMDBCluster/Create \
     --header 'Accept: application/json' \
     --header 'Authorization: Bearer eyJrIjoiNmJ4VENyb0p0NWg1ODlONXRLT1FwN1N6YkU2SW5XMmMiLCJuIjoiYWRtaW5rZXkiLCJpZCI6MX0=' \
     --header 'Content-Type: application/json' \
     --data '{"kubernetes_cluster_name": "my-k8s", "expose": true}'

In the experimental CLI tool, I decided to go with a single command, where the user specifies the engine with the --engine flag.

Notice that I also set the expose flag to true, which instructs the Operator to create a LoadBalancer Service for my cluster. The database will be publicly exposed to the internet, which is not a good idea for production.

There are various other parameters that you can use to tune your database when interacting with the API.

For now, there is a certain gap between the features that Operators provide and the API. We are heading towards more flexibility, stay tuned for future releases.
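To illustrate, here is how a single --engine flag might map onto the two Create endpoints in a script of your own. This is a sketch: the PSMDB path matches the cURL example above, while the PXC path and the create_request helper are assumptions for illustration:

```python
# Engine name -> Create endpoint. The PSMDB path is from the post;
# the PXC path is assumed to mirror it.
ENDPOINTS = {
    "psmdb": "/v1/management/DBaaS/PSMDBCluster/Create",
    "pxc": "/v1/management/DBaaS/PXCCluster/Create",  # assumed MySQL path
}

def create_request(engine, k8s_name, expose=False):
    """Return (endpoint, payload) for a database-create call."""
    if engine not in ENDPOINTS:
        raise ValueError(f"unsupported engine: {engine}")
    payload = {"kubernetes_cluster_name": k8s_name, "expose": expose}
    return ENDPOINTS[engine], payload
```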

Get credentials and connect

It will take some time to provision the database: in the background, Persistent Volume Claims are provisioned, the cluster is formed, and networking is set up. You can get the list of databases and their statuses from the /v1/management/DBaaS/DBClusters/List endpoint.
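A simple way to wait for the database from a script is to poll that endpoint. In this sketch, post_json stands in for whatever function you use to call the PMM API, and the pxc_clusters/psmdb_clusters field names and DB_CLUSTER_STATE_READY value follow the API's enum style but should be checked against your PMM version:

```python
import time

def wait_until_ready(post_json, k8s_name, db_name, timeout=600, interval=15):
    """Poll /v1/management/DBaaS/DBClusters/List until db_name is ready."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        reply = post_json("/v1/management/DBaaS/DBClusters/List",
                          {"kubernetes_cluster_name": k8s_name})
        # Both engines are listed in the same reply (assumed field names).
        clusters = reply.get("pxc_clusters", []) + reply.get("psmdb_clusters", [])
        for db in clusters:
            if db.get("name") == db_name and \
               db.get("state") == "DB_CLUSTER_STATE_READY":
                return True
        time.sleep(interval)
    return False
```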

We finally have the cluster up and running. It is time to get the credentials:

API endpoints: /v1/management/DBaaS/PXCClusters/GetCredentials (MySQL) and /v1/management/DBaaS/PSMDBClusters/GetCredentials (MongoDB)

CLI tool:

percona-dbaas-cli dbaas get-credentials

 

cURL:

curl -k --request POST \
     --url https://34.171.88.159/v1/management/DBaaS/PXCClusters/GetCredentials \
     --header 'Accept: application/json' \
     --header 'Authorization: Bearer eyJrIjoiNmJ4VENyb0p0NWg1ODlONXRLT1FwN1N6YkU2SW5XMmMiLCJuIjoiYWRtaW5rZXkiLCJpZCI6MX0=' \
     --header 'Content-Type: application/json' \
     --data '{"kubernetes_cluster_name": "my-k8s","name": "my-mysql-0"}'

This will return the endpoint to connect to, along with the username and password. Use your favorite CLI tool or ODBC to connect to the database.
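For instance, the reply can be turned into a connection string in a couple of lines; the connection_credentials field names below reflect how the API reply is commonly shaped, but verify them against your PMM version:

```python
def mysql_dsn(reply):
    """Build a MySQL DSN from a GetCredentials-style reply (assumed shape)."""
    creds = reply["connection_credentials"]
    return (f"mysql://{creds['username']}:{creds['password']}"
            f"@{creds['host']}:{creds['port']}")

# Sample reply with the assumed field names.
sample = {"connection_credentials": {
    "username": "root", "password": "s3cret",
    "host": "203.0.113.10", "port": 3306}}

print(mysql_dsn(sample))  # mysql://root:s3cret@203.0.113.10:3306
```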

Conclusion

Automated database provisioning and management with a Database as a Service solution is becoming a minimal requirement for agile teams. Percona is committed to helping developers and operations teams run databases anywhere. You can deploy the fully open source Percona Monitoring and Management in the cloud or on-premises and provide a self-service experience to your teams, not only through the UI but through the API as well.

Right now PMM DBaaS is in technical preview and we encourage you to try it out. Feel free to tell us about your experience in our community forum.

Aug
22
2022
--

Private DBaaS with Free Kubernetes Cluster

Percona Private DBaaS with Free Kubernetes Cluster

We at Percona are committed to delivering software that enables users to run databases anywhere. Our Operators for databases and Percona Monitoring and Management (PMM) Database as a Service (DBaaS) confirm our commitment to Kubernetes. Kubernetes is not only the most popular container orchestrator but is also becoming a de facto standard for containerized workloads.

Even though we have an enterprise-grade solution to run and manage databases on Kubernetes, we still see that Kubernetes itself sometimes becomes a blocker for onboarding. We wrote a blog post some time ago about spinning up DBaaS in under 20 minutes. What if we could do it in two? This is why we partnered with a cloud-native service provider, Civo, to provide our users with a free temporary Kubernetes cluster. In this blog post, you will learn how to use it and try out our Private DBaaS solution without needing to be a database or Kubernetes expert.

How do I get the cluster?

  • Sign in to Percona Platform. If you don’t have an account yet, click Create one at the bottom of the sign-in form.
  • Find “Free Kubernetes” in the menu on the left:

Percona Portal

  • Click “Launch a new cluster”. It will take less than 90 seconds to create one.
  • Once the cluster is ready, you will be able to download kubeconfig – a file used to access the Civo Kubernetes cluster.

Percona DBaaS

Save this file somewhere on your computer; we will need it later to register the Kubernetes cluster in PMM DBaaS. That is it, the cluster is up and running.

Limitations

  • The cluster will be automatically destroyed in three hours. It must not be used for any production workloads.
  • The cluster comes with three nodes (4 CPUs, 8 GB RAM each) and does not have auto scaling enabled. It is enough for deploying a database cluster and an application.

Try DBaaS in Percona Monitoring and Management

Install PMM server

If you have a PMM server, skip this section. If not, we are going to deploy it using the quick install. You can also install PMM on Kubernetes with a Helm chart by following our documentation and this blog post.

Run the following command to install the PMM server on your Docker-compatible *nix-based machine (see the quick start guide for more details):

curl -fsSL https://www.percona.com/get/pmm | /bin/bash

When the script is done, it will print a list of IP-based URLs you can put in a browser to access the PMM UI. Copy/paste one into your favorite browser. You may receive a security warning; the script output includes instructions on how to bypass it if your browser does not offer a “proceed anyway” option.

DBaaS

You can find the necessary information about how to use DBaaS in our documentation or this video. In general, there are a few steps:

  1. At the time of writing this blog post, DBaaS is in technical preview. Do not forget to enable it in Settings -> Advanced Settings.
  2. Register the Kubernetes cluster in the DBaaS using the kubeconfig generated in the Portal
  3. Deploy your first database

Your database will be ready in a few minutes, and you will get the endpoint, username, and password to connect with. By default, the database is not exposed publicly and is reachable only within the Kubernetes cluster. You can change this in the Advanced Options when creating the database.

With ‘Free Kubernetes’ we want to simplify PMM DBaaS onboarding and bring value to our community of users. This is the first version, and we plan to deliver more enhancements for an even smoother onboarding. It would be great if you could help us find those improvements by submitting your feedback to platform_portal@percona.com. Please spend a couple of minutes and let us know what problems or improvements you would like to see in your PMM DBaaS and Kubernetes journey.

Mar
04
2022
--

DBaaS in Under 20 Min!

DBaaS Kubernetes Percona

My father always used the expression, “sometimes the juice ain’t worth the squeeze”. What he meant by it was, “what you put into something had better be worth what you get out of it”. Why do I bring this up? Well…we’ve got this GREAT feature for Percona Monitoring and Management (PMM) we’ve been banging away on: PMM DBaaS (Database as a Service). Using it for only 30 minutes, you can see it has the potential to change the way teams think about providing database services while controlling cost and minimizing complexity. But it’s a major pain in the ass to get it all set up before you first realize that value…and we want to change that!

TLDR: YES! I’ve been wanting to try out DBaaS, but have no desire to become a Kubernetes expert just to see it! Skip to the good stuff! 

Quick history.  Our DBaaS (Database as a Service) offering is designed to be resilient and performant…after all, we’re kind of known for being able to beat databases into high-performance submission.  So when considering the backends to help us deliver performance, scalability, reliability, and more, we settled on Kubernetes as the starting point thanks to its scalability, resiliency, and orchestration capabilities out of the box!  We released a preview release about a year ago and have been adding features and functionality to it ever since. 

Getting Past Setup Kubernetes

I’m lucky enough to get to talk to all kinds of users that are begging for a solution with the flexibility of your public cloud DBaaS but without racking up tens of thousands of dollars of bills a month, or that need to maintain tight control of their data, or who have moved a ton of workload to the cloud and have racks of servers just sitting there.  I tell them about what we’ve built and encourage them to try it out. All of them get excited to hear what it can do and are eager to give it a try!  So I give them some time and follow up a few weeks later…nothing.  I encourage them to make the time, follow up a few weeks later…nothing?  Challenge them as to why not when they admit they’re losing precious cycles on silly operations that users should just be able to do on their own and the number one response is “Kubernetes is too confusing for me and I could never get past Step 1: Setup Kubernetes”.  Not. Good!  

I’ve used our DBaaS on numerous occasions…mostly just on my laptop with minikube. There’s a drawback with minikube: you must have a powerhouse of a machine to run DBaaS and PMM on the same box, not to mention weights and chains to keep your laptop from flying away when the fans go nuts! The best way to poke around DBaaS is with some cheap public cloud infrastructure! So I figured I’d give it a try…our docs show what looks like “three easy steps”, but fail to mention the prerequisite 20 steps if you don’t already have eksctl and other tools installed/configured. It was more work than I had budgeted time for, but I decided to push through, determined to get an EKS cluster up and running! I threw in the towel after about five hours one Saturday…defeated. It wasn’t just getting the cluster up; it was all the hoops and tricks and rules and roles and permissions needed to do anything with that cluster. That’s five hours of squeeze and still no juice!

So I did what all smart engineers do…found a smarter engineer!  Sergey and I decided there was a real opportunity to make DBaaS available to a wider range of users…those who were not AWS Level 8 Samurais with PhDs in Kubernetes and the goal was simple: “Be able to use PMM’s DBaaS in 10 minutes or less…starting from nothing!”  We have not quite hit the 10-minute mark, but we DID hit the 18-minute mark…and 16 of those 18 minutes are spent watching paint dry as the tiny robots of CloudFormation get you a suitable cluster up and running.  But when it’s done, there’s no messing with IAM roles or Load Balancers or VPCs…just copy/paste, and use!

Wanna Try DBaaS?

You’re going to need your AWS Access Key ID and AWS Secret Access Key for a root user…so get those before you start the timer (here is a handy guide to getting them if you don’t already have them safely stored somewhere). You will also need a Linux system to set up your PMM server on and to make the calls needed to get your K8s cluster up and running (this has only been tested on CentOS and Ubuntu).

As any user with sudo, run: 

curl -fsSL https://raw.githubusercontent.com/shoffman-percona/easyK8s/main/easyK8s-aws.sh | bash -s -- <AWS_ACCESS_KEY_ID> <AWS_SECRET_ACCESS_KEY>

You can optionally add an AWS region at the end of that if you want something other than us-east-2 (default).

While this is running (~16 min), you can go right to the PMM installation. Open a new tab/terminal window and run the following (as a user with sudo privileges):

curl -fsSL https://www.percona.com/get/pmm | /bin/bash

When the script is done, it will print a list of IP-based URLs you can put in a browser to access the PMM UI. Copy/paste one into your favorite browser. You may receive a security warning; the script output includes instructions on how to bypass it if your browser does not offer a “proceed anyway” option.

Log in to PMM’s UI, the default username/password is admin/admin and you’ll be prompted to change the password.  

To turn on DBaaS, you’ll need to click the gear icon, followed by “Settings”. On the PMM settings page, click “Advanced Settings”, scroll down to the “Technical Preview features” section, and toggle DBaaS on. While you’re here, fill in the Public Address using the “Get from Browser” button; this makes automatic monitoring that much easier later. Click “Apply Changes” and you’ll see the screen refresh and a new database icon appear. Click it to get to the DBaaS main page…but you’ll most likely be holding here as the infrastructure is probably still setting up. Take advantage of the opportunity to stand up, stretch your legs, maybe grab a drink!

Once the cluster setup is complete, copy everything from the ####BEGIN KUBECONFIG#### comment to the ####END KUBECONFIG#### comment. Switch over to the PMM DBaaS UI and, on the Kubernetes Cluster tab, click “Register New Kubernetes Cluster”. Name your new cluster and paste the config in the bottom window…it’ll take a second, and your PMM server will install both the Percona XtraDB Cluster and Percona Server for MongoDB operators and enable the DB Cluster tab, where you can create and size DBs of your choosing!
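If you script this step, pulling just the kubeconfig out of the script’s output is a small text-slicing job; a sketch using the marker lines mentioned above:

```python
def extract_kubeconfig(output):
    """Return the text between the BEGIN and END KUBECONFIG marker lines."""
    start = output.index("####BEGIN KUBECONFIG####")
    end = output.index("####END KUBECONFIG####")
    # Skip past the BEGIN marker line itself.
    begin_line_end = output.index("\n", start) + 1
    return output[begin_line_end:end].strip("\n")

# Tiny illustrative stand-in for the script's real output.
sample = ("setup done\n"
          "####BEGIN KUBECONFIG####\n"
          "apiVersion: v1\n"
          "####END KUBECONFIG####\n")

print(extract_kubeconfig(sample))
```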

That’s it! If all the complicated setup has held you back from taking DBaaS for a test drive, hopefully this will give you “more juice for your squeeze”! We’d love to hear feedback on what we’ve built so far, so feel free to leave a comment here or offer an improvement in our Jira instance under the PMM project. Our main objective is to take the complication out of getting a database up and running for your application development process and to let you create MySQL and MongoDB databases in one place (PostgreSQL coming soon). When you’re done playing, you can unregister the Kubernetes cluster from PMM, then log in to your AWS account and delete both stacks (eksctl-pmmDBaaS-nodes-XXX and eksctl-pmmDBaaS-cluster) in the CloudFormation app for the region you chose (or us-east-2 if you left the default).

Jan
14
2022
--

DBaaS and the Enterprise

DBaaS and the Enterprise

Install a database server. Give the application team an endpoint. Set up backups and monitor in perpetuity. This is a pattern I hear about regularly from DBAs with most of my enterprise clients. Rarely do they get to troubleshoot or work with application teams to tune queries or design schemas. This is what triggers the interest in a DBaaS platform from database administrators.

What is DBaaS?

DBaaS stands for “Database as a Service”. When this acronym is thrown out, the first thought is generally a cloud offering such as RDS. While this is a very commonly used service, a DBaaS is really just a managed database platform that offloads much of the operational burden from the DBA team. Tasks handled by the platform include:

  • Installing the database software
  • Configuring the database
  • Setting up backups
  • Managing upgrades
  • Handling failover scenarios

A common misconception is that a DBaaS is limited to the public cloud. As many enterprises already have large data centers and heavy investments in hardware, an on-premise DBaaS can also be quite appealing. Keeping the database in-house is often favored when the hardware and resources are already available. In addition, there are extra compliance and security concerns when looking at a public cloud offering.

DBaaS also represents a difference in mindset. In conventional deployments, systems and architecture are often designed in very exotic ways making automation a challenge. With a DBaaS, automation, standardization, and best practices are the priority. While this can be seen as limiting flexibility, this approach can lead to larger and more robust infrastructures that are much easier to manage and maintain.

Why is DBaaS Appealing?

From a DBA perspective (and being a former DBA myself), I always enjoyed working on more challenging issues. Mundane operations like launching servers and setting up backups make for a less-than-exciting daily work experience. When managing large fleets, these operations make up the majority of the work.

As applications grow more complex and data sets grow rapidly, it is much more interesting to work with the application teams to design and optimize the data tier. Query tuning, schema design, and workflow analysis are much more interesting (and often beneficial) when compared to the basic setup. DBAs are often skilled at quickly identifying issues and understanding design issues before they become problems.

When an enterprise adopts a DBaaS model, this can free up the DBAs to work on more complex problems. They are also able to better engage and understand the applications they are supporting. A common comment I get when discussing complex tickets with clients is: “well, I have no idea what the application is doing, but we have an issue with XYZ”. If this could be replaced with a detailed understanding from the design phase to the production deployment, these discussions would be very different.

From an application development perspective, a DBaaS is appealing because new servers can be launched much faster. Ideally, with development or production deployment options, an application team can have the resources they need ready in minutes rather than days. It greatly speeds up the development life cycle and makes developers much more self-reliant.

DBaaS Options

While this isn’t an exhaustive list, the main options when looking to move to a DBaaS are:

  • Public cloud
    • Amazon RDS, Microsoft Azure SQL, etc
  • Private/Internal cloud
    • Kubernetes (Percona DBaaS), VMWare, etc
  • Custom provisioning/operations on bare-metal

Looking at public cloud options for a DBaaS, security and compliance are generally the first concern. While they are incredibly easy to launch and generally offer some pay-as-you-go options, managing access is a major consideration.

Large enterprises with existing hardware investments often want to explore a private DBaaS. I’ve seen clients work to create their own tooling within their existing infrastructure. While this is a viable option, it can be very time-consuming and require many development cycles. Another alternative is to use an existing DBaaS solution. For example, Percona currently has a DBaaS deployment as part of Percona Monitoring and Management in technical preview. The Percona DBaaS automates PXC deployments and management tasks on Kubernetes through a user-friendly UI.

Finally, a custom deployment is just what it sounds like. I have some clients that manage fleets (1000s of servers) of bare metal servers with heavy automation and custom scripting. To the end-user, it can look just like a normal DBaaS (an endpoint with all operations hidden). On the backend, the DBA team spends significant time just supporting the infrastructure.

How Can Percona help?

Percona works to meet your business where you are. If that is supporting Percona Server for MySQL on bare metal or a fleet of RDS instances, we can help. If your organization is leveraging Kubernetes for the data tier, the Percona Private DBaaS is a great option to standardize and simplify your deployments while following best practices. We can help from the design phase through the entire life cycle. Let us know how we can help!

May
19
2021
--

Percona Monitoring and Management DBaaS Overview and Technical Details

Percona Monitoring and Management DBaaS Overview

Database-as-a-Service (DBaaS) is a managed database that doesn’t need to be installed and maintained but is instead provided as a service to the user. The Percona Monitoring and Management (PMM) DBaaS component allows users to CRUD (Create, Read, Update, Delete) Percona XtraDB Cluster (PXC) and Percona Server for MongoDB (PSMDB) managed databases in Kubernetes clusters.

The PXC and PSMDB Operators implement DBaaS on top of Kubernetes (k8s), and PMM DBaaS provides a nice interface and API to manage them.

Deploy Playground with minikube

The easiest way to play with and test PMM DBaaS is to use minikube. Please follow the minikube installation guideline. It is possible that your OS distribution provides native packages for it, so check that with your package manager as well.

In the examples below, Linux is used with the kvm2 driver, so kvm and libvirt should additionally be installed. Other OSes and drivers can be used as well. Install the kubectl tool too; minikube will configure kubeconfig so the k8s cluster can be accessed easily from the host.

Let’s create a k8s cluster and adjust resources as needed. The minimum requirements can be found in the documentation.

  • Start minikube cluster
$ minikube start --cpus 12 --memory 32G --driver=kvm2

  • Download PMM Server deployment for minikube and deploy it in k8s cluster
$ curl -sSf -m 30 https://raw.githubusercontent.com/percona-platform/dbaas-controller/main/deploy/pmm-server-minikube.yaml \
| kubectl apply -f -

  • The first time, it could take a while for the PMM Server to initialize the volume, but it will eventually start
  • Here’s how to check that PMM Server deployment is running:
$ kubectl get deployment
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
pmm-deployment   1/1     1            1           3m40s

$ kubectl get pods
NAME                             READY   STATUS    RESTARTS   AGE
pmm-deployment-d688fb846-mtc62   1/1     Running   0          3m42s

$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM              STORAGECLASS   REASON   AGE
pmm-data                                   10Gi       RWO            Retain           Available                                              3m44s
pvc-cb3a0a18-b6dd-4b2e-92a5-dfc0bc79d880   10Gi       RWO            Delete           Bound       default/pmm-data   standard                3m44s

$ kubectl get pvc
NAME       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pmm-data   Bound    pvc-cb3a0a18-b6dd-4b2e-92a5-dfc0bc79d880   10Gi       RWO            standard       3m45s

$ kubectl get service
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP                      6m10s
pmm          NodePort    10.102.228.150   <none>        80:30080/TCP,443:30443/TCP   3m5

  • Expose PMM Server ports on the host, as this also opens links to the PMM UI as well as to the API endpoint in the default browser.
$ minikube service pmm

NOTE:

Without going into too much detail: the PV (kubectl get pv) and PVC (kubectl get pvc) are essentially the storage for PMM data (the /srv directory), and the Service is the network entry point for accessing PMM.

Attention: this PMM Server deployment is not supposed to be used in production, but just as a sandbox for testing and playing around, as it always starts with the latest version of PMM and k8s is not yet a supported environment for it.

Configure PMM DBaaS

Now the PMM DBaaS Dashboard can be used: a k8s cluster can be added and configured, and databases created.

DBaaS Dashboard

NOTE:

To enable the PMM DBaaS feature, you need to either pass a special environment variable (ENABLE_DBAAS=1) to the container or enable it in the settings (next screenshot).

To allow PMM to manage the k8s cluster, it needs to be configured. Check the documentation, but here are the short steps:

  • Set the Public Address to pmm on the Configuration -> Settings -> Advanced Settings page

PMM Advanced settings

  • Get k8s config (kubeconfig) and copy it for registration:
kubectl config view --flatten --minify

  • Register configuration that was copied on DBaaS Kubernetes Cluster dashboard:

DBaaS Register k8s Cluster

 

Let’s get into details on what that all means.

The Public Address is propagated to the pmm-client containers that run as part of PXC and PSMDB deployments to monitor DB services. The pmm-client containers run pmm-agent, which needs to connect to the PMM Server and uses the Public Address for that. The DNS name pmm is set by the Service in the pmm-server-minikube.yaml file for our PMM Server deployment.

For now, PMM DBaaS uses kubeconfig to access the k8s API so it can manage the PXC and PSMDB operators. The kubeconfig file and k8s cluster information are stored securely in the PMM Server’s internal DB.

PMM DBaaS cannot yet deploy operators into the k8s cluster, but that feature will be implemented very soon. That is why the Operator status on the Kubernetes Cluster dashboard shows hints on how to install them.

What are the operators, and why are they needed? This is defined very well in the documentation. Long story short, they are the heart of DBaaS: they deploy and configure DBs inside the k8s cluster.

Operators themselves are complex pieces of software that need to be correctly started and configured to deploy DBs. That is where PMM DBaaS comes in handy, to configure a lot for the end-user and provide a UI to choose what DB needs to be created, configured, or deleted.

Deploy PSMDB with DBaaS

Let’s deploy the PSMDB operator and DBs step by step and check them in detail.

  • Deploy PSMDB operator
curl -sSf -m 30 https://raw.githubusercontent.com/percona/percona-server-mongodb-operator/v1.7.0/deploy/bundle.yaml \
| kubectl apply -f -

  • Here’s how it could be checked that operator was created:
$ kubectl get deployment
NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
percona-server-mongodb-operator   1/1     1            1           46h
pmm-deployment                    1/1     1            1           24h


$ kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
percona-server-mongodb-operator-586b769b44-hr7mg   1/1     Running   2          46h
pmm-deployment-7fcb579576-hwf76                    1/1     Running   1          24h

Now it is seen on the PMM DBaaS Kubernetes Cluster Dashboard that the MongoDB operator is installed.

Cluster with PSMDB

PMM API

All REST APIs can be discovered via Swagger; it is exposed on both ports (30080 and 30443 in the case of minikube) and can be accessed by appending /swagger to the PMM server address. It is recommended to use HTTPS (port 30443); for example, the URL could look like this: https://192.168.39.202:30443/swagger.

As DBaaS is a feature under active development, replace /swagger.json with /swagger-dev.json and push the Explore button.

Swagger API

Now all APIs can be seen and even executed.

Let's try it out. First push Authorize, then find /v1/management/DBaaS/Kubernetes/List, push Try it out, and then Execute. Swagger shows an example curl command as well as the response to the REST API POST request. The curl example can also be used from the command line:

$ curl -kX POST "https://192.168.39.202:30443/v1/management/DBaaS/Kubernetes/List" -H  "accept: application/json" -H  "authorization: Basic YWRtaW46YWRtaW4=" -H  "Content-Type: application/json" -d "{}"
{
  "kubernetes_clusters": [
    {
      "kubernetes_cluster_name": "minikube",
      "operators": {
        "xtradb": {
          "status": "OPERATORS_STATUS_NOT_INSTALLED"
        },
        "psmdb": {
          "status": "OPERATORS_STATUS_OK"
        }
      },
      "status": "KUBERNETES_CLUSTER_STATUS_OK"
    }
  ]
}

PMM Swagger API Example
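For scripting against the API, the same response can be consumed programmatically. A minimal Python sketch – the response structure mirrors the JSON shown above, and the clusters_ready_for helper is hypothetical, not part of PMM:

```python
import json

# Sample response from POST /v1/management/DBaaS/Kubernetes/List,
# matching the structure shown above.
response = json.loads("""
{
  "kubernetes_clusters": [
    {
      "kubernetes_cluster_name": "minikube",
      "operators": {
        "xtradb": {"status": "OPERATORS_STATUS_NOT_INSTALLED"},
        "psmdb": {"status": "OPERATORS_STATUS_OK"}
      },
      "status": "KUBERNETES_CLUSTER_STATUS_OK"
    }
  ]
}
""")

def clusters_ready_for(response, operator):
    """Return names of registered clusters where the given operator is installed."""
    return [
        c["kubernetes_cluster_name"]
        for c in response.get("kubernetes_clusters", [])
        if c["operators"].get(operator, {}).get("status") == "OPERATORS_STATUS_OK"
    ]

print(clusters_ready_for(response, "psmdb"))   # ['minikube']
print(clusters_ready_for(response, "xtradb"))  # []
```

A tool built on top of the API could use a check like this to decide whether a cluster is ready to accept a DB-creation request.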

Create DB and Deep Dive

PMM Server consists of different components, and for the DBaaS feature, here are the main ones:

  • Grafana UI with DBaaS dashboards talks to pmm-managed through the REST API to show the current state and provides the user interface
  • pmm-managed acts as a REST gateway, holds the kubeconfig, and talks to dbaas-controller through gRPC
  • dbaas-controller implements the DBaaS features, talks to k8s, and exposes a gRPC interface for pmm-managed

The Grafana UI is what users see; now that the operators are installed, the user can create a MongoDB instance. Let's do this.

  • Go to the DBaaS -> DB Cluster page and push the Create DB Cluster link
  • Choose your options and push the Create Cluster button

Create MongoDB cluster

It has more advanced options to configure resources allocated for the cluster:

Advanced settings for cluster creation

As you can see, the cluster was created and can now be managed. Let's look in detail at what happened underneath.

PSMDB Cluster created

When the user pushes the Create Cluster button, the Grafana UI POSTs a /v1/management/DBaaS/PSMDBCluster/Create request to pmm-managed. pmm-managed handles the request and sends it via gRPC to dbaas-controller, together with the kubeconfig.

dbaas-controller handles the request and, with its knowledge of the operator's structure (Custom Resources/CRDs), prepares a CR with all the parameters needed to create a MongoDB cluster. After filling in all the needed structures, dbaas-controller converts the CR to a YAML file and applies it with the kubectl apply -f command. kubectl is pre-configured with a kubeconfig file (passed by pmm-managed from its DB) to talk to the correct cluster; the kubeconfig file is created temporarily and deleted immediately after the request.

The same happens when some parameters change or when dbaas-controller reads parameters back from the k8s cluster.

Essentially, dbaas-controller automates all the stages: it fills CRs with the correct parameters, checks that everything works correctly, and returns details about the created clusters. The kubectl interface is used for simplicity, but it is subject to change before GA, most probably to the k8s Go API.
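For reference, the Custom Resource that dbaas-controller generates and applies is a regular PerconaServerMongoDB object. A simplified sketch of such a manifest, with placeholder names and values (the exact set of fields dbaas-controller fills in may differ):

```yaml
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDB
metadata:
  name: my-psmdb-cluster          # placeholder cluster name
spec:
  image: percona/percona-server-mongodb:4.4
  replsets:
    - name: rs0
      size: 3                     # three members for high availability
      resources:
        limits:
          cpu: "1"
          memory: 2G
      volumeSpec:
        persistentVolumeClaim:
          resources:
            requests:
              storage: 10G
```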

Summary

Altogether, PMM Server DBaaS provides a seamless experience: users can deploy DB clusters on top of Kubernetes through a simple and nice UI without needing to know the operators' internals. When deploying PXC and PSMDB clusters, it also configures PMM agents and exporters, so all monitoring data is available in PMM Server right away.

PMM PSMDB overview

Go to PMM Dashboard -> MongoDB -> MongoDB Overview to see the MongoDB monitoring data; node and service monitoring also come pre-configured with the help of the DBaaS feature.

Give it a try, submit feedback, and chat with us, we would be happy to hear from you!

P.S.

Don't forget to stop and/or delete your minikube cluster when it is no longer used:

  • Stop the minikube cluster to free resources (it can be started again with minikube start):
$ minikube stop

  • If the cluster is not needed anymore, delete it:
$ minikube delete

 

May
12
2021
--

Percona Live ONLINE: Percona Previews Open Source Database as a Service

percona open source dbaas

Percona Live ONLINE 2021 starts today!

Representing dozens of projects, communities, and tech companies, and featuring more than 150 expert speakers across 200 sessions, there’s still time to register and attend. 

Register and Attend

Percona's latest product announcements focus on the open source DBaaS preview and new Percona Kubernetes Operators features.

During Percona Live ONLINE 2021, our experts will be discussing the preview of Percona’s 100% open source Database as a Service (DBaaS), which eliminates vendor lock-in and enables users to maintain control of their data. 

As an alternative to public cloud and large enterprise database vendor DBaaS offerings, this on-demand self-service option provides users with a convenient and simple way to deploy databases quickly. Using Percona Kubernetes Operators means it is possible to configure a database once and deploy it anywhere.

“The future of databases is in the cloud, an approach confirmed by the market and validated by our own customer research,” said Peter Zaitsev, co-founder and CEO of Percona. “We’re taking this one step further by enabling open source databases to be deployed wherever the customer wants them to run – on-premises, in the cloud, or in a hybrid environment. Companies want the flexibility of DBaaS, but they don’t want to be tied to their original decision for all time – as they grow or circumstances change, they want to be able to migrate without lock-in or huge additional expenses.”

The DBaaS supports Percona open source versions of MySQL, MongoDB, and PostgreSQL. 

Critical database management operations such as backup, recovery, and patching will be managed through the Percona Monitoring and Management (PMM) component of Percona DBaaS. 

PMM is completely open source and provides enhanced automation with monitoring and alerting to find, eliminate, and prevent outages, security issues, and slowdowns in performance across MySQL, MongoDB, PostgreSQL, and MariaDB databases.

Customer trials of Percona DBaaS will start this summer. Businesses interested in being part of this trial can register here.

Easy Deployment and Management with Kubernetes Operators from Percona

The Kubernetes Operator for Percona Distribution of PostgreSQL is now available in technical preview, making it easier than ever to deploy. This Operator streamlines the process of creating a database so that developers can gain access to resources faster, and it simplifies ongoing lifecycle management.

There are also new capabilities available in the Kubernetes Operator for Percona Server for MongoDB, which support enterprise mission-critical deployments with features for advanced data recovery. It now includes support for multiple shards, which provides horizontal database scaling, and allows for distribution of data across multiple MongoDB Pods. This is useful for large data sets when a single machine’s overall processing speed or storage capacity is insufficient. 

This Operator also allows Point-in-Time Recovery, which enables users to roll back the cluster to a specific transaction and time, or even skip a specific transaction. This is important when data needs to be restored to reverse a problem transaction or ransomware attack.

The new Percona product announcements will be discussed in more detail at our annual Percona Live ONLINE Open Source Database Conference 2021 starting today.

We hope you’ll join us! Register today to attend Percona Live ONLINE for free.

Register and Attend

Feb
26
2021
--

Webinar March 18: Moving Your Database to the Cloud – Top 3 Things to Consider

Moving Your Database to the Cloud

Join Rick Vasquez, Percona Technical Expert, as he discusses the pros and cons of moving your database to the cloud.

Flexibility, performance, and cost management are three things that make cloud database environments an easy choice for many businesses. If you are thinking of moving your database to the cloud you need to know the issues you might encounter, and how you can make the most of your DBaaS cloud configuration.

Our latest webinar looks into a number of key issues and questions, including:

* The real Total Cost of Ownership (TCO) when moving your database to the cloud.

* If performance is a critical factor in your application, how do you achieve the same or better performance in your chosen cloud?

* The “hidden costs” of running your database in the cloud.

* Why do companies choose open source and cloud software? The pros and cons of this decision.

* The case for cloud support (why “fully managed” isn’t always as fully managed as claimed).

* Should you consider a multi-cloud strategy?

* Business continuity planning in the cloud – what if your provider has an outage?

Please join Rick Vasquez, Percona Technical Expert, on Thursday, March 18, 2021, at 1:00 PM EST for his webinar “Moving Your Database to the Cloud – Top 3 Things to Consider“.

Register for Webinar

If you can’t attend, sign up anyway, and we’ll send you the slides and recording afterward.

Feb
24
2021
--

Point-In-Time Recovery in Kubernetes Operator for Percona XtraDB Cluster – Architecture Decisions

Point-In-Time Recovery in Kubernetes Operator

Point-In-Time Recovery (PITR) for MySQL databases is an essential feature that covers common use cases, like recovering to the latest possible transaction or rolling the database back to a specific date before a bad query was executed. Percona Kubernetes Operator for Percona XtraDB Cluster (PXC) added support for PITR in version 1.7, and in this blog post we are going to look into the technical details and the decisions we made to implement this feature.

Architecture Decisions

Store Binary Logs on Object Storage

MySQL uses binary logs to perform point-in-time recovery. Usually, they are stored locally along with the data, but it is not an option for us:

  • We run the cluster and we cannot rely on a single node’s local storage.
  • The cloud-native world lives in an ephemeral dimension, where nodes and pods can be terminated and S3-compatible storage is a de facto standard to store data.
  • We should be able to recover the data to another Kubernetes cluster in case of a disaster.

We have decided to add a new Binlog Uploader Pod, which connects to the available PXC member and uploads binary logs to S3. Under the hood, it relies on the mysqlbinlog utility.

Use Global Transaction ID

Binary logs on the clustered nodes are not synced and can have different names and contents. This becomes a problem for the Uploader, as it can connect to different PXC nodes for various reasons.

To solve this problem, we decided to rely on the Global Transaction ID (GTID). It is a transaction identifier that is unique not only to the server on which it originated but across all servers in a given replication topology. With GTIDs captured in binary logs, we can identify any transaction regardless of the file name or its contents. This allows us to continue streaming binlogs from any PXC member at any moment.

User-Defined Functions

We have a unique identifier for every transaction, but the mysqlbinlog utility still cannot determine which binary log file contains which GTID. We decided to extend MySQL with a few User-Defined Functions and added them to Percona Server for MySQL and Percona XtraDB Cluster starting with version 8.0.21.

get_gtid_set_by_binlog()

This function returns all GTIDs that are stored inside the given binlog file. We store this GTID set in a new file next to the binary log on S3.

get_binlog_by_gtid_set()

This function takes a GTID set as input and returns the name of a locally stored binlog file that contains it. We use it to figure out which GTIDs are already uploaded and which binlog to upload next.

binlog uploader pod
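The uploader's core decision – which binlog to upload next – can be modeled in a few lines of Python. This is an illustrative simplification, not the actual uploader code; binlog names and GTID values are invented:

```python
# Simplified model of the Binlog Uploader's decision logic.
# Each binlog maps to the set of GTIDs it contains (in reality this comes
# from get_gtid_set_by_binlog()); the uploaded set is read back from the
# GTID-set files stored next to the binlogs on S3.

local_binlogs = {
    "binlog.000001": {"uuid:1", "uuid:2"},
    "binlog.000002": {"uuid:3", "uuid:4"},
    "binlog.000003": {"uuid:5"},
}

uploaded_gtids = {"uuid:1", "uuid:2"}  # already present on S3

def binlogs_to_upload(local_binlogs, uploaded_gtids):
    """Return binlogs (in order) whose GTIDs are not fully on S3 yet."""
    return [
        name
        for name, gtids in sorted(local_binlogs.items())
        if not gtids <= uploaded_gtids  # subset test: skip fully uploaded files
    ]

print(binlogs_to_upload(local_binlogs, uploaded_gtids))
# ['binlog.000002', 'binlog.000003']
```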

Have open source expertise you want to share? Submit your talk for Percona Live ONLINE 2021!

Find the node with the oldest binary log

Our quality assurance team caught a bug before the release that can only happen in a cluster:

  • Add a new node to the Percona XtraDB Cluster (for example scale up from 3 to 5 nodes).
  • Binlog Uploader Pod tries to execute get_binlog_by_gtid_set on the new node but gets an error:
2021/01/19 11:23:19 ERROR: collect binlog files: get last uploaded binlog name by gtid set: scan binlog: sql: Scan error on column index 0, name "get_binlog_by_gtid_set('a8e657ab-5a47-11eb-bea2-f3554c9e5a8d:15')": converting NULL to string is unsupported

The error is valid, as the node is new and has no binary log files containing the GTID set that the Uploader got from S3. As you can see in this pull request, the quick patch is to always pick the oldest node in the array – in other words, the node that most likely has the binary logs we need. In the next release of the Operator, we will add more sophisticated logic to reliably discover the node that has the oldest binary logs.

Storageless binlog uploader

The size of binlogs depends on cluster usage patterns, so it is hard to predict how much storage or memory they require. We decided to take this complexity away by making our Binlog Uploader Pod completely storageless. mysqlbinlog can only write a remote binlog into a file, but we need to put it on S3. To get there, we use a named pipe, or FIFO special file: the mysqlbinlog utility writes the binary log to the named pipe, and our Uploader reads from it and streams the data directly to S3.

Also, the storageless design means that we never store any state between Uploader restarts. State is not really needed: we only need to know which GTIDs are already uploaded, and that data is on the remote S3 bucket. This design enables a continuous upload flow of binlogs.
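The named-pipe trick itself is plain UNIX. A toy sketch, with gzip standing in for the S3 uploader and printf standing in for mysqlbinlog, shows how data flows through the FIFO without landing on disk in raw form:

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Create the FIFO the "uploader" will read from.
mkfifo "$workdir/binlog.fifo"

# Reader side: stands in for the Uploader streaming to S3.
# Here it just compresses the stream into a file.
gzip -c < "$workdir/binlog.fifo" > "$workdir/binlog.gz" &

# Writer side: stands in for mysqlbinlog writing the binlog.
printf 'pretend binlog bytes\n' > "$workdir/binlog.fifo"
wait

# Verify the data made it through the pipe intact.
gunzip -c "$workdir/binlog.gz"   # prints: pretend binlog bytes
```

The reader blocks until a writer opens the FIFO, so the two sides synchronize automatically, which is exactly the behavior the storageless Uploader relies on.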

Binlog upload delay

The S3 protocol expects a file to be uploaded completely. If an upload is interrupted (say, the Uploader Pod is evicted), the file will not be accessible/visible on S3. We could potentially lose many hours of binary logs because of such interruptions. That is why we split the binlog stream into files and upload them separately.

One of the options users can configure when enabling point-in-time recovery in the Percona XtraDB Cluster Operator is timeBetweenUploads, which sets the number of seconds between uploads for the Binlog Uploader Pod. By default we set it to 60 seconds, but it can go as low as one second. We do not recommend setting it too low, as every invocation of the Uploader executes a FLUSH BINARY LOGS command on the PXC node. We need to flush the logs to close the binary log file so it can be uploaded to external storage, but doing this frequently may negatively affect IO and, as a result, database performance.
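In the cluster's Custom Resource, this configuration looks roughly like the following sketch (storage name, bucket, and secret are placeholders; see the Operator's cr.yaml for the authoritative layout):

```yaml
backup:
  pitr:
    enabled: true
    storageName: s3-us-west       # must reference a storage defined below
    timeBetweenUploads: 60        # seconds between binlog uploads
  storages:
    s3-us-west:
      type: s3
      s3:
        bucket: my-pitr-bucket
        region: us-west-2
        credentialsSecret: my-cluster-backup-s3
```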

Recovery

It is all about recovery and it has two steps:

  1. Recover the cluster from a full backup
  2. Apply binary logs

We already have the functionality to restore from a full backup (see here), so let’s get to applying the binary logs.

First, we need to figure out from which GTID set we should start applying binary logs – in other words: where do we start? Since we rely on the Percona XtraBackup utility to take full MySQL backups, all we need to do is read the xtrabackup_info file, which holds lots of useful metadata. We already have this file on S3, next to the full backup.

Second, we find the binlog that contains the GTID set we need. As you remember, we already store a file with each binlog's GTID set on S3, so it boils down to reading these files.

Third, we download the binary logs and apply them. Here we rely on mysqlbinlog as well, which has the flags we need, like --stop-datetime, which stops recovery when an event with the given timestamp is reached in the log.
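When driven through the Operator, these steps are triggered by a restore Custom Resource. A rough sketch with placeholder names (consult the Operator documentation for the exact schema):

```yaml
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterRestore
metadata:
  name: restore-to-friday          # placeholder restore name
spec:
  pxcCluster: my-cluster           # placeholder cluster name
  backupName: backup-friday        # full backup to restore first
  pitr:
    type: date                     # then roll forward to a point in time
    date: "2021-02-20 11:00:00"
```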

point in time recovery

Conclusion

MySQL is more than 25 years old and has a great tooling ecosystem established around it, but as we saw in this blog post, not all of these tools are cloud-native ready. Percona engineering teams are committed to providing users with the same features across environments, whether it is a bare-metal installation in a data center or cutting-edge Kubernetes in the cloud.

Feb
12
2021
--

Tame Kubernetes Costs with Percona Monitoring and Management and Prometheus Operator

Kubernetes Costs Percona Monitoring and Management

More and more companies are adopting Kubernetes, but after some time they see unexpected growth in their cloud costs. Engineering teams did their part by setting up auto-scalers, but the cloud bill keeps growing. Today we are going to see how Percona Monitoring and Management (PMM) can help with monitoring Kubernetes and reducing infrastructure costs.

Get the Metrics

Overview

Prometheus Operator is a great tool to monitor Kubernetes, as it deploys a full monitoring stack (Prometheus, Grafana, Alertmanager, node exporters) and works out of the box. But if you have multiple k8s clusters, it is handy to have a single pane of glass from which to monitor them all.

To get there, I will run Prometheus Operator on each cluster and push the metrics to my PMM server. The metrics will be stored in the VictoriaMetrics time-series DB, which PMM uses by default since the December 2020 release of version 2.12.

Prometheus Operator

PMM-server

I followed this manual to the letter to install my PMM server with Docker. Don't forget to open the HTTPS port on your firewall so that you can reach the UI from your browser and the k8s clusters can push their metrics to VictoriaMetrics through NGINX.

Prometheus Operator

On each Kubernetes cluster, I will now install Prometheus Operator to scrape the metrics and send them to PMM. Bear in mind that the Helm charts are stored in the prometheus-community repo.

Add helm repository

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

Prepare the configuration before installing the operator

$ cat values.yaml
serviceAccounts:
  alertmanager:
    create: false

alertmanager:
  enabled: false

configmapReload:
  alertmanager:
    enabled: false

extraScrapeConfigs: |
  remote_write:
    - url: https://{PMM_USER}:{PMM_PASS}@{YOUR_PMM_HOST}/victoriametrics/api/v1/write

server:
  global:
    external_labels:
      kubernetes_cluster_name: {UNIQUE_K8S_LABEL}

  • Disable alertmanager, as I will rely on PMM
  • Add a remote_write section to write metrics to PMM's VictoriaMetrics storage
    • Use your PMM user and password to authenticate. The default username and password are admin/admin; it is highly recommended to change the defaults, see how here.
    • The /victoriametrics endpoint is exposed through NGINX on the PMM server
    • If you use https and a self-signed certificate, you may need to disable TLS verification:
 tls_config:
   insecure_skip_verify: true
  • The external_labels section is important – it labels all the metrics sent from Prometheus. Each cluster must have a unique kubernetes_cluster_name label to distinguish metrics once they are merged in VictoriaMetrics.

Create namespace and deploy

kubectl create namespace prometheus
helm install prometheus prometheus-community/prometheus -f values.yaml  --namespace=prometheus

 


Check

  • PMM Server is up – check
  • Prometheus Operators run on Kubernetes Clusters – check

Now let’s check if metrics are getting to PMM:

  • Go to PMM Server UI
  • On the left pick Explore

PMM Server UI

  • Run the query
    kube_node_info{kubernetes_cluster_name="UNIQUE_K8S_LABEL"}

It should return the information about the Nodes running on the cluster with UNIQUE_K8S_LABEL. If it does – all good, metrics are there.

Monitor the Costs

The main drivers of cloud bill growth are compute and storage. Kubernetes can scale up by adding more and more nodes, skyrocketing compute costs.

We are going to add two dashboards to the PMM Server that give us a detailed understanding of how resources are used and what should be tuned to reduce the number of nodes in the cluster or change instance types accordingly:

  1. Cluster overview dashboard
  2. Namespace and Pods dashboard

Import these dashboards in PMM:

dashboards in PMM

Dashboard #1 – Cluster Overview

The goal of this dashboard is to provide a quick overview of the cluster and its workloads.

Cluster Overview

The cluster in the screenshot has some room for improvement in utilization: it has a capacity of 1.6 thousand CPU cores but utilizes only 146 (~9%). Memory utilization is better, at ~62%, but can be improved as well.

Quick take:

  • It is possible to reduce the number of nodes and get utilization to at least 80%
  • The workloads in this cluster appear to be mostly memory-bound, so it would be wiser to run nodes with more memory and fewer CPUs

The graphs in the CPU/Mem Request/Limit/Capacity section give a detailed view of resource usage over time:

CPU/Mem Request/Limit/Capacity section

Another two interesting graphs show the top 20 namespaces that waste resources. Waste is calculated as the difference between requests and real utilization for CPU and memory. The values on this graph can be negative if requests are not set for the containers.

This dashboard also has a graph showing persistent volume claims and their states, which can help reduce the number of volumes spun up in the cloud.
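The "wasted resources" idea can be expressed as a PromQL query along these lines (a sketch; the dashboard's actual expressions may differ, and kube-state-metrics metric names depend on its version):

```
# CPU cores requested but not actually used, per namespace
sum by (namespace) (kube_pod_container_resource_requests_cpu_cores)
  - sum by (namespace) (rate(container_cpu_usage_seconds_total[5m]))
```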

Dashboard #2 – Namespace and Pod

Now that we have an overview, it is time to dive deeper into the details. At the top, this dashboard allows the user to choose the Cluster, the Namespace, and the Pod.

At first, the user sees Namespace details: Quotas (might be empty if Resource Quotas are not set for the namespace), requests, limits, and real usage for CPU, Memory, Pods, and Persistent Volume Claims.

Namespace and Pod

The namespace in the screenshot utilizes almost zero CPU cores but requests 20+. If requests are tuned properly, the capacity required to run the workloads drops, and the number of nodes can be reduced.

The next valuable insight that the user can pick from this dashboard is real Pod utilization – CPU, Memory, Network, and disks (only local storage).

Pod CPU Usage

In the case above you can see CPU and Memory container-level utilization for Prometheus Pod, which is shipping the metrics on one of my Kubernetes clusters.

Summary

This blog post equips you with a design for collecting metrics from multiple Kubernetes clusters in a single time-series database and exposing them in the Percona Monitoring and Management UI through dashboards, so you can analyze them and gain insights. These insights help you drive your infrastructure costs down and highlight issues on the clusters.

Also, take a look at PMM on Kubernetes for monitoring your databases – see our demo here and contact Percona if you are interested in learning more about becoming a Percona customer; we are here to help!


The call for papers for Percona Live is open. We’d love to receive submissions on topics related to open-source databases such as MySQL, MongoDB, MariaDB, and PostgreSQL. To find out more visit percona.com/live.

Feb
08
2021
--

DBaaS on Kubernetes: Under the Hood

DBaaS on Kubernetes

Running Database-as-a-Service (DBaaS) in the cloud is the norm for users today. It provides a ready-to-use database instance in a few seconds that can be easily scaled and usually comes with a pay-as-you-go model.

We at Percona see that more and more cloud vendors and enterprises either want to run, or are already running, their DBaaS workloads on Kubernetes (k8s). Today we are going to shed some light on how Kubernetes and Operators can be good tools for running DBaaS, and on the most common patterns.

DBaaS on Kubernetes Offerings

Users usually do not care much how it works internally, but common expectations are:

  • The Database is provisioned and the user gets the endpoint information and a password to connect to it.
  • The Database is highly available and meets certain SLA. Usually, it starts with 99.9% uptime.
  • The Database can be monitored and backed up easily (remember, it is a managed service)

Topologies

The first step for cloud providers, or for big enterprises leaning towards DBaaS on Kubernetes, is to choose a topology. There are various ways it can be structured.

Kubernetes Cluster per DB

Kubernetes Cluster per DB

The advantage of this solution is that it is simple and provides good isolation of workloads by design.

The biggest disadvantage is that each Kubernetes cluster comes with its own masters, which run etcd and control-plane software. On a big scale, this adds a huge overhead.

Shared Kubernetes with Node Isolation

Shared Kubernetes with Node Isolation

Pros: Good isolation, no overhead

Cons: It might be hard to maintain a good utilization of all the nodes in the cluster, and, as a result, lots of computational power will be wasted.

Fully Shared Kubernetes

Fully Shared Kubernetes

We are talking about namespace isolation. Either each customer or each database gets its own namespace.

Pros: Good use of computational resources, no waste

Cons: Resources are isolated via cgroups and it might be tricky to deal with security and “noisy tenants”.

Shared Kubernetes with VM-Like Isolation

Shared Kubernetes with VM-Like Isolation

VM-like isolation is usually delivered by a special container runtime that hardens the boundary between the application and the host kernel.

It can be gVisor, Kata Containers, or even Firecracker. It slightly increases the resource overhead but greatly improves security.

High Availability

As you can see in the diagrams above, database nodes run on different Kubernetes nodes. Such a design guarantees zero downtime in case of a single VM or server failure. Affinity or Pod Topology Spread Constraints (promoted to stable in k8s 1.19) are used in Kubernetes to enforce Pod scheduling on separate nodes. In Percona Kubernetes Operators, we have anti-affinity enabled by default to ensure a smooth experience for the end user.
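For illustration, a podAntiAffinity rule that keeps database Pods on separate nodes looks roughly like this (the label is a placeholder; Percona Operators generate an equivalent rule by default):

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/instance: my-db-cluster  # placeholder label
        topologyKey: kubernetes.io/hostname            # one DB Pod per node
```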

Storage

The database cannot go without data and storage. And as always, there are various ways how Operators can store data in Kubernetes.

StatefulSets

When somebody talks about storage in Kubernetes, it usually starts with StatefulSets, the primitive which “manages the deployment and scaling of a set of Pods and provides guarantees about the ordering and uniqueness of these Pods”. In other words, this object ensures that Pods with data have unique names and start in a predictable order, which is sometimes crucial for databases.

Persistent Volume Claims

Persistent Volume Claims

PVCs over network storage are the obvious choice in the Kubernetes world, but there is no magic: the network storage must be in place. Most cloud providers already have it, e.g. AWS EBS, Google Persistent Disk, etc. There are solutions for private clouds as well – Ceph, Gluster, Portworx, etc. Some of them are even cloud-native, like Rook.io, which runs Ceph on k8s.

The performance of PVCs heavily depends on network performance and the underlying disks used for the storage.

Local Storage

Kubernetes provides the HostPath and EmptyDir primitives to use the storage of the node where the Pod runs. Local storage might be a good choice to boost performance, or in cases where network storage is not available or introduces complexity (e.g. in private clouds).

Local Storage

EmptyDir

It is ephemeral storage at its best: a Pod or node restart wipes the data from the node completely. With good anti-affinity, it is possible to use EmptyDir for production workloads, but it is not recommended. It is worth mentioning that EmptyDir can be used to store data in memory for extra performance:

- name: file-storage 
  emptyDir: 
    medium: Memory

HostPath

The Pod stores data on a locally mounted disk of the node. In this case, a Pod or node restart does not cause data loss on the DB node. It is much easier to manage HostPath through PVCs with the OpenEBS Container Storage Interface. Read this great blog post by Percona's CTO Vadim Tkachenko on how Percona Operators can be deployed with OpenEBS.

Pitfall #1 – Node Failure Causes Data Loss

Using local storage is tricky because, in case of a full node crash, the data for one Pod is lost completely, and once the Pod starts on a new node, it must somehow be recovered or synced from the other nodes. How depends on the technology; for example, Percona XtraDB Cluster will do a Galera State Snapshot Transfer to sync the data to the empty node.

Node Failure Causes Data Loss

Pitfall #2 – Data Limitation is Hard

DBaaS should provide users with database instances with limited resources, including limits on storage – for example, 100 GB per instance. With local storage, there is no built-in way to limit the growth of the database. There are a couple of ways to address this:

  1. Run a sidecar container that constantly monitors data consumption (e.g., with du)
  2. Pick a topology where the database node runs on a dedicated Kubernetes node (see the topologies above). In that case, you can attach only the amount of storage required by the tier.
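The first approach boils down to a loop around du. A toy sketch of a single check, with a temp directory standing in for the data directory and an invented 100 GB limit (a real sidecar would alert or act instead of printing):

```shell
#!/bin/sh
# Toy version of a storage-watchdog check: compare the data directory's
# size against a quota. Paths and the limit are illustrative only.
DATA_DIR=$(mktemp -d)             # stands in for e.g. /var/lib/mysql
LIMIT_KB=$((100 * 1024 * 1024))   # 100 GB expressed in KB

# Simulate some data in the directory.
dd if=/dev/zero of="$DATA_DIR/ibdata1" bs=1024 count=64 2>/dev/null

used_kb=$(du -sk "$DATA_DIR" | cut -f1)
if [ "$used_kb" -gt "$LIMIT_KB" ]; then
    echo "over-quota"
else
    echo "within-quota"
fi
```

Run in a loop inside the sidecar, the same check could page an operator or mark the instance read-only when the quota is exceeded.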

Network

Now DBaaS needs to give the user an endpoint to connect to the database. Remember that if we want the database service to be highly available, we run multiple DB nodes.

In the Percona Operator for Percona XtraDB Cluster, users can choose HAProxy or ProxySQL for balancing traffic across the nodes. The proxy is deployed along with the DB nodes to balance the traffic between them and monitor the DB nodes' health. Without these proxies, the application would either need to figure out where to route the traffic itself or use some other external proxy.

The next step is to expose the proxy outside of Kubernetes using the regular primitives – LoadBalancer, Ingress with TCP, or NodePort – and return the endpoint information to the user.
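As an illustration, exposing an HAProxy deployment via a LoadBalancer Service could look like this sketch (names and labels are placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-cluster-haproxy        # placeholder name
spec:
  type: LoadBalancer              # cloud provider allocates an external IP
  selector:
    app.kubernetes.io/instance: my-cluster-haproxy
  ports:
    - name: mysql
      port: 3306
      targetPort: 3306
```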

DBaas Network

Day 2 Operations

Once the platform design is ready and users are happy with Day 1 operations – getting the database up and running – comes Day 2, where DBaaS automates the day-to-day routines.

Scaling

Kubernetes by design provides scaling capabilities. You can read more about it in my previous blog post Kubernetes Scaling Capabilities with Percona XtraDB Cluster.

DBaaS scaling boils down to increasing the resources of the Kubernetes cluster and then the Pods. The steps may differ depending on the chosen topology, but general capacity rules still apply and Pods will not be scheduled if there is not enough capacity.

Upgrading

Preferably DBaaS should provide automated database upgrades at least for minor versions, otherwise, every security vulnerability or bug in the DB engine would force users to perform manual upgrades.

Percona Operators have a Smart Update feature that asks check.percona.com for the latest version of the component, fetches the latest docker image, and safely rolls it out relying on Kubernetes automation.

Backups

Every DBA has dropped a production database at least once in their life. This is where backups come in handy. DBaaS must also automate backup management and lifecycle.

Usually, DB engines already come with the tools to take and restore backups (mysqldump, mongodump, XtraBackup, etc.). The goal here is to embed these tools into Kubernetes world and give users a clear interface to interact with backups.

Percona Operators provide Custom Resources for backups and rely on a CronJob object to execute a sequence of commands on a schedule. DBaaS can easily integrate with this and trigger backups and restores by simply pushing manifests through the k8s control plane.
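For example, a backup schedule inside the cluster CR and an on-demand backup object look roughly like the following sketch (cluster, storage, and schedule names are placeholders; check the Operator documentation for the exact schema):

```yaml
# Scheduled backups, part of the cluster CR:
backup:
  schedule:
    - name: daily-backup
      schedule: "0 0 * * *"       # standard cron syntax
      keep: 5                     # retain the last five backups
      storageName: s3-us-west
---
# On-demand backup, pushed as a separate object:
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterBackup
metadata:
  name: backup-now
spec:
  pxcCluster: my-cluster
  storageName: s3-us-west
```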

Point-in-time recovery

Recovering from a backup that is a few hours old might not be an acceptable option for critical data. Rolling back a transaction or recovering the data to the latest possible state are the most common use cases for point-in-time recovery (PITR). Most databases have the capability to perform PITR, but they were not designed for the cloud-native world. The recently released version 1.7.0 of Percona XtraDB Cluster Operator provides PITR support out of the box. In future blog posts, we will tell you more about the challenges and technical design of the solution.
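Enabling PITR is a small addition to the backup section of the Custom Resource. This is a sketch based on the 1.7.0+ CR shape; the storage name is hypothetical and must reference one of the configured backup storages:

```yaml
spec:
  backup:
    pitr:
      enabled: true
      storageName: s3-backups   # where binlogs are uploaded
      timeBetweenUploads: 60    # seconds between binlog uploads
```

The Operator continuously ships binlogs to the object storage, so a restore can target a specific timestamp or transaction rather than only the last full backup.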

Monitoring

Kubernetes vigorously embraces the ephemeral nature of containers and takes it to the next level by making nodes ephemeral as well. This complicates the regular monitoring approach, where the IP addresses and ports of the services are well known and probed by the monitoring server. Luckily, the cloud-native landscape is full of monitoring tools that can solve this problem. Take a look at Prometheus Operator, for example.

At Percona, we have our own dashing Percona Monitoring and Management (PMM) tool, which integrates with our Operators automatically. To solve the challenge of ephemeral IP addresses and ports, we switched our time-series database from Prometheus to VictoriaMetrics, which allows metrics to be pushed instead of pulled.
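Monitoring is switched on per cluster through the pmm section of the Custom Resource. A sketch, assuming a PMM Server is already reachable inside the cluster (the server host and image tag are illustrative):

```yaml
spec:
  pmm:
    enabled: true
    serverHost: monitoring-service    # address of the PMM Server
    image: percona/pmm-client:2.30.0  # client image tag is illustrative
```

The Operator then injects a pmm-client sidecar into each database Pod, and the clients push metrics to the server, sidestepping the ephemeral-endpoint problem entirely.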

Conclusion

It is definitely fun to build a DBaaS on Kubernetes, and there are multiple ways to achieve this goal. The easiest is to leverage Operators, as they automate away most of the manual burden. Delivering a DBaaS requires rigorous planning and a deep understanding of the user requirements.

In January, we kicked off the Preview of Database as a Service in Percona Monitoring and Management. It is fully open source and relies on our Operators in the background to provision the databases. We are still looking for users to test it during this year-long program, and we would be grateful for your participation and feedback!


Learn more about Percona Kubernetes Operators for Percona XtraDB Cluster or Percona Server for MongoDB
