Mar 15, 2023

Automating Physical Backups of MongoDB on Kubernetes


We at Percona talk a lot about how Kubernetes Operators automate the deployment and management of databases. Operators seamlessly handle lots of Kubernetes primitives and database configuration bits and pieces, all to remove toil from operations teams and provide a self-service experience for developers.

Today we want to take you backstage and show what is really happening under the hood. We will review technical decisions and obstacles we faced when implementing physical backup in Percona Operator for MongoDB version 1.14.0. The feature is now in technical preview.

The why

Percona Server for MongoDB can handle petabytes of data. The users of our Operators want to host huge datasets in Kubernetes too. However, restoring such datasets from logical backups takes a lot of time and can lead to SLA breaches.

Our Operator uses Percona Backup for MongoDB (PBM) to back up and restore databases. We have been leveraging logical backups for quite some time. Physical backups were introduced in PBM a few months ago, and with them we saw a significant improvement in recovery time (read more in this blog post, Physical Backup Support in Percona Backup for MongoDB):


So if we want to reduce the Recovery Time Objective (RTO) for big data sets, we must provide physical backups and restores in the Operator.

The how

Basics

When you enable backups in your cluster, the Operator adds a sidecar container to each replset pod (including the Config Server pods if sharding is enabled) to run pbm-agent. These agents listen for PBM commands and perform backups and restores on the pods. The Operator opens connections to the database to send PBM commands to its collections or to track the status of PBM operations.
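To make this concrete, here is a minimal sketch of the Pod layout the Operator creates for each replica set member: mongod plus a pbm-agent sidecar, both mounting the same data volume. Container names, images, and the connection Secret below are illustrative, not the exact objects the Operator generates.

apiVersion: v1
kind: Pod
metadata:
  name: my-cluster-rs0-0-sketch
spec:
  containers:
    - name: mongod
      image: percona/percona-server-mongodb:6.0
      volumeMounts:
        - name: mongod-data
          mountPath: /data/db
    - name: backup-agent                 # PBM sidecar injected by the Operator
      image: percona/percona-backup-mongodb:2.0.4
      command: ["pbm-agent"]
      env:
        - name: PBM_MONGODB_URI          # pbm-agent listens for PBM commands via this connection
          valueFrom:
            secretKeyRef:
              name: my-cluster-pbm-uri   # hypothetical Secret holding the backup user URI
              key: uri
      volumeMounts:
        - name: mongod-data              # physical backups need the raw data files (see below)
          mountPath: /data/db
  volumes:
    - name: mongod-data
      persistentVolumeClaim:
        claimName: mongod-data-my-cluster-rs0-0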


Backups

Enabling physical backups was fairly straightforward. The Operator already knew how to start a logical backup, so we just passed a single field to the PBM command if the requested backup was physical.
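For illustration, requesting an on-demand physical backup boils down to a manifest roughly like the one below; field names follow the 1.14.0 examples as far as I recall them, so check the Operator documentation for the exact schema.

apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBBackup
metadata:
  name: backup1
spec:
  clusterName: my-cluster-name   # the PerconaServerMongoDB cluster to back up
  storageName: s3-us-west        # a backup storage defined in the cluster Custom Resource
  type: physical                 # the single field that switches from a logical to a physical backup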

Unlike logical backups, PBM needs direct access to mongod’s data directory, so we mounted the persistent volume claim (PVC) into the PBM sidecar containers. With these two simple changes, the Operator could perform physical backups. But whether you know it from theory or from experience (ouch!), a backup is useless if you cannot restore it.

Restores

Enabling physical restores wasn’t easy. PBM had two major limitations when it came to physical restores:

1. PBM stops the mongod process during the restore.

2. It runs a temporary mongod process after the backup files are copied to the data directory.

What makes physical restores a complex feature is dealing with the implications of these two constraints.

PBM kills the mongod process

PBM kills the mongod process in the container, which means the Operator can’t query PBM collections in the database during the restore to track operation status. This forced us to use the PBM CLI tool to control PBM operations during the physical restore instead of opening a connection to the database.

Note: We are now considering using the same approach for every PBM operation in the Operator. Let us know what you think.

Another side effect of PBM killing mongod is that the mongod container restarts. The mongod process runs as PID 1 in the container, and when you kill it, the container restarts. But the PBM agent needs to keep working after mongod is killed to complete the physical restore. For this, we created a special entrypoint to be used during the restore. This special entrypoint starts mongod and just sleeps after mongod has finished.

Temporary mongod started by PBM

When you start a physical restore, PBM does approximately the following:

1. Kills the mongod process and wipes the data directory.

2. Copies the backup files to the data directory.

3. Starts a temporary mongod process listening on a random port to perform operations on oplog.

4. Kills the mongod and PBM agent processes.

For a complete walkthrough, see Physical Backup Support in Percona Backup for MongoDB.

From the Operator’s point of view, the problem with this temporary mongod process is its version. We support MongoDB 4.4, MongoDB 5.0, and now MongoDB 6.0 in the Operator, and the temporary mongod needs to be the same version as the MongoDB running in the cluster. This means we either include the mongod binary in the PBM Docker images and create a separate image for each version combination, or we find another solution.

Well… we found it: we copy the PBM binaries into the mongod container before the restore. This way, the PBM agent can kill and start the mongod process as often as it wants, and it’s guaranteed to be the same version as the cluster.

Tying it all together

By now, it may seem like I have thrown random solutions at each problem. PBM CLI, special entrypoint, copying PBM binaries… How does it all work?

When you start a physical restore, the Operator patches the StatefulSet of each replset to:

1. Switch the mongod container entrypoint to our new special entrypoint: It’ll start pbm-agent in the background and then start the mongod process. The script itself will run as PID 1, and after mongod is killed, it’ll sleep forever.

2. Inject a new init container: This init container will copy the PBM and pbm-agent binaries into the mongod container.

3. Remove the PBM sidecar: We must ensure that only one pbm-agent is running in each pod.

Until all StatefulSets (except mongos) are updated, the PerconaServerMongoDBRestore object is in a “waiting” state. Once they’re all updated, the operator starts the restore, and the restore object goes into the requested state, then the running state, and finally, the ready or error state.
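To make it more tangible, below is a hedged sketch of the relevant part of the patched Pod template. The script path, image tags, and volume names are assumptions for illustration, not the Operator’s actual values.

spec:
  template:
    spec:
      initContainers:
        - name: pbm-init                          # injected init container: copies pbm and pbm-agent binaries
          image: percona/percona-backup-mongodb:2.0.4
          command: ["sh", "-c", "cp /usr/bin/pbm /usr/bin/pbm-agent /opt/pbm-bin/"]
          volumeMounts:
            - name: pbm-bin
              mountPath: /opt/pbm-bin
      containers:
        - name: mongod
          image: percona/percona-server-mongodb:6.0
          command: ["/opt/physical-restore-entrypoint.sh"]   # hypothetical special entrypoint: starts pbm-agent
                                                             # in the background, starts mongod, then sleeps
                                                             # forever after mongod exits
          volumeMounts:
            - name: pbm-bin
              mountPath: /opt/pbm-bin
            - name: mongod-data                   # data PVC from the existing volumeClaimTemplate
              mountPath: /data/db
      volumes:
        - name: pbm-bin
          emptyDir: {}

The PBM sidecar is absent from this sketch on purpose: during the restore, pbm-agent runs inside the mongod container.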

Conclusion

Operators simplify the deployment and management of applications on Kubernetes. Physical backups and restores are quite a popular feature for MongoDB clusters. Restoring from a physical backup with Percona Backup for MongoDB involves multiple manual steps; this article shows the complexity hidden behind this seemingly simple feature.

 

Try Percona Operator for MongoDB

Mar 01, 2023

Reduce Your Cloud Costs With Percona Kubernetes Operators


Public cloud spending is slowing down. Quarter-over-quarter growth is no longer hitting 30% gains for AWS, Google, and Microsoft. This is businesses’ response to tough and uncertain macroeconomic conditions, where organizations scrutinize their public cloud spending to optimize and adjust.

In this blog post, we will see how running databases on Kubernetes with Percona Operators can reduce your cloud bill when compared to using AWS RDS.

Inputs

These are the inputs that we will start with:

  • AWS
  • RDS for MySQL in us-east-1
  • 10 x db.r5.4xlarge
  • 200 GB storage each

The cost of RDS consists mostly of two things – compute and storage. We will not consider data transfer or backup costs in this article.

  • db.r5.4xlarge – $1.92/hour or $46.08/day or $16,819.20/year
  • For ten instances, it will be $168,192 per year
  • Default gp2 storage is $0.115 per GB-month; for 200 GB, it is $23/month or $276 per year
  • For ten instances, it will be $2,760 per year

Our total estimated annual cost is $168,192 + $2,760 = $170,952.

Let’s see how much we can save.

Cutting costs

Move to Kubernetes

If we take AWS EKS, the following components are included in the cost:

  1. Control plane
  2. Compute – EC2, where we will have a dedicated Kubernetes node per database node
  3. Storage – EBS

The control plane costs $0.10/hour per cluster, or $876 per year.

For EC2, we will use the same r5.4xlarge instance, which is $1.008/hour or $8,830.08 per year.

EBS gp2 storage is $0.10 per GB-month, for 200GB, it is $240 per year.

For ten instances, our annual cost is:

  • Compute – $88,300.80
  • Storage – $2,400
  • EKS control plane – $876

Total: $91,576.80

By just moving from RDS to Kubernetes and Operators, you will save $79,375.20 per year.

Why Operators?

It may seem that this and the following techniques can easily be executed without Kubernetes and Operators by just leveraging EC2 instances. "Why do I need Operators?" is the right question to ask.

The usual blocker on the way to cost savings is management complexity. Managed database services like RDS remove a lot of toil from operations teams and provide self-service for developers. Percona Operators do exactly the same thing on Kubernetes, enabling your teams to innovate faster but without an extremely high price tag.

Kubernetes itself brings new challenges and complexity. Managed Kubernetes services like EKS or GKE do a good job of removing part of it, but not all. Keep this in mind when you decide to cut costs with Operators.

Spot instances

Spot instances (or Preemptible VMs at GCP) use the spare compute capacity of the cloud. They come with no availability guarantee but are available at a much lower price – up to an 80% discount.

Discount is great, but what about availability? You can successfully leverage spot instances with Kubernetes and databases, but you must have a good strategy behind it. For example:

  1. Your production database should be running in a clustered mode. For the Percona XtraDB Cluster, we recommend having at least three nodes.
  2. Run in multiple Availability Zones to minimize the risk of massive spot instance interruptions.
  3. In AWS, you can keep some percentage of your Kubernetes nodes on-demand to guarantee the availability of your database clusters and applications.

You can move the levers between your availability requirements and cost savings. Services like spot.io and tools like aws-node-termination-handler can enhance your availability greatly with spot instances.
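As one possible illustration of the third point, assuming EKS managed node groups (which label nodes with eks.amazonaws.com/capacityType), you can make database Pods prefer spot capacity while still allowing them to fall back to on-demand nodes. This snippet goes into a Pod template or into the affinity settings your Operator Custom Resource exposes:

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80                                 # prefer spot capacity...
        preference:
          matchExpressions:
            - key: eks.amazonaws.com/capacityType
              operator: In
              values:
                - SPOT
      # ...but no required rule, so Pods can still land on ON_DEMAND nodes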

Leverage container density

Some of your databases are always utilized, and some are not. With Kubernetes, you can leverage various autoscaling techniques to pack containers densely and reduce the number of instances.

  1. Vertical Pod Autoscaler (VPA) or Kubernetes Event-driven Autoscaling (KEDA) can help to reduce the number of requested resources for your containers based on real utilization.
  2. Cluster Autoscaler will remove the nodes that are not utilized. 

Figure 1 shows an example of this tandem, where Cluster Autoscaler terminates one EC2 instance after rightsizing the container resources.

Figure 1: Animated example showing the Kubernetes autoscaler terminating an unused EC2 instance
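As a rough illustration of the VPA piece, a VerticalPodAutoscaler object (assuming the VPA components are installed in the cluster) can be pointed at a database StatefulSet in recommendation-only mode; the StatefulSet name below is hypothetical:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: db-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: my-db-pxc          # hypothetical StatefulSet created by the Operator
  updatePolicy:
    updateMode: "Off"        # only emit recommendations; apply resource changes through the Custom Resource

Keeping updateMode at "Off" avoids the VPA restarting database Pods on its own; you then feed the recommendations back into the resources section of the Custom Resource.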

Local storage

Databases tend to consume storage, and usually, it grows over time. It gets even trickier when your performance requirements demand expensive storage tiers like io2, where significant spending lands on provisioned IOPs. Some instances at cloud providers come with local NVMe SSDs. It is possible to store data on these local disks instead of EBS volumes, thus reducing your storage costs.

For example, r5d.4xlarge has 2 x 300 GB of built-in NVMe SSDs, which can be enough to host a database.
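To actually consume those disks, the bare-bones built-in approach is a StorageClass for statically provisioned local volumes; the solutions mentioned below, such as OpenEBS, provide dynamic provisioners on top of the same idea:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-nvme
provisioner: kubernetes.io/no-provisioner   # local PersistentVolumes are created statically, not dynamically
volumeBindingMode: WaitForFirstConsumer     # bind a volume only when a Pod is scheduled to that node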

The use of local storage comes with caveats:

  1. Instances with local SSDs are approximately 10% more expensive
  2. Like spot nodes, you have two levers – cost and data safety. Clustering and proper backups are essential.

Solutions like OpenEBS or Ondat leverage local storage and, at the same time, simplify operations. We have written many posts about Percona Operators running with OpenEBS:

Conclusion

Cloud spending grows with your business. Simple actions like rightsizing or moving to new instance generations bring short-lived savings. The strategies described in this article can significantly cut your cloud bill, while Operators and Kubernetes keep the complexity at bay.

Try out Percona Operators today!

Feb 10, 2023

Storage Autoscaling With Percona Operator for MongoDB

Previously, deploying and maintaining a database usually meant many burdensome chores and repetitive tasks to ensure proper functioning. In the cloud era, however, developers and operation engineers started fully embracing automation tools making their job significantly easier. Of those tools, Kubernetes operators are certainly one of the most prominent, as they are oftentimes used as a building block for DBaaS.

ViewBlock, a blockchain explorer, uses the Percona Operator for MongoDB to store critical data. Today along with their team, we will see how pvc-autoresizer can automate storage scaling for MongoDB clusters on Kubernetes.

Reasoning and goal

Nobody enjoys waking up in the middle of the night from disk usage alerts or, worse, a downed cluster due to a lack of free space on a volume that wasn’t properly set up to send proper warnings to its stakeholders.

Our goal is to automate storage scaling when our disk reaches a certain threshold of use and simultaneously reduce the amount of alert noise related to that. Specifically for our Operator, the volumes used by replicaset nodes are defined by their Persistent Volume Claims (PVCs), which are our targets to enable autoscaling.


Let’s go

Prerequisites

The currently supported Kubernetes versions of pvc-autoresizer are 1.23 – 1.25. In addition to pvc-autoresizer that you can install through their Helm chart, Prometheus also needs to be running in your cluster to provide the resizer with the metrics it needs to determine if a PVC needs to be scaled.
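A minimal installation could look like the commands below; the chart repository URL is an assumption based on the project’s layout, and the chart also needs to be pointed at your Prometheus endpoint, so double-check both against the pvc-autoresizer documentation.

# Assumed chart location; verify against the pvc-autoresizer docs before running.
helm repo add pvc-autoresizer https://topolvm.github.io/pvc-autoresizer
helm repo update
helm install pvc-autoresizer pvc-autoresizer/pvc-autoresizer \
  --namespace pvc-autoresizer --create-namespace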

Your CSI driver needs to support Volume Expansion and NodeGetVolumeStats, and the Storage Class used by your PVCs also requires the Volume Expansion feature to be active.

In our lab we will use AWS EKS with a standard storage class.

In action

You first need to add the following annotation to your Storage Class:

resize.topolvm.io/enabled=true
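For example, assuming your Storage Class is named standard:

kubectl annotate storageclass standard resize.topolvm.io/enabled=true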

From that point on, your only requirement is to properly annotate the PVCs created by the Operator.

Assuming a simple 3-node Percona Server for MongoDB cluster without sharding and nothing else in your current namespace, you should be able to run the following commands to allow for automatic storage resizing of all the PVCs.

kubectl annotate pvc --all resize.topolvm.io/storage_limit="100Gi"
kubectl annotate pvc --all resize.topolvm.io/increase="20Gi"
kubectl annotate pvc --all resize.topolvm.io/threshold="20%"
kubectl annotate pvc --all resize.topolvm.io/enabled="true"

In plain terms, this configuration tells pvc-autoresizer that your PVCs should increase by 20Gi when only 20% of free space is left on them, up to a maximum size of 100Gi, beyond which they will not increase automatically.

Note: If you use version 1.14.0 of the Operator that’s scheduled for release this month, it will be possible to annotate the PVCs directly from your CR configuration file, for example:

spec:
  replsets:
  - name: rs0
    volumeSpec:
      persistentVolumeClaim:
        annotations:
          resize.topolvm.io/storage_limit: 100Gi
          resize.topolvm.io/increase: 20Gi
          resize.topolvm.io/threshold: 20%
          resize.topolvm.io/enabled: "true"

Limitations

Manual downscaling

It is possible to increase the storage automatically, but decreasing it is a manual process requiring replacing the volumes entirely. For example, AWS EBS volumes cannot be downsized and you’d need to delete the volumes before creating new ones with a smaller size. In Change Storage Class on Kubernetes on the Fly, we described how to change the storage class on the fly — a similar process will apply to decrease the size of the volumes.

Percentage threshold

As our increase thresholds are specified in percentages, the space available upon resize will inherently grow along with the size of the volume. When dealing with “big” disks, say 1TB, it is advised to set a low threshold to avoid triggering a rescale when a lot of space is still available.

Scaling quotas

Cloud providers, for example, AWS, have scaling quotas for the volumes. For EBS, you can resize a volume once in six hours. If your data ingestion is bigger than the increase amount you set and happens in less than that time, the resizing will fail. It is essential to consider your ingest rate and disk growth so that you can set the appropriate autoresizer configurations that suit your needs. Failure to do so might result in unwanted alerts and the need to transfer data to new PVCs, which you can learn about in Percona Operator Volume Expansion Without Downtime.

Statefulset and custom resource synchronization

When you provision new nodes or recreate one, their PVC will get bootstrapped with the volume request it gets from the immutable StatefulSet created at the initialization of your cluster, not with its latest size. You will need to set the appropriate annotations for those new PVCs to ensure that pvc-autoresizer scales them enough to allow for the replication to have enough space to proceed and that it doesn’t need to scale more than what your cloud provider permits. It is generally recommended to make sure the storage specs in StatefulSets and Custom Resources are in sync with real volumes.

Conclusion

And there you have it, a Percona Operator for MongoDB configuration that automatically scales its storage based on your needs!

It is still advised to have at least a minimal alerting setup in case you’re close to hitting the storage limit specified in the annotations, since pvc-autoresizer requires that limit to be set.

ViewBlock is a blockchain-agnostic explorer allowing anyone to inspect blocks, transactions, address history, advanced statistics & much more.

Percona Operator for MongoDB automates deployment and management of replica sets and sharded clusters on Kubernetes.

Jan 24, 2023

Backup Databases on Kubernetes With VolumeSnapshots

Databases on Kubernetes continue their rising trend. We see the growing adoption of our Percona Kubernetes Operators and the demand to migrate workloads to the cloud-native platform. Our Operators provide built-in backup and restore capabilities, but some users are still looking for old-fashioned ways, like storage-level snapshots (i.e., AWS EBS Snapshots).

In this blog post, you will learn:

  1. How to back up and restore from storage snapshots using Percona Operators
  2. What the risks and limitations are of such backups

Overview

Volume Snapshots went GA in Kubernetes 1.20. Both your storage and Container Storage Interface (CSI) driver must support snapshots. All major cloud providers support them but might require some steps to enable them. For example, on GKE, you must create a VolumeSnapshotClass resource first.
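For example, on GKE with the Compute Engine persistent disk CSI driver, a minimal VolumeSnapshotClass (the gke-snapshotclass referenced later in this post) could look like this:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: gke-snapshotclass
driver: pd.csi.storage.gke.io   # GKE persistent disk CSI driver
deletionPolicy: Delete          # delete the cloud snapshot when the VolumeSnapshot object is removed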

At a high level, snapshotting on Kubernetes looks like this: just as a PersistentVolume represents the real storage volume, a VolumeSnapshot is the Kubernetes resource that represents the volume snapshot in the cloud.

Getting ready for backups

First, we need to be sure that VolumeSnapshots are supported. For the major clouds, read the following docs:

Once you have the CSI driver configured and a VolumeSnapshotClass in place, proceed to create a backup.

Take the backup

Identify the PersistentVolumeClaims (PVCs) that you want to snapshot. For example, my MongoDB cluster has six PVCs: three for replica set nodes and three for config server nodes.

$ kubectl get pvc
NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mongod-data-my-cluster-name-cfg-0   Bound    pvc-c9fb5afa-1fc9-41f9-88f3-4ed457f88e58   3Gi        RWO            standard-rwo   78m
mongod-data-my-cluster-name-cfg-1   Bound    pvc-b9253264-f79f-4fd0-8496-1d88105d84e5   3Gi        RWO            standard-rwo   77m
mongod-data-my-cluster-name-cfg-2   Bound    pvc-5d462005-4015-47ad-9269-c205b7a3dfcb   3Gi        RWO            standard-rwo   76m
mongod-data-my-cluster-name-rs0-0   Bound    pvc-410acf85-36ad-4bfc-a838-f311f9dfd40b   3Gi        RWO            standard-rwo   78m
mongod-data-my-cluster-name-rs0-1   Bound    pvc-a621dd8a-a671-4a35-bb3b-3f386550c101   3Gi        RWO            standard-rwo   77m
mongod-data-my-cluster-name-rs0-2   Bound    pvc-484bb835-0e2d-4a40-b5a3-1ba340ec0567   3Gi        RWO            standard-rwo   76m

Each PVC will have its own VolumeSnapshot. Example for mongod-data-my-cluster-name-cfg-0:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mongod-data-my-cluster-name-cfg-0-snap
spec:
  volumeSnapshotClassName: gke-snapshotclass
  source:
    persistentVolumeClaimName: mongod-data-my-cluster-name-cfg-0

I have listed all my VolumeSnapshots objects in one YAML manifest here.

$ kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/volume-snapshots/mongo-volumesnapshots.yaml
volumesnapshot.snapshot.storage.k8s.io/mongod-data-my-cluster-name-cfg-0-snap created
volumesnapshot.snapshot.storage.k8s.io/mongod-data-my-cluster-name-cfg-1-snap created
volumesnapshot.snapshot.storage.k8s.io/mongod-data-my-cluster-name-cfg-2-snap created
volumesnapshot.snapshot.storage.k8s.io/mongod-data-my-cluster-name-rs0-0-snap created
volumesnapshot.snapshot.storage.k8s.io/mongod-data-my-cluster-name-rs0-1-snap created
volumesnapshot.snapshot.storage.k8s.io/mongod-data-my-cluster-name-rs0-2-snap created

VolumeSnapshotContent is created and bound to every VolumeSnapshot resource. Its status can tell you the name of the snapshot in the cloud and whether the snapshot is ready:

$ kubectl get volumesnapshotcontent snapcontent-0e67c3b5-551f-495b-b775-09d026ea3c8f -o yaml
…
status:
  creationTime: 1673260161919000000
  readyToUse: true
  restoreSize: 3221225472
  snapshotHandle: projects/percona-project/global/snapshots/snapshot-0e67c3b5-551f-495b-b775-09d026ea3c8f

  • snapshot-0e67c3b5-551f-495b-b775-09d026ea3c8f is the snapshot I have in GCP for the volume.
  • readyToUse: true – indicates that the snapshot is ready

Restore

The restoration process, in a nutshell, looks as follows:

  1. Create PersistentVolumeClaims from the snapshots. The names of the volumes must match the naming convention that the Operator uses.
  2. Provision the cluster

As with any other restore, the cluster must have its Secrets in place: TLS and users.

You can use this restoration process to clone existing clusters as well, just make sure you change the cluster, PVCs, and Secret names.

Create the PersistentVolumeClaims from the snapshots. It is the same as creating a regular PersistentVolumeClaim, but with a dataSource section that points to the snapshot:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongod-data-my-cluster-name-rs0-0
spec:
  dataSource:
    name: mongod-data-my-cluster-name-rs0-0-snap
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  storageClassName: standard-rwo
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi

$ kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/volume-snapshots/mongo-pvc-restore.yaml
persistentvolumeclaim/mongod-data-my-cluster-name-cfg-0 created
persistentvolumeclaim/mongod-data-my-cluster-name-cfg-1 created
persistentvolumeclaim/mongod-data-my-cluster-name-cfg-2 created
persistentvolumeclaim/mongod-data-my-cluster-name-rs0-0 created
persistentvolumeclaim/mongod-data-my-cluster-name-rs0-1 created
persistentvolumeclaim/mongod-data-my-cluster-name-rs0-2 created

Once done, spin up the cluster as usual. The volumes you created earlier will be used automatically. Restoration is done.

Risks and limitations

Storage support

Both storage and the storage plugin in Kubernetes must support volume snapshots. This limits the choices. Apart from public clouds, there are open source solutions like Ceph (rook.io for k8s) that can provide snapshotting capabilities.

Point-in-time recovery

Point-in-time recovery (PITR) allows you to reduce your Recovery Point Objective (RPO) by restoring or rolling back the database to a specific transaction or time.

Volume snapshots in the clouds store data in increments. The first snapshot holds all the data, and the following ones only store the changes. This significantly reduces your cloud bill. But snapshots cannot provide you with the same RPO as native database mechanisms.

Data consistency and corruption

Snapshots are not data-aware. When a snapshot is taken, numerous transactions and data modifications can happen. For example, heavy write activity and simultaneous compound index creation in MongoDB might lead to snapshot corruption. The biggest problem is that you will learn about data corruption during restoration.

Locking or freezing a filesystem before the snapshot would help to avoid such issues. Solutions like Velero or Veeam make the first steps towards data awareness and can create consistent snapshots by automating file system freezes or stopping replication.

Percona Services teams use various tools to automate the snapshot creation safely. Please contact us here to ensure data safety.

Cost

Public clouds store snapshots on cheap object storage but charge you extra for convenience. For example, AWS EBS snapshots are priced at $0.05 per GB-month, whereas S3 is only $0.023. That is more than a 2x difference, which for giant data sets might significantly increase your bill.

Time to recover

It is not a risk or limitation but a common misconception I often see: recovery from snapshots takes only a few seconds. It does not. When you create an EBS volume from the snapshot, it takes a few seconds. But in reality, the volume you just created does not have any data. You can read more about the internals of EBS snapshots in this nice blog post.

Conclusion

Volume Snapshots on Kubernetes can be used for databases but come with certain limitations and risks. Data safety and consistency are the most important factors when choosing a backup solution. For Percona Operators, we strongly recommend using built-in solutions which guarantee data consistency and minimize your recovery time and point objectives.

Learn More About Percona Kubernetes Operators

Dec 15, 2022

Least Privilege for Kubernetes Resources and Percona Operators

Operators hide the complexity of the application and Kubernetes. Instead of dealing with Pods, StatefulSets, tons of YAML manifests, and various configuration files, the user talks to the Kubernetes API to provision a ready-to-use application. An Operator automatically provisions all the required resources and exposes the application. Still, there is always a risk that a user will make a manual change that negatively affects the application or the Operator logic.

In this blog post, we will explain how to limit access scope for the user to avoid manual changes for database clusters deployed with Percona Operators. To do so, we will rely on Kubernetes Role-based Access Control (RBAC).

The goal

We are going to have two roles: Administrator and Developer. The Administrator will deploy the Operator and create the necessary roles and service accounts. Developers will be able to:

  1. Create, modify, and delete Custom Resources that Percona Operators use
  2. List all the resources – users might want to debug the issues

Developers will not be able to:

  1. Create, modify, or delete any other resource


As a result, the Developer will be able to deploy and manage database clusters through a Custom Resource, but will not be able to modify any operator-controlled resources, like Pods, Services, Persistent Volume Claims, etc.

Action

We will provide an example for Percona Operator for MySQL based on Percona XtraDB Cluster (PXC), which just had version 1.12.0 released.

Administrator

Create a dedicated namespace

We will allow users to manage clusters in a single namespace called prod-dbs:

$ kubectl create namespace prod-dbs

Deploy the operator

Use any of the ways described in our documentation to deploy the Operator into the namespace. My personal favorite is a simple kubectl command:

$ kubectl apply -f https://raw.githubusercontent.com/percona/percona-xtradb-cluster-operator/v1.12.0/deploy/bundle.yaml

Create ClusterRole

The ClusterRole resource defines the permissions the user will have for specific resources in Kubernetes. You can find the full YAML in this GitHub repository.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: percona-pxc-rbac          # the name referenced by the ClusterRoleBinding below
rules:
  - apiGroups: ["pxc.percona.com"]
    resources: ["*"]
    verbs: ["*"]
  - apiGroups: [""]
    resources:
      - pods
      - pods/exec
      - pods/log
      - configmaps
      - services
      - persistentvolumeclaims
      - secrets
    verbs:
      - get
      - list
      - watch

As you can see, we allow any operations on pxc.percona.com resources but restrict others to get, list, and watch.

$ kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/rbac-operators/clusterrole.yaml

Create ServiceAccount

We are going to generate a kubeconfig for this service account. This is what the user is going to use to connect to the Kubernetes API.

apiVersion: v1
kind: ServiceAccount
metadata:
    name: database-manager
    namespace: prod-dbs

$ kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/rbac-operators/serviceaccount.yaml

Create ClusterRoleBinding

We need to assign the ClusterRole to the ServiceAccount. A ClusterRoleBinding acts as the relation between these two.

$ kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/rbac-operators/clusterrolebinding.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
    name: percona-database-manager-bind
    namespace: prod-dbs
roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: percona-pxc-rbac
subjects:
    - kind: ServiceAccount
      name: database-manager
      namespace: prod-dbs

Developer

Verify

The kubectl auth can-i command allows you to verify if the service account can really do what we want it to do. Let’s try:

$ kubectl auth can-i create perconaxtradbclusters.pxc.percona.com --as=system:serviceaccount:prod-dbs:database-manager
yes

$ kubectl auth can-i delete service --as=system:serviceaccount:prod-dbs:database-manager
no

All good. I can create Percona XtraDB Clusters, but I can’t delete Services. Please note that sometimes it might be useful to allow Developers to delete Pods to force cluster recovery. If you feel that it is needed, please modify the ClusterRole.
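For instance, adding a rule like this to the ClusterRole would allow Pod deletion while keeping everything else read-only:

  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["delete"]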

Apply

There are multiple ways to use this service account. You can read more about it in Kubernetes documentation. For a quick demonstration, we are going to generate a kubeconfig that we can share with our user. 

Run this script to generate the config. What it does:

  1. Gets the service account secret resource name
  2. Extracts Certificate Authority (ca.crt) contents from the secret
  3. Extracts the token from the secret
  4. Gets the endpoint of the Kubernetes API
  5. Generates the kubeconfig using all of the above

$ bash generate-kubeconfig.sh > tmp-kube.config
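The script itself is not reproduced in this post. A minimal sketch of the idea, assuming a token-type Secret is associated with the ServiceAccount (on Kubernetes 1.24+ you may need to create such a Secret manually), could look like this:

#!/bin/bash
# Sketch of generate-kubeconfig.sh: build a kubeconfig for the database-manager ServiceAccount.
set -euo pipefail

NAMESPACE=prod-dbs
SA=database-manager

# 1. Get the service account secret resource name (assumes a token Secret is linked to the SA)
SECRET=$(kubectl -n "$NAMESPACE" get serviceaccount "$SA" -o jsonpath='{.secrets[0].name}')

# 2. Extract the Certificate Authority and 3. the token from the secret
CA=$(kubectl -n "$NAMESPACE" get secret "$SECRET" -o jsonpath='{.data.ca\.crt}')
TOKEN=$(kubectl -n "$NAMESPACE" get secret "$SECRET" -o jsonpath='{.data.token}' | base64 -d)

# 4. Get the endpoint of the Kubernetes API from the current kubeconfig
SERVER=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')

# 5. Generate the kubeconfig using all of the above
cat <<EOF
apiVersion: v1
kind: Config
clusters:
- name: cluster
  cluster:
    certificate-authority-data: ${CA}
    server: ${SERVER}
contexts:
- name: ${SA}@cluster
  context:
    cluster: cluster
    namespace: ${NAMESPACE}
    user: ${SA}
current-context: ${SA}@cluster
users:
- name: ${SA}
  user:
    token: ${TOKEN}
EOF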

Now let’s see if it works as expected:

$ KUBECONFIG=tmp-kube.config kubectl -n prod-dbs apply -f https://raw.githubusercontent.com/percona/percona-xtradb-cluster-operator/v1.12.0/deploy/cr.yaml
perconaxtradbcluster.pxc.percona.com/cluster1 created

$ KUBECONFIG=tmp-kube.config kubectl -n prod-dbs get pods
NAME                                               READY   STATUS    RESTARTS   AGE
cluster1-haproxy-0                                 2/2     Running   0          15m
cluster1-haproxy-1                                 2/2     Running   0          14m
cluster1-haproxy-2                                 2/2     Running   0          14m
cluster1-pxc-0                                     3/3     Running   0          15m
cluster1-pxc-1                                     3/3     Running   0          14m
cluster1-pxc-2                                     3/3     Running   0          13m
percona-xtradb-cluster-operator-77bf8b9df5-qglsg   1/1     Running   0          52m

I was able to create the Custom Resource and a cluster. Let’s try to delete the Pod:

$ KUBECONFIG=tmp-kube.config kubectl -n prod-dbs delete pods cluster1-haproxy-0
Error from server (Forbidden): pods "cluster1-haproxy-0" is forbidden: User "system:serviceaccount:prod-dbs:database-manager" cannot delete resource "pods" in API group "" in the namespace "prod-dbs"

What about the Custom Resource?

$ KUBECONFIG=tmp-kube.config kubectl -n prod-dbs delete pxc cluster1
perconaxtradbcluster.pxc.percona.com "cluster1" deleted

Conclusion

Percona Operators automate the deployment and management of databases in Kubernetes. The least privilege principle should be applied to minimize the ability of inexperienced users to affect availability and data integrity. Kubernetes comes with sophisticated Role-Based Access Control capabilities which allow you to do just that without the need to reinvent it in your platform or application.

Try Percona Operators

Nov 22, 2022

Troubleshooting Percona Operators With troubleshoot.sh

Percona loves and embraces Kubernetes. Percona Operators have found their way into the hearts and minds of developers and operations teams. And with growing adoption, we get a lot of valuable feedback from the community and our support organization. Two of the most common questions that we hear are:

  1. What are the best practices to run Operators?
  2. Where do I look if there is a problem? In other words, how do I troubleshoot the issues?

In this blog post, we will experiment with troubleshoot.sh, an open source tool developed by Replicated, and see if it can help with our problems. 

Prepare

Installation

Troubleshoot.sh is a collection of kubectl plugins. Installation is dead simple and described in the documentation.

Concepts

There are two basic use cases:

  1. Preflight – checks if your cluster is ready and compliant to run an application; it covers the best-practices question
  2. Support bundle – checks if the application is running and behaving appropriately; it clearly helps with troubleshooting

Three functional concepts in the tool that you should know about:

  1. Collectors – where you define what information you want to collect. A lot of basic information about the cluster is collected by default.
  2. Redact – if there is sensitive information, you can redact it from the bundle, for example, AWS credentials that appear somewhere in the logs of your application.
  3. Analyze – act on the collected information and provide actionable insights.

 

Preflight

Checks are defined through YAML files. I have created an example to check if the Kubernetes cluster is ready to run Percona Operator for MySQL (based on Percona XtraDB Cluster). This checks the following:

  1. Kubernetes version and flavor – as we only test our releases on specific versions and providers
  2. Default storage class – it is possible to use local storage, but by default, our Operator uses Persistent Volume Claims
  3. Node size – we check if nodes have at least 1 GB of RAM, otherwise, Percona XtraDB Cluster might fail to start

Run the check with the following command:

kubectl preflight https://raw.githubusercontent.com/spron-in/blog-data/master/percona-troubleshoot/pxc-op-1.11-preflight.yaml
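To give a feel for the format, a trimmed-down sketch of such a spec (the full one lives at the URL above; version thresholds and names here are illustrative) might look like this:

apiVersion: troubleshoot.sh/v1beta2
kind: Preflight
metadata:
  name: pxc-operator-preflight
spec:
  analyzers:
    - clusterVersion:
        outcomes:
          - fail:
              when: "< 1.21.0"                       # illustrative minimum version
              message: This Kubernetes version is not tested with the Operator.
          - pass:
              message: The Kubernetes version is supported.
    - storageClass:
        checkName: Default storage class
        storageClassName: "default"
        outcomes:
          - fail:
              message: No default storage class found.
          - pass:
              message: A default storage class is present.
    - nodeResources:
        checkName: Node memory
        outcomes:
          - fail:
              when: "min(memoryCapacity) < 1Gi"
              message: Nodes must have at least 1 GB of RAM for Percona XtraDB Cluster.
          - pass:
              message: Nodes have enough memory.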

It is just an example. Preflight checks are very flexible and you can read more in the documentation.

Support bundle

Support bundles can be used by support organizations or by users to troubleshoot issues with applications on Kubernetes. Once you run the command, it also creates a tar.gz archive with various data points that can be analyzed later or shared with experts.

Similarly to Preflight, support bundle checks are defined in YAML files. See the example that I created here and run it with the following command:

kubectl support-bundle https://raw.githubusercontent.com/spron-in/blog-data/master/percona-troubleshoot/pxc-op-1.11-support.yaml

Check cluster state

First, we are going to check the presence of Custom Resource Definition and the health of the cluster. In my YAML the name of the cluster is hard coded to minimal-cluster (it can be deployed from our repo).

We are using Custom Resource fields to verify the statuses of Percona XtraDB Cluster, HAProxy, and ProxySQL. This is the example checking HAProxy:

 - yamlCompare:
        checkName: Percona Operator for MySQL - HAProxy
        fileName: cluster-resources/custom-resources/perconaxtradbclusters.pxc.percona.com/default.yaml
        path: "[0].status.haproxy.status"
        value: ready
        outcomes:
          - fail:
              when: "false"
              message: HAProxy is not ready
          - pass:
              when: "true"
              message: HAProxy is ready

perconaxtradbclusters.pxc.percona.com/default.yaml has all pxc Custom Resources in the default namespace. I know I have only one there, so I just parse the whole YAML with the yamlCompare method.

Verify Pods statuses

If you see that previous checks failed, it is time to check the Pods. In the current implementation, you can get the statuses of the Pods in various namespaces. 

- clusterPodStatuses:
        name: unhealthy
        namespaces:
          - default
        outcomes:
…
          - fail:
              when: "== Pending"
              message: Pod {{ .Namespace }}/{{ .Name }} is in a Pending state with status of {{ .Status.Reason }}.

Parse logs for known issues

Parsing application logs can be tedious. But what if you could predefine all known errors and quickly learn about an issue without going into your central logging system? Troubleshoot.sh can do that as well.

First, we define the collectors. We will get the logs from the Operator and Percona XtraDB Cluster itself:

collectors:
    - logs:
        selector:
          - app.kubernetes.io/instance=minimal-cluster
          - app.kubernetes.io/component=pxc
        namespace: default
        name: pxc/container/logs
        limits:
          maxAge: 24h
    - logs:
        selector:
          - app.kubernetes.io/instance=percona-xtradb-cluster-operator
          - app.kubernetes.io/component=operator
        namespace: default
        name: pxc-operator/container/logs
        limits:
          maxAge: 24h

Now the logs are in our support bundle. We can analyze them manually or catch some known messages with a regular expression analyzer:

    - textAnalyze:
        checkName: Failed to update the password
        fileName: pxc-operator/container/logs/percona-xtradb-cluster-operator-.*.log
        ignoreIfNoFiles: true
        regex: 'Error 1396: Operation ALTER USER failed for'
        outcomes:
          - pass:
              when: "false"
              message: "No failures on password change"
          - fail:
              when: "true"
              message: "There was a failure to change the system user password. For more details: https://docs.percona.com/percona-operator-for-mysql/pxc/users.html"

In this example, we are checking for a message in the Operator log that indicates the system user password change failed.

Check for security best practices

Besides dealing with operational issues, troubleshoot.sh can help with compliance checks. You should have security guardrails in the code before you deploy to Kubernetes, but if something slips through, you will have a second line of defense.

In our example, we check for weak passwords and whether the StatefulSet matches high-availability best practices.

Collecting and sharing passwords is not recommended, but it can still be a valuable addition to your security policies. To capture secrets, you need to explicitly define a collector:

spec:
  collectors:
    - secret:
        namespace: default
        name: internal-minimal-cluster
        includeValue: true
        key: root

Replay support bundle with sbctl

sbctl is an awesome addition to troubleshoot.sh. It allows users and support engineers to examine support bundles as if they were interacting with a live Kubernetes cluster.

What’s next

The tool is new and there are certain limitations. To list a few:

  1. For now, all checks are hard coded. It calls for some templating mechanism or a simple generator of YAML files.
  2. It is not possible to add sophisticated logic to analyzers, for example, to link one analyzer to another.
  3. It would be great to apply targeted filtering, for example, to gather information not for all Custom Resources but only for one. It can be useful for targeted troubleshooting.

We are going to work with the community and see how Percona can contribute to this tool, as it can help our users to standardize the troubleshooting of our products. Please let us know about your use cases or stories around debugging Percona products on Kubernetes in the comments.

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

Learn More About Percona Kubernetes Operators

Nov 10, 2022

Run MySQL in Kubernetes: Solutions, Pros and Cons

This blog post continues the series of comparisons of solutions to run databases on Kubernetes. Previous blog posts:

The initial release of MySQL was in 1995 and Kubernetes was released 19 years later in 2014. According to DB-engines, MySQL today is the most popular open source relational database, whereas Kubernetes is the de-facto standard for running containerized workloads. The popularity of these two technologies among engineers drove companies to create solutions to run them together. In this blog post, we will review various Kubernetes Operators for MySQL, see how different they are, and what capabilities they provide for developers and operations teams.

The summary and comparison table can be found in our documentation.

Notable mentions

Before reviewing the operators, I need to mention a few other interesting solutions to run MySQL in Kubernetes.

KubeDB

KubeDB is a Swiss Army knife operator that can deploy and manage multiple databases, including MySQL. The catch is that it follows an open core model, where the most interesting features are not available in the free version, so I cannot easily try them out. Rest assured it is a viable solution, but in this blog post, I want to focus on open source offerings.

Helm or manual deployment

The desire to run MySQL on Kubernetes existed before the Operator SDK appeared. To address it, the community got very creative and developed numerous ways to do it, ranging from plain manual deployments to more sophisticated Helm charts (e.g., the Bitnami Helm chart).

They do the job – you can deploy a MySQL server. With some digging and tuning, it is even possible to have a cluster. But what these solutions have in common is that they lack the ability to perform various day-2 operations: backups, scaling, upgrades, etc. For databases this can be crucial, because data consistency and safety are at stake. Applying methods that worked in legacy environments might not be safe on Kubernetes.

This is where Operators come into play.

Bitpoke MySQL Operator

Bitpoke is a company that provides tools for WordPress self-hosting on Kubernetes, including MySQL and WordPress operators. Their team developed one of the first MySQL operators and shared it with the community. Initially developed within Presslabs, the team and the operator moved to Bitpoke in 2021. The operator is used in production by numerous companies.

It is Apache 2.0 licensed. Interestingly enough, it uses Percona Server for MySQL under the hood “because of backup improvements (eg. backup locks), monitoring improvements, and some serviceability improvements (eg. utility user)”.

Deployment

I followed the documentation:

helm repo add bitpoke https://helm-charts.bitpoke.io
helm install mysql-operator bitpoke/mysql-operator

The operator is up and running. Deploying the cluster:

kubectl apply -f https://raw.githubusercontent.com/bitpoke/mysql-operator/master/examples/example-cluster-secret.yaml
kubectl apply -f https://raw.githubusercontent.com/bitpoke/mysql-operator/master/examples/example-cluster.yaml

This deploys a MySQL 5.7 cluster with asynchronous replication – one main node and one replica node.

Features

The Bitpoke operator allows you to back up, restore, scale, and upgrade MySQL on Kubernetes, so regular day-2 operations are available.

Let’s start with the pros:

  • It is the oldest open source Operator for MySQL. It was battle-tested by Bitpoke and widely adopted by the community and other companies. This gives assurance that there are no critical issues. But do read ahead to the cons section.
  • Replication lag detection and mitigation takes lagging nodes out of rotation when the lag exceeds a set threshold. This is important for asynchronous replication so that the application sees up-to-date data.

As for the cons, it seems that the operator is not actively developed, with only 15 commits over the last year.

Oracle MySQL Operator

This is not the first time Oracle has created an Operator for MySQL, but the difference now is that this Operator made it to the General Availability stage. The Operator is distributed under the unusual Universal Permissive License (UPL), but it is really permissive and close to the MIT license.

Deployment

Standard deployment for the operator with helm, no surprises:

helm repo add mysql-operator https://mysql.github.io/mysql-operator/
helm repo update
helm install mysql-operator mysql-operator/mysql-operator --namespace mysql-operator --create-namespace

Now the cluster:

export NAMESPACE=default
helm install my-mysql-innodbcluster mysql-operator/mysql-innodbcluster -n $NAMESPACE \
        --version 2.0.7 \
        --set credentials.root.password=">-0URS4F3P4SS" \
        --set tls.useSelfSigned=true

This deploys a MySQL cluster with three nodes with Group Replication and a single MySQL router Pod in front of it for query routing. 

Features

Even though the operator was promoted to GA, some basic capabilities are either not there or must be implemented by the user – for example, upgrades, monitoring, and topology flexibility.

  • The Operator uses MySQL Shell underneath, and its codebase is in Python (versus the usual Golang). Backups and restores also rely on MySQL Shell and use the dumpInstance() method. It is a logical backup and might be problematic for big databases.
  • The Operator supports only MySQL 8.0 and only Group Replication (marketed as InnoDB Cluster), which I think is a logical move.
  • You can use MySQL Community container images, but there is a certain trend from Oracle to push users towards Enterprise. At least it is visible in the last two release notes, where the new additions are all centered around features available only in the Enterprise edition.

Moco

Similar to the Bitpoke operator, Moco was created by Cybozu for its internal needs and later open sourced. It is licensed under Apache 2.0, written in Golang, and has a good release cadence.

Deployment

As usual, let’s try the quick start guide. Note that cert-manager is required (my curiosity was piqued from the start!).

Install cert-manager and deploy the operator:

curl -fsLO https://github.com/jetstack/cert-manager/releases/latest/download/cert-manager.yaml
kubectl apply -f cert-manager.yaml

helm repo add moco https://cybozu-go.github.io/moco/
helm repo update
helm install --create-namespace --namespace moco-system moco moco/moco

Create the cluster from an example folder:

kubectl apply -f https://raw.githubusercontent.com/cybozu-go/moco/main/examples/loadbalancer.yaml

This deploys a cluster with three nodes and semi-sync replication exposed with a load balancer.

Features

Moco is quite feature-rich and enables users to execute various management tasks. Refreshing solutions and ideas:

  • Documentation. For an in-house grown product, the operator has pretty good and detailed documentation. From design and architecture decisions to how-tos and common use cases.
  • Similar to Oracle’s operator, this one relies on MySQL Shell for backups and restores, but at the same time supports incremental backups. Binary logs are copied to an S3 bucket at the time of backup, so it is not a continuous binlog copy that can deliver point-in-time recovery, but only a one-time dump. 
  • The Operator has its own kubectl plugin to smooth onboarding and usage. I’m curious, though, whether these plugins are widely used in production. Please share your thoughts in a comment.

There are some concerns that I have regarding this operator:

  • It works with semi-sync replication which is not a good fit for workloads that have strong data-consistency requirements. See Face to Face with Semi-Synchronous Replication for more details.
  • It does not have official support, so for businesses, it is a “use at your own risk” type of situation.

Vitess

Vitess is a database clustering system for horizontal scaling of MySQL, initially developed at YouTube (where it was widely used). Now it is a CNCF project and is actively developed by PlanetScale and the community. It is open source, but some features in Vitess itself are only available to PlanetScale customers. Interesting fact: the Vitess Operator serves as a core component of the PlanetScale DBaaS, so it is a production-grade and battle-tested operator.

Deployment

Going with a quickstart

git clone https://github.com/vitessio/vitess
cd vitess/examples/operator

kubectl apply -f operator.yaml

Operator is ready. Let’s deploy an intro cluster:

kubectl apply -f 101_initial_cluster.yaml

This deploys the following:

  • Three etcd Pods
  • Various Vitess Pods:
    • Two vttablet
    • vtadmin
    • vtctld
    • vtgate
    • vtorc 

You can read more about Vitess concepts and pieces in architecture documentation.

Features

Sharding is one of the biggest pros of this operator. The only competitor I can think of is TiDB (which is MySQL protocol-compatible but is not MySQL). There are no other solutions for MySQL sharding in the open source space. But it all comes with a price – complexity, which Kubernetes certainly helps to mask. Getting familiar with all the vt-* components can be overwhelming, especially for users who have never used Vitess before.

The Operator provides users with all the regular management operations. The only downside is that these operations are not well documented, and you have to discover them through various blog posts, reference docs, and other artifacts. For example, this blog post covers some basics for backups and restores, whereas this document covers basic Vitess operations.

Percona

At Percona we have two operators for MySQL:

  1. Based on Percona XtraDB Cluster (PXC)
  2. Based on Percona Server for MySQL (PS)

In Percona Operator for MySQL – Alpha Release, we explain why we decided to create the new operator. Both operators are fully open source, as are the components they are based on. The one based on PXC is production-ready, whereas the PS one is getting there.

Deployment

For Percona Kubernetes Operators we maintain helm charts for ease of onboarding. Deployment is a two-step process.

Deploy the operator:

helm repo add percona https://percona.github.io/percona-helm-charts/
helm install my-op percona/pxc-operator

And the database:

helm install my-db percona/pxc-db

Features

For features, I will focus on PXC as it is production-ready, and we are aiming for the PS Operator to reach parity in the near future.

  • Proxy integration. Along with the database, the operator also deploys proxies. Users can choose from HAProxy and ProxySQL. The choice depends on the use case.
  • Operator allows users to have multi-cluster or multi-regional MySQL deployments. This is useful for migrations or disaster recovery.
  • There are no other operators for MySQL that provide a true point-in-time recovery. We developed a solution that continuously stores backups on object storage. Read more about architecture decisions in Point-In-Time Recovery in Percona Operator for MySQL Based on Percona XtraDB Cluster – Architecture Decisions.
  • Operator provides automated and safe upgrades for minor versions of MySQL and its components (like proxies). It is extremely useful to quickly fix critical vulnerabilities and roll out bug fixes with zero downtime.

Percona is committed to providing open source products to the community, but we also provide exceptional services for our customers: managed and professional services and support. We have an ecosystem of products — Percona Platform — that brings together our software and services offerings.
