Run MongoDB in Kubernetes: Solutions, Pros and Cons

Running MongoDB in Kubernetes is becoming the norm. The most common use cases for running databases in Kubernetes are:

  • Create a private Database as a Service (DBaaS).
  • Maintain coherent infrastructure, where databases and applications run on the same platform.
  • Avoid vendor lock-in and be able to run their databases anywhere utilizing the power of Kubernetes and containers.

There are multiple solutions that allow you to run MongoDB in Kubernetes, and in this blog post, we are going to compare these solutions and review the pros and cons of each of them.

Solutions that we are going to review are:

  • Bitnami Helm chart
  • KubeDB by AppsCode
  • MongoDB Community Operator
  • Percona Operator for MongoDB

The summary and comparison table can be found in our documentation.

Bitnami Helm chart

Bitnami was acquired by VMware in 2019 and is known for its Helm charts to deploy various applications in Kubernetes. The key word here is deploy, as there are no management capabilities in this solution.

All Bitnami Helm charts are under Apache 2.0 license, pure open source. The community around Bitnami is quite big and active.


Installation is a usual two-step process: add Helm repository, and deploy the chart.

Add the repository:

$ helm repo add bitnami https://charts.bitnami.com/bitnami

You can tweak the chart installation through values. I decided to experiment with it and deploy a replica set vs the default standalone setup:

$ helm install my-mongo bitnami/mongodb --set architecture="replicaset" --set replicaCount=2

This command deploys a MongoDB replica set with two nodes and one arbiter. 

It is worth noting that the GitHub README is not the only piece of documentation; there are also official Bitnami docs with more examples and details.


Management capabilities are not the strong part of this solution, but it is possible to scale the replica set horizontally (add more nodes to it) and vertically (add more resources to nodes). 

Monitoring is represented by a Prometheus exporter (mongodb-exporter) deployed as a sidecar container. It is described in the Metrics parameters section. It’s a straightforward approach that is enough for most use cases.

What I love about all Bitnami Helm charts is that they provide lots of flexibility for tuning Kubernetes primitives and components: labels, annotations, images, resources, and more.

If we look into the MongoDB-related functionality in this solution, there are a few features worth noting:

  1. Create users and databases during database provisioning. It is a bit odd that you cannot create roles and assign them right away, but it is a nice touch.
  2. Hidden node – you will not find it in any other solution in this blog post. 
  3. TLS – you can encrypt the traffic flowing between the replica set nodes. Good move, but for some reason, it is disabled by default. 
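As a sketch of how these MongoDB-related options fit together, the values file below shows hypothetical settings for users, TLS, and the metrics sidecar; the parameter names follow recent versions of the Bitnami chart but should be verified against the chart’s README before use:

```shell
# Sketch of a values file for the Bitnami MongoDB chart.
# Parameter names follow recent chart versions; verify against the chart README.
cat > mongo-values.yaml <<'EOF'
architecture: replicaset
replicaCount: 3
auth:
  usernames: ["appuser"]     # users created during provisioning
  databases: ["appdb"]       # databases created during provisioning
tls:
  enabled: true              # encrypt traffic between replica set nodes
metrics:
  enabled: true              # deploy the exporter sidecar
EOF
# Then deploy with:
#   helm install my-mongo bitnami/mongodb -f mongo-values.yaml
echo "values file written"
```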

I have not found any information about supported MongoDB versions. The latest Helm chart version was deploying MongoDB 5.0.

You can deploy sharded clusters, but that requires a separate Helm chart – MongoDB Sharded. Still, you cannot easily integrate this solution with enterprise tooling such as LDAP or HashiCorp Vault; at the very least, it is not going to be a straightforward integration.


KubeDB

KubeDB, created by AppsCode, is a Swiss-army-knife Operator to deploy various databases in Kubernetes, including MongoDB.

This solution follows an open-core model, where limited functionality is available for free, while the more compelling features require purchasing a license. Here you can find a comparison of the Community vs. Enterprise versions.

The focus of our review is the Community version, but we will mention which features are available in the Enterprise version.


I found deployment a bit cumbersome as it requires downloading the license first, even for the Community version. 

Add the Helm repo:

$ helm repo add appscode https://charts.appscode.com/stable/

Install the Operator:

$ helm install kubedb appscode/kubedb \
 --version v2022.05.24 \
 --namespace kubedb --create-namespace \
 --set-file global.license=/path/to/the/license.txt

Deploy the database with the command below. It is worth mentioning that with the Community version you can deploy the database only in the demo namespace.

$ kubectl create -f

This deploys a sharded cluster. You can find a bunch of examples on GitHub.
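For reference, a minimal KubeDB MongoDB custom resource might look roughly like the following; field names follow the KubeDB v2022.x documentation and the version string must match one listed by kubectl get mongodbversions, so treat this as a sketch rather than a tested manifest:

```shell
# Sketch of a minimal KubeDB MongoDB custom resource (a replica set this time).
# Field names per KubeDB v2022.x docs; verify before applying.
cat > mongo-rs.yaml <<'EOF'
apiVersion: kubedb.com/v1alpha2
kind: MongoDB
metadata:
  name: mongo-rs
  namespace: demo            # the Community version only deploys into demo
spec:
  version: "5.0.3"           # must match a name from 'kubectl get mongodbversions'
  replicaSet:
    name: rs0
  replicas: 3
  storage:
    resources:
      requests:
        storage: 1Gi
EOF
# Then: kubectl create -f mongo-rs.yaml
echo "manifest written"
```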


The Community version comes with almost no management capabilities. You can add and remove resources and nodes, so perform basic scaling, but nothing else.

Backups, restores, upgrades, encryption, and many more are all available in the Enterprise version only.

For monitoring, KubeDB follows the same approach as Bitnami and allows you to deploy a Prometheus exporter container as a sidecar. Read more here.

The Community version clearly comes with limited functionality, which is not enough to run it in production. At the same time, there are some interesting features:

  • One Operator, many databases. I like this approach as it can simplify your operational landscape if you run various database flavors.
  • Pick your MongoDB version. You can easily find various supported MongoDB versions and pick the one you need through a custom resource:
$ kubectl get mongodbversions
NAME             VERSION   DISTRIBUTION   DB_IMAGE                                 DEPRECATED   AGE
3.4.17-v1        3.4.17    Official       mongo:3.4.17                                          20m
3.4.22-v1        3.4.22    Official       mongo:3.4.22                                          20m
3.6.13-v1        3.6.13    Official       mongo:3.6.13                                          20m
3.6.8-v1         3.6.8     Official       mongo:3.6.8                                           20m
4.0.11-v1        4.0.11    Official       mongo:4.0.11                                          20m
4.0.3-v1         4.0.3     Official       mongo:4.0.3                                           20m
4.0.5-v3         4.0.5     Official       mongo:4.0.5                                           20m
4.1.13-v1        4.1.13    Official       mongo:4.1.13                                          20m
4.1.4-v1         4.1.4     Official       mongo:4.1.4                                           20m
4.1.7-v3         4.1.7     Official       mongo:4.1.7                                           20m
4.2.3            4.2.3     Official       mongo:4.2.3                                           20m
4.4.6            4.4.6     Official       mongo:4.4.6                                           20m
5.0.2            5.0.2     Official       mongo:5.0.2                                           20m
5.0.3            5.0.3     Official       mongo:5.0.3                                           20m
percona-3.6.18   3.6.18    Percona        percona/percona-server-mongodb:3.6.18                 20m
percona-4.0.10   4.0.10    Percona        percona/percona-server-mongodb:4.0.10                 20m
percona-4.2.7    4.2.7     Percona        percona/percona-server-mongodb:4.2.7-7                20m
percona-4.4.10   4.4.10    Percona        percona/percona-server-mongodb:4.4.10                 20m

As you can see, it even supports Percona Server for MongoDB.

  • Sharding. I like that the Community version comes with sharding support as it helps to experiment and understand the future fit of this solution for your environment. 

MongoDB Community Operator

Similar to KubeDB, MongoDB Corp follows the open-core model. The Community version of the Operator is free, and there is an Enterprise version available (MongoDB Enterprise Kubernetes Operator), which is more feature-rich.


A Helm chart is available for this Operator, but there is no Helm chart to deploy the Custom Resource itself, which makes onboarding a bit more complicated.

Add the repository: 

$ helm repo add mongodb https://mongodb.github.io/helm-charts

Install the Operator:

$ helm install community-operator mongodb/community-operator

To deploy the database, you need to modify this example and set the password. Once done, apply it to deploy a three-node replica set:

$ kubectl apply -f mongodb.com_v1_mongodbcommunity_cr.yaml


There are no expectations that anyone would run the MongoDB Community Operator in production, as it lacks basic features: backups and restores, sharding, upgrades, etc. But, as always, there are interesting ideas worth mentioning:

  • User provisioning. Described here. You can create a MongoDB database user to authenticate to your MongoDB replica set using SCRAM. You create the Secret object and list users and their corresponding roles.
  • Connection string in the secret. Once a cluster is created, you can get the connection string, without figuring out the service endpoints or fetching the user credentials.
  "connectionString.standard": "mongodb://my-user:mySupaPass@example-mongodb-0.example-mongodb-svc.default.svc.cluster.local:27017,example-mongodb-1.example-mongodb-svc.default.svc.cluster.local:27017,example-mongodb-2.example-mongodb-svc.default.svc.cluster.local:27017/admin?replicaSet=example-mongodb&ssl=false",
  "connectionString.standardSrv": "mongodb+srv://my-user:mySupaPass@example-mongodb-svc.default.svc.cluster.local/admin?replicaSet=example-mongodb&ssl=false",
  "password": "mySupaPass",
  "username": "my-user"
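Since these values live in a Kubernetes Secret, they are stored base64-encoded. The sketch below simulates the decode step on the example SRV string above; in a real cluster the encoded value would come from kubectl (the exact secret name depends on your deployment):

```shell
# Secret values are base64-encoded; decoding is a single pipe.
# In-cluster the encoded value would come from something like:
#   kubectl get secret <name> -o jsonpath='{.data.connectionString\.standardSrv}'
srv='mongodb+srv://my-user:mySupaPass@example-mongodb-svc.default.svc.cluster.local/admin?replicaSet=example-mongodb&ssl=false'
encoded=$(printf '%s' "$srv" | base64 -w0)
printf '%s' "$encoded" | base64 -d
```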

Percona Operator for MongoDB

Last but not least in our solution list – the fully open source Percona Operator for MongoDB. It was created in 2018 and went through various stages of improvement, reaching general availability (GA) in 2020. The latest release came out in May 2022.

If you are a Percona customer, you will get 24/7 support for the Operator and clusters deployed with it. 


There are various ways to deploy Percona Operators; one of them is through Helm charts: one chart for the Operator, another for the database.

Add repository:

$ helm repo add percona https://percona.github.io/percona-helm-charts/

Deploy the Operator:

$ helm install my-operator percona/psmdb-operator

Deploy the cluster:

$ helm install my-db percona/psmdb-db


The Operator leverages Percona Distribution for MongoDB, which unblocks various enterprise capabilities right away. The Operator itself is feature-rich and provides various features that simplify the migration from a regular MongoDB deployment to Kubernetes: sharding, arbiters, backups and restores, and many more.

Some key features that I would like to mention:

  • Automated upgrades. You can set up the schedule of upgrades and the Operator will automatically upgrade your MongoDB cluster’s minor version with no downtime to operations.
  • Point-in-time recovery. Percona Backup for MongoDB is part of our Distribution and provides backup and restore functionality in our Operator, and this includes the ability to store oplogs on an object storage of your choice. Read more about it in our documentation.
  • Multi-cluster deployment. Percona Operator supports complex topologies with cross-cluster replication capability. The most common use cases are disaster recovery and migrations.
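As an illustration of the point-in-time recovery feature, the fragment below sketches what the backup section of the Operator’s cr.yaml can look like; the field names follow recent Operator documentation, and the bucket and credentials names are placeholders:

```shell
# Sketch of the backup section of a Percona Operator for MongoDB cr.yaml
# with point-in-time recovery enabled. Bucket/secret names are placeholders.
cat > backup-fragment.yaml <<'EOF'
backup:
  enabled: true
  pitr:
    enabled: true              # stream oplog chunks to object storage
  storages:
    s3-us-west:
      type: s3
      s3:
        bucket: my-psmdb-backups
        region: us-west-2
        credentialsSecret: my-cluster-backup-s3
EOF
echo "fragment written"
```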


Choosing a solution to deploy and manage MongoDB is an important technical decision, which might impact various business metrics in the future. In this blog post, we summarized and highlighted the pros and cons of various open source solutions to run MongoDB in Kubernetes. 

An important thing to remember is that the solution you choose should not only provide a way to deploy the database but also enable your teams to execute various management and maintenance tasks without drowning in MongoDB complexity. 


Managing MySQL Configurations with the PXC Kubernetes Operator V1.10.0 Part 3: Conclusion

In part one and part two of this series, we introduced the different ways to manage MySQL configurations and their precedence when using the Percona XtraDB Cluster (PXC) object and ConfigMap. In this post, we will see the precedence when Secrets are used for MySQL configurations in Percona Operator for MySQL based on Percona XtraDB Cluster.

CASE-4: Secret with name cluster1-pxc and ConfigMap with name cluster1-pxc but without configuration in PXC object

When the MySQL configuration is present in the ConfigMap and the Secret but not in the PXC object, the following is the state:

# kubectl get pxc cluster1 -ojson | jq .spec.pxc.configuration
# kubectl get secrets cluster1-pxc -ojson | jq -r '.data."secret.cnf"' |base64 -d
wsrep_provider_options="gcache.size=768M; gcache.recover=yes"
# kubectl get cm cluster1-pxc -ojson | jq .data
 "init.cnf": "[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=64M; gcache.recover=yes\"\n"

Let’s query the DB to see which value has been taken

# kubectl run -i --rm --tty percona-client --image=percona:8.0 --restart=Never -- bash -il
mysql> SHOW VARIABLES LIKE 'wsrep_provider_options'\G
*************************** 1. row ***************************
Variable_name: wsrep_provider_options
<snip> … gcache.size = 768M; … </snip>

As can be seen, the Secret takes precedence over the ConfigMap.

CASE-5: Configuration present in PXC object, ConfigMap cluster1-pxc, secret cluster1-pxc

Current State:

# kubectl get pxc cluster1 -ojson | jq .spec.pxc.configuration
"[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=128M; gcache.recover=yes\"\n"
# kubectl get cm cluster1-pxc -ojson | jq .data
 "init.cnf": "[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=128M; gcache.recover=yes\"\n"

Let’s try to use secrets cluster1-pxc and see the effects.

# cat secret.cnf
wsrep_provider_options="gcache.size=768M; gcache.recover=yes"
# kubectl create secret generic cluster1-pxc --from-file secret.cnf
secret/cluster1-pxc created

As can be seen below, the ConfigMap and the PXC object did not change.

# for i in `seq 1 100`; do kubectl get pxc cluster1 -ojson | jq .spec.pxc.configuration ; sleep 2 ; done
"[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=128M; gcache.recover=yes\"\n"
"[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=128M; gcache.recover=yes\"\n"
# for i in `seq 1 100`; do kubectl get cm cluster1-pxc -ojson | jq .data; sleep 2 ;done
 "init.cnf": "[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=128M; gcache.recover=yes\"\n"
 "init.cnf": "[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=128M; gcache.recover=yes\"\n"
 "init.cnf": "[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=128M; gcache.recover=yes\"\n"

However, the DB has taken its configuration from the Secret:

mysql> SHOW VARIABLES LIKE 'wsrep_provider_options'\G
*************************** 1. row ***************************
Variable_name: wsrep_provider_options
<snip> … gcache.size = 768M; … </snip>

The Secret takes precedence over the PXC object and the ConfigMap.


  1. MySQL configurations via the Secret cluster1-pxc take precedence over the ConfigMap cluster1-pxc and the PXC object.
  2. If the Secret cluster1-pxc is not present, MySQL configurations in the PXC object take precedence over the ConfigMap cluster1-pxc.
  3. The Operator takes the configuration from the PXC object and overwrites the configuration in the ConfigMap cluster1-pxc in the reconciliation loop.
  4. If the configuration is present only in the ConfigMap or the Secret, it is not written back to the PXC object in the reconciliation loop.
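The precedence rules above reduce to a simple ordering. As a toy illustration (this is not Operator code), a shell function picking the effective configuration source could look like:

```shell
# Toy model of the precedence described above: Secret > PXC object > ConfigMap.
effective_source() {
  secret="$1"; pxc="$2"; configmap="$3"   # "yes" if configuration is present
  if [ "$secret" = yes ]; then echo secret
  elif [ "$pxc" = yes ]; then echo pxc-object
  elif [ "$configmap" = yes ]; then echo configmap
  else echo none
  fi
}
effective_source yes yes yes   # -> secret
effective_source no yes yes    # -> pxc-object
effective_source no no yes     # -> configmap
```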

We would love to hear how you are managing MySQL configurations – feel free to comment!


Managing MySQL Configurations with the PXC Kubernetes Operator V1.10.0 Part 2: Walkthrough

In part one of this series, we introduced the different ways to manage MySQL configurations. In this post, we will walk through the different possibilities and the changes that happen while modifying MySQL configurations with the Operator.

Percona Distribution for MySQL Operator based on Percona XtraDB Cluster (PXC) provides three ways of managing MySQL configurations, but the question is: what is the precedence among them? We will walk through several cases of using MySQL configs. For the sake of simplicity, we will play with the values of the Galera cache to see the effects.

configuration: |
   wsrep_provider_options="gcache.size=64M; gcache.recover=yes"

CASE-1: Modify Percona XtraDB Cluster object

If the PXC object is not yet present, the configuration can be edited in cr.yaml and applied with

kubectl apply -f cr.yaml

. The configuration will then be placed in the spec.pxc section of the object:



   size: 3
   image: percona/percona-xtradb-cluster:8.0.25
   autoRecovery: true
   configuration: |
     wsrep_provider_options="gcache.size=64M; gcache.recover=yes"

After applying the changes and verifying the configuration:

# kubectl apply -f deploy/cr.yaml configured
# kubectl get pxc cluster1 -ojson | jq .spec.pxc.configuration
"[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=64M; gcache.recover=yes\"\n"

An interesting observation is that when we add MySQL configuration to the PXC object, a ConfigMap named <clusterName>-pxc (cluster1-pxc in this case) is automatically created. This can be observed in the following snippet, where the loop was started before the kubectl apply command:

# for i in `seq 1 100`; do kubectl get cm cluster1-pxc; sleep 3 ;done
Error from server (NotFound): configmaps "cluster1-pxc" not found
NAME           DATA   AGE
cluster1-pxc   1      3s
NAME           DATA   AGE
cluster1-pxc   1      7s

If we check the content of the ConfigMap, we can see that the MySQL configuration from the PXC object is written there:

# kubectl get cm cluster1-pxc -ojson | jq .data
 "init.cnf": "[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=64M; gcache.recover=yes\"\n"

Now let’s change the gcache size from 64M to 128M in cr.yaml and apply the changes with

kubectl apply -f cr.yaml


configuration: |
     wsrep_provider_options="gcache.size=128M; gcache.recover=yes"

The PXC object changed, and the ConfigMap values changed as well:

# kubectl get pxc cluster1 -ojson | jq .spec.pxc.configuration
"[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=128M; gcache.recover=yes\"\n"

for i in `seq 1 100`; do kubectl get cm cluster1-pxc -ojson | jq .data; sleep 3 ;done
 "init.cnf": "[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=64M; gcache.recover=yes\"\n"
 "init.cnf": "[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=128M; gcache.recover=yes\"\n"

The ConfigMap cluster1-pxc is automatically reconciled by the Operator to reflect the values of the PXC object.

CASE-2: When MySQL configuration is present in both PXC object and ConfigMap and ConfigMap is modified

Following up on CASE-1, let’s try to edit the ConfigMap and see the behavior.

The ConfigMap is edited to change the gcache size from 128M to 256M using

kubectl edit cm cluster1-pxc


# apiVersion: v1
 init.cnf: |
   wsrep_provider_options="gcache.size=256M; gcache.recover=yes"

As can be seen from the output below, the Operator reconciles the ConfigMap and reverts it to the setting of the PXC object. Configuration in the PXC object takes precedence over the ConfigMap if both are present.

The transition can be observed below:

# for i in `seq 1 100`; do kubectl get cm cluster1-pxc -ojson | jq .data; sleep 1 ;done  
 "init.cnf": "[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=128M; gcache.recover=yes\"\n"
 "init.cnf": "[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=256M; gcache.recover=yes\"\n"
 "init.cnf": "[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=256M; gcache.recover=yes\"\n"
 "init.cnf": "[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=128M; gcache.recover=yes\"\n"
 "init.cnf": "[mysqld]\nwsrep_debug=CLIENT\nwsrep_provider_options=\"gcache.size=128M; gcache.recover=yes\"\n"

CASE-3: Modifying the ConfigMap cluster1-pxc when there is no MySQL configuration in the PXC object

Let’s clean up the MySQL configuration in the PXC object. This is done by removing the configuration section of pxc in cr.yaml and applying the changes with

kubectl apply -f cr.yaml


After applying the changes, the PXC object can be verified; the configuration is now empty:

# kubectl get pxc cluster1 -ojson | jq .spec.pxc.configuration

Now let’s create a ConfigMap cluster1-pxc with MySQL configurations:

# cat my.cnf
wsrep_provider_options="gcache.size=256M; gcache.recover=yes"
# kubectl create configmap cluster1-pxc --from-file my.cnf
configmap/cluster1-pxc created

An interesting point to observe is that the PXC object is not updated:

# kubectl get pxc cluster1 -ojson | jq .spec.pxc.configuration

However, the changes are reflected in the DB after the pods are recycled and updated with the new configuration:

# kubectl run -i --rm --tty percona-client --image=percona:8.0 --restart=Never -- bash -il
[mysql@percona-client /]$ mysql -h cluster1-pxc-2.cluster1-pxc.pxc.svc.cluster.local -uroot -proot_password
mysql> SHOW VARIABLES LIKE 'wsrep_provider_options'\G
*************************** 1. row ***************************
<snip> …  gcache.size = 256M; … </snip>

In the next post, we will see the precedence when secrets are used for MySQL configurations in PXC Operator. Stay tuned!


Managing MySQL Configurations with the PXC Kubernetes Operator V1.10.0 Part 1: Introduction

Introduction/FAQ

Question: I need to run a production-grade open source MySQL DB.

Answer: Percona to the rescue! Percona XtraDB Cluster (PXC) is an open source enterprise MySQL solution that helps you ensure data availability for your applications while improving security and simplifying the development of new applications in the most demanding public, private, and hybrid cloud environments.

Question: I forgot to mention that I need to run it on Kubernetes.

Answer: Percona to the rescue again! Percona Distribution for MySQL Operator based on Percona XtraDB Cluster contains everything you need to quickly and consistently deploy and scale Percona XtraDB Cluster instances in a Kubernetes-based environment on-premises or in the cloud.

Question: I have a lot of MySQL configurations to manage.

Answer: The PXC Operator makes it easy to manage MySQL configurations. Let’s explore.

For the rest of the article, the name of the PXC cluster is assumed to be cluster1; this can be modified based on user preference.

How can I change the MySQL configurations?

If you have not done it already, the first thing to do is install the PXC operator. Our Quickstart guide gives detailed instructions on how to get started. 

There are three possible ways to modify the MySQL configurations as described in the Documentation for MySQL options:

  1. Custom Resource PerconaXtraDBCluster (pxc/pxcs/perconaxtradbclusters)
  2. ConfigMap with name cluster1-pxc
  3. Secret with name cluster1-pxc

Which option should I choose for managing configurations?

The choice among the above options depends on the use case and the user’s preferences.

Following are some examples:

Using ConfigMap

  1. If the MySQL configuration is pretty big and/or you want to maintain the configuration separately rather than updating everything in the PXC object.
  2. If you want to provide permission to change MySQL configurations but not the other properties of PXC objects like resources, affinity, etc., K8s RBAC can be used to achieve this. A Role/ClusterRole can be created to provide access only for the ConfigMap which is used for MySQL configuration.

Using Secrets

  1. If there is any sensitive information that needs to be used in the configuration, Secrets are recommended. Even though k8s Secrets are just base64-encoded data, Secrets have the advantage of integrating well with vaults, and it is best practice to use k8s Secrets rather than a ConfigMap when there is sensitive data.
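To underline the point that base64 is encoding, not encryption, the round trip is trivial:

```shell
# A Secret value is merely base64-encoded; anyone with read access can decode it.
line='wsrep_provider_options="gcache.size=768M; gcache.recover=yes"'
encoded=$(printf '%s' "$line" | base64 -w0)
printf '%s' "$encoded" | base64 -d   # prints the original line back
```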

What happens when I change MySQL configuration?

Any change to the MySQL configuration will generally recycle the pods in reverse order, assuming the RollingUpdate strategy is used.

Example: If three replicas are used for the PXC cluster, cluster1-pxc-[0,1,2] pods are created. When the MySQL configuration is changed, cluster1-pxc-2 is terminated first, and the system waits until the new cluster1-pxc-2 pod starts running and becomes healthy; then cluster1-pxc-1 is terminated, and so on.
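The reverse ordering can be sketched with a one-liner generating the termination sequence for a three-replica cluster (purely illustrative; the Operator drives the actual rollout):

```shell
# Pods are recycled highest ordinal first: cluster1-pxc-2, then -1, then -0.
replicas=3
for i in $(seq $((replicas - 1)) -1 0); do
  echo "recycle cluster1-pxc-$i"
done
# -> recycle cluster1-pxc-2
#    recycle cluster1-pxc-1
#    recycle cluster1-pxc-0
```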

Following are the changes observed with the watch command.

# kubectl get po --watch
cluster1-pxc-0   3/3     Running   0          8m23s
cluster1-pxc-1   3/3     Running   0          10m
cluster1-pxc-2   3/3     Running   0          13m
cluster1-pxc-2   3/3     Terminating   0          13m
cluster1-pxc-2   0/3     Terminating   0          14m
cluster1-pxc-2   0/3     Pending       0          0s
cluster1-pxc-2   0/3     Init:0/1      0          1s
cluster1-pxc-2   0/3     PodInitializing   0          8s
cluster1-pxc-2   2/3     Running           0          10s
cluster1-pxc-2   3/3     Running           0          2m
cluster1-pxc-1   3/3     Terminating       0          14m
cluster1-pxc-1   0/3     Terminating       0          14m
cluster1-pxc-1   0/3     Pending           0          0s
cluster1-pxc-1   0/3     Init:0/1          0          1s
cluster1-pxc-1   0/3     PodInitializing   0          6s
cluster1-pxc-1   2/3     Running           0          8s
cluster1-pxc-1   3/3     Running           0          2m1s
cluster1-pxc-0   3/3     Terminating       0          13m
cluster1-pxc-0   0/3     Terminating       0          14m
cluster1-pxc-0   0/3     Pending           0          0s
cluster1-pxc-0   0/3     Init:0/1          0          0s
cluster1-pxc-0   0/3     PodInitializing   0          6s
cluster1-pxc-0   2/3     Running           0          8s
cluster1-pxc-0   3/3     Running           0          2m

In the upcoming post, we will see the precedence and the changes happening while modifying MySQL configurations. Stay tuned!


Exploring MySQL on Kubernetes with Minikube

In this blog post, I will show how to install the MySQL-compatible Percona XtraDB Cluster (PXC) Operator on Minikube and perform some basic actions. I am by no means a Kubernetes expert, and this blog post is the result of my explorations preparing for a local MySQL Meetup, so if you have comments or suggestions on how to do things better, please comment away!

For my experiments, I used Minikube version 1.26 with the docker driver in the most basic installation on Ubuntu 22.04 LTS, though it should work with other combinations, too. You also can find the official “Running Percona XtraDB Cluster on Minikube” documentation here.

You will also need kubectl installed for this tutorial to work. Alternatively, you can use “minikube kubectl” instead of “kubectl” in the examples.

You will also need the MySQL client, jq, and sysbench utilities installed.

I have also made the commands and YAML configurations used throughout this tutorial available in the minikube-pxc-tutorial GitHub repository.

Enabling Metrics Server in Minikube

As we may want to look into resource usage in some of our experiments, we can consider enabling the Metrics Server add-on which is done by running:

minikube addons enable metrics-server

Getting basic MySQL up and running on Kubernetes

If you use the Percona XtraDB Cluster Operator for your MySQL deployment on Kubernetes, the first thing to do is install the Operator:

kubectl apply -f

Next, we can create our cluster which we’ll call “minimal-cluster”:

kubectl apply -f 0-cr-minimal.yaml

Note that completion of this command does not mean the cluster is provisioned and ready for operation; rather, the process has started, and it may take a bit of time to complete. You can verify that it is provisioned and ready by running:

# watch -n1 -d kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
minimal-cluster-haproxy-0                          2/2     Running   0          3m1s
minimal-cluster-pxc-0                              3/3     Running   0          3m1s
percona-xtradb-cluster-operator-79949dc46d-tdslj   1/1     Running   0          3m55s

You can see three pods running: the Operator, the Percona XtraDB Cluster node (pxc-0), and the HAProxy node (haproxy-0), which provides high availability for “real” clusters with more than one cluster node in operation.

If you do not want to experiment with high availability options, such a single-node deployment is all you need for basic development tasks.

Percona Operator for MySQL Custom Resource Manifest explained

Before we go further let’s look at the YAML file we just deployed to check what’s in it. For a complete list of supported options check out the documentation:

kind: PerconaXtraDBCluster
metadata:
  name: minimal-cluster

This resource definition corresponds to Percona XtraDB Cluster Operator version 1.11. Our cluster will be named minimal-cluster.

  crVersion: 1.11.0
  secretsName: minimal-cluster-secrets

Once again, we specify the version of the custom resource definition and the name of the Secrets resource this cluster will use. It is derived from the cluster name for convenience but could actually be anything.

allowUnsafeConfigurations: true

We are deploying a single-node cluster (at first), which is considered an unsafe configuration, so one needs to allow unsafe configurations for the deployment to succeed. For production deployments, you should not allow unsafe configurations unless you really know what you’re doing.

    apply: 8.0-recommended
    schedule: "0 4 * * *"

This section automatically checks for updates every day at 4 AM and performs upgrades to the recommended version if available.

    size: 1
    image: percona/percona-xtradb-cluster:8.0.27-18.1
            storage: 6G

pxc is the main container of this pod, containing the cluster itself. We need to store data on a persistent volume, so we define it here, using the default storage class and asking for 6GB. The size parameter defines how many nodes to provision, and image defines which particular image of Percona XtraDB Cluster to deploy.

    enabled: true
    size: 1
    image: percona/percona-xtradb-cluster-operator:1.11.0-haproxy

We deploy HAProxy to manage the high availability of the cluster. We deploy only one instance for testing purposes; in production, you need at least two to avoid a single point of failure.

    enabled: true
    image: percona/percona-xtradb-cluster-operator:1.11.0-logcollector

Collect logs from the Percona XtraDB Cluster container, so they are persisted even if that container is destroyed.

Accessing the MySQL server you provisioned 

But how do we access our newly provisioned MySQL/Percona XtraDB Cluster instance?

By default, the instance is not accessible outside of the Kubernetes cluster for security purposes, which in our case means outside of the Minikube environment. If we want it to be accessible, we need to expose it:

kubectl expose service  minimal-cluster-haproxy --type=NodePort --port=3306 --name=mysql-test

We can see the result of this command by running:  

# kubectl get service
NAME                               TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                 AGE
kubernetes                         ClusterIP        <none>        443/TCP                                 34m
minimal-cluster-haproxy            ClusterIP       <none>        3306/TCP,3309/TCP,33062/TCP,33060/TCP   11m
minimal-cluster-haproxy-replicas   ClusterIP   <none>        3306/TCP                                11m
minimal-cluster-pxc                ClusterIP   None             <none>        3306/TCP,33062/TCP,33060/TCP            11m
minimal-cluster-pxc-unready        ClusterIP   None             <none>        3306/TCP,33062/TCP,33060/TCP            11m
mysql-test                         NodePort   <none>        3306:30504/TCP                          24s
percona-xtradb-cluster-operator    ClusterIP    <none>        443/TCP                                 12m

Unlike the rest of the services, which are only accessible inside the Kubernetes cluster (hence the ClusterIP type), our “mysql-test” service has the NodePort type, which means it is exposed on a port on the local node. In our case, port 3306, which is the standard port MySQL listens on, is mapped to port 30504.

Note that the IP mentioned here is an internal cluster IP and is not accessible from the outside. To find out which IP you should use, run:

# minikube service list
|  NAMESPACE  |               NAME               | TARGET PORT  |            URL            |
| default     | kubernetes                       | No node port |
| default     | minimal-cluster-haproxy          | No node port |
| default     | minimal-cluster-haproxy-replicas | No node port |
| default     | minimal-cluster-pxc              | No node port |
| default     | minimal-cluster-pxc-unready      | No node port |
| default     | mysql-test                       |         3306 | |
| default     | percona-xtradb-cluster-operator  | No node port |
| kube-system | kube-dns                         | No node port |
| kube-system | metrics-server                   | No node port |

Even though this is not an HTTP protocol service, the URL is quite helpful in showing us the IP and port we can use to access MySQL. 

As an alternative to running the “expose service” command, you can also set haproxy.serviceType=NodePort in the manifest.
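For illustration, the relevant part of the cluster custom resource might look like this (a sketch only; field placement follows the Percona XtraDB Cluster CR, and the rest of the spec is omitted):

```yaml
spec:
  haproxy:
    enabled: true
    # Expose HAProxy on a node port instead of a cluster-internal IP
    serviceType: NodePort
```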

Next, we need the MySQL password! To keep things secure, Percona Operator for MySQL does not have a default password but instead generates a unique secure password which is stored in Kubernetes Secrets. In fact, there are two Secrets created: one for internal use and another intended to be accessible by the user – this is the one we will use:

# kubectl get secrets
NAME                       TYPE     DATA   AGE
internal-minimal-cluster   Opaque   7      94m
minimal-cluster-secrets    Opaque   7      94m

Let’s look deeper into minimal-cluster Secrets:

# kubectl get secret minimal-cluster-secrets -o yaml
apiVersion: v1
data:
  clustercheck: a3VMZktDUUhLd1JjM3BJaw==
  monitor: ZTdoREJoQXNhdDRaWlB4eg==
  operator: SnZySWttRWFNNDRreWlvVlR5aA==
  proxyadmin: ZVcxTGpIQXNLT0NjemhxVEV3
  replication: OTEzY09mUnN2MjM1d3RlNUdOUA==
  root: bm1oemtOaDVRakxBcG05OUdQTw==
  xtrabackup: RHNTNkExYlhqNDkxZjdvY3k=
kind: Secret
metadata:
  creationTimestamp: "2022-07-07T18:33:03Z"
  name: minimal-cluster-secrets
  namespace: default
  resourceVersion: "1538"
  uid: 3ab006f1-01e2-41a7-b9f5-465aecb38a69
type: Opaque

We can see that Percona Operator for MySQL has created a number of users and stores their passwords in Kubernetes Secrets. Note that those values are base64 encoded.
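For instance, decoding the clustercheck value shown above is plain base64, nothing operator-specific:

```shell
# Decode one of the base64-encoded Secret values from the output above
echo 'a3VMZktDUUhLd1JjM3BJaw==' | base64 -d
# prints: kuLfKCQHKwRc3pIk
```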

Let’s now get all the MySQL connection information we discussed and store it in various variables:

export MYSQL_HOST=$(kubectl get nodes -o jsonpath="{.items[0].status.addresses[0].address}")
export MYSQL_TCP_PORT=$(kubectl get services/mysql-test -o go-template='{{(index .spec.ports 0).nodePort}}')
export MYSQL_PWD=`kubectl get secrets minimal-cluster-secrets  -ojson | jq -r .data.root | base64 -d`

We’re using these specific names because the “mysql” command line client (but not mysqladmin) will use those variables by default, making it quite convenient.

Let’s check if MySQL is accessible using those credentials:

# mysql -e "select 1"
+---+
| 1 |
+---+
| 1 |
+---+

Works like a charm!

Running Sysbench on MySQL on Kubernetes

Now that we know how to access MySQL, let’s create a test database and load some data:

# mysql -e "create database sbtest"
sysbench --db-driver=mysql --threads=4 --mysql-host=$MYSQL_HOST --mysql-port=$MYSQL_TCP_PORT --mysql-user=root --mysql-password=$MYSQL_PWD --mysql-db=sbtest /usr/share/sysbench/oltp_point_select.lua --report-interval=1 --table-size=1000000 prepare
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 1000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

And now we can run a test:

# sysbench --db-driver=mysql --threads=8 --mysql-host=$MYSQL_HOST --mysql-port=$MYSQL_TCP_PORT --mysql-user=root --mysql-password=$MYSQL_PWD --mysql-db=sbtest /usr/share/sysbench/oltp_point_select.lua --report-interval=1 --table-size=1000000 --time=60 run

Running the test with following options:
Number of threads: 8
Report intermediate results every 1 second(s)
Initializing random number generator from current time

Initializing worker threads...

Threads started!

[ 1s ] thds: 8 tps: 27299.11 qps: 27299.11 (r/w/o: 27299.11/0.00/0.00) lat (ms,95%): 0.52 err/s: 0.00 reconn/s: 0.00
[ 2s ] thds: 8 tps: 29983.13 qps: 29983.13 (r/w/o: 29983.13/0.00/0.00) lat (ms,95%): 0.46 err/s: 0.00 reconn/s: 0.00
[ 60s ] thds: 8 tps: 30375.00 qps: 30375.00 (r/w/o: 30375.00/0.00/0.00) lat (ms,95%): 0.42 err/s: 0.00 reconn/s: 0.00
SQL statistics:
    queries performed:
        read:                            1807737
        write:                           0
        other:                           0
        total:                           1807737
    transactions:                        1807737 (30127.61 per sec.)
    queries:                             1807737 (30127.61 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          60.0020s
    total number of events:              1807737

Latency (ms):
         min:                                    0.10
         avg:                                    0.26
         max:                                   30.91
         95th percentile:                        0.42
         sum:                               478882.64

Threads fairness:
    events (avg/stddev):           225967.1250/1916.69
    execution time (avg/stddev):   59.8603/0.00

Great, we can see our little test cluster can run around 30,000 queries per second!
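As a quick sanity check on those numbers (a throwaway calculation, not part of the walkthrough), dividing the total number of events by the total time from the sysbench summary reproduces the reported rate:

```shell
# total number of events / total time, taken from the sysbench output above
echo "1807737 60.0020" | awk '{printf "%.0f\n", $1 / $2}'
# prints: 30128
```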

Let’s make MySQL on Kubernetes highly available!

Let’s see if we can convert our single-node Percona XtraDB Cluster into a fake highly available cluster (fake, because with Minikube running on a single node, we’re not going to be protected from actual hardware failures).

To achieve this we can simply scale our cluster to three nodes:

# kubectl scale --replicas=3 pxc/minimal-cluster
perconaxtradbcluster.pxc.percona.com/minimal-cluster scaled

As usual with Kubernetes, “cluster scaled” here does not mean that the cluster actually got scaled, but that the new desired state was configured and the operator started working on scaling the cluster.

You can wait for a few minutes and then notice that one of the pods is stuck in the Pending state and does not look like it’s progressing:

# kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
minimal-cluster-haproxy-0                          2/2     Running   0          127m
minimal-cluster-pxc-0                              3/3     Running   0          127m
minimal-cluster-pxc-1                              0/3     Pending   0          3m31s
percona-xtradb-cluster-operator-79949dc46d-tdslj   1/1     Running   0          128m

To find out what’s going on we can describe the pod:

# kubectl describe pod minimal-cluster-pxc-1
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  5m26s  default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
  Warning  FailedScheduling  5m25s  default-scheduler  0/1 nodes are available: 1 node(s) didn't match pod anti-affinity rules. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
  Warning  FailedScheduling  21s    default-scheduler  0/1 nodes are available: 1 node(s) didn't match pod anti-affinity rules. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.

There is a lot of information in the output but for our purpose, it is the events section that is most important.

What the warning message here is saying is that only one node is available, while anti-affinity rules prevent scheduling more pods on this node.  This makes sense – in a real production environment, it would not be acceptable to schedule more than one Percona XtraDB Cluster pod on the same physical server, as this would end up being a single point of failure.

That is a production concern, though; for testing, we can allow running multiple pods on the same physical machine, as one physical machine is all we have. So we modify the resource definition by disabling anti-affinity protection:

pxc:
    size: 1
    image: percona/percona-xtradb-cluster:8.0.27-18.1
    affinity:
      antiAffinityTopologyKey: "none"

Let’s apply this modified configuration to the cluster.  (You can find this and another YAML file in the GitHub repository).

# kubectl apply -f 1-affinity.yaml
perconaxtradbcluster.pxc.percona.com/minimal-cluster configured

And try scaling it again:

# kubectl scale --replicas=3 pxc/minimal-cluster
perconaxtradbcluster.pxc.percona.com/minimal-cluster scaled

Instead of constantly checking for provisioning of all new nodes to be complete, we can watch as pods progress through initialization and check if there is any problem:

# kubectl get po -l --watch
NAME                    READY   STATUS    RESTARTS   AGE
minimal-cluster-pxc-0   3/3     Running   0          64s
minimal-cluster-pxc-1   0/3     Pending   0          0s
minimal-cluster-pxc-1   0/3     Pending   0          0s
minimal-cluster-pxc-1   0/3     Init:0/1   0          0s
minimal-cluster-pxc-1   0/3     PodInitializing   0          2s
minimal-cluster-pxc-1   2/3     Running           0          4s
minimal-cluster-pxc-1   3/3     Running           0          60s
minimal-cluster-pxc-2   0/3     Pending           0          0s
minimal-cluster-pxc-2   0/3     Pending           0          0s
minimal-cluster-pxc-2   0/3     Pending           0          2s
minimal-cluster-pxc-2   0/3     Init:0/1          0          2s
minimal-cluster-pxc-2   0/3     PodInitializing   0          3s
minimal-cluster-pxc-2   2/3     Running           0          6s
minimal-cluster-pxc-2   3/3     Running           0          62s

Great! We see two additional “pxc” pods were provisioned and running!

Setting up resource limits for your MySQL on Kubernetes

As of now, we have not specified any resource limits (besides persistent volume size) for our MySQL deployment. This means it will use all resources currently available on the host. While this might be the preferred way for a testing environment, for production you’re likely to be looking for both more predictable performance and more resource isolation, so that a single cluster can’t oversaturate the node, degrading the performance of other pods running on the same node.

To set resource limits, you need to add an appropriate section to the custom resource definition:

resources:
  requests:
    memory: 1G
    cpu: 200m
  limits:
    memory: 1G
    cpu: 500m

This resource definition means we’re requesting at least 1GB of memory and 0.2 CPU cores, and limiting the resources this instance will use to 0.5 CPU cores and the same 1GB of memory.

Note that if the “requests” conditions can’t be satisfied, i.e. if the required amount of CPU or memory is not available in the cluster, the pod will not be scheduled, waiting for resources to become available (possibly forever).
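Recall Kubernetes CPU units: the m suffix stands for millicores, one thousandth of a core, which is how the values above translate to fractions of a core:

```shell
# 200m and 500m expressed in CPU cores (millicores / 1000)
echo "200 500" | awk '{printf "%.1f %.1f\n", $1 / 1000, $2 / 1000}'
# prints: 0.2 0.5
```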

Let’s apply those limits to our cluster:

# kubectl apply -f 2-small-limits.yaml
perconaxtradbcluster.pxc.percona.com/minimal-cluster configured

If we look at what happens to the pods after we apply this command, we see:

# kubectl get po -l --watch
NAME                    READY   STATUS        RESTARTS   AGE
minimal-cluster-pxc-0   3/3     Running       0          3h52m
minimal-cluster-pxc-1   3/3     Running       0          3h49m
minimal-cluster-pxc-2   3/3     Terminating   0          3h48m
minimal-cluster-pxc-0   0/3     Init:0/1          0          1s
minimal-cluster-pxc-0   0/3     PodInitializing   0          2s
minimal-cluster-pxc-0   2/3     Running           0          4s
minimal-cluster-pxc-0   3/3     Running           0          30s

So, changing resource limits requires a cluster restart.

If you set resource limits, the operator will automatically set some of MySQL configuration parameters (for example max_connections) according to requests (resources guaranteed to be available).  If no resource constraints are specified, the operator will not perform any automatic configuration.

If we run the same Sysbench test again, we will see:

queries:                             274076 (4563.18 per sec.)

A much more modest result than running with no limits.
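Put another way, the 0.5-core limit cut throughput by roughly 6.6x (simple arithmetic on the two reported rates):

```shell
# unconstrained qps / limited qps, from the two sysbench runs above
echo "30127.61 4563.18" | awk '{printf "%.1f\n", $1 / $2}'
# prints: 6.6
```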

We can validate actual CPU usage by the pods as we run benchmarks by running:

# watch kubectl top pods
NAME                                               CPU(cores)   MEMORY(bytes)
minimal-cluster-haproxy-0                          277m         15Mi
minimal-cluster-pxc-0                              500m         634Mi
minimal-cluster-pxc-1                              8m           438Mi
minimal-cluster-pxc-2                              8m           443Mi
percona-xtradb-cluster-operator-79949dc46d-tdslj   6m           23Mi

We can observe a few interesting things here: 

  • The limit for Percona XtraDB Cluster looks well respected.
  • We can see only one of the Percona XtraDB Cluster nodes getting the load. This is expected, as HAProxy does not load balance queries among cluster nodes; it just ensures high availability. If you want to see queries load balanced, you can either use a separate port for load-balanced reads or deploy ProxySQL for intelligent query routing.
  • The HAProxy pod required about 50% of the resources of the Percona XtraDB Cluster pod (this is a worst-case scenario of very simple in-memory queries), so do not underestimate its resource needs!

Pausing and resuming MySQL on Kubernetes 

If you have a MySQL instance that you use for testing/development, or otherwise do not need running all the time, you can just stop the MySQL process if you are running on Linux, or stop the MySQL container if you’re running Docker.  How do you achieve the same in Kubernetes?

First, you need to be careful – if you delete the cluster instance, it will destroy both compute and storage resources and you will lose all the data.  Instead of deleting you need to pause the cluster. 

To pause the cluster, you can just add a pause option to the cluster custom resource definition:

pause: true
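In context, this option sits at the top level of the cluster spec; a minimal sketch (the rest of the spec stays as deployed):

```yaml
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBCluster
metadata:
  name: minimal-cluster
spec:
  # Shut down all cluster pods while keeping the storage intact
  pause: true
```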

Let’s apply the CR with this option enabled and see what happens:

# kubectl apply -f 4-paused.yaml
perconaxtradbcluster.pxc.percona.com/minimal-cluster configured

After the pause process is complete, we will not see any cluster pods running, just operator:

# kubectl get pod
NAME                                               READY   STATUS    RESTARTS   AGE
percona-xtradb-cluster-operator-79949dc46d-tdslj   1/1     Running   0          20h

However, you will still see the persistent volume claims, which hold the cluster data, present:

# kubectl get pvc
NAME                            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
datadir-minimal-cluster-pxc-0   Bound    pvc-866c58c9-65b8-4cb4-ba2c-7b904c0eecf0   6G         RWO            standard       95m
datadir-minimal-cluster-pxc-1   Bound    pvc-11f1ac22-caf0-4eb9-99a8-772167cf62bd   6G         RWO            standard       94m
datadir-minimal-cluster-pxc-2   Bound    pvc-c949cc2f-9b54-403d-ba32-a739abd5256d   6G         RWO            standard       93m

We can also see the cluster is in a paused state by this command:

# kubectl get pxc
NAME              ENDPOINT                          STATUS   PXC   PROXYSQL   HAPROXY   AGE
minimal-cluster   minimal-cluster-haproxy.default   paused                              97m

We can unpause the cluster by applying the same CR with the value  pause: false and the cluster will be back online in a couple of minutes:

# kubectl apply -f 5-resumed.yaml
# kubectl get pxc
NAME              ENDPOINT                          STATUS   PXC   PROXYSQL   HAPROXY   AGE
minimal-cluster   minimal-cluster-haproxy.default   ready    3                1         100m

MySQL on Kubernetes backup (and restore)

If you care about your data you need backups, and Percona Operator for MySQL provides quite a few backup features including scheduled backups, point-in-time recovery, backing up to S3 compatible storage, and many others.  In this walkthrough, though, we will look at the most simple backup and restore – taking a backup to persistent volume.

First, we need the cluster configured for backups; we need to configure the storage to which the backup will be performed, as well as provide an image for the container which will be responsible for managing backups:

backup:
    image: percona/percona-xtradb-cluster-operator:1.11.0-pxc8.0-backup
    storages:
      fs-pvc:
        type: filesystem
        volume:
          persistentVolumeClaim:
            accessModes: [ "ReadWriteOnce" ]
            resources:
              requests:
                storage: 6Gi

As before, we can apply a new configuration which includes this section:

kubectl apply -f 6-1-backup-config.yaml

While we can schedule backups to take place automatically, here we will focus on running backups manually. To do this, you can apply the following backup CR:

apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterBackup
metadata:
  name: backup1
spec:
  pxcCluster: minimal-cluster
  storageName: fs-pvc

It basically defines creating a backup named “backup1” for the cluster named “minimal-cluster” and storing it on the “fs-pvc” storage volume:

# kubectl apply -f 6-2-0-backup-exec.yaml
# kubectl get pxc-backups
NAME      CLUSTER           STORAGE   DESTINATION      STATUS      COMPLETED   AGE
backup1   minimal-cluster   fs-pvc    pvc/xb-backup1   Succeeded   2m14s       2m42s

After the backup job has successfully completed, we can see its status. 

Let’s now mess up our database before attempting to restore:

# mysql -e "drop database sbtest"

The restore happens similarly to the backup, by running a restore job on the cluster with the following configuration:

apiVersion: "pxc.percona.com/v1"
kind: "PerconaXtraDBClusterRestore"
metadata:
  name: "restore1"
spec:
  pxcCluster: "minimal-cluster"
  backupName: "backup1"

We can do this by applying the configuration and checking its status:

# kubectl apply -f 6-3-1-restore-exec.yaml
# kubectl get pxc-restore
NAME       CLUSTER           STATUS      COMPLETED   AGE
restore1   minimal-cluster   Restoring               61s

What if you mess up the database again (not uncommon in a development environment) and want to restore the same backup again?

If you just run restore again, it will not work.

# kubectl apply -f 6-3-1-restore-exec.yaml
perconaxtradbclusterrestore.pxc.percona.com/restore1 unchanged

Because a job with the same name already exists in the system, you can either create a restore job with a different name or delete the restore object and run it again:

# kubectl delete pxc-restore restore1
perconaxtradbclusterrestore.pxc.percona.com "restore1" deleted
# kubectl apply -f 6-3-1-restore-exec.yaml
perconaxtradbclusterrestore.pxc.percona.com/restore1 created

Note: while deleting the restore job does not affect backup data or a running cluster, removing a “backup” object removes the data stored in that backup, too:

# kubectl delete pxc-backup backup1
perconaxtradbclusterbackup.pxc.percona.com "backup1" deleted

Monitoring your deployment with Percona Monitoring and Management (PMM)

Your MySQL deployment would not be complete without setting it up with monitoring, and in Percona Operator for MySQL, Percona Monitoring and Management (PMM) support is built in and can be enabled quite easily.

According to Kubernetes best practices, credentials to access PMM are not stored in the resource definition, but rather in secrets. 

As recommended, we’re going to use PMM API Keys instead of a user name and password.  We can get them in the API Keys configuration section:

API Keys configuration section

After we’ve created the API Key we can store it in our cluster’s secret:

# export PMM_API_KEY="<KEY>"
# kubectl patch secret/minimal-cluster-secrets -p '{"data":{"pmmserverkey": "'$(echo -n $PMM_API_KEY | base64 -w0)'"}}'
secret/minimal-cluster-secrets patched
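The -w0 flag in the patch command above matters: GNU base64 wraps its output at 76 characters by default, and a wrapped value would corrupt the Secret. With a hypothetical key (not a real one):

```shell
# -w0 disables line wrapping; echo -n avoids encoding a trailing newline
echo -n "example-api-key" | base64 -w0
# prints: ZXhhbXBsZS1hcGkta2V5
```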

Next, we need to add the PMM configuration section to the cluster custom resource, specifying the server where our PMM server is deployed.  

pmm:
    enabled: true
    image: percona/pmm-client:2.28.0
    resources:
      requests:
        memory: 150M
        cpu: 300m

Apply complete configuration:

kubectl apply -f 7-2-pmm.yaml

Once deployment is complete, you should see that the MySQL/PXC and HAProxy instance statistics are visible in the Percona Monitoring and Management instance you specified:

Percona Monitoring and Management

Note, as of this writing (Percona Monitoring and Management 2.28), PMM is not fully Kubernetes aware, so it will report CPU and memory usage in the context of the node, not what has been made available for a specific pod.

Configuring MySQL on Kubernetes

So far we’ve been running MySQL (Percona XtraDB Cluster) with the default settings, and even though Percona Operator for MySQL automatically configures some of the options for optimal performance, chances are you will want to change some of the configuration options.

Percona Operator for MySQL provides multiple ways to change options, and we will focus on what I consider the most simple and practical – including them in the CR definition.

Simply add a configuration section:

configuration: |

Its contents will be added to your MySQL configuration file, overriding any default values or any adjustments the operator made during deployment.
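As an illustration (the option names and values here are our own examples, not taken from the original manifest), such a section could look like:

```yaml
pxc:
  configuration: |
    [mysqld]
    # Example overrides; pick values appropriate for your workload
    max_connections=250
    innodb_log_file_size=256M
```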

As usual, you can apply a new configuration to see those applied by running kubectl apply -f 8-config.yaml.

If you ever misspell a configuration option name, you might wonder what would happen.

The operator will try applying the configuration to one of the nodes, which will fail, with that node repeatedly crashing and restarting; the cluster, however, will remain available in a degraded state.

# kubectl get pods
NAME                                               READY   STATUS             RESTARTS      AGE
minimal-cluster-haproxy-0                          3/3     Running            0             56m
minimal-cluster-pxc-0                              4/4     Running            0             17m
minimal-cluster-pxc-1                              4/4     Running            0             28m
minimal-cluster-pxc-2                              3/4     CrashLoopBackOff   6 (85s ago)   11m
percona-xtradb-cluster-operator-79949dc46d-tdslj   1/1     Running            0             24h
restore-job-restore1-minimal-cluster-lp4jg         0/1     Completed          0             165m

To fix the issue, just correct the configuration error and reapply.

Accessing MySQL logs

To troubleshoot your MySQL on Kubernetes deployment you may need to access MySQL logs. Logs are provided in the “logs” container in each PXC node pod. (More info) To access them, you can use something like:

# kubectl logs minimal-cluster-pxc-0 -c logs  --tail 5
{"log":"2022-07-08T19:08:36.664908Z 0 [Note] [MY-000000] [Galera] (70d46e6b-9729, 'tcp://') turning message relay requesting off\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2022-07-08T19:08:36.696000Z 0 [Note] [MY-000000] [Galera] Member 0.0 (minimal-cluster-pxc-2) requested state transfer from 'minimal-cluster-pxc-1,'. Selected 2.0 (minimal-cluster-pxc-1)(SYNCED) as donor.\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2022-07-08T19:08:37.000623Z 0 [Note] [MY-000000] [Galera] 2.0 (minimal-cluster-pxc-1): State transfer to 0.0 (minimal-cluster-pxc-2) complete.\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2022-07-08T19:08:37.000773Z 0 [Note] [MY-000000] [Galera] Member 2.0 (minimal-cluster-pxc-1) synced with group.\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2022-07-08T19:08:37.881614Z 0 [Note] [MY-000000] [Galera] 0.0 (minimal-cluster-pxc-2): State transfer from 2.0 (minimal-cluster-pxc-1) complete.\n","file":"/var/lib/mysql/mysqld-error.log"}

Note: as you can see, logs are provided in JSON format and can be processed with the “jq” command if needed.

Deleting the MySQL deployment

Deleting your deployed cluster is very easy, so remember: with great power comes great responsibility. Just one command will destroy the entire cluster, with all its data:

# kubectl delete pxc minimal-cluster
perconaxtradbcluster.pxc.percona.com "minimal-cluster" deleted

Note: this only applies to the PXC cluster; any backups taken from the cluster are considered separate objects and will not be deleted by this operation.

Protecting PVCs from deletion

This might NOT be good enough for your production environment, and you may want more protection for your data from accidental loss; Percona Operator for MySQL allows that.  If the finalizers.delete-pxc-pvc finalizer is NOT configured, persistent volume claims (PVCs) are not going to be deleted after cluster deletion, so re-creating a cluster with the same name will allow you to recover your data. If this is the route you take, remember to have your own process to manage PVCs – otherwise, you may see massive space usage from old clusters.
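For reference, the finalizer lives in the cluster metadata like any Kubernetes finalizer; a sketch (check the Operator documentation before enabling it in production, since with it cluster deletion also deletes the PVCs):

```yaml
metadata:
  name: minimal-cluster
  finalizers:
    # When present, deleting the cluster also deletes its PVCs (and the data)
    - delete-pxc-pvc
```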


I hope this walkthrough was helpful to introduce you to the power of Percona Operator for MySQL and what it looks like to run MySQL on Kubernetes.  I would encourage you not to end with a read but to deploy your own Minikube and play around with the scenarios mentioned in this blog.  I’ve included all the scripts (and more) in the GitHub repository, and by practicing rather than reading alone you can learn much faster!


Percona Operator for MySQL Supports Group Replication


There are two Operators at Percona to deploy MySQL on Kubernetes:

We wrote a blog post in the past explaining the thought process and reasoning behind creating the new Operator for MySQL. The goal for us is to provide production-grade solutions to run MySQL in Kubernetes and support various replication configurations:

  • Synchronous replication
    • with Percona XtraDB Cluster
    • with Group Replication
  • Asynchronous replication

With the latest 0.2.0 release of Percona Operator for MySQL (based on Percona Server for MySQL), we have added Group Replication support. In this blog post, we will briefly review the design of our implementation and see how to set it up. 


This is a high-level design of running MySQL cluster with Group Replication:

MySQL cluster with Group Replication

MySQL Router acts as an entry point for all requests and routes the traffic to the nodes. 

This is a deeper look at how the Operator deploys these components in Kubernetes:
kubernetes deployment

Going from right to left:

  1. StatefulSet to deploy a cluster of MySQL nodes with Group Replication configured. Each node has its storage attached to it.
  2. Deployment object for stateless MySQL Router. 
  3. Deployment is exposed with a Service. We use various TCP ports here:
    1. MySQL Protocol ports
      1. 6446 – read/write, routing traffic to Primary node
      2. 6447 – read-only, load-balancing the traffic across Replicas 
    2. MySQL X Protocol – can be useful for CRUD operations, ex. asynchronous calls. Ports follow the same logic:
      1. 6448 – read/write
      2. 6449 – read-only 


Prerequisites: you need a Kubernetes cluster. Minikube would do.

The files used in this blog post can be found in this Github repo.

Deploy the Operator

kubectl apply --server-side -f



Note the --server-side flag; without it, you will get the error:

The CustomResourceDefinition "" is invalid: metadata.annotations: Too long: must have at most 262144 bytes

Our Operator follows the OpenAPIv3 schema to have proper validation. This unfortunately increases the size of our Custom Resource Definition manifest and, as a result, requires us to use server-side apply.

Deploy the Cluster

We are ready to deploy the cluster now:

kubectl apply -f

I created this Custom Resource manifest specifically for this demo. The important variables to note:

  1. Line 10:
    clusterType: group-replication

    – instructs Operator that this is going to be a cluster with Group Replication.

  2. Lines 31-47: are all about MySQL Router. Once Group Replication is enabled, the Operator will automatically deploy the router. 

Get the status

The best way to see if the cluster is ready is to check the Custom Resource state:

$ kubectl get ps
my-cluster   group-replication   ready   18m

As you can see, it is ready. You can also see other states here, such as initializing if the cluster is still not ready, or error if something went wrong.

Here you can also see the endpoint you can connect to. In our case, it is the public IP address of the load balancer. As described in the design section above, there are multiple ports exposed:

$ kubectl get service my-cluster-router
NAME                TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)                                                       AGE
my-cluster-router   LoadBalancer   6446:30852/TCP,6447:31694/TCP,6448:31515/TCP,6449:31686/TCP   18h

Connect to the Cluster

To connect we will need the user first. By default, there is a root user with a randomly generated password. The password is stored in the Secret object. You can always fetch the password with the following command:

$ kubectl get secrets my-cluster-secrets -ojson | jq -r .data.root | base64 -d

I’m going to use port 6446, which would grant me read/write access and lead me directly to the Primary node through MySQL Router:

mysql -u root -p -h --port 6446
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 156329
Server version: 8.0.28-19 Percona Server (GPL), Release 19, Revision 31e88966cd3

Copyright (c) 2000, 2022, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

Group Replication in action

Let’s see if Group Replication really works. 

List the members of the cluster by running the following command:

$ mysql -u root -p -P 6446 -h -e 'SELECT member_host, member_state, member_role FROM performance_schema.replication_group_members;'
| member_host                               | member_state | member_role |
| | ONLINE       | PRIMARY     |
| | ONLINE       | SECONDARY   |
| | ONLINE       | SECONDARY   |

Now we will delete one Pod (MySQL node), which also happens to have a Primary role, and see what happens:

$ kubectl delete pod my-cluster-mysql-0
pod "my-cluster-mysql-0" deleted

$ mysql -u root -p -P 6446 -h -e 'SELECT member_host, member_state, member_role FROM performance_schema.replication_group_members;'
| member_host                               | member_state | member_role |
| | ONLINE       | PRIMARY     |
| | ONLINE       | SECONDARY   |

One node is gone, as expected. Another node got promoted to the Primary role. I’m still using port 6446 and the same host to connect to the database, which indicates that MySQL Router is doing its job.

After some time Kubernetes will recreate the Pod and the node will join the cluster again automatically:

$ mysql -u root -p -P 6446 -h -e 'SELECT member_host, member_state, member_role FROM performance_schema.replication_group_members;'
| member_host                               | member_state | member_role |
| | ONLINE       | PRIMARY     |
| | ONLINE       | SECONDARY   |

The recovery phase might take some time, depending on the data size and amount of the changes, but eventually, it will come back ONLINE:

$ mysql -u root -p -P 6446 -h -e 'SELECT member_host, member_state, member_role FROM performance_schema.replication_group_members;'
| member_host                               | member_state | member_role |
| | ONLINE       | SECONDARY   |
| | ONLINE       | PRIMARY     |
| | ONLINE       | SECONDARY   |

What’s coming up next?

Some exciting capabilities and features that we are going to ship pretty soon:

  • Backup and restore support for clusters with Group Replication
    • We have backups and restores in the Operator, but they currently do not work with Group Replication
  • Monitoring of MySQL Router in Percona Monitoring and Management (PMM)
    • Even though the Operator integrates nicely with PMM, it is possible to monitor MySQL nodes only, but not MySQL Router.
  • Automated Upgrades of MySQL and database components in the Operator
    • We have it in all other Operators and it is just logical to add it here

Percona is an open source company and we value our community and contributors. You are greatly encouraged to contribute to Percona Software. Please read our Contributions guide and visit our community webpage.


Percona Operator Volume Expansion Without Downtime


There are several ways to manage storage in Percona Kubernetes Operators: Persistent Volume (PV), hostPath, ephemeral storage, etc. Using PVs, which are provisioned by the Operator through Storage Classes and Persistent Volume Claims, is the most popular choice for our users. And one of the most popular questions is how to scale the Operator's storage when it is based on PVs. To make PV resizing easier, the volume expansion feature was introduced as an alpha feature in Kubernetes 1.8 and eventually became stable in 1.24.

 In this blog post, I will show you how to easily increase the storage size in Percona Operators using this feature without any database downtime and explain what to do if your storage class doesn’t support volume expansion. 

Scale-up persistent volume claim (PVC) by volume expansion

  • You can only resize a PVC if the storage class of the PVC has the allowVolumeExpansion: true option set: 
kubectl describe sc <storage class name> | grep allowVolumeExpansion

  • Ensure that the delete PVC finalizer (delete-pxc-pvc/delete-psmdb-pvc) is not set in the custom resource, otherwise all cluster data can be lost. 
  • Please refer to the volume expansion documentation for the in-tree volume types which support volume expansion.
  • If the underlying storage driver only supports offline expansion, users of the PVC must take down their Pod before expansion can succeed. Please refer to the documentation of your storage provider to understand which modes of volume expansion are supported. 

Please note that expanding EBS volumes is a time-consuming operation. Also, there is a per-volume quota of one modification every six hours. 
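For reference, volume expansion is permitted by a top-level flag on the StorageClass object itself. A minimal sketch (the class name and provisioner below are assumptions, not from the original post):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable-sc        # hypothetical name
provisioner: ebs.csi.aws.com # hypothetical provisioner
allowVolumeExpansion: true   # the flag the grep above checks for
```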

Percona Operator for MongoDB/Percona Operator for MySQL

Under the hood, Percona Operator for MongoDB and Percona Operator for MySQL based on Percona XtraDB Cluster use StatefulSets, so the volume expansion task for the Operator comes down to resizing the corresponding StatefulSet persistent volumes. 

Resizing a PV claim by changing the custom resource or StatefulSet (does not work)

Only a handful of specific StatefulSet fields can be modified after creation. Altering the storage size in the Operator custom resource or StatefulSet leads to the error below:

StatefulSet.apps \"my-cluster-name-rs0\" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 
'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden"

Expanding storage by modifying the persistent volume claim (PVC)

1. Change PVC size: 

kubectl patch pvc <pvc-name>  -n <pvc-namespace> -p '{ "spec": { "resources": { "requests": { "storage": "NEW STORAGE VALUE" }}}}'

2. After the process is finished, you will see the message below in the PVC description: 

kubectl describe pvc <pvc-name>
Normal  FileSystemResizeSuccessful  3s    kubelet  MountVolume.NodeExpandVolume succeeded for volume "pvc-7ed0ba5c-cc79-42d4-a4b3-xxxxxxxxxxxx"

If you see the following event in the PVC description (and it does not change), you need to restart the Pods to finish resizing (see further notes):

FileSystemResizePending   True    Mon, 01 Jan 0001 00:00:00 +0000   Thu, 23 Jun 2022 19:24:50 +0200 Waiting for user to (re-)start a pod to finish 
file system resize of volume on node.

3. When the PVC size has changed, you can verify that the other objects changed too:

kubectl get pvc

kubectl get pv

kubectl exec <pod-name> -- lsblk

4. Update storage size in corresponding operator custom resource (Percona Operator for MongoDB or Percona Operator for MySQL based on Percona XtraDB Cluster).
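As a sketch of step 4, assuming Percona Operator for MongoDB, the storage size lives under the volumeSpec of the replica set in cr.yaml (the replica set name and size below are illustrative):

```yaml
spec:
  replsets:
  - name: rs0
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 20Gi   # the new, larger size
```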

So now we have scaled storage but old storage values in the corresponding StatefulSets. As we saw above, we can apply the custom resource with new storage values, but the storage size field can't be changed for StatefulSet objects. That is why we need to recreate the StatefulSet with new values. To avoid downtime, we delete the StatefulSet without deleting the Pods (the --cascade=orphan flag).

kubectl delete sts <statefulset-name> --cascade=orphan

5. Apply cr.yml.

kubectl apply -f deploy/cr.yml

6. Delete StatefulSet pods one by one (optional).

According to the Kubernetes official documentation:

If your storage provider supports online expansion then no Pod restart should be necessary for volume expansion to finish.

Percona Operator for PostgreSQL (pg operator)

In contrast to Percona Operator for MongoDB and Percona Operator for MySQL based on Percona XtraDB Cluster, which use partially modifiable StatefulSet objects, our Percona Operator for PostgreSQL uses Deployments instead of StatefulSets for cluster objects. A Deployment is a mutable object and can be changed not only at cluster start but also on a running cluster. That is why a change of storage size in the custom resource is applied to the running cluster right away. Changes in the custom resource not only resize the volume but also initiate a Pod restart. 

Moreover, in order to maintain a safe cluster configuration, the Operator keeps the primary and replica volume sizes exactly equal. Hence, if the size of the primary is changed, the size of the replicas is changed too, and both primary and replica Pods are restarted. Conversely, if you modify the replica size first, only the replica PVCs are expanded and the replica Pods are recreated. If you then increase the primary size to the same value as the replicas, only the primary PVC is scaled (the replica PVCs already have the necessary value), but both replica and primary Pods are restarted.

So you can end up in a situation where both primary and replica PVCs have changed size even though only the storage size for the primary was increased in the custom resource.

You need to be careful and keep your custom resource file up-to-date. 

Resizing persistent volume claim (PVC) by transferring data to new PVC

Volume expansion is a great Kubernetes feature that provides an easy and effective way to resize a PVC. However, some storage classes do not support PVC volume expansion. For these classes, you must create a new PVC and move the content to it. You can do it with the following steps:

1. Configure PVC size in custom resource for the corresponding operator (Percona Operator for MongoDB or Percona Operator for MySQL based on Percona XtraDB Cluster) and apply it. 

kubectl apply -f deploy/cr.yaml

2. Delete the StatefulSet. Like in the previous part, we need to recreate the StatefulSet in order to apply the storage changes. 

kubectl delete sts <statefulset-name> --cascade=orphan

As a result, the Pods stay up and the Operator recreates the StatefulSet with the new volume size.

3. Scale up the cluster (optional).

Changing the storage size requires us to terminate the Pods, which can induce performance degradation. To preserve performance during the operation, we can scale up the cluster using these instructions: Percona Operator for MongoDB or Percona Operator for MySQL based on Percona XtraDB Cluster. As long as we have changed the StatefulSet already, the new Pods will be provisioned with increased volumes. 

4. Reprovision the Pods one by one to change the storage.

This is the step where underlying storage is going to be changed for the database Pods.

Delete the PVC of the Pod that you are going to reprovision.

kubectl delete pvc <pvc-name>

The PVC will not be deleted right away as there is a Pod using it. To proceed, delete the Pod:

kubectl delete pod <pod-name>

The Pod will be deleted along with its PVC. The StatefulSet controller will notice that the Pod is gone and will recreate it along with a new, expanded PVC.

Check PVC size:

kubectl get pvc

NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mongod-data-minimal-cluster-rs1-1   Bound    pvc-e9b494fb-f201-44b9-9493-ea6faa903ddc   4Gi        RWO            standard       3m3s
mongod-data-minimal-cluster-rs1-2   Bound    pvc-6236c7e1-9670-49a8-9928-0dd24708588c   3Gi        RWO            standard       144m

The CAPACITY column indicates that this PVC is increased.

Once the Pod is up, the data is synced from the other nodes. The data transfer speed depends on the amount of data in the cluster and on cluster utilization, so synchronization might take a while. Please wait until the node is fully up and running and the sync is finished, and only then proceed to the next Pod.

5. Scale down the cluster and clean up unnecessary PVCs. 


We are constantly introducing new features in the Percona Kubernetes Operators. One of them is automated volume expansion, which will be implemented in a future release. We also plan to add the possibility to change the storage size directly in the custom resource, which would make it possible to add to the Private DBaaS feature in Percona Monitoring and Management (PMM) the detection of volume consumption and automatic scaling of the cluster. If you have any suggestions on the topic and are willing to collaborate with us, please submit an issue to the Community Forum or JIRA.


Percona Operator for MongoDB and LDAP: How Hard Can it Be?


As a member of the Percona cloud team, I run a lot of databases in Kubernetes, and our Percona Operator for MongoDB in particular. Most of us feel comfortable with rather small and tidy infrastructure, which is easy to control and operate. However, the natural growth of the organization brings a lot of challenges to be tackled, especially in the context of access management. We've all been there – every new account requires platform-specific effort, adding to the operational cost of your infrastructure support. Such a burden can be lifted by LDAP-based software like OpenLDAP, Microsoft Active Directory, etc., acting as a source of truth for the authentication/authorization process. Let's dive into the details of how Percona Distribution for MongoDB managed by the Kubernetes Operator can be connected to an LDAP server.


Our scenario is based on the integration of the OpenLDAP server and Percona Distribution for MongoDB and the corresponding Kubernetes Operator. We are going to keep the setup as simple as possible thus no complicated domain relationships will be listed here.


On the OpenLDAP side the following settings may be used:

0-percona-ous.ldif: |-
   dn: ou=perconadba,dc=ldap,dc=local
   objectClass: organizationalUnit
   ou: perconadba
 1-percona-users.ldif: |-
   dn: uid=percona,ou=perconadba,dc=ldap,dc=local
   objectClass: top
   objectClass: account
   objectClass: posixAccount
   objectClass: shadowAccount
   cn: percona
   uid: percona
   uidNumber: 1100
   gidNumber: 100
   homeDirectory: /home/percona
   loginShell: /bin/bash
   gecos: percona
   userPassword: {crypt}x
   shadowLastChange: -1
   shadowMax: -1
   shadowWarning: -1

Also, a read-only user should be created for database-issued user lookups.
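A minimal sketch of such a read-only bind user in LDIF, assuming the cn=readonly DN that the mongod configuration below refers to (the object classes and password are illustrative):

```ldif
dn: cn=readonly,dc=ldap,dc=local
objectClass: organizationalRole
objectClass: simpleSecurityObject
cn: readonly
userPassword: password
description: read-only account for database-issued user lookups
```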

If everything is done correctly, the following command should work:

$ ldappasswd -s percona -D "cn=admin,dc=ldap,dc=local" -w password -x "uid=percona,ou=perconadba,dc=ldap,dc=local"


Percona Operator for MongoDB will do all the work inside Kubernetes. You can use any supported platform from the System Requirements.

In order to get MongoDB connected with OpenLDAP we need to configure both:

  • Mongod
  • Internal mongodb role

As for mongod you may use the following configuration snippet:

security:
  authorization: "enabled"
  ldap:
    authz:
      queryTemplate: 'ou=perconadba,dc=ldap,dc=local??sub?(&(objectClass=group)(uid={USER}))'
    servers: "openldap"
    transportSecurity: none
    bind:
      queryUser: "cn=readonly,dc=ldap,dc=local"
      queryPassword: "password"
    userToDNMapping:
      '[
          {
            match: "(.+)",
            ldapQuery: "OU=perconadba,DC=ldap,DC=local??sub?(uid={0})"
          }
       ]'
setParameter:
  authenticationMechanisms: 'PLAIN,SCRAM-SHA-1'

The internals of the snippet are a topic for another blog post, though. Basically, we are providing mongod with LDAP-specific parameters like the domain name of the LDAP server ('servers'), an explicit lookup user, domain mapping rules, etc. 

Put the snippet on your local machine and create a Kubernetes Secret object named after the MongoDB cluster name from Install Percona Server for MongoDB on Kubernetes. 

$ kubectl create secret generic cluster1-rs0-mongod --from-file=mongod.conf=<path-to-mongod-ldap-configuration>

Percona Operator for MongoDB is able to pass through a custom configuration from these Kubernetes objects: Custom Resource, ConfigMap, Secret. Since we are to use some security-sensitive information, we’ve picked the Kubernetes Secret.

The next step is to start the MongoDB cluster up as it’s described in Install Percona Server for MongoDB on Kubernetes.

On successful completion of the steps from the doc, we are to proceed with setting the LDAP users’ roles inside the MongoDB. Let’s log in to MongoDB as administrator and execute:

var admin = db.getSiblingDB("admin")
admin.createRole(
  {
    role: "ou=perconadba,dc=ldap,dc=local",
    privileges: [],
    roles: [ "userAdminAnyDatabase" ]
  }
)

Now our percona user created inside OpenLDAP is able to log in to MongoDB as an administrator. Please replace <mongodb-rs-endpoint> with a valid replica set domain name.

$ mongo --username percona --password 'percona' --authenticationMechanism 'PLAIN' --authenticationDatabase '$external' --host <mongodb-rs-endpoint> --port 27017


At the moment of writing this blog post, Percona Operator for MongoDB supports only unencrypted transport between MongoDB and LDAP servers. We'll do our best to bring such a feature in the future; in the meantime, feel free to use it in security-relaxed scenarios.


This very blog post describes only one possible case out of a huge variety of possible combinations. It’s rather simple and does not bring for discussion topics like TLS secured transport setup, directory design, etc. We encourage you to try the MongoDB LDAP integration described for proof of concept, development environment setup, etc., where security concerns are not so demanding. Don’t hesitate to bring more complexity with sophisticated directory structures, user privileges schemes, etc. If you find your own configuration interesting and worth sharing with the community, please visit our Percona Community Forum. We’ll be glad to check your outcomes. 

Have fun with Percona Operator for MongoDB!


Percona Monitoring and Management in Kubernetes is now in Tech Preview


Over the course of the years, we have seen growing interest in running databases and stateful workloads in Kubernetes. With Container Storage Interfaces (CSI) maturing and more and more Operators appearing, running stateful workloads on your favorite platform is not that scary anymore. Our Kubernetes story at Percona started with Operators for MySQL and MongoDB, adding PostgreSQL later on. 

Percona Monitoring and Management (PMM) is an open source database monitoring, observability, and management tool. It can be deployed as a virtual appliance or a Docker container. Our customers have been asking us for a long time to provide a way to deploy PMM in Kubernetes. We had an unofficial helm chart which was created as a PoC by Percona teams and the community (GitHub).

We are introducing the Technical Preview of the helm chart, supported by Percona, to easily deploy PMM in Kubernetes. You can find it in our helm chart repository.

Use cases

Single platform

If Kubernetes is the platform of your choice, until now you needed a separate virtual machine to run Percona Monitoring and Management. Not anymore, with the introduction of the helm chart. 

As you know, Percona Operators all have integration with PMM, which enables monitoring for databases deployed on Kubernetes. The Operators configure and deploy a pmm-client sidecar container and register the nodes on a PMM server. Bringing PMM into Kubernetes simplifies this integration and the whole flow: now the network traffic between pmm-client and the PMM server does not leave the Kubernetes cluster at all.

All you have to do is to set the correct endpoint in the pmm section of the Custom Resource manifest. For example, for Percona Operator for MongoDB, the pmm section will look like this:

  pmm:
    enabled: true
    image: percona/pmm-client:2.28.0
    serverHost: monitoring-service

Where monitoring-service is the service created by a helm chart to expose a PMM server.

High availability

Percona Monitoring and Management has lots of moving parts: Victoria Metrics to store time-series data, ClickHouse for query analytics functionality, and PostgreSQL to keep the PMM configuration. Right now all these components are part of a single container or virtual machine, with Grafana as a frontend. To provide a zero-downtime deployment in any environment, we would need to decouple these components, and that is going to substantially complicate the installation and management of PMM.

What we offer instead right now are ways to automatically recover PMM in case of failure within minutes (for example leveraging the EC2 self-healing mechanism).

Kubernetes is a control plane for container orchestration that automates manual tasks for engineers. When you run software in Kubernetes it is best if you rely on its primitives to handle all the heavy lifting. This is what PMM looks like in Kubernetes:

PMM in Kubernetes

  • StatefulSet controls the Pod creation
  • There is a single Pod with a single container with all the components in it
  • This Pod is exposed through a Service that is utilized by PMM Users and pmm-clients
  • All the data is stored on a persistent volume
  • ConfigMap has various environment variable settings that can help to fine-tune PMM

In case of a node or a Pod failure, the StatefulSet is going to recover the PMM Pod automatically and remount the Persistent Volume to it. The recovery time depends on the load of the cluster and node availability, but in normal operating environments, the PMM Pod is up and running again within a minute.


Let’s see how PMM can be deployed in Kubernetes.

Add the helm chart:

helm repo add percona
helm repo update

Install PMM:

helm install pmm --set service.type="LoadBalancer" percona/pmm

You can now log into PMM using the LoadBalancer IP address and a randomly generated password stored in a Secret object (the default user is admin).

The Service object created for PMM is called monitoring-service:
$ kubectl get services monitoring-service
NAME                 TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)         AGE
monitoring-service   LoadBalancer   443:32591/TCP   3m34s

$ kubectl get secrets pmm-secret -o yaml
apiVersion: v1

$ echo 'LE5lSTx3IytrUWBmWEhFTQ==' | base64 --decode && echo

Log in to PMM by connecting to https://<YOUR_PUBLIC_IP>.
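The password stored in the Secret is base64-encoded. As a quick sketch, you can decode the example value from the output above locally with any base64 tool:

```shell
# Decode a base64-encoded Secret value locally.
# The encoded string below is the example value from the output above;
# substitute the value from your own pmm-secret.
encoded='LE5lSTx3IytrUWBmWEhFTQ=='
printf '%s' "$encoded" | base64 --decode && echo
```

In practice you would extract the value from the Secret with kubectl and pipe it through base64 --decode in the same way.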


A Helm chart is a template engine for YAML manifests, and it allows users to customize the deployment. You can see the various parameters that you can set to fine-tune your PMM installation in our README.

For example, to choose another storage class and set the desired storage size, set the following flags:

helm install pmm percona/pmm \
--set storage.storageClassName="premium-rwo" \
--set storage.size="20Gi"

You can also change these parameters in values.yaml and use the "-f" flag:

# values.yaml contents
storage:
  storageClassName: "premium-rwo"
  size: 20Gi

helm install pmm percona/pmm -f values.yaml


For most of the maintenance tasks, regular Kubernetes techniques would apply. Let’s review a couple of examples.

Compute scaling

It is possible to vertically scale PMM by adding or removing resources through the resources variable in values.yaml:

resources:
  requests:
    memory: "4Gi"
    cpu: "2"
  limits:
    memory: "8Gi"
    cpu: "4"

Once done, upgrade the deployment:

helm upgrade -f values.yaml pmm percona/pmm

This will restart the PMM Pod, so plan it carefully so as not to disrupt your team's work.

Storage scaling

This depends a lot on your storage interface capabilities and the underlying implementation. In most clouds, Persistent Volumes can be expanded. You can check if your storage class supports it by describing it:

kubectl describe storageclass standard
AllowVolumeExpansion:  True

Unfortunately, just changing the size of the storage in values.yaml (storage.size) will not do the trick and you will see the following error:

helm upgrade -f values.yaml pmm percona/pmm
Error: UPGRADE FAILED: cannot patch "pmm" with kind StatefulSet: StatefulSet.apps "pmm" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden

We use a StatefulSet object to deploy PMM. StatefulSets are mostly immutable, and only a handful of fields can be changed on the fly. There is a trick, though.

First, delete the StatefulSet, but keep the Pods and PVCs:

kubectl delete sts pmm --cascade=orphan

Recreate it again with the new storage configuration:

helm upgrade -f values.yaml pmm percona/pmm

It will recreate the StatefulSet with the new storage size configuration. 

Edit the Persistent Volume Claim manually and change the storage size in the spec.resources.requests section (the name of the PVC can be different for you):

kubectl edit pvc pmm-storage-pmm-0

The PVC is not resized yet and you can see the following message when you describe it:

kubectl describe pvc pmm-storage-pmm-0
  Type                      Status  LastProbeTime                     LastTransitionTime                Reason  Message
  ----                      ------  -----------------                 ------------------                ------  -------
  FileSystemResizePending   True    Mon, 01 Jan 0001 00:00:00 +0000   Thu, 16 Jun 2022 11:55:56 +0300           Waiting for user to (re-)start a pod to finish file system resize of volume on node.

The last step would be to restart the Pod:

kubectl delete pod pmm-0


Running helm upgrade is the recommended way to upgrade, either once a new helm chart is released or when you want to move to a newer version of PMM by replacing the image in the image section. 

Backup and restore

PMM stores all the data on a Persistent Volume. As mentioned before, regular Kubernetes techniques can be applied here to back up and restore the data. There are numerous options:

  • Volume Snapshots – check if it is supported by your CSI and storage implementation
  • Third-party tools, like Velero, can handle the backups and restores of PVCs
  • Snapshots provided by your storage (ex. AWS EBS Snapshots) with manual mapping to PVC during restoration

What is coming

To keep you excited there are numerous things that we are working on or have planned to enable further Kubernetes integrations.

OpenShift support

We are working on building a rootless container so that OpenShift users can run Percona Monitoring and Management there without having to grant elevated privileges. 

Microservices architecture

This is something that we have been discussing internally for some time now. As mentioned earlier, there are lots of components in PMM. To enable proper horizontal scaling, we need to break down our monolith container and run these components as separate microservices. 

Managing all these separate containers and Pods (if we talk about Kubernetes), would require coming up with separate maintenance strategies. This brings up the question of creating a separate Operator for PMM only to manage all this, but it is a significant effort. If you have an opinion here – please let us know on our community forum.

Automated k8s registration in DBaaS

As you know, Percona Monitoring and Management comes with a technical preview of  Database as a Service (DBaaS). Right now when you install PMM on a Kubernetes cluster, you still need to register the cluster to deploy databases. We want to automate this process so that after the installation you can start deploying and managing databases right away.


Percona Monitoring and Management enables database administrators and site reliability engineers to pinpoint issues in their open source database environments, whether it is a quick look through the dashboards or a detailed analysis with Query Analytics. Percona’s support for PMM on Kubernetes is a response to the needs of our customers who are transforming their infrastructure.

Some useful links that would help you to deploy PMM on Kubernetes:


Running Rocket.Chat with Percona Server for MongoDB on Kubernetes


Our goal is to have a Rocket.Chat deployment which uses highly available Percona Server for MongoDB cluster as the backend database and it all runs on Kubernetes. To get there, we will do the following:

  • Start a Google Kubernetes Engine (GKE) cluster across multiple availability zones. It can be any other Kubernetes flavor or service, but I rely on multi-AZ capability in this blog post.
  • Deploy Percona Operator for MongoDB and database cluster with it
  • Deploy Rocket.Chat with specific affinity rules
    • Rocket.Chat will be exposed via a load balancer


Percona Operator for MongoDB, compared to other solutions, is not only the most feature-rich but also comes with various management capabilities for your MongoDB clusters – backups, scaling (including sharding), zero-downtime upgrades, and many more. There are no hidden costs and it is truly open source.

This blog post is a walkthrough of running a production-grade deployment of Rocket.Chat with Percona Operator for MongoDB.


All YAML manifests that I use in this blog post can be found in this repository.

Deploy Kubernetes Cluster

The following command deploys a GKE cluster named percona-rocket in 3 availability zones:

gcloud container clusters create --zone us-central1-a --node-locations us-central1-a,us-central1-b,us-central1-c percona-rocket --cluster-version 1.21 --machine-type n1-standard-4 --preemptible --num-nodes=3

Read more about this in the documentation.

Deploy MongoDB

I’m going to use helm to deploy the Operator and the cluster.

Add helm repository:

helm repo add percona

Install the Operator into the percona namespace:

helm install psmdb-operator percona/psmdb-operator --create-namespace --namespace percona

Deploy the cluster of Percona Server for MongoDB nodes:

helm install my-db percona/psmdb-db -f psmdb-values.yaml -n percona

Replica set nodes are going to be distributed across availability zones. To get there, I altered the affinity keys in the corresponding sections of psmdb-values.yaml:

antiAffinityTopologyKey: ""
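For example, to spread the replica set members across availability zones, the key can point at the standard zone topology label. A sketch, assuming the psmdb-db chart's values layout (check your chart version for the exact path):

```yaml
replsets:
- name: rs0
  size: 3
  antiAffinityTopologyKey: "topology.kubernetes.io/zone"
```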

Prepare MongoDB

For Rocket.Chat to connect to our database cluster, we need to create the users. By default, clusters provisioned with our Operator have the userAdmin user; its password is set in the Secrets object.
For production-grade systems, do not forget to change this password or create dedicated secrets to provision those. Read more about user management in our documentation.

Spin up a client Pod to connect to the database:

kubectl run -i --rm --tty percona-client1 --image=percona/percona-server-mongodb:4.4.10-11 --restart=Never -- bash -il

Connect to the database with the userAdmin user:
[mongodb@percona-client1 /]$ mongo "mongodb://userAdmin:userAdmin123456@my-db-psmdb-db-rs0-0.percona/admin?replicaSet=rs0"

We are going to create the following:

  • the rocketchat database
  • the rocketChat user to store data and connect to the database
  • the oplogger user to provide access to the oplog for Rocket.Chat
    • Rocket.Chat uses Meteor Oplog tailing to improve performance. It is optional.
use rocketchat
db.createUser({
  user: "rocketChat",
  pwd: passwordPrompt(),
  roles: [
    { role: "readWrite", db: "rocketchat" }
  ]
})

use admin
db.createUser({
  user: "oplogger",
  pwd: passwordPrompt(),
  roles: [
    { role: "read", db: "local" }
  ]
})

Deploy Rocket.Chat

I will use helm here to maintain the same approach. 

helm install -f rocket-values.yaml my-rocketchat rocketchat/rocketchat --version 3.0.0

You can find rocket-values.yaml in the same repository. Please make sure you set the correct passwords in the corresponding YAML fields.

As you can see, I also do the following:

  • Line 11: expose Rocket.Chat through the LoadBalancer service type
  • Lines 13-14: set the number of replicas of Rocket.Chat Pods. We want three – one per availability zone.
  • Lines 16-23: set affinity to distribute Pods across availability zones

Load Balancer will be created with a public IP address:

$ kubectl get service my-rocketchat-rocketchat
NAME                       TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)        AGE
my-rocketchat-rocketchat   LoadBalancer   80:32548/TCP   12m

You should now be able to connect to the load balancer's public IP address and enjoy your highly available Rocket.Chat installation.


Clean Up

Uninstall all helm charts to remove MongoDB cluster, the Operator, and Rocket.Chat:

helm uninstall my-rocketchat
helm uninstall my-db -n percona
helm uninstall psmdb-operator -n percona

Things to Consider


Instead of exposing Rocket.Chat through a load balancer, you may also try ingress. By doing so, you can integrate it with cert-manager and have a valid TLS certificate for your chat server.


It is also possible to run a sharded MongoDB cluster with Percona Operator. If you do so, Rocket.Chat will connect to mongos Service, instead of the replica set nodes. But you will still need to connect to the replica set directly to get oplogs.
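If you go the sharded route with the psmdb-db helm chart, the toggle is a single value; a sketch, with the flag name assumed from the chart's values:

```yaml
sharding:
  enabled: true
```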


We encourage you to try out Percona Operator for MongoDB with Rocket.Chat and let us know on our community forum your results.

There is always room for improvement and a time to find a better way. Please let us know if you face any issues with contributing your ideas to Percona products. You can do that on the Community Forum or JIRA. Read more in the contribution guidelines for Percona Operator for MongoDB.

Percona Operator for MongoDB contains everything you need to quickly and consistently deploy and scale Percona Server for MongoDB instances into a Kubernetes cluster on-premises or in the cloud. The Operator enables you to improve time to market with the ability to quickly deploy standardized and repeatable database environments. Deploy your database with a consistent and idempotent result no matter where they are used.
