Oct 13, 2021

Migrating MongoDB to Kubernetes

This blog post is the last in the series of articles on migrating databases to Kubernetes with Percona Operators. The two previous posts can be found here:

As you might have guessed already, this time we are going to cover the migration of MongoDB to Kubernetes. In the 1.10.0 release of Percona Distribution for MongoDB Operator, we introduced a new feature (in tech preview) that enables users to perform such migrations through regular MongoDB replication capabilities. We have already shown how it can be used to provide cross-regional disaster recovery for MongoDB; we encourage you to read that post.

The Goal

There are two ways to migrate the database:

  1. Take the backup and restore it.
    – This option is the simplest one, but unfortunately comes with downtime. The bigger the database, the longer the recovery time is.
  2. Replicate the data to the new site and switch the application once replicas are in sync.
    – This allows the user to perform the migration and switch the application with either zero or little downtime.

This blog post is a walkthrough on how to migrate a MongoDB replica set to Kubernetes using replication capabilities.

MongoDB replica set to Kubernetes

  1. We have a MongoDB cluster somewhere (the Source). It can be on-prem or on some virtual machine. For demo purposes, I’m going to use a standalone replica set node. The migration procedure for a replica set with multiple nodes or for a sharded cluster is almost identical.
  2. We have a Kubernetes cluster with Percona Operator (the Target). The Operator deploys 3 standalone MongoDB nodes in unmanaged mode (we will talk about it below).
  3. Each node is exposed so that the nodes on the Source can reach them.
  4. We are going to replicate the data to Target nodes by adding them into the replica set.

As always, all the scripts and configuration files for this blog post are publicly available in this GitHub repository.

Prerequisites

  • MongoDB cluster – either on-prem or VM. It is important to be able to configure mongod to some extent and add external nodes to the replica set.
  • Kubernetes cluster for the Target.
  • kubectl to deploy and manage the Operator and database on Kubernetes.

Prepare the Source

This section explains what preparations must be made on the Source to set up the replication.

Expose

All nodes in the replica set must form a mesh and be able to reach each other. The communication between the nodes can go through the public internet or some VPN. For demonstration purposes, we are going to expose the Source to the public internet by editing mongod.conf:

net:
  bindIp: <PUBLIC IP>

If you have multiple replica sets, you need to expose all nodes of each of them, including the config servers.

TLS

We take security seriously at Percona, and this is why by default our Operator deploys MongoDB clusters with encryption enabled. I have prepared a script that generates self-signed certificates and keys with the openssl tool. If you already have a Certificate Authority (CA) in use in your organization, generate the certificates and sign them with your CA.

The list of alternative names can be found either in this document or in this openssl configuration file. Note DNS.20 entry:

DNS.20      = *.mongo.spron.in

I’m going to use this wildcard entry to set up the replication between the nodes. The script also generates an ssl-secret.yaml file, which we are going to use on the Target side.
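For reference, a minimal sketch of what such a script does with openssl might look like this (the file names and one-year validity below are my assumptions, not taken from the actual script):

# generate a CA key and a self-signed CA certificate
openssl req -x509 -newkey rsa:4096 -nodes -days 365 \
  -keyout ca-key.pem -out ca.pem -subj "/CN=mongodb-ca"

# generate a server key and CSR, then sign the CSR with the CA, adding the wildcard SAN
openssl req -newkey rsa:4096 -nodes \
  -keyout mongod-key.pem -out mongod.csr -subj "/CN=*.mongo.spron.in"
openssl x509 -req -days 365 -in mongod.csr -CA ca.pem -CAkey ca-key.pem \
  -set_serial 01 -extfile <(printf "subjectAltName=DNS:*.mongo.spron.in") -out mongod-cert.pem

# mongod expects the certificate and its key concatenated into a single PEM file
cat mongod-key.pem mongod-cert.pem > mongod.pem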

You need to upload the CA certificate and the node certificate with its private key to every Source replica set node and then reference them in mongod.conf:

# network interfaces
net:
  ...
  tls:
    mode: preferTLS
    CAFile: /path/to/ca.pem
    certificateKeyFile: /path/to/mongod.pem

security:
  clusterAuthMode: x509
  authorization: enabled

Note that I also set clusterAuthMode to x509, which enforces x509 authentication. Test it carefully in a non-production environment first, as it might break your existing replication.

Create System Users

Our Operator needs system users to manage the cluster and perform health checks. Usernames and passwords for the system users should be the same on the Source and the Target. This script generates the user-secret.yaml file to use on the Target and the mongo shell code to add the users on the Source (it is an example; do not use it in production).

Connect to the primary node on the Source and execute mongo shell commands generated by the script.
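To give an idea of what the generated commands look like, here is a rough sketch for a single system user (the user name, password, and role list below are illustrative only – the script produces the full set of users with roles and passwords matching user-secret.yaml):

db.getSiblingDB("admin").createUser({
    user: "clusterAdmin",
    pwd: "clusterAdminPassword",  // must match the corresponding entry in user-secret.yaml
    roles: [ { role: "clusterAdmin", db: "admin" } ]
})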

Prepare the Target

Apply Users and TLS secrets

System users’ credentials and TLS certificates must be the same on both sides. The scripts we used above generate Secret object manifests to use on the Target. Apply them:

$ kubectl apply -f ssl-secret.yaml
$ kubectl apply -f user-secret.yaml

Deploy the Operator and MongoDB Nodes

Please follow one of the installation guides to deploy the Operator. Usually, it is a one-step operation with kubectl:

$ kubectl apply -f operator.yaml

MongoDB nodes are deployed with a custom resource manifest – cr.yaml. The following configuration options in it are important:

spec:
  unmanaged: true

This flag instructs the Operator to deploy the nodes in unmanaged mode, meaning they are not configured to form a cluster. In this mode, the Operator also does not generate TLS certificates or system users.

spec:
…
  updateStrategy: Never

Disable the Smart Update feature as the cluster is unmanaged.

spec:
…
  secrets:
    users: my-new-cluster-secrets
    ssl: my-custom-ssl
    sslInternal: my-custom-ssl-internal

This section points to the Secret objects that we created in the previous step.

spec:
…
  replsets:
  - name: rs0
    size: 3
    expose:
      enabled: true
      exposeType: LoadBalancer

Remember that the nodes need to be exposed and reachable. To achieve this, we create a Service per Pod. In our case, it is a LoadBalancer object, but it can be any other Service type.

spec:
...
  backup:
    enabled: false

If the cluster and nodes are unmanaged, the Operator should not be taking backups. 

Deploy unmanaged nodes with the following command:

$ kubectl apply -f cr.yaml

Once the nodes are up and running, also check the services. We will need the IP addresses of the new replicas to add them to the replica set on the Source later.

$ kubectl get services
NAME                    TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)           AGE
…
my-new-cluster-rs0-0    LoadBalancer   10.3.252.134   35.223.104.224   27017:31332/TCP   2m11s
my-new-cluster-rs0-1    LoadBalancer   10.3.244.166   34.134.210.223   27017:32647/TCP   81s
my-new-cluster-rs0-2    LoadBalancer   10.3.248.119   34.135.233.58    27017:32534/TCP   45s

Configure Domains

X509 authentication is strict and requires that the certificate’s common name or alternative name match the domain name of the node. As you remember, we had the wildcard *.mongo.spron.in included in our certificate. It can be any domain that you use, but make sure the certificate is issued for this domain.

I’m going to create A records pointing to the public IP addresses of the MongoDB nodes:

k8s-1.mongo.spron.in -> 35.223.104.224
k8s-2.mongo.spron.in -> 34.134.210.223
k8s-3.mongo.spron.in -> 34.135.233.58

Replicate the Data to the Target

It is time to add our nodes in the Kubernetes cluster to the replica set. Log in to the mongo shell on the Source and execute the following:

rs.add({ host: "k8s-1.mongo.spron.in", priority: 0, votes: 0} )
rs.add({ host: "k8s-2.mongo.spron.in", priority: 0, votes: 0} )
rs.add({ host: "k8s-3.mongo.spron.in", priority: 0, votes: 0} )

If everything is done correctly, these nodes are going to be added as secondaries. You can check the status with the rs.status() command.

Cutover

Check that the newly added nodes are synchronized. The more data you have, the longer the synchronization process is going to take. To understand whether the nodes are synchronized, compare the values of optime and optimeDate of the Primary node with the values for the Secondary nodes in the rs.status() output:

{
        "_id" : 0,
        "name" : "147.182.213.59:27017",
        "stateStr" : "PRIMARY",
...
        "optime" : {
                "ts" : Timestamp(1633697030, 1),
                "t" : NumberLong(2)
        },
        "optimeDate" : ISODate("2021-10-08T12:43:50Z"),
...
},
{
        "_id" : 1,
        "name" : "k8s-1.mongo.spron.in:27017",
        "stateStr" : "SECONDARY",
...
        "optime" : {
                "ts" : Timestamp(1633697030, 1),
                "t" : NumberLong(2)
        },
        "optimeDurable" : {
                "ts" : Timestamp(1633697030, 1),
                "t" : NumberLong(2)
        },
        "optimeDate" : ISODate("2021-10-08T12:43:50Z"),
...
},

When nodes are synchronized, we are ready to perform the cutover. Please ensure that your application is configured properly to minimize downtime during the cutover.

The cutover is going to have two steps:

  1. One of the nodes on the Target becomes the primary.
  2. The Operator starts managing the cluster, and the nodes on the Source are no longer part of the replica set.

Switch the Primary

Connect with the mongo shell to the primary on the Source side and make one of the Target nodes the primary. This can be done by changing the replica set configuration:

cfg = rs.config()
cfg.members[1].priority = 2
cfg.members[1].votes = 1
rs.reconfig(cfg)

We enable voting and set the priority to two on one of the nodes in the Kubernetes cluster. The member id can be different for you, so please look carefully at the output of the rs.config() command.

Start Managing the Cluster

Once the primary is running in Kubernetes, we are going to tell the Operator to start managing the cluster. Change spec.unmanaged to false in the Custom Resource with the patch command:

$ kubectl patch psmdb my-cluster-name --type=merge -p '{"spec":{"unmanaged": false}}'

You can also do this by changing cr.yaml and applying it. That is it – you now have a cluster in Kubernetes which is managed by the Operator.
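As a quick sanity check (my assumption of how the status is reported, not part of the original walkthrough), you can watch the custom resource until the Operator reconciles the cluster:

$ kubectl get psmdb
# the STATUS column should eventually turn to ready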

Conclusion

You truly start to appreciate Operators once you get used to them. When I was writing this blog post I found it extremely annoying to deploy and configure a single MongoDB node on a Linux box; I don’t want to think about the cluster. Operators abstract Kubernetes primitives and database configuration and provide you with a fully operational database service instead of a bunch of nodes. Migration of MongoDB to Kubernetes is a challenging task, but it is much simpler with Operator. And once you are on Kubernetes, Operator takes care of all day-2 operations as well.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions please raise the topic in the community forum.

Are you a developer looking to contribute? Please read our CONTRIBUTING.md and send us a Pull Request.

Percona Distribution for MongoDB Operator

The Percona Distribution for MongoDB Operator simplifies running Percona Server for MongoDB on Kubernetes and provides automation for day-1 and day-2 operations. It’s based on the Kubernetes API and enables highly available environments. Regardless of where it is used, the Operator creates a member that is identical to other members created with the same Operator. This provides an assured level of stability to easily build test environments or deploy a repeatable, consistent database environment that meets Percona expert-recommended best practices.

Complete the 2021 Percona Open Source Data Management Software Survey

Have Your Say!

Oct 08, 2021

Disaster Recovery for MongoDB on Kubernetes

As per the glossary, Disaster Recovery (DR) protocols are an organization’s method of regaining access and functionality to its IT infrastructure after events like a natural disaster, a cyber attack, or even business disruptions related to the COVID-19 pandemic. When we talk about data, storing backups on remote servers is enough to pass DR compliance checks for some companies. But for others, Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) are extremely tight and require more than just a backup/restore procedure.

In this blog post, we are going to show you how to set up MongoDB on two distant Kubernetes clusters with Percona Distribution for MongoDB Operator to meet the toughest DR requirements.

What to Expect

Here is what we are going to do:

  1. Set up two Kubernetes clusters
  2. Deploy Percona Distribution for MongoDB Operator on both of them. The Disaster Recovery site will run a MongoDB cluster in unmanaged mode.
  3. Simulate a failure and perform a failover to the DR site

In the 1.10.0 version of the Operator, we have added a Technology Preview of a new feature that enables users to deploy unmanaged MongoDB nodes and connect them to existing Replica Sets.

MongoDB Kubernetes

Set it All Up

We are not going to cover the configuration of the Kubernetes clusters, but in our tests, we relied on two Google Kubernetes Engine (GKE) clusters deployed in different regions. Read more about GKE here.

Prepare Main Site

We have shared all the resources for this blog post in this GitHub repo. As a first step, we are going to deploy the Operator on the Main site:

$ kubectl apply -f bundle.yaml

Deploy the managed MongoDB cluster with cr-main.yaml:

$ kubectl apply -f cr-main.yaml

It is important to understand that we will need to expose every ReplicaSet node, including the Config Servers, through a dedicated Service. This is required to ensure that the ReplicaSet nodes on Main and DR can reach each other, so the setup forms a full mesh:

ReplicaSet nodes

To get there, cr-main.yaml has the following changes:

spec:
  replsets:
  - name: rs0
    expose:
      enabled: true
      exposeType: LoadBalancer
  sharding:
    configsvrReplSet:
      expose:
        enabled: true
        exposeType: LoadBalancer

We are using the LoadBalancer Kubernetes Service object as it is just simpler for us, but there are other options – ClusterIP, NodePort. It is also possible to utilize 3rd party tools like Submariner to implement a private connection.

If you have an already running MongoDB cluster in Kubernetes, you can expose the ReplicaSets without downtime by changing these variables.

Prepare Disaster Recovery Site

The configuration of the Disaster Recovery site could be broken down into the following steps:

  1. Copy the Secrets from the Main cluster.
    1. system users secrets
    2. SSL keys – both used for external connections and internal replication traffic
  2. Tune Custom Resource:
    1. run nodes in unmanaged mode – Operator does not control replicaset configuration and secrets generation
    2. expose ReplicaSets (the same way we do it on the Main cluster)
    3. disable backups – backups can only be taken on the cluster managed by the Operator

Copy the Secrets

System users’ credentials are stored by default in the my-cluster-name-secrets Secret object and defined in spec.secrets.users. Apply this secret in the DR cluster with kubectl apply -f yaml-with-secrets. If you don’t have it in your source code repository, or if you rely on the Operator to generate it, you can get the secret from Kubernetes itself, remove the unnecessary metadata, and apply it.

On the Main site, execute:

$ kubectl get secret my-cluster-name-secrets -o yaml > my-cluster-secrets.yaml

Now remove the following lines from metadata:

annotations
creationTimestamp
resourceVersion
selfLink
uid

Save the file and apply it to the DR cluster.
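If you prefer not to edit the file by hand, one possible way to strip those fields (assuming yq v4 is installed; this is just a convenience sketch) is:

yq eval 'del(.metadata.annotations, .metadata.creationTimestamp, .metadata.resourceVersion, .metadata.selfLink, .metadata.uid)' \
  my-cluster-secrets.yaml > my-cluster-secrets-clean.yaml
kubectl apply -f my-cluster-secrets-clean.yaml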

The procedure to copy SSL keys is almost the same as for users. The difference is the names of the Secret objects – they are usually called <CLUSTER_NAME>-ssl and <CLUSTER_NAME>-ssl-internal. It is also possible to specify them in secrets.ssl and secrets.sslInternal in the Custom Resource. Copy these two keys from Main to DR and reference them in the CR.

Tune Custom Resource

cr-replica.yaml will have the following changes:

  secrets:
    users: my-cluster-name-secrets
    ssl: replica-cluster-ssl
    sslInternal: replica-cluster-ssl-internal

  replsets:
  - name: rs0
    size: 3
    expose:
      enabled: true
      exposeType: LoadBalancer

  sharding:
    enabled: true
    configsvrReplSet:
      size: 3
      expose:
        enabled: true
        exposeType: LoadBalancer

  backup:
    enabled: false

Once the Custom Resource is applied, the services are going to be created. We will need the IP addresses of each ReplicaSet node to configure the DR site.

$ kubectl get services
NAME                  TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)           AGE
replica-cluster-cfg-0    LoadBalancer   10.111.241.213   34.78.119.1       27017:31083/TCP   5m28s
replica-cluster-cfg-1    LoadBalancer   10.111.243.70    35.195.138.253    27017:31957/TCP   4m52s
replica-cluster-cfg-2    LoadBalancer   10.111.246.94    146.148.113.165   27017:30196/TCP   4m6s
...
replica-cluster-rs0-0    LoadBalancer   10.111.241.41    34.79.64.213      27017:31993/TCP   5m28s
replica-cluster-rs0-1    LoadBalancer   10.111.242.158   34.76.238.149     27017:32012/TCP   4m47s
replica-cluster-rs0-2    LoadBalancer   10.111.242.191   35.195.253.107    27017:31209/TCP   4m22s

Add External Nodes to Main

At this step, we are going to add unmanaged nodes to the Replica Set on the Main site. In cr-main.yaml we should add externalNodes under replsets.[] and sharding.configsvrReplSet:

  replsets:
  - name: rs0
    externalNodes:
    - host: 34.79.64.213
      priority: 1
      votes: 1
    - host: 34.76.238.149
      priority: 1
      votes: 1
    - host: 35.195.253.107
      priority: 0
      votes: 0

  sharding:
    configsvrReplSet:
      externalNodes:
      - host: 34.78.119.1
        priority: 1
        votes: 1
      - host: 35.195.138.253
        priority: 1
        votes: 1
      - host: 146.148.113.165
        priority: 0
        votes: 0

Please note that we add three nodes, but only two are voters. We do this to avoid split-brain situations and to avoid starting a primary election if the DR site is down or there is a network disruption between the Main and DR sites.

Failover

Once all the configuration above is applied, the situation will look like this:

Failover

We have three voters in the main cluster and two voters in the replica cluster. That means replica nodes won’t have the majority in case of main cluster failure and they won’t be able to elect a new primary. Therefore we need to step in and perform a manual failover.

Let’s kill the main cluster:

gcloud compute instances list | 
grep my-main-gke-demo | 
awk '{print $1}' | 
xargs gcloud compute instances delete --zone europe-west3-b

gcloud container node-pools delete \
--zone europe-west3-b \
--cluster my-main-gke-demo \
default-pool

I deleted the nodes and the node pool of the main Kubernetes cluster so now the cluster is in an unhealthy state. Let’s see what mongos on the DR site says when we try to read or write through it (psmdb-tester can be found in the git repo as well):

% ./psmdb-tester
2021/09/03 18:19:19 Successfully connected and pinged 34.141.3.189:27017
2021/09/03 18:19:40 read failed: (FailedToSatisfyReadPreference) Encountered non-retryable error during query :: caused by :: Could not find host matching read preference { mode: "primary" } for set cfg
2021/09/03 18:19:49 write failed: (FailedToSatisfyReadPreference) Could not find host matching read preference { mode: "primary" } for set cfg
Disaster Recovery MongoDB

Normally, we can only alter the replica set configuration from the primary node, but in this kind of situation, where you don’t have a primary and only have a few surviving members, MongoDB allows us to force the reconfiguration from any alive member.

Let’s connect to one of the secondary nodes in the replica cluster and perform the failover:

kubectl exec -it psmdb-client-7b9f978649-pjb2k -- mongo 'mongodb://clusterAdmin:<pass>@replica-cluster-rs0-0.replica.svc.cluster.local/admin?ssl=false'
...
rs0:SECONDARY> cfg = rs.config()
rs0:SECONDARY> cfg.members = [cfg.members[3], cfg.members[4], cfg.members[5]]
rs0:SECONDARY> rs.reconfig(cfg, {force: true})

Note that the indexes of surviving members may differ in your environment. You should check rs.status() and rs.config() outputs first. The main idea is to repopulate config members with only surviving members.

After the reconfiguration, the replica set will have just three members, and two of them will have votes and form a majority. So, they’ll be able to elect a new primary. After performing the same process on the cfg replica set, we will be able to read and write through mongos again:

% ./psmdb-tester
2021/09/03 18:41:48 Successfully connected and pinged 34.141.3.189:27017
2021/09/03 18:41:49 read succeed
2021/09/03 18:41:50 read succeed
2021/09/03 18:41:51 read succeed
2021/09/03 18:41:52 read succeed
2021/09/03 18:41:53 read succeed
2021/09/03 18:41:54 read succeed
2021/09/03 18:41:55 read succeed
2021/09/03 18:41:56 read succeed
2021/09/03 18:41:57 read succeed
2021/09/03 18:41:58 read succeed
2021/09/03 18:41:58 write succeed

Once the replica cluster has become the primary, you should reconfigure all clients that connect to the old main cluster and point them to the DR site.
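For a sharded cluster that means pointing the application at the mongos Service exposed on the DR site, with a connection string along these lines (everything in angle brackets is a placeholder for your environment):

mongodb://<application user>:<password>@<DR mongos address>:27017/<database>?ssl=true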

Conclusion

Disaster Recovery is important for business continuity. The goal of administrators and SREs is to have a plan in place. With the new release of Percona Distribution for MongoDB Operator, setting up DR is fast, automated, and enables IT teams to meet RTO and RPO requirements.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions please raise the topic in the community forum.

Are you a developer looking to contribute? Please read our CONTRIBUTING.md and send us a Pull Request.


Sep 17, 2021

Migration of a MySQL Database to a Kubernetes Cluster Using Asynchronous Replication

Nowadays, more and more companies are thinking about migrating their infrastructure to Kubernetes. Databases are no exception. A lot of Kubernetes operators have been created to simplify different types of deployments and to perform routine day-to-day tasks like taking backups, renewing certificates, and so on. If a few years ago nobody even wanted to hear about running databases in Kubernetes, everything has changed now.

At Percona, we have created a few very capable Kubernetes operators: for Percona Server for MongoDB, PostgreSQL, and MySQL databases. Today we will talk about using cross-site replication – a new feature that was added to the latest release of Percona Distribution for MySQL Operator. This feature is based on MySQL’s asynchronous connection failover mechanism.

Cross-site replication involves configuring one Percona XtraDB Cluster, or one or several MySQL servers, as the Source, and another Percona XtraDB Cluster (PXC) as a replica, allowing asynchronous replication between them. If the Operator has several sources defined in the custom resource (CR), it automatically handles connection failures of the source DB.

Cross-site replication is supported only since MySQL 8.0.23, but you can read about migrating earlier MySQL versions in this blog post.

The Goal

Migrate the MySQL database, which is deployed on-prem or in the cloud, to the Percona Distribution for MySQL Operator using asynchronous replication. This approach helps you reduce downtime and data loss for your application.

So, we have the following setup:

Migration of MySQL database to Kubernetes cluster using asynchronous replication

The following components are used:

1. A MySQL 8.0.23 database (in my case it is Percona Server for MySQL), deployed in DO (as the Source), and Percona XtraBackup for the backup. In my test deployment, I use only one server as the Source to simplify things. Depending on your database deployment topology, you can use several servers and rely on the asynchronous connection failover mechanism on the Operator’s end.

2. Google Kubernetes Engine (GKE) cluster where Percona Distribution for MySQL Operator is deployed with PXC cluster (as a target).

3. AWS S3 bucket is used to save the backup from MySQL DB and then to restore the PXC cluster in k8s.

The following steps should be done to perform the migration procedure:

1. Make the MySQL database backup using Percona XtraBackup and upload it to the S3 bucket using xbcloud.

2. Perform the restore of the MySQL database from the S3 bucket into the PXC cluster which is deployed in k8s.

3. Configure asynchronous replication between MySQL server and PXC cluster managed by k8s operator.

As a result, we will have asynchronous replication between the MySQL server and the PXC cluster in k8s, with the latter in read-only mode.

Migration

Configure the target PXC cluster managed by k8s operator:

1. Deploy Percona Distribution for MySQL Operator on Kubernetes (I have used GKE 1.20).

# clone the git repository
git clone -b v1.9.0 https://github.com/percona/percona-xtradb-cluster-operator
cd percona-xtradb-cluster-operator

# deploy the operator
kubectl apply -f deploy/bundle.yaml

2. Create PXC cluster using the default custom resource manifest (CR).

# create the my-cluster-secrets secret (do not use default passwords for production systems)
kubectl apply -f deploy/secrets.yaml

# create the cluster; by default it will be PXC 8.0.23
kubectl apply -f deploy/cr.yaml

3. Create the secret with credentials for the AWS S3 bucket which will be used for access to the S3 bucket during the restoration procedure.

# create an S3-secret.yaml file with the following content, using your correct base64-encoded credentials instead of XXXXXX

apiVersion: v1
kind: Secret
metadata:
  name: aws-s3-secret
type: Opaque
data:
  AWS_ACCESS_KEY_ID: XXXXXX
  AWS_SECRET_ACCESS_KEY: XXXXXX

# create secret
kubectl apply -f S3-secret.yaml

Configure the Source MySQL Server

1. Install Percona Server for MySQL 8.0.23 and Percona XtraBackup for the backup. Refer to the Installing Percona Server for MySQL and Installing Percona XtraBackup chapters in the documentation for installation instructions.


NOTE:
You need to add the following options to my.cnf to enable GTID support; otherwise, replication will not work, because GTID-based replication is used by the PXC cluster by default.

[mysqld]
enforce_gtid_consistency=ON
gtid_mode=ON

2. Create all the users that will be used by the k8s Operator; the passwords should be the same as in deploy/secrets.yaml. Also, please note that the password for the root user should be the same as in the deploy/secrets.yaml file for the k8s secret. In my case, I used our default passwords from the deploy/secrets.yaml file.

CREATE USER 'monitor'@'%' IDENTIFIED BY 'monitory' WITH MAX_USER_CONNECTIONS 100;
GRANT SELECT, PROCESS, SUPER, REPLICATION CLIENT, RELOAD ON *.* TO 'monitor'@'%';
GRANT SERVICE_CONNECTION_ADMIN ON *.* TO 'monitor'@'%';

CREATE USER 'operator'@'%' IDENTIFIED BY 'operatoradmin';
GRANT ALL ON *.* TO 'operator'@'%' WITH GRANT OPTION;

CREATE USER 'xtrabackup'@'%' IDENTIFIED BY 'backup_password';
GRANT ALL ON *.* TO 'xtrabackup'@'%';

CREATE USER 'replication'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* to 'replication'@'%';
FLUSH PRIVILEGES;

3. Make the backup of the MySQL database using the XtraBackup tool and upload it to the S3 bucket.

# export aws credentials
export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXX

# make the backup
xtrabackup --backup --stream=xbstream --target-dir=/tmp/backups/ --extra-lsndir=/tmp/backups/ --password=root_password | xbcloud put --storage=s3 --parallel=10 --md5 --s3-bucket="mysql-testing-bucket" "db-test-1"

Now, everything is ready to perform the restore of the backup on the target database. So, let’s get back to our k8s cluster.

Configure the Asynchronous Replication to the Target PXC Cluster

If you have a completely clean source database (without any data), you can skip the steps related to backing up and restoring the database. Otherwise, do the following:

1. Restore the backup from the S3 bucket using the following manifest:

# create restore.yml file with following content

apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterRestore
metadata:
  name: restore1
spec:
  pxcCluster: cluster1
  backupSource:
    destination: s3://mysql-testing-bucket/db-test-1
    s3:
      credentialsSecret: aws-s3-secret
      region: us-east-1

# trigger the restoration procedure
kubectl apply -f restore.yml

As a result, you will have a PXC cluster with data from the source DB. Now everything is ready to configure the replication.

2. Edit the custom resource manifest deploy/cr.yaml to configure the spec.pxc.replicationChannels section.

spec:
  ...
  pxc:
    ...
    replicationChannels:
    - name: ps_to_pxc1
      isSource: false
      sourcesList:
        - host: <source_ip>
          port: 3306
          weight: 100

# apply CR
kubectl apply -f deploy/cr.yaml


Verify the Replication 

In order to check the replication, you need to connect to the PXC node and run the following commands:

kubectl exec -it cluster1-pxc-0 -c pxc -- mysql -uroot -proot_password -e "show replica status \G"
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: <ip-of-source-db>
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000004
          Read_Master_Log_Pos: 529
               Relay_Log_File: cluster1-pxc-0-relay-bin-ps_to_pxc1.000002
                Relay_Log_Pos: 738
        Relay_Master_Log_File: binlog.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 529
              Relay_Log_Space: 969
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 9741945e-148d-11ec-89e9-5ee1a3cf433f
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 3
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 9741945e-148d-11ec-89e9-5ee1a3cf433f:1-2
            Executed_Gtid_Set: 93f1e7bf-1495-11ec-80b2-06e6016a7c3d:1,
9647dc03-1495-11ec-a385-7e3b2511dacb:1-7,
9741945e-148d-11ec-89e9-5ee1a3cf433f:1-2
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: ps_to_pxc1
           Master_TLS_Version:
       Master_public_key_path:
        Get_master_public_key: 0
            Network_Namespace:

Also, you can verify the replication by checking that the data is changing.
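For example (a quick sketch; the repl_check schema below is something I create purely for this test), insert a row on the Source and read it back on the replica:

-- on the Source MySQL server
CREATE DATABASE repl_check;
CREATE TABLE repl_check.t1 (id INT PRIMARY KEY, note VARCHAR(64));
INSERT INTO repl_check.t1 VALUES (1, 'written on the source');

-- on the PXC node in Kubernetes (the read-only replica)
SELECT * FROM repl_check.t1;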

Promote the PXC Cluster as a Primary

As soon as you are ready to stop the replication and promote the PXC cluster in k8s to be the primary DB (that is, your application has been reconfigured and is ready to work with the new DB), you need to perform the following simple actions:

1. Edit deploy/cr.yaml and comment out the replicationChannels section:

spec:
  ...
  pxc:
    ...
    #replicationChannels:
    #- name: ps_to_pxc1
    #  isSource: false
    #  sourcesList:
    #    - host: <source_ip>
    #      port: 3306
    #      weight: 100

2. Stop the mysqld service on the source server to make sure that no new data is written.

systemctl stop mysqld

3. Apply a new CR for k8s operator.

# apply CR
kubectl apply -f deploy/cr.yaml

As a result, replication is stopped and the read-only mode is disabled for the PXC cluster.
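You can double-check it the same way we connected to the PXC node earlier (a small sketch; the password is the default one from deploy/secrets.yaml):

kubectl exec -it cluster1-pxc-0 -c pxc -- mysql -uroot -proot_password -e "SHOW GLOBAL VARIABLES LIKE 'read_only'"

The read_only variable should now report OFF.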

Conclusion

Technologies are changing so fast that a migration to a k8s cluster, which seems very complex at first sight, turns out to be not so difficult nor time-consuming. But you need to keep in mind that significant changes take place: first, you migrate the database to a PXC cluster, which has its own peculiarities, and, of course, to Kubernetes itself. If you are ready, you can start the journey to Kubernetes right now.

The Percona team is ready to guide you during this journey. If you have any questions, please raise the topic in the community forum.

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

Learn More About Percona Kubernetes Operators

Aug 19, 2021

Dynamic User Creation with MySQL on Kubernetes and Hashicorp Cloud Platform Vault

MySQL Kubernetes Hashicorp Cloud

You may have already seen this document, which describes the integration between HashiCorp Vault and Percona Distribution for MySQL Operator to enable data-at-rest encryption for self-managed Vault deployments. In April 2021, HashiCorp announced a fully managed offering, HashiCorp Cloud Platform Vault (HCP Vault), that simplifies deployment and management of the Vault.

With that in mind, I’m going to talk about the integration between Percona and HCP Vault to provide dynamic user creation for MySQL.

Without dynamic credentials, organizations are susceptible to a breach due to secrets sprawl across different systems, files, and repositories. Dynamic credentials provide a secure way of connecting to the database by using a unique password for every login or service account. With Vault, these just-in-time credentials are stored securely and it is also possible to set a lifetime for them.

Goal

My goal is to provision users on my MySQL cluster deployed in Kubernetes with dynamic credentials through HashiCorp Vault.

MySQL cluster deployed in Kubernetes with dynamic credentials through Hashicorp Vault

  1. Percona Operator deploys Percona XtraDB Cluster and HAProxy
  2. HashiCorp Vault connects to MySQL through HAProxy and creates users with specific grants
  3. An application or user can connect to the myapp database using dynamic credentials created by Vault

Before You Begin

Prerequisites

  • HCP Vault account
  • Kubernetes cluster

Networking

Right now, HCP deploys Vault in HashiCorp’s Amazon account, in a private Virtual Private Cloud (VPC). For now, to establish a private connection between the Vault and your application, you would need to have an AWS account, a VPC, and either a peering or a Transit Gateway connection.

For the sake of simplicity in this blog post, I’m going to expose the Vault publicly, which is not recommended for production but allows me to configure Vault from anywhere.

More clouds and networking configurations are on the way. Stay tuned to HashiCorp news.

Set it All Up

MySQL

To deploy Percona Distribution for MySQL on Kubernetes, please follow our documentation. The only requirement is to have HAProxy exposed via a public load balancer. The following fields should be set accordingly in the Custom Resource – deploy/cr.yaml:
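The relevant part of cr.yaml would look roughly like this (a sketch based on my reading of the 1.9.0 CR; please double-check the field names against the documentation for your Operator version):

spec:
  ...
  haproxy:
    enabled: true
    size: 3
    serviceType: LoadBalancer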

For simplicity, I have shared two required YAMLs in this GitHub repository. Deploying them would provision the Percona XtraDB Cluster on Kubernetes exposed publicly:

kubectl apply -f bundle.yaml
kubectl apply -f cr.yaml

Once the cluster is ready, get the public address:

$ kubectl get pxc
NAME       ENDPOINT       STATUS   PXC   PROXYSQL   HAPROXY   AGE
cluster1   35.223.41.79   ready    3                3         4m43s

Remember the ENDPOINT address; we will need it below to configure HCP Vault.

Create the User and the Database

I’m going to create a MySQL user that will be used by HCP Vault to create users dynamically, along with an empty database called ‘myapp’ to which these users are going to have access.

Get the current root password from the Secret object:

$ kubectl get secrets my-cluster-secrets -o yaml | awk '$1~/root/ {print $2}' | base64 --decode && echo
Jw6OYIsUJeAQQapk

Connect to MySQL directly or by executing into the container:

kubectl exec -ti cluster1-pxc-0 -c pxc -- bash
mysql -u root -p -h 35.223.41.79

Create the database user and the database:

mysql> create user hcp identified by 'superduper';
Query OK, 0 rows affected (0.04 sec)

mysql> grant select, insert, update, delete, drop, create, alter, create user on *.* to hcp with grant option;
Query OK, 0 rows affected (0.04 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.02 sec)

mysql> create database myapp;
Query OK, 1 row affected (0.02 sec)

Hashicorp Cloud Platform Vault

Setting up Vault on HCP is a few-click process that is described here.

As I mentioned before, for the sake of simplicity HCP Vault is going to be publicly accessible. To do that, go to your Vault cluster in HCP UI, click Manage and Edit Configuration:

vault cluster

Enable the knob to expose the cluster publicly:

configure cluster

Now let’s get the Admin token for Vault. Navigate to the overview dashboard of your Vault cluster and click Generate token:

Vault CLI

The Vault is publicly accessible and you have the Admin token. Let’s configure it with the vault CLI tool. Install it by following the manual here.

Try to log in:

export VAULT_ADDR="https://vault-cluster.SOMEURL.hashicorp.cloud:8200"
export VAULT_NAMESPACE="admin"

vault login
Token (will be hidden):
Success! You are now authenticated. The token information displayed below
is already stored in the token helper. You do NOT need to run "vault login"
again. Future Vault requests will automatically use this token.
...

Connecting the Dots

It is time to connect Vault with the MySQL database in Kubernetes and start provisioning users. We are going to rely on Vault’s Databases Secrets engine.

1. Enable database secrets engine:

vault secrets enable database

2. Point Vault to MySQL and store the configuration:

vault write database/config/myapp plugin_name=mysql-database-plugin \
connection_url="{{username}}:{{password}}@tcp(35.223.41.79:3306)/" \
allowed_roles="mysqlrole" \
username="hcp" \
password="superduper"
Success! Data written to: database/config/myapp

3. Create the role:

vault write database/roles/mysqlrole db_name=myapp \
creation_statements="CREATE USER '{{name}}'@'%' IDENTIFIED BY '{{password}}'; GRANT select, insert, update, delete, drop, create, alter ON myapp.* TO '{{name}}'@'%';" \
default_ttl="1h" \
max_ttl="24h"
Success! Data written to: database/roles/mysqlrole

This role does the following:

  • Creates the user with a random name and password
  • The user has grants to myapp database
  • By default, the user exists for one hour, but time-to-live can be extended to 24 hours.

Now to create the temporary user just execute the following:

vault read database/creds/mysqlrole
Key                Value
---                -----
lease_id           database/creds/mysqlrole/MpO5oMsd1A0uyXT8d7R6sxVe.slbaC
lease_duration     1h
lease_renewable    true
password           Gmx6fv89BL4qHbFokG-p
username           v-token-hcp--mysqlrole-EMt7xeECd

It is now possible to connect to myapp database using the credentials provided above.
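For example, reusing the endpoint and the values Vault just returned (your generated username and password will differ):

mysql -h 35.223.41.79 -u v-token-hcp--mysqlrole-EMt7xeECd -p'Gmx6fv89BL4qHbFokG-p' myapp

Once the one-hour lease expires or is revoked, this user disappears and the login stops working.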

Conclusion

Dynamic credentials can be an essential part of your company’s security framework: they help you avoid breaches caused by secrets sprawl and data leaks, and help maintain data integrity and consistency. You can similarly integrate HashiCorp Vault with any Percona Kubernetes Operator – for MongoDB, MySQL, or PostgreSQL.

We encourage you to try it out to keep your data safe. Let us know if you faced any issues by submitting the topic to our Community Forum.

Percona Distribution for MySQL Operator

The Percona Distribution for MySQL Operator simplifies running Percona XtraDB Cluster on Kubernetes and provides automation for day-1 and day-2 operations. It’s based on the Kubernetes API and enables highly available environments. Regardless of where it is used, the Operator creates a member that is identical to other members created with the same Operator. This provides an assured level of stability to easily build test environments or deploy a repeatable, consistent database environment that meets Percona expert-recommended best practices.

Hashicorp Vault

Hashicorp Vault is an identity-based security solution that secures, stores, and tightly controls access to tokens, passwords, and other secrets with both open-source and enterprise offerings for self-managed security automation. In April 2021, HashiCorp announced a fully managed offering, HashiCorp Cloud Platform Vault (HCP Vault), that simplifies deployment and management of the Vault.

Aug 13, 2021

Migrating PostgreSQL to Kubernetes

More and more companies are adopting Kubernetes. For some it is about being cutting-edge; for others, it is a well-defined strategy and a business transformation. Developers and operations teams all over the world are struggling with moving applications that aren’t cloud-native friendly to containers and Kubernetes.

Migrating databases is always a challenge, which comes with risks and downtime for businesses. Today I’m going to show how easy it is to migrate a PostgreSQL database to Kubernetes with minimal downtime with Percona Distribution for PostgreSQL Operator.

Goal

To perform the migration I’m going to use the following setup:

Migrating PostgreSQL to Kubernetes

  1. PostgreSQL database deployed on-prem or somewhere in the cloud. It will be the Source.
  2. Google Kubernetes Engine (GKE) cluster where Percona Operator deploys and manages PostgreSQL cluster (the Target) and pgBackRest Pod
  3. PostgreSQL backups and Write Ahead Logs are uploaded to some Object Storage bucket (GCS in my case)
  4. pgBackRest Pod reads the data from the bucket
  5. pgBackRest Pod restores the data continuously to the PostgreSQL cluster in Kubernetes

The data should be continuously synchronized. In the end, I want to shut down PostgreSQL running on-prem and only keep the cluster in GKE.

Migration

Prerequisites

To replicate the setup you will need the following:

  • PostgreSQL (v 12 or 13) running somewhere
  • pgBackRest installed
  • Google Cloud Storage or any S3 bucket. My examples will be about GCS.
  • Kubernetes cluster

Configure The Source

I have Percona Distribution for PostgreSQL version 13 running on some Linux machines.

1. Configure pgBackrest

# cat /etc/pgbackrest.conf
[global]
log-level-console=info
log-level-file=debug
start-fast=y

[db]
pg1-path=/var/lib/postgresql/13/main
repo1-type=gcs
repo1-gcs-bucket=sp-test-1
repo1-gcs-key=/tmp/gcs.key
repo1-path=/on-prem-pg

  • pg1-path should point to PostgreSQL data directory
  • repo1-type is set to GCS as we want our backups to go there
  • The key is in /tmp/gcs.key file. The key can be obtained through Google Cloud UI. Read more about it here.
  • The backups are going to be stored in on-prem-pg folder in sp-test-1 bucket

2. Edit the postgresql.conf config to enable archival through pgBackRest:

archive_mode = on   
archive_command = 'pgbackrest --stanza=db archive-push %p'

A restart is required after changing the configuration.

3. The Operator requires a postgresql.conf file in the data directory. It is enough to have an empty file:

touch /var/lib/postgresql/13/main/postgresql.conf

4. The primaryuser must be created on the Source to ensure replication is correctly set up by the Operator.

# create user primaryuser with encrypted password '<PRIMARYUSER PASSWORD>' replication;

Configure The Target

1. Deploy Percona Distribution for PostgreSQL Operator on Kubernetes. Read more about it in the documentation here.

# create the namespace
kubectl create namespace pgo

# clone the git repository
git clone -b v0.2.0 https://github.com/percona/percona-postgresql-operator/
cd percona-postgresql-operator

# deploy the operator
kubectl apply -f deploy/operator.yaml

2. Edit main custom resource manifest – deploy/cr.yaml.

  • I’m not going to change the cluster name and will keep it cluster1.
  • The cluster is going to operate in Standby mode, which means it is going to sync the data from the GCS bucket. Set spec.standby to true.
  • Configure GCS itself. The spec.backup section would look like this (bucket and repoPath are the same as in the pgBackRest configuration above):

backup:
...
    repoPath: "/on-prem-pg"
...
    storages:
      my-s3:
        type: gcs
        endpointUrl: https://storage.googleapis.com
        region: us-central1-a
        uriStyle: path
        verifyTLS: false
        bucket: sp-test-1
    storageTypes: [
      "gcs"
    ]

  • I would like to have at least one Replica in my PostgreSQL cluster. Set spec.pgReplicas.hotStandby.size to 1.

3. The Operator should be able to authenticate with GCS. To do that, we need to create a Secret object called <CLUSTERNAME>-backrest-repo-config with gcs-key in its data. It should be the same key we used on the Source. See the example of this secret here.
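A sketch of what such a gcs.yaml could look like (the cluster name cluster1 and the pgo namespace match this walkthrough; the key content must be base64-encoded):

apiVersion: v1
kind: Secret
metadata:
  name: cluster1-backrest-repo-config
  namespace: pgo
type: Opaque
data:
  gcs-key: <base64-encoded contents of /tmp/gcs.key>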

kubectl apply -f gcs.yaml

4. Create the users by creating Secret objects: postgres and primaryuser (the one we created on the Source). See the examples of user Secrets here. The passwords should be the same as on the Source.
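A rough sketch of users.yaml (the Secret names below follow the <CLUSTERNAME>-<USERNAME>-secret convention from the referenced examples – please verify them against the examples for your Operator version):

apiVersion: v1
kind: Secret
metadata:
  name: cluster1-postgres-secret
  namespace: pgo
type: Opaque
stringData:
  username: postgres
  password: <same postgres password as on the Source>
---
apiVersion: v1
kind: Secret
metadata:
  name: cluster1-primaryuser-secret
  namespace: pgo
type: Opaque
stringData:
  username: primaryuser
  password: <PRIMARYUSER PASSWORD used on the Source>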

kubectl apply -f users.yaml

5. Now let’s deploy our cluster on Kubernetes by applying cr.yaml:

kubectl apply -f deploy/cr.yaml

Verify and Troubleshoot

If everything is done correctly you should see the following in the Primary Pod logs:

kubectl -n pgo logs -f --tail=20 cluster1-5dfb96f77d-7m2rs
2021-07-30 10:41:08,286 INFO: Reaped pid=548, exit status=0
2021-07-30 10:41:08,298 INFO: establishing a new patroni connection to the postgres cluster
2021-07-30 10:41:08,359 INFO: initialized a new cluster
Fri Jul 30 10:41:09 UTC 2021 INFO: PGHA_INIT is 'true', waiting to initialize as primary
Fri Jul 30 10:41:09 UTC 2021 INFO: Node cluster1-5dfb96f77d-7m2rs fully initialized for cluster cluster1 and is ready for use
2021-07-30 10:41:18,781 INFO: Lock owner: cluster1-5dfb96f77d-7m2rs; I am cluster1-5dfb96f77d-7m2rs
2021-07-30 10:41:18,810 INFO: no action.  i am the standby leader with the lock
2021-07-30 10:41:28,781 INFO: Lock owner: cluster1-5dfb96f77d-7m2rs; I am cluster1-5dfb96f77d-7m2rs
2021-07-30 10:41:28,832 INFO: no action.  i am the standby leader with the lock

Change some data on the Source and ensure that it is properly synchronized to the Target cluster.

Common Issues

The following error message indicates that you forgot to create the postgresql.conf file in the data directory:

FileNotFoundError: [Errno 2] No such file or directory: '/pgdata/cluster1/postgresql.conf' -> '/pgdata/cluster1/postgresql.base.conf'

Sometimes it is easy to forget to create the primaryuser and you will see the following in the logs:

psycopg2.OperationalError: FATAL:  password authentication failed for user "primaryuser"

Wrong or missing object store credentials will trigger the following error:

WARN: repo1: [CryptoError] unable to load info file '/on-prem-pg/backup/db/backup.info' or '/on-prem-pg/backup/db/backup.info.copy':
      CryptoError: raised from remote-0 protocol on 'cluster1-backrest-shared-repo': unable to read PEM: [218529960] wrong tag
      HINT: is or was the repo encrypted?
      CryptoError: raised from remote-0 protocol on 'cluster1-backrest-shared-repo': unable to read PEM: [218595386] nested asn1 error
      HINT: is or was the repo encrypted?
      HINT: backup.info cannot be opened and is required to perform a backup.
      HINT: has a stanza-create been performed?
ERROR: [075]: no backup set found to restore
Fri Jul 30 10:54:00 UTC 2021 ERROR: pgBackRest standby Creation: pgBackRest restore failed when creating standby

Cutover

Everything looks good and it is time to perform the cutover. In this blog post, I cover only the database side but do not forget that your application should be reconfigured to point to the correct PostgreSQL cluster. It might be a good idea to stop the application before the cutover.

1. Stop the source PostgreSQL cluster to ensure no data is written

systemctl stop postgresql

2. Promote the Target cluster to primary. To do that, remove spec.backup.repoPath, change spec.standby to false in deploy/cr.yaml, and apply the changes:

kubectl apply -f deploy/cr.yaml

PostgreSQL will be restarted automatically and you will see the following in the logs:

2021-07-30 11:16:20,020 INFO: updated leader lock during promote
2021-07-30 11:16:20,025 INFO: Changed archive_mode from on to True (restart might be required)
2021-07-30 11:16:20,025 INFO: Changed max_wal_senders from 10 to 6 (restart might be required)
2021-07-30 11:16:20,027 INFO: Reloading PostgreSQL configuration.
server signaled
2021-07-30 11:16:21,037 INFO: Lock owner: cluster1-5dfb96f77d-n4c79; I am cluster1-5dfb96f77d-n4c79
2021-07-30 11:16:21,132 INFO: no action.  i am the leader with the lock

Conclusion

Deploying and managing database clusters is not an easy task. The recently released Percona Distribution for PostgreSQL Operator automates day-1 and day-2 operations and turns running PostgreSQL on Kubernetes into a smooth and pleasant journey.

With Kubernetes becoming the default control plane, the most common task for developers and operations teams is to perform the migration, which usually turns into a complex project. This blog post shows that database migration can be an easy task with minimal downtime.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions please raise the topic in the community forum.

Are you a developer looking to contribute? Please read our CONTRIBUTING.md and send us a Pull Request.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.

Download Percona Distribution for PostgreSQL Today!

Jul 28, 2021

Building and Testing Percona Distribution for MongoDB Operator

Recently I wanted to play with the latest and greatest Percona Distribution for MongoDB Operator, which had a bug fix I was interested in. The bug fix was merged into the main branch of the git repository, but no version of the Operator that includes this fix had been released yet. I started the Operator by cloning the main branch, but the bug was still reproducible. The reason was simple – the main branch had the last released version of the Operator in bundle.yaml, instead of the main branch build:

  spec:
      containers:
        - name: percona-server-mongodb-operator
          image: percona/percona-server-mongodb-operator:1.9.0

instead of 

   spec:
      containers:
        - name: percona-server-mongodb-operator
          image: perconalab/percona-server-mongodb-operator:main

Then I decided to dig deeper to see how hard it is to do a small change in the Operator code and test it.

This blog post is a beginner contributor guide where I tried to follow our CONTRIBUTING.md and Building and testing the Operator manual to build and test Percona Distribution for MongoDB Operator.

Requirements

The requirements section was the first blocker for me, as I’m used to running Ubuntu, but the examples we have are for CentOS and macOS. For all Ubuntu fans, here are the instructions:

echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key --keyring /usr/share/keyrings/cloud.google.gpg add -
sudo apt-get update
sudo apt-get install -y google-cloud-sdk docker.io kubectl jq
sudo snap install helm --classic
sudo snap install yq --channel=v3/stable
curl -s -L https://github.com/openshift/origin/releases/download/v3.11.0/openshift-origin-client-tools-v3.11.0-0cbc58b-linux-64bit.tar.gz | sudo tar -C /usr/bin --strip-components 1 --wildcards -zxvpf - '*/oc'

I have also prepared a Pull Request to fix our docs and drafted a cloud-init file to simplify environment provisioning.

Build

Get the code from GitHub main branch:

git clone https://github.com/percona/percona-server-mongodb-operator

Change some code. Now it is time to build the Operator image and push it to the registry. Docker Hub is a nice choice for beginners, as it does not require any installation or configuration, but to keep it local you might want to install your own registry. See Docker Registry, Harbor, Trow.

The ./e2e-tests/build command builds the image and pushes it to the registry, which you specify in the IMAGE environment variable like this:

export IMAGE=bob/my_repository_for_test_images:K8SPSMDB-372-fix-feature-X

Fixing the Issues

For me the execution of the build command failed for multiple reasons:

1. Most probably you need to run it as root to get access to the Docker unix socket, or just add your user to the docker group:

Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock

2. Once I ran it as root, I got the following error:

"--squash" is only supported on a Docker daemon with experimental features enabled

It is quite easy to fix by adding the experimental flag to the /etc/docker/daemon.json file:

{
    "experimental": true
}

I have added it into the cloud-init file and will fix it in the same PR in the docs.

3. The third failure was on the last stage of pushing the image: 

denied: requested access to the resource is denied

Obviously, you should be authorized to push to the registry. Running docker login fixed it for me just fine.

Finally, the image is built and pushed to the registry:

The push refers to repository [docker.io/bob/my_repository_for_test_images]
0014bf17d462: Pushed
...
K8SPSMDB-372-fix-feature-X: digest: sha256:458066396fdd6ac358bcd78ed4d8f5279ff0295223f1d7fbec0e6d429c01fb16 size: 949

Test

The ./e2e-tests/run command executes the tests in the e2e-tests folder one by one; as you can see, there are multiple scenarios:

"$dir/init-deploy/run" || fail "init-deploy"
"$dir/limits/run" || fail "limits"
"$dir/scaling/run" || fail "scaling"
"$dir/monitoring/run" || fail "monitoring"
"$dir/monitoring-2-0/run" || fail "monitoring-2-0"
"$dir/liveness/run" || fail "liveness"
"$dir/one-pod/run" || fail "one-pod"
"$dir/service-per-pod/run" || fail "service-per-pod"
"$dir/arbiter/run" || fail "arbiter"
"$dir/demand-backup/run" || fail "demand-backup"
"$dir/demand-backup-sharded/run" || fail "demand-backup-sharded"
"$dir/scheduled-backup/run" || fail "scheduled-backup"
"$dir/security-context/run" || fail "security-context"
"$dir/storage/run" || fail "storage"
"$dir/self-healing/run" || fail "self-healing"
"$dir/self-healing-chaos/run" || fail "self-healing-chaos"
"$dir/operator-self-healing/run" || fail "operator-self-healing"
"$dir/operator-self-healing-chaos/run" || fail "operator-self-healing-chaos"
"$dir/smart-update/run" || fail "smart-update"
"$dir/version-service/run" || fail "version-service"
"$dir/users/run" || fail "users"
"$dir/rs-shard-migration/run" || fail "rs-shard-migration"
"$dir/data-sharded/run" || fail "data-sharded"
"$dir/upgrade/run" || fail "upgrade"
"$dir/upgrade-sharded/run" || fail "upgrade-sharded"
"$dir/upgrade-consistency/run" || fail "upgrade-consistency"
"$dir/pitr/run" || fail "pitr"
"$dir/pitr-sharded/run" || fail "pitr-sharded"

Obviously, it is possible to run the tests one by one.
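
For example, to run only the first scenario from the list above:

./e2e-tests/init-deploy/run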

kubectl must be configured and pointing to a working Kubernetes cluster. If something is missing or not working, the tests will tell you.

The only issue I faced was the readability of the test results. The logging of the test execution is pretty verbose, so I recommend redirecting the output to a file for further debugging:

./e2e-tests/run >> /tmp/e2e-tests.out 2>&1

At Percona, we rely on Jenkins to automatically run the tests and verify the results for each Pull Request.

Conclusion

Contribution guides are written for developers by developers, so they often have gaps or unclear instructions which sometimes require experience to resolve. Such minor issues might scare off potential contributors, and as a result the project does not get the Pull Request with an awesome implementation of the brightest idea. Percona embraces open source culture and values contributors by providing simple tools to develop and test their ideas.

Writing this blog post resulted in two Pull Requests:

  1. Use the :main tag for container images in the main branch (link)

  2. Remove some gaps in the docs (link)

There is always room for improvement and a time to find a better way. Please let us know if you face any issues with contributing your ideas to Percona products. You can do that on the Community Forum or JIRA. Read more about contribution guidelines for Percona Distribution for MongoDB Operator in CONTRIBUTING.md.

Jun
23
2021
--

MySQL on Kubernetes with GitOps

MySQL on Kubernetes with GitOps

GitOps workflow was introduced by WeaveWorks as a way to implement Continuous Deployment for cloud-native applications. This technique quickly found its way into DevOps engineers’ and developers’ hearts, as it greatly simplifies the application delivery pipeline: a change to the manifests in the git repository is reflected in Kubernetes right away. With GitOps there is no need to give the developer access to the cluster, as all the actions are executed by the Operator.

This blog post is a guide on how to deploy Percona Distribution for MySQL on Kubernetes with Flux – GitOps Operator that keeps your cluster state in sync with the Git repository.

Percona Distribution for MySQL on Kubernetes with Flux

In a nutshell, the flow is the following:

  1. Developer triggers the change in the GitHub repository
  2. Flux Operator:
    1. detects the change
    2. deploys Percona Distribution for MySQL Operator
    3. creates the Custom Resource, which triggers the creation of Percona XtraDB Cluster and HAProxy pods

The result is a fully working MySQL service deployed without talking to Kubernetes API directly.

Preparation

Prerequisites:

  • Kubernetes cluster
  • GitHub account
    • For this blog post, I used the manifests from this repository 

It is a good practice to create a separate namespace for Flux:

$ kubectl create namespace gitops

Installing and managing Flux is easier with

fluxctl

. On Ubuntu, I use snap to install tools; for other operating systems, please refer to the manual here.

$ sudo snap install fluxctl --classic

Install Flux operator to your Kubernetes cluster:

$ fluxctl install --git-email=your@email.com --git-url=git@github.com:spron-in/blog-data.git --git-path=gitops-mysql --manifest-generation=true --git-branch=master --namespace=gitops | kubectl apply -f -

GitHub Sync

As per the configuration, Flux will continuously monitor the changes in the spron-in/blog-data repository and sync the state. You need to grant Flux access to the repo.

Get the public key that was generated during the installation:

$ fluxctl identity --k8s-fwd-ns gitops

Copy the key, add it as Deploy key with write access in GitHub. Go to Settings -> Deploy keys -> Add deploy key:

Action

All set. The Flux reconcile loop checks the state for changes every five minutes. To trigger synchronization right away, run:

$ fluxctl sync --k8s-fwd-ns gitops

In my case I have two YAMLs in the repo:

  • bundle.yaml

    – installs the Operator, creates the Custom Resource Definitions (CRDs)

  • cr.yaml

    – deploys PXC and HAProxy pods

Flux is going to deploy them both.
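
Based on the --git-path=gitops-mysql flag used during installation, the layout of the repository that Flux watches looks roughly like this (a sketch of my repo, not a required structure):

blog-data/
└── gitops-mysql/
    ├── bundle.yaml
    └── cr.yaml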

$ kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
cluster1-haproxy-0                                 2/2     Running   0          26m
cluster1-haproxy-1                                 2/2     Running   0          25m
cluster1-pxc-0                                     1/1     Running   0          26m
cluster1-pxc-1                                     1/1     Running   0          25m
cluster1-pxc-2                                     1/1     Running   0          23m
percona-xtradb-cluster-operator-79966668bd-95plv   1/1     Running   0          26m

Now let’s add one more HAProxy Pod by changing

spec.haproxy.size

from 2 to 3 in

cr.yaml

. After that, commit and push the changes. In a production-grade scenario, the Pull Request would go through a thorough review; in my case, I push directly to the main branch.
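
For reference, the relevant fragment of cr.yaml after the change looks roughly like this (all other fields omitted; the layout follows the usual PXC Custom Resource structure):

spec:
  haproxy:
    enabled: true
    size: 3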

$ git commit cr.yaml -m 'increase haproxy size from 2 to 3'
$ git push
Enumerating objects: 7, done.
Counting objects: 100% (7/7), done.
Delta compression using up to 2 threads
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 385 bytes | 385.00 KiB/s, done.
Total 4 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To https://github.com/spron-in/blog-data
   e1a27b8..d555c77  master -> master

Either trigger the sync with

fluxctl sync

command, or wait approximately five minutes for the Flux reconcile loop to detect the changes. In the logs of the Flux Operator you will see the event:

ts=2021-06-15T12:59:08.267469963Z caller=loop.go:134 component=sync-loop event=refreshed url=ssh://git@github.com/spron-in/blog-data.git branch=master HEAD=d555c77c19ea9d1685392680186e1491905401cc
ts=2021-06-15T12:59:08.270678093Z caller=sync.go:61 component=daemon info="trying to sync git changes to the cluster" old=e1a27b8a81e640d3bee9bc2e2c31f9c4189e898a new=d555c77c19ea9d1685392680186e1491905401cc
ts=2021-06-15T12:59:08.844068322Z caller=sync.go:540 method=Sync cmd=apply args= count=9
ts=2021-06-15T12:59:09.097835721Z caller=sync.go:606 method=Sync cmd="kubectl apply -f -" took=253.684342ms err=null output="serviceaccount/percona-xtradb-cluster-operator unchanged\nrole.rbac.authorization.k8s.io/percona-xtradb-cluster-operator unchanged\ncustomresourcedefinition.apiextensions.k8s.io/perconaxtradbbackups.pxc.percona.com configured\ncustomresourcedefinition.apiextensions.k8s.io/perconaxtradbclusterbackups.pxc.percona.com unchanged\ncustomresourcedefinition.apiextensions.k8s.io/perconaxtradbclusterrestores.pxc.percona.com unchanged\ncustomresourcedefinition.apiextensions.k8s.io/perconaxtradbclusters.pxc.percona.com unchanged\nrolebinding.rbac.authorization.k8s.io/service-account-percona-xtradb-cluster-operator unchanged\ndeployment.apps/percona-xtradb-cluster-operator unchanged\nperconaxtradbcluster.pxc.percona.com/cluster1 configured"
ts=2021-06-15T12:59:09.099258988Z caller=daemon.go:701 component=daemon event="Sync: d555c77, default:perconaxtradbcluster/cluster1" logupstream=false
ts=2021-06-15T12:59:11.387525662Z caller=loop.go:236 component=sync-loop state="tag flux" old=e1a27b8a81e640d3bee9bc2e2c31f9c4189e898a new=d555c77c19ea9d1685392680186e1491905401cc
ts=2021-06-15T12:59:12.122386802Z caller=loop.go:134 component=sync-loop event=refreshed url=ssh://git@github.com/spron-in/blog-data.git branch=master HEAD=d555c77c19ea9d1685392680186e1491905401cc

The log indicates that the main CR was configured:

perconaxtradbcluster.pxc.percona.com/cluster1 configured

 

Now we have three HAProxy Pods:

$ kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
cluster1-haproxy-0                                 2/2     Running   1          50m
cluster1-haproxy-1                                 2/2     Running   0          48m
cluster1-haproxy-2                                 2/2     Running   0          4m45s

It is important to note that GitOps maintains the sync between Kubernetes and GitHub. It means that if the user manually changes an object on Kubernetes, Flux (or any other GitOps Operator) will revert the changes and sync them with GitHub.

GitOps also comes in handy when users want to take a backup or perform a restore. To do that, the user just creates YAML manifests in the GitHub repo and Flux creates the corresponding Kubernetes objects. The Database Operator does the rest.
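
For example, an on-demand backup can be requested by committing a manifest like the sketch below; the object name backup1 is a placeholder, and the storageName must reference a storage defined in the main Custom Resource:

apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterBackup
metadata:
  name: backup1
spec:
  pxcCluster: cluster1
  storageName: s3-us-west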

Conclusion

GitOps is a simple approach to deploy and manage applications on Kubernetes:

  • Change management is provided by git version control and code reviews
  • Direct access to the Kubernetes API is limited, which improves security
  • Infrastructure-as-Code is here; there is no need to integrate Terraform, Ansible, or any other tool

All Percona Operators can be deployed and managed with GitOps. As a result, you will get a production-grade MySQL, MongoDB, or PostgreSQL cluster which just works.

May
26
2021
--

Percona Distribution for PostgreSQL Operator Technical Preview Released

Percona Distribution for PostgreSQL Operator

Percona is championing open source database software and we are committed to running our products on Kubernetes. We don’t just want to run the software; we want to make sure that the database is highly available, secure, and follows best practices. We also focus on day-2 operations such as scaling, backup and restore, disaster recovery, and customization.

To get there, we have Operators – the software framework that extends Kubernetes APIs and provides control over database deployment and operations through the control plane. Until May we had two Operators: one for Percona XtraDB Cluster and one for Percona Server for MongoDB.

The only missing piece was Percona Distribution for PostgreSQL, for which we introduced the Operator during Percona Live in May 2021. This completes our vision for deploying our software on Kubernetes. See the release notes of the initial version here.

Kubernetes Operator FAQ

This blog post is intended to answer some frequently asked questions we received from our community about Percona Distribution for PostgreSQL Operator.

Is This a Brand New Operator?

No. Our Operator is based on PGO, the Postgres Operator from Crunchy Data, which we modified and enhanced in order to support our PostgreSQL distribution.

Why CrunchyData Operator?

As noted above, we are committed to running our database software on Kubernetes. There are multiple ways to achieve this goal:

  1. Develop a new Operator from scratch
  2. Collaborate and contribute necessary changes to an existing Operator
  3. Fork an existing Operator

Option (1) looks great, but it is time- and effort-intensive, and we might be re-inventing existing wheels. Our goal is to minimize Time To Market (TTM), so we dropped this option right away.

For options (2) and (3), there are at least three different PostgreSQL Operators in active development: Stackgres, the Zalando Postgres Operator, and PGO from Crunchy Data.

Stackgres is written in Java, and our engineering team is more familiar with C/C++ and Golang; we do not see that changing in the near future. The Zalando Operator is great and provides a lot of functionality out of the box, but our engineering team estimated the effort to make the needed changes to be almost the same as writing an Operator from scratch.

PGO is written in Golang and provides the features we were looking for: high availability with Patroni, scaling, backups, and many more. Our engineering team did not flag any complexity of introducing the changes and we jumped to work.

Will Percona use PGO as an Upstream?

For now – yes. We will be merging new features implemented in PGO into our fork. Version 0.1.0 of our Operator is based on the 4.6.2 version of PGO. Version 0.2.0 will be based on version 4.7.X. At the same time, we want to contribute back some of our changes to the upstream and have already sent some pull requests (one, two). We’ll continue submitting patches to the upstream project.

What is Different About Percona Operator?

The main differences were highlighted in the release notes. Here they are:

  • Percona Distribution for PostgreSQL is now used as the main container image. CrunchyData container images are provided under the Crunchy Data Developer Program, which means that without an active contract they cannot be used in production. Percona container images are fully open source and do not have any limitations for use.
  • It is possible to specify custom images for all components separately. For example, users can easily build and use custom images for one or several components (e.g. pgBouncer) while all other images will be the official ones. Also, users can build and use all custom images.
  • All container images are reworked and simplified. They are built on Red Hat Universal Base Image (UBI) 8.
  • The Operator has built-in integration with Percona Monitoring and Management (PMM) v2.
  • A build/test infrastructure was created, and we have started adding e2e tests to be sure that all pieces of the cluster work together as expected.
  • We have phased out the PGO CLI tool, and the Custom Resource UX will be completely aligned with other Percona Operators in the following release.

For future releases, our goal is to cover the feature and UX parity between the Operators, so that our users will have the same look and feel for all three database engines.

What Does Tech Preview Mean?

Tech Preview Features are not yet ready for production use and are not covered by support SLAs (Service Level Agreements). They are included in this release so that users can provide feedback prior to the full release of the feature in a future GA (General Availability) release (or removal of the feature if it is deemed not useful). This functionality can change (APIs, CLIs, etc.) between tech preview and GA.

When is GA Coming and What is Going to be Included?

Our goal is to release the GA version early in Q3. The changes in this version will include:

  • Moving control over replicas to the main Custom Resource instead of managing them separately
  • Changing the main manifest to provide the same look and feel as in other Percona Operators
  • Reworking scheduled backups and controlling them with the main CR and Kubernetes primitives
  • Adding support for Google Cloud Storage (this will be merged from upstream)

Call for Action

To install our Operator and learn more about it please read the documentation.

Our Operator is licensed under Apache 2.0 and can be found in percona-postgresql-operator repository on Github. There are multiple ways to contact us or share your ideas:

  • To report a bug, use jira.percona.com and create the bug in the K8SPG project
  • For general questions and sharing your thoughts, we have a community forum or Discord where we chat about open source, databases, Kubernetes, and many more.
  • We have a public roadmap where you can see what is planned and what is under development. You can share your ideas about new features there as well.
May
20
2021
--

Manage MySQL Users with Kubernetes

Manage MySQL Users with Kubernetes

Quite a common request that we receive from the community and customers is to provide a way to manage database users with Operators – both MongoDB and MySQL. Even though we see it as an interesting task, our Operators are mainly a tool to simplify the deployment and management of our software on Kubernetes. Our goal is to provide a database cluster which is ready to host mission-critical applications and is deployed with best practices.

Why Manage Users with Operators?

There are few use cases:

  1. Simplify the CI/CD pipeline. It is simpler to apply a single manifest than to run multiple commands to create the user after the DB is ready.
  2. Give control over DB users to developers or applications, but without providing direct access to the database.
  3. There is an opinion that Kubernetes will transition from a container orchestrator to a control plane that manages everything. For some companies, it is a strategy.

We want to have the functionality to provision users with Operators, but implementing it separately for each Operator does not seem to be the right solution. It looks like it can be unified.

What if we take it to another level and create a way to provision users on any database through the Kubernetes control plane? The user has created the MySQL instance on a public cloud through the control plane, so why not create the DB user the same way?

Crossplane.io is a Kubernetes add-on that enables users to declaratively describe and provision infrastructure through the k8s control plane. By design, it is extendable through “providers”. One of them – provider-sql – enables the functionality to manage MySQL and PostgreSQL users (and even databases) through CRDs. Let’s see how to make it work with Percona XtraDB Cluster Operator.

Action

Prerequisites:

  • Kubernetes cluster
  • Percona XtraDB Cluster deployed with Operator (see the docs here)

The goal is to create a Custom Resource (CR) object through the Kubernetes API to trigger crossplane.io to create the user on the PXC cluster. In summary, it will look like this:

Manage MySQL Users with Kubernetes

  1. A user creates the CR with the desired user and grants
  2. Crossplane detects it
  3. provider-sql (Crossplane provider) is configured through a Secret object which has the PXC endpoint and root credentials
  4. provider-sql connects to PXC and creates the user

I have placed all the files for this blog post into the public github repository along with the condensed runbook which just lists all the steps. You can find it all here.

Install crossplane

The simplest way is to do it through helm:

kubectl create namespace crossplane
helm repo add crossplane-alpha https://charts.crossplane.io/alpha
helm install crossplane --namespace crossplane crossplane-alpha/crossplane

Other installation methods can be found in crossplane.io documentation.

Install provider-sql

A Provider in Crossplane is a concept similar to Terraform’s providers. Everything in Crossplane is done through the Kubernetes API, and that includes the installation of Providers:

$ cat crossplane-provider-sql.yaml
apiVersion: pkg.crossplane.io/v1beta1
kind: Provider
metadata:
  name: provider-sql
spec:
  package: "crossplane/provider-sql:master"

$ kubectl apply -f crossplane-provider-sql.yaml

This is going to install Custom Resource Definitions for provider-sql which is going to be used to manage MySQL users on our PXC cluster. Full docs for provider-sql can be found here, but they are not very detailed.

Almost There

Everything is installed and only needs a final configuration touch.

  1. Create a secret which provider-sql is going to use to connect to the MySQL database. I have cluster1 Percona XtraDB cluster deployed in a pxc namespace and the corresponding secret will look like this:
$ cat crossplane-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: crossplane-secret
  namespace: pxc
stringData:
  username: root
  password: <root password>
  endpoint: cluster1-haproxy.pxc.svc.cluster.local
  port: "3306"

type: Opaque

You can get the root password from the secret which is created when PXC is deployed through the Operator. A quick way to get it is:

$ kubectl get secret -n pxc my-cluster-secrets -o yaml | awk '/root:/ {print $2}' | base64 --decode && echo
<root password>
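
An equivalent one-liner using jsonpath (the secret name my-cluster-secrets is the one used in my deployment) is:

$ kubectl get secret -n pxc my-cluster-secrets -o jsonpath='{.data.root}' | base64 --decode && echo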

Crossplane will use the endpoint and port as the MySQL connection string, and the username and password to connect to it. Configure provider-sql to get this information from the secret:

$ cat crossplane-mysql-config.yaml
apiVersion: mysql.sql.crossplane.io/v1alpha1
kind: ProviderConfig
metadata:
  name: cluster1-pxc
spec:
  credentials:
    source: MySQLConnectionSecret
    connectionSecretRef:
      namespace: pxc
      name: crossplane-secret

$ kubectl apply -f crossplane-mysql-config.yaml

Let’s verify that configuration is in place:

$ kubectl get providerconfig.mysql.sql.crossplane.io
NAME           AGE
cluster1-pxc   14s

Do It

All set. Crossplane can now connect to the database and create the users. From the Kubernetes and user perspective, it is just creating the custom resources through the control plane API.

Database Creation

$ cat crossplane-db.yaml
apiVersion: mysql.sql.crossplane.io/v1alpha1
kind: Database
metadata:
  name: my-db
spec:
  providerConfigRef:
    name: cluster1-pxc

$ kubectl apply -f crossplane-db.yaml
database.mysql.sql.crossplane.io/my-db created

$ kubectl get database.mysql.sql.crossplane.io
NAME    READY   SYNCED   AGE
my-db   True    True     14s

This created the database on my Percona XtraDB Cluster:

$ mysql -u root -p -h cluster1-haproxy
Server version: 8.0.22-13.1 Percona XtraDB Cluster (GPL), Release rel13, Revision a48e6d5, WSREP version 26.4.3
...
mysql> show databases like 'my-db';
+------------------+
| Database (my-db) |
+------------------+
| my-db            |
+------------------+
1 row in set (0.01 sec)

The DB can be deleted through Kubernetes API as well – just delete the corresponding

database.mysql.sql.crossplane.io

  object.
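
For example, assuming the my-db object created above:

$ kubectl delete database.mysql.sql.crossplane.io my-db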

User Creation

The user needs a password. Password should never be stored as plain text, so let’s put it into a Secret:

$ cat user-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-user-secret
stringData:
  password: mysuperpass
type: Opaque
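
Apply the Secret first so it exists before the User object references it:

$ kubectl apply -f user-secret.yaml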

We can create the user now:

$ cat crossplane-user.yaml
apiVersion: mysql.sql.crossplane.io/v1alpha1
kind: User
metadata:
  name: my-user
spec:
  providerConfigRef:
    name: cluster1-pxc
  forProvider:
    passwordSecretRef:
      name: my-user-secret
      namespace: default
      key: password
  writeConnectionSecretToRef:
    name: connection-secret
    namespace: default

$ kubectl apply -f crossplane-user.yaml
user.mysql.sql.crossplane.io/my-user created

$ kubectl get user.mysql.sql.crossplane.io
NAME      READY   SYNCED   AGE
my-user   True    True     11s

And add some grants:

$ cat crossplane-grants.yaml
apiVersion: mysql.sql.crossplane.io/v1alpha1
kind: Grant
metadata:
  name: my-grant
spec:
  providerConfigRef:
    name: cluster1-pxc
  forProvider:
    privileges:
      - DROP
      - CREATE ROUTINE
      - EVENT
    userRef:
      name: my-user
    databaseRef:
      name: my-db

$ kubectl apply -f crossplane-grants.yaml
grant.mysql.sql.crossplane.io/my-grant created

$ kubectl get grant.mysql.sql.crossplane.io
NAME       READY   SYNCED   AGE   ROLE      DATABASE   PRIVILEGES
my-grant   True    True     7s    my-user   my-db      [DROP CREATE ROUTINE EVENT]

Verify that the user is there:

mysql> show grants for 'my-user';
+-----------------------------------------------------------------+
| Grants for my-user@%                                            |
+-----------------------------------------------------------------+
| GRANT USAGE ON *.* TO `my-user`@`%`                             |
| GRANT DROP, CREATE ROUTINE, EVENT ON `my-db`.* TO `my-user`@`%` |
+-----------------------------------------------------------------+
2 rows in set (0.00 sec)

Keeping the State

Kubernetes is declarative and its controllers always do their best to keep the declared configuration and the real state in sync. It means that if you delete the user manually from the database (not through the Kubernetes API), on the next pass of the reconcile loop Crossplane will sync the state and recreate the user and grants again.
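
A quick way to see this behavior (a sketch, reusing the cluster and user names from above) is to drop the user directly in MySQL and then watch the Crossplane object until the reconcile loop recreates it:

$ mysql -u root -p -h cluster1-haproxy -e "DROP USER 'my-user'@'%'"
$ kubectl get user.mysql.sql.crossplane.io my-user --watch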

Conclusion

Some functionality in one database engine differs a lot from another, but sometimes there is a pattern. User creation is one of these patterns that can be unified across multiple database engines. Luckily, the Cloud Native Computing Foundation landscape is huge and consists of a lot of building blocks which, when used together, can deliver wonderful infrastructures or applications.

This blog post shows that the community might have already found a better solution to the problem and re-inventing it might be a waste of time.

Extending crossplane.io providers to support other database engines (like MongoDB) is a challenge but can be solved. We are drafting a proposal and will work with our teams and community to deliver this.

May
19
2021
--

Percona Monitoring and Management DBaaS Overview and Technical Details

Percona Monitoring and Management DBaaS Overview

Database-as-a-Service (DBaaS) is a managed database that doesn’t need to be installed and maintained but is instead provided as a service to the user. The Percona Monitoring and Management (PMM) DBaaS component allows users to CRUD (Create, Read, Update, Delete) Percona XtraDB Cluster (PXC) and Percona Server for MongoDB (PSMDB) managed databases in Kubernetes clusters.

The PXC and PSMDB Operators implement DBaaS on top of Kubernetes (k8s), and PMM DBaaS provides a nice interface and API to manage them.

Deploy Playground with minikube

The easiest way to play with and test PMM DBaaS is to use minikube. Please follow the minikube installation guideline. It is possible that your OS distribution provides native packages for it, so check that with your package manager as well.

In the examples below, Linux is used with the kvm2 driver, so KVM and libvirt should also be installed. Other operating systems and drivers can be used as well. Install the kubectl tool too: it is more convenient to use, and minikube will configure kubeconfig so the k8s cluster can easily be accessed from the host.

Let’s create a k8s cluster and adjust resources as needed. The minimum requirements can be found in the documentation.

  • Start minikube cluster
$ minikube start --cpus 12 --memory 32G --driver=kvm2

  • Download PMM Server deployment for minikube and deploy it in k8s cluster
$ curl -sSf -m 30 https://raw.githubusercontent.com/percona-platform/dbaas-controller/main/deploy/pmm-server-minikube.yaml \
| kubectl apply -f -

  • The first time, it could take a while for the PMM Server to initialize the volume, but it will eventually start
  • Here’s how to check that PMM Server deployment is running:
$ kubectl get deployment
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
pmm-deployment   1/1     1            1           3m40s

$ kubectl get pods
NAME                             READY   STATUS    RESTARTS   AGE
pmm-deployment-d688fb846-mtc62   1/1     Running   0          3m42s

$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM              STORAGECLASS   REASON   AGE
pmm-data                                   10Gi       RWO            Retain           Available                                              3m44s
pvc-cb3a0a18-b6dd-4b2e-92a5-dfc0bc79d880   10Gi       RWO            Delete           Bound       default/pmm-data   standard                3m44s

$ kubectl get pvc
NAME       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pmm-data   Bound    pvc-cb3a0a18-b6dd-4b2e-92a5-dfc0bc79d880   10Gi       RWO            standard       3m45s

$ kubectl get service
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP                      6m10s
pmm          NodePort    10.102.228.150   <none>        80:30080/TCP,443:30443/TCP   3m5

  • Expose the PMM Server ports on the host; this also opens the PMM UI and the API endpoint in the default browser.
$ minikube service pmm

NOTE:

Without going into too much detail: the PV (kubectl get pv) and PVC (kubectl get pvc) are essentially the storage for PMM data (the /srv directory), and the Service defines how PMM is exposed on the network and how to access it.

Attention: this PMM Server deployment is not supposed to be used in production, but just as a sandbox for testing and playing around, as it always starts with the latest version of PMM and k8s is not yet a supported environment for it.

Configure PMM DBaaS

Now the PMM DBaaS dashboard can be used: a k8s cluster can be added, and databases can be created and configured.

DBaaS Dashboard

NOTE:

To enable the PMM DBaaS feature you need to either pass a special environment variable (ENABLE_DBAAS=1) to the container or enable it in the settings (next screenshot).
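
If you go with the environment variable route, one way to set it on the deployment used in this walkthrough (the deployment name pmm-deployment is taken from the kubectl get deployment output above) is:

$ kubectl set env deployment/pmm-deployment ENABLE_DBAAS=1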

To allow PMM to manage the k8s cluster, it needs to be configured. Check the documentation, but here are the short steps:

  • Set the Public Address to pmm on the Configuration -> Settings -> Advanced Settings page

PMM Advanced settings

  • Get k8s config (kubeconfig) and copy it for registration:
kubectl config view --flatten --minify

  • Register configuration that was copied on DBaaS Kubernetes Cluster dashboard:

DBaaS Register k8s Cluster

 

Let’s get into details on what that all means.

The Public Address is propagated to the pmm-client containers that run as part of the PXC and PSMDB deployments to monitor DB services. The pmm-client containers run pmm-agent, which needs to connect to the PMM Server, and it uses the Public Address for that. The DNS name pmm is set by the Service in the pmm-server-minikube.yaml file for our PMM Server deployment.

So far, PMM DBaaS uses kubeconfig to access the k8s API and manage the PXC and PSMDB operators. The kubeconfig file and k8s cluster information are stored securely in the PMM Server internal DB.

PMM DBaaS couldn’t deploy operators into the k8s cluster for now, but that feature will be implemented very soon. And that is why Operator status on the Kubernetes Cluster dashboard shows hints on how to install them.

What are the operators and why are they needed? This is defined very well in the documentation. Long story short, they are the heart of DBaaS: they deploy and configure DBs inside the k8s cluster.

Operators themselves are complex pieces of software that need to be correctly started and configured to deploy DBs. That is where PMM DBaaS comes in handy: it configures a lot for the end-user and provides a UI to choose what DB needs to be created, configured, or deleted.

Deploy PSMDB with DBaaS

Let’s deploy the PSMDB operator and DBs step by step and check them in detail.

  • Deploy PSMDB operator
curl -sSf -m 30 https://raw.githubusercontent.com/percona/percona-server-mongodb-operator/v1.7.0/deploy/bundle.yaml \
| kubectl apply -f -

  • Here’s how it could be checked that operator was created:
$ kubectl get deployment
NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
percona-server-mongodb-operator   1/1     1            1           46h
pmm-deployment                    1/1     1            1           24h


$ kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
percona-server-mongodb-operator-586b769b44-hr7mg   1/1     Running   2          46h
pmm-deployment-7fcb579576-hwf76                    1/1     Running   1          24h

Now the PMM DBaaS Kubernetes Cluster dashboard shows that the MongoDB operator is installed.

Cluster with PSMDB

PMM API

All REST APIs can be discovered via Swagger; it is exposed on both ports (30080 and 30443 in the case of minikube) and can be accessed by appending /swagger to the PMM Server address. It is recommended to use https (port 30443); for example, the URL could look like this: https://192.168.39.202:30443/swagger.

As DBaaS is a feature under active development, replace /swagger.json with /swagger-dev.json and push the Explore button.

Swagger API

Now all APIs can be seen and even executed.

Let’s try it out. First Authorize and then find /v1/management/DBaaS/Kubernetes/List and push Try it out and Execute. There will be an example of curl as well as response to the REST API POST request. The curl example could be used from the command line as well:

$ curl -kX POST "https://192.168.39.202:30443/v1/management/DBaaS/Kubernetes/List" -H  "accept: application/json" -H  "authorization: Basic YWRtaW46YWRtaW4=" -H  "Content-Type: application/json" -d "{}"
{
  "kubernetes_clusters": [
    {
      "kubernetes_cluster_name": "minikube",
      "operators": {
        "xtradb": {
          "status": "OPERATORS_STATUS_NOT_INSTALLED"
        },
        "psmdb": {
          "status": "OPERATORS_STATUS_OK"
        }
      },
      "status": "KUBERNETES_CLUSTER_STATUS_OK"
    }
  ]
}

PMM Swagger API Example

Create DB and Deep Dive

PMM Server consists of different components, and for the DBaaS feature, here are the main ones:

  • The Grafana UI with DBaaS dashboards talks to pmm-managed through the REST API to show the current state and provides the user interface
  • pmm-managed acts as a REST gateway, holds kubeconfig, and talks to dbaas-controller through gRPC
  • dbaas-controller implements the DBaaS features, talks to k8s, and exposes a gRPC interface for pmm-managed

The Grafana UI is what users see, and now that the operator is installed, the user can create a MongoDB instance. Let’s do this.

  • Go to DBaaS -> DB Cluster page and push Create DB Cluster link
  • Choose your options and push Create Cluster button

Create MongoDB cluster

It has more advanced options to configure resources allocated for the cluster:

Advanced settings for cluster creation

As you can see, the cluster was created and can be manipulated. Now let’s see in detail what happened underneath.

PSMDB Cluster created

When the user pushes the Create Cluster button, the Grafana UI POSTs a /v1/management/DBaaS/PSMDBCluster/Create request to pmm-managed. pmm-managed handles the request and sends it via gRPC to the dbaas-controller together with kubeconfig.

dbaas-controller handles the request, and with knowledge of the operator structure (Custom Resources/CRDs), it prepares a CR with all the needed parameters to create a MongoDB cluster. After filling all the needed structures, dbaas-controller converts the CR to a YAML file and applies it with the kubectl apply -f command. kubectl is pre-configured with a kubeconfig file (passed by pmm-managed from its DB) to talk to the correct cluster; the kubeconfig file is created temporarily and deleted immediately after the request.

The same happens when some parameters change or dbaas-controller gets some parameters from the k8s cluster.

Essentially, the dbaas-controller automates all the stages of filling CRs with correct parameters, checks that everything works correctly, and returns details about the clusters created. The kubectl interface is used for simplicity, but it is subject to change before GA, most probably to the k8s Go API.
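
To make this more concrete, here is a heavily trimmed sketch of the kind of Custom Resource that gets applied for a MongoDB cluster. The field names follow the PSMDB Operator 1.7 CRD; the cluster name and image tag are placeholders, and the manifest generated by dbaas-controller contains many more parameters:

apiVersion: psmdb.percona.com/v1-7-0
kind: PerconaServerMongoDB
metadata:
  name: my-mongodb-cluster           # placeholder name
spec:
  image: percona/percona-server-mongodb:4.4.5-7   # illustrative image tag
  replsets:
  - name: rs0
    size: 3
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 3Gi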

Summary

All together, PMM Server DBaaS provides a seamless experience for deploying DB clusters on top of Kubernetes with a simple and nice UI, without the need to know the operators’ internals. When deploying PXC and PSMDB clusters, it also configures PMM agents and exporters, so all monitoring data is present in PMM Server right away.

PMM PSMDB overview

Go to PMM Dashboard -> MongoDB -> MongoDB Overview to see MongoDB monitoring data, and explore node and service monitoring too, which come pre-configured with the help of the DBaaS feature.

Give it a try, submit feedback, and chat with us, we would be happy to hear from you!

P.S.

Don’t forget to stop and/or delete your minikube cluster if it is not used:

  • Stop the minikube cluster so it does not use resources (it can be started again with minikube start)
$ minikube stop

  • If a cluster is not needed anymore, delete minikube cluster
$ minikube delete

 
