Jan
11
2022
--

Creating a Standby Cluster With the Percona Distribution for PostgreSQL Operator


A customer recently asked if our Percona Distribution for PostgreSQL Operator supports the deployment of a standby cluster, which they need as part of their Disaster Recovery (DR) strategy. The answer is yes – as long as you are making use of an object storage system for backups, such as AWS S3 or GCP Cloud Storage buckets, that can be accessed by the standby cluster. In a nutshell, it works like this:

  • The primary cluster is configured with pgBackRest to take backups and store them alongside archived WAL files in a remote repository;
  • The standby cluster is built from one of these backups and it is kept in sync with the primary cluster by consuming the WAL files that are copied from the remote repository.

Note that the primary node in the standby cluster is not a streaming replica from any of the nodes in the primary cluster and that it relies on archived WAL files to replicate events. For this reason, this approach cannot be used as a High Availability (HA) solution. Even though the primary use of a standby cluster in this context is DR, it can also be employed for migrations.

So, how can we create a standby cluster using the Percona operator? We will show you next. But first, let’s create a primary cluster for our example.

Creating a Primary PostgreSQL Cluster Using the Percona Operator

You will find a detailed procedure on how to deploy a PostgreSQL cluster using the Percona operator in our online documentation. Here we want to highlight the main steps involved, particularly regarding the configuration of object storage, which is a crucial requirement and is best done during the initial deployment of the cluster. In the following example, we will deploy our clusters using the Google Kubernetes Engine (GKE), but you can find similar instructions for other environments in the previous link.

Considering you have a Google account configured as well as the gcloud (from the Google Cloud SDK suite) and kubectl command-line tools installed, authenticate yourself with gcloud auth login, and off we go!

Creating a GKE Cluster and Basic Configuration

The following command will create a default cluster named “cluster-1” and composed of three nodes. We are creating it in the us-central1-a zone using e2-standard-4 VMs but you may choose different options. In fact, you may also need to indicate the project name and other main settings if you do not have your gcloud environment pre-configured with them:

gcloud container clusters create cluster-1 --preemptible --machine-type e2-standard-4 --num-nodes=3 --zone us-central1-a

Once the cluster is created, use your IAM identity to control access to this new cluster:

kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user $(gcloud config get-value core/account)

Finally, create the pgo namespace:

kubectl create namespace pgo

and set the current context to refer to this new namespace:

kubectl config set-context $(kubectl config current-context) --namespace=pgo

Creating a Cloud Storage Bucket

Remember for this setup we need a Google Cloud Storage bucket configured as well as a Service Account created with the necessary privileges/roles to access it. The respective procedures to obtain these vary according to how your environment is configured so we won’t be covering them here. Please refer to the Google Cloud Storage documentation for the exact steps. The bucket we created for the example in this post was named cluster1-backups-and-wals.

Likewise, please refer to the Creating and managing service account keys documentation to learn how to create a Service Account and download the corresponding key in JSON format – we will need to provide it to the operator so our PostgreSQL clusters can access the storage bucket.
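For reference, and assuming the Google Cloud SDK is installed and you are authenticated, the bucket and a Service Account with access to it can be created along these lines (the Service Account name and the role binding below are illustrative – adapt them to your project and security policies):

gsutil mb -l us-central1 gs://cluster1-backups-and-wals
gcloud iam service-accounts create cluster1-pgbackrest
gcloud projects add-iam-policy-binding <your-project-id> \
  --member="serviceAccount:cluster1-pgbackrest@<your-project-id>.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
gcloud iam service-accounts keys create your-service-account-key-file.json \
  --iam-account="cluster1-pgbackrest@<your-project-id>.iam.gserviceaccount.com"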

Creating the Kubernetes Secrets File to Access the Storage Bucket

Create a file named my-gcs-account-secret.yaml with the following structure:

apiVersion: v1
kind: Secret
metadata:
  name: cluster1-backrest-repo-config
type: Opaque
data:
  gcs-key: <VALUE>

replacing the <VALUE> placeholder with the output of the following command, according to the OS you are using:

Linux:

base64 --wrap=0 your-service-account-key-file.json

macOS:

base64 your-service-account-key-file.json

Installing and Deploying the Operator

The most practical way to install our operator is by cloning the Git repository, and then moving inside its directory:

git clone -b v1.1.0 https://github.com/percona/percona-postgresql-operator
cd percona-postgresql-operator

The following command will deploy the operator:

kubectl apply -f deploy/operator.yaml

We have already prepared the secrets file to access the storage bucket so we can apply it now:

kubectl apply -f my-gcs-account-secret.yaml

Now, all that is left is to customize the storages options in the deploy/cr.yaml file to indicate the use of the GCS bucket as follows:

    storages:
      my-gcs:
        type: gcs
        bucket: cluster1-backups-and-wals

We can now deploy the primary PostgreSQL cluster (cluster1):

kubectl apply -f deploy/cr.yaml

Once the cluster has been deployed, you can run the following command to do some housekeeping:

kubectl delete -f deploy/operator.yaml

Creating a Standby PostgreSQL Cluster Using the Percona Operator

After this long preamble, let’s look at what brought you here: how to deploy a standby cluster, which we will refer to as cluster2, that will replicate from the primary cluster.

Copying the Secrets Over

Considering you probably have customized the passwords you use in your primary cluster and that they differ from the default values found in the operator’s git repository, we need to make a copy of the secrets files, adjusted to the standby cluster’s name. The following procedure facilitates this task, saving the secrets files under /tmp/cluster1-cluster2-secrets (you can choose a different target directory):

NOTE: make sure you have the yq tool installed in your system.
mkdir -p /tmp/cluster1-cluster2-secrets/
export primary_cluster_name=cluster1
export standby_cluster_name=cluster2
export secrets="${primary_cluster_name}-users"
kubectl get secret/$secrets -o yaml \
| yq eval 'del(.metadata.creationTimestamp)' - \
| yq eval 'del(.metadata.uid)' - \
| yq eval 'del(.metadata.selfLink)' - \
| yq eval 'del(.metadata.resourceVersion)' - \
| yq eval 'del(.metadata.namespace)' - \
| yq eval 'del(.metadata.annotations."kubectl.kubernetes.io/last-applied-configuration")' - \
| yq eval '.metadata.name = "'"${secrets/$primary_cluster_name/$standby_cluster_name}"'"' - \
| yq eval '.metadata.labels.pg-cluster = "'"${standby_cluster_name}"'"' - \
>/tmp/cluster1-cluster2-secrets/${secrets/$primary_cluster_name/$standby_cluster_name}

Deploying the Standby Cluster: Fast Mode

Since we have already covered the procedure used to create the primary cluster in detail in a previous section, we will be presenting the essential steps to create the standby cluster below and provide additional comments only when necessary.

NOTE: the commands below are issued from inside the percona-postgresql-operator directory hosting the git repository for our operator.

Deploying a New GKE Cluster Named cluster-2

This time we are using the us-west1-b zone:

gcloud container clusters create cluster-2 --preemptible --machine-type e2-standard-4 --num-nodes=3 --zone us-west1-b
kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user $(gcloud config get-value core/account)
kubectl create namespace pgo
kubectl config set-context $(kubectl config current-context) --namespace=pgo
kubectl apply -f deploy/operator.yaml

Apply the Adjusted Kubernetes Secrets:

export standby_cluster_name=cluster2
export secrets="${standby_cluster_name}-users"
kubectl create -f /tmp/cluster1-cluster2-secrets/$secrets

The procedure above does not include the GCS secret file; the key contents remain the same, but the secret name used by the backrest-repo pod needs to be adjusted. Make a copy of that file:

cp my-gcs-account-secret.yaml my-gcs-account-secret-2.yaml

then edit the copy to indicate “cluster2-” instead of “cluster1-”:

name: cluster2-backrest-repo-config

You can apply it now:

kubectl apply -f my-gcs-account-secret-2.yaml

The cr.yaml file of the Standby Cluster

Let’s make a copy of the cr.yaml file we customized for the primary cluster:

cp deploy/cr.yaml deploy/cr-2.yaml

and edit the copy as follows:

1) Change all references (that are not commented) from cluster1 to cluster2 – including current-primary but excluding the bucket reference, which in our example is prefixed with “cluster1-”; the storages section must remain unchanged. (We know it is not very practical to replace so many references; we still need to improve this part of the routine. A rough sed sketch is shown after this list.)

2) Enable the standby option:

standby: true

3) Provide a repoPath that points to the GCS bucket used by the primary cluster (just below the storages section, which should remain the same as in the primary cluster’s cr.yaml file):

repoPath: "/backrestrepo/cluster1-backrest-shared-repo"
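As mentioned in step 1, a rough way to script the bulk rename is shown below (GNU sed assumed; review the resulting file afterward, since the bucket name in the storages section must keep its original cluster1- prefix):

sed -i.bak 's/cluster1/cluster2/g' deploy/cr-2.yaml
# restore the references that must keep the original name, e.g. the bucket
sed -i 's/bucket: cluster2-backups-and-wals/bucket: cluster1-backups-and-wals/' deploy/cr-2.yaml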

And that’s it! All that is left now is to deploy the standby cluster:

kubectl apply -f deploy/cr-2.yaml

With everything working on the standby cluster, do some housekeeping:

kubectl delete -f deploy/operator.yaml

Verifying it all Works as Expected

Remember that the standby cluster is created from a backup and relies on archived WAL files to be kept in sync with the primary cluster. If you make a change in the primary cluster, such as adding a row to a table, that change won't reach the standby cluster until the WAL file it has been recorded to is archived and consumed by the standby cluster.

When checking if all is working with the new setup, you can force the rotation of the WAL file (and subsequent archival of the previous one) in the primary node of the primary cluster to accelerate the sync process by issuing:

psql> SELECT pg_switch_wal();
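For example, a quick end-to-end check could look like the session below (the table name is arbitrary). On the primary cluster:

psql> CREATE TABLE dr_check (id int);
psql> INSERT INTO dr_check VALUES (1);
psql> SELECT pg_switch_wal();

Then, once the standby cluster has consumed the corresponding archived WAL file (the standby is read-only, so only queries will work there):

psql> SELECT * FROM dr_check;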

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

Learn More About Percona Kubernetes Operators

Jan
07
2022
--

Configure wiredTiger cacheSize Inside Percona Distribution for MongoDB Kubernetes Operator


Nowadays we are seeing a lot of customers starting to use our Percona Distribution for MongoDB Kubernetes Operator. The Percona Kubernetes Operators are based on best practices for the configuration of a Percona Server for MongoDB replica set or sharded cluster. A key component in MongoDB is the WiredTiger cache, which defines the cache used by this engine, and we can size it based on our load.

In this blog post, we will see how to define the container memory resources and set the WiredTiger cache for the shard replica set to improve the performance of the sharded cluster.

The Necessity of WT cache

The parameter storage.wiredTiger.engineConfig.cacheSizeGB limits the size of the WiredTiger internal cache. The operating system will use the available free memory for filesystem cache, which allows the compressed MongoDB data files to stay in memory. In addition, the operating system will use any free RAM to buffer file system blocks and file system cache. To accommodate the additional consumers of RAM, you may have to set WiredTiger’s internal cache size properly.

Starting from MongoDB 3.4, the default WiredTiger internal cache size is the larger of either:

50% of (RAM - 1 GB), or 256 MB.

For example, on a system with a total of 4GB of RAM the WiredTiger cache will use 1.5GB of RAM (0.5 * (4 GB – 1 GB) = 1.5 GB). Conversely, a system with a total of 1.25 GB of RAM will allocate 256 MB to the WiredTiger cache because that is more than half of the total RAM minus one gigabyte (0.5 * (1.25 GB – 1 GB) = 128 MB < 256 MB).

WT cacheSize in Kubernetes Operator

The MongoDB WiredTiger cacheSize can be tuned with the parameter storage.wiredTiger.engineConfig.cacheSizeRatio, and its default value is 0.5. As explained above, if the memory limit allocated to the container is too low, then the WT cache is set to 256MB or calculated as per the formula.

Prior to PSMDB operator v1.9.0, cacheSizeRatio could be tuned under the mongod section of the cr.yaml file. This is deprecated from v1.9.0+ and unavailable from v1.12.0+, so you have to use the cacheSizeRatio parameter available under the replsets configuration instead. The main thing to check before changing the cacheSize is that the memory limit allocated in the resources section also covers your cacheSize requirement, i.e., the section below limiting the memory:

     resources:
       limits:
         cpu: "300m"
         memory: "0.5G"
       requests:
         cpu: "300m"
         memory: "0.5G"

 

https://github.com/percona/percona-server-mongodb-operator/blob/main/pkg/psmdb/container.go#L307

From the source code that calculates the mongod.storage.wiredTiger.engineConfig.cacheSizeRatio:

// In normal situations WiredTiger does this default-sizing correctly but under Docker
// containers WiredTiger fails to detect the memory limit of the Docker container. We
// explicitly set the WiredTiger cache size to fix this.
//
// https://docs.mongodb.com/manual/reference/configuration-options/#storage.wiredTiger.engineConfig.cacheSizeGB

func getWiredTigerCacheSizeGB(resourceList corev1.ResourceList, cacheRatio float64, subtract1GB bool) float64 {
    maxMemory := resourceList[corev1.ResourceMemory]
    var size float64
    if subtract1GB {
        size = math.Floor(cacheRatio * float64(maxMemory.Value()-gigaByte))
    } else {
        size = math.Floor(cacheRatio * float64(maxMemory.Value()))
    }
    sizeGB := size / float64(gigaByte)
    if sizeGB < minWiredTigerCacheSizeGB {
        sizeGB = minWiredTigerCacheSizeGB
    }
    return sizeGB
}

 

Changing the cacheSizeRatio

Here, for the test, we deployed the PSMDB operator on GCP. You can refer here for the steps – https://www.percona.com/doc/kubernetes-operator-for-psmongodb/gke.html. With the latest operator v1.11.0, the sharded cluster has been started with a shard and a config server replica set, along with mongos pods.

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
my-cluster-name-cfg-0 2/2 Running 0 4m9s
my-cluster-name-cfg-1 2/2 Running 0 2m55s
my-cluster-name-cfg-2 2/2 Running 1 111s
my-cluster-name-mongos-758f9fb44-d4hnh 1/1 Running 0 99s
my-cluster-name-mongos-758f9fb44-d5wfm 1/1 Running 0 99s
my-cluster-name-mongos-758f9fb44-wmvkx 1/1 Running 0 99s
my-cluster-name-rs0-0 2/2 Running 0 4m7s
my-cluster-name-rs0-1 2/2 Running 0 2m55s
my-cluster-name-rs0-2 2/2 Running 0 117s
percona-server-mongodb-operator-58c459565b-fc6k8 1/1 Running 0 5m45s

Now log in to the shard and check the default memory allocated to the container and to the mongod instance. In the output below, the host has about 15GB of memory available, but the memory limit for this container is only 476MB:

rs0:PRIMARY> db.hostInfo()
{
"system" : {
"currentTime" : ISODate("2021-12-30T07:16:59.441Z"),
"hostname" : "my-cluster-name-rs0-0",
"cpuAddrSize" : 64,
"memSizeMB" : NumberLong(15006),
"memLimitMB" : NumberLong(476),
"numCores" : 4,
"cpuArch" : "x86_64",
"numaEnabled" : false
},
"os" : {
"type" : "Linux",
"name" : "Red Hat Enterprise Linux release 8.4 (Ootpa)",
"version" : "Kernel 5.4.144+"
},
"extra" : {
"versionString" : "Linux version 5.4.144+ (builder@7d732a1aec13) (Chromium OS 12.0_pre408248_p20201125-r7 clang version 12.0.0 (/var/tmp/portage/sys-devel/llvm-12.0_pre408248_p20201125-r7/work/llvm-12.0_pre408248_p20201125/clang f402e682d0ef5598eeffc9a21a691b03e602ff58)) #1 SMP Sat Sep 25 09:56:01 PDT 2021",
"libcVersion" : "2.28",
"kernelVersion" : "5.4.144+",
"cpuFrequencyMHz" : "2000.164",
"cpuFeatures" : "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat md_clear arch_capabilities",
"pageSize" : NumberLong(4096),
"numPages" : 3841723,
"maxOpenFiles" : 1048576,
"physicalCores" : 2,
"mountInfo" : [
..
..

 

The WiredTiger cache size (in MB) allocated in the shard is as follows:

rs0:PRIMARY> db.serverStatus().wiredTiger.cache["maximum bytes configured"]/1024/1024
256

A cache size of 256MB is too low for a real environment, so let's see how to tune both the memory limit and the cacheSize of the WT engine. Use the cacheSizeRatio parameter to set the WT cache ratio (out of 1), and the resources limits/requests to set the memory allocated to the container. To do this, edit the cr.yaml file under the deploy directory of the operator. From PSMDB operator v1.9.0, editing the cacheSizeRatio parameter under the mongod section is deprecated, so for the WT cache ratio use the cacheSizeRatio parameter under the replsets section, and set the memory via the resources section. Below we set 3G for the container and a cache ratio of 0.8 (80%).

deploy/cr.yaml:58

46   configuration: |
47     # operationProfiling:
48     #   mode: slowOp
49     # systemLog:
50     #   verbosity: 1
51     storage:
52       engine: wiredTiger
53       # inMemory:
54       #   engineConfig:
55       #     inMemorySizeRatio: 0.9
56       wiredTiger:
57         engineConfig:
58           cacheSizeRatio: 0.8

 

deploy/cr.yaml:229-232:

226   resources:
227     limits:
228       cpu: "300m"
229       memory: "3G"
230     requests:
231       cpu: "300m"
232       memory: "3G"

 

Apply the new cr.yaml

# kubectl apply -f deploy/cr.yaml
perconaservermongodb.psmdb.percona.com/my-cluster-name configured

The shard pods are re-allocated and you can check the progress as follows:

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
my-cluster-name-cfg-0 2/2 Running 0 36m
my-cluster-name-cfg-1 2/2 Running 0 35m
my-cluster-name-cfg-2 2/2 Running 1 34m
my-cluster-name-mongos-758f9fb44-d4hnh 1/1 Running 0 34m
my-cluster-name-mongos-758f9fb44-d5wfm 1/1 Running 0 34m
my-cluster-name-mongos-758f9fb44-wmvkx 1/1 Running 0 34m
my-cluster-name-rs0-0 0/2 Init:0/1 0 13s
my-cluster-name-rs0-1 2/2 Running 0 60s
my-cluster-name-rs0-2 2/2 Running 0 8m33s
percona-server-mongodb-operator-58c459565b-fc6k8 1/1 Running 0 38m

Now check the new settings of WT cache as follows:

rs0:PRIMARY> db.hostInfo().system
{
"currentTime" : ISODate("2021-12-30T08:37:38.790Z"),
"hostname" : "my-cluster-name-rs0-1",
"cpuAddrSize" : 64,
"memSizeMB" : NumberLong(15006),
"memLimitMB" : NumberLong(2861),
"numCores" : 4,
"cpuArch" : "x86_64",
"numaEnabled" : false
}
rs0:PRIMARY> 
rs0:PRIMARY> 
rs0:PRIMARY> db.serverStatus().wiredTiger.cache["maximum bytes configured"]/1024/1024
1474

Here, the memory calculation for the WT cache is done roughly as follows (the memory limit should be more than 1G, otherwise 256MB is allocated by default):

(Memory limit – 1GB) * cacheSizeRatio

(2861MB – 1024MB) * 0.8 ≈ 1470MB

The small difference from the 1474MB reported above comes from rounding, since the Operator passes --wiredTigerCacheSizeGB with two decimal places.

 

NOTE:

Up to PSMDB operator v1.10.0, the operator applies a change of cacheSizeRatio only if resources.limits.cpu is also set. This was a bug and it was fixed in v1.11.0 – refer to https://jira.percona.com/browse/K8SPSMDB-603. So if you are on an older version, don't be surprised, and make sure resources.limits.cpu is set as well.

https://github.com/percona/percona-server-mongodb-operator/blob/v1.10.0/pkg/psmdb/container.go#L194

if limit, ok := resources.Limits[corev1.ResourceCPU]; ok && !limit.IsZero() {
    args = append(args, fmt.Sprintf(
        "--wiredTigerCacheSizeGB=%.2f",
        getWiredTigerCacheSizeGB(resources.Limits, replset.Storage.WiredTiger.EngineConfig.CacheSizeRatio, true),
    ))
}

From v1.11.0:
https://github.com/percona/percona-server-mongodb-operator/blob/v1.11.0/pkg/psmdb/container.go#L194

if limit, ok := resources.Limits[corev1.ResourceMemory]; ok && !limit.IsZero() {
    args = append(args, fmt.Sprintf(
        "--wiredTigerCacheSizeGB=%.2f",
        getWiredTigerCacheSizeGB(resources.Limits, replset.Storage.WiredTiger.EngineConfig.CacheSizeRatio, true),
    ))
}

 

Conclusion

So, based on the application load, you will need to set the WT cacheSize for better performance. You can use the methods above to tune the cache size for the shard replica set in the PSMDB operator.

Reference Links:

https://www.percona.com/doc/kubernetes-operator-for-psmongodb/operator.html

https://www.percona.com/doc/kubernetes-operator-for-psmongodb/gke.html

https://www.percona.com/doc/kubernetes-operator-for-psmongodb/operator.html#mongod-storage-wiredtiger-engineconfig-cachesizeratio

MongoDB 101: How to Tune Your MongoDB Configuration After Upgrading to More Memory

Dec
21
2021
--

Quick Guide on Azure Blob Storage Support for Percona Distribution for MongoDB Operator


If you have ever used backups with Percona Distribution for MongoDB Operator, you should already know that backed-up data is stored outside the Kubernetes cluster – on Amazon S3 or any S3-compatible storage. Storage types not compatible with the S3 protocol were supported indirectly in the case of an existing S3 wrapper/gateway. A good example of such a solution is running MinIO Gateway on Azure Kubernetes Service to store backups on Azure Blob Storage.

Starting with Operator version 1.11, it is now possible to use Azure Blob Storage for backups directly:

Backups on Azure Blob Storage

The following steps will allow you to configure it.

1. Get Azure Blob Storage Credentials

As with most other S3-compatible storage types, the first thing to do is to obtain credentials the Operator will use to access Azure Blob storage.

If you are new to Azure, the official Azure tutorials on creating a storage account and a blob container will help you configure your storage.

When you have a container to store your backups, getting credentials to access it involves the following steps (an Azure CLI alternative is sketched after the screenshot):

  1. Go to your storage account settings,
  2. Open the “Access keys” section,
  3. Copy and save both the account name and account key as shown in the screenshot below:

Azure credentials
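If you prefer the command line over the portal, the same credentials can be retrieved with the Azure CLI (assuming it is installed and you are logged in; the resource group and account names below are placeholders):

az storage account keys list \
  --resource-group <your-resource-group> \
  --account-name <your-storage-account-name> \
  --output table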

2. Create a Secret with Credentials

The Operator will use a Kubernetes Secrets Object to obtain the needed credentials. Create this Secret with credentials using the following command:

$ kubectl create secret generic azure-secret \
 --from-literal=AZURE_STORAGE_ACCOUNT_NAME=<your-storage-account-name> \
 --from-literal=AZURE_STORAGE_ACCOUNT_KEY=<your-storage-key>

3. Set Up the spec.backup Section in Your deploy/cr.yaml File

As usual, backups are configured via the same-name section in the deploy/cr.yaml configuration file.

Make sure that backups are enabled (the backup.enabled key set to true), and add the following lines to the backup.storages subsection (use the proper name of your container):

azure-blob:
  type: azure
  azure:
    container: <your-container-name>
    prefix: psmdb
    credentialsSecret: azure-secret

If you want to schedule a regular backup, add the following lines to the backup.tasks subsection:

tasks:
  - name: weekly
    enabled: true
    schedule: "0 0 * * 0"
    compressionType: gzip
    storageName: azure-blob

The backup schedule is specified in crontab format (the above example runs backups at 00:00 on Sunday). If you know nothing about cron schedule expressions, you can use this online generator.

The full backup section in our example will look like this:

backup:
  enabled: true
  restartOnFailure: true
  image: percona/percona-server-mongodb-operator:1.10.0-backup
  storages:
    azure-blob:
      type: azure
      azure:
        container: <your container name>
        prefix: psmdb
        credentialsSecret: azure-secret
  tasks:
    - name: weekly
      enabled: true
      schedule: "0 0 * * 0"
      compressionType: gzip
      storageName: azure-blob

You can find more information on backup options in the official backups documentation and the backups section of the Custom Resource options reference.
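Besides scheduled backups, an on-demand backup to the same storage can be requested by creating a PerconaServerMongoDBBackup resource. The snippet below is only a sketch – the field referencing the cluster has changed names between Operator releases (psmdbCluster in older versions, clusterName in newer ones), so copy deploy/backup/backup.yaml from your Operator version instead of relying on it verbatim:

cat <<EOF | kubectl apply -f -
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBBackup
metadata:
  name: backup-azure-manual
spec:
  psmdbCluster: my-cluster-name   # or clusterName:, depending on the Operator version
  storageName: azure-blob
EOF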

Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.

Download Percona Distribution for MongoDB Today!

Dec
21
2021
--

Data at Rest Encryption Support in Percona Distribution for MongoDB Operator


As we all know, security is very important these days and we read about many data leaks. Security has many aspects, but one of the most important is securing data since it is a vital asset to companies. When we speak about data, it can be encrypted at rest (transparent data encryption – TDE, full disk encryption, column/field-level encryption) or in transit (TLS).

What we will concentrate on in this blog post is data at rest encryption (specifically TDE) and how it is currently supported in Percona Distribution for MongoDB Operator, but also what the limitations are and the features coming in some of the next releases.

TDE basically means that any data which is not actively used is encrypted at the storage engine level (WiredTiger in this case), but this does not include logs or data which is replicated.

TDE in Percona Distribution for MongoDB Operator

TDE in Operator is based on options that Percona Server for MongoDB (PSMDB) supports and which were developed to be mostly the same or similar as in MongoDB Enterprise edition. The differences are that PSMDB doesn’t support KMIP or Amazon AWS key management services, but instead offers the ability to store the master key inside HashiCorp Vault.

The Operator currently doesn’t support storing keys in HashiCorp Vault as PSMDB does, and the master key is stored in the Kubernetes secret and mounted locally in database pods, but I will mention this more in the limitations section and future plans.

There are only a few options for data at rest encryption support in the Operator, and the defaults are:

  • security.enableEncryption: true
  • security.encryptionCipherMode: AES256-CBC
  • security.encryptionKeySecret: optional (needs to be a 32-character string encoded in base64)

You can read about the options here, but as you can see, encryption is enabled by default in the Operator, and if you don't specify a custom security.encryptionKeySecret, the Operator will create one for you.
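As an illustration, a custom key could be generated and stored like this – note that the secret name and the encryptionKey key used below are assumptions and must match what security.encryptionKeySecret and your Operator release expect:

# 24 random bytes base64-encode to the required 32-character string
kubectl create secret generic my-cluster-name-mongodb-encryption-key \
  --from-literal=encryptionKey="$(openssl rand -base64 24)"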

Limitations

Dynamically Enabling/Disabling

We cannot simply dynamically change the option to enable or disable encryption. If we try to do that, the Operator will try to restart MongoDB pods with new options and the pod start will fail. The reason is that MongoDB will just start with the new option on the old data directory, and it will not be able to read/write the data.

One of the ways this can be overcome is by creating a logical backup and then restoring it on a cluster that has the desired option enabled. The second option would be to create a second cluster with the desired option, set up cross-cluster replication, and then switch the main cluster to the new one.

Key Rotation

One of the most important things with encryption is the periodic rotation of the keys, but at this moment with the Operator, this is not so easy. This is basically the same issue as above, but if we try to just update the secret and restart the cluster the effect will be the same – MongoDB will not be able to read/write the data.

It can be overcome with the same options as above, but it will be made really easy with the ability to store the keys in the Vault. If you are interested in this functionality you can track the progress in this Jira ticket.

Storing the Key in the Vault

Currently, the Operator supports only storing the master key as a secret which is presented to PSMDB as a local file. This is not a recommended setup for production and is very limiting.

PSMDB has integration for storing the keys in HashiCorp Vault key management which is much more secure and also has the ability to rotate the master key. Basically, how it works is that PSMDB is restarted with the option “rotateMasterKey: true” and then it just generates a new key in the Vault and re-encrypts the specific database encryption keys (whole data is not re-encrypted).

Support for this is definitely one of the features in the roadmap and it will be a huge deal for data at rest encryption support in the Operator so stay tuned for upcoming changes. The request for implementing support for HashiCorp Vault integration can be tracked here.

Conclusion

As you can see, data at rest encryption in our Kubernetes Operator is supported, but currently only at the most basic level. Our Percona Distribution for MySQL Operator already supports integration with HashiCorp Vault, and since we like to keep feature parity between our different operators, this functionality will soon be available in our MongoDB operator as well.

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

Learn More About Percona Kubernetes Operators

Dec
09
2021
--

Testing Percona Distribution for MySQL Operator Locally with Kind


We have a quickstart guide for how to install Percona Distribution for MySQL Operator on minikube. Installing the minimal version works well as described in the guide. After that, we will have one HAProxy and one Percona XtraDB Cluster (PXC) node to work with.

Minikube provides Kubernetes locally. One can use that local k8s to try the more advanced scenarios, such as the one described here.

Following that guide, everything works well until we get to the part of deploying a cluster with deploy/cr.yaml.

Even after that, things seemingly work.

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
cluster1-haproxy-0 0/2 ContainerCreating 0 5s
cluster1-pxc-0 0/3 Init:0/1 0 5s
percona-xtradb-cluster-operator-77bfd8cdc5-rcqsp 1/1 Running 1 62s

That is, until the second PXC pod is created. That pod will be stuck in a pending state forever.

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
cluster1-haproxy-0 1/2 Running 0 93s
cluster1-pxc-0 3/3 Running 0 93s
cluster1-pxc-1 0/3 Pending 0 10s
percona-xtradb-cluster-operator-77bfd8cdc5-rcqsp 1/1 Running 1 2m30s

When checking the cluster1-pxc-1 pod with kubectl describe pod cluster1-pxc-1, the reason becomes clear.

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 66s (x2 over 66s) default-scheduler 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
Warning FailedScheduling 63s default-scheduler 0/1 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't match pod anti-affinity rules.

Anti-affinity rules are specified for different pods in the cluster, which makes sense: normally, one would want the different PXC instances in different failure domains, so we have actual fault tolerance. I was wondering if there is a better way to have a more complicated local k8s setup than editing those rules, and kind can give that – it's an ideal playground for following the second guide. Alternatively, the anti-affinity rules can be relaxed in cr.yaml, which would have been suitable for testing purposes; a snippet follows below.
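For reference, relaxing the rule for such a single-purpose test would mean changing the affinity setting of the pxc (and haproxy) sections in cr.yaml to something like the following – verify the exact key name against the cr.yaml shipped with your Operator version:

    affinity:
      antiAffinityTopologyKey: "none"   # default is "kubernetes.io/hostname"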

In this example, I am using macOS and Docker Desktop for Mac; kind can be installed via Homebrew.
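Installing kind with Homebrew is a one-liner:

$ brew install kind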

$ cat kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker

This way I have one control-plane node and three worker nodes (running kubelet). A redundant control plane is also supported, but it is not needed for this testing. With this, the cluster can be created.

$ kind create cluster --name k8s-playground --config kind-config.yaml
Creating cluster "k8s-playground" ...
 ✓ Ensuring node image (kindest/node:v1.21.1)
 ✓ Preparing nodes
 ✓ Writing configuration
 ✓ Starting control-plane
 ✓ Installing CNI
 ✓ Installing StorageClass
 ✓ Joining worker nodes
Set kubectl context to "kind-k8s-playground"
You can now use your cluster with:
kubectl cluster-info --context kind-k8s-playground

Have a question, bug, or feature request? Let us know! https://kind.sigs.k8s.io/#community

Each node will be a docker container.

$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
6d404954433e kindest/node:v1.21.1 "/usr/local/bin/entr…" About a minute ago Up About a minute k8s-playground-worker2
93a293dfc423 kindest/node:v1.21.1 "/usr/local/bin/entr…" About a minute ago Up About a minute 127.0.0.1:64922->6443/tcp k8s-playground-control-plane
e531e10b0384 kindest/node:v1.21.1 "/usr/local/bin/entr…" About a minute ago Up About a minute k8s-playground-worker
383a89f6d9f8 kindest/node:v1.21.1 "/usr/local/bin/entr…" About a minute ago Up About a minute k8s-playground-worker3

From this point on, kubectl is configured, and we can follow the second guide for the Percona Distribution for MySQL Operator.

After that, we need to wait for a while for the cluster to come up.

$ kubectl apply -f deploy/cr.yaml
perconaxtradbcluster.pxc.percona.com/cluster1 created

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
cluster1-haproxy-0 0/2 ContainerCreating 0 4s
cluster1-pxc-0 0/3 Init:0/1 0 4s
percona-xtradb-cluster-operator-d99c748-d5nq6 1/1 Running 0 21s

After a few minutes, the cluster will be running as expected.

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
cluster1-haproxy-0 2/2 Running 0 5m5s
cluster1-haproxy-1 2/2 Running 0 3m20s
cluster1-haproxy-2 2/2 Running 0 2m55s
cluster1-pxc-0 3/3 Running 0 5m5s
cluster1-pxc-1 3/3 Running 0 3m32s
cluster1-pxc-2 3/3 Running 0 119s
percona-xtradb-cluster-operator-d99c748-d5nq6 1/1 Running 0 5m22s

$ kubectl run -i --rm --tty percona-client --image=percona:8.0 --restart=Never -- mysql -h cluster1-haproxy -uroot -proot_password -e "show global status like 'wsrep_cluster_size'"
mysql: [Warning] Using a password on the command line interface can be insecure.
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
pod "percona-client" deleted

For that last check, I used the default password from secret.yaml. If you changed it, use your new password instead.

Kind works on macOS out of the box like this and is a simple solution. To try Percona software in local playgrounds (on Linux or in a Linux virtual machine), you can also check anydbver, created and maintained by Nickolay Ihalainen.

At the end of the experiments, the kind k8s can be destroyed.

$ kind delete cluster --name k8s-playground
Deleting cluster "k8s-playground" ...

Nov
23
2021
--

Multi-Tenant Kubernetes Cluster with Percona Operators


There are cases where multiple teams, customers, or applications run in the same Kubernetes cluster. Such an environment is called multi-tenant and requires some preparation and management. A multi-tenant Kubernetes deployment allows you to utilize the economy-of-scale model on various levels:

  • Smaller compute footprint – one control plane, dense container deployments
  • Ease of management – one cluster, not hundreds

In this blog post, we are going to review multi-tenancy best practices and recommendations, and see how Percona Kubernetes Operators can be deployed and managed in such Kubernetes clusters.

Multi-Tenancy

Generic

Multi-tenancy usually means a lot of Pods and workloads in a single cluster. You should always remember that there are certain limits when designing your infrastructure. For vanilla Kubernetes, these limits are quite high and hard to reach:

  • 5000 nodes
  • 10 000 namespaces
  • 150 000 pods

Managed Kubernetes services have their own limits that you should keep in mind. For example, GKE allows a maximum of 110 Pods per node on a standard cluster and only 32 on GKE Autopilot nodes.

The older AWS EKS CNI plugin was limiting the number of Pods per node to the number of IP addresses EC2 can have. With the prefix assignment enabled in CNI, you are still going to hit a limit of 110 pods per node.

Namespaces

Kubernetes Namespaces provide a mechanism for isolating groups of resources within a single cluster. A k8s object is either cluster-scoped or namespace-scoped. Objects which are accessible across all the namespaces, like ClusterRole, are cluster-scoped, and those which are accessible only in a single namespace, like Deployments, are namespace-scoped.
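You can list which kinds fall into each scope with kubectl:

$ kubectl api-resources --namespaced=true    # namespace-scoped kinds (Deployments, Pods, ...)
$ kubectl api-resources --namespaced=false   # cluster-scoped kinds (ClusterRole, Node, ...)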

kubernetes namespaces

Deploying a database with Percona Operators creates pods that are namespace scoped. This provides interesting opportunities to run workloads on different namespaces for different teams, projects, and potentially, customers too. 

Example: Percona Distribution for MongoDB Operator and Percona Server for MongoDB can be run on two different namespaces by adding namespace metadata fields. Snippets are as follows:

# Team 1 DB running in team1-db namespace
apiVersion: psmdb.percona.com/v1-11-0
kind: PerconaServerMongoDB
metadata:
 name: team1-server
 namespace: team1-db

# Team 1 deployment running in team1-db namespace
apiVersion: apps/v1
kind: Deployment
metadata:
 name: percona-server-mongodb-operator-team1
 namespace: team1-db


# Team 2 DB running in team2-db namespace
apiVersion: psmdb.percona.com/v1-11-0
kind: PerconaServerMongoDB
metadata:
 name: team2-server
 namespace: team2-db

# Team 2 deployment running in team2-db namespace
apiVersion: apps/v1
kind: Deployment
metadata:
 name: percona-server-mongodb-operator-team2
 namespace: team2-db

Suggestions:

  1. Avoid using the standard namespaces like kube-system or default.

  2. It’s always better to run independent workloads on different namespaces unless there is a specific requirement to do it in a shared namespace.

Namespaces can be used per team, per application environment, or any other logical structure that fits the use case.

Resources

The biggest problem in any multi-tenant environment is this – how can we ensure that a single bad apple doesn’t spoil the whole bunch of apples?

ResourceQuotas

Thanks to ResourceQuotas, we can restrict the resource utilization of namespaces. ResourceQuotas also allow you to restrict the number of k8s objects which can be created in a namespace.

Example of the YAML manifest with resource quotas:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team1-quota         
  namespace: team1-db    # Namespace where operator is deployed
spec:
  hard:
    requests.cpu: "10"     # Cumulative CPU requests of all k8s objects in the namespace cannot exceed 10vcpu
    limits.cpu: "20"       # Cumulative CPU limits of all k8s objects in the namespace cannot exceed 20 vcpu
    requests.memory: 10Gi  # Cumulative memory requests of all k8s objects in the namespace cannot exceed 10Gi
    limits.memory: 20Gi    # Cumulative memory limits of all k8s objects in the namespace cannot exceed 20Gi
    requests.ephemeral-storage: 100Gi  # Cumulative ephemeral storage request of all k8s objects in the namespace cannot exceed 100Gi
    limits.ephemeral-storage: 200Gi    # Cumulative ephemeral storage limits of all k8s objects in the namespace cannot exceed 200Gi
    requests.storage: 300Gi            # Cumulative storage requests of all PVC in the namespace cannot exceed 300Gi
    persistentvolumeclaims: 5          # Maximum number of PVC in the namespace is 5
    count/statefulsets.apps: 2         # Maximum number of statefulsets in the namespace is 2
    # count/psmdb: 2                   # Maximum number of PSMDB objects in the namespace is 2, replace the name with proper Custom Resource

Please refer to the Resource Quotas documentation and apply quotas that are required for your use case.
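After applying the quota (assuming the manifest above is saved as team1-quota.yaml), the current usage against the limits can be checked at any time:

$ kubectl apply -f team1-quota.yaml
$ kubectl describe resourcequota team1-quota --namespace team1-db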

If resource quotas are applied to a namespace, it is required to set containers’ requests and limits, otherwise, you are going to have an error similar to the following:

Error creating: pods "my-cluster-name-rs0-0" is forbidden: failed quota: my-cpu-memory-quota: must specify limits.cpu,requests.cpu

All Percona Operators provide the capability to fine-tune the requests and limits. The following example sets CPU and memory requests for Percona XtraDB Cluster containers:

spec:
  pxc:
    resources:
      requests:
        memory: 4G
        cpu: 2

LimitRange

With ResourceQuotas we can control the cumulative resources in the namespaces, but if we want to enforce constraints on individual Kubernetes objects, LimitRange is a useful option.

For example, if Teams 1, 2, and 3 are each provided a namespace to run workloads, ResourceQuota will ensure that none of the teams can exceed the quotas allocated and over-utilize the cluster… but what if a badly configured workload (say an operator run from Team 1 with a higher priority class) is utilizing all the resources allocated to the team? LimitRange can be used to enforce constraints on resources like compute, memory, ephemeral storage, and storage with PVC. The example below highlights some of the possibilities.

apiVersion: v1
kind: LimitRange
metadata:
  name: lr-team1
  namespace: team1-db
spec:
  limits:
  - type: Pod                      
    max:                            # Maximum resource limit of all containers combined. Consider setting default limits
      ephemeral-storage: 100Gi      # Maximum ephemeral storage cannot exceed 100GB
      cpu: "800m"                   # Maximum CPU limits of the Pod is 800mVCPU
      memory: 4Gi                   # Maximum memory limits of the Pod is 4 GB
    min:                            # Minimum resource request of all containers combined. Consider setting default requests
      ephemeral-storage: 50Gi       # Minimum ephemeral storage should be 50GB
      cpu: "200m"                   # Minimum CPU request is  200mVCPU
      memory: 2Gi                   # Minimum memory request is 2 GB
  - type: PersistentVolumeClaim
    max:
      storage: 2Gi                  # Maximum PVC storage limit
    min:
      storage: 1Gi                  # Minimum PVC storage request

Suggestions:

  1. When it’s feasible, apply ResourceQuotas and LimitRanges to the namespaces where the Percona operator is running. This ensures that tenants are not overutilizing the cluster.

  2. Set alerts to monitor objects and usage of resources in namespaces. Automation of ResourceQuotas changes may also be useful in some scenarios.

  3. It is advisable to use a buffer on maximum expected utilization before setting the ResourceQuotas.

  4. Set LimitRanges to ensure workloads are not overutilizing resources in individual namespaces.

Roles and Security

Kubernetes provides several modes to authorize an API request. Role-Based Access Control (RBAC) is a popular way of authorization. There are four important objects that provide access:

  • ClusterRole – represents a set of permissions across the cluster (cluster scope)
  • Role – represents a set of permissions within a namespace (namespace scope)
  • ClusterRoleBinding – grants permissions to subjects across the cluster (cluster scope)
  • RoleBinding – grants permissions to subjects within a namespace (namespace scope)

Subjects in the RoleBinding/ClusterRoleBinding can be users, groups, or service accounts. Every pod running in the cluster will have an identity and a service account attached (the “default” service account in the same namespace will be attached if not explicitly specified). Permissions granted to the service account with RoleBinding/ClusterRoleBinding dictate the access that pods will have.

Going by the policy of least privilege, it's always advisable to use Roles with the least set of permissions and bind them to a service account with a RoleBinding. This service account can be used to run the operator or custom resource to ensure proper access and also restrict the blast radius.

Avoid granting cluster-level access unless there is a strong use case to do it.

Example: RBAC in the MongoDB Operator uses a Role and a RoleBinding, restricting access to a single namespace for the service account. The same service account is used for both the CustomResource and the Operator.
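As a rough illustration of the least-privilege idea (the resource and verb lists here are arbitrary – the Operator's deploy/rbac.yaml defines the real set it needs), a namespace-scoped Role and its binding to a service account can be created imperatively:

$ kubectl create role db-operator-role --verb=get,list,watch,create,patch,delete \
    --resource=pods,services,statefulsets --namespace team1-db
$ kubectl create rolebinding db-operator-rb --role=db-operator-role \
    --serviceaccount=team1-db:percona-server-mongodb-operator --namespace team1-db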

Network Policies

Network isolation provides additional security to applications and customers in a multi-tenant environment. Network policies are Kubernetes resources that allow you to control the traffic between Pods, CIDR blocks, and network endpoints, but the most common approach is to control the traffic between namespaces:

kubernetes network policies

Most Container Network Interface (CNI) plugins support the implementation of network policies; however, if they don't and a NetworkPolicy is created, the resource is silently ignored. For example, AWS CNI does not support network policies, but AWS EKS can run Calico CNI, which does.

It is good to follow the least-privilege approach here as well, whereby traffic is denied by default and access is granted granularly:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: app1-db
spec:
  podSelector: {}
  policyTypes:
  - Ingress

Allow traffic from Pods in namespace app1 to namespace app1-db:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-app1
  namespace: app1-db
spec:
  podSelector: {}
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: app1
  policyTypes:
  - Ingress
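Note that namespaceSelector matches labels on the Namespace object, not its name, so the app1 namespace must actually carry that label, for example:

$ kubectl label namespace app1 name=app1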

Policy Enforcement

In a multi-tenant environment, policy enforcement plays a key role. Policy enforcement ensures that k8s objects pass the required quality gates set by administrators/teams. Some examples of policy enforcement could be:

  1. All the workloads have proper labels 
  2. Proper network policies are set for DB
  3. Unsafe configurations are not allowed (Example)
  4. Backups are always enabled (Example)

The K8s ecosystem offers a wide range of options to achieve this. Some of them are listed below:

  1. Open Policy Agent (OPA) is a CNCF graduated project which gives a high-level declarative language to author and enforce policies across k8s objects. (Examples from Google and the OPA repo can be helpful)
  2. Mutating Webhooks can be used to modify API calls before they reach the API server. This can be used to set required properties for k8s objects. (Example: a mutating webhook to add a NetworkPolicy for Pods created in production namespaces)

  3. Validating Webhooks can be used to check if a k8s API request follows the required policy; any request which doesn't follow the policy will be rejected. (Example: a validating webhook to ensure huge pages of 1GB are not used in the pod)

Cluster-Wide

Percona Distribution for MySQL Operator and Percona Distribution for PostgreSQL Operator both support cluster-wide mode, which allows a single Operator to deploy and manage databases across multiple namespaces (support for cluster-wide mode in Percona Operator for MongoDB is on the roadmap). It is also possible to have an Operator per namespace:

Operator per namespace

For example, a single deployment of Percona Distribution for MySQL Operator can monitor multiple namespaces in cluster-wide mode. The user can specify them in the WATCH_NAMESPACE environment variable in the cw-bundle.yaml file:

    spec:
      containers:
      - command:
        - percona-xtradb-cluster-operator
        env:
        - name: WATCH_NAMESPACE
          value: "namespace-a, namespace-b"

Which approach to pick in a multi-tenant environment depends on the amount of freedom you want to give to the tenants. Usually, when the tenants are highly trusted (for instance, internal teams), it is fine to choose the namespace-scoped deployment, where each team can deploy and manage the Operator themselves.

Conclusion

It is important to remember that Kubernetes is not a multi-tenant system out of the box. Various levels of isolation were described in this blog post that would help you to run your applications and databases securely and ensure operational stability. 

We encourage you to try out our Operators.

CONTRIBUTING.md in every repository is there for those of you who want to contribute your ideas, code, and docs.

For general questions please raise the topic in the community forum.

Oct
18
2021
--

Cloud-Native Through the Prism of Percona: Episode 1


The cloud-native landscape matures every day, and great new tools and products continue to appear. We are starting a series of blog posts that will focus on new tools in the container and cloud-native world and provide a holistic view through the prism of Percona products.

In this blog:

  • VMware Tanzu Community edition
  • Data on Kubernetes survey
  • Azure credits for open source projects
  • Percona Distribution for PostgreSQL Operator is GA
  • kube-fledged
  • kubescape
  • m3o – new generation public cloud

VMware Tanzu Community Edition

I personally like this move by VMware to open source Tanzu, the set of products to run and manage Kubernetes clusters and applications. Every time I deploy Amazon EKS I feel like I've been punished for something. With VMware Tanzu, deployment of the cluster on Amazon (not EKS) is a smooth experience. It has its own quirks, but it is still much, much better.

Tanzu Community Edition is not only about AWS, but also other public clouds and even local environments with Docker.

I also was able to successfully deploy Percona Operators on the Tanzu provisioned cluster. Keep in mind that you need a storage class to be created to run stateful workloads. It can be easily done with Tanzu’s packaging system.

Data on Kubernetes Survey

The Data on Kubernetes (DoK) community does a great job in promoting and evangelizing stateful workloads on Kubernetes. I strongly recommend you check out the DoK 2021 report that was released this week. Key takeaways:

  • Kubernetes is on the rise (nothing new). Half of the respondents run 50% or more of production workloads in k8s.
  • K8S is ready to run stateful workloads – 90% think so, and 70% already run data on k8s.
  • The key benefits of running stateful applications on Kubernetes:
    • Consistency
    • Standardizing
    • Simplify management
    • Enable developer self-service
  • Operators help with:
    • Management
    • Scalability
    • Improve app lifecycle mgmt

There are lots of other interesting data points, I encourage you to go through them.

Azure Credits for Open Source Projects

Percona’s motto is “Keeping Open Source Open”, which is why an announcement from Microsoft to issue Azure credits for open source projects caught our attention. This is a good move from Microsoft helping the open source community to certify products on Azure without spending a buck.

Percona Distribution for PostgreSQL Operator is GA

I cannot miss the opportunity to share with you that Percona's PostgreSQL Operator has reached the General Availability stage. It was a long journey for us, and we were constantly focused on improving the quality of our Operator through the introduction of rigorous end-to-end testing. Please read more about this release on the PostgreSQL news mailing list. I also encourage you to look into our GitHub repository and try out the Operator by following these installation instructions.

kube-fledged

Back in the days of my Operations career, I was looking for an easy way to have container images pre-pulled on my Kubernetes nodes. kube-fledged does exactly this. The most common use cases are applications that require rapid start-up or some batch-processing which is fired randomly. If we talk about Percona Operators, then kube-fledged is useful if you scale your databases frequently and don’t want to waste valuable seconds on pulling the image. I have tried it out for Percona Distribution for PostgreSQL Operator and it worked like a charm.

kube-fledged is an operator, and it controls which images to pull to the nodes with an ImageCache custom resource. I have prepared an ImageCache manifest for the Percona PostgreSQL Operator as an example – please find it here. It instructs kube-fledged to pull the images that we use in a PostgreSQL cluster deployment on all nodes.

kubescape

In every container-related survey, we see security as one of the top concerns. kubescape is a neat tool to test if Kubernetes and apps are deployed securely, as defined in the National Security Agency (NSA) and Cybersecurity and Infrastructure Security Agency (CISA) Hardening Guidance.

It provides both details and a summary of failures. Here, for example, is the failure of the Resource policies control for a default Percona MongoDB Operator deployment:

[control: Resource policies] failed
Description: CPU and memory resources should have a limit set for every container to prevent resource exhaustion. This control identifies all the Pods without resource limit definition.
   Namespace default
      Deployment - percona-server-mongodb-operator
      StatefulSet - my-cluster-name-cfg
      StatefulSet - my-cluster-name-rs0
Summary - Passed:3   Excluded:0   Failed:3   Total:6
Remediation: Define LimitRange and ResourceQuota policies to limit resource usage for namespaces or nodes.

It might be a good idea for developers to add kubescape to the CI/CD pipeline to get additional automated security policy checks.

M3O – New Generation Public Cloud

“M3O is an open source AWS alternative built for the next generation of developers. Consume public APIs as simpler programmable building blocks for a 10x better developer experience.” – not my words, but from their website, m3o.com. In a nutshell, it is a set of APIs that greatly simplify development. In m3o’s GitHub repo there is an example of how to build a Reddit clone utilizing these APIs only. You can explore the available APIs here. As an example, I used the URL shortener API on the link to this blog post:

$ curl "https://api.m3o.com/v1/url/Shorten" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $MICRO_API_TOKEN" \
-d '{
"destinationURL": "https://www.percona.com/blog/cloud-native-series-1"
}'

{"shortURL":"https://m3o.one/u/8FlJbxSfp"}

Looks neat! For now, most of the APIs are free to use, but I assume this is going to change soon once the project gets more traction and grows its user base.

It is also important to note that this is an open source project, meaning that anyone can deploy their own M3O platform on Kubernetes (yes, k8s again) and have these APIs exposed privately and for free, not as a SaaS offering. See the m3o/platform repo for more details and a Pulumi example to deploy it.

Complete the 2021 Percona Open Source Data Management Software Survey

Have Your Say!

Oct
13
2021
--

Migrating MongoDB to Kubernetes


This blog post is the last in the series of articles on migrating databases to Kubernetes with Percona Operators. Two previous posts can be found here:

As you might have guessed already, this time we are going to cover the migration of MongoDB to Kubernetes. In the 1.10.0 release of Percona Distribution for MongoDB Operator, we introduced a new feature (in tech preview) that enables users to execute such migrations through regular MongoDB replication capabilities. We have already shown how it can be used to provide cross-regional disaster recovery for MongoDB; we encourage you to read that post.

The Goal

There are two ways to migrate the database:

  1. Take the backup and restore it.
    – This option is the simplest one, but unfortunately comes with downtime. The bigger the database, the longer the recovery time is.
  2. Replicate the data to the new site and switch the application once replicas are in sync.
    – This allows the user to perform the migration and switch the application with either zero or little downtime.

This blog post is a walkthrough on how to migrate a MongoDB replica set to Kubernetes with replication capabilities.

MongoDB replica set to Kubernetes

  1. We have a MongoDB cluster somewhere (the Source). It can be on-prem or some virtual machine. For demo purposes, I'm going to use a standalone replica set node. The migration procedure for a replica set with multiple nodes or a sharded cluster is almost identical.
  2. We have a Kubernetes cluster with Percona Operator (the Target). The operator deploys 3 standalone MongoDB nodes in unmanaged mode (we will talk about it below).
  3. Each node is exposed so that the nodes on the Source can reach them.
  4. We are going to replicate the data to Target nodes by adding them into the replica set.

As always, all scripts and configuration files for this blog post are publicly available in this GitHub repository.

Prerequisites

  • MongoDB cluster – either on-prem or VM. It is important to be able to configure mongod to some extent and add external nodes to the replica set.
  • Kubernetes cluster for the Target.
  • kubectl to deploy and manage the Operator and database on Kubernetes.

Prepare the Source

This section explains what preparations must be made on the Source to set up the replication.

Expose

All nodes in the replica set must form a mesh and be able to reach each other. The communication between the nodes can go through the public internet or some VPN. For demonstration purposes, we are going to expose the Source to the public internet by editing mongod.conf:

net:
  bindIp: <PUBLIC IP>

If you have multiple replica sets, you need to expose all nodes of each of them, including the config servers.

TLS

We take security seriously at Percona, and this is why by default our Operator deploys MongoDB clusters with encryption enabled. I have prepared a script that generates self-signed certificates and keys with the openssl tool. If you already have a Certificate Authority (CA) in use in your organization, generate the certificates and have them signed by your CA.

The list of alternative names can be found either in this document or in this openssl configuration file. Note DNS.20 entry:

DNS.20      = *.mongo.spron.in

I’m going to use this wildcard entry to set up the replication between the nodes. The script also generates an ssl-secret.yaml file, which we are going to use on the Target side.
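
This is not the exact script, but a hedged sketch of how such certificates could be generated with openssl, assuming an openssl.cnf containing the alternative names (including the DNS.20 wildcard) in a v3_req section; file names and validity periods are illustrative:

# Create a local CA
openssl genrsa -out ca.key 4096
openssl req -x509 -new -nodes -key ca.key -days 365 -out ca.pem -subj "/CN=mongo-ca"

# Create a server key and CSR carrying the wildcard SAN from openssl.cnf
openssl genrsa -out mongod.key 4096
openssl req -new -key mongod.key -out mongod.csr -subj "/CN=*.mongo.spron.in" -config openssl.cnf

# Sign the CSR with the CA, keeping the subjectAltName extensions
openssl x509 -req -in mongod.csr -CA ca.pem -CAkey ca.key -CAcreateserial \
  -out mongod.crt -days 365 -extensions v3_req -extfile openssl.cnf

# MongoDB expects the certificate and its key concatenated in a single PEM file
cat mongod.crt mongod.key > mongod.pem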

You need to upload the CA certificate and the server certificate with its private key to every Source replica set node and then reference them in mongod.conf:

# network interfaces
net:
  ...
  tls:
    mode: preferTLS
    CAFile: /path/to/ca.pem
    certificateKeyFile: /path/to/mongod.pem

security:
  clusterAuthMode: x509
  authorization: enabled

Note that I also set clusterAuthMode to x509, which enforces x509 authentication. Test it carefully in a non-production environment first, as it might break your existing replication.

Create System Users

Our Operator needs system users to manage the cluster and perform health checks. Usernames and passwords for the system users should be the same on the Source and the Target. This script generates the user-secret.yaml to use on the Target and the mongo shell code to add the users on the Source (it is an example, do not use it in production).

Connect to the primary node on the Source and execute mongo shell commands generated by the script.
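
As a hedged illustration of what those commands look like (the usernames, passwords, and roles below are placeholders, not the script's actual output; the Operator documentation lists the full set of required system users):

$ mongo "mongodb://<source-primary-host>:27017/admin" <<'EOF'
// Placeholders only: replace user names, passwords, and roles with the generated values
db.createUser({ user: "clusterAdmin", pwd: "clusterAdminPassword", roles: [ { role: "clusterAdmin", db: "admin" } ] });
db.createUser({ user: "clusterMonitor", pwd: "clusterMonitorPassword", roles: [ { role: "clusterMonitor", db: "admin" } ] });
EOF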

Prepare the Target

Apply Users and TLS secrets

System users’ credentials and TLS certificates must be the same on both sides. The scripts we used above generate Secret object manifests to use on the Target. Apply them:

$ kubectl apply -f ssl-secret.yaml
$ kubectl apply -f user-secret.yaml

Deploy the Operator and MongoDB Nodes

Please follow one of the installation guides to deploy the Operator. Usually, it is a one-step operation through kubectl:

$ kubectl apply -f operator.yaml

MongoDB nodes are deployed with a custom resource manifest, cr.yaml. The following configuration items in it are important:

spec:
  unmanaged: true

This flag instructs the Operator to deploy the nodes in unmanaged mode, meaning they are not configured to form a cluster. The Operator also does not generate TLS certificates or system users in this mode.

spec:
…
  updateStrategy: Never

Disable the Smart Update feature as the cluster is unmanaged.

spec:
…
  secrets:
    users: my-new-cluster-secrets
    ssl: my-custom-ssl
    sslInternal: my-custom-ssl-internal

This section points to the Secret objects that we created in the previous step.

spec:
…
  replsets:
  - name: rs0
    size: 3
    expose:
      enabled: true
      exposeType: LoadBalancer

Remember that the nodes need to be exposed and reachable. To achieve this, we create a Service per Pod. In our case it is a LoadBalancer object, but it can be any other Service type.

spec:
...
  backup:
    enabled: false

If the cluster and nodes are unmanaged, the Operator should not be taking backups. 

Deploy unmanaged nodes with the following command:

$ kubectl apply -f cr.yaml

Once the nodes are up and running, check the Services as well. We will need the IP addresses of the new replicas to add them to the replica set on the Source later.

$ kubectl get services
NAME                    TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)           AGE
…
my-new-cluster-rs0-0    LoadBalancer   10.3.252.134   35.223.104.224   27017:31332/TCP   2m11s
my-new-cluster-rs0-1    LoadBalancer   10.3.244.166   34.134.210.223   27017:32647/TCP   81s
my-new-cluster-rs0-2    LoadBalancer   10.3.248.119   34.135.233.58    27017:32534/TCP   45s

Configure Domains

X509 authentication is strict and requires that the certificate’s common name or alternative name match the node’s domain name. As you remember, we had the wildcard *.mongo.spron.in included in our certificate. It can be any domain that you use, but make sure a certificate is issued for this domain.

I’m going to create A records that point to the public IP addresses of the MongoDB nodes:

k8s-1.mongo.spron.in -> 35.223.104.224
k8s-2.mongo.spron.in -> 34.134.210.223
k8s-3.mongo.spron.in -> 34.135.233.58
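
If the zone happens to be hosted in Google Cloud DNS, a hedged sketch of creating these records could look like the following (the zone name spron-in is an assumption; any DNS provider works as long as the names resolve to the LoadBalancer IPs):

gcloud dns record-sets transaction start --zone=spron-in
gcloud dns record-sets transaction add 35.223.104.224 --name=k8s-1.mongo.spron.in. --ttl=300 --type=A --zone=spron-in
gcloud dns record-sets transaction add 34.134.210.223 --name=k8s-2.mongo.spron.in. --ttl=300 --type=A --zone=spron-in
gcloud dns record-sets transaction add 34.135.233.58 --name=k8s-3.mongo.spron.in. --ttl=300 --type=A --zone=spron-in
gcloud dns record-sets transaction execute --zone=spron-in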

Replicate the Data to the Target

It is time to add our nodes in the Kubernetes cluster to the replica set. Login into the mongo shell on the Source and execute the following:

rs.add({ host: "k8s-1.mongo.spron.in", priority: 0, votes: 0} )
rs.add({ host: "k8s-2.mongo.spron.in", priority: 0, votes: 0} )
rs.add({ host: "k8s-3.mongo.spron.in", priority: 0, votes: 0} )

If everything is done correctly, these nodes will be added as secondaries. You can check the status with the rs.status() command.

Cutover

Check that the newly added nodes are synchronized. The more data you have, the longer the synchronization process will take. To understand whether the nodes are synchronized, compare the values of optime and optimeDate of the Primary node with the values for the Secondary nodes in the rs.status() output:

{
        "_id" : 0,
        "name" : "147.182.213.59:27017",
        "stateStr" : "PRIMARY",
...
        "optime" : {
                "ts" : Timestamp(1633697030, 1),
                "t" : NumberLong(2)
        },
        "optimeDate" : ISODate("2021-10-08T12:43:50Z"),
...
},
{
        "_id" : 1,
        "name" : "k8s-1.mongo.spron.in:27017",
        "stateStr" : "SECONDARY",
...
        "optime" : {
                "ts" : Timestamp(1633697030, 1),
                "t" : NumberLong(2)
        },
        "optimeDurable" : {
                "ts" : Timestamp(1633697030, 1),
                "t" : NumberLong(2)
        },
        "optimeDate" : ISODate("2021-10-08T12:43:50Z"),
...
},

When nodes are synchronized, we are ready to perform the cutover. Please ensure that your application is configured properly to minimize downtime during the cutover.
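
One hedged way to prepare for this is to list both the Source and the Target nodes in the application's connection string, so the driver follows the primary wherever it moves (the hosts below are the ones used in this walkthrough; the user, password placeholder, replica set name, and TLS options are assumptions that depend on your setup):

mongo "mongodb://<app-user>:<password>@147.182.213.59:27017,k8s-1.mongo.spron.in:27017,k8s-2.mongo.spron.in:27017,k8s-3.mongo.spron.in:27017/admin?replicaSet=rs0&tls=true&tlsCAFile=/path/to/ca.pem"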

The cutover is going to have two steps:

  1. One of the nodes on the Target becomes the primary.
  2. Operator starts managing the cluster and nodes on the Source are no longer present in the replica set.

Switch the Primary

Connect with the mongo shell to the primary on the Source side and make one of the nodes on the Target the new primary. This can be done by changing the replica set configuration:

cfg = rs.config()
cfg.members[1].priority = 2
cfg.members[1].votes = 1
rs.reconfig(cfg)

We enable voting and set the priority to two on one of the nodes in the Kubernetes cluster. The member id can be different for you, so please look carefully at the output of the rs.config() command.

Start Managing the Cluster

Once the primary is running in Kubernetes, we are going to tell the Operator to start managing the cluster. Change spec.unmanaged to false in the Custom Resource with the patch command:

$ kubectl patch psmdb my-cluster-name --type=merge -p '{"spec":{"unmanaged": false}}'

You can also do this by changing cr.yaml and applying it. That is it: you now have a cluster in Kubernetes that is managed by the Operator.

Conclusion

You truly start to appreciate Operators once you get used to them. When I was writing this blog post, I found it extremely annoying to deploy and configure a single MongoDB node on a Linux box, and I don't even want to think about a full cluster. Operators abstract Kubernetes primitives and database configuration and provide you with a fully operational database service instead of a bunch of nodes. Migrating MongoDB to Kubernetes is a challenging task, but it is much simpler with the Operator. And once you are on Kubernetes, the Operator takes care of all day-2 operations as well.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions, please raise the topic in the community forum.

Are you a developer looking to contribute? Please read our CONTRIBUTING.md and send us a Pull Request.

Percona Distribution for MongoDB Operator

The Percona Distribution for MongoDB Operator simplifies running Percona Server for MongoDB on Kubernetes and provides automation for day-1 and day-2 operations. It’s based on the Kubernetes API and enables highly available environments. Regardless of where it is used, the Operator creates a member that is identical to other members created with the same Operator. This provides an assured level of stability to easily build test environments or deploy a repeatable, consistent database environment that meets Percona expert-recommended best practices.


Oct
08
2021
--

Disaster Recovery for MongoDB on Kubernetes

As per the glossary, Disaster Recovery (DR) protocols are an organization’s method of regaining access and functionality to its IT infrastructure in events like a natural disaster, cyber attack, or even business disruptions related to the COVID-19 pandemic. When we talk about data, storing backups on remote servers is enough to pass DR compliance checks for some companies. But for others, Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) are extremely tight and require more than just a backup/restore procedure.

In this blog post, we are going to show you how to set up MongoDB on two distant Kubernetes clusters with Percona Distribution for MongoDB Operator to meet the toughest DR requirements.

What to Expect

Here is what we are going to do:

  1. Set up two Kubernetes clusters
  2. Deploy Percona Distribution for MongoDB Operator on both of them. The Disaster Recovery site will run a MongoDB cluster in unmanaged mode.
  3. Simulate a failure and perform a failover to the DR site

In the 1.10.0 version of the Operator, we have added the Technology Preview of the new feature which enables users to deploy unmanaged MongoDB nodes and connect them to existing Replica Sets.


Set it All Up

We are not going to cover the configuration of the Kubernetes clusters, but in our tests, we relied on two Google Kubernetes Engine (GKE) clusters deployed in different regions. Read more about GKE here.

Prepare Main Site

We have shared all the resources for this blog post in this GitHub repo. As a first step, we are going to deploy the Operator on the Main site:

$ kubectl apply -f bundle.yaml

Deploy the managed MongoDB cluster with cr-main.yaml:

$ kubectl apply -f cr-main.yaml

It is important to understand that we will need to expose the ReplicaSet nodes, including the Config Servers, through dedicated Services. This is required to ensure that ReplicaSet nodes on Main and DR can reach each other, forming a full mesh between the two sites.


To get there, cr-main.yaml has the following changes:

spec:
  replsets:
  - name: rs0
    expose:
      enabled: true
      exposeType: LoadBalancer
  sharding:
    configsvrReplSet:
      expose:
        enabled: true
        exposeType: LoadBalancer

We are using the LoadBalancer Kubernetes Service object as it is just simpler for us, but there are other options: ClusterIP or NodePort. It is also possible to utilize third-party tools like Submariner to implement a private connection.

If you have an already running MongoDB cluster in Kubernetes, you can expose the ReplicaSets without downtime by changing these variables.

Prepare Disaster Recovery Site

The configuration of the Disaster Recovery site could be broken down into the following steps:

  1. Copy the Secrets from the Main cluster.
    1. system users secrets
    2. SSL keys – both used for external connections and internal replication traffic
  2. Tune Custom Resource:
    1. run nodes in unmanaged mode – Operator does not control replicaset configuration and secrets generation
    2. expose ReplicaSets (the same way we do it on the Main cluster)
    3. disable backups – backups can be only taken on the cluster managed by the Operator

Copy the Secrets

System users’ credentials are stored by default in the my-cluster-name-secrets Secret object and defined in spec.secrets.users. Apply this Secret in the DR cluster with kubectl apply -f yaml-with-secrets. If you don’t have it in your source code repository, or if you rely on the Operator to generate it, you can get the Secret from Kubernetes itself, remove the unnecessary metadata, and apply it.

On the Main site, execute:

$ kubectl get secret my-cluster-name-secrets -o yaml > my-cluster-secrets.yaml

Now remove the following lines from metadata:

annotations
creationTimestamp
resourceVersion
selfLink
uid

Save the file and apply it to the DR cluster.
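
If you prefer a one-liner, a hedged alternative (assuming mikefarah/yq v4 is installed; the deleted fields match the manual list above) is:

$ kubectl get secret my-cluster-name-secrets -o yaml \
  | yq eval 'del(.metadata.annotations) | del(.metadata.creationTimestamp) | del(.metadata.resourceVersion) | del(.metadata.selfLink) | del(.metadata.uid)' - \
  > my-cluster-secrets.yaml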

The procedure for copying the SSL keys is almost the same as for the users. The difference is the names of the Secret objects: they are usually called <CLUSTER_NAME>-ssl and <CLUSTER_NAME>-ssl-internal. It is also possible to specify them in secrets.ssl and secrets.sslInternal in the Custom Resource. Copy these two Secrets from Main to DR and reference them in the CR.

Tune Custom Resource

cr-replica.yaml will have the following changes:

  secrets:
    users: my-cluster-name-secrets
    ssl: replica-cluster-ssl
    sslInternal: replica-cluster-ssl-internal

  replsets:
  - name: rs0
    size: 3
    expose:
      enabled: true
      exposeType: LoadBalancer

  sharding:
    enabled: true
    configsvrReplSet:
      size: 3
      expose:
        enabled: true
        exposeType: LoadBalancer

  backup:
    enabled: false

Once the Custom Resource is applied, the services are going to be created.  We will need the IP addresses of each ReplicaSet node to configure the DR site.

$ kubectl get services
NAME                  TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)           AGE
replica-cluster-cfg-0    LoadBalancer   10.111.241.213   34.78.119.1       27017:31083/TCP   5m28s
replica-cluster-cfg-1    LoadBalancer   10.111.243.70    35.195.138.253    27017:31957/TCP   4m52s
replica-cluster-cfg-2    LoadBalancer   10.111.246.94    146.148.113.165   27017:30196/TCP   4m6s
...
replica-cluster-rs0-0    LoadBalancer   10.111.241.41    34.79.64.213      27017:31993/TCP   5m28s
replica-cluster-rs0-1    LoadBalancer   10.111.242.158   34.76.238.149     27017:32012/TCP   4m47s
replica-cluster-rs0-2    LoadBalancer   10.111.242.191   35.195.253.107    27017:31209/TCP   4m22s

Add External Nodes to Main

At this step, we are going to add unmanaged nodes to the Replica Set on the Main site. In cr-main.yaml we should add externalNodes under replsets.[] and sharding.configsvrReplSet:

  replsets:
  - name: rs0
    externalNodes:
    - host: 34.79.64.213
      priority: 1
      votes: 1
    - host: 34.76.238.149
      priority: 1
      votes: 1
    - host: 35.195.253.107
      priority: 0
      votes: 0

  sharding:
    configsvrReplSet:
      externalNodes:
      - host: 34.78.119.1
        priority: 1
        votes: 1
      - host: 35.195.138.253
        priority: 1
        votes: 1
      - host: 146.148.113.165
        priority: 0
        votes: 0

Please note that we add three nodes, but only two of them are voters. We do this to avoid split-brain situations and to avoid starting a primary election if the DR site is down or there is a network disruption between the Main and DR sites.

Failover

Once all the configuration above is applied, the situation will look like this:

Failover

We have three voters in the Main cluster and two voters in the replica cluster. That means the replica nodes won’t have a majority in case of a Main cluster failure, and they won’t be able to elect a new primary. Therefore, we need to step in and perform a manual failover.

Let’s kill the main cluster:

gcloud compute instances list | 
grep my-main-gke-demo | 
awk '{print $1}' | 
xargs gcloud compute instances delete --zone europe-west3-b

gcloud container node-pools delete \
--zone europe-west3-b \
--cluster my-main-gke-demo \
default-pool

I deleted the nodes and the node pool of the Main Kubernetes cluster, so now the cluster is in an unhealthy state. Let’s see what mongos on the DR site says when we try to read or write through it (psmdb-tester can be found in the git repo as well):

% ./psmdb-tester
2021/09/03 18:19:19 Successfully connected and pinged 34.141.3.189:27017
2021/09/03 18:19:40 read failed: (FailedToSatisfyReadPreference) Encountered non-retryable error during query :: caused by :: Could not find host matching read preference { mode: "primary" } for set cfg
2021/09/03 18:19:49 write failed: (FailedToSatisfyReadPreference) Could not find host matching read preference { mode: "primary" } for set cfg

Normally, we can only alter the replica set configuration from the primary node, but in this kind of situation, where there is no primary and only a few surviving members, MongoDB allows us to force the reconfiguration from any alive member.

Let’s connect to one of the secondary nodes in the replica cluster and perform the failover:

kubectl exec -it psmdb-client-7b9f978649-pjb2k -- mongo 'mongodb://clusterAdmin:<pass>@replica-cluster-rs0-0.replica.svc.cluster.local/admin?ssl=false'
...
rs0:SECONDARY> cfg = rs.config()
rs0:SECONDARY> cfg.members = [cfg.members[3], cfg.members[4], cfg.members[5]]
rs0:SECONDARY> rs.reconfig(cfg, {force: true})

Note that the indexes of surviving members may differ in your environment. You should check rs.status() and rs.config() outputs first. The main idea is to repopulate config members with only surviving members.

After the reconfiguration, the replica set will have just three members, two of which have votes and form a majority, so they will be able to elect a new primary. After performing the same process on the cfg replica set, we will be able to read and write through mongos again:

% ./psmdb-tester
2021/09/03 18:41:48 Successfully connected and pinged 34.141.3.189:27017
2021/09/03 18:41:49 read succeed
2021/09/03 18:41:50 read succeed
2021/09/03 18:41:51 read succeed
2021/09/03 18:41:52 read succeed
2021/09/03 18:41:53 read succeed
2021/09/03 18:41:54 read succeed
2021/09/03 18:41:55 read succeed
2021/09/03 18:41:56 read succeed
2021/09/03 18:41:57 read succeed
2021/09/03 18:41:58 read succeed
2021/09/03 18:41:58 write succeed

Once the replica cluster has become the primary, you should reconfigure all clients that connect to the old main cluster and point them to the DR site.
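
As a hedged example of what the new client configuration might look like (34.141.3.189 is the mongos address the tester used above; the user, password placeholder, and TLS options are assumptions):

mongo "mongodb://<app-user>:<password>@34.141.3.189:27017/admin?tls=true&tlsCAFile=/path/to/ca.pem"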

Conclusion

Disaster Recovery is important for business continuity. The goal of administrators and SREs is to have a plan in place. With the new release of Percona Distribution for MongoDB Operator, setting up DR is fast, automated, and enables IT teams to meet RTO and RPO requirements.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions please raise the topic in the community forum.

Are you a developer looking to contribute? Please read our CONTRIBUTING.md and send us a Pull Request.

Percona Distribution for MongoDB Operator

The Percona Distribution for MongoDB Operator simplifies running Percona Server for MongoDB on Kubernetes and provides automation for day-1 and day-2 operations. It’s based on the Kubernetes API and enables highly available environments. Regardless of where it is used, the Operator creates a member that is identical to other members created with the same Operator. This provides an assured level of stability to easily build test environments or deploy a repeatable, consistent database environment that meets Percona expert-recommended best practices.


Oct
07
2021
--

Getting Started with ProxySQL in Kubernetes

There are plenty of ways to run ProxySQL in Kubernetes (K8S). For example, we can deploy sidecar containers on the application pods, or run a dedicated ProxySQL service with its own pods.

We are going to discuss the latter approach, which is more likely to be used when dealing with a large number of application pods. Remember that each ProxySQL instance runs a number of checks against the database backends. These checks monitor things like server status and replication lag. Having too many proxies can cause significant overhead.
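
As a hedged illustration of those checks, the monitor.* tables on ProxySQL's admin interface (port 6032, local admin/admin credentials by default, matching the configuration used below) record every ping and read_only probe an instance makes against its backends:

mysql -uadmin -padmin -h127.0.0.1 -P6032 -e "SELECT * FROM monitor.mysql_server_ping_log ORDER BY time_start_us DESC LIMIT 3; SELECT * FROM monitor.mysql_server_read_only_log ORDER BY time_start_us DESC LIMIT 3;"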

Creating a Cluster

For the purpose of this example, I am going to deploy a test cluster in GKE. We need to follow these steps:

1. Create a cluster

gcloud container clusters create ivan-cluster --preemptible --project my-project --zone us-central1-c --machine-type n2-standard-4 --num-nodes=3

2. Configure command-line access

gcloud container clusters get-credentials ivan-cluster --zone us-central1-c --project my-project

3. Create a Namespace

kubectl create namespace ivantest-ns

4. Set the context to use our new Namespace

kubectl config set-context $(kubectl config current-context) --namespace=ivantest-ns

Dedicated Service Using a StatefulSet

One way to implement this approach is to have ProxySQL pods use persistent volumes to store the configuration. We can rely on ProxySQL Cluster mode to make sure the configuration is kept in sync.

For simplicity, we are going to use a ConfigMap with the initial config for bootstrapping the ProxySQL service for the first time.

Exposing the passwords in the ConfigMap is far from ideal, and so far the K8S community hasn’t settled on a way to reference Secrets from a ConfigMap.
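
If that is a concern for you, a hedged workaround is to keep the very same proxysql.cnf in a Secret and mount it in place of the ConfigMap (the name proxysql-secret is illustrative; the StatefulSet below would then use a secret volume with secretName: proxysql-secret instead of the configMap volume):

kubectl create secret generic proxysql-secret --from-file=proxysql.cnf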

1. Prepare a file for the ConfigMap

tee proxysql.cnf <<EOF
datadir="/var/lib/proxysql"
 
admin_variables=
{
    admin_credentials="admin:admin;cluster:secret"
    mysql_ifaces="0.0.0.0:6032"
    refresh_interval=2000
    cluster_username="cluster"
    cluster_password="secret"  
}
 
mysql_variables=
{
    threads=4
    max_connections=2048
    default_query_delay=0
    default_query_timeout=36000000
    have_compress=true
    poll_timeout=2000
    interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
    default_schema="information_schema"
    stacksize=1048576
    server_version="8.0.23"
    connect_timeout_server=3000
    monitor_username="monitor"
    monitor_password="monitor"
    monitor_history=600000
    monitor_connect_interval=60000
    monitor_ping_interval=10000
    monitor_read_only_interval=1500
    monitor_read_only_timeout=500
    ping_interval_server_msec=120000
    ping_timeout_server=500
    commands_stats=true
    sessions_sort=true
    connect_retries_on_failure=10
}
 
mysql_servers =
(
    { address="mysql1" , port=3306 , hostgroup=10, max_connections=100 },
    { address="mysql2" , port=3306 , hostgroup=20, max_connections=100 }
)
 
mysql_users =
(
    { username = "myuser", password = "password", default_hostgroup = 10, active = 1 }
)
 
proxysql_servers =
(
    { hostname = "proxysql-0.proxysqlcluster", port = 6032, weight = 1 },
    { hostname = "proxysql-1.proxysqlcluster", port = 6032, weight = 1 },
    { hostname = "proxysql-2.proxysqlcluster", port = 6032, weight = 1 }
)
EOF

2. Create the ConfigMap

kubectl create configmap proxysql-configmap --from-file=proxysql.cnf

3. Prepare a file with the StatefulSet

tee proxysql-ss-svc.yml <<EOF
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: proxysql
  labels:
    app: proxysql
spec:
  replicas: 3
  serviceName: proxysqlcluster
  selector:
    matchLabels:
      app: proxysql
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: proxysql
    spec:
      restartPolicy: Always
      containers:
      - image: proxysql/proxysql:2.3.1
        name: proxysql
        volumeMounts:
        - name: proxysql-config
          mountPath: /etc/proxysql.cnf
          subPath: proxysql.cnf
        - name: proxysql-data
          mountPath: /var/lib/proxysql
          subPath: data
        ports:
        - containerPort: 6033
          name: proxysql-mysql
        - containerPort: 6032
          name: proxysql-admin
      volumes:
      - name: proxysql-config
        configMap:
          name: proxysql-configmap
  volumeClaimTemplates:
  - metadata:
      name: proxysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 2Gi
---
apiVersion: v1
kind: Service
metadata:
  annotations:
  labels:
    app: proxysql
  name: proxysql
spec:
  ports:
  - name: proxysql-mysql
    nodePort: 30033
    port: 6033
    protocol: TCP
    targetPort: 6033
  - name: proxysql-admin
    nodePort: 30032
    port: 6032
    protocol: TCP
    targetPort: 6032
  selector:
    app: proxysql
  type: NodePort
EOF

4. Create the StatefulSet

kubectl create -f proxysql-ss-svc.yml

5. Prepare the definition of the headless Service (more on this later)

tee proxysql-headless-svc.yml <<EOF 
apiVersion: v1
kind: Service
metadata:
  name: proxysqlcluster
  labels:
    app: proxysql
spec:
  clusterIP: None
  ports:
  - port: 6032
    name: proxysql-admin
  selector:
    app: proxysql
EOF

6. Create the headless Service

kubectl create -f proxysql-headless-svc.yml

7. Verify the Services

kubectl get svc

NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                         AGE
proxysql          NodePort    10.3.249.158   <none>        6033:30033/TCP,6032:30032/TCP   12m
proxysqlcluster   ClusterIP   None           <none>        6032/TCP                        8m53s

Pod Name Resolution

By default, each pod has a DNS name associated in the form pod-ip-address.my-namespace.pod.cluster-domain.example.

The headless Service causes K8S to auto-create a DNS record with each pod’s FQDN as well. The result is we will have the following entries available:

proxysql-0.proxysqlcluster
proxysql-1.proxysqlcluster
proxysql-2.proxysqlcluster

We can then use these to set up the ProxySQL cluster (the proxysql_servers part of the configuration file).
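
A quick hedged check that these records exist (the busybox image and pod name are assumptions) is to resolve one of them from inside the cluster:

kubectl run -i --rm --tty dns-test --image=busybox --restart=Never -- nslookup proxysql-0.proxysqlcluster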

Connecting to the Service

To test the service, we can run a container that includes a MySQL client and connect its console output to our terminal. For example, use the following command (which also removes the container/pod after we exit the shell):

kubectl run -i --rm --tty percona-client --image=percona/percona-server:latest --restart=Never -- bash -il

The connections from other pods should be sent to the Cluster-IP and port 6033 and will be load balanced. We can also use the DNS name proxysql.ivantest-ns.svc.cluster.local that got auto-created.

mysql -umyuser -ppassword -h10.3.249.158 -P6033

If the client is connecting from an external network, use NodePort 30033 and the external IP address of one of the Kubernetes nodes instead (the ClusterIP shown above is only reachable from inside the cluster):

mysql -umyuser -ppassword -h<node-external-ip> -P30033

Cleanup Steps

In order to remove all the resources we created, run the following steps:

kubectl delete statefulsets proxysql
kubectl delete service proxysql
kubectl delete service proxysqlcluster

Final Words

We have seen one of the possible ways to deploy ProxySQL in Kubernetes. The approach presented here has a few shortcomings but is good enough for illustrative purposes. For a production setup, consider looking at the Percona Kubernetes Operators instead.

