May 16, 2022

Running Rocket.Chat with Percona Server for MongoDB on Kubernetes

Our goal is a Rocket.Chat deployment that uses a highly available Percona Server for MongoDB cluster as the backend database, all running on Kubernetes. To get there, we will do the following:

  • Start a Google Kubernetes Engine (GKE) cluster across multiple availability zones. It can be any other Kubernetes flavor or service, but I rely on multi-AZ capability in this blog post.
  • Deploy Percona Operator for MongoDB and a database cluster with it
  • Deploy Rocket.Chat with specific affinity rules
    • Rocket.Chat will be exposed via a load balancer

Percona Operator for MongoDB, compared to other solutions, is not only the most feature-rich but also comes with various management capabilities for your MongoDB clusters – backups, scaling (including sharding), zero-downtime upgrades, and more. There are no hidden costs, and it is truly open source.

This blog post is a walkthrough of running a production-grade deployment of Rocket.Chat with Percona Operator for MongoDB.

Rock’n’Roll

All YAML manifests that I use in this blog post can be found in this repository.

Deploy Kubernetes Cluster

The following command deploys a GKE cluster named percona-rocket in three availability zones:

gcloud container clusters create --zone us-central1-a --node-locations us-central1-a,us-central1-b,us-central1-c percona-rocket --cluster-version 1.21 --machine-type n1-standard-4 --preemptible --num-nodes=3

Read more about this in the documentation.

Deploy MongoDB

I’m going to use helm to deploy the Operator and the cluster.

Add helm repository:

helm repo add percona https://percona.github.io/percona-helm-charts/

Install the Operator into the percona namespace:

helm install psmdb-operator percona/psmdb-operator --create-namespace --namespace percona

Deploy the cluster of Percona Server for MongoDB nodes:

helm install my-db percona/psmdb-db -f psmdb-values.yaml -n percona

Replica set nodes are going to be distributed across availability zones. To get there, I altered the affinity keys in the corresponding sections of psmdb-values.yaml:

antiAffinityTopologyKey: "topology.kubernetes.io/zone"
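
For context, here is roughly how that key sits in the replsets section of the values file (a trimmed sketch; the exact layout depends on the psmdb-db chart version, so check your chart's values.yaml):

replsets:
- name: rs0
  size: 3
  # schedule mongod Pods in different zones rather than just on different hosts
  antiAffinityTopologyKey: "topology.kubernetes.io/zone"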

Prepare MongoDB

For Rocket.Chat to connect to our database cluster, we need to create the users. By default, clusters provisioned with our Operator have a userAdmin user; its password is set in psmdb-values.yaml:

MONGODB_USER_ADMIN_PASSWORD: userAdmin123456

For production-grade systems, do not forget to change this password or create dedicated secrets to provision those. Read more about user management in our documentation.

Spin up a client Pod to connect to the database:

kubectl run -i --rm --tty percona-client1 --image=percona/percona-server-mongodb:4.4.10-11 --restart=Never -- bash -il

Connect to the database with userAdmin:

[mongodb@percona-client1 /]$ mongo "mongodb://userAdmin:userAdmin123456@my-db-psmdb-db-rs0-0.percona/admin?replicaSet=rs0"

We are going to create the following:

  • rocketchat database
  • rocketChat user to store data and connect to the database
  • oplogger user to provide access to the oplog for Rocket.Chat
    • Rocket.Chat uses Meteor Oplog tailing to improve performance. It is optional.

use rocketchat
db.createUser({
  user: "rocketChat",
  pwd: passwordPrompt(),
  roles: [
    { role: "readWrite", db: "rocketchat" }
  ]
})

use admin
db.createUser({
  user: "oplogger",
  pwd: passwordPrompt(),
  roles: [
    { role: "read", db: "local" }
  ]
})
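
With these users in place, Rocket.Chat's two connection strings will look roughly like this (the hostnames follow the my-db release naming used above; the <password> placeholders stand for the values entered at the passwordPrompt() calls, and the exact fields to put them into are in rocket-values.yaml):

MONGO_URL=mongodb://rocketChat:<password>@my-db-psmdb-db-rs0-0.percona,my-db-psmdb-db-rs0-1.percona,my-db-psmdb-db-rs0-2.percona/rocketchat?replicaSet=rs0
MONGO_OPLOG_URL=mongodb://oplogger:<password>@my-db-psmdb-db-rs0-0.percona,my-db-psmdb-db-rs0-1.percona,my-db-psmdb-db-rs0-2.percona/local?replicaSet=rs0&authSource=admin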

Deploy Rocket.Chat

I will use helm here to maintain the same approach. 

helm install -f rocket-values.yaml my-rocketchat rocketchat/rocketchat --version 3.0.0

You can find rocket-values.yaml in the same repository. Please make sure you set the correct passwords in the corresponding YAML fields.

As you can see, I also do the following (a trimmed sketch of these fields follows the list):

  • Line 11: expose Rocket.Chat through a LoadBalancer service type
  • Lines 13-14: set the number of Rocket.Chat Pod replicas. We want three – one per availability zone.
  • Lines 16-23: set affinity to distribute Pods across availability zones
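
Here is what those parts might look like (the key names are my recollection of the rocketchat chart and may differ; rocket-values.yaml in the repository is authoritative):

replicaCount: 3

service:
  type: LoadBalancer

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    # keep Rocket.Chat Pods out of each other's zones
    - labelSelector:
        matchLabels:
          app.kubernetes.io/name: rocketchat
      topologyKey: topology.kubernetes.io/zone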

Load Balancer will be created with a public IP address:

$ kubectl get service my-rocketchat-rocketchat
NAME                       TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)        AGE
my-rocketchat-rocketchat   LoadBalancer   10.32.17.26   34.68.238.56   80:32548/TCP   12m

You should now be able to connect to 34.68.238.56 and enjoy your highly available Rocket.Chat installation.

Clean Up

Uninstall all helm charts to remove the MongoDB cluster, the Operator, and Rocket.Chat:

helm uninstall my-rocketchat
helm uninstall my-db -n percona
helm uninstall psmdb-operator -n percona

Things to Consider

Ingress

Instead of exposing Rocket.Chat through a load balancer, you may also try ingress. By doing so, you can integrate it with cert-manager and have a valid TLS certificate for your chat server.

Mongos

It is also possible to run a sharded MongoDB cluster with Percona Operator. If you do so, Rocket.Chat will connect to the mongos Service instead of the replica set nodes. But you will still need to connect to the replica set directly to get the oplog.

Conclusion

We encourage you to try out Percona Operator for MongoDB with Rocket.Chat and let us know your results on our community forum.

There is always room for improvement and a time to find a better way. Please let us know if you face any issues or want to contribute your ideas to Percona products. You can do that on the Community Forum or JIRA. Read more about contribution guidelines for Percona Operator for MongoDB in CONTRIBUTING.md.

Percona Operator for MongoDB contains everything you need to quickly and consistently deploy and scale Percona Server for MongoDB instances into a Kubernetes cluster on-premises or in the cloud. The Operator enables you to improve time to market with the ability to quickly deploy standardized and repeatable database environments. Deploy your database with a consistent and idempotent result no matter where they are used.

May 12, 2022

Percona Operator for MongoDB and Kubernetes MCS: The Story of One Improvement

Percona Operator for MongoDB has supported multi-cluster or cross-site replication deployments since version 1.10. This functionality is extremely useful if you want to have a disaster recovery deployment or perform a migration from or to a MongoDB cluster running in Kubernetes. In a nutshell, it allows you to use Operators deployed in different Kubernetes clusters to manage and expand replica sets.

For example, you have two Kubernetes clusters: one in Region A, another in Region B.

  • In Region A you deploy your MongoDB cluster with Percona Operator. 
  • In Region B you deploy unmanaged MongoDB nodes with another installation of Percona Operator.
  • You configure both Operators, so that nodes in Region B are added to the replica set in Region A.

In case of failure of Region A, you can switch your traffic to Region B.

Migrating MongoDB to Kubernetes describes the migration process using this functionality of the Operator.

This feature was released as a tech preview, and we received lots of positive feedback from our users. But one of our customers raised an internal ticket pointing out that the cross-site replication functionality does not work with Multi-Cluster Services. This started the investigation and the creation of this ticket – K8SPSMDB-625.

This blog post will go into the deep roots of this story and how it is solved in the latest release of Percona Operator for MongoDB version 1.12.

The Problem

Multi-Cluster Services, or MCS, allows you to expand network boundaries for the Kubernetes cluster and share Service objects across these boundaries. Some call it another take on Kubernetes Federation. This feature is already available on some managed Kubernetes offerings, such as Google Kubernetes Engine (GKE) and Amazon Elastic Kubernetes Service (EKS). Submariner uses the same logic and primitives under the hood.

MCS Basics

To understand the problem, we need to understand how multi-cluster services work. Let’s take a look at the picture below:

multi-cluster services

  • We have two Pods in different Kubernetes clusters
  • We add these two clusters into our MCS domain
  • Each Pod has a Service and an IP address which are unique to its Kubernetes cluster
  • MCS introduces new Custom Resources – ServiceImport and ServiceExport.
    • Once you create a ServiceExport object in one cluster, a ServiceImport object appears in all clusters in your MCS domain.
    • This ServiceImport object is in the svc.clusterset.local domain and, with the network magic introduced by MCS, can be accessed from any cluster in the MCS domain.

This means that if I have an application in the Kubernetes cluster in Region A, I can connect to a Pod in the Kubernetes cluster in Region B through a domain name like my-pod.<namespace>.svc.clusterset.local. And it works from the other cluster as well.

MCS and Replica Set

Here is how cross-site replication works with Percona Operator if you use a load balancer:

MCS and Replica Set

All replica set nodes have a dedicated Service and a load balancer. A replica set in the MongoDB cluster is formed using these public IP addresses. The external node is added using a public IP address as well:

  replsets:
  - name: rs0
    size: 3
    externalNodes:
    - host: 123.45.67.89

All nodes can reach each other, which is required to form a healthy replica set.

Here is how it looks when you have clusters connected through multi-cluster service:

Instead of load balancers, replica set nodes are exposed through Cluster IPs. We have ServiceExport and ServiceImport resources. Everything looks good at the networking level and it should work, but it does not.

The problem is in the way the Operator builds the MongoDB replica set in Region A. To register an external node from Region B in the replica set, we use the MCS domain name in the corresponding section:

  replsets:
  - name: rs0
    size: 3
    externalNodes:
    - host: rs0-4.mongo.svc.clusterset.local

Now our rs.status() will look like this:

"name" : "my-cluster-rs0-0.mongo.svc.cluster.local:27017"
"role" : "PRIMARY"
...
"name" : "my-cluster-rs0-1.mongo.svc.cluster.local:27017"
"role" : "SECONDARY"
...
"name" : "my-cluster-rs0-2.mongo.svc.cluster.local:27017"
"role" : "SECONDARY"
...
"name" : "rs0-4.mongo.svc.clusterset.local:27017"
"role" : "UNKNOWN"

As you can see, the Operator formed a replica set out of three nodes using the svc.cluster.local domain, as that is how it should be done when you expose nodes with the ClusterIP Service type. In this case, a node in Region B cannot reach any node in Region A, as it tries to connect to a domain that is local to the cluster in Region A.

In the picture below, you can easily see where the problem is:

The Solution

Luckily, we have a Special Interest Group (SIG), a Kubernetes Enhancement Proposal (KEP), and multiple implementations for enabling Multi-Cluster Services. Having a KEP is great since we can be sure the implementations from different providers (e.g., GCP, AWS) will follow more or less the same standard.

There are two fields in the Custom Resource that control MCS in the Operator:

spec:
  multiCluster:
    enabled: true
    DNSSuffix: svc.clusterset.local

Let’s see what is happening in the background with these flags set.

ServiceImport and ServiceExport Objects

Once you enable MCS by patching the CR with spec.multiCluster.enabled: true, the Operator creates a ServiceExport object for each Service. These ServiceExports will be detected by the MCS controller in the cluster, and eventually a ServiceImport for each ServiceExport will be created in the same namespace in each cluster that has MCS enabled.
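
A ServiceExport object itself is tiny – it just marks an existing Service for export. A sketch of what gets created (API group per the upstream MCS KEP; managed offerings such as GKE may use their own variant):

apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  # must match the name and namespace of the Service being exported
  name: my-cluster-rs0-0
  namespace: mongo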

As you see, we made a decision and empowered the Operator to create ServiceExport objects. There are two main reasons for doing that:

  • If any infrastructure-as-code tool is used, it would require additional logic and complexity to automate the creation of the required MCS objects. If the Operator takes care of it, no additional work is needed.
  • Our Operators take care of the infrastructure for the database, including Service objects. It just felt logical to expand the reach of this functionality to MCS.

Replica Set and Transport Encryption

The root cause of the problem that we are trying to solve here lies in the networking field, where external replica set nodes try to connect to the wrong domain names. Now, when you enable multi-cluster and set DNSSuffix (it defaults to svc.clusterset.local), the Operator does the following:

  • The replica set is formed using the MCS domain set in DNSSuffix
  • The Operator generates TLS certificates as usual, but adds the DNSSuffix domains into the picture

With this approach, the traffic between nodes flows as expected and is encrypted by default.

Things to Consider

MCS APIs

Please note that the Operator won’t install MCS APIs and controllers into your Kubernetes cluster. You need to install them by following your provider’s instructions prior to enabling MCS for your PSMDB clusters. See our docs for links to different providers.

The Operator detects whether MCS is installed in the cluster by checking the available API resources. The detection happens before the controllers are started in the Operator. If you installed the MCS APIs while the Operator was running, you need to restart the Operator. Otherwise, you’ll see an error like this:

{
  "level": "error",
  "ts": 1652083068.5910048,
  "logger": "controller.psmdb-controller",
  "msg": "Reconciler error",
  "name": "cluster1",
  "namespace": "psmdb",
  "error": "wrong psmdb options: MCS is not available on this cluster",
  "errorVerbose": "...",
  "stacktrace": "..."
}

ServiceImport Provisioning Time

It might take some time for ServiceImport objects to be created in the Kubernetes cluster. You can see the following messages in the logs while creation is in progress:

{
  "level": "info",
  "ts": 1652083323.483056,
  "logger": "controller_psmdb",
  "msg": "waiting for service import",
  "replset": "rs0",
  "serviceExport": "cluster1-rs0"
}

During testing, we saw wait times of up to 10-15 minutes. If you see your cluster stuck in the initializing state waiting for service imports, it’s a good idea to check the usage and quotas for your environment.
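
You can also list the MCS objects directly to see whether the ServiceExports and ServiceImports exist yet (resource kinds per the MCS KEP; adjust the namespace to yours):

kubectl get serviceexport,serviceimport -n mongo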

DNSSuffix

We also made a decision to automatically generate TLS certificates for the Percona Server for MongoDB cluster with the *.clusterset.local domain, even if MCS is not enabled. This approach simplifies the process of enabling MCS for a running MongoDB cluster. It does not make much sense to change the DNSSuffix field unless you have hard requirements from your service provider, but we still allow such a change.

If you want to enable MCS with a cluster deployed with an Operator version below 1.12, you need to update your TLS certificates to include *.clusterset.local SANs. See the docs for instructions.

Conclusion

Business relies more than ever on applications and the infrastructure that serves them. Disaster Recovery protocols and various failsafe mechanisms are routine for reliability engineers, not an annoying task in the backlog.

With multi-cluster deployment functionality in Percona Operator for MongoDB, we want to equip users to build highly available and secured database clusters with minimal effort.

Percona Operator for MongoDB is truly open source and provides users with a way to deploy and manage their enterprise-grade MongoDB clusters on Kubernetes. We encourage you to try this new Multi-Cluster Services integration and let us know your results on our community forum. You can find some scripts that would help you provision your first MCS clusters on GKE or EKS here.

There is always room for improvement and a time to find a better way. Please let us know if you face any issues or want to contribute your ideas to Percona products. You can do that on the Community Forum or JIRA. Read more about contribution guidelines for Percona Operator for MongoDB in CONTRIBUTING.md.

Apr 18, 2022

Expose Databases on Kubernetes with Ingress

Ingress is a resource that is commonly used to expose HTTP(s) services outside of Kubernetes. To have ingress support, you will need an Ingress Controller, which in a nutshell is a proxy. SREs and DevOps love ingress as it provides developers with a self-service to expose their applications. Developers love it as it is simple to use, but at the same time quite flexible.

High-level ingress design looks like this: 

  1. Users connect through a single Load Balancer or other Kubernetes service
  2. Traffic is routed through Ingress Pod (or Pods for high availability)
    • There are multiple flavors of Ingress Controllers. Some use nginx, some envoy, or other proxies. See a curated list of Ingress Controllers here.
  3. Based on HTTP headers, traffic is routed to the corresponding Pods which run the websites. For HTTPS traffic, Server Name Indication (SNI) is used, which is a TLS extension supported by most browsers. Usually, the ingress controller integrates nicely with cert-manager, which provides you with full TLS lifecycle support (yeah, no need to worry about renewals anymore).

The beauty and value of such a design is that you have a single Load Balancer serving all your websites. In Public Clouds, it leads to cost savings, and in private clouds, it simplifies your networking or reduces the number of IPv4 addresses (if you are not on IPv6 yet). 

TCP and UDP with Ingress

Quite interestingly, some ingress controllers also support TCP and UDP proxying. I have been asked on our forum and Kubernetes Slack multiple times if it is possible to use ingress with Percona Operators. Well, it is. Usually, you need a load balancer per database cluster:

The design with ingress is going to be a bit more complicated but still allows you to utilize the single load balancer for multiple databases. In cases where you run hundreds of clusters, it leads to significant cost savings. 

  1. Each TCP port represents a database cluster
  2. The Ingress Controller decides where to route the traffic based on the port

The obvious downside of this design is non-default TCP ports for databases. There might be weird cases where it can turn into a blocker, but usually, it should not.

Hands-on

My goal is to expose three database clusters through a single ingress load balancer.

All configuration files I used for this blog post can be found in this Github repository.

Deploy Percona XtraDB Clusters (PXC)

The following commands are going to deploy the Operator and three clusters:

kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/operators-and-ingress/bundle.yaml

kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/operators-and-ingress/cr-minimal.yaml
kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/operators-and-ingress/cr-minimal2.yaml
kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/operators-and-ingress/cr-minimal3.yaml

Deploy Ingress

helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace \
  --set controller.replicaCount=2 \
  --set tcp.3306="default/minimal-cluster-haproxy:3306" \
  --set tcp.3307="default/minimal-cluster2-haproxy:3306" \
  --set tcp.3308="default/minimal-cluster3-haproxy:3306"

This is going to deploy a highly available ingress-nginx controller. 

  • controller.replicaCount=2 – defines that we want to have at least two Pods of the ingress controller, to provide a highly available setup.
  • The tcp flags do two things:
    • expose ports 3306-3308 on the ingress’s load balancer
    • instruct the ingress controller to forward traffic to the corresponding Services created by Percona Operator for the PXC clusters. For example, port 3307 is the one to use to connect to minimal-cluster2. Read more about this configuration in the ingress documentation.

Here is the load balancer resource that was created:

$ kubectl -n ingress-nginx get service
NAME                                 TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                                                                   AGE
…
ingress-nginx-controller             LoadBalancer   10.43.72.142    212.2.244.54   80:30261/TCP,443:32112/TCP,3306:31583/TCP,3307:30786/TCP,3308:31827/TCP   4m13s

As you see, ports 3306-3308 are exposed.

Check the Connection

This is it. Database clusters should be exposed and reachable. Let’s check the connection. 

Get the root password for minimal-cluster2:

$ kubectl get secrets minimal-cluster2-secrets | grep root | awk '{print $2}' | base64 --decode && echo
kSNzx3putYecHwSXEgf

Connect to the database:

$ mysql -u root -h 212.2.244.54 --port 3307  -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5722

It works! Notice how I use port 3307, which corresponds to minimal-cluster2.

Adding More Clusters

What if you add more clusters into the picture? How do you expose those?

Helm

If you use helm, the easiest way is to just add one more flag into the command:

helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace \
  --set controller.replicaCount=2 \
  --set tcp.3306="default/minimal-cluster-haproxy:3306" \
  --set tcp.3307="default/minimal-cluster2-haproxy:3306" \
  --set tcp.3308="default/minimal-cluster3-haproxy:3306" \
  --set tcp.3309="default/minimal-cluster4-haproxy:3306"

No helm

Without Helm, it is a two-step process:

First, edit the configMap which configures TCP services exposure. By default, it is called ingress-nginx-tcp:

kubectl -n ingress-nginx edit cm ingress-nginx-tcp
…
apiVersion: v1
data:
  "3306": default/minimal-cluster-haproxy:3306
  "3307": default/minimal-cluster2-haproxy:3306
  "3308": default/minimal-cluster3-haproxy:3306
  "3309": default/minimal-cluster4-haproxy:3306

A change in the configMap will trigger the reconfiguration of nginx in the ingress Pods. But as a second step, it is also necessary to expose this port on the load balancer. To do so, edit the corresponding Service:

kubectl -n ingress-nginx edit services ingress-nginx-controller
…
spec:
  ports:
    …
  - name: 3309-tcp
    port: 3309
    protocol: TCP
    targetPort: 3309-tcp

The new cluster is exposed on port 3309 now.

Limitations and Considerations

Ports per Load Balancer

Cloud providers usually limit the number of ports that you can expose through a single load balancer:

  • AWS allows 50 listeners per NLB; GCP allows 100 ports per Service.

If you hit the limit, just create another load balancer pointing to the ingress controller.
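
Such an additional load balancer is just one more Service selecting the same controller Pods. A sketch, assuming the default labels from the ingress-nginx chart and a hypothetical extra port 3310:

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-extra-lb
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  # same selector labels the chart sets on the controller Pods
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/component: controller
  ports:
  - name: 3310-tcp
    port: 3310
    protocol: TCP
    targetPort: 3310-tcp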

Automation

Cost-saving is a good thing, but with Kubernetes capabilities, users expect to avoid manual tasks, not add more. Integrating the change of the ingress configMap and load balancer ports into CI/CD is not a hard task, but maintaining the logic of adding new load balancers for more ports is harder. I have not found any projects that implement the logic of reusing load balancer ports or automating it in any way. If you know of any, please leave a comment under this blog post.

Mar 15, 2022

Run PostgreSQL on Kubernetes with Percona Operator and Pulumi

Avoid vendor lock-in, provide a private Database-as-a-Service for internal teams, quickly deploy-test-destroy databases with CI/CD pipeline – these are some of the most common use cases for running databases on Kubernetes with operators. Percona Distribution for PostgreSQL Operator enables users to do exactly that and more.

Pulumi is an infrastructure-as-a-code tool, which enables developers to write code in their favorite language (Python, Golang, JavaScript, etc.) to deploy infrastructure and applications easily to public clouds and platforms such as Kubernetes.

This blog post is a step-by-step guide on how to deploy a highly-available PostgreSQL cluster on Kubernetes with our Percona Operator and Pulumi.

Desired State

We are going to provision the following resources with Pulumi:

  • Google Kubernetes Engine cluster with three nodes. It can be any Kubernetes flavor.
  • Percona Operator for PostgreSQL
  • Highly available PostgreSQL cluster with one primary and two hot standby nodes
  • Highly available pgBouncer deployment with the Load Balancer in front of it
  • pgBackRest for local backups

Pulumi code can be found in this git repository.

Prepare

I will use an Ubuntu box to run Pulumi, but almost the same steps would work on macOS.

Pre-install Packages

gcloud and kubectl

echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key --keyring /usr/share/keyrings/cloud.google.gpg add -
sudo apt-get update
sudo apt-get install -y google-cloud-sdk docker.io kubectl jq unzip

python3

Pulumi allows developers to use the language of their choice to describe infrastructure and applications. I’m going to use Python. We will also need pip (the Python package manager) and venv (the virtual environment module).

sudo apt-get install python3 python3-pip python3-venv

Pulumi

Install Pulumi:

curl -sSL https://get.pulumi.com | sh

On macOS, this can be installed via Homebrew with:

brew install pulumi

You will need to add .pulumi/bin to the $PATH:

export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/percona/.pulumi/bin

Authentication

gcloud

You will need to provide access to Google Cloud to provision Google Kubernetes Engine.

gcloud config set project your-project
gcloud auth application-default login
gcloud auth login

Pulumi

Generate a Pulumi token at app.pulumi.com. You will need it later to init the Pulumi stack.

Action

This repo has the following files:

  • Pulumi.yaml – identifies that it is a folder with a Pulumi project
  • __main__.py – Python code used by Pulumi to provision everything we need
  • requirements.txt – to install required Python packages

Clone the repo and go to the pg-k8s-pulumi folder:

git clone https://github.com/spron-in/blog-data
cd blog-data/pg-k8s-pulumi

Init the stack with:

pulumi stack init pg

You will need the token generated earlier on app.pulumi.com here.

__main__.py

The Python code that Pulumi is going to process is in the __main__.py file.

Lines 1-6: importing python packages

Lines 8-31: configuration parameters for this Pulumi stack. It consists of two parts:

  • Kubernetes cluster configuration. For example, the number of nodes.
  • Operator and PostgreSQL cluster configuration – namespace to be deployed to, service type to expose pgBouncer, etc.

Lines 33-80: deploy GKE cluster and export its configuration

Lines 82-88: create the namespace for Operator and PostgreSQL cluster

Lines 91-426: deploy the Operator. In reality, it just mirrors the operator.yaml from our Operator.

Lines 429-444: create the secret object that allows you to set the password for pguser to connect to the database

Lines 445-557: deploy PostgreSQL cluster. It is a JSON version of cr.yaml from our Operator repository

Line 560: exports Kubernetes configuration so that it can be reused later 
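
That export is essentially a one-liner; roughly (a sketch – k8s_config stands for whatever variable holds the generated kubeconfig in __main__.py):

import pulumi

# mark the kubeconfig as a secret so it is encrypted in the Pulumi state
pulumi.export("kubeconfig", pulumi.Output.secret(k8s_config))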

Deploy

First, we will set the configuration for this stack. Execute the following commands:

pulumi config set gcp:project YOUR_PROJECT
pulumi config set gcp:zone us-central1-a
pulumi config set node_count 3
pulumi config set master_version 1.21

pulumi config set namespace percona-pg
pulumi config set pg_cluster_name pulumi-pg
pulumi config set service_type LoadBalancer
pulumi config set pg_user_password mySuperPass

These commands set the following:

  • GCP project where GKE is going to be deployed
  • GCP zone 
  • Number of nodes in a GKE cluster
  • Kubernetes version
  • Namespace to run PostgreSQL cluster
  • The name of the cluster
  • Expose pgBouncer with LoadBalancer object

Deploy with the following command:

$ pulumi up
Previewing update (pg)

View Live: https://app.pulumi.com/spron-in/percona-pg-k8s/pg/previews/d335d117-b2ce-463b-867d-ad34cf456cb3

     Type                                                           Name                                Plan       Info
 +   pulumi:pulumi:Stack                                            percona-pg-k8s-pg                   create     1 message
 +   ├─ random:index:RandomPassword                                 pguser_password                     create
 +   ├─ random:index:RandomPassword                                 password                            create
 +   ├─ gcp:container:Cluster                                       gke-cluster                         create
 +   ├─ pulumi:providers:kubernetes                                 gke_k8s                             create
 +   ├─ kubernetes:core/v1:ServiceAccount                           pgoPgo_deployer_saServiceAccount    create
 +   ├─ kubernetes:core/v1:Namespace                                pgNamespace                         create
 +   ├─ kubernetes:batch/v1:Job                                     pgoPgo_deployJob                    create
 +   ├─ kubernetes:core/v1:ConfigMap                                pgoPgo_deployer_cmConfigMap         create
 +   ├─ kubernetes:core/v1:Secret                                   percona_pguser_secretSecret         create
 +   ├─ kubernetes:rbac.authorization.k8s.io/v1:ClusterRoleBinding  pgo_deployer_crbClusterRoleBinding  create
 +   ├─ kubernetes:rbac.authorization.k8s.io/v1:ClusterRole         pgo_deployer_crClusterRole          create
 +   └─ kubernetes:pg.percona.com/v1:PerconaPGCluster               my_cluster_name                     create

Diagnostics:
  pulumi:pulumi:Stack (percona-pg-k8s-pg):
    E0225 14:19:49.739366105   53802 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies

Do you want to perform this update? yes

Updating (pg)
View Live: https://app.pulumi.com/spron-in/percona-pg-k8s/pg/updates/5
     Type                                                           Name                                Status      Info
 +   pulumi:pulumi:Stack                                            percona-pg-k8s-pg                   created     1 message
 +   ├─ random:index:RandomPassword                                 pguser_password                     created
 +   ├─ random:index:RandomPassword                                 password                            created
 +   ├─ gcp:container:Cluster                                       gke-cluster                         created
 +   ├─ pulumi:providers:kubernetes                                 gke_k8s                             created
 +   ├─ kubernetes:core/v1:ServiceAccount                           pgoPgo_deployer_saServiceAccount    created
 +   ├─ kubernetes:core/v1:Namespace                                pgNamespace                         created
 +   ├─ kubernetes:core/v1:ConfigMap                                pgoPgo_deployer_cmConfigMap         created
 +   ├─ kubernetes:batch/v1:Job                                     pgoPgo_deployJob                    created
 +   ├─ kubernetes:core/v1:Secret                                   percona_pguser_secretSecret         created
 +   ├─ kubernetes:rbac.authorization.k8s.io/v1:ClusterRole         pgo_deployer_crClusterRole          created
 +   ├─ kubernetes:rbac.authorization.k8s.io/v1:ClusterRoleBinding  pgo_deployer_crbClusterRoleBinding  created
 +   └─ kubernetes:pg.percona.com/v1:PerconaPGCluster               my_cluster_name                     created

Diagnostics:
  pulumi:pulumi:Stack (percona-pg-k8s-pg):
    E0225 14:20:00.211695433   53839 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies

Outputs:
    kubeconfig: "[secret]"

Resources:
    + 13 created

Duration: 5m30s

Verify

Get kubeconfig first:

pulumi stack output kubeconfig --show-secrets > ~/.kube/config

Check if Pods of your PG cluster are up and running:

$ kubectl -n percona-pg get pods
NAME                                             READY   STATUS      RESTARTS   AGE
backrest-backup-pulumi-pg-dbgsp                  0/1     Completed   0          64s
pgo-deploy-8h86n                                 0/1     Completed   0          4m9s
postgres-operator-5966f884d4-zknbx               4/4     Running     1          3m27s
pulumi-pg-787fdbd8d9-d4nvv                       1/1     Running     0          2m12s
pulumi-pg-backrest-shared-repo-f58bc7657-2swvn   1/1     Running     0          2m38s
pulumi-pg-pgbouncer-6b6dc4564b-bh56z             1/1     Running     0          81s
pulumi-pg-pgbouncer-6b6dc4564b-vpppx             1/1     Running     0          81s
pulumi-pg-pgbouncer-6b6dc4564b-zkdwj             1/1     Running     0          81s
pulumi-pg-repl1-58d578cf49-czm54                 0/1     Running     0          46s
pulumi-pg-repl2-7888fbfd47-h98f4                 0/1     Running     0          46s
pulumi-pg-repl3-cdd958bd9-tf87k                  1/1     Running     0          46s

Get the IP address of the pgBouncer LoadBalancer:

$ kubectl -n percona-pg get services
NAME                             TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)                      AGE
…
pulumi-pg-pgbouncer              LoadBalancer   10.20.33.122   35.188.81.20   5432:32042/TCP               3m17s

You can connect to your PostgreSQL cluster through this IP address. Use the pguser password that was set earlier with pulumi config set pg_user_password:

psql -h 35.188.81.20 -p 5432 -U pguser pgdb

Clean up

To delete everything it is enough to run the following commands:

pulumi destroy
pulumi stack rm

Tricks and Quirks

Pulumi Converter

kube2pulumi is a huge help if you already have YAML manifests. You don’t need to rewrite all the code; just convert the YAMLs to Pulumi code. This is what I did for operator.yaml.

apiextensions.CustomResource

There are two ways for Custom Resource management in Pulumi:

crd2pulumi generates libraries/classes out of Custom Resource Definitions and allows you to create custom resources later using these. I found it a bit complicated and it also lacks documentation.

apiextensions.CustomResource on the other hand allows you to create Custom Resources by specifying them as JSON. It is much easier and requires less manipulation. See lines 446-557 in my __main__.py.

True/False in JSON

I have the following in my Custom Resource definition in Pulumi code:

perconapg = kubernetes.apiextensions.CustomResource(
…
    spec= {
…
    "disableAutofail": False,
    "tlsOnly": False,
    "standby": False,
    "pause": False,
    "keepData": True,

Be sure to use the boolean type of the language of your choice and not the “true”/“false” strings. For me, using strings resulted in a failure, as the Operator expects booleans, not strings.

Depends On…

Pulumi makes its own decisions on the ordering of provisioning resources. You can enforce the order by specifying dependencies.

For example, I’m ensuring that the Operator and Secret are created before the Custom Resource:

    },opts=ResourceOptions(provider=k8s_provider,depends_on=[pgo_pgo_deploy_job,percona_pg_cluster1_pguser_secret_secret])
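
For context, here is a minimal sketch of how that opts argument fits into the whole resource declaration (resource and variable names are illustrative, not the exact ones from the repository):

import pulumi_kubernetes as kubernetes
from pulumi import ResourceOptions

perconapg = kubernetes.apiextensions.CustomResource(
    "my-cluster-name",
    api_version="pg.percona.com/v1",
    kind="PerconaPGCluster",
    spec={
        # cluster spec goes here, see lines 446-557 of __main__.py
    },
    # wait for the deploy Job and the pguser Secret before creating the CR
    opts=ResourceOptions(
        provider=k8s_provider,
        depends_on=[pgo_pgo_deploy_job, percona_pg_cluster1_pguser_secret_secret],
    ),
)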

Mar 07, 2022

Manage Your Data with Mingo.io and Percona Distribution for MongoDB Operator

Deploying MongoDB on Kubernetes has never been simpler with Percona Distribution for MongoDB Operator. It provides you with an enterprise-ready MongoDB cluster with no manual burden. In addition, you also get automated day-to-day operations – scaling, backups, upgrades.

But once you have your cluster up and running and have a good grasp on managing it, you still need to manage the data itself – collections, indexes, shards. We at Percona focus on infrastructure, data consistency, and the health of the cluster, but data is an integral part of any database that should be managed. In this blog post, we will see how you can manage, browse, and query the data of a MongoDB cluster with Mingo.io. Mingo.io is a tool that I discovered recently; it reminded me of phpMyAdmin, but with a great user interface and feature set – and it is for MongoDB, not MySQL.

This post provides a step-by-step guide on how to deploy MongoDB locally, connect it with Mingo.io, and manage the data. We want to have a local setup so that anyone would be able to try it at home.

Deploy the Operator and the Database

For local installation, I will use microk8s and minimal cr.yaml for MongoDB, which deploys a one-node replica set with sharding. It is enough for the demo.

Start Kubernetes Cluster

As I use Ubuntu, I will install microk8s from snap:

# snap install microk8s --classic

We need to start microk8s and enable Domain Name Service (DNS) and storage support:

# microk8s start
# microk8s enable dns storage

The Kubernetes cluster is ready; let’s fetch the kubeconfig from it so we can use regular kubectl:

# microk8s config > /home/percona/.kube/config

MongoDB Up and Running

Install the Operator:

kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/mingo/bundle.yaml

Deploy MongoDB itself:

kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/mingo/cr-minimal.yaml

Connect to MongoDB

To connect to MongoDB, we need a connection string with the login and password. In this example, we expose mongos with a NodePort service type. Find the port to connect to:

$ kubectl get services
NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)           AGE
…
minimal-cluster-mongos   NodePort    10.152.183.46   <none>        27017:32609/TCP   32m

You can connect to the MongoDB cluster using the IP address of your machine and port 32609. 

To demonstrate the full power of Mingo, we will need to create a user first to manage the data:

Get userAdmin password

$ kubectl get secrets minimal-cluster -o yaml | awk '$1~/MONGODB_USER_ADMIN_PASSWORD/ {print $2}' | base64 --decode
4OnZ2RZ3SpRVtKbUxy

Run MongoDB client Pod

kubectl run -i --rm --tty percona-client --image=percona/percona-server-mongodb:4.4.10-11 --restart=Never -- bash -il

Connect to Mongo

mongo "mongodb://userAdmin:4OnZ2RZ3SpRVtKbUxy@minimal-cluster-mongos.default.svc.cluster.local/admin?ssl=false"

Create the user in mongo shell:

db.createUser( { user: "test", pwd: "mysUperpass", roles: [ "userAdminAnyDatabase", "readWriteAnyDatabase", "dbAdminAnyDatabase" ] } )

The connection string that I would use in Mingo would look like this:

mongodb://test:mysUperpass@10.10.3.4:32609

The IP address and port will most likely be different for you.

Mingo at Work

MongoDB cluster is up and running. Let’s have Mingo installed and configured.

First, download the latest version of Mingo from https://mingo.io/. On first opening after installation, you will be welcomed with a message about Mingo’s security.

Once you are prompted for your first MongoDB connection, paste your mongo URL into the field and submit. Now you are connected and ready to manage your data.

Now, let’s try two examples of how Mingo works.

Example 1

Let’s assume you have a collection called “Blog” and you want to rename it to “Articles”. First, locate the collection in the sidebar, right-click it and select “Rename collection…” in the context menu. Type the new collection name and hit Enter. That’s it.

 

Example 2

Let’s assume you have a collection called “orders” with the field “shippedDate”, and you want to rename this field to shipmentDate in every document that was created during 2020. How would you do this in the old way? Would it be easy? Let’s see how you can do this in Mingo.

Instead of complicated queries with dates, we can use the simple shorthand { “OrderDate”: #2020 } and press Submit to see the results. Now just expand any document and then right-click on shippedDate. From the context menu select “Rename Fields → Filtered documents…” and then just insert the correct name and hit Rename.

Example 3

Let’s customize the way we see the “Orders” collection, for example.

First, press Cmd+T (or Ctrl+T on Windows) and start typing the name of the collection in the Finder until you see it. Then, using the up and down arrow keys, select the collection and press Enter. Alternatively, you can click on the collection in the sidebar.

When the collection is shown, Mingo picks a couple of columns to show. To add another column, click on the + icon and select the field to add. You can rearrange the columns by dragging their headers. Click on the column header to see all the actions for a column and explore them.

Projects

On top of regular MongoDB connections, Mingo offers “Projects”. A project is like a wrapper around several connections to different databases, such as development, production, and testing. Mingo treats these databases as siblings and simplifies routine tasks, such as synchronization, backups, shared settings for sibling collections, and more.

Have a Quick Glance at Mingo in Action

 

Conclusion

Deploying and managing MongoDB clusters on Kubernetes is still in its early phase. Percona is committed to delivering enterprise-grade solutions to deploy and manage databases on Kubernetes.

We are focused on infrastructure and data consistency, whereas Mingo.io empowers users to perform data management on any MongoDB cluster.

Percona Distribution for MongoDB Operator contains everything you need to quickly and consistently deploy and scale Percona Server for MongoDB instances into a Kubernetes cluster on-premises or in the cloud. The Operator enables you to improve time to market with the ability to quickly deploy standardized and repeatable database environments. Deploy your database with a consistent and idempotent result no matter where they are used.

Mingo

As Node.js developers heavily using MongoDB for our projects, we have long dreamed of a better MongoDB admin tool.

Existing solutions lacked usability and features to make browsing and querying documents enjoyable. They would easily analyze data, list documents, and build aggregations, but they felt awkward with simple everyday tasks, such as finding a document quickly, viewing its content in a nice layout, or doing common actions with one click. Since dreaming only gets you so far, we started working on a new MongoDB admin.

Our goal at Mingo is to create a MongoDB GUI with superb user experience, modern design, and productive features to speed up your work and make you fall in love with your data.

Jan 27, 2022

Percona Distribution for MySQL Operator Based on Percona Server for MySQL – Alpha Release

Operators are a software framework that extends the Kubernetes API and enables application deployment and management through the control plane. For such complex technologies as databases, Operators play a crucial role by automating deployment and day-to-day operations. At Percona, we have several production-ready and enterprise-grade Kubernetes Operators for databases.

Today we are glad to announce an alpha version of our new Operator for MySQL. In this blog post, we are going to answer some frequently asked questions.

Why the New Operator?

As mentioned above, our existing operator for MySQL is based on Percona XtraDB Cluster (PXC). It is feature-rich and provides virtually-synchronous replication by utilizing Galera Write-Sets. Sync replication ensures data consistency and has proved itself useful for critical applications, especially on Kubernetes.

But there are two things that we want to address:

  1. Our community and customers let us know that there are numerous use cases where asynchronous replication would be a more suitable solution for MySQL on Kubernetes.
  2. We want to support Group Replication (GR) – a native way to provide synchronous replication in MySQL without the need to use Galera.

We heard you! That is why our new Operator is going to run Percona Server for MySQL (PS) and provide both regular asynchronous (with semi-sync support) and virtually-synchronous replication based on GR.

What Is the Name of the New Operator?

We will have two Operators for MySQL and follow the same naming as we have for Distributions:

  • Percona Distribution for MySQL Operator – PXC (Existing Operator)
  • Percona Distribution for MySQL Operator – PS (Percona Server for MySQL)

Is It the Successor of the Existing Operator for MySQL?

Not in the short run. We want to provide our users with MySQL clusters on Kubernetes with three replication capabilities:

  • Percona XtraDB Cluster
  • Group Replication
  • Regular asynchronous replication with semi-sync support

Will I Be Able to Switch From One Operator to Another?

We are going to provide instructions and tools for migrations through replication or backup and restore. As the underlying implementation is totally different, there is no direct, automated path to switch.

Can I Use the New Operator Now?

Yes, but please remember that it is an alpha version and we do not recommend it for production workloads. 

Our Operator is licensed under Apache 2.0 and can be found in the percona-server-mysql-operator repository on GitHub.

To learn more about our operator please see the documentation.

Quick Deploy

Run these two commands to spin up a three-node MySQL cluster with asynchronous replication:

$ kubectl apply -f https://raw.githubusercontent.com/percona/percona-server-mysql-operator/main/deploy/bundle.yaml
$ kubectl apply -f https://raw.githubusercontent.com/percona/percona-server-mysql-operator/main/deploy/cr.yaml

In a couple of minutes, the cluster is going to be up and running. Verify:

$ kubectl get ps
NAME       MYSQL   ORCHESTRATOR   AGE
cluster1   ready   ready          6m16s

$ kubectl get pods
NAME                                                READY   STATUS    RESTARTS   AGE
cluster1-mysql-0                                    1/1     Running   0          6m27s
cluster1-mysql-1                                    1/1     Running   1          5m11s
cluster1-mysql-2                                    1/1     Running   1          3m36s
cluster1-orc-0                                      2/2     Running   0          6m27s
percona-server-for-mysql-operator-c8f8dbccb-q7lbr   1/1     Running   0          9m31s

Connect to the Cluster

First, you need to get the root user password, which was automatically generated by the Operator. By default, system users’ passwords are stored in the cluster1-secrets Secret resource:

$ kubectl get secrets cluster1-secrets -o yaml | grep root | awk '{print $2}' | base64 --decode

Start another container with a MySQL client in it:

$ kubectl run -i --rm --tty percona-client --image=percona:8.0 --restart=Never -- bash -il

Connect to a primary node of our MySQL cluster from this container:

$ mysql -h cluster1-mysql-primary -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 138
Server version: 8.0.25-15 Percona Server (GPL), Release 15, Revision a558ec2

Copyright (c) 2009-2021 Percona LLC and/or its affiliates
Copyright (c) 2000, 2021, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>

Consult the documentation to learn more about other operational capabilities and options.

What Is Currently Supported?

The following functionality is available in the Operator:

  • Deploy asynchronous and semi-sync replication MySQL clusters with Orchestrator on top of it
  • Expose clusters with regular Kubernetes Services
  • Monitor the cluster with Percona Monitoring and Management
  • Customize MySQL configuration (see the sketch after this list)
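
As an illustration of that last point, configuration is set through the Custom Resource; a trimmed sketch (this is an alpha, so treat the field names as tentative and check deploy/cr.yaml in the repository):

apiVersion: ps.percona.com/v1alpha1
kind: PerconaServerMySQL
metadata:
  name: cluster1
spec:
  mysql:
    size: 3
    # custom my.cnf options applied to each mysqld instance
    configuration: |
      max_connections=250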

When Will It Be GA? What Is Going to Be Included?

Our goal is to release the GA version late in Q2 2022. We plan to include the following:

  • Support for both sync and async replication
  • Backups and restores, proxies integration
  • Certifications on various Kubernetes platforms and flavors
  • Deploy and manage MySQL Clusters with PMM DBaaS

Call for Action

Percona Distribution for MySQL Operator – PS just hatched and your feedback is highly appreciated. 

Open a bug or create a Pull Request for a chance to get awesome Percona Swag!

Jan 11, 2022

Creating a Standby Cluster With the Percona Distribution for PostgreSQL Operator

A customer recently asked if our Percona Distribution for PostgreSQL Operator supports the deployment of a standby cluster, which they need as part of their Disaster Recovery (DR) strategy. The answer is yes – as long as you are making use of an object storage system for backups, such as AWS S3 or GCP Cloud Storage buckets, that can be accessed by the standby cluster. In a nutshell, it works like this:

  • The primary cluster is configured with pgBackRest to take backups and store them alongside archived WAL files in a remote repository;
  • The standby cluster is built from one of these backups and it is kept in sync with the primary cluster by consuming the WAL files that are copied from the remote repository.

Note that the primary node in the standby cluster is not a streaming replica of any of the nodes in the primary cluster; it relies on archived WAL files to replicate events. For this reason, this approach cannot be used as a High Availability (HA) solution. Even though the primary use of a standby cluster in this context is DR, it can also be employed for migrations.

So, how can we create a standby cluster using the Percona operator? We will show you next. But first, let’s create a primary cluster for our example.

Creating a Primary PostgreSQL Cluster Using the Percona Operator

You will find a detailed procedure on how to deploy a PostgreSQL cluster using the Percona operator in our online documentation. Here we want to highlight the main steps involved, particularly regarding the configuration of object storage, which is a crucial requirement and is best done during the initial deployment of the cluster. In the following example, we will deploy our clusters using Google Kubernetes Engine (GKE), but you can find similar instructions for other environments in the previous link.

Considering you have a Google account configured as well as the gcloud (from the Google Cloud SDK suite) and kubectl command-line tools installed, authenticate yourself with gcloud auth login, and off we go!

Creating a GKE Cluster and Basic Configuration

The following command will create a default cluster named “cluster-1” and composed of three nodes. We are creating it in the us-central1-a zone using e2-standard-4 VMs but you may choose different options. In fact, you may also need to indicate the project name and other main settings if you do not have your gcloud environment pre-configured with them:

gcloud container clusters create cluster-1 --preemptible --machine-type e2-standard-4 --num-nodes=3 --zone us-central1-a

Once the cluster is created, use your IAM identity to control access to this new cluster:

kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user $(gcloud config get-value core/account)

Finally, create the pgo namespace:

kubectl create namespace pgo

and set the current context to refer to this new namespace:

kubectl config set-context $(kubectl config current-context) --namespace=pgo

Creating a Cloud Storage Bucket

Remember for this setup we need a Google Cloud Storage bucket configured as well as a Service Account created with the necessary privileges/roles to access it. The respective procedures to obtain these vary according to how your environment is configured so we won’t be covering them here. Please refer to the Google Cloud Storage documentation for the exact steps. The bucket we created for the example in this post was named cluster1-backups-and-wals.

Likewise, please refer to the Creating and managing service account keys documentation to learn how to create a Service Account and download the corresponding key in JSON format – we will need to provide it to the operator so our PostgreSQL clusters can access the storage bucket.

Creating the Kubernetes Secrets File to Access the Storage Bucket

Create a file named my-gcs-account-secret.yaml with the following structure:

apiVersion: v1
kind: Secret
metadata:
  name: cluster1-backrest-repo-config
type: Opaque
data:
  gcs-key: <VALUE>

replacing the <VALUE> placeholder by the output of the following command according to the OS you are using:

Linux:

base64 --wrap=0 your-service-account-key-file.json

macOS:

base64 your-service-account-key-file.json

Installing and Deploying the Operator

The most practical way to install our operator is by cloning the Git repository, and then moving inside its directory:

git clone -b v1.1.0 https://github.com/percona/percona-postgresql-operator
cd percona-postgresql-operator

The following command will deploy the operator:

kubectl apply -f deploy/operator.yaml

We have already prepared the secrets file to access the storage bucket so we can apply it now:

kubectl apply -f my-gcs-account-secret.yaml

Now, all that is left is to customize the storages options in the deploy/cr.yaml file to indicate the use of the GCS bucket as follows:

    storages:
      my-gcs:
        type: gcs
        bucket: cluster1-backups-and-wals

We can now deploy the primary PostgreSQL cluster (cluster1):

kubectl apply -f deploy/cr.yaml

Once the operator has been deployed, you can run the following command to do some housekeeping:

kubectl delete -f deploy/operator.yaml

Creating a Standby PostgreSQL Cluster Using the Percona Operator

After this long preamble, let’s look at what brought you here: how to deploy a standby cluster, which we will refer to as cluster2, that will replicate from the primary cluster.

Copying the Secrets Over

Considering you probably have customized the passwords you use in your primary cluster and that they differ from the default values found in the operator’s git repository, we need to make a copy of the secrets files, adjusted to the standby cluster’s name. The following procedure facilitates this task, saving the secrets files under /tmp/cluster1-cluster2-secrets (you can choose a different target directory):

NOTE: make sure you have the yq tool installed in your system.

mkdir -p /tmp/cluster1-cluster2-secrets/
export primary_cluster_name=cluster1
export standby_cluster_name=cluster2
export secrets="${primary_cluster_name}-users"
kubectl get secret/$secrets -o yaml \
| yq eval 'del(.metadata.creationTimestamp)' - \
| yq eval 'del(.metadata.uid)' - \
| yq eval 'del(.metadata.selfLink)' - \
| yq eval 'del(.metadata.resourceVersion)' - \
| yq eval 'del(.metadata.namespace)' - \
| yq eval 'del(.metadata.annotations."kubectl.kubernetes.io/last-applied-configuration")' - \
| yq eval '.metadata.name = "'"${secrets/$primary_cluster_name/$standby_cluster_name}"'"' - \
| yq eval '.metadata.labels.pg-cluster = "'"${standby_cluster_name}"'"' - \
>/tmp/cluster1-cluster2-secrets/${secrets/$primary_cluster_name/$standby_cluster_name}

Deploying the Standby Cluster: Fast Mode

Since we have already covered the procedure used to create the primary cluster in detail in a previous section, we will be presenting the essential steps to create the standby cluster below and provide additional comments only when necessary.

NOTE: the commands below are issued from inside the percona-postgresql-operator directory hosting the git repository for our operator.

Deploying a New GKE Cluster Named cluster-2

This time using the us-west1-b zone here:

gcloud container clusters create cluster-2 --preemptible --machine-type e2-standard-4 --num-nodes=3 --zone us-west1-b
kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user $(gcloud config get-value core/account)
kubectl create namespace pgo
kubectl config set-context $(kubectl config current-context) --namespace=pgo
kubectl apply -f deploy/operator.yaml

Apply the Adjusted Kubernetes Secrets:

export standby_cluster_name=cluster2
export secrets="${standby_cluster_name}-users"
kubectl create -f /tmp/cluster1-cluster2-secrets/$secrets

The list above does not include the GCS secret file; the key contents remain the same, but the backrest-repo-config Secret name needs to be adjusted. Make a copy of that file:

cp my-gcs-account-secret.yaml my-gcs-account-secret-2.yaml

then edit the copy to indicate “cluster2-” instead of “cluster1-”:

name: cluster2-backrest-repo-config

You can apply it now:

kubectl apply -f my-gcs-account-secret-2.yaml

The cr.yaml file of the Standby Cluster

Let’s make a copy of the cr.yaml file we customized for the primary cluster:

cp deploy/cr.yaml deploy/cr-2.yaml

and edit the copy as follows:

1) Change all references (that are not commented) from cluster1 to cluster2, including current-primary but excluding the bucket reference, which in our example is prefixed with “cluster1-”; the storage section must remain unchanged. (We know it is not very practical to replace so many references; we still need to improve this part of the routine.)

2) Enable the standby option:

standby: true

3) Provide a repoPath that points to the GCS bucket used by the primary cluster (just below the storages section, which should remain the same as in the primary cluster’s cr.yaml file):

repoPath: "/backrestrepo/cluster1-backrest-shared-repo"
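
Putting items 2 and 3 together (with the storages block copied unchanged from the primary), the relevant fragment of deploy/cr-2.yaml should look roughly as follows; this is a sketch only, since the exact nesting may differ between operator versions:

spec:
  standby: true
  ...
  backup:
    ...
    storages:
      my-gcs:
        type: gcs
        bucket: cluster1-backups-and-wals
    repoPath: "/backrestrepo/cluster1-backrest-shared-repo"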

And that’s it! All that is left now is to deploy the standby cluster:

kubectl apply -f deploy/cr-2.yaml

With everything working on the standby cluster, confirm its Pods are up and running as well:

kubectl get pods

Verifying it all Works as Expected

Remember that the standby cluster is created from a backup and relies on archived WAL files to remain in sync with the primary cluster. If you make a change in the primary cluster, such as adding a row to a table, that change won’t reach the standby cluster until the WAL file it has been recorded to is archived and consumed by the standby cluster.

When checking if all is working with the new setup, you can force the rotation of the WAL file (and subsequent archival of the previous one) in the primary node of the primary cluster to accelerate the sync process by issuing:

psql> SELECT pg_switch_wal();
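
A simple end-to-end check could look like this (the table name is arbitrary; run the first block on the primary and the last query on the standby):

-- On the primary cluster:
CREATE TABLE replication_check (id int);
INSERT INTO replication_check VALUES (1);
SELECT pg_switch_wal();

-- On the standby cluster, once the archived WAL has been consumed:
SELECT * FROM replication_check;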

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

Learn More About Percona Kubernetes Operators

Dec
21
2021
--

Quick Guide on Azure Blob Storage Support for Percona Distribution for MongoDB Operator

Azure Blob Percona MongoDB Operator

If you have ever used backups with Percona Distribution for MongoDB Operator, you should already know that backed-up data is stored outside the Kubernetes cluster – on Amazon S3 or any S3-compatible storage. Storage types not compatible with the S3 protocol were supported only indirectly, through an existing S3 wrapper/gateway. A good example of such a solution is running MinIO Gateway on Azure Kubernetes Service to store backups on Azure Blob Storage.

Starting with Operator version 1.11, it is now possible to use Azure Blob Storage for backups directly:

Backups on Azure Blob Storage

The following steps will allow you to configure it.

1. Get Azure Blob Storage Credentials

As with other supported storage types, the first thing to do is to obtain credentials that the Operator will use to access Azure Blob Storage.

If you are new to Azure, these two tutorials will help you to configure your storage:

When you have a container to store your backups, getting credentials to access it involves the following steps:

  1. Go to your storage account settings,
  2. Open the “Access keys” section,
  3. Copy and save both the account name and account key as shown on the screenshot below:

Azure credentials

2. Create a Secret with Credentials

The Operator will use a Kubernetes Secret object to obtain the needed credentials. Create this Secret with credentials using the following command:

$ kubectl create secret generic azure-secret \
 --from-literal=AZURE_STORAGE_ACCOUNT_NAME=<your-storage-account-name> \
 --from-literal=AZURE_STORAGE_ACCOUNT_KEY=<your-storage-key>
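
You can verify that both values landed in the Secret correctly before pointing the Operator at it (the base64 decoding here is only for inspection):

$ kubectl get secret azure-secret -o jsonpath='{.data.AZURE_STORAGE_ACCOUNT_NAME}' | base64 --decode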

3. Set up the spec.backup Section in your deploy/cr.yaml file

As usual, backups are configured via the section of the same name in the deploy/cr.yaml configuration file.

Make sure that backups are enabled (the backup.enabled key set to true), and add the following lines to the backup.storages subsection (use the proper name of your container):

azure-blob:
  type: azure
  azure:
    container: <your-container-name>
    prefix: psmdb
    credentialsSecret: azure-secret

If you want to schedule a regular backup, add the following lines to the backup.tasks subsection:

tasks:
  - name: weekly
    enabled: true
    schedule: "0 0 * * 0"
    compressionType: gzip
    storageName: azure-blob

The backup schedule is specified in crontab format (the above example runs backups at 00:00 on Sunday). If you know nothing about cron schedule expressions, you can use this online generator.

The full backup section in our example will look like this:

backup:
  enabled: true
  restartOnFailure: true
  image: percona/percona-server-mongodb-operator:1.11.0-backup
  storages:
    azure-blob:
      type: azure
      azure:
        container: <your container name>
        prefix: psmdb
        credentialsSecret: azure-secret
  tasks:
    - name: weekly
      enabled: true
      schedule: "0 0 * * 0"
      compressionType: gzip
      storageName: azure-blob
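
Scheduled tasks aside, the same storage can also be used for on-demand backups through a PerconaServerMongoDBBackup resource. A minimal sketch, assuming the default my-cluster-name cluster name and the field names of this Operator generation (check deploy/backup/backup.yaml in your release for the exact schema):

apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBBackup
metadata:
  name: backup-azure-on-demand
spec:
  psmdbCluster: my-cluster-name
  storageName: azure-blob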

You can find more information on backup options in the official backups documentation and the backups section of the Custom Resource options reference.

Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.

Download Percona Distribution for MongoDB Today!

Oct
18
2021
--

Cloud-Native Through the Prism of Percona: Episode 1

Percona Cloud Native Series 1

The cloud-native landscape matures every day, and new great tools and products continue to appear. We are starting a series of blog posts that are going to focus on new tools in the container and cloud-native world, and provide a holistic view through the prism of Percona products.

In this blog:

  • VMware Tanzu Community edition
  • Data on Kubernetes survey
  • Azure credits for open source projects
  • Percona Distribution for PostgreSQL Operator is GA
  • kube-fledged
  • kubescape
  • m3o – new generation public cloud

VMware Tanzu Community Edition

I personally like this move by VMware to open source Tanzu, the set of products to run and manage Kubernetes clusters and applications. Every time I deploy Amazon EKS I feel like I’ve been punished for something. With VMware Tanzu, deployment of the cluster on Amazon (not EKS) is a smooth experience. It has its own quirks, but it is still much, much better.

Tanzu Community Edition is not only about AWS, but also other public clouds and even local environments with Docker.

I also was able to successfully deploy Percona Operators on the Tanzu provisioned cluster. Keep in mind that you need a storage class to be created to run stateful workloads. It can be easily done with Tanzu’s packaging system.

Data on Kubernetes Survey

The Data on Kubernetes (DoK) community does a great job in promoting and evangelizing stateful workloads on Kubernetes. I strongly recommend you check out the DoK 2021 report that was released this week. Key takeaways:

  • Kubernetes is on the rise (nothing new). Half of the respondents run 50% or more of production workloads in k8s.
  • K8S is ready to run stateful workloads – 90% think so, and 70% already run data on k8s.
  • The key benefits of running stateful applications on Kubernetes:
    • Consistency
    • Standardization
    • Simplified management
    • Developer self-service
  • Operators help with:
    • Management
    • Scalability
    • Improved application lifecycle management

There are lots of other interesting data points, I encourage you to go through them.

Azure Credits for Open Source Projects

Percona’s motto is “Keeping Open Source Open”, which is why an announcement from Microsoft to issue Azure credits for open source projects caught our attention. This is a good move from Microsoft helping the open source community to certify products on Azure without spending a buck.

Percona Distribution for PostgreSQL Operator is GA

I cannot miss the opportunity to share with you that Percona’s PostgreSQL Operator has reached the General Availability stage. It was a long journey for us, and we were constantly focused on improving the quality of our Operator through the introduction of rigorous end-to-end testing. Please read more about this release on the PostgreSQL news mailing list. I also encourage you to look into our GitHub repository and try out the Operator by following these installation instructions.

kube-fledged

Back in the days of my Operations career, I was looking for an easy way to have container images pre-pulled on my Kubernetes nodes. kube-fledged does exactly this. The most common use cases are applications that require rapid start-up or some batch-processing which is fired randomly. If we talk about Percona Operators, then kube-fledged is useful if you scale your databases frequently and don’t want to waste valuable seconds on pulling the image. I have tried it out for Percona Distribution for PostgreSQL Operator and it worked like a charm.

kube-fledged is an operator, and it controls which images to pull to the nodes with the ImageCache custom resource. I have prepared an ImageCache manifest for Percona PostgreSQL Operator as an example – please find it here. It instructs kube-fledged to pull the images that we use in the PostgreSQL cluster deployment on all nodes.
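
For illustration, an ImageCache manifest follows this general shape; the image tags below are placeholders, and the actual list should match the images referenced in your cluster's cr.yaml (see the kube-fledged documentation for the full schema):

apiVersion: kubefledged.io/v1alpha2
kind: ImageCache
metadata:
  name: percona-pg-imagecache
  namespace: pgo
spec:
  cacheSpec:
    # No nodeSelector specified: pull these images on all nodes
    - images:
        - percona/percona-postgresql-operator:1.0.0-ppg13-postgres-ha  # placeholder tag
        - percona/percona-postgresql-operator:1.0.0-ppg13-pgbackrest   # placeholder tag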

kubescape

In every container-related survey, we see security as one of the top concerns. kubescape is a neat tool to test if Kubernetes and apps are deployed securely as defined in the National Security Agency (NSA) and Cybersecurity and Infrastructure Security Agency (CISA) Hardening Guidance.

It provides both details and a summary of failures. Here is, for example, the failure of the Resource policies control for the default Percona MongoDB Operator deployment:

[control: Resource policies] failed ?
Description: CPU and memory resources should have a limit set for every container to prevent resource exhaustion. This control identifies all the Pods without resource limit definition.
   Namespace default
      Deployment - percona-server-mongodb-operator
      StatefulSet - my-cluster-name-cfg
      StatefulSet - my-cluster-name-rs0
Summary - Passed:3   Excluded:0   Failed:3   Total:6
Remediation: Define LimitRange and ResourceQuota policies to limit resource usage for namespaces or nodes.

It might be a good idea for developers to add kubescape into the CI/CD pipeline to get additional automated security policy checks.
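
Reproducing a scan like the one above is a one-liner; something along these lines works with the kubescape releases of that period (the flags may have changed since, so check kubescape --help):

$ kubescape scan framework nsa --exclude-namespaces kube-system,kube-public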

M3O – New Generation Public Cloud

"M3O is an open source AWS alternative built for the next generation of developers. Consume public APIs as simpler programmable building blocks for a 10x better developer experience." – not my words, but from their website – m3o.com. In a nutshell, it is a set of APIs to greatly simplify development. In m3o’s GitHub repo there is an example of how to build the Reddit Clone utilizing these APIs only. You can explore available APIs here. As an example I used the URL shortener API for this blog post link:

$ curl "https://api.m3o.com/v1/url/Shorten" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $MICRO_API_TOKEN" \
-d '{
"destinationURL": "https://www.percona.com/blog/cloud-native-series-1"
}'

{"shortURL":"https://m3o.one/u/8FlJbxSfp"}

Looks neat! For now, most of the APIs are free to use, but I assume this is going to change soon once the project gets more traction and grows its user base.

It is also important to note that this is an open source project, meaning that anyone can deploy their own M3O platform on Kubernetes (yes, k8s again) and have these APIs exposed privately and for free, not as a SaaS offering. See the m3o/platform repo for more details and the Pulumi example to deploy it.

Complete the 2021 Percona Open Source Data Management Software Survey

Have Your Say!

Oct
13
2021
--

Migrating MongoDB to Kubernetes

Migrating MongoDB to Kubernetes

This blog post is the last in the series of articles on migrating databases to Kubernetes with Percona Operators. Two previous posts can be found here:

As you might have guessed already, this time we are going to cover the migration of MongoDB to Kubernetes. In the 1.10.0 release of Percona Distribution for MongoDB Operator, we introduced a new feature (in tech preview) that enables users to execute such migrations through regular MongoDB replication capabilities. We have already shown how it can be used to provide cross-regional disaster recovery for MongoDB; we encourage you to read it.

The Goal

There are two ways to migrate the database:

  1. Take the backup and restore it.
    – This option is the simplest one, but unfortunately comes with downtime. The bigger the database, the longer the recovery time is.
  2. Replicate the data to the new site and switch the application once replicas are in sync.
    – This allows the user to perform the migration and switch the application with either zero or little downtime.

This blog post is a walkthrough on how to migrate a MongoDB replica set to Kubernetes with replication capabilities.

MongoDB replica set to Kubernetes

  1. We have a MongoDB cluster somewhere (the Source). It can be on-prem or some virtual machine. For demo purposes, I’m going to use a standalone replica set node. The migration procedure for a replica set with multiple nodes or a sharded cluster is almost identical.
  2. We have a Kubernetes cluster with Percona Operator (the Target). The operator deploys 3 standalone MongoDB nodes in unmanaged mode (we will talk about it below).
  3. Each node is exposed so that the nodes on the Source can reach them.
  4. We are going to replicate the data to Target nodes by adding them into the replica set.

As always, all blog post scripts and configuration files are publicly available in this GitHub repository.

Prerequisites

  • MongoDB cluster – either on-prem or VM. It is important to be able to configure mongod to some extent and add external nodes to the replica set.
  • Kubernetes cluster for the Target.
  • kubectl to deploy and manage the Operator and database on Kubernetes.

Prepare the Source

This section explains what preparations must be made on the Source to set up the replication.

Expose

All nodes in the replica set must form a mesh and be able to reach each other. The communication between the nodes can go through the public internet or some VPN. For demonstration purposes, we are going to expose the Source to the public internet by editing mongod.conf:

net:
  bindIp: <PUBLIC IP>

If you have multiple replica sets, you need to expose all nodes of each of them, including config servers.

TLS

We take security seriously at Percona, and this is why by default our Operator deploys MongoDB clusters with encryption enabled. I have prepared a script that generates self-signed certificates and keys with the openssl tool. If you already have a Certificate Authority (CA) in use in your organization, generate the certificates and sign them with your CA.

The list of alternative names can be found either in this document or in this openssl configuration file. Note the DNS.20 entry:

DNS.20      = *.mongo.spron.in

I’m going to use this wildcard entry to set up the replication between the nodes. The script also generates an ssl-secret.yaml file, which we are going to use on the Target side.

You need to upload the CA and the certificate with a private key to every Source replica set node and then define them in mongod.conf:

# network interfaces
net:
  ...
  tls:
    mode: preferTLS
    CAFile: /path/to/ca.pem
    certificateKeyFile: /path/to/mongod.pem

security:
  clusterAuthMode: x509
  authorization: enabled

Note that I also set clusterAuthMode to x509. It enforces the use of x509 authentication. Test it carefully in a non-production environment first as it might break your existing replication.

Create System Users

Our Operator needs system users to manage the cluster and perform health checks. Usernames and passwords for system users should be the same on the Source and the Target. This script is going to generate the user-secret.yaml to use on the Target and mongo shell code to add the users on the Source (it is an example, do not use it in production).

Connect to the primary node on the Source and execute mongo shell commands generated by the script.

Prepare the Target

Apply Users and TLS secrets

System users’ credentials and TLS certificates must be similar on both sides. The scripts we used above generate Secret object manifests to use on the Target. Apply them:

$ kubectl apply -f ssl-secret.yaml
$ kubectl apply -f user-secret.yaml

Deploy the Operator and MongoDB Nodes

Please follow one of the installation guides to deploy the Operator. Usually, it is a one-step operation through kubectl:

$ kubectl apply -f operator.yaml

MongoDB nodes are deployed with a custom resource manifest – cr.yaml. It contains the following important configuration items:

spec:
  unmanaged: true

This flag instructs the Operator to deploy the nodes in unmanaged mode, meaning they are not configured to form the cluster. Also, the Operator does not generate TLS certificates and system users.

spec:
…
  updateStrategy: Never

Disable the Smart Update feature as the cluster is unmanaged.

spec:
…
  secrets:
    users: my-new-cluster-secrets
    ssl: my-custom-ssl
    sslInternal: my-custom-ssl-internal

This section points to the Secret objects that we created in the previous step.

spec:
…
  replsets:
  - name: rs0
    size: 3
    expose:
      enabled: true
      exposeType: LoadBalancer

Remember that the nodes need to be exposed and reachable. To do this, we create a service per Pod. In our case, it is a LoadBalancer object, but it can be any other Service type.

spec:
...
  backup:
    enabled: false

If the cluster and nodes are unmanaged, the Operator should not be taking backups. 

Deploy unmanaged nodes with the following command:

$ kubectl apply -f cr.yaml

Once the nodes are up and running, also check the services. We will need the IP addresses of the new replicas to add them to the replica set on the Source later.

$ kubectl get services
NAME                    TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)           AGE
…
my-new-cluster-rs0-0    LoadBalancer   10.3.252.134   35.223.104.224   27017:31332/TCP   2m11s
my-new-cluster-rs0-1    LoadBalancer   10.3.244.166   34.134.210.223   27017:32647/TCP   81s
my-new-cluster-rs0-2    LoadBalancer   10.3.248.119   34.135.233.58    27017:32534/TCP   45s

Configure Domains

X509 authentication is strict and requires that the certificate’s common name or alternative name match the domain name of the node. As you remember, we had the wildcard *.mongo.spron.in included in our certificate. It can be any domain that you use, but make sure a certificate is issued for this domain.

I’m going to create A-records to point to public IP-addresses of MongoDB nodes:

k8s-1.mongo.spron.in -> 35.223.104.224
k8s-2.mongo.spron.in -> 34.134.210.223
k8s-3.mongo.spron.in -> 34.135.233.58
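
Before adding the new members, it is worth verifying that each name resolves and accepts TLS connections from the Source. A quick probe with the mongo shell might look like this (the certificate paths are the ones uploaded to the Source nodes earlier):

$ mongo "mongodb://k8s-1.mongo.spron.in:27017" \
  --tls \
  --tlsCAFile /path/to/ca.pem \
  --tlsCertificateKeyFile /path/to/mongod.pem \
  --eval 'db.runCommand({ ping: 1 })'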

Replicate the Data to the Target

It is time to add our nodes in the Kubernetes cluster to the replica set. Log in to the mongo shell on the Source and execute the following:

rs.add({ host: "k8s-1.mongo.spron.in", priority: 0, votes: 0} )
rs.add({ host: "k8s-2.mongo.spron.in", priority: 0, votes: 0} )
rs.add({ host: "k8s-3.mongo.spron.in", priority: 0, votes: 0} )

If everything is done correctly, these nodes are going to be added as secondaries. You can check the status with the rs.status() command.

Cutover

Check that the newly added nodes are synchronized. The more data you have, the longer the synchronization process is going to take. To understand if the nodes are synchronized, you should compare the values of optime and optimeDate of the Primary node with the values for the Secondary node in the rs.status() output:

{
        "_id" : 0,
        "name" : "147.182.213.59:27017",
        "stateStr" : "PRIMARY",
...
        "optime" : {
                "ts" : Timestamp(1633697030, 1),
                "t" : NumberLong(2)
        },
        "optimeDate" : ISODate("2021-10-08T12:43:50Z"),
...
},
{
        "_id" : 1,
        "name" : "k8s-1.mongo.spron.in:27017",
        "stateStr" : "SECONDARY",
...
        "optime" : {
                "ts" : Timestamp(1633697030, 1),
                "t" : NumberLong(2)
        },
        "optimeDurable" : {
                "ts" : Timestamp(1633697030, 1),
                "t" : NumberLong(2)
        },
        "optimeDate" : ISODate("2021-10-08T12:43:50Z"),
...
},
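
As a shortcut, recent mongo shells can print the per-secondary lag directly (on older shells the same helper is called rs.printSlaveReplicationInfo()):

rs.printSecondaryReplicationInfo()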

When nodes are synchronized, we are ready to perform the cutover. Please ensure that your application is configured properly to minimize downtime during the cutover.

The cutover is going to have two steps:

  1. One of the nodes on the Target becomes the primary.
  2. The Operator starts managing the cluster, and the nodes on the Source are no longer present in the replica set.

Switch the Primary

Connect with the mongo shell to the primary on the Source side and make one of the nodes on the Target the primary. It can be done by changing the replica set configuration:

cfg = rs.config()
cfg.members[1].priority = 2
cfg.members[1].votes = 1
rs.reconfig(cfg)

We enable voting and set priority to two on one of the nodes in the Kubernetes cluster. The member id can be different for you, so please look carefully into the output of the rs.config() command.

Start Managing the Cluster

Once the primary is running in Kubernetes, we are going to tell the Operator to start managing the cluster. Change spec.unmanaged to false in the Custom Resource with the patch command:

$ kubectl patch psmdb my-cluster-name --type=merge -p '{"spec":{"unmanaged": false}}'

You can also do this by changing cr.yaml and applying it. That is it: you now have a cluster in Kubernetes that is managed by the Operator.

Conclusion

You truly start to appreciate Operators once you get used to them. When I was writing this blog post I found it extremely annoying to deploy and configure a single MongoDB node on a Linux box; I don’t want to think about the cluster. Operators abstract Kubernetes primitives and database configuration and provide you with a fully operational database service instead of a bunch of nodes. Migration of MongoDB to Kubernetes is a challenging task, but it is much simpler with Operator. And once you are on Kubernetes, Operator takes care of all day-2 operations as well.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions, please raise the topic in the community forum.

Are you a developer looking to contribute? Please read our CONTRIBUTING.md and send the Pull Request.

Percona Distribution for MongoDB Operator

The Percona Distribution for MongoDB Operator simplifies running Percona Server for MongoDB on Kubernetes and provides automation for day-1 and day-2 operations. It’s based on the Kubernetes API and enables highly available environments. Regardless of where it is used, the Operator creates a member that is identical to other members created with the same Operator. This provides an assured level of stability to easily build test environments or deploy a repeatable, consistent database environment that meets Percona expert-recommended best practices.

Complete the 2021 Percona Open Source Data Management Software Survey

Have Your Say!
