Sep
26
2022
--

MySQL in Microservices Environments

MySQL in Microservices Environments

MySQL in Microservices EnvironmentsThe microservice architecture is not a new pattern but has become very popular lately for mainly two reasons: cloud computing and containers. That combo helped increase adoption by tackling the two main concerns on every infrastructure: Cost reduction and infrastructure management.

However, all that beauty hides a dark truth:

The hardest part of microservices is the data layer

And that is especially true when it comes to classic relational databases like MySQL. Let’s figure out why that is.

MySQL and the microservice

Following the same two pillars of microservices (cloud computing and containers), what can one do with that in the MySQL space? What do cloud computing and containers bring to the table?

Cloud computing

The magic of the cloud is that it allows you to be cost savvy by letting you easily SCALE UP/SCALE DOWN the size of your instances. No more wasted money on big servers that are underutilized most of the time. What’s the catch? It’s gotta be fast. Quick scale up to be ready to deal with traffic and quick scale down to cut costs when traffic is low. 

Containers

The magic of containers is that one can slice hardware to the resource requirements. The catch here is that containers were traditionally used on stateless applications. Disposable containers.

Relational databases in general and MySQL, in particular, are not fast to scale and are stateful. However, it can be adapted to the cloud and be used for the data layer on microservices.

The Scale Cube

The Scale Cube

The book “The Art of Scalability” by Abott and Fisher describes a really useful, three dimension scalability model: the Scale Cube. In this model, scaling an application by running clones behind a load balancer is known as X-axis scaling. The other two kinds of scaling are Y-axis scaling and Z-axis scaling. The microservice architecture is an application of Y-axis scaling: It defines an architecture that structures the application as a set of loosely coupled, collaborating services.

  • X-Axis: Horizontal Duplication and Cloning of services and data (READ REPLICAS)
  • Y-Axis: Functional Decomposition and Segmentation – (MULTI-TENANT)
  • Z-Axis: Service and Data Partitioning along Customer Boundaries – (SHARDING)

On microservices, each service has its own database in order to be decoupled from other services. In other words: a service’s transactions only involve its database; data is private and accessible only via the microservice API.

It’s natural that the first approach to divide the data is by using the multi-tenant pattern:

Actually before trying multi-tenant, one can use a tables-per-service model where each service owns a set of tables that must only be accessed by that service, but by having that “soft” division, the temptation to skip the API and access directly other services’ tables is huge.

Schema-per-service, where each service has a database schema that is private to that service is appealing since it makes ownership clearer. It is easy to create a user per database, with specific grants to limit database access.

This pattern of “shared database” however comes with some drawbacks like:

  • Single hardware: a failure in your database will hurt all the microservices
  • Resource-intensive tasks related to a particular database will impact the other databases (think on DDLs)
  • Shared resources: disk latency, IOPS, and bandwidth needs to be shared, as well as other resources like CPU, Network bandwidth, etc.

The alternative is to go “Database per service”

Database per service

Share nothing. Cleaner logical separation. Isolated issues. Services are loosely coupled. In fact, this opens the door for microservices to use a database that best suits their needs, like a graph db, a document-oriented database, etc. But as with everything, this also comes with drawbacks:

  • The most obvious: cost. More instances to deploy
  • The most critical: Distributed transactions. As we mentioned before, microservices are collaborative between them and that means that transactions span several services. 

The simplistic approach is to use a two-phase commit implementation. But that solution is just an open invitation to a huge amount of locking issues. It just doesn’t scale. So what are the alternatives?

  • Implementing transactions that span services: The Saga pattern
  • Implementing queries that span services: API composition or Command Query Responsibility Segregation (CQRS)

A saga is a sequence of local transactions. Each local transaction updates the database and publishes messages or events that trigger the next local transaction in the saga. If a local transaction fails for whatever reason, then the saga executes a series of compensating transactions that undo the changes made by the previous transactions. More on Saga here: https://microservices.io/patterns/data/saga.html

An API composition is just a composer that invokes queries on each microservice and then performs an in-memory join of the results:
https://microservices.io/patterns/data/api-composition.html

CQRS is keeping one or more materialized views that contain data from multiple services. This avoids the need to do joins on the query size: https://microservices.io/patterns/data/cqrs.html

What do all these alternatives have in common? That is taken care of at the API level: it becomes the responsibility of the developer to implement and maintain it. The data layer keep continues to be data, not information.

Make it cloud

There are means for your MySQL to be cloud-native: Easy to scale up and down fast; running on containers, a lot of containers; orchestrated with Kubernetes; with all the pros of Kubernetes (health checks, I’m looking at you).

Percona Operator for MySQL based on Percona XtraDB Cluster

A Kubernetes Operator is a special type of controller introduced to simplify complex deployments. Operators provide full application lifecycle automation and make use of the Kubernetes primitives above to build and manage your application. 

Percona Operator for MySQL

In our blog post “Introduction to Percona Operator for MySQL Based on Percona XtraDB Cluster” an overview of the operator and its benefits are covered. However, it’s worth mentioning what does it make it cloud native:

  • It takes advantage of cloud computing, scaling up and down
  • Runs con containers
  • Is orchestrated by the cloud orchestrator itself: Kubernetes

Under the hood is a Percona XtraDB Cluster running on PODs. Easy to scale out (increase the number of nodes: https://www.percona.com/doc/kubernetes-operator-for-pxc/scaling.html) and can be scaled up by giving more resources to the POD definition (without downtime)

Give it a try https://www.percona.com/doc/kubernetes-operator-for-pxc/index.html and unleash the power of the cloud on MySQL.

Sep
16
2022
--

Percona Operator for MongoDB Goes Cluster-Wide, Adds AKS Support, and More!

Percona Operator for MongoDB Goes Cluster-Wide

Percona Operator for MongoDB Goes Cluster-WidePercona Operator for MongoDB version 1.13 was recently released and it comes with various ravishing features. In this blog post, we are going to look under the hood and see what are the practical use cases for these improvements.

Cluster-wide deployment

There are two modes that Percona Operators support:

  1. Namespace scope
  2. Cluster-wide

Namespace scope limits the Operator operations to a single namespace, whereas in cluster-wide mode Operator can deploy and manage databases in multiple namespaces of a Kubernetes cluster. Our Operators for PostgreSQL and MySQL already support cluster-wide mode. With the 1.13 release, we are closing the gap for Percona Operator for MongoDB. 

Multi-tenant clusters are the most common call for cluster-wide mode. You as a cluster administrator manage a single deployment of the Operator and equip your teams with the way to deploy and manage MongoDB in their isolated namespaces. Read more about multi-tenancy and best practices in our Multi-Tenant Kubernetes Cluster with Percona Operators.

How does it work?

To deploy in cluster-wide mode we introduce

cw-*.yaml

manifests. The quickest way would be to use the

cw-bundle.yaml

which deploys the following:

  • Custom Resource Definition
  • Service Account and Cluster Role that allows Operator to create and manage Kubernetes objects in various namespaces
  • Operator Deployment itself

By default, Operator monitors all the namespaces in the cluster. The

WATCH_NAMESPACE

environment variable in the Operator Deployment limits the scope. It can be a comma-separated list that instructs the Operator on which namespaces to monitor for Custom Resource objects:

            - name: WATCH_NAMESPACE
              value: "app-dev1,app-dev2”

This is useful if you want to limit the blast radius, but run multiple Operators monitoring various namespaces. For example, you can run an Operator per environment – development, staging, production.

Deploy the bundle:

kubectl apply -f deploy/cw-bundle.yaml -n psmdb-operator

Now you can start deploying databases in the namespaces you need:

kubectl apply -f deploy/cr.yaml -n app-dev1
kubectl apply -f deploy/cr.yaml -n app-dev2

See the demo below where I deploy two clusters in different namespaces with a single Operator.

Hashicorp Vault integration for encryption-at-rest

We take security seriously at Percona. Data-at-rest encryption prevents data visibility in the event of unauthorized access or theft. It is supported by all our Operators. With this release, we introduce the support for integration with Hashicorp Vault, where the user can keep the keys in the Vault and instruct Percona Operator for MongoDB to use those. This feature is in a technical preview stage.

There is a good blog post that describes how Percona Server for MongoDB works with the Vault. In Operator, we implement the same functionality and follow the structure of the same parameters.

How does it work?

We are going to assume that you already have Hashicorp Vault installed – you either use Cloud Platform or a self-hosted version. We will focus on the configuration of the Operator.

To instruct the Operator to use Vault you need to specify two things in the Custom Resource:

  1. secrets.vault

    – Secret resource with a Vault token in it

  2. Custom configuration for mongod for a replica set and config servers

secrets.vault

Example of cr.yaml:

spec:
  secrets:
    vault: my-vault-secret

The secret object itself should contain the token that has access to create, read, update and delete the secrets in the desired path in the Vault. Please refer to the Vault documentation to understand policies better.

Example of a Secret:

apiVersion: v1
data:
  token: aHZzLnhrVVRPTEVOM2dLQmZuV0I5WTF0RmtOaA==
kind: Secret
metadata:
  name: my-vault-secret
  namespace: default
type: Opaque

Custom configuration

The operator allows users to fine-tune mongod and mongos configurations. For encryption to work, you must specify vault configuration for replica sets – both data and config servers.

Example of cr.yaml:

 replsets:
  - name: rs0
    size: 3
    configuration: |
      security:
        enableEncryption: true
        vault:
          serverName: vault
          port: 8200
          tokenFile: /etc/mongodb-vault/token
          secret: secret/data/dc/cluster1/rs0
          disableTLSForTesting: true
…
  sharding:
    enabled: true

    configsvrReplSet:
      size: 3
      configuration: |
        security:
          enableEncryption: true
          vault:
            serverName: vault
            port: 8200
            tokenFile: /etc/mongodb-vault/token
            secret: secret/data/dc/cluster1/cfg
            disableTLSForTesting: true

What to note here:

  • tokenFile: /etc/mongodb-vault/token
    • It is where the Operator is going to mount the Secret with the Vault token you created before. This is the default path and in most cases should not be changed.
  • secret: secret/data/dc/cluster1/rs0
    • It is the path where keys are going to be stored in the Vault. 

You can read more about Percona Server for MongoDB and Hashicorp Vault parameters in our documentation.

Once you are done with the configuration, apply the Custom Resource as usual. If everything is set up correctly you will see the following message in mongod log:

$ kubectl logs cluster1-rs0-0 -c mongod 
…
{"t":{"$date":"2022-09-13T19:40:20.342+00:00"},"s":"I",  "c":"STORAGE",  "id":29039,   "ctx":"initandlisten","msg":"Encryption keys DB is initialized successfully"}

Azure Kubernetes Service support

All Percona Operators are going through rigorous QA testing throughout the development lifecycle. Hours of QA engineers’ work are put into automating the test suites for specific Kubernetes flavors. 

AKS, or Azure Kubernetes Service, is the second most popular managed Kubernetes offering according to Flexera 2022 State of the Cloud report. After adding the support for Azure Blob Storage in version 1.11.0, it was just a matter of time before we started supporting AKS in full.

Starting with the 1.13.0 release, Percona Operator for MongoDB supports AKS in Technical Preview. You can see more details in our documentation.

The installation process of the Operator is no different from any other Kubernetes flavor. You can use a helm chart or apply YAML manifests with kubectl. I ran the cluster-wide demo above with AKS.

Admin user

This is a minor change, but frankly, it is my favorite as it impacts the user experience in a great way. Our Operator is coming with systems users that are used to manage and track the health of the database. Also, there are userAdmin and clusterAdmin for users to control the database, create users, and so on.

The problem here is neither userAdmin nor clusterAdmin allows you to start playing with the database right away. At first, you need to create the user that has permission to create databases and collections, and only after that, you can start using your fresh MongoDB cluster. 

With the release 1.13, we say no more to this. We add the databaseAdmin user that acts like a database admin, enabling users to start innovating right away.

databaseAdmin credentials are added to the same Secret object where other users are:

$ kubectl get secret my-cluster-name-secrets -o yaml | grep MONGODB_DATABASE_ADMIN_
  MONGODB_DATABASE_ADMIN_PASSWORD: Rm9NUnJ3UDJDazB5cW9WWU8=
  MONGODB_DATABASE_ADMIN_USER: ZGF0YWJhc2VBZG1pbg==

Get your password like this:

$ kubectl get secret my-cluster-name-secrets -o jsonpath='{.data.MONGODB_DATABASE_ADMIN_PASSWORD}' | base64 --decode
FoMRrwP2Ck0yqoVYO

Connect to the database as usual and start innovating:

mongo "mongodb://databaseAdmin:FoMRrwP2Ck0yqoVYO@20.31.226.164/admin"

What’s next

Percona is committed to running databases anywhere. Kubernetes adoption grows year over year, turning from a container orchestrator into a cloud operating system. Our Operators are supporting the community and our customers’ journey in infrastructure transformations by automating the deployment and management of the databases in Kubernetes.

The following links are going to help you to get familiar with Percona Operator for MongoDB:

  1. Quickstart guides
  2. Free Kubernetes cluster for easier and quicker testing
  3. Percona Operator for MongoDB community forum if you have general questions or need assistance

Read about Percona Monitoring and Management DBaaS, an open source solution that simplifies the deployment and management of MongoDB even more.

Jul
11
2022
--

Percona Operator for MySQL Supports Group Replication

Percona Operator for MySQL Supports Group Replication

Percona Operator for MySQL Supports Group ReplicationThere are two Operators at Percona to deploy MySQL on Kubernetes:

We wrote a blog post in the past explaining the thought process and reasoning behind creating the new Operator for MySQL. The goal for us is to provide production-grade solutions to run MySQL in Kubernetes and support various replication configurations:

  • Synchronous replication
    • with Percona XtraDB Cluster
    • with Group Replication
  • Asynchronous replication

With the latest 0.2.0 release of Percona Operator for MySQL (based on Percona Server for MySQL), we have added Group Replication support. In this blog post, we will briefly review the design of our implementation and see how to set it up. 

Design

This is a high-level design of running MySQL cluster with Group Replication:

MySQL cluster with Group Replication

MySQL Router acts as an entry point for all requests and routes the traffic to the nodes. 

This is a deeper look at how the Operator deploys these components in Kubernetes:
kubernetes deployment

Going from right to left:

  1. StatefulSet to deploy a cluster of MySQL nodes with Group Replication configured. Each node has its storage attached to it.
  2. Deployment object for stateless MySQL Router. 
  3. Deployment is exposed with a Service. We use various TCP ports here:
    1. MySQL Protocol ports
      1. 6446 – read/write, routing traffic to Primary node
      2. 6447 – read-only, load-balancing the traffic across Replicas 
    2. MySQL X Protocol – can be useful for CRUD operations, ex. asynchronous calls. Ports follow the same logic:
      1. 6448 – read/write
      2. 6449 – read-only 

Action

Prerequisites: you need a Kubernetes cluster. Minikube would do.

The files used in this blog post can be found in this Github repo.

Deploy the Operator

kubectl apply --server-side -f https://raw.githubusercontent.com/spron-in/blog-data/master/ps-operator-gr-demo/bundle.yaml

Note

–server-side

flag, without it you will get the error:

The CustomResourceDefinition "perconaservermysqls.ps.percona.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes

Our Operator follows OpenAPIv3 schema to have proper validation. This unfortunately increases the size of our Custom Resource Definition manifest and as a result, requires us to use

–server-side

flag.

Deploy the Cluster

We are ready to deploy the cluster now:

kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/ps-operator-gr-demo/cr.yaml

I created this Custom Resource manifest specifically for this demo. Important to note variables:

  1. Line 10:
    clusterType: group-replication

    – instructs Operator that this is going to be a cluster with Group Replication.

  2. Lines 31-47: are all about MySQL Router. Once Group Replication is enabled, the Operator will automatically deploy the router. 

Get the status

The best way to see if the cluster is ready is to check the Custom Resource state:

$ kubectl get ps
NAME         REPLICATION         ENDPOINT        STATE   AGE
my-cluster   group-replication   35.223.42.238   ready   18m

As you can see, it is

ready

. You can also see

initializing

if the cluster is still not ready or

error

if something went wrong.

Here you can also see the endpoint where you can connect to. In our case, it is a public IP-address of the load balancer. As described in the design section above, there are multiple ports exposed:

$ kubectl get service my-cluster-router
NAME                TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)                                                       AGE
my-cluster-router   LoadBalancer   10.20.22.90   35.223.42.238   6446:30852/TCP,6447:31694/TCP,6448:31515/TCP,6449:31686/TCP   18h

Connect to the Cluster

To connect we will need the user first. By default, there is a root user with a randomly generated password. The password is stored in the Secret object. You can always fetch the password with the following command:

$ kubectl get secrets my-cluster-secrets -ojson | jq -r .data.root | base64 -d
SomeRandomPassword

I’m going to use port 6446, which would grant me read/write access and lead me directly to the Primary node through MySQL Router:

mysql -u root -p -h 35.223.42.238 --port 6446
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 156329
Server version: 8.0.28-19 Percona Server (GPL), Release 19, Revision 31e88966cd3

Copyright (c) 2000, 2022, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

Group Replication in action

Let’s see if Group Replication really works. 

List the members of the cluster by running the following command:

$ mysql -u root -p -P 6446 -h 35.223.42.238 -e 'SELECT member_host, member_state, member_role FROM performance_schema.replication_group_members;'
+-------------------------------------------+--------------+-------------+
| member_host                               | member_state | member_role |
+-------------------------------------------+--------------+-------------+
| my-cluster-mysql-0.my-cluster-mysql.mysql | ONLINE       | PRIMARY     |
| my-cluster-mysql-1.my-cluster-mysql.mysql | ONLINE       | SECONDARY   |
| my-cluster-mysql-2.my-cluster-mysql.mysql | ONLINE       | SECONDARY   |
+-------------------------------------------+--------------+-------------+

Now we will delete one Pod (MySQL node), which also happens to have a Primary role, and see what happens:

$ kubectl delete pod my-cluster-mysql-0
pod "my-cluster-mysql-0" deleted


$ mysql -u root -p -P 6446 -h 35.223.42.238 -e 'SELECT member_host, member_state, member_role FROM performance_schema.replication_group_members;'
+-------------------------------------------+--------------+-------------+
| member_host                               | member_state | member_role |
+-------------------------------------------+--------------+-------------+
| my-cluster-mysql-1.my-cluster-mysql.mysql | ONLINE       | PRIMARY     |
| my-cluster-mysql-2.my-cluster-mysql.mysql | ONLINE       | SECONDARY   |
+-------------------------------------------+--------------+-------------+

One node is gone as expected.

my-cluster-mysql-1

node got promoted to a Primary role. I’m still using port 6446 and the same host to connect to the database, which indicates that MySQL Router is doing its job.

After some time Kubernetes will recreate the Pod and the node will join the cluster again automatically:

$ mysql -u root -p -P 6446 -h 35.223.42.238 -e 'SELECT member_host, member_state, member_role FROM performance_schema.replication_group_members;'
+-------------------------------------------+--------------+-------------+
| member_host                               | member_state | member_role |
+-------------------------------------------+--------------+-------------+
| my-cluster-mysql-0.my-cluster-mysql.mysql | RECOVERING   | SECONDARY   |
| my-cluster-mysql-1.my-cluster-mysql.mysql | ONLINE       | PRIMARY     |
| my-cluster-mysql-2.my-cluster-mysql.mysql | ONLINE       | SECONDARY   |
+-------------------------------------------+--------------+-------------+

The recovery phase might take some time, depending on the data size and amount of the changes, but eventually, it will come back ONLINE:

$ mysql -u root -p -P 6446 -h 35.223.42.238 -e 'SELECT member_host, member_state, member_role FROM performance_schema.replication_group_members;'
+-------------------------------------------+--------------+-------------+
| member_host                               | member_state | member_role |
+-------------------------------------------+--------------+-------------+
| my-cluster-mysql-0.my-cluster-mysql.mysql | ONLINE       | SECONDARY   |
| my-cluster-mysql-1.my-cluster-mysql.mysql | ONLINE       | PRIMARY     |
| my-cluster-mysql-2.my-cluster-mysql.mysql | ONLINE       | SECONDARY   |
+-------------------------------------------+--------------+-------------+

What’s coming up next?

Some exciting capabilities and features that we are going to ship pretty soon:

  • Backup and restore support for clusters with Group Replication
    • We have backups and restores in the Operator, but they currently do not work with Group Replication
  • Monitoring of MySQL Router in Percona Monitoring and Management (PMM)
    • Even though the Operator integrates nicely with PMM, it is possible to monitor MySQL nodes only, but not MySQL Router.
  • Automated Upgrades of MySQL and database components in the Operator
    • We have it in all other Operators and it is just logical to add it here

Percona is an open source company and we value our community and contributors. You are greatly encouraged to contribute to Percona Software. Please read our Contributions guide and visit our community webpage.

May
16
2022
--

Running Rocket.Chat with Percona Server for MongoDB on Kubernetes

Running Rocket.Chat with Percona Server for MongoDB on Kubernetes

Our goal is to have a Rocket.Chat deployment which uses highly available Percona Server for MongoDB cluster as the backend database and it all runs on Kubernetes. To get there, we will do the following:

  • Start a Google Kubernetes Engine (GKE) cluster across multiple availability zones. It can be any other Kubernetes flavor or service, but I rely on multi-AZ capability in this blog post.
  • Deploy Percona Operator for MongoDB and database cluster with it
  • Deploy Rocket.Chat with specific affinity rules
    • Rocket.Chat will be exposed via a load balancer

Rocket.Chat will be exposed via a load balancer

Percona Operator for MongoDB, compared to other solutions, is not only the most feature-rich but also comes with various management capabilities for your MongoDB clusters – backups, scaling (including sharding), zero-downtime upgrades, and many more. There are no hidden costs and it is truly open source.

This blog post is a walkthrough of running a production-grade deployment of Rocket.Chat with Percona Operator for MongoDB.

Rock’n’Roll

All YAML manifests that I use in this blog post can be found in this repository.

Deploy Kubernetes Cluster

The following command deploys GKE cluster named

percona-rocket

in 3 availability zones:

gcloud container clusters create --zone us-central1-a --node-locations us-central1-a,us-central1-b,us-central1-c percona-rocket --cluster-version 1.21 --machine-type n1-standard-4 --preemptible --num-nodes=3

Read more about this in the documentation.

Deploy MongoDB

I’m going to use helm to deploy the Operator and the cluster.

Add helm repository:

helm repo add percona https://percona.github.io/percona-helm-charts/

Install the Operator into the percona namespace:

helm install psmdb-operator percona/psmdb-operator --create-namespace --namespace percona

Deploy the cluster of Percona Server for MongoDB nodes:

helm install my-db percona/psmdb-db -f psmdb-values.yaml -n percona

Replica set nodes are going to be distributed across availability zones. To get there, I altered the affinity keys in the corresponding sections of psmdb-values.yaml:

antiAffinityTopologyKey: "topology.kubernetes.io/zone"

Prepare MongoDB

For Rocket.Chat to connect to our database cluster, we need to create the users. By default, clusters provisioned with our Operator have

userAdmin

user, its password is set in

psmdb-values.yaml

:

MONGODB_USER_ADMIN_PASSWORD: userAdmin123456

For production-grade systems, do not forget to change this password or create dedicated secrets to provision those. Read more about user management in our documentation.

Spin up a client Pod to connect to the database:

kubectl run -i --rm --tty percona-client1 --image=percona/percona-server-mongodb:4.4.10-11 --restart=Never -- bash -il

Connect to the database with

userAdmin

:

[mongodb@percona-client1 /]$ mongo "mongodb://userAdmin:userAdmin123456@my-db-psmdb-db-rs0-0.percona/admin?replicaSet=rs0"

We are going to create the following:

  • rocketchat

    database

  • rocketChat

    user to store data and connect to the database

  • oplogger

    user to provide access to oplog for rocket chat

    • Rocket.Chat uses Meteor Oplog tailing to improve performance. It is optional.
use rocketchat
db.createUser({
  user: "rocketChat",
  pwd: passwordPrompt(),
  roles: [
    { role: "readWrite", db: "rocketchat" }
  ]
})

use admin
db.createUser({
  user: "oplogger",
  pwd: passwordPrompt(),
  roles: [
    { role: "read", db: "local" }
  ]
})

Deploy Rocket.Chat

I will use helm here to maintain the same approach. 

helm install -f rocket-values.yaml my-rocketchat rocketchat/rocketchat --version 3.0.0

You can find rocket-values.yaml in the same repository. Please make sure you set the correct passwords in the corresponding YAML fields.

As you can see, I also do the following:

  • Line 11: expose Rocket.Chat through
    LoadBalancer

    service type

  • Line 13-14: set number of replicas of Rocket.Chat Pods. We want three – one per each availability zone.
  • Line 16-23: set affinity to distribute Pods across availability zones

Load Balancer will be created with a public IP address:

$ kubectl get service my-rocketchat-rocketchat
NAME                       TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)        AGE
my-rocketchat-rocketchat   LoadBalancer   10.32.17.26   34.68.238.56   80:32548/TCP   12m

You should now be able to connect to

34.68.238.56

and enjoy your highly available Rocket.Chat installation.

Rocket.Chat installation

Clean Up

Uninstall all helm charts to remove MongoDB cluster, the Operator, and Rocket.Chat:

helm uninstall my-rocketchat
helm uninstall my-db -n percona
helm uninstall psmdb-operator -n percona

Things to Consider

Ingress

Instead of exposing Rocket.Chat through a load balancer, you may also try ingress. By doing so, you can integrate it with cert-manager and have a valid TLS certificate for your chat server.

Mongos

It is also possible to run a sharded MongoDB cluster with Percona Operator. If you do so, Rocket.Chat will connect to mongos Service, instead of the replica set nodes. But you will still need to connect to the replica set directly to get oplogs.

Conclusion

We encourage you to try out Percona Operator for MongoDB with Rocket.Chat and let us know on our community forum your results.

There is always room for improvement and a time to find a better way. Please let us know if you face any issues with contributing your ideas to Percona products. You can do that on the Community Forum or JIRA. Read more about contribution guidelines for Percona Operator for MongoDB in CONTRIBUTING.md.

Percona Operator for MongoDB contains everything you need to quickly and consistently deploy and scale Percona Server for MongoDB instances into a Kubernetes cluster on-premises or in the cloud. The Operator enables you to improve time to market with the ability to quickly deploy standardized and repeatable database environments. Deploy your database with a consistent and idempotent result no matter where they are used.

May
12
2022
--

Percona Operator for MongoDB and Kubernetes MCS: The Story of One Improvement

Percona Operator for MongoDB and Kubernetes MCS

Percona Operator for MongoDB supports multi-cluster or cross-site replication deployments since version 1.10. This functionality is extremely useful if you want to have a disaster recovery deployment or perform a migration from or to a MongoDB cluster running in Kubernetes. In a nutshell, it allows you to use Operators deployed in different Kubernetes clusters to manage and expand replica sets.Percona Kubernetes Operator

For example, you have two Kubernetes clusters: one in Region A, another in Region B.

  • In Region A you deploy your MongoDB cluster with Percona Operator. 
  • In Region B you deploy unmanaged MongoDB nodes with another installation of Percona Operator.
  • You configure both Operators, so that nodes in Region B are added to the replica set in Region A.

In case of failure of Region A, you can switch your traffic to Region B.

Migrating MongoDB to Kubernetes describes the migration process using this functionality of the Operator.

This feature was released in tech preview, and we received lots of positive feedback from our users. But one of our customers raised an internal ticket, which was pointing out that cross-site replication functionality does not work with Multi-Cluster Services. This started the investigation and the creation of this ticket – K8SPSMDB-625

This blog post will go into the deep roots of this story and how it is solved in the latest release of Percona Operator for MongoDB version 1.12.

The Problem

Multi-Cluster Services or MCS allows you to expand network boundaries for the Kubernetes cluster and share Service objects across these boundaries. Someone calls it another take on Kubernetes Federation. This feature is already available on some managed Kubernetes offerings,  Google Cloud Kubernetes Engine (GKE) and AWS Elastic Kubernetes Service (EKS). Submariner uses the same logic and primitives under the hood.

MCS Basics

To understand the problem, we need to understand how multi-cluster services work. Let’s take a look at the picture below:

multi-cluster services

  • We have two Pods in different Kubernetes clusters
  • We add these two clusters into our MCS domain
  • Each Pod has a service and IP-address which is unique to the Kubernetes cluster
  • MCS introduces new Custom Resources –
    ServiceImport

    and

    ServiceExport

    .

    • Once you create a
      ServiceExport

      object in one cluster,

      ServiceImport

      object appears in all clusters in your MCS domain.

    • This
      ServiceImport

        object is in

      svc.clusterset.local

      domain and with the network magic introduced by MCS can be accessed from any cluster in the MCS domain

Above means that if I have an application in the Kubernetes cluster in Region A, I can connect to the Pod in Kubernetes cluster in Region B through a domain name like

my-pod.<namespace>.svc.clusterset.local

. And it works from another cluster as well.

MCS and Replica Set

Here is how cross-site replication works with Percona Operator if you use load balancer:

MCS and Replica Set

All replica set nodes have a dedicated service and a load balancer. A replica set in the MongoDB cluster is formed using these public IP addresses. External node added using public IP address as well:

  replsets:
  - name: rs0
    size: 3
    externalNodes:
    - host: 123.45.67.89

All nodes can reach each other, which is required to form a healthy replica set.

Here is how it looks when you have clusters connected through multi-cluster service:

Instead of load balancers replica set nodes are exposed through Cluster IPs. We have ServiceExports and ServiceImports resources. All looks good on the networking level, it should work, but it does not.

The problem is in the way the Operator builds MongoDB Replica Set in Region A. To register an external node from Region B to a replica set, we will use MCS domain name in the corresponding section:

  replsets:
  - name: rs0
    size: 3
    externalNodes:
    - host: rs0-4.mongo.svc.clusterset.local

Now our rs.status() will look like this:

"name" : "my-cluster-rs0-0.mongo.svc.cluster.local:27017"
"role" : "PRIMARY"
...
"name" : "my-cluster-rs0-1.mongo.svc.cluster.local:27017"
"role" : "SECONDARY"
...
"name" : "my-cluster-rs0-2.mongo.svc.cluster.local:27017"
"role" : "SECONDARY"
...
"name" : "rs0-4.mongo.svc.clusterset.local:27017"
"role" : "UNKNOWN"

As you can see, Operator formed a replica set out of three nodes using

svc.cluster.local

domain, as it is how it should be done when you expose nodes with

ClusterIP

Service type. In this case, a node in Region B cannot reach any node in Region A, as it tries to connect to the domain that is local to the cluster in Region A. 

In the picture below, you can easily see where the problem is:

The Solution

Luckily we have a Special Interest Group (SIG), a Kubernetes Enhancement Proposal (KEP) and multiple implementations for enabling Multi-Cluster Services. Having a KEP is great since we can be sure the implementations from different providers (i.e GCP, AWS) will follow the same standard more or less.

There are two fields in the Custom Resource that control MCS in the Operator:

spec:
  multiCluster:
    enabled: true
    DNSSuffix: svc.clusterset.local

Let’s see what is happening in the background with these flags set.

ServiceImport and ServiceExport Objects

Once you enable MCS by patching the CR with

spec.multiCluster.enabled: true

, the Operator creates a

ServiceExport

object for each service. These ServiceExports will be detected by the MCS controller in the cluster and eventually a

ServiceImport

for each

ServiceExport

will be created in the same namespace in each cluster that has MCS enabled.

As you see, we made a decision and empowered the Operator to create

ServiceExport

objects. There are two main reasons for doing that:

  • If any infrastructure-as-a-code tool is used, it would require additional logic and level of complexity to automate the creation of required MCS objects. If Operator takes care of it, no additional work is needed. 
  • Our Operators take care of the infrastructure for the database, including Service objects. It just felt logical to expand the reach of this functionality to MCS.

Replica Set and Transport Encryption

The root cause of the problem that we are trying to solve here lies in the networking field, where external replica set nodes try to connect to the wrong domain names. Now, when you enable multi-cluster and set

DNSSuffix

(it defaults to

svc.clusterset.local

), Operator does the following:

  • Replica set is formed using MCS domain set in
    DNSSuffix

     

  • Operator generates TLS certificates as usual, but adds
    DNSSuffix

    domains into the picture

With this approach, the traffic between nodes flows as expected and is encrypted by default.

Replica Set and Transport Encryption

Things to Consider

MCS APIs

Please note that the operator won’t install MCS APIs and controllers to your Kubernetes cluster. You need to install them by following your provider’s instructions prior to enabling MCS for your PSMDB clusters. See our docs for links to different providers.

Operator detects if MCS is installed in the cluster by API resources. The detection happens before controllers are started in the operator. If you installed MCS APIs while the operator is running, you need to restart the operator. Otherwise, you’ll see an error like this:

{
  "level": "error",
  "ts": 1652083068.5910048,
  "logger": "controller.psmdb-controller",
  "msg": "Reconciler error",
  "name": "cluster1",
  "namespace": "psmdb",
  "error": "wrong psmdb options: MCS is not available on this cluster",
  "errorVerbose": "...",
  "stacktrace": "..."
}

ServiceImport Provisioning Time

It might take some time for

ServiceImport

objects to be created in the Kubernetes cluster. You can see the following messages in the logs while creation is in progress: 

{
  "level": "info",
  "ts": 1652083323.483056,
  "logger": "controller_psmdb",
  "msg": "waiting for service import",
  "replset": "rs0",
  "serviceExport": "cluster1-rs0"
}

During testing, we saw wait times up to 10-15 minutes. If you see your cluster is stuck in initializing state by waiting for service imports, it’s a good idea to check the usage and quotas for your environment.

DNSSuffix

We also made a decision to automatically generate TLS certificates for Percona Server for MongoDB cluster with

*.clusterset.local

domain, even if MCS is not enabled. This approach simplifies the process of enabling MCS for a running MongoDB cluster. It does not make much sense to change the

DNSSuffix

 field, unless you have hard requirements from your service provider, but we still allow such a change. 

If you want to enable MCS with a cluster deployed with an operator version below 1.12, you need to update your TLS certificates to include

*.clusterset.local

SANs. See the docs for instructions.

Conclusion

Business relies on applications and infrastructure that serves them more than ever nowadays. Disaster Recovery protocols and various failsafe mechanisms are routine for reliability engineers, not an annoying task in the backlog. 

With multi-cluster deployment functionality in Percona Operator for MongoDB, we want to equip users to build highly available and secured database clusters with minimal effort.

Percona Operator for MongoDB is truly open source and provides users with a way to deploy and manage their enterprise-grade MongoDB clusters on Kubernetes. We encourage you to try this new Multi-Cluster Services integration and let us know your results on our community forum. You can find some scripts that would help you provision your first MCS clusters on GKE or EKS here.

There is always room for improvement and a time to find a better way. Please let us know if you face any issues with contributing your ideas to Percona products. You can do that on the Community Forum or JIRA. Read more about contribution guidelines for Percona Operator for MongoDB in CONTRIBUTING.md.

Apr
18
2022
--

Expose Databases on Kubernetes with Ingress

Expose Databases on Kubernetes with Ingress

Ingress is a resource that is commonly used to expose HTTP(s) services outside of Kubernetes. To have ingress support, you will need an Ingress Controller, which in a nutshell is a proxy. SREs and DevOps love ingress as it provides developers with a self-service to expose their applications. Developers love it as it is simple to use, but at the same time quite flexible.

High-level ingress design looks like this: 

High-level ingress design

  1. Users connect through a single Load Balancer or other Kubernetes service
  2. Traffic is routed through Ingress Pod (or Pods for high availability)
    • There are multiple flavors of Ingress Controllers. Some use nginx, some envoy, or other proxies. See a curated list of Ingress Controllers here.
  3. Based on HTTP headers traffic is routed to corresponding Pods which run websites. For HTTPS traffic Server Name Indication (SNI) is used, which is an extension in TLS supported by most browsers. Usually, ingress controller integrates nicely with cert-manager, which provides you with full TLS lifecycle support (yeah, no need to worry about renewals anymore).

The beauty and value of such a design is that you have a single Load Balancer serving all your websites. In Public Clouds, it leads to cost savings, and in private clouds, it simplifies your networking or reduces the number of IPv4 addresses (if you are not on IPv6 yet). 

TCP and UDP with Ingress

Quite interestingly, some ingress controllers also support TCP and UDP proxying. I have been asked on our forum and Kubernetes slack multiple times if it is possible to use ingress with Percona Operators. Well, it is. Usually, you need a load balancer per database cluster: 

TCP and UDP with Ingress

The design with ingress is going to be a bit more complicated but still allows you to utilize the single load balancer for multiple databases. In cases where you run hundreds of clusters, it leads to significant cost savings. 

  1. Each TCP port represents the database cluster
  2. Ingress Controller makes a decision about where to route the traffic based on the port

The obvious downside of this design is non-default TCP ports for databases. There might be weird cases where it can turn into a blocker, but usually, it should not.

Hands-on

My goal is to have the following:

All configuration files I used for this blog post can be found in this Github repository.

Deploy Percona XtraDB Clusters (PXC)

The following commands are going to deploy the Operator and three clusters:

kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/operators-and-ingress/bundle.yaml

kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/operators-and-ingress/cr-minimal.yaml
kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/operators-and-ingress/cr-minimal2.yaml
kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/operators-and-ingress/cr-minimal3.yaml

Deploy Ingress

helm upgrade --install ingress-nginx ingress-nginx   --repo https://kubernetes.github.io/ingress-nginx   --namespace ingress-nginx --create-namespace  \
--set controller.replicaCount=2 \
--set tcp.3306="default/minimal-cluster-haproxy:3306"  \
--set tcp.3307="default/minimal-cluster2-haproxy:3306" \
--set tcp.3308="default/minimal-cluster3-haproxy:3306"

This is going to deploy a highly available ingress-nginx controller. 

  • controller.replicaCount=2

    – defines that we want to have at least two Pods of ingress controller. This is to provide a highly available setup.

  • tcp flags do two things:
    • expose ports 3306-3308 on the ingress’s load balancer
    • instructs ingress controller to forward traffic to corresponding services which were created by Percona Operator for PXC clusters. For example, port 3307 is the one to use to connect to minimal-cluster2. Read more about this configuration in ingress documentation.

Here is the load balancer resource that was created:

$ kubectl -n ingress-nginx get service
NAME                                 TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                                                                   AGE
…
ingress-nginx-controller             LoadBalancer   10.43.72.142    212.2.244.54   80:30261/TCP,443:32112/TCP,3306:31583/TCP,3307:30786/TCP,3308:31827/TCP   4m13s

As you see, ports 3306-3308 are exposed.

Check the Connection

This is it. Database clusters should be exposed and reachable. Let’s check the connection. 

Get the root password for minimal-cluster2:

$ kubectl get secrets minimal-cluster2-secrets | grep root | awk '{print $2}' | base64 --decode && echo
kSNzx3putYecHwSXEgf

Connect to the database:

$ mysql -u root -h 212.2.244.54 --port 3307  -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5722

It works! Notice how I use port 3307, which corresponds to minimal-cluster2.

Adding More Clusters

What if you add more clusters into the picture, how do you expose those?

Helm

If you use helm, the easiest way is to just add one more flag into the command:

helm upgrade --install ingress-nginx ingress-nginx   --repo https://kubernetes.github.io/ingress-nginx   --namespace ingress-nginx --create-namespace  \
--set controller.replicaCount=2 \
--set tcp.3306="default/minimal-cluster-haproxy:3306"  \
--set tcp.3307="default/minimal-cluster2-haproxy:3306" \
--set tcp.3308="default/minimal-cluster3-haproxy:3306" \
--set tcp.3309="default/minimal-cluster4-haproxy:3306"

No helm

Without Helm, it is a two-step process:

First, edit the

configMap

which configures TCP services exposure. By default it is called

ingress-nginx-tcp

:

kubectl -n ingress-nginx edit cm ingress-nginx-tcp
…
apiVersion: v1
data:
  "3306": default/minimal-cluster-haproxy:3306
  "3307": default/minimal-cluster2-haproxy:3306
  "3308": default/minimal-cluster3-haproxy:3306
  "3309": default/minimal-cluster4-haproxy:3306

Change in

configMap

will trigger the reconfiguration of nginx in ingress pods. But as a second step, it is also necessary to expose this port on a load balancer. To do so – edit the corresponding service:

kubectl -n ingress-nginx edit services ingress-nginx-controller
…
spec:
  ports:
    …
  - name: 3309-tcp
    port: 3309
    protocol: TCP
    targetPort: 3309-tcp

The new cluster is exposed on port 3309 now.

Limitations and considerations

Ports per Load Balancer

Cloud providers usually limit the number of ports that you can expose through a single load balancer:

  • AWS has 50 listeners per NLB, GCP 100 ports per service.

If you hit the limit, just create another load balancer pointing to the ingress controller.

Automation

Cost-saving is a good thing, but with Kubernetes capabilities, users expect to avoid manual tasks, not add more. Integrating the change of ingress configMap and load balancer ports into CICD is not a hard task, but maintaining the logic of adding new load balancers to add more ports is harder. I have not found any projects that implement the logic of reusing load balancer ports or automating it in any way. If you know of any – please leave a comment under this blog post.

Mar
15
2022
--

Run PostgreSQL on Kubernetes with Percona Operator & Pulumi

Run PostgreSQL on Kubernetes with Percona Operator and Pulumi

Avoid vendor lock-in, provide a private Database-as-a-Service for internal teams, quickly deploy-test-destroy databases with CI/CD pipeline – these are some of the most common use cases for running databases on Kubernetes with operators. Percona Distribution for PostgreSQL Operator enables users to do exactly that and more.

Pulumi is an infrastructure-as-a-code tool, which enables developers to write code in their favorite language (Python, Golang, JavaScript, etc.) to deploy infrastructure and applications easily to public clouds and platforms such as Kubernetes.

This blog post is a step-by-step guide on how to deploy a highly-available PostgreSQL cluster on Kubernetes with our Percona Operator and Pulumi.

Desired State

We are going to provision the following resources with Pulumi:

  • Google Kubernetes Engine cluster with three nodes. It can be any Kubernetes flavor.
  • Percona Operator for PostgreSQL
  • Highly available PostgreSQL cluster with one primary and two hot standby nodes
  • Highly available pgBouncer deployment with the Load Balancer in front of it
  • pgBackRest for local backups

Pulumi code can be found in this git repository.

Prepare

I will use the Ubuntu box to run Pulumi, but almost the same steps would work on macOS.

Pre-install Packages

gcloud and kubectl

echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key --keyring /usr/share/keyrings/cloud.google.gpg add -
sudo apt-get update
sudo apt-get install -y google-cloud-sdk docker.io kubectl jq unzip

python3

Pulumi allows developers to use the language of their choice to describe infrastructure and applications. I’m going to use python. We will also pip (python package-management system) and venv (virtual environment module).

sudo apt-get install python3 python3-pip python3-venv

Pulumi

Install Pulumi:

curl -sSL https://get.pulumi.com | sh

On macOS, this can be installed view Homebrew with

brew install pulumi

 

You will need to add .pulumi/bin to the $PATH:

export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/percona/.pulumi/bin

Authentication

gcloud

You will need to provide access to Google Cloud to provision Google Kubernetes Engine.

gcloud config set project your-project
gcloud auth application-default login
gcloud auth login

Pulumi

Generate Pulumi token at app.pulumi.com. You will need it later to init Pulumi stack:

Action

This repo has the following files:

  • Pulumi.yaml

    – identifies that it is a folder with Pulumi project

  • __main__.py

    – python code used by Pulumi to provision everything we need

  • requirements.txt

    – to install required python packages

Clone the repo and go to the

pg-k8s-pulumi

folder:

git clone https://github.com/spron-in/blog-data
cd blog-data/pg-k8s-pulumi

Init the stack with:

pulumi stack init pg

You will need the key here generated before on app.pulumi.com.

__main__.py

Python code that Pulumi is going to process is in __main__.py file. 

Lines 1-6: importing python packages

Lines 8-31: configuration parameters for this Pulumi stack. It consists of two parts:

  • Kubernetes cluster configuration. For example, the number of nodes.
  • Operator and PostgreSQL cluster configuration – namespace to be deployed to, service type to expose pgBouncer, etc.

Lines 33-80: deploy GKE cluster and export its configuration

Lines 82-88: create the namespace for Operator and PostgreSQL cluster

Lines 91-426: deploy the Operator. In reality, it just mirrors the operator.yaml from our Operator.

Lines 429-444: create the secret object that allows you to set the password for pguser to connect to the database

Lines 445-557: deploy PostgreSQL cluster. It is a JSON version of cr.yaml from our Operator repository

Line 560: exports Kubernetes configuration so that it can be reused later 

Deploy

At first, we will set the configuration for this stack. Execute the following commands:

pulumi config set gcp:project YOUR_PROJECT
pulumi config set gcp:zone us-central1-a
pulumi config set node_count 3
pulumi config set master_version 1.21

pulumi config set namespace percona-pg
pulumi config set pg_cluster_name pulumi-pg
pulumi config set service_type LoadBalancer
pulumi config set pg_user_password mySuperPass

These commands set the following:

  • GCP project where GKE is going to be deployed
  • GCP zone 
  • Number of nodes in a GKE cluster
  • Kubernetes version
  • Namespace to run PostgreSQL cluster
  • The name of the cluster
  • Expose pgBouncer with LoadBalancer object

Deploy with the following command:

$ pulumi up
Previewing update (pg)

View Live: https://app.pulumi.com/spron-in/percona-pg-k8s/pg/previews/d335d117-b2ce-463b-867d-ad34cf456cb3

     Type                                                           Name                                Plan       Info
 +   pulumi:pulumi:Stack                                            percona-pg-k8s-pg                   create     1 message
 +   ?? random:index:RandomPassword                                 pguser_password                     create
 +   ?? random:index:RandomPassword                                 password                            create
 +   ?? gcp:container:Cluster                                       gke-cluster                         create
 +   ?? pulumi:providers:kubernetes                                 gke_k8s                             create
 +   ?? kubernetes:core/v1:ServiceAccount                           pgoPgo_deployer_saServiceAccount    create
 +   ?? kubernetes:core/v1:Namespace                                pgNamespace                         create
 +   ?? kubernetes:batch/v1:Job                                     pgoPgo_deployJob                    create
 +   ?? kubernetes:core/v1:ConfigMap                                pgoPgo_deployer_cmConfigMap         create
 +   ?? kubernetes:core/v1:Secret                                   percona_pguser_secretSecret         create
 +   ?? kubernetes:rbac.authorization.k8s.io/v1:ClusterRoleBinding  pgo_deployer_crbClusterRoleBinding  create
 +   ?? kubernetes:rbac.authorization.k8s.io/v1:ClusterRole         pgo_deployer_crClusterRole          create
 +   ?? kubernetes:pg.percona.com/v1:PerconaPGCluster               my_cluster_name                     create

Diagnostics:
  pulumi:pulumi:Stack (percona-pg-k8s-pg):
    E0225 14:19:49.739366105   53802 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies

Do you want to perform this update? yes

Updating (pg)
View Live: https://app.pulumi.com/spron-in/percona-pg-k8s/pg/updates/5
     Type                                                           Name                                Status      Info
 +   pulumi:pulumi:Stack                                            percona-pg-k8s-pg                   created     1 message
 +   ?? random:index:RandomPassword                                 pguser_password                     created
 +   ?? random:index:RandomPassword                                 password                            created
 +   ?? gcp:container:Cluster                                       gke-cluster                         created
 +   ?? pulumi:providers:kubernetes                                 gke_k8s                             created
 +   ?? kubernetes:core/v1:ServiceAccount                           pgoPgo_deployer_saServiceAccount    created
 +   ?? kubernetes:core/v1:Namespace                                pgNamespace                         created
 +   ?? kubernetes:core/v1:ConfigMap                                pgoPgo_deployer_cmConfigMap         created
 +   ?? kubernetes:batch/v1:Job                                     pgoPgo_deployJob                    created
 +   ?? kubernetes:core/v1:Secret                                   percona_pguser_secretSecret         created
 +   ?? kubernetes:rbac.authorization.k8s.io/v1:ClusterRole         pgo_deployer_crClusterRole          created
 +   ?? kubernetes:rbac.authorization.k8s.io/v1:ClusterRoleBinding  pgo_deployer_crbClusterRoleBinding  created
 +   ?? kubernetes:pg.percona.com/v1:PerconaPGCluster               my_cluster_name                     created

Diagnostics:
  pulumi:pulumi:Stack (percona-pg-k8s-pg):
    E0225 14:20:00.211695433   53839 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies

Outputs:
    kubeconfig: "[secret]"

Resources:
    + 13 created

Duration: 5m30s

Verify

Get kubeconfig first:

pulumi stack output kubeconfig --show-secrets > ~/.kube/config

Check if Pods of your PG cluster are up and running:

$ kubectl -n percona-pg get pods
NAME                                             READY   STATUS      RESTARTS   AGE
backrest-backup-pulumi-pg-dbgsp                  0/1     Completed   0          64s
pgo-deploy-8h86n                                 0/1     Completed   0          4m9s
postgres-operator-5966f884d4-zknbx               4/4     Running     1          3m27s
pulumi-pg-787fdbd8d9-d4nvv                       1/1     Running     0          2m12s
pulumi-pg-backrest-shared-repo-f58bc7657-2swvn   1/1     Running     0          2m38s
pulumi-pg-pgbouncer-6b6dc4564b-bh56z             1/1     Running     0          81s
pulumi-pg-pgbouncer-6b6dc4564b-vpppx             1/1     Running     0          81s
pulumi-pg-pgbouncer-6b6dc4564b-zkdwj             1/1     Running     0          81s
pulumi-pg-repl1-58d578cf49-czm54                 0/1     Running     0          46s
pulumi-pg-repl2-7888fbfd47-h98f4                 0/1     Running     0          46s
pulumi-pg-repl3-cdd958bd9-tf87k                  1/1     Running     0          46s

Get the IP-address of pgBouncer LoadBalancer:

$ kubectl -n percona-pg get services
NAME                             TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)                      AGE
…
pulumi-pg-pgbouncer              LoadBalancer   10.20.33.122   35.188.81.20   5432:32042/TCP               3m17s

You can connect to your PostgreSQL cluster through this IP-address. Use pguser password that was set earlier with

pulumi config set pg_user_password

:

psql -h 35.188.81.20 -p 5432 -U pguser pgdb

Clean up

To delete everything it is enough to run the following commands:

pulumi destroy
pulumi stack rm

Tricks and Quirks

Pulumi Converter

kube2pulumi is a huge help if you already have YAML manifests. You don’t need to rewrite the whole code, but just convert YAMLs to Pulumi code. This is what I did for operator.yaml.

apiextensions.CustomResource

There are two ways for Custom Resource management in Pulumi:

crd2pulumi generates libraries/classes out of Custom Resource Definitions and allows you to create custom resources later using these. I found it a bit complicated and it also lacks documentation.

apiextensions.CustomResource on the other hand allows you to create Custom Resources by specifying them as JSON. It is much easier and requires less manipulation. See lines 446-557 in my __main__.py.

True/False in JSON

I have the following in my Custom Resource definition in Pulumi code:

perconapg = kubernetes.apiextensions.CustomResource(
…
    spec= {
…
    "disableAutofail": False,
    "tlsOnly": False,
    "standby": False,
    "pause": False,
    "keepData": True,

Be sure that you use boolean of the language of your choice and not the “true”/”false” strings. For me using the strings turned into a failure as the Operator was expecting boolean, not the strings.

Depends On…

Pulumi makes its own decisions on the ordering of provisioning resources. You can enforce the order by specifying dependencies

For example, I’m ensuring that Operator and Secret are created before the Custom Resource:

    },opts=ResourceOptions(provider=k8s_provider,depends_on=[pgo_pgo_deploy_job,percona_pg_cluster1_pguser_secret_secret])

Mar
07
2022
--

Manage Your Data with Mingo.io and Percona Distribution for MongoDB Operator

Mingo.io and Percona Distribution for MongoDB Operator

Mingo.io and Percona Distribution for MongoDB OperatorDeploying MongoDB on Kubernetes has never been simpler with Percona Distribution for MongoDB Operator. It provides you with an enterprise-ready MongoDB cluster with no manual burden. In addition to that, you also get automated day-to-day operations – scaling, backups, upgrades.

But once you have your cluster up and running and have a good grasp on managing it, you still need to manage the data itself – collections, indexes, shards. We at Percona focus on infrastructure, data consistency, and the health of the cluster, but data is an integral part of any database that should be managed. In this blog post, we will see how the user can manage, browse and query the data of a MongoDB cluster with mingo.io. Mingo.io is a tool that I have discovered recently and it reminded me of PHPMyAdmin, but with a great user interface, feature set, and that it is for MongoDB, not MySQL.

This post provides a step-by-step guide on how to deploy MongoDB locally, connect it with Mingo.io, and manage the data. We want to have a local setup so that anyone would be able to try it at home.

Deploy the Operator and the Database

For local installation, I will use microk8s and minimal cr.yaml for MongoDB, which deploys a one-node replica set with sharding. It is enough for the demo.

Start Kubernetes Cluster

As I use Ubuntu, I will install microk8s from snap:

# snap install microk8s --classic

We need to start microk8s and enable Domain Name Service (DNS) and storage support:

# microk8s start
# microk8s enable dns storage

Kubernetes cluster is ready, let’s fetch kubeconfig from it to use regular kubectl:

# microk8s config > /home/percona/.kube/config

MongoDB Up and Running

Install the Operator:

kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/mingo/bundle.yaml

Deploy MongoDB itself:

kubectl apply -f https://raw.githubusercontent.com/spron-in/blog-data/master/mingo/cr-minimal.yaml

Connect to MongoDB

To connect to MongoDB we need a connection string with the login and password. In this example, we expose mongos with a NodePort service type. Find the Port to connect to:

$ kubectl get services
NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)           AGE
…
minimal-cluster-mongos   NodePort    10.152.183.46   <none>        27017:32609/TCP   32m

You can connect to the MongoDB cluster using the IP address of your machine and port 32609. 

To demonstrate the full power of Mingo, we will need to create a user first to manage the data:

Get userAdmin password

$ kubectl get secrets minimal-cluster -o yaml | awk '$1~/MONGODB_USER_ADMIN_PASSWORD/ {print $2}' | base64 --decode
4OnZ2RZ3SpRVtKbUxy

Run MongoDB client Pod

kubectl run -i --rm --tty percona-client --image=percona/percona-server-mongodb:4.4.10-11 --restart=Never -- bash -il

Connect to Mongo

mongo "mongodb://userAdmin:4OnZ2RZ3SpRVtKbUxy@minimal-cluster-mongos.default.svc.cluster.local/admin?ssl=false"

Create the user in mongo shell:

db.createUser( { user: "test", pwd: "mysUperpass", roles: [ "userAdminAnyDatabase", "readWriteAnyDatabase", "dbAdminAnyDatabase" ] } )

The connection strict that I would use in Mingo would look like this:

mongodb://test:mySuperpass@10.10.3.4:32609

IP address and port most likely would be different for you.

Mingo at Work

MongoDB cluster is up and running. Let’s have Mingo installed and configured.

First, download the latest version of Mingo from https://mingo.io/ On the first opening after installation you will be welcomed with the message about Mingo’s security.

Once you are prompted for your first MongoDB connection, paste your mongo URL into the field and submit. Now you are connected and ready to manage your data.

Mingo.io

Now, let’s try two examples of how Mingo works.

Example 1

Let’s assume you have a collection called “Blog” and you want to rename it to “Articles”. First, locate the collection in the sidebar, right-click it and select “Rename collection…” in the context menu. Type the new collection name and hit Enter. That’s it.

 

Example 2

Let’s assume you have a collection called “orders” with the field “shippedDate”, and you want to rename this field to shipmentDate in every document that was created during 2020. How would you do this in the old way? Would it be easy? Let’s see how you can do this in Mingo.

Instead of complicated queries with dates, we can use a simple shorthand { “OrderDate”: #2020 } and press Submit to see the results. Now just expand any document and then right-click on shippedDate. From the context menu select “Rename Fields ? Filtered documents…” and then just insert the correct name and hit Rename.

Example 3

Let’s customize the way we see the “Orders” collection, for example.

First, click CMD+T (or CTRL+T on Windows), start typing the name of the collection in the Finder until you see it. Then, using the up and down arrow keys, select the collection and press Enter. Alternatively, you can click on the collection in the sidebar.

When the collection is shown, Mingo picks a couple of columns to show. To add another column, click on the + icon and select the field to add. You can rearrange the columns by dragging their headers. Click on the column header to see all the actions for a column and explore them.

Projects

On top of regular MongoDB connections, Mingo offers “Projects”. A project is like a wrapper on several connections to different databases, such as development, production, testing. Mingo treats these databases as siblings and simplifies routine tasks, such as synchronization, backups, shared settings for sibling collections and many more.

Have a Quick Glance at Mingo in Action

 

Conclusion

Deploying and managing MongoDB clusters on Kubernetes is still in its early phase. Percona is committed to delivering enterprise-grade solutions to deploy and manage databases on Kubernetes.

We are focused on infrastructure and data consistency, whereas Mingo.io empowers users to perform data management on any MongoDB cluster.

Percona Distribution for MongoDB Operator contains everything you need to quickly and consistently deploy and scale Percona Server for MongoDB instances into a Kubernetes cluster on-premises or in the cloud. The Operator enables you to improve time to market with the ability to quickly deploy standardized and repeatable database environments. Deploy your database with a consistent and idempotent result no matter where they are used.

Mingo

As NodeJS developers, heavily using MongoDB for our projects, we have long dreamed of a better MongoDB admin tool.

Existing solutions lacked usability and features to make browsing and querying documents enjoyable. They would easily analyze data, list documents, and build aggregations, but they felt awkward with simple everyday tasks, such as finding a document quickly, viewing its content in a nice layout, or doing common actions with one click. Since dreaming only gets you so far, we started working on a new MongoDB admin.

Our goal at Mingo is to create a MongoDB GUI with superb user experience, modern design, and productive features to speed up your work and make you fall in love with your data.

Jan
27
2022
--

Percona Distribution for MySQL Operator Based on Percona Server for MySQL – Alpha Release

Percona Distribution for MySQL Operator

Percona Distribution for MySQL OperatorOperators are a software framework that extends Kubernetes API and enables application deployment and management through the control plane. For such complex technologies as databases, Operators play a crucial role by automating deployment and day-to-day operations. At Percona we have the following production-ready and enterprise-grade Kubernetes Operators for databases:

Today we are glad to announce an alpha version of our new Operator for MySQL. In this blog post, we are going to answer some frequently asked questions.

Why the New Operator?

As mentioned above, our existing operator for MySQL is based on the Percona XtraDB Cluster (PXC). It is feature-rich and provides virtually-synchronous replication by utilizing Galera Write-Sets. Sync replication ensures data consistency and proved itself useful for critical applications, especially on Kubernetes.

But there are two things that we want to address:

  1. Our community and customers let us know that there are numerous use cases where asynchronous replication would be a more suitable solution for MySQL on Kubernetes.
  2. Support Group Replication (GR) – a native way to provide synchronous replication in MySQL without the need to use Galera.

We heard you! That is why our new Operator is going to run Percona Server for MySQL (PS) and provide both regular asynchronous (with semi-sync support) and virtually-synchronous replication based on GR.

What Is the Name of the New Operator?

We will have two Operators for MySQL and follow the same naming as we have for Distributions:

  • Percona Distribution for MySQL Operator – PXC (Existing Operator)
  • Percona Distribution for MySQL Operator – PS (Percona Server for MySQL)

Is It the Successor of the Existing Operator for MySQL?

Not in the short run. We want to provide our users MySQL clusters on Kubernetes with three replication capabilities:

  • Percona XtraDB Cluster
  • Group Replication
  • Regular asynchronous replication with semi-sync support

Will I Be Able to Switch From One Operator to Another?

We are going to provide instructions and tools for migrations through replication or backup and restore. As the underlying implementation is totally different there is no direct, automated path to switch available.

Can I Use the New Operator Now?

Yes, but please remember that it is an alpha version and we do not recommend it for production workloads. 

Our Operator is licensed under Apache 2.0 and can be found in the percona-server-mysql-operator repository on GitHub.

To learn more about our operator please see the documentation.

Quick Deploy

Run these two commands to spin up a MySQL cluster with 3 nodes with asynchronous replication:

$ kubectl apply -f https://raw.githubusercontent.com/percona/percona-server-mysql-operator/main/deploy/bundle.yaml
$ kubectl apply -f https://raw.githubusercontent.com/percona/percona-server-mysql-operator/main/deploy/cr.yaml

In a couple of minutes, the cluster is going to be up and running. Verify:

$ kubectl get ps
NAME       MYSQL   ORCHESTRATOR   AGE
cluster1   ready   ready          6m16s

$ kubectl get pods
NAME                                                READY   STATUS    RESTARTS   AGE
cluster1-mysql-0                                    1/1     Running   0          6m27s
cluster1-mysql-1                                    1/1     Running   1          5m11s
cluster1-mysql-2                                    1/1     Running   1          3m36s
cluster1-orc-0                                      2/2     Running   0          6m27s
percona-server-for-mysql-operator-c8f8dbccb-q7lbr   1/1     Running   0          9m31s

Connect to the Cluster

First, you need to get the root user password, which was automatically generated by the Operator. By default system users’ passwords are stored in cluster1-secrets Secret resource:

$ kubectl get secrets cluster1-secrets -o yaml | grep root | awk '{print $2}' | base64 --decode

Start another container with a MySQL client in it:

$ kubectl run -i --rm --tty percona-client --image=percona:8.0 --restart=Never -- bash -il

Connect to a primary node of our MySQL cluster from this container:

$ mysql -h cluster1-mysql-primary -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 138
Server version: 8.0.25-15 Percona Server (GPL), Release 15, Revision a558ec2

Copyright (c) 2009-2021 Percona LLC and/or its affiliates
Copyright (c) 2000, 2021, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>

Consult the documentation to learn more about other operational capabilities and options.

What Is Currently Supported?

The following functionality is available in the Operator:

  • Deploy asynchronous and semi-sync replication MySQL clusters with Orchestrator on top of it
  • Expose clusters with regular Kubernetes Services
  • Monitor the cluster with Percona Monitoring and Management
  • Customize MySQL configuration

When Will It Be GA? What Is Going to Be Included?

Our goal is to release the GA version late in Q2 2022. We plan to include the following:

  • Support for both sync and async replication
  • Backups and restores, proxies integration
  • Certifications on various Kubernetes platforms and flavors
  • Deploy and manage MySQL Clusters with PMM DBaaS

Call for Action

Percona Distribution for MySQL Operator – PS just hatched and your feedback is highly appreciated. 

Open a bug or create a Pull Request for a chance to get awesome Percona Swag!

Jan
11
2022
--

Creating a Standby Cluster With the Percona Distribution for PostgreSQL Operator

Standby Cluster With the Percona Distribution for PostgreSQL Operator

A customer recently asked if our Percona Distribution for PostgreSQL Operator supports the deployment of a standby cluster, which they need as part of their Disaster Recovery (DR) strategy. The answer is yes – as long as you are making use of an object storage system for backups, such as AWS S3 or GCP Cloud Storage buckets, that can be accessed by the standby cluster. In a nutshell, it works like this:

  • The primary cluster is configured with pgBackRest to take backups and store them alongside archived WAL files in a remote repository;
  • The standby cluster is built from one of these backups and it is kept in sync with the primary cluster by consuming the WAL files that are copied from the remote repository.

Note that the primary node in the standby cluster is not a streaming replica from any of the nodes in the primary cluster and that it relies on archived WAL files to replicate events. For this reason, this approach cannot be used as a High Availability (HA) solution. Even though the primary use of a standby cluster in this context is DR, it can be also employed for migrations as well.

So, how can we create a standby cluster using the Percona operator? We will show you next. But first, let’s create a primary cluster for our example.

Creating a Primary PostgreSQL Cluster Using the Percona Operator

You will find a detailed procedure on how to deploy a PostgreSQL cluster using the Percona operator in our online documentation. Here we want to highlight the main steps involved, particularly regarding the configuration of object storage, which is a crucial requirement and should better be done during the initial deployment of the cluster. In the following example, we will deploy our clusters using the Google Kubernetes Engine (GKE) but you can find similar instructions for other environments in the previous link.

Considering you have a Google account configured as well as the gcloud (from the Google Cloud SDK suite) and kubectl command-line tools installed, authenticate yourself with gcloud auth login, and off we go!

Creating a GKE Cluster and Basic Configuration

The following command will create a default cluster named “cluster-1” and composed of three nodes. We are creating it in the us-central1-a zone using e2-standard-4 VMs but you may choose different options. In fact, you may also need to indicate the project name and other main settings if you do not have your gcloud environment pre-configured with them:

gcloud container clusters create cluster-1 --preemptible --machine-type e2-standard-4 --num-nodes=3 --zone us-central1-a

Once the cluster is created, use your IAM identity to control access to this new cluster:

kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user $(gcloud config get-value core/account)

Finally, create the pgo namespace:

kubectl create namespace pgo

and set the current context to refer to this new namespace:

kubectl config set-context $(kubectl config current-context) --namespace=pgo

Creating a Cloud Storage Bucket

Remember for this setup we need a Google Cloud Storage bucket configured as well as a Service Account created with the necessary privileges/roles to access it. The respective procedures to obtain these vary according to how your environment is configured so we won’t be covering them here. Please refer to the Google Cloud Storage documentation for the exact steps. The bucket we created for the example in this post was named cluster1-backups-and-wals.

Likewise, please refer to the Creating and managing service account keys documentation to learn how to create a Service Account and download the corresponding key in JSON format – we will need to provide it to the operator so our PostgreSQL clusters can access the storage bucket.

Creating the Kubernetes Secrets File to Access the Storage Bucket

Create a file named my-gcs-account-secret.yaml with the following structure:

apiVersion: v1
kind: Secret
metadata:
  name: cluster1-backrest-repo-config
type: Opaque
data:
  gcs-key: <VALUE>

replacing the <VALUE> placeholder by the output of the following command according to the OS you are using:

Linux:

base64 --wrap=0 your-service-account-key-file.json

macOS:

base64 your-service-account-key-file.json

Installing and Deploying the Operator

The most practical way to install our operator is by cloning the Git repository, and then moving inside its directory:

git clone -b v1.1.0 https://github.com/percona/percona-postgresql-operator
cd percona-postgresql-operator

The following command will deploy the operator:

kubectl apply -f deploy/operator.yaml

We have already prepared the secrets file to access the storage bucket so we can apply it now:

kubectl apply -f my-gcs-account-secret.yaml

Now, all that is left is to customize the storages options in the deploy/cr.yaml file to indicate the use of the GCS bucket as follows:

    storages:
      my-gcs:
        type: gcs
        bucket: cluster1-backups-and-wals

We can now deploy the primary PostgreSQL cluster (cluster1):

kubectl apply -f deploy/cr.yaml

Once the operator has been deployed, you can run the following command to do some housekeeping:

kubectl delete -f deploy/operator.yaml

Creating a Standby PostgreSQL Cluster Using the Percona Operator

After this long preamble, let’s look at what brought you here: how to deploy a standby cluster, which we will refer to as cluster2, that will replicate from the primary cluster.

Copying the Secrets Over

Considering you probably have customized the passwords you use in your primary cluster and that they differ from the default values found in the operator’s git repository, we need to make a copy of the secrets files, adjusted to the standby cluster’s name. The following procedure facilitates this task, saving the secrets files under /tmp/cluster1-cluster2-secrets (you can choose a different target directory):

NOTE: make sure you have the yq tool installed in your system.
mkdir -p /tmp/cluster1-cluster2-secrets/
export primary_cluster_name=cluster1
export standby_cluster_name=cluster2
export secrets="${primary_cluster_name}-users"
kubectl get secret/$secrets -o yaml \
| yq eval 'del(.metadata.creationTimestamp)' - \
| yq eval 'del(.metadata.uid)' - \
| yq eval 'del(.metadata.selfLink)' - \
| yq eval 'del(.metadata.resourceVersion)' - \
| yq eval 'del(.metadata.namespace)' - \
| yq eval 'del(.metadata.annotations."kubectl.kubernetes.io/last-applied-configuration")' - \
| yq eval '.metadata.name = "'"${secrets/$primary_cluster_name/$standby_cluster_name}"'"' - \
| yq eval '.metadata.labels.pg-cluster = "'"${standby_cluster_name}"'"' - \
>/tmp/cluster1-cluster2-secrets/${secrets/$primary_cluster_name/$standby_cluster_name}

Deploying the Standby Cluster: Fast Mode

Since we have already covered the procedure used to create the primary cluster in detail in a previous section, we will be presenting the essential steps to create the standby cluster below and provide additional comments only when necessary.

NOTE: the commands below are issued from inside the percona-postgresql-operator directory hosting the git repository for our operator.

Deploying a New GKE Cluster Named cluster-2

This time using the us-west1-b zone here:

gcloud container clusters create cluster-2 --preemptible --machine-type e2-standard-4 --num-nodes=3 --zone us-west1-b
kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user $(gcloud config get-value core/account)
kubectl create namespace pgo
kubectl config set-context $(kubectl config current-context) --namespace=pgo
kubectl apply -f deploy/operator.yaml

Apply the Adjusted Kubernetes Secrets:

export standby_cluster_name=cluster2
export secrets="${standby_cluster_name}-users"
kubectl create -f /tmp/cluster1-cluster2-secrets/$secrets

The list above does not include the GCS secret file; the key contents remain the same but the backrest-repo pod name needs to be adjusted. Make a copy of that file:

cp my-gcs-account-secret.yaml my-gcs-account-secret-2.yaml

then edit the copy to indicate “cluster2-” instead of “cluster1-”:

name: cluster2-backrest-repo-config

You can apply it now:

kubectl apply -f my-gcs-account-secret-2.yaml

The cr.yaml file of the Standby Cluster

Let’s make a copy of the cr.yaml file we customized for the primary cluster:

cp deploy/cr.yaml deploy/cr-2.yaml

and edit the copy as follows:

1) Change all references (that are not commented) from cluster1 to cluster2  – including current-primary but excluding the bucket reference, which in our example is prefixed with “cluster1-”; the storage section must remain unchanged. (We know it’s not very practical to replace so many references, we still need to improve this part of the routine).

2) Enable the standby option:

standby: true

3) Provide a repoPath that points to the GCS bucket used by the primary cluster (just below the storages section, which should remain the same as in the primary cluster’s cr.yaml file):

repoPath: “/backrestrepo/cluster1-backrest-shared-repo”

And that’s it! All that is left now is to deploy the standby cluster:

kubectl apply -f deploy/cr-2.yaml

With everything working on the standby cluster, do some housekeeping:

kubectl delete -f deploy/operator.yaml

Verifying it all Works as Expected

Remember that the standby cluster is created from a backup and relies on archived WAL files to be continued in sync with the primary cluster. If you make a change in the primary cluster, such as adding a row to a table, that change won’t reach the standby cluster until the WAL file it has been recorded to is archived and consumed by the standby cluster.

When checking if all is working with the new setup, you can force the rotation of the WAL file (and subsequent archival of the previous one) in the primary node of the primary cluster to accelerate the sync process by issuing:

psql> SELECT pg_switch_wal();

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

Learn More About Percona Kubernetes Operators

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com