Oct 13, 2021

Migrating MongoDB to Kubernetes

This blog post is the last in the series of articles on migrating databases to Kubernetes with Percona Operators. Two previous posts can be found here:

As you might have guessed already, this time we are going to cover the migration of MongoDB to Kubernetes. In the 1.10.0 release of Percona Distribution for MongoDB Operator, we introduced a new feature (in tech preview) that enables users to execute such migrations through regular MongoDB replication capabilities. We have already shown how it can be used to provide cross-regional disaster recovery for MongoDB, and we encourage you to read that post.

The Goal

There are two ways to migrate the database:

  1. Take the backup and restore it.
    – This option is the simplest one, but unfortunately comes with downtime. The bigger the database, the longer the recovery time is.
  2. Replicate the data to the new site and switch the application once replicas are in sync.
    – This allows the user to perform the migration and switch the application with either zero or little downtime.

This blog post is a walkthrough on how to migrate a MongoDB replica set to Kubernetes using its replication capabilities.

MongoDB replica set to Kubernetes

  1. We have a MongoDB cluster somewhere (the Source). It can be on-prem or on a virtual machine. For demo purposes, I’m going to use a single-node replica set. The migration procedure for a replica set with multiple nodes or for a sharded cluster is almost identical.
  2. We have a Kubernetes cluster with Percona Operator (the Target). The operator deploys 3 standalone MongoDB nodes in unmanaged mode (we will talk about it below).
  3. Each node is exposed so that the nodes on the Source can reach them.
  4. We are going to replicate the data to Target nodes by adding them into the replica set.

As always, all scripts and configuration files for this blog post are publicly available in this GitHub repository.

Prerequisites

  • MongoDB cluster – either on-prem or VM. It is important to be able to configure mongod to some extent and add external nodes to the replica set.
  • Kubernetes cluster for the Target.
  • kubectl to deploy and manage the Operator and database on Kubernetes.

Prepare the Source

This section explains what preparations must be made on the Source to set up the replication.

Expose

All nodes in the replica set must form a mesh and be able to reach each other. The communication between the nodes can go through the public internet or some VPN. For demonstration purposes, we are going to expose the Source to the public internet by editing mongod.conf:

net:
  bindIp: <PUBLIC IP>

If you have multiple replica sets, you need to expose all nodes of each of them, including config servers.

TLS

We take security seriously at Percona, which is why our Operator deploys MongoDB clusters with encryption enabled by default. I have prepared a script that generates self-signed certificates and keys with the openssl tool. If you already have a Certificate Authority (CA) in use in your organization, generate the certificates and have them signed by your CA instead.
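
For reference, here is a minimal sketch of what such a script might do; file names, subjects, and validity periods are illustrative, and the SAN list (including the wildcard entry) is assumed to live in the openssl configuration file mentioned just below:

# generate a self-signed CA
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -days 3650 -subj "/CN=mongodb-ca" -out ca.pem

# generate the server key and a certificate signed by that CA, with the SANs from openssl.cnf
openssl genrsa -out mongod.key 2048
openssl req -new -key mongod.key -subj "/CN=*.mongo.spron.in" -config openssl.cnf -out mongod.csr
openssl x509 -req -in mongod.csr -CA ca.pem -CAkey ca.key -CAcreateserial \
  -days 365 -extensions v3_req -extfile openssl.cnf -out mongod.crt

# mongod expects the key and certificate concatenated into a single PEM file
cat mongod.key mongod.crt > mongod.pem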

The list of alternative names can be found either in this document or in this openssl configuration file. Note DNS.20 entry:

DNS.20      = *.mongo.spron.in

I’m going to use this wildcard entry to set up the replication between the nodes. The script also generates an ssl-secret.yaml file, which we are going to use on the Target side.

You need to upload the CA and the certificate with a private key to every Source replica set node and then define them in mongod.conf:

# network interfaces
net:
  ...
  tls:
    mode: preferTLS
    CAFile: /path/to/ca.pem
    certificateKeyFile: /path/to/mongod.pem

security:
  clusterAuthMode: x509
  authorization: enabled

Note that I also set clusterAuthMode to x509. It enforces the use of x509 authentication. Test it carefully in a non-production environment first, as it might break your existing replication.

Create System Users

Our Operator needs system users to manage the cluster and perform health checks. Usernames and passwords for system users should be the same on the Source and the Target. This script is going to generate the user-secret.yaml file to use on the Target and the mongo shell code to add the users on the Source (it is an example, do not use it in production).

Connect to the primary node on the Source and execute mongo shell commands generated by the script.
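
As a hedged illustration of the shape of those generated commands, here is one such statement; the user name, password, and role set below are placeholders, and the real script emits one statement per system user (cluster admin, cluster monitor, user admin, backup):

mongo admin --eval '
  db.createUser({
    user: "clusterAdmin",
    pwd:  "clusterAdminPassword123",
    roles: [ { role: "clusterAdmin", db: "admin" } ]
  })
'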

Prepare the Target

Apply Users and TLS secrets

System users’ credentials and TLS certificates must be the same on both sides. The scripts we used above generate Secret object manifests to use on the Target. Apply them:

$ kubectl apply -f ssl-secret.yaml
$ kubectl apply -f user-secret.yaml

Deploy the Operator and MongoDB Nodes

Please follow one of the installation guides to deploy the Operator. Usually, it is a one-step operation through kubectl:

$ kubectl apply -f operator.yaml

MongoDB nodes are deployed with a custom resource manifest – cr.yaml. The following configuration items in it are important:

spec:
  unmanaged: true

This flag instructs the Operator to deploy the nodes in unmanaged mode, meaning they are not configured to form a cluster. Also, the Operator does not generate TLS certificates and system users.

spec:
…
  updateStrategy: Never

Disable the Smart Update feature as the cluster is unmanaged.

spec:
…
  secrets:
    users: my-new-cluster-secrets
    ssl: my-custom-ssl
    sslInternal: my-custom-ssl-internal

This section points to the Secret objects that we created in the previous step.

spec:
…
  replsets:
  - name: rs0
    size: 3
    expose:
      enabled: true
      exposeType: LoadBalancer

Remember that the nodes need to be exposed and reachable. To do this, we create a Service per Pod. In our case, it is a LoadBalancer object, but it can be any other Service type.

spec:
...
  backup:
    enabled: false

If the cluster and nodes are unmanaged, the Operator should not be taking backups. 

Deploy unmanaged nodes with the following command:

$ kubectl apply -f cr.yaml

Once the nodes are up and running, also check the services. We will need the IP addresses of the new replicas to add them to the replica set on the Source later.

$ kubectl get services
NAME                    TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)           AGE
…
my-new-cluster-rs0-0    LoadBalancer   10.3.252.134   35.223.104.224   27017:31332/TCP   2m11s
my-new-cluster-rs0-1    LoadBalancer   10.3.244.166   34.134.210.223   27017:32647/TCP   81s
my-new-cluster-rs0-2    LoadBalancer   10.3.248.119   34.135.233.58    27017:32534/TCP   45s

Configure Domains

X509 authentication is strict and requires that the certificate’s common name or alternative name match the domain name of the node. As you remember, we had the wildcard *.mongo.spron.in included in our certificate. It can be any domain that you use, but make sure the certificate is issued for this domain.

I’m going to create A records pointing to the public IP addresses of the MongoDB nodes:

k8s-1.mongo.spron.in -> 35.223.104.224
k8s-2.mongo.spron.in -> 34.134.210.223
k8s-3.mongo.spron.in -> 34.135.233.58
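
If the zone happens to be hosted in Google Cloud DNS, a sketch like the following would create these records; the managed zone name spron-in is hypothetical, and any DNS provider works just as well:

gcloud dns record-sets transaction start --zone=spron-in
gcloud dns record-sets transaction add 35.223.104.224 --name=k8s-1.mongo.spron.in. --type=A --ttl=300 --zone=spron-in
gcloud dns record-sets transaction add 34.134.210.223 --name=k8s-2.mongo.spron.in. --type=A --ttl=300 --zone=spron-in
gcloud dns record-sets transaction add 34.135.233.58  --name=k8s-3.mongo.spron.in. --type=A --ttl=300 --zone=spron-in
gcloud dns record-sets transaction execute --zone=spron-in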

Replicate the Data to the Target

It is time to add our nodes in the Kubernetes cluster to the replica set. Log in to the mongo shell on the Source and execute the following:

rs.add({ host: "k8s-1.mongo.spron.in", priority: 0, votes: 0} )
rs.add({ host: "k8s-2.mongo.spron.in", priority: 0, votes: 0} )
rs.add({ host: "k8s-3.mongo.spron.in", priority: 0, votes: 0} )

If everything is done correctly, these nodes are going to be added as secondaries. You can check the status with the rs.status() command.

Cutover

Check that the newly added nodes are synchronized. The more data you have, the longer the synchronization process is going to take. To understand if the nodes are synchronized, compare the values of optime and optimeDate of the Primary node with the values for the Secondary nodes in the rs.status() output:

{
        "_id" : 0,
        "name" : "147.182.213.59:27017",
        "stateStr" : "PRIMARY",
...
        "optime" : {
                "ts" : Timestamp(1633697030, 1),
                "t" : NumberLong(2)
        },
        "optimeDate" : ISODate("2021-10-08T12:43:50Z"),
...
},
{
        "_id" : 1,
        "name" : "k8s-1.mongo.spron.in:27017",
        "stateStr" : "SECONDARY",
...
        "optime" : {
                "ts" : Timestamp(1633697030, 1),
                "t" : NumberLong(2)
        },
        "optimeDurable" : {
                "ts" : Timestamp(1633697030, 1),
                "t" : NumberLong(2)
        },
        "optimeDate" : ISODate("2021-10-08T12:43:50Z"),
...
},

When nodes are synchronized, we are ready to perform the cutover. Please ensure that your application is configured properly to minimize downtime during the cutover.

The cutover is going to have two steps:

  1. One of the nodes on the Target becomes the primary.
  2. Operator starts managing the cluster and nodes on the Source are no longer present in the replica set.

Switch the Primary

Connect with the mongo shell to the primary on the Source side and make one of the nodes on the Target the primary. This can be done by changing the replica set configuration:

cfg = rs.config()
cfg.members[1].priority = 2
cfg.members[1].votes = 1
rs.reconfig(cfg)

We enable voting and set the priority to two on one of the nodes in the Kubernetes cluster. The member id can be different for you, so please look carefully at the output of the rs.config() command.

Start Managing the Cluster

Once the primary is running in Kubernetes, we are going to tell the Operator to start managing the cluster. Change spec.unmanaged to false in the Custom Resource with the patch command:

$ kubectl patch psmdb my-cluster-name --type=merge -p '{"spec":{"unmanaged": false}}'

You can also do this by changing cr.yaml and applying it. That is it: now you have a cluster in Kubernetes which is managed by the Operator.

Conclusion

You truly start to appreciate Operators once you get used to them. When I was writing this blog post, I found it extremely annoying to deploy and configure a single MongoDB node on a Linux box; I don’t even want to think about doing that for a whole cluster. Operators abstract Kubernetes primitives and database configuration and provide you with a fully operational database service instead of a bunch of nodes. Migrating MongoDB to Kubernetes is a challenging task, but it is much simpler with the Operator. And once you are on Kubernetes, the Operator takes care of all day-2 operations as well.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions please raise the topic in the community forum.

Are you a developer looking to contribute? Please read our CONTRIBUTING.md and send us a Pull Request.

Percona Distribution for MongoDB Operator

The Percona Distribution for MongoDB Operator simplifies running Percona Server for MongoDB on Kubernetes and provides automation for day-1 and day-2 operations. It’s based on the Kubernetes API and enables highly available environments. Regardless of where it is used, the Operator creates a member that is identical to other members created with the same Operator. This provides an assured level of stability to easily build test environments or deploy a repeatable, consistent database environment that meets Percona expert-recommended best practices.


Oct 8, 2021

Disaster Recovery for MongoDB on Kubernetes

As per the glossary, Disaster Recovery (DR) protocols are an organization’s method of regaining access and functionality to its IT infrastructure after events like a natural disaster, a cyber attack, or even business disruptions related to the COVID-19 pandemic. When we talk about data, storing backups on remote servers is enough to pass DR compliance checks for some companies. But for others, Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) are extremely tight and require more than just a backup/restore procedure.

In this blog post, we are going to show you how to set up MongoDB on two distant Kubernetes clusters with Percona Distribution for MongoDB Operator to meet the toughest DR requirements.

What to Expect

Here is what we are going to do:

  1. Set up two Kubernetes clusters
  2. Deploy Percona Distribution for MongoDB Operator on both of them. The Disaster Recovery site will run a MongoDB cluster in unmanaged mode.
  3. Simulate a failure and perform a failover to the DR site

In the 1.10.0 version of the Operator, we have added the Technology Preview of the new feature which enables users to deploy unmanaged MongoDB nodes and connect them to existing Replica Sets.

MongoDB Kubernetes

Set it All Up

We are not going to cover the configuration of the Kubernetes clusters, but in our tests, we relied on two Google Kubernetes Engine (GKE) clusters deployed in different regions. Read more about GKE here.

Prepare Main Site

We have shared all the resources for this blog post in this GitHub repo. As a first step we are going to deploy the operator on the Main site:

$ kubectl apply -f bundle.yaml

Deploy the MongoDB managed cluster with cr-main.yaml:

$ kubectl apply -f cr-main.yaml

It is important to understand that we will need to expose ReplicaSet nodes through a dedicated service. This includes Config Servers. This is required to ensure that ReplicaSet nodes on Main and DR can reach each other. So it is like a full mesh:

ReplicaSet nodes

To get there, cr-main.yaml has the following changes:

spec:
  replsets:
  - name: rs0
    expose:
      enabled: true
      exposeType: LoadBalancer
  sharding:
    configsvrReplSet:
      expose:
        enabled: true
        exposeType: LoadBalancer

We are using the LoadBalancer Kubernetes Service object as it is just simpler for us, but there are other options – ClusterIP, NodePort. It is also possible to utilize 3rd party tools like Submariner to implement a private connection.

If you have an already running MongoDB cluster in Kubernetes, you can expose the ReplicaSets without downtime by changing these variables.
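
For example, a JSON patch along these lines flips only the expose settings without touching the rest of the spec; the cluster name my-cluster-name is an assumption, and the index 0 refers to the first entry in replsets:

$ kubectl patch psmdb my-cluster-name --type=json -p '[
  {"op": "add", "path": "/spec/replsets/0/expose", "value": {"enabled": true, "exposeType": "LoadBalancer"}},
  {"op": "add", "path": "/spec/sharding/configsvrReplSet/expose", "value": {"enabled": true, "exposeType": "LoadBalancer"}}
]'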

Prepare Disaster Recovery Site

The configuration of the Disaster Recovery site could be broken down into the following steps:

  1. Copy the Secrets from the Main cluster.
    1. system users secrets
    2. SSL keys – both used for external connections and internal replication traffic
  2. Tune Custom Resource:
    1. run nodes in unmanaged mode – Operator does not control replicaset configuration and secrets generation
    2. expose ReplicaSets (the same way we do it on the Main cluster)
    3. disable backups – backups can be only taken on the cluster managed by the Operator

Copy the Secrets

System users’ credentials are stored by default in the my-cluster-name-secrets Secret object and defined in spec.secrets.users. Apply this secret in the DR cluster with kubectl apply -f yaml-with-secrets. If you don’t have it in your source code repository or if you rely on the Operator to generate it, you can get the secret from Kubernetes itself, remove the unnecessary metadata, and apply it.

On the Main site, execute:

$ kubectl get secret my-cluster-name-secrets -o yaml > my-cluster-secrets.yaml

Now remove the following lines from metadata:

annotations
creationTimestamp
resourceVersion
selfLink
uid

Save the file and apply it to the DR cluster.

The procedure to copy SSL keys is almost the same as for users. The difference is the names of the Secret objects – they are usually called <CLUSTER_NAME>-ssl and <CLUSTER_NAME>-ssl-internal. It is also possible to specify them in secrets.ssl and secrets.sslInternal in the Custom Resource. Copy these two keys from Main to DR and reference them in the CR.
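
A minimal sketch of the whole copy, assuming kubectl contexts named main and dr (the context names are hypothetical, and the Secret names match the defaults mentioned above):

# dump the Secrets on the Main cluster
kubectl --context=main get secret my-cluster-name-secrets -o yaml > users.yaml
kubectl --context=main get secret my-cluster-name-ssl -o yaml > ssl.yaml
kubectl --context=main get secret my-cluster-name-ssl-internal -o yaml > ssl-internal.yaml

# strip annotations, creationTimestamp, resourceVersion, selfLink, and uid from each file,
# rename the TLS Secrets to whatever the DR Custom Resource references, then apply them
kubectl --context=dr apply -f users.yaml -f ssl.yaml -f ssl-internal.yaml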

Tune Custom Resource

cr-replica.yaml will have the following changes:

  secrets:
    users: my-cluster-name-secrets
    ssl: replica-cluster-ssl
    sslInternal: replica-cluster-ssl-internal

  replsets:
  - name: rs0
    size: 3
    expose:
      enabled: true
      exposeType: LoadBalancer

  sharding:
    enabled: true
    configsvrReplSet:
      size: 3
      expose:
        enabled: true
        exposeType: LoadBalancer

  backup:
    enabled: false

Once the Custom Resource is applied, the services are going to be created. We will need the IP addresses of each ReplicaSet node to configure the DR site.

$ kubectl get services
NAME                     TYPE           CLUSTER-IP       EXTERNAL-IP       PORT(S)           AGE
replica-cluster-cfg-0    LoadBalancer   10.111.241.213   34.78.119.1       27017:31083/TCP   5m28s
replica-cluster-cfg-1    LoadBalancer   10.111.243.70    35.195.138.253    27017:31957/TCP   4m52s
replica-cluster-cfg-2    LoadBalancer   10.111.246.94    146.148.113.165   27017:30196/TCP   4m6s
...
replica-cluster-rs0-0    LoadBalancer   10.111.241.41    34.79.64.213      27017:31993/TCP   5m28s
replica-cluster-rs0-1    LoadBalancer   10.111.242.158   34.76.238.149     27017:32012/TCP   4m47s
replica-cluster-rs0-2    LoadBalancer   10.111.242.191   35.195.253.107    27017:31209/TCP   4m22s

Add External Nodes to Main

At this step, we are going to add unmanaged nodes to the Replica Set on the Main site. In cr-main.yaml we should add externalNodes under replsets.[] and sharding.configsvrReplSet:

  replsets:
  - name: rs0
    externalNodes:
    - host: 34.79.64.213
      priority: 1
      votes: 1
    - host: 34.76.238.149
      priority: 1
      votes: 1
    - host: 35.195.253.107
      priority: 0
      votes: 0

  sharding:
    configsvrReplSet:
      externalNodes:
      - host: 34.78.119.1
        priority: 1
        votes: 1
      - host: 35.195.138.253
        priority: 1
        votes: 1
      - host: 146.148.113.165
        priority: 0
        votes: 0

Please note that we add three nodes, but only two of them are voters. We do this to avoid split-brain situations and to prevent a primary election from starting if the DR site is down or there is a network disruption between the Main and DR sites.

Failover

Once all the configuration above is applied, the situation will look like this:

Failover

We have three voters in the Main cluster and two voters in the replica cluster. That means the replica nodes won’t have a majority in case of a Main cluster failure, and they won’t be able to elect a new primary. Therefore, we need to step in and perform a manual failover.

Let’s kill the main cluster:

gcloud compute instances list | 
grep my-main-gke-demo | 
awk '{print $1}' | 
xargs gcloud compute instances delete --zone europe-west3-b

gcloud container node-pools delete \
--zone europe-west3-b \
--cluster my-main-gke-demo \
default-pool

I deleted the nodes and the node pool of the main Kubernetes cluster, so now the cluster is in an unhealthy state. Let’s see what mongos on the DR site says when we try to read or write through it (psmdb-tester can be found in the git repo as well):

% ./psmdb-tester
2021/09/03 18:19:19 Successfully connected and pinged 34.141.3.189:27017
2021/09/03 18:19:40 read failed: (FailedToSatisfyReadPreference) Encountered non-retryable error during query :: caused by :: Could not find host matching read preference { mode: "primary" } for set cfg
2021/09/03 18:19:49 write failed: (FailedToSatisfyReadPreference) Could not find host matching read preference { mode: "primary" } for set cfg

Disaster Recovery MongoDB

Normally, we can only alter the replica set configuration from the primary node but in this kind of situation where you don’t have a primary and only have a few surviving members, MongoDB allows us to force the reconfiguration from any alive member.

Let’s connect to one of the secondary nodes in the replica cluster and perform the failover:

kubectl exec -it psmdb-client-7b9f978649-pjb2k -- mongo 'mongodb://clusterAdmin:<pass>@replica-cluster-rs0-0.replica.svc.cluster.local/admin?ssl=false'
...
rs0:SECONDARY> cfg = rs.config()
rs0:SECONDARY> cfg.members = [cfg.members[3], cfg.members[4], cfg.members[5]]
rs0:SECONDARY> rs.reconfig(cfg, {force: true})

Note that the indexes of surviving members may differ in your environment. You should check rs.status() and rs.config() outputs first. The main idea is to repopulate config members with only surviving members.

After the reconfiguration, the replica set will have just three members, and two of them will have votes and form a majority. So, they’ll be able to elect a new primary. After performing the same process on the cfg replica set, we will be able to read and write through mongos again:

% ./psmdb-tester
2021/09/03 18:41:48 Successfully connected and pinged 34.141.3.189:27017
2021/09/03 18:41:49 read succeed
2021/09/03 18:41:50 read succeed
2021/09/03 18:41:51 read succeed
2021/09/03 18:41:52 read succeed
2021/09/03 18:41:53 read succeed
2021/09/03 18:41:54 read succeed
2021/09/03 18:41:55 read succeed
2021/09/03 18:41:56 read succeed
2021/09/03 18:41:57 read succeed
2021/09/03 18:41:58 read succeed
2021/09/03 18:41:58 write succeed

Once the replica cluster has become the primary, you should reconfigure all clients that connect to the old main cluster and point them to the DR site.
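
In practice, that usually means swapping the MongoDB URI in the application configuration for one that points at the mongos Service of the DR cluster, something along these lines (the address is the one psmdb-tester used above; the credentials and TLS options are placeholders that depend on your setup):

mongo "mongodb://myApp:myAppPassword@34.141.3.189:27017/admin?ssl=true"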

Conclusion

Disaster Recovery is important for business continuity. The goal of administrators and SREs is to have a plan in place. With the new release of Percona Distribution for MongoDB Operator, setting up DR is fast, automated, and enables IT teams to meet RTO and RPO requirements.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions please raise the topic in the community forum.

Are you a developer looking to contribute? Please read our CONTRIBUTING.md and send us a Pull Request.


Oct 7, 2021

Getting Started with ProxySQL in Kubernetes

There are plenty of ways to run ProxySQL in Kubernetes (K8S). For example, we can deploy sidecar containers on the application pods, or run a dedicated ProxySQL service with its own pods.

We are going to discuss the latter approach, which is more likely to be used when dealing with a large number of application pods. Remember that each ProxySQL instance runs a number of checks against the database backends. These checks monitor things like server status and replication lag. Having too many proxies can cause significant overhead.

Creating a Cluster

For the purpose of this example, I am going to deploy a test cluster in GKE. We need to follow these steps:

1. Create a cluster

gcloud container clusters create ivan-cluster --preemptible --project my-project --zone us-central1-c --machine-type n2-standard-4 --num-nodes=3

2. Configure command-line access

gcloud container clusters get-credentials ivan-cluster --zone us-central1-c --project my-project

3. Create a Namespace

kubectl create namespace ivantest-ns

4. Set the context to use our new Namespace

kubectl config set-context $(kubectl config current-context) --namespace=ivantest-ns

Dedicated Service Using a StatefulSet

One way to implement this approach is to have ProxySQL pods use persistent volumes to store the configuration. We can rely on ProxySQL Cluster mode to make sure the configuration is kept in sync.

For simplicity, we are going to use a ConfigMap with the initial config for bootstrapping the ProxySQL service for the first time.

Exposing the passwords in the ConfigMap is far from ideal, and so far the K8S community hasn’t made up its mind about how to implement Reference Secrets from ConfigMap.

1. Prepare a file for the ConfigMap

tee proxysql.cnf <<EOF
datadir="/var/lib/proxysql"
 
admin_variables=
{
    admin_credentials="admin:admin;cluster:secret"
    mysql_ifaces="0.0.0.0:6032"
    refresh_interval=2000
    cluster_username="cluster"
    cluster_password="secret"  
}
 
mysql_variables=
{
    threads=4
    max_connections=2048
    default_query_delay=0
    default_query_timeout=36000000
    have_compress=true
    poll_timeout=2000
    interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
    default_schema="information_schema"
    stacksize=1048576
    server_version="8.0.23"
    connect_timeout_server=3000
    monitor_username="monitor"
    monitor_password="monitor"
    monitor_history=600000
    monitor_connect_interval=60000
    monitor_ping_interval=10000
    monitor_read_only_interval=1500
    monitor_read_only_timeout=500
    ping_interval_server_msec=120000
    ping_timeout_server=500
    commands_stats=true
    sessions_sort=true
    connect_retries_on_failure=10
}
 
mysql_servers =
(
    { address="mysql1" , port=3306 , hostgroup=10, max_connections=100 },
    { address="mysql2" , port=3306 , hostgroup=20, max_connections=100 }
)
 
mysql_users =
(
    { username = "myuser", password = "password", default_hostgroup = 10, active = 1 }
)
 
proxysql_servers =
(
    { hostname = "proxysql-0.proxysqlcluster", port = 6032, weight = 1 },
    { hostname = "proxysql-1.proxysqlcluster", port = 6032, weight = 1 },
    { hostname = "proxysql-2.proxysqlcluster", port = 6032, weight = 1 }
)
EOF

2. Create the ConfigMap

kubectl create configmap proxysql-configmap --from-file=proxysql.cnf

3. Prepare a file with the StatefulSet

tee proxysql-ss-svc.yml <<EOF
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: proxysql
  labels:
    app: proxysql
spec:
  replicas: 3
  serviceName: proxysqlcluster
  selector:
    matchLabels:
      app: proxysql
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: proxysql
    spec:
      restartPolicy: Always
      containers:
      - image: proxysql/proxysql:2.3.1
        name: proxysql
        volumeMounts:
        - name: proxysql-config
          mountPath: /etc/proxysql.cnf
          subPath: proxysql.cnf
        - name: proxysql-data
          mountPath: /var/lib/proxysql
          subPath: data
        ports:
        - containerPort: 6033
          name: proxysql-mysql
        - containerPort: 6032
          name: proxysql-admin
      volumes:
      - name: proxysql-config
        configMap:
          name: proxysql-configmap
  volumeClaimTemplates:
  - metadata:
      name: proxysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 2Gi
---
apiVersion: v1
kind: Service
metadata:
  annotations:
  labels:
    app: proxysql
  name: proxysql
spec:
  ports:
  - name: proxysql-mysql
    nodePort: 30033
    port: 6033
    protocol: TCP
    targetPort: 6033
  - name: proxysql-admin
    nodePort: 30032
    port: 6032
    protocol: TCP
    targetPort: 6032
  selector:
    app: proxysql
  type: NodePort
EOF

4. Create the StatefulSet

kubectl create -f proxysql-ss-svc.yml

5. Prepare the definition of the headless Service (more on this later)

tee proxysql-headless-svc.yml <<EOF 
apiVersion: v1
kind: Service
metadata:
  name: proxysqlcluster
  labels:
    app: proxysql
spec:
  clusterIP: None
  ports:
  - port: 6032
    name: proxysql-admin
  selector:
    app: proxysql
EOF

6. Create the headless Service

kubectl create -f proxysql-headless-svc.yml

7. Verify the Services

kubectl get svc

NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                         AGE
proxysql          NodePort    10.3.249.158   <none>        6033:30033/TCP,6032:30032/TCP   12m
proxysqlcluster   ClusterIP   None           <none>        6032/TCP                        8m53s

Pod Name Resolution

By default, each pod has a DNS name associated in the form pod-ip-address.my-namespace.pod.cluster-domain.example.

The headless Service causes K8S to auto-create a DNS record with each pod’s FQDN as well. The result is we will have the following entries available:

proxysql-0.proxysqlcluster
proxysql-1.proxysqlcluster
proxysql-2.proxysqlcluster

We can then use these to set up the ProxySQL cluster (the proxysql_servers part of the configuration file).
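
A quick way to confirm these records exist is to resolve one of them from a throwaway pod; this is just a sanity-check sketch, and the pod name and image are arbitrary:

kubectl run -i --rm --tty dns-test --image=busybox --restart=Never -- nslookup proxysql-0.proxysqlcluster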

Connecting to the Service

To test the service, we can run a container that includes a MySQL client and connect its console output to our terminal. For example, use the following command (which also removes the container/pod after we exit the shell):

kubectl run -i --rm --tty percona-client --image=percona/percona-server:latest --restart=Never -- bash -il

The connections from other pods should be sent to the Cluster-IP and port 6033 and will be load balanced. We can also use the DNS name proxysql.ivantest-ns.svc.cluster.local that got auto-created.

mysql -umyuser -ppassword -h10.3.249.158 -P6033

Use port 30033 and the external IP of one of the worker nodes instead if the client is connecting from outside the Kubernetes cluster:

mysql -umyuser -ppassword -h<node-external-ip> -P30033

Cleanup Steps

In order to remove all the resources we created, run the following steps:

kubectl delete statefulsets proxysql
kubectl delete service proxysql
kubectl delete service proxysqlcluster

Final Words

We have seen one of the possible ways to deploy ProxySQL in Kubernetes. The approach presented here has a few shortcomings but is good enough for illustrative purposes. For a production setup, consider looking at the Percona Kubernetes Operators instead.


Sep 17, 2021

Migration of a MySQL Database to a Kubernetes Cluster Using Asynchronous Replication

Nowadays, more and more companies are thinking about migrating their infrastructure to Kubernetes. Databases are no exception. A lot of k8s operators have been created to simplify the different types of deployments and to perform routine day-to-day tasks like making backups, renewing certificates, and so on. If a few years ago nobody even wanted to hear about running databases in Kubernetes, everything has changed now.

At Percona, we have created a few very feature-rich k8s operators for Percona Server for MongoDB, PostgreSQL, and MySQL databases. Today we will talk about using cross-site replication, a new feature that was added to the latest release of Percona Distribution for MySQL Operator. This feature is based on MySQL’s asynchronous connection failover mechanism.

Cross-site replication involves configuring one Percona XtraDB Cluster or a single/several MySQL servers as the Source, and another Percona XtraDB Cluster (PXC) as a replica, to allow asynchronous replication between them. If the operator has several sources in the custom resource (CR), it will automatically handle connection failure of the source DB.

This cross-site replication feature is supported only since MySQL 8.0.23, but you can read about migrating earlier MySQL versions in this blog post.

The Goal

Migrate the MySQL database, which is deployed on-prem or in the cloud, to the Percona Distribution for MySQL Operator using asynchronous replication. This approach helps you reduce downtime and data loss for your application.

So, we have the following setup:

Migration of MySQL database to Kubernetes cluster using asynchronous replication

The following components are used:

1. MySQL 8.0.23 database (in my case it is Percona Server for MySQL), which is deployed in DO (as the Source), and Percona XtraBackup for the backup. In my test deployment, I use only one server as the Source to simplify the setup. Depending on your DB deployment topology, you can use several servers so that the asynchronous connection failover mechanism is used on the operator’s end.

2. Google Kubernetes Engine (GKE) cluster where Percona Distribution for MySQL Operator is deployed with PXC cluster (as a target).

3. AWS S3 bucket is used to save the backup from MySQL DB and then to restore the PXC cluster in k8s.

The following steps should be done to perform the migration procedure:

1. Make the MySQL database backup using Percona XtraBackup and upload it to the S3 bucket using xbcloud.

2. Perform the restore of the MySQL database from the S3 bucket into the PXC cluster which is deployed in k8s.

3. Configure asynchronous replication between MySQL server and PXC cluster managed by k8s operator.

As a result, we have asynchronous replication between MySQL server and PXC cluster in k8s which is in read-only mode.

Migration

Configure the target PXC cluster managed by k8s operator:

1. Deploy Percona Distribution for MySQL Operator on Kubernetes (I have used GKE 1.20).

# clone the git repository
git clone -b v1.9.0 https://github.com/percona/percona-xtradb-cluster-operator
cd percona-xtradb-cluster-operator

# deploy the operator
kubectl apply -f deploy/bundle.yaml

2. Create PXC cluster using the default custom resource manifest (CR).

# create my-cluster-secrets secret (do not use default passwords for production systems)
kubectl apply -f deploy/secrets.yaml

# create the cluster; by default it will be PXC 8.0.23
kubectl apply -f deploy/cr.yaml

3. Create the secret with credentials for the AWS S3 bucket which will be used for access to the S3 bucket during the restoration procedure.

# create S3-secret.yaml file with following content, and use correct credentials instead of XXXXXX

apiVersion: v1
kind: Secret
metadata:
  name: aws-s3-secret
type: Opaque
data:
  AWS_ACCESS_KEY_ID: XXXXXX
  AWS_SECRET_ACCESS_KEY: XXXXXX

# create secret
kubectl apply -f S3-secret.yaml

Configure the Source MySQL Server

1. Install Percona Server for MySQL 8.0.23 and Percona XtraBackup for the backup. Refer to the Installing Percona Server for MySQL and Installing Percona XtraBackup chapters in the documentation for installation instructions.


NOTE:
You need to add the following options to my.cnf to enable GTID support; otherwise, replication will not work because GTID-based replication is used by the PXC cluster by default.

[mysqld]
enforce_gtid_consistency=ON
gtid_mode=ON

2. Create all the needed users which will be used by the k8s operator; the passwords should be the same as in deploy/secrets.yaml. Also, please note that the password for the root user should be the same as in the deploy/secrets.yaml file for the k8s secret. In my case, I used our default passwords from the deploy/secrets.yaml file.

CREATE USER 'monitor'@'%' IDENTIFIED BY 'monitory' WITH MAX_USER_CONNECTIONS 100;
GRANT SELECT, PROCESS, SUPER, REPLICATION CLIENT, RELOAD ON *.* TO 'monitor'@'%';
GRANT SERVICE_CONNECTION_ADMIN ON *.* TO 'monitor'@'%';

CREATE USER 'operator'@'%' IDENTIFIED BY 'operatoradmin';
GRANT ALL ON *.* TO 'operator'@'%' WITH GRANT OPTION;

CREATE USER 'xtrabackup'@'%' IDENTIFIED BY 'backup_password';
GRANT ALL ON *.* TO 'xtrabackup'@'%';

CREATE USER 'replication'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* to 'replication'@'%';
FLUSH PRIVILEGES;

3. Make a backup of the MySQL database using the XtraBackup tool and upload it to the S3 bucket.

# export aws credentials
export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXX

# make the backup
xtrabackup --backup --stream=xbstream --target-dir=/tmp/backups/ --extra-lsndir=/tmp/backups/  --password=root_password | xbcloud put --storage=s3 --parallel=10 --md5 --s3-bucket="mysql-testing-bucket" "db-test-1"

Now, everything is ready to perform the restore of the backup on the target database. So, let’s get back to our k8s cluster.

Configure the Asynchronous Replication to the Target PXC Cluster

If you have a completely clean source database (without any data), you can skip the steps related to backup and restoration of the database. Otherwise, do the following:

1. Restore the backup from the S3 bucket using the following manifest:

# create restore.yml file with following content

apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterRestore
metadata:
  name: restore1
spec:
  pxcCluster: cluster1
  backupSource:
    destination: s3://mysql-testing-bucket/db-test-1
    s3:
      credentialsSecret: aws-s3-secret
      region: us-east-1

# trigger the restoration procedure
kubectl apply -f restore.yml
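
The restore runs asynchronously, so it can be useful to keep an eye on it; a hedged way to do that (object names taken from the manifests above):

# watch the restore object until it reports completion
kubectl get perconaxtradbclusterrestore restore1

# the cluster itself should come back to the ready state afterwards
kubectl get pxc cluster1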

As a result, you will have a PXC cluster with data from the source DB. Now everything is ready to configure the replication.

2. Edit the custom resource manifest deploy/cr.yaml to configure the spec.pxc.replicationChannels section:

spec:
  ...
  pxc:
    ...
    replicationChannels:
    - name: ps_to_pxc1
      isSource: false
      sourcesList:
        - host: <source_ip>
          port: 3306
          weight: 100

# apply CR
kubectl apply -f deploy/cr.yaml


Verify the Replication

In order to check the replication, connect to a PXC node and run the following command:

kubectl exec -it cluster1-pxc-0 -c pxc -- mysql -uroot -proot_password -e "show replica status \G"
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: <ip-of-source-db>
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000004
          Read_Master_Log_Pos: 529
               Relay_Log_File: cluster1-pxc-0-relay-bin-ps_to_pxc1.000002
                Relay_Log_Pos: 738
        Relay_Master_Log_File: binlog.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 529
              Relay_Log_Space: 969
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 9741945e-148d-11ec-89e9-5ee1a3cf433f
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 3
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 9741945e-148d-11ec-89e9-5ee1a3cf433f:1-2
            Executed_Gtid_Set: 93f1e7bf-1495-11ec-80b2-06e6016a7c3d:1,
9647dc03-1495-11ec-a385-7e3b2511dacb:1-7,
9741945e-148d-11ec-89e9-5ee1a3cf433f:1-2
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: ps_to_pxc1
           Master_TLS_Version:
       Master_public_key_path:
        Get_master_public_key: 0
            Network_Namespace:

Also, you can verify the replication by checking that the data is changing.
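
For example, a quick spot check could look like this; the database and table names are made up for illustration:

# on the Source server: write a row
mysql -uroot -p -e "CREATE DATABASE IF NOT EXISTS migration_check; CREATE TABLE IF NOT EXISTS migration_check.t (id INT PRIMARY KEY); INSERT INTO migration_check.t VALUES (1);"

# on the PXC cluster in k8s: the row should appear shortly after
kubectl exec -it cluster1-pxc-0 -c pxc -- mysql -uroot -proot_password -e "SELECT * FROM migration_check.t;"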

Promote the PXC Cluster as a Primary

As soon as you are ready to stop the replication and promote the PXC cluster in k8s to be the primary DB (i.e., your application has been reconfigured and is ready to work with the new DB), you need to perform the following simple actions:

1. Edit deploy/cr.yaml and comment out the replicationChannels section:

spec:
  ...
  pxc:
    ...
    #replicationChannels:
    #- name: ps_to_pxc1
    #  isSource: false
    #  sourcesList:
    #    - host: <source_ip>
    #      port: 3306
    #      weight: 100

2. Stop the mysqld service on the source server to be sure that no new data is written:

 systemctl stop mysqld

3. Apply a new CR for k8s operator.

# apply CR
kubectl apply -f deploy/cr.yaml

As a result, replication is stopped and the read-only mode is disabled for the PXC cluster.
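
A hedged way to double-check the result is to confirm that the replication channel is gone and that the cluster accepts writes:

kubectl exec -it cluster1-pxc-0 -c pxc -- mysql -uroot -proot_password \
  -e "SELECT @@read_only; SHOW REPLICA STATUS \G"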

Conclusion

Technologies are changing so fast that a migration procedure to a k8s cluster, which seems very complex at first sight, turns out to be neither so difficult nor so time-consuming. But you need to keep in mind that significant changes were made. First, you migrate the database to a PXC cluster, which has some peculiarities, and, of course, to Kubernetes itself. If you are ready, you can start the journey to Kubernetes right now.

The Percona team is ready to guide you during this journey. If you have any questions,  please raise the topic in the community forum.

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

Learn More About Percona Kubernetes Operators

Sep 16, 2021

Mirantis launches cloud-native data center-as-a-service software

Mirantis has been around the block, starting way back as an OpenStack startup, but a few years ago the company began to embrace cloud-native development technologies like containers, microservices and Kubernetes. Today, it announced Mirantis Flow, a fully managed open source set of services designed to help companies manage a cloud-native data center environment, whether your infrastructure lives on-prem or in a public cloud.

“We’re about delivering to customers an open source-based cloud-to-cloud experience in the data center, on the edge, and interoperable with public clouds,” Adrian Ionel, CEO and co-founder at Mirantis explained.

He points out that the biggest companies in the world, the hyperscalers like Facebook, Netflix and Apple, have all figured out how to manage in a hybrid cloud-native world, but most companies lack the resources of these large organizations. Mirantis Flow is aimed at putting these same types of capabilities that the big companies have inside these more modest organizations.

While the large infrastructure cloud vendors like Amazon, Microsoft and Google have been designed to help with this very problem, Ionel says that these tend to be less open and more proprietary. That can lead to lock-in, which today’s large organizations are looking desperately to avoid.

“[The large infrastructure vendors] will lock you into their stack and their APIs. They’re not based on open source standards or technology, so you are locked in your single source, and most large enterprises today are pursuing a multi-cloud strategy. They want infrastructure flexibility,” he said. He added, “The idea here is to provide a completely open and flexible zero lock-in alternative to the [big infrastructure providers, but with the] same cloud experience and same pace of innovation.”

They do this by putting together a stack of open source solutions in a single service. “We provide virtualization on top as part of the same fabric. We also provide software-defined networking, software-defined storage and CI/CD technology with DevOps as a service on top of it, which enables companies to automate the entire software development pipeline,” he said.

As the company describes the service in a blog post published today, it includes “Mirantis Container Cloud, Mirantis OpenStack and Mirantis Kubernetes Engine, all workloads are available for migration to cloud native infrastructure, whether they are traditional virtual machine workloads or containerized workloads.”

For companies worried about migrating their VMware virtual machines to this solution, Ionel says they have been able to move these VMs to the Mirantis solution in early customers. “This is a very, very simple conversion of the virtual machine from VMware standard to an open standard, and there is no reason why any application and any workload should not run on this infrastructure — and we’ve seen it over and over again in many many customers. So we don’t see any bottlenecks whatsoever for people to move right away,” he said.

It’s important to note that this solution does not include hardware. It’s about bringing your own hardware infrastructure, either physical or as a service, or using a Mirantis partner like Equinix. The service is available now for $15,000 per month or $180,000 annually, which includes: 1,000 core/vCPU licenses for access to all products in the Mirantis software suite plus support for 20 virtual machine (VM) migrations or application onboarding and unlimited 24×7 support. The company does not charge any additional fees for control plane and management software licenses.

Aug 19, 2021

Dynamic User Creation with MySQL on Kubernetes and Hashicorp Cloud Platform Vault

MySQL Kubernetes Hashicorp Cloud

You may have already seen this document which describes the integration between HashiCorp Vault and Percona Distribution for MySQL Operator to enable data-at-rest encryption for self-managed Vault deployments.  In April 2021, HashiCorp announced a fully managed offering, HashiCorp Cloud Platform Vault (HCP Vault), that simplifies deployment and management of the Vault.

With that in mind, I’m going to talk about the integration between Percona and HCP Vault to provide dynamic user creation for MySQL.

Without dynamic credentials, organizations are susceptible to a breach due to secrets sprawl across different systems, files, and repositories. Dynamic credentials provide a secure way of connecting to the database by using a unique password for every login or service account. With Vault, these just-in-time credentials are stored securely and it is also possible to set a lifetime for them.

Goal

My goal is to provision users on my MySQL cluster deployed in Kubernetes with dynamic credentials through HashiCorp Vault.

MySQL cluster deployed in Kubernetes with dynamic credentials through Hashicorp Vault

  1. Percona Operator deploys Percona XtraDB Cluster and HAProxy
  2. HashiCorp Vault connects to MySQL through HAProxy and creates users with specific grants
  3. An application or user can connect to the myapp database using dynamic credentials created by Vault

Before You Begin

Prerequisites

  • HCP Vault account
  • Kubernetes cluster

Networking

Right now HCP deploys Vault in HashiCorp’s Amazon account in a private Virtual Private Network. For now, to establish a private connection between the Vault and your application, you would need to have an AWS account, a VPC, and either a peering or Transit Gateway connection.

For the sake of simplicity in this blog post, I’m going to expose the Vault publicly, which is not recommended for production but allows me to configure Vault from anywhere.

More clouds and networking configurations are on the way. Stay tuned to HashiCorp news.

Set it All Up

MySQL

To deploy Percona Distribution for MySQL on Kubernetes, please follow our documentation. The only requirement is to have HAProxy exposed via a public load balancer, which is controlled by a couple of fields in the Custom Resource (deploy/cr.yaml).
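
A minimal sketch of those fields, assuming the defaults from the Operator’s cr.yaml (field names may differ slightly between Operator versions):

spec:
  haproxy:
    enabled: true
    serviceType: LoadBalancer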

For simplicity, I have shared two required YAMLs in this GitHub repository. Deploying them would provision the Percona XtraDB Cluster on Kubernetes exposed publicly:

kubectl apply -f bundle.yaml
kubectl apply -f cr.yaml

Once the cluster is ready, get the public address:

$ kubectl get pxc
NAME       ENDPOINT       STATUS   PXC   PROXYSQL   HAPROXY   AGE
cluster1   35.223.41.79   ready    3                3         4m43s

Remember the ENDPOINT address; we will need it below to configure HCP Vault.

Create the User and the Database

I’m going to create a MySQL user which is going to be used by HCP Vault to create users dynamically, as well as an empty database called ‘myapp’ to which these users are going to have access.

Get the current root password from the Secret object:

$ kubectl get secrets my-cluster-secrets -o yaml | awk '$1~/root/ {print $2}' | base64 --decode && echo
Jw6OYIsUJeAQQapk

Connect to MySQL directly or by executing into the container:

kubectl exec -ti cluster1-pxc-0 -c pxc -- bash
mysql -u root -p -h 35.223.41.79

Create the database user and the database:

mysql> create user hcp identified by 'superduper';
Query OK, 0 rows affected (0.04 sec)

mysql> grant select, insert, update, delete, drop, create, alter, create user on *.* to hcp with grant option;
Query OK, 0 rows affected (0.04 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.02 sec)

mysql> create database myapp;
Query OK, 1 row affected (0.02 sec)

Hashicorp Cloud Platform Vault

Setting up Vault on HCP is a few-click process that is described here.

As I mentioned before, for the sake of simplicity HCP Vault is going to be publicly accessible. To do that, go to your Vault cluster in HCP UI, click Manage and Edit Configuration:

vault cluster

Enable the knob to expose the cluster publicly:

configure cluster

Now let’s get the Admin token for Vault. Navigate to the overview dashboard of your Vault cluster and click Generate token:

Vault CLI

The Vault is publicly accessible and you have the Admin token. Let’s configure it with the vault CLI tool. Install it by following the manual here.

Try to log in:

export VAULT_ADDR="https://vault-cluster.SOMEURL.hashicorp.cloud:8200"
export VAULT_NAMESPACE="admin"

vault login
Token (will be hidden):
Success! You are now authenticated. The token information displayed below
is already stored in the token helper. You do NOT need to run "vault login"
again. Future Vault requests will automatically use this token.
...

Connecting the Dots

It is time to connect Vault with the MySQL database in Kubernetes and start provisioning users. We are going to rely on Vault’s Databases Secrets engine.

1. Enable database secrets engine:

vault secrets enable database

2. Point Vault to MySQL and store the configuration:

vault write database/config/myapp plugin_name=mysql-database-plugin \
connection_url="{{username}}:{{password}}@tcp(35.223.41.79:3306)/" \
allowed_roles="mysqlrole" \
username="hcp" \
password="superduper"
Success! Data written to: database/config/myapp

3. Create the role:

vault write database/roles/mysqlrole db_name=myapp \
creation_statements="CREATE USER '{{name}}'@'%' IDENTIFIED BY '{{password}}'; GRANT select, insert, update, delete, drop, create, alter ON myapp.* TO '{{name}}'@'%';" \
default_ttl="1h" \
max_ttl="24h"
Success! Data written to: database/roles/mysqlrole

This role does the following:

  • Creates the user with a random name and password
  • The user has grants to myapp database
  • By default, the user exists for one hour, but time-to-live can be extended to 24 hours.

Now to create the temporary user just execute the following:

vault read database/creds/mysqlrole
Key                Value
---                -----
lease_id           database/creds/mysqlrole/MpO5oMsd1A0uyXT8d7R6sxVe.slbaC
lease_duration     1h
lease_renewable    true
password           Gmx6fv89BL4qHbFokG-p
username           v-token-hcp--mysqlrole-EMt7xeECd

It is now possible to connect to myapp database using the credentials provided above.
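
For example, using the credentials returned above against the HAProxy endpoint we noted earlier (the username, password, and address will obviously differ in your environment):

mysql -u v-token-hcp--mysqlrole-EMt7xeECd -p'Gmx6fv89BL4qHbFokG-p' -h 35.223.41.79 myapp -e "SHOW TABLES;"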

Conclusion

Dynamic credentials can be an essential part of your company’s security framework: they help you avoid a breach due to secrets sprawl and data leaks, and maintain data integrity and consistency. You can similarly integrate HashiCorp Vault with any Percona Kubernetes Operator – for MongoDB, MySQL, and PostgreSQL.

We encourage you to try it out to keep your data safe. Let us know if you faced any issues by submitting the topic to our Community Forum.

Percona Distribution for MySQL Operator

The Percona Distribution for MySQL Operator simplifies running Percona XtraDB Cluster on Kubernetes and provides automation for day-1 and day-2 operations. It’s based on the Kubernetes API and enables highly available environments. Regardless of where it is used, the Operator creates a member that is identical to other members created with the same Operator. This provides an assured level of stability to easily build test environments or deploy a repeatable, consistent database environment that meets Percona expert-recommended best practices.

Hashicorp Vault

Hashicorp Vault is an identity-based security solution that secures, stores, and tightly controls access to tokens, passwords, and other secrets with both open-source and enterprise offerings for self-managed security automation. In April 2021, HashiCorp announced a fully managed offering, HashiCorp Cloud Platform Vault (HCP Vault), that simplifies deployment and management of the Vault.

Aug 16, 2021

Cisco beefing up app monitoring portfolio with acquisition of Epsagon for $500M

Cisco announced on Friday that it’s acquiring Israeli applications-monitoring startup Epsagon at a price pegged at $500 million. The purchase gives Cisco a more modern microservices-focused component for its growing applications-monitoring portfolio.

The Israeli business publication Globes reported it had gotten confirmation from Cisco that the deal was for $500 million, but Cisco would not confirm that price with TechCrunch.

The acquisition comes on top of a couple of other high-profile app-monitoring deals, including AppDynamics, which the company bought in 2017 for $3.7 billion, and ThousandEyes, which it nabbed last year for $1 billion.

With Epsagon, the company is getting a way to monitor more modern applications built with containers and Kubernetes. Epsagon’s value proposition is a solution built from the ground up to monitor these kinds of workloads, giving users tracing and metrics, something that’s not always easy to do given the ephemeral nature of containers.

As Cisco’s Liz Centoni wrote in a blog post announcing the deal, Epsagon adds to the company’s concept of a full-stack offering in their applications-monitoring portfolio. Instead of having a bunch of different applications monitoring tools for different tasks, the company envisions one that works together.

“Cisco’s approach to full-stack observability gives our customers the ability to move beyond just monitoring to a paradigm that delivers shared context across teams and enables our customers to deliver exceptional digital experiences, optimize for cost, security and performance and maximize digital business revenue,” Centoni wrote.

That experience point is particularly important because when an application isn’t working, it isn’t happening in a vacuum. It has a cascading impact across the company, possibly affecting the core business itself and certainly causing customer distress, which could put pressure on customer service to field complaints, and the site reliability team to fix it. In the worst case, it could result in customer loss and an injured reputation.

If the application-monitoring system can act as an early warning system, it could help prevent the site or application from going down in the first place, and when it does go down, help track the root cause to get it up and running more quickly.

The challenge here for Cisco is incorporating Epsagon into the existing components of the application-monitoring portfolio and delivering that unified monitoring experience without making it feel like a Frankenstein’s monster of a solution globbed together from the various pieces.

Epsagon launched in 2018 and has raised $30 million. According to a report in the Israeli publication, Calcalist, the company was on the verge of a big Series B round with a valuation in the range of $200 million when it accepted this offer. It certainly seems to have given its early investors a good return. The deal is expected to close later this year.

Aug 13, 2021

Migrating PostgreSQL to Kubernetes

More and more companies are adopting Kubernetes. For some, it is about being cutting-edge; for others, it is a well-defined strategy and a business transformation. Developers and operations teams all over the world are struggling with moving applications that aren’t cloud-native friendly to containers and Kubernetes.

Migrating databases is always a challenge, which comes with risks and downtime for businesses. Today I’m going to show how easy it is to migrate a PostgreSQL database to Kubernetes with minimal downtime using Percona Distribution for PostgreSQL Operator.

Goal

To perform the migration I’m going to use the following setup:

Migrating PostgreSQL to Kubernetes

  1. PostgreSQL database deployed on-prem or somewhere in the cloud. It will be the Source.
  2. Google Kubernetes Engine (GKE) cluster where Percona Operator deploys and manages PostgreSQL cluster (the Target) and pgBackRest Pod
  3. PostgreSQL backups and Write Ahead Logs are uploaded to some Object Storage bucket (GCS in my case)
  4. pgBackRest Pod reads the data from the bucket
  5. pgBackRest Pod restores the data continuously to the PostgreSQL cluster in Kubernetes

The data should be continuously synchronized. In the end, I want to shut down PostgreSQL running on-prem and only keep the cluster in GKE.

Migration

Prerequisites

To replicate the setup you will need the following:

  • PostgreSQL (v 12 or 13) running somewhere
  • pgBackRest installed
  • Google Cloud Storage or any S3 bucket. My examples will be about GCS.
  • Kubernetes cluster

Configure The Source

I have Percona Distribution for PostgreSQL version 13 running on some Linux machines.

1. Configure pgBackRest

# cat /etc/pgbackrest.conf
[global]
log-level-console=info
log-level-file=debug
start-fast=y

[db]
pg1-path=/var/lib/postgresql/13/main
repo1-type=gcs
repo1-gcs-bucket=sp-test-1
repo1-gcs-key=/tmp/gcs.key
repo1-path=/on-prem-pg

  • pg1-path should point to the PostgreSQL data directory
  • repo1-type is set to GCS as we want our backups to go there
  • The key is in the /tmp/gcs.key file. The key can be obtained through the Google Cloud UI. Read more about it here.
  • The backups are going to be stored in the on-prem-pg folder in the sp-test-1 bucket

2. Edit the postgresql.conf config to enable archival through pgBackRest:

archive_mode = on   
archive_command = 'pgbackrest --stanza=db archive-push %p'

Restart is required after changing the configuration.
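
After restarting PostgreSQL, the pgBackRest stanza also has to exist in the repository before any backups can be taken (the Common Issues section below shows the error you get when it does not). A minimal sketch of these steps on the Source, assuming systemd and the [db] stanza from the configuration above:

sudo systemctl restart postgresql

# create the stanza in the GCS repository and verify that WAL archiving works
pgbackrest --stanza=db stanza-create
pgbackrest --stanza=db check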

3. The Operator requires a postgresql.conf file in the data directory. It is enough to have an empty file:

touch /var/lib/postgresql/13/main/postgresql.conf

4. A primaryuser must be created on the Source to ensure replication is correctly set up by the Operator.

# create user primaryuser with encrypted password '<PRIMARYUSER PASSWORD>' replication;
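
The Target restores from the backups in the bucket, so the Source needs at least one full backup there before the standby cluster is created. A sketch, assuming the [db] stanza from the pgBackRest configuration above:

pgbackrest --stanza=db backup --type=full

# confirm the backup is visible in the repository
pgbackrest --stanza=db info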

Configure The Target

1. Deploy Percona Distribution for PostgreSQL Operator on Kubernetes. Read more about it in the documentation here.

# create the namespace
kubectl create namespace pgo

# clone the git repository
git clone -b v0.2.0 https://github.com/percona/percona-postgresql-operator/
cd percona-postgresql-operator

# deploy the operator
kubectl apply -f deploy/operator.yaml
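
Before moving on, it may be worth checking that the Operator Pod came up, assuming it was deployed into the pgo namespace created above:

kubectl -n pgo get pods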

2. Edit the main custom resource manifest – deploy/cr.yaml.

  • I’m not going to change the cluster name and will keep it cluster1
  • The cluster is going to operate in Standby mode, which means it is going to sync the data from the GCS bucket. Set spec.standby to true.
  • Configure GCS itself. The spec.backup section would look like this (bucket and repoPath are the same as in the pgBackRest configuration above):

backup:
...
    repoPath: "/on-prem-pg"
...
    storages:
      my-s3:
        type: gcs
        endpointUrl: https://storage.googleapis.com
        region: us-central1-a
        uriStyle: path
        verifyTLS: false
        bucket: sp-test-1
    storageTypes: [
      "gcs"
    ]

  • I would like to have at least one Replica in my PostgreSQL cluster. Set spec.pgReplicas.hotStandby.size to 1.
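
Putting these edits together, the relevant fragments of deploy/cr.yaml look roughly like this (abbreviated; the backup.storages block is the full one shown above):

spec:
  standby: true
  backup:
    repoPath: "/on-prem-pg"
    storages:
      my-s3:
        type: gcs
        bucket: sp-test-1
    storageTypes: ["gcs"]
  pgReplicas:
    hotStandby:
      size: 1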

3. The Operator should be able to authenticate with GCS. To do that, we need to create a Secret object called <CLUSTERNAME>-backrest-repo-config with gcs-key in its data. It should be the same key we used on the Source. See the example of this secret here.

kubectl apply -f gcs.yaml
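
For reference, a minimal sketch of what such a gcs.yaml could look like, assuming the cluster name cluster1, the pgo namespace, and a base64-encoded service account key:

apiVersion: v1
kind: Secret
metadata:
  name: cluster1-backrest-repo-config
  namespace: pgo
type: Opaque
data:
  gcs-key: <base64-encoded contents of gcs.key>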

4. Create users by creating Secret objects for postgres and primaryuser (the one we created on the Source). See the examples of users Secrets here. The passwords should be the same as on the Source.

kubectl apply -f users.yaml
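
The exact Secret names and data keys the Operator expects are shown in the linked examples; purely as an illustration of the shape (names here are hypothetical), users.yaml for the postgres user could look like this, with an analogous Secret for primaryuser:

apiVersion: v1
kind: Secret
metadata:
  name: cluster1-postgres-secret
  namespace: pgo
type: Opaque
stringData:
  username: postgres
  password: <SAME PASSWORD AS ON THE SOURCE>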

5. Now let’s deploy our cluster on Kubernetes by applying the cr.yaml:

kubectl apply -f deploy/cr.yaml

Verify and Troubleshoot

If everything is done correctly, you should see the following in the Primary Pod logs:

kubectl -n pgo logs -f --tail=20 cluster1-5dfb96f77d-7m2rs
2021-07-30 10:41:08,286 INFO: Reaped pid=548, exit status=0
2021-07-30 10:41:08,298 INFO: establishing a new patroni connection to the postgres cluster
2021-07-30 10:41:08,359 INFO: initialized a new cluster
Fri Jul 30 10:41:09 UTC 2021 INFO: PGHA_INIT is 'true', waiting to initialize as primary
Fri Jul 30 10:41:09 UTC 2021 INFO: Node cluster1-5dfb96f77d-7m2rs fully initialized for cluster cluster1 and is ready for use
2021-07-30 10:41:18,781 INFO: Lock owner: cluster1-5dfb96f77d-7m2rs; I am cluster1-5dfb96f77d-7m2rs
2021-07-30 10:41:18,810 INFO: no action.  i am the standby leader with the lock
2021-07-30 10:41:28,781 INFO: Lock owner: cluster1-5dfb96f77d-7m2rs; I am cluster1-5dfb96f77d-7m2rs
2021-07-30 10:41:28,832 INFO: no action.  i am the standby leader with the lock

Change some data on the Source and ensure that it is properly synchronized to the Target cluster.
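
For example, something along these lines (the table name is just a placeholder):

# on the Source
psql -c "CREATE TABLE migration_check (id int); INSERT INTO migration_check VALUES (1);"

# on the Target (connect with kubectl exec into the primary Pod or through its Service)
psql -c "SELECT * FROM migration_check;"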

Common Issues

The following error message indicates that you forgot to create the postgresql.conf file in the data directory:

FileNotFoundError: [Errno 2] No such file or directory: '/pgdata/cluster1/postgresql.conf' -> '/pgdata/cluster1/postgresql.base.conf'

Sometimes it is easy to forget to create the primaryuser and see the following in the logs:

psycopg2.OperationalError: FATAL:  password authentication failed for user "primaryuser"

Wrong or missing object store credentials will trigger the following error:

WARN: repo1: [CryptoError] unable to load info file '/on-prem-pg/backup/db/backup.info' or '/on-prem-pg/backup/db/backup.info.copy':
      CryptoError: raised from remote-0 protocol on 'cluster1-backrest-shared-repo': unable to read PEM: [218529960] wrong tag
      HINT: is or was the repo encrypted?
      CryptoError: raised from remote-0 protocol on 'cluster1-backrest-shared-repo': unable to read PEM: [218595386] nested asn1 error
      HINT: is or was the repo encrypted?
      HINT: backup.info cannot be opened and is required to perform a backup.
      HINT: has a stanza-create been performed?
ERROR: [075]: no backup set found to restore
Fri Jul 30 10:54:00 UTC 2021 ERROR: pgBackRest standby Creation: pgBackRest restore failed when creating standby

Cutover

Everything looks good and it is time to perform the cutover. In this blog post, I cover only the database side but do not forget that your application should be reconfigured to point to the correct PostgreSQL cluster. It might be a good idea to stop the application before the cutover.

1. Stop the source PostgreSQL cluster to ensure no data is written

systemctl stop postgresql

2. Promote the Target cluster to primary. To do that, remove spec.backup.repoPath, change spec.standby to false in deploy/cr.yaml, and apply the changes:

kubectl apply -f deploy/cr.yaml

PostgreSQL will be restarted automatically and you will see the following in the logs:

2021-07-30 11:16:20,020 INFO: updated leader lock during promote
2021-07-30 11:16:20,025 INFO: Changed archive_mode from on to True (restart might be required)
2021-07-30 11:16:20,025 INFO: Changed max_wal_senders from 10 to 6 (restart might be required)
2021-07-30 11:16:20,027 INFO: Reloading PostgreSQL configuration.
server signaled
2021-07-30 11:16:21,037 INFO: Lock owner: cluster1-5dfb96f77d-n4c79; I am cluster1-5dfb96f77d-n4c79
2021-07-30 11:16:21,132 INFO: no action.  i am the leader with the lock

Conclusion

Deploying and managing database clusters is not an easy task. The recently released Percona Distribution for PostgreSQL Operator automates day-1 and day-2 operations and turns running PostgreSQL on Kubernetes into a smooth and pleasant journey.

With Kubernetes becoming the default control plane, one of the most common tasks for developers and operations teams is migrating existing databases to it, which usually turns into a complex project. This blog post shows that database migration can be an easy task with minimal downtime.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions please raise the topic in the community forum.

Are you a developer and looking to contribute? Please read our CONTRIBUTING.md and send the Pull Request.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.

Download Percona Distribution for PostgreSQL Today!

Aug
10
2021
--

VCs are betting big on Kubernetes: Here are 5 reasons why

I worked at Google for six years. Internally, you have no choice — you must use Kubernetes if you are deploying microservices and containers (it’s actually not called Kubernetes inside of Google; it’s called Borg). But what was once solely an internal project at Google has since been open-sourced and has become one of the most talked about technologies in software development and operations.

For good reason. One person with a laptop can now accomplish what used to take a large team of engineers. At times, Kubernetes can feel like a superpower, but with all of the benefits of scalability and agility comes immense complexity. The truth is, very few software developers truly understand how Kubernetes works under the hood.

I like to use the analogy of a watch. From the user’s perspective, it’s very straightforward until it breaks. To actually fix a broken watch requires expertise most people simply do not have — and I promise you, Kubernetes is much more complex than your watch.

How are most teams solving this problem? The truth is, many of them aren’t. They often adopt Kubernetes as part of their digital transformation only to find out it’s much more complex than they expected. Then they have to hire more engineers and experts to manage it, which in a way defeats its purpose.

Where you see containers, you see Kubernetes to help with orchestration. According to Datadog’s most recent report about container adoption, nearly 90% of all containers are orchestrated.

All of this means there is a great opportunity for DevOps startups to come in and address the different pain points within the Kubernetes ecosystem. This technology isn’t going anywhere, so any platform or tooling that helps make it more secure, simple to use and easy to troubleshoot will be well appreciated by the software development community.

In that sense, there’s never been a better time for VCs to invest in this ecosystem. It’s my belief that Kubernetes is becoming the new Linux: 96.4% of the top million web servers’ operating systems are Linux. Similarly, Kubernetes is trending to become the de facto operating system for modern, cloud-native applications. It is already the most popular open-source project within the Cloud Native Computing Foundation (CNCF), with 91% of respondents using it — a steady increase from 78% in 2019 and 58% in 2018.

While the technology is proven and adoption is skyrocketing, there are still some fundamental challenges that will undoubtedly be solved by third-party solutions. Let’s go deeper and look at five reasons why we’ll see a surge of startups in this space.

 

Containers are the go-to method for building modern apps

Docker revolutionized how developers build and ship applications. Container technology has made it easier to move applications and workloads between clouds. It also provides as much resource isolation as a traditional hypervisor, but with considerable opportunities to improve agility, efficiency and speed.

Jul
28
2021
--

Building and Testing Percona Distribution for MongoDB Operator

Testing Percona Distribution for MongoDB Operator

Recently I wanted to play with the latest and greatest Percona Distribution for MongoDB Operator, which had a bug fix I was interested in. The bug fix was merged into the main branch of the git repository, but no released version of the Operator included it yet. I started the Operator by cloning the main branch, but the bug was still reproducible. The reason was simple – the main branch had the last released version of the Operator in bundle.yaml, instead of the main branch build:

  spec:
      containers:
        - name: percona-server-mongodb-operator
          image: percona/percona-server-mongodb-operator:1.9.0

instead of 

   spec:
      containers:
        - name: percona-server-mongodb-operator
          image: perconalab/percona-server-mongodb-operator:main

Then I decided to dig deeper to see how hard it is to do a small change in the Operator code and test it.

This blog post is a beginner contributor guide where I tried to follow our CONTRIBUTING.md and Building and testing the Operator manual to build and test Percona Distribution for MongoDB Operator.

Requirements

The requirements section was the first blocker for me, as I’m used to running Ubuntu, but the examples that we have are for CentOS and macOS. For all Ubuntu fans, the instructions are below:

echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key --keyring /usr/share/keyrings/cloud.google.gpg add -
sudo apt-get update
sudo apt-get install -y google-cloud-sdk docker.io kubectl jq
sudo snap install helm --classic
sudo snap install yq --channel=v3/stable
curl -s -L https://github.com/openshift/origin/releases/download/v3.11.0/openshift-origin-client-tools-v3.11.0-0cbc58b-linux-64bit.tar.gz | sudo tar -C /usr/bin --strip-components 1 --wildcards -zxvpf - '*/oc'

I have also prepared a Pull Request to fix our docs and drafted a cloud-init file to simplify environment provisioning.

Build

Get the code from GitHub main branch:

git clone https://github.com/percona/percona-server-mongodb-operator

Change some code. Now it is time to build the Operator image and push it to the registry. Docker Hub is a nice choice for beginners as it does not require any installation or configuration, but to keep everything local you might want to install your own registry. See Docker Registry, Harbor, or Trow.

The ./e2e-tests/build command builds the image and pushes it to the registry which you specify in the IMAGE environment variable, like this:

export IMAGE=bob/my_repository_for_test_images:K8SPSMDB-372-fix-feature-X

Fixing the Issues

For me the execution of the build command failed for multiple reasons:

1. Most probably you need to run it as root to get access to the Docker unix socket, or just add your user to the docker group:

Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock

2. Once I ran it as root, I got the following error:

"--squash" is only supported on a Docker daemon with experimental features enabled

It is quite easy to fix by adding the experimental flag into the /etc/docker/daemon.json file:

{
    "experimental": true
}
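
The Docker daemon has to be restarted to pick up this change:

sudo systemctl restart docker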

I have added it into the cloud-init file and will fix it in the same PR in the docs.

3. The third failure was at the last stage, pushing the image: 

denied: requested access to the resource is denied

Obviously, you should be authorized to push to the registry. docker login fixed it for me just fine.

Finally, the image is built and pushed to the registry:

The push refers to repository [docker.io/bob/my_repository_for_test_images]
0014bf17d462: Pushed
...
K8SPSMDB-372-fix-feature-X: digest: sha256:458066396fdd6ac358bcd78ed4d8f5279ff0295223f1d7fbec0e6d429c01fb16 size: 949

Test

The ./e2e-tests/run command executes the tests in the e2e-tests folder one by one; as you can see, there are multiple scenarios:

"$dir/init-deploy/run" || fail "init-deploy"
"$dir/limits/run" || fail "limits"
"$dir/scaling/run" || fail "scaling"
"$dir/monitoring/run" || fail "monitoring"
"$dir/monitoring-2-0/run" || fail "monitoring-2-0"
"$dir/liveness/run" || fail "liveness"
"$dir/one-pod/run" || fail "one-pod"
"$dir/service-per-pod/run" || fail "service-per-pod"
"$dir/arbiter/run" || fail "arbiter"
"$dir/demand-backup/run" || fail "demand-backup"
"$dir/demand-backup-sharded/run" || fail "demand-backup-sharded"
"$dir/scheduled-backup/run" || fail "scheduled-backup"
"$dir/security-context/run" || fail "security-context"
"$dir/storage/run" || fail "storage"
"$dir/self-healing/run" || fail "self-healing"
"$dir/self-healing-chaos/run" || fail "self-healing-chaos"
"$dir/operator-self-healing/run" || fail "operator-self-healing"
"$dir/operator-self-healing-chaos/run" || fail "operator-self-healing-chaos"
"$dir/smart-update/run" || fail "smart-update"
"$dir/version-service/run" || fail "version-service"
"$dir/users/run" || fail "users"
"$dir/rs-shard-migration/run" || fail "rs-shard-migration"
"$dir/data-sharded/run" || fail "data-sharded"
"$dir/upgrade/run" || fail "upgrade"
"$dir/upgrade-sharded/run" || fail "upgrade-sharded"
"$dir/upgrade-consistency/run" || fail "upgrade-consistency"
"$dir/pitr/run" || fail "pitr"
"$dir/pitr-sharded/run" || fail "pitr-sharded"

Obviously, it is possible to run the tests one by one.
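
For example, to run only the first scenario from the list above:

./e2e-tests/init-deploy/run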

You need to have kubectl configured and pointing to a working Kubernetes cluster. If something is missing or not working, the tests are going to tell you.
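
A quick way to confirm which cluster kubectl is pointing at before starting a run:

kubectl config current-context
kubectl cluster-info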

The only issue I faced was the readability of the test results. The logging of the test execution is pretty verbose, so I would recommend redirecting the output to a file for later debugging:

./e2e-tests/run >> /tmp/e2e-tests.out 2>&1

We at Percona rely on Jenkins to automatically run the tests and verify the results for each Pull Request.

Conclusion

Contribution guides are written for developers by developers, so they often have gaps or unclear instructions that sometimes require experience to resolve. Such minor issues might scare off potential contributors, and as a result, the project does not get a Pull Request with an awesome implementation of the brightest idea. Percona embraces open source culture and values contributors by providing simple tools to develop and test ideas.

Writing this blog post resulted in two Pull Requests:

  1. Use the :main tag for container images in the main branch (link)

  2. Remove some gaps in the docs (link)

There is always room for improvement and a time to find a better way. Please let us know if you face any issues with contributing your ideas to Percona products. You can do that on the Community Forum or JIRA. Read more about contribution guidelines for Percona Distribution for MongoDB Operator in CONTRIBUTING.md.
