One of the main features I like about the Percona Operator for MongoDB is its integration with the Percona Backup for MongoDB (PBM) tool and the ability to back up and restore the database without manual intervention. The Operator can back up the database to S3-compatible cloud storage, so you can use AWS S3 and other compatible providers, as well as Azure Blob Storage.
One of our customers asked about integrating Backblaze with the Operator for backup and restore. While looking into it, I found that Backblaze B2 is S3-compatible and offers a free account with 10GB of cloud storage, so I jumped into testing it with our Operator. I also noticed in our forum that a few users are already using Backblaze cloud storage, so I am writing this blog post for anyone who wants to test or use Backblaze's S3-compatible cloud storage with our Operator and PBM.
S3-compatible storage configuration
The Operator supports backup to S3-compatible storage. The steps for backing up to AWS or Azure Blob Storage are given here, so you can try those as well. In this blog post, I will focus on configuring Backblaze B2 cloud storage as the backup location and restoring the backup to another deployment.
Let’s configure a Percona Server for MongoDB (PSMDB) sharded cluster using the Operator (with the minimal config as explained here). I used PSMDB Operator v1.12.0 and PBM 1.8.1 for the test below. You can sign up for the free account here – https://www.backblaze.com/b2/cloud-storage-b.html – and then log in to your account. First, create a key pair in the “App Keys” tab so the Operator can access the storage:
Then create a bucket with your desired name and note down the S3-compatible storage details: the bucket name (shown in the picture below) and the endpointUrl the backup files will be sent to. The endpointUrl is provided by Backblaze, and the region is embedded in the prefix of the endpointUrl (for example, us-west-004 in s3.us-west-004.backblazeb2.com).
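Optionally, you can sanity-check the key pair, bucket, and endpoint with any S3 client before wiring them into the Operator. Below is my own hedged pre-check using the AWS CLI (not part of the Operator setup); the bucket name and endpoint are the ones used later in this post, and the placeholders are your Backblaze keyID/applicationKey:

# assumption: the AWS CLI is installed locally; this only verifies the key pair and bucket
$ export AWS_ACCESS_KEY_ID="<backblaze-keyID>"
$ export AWS_SECRET_ACCESS_KEY="<backblaze-applicationKey>"
$ aws --endpoint-url https://s3.us-west-004.backblazeb2.com s3 ls s3://psmdbbackupBlaze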
Deploy the cluster
Now let’s download the Operator from GitHub (I used v1.12.0) and configure the files for deploying the MongoDB sharded cluster. Here, I am using cr-minimal.yaml to deploy a very minimal setup: a single-member replica set for the shard, a single-member config server replica set, and one mongos.
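If you do not have the repository locally yet, one way to get it (assuming the public GitHub repository and the v1.12.0 release tag) is:

$ git clone -b v1.12.0 https://github.com/percona/percona-server-mongodb-operator.git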
#using an alias for the kubectl command
$ alias "k=kubectl"
$ cd percona-server-mongodb-operator

# Add a backup section in the cr file as shown below. Use the appropriate values from your setup
$ cat deploy/cr-minimal.yaml
apiVersion: psmdb.percona.com/v1-12-0
kind: PerconaServerMongoDB
metadata:
  name: minimal-cluster
spec:
  crVersion: 1.12.0
  image: percona/percona-server-mongodb:5.0.7-6
  allowUnsafeConfigurations: true
  upgradeOptions:
    apply: 5.0-recommended
    schedule: "0 2 * * *"
  secrets:
    users: minimal-cluster
  replsets:
  - name: rs0
    size: 1
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 3Gi

  sharding:
    enabled: true
    configsvrReplSet:
      size: 1
      volumeSpec:
        persistentVolumeClaim:
          resources:
            requests:
              storage: 3Gi
    mongos:
      size: 1

  backup:
    enabled: true
    image: percona/percona-backup-mongodb:1.8.1
    serviceAccountName: percona-server-mongodb-operator
    pitr:
      enabled: false
      compressionType: gzip
      compressionLevel: 6
    storages:
      s3-us-west:
        type: s3
        s3:
          bucket: psmdbbackupBlaze
          credentialsSecret: my-cluster-name-backup-s3
          region: us-west-004
          endpointUrl: https://s3.us-west-004.backblazeb2.com/
#          prefix: ""
#          uploadPartSize: 10485760
#          maxUploadParts: 10000
#          storageClass: STANDARD
#          insecureSkipTLSVerify: false
The backup-s3.yaml file contains the key details used to access the B2 cloud storage. Encode the keyID and applicationKey (retrieved from Backblaze as mentioned above) as follows and use them inside backup-s3.yaml. The secret name my-cluster-name-backup-s3 must be unique, as it is what the other yaml files refer to:
# First use base64 to encode your keyID and applicationKey
# (use -n so the trailing newline is not encoded into the secret):
$ echo -n "key-sample" | base64 --wrap=0
XXXX==
$ echo -n "access-key-sample" | base64 --wrap=0
XXXXYYZZ==

$ cat deploy/backup-s3.yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-cluster-name-backup-s3
type: Opaque
data:
  AWS_ACCESS_KEY_ID: XXXX==
  AWS_SECRET_ACCESS_KEY: XXXXYYZZ==
Then deploy the cluster as shown below and apply backup-s3.yaml as well.
$ k apply -f ./deploy/bundle.yaml
customresourcedefinition.apiextensions.k8s.io/perconaservermongodbs.psmdb.percona.com created
customresourcedefinition.apiextensions.k8s.io/perconaservermongodbbackups.psmdb.percona.com created
customresourcedefinition.apiextensions.k8s.io/perconaservermongodbrestores.psmdb.percona.com created
role.rbac.authorization.k8s.io/percona-server-mongodb-operator created
serviceaccount/percona-server-mongodb-operator created
rolebinding.rbac.authorization.k8s.io/service-account-percona-server-mongodb-operator created
deployment.apps/percona-server-mongodb-operator created

$ k apply -f ./deploy/cr-minimal.yaml
perconaservermongodb.psmdb.percona.com/minimal-cluster created

$ k apply -f ./deploy/backup-s3.yaml
secret/my-cluster-name-backup-s3 created
After starting the Operator and applying the yaml files, the setup looks like this:
$ k get pods
NAME                                               READY   STATUS    RESTARTS   AGE
minimal-cluster-cfg-0                              2/2     Running   0          39m
minimal-cluster-mongos-0                           1/1     Running   0          70m
minimal-cluster-rs0-0                              2/2     Running   0          38m
percona-server-mongodb-operator-665cd69f9b-44tq5   1/1     Running   0          74m

$ k get svc
NAME                     TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)     AGE
kubernetes               ClusterIP   10.96.0.1     <none>        443/TCP     76m
minimal-cluster-cfg      ClusterIP   None          <none>        27017/TCP   72m
minimal-cluster-mongos   ClusterIP   10.100.7.70   <none>        27017/TCP   72m
minimal-cluster-rs0      ClusterIP   None          <none>        27017/TCP   72m
Backup
After deploying the cluster, the DB is ready for backup at any time. In addition to scheduled backups, you can create a backup-custom.yaml file to take a backup whenever you need it (you will need to provide a unique backup name each time, or else the new backup will not start). Our backup yaml file looks like this:
$ cat deploy/backup/backup-custom.yaml
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBBackup
metadata:
  finalizers:
  - delete-backup
  name: backup1
spec:
  clusterName: minimal-cluster
  storageName: s3-us-west
#  compressionType: gzip
#  compressionLevel: 6
Now load some data into the database and then start the backup.
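As a minimal data load (my own example, using the minimal-cluster-mongos service and the same root credentials as the verification commands later in this post):

$ kubectl run -i --rm --tty mongo-client --image=mongo:5.0.7 --restart=Never -- bash -c "mongosh --host=minimal-cluster-mongos --username=root --password=password --authenticationDatabase=admin --eval \"db.getSiblingDB('vinodh').testData.insertOne({id: 1})\" --quiet "

Then start the backup: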
$ k apply -f deploy/backup/backup-custom.yaml
perconaservermongodbbackup.psmdb.percona.com/backup1 configured
The backup progress looks like this:
$ k get perconaservermongodbbackup.psmdb.percona.com
NAME      CLUSTER           STORAGE      DESTINATION            STATUS      COMPLETED   AGE
backup1   minimal-cluster   s3-us-west   2022-09-08T03:21:58Z   requested               43s

$ k get perconaservermongodbbackup.psmdb.percona.com
NAME      CLUSTER           STORAGE      DESTINATION            STATUS      COMPLETED   AGE
backup1   minimal-cluster   s3-us-west   2022-09-08T03:22:19Z   requested               46s

$ k get perconaservermongodbbackup.psmdb.percona.com
NAME      CLUSTER           STORAGE      DESTINATION            STATUS      COMPLETED   AGE
backup1   minimal-cluster   s3-us-west   2022-09-08T03:22:19Z   running                 49s
If you have any issues with the backup, you can view the logs from the backup-agent sidecar as follows:
$ k logs pod/minimal-cluster-rs0-0 -c backup-agent
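You can also ask PBM itself about the configured storage and backup state from the same sidecar. This is a hedged sketch: it assumes the pbm CLI binary ships in the backup-agent image and that the connection string is exposed in the PBM_AGENT_MONGODB_URI environment variable (check the variable name in your pod if this fails):

# assumption: pbm CLI is present in the sidecar and PBM_AGENT_MONGODB_URI is set by the Operator
$ k exec -it minimal-cluster-rs0-0 -c backup-agent -- bash -c 'pbm status --mongodb-uri "$PBM_AGENT_MONGODB_URI"'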
To start another backup, edit backup-custom.yaml, change the backup name (here, name: backup2), and apply it again:
$ k apply -f deploy/backup/backup-custom.yaml
perconaservermongodbbackup.psmdb.percona.com/backup2 configured
Monitor the backup process (you can use the -w option to watch the progress continuously). It should eventually show the status as ready:
$ k get perconaservermongodbbackup.psmdb.percona.com -w
NAME      CLUSTER           STORAGE      DESTINATION            STATUS      COMPLETED   AGE
backup1   minimal-cluster   s3-us-west   2022-09-08T03:22:19Z   ready       12m         14m
backup2   minimal-cluster   s3-us-west                                                  8s
backup2   minimal-cluster   s3-us-west   2022-09-08T03:35:56Z   requested               21s
backup2   minimal-cluster   s3-us-west   2022-09-08T03:35:56Z   running                 26s
backup2   minimal-cluster   s3-us-west   2022-09-08T03:35:56Z   ready       0s          41s
In the bucket on Backblaze, you can see the backup files that were uploaded:
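If you prefer the command line over the Backblaze UI, you can list the uploaded backup artifacts with an S3 client as well. Again a hedged example using the AWS CLI with the same credentials and endpoint as the earlier pre-check:

$ aws --endpoint-url https://s3.us-west-004.backblazeb2.com s3 ls s3://psmdbbackupBlaze --recursive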
Restore
You can restore the cluster from the backup into another similar deployment or into the same cluster. List the backups and restore one of them as follows. The restore-custom.yaml configuration contains the information about the backup to restore. If you are restoring into another deployment, you can instead include the backupSource section (commented out below for reference), from which the restore process finds the source of the backup. In that case, make sure you also create the secret my-cluster-name-backup-s3 in the target cluster before restoring, so the restore can access the backup.
$ cat deploy/backup/restore-custom.yaml
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: restore2
spec:
  clusterName: minimal-cluster
  backupName: backup2
#  pitr:
#    type: date
#    date: YYYY-MM-DD HH:MM:SS
#  backupSource:
#    destination: s3://S3-BACKUP-BUCKET-NAME-HERE/BACKUP-DESTINATION
#    s3:
#      credentialsSecret: my-cluster-name-backup-s3
#      region: us-west-004
#      bucket: S3-BACKUP-BUCKET-NAME-HERE
#      endpointUrl: https://s3.us-west-004.backblazeb2.com/
#      prefix: ""
#    azure:
#      credentialsSecret: SECRET-NAME
#      prefix: PREFIX-NAME
#      container: CONTAINER-NAME
Listing the backups:
$ k get psmdb-backup
NAME      CLUSTER           STORAGE      DESTINATION            STATUS   COMPLETED   AGE
backup1   minimal-cluster   s3-us-west   2022-09-08T03:22:19Z   ready    3h5m        3h6m
backup2   minimal-cluster   s3-us-west   2022-09-08T03:35:56Z   ready    171m        172m
backup3   minimal-cluster   s3-us-west   2022-09-08T04:16:39Z   ready    130m        131m
To verify the restore process, I wrote some data into the collection vinodh.testData after the backup and before the restore. So the newly inserted document should not be there after the restore:
# Using mongosh from the mongo container to see the data
# Listing data from collection vinodh.testData
$ kubectl run -i --rm --tty mongo-client --image=mongo:5.0.7 --restart=Never -- bash -c "mongosh --host=10.96.30.92 --username=root --password=password --authenticationDatabase=admin --eval \"db.getSiblingDB('vinodh').testData.find()\" --quiet "
If you don't see a command prompt, try pressing enter.
[ { _id: ObjectId("631956cc70e60e9ed3ecf76d"), id: 1 } ]
pod "mongo-client" deleted
Inserting a document into it:
$ kubectl run -i --rm --tty mongo-client --image=mongo:5.0.7 --restart=Never -- bash -c "mongosh --host=10.96.30.92 --username=root --password=password --authenticationDatabase=admin --eval \"db.getSiblingDB('vinodh').testData.insert({id:2})\" --quiet "
If you don't see a command prompt, try pressing enter.
DeprecationWarning: Collection.insert() is deprecated. Use insertOne, insertMany, or bulkWrite.
{
  acknowledged: true,
  insertedIds: { '0': ObjectId("631980fe07180f860bd22534") }
}
pod "mongo-client" deleted
Listing it again to verify:
$ kubectl run -i --rm --tty mongo-client --image=mongo:5.0.7 --restart=Never -- bash -c "mongosh --host=10.96.30.92 --username=root --password=password --authenticationDatabase=admin --eval \"db.getSiblingDB('vinodh').testData.find()\" --quiet "
If you don't see a command prompt, try pressing enter.
[
  { _id: ObjectId("631956cc70e60e9ed3ecf76d"), id: 1 },
  { _id: ObjectId("631980fe07180f860bd22534"), id: 2 }
]
pod "mongo-client" deleted
Run the restore as follows:
$ k apply -f deploy/backup/restore-custom.yaml
perconaservermongodbrestore.psmdb.percona.com/restore2 created
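Before checking the data, you can wait for the restore object to report a ready state. A hedged example: it assumes the psmdb-restore short name, analogous to the psmdb-backup short name used earlier; otherwise use the full perconaservermongodbrestores.psmdb.percona.com resource name:

$ k get psmdb-restore restore2 -w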
Now check the data again in the vinodh.testData collection to verify that the restore completed properly. The output below confirms that the collection was restored from the backup, as it lists only the record that existed at backup time:
$ kubectl run -i --rm --tty mongo-client --image=mongo:5.0.7 --restart=Never -- bash -c "mongosh --host=minimal-cluster-mongos --username=root --password=password --authenticationDatabase=admin --eval \"db.getSiblingDB('vinodh').testData.find()\" --quiet "
If you don't see a command prompt, try pressing enter.
[ { _id: ObjectId("631956cc70e60e9ed3ecf76d"), id: 1 } ]
I hope this helps you! You can now try the same on your end and use Backblaze in production if it suits your requirements. I have not tested the network performance yet. If you have used Backblaze or similar S3-compatible storage for backups, please share your experience with us in the comments.