The rise of cloud-native technologies is transforming how we manage databases. Since I stepped into the world of databases and cloud-native technologies, I have encountered several initiatives aimed at developing and optimizing database operations in the cloud, and Kubernetes plays a crucial role in this shift through Operators. While the core concepts and techniques of […]
Percona Everest: An Open Source Solution for MongoDB Sharding and Backups
Percona Backup for MongoDB and Disk Snapshots in Google Cloud Platform
Percona Backup for MongoDB (PBM) supports snapshot-based physical backups. This is made possible by the backup cursor functionality present in Percona Server for MongoDB. The flow of a snapshot-based physical backup consists of these stages: preparing the database – done by PBM; taking the snapshots – done by the user/client app; completing the backup – done […]
Introducing Percona’s API for Faster Snapshot-Based Backup and Restore for Your MongoDB Environments
Although the introduction of physical backups in Percona Backup for MongoDB (PBM) made it possible to significantly cut the restore time for big datasets, it’s hard to beat snapshots in speed and efficiency. That’s why we introduced external backups (aka the snapshot-based backup API) in PBM 2.2.0.
The idea came from the requests to bring EBS snapshots into PBM. But we decided that, instead of focusing on a particular type, we would give users the ability to leverage their own backup strategies to meet their needs and environment.
How Percona’s snapshot-based backup API works
What PBM does during a physical backup can essentially be split into three phases.
First, it prepares the database and ensures the data files can be safely copied. Then, it copies files to the storage. And lastly, it returns the database to the non-backup state — closes backup cursors, etc. Something similar happens with the restore: prepare the cluster, copy files, and prepare data so the cluster can start in a consistent state. For more details, you can see the Tech Peek section in our blog post on physical backups, but for the purposes of this blog, those details don’t matter.
The new API breaks the backup and restore process into exactly these three stages, giving the user full control over the data copy. It can be an EBS snapshot, any other kind of snapshot, ‘cp -Rp’ … or whatever fits your needs.
To start, just run pbm backup -t external. PBM will notify you when the data is ready for copying, with a prompt saying from exactly which node on each shard it should be copied (the backup needs to be made from only one node per replica set). Then, when the snapshot(s) (data copy) are done, tell PBM to finish the backup with pbm backup-finish <backup_name>.
And that’s it. Restore follows the same pattern. Start the restore with pbm restore [backup_name] --external, copy the data files to every node of the corresponding replica set in the cluster once PBM has prepared everything, and finish the restore with pbm restore-finish.
Backup:
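A minimal sketch of the backup flow described above (the snapshot step is whatever mechanism fits your storage, shown here only as a placeholder):

$ pbm backup -t external
# PBM prints the backup name and tells you which node on each shard to copy from
# ... take your snapshot / copy the data files from those nodes (EBS snapshot, cp -Rp, etc.) ...
$ pbm backup-finish <backup_name>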
Restore:
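And a matching sketch of the restore flow (again, the data copy step is entirely up to you):

$ pbm restore <backup_name> --external
# PBM prepares the cluster; copy the data files to every node of the corresponding replica set
$ pbm restore-finish
# depending on the PBM version, restore-finish may take the restore name as an argument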
Restore your existing snapshots with PBM
The great thing is that you can restore snapshots taken without PBM. PBM creates backup metadata during an external backup, and if [backup_name] is provided for the restore, it uses that metadata to check the backup’s compatibility with the target cluster in terms of topology and PSMDB version, and to define the “restore-to-time.” But a restore can be run with no [backup_name] perfectly fine; the checks will simply be omitted (that’s on you), and the “restore-to-time” will be picked from the provided data. PBM will look into the actual data during the restore and determine the most recent common cluster time across all shards. Just be mindful that there is not much we can check and ensure for non-PBM snapshots. Another thing PBM might need regarding the backup data is the MongoDB storage options. These are preserved in the backup metadata as well, but for an existing snapshot you can pass them via the --config flag.
We are looking into leveraging snapshot-based backups in our tools like Percona Operator for MongoDB and Percona Monitoring and Management.
This feature was released as Technical Preview so that we can adjust it in further iterations following your feedback. So, give it a try and either leave your thoughts on the forum or file a Jira ticket.
Announcing the Availability of Percona Backup for MongoDB 2.2.0
In the previous minor release, Percona Backup for MongoDB (PBM) 2.1.0 introduced the GA version of incremental physical backups, which can greatly reduce both the recovery time and the cost of backups (considering storage and data transfer costs). Back then, we already knew that our customers needed to back up:
- More data
- Faster
- Using existing tools
In all the conversations we’ve had with our Community, customers, and prospects, we’ve noticed how popular snapshot tools are. While AWS EBS snapshots were mentioned the most, persistent disk snapshots on GCP, Azure managed disk snapshots, local storage, and Kubernetes snapshot capabilities also play a crucial part in backup strategies.
To make sure that Percona Backup for MongoDB addresses our customers’ pain points, we took the common denominator of the problems anyone using snapshot capabilities faces when performing backups and restores and positioned PBM as the open source, freely available answer to them.
Snapshot backup/restore Technical Preview
As promised during Percona Live 2023, I am happy to announce that with PBM 2.2.0 we are launching the technical preview of the Percona Backup for MongoDB snapshot CLI.
With the use of this snapshot CLI, you can now build your backup strategy using the snapshot tools at your disposal or include PBM into your existing strategy, even using existing snapshots of Percona Server for MongoDB!
Now why Technical Preview, you may ask? Well, it is because we believe that open source philosophy is not only about the source but also about design and decision-making. We want to ensure that our Community feels that this design we delivered fits their use cases. We hope for a dialogue with you, our users, to make sure we are fixing your pains in the best way we can. So please do not hesitate and:
- Provide feedback in the feedback form.
- Engage on our Community pages (there is even a placeholder topic for this!)
- Contact me through any channels (we do have our Jira open to the Community, my email is available in the footnote, and I am present on LinkedIn).
Physical backups Point in Time Recovery (PITR)
MongoDB Inc. provides only limited backup/restore capabilities with its Community Edition: the widely adopted mongodump/mongorestore, which Percona Backup for MongoDB also uses for what we call logical backups.
While this type of backup is very convenient for smaller data sets, it has certain limitations regarding larger ones. The main limitations are the RPO and RTO. While RPO is addressed with Point in Time Recovery (PITR) that works by default with logical backups, for RTO improvement, we introduced physical backups. In short, for larger datasets, there is a very distinctive speed improvement in recovery.
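For reference, both backup types are taken with the same CLI; only the type flag differs (a quick sketch):

$ pbm backup --type logical    # default; mongodump-style export of the data
$ pbm backup --type physical   # copies data files for faster restores of large datasets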
By default, PITR was designed to work with logical backups, and we want the user experience with physical backups to be just as good. In previous versions, PITR works with physical backups, but some operational limitations require extra manual steps.
With PBM 2.2.0, we introduce numerous fixes that put these limitations in the past:
- Previously, after restoring from a full backup, the database accepted connections, which required manual changes to restrict users from connecting before the PITR restore finished, in order to guarantee data integrity.
- Until now, the restore process for physical backups + PITR was not handled in a single command, making the user experience worse than for logical backups + PITR (a sketch of the combined command follows this list).
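With these fixes, a point-in-time restore based on a physical backup can be expressed as a single command along these lines (a hedged sketch; the backup name and timestamp are illustrative):

$ pbm restore --base-snapshot 2023-07-20T10:15:00Z --time "2023-07-20T11:00:00"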
Fixes, bugs, improvements – all is here!
Of course, each release also includes bug fixes and refinements to existing features, and this one is no different. Beyond the typical array of bugs and patches, we noticed that the way physical restores handle the remapping of replica sets needed improvement, so you should see a better-handled experience there.
Feedback is also a contribution
Contribution is not only code. Feedback, adoption and usage data, bug reports, success stories, case studies, testimonials — all these are contributions. Engaging with Community in meaningful discussions is also a contribution. All of these help us grow and help us deliver a better product.
We appreciate any contribution. Again, even negative feedback is something we can use to make the product evolve!
What’s next?
I hope that next, we will close the gap between the capabilities of physical and logical backups by:
- Selective backup/restore
- Further improvements on physical PITR, if needed
and, of course, deliver GA capabilities of snapshot backup/restore based on your feedback.
There is also a lot of work around Percona Monitoring and Management (PMM) for MongoDB and some new improvements coming for Percona Operator for MongoDB.
MongoDB Version Requirements for Percona Backup for MongoDB
Percona Backup for MongoDB (PBM) is a backup utility developed by Percona to address the needs of customers and users who prefer open source tools over proprietary software like MongoDB Enterprise and Ops Manager. PBM offers comprehensive support and ensures cluster-wide consistent backups for MongoDB without additional costs.
MongoDB supported versions
While physical backups and restores are supported on Percona Server for MongoDB v4.2.15-16, v4.4.6-8, v5.0, and higher, incremental backups and restores, which only recently went GA in PBM v2.1.0, require newer versions, specifically:
or higher. Using earlier Percona Server for MongoDB versions is not recommended, since you can experience issues with the backup and the MongoDB nodes.
Using the supported Percona Server for MongoDB versions is vital to guarantee compatibility and avoid issues (including crashes). Percona recommends using the latest Percona Backup for MongoDB binaries and the latest minor version available for your Percona Server for MongoDB instance.
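A quick way to double-check what you are running before relying on incremental backups (a sketch; adjust connection options to your environment):

$ pbm version
$ mongosh --quiet --eval 'db.version()'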
We are committed to continually enhancing PBM, and in future releases, we plan to restrict PBM from performing incremental backups on non-supported MongoDB versions. You can find more details about this task in the following link: PBM-1062.
For reference, here are the tickets detailing issues in older versions, which have resulted in improvements to Percona Server for MongoDB, allowing PBM to guarantee consistent backups:
- PSMDB-1177: Fix incremental backup failure via $backupCursor for PSMDB 4.2/4.4
- PSMDB-1175: Fix PSMDB behavior when calling $backupCursor with disableIncrementalBackup option
Additional questions
For any additional questions, Percona customers can open a new support ticket.
Community users can use the usual community support channels to request help.
We appreciate your trust in Percona, and we are dedicated to providing you with exceptional solutions for all your MongoDB backup needs.
Why opt for expensive proprietary software like MongoDB Enterprise Advanced and Ops Manager when you can get many of the same benefits without the cost or lock-in?
Speeding Up Restores in Percona Backup for MongoDB
When you do a database restore, you want to have it done as soon as possible. In the case of disaster recovery, the situation is stressful enough on its own. And the database is unusable until the restore is done. So every minute matters. That becomes especially crucial with big datasets.
Bringing physical backups to Percona Backup for MongoDB (PBM) was a big step toward faster restores. A physical restore is essentially copying data files to the target nodes and starting the database on that data catalog, while a logical restore copies the data and runs insert operations on the database, which adds overhead from parsing data, building indexes, etc. Our tests showed physical restores to be up to 5x faster than logical ones. But can we do better? Let’s try.
The speed of a physical restore comes down to how fast we can copy (download) data from the remote storage. So we decided to try parallel (concurrent) downloads. In physical backups, PBM stores WiredTiger files pretty much as they are in the data directory, just with extra compression. So what if we download different files in parallel? That doesn’t quite work: each MongoDB collection’s data is stored in a single file, so data isn’t spread evenly across files, and big collections would become bottlenecks. The better approach is to download each file itself concurrently, chunk by chunk.
PBM already downloads files in chunks, but solely for retries: in case of a network failure, we retry the most recent chunk rather than the whole file. The idea is to download these chunks concurrently. Here’s the problem: reading out of order, we cannot write straight to the file (with a seek offset), as we have to decompress the data first (data in the backup is compressed by default). Hence, although we can read data out of order, we must write it sequentially. For that, we made a special memory buffer. Chunks can be put there concurrently and out of order, but consumers always read the data in order.
The final design
The downloader starts a number of workers equal to the concurrency (the number of CPU cores by default). There is a preallocated arena in the arenas pool for each worker. The arena is basically a byte buffer with a bitmap of free slots. Each arena is split into spans, and the span size equals the download chunk size. When a worker wants to download a chunk, it first acquires a free span from its arena and downloads the data into it. When a consumer reads this data, the span is marked as free and can be reused for the next chunk. The worker doesn’t wait for the data to be read: once it has downloaded a chunk, it takes another chunk from the task queue, acquires the next free span, and downloads into it. To prevent uncontrolled memory consumption, the number of spans in each arena is limited, and if all spans are busy, the worker has to wait for a free one before downloading the next chunk.
On the other side, for each file we keep track of what has been given to the consumer: the number of the last written byte. If a downloaded chunk is out of order, it is pushed onto a heap; otherwise, it is given to the consumer. On the next iteration (the next downloaded chunk), we check the top of the heap and pop chunks, handing them to the consumer for as long as they are in order.
See the commit with changes for more details.
Config options
A few new options were added to the PBM config file to tweak concurrent downloads.
numDownloadWorkers – sets concurrency. Default: number of CPUs
downloadChunkMb – the size of the chunks to download in Mb. Default: 32
maxDownloadBufferMb – the upper limit of memory that can be used for download buffers (arenas), in MB. Default: numDownloadWorkers * downloadChunkMb * 16. If set, the chunk size may be adjusted to fit this limit. This doesn’t mean all of that memory will be used and actually allocated in physical RAM.
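These options can be set in the PBM configuration, for example via pbm config --set (a sketch, assuming the options live under the restore section of the config, as in current PBM documentation):

$ pbm config --set restore.numDownloadWorkers=8
$ pbm config --set restore.downloadChunkMb=32
$ pbm config --set restore.maxDownloadBufferMb=4096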
Results
PBM supports different storage types, but for this implementation, we decided to start with the most widely used – S3 compatible. We aim to port it to Azure Blob and FileSystem storage types in subsequent releases.
Our tests on AWS S3 show up to 19x improvements in the restore speed:
Concurrent download:

Instances | Backup size | Concurrency | Span size | Restore time
i3en.xlarge (4 vCPU, 16 GB RAM) | 500 GB | 4 | 32 MB | 45 min
i3en.xlarge (4 vCPU, 16 GB RAM) | 500 GB | 8 | 32 MB | 32 min
i3en.3xlarge (12 vCPU, 96 GB RAM) | 5 TB | 12 | 32 MB | 168 min
i3en.3xlarge (12 vCPU, 96 GB RAM) | 5 TB | 24 | 32 MB | 117 min

Release v2.0.3:

Instances | Backup size | Restore time
i3en.xlarge (4 vCPU, 16 GB RAM) | 500 GB | 227 min
i3en.3xlarge (12 vCPU, 96 GB RAM) | 5 TB | ~2280 min
* Tests were made on AWS i3en instances with the S3 storage in the same region.
** We didn’t wait for the 5 TB restore on v2.0.3 to finish and extrapolated the result from the time-per-restored-GB ratio.
Try Percona Backup for MongoDB for faster restores
This is a significant improvement that comes among the other features with the new Percona Backup for MongoDB (PBM) release. Give it a try, and leave your feedback!
Faster, Cheaper, and More Reliable Incremental Backups for MongoDB
MongoDB is a great database that provides outstanding high availability solutions out of the box. For most Database Administrators (DBAs) we talk with, though, the awesome safety features we came to appreciate MongoDB for, like automatic failover or data redundancy, are not enough.
The thing we hear almost every time is that they still need good, reliable backups to satisfy the requirements of their business.
Unfortunately, there is a lack of free-to-use backup tools. The only freely available backup tool for MongoDB users without an enterprise contract with MongoDB Inc. is mongodump. While it is a very reliable tool, it has some major drawbacks, such as:
- not a viable solution for large production environments due to the time it takes for both backup/restore and the capabilities available;
- not a good option for sharded clusters as there’s no support for sharded transactions and users have to deal with syncing the backup/restore for all shards themselves.
That’s why Percona has developed Percona Backup for MongoDB (PBM). A fully open source tool to help with your backup needs!
Physical backups
While PBM provides logical backups (creating binary exports of database contents) based on mongodump, with version 2.0, physical backups went GA!
Now physical hot backups that copy the files of your database are available to all your database clusters. Yes, that means the sharded ones as well!
Point-in-time recovery
With the introduction of physical backups, point-in-time recovery (PITR) has become more performant as well, because physical backups can now serve as the base for PITR. That means the logical oplog slices can be applied on top of a database snapshot in the form of a physical database hot backup.
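Enabling the PITR oplog slicing that these backups serve as a base for is a one-line configuration change (a sketch using the standard PBM config option):

$ pbm config --set pitr.enabled=true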
To learn more about backup and restore types supported by PBM, visit our documentation.
Introducing incremental physical backups!
For the largest MongoDB clusters, the sheer amount of data dictates the backup strategy. At scale, it’s the time to restore that matters most, since it grows with the data size.
Another factor is the size of the backups. Storing backups is one thing, but with databases deployed in the cloud, the transfer of these backups is also a factor as it can significantly impact the budget.
We’re happy to share that with PBM 2.0.3 we are introducing physical incremental backups. After an initial full physical backup, you can create smaller incremental ones instead of repeating full backups. This is how, for large, write-intensive databases, you can see significant savings in your backup strategy:
- time (performing an incremental physical backup is faster vs any other PBM backup);
- cost (based on storage and transfer cost vs. other PBM backups).
Simply put: faster and cheaper. Also safer as with smaller and faster backups you can afford to back up more frequently and still have savings in the budget.
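In practice, the flow looks roughly like this (a sketch based on the PBM CLI; the first incremental backup acts as the base for the following ones):

$ pbm backup --type incremental --base   # initial base incremental backup
$ pbm backup --type incremental          # subsequent, smaller incremental backups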
This is a technical preview for now, as we are looking for feedback from the Community about this feature. Please don’t be a stranger, let us know what you think!
Tech preview
We do have another ongoing technical preview for selective backup/restore. Please don’t be shy and share your thoughts and comments with us!
In case you were wondering what a tech preview means for us at Percona, let me quote our lifecycle page:
*Tech Preview Features: RC/GA releases can include features that are not yet ready for enterprise use and are not included in support via SLA (supported by product/engineering on the best-effort basis). These features will be fully documented and described as a tech preview so that customers can provide feedback prior to the full release of the feature in a future GA release (or removal of the feature is deemed not useful). This functionality can change (APIs, CLIs, etc.) from tech preview to GA, however, there is no guarantee of compatibility between different tech preview versions.
When developing open source and free-to-use software, we look for your input. Please share your feedback with us and know that every voice counts!
What’s next
In terms of what new is coming to Percona Backup for MongoDB, we are looking into two main topics:
- Making point-in-time recovery more appealing for large datasets.
- Selective restores supporting sharding, which is one of the main items needed for selective backup/restore.
This tool is fully supported and we are determined to deliver fixes and new features to everyone in the spirit of true open source, not only to our customers paying for premium support.
Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.
Percona Operator for MongoDB Backup and Restore on S3-Compatible Storage – Backblaze
One of the main features that I like about the Percona Operator for MongoDB is the integration with the Percona Backup for MongoDB (PBM) tool and the ability to back up/restore the database without manual intervention. The Operator allows backing up the DB to S3-compatible cloud storage, so you can use AWS, Azure, etc.
One of our customers asked about integration between Backblaze and the Operator for backup and restore purposes. I looked into it and found that Backblaze B2 is S3-compatible and provides a free account with 10GB of cloud storage, so I jumped into testing it with our Operator. I also saw on our forum that a few users are using Backblaze cloud storage. So I’m writing this blog post for anyone who wants to test or use Backblaze S3-compatible cloud storage with our Operator and PBM.
S3-compatible storage configuration
The Operator supports backups to S3-compatible storage. The steps for backing up to AWS or Azure Blob are given here, so you can try those as well. In this blog, let me focus on configuring B2 cloud storage as the backup location and restoring to another deployment.
Let’s configure a Percona Server for MongoDB (PSMDB) sharded cluster using the Operator (with the minimal config as explained here). I have used PSMDB Operator v1.12.0 and PBM 1.8.1 for the test below. You can sign up for the free account here – https://www.backblaze.com/b2/cloud-storage-b.html. Then log in to your account and first create a key pair to access the storage from your Operator, in the “App Keys” tab:
Then create a bucket with your desired name and note down the S3-compatible storage details, such as the bucket name (shown in the picture below) and the endpointUrl to which the backup files will be sent. The endpointUrl details can be obtained from the provider, and the region is specified in the prefix of the endpointUrl variable.
Deploy the cluster
Now let’s download the Operator from GitHub (I used v1.12.0) and configure the files for deploying the MongoDB sharded cluster. Here, I am using cr-minimal.yaml for deploying a very minimal setup of single member replicaset for a shard, config db, and a mongos.
#using an alias for the kubectl command
$ alias "k=kubectl"
$ cd percona-server-mongodb-operator

# Add a backup section in the cr file as shown below. Use the appropriate values from your setup
$ cat deploy/cr-minimal.yaml
apiVersion: psmdb.percona.com/v1-12-0
kind: PerconaServerMongoDB
metadata:
  name: minimal-cluster
spec:
  crVersion: 1.12.0
  image: percona/percona-server-mongodb:5.0.7-6
  allowUnsafeConfigurations: true
  upgradeOptions:
    apply: 5.0-recommended
    schedule: "0 2 * * *"
  secrets:
    users: minimal-cluster
  replsets:
  - name: rs0
    size: 1
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 3Gi
  sharding:
    enabled: true
    configsvrReplSet:
      size: 1
      volumeSpec:
        persistentVolumeClaim:
          resources:
            requests:
              storage: 3Gi
    mongos:
      size: 1
  backup:
    enabled: true
    image: percona/percona-backup-mongodb:1.8.1
    serviceAccountName: percona-server-mongodb-operator
    pitr:
      enabled: false
      compressionType: gzip
      compressionLevel: 6
    storages:
      s3-us-west:
        type: s3
        s3:
          bucket: psmdbbackupBlaze
          credentialsSecret: my-cluster-name-backup-s3
          region: us-west-004
          endpointUrl: https://s3.us-west-004.backblazeb2.com/
          # prefix: ""
          # uploadPartSize: 10485760
          # maxUploadParts: 10000
          # storageClass: STANDARD
          # insecureSkipTLSVerify: false
The backup-s3.yaml file contains the key details needed to access the B2 cloud storage. Encode the Key ID and access key (retrieved from Backblaze as mentioned here) as follows and use them inside the backup-s3.yaml file. The key name my-cluster-name-backup-s3 should be unique; it is used to refer to this secret from the other yaml files:
# First use base64 to encode your keyid and access key:
$ echo "key-sample" | base64 --wrap=0
XXXX==
$ echo "access-key-sample" | base64 --wrap=0
XXXXYYZZ==

$ cat deploy/backup-s3.yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-cluster-name-backup-s3
type: Opaque
data:
  AWS_ACCESS_KEY_ID: XXXX==
  AWS_SECRET_ACCESS_KEY: XXXXYYZZ==
Then deploy the cluster as mentioned below and deploy backup-s3.yaml as well.
$ k apply -f ./deploy/bundle.yaml
customresourcedefinition.apiextensions.k8s.io/perconaservermongodbs.psmdb.percona.com created
customresourcedefinition.apiextensions.k8s.io/perconaservermongodbbackups.psmdb.percona.com created
customresourcedefinition.apiextensions.k8s.io/perconaservermongodbrestores.psmdb.percona.com created
role.rbac.authorization.k8s.io/percona-server-mongodb-operator created
serviceaccount/percona-server-mongodb-operator created
rolebinding.rbac.authorization.k8s.io/service-account-percona-server-mongodb-operator created
deployment.apps/percona-server-mongodb-operator created

$ k apply -f ./deploy/cr-minimal.yaml
perconaservermongodb.psmdb.percona.com/minimal-cluster created

$ k apply -f ./deploy/backup-s3.yaml
secret/my-cluster-name-backup-s3 created
After starting the Operator and applying the yaml files, the setup looks like the below:
$ k get pods
NAME                                               READY   STATUS    RESTARTS   AGE
minimal-cluster-cfg-0                              2/2     Running   0          39m
minimal-cluster-mongos-0                           1/1     Running   0          70m
minimal-cluster-rs0-0                              2/2     Running   0          38m
percona-server-mongodb-operator-665cd69f9b-44tq5   1/1     Running   0          74m

$ k get svc
NAME                     TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)     AGE
kubernetes               ClusterIP   10.96.0.1     <none>        443/TCP     76m
minimal-cluster-cfg      ClusterIP   None          <none>        27017/TCP   72m
minimal-cluster-mongos   ClusterIP   10.100.7.70   <none>        27017/TCP   72m
minimal-cluster-rs0      ClusterIP   None          <none>        27017/TCP   72m
Backup
After deploying the cluster, the DB is ready for backup anytime. Other than the scheduled backup, you can create a backup-custom.yaml file to take a backup whenever you need it (you will need to provide a unique backup name each time, or else a new backup will not work). Our backup yaml file looks like the below one:
$ cat deploy/backup/backup-custom.yaml
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBBackup
metadata:
  finalizers:
  - delete-backup
  name: backup1
spec:
  clusterName: minimal-cluster
  storageName: s3-us-west
#  compressionType: gzip
#  compressionLevel: 6
Now load some data into the database and then start the backup:
$ k apply -f deploy/backup/backup-custom.yaml
perconaservermongodbbackup.psmdb.percona.com/backup1 configured
The backup progress looks like the below:
$ k get perconaservermongodbbackup.psmdb.percona.com
NAME      CLUSTER           STORAGE      DESTINATION            STATUS      COMPLETED   AGE
backup1   minimal-cluster   s3-us-west   2022-09-08T03:21:58Z   requested               43s

$ k get perconaservermongodbbackup.psmdb.percona.com
NAME      CLUSTER           STORAGE      DESTINATION            STATUS      COMPLETED   AGE
backup1   minimal-cluster   s3-us-west   2022-09-08T03:22:19Z   requested               46s

$ k get perconaservermongodbbackup.psmdb.percona.com
NAME      CLUSTER           STORAGE      DESTINATION            STATUS      COMPLETED   AGE
backup1   minimal-cluster   s3-us-west   2022-09-08T03:22:19Z   running                 49s
Here, if you have any issues with the backup, you can view the backup logs from the backup agent sidecar as follows:
$ k logs pod/minimal-cluster-rs0-0 -c backup-agent
To start another backup, edit backup-custom.yaml and change the backup name (here, name: backup2), then apply it:
$ k apply -f deploy/backup/backup-custom.yaml
perconaservermongodbbackup.psmdb.percona.com/backup2 configured
Monitor the backup process (you can use the -w option to watch the progress continuously). It should show the status as ready:
$ k get perconaservermongodbbackup.psmdb.percona.com -w
NAME      CLUSTER           STORAGE      DESTINATION            STATUS      COMPLETED   AGE
backup1   minimal-cluster   s3-us-west   2022-09-08T03:22:19Z   ready       12m         14m
backup2   minimal-cluster   s3-us-west                                                  8s
backup2   minimal-cluster   s3-us-west   2022-09-08T03:35:56Z   requested               21s
backup2   minimal-cluster   s3-us-west   2022-09-08T03:35:56Z   running                 26s
backup2   minimal-cluster   s3-us-west   2022-09-08T03:35:56Z   ready       0s          41s
From the bucket on Backblaze, the backup files are listed as they were sent from the backup:
Restore
You can restore the cluster from the backup into another similar deployment or into the same cluster. List the backups and restore one of them as follows. The restore-custom.yaml configuration holds the backup information used for the restore. If you are restoring into another deployment, you can also include the backupSource section (commented out below for your reference), from which the restore process finds the source of the backup. In that case, make sure you also create the secret my-cluster-name-backup-s3 before restoring, so the backup can be accessed.
$ cat deploy/backup/restore-custom.yaml
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: restore2
spec:
  clusterName: minimal-cluster
  backupName: backup2
#  pitr:
#    type: date
#    date: YYYY-MM-DD HH:MM:SS
#  backupSource:
#    destination: s3://S3-BACKUP-BUCKET-NAME-HERE/BACKUP-DESTINATION
#    s3:
#      credentialsSecret: my-cluster-name-backup-s3
#      region: us-west-004
#      bucket: S3-BACKUP-BUCKET-NAME-HERE
#      endpointUrl: https://s3.us-west-004.backblazeb2.com/
#      prefix: ""
#    azure:
#      credentialsSecret: SECRET-NAME
#      prefix: PREFIX-NAME
#      container: CONTAINER-NAME
Listing the backup:
$ k get psmdb-backup
NAME      CLUSTER           STORAGE      DESTINATION            STATUS   COMPLETED   AGE
backup1   minimal-cluster   s3-us-west   2022-09-08T03:22:19Z   ready    3h5m        3h6m
backup2   minimal-cluster   s3-us-west   2022-09-08T03:35:56Z   ready    171m        172m
backup3   minimal-cluster   s3-us-west   2022-09-08T04:16:39Z   ready    130m        131m
To verify the restore process, I write some data into a collection vinodh.testData after the backup and before the restore. So the newly inserted document shouldn’t be there after the restore:
# Using mongosh from the mongo container to see the data
# Listing data from collection vinodh.testData
$ kubectl run -i --rm --tty mongo-client --image=mongo:5.0.7 --restart=Never -- bash -c "mongosh --host=10.96.30.92 --username=root --password=password --authenticationDatabase=admin --eval \"db.getSiblingDB('vinodh').testData.find()\" --quiet "
If you don't see a command prompt, try pressing enter.
[ { _id: ObjectId("631956cc70e60e9ed3ecf76d"), id: 1 } ]
pod "mongo-client" deleted
Inserting a document into it:
$ kubectl run -i --rm --tty mongo-client --image=mongo:5.0.7 --restart=Never -- bash -c "mongosh --host=10.96.30.92 --username=root --password=password --authenticationDatabase=admin --eval \"db.getSiblingDB('vinodh').testData.insert({id:2})\" --quiet "
If you don't see a command prompt, try pressing enter.
DeprecationWarning: Collection.insert() is deprecated. Use insertOne, insertMany, or bulkWrite.
{ acknowledged: true, insertedIds: { '0': ObjectId("631980fe07180f860bd22534") } }
pod "mongo-client" deleted
Listing it again to verify:
$ kubectl run -i --rm --tty mongo-client --image=mongo:5.0.7 --restart=Never -- bash -c "mongosh --host=10.96.30.92 --username=root --password=password --authenticationDatabase=admin --eval \"db.getSiblingDB('vinodh').testData.find()\" --quiet "
If you don't see a command prompt, try pressing enter.
[
  { _id: ObjectId("631956cc70e60e9ed3ecf76d"), id: 1 },
  { _id: ObjectId("631980fe07180f860bd22534"), id: 2 }
]
pod "mongo-client" deleted
Running restore as follows:
$ k apply -f deploy/backup/restore-custom.yaml perconaservermongodbrestore.psmdb.percona.com/restore2 created
Now check the data again in vinodh.testData collection and verify whether the restore is done properly. The below data proves that the collection was restored from the backup as it is listing only the record from the backup:
$ kubectl run -i --rm --tty mongo-client --image=mongo:5.0.7 --restart=Never -- bash -c "mongosh --host=minimal-cluster-mongos --username=root --password=password --authenticationDatabase=admin --eval \"db.getSiblingDB('vinodh').testData.find()\" --quiet "
If you don't see a command prompt, try pressing enter.
[ { _id: ObjectId("631956cc70e60e9ed3ecf76d"), id: 1 } ]
Hope this helps! Now you can try the same on your end and use Backblaze in production if it suits your requirements. I haven’t tested the network performance yet. If you have used Backblaze or similar S3-compatible storage for backups, please share your experience with us in the comments.
The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.
Restore a Specific MongoDB Collection(s) From a Physical Backup
We all know how critical it is to get our data back as soon as possible. To achieve this, as you all might be aware, we have two methods of restore available for MongoDB: logical and physical.
Of course, this depends on the type of backup we have configured. For large data sets, it is advisable to use physical backups which offer faster backup and restore times. I’ve used Percona Backup for MongoDB (PBM) to take a physical backup in this demo.
But here’s a catch. It’s very complicated to restore specific collection(s) from a physical backup, be it any kind of backup method like volume snapshot, cold rsync data file copies, or a Hotbackup/Percona Backup for MongoDB.
The simplest approach, and the one almost everyone knows, is to restore the physical dump (i.e., the data files) onto a temporary mongod, take a logical dump of the collection(s) manually, and then do a logical restore to the original cluster. This is usually done with mongodump and mongorestore in conjunction.
But there is another way, where we can avoid spinning up a temporary mongod and a generally slow logical dump (mongodump).
To achieve this, in this blog post we’ll be taking a look at the “wt” utility which will help us to perform a restore.
As a prerequisite, we first need to build “wt” from either the GitHub repo or a downloaded tar archive, building it for a specific version. Now let’s jump straight into the steps.
In this blog, we’ll be building using the GitHub repo.
1. Clone the WiredTiger repo
$ git clone https://github.com/wiredtiger/wiredtiger.git
$ cd wiredtiger
2. Check your mongod version and checkout the same branch version.
$ git branch --all | grep mongodb
$ git checkout remotes/origin/mongodb-5.0   # I was running version 5.0
Depending on the type of compression method we’re using, we will install its library and configure its WiredTiger library extension. Snappy is the most commonly used at the moment, so I’ve used it here. But depending on the compression method you have, you can provide the corresponding library path after installing/configuring it.
$ ./configure --enable-zstd --enable-zlib --enable-lz4 --enable-snappy
# if it fails with 'configure: error', install the missing library first:
$ yum install libzstd-devel zlib-devel lz4-devel snappy-devel
$ make -j $(nproc)
Once we have the necessary dependencies installed and verified, we can execute the commands below to dive straight into the action of restoring the dropped collection.
Restoration time
1. Find out the relevant URI of the collection you want to restore. This can be checked on the original cluster:
demo:PRIMARY> db.demo.stats().wiredTiger.uri
statistics:table:collection-36-8063732008498576985
demo:PRIMARY> db.adminCommand({'getCmdLineOpts':1}).parsed.storage.dbPath
/var/lib/mongo
2. It’s necessary to check the compression method of the collection. Accordingly, we need to use the respective WiredTiger library extension. Look for ‘block_compressor’:
demo:PRIMARY> db.demo.stats().wiredTiger.creationString
.........block_compressor=snappy.........   // truncated output
3. There are different ways to take a physical dump like Percona Backup for MongoDB, Hotbackup, Snapshot based, or even a rsync to a separate volume/path. Let’s now take a physical backup using Percona Backup for MongoDB and then drop our collection.
========
FS /backup

Snapshots:
  2022-10-26T03:05:19Z 10.87MB <physical> [restore_to_time: 2022-10-26T03:05:21Z]

$ pbm describe-backup 2022-10-26T03:05:19Z
name: "2022-10-26T03:05:19Z"
opid: 6358a3ef2722ffe26150f98b
type: physical
last_write_time: "2022-10-26T03:05:21Z"
last_transition_time: "2022-10-26T03:05:26Z"
mongodb_version: 4.4.16-16
pbm_version: 2.0.1
status: done
size_h: 10.9 MiB
replsets:
- name: demo
  status: done
  last_write_time: "2022-10-26T03:05:21Z"
  last_transition_time: "2022-10-26T03:05:25Z"
  security: {}

$ ls -lrth /backup
total 12K
drwxr-xr-x. 3 mongod mongod  17 Oct 26 03:05 2022-10-26T03:05:19Z
-rw-r--r--. 1 mongod mongod 11K Oct 26 03:05 2022-10-26T03:05:19Z.pbm.json

$ ls -lrth /backup/2022-10-26T03:05:19Z/demo
total 28K
drwxr-xr-x. 4 mongod mongod   37 Oct 26 03:05 admin
drwxr-xr-x. 4 mongod mongod   37 Oct 26 03:05 local
-rw-r--r--. 1 mongod mongod 6.7K Oct 26 03:05 WiredTiger.backup.s2
drwxr-xr-x. 4 mongod mongod   37 Oct 26 03:05 config
-rw-r--r--. 1 mongod mongod 4.7K Oct 26 03:05 _mdb_catalog.wt.s2
-rw-r--r--. 1 mongod mongod 2.0K Oct 26 03:05 WiredTigerHS.wt.s2
drwxr-xr-x. 4 mongod mongod   37 Oct 26 03:05 percona
-rw-r--r--. 1 mongod mongod 1.9K Oct 26 03:05 sizeStorer.wt.s2
-rw-r--r--. 1 mongod mongod   68 Oct 26 03:05 WiredTiger.s2z
drwxr-xr-x. 2 mongod mongod   76 Oct 26 03:05 journal
As you can see, using PBM we can list all the underlying data files, which we’ll use for this demo.
demo:PRIMARY> use percona
switched to db percona
demo:PRIMARY> show collections
demo
demo:PRIMARY> db.demo.countDocuments({})
1000000
demo:PRIMARY>
demo:PRIMARY> db.demo.drop()   // accidentally dropped collection
true
demo:PRIMARY> show collections
demo:PRIMARY> db
percona
4. Take a “wt dump” of the URI file from Step 1 and convert it into the BSON format MongoDB understands. Otherwise, the regular output of “wt dump” would be a raw binary hex string.
$ ./wt -v -h /backup -C "extensions=[/root/wiredtiger/ext/compressors/snappy/.libs/libwiredtiger_snappy.so]" dump -x file:collection-36-8063732008498576985.wt | tail -n +7 | awk 'NR%2 == 0 { print }' | xxd -r -p > /backup/percona.demo.bson
If you have got a different compression method for the collection or on a global level, you can use its respective WiredTiger library extension.
/clonedPath/wiredtiger/ext/compressors/<compressionMethod>/.libs/libwiredtiger_<compMethod>.so
Note: Don’t forget to append ‘.wt’ at the end of URI and add the prefix “table:” or “file:”, as within WiredTiger, all collection files are in the table or file format.
Let’s look at and understand the different flags used in the above command.
- Extensions: based on our compression method, we use the respective WiredTiger compression library extension
- [-v, -x, dump, -h, -C]: these are WiredTiger binary flags to dump a URI file as a raw hex string, in verbose style, with the default or command-line-provided configuration via “-C”
- tail is used only to trim the header lines (everything before line 7) from the output of “wt dump”
- awk is used to filter out just the keys (line number NR%2 == 1) or values (line number NR%2 == 0) in the pretty-print or hex mode of wt dump.
- xxd with conjunction of ‘-r’ and ‘-p’ is to convert raw hex strings into mongodb known bson format
- Finally, we’re redirecting the output to filename “percona.demo.bson”. This is very important to keep the output filename as per WiredTiger catalog ident which is nothing but a proper full namespace of collection URI. In our case, it was “percona.demo”.
- If needed, it can be validated using the below command:
$ ./wt -v -h /backup -C "extensions=[/root/wiredtiger/ext/compressors/snappy/.libs/libwiredtiger_snappy.so]" dump -x table:_mdb_catalog | tail -n +7 | awk 'NR%2 == 0 { print }' | xxd -r -p | bsondump --quiet | jq -r 'select(. | has("md")) | [.ident, .ns] | @tsv' | sort | grep percona | awk '{print $2}'
percona.demo
5. Finally, restore bson using the native mongorestore.
$ mongorestore --authenticationDatabase "admin" --port 37017 -d percona -c demo /backup/percona.demo.bson
2022-10-17T03:03:43.232+0000  checking for collection data in /backup/percona.demo.bson
2022-10-17T03:03:43.240+0000  restoring percona.demo from /backup/percona.demo.bson
2022-10-17T03:03:46.231+0000  [####################....]  percona.demo  106MB/128MB  (83.4%)
2022-10-17T03:03:46.848+0000  [########################]  percona.demo  128MB/128MB  (100.0%)
2022-10-17T03:03:46.848+0000  finished restoring percona.demo (1000000 documents, 0 failures)
2022-10-17T03:03:46.848+0000  1000000 document(s) restored successfully. 0 document(s) failed to restore.

$ mongo --quiet --port 37017 percona
demo:PRIMARY> show collections
demo
demo:PRIMARY> db.demo.countDocuments({})
1000000
demo:PRIMARY> db.demo.findOne()
{
  "_id" : ObjectId("634cba61d42256b5fb5c9033"),
  "name" : "Blanche Potter",
  "age" : 38,
  "emails" : [ "filvo@av.hm", "pamjiiw@ewdofmik.sh", "opaibizaj@ha.ch" ]
}

SUMMARY
A few things to consider
1. You’re probably wondering about indexes, right? The same process can be done by dumping the index*.wt files, but it’s more complex, as the dumped keys and values have slightly different formats. I’ll cover it in a separate blog soon. It’s also worth mentioning that WiredTiger maintains a separate index* URI file for every index, so it’s easier to rebuild the indexes manually with the “createIndex” command (see the sketch after the listing below).
demo:PRIMARY> db.demo.getIndexKeys()
[ { "_id" : 1 }, { "a" : 1 }, { "b" : 1 } ]
demo:PRIMARY> db.demo.stats({'indexDetails': true}).indexDetails['_id_'].uri
statistics:table:percona/index/35-2625234990440311433
demo:PRIMARY> db.demo.stats({'indexDetails': true}).indexDetails['a_1'].uri
statistics:table:percona/index/69-2625234990440311433
demo:PRIMARY> db.demo.stats({'indexDetails': true}).indexDetails['b_1'].uri
statistics:table:percona/index/71-2625234990440311433
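For example, after the collection data is restored, the secondary indexes listed above could be rebuilt manually with something like the following (a sketch; adjust the port and key patterns to your environment):

$ mongo --quiet --port 37017 percona --eval "db.demo.createIndex({ a: 1 }); db.demo.createIndex({ b: 1 })"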
2. In order to perform point-in-time recovery, incremental oplogs still need to be replayed on top of the restored backup, which we have covered.
3. This method is applicable for sharded clusters as well (both unsharded and sharded collections) but there are a few additional steps that need to be taken. We’ll cover similarly detailed demos in upcoming blogs.
Before I wrap up, let’s talk about some drawbacks, since this is a somewhat risky and complicated approach. Test it out in your lab or test environment first to get familiar with WiredTiger internals before jumping straight into production.
Cons
- Overall, it’s a somewhat complicated approach, as one has to have a clear understanding of the “wt” utility and its internals.
- Indexes are not restored and need to be rebuilt separately, as mentioned.
Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.
Moving MongoDB Cluster to a Different Environment with Percona Backup for MongoDB
Percona Backup for MongoDB (PBM) is a distributed backup and restore tool for sharded and non-sharded clusters. In 1.8.0, we added the replset-remapping functionality that allows you to restore data on a new compatible cluster topology.
The new environment can have different replset names and/or serve on different hosts and ports. PBM handles this hard work for you, making such a migration indistinguishable from a usual restore. In this blog post, I’ll show you how to migrate to a new cluster in practice.
The Problem
Usually, changing a cluster topology involves lots of manual steps. PBM simplifies the process.
Let’s have a look at a case where we will have an initial cluster and a desired one.
Initial cluster:
configsrv: "configsrv/conf:27017"
shards:
- "rs0/rs0:27017,rs1:27017,rs2:27017"
- "extra-shard/extra:27018"
The cluster consists of the configsrv configsvr replset with a single node and two shards: rs0 (3 nodes in the replset) and extra-shard (1 node in the replset). The names, hosts, and ports are not conventional across the cluster but we will resolve this.
Target cluster:
configsrv: "cfg/cfg0:27019"
shards:
- "rs0/rs00:27018,rs01:27018,rs02:27018"
- "rs1/rs10:27018,rs11:27018,rs12:27018"
- "rs2/rs20:27018,rs21:27018,rs22:27018"
Here we have the cfg configsvr replset with a single node and three shards, rs0–rs2, where each shard is a 3-node replset.
Think about how you can do this.
With PBM, all we need is a deployed cluster and a logical backup made with PBM 1.5.0 or later. The following simple command will do the rest:
pbm restore $BACKUP_NAME --replset-remapping "cfg=configsrv,rs1=extra-shard"
Migration in Action
Let me show you how it looks in practice. I’ll provide details at the end of the post. In the repo, you can find all configs, scripts, and output used here.
As mentioned above, we need a backup. For this, we will deploy a cluster, seed data, and then make the backup.
Deploying the initial cluster
$> initial/deploy >initial/deploy.out
$> docker compose -f "initial/compose.yaml" exec pbm-conf \
     pbm status -s cluster
Cluster:
========
configsvr:
- configsvr/conf:27019: pbm-agent v1.8.0 OK
rs0:
- rs0/rs00:27017: pbm-agent v1.8.0 OK
- rs0/rs01:27017: pbm-agent v1.8.0 OK
- rs0/rs02:27017: pbm-agent v1.8.0 OK
extra-shard:
- extra-shard/extra:27018: pbm-agent v1.8.0 OK
links: initial/deploy, initial/deploy.out
The cluster is ready and we can add some data.
Seed data
We will insert the first 1000 numbers in a natural number sequence: 1 – 1000.
$> mongosh "mongo:27017/rsmap" --quiet --eval " for (let i = 1; i <= 1000; i++) db.coll.insertOne({ i })" >/dev/null
Getting the data state
These documents should be partitioned across the shards at insert time. Let’s see, in general, how. We will use the “dbHash” command on all shards to capture the collections’ state; it will be useful for verification later.
We will also do a quick check on shards and mongos.
$> initial/dbhash >initial/dbhash.out && cat initial/dbhash.out
# rs00:27017
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs01:27017
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs02:27017
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# extra:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs00:27017
db.getSiblingDB("rsmap").coll.find().sort({ i: 1 }).toArray().reduce(([count = 0, seq = true, next = 1], { i }) => [count + 1, seq && next == i, i + 1], []).slice(0, 2)
[ 520, false ]
# extra:27018
db.getSiblingDB("rsmap").coll.find().sort({ i: 1 }).toArray().reduce(([count = 0, seq = true, next = 1], { i }) => [count + 1, seq && next == i, i + 1], []).slice(0, 2)
[ 480, false ]
# mongo:27017
[ 1000, true ]
links: initial/dbhash, initial/dbhash.out
All rs0 members have the same data. So secondaries replicate from primary correctly.
The quickcheck.js used in the initial/dbhash script describes our documents. It returns the number of documents and whether these documents make the natural number sequence.
We have data for the backup. Time to make the backup.
Making a backup
$> docker compose -f initial/compose.yaml exec pbm-conf bash
pbm-conf> pbm backup --wait
Starting backup '2022-06-15T08:18:44Z'....
Waiting for '2022-06-15T08:18:44Z' backup.......... done
pbm-conf> pbm status -s backups
Backups:
========
FS /data/pbm
  Snapshots:
    2022-06-15T08:18:44Z 28.23KB <logical> [complete: 2022-06-15T08:18:49Z]
We have a backup. It’s enough for migration to the new cluster.
Let’s destroy the initial cluster and deploy the target environment. (Destroying the initial cluster is not a requirement. I just don’t want to waste resources on it.)
Deploying the target cluster
pbm-conf> exit
$> docker compose -f initial/compose.yaml down -v >/dev/null
$> target/deploy >target/deploy.out
links: target/deploy, target/deploy.out
Let’s check the PBM status.
PBM Status
$> docker compose -f target/compose.yaml exec pbm-cfg0 bash
pbm-cfg0> pbm config --force-resync   # ensure agents sync from storage
Storage resync started
pbm-cfg0> pbm status -s backups
Backups:
========
FS /data/pbm
  Snapshots:
    2022-06-15T08:18:44Z 28.23KB <logical> [incompatible: Backup doesn't match current cluster topology - it has different replica set names. Extra shards in the backup will cause this, for a simple example. The extra/unknown replica set names found in the backup are: extra-shard, configsvr. Backup has no data for the config server or sole replicaset] [2022-06-15T08:18:49Z]
As expected, it is incompatible with the new deployment.
See how to make it work
Resolving PBM Status
pbm-cfg0> export PBM_REPLSET_REMAPPING="cfg=configsvr,rs1=extra-shard"
pbm-cfg0> pbm status -s backups
Backups:
========
FS /data/pbm
  Snapshots:
    2022-06-15T08:18:44Z 28.23KB <logical> [complete: 2022-06-15T08:18:49Z]
Nice. Now we can restore.
Restoring
pbm-cfg0> pbm restore '2022-06-15T08:18:44Z' --wait
Starting restore from '2022-06-15T08:18:44Z'....Started logical restore.
Waiting to finish.....Restore successfully finished!
The --wait flag blocks the shell session until the restore completes. You don’t have to wait, though; you can check the status later.
pbm-cfg0> pbm list --restore
Restores history:
  2022-06-15T08:18:44Z
Everything is going well so far. Almost done.
Let’s verify the data.
Data verification
pbm-cfg0> exit
$> target/dbhash >target/dbhash.out && cat target/dbhash.out
# rs00:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs01:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs02:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs10:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs11:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs12:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs20:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ }
# rs21:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ }
# rs22:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ }
# rs00:27018
db.getSiblingDB("rsmap").coll.find().sort({ i: 1 }).toArray().reduce(([count = 0, seq = true, next = 1], { i }) => [count + 1, seq && next == i, i + 1], []).slice(0, 2)
[ 520, false ]
# rs10:27018
db.getSiblingDB("rsmap").coll.find().sort({ i: 1 }).toArray().reduce(([count = 0, seq = true, next = 1], { i }) => [count + 1, seq && next == i, i + 1], []).slice(0, 2)
[ 480, false ]
# rs20:27018
db.getSiblingDB("rsmap").coll.find().sort({ i: 1 }).toArray().reduce(([count = 0, seq = true, next = 1], { i }) => [count + 1, seq && next == i, i + 1], []).slice(0, 2)
[ ]
# mongo:27017
[ 1000, true ]
links: target/dbhash, target/dbhash.out
As you can see, the rs2 shard is empty. The other two have the same dbHash and quickcheck results as in the initial cluster. I think the balancer can tell us something about this.
Balancer status
$> mongosh "mongo:27017" --quiet --eval "sh.balancerCollectionStatus('rsmap.coll')"
{
  balancerCompliant: false,
  firstComplianceViolation: 'chunksImbalance',
  ok: 1,
  '$clusterTime': { clusterTime: Timestamp({ t: 1655281436, i: 1 }), signature: { hash: Binary(Buffer.from("0000000000000000000000000000000000000000", "hex"), 0), keyId: Long("0") } },
  operationTime: Timestamp({ t: 1655281436, i: 1 })
}
We know what to do: start the balancer and check the status again.
$> mongosh "mongo:27017" --quiet --eval "sh.startBalancer().ok"
1
$> mongosh "mongo:27017" --quiet --eval "sh.balancerCollectionStatus('rsmap.coll')"
{
  balancerCompliant: true,
  ok: 1,
  '$clusterTime': { clusterTime: Timestamp({ t: 1655281457, i: 1 }), signature: { hash: Binary(Buffer.from("0000000000000000000000000000000000000000", "hex"), 0), keyId: Long("0") } },
  operationTime: Timestamp({ t: 1655281457, i: 1 })
}

$> target/dbhash >target/dbhash-2.out && cat target/dbhash-2.out
# rs00:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs01:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs02:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "550f86eb459b4d43de7999fe465e39e0" }
# rs10:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs11:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs12:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "4a79c07e0cbf3c9076d6e2d81eb77f0a" }
# rs20:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "6a54e10a5526e0efea0d58b5e2fbd7c5" }
# rs21:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "6a54e10a5526e0efea0d58b5e2fbd7c5" }
# rs22:27018
db.getSiblingDB("rsmap").runCommand("dbHash").collections
{ "coll" : "6a54e10a5526e0efea0d58b5e2fbd7c5" }
# rs00:27018
db.getSiblingDB("rsmap").coll.find().sort({ i: 1 }).toArray().reduce(([count = 0, seq = true, next = 1], { i }) => [count + 1, seq && next == i, i + 1], []).slice(0, 2)
[ 520, false ]
# rs10:27018
db.getSiblingDB("rsmap").coll.find().sort({ i: 1 }).toArray().reduce(([count = 0, seq = true, next = 1], { i }) => [count + 1, seq && next == i, i + 1], []).slice(0, 2)
[ 480, false ]
# rs20:27018
db.getSiblingDB("rsmap").coll.find().sort({ i: 1 }).toArray().reduce(([count = 0, seq = true, next = 1], { i }) => [count + 1, seq && next == i, i + 1], []).slice(0, 2)
[ 229, false ]
# mongo:27017
[ 1000, true ]
links: target/dbhash-2.out
Interesting. The rs2 shard has some data now. However, rs0 and rs1 haven’t changed. That’s expected: mongos moves some chunks to rs2 and updates the router config, but physical deletion of chunks on a shard is a separate step. That’s why querying data directly on a shard is inaccurate: the data could disappear at any time. The cursor returns all documents available in a replset at the moment, regardless of the router config.
Anyway, we shouldn’t care about it anymore. It is the mongos/mongod responsibility now to update the router config, query the right shards, and remove moved chunks from shards on demand. In the end, we have valid data through mongos.
That’s it.
But wait, we didn’t make a backup! Never forget to make another solid backup.
Making a new backup
It’s better to change the storage, so that backups for the new deployment are kept in a different place and we no longer see errors about incompatible backups from the initial cluster.
$> pbm config --file "$NEW_PBM_CONFIG" >/dev/null
$> pbm config --force-resync >/dev/null
$> pbm backup -w >/dev/null
pbm-cfg0> pbm status -s backups
Backups:
========
FS /data/pbm
  Snapshots:
    2022-06-15T08:25:44Z 165.34KB <logical> [complete: 2022-06-15T08:25:49Z]
Now we’re done. And can sleep better.
One More Thing: Possible Misconfiguration
Let’s review another imaginary case to explain all possible errors.
Initial cluster: cfg, rs0, rs1, rs2, rs3, rs4, rs5
Target cluster: cfg, rs0, rs1, rs2, rs3, rs4, rs6
If we apply the remapping “rs0=rs0,rs1=rs2,rs2=rs1,rs3=rs4”, we will get an error like “missed replsets: rs3, rs5”. And nothing about rs6.
The missed rs5 should be obvious: backup topology has rs5 replset, but it is missed on target. And target rs6 does not have data to restore from. Adding rs6=rs5 fixes this.
But the missed rs3 could be confusing. Let’s visualize:
init | curr
-----+-----
cfg     cfg   # unchanged
rs0 --> rs0   # mapped. unchanged
rs1 --> rs2
rs2 --> rs1
rs3 -->       # err: no shard
rs4 --> rs3
     -> rs4   # ok: no data
rs5 -->       # err: no shard
     -> rs6   # ok: no data
When we remap the backup from rs4 to rs3, the target rs3 is reserved. The rs3 in the backup does not have a target replset now. Just remapping rs3 to available rs4 will fix it too.
This reservation avoids data duplication. That’s why we use the quick check via mongos.
Details
Compatible topology
Simply speaking, a compatible topology has an equal or larger number of shards in the target deployment. In our example, we had two shards initially but restored to three shards. PBM restored data onto two shards only. MongoDB can redistribute it across the remaining shards later, when the balancer is enabled (sh.startBalancer()). The number of replset members does not matter, because PBM takes a backup from one member per replset and restores it to the primary only. Other data-bearing members replicate the data from the primary. So you could make a backup from a multi-member replset and then restore it to a single-member replset.
You cannot restore to a different replset type like from shardsvr to configsvr.
Preconfigured environment
The cluster should be deployed with all shards added. Users and permissions should be added and assigned in advance. PBM agents in the new cluster should be configured to use the same storage and be able to access it.
Note: PBM agents store backup metadata on the storage and keep a cache in MongoDB. pbm config --force-resync lets you refresh the cache from the storage. Do it on a new cluster right after deployment to see backups/oplog chunks made from the initial cluster.
Understanding replset remapping
You can remap replset names with the --replset-remapping flag or the PBM_REPLSET_REMAPPING environment variable. If both are set, the flag takes precedence.
For full restore, point-in-time recovery, and oplog replay, PBM CLI sends the mapping as a parameter in the command. Each command gets a separate explicit mapping (or none). It can be done only by CLI. Agents do not use the environment variable nor have the flag.
pbm status and pbm list use the flag/environment variable to remap replset names in the backup/oplog metadata and apply this mapping to the current deployment to display them properly. If the backup and current replset names do not match, pbm list will not show these backups, and pbm status prints an error with the missed replset names.
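For example, to view backups made on the old topology from the new cluster (a sketch using the same mapping as above):

$> pbm status -s backups --replset-remapping "cfg=configsvr,rs1=extra-shard"
$> pbm list --replset-remapping "cfg=configsvr,rs1=extra-shard"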
Restoring with remapping works with logical backups only.
How does PBM do this?
During restore, PBM reviews current topology and assigns members’ snapshots and oplog chunks to each shard/replset by name, respectively. The remapping changes the default assignment.
After the restore is done, PBM agents sync the router config to make the restored data “native” to this cluster.
Behind the scene
The config.shards collection describes the current topology. PBM uses it to know where and what to restore. This collection is not modified by PBM. But the restored data contains other router configuration for the initial topology.
We updated two collections to replace old shard names with new ones in restored data:
- config.databases – primary shard for non-sharded databases
- config.chunks – shards where chunks are
After this, MongoDB knows where databases, collections, and chunks are in the new cluster.
CONCLUSION
Migrating a cluster requires a lot of attention, knowledge, and calm. The replset-remapping functionality in Percona Backup for MongoDB reduces the complexity of migration between two different environments. I would say it is close to a routine job now.
Have a nice day