Data-at-rest encryption (also known as transparent data encryption or TDE) is a necessary mechanism for ensuring the security of a DBMS deployment. Upcoming releases of Percona Server for MongoDB extend that mechanism with the KMIP key state polling feature. In this technical post, I will describe how the feature works and how it helps reduce […]
09
2024
05
2024
How to Upgrade MongoDB Using Backups Through Many Major Versions
Companies use specific database versions because they’re proven performers or because it’s hard to keep up with frequent releases. But lagging behind has some major issues. When it’s time to upgrade, is it better to update binaries through each major revision or skip versions? TL;DR: Upgrading a MongoDB cluster using backups and skipping versions is […]
17
2024
Using Compact in Percona Server for MongoDB From Version 4.4+
In the previously posted blog, Compaction in Percona Server for MongoDB (PSMDB), we discussed how compact works before version 4.4. In this blog, we will see how compact works on PSMDB 6.0. I recommend reading the blog post linked above to understand what compact does, how to check dataSize, and how much space we can […]
11
2023
Migrate Data From Atlas to Self-Hosted MongoDB
In this blog post, we will discuss how we can migrate data from MongoDB Atlas to self-hosted MongoDB. There are a couple of third-party tools in the market to migrate data from Atlas to Pecona Server for MongoDB (PSMDB), like MongoPush, Hummingbird, and MongoShake. Today, we are going to discuss how to use MongoShake and migrate and sync the data from Atlas to PSMDB.
NOTE: These tools are not officially supported by Percona.
MongoShake is a powerful tool that facilitates the migration of data from one MongoDB cluster to another. These are step-by-step instructions on how to install and utilize MongoShake for data migration from Atlas to PSMDB. So, let’s get started!
Prerequisites:
A MongoDB Atlas account. I created a test account (replica set) and loaded sample data with one click in Atlas:
- Create an account in Atlas.
- Create a cluster.
- Once a cluster is created, go to browse collections.
- It will ask for load sample data. Once you click on it, you will see the sample data like below.
Atlas atlas-mhnnqy-shard-0 [primary] test> show dbs sample_airbnb 52.69 MiB sample_analytics 9.44 MiB sample_geospatial 1.23 MiB sample_guides 40.00 KiB sample_mflix 109.43 MiB sample_restaurants 6.42 MiB sample_supplies 1.05 MiB sample_training 46.77 MiB sample_weatherdata 2.59 MiB admin 336.00 KiB local 20.35 GiB Atlas atlas-mhnnqy-shard-0 [primary] test>
An EC2 instance with PSMDB installed. I installed PSMDB on the EC2 machine:
rs0 [direct: primary] test> rs0 [direct: primary] test> show dbs admin 40.00 KiB config 12.00 KiB local 40.00 KiB rs0 [direct: primary] test>
Make sure Atlas and PSMDB both have the same DB version (I have also used this tool on MongoDB 4.2, which is already EOL).
PSMDB version:
rs0 [direct: primary] test> db.version() 6.0.9-7 rs0 [direct: primary] test>
MongoDB Atlas version:
Atlas atlas-mhnnqy-shard-0 [primary] test> db.version() 6.0.10 Atlas atlas-mhnnqy-shard-0 [primary] test>
To install MongoShake, follow these steps:
Step 1: Install Go
Ensure that Go is installed on your system. If not, download it from the official website and follow the installation instructions. I used Amazon Linux 2, so used the below command to install go:
sudo yum install golang -y
Step 2: Install MongoShake
Open the terminal and run the following command to install MongoShake:
git clone https://github.com/alibaba/MongoShake.git
- Untar the file; it will create a folder with the name Mongoshake.
- cd MongoShake.
- Run ./build.sh file.
Once you have installed MongoShake, you need to configure it for the migration process. Here’s how:
- Configuration file (collector.conf) will be under conf dir under Mongoshake dir.
- In the config file, you can edit the URI for both RS or sharded clusters. Also, the tunnel (how you are migrating the data) method. If you are doing it directly, then the value will be direct. You can edit the log file path and log file name. Below are some important parameters:
mongo_urls = mongodb+srv://gautam:****@cluster0.teeeayh.mongodb.net/ // Atlas conn string Tunnel.address = mongodb:127.0.0.1:27017 // PSMDB conn string Sync_mode = all // default incr log.dir = /home/percona/MongoShake/log/ // default /root/mongoshake/
Sync_mode other options: all/full/incr.
- All means full synchronization + incremental synchronization. (copy the data and apply the oplogs after sync completes).
- Full means full synchronization only. (only copy the data).
- Incr means incremental synchronization only. (only apply the oplog).
There are other parameters as well in the configuration file, which you can tune as per your needs. For example, if you want to read data from the Secondary node and do not want to overwhelm the Primary with the reads, you can set below parameter:
mongo_connect_mode = secondaryPreferred
Step 3: Once you are done with the configuration, run MongoShake in a screen session like the one below:
./bin/collector.linux -conf=conf/collector.conf -verbose 0
Step 4: Monitor the log file in the log directory to check the progress of migration.
Below is the sample log when you start MongoShake:
[2023/09/25 21:09:13 UTC] [INFO] New session to mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/ successfully [2023/09/25 21:09:13 UTC] [INFO] Close client with mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/ [2023/09/25 21:09:13 UTC] [INFO] New session to mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/ successfully [2023/09/25 21:09:19 UTC] [INFO] Close client with mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/ [2023/09/25 21:09:19 UTC] [INFO] GetAllTimestamp biggestNew:{1695675385 26}, smallestNew:{1695675385 26}, biggestOld:{1695668185 9}, smallestOld:{1695668185 9}, MongoSource:[url[mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/], name[atlas-mhnnqy-shard-0]], tsMap:map[atlas-mhnnqy-shard-0:{7282839399442677769 7282870323207208986}] [2023/09/25 21:09:19 UTC] [INFO] all node timestamp map: map[atlas-mhnnqy-shard-0:{7282839399442677769 7282870323207208986}] CheckpointStartPosition:{1 0} [2023/09/25 21:09:19 UTC] [INFO] New session to mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/ successfully [2023/09/25 21:09:19 UTC] [INFO] atlas-mhnnqy-shard-0 Regenerate checkpoint but won't persist. content: {"name":"atlas-mhnnqy-shard-0","ckpt":1,"version":2,"fetch_method":"","oplog_disk_queue":"","oplog_disk_queue_apply_finish_ts":1} [2023/09/25 21:09:19 UTC] [INFO] atlas-mhnnqy-shard-0 checkpoint using mongod/replica_set: {"name":"atlas-mhnnqy-shard-0","ckpt":1,"version":2,"fetch_method":"","oplog_disk_queue":"","oplog_disk_queue_apply_finish_ts":1}, ckptRemote set? [false] [2023/09/25 21:09:19 UTC] [INFO] atlas-mhnnqy-shard-0 syncModeAll[true] ts.Oldest[7282839399442677769], confTsMongoTs[4294967296] [2023/09/25 21:09:19 UTC] [INFO] start running with mode[all], fullBeginTs[7282870323207208986[1695675385, 26]]
You will see the below log once full sync is completed, and incr will start (incr means it will start syncing live data via oplog):
[2023/09/25 22:12:04 UTC] [INFO] GetAllTimestamp biggestNew:{1695679924 3}, smallestNew:{1695679924 3}, biggestOld:{1695677613 1}, smallestOld:{1695677613 1}, MongoSource:[url[mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/], name[atlas-mhnnqy-shard-0]], tsMap:map[atlas-mhnnqy-shard-0::{7282879892394344449 7282889818063765507}] [2023/09/25 22:12:04 UTC] [INFO] ------------------------full sync done!------------------------ [2023/09/25 22:12:04 UTC] [INFO] oldestTs[7282879892394344449[1695677613, 1]] fullBeginTs[7282889689214746625[1695679894, 1]] fullFinishTs[7282889818063765507[1695679924, 3]] [2023/09/25 22:12:04 UTC] [INFO] finish full sync, start incr sync with timestamp: fullBeginTs[7282889689214746625[1695679894, 1]], fullFinishTs[7282889818063765507[1695679924, 3]] [2023/09/25 22:12:04 UTC] [INFO] start incr replication
You will see the logs like this when both nodes are in sync (when lag is 0, i.e., tps=0):
[2023/09/25 22:14:41 UTC] [INFO] [name=atlas-mhnnqy-shard-0, stage=incr, get=24, filter=24, write_success=0, tps=0, ckpt_times=0, lsn_ckpt={0[0, 0], 1970-01-01 00:00:00}, lsn_ack={0[0, 0], 1970-01-01 00:00:00}]] [2023/09/25 22:14:46 UTC] [INFO] [name=atlas-mhnnqy-shard-0, stage=incr, get=24, filter=24, write_success=0, tps=0, ckpt_times=0, lsn_ckpt={0[0, 0], 1970-01-01 00:00:00}, lsn_ack={0[0, 0], 1970-01-01 00:00:00}]] [2023/09/25 22:14:51 UTC] [INFO] [name=atlas-mhnnqy-shard-0, stage=incr, get=25, filter=25, write_success=0, tps=0, ckpt_times=0, lsn_ckpt={0[0, 0], 1970-01-01 00:00:00}, lsn_ack={0[0, 0], 1970-01-01 00:00:00}]] [2023/09/25 22:14:56 UTC] [INFO] [name=atlas-mhnnqy-shard-0, stage=incr, get=25, filter=25, write_success=0, tps=0, ckpt_times=0, lsn_ckpt={0[0, 0], 1970-01-01 00:00:00}, lsn_ack={0[0, 0], 1970-01-01 00:00:00}]]
Once the full data replication process is complete and both clusters are in sync, you can stop pointing the application to Atlas. Check the logs of MongoShake, and when the lag is 0, as we can see in the above logs, stop the replication/sync from Atlas or stop MongoShake. Verify that the data has been successfully migrated to PSMDB. You can use MongoDB shell or any other client to connect to the PSMDB instance to verify this.
MongoDB Atlas databases and their collection count:
Database: sample_airbnb ----- Collection 'listingsAndReviews' documents: 5555 Database: sample_analytics ----- Collection 'transactions' documents: 1746 Collection 'accounts' documents: 1746 Collection 'customers' documents: 500 Database: sample_geospatial ----- Collection 'shipwrecks' documents: 11095 Database: sample_guides ----- Collection 'planets' documents: 8 Database: sample_mflix ----- Collection 'embedded_movies' documents: 3483 Collection 'users' documents: 185 Collection 'theaters' documents: 1564 Collection 'movies' documents: 21349 Collection 'comments' documents: 41079 Collection 'sessions' documents: 1 Database: sample_restaurants ----- Collection 'neighborhoods' documents: 195 Collection 'restaurants' documents: 25359 Database: sample_supplies ----- Collection 'sales' documents: 5000 Database: sample_training ----- Collection 'posts' documents: 500 Collection 'trips' documents: 10000 Collection 'grades' documents: 100000 Collection 'routes' documents: 66985 Collection 'inspections' documents: 80047 Collection 'companies' documents: 9500 Collection 'zips' documents: 29470 Database: sample_weatherdata ----- Collection 'data' documents: 10000 Atlas atlas-mhnnqy-shard-0 [primary] sample_weatherdata>
PSDMB databases and their collection count:
rs0 [direct: primary] test> show dbs admin 80.00 KiB config 240.00 KiB local 468.00 KiB mongoshake 56.00 KiB sample_airbnb 52.20 MiB sample_analytics 9.21 MiB sample_geospatial 984.00 KiB sample_guides 40.00 KiB sample_mflix 108.17 MiB sample_restaurants 5.57 MiB sample_supplies 980.00 KiB sample_training 40.50 MiB sample_weatherdata 2.39 MiB rs0 [direct: primary] test>
Database: sample_airbnb ----- Collection 'listingsAndReviews' documents: 5555 Database: sample_analytics ----- Collection 'transactions' documents: 1746 Collection 'accounts' documents: 1746 Collection 'customers' documents: 500 Database: sample_geospatial ----- Collection 'shipwrecks' documents: 11095 Database: sample_guides ----- Collection 'planets' documents: 8 Database: sample_mflix ----- Collection 'embedded_movies' documents: 3483 Collection 'users' documents: 185 Collection 'theaters' documents: 1564 Collection 'movies' documents: 21349 Collection 'comments' documents: 41079 Collection 'sessions' documents: 1 Database: sample_restaurants ----- Collection 'neighborhoods' documents: 195 Collection 'restaurants' documents: 25359 Database: sample_supplies ----- Collection 'sales' documents: 5000 Database: sample_training ----- Collection 'posts' documents: 500 Collection 'trips' documents: 10000 Collection 'grades' documents: 100000 Collection 'routes' documents: 66985 Collection 'inspections' documents: 80047 Collection 'companies' documents: 9500 Collection 'zips' documents: 29470 Database: sample_weatherdata ----- Collection 'data' documents: 10000 rs0 [direct: primary] sample_weatherdata>
Above, you can see we have verified data in PSMDB. Now, update the connection string of the application to point to PSMDB.
NOTE: Sometimes, during the migration process, it is possible for some indexes to replicate. So, during the data verification process, please verify the indexes, and if an index is missing, create that index before the cutover time.
Conclusion
MongoShake simplifies the process of migrating MongoDB data from Atlas to self-hosted MongoDB. Percona experts can assist you with migration as well. By following the steps outlined in this blog, you can seamlessly install, configure, and utilize MongoShake for migrating your data from MongoDB Atlas.
To learn more about the enterprise-grade features available in the license-free Percona Server for MongoDB, we recommend going through our blog MongoDB: Why Pay for Enterprise When Open Source Has You Covered?
Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.
10
2023
Percona Server for MongoDB 7 Is Now Available
Databases are different from a lot of software. For one, they often favor stability over innovation. This is not a general rule, but as databases are responsible for a core layer of any IT system – data storage and processing — they require reliability. This requirement does not always pair with the latest and greatest improvements that have not been hardened over time.
Even with that, the fact that MongoDB 5.0 is planned for EOL in October 2024 and MongoDB 6.0 is planned for EOL in July 2025 should put MongoDB 7.0 on your radar. Even if you are not considering all the interesting improvements that have been added by the development team from MongoDB, this new version is already very important from the database supportability and lifecycle planning perspective.
Why choose Percona Server for MongoDB?
Percona provides a drop-in replacement solution for MongoDB Community Edition that is based on the same upstream code delivered by MongoDB, Inc. The difference between Percona Server for MongoDB and MongoDB CE is that we strive to provide a gap-closing set of features for users who want to use MongoDB in production. These enterprise features include, but are not limited to:
- Security improvements – Among which is the KMIP and Hashicorp Vault integration.
- Availability solutions – Advanced backups, including physical backups and point-in-time recovery that are not available to MongoDB Community Edition.
- K8s Operator – An enterprise-grade k8s operator to run your workloads in Kubernetes.
- Percona Monitoring and Management (PMM) – A fully open source monitoring tool to help you run your databases (not limited to MongoDB).
Why release Percona Server for MongoDB 7 now?
The dev teams from MongoDB, Inc. delivering the Upstream code do a great job and build a very solid tool. With that said, each new major version, by definition, introduces enough big changes to require a certain amount of precaution.
We explicitly delay the release of each main version of MongoDB server to take extra time to validate whether all of our added functionalities work well with the given version.
We also spend extra time to ensure that the quality of the release is good enough for our customers to start using. Think of us as the extra set of eyes, the extra layer of QA to ensure your safety passage to the next database version.
This time around, the first version we were able to release as Release Candidate (RC) was 7.0.2, which was released as Percona Server for MongoDB RC 7.0.2-1. Expect GA release soon to follow.
Important changes in MongoDB 7
One of the most eyebrow-raising changes that MongoDB 7.0 introduces is the limitation of the downgrade process. Reading the below would not make me feel at ease while performing an upgrade:
Binary downgrades are no longer supported for MongoDB Community Edition. (source)
followed by
Starting in MongoDB 7.0, you cannot downgrade your Enterprise deployment’s binary version without assistance from support. (source)
What it means is that there are some important changes coming with 7.0 that can also be very beneficial for you. What it also means is that in case of any problems with your upgrade, as soon as you change the fCV to 7.0, the way back will be closed without a time-consuming and operationally complicated logical restore or complicated and tailor-suited solutions requiring a lot of experience.
What it also means is that binary downgrade may still be possible for enterprise customers of MongoDB. Documentation does not limit that option.
Percona offers upgrade support to get you safely through the upgrade process. We also provide Managed Services to take this stress off your shoulders so that our top-of-the-market experts handle your databases for you.
To check out the list of changes that MongoDB 7 introduces, check out the summary write-up in Percona Server for MongoDB 7.0.2-1 release or the full release notes from MongoDB Inc.
What’s coming next
We are working on improvements available previously only to the MongoDB Enterprise users that will impact scalability and availability of especially large datasets.
We also want to focus more on the security aspects not available outside of MongoDB Enterprise.
Our Operators Team is also working on improvements especially important for sharded clusters, and the Percona Monitoring and Management team is planning to look into more scalability and management-enhancing options.
Stay tuned for more news about MongoDB offerings.
17
2023
An Enterprise-Grade MongoDB Alternative Without Licensing or Lock-in
MongoDB Community Edition software might set the stage for achieving your high-volume database goals, but you quickly learn that its features fall short of your enterprise needs.
So you look at MongoDB Enterprise software, but its costly and complex licensing structure does not meet your budget goals. You’re also not certain its features will always align with your evolving technology needs. What’s more, you’re wary of the expenses and restrictions of vendor lock-in.
Still, you don’t want to ditch the advantages of MongoDB Enterprise software. But you can’t absorb the negatives, either.
Don’t despair; there are alternatives. In this blog, we’ll examine the reasons why people would seek an alternative to MongoDB Enterprise, and we’ll identify some of the most popular NoSQL alternatives. Then, we’ll highlight some reasons why Percona Software for MongoDB might be the alternative you seek.
First, some stage-setting for this blog article.
The popularity of MongoDB
MongoDB has emerged as a popular database platform. It ranks No. 5 among all database management systems and No. 1 among non-relational/document-based systems (DB-Engines, July 2023).
Developers and DBAs like MongoDB’s ease-of-use. Instead of the table-based structure of relational databases, MongoDB stores data in documents and collections, a design for handling large amounts of unstructured data and for real-time web applications. DBAs and developers appreciate its combination of flexibility, scalability, and performance.
More specifically, DBAs and developers like that MongoDB stores data in JSON-like documents with optional schemas. It’s a good setup for real-time analytics and high-speed logging.
Taking a deeper dive, MongoDB is a system for analyzing data because documents are easily shared across multiple nodes and because of its indexing, query-on-demand, and real-time aggregation capabilities. MongoDB replica sets enable data redundancy and automatic failover, setting the stage (there’s that term again) for high availability. MongoDB also provides strong encryption and firewall security. MongoDB is preferable for working with content management systems and mobile apps.
And MongoDB is popular across industries. A survey of 90,240 companies using MongoDB listed the leading uses as Technology and Services (23%), Computer Software (16%), and Internet (6%).
So, if MongoDB provides such a great foundation, why not just step up to MongoDB Enterprise?
Why businesses seek an alternative to MongoDB Enterprise
The reasons for choosing an alternative to MongoDB Enterprise vary depending on business objectives, technical requirements, on-staff expertise, and project specifications. But there are common concerns about MongoDB Enterprise that drive people to seek alternatives.
Those problems (some with shared elements) include:
- High cost and complicated pricing structure: Many companies say MongoDB has an expensive and complicated pricing structure (Cloud Zero, January 2023). MongoDB Enterprise is a commercial (proprietary) product with licensing fees, as well as support and maintenance charges that can rapidly escalate. With the potentially high costs and complications of tiers, pay-as-you-go, hourly and monthly rates, etc., companies and organizations seek alternatives that offer similar functionality at a lower cost.
- Limited toolset: MongoDB Enterprise offers advanced features for data encryption, authentication, auditing, access control, and more. But you can be out of luck and forced to spend additional money if business-critical objectives require specific features or capabilities unavailable in MongoDB Enterprise.
- Not really open source: Even the MongoDB Community version is not open source; it’s source-available and is under the SSPL license (introduced by MongoDB itself). MongoDB Enterprise, built on the Community version, adds proprietary features and database management tools. MongoDB Enterprise is commercial. Customers miss out on the cost-effectiveness, creative freedom, and global community support (for innovation, better performance, and enhanced security) that come with open source solutions and from companies with an open source spirit.
- Inflexibility: Proprietary software puts a company — and its ability to tailor solutions to fit specific use cases — at the mercy of the software vendor. Conversely, open source database software (and companies that support source-available software with open source terms) provides the flexibility to customize and modify the software to suit specific requirements. Organizations have access to the source code, allowing them to make changes and enhancements as needed. This level of flexibility is particularly valuable for businesses with unique or specialized needs.
- Vendor lock-in: Relying on contracted MongoDB Enterprise support to address immediate concerns, reduce complexity, and provide a secure database might provide initial comfort, but trepidation about vendor lock-in would be legitimate. Concerns about price hikes, paying for unnecessary technology, and being blocked from new technology can provide the impetus to seek an alternative to MongoDB Enterprise. Companies might opt for alternative databases that offer more of the aforementioned flexibility and the ability to migrate to different platforms or technologies in the future without significant challenges.
- Infrastructure incompatibility: Organizations might have existing tools and applications that are not readily compatible with MongoDB Enterprise software. If an alternative database has better compatibility or provides specific integrations with the company’s existing technology stack, that might be a compelling reason to seek an alternative to MongoDB Enterprise.
- Needless complexity: For smaller companies and/or those with simpler needs or limited database budgets and resources, some of the advanced features in MongoDB Enterprise could introduce unwanted complexity in the database environment. Such organizations might seek a more straightforward alternative.
Alternatives to MongoDB Enterprise
There are plenty of alternatives, but we’re not going to cover them all here, of course. Instead, let’s examine a few of the more popular non-relational (NoSQL) database options.
MongoDB Community
We’ve already touched on MongoDB Community’s licensing, but let’s address some of what the software lacks to be a viable technology alternative to its Enterprise sibling.
For starters, MongoDB Community lacks the advanced security features available in MongoDB Enterprise. It also lacks more advanced monitoring and management features like custom alerting, automation, and deeper insights into database performance that are part of MongoDB Enterprise. While it offers basic backup and restore functionality, the Community version lacks advanced features in MongoDB Enterprise, such as continuous backups, point-in-time recovery (PITR), and integration with third-party backup tools.
And from a support standpoint, for the loss of a better word, it’s lacking. The MongoDB Community edition does not come with official technical support or service level agreements (SLAs) from MongoDB Inc.
Redis
Known for exceptional performance, Redis is a popular in-memory data platform. It stores data in RAM, which enables fast data access and retrieval. Redis can handle a high volume of operations per second, making it useful for running applications that require low latency. Redis supports a wide range of data structures, including strings, lists, sets, sorted sets, and hashes. Developers appreciate how Redis supplies appropriate data structures for specific use cases.
Storing large datasets can be a challenge, as Redis’ storage capacity is limited by available RAM. Also, Redis is designed primarily for key-value storage and lacks advanced querying capabilities.
Redis ranks right after MongoDB as the sixth most popular database management system (DB-Engines, July 2023).
Apache Cassandra
Apache Cassandra, with users across industries, ranks as the 12th most popular database management system (DB-Engines, July 2023). It’s an open source distributed NoSQL database that offers high scalability and availability. It manages unstructured data with thousands of writes every second.
Fault tolerance and linear scalability make Cassandra popular for handling mission-critical data. But because Cassandra handles large amounts of data and multiple requests, transactions can be slower, and there can be memory management issues.
Couchbase
Couchbase is a distributed document store with a powerful search engine and built-in operational and analytical capabilities. It’s designed to handle high volumes of data with minimal delay. An in-memory caching mechanism supports horizontal scaling, which enables it to handle large-scale applications and workloads effectively. Couchbase — No. 32 in database popularity (DB-Engines, July 2023) — uses a distributed, peer-to-peer architecture that enables data replication and automatic sharding across multiple nodes. This architecture ensures high availability, fault tolerance, and resilience to failures.
With Couchbase, certain tasks can be more challenging or time-consuming. Its indexing mechanisms are not as well-developed as those of some other database solutions. Additionally, it has its own query language, so the learning curve can be steeper.
Percona’s MongoDB alternative — enterprise advantages with none of the strings
Here’s one more alternative: If you want enterprise-grade MongoDB — without the high cost of runaway licensing fees or restrictions of vendor lock-in — consider Percona Software for MongoDB.
Secure, enterprise-grade Percona Software for MongoDB is freely available and empowers you to operate the production environments you want, wherever you want. Benefits include:
- High performance without lock-in — Operate production environments requiring high-performance, highly available, and secure databases. Do it without licensing costs and vendor lock-in.
- Data durability — Ensure it via an open source, distributed, and low-impact solution for consistent backups of MongoDB sharded clusters and replica sets.
- Scalability — Freely deploy and scale MongoDB in a public or private cloud, on-premises, or hybrid environment—no credit card required.
- MongoDB database health checks — Monitor, receive alerts, manage backups, and diagnose user-impacting incidents rooted in database configuration.
- Automated procedures and accelerated value — Automate deployments, scaling, and backup and restore operations of MongoDB on Kubernetes.
For MongoDB users, Percona offers:
- Percona Server for MongoDB — A source-available, fully compatible drop-in replacement for the MongoDB Community Edition with enterprise security, backup, and developer-friendly features.
- Percona Backup for MongoDB — This is a fully supported, 100% open source community backup tool for MongoDB. It creates a physical data backup on a running server without notable performance and operating degradation. Percona Backup for MongoDB offers PITR and a backup management interface via Percona Monitoring and Management (PMM).
- Percona Toolkit for MongoDB — A collection of advanced open source command-line tools that are engineered to perform a variety of tasks too difficult or complex to perform manually.
- Percona Distribution for MongoDB — A collection of Percona for MongoDB software offerings integrated with each other and packed into a single solution that maximizes performance while being more cost-effective for teams to run over time.
- Percona Monitoring and Management (PMM) — An open source database observability monitoring and management tool that’s ideal for finding MongoDB database issues.
With Percona Software for MongoDB, you can ensure data availability while improving security and simplifying the development of new applications — in the most demanding public, private, and hybrid cloud environments.
And with Percona, you’re never on your own. We back our MongoDB offerings with Percona Support, Managed Services, and Consulting. We’ll provide support that best fits the needs of your company or organization — without a restrictive contract.
29
2023
CommitQuorum in Index Creation From Percona Server for MongoDB 4.4
Before Percona Server for MongoDB 4.4 (PSMDB), the best practice to create an index was doing it in a rolling manner. Many folks used to create directly on Primary, resulting in the first index being created successfully on Primary and then replicated to Secondary nodes.
Starting from PSMDB 4.4, there was a new parameter commitQuorum introduced in the createIndex command. If you are not passing this parameter explicitly with the createIndex command, it will use the default settings on a replica set or sharded cluster and start building the index simultaneously across all data-bearing voting replica set members.
Below is the command used to create an index using commitQuorum as the majority:
db.getSiblingDB("acme").products.createIndex({ "airt" : 1 }, { }, "majority")
The above command will run the index create command on the majority of data-bearing replica set members. There are other options available too when using commitQuorum:
- “Voting Members” – This is the default behavior when an index will be created on all data-bearing voting replica set members (Default). A “voting” member is any replica set member where votes are greater than 0.
- “Majority” – A simple majority of data-bearing replica set members.
- “<int>” – A specific number of data-bearing replica set members. Specify an integer greater than 0.
- “Tag name” – A replica set tag name of a node is used.
Now we will see the scenarios of what happens when the index is created with the default and majority commitQuorum.
- When all data-bearing replica set members are available, and the index is created with default commitQuorum, below are the details from the Primary and the Secondary nodes. Create index:
rs1:PRIMARY> db.products.createIndex({ "airt" : 1 })
Primary logs:
{"t":{"$date":"2023-06-26T12:33:18.417+00:00"},"s":"I", "c":"INDEX", "id":20384, "ctx":"IndexBuildsCoordinatorMongod-0","msg":"Index build: starting","attr":{"namespace":"acme.products","buildUUID":{"uuid":{"$uuid":"58f4e7bf-7b8f-4eb6-8de0-0ad774c4b51f"}},"properties":{"v":2,"key":{"airt":1.0},"name":"airt_1"},"method":"Hybrid","maxTemporaryMemoryUsageMB":200}}
Secondary logs:
{"t":{"$date":"2023-06-26T12:33:18.417+00:00"},"s":"I", "c":"INDEX", "id":20384, "ctx":"IndexBuildsCoordinatorMongod-0","msg":"Index build: starting","attr":{"namespace":"acme.products","buildUUID":{"uuid":{"$uuid":"58f4e7bf-7b8f-4eb6-8de0-0ad774c4b51f"}},"properties":{"v":2,"key":{"airt":1.0},"name":"airt_1"},"method":"Hybrid","maxTemporaryMemoryUsageMB":200}}}
Secondary logs:
{"t":{"$date":"2023-06-26T12:33:28.445+00:00"},"s":"I", "c":"INDEX", "id":20384, "ctx":"IndexBuildsCoordinatorMongod-0","msg":"Index build: starting","attr":{"namespace":"acme.products","buildUUID":{"uuid":{"$uuid":"58f4e7bf-7b8f-4eb6-8de0-0ad774c4b51f"}},"properties":{"v":2,"key":{"airt":1.0},"name":"airt_1"},"method":"Hybrid","maxTemporaryMemoryUsageMB":200}}
We can see the above index was created simultaneously on all the data-bearing voting replica set members.
- When one secondary is down, and the index is created with default commitQuorum, below are the details from the Primary and the Secondary nodes.
Status of nodes:
rs1:PRIMARY> rs.status().members.forEach(function (d) {print(d.name) + " " + print(d.stateStr)}); 127.0.0.1:27017 PRIMARY localhost:27018 SECONDARY localhost:27019 (not reachable/healthy) rs1:PRIMARY>
Index command:
rs1:PRIMARY> db.products.createIndex({ "airt" : 1 })
Replication status:
rs1:PRIMARY> db.printSecondaryReplicationInfo() source: localhost:27018 syncedTo: Mon Jun 26 2023 17:56:30 GMT+0000 (UTC) 0 secs (0 hrs) behind the primary source: localhost:27019 syncedTo: Thu Jan 01 1970 00:00:00 GMT+0000 (UTC) 1687802190 secs (468833.94 hrs) behind the primary rs1:PRIMARY>
Index status:
rs1:PRIMARY> db.currentOp(true).inprog.forEach(function(op){ if(op.msg!==undefined) print(op.msg) }) Index Build: draining writes received during build rs1:PRIMARY> Date() Mon Jun 26 2023 18:07:26 GMT+0000 (UTC) rs1:PRIMARY>
CurrentOp:
"active" : true, "currentOpTime" :"2023-06-26T19:04:33.175+00:00", "opid" : 329147, "lsid" : { "id" :UUID("dd9672f8-4f56-47ce-8ceb-31caf5e8baf8"), "uid": BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=") }, "secs_running" : NumberLong(4214), "microsecs_running" : NumberLong("4214151233"), "op" : "command", "ns" : "acme.products", "command" : { "createIndexes" : "products", "indexes" : [ { "key" : { "airt" : 1 }, "name" : "airt_1" } ], "lsid" : { "id" :UUID("dd9672f8-4f56-47ce-8ceb-31caf5e8baf8") }, "$clusterTime" : { "clusterTime" : Timestamp(1687801980, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } }, "$db" : "acme" }
Logs from Primary node:
{"t":{"$date":"2023-06-26T17:54:21.419+00:00"},"s":"I", "c":"STORAGE", "id":3856203, "ctx":"IndexBuildsCoordinatorMongod-1","msg":"Index build: waiting for next action before completing final phase","attr":{"buildUUID":{"uuid":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"}}}}
Logs from up-and-running Secondary node:
{"t":{"$date":"2023-06-26T17:54:21.424+00:00"},"s":"I", "c":"STORAGE", "id":3856203, "ctx":"IndexBuildsCoordinatorMongod-1","msg":"Index build: waiting for next action before completing final phase","attr":{"buildUUID":{"uuid":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"}}}}
You can see above that when one node is down, and the index is created with default commitQuorum, the index command will keep running till that third data-bearing voting node comes up. Now we can check if the index is created on Primary or not:
rs1:PRIMARY> db.products.getIndexes() [ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_" }, { "v" : 2, "key" : { "airt" : 1 }, "name" : "airt_1" } ] rs1:PRIMARY>
We can see the index is created, but you will not be able to use the above index as the index is not marked as completed.
Below is the explain plan of a query, where we can see the query is doing COLLSCAN instead of IXSCAN:
rs1:PRIMARY> db.products.find({"airt" : 1.9869362536440427}).explain() { "queryPlanner" : { "plannerVersion" : 1, "namespace" : "acme.products", "indexFilterSet" : false, "parsedQuery" : { "airt" : { "$eq" : 1.9869362536440427 } }, "queryHash" : "65E2F79D", "planCacheKey" : "AA490985", "winningPlan" : { "stage" : "COLLSCAN", "filter" : { "airt" : { "$eq" : 1.9869362536440427 } }, "direction" : "forward" }, "rejectedPlans" : [ ] }, "serverInfo" : { "host" : "ip-172-31-82-235.ec2.internal", "port" : 27017, "version" : "4.4.22-21", "gitVersion" : "be7a5f4a1000bed8cf1d1feb80a20664d51503ce" }
Now I will bring up the third node, and we will see that index op will complete.
Index status:
rs1:PRIMARY> db.products.createIndex({ "airt" : 1 }) { "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "commitQuorum" : "votingMembers", "ok" : 1, "$clusterTime" : { "clusterTime" : Timestamp(1687806737, 3), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } }, "operationTime" : Timestamp(1687806737, 3) } rs1:PRIMARY>
Now will run the same query, and we can see index (IXSCAN) is getting used as the index was created successfully above:
rs1:PRIMARY> db.products.find({"airt" : 1.9869362536440427}).explain() { "queryPlanner" : { "plannerVersion" : 1, "namespace" : "acme.products", "indexFilterSet" : false, "parsedQuery" : { "airt" : { "$eq" : 1.9869362536440427 } }, "queryHash" : "65E2F79D", "planCacheKey" : "AA490985", "winningPlan" : { "stage" : "FETCH", "inputStage" : { "stage" : "IXSCAN", "keyPattern" : { "airt" : 1 }, "indexName" : "airt_1", "isMultiKey" : false, "multiKeyPaths" : { "airt" : [ ] }, "isUnique" : false, "isSparse" : false, "isPartial" : false, "indexVersion" : 2, "direction" : "forward", "indexBounds" : { "airt" : [ "[1.986936253644043, 1.986936253644043]" ] } } }, "rejectedPlans" : [ ] }, "serverInfo" : { "host" : "ip-172-31-82-235.ec2.internal", "port" : 27017, "version" : "4.4.22-21", "gitVersion" : "be7a5f4a1000bed8cf1d1feb80a20664d51503ce" }
Primary logs once the third node came up and the index was created successfully:
{"t":{"$date":"2023-06-26T19:12:17.450+00:00"},"s":"I", "c":"STORAGE", "id":3856201, "ctx":"conn40","msg":"Index build: commit quorum satisfied","attr":{"indexBuildEntry":{"_id":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"},"collectionUUID":{"$uuid":"a963b7e7-1054-4a5f-a935-a5be8995cff0"},"commitQuorum":"votingMembers","indexNames":["airt_1"],"commitReadyMembers":["127.0.0.1:27017","localhost:27018","localhost:27019"]}}} {"t":{"$date":"2023-06-26T19:12:17.450+00:00"},"s":"I", "c":"STORAGE", "id":3856204, "ctx":"IndexBuildsCoordinatorMongod-1","msg":"Index build: received signal","attr":{"buildUUID":{"uuid":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"}},"action":"Commit quorum Satisfied"}} {"t":{"$date":"2023-06-26T19:12:17.451+00:00"},"s":"I", "c":"INDEX", "id":20345, "ctx":"IndexBuildsCoordinatorMongod-1","msg":"Index build: done building","attr":{"buildUUID":{"uuid":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"}},"namespace":"acme.products","index":"airt_1","commitTimestamp":{"$timestamp":{"t":1687806737,"i":2}}}} {"t":{"$date":"2023-06-26T19:12:17.452+00:00"},"s":"I", "c":"STORAGE", "id":20663, "ctx":"IndexBuildsCoordinatorMongod-1","msg":"Index build: completed successfully","attr":{"buildUUID":{"uuid":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"}},"namespace":"acme.products","uuid":{"uuid":{"$uuid":"a963b7e7-1054-4a5f-a935-a5be8995cff0"}},"indexesBuilt":1,"numIndexesBefore":1,"numIndexesAfter":2}} {"t":{"$date":"2023-06-26T19:12:17.554+00:00"},"s":"I", "c":"INDEX", "id":20447, "ctx":"conn34","msg":"Index build: completed","attr":{"buildUUID":{"uuid":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"}}}} {"t":{"$date":"2023-06-26T19:12:17.554+00:00"},"s":"I", "c":"COMMAND", "id":51803, "ctx":"conn34","msg":"Slow query","attr":{"type":"command","ns":"acme.products","appName":"MongoDB Shell","command":{"createIndexes":"products","indexes":[{"key":{"airt":1.0},"name":"airt_1"}],"lsid":{"id":{"$uuid":"dd9672f8-4f56-47ce-8ceb-31caf5e8baf8"}},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1687801980,"i":1}},"signature":{"hash":{"$binary":{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"0"}},"keyId":0}},"$db":"acme"},"numYields":0,"reslen":271,"locks":{"ParallelBatchWriterMode":{"acquireCount":{"r":3}},"FeatureCompatibilityVersion":{"acquireCount":{"r":1,"w":4}},"ReplicationStateTransition":{"acquireCount":{"w":5}},"Global":{"acquireCount":{"r":1,"w":4}},"Database":{"acquireCount":{"w":3}},"Collection":{"acquireCount":{"r":1,"w":1,"W":1}},"Mutex":{"acquireCount":{"r":3}}},"flowControl":{"acquireCount":3,"timeAcquiringMicros":7},"storage":{"data":{"bytesRead":98257,"timeReadingMicros":3489}},"protocol":"op_msg","durationMillis":4678530}}
Above, you can see how much time it took to complete the index build; the op was running till the third node was down.
- When one secondary is down, and the index is created with commitQuorum as the majority, below are the details from the Primary and the Secondary nodes. Status of nodes:
rs1:PRIMARY> rs.status().members.forEach(function (d) {print(d.name) + " " + print(d.stateStr)}); 127.0.0.1:27017 PRIMARY localhost:27018 SECONDARY localhost:27019 (not reachable/healthy) rs1:PRIMARY>
Index command:
rs1:PRIMARY> db.products.createIndex({ "airt" : 1 }, { }, "majority") { "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "commitQuorum" : "majority", "ok" : 1, "$clusterTime" : { "clusterTime" : Timestamp(1687808148, 4), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } }, "operationTime" : Timestamp(1687808148, 4) } rs1:PRIMARY>
Logs from Primary node:
{"t":{"$date":"2023-06-26T19:35:48.821+00:00"},"s":"I", "c":"STORAGE", "id":3856201, "ctx":"conn7","msg":"Index build: commit quorum satisfied","attr":{"indexBuildEntry":{"_id":{"$uuid":"5f8f75ee-aa46-42a6-b4c2-59a68fea47a7"},"collectionUUID":{"$uuid":"a963b7e7-1054-4a5f-a935-a5be8995cff0"},"commitQuorum":"majority","indexNames":["airt_1"],"commitReadyMembers":["127.0.0.1:27017","localhost:27018"]}}} {"t":{"$date":"2023-06-26T19:35:48.821+00:00"},"s":"I", "c":"STORAGE", "id":3856204, "ctx":"IndexBuildsCoordinatorMongod-3","msg":"Index build: received signal","attr":{"buildUUID":{"uuid":{"$uuid":"5f8f75ee-aa46-42a6-b4c2-59a68fea47a7"}},"action":"Commit quorum Satisfied"}} {"t":{"$date":"2023-06-26T19:35:48.822+00:00"},"s":"I", "c":"INDEX", "id":20345, "ctx":"IndexBuildsCoordinatorMongod-3","msg":"Index build: done building","attr":{"buildUUID":{"uuid":{"$uuid":"5f8f75ee-aa46-42a6-b4c2-59a68fea47a7"}},"namespace":"acme.products","index":"airt_1","commitTimestamp":{"$timestamp":{"t":1687808148,"i":3}}}} {"t":{"$date":"2023-06-26T19:35:48.824+00:00"},"s":"I", "c":"STORAGE", "id":20663, "ctx":"IndexBuildsCoordinatorMongod-3","msg":"Index build: completed successfully","attr":{"buildUUID":{"uuid":{"$uuid":"5f8f75ee-aa46-42a6-b4c2-59a68fea47a7"}},"namespace":"acme.products","uuid":{"uuid":{"$uuid":"a963b7e7-1054-4a5f-a935-a5be8995cff0"}},"indexesBuilt":1,"numIndexesBefore":1,"numIndexesAfter":2}} {"t":{"$date":"2023-06-26T19:35:48.923+00:00"},"s":"I", "c":"INDEX", "id":20447, "ctx":"conn34","msg":"Index build: completed","attr":{"buildUUID":{"uuid":{"$uuid":"5f8f75ee-aa46-42a6-b4c2-59a68fea47a7"}}}} {"t":{"$date":"2023-06-26T19:35:48.923+00:00"},"s":"I", "c":"COMMAND", "id":51803, "ctx":"conn34","msg":"Slow query","attr":{"type":"command","ns":"acme.products","appName":"MongoDB Shell","command":{"createIndexes":"products","indexes":[{"key":{"airt":1.0},"name":"airt_1"}],"commitQuorum":"majority","lsid":{"id":{"$uuid":"dd9672f8-4f56-47ce-8ceb-31caf5e8baf8"}},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1687808123,"i":1}},"signature":{"hash":{"$binary":{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"0"}},"keyId":0}},"$db":"acme"},"numYields":0,"reslen":266,"locks":{"ParallelBatchWriterMode":{"acquireCount":{"r":3}},"FeatureCompatibilityVersion":{"acquireCount":{"r":1,"w":4}},"ReplicationStateTransition":{"acquireCount":{"w":5}},"Global":{"acquireCount":{"r":1,"w":4}},"Database":{"acquireCount":{"w":3}},"Collection":{"acquireCount":{"r":1,"w":1,"W":1}},"Mutex":{"acquireCount":{"r":3}}},"flowControl":{"acquireCount":3,"timeAcquiringMicros":7},"storage":{},"protocol":"op_msg","durationMillis":2469}}
Above, we can see when one node is down, and we used commitQuorum as majority while creating the index, index op got completed as per expected behavior as two voting (majority) nodes were up and running.
So far, we have discussed how to use commitQuorum and when to use it. Now we will see a scenario when one node (voting) is down for any reason, and someone created an index with default commitQuorum. The op will keep running, and you want to kill the op.
I created the index with the default commitQuorum when one node is down.
Status of nodes:
rs1:PRIMARY> rs.status().members.forEach(function (d) {print(d.name) + " " + print(d.stateStr)}); 127.0.0.1:27017 PRIMARY localhost:27018 SECONDARY localhost:27019 (not reachable/healthy) rs1:PRIMARY>
CurrentOp:
"active" : true, "currentOpTime" : "2023-06-26T21:27:41.304+00:00", "opid" : 536535, "lsid" : { "id" : UUID("dd9672f8-4f56-47ce-8ceb-31caf5e8baf8"), "uid" : BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=") }, "secs_running" : NumberLong(264), "microsecs_running" : NumberLong(264345444), "op" : "command", "ns" : "acme.products", "command" : { "createIndexes" : "products", "indexes" : [ { "key" : { "airt" : 1 }, "name" : "airt_1" } ], "lsid" : { "id" : UUID("dd9672f8-4f56-47ce-8ceb-31caf5e8baf8") }, "$clusterTime" : { "clusterTime" : Timestamp(1687814589, 2), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } }, "$db" : "acme" }
Now you need to kill the above opid to release the above op:
rs1:PRIMARY> db.killOp(536535) { "info" : "attempting to kill op", "ok" : 1, "$clusterTime" : { "clusterTime" : Timestamp(1687815189, 2), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } }, "operationTime" : Timestamp(1687815189, 2) } rs1:PRIMARY> rs1:PRIMARY> db.products.createIndex({ "airt" : 1 }) { "operationTime" : Timestamp(1687815192, 2), "ok" : 0, "errmsg" : "operation was interrupted", "code" : 11601, "codeName" : "Interrupted", "$clusterTime" : { "clusterTime" : Timestamp(1687815192, 2), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } } } rs1:PRIMARY> rs1:PRIMARY> db.products.getIndexes() [ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_" } ] rs1:PRIMARY>
Above, we can see when we killed the op, and the index creation op got killed.
Conclusion
We have seen how commitQuorum works while creating indexes from PSMDB 4.4. Still, the best practice is to create indexes in a rolling manner.
We recommend checking out our products for Percona Server for MongoDB, Percona Backup for MongoDB, and Percona Operator for MongoDB. We also recommend checking out our blog MongoDB: Why Pay for Enterprise When Open Source Has You Covered?
15
2023
MongoDB 4.2 EOL… And Its Implications
Enjoy it while it lasts, as everything has its end.
It sounded a bit more cryptic than it was planned, but I hope that it gets the attention it needs, as it’s important to know that MongoDB 4.2 in April has reached its End of Life (EOL), and more versions are soon going to be decommissioned as well.
What does that mean for me?
If you are a user of MongoDB 4.2, whether the MongoDB Inc. version or Percona Server for MongoDB one, your database will no longer receive bug fixes, patches, or minor releases.
As defined in our Lifecycle Policy:
We will provide Operational Support for customers running EOL software on active platforms. For EOLed platforms, we provide Community Support.
And as stated in our Lifecycle Overview:
For software that Percona bases on an upstream build, we match the upstream EOL dates.
Our MongoDB Server follows the upstream EOL dates. This means that bug fixes and software builds will no longer be generated also for our release of MongoDB.
Also, with the Percona Server for MongoDB 4.2 reaching its end of life, the implications are as follows:
- Percona Distribution for MongoDB 4.2 will no longer receive updates and bug fixes
- Percona Backup for MongoDB (PBM) will no longer support 4.2 either. That means that testing with 4.2 has ceased, and while PBM may still successfully perform backups and restores, we cannot guarantee it anymore.
That being said, rest assured, you will not be left alone. Those that have or would like to sign up for a Percona Support Subscription will continue to receive operational support and services. Operational support includes but is not limited to:
- Query optimization
- MongoDB tuning (replica sets and sharded clusters)
- MongoDB configuration, including our enterprise features such as LDAP
- Upgrade support (from EOL versions, so, i.e., 3.6->4.0->4.2->…)
- Setup and configuration of MongoDB clusters and tools such as Percona Backup for MongoDB and Percona Monitoring and Management (respecting the tool limitation for the EOL-ed version).
- In case of crashes, although we do not report bugs, we can still track down known bugs and provide recommendations.
Still have questions about the 4.2 EOL?
In her recent blog post, MongoDB V4.2 EOL Is Coming: How To Upgrade Now and Watch Out for the Gotchas!, our MongoDB Tech Lead, Kimberly Wilkins, has covered the ins and outs of a MongoDB upgrade.
She has also hosted a webinar on the MongoDB 4.2 EOL common questions and challenges.
If you are our customer, please create a ticket for more assistance. Remember also that our Percona Community Forum is always open for any users of our software, as we believe that community is very important in building our products!
What’s next
I do not want to be the bearer of bad news, but we have seen the great popularity of MongoDB 4.2 and 4.4. If you are on 4.2 right now, it makes all the difference to move away from it ASAP. This version has just become a possible threat to your security.
As you see, 4.4 was mentioned as well. That’s right, this highly popular, last version in the 4.x family is scheduled to be EOL in February 2024. That’s less than one year to make preparations for upgrading.
MongoDB EOL for the upcoming year or so.
While at it, notice that 5.0 is planned to be EOL in October 2024 as well, so next year, it’s worth considering upgrading to 6.0 to have at least till 2025 for the next EOL.
MongoDB 6.0 still has two years of life now.
If such an upgrade seems challenging and you want some help or at least advice around it, consider some of our premium services from MongoDB experts that can help you with migration by:
- Support – Answering any operational questions
- Managed Services – Playing the role of the remote DBA that handles all maintenance (including upgrades) for you
- Consulting – Professionals that can come in and advise or even do the upgrade for you at any time
- Training – So that your team can feel more comfortable with running the upgrades
Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.
28
2023
Add More Security to Your Percona Server for MongoDB With AWS IAM integration!
Did you notice that Percona Server for MongoDB 6.0.5-4 was released just a few days ago? This time around, it introduced improvements to the way we handle master key rotation for data at rest encryption as well as AWS IAM integration.
One key to rule them all — improvements to master key rotation
With the improvements introduced in Percona Server for MongoDB 6.0.5-4, one key path can be used for all servers in a clustered environment. This allows us to use one vault key namespace for all nodes in a deployment while at the same time preserving key versions and allowing each node to perform key rotation without impact to the other nodes.
Changes introduced with Percona Server for MongoDB 6.0.5-4 now allow using the same key for all the members of a replica set if the user chooses so, without impact on functionality.
Why should you care about AWS IAM integration?
With all the systems users need to access daily, password management becomes a more pressing issue. The introduction of IAM systems to an enterprise has become somewhat of a security standard in large enterprises.
Our users approached us about integration with AWS IAM, commonly used in their organizations. It’s an integration missing from MongoDB Community Edition (CE) that is important for compliance with enterprise security policies of many companies. Integration with AWS IAM allows:
- Passwordless authentication to your database
- Thanks to enterprise identity federation
- Fine-grained permissions
- More scalable, centralized access control
- Verify the permissions in fewer places
- Have access to preventive measures like data perimeter
To set up this integration, follow the steps outlined in our documentation, and configure either the user or the role authentication. This will allow AWS Security Token Service (STS) to play the part of Identity Provider (IDP) in a SAML 2.0-based federation.
Your feedback matters!
We take pride in being open to feedback in Percona. Please do not hesitate to contact us via the community forums or this contact form.
What’s next
We are looking into the problems affecting large size datastores that are a pain to our users. Please let us know if there are any particular issues you are struggling with in MongoDB; we are always open to suggestions!
13
2023
MongoDB V4.2 EOL Is Coming: How To Upgrade Now and Watch Out for the Gotchas!
MongoDB v4.2 was released in August 2019. And like previous versions, there is a time to go — and that End of Life (EOL) date approaches.
MongoDB v4.2 EOL is set for April 30, 2023. Since Percona Server for MongoDB (PSMDB) is a fully compatible replacement for the upstream MongoDB Community Edition, we try to follow the same EOL date as the upstream vendor. So let’s discuss why you should upgrade MongoDB, what the upgrade process looks like, and some of the potential gotchas that you may run into along the way.
First, consider why you even need to upgrade MongoDB.
You may think that everything is fine, your application is humming along, you have had no major problems with this version, and you don’t really see a reason to upgrade. And that’s exactly why you want to upgrade; you want to keep it that way. There are multiple very good reasons to upgrade MongoDB.
Those reasons include the following:
- Security, stability, compliance
- Bug fixes
- New features, enhancements
- 3rd party application requirements, code changes
The two most important reasons to upgrade MongoDB are around security and bug fixes.
The goal is always to ensure that your PSMDB/MongoDB environment stays as stable, secure, and performant as possible.
According to Percona’s Software Release Lifecycle Overview, after a version has gone EOL, no new builds or updates will be provided. We will continue to offer Operation Support on Supported Platforms.
Don’t leave your database vulnerable because this version has gone on to the database software wasteland. Plan and execute your upgrade ASAP.
With new versions come new features that may be beneficial to your application. And some just like shiny new things. Some of the major new features that are available in more recent versions of MongoDB are Time Series Collections and Live Resharding in v5.0.
Between v4.2 and v6.0, there are multiple enhancements to take advantage of.
Those include more resilient chunk migrations and the ability to at least refine your shard keys in v4.4, the above-mentioned v5.0 new features, and many changes under the hood to support those major new features. Then with v6.0, you get sync improvements, mandatory use of the new mongosh shell, many enhancements and tweaks to all of the major changes that v5.0 brought about, and a new default chunk size of 128M to help handle some of the recent changes to the auto-splitter, chunk migration, and balancer processes.
But there are quite a few hops between v4.2 and v6.0. So let’s debunk some of the reasons that you may have for NOT upgrading.
Those main potential blockers could include:
- Lack of resources or staff
- Lack of time/can’t afford the downtime
- Legacy code
- Outdated drivers
- Stack incompatibility
But really, the danger of not having support or bug fixes for your business-critical databases and applications is generally a very good reason to overcome all of those blockers sooner rather than later.
Let’s look at the steps required to upgrade MongoDB and the different types of environments that you may have running.
The types of systems include:
- Standalone
- Replica Sets
- Sharded Clusters
Those systems or architectures look like the below:
Now look at the components that make up those systems.
That gives us an idea of what we will need to upgrade.
But we also have to consider the built-in Replication that MongoDB has. That, too, will impact our upgrade process.
Basic overall MongoDB upgrade steps
- Take a backup
- Check prerequisites, resolve as needed
-
- Add resources if needed
- Download new desired binaries
- Keep FCV (Feature Compatibility Value) set to Current/Previous version
- Shut down mongo processes in the correct order; rolling fashion according to system type
- Replace current binaries with new binaries
- Start up mongo processes in the correct order; rolling fashion as required by system type
- Wait for an amount of time to ensure no problems
- Set FCV to the new desired version
The Pre-Req Checks generally look like this:
Once those are done, begin the upgrade for the standalone system type.
Upgrade a standalone system
It’s a single host or node with one process, so do the following:
- Take a backup
- Shut down the mongod process
- Replace the current binary with the new binary
- Keep FCV at the previous version until everything checks out after the upgrade
- Restart the mongodb process using the original port (default 27017)
- Check version, data, logs, etc.
- When all is good, update FCV
Now let’s look at the steps for the other two system types.
First, the Replica Set upgrade steps:
*Note* that there is a warning there.
You ALWAYS want to upgrade MongoDB to the latest version available – well, after checking the bug list, of course. We’ll review that warning when discussing the “Gotchas” later.
For Replica Set upgrades – upgrade binaries in a rolling manner. Start with the secondaries, then force the election of a new PRIMARY, and then upgrade the binary for the former PRIMARY.
Basic Replica Set upgrade steps
Upgrade Secondaries in a Rolling Method – SECONDARY 1, SECONDARY 2, force election to new PRIMARY, upgrade old PRIMARY/new SECONDARY.
Step 1 – Upgrade SECONDARY 1
– Shutdown Secondary 1
– Take Secondary 1 out of the replica set by restarting it with another port number (ex. port 3333)
– Change Binaries to new version
– Start Secondary 1 back up with its original replica set port number (ex. 27017)
Step 2 – Upgrade SECONDARY 2 – repeat that process with Secondary 2
Shutdown Secondary 2
– Take Secondary 2 out of the replica set by restarting it with another port number (ex. port 3333)
– Change Binaries to new version
– Start Secondary 2 back up with its original replica set port number (ex. 27017)
Step 3 – Upgrade the current PRIMARY
Step Down the current Primary – force election of a new PRIMARY – make sure state is good
Upgrade old PRIMARY, now new Secondary
Shut down old PRIMARY, now new Secondary 3
– Take new Secondary 3 out of the replica set by restarting it with another port number (ex. port 3333)
– Change Binaries to new version
– Start new Secondary 3 backup with its original replica set port number (ex. 27017)
Wait, check the data and logs. If there are no problems, update FCV to the new version
Done.
Now move on to the Sharded Cluster Upgrade process, where there are more components to consider and upgrade.
Reminder – those components are:
*Note* When Upgrading a Sharded Cluster, the order in which you upgrade the components matters.
You have the additional components (the balancer, config server replica set, the main data bearing mongod shard nodes, and the query router process mongoS’s.)
Order Matters … with this system type upgrade.
Stop Balancer
- Upgrade Config Servers
- Upgrade Shard nodes – MongoD’s
- Upgrade the MongoS’s
Start Balancer
Below is an infogram to show that order when upgrading a sharded cluster:
Downgrades
If, for whatever reason, you run into any problem, you can perform a downgrade.
Downgrades basically go in reverse order – more applicable to sharded clusters.
Stop Balancer
- Downgrade MongoS’s
- Downgrade Shard Nodes – MongoD’s
- Downgrade the Config Servers
ReStart Balancer
* This happens all in a rolling manner at the replset level again.
That covers the basic steps.
I will do a more technical series of blogs in the coming months covering running MongoDB in Kubernetes and using the Percona Operator for MongoDB. Those blogs will contain commands, example results, etc., used when managing in that containerized environment. One of the planned blogs will cover upgrades to that environment.
For now, you can see the actual commands used when upgrading a Percona Server for MongoDB replica set or a sharded cluster in Sudhir’s blog.
Now let’s take a look at some of the potential gotchas.
Gotchas — AKA things to watch out for
Whenever you are changing versions via an upgrade, there are different groups of things to watch out for or be aware of.
Those general buckets are:
- Compatibility issues – whether programming language or driver related
- Deprecated functions or tools
- Behavior changes – ex TTL behavior changed significantly between v4.4 and v6
- Bugs – newly discovered or already reported but not fixed yet
Below are some specific examples for those buckets.
Deprecated – Simple Network Management Protocol (SNMP) v6.0
Starting in MongoDB 6.0, SNMP is deprecated and will be removed in the next release.
Deprecated – old mongo shell “mongo” in v5.0
– Some legacy methods unavailable or replaced …
– mongosh uses the same syntax as old mongo shell
– Beware –check your code
Behavior changes
v4.2 –Faster stepdowns
Autosplitter process moved from mongos to the PRIMARY member of the replica set. This led to knowing more truth about chunk balance, so more chunk splits and more chunk migrations. Many more. It caused added write pressure.
v4.4 –Adjustments to the management of jumbo chunks – no longer getting stuck forever due to memory usage limit. Started adding in some of the changes that would be needed for v5.0 when Live Resharding would be hitting.
v5.0 – Deprecated old shell; major changes to the WiredTiger engine and Core Server to support Live ReSharding and Time Series Collections. It took a while for this major release to be fully baked. 😉
v6.0 – Default chunk size increased to 128M to help against too frequent chunk splitting and chunk moves by the Balancer. Removed the old shell. Changes to Initial sync and resync. New Operators; New Functions.
Those are just some of the changes.
More about some of the negative impacts.
Things to watch out for – bugs!
Along with all of the major changes that went into v5.0, there were many bugs for quite a while. Making this even more impactful – the v5.0 changes were backported in v4.4.
So early on for v5.0.x, the same bugs also broke versions of 4.4 from 4.4.1 up through v4.4.8 but really up into v4.9 and v4.10 if you look at the JIRA tickets closely.
Compatibility for Supported Life Cycle and OS Host Software – ex. MongoDB v5.0.x Community Edition drops support for RHEL/CentOS/Oracle 6 on x86_64 and other OS support changes
Here are some screenshots that I took along the way during the many releases for the version.
Make sure to use the latest minor release version when upgrading MongoDB
For v4.4
These various bugs last pretty much ALL the way Through 4.4.8 – Serious bugs that cause checkpoint issues, possible data inconsistencies, missing documents/data loss, duplicate unique key, problems restarting, omitting a page or pages of data, unclean restarts, data loss, etc.
Examples: SERVER-61483 SERVER-61633 SERVER-61945 SERVER-61950 – problems restarting nodes WT-8104 WT-8204 – race conditions, memory leaks. WT-8395 – upgrade related –data in an “inconsistent state”, missing documents WT-8534 WT-8551
Performance impact? There are also a few postings online talking about the newer versions possibly having a negative impact on performance.
There was a recently reported slowdown due to the new mongosh shell and a bug there, but seems to be with certain combinations and using a ruby driver …? Below are links for that issue and bug:
- https://github.com/bitnami/charts/issues/10316#issuecomment-1131718880
- https://jira.mongodb.org/browse/MONGOSH-1240
Ok, enough of that. What should we do?
Upgrade of course!
Slow and Steady wins the race!
And what else?
Migrations!
Consider moving over to Percona Server for MongoDB
That migration — it’s really just a lift and shift.
Normally a binary change between PSMDB and MGDB CE will take care of it. No need for data type changes, and no loss of triggers and procedures to deal with.
Remember, tick-tock … Plan your MongoDB upgrade and migrations today.
Thanks all!