Aug
09
2024
--

Improve the Security of a Percona Server for MongoDB Deployment with KMIP Key State Polling

Data-at-rest encryption (also known as transparent data encryption or TDE) is a necessary mechanism for ensuring the security of a DBMS deployment. Upcoming releases of Percona Server for MongoDB extend that mechanism with the KMIP key state polling feature. In this technical post, I will describe how the feature works and how it helps reduce […]

Aug
05
2024
--

How to Upgrade MongoDB Using Backups Through Many Major Versions

Companies use specific database versions because they’re proven performers or because it’s hard to keep up with frequent releases. But lagging behind has some major issues. When it’s time to upgrade, is it better to update binaries through each major revision or skip versions? TL;DR: Upgrading a MongoDB cluster using backups and skipping versions is […]

Jul
17
2024
--

Using Compact in Percona Server for MongoDB From Version 4.4+

In the previously posted blog, Compaction in Percona Server for MongoDB (PSMDB), we discussed how compact works before version 4.4. In this blog, we will see how compact works on PSMDB 6.0. I recommend reading the blog post linked above to understand what compact does, how to check dataSize, and how much space we can […]

Oct
11
2023
--

Migrate Data From Atlas to Self-Hosted MongoDB

In this blog post, we will discuss how to migrate data from MongoDB Atlas to self-hosted MongoDB. There are a couple of third-party tools on the market to migrate data from Atlas to Percona Server for MongoDB (PSMDB), like MongoPush, Hummingbird, and MongoShake. Today, we are going to discuss how to use MongoShake to migrate and sync data from Atlas to PSMDB.

NOTE: These tools are not officially supported by Percona.

MongoShake is a powerful tool that facilitates the migration of data from one MongoDB cluster to another. These are step-by-step instructions on how to install and utilize MongoShake for data migration from Atlas to PSMDB. So, let’s get started!

Prerequisites:

A MongoDB Atlas account. I created a test account (replica set) and loaded sample data with one click in Atlas:

  1. Create an account in Atlas.
  2. Create a cluster.
  3. Once the cluster is created, go to Browse Collections.
  4. It will prompt you to load sample data. Once you click it, you will see the sample data like below.
    Atlas atlas-mhnnqy-shard-0 [primary] test> show dbs
    sample_airbnb        52.69 MiB
    sample_analytics      9.44 MiB
    sample_geospatial     1.23 MiB
    sample_guides        40.00 KiB
    sample_mflix        109.43 MiB
    sample_restaurants    6.42 MiB
    sample_supplies       1.05 MiB
    sample_training      46.77 MiB
    sample_weatherdata    2.59 MiB
    admin               336.00 KiB
    local                20.35 GiB
    Atlas atlas-mhnnqy-shard-0 [primary] test>

An EC2 instance with PSMDB installed. I installed PSMDB on the EC2 machine:

rs0 [direct: primary] test> show dbs
admin   40.00 KiB
config  12.00 KiB
local   40.00 KiB
rs0 [direct: primary] test>

Make sure Atlas and PSMDB are both on the same major version (I have also used this tool on MongoDB 4.2, which is already EOL).

PSMDB version:

rs0 [direct: primary] test> db.version()
6.0.9-7
rs0 [direct: primary] test>

MongoDB Atlas version:

Atlas atlas-mhnnqy-shard-0 [primary] test> db.version()
6.0.10
Atlas atlas-mhnnqy-shard-0 [primary] test>

To install MongoShake, follow these steps:

Step 1: Install Go
Ensure that Go is installed on your system. If not, download it from the official website and follow the installation instructions. I used Amazon Linux 2, so I used the command below to install Go:

sudo yum install golang -y

Step 2: Install MongoShake
Open the terminal and run the following command to install MongoShake:

git clone https://github.com/alibaba/MongoShake.git

  1. The clone creates a folder named MongoShake.
  2. cd MongoShake.
  3. Run the ./build.sh script.

Once you have installed MongoShake, you need to configure it for the migration process. Here’s how:

  1. The configuration file (collector.conf) is in the conf directory under the MongoShake directory.
  2. In the config file, you can edit the URI for either a replica set or a sharded cluster, as well as the tunnel (how you are migrating the data) method. If you are migrating directly, the value is direct. You can also edit the log file path and log file name. Below are some important parameters:
    mongo_urls = mongodb+srv://gautam:****@cluster0.teeeayh.mongodb.net/  // Atlas conn string
    tunnel.address = mongodb://127.0.0.1:27017 // PSMDB conn string
    sync_mode = all 				        // default incr
    log.dir = /home/percona/MongoShake/log/    // default /root/mongoshake/

    Other sync_mode options: all/full/incr.

  • all means full synchronization + incremental synchronization (copy the data, then apply the oplog after the copy completes).
  • full means full synchronization only (only copy the data).
  • incr means incremental synchronization only (only apply the oplog).

There are other parameters in the configuration file as well, which you can tune as per your needs. For example, if you want to read data from a Secondary node and not overwhelm the Primary with reads, you can set the parameter below:

mongo_connect_mode = secondaryPreferred
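Putting these together, a minimal collector.conf for a direct Atlas-to-PSMDB migration could look like the sketch below. This is an illustration, not a complete file: the connection strings are placeholders, and any key not shown keeps its default from the sample configuration shipped with MongoShake.

```ini
mongo_urls = mongodb+srv://user:****@cluster0.example.mongodb.net/
mongo_connect_mode = secondaryPreferred
tunnel = direct
tunnel.address = mongodb://127.0.0.1:27017
sync_mode = all
log.dir = /home/percona/MongoShake/log/
```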

Step 3: Once you are done with the configuration, run MongoShake in a screen session like the one below:

./bin/collector.linux -conf=conf/collector.conf -verbose 0

Step 4: Monitor the log file in the log directory to check the progress of migration.

Below is the sample log when you start MongoShake:

[2023/09/25 21:09:13 UTC] [INFO] New session to mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/ successfully
[2023/09/25 21:09:13 UTC] [INFO] Close client with mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/
[2023/09/25 21:09:13 UTC] [INFO] New session to mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/ successfully
[2023/09/25 21:09:19 UTC] [INFO] Close client with mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/
[2023/09/25 21:09:19 UTC] [INFO] GetAllTimestamp biggestNew:{1695675385 26}, smallestNew:{1695675385 26}, biggestOld:{1695668185 9}, smallestOld:{1695668185 9}, MongoSource:[url[mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/], name[atlas-mhnnqy-shard-0]], tsMap:map[atlas-mhnnqy-shard-0:{7282839399442677769 7282870323207208986}]
[2023/09/25 21:09:19 UTC] [INFO] all node timestamp map: map[atlas-mhnnqy-shard-0:{7282839399442677769 7282870323207208986}] CheckpointStartPosition:{1 0}
[2023/09/25 21:09:19 UTC] [INFO] New session to mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/ successfully
[2023/09/25 21:09:19 UTC] [INFO] atlas-mhnnqy-shard-0 Regenerate checkpoint but won't persist. content: {"name":"atlas-mhnnqy-shard-0","ckpt":1,"version":2,"fetch_method":"","oplog_disk_queue":"","oplog_disk_queue_apply_finish_ts":1}
[2023/09/25 21:09:19 UTC] [INFO] atlas-mhnnqy-shard-0 checkpoint using mongod/replica_set: {"name":"atlas-mhnnqy-shard-0","ckpt":1,"version":2,"fetch_method":"","oplog_disk_queue":"","oplog_disk_queue_apply_finish_ts":1}, ckptRemote set? [false]
[2023/09/25 21:09:19 UTC] [INFO] atlas-mhnnqy-shard-0 syncModeAll[true] ts.Oldest[7282839399442677769], confTsMongoTs[4294967296]
[2023/09/25 21:09:19 UTC] [INFO] start running with mode[all], fullBeginTs[7282870323207208986[1695675385, 26]]
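The long bracketed numbers in these lines (e.g., fullBeginTs[7282870323207208986[1695675385, 26]]) are MongoDB oplog timestamps: a 64-bit value carrying the Unix seconds in the high 32 bits and an increment counter in the low 32 bits. A quick sketch (plain JavaScript, not part of MongoShake) to decode them when reading the logs:

```javascript
// Decode a packed MongoDB oplog timestamp into [unixSeconds, increment].
// BigInt is needed because the packed value exceeds Number.MAX_SAFE_INTEGER.
function decodeOplogTs(ts) {
  return [Number(ts >> 32n), Number(ts & 0xffffffffn)];
}

// fullBeginTs from the log above decodes back to its bracketed form:
console.log(decodeOplogTs(7282870323207208986n)); // [ 1695675385, 26 ]
```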

You will see the log below once the full sync is completed and incremental sync (incr) starts, meaning it will begin syncing live data via the oplog:

[2023/09/25 22:12:04 UTC] [INFO] GetAllTimestamp biggestNew:{1695679924 3}, smallestNew:{1695679924 3}, biggestOld:{1695677613 1}, smallestOld:{1695677613 1}, MongoSource:[url[mongodb+srv://gautam:***@cluster0.teeeayh.mongodb.net/], name[atlas-mhnnqy-shard-0]], tsMap:map[atlas-mhnnqy-shard-0::{7282879892394344449 7282889818063765507}]
[2023/09/25 22:12:04 UTC] [INFO] ------------------------full sync done!------------------------
[2023/09/25 22:12:04 UTC] [INFO] oldestTs[7282879892394344449[1695677613, 1]] fullBeginTs[7282889689214746625[1695679894, 1]] fullFinishTs[7282889818063765507[1695679924, 3]]
[2023/09/25 22:12:04 UTC] [INFO] finish full sync, start incr sync with timestamp: fullBeginTs[7282889689214746625[1695679894, 1]], fullFinishTs[7282889818063765507[1695679924, 3]]
[2023/09/25 22:12:04 UTC] [INFO] start incr replication

You will see logs like this when both clusters are in sync (when the lag is 0, i.e., tps=0):

[2023/09/25 22:14:41 UTC] [INFO] [name=atlas-mhnnqy-shard-0, stage=incr, get=24, filter=24, write_success=0, tps=0, ckpt_times=0, lsn_ckpt={0[0, 0], 1970-01-01 00:00:00}, lsn_ack={0[0, 0], 1970-01-01 00:00:00}]]
[2023/09/25 22:14:46 UTC] [INFO] [name=atlas-mhnnqy-shard-0, stage=incr, get=24, filter=24, write_success=0, tps=0, ckpt_times=0, lsn_ckpt={0[0, 0], 1970-01-01 00:00:00}, lsn_ack={0[0, 0], 1970-01-01 00:00:00}]]
[2023/09/25 22:14:51 UTC] [INFO] [name=atlas-mhnnqy-shard-0, stage=incr, get=25, filter=25, write_success=0, tps=0, ckpt_times=0, lsn_ckpt={0[0, 0], 1970-01-01 00:00:00}, lsn_ack={0[0, 0], 1970-01-01 00:00:00}]]
[2023/09/25 22:14:56 UTC] [INFO] [name=atlas-mhnnqy-shard-0, stage=incr, get=25, filter=25, write_success=0, tps=0, ckpt_times=0, lsn_ckpt={0[0, 0], 1970-01-01 00:00:00}, lsn_ack={0[0, 0], 1970-01-01 00:00:00}]]

Once the full data replication process is complete and both clusters are in sync, you can stop pointing the application to Atlas. Check the MongoShake logs, and when the lag is 0, as we can see in the logs above, stop the replication/sync from Atlas by stopping MongoShake. Verify that the data has been successfully migrated to PSMDB. You can use the MongoDB shell or any other client to connect to the PSMDB instance and verify this.

MongoDB Atlas databases and their collection count:

Database: sample_airbnb
-----
Collection 'listingsAndReviews' documents: 5555

Database: sample_analytics
-----
Collection 'transactions' documents: 1746
Collection 'accounts' documents: 1746
Collection 'customers' documents: 500

Database: sample_geospatial
-----
Collection 'shipwrecks' documents: 11095

Database: sample_guides
-----
Collection 'planets' documents: 8

Database: sample_mflix
-----
Collection 'embedded_movies' documents: 3483
Collection 'users' documents: 185
Collection 'theaters' documents: 1564
Collection 'movies' documents: 21349
Collection 'comments' documents: 41079
Collection 'sessions' documents: 1

Database: sample_restaurants
-----
Collection 'neighborhoods' documents: 195
Collection 'restaurants' documents: 25359

Database: sample_supplies
-----
Collection 'sales' documents: 5000

Database: sample_training
-----
Collection 'posts' documents: 500
Collection 'trips' documents: 10000
Collection 'grades' documents: 100000
Collection 'routes' documents: 66985
Collection 'inspections' documents: 80047
Collection 'companies' documents: 9500
Collection 'zips' documents: 29470

Database: sample_weatherdata
-----
Collection 'data' documents: 10000


Atlas atlas-mhnnqy-shard-0 [primary] sample_weatherdata>


PSMDB databases and their collection count:

rs0 [direct: primary] test> show dbs
admin                80.00 KiB
config              240.00 KiB
local               468.00 KiB
mongoshake           56.00 KiB
sample_airbnb        52.20 MiB
sample_analytics      9.21 MiB
sample_geospatial   984.00 KiB
sample_guides        40.00 KiB
sample_mflix        108.17 MiB
sample_restaurants    5.57 MiB
sample_supplies     980.00 KiB
sample_training      40.50 MiB
sample_weatherdata    2.39 MiB
rs0 [direct: primary] test>

Database: sample_airbnb
-----
Collection 'listingsAndReviews' documents: 5555

Database: sample_analytics
-----
Collection 'transactions' documents: 1746
Collection 'accounts' documents: 1746
Collection 'customers' documents: 500

Database: sample_geospatial
-----
Collection 'shipwrecks' documents: 11095

Database: sample_guides
-----
Collection 'planets' documents: 8

Database: sample_mflix
-----
Collection 'embedded_movies' documents: 3483
Collection 'users' documents: 185
Collection 'theaters' documents: 1564
Collection 'movies' documents: 21349
Collection 'comments' documents: 41079
Collection 'sessions' documents: 1

Database: sample_restaurants
-----
Collection 'neighborhoods' documents: 195
Collection 'restaurants' documents: 25359

Database: sample_supplies
-----
Collection 'sales' documents: 5000

Database: sample_training
-----
Collection 'posts' documents: 500
Collection 'trips' documents: 10000
Collection 'grades' documents: 100000
Collection 'routes' documents: 66985
Collection 'inspections' documents: 80047
Collection 'companies' documents: 9500
Collection 'zips' documents: 29470

Database: sample_weatherdata
-----
Collection 'data' documents: 10000


rs0 [direct: primary] sample_weatherdata>
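Rather than eyeballing the two listings, the per-collection counts can be compared mechanically. Below is a small illustrative sketch; the counts are hardcoded from the sample_analytics listings above, but in practice you would collect them with countDocuments() on each side:

```javascript
// Compare per-collection document counts from the source (Atlas) and
// target (PSMDB) and return the names of any collections that differ.
function diffCounts(source, target) {
  return Object.keys(source).filter((coll) => source[coll] !== target[coll]);
}

// Counts copied from the sample_analytics listings above:
const atlas = { transactions: 1746, accounts: 1746, customers: 500 };
const psmdb = { transactions: 1746, accounts: 1746, customers: 500 };

console.log(diffCounts(atlas, psmdb)); // [] – empty means the counts match
```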

Above, you can see that we have verified the data in PSMDB. Now, update the connection string of the application to point to PSMDB.

NOTE: Sometimes, during the migration process, it is possible for some indexes not to replicate. So, during the data verification process, please verify the indexes, and if an index is missing, create it before the cutover.

Conclusion

MongoShake simplifies the process of migrating MongoDB data from Atlas to self-hosted MongoDB. Percona experts can assist you with migration as well. By following the steps outlined in this blog, you can seamlessly install, configure, and utilize MongoShake for migrating your data from MongoDB Atlas.

To learn more about the enterprise-grade features available in the license-free Percona Server for MongoDB, we recommend going through our blog MongoDB: Why Pay for Enterprise When Open Source Has You Covered? 

Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.

 

Download Percona Distribution for MongoDB Today!

Oct
10
2023
--

Percona Server for MongoDB 7 Is Now Available

Databases are different from a lot of software. For one, they often favor stability over innovation. This is not a general rule, but as databases are responsible for a core layer of any IT system – data storage and processing — they require reliability. This requirement does not always pair with the latest and greatest improvements that have not been hardened over time.

Even with that, the fact that MongoDB 5.0 is planned for EOL in October 2024 and MongoDB 6.0 is planned for EOL in July 2025 should put MongoDB 7.0 on your radar. Even if you are not considering all the interesting improvements that have been added by the development team from MongoDB, this new version is already very important from the database supportability and lifecycle planning perspective.

Why choose Percona Server for MongoDB?

Percona provides a drop-in replacement solution for MongoDB Community Edition that is based on the same upstream code delivered by MongoDB, Inc. The difference between Percona Server for MongoDB and MongoDB CE is that we strive to provide a gap-closing set of features for users who want to use MongoDB in production. These enterprise features include, but are not limited to:

  • Security improvements – Among them, KMIP and HashiCorp Vault integration.
  • Availability solutions – Advanced backups, including physical backups and point-in-time recovery, which are not available in MongoDB Community Edition.
  • Kubernetes Operator – An enterprise-grade Kubernetes operator to run your workloads in Kubernetes.
  • Percona Monitoring and Management (PMM) – A fully open source monitoring tool to help you run your databases (not limited to MongoDB).

Why release Percona Server for MongoDB 7 now?

The dev teams from MongoDB, Inc. delivering the upstream code do a great job and build a very solid tool. With that said, each new major version, by definition, introduces enough big changes to require a certain amount of precaution.

We explicitly delay the release of each major version of MongoDB server to take extra time to validate that all of our added functionality works well with the given version.

We also spend extra time to ensure that the quality of the release is good enough for our customers to start using. Think of us as the extra set of eyes, the extra layer of QA to ensure your safety passage to the next database version.

This time around, the first version we were able to release as a Release Candidate (RC) was 7.0.2, released as Percona Server for MongoDB RC 7.0.2-1. Expect the GA release to follow soon.

Important changes in MongoDB 7

One of the most eyebrow-raising changes that MongoDB 7.0 introduces is the limitation of the downgrade process. Reading the below would not make me feel at ease while performing an upgrade:

Binary downgrades are no longer supported for MongoDB Community Edition. (source)

followed by

Starting in MongoDB 7.0, you cannot downgrade your Enterprise deployment’s binary version without assistance from support. (source)

What this means is that there are some important changes coming with 7.0 that can also be very beneficial for you. What it also means is that, in case of any problems with your upgrade, as soon as you change the fCV to 7.0, the way back is closed, short of a time-consuming and operationally complicated logical restore or tailor-made solutions requiring a lot of experience.

It also means that binary downgrade may still be possible for MongoDB Enterprise customers; the documentation does not rule that option out.
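Because raising the fCV is the point of no return, it is worth being deliberate about the commands involved. The sketch below shows the admin command documents as plain objects (per the MongoDB manual, setFeatureCompatibilityVersion requires confirm: true as of 7.0); in mongosh you would pass each one to db.adminCommand():

```javascript
// Check the current feature compatibility version before doing anything:
const checkFcv = { getParameter: 1, featureCompatibilityVersion: 1 };

// Raising fCV to 7.0 closes the way back for Community Edition, so
// MongoDB 7.0 makes you acknowledge that explicitly with confirm: true:
const raiseFcv = { setFeatureCompatibilityVersion: "7.0", confirm: true };

console.log(checkFcv, raiseFcv);
```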

Percona offers upgrade support to get you safely through the upgrade process. We also provide Managed Services to take this stress off your shoulders so that our top-of-the-market experts handle your databases for you.

To check out the list of changes that MongoDB 7 introduces, check out the summary write-up in Percona Server for MongoDB 7.0.2-1 release or the full release notes from MongoDB Inc.

What’s coming next

We are working on improvements, previously available only to MongoDB Enterprise users, that will impact the scalability and availability of especially large datasets.

We also want to focus more on the security aspects not available outside of MongoDB Enterprise.

Our Operators Team is also working on improvements especially important for sharded clusters, and the Percona Monitoring and Management team is planning to look into more scalability and management-enhancing options.

Stay tuned for more news about MongoDB offerings.

 

Learn more about Percona Server for MongoDB

Jul
17
2023
--

An Enterprise-Grade MongoDB Alternative Without Licensing or Lock-in

MongoDB Community Edition software might set the stage for achieving your high-volume database goals, but you quickly learn that its features fall short of your enterprise needs.

So you look at MongoDB Enterprise software, but its costly and complex licensing structure does not meet your budget goals. You’re also not certain its features will always align with your evolving technology needs. What’s more, you’re wary of the expenses and restrictions of vendor lock-in.

Still, you don’t want to ditch the advantages of MongoDB Enterprise software. But you can’t absorb the negatives, either.

Don’t despair; there are alternatives. In this blog, we’ll examine the reasons why people would seek an alternative to MongoDB Enterprise, and we’ll identify some of the most popular NoSQL alternatives. Then, we’ll highlight some reasons why Percona Software for MongoDB might be the alternative you seek.

First, some stage-setting for this blog article.

The popularity of MongoDB

MongoDB has emerged as a popular database platform. It ranks No. 5 among all database management systems and No. 1 among non-relational/document-based systems (DB-Engines, July 2023).

Developers and DBAs like MongoDB’s ease-of-use. Instead of the table-based structure of relational databases, MongoDB stores data in documents and collections, a design for handling large amounts of unstructured data and for real-time web applications. DBAs and developers appreciate its combination of flexibility, scalability, and performance.

More specifically, DBAs and developers like that MongoDB stores data in JSON-like documents with optional schemas. It’s a good setup for real-time analytics and high-speed logging.

Taking a deeper dive, MongoDB is a good system for analyzing data because documents are easily distributed across multiple nodes and because of its indexing, query-on-demand, and real-time aggregation capabilities. MongoDB replica sets enable data redundancy and automatic failover, setting the stage (there’s that term again) for high availability. MongoDB also provides strong encryption and firewall security. MongoDB is well suited for working with content management systems and mobile apps.

And MongoDB is popular across industries. A survey of 90,240 companies using MongoDB listed the leading uses as Technology and Services (23%), Computer Software (16%), and Internet (6%).

So, if MongoDB provides such a great foundation, why not just step up to MongoDB Enterprise?

Why businesses seek an alternative to MongoDB Enterprise

The reasons for choosing an alternative to MongoDB Enterprise vary depending on business objectives, technical requirements, on-staff expertise, and project specifications. But there are common concerns about MongoDB Enterprise that drive people to seek alternatives.

Those problems (some with shared elements) include:

  • High cost and complicated pricing structure: Many companies say MongoDB has an expensive and complicated pricing structure (Cloud Zero, January 2023). MongoDB Enterprise is a commercial (proprietary) product with licensing fees, as well as support and maintenance charges that can rapidly escalate. With the potentially high costs and complications of tiers, pay-as-you-go, hourly and monthly rates, etc., companies and organizations seek alternatives that offer similar functionality at a lower cost.
  • Limited toolset: MongoDB Enterprise offers advanced features for data encryption, authentication, auditing, access control, and more. But you can be out of luck and forced to spend additional money if business-critical objectives require specific features or capabilities unavailable in MongoDB Enterprise.
  • Not really open source: Even the MongoDB Community version is not open source; it’s source-available and is under the SSPL license (introduced by MongoDB itself). MongoDB Enterprise, built on the Community version, adds proprietary features and database management tools. MongoDB Enterprise is commercial. Customers miss out on the cost-effectiveness, creative freedom, and global community support (for innovation, better performance, and enhanced security) that come with open source solutions and from companies with an open source spirit.
  • Inflexibility: Proprietary software puts a company — and its ability to tailor solutions to fit specific use cases — at the mercy of the software vendor. Conversely, open source database software (and companies that support source-available software with open source terms) provides the flexibility to customize and modify the software to suit specific requirements. Organizations have access to the source code, allowing them to make changes and enhancements as needed. This level of flexibility is particularly valuable for businesses with unique or specialized needs.
  • Vendor lock-in: Relying on contracted MongoDB Enterprise support to address immediate concerns, reduce complexity, and provide a secure database might provide initial comfort, but trepidation about vendor lock-in would be legitimate. Concerns about price hikes, paying for unnecessary technology, and being blocked from new technology can provide the impetus to seek an alternative to MongoDB Enterprise. Companies might opt for alternative databases that offer more of the aforementioned flexibility and the ability to migrate to different platforms or technologies in the future without significant challenges.
  • Infrastructure incompatibility: Organizations might have existing tools and applications that are not readily compatible with MongoDB Enterprise software. If an alternative database has better compatibility or provides specific integrations with the company’s existing technology stack, that might be a compelling reason to seek an alternative to MongoDB Enterprise.
  • Needless complexity: For smaller companies and/or those with simpler needs or limited database budgets and resources, some of the advanced features in MongoDB Enterprise could introduce unwanted complexity in the database environment. Such organizations might seek a more straightforward alternative.

Alternatives to MongoDB Enterprise

There are plenty of alternatives, but we’re not going to cover them all here, of course. Instead, let’s examine a few of the more popular non-relational (NoSQL) database options.

MongoDB Community

We’ve already touched on MongoDB Community’s licensing, but let’s address some of what the software lacks to be a viable technology alternative to its Enterprise sibling.

For starters, MongoDB Community lacks the advanced security features available in MongoDB Enterprise. It also lacks more advanced monitoring and management features like custom alerting, automation, and deeper insights into database performance that are part of MongoDB Enterprise. While it offers basic backup and restore functionality, the Community version lacks advanced features in MongoDB Enterprise, such as continuous backups, point-in-time recovery (PITR), and integration with third-party backup tools.

And from a support standpoint, for lack of a better word, it’s lacking. The MongoDB Community edition does not come with official technical support or service level agreements (SLAs) from MongoDB Inc.

Redis

Known for exceptional performance, Redis is a popular in-memory data platform. It stores data in RAM, which enables fast data access and retrieval. Redis can handle a high volume of operations per second, making it useful for running applications that require low latency. Redis supports a wide range of data structures, including strings, lists, sets, sorted sets, and hashes. Developers appreciate how Redis supplies appropriate data structures for specific use cases.

Storing large datasets can be a challenge, as Redis’ storage capacity is limited by available RAM. Also, Redis is designed primarily for key-value storage and lacks advanced querying capabilities.

Redis ranks right after MongoDB as the sixth most popular database management system (DB-Engines, July 2023).

Apache Cassandra

Apache Cassandra, with users across industries, ranks as the 12th most popular database management system (DB-Engines, July 2023). It’s an open source distributed NoSQL database that offers high scalability and availability. It manages unstructured data with thousands of writes every second.

Fault tolerance and linear scalability make Cassandra popular for handling mission-critical data. But because Cassandra handles large amounts of data and multiple requests, transactions can be slower, and there can be memory management issues.

Couchbase

Couchbase is a distributed document store with a powerful search engine and built-in operational and analytical capabilities. It’s designed to handle high volumes of data with minimal delay. An in-memory caching mechanism supports horizontal scaling, which enables it to handle large-scale applications and workloads effectively. Couchbase — No. 32 in database popularity (DB-Engines, July 2023) — uses a distributed, peer-to-peer architecture that enables data replication and automatic sharding across multiple nodes. This architecture ensures high availability, fault tolerance, and resilience to failures.

With Couchbase, certain tasks can be more challenging or time-consuming. Its indexing mechanisms are not as well-developed as those of some other database solutions. Additionally, it has its own query language, so the learning curve can be steeper.

Percona’s MongoDB alternative — enterprise advantages with none of the strings

Here’s one more alternative: If you want enterprise-grade MongoDB — without the high cost of runaway licensing fees or restrictions of vendor lock-in — consider Percona Software for MongoDB.

Secure, enterprise-grade Percona Software for MongoDB is freely available and empowers you to operate the production environments you want, wherever you want. Benefits include:

  • High performance without lock-in — Operate production environments requiring high-performance, highly available, and secure databases. Do it without licensing costs and vendor lock-in.
  • Data durability — Ensure it via an open source, distributed, and low-impact solution for consistent backups of MongoDB sharded clusters and replica sets.
  • Scalability — Freely deploy and scale MongoDB in a public or private cloud, on-premises, or hybrid environment—no credit card required.
  • MongoDB database health checks — Monitor, receive alerts, manage backups, and diagnose user-impacting incidents rooted in database configuration.
  • Automated procedures and accelerated value — Automate deployments, scaling, and backup and restore operations of MongoDB on Kubernetes.

For MongoDB users, Percona offers:

  • Percona Server for MongoDB — A source-available, fully compatible drop-in replacement for the MongoDB Community Edition with enterprise security, backup, and developer-friendly features.
  • Percona Backup for MongoDB — This is a fully supported, 100% open source community backup tool for MongoDB. It creates a physical data backup on a running server without notable performance and operating degradation. Percona Backup for MongoDB offers PITR and a backup management interface via Percona Monitoring and Management (PMM).
  • Percona Toolkit for MongoDB — A collection of advanced open source command-line tools that are engineered to perform a variety of tasks too difficult or complex to perform manually.
  • Percona Distribution for MongoDB — A collection of Percona for MongoDB software offerings integrated with each other and packed into a single solution that maximizes performance while being more cost-effective for teams to run over time.
  • Percona Monitoring and Management (PMM) — An open source database observability monitoring and management tool that’s ideal for finding MongoDB database issues.

With Percona Software for MongoDB, you can ensure data availability while improving security and simplifying the development of new applications — in the most demanding public, private, and hybrid cloud environments.

And with Percona, you’re never on your own. We back our MongoDB offerings with Percona Support, Managed Services, and Consulting. We’ll provide support that best fits the needs of your company or organization — without a restrictive contract.

 

Learn more about Percona Software for MongoDB

Jun
29
2023
--

CommitQuorum in Index Creation From Percona Server for MongoDB 4.4

Before Percona Server for MongoDB (PSMDB) 4.4, the best practice for creating an index was to do it in a rolling manner. Many folks used to create it directly on the Primary, so the index was first created on the Primary and then replicated to the Secondary nodes.

Starting with PSMDB 4.4, a new parameter, commitQuorum, was introduced in the createIndex command. If you do not pass this parameter explicitly with the createIndex command, it will use the default setting on a replica set or sharded cluster and start building the index simultaneously across all data-bearing voting replica set members.

Below is the command used to create an index using commitQuorum as the majority:

db.getSiblingDB("acme").products.createIndex({ "airt" : 1 }, { }, "majority")

With the above command, the index build commits once a majority of data-bearing replica set members are ready. There are other options available when using commitQuorum:

  1. “votingMembers” – The default: all data-bearing voting replica set members must be ready. A “voting” member is any replica set member whose votes setting is greater than 0.
  2. “majority” – A simple majority of data-bearing replica set members.
  3. “<int>” – A specific number of data-bearing replica set members. Specify an integer greater than 0.
  4. A replica set tag name – the data-bearing members that carry that tag.
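
The four forms differ only in how many (and which) data-bearing members must report that they are ready to commit. As a rough sketch of that logic (a hypothetical helper for illustration, not part of the MongoDB API; member objects are assumed to carry `votes` and `tags` as in `rs.conf()`):

```javascript
// Sketch: translate a commitQuorum value into the number of data-bearing
// members that must be "commit ready". Hypothetical helper, not MongoDB API.
function requiredCommitReady(quorum, members) {
  if (quorum === "votingMembers") {
    // default: every data-bearing member with votes > 0
    return members.filter((m) => m.votes > 0).length;
  }
  if (quorum === "majority") {
    // simple majority of data-bearing members
    return Math.floor(members.length / 2) + 1;
  }
  if (typeof quorum === "number") {
    // a specific number of members (must be > 0)
    return quorum;
  }
  // otherwise: treat the string as a replica set tag name
  return members.filter((m) => m.tags && quorum in m.tags).length;
}

// Example: a three-member replica set, two members tagged "dc"
const members = [
  { name: "127.0.0.1:27017", votes: 1, tags: { dc: "east" } },
  { name: "localhost:27018", votes: 1, tags: { dc: "east" } },
  { name: "localhost:27019", votes: 1, tags: {} },
];
console.log(requiredCommitReady("votingMembers", members)); // 3
console.log(requiredCommitReady("majority", members)); // 2
```

With three voting members, the default requires all three to be ready, while "majority" requires only two; that difference drives the scenarios below.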

Now we will see the scenarios of what happens when the index is created with the default and majority commitQuorum.

  1. When all data-bearing replica set members are available, and the index is created with default commitQuorum, below are the details from the Primary and the Secondary nodes. Create index:
    rs1:PRIMARY> db.products.createIndex({ "airt" : 1 })

    Primary logs:

    {"t":{"$date":"2023-06-26T12:33:18.417+00:00"},"s":"I",  "c":"INDEX",    "id":20384,   "ctx":"IndexBuildsCoordinatorMongod-0","msg":"Index build: starting","attr":{"namespace":"acme.products","buildUUID":{"uuid":{"$uuid":"58f4e7bf-7b8f-4eb6-8de0-0ad774c4b51f"}},"properties":{"v":2,"key":{"airt":1.0},"name":"airt_1"},"method":"Hybrid","maxTemporaryMemoryUsageMB":200}}

    Secondary logs:

    {"t":{"$date":"2023-06-26T12:33:18.417+00:00"},"s":"I",  "c":"INDEX",    "id":20384,   "ctx":"IndexBuildsCoordinatorMongod-0","msg":"Index build: starting","attr":{"namespace":"acme.products","buildUUID":{"uuid":{"$uuid":"58f4e7bf-7b8f-4eb6-8de0-0ad774c4b51f"}},"properties":{"v":2,"key":{"airt":1.0},"name":"airt_1"},"method":"Hybrid","maxTemporaryMemoryUsageMB":200}}

    Secondary logs:

    {"t":{"$date":"2023-06-26T12:33:28.445+00:00"},"s":"I",  "c":"INDEX",    "id":20384,   "ctx":"IndexBuildsCoordinatorMongod-0","msg":"Index build: starting","attr":{"namespace":"acme.products","buildUUID":{"uuid":{"$uuid":"58f4e7bf-7b8f-4eb6-8de0-0ad774c4b51f"}},"properties":{"v":2,"key":{"airt":1.0},"name":"airt_1"},"method":"Hybrid","maxTemporaryMemoryUsageMB":200}}

    We can see the above index was created simultaneously on all the data-bearing voting replica set members.

  2. When one secondary is down, and the index is created with default commitQuorum, below are the details from the Primary and the Secondary nodes.

    Status of nodes:

    rs1:PRIMARY> rs.status().members.forEach(function (d) {print(d.name) + " " + print(d.stateStr)});
    127.0.0.1:27017
    PRIMARY
    localhost:27018
    SECONDARY
    localhost:27019
    (not reachable/healthy)
    rs1:PRIMARY>

    Index command:

    rs1:PRIMARY> db.products.createIndex({ "airt" : 1 })

    Replication status:

    rs1:PRIMARY> db.printSecondaryReplicationInfo()
    source: localhost:27018
            syncedTo: Mon Jun 26 2023 17:56:30 GMT+0000 (UTC)
            0 secs (0 hrs) behind the primary
    source: localhost:27019
            syncedTo: Thu Jan 01 1970 00:00:00 GMT+0000 (UTC)
            1687802190 secs (468833.94 hrs) behind the primary
    rs1:PRIMARY>

    Index status:

    rs1:PRIMARY> db.currentOp(true).inprog.forEach(function(op){ if(op.msg!==undefined) print(op.msg) })
    Index Build: draining writes received during build
    rs1:PRIMARY> Date()
    Mon Jun 26 2023 18:07:26 GMT+0000 (UTC)
    rs1:PRIMARY>

    CurrentOp:

    "active" : true,
          "currentOpTime" :"2023-06-26T19:04:33.175+00:00",
          "opid" : 329147,
      "lsid" : {
               "id" :UUID("dd9672f8-4f56-47ce-8ceb-31caf5e8baf8"),
               "uid": BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=")
                },
     "secs_running" : NumberLong(4214),
     "microsecs_running" : NumberLong("4214151233"),
     "op" : "command",
     "ns" : "acme.products",
     "command" : {
                  "createIndexes" : "products",
                  "indexes" : [
                                  {
                                   "key" : {
                                            "airt" : 1
                                            },
                                   "name" : "airt_1"
                                  }
                               ],
                   "lsid" : {
                   "id" :UUID("dd9672f8-4f56-47ce-8ceb-31caf5e8baf8")
                             },
                   "$clusterTime" : {
                                     "clusterTime" : Timestamp(1687801980, 1),
                                     "signature" : {
                                     "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                                     "keyId" : NumberLong(0)
                                            }
                                    },
                    "$db" : "acme"
                            }

    Logs from Primary node:

    {"t":{"$date":"2023-06-26T17:54:21.419+00:00"},"s":"I",  "c":"STORAGE",  "id":3856203, "ctx":"IndexBuildsCoordinatorMongod-1","msg":"Index build: waiting for next action before completing final phase","attr":{"buildUUID":{"uuid":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"}}}}

     Logs from up-and-running Secondary node:

    {"t":{"$date":"2023-06-26T17:54:21.424+00:00"},"s":"I",  "c":"STORAGE",  "id":3856203, "ctx":"IndexBuildsCoordinatorMongod-1","msg":"Index build: waiting for next action before completing final phase","attr":{"buildUUID":{"uuid":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"}}}}

    You can see above that when one node is down and the index is created with the default commitQuorum, the createIndex command keeps running until the third data-bearing voting node comes back up. Now we can check whether the index appears on the Primary:

    rs1:PRIMARY> db.products.getIndexes()
    [
            {
                    "v" : 2,
                    "key" : {
                            "_id" : 1
                    },
                    "name" : "_id_"
            },
            {
                    "v" : 2,
                    "key" : {
                            "airt" : 1
                    },
                    "name" : "airt_1"
            }
    ]
    rs1:PRIMARY>

    We can see the index is listed, but it cannot be used yet because the build has not been marked as complete.

    Below is the explain plan of a query, where we can see the query is doing COLLSCAN instead of IXSCAN:

    rs1:PRIMARY> db.products.find({"airt" : 1.9869362536440427}).explain()
    {
            "queryPlanner" : {
                    "plannerVersion" : 1,
                    "namespace" : "acme.products",
                    "indexFilterSet" : false,
                    "parsedQuery" : {
                            "airt" : {
                                    "$eq" : 1.9869362536440427
                            }
                    },
                    "queryHash" : "65E2F79D",
                    "planCacheKey" : "AA490985",
                    "winningPlan" : {
                            "stage" : "COLLSCAN",
                            "filter" : {
                                    "airt" : {
                                            "$eq" : 1.9869362536440427
                                    }
                            },
                            "direction" : "forward"
                    },
                    "rejectedPlans" : [ ]
            },
            "serverInfo" : {
                    "host" : "ip-172-31-82-235.ec2.internal",
                    "port" : 27017,
                    "version" : "4.4.22-21",
                    "gitVersion" : "be7a5f4a1000bed8cf1d1feb80a20664d51503ce"
    }

    Now I will bring up the third node, and we will see the index operation complete.

    Index status:

    rs1:PRIMARY> db.products.createIndex({ "airt" : 1 })
    {
            "createdCollectionAutomatically" : false,
            "numIndexesBefore" : 1,
            "numIndexesAfter" : 2,
            "commitQuorum" : "votingMembers",
            "ok" : 1,
            "$clusterTime" : {
                    "clusterTime" : Timestamp(1687806737, 3),
                    "signature" : {
                            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                            "keyId" : NumberLong(0)
                    }
            },
            "operationTime" : Timestamp(1687806737, 3)
    }
    rs1:PRIMARY>

    Now we will run the same query, and we can see the index (IXSCAN) is used because the index build completed successfully above:

    rs1:PRIMARY> db.products.find({"airt" : 1.9869362536440427}).explain()
    {
            "queryPlanner" : {
                    "plannerVersion" : 1,
                    "namespace" : "acme.products",
                    "indexFilterSet" : false,
                    "parsedQuery" : {
                            "airt" : {
                                    "$eq" : 1.9869362536440427
                            }
                    },
                    "queryHash" : "65E2F79D",
                    "planCacheKey" : "AA490985",
                    "winningPlan" : {
                            "stage" : "FETCH",
                            "inputStage" : {
                                    "stage" : "IXSCAN",
                                    "keyPattern" : {
                                            "airt" : 1
                                    },
                                    "indexName" : "airt_1",
                                    "isMultiKey" : false,
                                    "multiKeyPaths" : {
                                            "airt" : [ ]
                                    },
                                    "isUnique" : false,
                                    "isSparse" : false,
                                    "isPartial" : false,
                                    "indexVersion" : 2,
                                    "direction" : "forward",
                                    "indexBounds" : {
                                            "airt" : [
                                                    "[1.986936253644043, 1.986936253644043]"
                                            ]
                                    }
                            }
                    },
                    "rejectedPlans" : [ ]
            },
            "serverInfo" : {
                    "host" : "ip-172-31-82-235.ec2.internal",
                    "port" : 27017,
                    "version" : "4.4.22-21",
                    "gitVersion" : "be7a5f4a1000bed8cf1d1feb80a20664d51503ce"
            }

    Primary logs once the third node came up and the index was created successfully:

    {"t":{"$date":"2023-06-26T19:12:17.450+00:00"},"s":"I",  "c":"STORAGE",  "id":3856201, "ctx":"conn40","msg":"Index build: commit quorum satisfied","attr":{"indexBuildEntry":{"_id":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"},"collectionUUID":{"$uuid":"a963b7e7-1054-4a5f-a935-a5be8995cff0"},"commitQuorum":"votingMembers","indexNames":["airt_1"],"commitReadyMembers":["127.0.0.1:27017","localhost:27018","localhost:27019"]}}}
    
    {"t":{"$date":"2023-06-26T19:12:17.450+00:00"},"s":"I",  "c":"STORAGE",  "id":3856204, "ctx":"IndexBuildsCoordinatorMongod-1","msg":"Index build: received signal","attr":{"buildUUID":{"uuid":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"}},"action":"Commit quorum Satisfied"}}
    
    {"t":{"$date":"2023-06-26T19:12:17.451+00:00"},"s":"I",  "c":"INDEX",    "id":20345,   "ctx":"IndexBuildsCoordinatorMongod-1","msg":"Index build: done building","attr":{"buildUUID":{"uuid":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"}},"namespace":"acme.products","index":"airt_1","commitTimestamp":{"$timestamp":{"t":1687806737,"i":2}}}}
    
    {"t":{"$date":"2023-06-26T19:12:17.452+00:00"},"s":"I",  "c":"STORAGE",  "id":20663,   "ctx":"IndexBuildsCoordinatorMongod-1","msg":"Index build: completed successfully","attr":{"buildUUID":{"uuid":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"}},"namespace":"acme.products","uuid":{"uuid":{"$uuid":"a963b7e7-1054-4a5f-a935-a5be8995cff0"}},"indexesBuilt":1,"numIndexesBefore":1,"numIndexesAfter":2}}
    
    {"t":{"$date":"2023-06-26T19:12:17.554+00:00"},"s":"I",  "c":"INDEX",    "id":20447,   "ctx":"conn34","msg":"Index build: completed","attr":{"buildUUID":{"uuid":{"$uuid":"46451b37-141f-4312-a219-4b504736ab5b"}}}}
    
    {"t":{"$date":"2023-06-26T19:12:17.554+00:00"},"s":"I",  "c":"COMMAND",  "id":51803,   "ctx":"conn34","msg":"Slow query","attr":{"type":"command","ns":"acme.products","appName":"MongoDB Shell","command":{"createIndexes":"products","indexes":[{"key":{"airt":1.0},"name":"airt_1"}],"lsid":{"id":{"$uuid":"dd9672f8-4f56-47ce-8ceb-31caf5e8baf8"}},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1687801980,"i":1}},"signature":{"hash":{"$binary":{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"0"}},"keyId":0}},"$db":"acme"},"numYields":0,"reslen":271,"locks":{"ParallelBatchWriterMode":{"acquireCount":{"r":3}},"FeatureCompatibilityVersion":{"acquireCount":{"r":1,"w":4}},"ReplicationStateTransition":{"acquireCount":{"w":5}},"Global":{"acquireCount":{"r":1,"w":4}},"Database":{"acquireCount":{"w":3}},"Collection":{"acquireCount":{"r":1,"w":1,"W":1}},"Mutex":{"acquireCount":{"r":3}}},"flowControl":{"acquireCount":3,"timeAcquiringMicros":7},"storage":{"data":{"bytesRead":98257,"timeReadingMicros":3489}},"protocol":"op_msg","durationMillis":4678530}}

    Above, you can see how long the index build took to complete; the operation kept running for as long as the third node was down.

  3. When one secondary is down, and the index is created with commitQuorum as the majority, below are the details from the Primary and the Secondary nodes. Status of nodes:
    rs1:PRIMARY> rs.status().members.forEach(function (d) {print(d.name) + " " + print(d.stateStr)});
    127.0.0.1:27017
    PRIMARY
    localhost:27018
    SECONDARY
    localhost:27019
    (not reachable/healthy)
    rs1:PRIMARY>

    Index command:

    rs1:PRIMARY> db.products.createIndex({ "airt" : 1 }, { }, "majority")
    {
            "createdCollectionAutomatically" : false,
            "numIndexesBefore" : 1,
            "numIndexesAfter" : 2,
            "commitQuorum" : "majority",
            "ok" : 1,
            "$clusterTime" : {
                    "clusterTime" : Timestamp(1687808148, 4),
                    "signature" : {
                            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                            "keyId" : NumberLong(0)
                    }
            },
            "operationTime" : Timestamp(1687808148, 4)
    }
    rs1:PRIMARY>

    Logs from Primary node:

    {"t":{"$date":"2023-06-26T19:35:48.821+00:00"},"s":"I",  "c":"STORAGE",  "id":3856201, "ctx":"conn7","msg":"Index build: commit quorum satisfied","attr":{"indexBuildEntry":{"_id":{"$uuid":"5f8f75ee-aa46-42a6-b4c2-59a68fea47a7"},"collectionUUID":{"$uuid":"a963b7e7-1054-4a5f-a935-a5be8995cff0"},"commitQuorum":"majority","indexNames":["airt_1"],"commitReadyMembers":["127.0.0.1:27017","localhost:27018"]}}}
    
    {"t":{"$date":"2023-06-26T19:35:48.821+00:00"},"s":"I",  "c":"STORAGE",  "id":3856204, "ctx":"IndexBuildsCoordinatorMongod-3","msg":"Index build: received signal","attr":{"buildUUID":{"uuid":{"$uuid":"5f8f75ee-aa46-42a6-b4c2-59a68fea47a7"}},"action":"Commit quorum Satisfied"}}
    
    {"t":{"$date":"2023-06-26T19:35:48.822+00:00"},"s":"I",  "c":"INDEX",    "id":20345,   "ctx":"IndexBuildsCoordinatorMongod-3","msg":"Index build: done building","attr":{"buildUUID":{"uuid":{"$uuid":"5f8f75ee-aa46-42a6-b4c2-59a68fea47a7"}},"namespace":"acme.products","index":"airt_1","commitTimestamp":{"$timestamp":{"t":1687808148,"i":3}}}}
    
    {"t":{"$date":"2023-06-26T19:35:48.824+00:00"},"s":"I",  "c":"STORAGE",  "id":20663,   "ctx":"IndexBuildsCoordinatorMongod-3","msg":"Index build: completed successfully","attr":{"buildUUID":{"uuid":{"$uuid":"5f8f75ee-aa46-42a6-b4c2-59a68fea47a7"}},"namespace":"acme.products","uuid":{"uuid":{"$uuid":"a963b7e7-1054-4a5f-a935-a5be8995cff0"}},"indexesBuilt":1,"numIndexesBefore":1,"numIndexesAfter":2}}
    
    {"t":{"$date":"2023-06-26T19:35:48.923+00:00"},"s":"I",  "c":"INDEX",    "id":20447,   "ctx":"conn34","msg":"Index build: completed","attr":{"buildUUID":{"uuid":{"$uuid":"5f8f75ee-aa46-42a6-b4c2-59a68fea47a7"}}}}
    
    {"t":{"$date":"2023-06-26T19:35:48.923+00:00"},"s":"I",  "c":"COMMAND",  "id":51803,   "ctx":"conn34","msg":"Slow query","attr":{"type":"command","ns":"acme.products","appName":"MongoDB Shell","command":{"createIndexes":"products","indexes":[{"key":{"airt":1.0},"name":"airt_1"}],"commitQuorum":"majority","lsid":{"id":{"$uuid":"dd9672f8-4f56-47ce-8ceb-31caf5e8baf8"}},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1687808123,"i":1}},"signature":{"hash":{"$binary":{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"0"}},"keyId":0}},"$db":"acme"},"numYields":0,"reslen":266,"locks":{"ParallelBatchWriterMode":{"acquireCount":{"r":3}},"FeatureCompatibilityVersion":{"acquireCount":{"r":1,"w":4}},"ReplicationStateTransition":{"acquireCount":{"w":5}},"Global":{"acquireCount":{"r":1,"w":4}},"Database":{"acquireCount":{"w":3}},"Collection":{"acquireCount":{"r":1,"w":1,"W":1}},"Mutex":{"acquireCount":{"r":3}}},"flowControl":{"acquireCount":3,"timeAcquiringMicros":7},"storage":{},"protocol":"op_msg","durationMillis":2469}}

    Above, we can see that when one node is down and commitQuorum is set to majority while creating the index, the index build completes as expected, because a majority of voting nodes (two of three) were up and running.
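
The behavior in scenarios 2 and 3 can be summed up in a small sketch: given the members that have reported ready (the commitReadyMembers field in the log lines above) and the quorum in effect, decide whether the build may commit. This is hypothetical illustration code, not MongoDB internals:

```javascript
// Sketch: decide whether an index build may commit, mirroring the
// "commit quorum satisfied" log lines above. Hypothetical helper.
function quorumSatisfied(commitReadyMembers, votingMembers, quorum) {
  if (quorum === "votingMembers") {
    // default: every voting member must be ready, so one down node blocks the commit
    return votingMembers.every((m) => commitReadyMembers.includes(m));
  }
  if (quorum === "majority") {
    return commitReadyMembers.length >= Math.floor(votingMembers.length / 2) + 1;
  }
  return commitReadyMembers.length >= quorum; // numeric quorum
}

const voting = ["127.0.0.1:27017", "localhost:27018", "localhost:27019"];
const ready = ["127.0.0.1:27017", "localhost:27018"]; // localhost:27019 is down

console.log(quorumSatisfied(ready, voting, "votingMembers")); // false: build waits (scenario 2)
console.log(quorumSatisfied(ready, voting, "majority")); // true: build commits (scenario 3)
```

The same two-of-three situation blocks the default quorum but satisfies "majority", which is exactly what the two scenarios demonstrated.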

So far, we have discussed how and when to use commitQuorum. Now we will look at a scenario where one (voting) node is down for some reason and someone creates an index with the default commitQuorum. The operation will keep running, and you want to kill it.

I created the index with the default commitQuorum when one node is down.

Status of nodes:

rs1:PRIMARY> rs.status().members.forEach(function (d) {print(d.name) + " " + print(d.stateStr)});
127.0.0.1:27017
PRIMARY
localhost:27018
SECONDARY
localhost:27019
(not reachable/healthy)
rs1:PRIMARY>

CurrentOp:

"active" : true,
"currentOpTime" : "2023-06-26T21:27:41.304+00:00",
"opid" : 536535,
"lsid" : {
          "id" : UUID("dd9672f8-4f56-47ce-8ceb-31caf5e8baf8"),
          "uid" : BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=")
 },
 "secs_running" : NumberLong(264),
 "microsecs_running" : NumberLong(264345444),
 "op" : "command",
 "ns" : "acme.products",
 "command" : {
              "createIndexes" : "products",
              "indexes" : [
                      {
                           "key" : {
                                     "airt" : 1
                                    },
                           "name" : "airt_1"
                       }
                            ],
 "lsid" : {
               "id" : UUID("dd9672f8-4f56-47ce-8ceb-31caf5e8baf8")
           },
               "$clusterTime" : {
                                "clusterTime" : Timestamp(1687814589, 2),
                                "signature" : {
                                               "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                                 "keyId" : NumberLong(0)
                                 }
                                },
                                "$db" : "acme"
                        }

Now you need to kill the above opid to release the above op:

rs1:PRIMARY> db.killOp(536535)
{
        "info" : "attempting to kill op",
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1687815189, 2),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        },
        "operationTime" : Timestamp(1687815189, 2)
}
rs1:PRIMARY>
rs1:PRIMARY> db.products.createIndex({ "airt" : 1 })
{
        "operationTime" : Timestamp(1687815192, 2),
        "ok" : 0,
        "errmsg" : "operation was interrupted",
        "code" : 11601,
        "codeName" : "Interrupted",
        "$clusterTime" : {
                "clusterTime" : Timestamp(1687815192, 2),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
rs1:PRIMARY>
rs1:PRIMARY> db.products.getIndexes()
[ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_" } ]
rs1:PRIMARY>

Above, we can see that when we killed the op, the index creation operation was terminated.
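
Finding the right opid by eye in a large db.currentOp() document is error-prone. Here is a sketch of the filtering logic as a hypothetical helper operating on a plain currentOp-style snapshot (in the shell you could do the equivalent with db.currentOp(true).inprog.filter(...)):

```javascript
// Sketch: collect opids of long-running createIndexes operations from a
// db.currentOp()-style snapshot, so they can be passed to db.killOp().
// Hypothetical helper for illustration.
function indexBuildOpIds(snapshot, minSecsRunning) {
  return snapshot.inprog
    .filter(
      (op) =>
        op.active &&
        op.command &&
        op.command.createIndexes !== undefined &&
        op.secs_running >= minSecsRunning
    )
    .map((op) => op.opid);
}

// Example mirroring the stuck build above
const snapshot = {
  inprog: [
    { active: true, opid: 536535, secs_running: 264,
      command: { createIndexes: "products" } },
    { active: true, opid: 536601, secs_running: 2,
      command: { find: "products" } },
  ],
};
console.log(indexBuildOpIds(snapshot, 60)); // [ 536535 ]
```

A threshold like minSecsRunning keeps short, healthy builds out of the result so you only kill operations that are genuinely stuck.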

Conclusion

We have seen how commitQuorum works when creating indexes from PSMDB 4.4 onward. Still, the best practice is to create indexes in a rolling manner.

We recommend checking out our products for Percona Server for MongoDB, Percona Backup for MongoDB, and Percona Operator for MongoDB. We also recommend checking out our blog MongoDB: Why Pay for Enterprise When Open Source Has You Covered?

May
15
2023
--

MongoDB 4.2 EOL… And Its Implications

MongoDB 4.2 EOL

Enjoy it while it lasts, as everything has its end.

It sounded a bit more cryptic than planned, but I hope it gets the attention it needs: it's important to know that MongoDB 4.2 reached its End of Life (EOL) in April, and more versions will soon be decommissioned as well.

What does that mean for me?

If you are a user of MongoDB 4.2, whether the MongoDB Inc. version or Percona Server for MongoDB one, your database will no longer receive bug fixes, patches, or minor releases.

As defined in our Lifecycle Policy:

We will provide Operational Support for customers running EOL software on active platforms. For EOLed platforms, we provide Community Support.

And as stated in our Lifecycle Overview:

For software that Percona bases on an upstream build, we match the upstream EOL dates.

Our MongoDB Server follows the upstream EOL dates. This means that bug fixes and software builds will no longer be produced for our release of MongoDB either.

Also, with the Percona Server for MongoDB 4.2 reaching its end of life, the implications are as follows:

  • Percona Distribution for MongoDB 4.2 will no longer receive updates and bug fixes
  • Percona Backup for MongoDB (PBM) will no longer support 4.2 either. That means that testing with 4.2 has ceased, and while PBM may still successfully perform backups and restores, we cannot guarantee it anymore.

That being said, rest assured, you will not be left alone. Those that have or would like to sign up for a Percona Support Subscription will continue to receive operational support and services. Operational support includes but is not limited to:

  • Query optimization
  • MongoDB tuning (replica sets and sharded clusters)
  • MongoDB configuration, including our enterprise features such as LDAP
  • Upgrade support (from EOL versions, e.g., 3.6 -> 4.0 -> 4.2 -> …)
  • Setup and configuration of MongoDB clusters and tools such as Percona Backup for MongoDB and Percona Monitoring and Management (respecting the tool limitations for the EOL-ed version).
  • In case of crashes, although we do not report bugs, we can still track down known bugs and provide recommendations.

Still have questions about the 4.2 EOL?

In her recent blog post, MongoDB V4.2 EOL Is Coming: How To Upgrade Now and Watch Out for the Gotchas!, our MongoDB Tech Lead, Kimberly Wilkins, has covered the ins and outs of a MongoDB upgrade.

She has also hosted a webinar on the MongoDB 4.2 EOL common questions and challenges.

If you are our customer, please create a ticket for more assistance. Remember also that our Percona Community Forum is always open for any users of our software, as we believe that community is very important in building our products!

What’s next

I do not want to be the bearer of bad news, but we have seen the great popularity of MongoDB 4.2 and 4.4. If you are on 4.2 right now, it is critical to move off it as soon as possible, as this version has now become a potential threat to your security.

As you see, 4.4 was mentioned as well. That's right: this highly popular version, the last in the 4.x family, is scheduled to go EOL in February 2024. That's less than a year to prepare for an upgrade.

Mongo 4.2 EOL

MongoDB EOL for the upcoming year or so.

While at it, note that 5.0 is planned to go EOL in October 2024 as well, so next year it's worth considering an upgrade to 6.0, which gives you at least until 2025 before the next EOL.

MongoDB eol

MongoDB 6.0 still has two years of life now.

If such an upgrade seems challenging and you want some help or at least advice around it, consider some of our premium services from MongoDB experts that can help you with migration by:

  • Support – Answering any operational questions
  • Managed Services – Playing the role of the remote DBA that handles all maintenance (including upgrades) for you
  • Consulting – Professionals that can come in and advise or even do the upgrade for you at any time
  • Training – So that your team can feel more comfortable with running the upgrades

Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.

Download Percona Distribution for MongoDB Today!

Apr
28
2023
--

Add More Security to Your Percona Server for MongoDB With AWS IAM integration!

MongoDB With AWS IAM Integration

Did you notice that Percona Server for MongoDB 6.0.5-4 was released just a few days ago? This time around, it introduced improvements to the way we handle master key rotation for data at rest encryption as well as AWS IAM integration.

One key to rule them all — improvements to master key rotation

With the improvements introduced in Percona Server for MongoDB 6.0.5-4, one key path can be used for all servers in a clustered environment. This allows us to use one vault key namespace for all nodes in a deployment while at the same time preserving key versions and allowing each node to perform key rotation without impact to the other nodes.

Changes introduced with Percona Server for MongoDB 6.0.5-4 now allow using the same key for all the members of a replica set if the user chooses so, without impact on functionality.

Why should you care about AWS IAM integration?

With all the systems users need to access daily, password management becomes an increasingly pressing issue. The introduction of IAM systems has become something of a security standard in large enterprises.

Our users approached us about integration with AWS IAM, commonly used in their organizations. It’s an integration missing from MongoDB Community Edition (CE) that is important for compliance with enterprise security policies of many companies. Integration with AWS IAM allows:

MongoDB AWS IAM integration

To set up this integration, follow the steps outlined in our documentation, and configure either the user or the role authentication. This will allow AWS Security Token Service (STS) to play the part of Identity Provider (IDP) in a SAML 2.0-based federation.

Your feedback matters!

We take pride in being open to feedback in Percona. Please do not hesitate to contact us via the community forums or this contact form.

What’s next

We are looking into the problems affecting large datastores that are a pain point for our users. Please let us know if there are any particular issues you are struggling with in MongoDB; we are always open to suggestions!

Learn more about Percona Server for MongoDB

Apr
13
2023
--

MongoDB V4.2 EOL Is Coming: How To Upgrade Now and Watch Out for the Gotchas!

MongoDB upgrade

MongoDB v4.2 was released in August 2019. And like previous versions, there is a time to go — and that End of Life (EOL) date approaches.

MongoDB v4.2 EOL is set for April 30, 2023. Since Percona Server for MongoDB (PSMDB) is a fully compatible replacement for the upstream MongoDB Community Edition, we try to follow the same EOL date as the upstream vendor. So let’s discuss why you should upgrade MongoDB, what the upgrade process looks like, and some of the potential gotchas that you may run into along the way.

First, consider why you even need to upgrade MongoDB.

You may think that everything is fine, your application is humming along, you have had no major problems with this version, and you don’t really see a reason to upgrade.  And that’s exactly why you want to upgrade; you want to keep it that way. There are multiple very good reasons to upgrade MongoDB.

Those reasons include the following:

  • Security, stability, compliance
  • Bug fixes 
  • New features, enhancements
  • 3rd party application requirements, code changes

The two most important reasons to upgrade MongoDB are around security and bug fixes.

The goal is always to ensure that your PSMDB/MongoDB environment stays as stable, secure, and performant as possible. 

According to Percona’s Software Release Lifecycle Overview, after a version has gone EOL, no new builds or updates will be provided. We will continue to offer Operational Support on Supported Platforms.

Don’t leave your database vulnerable because this version has gone on to the database software wasteland. Plan and execute your upgrade ASAP.

With new versions come new features that may be beneficial to your application. And some just like shiny new things. Some of the major new features available in more recent versions of MongoDB are Time Series Collections and Live Resharding in v5.0.

Between v4.2 and v6.0, there are multiple enhancements to take advantage of.

Those include more resilient chunk migrations and the ability to at least refine your shard keys in v4.4, the above-mentioned v5.0 new features, and many changes under the hood to support those major new features. Then with v6.0, you get sync improvements, mandatory use of the new mongosh shell, many enhancements and tweaks to all of the major changes that v5.0 brought about, and a new default chunk size of 128M to help handle some of the recent changes to the auto-splitter, chunk migration, and balancer processes.

But there are quite a few hops between v4.2 and v6.0. So let’s debunk some of the reasons that you may have for NOT upgrading.

Those main potential blockers could include:

  • Lack of resources or staff
  • Lack of time/can’t afford the downtime
  • Legacy code
  • Outdated drivers
  • Stack incompatibility

But really, the danger of not having support or bug fixes for your business-critical databases and applications is generally a very good reason to overcome all of those blockers sooner rather than later.

Let’s look at the steps required to upgrade MongoDB and the different types of environments that you may have running. 

The types of systems include:

  • Standalone
  • Replica Sets
  • Sharded Clusters

Those systems or architectures look like the below:

types of mongodb systems

Now look at the components that make up those systems.

mongodb architectural components

That gives us an idea of what we will need to upgrade.

But we also have to consider the built-in Replication that MongoDB has. That, too, will impact our upgrade process.

replication via replica sets

Basic overall MongoDB upgrade steps

  • Take a backup
  • Check prerequisites and resolve issues as needed
    • Add resources if needed
  • Download the new desired binaries
  • Keep FCV (Feature Compatibility Version) set to the current/previous version
  • Shut down mongo processes in the correct order, in a rolling fashion according to system type
  • Replace the current binaries with the new binaries
  • Start up mongo processes in the correct order, in a rolling fashion as required by system type
  • Wait for a period of time to ensure there are no problems
  • Set FCV to the new desired version
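The FCV steps above map to two admin commands in the mongo shell. A quick sketch (the version string is illustrative; use the major version you are moving to):

```javascript
// Check the current feature compatibility version (FCV)
db.adminCommand({ getParameter: 1, featureCompatibilityVersion: 1 })

// Only after the binary upgrade has checked out, raise the FCV
// ("6.0" is illustrative; use the major version you upgraded to)
db.adminCommand({ setFeatureCompatibilityVersion: "6.0" })
```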

The Pre-Req Checks generally look like this:

upgrade pre-reqs

Once those are done, begin the upgrade for the standalone system type.

Upgrade a standalone system

It’s a single host or node with one process, so do the following:

  • Take a backup
  • Shut down the mongod process
  • Replace the current binary with the new binary
  • Keep FCV at the previous version until everything checks out after the upgrade
  • Restart the mongod process using the original port (default 27017)
  • Check the version, data, logs, etc.
  • When all is good, update the FCV
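On a systemd-based host where MongoDB was installed from packages, the standalone flow looks roughly like the following sketch (the package name and log path are assumptions; they vary by distribution and by whether you run MongoDB CE or Percona Server for MongoDB):

```shell
# Stop the running mongod
sudo systemctl stop mongod

# Swap in the new binaries, e.g., via the package manager
# (package name is illustrative; use the correct one for your build)
sudo yum install -y percona-server-mongodb

# Start mongod back up on its original port (default 27017)
sudo systemctl start mongod

# Verify the new version and skim the log for errors
mongod --version
sudo tail -n 100 /var/log/mongo/mongod.log
```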

Now let’s look at the steps for the other two system types.

First, the Replica Set upgrade steps:

replica set upgrade mongodb

*Note* that there is a warning there.

You ALWAYS want to upgrade MongoDB to the latest version available – well, after checking the bug list, of course. We’ll review that warning when discussing the “Gotchas” later.

For Replica Set upgrades – upgrade binaries in a rolling manner. Start with the secondaries, then force the election of a new PRIMARY, and then upgrade the binary for the former PRIMARY.

Basic Replica Set upgrade steps

Upgrade the secondaries in a rolling manner – SECONDARY 1, then SECONDARY 2, then force the election of a new PRIMARY and upgrade the old PRIMARY (now a secondary).

Step 1 – Upgrade SECONDARY 1

 – Shut down Secondary 1

 – Take Secondary 1 out of the replica set by restarting it with another port number (ex. port 3333)

 – Change the binaries to the new version

 – Start Secondary 1 back up with its original replica set port number (ex. 27017)

Step 2 – Upgrade SECONDARY 2 – repeat the same process:

 – Shut down Secondary 2

 – Take Secondary 2 out of the replica set by restarting it with another port number (ex. port 3333)

 – Change the binaries to the new version

 – Start Secondary 2 back up with its original replica set port number (ex. 27017)

Step 3 – Upgrade the current PRIMARY

 – Step down the current PRIMARY to force the election of a new PRIMARY; make sure the replica set state is good

 – Shut down the old PRIMARY, now new Secondary 3

 – Take new Secondary 3 out of the replica set by restarting it with another port number (ex. port 3333)

 – Change the binaries to the new version

 – Start new Secondary 3 back up with its original replica set port number (ex. 27017)

Wait, then check the data and logs. If there are no problems, update the FCV to the new version.

Done.
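The step-down and health checks in Step 3 can be driven from the mongo shell. A minimal sketch:

```javascript
// On the current PRIMARY: step down and force an election.
// The argument is the number of seconds this node will not seek re-election.
rs.stepDown(120)

// On any member: confirm a new PRIMARY was elected and all members are healthy
rs.status().members.forEach(m => print(m.name, m.stateStr))
```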

Now move on to the Sharded Cluster Upgrade process, where there are more components to consider and upgrade.

Reminder – those components are:

mongodb Sharded Cluster Upgrade

*Note* When upgrading a sharded cluster, the order in which you upgrade the components matters.

You have additional components to deal with: the balancer, the config server replica set, the data-bearing mongod shard nodes, and the mongos query router processes.

Order matters with this system type upgrade:

  • Stop the balancer
  • Upgrade the config servers
  • Upgrade the shard nodes – mongod's
  • Upgrade the mongos's
  • Start the balancer
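The balancer steps are run against a mongos, using the standard shell helpers. A sketch:

```javascript
// Connected to a mongos: stop the balancer before touching any component
sh.stopBalancer()

// Confirm it is disabled and no migration is in flight before proceeding
sh.getBalancerState()   // should be false
sh.isBalancerRunning()  // should report no active balancing round

// ... upgrade config servers, then shards, then mongos's ...

// Once every component is upgraded, re-enable the balancer
sh.startBalancer()
```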

Below is an infogram to show that order when upgrading a sharded cluster:

upgrading a sharded cluster

Downgrades

If, for whatever reason, you run into any problem, you can perform a downgrade.

Downgrades basically go in reverse order – this is most applicable to sharded clusters:

  • Stop the balancer
  • Downgrade the mongos's
  • Downgrade the shard nodes – mongod's
  • Downgrade the config servers
  • Restart the balancer

* This all happens in a rolling manner at the replica set level again.

sharded cluster downgrade mongodb

That covers the basic steps.

I will do a more technical series of blogs in the coming months covering running MongoDB in Kubernetes and using the Percona Operator for MongoDB. Those blogs will contain commands, example results, etc., used when managing in that containerized environment. One of the planned blogs will cover upgrades to that environment.

For now, you can see the actual commands used when upgrading a Percona Server for MongoDB replica set or a sharded cluster in Sudhir’s blog. 

Now let’s take a look at some of the potential gotchas.

Gotchas — AKA things to watch out for

Whenever you are changing versions via an upgrade, there are different groups of things to watch out for or be aware of.

Those general buckets are:

  • Compatibility issues – whether programming language or driver related
  • Deprecated functions or tools
  • Behavior changes – e.g., TTL behavior changed significantly between v4.4 and v6.0
  • Bugs – newly discovered, or already reported but not yet fixed

Below are some specific examples for those buckets.

Deprecated – Simple Network Management Protocol (SNMP) v6.0

Starting in MongoDB 6.0, SNMP is deprecated and will be removed in the next release.

Deprecated – old mongo shell "mongo" in v5.0

 – Some legacy methods are unavailable or have been replaced

 – mongosh mostly uses the same syntax as the old mongo shell

 – Beware: check your code

Behavior changes

v4.2 – Faster stepdowns.

The autosplitter process moved from the mongos to the PRIMARY member of each replica set. The primary knows more of the truth about chunk balance, so this led to more chunk splits and many more chunk migrations, which added write pressure.

v4.4 – Adjustments to the management of jumbo chunks – they no longer get stuck forever due to the memory usage limit. Started adding in some of the changes that would be needed for v5.0, when Live Resharding would be hitting.

v5.0 – Deprecated the old shell; major changes to the WiredTiger engine and Core Server to support Live Resharding and Time Series Collections. It took a while for this major release to be fully baked. 😉

v6.0 – Default chunk size increased to 128MB to protect against overly frequent chunk splits and chunk moves by the balancer. Removed the old shell. Changes to initial sync and resync. New operators; new functions.

MongoDB version

Those are just some of the changes.

More about some of the negative impacts.

Things to watch out for – bugs!

Along with all of the major changes that went into v5.0, there were many bugs for quite a while. Making this even more impactful, the v5.0 changes were backported into v4.4.

So early on in the v5.0.x series, the same bugs also broke versions of v4.4 from v4.4.1 up through v4.4.8 – and really up into v4.4.9 and v4.4.10 if you look at the JIRA tickets closely.

Also watch compatibility for the supported life cycle and host OS software – e.g., MongoDB v5.0.x Community Edition drops support for RHEL/CentOS/Oracle Linux 6 on x86_64, among other OS support changes.

Here are some screenshots that I took along the way during the many releases of these versions.

patch releases

Make sure to use the latest minor release version when upgrading MongoDB

For v4.4

These bugs persisted pretty much all the way through v4.4.8 – serious bugs that caused checkpoint issues, possible data inconsistencies, missing documents/data loss, duplicate unique key errors, problems restarting, omitting a page or pages of data, and unclean restarts.

Examples: SERVER-61483, SERVER-61633, SERVER-61945, SERVER-61950 (problems restarting nodes); WT-8104, WT-8204 (race conditions, memory leaks); WT-8395 (upgrade related – data in an "inconsistent state", missing documents); WT-8534, WT-8551.

Performance impact? There are also a few postings online about the newer versions possibly having a negative impact on performance.

There was a recently reported slowdown due to the new mongosh shell and a bug there, but it seems to occur only with certain combinations, such as when using the Ruby driver.

Ok, enough of that. What should we do?

Upgrade of course!

Slow and Steady wins the race!

percona mongodb

percona server for mongodb versions

And what else?

Migrations!

Consider moving over to Percona Server for MongoDB

 

That migration — it’s really just a lift and shift.

Normally, a binary swap between MongoDB Community Edition and Percona Server for MongoDB will take care of it. There are no data type changes needed, and no triggers or procedures to lose.

Remember, tick-tock … Plan your MongoDB upgrade and migrations today.

Thanks all!
