PMM v2.32: Backup Management for MongoDB in GA, New Home Dashboard, and More!

Percona Monitoring and Management v2.32

Percona Monitoring and Management v2.32We are pleased to announce the general availability of the Backup Management for MongoDB and other improvements in Percona Monitoring and Management (PMM) v.2.32 that has been released in November 2022. Details are in this blog and also in the PMM 2.32 Release Notes.

PMM is now on the scene with a new Home Dashboard where you can quickly and easily check your databases’ health at one glance and detect anomalies. While there’s no one-size-fits-all approach, we created and released the new Home Dashboard to make it more user-friendly, even for users new to PMM.

You can get started using PMM in minutes with PMM Demo to check out the latest version of PMM V2.32.

Let’s have a look at the highlights of PMM 2.32:

General availability of Backup Management for MongoDB

The Backup Management for MongoDB in PMM has reached General Availability and is no longer in Technical Preview.

Supported setups

MongoDB Backup Management now supports replica set setups for the following actions:

  • Create logical snapshot backups
  • Create logical Point In Time Recovery (PITR) backups
  • Create physical snapshot backups. This is available only with Percona Server for MongoDB
  • Restore logical snapshot backups.
  • Restore physical backups. This requires additional manual operations after the restore and is only available with Percona Server for MongoDB.
  • Restore logical PITR backups from S3

Current limitations

  • Restoring logical PITR backups only supports S3 storage type
  • Restoring physical backups requires manual post-restore actions
  • Restoring a MongoDB backup on a new cluster is not yet supported
  • Restoring physical backups for containerized MongoDB setups is not supported
  • Local storage for MySQL is not supported
  • Sharded cluster setups are not supported
  • Backups that are stored locally cannot be removed automatically
  • Retention is not supported for PITR artifacts


Quicker and easier database health overview with the new Home Dashboard

As mentioned and promised in previous release notes, we were investigating better approaches, methods, and user-friendly presentation of database health in the Home Dashboard, which is also the entry point to PMM. Finally, we are proud to release this finalized dashboard as the new Home Dashboard. Thank you for your feedback and collaboration during all iterations of the experimental versions.

Monitor hundreds of nodes without any performance issues

If you have hundreds of nodes being monitored with the same PMM instance, the original dashboard may have taken a long time to load, which could have resulted in an unresponsive page, due to the design of the original Home Dashboard with repeating panels for each node. With performance issues in mind, we re-designed the Home Dashboard with new logic to show what is wrong or what is OK with your databases, instead of showing all metrics, for each node.  

PMM Home Dashboard_home_select multiple nodes

PMM Home Dashboard_home_select multiple nodes

Anomaly detection

Many of you probably use dozens of tools for different purposes in your daily work, meetings, and projects. These tools should make your life easier, not more intensive. With monitoring tools, the issue of too many metrics can be daunting— so analyzing data, and detecting anomalies that deviate from a database’s normal behavior should be easy and fast. Functional Anomaly Detection panels, as opposed to separate graphs for each node, are a much better way to visualize and recognize problems with your databases that may require action to be taken.

  • You can click the node name on the panel to see the Node Overview Dashboard of the related node if you see any anomaly. So you can see all metrics of the Node that you need to diagnose the problems.
  • All panels except Advisor Checks can be filtered by node and environment variables
  • Graphs in the Anomaly Detection row show the data for the top 20 nodes. e.g., CPU anomalies in the top 20
Anomaly Detection

PMM Anomaly Detection panels

Command Center panels

The primary motivation behind the new Home Dashboard is simplicity. It was always hard to balance presenting the required metrics for everyone and at the same time, making it clean, functional, and simple while working on the new design. So we decided to use Command Center panels which are collapsed by default. If you see any anomaly in Memory Usage with more than 90%, how do you know when it happened or started? Time-series graphs for the Top 20 in the Command Center panels will help you see when the anomalies occurred: in the last 1 hour or the last week? 

PMM Command Center Panels

PMM Command Center Panels on Home Dashboard

Enhanced main menu

We returned with two improvements we previously promised. These improvements were announced in V2.32 for easier access to dashboards from the Main Menu. After the last changes, with each possible monitored services type represented on the Main Menu as icons, the menu became crowded and extended with all icons representing different service types. In the latest version,  you’ll only see the icons of currently monitored services on the Main Menu. For example, if you’re monitoring MongoDB, you will see the MongoDB Dashboard’s icon on the main menu, as opposed to the previous versions, which showed all database types PMM is capable of monitoring, whether you had them in your system or not. When and if you start to monitor other services like MySQL, they will be automatically added to the Main Menu.

Another improvement on the Main Menu is the visibility of all other dashboards. PMM provides multiple dashboards for different levels of information for each service. You only see some dashboards in the main menu; the rest are available in the folders. Some users can miss these dashboards, which are not presented in the Main Menu. Also, customer dashboards created by different users in your organization can be missed or invisible to you until you see them in the folders by chance. So, we added Other Dashboards links to the sub-menu of each service,  so that you can easily click and see all dashboards in the Service folder.

Quick access to other dashboards from the menu

Quick access to other dashboards from the menu

What’s next?

  • We’ll improve the Vacuum Dashboard with more metrics. If you’d like to enhance it with us, you can share your feedback in the comments.
  • A health dashboard for MySQL is on the way. Please share your suggestions in the comments or forum if you’d like to be part of the group shaping PMM. 
  • We have started to work on two new and significant projects: High Availability in PMM and advanced Role-Based Access Control (RBAC). We’d love to hear your needs, use cases, and suggestions. You can quickly book a short call with the product team to collaborate with us. 
  • For Backup Management, we are planning to continue to iterate on the current limitations listed above and make the restore processes as seamless as possible for all database types.

Install PMM 2.32 now or upgrade your installation to V2.32 by checking our documentation for more information about upgrading.

Learn more about Percona Monitoring and Management 3.32

Thanks to Community and Perconians

We love our community and team in Percona, who shape the future of PMM, together and help us with all those changes.

You can also join us on our community forums to request new features, share your feedback, and ask for support.

Thank you for your collaboration on the new Home Dashboards:

Cihan Tunal?   @SmartMessage 

Tyson McPherson @Parts Authority

Paul Migdalen @IntelyCare


Percona Operator for MongoDB Goes Cluster-Wide, Adds AKS Support, and More!

Percona Operator for MongoDB Goes Cluster-Wide

Percona Operator for MongoDB Goes Cluster-WidePercona Operator for MongoDB version 1.13 was recently released and it comes with various ravishing features. In this blog post, we are going to look under the hood and see what are the practical use cases for these improvements.

Cluster-wide deployment

There are two modes that Percona Operators support:

  1. Namespace scope
  2. Cluster-wide

Namespace scope limits the Operator operations to a single namespace, whereas in cluster-wide mode Operator can deploy and manage databases in multiple namespaces of a Kubernetes cluster. Our Operators for PostgreSQL and MySQL already support cluster-wide mode. With the 1.13 release, we are closing the gap for Percona Operator for MongoDB. 

Multi-tenant clusters are the most common call for cluster-wide mode. You as a cluster administrator manage a single deployment of the Operator and equip your teams with the way to deploy and manage MongoDB in their isolated namespaces. Read more about multi-tenancy and best practices in our Multi-Tenant Kubernetes Cluster with Percona Operators.

How does it work?

To deploy in cluster-wide mode we introduce


manifests. The quickest way would be to use the


which deploys the following:

  • Custom Resource Definition
  • Service Account and Cluster Role that allows Operator to create and manage Kubernetes objects in various namespaces
  • Operator Deployment itself

By default, Operator monitors all the namespaces in the cluster. The


environment variable in the Operator Deployment limits the scope. It can be a comma-separated list that instructs the Operator on which namespaces to monitor for Custom Resource objects:

            - name: WATCH_NAMESPACE
              value: "app-dev1,app-dev2”

This is useful if you want to limit the blast radius, but run multiple Operators monitoring various namespaces. For example, you can run an Operator per environment – development, staging, production.

Deploy the bundle:

kubectl apply -f deploy/cw-bundle.yaml -n psmdb-operator

Now you can start deploying databases in the namespaces you need:

kubectl apply -f deploy/cr.yaml -n app-dev1
kubectl apply -f deploy/cr.yaml -n app-dev2

See the demo below where I deploy two clusters in different namespaces with a single Operator.

Hashicorp Vault integration for encryption-at-rest

We take security seriously at Percona. Data-at-rest encryption prevents data visibility in the event of unauthorized access or theft. It is supported by all our Operators. With this release, we introduce the support for integration with Hashicorp Vault, where the user can keep the keys in the Vault and instruct Percona Operator for MongoDB to use those. This feature is in a technical preview stage.

There is a good blog post that describes how Percona Server for MongoDB works with the Vault. In Operator, we implement the same functionality and follow the structure of the same parameters.

How does it work?

We are going to assume that you already have Hashicorp Vault installed – you either use Cloud Platform or a self-hosted version. We will focus on the configuration of the Operator.

To instruct the Operator to use Vault you need to specify two things in the Custom Resource:

  1. secrets.vault

    – Secret resource with a Vault token in it

  2. Custom configuration for mongod for a replica set and config servers


Example of cr.yaml:

    vault: my-vault-secret

The secret object itself should contain the token that has access to create, read, update and delete the secrets in the desired path in the Vault. Please refer to the Vault documentation to understand policies better.

Example of a Secret:

apiVersion: v1
  token: aHZzLnhrVVRPTEVOM2dLQmZuV0I5WTF0RmtOaA==
kind: Secret
  name: my-vault-secret
  namespace: default
type: Opaque

Custom configuration

The operator allows users to fine-tune mongod and mongos configurations. For encryption to work, you must specify vault configuration for replica sets – both data and config servers.

Example of cr.yaml:

  - name: rs0
    size: 3
    configuration: |
        enableEncryption: true
          serverName: vault
          port: 8200
          tokenFile: /etc/mongodb-vault/token
          secret: secret/data/dc/cluster1/rs0
          disableTLSForTesting: true
    enabled: true

      size: 3
      configuration: |
          enableEncryption: true
            serverName: vault
            port: 8200
            tokenFile: /etc/mongodb-vault/token
            secret: secret/data/dc/cluster1/cfg
            disableTLSForTesting: true

What to note here:

  • tokenFile: /etc/mongodb-vault/token
    • It is where the Operator is going to mount the Secret with the Vault token you created before. This is the default path and in most cases should not be changed.
  • secret: secret/data/dc/cluster1/rs0
    • It is the path where keys are going to be stored in the Vault. 

You can read more about Percona Server for MongoDB and Hashicorp Vault parameters in our documentation.

Once you are done with the configuration, apply the Custom Resource as usual. If everything is set up correctly you will see the following message in mongod log:

$ kubectl logs cluster1-rs0-0 -c mongod 
{"t":{"$date":"2022-09-13T19:40:20.342+00:00"},"s":"I",  "c":"STORAGE",  "id":29039,   "ctx":"initandlisten","msg":"Encryption keys DB is initialized successfully"}

Azure Kubernetes Service support

All Percona Operators are going through rigorous QA testing throughout the development lifecycle. Hours of QA engineers’ work are put into automating the test suites for specific Kubernetes flavors. 

AKS, or Azure Kubernetes Service, is the second most popular managed Kubernetes offering according to Flexera 2022 State of the Cloud report. After adding the support for Azure Blob Storage in version 1.11.0, it was just a matter of time before we started supporting AKS in full.

Starting with the 1.13.0 release, Percona Operator for MongoDB supports AKS in Technical Preview. You can see more details in our documentation.

The installation process of the Operator is no different from any other Kubernetes flavor. You can use a helm chart or apply YAML manifests with kubectl. I ran the cluster-wide demo above with AKS.

Admin user

This is a minor change, but frankly, it is my favorite as it impacts the user experience in a great way. Our Operator is coming with systems users that are used to manage and track the health of the database. Also, there are userAdmin and clusterAdmin for users to control the database, create users, and so on.

The problem here is neither userAdmin nor clusterAdmin allows you to start playing with the database right away. At first, you need to create the user that has permission to create databases and collections, and only after that, you can start using your fresh MongoDB cluster. 

With the release 1.13, we say no more to this. We add the databaseAdmin user that acts like a database admin, enabling users to start innovating right away.

databaseAdmin credentials are added to the same Secret object where other users are:

$ kubectl get secret my-cluster-name-secrets -o yaml | grep MONGODB_DATABASE_ADMIN_

Get your password like this:

$ kubectl get secret my-cluster-name-secrets -o jsonpath='{.data.MONGODB_DATABASE_ADMIN_PASSWORD}' | base64 --decode

Connect to the database as usual and start innovating:

mongo "mongodb://databaseAdmin:FoMRrwP2Ck0yqoVYO@"

What’s next

Percona is committed to running databases anywhere. Kubernetes adoption grows year over year, turning from a container orchestrator into a cloud operating system. Our Operators are supporting the community and our customers’ journey in infrastructure transformations by automating the deployment and management of the databases in Kubernetes.

The following links are going to help you to get familiar with Percona Operator for MongoDB:

  1. Quickstart guides
  2. Free Kubernetes cluster for easier and quicker testing
  3. Percona Operator for MongoDB community forum if you have general questions or need assistance

Read about Percona Monitoring and Management DBaaS, an open source solution that simplifies the deployment and management of MongoDB even more.


PostgreSQL Minor Versions Released May 9, 2019

PostgreSQL Logo

Usually, the PostgreSQL Community releases minor patches on the Thursday of the second week of the second month of each quarter. This can vary depending on the nature of the fixes. For example, if the fixes include critical fixes as a postgres community response to security vulnerabilities or bugs that might affect data integrity. The previous minor release happened on February 14, 2019. In this case, the fysnc failure issue was corrected, along with some other enhancements and fixes.

The latest minor release, published on May 9, 2019, has some security fixes and bug fixes for these PostgreSQL Major versions.

PostgreSQL 11 (11.3)
PostgreSQL 10 (10.8)
PostgreSQL 9.6 (9.6.13)
PostgreSQL 9.5 (9.5.17)
PostgreSQL 9.4 (9.4.22)

Let’s take a look at some of the security fixes in this release.

Security Fixes

CVE-2019-10130 (Applicable to PostgreSQL 9.5, 9.6, 10 and 11 versions)

When you


a table in PostgreSQL, statistics of all the database objects are stored in


. The query planner uses this statistical data is and it may contain some sensitive data, for example, min and max values of a column. Some of the planner’s selectivity estimators apply user-defined operators to values found in


. There was a security fix :


in the past, that restricted a leaky user-defined operator from disclosing some of the entries of a data column.

Starting from PostgreSQL 9.5, tables in PostgreSQL not only have SQL-standard privilege system but also row security policies. To keep it short and simple, you can restrict a user so that they can only access specific rows of a table. We call this RLS (Row-Level Security).


is about restricting a user who has SQL permissions to read a column but is forbidden to read some rows due to RLS policy from discovering the restricted rows through a leaky operator. Through this patch, leaky operators to statistical data are only allowed when there is no relevant


policy. We shall see more about


in our future blog posts.

CVE-2019-10129 (Applicable to PostgreSQL 11 only)

This fixes the situation where a user could access arbitrary back end memory with correctly crafted DDLs. This is only applicable to PostgreSQL 11 version and no other major release has been impacted.

Fixes roundup

There have been a good number of fixes around partitioning in both PostgreSQL 10 and 11 versions that included a fix to behaviour of an UPDATE or an INSERT on certain scenarios, failure while ALTER INDEX .. ATTACH PARTITION, tuple routing in multi-level partitioned tables that have dropped attributes. etc. .

We should also mention a couple of interesting bugs addressed in this new release.

Bug # 15631 is about the catalog corruption when a temporary table on commit drop is created with an identity column. Due to this issue, PostgreSQL 11.1 would not allow the creation of temporary tables, and you might have observed this error message prior to this release:

# create temporary table foo ( bar int generated by default as identity ) on
# commit drop;
# \q
# psql postgres postgres
# create temporary table a (b varchar);
ERROR: could not open relation with OID 16388
LINE 1: create temporary table a (b varchar);

Another bug that got fixed is Bug # 15734 which is about a


process crashing when executing


using replication protocol.

Some more fixes include:

1. Fix


to allow zero-column views.
2. Add missing support for the


  .. statement.
3. Fix incompatibility of GIN-index WAL records that were introduced in 11.2, 10.7, 9.6.12, 9.5.16, and 9.4.21 that affected replica servers running these versions reading in changes to GIN indexes from primary servers of older versions.
4. Make


verify that the data directory it’s pointed at is of the right PostgreSQL version.
5. Fixes for


where an


could lead to incorrect results or a crash.
6. Several memory leak fixes as well as fixes to management of dynamic shared memory.

The postgres community have fixed more than just the bug fixes and enhancements we’ve highlighted. We would recommend you to go through the release notes, which you can access by clicking the appropriate release in the beginning of this blog post. Let us know if you have any specific questions you’d like us to address.

Applying updates

We always recommend that you keep your PostgreSQL databases updated to the latest minor versions. Be aware that applying a minor release might need a restart after updating the new binaries.

Here is the sequence of steps you should follow to upgrade to the latest minor versions after thorough testing :

1. Shutdown the PostgreSQL database server
2. Install the updated binaries
3. Restart your PostgreSQL database server

Most of the time, you can choose to update the minor versions in a rolling fashion in a master-slave (replication) setup, because it avoids downtime for both reads and writes simultaneously. PostgreSQL logoFor a rolling style update, you could perform the update on one server after another. However, the best method that we’d almost always recommend is – shutdown, update and restart all instances at once.

If you are currently running your databases on PostgreSQL 9.3.x or earlier, we recommend that you to prepare a plan to upgrade your PostgreSQL databases to a supported version ASAP. We’ve previously published some blog posts about the various options for upgrading your PostgreSQL databases to a supported major version, and you can read our post archive here. Please subscribe to our blog posts so that you can see more interesting topics around PostgreSQL.

And don’t forget that at Percona Live in Austin, May 28-30 2019, we’ll have two days of PostgreSQL content in a postgres dedicated track. Please see all our PostgreSQL talks here.


PostgreSQL 11! Our First Take On The New Release


PostgreSQL logoYou may be aware that the new major version of PostgreSQL has been released today. PostgreSQL 11 is going to be one of the most vibrant releases in recent times. It incorporates many features found in proprietary, industry-leading database systems, further qualifying PostgreSQL as a strong open source alternative.

Without further ado, let’s have a look at some killer features in this new release.

Just In Time (JIT) Compilation of SQL Statements

This is a cutting edge feature in PostgreSQL: SQL statements can get compiled into native code for execution. It’s well know how much Google V8 JIT revolutionized the JavaScript language. JIT in PostgreSQL 11 supports accelerating two important factors—expression evaluation and tuple deforming during query execution—and helps CPU bound queries perform faster. Hopefully this is a new era in the SQL world.

Parallel B-tree Index build

This could be the most sought after feature by DBAs, especially those migrating large databases from other database systems to PostgreSQL. Gone are the days when a lot of time was spent on building indexes during data migration. Index maintenance (rebuild) for very large tables can now make an effective use of multiple cores in the server by parallelizing the operation, taking considerably less time to complete.

Lightweight and super fast ALTER TABLE for NOT NULL column with DEFAULT values

In the process of continuous enhancement and adding new features, we see several application developments that involve schema changes to the database. Most such changes include adding new columns to a table. This can be a nightmare if a new column needs to be added to a large table with a default value and a NOT NULL constraint. This is because an ALTER statement can hold a write lock on the table for a long period. It can also involve excessive IO due to table rewrite. PostgreSQL 11 addresses this issue by ensuring that the column addition with a default value and a NOT NULL constraint avoids a table rewrite.  

Stored procedures with transaction control

PostgreSQL 11 includes stored procedures. What really existed in PostgreSQL so far was functions. The lack of native stored procedures in PostgreSQL made the database code for migrations from other databases complex. They often required extensive manual work from experts. Since stored procedures might include transaction blocks with BEGIN, COMMIT, and ROLLBACK, it was necessary to apply workarounds to meet this requirement in past PostgreSQL versions, but not anymore.

Load cache upon crash or restart – pg_prewarm

Memory is becoming cheaper and faster, year over year. The latest generation of servers is commonly available with several hundreds of GBs of RAM, making it easy to employ large caches (shared_buffers) in PostgreSQL. Until today, you might have used pg_prewarm to warm up the cache manually (or automatically at server start). PostgreSQL 11 now includes a background worker thread that will take care of that for you, recording the contents of the shared_buffers—in fact, the “address” of those—to a file autoprewarm.blocks. Upon crash recovery or normal server restart, two such threads work in the background, reloading those blocks into the cache.

Hash Partition

Until PostgreSQL 9.6 we used table inheritance for partitioning a table. PostgreSQL 10 came up with declarative partitioning, using two of the three most common partitioning methods: list and range. And now, PostgreSQL 11 has introduced the missing piece: hash partitioning.

Advanced partitioning features that were always on demand

There were a lot of new features committed to the partitioning space in PostgreSQL 11. It now allows us to attach an index to a given partition even though it won’t behave as a global index.

Also, row updates now automatically move rows to new partitions (if necessary) based on the updated fields. During query processing, the optimizer may now simply skip “unwanted” partitions from the execution plan, which greatly simplifies the work to be done. Previously, it had to convey all the partitions, even if the target data was to be found in just a subset of them.

We will discuss these new features in detail in a future blog post.

Tables can have default partitions

Until PostgreSQL 10, if a table did not have a default partition, PostgreSQL had to reject a row when the row being inserted did not satisfy any of the existing partitions definitions. That changes with the introduction of default partitions in PostgreSQL 11.

Parallel hash join

Most of the SQLs with equi-joins do hash joins in the background. There is a great opportunity to speed up performance if we can leverage the power of hardware by spinning off multiple parallel workers. PostgreSQL 11 now allows hash joins to be performed in parallel.

Write-Ahead Log (WAL) improvements

Historically, PostgreSQL had a default WAL segment of 16MB and we had to recompile PostgreSQL in order to operate with WAL segments of a different size. Now it is possible to change the WAL size during the initialization of the data directory (initdb) or while resetting WALs using pg_resetwal by means of the parameter –wal-segsize = <wal_segment_size>.

Add extensions to convert JSONB data to/from PL/Perl and PL/Python

Python as a programming language continues to gain popularity. It is always among the top 5 in the TIOBE Index. One of the greatest features of PostgreSQL is that you can write stored procedures and functions in most of the popular programming languages, including Python (with PL/Python). Now it is also possible to transform JSONB (Binary JSON) data type to/from PL/Python. This feature was later made available for PL/Perl too. It can be a great add-on for organizations using PostgreSQL as a document store.

Command line improvements in psql: autocomplete and quit/exit

psql has always been friendly to first time PostgreSQL users through the various options like autocomplete and shortcuts. There’s an exception though: users may find it difficult to understand how to effectively quit from psql, and often attempt to use non-existing quit and exit commands. Eventually, they find \q or ctrl + D, but not without frustrating themselves first. Fortunately, that shouldn’t happen anymore: among many recent fixes and improvements to psql is the addition of the intuitive quit and exit commands to safely leave this popular command line client.

Improved statistics

PostgreSQL 10 introduced the new statement CREATE STATISTICS to collect more statistics information about columns from tables. This has been further improved in PostgreSQL 11. Previously, while collecting optimizer statistics, most-common-values (MCV) were chosen based on their significance compared to all columns. But now, MCVs are chosen based on their significance compared to non-MCV values.

The new features of PostgreSQL 11 are not limited to the ones mentioned above. We will be looking further into most of them in future blog posts. We invite you to leave a comment below and let us know if there is any particular feature you would be interested in knowing more about.


Webinar Tues 6/26: MariaDB Server 10.3

MariaDB 10.3 Webinar

MariaDB 10.3 WebinarPlease join Percona’s Chief Evangelist, Colin Charles on Tuesday, June 26th, 2018, as he presents MariaDB Server 10.3 at 7:00 AM PDT (UTC-7) / 10:00 AM EDT (UTC-4).


MariaDB Server 10.3 is out. It has some interesting features around system versioned tables, Oracle compatibility, column compression, an integrated SPIDER engine, as well as MyRocks. Learn about what’s new, how you can use it, and how it is different from MySQL.

Register Now

Colin Charles

Chief Evangelist

Colin Charles is the Chief Evangelist at Percona. He was previously on the founding team of MariaDB Server in 2009, and had worked at MySQL since 2005, and been a MySQL user since 2000. Before joining MySQL, he worked actively on the Fedora and projects. He’s well known within open source communities in APAC, and has spoken at many conferences. Experienced technologist, well known in the open source world for work that spans nearly two decades within the community. Pays attention to emerging technologies from an integration standpoint. Prolific speaker at many industry-wide conferences delivering talks and tutorials with ease. Interests: application development, systems administration, database development, migration, Web-based technologies. Considered expert in Linux and Mac OS X usage/administration/roll-out’s. Specialties: MariaDB, MySQL, Linux, Open Source, Community, speaking & writing to technical audiences as well as business stakeholders.

The post Webinar Tues 6/26: MariaDB Server 10.3 appeared first on Percona Database Performance Blog.


MongoDB 3.6 Community Is Here!

MongoDB 3.6 Community

MongoDB 3.6 CommunityBy now you surely know MongoDB 3.6 Community became generally available on Dec 5, 2017. Of course, this is great news: it has some big ticket items that we are all excited about! But I want to also talk about my general thoughts on this release.

It is always a good idea for your internal teams to study and consider new versions. This is crucial for understanding if and when you should start using it. After deciding to use it, there is the question of if you want your developers using the new features (or are they not suitably implemented yet to be used)?

So what is in MongoDB 3.6 Community? Check it out:

  • Sessions
  • Change Streams
  • Retryable Writes
  • Security Improvement
  • Major love for Arrays in Aggregation
  • A better balancer
  • JSON Schema Validation
  • Better Time management
  • Compass is Community
  • Major WiredTiger internal overhaul

As you can see, this is an extensive list. But there are 1400+ implemented Jira tickets just on the server itself (not even in the WiredTigers project).

To that end, I thought we should break my review into a few areas. We will have blog posts out soon covering these areas in more depth. This blog is more about my thoughts on the topics above.

Expected blogs (we will link to them as they become available):

  • Change Streams –  Nov 11 2017
  • Sessions
  • Transactions and new functions
  • Aggregation improvements
  • Security Controls to use ASAP
  • Other changes from diagnostics to Validation

Today let’s quickly recap the above areas.


We will have a blog on this (it has some history). This move has been long-awaited by anyone using MongoDB before 2.4. There were connection changes in that release that made it complicated for load balancers due to the inability to “re-attach” to the same session.  If you were not careful in 2.4+, you could easily use a load-balancer and have very odd behavior: from broken to invisibly incomplete getMores (big queries).

Sessions aim is to change this. Now, the client drivers know about the internal session to the database used for reads and writes. Better yet, MongoDB tracks these sessions so even if an election occurs, when your drive fails over so will the session. For anyone who’s applications handled fail-overs badly, this is an amazing improvement. Some of the other new features that make 3.6 a solid release require this new behavior.

Does this mean this solution is perfect and works everywhere? No! It, like newer features we have seen, leave MMAPv1 out in the cold due to its inability without major re-work to support logic that is so native to Wired Tiger. Talking with engine engineers, it’s clear that some of the logic behind the underlying database snapshots and rollbacks added here can cause issues (which we will talk about more in the transactions blog).

Change streams

As one of the most talked about (but most specialized) features, I can see its appeal in a very specific use case (but it is rather limited). However, I am getting ahead of myself! Let’s talk about what it is first and where it came from.

Before this feature, people streamed data out of MySQL and MongoDB into Elastic and Hadoop stacks. I mention MySQL, as this was the primer for the initial method MongoDB used. The tools would read the MySQL binlogs – typically saved off somewhere – and then apply those operations to some other technology. When they went to do the same thing in MongoDB, there was a big issue: if writes are not sent to the majority of the nodes, it can cause a rollback. In fact, such rollbacks were not uncommon. The default was w:1 (meaning the primary only needed to have the write), which resulted in data existing in Elastic that had been removed from MongoDB. Hopefully, everyone can see the issue here, and why a better solution was needed than just reading the oplog.



, which in the shell has a helper called


 . This is a method that uses a multi-node consistent read to ensure the data is on the majority of nodes before the command returns the data in a tailable cursor. For this use case, this is amazing as it allows the data replicating tool much more assurance that data is not going to vanish. 


 is not without limits: if we have 10k collections and we want to watch them all, this is 10k separate operations and cursors. This puts a good deal of strain on the systems, so MongoDB Inc. suggests you do this on no more than 1000 namespaces at a time to prevent overhead issues.

Sadly it is not currently possible to take a mongod-wide snapshot to support this under the hood, as this is done on each namespace to implement the snapshot directly inside WiredTiger’s engine. So for anyone with a very large collection count, this will not be your silver bullet yet. This also means streams between collections and databases are not guaranteed to be in sync. This could be an issue for someone even with a smaller number of namespaces that expect this. Please don’t get me wrong: it’s a step in the correct direction, but it falls short.

I had very high hopes for this to simplify backups in MongoDB. Percona Labs’s GitHub has a tool called MongoDB-Consistent-Backup, which tails multiple oplogs to get a consistent sharded backup without the need to pay for MongoDB’s backup service or use the complicated design that is Ops Manager (when you host it yourself). Due to the inability to do a system-wide change stream, this type of tool still needs to use the oplog. If you are not using


  it could trigger a failure if you have an election or if a rollback occurs. Still, I have hopes this will be something that can be considered in the future to make things better for everyone.

Retryable writes

Unlike change streams, this feature is much more helpful to the general MongoDB audience. I am very excited for it. If you have not watched this video, please do right now! Samantha does a good job explaining the issue and solution in detail. However, for now just know there has been a problem that where a write that has an issue (network, app shutdown, DB shutdown, election), you had no way to know if the write failed or not. This unknown situation was terrible for a developer, and they would not know if they needed to run the command again or not. This is especially true if you have an ordering system and you’re trying to remove stock from your inventory system. Sessions, as discussed before, allowed the client to reconnect to a broken connection and keep getting results to know what happened or didn’t. To me, this is the second best feature of 3.6. Only Security is more important to me personally.

Security improvement

In speaking of security, there is one change that the security community wanted (which I don’t think is that big of a deal). For years now, the MongoDB packaging for all OSs (and even the Percona Server for MongoDB packing) by default would limit the bindIP setting to localhost. This was to prevent unintended situations where you had a database open to the public. With 3.6 now the binaries also default to this. So, yes, it will help some. But I would (and have) argued that when you install a database from binaries or source, you are taking more ownership of its setup compared to using Docker, Yum or Apt.

The other major move forward, however, is something I have been waiting for since 2010. Seriously, I am not joking! It offers the ability to limit users to specific CIDR or IP address ranges. Please note MySQL has had this since at least 1998. I can’t recall if it’s just always been there, so let’s say two decades.

This is also known as “authenticationRestriction” and it’s an array you can put into the user document when creating a document. The manual describes it as:

The authentication restrictions the server enforces on the created user. Specifies a list of IP addresses and CIDR ranges from which the user is allowed to connect to the server or from which the server can accept users.

I can not overstate how huge this is. MongoDB Inc. did a fantastic job on it. Not only does it support the classic client address matching, it supports an array of these with matching on the server IP/host also. This means supporting multiple IP segments with a single user is very easy. By extension, I could see a future where you could even limit some actions by range – allowing dev/load test to drop collections, but production apps would not be allowed to. While they should have separate users, I regularly see clients who have one password everywhere. That extension would save them from unplanned outages due to human errors of connecting to the wrong thing.

We will have a whole blog talking about these changes, their importance and using them in practice. Yes, security is that important!

Major love for array and more in Aggregation

This one is a bit easier to summarize. Arrays and dates get some much-needed aggregation love in particular. I could list all the new operators here, but I feel it’s better served in a follow-up blog where we talk about each operator and how to get the most of it. I will say my favorite new option is the $hint. Finally, I can try to control the work plan a bit if it’s making bad decisions, which sometimes happens in any technology.

A better balancer

Like many other areas, there was a good deal that went into balancer improvements. However, there are a few key things that continue the work of 3.4’s parallel balancing improvements.

Some of it makes a good deal of sense for anyone in a support role, such as FTDC now also existing in mongos’. If you do not know what this is, basically MongoDB collects some core metrics and state data and puts it into binary files in dbPath for engineers at companies like Percona and MongoDB Inc. to review. That is not to say you can’t use this data also. However, think of it as a package of performance information if a bug happens. Other diagnostic type improvements include moveChunk, which provides data about what happened when it runs in its output. Previously you could get this data from the config.changeLog or config.actionLog collections in the config servers. Obviously, more and more people are learning MongoDB’s internals and this should be made more available to them.

Having talked about diagnostic items, let’s move more into the operations wheelhouse. The single biggest frustration about sharding and replica-sets is the sensitivity to time variations that cause odd issues, even when using ntpd. To this point, as of 3.6 there is now a logical clock in MongoDB. For the geekier in the crowd, this was implemented using a Lamport Clock (great video of them). For the less geeky, think of it as a cluster-wide clock preventing some of the typical issues related to varying system times. In fact, if you look closer at the oplog record format in 3.6 there is a new wt field for tracking this. Having done that, the team at MongoDB Inc. considered what other things were an issue. At times like elections of the config servers, meta refreshes did not try enough times and could cause a mongos to stop working or fail. Those days are gone! Now it will check three times as much, for a total of ten attempts before giving up. This makes the system much more resilient.

A final change that is still somewhat invisible to the user but helps make dropping collections more stable, is that they remove the issue MongoDB had about dropping and recreating sharded collections. Your namespaces look as they always have. Behind the scenes, however, they have UUID’s attached to them so that if drops and gets recreated, it would be a different UUID. This allows for less-blocking drops. Additionally, it prevents confusion in a distributed system if we are talking about the current or old collection.

JSON Schema validation

Some non-developers might not know much about something called JSON Schema. It allows you to set rules on schema design more efficiently and strictly than MongoDB’s internal rules. With 3.6, you can use this directly. Read more about it here. Some key points:

  • Describes your existing data format
  • Clear, human- and machine-readable documentation
  • Complete structural validation, useful for:
    • Automated testing
    • Validating client-submitted data
You can even make it reject when certain fields are missing. As for MySQL DBAs, you might ask why this is a big deal? You could always have a DBA define a schema in an RDBMS, and the point of MongoDB was to be flexible. That’s a fair and correct view. However, the big point of using it is you could apply this in production, not in development. This gives developers the freedom to move quickly, but provides operational teams with control methods to understand when mistakes or bugs are present before production is adversely affected. Taking a step back, its all about bridging the control and freedom ravines to ensure both camps are happy and able to do their jobs efficiently.

Compass is Community

If you have never used Compass, you might think this isn’t that great. You could use things like RoboMongo and such. You absolutely could, but Compass can do visualization as well as CRUD operations. It’s also a very fluid experience that everyone should know is available for use. This is especially true for QA teams who want to review how often some types of data are present, or a business analyst who needs to understand in two seconds what your demographics are.

Major WiredTiger internal overhaul

There is so much here that I recommend any engineer-minded person take a look at this deck, presented by one of the great minds behind WiredTiger. It does a fantastic job explaining all the reasons behind some of the 3.2 and 3.4 scaling issues MongoDB had on WiredTiger. Of particular interest is why it tended to have worse and worse performance as you added more collections and indexes. It then goes into how they fixed those issues:

  • Some key points on what they did
  • Made Evictions Smarts, as they are not collection uniform
  • Improved assumption around the handle cache
  • Made Checkpoints better in all ways
  • Aggressively cleaned up old handles

I hope this provides a peek into the good, bad, and ugly in MongoDB 3.6 Community! Please check back as we publish more in-depth blogs on how these features work in practice, and how to best use them.

Powered by WordPress | Theme: Aeros 2.0 by