Aug
16
2019
--

The Long-Awaited MongoDB 4.2 GA Has Landed

At Percona we’ve been tracking the upstream releases from MongoDB Inc. closely and, like many of you, are happy that MongoDB is finally available in its General Availability 4.2.0 version.

It is time for MongoDB Community users to start using 4.2 in their testing, QA, and pre-production staging environments. As with many products, the first few minor versions of a new major release are the ones that receive the quickest and most important fixes. Historically this has also been true for MongoDB. In short, I wouldn’t start using 4.2.0 in production today, but by the time you’ve finished trialing 4.2 in your pre-production environments, 4.2.0 will already have been superseded by newer point releases.

For users of Percona Server for MongoDB, which includes open-source replacements for MongoDB Enterprise add-on modules (feature comparison here), you’ll be pleased to know that we’ve already merged the 4.2.0 code, and testing has begun. We expect to release Percona Server for MongoDB 4.2 in the next few weeks.

What’s new in MongoDB 4.2?

We covered the keynote new features of 4.2 in our recent blog post Percona’s View on MongoDB’s 4.2 Release – The Good, the Bad, and the Ugly… This looked at:

  • Distributed transactions
  • Server-side updates (both with the classic CRUD update op and the aggregation pipeline $merge stage; a quick sketch follows this list)
  • Field-level encryption (MongoDB Enterprise only so far)
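
To make the server-side updates item above a little more concrete, here is a minimal sketch in the mongo shell. The collection and field names are invented for illustration; the two capabilities shown (an update that takes an aggregation pipeline, and the $merge stage) are the 4.2 features referred to above.

// Update using an aggregation pipeline (new in 4.2): compute fields
// server-side from other fields of the same document.
db.orders.updateMany(
  { status: "open" },
  [ { $set: { total: { $add: [ "$subtotal", "$tax" ] }, lastModified: "$$NOW" } } ]
)

// $merge (also new in 4.2) writes aggregation results back into a collection.
db.orders.aggregate([
  { $group: { _id: "$customerId", total: { $sum: "$subtotal" } } },
  { $merge: { into: "customerTotals", whenMatched: "replace", whenNotMatched: "insert" } }
])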

I also discussed some internal changes of interest in Diving into the MongoDB 4.2 Release Small Print. This covered:

  • MMAPv1 storage engine removed
  • queryHash added in diagnostic fields
  • Auto-splitting thread removed from mongos nodes
  • Modularization of config files through --configExpand
  • Improved WiredTiger data file repair
  • Listing open cursors

But of course, there is still more to discuss given the size and long ‘cooking time’ of this version! Some additional thoughts on the new and enhanced features are below.

Transactions

I think many organizations decided 4.0 was not the right time to start using transactions, even though they were supported for non-sharded replica sets. A company usually has multiple databases; some are sharded and some are not. It was easier to wait until the feature was ready for both. But now is the time – if you need them. Do not use transactions unless you have a compelling reason (see the performance sub-section below).

The syntax for using transactions hasn’t changed from 4.0. You must create a session, and then with that session as scope run a startTransaction command; the normal CRUD commands within the transaction; then a commitTransaction (or abortTransaction).
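
For readers who haven’t used them yet, here is a minimal sketch of that flow in the mongo shell; the database, collection, and field names are invented for this example.

// Transactions are scoped to a session.
session = db.getMongo().startSession();
session.startTransaction({ readConcern: { level: "snapshot" }, writeConcern: { w: "majority" } });
try {
    var orders = session.getDatabase("shop").orders;
    var stock  = session.getDatabase("shop").stock;
    orders.insertOne({ _id: 1001, item: "abc", qty: 2 });   // normal CRUD ops, run in the session
    stock.updateOne({ item: "abc" }, { $inc: { qty: -2 } });
    session.commitTransaction();
} catch (e) {
    session.abortTransaction();
    throw e;
} finally {
    session.endSession();
}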

Although the syntax for using transactions hasn’t changed, the client driver specs did change a little. To use transactions in 4.2 you must upgrade your drivers to versions compatible with the 4.2 server. These are the minimum 4.2-compatible versions of the twelve officially-supported drivers:

  • C 1.15.0
  • C# 2.9.0
  • C++ 3.5
  • Go 1.1
  • Java 3.11.0
  • Node 3.3.0
  • Perl 2.2.0
  • PHP Extension 1.6 (or 1.5 for the library)
  • Python 3.9.0
  • Motor (i.e. async Python): 4.2 compatibility not yet available; the last supported server version is 3.6
  • Ruby 2.10.0
  • Scala 2.7.0

Transactions Performance

The fundamentals in this next section are true for any transaction-supporting database, but I’ll use examples in the context of MongoDB to make it clear it applies here too.

Using transactions is going to hurt your performance if you use them too freely. Reads and writes as you’ve been doing them before transactions were supported should still make up the great majority of the operations being run in your MongoDB cluster or non-sharded replica set.

Let’s say a read in your MongoDB server takes a double-digit number of microseconds on average, and a write takes about ten times as long. The chance of there being conflicts for the same document as those single ops take place is limited to those small windows of time. It can happen, but you have to be really pushing it.

If a conflict of a normal, single-doc write happens, it will be retried automatically by the storage engine until it gets its turn and creates the new document version. The update will be completed, following data consistency rules, so it might not seem so bad. But the processing cost grows and the latency grows.

Transactions stretch out the time window of operations as logical units. Conflicts in transactions vary, but a transaction may have to walk through old, rather than the latest, versions of a doc. A transaction will abort if the docs in a write (or reads in the same transaction preceding the write) are found to have a conflicting update that arrived during its lifetime. So the internal ‘housekeeping’ work for the storage engine increases with the number of documents being affected by the transaction, and the chance of conflicts increases roughly exponentially with the length of the transaction (assuming a random spread of writes, at least).

Both of these things can make transactions much slower than normal operations unless you: 

  1. design transactions to use as few documents as possible 
  2. never have a slow op such as an unindexed query within a transaction 

This is in addition to only having transactions as a minor part of the database traffic.

Also, don’t forget a point promoted well in MongoDB’s early days: if you keep related fields in the same collection document (i.e. document-database style rather than relational-database table style) you avoid the need for transactions in the first place.

Stability

New client locking/blocking throttle mechanism on Primaries

Marketing name: “Flow control”

I am going to cover this feature in detail in an upcoming blog post. But, it also belongs on any review of 4.2 major new features, so here is a quick summary.

Version 4.2 primary nodes will keep clients waiting, as long as they need to, to ensure that writes do not get more than 10 seconds ahead of the secondaries.

Benefit: high replication lag will be a much less likely event in the future. (Secondary reads still risk being up to 10 seconds stale at any normal time, though.)

Major benefit: I believe this limits exponentially-increasing storage engine costs that must be paid until writes are confirmed written to secondaries.

Nuisance: you’ll have to disable this feature (it’s on by default) if you want the best possible latencies on the primary during sudden heavy load events.
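
As a quick, hedged sketch, this is how you can check and disable it at runtime from the mongo shell; the enableFlowControl parameter can also be set in the configuration file at startup.

// Check whether flow control is enabled (it is, by default, in 4.2):
db.adminCommand({ getParameter: 1, enableFlowControl: 1 })

// Disable it if you prefer the pre-4.2 behavior under sudden heavy load:
db.adminCommand({ setParameter: 1, enableFlowControl: false })

// serverStatus() gains a flowControl section with throttle and lag metrics:
db.serverStatus().flowControl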

I like this feature. But please call it what it is: a Primary Throttle.

Retryable writes become default

Retryable writes will become the default for 4.2-compatible drivers. If you decide you don’t want automatic retries you will now have to explicitly disable them.

I don’t see a reason why you wouldn’t use retryable writes. The result for the client is the same as if there had been no primary switch at that time, i.e. just as if things were normal.
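
Opting out is an explicit connection-string (or equivalent driver option) choice now. A sketch, with made-up hosts and credentials:

// Retryable writes are on by default with 4.2-compatible drivers; add
// retryWrites=false to the URI only if you really want the old behavior.
var uri = "mongodb://appUser:secret@rs0-a:27017,rs0-b:27017,rs0-c:27017/appdb?replicaSet=rs0&retryWrites=false";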

(In)stability

Yuck! There’s a problem with the dropDatabase and movePrimary commands now.

“dropDatabase and movePrimary

“Starting in MongoDB 4.2, after you run dropDatabase or movePrimary operation:

  • “Restart all mongos instances and mongod shard members;
  • “Use the flushRouterConfig command on all mongos instances and mongod shard members before reading or writing to that database.

“This ensures that the mongos and shard instances refresh their metadata cache. Otherwise, you may miss data on reads, and may not write data to the correct shard. To recover, you must manually intervene.

“In earlier versions, you only need to restart or run flushRouterConfig on the mongos instances”
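
For reference, the manual intervention the documentation describes boils down to running the flushRouterConfig command against every mongos and, new in 4.2, every shard member too:

// Run against the admin database on each mongos and mongod shard member
// to refresh its sharding metadata cache:
db.adminCommand({ flushRouterConfig: 1 })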

Distributed catalog cache management (i.e. the information on how the db, collections, and chunks are distributed in shards) is a very hard task, but MongoDB’s developers had handled it very well, in my opinion. Going back to v2.6 it was not unusual to manually force a catalog refresh after some sharding operations. But it had improved steadily, and by v3.6 I had stopped recommending running flushRouterConfig at all.

But it seems you will now have to restart all the mongod and mongos nodes except the configsvr nodes if you drop a database or use movePrimary. That’s a huge hassle.

It wouldn’t be so bad if it was just movePrimary – that is an uncommon operation and a tricky business for catalog cache. But, for users who automate the creation and deletion of databases, doing it for dropDatabase is unfeasible.

Indexes

Improved index build process

This is a major revamp of how indexes are built. Background index builds are gone. The new index build process takes a lock briefly at the start, then does something like a background index build as it scans the entire collection to build the index structure, then locks again only whilst finishing the process.

Wildcard indexes

I believe these indexes solve a problem for users who are temporarily forced to use MongoDB with data structured as it was from a legacy key-value store.

Please be aware that using wildcard indexes does not mean a new index is created every time a new, as-yet unindexed, field name is encountered during an insert or update. Instead, it creates one physical index. (You can confirm this by looking at db.collection.getIndexes().)
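
A minimal sketch (the collection name and field path are invented): create one wildcard index covering everything under a sub-document, then confirm it appears as a single index.

// One wildcard index covers current and future field names under userMetadata:
db.legacyKV.createIndex({ "userMetadata.$**": 1 })

// getIndexes() shows a single index entry, not one index per distinct field name:
db.legacyKV.getIndexes()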

As it is not made of normal indexes, limitations apply:

  • null or existence comparisons can’t be performed
  • field-to-object (or array) comparisons can’t be performed (including the pseudo $min and $max objects)
  • a sort cannot be a composite spec such as ‘wildcard-indexed field + other non-wildcard field’ or ‘wildcard-indexed field + another wildcard-indexed field’
  • it cannot also be made a unique, text, geo, or hashed index
  • it cannot be used as a TTL index

Assorted extras

Extended JSON v2

The MongoDB tools will start using a stricter Extended JSON. The ambiguity of numeric literals in JSON is no longer tolerated (e.g. ‘Is this “6” going to be an integer, or the float value 6.0?’). Dates will also be serialized as a numeric type. It appears the ISO 8601 date format string (YYYY-MM-DDTHH:MM:SS.nnn(TZ)) is no longer acceptable because it couldn’t be used for dates before the unix epoch start.

The new format might take up to five times as much space if you happen to be dumping your big data with mongodump (unlikely, admittedly). Dates are basically the same size, but a number such as 567.9 will now take 28 bytes to serialize instead of 5.
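
To illustrate where the extra bytes come from, here is roughly how the same two values serialize in the old relaxed style versus canonical Extended JSON v2 (my own illustration, not actual tool output):

// Relaxed / legacy style:
{ "n": 567.9, "when": { "$date": "2019-08-16T00:00:00Z" } }

// Canonical Extended JSON v2 - types are now explicit:
{ "n": { "$numberDouble": "567.9" }, "when": { "$date": { "$numberLong": "1565913600000" } } }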

“ssl*” option names are changing to “tls*” ones

One for the protocol name pedants. And fair enough.

dropConnections

There is a new dropConnections command. It will only kill outgoing connections. In other words, if you want to log in as the admin user on a shard node and kill all client ops, this is not going to do that.
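
A quick sketch of the command; the host:port value is made up, and note again that it only severs this node’s outgoing pooled connections to the listed hosts:

db.adminCommand({ dropConnections: 1, hostAndPort: [ "shard02-b.example.net:27018" ] })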

New fields in currentOp docs

currentOp output now includes idle ops and sessions, not just active ops. To distinguish between them, there is a new “type” field.

Other new fields of interest to me are: effectiveUsers, runBy, writeConflicts, and prepareReadConflicts.
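
A minimal sketch of pulling the new information out with the $currentOp aggregation stage, run against the admin database and filtered on the new “type” field:

db.getSiblingDB("admin").aggregate([
  { $currentOp: { allUsers: true, idleSessions: true } },
  { $match: { type: "idleSession" } }
])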

Storage Watchdog

A previously Enterprise-only feature that is being shifted into the Community edition. This is an aid to make sure a mongod node dies within a certain time limit if the disk interface goes completely silent to the kernel. Without the storage ‘watchdog’ the mongod won’t react here, because even the kernel won’t react.

I don’t recommend using the storage watchdog. Disk-full errors, or even complete disk death, will be detected by the kernel and signaled through to mongod, which will then react appropriately. It is only a really evil design of SCSI or RAID card etc. that can create the kernel-silence situation this feature addresses. I think we should exorcise, not accommodate, hardware like that.

Wrapping-up

In conclusion, it’s great to see that MongoDB Community 4.2 has squirmed out from its release-candidate cocoon and become the beautiful butterfly we’ve been waiting for (my complaints about the new dropDatabase sensitivities aside).

We look forward to bringing you Percona Server for MongoDB 4.2 in the near future, which incorporates the key elements of 4.2 with additional Enterprise features. 

Please contact us if you would like any assistance with your MongoDB database set-up, or if you would like to discuss any of the 4.2 features in more detail.

Aug
09
2019
--

Percona Server for MongoDB 3.4.22-2.20 Now Available

Percona announces the release of Percona Server for MongoDB 3.4.22-2.20 on August 9, 2019. Download the latest version from the Percona website or the Percona software repositories.

Percona Server for MongoDB is an enhanced, open source, and highly-scalable database that is a fully-compatible, drop-in replacement for MongoDB 3.4 Community Edition. It supports MongoDB 3.4 protocols and drivers.

Percona Server for MongoDB extends Community Edition functionality by including the Percona Memory Engine storage engine, as well as several enterprise-grade features.

Also, it includes the MongoRocks storage engine, which is now deprecated. Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release is based on MongoDB 3.4.22. In Percona Server for MongoDB 3.4.22-2.20, there are no additional improvements or new features on top of those upstream fixes.

Release notes are available in the official documentation.

Aug
05
2019
--

MongoDB: Impact-free Index Builds using Detached ReplicaSet Nodes

Creating an index on a MongoDB collection is simple; just run the command createIndex and that’s all there is to it. There are several index types available, and in a previous post you can find more about the most important ones: MongoDB index types and explain().

The command is quite simple, but for MongoDB, building an index is probably the most complicated task. I’m going to explain what the potential issues are and the best way to create any kind of index on a Replica Set environment.

A Replica Set is a cluster of mongod servers, at least 3, where the complete database is replicated. Some of the main advantages of this kind of structure are automatic failover and read scalability. If you need more familiarity with Replica Set, you may take a look at the following posts:

Deploy a MongoDB Replica Set with Transport Encryption (Part 1)

Deploy a MongoDB Replica Set with Transport Encryption (Part 2)

Deploy a MongoDB Replica Set with Transport Encryption (Part 3)

Create Index Impact

As mentioned, creating an index in MongoDB can have a really severe impact. A simple index creation on a field like the following blocks all other operations on the database:

db.people.createIndex( { email: 1 } )

This could be OK for a very small collection, where the build takes just a few milliseconds, but for larger collections it is absolutely forbidden.

We call this way of building an index the “foreground” mode.

The foreground index creation is the fastest way, but since it is blocking we have to use something different in the production environments. Fortunately, we can also create an index in “background” mode. We may use the following command:

db.people.createIndex( { email : 1 }, { background: true } )

The index will be built in the background by mongod using a different incremental approach. The advantage is that the database can continue to operate normally without any lock. Unfortunately, background creation takes much longer than the foreground build.

The first hint is then to create the indexes using the background option. This is OK, but not in all the cases. More on that later.

Another impact when building an index is memory usage. MongoDB uses, by default, up to 500MB of memory for building the index, but you can raise the limit if the index is larger. The larger the index, the higher the impact will be if you don’t have the capability to assign more memory to the task.

To increase the amount of memory available for index builds, set the following in the configuration file (the parameter lives under the setParameter section; this example sets it to 1 GB):

setParameter:
   maxIndexBuildMemoryUsageMegabytes: 1024

Create Index on a Replica Set

The index creation command is replicated to all the nodes of the Replica Set in the same way as all other write commands. A foreground creation on the PRIMARY is replicated as foreground on the SECONDARY nodes. A background creation is replicated as background on the SECONDARY nodes as well.

The same limitations apply to a Replica Set as to a standalone server. The foreground build is fast but blocking, and the background build is not blocking, but it is significantly slower for very large collections.

So, what can we do?

If you need to create a small index, let’s say the size is less than the available memory, you can rely on the background creation on the PRIMARY node. The operation will be correctly replicated to the SECONDARY nodes and the overall impact won’t be too bad.

But if you need to create an index larger than the memory, on a huge collection, then even the background build is bad. The creation will have a significant impact on the server resources and you can get overall performance problems on all the nodes. In this case, we have to follow another procedure. The procedure requires more manual steps, but it’s the only way to properly build such a large index.

The idea is to detach one node at a time from the Replica Set, create the index, and rejoin the node to the cluster. But first, you need to take care of the oplog size. The oplog window should be large enough to give you the time for the index build while a node is detached.

Note: the oplog window is the timestamp difference between the first entry in the oplog and the most recent one. It represents the maximum amount of time you can have a node detached from the cluster for any kind of task (software upgrades, index builds, backups). If you rejoin the node inside the window, the node will be able to catch up with the PRIMARY just by executing the missing operations from the oplog. If you rejoin the node after the window, it will have to copy all the collections completely. This will take a long time and use an impressive amount of bandwidth for large deployments.
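
Checking the current oplog window is a one-liner in the mongo shell; the “log length start to end” value in the output is the window described in the note above.

// Run this on the PRIMARY (or on the SECONDARY you are about to detach):
rs.printReplicationInfo()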

The following is the step by step guide:

  • choose one of the SECONDARY nodes
  • detach the node from the Replica Set
    • comment out the replSetName and port options in the configuration file
    • set a different port number
    • set the parameter disableLogicalSessionCacheRefresh to true
net:
   bindIp: localhost,192.168.10.10
   port: 27777
#  port: 27017
#replication:
#   replSetName: RSCluster
setParameter:
   disableLogicalSessionCacheRefresh: true

    • restart mongod
    • now the server is running as a standalone; its writes will not be replicated
  • connect using the alternative port and build the index in foreground mode
db.people.createIndex( { email: 1 } )

  • connect the node to the Replica Set
    • uncomment the options in the configuration file
    • remove the disableLogicalSessionCacheRefresh option
    • restart mongod
    • now the node is a member of the Replica Set
    • wait some time for the node to catch up with the PRIMARY
  • repeat the previous steps for all the remaining SECONDARY nodes
  • stepdown the PRIMARY node to force an election
    • run the rs.stepDown() command. This forces an election. Wait some time for the former PRIMARY to become a SECONDARY node.
  • restart it as standalone
    • use the same procedure we saw before
  • build the index in foreground mode
  • restart the node and connect it to the Replica Set

That’s all. We have created the index on all the nodes without any impact for the cluster and for the production applications.

Note: when restarting a node as standalone, the node could be exposed to accidental writes. To be safe, a good practice is to disable TCP connections, allowing only local connections through the unix socket. To do that you can put, for example, the following into the configuration file:
bindIp: /tmp/mongod.sock

Conclusion

This procedure is definitely more complicated than running a single command. It will require some time, but we hope you don’t have to create such large indexes every single day.

Jul
30
2019
--

Network (Transport) Encryption for MongoDB

Why do I need Network encryption?

In our previous blog post MongoDB Security vs. Five ‘Bad Guys’ there’s an overview of five main areas of security functions.

Let’s say you’ve enabled #1 and #2 (Authentication and Authorization) plus #4 (Storage encryption, a.k.a. encryption-at-rest) and #5 (Auditing) mentioned in the previous blog post. Only authenticated users will be connecting, and they will only be doing what they’re authorized to do. With storage encryption configured properly, the database data can’t be decrypted even if the server’s disk were stolen or accidentally given away.

You will have some pretty tight database servers indeed. However, consider the following movement of user data over the network:

  • Clients sending updates to the database (to a mongos, or mongod if unsharded).
  • A mongos or mongod sending query results back to a client.
  • Between replica set members as they replicate to each other.
  • mongos nodes retrieving collection data from the shards before relaying it to the user.
  • Shards talking with each other as chunks of sharded collections are moved between them.

As it moves, the user collection data is no longer within the database ‘fortress’. It’s riding in plain, unencrypted TCP packets. It can be grabbed from the wire with tcpdump, ngrep, etc., as shown here:

~$ #mongod primary is running on localhost:28051 is this example.
~$ #
~$ #In a different terminal I run: 
~$ #  mongo --port 28051 -u akira -p secret  --quiet --eval 'db.getSiblingDB("payments").TheImportantCollection.findOne()'
~$ 
~$ sudo ngrep -d lo . 'port 28051'
interface: lo (127.0.0.0/255.0.0.0)
filter: ( port 28051 ) and ((ip || ip6) || (vlan && (ip || ip6)))
match: .
####
...
...
T 127.0.0.1:51632 -> 127.0.0.1:28051 [AP] #19
  ..........................find.....TheImportantCollection..filter.......lim
  it........?.singleBatch...lsid......id........-H..HN.n.`..}{..$clusterTime.
  X....clusterTime...../%9].signature.3....hash.........>.9...(.j. ..H4. .key
  Id.....fK.]...$db.....payments..                                           
#
T 127.0.0.1:28051 -> 127.0.0.1:51632 [AP] #20
  ....4................s....cursor......firstBatch......0......_id..........c
  ustomer.d....fn.....Smith..gn.....Ken..city.....Georgeville..street1.....1 
  Wishful St...postcode.....45031...order_ids.........id..........ns. ...paym
  ents.TheImportantCollection...ok........?.operationTime...../%9].$clusterTi
  me.X....clusterTime...../%9].signature.3....hash.........>.9...(.j. ..H4. .
  keyId.....fK.]...                                                          
#
T 127.0.0.1:51632 -> 127.0.0.1:28051 [AP] #21
  \....................G....endSessions.&....0......id........-H..HN.n.`..}{.
  ..$db.....admin..                                                          
#
T 127.0.0.1:28051 -> 127.0.0.1:51632 [AP] #22
  ....5.....................ok........?.operationTime...../%9].$clusterTime.X
  ....clusterTime...../%9].signature.3....hash.........>.9...(.j. ..H4. .keyI
  d.....fK.]...                                                              
###^Cexit

The key names and strings such as customer name and address are visible at a glance. This is proof that the TCP data isn’t encrypted. It is moving around in the plain. (You can use “mongoreplay monitor” if you want to see numeric and other non-ascii-string data in a fully human-readable way.)

(If you can unscramble the ascii soup above and see the whole BSON document in your head – great! But you failed the “I am not a robot” test so now you have to stop reading this web page.)

For comparison, this is what the same ngrep command prints when I change to using TLS in the client <-> database connection.

~$ #ngrep during: mongo --port 28051 --ssl --sslCAFile /data/tls_certs_etc/root.crt \
~$ #  --sslPEMKeyFile /data/tls_certs_etc/client.foo_app.pem -u akira -p secret --quiet \
~$ #  --eval 'db.getSiblingDB("payments").TheImportantCollection.findOne()'
~$ 
~$ sudo ngrep -d lo . 'port 28051'
interface: lo (127.0.0.0/255.0.0.0)
filter: ( port 28051 ) and ((ip || ip6) || (vlan && (ip || ip6)))
match: .
####
...
...
T 127.0.0.1:51612 -> 127.0.0.1:28051 [AP] #23
  .........5nYe.).I.M..H.T..j...r".4{.1\.....>...N.Vm.........C..m.V....7.nP.
  f..Z37......}..c?...$.......edN..Qj....$....O[Zb...[...v.....<s.T..m8..u.u3
  R.?....5;...$.F.h...]....@...uq....."..F.M(^.b.....cv.../............\.z..9
  hY........Bz.QEu...`z.W..O@...\.K..C.....N..=.......}.                     
#
T 127.0.0.1:28051 -> 127.0.0.1:51612 [AP] #24
  .....*......4...p.t...G5!.Od...e}.......b.dt.\.xo..^0g.F'.;""..a.....L....#
  DXR.H..)..b.3`.y.vB{@...../..;lOn.k.$7R.]?.M.!H..BC.7........8..k..Rl2.._..
  .pa..-.u...t..;7T8s. z4...Q.....+.Y.\B.............B`.R.(.........~@f..^{.s
  .....\}.D[&..?..m.j#jb.....*.a..`. J?".........Z...J.,....B6............M>.
  ....J....N.H.).!:...B.g2...lua.....5......L9.?.a3....~.G..:...........VB..v
  ........E..[f.S."+...W...A...3...0.G5^.                                    
#
T 127.0.0.1:51612 -> 127.0.0.1:28051 [AP] #25
  ....m..m.5...u...i.H|..L;...M..~#._.v.....7..e...7w.0.......[p..".E=...a.?.
  G{{TS&.........s\..).U.vwV.t...t..2.%..                                    
#
T 127.0.0.1:28051 -> 127.0.0.1:51612 [AP] #26
  .....?..u.*.j...^.LF]6...I.5..5...X.....?..IR(v.T..sX.`....>..Vos.v...z.3d.
  .z.(d.DFs..j.SIA.d]x..s{7..{....n.[n{z.'e'...r............\..#.<<.Y.5.K....
  .....[......[6.....2......[w.5....H                                        
###^Cexit

 

No more plain data to see! The high-loss ascii format being printed by ngrep above can’t provide genuine satisfaction that this is perfectly encrypted binary data, but I hope I’ve demonstrated a quick, useful way to do a ‘sanity check’ that you are using TLS and are not still sending data in the plain.

Note: I’ve used ngrep because I found it made the shortest example. If you prefer tcpdump you can capture the dump with tcpdump <interface filter> <bpf filter> -w <dump file>, then open with the Wireshark GUI or view it with tshark -r <dump file> -V on the command line. And for real satisfaction that the TLS traffic is cryptographically protected data, you can print the captured data in hexadecimal / binary format (as opposed to ‘ascii’) and run an entropy assessment on it.

What’s the risk, really?

It’s probably a very difficult task for a hypothetical spy who was targeting you 1-to-1 to find and occupy a place in your network where they can just read the TCP traffic as a man-in-the-middle. But wholesale network scanners, who don’t know or care who any target is beforehand, will find any place that happens to have a gate open on the day they were passing by.

The scrambled look of raw TCP data to the human eye is not a distraction to them as it is to you, the DBA or server or application programmer. They’ve already scripted the unpacking of all the protocols. I assume the technical problem for the blackhat hackers is more a big-data one (too much copied data to process). As an aside, I hypothesize that they are already pioneering a lot of edge-computing techniques.

It is true that data captured on the move between servers might be only a tiny fraction of the whole data. But if you are making backups by the dump method once a day, and the traffic between the database server and the backup store server is being listened to, then it would be the full database.

How to enable MongoDB network encryption

MongoDB traffic is not encrypted until you create a set of TLS/SSL certificates and keys, and apply them in the mongod and mongos configuration files of your entire cluster (or non-sharded replica set). If you are an experienced TLS/SSL admin, or you are a DBA who has been given a certificate and key set by security administrators elsewhere in your organization, then I think you will find enabling MongoDB’s TLS easy – just distribute the files, reference them in the net.ssl.* options, and stop all the nodes and restart them. Gradually enabling without downtime takes longer but is still possible by using rolling restarts changing net.ssl.mode from disabled -> allowSSL -> preferSSL -> requireSSL (doc link) in each restart.

Conversely, if you are an experienced DBA and it will be your first time creating and distributing TLS certificates and keys, be prepared to spend some time learning about it first.

The way the certificates and PEM key files are created varies according to the following choices:

  • Using an external certificate authority or making a new root certificate just for these MongoDB clusters
  • Whether you are using it just for the internal system authentication between mongod and mongos nodes, or you are enabling TLS for clients too
  • How strict you will be making certificates (e.g. with host limitations)
  • Whether you need the ability to revoke certificates

To repeat the first point in this section: if you have a security administration team who already know and control these public key infrastructure (PKI) components – ask them for help, in the interests of saving time and being more certain you’re getting certificates that conform with internal policy.

Self-made test certificates

Percona Security Team note: This is not a best practice, even though it is in the documentation as a tutorial; we recommend you do not use this in production deployments.

So you want to get hands-on with TLS configuration of MongoDB a.s.a.p.? You’ll need certificates and PEM key files. Having the patience to fully master certificate administration would be a virtue, but you are not that virtuous. So you are going to use the existing tutorials (links below) to create self-signed certificates.

The quickest way to create certificates is:

  • Make a new root certificate
  • Generate server certificates (i.e. the ones the mongod and mongos nodes use for net.ssl.PEMKeyFile) from that root certificate
  • Generate client certificates from the new root certificate too
    • Skip setting CN / “subject” fields that limit the hosts or domains the client certificate can be used on
  • Self-sign those certificates
  • Skip making revocation certificates

The weakness in these certificates is:

  • A man in the middle attack is possible (MongoDB doc link):
    “MongoDB can use any valid TLS/SSL certificate issued by a certificate authority or a self-signed certificate. If you use a self-signed certificate, although the communications channel will be encrypted, there will be no validation of server identity. Although such a situation will prevent eavesdropping on the connection, it leaves you vulnerable to a man-in-the-middle attack. Using a certificate signed by a trusted certificate authority will permit MongoDB drivers to verify the server’s identity.”
  • What will happen if someone gets a copy of one of them?
    • If they get the client or a server certificate, they will be able to decrypt traffic, or spoof being an SSL encrypting-and-decrypting network peer, on the network edges to those nodes.
    • When using self-signed certificates you distribute a copy of the root certificate with the server or client certificate to every mongod, mongos, and client app. I.e. it’s as likely to be misplaced or stolen as a single client or server certificate. With the root certificate, spoofing can be done on any edge in the network.
    • You can’t revoke a stolen client or server certificate and cut them off from further access. You’re stuck with it. You’ll have to completely replace all the server-side and client certificates with cluster-wide downtime (at least for MongoDB < 4.2).

Examples on how to make self-signed certificates:

  • This snippet from MongoDB’s Kevin Adistimba is the most concise I’ve seen.
  • This replica set setup tutorial from Percona’s Corrado Pandiani includes similar instructions, with more MongoDB context on the page.

Reference in the MongoDB docs:

Various configuration file examples

Three detailed appendix entries on how to make OpenSSL Certificates for Testing.

Troubleshooting

I like the brevity of the SSL Troubleshooting page in Gabriel Ciciliani’s MongoDB administration cool tips presentation from Percona Live Europe ’18. Speaking from my own experience, before enabling them in the MongoDB config it’s crucial to make sure the PEM files (both server and client ones) pass the ‘openssl verify’ test command against the root / CA certificate they’re derived from. Absolutely, 100% do this before trying to use them in your MongoDB config.

If “openssl verify“-confirmed certificates still produce a MongoDB replica set or cluster that you cannot connect to, then add the --sslAllowInvalidHostnames option when connecting with the mongo shell, and/or net.ssl.allowInvalidHostnames in the mongod/mongos configuration. This is a differential diagnosis to see if the hostname requirements of the certificates are the only thing causing the SSL rules to reject them.

If you find it takes --sslAllowInvalidHostnames to make it work it means the CN subject field and/or SAN fields in the certificate need to be edited until they match the hostnames and domains that the SSL lib identifies the hosts as. Don’t be tempted to just conveniently forget about it; disabling hostname verification is a gap that might be leveraged into a man-in-the-middle attack.

If you are still experiencing trouble, my next step would be to check the mongod logs. You will find lines matching the grep expression ‘NETWORK .*SSL’ in the log if there are rejections. (This might become “TLS” later.) E.g.

2019-07-25T16:34:49.981+0900 I NETWORK  [conn11] Error receiving request from client: SSLHandshakeFailed: SSL peer certificate validation failed: self signed certificate in certificate chain. Ending connection from 127.0.0.1:33456 (connection id: 11)

You might also try grepping for '[EW] NETWORK' to look for all network errors and warnings.

For SSL there is no need to raise the logging verbosity to see errors and warnings. From what I can see in ssl_manager_openssl.cpp those all come at the default log verbosity of 0. Only if you want to confirm normal, successful connections would I advise briefly raising log verbosity in the config file to level 2 for the exact log ‘component’ (in this case this is the “network”). (Don’t forget to turn it off soon after – forgetting you set log level 2 is a great way to fill your disk.) But for this topic the only thing I think log level 2 will add is “Accepted TLS connection from peer” confirmations like the following. 

2019-07-25T16:29:41.779+0900 D NETWORK  [conn18] Accepted TLS connection from peer: emailAddress=akira.kurogane@nowhere.com,CN=svrA80v,OU=testmongocluster,O=MongoTestCorp,L=Tokyo,ST=Tokyo,C=JP

Take a peek in the code

Certificate acceptance rules are a big topic and I am not the author to cover it. But take a look at the SSLManagerOpenSSL::parseAndValidatePeerCertificate(…) function in ssl_manager_openssl.cpp as a starting point if you’d like to become a bit more familiar with how MongoDB applies them.

Jul
23
2019
--

PMM for MongoDB: Quick Start Guide

As a Solutions Engineer at Percona, one of my responsibilities is to support our customer-facing roles such as the sales and customer success teams, which affords me the opportunity to speak to many current and new customers who partner with Percona. I often find that many people are interested in Percona Monitoring and Management (PMM) as a free and open-source monitoring solution due to its robust monitoring capabilities when compared to many SaaS-based monitoring solutions. They are interested in installing PMM for MongoDB for the first time and want a “quick start guide” with a brief overview to get their feet wet. I have included the commands to get started for both PMM 1 and PMM 2 (PMM2 is still in beta).

Overview and Architecture

PMM is an open-source platform for out-of-the-box management and monitoring of MySQL, MongoDB, and PostgreSQL performance, on-premise and in the cloud. It is developed by Percona in collaboration with experts in the field of managed database services, support, and consulting. PMM is built on Prometheus, a powerful open-source monitoring and alerting platform, and supports any other service that has an exporter. An exporter is an endpoint that collects data on the instance being monitored and is polled by Prometheus to collect metrics. For more information on how to use your own exporters, read the documentation here.

When deployed on-premises, the PMM platform is based on a client-server model that enables scalability. It includes the following modules:

  • PMM Client – installed on every database host that you want to monitor. It collects server metrics, general system metrics, and Query Analytics data for a complete performance overview.
  • PMM Server – the central part of PMM that aggregates collected data and presents it in the form of tables, dashboards, and graphs in a web interface.

PMM can also be deployed to support DBaaS instances for remote monitoring. Instructions can be found here, under the Advanced section. The drawback of this approach is that you will not have visibility of host-level metrics (CPU, memory, and disk activity will not be captured nor displayed in PMM). There are currently 3 different deployment options.

For a more detailed overview of the PMM Architecture please read the Overview of PMM Architecture.

Demonstration Environment

When deploying PMM in this example, I am making the following assumptions about the environment:

  • MongoDB and the monitoring host are running on Debian based operating systems. (For information on installing as an RPM instead please read Deploying Percona Monitoring and Management.)
  • MongoDB is already installed and set up. The username and password for the MongoDB user are percona:percona.
  • The PMM server will be installed within a docker container on a dedicated host.

Installing PMM Server

This process will consist of two steps:

  1. Create the docker container – docker will automatically pull the PMM Server image from the Percona docker repository.
  2. Start (or run) the docker container – docker will bring up the PMM Server in the container

Create the Docker Container

The code below illustrates the command for creating the docker container for PMM 1:

docker create \
  -v /opt/prometheus/data \
  -v /opt/consul-data \
  -v /var/lib/mysql \
  -v /var/lib/grafana \
  --name pmm-data \
  percona/pmm-server:1 /bin/true

The code below illustrates the command for creating the docker container for PMM 2:

docker create -v /srv --name pmm-data-2-0-0-beta1 perconalab/pmm-server:2.0.0-beta1 /bin/true

This is the expected output from the code:

Use the following command to start the PMM 1 docker container:

docker run -d \
   -p 80:80 \
   --volumes-from pmm-data \
   --name pmm-server \
   --restart always \
   percona/pmm-server:1

Use the following command to start the PMM 2 docker container:

docker run -d -p 80:80 -p 443:443 --volumes-from pmm-data-2-0-0-beta1 --name pmm-server-2.0.0-beta1 --restart always perconalab/pmm-server:2.0.0-beta1

This is the expected output from the code:

The PMM Server should now be installed! Yes, it IS that easy. In order to check that you can access PMM, navigate in a browser to the IP address of the monitoring host. If you are using PMM 2, the default username and password for viewing PMM is admin:admin. You should arrive at a page that looks like https://pmmdemo.percona.com.

Installing PMM Client for MongoDB

Setting up DB permissions

PMM Query Analytics for MongoDB requires the user of the mongodb_exporter to have the clusterMonitor role assigned for the admin database and the read role for the local database. If you do not have these set up already, please read Configuring MongoDB for Monitoring in PMM Query Analytics.
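
If you still need to create such a user, a minimal sketch looks like the following; the percona:percona credentials simply match the assumptions stated earlier in this guide.

// Run in the mongo shell against the admin database:
db.getSiblingDB("admin").createUser({
  user: "percona",
  pwd: "percona",
  roles: [
    { role: "clusterMonitor", db: "admin" },
    { role: "read", db: "local" }
  ]
})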

Download the Percona repo package

We must first enable the Percona package repository on our MongoDB instance and install the PMM Client. We can run the following commands in order to accomplish this:

$ wget https://repo.percona.com/apt/percona-release_latest.generic_all.deb
$ sudo dpkg -i percona-release_latest.generic_all.deb
$ sudo apt-get update

Since PMM 2 is still not GA, you’ll need to leverage our experimental release of the Percona repository. You’ll need to download and install the official percona-release package from Percona and use it to enable the Percona experimental component of the original repository. See percona-release official documentation for further details on this new tool. The following commands can be used for PMM 2:

$ wget https://repo.percona.com/apt/percona-release_latest.generic_all.deb
$ sudo dpkg -i percona-release_latest.generic_all.deb
$ sudo percona-release disable all
$ sudo percona-release enable original experimental
$ sudo apt-get update

Now that we have the MongoDB database server configured with the Percona software repository, we can download the agent software with the local package manager.  Enter the following command to automatically download and install the PMM Client package on the MongoDB server:

$ sudo apt-get install pmm-client

To download and install the PMM 2 Client:

$ apt-get install pmm2-client

Next, we will configure the PMM client by telling it where to find the PMM server.  Execute the following command to configure the PMM client:

$ sudo pmm-admin config --server=<pmm_server_ip>:80

To configure the PMM 2 Client:

$ pmm-admin config --server-insecure-tls --server-url=https://<pmm_server_ip>:443

You should get a similar output as below if it was successful:

Now we provide the PMM Client credentials necessary for monitoring the MongoDB database.  Execute the following command to start monitoring and communicating with the PMM server:

$ sudo pmm-admin add mongodb --uri mongodb://percona:percona@127.0.0.1:27017

To start monitoring and communicating with the PMM 2 Server:

$ sudo pmm-admin add mongodb --use-profiler  --server-insecure-tls --username=percona  --password=percona --server-url=https://<pmm_ip>:443

You should get a similar output as below if it was successful:

Great! We have successfully installed PMM for MongoDB and are ready to take a look at the dashboards.

PMM for MongoDB Dashboards Overview

Navigate to the IP address of your monitoring host. http://<pmm_server_ip>.

PMM Home Dashboard – The Home Dashboard for PMM gives an overview of your entire environment to include all the systems you have connected and configured for monitoring under PMM. It provides useful metrics such as CPU utilization, RAM availability, database connections, and uptime.

Percona Monitoring and Management Dashboard

Cluster Summary – it shows the statistics for the selected MongoDB cluster such as counts of sharded and un-sharded databases, shard and chunk statistics, and various mongos statistics.

MongoDB Cluster Summary

MongoDB Overview – this provides basic information about MongoDB instances such as connections, command operations, and document operations.

MongoDB Overview

ReplSet – provides information about replica sets and their members such as replication operations, replication lag, and member state uptime.

ReplSet

WiredTiger/MMAPv1/In-Memory/RocksDB – it contains metrics that describe the performance of the selected host storage engine.

WiredTiger/MMAPv1/In-Memory/RocksDB

Query Analytics – this allows you to analyze database queries over periods of time. This can help you optimize database performance by ensuring queries are executed as expected and within the shortest amount of time. If you are having performance issues, this is a great place to see which queries may be the cause of your performance issues and get detailed metrics for them.

PMM Query Analytics

What Now?

Now that you have PMM for MongoDB up and running, I encourage you to explore in-depth more of the graphs and features. A few other MongoDB PMM blog posts may also be of interest.

If you run into issues during the install process, a good place to start is Peter Zaitsev’s blog post on PMM Troubleshooting.

Jul
18
2019
--

Resolving MongoDB Stack Traces

When a MongoDB server crashes you will usually find what is called a “stack trace” in its log file. But what is it and what purpose does it have? Let’s simulate a simple crash so we can dig into it.

Crashing a test server

In a test setup with a freshly installed MongoDB server, we connect to it and create some test data:

$ mongo
MongoDB shell version v3.6.12
(...)
> use test
switched to db test
> db.albums.insert({ name: "The Wall" })
WriteResult({ "nInserted" : 1 })
> db.albums.find()
{ "_id" : ObjectId("5d237cef9affce6d7e4e8345"), "name" : "The Wall" }

On a separate connection to the server, we change the ownership of the MongoDB data files, so the mongod user will no longer have access to them:

$ sudo chown root:root /var/lib/mongo/*

Going back to the mongo session, we try to add a new record and it fails, as expected:

> db.albums.insert({ name: "The Division Bell" })
2019-07-08T17:27:40.275+0000 E QUERY    [thread1] Error: error doing query: failed: network error while attempting to run command 'insert' on host '127.0.0.1:27017'  :
DB.prototype.runCommand@src/mongo/shell/db.js:168:1
DBCollection.prototype._dbCommand@src/mongo/shell/collection.js:173:1
Bulk/executeBatch@src/mongo/shell/bulk_api.js:903:22
Bulk/this.execute@src/mongo/shell/bulk_api.js:1154:21
DBCollection.prototype.insert@src/mongo/shell/collection.js:317:22
@(shell):1:1
2019-07-08T17:27:40.284+0000 I NETWORK  [thread1] trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2019-07-08T17:27:40.284+0000 W NETWORK  [thread1] Failed to connect to 127.0.0.1:27017, in(checking socket for error after poll), reason: Connection refused
2019-07-08T17:27:40.284+0000 I NETWORK  [thread1] reconnect 127.0.0.1:27017 (127.0.0.1) failed failed

Looking at the error log we confirm the server has crashed, leaving a stack trace (also called a “backtrace”) behind:

$ sudo cat /var/log/mongodb/mongod.log 
(...)
2019-07-08T17:27:39.666+0000 E STORAGE  [thread2] WiredTiger error (13) [1562606859:666004][24742:0x7f70a3501700], log-server: __directory_list_worker, 48: /home/vagrant/db/journal: directory-list: opendir: Permission denied
(...)
2019-07-08T17:27:39.666+0000 E STORAGE  [thread2] WiredTiger error (-31804) [1562606859:666313][24742:0x7f70a3501700], log-server: __wt_panic, 523: the process must exit and restart: WT_PANIC: WiredTiger library panic
(...)
----- BEGIN BACKTRACE -----
(...)
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x5618a3ab92c1]
 mongod(+0x22744D9) [0x5618a3ab84d9]
 mongod(+0x22749BD) [0x5618a3ab89bd]
 libpthread.so.0(+0xF6D0) [0x7f70a6cff6d0]
 libc.so.6(gsignal+0x37) [0x7f70a6959277]
 libc.so.6(abort+0x148) [0x7f70a695a968]
 mongod(_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj+0x0) [0x5618a21e064c]
 mongod(+0xA6D9EE) [0x5618a22b19ee]
 mongod(+0xADEEF1) [0x5618a2322ef1]
 mongod(__wt_err_func+0x90) [0x5618a217b742]
 mongod(__wt_panic+0x3F) [0x5618a217bb62]
 mongod(+0xB3DFB2) [0x5618a2381fb2]
 libpthread.so.0(+0x7E25) [0x7f70a6cf7e25]
 libc.so.6(clone+0x6D) [0x7f70a6a21bad]
-----  END BACKTRACE  -----
Aborted

But what can we infer from these somewhat cryptic lines full of hexadecimal content?

Inspecting the MongoDB stack trace

In the bottom of the stack trace, we can see a list of function names and addresses. Note the resolution of most functions worked reasonably well in the example above; the mongod binary used by our test server is not stripped of symbols (if yours is you will need to install the respective debugsymbols/debuginfo package and use the mongod binary provided by it to resolve the stack trace):

$ file `which mongod`
/usr/bin/mongod: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=1d0fd59529274e06c35e6dc4c74e0ef08caf931c, not stripped

This means we can actually extract them from the mongod binary with the help of a tool such as nm, from GNU Development Tools:

$ nm -n /usr/bin/mongod > mongod.symbols

Function names appear all mangled though:

$ tail mongod.symbols 
0000000003125f88 u _ZNSt9money_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE2idE
0000000003125f90 u _ZNSt10moneypunctIwLb1EE2idE
0000000003125f98 u _ZNSt10moneypunctIwLb0EE2idE
0000000003125fa0 b _ZZN9__gnu_cxx27__verbose_terminate_handlerEvE11terminating
0000000003125fc0 b _ZZN12_GLOBAL__N_112get_catalogsEvE10__catalogs
0000000003126008 b _ZGVZN12_GLOBAL__N_112get_catalogsEvE10__catalogs
0000000003126020 b _ZZN12_GLOBAL__N_112get_catalogsEvE10__catalogs
0000000003126068 b _ZGVZN12_GLOBAL__N_112get_catalogsEvE10__catalogs
0000000003126080 B __wt_process
00000000031260e8 A _end

We can use another tool from that same toolkit, c++filt, to get them straightened, for example:

$ echo _ZNSt9money_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE2idE | c++filt 
std::money_get<wchar_t, std::istreambuf_iterator<wchar_t, std::char_traits<wchar_t> > >::id

In fact, we can process the whole stack trace with c++filt all at once…

$ cat <<EOT | c++filt
> mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x5618a3ab92c1]
> mongod(+0x22744D9) [0x5618a3ab84d9]
> mongod(+0x22749BD) [0x5618a3ab89bd]
> libpthread.so.0(+0xF6D0) [0x7f70a6cff6d0]
> libc.so.6(gsignal+0x37) [0x7f70a6959277]
> libc.so.6(abort+0x148) [0x7f70a695a968]
> mongod(_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj+0x0) [0x5618a21e064c]
> mongod(+0xA6D9EE) [0x5618a22b19ee]
> mongod(+0xADEEF1) [0x5618a2322ef1]
> mongod(__wt_err_func+0x90) [0x5618a217b742]
> mongod(__wt_panic+0x3F) [0x5618a217bb62]
> mongod(+0xB3DFB2) [0x5618a2381fb2]
> libpthread.so.0(+0x7E25) [0x7f70a6cf7e25]
> libc.so.6(clone+0x6D) [0x7f70a6a21bad]
> EOT

… and get it fully demangled, with C++ function and method names easily recognizable now:

mongod(mongo::printStackTrace(std::basic_ostream<char, std::char_traits<char> >&)+0x41) [0x5618a3ab92c1]
mongod(+0x22744D9) [0x5618a3ab84d9]
mongod(+0x22749BD) [0x5618a3ab89bd]
libpthread.so.0(+0xF6D0) [0x7f70a6cff6d0]
libc.so.6(gsignal+0x37) [0x7f70a6959277]
libc.so.6(abort+0x148) [0x7f70a695a968]
mongod(mongo::fassertFailedNoTraceWithLocation(int, char const*, unsigned int)+0x0) [0x5618a21e064c]
mongod(+0xA6D9EE) [0x5618a22b19ee]
mongod(+0xADEEF1) [0x5618a2322ef1]
mongod(__wt_err_func+0x90) [0x5618a217b742]
mongod(__wt_panic+0x3F) [0x5618a217bb62]
mongod(+0xB3DFB2) [0x5618a2381fb2]
libpthread.so.0(+0x7E25) [0x7f70a6cf7e25]
libc.so.6(clone+0x6D) [0x7f70a6a21bad]

While easily reproducible, this was not a very interesting example: the change in the database files ownership caused WiredTiger to crash upon the insert without leaving much trace behind. Let’s have a look at another one.

A more realistic example

Despite being somewhat old, bug SERVER-13751 (mongod crash on geo nearSphere query) provides a realistic yet easy to reproduce example of a simple routine that crashed MongoDB 2.6.0 (this bug is, in fact, a duplicate of SERVER-13666, but it provides a simpler test case). Here’s how to get to it.

1) First, we download these old binaries and start a MongoDB server:

$ wget http://downloads.mongodb.org/linux/mongodb-linux-x86_64-2.6.0.tgz
$ tar zxvf mongodb-linux-x86_64-2.6.0.tgz
$ cd mongodb-linux-x86_64-2.6.0/bin
$ mkdir /home/vagrant/db
$ ./mongod --dbpath /home/vagrant/db

2) In a second terminal window, we connect to the MongoDB server we just started and run a more simplified version of the routine described in the bug, which consists of creating a 2dsphere index and querying for a point described with invalid coordinates:

$ cd mongodb-linux-x86_64-2.6.0/bin
$ ./mongo
> db.places.ensureIndex({loc:"2dsphere"})
> db.places.find({loc:{$nearSphere: [200.4905, 300.2646]}})

Now when we look back at the first terminal we find the server has crashed, leaving the following stack trace:

./mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11bd301]
./mongod() [0x11bc6de]
/lib64/libc.so.6(+0x36340) [0x7f11cd866340]
/lib64/libc.so.6(gsignal+0x37) [0x7f11cd8662c7]
/lib64/libc.so.6(abort+0x148) [0x7f11cd8679b8]
./mongod(_ZN5mongo13fassertFailedEi+0x13a) [0x11421ea]
./mongod(_ZN15LogMessageFatalD1Ev+0x1d) [0x125d58d]
./mongod(_ZN5S2Cap13FromAxisAngleERK7Vector3IdERK7S1Angle+0x169) [0x1267699]
./mongod(_ZN5mongo11S2NearStage11nextAnnulusEv+0xd2) [0xabd142]
./mongod(_ZN5mongo11S2NearStage4workEPm+0x1fb) [0xabf2cb]
./mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_7DiskLocE+0xef) [0xd66a7f]
./mongod(_ZN5mongo11newRunQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x958) [0xd4acf8]
./mongod() [0xb96382]
./mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x442) [0xb98962]
./mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x9f) [0x76b76f]
./mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x4fb) [0x117367b]
/lib64/libpthread.so.0(+0x7dd5) [0x7f11ce62bdd5]
/lib64/libc.so.6(clone+0x6d) [0x7f11cd92e02d]

Processing  the stack trace with c++filt we get:

./mongod(mongo::printStackTrace(std::basic_ostream<char, std::char_traits<char> >&)+0x21) [0x11bd301]
./mongod() [0x11bc6de]
/lib64/libc.so.6(+0x36340) [0x7f11cd866340]
/lib64/libc.so.6(gsignal+0x37) [0x7f11cd8662c7]
/lib64/libc.so.6(abort+0x148) [0x7f11cd8679b8]
./mongod(mongo::fassertFailed(int)+0x13a) [0x11421ea]
./mongod(LogMessageFatal::~LogMessageFatal()+0x1d) [0x125d58d]
./mongod(S2Cap::FromAxisAngle(Vector3<double> const&, S1Angle const&)+0x169) [0x1267699]
./mongod(mongo::S2NearStage::nextAnnulus()+0xd2) [0xabd142]
./mongod(mongo::S2NearStage::work(unsigned long*)+0x1fb) [0xabf2cb]
./mongod(mongo::PlanExecutor::getNext(mongo::BSONObj*, mongo::DiskLoc*)+0xef) [0xd66a7f]
./mongod(mongo::newRunQuery(mongo::Message&, mongo::QueryMessage&, mongo::CurOp&, mongo::Message&)+0x958) [0xd4acf8]
./mongod() [0xb96382]
./mongod(mongo::assembleResponse(mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&)+0x442) [0xb98962]
./mongod(mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*)+0x9f) [0x76b76f]
./mongod(mongo::PortMessageServer::handleIncomingMsg(void*)+0x4fb) [0x117367b]
/lib64/libpthread.so.0(+0x7dd5) [0x7f11ce62bdd5]
/lib64/libc.so.6(clone+0x6d) [0x7f11cd92e02d]

This particular mongod binary is stripped of symbols:

$ file mongod
mongod: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.9, stripped

So in order to resolve the stack trace, we need to first obtain one that is not:

$ wget http://downloads.mongodb.org/linux/mongodb-linux-x86_64-debugsymbols-2.6.0.tgz
$ tar zxvf mongodb-linux-x86_64-debugsymbols-2.6.0.tgz
$ cd mongodb-linux-x86_64-debugsymbols-2.6.0/bin

We can now extract the actual function names from the addresses between the brackets using addr2line (option “f” provides the function name and if we also use “i” we get the preceding ones as well if the main one was inline; option “C” provides some extent of demangling, similar to c++filt):

$ addr2line -e mongod -ifC 0x1267699
S2Cap::FromAxisAngle(Vector3<double> const&, S1Angle const&)
/srv/10gen/mci-exec/mci/git@github.commongodb/mongo.git/mongodb-mongo-v2.6/src/third_party/s2/s2cap.cc:35

One of the greatest values of working with Open Source software is being able to have a direct look at this exact piece of code, which translates to https://github.com/mongodb/mongo/blob/v2.6/src/third_party/s2/s2cap.cc#L35 :

S2Cap S2Cap::FromAxisAngle(S2Point const& axis, S1Angle const& angle) {
  DCHECK(S2::IsUnitLength(axis));
  DCHECK_GE(angle.radians(), 0);
  return S2Cap(axis, GetHeightForAngle(angle.radians()));
}

Note that the actual fix for this bug didn’t come from modifying this function, which is re-used from a third party (another beauty of working with Open Source!), but from making sure the arguments being passed to it (which compose the point’s coordinates) are validated beforehand.

There is a home for bugs

If you ever run into a MongoDB server crash, I hope this little set of instructions can serve as a reference in helping you make sense of the stack trace that will (hopefully) have been left behind. You can then search for bugs at https://jira.mongodb.org if you’re running a MongoDB server, or at https://jira.percona.com/projects/PSMDB if you’re running Percona Server for MongoDB. If you can’t find a bug that matches your crash, please consider filing a new one; providing a clear stack trace alongside the exact binary version you’re using is a must. If you are able to reproduce the problem at will and can provide a reproducible test case as well, like the ones we showed above, that will not only make the life of our developers easier, it also increases the likelihood of getting the bug fixed much, much faster.

Jul
12
2019
--

MongoDB Security vs. Five ‘Bad Guys’

MongoDB Security

Most any commercially mature DBMS provides the following five ways to secure the data you keep inside it:

  • Authentication of user connections (== Identity)
  • Authorization (== DB command permissions) (a.k.a. Role-based access control)
  • Network Encryption (a.k.a. Transport encryption)
  • Storage Encryption (a.k.a. Encryption-at-rest)
  • Auditing (MongoDB Enterprise or Percona Server for MongoDB only)

MongoDB is no exception. All of these have been present for quite a while, although infamously the first versions set “--auth” off by default and this is still in effect. (See more in the “Auth – Still disabled by default” section later.)

This article is an overview of all five, plus some important clarification to sort out Authentication vs. Authorization. Network encryption, storage encryption, and Auditing will be expanded on in other articles.

MongoDB Security: The bad guy line-up

So what exactly do the five security subsystems do? Where does the responsibility and purpose of one stop and the next one start?

I think the easiest way to explain it is to highlight the ‘bad guy’ each one repels.

  • Authentication: An unknown person, whom you didn’t realize had network access to the database server, who just ‘walks in’ and looks at, copies, or damages the database data.
  • Authorization (a.k.a. Access control): A user or application that reads, alters, or deletes data other than what they were supposed to. This ‘bad guy’ is usually a colleague who does it by accident, so it’s mostly about safety rather than security, but it prevents malicious cases too.
  • Network Encryption: Someone who takes a copy of the data being transferred over a network link somewhere between server A and server B.
  • Storage Encryption: Someone who breaks into your datacenter and steals your server’s hard disk so they can read the data files on it. In practice, they would probably steal the file data over the network or get the disk in a second-hand hardware sale, but the concept is still someone who obtains a copy of the underlying database files.
  • Auditing: A privileged database user who knows how to cover up their tracks after altering database data. An important caveat: all bets are off if a unix account with the privilege to overwrite the audit log is controlled by the db-abusing adversary.


Authentication and authorization must be activated in unison, and auditing requires authentication as a prerequisite, but otherwise they can be used independently of each other, and there is little if any entanglement of code between one of the subsystems above and another.

Apart from the caveats in the paragraph above, if you believe that certain bad guys are not a problem for you, then you don’t have to enable the relevant subsystem.

Which ones should I use, and which must I use?

Excluding those who are building a public sandpit or honey-trap, no-one can say the “Authentication” bad guy or the “Authorization” bad guy would be an acceptable visitor to their database. So you should at least be using Authorization and Authentication.

Network encryption is almost a no-brainer for me too, but I assume insecure networks as a matter of course. If your MongoDB cluster and all its clients are, for example, inside a virtual private network that you believe has no firewall holes and no privilege escalation risk from other apps then no, you don’t need network encryption.

But if you are a network security expert who is good enough to guarantee your VPN is risk-free, I assume you’re also skilled at TLS/SSL certificate generation. In this case, you will find that setting up TLS/SSL in MongoDB is pretty easy, so why not do it too?

Storage encryption (a.k.a. encryption-at-rest) and Auditing are only high value when certain other risks are eliminated (e.g. unix root access can’t be gained by an attacker on the servers running the live mongod nodes). Storage encryption has a slight performance cost. If Auditing is used suitably, there is not much performance cost, but beware that performance will quickly choke if audit filters are made too broad.

Having said that, it is worth repeating that these last two are high value to some users. If you know you are one of those, please refer to our later articles on them.

Where in the config?

Although it’s all security, it isn’t all configured in the same place.

  • Authentication: security.authorization (and/or security.keyFile or security.clusterAuthMode, which imply/force it too). security.sasl and security.ldap are sub-sections for those optional authentication methods.
  • Authorization: enabled simultaneously with “Authentication” above. Users and roles are not stored in config files; they live in the database itself, typically in the admin db’s system.users and system.roles collections.
  • Network encryption: net.ssl. N.b. this is not inside the security.* section of the config file.
  • Storage encryption (disk encryption): security.enableEncryption.
  • Auditing: the auditLog section of the config file, especially the auditLog.filter option.

The above config pointers are just the root positions; in total there are dozens of different settings underneath them.
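To make that layout concrete, here is a minimal sketch of a mongod YAML config that touches each of those roots. The file paths, the requireSSL choice, and the audit filter are placeholders only; also note that security.enableEncryption and auditLog are only available in MongoDB Enterprise or Percona Server for MongoDB.

security:
  authorization: enabled            # authentication + authorization
  keyFile: /etc/mongodb/keyfile     # internal (node-to-node) authentication; also forces auth on
  enableEncryption: true            # storage encryption (encryption-at-rest)
  encryptionKeyFile: /etc/mongodb/encryption-key   # local key management variant
net:
  ssl:                              # network encryption lives under net.*, not security.*
    mode: requireSSL
    PEMKeyFile: /etc/mongodb/server.pem
    CAFile: /etc/mongodb/ca.pem
auditLog:
  destination: file
  format: JSON
  path: /var/log/mongodb/audit.json
  filter: '{ atype: { $in: [ "createUser", "dropUser" ] } }'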

Authentication and Authorization

Question: “These Authxxxx and Authyyyy words … the same thing right?”

The answer: 1) No, 2) Yes, 3) Yes, 4) No.

  1. No: Authentication and authorization are not the same thing, because they are two parts of the software that do different things.

Authentication  == User Identity, by means of credential checking.

Authorization == Assigning and enforcing DB object and DB command permissions.

  2. Yes: Authentication and authorization are kind of a single unit because enabling Authentication automatically enables Authorization too.

I assume it was made like this because it matches user expectations from other, older databases, and besides, why authenticate if you don’t want to stop unknown users from accessing or changing data? Authorization is enabled in unison with authentication, so connections from unknown users will have no privilege to do anything with database data.

Authorization requires the user name (verified by Authentication) to know which privileges apply to a connection’s requests, so it can’t be enabled independently of the other, either.

  3. Yes: Authentication and authorization are sort of the same thing in the unfortunate, legacy naming of the configuration options.

The command-line argument for enabling authentication (which forces authorization to be on too) is simply “--auth”. Even worse, the configuration file option name for the same thing is security.authorization rather than security.authentication. When you use it, though, the first thing being enabled is Authentication, and Authorization is only enabled as an after-effect.

  4. No: There is one exception to the ‘Authentication and authorization on together’ rule: during initial setup, Authentication is disabled for localhost connections (see the sketch below). This is brief though: you get one opportunity to create the first user, then the exception privilege is dropped.

Another exception is when a 3.4+ replica set or cluster uses security.transitionToAuth, but its point is obvious and I won’t expand on it here.
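Going back to the localhost exception in point 4: for completeness, this is roughly what that one-time setup looks like. A sketch only; the user name, password, and chosen role are placeholders. Connect over localhost to a node that has auth enabled but no users yet, then:

$ mongo --host localhost --port 27017
> use admin
> db.createUser({ user: "dba", pwd: "use-a-strong-password-here", roles: [ { role: "root", db: "admin" } ] })

Once that first user exists, the localhost exception no longer applies and all further connections must authenticate.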

Auth – Still disabled by default

MongoDB’s first versions set “--auth” off by default. This has been widely regarded as a bad move.

You might think that by now (mid-2019, v4.0) it would be on by default – but you’d be wrong. Blank configuration still equates to authorization being off, albeit with startup warnings (on and off again in v3.2, on again v3.4) and various exposure reductions such as localhost becoming the only default-bound network device in v3.6.

Feeling nervous about your MongoDB instances now? If the mongod config files do not have security.authorization set to “enabled”, nor include a security.keyFile or security.clusterAuthMode setting which forces it on, then you are not using authentication. You can try this quick mongo shell one-liner (with no user credential arguments set) to double-check whether you have authentication and authorization enabled or not.

mongo --host <target_host>:<port> --quiet --eval 'db.adminCommand({listDatabases: 1})'

The response you want to see is an “unauthorized” error. If you get a list of the database names on the other hand, sorry, you have a naked MongoDB deployment.

Note: Stick with the “listDatabases” example above for simplicity. There are some commands such as ismaster that don’t require authorization at any time. If you use those, you aren’t proving or disproving anything about auth mode.

External Authentication

As most people intuitively expect, this is about allowing users to be authenticated by an external service. The one exception is the internal MongoDB __system user, which can’t use it, but using it for any real human user or client-application service account is perfectly suitable. In concrete terms, the external auth service will be a Kerberos KDC, or an ActiveDirectory or OpenLDAP server.

Using external authentication doesn’t prevent you from having ordinary MongoDB user accounts at the same time. A common situation is that one DBA user is created in the normal way for setup and maintenance purposes (i.e. with db.createUser(…) in the “admin” db), while every other account is managed centrally in Kerberos or LDAP.
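To illustrate how the two co-exist, externally-authenticated users are defined on the special $external database using their full external name, while the local DBA account lives in admin as usual. A sketch, assuming LDAP authentication is already configured; the DN, role, and database below are examples only:

> db.getSiblingDB("$external").createUser({
    user: "uid=appsvc,ou=services,dc=example,dc=com",   // the user's full LDAP DN
    roles: [ { role: "readWrite", db: "percona" } ]     // no pwd here; LDAP holds the credential
  })

# then the client authenticates against $external with the PLAIN mechanism
$ mongo --authenticationDatabase '$external' --authenticationMechanism PLAIN \
    -u "uid=appsvc,ou=services,dc=example,dc=com" -p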

Internal Authentication

Confusingly MongoDB “internal authentication” doesn’t mean the opposite of the external authentication discussed just above.

It would have been better named ‘peer mongodb node authentication as the __system user’. A mongod node running with authentication enabled won’t trust that any TCP peer is another mongod or mongos node just because it talks like one. Rather, it requires that the peer authenticates by proof of a shared secret.

Keyfile Internal Authentication (Default)

In the basic case, the shared secret is the keyfile saved in an identical file distributed to each mongod and mongos node in the cluster. “Key” suggests an asymmetric encryption key, but in reality it is just a password, even if you generated it from /dev/random, etc., per the documentation’s advice.
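For reference, a common way to generate and install a keyfile looks like the following. A sketch only; the path and the mongod unix user/group are assumptions that should match your own layout. Every mongod and mongos in the cluster then points at its copy via security.keyFile (or --keyFile).

$ openssl rand -base64 756 > /etc/mongodb/keyfile    # ~1000 printable chars, under the 1024 limit
$ chmod 400 /etc/mongodb/keyfile
$ chown mongod:mongod /etc/mongodb/keyfile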

Once the password is used successfully, a mongod node will permit commands coming from the authenticated peer to run as the “__system” superuser.

Unfunny fact: if someone has a copy of the keyfile they can simply strip control and non-printing chars from the key file to make the password string that will let them connect as the “__system” user.

mongo --authenticationDatabase local -u __system -p "$(tr -d '\011-\015\040' < /path/to/keyfile)"

Don’t panic if you try this right now as the mongod (or root) user on one of your MongoDB servers and it succeeds. These unix users already have the permissions to disable security and restart nodes anyway. It is not an extra vulnerability if they can do it. There won’t be accidental read-privilege leaking either – mongod will abort on startup if the keyfile is in anything other than 400 (or 600) file permissions mode.

It is, however, a security failure if users who aren’t DBAs (or the server admins with root) are able to read a copy of the keyfile. This can happen by accidentally saving the keyfile in world-readable source control, or by putting it in deployment ‘recipes’. An intermediate risk increase is when the keyfile is distributed with mongos nodes owned and run as one of the application team’s unix users instead of “mongod” or another DBA-team-owned unix user.

X.509 Internal Authentication

The x.509 authentication mechanism does actually use asymmetric public/private keys, unlike the “security.keyfile” above. It must be used in conjunction with TLS/SSL.

It can be used for client connections as well as internal authentication. Information regarding x.509 authentication is spread over two places in the documentation as a result.

The benefit of x.509 is that, compared to the really-just-a-big-password ‘keyfile’ above, it is less likely that one of the keys deployed with mongod and mongos nodes can be abused by an attacker who gets a copy of it. It depends on how strictly the x.509 certificates are set up, however. To be practical: if you do not have a dedicated security team that understands x.509 concepts and best practices, and takes on the administrative responsibility for them, you won’t get the full benefit of x.509. Those better practices include tightening down which hosts each certificate will work on and being able to revoke and roll over certificates.
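In configuration terms it is only a couple of extra lines on top of an existing TLS/SSL setup. A minimal sketch; the certificate paths are placeholders, and each node needs its own clusterFile certificate issued by the same CA:

security:
  clusterAuthMode: x509
net:
  ssl:
    mode: requireSSL
    PEMKeyFile: /etc/mongodb/server.pem       # this node's certificate for client connections
    clusterFile: /etc/mongodb/cluster.pem     # this node's certificate for peer (internal) auth
    CAFile: /etc/mongodb/ca.pem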

Following up

That’s the end of the ‘five bad guys’ overview of MongoDB security, with some clarification about Authentication and Authorization thrown in for good measure.

To give them the space they deserve, the following two subsystems will be covered in later articles:

  • Network encryption
  • Storage encryption (a.k.a. disk encryption or encryption-at-rest)

For Auditing please see this earlier Percona blog MongoDB Audit Log.


Jul
08
2019
--

Upcoming Webinar 7/10: Learn how to run MongoDB Inside of a Containerized Environment

MongoDB Inside of a Containerized Environment

Please join Percona Consultant Doug Duncan as he presents his talk “Building Kubernetes Operator for Percona Server for MongoDB” on Wednesday, July 10th, 2019 at 10:00 AM PDT (UTC-7).

Register Now

Doug will cover the basic knowledge needed to understand the complications of running MongoDB inside of a containerized environment, and then go over the specifics of how Percona solved these challenges in the PSMDB Operator. The talk will also provide an overview of PSMDB Operator features and a sneak peek at future plans.

Jul
05
2019
--

Hiding Fields in MongoDB: Views + Custom Roles

hiding fields in MongoDB

Some time ago we wrote about how personalized roles can help you give specific permissions when they are needed. This time we want to discuss how a custom role, combined with a MongoDB view, can hide sensitive information from the client.

Hiding Fields in MongoDB

Suppose you have a collection that needs to be shared with a different team, but this team should not be able to see some fields. In our case, to keep it simple, that is the salary field.

Views in MongoDB can hide sensitive information and change how the data is presented as needed, as discussed here. For this example, we will use the employees collection with some data, and a user that has permission to read it. Let’s insert some documents into the percona database:

use percona

db.employees.insert({ "_id" : ObjectId("5ce5e609444cde8078f337f2"), "name" : "Adamo Tonete", 
    "salary" : { "year" : 1, "bonus" : 1 } })
db.employees.insert({ "_id" : ObjectId("5ce5e616444cde8078f337f3"), "name" : "Vinicius Grippa", 
    "salary" : { "year" : 1, "bonus" : 1 } })
db.employees.insert({ "_id" : ObjectId("5ce5e627444cde8078f337f4"), "name" : "Marcos Albe", 
    "salary" : { "year" : 1, "bonus" : 1 } })
db.employees.insert({ "_id" : ObjectId("5ce5e63f444cde8078f337f5"), "name" : "Vinodh Krishnaswamy", 
    "salary" : { "year" : 1, "bonus" : 1 } })
db.employees.insert({ "_id" : ObjectId("5ce5e655444cde8078f337f6"), "name" : "Aayushi Mangal", 
    "salary" : { "year" : 1, "bonus" : 1 } })

Then let’s create a view for this collection:

db.createView('employees_name', 'employees',
   [{ $project: { _id: 1, name : 1 } } ]
)

If we type show collections, we will see both the collection and the view, so a read-only user is still able to read the employees collection.

In order to secure the employees collection, we create a custom role that only has permission to see the employees_name view and nothing else. That way, the salary field is never even visible to the user:

use admin
db.createRole(
   {
     role: "view_views",
     privileges: [
       { resource: { db: "percona", collection: "system.views" }, actions: [ "find" ] },
       { resource: { db: "percona", collection: "employees_name" }, actions: [ "find","collStats"]}
     ],
     roles: [
       { role: "read", db: "admin" }
     ]
   }
)

Then we will create a user that only has permission to read data from the view (it belongs to the role “view_views”):

db.createUser({user : 'intern', pwd : '123', roles : ["view_views"]})

Now the user can only see the collection employees_name in the percona database and nothing else.

Running the query as the user intern:

> show dbs
admin    0.000GB
percona  0.000GB
> use percona
switched to db percona
> db.employees_name.find()
{ "_id" : ObjectId("5ce5e609444cde8078f337f2"), "name" : "Adamo Tonete" }
{ "_id" : ObjectId("5ce5e616444cde8078f337f3"), "name" : "Vinicius Grippa" }
{ "_id" : ObjectId("5ce5e627444cde8078f337f4"), "name" : "Marcos Albe" }
{ "_id" : ObjectId("5ce5e63f444cde8078f337f5"), "name" : "Vinodh Krishnaswamy" }
{ "_id" : ObjectId("5ce5e655444cde8078f337f6"), "name" : "Aayushi Mangal" }
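And if the intern user tries the underlying collection directly, the role does its job. A quick check (the connection assumes the user was created in the admin database, as above):

$ mongo -u intern -p 123 --authenticationDatabase admin
> use percona
> db.employees.find()    // fails with a "not authorized on percona" (Unauthorized) error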

There are several ways to do this. For instance, an application layer could filter out the same fields, but the purpose of this blog is to demonstrate how a combination of two features can help with hiding fields in MongoDB.

I hope you liked the blog. Feel free to reach out to me at @AdamoTonete or @percona with questions.

Jun
26
2019
--

Upcoming Webinar 6/27: Beyond Relational Databases – A Look Into MongoDB, Redis, and ClickHouse

relational databases

Please join Percona’s Principal Support Engineer Marcos Albe as he presents “Beyond Relational Databases: A Look Into MongoDB, Redis, and ClickHouse” on Thursday, June 27th, 2019 at 12:00 PM PDT (UTC-7).

Register Now

We all use and love relational databases… until we use them for purposes they are not a good fit for. Queues, caches, catalogs, unstructured data, counters, and many other use cases could be solved with relational databases, but are better served by other alternatives.

In this talk, we’ll review the goals, pros and cons, and good and bad use cases of these alternative paradigms by looking at some modern open source implementations.

By the end of this talk, the audience will have learned the basics of three database paradigms (document, key-value, and columnar store) and will know when it’s appropriate to opt for one of these or when to favor relational databases and avoid falling into buzzword temptations.
