Dealing with Jumbo Chunks in MongoDB

Jumbo Chunks in MongoDB

Jumbo Chunks in MongoDBIn this blog post, we will discuss how to deal with jumbo chunks in MongoDB.

You are a MongoDB DBA, and your first task of the day is to remove a shard from your cluster. It sounds scary at first, but you know it is pretty easy. You can do it with a simple command:

db.runCommand( { removeShard: "server1_set6" } )

MongoDB then does its magic. It finds the chunks and databases and balances them across all other servers. You can go to sleep without any worry.

The next morning when you wake up, you check the status of that particular shard and you find the process is stuck:

"msg" : "draining ongoing",
"state" : "ongoing",
"remaining" : {
"chunks" : NumberLong(3),
"dbs" : NumberLong(0)

There are three chunks that for some reason haven’t been migrated, so the


 command is stalled! Now what do you do?

Find chunks that cannot be moved

We need to connect to mongos and check the catalog:

mongos> use config
switched to db config
mongos> db.chunks.find({shard:"server1_set6"})

The output will show three chunks, with minimum and maximum _id keys, along with the namespace where they belong. But the last part of the output is what we really need to check:

"min" : {
"_id" : "17zx3j9i60180"
"max" : {
"_id" : "30td24p9sx9j0"
"shard" : "server1_set6",
"jumbo" : true

So, the chunk is marked as “jumbo.” We have found the reason the balancer cannot move the chunk!

Jumbo chunks and how to deal with them

So, what is a “jumbo chunk”? It is a chunk whose size exceeds the maximum amount specified in the

chunk size

 configuration parameter (which has a default value of 64 MB). When the value is greater than the limit, the balancer won’t move it.

The way to remove the flag from that those chunks is to manually split them. There are two ways to do it:

  1. You can specify at what point to split the chunk, specifying the corresponding _id value. To do this, you really need to understand how your data is distributed and what the settings are for min and max in order to select a good splitting point.
  2. You can just tell MongoDB to split it by half, letting it decide which is the best possible _id. This is easier and less error prone.

To do it manually, you need to use


. For example:

sh.splitAt("dbname", { _id: "19fr21z5sfg2j0" })

In this command, you are telling MongoDB to split the chunk in two using that _id as the cut point.

If you want MongoDB to find the best split point for you, use the 


 command. In this particular case, you only need to specify a key (any key) that is part of the chunk you want to split. MongoDB will use that key to find that particular chunk, and then divide it into two parts using the _id that sits in the middle of the list.

sh.splitFind("dbname", { _id : "30td24p9sx9j0" })

Once the three chunks have been split, the jumbo flag is removed and the balancer can move them to a different server.


 will complete the process and you can drink a well-deserved coffee.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com