Nowadays we are seeing a lot of customers starting to use our Percona Distribution for MongoDB Kubernetes Operator. The Percona Kubernetes Operators are based on best practices for configuring a Percona Server for MongoDB replica set or sharded cluster. One of the key memory consumers in MongoDB is the WiredTiger cache, which determines how much memory the storage engine can use, and we can size it according to our load.
In this blog post, we will see how to define the resources' memory and set the WiredTiger cache for the shard replica set to improve the performance of the sharded cluster.
The Necessity of WT cache
The parameter storage.wiredTiger.engineConfig.cacheSizeGB limits the size of the WiredTiger internal cache. The operating system uses the available free memory for filesystem cache, which allows the compressed MongoDB data files to stay in memory. In addition, the operating system uses any free RAM to buffer file system blocks and the file system cache. To accommodate these additional consumers of RAM, you may have to set WiredTiger's internal cache size appropriately.
Starting from MongoDB 3.4, the default WiredTiger internal cache size is the larger of either:
50% of (RAM - 1 GB), or 256 MB.
For example, on a system with a total of 4GB of RAM the WiredTiger cache will use 1.5GB of RAM (0.5 * (4 GB – 1 GB) = 1.5 GB). Conversely, a system with a total of 1.25 GB of RAM will allocate 256 MB to the WiredTiger cache because that is more than half of the total RAM minus one gigabyte (0.5 * (1.25 GB – 1 GB) = 128 MB < 256 MB).
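If it helps to see that rule as code, here is a minimal Go sketch of the default sizing logic (an illustration of the documented formula only, not MongoDB's actual implementation):

package main

import "fmt"

// defaultWiredTigerCacheGB mirrors the documented default:
// the larger of 50% of (RAM - 1 GB) or 256 MB.
func defaultWiredTigerCacheGB(ramGB float64) float64 {
    const minCacheGB = 0.25 // 256 MB
    cache := 0.5 * (ramGB - 1)
    if cache < minCacheGB {
        cache = minCacheGB
    }
    return cache
}

func main() {
    fmt.Printf("4 GB RAM    -> %.2f GB cache\n", defaultWiredTigerCacheGB(4))    // 1.50 GB
    fmt.Printf("1.25 GB RAM -> %.2f GB cache\n", defaultWiredTigerCacheGB(1.25)) // 0.25 GB (256 MB)
}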
WT cacheSize in Kubernetes Operator
The MongoDB WiredTiger cache size can be tuned in the operator with the parameter storage.wiredTiger.engineConfig.cacheSizeRatio, and its default value is 0.5. As explained above, if the allocated memory limit is too low, the WT cache is set to 256MB; otherwise it is calculated as per the formula.
Prior to PSMDB operator 1.9.0, cacheSizeRatio could be tuned under the mongod section of the cr.yaml file. This was deprecated in v1.9.0 and removed in v1.12.0, so you have to use the cacheSizeRatio parameter available under the replsets configuration instead. The main thing to check before changing the cacheSize is that the resources' memory limit allocated to the container is also sufficient for your cacheSize requirement, i.e., the section below that limits the memory:
resources:
  limits:
    cpu: "300m"
    memory: "0.5G"
  requests:
    cpu: "300m"
    memory: "0.5G"
https://github.com/percona/percona-server-mongodb-operator/blob/main/pkg/psmdb/container.go#L307
This is the source code that calculates the WiredTiger cache size from storage.wiredTiger.engineConfig.cacheSizeRatio:
// In normal situations WiredTiger does this default-sizing correctly but under Docker
// containers WiredTiger fails to detect the memory limit of the Docker container. We
// explicitly set the WiredTiger cache size to fix this.
//
// https://docs.mongodb.com/manual/reference/configuration-options/#storage.wiredTiger.engineConfig.cacheSizeGB
func getWiredTigerCacheSizeGB(resourceList corev1.ResourceList, cacheRatio float64, subtract1GB bool) float64 {
    maxMemory := resourceList[corev1.ResourceMemory]
    var size float64
    if subtract1GB {
        size = math.Floor(cacheRatio * float64(maxMemory.Value()-gigaByte))
    } else {
        size = math.Floor(cacheRatio * float64(maxMemory.Value()))
    }
    sizeGB := size / float64(gigaByte)
    if sizeGB < minWiredTigerCacheSizeGB {
        sizeGB = minWiredTigerCacheSizeGB
    }
    return sizeGB
}
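To see why the default 0.5G memory limit in cr.yaml ends up at the 256MB floor, here is a small standalone Go sketch that reproduces the same arithmetic. The gigaByte and minWiredTigerCacheSizeGB values are assumptions on my part (1 GiB and 0.25 GB), chosen to be consistent with the values observed later in this post; this is an illustration, not the operator's actual helper:

package main

import (
    "fmt"
    "math"
)

const (
    gigaByte                 int64   = 1 << 30 // assumed: 1 GiB, matching the operator constant
    minWiredTigerCacheSizeGB float64 = 0.25    // assumed: 256 MB floor, matching the output seen below
)

// cacheSizeGB mirrors getWiredTigerCacheSizeGB above, but takes the container
// memory limit in bytes instead of a corev1.ResourceList.
func cacheSizeGB(memLimitBytes int64, cacheRatio float64, subtract1GB bool) float64 {
    var size float64
    if subtract1GB {
        size = math.Floor(cacheRatio * float64(memLimitBytes-gigaByte))
    } else {
        size = math.Floor(cacheRatio * float64(memLimitBytes))
    }
    sizeGB := size / float64(gigaByte)
    if sizeGB < minWiredTigerCacheSizeGB {
        sizeGB = minWiredTigerCacheSizeGB
    }
    return sizeGB
}

func main() {
    // Default cr.yaml resources: memory limit "0.5G" (5e8 bytes), cacheSizeRatio 0.5.
    // 0.5G minus 1 GiB is negative, so the 0.25 GB (256 MB) floor is returned.
    fmt.Printf("0.5G limit, ratio 0.5 -> %.2f GB\n", cacheSizeGB(500_000_000, 0.5, true))
}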
Changing the cacheSizeRatio
Here for the test, we deployed the PSMDB operator on GCP. You can refer to the steps here – https://www.percona.com/doc/kubernetes-operator-for-psmongodb/gke.html. With the latest operator v1.11.0, the sharded cluster has been started with one shard replica set and one config server replica set, along with mongos pods.
$ kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
my-cluster-name-cfg-0                              2/2     Running   0          4m9s
my-cluster-name-cfg-1                              2/2     Running   0          2m55s
my-cluster-name-cfg-2                              2/2     Running   1          111s
my-cluster-name-mongos-758f9fb44-d4hnh             1/1     Running   0          99s
my-cluster-name-mongos-758f9fb44-d5wfm             1/1     Running   0          99s
my-cluster-name-mongos-758f9fb44-wmvkx             1/1     Running   0          99s
my-cluster-name-rs0-0                              2/2     Running   0          4m7s
my-cluster-name-rs0-1                              2/2     Running   0          2m55s
my-cluster-name-rs0-2                              2/2     Running   0          117s
percona-server-mongodb-operator-58c459565b-fc6k8   1/1     Running   0          5m45s
Now log in to the shard replica set and check the default memory allocated to the container and to the mongod instance. Below, the total memory available on the node is about 15GB, but the memory limit of this container is only 476MB:
rs0:PRIMARY> db.hostInfo()
{
  "system" : {
    "currentTime" : ISODate("2021-12-30T07:16:59.441Z"),
    "hostname" : "my-cluster-name-rs0-0",
    "cpuAddrSize" : 64,
    "memSizeMB" : NumberLong(15006),
    "memLimitMB" : NumberLong(476),
    "numCores" : 4,
    "cpuArch" : "x86_64",
    "numaEnabled" : false
  },
  "os" : {
    "type" : "Linux",
    "name" : "Red Hat Enterprise Linux release 8.4 (Ootpa)",
    "version" : "Kernel 5.4.144+"
  },
  "extra" : {
    "versionString" : "Linux version 5.4.144+ (builder@7d732a1aec13) (Chromium OS 12.0_pre408248_p20201125-r7 clang version 12.0.0 (/var/tmp/portage/sys-devel/llvm-12.0_pre408248_p20201125-r7/work/llvm-12.0_pre408248_p20201125/clang f402e682d0ef5598eeffc9a21a691b03e602ff58)) #1 SMP Sat Sep 25 09:56:01 PDT 2021",
    "libcVersion" : "2.28",
    "kernelVersion" : "5.4.144+",
    "cpuFrequencyMHz" : "2000.164",
    "cpuFeatures" : "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat md_clear arch_capabilities",
    "pageSize" : NumberLong(4096),
    "numPages" : 3841723,
    "maxOpenFiles" : 1048576,
    "physicalCores" : 2,
    "mountInfo" : [
..
..
The WiredTiger cache size (in MB) allocated on the shard is as follows:
rs0:PRIMARY> db.serverStatus().wiredTiger.cache["maximum bytes configured"]/1024/1024
256
The cache size of 256MB is too low for a real environment. So let's see how to tune the memory limit and also the cache size of the WT engine. You can use the cacheSizeRatio parameter to set the WT cache ratio (out of 1) and resources.limits.memory to set the memory allocated to the container. To do this, edit the cr.yaml file under the deploy directory of the operator. From PSMDB operator v1.9.0, editing the cacheSizeRatio parameter under the mongod section is deprecated, so set the WT cache limit with the cacheSizeRatio parameter under the "replsets" section, and set the memory with resources.limits.memory. Here we set 3G for the container and a cacheSizeRatio of 0.8 (80%).
deploy/cr.yaml:58
46     configuration: |
47       # operationProfiling:
48       #   mode: slowOp
49       # systemLog:
50       #   verbosity: 1
51       storage:
52         engine: wiredTiger
53         # inMemory:
54         #   engineConfig:
55         #     inMemorySizeRatio: 0.9
56         wiredTiger:
57           engineConfig:
58             cacheSizeRatio: 0.8
deploy/cr.yaml:229-232:
226     resources:
227       limits:
228         cpu: "300m"
229         memory: "3G"
230       requests:
231         cpu: "300m"
232         memory: "3G"
Apply the new cr.yaml
# kubectl apply -f deploy/cr.yaml
perconaservermongodb.psmdb.percona.com/my-cluster-name configured
The shard pods are recreated with the new resources, and you can check the progress as follows:
$ kubectl get pods
NAME                                               READY   STATUS      RESTARTS   AGE
my-cluster-name-cfg-0                              2/2     Running     0          36m
my-cluster-name-cfg-1                              2/2     Running     0          35m
my-cluster-name-cfg-2                              2/2     Running     1          34m
my-cluster-name-mongos-758f9fb44-d4hnh             1/1     Running     0          34m
my-cluster-name-mongos-758f9fb44-d5wfm             1/1     Running     0          34m
my-cluster-name-mongos-758f9fb44-wmvkx             1/1     Running     0          34m
my-cluster-name-rs0-0                              0/2     Init:0/1    0          13s
my-cluster-name-rs0-1                              2/2     Running     0          60s
my-cluster-name-rs0-2                              2/2     Running     0          8m33s
percona-server-mongodb-operator-58c459565b-fc6k8   1/1     Running     0          38m
Now check the new settings of WT cache as follows:
rs0:PRIMARY> db.hostInfo().system
{
  "currentTime" : ISODate("2021-12-30T08:37:38.790Z"),
  "hostname" : "my-cluster-name-rs0-1",
  "cpuAddrSize" : 64,
  "memSizeMB" : NumberLong(15006),
  "memLimitMB" : NumberLong(2861),
  "numCores" : 4,
  "cpuArch" : "x86_64",
  "numaEnabled" : false
}
rs0:PRIMARY> db.serverStatus().wiredTiger.cache["maximum bytes configured"]/1024/1024
1474
Here, the WT cache memory is calculated roughly as follows (the memory limit must be more than 1GB, otherwise the default 256MB is allocated):
(memory limit – 1 GB) * cacheSizeRatio
(2861 MB – 1024 MB) * 0.8 ≈ 1470 MB
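As a sanity check of that rough figure against what the operator actually passes to mongod, here is the same calculation at the byte level in Go (again assuming the operator's gigaByte constant is 1 GiB, which is consistent with the values observed above):

package main

import (
    "fmt"
    "math"
)

func main() {
    const gigaByte = 1 << 30    // assumed 1 GiB, consistent with the observed output
    memLimit := 3_000_000_000.0 // resources.limits.memory: "3G"
    ratio := 0.8                // cacheSizeRatio
    cacheGB := math.Floor(ratio*(memLimit-gigaByte)) / gigaByte
    fmt.Printf("--wiredTigerCacheSizeGB=%.2f\n", cacheGB) // --wiredTigerCacheSizeGB=1.44
    // 1.44 GB is roughly 1.44 * 1024 ≈ 1475 MB, in line with the ~1474 MB
    // reported by db.serverStatus() above.
}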
NOTE:
Up to PSMDB operator v1.10.0, the operator applies a cacheSizeRatio change only if resources.limits.cpu is also set. This is a bug, and it was fixed in v1.11.0 – refer to https://jira.percona.com/browse/K8SPSMDB-603. So if you are on an older version, don't be surprised, and make sure resources.limits.cpu is set as well.
https://github.com/percona/percona-server-mongodb-operator/blob/v1.10.0/pkg/psmdb/container.go#L194
if limit, ok := resources.Limits[corev1.ResourceCPU]; ok && !limit.IsZero() {
    args = append(args, fmt.Sprintf(
        "--wiredTigerCacheSizeGB=%.2f",
        getWiredTigerCacheSizeGB(resources.Limits, replset.Storage.WiredTiger.EngineConfig.CacheSizeRatio, true),
    ))
}
From v1.11.0:
https://github.com/percona/percona-server-mongodb-operator/blob/v1.11.0/pkg/psmdb/container.go#L194
if limit, ok := resources.Limits[corev1.ResourceMemory]; ok && !limit.IsZero() {
    args = append(args, fmt.Sprintf(
        "--wiredTigerCacheSizeGB=%.2f",
        getWiredTigerCacheSizeGB(resources.Limits, replset.Storage.WiredTiger.EngineConfig.CacheSizeRatio, true),
    ))
}
Conclusion
So, based on the application load, you will need to set the WT cache size for better performance. You can use the above methods to tune the cache size for the shard replica set in the PSMDB operator.
Reference Links:
https://www.percona.com/doc/kubernetes-operator-for-psmongodb/operator.html
https://www.percona.com/doc/kubernetes-operator-for-psmongodb/gke.html
MongoDB 101: How to Tune Your MongoDB Configuration After Upgrading to More Memory