Sep 21, 2022

MongoDB 6.0: Should You Upgrade Now?

MongoDB is a cross-platform, document-oriented NoSQL database. It was developed as an answer to the growing need for easy-to-use yet highly performant, scalable, and content-agnostic storage. MongoDB has been widely adopted by engineers building applications in fields from banking to social media. Unfortunately, after MongoDB Inc.’s IPO in 2017, the company chose an aggressive path of monetization, changing the license to SSPL (a license model that’s bad for you) and promoting Atlas (MongoDB’s Database as a Service (DBaaS) solution) even over the costly MongoDB Enterprise. The company put the community’s needs way behind catering to high-end enterprise customers, leaving the MongoDB community stranded.

While limited by the SSPL license (not recognized by the Open Source Initiative (OSI) as open source), Percona, known for its deep commitment to open source software, chose to support the stranded MongoDB community by:

  • Providing Percona Server for MongoDB (PSMDB) – a source-available MongoDB drop-in replacement database based on MongoDB Community Edition (CE) yet adding enterprise features developed on top of that by Percona.
  • Delivering a freely available open source product for MongoDB backup and recovery: Percona Backup for MongoDB (PBM) works with PSMDB as well as (to some extent) with MongoDB Inc.’s MongoDB Community and Enterprise editions.
  • Packaging PSMDB and PBM in Percona Distribution for MongoDB: an easy-to-deploy complete solution for MongoDB.
  • Providing Percona Monitoring and Management (PMM), which can be used as an open source, multi-database alternative to MongoDB Ops Manager.
  • Developing a MongoDB Prometheus exporter, free for anyone needing insight into how their MongoDB instance is doing. It is widely used in the industry, both by open source communities and by enterprise APM tools (from Grafana, through Ansible, to Dynatrace).
  • Delivering the Percona Operator for MongoDB, a complete solution for running Percona Server for MongoDB in containerized environments, containing the necessary Kubernetes settings to maintain a consistent PSMDB instance (if you’re unsure whether running MongoDB in Kubernetes is for you, check out the pros and cons of the available solutions).

I think it’s fair to say that the list is impressive. Individuals, organizations, and even enterprises benefit from the fact that the software Percona provides is free and open.

A bittersweet edition

Seeing all the critical bugs that MongoDB 5.0 introduced, it feels as if its release was rushed, allowing half-baked features to go GA. Looking at the numerous critical problems that could result in data corruption, you could still argue that this is a natural state of things in IT development; to quote Albert Einstein:

A person who never made a mistake never tried anything new.

A true point, but following the story by The Register, it’s not an argument I’d use here. The “accelerated release cadence” introduced by MongoDB Inc. means that major improvements land in “rapid releases” available only to Atlas (DBaaS) customers. Neither MongoDB Community nor even MongoDB Enterprise customers will get a taste of those improvements in 5.0.x, even though they do get to taste all the instabilities, limitations, and bugs that shipped with 5.0.

Of course, MongoDB Inc. will argue that the Rapid Releases are for bleeding-edge adopters and that they include new features, whereas all the issues are fixed in the bugfix, patchset releases. In my experience, though, bug fixes are not the only thing that resolves user issues. Think of the release cycle and the situations where, due to deadlines, some features are released in the major version in a limited scope. It all sounds too familiar, right? Normally that’s not so bad, as (with semantic versioning) minor versions fill in the missing capabilities, lift the limitations, and complete the often go-to-market spotlight features of the major version. Not in this case, at least not if you are a “second class citizen” user of the Community or Enterprise edition. Rapid Releases are what semantic versioning calls minor releases, meaning you have to live with the limitations and missing features until the next major release, making do with bug-fix-only patches for now.

Consider that MongoDB 5.0 introduced very appealing capabilities like time-series collections and resharding, which allows changing the shard key of a collection automatically. Choosing a good shard key during application design, when the initial sharding takes place, is often challenging, and a poorly designed shard key can make or break MongoDB’s performance. Until now, changing it required a manual and cumbersome process. Even taking into account the drawbacks of resharding, like the performance and storage overhead during the process, it is still a very tempting feature that in many situations could be a game changer. Unfortunately, with the lack of trust in MongoDB 5.0 and the new release cadence not having the community’s back, the community often simply cannot benefit from it.
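For reference, resharding in 5.0 is driven by a single admin command. Here is a minimal sketch, assuming an already-sharded collection test.orders (the namespace and the new key are illustrative):

// Change the shard key of an existing sharded collection (MongoDB 5.0+).
// The data is rewritten in the background, with some performance and
// storage overhead while the operation runs.
db.adminCommand({
  reshardCollection: "test.orders",
  key: { customerId: 1 }
})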

Percona waited a long time for 5.0 to feel stable enough to release. It was not until MongoDB 5.0.6 CE came out, almost half a year after 5.0.0, that Percona decided it was safe enough for our users and customers. This sort of third-party oversight is an invaluable asset that open source brings. With companies like Percona standing behind a software release, you get the added benefit of extra verification of your software “for free”.

End-of-life strategy

Looking at the previous chapter, it is not that surprising that adoption of the 5.0 releases is not as impressive as one could expect. As this blog post is being written, the telemetry data gathered by Percona shows:

  • MongoDB 4.4 = 47%
  • MongoDB 4.2 = 17%
  • MongoDB 5.0 = 15%
  • MongoDB 4.0 = 13%

The end-of-life calendar for MongoDB 4.x looks as follows:

  • 4.0 EOL April 2022
  • 4.2 EOL April 2023
  • 4.4 EOL April 2024

Add in the apparent lack of trust in 5.0, and we see a growing trend of adoption of MongoDB 4.4, which gives some “breathing space” until the EOL.

That’s a fair strategy that makes sense, but it limits the value you are getting. What if there was another way that could bring you more benefits?

Here comes Percona Server for MongoDB 6.0

With the introduction of MongoDB 6.0, users got the long-awaited improvements and usability fixes to MongoDB 5.0 that only Atlas customers could taste before. After the EOL of the major versions forces users to upgrade, 6.0 could become their landing zone. This way users benefit from the more advanced features of the new version as well as a later EOL.

A quick look at the features that made it into MongoDB 6.0 shows a range of interesting ones, like:

  • Cluster to cluster sync
  • Queryable encryption
  • Time-series collections improvements
  • Analytics improvements
  • Change streams improvements
  • New aggregation operators
  • Improved search

Obviously, not all of these will make it to MongoDB Community Edition, since some are reserved for MongoDB Enterprise or even Atlas only.
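To give a flavor of the new aggregation operators, 6.0 adds accumulators such as $topN. A minimal sketch, using a hypothetical games collection:

// For each team, keep the three highest scores ($topN is new in 6.0).
db.games.aggregate([
  { $group: {
      _id: "$team",
      topScores: { $topN: { n: 3, sortBy: { score: -1 }, output: "$score" } }
  } }
])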

Even without the features that are unavailable in the Community Edition, the 6.0 release, with its fixes over the unstable 5.0, is a large improvement that’s worth considering in your long-term upgrade strategy.

While updating your Community Edition, it’s worth considering migrating from MongoDB CE to Percona Server for MongoDB. This way you get all the benefits of MongoDB CE 6.0 plus the advantages that Percona brings to the release cycle for the community. With the upcoming release of Percona Server for MongoDB 6.0, the freshly released Percona Backup for MongoDB 2.0, and the support of PBM in Percona Monitoring and Management, the solution becomes complete. With features like an in-memory engine, extensive data-at-rest encryption, hot backups, and LDAP and Kerberos integration on top of what MongoDB Community Edition already provides, PSMDB is a complete solution that Percona is committed to keeping open. Be on the lookout for the announcement of PSMDB 6.0 very soon!

What now?

Over the years, we have seen companies change their licenses, becoming less open source while claiming to be more open in an obvious marketing play. At its core, Percona chooses to stay true to the open source philosophy.

Over the years, Percona experts have meticulously delivered increments of Percona Server for MongoDB based on the same upstream codebase as MongoDB Community Edition. As a drop-in replacement for MongoDB CE, what makes PSMDB so interesting are the enterprise features it adds on top:

  • in-memory storage engine,
  • KMIP support,
  • HashiCorp Vault integration,
  • data-at-rest encryption,
  • audit logging,
  • external LDAP authentication with SASL,
  • hot backups.

These enterprise-grade enhancements were added to Percona Server for MongoDB so that the open source community could benefit from features previously reserved for MongoDB Enterprise customers. With PSMDB 6.0, this is not going to change. Percona is on a mission to provide open database solutions to everyone, everywhere. With this in mind, we are open to your suggestions as to which features matter most to you, our users. Reach out and let us know!

Learn more about Percona Server for MongoDB

Sep 19, 2022

Testing LDAP Authentication and Authorization on Percona Operator for MongoDB

As of Percona Operator for MongoDB 1.12.0, the documentation has instructions on how to configure LDAP authentication and authorization. It already contains an example of how to configure the operator if OpenLDAP is your LDAP server. Here is another example of setting it up, but using Samba as your LDAP server.

To simplify the installation and configuration, I will use Ubuntu 22.04 LTS (Jammy), since the distribution repository contains the packages needed to install Samba and Kubernetes.

This is the current configuration of the test server:

OS: Ubuntu Jammy 22.04 LTS
Hostname: samba.percona.local
IP Address: 192.168.0.101

Setting up Samba

Let’s install the packages needed to run Samba as a primary domain controller (PDC), along with some troubleshooting tools:

$ sudo apt update
$ sudo apt -y upgrade
$ sudo apt -y install samba net-tools winbind ldap-utils

Disable the smbd, winbind, and systemd-resolved services, because we will need to reconfigure Samba as a PDC and DNS resolver. Also remove the current Samba configuration, /etc/samba/smb.conf.

$ sudo systemctl stop smbd
$ sudo systemctl stop systemd-resolved
$ sudo systemctl stop winbind
$ sudo systemctl disable smbd
$ sudo systemctl disable systemd-resolved
$ sudo systemctl disable winbind
$ sudo rm /etc/samba/smb.conf

Delete the symlink at /etc/resolv.conf and replace the content with “nameserver 127.0.0.1” to use Samba’s DNS service:

$ sudo rm -f /etc/resolv.conf
$ echo "nameserver 127.0.0.1" | sudo tee /etc/resolv.conf

Create a domain environment with the following settings:

Realm: PERCONA.LOCAL
Domain: PERCONA
Administrator Password: PerconaLDAPTest2022

$ sudo samba-tool domain provision --realm percona.local --domain percona --adminpass=PerconaLDAPTest2022

Edit /etc/samba/smb.conf and set the DNS forwarder to 8.8.8.8 to resolve other zones. We will also disable mandatory TLS authentication, since the Percona Operator does not support LDAP with TLS at the time of writing this article.

$ cat /etc/samba/smb.conf
# Global parameters
[global]
	dns forwarder = 8.8.8.8
	netbios name = SAMBA
	realm = PERCONA.LOCAL
	server role = active directory domain controller
	workgroup = PERCONA
	ldap server require strong auth = No
[sysvol]
	path = /var/lib/samba/sysvol
	read only = No
[netlogon]
	path = /var/lib/samba/sysvol/percona.local/scripts
	read only = No

Symlink the krb5.conf configuration:

$ sudo ln -s /var/lib/samba/private/krb5.conf /etc

Unmask the samba-ad-dc service and start it. Ensure it starts at boot time:

$ sudo systemctl unmask samba-ad-dc
$ sudo systemctl start samba-ad-dc
$ sudo systemctl enable samba-ad-dc

Check that the Samba services are up and running:

$ sudo netstat -tapn|grep samba
tcp        0      0 0.0.0.0:389             0.0.0.0:*               LISTEN      4376/samba: task[ld 
tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      4406/samba: task[dn 
tcp        0      0 0.0.0.0:636             0.0.0.0:*               LISTEN      4376/samba: task[ld 
tcp        0      0 0.0.0.0:135             0.0.0.0:*               LISTEN      4371/samba: task[rp 
tcp6       0      0 :::389                  :::*                    LISTEN      4376/samba: task[ld 
tcp6       0      0 :::53                   :::*                    LISTEN      4406/samba: task[dn 
tcp6       0      0 :::636                  :::*                    LISTEN      4376/samba: task[ld 
tcp6       0      0 :::135                  :::*                    LISTEN      4371/samba: task[rp 

$ host google.com
google.com has address 172.217.194.101

$ host samba.percona.local
samba.percona.local has address 192.168.0.101

Adding users and groups

Now that Samba is up and running, we can perform user and group management. We will create Samba users and groups and assign users to groups with samba-tool.

$ sudo samba-tool user add dbauser01 --surname=User01 --given-name=Dba --mail-address=dbauser01@percona.local DbaPassword1
$ sudo samba-tool user add devuser01 --surname=User01 --given-name=Dev --mail-address=devuser01@percona.local DevPassword1
$ sudo samba-tool user add searchuser01 --surname=User01 --given-name=Search --mail-address=searchuser01@percona.local SearchPassword1
$ sudo samba-tool group add developers
$ sudo samba-tool group add dbadmins
$ sudo samba-tool group addmembers developers devuser01
$ sudo samba-tool group addmembers dbadmins dbauser01

Use samba-tool again to view the details of the users and groups:

$ sudo samba-tool user show devuser01
dn: CN=Dev User01,CN=Users,DC=percona,DC=local
objectClass: person
objectClass: user
cn: Dev User01
sn: User01
givenName: Dev
name: Dev User01
sAMAccountName: devuser01
mail: devuser01@percona.local
memberOf: CN=developers,CN=Users,DC=percona,DC=local

$ sudo samba-tool group show dbadmins
dn: CN=dbadmins,CN=Users,DC=percona,DC=local
objectClass: group
cn: dbadmins
name: dbadmins
sAMAccountName: dbadmins
member: CN=Dba User01,CN=Users,DC=percona,DC=local

Searching with ldapsearch

Troubleshooting LDAP starts with being able to use the ldapsearch tool to test credentials and filters. Once you are successful with authentication and searching, it’s easier to plug the same or similar parameters into the configuration of the Percona Operator. Here are some examples of useful ldapsearch commands:

1. Logging in as “CN=Dev User01,CN=Users,DC=percona,DC=local”. If authenticated, it returns the DN, first name, last name, mail, and sAMAccountName of that record.

$ ldapsearch -LLL -W -x -H ldap://samba.percona.local -b "CN=Dev User01,CN=Users,DC=percona,DC=local" -D "CN=Dev User01,CN=Users,DC=percona,DC=local" "givenName" "sn" "mail" "sAMAccountName"
Enter LDAP Password:
dn: CN=Dev User01,CN=Users,DC=percona,DC=local
sn: User01
givenName: Dev
sAMAccountName: devuser01
mail: devuser01@percona.local

Essentially, without mapping, you will need to supply the username as the full DN to log in to MongoDB, e.g., mongo -u “CN=Dev User01,CN=Users,DC=percona,DC=local”.

2. Logging in as “CN=Search User01,CN=Users,DC=percona,DC=local” and looking for users in “DC=percona,DC=local” where sAMAccountName is “dbauser01”. If there’s a match, it returns the DN, first name, last name, mail, and sAMAccountName of that record.

$ ldapsearch -LLL -W -x -H ldap://samba.percona.local -b "DC=percona,dc=local" -D "CN=Search User01,CN=Users,DC=percona,DC=local"  "(&(objectClass=person)(sAMAccountName=dbauser01))" "givenName" "sn" "mail" "sAMAccountName"
Enter LDAP Password:
dn: CN=Dba User01,CN=Users,DC=percona,DC=local
sn: User01
givenName: Dba
sAMAccountName: dbauser01
mail: dbauser01@percona.local

With mapping, you can authenticate by specifying sAMAccountName or mail, depending on how the mapping is defined, e.g., mongo -u dbauser01 or mongo -u “dbauser01@percona.local”.

3. Logging in as “CN=Search User01,CN=Users,DC=percona,DC=local”, looking for groups in “DC=percona,dc=local” where “CN=Dev User01,CN=Users,DC=percona,DC=local” is a member. If there’s a match, it will return the DN and common name of the group.

$ ldapsearch -LLL -W -x -H ldap://samba.percona.local -b "DC=percona,dc=local" -D "CN=Search User01,CN=Users,DC=percona,DC=local" "(&(objectClass=group)(member=CN=Dev User01,CN=Users,DC=percona,DC=local))" "cn"
Enter LDAP Password:
dn: CN=developers,CN=Users,DC=percona,DC=local
cn: developers

This type of search is important for enumerating a user’s groups, since we define the user’s privileges based on group membership.

Kubernetes installation and configuration

Now that LDAP authentication and the search filters are working, we are ready to test this with the Percona Operator. Since this is just for testing, we might as well use the same server to deploy Kubernetes. In this example, we will use MicroK8s.

$ sudo snap install microk8s --classic
$ sudo usermod -a -G microk8s $USER
$ sudo chown -f -R $USER ~/.kube
$ newgrp microk8s
$ microk8s status --wait-ready
$ microk8s enable dns
$ microk8s enable hostpath-storage
$ alias kubectl='microk8s kubectl'

Once installed, check that all system pods are running before continuing to the next step:

$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-node-bj9c4                          1/1     Running   0          3m12s
kube-system   coredns-66bcf65bb8-l9hwb                   1/1     Running   0          65s
kube-system   calico-kube-controllers-644d5c79cb-fhhkc   1/1     Running   0          3m11s
kube-system   hostpath-provisioner-85ccc46f96-qmjrq      1/1     Running   0          3m

Deploying the Percona Operator for MongoDB

Now that Kubernetes is running, we can download the Percona Operator for MongoDB. Let’s download version 1.13.0 with git:

$ git clone -b v1.13.0 https://github.com/percona/percona-server-mongodb-operator

Then let’s go to the deploy directory and apply bundle.yaml to install the Percona operator:

$ cd percona-server-mongodb-operator/deploy
$ kubectl apply -f bundle.yaml 
customresourcedefinition.apiextensions.k8s.io/perconaservermongodbs.psmdb.percona.com created
customresourcedefinition.apiextensions.k8s.io/perconaservermongodbbackups.psmdb.percona.com created
customresourcedefinition.apiextensions.k8s.io/perconaservermongodbrestores.psmdb.percona.com created
role.rbac.authorization.k8s.io/percona-server-mongodb-operator created
serviceaccount/percona-server-mongodb-operator created
rolebinding.rbac.authorization.k8s.io/service-account-percona-server-mongodb-operator created
deployment.apps/percona-server-mongodb-operator created

Check if the operator is up and running:

$ kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
percona-server-mongodb-operator-547c499bd8-p8k74   1/1     Running   0          41s

Now that it is running, we need to apply a custom resource to create the MongoDB instances and services. We will use the minimal deployment in cr-minimal.yaml, which is provided in the deploy directory.

$ kubectl apply -f cr-minimal.yaml
perconaservermongodb.psmdb.percona.com/my-cluster-name created

Wait until all pods are created:

$ kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
percona-server-mongodb-operator-547c499bd8-p8k74   1/1     Running   0          5m16s
minimal-cluster-cfg-0                              1/1     Running   0          3m25s
minimal-cluster-rs0-0                              1/1     Running   0          3m24s
minimal-cluster-mongos-0                           1/1     Running   0          3m24s

Setting up roles on the Percona Operator

Now that the MongoDB pods are running, let’s add the groups for role-based mapping. We need to add this configuration on the primary config server; it will be used by mongos and the replica set for authorization when logging in.

First, let’s get the username and password of the admin user:

$ kubectl get secrets
NAME                                     TYPE     DATA   AGE
minimal-cluster                          Opaque   10     4m3s
internal-minimal-cluster-users           Opaque   10     4m3s
minimal-cluster-mongodb-keyfile          Opaque   1      4m3s
minimal-cluster-mongodb-encryption-key   Opaque   1      4m3s

$ kubectl get secrets minimal-cluster -o yaml
apiVersion: v1
data:
  MONGODB_BACKUP_PASSWORD: b2NNNkFjOHdEUU42OUpmYnE=
  MONGODB_BACKUP_USER: YmFja3Vw
  MONGODB_CLUSTER_ADMIN_PASSWORD: aElBWlVyajFkZWF0eEhWSzI=
  MONGODB_CLUSTER_ADMIN_USER: Y2x1c3RlckFkbWlu
  MONGODB_CLUSTER_MONITOR_PASSWORD: V1p6YkFhN1o3T2RkSm5Gbg==
  MONGODB_CLUSTER_MONITOR_USER: Y2x1c3Rlck1vbml0b3I=
  MONGODB_DATABASE_ADMIN_PASSWORD: U0hMR3Y3WlF2SVpxZ1dhcUFh
  MONGODB_DATABASE_ADMIN_USER: ZGF0YWJhc2VBZG1pbg==
  MONGODB_USER_ADMIN_PASSWORD: eW5TZjRzQjkybm5UdjdVdXduTQ==
  MONGODB_USER_ADMIN_USER: dXNlckFkbWlu
kind: Secret
metadata:
  creationTimestamp: "2022-09-15T15:57:42Z"
  name: minimal-cluster
  namespace: default
  resourceVersion: "5673"
  uid: d3f4f678-a3db-4578-b10c-69e8c4410b00
type: Opaque

$ echo `echo "dXNlckFkbWlu"|base64 --decode`
userAdmin
$ echo `echo "eW5TZjRzQjkybm5UdjdVdXduTQ=="|base64 --decode`
ynSf4sB92nnTv7UuwnM

Next, let’s connect to the primary config server:

$ kubectl get services
NAME                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
kubernetes               ClusterIP   10.152.183.1     <none>        443/TCP     22m
minimal-cluster-cfg      ClusterIP   None             <none>        27017/TCP   7m27s
minimal-cluster-rs0      ClusterIP   None             <none>        27017/TCP   7m27s
minimal-cluster-mongos   ClusterIP   10.152.183.220   <none>        27017/TCP   7m27s

$ kubectl run -i --rm --tty percona-client --image=percona/percona-server-mongodb:5.0.11-10 --restart=Never -- bash -il
[mongodb@percona-client /]$ mongo --host minimal-cluster-cfg -u userAdmin -p ynSf4sB92nnTv7UuwnM
Percona Server for MongoDB shell version v5.0.11-10
connecting to: mongodb://minimal-cluster-cfg:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("5f1f7db8-d75f-4658-a579-86b9bbf22471") }
Percona Server for MongoDB server version: v5.0.11-10
cfg:PRIMARY>

From the console, we can create two roles “CN=dbadmins,CN=Users,DC=percona,DC=local” and “CN=developers,CN=Users,DC=percona,DC=local” with their corresponding privileges:

use admin
db.createRole(
 {
   role: "CN=dbadmins,CN=Users,DC=percona,DC=local",
   roles: [ "root"],
   privileges: []
 }
)
db.createRole(
{
  role: "CN=developers,CN=Users,DC=percona,DC=local",
  roles: [
    "readWriteAnyDatabase"
  ],
  privileges: []
}
)

Note that the role names defined here correspond to the Samba groups I created with samba-tool. Also, you will need to create the same roles on the replica set endpoint if you want your LDAP users to have these privileges when connecting to the replica set directly.

Finally, exit the mongo console by typing exit and pressing Enter. Do the same to exit the pod as well.

Applying the LDAP configuration to the replicaset, mongos, and config servers

Now, we can add the LDAP configuration to the config server. Our first test configuration is to supply the full DN when logging in so the configuration will be:

$ cat fulldn-config.yaml
security:
  authorization: "enabled"
  ldap:
    authz:
      queryTemplate: 'DC=percona,DC=local??sub?(&(objectClass=group)(member:={PROVIDED_USER}))'
    servers: "192.168.0.101"
    transportSecurity: none
    bind:
      queryUser: "CN=Search User01,CN=Users,DC=percona,DC=local"
      queryPassword: "SearchPassword1"
setParameter:
  authenticationMechanisms: 'PLAIN,SCRAM-SHA-1,SCRAM-SHA-256'

Next, apply the configuration to the config servers:

$ kubectl create secret generic minimal-cluster-cfg-mongod --from-file=mongod.conf=fulldn-config.yaml

Additionally, if you want to log in to the replica set with LDAP, you can apply the same configuration as well:

$ kubectl create secret generic minimal-cluster-rs0-mongod --from-file=mongod.conf=fulldn-config.yaml

As for mongos, you still need to omit the authorization settings, because those will come from the config server:

$ cat fulldn-mongos-config.yaml 
security:
  ldap:
    servers: "192.168.0.101"
    transportSecurity: none
    bind:
      queryUser: "CN=Search User01,CN=Users,DC=percona,DC=local"
      queryPassword: "SearchPassword1"
setParameter:
  authenticationMechanisms: 'PLAIN,SCRAM-SHA-1,SCRAM-SHA-256'

Then apply the configuration for mongos:

$ kubectl create secret generic minimal-cluster-mongos --from-file=mongos.conf=fulldn-mongos-config.yaml

One by one, the pods will be recreated. Wait until all of them are running again:

$ kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
percona-server-mongodb-operator-547c499bd8-p8k74   1/1     Running   0          24m
minimal-cluster-cfg-0                              1/1     Running   0          4m27s
minimal-cluster-rs0-0                              1/1     Running   0          3m34s
minimal-cluster-mongos-0                           1/1     Running   0          65s

Now you can test authentication in one of the endpoints:

$ kubectl run -i --rm --tty percona-client --image=percona/percona-server-mongodb:5.0.11-10 --restart=Never -- mongo --host minimal-cluster-mongos  -u "CN=Dba User01,CN=Users,DC=percona,DC=local" -p DbaPassword1 --authenticationDatabase '$external' --authenticationMechanism 'PLAIN' --eval "db.runCommand({connectionStatus:1})"

+ exec mongo --host minimal-cluster-mongos -u 'CN=Dba User01,CN=Users,DC=percona,DC=local' -p DbaPassword1 --authenticationDatabase '$external' --authenticationMechanism PLAIN --eval 'db.runCommand({connectionStatus:1})'
Percona Server for MongoDB shell version v5.0.11-10
connecting to: mongodb://minimal-cluster-mongos:27017/?authMechanism=PLAIN&authSource=%24external&compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("7eca812d-ad04-4ae2-8484-3b55dee1a673") }
Percona Server for MongoDB server version: v5.0.11-10
{
    "authInfo" : {
        "authenticatedUsers" : [
            {
                "user" : "CN=Dba User01,CN=Users,DC=percona,DC=local",
                "db" : "$external"
            }
        ],
        "authenticatedUserRoles" : [
            {
                "role" : "CN=dbadmins,CN=Users,DC=percona,DC=local",
                "db" : "admin"
            },
            {
                "role" : "root",
                "db" : "admin"
            }
        ]
    }
}
pod "percona-client" deleted

As you can see above, the user “CN=Dba User01,CN=Users,DC=percona,DC=local” has assumed the root role. You can test the other endpoints using these commands:

$ kubectl run -i --rm --tty percona-client --image=percona/percona-server-mongodb:5.0.11-10 --restart=Never -- mongo --host minimal-cluster-rs0  -u "CN=Dba User01,CN=Users,DC=percona,DC=local" -p DbaPassword1 --authenticationDatabase '$external' --authenticationMechanism 'PLAIN' --eval "db.runCommand({connectionStatus:1})"
$ kubectl run -i --rm --tty percona-client --image=percona/percona-server-mongodb:5.0.11-10 --restart=Never -- mongo --host minimal-cluster-cfg  -u "CN=Dba User01,CN=Users,DC=percona,DC=local" -p DbaPassword1 --authenticationDatabase '$external' --authenticationMechanism 'PLAIN' --eval "db.runCommand({connectionStatus:1})"

Using userToDNMapping to simplify usernames

Obviously, you may not want users to authenticate with the full DN. Perhaps you want users to specify just the first CN. You can use match and substitution mapping for this:

$ cat mapping1-config.yaml 
security:
  authorization: "enabled"
  ldap:
    authz:
      queryTemplate: 'DC=percona,DC=local??sub?(&(objectClass=group)(member:={USER}))'
    servers: "192.168.0.101"
    transportSecurity: none
    bind:
      queryUser: "CN=Search User01,CN=Users,DC=percona,DC=local"
      queryPassword: "SearchPassword1"
    userToDNMapping: >-
      [
        {
          match: "(.+)",
          substitution: "CN={0},CN=users,DC=percona,DC=local"
        }
      ]
setParameter:
  authenticationMechanisms: 'PLAIN,SCRAM-SHA-1,SCRAM-SHA-256'

$ cat mapping1-mongos-config.yaml 
security:
  ldap:
    servers: "192.168.0.101"
    transportSecurity: none
    bind:
      queryUser: "CN=Search User01,CN=Users,DC=percona,DC=local"
      queryPassword: "SearchPassword1"
    userToDNMapping: >-
      [
        {
          match: "(.+)",
          substitution: "CN={0},CN=users,DC=percona,DC=local"
        }
      ]
setParameter:
  authenticationMechanisms: 'PLAIN,SCRAM-SHA-1,SCRAM-SHA-256'

You will need to delete the old configuration and apply the new ones:

$ kubectl delete secret minimal-cluster-cfg-mongod
$ kubectl delete secret minimal-cluster-rs0-mongod
$ kubectl delete secret minimal-cluster-mongos
$ kubectl create secret generic minimal-cluster-cfg-mongod --from-file=mongod.conf=mapping1-config.yaml
$ kubectl create secret generic minimal-cluster-rs0-mongod --from-file=mongod.conf=mapping1-config.yaml
$ kubectl create secret generic minimal-cluster-mongos --from-file=mongos.conf=mapping1-mongos-config.yaml

With userToDNMapping’s match and substitution, you can now specify just the first CN. Once all of the pods have restarted, try logging in with the shorter username:

$ kubectl run -i --rm --tty percona-client --image=percona/percona-server-mongodb:5.0.11-10 --restart=Never -- mongo --host minimal-cluster-mongos  -u "Dba User01" -p DbaPassword1 --authenticationDatabase '$external' --authenticationMechanism 'PLAIN' --eval "db.runCommand({connectionStatus:1})"

Perhaps it still seems awkward to have usernames with spaces, and you would like to log in based on other attributes such as sAMAccountName or mail. You can use an additional LDAP query in userToDNMapping to search for the record based on these properties. Once the record is found, its DN is extracted and used for authentication. In the example below, we will use sAMAccountName as the username:

$ cat mapping2-config.yaml 
security:
  authorization: "enabled"
  ldap:
    authz:
      queryTemplate: 'DC=percona,DC=local??sub?(&(objectClass=group)(member:={USER}))'
    servers: "192.168.0.101"
    transportSecurity: none
    bind:
      queryUser: "CN=Search User01,CN=Users,DC=percona,DC=local"
      queryPassword: "SearchPassword1"
    userToDNMapping: >-
      [
        {
          match: "(.+)",
          ldapQuery: "dc=percona,dc=local??sub?(&(sAMAccountName={0})(objectClass=person))"
        }
      ]
setParameter:
  authenticationMechanisms: 'PLAIN,SCRAM-SHA-1,SCRAM-SHA-256'
$ cat mapping2-mongos-config.yaml 
security:
  ldap:
    servers: "192.168.0.101"
    transportSecurity: none
    bind:
      queryUser: "CN=Search User01,CN=Users,DC=percona,DC=local"
      queryPassword: "SearchPassword1"
    userToDNMapping: >-
      [
        {
          match: "(.+)",
          ldapQuery: "dc=percona,dc=local??sub?(&(sAMAccountName={0})(objectClass=person))"
        }
      ]
setParameter:
  authenticationMechanisms: 'PLAIN,SCRAM-SHA-1,SCRAM-SHA-256'

Again, we will need to delete the old configuration and apply new ones:

$ kubectl delete secret minimal-cluster-cfg-mongod
$ kubectl delete secret minimal-cluster-rs0-mongod
$ kubectl delete secret minimal-cluster-mongos
$ kubectl create secret generic minimal-cluster-cfg-mongod --from-file=mongod.conf=mapping2-config.yaml
$ kubectl create secret generic minimal-cluster-rs0-mongod --from-file=mongod.conf=mapping2-config.yaml
$ kubectl create secret generic minimal-cluster-mongos --from-file=mongos.conf=mapping2-mongos-config.yaml

Once the pods are recreated, we can authenticate with regular usernames:

$ kubectl run -i --rm --tty percona-client --image=percona/percona-server-mongodb:5.0.11-10 --restart=Never -- mongo --host minimal-cluster-mongos  -u devuser01 -p DevPassword1 --authenticationDatabase '$external' --authenticationMechanism 'PLAIN' --eval "db.runCommand({connectionStatus:1})"

$ kubectl run -i --rm --tty percona-client --image=percona/percona-server-mongodb:5.0.11-10 --restart=Never -- mongo --host minimal-cluster-mongos  -u dbauser01 -p DbaPassword1 --authenticationDatabase '$external' --authenticationMechanism 'PLAIN' --eval "db.runCommand({connectionStatus:1})"

Summary

I hope this article gets you up to speed on setting up LDAP authentication and authorization with Percona Operator for MongoDB.

Jul 07, 2022

MongoDB Index Building on ReplicaSet and Shard Cluster

We all know how important it is to have proper indexes in a database so that it can do its job effectively. We use indexing in our daily lives to organize our tasks; without an index, all the tasks would still get done, but in a considerably longer time.

The basic working of an index

Imagine that we have tons of information, we want to look at something very particular, and we don’t know where it is. We are going to spend a lot of time finding that particular piece of data.

If only we had some kind of information about all the pieces of data, the job would finish very quickly, because we would know where to look without spending time searching each and every record for one particular piece of data.

Indexes are special data structures that store just enough information about the records to traverse directly to a particular piece of data. Indexes can be created in ascending or descending order to support efficient equality matches and range-based query operations.

Index building strategy and consideration

When we think of building an index, many aspects have to be considered: which keys in the data set are frequently queried, cardinality, the write ratio on that collection, free memory, and storage.

If there are no indexes on a collection, MongoDB will do a full collection scan every time any type of query is performed, which could mean scanning millions of records. This not only slows down that operation but also increases the wait time for other operations.
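You can see this for yourself with explain(). A quick sketch, using a hypothetical users collection:

// Without an index on "email", the winning plan is a COLLSCAN
// (every document in the collection is examined).
db.users.find({ email: "jane@example.com" }).explain("executionStats")

// After creating the index, the same query uses an IXSCAN.
db.users.createIndex({ email: 1 })
db.users.find({ email: "jane@example.com" }).explain("executionStats")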

With the createIndexes command, we can also create multiple indexes on the same collection at the same time, saving lots of the time that would otherwise be spent scanning the collection once per index.
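A minimal sketch of that, with an illustrative products collection and keys:

// Build two indexes on "products" in a single pass over the collection.
db.runCommand({
  createIndexes: "products",
  indexes: [
    { key: { sku: 1 },    name: "sku_1" },
    { key: { price: -1 }, name: "price_-1" }
  ]
})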

Limitations

It is very important to have enough memory to accommodate the working set, though it is not necessary for all indexes to fit in RAM.

An index key must be less than 1024 bytes up to v4.0. Starting with v4.2 (with fcv 4.2), this limit is removed.

The same goes for the index name: it can be up to 127 bytes in a db with fcv 4.0 and below. This limit is removed with db v4.2 and fcv 4.2.

No more than 64 indexes can be created on any single collection.

Index types in MongoDB

Before seeing various index types, let’s see what the index name looks like.

The default name for an index is the concatenation of the indexed keys and each key’s direction in the index (i.e., 1 or -1), using underscores as a separator. For example, an index created on { mobile : 1, points: -1 } has the name mobile_1_points_-1.
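You can confirm the generated name with getIndexes(). A quick sketch, assuming the index above exists on a products collection:

// Lists all indexes with their auto-generated names.
db.products.getIndexes()
// [ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_" },
//   { "v" : 2, "key" : { "mobile" : 1, "points" : -1 }, "name" : "mobile_1_points_-1" } ]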

We can also create a custom, more human-readable name:

db.products.createIndex({ mobile: 1, points: -1 }, { name: "query for rewards points" })

Index type

MongoDB provides various types of indexes to support various data and queries.

Single field index: In a single-field index, an index is created on a single field of a document. MongoDB can traverse the index in either direction, regardless of the sort order specified when it was created.

Syntax:

db.collection.createIndex({"<fieldName>" : <1 or -1>})

Here, 1 means the field is indexed in ascending order and -1 in descending order.

Example:

db.inventory.createIndex({productId:1});


Compound index: In a compound index, we can create indexes on multiple fields. The order of fields listed in a compound index has significance. For instance, if a compound index consists of { userid: 1, score: -1 }, the index sorts first by userid and then, within each userid value, sorts by score.

Syntax:

db.collection.createIndex({ <field1>: <1/–1>, <field2>: <1/–1>, … })

Example:

db.students.createIndex({ userid: 1, score: -1 })


Multikey index: MongoDB uses multikey indexes to index content stored in arrays. When we create an index on a field that contains an array value, MongoDB automatically creates a separate index entry for every element of the array. We do not need to specify the multikey type explicitly; MongoDB automatically decides whether to create a multikey index if the indexed field contains an array value.

Syntax:

db.collection.createIndex({ <field1>: <1/–1>})

Example:

db.students.createIndex({ "addr.zip":1})


Geospatial index: MongoDB provides two special indexes: 2d indexes that use planar geometry when returning results and 2dsphere indexes that use spherical geometry to return results.

Syntax:

db.collection.createIndex({ <location field> : "2dsphere" })

*where the <location field> is a field whose value is either a GeoJSON object or a legacy coordinate pair.

Example:

db.places.createIndex({ loc : "2dsphere" })


Text index: With the text index type, MongoDB supports searching for string content in a collection. A collection can only have one text search index, but that index can cover multiple fields.

Syntax:

db.collection.createIndex({ <field1>: "text" })

Example:

db.reviews.createIndex({ comments: "text" })


Hash index: MongoDB stores the hash value of the indexed field in the case of a hashed index. This type of index is mainly used where we want an even data distribution, e.g., in a sharded cluster environment.

Syntax:

db.collection.createIndex({ _id: "hashed"  })

From version 4.4 onwards, compound hashed indexes are supported.
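For example (the collection and fields are illustrative):

// Compound index with one hashed field, supported from v4.4 onwards.
db.orders.createIndex({ customerId: 1, orderDate: "hashed" })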

Properties

Unique indexes: When specified, MongoDB rejects duplicate values for the indexed field. It will not allow inserting another document containing the same key-value pair for the indexed field.

> db.cust_details.createIndex({Cust_id:1},{unique:true})
{
	"createdCollectionAutomatically" : true,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}

> db.cust_details.insert({"Cust_id":"39772","Batch":"342"})
WriteResult({ "nInserted" : 1 })

> db.cust_details.insert({"Cust_id":"39772","Batch":"452"})
WriteResult({
	"nInserted" : 0,
	"writeError" : {
		"code" : 11000,
		"errmsg" : "E11000 duplicate key error collection: student.cust_details index: Cust_id_1 dup key: { Cust_id: \"39772\" }"
	}
})


Partial indexes: Partial indexes only index the documents that match the filter criteria.

db.restaurants.createIndex({ cuisine: 1, name: 1 },{ partialFilterExpression: { rating: { $gt: 5 } } })
{
	"createdCollectionAutomatically" : true,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}


TTL indexes: TTL indexes are special single-field indexes that can be used to automatically delete documents from a collection after a certain period of time.

db.eventlog.createIndex({ "lastModifiedDate": 1 }, { expireAfterSeconds: 3600 })

lastModifiedDate_1


Sparse indexes: Sparse indexes only contain entries for documents that have the indexed field, even if the index field contains a null value.

db.addresses.createIndex({ "email": 1 }, { sparse: true })

email_1


Hidden indexes: Hidden indexes are not visible to the query planner and cannot be used to support a query. Apart from being hidden from the planner, hidden indexes behave like unhidden indexes.

To create a new hidden index:

db.addresses.createIndex({ pincode: 1 },{ hidden: true });

To change an existing index into a hidden one (works only with db having fcv 4.4 or greater):

db.addresses.hideIndex({ pincode: 1 }); // Specify the index key specification document
or
db.addresses.hideIndex( "pincode_1" );  // Specify the index name

To unhide any hidden index:

Either the index name or the key specification can be used to unhide the index.

db.addresses.unhideIndex({ pincode: 1 }); // Specify the index key specification document
or
db.addresses.unhideIndex( "pincode_1" );  // Specify the index name

Rolling index builds on replica sets

Starting with MongoDB 4.4, index builds happen simultaneously on all data-bearing nodes. For workloads that cannot tolerate the performance impact of an index build, we can follow a rolling index build strategy.

**NOTE**

Unique indexes

To create unique indexes using the following procedure, you must stop all writes to the collection during this procedure.

If you cannot stop all writes to the collection during this procedure, do not use this rolling procedure. Instead, build your unique index on the collection by issuing db.collection.createIndex() on the primary of the replica set.

Oplog size

Ensure that your oplog is large enough to permit the indexing or re-indexing operation to complete without the node falling so far behind that it cannot catch up.
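A quick way to check the oplog window from the mongo shell before starting:

// Prints the configured oplog size, the space used, and the time range
// of operations the oplog currently covers.
rs.printReplicationInfo()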

Procedure

1. Stop one secondary and restart as a standalone on a different port number.

In this process, we stop one secondary node at a time, comment out the replication parameters in its configuration file, and set disableLogicalSessionCacheRefresh to true under the setParameter section.

Example

net:
   bindIp: localhost,<hostname(s)|ip address(es)>
   port: 27217
#   port: 27017
#replication:
#   replSetName: myRepl
setParameter:
   disableLogicalSessionCacheRefresh: true

We only need to make the above changes; the rest remains the same.

Once the above changes are done, save the file and restart the process:

mongod --config <path/To/ConfigFile>

OR

sudo systemctl start mongod

Now, the mongod process will start on port 27217 in standalone mode.

2. Build the index

Connect to the mongod instance on port 27217, then switch to the desired database and create the index.

Example:

mongo --port 27217 -u 'username' --authenticationDatabase admin

> use student
switched to db student

> db.studentData.createIndex( { StudentID: 1 } );
{
	"createdCollectionAutomatically" : true,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}

3. Restart the mongod process as a replica set member

After the desired index build completes, we can add the node back as a replica set member.

Undo the configuration file change made in step one above. Restart the mongod process with the original configuration file.

net:
   bindIp: localhost,<hostname(s)|ip address(es)>
   port: 27017
replication:
   replSetName: myRepl

After saving the configuration file, restart the process and let it rejoin as a secondary:

mongod --config <path/To/ConfigFile>

OR

sudo systemctl start mongod

4. Repeat the above procedure for the remaining secondaries

Once the node becomes secondary again and there is no replication lag, repeat the procedure one node at a time:

  1. Stop one secondary and restart as a standalone.
  2. Build the index.
  3. Restart the mongod process as a replica set member.

5. Index build on primary

Once the index build finishes on all the secondary nodes, use the same process as above to create the index on the last remaining node.

  1. Connect to the primary node and issue rs.stepDown(). Once it successfully steps down, it becomes a secondary and a new primary is elected. Then follow steps one through three to build the index:
  2. Stop secondary node and restart as a standalone.
  3. Build the index.
  4. Restart the mongod process as a replica set member.

Rolling index builds on sharded clusters

Starting with MongoDB 4.4, index builds happen simultaneously on all data-bearing nodes. For workloads that cannot tolerate the performance impact of an index build, we can follow a rolling index build strategy.

**NOTE**

Unique indexes

To create unique indexes using the following procedure, you must stop all writes to the collection during this procedure.

If you cannot stop all writes to the collection during this procedure, do not use this rolling procedure. Instead, build your unique index on the collection by issuing db.collection.createIndex() on the primary of the replica set.

Oplog size

Ensure that your oplog is large enough to permit the indexing or re-indexing operation to complete without the node falling so far behind that it cannot catch up.

Procedure

1. Stop the balancer

In order to create an index in a rolling fashion on a sharded cluster, it is necessary to stop the balancer so that we do not end up with an inconsistent index.

Connect to a mongos instance and run sh.stopBalancer() to disable the balancer, as shown below.

If there is any active chunk migration, the balancer will stop only after the ongoing migration completes.
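From a mongos session:

// Disable the balancer before starting the rolling index build.
sh.stopBalancer()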

We can check whether the balancer is stopped with the below command:

sh.getBalancerState()

If the balancer is stopped, the output will be false.

2. Determine the distribution of the collection

In order to build indexes in a rolling fashion, it is necessary to know which shards hold the collection’s chunks.

Connect to one of the mongos instances and refresh the cache so that we get fresh distribution information for the collection we want to index.

Example:

We want to create an index on the studentData collection in the students database.

We will run the below commands to get fresh distribution information for that collection:

db.adminCommand( { flushRouterConfig: "students.studentData" } );

db.studentData.getShardDistribution();

We will get output showing the shards that contain the collection:

Shard shardA at shardA/s1-mongo1.net:27018,s1-mongo2.net:27018,s1-mongo3.net:27018
data : 1KiB docs : 50 chunks : 1
estimated data per chunk : 1KiB
estimated docs per chunk : 50
Shard shardC at shardC/s3-mongo1.net:27018,s3-mongo2.net:27018,s3-mongo3.net:27018
data : 1KiB docs : 50 chunks : 1
estimated data per chunk : 1KiB
estimated docs per chunk : 50
Totals
data : 3KiB docs : 100 chunks : 2
Shard shardA contains 50% data, 50% docs in cluster, avg obj size on shard : 40B
Shard shardC contains 50% data, 50% docs in cluster, avg obj size on shard : 40B

From the above output, we can see that students.studentData exists on shardA and shardC, so we need to build the index on both of those shards.

3. Build indexes on the shards that contain collection chunks

Follow the procedure below on each shard that contains chunks of the collection.

3.1. Stop one secondary and restart as a standalone

For the identified shard, stop one of the secondary nodes and make the following changes:

  • Change the port number to a different port
  • Comment out replication parameters
  • Comment out sharding parameters
  • Under the setParameter section, add skipShardingConfigurationChecks: true and disableLogicalSessionCacheRefresh: true

Example

net:
   bindIp: localhost,<hostname(s)|ip address(es)>
   port: 27218
#   port: 27018
#replication:
#   replSetName: shardA
#sharding:
#   clusterRole: shardsvr
setParameter:
   skipShardingConfigurationChecks: true
   disableLogicalSessionCacheRefresh: true

After saving the configuration, restart the process:

mongod --config <path/To/ConfigFile>

OR

sudo systemctl start mongod


3.2. Build the index

Connect to the mongod instance running in standalone mode and start the index build.

Here, we are building an index on the StudentID field of the students collection, in ascending order:

> db.students.createIndex( { StudentID: 1 } )
{
	"createdCollectionAutomatically" : true,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}

3.3. Restart the MongoDB process as a replica set node

Once the index build activity is finished, shut down the instance and restart it with the original configuration, removing the parameters skipShardingConfigurationChecks: true and disableLogicalSessionCacheRefresh: true:

net:
   bindIp: localhost,<hostname(s)|ip address(es)>
   port: 27018
replication:
   replSetName: shardA
sharding:
   clusterRole: shardsvr

After saving the configuration, restart the process:

mongod --config <path/To/ConfigFile>

OR

sudo systemctl start mongod


3.4. Repeat the procedure for the remaining secondaries of the shard

Once the node on which the index was built has been added back to the replica set and is in sync with the other nodes, repeat the above process from 3.1 to 3.3 on the remaining nodes:

3.1. Stop one secondary and restart as a standalone

3.2. Build the index

3.3. Restart the MongoDB process as replicaset node

3.5. Index build on primary

Once the index build finishes on all the secondary nodes, use the same process as above to create the index on the last remaining node.

  1. Connect to the primary node and issue rs.stepDown(). Once it successfully steps down, it becomes a secondary and a new primary is elected. Then follow steps one through three to build the index:
  2. Stop the secondary node and restart it as a standalone
  3. Build the index
  4. Restart the process mongod as a replica set member

4. Repeat for the other affected shards

Once the index build is finished on one of the identified shards, start the process outlined in step three on the next identified shard.

5. Restart the balancer

Once we are done building the index on all identified shards, we can start the balancer again.

Connect to a mongos instance in the sharded cluster and run sh.startBalancer():

sh.startBalancer()

Conclusion

Picking the right key based on the access pattern and having one good index is better than having multiple bad indexes. So, choose your indexes wisely.

There are also other interesting blogs on https://www.percona.com/blog/ which might be helpful to you.

I also recommend trying Percona Server for MongoDB, which provides MongoDB enterprise-grade features without any license fees (as it is free). You can learn more about it in the blog post MongoDB: Why Pay for Enterprise When Open Source Has You Covered?

Percona also offers other great products for MongoDB, like Percona Backup for MongoDB and Percona Operator for MongoDB, as well as for other technologies and tools: MySQL Software, PostgreSQL Distribution, Percona Operators, and Monitoring & Management.

Jun 29, 2022

Window Functions in MongoDB 5.0

I have already presented some of the new features available in MongoDB 5.0 in previous posts: resharding and time series collections. Please have a look if you missed them:

MongoDB 5.0 Time Series Collections

Resharding in MongoDB 5.0

In this article, I would like to present another new feature: window functions.

Window functions are quite popular on relational databases; they let you run a window across sorted documents, producing calculations at each step of the window. Typical use cases are calculating rolling averages, correlation scores, or cumulative totals. You can achieve the same results even with older versions of MongoDB or with databases where window functions are not available, but this comes at the cost of more complexity, because multiple queries are usually required and temporary data needs to be saved somewhere as well.

Instead, window functions let you run a single query and get the expected results in a more efficient and elegant way.

Let’s see how the feature works on MongoDB 5.0.

The window functions

A new aggregation stage, $setWindowFields, is available in MongoDB 5.0. This is the stage that provides the window function capability.

The following is the syntax of the stage:

{
  $setWindowFields: {
    partitionBy: <expression>,
    sortBy: { 
      <sort field 1>: <sort order>,
      <sort field 2>: <sort order>,
      ...,
      <sort field n>: <sort order>
    },
    output: {
      <output field 1>: {
        <window operator>: <window operator parameters>,
        window: { 
          documents: [ <lower boundary>, <upper boundary> ],
          range: [ <lower boundary>, <upper boundary> ],
          unit: <time unit>
        }
      },
      <output field 2>: { ... }, 
      ...
      <output field n>: { ... }
    }
  }
}

  • partitionBy (optional): an expression to group the documents. If omitted, all the documents are grouped into a single partition by default
  • sortBy (required in some cases): sorts the documents. Uses the $sort syntax
  • output (required): specifies the fields to append to the documents in the result set. Basically, this is the parameter that provides the result of the window function
  • window (optional): defines the inclusive window boundaries and how the boundaries should be used to calculate the window function’s result

Well, the definitions may look cryptic, but a couple of simple examples will clarify how you can use them.

The test dataset

I have a Percona Server for MongoDB 5.0 instance running, and I got some public data about COVID-19 infections, hospitalizations, and other information from Italy. The data is available on a per-day and per-region basis at the following link: https://github.com/pcm-dpc/COVID-19/tree/master/dati-regioni.

I loaded just a few months of data spanning 2021 and 2022. The data is labeled in Italian, so I created a similar, reduced collection just for the needs of this article.

Here is a sample of the documents:

> db.covid.find({"region":"Lombardia"}).sort({"date":1}).limit(5)
{ "_id" : ObjectId("62ab5f7d017d030e4cb314e9"), "region" : "Lombardia", "total_cases" : 884125, "date" : ISODate("2021-10-01T15:00:00Z") }
{ "_id" : ObjectId("62ab5f7d017d030e4cb314fe"), "region" : "Lombardia", "total_cases" : 884486, "date" : ISODate("2021-10-02T15:00:00Z") }
{ "_id" : ObjectId("62ab5f7d017d030e4cb31516"), "region" : "Lombardia", "total_cases" : 884814, "date" : ISODate("2021-10-03T15:00:00Z") }
{ "_id" : ObjectId("62ab5f7d017d030e4cb31529"), "region" : "Lombardia", "total_cases" : 884920, "date" : ISODate("2021-10-04T15:00:00Z") }
{ "_id" : ObjectId("62ab5f7d017d030e4cb3153d"), "region" : "Lombardia", "total_cases" : 885208, "date" : ISODate("2021-10-05T15:00:00Z") }

Each document contains the daily number of total COVID infections from the beginning of the pandemic for a specific Italian region.

Calculate daily new cases

Let’s create our first window function.

Since the collection only contains the total number of cases, we would like to calculate the number of new cases per day. This way we can understand whether the pandemic is getting worse or improving.

You can achieve that by issuing the following aggregation pipeline:

> db.covid.aggregate( [
{ $setWindowFields: {
    partitionBy : "$region",
    sortBy: { date: 1 },
    output: {
      previous: {
        $push: "$total_cases",
        window: {
          range: [-1, -1],
          unit: "day"
        }
      } 
    }
  }
}
,
{ $unwind:"$previous"},
{ $addFields: {
    new_cases: {
      $subtract: ["$total_cases","$previous"]
    }
  }
},
{ $match: { "region": "Lombardia" } },
{ $project: { _id:0, region:1, date:1, new_cases: 1}  }
] )
{ "region" : "Lombardia", "date" : ISODate("2021-10-02T15:00:00Z"), "new_cases" : 361 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-03T15:00:00Z"), "new_cases" : 328 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-04T15:00:00Z"), "new_cases" : 106 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-05T15:00:00Z"), "new_cases" : 288 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-06T15:00:00Z"), "new_cases" : 449 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-07T15:00:00Z"), "new_cases" : 295 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-08T15:00:00Z"), "new_cases" : 293 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-09T15:00:00Z"), "new_cases" : 284 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-10T15:00:00Z"), "new_cases" : 278 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-11T15:00:00Z"), "new_cases" : 87 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-12T15:00:00Z"), "new_cases" : 306 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-13T15:00:00Z"), "new_cases" : 307 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-14T15:00:00Z"), "new_cases" : 273 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-15T15:00:00Z"), "new_cases" : 288 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-16T15:00:00Z"), "new_cases" : 432 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-17T15:00:00Z"), "new_cases" : 297 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-18T15:00:00Z"), "new_cases" : 112 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-19T15:00:00Z"), "new_cases" : 412 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-20T15:00:00Z"), "new_cases" : 457 }
{ "region" : "Lombardia", "date" : ISODate("2021-10-21T15:00:00Z"), "new_cases" : 383 }

 

The pipeline also contains stages to make the output more readable; let's focus on the $setWindowFields stage anyway.

In the first stage, we define the window function to create, for each document, a new field containing the total cases from the previous day. We named the field previous.

The following stages then use this information to calculate the difference between the total cases of "today" and "yesterday", which gives us the daily increase.

Take a look at how the window function has been created. We used $push to fill the new field with the value of total_cases. In the window document, we defined the range as [-1, -1]. These numbers represent the lower and upper boundaries of the window, and both correspond to the previous day (-1) relative to the current document, so the window spans exactly one document: yesterday. The sortBy field is essential here because it tells MongoDB in which order to consider the documents in the window; the [-1, -1] trick returns yesterday's data only because the documents are properly sorted by date.
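As a side note, MongoDB 5.0 also offers the $shift window operator, which reads a value from a document at a fixed position relative to the current one. Here is a minimal sketch of the same "previous value" lookup using $shift; note that $shift counts documents rather than days, so it assumes the daily data has no gaps:

> db.covid.aggregate( [
{ $setWindowFields: {
    partitionBy : "$region",
    sortBy: { date: 1 },
    output: {
      previous: {
        $shift: { output: "$total_cases", by: -1, default: null }
      }
    }
  }
} ] )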

Calculate moving average

Let's now calculate the moving average. We'll consider the last week of data to calculate the average number of new cases per day. This parameter was very popular during the peak of the pandemic: it fueled plenty of discussions around forecasts and informed government decisions. That is a simplification, of course; there were other relevant parameters too, but the moving average was one of them.

To calculate the moving average, we need the daily new cases we calculated in the previous example. We can reuse those values in different ways: by adding another $setWindowFields stage to the previous pipeline, by adding the new_cases field to the existing documents, or by creating another collection. For simplicity, I did the latter, using the $out stage:

> db.covid.aggregate( [ { $setWindowFields: { partitionBy : "$region", sortBy: { date: 1 }, output: { previous: { $push: "$total_cases", window: { range: [-1, -1],  unit: "day" } } } } }, { $unwind:"$previous"},  { $addFields: { new_cases: { $subtract: ["$total_cases","$previous"] } } }, { $project: { region:1, date:1, new_cases: 1} }, { $out: "covid_daily"  }  ] )

Now we can calculate the moving average on the covid_daily collection. Let’s do it with the following aggregation:

> db.covid_daily.aggregate([
{ $setWindowFields: {
    partitionBy : "$region",
    sortBy : { date: 1 },
    output: {
      moving_average: {
        $avg: "$new_cases",
        window: {
          range: [-6, 0],
          unit: "day"
        }
      }
    }
  }
},
{ $project: { _id:0  } }
])
{ "region" : "Abruzzo", "date" : ISODate("2021-10-02T15:00:00Z"), "new_cases" : 49, "moving_average" : 49 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-03T15:00:00Z"), "new_cases" : 36, "moving_average" : 42.5 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-04T15:00:00Z"), "new_cases" : 14, "moving_average" : 33 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-05T15:00:00Z"), "new_cases" : 35, "moving_average" : 33.5 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-06T15:00:00Z"), "new_cases" : 61, "moving_average" : 39 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-07T15:00:00Z"), "new_cases" : 54, "moving_average" : 41.5 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-08T15:00:00Z"), "new_cases" : 27, "moving_average" : 39.42857142857143 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-09T15:00:00Z"), "new_cases" : 48, "moving_average" : 39.285714285714285 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-10T15:00:00Z"), "new_cases" : 19, "moving_average" : 36.857142857142854 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-11T15:00:00Z"), "new_cases" : 6, "moving_average" : 35.714285714285715 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-12T15:00:00Z"), "new_cases" : 55, "moving_average" : 38.57142857142857 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-13T15:00:00Z"), "new_cases" : 56, "moving_average" : 37.857142857142854 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-14T15:00:00Z"), "new_cases" : 45, "moving_average" : 36.57142857142857 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-15T15:00:00Z"), "new_cases" : 41, "moving_average" : 38.57142857142857 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-16T15:00:00Z"), "new_cases" : 26, "moving_average" : 35.42857142857143 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-17T15:00:00Z"), "new_cases" : 39, "moving_average" : 38.285714285714285 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-18T15:00:00Z"), "new_cases" : 3, "moving_average" : 37.857142857142854 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-19T15:00:00Z"), "new_cases" : 45, "moving_average" : 36.42857142857143 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-20T15:00:00Z"), "new_cases" : 54, "moving_average" : 36.142857142857146 }
{ "region" : "Abruzzo", "date" : ISODate("2021-10-21T15:00:00Z"), "new_cases" : 72, "moving_average" : 40 }

 

Note that we defined the range boundaries as [-6, 0], so that the window spans the current document plus the previous six days: the last week of data.
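As mentioned earlier, the intermediate collection is optional. Here is a sketch of the whole computation in a single pipeline, where a second $setWindowFields stage operates on the new_cases field produced by the earlier stages:

> db.covid.aggregate( [
{ $setWindowFields: { partitionBy : "$region", sortBy: { date: 1 },
    output: { previous: { $push: "$total_cases", window: { range: [-1, -1], unit: "day" } } } } },
{ $unwind: "$previous" },
{ $addFields: { new_cases: { $subtract: ["$total_cases", "$previous"] } } },
{ $setWindowFields: { partitionBy : "$region", sortBy: { date: 1 },
    output: { moving_average: { $avg: "$new_cases", window: { range: [-6, 0], unit: "day" } } } } },
{ $project: { _id: 0, region: 1, date: 1, new_cases: 1, moving_average: 1 } }
] )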

Notes about window functions

We have used unit: "day" in the window definition, but this optional field can also take other values: year, quarter, month, week, day, hour, minute, second, and millisecond.
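Besides time-based ranges, a window can also be defined by document positions relative to the current document, using documents instead of range. For example, assuming one document per day with no gaps, the weekly window above could be sketched as:

window: { documents: [-6, 0] }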

There are multiple operators that can be used with $setWindowFields: $avg, $count, $first, $last, $max, $min, $derivative, $sum, $rank, and many others you can check in the documentation.

There are a few restrictions on the usage of window functions. Please have a look at the official documentation in case you hit one of them.

Conclusion

The new window functions are a very good feature introduced in MongoDB 5.0. They could make life easier for a lot of developers.

For more details and to check the restrictions, have a look at the following page:

https://www.mongodb.com/docs/manual/reference/operator/aggregation/setWindowFields/

Percona Server for MongoDB 5.0 is a drop-in replacement for MongoDB Community. You can use it for free and you can rely on enterprise-class features like encryption at rest, LDAP authentication, auditing, and many others. You can also rely on all new features of MongoDB Community 5.0, including window functions.

Take a look at Percona Server for MongoDB.

Jun
24
2022
--

Debug Symbols for Percona Server for MongoDB and MongoDB

Both Percona Server for MongoDB and vanilla MongoDB packages do not contain debug symbols by default. This is because the debug symbols package can be up to a 3GB download, depending on the version and target platform. Fortunately, you only need debug symbols in those rare cases when you have a serious enough issue and really want to debug it. So for most users, it is an absolutely reasonable decision not to download gigabytes of debug symbols by default.

This blog post provides pointers to where to get debug symbols for Percona Server for MongoDB or vanilla MongoDB.

Percona Server for MongoDB

Using the corresponding package manager

Percona provides debug symbols packages for Percona Server for MongoDB. It is recommended to install Percona packages from the official Percona repositories using the corresponding package manager for your system (apt on Debian/Ubuntu-based systems, yum on RHEL-based ones).
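For example, once the Percona repository is set up, installing the debug symbols could look like the sketch below. The package names follow the dbg/debuginfo naming convention shown later in this post; please verify the exact package name for your release:

# Debian/Ubuntu
$ sudo apt install percona-server-mongodb-dbg

# RHEL/CentOS
$ sudo yum install percona-server-mongodb-server-debuginfo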

Installing debug symbols manually

You can also download packages from the Percona website and install them manually using dpkg or rpm.

To get debug symbols for Percona Server for MongoDB, go to the downloads page. Then look for the “Percona Server for MongoDB” section and click the “Download X.X Now” button corresponding to the version you are interested in.

On the new page, select the minor release version from the Version: dropdown list and the target platform from the Software: dropdown list. This will reveal a list of available packages for the selected platform. You can search for a dbg or debuginfo package (depending on the target platform) and download it.

In most cases, it is possible to download debug symbols as a separate package or as part of the "All Packages" bundle. For example, on the Percona Server for MongoDB 5.0.9-8 page for Ubuntu 20.04, you can download either a separate percona-server-mongodb-dbg_5.0.9-8.focal_amd64.deb package or the all-packages bundle which contains the debug symbols package: percona-server-mongodb-5.0.9-8-r15a95b4-focal-x86_64-bundle.tar

MongoDB Debug Symbols

There are no debug symbols packages provided by MongoDB Inc., but fortunately, it is possible to download binary tarballs containing debug symbols files from the non-advertised location: https://www.mongodb.org/dl/linux/x86_64 

Be careful – the above link opens a huge list containing thousands of tgz archives created since 2009. This is virtually the full MongoDB history, in a single directory.

The names of those files speak for themselves: each is a combination of architecture, platform, and MongoDB version.

For example, there are two files for MongoDB 5.0.9 on Ubuntu 20.04: mongodb-linux-x86_64-ubuntu2004-5.0.9.tgz and mongodb-linux-x86_64-ubuntu2004-debugsymbols-5.0.9.tgz.

The first of those files contains just the server core binaries:

$ tar -tf mongodb-linux-x86_64-ubuntu2004-5.0.9.tgz
mongodb-linux-x86_64-ubuntu2004-5.0.9/LICENSE-Community.txt
mongodb-linux-x86_64-ubuntu2004-5.0.9/MPL-2
mongodb-linux-x86_64-ubuntu2004-5.0.9/README
mongodb-linux-x86_64-ubuntu2004-5.0.9/THIRD-PARTY-NOTICES
mongodb-linux-x86_64-ubuntu2004-5.0.9/bin/install_compass
mongodb-linux-x86_64-ubuntu2004-5.0.9/bin/mongo
mongodb-linux-x86_64-ubuntu2004-5.0.9/bin/mongod
mongodb-linux-x86_64-ubuntu2004-5.0.9/bin/mongos

The second file contains corresponding debug symbols files:

$ tar -tf mongodb-linux-x86_64-ubuntu2004-debugsymbols-5.0.9.tgz
mongodb-linux-x86_64-ubuntu2004-5.0.9/LICENSE-Community.txt
mongodb-linux-x86_64-ubuntu2004-5.0.9/MPL-2 
mongodb-linux-x86_64-ubuntu2004-5.0.9/README 
mongodb-linux-x86_64-ubuntu2004-5.0.9/THIRD-PARTY-NOTICES 
mongodb-linux-x86_64-ubuntu2004-5.0.9/bin/mongo.debug 
mongodb-linux-x86_64-ubuntu2004-5.0.9/bin/mongod.debug 
mongodb-linux-x86_64-ubuntu2004-5.0.9/bin/mongos.debug

Thus, you can use those symbol files with gdb to analyze a core dump file, if you have one, or to debug a running instance of MongoDB Community Edition.
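As a minimal sketch (all paths here are placeholders), loading the matching symbols against a core dump in gdb could look like this:

$ gdb /path/to/bin/mongod /path/to/core.12345
(gdb) symbol-file /path/to/bin/mongod.debug
(gdb) bt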

Each xxxx.debug file is a debug symbols file for the corresponding xxxx binary. If you accidentally try to debug with a mismatched symbols file, gdb will politely inform you about that:

Reading symbols from ./mongod...
warning: the debug information found in "/home/igor/5.0.9/bin/mongod.debug" does not match "/home/igor/5.0.9/bin/mongod" (CRC mismatch).

(no debugging symbols found)...done.

This can especially happen if you upgrade the binaries package but not the debug symbols.

Conclusion

It is a really rare case when you will need to debug Percona Server for MongoDB or MongoDB, but if you ever do, I hope this debug symbols information saves you a few minutes.

Happy debugging!

Jun
03
2022
--

Migration of a MongoDB Replica Set to a Sharded Cluster

In this blog post, we will discuss how we can migrate from a replica set to a sharded cluster.

Before moving to the migration, let me briefly explain replication and sharding, and why we need to shard a replica set.

Replication: It creates additional copies of the data and allows for automatic failover to another node in case the primary goes down. It also helps scale our reads if the application can tolerate reading data that may not be the latest.

Sharding: It allows horizontal scaling of data writes by partitioning the data across multiple servers using a shard key. Here, we should understand that the shard key is very important for distributing the data evenly across the servers.

Why Do We Need a Sharded Cluster?

We need sharding for the following reasons:

  1. By adding shards, we can reduce the number of operations each shard manages.
  2. It increases the read/write capacity by distributing the reads/writes across multiple servers.
  3. It also gives high availability, as we deploy replicas for the shards, config servers, and multiple MongoS.

A sharded cluster includes two more components: config servers and query routers, i.e., MongoS.

Config Servers: They keep the metadata for the sharded cluster. The metadata comprises the list of chunks on each shard and the ranges that define the chunks. The metadata reflects the state of all the data and its components within the cluster.

Query Routers (MongoS): They cache the metadata and use it to route read and write operations to the respective shards. They also update the cache whenever there are metadata changes in the sharded cluster, such as a chunk split or a shard addition.

Note: Before starting the migration process it’s recommended that you perform a full backup (if you don’t have one already).

The Procedure of Migration:

  1. Initiate at least a three-member replica set for the config servers (another member can be included as a hidden node for backup purposes).
  2. Perform the necessary OS, H/W, and disk-level tuning as per the existing replica set.
  3. Set the appropriate clusterRole for the config servers in the mongod config file.
  4. Create at least two more nodes for the query routers (MongoS).
  5. Set the appropriate configDB parameter in the mongos config file.
  6. Repeat step 2 from above to tune as per the existing replica set.
  7. Apply proper SELinux policies on all the newly configured nodes of the config servers and MongoS.
  8. Add the clusterRole parameter to the existing replica set nodes in a rolling fashion.
  9. Copy all the users from the replica set to any MongoS.
  10. Connect to any MongoS and add the existing replica set as a shard.

Note: Do not enable sharding on any database until the shard key is finalized. Once it is finalized, you can enable sharding.

Detailed Migration Plan:

Here, we are assuming that the replica set has three nodes (one primary and two secondaries).

  1. Create three servers to initiate a 3-member replica set for the Config Servers.

Perform necessary OS, H/W, and disk-level tuning. To know more about it, please visit our blog on Tuning Linux for MongoDB.

  2. Install the same version of Percona Server for MongoDB as the existing replica set from here.
  3. In the config file of the config server mongod, add the parameters clusterRole: configsvr and port: 27019 to start it as a config server on port 27019 (a config sketch follows the SELinux commands below).
  4. If the SELinux policy is enabled, then set the necessary SELinux policies for dbPath, keyFile, and logs as below.
sudo semanage fcontext -a -t mongod_var_lib_t '/dbPath/mongod.*'
sudo chcon -Rv -u system_u -t mongod_var_lib_t '/dbPath/mongod'
sudo restorecon -R -v '/dbPath/mongod'
sudo semanage fcontext -a -t mongod_log_t '/logPath/log.*'
sudo chcon -Rv -u system_u -t mongod_log_t '/logPath/log'
sudo restorecon -R -v '/logPath/log'
sudo semanage port -a -t mongod_port_t -p tcp 27019
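As referenced in step 3, here is a minimal sketch of the relevant mongod.conf sections for a config server; the replica set name and file path are placeholders:

sharding:
  clusterRole: configsvr
replication:
  replSetName: cfgReplSet
net:
  port: 27019
security:
  keyFile: /path/to/keyFile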

Start all the config server mongod instances and connect to any one of them. Initiate the replica set and create a temporary user on it:

> use admin
> rs.initiate()
> db.createUser( { user: "tempUser", pwd: "<password>", roles: [{ role: "root", db: "admin" }] } )

Create a role with the anyResource resource and the anyAction action as well, and assign it to "tempUser":

>db.getSiblingDB("admin").createRole({ "role": "pbmAnyAction",
      "privileges": [
         { "resource": { "anyResource": true },
           "actions": [ "anyAction" ]
         }
      ],
      "roles": []
   });

>db.grantRolesToUser( "tempUser", [{role: "pbmAnyAction", db: "admin"}] )

> rs.add("config_host[2-3]:27019")

Now that our config server replica set is ready, let's move on to deploying the query routers, i.e., MongoS.

  1. Create two instances for the MongoS and tune the OS, H/W, and disk. To do so, follow our blog Tuning Linux for MongoDB or point 1 from the detailed migration plan above.
  2. In the mongos config file, adjust the configDB parameter to include only the non-hidden nodes of the config servers (in this blog post, we have not covered starting hidden config servers); see the sketch below.
  3. If SELinux is enabled, apply the SELinux policies following step 4 above, keep the same keyFile, and start the MongoS on port 27017.
  4. Add the below parameter to mongod.conf on the replica set nodes. Make sure the services are restarted in a rolling fashion, i.e., start with the secondaries, then step down the existing primary and restart it with port 27018.
clusterRole: shardsvr
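For step 2, here is a minimal sketch of the mongos config file; the config server replica set name, hostnames, and keyFile path are placeholders:

sharding:
  configDB: cfgReplSet/cfg_host1:27019,cfg_host2:27019,cfg_host3:27019
net:
  port: 27017
security:
  keyFile: /path/to/keyFile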

Log in to any MongoS, authenticate as "tempUser", and add the existing replica set as a shard:

> sh.addShard( "replicaSetName/<URI of the replica set>") //Provide URI of the replica set

Verify it with:

> sh.status() or db.getSiblingDB("config")['shards'].find()

Connect to the primary of the replica set and copy all the users and roles, authenticating against the MongoS with the temporary user:

> var mongos = new Mongo("mongodb://put MongoS URI string here/admin?authSource=admin") // Provide the URI of the MongoS with tempUser for authentication/authorization

> db.getSiblingDB("admin").system.roles.find().forEach(function(d) { mongos.getDB('admin').getCollection('system.roles').insert(d) });

> db.getSiblingDB("admin").system.users.find().forEach(function(d) { mongos.getDB('admin').getCollection('system.users').insert(d) });

  1. Connect to any MongoS and verify the copied users on it.
  2. Shard the database if the shard key is finalized (we won't cover shard key selection here, as this post focuses on the migration itself).

Shard the database:

>sh.enableSharding("<db>")

Shard the collection with a hash-based shard key:

>sh.shardCollection("<db>.<coll1>", { <shard key field> : "hashed" } )

Shard the collection with a range-based shard key:

>sh.shardCollection("<db>.<coll1>", { <shard key field> : 1, ... } )

Conclusion

Migrating a MongoDB replica set to a sharded cluster is very important for scaling horizontally: it increases the read/write capacity and reduces the number of operations each shard manages.

We encourage you to try our products, like Percona Server for MongoDB, Percona Backup for MongoDB, or Percona Operator for MongoDB. You can also visit our site to learn "Why MongoDB Runs Better with Percona".

Oct
19
2021
--

How to Build Percona Server for MongoDB for Various Operating Systems

Following the series of blog posts started by Evgeniy Patlan, we'll show you how to build Percona Server for MongoDB for various operating systems¹ using Docker on your local Linux machine/build server. In this case, we'll build packages of Percona Server for MongoDB version 4.4.9-10 for CentOS 8 and Debian 11 (bullseye).

This can be useful when you need to test your code changes on different RPM/DEB-based platforms and make sure that everything works as expected in each environment. In our case, this approach is used for building Percona Server for MongoDB packages/binary tarballs for all supported OSes.

Prepare Build Environment

  • Make sure that you have at least 60GB of free disk space
  • Create a "build folder" – the folder where all the build actions will be performed, in our case "/mnt/psmdb-44/test"
  • Make sure that you have installed the package which provides Docker and that the Docker service is up and running

Obtain Build Script of Needed Version²

You need to download the build script of the needed version to the “/mnt/psmdb-44” folder:

cd /mnt/psmdb-44/
wget https://raw.githubusercontent.com/percona/percona-server-mongodb/psmdb-4.4.9-10/percona-packaging/scripts/psmdb_builder.sh -O psmdb_builder.sh

Create Percona Server for MongoDB Source Tarball

  • Please note that for the creation of the source tarball, we use the oldest supported OS, in this case CentOS 7.
docker run -ti -u root -v /mnt/psmdb-44:/mnt/psmdb-44 centos:7 sh -c '
set -o xtrace
cd /mnt/psmdb-44
bash -x ./psmdb_builder.sh --builddir=/mnt/psmdb-44/test --install_deps=1
bash -x ./psmdb_builder.sh --builddir=/mnt/psmdb-44/test --repo=https://github.com/percona/percona-server-mongodb.git \
--branch=release-4.4.9-10 --psm_ver=4.4.9 --psm_release=10 --mongo_tools_tag=100.4.1 --jemalloc_tag=psmdb-3.2.11-3.1 --get_sources=1
'

  • Check that source tarball has been created:
$ ls -la /mnt/psmdb-44/source_tarball/
total 88292
drwxr-xr-x. 2 root root     4096 Oct  1 10:58 .
drwxr-xr-x. 5 root root     4096 Oct  1 10:58 ..
-rw-r--r--. 1 root root 90398894 Oct  1 10:58 percona-server-mongodb-4.4.9-10.tar.gz

Build Percona Server for MongoDB Generic Source RPM/DEB:

Please note that for building the generic source RPM/DEB, we still use the oldest supported RPM/DEB-based OS, in this case CentOS 7 / Ubuntu Xenial (16.04).

  • Build source RPM:
docker run -ti -u root -v /mnt/psmdb-44:/mnt/psmdb-44 centos:7 sh -c '
set -o xtrace
cd /mnt/psmdb-44
bash -x ./psmdb_builder.sh --builddir=/mnt/psmdb-44/test --install_deps=1
bash -x ./psmdb_builder.sh --builddir=/mnt/psmdb-44/test --repo=https://github.com/percona/percona-server-mongodb.git \
--branch=release-4.4.9-10 --psm_ver=4.4.9 --psm_release=10 --mongo_tools_tag=100.4.1 --jemalloc_tag=psmdb-3.2.11-3.1 --build_src_rpm=1
'

  • Build source DEB:
docker run -ti -u root -v /mnt/psmdb-44:/mnt/psmdb-44 ubuntu:xenial sh -c '
set -o xtrace
cd /mnt/psmdb-44
bash -x ./psmdb_builder.sh --builddir=/mnt/psmdb-44/test --install_deps=1
bash -x ./psmdb_builder.sh --builddir=/mnt/psmdb-44/test --repo=https://github.com/percona/percona-server-mongodb.git \
--branch=release-4.4.9-10 --psm_ver=4.4.9 --psm_release=10 --mongo_tools_tag=100.4.1 --jemalloc_tag=psmdb-3.2.11-3.1 --build_src_deb=1
'

  • Check that both SRPM and Source DEB have been created:
$ ls -la /mnt/psmdb-44/srpm/
total 87480
drwxr-xr-x. 2 root root     4096 Oct  1 11:35 .
drwxr-xr-x. 6 root root     4096 Oct  1 11:35 ..
-rw-r--r--. 1 root root 89570312 Oct  1 11:35 percona-server-mongodb-4.4.9-10.generic.src.rpm

$ ls -la /mnt/psmdb-44/source_deb/
total 88312
drwxr-xr-x. 2 root root     4096 Oct  1 11:45 .
drwxr-xr-x. 7 root root     4096 Oct  1 11:45 ..
-rw-r--r--. 1 root root    10724 Oct  1 11:45 percona-server-mongodb_4.4.9-10.debian.tar.xz
-rw-r--r--. 1 root root     1528 Oct  1 11:45 percona-server-mongodb_4.4.9-10.dsc
-rw-r--r--. 1 root root     2075 Oct  1 11:45 percona-server-mongodb_4.4.9-10_source.changes
-rw-r--r--. 1 root root 90398894 Oct  1 11:45 percona-server-mongodb_4.4.9.orig.tar.gz

Build Percona Server for MongoDB RPMs/DEBs:

  • Build RPMs:
docker run -ti -u root -v /mnt/psmdb-44:/mnt/psmdb-44 centos:8 sh -c '
set -o xtrace
cd /mnt/psmdb-44
bash -x ./psmdb_builder.sh --builddir=/mnt/psmdb-44/test --install_deps=1
bash -x ./psmdb_builder.sh --builddir=/mnt/psmdb-44/test --repo=https://github.com/percona/percona-server-mongodb.git \
--branch=release-4.4.9-10 --psm_ver=4.4.9 --psm_release=10 --mongo_tools_tag=100.4.1 --jemalloc_tag=psmdb-3.2.11-3.1 --build_rpm=1
'

  • Build DEBs:
docker run -ti -u root -v /mnt/psmdb-44:/mnt/psmdb-44 debian:bullseye sh -c '
set -o xtrace
cd /mnt/psmdb-44
bash -x ./psmdb_builder.sh --builddir=/mnt/psmdb-44/test --install_deps=1
bash -x ./psmdb_builder.sh --builddir=/mnt/psmdb-44/test --repo=https://github.com/percona/percona-server-mongodb.git \
--branch=release-4.4.9-10 --psm_ver=4.4.9 --psm_release=10 --mongo_tools_tag=100.4.1 --jemalloc_tag=psmdb-3.2.11-3.1 --build_deb=1
'

  • Check that RPMs for Centos 8 and DEBs for Debian 11 have been created:
$  ls -la /mnt/psmdb-44/rpm/
total 1538692
drwxr-xr-x. 2 root root      4096 Oct  1 13:19 .
drwxr-xr-x. 9 root root      4096 Oct  1 13:19 ..
-rw-r--r--. 1 root root      8380 Oct  1 13:19 percona-server-mongodb-4.4.9-10.el8.x86_64.rpm
-rw-r--r--. 1 root root  19603132 Oct  1 13:19 percona-server-mongodb-debugsource-4.4.9-10.el8.x86_64.rpm
-rw-r--r--. 1 root root  16199100 Oct  1 13:19 percona-server-mongodb-mongos-4.4.9-10.el8.x86_64.rpm
-rw-r--r--. 1 root root 382301668 Oct  1 13:19 percona-server-mongodb-mongos-debuginfo-4.4.9-10.el8.x86_64.rpm
-rw-r--r--. 1 root root  37794568 Oct  1 13:19 percona-server-mongodb-server-4.4.9-10.el8.x86_64.rpm
-rw-r--r--. 1 root root 829718252 Oct  1 13:19 percona-server-mongodb-server-debuginfo-4.4.9-10.el8.x86_64.rpm
-rw-r--r--. 1 root root  13310328 Oct  1 13:19 percona-server-mongodb-shell-4.4.9-10.el8.x86_64.rpm
-rw-r--r--. 1 root root 218625728 Oct  1 13:19 percona-server-mongodb-shell-debuginfo-4.4.9-10.el8.x86_64.rpm
-rw-r--r--. 1 root root  30823056 Oct  1 13:19 percona-server-mongodb-tools-4.4.9-10.el8.x86_64.rpm
-rw-r--r--. 1 root root  27196024 Oct  1 13:19 percona-server-mongodb-tools-debuginfo-4.4.9-10.el8.x86_64.rpm

$  ls -la /mnt/psmdb-44/deb/
total 2335288
drwxr-xr-x. 2 root root       4096 Oct  1 13:16 .
drwxr-xr-x. 9 root root       4096 Oct  1 13:16 ..
-rw-r--r--. 1 root root 2301998432 Oct  1 13:16 percona-server-mongodb-dbg_4.4.9-10.bullseye_amd64.deb
-rw-r--r--. 1 root root   14872728 Oct  1 13:16 percona-server-mongodb-mongos_4.4.9-10.bullseye_amd64.deb
-rw-r--r--. 1 root root   35356944 Oct  1 13:16 percona-server-mongodb-server_4.4.9-10.bullseye_amd64.deb
-rw-r--r--. 1 root root   12274928 Oct  1 13:16 percona-server-mongodb-shell_4.4.9-10.bullseye_amd64.deb
-rw-r--r--. 1 root root   26784020 Oct  1 13:16 percona-server-mongodb-tools_4.4.9-10.bullseye_amd64.deb
-rw-r--r--. 1 root root      18548 Oct  4 13:16 percona-server-mongodb_4.4.9-10.bullseye_amd64.deb

Now, the packages are ready to be installed for testing/working on Centos 8 and Debian 11.

As you can see from the above, the process of building packages for various operating systems is quite easy and doesn’t require lots of physical/virtual machines. All you need is the build script and Docker.

Also, as you may have noticed, all the build commands are similar to each other except for the last argument, which defines the action to be performed. Such an approach allows us to unify the build process and script it, so that the last argument can be passed as a parameter to the script. Naturally, all the other arguments can and should also be passed as parameters if you are going to automate the build process.
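As a rough sketch of such automation (the docker images and action flags are taken from the commands above; the loop structure itself is my own illustration, not from the original build tooling):

#!/bin/bash
# Sketch: run the same build steps across several target images.
# Each entry is "<docker image> <build action flag>".
targets=(
  "centos:8 --build_rpm=1"
  "debian:bullseye --build_deb=1"
)
for t in "${targets[@]}"; do
  image=${t%% *}    # first word: the docker image
  action=${t##* }   # last word: the build action flag
  docker run -ti -u root -v /mnt/psmdb-44:/mnt/psmdb-44 "$image" sh -c "
    set -o xtrace
    cd /mnt/psmdb-44
    bash -x ./psmdb_builder.sh --builddir=/mnt/psmdb-44/test --install_deps=1
    bash -x ./psmdb_builder.sh --builddir=/mnt/psmdb-44/test \
      --repo=https://github.com/percona/percona-server-mongodb.git \
      --branch=release-4.4.9-10 --psm_ver=4.4.9 --psm_release=10 \
      --mongo_tools_tag=100.4.1 --jemalloc_tag=psmdb-3.2.11-3.1 $action
  "
done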

¹ Supported operating systems(version psmdb-4.4.9-10):

  • Centos 7
  • Centos 8
  • Ubuntu Xenial(16.04)
  • Ubuntu Bionic(18.04)
  • Ubuntu Focal(20.04)
  • Debian Stretch(9)
  • Debian Buster(10)
  • Debian Bullseye(11)

² In order to build Percona Server for MongoDB of another version, you need to use the build script of the proper version. For example, to build Percona Server for MongoDB version 4.2.7-7:
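cd /mnt/psmdb-44/
wget https://raw.githubusercontent.com/percona/percona-server-mongodb/psmdb-4.2.7-7/percona-packaging/scripts/psmdb_builder.sh -O psmdb_builder.sh

(The branch name here follows the same psmdb-<version> pattern used earlier in this post; verify that the tag exists in the repository before building.)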


Aug
17
2021
--

Percona Server for MongoDB 5.0.2 Release Candidate Is Now Available

We're happy to announce the first release candidate of Percona Server for MongoDB version 5.0.2 (PSMDB). It is now available for download from the Percona website and via the Percona Software Repositories.

Percona Server for MongoDB 5.0.2 is an enhanced, source-available, and highly scalable document-oriented database that is a fully compatible drop-in replacement for MongoDB 5.0.2 Community Edition. It includes all the features of MongoDB 5.0.2 Community Edition, as well as some additional enterprise-grade features.

The most notable features in version 5.0 include the following:

  • Resharding allows you to select a new shard key for a collection and then works in the background to correct any data distribution problems caused by bad shard keys and improve performance.
  • Time Series Collections are aimed at storing sequences of measurements over a period of time. These specialized collections will store data in a highly optimized way that will improve query efficiency, allow data analysis in real-time, and optimize disk usage.
  • Resumable Index Builds means that the index build for a collection continues if a primary node in a replica set is switched to another server or when a server restarts. The build process is saved to disk and resumes from the saved position. This allows DBAs to perform maintenance and not worry about losing the index build in the process.
  • Window operators allow operations on a specified span of documents known as a window. $setWindowFields is a new pipeline stage to operate with these documents.
  • Versioned API allows you to specify which API version your application runs against when communicating with MongoDB. The Versioned API detaches the application's lifecycle from that of the database. As a result, you modify the application only to introduce new features, instead of having to maintain compatibility with each new version of MongoDB.

Additionally, new aggregation operators such as $count, $dateAdd, $dateDiff, $dateSubtract, $sampleRate, and $rand are available with this release.
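For instance, here is a quick sketch of $dateAdd in an aggregation; the collection and field names are made up for illustration:

> db.events.aggregate([
  { $project: { due_date: { $dateAdd: { startDate: "$created_at", unit: "day", amount: 7 } } } }
])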

Note: As with every major release, version 5.0 comes with a significant number of new features and is still being rapidly updated. At this point, we’re making this version available as a “Release Candidate” only and we strongly suggest not to use it for production environments yet. However, we do encourage the use of this version in test and development environments.

We’re also still in the process of integrating support for version 5.0 into our other products. While Percona Backup for MongoDB 1.6.0 has just been released to support this version, some other products still need to be updated and tested.

For example, the Percona Distribution for MongoDB Operator will have PSMDB 5.0 support from version 1.10.0, which is slated to happen in mid-September.

On the Percona Monitoring and Management side, Percona Server for MongoDB 5.0 support is scheduled to be included in version 2.22.0 (currently targeting the end of September).

Because of these factors, we will not release version 5.0 of our Percona Distribution for MongoDB until we’ve updated these products and have gathered enough confidence to remove the “release candidate” label.

Jul
05
2021
--

MongoDB 5.0 Is Coming in Hot! What Do Database Experts Across the Community Think?

If you love using MongoDB databases, you'll want to tune in to the live-stream event 'Percona and Friends React to MongoDB.live' at 11:00 AM EDT on July 15.

Watch or listen as industry experts from Percona, Southbank Software, and Qarbine respond to MongoDB’s conference announcements. The team will consider:

  • New features and other announcements
  • The importance of new MongoDB 5.0 features for applications
  • What this might mean for the Community Edition
  • The impact MongoDB 5.0 will have on users and the Community

This is a live event. So please bring your questions or concerns, and raise your voice to give your thoughts on the latest product news.

Or, if you’re feeling shy, you could just listen in!

Register Today

Our Community-based panel has a wide variety of expertise and experience.

Akira Kurogane

MongoDB Product Owner for Percona’s Enterprise MongoDB product additions and tools

Akira is an expert in MongoDB symptom-to-code defect analysis, diagnostics, and performance. He has helped countless distributed database clients overcome obstacles and adjust to the changing landscape. Since getting his start as a search engine and RDBMS-based developer, Akira describes himself as "All MongoDB, all the time."

Kimberly Wilkins

MongoDB Technical Lead with 20+ years of experience managing and architecting databases

Kimberly has been a DBA, a Principal Engineer, an architect, and has built out and managed expert database teams across multiple data store offerings over her database years. She has worked with MongoDB customers of all sizes in many industries and helped them architect, deploy, troubleshoot, and tune their databases to handle heavy workloads and keep their applications running. She specializes in MongoDB sharding to help customers scale and thrive as their businesses grow in today’s big data world. Kimberly enjoys sharing her experiences at technical conferences in the US and abroad. Why? Because after all, “there is no perfect shard key.”

Guy Harrison

CTO, ProvenDB and Southbank Software 

Author, MongoDB Performance Tuning

Not only is Guy a founder and CTO, he is also an IT professional with experience in a range of disciplines, technologies, and practices. He is probably best known for his longstanding involvement in relational databases (Oracle and MySQL) and for emerging database technologies such as MongoDB and blockchain. Guy is also an expert on performance tuning and has written several books on the subject, including "MongoDB Performance Tuning", "Next Generation Databases", and "MySQL Stored Procedure Programming". He also writes the "MongoDB Matters" column for Database Trends and Applications.

Bill Reynolds

CTO/Co-founder of Qarbine specializing in BI solutions for enterprise investments in NoSQL databases like MongoDB

Bill has led product teams that have integrated with 23 different database APIs across many flavors of data stores, from NoSQL databases such as MongoDB, to pure object-oriented systems, to legacy SQL.

His companies have licensed database and reporting software to most of the Fortune 500 and many others worldwide. For over three years, he has been applying that experience to developing a native MongoDB detailed reporting and analysis suite.

Join Percona and Friends as they react to MongoDB.live!

Register For Free

Jun
21
2021
--

Discover Why MongoDB Runs Better With Percona

In just under a month, MongoDB will host its annual event, MongoDB.live. And just over a month ago, Percona held its annual event, Percona Live.

Despite the naming convention similarity, these events couldn’t be more different!

Percona Live was an open source database software community event with 196 speakers and over 200 presentations. We platformed a huge range of people and companies that use and champion a variety of open source databases and tools. 

Although many people still think of MongoDB as open source, this is incorrect. The Open Source Initiative referred to MongoDB's introduction of the Server Side Public License (SSPL) as a "fauxpen" source license.

In 2019, MongoDB CEO Dev Ittycheria stated in an interview, “MongoDB was built by MongoDB. There was no prior art. So one: it speaks to the technical acumen of the team here. And two: we didn’t open source it to get help from the community, to make the product better… We open sourced as a freemium strategy; to drive adoption.” 

For many people, this is totally contrary to open source values and the practice of open source overall. 

The move away from open source and the community means that MongoDB has become increasingly closed off. Without market alternatives, MongoDB can become a monopoly: they can raise fees without competition and lock in users. Some people believe that this is the intent behind their planned new Quarterly Release Cycle, which will provide quarterly releases only to Atlas customers.

This is where Percona can help. 

We offer a viable and secure drop-in replacement for MongoDB Community with added enterprise-level features, plus market-leading support services and open source tools.

Percona customers are not locked in and enjoy a lower total cost of ownership, with the freedom to move their data at any time, without fees or barriers.

For the next six weeks, we will be focusing on Percona’s MongoDB offering and all the benefits a move to Percona can bring to your business.

Highlights include:

  • Expert webinars on a variety of hot MongoDB topics
  • New market insight and thought leadership
  • In-depth technical blogs addressing key MongoDB pain-points
  • Percona and Friends React to MongoDB.live – a live stream on July 15th where industry experts discuss the news and announcements coming from MongoDB.live

Our first webinar kicks off on June 29th as Percona experts Kimberly Wilkins, Mike Grayson, and Vinicius Grippa present ‘Unlocking the Mystery of MongoDB Shard Key Selection’ and offer advice on the measures to take if things go wrong. Please register now to attend for free.

Keep an eye on our blog and social channels for much more exciting content, insight, and events over the next few weeks. 
