Jan
14
2022
--

DBaaS and the Enterprise

Install a database server. Give the application team an endpoint. Set up backups and monitor in perpetuity. This is a pattern I hear about regularly from DBAs at most of my enterprise clients. Rarely do they get to troubleshoot or work with application teams to tune queries or design schemas. This is what sparks interest in a DBaaS platform among database administrators.

What is DBaaS?

DBaaS stands for “Database as a Service”. When this acronym is thrown out, the first thought is generally a cloud offering such as RDS. While this is a very commonly used service, a DBaaS is really just a managed database platform that offloads much of the operational burden from the DBA team. Tasks handled by the platform include:

  • Installing the database software
  • Configuring the database
  • Setting up backups
  • Managing upgrades
  • Handling failover scenarios

A common misconception is that a DBaaS is limited to the public cloud. As many enterprises already have large data centers and heavy investments in hardware, an on-premise DBaaS can also be quite appealing. Keeping the database in-house is often favored when the hardware and resources are already available. In addition, there are extra compliance and security concerns when looking at a public cloud offering.

DBaaS also represents a difference in mindset. In conventional deployments, systems and architecture are often designed in very exotic ways making automation a challenge. With a DBaaS, automation, standardization, and best practices are the priority. While this can be seen as limiting flexibility, this approach can lead to larger and more robust infrastructures that are much easier to manage and maintain.

Why is DBaaS Appealing?

From a DBA perspective (and being a former DBA myself), I always enjoyed working on more challenging issues. Mundane operations like launching servers and setting up backups make for a less-than-exciting daily work experience. When managing large fleets, these operations make up the majority of the work.

As applications grow more complex and data sets grow rapidly, it is much more interesting to work with the application teams to design and optimize the data tier. Query tuning, schema design, and workflow analysis are much more interesting (and often more beneficial) than the basic setup. DBAs are often skilled at quickly identifying bottlenecks and spotting design flaws before they become problems.

When an enterprise adopts a DBaaS model, this can free up the DBAs to work on more complex problems. They are also able to better engage and understand the applications they are supporting. A common comment I get when discussing complex tickets with clients is: “well, I have no idea what the application is doing, but we have an issue with XYZ”. If this could be replaced with a detailed understanding from the design phase to the production deployment, these discussions would be very different.

From an application development perspective, a DBaaS is appealing because new servers can be launched much faster. Ideally, with development or production deployment options, an application team can have the resources they need ready in minutes rather than days. It greatly speeds up the development life cycle and makes developers much more self-reliant.

DBaaS Options

While this isn’t an exhaustive list, the main options when looking to move to a DBaaS are:

  • Public cloud
    • Amazon RDS, Microsoft Azure SQL, etc
  • Private/Internal cloud
    • Kubernetes (Percona DBaaS), VMWare, etc
  • Custom provisioning/operations on bare-metal

Looking at public cloud options for a DBaaS, security and compliance are generally the first concern. While they are incredibly easy to launch and generally offer some pay-as-you-go options, managing access is a major consideration.

Large enterprises with existing hardware investments often want to explore a private DBaaS. I’ve seen clients work to create their own tooling within their existing infrastructure. While this is a viable option, it can be very time-consuming and require many development cycles. Another alternative is to use an existing DBaaS solution. For example, Percona currently has a DBaaS deployment as part of Percona Monitoring and Management in technical preview. The Percona DBaaS automates PXC deployments and management tasks on Kubernetes through a user-friendly UI.

Finally, a custom deployment is just what it sounds like. I have some clients that manage fleets (1000s of servers) of bare metal servers with heavy automation and custom scripting. To the end-user, it can look just like a normal DBaaS (an endpoint with all operations hidden). On the backend, the DBA team spends significant time just supporting the infrastructure.

How Can Percona help?

Percona works to meet your business where you are. If that is supporting Percona Server for MySQL on bare metal or a fleet of RDS instances, we can help. If your organization is leveraging Kubernetes for the data tier, the Percona Private DBaaS is a great option to standardize and simplify your deployments while following best practices. We can help from the design phase through the entire life cycle. Let us know how we can help!

Jan
14
2022
--

Help Percona Make Better Databases

Hello everyone! The new year is upon us and we are looking at planning some awesome new features, enhancements, and ideas in 2022. One of the things we could really use your help on, however, is providing us with feedback on how we are doing and what we can do to improve our database products. We have put together a few short surveys, one for each of our main products (MySQL, MongoDB, PostgreSQL, Percona Monitoring and Management, and our Kubernetes Operators). Each survey is a quick 5-7 questions and should take no more than a minute or two to complete.

Percona Server for MySQL & PXC https://per.co.na/lnsTqM
Percona Server for MongoDB https://per.co.na/7BGhFe
Percona Distribution for PostgreSQL https://per.co.na/1KfvRt
Percona Monitoring and Management https://per.co.na/JQ7yAt
Percona Operators (Either for PG, MYSQL, or MongoDB) https://per.co.na/Gxwcrj

Note, everyone who fills in a survey is eligible to be part of a drawing for a Percona T-shirt; 10 people will be randomly selected. You don’t have to leave an email when you fill out the survey, but you do need to leave one to be entered into the drawing.

Thank you for helping us out!

Jan
13
2022
--

Toddler wooden chair – cute and worth your trust!

There is nothing better than a classic, trusty wooden chair when you need a place for your child to sit. Made of natural materials, able to take significant weight, and easy to clean with a damp cloth – you can’t go wrong with one.

As a parent, the biggest problem you may come across is finding a reliable toddler wooden chair. You will likely find many options on the market, but it may be confusing to choose which one would suit your kid’s needs.

You don’t need to worry, as we have compiled all the necessary information about toddler wooden chairs and their pros and cons here. First, let us start with what to consider when buying a wooden chair for your toddler.

Things To Consider Before Buying A Wooden Toddler Chair

You may find many stylish and durable toddler wooden chairs, but it’s crucial to keep certain things in your mind before buying one.

• Your kid’s comfort. Make sure that the seat is deep enough to accommodate your toddler. The bottom of the chair must be sturdy to bear heavy pressure on it.

• Durability. The wooden toddler chair should be sturdy and long-lasting so that your child can use it for many years without any problem. A strong base is advisable because the child’s weight will rest on the chair, so avoid chairs made of plastic or metal as they are less durable. A wooden one is best as it can be used for many years.

• Material. Wooden toddler chairs are very durable and comfortable, which is why most parents prefer buying a wooden one rather than other types. There are various kinds of wood available on the market, but go for ones with a veneer finish as it is more durable and long-lasting.

• Price. You may find many wooden toddler chairs within your budget, so do not worry about the price.

These are some tips to consider before buying a wooden toddler chair for your child. A good chair not only gives you moments to cherish with your kid but also helps develop skills like creativity and independence.

Best wooden chairs for your toddler

Do you want to buy a wooden toddler chair for your child? Here are some of the best ones available in the market.

Famobay Bunny Ears Wooden Toddler Chair

Just look how cute this toddler chair is! The backrest shaped like bunny ears is an adorable feature for little ones. It is made of solid, smooth wooden construction, making it long-lasting and sturdy.

Based on its innovative design, you would think that this toddler chair would be much more expensive than any other option on the market. However, that is not the case!

This bunny ears toddler chair is available on Amazon at a very reasonable price. It is solid, durable, and easy to keep clean, making it the best choice for your household.

Famobay Antlers Wooden Toddler Chair

If you like Bunny Ears Toddler Wooden Chair, you will be happy to learn that another similar toddler chair is available. This time its design is based on antlers to make your little one feel like a wild animal.

Like the previous product, this antlers toddler chair is made of natural smooth wood, making it very durable. It can carry a weight of up to 400 lbs, which you can’t say about many other products on the market. There is no doubt that your child can use this chair for many years.

Melissa & Doug Wooden Chairs, Set of 2

Melissa & Doug’s kids’ furniture is the way to go for those wanting a simple and classic design for kids’ room. The playroom chairs set was created with safety in mind – something Melissa & Doug is well-known for. The chairs feature a reinforced, tip-resistant design.

They are made from durable wood, with materials that hold up to 100 pounds. The Melissa & Doug Solid Wood Chairs set makes an excellent gift for kids from 3 to 8 years old.

Toddler Chair CXRYLZ

If you like a simple design but still wish to bring your toddler a little bit of color, take a look at this chair. This little chair is made of wood and will surely look lovely in any kids’ room. It also has round edges, which are much safer for your toddler than sharp ones.

This chair is so cute, you might not want to give it away when your kid grows up! Beautiful colors are not the only thing that makes this chair so unique. It is also made of quality materials which make it durable and long-lasting.

Its compact size will help you fit it almost anywhere in your house, even if you have limited space for kids’ furniture: 25cm/10in legs, 30 x 30cm/12” x 12” seat surface, and 26.5 x 20cm/10.5” x 7.9” backrest. This cutie can hold up to 250 lbs.

HOUCHICS Wooden Toddler Chair

Unique toddler wooden chair! This piece has a curved back to better protect your child from being injured by the prismatic edge of the wooden stool. It is not only for safety reasons but also to give your child a more comfortable sitting position.

This toddler chair has an additional feature of non-slip pads that will help prevent the stool from sliding on the floor surface. You can also adjust its height depending on what you need at the moment.

The HOUCHICS Wooden Toddler Chair is very easy to clean – you can use a damp cloth and soap. A wide handle at the back provides a comfortable grip to lift the toddler chair off the floor.

It’s suitable for children from 3 years of age up to 6-7 years. Be sure that your child will be comfortable sitting on this stool!

iPlay iLearn 10 Inch Kids Solid Hard Wood Animal Chair

Cute animal chairs with vivid, engaging characters are appealing and attractive to children aged 2 and up. Not only will your child be excited to sit on the iPlay iLearn animal chairs, but they will be learning at the same time.

Each chair has a sturdy 10-inch solid wood base designed to prevent tipping when your child leans against it. The triangular structure of the chair legs helps prevent tipping, and anti-slip pads on each leg prevent falls.

The iPlay iLearn chairs can be stacked for easy storage, and they are made of 100% natural smooth hardwood. They are hand-painted with non-toxic paints to ensure complete safety.

This is the perfect gift for toddlers aged 2-4! Parents love it because it is sturdy, safe, and easy to clean; kids love them because of the vibrant colors and animal characters. Among the patterns, you’ll find a giraffe visible above, a frog, and a cow.

Why is it necessary to buy a wooden toddler chair?

As we all know, kids look forward to spending time with their parents and if you can encourage them to spend more time on furniture, then do it as soon as possible. As a parent, you must have noticed that your child prefers being beside you rather than playing alone. It’s a sign that your child wants to spend more time with you, which is absolutely normal.

The best way to encourage them is to buy them a wooden toddler chair because it would allow your child to sit beside you and enjoy being together while reading books or watching exhibits on display cabinets. This can be good practice for your child to be more responsible and understand the concept of time, as it also encourages them to play independently.

Children prefer wooden chairs because they are much more comfortable than other types. They like splashes of color in their surroundings, so if you buy a wooden toddler chair in a vibrant color, your child will love it. A wooden toddler chair is the best way to encourage your child and develop their creative side.

Apart from all these benefits, toddlers learn many things by watching us, so if you want them to become responsible and obedient, teach them how to sit correctly on a wooden toddler chair. They will get an idea about sitting up straight, and it will benefit them in the future.

The post Toddler wooden chair – cute and worth your trust! appeared first on Comfy Bummy.

Jan
13
2022
--

How Patroni Addresses the Problem of the Logical Replication Slot Failover in a PostgreSQL Cluster

Failover of the logical replication slot has always been the pain point of using logical replication in PostgreSQL. This missing capability undermined the use of logical replication and acted as one of its biggest deterrents. The stakes and impact were so high that many organizations had to discard their plans around logical replication, and it affected many plans for migrations to PostgreSQL. It was painful to see that many had to opt for proprietary/vendor-specific solutions instead.

At Percona, we have written about this in the past: Missing Piece: Failover of the Logical Replication Slot.  In that post, we discussed one of the possible approaches to solve this problem, but there was no reliable mechanism to copy the slot information to a Physical standby and maintain it.

The problem, in a nutshell, is that the replication slot is always maintained on the primary node. If there is a switchover/failover that promotes one of the standbys, the new primary won’t have any idea about the replication slot maintained by the previous primary node. This breaks logical replication from the downstream systems, or, if a new slot is created, it becomes unsafe to use.

The good news is that the Patroni developers and maintainers addressed this problem starting with version 2.1.0 and provided a working solution without any invasive methods/extensions. To me, this is work that deserves a big round of applause from the Patroni community, and the intention of this blog post is to make sure a bigger crowd is aware of it.

How to Set it Up

A ready-to-use Patroni package is available from the Percona repository. But you are free to use Patroni from any source.

Basic Configuration

In case you are excited about this and want to try it, the following steps might be helpful.

The entire discussion is about logical replication, so the minimum requirement is to have wal_level set to “logical”. If the existing Patroni configuration has wal_level set to “replica” and you want to use this feature, you can just edit the Patroni configuration.

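As a reference, the relevant part of Patroni’s dynamic configuration (edited through patronictl edit-config) would look roughly like the minimal sketch below; this shows only the setting we care about, your configuration will contain more:

postgresql:
  parameters:
    wal_level: logical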

However, this change requires a PostgreSQL restart. A “Pending restart” flag, marked with an asterisk in the patronictl list output, indicates which nodes still need a restart for the change to take effect.

You may use Patroni’s “switchover” feature to bring the change into effect, because the demoted node goes through a restart as part of the switchover.

If there are any remaining nodes, they can be restarted later.
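
If you want to try it, a rough sketch of the commands involved is below (cluster and member names are placeholders; adjust the configuration file path to your setup):

$ patronictl -c /etc/patroni/patroni.yml switchover <cluster_name>
$ patronictl -c /etc/patroni/patroni.yml restart <cluster_name> <member_name>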

Creating Logical Slots

Now we can add a permanent logical replication slot to PostgreSQL which will be maintained by Patroni.

Edit the patroni configuration:

$ patronictl -c /etc/patroni/patroni.yml edit-config

A slot specification can be added as follows:

…
slots:
  logicreplia:
    database: postgres
    plugin: pgoutput
    type: logical
…

The “slots:” section defines permanent replication slots. These slots will be preserved during switchover/failover. “pgoutput” is the decoding plugin for PostgreSQL logical replication.

Once the change is applied, the logical replication slot will be created on the primary node, which can be verified by querying:

select * from pg_replication_slots;

The output should list the new slot, logicreplia in this example, with the pgoutput plugin and slot_type logical.

Now here is the first level of magic! The same replication slot will be created on the standbys, also. Yes, Patroni does it: Patroni internally copies the replication slot information from the primary to all eligible standby nodes!

We can run the same query against pg_replication_slots on a standby and see the same replication slot reflected on the standby side.

This slot can be used by the subscription by explicitly specifying the slot name while creating the subscription.

CREATE SUBSCRIPTION sub2 CONNECTION '<connection_string>' PUBLICATION <publication_name> WITH (copy_data = true, create_slot=false, enabled=true, slot_name=logicreplia);

Alternatively, an existing subscription can be modified to use the new slot which I generally prefer to do.

For example:

ALTER SUBSCRIPTION name SET (slot_name=logicreplia);

Corresponding PostgreSQL log entries can confirm the slot name change:

2021-12-27 15:56:58.294 UTC [20319] LOG:  logical replication apply worker for subscription "sub2" will restart because the replication slot name was changed
2021-12-27 15:56:58.304 UTC [20353] LOG:  logical replication apply worker for subscription "sub2" has started

From the publisher side, we can confirm the slot usage by checking the active_pid and the advancing LSN for the slot.

The second level of surprise: the replication slot information on all the standby nodes of the Patroni cluster is also advanced as the logical replication progresses on the primary side.

At a higher level, this is exactly what this feature is doing:

  1. Automatically create/copy the replication slot information from the primary node of the Patroni cluster to all eligible standby nodes.
  2. Automatically advance the LSN on the slots of the standby nodes as the LSN advances on the corresponding slot on the primary.

After a Switchover/Failover

In the event of a switchover or failover, we do not lose any slot information because it is already maintained on the standby nodes.

After the switchover, the old primary becomes a standby and one of the former standbys takes over as the new primary.

Now, any downstream logical replica can be repointed to the new primary.

postgres=# ALTER SUBSCRIPTION sub2 CONNECTION 'host=192.168.50.10 port=5432 dbname=postgres user=postgres password=vagrant';                                                                                                                                                  
ALTER SUBSCRIPTION

This continues the replication, and the pg_replication_slots information can confirm it.

Summary + Key Points

The logical replication slot is conceptually possible only on the primary instance because that is where the logical decoding happens. Now, with this improvement, Patroni makes sure that the slot information is also available on the standbys, so they are ready to take over the connection from the subscriber.

  • This solution requires PostgreSQL 11 or above because it uses the pg_replication_slot_advance() function, available from PostgreSQL 11 onwards, to advance the slot.
  • The downstream connection can use HAProxy so that the connection is automatically routed to the primary (not covered in this post). No modification to PostgreSQL code or creation of any extension is required.
  • The copying of the slot happens over the PostgreSQL protocol (libpq) rather than any OS-specific tools/methods. Patroni uses rewind or superuser credentials and reads the slot information with the pg_read_binary_file() function (see the source code reference).
  • Once the logical slot is created on the replica side, Patroni uses pg_replication_slot_advance() to move the slot forward.
  • The permanent slot information is added to the DCS and continuously maintained by the primary instance of the Patroni cluster. A new DCS key named “status” is introduced and supported across all DCS options (ZooKeeper, etcd, Consul, etc.).
  • hot_standby_feedback must be enabled on all standby nodes where the logical replication slot needs to be maintained.
  • The Patroni parameter postgresql.use_slots must be enabled to make sure that every standby node uses a slot on the primary node.
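
For reference, a minimal sketch of how these last two settings appear in the Patroni configuration (again edited via patronictl edit-config); treat it as an illustration rather than a complete configuration:

postgresql:
  use_slots: true
  parameters:
    hot_standby_feedback: "on"
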
Jan
11
2022
--

Creating a Standby Cluster With the Percona Distribution for PostgreSQL Operator

A customer recently asked if our Percona Distribution for PostgreSQL Operator supports the deployment of a standby cluster, which they need as part of their Disaster Recovery (DR) strategy. The answer is yes – as long as you are making use of an object storage system for backups, such as AWS S3 or GCP Cloud Storage buckets, that can be accessed by the standby cluster. In a nutshell, it works like this:

  • The primary cluster is configured with pgBackRest to take backups and store them alongside archived WAL files in a remote repository;
  • The standby cluster is built from one of these backups and it is kept in sync with the primary cluster by consuming the WAL files that are copied from the remote repository.

Note that the primary node in the standby cluster is not a streaming replica of any of the nodes in the primary cluster; it relies on archived WAL files to replicate events. For this reason, this approach cannot be used as a High Availability (HA) solution. Even though the primary use of a standby cluster in this context is DR, it can also be employed for migrations.

So, how can we create a standby cluster using the Percona operator? We will show you next. But first, let’s create a primary cluster for our example.

Creating a Primary PostgreSQL Cluster Using the Percona Operator

You will find a detailed procedure on how to deploy a PostgreSQL cluster using the Percona operator in our online documentation. Here we want to highlight the main steps involved, particularly regarding the configuration of object storage, which is a crucial requirement and is best done during the initial deployment of the cluster. In the following example, we will deploy our clusters using the Google Kubernetes Engine (GKE), but you can find similar instructions for other environments in the previous link.

Considering you have a Google account configured as well as the gcloud (from the Google Cloud SDK suite) and kubectl command-line tools installed, authenticate yourself with gcloud auth login, and off we go!

Creating a GKE Cluster and Basic Configuration

The following command will create a default cluster named “cluster-1” and composed of three nodes. We are creating it in the us-central1-a zone using e2-standard-4 VMs but you may choose different options. In fact, you may also need to indicate the project name and other main settings if you do not have your gcloud environment pre-configured with them:

gcloud container clusters create cluster-1 --preemptible --machine-type e2-standard-4 --num-nodes=3 --zone us-central1-a

Once the cluster is created, use your IAM identity to control access to this new cluster:

kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user $(gcloud config get-value core/account)

Finally, create the pgo namespace:

kubectl create namespace pgo

and set the current context to refer to this new namespace:

kubectl config set-context $(kubectl config current-context) --namespace=pgo

Creating a Cloud Storage Bucket

Remember that for this setup we need a Google Cloud Storage bucket configured, as well as a Service Account created with the necessary privileges/roles to access it. The respective procedures to obtain these vary according to how your environment is configured, so we won’t be covering them here. Please refer to the Google Cloud Storage documentation for the exact steps. The bucket we created for the example in this post was named cluster1-backups-and-wals.

Likewise, please refer to the Creating and managing service account keys documentation to learn how to create a Service Account and download the corresponding key in JSON format – we will need to provide it to the operator so our PostgreSQL clusters can access the storage bucket.

Creating the Kubernetes Secrets File to Access the Storage Bucket

Create a file named my-gcs-account-secret.yaml with the following structure:

apiVersion: v1
kind: Secret
metadata:
  name: cluster1-backrest-repo-config
type: Opaque
data:
  gcs-key: <VALUE>

replacing the <VALUE> placeholder with the output of the following command, according to the OS you are using:

Linux:

base64 --wrap=0 your-service-account-key-file.json

macOS:

base64 your-service-account-key-file.json

Installing and Deploying the Operator

The most practical way to install our operator is by cloning the Git repository, and then moving inside its directory:

git clone -b v1.1.0 https://github.com/percona/percona-postgresql-operator
cd percona-postgresql-operator

The following command will deploy the operator:

kubectl apply -f deploy/operator.yaml

We have already prepared the secrets file to access the storage bucket so we can apply it now:

kubectl apply -f my-gcs-account-secret.yaml

Now, all that is left is to customize the storages options in the deploy/cr.yaml file to indicate the use of the GCS bucket as follows:

    storages:
      my-gcs:
        type: gcs
        bucket: cluster1-backups-and-wals

We can now deploy the primary PostgreSQL cluster (cluster1):

kubectl apply -f deploy/cr.yaml
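
If you want to follow the provisioning, a simple check (nothing operator-specific is assumed here) is to watch the pods being created in the pgo namespace:

kubectl get pods --watch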

Once the operator has been deployed, you can run the following command to do some housekeeping:

kubectl delete -f deploy/operator.yaml

Creating a Standby PostgreSQL Cluster Using the Percona Operator

After this long preamble, let’s look at what brought you here: how to deploy a standby cluster, which we will refer to as cluster2, that will replicate from the primary cluster.

Copying the Secrets Over

Considering you probably have customized the passwords you use in your primary cluster and that they differ from the default values found in the operator’s git repository, we need to make a copy of the secrets files, adjusted to the standby cluster’s name. The following procedure facilitates this task, saving the secrets files under /tmp/cluster1-cluster2-secrets (you can choose a different target directory):

NOTE: make sure you have the yq tool installed in your system.
mkdir -p /tmp/cluster1-cluster2-secrets/
export primary_cluster_name=cluster1
export standby_cluster_name=cluster2
export secrets="${primary_cluster_name}-users"
kubectl get secret/$secrets -o yaml \
| yq eval 'del(.metadata.creationTimestamp)' - \
| yq eval 'del(.metadata.uid)' - \
| yq eval 'del(.metadata.selfLink)' - \
| yq eval 'del(.metadata.resourceVersion)' - \
| yq eval 'del(.metadata.namespace)' - \
| yq eval 'del(.metadata.annotations."kubectl.kubernetes.io/last-applied-configuration")' - \
| yq eval '.metadata.name = "'"${secrets/$primary_cluster_name/$standby_cluster_name}"'"' - \
| yq eval '.metadata.labels.pg-cluster = "'"${standby_cluster_name}"'"' - \
>/tmp/cluster1-cluster2-secrets/${secrets/$primary_cluster_name/$standby_cluster_name}

Deploying the Standby Cluster: Fast Mode

Since we have already covered the procedure used to create the primary cluster in detail in a previous section, we will be presenting the essential steps to create the standby cluster below and provide additional comments only when necessary.

NOTE: the commands below are issued from inside the percona-postgresql-operator directory hosting the git repository for our operator.

Deploying a New GKE Cluster Named cluster-2

This time using the us-west1-b zone here:

gcloud container clusters create cluster-2 --preemptible --machine-type e2-standard-4 --num-nodes=3 --zone us-west1-b
kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user $(gcloud config get-value core/account)
kubectl create namespace pgo
kubectl config set-context $(kubectl config current-context) --namespace=pgo
kubectl apply -f deploy/operator.yaml

Apply the Adjusted Kubernetes Secrets:

export standby_cluster_name=cluster2
export secrets="${standby_cluster_name}-users"
kubectl create -f /tmp/cluster1-cluster2-secrets/$secrets

The list above does not include the GCS secret file; the key contents remain the same but the secret name needs to be adjusted. Make a copy of that file:

cp my-gcs-account-secret.yaml my-gcs-account-secret-2.yaml

then edit the copy to indicate “cluster2-” instead of “cluster1-”:

name: cluster2-backrest-repo-config

You can apply it now:

kubectl apply -f my-gcs-account-secret-2.yaml

The cr.yaml file of the Standby Cluster

Let’s make a copy of the cr.yaml file we customized for the primary cluster:

cp deploy/cr.yaml deploy/cr-2.yaml

and edit the copy as follows:

1) Change all references (that are not commented out) from cluster1 to cluster2, including current-primary but excluding the bucket reference (which in our example is prefixed with “cluster1-”); the storages section must remain unchanged. (We know it is not very practical to replace so many references; we still need to improve this part of the routine. See the sketch right after this list for one way to script the bulk rename.)

2) Enable the standby option:

standby: true

3) Provide a repoPath that points to the GCS bucket used by the primary cluster (just below the storages section, which should remain the same as in the primary cluster’s cr.yaml file):

repoPath: "/backrestrepo/cluster1-backrest-shared-repo"
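
As mentioned in step 1, one possible way to script the bulk rename is sketched below; note it blindly replaces every occurrence, so you must restore the bucket name and re-check the storages section afterwards (the -i form shown is for GNU sed; macOS sed expects -i ''):

sed -i 's/cluster1/cluster2/g' deploy/cr-2.yaml
# then manually restore "bucket: cluster1-backups-and-wals" and review the storages section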

And that’s it! All that is left now is to deploy the standby cluster:

kubectl apply -f deploy/cr-2.yaml

With everything working on the standby cluster, do some housekeeping:

kubectl delete -f deploy/operator.yaml

Verifying it all Works as Expected

Remember that the standby cluster is created from a backup and relies on archived WAL files to be kept in sync with the primary cluster. If you make a change in the primary cluster, such as adding a row to a table, that change won’t reach the standby cluster until the WAL file it has been recorded in is archived and consumed by the standby cluster.

When checking if all is working with the new setup, you can force the rotation of the WAL file (and subsequent archival of the previous one) in the primary node of the primary cluster to accelerate the sync process by issuing:

psql> SELECT pg_switch_wal();
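
To confirm that the standby cluster is consuming the archived WAL files, a quick sanity check (a sketch; run it on the standby cluster’s primary node) is to verify the cluster is in recovery and that the replay position advances between runs:

psql> SELECT pg_is_in_recovery(), pg_last_wal_replay_lsn();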

Jan
11
2022
--

Online DDL With Group Replication in MySQL 8.0.27

In April 2021, I wrote an article about Online DDL and Group Replication. At that time we were dealing with MySQL 8.0.23, and I also opened a bug report which did not get the right answer for the case presented.

Anyhow, in that article I showed how an online DDL was de facto locking the whole cluster for a very long time, even when using the consistency level set to EVENTUAL.

This article is to do justice to the work done by the MySQL/Oracle engineers to correct that annoying inconvenience.

Before going ahead, let us recall how an online DDL was propagated in a Group Replication cluster, and identify the differences with what happens now, all with the consistency level set to EVENTUAL.

In MySQL 8.0.23 we were having:

While in MySQL 8.0.27 we have:

As you can see from the images we have three different phases. Phase one is the same between version 8.0.23 and version 8.0.27. 

Phases two and three, instead, are quite different. In MySQL 8.0.23, after the DDL was applied on the primary, it was propagated to the other nodes, but a metadata lock was also acquired and control was NOT returned. The result was that not only was the session executing the DDL kept on hold, but so were all the other sessions performing modifications.

Only when the operation was over on all secondaries was the DDL pushed to the binlog and disseminated for asynchronous replication, the lock released, and operations allowed to restart.

Instead, in MySQL 8.0.27, once the operation is over on the primary, the DDL is pushed to the binlog, disseminated to the secondaries, and control is returned. The result is that the write operations on the primary have no interruption whatsoever, and the DDL is distributed to the secondaries and to asynchronous replication at the same time.

This is a fantastic improvement, available only with consistency level EVENTUAL, but still, fantastic.
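
For reference, the consistency level can be checked and set per session with the standard MySQL 8.0 variable; a quick sketch (the tests below assume EVENTUAL):

SELECT @@group_replication_consistency;
SET SESSION group_replication_consistency = 'EVENTUAL';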

Let’s See Some Numbers

To test the operation, I have used the same approach used in the previous tests in the article mentioned above.

Connection 1:
    ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=NONE;
    ALTER TABLE windmills_test drop INDEX idx_1, ALGORITHM=INPLACE;
    
Connection 2:
 while [ 1 = 1 ];do da=$(date +'%s.%3N');/opt/mysql_templates/mysql-8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_large -e "insert into windmills_test  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmill7 limit 1;" -e "select count(*) from windmills_large.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done

Connection 3:
 while [ 1 = 1 ];do da=$(date +'%s.%3N');/opt/mysql_templates/mysql-8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_large -e "insert into windmill8  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmill7 limit 1;" -e "select count(*) from windmills_large.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done

Connections 4-5:
     while [ 1 = 1 ];do echo "$(date +'%T.%3N')";/opt/mysql_templates/mysql-8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_large -e "show full processlist;"|egrep -i -e "(windmills_test|windmills_large)"|grep -i -v localhost;sleep 1;done

Modifying a table with ~5 million rows:

node1-DC1 (root@localhost) [windmills_large]>select count(*) from  windmills_test;
+----------+
| count(*) |
+----------+
|  5002909 |
+----------+

The numbers below represent the time in seconds/milliseconds taken by the operation to complete. While I was also capturing the state of the ALTER on the other nodes, I am not reporting it here given it is not relevant.

EVENTUAL (on the primary only)
-------------------
Node 1 same table:
.184
.186 <--- no locking during alter on the same node
.184
<snip>
.184
.217 <--- moment of commit
.186
.186
.186
.185

Node 1 another table :
.189
.198 <--- no locking during alter on the same node
.188
<snip>
.191
.211  <--- moment of commit
.194

As you can see there is just a very small delay at the moment of commit, and no other impact.

Now if we compare this with the recent tests I have done for Percona XtraDB Cluster (PXC) Non-Blocking operation (see A Look Into Percona XtraDB Cluster Non-Blocking Operation for Online Schema Upgrade) with the same number of rows and same kind of table/data:

Action                                        | Group Replication | PXC (NBO)
Time on hold for insert on the altered table  | ~ 0.217 sec       | ~ 120 sec
Time on hold for insert on another table      | ~ 0.211 sec       | ~ 25 sec

However, and yes there is a however, PXC was maintaining consistency between the different nodes during the DDL execution, while MySQL 8.0.27 with Group Replication was postponing consistency on the secondaries; thus, primary and secondaries were not in sync until full DDL finalization on the secondaries.

Conclusions

MySQL 8.0.27 comes with this nice fix that significantly reduces the impact of an online DDL operation on a busy server. But we can still observe a significant misalignment of the data between the nodes when a DDL is executing. 

On the other hand, PXC with NBO is a bit more “expensive” in time, but nodes remain aligned all the time.

In the end, what matters more to you determines which solution to choose: consistency vs. operational impact.

Great MySQL to all.

Jan
10
2022
--

Talking Drupal #329 – The Penguin Corps

Today we are talking about The Penguin Corps with Stu Keroff and Students from the Penguin Corps.

TalkingDrupal.com/329

Topics

  • Stephen – AZ trip
  • Nic – Computer build
  • Stu – Back to school
  • Favorite things
  • Rania Grade 7
    • Walking up and down stairs 10 times to get to sleep
  • Michael – Grade 7
    • Sports, Basketball or Swim
  • Cam – Grade 7
    • Working on cars, 1986 Ford Mustang
  • Geoffrey – Grade 6
    • Soccer
  • Nithya – Grade 6
    • Reading
  • Penguin Corps
  • How it got started
  • Getting support
  • Why Linux
  • Computers in the classroom
  • Importance
  • Digital Divide
  • Hardware
  • Donations
  • Beyond the classroom
  • Corporate support

Resources

Guests

Stu Keroff – @studoeslinux
Rania
Michael
Cam
Geoffrey
Nithya

Hosts

Nic Laflin – www.nLighteneddevelopment.com @nicxvan
John Picozzi – www.epam.com @johnpicozzi
Stephen Cross – @stephencross

Jan
10
2022
--

MySQL 8.0 Functional Indexes

Working with hundreds of different customers, I often face similar problems around running queries. One very common problem when trying to optimize a database environment is index usage. A query that cannot use an index is usually long-running, consuming more memory or triggering more disk IOPS.

A very common case is when a query uses a filter condition against a column that is involved in some kind of functional expression. An index on that column can not be used.

Starting from MySQL 8.0.13 functional indexes are supported. In this article, I’m going to show what they are and how they work.

The Well-Known Problem

As already mentioned, a very common problem about index usage is when you have a filter condition against one or more columns involved in some kind of functional expression.

Let’s see a simple example.

You have a table called products containing the details of your products, including a create_time TIMESTAMP column. If you would like to calculate the average price of your products on a specific month you could do the following:

mysql> SELECT AVG(price) FROM products WHERE MONTH(create_time)=10;
+------------+
| AVG(price) |
+------------+
| 202.982582 |
+------------+

The query returns the right value, but take a look at the EXPLAIN:

mysql> EXPLAIN SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 99015
     filtered: 100.00
        Extra: Using where

 

The query triggers a full scan of the table. Let’s create an index on create_time and check again:

mysql> ALTER TABLE products ADD INDEX(create_time);
Query OK, 0 rows affected (0.71 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 99015
     filtered: 100.00
        Extra: Using where

 

A full scan again. The index we have created is not effective. Indeed, any time an indexed column is involved in a function, the index cannot be used.

To optimize the query, the workaround is to rewrite it differently in order to isolate the indexed column from the function.

Let’s test the following equivalent query:

mysql> SELECT AVG(price) FROM products WHERE create_time BETWEEN '2019-10-01' AND '2019-11-01';
+------------+
| AVG(price) |
+------------+
| 202.982582 |
+------------+

mysql> EXPLAIN SELECT AVG(price) FROM products WHERE create_time BETWEEN '2019-10-01' AND '2019-11-01'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: range
possible_keys: create_time
          key: create_time
      key_len: 5
          ref: NULL
         rows: 182
     filtered: 100.00
        Extra: Using index condition

 

Cool, now the index is used. Rewriting the query this way was the typical suggestion.

Quite a simple solution, but it was not always possible to change the application code, for many valid reasons. So, what to do then?

 

MySQL 8.0 Functional Indexes

Starting from version 8.0.13, MySQL supports functional indexes. Instead of indexing a simple column, you can create the index on the result of any function applied to a column or multiple columns.

Long story short, now you can do the following:

mysql> ALTER TABLE products ADD INDEX((MONTH(create_time)));
Query OK, 0 rows affected (0.74 sec)
Records: 0  Duplicates: 0  Warnings: 0

Be aware of the double parentheses. The syntax is correct: the expression must be enclosed within parentheses to distinguish it from a column or a column prefix.

Indeed the following returns an error:

mysql> ALTER TABLE products ADD INDEX(MONTH(create_time));
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'create_time))' at line 1

Let’s check now our original query and see what happens to the EXPLAIN

mysql> SELECT AVG(price) FROM products WHERE MONTH(create_time)=10;
+------------+
| AVG(price) |
+------------+
| 202.982582 |
+------------+

 

mysql> EXPLAIN SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: ref
possible_keys: functional_index
          key: functional_index
      key_len: 5
          ref: const
         rows: 182
     filtered: 100.00
        Extra: NULL

 

The query is no longer a full scan and runs faster. The functional_index has been used, with only 182 rows examined. Awesome.

Thanks to the functional index we are no longer forced to rewrite the query.

Which Functional Indexes are Permitted

We have seen an example involving a simple function applied to a column, but you are allowed to create more complex indexes.

A functional index may contain any kind of expressions, not only a single function. The following patterns are valid functional indexes:

INDEX( ( col1 + col2 ) )
INDEX( ( FUNC(col1) + col2 - col3 ) )

You can use ASC or DESC as well:

INDEX( ( MONTH(col1) ) DESC )

You can have multiple functional parts, each one included in parentheses:

INDEX( ( col1 + col2 ), ( FUNC(col2) ) )

You can mix functional with nonfunctional parts:

INDEX( (FUNC(col1)), col2, (col2 + col3), col4 )

There are also limitations you should be aware of:

  • A functional key part cannot consist solely of a column name. The following is not permitted:
    INDEX( (col1), (col2) )
  • The primary key can not include a functional key part
  • The foreign key can not include a functional key part
  • SPATIAL and FULLTEXT indexes can not include functional key parts
  • A functional key part can not refer to a column prefix

Lastly, remember that the functional index is useful only to optimize queries that use the exact same expression. An index created on nonfunctional parts can instead be used to solve multiple different queries.

For example, the following conditions can not rely on the functional index we have created:

WHERE YEAR(create_time) = 2019

WHERE create_time > '2019-10-01'

WHERE create_time BETWEEN '2019-10-01' AND '2019-11-01'

WHERE MONTH(create_time + INTERVAL 1 YEAR)

None of these can rely on the functional index; without another suitable index in place, they will trigger a full scan.
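
If those filter patterns matter for your workload, you need additional indexes that match them; for instance, a sketch for the YEAR() filter (same products table as above):

mysql> ALTER TABLE products ADD INDEX((YEAR(create_time)));

The create_time > '2019-10-01' and BETWEEN filters, on the other hand, can simply be served by the plain index on create_time shown earlier in this post.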

Functional Index Internal

The functional indexes are implemented as hidden virtual generated columns. For this reason, you can emulate the same behavior even on MySQL 5.7 by explicitly creating the virtual column. We can test this, starting by dropping the indexes we have created so far.

mysql> SHOW CREATE TABLE products\G
*************************** 1. row ***************************
       Table: products
Create Table: CREATE TABLE `products` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `description` longtext,
  `price` decimal(8,2) DEFAULT NULL,
  `create_time` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `create_time` (`create_time`),
  KEY `functional_index` ((month(`create_time`)))
) ENGINE=InnoDB AUTO_INCREMENT=149960 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

 

mysql> ALTER TABLE products DROP INDEX `create_time`, DROP INDEX `functional_index`;
Query OK, 0 rows affected (0.03 sec)

We can try now to create the virtual generated column:

mysql> ALTER TABLE products ADD COLUMN create_month TINYINT GENERATED ALWAYS AS (MONTH(create_time)) VIRTUAL;
Query OK, 0 rows affected (0.04 sec)

Create the index on the virtual column:

mysql> ALTER TABLE products ADD INDEX(create_month);
Query OK, 0 rows affected (0.55 sec)

 

mysql> SHOW CREATE TABLE products\G
*************************** 1. row ***************************
       Table: products
Create Table: CREATE TABLE `products` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `description` longtext,
  `price` decimal(8,2) DEFAULT NULL,
  `create_time` timestamp NULL DEFAULT NULL,
  `create_month` tinyint GENERATED ALWAYS AS (month(`create_time`)) VIRTUAL,
  PRIMARY KEY (`id`),
  KEY `create_month` (`create_month`)
) ENGINE=InnoDB AUTO_INCREMENT=149960 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

 

We can now try our original query. We expect to see the same behavior as the functional index.

mysql> SELECT AVG(price) FROM products WHERE MONTH(create_time)=10;
+------------+
| AVG(price) |
+------------+
| 202.982582 |
+------------+

mysql> EXPLAIN SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: products
   partitions: NULL
         type: ref
possible_keys: create_month
          key: create_month
      key_len: 2
          ref: const
         rows: 182
     filtered: 100.00
        Extra: NULL

 

Indeed, the behavior is the same. The index on the virtual column can be used and the query is optimized.

The good news is that you can use this workaround to emulate a functional index even on 5.7, getting the same benefits. The advantage of MySQL 8.0 is that it is completely transparent: there is no need to create the virtual column yourself.

Since the functional index is implemented as a hidden virtual column, there is no additional space needed for the data, only the index space will be added to the table.

By the way, this is the same technique used for creating indexes on JSON documents’ fields.
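
As a quick illustration, a hedged sketch is below; the attributes column and the $.vendor_id path are hypothetical, added only to show the pattern:

mysql> ALTER TABLE products ADD COLUMN attributes JSON;
mysql> ALTER TABLE products ADD INDEX((CAST(attributes->>'$.vendor_id' AS UNSIGNED)));

A query filtering on CAST(attributes->>'$.vendor_id' AS UNSIGNED) could then use this index, subject to the same exact-expression rule discussed above.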

Conclusion

Functional index support is an interesting improvement you can find in MySQL 8.0. Some of the queries that previously required rewriting to get optimized no longer require it. Just remember that only queries using the exact same filter expression can rely on the functional index; for other search patterns you need to create additional regular or functional indexes.

The same feature can be implemented on MySQL 5.7 with the explicit creation of a virtual generated column and the index.

For more detailed information, read the following page:

https://dev.mysql.com/doc/refman/8.0/en/create-index.html#create-index-functional-key-parts

Jan
10
2022
--

Baby Racing Car Seat From Delta Children

You read it right! Delta Children, the company known and loved for their eye-catching baby chairs and accessories, has created something for the little ones who were born with speed in their veins.

Delta Children Sit N Play Portable Activity Seat for Babies is what you might call a sporty addition to the baby gear family. It is safe, comfortable, sturdy, and perfectly adjustable to any surface.

About Delta Children Portable Activity Chair

The Delta Children Sit N’ Play Portable Activity Seat will help your little one sit, interact, and play at home and on the road. The sturdy upright seat allows your baby to enjoy and interact with the world completely, while the beautiful design will look great in any room. The portable play seat is easy to fold for storage and take along, while the non-skid bottom will keep it secure on nearly any surface.

Your little one will love playing with the engaging race car-themed toys that help increase gross motor skills, and you’ll love how easy it is to clean–just remove the seat pad and pop it in the washing machine. The rest of the activity seat features water-and-stain-resistant fabric that’s easy to wipe clean!

This infant floor seat is perfect for traveling because of its innovative zippered design and convenient carry handle, which unzips to fold flat quickly.

Why choose a baby racing car seat?

Keeping your baby content in one spot is one of the most challenging tasks when it comes to caring for them. Babies love to move around and explore; they’re constantly crawling everywhere, looking for something they can pick up in their little hands, something that will bring them joy and entertainment.

This is where the Delta Children Sit N’ Play Portable Activity Seat comes in to help you. This product will have your child content on any surface, whether it’s at home or outside, on the porch, for example. You can place it anywhere, and your baby will be thrilled playing with the interactive toys that are included.

It is also vital to consider kids’ interests as soon as possible. You don’t want to wait until they’re older and have become bored with the toys you chose for them during their infantile stage. Let them play with what they enjoy, let them be kids while they still can, and once they get a bit older, things will start getting complicated because of what is expected from them in terms of behavior and maturity.

Quality kids’ chairs at your reach

Besides being a practical solution for your child’s entertainment needs, the Delta Children Sit N’ Play Portable Activity Seat is also very affordable. Now you can have a play seat for your infant that won’t break the bank as it costs just as much as other products on the market today.

Delta Children products were mentioned on ComfyBummy numerous times. You can, for example, see reviews for their amazing kids’ Frozen chairs or explore our guide to the Delta Children’s products.

You’ll never go wrong having Delta Children around. Their products are some of the most durable ones you’ll find, not only when it comes to kids’ furniture but also in terms of toys. Their quality is extremely high while their prices are fair enough that everyone can buy their products. You can find them on Amazon, where you can browse their many items and choose the one you like most.

The post Baby Racing Car Seat From Delta Children appeared first on Comfy Bummy.

Jan
07
2022
--

Configure wiredTiger cacheSize Inside Percona Distribution for MongoDB Kubernetes Operator

Nowadays we are seeing a lot of customers starting to use our Percona Distribution for MongoDB Kubernetes Operator. The Percona Kubernetes Operators are based on best practices for the configuration of a Percona Server for MongoDB replica set or sharded cluster. The main tunable component in MongoDB is the wiredTiger cache, which we can size based on our load.

In this blog post, we will see how to define the resources’ memory and set the wiredTiger cache for the shard replicaset to improve the performance of the sharded cluster.

The Necessity of WT cache

The parameter storage.wiredTiger.engineConfig.cacheSizeGB limits the size of the WiredTiger internal cache. The operating system will use the available free memory for filesystem cache, which allows the compressed MongoDB data files to stay in memory. In addition, the operating system will use any free RAM to buffer file system blocks and file system cache. To accommodate the additional consumers of RAM, you may have to set WiredTiger’s internal cache size properly.

Starting from MongoDB 3.4, the default WiredTiger internal cache size is the larger of either:

50% of (RAM - 1 GB), or 256 MB.

For example, on a system with a total of 4GB of RAM the WiredTiger cache will use 1.5GB of RAM (0.5 * (4 GB – 1 GB) = 1.5 GB). Conversely, a system with a total of 1.25 GB of RAM will allocate 256 MB to the WiredTiger cache because that is more than half of the total RAM minus one gigabyte (0.5 * (1.25 GB – 1 GB) = 128 MB < 256 MB).

WT cacheSize in Kubernetes Operator

The MongoDB wiredTiger cacheSize can be tuned with the parameter storage.wiredTiger.engineConfig.cacheSizeRatio, and its default value is 0.5. As explained above, if the memory allocated to the container is too low, the WT cache is set to 256MB or calculated as per the formula.

Prior to PSMDB operator 1.9.0, cacheSizeRatio could be tuned under the sharding section of the cr.yaml file. This is deprecated from v1.9.0 and unavailable from v1.12.0, so you have to use the cacheSizeRatio parameter available under the replsets configuration instead. The main thing to check before changing the cacheSize is that the memory limit allocated to the container is also large enough for your cacheSize requirement, i.e., the section below limiting the memory:

     resources:
       limits:
         cpu: "300m"
         memory: "0.5G"
       requests:
         cpu: "300m"
         memory: "0.5G"
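
Before changing anything, it may help to confirm which memory limit actually reached the container; one simple way (a sketch, with the pod name taken from the cluster used later in this post) is:

$ kubectl describe pod my-cluster-name-rs0-0 | grep -A 2 'Limits:'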

 

https://github.com/percona/percona-server-mongodb-operator/blob/main/pkg/psmdb/container.go#L307

From the source code that calculates the mongod.storage.wiredTiger.engineConfig.cacheSizeRatio:

// In normal situations WiredTiger does this default-sizing correctly but under Docker
// containers WiredTiger fails to detect the memory limit of the Docker container. We
// explicitly set the WiredTiger cache size to fix this.
//
// https://docs.mongodb.com/manual/reference/configuration-options/#storage.wiredTiger.engineConfig.cacheSizeGB//

func getWiredTigerCacheSizeGB(resourceList corev1.ResourceList, cacheRatio float64, subtract1GB bool) float64 {
 maxMemory := resourceList[corev1.ResourceMemory]
 var size float64
 if subtract1GB {
  size = math.Floor(cacheRatio * float64(maxMemory.Value()-gigaByte))
 } else {
  size = math.Floor(cacheRatio * float64(maxMemory.Value()))
 }
 sizeGB := size / float64(gigaByte)
 if sizeGB < minWiredTigerCacheSizeGB {
  sizeGB = minWiredTigerCacheSizeGB
 }
 return sizeGB
}

 

Changing the cacheSizeRatio

For this test, we deployed the PSMDB operator on GCP. You can refer to https://www.percona.com/doc/kubernetes-operator-for-psmongodb/gke.html for the steps. With the latest operator v1.11.0, the sharded cluster has been started with a shard and a config server replica set, along with mongos pods.

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
my-cluster-name-cfg-0 2/2 Running 0 4m9s
my-cluster-name-cfg-1 2/2 Running 0 2m55s
my-cluster-name-cfg-2 2/2 Running 1 111s
my-cluster-name-mongos-758f9fb44-d4hnh 1/1 Running 0 99s
my-cluster-name-mongos-758f9fb44-d5wfm 1/1 Running 0 99s
my-cluster-name-mongos-758f9fb44-wmvkx 1/1 Running 0 99s
my-cluster-name-rs0-0 2/2 Running 0 4m7s
my-cluster-name-rs0-1 2/2 Running 0 2m55s
my-cluster-name-rs0-2 2/2 Running 0 117s
percona-server-mongodb-operator-58c459565b-fc6k8 1/1 Running 0 5m45s

Now log in to the shard and check the default memory allocated to the container and to the mongod instance. Below, the total memory available on the node is 15GB, but the memory limit for this container is only 476MB:

rs0:PRIMARY> db.hostInfo()
{
"system" : {
"currentTime" : ISODate("2021-12-30T07:16:59.441Z"),
"hostname" : "my-cluster-name-rs0-0",
"cpuAddrSize" : 64,
"memSizeMB" : NumberLong(15006),
"memLimitMB" : NumberLong(476),
"numCores" : 4,
"cpuArch" : "x86_64",
"numaEnabled" : false
},
"os" : {
"type" : "Linux",
"name" : "Red Hat Enterprise Linux release 8.4 (Ootpa)",
"version" : "Kernel 5.4.144+"
},
"extra" : {
"versionString" : "Linux version 5.4.144+ (builder@7d732a1aec13) (Chromium OS 12.0_pre408248_p20201125-r7 clang version 12.0.0 (/var/tmp/portage/sys-devel/llvm-12.0_pre408248_p20201125-r7/work/llvm-12.0_pre408248_p20201125/clang f402e682d0ef5598eeffc9a21a691b03e602ff58)) #1 SMP Sat Sep 25 09:56:01 PDT 2021",
"libcVersion" : "2.28",
"kernelVersion" : "5.4.144+",
"cpuFrequencyMHz" : "2000.164",
"cpuFeatures" : "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat md_clear arch_capabilities",
"pageSize" : NumberLong(4096),
"numPages" : 3841723,
"maxOpenFiles" : 1048576,
"physicalCores" : 2,
"mountInfo" : [
..
..

 

The cache size in MB allocated to the wiredTiger engine in the shard is as follows:

rs0:PRIMARY> db.serverStatus().wiredTiger.cache["maximum bytes configured"]/1024/1024
256

The cache size of 256MB is too low for a real environment. So let’s see how to tune the memory limit and the cacheSize of the WT engine. You can use the parameter cacheSizeRatio to set the WT cache ratio (out of 1) and resources.limits.memory to set the memory allocated to the container. To do this, edit the cr.yaml file under the deploy directory of the operator. From PSMDB operator v1.9.0, editing the cacheSizeRatio parameter under the mongod section is deprecated, so for the WT cache use the cacheSizeRatio parameter under the “replsets” section, and for the memory use resources.limits.memory. Here we set 3G for the container and 80% of the memory for the cache calculation.

deploy/cr.yaml:58

46       configuration: |
47         # operationProfiling:
48         #   mode: slowOp
49         # systemLog:
50         #   verbosity: 1
51         storage:
52           engine: wiredTiger
53           # inMemory:
54           #   engineConfig:
55           #     inMemorySizeRatio: 0.9
56           wiredTiger:
57             engineConfig:
58               cacheSizeRatio: 0.8

 

deploy/cr.yaml:229-232:

226     resources:
227       limits:
228         cpu: "300m"
229         memory: "3G"
230       requests:
231         cpu: "300m"
232         memory: "3G"

 

Apply the new cr.yaml

# kubectl apply -f deploy/cr.yaml
perconaservermongodb.psmdb.percona.com/my-cluster-name configured

The shard pods are re-allocated and you can check the progress as follows:

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
my-cluster-name-cfg-0 2/2 Running 0 36m
my-cluster-name-cfg-1 2/2 Running 0 35m
my-cluster-name-cfg-2 2/2 Running 1 34m
my-cluster-name-mongos-758f9fb44-d4hnh 1/1 Running 0 34m
my-cluster-name-mongos-758f9fb44-d5wfm 1/1 Running 0 34m
my-cluster-name-mongos-758f9fb44-wmvkx 1/1 Running 0 34m
my-cluster-name-rs0-0 0/2 Init:0/1 0 13s
my-cluster-name-rs0-1 2/2 Running 0 60s
my-cluster-name-rs0-2 2/2 Running 0 8m33s
percona-server-mongodb-operator-58c459565b-fc6k8 1/1 Running 0 38m

Now check the new settings of WT cache as follows:

rs0:PRIMARY> db.hostInfo().system
{
"currentTime" : ISODate("2021-12-30T08:37:38.790Z"),
"hostname" : "my-cluster-name-rs0-1",
"cpuAddrSize" : 64,
"memSizeMB" : NumberLong(15006),
"memLimitMB" : NumberLong(2861),
"numCores" : 4,
"cpuArch" : "x86_64",
"numaEnabled" : false
}
rs0:PRIMARY> 
rs0:PRIMARY> 
rs0:PRIMARY> db.serverStatus().wiredTiger.cache["maximum bytes configured"]/1024/1024
1474

Here, the memory calculation for WT is done roughly as follows (the memory limit should be more than 1GB, else 256MB is allocated by default):

(memory limit - 1GB) * cacheSizeRatio

(2861MB - 1024MB) * 0.8 ≈ 1470MB

which matches the ~1474MB reported above; the small difference comes from the operator rounding the value to two decimal places of a GB before passing it to mongod.

 

NOTE:

Till PSMDB operator v1.10.0, the operator applies the change of cacheSizeRatio only if resources.limits.cpu is also set. This was a bug and it got fixed in v1.11.0 – refer to https://jira.percona.com/browse/K8SPSMDB-603. So if you are on an older version, don’t be surprised, and make sure resources.limits.cpu is set as well.

https://github.com/percona/percona-server-mongodb-operator/blob/v1.10.0/pkg/psmdb/container.go#L194

if limit, ok := resources.Limits[corev1.ResourceCPU]; ok && !limit.IsZero() {
    args = append(args, fmt.Sprintf(
       "--wiredTigerCacheSizeGB=%.2f",
       getWiredTigerCacheSizeGB(resources.Limits, replset.Storage.WiredTiger.EngineConfig.CacheSizeRatio, true),
    ))
}

From v1.11.0:
https://github.com/percona/percona-server-mongodb-operator/blob/v1.11.0/pkg/psmdb/container.go#L194

if limit, ok := resources.Limits[corev1.ResourceMemory]; ok && !limit.IsZero() {
    args = append(args, fmt.Sprintf(
       "--wiredTigerCacheSizeGB=%.2f",
       getWiredTigerCacheSizeGB(resources.Limits, replset.Storage.WiredTiger.EngineConfig.CacheSizeRatio, true),
))
}

 

Conclusion

So based on the application load, you will need to set the cacheSize of WT for better performance. You can use the above methods to tune the cache size for the shard replicaset in the PSMDB operator.

Reference Links :

https://www.percona.com/doc/kubernetes-operator-for-psmongodb/operator.html

https://www.percona.com/doc/kubernetes-operator-for-psmongodb/gke.html

https://www.percona.com/doc/kubernetes-operator-for-psmongodb/operator.html#mongod-storage-wiredtiger-engineconfig-cachesizeratio

MongoDB 101: How to Tune Your MongoDB Configuration After Upgrading to More Memory
