Dec 20, 2023

Using Huge Pages with PostgreSQL Running Inside Kubernetes

Huge pages can make PostgreSQL faster, so can we use them in Kubernetes? Modern servers operate with terabytes of RAM, and by default the processor translates virtual memory addresses for each 4KB page. The OS maintains a huge list of allocated and free pages to perform this slow but reliable translation from virtual to physical addresses.

Please check out the Why Linux HugePages are Super Important for Database Servers: A Case with PostgreSQL blog post for more information.

Setup

I recommend starting with 2MB huge pages because they are trivial to set up. Unfortunately, benchmark performance is almost the same as with 4KB pages. Kubernetes worker nodes should be configured either through GRUB_CMDLINE_LINUX or with sysctl vm.nr_hugepages=N: https://kubernetes.io/docs/tasks/manage-hugepages/scheduling-hugepages/
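For example, either of the following can reserve 2MB pages on a worker node (a minimal sketch; 1024 pages of 2MB reserve 2GiB, so adjust the count to your capacity and append to any existing kernel parameters):

# Persistent: set in /etc/default/grub, then update GRUB and reboot
GRUB_CMDLINE_LINUX="hugepagesz=2M hugepages=1024"

# Or at runtime (add vm.nr_hugepages to /etc/sysctl.conf to persist across reboots)
sysctl vm.nr_hugepages=1024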

This step could be hard with managed Kubernetes services, like GCP, but easy for kubeadm, kubespray, k3d, and kind installations.

Kubectl helps to check the number of huge pages available:

kubectl describe nodes NODENAME
…
  hugepages-1Gi      0 (0%)     0 (0%)
  hugepages-2Mi      1Gi (25%)  1Gi (25%)
…

Only 2MB pages are available in the output above. During deployment, at the custom resource apply stage, Percona Operator for PostgreSQL 2.2.0 is not able to start database pods on such nodes:

$ kubectl -n pgo get pods -l postgres-operator.crunchydata.com/data=postgres
NAME                        READY   STATUS             RESTARTS       AGE
cluster1-instance1-f65t-0   3/4     CrashLoopBackOff   6 (112s ago)   8m35s
cluster1-instance1-2bss-0   3/4     CrashLoopBackOff   6 (100s ago)   8m35s
cluster1-instance1-89v7-0   3/4     CrashLoopBackOff   6 (104s ago)   8m35s

Logs are very confusing:

kubectl -n pgo logs cluster1-instance1-f65t-0 -c database
selecting dynamic shared memory implementation ... posix
sh: line 1:   737 Bus error               (core dumped) "/usr/pgsql-15/bin/postgres" --check -F -c log_checkpoints=false -c max_connections=100 -c shared_buffers=1000 -c dynamic_shared_memory_type=posix < "/dev/null" > "/dev/null" 2>&1

By default, PostgreSQL is configured to use huge pages, but Kubernetes needs to allow it first: .spec.instances.resources.limits should be modified to mention huge pages. PostgreSQL pods are not able to start without proper limits on a node with huge pages enabled.

instances:
  - name: instance1
    replicas: 3
    resources:
      limits:
        hugepages-2Mi: 1024Mi
        memory: 1Gi
        cpu: 500m

hugepages-2Mi works only in combination with the memory parameter; you can’t specify huge pages limits alone.

Finally, let’s verify huge pages usage in the postmaster memory map:

$ kubectl -n pgo exec -it cluster1-instance1-hgrp-0 -c database -- bash

ps -eFH # check process tree and find “first” postgres process

pmap -X -p 107|grep huge

         Address Perm   Offset Device     Inode   Size   Rss  Pss Pss_Dirty Referenced Anonymous LazyFree ShmemPmdMapped FilePmdMapped Shared_Hugetlb Private_Hugetlb Swap SwapPss Locked THPeligible Mapping

    7f35c5c00000 rw-s 00000000  00:0f 145421787 432128     0    0         0          0         0        0              0             0          18432          264192    0       0      0           0 /anon_hugepage (deleted)

Both the Shared_Hugetlb and Private_Hugetlb columns are set (18432 and 264192). This confirms that PostgreSQL can use huge pages.

Don’t set huge pages to the exact value of shared_buffers, as shared memory could also be consumed by extensions and many internal structures.

postgres=# SELECT sum(allocated_size)/1024/1024 FROM pg_shmem_allocations ;
       ?column?       
----------------------
 422.0000000000000000
(1 row)
postgres=# select * from pg_shmem_allocations order by allocated_size desc LIMIT 10;
         name         |    off    |   size    | allocated_size 
----------------------+-----------+-----------+----------------
 <anonymous>          |           | 275369344 |      275369344
 Buffer Blocks        |   6843520 | 134217728 |      134217728
 pg_stat_monitor      | 147603584 |  20971584 |       20971648
 XLOG Ctl             |     54144 |   4208200 |        4208256
                      | 439219200 |   3279872 |        3279872
 Buffer Descriptors   |   5794944 |   1048576 |        1048576
 CommitTs             |   4792192 |    533920 |         534016
 Xact                 |   4263040 |    529152 |         529152
 Checkpointer Data    | 146862208 |    393280 |         393344
 Checkpoint BufferIds | 141323392 |    327680 |         327680
(10 rows)

pg_stat_statements and pg_stat_monitor can add a significant amount of shared memory on top of a small shared_buffers value. Thus, you may need “hugepages-2Mi: 512Mi” even for “shared_buffers: 128MB”.
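Instead of guessing, you can also ask PostgreSQL how many huge pages it needs. A rough sketch, assuming PostgreSQL 15 and that the command is run against the data directory while the server is not running:

# ~422MB of shared memory (from the query above) / 2MB per page ≈ 211 pages, plus headroom
# PostgreSQL 15 can report the exact requirement for the current configuration:
/usr/pgsql-15/bin/postgres -D "$PGDATA" -C shared_memory_size_in_huge_pages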

Now you know all the caveats and may want to repeat the configuration.

It’s easy with anydbver and k3d. Allocate 2MB huge pages:

sysctl vm.nr_hugepages=2048

Verify huge pages availability:

egrep 'Huge|Direct' /proc/meminfo
AnonHugePages:    380928 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:    2048
HugePages_Free:     2048
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:         4194304 kB
DirectMap4k:     1542008 kB
DirectMap2M:    19326976 kB
DirectMap1G:           0 kB

  1. Install and configure anydbver.

    git clone https://github.com/ihanick/anydbver.git
    cd anydbver
    ansible-galaxy collection install theredgreek.sqlite
    echo PROVIDER=docker > .anydbver
    (cd images-build;./build.sh)
  2. Start k3d cluster and install Percona Operator for PostgreSQL 2.2.0:

    ./anydbver deploy k8s-pg:2.2.0
  3. The command hangs on the cluster deployment stage, and a second terminal shows the CrashLoopBackOff state:

    kubectl -n pgo get pods -l postgres-operator.crunchydata.com/data=postgres
  4. Change data/k8s/percona-postgresql-operator/deploy/cr.yaml
    Uncomment .spec.instances[0].resources.limits and set memory: 1Gi, hugepages-2Mi: 1024Mi
  5. Apply CR again:

    kubectl -n pgo apply -f data/k8s/percona-postgresql-operator/deploy/cr.yaml
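Once the new limits are applied, the instance pods should leave the CrashLoopBackOff state; a quick check with the same label selector as above:

kubectl -n pgo get pods -l postgres-operator.crunchydata.com/data=postgres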

In summary:

  • Huge pages are not supported out of the box in public clouds
  • Database crashes can occur if huge pages allocation fails with a bus error
  • Huge pages are not a silver bullet.
    • Without frequent CPU context switches and massive random access to a large shared buffer pool, default 4KB pages show comparable results.
    • Workloads with fewer than 4-5k transactions per second are fine even without huge pages

 

Learn more about Percona Operator for PostgreSQL

Nov 01, 2023

Percona Operators Deployment Explained: Delving Into Cluster-Wide vs. Namespace-Scoped

Namespaces in Kubernetes provide a way to isolate groups of resources within a single cluster. They are useful in multi-tenant environments with a few to tens of users and teams. They also allow dividing cluster resources between these users through quotas.

Percona Operators support both cluster-wide and namespace-scoped deployments. If you have a multi-namespace Kubernetes cluster, the question is which mode to use.

It is worth mentioning that there is also a middle ground: cluster-wide with namespace limitations. That is when an Operator is deployed in cluster-wide mode but has control over a few namespaces, not all. There can be multiple Operators in this mode.

In this blog post, we will dive deeper into the pros and cons of these solutions and help you navigate this uncertainty.

Deeper dive

We will review the differences between cluster-wide and namespace-scoped through the prism of security, availability, and performance. These are the most important aspects of any infrastructure.

Availability

Operators are responsible for deploying new and managing existing database clusters. Existing database clusters will continue working if the Operator goes down, but certain features will not be available:

  1. Deploy new and delete existing clusters
  2. Backups and restores
  3. Scaling, upgrades, and other management tasks

When you choose the deployment method, we recommend considering the blast radius: the more clusters a single Operator controls, the bigger the radius. The recommendation is to keep the blast radius as small as possible.

A namespace-scoped deployment can also have a huge blast radius if many database clusters share one namespace. In this case, it would be wise to redistribute database clusters across multiple namespaces.

Performance

Operators do not impact the performance of the databases themselves, but they can hurt the operational side. This is quite similar to the availability problem. The issue is that, by default, the Operator SDK processes custom resources one by one, with no concurrency. The more Custom Resources (CR) you create and update with a single Operator, the slower the processing gets. It is important to remember that Operators come with various CRs. For example, Percona Operator for PostgreSQL has three:

  • PerconaPGCluster
  • PerconaPGBackup
  • PerconaPGRestore

This means that not only database clusters themselves but also backups and restores are Custom Resources and can impact the processing. For instance, if you have 100 clusters managed by an Operator and create 100 backups at the same time (on schedule), these resources will be processed consecutively.

Another aspect here is blocking operations: Smart Upgrade, full cluster crash recovery, etc. If the Operator is executing these operations on at least one cluster, then other operations are queuing up.

The recommended strategy here is similar to the Availability part: reduce the blast radius.

Maximum Concurrent Reconciles

It is possible to increase the concurrency of the Operator through MaxConcurrentReconciles, but it might introduce other problems, for example, race conditions. We are researching this option to relax performance constraints.

Security

When you deploy Percona Operators, we create the following resources in the cluster:

  • Custom Resource Definitions (CRDs) – To extend Kubernetes APIs. CRDs are a cluster-level resource, so the deployment method does not matter here.
  • Service accounts and roles – To give access to the Operator to create, modify, and remove resources.
  • Deployment – It is the Operator itself.

From a security perspective, we are interested in Service accounts and cluster roles. For cluster-wide, the Operator should have access to multiple namespaces (if not all) in the cluster. The Operator creates Stateful Sets, Services, Volumes, and more. To grant this access in cluster-wide mode, we create ClusterRole instead of Role for the Operator. For example, for Percona Operator for MySQL, you can see the difference between rbac.yaml and cw-rbac.yaml (cluster-wide).
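A quick way to see which mode a deployment uses is to look at the RBAC objects it created. A rough sketch; the exact object names depend on the Operator and release:

# Cluster-wide mode creates a ClusterRole/ClusterRoleBinding for the Operator
kubectl get clusterroles,clusterrolebindings | grep -i percona

# Namespace-scoped mode creates a Role/RoleBinding in the Operator's namespace
kubectl get roles,rolebindings -A | grep -i percona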

A service account token leak is a rare situation, but be aware that in such a case a cluster-wide Operator holds all the keys to the kingdom.

Resource consumption

Every container consumes resources, and the Operator, deployed as a Pod, is no exception. In a namespace-scoped or cluster-wide-with-namespace-limitation setup, you might have multiple installations of the Operator Pod consuming compute resources: CPU and memory. Percona Operators do not consume much – ~50 millicores and ~30MB of RAM – so this becomes a problem only if you have hundreds or thousands of Operator installations.

Conclusion

Operators are a tool that simplifies application deployment and management on Kubernetes. There are certain availability, security, and performance considerations that users should keep in mind when using such tools. The general suggestion is to keep the blast radius small — manage a reasonable number of Custom Resources by a single Operator.

 

Percona Kubernetes Operators

You can get early access to new product features, invite-only ”ask me anything” sessions with Percona Kubernetes experts, and monthly swag raffles. Interested? Fill in the form at percona.com/k8s.

Mar 16, 2023

Percona Labs Presents: Infrastructure Generator for Percona Database as a Service (DBaaS)

Let’s look at how you can run Percona databases on Kubernetes, the easy way.

Chances are that if you are using the latest Percona Monitoring and Management (PMM) version, you have seen the availability of the new Percona Database as a Service (DBaaS). If not, go and get a glimpse of the fantastic feature with our docs on DBaaS – Percona Monitoring and Management.

Now, if you like it and want to give it a try (yay!) but you don’t want to deal with Kubernetes (nay), no worries; we have a tool for you. Introducing the Percona DBaaS Infrastructure Creator, or Percona My Database as a Service (MyDBaaS).

My Database as a Service

First, a clarification: this tool is focused on running your DBaaS on Amazon Web Services (AWS).

At the time of writing, PMM DBaaS supported a limited number of cloud vendors, AWS being one of them. This tool creates the infrastructure in an AWS account.

The usage is pretty simple. You have three features:

  • An instance selector: In case you don’t know what instance type to use, but you do know how many CPUs and how much memory your DBs require.
  • A Cluster Creator: This is where you decide on some minimal and basic properties of the cluster and deploy it.
  • Deploy a Percona Operator: You can choose from our three Operators for Kubernetes to be deployed on the same infrastructure you are creating.

Instance selector

Currently, AWS has 270 different instance types available. Which one to use? The instance selector will help you with that. Just pass the number of vCPUs, amount of memory, and the region you intend to use, and it will show a list of EKS-suitable EC2 instances.

Why ask for the region? Costs! Instance types have different costs depending on the region they are in, and costs are something the tool will show you.

Cluster creator

A very basic interface. You only need to pass the name of the cluster, the number of desired nodes, the instance type, and the region where you would like to run the cluster.

If you pass your AWS key/secret, the tool will take care of deploying the cluster, and once it is done, it will return the contents of the Kubeconfig file for you to use in the DBaaS of PMM.

A note on how we handle your AWS key/secret

Percona will never store this information. In the particular case of this tool, we do set the following environment variables:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • AWS_DEFAULT_REGION

After the creation of the cluster, the environment variables are unset.

Now, why is there an option to not pass the AWS credentials? Well, for security, of course. 

Under the hood, the cluster is created using EKSCTL, a “CloudFormation stack creator.” 

If you are proficient enough with eksctl, you can just copy/paste the generated YAML and run the eksctl command on your own server without ever sharing your credentials. For DBaaS itself you still have to provide them, but that’s another topic.
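If you go that route, the workflow boils down to saving the generated config and running eksctl yourself; a sketch, with a hypothetical file name:

# cluster.yaml is the YAML produced by the tool
eksctl create cluster -f cluster.yaml
# eksctl merges the new cluster's credentials into ~/.kube/config by default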

If you choose to pass the AWS credentials, an EKS cluster will be deployed. The outcome will be the contents of the kubeconfig file.

With that in place, one can now go to PMM and register the Kubernetes cluster as described in https://docs.percona.com/percona-monitoring-and-management/dbaas/get-started.html#add-a-kubernetes-cluster to deploy the DBaaS in there.

But there’s more:

Percona Operators

Now, there’s an additional option: deploying a Percona Operator. Currently, PMM DBaaS focuses on the Percona Operator for MySQL based on Percona XtraDB Cluster. With the MyDBaaS tool, you can choose from the three available Operators.

To get the full deployment, you need to check both the AWS credentials and the Deploy a Percona Operator options.

At the end of the installation, you will have the contents of the Kubeconfig file, plus:

  • The password of the root user 
  • Instructions on how to connect to it

After this, you can use a Percona database running on Kubernetes. Have fun!

About Percona Labs

Percona Labs is a place for the community to have a voice on how we approach solutions, curated by Percona’s CTO, Vadim Tkachenko. Feedback is always welcome:

Community Forum

Oct 14, 2022

Using Percona Kubernetes Operators With K3s Part 3: Monitoring and PMM

As we have Kubernetes installed in part one (see Using Percona Kubernetes Operators With K3s Part 1: Installation) and Percona Server for MySQL running (Using Percona Kubernetes Operators With K3s Part 2: Percona Server for MySQL Operator), let’s review how we can install monitoring and monitor our running instance.

Percona Monitoring and Management installation

We recently implemented helm charts to allow Percona Monitoring and Management (PMM) to be directly installed into Kubernetes (Percona Monitoring and Management in Kubernetes is now in Tech Preview).

The most convenient way is to use helm, so there it is:

helm repo add percona https://percona.github.io/percona-helm-charts/
helm repo update
helm install pmm percona/pmm

Please remember this is a Technical Preview release and there could be some issues (one of them later in this post).

And we will see the pod running:

kubectl get pods -o wide
 
NAME                                             READY   STATUS    RESTARTS        AGE     IP           NODE                 NOMINATED NODE   READINESS GATES
percona-server-mysql-operator-7bb68f7b6d-tsvbv   1/1     Running   0               6d19h   10.42.1.10   beast-node6-ubuntu   <none>           <none>
pmm-0                                            1/1     Running   0               5d18h   10.42.2.12   beast-node5-ubuntu   <none>           <none>

This also will automatically expose PMM service to the outside world:

kubectl get svc
monitoring-service       NodePort    10.43.35.98     <none>        443:32227/TCP,80:32436/TCP

Monitoring the running Percona Operator for MySQL

This is the most manual process so far and I expect we will simplify it in the future.

There is also a problem with auto-generated passwords which can’t be used in the command line, so we need to do extra steps.

To get the running cluster monitored, we need to obtain an API key:

Step 1. Obtain PMM password

kubectl get secret pmm-secret -o jsonpath='{.data.PMM_ADMIN_PASSWORD}' | base64 --decode

Likely it will contain symbols not friendly for the command line (#wUd}Bz]qp3<@=@H in my case), so extra steps:

  • Login to PMM (we can use the IP address of our Kubernetes master node and port from monitoring-service 32227)
  • Change the admin password to contain only letters and numbers, we will use axTTG6ouUuqwXu

Step 2. Obtain API_KEY via the command line

The command line is critical in this case. When I got API_KEY via GUI it did not work for me.

curl --insecure -X POST -H "Content-Type: application/json" -d '{"name":"operator1", "role": "Admin"}' 'https://admin:axTTG6ouUuqwXu@10.30.2.34:32227/graph/api/auth/keys' | jq .key
> eyJrIjoiTFFpQW84S1dFZW1jVDZSMnM1SGk0M0c2djlzRFhjZHQiLCJuIjoib3BlcmF0b3I1IiwiaWQiOjF9

Step 3. Change pmmserverkey secret for running MySQL cluster:

kubectl patch secret cluster9-secrets --type merge --patch '{"stringData": {"pmmserverkey": "eyJrIjoiTFFpQW84S1dFZW1jVDZSMnM1SGk0M0c2djlzRFhjZHQiLCJuIjoib3BlcmF0b3I1IiwiaWQiOjF9"}}'

After this is processed we will see our cluster in PMM!

PMM will also show annotations:

And we can see how the workload switches between nodes when we run the failover workload (see the previous post Using Percona Kubernetes Operators With K3s Part 2: Percona Server for MySQL Operator).

Summary

In this three-part series I showed:

  • How to install your own Kubernetes Cluster (using k3s distribution)
  • How to deploy Percona Operator for MySQL with asynchronous replication, where the Operator can handle node failures
  • How to deploy PMM and get replication monitored with workload charts and failover events

Both Percona Operator for MySQL with asynchronous replication and PMM for Kubernetes are in Technical Preview state, so we need your feedback on what should be improved before we ship the final product!

Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

Learn More About Percona Kubernetes Operators

Oct 11, 2022

Using Percona Kubernetes Operators With K3s Part 2: Percona Server for MySQL Operator

As we have Kubernetes installed in part one (see Using Percona Kubernetes Operators With K3s Part 1: Installation), now we will install Percona Server for MySQL Operator into the running cluster.

I will copy some ideas from Peter’s Minikube tutorial (see Exploring MySQL on Kubernetes with Minikube).

In this case, I will not use Percona XtraDB Cluster Operator but a regular Percona Server for MySQL with asynchronous replication.

We have recently released version 0.3.0 and it is still in the technical preview state, so we are actively looking for more feedback!

If we go with all defaults, then installation can be done in two steps.

Step 1. Install Operator

kubectl apply -f https://raw.githubusercontent.com/percona/percona-server-mysql-operator/v0.3.0/deploy/bundle.yaml

Step 2. Install Percona Server for MySQL in source->replica mode, with one source and two replicas

kubectl apply -f https://raw.githubusercontent.com/percona/percona-server-mysql-operator/v0.3.0/deploy/cr.yaml

And we can see the following pods running:

kubectl get pods -n mysql
NAME                                             READY   STATUS    RESTARTS      AGE
percona-server-mysql-operator-7bb68f7b6d-tsvbv   1/1     Running   0             30m
cluster1-orc-0                                   2/2     Running   0             28m
cluster1-orc-1                                   2/2     Running   0             28m
cluster1-mysql-0                                 3/3     Running   0             28m
cluster1-haproxy-0                               2/2     Running   0             27m
cluster1-haproxy-1                               2/2     Running   0             26m
cluster1-haproxy-2                               2/2     Running   0             26m
cluster1-orc-2                                   2/2     Running   0             27m
cluster1-mysql-1                                 3/3     Running   2 (26m ago)   27m
cluster1-mysql-2                                 3/3     Running   2 (25m ago)   26m

There is a lot of stuff going on, but remember we just installed three Percona Server for MySQL servers combined in the replication setup.

What else is there? You will find Orchestrator (three nodes for HA) and HAProxy (also three nodes).

We need Orchestrator to handle replication failover: if the primary MySQL node is killed or evicted, Orchestrator will elect a replica and promote it to primary.

HAProxy provides a single point of entry and directs traffic to the primary, no matter which pod is the primary right now.

How to connect to MySQL?

For connectivity, Kubernetes offers services endpoints and we can see them as:

kubectl get svc
cluster1-mysql-primary   ClusterIP   10.43.162.118   <none>        3306/TCP,33062/TCP,33060/TCP,6033/TCP   40m
cluster1-mysql-unready   ClusterIP   None            <none>        3306/TCP,33062/TCP,33060/TCP,6033/TCP   40m
cluster1-mysql           ClusterIP   None            <none>        3306/TCP,33062/TCP,33060/TCP,6033/TCP   40m
cluster1-orc             ClusterIP   None            <none>        3000/TCP,10008/TCP                      40m
cluster1-orc-0           ClusterIP   10.43.242.81    <none>        3000/TCP,10008/TCP                      40m
cluster1-orc-1           ClusterIP   10.43.69.105    <none>        3000/TCP,10008/TCP                      40m
cluster1-orc-2           ClusterIP   10.43.184.202   <none>        3000/TCP,10008/TCP                      40m
cluster1-haproxy         ClusterIP   10.43.150.69    <none>        3306/TCP,3307/TCP,3309/TCP              40m

Primarily we are looking for cluster1-haproxy if we have HAProxy enabled and cluster1-mysql-primary if we work without HAProxy. These are hostnames we can use to connect from inside Kubernetes, but to connect from outside of Kubernetes we will need to do some extra work: expose services. 

Typically this is done using NodePort, but before exposing the primary MySQL node, let’s do a little undocumented hack and expose Orchestrator GUI so we can see the current replication topology:

kubectl expose service cluster1-orc --type=NodePort --port=3000 --name=orc
service/orc exposed
 
services:
orc                      NodePort    10.43.87.69     <none>        3000:30924/TCP                          49s
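If you prefer to fetch the assigned port programmatically, a jsonpath query on the exposed service works:

kubectl get svc orc -o jsonpath='{.spec.ports[0].nodePort}'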

Now we can see the current Orchestrator port is 30924, and we can use the Kubernetes master IP address (10.30.2.34 in our case) with this port to connect to the Orchestrator GUI from a browser.

The topology confirms that the primary mysql node is cluster1-mysql-0.

Now we will try to connect to MySQL.

Step 1. Exposing HAProxy service:

kubectl expose service  cluster1-haproxy --type=NodePort --port=3306 --name=mysql-ha
service/mysql-ha exposed
 
mysql-ha                 NodePort    10.43.21.250    <none>        3306:31687/TCP                          57s

Step 2. Figuring out MySQL password:

$ kubectl get secrets cluster1-secrets  -ojson | jq -r .data.root | base64 -d
root_password

Step 3. Creating a database for benchmark:

kubectl exec -it cluster1-mysql-0   -- mysql -uroot -proot_password
create database sbtest;

Step 4. Preparing sysbench database:

sysbench --db-driver=mysql --threads=4 --mysql-host=10.30.2.34 --mysql-port=31687 --mysql-user=root --mysql-password=root_password --mysql-db=sbtest /usr/share/sysbench/oltp_point_select.lua --report-interval=1 --table-size=1000000 prepare

Here are the connection parameters,

--mysql-host=10.30.2.34 --mysql-port=31687 --mysql-user=root --mysql-password=root_password

where 10.30.2.34 is our Kubernetes master node IP and 31687 is the exposed HAProxy port.
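The same parameters work for a plain mysql client from outside the cluster, which is a quick way to confirm connectivity before running the benchmark:

mysql -h 10.30.2.34 -P 31687 -uroot -proot_password -e "SELECT @@hostname"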

Now we can run some sysbench benchmark:

sysbench --db-driver=mysql --threads=4 --mysql-host=10.30.2.34 --mysql-port=31687 --mysql-user=root --mysql-password=root_password --mysql-db=sbtest /usr/share/sysbench/oltp_point_select.lua --report-interval=1 --table-size=1000000 --mysql-ignore-errors=all --time=3600 run
 
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)
 
Running the test with following options:
Number of threads: 4
Report intermediate results every 1 second(s)
Initializing random number generator from current time
 
 
Initializing worker threads...
 
Threads started!
 
[ 1s ] thds: 4 tps: 10573.86 qps: 10573.86 (r/w/o: 10573.86/0.00/0.00) lat (ms,95%): 0.53 err/s: 0.00 reconn/s: 0.00
[ 2s ] thds: 4 tps: 11219.62 qps: 11219.62 (r/w/o: 11219.62/0.00/0.00) lat (ms,95%): 0.51 err/s: 0.00 reconn/s: 0.00
[ 3s ] thds: 4 tps: 11196.11 qps: 11196.11 (r/w/o: 11196.11/0.00/0.00) lat (ms,95%): 0.52 err/s: 0.00 reconn/s: 0.00
[ 4s ] thds: 4 tps: 11555.85 qps: 11555.85 (r/w/o: 11555.85/0.00/0.00) lat (ms,95%): 0.50 err/s: 0.00 reconn/s: 0.00
[ 5s ] thds: 4 tps: 11002.38 qps: 11002.38 (r/w/o: 11002.38/0.00/0.00) lat (ms,95%): 0.52 err/s: 0.00 reconn/s: 0.00
[ 6s ] thds: 4 tps: 11450.22 qps: 11450.22 (r/w/o: 11450.22/0.00/0.00) lat (ms,95%): 0.51 err/s: 0.00 reconn/s: 0.00
[ 7s ] thds: 4 tps: 11477.98 qps: 11477.98 (r/w/o: 11477.98/0.00/0.00) lat (ms,95%): 0.50 err/s: 0.00 reconn/s: 0.00
[ 8s ] thds: 4 tps: 11481.14 qps: 11481.14 (r/w/o: 11481.14/0.00/0.00) lat (ms,95%): 0.52 err/s: 0.00 reconn/s: 0.00
[ 9s ] thds: 4 tps: 11603.96 qps: 11603.96 (r/w/o: 11603.96/0.00/0.00) lat (ms,95%): 0.52 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 4 tps: 11554.07 qps: 11554.07 (r/w/o: 11554.07/0.00/0.00) lat (ms,95%): 0.51 err/s: 0.00 reconn/s: 0.00

Now we want to see how our cluster will handle the primary MySQL node failure.

For this, we will run sysbench and during this time will kill the primary pod:

kubectl delete pods  cluster1-mysql-0  --grace-period=0 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "cluster1-mysql-0" force deleted

And this is what happened with sysbench:

sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)
 
Running the test with following options:
Number of threads: 4
Report intermediate results every 1 second(s)
Initializing random number generator from current time
 
 
Initializing worker threads...
 
Threads started!
 
[ 1s ] thds: 4 tps: 11904.89 qps: 11904.89 (r/w/o: 11904.89/0.00/0.00) lat (ms,95%): 0.39 err/s: 0.00 reconn/s: 0.00
[ 2s ] thds: 4 tps: 12179.00 qps: 12179.00 (r/w/o: 12179.00/0.00/0.00) lat (ms,95%): 0.37 err/s: 0.00 reconn/s: 0.00
[ 3s ] thds: 4 tps: 12344.97 qps: 12344.97 (r/w/o: 12344.97/0.00/0.00) lat (ms,95%): 0.37 err/s: 0.00 reconn/s: 0.00
[ 4s ] thds: 4 tps: 12583.93 qps: 12583.93 (r/w/o: 12583.93/0.00/0.00) lat (ms,95%): 0.35 err/s: 0.00 reconn/s: 0.00
[ 5s ] thds: 4 tps: 12288.16 qps: 12288.16 (r/w/o: 12288.16/0.00/0.00) lat (ms,95%): 0.37 err/s: 0.00 reconn/s: 0.00
[ 6s ] thds: 4 tps: 11970.54 qps: 11970.54 (r/w/o: 11970.54/0.00/0.00) lat (ms,95%): 0.37 err/s: 0.00 reconn/s: 0.00
[ 7s ] thds: 4 tps: 12247.29 qps: 12247.29 (r/w/o: 12247.29/0.00/0.00) lat (ms,95%): 0.36 err/s: 0.00 reconn/s: 0.00
[ 8s ] thds: 4 tps: 12364.22 qps: 12364.22 (r/w/o: 12364.22/0.00/0.00) lat (ms,95%): 0.36 err/s: 0.00 reconn/s: 0.00
[ 9s ] thds: 4 tps: 10705.93 qps: 10705.93 (r/w/o: 10705.93/0.00/0.00) lat (ms,95%): 0.37 err/s: 3.00 reconn/s: 0.00
[ 10s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 11s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 12s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 13s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 14s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 15s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 16s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 17s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 18s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 19s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 21s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 22s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 23s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 24s ] thds: 4 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 25s ] thds: 4 tps: 11970.08 qps: 11970.08 (r/w/o: 11970.08/0.00/0.00) lat (ms,95%): 0.39 err/s: 0.00 reconn/s: 4.00
[ 26s ] thds: 4 tps: 13008.25 qps: 13008.25 (r/w/o: 13008.25/0.00/0.00) lat (ms,95%): 0.38 err/s: 0.00 reconn/s: 0.00
[ 27s ] thds: 4 tps: 13099.60 qps: 13099.60 (r/w/o: 13099.60/0.00/0.00) lat (ms,95%): 0.36 err/s: 0.00 reconn/s: 0.00
[ 28s ] thds: 4 tps: 12875.61 qps: 12875.61 (r/w/o: 12875.61/0.00/0.00) lat (ms,95%): 0.37 err/s: 0.00 reconn/s: 0.00
[ 29s ] thds: 4 tps: 13019.67 qps: 13019.67 (r/w/o: 13019.67/0.00/0.00) lat (ms,95%): 0.37 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 4 tps: 12904.84 qps: 12904.84 (r/w/o: 12904.84/0.00/0.00) lat (ms,95%): 0.38 err/s: 0.00 reconn/s: 0.00
[ 31s ] thds: 4 tps: 12727.94 qps: 12727.94 (r/w/o: 12727.94/0.00/0.00) lat (ms,95%): 0.38 err/s: 0.00 reconn/s: 0.00
[ 32s ] thds: 4 tps: 12683.05 qps: 12683.05 (r/w/o: 12683.05/0.00/0.00) lat (ms,95%): 0.38 err/s: 0.00 reconn/s: 0.00
[ 33s ] thds: 4 tps: 12494.87 qps: 12494.87 (r/w/o: 12494.87/0.00/0.00) lat (ms,95%): 0.39 err/s: 0.00 reconn/s: 0.00
[ 34s ] thds: 4 tps: 12670.94 qps: 12670.94 (r/w/o: 12670.94/0.00/0.00) lat (ms,95%): 0.38 err/s: 0.00 reconn/s: 0.00

So, we can observe a 15-sec downtime. During this time, Orchestrator redirected traffic to a working replica, and we can notice this from Orchestrator GUI.

Immediately after the mysql-0 pod failure, the Orchestrator GUI shows the topology reduced to two nodes, with mysql-2 elected as the new primary.

After the mysql-0 pod healed, it re-joined the cluster, now as a replica of the mysql-2 primary.

Now let’s summarize what happened:

  • The Primary node becomes unavailable
  • Sysbench traffic paused
  • Orchestrator diagnosed primary node failure and elected mysql-2 as the new primary and re-configured replication
  • Sysbench was able to continue the workload
  • Mysql-0 pod was recovered and re-joined the cluster. Orchestrator joined it with a replica role.
  • The HAProxy endpoint continued to serve as the single point of entry

All this was handled AUTOMATICALLY by Percona Server for MySQL Operator without human interaction!

Why don’t you give Percona Operator for MySQL (based on Percona Server for MySQL) a try and provide us with feedback on your experience!

Oct 05, 2022

Using Percona Kubernetes Operators With K3s Part 1: Installation

Recently Peter provided an extensive tutorial on how to use Percona Kubernetes Operators on minikube: Exploring MySQL on Kubernetes with Minikube. Minikube is a great option for local deployment and to get familiar with Kubernetes on a single developer’s box.

But what if we want to get experience with setups that are closer to production Kubernetes and use multiple servers for deployments?

I think there is an alternative that is also easy to set up and easy to start. I am talking about K3s. In fact, it is so easy that this part one will be very short — there is just one requirement that we need to resolve, but it is also easy.

So let’s assume we have four servers we want to use for our Kubernetes deployments: one master and three worker nodes. In my case:

beast-node7-ubuntu
beast-node8-ubuntu (master)
beast-node5-ubuntu
beast-node6-ubuntu

For step one, on master we execute:

curl -sfL https://get.k3s.io | sh -

For step two, we need to prepare a script, and we need two parameters: the IP address of the master and the token for the master. Finding the token is probably the most complicated part of this setup; it is stored in the file /var/lib/rancher/k3s/server/node-token.
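Reading it on the master is a one-liner (assuming the default K3s paths):

sudo cat /var/lib/rancher/k3s/server/node-token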

Having these parameters, the script for other nodes is:

k3s_url="https://10.30.2.34:6443"
k3s_token="K109a7b255d0cf88e75f9dcb6b944a74dbca7a949ebd7924ec3f6135eeadd6624e9::server:5bfa3b7e679b23c55c81c198cc282543"
curl -sfL https://get.k3s.io | K3S_URL=${k3s_url} K3S_TOKEN=${k3s_token} sh -

After executing this script on other nodes we will have our Kubernetes running:

kubectl get nodes
NAME                 STATUS   ROLES                  AGE    VERSION
beast-node7-ubuntu   Ready    <none>                 125m   v1.24.6+k3s1
beast-node8-ubuntu   Ready    control-plane,master   23h    v1.24.6+k3s1
beast-node5-ubuntu   Ready    <none>                 23h    v1.24.6+k3s1
beast-node6-ubuntu   Ready    <none>                 23h    v1.24.6+k3s1

This is sufficient for a basic Kubernetes setup, but for our Kubernetes Operators, we need an extra step: Dynamic Volume Provisioning, because our Operators request volumes to store data.

Actually, after further research, it appears that Operators will use the default local-path provisioner, which satisfies Operator requirements.
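You can confirm this with a quick check; K3s ships local-path as the default StorageClass:

kubectl get storageclass
# local-path should be listed and marked as (default)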

After this, the K3s cluster should be ready to deploy our Operators, which we will review in the next part of this series. Stay tuned!

Oct 03, 2022

Making Percona Kubernetes Operators More Secure with Trivy

Just to have a simple refresher, let’s start with a bit of Wikipedia: a vulnerability (in computing) is:

Vulnerabilities are flaws in a computer system that weaken the overall security of the device/system. Vulnerabilities can be exploited  by a threat actor, such as an attacker, to cross privilege boundaries (i.e. perform unauthorized actions) within a computer system.

There are a bunch of software vulnerabilities out there. Probably the most popular are the ones like Heartbleed, Meltdown, or Shellshock. These examples are what is called a known vulnerability. These vulnerabilities are the ones that are disclosed and have an ID (like CVE ID) assigned to them and which we can look up and track in databases like National Vulnerability Database. On the other hand, unknown vulnerabilities are vulnerabilities present in a computer system that have either not been discovered or discovered but have not been documented yet. 

When it comes to the unknown vulnerabilities, typically they are bugs that we produce in our software and we should deal with them like bugs are dealt with – proper and continuous testing (tools like OWASP Zap can be helpful with these as well). But for the known ones, if there are ways with which we can spot them and eliminate them, it should be mandatory that we do it since it is crucial for the safety and reliability of our products. This is where vulnerability scanners come into play and, in our case here, Trivy.

What is Trivy

Trivy is an open source, reliable, fast, easy-to-use comprehensive security scanner. It has different scanners for different security issues (like known vulnerabilities – CVEs or IaC misconfigurations) and supports different targets (like container images, filesystem, or Kubernetes resources) where it can find these issues. Check the docs to learn more.

It satisfies our needs first of all with its reliability and correctness, but also by being dead simple to use. It is a one-command tool; for example, to scan a Docker image, just call it like:

trivy image [YOUR_IMAGE_NAME]

As the result you will see:

$ trivy image myimage:1.0.0
2022-05-16T13:25:17.826+0100    INFO    Detected OS: alpine
2022-05-16T13:25:17.826+0100    INFO    Detecting Alpine vulnerabilities...
2022-05-16T13:25:17.826+0100    INFO    Number of language-specific files: 0

myimage:1.0.0 (alpine 3.15.3)

Total: 2 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 2)

┌────────────┬────────────────┬──────────┬───────────────────┬───────────────┬─────────────────────────────────────────────────────────┐
│  Library   │ Vulnerability  │ Severity │ Installed Version │ Fixed Version │                          Title                          │
├────────────┼────────────────┼──────────┼───────────────────┼───────────────┼─────────────────────────────────────────────────────────┤
│ busybox    │ CVE-2022-28391 │ CRITICAL │ 1.34.1-r4         │ 1.34.1-r5     │ busybox: remote attackers may execute arbitrary code if │
│            │                │          │                   │               │ netstat is used                                         │
│            │                │          │                   │               │ https://avd.aquasec.com/nvd/cve-2022-28391              │
├────────────┤                │          │                   │               │                                                         │
│ ssl_client │                │          │                   │               │                                                         │
└────────────┴────────────────┴──────────┴───────────────────┴───────────────┴─────────────────────────────────────────────────────────┘

app/deploy.sh (secrets)

Total: 1 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 1)

┌──────────┬───────────────────┬──────────┬─────────┬────────────────────────────────┐
│ Category │    Description    │ Severity │ Line No │             Match              │
├──────────┼───────────────────┼──────────┼─────────┼────────────────────────────────┤
│   AWS    │ AWS Access Key ID │ CRITICAL │    3    │ export AWS_ACCESS_KEY_ID=***** │
└──────────┴───────────────────┴──────────┴─────────┴────────────────────────────────┘

 

Scanning Percona Kubernetes Operators

With our Percona Kubernetes Operators, we focus on scanning our container images for known vulnerabilities. It is crucial for us to prevent deploying containers with vulnerable packages to a running environment. CI/CD pipelines (GitHub Actions in our case) are the best place to do it since we want to check everything as early as possible.

Using Trivy directly as a GitHub action can be smoothly implemented. To try it in your GitHub repository, just find it in the marketplace and add it to your workflow.
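In essence, the CI step runs the same scanner with a non-zero exit code on findings, so the pipeline fails fast. A sketch; the image name and severity threshold are illustrative:

# Fail the build if critical or high severity vulnerabilities are found
trivy image --exit-code 1 --severity CRITICAL,HIGH myimage:1.0.0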

All of our Percona docker images are scanned with Trivy. This is a nice example of it finding a vulnerability in Percona Operator for MySQL after we upgraded our logger lib:

usr/local/bin/percona-server-mysql-operator (gobinary)
======================================================
Total: 1 (HIGH: 0, CRITICAL: 1)

┌────────────────────────────────┬───────────────┬──────────┬─────────────────────┬───────────────┬──────────────────────────────────────────────────────────────┐
│            Library             │ Vulnerability │ Severity │  Installed Version  │ Fixed Version │                            Title                             │
├────────────────────────────────┼───────────────┼──────────┼─────────────────────┼───────────────┼──────────────────────────────────────────────────────────────┤
│ github.com/emicklei/go-restful │ CVE-2022-1996 │ CRITICAL │ v2.9.5+incompatible │ v3.8.0        │ go-restful: Authorization Bypass Through User-Controlled Key │
│                                │               │          │                     │               │ https://avd.aquasec.com/nvd/cve-2022-1996                    │
└────────────────────────────────┴───────────────┴──────────┴─────────────────────┴───────────────┴──────────────────────────────────────────────────────────────┘

There we can see that we had the vulnerability CVE-2022-1996 in the go-restful lib, and we clearly know which version we need to move to (v3.8.0 in this case). This is a highly automated process, and its level of effectiveness is really great.

Besides scanning just the operator image itself, we also scan every other image used by the operator and we check it on every commit. That’s how efficient the whole process is.

Conclusion

With proper and continuous testing we can avoid many unknown vulnerabilities, but for known vulnerabilities it is critical for us to eliminate them, and to eliminate them as early as possible. Trivy is a tool that ticks all our checkboxes: reliable, correct, efficient, and simple. We can highly recommend integrating it into your CI/CD workflow.

Want to report the security issue in one of the Percona products? Please send an email to security@percona.com. For more information please read https://www.percona.com/security.

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

Learn More About Percona Kubernetes Operators
