Mar 15, 2022

Run PostgreSQL on Kubernetes with Percona Operator and Pulumi

Avoiding vendor lock-in, providing a private Database-as-a-Service for internal teams, quickly deploying, testing, and destroying databases in CI/CD pipelines – these are some of the most common use cases for running databases on Kubernetes with operators. Percona Distribution for PostgreSQL Operator enables users to do exactly that and more.

Pulumi is an infrastructure-as-code tool that enables developers to write code in their favorite language (Python, Golang, JavaScript, etc.) to easily deploy infrastructure and applications to public clouds and platforms such as Kubernetes.

This blog post is a step-by-step guide on how to deploy a highly-available PostgreSQL cluster on Kubernetes with our Percona Operator and Pulumi.

Desired State

We are going to provision the following resources with Pulumi:

  • Google Kubernetes Engine cluster with three nodes. It can be any Kubernetes flavor.
  • Percona Operator for PostgreSQL
  • Highly available PostgreSQL cluster with one primary and two hot standby nodes
  • Highly available pgBouncer deployment with the Load Balancer in front of it
  • pgBackRest for local backups

Pulumi code can be found in this git repository.

Prepare

I will use an Ubuntu box to run Pulumi, but almost the same steps would work on macOS.

Pre-install Packages

gcloud and kubectl

echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key --keyring /usr/share/keyrings/cloud.google.gpg add -
sudo apt-get update
sudo apt-get install -y google-cloud-sdk docker.io kubectl jq unzip

python3

Pulumi allows developers to use the language of their choice to describe infrastructure and applications. I'm going to use Python. We will also need pip (the Python package-management system) and venv (the virtual environment module).

sudo apt-get install python3 python3-pip python3-venv

Pulumi

Install Pulumi:

curl -sSL https://get.pulumi.com | sh

On macOS, this can be installed via Homebrew with

brew install pulumi


You will need to add .pulumi/bin to the $PATH:

export PATH=$PATH:$HOME/.pulumi/bin

Authentication

gcloud

You will need to provide access to Google Cloud to provision Google Kubernetes Engine.

gcloud config set project your-project
gcloud auth application-default login
gcloud auth login

Pulumi

Generate a Pulumi token at app.pulumi.com. You will need it later to initialize the Pulumi stack.

Action

This repo has the following files:

  • Pulumi.yaml – identifies the folder as a Pulumi project
  • __main__.py – Python code used by Pulumi to provision everything we need
  • requirements.txt – lists the required Python packages

Clone the repo and go to the pg-k8s-pulumi folder:

git clone https://github.com/spron-in/blog-data
cd blog-data/pg-k8s-pulumi
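Optionally, create a Python virtual environment and install the dependencies from requirements.txt before initializing the stack. This is a standard Python workflow, not something specific to this repo:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt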

Init the stack with:

pulumi stack init pg

You will need the token generated earlier on app.pulumi.com here.

__main__.py

The Python code that Pulumi is going to process is in the __main__.py file.

Lines 1-6: importing python packages

Lines 8-31: configuration parameters for this Pulumi stack. It consists of two parts:

  • Kubernetes cluster configuration. For example, the number of nodes.
  • Operator and PostgreSQL cluster configuration – namespace to be deployed to, service type to expose pgBouncer, etc.

Lines 33-80: deploy GKE cluster and export its configuration

Lines 82-88: create the namespace for Operator and PostgreSQL cluster

Lines 91-426: deploy the Operator. In reality, it just mirrors the operator.yaml from our Operator.

Lines 429-444: create the secret object that allows you to set the password for pguser to connect to the database

Lines 445-557: deploy PostgreSQL cluster. It is a JSON version of cr.yaml from our Operator repository

Line 560: exports Kubernetes configuration so that it can be reused later 
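To give a feel for what this code looks like, here is a minimal, hypothetical sketch of the namespace step (the real lines 82-88 in __main__.py are more elaborate and attach the GKE provider):

import pulumi
import pulumi_kubernetes as kubernetes

config = pulumi.Config()

# Create the namespace that the Operator and the PostgreSQL
# cluster will be deployed into; the name comes from stack config.
namespace = kubernetes.core.v1.Namespace(
    "pg-namespace",
    metadata={"name": config.require("namespace")},
)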

Deploy

At first, we will set the configuration for this stack. Execute the following commands:

pulumi config set gcp:project YOUR_PROJECT
pulumi config set gcp:zone us-central1-a
pulumi config set node_count 3
pulumi config set master_version 1.21

pulumi config set namespace percona-pg
pulumi config set pg_cluster_name pulumi-pg
pulumi config set service_type LoadBalancer
pulumi config set pg_user_password mySuperPass

These commands set the following:

  • GCP project where GKE is going to be deployed
  • GCP zone 
  • Number of nodes in a GKE cluster
  • Kubernetes version
  • Namespace to run PostgreSQL cluster
  • The name of the cluster
  • Expose pgBouncer with LoadBalancer object

Deploy with the following command:

$ pulumi up
Previewing update (pg)

View Live: https://app.pulumi.com/spron-in/percona-pg-k8s/pg/previews/d335d117-b2ce-463b-867d-ad34cf456cb3

     Type                                                           Name                                Plan       Info
 +   pulumi:pulumi:Stack                                            percona-pg-k8s-pg                   create     1 message
 +   ├─ random:index:RandomPassword                                 pguser_password                     create
 +   ├─ random:index:RandomPassword                                 password                            create
 +   ├─ gcp:container:Cluster                                       gke-cluster                         create
 +   ├─ pulumi:providers:kubernetes                                 gke_k8s                             create
 +   ├─ kubernetes:core/v1:ServiceAccount                           pgoPgo_deployer_saServiceAccount    create
 +   ├─ kubernetes:core/v1:Namespace                                pgNamespace                         create
 +   ├─ kubernetes:batch/v1:Job                                     pgoPgo_deployJob                    create
 +   ├─ kubernetes:core/v1:ConfigMap                                pgoPgo_deployer_cmConfigMap         create
 +   ├─ kubernetes:core/v1:Secret                                   percona_pguser_secretSecret         create
 +   ├─ kubernetes:rbac.authorization.k8s.io/v1:ClusterRoleBinding  pgo_deployer_crbClusterRoleBinding  create
 +   ├─ kubernetes:rbac.authorization.k8s.io/v1:ClusterRole         pgo_deployer_crClusterRole          create
 +   └─ kubernetes:pg.percona.com/v1:PerconaPGCluster               my_cluster_name                     create

Diagnostics:
  pulumi:pulumi:Stack (percona-pg-k8s-pg):
    E0225 14:19:49.739366105   53802 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies

Do you want to perform this update? yes

Updating (pg)
View Live: https://app.pulumi.com/spron-in/percona-pg-k8s/pg/updates/5
     Type                                                           Name                                Status      Info
 +   pulumi:pulumi:Stack                                            percona-pg-k8s-pg                   created     1 message
 +   ├─ random:index:RandomPassword                                 pguser_password                     created
 +   ├─ random:index:RandomPassword                                 password                            created
 +   ├─ gcp:container:Cluster                                       gke-cluster                         created
 +   ├─ pulumi:providers:kubernetes                                 gke_k8s                             created
 +   ├─ kubernetes:core/v1:ServiceAccount                           pgoPgo_deployer_saServiceAccount    created
 +   ├─ kubernetes:core/v1:Namespace                                pgNamespace                         created
 +   ├─ kubernetes:core/v1:ConfigMap                                pgoPgo_deployer_cmConfigMap         created
 +   ├─ kubernetes:batch/v1:Job                                     pgoPgo_deployJob                    created
 +   ├─ kubernetes:core/v1:Secret                                   percona_pguser_secretSecret         created
 +   ├─ kubernetes:rbac.authorization.k8s.io/v1:ClusterRole         pgo_deployer_crClusterRole          created
 +   ├─ kubernetes:rbac.authorization.k8s.io/v1:ClusterRoleBinding  pgo_deployer_crbClusterRoleBinding  created
 +   └─ kubernetes:pg.percona.com/v1:PerconaPGCluster               my_cluster_name                     created

Diagnostics:
  pulumi:pulumi:Stack (percona-pg-k8s-pg):
    E0225 14:20:00.211695433   53839 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies

Outputs:
    kubeconfig: "[secret]"

Resources:
    + 13 created

Duration: 5m30s

Verify

Get kubeconfig first:

pulumi stack output kubeconfig --show-secrets > ~/.kube/config

Check if Pods of your PG cluster are up and running:

$ kubectl -n percona-pg get pods
NAME                                             READY   STATUS      RESTARTS   AGE
backrest-backup-pulumi-pg-dbgsp                  0/1     Completed   0          64s
pgo-deploy-8h86n                                 0/1     Completed   0          4m9s
postgres-operator-5966f884d4-zknbx               4/4     Running     1          3m27s
pulumi-pg-787fdbd8d9-d4nvv                       1/1     Running     0          2m12s
pulumi-pg-backrest-shared-repo-f58bc7657-2swvn   1/1     Running     0          2m38s
pulumi-pg-pgbouncer-6b6dc4564b-bh56z             1/1     Running     0          81s
pulumi-pg-pgbouncer-6b6dc4564b-vpppx             1/1     Running     0          81s
pulumi-pg-pgbouncer-6b6dc4564b-zkdwj             1/1     Running     0          81s
pulumi-pg-repl1-58d578cf49-czm54                 0/1     Running     0          46s
pulumi-pg-repl2-7888fbfd47-h98f4                 0/1     Running     0          46s
pulumi-pg-repl3-cdd958bd9-tf87k                  1/1     Running     0          46s

Get the IP address of the pgBouncer LoadBalancer:

$ kubectl -n percona-pg get services
NAME                             TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)                      AGE
…
pulumi-pg-pgbouncer              LoadBalancer   10.20.33.122   35.188.81.20   5432:32042/TCP               3m17s

You can connect to your PostgreSQL cluster through this IP address. Use the pguser password that was set earlier with pulumi config set pg_user_password:

psql -h 35.188.81.20 -p 5432 -U pguser pgdb

Clean up

To delete everything, run the following commands:

pulumi destroy
pulumi stack rm

Tricks and Quirks

Pulumi Converter

kube2pulumi is a huge help if you already have YAML manifests. You don't need to rewrite everything from scratch; just convert the YAMLs to Pulumi code. This is what I did for operator.yaml.

apiextensions.CustomResource

There are two ways to manage Custom Resources in Pulumi:

crd2pulumi generates libraries/classes out of Custom Resource Definitions and allows you to create custom resources later using them. I found it a bit complicated, and it also lacks documentation.

apiextensions.CustomResource, on the other hand, allows you to create Custom Resources by specifying them as JSON. It is much easier and requires less manipulation. See lines 446-557 in my __main__.py.

True/False in JSON

I have the following in my Custom Resource definition in Pulumi code:

perconapg = kubernetes.apiextensions.CustomResource(
…
    spec={
…
        "disableAutofail": False,
        "tlsOnly": False,
        "standby": False,
        "pause": False,
        "keepData": True,

Be sure to use the boolean type of the language of your choice and not the “true”/”false” strings. For me, using strings resulted in a failure, as the Operator expects booleans, not strings.

Depends On…

Pulumi makes its own decisions about the order in which resources are provisioned. You can enforce the order by specifying dependencies explicitly.

For example, I’m ensuring that Operator and Secret are created before the Custom Resource:

    }, opts=ResourceOptions(provider=k8s_provider, depends_on=[pgo_pgo_deploy_job, percona_pg_cluster1_pguser_secret_secret])
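Put together, a trimmed-down, hypothetical version of that call looks like this (k8s_provider, pgo_pgo_deploy_job, and percona_pg_cluster1_pguser_secret_secret are resources defined earlier in __main__.py):

import pulumi_kubernetes as kubernetes
from pulumi import ResourceOptions

perconapg = kubernetes.apiextensions.CustomResource(
    "my_cluster_name",
    api_version="pg.percona.com/v1",
    kind="PerconaPGCluster",
    metadata={"name": "pulumi-pg", "namespace": "percona-pg"},
    spec={
        # ... full cluster spec, as in lines 445-557 of __main__.py ...
        "keepData": True,
    },
    # depends_on guarantees the Operator deploy Job and the pguser
    # Secret exist before the Custom Resource is created.
    opts=ResourceOptions(
        provider=k8s_provider,
        depends_on=[pgo_pgo_deploy_job, percona_pg_cluster1_pguser_secret_secret],
    ),
)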

Dec 14, 2021

High Availability and Disaster Recovery Recipes for PostgreSQL on Kubernetes

Percona Distribution for PostgreSQL Operator allows you to deploy and manage highly available and production-grade PostgreSQL clusters on Kubernetes with minimal manual effort. In this blog post, we are going to look deeper into High Availability, Disaster Recovery, and Scaling of PostgreSQL clusters.

High Availability

Our default custom resource manifest deploys a highly available (HA) PostgreSQL cluster. The key components of the HA setup are:

  • Kubernetes Services that point to pgBouncer and replica nodes
  • pgBouncer – a lightweight connection pooler for PostgreSQL
  • Patroni – HA orchestrator for PostgreSQL
  • PostgreSQL nodes – we have one primary and two replica nodes in hot standby by default


Kubernetes Service is the way to expose your PostgreSQL cluster to applications or users. We have two services:

  • clusterName-pgbouncer – exposes your PostgreSQL cluster through the pgBouncer connection pooler. Both reads and writes are sent to the Primary node.
  • clusterName-replica – exposes replica nodes directly. It should be used for reads only. Also, keep in mind that connections to this service are not pooled. We are working on a better solution, where the user would be able to leverage both connection pooling and read scaling through a single service.

By default we use the ClusterIP service type, but you can change it in pgBouncer.expose.serviceType or pgReplicas.hotStandby.expose.serviceType, respectively.
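For example, a minimal cr.yaml fragment that exposes pgBouncer through a LoadBalancer would look like this (field layout follows the default custom resource manifest):

  pgBouncer:
    expose:
      serviceType: LoadBalancer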

Every PostgreSQL container has Patroni running. Patroni monitors the state of the cluster and, in case of Primary node failure, switches the role of the Primary to one of the Replica nodes. pgBouncer always knows where the Primary is.

As you see, we distribute PostgreSQL cluster components across different Kubernetes nodes. This is done with Affinity rules, which are applied by default to ensure that a single node failure does not cause database downtime.

Multi-Datacenter with Multi-AZ

Good architecture design is to run your Kubernetes cluster across multiple datacenters. Public clouds have a concept of availability zones (AZ) which are data centers within one region with a low-latency network connection between them. Usually, these data centers are at least 100 kilometers away from each other to minimize the probability of regional outage. You can leverage multi-AZ Kubernetes deployment to run cluster components in different data centers for better availability.


To ensure that PostgreSQL components are distributed across availability zones, you need to tweak the affinity rules. For now, this is only possible by editing the Deployment resources directly:

$ kubectl edit deploy cluster1-repl2
…
-            topologyKey: kubernetes.io/hostname
+            topologyKey: topology.kubernetes.io/zone

Scaling

Scaling PostgreSQL to meet the demand at peak hours is crucial for high availability. Our Operator provides you with tools to scale PostgreSQL components both horizontally and vertically.

Vertical Scaling

Scaling vertically is all about adding more power to a PostgreSQL node. The recommended way is to change resources in the Custom Resource (instead of changing them in Deployment objects directly). For example, change the following in cr.yaml to get 256 MiB of RAM for all PostgreSQL Replica nodes:

  pgReplicas:
    hotStandby:
      resources:
        requests:
-         memory: "128Mi"
+         memory: "256Mi"

Apply cr.yaml:

$ kubectl apply -f cr.yaml

Use the same approach to tune other components in their corresponding sections.

You can also leverage the Vertical Pod Autoscaler (VPA) to react to load spikes automatically. We create a Deployment resource for the Primary and each Replica node, and VPA objects should target these Deployments. The following example will track one of the replica Deployment resources of cluster1 and scale it automatically:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: pxc-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: cluster1-repl1
    namespace: pgo
  updatePolicy:
    updateMode: "Auto"

Please read more about VPA and its capabilities in its documentation.

Horizontal Scaling

Adding more replica nodes or pgBouncers can be done by changing the size parameters in the Custom Resource. Make the following change in the default cr.yaml:

  pgReplicas:
    hotStandby:
-      size: 2
+      size: 3

Apply the change to get one more PostgreSQL Replica node:

$ kubectl apply -f cr.yaml

Starting from release 1.1.0, it is also possible to scale the cluster using the kubectl scale command. Execute the following to have two PostgreSQL replica nodes in cluster1:

$ kubectl scale --replicas=2 perconapgcluster/cluster1
perconapgcluster.pg.percona.com/cluster1 scaled

It is not yet possible to use the Horizontal Pod Autoscaler (HPA) in the latest release; we will support it in the next one. Stay tuned.

Disaster Recovery

It is important to understand that Disaster Recovery (DR) is not High Availability. DR's goal is to ensure business continuity in the case of a massive disaster, such as a full region outage. Recovery in such cases can, of course, be automated, but not necessarily – it strictly depends on the business requirements.


Backup and Restore

I think this is the most common Disaster Recovery protocol: take the backup, store it on some third-party premises, and restore it to another data center if needed.

This approach is simple, but comes with a long recovery time, especially if the database is big. Use this method only if it meets your Recovery Time Objective (RTO).


Our Operator handles backup and restore for PostgreSQL clusters. The disaster recovery is built around pgBackRest and looks like the following:

  1. Configure pgBackRest to upload backups to S3 or GCS (see our documentation for details).
  2. Create the backup manually (through pgTask) or ensure that a scheduled backup was created.
  3. Once the main cluster fails, create a new cluster in the Disaster Recovery data center. The cluster must be running in standby mode, and pgBackRest must point to the same repository as the main cluster:
spec:
  standby: true
  backup:
  # same config as on original cluster

Once the data is recovered, the user can turn off standby mode and switch the application to the DR cluster.

Continuous Restoration

This approach is quite similar to the above: pgBackRest instances continuously synchronize data between two clusters through object storage. It minimizes RTO and lets you switch application traffic to the DR site almost immediately.


The configuration here is similar to the previous case, but we always run a second PostgreSQL cluster in the Disaster Recovery data center. In case of a main site failure, just turn off the standby mode:

spec:
  standby: false

You can use a similar setup to migrate the data to and from Kubernetes. Read more about it in the Migrating PostgreSQL to Kubernetes blog post.

Conclusion

Kubernetes Operators provide a ready-to-use service, and in the case of Percona Distribution for PostgreSQL Operator, the user gets a production-grade, highly available database cluster. In addition, the Operator provides day-2 operation capabilities and automates the day-to-day routine.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions please raise the topic in the community forum.

Are you a developer looking to contribute? Please read our CONTRIBUTING.md and send a Pull Request.

Dec 08, 2021

Percona Distribution for PostgreSQL Operator 1.1.0 – Notable Features

Percona in 2021 is heavily invested in making the PostgreSQL ecosystem better and contributing to it from different angles.

With this in mind, let me introduce Percona Distribution for PostgreSQL Operator version 1.1.0 and its notable features:

  • Smart Update – forget about manual and error-prone database upgrades
  • System Users management – add and modify system users with ease with a single Kubernetes Secret resource
  • PostgreSQL 14 support – leverage the latest and greatest by running Percona Distribution for PostgreSQL on Kubernetes

Full release notes can be found here.

Smart Update Feature

Updating databases and their components is always a challenge. In our Operators for MySQL and MongoDB, we have simplified and automated upgrade procedures, and now it's time for PostgreSQL. In version 1.1.0, we ship this feature as a Technical Preview, with a plan to promote it to GA in the next release.

This feature consists of two parts:

  • Version Service – get the latest or recommended version of the database or other component (PMM for example)
  • Smart Update – apply new version without downtime

Version Service

This feature answers the question: which PostgreSQL/pgBackRest/pgBouncer version should I be running with this Operator? It is important to note that Version Service and Smart Update can only perform minor version upgrades (e.g., from 13.1 to 13.4). Major version upgrades are manual for now and will be automated in the Operator soon.

How it works is well depicted in the following diagram:


Version Service is an open source tool; see the source code on GitHub. Percona hosts check.percona.com, and Operators use it by default, but users can run their own self-hosted Version Service.

Users who worked with our Operators for MySQL and MongoDB will find the configuration of Version Service and Smart Update quite familiar:

  upgradeOptions:
    versionServiceEndpoint: https://check.percona.com
    apply: recommended
    schedule: "0 2 * * *"

  • versionServiceEndpoint defines the Version Service endpoint
  • apply defines the desired PostgreSQL version – the Operator will automatically figure out the component versions
  • schedule defines when the rollout of newer versions takes place; it is good practice to set this outside of peak hours

Smart Update

Okay, now the Operator knows which versions should be used. It is time to apply them with minimal downtime. This is where the Smart Update feature kicks in.

The heart of Smart Update is the smartUpdateCluster function. The goal is to switch container image versions for database components in a specific order while minimizing downtime. Once an image is changed, Kubernetes does the rest: for the Deployment resources our Operator uses, Kubernetes first spins up a Pod with the new image and then terminates the old one, which keeps downtime minimal. The update itself looks like this:

  1. Upgrade the pgBackRest image in its Deployment object in Kubernetes
  2. Start upgrading PostgreSQL itself
    1. Percona Monitoring and Management, which runs as a sidecar, gets the new version here as well
    2. Same for pgBadger
    3. We must upgrade replica nodes first. If we upgrade the primary node first, the cluster will not recover. The tricky part is that, in the event of a failover, the Primary node can be running in one of the pgReplicas Deployments, so we need to verify where the Primary is and only then change the image. See the Smart Update sequence diagram for more details.
  3. Last, but not least – change the image for pgBouncer. To minimize downtime here, we recommend running at least two pgBouncer nodes. By default, pgBouncer.size is set to 3. (A quick way to inspect which images are currently running is shown after this list.)
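A generic way to inspect which images your Deployments are currently running (plain kubectl, not an Operator-specific command; assumes the pgo namespace):

kubectl -n pgo get deploy -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.containers[*].image}{"\n"}{end}'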

As a result, the user gets the latest, most secure, and performant PostgreSQL and its components automatically with minimal downtime.

System Users Management

Our Operator has multiple system users for managing the cluster and ensuring its health. Our users raised two main concerns:

  • it is not possible to change a system user's password with the Operator after cluster deployment
  • it is confusing that there is a separate Secret object per user

In this release, we are moving all system users to a single Secret. A change in the Secret resource triggers the update of the passwords in PostgreSQL automatically.

If the cluster is created from scratch, the Secret with system users is created automatically and the passwords are randomly generated. By default the Secret name is <clusterName>-users; it can be changed with the spec.secretsName variable in the Custom Resource:

spec:
  secretsName: my-custom-secret

When upgrading from 1.0.0 to 1.1.0, if you want to keep the old passwords, please create the Secret resource manually. Otherwise, the passwords for system users will be generated randomly and updated by the Operator.
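A hypothetical sketch of such a Secret (key names here are illustrative – the exact set of system users and expected keys are described in the Operator documentation):

apiVersion: v1
kind: Secret
metadata:
  name: cluster1-users
type: Opaque
stringData:
  postgres: oldPostgresPassword
  primaryuser: oldPrimaryPassword
  pgbouncer: oldPgBouncerPassword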

PostgreSQL 14 Support

PostgreSQL 14 provides an extensive set of new features and enhancements to security, performance, usability for client applications, and more.

Most notable of them include the following:

  • Expired B-tree index entries can now be detected and removed between vacuum runs. This results in fewer page splits and reduces index bloat.
  • The vacuum process now deletes dead tuples in a single cycle, as opposed to the previous 2-step approach of first marking tuples as deleted and then actually freeing up space in the next run. This speeds up free space cleanup.
  • Support for subscripts in JSON is added to simplify data retrieval using a commonly recognized syntax.
  • Stored procedures can accept OUT parameters.
  • The libpq library now supports the pipeline mode. Previously, the client applications waited for a transaction to be completed before sending the next one. The pipeline mode allows the applications to send multiple transactions at the same time thus boosting performance.
  • Large transactions are now streamed to subscribers in-progress, thus increasing the performance. This improvement applies to logical replication.
  • LZ4 compression is added for TOAST operations. This speeds up large data processing and also improves the compression ratio.
  • SCRAM is made the default authentication mechanism. This mechanism improves security and simplifies regulatory compliance for data security.

In version 1.1.0 of Percona Distribution for PostgreSQL Operator, we enable our users to run the latest and greatest PostgreSQL 14. PostgreSQL 14 is the default version since this release, but you can still use versions 12 and 13.

Conclusion

Kubernetes Operators are mainly seen as tools to automate the deployment and management of applications. With this Percona Distribution for PostgreSQL Operator release, we simplify PostgreSQL management even more and enable users to leverage the latest version 14.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions please raise the topic in the community forum.

Are you a developer looking to contribute? Please read our CONTRIBUTING.md and send a Pull Request.

Aug 24, 2021

PostgreSQL Custom Dashboards Ported to Percona Monitoring and Management 2


At the recent Percona Live, Agustín Gallego and I gave the presentation “Optimizing and Troubleshooting PostgreSQL with PMM”. While preparing for this talk, I ported some of the older PostgreSQL custom dashboards that we published over the years at grafana.com. Initially made for Percona Monitoring and Management (PMM) 1, they can now be added to PMM 2 easily. In this short blog post, I will show you how to set up two of these dashboards on PMM 2: “PostgreSQL Tuple Statistics” and “PostgreSQL DB/Table Size Details”. Technically, the DB/Table Size one already works with PMM 2, but there’s an easier way to set it up in recent PMM versions.

You can also check out the original blog post covering the installation of the Tuple Statistics dashboard for PMM 1: “Adding PostgreSQL Tuple Statistics Dashboard to the PMM Plugin“.

Disclaimer About PMM Versions

PostgreSQL support is continuously improving in PMM, and the engineering team is busy with QAN and exporter improvements. In PMM release 2.16.0, postgres_exporter was updated to community version 0.8.0, which, among other things, significantly improves the way custom queries are executed. In this post, I assume that you're using PMM version 2.16.0 or above. If you're using an older version of PMM 2, you will have to use the dblink approach proposed in the original blog post. Note that some parts of the new dashboards may not work properly with older PMM versions.

PostgreSQL Custom Dashboards Overview

The two dashboards that were ported are “PostgreSQL Tuple Statistics” and “PostgreSQL DB/Table Size Details“, both adding important missing pieces of information to PMM. They were initially implemented by Vadim Yalovets.

“PostgreSQL Tuple Statistics”, in addition to the breakdown of tuple operations, provides an overview of the dead/live tuple ratio and details on the history of vacuum executions. A new addition with this port to PMM 2 is that the dashboard now gives some insight into database age and current progress towards the “vacuum to prevent wraparound.” Having a view of operations happening on a per-table basis can highlight the hottest objects in the databases, and the vacuum details are a welcome addition to any monitoring system that works with PostgreSQL.

The “PostgreSQL DB/Table Size Details” dashboard is simpler and narrower in scope. It gives an overview of database and table sizes and their rate of growth – helpful when you want to understand where all the disk space went.

Setting up Custom Queries

Once you have set up the PMM client to monitor your PostgreSQL instance, you will find three directories where you can put custom queries:

# cd /usr/local/percona/pmm2/collectors/custom-queries/postgresql/
# ls -l
drwxr-xr-x. 2 pmm-agent pmm-agent  69 Jul 12 21:16 high-resolution
drwxr-xr-x. 2 pmm-agent pmm-agent 126 Jul 12 21:24 low-resolution
drwxr-xr-x. 2 pmm-agent pmm-agent  34 Jul 12 21:16 medium-resolution

Technical details can be found in the “Running Custom MySQL Queries in Percona Monitoring and Management” blog post. However, all you need to do is create two files: download or otherwise copy the query files for the two dashboards into the low-resolution directory.
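For reference, postgres_exporter custom query files are YAML that map a metric prefix to a SQL query and column descriptions. A hypothetical file in this format (metric and column names are made up for illustration):

pg_database_size:
  query: "SELECT datname, pg_database_size(datname) AS size_bytes FROM pg_database"
  metrics:
    - datname:
        usage: "LABEL"
        description: "Name of the database"
    - size_bytes:
        usage: "GAUGE"
        description: "Size of the database in bytes"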

Once that’s done, restart the pmm-agent or kill the postgres_exporter process:

# systemctl restart pmm-agent

I recommend using the low resolution for these particular queries as, frankly, it doesn’t make a lot of sense to be checking database size and number of updates every 1 or 5 seconds.

Importing the Dashboards

Once the custom queries are prepared and the exporter is restarted, you can go ahead and import new dashboards in Grafana! See the official documentation on how to do that: Dashboards/Export and import.

You can pick up the dashboards on grafana.com: “PostgreSQL Tuple Statistics (Designed for PMM2)“, “PostgreSQL DB/Table Size Details (Designed for PMM2)“, or find raw sources over at my GitHub.

If everything worked as expected, you should see your new dashboards showing the data:

PMM Dashboard: PostgreSQL Tuple Details pt1

PMM Dashboard: PostgreSQL Tuple Details pt2

PMM Dashboard: PostgreSQL Tuple Details pt3

PMM Dashboard: PostgreSQL DB/Table Size Details

Enjoy monitoring with these PostgreSQL custom dashboards!

Aug 13, 2021

Migrating PostgreSQL to Kubernetes


More and more companies are adopting Kubernetes. For some, it is about being cutting-edge; for others, it is part of a well-defined strategy and a business transformation. Developers and operations teams all over the world are struggling to move applications that aren't cloud-native friendly to containers and Kubernetes.

Migrating databases is always a challenge, and it comes with risks and downtime for businesses. Today I'm going to show how easy it is to migrate a PostgreSQL database to Kubernetes with minimal downtime using Percona Distribution for PostgreSQL Operator.

Goal

To perform the migration I’m going to use the following setup:


  1. PostgreSQL database deployed on-prem or somewhere in the cloud. It will be the Source.
  2. Google Kubernetes Engine (GKE) cluster where Percona Operator deploys and manages PostgreSQL cluster (the Target) and pgBackRest Pod
  3. PostgreSQL backups and Write Ahead Logs are uploaded to some Object Storage bucket (GCS in my case)
  4. pgBackRest Pod reads the data from the bucket
  5. pgBackRest Pod restores the data continuously to the PostgreSQL cluster in Kubernetes

The data should be continuously synchronized. In the end, I want to shut down PostgreSQL running on-prem and only keep the cluster in GKE.

Migration

Prerequisites

To replicate the setup you will need the following:

  • PostgreSQL (v 12 or 13) running somewhere
  • pgBackRest installed
  • Google Cloud Storage or any S3 bucket. My examples will be about GCS.
  • Kubernetes cluster

Configure The Source

I have Percona Distribution for PostgreSQL version 13 running on some Linux machines.

1. Configure pgBackRest

# cat /etc/pgbackrest.conf
[global]
log-level-console=info
log-level-file=debug
start-fast=y

[db]
pg1-path=/var/lib/postgresql/13/main
repo1-type=gcs
repo1-gcs-bucket=sp-test-1
repo1-gcs-key=/tmp/gcs.key
repo1-path=/on-prem-pg

  • pg1-path should point to the PostgreSQL data directory
  • repo1-type is set to GCS, as we want our backups to go there
  • The key is in the /tmp/gcs.key file. The key can be obtained through the Google Cloud UI. Read more about it here.
  • The backups are going to be stored in the on-prem-pg folder in the sp-test-1 bucket
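With the configuration in place, you would typically create the stanza and take an initial full backup so the repository is populated (standard pgBackRest commands; the stanza name matches the [db] section above):

pgbackrest --stanza=db stanza-create
pgbackrest --stanza=db check
pgbackrest --stanza=db --type=full backup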

2. Edit postgresql.conf to enable archiving through pgBackRest:

archive_mode = on   
archive_command = 'pgbackrest --stanza=db archive-push %p'

A restart is required after changing the configuration.

3. The Operator requires a postgresql.conf file in the data directory. An empty file is enough:

touch /var/lib/postgresql/13/main/postgresql.conf

4. A primaryuser must be created on the Source to ensure replication is correctly set up by the Operator:

# create user primaryuser with encrypted password '<PRIMARYUSER PASSWORD>' replication;

Configure The Target

1. Deploy Percona Distribution for PostgreSQL Operator on Kubernetes. Read more about it in the documentation here.

# create the namespace
kubectl create namespace pgo

# clone the git repository
git clone -b v0.2.0 https://github.com/percona/percona-postgresql-operator/
cd percona-postgresql-operator

# deploy the operator
kubectl apply -f deploy/operator.yaml

2. Edit the main custom resource manifest – deploy/cr.yaml:

  • I'm not going to change the cluster name and will keep it cluster1.
  • The cluster is going to operate in Standby mode, which means it will sync the data from the GCS bucket. Set spec.standby to true.
  • Configure GCS itself. The spec.backup section would look like this (bucket and repoPath are the same as in the pgBackRest configuration above):

backup:
...
    repoPath: "/on-prem-pg"
...
    storages:
      my-s3:
        type: gcs
        endpointUrl: https://storage.googleapis.com
        region: us-central1-a
        uriStyle: path
        verifyTLS: false
        bucket: sp-test-1
    storageTypes: [
      "gcs"
    ]

  • I would like to have at least one Replica in my PostgreSQL cluster. Set spec.pgReplicas.hotStandby.size to 1.

3. The Operator should be able to authenticate with GCS. To do that, we need to create a Secret object called <CLUSTERNAME>-backrest-repo-config with a gcs-key in its data. It should be the same key we used on the Source. See the example of this secret here.
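A minimal sketch of what gcs.yaml could contain (assuming the default cluster1 cluster name and the pgo namespace; the value is the base64-encoded service account key):

apiVersion: v1
kind: Secret
metadata:
  name: cluster1-backrest-repo-config
  namespace: pgo
type: Opaque
data:
  gcs-key: <base64-encoded contents of the GCS key file>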

kubectl apply -f gcs.yaml

4. Create the users by creating Secret objects: postgres and primaryuser (the one we created on the Source). See the examples of the user Secrets here. The passwords should be the same as on the Source.

kubectl apply -f users.yaml

5. Now let's deploy our cluster on Kubernetes by applying cr.yaml:

kubectl apply -f deploy/cr.yaml

Verify and Troubleshoot

If everything was done correctly, you should see the following in the Primary Pod logs:

kubectl -n pgo logs -f --tail=20 cluster1-5dfb96f77d-7m2rs
2021-07-30 10:41:08,286 INFO: Reaped pid=548, exit status=0
2021-07-30 10:41:08,298 INFO: establishing a new patroni connection to the postgres cluster
2021-07-30 10:41:08,359 INFO: initialized a new cluster
Fri Jul 30 10:41:09 UTC 2021 INFO: PGHA_INIT is 'true', waiting to initialize as primary
Fri Jul 30 10:41:09 UTC 2021 INFO: Node cluster1-5dfb96f77d-7m2rs fully initialized for cluster cluster1 and is ready for use
2021-07-30 10:41:18,781 INFO: Lock owner: cluster1-5dfb96f77d-7m2rs; I am cluster1-5dfb96f77d-7m2rs
2021-07-30 10:41:18,810 INFO: no action.  i am the standby leader with the lock
2021-07-30 10:41:28,781 INFO: Lock owner: cluster1-5dfb96f77d-7m2rs; I am cluster1-5dfb96f77d-7m2rs
2021-07-30 10:41:28,832 INFO: no action.  i am the standby leader with the lock

Change some data on the Source and ensure that it is properly synchronized to the Target cluster.
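For example, a quick hypothetical check (the table name is made up; the Pod name is the Primary Pod from the logs above):

# On the Source
psql -c 'CREATE TABLE migration_check (id int)'
psql -c 'INSERT INTO migration_check VALUES (1)'

# On the Target
kubectl -n pgo exec -it cluster1-5dfb96f77d-7m2rs -- psql -c 'SELECT * FROM migration_check'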

Common Issues

The following error message indicates that you forgot to create the postgresql.conf file in the data directory:

FileNotFoundError: [Errno 2] No such file or directory: '/pgdata/cluster1/postgresql.conf' -> '/pgdata/cluster1/postgresql.base.conf'

It is easy to forget to create the primaryuser; in that case, you will see the following in the logs:

psycopg2.OperationalError: FATAL:  password authentication failed for user "primaryuser"

Wrong or missing object store credentials will trigger the following error:

WARN: repo1: [CryptoError] unable to load info file '/on-prem-pg/backup/db/backup.info' or '/on-prem-pg/backup/db/backup.info.copy':
      CryptoError: raised from remote-0 protocol on 'cluster1-backrest-shared-repo': unable to read PEM: [218529960] wrong tag
      HINT: is or was the repo encrypted?
      CryptoError: raised from remote-0 protocol on 'cluster1-backrest-shared-repo': unable to read PEM: [218595386] nested asn1 error
      HINT: is or was the repo encrypted?
      HINT: backup.info cannot be opened and is required to perform a backup.
      HINT: has a stanza-create been performed?
ERROR: [075]: no backup set found to restore
Fri Jul 30 10:54:00 UTC 2021 ERROR: pgBackRest standby Creation: pgBackRest restore failed when creating standby

Cutover

Everything looks good, and it is time to perform the cutover. In this blog post, I cover only the database side; do not forget that your application should be reconfigured to point to the correct PostgreSQL cluster. It might be a good idea to stop the application before the cutover.

1. Stop the source PostgreSQL cluster to ensure no data is written:

systemctl stop postgresql

2. Promote the Target cluster to primary. To do that, remove spec.backup.repoPath, change spec.standby to false in deploy/cr.yaml, and apply the changes:

kubectl apply -f deploy/cr.yaml

PostgreSQL will be restarted automatically and you will see the following in the logs:

2021-07-30 11:16:20,020 INFO: updated leader lock during promote
2021-07-30 11:16:20,025 INFO: Changed archive_mode from on to True (restart might be required)
2021-07-30 11:16:20,025 INFO: Changed max_wal_senders from 10 to 6 (restart might be required)
2021-07-30 11:16:20,027 INFO: Reloading PostgreSQL configuration.
server signaled
2021-07-30 11:16:21,037 INFO: Lock owner: cluster1-5dfb96f77d-n4c79; I am cluster1-5dfb96f77d-n4c79
2021-07-30 11:16:21,132 INFO: no action.  i am the leader with the lock

Conclusion

Deploying and managing database clusters is not an easy task. The recently released Percona Distribution for PostgreSQL Operator automates day-1 and day-2 operations and turns running PostgreSQL on Kubernetes into a smooth and pleasant journey.

With Kubernetes becoming the default control plane, the most common task for developers and operations teams is to perform the migration, which usually turns into a complex project. This blog post shows that database migration can be an easy task with minimal downtime.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions please raise the topic in the community forum.

Are you a developer and looking to contribute? Please read our CONTRIBUTING.md and send the Pull Request.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.

Download Percona Distribution for PostgreSQL Today!

May 26, 2021

Percona Distribution for PostgreSQL Operator Technical Preview Released


Percona is championing open source database software, and we are committed to running our products on Kubernetes. We don't only want to run the software; we want to make sure that the database is highly available, secure, and follows best practices. We also focus on day-2 operations such as scaling, backup and restore, disaster recovery, and customization.

To get there, we have Operators – the software framework that extends Kubernetes APIs and provides control over database deployment and operations through the control plane. Until May we had two Operators: one for MySQL and one for MongoDB.

The only missing piece was Percona Distribution for PostgreSQL, for which we introduced the Operator during Percona Live in May 2021. This completes our vision for deploying our software on Kubernetes. See the release notes of the initial version here.

Kubernetes Operator FAQ

This blog post is intended to answer some frequently asked questions we received from our community about Percona Distribution for PostgreSQL Operator.

Is This a Brand New Operator?

No. Our Operator is based on PGO, the Postgres Operator from Crunchy Data, which we modified and enhanced in order to support our PostgreSQL distribution.

Why CrunchyData Operator?

As noted above, we are committed to running our database software on Kubernetes. There are multiple ways to achieve this goal:

  1. Develop a new Operator from scratch
  2. Collaborate and contribute necessary changes to an existing Operator
  3. Fork an existing Operator

Option (1) looks great, but it is time- and effort-intensive, and we might be reinventing existing wheels. Our goal is to minimize Time To Market (TTM), so we dropped this option right away.

For options (2) and (3), there are at least three different PostgreSQL Operators in active development: Stackgres, the Zalando Postgres Operator, and PGO from Crunchy Data.

Stackgres is written in Java, and our engineering team is more familiar with C/C++ and Golang; we do not see that changing in the near future. The Zalando Operator is great and provides a lot of functionality out of the box, but our engineering team estimated the effort to make the needed changes as almost equivalent to writing an Operator from scratch.

PGO is written in Golang and provides the features we were looking for: high availability with Patroni, scaling, backups, and many more. Our engineering team did not flag any complexity in introducing our changes, and we jumped to work.

Will Percona use PGO as an Upstream?

For now – yes. We will be merging new features implemented in PGO into our fork. Version 0.1.0 of our Operator is based on version 4.6.2 of PGO, and version 0.2.0 will be based on version 4.7.X. At the same time, we want to contribute some of our changes back to the upstream and have already sent some pull requests (one, two). We'll continue submitting patches to the upstream project.

What is Different About Percona Operator?

The main differences were highlighted in the release notes. Here they are:

  • Percona Distribution for PostgreSQL is now used as the main container image. CrunchyData container images are provided under the Crunchy Data Developer Program, which means that without an active contract they cannot be used in production. Percona container images are fully open source and do not have any limitations on use.
  • It is possible to specify custom images for all components separately. For example, users can easily build and use custom images for one or several components (e.g. pgBouncer) while all other images will be the official ones. Also, users can build and use all custom images.
  • All container images are reworked and simplified. They are built on Red Hat Universal Base Image (UBI) 8.
  • The Operator has built-in integration with Percona Monitoring and Management (PMM) v2.
  • A build/test infrastructure was created, and we have started adding e2e tests to be sure that all pieces of the cluster work together as expected.
  • We have phased out the PGO CLI tool, and the Custom Resource UX will be completely aligned with other Percona Operators in the following release.

For future releases, our goal is to reach feature and UX parity between the Operators, so that our users will have the same look and feel across all three database engines.

What Does Tech Preview Mean?

Tech Preview features are not yet ready for production use and are not covered by support SLAs (Service Level Agreements). They are included in this release so that users can provide feedback prior to the full release of the feature in a future GA (General Availability) release (or the removal of the feature if it is deemed not useful). This functionality can change (APIs, CLIs, etc.) from tech preview to GA.

When is GA Coming and What is Going to be Included?

Our goal is to release the GA version early in Q3. The changes in this version will include:

  • Moving control over replicas to the main Custom Resource instead of managing them separately
  • Changing the main manifest to provide the same look and feel as in other Percona Operators
  • Reworking scheduled backups to control them through the main CR and Kubernetes primitives
  • Adding support for Google Cloud Storage (this will be merged from upstream)

Call for Action

To install our Operator and learn more about it please read the documentation.

Our Operator is licensed under Apache 2.0 and can be found in the percona-postgresql-operator repository on GitHub. There are multiple ways to contact us or share your ideas:

  • To report a bug use jira.percona.com and create the bug in K8SPG project
  • For general questions and sharing your thoughts, we have a community forum and Discord where we chat about open source, databases, Kubernetes, and more.
  • We have a public roadmap where you can see what is planned and what is under development. You can share your ideas about new features there as well.
