Jan
13
2022
--

How Patroni Addresses the Problem of the Logical Replication Slot Failover in a PostgreSQL Cluster

PostgreSQL Patroni Logical Replication Slot Failover

Failover of the logical replication slot has always been a pain point when using logical replication in PostgreSQL. This missing feature undermined the use of logical replication and acted as one of its biggest deterrents. The stakes and impact were so high that many organizations had to discard their plans around logical replication, and it affected many migration plans to PostgreSQL. It was painful to see that many had to opt for proprietary/vendor-specific solutions instead.

At Percona, we have written about this in the past: Missing Piece: Failover of the Logical Replication Slot. In that post, we discussed one of the possible approaches to solve this problem, but there was no reliable mechanism to copy the slot information to a physical standby and maintain it.

The problem, in a nutshell, is this: the replication slot is always maintained on the primary node. If there is a switchover/failover that promotes one of the standbys, the new primary won’t have any idea about the replication slot maintained by the previous primary node. This breaks the logical replication from the downstream systems, or, if a new slot is created, it becomes unsafe to use.

The good news is that the Patroni developers and maintainers addressed this problem starting with version 2.1.0 and provided a working solution without any invasive methods or extensions. To me, this is work that deserves a big round of applause from the Patroni community, and the intention of this blog post is to make sure that a bigger crowd is aware of it.

How to Set it Up

A ready-to-use Patroni package is available from the Percona repository. But you are free to use Patroni from any source.

Basic Configuration

In case you are excited about this and want to try it, the following steps might be helpful.

The entire discussion is about logical replication, so the minimum requirement is to have wal_level set to “logical”. If the existing Patroni configuration has wal_level set to “replica” and you want to use this feature, you may just edit the Patroni configuration.
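For example, the dynamic configuration change made through patronictl edit-config (shown in the next section) is just a matter of setting the parameter under postgresql.parameters; a minimal sketch:

postgresql:
  parameters:
    wal_level: logical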


However, this change requires a PostgreSQL restart. A * in the “Pending restart” column of the patronictl list output indicates that a restart is pending.

You may use Patroni’s “switchover” feature to restart the node and bring the change into effect, because the demoted node goes through a restart.

If there are any remaining nodes, they can be restarted later.
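The relevant patronictl commands might look like the following (the cluster and member names are placeholders; adjust the configuration file path to your setup):

$ patronictl -c /etc/patroni/patroni.yml list
$ patronictl -c /etc/patroni/patroni.yml switchover <cluster_name>
$ patronictl -c /etc/patroni/patroni.yml restart <cluster_name> <member_name>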

Creating Logical Slots

Now we can add a permanent logical replication slot to PostgreSQL which will be maintained by Patroni.

Edit the patroni configuration:

$ patronictl -c /etc/patroni/patroni.yml edit-config

A slot specification can be added as follows:

…
slots:
  logicreplia:
    database: postgres
    plugin: pgoutput
    type: logical
…

The “slots:” section defines permanent replication slots. These slots will be preserved during switchover/failover. “pgoutput” is the decoding plugin for PostgreSQL logical replication.

Once the change is applied, the logical replication slot will be created on the primary node, which can be verified by querying:

select * from pg_replication_slots;

The output should show the newly created logicreplia slot on the primary node.

Now here is the first level of magic: the same replication slot will be created on the standbys as well. Yes, Patroni does it. Patroni internally copies the replication slot information from the primary to all eligible standby nodes!

We can run the same query against pg_replication_slots on a standby and see the same replication slot reflected on the standby side.

This slot can be used by a subscription by explicitly specifying the slot name when creating the subscription:

CREATE SUBSCRIPTION sub2 CONNECTION '<connection_string>' PUBLICATION <publication_name> WITH (copy_data = true, create_slot=false, enabled=true, slot_name=logicreplia);

Alternatively, an existing subscription can be modified to use the new slot, which is what I generally prefer to do.

For example:

ALTER SUBSCRIPTION name SET (slot_name=logicreplia);

Corresponding PostgreSQL log entries can confirm the slot name change:

2021-12-27 15:56:58.294 UTC [20319] LOG:  logical replication apply worker for subscription "sub2" will restart because the replication slot name was changed
2021-12-27 15:56:58.304 UTC [20353] LOG:  logical replication apply worker for subscription "sub2" has started

From the publisher side, we can confirm the slot usage by checking the active_pid and the advancing LSN for the slot.

The second level of surprise: the replication slot information on all the standby nodes of the Patroni cluster is also advanced as the logical replication progresses on the primary side.

At a high level, this is exactly what this feature does:

  1. Automatically create/copy the replication slot information from the primary node of the Patroni cluster to all eligible standby nodes.
  2. Automatically advance the LSN on the slots of the standby nodes as the LSN advances on the corresponding slot on the primary.
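Conceptually, the second part is similar to running something like the following on each standby (a simplified illustration only; the slot name comes from the example above and the LSN value is hypothetical):

select slot_name, confirmed_flush_lsn from pg_replication_slots where slot_name = 'logicreplia';
select pg_replication_slot_advance('logicreplia', '0/3000060');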

After a Switchover/Failover

In the event of a switchover or failover, we do not lose any slot information, as it is already maintained on the standby nodes.

After the switchover, any downstream logical replica can be repointed to the new primary:

postgres=# ALTER SUBSCRIPTION sub2 CONNECTION 'host=192.168.50.10 port=5432 dbname=postgres user=postgres password=vagrant';                                                                                                                                                  
ALTER SUBSCRIPTION

This continues the replication, and the pg_replication_slots information can confirm it.

Summary + Key Points

The logical replication slot is conceptually possible only on the primary instance because that is where the logical decoding happens. Now, with this improvement, Patroni makes sure that the slot information is available on the standbys as well, ready to take over the connection from the subscriber.

  • This solution requires PostgreSQL 11 or above because it uses the pg_replication_slot_advance() function, available from PostgreSQL 11 onwards, to advance the slot.
  • The downstream connection can use HAProxy so that the connection is automatically routed to the primary (not covered in this post). No modification of the PostgreSQL code or creation of any extension is required.
  • The copying of the slot happens over the PostgreSQL protocol (libpq) rather than any OS-specific tools/methods. Patroni uses the rewind or superuser credentials and the pg_read_binary_file() function to read the slot information. Source code reference.
  • Once the logical slot is created on the replica side, Patroni uses pg_replication_slot_advance() to move the slot forward.
  • The permanent slot information is added to the DCS and is continuously maintained by the primary instance of the Patroni cluster. A new DCS key with the name “status” is introduced and supported across all DCS options (ZooKeeper, etcd, Consul, etc.).
  • hot_standby_feedback must be enabled on all standby nodes where the logical replication slot needs to be maintained.
  • Patroni parameter postgresql.use_slots must be enabled to make sure that every standby node uses a slot on the primary node.
Jan
11
2022
--

Creating a Standby Cluster With the Percona Distribution for PostgreSQL Operator

Standby Cluster With the Percona Distribution for PostgreSQL Operator

A customer recently asked if our Percona Distribution for PostgreSQL Operator supports the deployment of a standby cluster, which they need as part of their Disaster Recovery (DR) strategy. The answer is yes – as long as you are making use of an object storage system for backups, such as AWS S3 or GCP Cloud Storage buckets, that can be accessed by the standby cluster. In a nutshell, it works like this:

  • The primary cluster is configured with pgBackRest to take backups and store them alongside archived WAL files in a remote repository;
  • The standby cluster is built from one of these backups and it is kept in sync with the primary cluster by consuming the WAL files that are copied from the remote repository.

Note that the primary node in the standby cluster is not a streaming replica of any of the nodes in the primary cluster; it relies on archived WAL files to replicate events. For this reason, this approach cannot be used as a High Availability (HA) solution. Even though the primary use of a standby cluster in this context is DR, it can also be employed for migrations.

So, how can we create a standby cluster using the Percona operator? We will show you next. But first, let’s create a primary cluster for our example.

Creating a Primary PostgreSQL Cluster Using the Percona Operator

You will find a detailed procedure on how to deploy a PostgreSQL cluster using the Percona operator in our online documentation. Here we want to highlight the main steps involved, particularly regarding the configuration of object storage, which is a crucial requirement and is best done during the initial deployment of the cluster. In the following example, we will deploy our clusters using the Google Kubernetes Engine (GKE), but you can find similar instructions for other environments in the previous link.

Considering you have a Google account configured as well as the gcloud (from the Google Cloud SDK suite) and kubectl command-line tools installed, authenticate yourself with gcloud auth login, and off we go!

Creating a GKE Cluster and Basic Configuration

The following command will create a default cluster named “cluster-1” and composed of three nodes. We are creating it in the us-central1-a zone using e2-standard-4 VMs but you may choose different options. In fact, you may also need to indicate the project name and other main settings if you do not have your gcloud environment pre-configured with them:

gcloud container clusters create cluster-1 --preemptible --machine-type e2-standard-4 --num-nodes=3 --zone us-central1-a

Once the cluster is created, use your IAM identity to control access to this new cluster:

kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user $(gcloud config get-value core/account)

Finally, create the pgo namespace:

kubectl create namespace pgo

and set the current context to refer to this new namespace:

kubectl config set-context $(kubectl config current-context) --namespace=pgo

Creating a Cloud Storage Bucket

Remember that for this setup we need a Google Cloud Storage bucket as well as a Service Account created with the necessary privileges/roles to access it. The respective procedures to obtain these vary according to how your environment is configured, so we won’t be covering them here. Please refer to the Google Cloud Storage documentation for the exact steps. The bucket we created for the example in this post was named cluster1-backups-and-wals.

Likewise, please refer to the Creating and managing service account keys documentation to learn how to create a Service Account and download the corresponding key in JSON format – we will need to provide it to the operator so our PostgreSQL clusters can access the storage bucket.

Creating the Kubernetes Secrets File to Access the Storage Bucket

Create a file named my-gcs-account-secret.yaml with the following structure:

apiVersion: v1
kind: Secret
metadata:
  name: cluster1-backrest-repo-config
type: Opaque
data:
  gcs-key: <VALUE>

replacing the <VALUE> placeholder with the output of the following command, according to the OS you are using:

Linux:

base64 --wrap=0 your-service-account-key-file.json

macOS:

base64 your-service-account-key-file.json

Installing and Deploying the Operator

The most practical way to install our operator is by cloning the Git repository, and then moving inside its directory:

git clone -b v1.1.0 https://github.com/percona/percona-postgresql-operator
cd percona-postgresql-operator

The following command will deploy the operator:

kubectl apply -f deploy/operator.yaml

We have already prepared the secrets file to access the storage bucket so we can apply it now:

kubectl apply -f my-gcs-account-secret.yaml

Now, all that is left is to customize the storages options in the deploy/cr.yaml file to indicate the use of the GCS bucket as follows:

    storages:
      my-gcs:
        type: gcs
        bucket: cluster1-backups-and-wals

We can now deploy the primary PostgreSQL cluster (cluster1):

kubectl apply -f deploy/cr.yaml

Once the operator has been deployed, you can run the following command to do some housekeeping:

kubectl delete -f deploy/operator.yaml

Creating a Standby PostgreSQL Cluster Using the Percona Operator

After this long preamble, let’s look at what brought you here: how to deploy a standby cluster, which we will refer to as cluster2, that will replicate from the primary cluster.

Copying the Secrets Over

Considering you probably have customized the passwords you use in your primary cluster and that they differ from the default values found in the operator’s git repository, we need to make a copy of the secrets files, adjusted to the standby cluster’s name. The following procedure facilitates this task, saving the secrets files under /tmp/cluster1-cluster2-secrets (you can choose a different target directory):

NOTE: make sure you have the yq tool installed on your system.

mkdir -p /tmp/cluster1-cluster2-secrets/
export primary_cluster_name=cluster1
export standby_cluster_name=cluster2
export secrets="${primary_cluster_name}-users"
kubectl get secret/$secrets -o yaml \
| yq eval 'del(.metadata.creationTimestamp)' - \
| yq eval 'del(.metadata.uid)' - \
| yq eval 'del(.metadata.selfLink)' - \
| yq eval 'del(.metadata.resourceVersion)' - \
| yq eval 'del(.metadata.namespace)' - \
| yq eval 'del(.metadata.annotations."kubectl.kubernetes.io/last-applied-configuration")' - \
| yq eval '.metadata.name = "'"${secrets/$primary_cluster_name/$standby_cluster_name}"'"' - \
| yq eval '.metadata.labels.pg-cluster = "'"${standby_cluster_name}"'"' - \
>/tmp/cluster1-cluster2-secrets/${secrets/$primary_cluster_name/$standby_cluster_name}
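Before moving on, you can sanity-check the adjusted file (with the default names used above it will be called cluster2-users, and the command below should print exactly that name):

$ yq eval '.metadata.name' /tmp/cluster1-cluster2-secrets/cluster2-users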

Deploying the Standby Cluster: Fast Mode

Since we have already covered the procedure used to create the primary cluster in detail in a previous section, we will be presenting the essential steps to create the standby cluster below and provide additional comments only when necessary.

NOTE: the commands below are issued from inside the percona-postgresql-operator directory hosting the git repository for our operator.

Deploying a New GKE Cluster Named cluster-2

This time we are using the us-west1-b zone:

gcloud container clusters create cluster-2 --preemptible --machine-type e2-standard-4 --num-nodes=3 --zone us-west1-b
kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user $(gcloud config get-value core/account)
kubectl create namespace pgo
kubectl config set-context $(kubectl config current-context) --namespace=pgo
kubectl apply -f deploy/operator.yaml

Apply the Adjusted Kubernetes Secrets:

export standby_cluster_name=cluster2
export secrets="${standby_cluster_name}-users"
kubectl create -f /tmp/cluster1-cluster2-secrets/$secrets

The procedure above does not include the GCS secret file; its key contents remain the same, but the cluster prefix in the backrest-repo secret name needs to be adjusted. Make a copy of that file:

cp my-gcs-account-secret.yaml my-gcs-account-secret-2.yaml

then edit the copy to indicate “cluster2-” instead of “cluster1-”:

name: cluster2-backrest-repo-config

You can apply it now:

kubectl apply -f my-gcs-account-secret-2.yaml

The cr.yaml file of the Standby Cluster

Let’s make a copy of the cr.yaml file we customized for the primary cluster:

cp deploy/cr.yaml deploy/cr-2.yaml

and edit the copy as follows:

1) Change all references (that are not commented out) from cluster1 to cluster2 – including current-primary but excluding the bucket reference, which in our example is prefixed with “cluster1-”; the storages section must remain unchanged. (We know it’s not very practical to replace so many references; we still need to improve this part of the routine.)

2) Enable the standby option:

standby: true

3) Provide a repoPath that points to the GCS bucket used by the primary cluster (just below the storages section, which should remain the same as in the primary cluster’s cr.yaml file):

repoPath: "/backrestrepo/cluster1-backrest-shared-repo"
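Putting steps 2 and 3 together, the relevant fragment of deploy/cr-2.yaml might look roughly like the sketch below (assuming, as in the cr.yaml we copied, that storages and repoPath both live under the backup section; double-check the placement against your own file):

  standby: true
  backup:
    ...
    storages:
      my-gcs:
        type: gcs
        bucket: cluster1-backups-and-wals
    repoPath: "/backrestrepo/cluster1-backrest-shared-repo"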

And that’s it! All that is left now is to deploy the standby cluster:

kubectl apply -f deploy/cr-2.yaml

With everything working on the standby cluster, do some housekeeping:

kubectl delete -f deploy/operator.yaml

Verifying it all Works as Expected

Remember that the standby cluster is created from a backup and relies on archived WAL files to be kept in sync with the primary cluster. If you make a change in the primary cluster, such as adding a row to a table, that change won’t reach the standby cluster until the WAL file it has been recorded to is archived and consumed by the standby cluster.

When checking if all is working with the new setup, you can force the rotation of the WAL file (and the subsequent archival of the previous one) on the primary node of the primary cluster to accelerate the sync process by issuing:

psql> SELECT pg_switch_wal();
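On the standby cluster side, a quick way to confirm that WAL replay is progressing is to check the replay position and timestamp (assuming you can open a psql session against the standby cluster’s primary node):

psql> SELECT pg_last_wal_replay_lsn(), pg_last_xact_replay_timestamp();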

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

Learn More About Percona Kubernetes Operators

Dec
20
2021
--

PMM Now Supports Monitoring of PostgreSQL Instances Connecting With Any Database (Name)

Monitoring of PostgreSQL Instances

The recent release of Percona Monitoring and Management 2.25.0 (PMM) includes a fix for bug PMM-6937: before that, PMM expected all monitoring connections to PostgreSQL servers to be made using the default postgres database. This worked well for most deployments; however, some DBaaS providers like Heroku and DigitalOcean do not provide direct access to the postgres database (at least not by default) and instead create custom databases for each of their instances. Starting with this new release of PMM, it is possible to specify the name of the database the Postgres exporter should connect to, both through the pmm-admin command-line tool and when configuring the monitoring of a remote PostgreSQL instance directly in the PMM web interface.

Secure Connections

Most DBaaS providers enforce, or even restrict access to their instances to, SSL (well, in fact, TLS) client connections – and that’s a good thing! Just pay attention to this detail when you configure the monitoring of these instances on PMM. If, when you configure your instance, the system returns a red-box alert saying:

Connection check failed: pq: no pg_hba.conf entry for host “123.123.123.123”, user “pmm”, database “mydatabase”, SSL off.

and your firewall allows external connections to your database, chances are the problem is in the last part – “SSL off”.

Web Interface

When configuring the monitoring of a remote PostgreSQL instance in the PMM web interface, make sure you tick the box for using TLS connections:

PMM TLS connections

But note that checking this box is not enough; you have to complement this setting with one of two options:

a) Provide the TLS certificates and key, using the forms that appear once you click on the “Use TLS” check-box:

  • TLS CA
  • TLS certificate key
  • TLS certificate

The PMM documentation explains this further, and more details about server and client certificates for PostgreSQL connections can be found in the SSL Support chapter of its online manual.

b) Opt to not provide TLS certificates, leaving the forms empty and checking the “Skip TLS” check-box:

PMM Skip TLS

What should rule this choice is the SSL mode the PostgreSQL server requires for client connections. When DBaaS providers enforce secure connections, usually this means sslmode=require, and for this, option b above is enough. Only when the PostgreSQL server employs the more restrictive verify-ca or verify-full modes will you need to go with option a and provide the TLS certificates and key.
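If you want to confirm beforehand which SSL mode the server accepts, a quick test with psql can help (the host, port, user, and database below are the same hypothetical values used in the command-line example further down):

$ psql "host=my.dbaas.domain.com port=12345 dbname=customdb user=pmmuser sslmode=require" -c 'SELECT version();'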

Command-Line Tool

When configuring the monitoring of a PostgreSQL server using the pmm-admin command-line tool, the same options are available and the PMM documentation covers them as well. Here’s a quick example to illustrate configuring the monitoring for a remote PostgreSQL server when sslmode=require and providing a custom database name:

$ pmm-admin add postgresql --server-url=https://admin:admin@localhost:443 --server-insecure-tls --host=my.dbaas.domain.com --port=12345 --username=pmmuser --password=mYpassword --database=customdb --tls --tls-skip-verify

Note that option server-insecure-tls applies to the connection with the PMM server itself; the options prefixed with tls are the ones that apply to the connection to the PostgreSQL database.

Tracking Stats

You may have also noticed in the release notes that PMM 2.25.0 also provides support for the release candidate of our own pg_stat_monitor. However, it is unlikely this extension will be made available by DBaaS providers anytime soon, so you need to stick with pg_stat_statements for now:

pg_stat_statements

Make sure this extension is loaded for your instance’s custom database; otherwise, PMM won’t be able to track many important stats. That is already the case for all non-hobby Heroku Postgres plans, but for other providers like DigitalOcean it needs to be created beforehand:

defaultdb=> select count(*) from pg_stat_statements;
ERROR:  relation "pg_stat_statements" does not exist
LINE 1: select count(*) from pg_stat_statements;
                             ^
defaultdb=> CREATE EXTENSION pg_stat_statements;
CREATE EXTENSION

defaultdb=> select count(*) from pg_stat_statements;
 count 
-------
    60
(1 row)

Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.

Download Percona Monitoring and Management Today

Dec
14
2021
--

High Availability and Disaster Recovery Recipes for PostgreSQL on Kubernetes

High Availability and Disaster Recovery PostgreSQL on Kubernetes

Percona Distribution for PostgreSQL Operator allows you to deploy and manage highly available and production-grade PostgreSQL clusters on Kubernetes with minimal manual effort. In this blog post, we are going to look deeper into High Availability, Disaster Recovery, and Scaling of PostgreSQL clusters.

High Availability

Our default custom resource manifest deploys a highly available (HA) PostgreSQL cluster. Key components of HA setup are:

  • Kubernetes Services that point to pgBouncer and replica nodes
  • pgBouncer – a lightweight connection pooler for PostgreSQL
  • Patroni – HA orchestrator for PostgreSQL
  • PostgreSQL nodes – we have one primary and two replica nodes in hot standby by default

high availability postgresql

Kubernetes Service is the way to expose your PostgreSQL cluster to applications or users. We have two services:

  • clusterName-pgbouncer

    – Exposing your PostgreSQL cluster through pgBouncer connection pooler. Both reads and writes are sent to the Primary node. 

  • clusterName-replica

    – Exposes replica nodes directly. It should be used for reads only. Also, keep in mind that connections to this service are not pooled. We are working on a better solution, where the user would be able to leverage both connection pooling and read-scaling through a single service.

By default, we use the ClusterIP service type, but you can change it in pgBouncer.expose.serviceType or pgReplicas.hotStandby.expose.serviceType, respectively.
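To see how these services are exposed for a cluster named cluster1 (assuming the pgo namespace), you could run:

$ kubectl -n pgo get svc

and expect to find, among others, services named cluster1-pgbouncer and cluster1-replica.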

Every PostgreSQL container has Patroni running. Patroni monitors the state of the cluster and, in case of a Primary node failure, switches the role of the Primary to one of the Replica nodes. PgBouncer always knows where the Primary is.

As you can see, we distribute the PostgreSQL cluster components across different Kubernetes nodes. This is done with Affinity rules, which are applied by default to ensure that a single node failure does not cause database downtime.

Multi-Datacenter with Multi-AZ

A good architecture design is to run your Kubernetes cluster across multiple data centers. Public clouds have the concept of availability zones (AZs), which are data centers within one region with a low-latency network connection between them, located far enough apart to minimize the probability of a simultaneous outage. You can leverage a multi-AZ Kubernetes deployment to run cluster components in different data centers for better availability.

Multi-Datacenter with Multi-AZ

To ensure that PostgreSQL components are distributed across availability zones, you need to tweak the affinity rules. Currently, this is only possible by editing the Deployment resources directly:

$ kubectl edit deploy cluster1-repl2
…
-            topologyKey: kubernetes.io/hostname
+            topologyKey: topology.kubernetes.io/zone

Scaling

Scaling PostgreSQL to meet the demand at peak hours is crucial for high availability. Our Operator provides you with tools to scale PostgreSQL components both horizontally and vertically.

Vertical Scaling

Scaling vertically is all about adding more power to a PostgreSQL node. The recommended way is to change resources in the Custom Resource (instead of changing them in Deployment objects directly). For example, change the following in cr.yaml to get 256 MBytes of RAM for all PostgreSQL Replica nodes:

  pgReplicas:
    hotStandby:
      resources:
        requests:
-         memory: "128Mi"
+         memory: "256Mi"

Apply cr.yaml:

$ kubectl apply -f cr.yaml

Use the same approach to tune other components in their corresponding sections.

You can also leverage the Vertical Pod Autoscaler (VPA) to react to load spikes automatically. We create a Deployment resource for the Primary and for each Replica node, and VPA objects should target these deployments. The following example will track one of the replica Deployment resources of cluster1 and scale it automatically:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: pxc-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind:       Deployment
    name:     cluster1-repl1  
    namespace:  pgo
  updatePolicy:
    updateMode: "Auto"

Please read more about VPA and its capabilities in its documentation.

Horizontal Scaling

Adding more replica nodes or pgBouncers can be done by changing the size parameters in the Custom Resource. Make the following change in the default cr.yaml:

  pgReplicas:
    hotStandby:
-      size: 2
+      size: 3

Apply the change to get one more PostgreSQL Replica node:

$ kubectl apply -f cr.yaml

Starting from release 1.1.0, it is also possible to scale the cluster using the kubectl scale command. Execute the following to have two PostgreSQL replica nodes in cluster1:

$ kubectl scale --replicas=2 perconapgcluster/cluster1
perconapgcluster.pg.percona.com/cluster1 scaled

In the latest release, it is not yet possible to use the Horizontal Pod Autoscaler (HPA); we will have it supported in the next one. Stay tuned.

Disaster Recovery

It is important to understand that Disaster Recovery (DR) is not High Availability. DR’s goal is to ensure business continuity in the case of a massive disaster, such as a full region outage. Recovery in such cases can of course be automated, but not necessarily – it strictly depends on the business requirements.

Disaster Recovery postgresql

Backup and Restore

I think this is the most common Disaster Recovery protocol – take the backup, store it on some third-party premises, and restore it to another data center if needed.

This approach is simple, but it comes with a long recovery time, especially if the database is big. Use this method only if it meets your Recovery Time Objectives (RTO).

Recovery Time Objectives

Our Operator handles backup and restore for PostgreSQL clusters. The disaster recovery setup is built around pgBackRest and looks like the following:

  1. Configure pgBackRest to upload backups to S3 or GCS (see our documentation for details).
  2. Create the backup manually (through pgTask) or ensure that a scheduled backup was created.
  3. Once the main cluster fails, create the new cluster in the Disaster Recovery data center. The cluster must be running in standby mode, and pgBackRest must be pointing to the same repository as the main cluster:
spec:
  standby: true
  backup:
  # same config as on original cluster

Once data is recovered, the user can turn off standby mode and switch the application to the DR cluster.

Continuous Restoration

This approach is quite similar to the above: pgBackRest instances continuously synchronize data between two clusters through object storage. This minimizes RTO and allows you to switch the application traffic to the DR site almost immediately.

Continuous Restoration postgresql

The configuration here is similar to the previous case, but we always run a second PostgreSQL cluster in the Disaster Recovery data center. In case of a main site failure, just turn off standby mode:

spec:
  standby: false

You can use a similar setup to migrate the data to and from Kubernetes. Read more about it in the Migrating PostgreSQL to Kubernetes blog post.

Conclusion

Kubernetes Operators provide a ready-to-use service, and in the case of the Percona Distribution for PostgreSQL Operator, the user gets a production-grade, highly available database cluster. In addition, the Operator provides day-2 operation capabilities and automates the day-to-day routine.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions, please raise the topic in the community forum.

Are you a developer looking to contribute? Please read our CONTRIBUTING.md and send us a Pull Request.

Dec
08
2021
--

Percona Distribution for PostgreSQL Operator 1.1.0 – Notable Features

Features in Percona Distribution for PostgreSQL Operator

Percona in 2021 is heavily invested in making the PostgreSQL ecosystem better and contributing to it from different angles.

With this in mind, let me introduce to you Percona Distribution for PostgreSQL Operator version 1.1.0 and its notable features:

  • Smart Update – forget about manual and error-prone database upgrades
  • System Users management – add and modify system users with ease with a single Kubernetes Secret resource
  • PostgreSQL 14 support – leverage the latest and greatest by running Percona Distribution for PostgreSQL on Kubernetes

Full release notes can be found here.

Smart Update Feature

Updating databases and their components is always a challenge. In our Operators for MySQL and MongoDB, we have simplified and automated upgrade procedures, and now it’s time for PostgreSQL. In version 1.1.0, we ship this feature as a Technical Preview, with a plan to promote it to GA in the next release.

This feature consists of two parts:

  • Version Service – get the latest or recommended version of the database or other component (PMM for example)
  • Smart Update – apply new version without downtime

Version Service

This feature answers the question: which PostgreSQL/pgBackRest/pgBouncer version should I be running with this Operator? It is important to note that Version Service and Smart Update can only perform minor version upgrades (e.g., from 13.1 to 13.4). Major version upgrades are manual for now and will be automated in the Operator soon.

The way it works is well depicted in the following diagram:

Percona Distribution for PostgreSQL Operator

Version Service is an open source tool; see the source code on GitHub. Percona hosts check.percona.com, and Operators use it by default, but users can run their own self-hosted Version Service.

Users who worked with our Operators for MySQL and MongoDB will find the configuration of Version Service and Smart Update quite familiar:

  upgradeOptions:
    versionServiceEndpoint: https://check.percona.com
    apply: recommended
    schedule: "0 2 * * *"

  • Define the Version Service endpoint
  • Define the PostgreSQL version – the Operator will automatically figure out the component versions
  • Schedule defines the time when the rollout of newer versions takes place. It is good practice to set this time outside of peak hours.

Smart Update

Okay, now the Operator knows the versions that should be used. It is time to apply them, and to do so with minimal downtime. Here is where the Smart Update feature kicks in.

The heart of Smart Update is the smartUpdateCluster function. The goal here is to switch the container image versions for the database components in a specific order and minimize downtime. Once the image is changed, Kubernetes does the magic. For Deployment resources, which we use in our Operator, Kubernetes first spins up the Pod with the new image and then terminates the old one. This provides minimal downtime. The update itself looks like this:

  1. Upgrade the pgBackRest image in the Deployment object in Kubernetes
  2. Start upgrading PostgreSQL itself
    1. Percona Monitoring and Management, which runs as a sidecar, gets the new version here as well
    2. Same for pgBadger
    3. We must upgrade the replica nodes first. If we upgrade the primary node first, the cluster will not recover. The tricky part is that, in the event of a failover, the Primary node can be somewhere in the pgReplicas Deployment, so we need to verify where the primary is first and only after that change the image. See the Smart Update sequence diagram for more details.
  3. Last, but not least – change the image for pgBouncer. To minimize the downtime here, we recommend running at least two pgBouncer nodes. By default, pgBouncer.size is set to 3.

As a result, the user gets the latest, most secure, and performant PostgreSQL and its components automatically with minimal downtime.

System Users Management

Our Operator has multiple system users to manage the cluster and ensure its health. Our users raised two main concerns:

  • it is not possible to change system user password with the Operator after cluster deployment
  • it is confusing that there is a Secret object per user

In this release, we are moving all system users to a single Secret. A change in the Secret resource triggers an update of the passwords in PostgreSQL automatically.
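As an illustration, a password change could then be pushed with something like the following sketch (the Secret name follows the default <clusterName>-users convention described below; the postgres key is an assumption – use the user keys actually present in your Secret):

$ kubectl patch secret cluster1-users --type merge -p '{"stringData":{"postgres":"N3w-P@ssw0rd"}}'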

If the cluster is created from scratch, the Secret with system users is created automatically and the passwords are randomly generated. By default, the Secret name is <clusterName>-users; it can be changed through the spec.secretsName variable in the Custom Resource:

spec:
  secretsName: my-custom-secret

When upgrading from 1.0.0 to 1.1.0, if you want to keep the old passwords, please create the Secret resource manually. Otherwise, the passwords for the system users are going to be generated randomly and updated by the Operator.

PostgreSQL 14 Support

PostgreSQL 14 provides an extensive set of new features and enhancements to security, performance, usability for client applications, and more.

Most notable of them include the following:

  • Expired B-tree index entries can now be detected and removed between vacuum runs. This results in fewer page splits and reduces index bloat.
  • The vacuum process now deletes dead tuples in a single cycle, as opposed to the previous 2-step approach of first marking tuples as deleted and then actually freeing up space in the next run. This speeds up free space cleanup.
  • Support for subscripts in JSON is added to simplify data retrieval using a commonly recognized syntax.
  • Stored procedures can accept OUT parameters.
  • The libpq library now supports the pipeline mode. Previously, the client applications waited for a transaction to be completed before sending the next one. The pipeline mode allows the applications to send multiple transactions at the same time thus boosting performance.
  • Large transactions are now streamed to subscribers in-progress, thus increasing the performance. This improvement applies to logical replication.
  • LZ4 compression is added for TOAST operations. This speeds up large data processing and also improves the compression ratio.
  • SCRAM is made the default authentication mechanism. This mechanism improves security and simplifies regulatory compliance for data security.

In the 1.1.0 version of the Percona Distribution for PostgreSQL Operator, we enable our users to run the latest and greatest PostgreSQL 14. PostgreSQL 14 is the default version since this release, but you can still use versions 12 and 13.

Conclusion

Kubernetes Operators are mainly seen as a tool to automate the deployment and management of applications. With this Percona Distribution for PostgreSQL Operator release, we simplify PostgreSQL management even more and enable users to leverage the latest version 14.

We encourage you to try out our operator. See our GitHub repository and check out the documentation.

Found a bug or have a feature idea? Feel free to submit it in JIRA.

For general questions, please raise the topic in the community forum.

Are you a developer looking to contribute? Please read our CONTRIBUTING.md and send the Pull Request.

Nov
29
2021
--

PostgreSQL 14 Database Monitoring and Logging Enhancements

PostgreSQL-14 Database Monitoring and Logging Enhancements

PostgreSQL-14 was released in September 2021, and it contains many performance improvements and feature enhancements, including some from a monitoring perspective. As we know, monitoring is a key element of any database management system, and PostgreSQL keeps updating and enhancing its monitoring capabilities. Here are some key ones in PostgreSQL-14.

Query Identifier

The query identifier is used to identify a query and can be cross-referenced between extensions. Prior to PostgreSQL-14, extensions used their own algorithms to calculate the query_id; usually the same algorithm was used, but any extension could use its own. Now, PostgreSQL-14 optionally provides a query_id computed in the core, and monitoring extensions and utilities like pg_stat_activity, EXPLAIN, and pg_stat_statements use this query_id instead of calculating their own. The query_id can also be seen in csvlog, after specifying it in log_line_prefix. From a user perspective, there are two benefits of this feature.

  • All the utilities/extensions use the same query_id calculated by the core, which makes it easy to cross-reference it. Previously, all the utilities/extensions needed to implement the same algorithm in their code to achieve this capability.
  • Extensions/utilities can use the already-computed query_id and don’t need to calculate it again, which is a performance benefit.

PostgreSQL-14 introduces a new GUC configuration parameter, compute_query_id, to enable/disable this feature. The default is auto; it can be turned on/off in the postgresql.conf file or using the SET command.

  • pg_stat_activity

SET compute_query_id = off;

SELECT datname, query, query_id FROM pg_stat_activity;
 datname  |                                 query                                 | query_id 
----------+-----------------------------------------------------------------------+----------
 postgres | select datname, query, query_id from pg_stat_activity;                |         
 postgres | UPDATE pgbench_branches SET bbalance = bbalance + 2361 WHERE bid = 1; |

SET compute_query_id = on;

SELECT datname, query, query_id FROM pg_stat_activity;
 datname  |                                 query                                 |      query_id       
----------+-----------------------------------------------------------------------+---------------------
 postgres | select datname, query, query_id from pg_stat_activity;                |  846165942585941982
 postgres | UPDATE pgbench_tellers SET tbalance = tbalance + 3001 WHERE tid = 44; | 3354982309855590749

  • Log

In previous versions, there was no mechanism to compute the query_id in the server core. The query_id is especially useful in the log files. To enable it there, we need to configure the log_line_prefix configuration parameter. The “%Q” option is added to show the query_id; here is an example.

log_line_prefix = 'query_id = [%Q] -> '

query_id = [0] -> LOG:  statement: CREATE PROCEDURE ptestx(OUT a int) LANGUAGE SQL AS $$ INSERT INTO cp_test VALUES (1, 'a') $$;
query_id = [-6788509697256188685] -> ERROR:  return type mismatch in function declared to return record
query_id = [-6788509697256188685] -> DETAIL:  Function's final statement must be SELECT or INSERT/UPDATE/DELETE RETURNING.
query_id = [-6788509697256188685] -> CONTEXT:  SQL function "ptestx"
query_id = [-6788509697256188685] -> STATEMENT:  CREATE PROCEDURE ptestx(OUT a int) LANGUAGE SQL AS $$ INSERT INTO cp_test VALUES (1, 'a') $$;

  • Explain

The EXPLAIN VERBOSE will show the query_id if compute_query_id is true.

SET compute_query_id = off;

EXPLAIN VERBOSE SELECT * FROM foo;
                          QUERY PLAN                          
--------------------------------------------------------------

 Seq Scan on public.foo  (cost=0.00..15.01 rows=1001 width=4)
   Output: a
(2 rows)

SET compute_query_id = on;

EXPLAIN VERBOSE SELECT * FROM foo;
                          QUERY PLAN                          
--------------------------------------------------------------
 Seq Scan on public.foo  (cost=0.00..15.01 rows=1001 width=4)
   Output: a
 Query Identifier: 3480779799680626233
(3 rows)

autovacuum and auto-analyze Logging Enhancements

PostgreSQL-14 improves the logging of autovacuum and auto-analyze. Now we can see the I/O timings in the log, showing how much time was spent reading and writing.

automatic vacuum of table "postgres.pg_catalog.pg_depend": index scans: 1
pages: 0 removed, 67 remain, 0 skipped due to pins, 0 skipped frozen
tuples: 89 removed, 8873 remain, 0 are dead but not yet removable, oldest xmin: 210871
index scan needed: 2 pages from table (2.99% of total) had 341 dead item identifiers removed
index "pg_depend_depender_index": pages: 39 in total, 0 newly deleted, 0 currently deleted, 0 reusable
index "pg_depend_reference_index": pages: 41 in total, 0 newly deleted, 0 currently deleted, 0 reusable

I/O timings: read: 44.254 ms, write: 0.531 ms

avg read rate: 13.191 MB/s, avg write rate: 8.794 MB/s
buffer usage: 167 hits, 126 misses, 84 dirtied
WAL usage: 85 records, 15 full page images, 78064 bytes
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.07 s

These logs are only available if track_io_timing is enabled.
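track_io_timing can be enabled without a restart (superuser privileges are required); for example:

alter system set track_io_timing = on;
select pg_reload_conf();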

Connection Logging

PostgreSQL already logs connections and disconnections if log_connections/log_disconnections is on. In addition, PostgreSQL-14 now also logs the actual username supplied by the user. When external authentication is used and a mapping is defined in pg_ident.conf, it can be hard to identify the actual user name: before PostgreSQL-14, you would only see the mapped user instead of the actual user.

pg_ident.conf

# MAPNAME       SYSTEM-USERNAME         PG-USERNAME

pg              vagrant                 postgres

pg_hba.conf

# TYPE  DATABASE        USER            ADDRESS                 METHOD
# "local" is for Unix domain socket connections only
local   all             all                                     peer map=pg

Before PostgreSQL-14

LOG:  database system was shut down at 2021-11-19 11:24:30 UTC
LOG:  database system is ready to accept connections
LOG:  connection received: host=[local]
LOG:  connection authorized: user=postgres database=postgres application_name=psql

PostgreSQL-14

LOG:  database system is ready to accept connections
LOG:  connection received: host=[local]
LOG:  connection authenticated: identity="vagrant" method=peer (/usr/local/pgsql.14/bin/data/pg_hba.conf:89)
LOG:  connection authorized: user=postgres database=postgres application_name=psql

Conclusion

Every major PostgreSQL release carries significant enhancements, and PostgreSQL-14 was no different.

Monitoring is a key feature of any DBMS, and PostgreSQL keeps improving its logging and monitoring capabilities. With these newly added features, you have more insight into connections, you can easily track queries and observe performance, and you can identify how much time is being spent by the vacuum process in read/write operations. This can significantly help you configure vacuum parameters better.


As more companies look at migrating away from Oracle or implementing new databases alongside their applications, PostgreSQL is often the best option for those who want to run on open source databases.

Read Our New White Paper:

Why Customers Choose Percona for PostgreSQL

Nov
23
2021
--

Upgrading PostGIS: A Proof Of Concept

Upgrading Postgis PostgreSQL

My last blog introduced the issues one can face when upgrading PostGIS and PostgreSQL at the same time. The purpose of this blog is to walk through the steps with an example.

For our purposes, we will confine ourselves to working with the community versions of 9.6 and 11, respectively, and use LXD to create a working environment for prototyping the steps and profiling the issues.

Creating the Demo Environment Using LXD

The first step is creating a template container with the requisite packages and configurations. This template is a basic distribution of Ubuntu 18.04, which has already been installed in the development environment.

# creating the template container
lxc cp template-ubuntu1804 template-BetterView
lxc start template-BetterView
lxc exec template-BetterView bash

These packages install the necessary supporting packages, installing PostgreSQL from the community repository:

apt install -y wget gnupg2

These steps are copied from the community download page for Ubuntu distributions:

echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list 
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

The aforementioned repository is now updated, making it possible to install our two versions of PostgreSQL (i.e., 9.6 and 11). Installing pg_repack pulls in the requisite packages while installing this very useful extension at the same time:

apt update
apt install -y postgresql-11-repack postgresql-9.6-repack

These next packages are useful. Midnight Commander (mc) is a terminal-based shell navigator and file manager, while the other package installs utilities, such as netstat, to monitor the status of all network-based services on the localhost:

apt install -y mc net-tools

This last step merely updates the man pages database and the mlocate database, making it easier to locate files on the host. Beware that this can be a security risk if used on a production host.

mandb && updatedb

This little snippet of code creates our simulated production host. Creating the instance from a template container makes it much easier to try different variations in quick order:

# creating the POC upgrade container
lxc rm --force pg1-BV
lxc cp template-BetterView pg1-BV
lxc start pg1-BV
lxc exec pg1-BV bash

As per our scenario, upgrading PostGIS requires two different versions to be installed on the host. Notice that PostgreSQL version 9.6 has the older version of PostGIS, while version 11 has the newer one.

For our purposes, this presentation assumes upgrading both PostgreSQL and PostGIS is the method to be used.

ATTENTION: Executing this upgrade operation in two distinct phases is preferred. Either upgrade PostgreSQL and then upgrade PostGIS, or upgrade PostGIS on the old version to match the new version on PostgreSQL and then upgrade the PostgreSQL data cluster.

The underlying assumption is that application code can break between PostGIS version upgrades; therefore, pursuing an incremental process can mitigate potential issues.

https://postgis.net/docs/PostGIS_Extensions_Upgrade.html

https://postgis.net/workshops/postgis-intro/upgrades.html

apt install -y postgresql-9.6-postgis-2.4 postgresql-11-postgis-3

About PostGIS

Available versions of PostGIS, as per the community repository at the time of this blog’s publication:

  • 9.6:
    • postgresql-9.6-postgis-2.4
    • postgresql-9.6-postgis-2.5
    • postgresql-9.6-postgis-3
  • 11:
    • postgresql-11-postgis-2.5
    • postgresql-11-postgis-3
  • PostGIS supported versions matrix

ATTENTION: Azure supports only PostgreSQL 9.6 with PostGIS 2.3.2.

Before You Upgrade

About

This query lists all user-defined functions that have been installed in your database. Use it to summarize not only what you’ve created but the entire suite of PostGIS function calls:

--
-- get list of all PostGIS functions
--
select nspname, proname
from pg_proc
join pg_namespace on pronamespace=pg_namespace.oid
where nspname not in ('pg_catalog','information_schema')
order by 1,2;

In order to validate your functions, you need to know which ones are being used; tracking the functions prior to the upgrade process will identify them. Please note there are two useful settings, i.e., pl and all. Out of an abundance of caution, it is suggested to initially use all for an extended period of time:

--
-- postgresql.conf
-- track_functions = none                    # none, pl, all
--
alter system set track_functions=all;
select pg_reload_conf();

This view collects all the statistics related to function calls:

--
-- track function activity
--
            View "pg_catalog.pg_stat_user_functions"
   Column   |       Type       | Collation | Nullable | Default
------------+------------------+-----------+----------+---------
 funcid     | oid              |           |          |
 schemaname | name             |           |          |
 funcname   | name             |           |          |
 calls      | bigint           |           |          |
 total_time | double precision |           |          |
 self_time  | double precision |           |          |

Example

This is a simple example demonstrating the tracking of function call usage. Note there are two functions, and one of them is invoked by the other:

CREATE OR REPLACE FUNCTION f1 (
    in  a integer,
    out b integer
) AS
$$
BEGIN
    raise notice 'function f1 is called';
    perform pg_sleep(1);
    b = a+1;
END
$$
LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION f2 (
    in  c integer,
    out d integer
) as
$$
BEGIN
    raise notice 'function f2 is called';
    perform f1(c);
    raise notice 'returning from f2';
    d := 0;
END
$$
language plpgsql;

This SQL statement resets all statistics being tracked in the PostgreSQL database. Please note there are other functions that can be used to reset specific statistics while preserving others:

select * from pg_stat_reset();

And here’s our functions’ invocations:

db01=# select * from f1(4);
NOTICE:  function f1 is called
 b
---
 5
db01=# select * from f2(4);
NOTICE:  function f2 is called
NOTICE:  function f1 is called
NOTICE:  returning from f2
 d
---
 0
db01=# select * from pg_stat_user_functions;
 funcid | schemaname | funcname | calls | total_time | self_time
--------+------------+----------+-------+------------+-----------
  17434 | public     | f1       |     2 |   2002.274 |  2002.274
  17437 | public     | f2       |     1 |   1001.126 |     0.599

An Upgrade Example Using pg_upgrade

SYNOPSIS

There are two discrete upgrades:

  1. pg_upgrade: pg 9.6 -> pg 11
  2. PostGIS upgrade: postgis-2.4 -> postgis-2.5 -> postgis-3

HOUSE CLEANING

An Ubuntu-based upgrade requires removing the target data cluster because installing PostgreSQL packages onto a Debian-based distro always includes creating a data cluster:

pg_lsclusters
Ver Cluster   Port   Status   Owner       Data directory
9.6 main      5432   online   postgres    /var/lib/postgresql/9.6/main
11  main      5434   online   postgres    /var/lib/postgresql/11/main  

pg_dropcluster --stop 11 main

For our purposes we are simply adding the extension; no user-defined functions have been included:

su - postgres
createdb -p 5432 db01
psql -p 5432 db01 -c "create extension PostGIS"
exit

Shutting down the source data cluster is the last step before the upgrade process can begin:

systemctl stop postgresql@9.6-main

Debian-based distros provide a convenient CLI, making upgrades easy:

# /usr/bin/pg_upgradecluster [OPTIONS] <old version> <cluster name> [<new data directory>]
pg_upgradecluster -v 11 9.6 main

It’s important to check the upgrade logs before starting PostgreSQL version 11. This is a one-way process: once the new cluster is active, the old PostgreSQL 9.6 cluster is no longer available and must be destroyed:

systemctl start postgresql@11-main
pg_dropcluster --stop 9.6 main

Here’s confirmation of the PostgreSQL and PostGIS upgrade respectively:

su - postgres
psql -p 5432 db01
show server_version;
           server_version
------------------------------------
 11.14 (Ubuntu 11.14-1.pgdg18.04+1)
select * from PostGIS_version();
            PostGIS_version
---------------------------------------
 3.1 USE_GEOS=1 USE_PROJ=1 USE_STATS=1

PostGIS Function API, Inspection, and Review

This is critical; this process validates that the application logic still works or identifies that it must be updated.

METHOD: inspect each function call used between all versions:

  • from 2.4 -> 2.5
  • from 2.5 -> 3.0
  • from 3.0 -> 3.1

TIP: the 3.1 documentation encapsulates all previous versions, i.e., section 9.12

REFERENCES:

Regression Testing

  • In the current setup, pg 9.6
    • Identify all functions used in PostGIS
    • Execute a simple function call with every type of parameter typically used in your environment
    • Collect, record all variables returned
  • In the target setup, pg 11 or pg 13
    • Execute a simple function call with every type of parameter typically used in your environment
    • Collect, record all variables returned
  • Analysis
    • Compare values: similar values mean you don’t have a problem
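For example, a minimal regression check of this kind could be a handful of calls recorded on both sides and diffed afterwards (the functions and values below are only illustrative; use the functions your application actually relies on):

select postgis_full_version();
select ST_AsText(ST_Buffer(ST_GeomFromText('POINT(0 0)'), 1, 8));
select ST_Distance(ST_GeomFromText('POINT(0 0)'), ST_GeomFromText('POINT(3 4)'));  -- expect 5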

Working With Cloud Provider Technology

Be advised, cloud environments are not ideal upgrade candidates. The aforementioned process is quite detailed and will facilitate a successful upgrade.

  • AZURE
    • pg 9.6: PostGIS 2.3.2
    • pg 11: PostGIS 2.5.1
  • AMAZON
    • pg 11, 13: PostGIS 3.1.4
    • pg 9.6.*: PostGIS 2.3.[0247], 2.5.[25]

References:

https://docs.microsoft.com/en-us/azure/postgresql/concepts-extensions

https://docs.microsoft.com/en-us/azure/postgresql/concepts-extensions#postgres-96-extensions

https://docs.microsoft.com/en-us/azure/postgresql/concepts-extensions#postgres-11-extensions

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.PostGIS.html


As more companies look at migrating away from Oracle or implementing new databases alongside their applications, PostgreSQL is often the best option for those who want to run on open source databases.

Read Our New White Paper:

Why Customers Choose Percona for PostgreSQL

Nov
18
2021
--

Should I Create an Index on Foreign Keys in PostgreSQL?

Index on Foreign Keys in PostgreSQL

Welcome to a weekly blog where I get to answer (like, really answer) some of the questions I’ve seen in the webinars I’ve presented lately. If you missed the latest one, PostgreSQL Performance Tuning Secrets, it might be helpful to give some of it a listen before or after you read this post. Each week, I’ll dive deep into one question. Let me know what you think in the comments. 

We constantly hear that indexes improve read performance, and it’s usually true, but we also know that they will always have an impact on writes. What we don’t hear about too often is that in some cases an index may not give any performance improvement at all. This happens more than we want, and might happen more than we even notice, and foreign keys (FKs) are a great example. I’m not saying that all FK indexes are bad, but most of the ones I’ve seen are just unnecessary, only adding load to the system.

Take, for example, the relationship below, where we have a 1:N relationship between the “Supplier” and “Product” tables:

foreign keys index

If we pay close attention to the FK in this example, the child table won’t see a high number of lookups using the FK column (“SupplierID” here) compared with the number of lookups using “ProductID” and probably “ProductName”. The major usage will be to keep the relationship consistent and to search in the other direction, finding the supplier for a certain product. In this circumstance, adding an index to the FK column on the child table, without ensuring the access pattern requires it, only adds the extra cost of updating the index every time we update the “Product” table.

Another point we need to pay attention to is index cardinality. If the index cardinality is too low, Postgres won’t use it and the index will simply be ignored. One can ask why that happens: wouldn’t it still be cheaper for the database to go through, say, half of the index instead of doing a full table scan? The answer is no, especially for databases that use heap tables like Postgres. Table access in Postgres (a heap table) is mostly sequential, which is faster than random access on spinning HDDs and still a bit faster on SSDs, while b+-tree index access is random by nature.

When Postgres uses an index, it needs to open the index file, find the records it needs, and then open the table file and fetch the pages whose addresses it got from the index, changing the access pattern from sequential to random. Depending on the data distribution, it will probably access the majority of the table pages, ending up with the equivalent of a full table scan done with random access, which is much more expensive. If we have columns with low cardinality and we really need to index them, we need an alternative to b-tree indexes, for example a GIN index, but that is a topic for another discussion.
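A quick way to gauge whether a column is even worth indexing is to look at the planner statistics. The sketch below assumes a hypothetical “product” table with “supplierid” and “status” columns; n_distinct is an estimate, with negative values expressing the distinct count as a fraction of the row count.

-- Estimated distinct values per column (table and column names are illustrative).
SELECT schemaname, tablename, attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'product'
  AND attname IN ('supplierid', 'status');

-- Compare against the estimated row count to gauge selectivity.
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'product';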

With all of that, we may think that FK indexes are always evil and should never be used on a child table, right? Well, that’s not true either, and there are many circumstances where they are useful and needed. For example, the picture below has another two tables, “Customer” and “Order”:

FK child table

It can be handy to have an index on the child table “Order->CustomerId”, as it’s common to show all orders for a certain customer, and the column “CustomerId” on the table “Order” will be used quite frequently as the lookup key.
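A minimal sketch of that index, using the table and column names from the diagram; the additional column names in the lookup query (“OrderId”, “OrderDate”) are assumptions for illustration, and quoting/casing should be adjusted to your actual schema.

-- Index the FK column on the child table because the workload looks up orders by customer.
CREATE INDEX CONCURRENTLY idx_order_customerid ON "Order" ("CustomerId");

-- Typical lookup that benefits from the index ("OrderId" and "OrderDate" are illustrative columns).
SELECT "OrderId", "OrderDate"
FROM "Order"
WHERE "CustomerId" = 42;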

Another good example is to provide a faster method to validate referential integrity. If one needs to change the parent table (update or delete any parent key), the children need to be checked to make sure that the relationship isn’t broken. In this case, having an index on the child’s side helps to improve performance. Whether the index is worth it, however, is load-dependent. If the parent key sees many deletes, it might be a case to consider; however, if it’s a mostly static table, or one that mostly receives inserts or updates to columns other than the parent key column, then it’s not a good candidate for an index on the child table.
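To make that concrete, the referential-integrity check that fires when a parent key is deleted or updated boils down to a probe of the child table on the FK column. The sketch below, reusing the “Customer”/“Order” example with an illustrative key value, shows roughly what that probe looks like and lets you check with EXPLAIN whether it would use an index or scan the whole child table.

-- The RI trigger's probe is roughly equivalent to the query below; without an index on
-- "CustomerId", every delete/update of a parent key scans the whole "Order" table.
EXPLAIN (COSTS OFF)
SELECT 1 FROM "Order" WHERE "CustomerId" = 42 FOR KEY SHARE;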

There are many other examples that can be given to explain why an index on the child table might be useful and worth the extra write cost/penalty.

Conclusion

The takeaway here is that we should not indiscriminately create indexes on all FKs, because many of them will just not be used, or will be used so rarely that they aren’t worth the cost. It’s better to initially design the database with the FKs but without the indexes, and add indexes as the database grows and we understand the workload. It’s possible that at some point we find we need an index on “Product->SupplierId” because of our workload, and that the index on “Order->CustomerId” isn’t necessary anymore. Loads change, and data distribution changes as well; indexes should follow them and not be treated as immutable entities.
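As a follow-up, the cumulative statistics views make it easy to revisit those decisions periodically; a minimal sketch is below, where an index that still shows idx_scan = 0 after a representative period of production traffic is a candidate for removal.

-- Indexes that are never (or rarely) used, largest first.
SELECT schemaname, relname AS table_name, indexrelname AS index_name,
       idx_scan, pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC, pg_relation_size(indexrelid) DESC;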



Nov
10
2021
--

Issues Upgrading PostGIS and PostgreSQL

PostGIS is arguably the best GIS implementation in the world. There may be other database technologies with GIS but the value proposition of quality, performance, and sophistication, and, of course, cost, is nowhere near what PostgreSQL offers.

So let me set the stage for this blog of mine: you’ve been using PostGIS on your PostgreSQL server for a few years now but having reached its end-of-life, it’s time to upgrade the system.

How do you do it?

Most times upgrading PostgreSQL, although a significant activity, can be a pain-free experience. And for those people who’ve implemented their PostgreSQL database in the cloud, upgrading PostgreSQL is often a matter of just clicking a few buttons. The only real work is fleshing out the steps as you compose your run-book and rehearse drills before performing your scheduled maintenance window.

Upgrading a PostgreSQL server with extensions is a matter of ensuring the same versions are present on both source and target hosts. In the case of extensions bundled with PostgreSQL, such as pg_stat_statements, there’s nothing to worry about since compatibility is pretty much guaranteed. Things are just a touch more complex when using popular extensions such as pg_repack as you must install its package on the target host. Less popular extensions can be more challenging in that they may not have been as rigorously tested. But in the case of upgrading a PostGIS-enabled database engine, you can bid any hopes for a stress-free experience a sweet goodbye as it saunters out the door waving fondly at you.

Looking at the library dependencies for pg_stat_statements, pg_repack and postgis respectively should help you appreciate the issues a little better:

pg_stat_statements

ldd /usr/lib/postgresql/12/lib/pg_stat_statements.so
linux-vdso.so.1 (0x00007ffc3759c000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f5e4ff60000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5e4fb6f000)
/lib64/ld-linux-x86-64.so.2 (0x00007f5e50508000)

pg_repack

ldd /usr/lib/postgresql/12/lib/pg_repack.so
linux-vdso.so.1 (0x00007ffeaaf8c000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9601226000)
/lib64/ld-linux-x86-64.so.2 (0x00007f960181e000)

postgis

ldd /usr/lib/postgresql/12/lib/postgis-3.so
linux-vdso.so.1 (0x00007fff7f736000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f953bf27000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f953bb9e000)
libgeos_c.so.1 => /usr/lib/x86_64-linux-gnu/libgeos_c.so.1 (0x00007f953b96d000)
libproj.so.12 => /usr/lib/x86_64-linux-gnu/libproj.so.12 (0x00007f953b704000)
libjson-c.so.3 => /lib/x86_64-linux-gnu/libjson-c.so.3 (0x00007f953b4f9000)
libprotobuf-c.so.1 => /usr/lib/x86_64-linux-gnu/libprotobuf-c.so.1 (0x00007f953b2f0000)
libxml2.so.2 => /usr/lib/x86_64-linux-gnu/libxml2.so.2 (0x00007f953af2f000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f953ad17000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f953a926000)
/lib64/ld-linux-x86-64.so.2 (0x00007f953c612000)
libgeos-3.7.1.so => /usr/lib/x86_64-linux-gnu/libgeos-3.7.1.so (0x00007f953a583000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f953a364000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f953a160000)
libicuuc.so.60 => /usr/lib/x86_64-linux-gnu/libicuuc.so.60 (0x00007f9539da8000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f9539b8b000)
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f9539965000)
libicudata.so.60 => /usr/lib/x86_64-linux-gnu/libicudata.so.60 (0x00007f9537dbc000)

PostGIS is probably the most sophisticated extension you will ever encounter on PostgreSQL. Its dependencies encompass not just the database system but a number of disparate libraries and projects to which the RDBMS, under normal circumstances, has no relation.

Let’s itemize some key issues:

  • The version numbering scheme used by PostGIS is not the same as PostgreSQL’s. For example, PostgreSQL version 13.4 means the major version is “13” and the minor, or bug-fix, release is “4”. But PostGIS still uses the old numbering system that the PostgreSQL project followed before version 10. For example, PostGIS version 2.4.1 means the major version is “2.4” and the minor release is “1”.
  • All currently supported versions of PostGIS are NOT packaged for all currently maintained versions of PostgreSQL. For example, in the case of Ubuntu, the PostgreSQL community repository currently makes PostGIS 2.4 available for PostgreSQL versions 10 and older, but it isn’t found for versions 11 and newer, even though the PostGIS support matrix, which lists all currently supported versions of PostGIS and their end-of-life dates, shows that version 2.4 is technically supported on PostgreSQL 11. On the other hand, packages do exist on the Red Hat/CentOS side of things. This isn’t laziness on the part of the maintainers; rather, there’s so much in the way of advances that it’s simply not practical to carry this technical debt for longer than a few years at a time.
  • Best practices require the same versioned extension to be present on both source and target data clusters. Keep in mind that when it comes to major version differences, some extensions are more flexible than others. Otherwise, one is forced to remove the extension from the source data cluster if it isn’t present on the target.
  • For some extensions, such as the internal ones that come bundled with the PostgreSQL packages and certain 3rd party extensions, such as pg_repack, it is possible to upgrade between extension major versions without issue.
  • In regards to PostGIS: when faced with different extension versions between the source and target data clusters, one must begin by upgrading PostGIS on the source data cluster so that it matches BOTH the major and minor versions on the target data cluster, and only then perform the PostgreSQL upgrade. Alternatively, one can install a downgraded major/minor version of PostGIS on the target data cluster, matching the source data cluster’s major and minor version of PostGIS, and then upgrade PostgreSQL (yes, this is convoluted). A sketch of the version checks involved follows this list.
  • Some cloud providers do NOT offer more than one version of PostGIS for any particular version of PostgreSQL, thereby making the previous recommendation moot. The reason is simple: they are not prepared to handle PostGIS upgrades. There are ways of dealing with them, but that’s for another blog. Please note that this is a rapidly moving target and can/will probably change in the future.
  • Because the PostgreSQL community repository removes all versions of PostgreSQL and their extensions once they have reached end-of-life, PostGIS package support becomes problematic when upgrading database servers that have reached end-of-life.
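As a minimal sketch of the version checks mentioned above: the queries below report what is installed and what the packages on a host make available, and the final statement shows the general form of the in-database extension upgrade once matching packages are in place (the target version string is illustrative).

-- What is currently installed (run on both source and target).
SELECT postgis_full_version();

-- Which PostGIS extension versions the installed packages offer.
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name LIKE 'postgis%';

-- General form of the upgrade once the matching packages are installed
-- ('3.1.4' is an illustrative target version).
ALTER EXTENSION postgis UPDATE TO '3.1.4';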

Over the past twenty years, I’ve had occasion to upgrade some fairly large installations using PostGIS, the most complex being over 120TB data on a version of PostgreSQL that hadn’t been upgraded in several years and the PostGIS was so old that it wasn’t even installed using extensions. As a result of this and other experiences I have, where possible, adopted the practice of performing PostGIS and PostgreSQL upgrades as separate exercises.

I can’t emphasize enough that regression testing of each GIS function call must be performed thoroughly in a development environment before going into production. Because the PostGIS project’s development cycle advances at the same torrid pace as the demand for GIS, the intrinsic behavior of a function can change seemingly without warning. Therefore, merely validating the API via a document review is insufficient, i.e. TEST IT!

Here’s a tip: Turn on run-time parameter track_functions and you’ll be able to identify all those wonderful PostGIS functions your application stack depends upon.
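A minimal sketch of that tip: with track_functions set to “all” (a per-session change may require superuser privileges; it can also be set cluster-wide in postgresql.conf), the pg_stat_user_functions view accumulates call counts for the functions the application actually exercises, PostGIS functions included.

-- Enable function-call tracking for this session (or set it in postgresql.conf).
SET track_functions = 'all';

-- After exercising the application, list the functions it actually called.
SELECT schemaname, funcname, calls, total_time
FROM pg_stat_user_functions
ORDER BY calls DESC;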

Finally, by all means, take a look at an earlier blog, “Working with PostgreSQL and PostGIS: How To Become A GIS Expert”, for more information about PostGIS.


Oct
14
2021
--

Custom Percona Monitoring and Management Metrics in MySQL and PostgreSQL

mysql postgresql custom metrics

A few weeks ago we did a live stream talking about Percona Monitoring and Management (PMM) and showcased some of the fun things we were doing at the OSS Summit.  During the live stream, we tried to enable some custom queries to track the number of comments being added to our movie database example.  We ran into a bit of a problem live and did not get it to work. As a result, I wanted to follow up and show you all how to add your own custom metrics to PMM and show you some gotchas to avoid when building them.

Custom metrics are defined in files deployed on each database server you are monitoring (not on the PMM server itself).  You can add custom metrics by navigating to one of the following directories:

  • For MySQL:  /usr/local/percona/pmm2/collectors/custom-queries/mysql
  • For PostgreSQL:  /usr/local/percona/pmm2/collectors/custom-queries/postgresql
  • For MongoDB:  This feature is not yet available – stay tuned!

You will notice the following directories under each directory:

  • high-resolution/  – every 5 seconds
  • medium-resolution/ – every 10 seconds
  • low-resolution/ – every 60 seconds

Note you can change the frequency of the default metric collections up or down by going to the settings and changing them there.  It would be ideal if, in the future, we added a resolution setting in the YML file directly.  But for now, it is a universal setting:

Percona Monitoring and Management metric collections

In each directory you will find an example .yml file with a format like the following:

mysql_oss_demo: 
  query: "select count(1) as comment_cnt from movie_json_test.movies_normalized_user_comments;"
  metrics: 
    - comment_cnt: 
        usage: "GAUGE" 
        description: "count of the number of comments coming in"

Our error during the live stream was that we forgot to qualify the table with the database name in our query (i.e. database_name.table_name), but a bug prevented us from seeing the error in the log files.  There is no setting for the database in the YML file, so take note.

This will create a metric named mysql_oss_demo_comment_cnt in whatever resolution you specify.  Each YML file executes separately with its own connection.  This is important to understand: if you deploy lots of custom queries, you will see a steady number of extra connections (something to consider if you are doing custom collections).  Alternatively, you can add several queries and metrics to the same file, but they are executed sequentially.  If, however, the entire YML file cannot be completed in less time than the defined resolution (i.e. finished within five seconds for high resolution), the data will not be stored, but the query will continue to run.  This can lead to a query pile-up if you are not careful.  For instance, the above query generally takes 1-2 seconds to return the count.  I placed this in the medium bucket.  As I added load to the system, the query time backed up.

You can see the slowdown.  You need to be careful here and choose the appropriate resolution.  Moving this over to the low resolution solved the issue for me.

That said, query response time is dynamic based on the conditions of your server.  Because these queries will run to completion (and in parallel if the run time is longer than the resolution time), you should consider limiting the query time in MySQL and PostgreSQL to prevent too many queries from piling up.

In MySQL you can use:

mysql>  select /*+ MAX_EXECUTION_TIME(4) */  count(1) as comment_cnt from movie_json_test.movies_normalized_user_comments ;
ERROR 1317 (70100): Query execution was interrupted

And on PostgreSQL you can use:

SET statement_timeout = '4s'; 
select count(1) as comment_cnt from movies_normalized_user_comments ;
ERROR:  canceling statement due to statement timeout

By forcing a timeout, you can protect yourself.  That said, the cancellations are reported as errors, so you may see them in the error log.
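On the PostgreSQL side, the per-statement timeout shown above can also be applied at the role level, so every session opened by the monitoring account inherits it. A minimal sketch is below, where “pmm” is an assumed name for the user the agent connects with, and “4s” is an illustrative limit.

-- Cap run time for every session of the monitoring user ('pmm' and '4s' are assumptions).
ALTER ROLE pmm SET statement_timeout = '4s';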

You can check the system logs (syslog or messages) for errors with your custom queries (note that as of PMM 2.0.21, errors were not making it into these logs because of a potential bug).  If the data is being collected and everything is set up correctly, head over to the default Grafana explorer or the “Advanced Data Exploration” dashboard in PMM.  Look for your metric and you should be able to see the data graphed out:

Advanced Data Exploration PMM

In the above screenshot, you will notice some pretty big gaps in the data (in green).  These gaps were caused by our query taking longer than the resolution bucket.  You can see when we moved to 60-second resolution (in orange), the graphs filled in.

Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.

Download Percona Monitoring and Management Today
