Dec 23, 2022
--

Diffing PostgreSQL Schema Changes

One of the routine operations when administering PostgreSQL is the periodic update of the database schema. PostgreSQL does a good job of allowing one to update a schema, add types, functions, triggers, or alter a table by adding and removing columns and updating column data types, etc., in a reliable manner. However, there is no built-in mechanism to help identify the differences, let alone generate the necessary SQL, to carry those updates from the development environment to production in an easy manner.

So let’s talk about possible approaches to schema changes.

Using logical dump manifests

The easiest way to identify changes between schemas from one database to another is to compare their schema dump manifests.

The following example demonstrates an approach one can take when looking for differences between schemas on different databases:

EXAMPLE:

-- create database schemas
create database db01;
create database db02;


-- db01: version 1
create table t1 (
    c1 int,
    c2 text,
    c4 date
);

create table t2(
    c1 int,
    c2 varchar(3),
    c3 timestamp,
    c4 date
);
-- db02: version 2
create table t1 (
    c1 serial primary key,
    c2 varchar(256),
    c3 date default now()
);

create table t2(
    c1 serial primary key,
    c2 varchar(3),
    c3 varchar(50),
    c4 timestamp with time zone default now(),
    c5 int references t1(c1)
);

create index on t2 (c5);

 

# generate schema dumps
pg_dump -s db01 -Fc > db01.db
pg_dump -s db02 -Fc > db02.db
# generate manifests
pg_restore -l db01.db > db01_manifest.ini
pg_restore -l db02.db > db02_manifest.ini

This snippet demonstrates looking for differences by comparing the md5 checksums:

# EX 1: generate checksums
md5sum \
<(tail +16 db01_manifest.ini | cut -d ' ' -f 4-) \
<(tail +16 db02_manifest.ini | cut -d ' ' -f 4-)

# output
9d76c028259f2d8bed966308c256943e  /dev/fd/63
ba124f9410ea623085c237dc4398388a  /dev/fd/62

This next snippet diffs the two manifests, identifying only those objects and attributes that have changed. Notice that the redundant header information, the first 15 lines of each manifest, is skipped:

# EX 2: perform diff
diff \
<(tail +16 db01_manifest.ini | cut -d ' ' -f 4-) \
<(tail +16 db02_manifest.ini | cut -d ' ' -f 4-)

 

This resultant diff shows the changes made between the two schemas:

1a2,3
> SEQUENCE public t1_c1_seq postgres
> SEQUENCE OWNED BY public t1_c1_seq postgres
2a5,12
> SEQUENCE public t2_c1_seq postgres
> SEQUENCE OWNED BY public t2_c1_seq postgres
> DEFAULT public t1 c1 postgres
> DEFAULT public t2 c1 postgres
> CONSTRAINT public t1 t1_pkey postgres
> CONSTRAINT public t2 t2_pkey postgres
> INDEX public t2_c5_idx postgres
> FK CONSTRAINT public t2 t2_c5_fkey postgres

The good news is that there are a number of existing tools that can reconcile differences between a proposed schema design and the target schema:

  1. Commercial offerings can diff schemas between databases in an elegant and efficient manner; a quick web search yields the most popular technologies one can use.
  2. In regard to open source solutions, there are a number of projects capable of diffing Postgres database schemas.

Working with the apgdiff tool

The following is an example using the open source tool apgdiff.

Apgdiff can be found in the Postgres community repository. It compares two schema dump files and creates an SQL output file that is, for the most part, suitable for upgrades of old schemata:

Package: apgdiff

Version: 2.7.0-1.pgdg18.04+1
Architecture: all
Maintainer: Debian PostgreSQL Maintainers <team+postgresql@tracker.debian.org>
Installed-Size: 173
Depends: default-jre-headless | java2-runtime-headless
Homepage: https://www.apgdiff.com/
Priority: optional
Section: database
Filename: pool/main/a/apgdiff/apgdiff_2.7.0-1.pgdg18.04+1_all.deb
Size: 154800
SHA256: 9a83fcf54aed00e1a28c3d00eabe1c166977af1e26e91035e15f88b5215b181b
SHA1: ea713acb55898f07374dadd1bebb09ec2fa4b589
MD5sum: e70a97903cb23b8df8a887da4c54e945

The following example demonstrates how one can update differences between the development environment and the production database schema using apgdiff.

EXAMPLE:

apt install -y apgdiff
# EX 1: dump as SQL statements
pg_dump -s db01 -Fp > db01.sql
pg_dump -s db02 -Fp > db02.sql
createdb db03 --template=db01

apgdiff --ignore-start-with db01.sql db02.sql > db01-db02.sql

# “psql -1” encapsulates statements within a transaction
psql -1 -f db01-db02.sql db03

 

# EX 2: uses logical dumps
# notice the dumps are standard logical dumps and includes data
pg_dump db01 -Fc > db01.db
pg_dump db02 -Fc > db02.db
createdb db03 --template=db01

# this invocation assumes the resultant diff doesn’t require editing
apgdiff --ignore-start-with \
    <(pg_restore -s -f - db01.db) \
    <(pg_restore -s -f - db02.db) \
    | psql -1 db03

There's more you can accomplish with these simple approaches. By incorporating variations of them, one can create fairly sophisticated shell scripts with little code and, with a little luck, not that much effort; one such sketch follows.
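
This is a minimal sketch, reusing the database names and the manifest header offset from the examples above; adjust them for your environment:

#!/bin/bash
# report schema drift between two databases by comparing their dump manifests
set -euo pipefail
DB1=${1:-db01}
DB2=${2:-db02}

pg_dump -s "$DB1" -Fc > "$DB1".db
pg_dump -s "$DB2" -Fc > "$DB2".db

# skip the manifest header and keep only the object descriptions
if diff <(pg_restore -l "$DB1".db | tail -n +16 | cut -d ' ' -f 4-) \
        <(pg_restore -l "$DB2".db | tail -n +16 | cut -d ' ' -f 4-) > schema_drift.txt
then
    echo "no schema drift detected between $DB1 and $DB2"
else
    echo "schema drift detected, see schema_drift.txt"
fi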

Dec 14, 2022
--

Powering PostgreSQL 15 With Columnar Tables

Prologue

This blog is the result of my investigation into column-wise tables. Hopefully, you will find it useful in your deliberations.

When I started writing, this was meant to be a “Good News” blurb. I was pretty optimistic that I’d truly have impressive numbers to share but, as it turns out, while there is potential for significant advantage, a detailed testing regime of the CITUS columnar store extension should be carried out before implementing it in your own environment.

The missing sizzle

Sizzle: A phrase used to show affirmation or approval. It references the sound that one hears when cooking certain delicious foods i.e. bacon or fajitas and therefore transfers those feelings to non-edible objects or ideas.

There’s a lot to take in and I’m often impressed with every new version of PostgreSQL that is released. Nevertheless, there’s been a long-standing feature that, to be honest, I’ve always felt was missing. This weakness is being addressed by the industry. Ironically, some of these implementations are in fact Postgres derivatives.

What is this sizzling feature that I speak of you ask?

Column-wise tables!

When to consider column-wise tables

Consider using the CITUS Columnar Extension under one or more of the following conditions:

  • In regards to table attributes:
    • they are wide with “many” columns
    • size should at a minimum be in the tens of GB
    • the typical row is byte “heavy”
  • System space is at a premium i.e. it’s worth your while to manage space utilization as much as possible.
  • OLAP is a major component of overall activity i.e. lots of different kinds of SELECT statements.
  • INSERT performance is not a priority.
  • Indexes are not feasible 
    • Creating EXPRESSION indexes on columnar tables is significantly faster than on heap tables.

CAVEAT: You cannot perform UPDATE, DELETE operations on a columnar table.

OLTP vs OLAP

Let’s get back to basics. In the database world there are essentially two types of database operations:

    1. Online Transaction Processing, OLTP: Online transaction processing applications have high throughput and are insert- or update-intensive. These applications are used concurrently by hundreds of users. The key goals of OLTP applications are availability, speed, concurrency, and recoverability.

    2. Online analytical processing, OLAP: Online analytical processing applications enable users to analyze multidimensional data interactively from multiple perspectives, which involves three basic analytical operations: 
      1. consolidation (roll-up)
      2. drill-down
      3. slicing and dicing

About columnar tables

As an RDBMS, PostgreSQL is geared more toward OLTP than OLAP operations. Manipulating data by inserting, updating, deleting, etc., it does well. But while eminently capable of performing OLAP, it’s not quite as efficient. The primary reason is actually a pretty common characteristic among most RDBMS i.e. it stores and processes collections of data as individual records, otherwise known as row-wise processing. 

Suppose you are performing a deep-dive analysis involving a table of records containing 50 columns. And further, suppose that your query only needs a single column. It still needs to read all of the data from those 50 columns per row in order to access that single column. And if the query processes 100 million rows, that’s definitely not trivial!

Now let’s consider a reorientation of the table i.e. column-wise. In this case, the query only reads one column and not 50. The result is that the query is now much lighter, requiring less IO and processing power yet achieving the same result in significantly less time.

Introducing the CITUS columnar extension

As a General Purpose Database Management System, basic behaviors can be reconfigured in Postgres for different purposes. And as such, it is possible to enhance PostgreSQL with columnar tables by using the CITUS columnar table extension.

Columnar storage is just one part of the larger set of capabilities of the CITUS extension that, when fully implemented, creates a fully scalable, distributed Postgres database system.

The CITUS columnar extension feature set includes:

  • Highly compressed tables:
    • Reduces storage requirements
    • Reduces the IO needed to scan the table
  • Projection Pushdown: Only those column(s) within the SELECT are returned, further reducing IO.
  • Chunk Group Filtering: Allows queries to skip over whole groups of unrelated data without processing them (a short illustration follows this list).
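
Here is a minimal, hedged illustration of the last two points. It assumes the citus_columnar extension is already installed (installation is covered later in this post) and simply inspects the plan of a selective query against a small columnar table; the table and column names are purely illustrative:

create extension if not exists citus_columnar;

create table metrics (id int, qty int) using columnar;
insert into metrics select g, (random()*1000)::int from generate_series(1,1000000) g;

-- only the qty column needs to be read (projection pushdown), and chunk
-- groups whose id range falls outside the predicate can be skipped
explain (analyze) select sum(qty) from metrics where id < 1000;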

The complete CITUS feature set, which is not covered in this blog except for the columnar storage component, includes:

  • Distributed tables
  • Reference tables
  • Distributed query engine that routes and parallelizes SELECT and DML operations
  • Columnar storage
    • compresses data
    • speeds up scans, 
    • supports fast projections
  • Query from any node

Getting it

This is a bit of a bad-news, good-news, excellent-news situation.

BAD-NEWS: At the time of writing this blog, the only publicly available packages on the Postgres repositories that I could find were the cstore foreign data wrappers which cover only version 12 and older versions of Postgres.

GOOD-NEWS: The CITUS extension is available on the Citus Data download page.

EXCELLENT-NEWS: As part of the process of writing this blog, custom DEB and RPM packages were created for PostgreSQL version 15 on CentOS 7 and 8 and Ubuntu 18.04 and 20.04, and they are available for download from github HERE.

First-time installation

Assuming you opted for the easy way, installing the packages made for this blog:

  • Go to the PostgreSQL download page and configure your packaging system for your Linux distribution.
  • Download the appropriate custom-made columnar package for your version of Linux:
# Ubuntu 18.04
wget https://github.com/rbernierZulu/columnar-tables-extension/blob/main/postgresql-15-citus-columnar_11.1-1-UBUNTU1804_amd64.deb

# Ubuntu 20.04
wget https://github.com/rbernierZulu/columnar-tables-extension/blob/main/postgresql-15-citus-columnar_11.1-1-UBUNTU2004_amd64.deb

# Centos 7
wget https://github.com/rbernierZulu/columnar-tables-extension/blob/main/postgresql-15-citus_columnar-11.1-1-CENTOS7.x86_64.rpm

# Centos 8
wget https://github.com/rbernierZulu/columnar-tables-extension/blob/main/postgresql-15-citus_columnar-11.1-1-CENTOS8.x86_64.rpm

 

  • Install the package: note that all the dependent packages, including the Postgres server, will be automatically installed onto your host. Pay attention to CENTOS 7 as it requires the epel-release repository too.
# Ubuntu 18.04
apt update
apt-get install ./postgresql-15-citus-columnar_11.1-1-UBUNTU1804_amd64.deb

# Ubuntu 20.04
apt update
apt-get install ./postgresql-15-citus-columnar_11.1-1-UBUNTU2004_amd64.deb

# Centos 7
# ATTENTION: the epel package must be installed beforehand!
yum install epel-release
yum install ./postgresql-15-citus_columnar-11.1-1-CENTOS7.x86_64.rpm

# Centos 8
dnf -qy module disable postgresql
dnf install ./postgresql-15-citus_columnar-11.1-1-CENTOS8.x86_64.rpm
dnf check-update
  • Initialize the postgres data cluster and configure it appropriately for your distribution of Linux in order to log in to postgres; an example follows.
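
For reference, the commands below are the stock steps for the PGDG packaging on each distribution; they are assumptions about a typical setup rather than anything specific to the columnar packages, so adapt them as needed:

# Ubuntu: the packages create a cluster automatically; just make sure it is running
pg_ctlcluster 15 main start
sudo -u postgres psql

# CentOS: initialize and start the cluster by hand
/usr/pgsql-15/bin/postgresql-15-setup initdb
systemctl enable --now postgresql-15
sudo -u postgres psql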

 

A working example of columnar tables

What follows here are the results of my analysis. Of course, there’s always more that can be said. Hopefully, this will give you enough of an overview of the possibilities of this extension.

Installing the extension into your database is pretty much standard fare:

-- create your extension in the database:
create database db01;
\c db01
create extension citus_columnar;

Here are two tables, of type HEAP and COLUMNAR, that will be used for the initial investigations. You’ll notice that it took less time to create the regular HEAP accessed table than the columnar table:

drop table if exists t1,t2;

\timing

-- Time: 7628.612 ms (00:07.629)
create table if not exists t1(id,qty) 
using heap 
as 
select (random()*10)::int, (random()*1000)::int from generate_series(1,10e6);

-- Time: 15635.768 ms (00:15.636)
create table if not exists t2(id,qty) 
using columnar 
as 
select (random()*10)::int, (random()*1000)::int from generate_series(1,10e6);
Table "public.t1"
Column |  Type   | Collation | Nullable | Default  
--------+---------+-----------+----------+---------
id     | integer |           |          |  
qty    | integer |           |          |  
Access method: heap

                Table "public.t2"
Column |  Type   | Collation | Nullable | Default  
--------+---------+-----------+----------+---------
id     | integer |           |          |  
qty    | integer |
Access method: columnar

The number of records is roughly 10 million:

with a(row_wise)    as (select count(*) from t1),
     b(column_wise) as (select count(*) from t2)
select row_wise,column_wise from a,b;

 row_wise | column_wise
----------+-------------
  9900001 |     9900001

This is very cool: the columnar compression does a great job of shrinking the table (a note on tuning the compression settings follows the listing):

\dt+ t[12]
                                   List of relations
 Schema | Name | Type  |  Owner   | Persistence | Access method |  Size  |
--------+------+-------+----------+-------------+---------------+--------+
 public | t1   | table | postgres | permanent   | heap          | 346 MB |
 public | t2   | table | postgres | permanent   | columnar      | 27 MB  |
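
As an aside, the compression behaviour can be tuned per table. The sketch below assumes the alter_columnar_table_set() helper and the option names shipped with the Citus columnar extension; verify both against the documentation for your installed version:

-- option names and values here are assumptions, not recommendations
select alter_columnar_table_set('t2',
    compression => 'zstd',
    compression_level => 3,
    chunk_group_row_limit => 10000,
    stripe_row_limit => 150000);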

SQL statements, preliminary

Let’s begin by comparing basic administration and SELECT statements of a HEAP vs COLUMNAR table. 

Examining the tabulated results, you'll see that columnar tables at best perform similarly to a HEAP table, and most of the time they take longer to execute the same operations.

Using a psql session, the following SQL statements are executed and examined for performance differences:

-- HEAP TABLE: 7.6s
create table if not exists t1(id,qty) using heap as select (random()*10)::int, (random()*1000)::int from generate_series(1,10e6);

-- COLUMNAR TABLE: 15.6s
create table if not exists t2(id,qty) using columnar as select (random()*10)::int, (random()*1000)::int from generate_series(1,10e6);

-- HEAP TABLE, adding 5 million records: 13.7s
do
$$
begin
    for i in 5.1e6..10e6
    loop
    insert into t1 values((random()*10)::int,(random()*1000)::int);
    end loop;
end
$$;

-- COLUMNAR TABLE, adding 5 million records: 18.5s
do
$$
begin
    for i in 5.1e6..10e6
    loop
    insert into t2 values((random()*10)::int,(random()*1000)::int);
    end loop;
end
$$;

-- HEAP TABLE, adding an index: 4.9s
create index on t1(id);

-- COLUMNAR TABLE, adding an index: 7.8s
create index on t2(id);

SQL statement query plans

Part one (10 million records, two-column tables)

Using the aforementioned table definitions, the following metrics were generated with the runtime parameter max_parallel_workers_per_gather = 4.

It's quite evident that, at least for these two tables, there is no performance benefit to a columnar-accessed table over a regular heap-accessed one:

-- HEAP TABLE: 742.411 ms
explain analyze select id,qty from t1;

-- COLUMNAR TABLE: 914.096 ms
explain analyze select id,qty from t2;

-- HEAP TABLE: 6441.425 ms
explain analyze select id,qty from t1 order by random();

-- COLUMNAR TABLE: 5871.620 ms
explain analyze select id,qty from t2 order by random();

-- HEAP TABLE: 329.562 ms
explain analyze select sum(qty) from t1;

-- COLUMNAR TABLE: 902.614 ms
explain analyze select sum(qty) from t2;

-- HEAP TABLE: 531.525 ms
explain analyze select id,sum(qty) from t1 group by id;

-- COLUMNAR TABLE: 1602.756 ms
explain analyze select id,sum(qty) from t2 group by id;

Part two (five million records, 100-column tables)

In order to get a better idea of the performance differences, a second set of tables at a greater scale was created. This time, while the number of records was halved, the number of columns was increased from two to one hundred.

Even if most of the columns are simply copies of one another, the columnar table's resultant compression is remarkable, with the on-disk size reduced by a factor of 752X.

/* TABLES

                                                  Table "public.t[34]"
 Column |  Type  | Collation | Nullable |                                  Default
--------+--------+-----------+----------+----------------------------------------------------------------------------
 c1     | bigint |           | not null | nextval('t1_c1_seq'::regclass)
 c2     | bigint |           |          |
 c3     | text   |           |          | 'aowjfa fawjfawofjawofjawoifawevvaerarpfjkaofvaweawe[OJARGOIAJOAFWF'::text
 c4     | text   |           |          | 'aowjfa fawjfawofjawofjawoifawevvaerarpfjkaofvaweawe[OJARGOIAJOAFWF'::text
.
.
.
 c98    | text   |           |          | 'aowjfa fawjfawofjawofjawoifawevvaerarpfjkaofvaweawe[OJARGOIAJOAFWF'::text
 c99    | text   |           |          | 'aowjfa fawjfawofjawofjawoifawevvaerarpfjkaofvaweawe[OJARGOIAJOAFWF'::text
 c100   | text   |           |          | 'aowjfa fawjfawofjawofjawoifawevvaerarpfjkaofvaweawe[OJARGOIAJOAFWF'::text
Indexes:
    "t1_pkey" PRIMARY KEY, btree (c1)
    "t1_c2_idx" btree (c2)

List of relations
 Schema | Name | Type  |  Owner   | Persistence | Access method | Size  | Description
--------+------+-------+----------+-------------+---------------+-------+-------------
 public | t3   | table | postgres | permanent   | heap          | 67 GB |
 public | t4   | table | postgres | permanent   | columnar      | 89 MB |
*/
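
The exact script used to build t3 and t4 isn't reproduced here, but a hedged sketch of how tables of this shape can be generated in psql looks something like the following (the filler text and row counts are illustrative):

-- build the DDL for a 100-column heap table, then execute it with \gexec
select format(
         'create table t3 (c1 bigserial primary key, c2 bigint, %s)',
         string_agg(format('c%s text default %L', i, repeat('filler ', 10)), ', ' order by i)
       )
from generate_series(3, 100) as i
\gexec

-- clone the definition as a columnar table and add matching btree indexes
create table t4 (like t3 including defaults) using columnar;
create index on t3 (c2);
create index on t4 (c2);

-- populate both tables; the column defaults fill in the text columns
insert into t3 (c2) select (random()*1e6)::bigint from generate_series(1, 5e6);
insert into t4 select * from t3;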

Examining the indexes, one sees that they are the same size:

List of relations
Schema |     Name      | Type  |  Owner   | Table | Persistence | Access method |  Size  |
-------+---------------+-------+----------+-------+-------------+---------------+--------+
public | t3_c2_idx     | index | postgres | t3    | permanent   | btree         | 105 MB |  
public | t3_c2_idx1    | index | postgres | t3    | permanent   | btree         | 105 MB |  
public | t3_length_idx | index | postgres | t3    | permanent   | btree         | 33 MB  |  
public | t3_pkey       | index | postgres | t3    | permanent   | btree         | 107 MB |  
public | t4_c2_idx     | index | postgres | t4    | permanent   | btree         | 105 MB |  
public | t4_length_idx | index | postgres | t4    | permanent   | btree         | 33 MB  |  
public | t4_pkey       | index | postgres | t4    | permanent   | btree         | 107 MB |

Unlike the first set of query plans, these clearly demonstrate a significant performance improvement for the columnar table.

Out of curiosity, the runtime parameter max_parallel_workers_per_gather was also varied, but doing so doesn't appear to have changed much.

Timings are shown for max_parallel_workers_per_gather set to 4 and then to 1:

-- HEAP TABLE without index: 9.6s (4 workers), 8.7s (1 worker)
explain analyze select sum(c2) from t3;

-- COLUMNAR TABLE without index: 590.176 ms (4 workers), 596.459 ms (1 worker)
explain analyze select sum(c2) from t4;

-- HEAP TABLE: 10.4s (4 workers), 8.8s (1 worker)
explain analyze select count(c3) from t3;

-- COLUMNAR TABLE: 509.209 ms (4 workers), 541.452 ms (1 worker)
explain analyze select count(c3) from t4;

-- HEAP TABLE: 1m34s (4 workers), 1m17s (1 worker)
explain analyze select max(length(c25)) from t3;

-- COLUMNAR TABLE: 1.1s (4 workers), 1.2s (1 worker)
explain analyze select max(length(c25)) from t4;

-- HEAP TABLE: 1m33s (4 workers), 1m18s (1 worker)
explain analyze select sum(length(c50)) from t3;

-- COLUMNAR TABLE: 1.2s (4 workers), 1.2s (1 worker)
explain analyze select sum(length(c50)) from t4;

Working with indexes

General observations: Btree indexes are similar in size between HEAP and COLUMNAR tables. Overall their performance also appears similar although the columnar table’s index is somewhat slower, no doubt due to the extra processing required to uncompress the table’s values. 

Regarding Expression Indexes: Creating an EXPRESSION index on a COLUMNAR table is significantly faster than on a HEAP table:

-- 1m17s
create index on t3(length(c90));
-- 14s
create index on t4(length(c90));

Regarding Runtime Parameter max parallel workers: Index performance varies considerably on HEAP tables depending upon the value of max_parallel_workers_per_gather.

The following results highlight that the choice of table type can become important when hardware resources and server costs are a consideration:

Timings are shown for max_parallel_workers_per_gather set to 4 and then to 1:

-- HEAP TABLE using BTREE index: 467.789 ms (4 workers), 748.939 ms (1 worker)
explain analyze select sum(c2) from t3;

-- COLUMNAR TABLE using BTREE index: 561.522 ms (4 workers), 599.629 ms (1 worker)
explain analyze select sum(c2) from t4;

-- HEAP TABLE using EXPRESSION index: 1.614 ms (4 workers), 2.346 ms (1 worker)
explain analyze select max(length(c90)) from t3;

-- COLUMNAR TABLE using EXPRESSION index: 31.980 ms (4 workers), 38.766 ms (1 worker)
explain analyze select max(length(c90)) from t4;

About table constraints and access methods

Overall, indexes, constraints, and access methods are still evolving with many of the features still to be implemented.

Let’s start with a big issue i.e. DELETE and UPDATE:

-- fails
delete from t2 where id=5;  
ERROR:  UPDATE and CTID scans not supported for ColumnarScan

-- fails
update t2 set id=5;   
ERROR:  UPDATE and CTID scans not supported for ColumnarScan

Creating indexes on a columnar table is restricted to btree indexes:

-- works
create index on t2 using btree (id);

-- fails
create index on t2 using columnar (id);
ERROR:  unsupported access method for the index on columnar table t2

Foreign key constraints aren't implemented:

select generate_series as id into t3 from generate_series(0,15);
alter table t3 add primary key(id);

-- works for our standard table t1
alter table t1 add foreign key(id) references t3(id);

-- fails with the columnar table t2
alter table t2 add foreign key(id) references t3(id);
ERROR:  Foreign keys and AFTER ROW triggers are not supported for columnar tables
HINT:  Consider an AFTER STATEMENT trigger instead.

--works after converting table t2 from COLUMNAR to HEAP
alter table t2 set access method heap;
alter table t2 add foreign key(id) references t3(id);
alter table t2 set access method columnar;

 

Partitioning

Columnar tables can be used as partitions; a partitioned table can be made up of any combination of row and columnar partitions.

An excellent use case is insert-once, read-only table partitions, where one can leverage both the compression and the better-performing OLAP-type queries for very large tables; a sketch follows.
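
A hedged sketch of that pattern, with illustrative table and partition names: closed periods live in columnar partitions while the partition still receiving writes stays row-based:

create table events (
    id      bigint,
    ts      timestamptz not null,
    payload text
) partition by range (ts);

-- historical, insert-once/read-only partition: columnar compression pays off here
create table events_2022_11 partition of events
    for values from ('2022-11-01') to ('2022-12-01')
    using columnar;

-- current partition: still receives UPDATE/DELETE traffic, so keep it heap
create table events_2022_12 partition of events
    for values from ('2022-12-01') to ('2023-01-01');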

 

Caveat

Columnar Extension Limitations, as of version 11.1:

  • It takes more time to create the table than for a standard heap-based table.
  • Query performance on smallish tables is equal to or slower than on heap-based tables.
  • There is no update/delete possible in a columnar table.
  • The indexes are limited to btree, as of version 10.0.
  • There is no logical replication.

There’s actually more documented. Refer here for more information.

AUTHOR'S NOTE: In regard to the custom packages created for this blog: the entire Citus suite is designed to enhance Postgres as a distributed database solution. Only the columnar table component was covered in this blog. The complete feature set is part of this package install and should work, but has not been tested.

Conclusion

Despite its current limitations, there are use cases where this extension can definitely make a difference. And it speaks well of its future as the team continues development by constantly improving and adding capabilities. Watch for updates on its GitHub source repository.

 

Sep 13, 2022
--

Working With Snapshots in PostgreSQL

One of the reasons I've never strayed far from PostgreSQL is that I am always discovering features that overcome all sorts of interesting problems. Although, to be honest, from time to time I've had the feeling that some features are a solution in search of a problem. Take, for example, exporting transaction snapshots.

Originally implemented in version 9.2, snapshot exports allow multiple transactions to share identical views of the same database state. Back then I wasn't certain how I could use it, until version 9.3 came out presenting a true use case when parallel jobs were introduced to pg_dump.

Here’s a simple example; let’s start by creating two tables with lots of data:

begin;
    create table t1(
        c1 serial primary key,
        c2 text default lpad('',500,md5(random()::text)),
        c3 timestamp default clock_timestamp()
    );

    create table t2(like t1 including all);

    insert into t1 select generate_series(1,5e6);
    insert into t2 select generate_series(1,5e6);

    analyze;
commit;
List of relations
 Schema | Name | Type  |  Owner   |  Size   | Description 
--------+------+-------+----------+---------+-------------
 public | t1   | table | postgres | 2791 MB | 
 public | t2   | table | postgres | 2791 MB |

TIP: Changing the arguments in the generate_series function lets you populate the tables to a size that best fits your hardware's capabilities.

The traditional invocation of a logical dump uses a single CPU/thread, archiving the tables one at a time. The shell's time command reports how long it takes to perform the dump using the "custom" (compressed) format:

# execution time: 60 seconds
time pg_dump -Fc db01 > db01.db

real    1m0.322s
user    0m50.203s
sys     0m1.309s

An alternate invocation generating the dump uses the “directory” format:

# execution time: 52 seconds
time pg_dump -Fd db01 -f db01_dirA

real    0m52.304s
user    0m50.908s
sys     0m1.152s

Thus far the execution utilizes a single CPU. Now look at the execution time with the '-j' (or '--jobs') switch, where multiple connections are used to generate the dump:

# execution time: 31 seconds
time pg_dump -Fd db01 -f db01_dirB -j 2

real    0m31.115s
user    1m0.242s
sys     0m1.377s

Notice the number of pg_dump processes running in the background. The third connection is a leader process, which coordinates the two worker processes:

$: ps aux | grep -v grep | grep pg_dump
rbernier 1872809  0.1  0.0  14444  5968 pts/1    S+   08:52   0:00 pg_dump -Fd db01 -f db01_dirB -j 2
rbernier 1872811  100  0.0  14804  4480 pts/1    R+   08:52   0:12 pg_dump -Fd db01 -f db01_dirB -j 2
rbernier 1872812  100  0.0  14804  4488 pts/1    R+   08:52   0:12 pg_dump -Fd db01 -f db01_dirB -j 2

Without the synchronized snapshot feature, the different worker jobs wouldn’t be guaranteed to see the same data in each connection, which could lead to an inconsistent backup.

One important caveat to remember is that the pg_dump jobs switch only works with the "directory" format. Attempting to use jobs with a custom-format dump fails:

pg_dump -Fc db01 -j 2 > db01.db               
pg_dump: error: parallel backup only supported by the directory format

There is a full description of other caveats using the jobs switch in the PostgreSQL documentation.

So we’ve shown how it works with pg_dump, but can we go further? … YES!

We can replicate the behavior using the snapshot synchronization function pg_export_snapshot().

Continuing with the two previously created tables, let’s create another scenario. Look at the table below for each step:

  • STEP 1: Three psql sessions login and are interacting with tables t1 and t2 in tandem.
  • STEP 2: Session 1 inserts into the tables every five seconds. Session 2 sees the data differently than session 3 (note the three-second delay queried in session 3), making it problematic to see the same dataset at the same time.
  • STEP 3: Session 1 continues updating tables t1 and t2, but now sessions 2 and 3 share the same exported snapshot.
  • STEP 4: While session 1 continues updating tables t1 and t2, sessions 2 and 3 see the same datasets i.e. they are synchronized.
  • STEP 5: Both sessions 2 and 3 are actively copying data at the same time without fear of inconsistency.
  • STEP 6: The COPY is completed so let’s finish up by committing the transactions.

STEP 1

Session 1: psql db01
Session 2: psql db01
Session 3: psql db01

STEP 2

Session 1:
    with a(x) as (select max(c1) from t1),
         b as (insert into t1 select generate_series(x+1,x+10) from a returning *),
         c as (insert into t2 select generate_series(x+1,x+10) from a returning *)
    select x as increment from a;
    \watch 5

Session 2:
    with a(x) as (select max(c1) from t1)
    select c1,md5(c2),c3 from t1,a where c1=x
    union all
    select c1,md5(c2),c3 from t2,a where c1=x;

Session 3:
    with a(x) as (select max(c1) from t1)
    select c1,md5(c2),c3 from t1,a where c1=x
    union all
    select c1,md5(c2),c3 from t2,a,pg_sleep(3) where c1=x;

STEP 3

Session 1: Step 2 is active

Session 2:
    -- let's export the snapshot
    -- 00000003-000021CE-1
    begin transaction isolation level repeatable read;
    select pg_export_snapshot();

Session 3:
    -- let's import the snapshot
    begin transaction isolation level repeatable read;
    set transaction snapshot '00000003-000021CE-1';

STEP 4

Session 1: Step 2 is active

Session 2:
    with a(x) as (select max(c1) from t1)
    select c1,md5(c2),c3 from t1,a where c1=x
    union all
    select c1,md5(c2),c3 from t2,a where c1=x;

Session 3:
    with a(x) as (select max(c1) from t1)
    select c1,md5(c2),c3 from t1,a where c1=x
    union all
    select c1,md5(c2),c3 from t2,a,pg_sleep(3) where c1=x;

STEP 5

Session 1: Step 2 is active

Session 2:
    \copy t1 to program 'gzip > t1.gz'

Session 3:
    \copy t2 to program 'gzip > t2.gz'

STEP 6

Session 1: Step 2 is terminated

Session 2:
    -- commit or rollback, it's your choice
    commit;

Session 3:
    -- commit or rollback, it's your choice
    commit;

And there you have it, a neat little trick exporting snapshots! Of course, there’s more you can do than just copying tables in a multi-threaded fashion but hopefully, this little exposé will lead to more opportunities to simplify your life while working with Postgres.

Sep 02, 2022
--

Working With PostgreSQL Dump Manifests

After working with PostgreSQL for so many years it’s sometimes surprising to see some of the utilities, to a certain degree, have capabilities that remain relatively unknown to more than a few people. So let me introduce you to manifests.

Both pg_dump and pg_restore have switches enabling us to zero in on specific objects of interest such as functions, tables, data, etc., as we create and restore a logical dump. But there comes a time when expressing this kind of granularity as arguments can become overly complex when we look at dozens, if not hundreds, of objects.

Enter manifests

PostgreSQL dumps store not only the objects and data of a database but their descriptive meta-data too. When generated as a text file, a PostgreSQL manifest details each object on a single line. These lines can then be edited by either removing or reordering them. The edited file can then be used with pg_restore to restore the entities in the prescribed order.

Here's how it works; the first step is to generate a standard dump using the custom format argument -Fc:

# generate the logical dump
pg_dump -C -Fc mydatabase > mydatabase.db

Once the dump has been created the manifest is generated using pg_restore and the -l switch, which goes to standard output:

# generate a manifest, the meta data, sic description, of the logical dump
pg_restore -l mydatabase.db > mydatabase_manifest.ini

And like a recipe, the manifest is then used to read and restore the dump following its directions:

# restore the dump using the manifest
pg_restore -L mydatabase_manifest.ini -d postgres mydatabase.db

Of course, in between generating and reading the manifest, one takes the opportunity of editing it too.

So let's use a more realistic example, i.e. pgbench. The first step initializes the benchmarking database; notice the use of foreign keys:

dropdb --if-exists pgbench
createdb pgbench
pgbench -i --foreign-keys -s 50 pgbench

Let’s complicate the landscape by adding a number of views including a materialized view:

psql pgbench <<_eof_
    create or replace view v1_pgbench_accounts as select * from pgbench_accounts;
    create or replace view v2_pgbench_accounts as select * from pgbench_accounts;
    create or replace view v3_pgbench_accounts as select * from pgbench_accounts;
    create materialized view m_pgbench_accounts as select * from v1_pgbench_accounts order by random() limit 1e3;
_eof_

And now let's create the database dump. Remember, manifests can only be generated from an archive-format dump, such as one created with the custom format switch -Fc.

pg_dump -Fc -C pgbench > pgbench.db

Now, let’s generate the manifest. I’ve opted to use the ini extension to the file name as my code editor recognizes the semicolon as a comment line and colors it accordingly:

pg_restore -C -l pgbench.db > pgbench_manifest.ini

As per the normal process of generating and restoring a logical dump, the output listed below shows the command creating the database (that comes from the -C switch), followed by the views, table definitions, data population, and constraints, and finishing with the data population of the materialized view:

;
; Archive created at 2022-08-24 10:21:46 PDT
;     dbname: pgbench
;     TOC Entries: 25
;     Compression: -1
;     Dump Version: 1.14-0
;     Format: CUSTOM
;     Integer: 4 bytes
;     Offset: 8 bytes
;     Dumped from database version: 14.0
;     Dumped by pg_dump version: 14.0
;
;
; Selected TOC Entries:
;
3361; 1262 25332 DATABASE - pgbench postgres
211; 1259 25339 TABLE public pgbench_accounts postgres
213; 1259 25380 VIEW public v1_pgbench_accounts postgres
216; 1259 25392 MATERIALIZED VIEW public m_pgbench_accounts postgres
212; 1259 25342 TABLE public pgbench_branches postgres
209; 1259 25333 TABLE public pgbench_history postgres
210; 1259 25336 TABLE public pgbench_tellers postgres
214; 1259 25384 VIEW public v2_pgbench_accounts postgres
215; 1259 25388 VIEW public v3_pgbench_accounts postgres
3353; 0 25339 TABLE DATA public pgbench_accounts postgres
3354; 0 25342 TABLE DATA public pgbench_branches postgres
3351; 0 25333 TABLE DATA public pgbench_history postgres
3352; 0 25336 TABLE DATA public pgbench_tellers postgres
3200; 2606 25354 CONSTRAINT public pgbench_accounts pgbench_accounts_pkey postgres
3202; 2606 25350 CONSTRAINT public pgbench_branches pgbench_branches_pkey postgres
3198; 2606 25352 CONSTRAINT public pgbench_tellers pgbench_tellers_pkey postgres
3207; 2606 25360 FK CONSTRAINT public pgbench_accounts pgbench_accounts_bid_fkey postgres
3205; 2606 25375 FK CONSTRAINT public pgbench_history pgbench_history_aid_fkey postgres
3203; 2606 25365 FK CONSTRAINT public pgbench_history pgbench_history_bid_fkey postgres
3204; 2606 25370 FK CONSTRAINT public pgbench_history pgbench_history_tid_fkey postgres
3206; 2606 25355 FK CONSTRAINT public pgbench_tellers pgbench_tellers_bid_fkey postgres
3355; 0 25392 MATERIALIZED VIEW DATA public m_pgbench_accounts postgres

Once generated, the manifest can now be tailored to our requirements which in this case is to restore a subset of the database dump as follows:

  1. Create database pgbench_restore
  2. Restore only view v1_pgbench_accounts
  3. Restore and populate table pgbench_accounts
  4. Restore and populate materialized view m_pgbench_accounts

You will note that in order to restore table pgbench_accounts on its own, it is necessary to leave out the foreign key constraints.

After some editing here’s what the resulting manifest looks like:

211; 1259 25339 TABLE public pgbench_accounts postgres
213; 1259 25380 VIEW public v1_pgbench_accounts postgres
216; 1259 25392 MATERIALIZED VIEW public m_pgbench_accounts postgres
3353; 0 25339 TABLE DATA public pgbench_accounts postgres
3200; 2606 25354 CONSTRAINT public pgbench_accounts pgbench_accounts_pkey postgres
3355; 0 25392 MATERIALIZED VIEW DATA public m_pgbench_accounts postgres

With a little creativity it’s not that big of a leap to dynamically generate the appropriate manifest:

GREP="TABLE public pgbench_accounts|VIEW public v1_pgbench_accounts|MATERIALIZED|DATA public pgbench_accounts|pgbench_accounts_pkey|DATA public m_pgbench_accounts"

pg_restore -C -l pgbench.db | grep -E "$GREP" > pgbench_manifest.ini

And now we proceed with the final restoration steps:

dropdb --if-exists pgbench_restore
createdb pgbench_restore
#
# take a look at the generated SQL before uploading the logical dump
#
pg_restore -1 -L pgbench_manifest.ini -f - pgbench.db | less -S

#
# the logical dump can now be uploaded
#
pg_restore -1 -L pgbench_manifest.ini -d pgbench_restore pgbench.db

Taking a look at the resulting database confirms success:

psql pgbench_restore -c '\d'

List of relations
 Schema |        Name         |       Type        |  Owner   | Persistence | Access method |  Size
--------+---------------------+-------------------+----------+-------------+---------------+---------
 public | m_pgbench_accounts  | materialized view | postgres | permanent   | heap          | 144 kB  |
 public | pgbench_accounts    | table             | postgres | permanent   | heap          | 640 MB  |
 public | v1_pgbench_accounts | view              | postgres | permanent   |               | 0 bytes |

And that’s all there is to it! 

One final caveat: one edits the manifest by deleting, or reordering, whole lines. It's not possible to edit the content of the lines themselves, nor can new lines be arbitrarily added to the manifest. Remember, manifests are like pointers in that they can only reference what already exists in the database dump.

Jul 13, 2022
--

Introducing PostgreSQL 15: Working with DISTINCT

Well, it’s that time of the year when once again we have a look at the newest version of PostgreSQL.

As tradition dictates, here at Percona, the team is given a list of features to write about. Mine happened to be about a very basic and, I might add, important clause i.e. SELECT DISTINCT.

Before getting into the details I’d like to mention a couple of caveats regarding how the results were derived for this blog:

  1. The tables are pretty small and of a simple architecture.
  2. Because this demonstration was performed upon a relatively low-powered system, the real metrics have the potential of being significantly greater than what is demonstrated.

For those new to postgres, and the ANSI SQL standard for that matter, the SELECT DISTINCT statement eliminates duplicate rows from the result by matching specified expressions.

For example, given the following table:

table t_ex;
 c1 | c2
----+----
  2 | B
  4 | C
  6 | A
  2 | C
  4 | B
  6 | B
  2 | A
  4 | B
  6 | C
  2 | C

This SQL statement returns one record for each UNIQUE value found in column “c1”, in SORTED order:

select distinct on(c1) * from t_ex;

Notice, as indicated by column “c2”, that for each unique value of c1 the first row encountered in the table is returned:

c1 | c2
----+----
  2 | B
  4 | B
  6 | B

This SQL statement returns one record for each UNIQUE value found in column “c2”:

select distinct on(c2) * from t_ex;
c1 | c2
----+----
  6 | A
  2 | B
  4 | C

And finally, of course, returning uniqueness for the entire row:

select distinct * from t_ex;
c1 | c2
----+----
  2 | A
  6 | B
  4 | C
  2 | B
  6 | A
  2 | C
  4 | B
  6 | C

So what’s this special new enhancement of DISTINCT you ask? The answer is that it’s been parallelized!

In the past, only a single CPU/process was used to identify the distinct records. In postgres version 15, however, the work can be broken up and run by a number of workers in parallel, each assigned to a separate CPU process. There are a number of runtime parameters controlling this behavior, but the one we'll focus on is max_parallel_workers_per_gather.
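
For example, the setting can be changed per session and its effect inspected with EXPLAIN, here against the t10 test table created in the next section:

set max_parallel_workers_per_gather = 4;
explain (analyze) select distinct * from t10;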

Let’s generate some metrics!

In order to demonstrate this improved performance three tables were created, without indexes, and populated with approximately 5,000,000 records. Notice the number of columns for each table i.e. one, five, and 10 respectively:

Table "public.t1"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 c1     | integer |           |          |
Table "public.t5"
 Column |         Type          | Collation | Nullable | Default
--------+-----------------------+-----------+----------+---------
 c1     | integer               |           |          |
 c2     | integer               |           |          |
 c3     | integer               |           |          |
 c4     | integer               |           |          |
 c5     | character varying(40) |           |          |
Table "public.t10"
 Column |         Type          | Collation | Nullable | Default
--------+-----------------------+-----------+----------+---------
 c1     | integer               |           |          |
 c2     | integer               |           |          |
 c3     | integer               |           |          |
 c4     | integer               |           |          |
 c5     | character varying(40) |           |          |
 c6     | integer               |           |          |
 c7     | integer               |           |          |
 c8     | integer               |           |          |
 c9     | integer               |           |          |
 c10    | integer               |           |          |
-- each insert repeats its series 10,000 times, yielding roughly 5 million rows per table
insert into t1 select generate_series(1,500) from generate_series(1,1e4);
insert into t5
select   generate_series(1,500)
        ,generate_series(500,1000)
        ,generate_series(1000,1500)
        ,(random()*100)::int
        ,'aofjaofjwaoeev$#^&ETHE#@#Fasrhk!!@%Q@'
from generate_series(1,1e4);
insert into t10
select   generate_series(1,500)
        ,generate_series(500,1000)
        ,generate_series(1000,1500)
        ,(random()*100)::int
        ,'aofjaofjwaoeev$#^&ETHE#@#Fasrhk!!@%Q@'
        ,generate_series(1500,2000)
        ,generate_series(2500,3000)
        ,generate_series(3000,3500)
        ,generate_series(3500,4000)
        ,generate_series(4000,4500)
from generate_series(1,1e4);
List of relations
 Schema | Name | Type  |  Owner   | Persistence | Access method |  Size  |
--------+------+-------+----------+-------------+---------------+--------+
 public | t1   | table | postgres | permanent   | heap          | 173 MB |
 public | t10  | table | postgres | permanent   | heap          | 522 MB |
 public | t5   | table | postgres | permanent   | heap          | 404 MB |

The next step is to copy the aforementioned data dumps into the following versions of postgres:

PG VERSION
    pg96
    pg10
    pg11
    pg12
    pg13
    pg14
    pg15

The postgres binaries were compiled from source, and data clusters were created on the same low-powered hardware using the default, and untuned, runtime configuration values.

Once populated, the following bash script was executed to generate the results:

#!/bin/bash
for v in 96 10 11 12 13 14 15
do
    # run the explain analyze 5X in order to derive consistent numbers
    for u in $(seq 1 5)
    do
        echo "--- explain analyze: pg${v}, ${u}X ---"
        psql -p 100$v db01 -c "explain analyze select distinct on (c1) * from t1" > t1.pg$v.explain.txt
        psql -p 100$v db01 -c "explain analyze select distinct * from t5" > t5.pg$v.explain.txt
        psql -p 100$v db01 -c "explain analyze select distinct * from t10" > t10.pg$v.explain.txt
    done
done

And here are the results: One can see that the larger the tables become the greater the performance gains that can be achieved.

 PG VERSION | 1 column (t1), ms | 5 column (t5), ms | 10 column (t10), ms
------------+-------------------+-------------------+---------------------
 pg96       |             3,382 |             9,743 |              20,026
 pg10       |             2,004 |             5,746 |              13,241
 pg11       |             1,932 |             6,062 |              14,295
 pg12       |             1,876 |             5,832 |              13,214
 pg13       |             1,973 |             2,358 |               3,135
 pg14       |             1,948 |             2,316 |               2,909
 pg15       |             1,439 |             1,025 |               1,245

QUERY PLAN

One of the more interesting aspects of the investigation was reviewing the query plans between the different versions of postgres. For example, the query plan for a single column DISTINCT was actually quite similar, ignoring the superior execution time of course, between the postgres 9.6 and 15 plans respectively.

PG96 QUERY PLAN, TABLE T1
-------------------------------------------------------------------------------
 Unique  (cost=765185.42..790185.42 rows=500 width=4) (actual time=2456.805..3381.230 rows=500 loops=1)
   ->  Sort  (cost=765185.42..777685.42 rows=5000000 width=4) (actual time=2456.804..3163.600 rows=5000000 loops=1)
         Sort Key: c1
         Sort Method: external merge  Disk: 68432kB
         ->  Seq Scan on t1  (cost=0.00..72124.00 rows=5000000 width=4) (actual time=0.055..291.523 rows=5000000 loops=1)
 Planning time: 0.161 ms
 Execution time: 3381.662 ms
PG15 QUERY PLAN, TABLE T1
---------------------------------------------------------------------------
 Unique  (cost=557992.61..582992.61 rows=500 width=4) (actual time=946.556..1411.421 rows=500 loops=1)
   ->  Sort  (cost=557992.61..570492.61 rows=5000000 width=4) (actual time=946.554..1223.289 rows=5000000 loops=1)
         Sort Key: c1
         Sort Method: external merge  Disk: 58720kB
         ->  Seq Scan on t1  (cost=0.00..72124.00 rows=5000000 width=4) (actual time=0.038..259.329 rows=5000000 loops=1)
 Planning Time: 0.229 ms
 JIT:
   Functions: 1
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.150 ms, Inlining 31.332 ms, Optimization 6.746 ms, Emission 6.847 ms, Total 45.074 ms
 Execution Time: 1438.683 ms

The real difference showed up when the number of DISTINCT columns were increased, as demonstrated by querying table t10. One can see parallelization in action!

PG96 QUERY PLAN, TABLE T10
-------------------------------------------------------------------------------------------
 Unique  (cost=1119650.30..1257425.30 rows=501000 width=73) (actual time=14257.801..20024.271 rows=50601 loops=1)
   ->  Sort  (cost=1119650.30..1132175.30 rows=5010000 width=73) (actual time=14257.800..19118.145 rows=5010000 loops=1)
         Sort Key: c1, c2, c3, c4, c5, c6, c7, c8, c9, c10
         Sort Method: external merge  Disk: 421232kB
         ->  Seq Scan on t10  (cost=0.00..116900.00 rows=5010000 width=73) (actual time=0.073..419.701 rows=5010000 loops=1)
 Planning time: 0.352 ms
 Execution time: 20025.956 ms
PG15 QUERY PLAN, TABLE T10
-------------------------------------------------------------------------------------------
 HashAggregate  (cost=699692.77..730144.18 rows=501000 width=73) (actual time=1212.779..1232.667 rows=50601 loops=1)
   Group Key: c1, c2, c3, c4, c5, c6, c7, c8, c9, c10
   Planned Partitions: 16  Batches: 17  Memory Usage: 8373kB  Disk Usage: 2976kB
   ->  Gather  (cost=394624.22..552837.15 rows=1002000 width=73) (actual time=1071.280..1141.814 rows=151803 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  HashAggregate  (cost=393624.22..451637.15 rows=501000 width=73) (actual time=1064.261..1122.628 rows=50601 loops=3)
               Group Key: c1, c2, c3, c4, c5, c6, c7, c8, c9, c10
               Planned Partitions: 16  Batches: 17  Memory Usage: 8373kB  Disk Usage: 15176kB
               Worker 0:  Batches: 17  Memory Usage: 8373kB  Disk Usage: 18464kB
               Worker 1:  Batches: 17  Memory Usage: 8373kB  Disk Usage: 19464kB
               ->  Parallel Seq Scan on t10  (cost=0.00..87675.00 rows=2087500 width=73) (actual time=0.072..159.083 rows=1670000 loops=3)
 Planning Time: 0.286 ms
 JIT:
   Functions: 31
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 3.510 ms, Inlining 123.698 ms, Optimization 200.805 ms, Emission 149.608 ms, Total 477.621 ms
 Execution Time: 1244.556 ms

INCREASING THE PERFORMANCE: Performance enhancements were made by updating the postgres runtime parameter max_parallel_workers_per_gather. The default value in a newly initialized cluster is 2. As the table below indicates, it quickly became an issue of diminishing returns due to the restricted capabilities of the testing hardware itself.

POSTGRES VERSION 15

 max_parallel_workers_per_gather | 1 column (t1) | 5 column (t5) | 10 column (t10)
---------------------------------+---------------+---------------+-----------------
                               2 |         1,439 |         1,025 |           1,245
                               3 |         1,464 |           875 |           1,013
                               4 |         1,391 |           858 |             977
                               6 |         1,401 |           846 |           1,045
                               8 |         1,428 |           856 |             993


ABOUT INDEXES: Performance improvements were not realized when indexes were applied as demonstrated in this query plan.

PG15, TABLE T10(10 DISTINCT columns), and max_parallel_workers_per_gather=4:

QUERY PLAN                                                                                 
-----------------------------------------------------------------------------------
 Unique  (cost=0.43..251344.40 rows=501000 width=73) (actual time=0.060..1240.729 rows=50601 loops=1)
   ->  Index Only Scan using t10_c1_c2_c3_c4_c5_c6_c7_c8_c9_c10_idx on t10  (cost=0.43..126094.40 rows=5010000 width=73) (actual time=0.058..710.780 rows=5010000 loops=1)
         Heap Fetches: 582675
 Planning Time: 0.596 ms
 JIT:
   Functions: 1
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 0.262 ms, Inlining 0.000 ms, Optimization 0.122 ms, Emission 2.295 ms, Total 2.679 ms
 Execution Time: 1249.391 ms

CONCLUDING THOUGHTS: Running DISTINCT across multiple CPUs is a big advance in performance capabilities. But keep in mind the risk of diminishing returns as you increase max_parallel_workers_per_gather and approach your hardware's limitations. And as the investigation showed, under normal circumstances, the query planner might decide to use indexes instead of running parallel workers. One way to get around this is to consider disabling runtime parameters such as enable_indexonlyscan and enable_indexscan, as sketched below. Finally, don't forget to run EXPLAIN ANALYZE in order to understand what's going on.
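
A minimal sketch of that idea in a psql session:

-- discourage the planner from choosing the index so the parallel plan can be compared
set enable_indexonlyscan = off;
set enable_indexscan = off;
explain (analyze) select distinct * from t10;

-- restore the defaults afterwards
reset enable_indexonlyscan;
reset enable_indexscan;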

Apr 27, 2022
--

Working With Large PostgreSQL Databases

It’s a funny thing when the topic of database sizes comes up. Calling one small, medium, large, or even huge isn’t as straightforward as you’d think. Distinguishing the size of a database is based upon a number of factors whose characteristics can be classified as either “tangible”, things that you can measure in an objective manner, or “intangible”, those attributes best expressed using the catch-all phrase “it depends”. For example, a 2TB database is, to many people, a large database. On the other hand, a veteran DBA could describe a PostgreSQL database cluster as large when it enters the realm of Peta-Bytes.

Here’s a recap of some of PostgreSQL’s basic capabilities:

 database size                                                           | unlimited
 number of databases                                                     | 4,294,950,911
 relations per database                                                  | 1,431,650,303
 table size                                                              | 32TB
 rows per table, defined by the number of tuples that fit onto the page  | 4,294,967,295 pages
 fields per table                                                        | 1,600
 field size                                                              | 1GB
 identifier length                                                       | 63 bytes
 indexes per table                                                       | unlimited
 columns per index                                                       | 32
 partition keys                                                          | 32

NB: Despite possible physical constraints one faces when creating large numbers of schemas, there is no theoretical limitation to the number that can be created in postgres.

I’ve come to differentiate a small database from a large one using the following caveats. And while it is true that some of the caveats for a large database can be applied to a small one, and vice-versa, the fact of the matter is that most of the setups out there in the wild follow these observations:

  1. Small databases are often administered by a single person
  2. Small databases can be managed manually.
  3. Small databases are minimally tuned.
  4. Small databases can typically tolerate production inefficiencies more than large ones.
  5. Large databases are managed using automated tools.
  6. Large databases must be constantly monitored and go through an active tuning life cycle.
  7. Large databases require rapid response to imminent and ongoing production issues to maintain optimal performance.
  8. Large databases are particularly sensitive to technical debt.

Large databases often bring up the following questions and issues:

  • Is the system performance especially sensitive to changes in production load?
  • Is the system performance especially sensitive to minor tuning effects?
  • Are there large amounts of data churn?
  • Does the database load system saturate your hardware’s capabilities?
  • Do the maintenance activities, such as logical backups and repacking tables, take several days or even more than a week?
  • Does your Disaster Recovery Protocol require having a very small Recovery Point Objective (RPO) or Recovery Time Objective (RTO)?

The key difference between a small vs large database is how they are administered:

  1. Whereas small databases are commonly administered manually, albeit not best practice, automation is the industry's default mode of operation for large databases.
  2. Because circumstances can change quickly, large databases are particularly sensitive to production issues.
  3. Tuning is constantly evolving; while it is true that newly installed architectures are often well-tuned, circumstances change as they age and large databases are especially vulnerable.

Good planning is your friend: addressing potential issues for a large database by anticipating future conditions is the goal i.e. testing the entire infrastructure before it goes into production. 

Scripting your build environment using tools such as Ansible, Puppet, Terraform, etc. mitigates human error when provisioning the underlying infrastructure. It’s important to be able to build in a consistent and repeatable manner.

Once a database is in production it must be monitored and wired with alerts for the various critical thresholds. Aside from the standard errors, consider configuring your monitoring solution to follow the “Rule Of Three”: select and watch only three metrics that track and alert for a specific “change of state”. This is not to be confused with following a particular issue; rather, it is meant to inform you that you should pay attention to your system in order to understand that something has changed from what is considered normal behavior. Depending on your preferences, you may want to watch for known production issues or, when the system is stable, you might be more interested in trending alerts such as query performance that has slowed below a predefined threshold.

In regards to system tuning: while small databases can, after a fashion, perform in a satisfactory manner using the default values, large databases cannot. Configuring initial tuning parameters such as shared_buffers is de rigueur, but you should also monitor the environment in order to trend issues such as bloat and long-term query performance. Remember, the most common problem experienced by an otherwise stable and well-thought-out architecture is table and index bloat. Addressing bloat by tuning the autovacuum characteristics is essential, as in the example below.
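
For example, a bloat-prone table can be given more aggressive per-table autovacuum settings; the table name and values below are illustrative rather than recommendations:

alter table big_hot_table set (
    autovacuum_vacuum_scale_factor  = 0.01,
    autovacuum_analyze_scale_factor = 0.02,
    autovacuum_vacuum_cost_delay    = 2
);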

Monitoring, especially before and after maintenance windows, is required because it can catch potential problems with an update before they become production issues.

Pay close attention to following the regular maintenance activities during the life-cycle of your system:

  • Logical backups and misc database redundancies
  • Architectural evolution:
    • application stack updates, upgrades, and rollbacks
    • application/database scaling
  • PostgreSQL server upgrades:
    • minor
    • major
  • Database migrations
  • Hardware upgrades
  • Scaling the cluster by adding and removing server nodes

Maintenance activities such as logical backups and PostgreSQL minor upgrades are performed at regular intervals.

Plan for space utilization requirements of logical dumps and WAL archives.

In regards to logical backups: it can be difficult to justify backing up an entire database when it can take a week. Alternatively, differential backups are a potential solution. Tables that are updated and deleted regularly can be backed up more frequently than the slower-changing tables, which can be stored without changes for a longer period of time. This approach, however, requires the appropriate architectural design considerations, such as using table partitioning; a sketch follows.
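
A hedged sketch of that idea using pg_dump table patterns; the database and table name patterns are illustrative:

# nightly: dump only the fast-changing tables
pg_dump -Fc -t 'orders_*' -t 'events_*' proddb > hot_$(date +%F).db

# weekly: dump everything except the fast-changing tables
pg_dump -Fc -T 'orders_*' -T 'events_*' proddb > cold_$(date +%F).db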

An alternative to logical backups is to consider Copy On Write (COW), or stacked, file systems such as ZFS and BTRFS. Containerized environments, for example, can leverage snapshots and clones, allowing for near-instant recoveries in a disaster recovery scenario.
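
A hedged sketch, assuming the entire data cluster (including WAL) lives on a single ZFS dataset named tank/pgdata:

# take a crash-consistent snapshot of the data cluster
zfs snapshot tank/pgdata@pre-maintenance

# clone the snapshot to bring up a throwaway copy of the cluster
zfs clone tank/pgdata@pre-maintenance tank/pgdata_clone

# roll the dataset back if the maintenance goes badly
zfs rollback tank/pgdata@pre-maintenance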

Complex operations, such as hardware and database scaling, encompass many sub-activities and can often involve working with several teams at the same time. In this case, maintaining reference documentation is critical. Activities such as these are best tracked and planned in a Kanban, or Scrum, environment.

In regards to Disaster Recovery (DR) consider automating the following operations:

  • Recovery via Logical backups
  • Failover of a PRIMARY to a REPLICA
  • Dropping/Building a new REPLICA
  • Point In Time Recovery (PITR): rebuilding a cluster to a point in time

As an aside to PITR: instead of rebuilding an entire data cluster from scratch to a particular point in time, one can create a STANDBY host that replicates on a delay and can be recovered to a particular point in time or promoted in its current state. Refer to the run-time parameter recovery_min_apply_delay for more information.
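
A minimal sketch, assuming streaming replication is already configured; the four-hour delay is an arbitrary example:

# postgresql.conf (or postgresql.auto.conf) on the STANDBY
# WAL is still received immediately; only its replay is delayed
recovery_min_apply_delay = '4h'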

In conclusion, while small databases can be administered in an ad hoc manner, the administration of a large database must always be performed using a more rigorous and conscientious approach. And what you learn from administering a large database can be carried over to administering a small one.

Apr
13
2022
--

Overcoming VACUUM WRAPAROUND

Transaction ID Wraparound occurs when the VACUUM process cannot keep up with database activity and the PostgreSQL service is forced to shut down.

In more technical parlance: Transaction ID Wraparound occurs when the semantics of Multi-Version Concurrency Control (MVCC) break down because the number of unique transaction IDs in use approaches its maximum of about two billion.

This situation arises when the VACUUM process, whether managed by the autovacuum workers or run manually, does not keep up with the DML operations.

Transaction ID Wraparound can be caused by a combination of one or more of the following circumstances:

  1. Autovacuum is turned off
  2. Long-lived transactions
  3. Database logical dumps (on a REPLICA using streaming replication)
  4. Many session connections with locks extending across large swaths of the data cluster
  5. Intense DML operations forcing the cancellation of autovacuum worker processes

Transaction WRAPAROUND can cause a spontaneous shutdown of the Postgres database server in order to protect the integrity of the data.

PostgreSQL tracks every transaction with a unique ID. Every so often the age of the oldest unfrozen transaction reaches a configured threshold, 200 million transactions by default (autovacuum_freeze_max_age), and autovacuum freezes the old rows so that their IDs can be reused. But if the number of unique transaction IDs in use reaches its absolute limit, known as TXID Wraparound, Postgres will force a shutdown in order to protect the data.

Here’s how it works:

  • 4 billion transactions, 2^32, is the integer upper limit for the datatype used in Postgres.
  • 2 billion transactions, 2^31, is the upper limit that PostgreSQL permits before forcing a shutdown.
  • At 10 million transactions before the upper limit is reached, WARNING messages consisting of a countdown are logged.
  • At 1 million transactions before the upper limit is reached, PostgreSQL goes into READ-ONLY mode.

Warning Signs

In the case of the autovacuum daemon falling behind across the entire data cluster, review your monitoring solution in order to identify the trend of these metrics:

  • IO wait increases
  • CPU load increases
  • SQL performance decreases

Mitigation steps include:

  • Reviewing the internal Postgres monitoring metrics and confirming tables are being vacuumed.
  • Reviewing the Postgres logs, looking for an overabundance of canceled autovacuum worker processes.
  • Reviewing the view “pg_stat_activity” and looking for query strings ending in “(to prevent wraparound)”. This is a normal message, but one should not see autovacuum running solely for the purpose of mitigating WRAPAROUND; a query sketch follows this list.
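
A hedged sketch of that check against pg_stat_activity:

--
-- autovacuum workers currently running, including any anti-wraparound vacuums
--
select pid, datname, query, xact_start
from pg_stat_activity
where query like 'autovacuum:%'
order by xact_start;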

Here are example error messages that you can find in the Postgres logs when threatened by a shutdown due to WRAPAROUND:

#
# When less than 10 million transactions remain before shutdown
#
WARNING: database "mydb" must be vacuumed within 177009986 transactions
HINT: To avoid a database shutdown, execute a database-wide VACUUM in "mydb".

Here’s a set of queries that will help you determine if WRAPAROUND is a risk:

--
-- Database query for transaction age per database
--  and as a percentage of maximum permitted transactions
--
SELECT datname,
       age(datfrozenxid),
       (age(datfrozenxid)::numeric/1000000000*100)::numeric(4,2) as "% WRAPAROUND RISK"
FROM pg_database ORDER BY 2 DESC;
--
-- Database query for transaction age per table
--
SELECT  
c.oid::regclass as table_name,
greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as "TXID age",
(greatest(age(c.relfrozenxid),age(t.relfrozenxid))::numeric/1000000000*100)::numeric(4,2) as "% WRAPAROUND RISK"
FROM pg_class c
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHERE c.relkind IN ('r', 'm')
ORDER BY 2 DESC;
--
-- Other vacuum runtime parameters of interest
--  returning TXID age
--
SELECT  name, setting
FROM pg_settings
WHERE name ~ 'vacuum'
AND name ~'_age$'
ORDER BY 1 ASC;

Preventing Transaction ID Wraparound

First and foremost, make certain all tables are regularly vacuumed. A correctly configured autovacuum process takes care of this without it ever becoming an issue. Otherwise, you will need to consider a manual VACUUM strategy.

The following are merely suggestions since each situation is highly subjective.

If you have the time, run the following invocation of vacuumdb. The value for option ‘-j’ can vary from a couple to a value equal to the number of CPUs on the host. The option ‘-a’ will process each database in alphabetical order.

vacuumdb -F -a -z -j 10 -v

Consider a bash script targeting individual databases if you see one is more urgent than another:

vacuumdb -z -j 10 -v <mydatabase>

Immediate Action: Approaching Wraparound at < 10 Million Transactions

The following is the set of actions to take when TRANSACTION WRAPAROUND is imminent. Remember, you are in a race against time.

You must vacuum the entire data cluster before the remaining available transaction id drops to 1 million transactions.

Action

  • The task is to vacuum the databases as quickly as possible.
  • The tool of choice is the CLI “vacuumdb”.
  • Use as many threads as reasonable.
  • Run VACUUM  in verbose mode and log the output.
  • Monitor log output for anomalous messages i.e. vacuum fails, etc.
  • Run “vacuumdb” against individual databases and, if necessary, individual tables.
  • Avoid using the option ‘-a’.

Scripts

Here’s a pair of example scripts that you can use as a starting point when developing your own mitigation protocol.

Method

  1. Identify the database with the oldest TXID
  2. Generate a list of tables in order of the oldest TXID age to the youngest
  3. Feed this list of tables into a script that invokes vacuumdb to VACUUM one table per invocation

The secret sauce is xargs which enables one to utilize as many CPUs as reasonably possible. The following pair of bash scripts invoke vacuumdb against a series of tables. Of course, there’s more than one way to do this.

Script one generates a list of tables in a selected database and calls script two which executes the VACUUM on each of those tables individually.

SCRIPT ONE  (go1_highspeed_vacuum.sh)

#!/bin/bash
#
# INVOCATION
# EX: ./go1_highspeed_vacuum.sh
#

########################################################
# EDIT AS REQUIRED
export CPU=4
export PAGER=less PGUSER=postgres PGPASSWORD=mypassword PGDATABASE=db01 PGOPTIONS='-c statement_timeout=0'
########################################################

SQL1="
with a as (select &nbsp; c.oid::regclass as table_name,
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; greatest(age(c.relfrozenxid),age(t.relfrozenxid))
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; from pg_class c
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; left join pg_class t on c.reltoastrelid = t.oid
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; where c.relkind in ('r', 'm')
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; order by 2 desc)
select table_name from a
"

LIST="$(echo "$SQL1" | psql -t)"

# the '-P' option sets the number of CPUs to use simultaneously
xargs -t -n 1 -P $CPU ./go2_highspeed_vacuum.sh $PGDATABASE <<< $LIST

echo "$(date): DONE"

SCRIPT TWO (go2_highspeed_vacuum.sh)

#!/bin/bash

########################################################
# EDIT AS REQUIRED
export PAGER=less PGUSER=postgres PGPASSWORD=mypassword PGOPTIONS='-c statement_timeout=0'
export DB=$1
export TABLE=$2

########################################################

# vacuum a single table per invocation, appending to the per-database log
vacuumdb --verbose --table="${TABLE}" "${DB}" >> "${DB}.log" 2>&1

Tips

  • Be prepared to execute vacuumdb against the databases in REVERSE alphabetical order to avoid clashing with the autovacuum worker processes, which vacuum in FORWARD alphabetical order.
  • Query table “pg_stat_activity”.
  • Always monitor where the autovacuum processes are working.
  • Avoid working on the same table that the autovacuum workers are currently processing.
  • Use the autovacuum workers as an indicator of what databases remain to be processed.
  • Kill active autovacuum workers when they conflict with a manual vacuum in order to speed things up (see the sketch below).
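
A hedged sketch for locating, and if necessary cancelling, an autovacuum worker that conflicts with a manual vacuum; the database name db01 is hypothetical, and note that the autovacuum launcher will eventually restart the worker:

--
-- identify the autovacuum workers and the tables they are processing
--
select pid, datname, query
from pg_stat_activity
where query like 'autovacuum:%';

--
-- cancel the worker(s) on the database you are manually vacuuming
--
select pg_cancel_backend(pid)
from pg_stat_activity
where query like 'autovacuum:%'
  and datname = 'db01';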

Immediate Action: When PostgreSQL Service Has Shutdown Due to Transaction Wraparound

One recovers from a forced shutdown due to transaction id wraparound by performing a cluster-wide vacuum in single-user mode:

Log in to the host and, as UNIX user “postgres”, execute an invocation similar to the following:

    #
    # it is understood that environment
    #  variable PGDATA points to the data cluster
    #
    postgres --single -D $PGDATA postgres <<< 'vacuum analyze'

I would suggest scripting the vacuum process because you’ll need to log in to each database to perform the VACUUM.

Generate and edit a list of all the databases:

    postgres --single -D $PGDATA postgres <<< 'select datname from pg_database' \
    | grep '"' | cut -d '"' -f 2 > list_db

Here is an example using the aforementioned list “list_db”:

#
# it is understood the database list has
#  been edited before invoking this code snippet
#
for u in $(cat list_db)
do
    postgres --single -D $PGDATA $u <<< 'vacuum analyze'
done

TXID Wraparound is one of the scariest scenarios that can occur. Thankfully, this is an extremely rare incident and only occurs when systems are either under extremely heavy load or have been neglected.

Don’t get caught!

Remember: the best DBA is the one that’s never noticed.

Jan
31
2022
--

How PostgreSQL Pipeline Mode Works

I’d like to introduce you to a very cool feature introduced in PostgreSQL 14: Pipeline Mode.

So just what exactly is Pipeline Mode? Pipeline Mode allows applications to send a query without having to read the result of the previously sent query. In short, it provides a significant performance boost by allowing multiple queries and results to be sent and received in a single network transaction.

As with all good ideas, there is precedent: one can emulate such behavior with a little application code wizardry. Alternatively known as “Batch Mode”, running asynchronous communications between a client and its server has been around for some time. There are a number of existing solutions batching multiple queries in an asynchronous fashion. For example, PgJDBC has supported batch mode for many years using the standard JDBC batch interface. And of course, there’s the old reliable standby dblink.

What distinguishes Pipeline Mode is that it provides an out-of-the-box solution that greatly reduces the complexity of the application code handling the client-server session.

Traditional BATCH MODE Operations

Pipeline Mode

Although introduced in PostgreSQL 14, pipeline mode works against any currently supported version of postgres because the enhancement lives in libpq, which is used by the client, and not in the server itself!

And now for the bad news, of a sort: leveraging Pipeline Mode requires using “C” or an equivalent programming language capable of interfacing directly with libpq. Unfortunately, there is not yet much in the way of ODBC development offering the requisite hooks to take advantage of this enhanced feature. Therefore, one is required to design and program the client-application session in said programming language.

HINT: This is a great way for somebody to make a name for themselves and create a convenient interface to the LIBPQ Pipeline Mode.

How It Works

Now that I’ve issued the requisite caveat, let’s talk about how this mechanism works.

Keeping things simple

  1. The client first makes a connection to the postgres server.
  2. The client must then switch the connection to pipeline mode.
  3. Once in pipeline mode, SQL statements are sent to the server.
  4. Upon arrival at the server, the statements are immediately executed and the results are sent back to the client, i.e. per-statement client/server acknowledgments are not required.
  5. Because each SQL statement is sent sequentially, the application logic can either use a state machine or take advantage of what is obviously a FIFO queue in order to process the results.
  6. Once all asynchronous statements have been executed and their results returned, the client application explicitly terminates pipeline mode and returns the connection to its default setting.

Since each SQL statement is sent independently of the others, it is up to the client logic to make sense of the results. Sending SQL statements and pulling out results that have no relation with each other is one thing, but life gets more complicated when working with logical outcomes that have some level of interdependence.

It is possible to bundle asynchronous SQL statements as a single transaction. But as with all transactions, failure of any one of these asynchronously sent SQL statements will result in a rollback for all the SQL statements.

Of course, the API does provide error handling in the case of pipeline failures. In the case of a FATAL condition, when the pipeline itself fails, the client connection is informed of the error thus flagging the remaining queued operations as lost. Thereafter normal processing is resumed as if the pipeline was explicitly closed by the client, and the client connection remains active.

Getting Into The, UGH, Details …

For the C programmer at heart, here are a couple of references that I can share with you:
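
And here is a minimal, hedged C sketch of the flow described above; the connection string and the table t1 are placeholders, and error handling is kept to a bare minimum:

/* pipeline_demo.c -- minimal libpq pipeline mode sketch (requires a PostgreSQL 14+ libpq)  */
/* build (paths are assumptions): cc pipeline_demo.c -I$(pg_config --includedir) -L$(pg_config --libdir) -lpq */
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    /* 1. connect -- the connection string and table t1(c1 int) are placeholders */
    PGconn *conn = PQconnectdb("dbname=db01");
    if (PQstatus(conn) != CONNECTION_OK) { fprintf(stderr, "%s", PQerrorMessage(conn)); return 1; }

    /* 2. switch the connection into pipeline mode */
    if (!PQenterPipelineMode(conn)) { fprintf(stderr, "cannot enter pipeline mode\n"); return 1; }

    /* 3. queue statements without waiting for their results */
    PQsendQueryParams(conn, "INSERT INTO t1(c1) VALUES (1)", 0, NULL, NULL, NULL, NULL, 0);
    PQsendQueryParams(conn, "SELECT count(*) FROM t1", 0, NULL, NULL, NULL, NULL, 0);

    /* 4. mark a synchronization point; this also flushes the send buffer */
    PQpipelineSync(conn);

    /* 5. drain the results in FIFO order until the sync point is reached */
    PGresult *res;
    while ((res = PQgetResult(conn)) != NULL)
    {
        ExecStatusType st = PQresultStatus(res);
        if (st == PGRES_PIPELINE_SYNC) { PQclear(res); break; }
        if (st == PGRES_TUPLES_OK)
            printf("rows in t1: %s\n", PQgetvalue(res, 0, 0));
        PQclear(res);
        /* each statement's result set ends with a NULL from PQgetResult */
        while ((res = PQgetResult(conn)) != NULL)
            PQclear(res);
    }

    /* 6. leave pipeline mode and clean up */
    PQexitPipelineMode(conn);
    PQfinish(conn);
    return 0;
}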

Caveat

  • Pipeline Mode is designed expressly for asynchronous operation. Synchronous operation is therefore not possible; it would, in any case, defeat the purpose of pipeline mode.
  • One can only send a single SQL command at a time i.e. multiple SQL commands are disallowed.
  • COPY is disallowed.
  • In the case of sending a transaction COMMIT: The client cannot assume the transaction is committed until it receives the corresponding result.
  • Leveraging Pipeline Mode requires programming in either C or a language that can access the libpq API.

Remember to check the postgres documentation, which has more to say on the subject.


Nov
23
2021
--

Upgrading PostGIS: A Proof Of Concept

My last blog introduced the issues one can face when upgrading PostGIS and PostgreSQL at the same time. The purpose of this blog is to walk through the steps with an example.

For our purposes, we will confine ourselves to working with the community versions of 9.6 and 11 respectively, and use LXD in order to create a working environment prototyping the steps, and profiling the issues.

Creating the Demo Environment Using LXD

The first step is creating a template container with the requisite packages and configurations. This template is a basic distribution of Ubuntu 18.04, which has already been installed in the development environment.

# creating the template container
lxc cp template-ubuntu1804 template-BetterView
lxc start template-BetterView
lxc exec template-BetterView bash

These packages are required before PostgreSQL can be installed from the community repository:

apt install -y wget gnupg2

These steps are copied from the community download page for Ubuntu distributions:

echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list 
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

The package list is now updated, making it possible to install our two versions of PostgreSQL (i.e. 9.6 and 11, respectively). Installing pg_repack pulls in the requisite PostgreSQL packages while adding this very useful utility at the same time:

apt update
apt install -y postgresql-11-repack postgresql-9.6-repack

These next packages are useful. Midnight Commander, mc, is a terminal-based shell navigator and file manager while the other package installs utilities, such as netstat, to monitor the status of all network-based services on the localhost:

apt install -y mc net-tools

This last step merely updates the man pages database and the mlocate database. It makes it easier to locate files on the host. Beware this can be a security risk if used on a production host.

mandb && updatedb

This little snippet of code creates our simulated production host. Creating the instance from a template container makes it much easier to try different variations in quick order:

# creating the POC upgrade container
lxc rm --force pg1-BV
lxc cp template-BetterView pg1-BV
lxc start pg1-BV
lxc exec pg1-BV bash

As per our scenario, upgrading PostGIS requires two different versions to be installed on the host. Notice that PostgreSQL version 9.6 has the older version of PostGIS, while version 11 has the newer one.

For our purposes, this presentation assumes upgrading both PostgreSQL and PostGIS is the method to be used.

ATTENTION: Executing this upgrade operation in two distinct phases is preferred: either upgrade PostgreSQL and then upgrade PostGIS, or upgrade PostGIS on the old PostgreSQL to match the version available for the new PostgreSQL and then upgrade the data cluster.

The underlying assumption is that application code can break between PostGIS version upgrades therefore pursuing an incremental process can mitigate potential issues.

https://postgis.net/docs/PostGIS_Extensions_Upgrade.html

https://postgis.net/workshops/postgis-intro/upgrades.html

apt install -y postgresql-9.6-postgis-2.4 postgresql-11-postgis-3

About PostGIS

Available versions of PostGIS, as per the community repository at the time of this blog’s publication:

  • 9.6:
    • postgresql-9.6-postgis-2.4
    • postgresql-9.6-postgis-2.5
    • postgresql-9.6-postgis-3
  • 11:
    • postgresql-11-postgis-2.5
    • postgresql-11-postgis-3
  • PostGIS supported versions matrix

ATTENTION: Azure supports only PostgreSQL 9.6 with PostGIS 2.3.2.

Before You Upgrade

About

This query lists all user-defined functions that have been installed in your database. Use it to summarize not only what you’ve created but the entire suite of PostGIS function calls:

--
-- get list of all PostGIS functions
--
select nspname, proname
from pg_proc
join pg_namespace on pronamespace=pg_namespace.oid
where nspname not in ('pg_catalog','information_schema')
order by 1,2;

In order to validate your functions, you need to know which ones are being used; tracking the functions prior to the upgrade process will identify them. Please note there are two tracking settings, i.e. pl and all. Out of an abundance of caution, it is suggested to initially use all for an extended period of time:

--
-- postgresql.conf
-- track_functions = none                    # none, pl, all
--
alter system set track_functions=all;
select pg_reload_conf();

This view collects all the statistics related to function calls:

--
-- track function activity
--
            View "pg_catalog.pg_stat_user_functions"
   Column   |       Type       | Collation | Nullable | Default
------------+------------------+-----------+----------+---------
 funcid     | oid              |           |          |
 schemaname | name             |           |          |
 funcname   | name             |           |          |
 calls      | bigint           |           |          |
 total_time | double precision |           |          |
 self_time  | double precision |           |          |

Example

This is a simple example demonstrating tracking of function call usage. Note there are two function calls and one of them is invoked by the other:

CREATE OR REPLACE FUNCTION f1 (
    in  a integer,
    out b integer
) AS
$$
BEGIN
    raise notice 'function f1 is called';
    perform pg_sleep(1);
    b = a+1;
END
$$
LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION f2 (
    in  c integer,
    out d integer
) as
$$
BEGIN
    raise notice 'function f2 is called';
    perform f1(c);
    raise notice 'returning from f2';
    d := 0;
END
$$
language plpgsql;

This SQL statement resets all statistics being tracked in the current PostgreSQL database. Please note there are other functions that can be used to reset specific statistics while preserving others (an example follows the snippet below):

select * from pg_stat_reset();
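
For instance, a hedged example of a finer-grained reset that clears only the counters of the function f1 defined above:

--
-- reset the counters of a single tracked function, leaving the rest intact
--
select pg_stat_reset_single_function_counters('f1(integer)'::regprocedure);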

And here’s our functions’ invocations:

db01=# select * from f1(4);
NOTICE:  function f1 is called
 b
---
 5
db01=# select * from f2(4);
NOTICE:  function f2 is called
NOTICE:  function f1 is called
NOTICE:  returning from f2
 d
---
 0
db01=# select * from pg_stat_user_functions;
 funcid | schemaname | funcname | calls | total_time | self_time
--------+------------+----------+-------+------------+-----------
  17434 | public     | f1       |     2 |   2002.274 |  2002.274
  17437 | public     | f2       |     1 |   1001.126 |     0.599

An Upgrade Example Using pg_upgrade

SYNOPSIS

There are two discrete upgrades:

  1. pg_upgrade: pg 9.6 -> pg 11
  2. PostGIS upgrade: postgis-2.4 -> postgis-2.5 -> postgis-3

HOUSE CLEANING

An Ubuntu-based upgrade requires removing the target data cluster because installing PostgreSQL packages onto a Debian-based distro always includes creating a data cluster:

pg_lsclusters
Ver Cluster   Port   Status   Owner       Data directory
9.6 main      5432   online   postgres    /var/lib/postgresql/9.6/main
11  main      5434   online   postgres    /var/lib/postgresql/11/main  

pg_dropcluster --stop 11 main

For our purposes we are simply adding the extension; no user-defined functions have been included:

su - postgres
createdb -p 5432 db01
psql -p 5432 db01 -c "create extension PostGIS"
exit

Shutting down the source data cluster is the last step before the upgrade process can begin:

systemctl stop postgresql@9.6-main

Debian based distros provide a convenient CLI, making upgrades easy:

# /usr/bin/pg_upgradecluster [OPTIONS] <old version> <cluster name> [<new data directory>]
pg_upgradecluster -v 11 9.6 main

It’s important to check the upgrade logs before starting PostgreSQL version 11. This is a one-way process and once it’s active the old PostgreSQL 9.6 cluster is no longer available and must be destroyed:

systemctl start postgresql@11-main
pg_dropcluster --stop 9.6 main

Here’s confirmation of the PostgreSQL and PostGIS upgrade respectively:

su - postgres
psql -p 5432 db01
show server_version;
           server_version
------------------------------------
 11.14 (Ubuntu 11.14-1.pgdg18.04+1)
select * from PostGIS_version();
            PostGIS_version
---------------------------------------
 3.1 USE_GEOS=1 USE_PROJ=1 USE_STATS=1

PostGIS Function API, Inspection, and Review

This is critical: the process validates either that the application logic still works or that it must be updated.

METHOD: inspect each function call used between all versions:

  • from 2.4 -> 2.5
  • from 2.5 -> 3.0
  • from 3.0 -> 3.1

TIP: the 3.1 documentation encapsulates all previous versions, i.e. section 9.12

Regression Testing

  • In the current setup, pg 9.6
    • Identify all functions used in PostGIS (see the query sketch after this list)
    • Execute a simple function call with every type of parameter typically used in your environment
    • Collect, record all variables returned
  • In the target setup, pg 11 or pg 13
    • Execute a simple function call with every type of parameter typically used in your environment
    • Collect, record all variables returned
  • Analysis
    • Compare values: similar values mean you don’t have a problem
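
To support the identification step above, here is a hedged query joining the function-call statistics with the extension catalog; it assumes track_functions has been enabled as described earlier:

--
-- PostGIS functions actually called by the application, most heavily used first
--
select p.proname, s.calls
from pg_stat_user_functions s
join pg_proc p on p.oid = s.funcid
join pg_depend d on d.classid = 'pg_proc'::regclass and d.objid = p.oid and d.deptype = 'e'
join pg_extension e on d.refclassid = 'pg_extension'::regclass and d.refobjid = e.oid
where e.extname = 'postgis'
order by s.calls desc;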

Working With Cloud Provider Technology

Be advised, cloud environments are not ideal upgrade candidates. The aforementioned process is admittedly detailed, but following it will facilitate a successful upgrade.

  • AZURE
    • pg 9.6: PostGIS 2.3.2
    • pg 11: PostGIS 2.5.1
  • AMAZON
    • pg 11, 13: PostGIS 3.1.4
    • pg 9.6.*: PostGIS 2.3.[0247], 2.5.[25]

References:

https://docs.microsoft.com/en-us/azure/postgresql/concepts-extensions

https://docs.microsoft.com/en-us/azure/postgresql/concepts-extensions#postgres-96-extensions

https://docs.microsoft.com/en-us/azure/postgresql/concepts-extensions#postgres-11-extensions

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.PostGIS.html



Nov
10
2021
--

Issues Upgrading PostGIS and PostgreSQL


PostGIS is arguably the best GIS implementation in the world. There may be other database technologies with GIS but the value proposition of quality, performance, and sophistication, and, of course, cost, is nowhere near what PostgreSQL offers.

So let me set the stage for this blog of mine: you’ve been using PostGIS on your PostgreSQL server for a few years now but having reached its end-of-life, it’s time to upgrade the system.

How do you do it?

Most times upgrading PostgreSQL, although a significant activity, can be a pain-free experience. And for those people who’ve implemented their PostgreSQL database in the cloud, upgrading PostgreSQL is often a matter of just clicking a few buttons. The only real work is fleshing out the steps as you compose your run-book and rehearse drills before performing your scheduled maintenance window.

Upgrading a PostgreSQL server with extensions is a matter of ensuring the same versions are present on both source and target hosts. In the case of extensions bundled with PostgreSQL, such as pg_stat_statements, there’s nothing to worry about since compatibility is pretty much guaranteed. Things are just a touch more complex when using popular extensions such as pg_repack as you must install its package on the target host. Less popular extensions can be more challenging in that they may not have been as rigorously tested. But in the case of upgrading a PostGIS-enabled database engine, you can bid any hopes for a stress-free experience a sweet goodbye as it saunters out the door waving fondly at you.

Looking at the library dependencies for pg_stat_statements, pg_repack and postgis respectively should help you appreciate the issues a little better:

pg_stat_statements

ldd /usr/lib/postgresql/12/lib/pg_stat_statements.so
linux-vdso.so.1 (0x00007ffc3759c000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f5e4ff60000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5e4fb6f000)
/lib64/ld-linux-x86-64.so.2 (0x00007f5e50508000)

pg_repack

ldd /usr/lib/postgresql/12/lib/pg_repack.so
linux-vdso.so.1 (0x00007ffeaaf8c000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9601226000)
/lib64/ld-linux-x86-64.so.2 (0x00007f960181e000)

postgis

ldd /usr/lib/postgresql/12/lib/postgis-3.so
linux-vdso.so.1 (0x00007fff7f736000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f953bf27000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f953bb9e000)
libgeos_c.so.1 => /usr/lib/x86_64-linux-gnu/libgeos_c.so.1 (0x00007f953b96d000)
libproj.so.12 => /usr/lib/x86_64-linux-gnu/libproj.so.12 (0x00007f953b704000)
libjson-c.so.3 => /lib/x86_64-linux-gnu/libjson-c.so.3 (0x00007f953b4f9000)
libprotobuf-c.so.1 => /usr/lib/x86_64-linux-gnu/libprotobuf-c.so.1 (0x00007f953b2f0000)
libxml2.so.2 => /usr/lib/x86_64-linux-gnu/libxml2.so.2 (0x00007f953af2f000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f953ad17000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f953a926000)
/lib64/ld-linux-x86-64.so.2 (0x00007f953c612000)
libgeos-3.7.1.so => /usr/lib/x86_64-linux-gnu/libgeos-3.7.1.so (0x00007f953a583000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f953a364000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f953a160000)
libicuuc.so.60 => /usr/lib/x86_64-linux-gnu/libicuuc.so.60 (0x00007f9539da8000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f9539b8b000)
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f9539965000)
libicudata.so.60 => /usr/lib/x86_64-linux-gnu/libicudata.so.60 (0x00007f9537dbc000)

PostGIS is probably the most sophisticated extension you will ever encounter on PostgreSQL. Its dependencies encompass not just the database system but a number of disparate libraries and projects to which the RDBMS, under normal circumstances, has no relation.

Let’s itemize some key issues:

  • The naming scheme protocol used in PostGIS is not the same as PostgreSQL. For example, PostgreSQL version 13.4 means the major version is “13” and the minor release or bug release is “4”. But PostGIS still uses the old numbering system that the PostgreSQL project followed before version 10. For example, PostGIS version 2.4.1 means the major version is “2.4” and the minor release is “1”.
  • All currently supported versions of PostGIS are NOT supported for all currently maintained versions of PostgreSQL. For example, in the case of Ubuntu, the PostgreSQL community repository currently makes available PostGIS 2.4 for PostgreSQL versions 10 and older, but it isn’t found for versions 11 and newer. Referring to the postgis support matrix, which lists all currently supported versions of postgis and their end-of-life dates, one can see that technically speaking version 2.4 is supported on postgres version 11; on the other hand, there is support on the Red Hat/CentOS side of things. This isn’t laziness on the part of the maintainers; rather, there’s so much in the way of advances that it’s simply not practical to maintain this technical debt for longer than a few years at a time.
  • Best practices require the same versioned extension to be present on both source and target data clusters. Keep in mind when it comes to major version differences some extensions are more flexible than others. Otherwise, one is forced to remove it from the source data cluster if it’s not present on the target.
  • For some extensions, such as the internal ones that come bundled with the PostgreSQL packages and certain 3rd party extensions, such as pg_repack, it is possible to upgrade between extension major versions without issue.
  • In regards to PostGIS: when faced with different extension versions between the source and target data clusters one must begin by upgrading postgis on the source data cluster such that it matches BOTH the major and minor versions of the target data cluster and only thereafter performing the PostgreSQL upgrade. Alternatively, one can install a downgraded major/minor version of PostGIS on the target data cluster, matching the source data cluster’s major and minor version of PostGIS, and then upgrade PostgreSQL (yes, this is convoluted).
  • Some cloud providers do NOT offer more than one version of PostGIS for any particular version of PostgreSQL thereby making the previous recommendation redundant. The reason is simple, they are not prepared to handle PostGIS upgrades. There are ways of dealing with them but that’s for another blog. Please note that this is a rapidly moving target and can/will probably change in the future.
  • Because the PostgreSQL community repository removes all versions of PostgreSQL and their extensions once they have reached end-of-life, PostGIS package support becomes problematic when upgrading database servers that have reached end-of-life.

Over the past twenty years, I’ve had occasion to upgrade some fairly large installations using PostGIS, the most complex being over 120TB data on a version of PostgreSQL that hadn’t been upgraded in several years and the PostGIS was so old that it wasn’t even installed using extensions. As a result of this and other experiences I have, where possible, adopted the practice of performing PostGIS and PostgreSQL upgrades as separate exercises.

I can’t emphasize enough that regression testing for each GIS function call be made thoroughly in a development environment before going into production. Because the PostGIS project’s development cycle advances at the same torrid pace as the demand for GIS, it can happen that the intrinsic behavior of a function can change seemingly without warning. Therefore, merely validating the API via a document review is insufficient, i.e. TEST IT!

Here’s a tip: Turn on run-time parameter track_functions and you’ll be able to identify all those wonderful PostGIS functions your application stack depends upon.

Finally, by all means, take a look at an earlier blog, “Working with PostgreSQL and PostGIS: How To Become A GIS Expert”, for more information about PostGIS.

