How to Replace Patroni and Etcd IP/Host Information in PostgreSQL
As we know, Patroni is a well-established standard for an HA framework for PostgreSQL clusters. From time to time, we need to perform maintenance tasks such as upgrading the topology or making changes to the existing setup. Here, we will discuss mainly how we can replace the IP/host information in the Patroni and etcd layers. Below, we […]
Continuous Replication From a Legacy PostgreSQL Version to a Newer Version Using Slony
Native streaming replication in PostgreSQL works only between servers running the same major version. We discussed logical replication in our previous blog post. In that post, we saw how logical replication could help us set up migration between two different PostgreSQL versions. However, logical replication works only for the currently supported versions of PostgreSQL, for example between PostgreSQL 9.4 and PostgreSQL 11. So what about legacy versions older than 9.4? Slony-I can help us meet this replication requirement.
Replication between different PostgreSQL versions with Slony-I is useful for migration from legacy database installations to the latest available version. So what is Slony and how does it work?
This post is the fourth of our Upgrading or Migrating Your Legacy PostgreSQL to Newer PostgreSQL Versions series where we’ll be exploring different methods available to upgrade your PostgreSQL databases.
Slony
Slony is an application-level logical replication implementation for PostgreSQL. In other words, it is an external replication tool that requires a separate installation and configuration. Slony has been around for a long time; the latest version supports PostgreSQL versions from 8.4 up to 11.
The main goal of replication is to ship changes from one database server to another. To better understand the architecture, you should know the Slony-I terms Slon, Events, and Slonik.
An aside: Slony means elephants in Russian, and elephants are indeed reputed to have a great memory. A slightly angry, but nevertheless pretty elephant, “Slonik”, looks at you from the PostgreSQL logo image.
Slon
Slon is a daemon that runs on each PostgreSQL node in Slony-I replication. These daemons are used for processing configuration and replication events for each PostgreSQL server. Each PostgreSQL server is called a “node”. All nodes together form a Slony “cluster”.
The “publisher” node is the source of replicated changes, while “subscriber” nodes receive and apply changes from the publisher node.
To set up replication, we specify all replicated tables as a “set”. The actual subscription works on a specific set. Changes to the replicated tables are grouped into SYNCs, and these groups of transactions are applied together to the subscriber nodes.
Events
Changes are transferred from the publisher in the form of “events”. When an event is processed by the slon daemon on a remote node, it generates a “confirmation”. Events are also used to notify nodes about configuration changes like adding/removing new nodes, new subscriptions or DDL changes.
Each event has a unique origin identifier, a sequence number, the transaction ID of the snapshot on the provider node for this event, multiple arguments, and a timestamp with time zone.
Triggers written in PL/pgSQL register all changes in the replicated tables. Unfortunately, there is no reliable way yet to handle changes to large objects (BLOBs), DDL, or changes to users and roles.
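As a quick sanity check after the setup described below, you can list the triggers Slony has attached to a replicated table with an ordinary catalog query. This is only a sketch; it assumes the public.pgbench_accounts table from the example later in this post:
-- List user-level triggers on a replicated table; Slony's log trigger should appear here.
postgres=# SELECT tgname, tgenabled
           FROM pg_trigger
           WHERE tgrelid = 'public.pgbench_accounts'::regclass
             AND NOT tgisinternal;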
Slonik
Slonik means a little elephant. It is a command-line utility with a parser and interpreter, and it accepts “slonik scripts” – a simple declarative scripting language designed to overcome the limitations of a procedural language. You use slonik commands to set up or modify Slony replication, and they can be embedded in shell scripts. Slonik can accept commands from standard input or from files. The following example shows a slonik script fed to the slonik utility from within a shell script.
The script to create the initial configuration for the simple master-slave setup of our pgbench database looks like this:
#!/bin/sh
slonik <<_EOF_
cluster name = percona_pg;
node 1 admin conninfo = 'dbname=pg93 host=pg93_host user=percona_pg93_user';
node 2 admin conninfo = 'dbname=pg11 host=pg11_host user=percona_pg11_user';
#--
# Creates a _$(clustername), in this example _percona_pg, schema
#--
init cluster ( id=1, comment = 'Legacy PG Node');
#--
# Add a list of tables being replicated to a set.
#--
create set (id=1, origin=1, comment='pgbench');
set add table (set id=1, origin=1, id=1, fully qualified name = 'public.pgbench_accounts', comment='accounts');
set add table (set id=1, origin=1, id=2, fully qualified name = 'public.pgbench_branches', comment='branches');
set add table (set id=1, origin=1, id=3, fully qualified name = 'public.pgbench_tellers', comment='tellers');
set add table (set id=1, origin=1, id=4, fully qualified name = 'public.pgbench_history', comment='history');
#--
# Create the second node (the slave), tell the 2 nodes how to connect to
# each other and how they should listen for events.
#--
store node (id=2, comment = 'Target node', event node=1);
store path (server = 1, client = 2, conninfo='dbname=pg93 host=pg93_host user=percona_pg93_user');
store path (server = 2, client = 1, conninfo='dbname=pg11 host=pg11_host user=percona_pg11_user');
_EOF_
Why is Slony useful for migrations?
Despite the benefits of built-in logical replication, this external solution is a better fit for migrations from versions older than PostgreSQL 9.4. The trigger-based approach depends on the database API: the older and newer versions only need to be compatible in terms of PL/pgSQL and SQL syntax.
How to adapt your database for use with Slony
- Your tables must have primary keys. Add a serial field to any table without a primary key.
- OID large objects (BLOBs) will not have their changes replicated. If the columns hold small values, you could convert them to BYTEA. For really large objects like images, it's better to store the data externally, e.g., in Amazon S3. If it's too hard to change your application, you could apply blob changes during the last stage of the migration.
- ALTER TABLE and other DDL operations: Slony can't detect table structure changes. Instead, you should use the EXECUTE SCRIPT slonik command to apply a SQL file with DDL, or SQL strings, to the whole replication cluster, as sketched below.
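For example, a DDL change could be pushed to every node with a slonik script along these lines. This is only a sketch that reuses the cluster and connection details from the example above; the SQL file path is an assumption:
#!/bin/sh
slonik <<_EOF_
cluster name = percona_pg;
node 1 admin conninfo = 'dbname=pg93 host=pg93_host user=percona_pg93_user';
node 2 admin conninfo = 'dbname=pg11 host=pg11_host user=percona_pg11_user';
#--
# Apply the DDL in /tmp/add_column.sql on all nodes, in the correct position
# of the replication event stream (event node 1 is the origin).
#--
execute script (filename = '/tmp/add_column.sql', event node = 1);
_EOF_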
Online migration from legacy PostgreSQL
- Create a replication user with superuser privileges. It's possible to use fine-grained privileges, but they are significantly harder to set up.
- Create a database on the destination and set up access over TCP/IP.
- Copy the table definitions from the master to the slaves.
- Install Slony-I. On servers with an old OS distribution, you might find it simpler to install Slony-I from source code.
- Define the cluster, the set of tables, and the connection information to the nodes as a list of slonik commands.
- Start the slon daemon on each PostgreSQL server. Check standard output or the log files for any potential communication errors.
- Execute the subscription slonik commands to start the sync (see the sketch after this list).
- Test your read-only queries against the newer version of PostgreSQL.
- Once all the data has been replicated and is in sync, stop your applications and repoint them to the new PostgreSQL server.
- Use “uninstall node” on the newer version of PostgreSQL to remove all traces of Slony replication.
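The subscription step mentioned in the list could look like the following slonik script. Again, this is only a sketch that reuses the cluster and node definitions from the initial configuration script:
#!/bin/sh
slonik <<_EOF_
cluster name = percona_pg;
node 1 admin conninfo = 'dbname=pg93 host=pg93_host user=percona_pg93_user';
node 2 admin conninfo = 'dbname=pg11 host=pg11_host user=percona_pg11_user';
#--
# Node 2 subscribes to set 1, with node 1 as its data provider.
#--
subscribe set (id = 1, provider = 1, receiver = 2, forward = no);
_EOF_
Once the initial copy finishes, Slony keeps the subscriber in sync until you repoint your applications.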
Downgrade steps
To downgrade, follow the same procedure as for the upgrade. Slony allows you to replicate from and to any version of PostgreSQL supported by your Slony version. The minimum supported version is 8.4.
Summary
So far, we have seen a high-level overview of how Slony can help to perform an upgrade with the least possible downtime. Come and see more of this in action during our webinar. And don't forget that at Percona Live in Austin, May 28-30, 2019, we'll have two days of PostgreSQL content in a dedicated PostgreSQL track.
PostgreSQL Streaming Physical Replication With Slots
PostgreSQL streaming physical replication with slots simplifies setup and maintenance procedures. Usually, you have to estimate disk usage for the Write-Ahead Log (WAL), limit the number of retained segments appropriately, and set up a WAL archiving procedure. In this article, you will see how to use replication with slots and understand what problems it can solve.
Introduction
PostgreSQL physical replication is based on WAL. The Write-Ahead Log contains all database changes, saved in 16MB segment files. Normally, Postgres keeps only the segments needed between checkpoints, so with default settings just 1GB of WAL segment files is available.
Replication requires all WAL files created from the time of the backup up until the current time. Previously, it was necessary to keep a huge archive directory (usually mounted over NFS on all slave servers). The slots feature, introduced in 9.4, allows Postgres to track the latest segment downloaded by each slave server. Now, PostgreSQL can keep all needed segments on disk, even without archiving, if a slave falls seriously behind its master due to downtime or networking issues. The drawback: in the case of a configuration error, disk space can be consumed without limit. Before continuing, if you need a better understanding of physical and streaming replication, I recommend you read “Streaming Replication with PostgreSQL“.
Create a sandbox with two PostgreSQL servers
To set up replication, you need at least two PostgreSQL servers. I'm using pgcli (pgc) to set up both servers on the same host. It's easy to install on Linux, Windows, and OS X, and it provides the ability to download and run any version of PostgreSQL on your staging server or even on your laptop.
python -c "$(curl -fsSL https://s3.amazonaws.com/pgcentral/install.py)"
mv bigsql master
cp -r master slave
$ cd master
master$ ./pgc install pg10
master$ ./pgc start pg10
$ cd ../slave
slave$ ./pgc install pg10
slave$ ./pgc start pg10
First of all you should allow the replication user to connect:
master$ echo "host replication replicator 127.0.0.1/32 md5" >> ./data/pg10/pg_hba.conf
If you are running master and slave on different servers, please replace 127.0.0.1 with the slave’s address.
pgc creates a shell environment file with PATH and all the other variables required for PostgreSQL; next, load it into the current shell:
master$ source ./pg10/pg10.env
Allow connections from the remote host, and create a replication user and slot on master:
master$ psql
postgres=# CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'replicator';
CREATE ROLE
postgres=# ALTER SYSTEM SET listen_addresses TO '*';
ALTER SYSTEM
postgres=# SELECT pg_create_physical_replication_slot('slot1');
 pg_create_physical_replication_slot
-------------------------------------
 (slot1,)
To apply the system variable changes and the pg_hba.conf change, restart the Postgres server:
master$ ./pgc stop ; ./pgc start pg10
stopping pg10
starting on port 5432
Test table
Create a table with lots of padding on the master:
master$ psql
psql (10.6)
Type "help" for help.
postgres=# CREATE TABLE t(id INT, pad CHAR(200));
postgres=# CREATE INDEX t_id ON t (id);
postgres=# INSERT INTO t SELECT generate_series(1,1000000) AS id, md5((random()*1000000)::text) AS pad;
Filling WAL with random data
To see the benefits of slots, we should fill the WAL with some data by running transactions. Repeat the update statement below to generate a huge amount of WAL data:
UPDATE t SET pad = md5((random()*1000000)::text);
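If you don't want to re-run the statement by hand, psql can repeat it for you. A minimal sketch using \watch (available since PostgreSQL 9.3), which re-executes the query buffer every five seconds until interrupted:
-- type the statement, then \watch instead of a trailing semicolon
postgres=# UPDATE t SET pad = md5((random()*1000000)::text) \watch 5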
Checking the current WAL size
You can check the total size of all WAL segments from the shell or from psql:
master$ du -sh data/pg10/pg_wal
17M data/pg10/pg_wal
master$ source ./pg10/pg10.env
master$ psql
postgres=# \! du -sh data/pg10/pg_wal
17M data/pg10/pg_wal
Check maximum WAL size without slots activated
Before configuring replication, we can fill the WAL with random data and find that after about 1.1GB, the size of the data/pg10/pg_wal directory does not increase, regardless of the number of update queries.
postgres=# UPDATE t SET pad = md5((random()*1000000)::text); -- repeat 4 times
postgres=# \! du -sh data/pg10/pg_wal
1.1G data/pg10/pg_wal
postgres=# UPDATE t SET pad = md5((random()*1000000)::text);
postgres=# \! du -sh data/pg10/pg_wal
1.1G data/pg10/pg_wal
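The ceiling of roughly 1.1GB comes from max_wal_size, which defaults to 1GB in PostgreSQL 10. You can confirm the limit from psql (output shown assuming the default is unchanged):
postgres=# SHOW max_wal_size;
 max_wal_size
--------------
 1GB
(1 row)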
Back up the master from the slave server
Next, let's make a base backup using slot1:
slave$ source ./pg10/pg10.env
slave$ ./pgc stop pg10
slave$ rm -rf data/pg10/*
# If you are running master and slave on different servers, replace 127.0.0.1 with the master's IP address.
slave$ PGPASSWORD=replicator pg_basebackup -S slot1 -h 127.0.0.1 -U replicator -p 5432 -D $PGDATA -Fp -P -Xs -Rv
Unfortunately, pg_basebackup hangs with: “initiating base backup, waiting for checkpoint to complete”.
We can wait for the next checkpoint, or force a checkpoint on the master. A checkpoint happens every checkpoint_timeout seconds, which is set to five minutes by default.
Forcing a checkpoint on the master:
master$ psql
postgres=# CHECKPOINT;
The backup continues on the slave side:
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/92000148 on timeline 1
pg_basebackup: starting background WAL receiver
1073986/1073986 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/927FDDE8
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: base backup completed
The backup copies settings from the master, including its TCP port value. I'm running both master and slave on the same host, so I should change the port in the slave's postgresql.conf file:
slave$ vim data/pg10/postgresql.conf
# old value: port = 5432
port = 5433
Now we can return to the master and run some queries:
slave$ cd ../master
master$ source pg10/pg10.env
master$ psql
postgres=# UPDATE t SET pad = md5((random()*1000000)::text);
postgres=# UPDATE t SET pad = md5((random()*1000000)::text);
After running these queries, the WAL size is now 1.4GB, already bigger than the 1.1GB limit we saw before! Repeat the update query three times and the WAL grows to 2.8GB:
master$ du -sh data/pg10/pg_wal
2.8G data/pg10/pg_wal
Left like this, the WAL will keep growing until the whole disk is consumed.
How do we find out the reason for this?
postgres=# SELECT redo_lsn, slot_name, restart_lsn,
           round((redo_lsn-restart_lsn) / 1024 / 1024 / 1024, 2) AS GB_behind
           FROM pg_control_checkpoint(), pg_replication_slots;
  redo_lsn  | slot_name | restart_lsn | gb_behind
------------+-----------+-------------+-----------
 1/2A400630 | slot1     | 0/92000000  |      2.38
We have one slot that is 2.38GB behind the master.
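If you prefer not to do the byte arithmetic by hand, the same check can be written with pg_wal_lsn_diff() and pg_size_pretty(); this is just an alternative sketch of the query above:
postgres=# SELECT slot_name,
                  pg_size_pretty(pg_wal_lsn_diff(redo_lsn, restart_lsn)) AS behind
           FROM pg_control_checkpoint(), pg_replication_slots;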
Let’s repeat the update and check again. The gap has increased:
postgres=# SELECT redo_lsn, slot_name, restart_lsn,
           round((redo_lsn-restart_lsn) / 1024 / 1024 / 1024, 2) AS GB_behind
           FROM pg_control_checkpoint(), pg_replication_slots;
  redo_lsn  | slot_name | restart_lsn | gb_behind
------------+-----------+-------------+-----------
 1/8D400238 | slot1     | 0/92000000  |      3.93
Wait, though: we have already used slot1 for backup! Let’s start the slave:
master$ cd ../slave
slave$ ./pgc start pg10
Replication started without any additional change to recovery.conf:
slave$ cat data/pg10/recovery.conf
standby_mode = 'on'
primary_conninfo = 'user=replicator password=replicator passfile=''/home/pguser/.pgpass'' host=127.0.0.1 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres target_session_attrs=any'
primary_slot_name = 'slot1'
The pg_basebackup -R option instructs the backup to write a recovery.conf file with all the required options, including primary_slot_name.
WAL size, all slots connected
The gap was already reduced several seconds after the slave started:
postgres=# SELECT redo_lsn, slot_name, restart_lsn,
           round((redo_lsn-restart_lsn) / 1024 / 1024 / 1024, 2) AS GB_behind
           FROM pg_control_checkpoint(), pg_replication_slots;
  redo_lsn  | slot_name | restart_lsn | gb_behind
------------+-----------+-------------+-----------
 1/8D400238 | slot1     | 0/9A000000  |      3.80
And a few minutes later:
postgres=# SELECT redo_lsn, slot_name, restart_lsn,
           round((redo_lsn-restart_lsn) / 1024 / 1024 / 1024, 2) AS GB_behind
           FROM pg_control_checkpoint(), pg_replication_slots;
  redo_lsn  | slot_name | restart_lsn | gb_behind
------------+-----------+-------------+-----------
 1/9E5DECE0 | slot1     | 1/9EB17250  |     -0.01
postgres=# \! du -sh data/pg10/pg_wal
1.3G data/pg10/pg_wal
Slave server maintenance
Let's simulate slave server maintenance by executing ./pgc stop pg10 on the slave. We'll push some data onto the master again (execute the UPDATE query four times).
Now, “slot1” is again 2.36GB behind.
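The pg_replication_slots view also tells you whether a consumer is currently connected, which makes it easy to spot slots that are silently holding WAL. A small sketch:
postgres=# SELECT slot_name, active, restart_lsn
           FROM pg_replication_slots
           WHERE NOT active;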
Removing unused slots
By now, you might realize that the problematic slot is simply not in use. In such cases, you can drop it so that the WAL segments it is holding can be recycled:
master$ psql
postgres=# SELECT pg_drop_replication_slot('slot1');
Finally, the disk space is released:
master$ du -sh data/pg10/pg_wal
1.1G data/pg10/pg_wal
Important system variables
- archive_mode – not required for streaming replication with slots.
- wal_level – replica by default.
- max_wal_senders – 10 by default; a minimum of three for one slave, plus two for each additional slave.
- wal_keep_segments – 32 by default; not important here because PostgreSQL will keep all segments required by a slot.
- archive_command – not important for streaming replication with slots.
- listen_addresses – the only option that must be changed, to allow remote slaves to connect.
- hot_standby – on by default; important to enable reads on the slave.
- max_replication_slots – 10 by default (https://www.postgresql.org/docs/10/static/runtime-config-replication.html).
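To review all of these settings in one place on the master, a simple sketch against pg_settings:
postgres=# SELECT name, setting
           FROM pg_settings
           WHERE name IN ('archive_mode', 'wal_level', 'max_wal_senders',
                          'wal_keep_segments', 'archive_command',
                          'listen_addresses', 'hot_standby',
                          'max_replication_slots');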
Summary
- Physical replication setup is really easy with slots. By default in PostgreSQL 10, all settings are already prepared for replication.
- Be careful with orphaned slots. PostgreSQL will not remove WAL segments for inactive slots with an initialized restart_lsn.
- Check the restart_lsn value in pg_replication_slots and compare it with the current redo_lsn.
- Avoid long downtime for slave servers with slots configured.
- Please use meaningful names for slots, as that will simplify debugging.
References
- https://www.openscg.com/bigsql/package-manager/
- https://www.openscg.com/bigsql/docs/pgcli/pgcli/
- https://www.postgresql.org/docs/current/static/wal-configuration.html
- https://paquier.xyz/postgresql-2/postgres-9-4-feature-highlight-replication-phydical-slots/
- https://www.postgresql.org/docs/current/static/protocol-replication.html
- https://www.postgresql.org/docs/9.4/static/functions-admin.html