Jul 06, 2023

How Locking Works in PostgreSQL


Locking in PostgreSQL is a mechanism used to control concurrent access to shared resources, ensuring data consistency and preventing conflicting operations. PostgreSQL offers various lock types, each serving a specific purpose. Let’s explore how locking works in PostgreSQL.

Lock Modes

PostgreSQL supports eight table-level lock modes, from least to most restrictive:

Access Share (SELECT): Acquired by plain queries; conflicts only with Access Exclusive, so reads never block ordinary reads or writes.

Row Share (SELECT FOR UPDATE/FOR SHARE): Acquired by row-locking reads; conflicts with the Exclusive and Access Exclusive modes.

Row Exclusive (INSERT, UPDATE, DELETE): Acquired by statements that modify data; conflicts with the Share, Share Row Exclusive, Exclusive, and Access Exclusive modes. The locks that actually prevent two sessions from updating the same row are separate row-level locks, not this table-level mode.

Share Update Exclusive (VACUUM, ANALYZE, CREATE INDEX CONCURRENTLY): Protects a table against concurrent schema changes and VACUUM runs; self-conflicting, but allows normal reads and writes.

Share (CREATE INDEX): Allows concurrent reads but blocks data modification while the index is built.

Share Row Exclusive (CREATE TRIGGER, some forms of ALTER TABLE): Allows concurrent reads, blocks data modification, and is self-exclusive, so only one session can hold it at a time.

Exclusive (REFRESH MATERIALIZED VIEW CONCURRENTLY): Blocks everything except concurrent Access Share locks; only plain reads may proceed.

Access Exclusive (DROP TABLE, TRUNCATE, VACUUM FULL, most forms of ALTER TABLE): Conflicts with every other mode; this is the only mode that blocks plain SELECT.
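These table-level modes can also be taken explicitly with LOCK TABLE. A minimal sketch, assuming a hypothetical accounts table:

```sql
BEGIN;
-- Take a SHARE lock explicitly: concurrent reads proceed, writes block
LOCK TABLE accounts IN SHARE MODE;
-- ... consistent read-mostly work here ...
COMMIT;  -- all locks are released at transaction end
```

Note that there is no UNLOCK TABLE statement; every lock is held until the transaction commits or rolls back.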

Lock Granularity

PostgreSQL supports various levels of lock granularity, allowing locks to be acquired at different levels:

Row-Level Locks: Provide the finest level of granularity, allowing locks to be acquired on specific rows within a table. Multiple sessions can concurrently hold row-level locks on different rows of the same table.

Page-Level Locks: Acquired internally at the page level (a block of a table or index) to serialize access while a page is being read or written; they are released as soon as the page operation completes and are not normally visible to applications.

Table-Level Locks: Acquired on the entire table; the stronger modes prevent concurrent access to any rows or pages within the table.
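Row-level locking is most easily seen with SELECT ... FOR UPDATE. A sketch with two concurrent sessions, again assuming a hypothetical accounts table:

```sql
-- Session 1: lock a single row
BEGIN;
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;

-- Session 2, concurrently: blocks until session 1 commits or rolls back
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- Session 2: a different row of the same table is not blocked
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
```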

Lock Compatibility

PostgreSQL uses a lock compatibility matrix to determine which locks can coexist and which conflict with each other. The compatibility matrix ensures that transactions do not interfere with each other by acquiring incompatible locks simultaneously.

For example, an Exclusive table lock conflicts with every other lock mode except Access Share, so plain reads can proceed while any other lock request waits.
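The locks currently held or awaited in a cluster can be inspected through the pg_locks system view (join it against pg_stat_activity if you also need the blocking query text):

```sql
-- Which table-level locks exist right now, and were they granted?
SELECT locktype, relation::regclass AS relation, mode, granted, pid
FROM pg_locks
WHERE relation IS NOT NULL
ORDER BY relation, pid;
```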

Locking and Transaction Isolation Levels

Transaction isolation levels in PostgreSQL determine the degree to which concurrent transactions can interact with each other. The isolation levels (Read Uncommitted, Read Committed, Repeatable Read, Serializable) influence the behavior of locking and concurrency control.

Read Uncommitted: PostgreSQL treats this level as Read Committed; dirty reads are never possible because MVCC only ever serves committed data.

Read Committed (the default): Each statement sees a snapshot of data committed before it began; write operations take row-level locks, but readers never block writers and writers never block readers.

Repeatable Read: The transaction sees a single snapshot taken at its first statement, so data read within the transaction remains consistent; in PostgreSQL this also prevents phantom reads, without acquiring extra read locks.

Serializable: Provides the strictest isolation level, ensuring that concurrent transactions do not produce non-repeatable reads, phantom reads, or serialization anomalies. PostgreSQL implements it with Serializable Snapshot Isolation (SSI), which tracks read/write dependencies with predicate locks and aborts a transaction rather than allow an anomaly.
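The isolation level is chosen per transaction (or cluster-wide via the default_transaction_isolation setting). For example:

```sql
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- Every statement in this transaction sees the same snapshot
SELECT count(*) FROM accounts;  -- hypothetical table
COMMIT;
```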

Lock Timeout and Deadlocks

PostgreSQL allows you to set a timeout for acquiring locks via the lock_timeout parameter. If a lock cannot be acquired within the specified timeout, an error is raised. Deadlocks can occur when two or more transactions wait for locks held by each other, creating a circular dependency. PostgreSQL employs a deadlock detection mechanism that detects such situations and aborts one of the transactions involved to resolve the deadlock.
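The relevant settings can be adjusted per session; a sketch:

```sql
-- Fail fast instead of queueing indefinitely behind a long-held lock
SET lock_timeout = '2s';
-- Abort any statement (including its lock waits) that runs too long
SET statement_timeout = '30s';
-- Time to wait on a lock before running the deadlock detector (default 1s)
SHOW deadlock_timeout;
```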

Conclusion

It is essential to understand how locking works in PostgreSQL to optimize transaction management, prevent contention, and ensure efficient concurrency control. By choosing the appropriate lock modes, understanding lock compatibility, and setting the right transaction isolation level, you can design robust and performant applications on PostgreSQL.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community in a single distribution, designed and tested to work together.

 

Download Percona Distribution for PostgreSQL Today!

Apr 13, 2023

It’s All About Replication Lag in PostgreSQL


PostgreSQL is a popular open source relational database management system that is widely used for storing and managing data. One of the common issues that can be encountered in PostgreSQL is replication lag.

In this blog, we will discuss what replication lag is, why it occurs, and how to mitigate it in PostgreSQL.

What is replication lag?

Replication lag is the delay between the time when data is written to the primary database and the time when it is replicated to the standby databases. In PostgreSQL, replication lag can occur due to various reasons such as network latency, slow disk I/O, long-running transactions, etc.

Replication lag can have serious consequences in high-availability systems where standby databases are used for failover. If the replication lag is too high, it can result in data loss when failover occurs.

The most common way to measure lag is to query the pg_stat_replication view on the primary node.

postgres=# SELECT pid,application_name,client_addr,client_hostname,state,sync_state,replay_lag
postgres-# FROM pg_stat_replication
postgres-# ;
pid | application_name | client_addr | client_hostname | state | sync_state | replay_lag
-----+------------------+-------------+-----------------+-------+------------+------------
(0 rows)
postgres=#

Queries to check in the Standby node:

postgres=# select pg_is_in_recovery(),pg_is_wal_replay_paused(), pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(), pg_last_xact_replay_timestamp();
pg_is_in_recovery | pg_is_wal_replay_paused | pg_last_wal_receive_lsn | pg_last_wal_replay_lsn | pg_last_xact_replay_timestamp
-------------------+-------------------------+-------------------------+------------------------+-------------------------------
t | f | D1/8000000 | D1/8000000 | 2023-03-16 11:37:57.861711+00
(1 row)
postgres=#
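Lag can also be expressed in bytes and in seconds. Two commonly used queries (the first on the primary, the second on the standby):

```sql
-- On the primary: per-standby replay lag in bytes
SELECT application_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;

-- On the standby: time since the last replayed transaction
SELECT extract(epoch FROM now() - pg_last_xact_replay_timestamp()) AS lag_seconds;
```

Note that lag_seconds is only meaningful while the primary is generating WAL; on an idle primary it grows without indicating real lag.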

Why does replication lag occur?

Replication lag can occur due to various reasons, such as:

Network latency: Network latency is the delay caused by the time it takes for data to travel between the primary and standby databases.

Various factors, such as the distance between the databases, network congestion, etc., can cause this delay.

Slow disk I/O: Slow disk I/O can be caused by various factors such as disk fragmentation, insufficient disk space, etc. Slow disk I/O can delay writing data to the standby databases.

Long-running transactions: Long-running transactions can cause replication lag because the changes made by these transactions are not replicated until the transaction is committed.

Poor configuration: for example, setting max_wal_senders too low while processing huge numbers of transaction requests.

WAL recycling: sometimes the primary recycles old WAL segments before a standby (or a backup) has consumed them, and the standby can no longer find the segment it needs on the primary. Usually this is driven by checkpointing behavior, which rotates and recycles WAL segments.
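One way to keep the primary from recycling WAL a standby still needs is a physical replication slot (the slot name below is just an example):

```sql
-- The primary retains WAL until this slot's consumer has received it
SELECT pg_create_physical_replication_slot('standby1_slot');

-- The standby then sets primary_slot_name = 'standby1_slot' and restarts;
-- slot state can be watched on the primary:
SELECT slot_name, active, restart_lsn FROM pg_replication_slots;
```

Beware that a slot whose standby disappears retains WAL indefinitely and can fill the primary's disk, so monitor slot lag as well.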

Mitigating replication lag in PostgreSQL

There are several ways to mitigate replication lag in PostgreSQL, such as:

Increasing the network bandwidth: Increasing the network bandwidth between the primary and standby databases can help reduce replication lag caused by network latency.

Using asynchronous replication: With asynchronous replication, the primary database commits transactions without waiting for the standby databases to confirm receipt.

This keeps a slow standby from stalling the primary, but it means the standbys are permitted to lag, so a failover may lose the most recent transactions.

Tuning PostgreSQL configuration parameters: Tuning parameters such as wal_buffers, max_wal_senders, etc., can help improve replication performance and reduce replication lag.

Monitoring replication lag: Monitoring replication lag can help identify the cause of the lag and take appropriate actions to mitigate it.

PostgreSQL provides several views, such as pg_stat_replication and pg_stat_wal_receiver, for monitoring replication lag.
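For example, on a standby the WAL receiver's streaming status can be checked with:

```sql
SELECT status, latest_end_lsn, last_msg_receipt_time
FROM pg_stat_wal_receiver;
```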

Conclusion

Replication lag is a common issue in PostgreSQL that can seriously affect high-availability systems.

Understanding the causes of replication lag and taking appropriate measures to mitigate it can help ensure the availability and reliability of the database system.

By increasing network bandwidth, using asynchronous replication, tuning PostgreSQL configuration parameters, and monitoring replication lag, administrators can mitigate replication lag and ensure a more stable and reliable database environment.


Mar 23, 2023

Why PostgreSQL Is a Top Choice for Enterprise-level Databases


When it comes to enterprise-level databases, there are several options available in the market, but PostgreSQL stands out as one of the most popular and reliable choices. PostgreSQL is a free and open source object-relational database management system (ORDBMS) that has existed since the mid-1990s.

Over the years, it has evolved into a robust and feature-rich database that offers several advantages over other database management systems. In this blog post, we will explore some of the reasons why PostgreSQL is a top choice for enterprise-level databases.

Open source and free

  • PostgreSQL is an open source database, which means it is free to use, distribute, and modify.
  • This makes it an attractive option for businesses of all sizes, especially for startups and small businesses that may not have the budget to invest in expensive database management systems.
  • PostgreSQL is backed by a large community of developers contributing to its development, support, and documentation.

Advanced features

PostgreSQL offers a wide range of advanced features that make it a top choice for enterprise-level databases. Some of the notable features include:

  • Support for JSON and XML data types
  • Full-text search capabilities
  • Built-in replication and high availability
  • Extensible architecture with support for custom data types, functions, and operators
  • Support for foreign keys and referential integrity constraints
  • Transactions with ACID (Atomicity, Consistency, Isolation, Durability) compliance
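As a small illustration of the JSON support, here is a sketch with a hypothetical events table:

```sql
CREATE TABLE events (id serial PRIMARY KEY, payload jsonb);

INSERT INTO events (payload)
VALUES ('{"type": "login", "user": "alice"}');

-- ->> extracts a field as text
SELECT payload->>'user' AS username
FROM events
WHERE payload->>'type' = 'login';

-- A GIN index accelerates containment queries like payload @> '{"type": "login"}'
CREATE INDEX events_payload_idx ON events USING GIN (payload);
```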

Scalability

  • PostgreSQL is designed to scale with growing businesses. It can handle large datasets and complex queries with ease, and it can be easily scaled horizontally or vertically.
  • PostgreSQL supports declarative table partitioning, and sharding across multiple servers is available through extensions such as Citus or via foreign data wrappers, making it well suited to high-traffic websites and applications.

Reliability

  • PostgreSQL is known for its reliability and stability. It has a robust transactional system that ensures data integrity and consistency, even in high-transaction environments.
  • It has a proven track record of handling large volumes of data and high-traffic websites. PostgreSQL uses a multi-version concurrency control (MVCC) system, ensuring multiple users can access the same data simultaneously without conflicts or data loss.
  • It also supports multiple concurrent users and can handle complex database operations without downtime or data loss. This makes it an excellent choice for mission-critical applications that require high availability.

Flexibility

PostgreSQL offers a high degree of flexibility, allowing developers to customize it to their specific needs. It supports a wide range of data types, including JSON, and can handle complex data structures. It also offers a range of extensions and plugins that can be used to enhance its functionality, making it a versatile choice for production-ready databases.

Security

PostgreSQL offers robust security features to protect data from unauthorized access. It supports SSL encryption for secure communication and offers various authentication methods, including LDAP, Kerberos, and GSSAPI. It also supports role-based access control, which allows developers to control access to data at the user level. This makes it an ideal choice for businesses that handle sensitive data.

Integration with other tools and technologies

PostgreSQL integrates seamlessly with other tools and technologies commonly used in enterprise-level applications. It has connectors for programming languages such as Java, Python, and PHP, as well as integrations with popular data visualization tools such as Tableau and Power BI.

Conclusion

PostgreSQL is a top choice for production-ready databases due to its scalability, reliability, flexibility, security, and community support. Its features make it an ideal choice for businesses that require a robust and versatile database system. If you are looking for a database system for your enterprise-level application, PostgreSQL is definitely worth considering.

Percona provides managed PostgreSQL services that include database monitoring, performance tuning, and backup and recovery services. Percona also provides PostgreSQL Support Services for businesses and organizations that use PostgreSQL as their primary database technology.


 

Join the Percona PostgreSQL Community

Feb 20, 2023

PostgreSQL Migration From DigitalOcean DBaaS to DigitalOcean Droplet


Recently, one of our customers approached us with a unique challenge: they needed to migrate their entire PostgreSQL cluster from DigitalOcean’s Database as a Service (DBaaS) to a DigitalOcean Droplet. The reason for their migration from DBaaS to Droplets was to lower their cost. This task proved to be quite challenging, as DigitalOcean’s documentation clearly states that “We do not currently support migrating databases from clusters inside of DigitalOcean to other clusters inside of DigitalOcean.”

In short, we had to migrate the database per the client’s request, and we gave them two options:

1. pg_dump

2. Logical replication

The pg_dump method requires downtime as we must take the dump and restore it on the new server. Logical replication keeps the source database operational while transferring data to the target database. Once we reach the desired state, we can cut over to the target database.

To migrate using the logical replication method, every table to be replicated must have a primary key or a unique key.

Prerequisites for migration

To migrate an existing database to a DigitalOcean database cluster, we need to ensure logical replication is enabled on the source database, have the source database’s connection credentials, and disable or update any firewalls between the databases.

Have superuser permissions: Both preparing a database for migration and performing the migration require superuser permissions on the source database.

Make the database publicly accessible: To migrate a database, the source database’s hostname or IP address must be accessible from the public internet. Public connection information for DigitalOcean databases is in the database’s Connection Details in the control panel.

Allow remote connections: First, verify that the database allows all remote connections. This is determined by the database’s listen_addresses variable, which allows all remote connections when its value is set to *. To check its current value, run the following query in the PostgreSQL (psql) terminal:

SHOW listen_addresses;

If enabled, the command line returns:

listen_addresses
------------------
 *
(1 row)

If the output is different, allow remote connections in your database by running the following query; note that a change to listen_addresses requires a server restart to take effect:

ALTER SYSTEM SET listen_addresses = '*';

We must also change the local IPv4 connection rules to allow all incoming IPs. To do this, find the location of the configuration file pg_hba.conf with the following query:

SHOW hba_file;

Open pg_hba.conf in your text editor, such as nano: nano pg_hba.conf

Under IPv4 local connections, find and replace the IP address with 0.0.0.0/0, which allows all IPv4 addresses:

# TYPE DATABASE USER ADDRESS METHOD

# IPv4 local connections:
host all all 0.0.0.0/0 md5
# IPv6 local connections:
host all all ::/0 md5
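Unlike listen_addresses, changes to pg_hba.conf only need a configuration reload, which can be triggered from SQL:

```sql
SELECT pg_reload_conf();
```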

Enable logical replication:

Most cloud database providers have logical replication enabled by default. It may not be enabled if you are migrating a database from an on-premises server. If the database is not set up for logical replication, the migration will not work, because only your schemas, and not your data, can then be moved.

To verify that logical replication has been enabled, run the following query in the PostgreSQL (psql) terminal:

show wal_level;

If enabled, the output returns:

wal_level
-----------
logical
(1 row)

If the output is different, enable logical replication in your database by setting wal_level to logical (this change also requires a server restart):

ALTER SYSTEM SET wal_level = logical;

Change max replication slots:

After enabling logical replication, we need to verify that the database’s max_replication_slots value is equal to or greater than the number of databases in the PostgreSQL server. To check the current value, run the following query in the PostgreSQL (psql) terminal:

show max_replication_slots;

The output returns:

max_replication_slots
-----------

(1 row)

If it is smaller than the number of databases in our PostgreSQL server, adjust it by running the following query, where use_your_number is the number of databases in our server:

ALTER SYSTEM SET max_replication_slots = use_your_number;

And restart the server.

Challenges we faced during migration

There are some challenges in implementing logical replication for tables without a primary key. Logical replication can still be set up in that case in two ways, the second of which uses a unique key.

The setup follows a similar set of steps and behaves the same way; the unique key, rather than the primary key, is what identifies rows for updates.

Caveats

  • DELETE/UPDATE are not supported without a replica identity.
  • A unique index cannot be used as a replica identity if its columns allow NULLs.
  • Setting REPLICA IDENTITY to FULL
  • When no appropriate index is available as a replica identity, the replica identity can be set to FULL. In this case, all the table’s columns collectively act as the key.
  • Because every column is then logged, this generates a much larger volume of WAL.
  • This may also be slower than replication via a primary key or unique index.

Things to consider

We need to set the replica identity for the tables that are logically migrated using only a UNIQUE key, as otherwise DELETE/UPDATE on them won’t be replicated.
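Concretely, the replica identity is set per table with ALTER TABLE (the table and index names below are hypothetical):

```sql
-- Use a unique index as the replica identity; the index must be unique,
-- non-partial, non-deferrable, and cover only NOT NULL columns
ALTER TABLE orders REPLICA IDENTITY USING INDEX orders_order_no_key;

-- Fallback when no suitable index exists: log all columns (generates more WAL)
ALTER TABLE audit_log REPLICA IDENTITY FULL;
```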

After the data gets synced from the DBaaS fork to the new Droplet VM, we need to use the pg_dump and pg_restore method for sequences. A question arises here: why do we need to dump the sequences, and why can’t we replicate them via logical replication?

Logical replication is designed to track WAL changes and report committed states and values to subscribers. Replicating a sequence would be contradictory, because the current on-disk sequence value does not correspond to a committed value stored in the WAL. To remedy this, the PostgreSQL documentation suggests manually copying over the sequence values, or using a utility such as pg_dump to do the copying.

  • Dump the sequences from the DBaaS DB fork
  • Stop the DBaaS DB fork
  • Restore the sequences on the new droplet
  • Disable the logical subscriptions
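On PostgreSQL 10 and later, the setval() statements for the destination can be generated straight from the pg_sequences view on the source (pg_dump works equally well):

```sql
-- Run on the source; execute the generated statements on the destination
SELECT format('SELECT setval(%L, %s);',
              schemaname || '.' || sequencename, last_value)
FROM pg_sequences
WHERE last_value IS NOT NULL;
```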

Below is the short summary that has been followed to migrate the environment:

Source: DigitalOcean DBaaS
Destination: DigitalOcean Droplets
Process:

  • The client chose migration via a logical replication process to reduce downtime.
  • On the target VM, we installed Percona Distribution for PostgreSQL 13.7.
  • Dumped the roles from the source (DBaaS) to the destination.
  • Listed the tables that don’t have a PK and informed the client.
  • The client added a PK for some tables and a UNIQUE key for others.
  • Installed the extensions on the VM that were present on the source cluster.
  • Dumped only the schema from the source (DBaaS).
  • Restored the schema on the destination (Droplets).
  • Adjusted the logical replication-related parameters on the source and destination, like max_replication_slots, max_logical_replication_workers, and max_wal_senders.
  • Configured logical replication by creating the publication and subscription between the source and destination.
  • Once the destination was in sync, disabled the subscribers.
  • Dumped the sequences from the source and restored them on the destination.
  • Adjusted listen_addresses and the pg_hba.conf file on the destination.
  • Dropped the subscribers on the destination.
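The publication/subscription pair at the heart of the process looks like this (the names and connection string values are placeholders):

```sql
-- On the source (DBaaS) cluster:
CREATE PUBLICATION migration_pub FOR ALL TABLES;

-- On the destination Droplet:
CREATE SUBSCRIPTION migration_sub
  CONNECTION 'host=source-host port=25060 dbname=appdb user=doadmin password=***'
  PUBLICATION migration_pub;

-- On the destination: watch initial table synchronization
-- (srsubstate: i = initialize, d = copying data, s/r = synced/ready)
SELECT srrelid::regclass AS table_name, srsubstate
FROM pg_subscription_rel;
```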

Conclusion

As we all know, PostgreSQL is an open source, object-relational database built with a focus on extensibility, data integrity, and speed. Its concurrency support makes it fully ACID-compliant. We achieved the goal of migrating the customer’s data from DBaaS to Droplets by using one of PostgreSQL’s great features, logical replication, and then dumped the sequences from the source and restored them on the destination.

