Nov 19, 2025

Data Retention Policy Implementation – How and Why

Is your PostgreSQL database feeling sluggish? Are SQL statements taking more time than they used to? Are you experiencing performance cliffs (unexpected, sudden drops in performance)? Are backups taking a long time to complete? Are you getting a bigger bill for storage? Are standby rebuilds and development refreshes becoming a herculean task? Are the auditors […]

Nov 06, 2025

PostgreSQL 13 Is Reaching End of Life. The Time to Upgrade is Now!

PostgreSQL 13 will officially reach End-of-Life (EOL) on November 13, 2025. After this date, the PostgreSQL Global Development Group will stop releasing security patches and bug fixes for this version. That means if you’re still running PostgreSQL 13, you’ll soon be on your own, with no updates, no community support, and growing security risks. Why […]

Jan 14, 2025

What Hurts in PostgreSQL Part One: Temporary Tables

PostgreSQL is one of the most powerful database systems in the world. I have always been passionate about its great power, especially its modern SQL language features. However, that doesn’t mean everything is great. There are areas where it hurts. Novice users unaware of the problem might get into trouble, and I see such things […]

Jun 24, 2024

Upgrading to the New Etcd Version From 3.3 for Patroni

We have been promoting and using Patroni as the best high availability framework for PostgreSQL, and Etcd was the preferred/recommended DCS for the Patroni cluster. Both Patroni and Etcd have been part of the PostgreSQL distribution from Percona for years now. But one area where we were stuck was the Etcd version, and we continued to […]

Jun 03, 2024

Controlling Resource Consumption on a PostgreSQL Server Using Linux cgroup2

Multi-tenancy/co-hosting is always challenging. Running multiple PostgreSQL instances can help reduce the internal contention points (scalability issues) within PostgreSQL. However, the load caused by one tenant can affect other tenants, which is generally referred to as the “noisy neighbor” effect. Luckily, Linux allows users to control the resources consumed by each program […]

Apr 30, 2024

LDAP Authentication in PgBouncer Through PAM

There are many cases where external connection poolers like PgBouncer become unavoidable despite the costs and complexities associated with them. PgBouncer is one of the most popular external connection poolers for PostgreSQL. It is thin and lightweight, so it doesn’t have built-in authentication features like LDAP, which are essential for many enterprises. Luckily, PgBouncer has […]

Dec 29, 2023

Human Factors Behind Incidents: Why Settings Like “idle_session_timeout” Can Be a Bad Idea


PostgreSQL 14 introduced the parameter idle_session_timeout, and, unfortunately, many DBAs jumped to start using it without understanding, or by ignoring, the consequences. In a short span of time, it has become one of the most misused parameters in many PostgreSQL installations. There is nothing wrong with idle_session_timeout from a technical perspective; even without this parameter, I have seen frustrated DBAs run scheduled shell scripts to keep terminating old idle sessions. The new parameter simply made it easy for them to nuke all idle connections. The problem is the collateral damage it can cause.

Note for novice DBAs: There are human and organizational factors behind almost every incident, not just technical reasons. A quick-and-dirty fix made by changing parameters can have far-reaching implications. Not everything can be addressed on the database side. This is most evident in connection-related incidents. Blame games and bad decisions following one incident/outage can lead to further bad decisions and further problems.

For example, many such incidents share a common root cause: an unexpected, sudden connection explosion, a big bang of connections with hundreds or thousands of idle ones. The RCA investigation may end with only the technical side of the problem, such as “loose limits invited disaster.” But the right questions to address would be: Why did the DBAs end up setting such loose limits? What factors influenced and forced them to do so? Unfortunately, this is addressed very rarely in many organizations.

Common factors contributing to connection-related outages

1. Unrestricted connection settings: Leaving connection settings, such as max_connections and per-user limits, wide open exposes the system to malicious Denial-of-Service (DoS) attacks. These attacks can rapidly consume available connections, leaving legitimate users unable to connect and disrupting system functionality. (A sketch of such limits follows this list of factors.)

2. Inadequate application-side pooling: Suboptimal application-side pooling can lead to opening a large number of connections, and they may remain open and unused, consuming valuable resources. These connections can accumulate, eventually exhausting server resources and causing the system to become unresponsive.

3. Unexpected connection leaks: Unanticipated connection leaks can also arise from programming errors or faulty application logic. These leaks can quickly deplete the available connection pool, hindering system performance and potentially leading to crashes.

4. Human factor: Ideally, there must be stringent restrictions and enforcement on what load and how many connections can be onboarded to the database. The scrutiny should be as tight as airline security.

The problem generally begins as a conflict of interests. App-dev teams often demand unrestricted, uncontrolled access to the database. A “NO” from Ops/DBAs at the right time may save the entire system from outages. However, gatekeeping is not an easy job. Pressure builds up, and in most cases the DBAs give up and allow settings that the database server cannot accommodate. The first decision goes wrong here, and an incident can be expected down the line.

5. Organizational factor of wrong decisions: A surprising set of wrong decisions generally happens in the investigation following a major incident. The investigation may point fingers at an excessive number of connections, and it is common for Ops/DBAs to be made responsible for managing the “idle” connections, because “it is a database problem.” I would say the wrong person takes charge.

The point forgotten is that connections are owned by clients but maintained by the database. It is like a bank account: the customer owns it, but the bank maintains it. We shouldn’t be terminating or closing connections from the database side. Instead of investigating who owns them and addressing the problem at its root, DBAs often go for shortcuts like idle_session_timeout or a script to terminate all “idle” connections. The boundaries of ownership are violated, inviting the next level of problems.

There are a lot of misunderstandings about “idle” connections in the user community. We should remember that a connection is “idle” immediately after it is established. It becomes “active” when there is an SQL statement to execute, and it goes back to “idle” when the processing is over. Most of the connections from any connection pooler will also be “idle” most of the time. So, there is nothing wrong with a connection being “idle”. The point I want to make is that a connection appearing “idle” does not make it a candidate for termination. This may sound too obvious to an experienced user, but the reality is that I have ended up sitting on emergency calls to explain it.

What matters is the number of connections. The right question to ask in an RCA call is, “Who is responsible for creating the large number of connections that are idle most of the time?”
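
To make the first factor concrete: PostgreSQL supports per-role and per-database caps in addition to the instance-wide max_connections. A minimal sketch, assuming hypothetical role and database names:

ALTER ROLE app_user CONNECTION LIMIT 50;    -- cap a single service account (hypothetical role)
ALTER DATABASE appdb CONNECTION LIMIT 200;  -- cap a single database (hypothetical name)
ALTER SYSTEM SET max_connections = 300;     -- instance-wide cap; takes effect only after a restart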

Common symptoms of poor connection management

  1. Gradual server performance degradation due to the high number of connections
  2. Often, the server becomes unresponsive and/or crashes
  3. The server is running out of memory, and OOM killer terminates PostgreSQL
  4. The server is running out of available connections
  5. Applications reporting connection failures and unexpected connection drops/crashes

Problems of backend termination

I often get the question, “What is the problem with terminating connections from the backend?” In layman’s terms, it is like a bank closing your account and taking all the money in it. That can come as a big surprise to the database client/application. Most applications are not designed and tested to handle such surprises. Even psql gets taken by surprise:

postgres=> \l+
FATAL:  terminating connection due to idle-session timeout
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

So, unless the application is designed, developed, and tested to handle unexpected connection drops, it won’t be able to cope with this. The majority of applications just exit with an error report. In critical systems, this causes serious business outages.

Despite being tailored for PostgreSQL, pgBouncer is not immune to sudden connection terminations:

2023-11-14 16:17:21.967 UTC [4630] WARNING S-0x10b7940: postgres/authuser@127.0.0.1:5432 got packet 'E' from server when not linked
2023-11-14 16:17:21.967 UTC [4630] LOG S-0x10b7940: postgres/authuser@127.0.0.1:5432 closing because: server conn crashed? (age=218s)
2023-11-14 16:17:22.927 UTC [4630] LOG stats: 0 xacts/s, 0 queries/s, 0 client parses/s, 0 server parses/s, 0 binds/s, in 0 B/s, out 11 B/s, xact 0 us, query 0 us, wait 12 us
2023-11-14 16:17:40.533 UTC [4630] LOG S-0x10b7bf8: postgres/pgbtestapp@[::1]:5432 closing because: server conn crashed? (age=236s)
2023-11-14 16:17:40.533 UTC [4630] LOG C-0x10af130: postgres/pgbtestapp@[::1]:39470 closing because: server conn crashed? (age=28s)
2023-11-14 16:17:40.533 UTC [4630] WARNING C-0x10af130: postgres/pgbtestapp@[::1]:39470 pooler error: server conn crashed?
2023-11-14 16:18:06.370 UTC [4630] LOG S-0x10b7bf8: postgres/authuser@127.0.0.1:5432 new connection to server (from 127.0.0.1:46632)

Even if we ignore all other problems, the connection pooler will end up re-establishing pretty much every connection in the pool. This completely destroys the very purpose of the connection pooler.

It is not that the PostgreSQL community is unaware of this possible abuse of the parameter. It has been discussed in the community, and the implications for poolers are known and well documented.

However, users still overlook the implications, and serious mess-ups happen.

From the discussions in the PostgreSQL community, the only strong argument in favor of this parameter concerns one-off users who log in to the database directly to execute custom statements. They may accidentally leave their sessions open for a long time, which causes damage.

There were discussions about making the documentation more explicit, along these lines: consider setting this for specific users instead of as a server default. Client connections managed by connection poolers, or initiated indirectly, like those made by a remote server via postgres_fdw, should probably be excluded from this timeout.

What can we do about it without causing any serious consequences?

Obviously, DBAs may have questions like: what if user accounts or application connections really are abusive? As mentioned above, the solution is not just technical; understanding the system and implementing appropriate strategies is important.

Segregation of database accounts and role-based settings

Accounts used by individual users for interactive logins need to be differentiated from the accounts used by applications (the service accounts) in terms of user/role-level settings. PostgreSQL allows us to set parameters at different levels and scopes. More stringent restrictions should be placed on the interactive login accounts. It is very much OK to have settings like idle_session_timeout for those accounts, preferably not exceeding five minutes. Most importantly, idle_in_transaction_session_timeout should also not exceed a few seconds.

On the other hand, I would recommend NOT using idle_session_timeout for service accounts. But an idle_in_transaction_session_timeout of a couple of minutes is acceptable, as it helps us find problematic application logic.
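
A minimal sketch of such role-level settings, assuming hypothetical role names:

-- Interactive login account: strict timeouts
ALTER ROLE interactive_user SET idle_session_timeout = '5min';
ALTER ROLE interactive_user SET idle_in_transaction_session_timeout = '10s';
-- Service account: no idle_session_timeout, but catch forgotten transactions
ALTER ROLE app_service SET idle_in_transaction_session_timeout = '2min';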

Use network timeouts

There are alternative approaches if we suspect that connections are simply leaked, without the application actually holding them. TCP timeouts could be the best solution for such cases. PostgreSQL exposes TCP-related timeouts as parameters, which can be specified like any other PostgreSQL parameter.

tcp_keepalives_idle: This parameter specifies the number of seconds of inactivity after which a keepalive message will be sent to the client by the PostgreSQL server. For example, if we set the value to 120, the server will send a keepalive message to the client after two minutes of inactivity and wait for a response. If there is an acknowledgment from the client, the connection is considered valid. What if there is no acknowledgment from the client? That is where the next parameter helps.

tcp_keepalives_interval:  This is the number of seconds after which the TCP keepalive message will be retransmitted if there is no acknowledgment from the client. For example, setting a value of 20 results in sending keepalive packets every 20 seconds until an acknowledgment is obtained. What if there is no acknowledgment for any of the keepalive attempts? This is where the next parameter comes into play.

tcp_keepalives_count: This is the number of keepalive messages that can be lost before the client connection is considered “dead” and terminated.

So, the settings discussed above can be summarized as:

tcp_keepalives_idle = 120
tcp_keepalives_interval = 20
tcp_keepalives_count = 6

But you may adjust the values according to the needs of your environment.
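
One way to apply them, sketched here with ALTER SYSTEM (they can equally go straight into postgresql.conf):

ALTER SYSTEM SET tcp_keepalives_idle = 120;
ALTER SYSTEM SET tcp_keepalives_interval = 20;
ALTER SYSTEM SET tcp_keepalives_count = 6;
SELECT pg_reload_conf();  -- these parameters do not require a restart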

There is one more parameter, tcp_user_timeout, which has documented caveats and reportedly affects the behavior of the other timeout parameters. So it is not something we should reach for on day one.

From PostgreSQL 14 onwards, we have an additional option: client_connection_check_interval, which can detect whether the connection to the client has gone away. This is most useful if a client dies during a long-running statement, like those complex reporting queries from OLAP systems. By default, this check is off (value 0). Setting it to the equivalent of 5 to 10 seconds will be of great value if you are expecting client connectivity issues; note that the value is specified in milliseconds. PostgreSQL even allows us to set these values at the user level.
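
For example, a hedged sketch of a user-level setting (the role name is hypothetical; '10s' is equivalent to 10000 milliseconds):

ALTER ROLE reporting_user SET client_connection_check_interval = '10s';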

With all the TCP timeouts in place, you may see entries like the following in the PostgreSQL logs when a timeout occurs.

2023-12-20 04:37:12.026 UTC [1230] LOG:  could not receive data from client: Connection timed out
2023-12-20 04:37:12.026 UTC [1230] LOG:  disconnection: session time: 0:22:23.419 user=jobin database=postgres host=fd42:8aba:4133:fb7d:216:3eff:fe3c:7bcb port=52984

Summary

There should be a cap on what a system can accept as load; at the PostgreSQL level, that is max_connections. I would consider it the primary duty of any production DBA to protect the system from possible abuse.

Dear DBAs, please consider the following points to keep the system stable and protect it from abuse.

  1. Configure connection settings appropriately. Set connection limits, such as max_connections and per-user limits, to reasonable values that align with expected usage patterns. Regularly review and adjust these settings as needed. The max_connections setting of a PostgreSQL instance should not be too far from 10x the number of available CPUs.
  2. Encourage the use of robust application-side pooling: Implement well-configured application-side pooling mechanisms to manage connections and prevent leaks efficiently. Employ libraries or frameworks that provide reliable pooling capabilities. In case the application framework doesn’t have a good connection pooler, consider external connection poolers like pgBouncer.
  3. Connections are owned by those who create them. If there is an excessive number of idle connections, the part of the system that is creating them needs to be fixed. Terminating them silently from the backend is not a solution; it invites more problems.
  4. Never terminate connections from the backend unless there is a request to do so from the application/client who owns the connections.
  5. Avoid using parameters like idle_session_timeout at the global level; use it only at the user/role level, especially for interactive login accounts.
  6. Avoid idle_session_timeout for application/service accounts. It can be used for accounts that are used for interactive logins where the consequence of timeout is predictable.
  7. Use the alternate options discussed above, such as TCP timeouts and client_connection_check_interval, to handle leaked connections.
  8. Monitor the connections; that information is precious.

And most importantly, the crucial part: empower yourself or your DBAs with the ability to say NO to connection abuses. Junior DBAs are more vulnerable to pressure from other parts of the organization to take the wrong steps.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.

 

Download Percona Distribution for PostgreSQL Today!

Aug 30, 2023

PostgreSQL Parameters: Scope and Priority Users Should Know


PostgreSQL allows its users to set parameters at different scopes, and the same parameter can be specified in different places using different methods. And there can be conflicts. Someone might wonder why certain changes are not taking effect, so it is important to understand the scope and priority of settings.

In this blog, I am trying to list the options available to users, in increasing order of priority. The purpose is to give users a high-level view.

1. Compile time parameter settings

These are the parameters that are set at compile time. They act as the default values for PostgreSQL. We can check these values in the boot_val field of pg_settings.

select name,boot_val from pg_settings;

These compile-time settings have the least priority and can be overridden at any other level. However, some of these parameters cannot be modified by any other means, and changing them at compile time is not intended for common use. If PostgreSQL users want to change these values, they need to recompile PostgreSQL from the source code. Some are exposed through configure command-line options; some such options are:

--with-blocksize=<BLOCKSIZE>      Sets the table block size in kB. The default is 8kB.
--with-segsize=<SEGSIZE>          Sets the table segment size in GB. The default is 1GB, which means PostgreSQL creates a new file in the data directory as a table exceeds 1GB.
--with-wal-blocksize=<BLOCKSIZE>  Sets the WAL block size in kB. The default is 8kB.

Most of the parameters have compile time defaults. That is the reason why we can start running PostgreSQL by specifying a very minimal number of parameter values.
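
These compile-time choices surface as read-only preset parameters, so they can be verified on a running instance; a minimal sketch:

select name, setting, unit from pg_settings where name in ('block_size', 'segment_size', 'wal_block_size');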

2. Data directory/initialization-specific parameter settings

Parameters can also be specified at the data directory initialization time.  Some of these parameters cannot be changed by other means or are difficult to change.

For example, wal_segment_size, which determines the size of the WAL segment files, is such a parameter. PostgreSQL generates WAL segment files of 16MB by default, and this can be specified only at initialization time. This is also the level at which the decision on whether to use data_checksums needs to be taken. Checksums can be enabled later using the pg_checksums utility, but that will be a painful exercise on a big database.

The default character encoding and locale settings to be used can be specified at this level. But this can be specified at the subsequent levels also. You may refer to initdb options for more information:  https://www.postgresql.org/docs/current/app-initdb.html.
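
A hedged initdb sketch combining such initialization-time choices (the path and values are illustrative):

initdb -D /var/lib/pgsql/14/data --data-checksums --wal-segsize=32 --encoding=UTF8 --locale=en_US.UTF-8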

The parameters taken from the specific data directory initialization, which override the built-in defaults, can be checked like this:

select name,setting from pg_settings where source='override';

This override includes some of the calculated auto-tune values for that environment.

3. PostgreSQL parameters set by environment variables

PostgreSQL executables, including the postmaster, honor many environment variables, though these are generally used by client tools. The most common one used by the PostgreSQL server (postmaster) is PGDATA, which sets the parameter data_directory. These variables can be set by service managers like systemd.

$ cat /usr/lib/systemd/system/postgresql-14.service
...
Environment=PGDATA=/var/lib/pgsql/14/data/
…
Environment=PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
Environment=PG_OOM_ADJUST_VALUE=0

For many automation/scripting purposes, this is handy. Here is an example:

$ export PGDATA=/home/postgres/data
$ export PGPORT=5434
$ pg_ctl start
waiting for server to start....2023-08-04 06:53:09.637 UTC [5787] LOG:  starting PostgreSQL 15.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
2023-08-04 06:53:09.637 UTC [5787] LOG:  listening on IPv6 address "::1", port 5434
2023-08-04 06:53:09.637 UTC [5787] LOG:  listening on IPv4 address "127.0.0.1", port 5434
2023-08-04 06:53:09.639 UTC [5787] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5434"

As we can see, PostgreSQL took the port as 5434.

4. Configuration files

Probably, this is the method every novice user will be aware of. The fundamental configuration file is postgresql.conf, and it is the most common place for global settings. PostgreSQL looks for a configuration file in PGDATA by default, but an alternate location can be specified using the postmaster command-line parameter config_file. The parameter specifications can be split into multiple files and directories, because PostgreSQL supports include and include_dir directives in its configuration files. So, there can be nested/cascaded configurations.

PostgreSQL rereads all its configuration files if it receives a SIGHUP signal. If the same parameter is set in multiple locations, the last one read wins. Among all configuration files, postgresql.auto.conf gets the highest priority because it is the file read last. That is where all “ALTER SYSTEM SET/RESET” commands keep their information.
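
For example, a minimal sketch of the ALTER SYSTEM route, which writes the value to postgresql.auto.conf and picks it up on reload:

ALTER SYSTEM SET log_min_duration_statement = '250ms';
SELECT pg_reload_conf();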

5. Command line argument to postmaster

The postmaster, aka postgres, can take parameter settings as command-line arguments (it has features to get the values, too). This is one of the most reliable methods, used by many external tools that manage the PostgreSQL service. For example, the high availability solution Patroni passes some of the most critical parameters as command-line arguments. Here is how the postgres process with command-line options looks in a Patroni environment:

/usr/pgsql-14/bin/postgres -D /var/lib/pgsql/14/data --config-file=/var/lib/pgsql/14/data/postgresql.conf --listen_addresses=0.0.0.0 --port=5432 
 --cluster_name=kc_primary_cluster --wal_level=replica --hot_standby=on --max_connections=100 --max_wal_senders=5 --max_prepared_transactions=0 
 --max_locks_per_transaction=64 --track_commit_timestamp=off --max_replication_slots=10 --max_worker_processes=8 --wal_log_hints=on

This way, Patroni can ensure that no local configuration mistake adversely affects the availability and stability of the cluster. But changing these is possible only at server startup. Obviously, they take precedence over values from the configuration files, and their scope is the entire instance. This answers a frequent question from Patroni users: why can’t they change some of the parameters directly in the parameter file? PostgreSQL users can check which parameters came as command-line arguments like this:

postgres=# select name,setting from pg_settings where source='command line';
            name            |      setting
----------------------------+-------------------
 cluster_name               | perconapg_cluster
 hot_standby                | on
 listen_addresses           | 0.0.0.0
 max_connections            | 100
 max_locks_per_transaction  | 64
 max_prepared_transactions  | 0
 max_replication_slots      | 10
 max_wal_senders            | 5
 max_worker_processes       | 8
 port                       | 5432
 track_commit_timestamp     | off
 wal_level                  | replica
 wal_log_hints              | on
(13 rows)

Parameters specifications up to this level can have “postmaster” context.  The concept of “context” is discussed as part of the next section.

6. Database level setting

All options discussed so far have a global scope, meaning they apply to the entire instance. But there can be reasons why a PostgreSQL user wants a change for a specific database only. For example, one database might handle an OLTP workload where query parallelism is not really needed and may even have an adverse impact, while another database might be an OLAP system.

postgres=# ALTER DATABASE newdb SET max_parallel_workers_per_gather = 4;
ALTER DATABASE

The concept of context

At this stage, we should recollect another concept: the “context” of a parameter. For example, the network port at which PostgreSQL listens cannot be changed at the individual database level; changing such a parameter requires a PostgreSQL restart. So we say that the context of the parameter “port” is postmaster: a change requires the postmaster, the main process of PostgreSQL, to restart. Please refer to the documentation (https://www.postgresql.org/docs/current/view-pg-settings.html) to understand the different contexts of PostgreSQL parameters. From this level onwards, we are not allowed to change such parameters; any attempt will be prevented.

postgres=# ALTER DATABASE db1 SET max_connections=100;
ERROR:  parameter "max_connections" cannot be changed without restarting the server

max_connections is something to be specified at the global (instance) level through the postmaster, and changing it requires a restart.

There is another set of parameters that must be communicated through the postmaster, even though they can be changed without restarting the server. That context is called sighup: we can signal the postmaster, and it will re-read such parameters and propagate them to all its child processes. Changing them at the database level is prevented.

postgres=# ALTER DATABASE db1 SET log_filename='postgresql-DB1.log';
ERROR:  parameter "log_filename" cannot be changed now

You may even consider looking at the PostgreSQL source code:

https://github.com/postgres/postgres/blob/6fde2d9a005a5bc04aa059d3faeb865c8dd322ce/src/backend/utils/misc/guc.c#L3376

for a much deeper understanding of the logic of “context” and what is allowed in which level.

7. User-level settings

Each user can have their preferred parameter settings so that all sessions created by that user will have those settings in place. Please remember that user-level settings have a higher preference than database-level settings. Users can check their own user-level settings like this:

select name,setting,source,context from pg_settings where  source='user';
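
Such a setting would have been created with something along these lines (a hedged sketch; the role and value are illustrative):

ALTER USER admin SET work_mem = '256MB';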

8. Database – user combination

PostgreSQL allows us to have parameter settings that will be applicable when a particular user/role connects to a particular database.

For example:

ALTER USER admin IN DATABASE db1 SET max_parallel_workers_per_gather=6;

Setting at this level has even higher priority than everything mentioned before.

select name,setting,source,context from pg_settings where name='max_parallel_workers_per_gather';
               name               | setting |    source     | context
----------------------------------+---------+---------------+---------
 max_parallel_workers_per_gather  | 6       | database user | user
(1 row)

9. Parameters by the client connection request

There is an option to specify parameters while making a new connection; they can be passed to PostgreSQL as part of the connection string.

For example, suppose I want to connect to the database to perform some bulk data loading and manipulation (ETL), and I don’t want to wait for any WAL writing. If there is a crash in between, I am OK with performing the ETL again. So, I am going to request a connection with synchronous_commit off.

$ psql "host=localhost user=postgres options='-c synchronous_commit=off'"
psql (14.8)
Type "help" for help.

postgres=# select name,setting,source,context from pg_settings where name='synchronous_commit';
        name        | setting | source | context
--------------------+---------+--------+---------
 synchronous_commit | off     | client | user
(1 row)

10. Session-level setting

Each session can decide its own settings at any point in time or execution, and a session is allowed to modify its session-level settings as and when required.

postgres=# set jit=off;
SET
postgres=# select name,setting,source,context from pg_settings where name='jit';
 name | setting | source  | context
------+---------+---------+---------
 jit  | off     | session | user
(1 row)

A good use case: suppose we are going to rebuild a big index and know that it is going to use considerable maintenance_work_mem. Setting this at the session level simplifies our life without affecting other sessions.

set maintenance_work_mem = '4GB';

11. Transaction-level settings

PostgreSQL allows us to specify parameters at a very small scope, like transaction level.

Here is an example of discouraging sequential scans in a particular transaction block.

postgres=# BEGIN;
BEGIN
postgres=*# SET LOCAL enable_seqscan=off;
SET

I prefer transaction-level settings because the changes are very local to the transaction, and they will be reverted back once the transaction is completed. This is the most preferred place to set the work_mem to minimize the impact.
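
As a small continuation of the transaction block above, the LOCAL setting vanishes once the transaction ends (a sketch):

postgres=*# COMMIT;
COMMIT
postgres=# show enable_seqscan;
 enable_seqscan
----------------
 on
(1 row)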

12. Object-level settings

PostgreSQL allows us to specify parameters for a specific program block, like a PL/pgSQL function. The setting goes as part of the function definition.

Here is an example of the function definition to test the function-level settings.

CREATE OR REPLACE FUNCTION checkParams()
 RETURNS BOOLEAN
as $$
DECLARE
  nm TEXT;
  setng TEXT;
  cntxt TEXT;
  src TEXT;
BEGIN
SELECT name,setting,context,source INTO nm,setng,cntxt,src from pg_settings where  name='enable_partitionwise_join';
RAISE NOTICE 'Parameter Name: % value:%  Context:%  Source:%',nm,setng,cntxt,src;
RETURN true;
END;
$$ LANGUAGE plpgsql
SET enable_partitionwise_join = 'on'
SET enable_seqscan = 'off';

PostgreSQL parameters summary

PostgreSQL parameter specification is very flexible and powerful, so understanding the scope, context, and priority of settings is important for every user. A rule of thumb: the broader the scope, the lower the priority.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community in a single distribution, designed and tested to work together.

 

Download Percona Distribution for PostgreSQL Today!

Jul 20, 2023

How To Measure the Network Impact on PostgreSQL Performance


It is very common to see many infrastructure layers standing between a PostgreSQL database and the application server. The most common ones are connection poolers, load balancers, routers, firewalls, etc. We often forget, or take for granted, the network hops involved and the additional overhead they create on the overall performance. But in many cases this overhead causes severe performance penalties and overall throughput deterioration. I have been trying to get a good estimate of this overhead for quite some time. Previously, I wrote about how the volume of data transmitted as part of SQL execution, as well as the cursor location, affects overall performance.

Meanwhile, Hans-Jürgen Schönig’s presentation, which brought up the old discussion of Unix socket vs. TCP/IP connections, triggered me to write about other aspects of the network impact on performance. He demonstrated a specific case of 2x performance degradation when using a TCP/IP connection.

How to detect and measure the impact

There is no easy mechanism for measuring the impact of network overhead. But a very close analysis of wait events from pg_stat_activity can tell us the story as closely as possible, so we should be sampling the wait events. Many methods exist for wait-event sampling, including extensions, but I prefer not to install special tools or extensions in a user environment just for wait event sampling. At Percona Support, we use pg_gather to collect and study the wait events, because it is a standalone SQL script and doesn’t need anything installed on the database systems. It is designed to be very lightweight as well; 2,000 samples are collected per session.
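
For a quick manual look without any tooling, repeated sampling of pg_stat_activity gives a rough picture; a minimal sketch (pg_gather does this far more rigorously):

select wait_event_type, wait_event, count(*)
from pg_stat_activity
where state = 'active'
group by 1, 2
order by 3 desc;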

The pg_gather analysis report can show the wait events and other information associated with each session.


But in this blog, I will be discussing and highlighting only the wait events portion of it while going through different types of workloads and how network performance shows up in wait events.

Case 1: Query retrieving a large number of rows

Let’s consider the case of pg_dump taking a logical backup on a different machine.  If the network is fast, we may see a good amount of CPU utilization and “DataFileRead” as a wait event.

Of course, there are “ClientWrite” events, the wait event related to writing data to the client (pg_dump in this case). If the client is a lightweight tool like psql and the network is really fast, “ClientWrite” may even become invisible.

But let’s see what the wait events look like if the network slows down.

We can see that the CPU utilization and “DataFileRead” wait events have dropped, indicating that overall server-side session activity has slowed down. At the same time, “ClientWrite” shot up to 1,821 samples, indicating that the session is spending considerable time sending data to its client (pg_dump). There is also “ClientRead,” indicating that the acknowledgment from pg_dump is taking time.

The spike in “ClientWrite” does not depend on the client tool. The following is a screenshot of a regular psql session running a query that retrieves a large number of records.

This excessive “ClientWrite” is good enough to spot the problem in these cases.

Case 2: Bulk data loading

This is the opposite of the previous case, but PostgreSQL has a lot more work to do for a write operation with bulk data. The following wait events were captured on a really fast/low-latency network.

Obviously, the PostgreSQL process has to spend time in “DataFileExtend,” “WALWrite,” and “WALSync.” Now, if the network slows down, many of those wait events we saw may become invisible as the performance bottleneck emerges.

The following are the wait events from the same bulk data load over a slower network.

As we can see, “ClientRead” has become the major wait event. This means the server session is spending more time reading from its client.

The change may not be dramatic in many systems, but overall “ClientRead” has become more prominent.

Case 3: Impact on transactions

One might ask what is so special about transactions. In an OLTP workload, statements may be simple and small enough not to cause any observable network impact. But the back-and-forth communication between the server and the client can result in unwanted delays between statements and the final commit or rollback. Yes, I mean delays/gaps between each statement.

The following are the wait events for micro-transactions using pgbench over a fast network.

Obviously, there are high WAL-related wait events and CPU usage. But we can also see considerable “ClientRead.” This happens because there is a lot of network interaction for microtransactions. ClientRead is unavoidable for transactions, and it’s OK to expect 5-10% of it.

But as the network slows down, “ClientRead” becomes increasingly prominent. The following is the information from the same pgbench transaction workload over a slower network.

The ClientRead became the biggest wait event in this case.

You might wonder what the “Net/Delay*” showing up here is. This additional analysis is available in the new version of pg_gather (version 21) to assess the delay outside the transaction block. See the next section for details.

Case 4: Connection utilization

As network latency increases, the client connection won’t be able to use the server session to the extent possible. The server session has to either wait on “ClientRead”/“ClientWrite” or sit idle. Either way, this can drastically affect the throughput of the system.

Within a transaction, the delay is captured as “ClientRead,” but the delay between two transactions is not captured because the session becomes “idle” momentarily. The new version of pg_gather estimates these momentary switches to idle and reports the wasted server time as “Net/Delay*”. The waste could be due to network delays or poor application response; from the database side, it is difficult to distinguish between the two. But “Net/Delay*” can give a good idea of how much server time is wasted.

If it is possible to install PostgreSQL client tools on the application server, it is easy to simulate a load, study both the network delay and the application-side response delay, and compare them with the actual data.

The delay/latency becomes more visible when there is a lot of back-and-forth communication between the client and server. This can be easily tested by creating a single-statement file (idea taken from Hans-Jürgen):

echo "SELECT 1" > query.sql

This can be executed against a remote database over a TCP connection for a specified number of seconds.

$ pgbench -h 10.197.42.1 -T 20 -f query.sql

On a fast network between my servers, I could get the following result as TPS of a single session.

…
latency average = 0.030 ms
initial connection time = 5.882 ms
tps = 32882.734311 (without initial connection time)

But the wait event analysis by pg_gather tells me that more time is spent on Net/Delay*.

It makes sense because “SELECT 1” doesn’t have much to do at the server, and this workload is all about sending back-and-forth communication.

With a local Unix socket connection, the single-session throughput more than doubled!

latency average = 0.013 ms
initial connection time = 1.498 ms
tps = 75972.733205 (without initial connection time)

But the wait event analysis tells us that client-server communication is still the major time consumer. 🙁

This kind of highly interactive workload could benefit from server-side programming (stored procedures/functions) or even an extension. Interestingly, CPU usage is proportionally lower relative to TPS when a Unix socket connection is used; that is an important point to note. “ClientRead” increased because more data was transferred from the client.

If the network slows down in this case, the “Net/Delay*” also increases, and CPU usage and TPS drop because the session spends more time doing nothing between processing two statements.

Since this particular workload has no transactions and sends less data to the server, “ClientRead” can drop to an unnoticeable level, as we saw.

Summary

The wait event information from pg_stat_activity can tell us many details about performance and network congestion. Not just the sum of events: the gap between two wait events and their pattern also holds a lot of information to dig into. Properly collected and analyzed data tells the story from the PostgreSQL perspective and how it experiences the network. More importantly, the analysis becomes independent of database hosting and OS-level tools. No sophisticated tools or frameworks are required to achieve this; standalone SQL scripts like this one can be handy in spotting problems and bottlenecks. Even though this blog post is specific to the network, wait event analysis can be applied generically to many cases.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community in a single distribution, designed and tested to work together.

 

Download Percona Distribution for PostgreSQL Today!

Apr 24, 2023

PostgreSQL Indexes Can Hurt You: Negative Effects and the Costs Involved


Indexes are generally considered the panacea of SQL performance tuning, and PostgreSQL supports different types of indexes catering to different use cases. I keep seeing articles and talks on “tuning” that discuss how creating new indexes speeds up SQL, but rarely ones discussing removing them. The urge to create more and more indexes causes severe damage in many systems. Many times, removing indexes is what we should do first, before considering any new indexes, for the benefit of the entire system. Surprised? Understanding the consequences and overhead of indexes can help you make an informed decision and potentially save the system from many potential problems.

At a very basic level, we should remember that indexes are not free of cost. The benefits come at a cost in terms of performance and resource consumption. The following is a list of ten problems/overheads that excessive use of indexes can cause. This post is about PostgreSQL, but most of these problems also apply to other database systems.

1. Indexes penalize the transactions

We might see an improvement in the performance of a SELECT statement after adding an index. But we should not forget that the performance gains come with a cost to the transactions on the same table. Conceptually, every DML on a table needs to update all the indexes of the table. Even though there are a lot of optimizations for reducing the write amplification, it is a considerable overhead.

For example, let’s assume that there are five indexes on a table; every INSERT into the table will result in an INSERT of the index record on those five indexes. Logically, five index pages also will be updated. So effectively, the overhead is 5x.

2. Memory usage

Index pages must be in memory, regardless of whether any query uses them, because they need to be updated by transactions. Effectively, less memory remains available for the table pages. The more indexes, the greater the memory requirement for effective caching. If we don’t increase the available memory, this starts hurting the performance of the entire system.

3. Random writes: Updating indexes is more costly

Unlike inserts into a table, where new rows can be placed in any page with free space, an index entry must go into the specific page determined by its key, so consecutive entries are less likely to land on the same page. Indexes like B-Tree are known to cause more random writes.

4. Indexes need more cache than tables

Due to random writes and reads, indexes need more of their pages to be in the cache. Cache requirements for indexes are generally much higher than for the associated tables.

5. WAL generation

In addition to the WAL records for table updates, there will also be WAL records for indexes; these support crash recovery and replication. If you are using wait event analysis tools/scripts like pg_gather, the overhead of WAL generation will be clearly visible. The actual impact depends on the index type.


This is a synthetic test case, but if WAL-related wait events appear among the top wait events, it is a matter of concern for a transactional system, and we should take every step to address it.

 

Download Percona Distribution for PostgreSQL Today!

 

6. More and more I/O

Not only are WAL records generated; we will have more dirtied pages as well. As the index pages get dirtied, they must be written back to the data files, leading to more I/O again (the “DataFileWrite” wait event, as seen in the previous screenshot).

Another side effect is that indexes increase the total active dataset size. By “active dataset,” I mean the tables and indexes that are frequently queried and used. As the size of the active dataset increases, the cache becomes less and less efficient. A less effective cache results in more data file reads, so read I/O increases. This is in addition to the read I/O required to bring the additional index pages in from storage for specific queries.

Again, the pg_gather report of another system, one with mainly SELECT queries, shows this problem. As the active dataset increases, PostgreSQL has no choice but to bring pages in from storage.


A significant percentage of “DataFileRead” sustained over a long duration indicates that the active dataset is much bigger than what is cacheable.

7. Impact on VACUUM/AUTOVACUUM

The overhead is not only in inserting or updating index pages, as discussed in the previous points. There is also maintenance overhead, since indexes need cleanup of old tuple references.

I have seen cases where autovacuum workers on a single table run for a very long duration because of the size of the table and, most importantly, the excessive number of indexes on it. In fact, it is widespread for users to see their autovacuum worker “stuck” for hours without showing any progress. This happens because index cleanup is an opaque stage of autovacuum: it is not visible through views like pg_stat_progress_vacuum, other than the phase being reported as “vacuuming indexes.”

Indexes can get bloated and become less efficient over time.  Periodic index maintenance (REINDEX) might be needed in many systems.

8. Tunnel vision while tuning

Tunnel vision is the loss of the peripheral field of view. A user may concentrate on a particular SQL statement in an attempt to “tune” it and decide to create indexes. By creating an index to tune one query, we shift more system resources to that query. It may then give more performance to that particular statement while penalizing others.

But as we keep creating more and more indexes to tune other queries, the resources shift again toward those queries. This leads to a situation where the effort to tune every query penalizes every other query. Ultimately, everyone is hurt, and there are only losers in this war. Someone trying to tune a system should consider how every part of it can coexist (maximizing business value) rather than chase the absolute maximum performance of a particular query.

9. Greater storage requirement

Almost every day, I see cases where indexes take more storage than the tables themselves.


This may sound trivial to those with more money to spend on storage, but we should remember that it has a cascading effect. The total database size grows to a multiple of the actual data. So, obviously, backups take more time, storage, and network resources, and the same backup can put more load on the host machine. This also increases the time needed to restore a backup and recover from it. Bigger databases affect many things, including the time to build standby instances.

10. Indexes are more prone to corruption

I am not just talking about rarely occurring index-related bugs, like the silent index corruption in PostgreSQL 14 or index corruption due to glibc collation changes, which keep popping up now and then and affect many environments even today. Over decades of working with databases, I have observed that index corruption is reported far more frequently than table corruption. (I hope anyone involved with PostgreSQL for years, who has seen hundreds of cases, will agree with me.) As we increase the number of indexes, we increase that probability.

What should we do?

A set of critical questions should accompany any new index consideration: Is it essential to have this index, or is it just speeding up a query at the cost of one more index? Is there a way to rewrite the query to get better performance? Is it OK to discard the small gains and live without the index?

Existing indexes also require a critical review over time. All unused indexes (those with idx_scan as zero in pg_stat_user_indexes) should be considered for dropping. Scripts like the one from pgexperts can help with deeper analysis.
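
A minimal sketch of such a review, listing never-scanned indexes along with the storage they occupy:

select schemaname, relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) as index_size
from pg_stat_user_indexes
where idx_scan = 0
order by pg_relation_size(indexrelid) desc;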

The upcoming PostgreSQL 16 adds one more column to pg_stat_user_indexes / pg_stat_all_indexes, named last_idx_scan, which can tell us when the index was last used (a timestamp). This will help us take a well-informed look at all the indexes in the system.

Summary

The summary in simple words: indexes are not cheap, and their cost can be manifold. Indexes are not always good, and sequential scans are not always bad, either. My humble advice is to avoid looking at individual query improvements as the first step, because it is a slippery slope. A top-down approach to tuning yields better results, starting from the host machine, the operating system, PostgreSQL parameters, the schema, etc. An objective cost-benefit analysis is important before creating an index.

Our world-class PostgreSQL training teaches you all about indexes: when to use them, when not to use them, how they impact your system, etc. Come take our intensive instructor-led training course.

 

Learn more about Percona Training
