PostgreSQL locking visibility for application developers and DBAs is in most cases related to heavyweight locks. Complex database locking operations require full instrumentation using views from the system catalog. It should be clear which object is locked by a specific database “backend” process. An alternative name for any lock is “bottleneck”. In order to make database operations parallel we should split a single bottleneck into multiple operation specific tasks.
This is the second part of three blog posts related to table level locks. The previous post was about row-level locks, and a subsequent post reviews the latches protecting internal database structures.
Example environment
A simple table with several rows:
CREATE TABLE locktest (c INT); INSERT INTO locktest VALUES (1), (2);
Helper view
In order to check different types of these locks, let’s create a helper view as suggested by Bruce Momjian in his presentation:
CREATE VIEW lockview AS SELECT pid, virtualtransaction AS vxid, locktype AS lock_type, mode AS lock_mode, granted, CASE WHEN virtualxid IS NOT NULL AND transactionid IS NOT NULL THEN virtualxid || ' ' || transactionid WHEN virtualxid::text IS NOT NULL THEN virtualxid ELSE transactionid::text END AS xid_lock, relname, page, tuple, classid, objid, objsubid FROM pg_locks LEFT OUTER JOIN pg_class ON (pg_locks.relation = pg_class.oid) WHERE -- do not show our view’s locks pid != pg_backend_pid() AND -- no need to show self-vxid locks virtualtransaction IS DISTINCT FROM virtualxid -- granted is ordered earlier ORDER BY 1, 2, 5 DESC, 6, 3, 4, 7;
RowShareLock (ROW SHARE)
Many applications use the read-modify-write paradigm. For instance, the application fetches single object fields from a table, modifies the data, and saves the changes back to the database. In a multi-user environment, different users could modify the same rows in the course of this transaction. We can get inconsistent data with just a plain select. In response to user demands, almost all SQL databases have SELECT … FOR SHARE locking. This feature prevents the application entity from making data modifications until the locker transaction commits or rolls back.
For example:
- There is a user with multiple bank accounts stored in an accounts table, with total_amount stored in a bank_clients table.
- In order to update the total_amount field, we should prevent modification of all rows related to the specific bank client.
- It would be better to use a single update statement to calculate total_amount and select it from the accounts table. If the update requires external data, or some action from the user, then several statements are required
START TRANSACTION; SELECT * FROM accounts WHERE client_id = 55 FOR SHARE; SELECT * FROM bank_clients WHERE client_id=55 FOR UPDATE; UPDATE bank_clients SET total_amount=38984.33, client_status='gold' WHERE client_id=55; COMMIT;
The SELECT FOR SHARE statement creates a “RowShareLock” lock on the relation locktest.
Here’s exactly the same lock created with an SQL statement:
BEGIN; LOCK TABLE locktest IN ROW SHARE MODE;
A single heavyweight RowShareLock is required regardless of the number of rows locked by the query.
This is illustrated with an unfinished transaction in the following example. Start the unfinished transaction, and select from lockview in a second connection to the database:
BEGIN; SELECT * FROM locktest FOR SHARE; -- In second connection: postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview; pid | vxid | lock_type | lock_mode | granted | xid_lock | relname -------+------+---------------+---------------+---------+----------+---------- 21144 | 3/13 | transactionid | ExclusiveLock | t | 586 | 21144 | 3/13 | relation | RowShareLock | t | | locktest
RowExclusiveLock (ROW EXCLUSIVE)
Real queries that modify rows also require heavyweight locks on tables, one per table.
The next example uses a DELETE query, but an UPDATE will have the same effect.
All commands that modify data in a table obtain a ROW EXCLUSIVE lock.
BEGIN; DELETE FROM locktest; -- second connection postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview; pid | vxid | lock_type | lock_mode | granted | xid_lock | relname -------+------+---------------+------------------+---------+----------+---------- 10997 | 3/6 | transactionid | ExclusiveLock | t | 589 | 10997 | 3/6 | relation | RowExclusiveLock | t | | locktest
This new lock is incompatible with the previous FOR SHARE example.
SELECT * FROM locktest FOR SHARE
waits for the delete transaction to finish or abort:
postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname,page,tuple FROM lockview; pid | vxid | lock_type | lock_mode | granted | xid_lock | relname | page | tuple -------+------+---------------+------------------+---------+----------+----------+------+------- 10997 | 3/6 | transactionid | ExclusiveLock | t | 589 | | | 10997 | 3/6 | relation | RowExclusiveLock | t | | locktest | | 11495 | 5/9 | relation | RowShareLock | t | | locktest | | 11495 | 5/9 | tuple | RowShareLock | t | | locktest | 0 | 1 11495 | 5/9 | transactionid | ShareLock | f | 589 | | |
Queries modifying table content also lock all indexes, even if the index does not contain modified fields.
-- preparation CREATE INDEX c_idx2 ON locktest (c); ALTER TABLE locktest ADD COLUMN c2 INT; CREATE INDEX c2_idx ON locktest(c2); -- unfinished example transaction BEGIN; UPDATE locktest SET c=3 WHERE c=1; -- second connection postgres=# SELECT * FROM lockview; pid | vxid | lock_type | lock_mode | granted | xid_lock | relname | page | tuple | classid | objid | objsubid ------+--------+------------+------------------+---------+----------+----------+------+-------+---------+-------+---------- 3998 | 3/7844 | virtualxid | ExclusiveLock | t | 3/7844 | | | | | | 3998 | 3/7844 | relation | RowExclusiveLock | t | | c2_idx | | | | | 3998 | 3/7844 | relation | RowExclusiveLock | t | | c_idx | | | | | 3998 | 3/7844 | relation | RowExclusiveLock | t | | c_idx2 | | | | | 3998 | 3/7844 | relation | RowExclusiveLock | t | | locktest | | | | |
ShareLock (SHARE)
The non-concurrent version of CREATE INDEX prevents table updates, e.g. DROP TABLE or INSERT or DELETE, with ShareLock.
BEGIN; CREATE INDEX c_idx ON locktest (c); -- second connection postgres=# SELECT * FROM lockview; pid | vxid | lock_type | lock_mode | granted | xid_lock | relname | page | tuple | classid | objid | objsubid ------+--------+---------------+---------------------+---------+----------+----------+------+-------+---------+-------+---------- 3998 | 3/7835 | virtualxid | ExclusiveLock | t | 3/7835 | | | | | | 3998 | 3/7835 | transactionid | ExclusiveLock | t | 564 | | | | | | 3998 | 3/7835 | relation | AccessExclusiveLock | t | | | | | | | 3998 | 3/7835 | relation | ShareLock | t | | locktest | | | | |
You can execute multiple CREATE INDEX queries in parallel unless the index name is exactly the same. The wait happens on the row lock (ShareLock with “transactionid” type) in the pg_class table.
Note that there is also AccessExclusiveLock lock with type “relation”, but it’s not a table level one.
ShareUpdateExclusiveLock (SHARE UPDATE EXCLUSIVE)
These database maintenance operations need to take a ShareUpdateExclusiveLock:
- ANALYZE table
- VACUUM (without full) runs.
- CREATE INDEX CONCURRENTLY
The
ANALYZE tablename;
statement updates table statistics. The query planner/optimizer is able to provide the best plans for query execution only if the statistics are up to date.
BEGIN; ANALYZE locktest; -- in second connection postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview; pid | vxid | lock_type | lock_mode | granted | xid_lock | relname -------+------+---------------+--------------------------+---------+----------+---------- 10997 | 3/7 | transactionid | ExclusiveLock | t | 591 | 10997 | 3/7 | relation | ShareUpdateExclusiveLock | t | | locktest
There is no conflict between RowExclusiveLock and ShareUpdateExclusiveLock. UPDATE/DELETE/INSERT could still modify rows during ANALYZE.
VACUUM and CREATE INDEX CONCURRENTLY can be executed only outside a transaction. To see the effects of these statements in lockview, execute a conflicting transaction first e.g. run ANALYZE in a transaction, or run VACUUM against a huge table.
CREATE INDEX CONCURRENTLY locking can be confusing. SHARE UPDATE EXCLUSIVE lock does not conflict with a ROW EXCLUSIVE lock that’s used for DELETES, INSERT and UPDATES. Unfortunately, CREATE INDEX CONCURRENTLY waits until active transactions are finished twice due to full table scans:
“In a concurrent index build, the index is actually entered into the system catalogs in one transaction, then two table scans occur in two more transactions. Before each table scan, the index build must wait for existing transactions that have modified the table to terminate.” PostgreSQL Documentation
AccessExclusiveLock (ACCESS EXCLUSIVE)
This lock conflicts with any other locks and is used by these statements:
- CREATE RULE
- DROP TABLE
- DROP INDEX
- TRUNCATE
- VACUUM FULL
- LOCK TABLE (default mode)
- CLUSTER
- REINDEX
- REFRESH MATERIALIZED VIEW (without CONCURRENTLY)
BEGIN; CREATE RULE r_locktest AS ON INSERT TO locktest DO INSTEAD NOTHING; -- second connection postgres=# select pid,vxid,lock_type,lock_mode,granted,xid_lock,relname from lockview; pid | vxid | lock_type | lock_mode | granted | xid_lock | relname -------+------+---------------+---------------------+---------+----------+---------- 10997 | 3/19 | transactionid | ExclusiveLock | t | 596 | 10997 | 3/19 | relation | AccessExclusiveLock | t | | locktest
More importantly, drop index requires access exclusive locks for both table and index:
BEGIN; DROP INDEX c_idx; -- second connection postgres=# SELECT * FROM lockview; pid | vxid | lock_type | lock_mode | granted | xid_lock | relname | page | tuple | classid | objid | objsubid ------+--------+---------------+---------------------+---------+----------+----------+------+-------+---------+-------+---------- 3998 | 3/7839 | virtualxid | ExclusiveLock | t | 3/7839 | | | | | | 3998 | 3/7839 | transactionid | ExclusiveLock | t | 569 | | | | | | 3998 | 3/7839 | relation | AccessExclusiveLock | t | | c_idx | | | | | 3998 | 3/7839 | relation | AccessExclusiveLock | t | | locktest | | | | |
Note: This is the most dangerous type of lock. Avoid running queries requiring access exclusive lock in production, or at least put the application in maintenance mode.
ExclusiveLock
Meanwhile, SQL commands don’t use ExclusiveLock, except for with the general LOCK TABLE statement. This lock prevents all requests except for a non-locking select (i.e. without FOR SHARE/UPDATE).
BEGIN; LOCK TABLE locktest IN EXCLUSIVE MODE; -- second connection postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview; pid | vxid | lock_type | lock_mode | granted | xid_lock | relname -------+------+-----------+---------------+---------+----------+---------- 10997 | 3/21 | relation | ExclusiveLock | t | | locktest
Savepoints
Savepoint produces an additional ExclusiveLock of transactionid type with new xid value.
BEGIN; SELECT * FROM locktest FOR SHARE; SAVEPOINT s1; SELECT * FROM locktest FOR UPDATE; -- second connection postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview; pid | vxid | lock_type | lock_mode | granted | xid_lock | relname -------+------+---------------+-----------------+---------+----------+---------- 10997 | 3/37 | transactionid | ExclusiveLock | t | 602 | 10997 | 3/37 | transactionid | ExclusiveLock | t | 603 | 10997 | 3/37 | relation | AccessShareLock | t | | c_idx 10997 | 3/37 | relation | RowShareLock | t | | locktest
pg_advisory_lock
Sometimes application developers require synchronization between processes. In such systems, the application creates and removes locks frequently. Systems with a row-based locks implementation tend to cause table bloat.
There are many functions related to advisory locks:
- per session or per transaction
- wait if lock is not available or immediately return false
- exclusive or shared
- 64-bit or two 32-bit integers resource identifiers
Imagine that we have several cron jobs and that the application should prevent simultaneous runs of the same script. Next, each script can check if a lock is available in PostgreSQL for specific integer job identifier:
postgres=# SELECT pg_try_advisory_lock(10); pg_try_advisory_lock ---------------------- t -- second connection postgres=# SELECT * FROM lockview; pid | vxid | lock_type | lock_mode | granted | xid_lock | relname | page | tuple | classid | objid | objsubid ------+------+-----------+---------------+---------+----------+---------+------+-------+---------+-------+---------- 3998 | 3/0 | advisory | ExclusiveLock | t | | | | | 0 | 10 | 1 -- other connections SELECT pg_try_advisory_lock(10); pg_try_advisory_lock ---------------------- f
The query produces ExclusiveLock with type advisory.
Deadlocks
Any system with multiple locks tends to have a deadlock situation when queries will never finish. The only way to resolve such issues: kill one of blocked statements. More importantly, deadlock detection is an expensive procedure in PostgreSQL. A check for deadlock only happens when a transaction is locked for deadlock_timeout milliseconds—after one second by default.
Here is an illustration of a deadlock situation for two different connections A and B:
Any deadlock starting from lock wait.
A: BEGIN; SELECT c FROM locktest WHERE c=1 FOR UPDATE; B: BEGIN; SELECT c FROM locktest WHERE c=2 FOR UPDATE; SELECT c FROM locktest WHERE c=1 FOR UPDATE;
You are not alone with the identification of deadlocks, as the pg_stat_activity system view helps you to find statements and transactions causing lock waits:
postgres=# SELECT pg_stat_activity.pid AS pid, query, wait_event, vxid, lock_type, lock_mode, granted, xid_lock FROM lockview JOIN pg_stat_activity ON (lockview.pid = pg_stat_activity.pid); pid | query | wait_event | vxid | lock_type | lock_mode | granted | xid_lock -------+----------------------------+---------------+------+---------------+---------------------+---------+---------- 10997 | SELECT ... c=1 FOR UPDATE; | ClientRead | 3/43 | transactionid | ExclusiveLock | t | 605 10997 | SELECT ... c=1 FOR UPDATE; | ClientRead | 3/43 | advisory | ExclusiveLock | t | 10997 | SELECT ... c=1 FOR UPDATE; | ClientRead | 3/43 | relation | AccessShareLock | t | 10997 | SELECT ... c=1 FOR UPDATE; | ClientRead | 3/43 | relation | RowShareLock | t | 11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | transactionid | ExclusiveLock | t | 606 11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | advisory | ExclusiveLock | t | 11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | relation | AccessShareLock | t | 11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | relation | RowShareLock | t | 11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | tuple | AccessExclusiveLock | t | 11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | transactionid | ShareLock | f | 605
The SELECT FOR UPDATE on c=2 row causes a deadlock:
SELECT c FROM locktest WHERE c=2 FOR UPDATE;
Afterwards, PostgreSQL reports in server log:
2018-08-02 08:46:07.793 UTC [10997] ERROR: deadlock detected 2018-08-02 08:46:07.793 UTC [10997] DETAIL: Process 10997 waits for ShareLock on transaction 606; blocked by process 11495. Process 11495 waits for ShareLock on transaction 605; blocked by process 10997. Process 10997: select c from locktest where c=2 for update; Process 11495: select c from locktest where c=1 for update; 2018-08-02 08:46:07.793 UTC [10997] HINT: See server log for query details. 2018-08-02 08:46:07.793 UTC [10997] CONTEXT: while locking tuple (0,3) in relation "locktest" 2018-08-02 08:46:07.793 UTC [10997] STATEMENT: SELECT c FROM locktest WHERE c=2 FOR UPDATE; ERROR: deadlock detected DETAIL: Process 10997 waits for ShareLock on transaction 606; blocked by process 11495. Process 11495 waits for ShareLock on transaction 605; blocked by process 10997. HINT: See server log for query details. CONTEXT: while locking tuple (0,3) in relation "locktest"
As you can see, the database server aborts one blocked transaction automatically.
Multi-way deadlocks
Normally there are just two transactions creating deadlocks. However, in complex cases, an application could caused deadlocks with multiple transactions forming a dependency circle.
A: locks row1, B locks row2, C locks row3
Step2
A: trying to get row3, B: trying to get row1, C: trying to get row2
Summary
- Do not put DDL statements in long transactions.
- Please avoid DDL on during high load for frequently updated tables
- CLUSTER command requires exclusive access to the table and all it’s indexes
- Monitor postgresql log for deadlock-related messages