Oct 24, 2018

PostgreSQL locking, part 2: heavyweight locks

Locking in PostgreSQL

PostgreSQL locking visibility for application developers and DBAs is in most cases related to heavyweight locks. Complex database locking operations require full instrumentation using views from the system catalog: it should be clear which object is locked by a specific database “backend” process. An alternative name for any lock is “bottleneck”; to make database operations parallel, we should split a single bottleneck into multiple operation-specific tasks.

This is the second of three blog posts about locking, and covers table-level locks. The previous post was about row-level locks, and a subsequent post reviews the latches protecting internal database structures.

Example environment

A simple table with several rows:

CREATE TABLE locktest (c INT);
INSERT INTO locktest VALUES (1), (2);

Helper view

In order to check different types of these locks, let’s create a helper view as suggested by Bruce Momjian in his presentation:

CREATE VIEW lockview AS
SELECT pid, virtualtransaction AS vxid, locktype AS lock_type,
       mode AS lock_mode, granted,
       CASE
         WHEN virtualxid IS NOT NULL AND transactionid IS NOT NULL
           THEN virtualxid || ' ' || transactionid
         WHEN virtualxid::text IS NOT NULL
           THEN virtualxid
         ELSE transactionid::text
       END AS xid_lock, relname,
       page, tuple, classid, objid, objsubid
FROM pg_locks LEFT OUTER JOIN pg_class ON (pg_locks.relation = pg_class.oid)
WHERE -- do not show our view’s locks
      pid != pg_backend_pid() AND
      -- no need to show self-vxid locks
      virtualtransaction IS DISTINCT FROM virtualxid
-- granted is ordered earlier
ORDER BY 1, 2, 5 DESC, 6, 3, 4, 7;

RowShareLock (ROW SHARE)

Many applications use the read-modify-write paradigm. For instance, the application fetches single object fields from a table, modifies the data, and saves the changes back to the database. In a multi-user environment, different users could modify the same rows in the course of this transaction, and even a plain SELECT could then return inconsistent data. In response to user demands, almost all SQL databases offer SELECT … FOR SHARE locking. This feature prevents other transactions from modifying the selected rows until the locking transaction commits or rolls back.

For example:

  1. There is a user with multiple bank accounts stored in an accounts table, with total_amount stored in a bank_clients table.
  2. In order to update the total_amount field, we should prevent modification of all rows related to the specific bank client.
  3. It would be better to use a single UPDATE statement that calculates total_amount from the accounts table. If the update requires external data, or some action from the user, then several statements are required:
START TRANSACTION;
SELECT * FROM accounts WHERE client_id = 55 FOR SHARE;
SELECT * FROM bank_clients WHERE client_id=55 FOR UPDATE;
UPDATE bank_clients SET total_amount=38984.33, client_status='gold' WHERE client_id=55;
COMMIT;

A SELECT … FOR SHARE statement creates a “RowShareLock” on the relation it reads (locktest in the examples that follow).

Here’s exactly the same lock created with an explicit LOCK TABLE statement:

BEGIN;
LOCK TABLE locktest IN ROW SHARE MODE;

A single heavyweight RowShareLock is required regardless of the number of rows locked by the query.

This is illustrated with an unfinished transaction in the following example. Start the unfinished transaction, and select from lockview in a second connection to the database:

BEGIN;
SELECT * FROM locktest FOR SHARE;
-- In second connection:
postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview;
  pid  | vxid |   lock_type   |   lock_mode   | granted | xid_lock | relname
-------+------+---------------+---------------+---------+----------+----------
 21144 | 3/13 | transactionid | ExclusiveLock | t       | 586      |
 21144 | 3/13 | relation      | RowShareLock  | t       |          | locktest

RowExclusiveLock (ROW EXCLUSIVE)

Real queries that modify rows also require heavyweight locks on tables, one per table.

The next example uses a DELETE query, but an UPDATE will have the same effect.

All commands that modify data in a table obtain a ROW EXCLUSIVE lock.

BEGIN;
DELETE FROM locktest;
-- second connection
postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview;
  pid  | vxid |   lock_type   |    lock_mode     | granted | xid_lock | relname
-------+------+---------------+------------------+---------+----------+----------
 10997 | 3/6  | transactionid | ExclusiveLock    | t       | 589      |
 10997 | 3/6  | relation      | RowExclusiveLock | t       |          | locktest

This new lock is incompatible with the previous FOR SHARE example: SELECT * FROM locktest FOR SHARE now waits for the delete transaction to finish or abort:

postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname,page,tuple FROM lockview;
  pid  | vxid |   lock_type   |    lock_mode     | granted | xid_lock | relname  | page | tuple
-------+------+---------------+------------------+---------+----------+----------+------+-------
 10997 | 3/6  | transactionid | ExclusiveLock    | t       | 589      |          |      |
 10997 | 3/6  | relation      | RowExclusiveLock | t       |          | locktest |      |
 11495 | 5/9  | relation      | RowShareLock     | t       |          | locktest |      |
 11495 | 5/9  | tuple         | RowShareLock     | t       |          | locktest |    0 |     1
 11495 | 5/9  | transactionid | ShareLock        | f       | 589      |          |      |

Queries modifying table content also lock all indexes, even if the index does not contain modified fields.

-- preparation
CREATE INDEX c_idx2 ON locktest (c);
ALTER TABLE locktest ADD COLUMN c2 INT;
CREATE INDEX c2_idx ON locktest(c2);
-- unfinished example transaction
BEGIN;
UPDATE locktest SET c=3 WHERE c=1;
-- second connection
postgres=# SELECT * FROM lockview;
 pid  |  vxid  | lock_type  |    lock_mode     | granted | xid_lock | relname  | page | tuple | classid | objid | objsubid
------+--------+------------+------------------+---------+----------+----------+------+-------+---------+-------+----------
 3998 | 3/7844 | virtualxid | ExclusiveLock    | t       | 3/7844   |          |      |       |         |       |
 3998 | 3/7844 | relation   | RowExclusiveLock | t       |          | c2_idx   |      |       |         |       |
 3998 | 3/7844 | relation   | RowExclusiveLock | t       |          | c_idx    |      |       |         |       |
 3998 | 3/7844 | relation   | RowExclusiveLock | t       |          | c_idx2   |      |       |         |       |
 3998 | 3/7844 | relation   | RowExclusiveLock | t       |          | locktest |      |       |         |       |

ShareLock (SHARE)

The non-concurrent version of CREATE INDEX takes a ShareLock, which blocks statements that modify the table (INSERT, UPDATE, DELETE) as well as DDL such as DROP TABLE.

BEGIN;
CREATE INDEX c_idx ON locktest (c);
-- second connection
postgres=# SELECT * FROM lockview;
 pid  |  vxid  |   lock_type   |      lock_mode      | granted | xid_lock | relname  | page | tuple | classid | objid | objsubid
------+--------+---------------+---------------------+---------+----------+----------+------+-------+---------+-------+----------
 3998 | 3/7835 | virtualxid    | ExclusiveLock       | t       | 3/7835   |          |      |       |         |       |
 3998 | 3/7835 | transactionid | ExclusiveLock       | t       | 564      |          |      |       |         |       |
 3998 | 3/7835 | relation      | AccessExclusiveLock | t       |          |          |      |       |         |       |
 3998 | 3/7835 | relation      | ShareLock           | t       |          | locktest |      |       |         |       |

You can execute multiple CREATE INDEX queries in parallel unless the index name is exactly the same. The wait happens on the row lock (ShareLock with “transactionid” type) in the pg_class table.
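
As a quick sketch of this (the index names here are arbitrary), two non-concurrent CREATE INDEX statements on the same table can run at the same time, because ShareLock does not conflict with ShareLock:

-- connection 1: keep the transaction open so the ShareLock is held
BEGIN;
CREATE INDEX c_idx_a ON locktest (c);
-- connection 2: a different index name on the same table builds without waiting
CREATE INDEX c_idx_b ON locktest (c);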

Note that there is also an AccessExclusiveLock of type “relation”, but it is not a table-level lock: it is held on the index being created, whose pg_class entry is not yet visible to other sessions (hence the empty relname).

ShareUpdateExclusiveLock (SHARE UPDATE EXCLUSIVE)

These database maintenance operations need to take a ShareUpdateExclusiveLock:

  • ANALYZE table
  • VACUUM (without FULL)
  • CREATE INDEX CONCURRENTLY

The ANALYZE tablename; statement updates table statistics. The query planner/optimizer is able to provide the best plans for query execution only if the statistics are up to date.

BEGIN;
ANALYZE locktest;
-- in second connection
postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview;
  pid  | vxid |   lock_type   |        lock_mode         | granted | xid_lock | relname
-------+------+---------------+--------------------------+---------+----------+----------
 10997 | 3/7  | transactionid | ExclusiveLock            | t       | 591      |
 10997 | 3/7  | relation      | ShareUpdateExclusiveLock | t       |          | locktest

There is no conflict between RowExclusiveLock and ShareUpdateExclusiveLock. UPDATE/DELETE/INSERT could still modify rows during ANALYZE.
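
A minimal sketch of that compatibility, reusing the locktest table (the row value is just an example):

-- connection 1: hold ShareUpdateExclusiveLock in an open transaction
BEGIN;
ANALYZE locktest;
-- connection 2: the RowExclusiveLock taken by this UPDATE does not conflict,
-- so the statement completes without waiting
UPDATE locktest SET c = c WHERE c = 2;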

VACUUM and CREATE INDEX CONCURRENTLY can be executed only outside a transaction. To see the effects of these statements in lockview, execute a conflicting transaction first, e.g. run ANALYZE in a transaction, or run VACUUM against a huge table.
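
For example, a sketch of making a waiting VACUUM visible in lockview (assuming the locktest table from above; ShareUpdateExclusiveLock conflicts with itself, so VACUUM must wait for the open ANALYZE transaction):

-- connection 1
BEGIN;
ANALYZE locktest;
-- connection 2: blocks until connection 1 commits or rolls back
VACUUM locktest;
-- connection 3: the waiting lock shows up with granted = f
SELECT pid, lock_type, lock_mode, granted, relname FROM lockview;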

CREATE INDEX CONCURRENTLY locking can be confusing. The SHARE UPDATE EXCLUSIVE lock does not conflict with the ROW EXCLUSIVE lock used by DELETE, INSERT, and UPDATE. Unfortunately, CREATE INDEX CONCURRENTLY has to wait twice for active transactions to finish, because it performs two full table scans:

“In a concurrent index build, the index is actually entered into the system catalogs in one transaction, then two table scans occur in two more transactions. Before each table scan, the index build must wait for existing transactions that have modified the table to terminate.” (PostgreSQL documentation)

AccessExclusiveLock (ACCESS EXCLUSIVE)

This lock conflicts with all other lock modes and is used by these statements:

  • CREATE RULE
  • DROP TABLE
  • DROP INDEX
  • TRUNCATE
  • VACUUM FULL
  • LOCK TABLE (default mode)
  • CLUSTER
  • REINDEX
  • REFRESH MATERIALIZED VIEW (without CONCURRENTLY)
BEGIN;
CREATE RULE r_locktest AS ON INSERT TO locktest DO INSTEAD NOTHING;
-- second connection
postgres=# select pid,vxid,lock_type,lock_mode,granted,xid_lock,relname from lockview;
  pid  | vxid |   lock_type   |      lock_mode      | granted | xid_lock | relname
-------+------+---------------+---------------------+---------+----------+----------
 10997 | 3/19 | transactionid | ExclusiveLock       | t       | 596      |
 10997 | 3/19 | relation      | AccessExclusiveLock | t       |          | locktest

More importantly, DROP INDEX requires an AccessExclusiveLock on both the table and the index:

BEGIN;
DROP INDEX c_idx;
-- second connection
postgres=# SELECT * FROM lockview;
 pid  |  vxid  |   lock_type   |      lock_mode      | granted | xid_lock | relname  | page | tuple | classid | objid | objsubid
------+--------+---------------+---------------------+---------+----------+----------+------+-------+---------+-------+----------
 3998 | 3/7839 | virtualxid    | ExclusiveLock       | t       | 3/7839   |          |      |       |         |       |
 3998 | 3/7839 | transactionid | ExclusiveLock       | t       | 569      |          |      |       |         |       |
 3998 | 3/7839 | relation      | AccessExclusiveLock | t       |          | c_idx    |      |       |         |       |
 3998 | 3/7839 | relation      | AccessExclusiveLock | t       |          | locktest |      |       |         |       |

Note: This is the most dangerous type of lock. Avoid running queries that require an AccessExclusiveLock in production, or at least put the application in maintenance mode first.

ExclusiveLock

Ordinary SQL commands don’t use ExclusiveLock, except for the general LOCK TABLE statement. This lock blocks all requests except a non-locking SELECT (i.e. without FOR SHARE/UPDATE).

BEGIN;
LOCK TABLE locktest IN EXCLUSIVE MODE;
-- second connection
postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview;
  pid  | vxid | lock_type |   lock_mode   | granted | xid_lock | relname
-------+------+-----------+---------------+---------+----------+----------
 10997 | 3/21 | relation  | ExclusiveLock | t       |          | locktest
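
To see what EXCLUSIVE mode still allows, a short sketch while the transaction above remains open:

-- second connection, while LOCK TABLE ... IN EXCLUSIVE MODE is held
SELECT * FROM locktest;            -- completes: only AccessShareLock is needed
SELECT * FROM locktest FOR SHARE;  -- waits: ROW SHARE conflicts with EXCLUSIVE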

Savepoints

A savepoint produces an additional ExclusiveLock of type transactionid with a new xid value.

BEGIN;
SELECT * FROM locktest FOR SHARE;
SAVEPOINT s1;
SELECT * FROM locktest FOR UPDATE;
-- second connection
postgres=# SELECT pid,vxid,lock_type,lock_mode,granted,xid_lock,relname FROM lockview;
  pid  | vxid |   lock_type   |    lock_mode    | granted | xid_lock | relname
-------+------+---------------+-----------------+---------+----------+----------
 10997 | 3/37 | transactionid | ExclusiveLock   | t       | 602      |
 10997 | 3/37 | transactionid | ExclusiveLock   | t       | 603      |
 10997 | 3/37 | relation      | AccessShareLock | t       |          | c_idx
 10997 | 3/37 | relation      | RowShareLock    | t       |          | locktest

pg_advisory_lock

Sometimes application developers require synchronization between processes. In such systems, the application creates and removes locks frequently, and an implementation based on table rows tends to cause table bloat. Advisory locks avoid this problem.

There are many functions related to advisory locks:

  • per session or per transaction
  • wait if the lock is not available, or immediately return false
  • exclusive or shared
  • a single 64-bit integer or two 32-bit integers as resource identifiers

Imagine that we have several cron jobs and that the application should prevent simultaneous runs of the same script. Each script can then check whether a lock is available in PostgreSQL for a specific integer job identifier:

postgres=# SELECT pg_try_advisory_lock(10);
 pg_try_advisory_lock
----------------------
 t
-- second connection
postgres=# SELECT * FROM lockview;
 pid  | vxid | lock_type |   lock_mode   | granted | xid_lock | relname | page | tuple | classid | objid | objsubid
------+------+-----------+---------------+---------+----------+---------+------+-------+---------+-------+----------
 3998 | 3/0  | advisory  | ExclusiveLock | t       |          |         |      |       |       0 |    10 |        1
-- other connections
SELECT pg_try_advisory_lock(10);
 pg_try_advisory_lock
----------------------
 f

The query produces an ExclusiveLock of type advisory.
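
The variants listed earlier map to a family of functions; a brief sketch of a few of them (the key values are arbitrary):

-- release the session-level lock taken above
SELECT pg_advisory_unlock(10);
-- transaction-scoped variant: released automatically at COMMIT or ROLLBACK
BEGIN;
SELECT pg_advisory_xact_lock(10);
COMMIT;
-- shared variant identified by two 32-bit integers
SELECT pg_try_advisory_lock_shared(10, 1);
SELECT pg_advisory_unlock_shared(10, 1);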

Deadlocks

Any system with multiple locks tends to run into deadlock situations in which queries would never finish; the only way out is to kill one of the blocked statements. Deadlock detection is an expensive procedure in PostgreSQL: a check for deadlock only happens after a transaction has been waiting for deadlock_timeout milliseconds (one second by default).
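
The timeout can be inspected and tuned; a small sketch (raising it only makes sense if the deadlock checks themselves become a measurable cost):

SHOW deadlock_timeout;          -- 1s by default
SET deadlock_timeout = '2s';    -- session-level change (superuser); can also be set in postgresql.conf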

Here is an illustration of a deadlock situation for two different connections A and B:

Any deadlock starts from a lock wait.

A: BEGIN; SELECT c FROM locktest WHERE c=1 FOR UPDATE;
B: BEGIN; SELECT c FROM locktest WHERE c=2 FOR UPDATE; SELECT c FROM locktest WHERE c=1 FOR UPDATE;

You are not alone in identifying deadlocks: the pg_stat_activity system view helps you find the statements and transactions causing lock waits:

postgres=# SELECT pg_stat_activity.pid AS pid,
query, wait_event, vxid, lock_type,
lock_mode, granted, xid_lock
FROM lockview JOIN pg_stat_activity ON (lockview.pid = pg_stat_activity.pid);
  pid  |          query             |  wait_event   | vxid |   lock_type   |      lock_mode      | granted | xid_lock
-------+----------------------------+---------------+------+---------------+---------------------+---------+----------
 10997 | SELECT ... c=1 FOR UPDATE; | ClientRead    | 3/43 | transactionid | ExclusiveLock       | t       | 605
 10997 | SELECT ... c=1 FOR UPDATE; | ClientRead    | 3/43 | advisory      | ExclusiveLock       | t       |
 10997 | SELECT ... c=1 FOR UPDATE; | ClientRead    | 3/43 | relation      | AccessShareLock     | t       |
 10997 | SELECT ... c=1 FOR UPDATE; | ClientRead    | 3/43 | relation      | RowShareLock        | t       |
 11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | transactionid | ExclusiveLock       | t       | 606
 11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | advisory      | ExclusiveLock       | t       |
 11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | relation      | AccessShareLock     | t       |
 11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | relation      | RowShareLock        | t       |
 11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | tuple         | AccessExclusiveLock | t       |
 11495 | SELECT ... c=1 FOR UPDATE; | transactionid | 5/29 | transactionid | ShareLock           | f       | 605

Running SELECT FOR UPDATE on the c=2 row in connection A now causes a deadlock:

SELECT c FROM locktest WHERE c=2 FOR UPDATE;

Afterwards, PostgreSQL reports it in the server log:

2018-08-02 08:46:07.793 UTC [10997] ERROR:  deadlock detected
2018-08-02 08:46:07.793 UTC [10997] DETAIL:  Process 10997 waits for ShareLock on transaction 606; blocked by process 11495.
Process 11495 waits for ShareLock on transaction 605; blocked by process 10997.
Process 10997: select c from locktest where c=2 for update;
Process 11495: select c from locktest where c=1 for update;
2018-08-02 08:46:07.793 UTC [10997] HINT:  See server log for query details.
2018-08-02 08:46:07.793 UTC [10997] CONTEXT:  while locking tuple (0,3) in relation "locktest"
2018-08-02 08:46:07.793 UTC [10997] STATEMENT:  SELECT c FROM locktest WHERE c=2 FOR UPDATE;
ERROR:  deadlock detected
DETAIL:  Process 10997 waits for ShareLock on transaction 606; blocked by process 11495.
Process 11495 waits for ShareLock on transaction 605; blocked by process 10997.
HINT:  See server log for query details.
CONTEXT:  while locking tuple (0,3) in relation "locktest"

As you can see, the database server aborts one blocked transaction automatically.

Multi-way deadlocks

Normally there are just two transactions creating a deadlock. However, in complex cases, an application could cause a deadlock with multiple transactions forming a dependency circle:

Step 1

A: locks row1; B: locks row2; C: locks row3

Step 2

A: tries to get row3; B: tries to get row1; C: tries to get row2
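
A sketch of such a three-way deadlock, reusing the locktest table (this assumes rows with c = 1, 2, and 3 exist):

-- step 1
A: BEGIN; SELECT c FROM locktest WHERE c=1 FOR UPDATE;
B: BEGIN; SELECT c FROM locktest WHERE c=2 FOR UPDATE;
C: BEGIN; SELECT c FROM locktest WHERE c=3 FOR UPDATE;
-- step 2: each session requests a row held by the next session
A: SELECT c FROM locktest WHERE c=3 FOR UPDATE;   -- waits for C
B: SELECT c FROM locktest WHERE c=1 FOR UPDATE;   -- waits for A
C: SELECT c FROM locktest WHERE c=2 FOR UPDATE;   -- waits for B; the deadlock detector aborts one transaction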

Summary

  • Do not put DDL statements in long transactions.
  • Avoid DDL during periods of high load on frequently updated tables.
  • The CLUSTER command requires exclusive access to the table and all its indexes.
  • Monitor the PostgreSQL log for deadlock-related messages.


Oct 16, 2018

PostgreSQL locking, Part 1: Row Locks

PostgreSQL row level locks

An understanding of PostgreSQL locking is important for building scalable applications and avoiding downtime. Modern computers and servers have many CPU cores, and it’s possible to execute multiple queries in parallel. A database contains many structures that must stay consistent while queries and background processes change them in parallel; without protection, concurrent access could crash the database or even corrupt data. Thus we need the ability to block concurrent access while changing shared memory structures or rows: either one thread updates the structure while all others wait (exclusive lock), or multiple threads read the structure while all writers wait (shared lock). The side effect of these waits is locking contention and wasted server resources. It is therefore important to understand why waits happen and which locks are involved. In this article, I review PostgreSQL row-level locking.

In follow up posts, I will investigate table-level locks and latches protecting internal database structures.

Row locks – an overview

PostgreSQL has many locks at different abstraction levels. The most important locks for applications are related to the MVCC implementation: row-level locks. In second place are the locks that appear during maintenance tasks (backups, database migrations, schema changes): table-level locks. It’s also possible, although rare, to see waits on low-level PostgreSQL locks. More often you see high CPU usage, with many concurrent queries running but overall server performance reduced compared with a normal number of queries running in parallel.

Example environment

To follow along, you need a PostgreSQL server with a single-column table containing several rows:

postgres=# CREATE TABLE locktest (c INT);
CREATE TABLE
postgres=# INSERT INTO locktest VALUES (1), (2);
INSERT 0 2

Row locks

Scenario: two concurrent transactions are trying to select a row for update.

PostgreSQL uses row-level locking in this case. Row-level locking is tightly integrated with the MVCC implementation, and uses the hidden xmin and xmax fields.

xmin and xmax store the transaction id. All statements requiring row-level locks modify the xmax field (even SELECT FOR UPDATE). The modification happens after the query returns its results, so in order to see xmax change we need to run SELECT FOR UPDATE twice. Usually, the xmax field is used to mark a row as expired (either removed by some transaction completely or superseded by an updated row version), but it is also used for the row-level locking infrastructure.

If you need more details about the xmin and xmax hidden fields and MVCC implementation, please check our “Basic Understanding of Bloat and VACUUM in PostgreSQL” blog post.

postgres=# BEGIN;
postgres=# SELECT xmin,xmax, txid_current(), c FROM locktest WHERE c=1 FOR UPDATE;
BEGIN
 xmin | xmax | txid_current | c
------+------+--------------+---
  579 |  581 |          583 | 1
(1 row)
postgres=# SELECT xmin,xmax, txid_current(), c FROM locktest WHERE c=1 FOR UPDATE;
 xmin | xmax | txid_current | c
------+------+--------------+---
  579 |  583 |          583 | 1
(1 row)

If a statement tries to modify the same row, it checks the list of unfinished transactions. The statement has to wait until the transaction with id=xmax finishes before it can make its modification.

There is no infrastructure for waiting on a specific row, but a transaction can wait on transaction id.

-- second connection
SELECT xmin,xmax,txid_current() FROM locktest WHERE c=1 FOR UPDATE;

The SELECT FOR UPDATE query running in the second connection is unfinished, and waiting for the first transaction to complete.

pg_locks

Such waits and locks can be seen by querying pg_locks:

postgres=# SELECT locktype,transactionid,virtualtransaction,pid,mode,granted,fastpath
postgres-#  FROM pg_locks WHERE transactionid=583;
   locktype    | transactionid | virtualtransaction |  pid  |     mode      | granted | fastpath
---------------+---------------+--------------------+-------+---------------+---------+----------
 transactionid |           583 | 4/107              | 31369 | ShareLock     | f       | f
 transactionid |           583 | 3/11               | 21144 | ExclusiveLock | t       | f

You can see the writer’s transaction id in the row with locktype=transactionid and transactionid=583. Let’s get the pid and backend id of the lock holder:

postgres=# SELECT id,pg_backend_pid() FROM pg_stat_get_backend_idset() AS t(id)
postgres-#  WHERE pg_stat_get_backend_pid(id) = pg_backend_pid();
 id | pg_backend_pid
----+----------------
  3 |          21144

This backend has its lock granted (t). Each backend has an OS process identifier (PID) and an internal PostgreSQL identifier (backend id). PostgreSQL can process many transactions, but locking happens only between backends, and each backend executes a single transaction at a time. Internal bookkeeping requires just a virtual transaction identifier: the backend id plus a sequence number inside that backend.

Regardless of the number of rows locked, PostgreSQL will have only a single related lock in the pg_locks table. Queries might modify billions of rows but PostgreSQL does not waste memory for redundant locking structures.
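
A quick way to confirm this, reusing locktest (a sketch; the exact set of lock types depends on which indexes exist):

-- first connection: lock every row in the table
BEGIN;
UPDATE locktest SET c = c;
-- second connection: only a handful of locks appear (relation, transactionid,
-- virtualxid), regardless of how many rows were updated
SELECT locktype, count(*) FROM pg_locks
 WHERE pid <> pg_backend_pid()
 GROUP BY locktype;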

A writer thread sets an ExclusiveLock on its transactionid. All row-level lock waiters take a ShareLock on that transactionid. The lock manager wakes up all blocked backends as soon as the writer releases the lock.

Lock release for transactionid occurs on commit or rollback.

pg_stat_activity

Another great method to get locking-related details is to select from the pg_stat_activity table:

postgres=# SELECT pid,backend_xid,wait_event_type,wait_event,state,query FROM pg_stat_activity WHERE pid IN (31369,21144);
-[ RECORD 1 ]---+---------------------------------------------------------------------------------------------------------------------------
pid             | 21144
backend_xid     | 583
wait_event_type | Client
wait_event      | ClientRead
state           | idle in transaction
query           | SELECT id,pg_backend_pid() FROM pg_stat_get_backend_idset() AS t(id) WHERE pg_stat_get_backend_pid(id) = pg_backend_pid();
-[ RECORD 2 ]---+---------------------------------------------------------------------------------------------------------------------------
pid             | 31369
backend_xid     | 585
wait_event_type | Lock
wait_event      | transactionid
state           | active
query           | SELECT xmin,xmax,txid_current() FROM locktest WHERE c=1 FOR UPDATE;

Source code-level investigation

Let’s check the stack trace for the waiter with gdb and the pt-pmp tool:

# pt-pmp -p 31369
Sat Jul 28 10:10:25 UTC 2018
30	../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
      1 epoll_wait,WaitEventSetWaitBlock,WaitEventSetWait,WaitLatchOrSocket,WaitLatch,ProcSleep,WaitOnLock,LockAcquireExtended,LockAcquire,XactLockTableWait,heap_lock_tuple,ExecLockRows,ExecProcNode,ExecutePlan,standard_ExecutorRun,PortalRunSelect,PortalRun,exec_simple_query,PostgresMain,BackendRun,BackendStartup,ServerLoop,PostmasterMain,main

The WaitOnLock function is causing the wait. The function is located in the lock.c file (the POSTGRES primary lock mechanism).

A lock table is a shared memory hash table. The conflicting process sleeps for the lock in storage/lmgr/proc.c. For the most part, this code should be invoked via lmgr.c or another lock-management module, not directly.

Locks listed in pg_stat_activity as “Lock” are also called heavyweight locks and are controlled by the lock manager. Heavyweight locks are also used for many high-level actions.

By the way, a full description can be found here: https://www.postgresql.org/docs/current/static/explicit-locking.html

Summary

  • Avoid long-running transactions that modify frequently updated rows or too many rows.
  • Do not use hotspots (a single row, or a few rows, updated in parallel by many application client connections) with MVCC databases. This kind of workload is more suitable for in-memory databases and can usually be separated from the main business logic.

Sep 25, 2017

Avoid Shared Locks from Subqueries When Possible

Shared Locks

In this blog post, we’ll look at how to avoid shared locks from subqueries.

I’m pretty sure most of you have seen an UPDATE statement matching rows returned from a SELECT query:

update ibreg set k=1 where id in (select id from ibcmp where id > 90000);

This query, when executed with autocommit=1, is normally harmless. However, this can have bad effects when combined with other statements in the same transaction that result in holding the shared locks from the SELECT query. But first, let me explain why the SELECT query would hold locks in the first place.

Due to InnoDB’s ACID properties, to make sure that the outer UPDATE statement has a consistent view of the matching rows from the SELECT subquery, the server has to acquire a shared lock on those rows. No other thread should modify those matching rows, in order to maintain consistency within the transaction. To demonstrate, let’s take two transactions executed in the specific order below:

mysql1> begin;
mysql1> update ibreg set k=1 where id in (select id from ibcmp where id > 90000);
mysql2> begin;
mysql2> delete from ibcmp where id > 90000;

By the time the second session executes its DELETE, it will be in a LOCK WAIT state (as confirmed from INFORMATION_SCHEMA):

mysql1> select * from information_schema.innodb_trx\G
*************************** 1. row ***************************
                    trx_id: 3932449
                 trx_state: LOCK WAIT
               trx_started: 2017-09-06 00:20:05
     trx_requested_lock_id: 3932449:13:1354:31
          trx_wait_started: 2017-09-06 00:20:05
                trx_weight: 2
       trx_mysql_thread_id: 9
                 trx_query: delete from test.ibcmp where id > 90000
       trx_operation_state: starting index read
...
mysql1> select * from information_schema.innodb_locks\G
*************************** 1. row ***************************
    lock_id: 3932449:13:1354:31
lock_trx_id: 3932449
  lock_mode: X
  lock_type: RECORD
 lock_table: `test`.`ibcmp`
 lock_index: PRIMARY
 lock_space: 13
  lock_page: 1354
   lock_rec: 31
  lock_data: 90001
*************************** 2. row ***************************
    lock_id: 3932174:13:1354:31
lock_trx_id: 3932174
  lock_mode: S
  lock_type: RECORD
 lock_table: `test`.`ibcmp`
 lock_index: PRIMARY
 lock_space: 13
  lock_page: 1354
   lock_rec: 31
  lock_data: 90001

INFORMATION_SCHEMA.INNODB_LOCKS confirms that our first transaction holds a shared lock on the rows that matched its SELECT subquery. This can be bad for a number of reasons:

  1. As the number of rows that match the SELECT grows, deadlocks and lock wait timeouts can become more frequent
  2. As a consequence of this, ROLLBACKs would also increase (and are expensive operations)
  3. Your users can become unhappy, especially if it is not handled gracefully from the application

If you really need a consistent view between the table being read and the table being updated, the lock is necessary and unavoidable. Deadlocks and lock wait timeouts can then be minimized, but not totally avoided.

On the other hand, if you’re not worried about view consistency, there are two ways you can avoid such problems: by using variables, or by making sure the SELECT becomes a transient read inside the transaction (i.e., by dumping the results into an OUTFILE).

mysql1> begin;
mysql1> select group_concat(id) into @ids from ibcmp where id > 90000;
mysql1> update ibreg set k=1 where id in (@ids);
mysql2> begin;
mysql2> delete from ibcmp where id > 90000;

The first method is bound by the group_concat_max_len variable. If you think you will only have a few resulting IDs that fit into group_concat_max_len, this is a good solution.
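
A quick sketch for checking and, if needed, raising that limit for the current session (the 1 MB value is only an example):

mysql1> show variables like 'group_concat_max_len';
mysql1> set session group_concat_max_len = 1024 * 1024;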

mysql1> begin;
mysql1> select id into outfile '/tmp/id.csv' from ibcmp where id > 90000;
mysql1> create temporary table t (id int unsigned not null) engine=innodb;
mysql1> load data infile '/tmp/id.csv' into table t;
mysql1> update ibreg inner join t on ibreg.id = t.id set ibreg.k=1;
mysql2> begin;
mysql2> delete from ibcmp where id > 90000;

The second approach is only meant to overcome the limitation of the GROUP_CONCAT method.

Again, these two approaches only work if you do not care if the result of the SELECT queries changes on the other table between the BEGIN statement and UPDATE within the transaction.

May 11, 2017

MyRocks and LOCK IN SHARE MODE

LOCK IN SHARE MODE

In this blog post, we’ll look at MyRocks and LOCK IN SHARE MODE.

Those who attended the March 30th webinar “MyRocks Troubleshooting” might remember our discussion with Yoshinori on LOCK IN SHARE MODE.

I did more tests, and I can confirm that his words are true: LOCK IN SHARE MODE works in MyRocks.

This quick example demonstrates this. The initial setup:

CREATE TABLE t (
id int(11) NOT NULL,
f varchar(100) DEFAULT NULL,
PRIMARY KEY (id)
) ENGINE=ROCKSDB;
insert into t values(12345, 'value1'), (54321, 'value2');

In session 1:

session 1> begin;
Query OK, 0 rows affected (0.00 sec)
session 1> select * from t where id=12345 lock in share mode;
+-------+--------+
| id | f |
+-------+--------+
| 12345 | value1 |
+-------+--------+
1 row in set (0.01 sec)

In session 2:

session 2> begin;
Query OK, 0 rows affected (0.00 sec)
session 2> update t set f='value3' where id=12345;
ERROR HY000: Lock wait timeout exceeded; try restarting transaction

However, in the webinar I wanted to remind everyone about the differences between LOCK IN SHARE MODE and FOR UPDATE. To do so, I added the former to my “session 2” test for the webinar. Once I did, session 2 ignored the lock set in “session 1”: I could update the row and commit:

session 2> select * from t where id=12345 lock in share mode;
+-------+--------+
| id    | f      |
+-------+--------+
| 12345 | value1 |
+-------+--------+
1 row in set (0.00 sec)
session 2> update t set f='value3' where id=12345;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
session 2> commit;
Query OK, 0 rows affected (0.02 sec)

I reported this behavior here, and also in the Percona Jira bugs database: MYR-107. In Facebook’s MySQL branch, this bug is already fixed.

This test demonstrates that it is fixed in the Facebook build. In “session 1”:

session1> CREATE TABLE `t` (
    -> `id` int(11) NOT NULL,
    -> `f` varchar(100) DEFAULT NULL,
    -> PRIMARY KEY (`id`)
    -> ) ENGINE=ROCKSDB;
Query OK, 0 rows affected (0.00 sec)
session1> insert into t values(12345, 'value1'), (54321, 'value2');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0
session1> begin;
Query OK, 0 rows affected (0.00 sec)
session1>  select * from t where id=12345 lock in share mode;
+-------+--------+
| id    | f      |
+-------+--------+
| 12345 | value1 |
+-------+--------+
1 row in set (0.00 sec)

And now in another session:

session2> begin;
Query OK, 0 rows affected (0.00 sec)
session2> select * from t where id=12345 lock in share mode;
+-------+--------+
| id    | f      |
+-------+--------+
| 12345 | value1 |
+-------+--------+
1 row in set (0.00 sec)
session2> update t set f='value3' where id=12345;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction: Timeout on index: test.t.PRIMARY

If you want to test the fix with the Facebook MySQL build, you need to update the submodules to download the patch: git submodule update.

Jun 17, 2016

InnoDB locks and transaction isolation level

InnoDB locks and transaction isolation

What is the difference between InnoDB locks and transaction isolation level? We’ll discuss it in this post.

Recently I received a question from a user about one of my earlier blog posts. Since it wasn’t sent as a comment, I will answer it here. The question:

> I am reading your article:
> https://www.percona.com/resources/technical-presentations/troubleshooting-locking-issues-percona-mysql-webinar

> Full table scan locks whole table.

> Some bad select (read) query can do full table scan on InnoDB, does it lock whole table please?

> My understanding was that SELECT (read) blocks another DML only in MyISAM.

To answer this question, we need to understand two different concepts: locking and the transaction isolation level.

In InnoDB’s default transaction isolation mode, REPEATABLE READ, and in the lower transaction isolation levels, SELECT doesn’t block any DML unless it uses SELECT ... FOR UPDATE or SELECT ... LOCK IN SHARE MODE. On slide 20 (31) of my presentation, I use a SELECT ... FOR UPDATE statement (this is why a lock is seen).
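
As a counter-example under the default REPEATABLE READ level, a plain SELECT takes no row locks, so a concurrent UPDATE is not blocked. A minimal sketch, using the same employees sample database as below (the emp_no value is just an example):

mysql1> begin;
mysql1> select * from titles where emp_no=15504;
mysql2> begin;
mysql2> update titles set title='Engineer' where emp_no=15504;

The UPDATE in the second session completes immediately, because the plain SELECT in the first session locked nothing.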

However, if the transaction isolation mode is SERIALIZABLE, then SELECT can block updates. You can see this in the example below:

mysql1> set transaction isolation level serializable;
Query OK, 0 rows affected (0,00 sec)
mysql1> begin;
Query OK, 0 rows affected (0,00 sec)
mysql1> select * from employees join titles using(emp_no);
c936e6fc4c6cbaf77679ba5013339dff  -
443308 rows in set (29,69 sec)
mysql2> begin;
Query OK, 0 rows affected (0,00 sec)
mysql2> update titles set title='Engineer' where emp_no=15504;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

Regarding the different lock levels set by InnoDB, I recommend that you follow the links on slide 37 (67) of the presentation, especially “InnoDB Locking Explained With Stick Figures” by Bill Karwin. InnoDB table locks are also explained very well in this blog post by Peter Zaitsev.

Sep 20, 2010

Instrumentation and the cost of Foreign Keys

I occasionally get into light arguments (or rather, healthy discussions) with students about whether or not to use foreign key constraints on InnoDB tables. My standard response has always been: “it depends on how much of a tradeoff you are willing to make for performance. In some situations the cost can be considerable”.

.. that’s when they expect me to “come up with some real proof” to show them. I do not disagree with their logic, or with proof being on their list of demands: I support the use of data to make decisions.

The problem is that MySQL has (traditionally) been lacking the instrumentation required to make these decisions easy. This is getting better; here is an example we recently added to our InnoDB course:

 CREATE TABLE parent (
 id INT NOT NULL AUTO_INCREMENT,
 bogus_column char(32),
 PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE child (
 id INT NOT NULL AUTO_INCREMENT,
 parent_id INT NOT NULL,
 bogus_column char(32),
 PRIMARY KEY (id),
 KEY (parent_id),
 CONSTRAINT child_ibfk_1 FOREIGN KEY (parent_id) REFERENCES parent (id)
) ENGINE=InnoDB;

INSERT INTO parent (bogus_column)
VALUES ('aaa'), ('bbb'), ('ccc'), ('ddd'), ('eee');

INSERT INTO child (parent_id,bogus_column) VALUES
(1, 'aaa'), (2,'bbb'), (3, 'ccc'),
(4, 'ddd'), (5, 'eee');

START TRANSACTION; # session1
START TRANSACTION; # session2

# session1
UPDATE child SET parent_id = 5 WHERE parent_id = 4;

#session2
UPDATE parent SET bogus_column = 'new!' WHERE id = 4;

#session2
UPDATE parent SET bogus_column = 'new!' WHERE id = 5;

In the last statement, session2 will block waiting on a lock. Want to know where that lock is? Check information_schema.innodb_locks:

mysql> SELECT * FROM information_schema.innodb_locks\G
*************************** 1. row ***************************
    lock_id: 87035:1300:3:6
lock_trx_id: 87035
  lock_mode: X
  lock_type: RECORD
 lock_table: `test`.`parent`
 lock_index: `PRIMARY`
 lock_space: 1300
  lock_page: 3
   lock_rec: 6
  lock_data: 5
*************************** 2. row ***************************
    lock_id: 87034:1300:3:6
lock_trx_id: 87034
  lock_mode: S
  lock_type: RECORD
 lock_table: `test`.`parent`
 lock_index: `PRIMARY`
 lock_space: 1300
  lock_page: 3
   lock_rec: 6
  lock_data: 5
2 rows in set (0.00 sec)

The same example without the foreign key constraints does not block on the last statement.  We also see the expected output change to:

mysql> SELECT * FROM information_schema.innodb_locks;
Empty set (0.00 sec)

This information_schema table is new in the InnoDB plugin. In earlier releases of MySQL you may be able to get the data, but it is not in such an easily digestible form. Instrumentation is the most under-discussed feature in new releases of MySQL and Percona Server. See BUG #53336 for more examples of pure awesomeness.


Entry posted by Morgan Tocker