Personal blog of Yzmir Ramirez

Oct
09
2023

PostgreSQL Tuple-Level Statistics With pgstattuple

Since Postgres table bloat degrades database performance, we can improve its performance by removing the table bloat. We can use the pgstattuple extension to identify the bloated tables.

This extension provides several functions for obtaining tuple-level statistics. Because the pgstattuple functions produce extensive page-level information, access to them is, by default, limited. Only the pg_stat_scan_tables role has the ability to EXECUTE the pgstattuple functions by default. The pgstattuple functions are not restricted to the superuser.

Using the pgstattuple function, we can list the tables with a high percentage of dead tuples and run a manual VACUUM to reclaim the space occupied by the dead tuples.

In this blog, we will discuss the pgstattuple extension, which provides various functions to obtain tuple-level statistics.

First, create the pgstattuple extension. The latest version of pgstattuple is 1.5.

postgres@ip-172-31-46-212:~$ psql
psql (15.4 (Ubuntu 15.4-1.pgdg22.04+1))
Type "help" for help.
postgres=# select * from pg_available_extensions where name='pgstattuple';
    name     | default_version | installed_version |           comment
-------------+-----------------+-------------------+-----------------------------
 pgstattuple | 1.5             |                   | show tuple-level statistics
(1 row)
postgres=#
postgres=# create extension pgstattuple;
CREATE EXTENSION
postgres=#
postgres=# select * from pg_available_extensions where name='pgstattuple';
    name     | default_version | installed_version |           comment
-------------+-----------------+-------------------+-----------------------------
 pgstattuple | 1.5             | 1.5               | show tuple-level statistics
(1 row)
postgres=#

By default, only the superuser has access to the pgstattuple functions; however, you can grant non-superuser access by granting the pg_stat_scan_tables role to the non-superuser.

postgres=# grant pg_stat_scan_tables to <nonsuperuser>;
GRANT ROLE

postgres=> du <nonsuperuser>
                 List of roles
 Role name | Attributes |       Member of
———–+————+———————–
 <nonsuperuser>        |            | {pg_stat_scan_tables}
postgres=>

Next, before using the pgstattuple functions, let’s create a table and index for demonstration.

postgres=# create table workshop (jobno int);
CREATE TABLE

postgres=# insert into workshop values (generate_series(1,80000));
INSERT 0 80000

postgres=# create index workshop_index on workshop (jobno);
CREATE INDEX
postgres=#

pgstattuple functions

pgstattuple(regclass): This function returns a relation’s physical length, percentage of “dead” tuples, and other info. This may help users to determine whether a vacuum is necessary or not. The argument is the target relation’s name (optionally schema-qualified) or OID.

For example:

SELECT * FROM pgstattuple('pg_catalog.pg_proc');

SELECT * FROM pgstattuple('workshop');

pgstattuple only obtains a read lock on the relation. As a result, the pgstattuple output does not represent an instantaneous snapshot. The concurrent updates will change the output of the pgstattuple.

The example below shows the tuple statistics of the workshop table.

postgres=# SELECT * FROM pgstattuple('workshop');

-[ RECORD 1 ]------+--------
table_len          | 2899968
tuple_count        | 80000
tuple_len          | 2240000
tuple_percent      | 77.24
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 10056
free_percent       | 0.35

Below are the pgstattuple output columns and their description.

Column	Type	Description
table_len	bigint	Physical relation length in bytes
tuple_count	bigint	Number of live tuples
tuple_len	bigint	Total length of live tuples in bytes
tuple_percent	float8	Percentage of live tuples
dead_tuple_count	bigint	Number of dead tuples
dead_tuple_len	bigint	Total length of dead tuples in bytes
dead_tuple_percent	float8	Percentage of dead tuples
free_space	bigint	Total free space in bytes
free_percent	float8	Percentage of free space

Let’s check the tuple statistics when we delete or update the rows. Below are the current tuple statistics for the reference.

postgres=# SELECT * FROM pgstattuple('workshop');
-[ RECORD 1 ]------+--------
table_len          | 2899968
tuple_count        | 80000
tuple_len          | 2240000
tuple_percent      | 77.24
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 10056
free_percent       | 0.35

Delete the few ROWS from the table.

postgres=# DELETE FROM workshop WHERE jobno % 8 = 0;
DELETE 10000
postgres=#

In the below output, the dead_tuple_count shows that Postgres marked the rows as deleted but did not remove them from the table, as the length of the table is the same after the deletion of the rows.

postgres=# SELECT * FROM pgstattuple('workshop');
-[ RECORD 1 ]------+--------
table_len          | 2899968
tuple_count        | 70000
tuple_len          | 1960000
tuple_percent      | 67.59
dead_tuple_count   | 10000
dead_tuple_len     | 280000
dead_tuple_percent | 9.66
free_space         | 10056
free_percent       | 0.35

Now execute the VACUUM on the table; after running the plain VACUUM, we see that the:

free_space has increased. Vacuum has reclaimed the space and made it available for re-use at the database level.
Dead rows have been removed.
The table_len is the same as O.S. level space is not reclaimed.

postgres=# vacuum workshop;
VACUUM

postgres=# SELECT * FROM pgstattuple('workshop');
-[ RECORD 1 ]------+--------
table_len          | 2899968
tuple_count        | 70000
tuple_len          | 1960000
tuple_percent      | 67.59
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 330412
free_percent       | 11.39

After running the VACUUM FULL, we can see that table_len has decreased. It shows that O.S. level space is reclaimed after running the VACUUM FULL.

postgres=# vacuum full workshop;
VACUUM

postgres=# SELECT * FROM pgstattuple('workshop');
-[ RECORD 1 ]------+--------
table_len          | 2539520
tuple_count        | 70000
tuple_len          | 1960000
tuple_percent      | 77.18
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 10840
free_percent       | 0.43
postgres=#

pgstattuple queries to check table bloat

We can list the tables where the dead tuple percentage is high by using the query below.

postgres=#select relname,(pgstattuple(oid)).dead_tuple_percent from pg_class where relkind = 'r' order by dead_tuple_percent desc;
 relname                  | dead_tuple_percent
--------------------------+--------------------
 pg_init_privs            |               2.56
 pg_class                 |               1.69
 pg_statistic             |               1.45
 pg_authid                |               1.37
 pg_extension             |               1.28
 pg_type                  |               1.17
 pg_namespace             |               1.17

The below query will show you the tuple statistics of the tables in detail where the dead tuple percentage is high.

SELECT relname,oid,relowner,(pgstattuple(oid)).dead_tuple_percent ,(pgstattuple(oid)).* FROM pg_class WHERE relkind = 'r' order by dead_tuple_percent desc;

-[ RECORD 1 ]------+-------------------------
relname            | pg_init_privs
oid                | 3394
relowner           | 10
dead_tuple_percent | 2.56
table_len          | 24576
tuple_count        | 222
tuple_len          | 17316
tuple_percent      | 70.46
dead_tuple_count   | 9
dead_tuple_len     | 630
dead_tuple_percent | 2.56
free_space         | 4568
free_percent       | 18.59

-[ RECORD 2 ]------+-------------------------
relname            | pg_class
oid                | 1259
relowner           | 10
dead_tuple_percent | 1.69
table_len          | 114688
tuple_count        | 415
tuple_len          | 80473
tuple_percent      | 70.17
dead_tuple_count   | 11
dead_tuple_len     | 1937
dead_tuple_percent | 1.69
free_space         | 26976
free_percent       | 23.52

pgstatindex(regclass): This function returns a record showing information about a B-tree index.

For example:

postgres=# SELECT * FROM pgstatindex ('workshop_index');

-[ RECORD 1 ]------+--------
version            | 4
tree_level         | 1
index_size         | 1589248
root_block_no      | 3
internal_pages     | 1
leaf_pages         | 192
empty_pages        | 0
deleted_pages      | 0
avg_leaf_density   | 89.74
leaf_fragmentation | 0

The output columns and their description.

Column	Type	Description
version	integer	B-tree version number
tree_level	integer	Tree level of the root page
index_size	bigint	Total index size in bytes
root_block_no	bigint	Location of root page (zero if none)
internal_pages	bigint	Number of “internal” (upper-level) pages
leaf_pages	bigint	Number of leaf pages
empty_pages	bigint	Number of empty pages
deleted_pages	bigint	Number of deleted pages

pgstatginindex(regclass): This function returns a record showing information about a GIN index.

For example:

SELECT * FROM pgstatginindex('table_gin_index');

pgstathashindex(regclass): This function returns a record showing information about a HASH index.

For example:

select * from pgstathashindex('table_hash_index');

In this blog, we have seen how the pgstattuple can be used for tuple-level analysis and to find bloated tables or vacuuming candidates. https://www.postgresql.org/docs/current/pgstattuple.html has some additional information on the extension.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.

Download Percona Distribution for PostgreSQL Today!

Written by Shivam Dhapatkar in: Monitoring,Percona Software,Postgres Bloat,postgres extensions,postgresql,Shivam PP,vacuum,Zend Developer |

Aug
05
2022

An Illustration of PostgreSQL Bloat

I have been working with Postgres now for many years, most recently as a consultant at Percona for companies with new implementations of Postgres as a result of migrating from Oracle or some other legacy database engine. In some cases, these are Fortune 100 companies with many talented people working for them. However, not all databases work the same and one of the most common observations I make when reviewing a Postgres environment for a client is the amount of table bloat, index bloat, and lack of understanding of its impact on performance – and how to address it.

I wrote a blog about this topic a few years ago and never gave it much thought after its publication. However, with the large number of companies moving to Postgres for obvious reasons, and the lack of true Postgres database management skills needed to support fairly large databases, I thought I would rewrite this blog and bring it back to life with some clarity to help one understand bloat and why it happens.

What causes bloat?

In PostgreSQL, the culprit is Multi-Version Concurrency Control, commonly referred to as MVCC.

MVCC ensures that a transaction against a database will return only data that’s been committed, in a snapshot, even if other processes are trying to modify that data.

Imagine a database with millions of rows in a table. Anytime you update or delete a row, Postgres has to keep track of that row based on a transaction ID. For example, you may be running a long query with transaction ID 100 while someone named John just updated the same table using transaction ID 101. At this point, since you are still inside of transaction 100, which is older than 101 as far as your query is concerned, the changes made by John in transaction ID 101 are not relevant or visible to your query. You are in your own personal bubble of data back before the data has changed. Any new queries from you or anyone else with a transaction ID greater than 101 will see the changes made by John in transaction 101. After all, new transaction IDs are greater than 101, meaning no other transactions are currently in play with IDs less than 101 the data you saw in transaction ID 100 will no longer be needed by the database and will be considered dead but not gone. Hence, bloat!

At a high level, vacuuming is used to free up dead rows in a table so they can be reused. It also helps you avoid transaction ID wraparound.

Let’s go through a few steps to illustrate how all this takes place

In order for Postgres to know which transaction data should be in the result set of your query, the snapshot makes note of transaction information.

Essentially, if your transaction ID is 100, you will only see data from all transaction IDs leading up to 100. As stated above, you will not see data from transaction ID 101 or greater.

Setting up an example

Let’s start by creating a simple table for our example, called percona:

percona=# CREATE TABLE percona ( col1 int );
CREATE TABLE


percona=# INSERT INTO percona values (1);
INSERT 0 1


percona=# INSERT INTO percona values (2);
INSERT 0 1


percona=# INSERT INTO percona values (3);
INSERT 0 1


percona=# INSERT INTO percona values (4);
INSERT 0 1


percona=# INSERT INTO percona values (5);
INSERT 0 1

You can wrap multiple inserts into a single transaction with BEGIN and COMMIT:

percona=# BEGIN;
BEGIN


percona=*# INSERT INTO percona SELECT generate_series(6,10);
INSERT 0 5

percona=*# COMMIT;
COMMIT

Here we can see the 10 rows we inserted into the table, along with some hidden system columns:

percona=# SELECT xmin, xmax, * FROM percona;
   xmin   | xmax | col1
----------+------+------
 69099597 |    0 |    1
 69099609 |    0 |    2
 69099627 |    0 |    3
 69099655 |    0 |    4
 69099662 |    0 |    5
 69099778 |    0 |    6
 69099778 |    0 |    7
 69099778 |    0 |    8
 69099778 |    0 |    9
 69099778 |    0 |   10
(10 rows)

As you can see, values one through five (in the col1 column) have unique transaction IDs (represented in the xmin column)—they were the result of individual INSERT statements, made one after the other. The rows with values of six through 10 share the same transaction ID of 6909978; they were all part of the one transaction we created with the BEGIN and COMMIT statements.

At this point, you may be asking yourself what this has to do with vacuum or autovacuum. We will get there. First, you need to know about the transaction id logic and visually see it for a better understanding as shown above.

How does the table bloat?

In Postgres, the heap is a file containing a list of variable-sized records, in no particular order, that points to the location of a row within a page. (A Postgres page is 8k in size). The pointer to the location is called the CTID.

To view the heap without having to read the raw data from the file, we need to create the following extension inside of our database:

CREATE extension pageinspect;

Now we can inspect the heap for our newly created table and rows:

percona=# SELECT * FROM heap_page_items(get_raw_page('percona',0));


 lp | lp_off | lp_flags | lp_len |  t_xmin  | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data
----+--------+----------+--------+----------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 | 69099597 |      0 |        0 | (0,1)  |           1 |       2304 |     24 |        |       | \x01000000
  2 |   8128 |        1 |     28 | 69099609 |      0 |        0 | (0,2)  |           1 |       2304 |     24 |        |       | \x02000000
  3 |   8096 |        1 |     28 | 69099627 |      0 |        0 | (0,3)  |           1 |       2304 |     24 |        |       | \x03000000
  4 |   8064 |        1 |     28 | 69099655 |      0 |        0 | (0,4)  |           1 |       2304 |     24 |        |       | \x04000000
  5 |   8032 |        1 |     28 | 69099662 |      0 |        0 | (0,5)  |           1 |       2304 |     24 |        |       | \x05000000
  6 |   8000 |        1 |     28 | 69099778 |      0 |        0 | (0,6)  |           1 |       2304 |     24 |        |       | \x06000000
  7 |   7968 |        1 |     28 | 69099778 |      0 |        0 | (0,7)  |           1 |       2304 |     24 |        |       | \x07000000
  8 |   7936 |        1 |     28 | 69099778 |      0 |        0 | (0,8)  |           1 |       2304 |     24 |        |       | \x08000000
  9 |   7904 |        1 |     28 | 69099778 |      0 |        0 | (0,9)  |           1 |       2304 |     24 |        |       | \x09000000
 10 |   7872 |        1 |     28 | 69099778 |      0 |        0 | (0,10) |           1 |       2304 |     24 |        |       | \x0a000000
(10 rows)

The table above shows 10 entries with a few columns:

lp is the ID of the row/tuple
t_xmin is the transaction ID
t_ctid is the pointer
t_data is the actual data

Currently, the pointer for each row is pointing to itself as determined by the form (page,tupleid). Pretty straightforward.

Now, let’s perform a few updates on a specific row. Let’s change the value of five to 20, then to 30, and finally back to five.

percona=# UPDATE percona SET col1 = 20 WHERE col1 = 5;
UPDATE 1


percona=# UPDATE percona SET col1 = 30 WHERE col1 = 20;
UPDATE 1


percona=# UPDATE percona SET col1 = 5 WHERE col1 = 30;
UPDATE 1

These three changes took place under three different transactions.

What does this mean? We changed the values for a column three times but never added or deleted any rows. So we should still have 10 rows, right?

percona=# SELECT COUNT(*) FROM percona;
 count
-------
    10
(1 row)

Looks as expected. But wait! Let’s look at the heap now. The real data on disk.

percona=# SELECT * FROM heap_page_items(get_raw_page('percona',0));


 lp | lp_off | lp_flags | lp_len |  t_xmin  |  t_xmax  | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data
----+--------+----------+--------+----------+----------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 | 69099597 |        0 |        0 | (0,1)  |           1 |       2304 |     24 |        |       | \x01000000
  2 |   8128 |        1 |     28 | 69099609 |        0 |        0 | (0,2)  |           1 |       2304 |     24 |        |       | \x02000000
  3 |   8096 |        1 |     28 | 69099627 |        0 |        0 | (0,3)  |           1 |       2304 |     24 |        |       | \x03000000
  4 |   8064 |        1 |     28 | 69099655 |        0 |        0 | (0,4)  |           1 |       2304 |     24 |        |       | \x04000000
  5 |   8032 |        1 |     28 | 69099662 | 69103876 |        0 | (0,11) |       16385 |       1280 |     24 |        |       | \x05000000
  6 |   8000 |        1 |     28 | 69099778 |        0 |        0 | (0,6)  |           1 |       2304 |     24 |        |       | \x06000000
  7 |   7968 |        1 |     28 | 69099778 |        0 |        0 | (0,7)  |           1 |       2304 |     24 |        |       | \x07000000
  8 |   7936 |        1 |     28 | 69099778 |        0 |        0 | (0,8)  |           1 |       2304 |     24 |        |       | \x08000000
  9 |   7904 |        1 |     28 | 69099778 |        0 |        0 | (0,9)  |           1 |       2304 |     24 |        |       | \x09000000
 10 |   7872 |        1 |     28 | 69099778 |        0 |        0 | (0,10) |           1 |       2304 |     24 |        |       | \x0a000000
 11 |   7840 |        1 |     28 | 69103876 | 69103916 |        0 | (0,12) |       49153 |       9472 |     24 |        |       | \x14000000
 12 |   7808 |        1 |     28 | 69103916 | 69103962 |        0 | (0,13) |       49153 |       9472 |     24 |        |       | \x1e000000
 13 |   7776 |        1 |     28 | 69103962 |        0 |        0 | (0,13) |       32769 |      10496 |     24 |        |       | \x05000000
(13 rows)

We have 13 rows, not 10. What the heck just happened?

Let’s examine our three separate update transactions (69103876, 69103916, 69103962) to see what’s happening with the heap:

t_xmin (691103876)

UPDATE percona SET col1 = 20 WHERE col1 = 5;
Logically DELETE tuple ID 5
Physically INSERT tuple ID 11
UPDATE tuple ID 5 pointer (t_tcid) to point to tuple ID 11

Tuple ID 5 becomes a dead row when its t_xmax gets set to the new transaction ID initiated by transaction 691103876.

t_xmin (69103916)

UPDATE percona SET col1 = 30 WHERE col1 = 20;
Logically DELETE tuple ID 11
Physically INSERT tuple ID 12
UPDATE tuple ID 11 pointer (t_tcid) to point to tuple ID 12

Once again, Tuple ID 11 becomes a dead row when its t_xmax set to the new transaction ID initiated by transaction 69103916.

t_xmin (69103962)

UPDATE percona SET col1 = 5 WHERE col1 = 30;
Logically DELETE tuple ID 12
Physically INSERT tuple ID 13
UPDATE tuple ID 12 pointer (t_tcid) to point to tuple ID 13

Tuple ID 13 is live and visible to other transactions. It has no t_xmax and the t_ctid (0,13) points to itself.

The key takeaway from this is that we have not added or deleted rows in our table. We still see 10 in the count, but our heap has increased to 13 by an additional three transactions being executed.

At a very high level, this is how PostgreSQL implements MVCC and why we have table bloat in our heap. In essence, changes to data results in a new row reflecting the latest state of the data. The old rows need to be cleaned or reused for efficiency.

Vacuuming the table

The way to deal with the table bloat is to vacuum the table:

percona=# vacuum percona;
VACUUM

Now, let’s examine the heap again:

percona=# SELECT * FROM heap_page_items(get_raw_page('percona',0));

lp | lp_off | lp_flags | lp_len |  t_xmin  | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data
----+--------+----------+--------+----------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 | 69099597 |      0 |        0 | (0,1)  |           1 |       2304 |     24 |        |       | \x01000000
  2 |   8128 |        1 |     28 | 69099609 |      0 |        0 | (0,2)  |           1 |       2304 |     24 |        |       | \x02000000
  3 |   8096 |        1 |     28 | 69099627 |      0 |        0 | (0,3)  |           1 |       2304 |     24 |        |       | \x03000000
  4 |   8064 |        1 |     28 | 69099655 |      0 |        0 | (0,4)  |           1 |       2304 |     24 |        |       | \x04000000
  5 |     13 |        2 |      0 |          |        |          |        |             |            |        |        |       |
  6 |   8032 |        1 |     28 | 69099778 |      0 |        0 | (0,6)  |           1 |       2304 |     24 |        |       | \x06000000
  7 |   8000 |        1 |     28 | 69099778 |      0 |        0 | (0,7)  |           1 |       2304 |     24 |        |       | \x07000000
  8 |   7968 |        1 |     28 | 69099778 |      0 |        0 | (0,8)  |           1 |       2304 |     24 |        |       | \x08000000
  9 |   7936 |        1 |     28 | 69099778 |      0 |        0 | (0,9)  |           1 |       2304 |     24 |        |       | \x09000000
 10 |   7904 |        1 |     28 | 69099778 |      0 |        0 | (0,10) |           1 |       2304 |     24 |        |       | \x0a000000
 11 |      0 |        0 |      0 |          |        |          |        |             |            |        |        |       |
 12 |      0 |        0 |      0 |          |        |          |        |             |            |        |        |       |
 13 |   7872 |        1 |     28 | 69103962 |      0 |        0 | (0,13) |       32769 |      10496 |     24 |        |       | \x05000000
(13 rows)

After vacuuming the table, rows five, 11, and 12 are now free to be used again.

So let’s insert another row, with the value of 11 and see what happens:

percona=# INSERT INTO percona values (11);
INSERT 0 1

Let’s examine the heap once more:

percona=# SELECT * FROM heap_page_items(get_raw_page('percona',0));

 lp | lp_off | lp_flags | lp_len |  t_xmin  | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data
----+--------+----------+--------+----------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 | 69099597 |      0 |        0 | (0,1)  |           1 |       2304 |     24 |        |       | \x01000000
  2 |   8128 |        1 |     28 | 69099609 |      0 |        0 | (0,2)  |           1 |       2304 |     24 |        |       | \x02000000
  3 |   8096 |        1 |     28 | 69099627 |      0 |        0 | (0,3)  |           1 |       2304 |     24 |        |       | \x03000000
  4 |   8064 |        1 |     28 | 69099655 |      0 |        0 | (0,4)  |           1 |       2304 |     24 |        |       | \x04000000
  5 |     13 |        2 |      0 |          |        |          |        |             |            |        |        |       |
  6 |   8032 |        1 |     28 | 69099778 |      0 |        0 | (0,6)  |           1 |       2304 |     24 |        |       | \x06000000
  7 |   8000 |        1 |     28 | 69099778 |      0 |        0 | (0,7)  |           1 |       2304 |     24 |        |       | \x07000000
  8 |   7968 |        1 |     28 | 69099778 |      0 |        0 | (0,8)  |           1 |       2304 |     24 |        |       | \x08000000
  9 |   7936 |        1 |     28 | 69099778 |      0 |        0 | (0,9)  |           1 |       2304 |     24 |        |       | \x09000000
 10 |   7904 |        1 |     28 | 69099778 |      0 |        0 | (0,10) |           1 |       2304 |     24 |        |       | \x0a000000
 11 |   7840 |        1 |     28 | 69750201 |      0 |        0 | (0,11) |           1 |       2048 |     24 |        |       | \x0b000000
 12 |      0 |        0 |      0 |          |        |          |        |             |            |        |        |       |
 13 |   7872 |        1 |     28 | 69103962 |      0 |        0 | (0,13) |       32769 |      10496 |     24 |        |       | \x05000000
(13 rows)

Our new tuple (with transaction ID 69750201) reused tuple 11, and now the tuple 11 pointer (0,11) is pointing to itself.

As you can see, the heap did not grow to accommodate the new row. It reused an open block for the new row that was made available when we vacuumed the table freeing up dead rows (rows that will no longer be visible in a transaction).

And there you have it. A step-by-step illustration of how bloat occurs in PostgreSQL!

Written by Jorge Torralba in: bloat,Heap,Insight for DBAs,Jorge PP,Postgres,postgresql,vacuum,Zend Developer |

Personal blog of Yzmir Ramirez

The Glass is twice as large as it needs to be

PostgreSQL Tuple-Level Statistics With pgstattuple

pgstattuple functions

pgstattuple queries to check table bloat

An Illustration of PostgreSQL Bloat

What causes bloat?

Setting up an example

How does the table bloat?

Vacuuming the table

Archives

Contributors

Personal blog of Yzmir Ramirez

The Glass is twice as large as it needs to be

PostgreSQL Tuple-Level Statistics With pgstattuple

pgstattuple functions

pgstattuple queries to check table bloat

An Illustration of PostgreSQL Bloat

What causes bloat?

Setting up an example

How does the table bloat?

Vacuuming the table

Tag Cloud

Archives

Contributors