Apr
26
2024
--

Trying out the PostgreSQL pg_tde Tech Preview Release

PostgreSQL pg_tde Tech Preview ReleaseThe tech preview version of the pg_tde extension for PostgreSQL 16 was released on March 28th, with many improvements compared to our previous MVP release. In addition to lots of bug fixes, the new release supports multi-tenancy and key rotation with a simplified configuration mechanism. Note that the Tech Preview release is not recommended for […]

Dec
19
2023
--

Using the Transparent Data Encryption Extension PG_TDE With PostgreSQL

PG_TDE With PostgreSQL

PG_TDE (PostgreSQL Transparent Data Encryption) is an experimental transparent data encryption access method for PostgreSQL 16, available on GitHub as an extension for vanilla PostgreSQL. Since our previous blog post, Adding Transparent Data Encryption to PostgreSQL with pg_tde: Please Test, it received new features such as TOAST data encryption and HashiCorp Vault support for key storage. Today, we’d like to share more details about the extension, including information about its internal architecture.

Deploying PostgreSQL Transparent Data Encryption

PG_TDE is available as an easy-to-use Docker image with everything set up and ready to be used, Ubuntu 22.04 deb package, or as source code on github. The simplest way to run it for testing is to use the docker container:

docker run --name pg-tde -e POSTGRES_PASSWORD=mysecretpassword -d perconalab/postgres-tde-ext

This command starts a container with the extension loaded and already enabled for the Postgres database. Using this, we can create two tables, one encrypted and one using the default access method:

docker run -it --rm postgres psql -h 172.17.0.2 -U postgres
CREATE TABLE test_enc (id SERIAL, t VARCHAR(32), PRIMARY KEY(id)) USING pg_tde;
INSERT INTO test_enc (t) VALUES ('foobar'), ('barfoo');
CREATE TABLE test_basic (id SERIAL, t VARCHAR(32), PRIMARY KEY(id));
INSERT INTO test_basic (t) VALUES ('foobar'), ('barfoo');
SELECT * FROM test_enc;
 id |   t    
----+--------
  1 | fooba
  2 | barfoo
----+--------
  1 | foobar
  2 | barfoo
(2 rows)
SELECT to_regclass('test_enc')::oid; -- 16397
SELECT to_regclass('test_basic')::oid; -- 16404

Then, we can use the docker cp command to retrieve and check the data files from the container:

mkdir /tmp/tdedata
docker cp pg-tde:/var/lib/postgresql /tmp/tdedata
ls -l /tmp/tdedata/postgresql/data/base/5/16404*
-rw------- 1 pg    pg    8192 dec   11 21:49 /tmp/tdedata/postgresql/data/base/5/16404
ls -l /tmp/tdedata/postgresql/data/base/5/16397*
-rw------- 1 pg    pg    8192 dec   11 21:49 /tmp/tdedata/postgresql/data/base/5/16397
-rw------- 1 pg    pg     304 dec   11 21:47 /tmp/tdedata/postgresql/data/base/5/16397.tde
strings /tmp/tdedata/postgresql/data/base/5/16404
barfoo
foobar
strings /tmp/tdedata/postgresql/data/base/5/16397
# no output

The above commands show us that:

  1. The strings “foobar” and “barfoo”, which are rows in the table, are clearly visible in the unencrypted table but can’t be found as-is in the encrypted table.
  2. Compared to the basic unencrypted table, the encrypted table has one additional fork ending with the “.tde” extension, which contains encryption metadata.

How data is encrypted

Going one step further, we can run hexdiff on the two datafiles and look for differences:

hexdiff /tmp/tdedata/postgresql/data/base/5/16397 /tmp/tdedata/postgresql/data/base/5/16404
   Visuel HexDiff v 0.0.53 by tTh 2007                             dec   7bits  
    8000   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8016   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8032   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8048   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8064   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8080   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8096   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8112   ed 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00     
    8128   02 00 02 00 02 09 18 00 3f 72 99 0c ad 72 c4 96            ?r   r 
    8144   af 2d 1c 00 00 00 00 00 ed 02 00 00 00 00 00 00     -       
    8160   00 00 00 00 00 00 00 00 01 00 02 00 02 09 18 00                  
    8176   ae 7b da 47 b6 4c 69 e6 0a e7 cd 00 00 00 00 00     { G Li    
    8192
    8208
    8224
** /tmp/tdedata/postgresql/data/base/5/16397            8192     8000  97%      
    8000   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8016   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8032   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8048   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8064   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8080   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8096   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8112   ef 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    8128   02 00 02 00 02 08 18 00 02 00 00 00 0f 62 61 72                 bar
    8144   66 6f 6f 00 00 00 00 00 ef 02 00 00 00 00 00 00    foo
    8160   00 00 00 00 00 00 00 00 01 00 02 00 02 08 18 00
    8176   01 00 00 00 0f 66 6f 6f 62 61 72 00 00 00 00 00         foobar
    8192
    8208
    8224
   /tmp/tdedata/postgresql/data/base/5/16404            8192     8000  97%

This command shows that the data files have only two differences: the page checksum is different, and later parts of the tuples are different too – shown in the above output.

But it’s also important to note that not everything in the tuple structure is different: only the actual tuple data gets encrypted, and the tuple headers remain unencrypted. As these fields do not contain sensitive data, this is not a security issue and helps with the performance and simplicity of the extension. Our PG_TDE implementation is based on the original heap engine of PostgreSQL, with as few modifications as possible to encrypt the contents of the tables.

For this reason, compared to the commonly used page-level encryption of other database servers, where entire pages are encrypted and decrypted during disk IO, our approach decrypts/encrypts tuples (table rows) when they are accessed or written.

Database keys and encryption algorithms

To encrypt or decrypt any row or part of any row within a table, the encryption in PG_TDE needs the ability to work on any length of data at any offset without changing anything before or after the specific row. To accomplish this, PG_TDE uses the AES-CTR encryption algorithm. To ensure that the extension never reuses the same (key, offset) pairs, every table has its own encryption key. This, with some additional information, is what gets stored in the .tde file:

hexdump -C /tmp/tdedata/postgresql/data/base/5/16397.tde 
00000000  6d 61 73 74 65 72 2d 6b  65 79 00 00 00 00 00 00  |master-key......|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000100  01 00 00 00 00 00 00 00  54 af ea f2 21 f0 80 78  |........T...!..x|
00000110  69 f7 bc e5 19 b7 5f 60  00 00 00 00 00 00 00 00  |i....._`........|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130

The above output shows that the TDE file itself is encrypted using a different encryption key, named “master-key”, and then contains the encryption information for the table itself, encrypted using the previously mentioned key. This key is stored in an external keyring, possibly at a separate location, to increase security. Pg_tde currently supports a local file-based key storage intended only for development and storage using Hashicorp Vault. In the future, we plan to add support for other services, too, for example, the KMIP protocol.

Providing feedback on PG_TDE

Percona wants to make PG_TDE your choice of TDE encryption, and for that, we need as many people testing and providing feedback as possible.

Follow the directions above or on the Github repository. Please let us know what you like and what you dislike about PG_TDE, and let Percona know about any issues you discover. Tell us what additional tooling about PG_TDE you would like to have, as well as any other feedback.

This is open source software, and the old adage about having many eyeballs on the code to ensure its quality is applicable here, as Percona wants your input.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.

 

Download Percona Distribution for PostgreSQL Today!

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com