Feb 08, 2022

PostgreSQL 14 Predefined Roles – Making Access Control Easy


The PostgreSQL RDBMS has seen rapid growth in adoption and usability across a wide range of industries and projects. The same is true of the new methodologies for application development and deployment, which currently focus on making it easy for developers to get the infrastructure and the DB model they need with minimal intervention from other IT teams, so they can jump into their development process as quickly as possible.

The PostgreSQL community is very active and includes new features in every release that cover or improve some necessity of this constantly evolving IT world.

In this blog post, I’m going to go through a “small” feature added in the latest PostgreSQL release, version 14, which certainly comes in handy when the time to grant privileges to users in the DB model arrives: the inclusion of two new predefined roles that can simplify privilege management:

  • pg_read_all_data 
  • pg_write_all_data

This new facility to grant access with a wide scope needs to be used with proper control and awareness, especially when we work in a production environment. With that said, let’s check what this is about.

Users, Roles, and Privileges

When working with a PostgreSQL system, the concepts of Users and Roles may be used interchangeably. Both refer to a cluster-level object that usually represents a database user (User) and/or acts as a logical container for privileges (Role). The syntax is often interchangeable, i.e.:

ALTER ROLE... 
ALTER USER...

NOTE: In this blog, we are going to grant the privileges directly to the users for the sake of simplicity. However, as a best practice, it is advisable to set all permissions on a role rather than on a user, so we get better control over them.
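
For reference, that role-based pattern would look like the following sketch (the app_readers role name is just for illustration):

```sql
-- Privileges go to a "group" role (no login), users inherit them
CREATE ROLE app_readers;
GRANT USAGE ON SCHEMA schema1 TO app_readers;
GRANT SELECT ON ALL TABLES IN SCHEMA schema1 TO app_readers;

-- The login user only gets role membership
CREATE USER appuser WITH PASSWORD 'secretsuperpasswd';
GRANT app_readers TO appuser;
```

Revoking access then becomes a single REVOKE of the role membership instead of undoing every object-level grant.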

Privileges are all those permissions an administrator, or object owner, can grant to a role (and thus to a user, or a set of users); they apply to specific objects within a database context. Depending on the object type, different privileges can be granted. Their usage and effect are beyond the scope of this blog post, but you can review the official documentation as always.

We are going to review a specific use case to see how the new predefined roles can be used to simplify some of the actions related to access control.

The Use Case

Let’s imagine we are working on a new application or service that will use our beloved PostgreSQL database as its backend. This application could be an ETL, a data extraction tool, or even an on-demand microservice; the point is that it requires access to practically all the data in our PG cluster, meaning all the user tables (and views) within all the schemas and all the databases present in the cluster. Consider the following model: two databases, one with two schemas and the other with one schema, each schema owned by a different user, and each schema containing one or more tables. Our application user is going to be named appuser.

database database1:
  schema schema1:
    owner: user1
    table table1:
      columns:
      - id: integer
      - data: text
    table table2:
      columns:
      - id: integer
      - data: text
  schema schema2:
    owner: user2
    table table1:
      columns:
      - id: integer
      - data: text
database database2:
  schema schema1:
    owner: user1
    table table1:
      columns:
      - id: integer
      - data: text

Getting Hands-On

Before trying this with the new predefined roles, let’s check how it would be handled in PostgreSQL 13.

First, create the user

pg13-database1 postgres =# create user appuser;
CREATE ROLE
pg13-database1 postgres =# alter user appuser password 'secretsuperpasswd';
ALTER ROLE

If we connect with our new user and try to access the tables on schema1:

pg13-database1 user1 => \c database1 appuser
Password for user appuser:
psql (14.1, server 13.5 (Debian 13.5-1.pgdg110+1))
You are now connected to database "database1" as user "appuser".

pg13-database1 user1 => select * from schema1.table1 limit 5;
ERROR:  permission denied for schema schema1
LINE 1: select * from schema1.table1 limit 5;
                      ^

Ok, grant USAGE on the schema to the user:

pg13-database1 postgres =# grant usage on schema schema1 to appuser;
GRANT

Now try to select from the table:

pg13-database1 appuser => select * from schema1.table1 limit 5;
ERROR:  permission denied for table table1

Ok, ok, we need to grant SELECT on the table, got it:

pg13-database1 user1 => grant select on schema1.table1 to appuser;
GRANT

pg13-database1 appuser => select * from schema1.table1 limit 5;
 id | data
----+------
  1 | 1
  2 | 2
  3 | 3
  4 | 4
  5 | 5
(5 rows)

Now try the other table in the schema:

pg13-database1 appuser => select * from schema1.table2 limit 5;
ERROR:  permission denied for table table2

Mmmh, ok, let’s grant SELECT over all the tables in the schema:

pg13-database1 user1 => grant select on all tables in schema schema1 to appuser;
GRANT

pg13-database1 appuser => select * from schema1.table2 limit 5 ;
 id | data
----+------
  1 | 1
  2 | 2
  3 | 3
  4 | 4
  5 | 5
(5 rows)

Worked!

What if somebody decides the model needs a new table (table3)? Look at the following:

pg13-database1 user1 => create table schema1.table3 (like schema1.table1) ;
CREATE TABLE

pg13-database1 user1 => insert into schema1.table3(id, data)
select i,i::text from generate_series(1,1000) i;
INSERT 0 1000

Now our appuser tries to read from it; we already granted SELECT on all the tables, right?

pg13-database1 appuser => select * from schema1.table3 limit 5;
ERROR:  permission denied for table table3

We have to repeat the previous grant on the new table, just as we did for table2.

To let our user keep read access even on new tables, we have to alter the default privileges on the schema:

pg13-database1 user1 => alter default privileges in schema schema1 grant select on tables to appuser;
ALTER DEFAULT PRIVILEGES

Now if a new table is created our user will have access to it:

pg13-database1 user1 => create table schema1.table4 (like schema1.table1) ;
CREATE TABLE

pg13-database1 user1 => insert into schema1.table4(id, data)
select i,i::text from generate_series(1,1000) i;
INSERT 0 1000

pg13-database1 appuser => select * from schema1.table4 limit 5 ;
 id | data
----+------
  1 | 1
  2 | 2
  3 | 3
  4 | 4
  5 | 5
(5 rows)

Nice!

But do not forget we have another schema, and the appuser should be able to get the data from there as well.

pg13-database1 appuser => select * from schema2.table1 limit 5;
ERROR:  permission denied for schema schema2

OK, we have to apply all the above… again. 

And also remember we have a different database in the cluster, so:

pg13-database1 appuser => \c database2 appuser
psql (14.1, server 13.5 (Debian 13.5-1.pgdg110+1))
You are now connected to database "database2" as user "appuser".

pg13-database2 appuser =>  select * from schema1.table1 limit 5;
ERROR:  permission denied for schema schema1

We have to do the same for this database’s schemas and tables as well. 

It’s true we can follow the well-known “code once, execute many” principle and turn all these steps into a single piece of automation, but we still need to take them into consideration since they are all required.
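
As a sketch of what that automation could look like in PG13 (assuming appuser already exists; this must be run once per database, and note that ALTER DEFAULT PRIVILEGES only covers objects created by the role executing it, so each schema owner would run it, or a superuser would add FOR ROLE clauses):

```sql
-- Loop over all user schemas in the current database and grant read access.
DO $$
DECLARE
    s text;
BEGIN
    FOR s IN
        SELECT nspname FROM pg_namespace
        WHERE nspname NOT LIKE 'pg\_%'
          AND nspname <> 'information_schema'
    LOOP
        EXECUTE format('GRANT USAGE ON SCHEMA %I TO appuser', s);
        EXECUTE format('GRANT SELECT ON ALL TABLES IN SCHEMA %I TO appuser', s);
        EXECUTE format(
            'ALTER DEFAULT PRIVILEGES IN SCHEMA %I GRANT SELECT ON TABLES TO appuser', s);
    END LOOP;
END $$;
```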

Now, what about the “little” feature in PostgreSQL 14 we want to check out? The new predefined roles are described as follows:

Role: pg_read_all_data

Allowed Access: Read all data (tables, views, sequences), as if having SELECT rights on those objects, and USAGE rights on all schemas, even without having it explicitly. This role does not have the role attribute BYPASSRLS set. If RLS is being used, an administrator may wish to set BYPASSRLS on roles which this role is GRANTed to.

Role: pg_write_all_data

Allowed Access: Write all data (tables, views, sequences), as if having INSERT, UPDATE, and DELETE rights on those objects, and USAGE rights on all schemas, even without having it explicitly. This role does not have the role attribute BYPASSRLS set. If RLS is being used, an administrator may wish to set BYPASSRLS on roles which this role is GRANTed to.

https://www.postgresql.org/docs/14/predefined-roles.html
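
Per the documentation note above, if row-level security is involved and the administrator does want the new user to bypass the RLS policies, the grant plus the attribute change would look like this (a sketch; only do this deliberately, since it disables policy filtering for the role):

```sql
GRANT pg_read_all_data TO appuser;
-- Optional, and only if bypassing row-level security policies is intended:
ALTER ROLE appuser BYPASSRLS;
```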

So, how would our previous scenario work in PostgreSQL 14? Let’s check it out.

As before, we start creating our user

pg14-database1 postgres =# create user appuser;
CREATE ROLE
pg14-database1 postgres =# alter user appuser password 'supersecretpasswd';
ALTER ROLE

If we try accessing the table right now with our new user, we know what is going to happen:

pg14-database1 user1 => \c database1 appuser
Password for user appuser:
You are now connected to database "database1" as user "appuser".

pg14-database1 appuser => select * from schema1.table1 limit 5;
ERROR:  permission denied for schema schema1
LINE 1: select * from schema1.table1 limit 5 ;

The magic starts here, we can grant the new predefined role:

pg14-database1 postgres =# grant pg_read_all_data to appuser ;
GRANT ROLE

And…

pg14-database1 appuser => select * from schema1.table1 limit 5 ;
 id | data
----+------
  1 | 1
  2 | 2
  3 | 3
  4 | 4
  5 | 5
(5 rows)

WORKED!

Even for the other schema:

pg14-database1 appuser => select * from schema2.table1 limit 5 ;
 id | data
----+------
  1 | 1
  2 | 2
  3 | 3
  4 | 4
  5 | 5
(5 rows)

Now let’s add a new table:

pg14-database1 user1 => create table schema1.table3 (like schema1.table1) ;
CREATE TABLE

pg14-database1 user1 => insert into schema1.table3(id, data)
database1-> select i,i::text from generate_series(1,1000000) i;
INSERT 0 1000000

And try the SELECT from it:

pg14-database1 appuser => select * from schema1.table3 limit 5;
 id | data
----+------
  1 | 1
  2 | 2
  3 | 3
  4 | 4
  5 | 5
(5 rows)

WORKED!

What about the other database:

pg14-database1 appuser => \c database2 appuser
You are now connected to database "database2" as user "appuser".

pg14-database2 appuser => select * from schema1.table1 limit 5;
 id |  data
----+------
  1 | 1
  2 | 2
  3 | 3
  4 | 4
  5 | 5
(5 rows)

WOW!

As we can see, a single command did the work of the multiple commands we required in PG13. Cool!

For the write privileges we should do almost the same: 

  • For PG13, grant UPDATE, INSERT, and (acting with caution) DELETE instead of SELECT.
  • In PG14 grant the pg_write_all_data role.

NOTE: If we grant only the DML privileges, directly or via the new PG14 predefined role, without the read part, the user won’t be able to perform UPDATEs or DELETEs over subsets of rows, because those require filtering, and filtering is only possible when read access is in place.

pg14-database1 write_appuser => \dg write_appuser
                List of roles
 Role name      | Attributes |      Member of
----------------+------------+---------------------
 write_appuser  |            | {pg_write_all_data}
 
pg14-database1 write_appuser => insert into schema1.table1(id, data)
values(2000000, 'database1.schema1.table1-2000000');
INSERT 0 1

pg14-database1 write_appuser => update schema1.table1 set data = 'database1.schema1.table1-2000000-upd' where id = 2000000 ;
ERROR:  permission denied for table table1

pg14-database1 write_appuser => delete from schema1.table1 where id = 2000000 ;
ERROR:  permission denied for table table1

pg14-database1 write_appuser => update schema1.table1 set data = '';
UPDATE 1000001
pg14-database1 write_appuser => delete from schema1.table1;
DELETE 1000001
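
To allow the filtered UPDATE/DELETE statements above to succeed, the read role can be granted alongside the write role (sketch, run as a superuser or an admin of these roles):

```sql
GRANT pg_read_all_data, pg_write_all_data TO write_appuser;
-- Now WHERE clauses can read the rows they need to filter:
-- UPDATE schema1.table1 SET data = '...' WHERE id = 2000000;  -- works
```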

This feature might help us get things done quickly, especially if we are working in large environments. However, we always need to be sure who we are giving access to and how it will be used.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.


Jul 26, 2021

The 1-2-3 for PostgreSQL Logical Replication Using an RDS Snapshot


Several PostgreSQL versions support logical decoding to replicate data changes from a source database to a target database. It is a cool and very powerful tool that gives us the option to replicate all the tables in a database, only one schema, a specific set of tables, or even only some columns/rows. It is also a helpful method for version upgrades, since the target database can run on a different (minor or major) PostgreSQL version.

[Diagram: PostgreSQL Logical Replication. Image from: https://severalnines.com/sites/default/files/blog/node_5443/image2.png]

In some cases the databases are hosted in the AWS Relational Database Service (RDS), the fully managed solution offered by Amazon Web Services. It is no secret that choosing this option for our database backend comes with a level of vendor lock-in, and even though RDS offers some built-in replica solutions, such as Multi-AZ or read replicas, sometimes we can still take advantage of the benefits of logical replication.

In this post, I will describe the simple, basic steps I used to implement this replication solution while avoiding the initial data copy from the source database to the target, by creating the target instance from an RDS snapshot. You can certainly take advantage of this when working with a big/huge data set, where the initial copy could lead to long timeframes or network saturation.

NOTE: The following steps were tested and used for a specific scenario and are not intended as a one-size-fits-all solution; rather, they should give some insight into how this can be done and, most importantly, stimulate your own creative thinking.

The Scenario

Service Considerations

In this exercise, I wanted to perform a version upgrade from PostgreSQL v11.9 to PostgreSQL v12.5. We can perform a direct upgrade using the built-in option RDS offers, but that requires a downtime window that can vary depending on factors such as:

  • Is Multi-AZ enabled?
  • Are the auto backups enabled?
  • How transactional is the source database?

During the direct upgrade process, RDS takes a couple of new snapshots of the source instance: first at the beginning of the upgrade and finally when all the modifications are done. Depending on how old the previous backup is and how many changes have been made to the datafiles, the pre-upgrade backup can take some time. Also, if the instance is Multi-AZ, the process has to upgrade both instances, which adds more time to the upgrade. During most of these actions, the database remains inaccessible.

The following is a basic diagram of how an RDS Multi-AZ instance looks: all the client requests are sent to the master instance, while the replica is not accessible and some tasks, like the backups, are executed on it.

[Diagram: PostgreSQL Logical Replication on RDS]

Therefore, I chose logical replication as the mechanism to achieve the objective. We can aim for a quicker switch-over if we create the new instance on the desired version and just replicate all the data changes; then we only need a small downtime window to move the traffic from the original instance to the upgraded new one.

Prerequisites

To be able to perform these actions we would need:

  • An AWS user/access that can operate the DB instances, and take, upgrade, and restore DB snapshots.
  • The AWS user should also be able to describe and create DB PARAMETER GROUPS.
  • A DB user with enough privileges to create the PUBLICATION on the source and the SUBSCRIPTION on the target; it is also advisable to create a dedicated replication user with the minimum permissions.

The 1-2-3 Steps

Per the title of this post, the following is the list of steps to set up PostgreSQL logical replication between a PostgreSQL v11.9 and a v12.5 instance, using an RDS snapshot to initialize the target database.

  1. Verify the PostgreSQL parameters for logical replication
  2. Create the replication user and grant all the required privileges
  3. Create the PUBLICATION
  4. Create a REPLICATION SLOT
  5. Create a new RDS snapshot 
  6. Upgrade the RDS snapshot to the target version
  7. Restore the upgraded RDS snapshot
  8. Get the LSN position 
  9. Create the SUBSCRIPTION
  10. Advance the SUBSCRIPTION 
  11. Enable the SUBSCRIPTION

Source Database Side

1. Verify the PostgreSQL parameters for logical replication

We require the following PostgreSQL parameters for this exercise:

demodb=> select name,setting from pg_settings where name in (
        'wal_level',
        'track_commit_timestamp',
        'max_worker_processes',
        'max_replication_slots',
        'max_wal_senders') ;
          name          | setting
------------------------+---------
 max_replication_slots  | 10
 max_wal_senders        | 10
 max_worker_processes   | 10
 track_commit_timestamp | on
 wal_level              | logical
(5 rows)

NOTE: The track_commit_timestamp parameter can be considered optional, since in some environments the related overhead makes it inadvisable, but it helps track and resolve any conflicts that may occur when the subscriptions are started.

2. Create the replication user and grant all the required privileges

demodb=> CREATE USER pgrepuser WITH password 'SECRET';
CREATE ROLE
demodb=> GRANT rds_replication TO pgrepuser;
GRANT ROLE
demodb=> GRANT SELECT ON ALL TABLES IN SCHEMA public TO pgrepuser;
GRANT

3. Create the PUBLICATION

demodb=> CREATE PUBLICATION pglogical_rep01 FOR ALL TABLES;
CREATE PUBLICATION

4. Create a REPLICATION SLOT

demodb=> SELECT pg_create_logical_replication_slot('pglogical_rep01', 'pgoutput');
 pg_create_logical_replication_slot
------------------------------------
 (pglogical_rep01,3C/74000060)
(1 row)
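
Before moving on, we can confirm the slot exists and note its position (a quick sanity check; the confirmed_flush_lsn column is available in these PostgreSQL versions):

```sql
SELECT slot_name, plugin, slot_type, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_name = 'pglogical_rep01';
```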

AWS RDS Steps

5. Create a new RDS snapshot 

aws rds create-db-snapshot \
    --db-instance-identifier demodb-postgres\
    --db-snapshot-identifier demodb-postgres-to-125

6. Upgrade the RDS snapshot to the target version

aws rds modify-db-snapshot \
    --db-snapshot-identifier demodb-postgres-to-125 \
    --engine-version 12.5

7. Restore the upgraded RDS snapshot 

Since we are moving from version 11.9 to 12.5, we may need to create a new DB parameter group if we are using custom parameters.
From the instance description, we can verify the current parameter group:

aws rds describe-db-instances \
        --db-instance-identifier demodb-postgres | \
jq '.DBInstances | map({DBInstanceIdentifier: .DBInstanceIdentifier, DBParameterGroupName: .DBParameterGroups[0].DBParameterGroupName})'
[
  {
    "DBInstanceIdentifier": "demodb-postgres",
    "DBParameterGroupName": "postgres11-logicalrep"
  }
]

Then we can validate the custom parameters 

aws rds describe-db-parameters \
	--db-parameter-group-name postgres11-logicalrep \
	--query "Parameters[*].[ParameterName,ParameterValue]" \
	--source user --output text 
track_commit_timestamp	1

We need to create a new parameter group in the target version

aws rds create-db-parameter-group \
	--db-parameter-group-name postgres12-logicalrep \
	--db-parameter-group-family postgres12

Finally, we need to modify the parameters we got before in the new parameter group

aws rds modify-db-parameter-group \
	--db-parameter-group-name postgres12-logicalrep \
	--parameters "ParameterName='track_commit_timestamp',ParameterValue=1,ApplyMethod=immediate"

Now we can use the new parameter group to restore the upgraded snapshot

aws rds restore-db-instance-from-db-snapshot \
	--db-instance-identifier demodb-postgres-125 \
	--db-snapshot-identifier demodb-postgres-to-125 \
	--db-parameter-group-name postgres12-logicalrep

8. Get the LSN position from the target instance log

To list all the database logs for the new DB instance

aws rds describe-db-log-files \
	--db-instance-identifier demodb-postgres-125

We should pick the latest database log

aws rds download-db-log-file-portion \
	--db-instance-identifier demodb-postgres-125 \
	--log-file-name "error/postgresql.log.2021-03-23-18"

From the retrieved log portion, we need to find the value after the log entry "redo done at":

...
2021-03-23 18:19:58 UTC::@:[5212]:LOG:  redo done at 3E/50000D08
...

Target Database Side

9. Create the SUBSCRIPTION

demodb=> CREATE SUBSCRIPTION pglogical_sub01 CONNECTION 'host=demodb-postgres.xxxx.us-east-1.rds.amazonaws.com port=5432 dbname=demodb user=pgrepuser password=SECRET' PUBLICATION pglogical_rep01
WITH (
  copy_data = false,
  create_slot = false,
  enabled = false,
  connect = true,
  slot_name = 'pglogical_rep01'
);
CREATE SUBSCRIPTION

10. Advance the SUBSCRIPTION 

We need to get the subscription id

demodb=> SELECT 'pg_'||oid::text AS "external_id"
FROM pg_subscription 
WHERE subname = 'pglogical_sub01';
 external_id
-------------
 pg_73750
(1 row)

Now advance the subscription to the LSN we got in step 8

demodb=> SELECT pg_replication_origin_advance('pg_73750', '3E/50000D08') ;
 pg_replication_origin_advance
--------------------------------

(1 row)

11. Enable the SUBSCRIPTION

demodb=> ALTER SUBSCRIPTION pglogical_sub01 ENABLE;
ALTER SUBSCRIPTION

Once we are done with all the steps, the data changes should flow from the source database to the target; we can check the status in the pg_stat_replication view.
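
For example, a quick check on each side could look like this (both views exist in the versions used here; column names are per the PostgreSQL catalogs):

```sql
-- On the source: replication connections and how far the peer has replayed
SELECT application_name, state, sent_lsn, replay_lsn
FROM pg_stat_replication;

-- On the target: subscription worker status and last received position
SELECT subname, received_lsn, latest_end_lsn, last_msg_receipt_time
FROM pg_stat_subscription;
```

If received_lsn on the target keeps advancing toward sent_lsn on the source, the changes are flowing.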

Conclusion

Choosing DBaaS from cloud vendors brings some advantages and can speed up some implementations, but it comes with costs, and not all the available tools or solutions fit every requirement. That is why it is always advisable to try different approaches and think outside the box; technology can go only as far as our imagination.
