Dec
03
2018
--

Percona Live 2019 Call for Papers is Now Open!

Percona Live CFP 2019

Percona Live 2019Announcing the opening of the Percona Live 2019 Open Source Database Conference call for papers. It will be open from now until January 20, 2019. The Percona Live Open Source Database Conference 2019 takes place May 28-30 in Austin, Texas.

Our theme this year is CONNECT. ACCELERATE. INNOVATE.

As a speaker at Percona Live, you’ll have the opportunity to CONNECT with your peers—open source database experts and enthusiasts who share your commitment to improving knowledge and exchanging ideas. ACCELERATE your projects and career by presenting at the premier open source database event, a great way to build your personal and company brands. And influence the evolution of the open source software movement by demonstrating how you INNOVATE!

Community initiatives remain core to the open source ethos, and we are proud of the contribution we make with Percona Live in showcasing thought leading practices in the open source database world.

With a nod to innovation, this year we are introducing a business track to benefit those business leaders who are exploring the use of open source and are interested in learning more about its costs and benefits.

Speaking Opportunities

The Percona Live Open Source Database Conference 2019 Call for Papers is open until January 20, 2019. We invite you to submit your speaking proposal for breakout, tutorial or lightning talk sessions. Classes and talks are invited for Foundation (either entry-level or of general interest to all), Core (intermediate), and Masterclass (advanced) levels.

  • Breakout Session. Broadly cover a technology area using specific examples. Sessions should be either 25 minutes or 50 minutes in length (including Q&A).
  • Tutorial Session. Present a technical session that aims for a level between a training class and a conference breakout session. We encourage attendees to bring and use laptops for working on detailed and hands-on presentations. Tutorials will be three or six hours in length (including Q&A).
  • Lightning Talk. Give a five-minute presentation focusing on one key point that interests the open source community: technical, lighthearted or entertaining talks on new ideas, a successful project, a cautionary story, a quick tip or demonstration.

If your proposal is selected for breakout or tutorial sessions, you will receive a complimentary full conference pass.

Topics and Themes

We want proposals that cover the many aspects of application development using all open source databases, as well as new and interesting ways to monitor and manage database environments. Did you just embrace open source databases this year? What are the technical and business values of moving to or using open source databases? How did you convince your company to make the move? Was there tangible ROI?

Best practices and current trends, including design, application development, performance optimization, HA and clustering, cloud, containers and new technologies –  what’s holding your focus? Share your case studies, experiences and technical knowledge with an engaged audience of open source peers.

In the submission entry, indicate which of these themes your proposal best fits: tutorial, business needs; case studies/use cases; operations; or development. Also include which track(s) from the list below would be best suited to your talk.

Tracks

The conference committee is looking for proposals that cover the many aspects of using, deploying and managing open source databases, including:

  • MySQL. Do you have an opinion on what is new and exciting in MySQL? With the release of MySQL 8.0, are you using the latest features? How and why? Are they helping you solve any business issues, or making deployment of applications and websites easier, faster or more efficient? Did the new release influence you to change to MySQL? What do you see as the biggest impact of the MySQL 8.0 release? Do you use MySQL in conjunction with other databases in your environment?
  • MariaDB. Talks highlighting MariaDB and MariaDB compatible databases and related tools. Discuss the latest features, how to optimize performance, and demonstrate the best practices you’ve adopted from real production use cases and applications.
  • PostgreSQL. Why do you use PostgreSQL as opposed to other SQL options? Have you done a comparison or benchmark of PostgreSQL vs. other types of databases related to your applications? Why, and what were the results? How does PostgreSQL help you with application performance or deployment? How do you use PostgreSQL in conjunction with other databases in your environment?
  • MongoDB. Has the 4.0 release improved your experience in application development or time-to-market? How are the new features making your database environment better? What is it about MongoDB 4.0 that excites you? What are your experiences with Atlas? Have you moved to it, and has it lived up to its promises? Do you use MongoDB in conjunction with other databases in your environment?
  • Polyglot Persistence. How are you using multiple open source databases together? What tools and technologies are helping you to get them interacting efficiently? In what ways are multiple databases working together helping to solve critical business issues? What are the best practices you’ve discovered in your production environments?
  • Observability and Monitoring. How are you designing your database-powered applications for observability? What monitoring tools and methods are providing you with the best application and database insights for running your business? How are you using tools to troubleshoot issues and bottlenecks? How are you observing your production environment in order to understand the critical aspects of your deployments? 
  • Kubernetes. How are you running open source databases on the Kubernetes, OpenShift and other container platforms? What software are you using to facilitate their use? What best practices and processes are making containers a vital part of your business strategy? 
  • Automation and AI. How are you using automation to run databases at scale? Are you using automation to create self-running, self-healing, and self-tuning databases? Is machine learning and artificial intelligence (AI) helping you create a new generation of database automation?
  • Migration to Open Source Databases. How are you migrating to open source databases? Are you migrating on-premises or to the cloud? What are the tools and strategies you’ve used that have been successful, and what have you learned during and after the migration? Do you have real-world migration stories that illustrate how best to migrate?
  • Database Security and Compliance. All of us have experienced security and compliance challenges. From new legislation like GDPR, PCI and HIPAA, exploited software bugs, or new threats such as ransomware attacks, when is enough “enough”? What are your best practices for preventing incursions? How do you maintain compliance as you move to the cloud? Are you finding that security and compliance requirements are preventing your ability to be agile?
  • Other Open Source Databases. There are many, many great open source database software and solutions we can learn about. Submit other open source database talk ideas – we welcome talks for both established database technologies as well as the emerging new ones that no one has yet heard about (but should).
  • Business and Enterprise. Has your company seen big improvements in ROI from using Open Source Databases? Are there efficiency levels or interesting case studies you want to share? How did you convince your company to move to Open Source?

How to Respond to the Call for Papers

For information on how to submit your proposal, visit our call for papers page.

Sponsorship

If you would like to obtain a sponsor pack for Percona Live Open Source Database Conference 2019, you will find more information including a prospectus on our sponsorship page. You are welcome to contact me, Bronwyn Campbell, directly.

Nov
30
2018
--

PostgreSQL Streaming Physical Replication With Slots

postgres replication using slots

PostgreSQLPostgreSQL streaming physical replication with slots simplifies setup and maintenance procedures. Usually, you should estimate disk usage for the Write Ahead Log (WAL) and provide appropriate limitation to the number of segments and setup of the WAL archive procedure. In this article, you will see how to use replication with slots and understand what problems it could solve.

Introduction

PostgreSQL physical replication is based on WAL. Th Write Ahead Log contains all database changes, saved in 16MB segment files. Normally postgres tries to keep segments between checkpoints. So with default settings, just 1GB of WAL segment files is available.

Replication requires all WAL files created after backup and up until the current time. Previously, it was necessary to keep a huge archive directory (usually mounted by NFS to all slave servers). The slots feature introduced in 9.4 allows Postgres to track the latest segment downloaded by a slave server. Now, PostgreSQL can keep all segments on disk, even without archiving, if a slave is seriously behind its master due to downtime or networking issues. The drawback: the disk space could be consumed infinitely in the case of configuration error. Before continuing, if you need a better understanding of physical replication and streaming replication, I recommend you read “Streaming Replication with PostgreSQL“.

Create a sandbox with two PostgreSQL servers

To setup replication, you need at least two PostgreSQL servers. I’m using pgcli (pgc) to setup both servers on the same host. It’s easy to install on Linux, Windows, and OS X, and provides the ability to download and run any version of PostgreSQL on your staging server or even on your laptop.

python -c "$(curl -fsSL https://s3.amazonaws.com/pgcentral/install.py)"
mv bigsql master
cp -r master slave
$ cd master
master$ ./pgc install pg10
master$ ./pgc start pg10
$ cd ../slave
slave$ ./pgc install pg10
slave$ ./pgc start pg10

First of all you should allow the replication user to connect:

master$ echo "host replication replicator 127.0.0.1/32 md5" >> ./data/pg10/pg_hba.conf

If you are running master and slave on different servers, please replace 127.0.0.1 with the slave’s address.

Next pgc creates a shell environment file with PATH and all the other variables required for PostgreSQL:

master$ source ./pg10/pg10.env

Allow connections from the remote host, and create a replication user and slot on master:

master$ psql
postgres=# CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'replicator';
CREATE ROLE
postgres=# ALTER SYSTEM SET listen_addresses TO '*';
ALTER SYSTEM
postgres=# SELECT pg_create_physical_replication_slot('slot1');
pg_create_physical_replication_slot
-------------------------------------
(slot1,)

To apply system variables changes and hba.conf, restart the Postgres server:

master$ ./pgc stop ; ./pgc start
pg10 stopping
pg10 starting on port 5432

Test table

Create a table with lots of padding on the master:

master$ psql psql (10.6) Type "help" for help.
postgres=# CREATE TABLE t(id INT, pad CHAR(200));
postgres=# CREATE INDEX t_id ON t (id);
postgres=# INSERT INTO t SELECT generate_series(1,1000000) AS id, md5((random()*1000000)::text) AS pad;

Filling WAL with random data

To see the benefits of slots, we should fill the WAL with some data by running transactions. Repeat the update statement below to generate a huge amount of WAL data:

UPDATE t SET pad = md5((random()*1000000)::text);

Checking the current WAL size

You can check total size for all WAL segments from the shell or from psql:

master$ du -sh data/pg10/pg_wal
17M data/pg10/pg_wal
master$ source ./pg10/pg10.env
master$ psql
postgres=# \! du -sh data/pg10/pg_wal
17M data/pg10/pg_wal

Check maximum WAL size without slots activated

Before replication configuration, we can fill the WAL with random data and find that after 1.1G, the data/pg10/pg_wal directory size does not increase regardless of the number of update queries.

postgres=# UPDATE t SET pad = md5((random()*1000000)::text); -- repeat 4 times
postgres=# \! du -sh data/pg10/pg_wal
1.1G data/pg10/pg_wal
postgres=# UPDATE t SET pad = md5((random()*1000000)::text);
postgres=# \! du -sh data/pg10/pg_wal
1.1G data/pg10/pg_wal

Backup master from the slave server

Next, let’s make a backup for our slot1:

slave$ source ./pg10/pg10.env
slave$ ./pgc stop pg10
slave$ rm -rf data/pg10/*
# If you are running master and slave on different servers, replace 127.0.0.1 with master's IP address.
slave$ PGPASSWORD=replicator pg_basebackup -S slot1 -h 127.0.0.1 -U replicator -p 5432 -D $PGDATA -Fp -P -Xs -Rv

Unfortunately pg_basebackup hangs with: initiating base backup, waiting for checkpoint to complete.
We can wait for the next checkpoint, or force the checkpoint on the master. Checkpoint happens every checkpoint_timeout seconds, and is set to five minutes by default.

Forcing checkpoint on master:

master$ psql
postgres=# CHECKPOINT;

The backup continues on the slave side:

pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/92000148 on timeline 1
pg_basebackup: starting background WAL receiver
1073986/1073986 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/927FDDE8
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: base backup completed

The backup copies settings from the master, including its TCP port value. I’m running both master and slave on the same host, so I should change the port in the slave .conf file:

slave$ vim data/pg10/postgresql.conf
# old value port = 5432
port = 5433

Now we can return to the master and run some queries:

slave$ cd ../master
master$ source pg10/pg10.env
master$ psql
postgres=# UPDATE t SET pad = md5((random()*1000000)::text);
UPDATE t SET pad = md5((random()*1000000)::text);

By running these queries, the WAL size is now 1.4G, and it’s bigger than 1.1G! Repeat this update query three times and the WAL grows to 2.8GB:

master$ du -sh data/pg10/pg_wal
2.8G data/pg10/pg_wal

Certainly, the WAL could grow infinitely until whole disk space is consumed.
How do we find out the reason for this?

postgres=# SELECT redo_lsn, slot_name,restart_lsn,
round((redo_lsn-restart_lsn) / 1024 / 1024 / 1024, 2) AS GB_behind
FROM pg_control_checkpoint(), pg_replication_slots;
redo_lsn    | slot_name | restart_lsn | gb_behind
------------+-----------+-------------+-----------
1/2A400630  | slot1     |  0/92000000 | 2.38

We have one slot behind the master of 2.38GB.

Let’s repeat the update and check again. The gap has increased:

postgres=# postgres=# SELECT redo_lsn, slot_name,restart_lsn,
round((redo_lsn-restart_lsn) / 1024 / 1024 / 1024, 2) AS GB_behind
FROM pg_control_checkpoint(), pg_replication_slots;
redo_lsn    | slot_name | restart_lsn | gb_behind
------------+-----------+-------------+-----------
1/8D400238  |     slot1 | 0/92000000  | 3.93

Wait, though: we have already used slot1 for backup! Let’s start the slave:

master$ cd ../slave
slave$ ./pgc start pg10

Replication started without any additional change to recovery.conf:

slave$ cat data/pg10/recovery.conf
standby_mode = 'on'
primary_conninfo = 'user=replicator password=replicator passfile=''/home/pguser/.pgpass'' host=127.0.0.1 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres target_session_attrs=any'
primary_slot_name = 'slot1'

pg_basebackup -R option instructs backup to write to the recovery.conf file with all required options, including primary_slot_name.

WAL size, all slots connected

The gap reduced several seconds after the slave started:

postgres=# SELECT redo_lsn, slot_name,restart_lsn,
round((redo_lsn-restart_lsn) / 1024 / 1024 / 1024, 2) AS GB_behind
FROM pg_control_checkpoint(), pg_replication_slots;
redo_lsn    | slot_name | restart_lsn | gb_behind
------------+-----------+-------------+-----------
 1/8D400238 |     slot1 |  0/9A000000 | 3.80

And a few minutes later:

postgres=# SELECT redo_lsn, slot_name,restart_lsn,
round((redo_lsn-restart_lsn) / 1024 / 1024 / 1024, 2) AS GB_behind
FROM pg_control_checkpoint(), pg_replication_slots;
redo_lsn    | slot_name | restart_lsn | gb_behind
------------+-----------+-------------+-----------
 1/9E5DECE0 |     slot1 |  1/9EB17250 | -0.01
postgres=# \!du -sh data/pg10/pg_wal
1.3G data/pg10/pg_wal

Slave server maintenance

Let’s simulate slave server maintenance with ./pgc stop pg10 executed on the slave. We’ll push some data onto the master again (execute the UPDATE query 4 times).

Now, “slot1” is again 2.36GB behind.

Removing unused slots

By now, you might realize that a problematic slot is not in use. In such cases, you can drop it to allow retention for segments:

master$ psql
postgres=# SELECT pg_drop_replication_slot('slot1');

Finally the disk space is released:

master$ du -sh data/pg10/pg_wal
1.1G data/pg10/pg_wal

Important system variables

  • archive_mode is not required for streaming replication with slots.
  • wal_level – is replica by default
  • max_wal_senders – set to 10 by default, a minimum of three for one slave, plus two for each additional slave
  • wal_keep_segments – 32 by default, not important because PostgreSQL will keep all segments required by slot
  • archive_command – not important for streaming replication with slots
  • listen_addresses – the only option that it’s necessary to change, to allow remote slaves to connect
  • hot_standby – set to on by default, important to enable reads on slave
  • max_replication_slots – 10 by default https://www.postgresql.org/docs/10/static/runtime-config-replication.html

Summary

  • Physical replication setup is really easy with slots. By default in pg10, all settings are already prepared for replication setup.
  • Be careful with orphaned slots. PostgreSQL will not remove WAL segments for inactive slots with initialized restart_lsn.
  • Check pg_replication_slots restart_lsn value and compare it with current redo_lsn.
  • Avoid long downtime for slave servers with slots configured.
  • Please use meaningful names for slots, as that will simplify debug.

References

Nov
19
2018
--

Installing and Configuring JIT in PostgreSQL 11

JIT with PostgreSQL

JIT in PostgreSQLJust-in-time (JIT in PostgreSQL) compilation of SQL statements is one of the highlighted features in PostgreSQL 11. There is great excitement in the community because of the many claims of up to a 30% jump in performance. Not all queries and workloads get the benefit of JIT compilation. So you may want to test your workload against this new feature.

However, It is important to have a general understanding of what it does and where we can expect the performance gains. Installing PostgreSQL 11 with the new JIT compilation feature requires few extra steps and packages. Taking the time and effort to figure out how to do this shouldn’t be a reason to shy away from trying these cutting-edge features and testing a workload against the JIT feature. This blog post is for those who want to try it.

What is JIT and What it does in PostgreSQL

Normal SQL execution in any DBMS software is similar to what an interpreted language does to the source code. No machine code gets generated out of your SQL statement. But we all know that how dramatic the performance gains can be from a JIT compilation and execution of the machine code it generates. We saw the magic Google V8 engine did to JavaScript language. The quest for doing a similar thing with SQL statement was there for quite some time. But it is a challenging task.

It is challenging because we don’t have the source code (SQL statement) ready within the PostgreSQL server. The source code that needs to undergo JIT need to come from client connections and there could be expressions/functions with a different number of arguments, and it may be dealing with tables of different number and type of columns.

Generally, a computer program won’t get modified at this level while it is running, so branching-predictions are possible. The unpredictability and dynamic nature of SQL statements coming from client connections and hitting the database from time-to-time give no scope for doing advance prediction or compilation in advance. That means the JIT compiler should kick in every time the database gets an SQL statement. For this reason, PostgreSQL needs the help of compiler infrastructure like LLVM  continuously available behind. Even though there were a couple of other options, the main developer of this feature (Andres Freund) had a strong reason why LLVM was the right choice.

. In PostgreSQL 11, the JIT feature currently does:

  1. Accelerating expression evaluation: Expressions in  WHERE clauses, target lists, aggregates and projections
  2. Tuple deforming: Converting on-disk image to corresponding memory representation.
  3. In-lining: bodies of small custom functions, operators and user-defined data types are inline-ed into the expressions using them
  4. You can use compiler optimizations provided by LLVM for preparing optimized machine code.

In this blog, we are going to see how to install PostgreSQL with JIT. Just like regular PostgreSQL installations, we have two options:

  1. Get PostgreSQL from the packages in the PGDG repository
  2. Build PostgreSQL from source

Option 1. Install from PGDG repository.

Compiling from source requires us to install all compilers and tools. We might want to avoid this for various reasons. Installing packages from a PGDG repository is straightforward. On production systems or a container, you might want to install only the bare minimum required packages. Additional packages you don’t really use are always a security concern. Distributions like Ubuntu provide more recent versions of libraries and tool-sets in their default repos. However, distributions like CentOS / RHEL are quite conservative — their priority is stability and proven servers rather than cutting-edge features. So In this section of the post is mostly relevant for CentOS7/RHEL 7.

Here are the steps for the bare minimum installation of PostgreSQL with JIT feature on CentOS7

Step 1. Install PGDG repo and Install PostgreSQL server package.

This is usually the bare minimum installation if we don’t need the JIT feature.

sudo yum install https://download.postgresql.org/pub/repos/yum/11/redhat/rhel-7-x86_64/pgdg-centos11-11-2.noarch.rpm
sudo yum install postgresql11-server

At this stage, we can initialize the data directory and start the service if we don’t need JIT:

sudo /usr/pgsql-11/bin/postgresql*-setup initdb
sudo systemctl start postgresql-11

Step 2. Install EPEL repository

sudo yum install epel-release

Step 3. Install package for PostgreSQL with llvmjit

sudo yum install postgresql11-llvmjit

Since we have already added the EPEL repository, now the dependancy can be resolved by YUM and it can pull and install the necessary package from EPEL. Installation message contains the necessary packages.

...
Installing:
postgresql11-llvmjit      x86_64     11.1-1PGDG.rhel7     pgdg11    9.0 M
Installing for dependencies:
llvm5.0                   x86_64     5.0.1-7.el7          epel      2.6 M
llvm5.0-libs              x86_64     5.0.1-7.el7          epel      13 M
...

As we can see, there are two packages: llvm5.0 and llvm5.0-libs get installed.

Note for Ubuntu users:

As we already mentioned, Repositories of recent versions of Ubuntu contains recent versions of LLVM libraries. For example, Ubuntu 16.04 LTS repo contains libllvm6.0 by default. Moreover, PostgreSQL server package is not divided to have a separate package for jit related files. So default installation of PostgreSQL 11 can get you JIT feature also.

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
sudo apt install postgresql-11

Option 2. Building from Source

The primary means of distributing PostgreSQL is the source code. Building a minimal PostgreSQL instance requires just a C compiler. But building JIT options requires a few more things. One of the challenges you can run into is different errors during the build process due to older versions of LLVM and Clang present in the system.

Step 1. Download PostgreSQL source tarball and unpack

Tarballs are available in the repository. We can grab and unpack the latest:

curl -LO https://ftp.postgresql.org/pub/source/v11.0/postgresql-11.0.tar.bz2
tar -xvf postgresql-11.0.tar.bz2

Step 2.  Get SCL Repository and Install toolset

Latest versions of LLVM, CLang and GCC are available in SCL. We can get everything in a stretch:

sudo yum install centos-release-scl
sudo yum install llvm-toolset-7 llvm-toolset-7-llvm-devel.x86_64

Now either you can set or edit your PATH to have all new tools in PATH. I would prefer to put that into my profile file:

PATH=/opt/rh/devtoolset-7/root/usr/bin/:/opt/rh/llvm-toolset-7/root/usr/bin/:$PATH

Alternatively, we can open a shell with SCL enabled:

scl enable devtoolset-7 llvm-toolset-7 bash

We should attempt to compile the source from a shell with all these paths set.

Step 3. Install Additional libraries/tools

Based on the configuration options you want, this list may change. Consider this as a sample for demonstration purposes:

sudo yum install  readline-devel zlib-devel libxml2-devel openssl-devel

Step 4. Configure with –with-llvm option and make

Now we should be able to configure and make with our preferred options. The JIT feature will be available if the 

--with-llvm

 option is specified. For this demonstration, I am using an installation directory with my home (/home/postgres/pg11):

./configure --prefix=/home/postgres/pg11 --with-openssl --with-libxml --with-zlib --with-llvm
make
make install

Enabling JIT

You may observe that there is a new directory under the PostgreSQL’s lib folder with name

bit code

Which contains lots of files with .bc extension these are pre-generated bytecodes for LLVM for facilitating features like in-lining.

By default, the JIT feature is disabled in PostgreSQL 11. If you want to test it, you may have to enable the parameter

jit

:

postgres=# ALTER SYSTEM SET jit=on;
ALTER SYSTEM
postgres=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)
postgres=# show jit;
 jit
-----
 on
(1 row)

By default, most of the simple queries won’t use JIT because of the cost. The cost is high when JIT kicks in. In case we want to test if JIT is properly configured, we can lie to PostgreSQL that that cost is very low by adjusting the parameter value. However, we should keep in mind that we are accepting negative performance gains. Let me show a quick example:

postgres=# SET jit_above_cost=5;
SET
postgres=# create table t1 (id int);
CREATE TABLE
postgres=# insert into t1 (SELECT (random()*100)::int FROM generate_series(1,800000) as g);
INSERT 0 800000
postgres=# analyze t1;
ANALYZE
postgres=# explain select sum(id) from t1;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=8706.88..8706.89 rows=1 width=8)
   ->  Gather  (cost=8706.67..8706.88 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=7706.67..7706.68 rows=1 width=8)
               ->  Parallel Seq Scan on t1  (cost=0.00..6873.33 rows=333333 width=4)
 JIT:
   Functions: 6
   Options: Inlining false, Optimization false, Expressions true, Deforming true
(8 rows)

As we can see in the above example, a separate JIT section comes up in the explain plan.

We expect JIT compilation to make a difference in complex analytical queries because the overhead in JIT compilation gets compensated only if the code runs for the duration. Here is a simple aggregate query for demonstration. (I know this is not a complex query, and not the perfect example for demonstrating JIT feature):

postgres=# EXPLAIN ANALYZE SELECT COMPANY_ID,
      SUM(SHARES) TOT_SHARES,
      SUM(SHARES* RATE) TOT_INVEST,
      MIN(SHARES* RATE) MIN_TRADE,
      MAX(SHARES* RATE) MAX_TRADE,
      SUM(SHARES* RATE * 0.002) BROKERAGE
FROM TRADING
GROUP BY COMPANY_ID;
                                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=757298.72..758741.91 rows=5005 width=138) (actual time=16992.290..17011.395 rows=5000 loops=1)
   Group Key: company_id
   ->  Gather Merge  (cost=757298.72..758466.64 rows=10010 width=138) (actual time=16992.270..16996.919 rows=15000 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=756298.70..756311.21 rows=5005 width=138) (actual time=16983.900..16984.356 rows=5000 loops=3)
               Sort Key: company_id
               Sort Method: quicksort  Memory: 1521kB
               Worker 0:  Sort Method: quicksort  Memory: 1521kB
               Worker 1:  Sort Method: quicksort  Memory: 1521kB
               ->  Partial HashAggregate  (cost=755916.09..755991.16 rows=5005 width=138) (actual time=16975.997..16981.354 rows=5000 loops=3)
                     Group Key: company_id
                     ->  Parallel Seq Scan on trading  (cost=0.00..287163.65 rows=12500065 width=12) (actual time=0.032..1075.833 rows=10000000 loops=3)
 Planning Time: 0.073 ms
 Execution Time: 17013.116 ms
(15 rows)

We can switch on the JIT parameter at the session level and retry the same query:

postgres=# SET JIT=ON;
SET
postgres=# EXPLAIN ANALYZE SELECT COMPANY_ID,
      SUM(SHARES) TOT_SHARES,
      SUM(SHARES* RATE) TOT_INVEST,
      MIN(SHARES* RATE) MIN_TRADE,
      MAX(SHARES* RATE) MAX_TRADE,
      SUM(SHARES* RATE * 0.002) BROKERAGE
FROM TRADING
GROUP BY COMPANY_ID;
                                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=757298.72..758741.91 rows=5005 width=138) (actual time=15672.809..15690.901 rows=5000 loops=1)
   Group Key: company_id
   ->  Gather Merge  (cost=757298.72..758466.64 rows=10010 width=138) (actual time=15672.781..15678.736 rows=15000 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=756298.70..756311.21 rows=5005 width=138) (actual time=15661.144..15661.638 rows=5000 loops=3)
               Sort Key: company_id
               Sort Method: quicksort  Memory: 1521kB
               Worker 0:  Sort Method: quicksort  Memory: 1521kB
               Worker 1:  Sort Method: quicksort  Memory: 1521kB
               ->  Partial HashAggregate  (cost=755916.09..755991.16 rows=5005 width=138) (actual time=15653.390..15658.581 rows=5000 loops=3)
                     Group Key: company_id
                     ->  Parallel Seq Scan on trading  (cost=0.00..287163.65 rows=12500065 width=12) (actual time=0.039..1084.820 rows=10000000 loops=3)
 Planning Time: 0.072 ms
 JIT:
   Functions: 28
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 5.844 ms, Inlining 137.226 ms, Optimization 201.152 ms, Emission 125.022 ms, Total 469.244 ms
 Execution Time: 15696.092 ms
(19 rows)

Here we see a 7.7% improvement in performance. I executed this several times and found that the performance gain is consistently 7-8% for this simple query (which takes 15 seconds to execute). The gains are higher for queries with more calculations/expressions.

Summary

It is fairly simple to install and configure JIT with PostgreSQL as demonstrated above. One point we would like to highlight is that installing JIT packages and enabling the JIT feature can be done online while the database is up and running. This is because all JIT related parameters are dynamic in nature. Parameter changes can be loaded a SIGHUP signal or

SELECT pg_reload_conf()

 by the superuser. If it is not helping our workload, we can turn it off anytime. Nothing stops you from trying it in a non-production environment. We might not see a gain in small and simple queries that take less time for execution because the overhead in doing the JIT compilation can become more than executing the SQL statement. But we should expect a good gain in OLAP workload with complex queries that run for a longer duration.

Nov
16
2018
--

Newly-Released PostgreSQL Minor Versions: Time to Update!

PostgreSQL Minor Versions

In this blog post we’ll look at what the newly-released PostgreSQL minor versions contain. You probably want to update your current versions and use these updates.

You might already have seen that they released the updates for supported PostgreSQL versions on November 8, 2018. PostgreSQL releases minor versions with several bug fixes and feature enhancements each quarter. An important point to note is that PostgreSQL 9.3 got its final minor version release (9.3.24) this quarter, and is no longer supported.

We always recommended that you keep your PostgreSQL databases updated to the latest minor versions. Applying a minor release might need a restart after updating the new binaries. The following is the sequence of steps you should follow to upgrade to the latest minor versions:

  1. Shutdown the PostgreSQL database server
  2. Install the updated binaries
  3. Restart your PostgreSQL database server

Most times, you can choose to update the minor versions in a rolling fashion, in a master-slave (replication) setup. Just perform the update on one server after another, but not all-at-once. Rolling updates avoid downtime for both reads and writes simultaneously. However, we recommended that you shutdown, update and restart them all-at-once while you are performing the updates.

One of the most important fixes is a security fix: CVE-2018-16850. The bug allowed an attacker with CREATE privileges on some non-temporary schema or TRIGGER privileges on some table to create a malicious trigger that, when dumped and restored using pg_dump/pg_restore, would result in additional SQL statements being executed. This applies to PostgreSQL 10 and 11 versions.

Before proceeding further, let’s look at the list of minor versions released this quarter.

  • PostgreSQL 11.1
  • PostgreSQL 10.6
  • PostgreSQL 9.6.11
  • PostgreSQL 9.5.15
  • PostgreSQL 9.4.20
  • PostgreSQL 9.3.25

Now, let us look into the benefits you should see by updating your Postgres versions with the latest minor versions.

PostgreSQL 11.1

PostgreSQL 11.0 was released on October 18, 2018. You might want to look at our blog post on our first take on PostgreSQL 11. With the new minor release PostgreSQL 11.1, we get some interesting functionalities and fixes after 21 days of its previous release, as seen here. The following is a small list of fixes that you might find interesting:

  • Ability to create child indexes of partition tables in another tablespace.
  • NULL handling in parallel hashed multi-batch left joins.
  • Fix to the strictness logic that incorrectly ignored rows of ORDER BY values those were null.
  • Disable recheck_on_update that is forced off for all indexes, as it is not ready yet and requires more time.
  • Prevent creation of a partition in a trigger attached to its parent table
  • Disallow the pg_read_all_stats role from executing pg_stat_statements_reset() to rest the stats as it is just needed for monitoring. You must run ALTER EXTENSION pg_stat_statements UPDATE … to get this into effect.

PostgreSQL 10.6

There are some common fixes that were applied to PostgreSQL 11.1 and PostgreSQL 10.6. You can find PostgreSQL 10.6 release details here. Some of the fixes applied to PostgreSQL 10.6 are in common with other supported PostgreSQL versions, as highlighted below:

  • Disallow the pg_read_all_stats role from executing pg_stat_statements_reset() to rest the stats as it is just needed for monitoring. (PostgreSQL 11.1 and 10.6)
  • Avoid pushing sub-SELECTs containing window functions, LIMIT, or OFFSET to parallel workers to avoid the behavior of different workers getting different answers due to row-ordering variations. (PostgreSQL 9.6 and 10.6)
  • Fixed the WAL file recycling logic that might make a standby fail to remove the WAL files that need to be removed. (PostgreSQL 9.5.15, 9.6.11 and 10.6)
  • Handling of commit-timestamp tracking during recovery has been fixed to avoid recovery failures while trying to fetch the commit timestamp for a transaction that did not record it. (PostgreSQL 9.5.15, 9.6.11 and 10.6)
  • Applied the fix that ensures that the background workers are stopped properly when the postmaster receives a fast-shutdown request before completing the database startup. (PostgreSQL 9.5.15, 9.6.11 and 10.6)
  • Fixed unnecessary failures or slow connections when multiple target host names are used in libpq so that the DNS lookup happens one at a time but not all at once. (PostgreSQL 10.6)

Following is a list of some common fixes applied to PostgreSQL 9.6.11, PostgreSQL 9.5.15, PostgreSQL 9.4.20 and PostgreSQL 9.3.25:

  • Avoid O(N^2) slowdown in regular expression match/split functions on long strings.
  • Fix mis-execution of SubPlans when the outer query is being scanned backward.
  • Fix failure of UPDATE/DELETE … WHERE CURRENT OF … after rewinding the referenced cursor.
  • Fix EvalPlanQual to avoid crashes or wrong answers in concurrent updates, when the code contains an uncorrelated sub-SELECT inside a CASE construct.
  • Memory leak in repeated SP-GiST index scans has been fixed.
  • Ensure that hot standby processes use the correct WAL consistency point to prevent possible misbehavior after reaching a consistent database state during WAL replay.
  • Fix possible inconsistency in pg_dump’s sorting of dissimilar object names.
  • Ensure that pg_restore will schema-qualify the table name when emitting DISABLE/ENABLE TRIGGER commands when a restore is ran using a restrictive search_path.
  • Fix pg_upgrade to handle event triggers in extensions correctly.
  • Fix pg_upgrade’s cluster state check to work correctly on a standby server.
  • Fix build problems on macOS 10.14

Now that you understand the added fixes to existing PostgreSQL versions, we recommend that you test and update your PostgreSQL databases with the new minor versions (if you haven’t already).

If you are currently running your databases on PostgreSQL 9.3.x or earlier, we recommend that you to prepare a plan to upgrade your PostgreSQL databases to the supported versions ASAP. Please subscribe to our blog posts so that you know about the various options on upgrading your PostgreSQL databases to a supported major version.

Nov
15
2018
--

Reference Architecture(s) for High Availability Solutions in Geographic Distributed Scenarios: Why Should I Care?

High Availability Solutions

High Availability Solutions

High Availability Solutions. Shutterstock.com

In this series of blog posts, I’m going to look at some high availability reference architecture solutions over geographically distributed areas.

The Problem

Nowadays, when businesses plan a new service or application, it is very common for them to worry about ensuring a very high level of availability. 

It doesn’t matter if we are talking about an online shop, online banking or the internal services of a large organization. We know users are going to expect access to services 24x7x365. They also expect to access data consistently and instantaneously. If we fail to meet their expectations, then they move to another provider and we lose money. Simple as that.

The other important aspect of providing online services and applications is that the amount of data produced, analyzed and stored is growing every day. We’ve moved from the few gigabytes of yesterday to terabytes today. Who knows what number of petabytes we need tomorrow?

What was once covered with a single LAMP stack, today can require dozens of Ls, As, different letters instead of P (like J, R, Py, G) and M. Our beloved MySQL that used to be “enough” to cover our needs 12 years ago is not fitting well with all the needs of many modern applications.

It is very common to have an application using different types of  “storage” at different levels and in different aspects of their activities. We can use a key-value store to cache inflight operations, and a relational full ACID database for the “valuable” core data (the kind of data that must be consistent and durable). Large data gets stored in an eventually consistent columns store mechanism, and long-term data in some “big data” approach. 

On top of all this is are reporting mechanisms that collect elements of each data store to provide a required, comprehensive data picture. The situation is very diversified and complex, and the number of possible variables is high. The way we can combine them is so vast that nowadays developers have no limits, and often comes up with creative solutions.

This is where we as architects can help: we can clarify how each tool can be used for the right JOB. We, at Percona, have the strong belief that we must seek simplicity in the complexity, and embracing the KISS approach. This starts with the initial identification of the right tool for the job.

Let’s start by looking at the following good practices in the following examples:

  • It is not a good idea to use key-value storage if you need to define the relationship between entities and rules between them.
  • Avoid using an eventually consistent storage when you have to save monetary information about customer payments.
  • It’s not a best practice to use a relational database to store HTML caching, page-tracking information, or game info in real time.

Use the right tool for the right job. Some tools scale writes better and keep an eventually consistent approach. Some others are designed to store an unbelievable amount of data, but cannot handle relations. As a result, they might take a long time when processing a typical OLTP request – if they can at all. Each tool has a different design and goal, each one scales differently, and each one has its way of handling and improving availability.

It is a crucial part of the architectural phase of your project not to mix the cards. Keep things clean and build the right architecture for each component. Then combine them in the way that harmonizes in the final result. We should optimize each block when solving a complex issue with simple answers.

How far are we from the old LAMP single stack? Ages. It is like turning your head and looking at our ancestors building the first tents. Tents are still a valid solution if you want to go camping. But only for fun, not for everyday life.

There is too often confusion around what a relational database should do and how it should do it. A relational database should not replace every other component of the wide architecture, and vice versa. They must coexist and work together with other options. Each one should maximize its characteristics and minimize its limitations.

In this series, we will focus on RDBMSs, and we will present a few possible reference architectures for the relational database layer. I will illustrate solutions that improve service availability, keeping a focus on what the tool’s design and the relational data approach concerning the ACID paradigm.

This means employing the simple rules of:

  • Atomicity -> All operations, part of the same transaction, are concluded successfully or not applied at all.
  • Consistency -> Any data written must be valid/validated against the defined rules and combination thereof.
  • Isolation -> Guarantees that all transactions will occur in isolation. No transaction affects any other transaction.
  • Durability -> Durability means that, once a transaction is committed, it will remain in the system even if there’s a system crash immediately following the transaction. Transaction changes must be stored permanently.

We will discuss the solution involving the most common open source RDBMSs, covering on-premises and in the cloud:

  • MySQL
  • PostgreSQL
  • MongoDB

The scenario will be common to all solutions, but the way we implement the solution will instead answer to different needs. The first example is MySQL high availability on premises: MySQL High Availability on premises.

Nov
06
2018
--

Percona Live Europe 2018: What’s Up for Wednesday

Percona Live Europe Open Source Database Conference PLE 2018

Welcome to Wednesday at Percona Live Europe 2018! Today is the final day! Check out all of the excellent sessions to attend.

Please see the important updates below.

Download the conference App

If you haven’t already downloaded the app, go to the app store and download the official Percona Live App! You can view the schedule, be alerted for any important updates, create your own personalized schedule, rate the talks and interact with fellow attendees.

For Apple: Download here
For Android: Download here

Rate the talks!

We want to encourage all attendees to rate the talks which you have attended. Please take a few moments to rate the talks which you attended on the Percona Live App.

Registration and Badge Pick Up

Registration is open from 8 am.

AWS Cloud Track

Join the featured cloud track today where AWS will be presenting A Deep Dive on Amazon Aurora, Zero to Serverless in 60 Seconds, Top 10 Mistakes When Migrating From Oracle to PostgreSQL to name a few! These sessions will run in Wallstreet 2!

AWS LEC

Keynotes

Keynotes begin promptly at 9:15 am. Please be seated and ready! Arrive early to secure your spot! Keynotes will take place in Dow Jones next to the expo area.

Expo Opening Hours

Have you visited the expo area yet? The expo will be open from 8:00 am to 4:30 pm today.

Conference Slides

Conference slides and presentations will be available to view after the conference and will be located on the Percona Live Europe website.

Breaks and Lunch

Coffee Breaks: The morning break is at 10:50 am – 11:20 am and the afternoon break from 4:10 pm- 4:30 pm (Conference Floor Foyer)
Lunch: 1:10 pm -2:10 pm Lunch will be served on the conference floor and in Showroom and Gaia restaurant on the lobby level.

With Thanks to Our Sponsors!

Percona Live Europe 2018 Sponsors
We hope you have enjoyed the conference!

Save the Date!

Percona Live 2019 will happen in Austin, Texas. Save the dates in your diary for May 28-30, 2019!

The conference will take place just after Memorial Day at The Hyatt Regency, Austin on the shores of Lady Bird Lake. This is also an ideal central location for those who wish to extend their stay and explore what Austin has to offer! Call for papers, ticket sales and sponsorship opportunities will be announced soon, so stay tuned!

Nov
06
2018
--

Welcome to Percona Live Europe 2018 Tuesday Keynotes and Sessions!

Percona Live Europe Open Source Database Conference PLE 2018

Hello, open source database enthusiasts at Percona Live Europe 2018! There is a lot to see and do today, and we’ve got some of the highlights listed below.

On Facebook? Go here for some pics that captured the action on Percona Live Europe 2018 Tutorials day (Monday, Nov. 5, 2018). 
Percona Live Europe 2018 app

 

Download the Conference App

We apologize for the confusion yesterday on the app but can assure you, the schedule and timings have been updated! If you haven’t already downloaded the app, go to the app store and download the official Percona Live App! You can view the schedule, be alerted for any important updates, create your own personalized schedule, rate the talks and interact with fellow attendees.

For Apple: Download here
For Android: Download here

Percona Live Frankfurt 1st Day-1244Registration and Badge Pick Up

Registration is open from 8 am. The registration desk is located at the top of the stairs on the first floor of the Radisson Blu Hotel. 

Keynotes

Keynotes begin promptly at 9:15 am. Please be seated and ready! Arrive early to secure your spot! Keynotes will take place in Dow Jones next to the expo area. 

Community Networking Reception

Join the Open Source community on Tuesday evening at Chicago Meatpackers (Riverside), Frankfurt!

This is a great opportunity to socialize and network with Percona Live Attendees and Other Open Source Enthusiasts who’d like to come along too!

This is not a ticketed event or an official event of Percona Live Europe, simply an open invitation with a place to congregate for food and drinks! An A La Carte food menu and cash bar will be available.

Percona Live Frankfurt 1st Day-1000Expo Opening Hours

The expo will be open from 8:00 am to 4:30 pm today. 

Breaks & Lunch

Coffee Breaks: Sponsored by Facebook!  AM Break will be at 10:50am – 11:20 am and the Afternoon break from  4:10 pm- 4:30 pm (Conference Floor Foyer)
Lunch: 1:10 pm -2:10 pm Lunch will be served on the conference floor and in Showroom and Gaia restaurant on the lobby level.

With thanks to our Sponsors!

Percona Live Europe 2018 Sponsors

Enjoy the conference!

Oct
30
2018
--

20+ MongoDB Alternatives You Should Know About

alternatives to MongoDB

alternatives to MongoDBAs MongoDB® has changed their license from AGPL to SSPL many are concerned by this change, and by how sudden it has been. Will SSPL be protective enough for MongoDB, or will the next change be to go to an altogether proprietary license? According to our poll, many are going to explore MongoDB alternatives. This blog post provides a brief outline of technologies to consider.

Open Source Data Stores

  • PostgreSQL is the darling of the open source database community. Especially if your concern is the license,  PostgreSQL’s permissive licence is hard to beat. PostgreSQL has powerful JSON Support, and there are many successful stories of migrating from MongoDB to PostgreSQL
  • Citus While PostgreSQL is a powerful database, and you can store terabytes of data on a single cluster, at a larger scale you will need sharding. If so, consider the Citus PostgreSQL extension, or the DBaaS offering from the same guys.
  • TimescaleDB  If on the other hand you are storing  time series data in MongoDB, then TimescaleDB might be a good fit.
  • ToroDB If you would love to use PostgreSQL but need MongoDB wire protocol compatibility, take a look at ToroDB. While it can’t serve as a full drop-in replacement for MongoDB server just yet, the developer told me that with some work it is possible.
  • CockroachDB While not based on the PostgreSQL codebase, CockroachDB is PostgreSQL wire protocol compatible and it is natively distributed, so you will not need to do manual sharding.
  • MySQL® is another feasible replacement. MySQL 5.7 and MySQL 8 have great support for JSON, and it continues to get better with every maintenance release. You can also consider MySQL Cluster for medium size sharded environments. You can also consider MariaDB and Percona Server  for MySQL
  • MySQL DocStore is a CRUD interface for JSON data stored in MySQL, and while it is not the same as MongoDB’s query language, it is much easier to transition to compared to SQL.
  • Vitess Would you love to use MySQL but can’t stand manual sharding? Vitess is a powerful sharding engine for MySQL which will allow you to grow to great scale while using proven MySQL as a backend.
  • TiDB is another take on MySQL compatible sharding. This NewSQL engine is MySQL wire protocol compatible but underneath is a distributed database designed from the ground up.
  • CouchDB is a document database which speaks JSON natively.
  • CouchBase is another database engine to consider. While being a document based database, CouchBase offers the N1QL language which has SQL look and feel.
  • ArangoDB is multi-model database, which can be used as document store.
  • Elastic While not a perfect choice for every MongoDB workload, for workloads where document data is searched and analyzed ElasticSearch can be a great alternative.
  • Redis is another contender for some MongoDB workloads. Often used as a cache in front of MongoDB, it can also be used as a JSON store through extensions.  While such extensions from RedisLabs are no longer open source, GoodForm projects provides open source alternatives.
  • ClickHouse may be a great contender for moving analytical workloads from MongoDB. Much faster, and with JSON support and Nested Data Structures, it can be great choice for storing and analyzing document data.
  • Cassandra does not have a document data model, but it has proven to be extremely successful for building scalable distributed clusters. If this is your main use case for MongoDB, then you should consider Cassandra.
  • ScyllaDB is a protocol compatible Cassandra alternative which claims to offer much higher per node performance.
  • HBase is another option worth considering, especially if you already have a Hadoop/HDFS infrastructure.

Public Cloud Document Stores

Most major cloud providers offer some variant of a native document database for you to consider.

  • Microsoft Azure Cosmos DB is an interesting engine that provides multiple NoSQL APIs, including for MongoDB and Cassandra.
  • Amazon DynamoDB supports key value and document based APIs. While not offering MongoDB compatibility, DynamoDB has been around for a long time, and is the most battle tested of the public cloud database offerings.
  • Google Cloud DataStore  – Google Cloud offers a number of data storage options for you to consider, and Cloud DataStore offers a data model and query language that is the most similar to MongoDB.

If you’re not ready for a major migration effort, there is one more solution for you – Percona Server for MongoDB.  Based on MongoDB Community, and enhanced by Percona with Enterprise Features, Percona Server for MongoDB offers 100% compatibility. As we wrote in a previous post, we commit to shipping a supported AGPL version until the situation around SSPL is clearly resolved.

Want help on deciding what is the best option for you, or with migration heavy lifting? Percona Professional Services can help!

Have idea for another feasible MongoDB alternative?  Please comment, and I will consider adding it to the list!


Image by JOSHUA COLEMAN on Unsplash

Oct
30
2018
--

PostgreSQL locking, part 3: lightweight locks

LWLocks lightweight locks postgres

PostgreSQL logoPostgreSQL lightweight locks, or LWLocks, control memory access. PostgreSQL uses multi-process architecture and should allow only consistent reads and writes to shared memory structures. LWLocks have two levels of locking: shared and exclusive. It’s also possible to release all acquired LWLocks to simplify clean up. Other databases often call primitives similar to LWLocks “latches”. Because LWLocks is an implementation detail, application developers shouldn’t pay much attention to this kind of locking.

This is the third and final part of a series on PostgreSQL locking, related to latches protecting internal database structures. Here are the previous parts: Row-level locks and table-level locks.

Instrumentation

Starting from PostgreSQL 9.6, LWLocks activity can be investigated with the pg_stat_activity system view. It could be useful under high CPU utilization. There are system settings to help with contention on specific lightweight locks.

Before PostgreSQL 9.5, the LWLocks implementation used spin-locks.  It was a bottleneck. This was fixed in 9.5 with atomic state variable.

Potential heavy contention places

  • WALInsertLock: protects WAL buffers. You can increase the number of wal buffers to get a slight improvement. Incidentally, synchronous_commit=off increases pressure on the lock even more, but it’s not a bad thing. full_page_writes=off reduces contention, but it’s generally not recommended.
  • WALWriteLock: accrued by PostgreSQL processes while WAL records are flushed to disk or during a WAL segments switch. synchronous_commit=off removes the wait for disk flush, full_page_writes=off reduces the amount of data to flush.
  • LockMgrLock: appears in top waits during a read-only workload. It latches relations regardless of its size. It’s not a single lock, but at least 16 partitions. Thus it’s important to use multiple tables during benchmarks and avoid single table anti-pattern in production.
  • ProcArrayLock: Protects the ProcArray structure. Before PostgreSQL 9.0, every transaction acquired this lock exclusively before commit.
  • CLogControlLock: protects CLogControl structure, if it shows on the top of pg_stat_activity, you should check the location of $PGDATA/pg_clog—it should be on a buffered file system.
  • SInvalidReadLock: protects sinval array. Readers using shared lock. SICleanupQueue, and other array-wide updates, requires an exclusive lock. It shows at the top of the pg_stat_activity when the shared buffer pool is under stress. Using a higher number of shared_buffers helps to reduce contention.
  • BufMappingLocks: protects regions of buffers. Sets 128 regions (16 before 9.5) of buffers to handle the whole buffer cache.

Spinlocks

The lowest level for locking is spinlocks. Therefore, it’s implemented within CPU-specific instructions. PostgreSQL is trying to change an atomic variable value in a loop. If the value is changed from zero to one – the process obtained a spinlock. If it’s not possible to get a spinlock immediately, the process will increase its wait delay exponentially.  There is no monitoring on spinlocks and it’s not possible to release all accrued spinlocks at once. Due to the single state change, it’s also an exclusive lock. In order to simplify the porting of PostgreSQL to exotic CPU and OS variants, PostgreSQL uses OS semaphores for its spinlocks implementation. Of course, it’s significantly slower compared to native CPU instructions port.

Summary

  • Use pg_stat_activity to find which queries or LWLocks are causing lock waits
  • Use fresh branches of PostgreSQL, as developers have been working on performance improvements and trying to reduce locking contention on hot mutexes.

References

Locks Photo by Warren Sammut on Unsplash

Oct
29
2018
--

One Week Until Percona Live Open Source Database Conference Europe 2018

Percona Live Europe 2018

Percona Live Europe Open Source Database Conference PLE 2018It’s almost here! One week until the Percona Live Europe Open Source Database Conference 2018 in Frankfurt, Germany! Are you ready?

This year’s theme is “Connect. Accelerate. Innovate.” We want to live these words by making sure that the conference allows you to connect with others in the open source community, accelerate your ideas and solutions and innovate when you get back to your projects and companies.

  • There is one day of tutorials (Monday) and two days of sessions (Tuesday and Wednesday). We have multiple tracks: MySQL 8.0, Using MySQL, MongoDB, PostgreSQL, Cloud, Database Security and Compliance, Monitoring and Ops, and Containers and Emerging Technologies. This year also includes a specialized “Business Track” aimed at how open source can solve critical enterprise issues.
  • Each of the session days begins with excellent keynote presentations in the main room by well-known people and players in the open source community. Don’t miss them!
  • Don’t forget to attend our Welcome Reception on Monday.
  • Want to meet with our Product Managers? Join them for Lunch on Wednesday, November 7, where you’ll have a chance to participate in the development of Percona Software!
  • On our community blog, we’ve been highlighting some of the sessions that will be occurring during the conference. You can check them out here.

Percona Live Europe TutorialsThe entire conference schedule is up and available here.

Percona Live Europe provides the community with an opportunity to discover and discuss the latest open source trends, technologies and innovations. The conference includes the best and brightest innovators and influencers in the open source database industry.

Our daily sessions, day-one tutorials, demonstrations, keynotes and events provide access to what is happening NOW in the world of open source databases. At the conference, you can mingle with all levels of the database community: DBAs, developers, C-level executives and the latest database technology trend-setters.

Network with peers and technology professionals and unite the open source database community! Share knowledge, experiences and use cases! Learn about how open source database technology can power your applications, improve your websites and solve your critical database issues.

Come to the conference.

Don’t miss out, buy your tickets here!

Percona Live Europe TutorialsConnect. Accelerate. Innovate.

With a lot of focus on the benefits of open source over proprietary models of software delivery, you surely can’t afford to miss this opportunity to connect with leading figures of the open source database world. On Monday, November 5 you can opt to accelerate your knowledge with our in-depth tutorials, or choose to attend our business track geared towards open source innovation and adoption.

Tuesday and Wednesday’s sessions across eight different tracks provides something for all levels of experience, and addresses a range of business challenges. See the full schedule.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com