Jan 19, 2021

pg_stat_monitor: A New Way Of Looking At PostgreSQL Metrics

Percona pg_stat_monitor

Enter pg_stat_monitor: this extension, created here at Percona, has been developed as an advanced replacement for pg_stat_statements, providing new capabilities in addition to the standard fare.

As you may recall, PostgreSQL’s pg_stat_statements extension provides a means of tracking execution statistics of all SQL statements executed by the server. But sometimes just having the basics is not enough, as the queries and their generated metrics are presented as one big “clump” of data: there are neither aggregated statistics nor histogram information. This makes it difficult to distinguish problem queries during peak load from those during normal load or off-peak report generation. In order to trend server performance, one is therefore forced either to generate the needed data manually and calculate the aggregates, which can become quite the chore, or to use a third-party monitoring solution that presents the results on a dashboard and brings its own overhead requirements (and headaches?).

Download/Compile/Install

Depending upon your circumstances, I’m going to share with you four (4) methods, any one of which you can use to obtain and try out pg_stat_monitor. Although I’m limiting the installation instructions to Ubuntu, one can, of course, install it on the Red Hat/CentOS distros too.

Method 1: The Percona Distribution For PostgreSQL

The easiest way, of course, is downloading and installing the extension from our own Percona repository. The following instructions are performed as root on your OS.

First things first, update your distribution packages:

apt update
apt upgrade -y

Install the Percona repository and download the latest Percona release package:

apt install -y wget gnupg2 lsb-release curl
wget https://repo.percona.com/apt/percona-release_latest.generic_all.deb

The package percona-release_latest.generic_all.deb installs a binary you will use to install the packages of interest:

dpkg -i percona-release_latest.generic_all.deb

And now you install both pg_stat_monitor and Percona’s own bundled version of PostgreSQL at the same time!

percona-release setup ppg-12
apt install -y percona-postgresql-12 percona-pg-stat-monitor12

Attention: At this time, the only real difference between Percona’s distribution and the community version of PostgreSQL is that the former is installed under a different path. However, over time we will be adding additional capabilities in the form of extensions. This is a work in progress; stay tuned!

Method 2: Compile And Install (Community PostgreSQL Repository)

Suppose you aren’t using the Percona Distribution For PostgreSQL (shame on you!) but instead the community version downloaded from postgresql.org. Although a little more involved, this set of instructions will help you install the extension against your current version of PostgreSQL. In this case, one downloads the source code for pg_stat_monitor and compiles it using the community PostgreSQL development packages.

Download the source code directly from our GitHub repository:

git clone https://github.com/percona/pg_stat_monitor

For demonstration purposes, we use version 12, although any version from 11 onward will work just fine. Be warned: there are a lot of packages! My tests, while writing this blog, required 500MB of additional space:

apt install -y postgresql-server-dev-12 git make gcc

Now it’s an easy matter of compiling; execute the following as root:

cd pg_stat_monitor
make USE_PGXS=1
make USE_PGXS=1 install
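
A quick sanity check (assuming pg_config is on your PATH) confirms that the module landed in PostgreSQL’s library directory:

# the module should appear in the server's package library directory
ls -l $(pg_config --pkglibdir)/pg_stat_monitor.so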

Method 3: Roll Your Own Packages

This method lends itself well to production environments by letting you bundle your own package, whether it be DEB or RPM, using the FPM packaging tool.

FPM is a command-line program designed to help you build packages.

Author’s note: I won’t go into details using FPM due to the complexity involved in getting and configuring it … unless you’d like me to, in which case I’d be very happy to write a blog about this really cool tool.

Building a package might look something like this:

fpm -s <source type> -t <target type> [list of sources]…

“Source type” is what your package is coming from; a directory (dir), a rubygem (gem), an rpm (rpm), a python package (python), a PHP pear module (pear), etc.

“Target type” is what your output package form should be, such as RPM and DEB.
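
As a purely hypothetical sketch, packaging an already-compiled pg_stat_monitor module into a DEB might look like the following; the package name, version, and file paths here are illustrative assumptions, not a tested recipe:

# package the compiled module and control file at their absolute paths
fpm -s dir -t deb -n pg-stat-monitor -v 1.0.0 \
    --depends postgresql-12 \
    --description "pg_stat_monitor PostgreSQL extension" \
    /usr/lib/postgresql/12/lib/pg_stat_monitor.so \
    /usr/share/postgresql/12/extension/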

Method 4: Using PGXN

pg_stat_monitor is released on PGXN, and the latest stable version is available there. It is easily installed using the pgxn utility:

pgxn install pg_stat_monitor

Create Extension “pg_stat_monitor”

Once the extension is compiled and installed, the next step is straightforward. Update the PostgreSQL runtime parameters so the server loads the extension’s module:

-- there's more than one way to configure this parameter
alter system set shared_preload_libraries = 'pg_stat_monitor';

Restart the server:

systemctl restart postgresql

You can create the extension in any database, as the module itself is loaded cluster-wide:

create extension pg_stat_monitor;

There are two views:

  • The first view is pg_stat_monitor which is similar to pg_stat_statements in that you can view generated metrics in real-time.
  • The second view, pg_stat_monitor_settings, returns the entire suite of parameters defining and controlling this extension’s behavior. One can edit these parameters using ALTER SYSTEM.

        List of relations
 Schema |           Name           | Type | Owner
--------+--------------------------+------+----------
 public | pg_stat_monitor          | view | postgres
 public | pg_stat_monitor_settings | view | postgres

db01=# \d pg_stat_monitor_settings

View "public.pg_stat_monitor_settings"
    Column     |  Type   | Collation | Nullable | Default
---------------+---------+-----------+----------+---------
 name          | text    |           |          |
 value         | integer |           |          |
 default_value | integer |           |          |
 description   | text    |           |          |
 minimum       | integer |           |          |
 maximum       | integer |           |          |
 restart       | integer |           |          |
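
The restart column indicates whether changing a given parameter requires a server restart. Assuming the convention that a value of 1 means a restart is needed, you can list those parameters like so:

-- list the parameters that cannot be changed with a simple reload
select name, restart from pg_stat_monitor_settings where restart = 1;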

Using pg_stat_monitor

Let’s generate some activity using pgbench:

pgbench -i db01
pgbench -c 4 -j 2 -T 300 -b tpcb-like db01 > /dev/null 2>&1 &

Now query the view pg_stat_monitor, returning the top ten results for all operations on the current database:

select application_name,
       userid::regrole AS user_name,
       datname AS database_name,
       substr(query,0, 50) AS query,
       calls,
       client_ip
from pg_stat_monitor, pg_database
where dbid = oid
order by calls desc, application_name
limit 10;

And here are our results; notice that pg_stat_monitor returns information similar to, but not quite the same as, pg_stat_statements:

 application_name | user_name | database_name |                       query                        | calls | client_ip
------------------+-----------+---------------+----------------------------------------------------+-------+-----------
 pgbench          | postgres  | db01          | UPDATE pgbench_branches SET bbalance = bbalance +  |  3523 | 127.0.0.1
 pgbench          | postgres  | db01          | END                                                |  3523 | 127.0.0.1
 pgbench          | postgres  | db01          | UPDATE pgbench_tellers SET tbalance = tbalance +   |  3523 | 127.0.0.1
 pgbench          | postgres  | db01          | SELECT abalance FROM pgbench_accounts WHERE aid =  |  3523 | 127.0.0.1
 pgbench          | postgres  | db01          | UPDATE pgbench_accounts SET abalance = abalance +  |  3523 | 127.0.0.1
 pgbench          | postgres  | db01          | INSERT INTO pgbench_history (tid, bid, aid, delta  |  3523 | 127.0.0.1
 pgbench          | postgres  | db01          | BEGIN                                              |  3523 | 127.0.0.1
 pgbench          | postgres  | db01          | END                                                |  3257 | 127.0.0.1
 pgbench          | postgres  | db01          | INSERT INTO pgbench_history (tid, bid, aid, delta  |  3257 | 127.0.0.1
 pgbench          | postgres  | db01          | UPDATE pgbench_branches SET bbalance = bbalance +  |  3256 | 127.0.0.1

This query highlights the key difference between pg_stat_monitor and pg_stat_statements, i.e. aggregating performance over a time interval using buckets:

postgres=# SELECT bucket,
                  bucket_start_time,
                  application_name,
                  datname AS database_name,
                  substr(query,0, 50) AS query,
                  calls 
            FROM pg_stat_monitor
            LIMIT 10;

 bucket |  bucket_start_time  | application_name | database_name |                       query                       | calls 
--------+---------------------+------------------+---------------+---------------------------------------------------+-------
    5 | 2021-01-14 14:49:10 | pgbench          | db01          | vacuum analyze pgbench_history                    |     1
    5 | 2021-01-14 14:49:10 | pgbench          | db01          | select count(*) from pgbench_branches             |     1
    5 | 2021-01-14 14:49:10 | pgbench          | db01          | UPDATE pgbench_accounts SET abalance = abalance + |  1375
    5 | 2021-01-14 14:49:10 | pgbench          | db01          | begin                                             |     1
    5 | 2021-01-14 14:49:10 | pgbench          | db01          | UPDATE pgbench_branches SET bbalance = bbalance + |  1372
    5 | 2021-01-14 14:49:10 | pgbench          | db01          | copy pgbench_accounts from stdin                  |     1
    5 | 2021-01-14 14:49:10 | pgbench          | db01          | create table pgbench_branches(bid int not null,bb |     1
    5 | 2021-01-14 14:49:10 | pgbench          | db01          | alter table pgbench_accounts add primary key (aid |     1
    5 | 2021-01-14 14:49:10 | pgbench          | db01          | INSERT INTO pgbench_history (tid, bid, aid, delta |  1372
    5 | 2021-01-14 14:49:10 | pgbench          | db01          | alter table pgbench_branches add primary key (bid |     1
(10 rows)

Updating pg_stat_monitor_settings

The pg_stat_monitor_settings view returns the parameters that define and control this extension’s metrics gathering.

This example query returns the list of runtime parameters that can be edited:

select name,description from pg_stat_monitor_settings;

As you can see, one has substantially more parameters allowing for targeted investigations and analysis:

                    name                      |                                               description                                                
-----------------------------------------------+----------------------------------------------------------------------------------------------------------
 pg_stat_monitor.pgsm_max                      | Sets the maximum size of shared memory in (MB) used for statement's metadata tracked by pg_stat_monitor.
 pg_stat_monitor.pgsm_query_max_len            | Sets the maximum length of query.
 pg_stat_monitor.pgsm_enable                   | Enable/Disable statistics collector.
 pg_stat_monitor.pgsm_track_utility            | Selects whether utility commands are tracked.
 pg_stat_monitor.pgsm_normalized_query         | Selects whether save query in normalized format.
 pg_stat_monitor.pgsm_max_buckets              | Sets the maximum number of buckets.
 pg_stat_monitor.pgsm_bucket_time              | Sets the time in seconds per bucket.
 pg_stat_monitor.pgsm_respose_time_lower_bound | Sets the time in millisecond.
 pg_stat_monitor.pgsm_respose_time_step        | Sets the response time steps in millisecond.
 pg_stat_monitor.pgsm_query_shared_buffer      | Sets the maximum size of shared memory in (MB) used for query tracked by pg_stat_monitor.
(10 rows)

Altering the runtime parameters is as simple as executing ALTER SYSTEM:

-- the default length is 1,024 characters
show pg_stat_monitor.pgsm_query_max_len;

-- increase the query length to 2,048 characters
alter system set pg_stat_monitor.pgsm_query_max_len = 2048;

Restarting the server updates the parameter:

systemctl restart postgresql

Here is the updated value:

SELECT name, 
       value 
FROM pg_stat_monitor_settings 
where name = 'pg_stat_monitor.pgsm_query_max_len';

                               name | value
------------------------------------+-------
 pg_stat_monitor.pgsm_query_max_len | 2048

Error Monitoring

pg_stat_monitor records not only the successful queries but all the ERRORs and WARNINGs too.

SELECT decode_error_level(elevel) AS elevel,
       sqlcode, query, message
FROM pg_stat_monitor WHERE elevel != 0;

 elevel | sqlcode  |         query         |             message
--------+----------+-----------------------+----------------------------------
 ERROR  | 16908420 | SELECT * FROM pg_foo; | relation "pg_foo" does not exist
 ERROR  | 33816706 | SELECT 1/0;           | division by zero

Tell us what you think; your feedback is important!

Jan 15, 2021

MySQL 8.0.22: SHOW PROCESSLIST Version 2 – Now Available From PERFORMANCE_SCHEMA

SHOW PROCESSLIST Version 2

The “SHOW PROCESSLIST” command is very famous and very useful for MySQL DBAs. It helps you understand the ongoing thread activities and their current states. By default, the “show processlist” output details are collected from the thread manager, which requires holding a global mutex. From MySQL 8.0.22, we have an alternative way to get the process details from the PERFORMANCE_SCHEMA, which doesn’t need the global mutex.

Note: We also have the non-blocking SYS schema views “processlist” and “x$processlist”, which provide more complete information than the SHOW PROCESSLIST statement and the INFORMATION_SCHEMA.PROCESSLIST and PERFORMANCE_SCHEMA.PROCESSLIST. But, we can’t integrate this with the “SHOW PROCESSLIST” command.
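
For comparison, querying the non-blocking sys schema view looks like this; note the extra detail it exposes, such as the currently executing statement and its latency:

-- sys.processlist adds columns like current_statement and statement_latency
SELECT conn_id, user, db, command, state,
       current_statement, statement_latency
FROM sys.processlist
WHERE command != 'Sleep';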

In this blog, I am going to explain the complete details about the new processlist implementation using PERFORMANCE_SCHEMA.

“SHOW PROCESSLIST” Using Thread Manager (default)

  • This is the default method.
  • The default “show processlist” implementation iterates across active threads from within the thread manager while holding a global mutex.
  • It negatively impacts performance, particularly on busy systems.
  • The INFORMATION_SCHEMA.PROCESSLIST table is another source of process information and also uses the thread manager to collect its metrics.
  • By default, “mysqladmin processlist” also uses the thread manager to get the details.

The following statements are equivalent:

SHOW FULL PROCESSLIST;
SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST;
mysqladmin processlist --verbose

“SHOW PROCESSLIST” Using Performance Schema

  • Available from MySQL 8.0.22.
  • It collects the thread details from the PERFORMANCE_SCHEMA.PROCESSLIST table.
  • Global mutex is not needed.
  • Helps avoid the performance impact of querying “show processlist”, particularly on busy systems.
  • The implementation also applies to “mysqladmin processlist”.

The following statements are equivalent:

SHOW FULL PROCESSLIST;
SELECT * FROM PERFORMANCE_SCHEMA.PROCESSLIST;
mysqladmin processlist --verbose

The “PERFORMANCE_SCHEMA.PROCESSLIST” table has columns similar to “INFORMATION_SCHEMA.PROCESSLIST”:

mysql> desc performance_schema.processlist;
+---------+-----------------+------+-----+---------+-------+
| Field   | Type            | Null | Key | Default | Extra |
+---------+-----------------+------+-----+---------+-------+
| ID      | bigint unsigned | NO   | PRI | NULL    |       |
| USER    | varchar(32)     | YES  |     | NULL    |       |
| HOST    | varchar(255)    | YES  |     | NULL    |       |
| DB      | varchar(64)     | YES  |     | NULL    |       |
| COMMAND | varchar(16)     | YES  |     | NULL    |       |
| TIME    | bigint          | YES  |     | NULL    |       |
| STATE   | varchar(64)     | YES  |     | NULL    |       |
| INFO    | longtext        | YES  |     | NULL    |       |
+---------+-----------------+------+-----+---------+-------+
8 rows in set (0.00 sec)

mysql> desc information_schema.processlist;
+---------+-----------------+------+-----+---------+-------+
| Field   | Type            | Null | Key | Default | Extra |
+---------+-----------------+------+-----+---------+-------+
| ID      | bigint unsigned | NO   |     |         |       |
| USER    | varchar(32)     | NO   |     |         |       |
| HOST    | varchar(261)    | NO   |     |         |       |
| DB      | varchar(64)     | YES  |     |         |       |
| COMMAND | varchar(16)     | NO   |     |         |       |
| TIME    | int             | NO   |     |         |       |
| STATE   | varchar(64)     | YES  |     |         |       |
| INFO    | varchar(65535)  | YES  |     |         |       |
+---------+-----------------+------+-----+---------+-------+
8 rows in set (0.00 sec)
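
Since performance_schema.processlist is an ordinary table, it can be filtered and sorted like any other. For example, to list non-sleeping sessions that have been running for more than ten seconds:

-- long-running, non-idle sessions first
SELECT id, user, db, command, time, state, info
FROM performance_schema.processlist
WHERE command <> 'Sleep' AND time > 10
ORDER BY time DESC;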

Implementation

  • Make sure the PERFORMANCE_SCHEMA is enabled at the server startup.
  • Make sure MySQL was configured and built with the thread instrumentations enabled.

MySQL provides a variable “performance_schema_show_processlist” to enable this feature. Once we enable the variable, the “SHOW PROCESSLIST” command will start to show the details from the “PERFORMANCE_SCHEMA.PROCESSLIST” table instead of the thread manager.

The variable has global scope, and there is no need to restart the MySQL server.

mysql> show global variables like 'performance_schema_show_processlist';
+-------------------------------------+-------+
| Variable_name                       | Value |
+-------------------------------------+-------+
| performance_schema_show_processlist | OFF   |
+-------------------------------------+-------+
1 row in set (0.08 sec)

mysql> set global performance_schema_show_processlist='ON';
Query OK, 0 rows affected (0.00 sec)

mysql> \r
Connection id:    23
Current database: *** NONE ***

mysql> show global variables like 'performance_schema_show_processlist';
+-------------------------------------+-------+
| Variable_name                       | Value |
+-------------------------------------+-------+
| performance_schema_show_processlist | ON    |
+-------------------------------------+-------+
1 row in set (0.00 sec)

Yes, the PERFORMANCE_SCHEMA.PROCESSLIST table is activated for the “SHOW PROCESSLIST”.
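
Keep in mind that SET GLOBAL does not survive a server restart; to make the change permanent, persist it (or add it to your my.cnf):

mysql> SET PERSIST performance_schema_show_processlist = 'ON';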

“SHOW PROCESSLIST” output from “PERFORMANCE_SCHEMA”:

mysql> show processlist\G
*************************** 1. row ***************************
     Id: 5
   User: event_scheduler
   Host: localhost
     db: NULL
Command: Daemon
   Time: 2461
  State: Waiting on empty queue
   Info: NULL
*************************** 2. row ***************************
     Id: 23
   User: root
   Host: localhost
     db: NULL
Command: Query
   Time: 0
  State: executing
   Info: show processlist
2 rows in set (0.00 sec)

You can also query the “performance_schema.processlist” table to get the thread information.

mysql> select * from performance_schema.processlist\G
*************************** 1. row ***************************
     ID: 5
   USER: event_scheduler
   HOST: localhost
     DB: NULL
COMMAND: Daemon
   TIME: 2448
  STATE: Waiting on empty queue
   INFO: NULL
*************************** 2. row ***************************
     ID: 23
   USER: root
   HOST: localhost
     DB: NULL
COMMAND: Query
   TIME: 0
  STATE: executing
   INFO: select * from performance_schema.processlist
2 rows in set (0.00 sec)

“mysqladmin processlist” output from “performance_schema”:

[root@mysql8 vagrant]# mysqladmin processlist
+----+-----------------+-----------+----+---------+------+------------------------+------------------+
| Id | User            | Host      | db | Command | Time | State                  | Info             |
+----+-----------------+-----------+----+---------+------+------------------------+------------------+
| 5  | event_scheduler | localhost |    | Daemon  | 2631 | Waiting on empty queue |                  |
| 24 | root            | localhost |    | Query   | 0    | executing              | show processlist |
+----+-----------------+-----------+----+---------+------+------------------------+------------------+

Recommendations

  • To avoid having some threads ignored, leave the “performance_schema_max_thread_instances” and “performance_schema_max_thread_classes” system variables set to their default value (default = -1, meaning the parameter will be autosized during the server startup).
  • To avoid having some STATE column values be empty, leave the “performance_schema_max_stage_classes” system variable set to its default (default = -1, meaning the parameter will be autosized during the server startup).
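
A quick check confirms whether these variables are still at their defaults (-1, autosized):

mysql> SHOW GLOBAL VARIABLES WHERE Variable_name IN
       ('performance_schema_max_thread_instances',
        'performance_schema_max_thread_classes',
        'performance_schema_max_stage_classes');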

Jan 11, 2021

A Snippet of Features: Reviewing PostgreSQL Version 13

Reviewing PostgreSQL Version 13

This latest and greatest version of PostgreSQL, version 13, has been here since last summer and continues the tradition of gains in performance, administration, application development, and security enhancements.

Of the many new features and enhancements that have been made, some of the more notable ones include:

  • Space savings and performance gains from de-duplication of B-tree index entries
  • Improved performance for queries that use aggregates or partitioned tables
  • Better query planning when using extended statistics
  • Parallelized vacuuming of indexes
  • Incremental sorting

TIP: More detailed information can be found in the Release Notes here.

I learned long ago that it can be something of a challenge keeping up to date with all this new stuff. Therefore, I’d like to cover a small subset of these new advances by demonstrating simple use cases, which I hope you will find constructive, related to pgbench, logical replication, streaming replication failover, and re-provisioning a deprecated PRIMARY as a STANDBY using pg_rewind.

About pgbench

As you may recall, pgbench is a simple program for running benchmark tests on PostgreSQL: it executes the same sequence of SQL commands over and over in multiple concurrent database sessions. As the name implies, pgbench finds its most common use as a benchmarking tool, testing hardware and configuration runtime parameters for the purposes of performance tuning.

The latest iteration of pgbench has a number of new capabilities and includes:

  • A new set of initialization options selects data generation on either the client side (‘g’) or the server side (‘G’). Whereas previous versions generated data on the pgbench client and then sent it to the server, one can now test data generation exclusively on the server itself, unaffected by network latency.
  • The ability to create a partitioned “accounts” table using either range or hash partitioning, although the default is range partitioning.
  • A new option, ‘--show-script’, which echoes the actual code of any built-in script on stderr.
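
For example, the following exercises both of these capabilities (db01 is an assumed database name):

# print the code of the built-in tpcb-like script on stderr
pgbench --show-script=tpcb-like

# initialize using only server-side data generation:
# d = drop tables, t = create tables, G = server-side generate
pgbench -i -I dtG db01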

Example 1: Hash and Range Partitioning

The first step, of course, is to initialize the benchmark tables; in this case, we start out using hash partitioning:

# hash partitioning table public.pgbench_accounts
pgbench -i --partition-method=hash --partitions=5

...
creating tables...
creating 5 partitions...
generating data (client-side)...
100000 of 100000 tuples (100%) done (elapsed 0.04 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 0.79 s (drop tables 0.02 s, create tables 0.17 s, client-side generate 0.13 s, vacuum 0.26 s, primary keys 0.22 s).

Here’s what it should look like:

Schema |        Name        |       Type        |  Owner   | Persistence |  Size   
--------+--------------------+-------------------+----------+-------------+---------+
 public | pgbench_accounts   | partitioned table | postgres | permanent   | 0 bytes 
 public | pgbench_accounts_1 | table             | postgres | permanent   | 2656 kB 
 public | pgbench_accounts_2 | table             | postgres | permanent   | 2656 kB 
 public | pgbench_accounts_3 | table             | postgres | permanent   | 2656 kB 
 public | pgbench_accounts_4 | table             | postgres | permanent   | 2656 kB 
 public | pgbench_accounts_5 | table             | postgres | permanent   | 2656 kB 
 public | pgbench_branches   | table             | postgres | permanent   | 40 kB   
 public | pgbench_history    | table             | postgres | permanent   | 0 bytes 
 public | pgbench_tellers    | table             | postgres | permanent   | 40 kB

And here’s the partitioned pgbench_accounts table definition:

Partitioned table "public.pgbench_accounts"
  Column  |     Type      | Collation | Nullable | Default | Storage  | Stats target | Description
----------+---------------+-----------+----------+---------+----------+--------------+-------------
 aid      | integer       |           | not null |         | plain    |              
 bid      | integer       |           |          |         | plain    |              
 abalance | integer       |           |          |         | plain    |              
 filler   | character(84) |           |          |         | extended |              
Partition key: HASH (aid)
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Foreign-key constraints:
    "pgbench_accounts_bid_fkey" FOREIGN KEY (bid) REFERENCES pgbench_branches(bid)
Referenced by:
    TABLE "pgbench_history" CONSTRAINT "pgbench_history_aid_fkey" FOREIGN KEY (aid) REFERENCES pgbench_accounts(aid)
Partitions: pgbench_accounts_1 FOR VALUES WITH (modulus 5, remainder 0),
            pgbench_accounts_2 FOR VALUES WITH (modulus 5, remainder 1),
            pgbench_accounts_3 FOR VALUES WITH (modulus 5, remainder 2),
            pgbench_accounts_4 FOR VALUES WITH (modulus 5, remainder 3),
            pgbench_accounts_5 FOR VALUES WITH (modulus 5, remainder 4)


Alternatively, we can just as easily partition using range partitioning:

# range partitioning public.pgbench_accounts
pgbench -i --partition-method=range --partitions=5

Partitioned table "public.pgbench_accounts"
  Column  |     Type      | Collation | Nullable | Default | Storage  | Stats target | Description
----------+---------------+-----------+----------+---------+----------+--------------+
 aid      | integer       |           | not null |         | plain    |              |
 bid      | integer       |           |          |         | plain    |              |
 abalance | integer       |           |          |         | plain    |              |
 filler   | character(84) |           |          |         | extended |              |
Partition key: RANGE (aid)
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Partitions: pgbench_accounts_1 FOR VALUES FROM (MINVALUE) TO (20001),
            pgbench_accounts_2 FOR VALUES FROM (20001) TO (40001),
            pgbench_accounts_3 FOR VALUES FROM (40001) TO (60001),
            pgbench_accounts_4 FOR VALUES FROM (60001) TO (80001),
            pgbench_accounts_5 FOR VALUES FROM (80001) TO (MAXVALUE)

For those curious to see the actual commands creating the partitions, just update the runtime parameter and look in your postgres log:

postgres=# alter system set log_statement = 'ddl';
ALTER SYSTEM
postgres=# select pg_reload_conf();
pg_reload_conf
----------------
t

postgres=# show log_statement;
log_statement
---------------
ddl


LOG:  statement: drop table if exists pgbench_accounts, pgbench_branches, pgbench_history, pgbench_tellers
LOG:  statement: create table pgbench_history(tid int,bid int,aid    int,delta int,mtime timestamp,filler char(22))
LOG:  statement: create table pgbench_tellers(tid int not null,bid int,tbalance int,filler char(84)) with (fillfactor=100)
LOG:  statement: create table pgbench_accounts(aid    int not null,bid int,abalance int,filler char(84)) partition by range (aid)
LOG:  statement: create table pgbench_branches(bid int not null,bbalance int,filler char(88)) with (fillfactor=100)
LOG:  statement: create table pgbench_accounts_1
         partition of pgbench_accounts
         for values from (minvalue) to (20001) with (fillfactor=100)
LOG:  statement: create table pgbench_accounts_2
         partition of pgbench_accounts
         for values from (20001) to (40001) with (fillfactor=100)
LOG:  statement: create table pgbench_accounts_3
         partition of pgbench_accounts
         for values from (40001) to (60001) with (fillfactor=100)
LOG:  statement: create table pgbench_accounts_4
         partition of pgbench_accounts
         for values from (60001) to (80001) with (fillfactor=100)
LOG:  statement: create table pgbench_accounts_5
         partition of pgbench_accounts
         for values from (80001) to (maxvalue) with (fillfactor=100)
LOG:  statement: alter table pgbench_branches add primary key (bid)
LOG:  statement: alter table pgbench_tellers add primary key (tid)
LOG:  statement: alter table pgbench_accounts add primary key (aid)


Example 2: Benchmarking

Part I

Let’s initialize a database with the following conditions:

  • install pg_stat_statements to see our results
  • initialize benchmarking
  • generate data solely on the server-side
  • create primary keys
  • create foreign key constraints
  • hash partition table “accounts” creating five child tables

Open the file postgresql.conf and edit the following runtime parameters. Restart the server and add the extension to any database using the command CREATE EXTENSION pg_stat_statements.

# pg_conftool 13 main postgresql.conf edit
    shared_preload_libraries = 'pg_stat_statements'
    pg_stat_statements.max = 10000
    pg_stat_statements.track = all

systemctl restart postgresql@13-main

Initialize an empty database:

export PGHOST=pg1-POC13 PGPORT=5432 PGDATABASE=db01 PGUSER=postgres PGPASSWORD=mypassword
createdb db01
pgbench -i -s 10 -I dtGvpf -F 90 --partition-method=hash --partitions=5

This query should return a nice summary of the commands thus far executed:

SELECT query, calls, total_exec_time::int, rows, (100.0 * shared_blks_hit /
       nullif(shared_blks_hit + shared_blks_read, 0))::int AS hit_percent
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;


-[ RECORD 1 ]---+------------------------------------------------------------------------------------
query           | SELECT abalance FROM pgbench_accounts WHERE aid = $1
calls           | 1560162
total_exec_time | 73958
rows            | 1560162
hit_percent     | 94
-[ RECORD 2 ]---+------------------------------------------------------------------------------------
query           | insert into pgbench_accounts(aid,bid,abalance,filler) select aid, (aid - $1) / $2 + $3, $4, $5 from generate_series($7, $8) as aid
calls           | 1
total_exec_time | 2250
rows            | 1000000
hit_percent     | 100
-[ RECORD 3 ]---+------------------------------------------------------------------------------------
query           | create database db03
calls           | 1
total_exec_time | 2092
rows            | 0
hit_percent     | 90
-[ RECORD 4 ]---+------------------------------------------------------------------------------------
query           | vacuum analyze pgbench_accounts
calls           | 1
total_exec_time | 1591
rows            | 0
hit_percent     | 92
-[ RECORD 5 ]---+------------------------------------------------------------------------------------
query           | alter table pgbench_accounts add primary key (aid)
calls           | 1
total_exec_time | 1086
rows            | 0
hit_percent     | 59

Part II

Execute the benchmark for a duration of five minutes. To make the results clearer, you can first reset the stats in the pg_stat_statements view.

# clear the stats before starting the benchmarking
#    SELECT pg_stat_statements_reset();
pgbench -c 40 -j 7 -T 300 -b tpcb-like db01 -P 60

New SQL statements representing the DML operations are now listed:

SELECT query, calls, total_exec_time::int, rows, (100.0 * shared_blks_hit /
       nullif(shared_blks_hit + shared_blks_read, 0))::int AS hit_percent
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;


-[ RECORD 1 ]---+-----------------------------------------------------------------------------------------------------------
query           | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2
calls           | 42767
total_exec_time | 6203809
rows            | 42767
hit_percent     | 100
-[ RECORD 2 ]---+-----------------------------------------------------------------------------------------------------------
query           | UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE tid = $2
calls           | 42767
total_exec_time | 3146175
rows            | 42767
hit_percent     | 100
-[ RECORD 3 ]---+-----------------------------------------------------------------------------------------------------------
query           | UPDATE pgbench_accounts SET abalance = abalance + $1 WHERE aid = $2
calls           | 42767
total_exec_time | 28281
rows            | 42767
hit_percent     | 95
-[ RECORD 4 ]---+-----------------------------------------------------------------------------------------------------------
query           | INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES ($1, $2, $3, $4, CURRENT_TIMESTAMP)
calls           | 42767
total_exec_time | 22797
rows            | 42767
hit_percent     | 100
-[ RECORD 5 ]---+-----------------------------------------------------------------------------------------------------------
query           | SELECT $2 FROM ONLY "public"."pgbench_branches" x WHERE "bid" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x
calls           | 42767
total_exec_time | 2347
rows            | 42767
hit_percent     | 100

Notice how the child tables have grown in size:

List of relations
 Schema |        Name        |       Type        |  Owner   | Persistence |  Size   | Description
--------+--------------------+-------------------+----------+-------------+---------+
 public | pgbench_accounts   | partitioned table | postgres | permanent   | 0 bytes |
 public | pgbench_accounts_1 | table             | postgres | permanent   | 28 MB   |
 public | pgbench_accounts_2 | table             | postgres | permanent   | 29 MB   |
 public | pgbench_accounts_3 | table             | postgres | permanent   | 28 MB   |
 public | pgbench_accounts_4 | table             | postgres | permanent   | 28 MB   |
 public | pgbench_accounts_5 | table             | postgres | permanent   | 28 MB   |
 public | pgbench_branches   | table             | postgres | permanent   | 168 kB  |
 public | pgbench_history    | table             | postgres | permanent   | 2384 kB |
 public | pgbench_tellers    | table             | postgres | permanent   | 272 kB  |

About Logical Replication

We now explore another very useful addition to PostgreSQL version 13: the transparent replication of partitioned tables via logical decoding. No longer is it necessary to expend valuable time manually adding the triggers and supplementary instructions needed to replicate the child tables.

Example

Using the aforementioned pgbench example, execute the following. Note that one must first add a primary key to table pgbench_history; otherwise, replication of this table is not possible:

# UPDATE TABLE, ADD PK
alter table public.pgbench_history add primary key (tid,bid,aid,mtime);

# CREATE DATABASE ON SUBSCRIBER NODE
createdb -h pg4-POC13 -U postgres db01

# COPY DATABASE SCHEMA
pg_dump -h pg1-POC13 -U postgres -s db01 | psql 'host=pg4-POC13 user=postgres dbname=db01'

#
# PROVIDER pg1-POC13: DB01
#
psql 'host=pg1-POC13 user=postgres password=mypassword dbname=db01' <<_eof_
   \set ON_ERROR_STOP
    create publication publication1 for all tables;
_eof_

#
# SUBSCRIBER pg4-POC13: DB01
#
psql 'host=pg4-POC13 user=postgres password=mypassword dbname=db01' <<_eof_
   \set ON_ERROR_STOP
    create subscription subscript_set1
        connection 'host=pg1-POC13 dbname=db01 user=postgres password=mypassword'
        publication publication1
        with (copy_data=true, create_slot=true, enabled=true, slot_name=myslot1);
_eof_

And here we see the child accounts tables have been correctly replicated:

List of relations
 Schema |        Name        |       Type        |  Owner   | Persistence |    Size    | Description
--------+--------------------+-------------------+----------+-------------+------------+
 public | pgbench_accounts   | partitioned table | postgres | permanent   | 0 bytes    |
 public | pgbench_accounts_1 | table             | postgres | permanent   | 28 MB      |
 public | pgbench_accounts_2 | table             | postgres | permanent   | 29 MB      |
 public | pgbench_accounts_3 | table             | postgres | permanent   | 28 MB      |
 public | pgbench_accounts_4 | table             | postgres | permanent   | 28 MB      |
 public | pgbench_accounts_5 | table             | postgres | permanent   | 28 MB      |
 public | pgbench_branches   | table             | postgres | permanent   | 8192 bytes |
 public | pgbench_history    | table             | postgres | permanent   | 2352 kB    |
 public | pgbench_tellers    | table             | postgres | permanent   | 8192 bytes |
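
To verify that replication is healthy, a quick check on both sides might look like this:

-- on the subscriber pg4-POC13
select subname, subenabled from pg_subscription;

-- on the provider pg1-POC13; once the subscriber connects, active = t
select slot_name, active from pg_replication_slots where slot_name = 'myslot1';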

About Streaming Replication

Because of the large number of new features added to streaming replication, I’m limiting the focus to two enhancements:

  • Restarting a STANDBY in order to point it at a newly promoted PRIMARY, after updating primary_conninfo and primary_slot_name, is no longer necessary, as a reload will suffice.
  • pg_rewind has a new option, --write-recovery-conf, that simplifies the steps of re-provisioning a failed PRIMARY as a viable STANDBY, similar to the option found in pg_basebackup.

Recall the three-node streaming replication cluster we’ve been using:

pg1-POC13: PRIMARY (read-write)
pg2-POC13: REPLICA (read-only, streaming)
pg3-POC13: REPLICA (read-only, streaming)

Example 1: STANDBY, pg3, Points to New PRIMARY pg2

Suppose a failover promotes pg2. This new feature adds robustness to the entire database cluster: REPLICA pg3 continues service without interruption as it is redirected from pg1 to pg2. Previous versions of postgres required host pg3 to be restarted in order for new primary_conninfo parameters to take effect.

Step 1

pg3-POC13; run a benchmark of SELECTS on host pg3

pgbench -h pg3-POC13 -U postgres -c 40 -j 7 -T 300 -b select-only db01 -P 5

Step 2

pg2-POC13; promote the host while the bench-marking on pg3 is active

-- create a new slot for pg3
select * from pg_create_physical_replication_slot('pg3');
-- confirm slot is inactive
select * from pg_get_replication_slots();
-- promote host
select pg_promote();
-- confirm read-write state
select pg_is_in_recovery();

TIP: it’s understood that the runtime parameter wal_keep_size is non-zero and set sufficiently high to retain WALs while pg3 is redirected away from pg1 toward pg2.
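
For example, on the new PRIMARY; the 1GB figure is an arbitrary assumption to be sized against your write volume:

-- wal_keep_size is reloadable, no restart required
alter system set wal_keep_size = '1GB';
select pg_reload_conf();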

Step 3

pg3-POC13;  point to pg2-POC13

-- make the updates
alter system set primary_conninfo = 'user=postgres host=10.231.38.112';
alter system set primary_slot_name = 'pg3';
select pg_reload_conf();
-- confirm replication is active
select * from pg_stat_wal_receiver;

Step 4

pg2-POC13; validate replication from pg2->pg3

-- confirm slot is active
select * from pg_get_replication_slots();
-- confirm replication is active
select * from pg_stat_replication;

Example 2

When re-provisioning a failed PRIMARY, i.e., pg1, as a new STANDBY, it used to be necessary to edit the requisite recovery runtime configuration parameters after executing pg_rewind but before starting the host.

PostgreSQL version 13 now simplifies the exercise by providing the switch --write-recovery-conf, which updates postgresql.auto.conf with the correct runtime parameters.

TIP: pg_rewind requires the runtime parameter wal_log_hints=on on the failed PRIMARY before re-provisioning takes place (unless data checksums were enabled when the cluster was initialized).
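
Because wal_log_hints cannot be applied retroactively, it should already have been set while pg1 was healthy:

-- requires a server restart, not just a reload
alter system set wal_log_hints = on;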

Step 1

pg2-POC13; add a slot on PRIMARY pg2 for the new STANDBY pg1

-- as postgres create a new slot
select * from pg_create_physical_replication_slot('pg1');
select * from pg_get_replication_slots();

Step 2

pg1-POC13;  re-provision pg1

# as postgres, perform the rewind
/usr/lib/postgresql/13/bin/pg_rewind \
    --target-pgdata=/var/lib/postgresql/13/main \
    --source-server="host=pg2-POC13 port=5432 user=postgres dbname=postgres " \
    --write-recovery-conf \
    --progress

You should get messaging similar to the following:

pg_rewind: connected to server
pg_rewind: servers diverged at WAL location 0/6CDCEA88 on timeline 1
pg_rewind: rewinding from last common checkpoint at 0/6CDCE9D8 on timeline 1
pg_rewind: reading source file list
pg_rewind: reading target file list
pg_rewind: reading WAL in target
pg_rewind: need to copy 206 MB (total source directory size is 422 MB)
211594/211594 kB (100%) copied
pg_rewind: creating backup label and updating control file
pg_rewind: syncing target data directory
pg_rewind: Done!

Now you can restart host pg1-POC13 and bring it back into service:

# as root, restart the server
systemctl start postgresql@13-main

Log in to pg1 and confirm replication:

# confirm replication is active
select * from pg_stat_wal_receiver;

Caveat

In the case of failure, check the following:

  • pg2-POC13: As the PRIMARY, you will want to check that you have a slot installed and ready before starting up the re-provisioned pg1 as a new STANDBY (see the query after this list)
  • pg1-POC13:
    • check to see if the server started up, i.e., ‘netstat -tlnp’
    • check postgresql.auto.conf if you get a FATAL server is starting up message
    • check the slot name
    • check connectivity info is correct
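
The slot check on pg2-POC13 might look like this:

-- the slot created for pg1 should exist; once pg1 streams, active = t
select slot_name, slot_type, active
from pg_replication_slots
where slot_name = 'pg1';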

Conclusion

Although not germane to features and capabilities, I was a little curious to see the differences between PostgreSQL 13 and earlier versions. Here’s a table of metrics: the time to create a data cluster, the sizes of the data cluster and the binaries, and the times required to create a database and initialize pgbench. It’s interesting to see how performance has improved over the previous versions:

 ver | initdb time | data cluster size (bytes) | binaries size (bytes) | createdb time | pgbench -i time
-----+-------------+---------------------------+-----------------------+---------------+-----------------
 9.6 | 0m0.889s    | 38,877,134                | 38,844,934            | 0m0.311s      | 0m0.236s
 10  | 0m0.729s    | 39,598,542                | 42,054,339            | 0m0.725s      | 0m0.240s
 11  | 0m0.759s    | 40,844,747                | 41,336,566            | 0m0.683s      | 0m0.212s
 12  | 0m0.592s    | 41,560,196                | 43,853,282            | 0m0.179s      | 0m0.213s
 13  | 0m0.502s    | 41,266,877                | 65,652,665            | 0m0.188s      | 0m0.168s

Frankly, I find it amazing how the size of both the binaries and the data cluster has remained so compact over the years. I remember, way back in 2000, working with a proprietary RDBMS that, upon initialization, created an empty instance 1GB in size. I wonder how big it gets these days?

Happy Hacking!

References:

PostgreSQL 13 Released!

PostgreSQL 13 Documentation

PostgreSQL Versioning Policy



Jan 08, 2021

Updated MySQL OSMetrics Plugins

It has been some time since I have posted updates to my plugins. After the initial version, I decided to split the plugins into categories of metrics. This allows users to choose whether they want to install all of the plugins or only the select ones they care about.

Since the installation process is unfamiliar to many users, I also expanded the instructions to make it a little easier to follow.  Moreover, I added a Makefile.

I have also reformatted the output of some plugins to be either horizontal or vertical in orientation.  There is still more work to do in this area as well.

Where to Get The MySQL Plugins

You can get the plugins from GitHub at https://github.com/toritejutsu/osmetrics, but they will have to be compiled from source. As mentioned above, you can choose whether to install all of them or only some, one by one. If you have an interest, feel free to contribute to the code; more input and actual usage would make the plugins much more useful.

What Are The Plugins?

These are a collection of MySQL plugins for displaying Operating System metrics in INFORMATION_SCHEMA. This would allow monitoring tools, such as Percona Monitoring and Management (PMM), to retrieve these values remotely via the MySQL interface.

Values are pulled via standard C library calls, and some are read from the /proc filesystem, so overhead is absolutely minimal. I originally added a couple of libraries to show that even Windows and other variants of UNIX could be supported, but commented them out to keep it all simple for now.

Many variables were added to show what was possible. Some of these may not be of interest. I just wanted to see what kind of stuff was possible and would tweak these over time.

Also, none of the calculations were rounded. This was done just to keep precision for the graphing of values but could easily be changed later. If there is interest, this could be expanded to add more metrics and unnecessary ones removed. Just looking for feedback.

Keep in mind that my C programming skills are rusty and I am sure the code could be cleaned up.

Make sure you have the source code for MySQL and have done a cmake on it. This will be necessary to compile the plugin as well as some SQL install scripts.

Preparing The Environment

Below is the way that I compiled the plugin. You will obviously need to make changes to match your environment. You will also need to have the Percona Server for MySQL source code on your server:

wget https://www.percona.com/downloads/Percona-Server-5.7/Percona-Server-5.7.17-13/source/tarball/percona-server-5.7.17-13.tar.gz

Uncompress the file and go into the directory:

tar -zxvf percona-server-5.7.17-13.tar.gz
cd percona-server-5.7.17-13

I also had to add a few utilities:

sudo yum install cmake
sudo yum install boost
sudo yum install ncurses-devel
sudo yum install readline-devel
cmake -DDOWNLOAD_BOOST=1 -DWITH_BOOST=..

Compiling The Plugins

First, you will need to put the plugin code in the plugin directory of the source code you downloaded. For me, this was “/home/ec2-user/percona-server-5.7.17-13/plugin” and I named the directory “osmetrics”. Of course, you can just do a “git” to retrieve this to your server or download it as a zip file and decompress it. Just make sure it is placed into the “plugin” directory of the source code as noted above.

Next, you will need to know where your MySQL plugin directory is located. You can query that with the following SQL:

mysql> SHOW GLOBAL VARIABLES LIKE "%plugin_dir%";
+---------------+-------------------------+
| Variable_name | Value                   |
+---------------+-------------------------+
| plugin_dir    | /jet/var/mysqld/plugin/ |
+---------------+-------------------------+
1 row in set (0.01 sec)

You will then need to edit the Makefile and define this path there. Once that is complete, you can compile the plugins:

make clean
make
make install

Installing The Plugins

Finally, you can log in to MySQL and activate the plugins:

mysql> INSTALL PLUGIN OS_CPU SONAME 'osmetrics-cpu.so';
mysql> INSTALL PLUGIN OS_CPUGOVERNOR SONAME 'osmetrics-cpugovernor.so';
mysql> INSTALL PLUGIN OS_CPUINFO SONAME 'osmetrics-cpuinfo.so';
mysql> INSTALL PLUGIN OS_IOSCHEDULER SONAME 'osmetrics-ioscheduler.so';
mysql> INSTALL PLUGIN OS_DISKSTATS SONAME 'osmetrics-diskstats.so';
mysql> INSTALL PLUGIN OS_LOADAVG SONAME 'osmetrics-loadavg.so';
mysql> INSTALL PLUGIN OS_MEMINFO SONAME 'osmetrics-meminfo.so';
mysql> INSTALL PLUGIN OS_MEMORY SONAME 'osmetrics-memory.so';
mysql> INSTALL PLUGIN OS_MISC SONAME 'osmetrics-misc.so';
mysql> INSTALL PLUGIN OS_MOUNTS SONAME 'osmetrics-mounts.so';
mysql> INSTALL PLUGIN OS_NETWORK SONAME 'osmetrics-network.so';
mysql> INSTALL PLUGIN OS_STAT SONAME 'osmetrics-stat.so';
mysql> INSTALL PLUGIN OS_SWAPINFO SONAME 'osmetrics-swapinfo.so';
mysql> INSTALL PLUGIN OS_VERSION SONAME 'osmetrics-version.so';    
mysql> INSTALL PLUGIN OS_VMSTAT SONAME 'osmetrics-vmstat.so';

Alternatively, you can run the install SQL script:

mysql> SOURCE /path/to/install_plugins.sql

Verify Installation

If all went well, you should see several new plugins available. Just make sure the status is “ACTIVE.”

mysql> SHOW PLUGINS;
+-----------------------------+----------+--------------------+----------------------------+---------+
| Name                        | Status   | Type               | Library                    | License |
+-----------------------------+----------+--------------------+----------------------------+---------+
...
| OS_CPU                      | ACTIVE   | INFORMATION SCHEMA | osmetrics-cpu.so           | GPL     |
| OS_GOVERNOR                 | ACTIVE   | INFORMATION SCHEMA | osmetrics-cpugovernor.so   | GPL     |
| OS_CPUINFO                  | ACTIVE   | INFORMATION SCHEMA | osmetrics-cpuinfo.so       | GPL     |
| OS_DISKSTATS                | ACTIVE   | INFORMATION SCHEMA | osmetrics-diskstats.so     | GPL     |
| OS_IOSCHEDULER              | ACTIVE   | INFORMATION SCHEMA | osmetrics-diskscheduler.so | GPL     |    
| OS_LOADAVG                  | ACTIVE   | INFORMATION SCHEMA | osmetrics-loadavg.so       | GPL     |
| OS_MEMINFO                  | ACTIVE   | INFORMATION SCHEMA | osmetrics-meminfo.so       | GPL     |
| OS_MEMORY                   | ACTIVE   | INFORMATION SCHEMA | osmetrics-memory.so        | GPL     |
| OS_MISC                     | ACTIVE   | INFORMATION SCHEMA | osmetrics-misc.so          | GPL     |
| OS_MOUNTS                   | ACTIVE   | INFORMATION SCHEMA | osmetrics-mounts.so        | GPL     |
| OS_NETWORK                  | ACTIVE   | INFORMATION SCHEMA | osmetrics-network.so       | GPL     |
| OS_STAT                     | ACTIVE   | INFORMATION SCHEMA | osmetrics-stat.so          | GPL     |
| OS_SWAPINFO                 | ACTIVE   | INFORMATION SCHEMA | osmetrics-swapinfo.so      | GPL     |
| OS_VERSION                  | ACTIVE   | INFORMATION SCHEMA | osmetrics-version.so       | GPL     |
| OS_VMSTAT                   | ACTIVE   | INFORMATION SCHEMA | osmetrics-vmstat.so        | GPL     |
+-----------------------------+----------+--------------------+----------------------------+---------+

Querying The Plugins

Let’s look at some example output from each of the plugins below:

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_CPU;
+---------------+------------+--------------------------------------------------------------------+
| name          | value      | comment                                                            |
+---------------+------------+--------------------------------------------------------------------+
| numcores      |          1 | Number of virtual CPU cores                                        |
| speed         |   2299.892 | CPU speed in MHz                                                   |
| bogomips      |    4600.08 | CPU bogomips                                                       |
| user          |          0 | Normal processes executing in user mode                            |
| nice          |       4213 | Niced processes executing in user mode                             |
| sys           |     610627 | Processes executing in kernel mode                                 |
| idle          |        524 | Processes which are idle                                           |
| iowait        |          0 | Processes waiting for I/O to complete                              |
| irq           |          9 | Processes servicing interrupts                                     |
| softirq       |        765 | Processes servicing Softirqs                                       |
| guest         |          0 | Processes running a guest                                          |
| guest_nice    |          0 | Processes running a niced guest                                    |
| intr          |     200642 | Count of interrupts serviced since boot time                       |
| ctxt          |     434493 | Total number of context switches across all CPUs                   |
| btime         | 1595891204 | Time at which the system booted, in seconds since the Unix epoch   |
| processes     |       9270 | Number of processes and threads created                            |
| procs_running |          3 | Total number of threads that are running or ready to run           |
| procs_blocked |          0 | Number of processes currently blocked, waiting for I/O to complete |
| softirq       |        765 | Counts of softirqs serviced since boot time                        |
| idle_pct      |       0.09 | Average CPU idle time                                              |
| util_pct      |      99.91 | Average CPU utilization                                            |
| procs         |        120 | Number of current processes                                        |
| uptime_tv_sec |          1 | User CPU time used (in seconds)                                    |
| utime_tv_usec |     943740 | User CPU time used (in microseconds)                               |
| stime_tv_sec  |          1 | System CPU time (in seconds)                                       |
| stime_tv_usec |     315574 | System CPU time (in microseconds)                                  |
| utime         |    1.94374 | Total user time                                                    |
| stime         |   1.315574 | Total system time                                                  |
| minflt        |      34783 | Page reclaims (soft page faults)                                   |
| majflt        |          0 | Page faults                                                        |
| nvcsw         |        503 | Number of voluntary context switches                               |
| nivcsw        |        135 | Number of involuntary context switches                             |
+---------------+------------+--------------------------------------------------------------------+
32 rows in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_CPUGOVERNOR;
+--------+-------------+
| name   | governor    |
+--------+-------------+
| cpu0   | performance |
+--------+-------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_CPUINFO;
+-----------+--------------+------------+-------+-------------------------------------------+----------+-----------+----------+------------+-------------+----------+---------+-----------+--------+----------------+-----+---------------+-------------+-----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+----------+--------------+-----------------+-----------------------------------+------------------+
| processor | vendor_id    | cpu_family | model | model_name                                | stepping | microcode | cpu_MHz  | cache_size | physical_id | siblings | core_id | cpu_cores | apicid | initial_apicid | fpu | fpu_exception | cpuid_level | wp  | flags                                                                                                                                                                                                                                                                   | bugs                                                                               | bogomips | clflush_size | cache_alignment | address_sizes                     | power_management |
+-----------+--------------+------------+-------+-------------------------------------------+----------+-----------+----------+------------+-------------+----------+---------+-----------+--------+----------------+-----+---------------+-------------+-----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+----------+--------------+-----------------+-----------------------------------+------------------+
| 0         | GenuineIntel | 6          | 63    | Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz | 2        | 0x43      | 2400.005 | 30720 KB   | 0           | 1        | 0       | 1         | 0      | 0              | yes | yes           | 13          | yes | fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave a | cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit | 4800.11  | 64           | 64              | 46 bits physical, 48 bits virtual |                  |
+-----------+--------------+------------+-------+-------------------------------------------+----------+-----------+----------+------------+-------------+----------+---------+-----------+--------+----------------+-----+---------------+-------------+-----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------+----------+--------------+-----------------+-----------------------------------+------------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_IOSCHEDULER;
+--------+-----------+
| device | scheduler |
+--------+-----------+
| xvda   | [noop]    |
+--------+-----------+
1 row in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_DISKSTATS;
+-----------+-----------+--------+---------------+--------------+--------------+---------------+----------------+---------------+-----------------+----------------+-----------------+------------------+---------------------------+
| major_num | minor_num | device | reads_success | reads_merged | sectors_read | time_reads_ms | writes_success | writes_merged | sectors_written | time_writes_ms | ios_in_progress | time_doing_io_ms | weighted_time_doing_io_ms |
+-----------+-----------+--------+---------------+--------------+--------------+---------------+----------------+---------------+-----------------+----------------+-----------------+------------------+---------------------------+
|       202 |         0 | xvda   |         10286 |           10 |       472913 |          7312 |           4137 |          2472 |          351864 |          14276 |               0 |             4452 |                     21580 |
|       202 |         1 | xvda1  |         10209 |           10 |       468929 |          7280 |           4137 |          2472 |          351864 |          14276 |               0 |             4436 |                     21548 |
|       202 |         2 | xvda2  |            40 |            0 |         3504 |            24 |              0 |             0 |               0 |              0 |               0 |               24 |                        24 |
+-----------+-----------+--------+---------------+--------------+--------------+---------------+----------------+---------------+-----------------+----------------+-----------------+------------------+---------------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_LOADAVG;
+--------+-------+------------------------+
| name   | value | comment                |
+--------+-------+------------------------+
| 1_min  |  0.09 | 1 minute load average  |
| 5_min  |  0.02 | 5 minute load average  |
| 15_min |  0.01 | 15 minute load average |
+--------+-------+------------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_MEMINFO;
+-----------------+------------+
| name            | value      |
+-----------------+------------+
| MemTotal        | 2090319872 |
| MemFree         | 1658920960 |
| MemAvailable    | 1762938880 |
| Buffers         |   22573056 |
| Cached          |  206209024 |
| SwapCached      |          0 |
| Active          |  284999680 |
| Inactive        |  100868096 |
| Active(anon)    |  156110848 |
| Inactive(anon)  |      53248 |
| Active(file)    |  128888832 |
| Inactive(file)  |  100814848 |
| Unevictable     |          0 |
| Mlocked         |          0 |
| SwapTotal       |          0 |
| SwapFree        |          0 |
| Dirty           |     811008 |
| Writeback       |          0 |
| AnonPages       |  157085696 |
| Mapped          |   53223424 |
| Shmem           |      65536 |
| Slab            |   29102080 |
| SReclaimable    |   18337792 |
| SUnreclaim      |   10764288 |
| KernelStack     |    2162688 |
| PageTables      |    3444736 |
| NFS_Unstable    |          0 |
| Bounce          |          0 |
| WritebackTmp    |          0 |
| CommitLimit     | 1045159936 |
| Committed_AS    |  770662400 |
| VmallocTotal    | 4294966272 |
| VmallocUsed     |          0 |
| VmallocChunk    |          0 |
| AnonHugePages   |          0 |
| ShmemHugePages  |          0 |
| ShmemPmdMapped  |          0 |
| HugePages_Total |          0 |
| HugePages_Free  |          0 |
| HugePages_Rsvd  |          0 |
| HugePages_Surp  |          0 |
| Hugepagesize    |    2097152 |
| DirectMap4k     |   60817408 |
| DirectMap2M     | 2086666240 |
+-----------------+------------+
44 rows in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_MEMORY;
+----------------+-----------------------+--------------------------------------+
| name           | value                 | comment                              |
+----------------+-----------------------+--------------------------------------+
| total_ram      |            2090319872 | Total usable main memory size        |
| free_ram       |            1452339200 | Available memory size                |
| used_ram       |             637980672 | Used memory size                     |
| free_ram_pct   |                 69.48 | Available memory as a percentage     |
| used_ram_pct   |                 30.52 | Free memory as a percentage          |
| shared_ram     |                 61440 | Amount of shared memory              |
| buffer_ram     |             108040192 | Memory used by buffers               |
| total_high_ram |                     0 | Total high memory size               |
| free_high_ram  |                     0 | Available high memory size           |
| total_low_ram  |            2090319872 | Total low memory size                |
| free_low_ram   |            1452339200 | Available low memory size            |
| maxrss         |       140308942222128 | Maximum resident set size            |
+----------------+-----------------------+--------------------------------------+
12 rows in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_MISC;
+-----------------------+-------------------+-------------------------------------------------+
| name                  | value             | comment                                         |
+-----------------------+-------------------+-------------------------------------------------+
| datadir_size          |        8318783488 | MySQL data directory size                       |
| datadir_size_free     |        2277470208 | MySQL data directory size free space            |
| datadir_size_used     |        6041313280 | MySQL data directory size used space            |
| datadir_size_used_pct |             72.62 | MySQL data directory used space as a percentage |
| uptime                |            100026 | Uptime (in seconds)                             |
| uptime_days           |                 1 | Uptime (in days)                                |
| uptime_hours          |                27 | Uptime (in hours)                               |
| procs                 |               122 | Number of current processes                     |
| swappiness            |                60 | Swappiness setting                              |
+-----------------------+-------------------+-------------------------------------------------+
9 rows in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_MOUNTS;
+------------+--------------------------+------------------+---------------------------+
| device     | mount_point              | file_system_type | mount_options             |
+------------+--------------------------+------------------+---------------------------+
| proc       | /proc                    | proc             | rw,relatime               |
| sysfs      | /sys                     | sysfs            | rw,relatime               |
| devtmpfs   | /dev                     | devtmpfs         | rw,relatime,size=1010060k |
| devpts     | /dev/pts                 | devpts           | rw,relatime,gid=5,mode=62 |
| tmpfs      | /dev/shm                 | tmpfs            | rw,relatime               |
| /dev/xvda1 | /                        | ext4             | rw,noatime,data=ordered   |
| devpts     | /dev/pts                 | devpts           | rw,relatime,gid=5,mode=62 |
| none       | /proc/sys/fs/binfmt_misc | binfmt_misc      | rw,relatime               |
+------------+--------------------------+------------------+---------------------------+
8 rows in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_NETWORK;
+-----------+------------+------------+----------+----------+------------+------------+-----------+-----------+
| interface | tx_packets | rx_packets | tx_bytes | rx_bytes | tx_dropped | rx_dropped | tx_errors | rx_errors |
+-----------+------------+------------+----------+----------+------------+------------+-----------+-----------+
| lo        |      26528 |      26528 |  1380012 |  1380012 |          0 |          0 |         0 |         0 |
| eth0      |     102533 |     144031 | 16962983 | 23600676 |          0 |          0 |         0 |         0 |
+-----------+------------+------------+----------+----------+------------+------------+-----------+-----------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_STAT;
+------+----------+-------+------+------+---------+--------+-------+---------+--------+---------+--------+---------+-------+-------+--------+--------+----------+------+-------------+-------------+-----------+------------+-------+-----------------------+-----------+----------+-----------------+---------+---------+--------+---------+-----------+----------+-------+-------+--------+-------------+-----------+-------------+--------+-----------------------+------------+-------------+------------+----------+-----------+-----------------+-----------------+-----------------+-----------------+-----------+
| pid  | comm     | state | ppid | pgrp | session | tty_nr | tpgid | flags   | minflt | cminflt | majflt | cmajflt | utime | stime | cutime | cstime | priority | nice | num_threads | itrealvalue | starttime | vsize      | rss   | rsslim                | startcode | endcode  | startstack      | kstkeep | kstkeip | signal | blocked | sigignore | sigcatch | wchan | nswap | cnswap | exit_signal | processor | rt_priority | policy | delayacct_blkio_ticks | guest_time | cguest_time | start_data | end_data | start_brk | arg_start       | arg_end         | env_start       | env_end         | exit_code |
+------+----------+-------+------+------+---------+--------+-------+---------+--------+---------+--------+---------+-------+-------+--------+--------+----------+------+-------------+-------------+-----------+------------+-------+-----------------------+-----------+----------+-----------------+---------+---------+--------+---------+-----------+----------+-------+-------+--------+-------------+-----------+-------------+--------+-----------------------+------------+-------------+------------+----------+-----------+-----------------+-----------------+-----------------+-----------------+-----------+
| 6656 | (mysqld) | S     | 2030 | 1896 |    1896 |      0 |    -1 | 4194304 |  34784 |       0 |      0 |       0 |    96 |    55 |      0 |      0 |       20 |    0 |          29 |           0 |    965078 | 1153900544 | 37324 | 1.8446744073709552e19 |   4194304 | 27414570 | 140728454321408 |       0 |       0 |      0 |  540679 |     12294 |     1768 |     0 |     0 |      0 |          17 |         0 |           0 |      0 |                     4 |          0 |           0 |   29511728 | 31209920 |  36462592 | 140728454327797 | 140728454328040 | 140728454328040 | 140728454328281 |         0 |
+------+----------+-------+------+------+---------+--------+-------+---------+--------+---------+--------+---------+-------+-------+--------+--------+----------+------+-------------+-------------+-----------+------------+-------+-----------------------+-----------+----------+-----------------+---------+---------+--------+---------+-----------+----------+-------+-------+--------+-------------+-----------+-------------+--------+-----------------------+------------+-------------+------------+----------+-----------+-----------------+-----------------+-----------------+-----------------+-----------+
1 row in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_SWAPINFO;
+---------------+-------+--------------------------------------+
| name          | value | comment                              |
+---------------+-------+--------------------------------------+
| total_swap    |     0 | Total swap space size                |
| free_swap     |     0 | Swap space available                 |
| used_swap     |     0 | Swap space used                      |
| free_swap_pct |     0 | Swap space available as a percentage |
| used_swap_pct |     0 | Swap space used as a percentage      |
+---------------+-------+--------------------------------------+
5 rows in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_VMSTAT;
+---------------------------+----------+
| name                      | value    |
+---------------------------+----------+
| nr_free_pages             |   354544 |
| nr_zone_inactive_anon     |       13 |
| nr_zone_active_anon       |    38257 |
| nr_zone_inactive_file     |    30742 |
| nr_zone_active_file       |    70367 |
| nr_zone_unevictable       |        0 |
| nr_zone_write_pending     |        9 |
| nr_mlock                  |        0 |
| nr_page_table_pages       |      854 |
| nr_kernel_stack           |     2176 |
| nr_bounce                 |        0 |
| nr_zspages                |        0 |
| nr_free_cma               |        0 |
| numa_hit                  | 14011351 |
| numa_miss                 |        0 |
| numa_foreign              |        0 |
| numa_interleave           |    15133 |
| numa_local                | 14011351 |
| numa_other                |        0 |
| nr_inactive_anon          |       13 |
| nr_active_anon            |    38257 |
| nr_inactive_file          |    30742 |
| nr_active_file            |    70367 |
| nr_unevictable            |        0 |
| nr_slab_reclaimable       |     9379 |
| nr_slab_unreclaimable     |     2829 |
| nr_isolated_anon          |        0 |
| nr_isolated_file          |        0 |
| workingset_refault        |        0 |
| workingset_activate       |        0 |
| workingset_nodereclaim    |        0 |
| nr_anon_pages             |    38504 |
| nr_mapped                 |    13172 |
| nr_file_pages             |   100873 |
| nr_dirty                  |        9 |
| nr_writeback              |        0 |
| nr_writeback_temp         |        0 |
| nr_shmem                  |       15 |
| nr_shmem_hugepages        |        0 |
| nr_shmem_pmdmapped        |        0 |
| nr_anon_transparent_hugep |        0 |
| nr_unstable               |        0 |
| nr_vmscan_write           |        0 |
| nr_vmscan_immediate_recla |        0 |
| nr_dirtied                |   389218 |
| nr_written                |   381326 |
| nr_dirty_threshold        |    87339 |
| nr_dirty_background_thres |    43616 |
| pgpgin                    |   619972 |
| pgpgout                   |  2180908 |
| pswpin                    |        0 |
| pswpout                   |        0 |
| pgalloc_dma               |        0 |
| pgalloc_dma32             | 14085334 |
| pgalloc_normal            |        0 |
| pgalloc_movable           |        0 |
| allocstall_dma            |        0 |
| allocstall_dma32          |        0 |
| allocstall_normal         |        0 |
| allocstall_movable        |        0 |
| pgskip_dma                |        0 |
| pgskip_dma32              |        0 |
| pgskip_normal             |        0 |
| pgskip_movable            |        0 |
| pgfree                    | 14440053 |
| pgactivate                |    55703 |
| pgdeactivate              |        1 |
| pglazyfree                |      249 |
| pgfault                   | 14687206 |
| pgmajfault                |     1264 |
| pglazyfreed               |        0 |
| pgrefill                  |        0 |
| pgsteal_kswapd            |        0 |
| pgsteal_direct            |        0 |
| pgscan_kswapd             |        0 |
| pgscan_direct             |        0 |
| pgscan_direct_throttle    |        0 |
| zone_reclaim_failed       |        0 |
| pginodesteal              |        0 |
| slabs_scanned             |        0 |
| kswapd_inodesteal         |        0 |
| kswapd_low_wmark_hit_quic |        0 |
| kswapd_high_wmark_hit_qui |        0 |
| pageoutrun                |        0 |
| pgrotated                 |       44 |
| drop_pagecache            |        0 |
| drop_slab                 |        0 |
| oom_kill                  |        0 |
| numa_pte_updates          |        0 |
| numa_huge_pte_updates     |        0 |
| numa_hint_faults          |        0 |
| numa_hint_faults_local    |        0 |
| numa_pages_migrated       |        0 |
| pgmigrate_success         |        0 |
| pgmigrate_fail            |        0 |
| compact_migrate_scanned   |        0 |
| compact_free_scanned      |        0 |
| compact_isolated          |        0 |
| compact_stall             |        0 |
| compact_fail              |        0 |
| compact_success           |        0 |
| compact_daemon_wake       |        0 |
| compact_daemon_migrate_sc |        0 |
| compact_daemon_free_scann |        0 |
| htlb_buddy_alloc_success  |        0 |
| htlb_buddy_alloc_fail     |        0 |
| unevictable_pgs_culled    |     1300 |
| unevictable_pgs_scanned   |        0 |
| unevictable_pgs_rescued   |      266 |
| unevictable_pgs_mlocked   |     2626 |
| unevictable_pgs_munlocked |     2626 |
| unevictable_pgs_cleared   |        0 |
| unevictable_pgs_stranded  |        0 |
| thp_fault_alloc           |        0 |
| thp_fault_fallback        |        0 |
| thp_collapse_alloc        |        0 |
| thp_collapse_alloc_failed |        0 |
| thp_file_alloc            |        0 |
| thp_file_mapped           |        0 |
| thp_split_page            |        0 |
| thp_split_page_failed     |        0 |
| thp_deferred_split_page   |        0 |
| thp_split_pmd             |        0 |
| thp_split_pud             |        0 |
| thp_zero_page_alloc       |        0 |
| thp_zero_page_alloc_faile |        0 |
| thp_swpout                |        0 |
| thp_swpout_fallback       |        0 |
| swap_ra                   |        0 |
| swap_ra_hit               |        0 |
+---------------------------+----------+
130 rows in set (0.00 sec)

mysql> SELECT * FROM INFORMATION_SCHEMA.OS_VERSION;
+------------+---------------------------+
| name       | value                     |
+------------+---------------------------+
| sysname    | Linux                     |
| nodename   | ip-172-31-25-133          |
| release    | 4.14.181-108.257.amzn1.x8 |
| version    | #1 SMP Wed May 27 02:43:0 |
| machine    | x86_64                    |
| domainname | (none)                    |
+------------+---------------------------+
6 rows in set (0.00 sec)

Uninstalling The Plugins

To uninstall the plugins, run the following SQL commands. Note that UNINSTALL PLUGIN only deregisters them; to remove them completely, you will also need to delete the shared library from your plugin directory.

mysql> UNINSTALL PLUGIN OS_CPU;
mysql> UNINSTALL PLUGIN OS_CPUGOVERNOR;
mysql> UNINSTALL PLUGIN OS_CPUINFO;
mysql> UNINSTALL PLUGIN OS_IOSCHEDULER;
mysql> UNINSTALL PLUGIN OS_DISKSTATS;
mysql> UNINSTALL PLUGIN OS_LOADAVG;
mysql> UNINSTALL PLUGIN OS_MEMINFO;
mysql> UNINSTALL PLUGIN OS_MEMORY;
mysql> UNINSTALL PLUGIN OS_MISC;
mysql> UNINSTALL PLUGIN OS_MOUNTS;
mysql> UNINSTALL PLUGIN OS_NETWORK;
mysql> UNINSTALL PLUGIN OS_STAT;
mysql> UNINSTALL PLUGIN OS_SWAPINFO;
mysql> UNINSTALL PLUGIN OS_VERSION;
mysql> UNINSTALL PLUGIN OS_VMSTAT;

Alternatively, you can run the uninstall SQL script:

mysql> SOURCE /path/to/uninstall_plugins.sql
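
If you are not sure where your plugin directory is for that final cleanup step, the server can tell you. The library filename below is a placeholder; use whatever name your plugin build produced:

mysql> SHOW VARIABLES LIKE 'plugin_dir';

Then, as root on the OS, remove the shared library from that directory:

rm /usr/lib/mysql/plugin/<plugin_library>.so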

What’s Next?

Who knows! If there is enough interest, I would be happy to expand the plugins. First, I need to do some more code cleanup and performance test them. I do not expect them to have a significant performance impact, but one never knows until you test…

Jan
05
2021
--

MongoDB 101: 5 Configuration Options That Impact Performance and How to Set Them

MongoDB configuration options that impact Performance

As with any database platform, MongoDB performance is of paramount importance to keeping your application running quickly. In this blog post, we’ll show you five configuration options that can impact the performance of your MongoDB deployment and help keep your database fast and performing at its peak.

MongoDB Performance Overview

MongoDB performance depends on several factors: OS settings, DB configuration settings, DB internal settings, memory settings, and application settings. This post focuses on the options set in the database configuration itself that can impact your performance, and on how to set them.

Configuration Options

So how do we ensure our performance-related configuration options are enabled and set up correctly? And which ones are the most important? We’ll now go through five configuration options that will help keep your MongoDB environment performant!

MongoDB uses a configuration file in the YAML file format.  The configuration file is usually found in the following locations, depending on your Operating System:

DEFAULT CONFIGURATION FILE

  • On Linux, a default /etc/mongod.conf configuration file is included when using a package manager to install MongoDB.
  • On Windows, a default <install directory>/bin/mongod.cfg configuration file is included during the installation.
  • On macOS, a default /usr/local/etc/mongod.conf configuration file is included when installing from MongoDB’s official Homebrew tap.

storage.wiredTiger.engineConfig.cacheSizeGB

Our first configuration option to help with your MongoDB performance is storage.wiredTiger.engineConfig.cacheSizeGB.

storage:
   wiredTiger:
       engineConfig:
           cacheSizeGB: <value>

MongoDB has used WiredTiger as its default storage engine since version 3.2, so we’ll be examining MongoDB memory performance from a WiredTiger perspective. By default, MongoDB reserves the larger of 50% of (RAM - 1 GB) or 256 MB for the WiredTiger cache. For example, a system with 16 GB of RAM would have a WiredTiger cache size of 7.5 GB:

( 0.5 * (16-1) )

The size of this cache is important to ensure WiredTiger is performant. It’s worth taking a look to see if you should alter it from the default. A good rule of thumb is that the size of the cache should be large enough to hold the entire application working set.

Note that if you’re in a containerized environment, you may also need to set the configuration option to 50% of (container RAM - 1 GB), based on the memory available to the container. MongoDB may adhere to the container’s memory limits or it may pick up the host’s memory limit, depending on what the relevant system call returns when MongoDB asks. You can verify what MongoDB believes the memory limit is by running:

db.hostInfo()

And checking the hostInfo.system.memLimitMB value. This is available from MongoDB 3.6.13 and 4.0.9 onward.
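
For example, from a connected mongo shell:

db.hostInfo().system.memLimitMB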

How do we know whether to increase or decrease our cache size? Look at the cache usage statistics:

db.serverStatus().wiredTiger.cache
{
"application threads page read from disk to cache count" : 9,
"application threads page read from disk to cache time (usecs)" : 17555,
"application threads page write from cache to disk count" : 1820,
"application threads page write from cache to disk time (usecs)" : 1052322,
"bytes allocated for updates" : 20043,
"bytes belonging to page images in the cache" : 46742,
"bytes belonging to the history store table in the cache" : 173,
"bytes currently in the cache" : 73044,
"bytes dirty in the cache cumulative" : 38638327,
"bytes not belonging to page images in the cache" : 26302,
"bytes read into cache" : 43280,
"bytes written from cache" : 20517382,
"cache overflow score" : 0,
"checkpoint blocked page eviction" : 0,
"eviction calls to get a page" : 5973,
"eviction calls to get a page found queue empty" : 4973,
"eviction calls to get a page found queue empty after locking" : 20,
"eviction currently operating in aggressive mode" : 0,
"eviction empty score" : 0,
"eviction passes of a file" : 0,
"eviction server candidate queue empty when topping up" : 0,
"eviction server candidate queue not empty when topping up" : 0,
"eviction server evicting pages" : 0,
"eviction server slept, because we did not make progress with eviction" : 735,
"eviction server unable to reach eviction goal" : 0,
"eviction server waiting for a leaf page" : 2,
"eviction state" : 64,
"eviction walk target pages histogram - 0-9" : 0,
"eviction walk target pages histogram - 10-31" : 0,
"eviction walk target pages histogram - 128 and higher" : 0,
"eviction walk target pages histogram - 32-63" : 0,
"eviction walk target pages histogram - 64-128" : 0,
"eviction walk target strategy both clean and dirty pages" : 0,
"eviction walk target strategy only clean pages" : 0,
"eviction walk target strategy only dirty pages" : 0,
"eviction walks abandoned" : 0,
"eviction walks gave up because they restarted their walk twice" : 0,
"eviction walks gave up because they saw too many pages and found no candidates" : 0,
"eviction walks gave up because they saw too many pages and found too few candidates" : 0,
"eviction walks reached end of tree" : 0,
"eviction walks started from root of tree" : 0,
"eviction walks started from saved location in tree" : 0,
"eviction worker thread active" : 4,
"eviction worker thread created" : 0,
"eviction worker thread evicting pages" : 902,
"eviction worker thread removed" : 0,
"eviction worker thread stable number" : 0,
"files with active eviction walks" : 0,
"files with new eviction walks started" : 0,
"force re-tuning of eviction workers once in a while" : 0,
"forced eviction - history store pages failed to evict while session has history store cursor open" : 0,
"forced eviction - history store pages selected while session has history store cursor open" : 0,
"forced eviction - history store pages successfully evicted while session has history store cursor open" : 0,
"forced eviction - pages evicted that were clean count" : 0,
"forced eviction - pages evicted that were clean time (usecs)" : 0,
"forced eviction - pages evicted that were dirty count" : 0,
"forced eviction - pages evicted that were dirty time (usecs)" : 0,
"forced eviction - pages selected because of too many deleted items count" : 0,
"forced eviction - pages selected count" : 0,
"forced eviction - pages selected unable to be evicted count" : 0,
"forced eviction - pages selected unable to be evicted time" : 0,
"forced eviction - session returned rollback error while force evicting due to being oldest" : 0,
"hazard pointer blocked page eviction" : 0,
"hazard pointer check calls" : 902,
"hazard pointer check entries walked" : 25,
"hazard pointer maximum array length" : 1,
"history store key truncation calls that returned restart" : 0,
"history store key truncation due to mixed timestamps" : 0,
"history store key truncation due to the key being removed from the data page" : 0,
"history store score" : 0,
"history store table insert calls" : 0,
"history store table insert calls that returned restart" : 0,
"history store table max on-disk size" : 0,
"history store table on-disk size" : 0,
"history store table out-of-order resolved updates that lose their durable timestamp" : 0,
"history store table out-of-order updates that were fixed up by moving existing records" : 0,
"history store table out-of-order updates that were fixed up during insertion" : 0,
"history store table reads" : 0,
"history store table reads missed" : 0,
"history store table reads requiring squashed modifies" : 0,
"history store table remove calls due to key truncation" : 0,
"history store table writes requiring squashed modifies" : 0,
"in-memory page passed criteria to be split" : 0,
"in-memory page splits" : 0,
"internal pages evicted" : 0,
"internal pages queued for eviction" : 0,
"internal pages seen by eviction walk" : 0,
"internal pages seen by eviction walk that are already queued" : 0,
"internal pages split during eviction" : 0,
"leaf pages split during eviction" : 0,
"maximum bytes configured" : 8053063680,
"maximum page size at eviction" : 376,
"modified pages evicted" : 902,
"modified pages evicted by application threads" : 0,
"operations timed out waiting for space in cache" : 0,
"overflow pages read into cache" : 0,
"page split during eviction deepened the tree" : 0,
"page written requiring history store records" : 0,
"pages currently held in the cache" : 24,
"pages evicted by application threads" : 0,
"pages queued for eviction" : 0,
"pages queued for eviction post lru sorting" : 0,
"pages queued for urgent eviction" : 902,
"pages queued for urgent eviction during walk" : 0,
"pages read into cache" : 20,
"pages read into cache after truncate" : 902,
"pages read into cache after truncate in prepare state" : 0,
"pages requested from the cache" : 33134,
"pages seen by eviction walk" : 0,
"pages seen by eviction walk that are already queued" : 0,
"pages selected for eviction unable to be evicted" : 0,
"pages selected for eviction unable to be evicted as the parent page has overflow items" : 0,
"pages selected for eviction unable to be evicted because of active children on an internal page" : 0,
"pages selected for eviction unable to be evicted because of failure in reconciliation" : 0,
"pages walked for eviction" : 0,
"pages written from cache" : 1822,
"pages written requiring in-memory restoration" : 0,
"percentage overhead" : 8,
"tracked bytes belonging to internal pages in the cache" : 5136,
"tracked bytes belonging to leaf pages in the cache" : 67908,
"tracked dirty bytes in the cache" : 493,
"tracked dirty pages in the cache" : 1,
"unmodified pages evicted" : 0
}

There’s a lot of data here about WiredTiger’s cache, but we can focus on a few fields that appear in the output above:

  • maximum bytes configured
  • bytes currently in the cache
  • tracked dirty bytes in the cache
  • pages read into cache
  • pages written from cache

Looking at those values, we can determine if we need to increase the size of the WiredTiger cache for our instance. Additionally, for read-heavy applications, we can look at the wiredTiger.cache.pages read into cache value; if it is consistently high, increasing the cache size may improve overall read performance.
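
As a rough, illustrative check (a sketch, not an official formula), you can compute the cache usage ratios from those fields in the mongo shell:

var c = db.serverStatus().wiredTiger.cache;
var max = c["maximum bytes configured"];
// how full the configured cache currently is
print("cache used %:  " + (100 * c["bytes currently in the cache"] / max).toFixed(1));
// how much of the configured cache is dirty
print("cache dirty %: " + (100 * c["tracked dirty bytes in the cache"] / max).toFixed(1));

By default, WiredTiger begins evicting pages once the cache is about 80% full, so values consistently near that threshold suggest the cache is undersized.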

storage.wiredTiger.engineConfig.directoryForIndexes

Our second configuration option is storage.wiredTiger.engineConfig.directoryForIndexes.

storage:
   wiredTiger:
       engineConfig:
           directoryForIndexes: <true or false>

Setting this value to true creates two directories in your storage.dbPath directory: one named collection, which holds your collection data files, and one named index, which holds your index data files. This allows you to mount separate storage volumes for collections and indexes, spreading disk I/O across the volumes and separating index I/O from collection I/O, which can reduce storage-based latencies. That said, index I/O is unlikely to be costly due to its smaller size, and with most modern storage options you can get similar benefits by simply striping your disk across two volumes (RAID 0).
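
For illustration, here is one way to relocate the newly created index directory onto its own volume. This is only a sketch: the paths and the service user are illustrative, and mongod must be stopped while you do it:

# stop mongod before touching the data directory
systemctl stop mongod
# move the index directory to a dedicated volume and symlink it back
mv /var/lib/mongodb/index /mnt/fast-disk/index
ln -s /mnt/fast-disk/index /var/lib/mongodb/index
chown -h mongodb:mongodb /var/lib/mongodb/index
systemctl start mongod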

storage.wiredTiger.collectionConfig.blockCompressor

Our third configuration option is storage.wiredTiger.collectionConfig.blockCompressor.

storage:
   wiredTiger:
       collectionConfig:
           blockCompressor: <value>

This option sets the compression options for all of your collection data. Possible values for this parameter are none, snappy (the default), zlib, and zstd. So how does compression affect your performance? The WiredTiger cache generally stores data uncompressed, with the exception of some very large documents, so compression only comes into play when that data is written to disk.

Compression Types:

Snappy compression is fairly straightforward: it gathers your data up to a maximum of 32KB, compresses it, and, if compression is successful, writes the block rounded up to the nearest 4KB.

Zlib compression works a little differently; it will gather more data and compress enough to fill a 32KB block on disk. This is more CPU-intensive but generally results in better compression ratios (independent of the inherent differences between snappy and zlib).

Zstd is a newer compression algorithm, developed by Facebook, that offers improvements over zlib (better compression ratios, lower CPU usage, faster performance).

Which compression algorithm to choose depends greatly on your workload.  For most write-heavy workloads, snappy compression will perform better than zlib and zstd but will require more disk space.  For read-heavy workloads, zstd is often the best choice because of its better decompression rates.
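
As a side note, WiredTiger also lets you override the compressor for an individual collection at creation time, which can be handy for mixed workloads. A sketch from the mongo shell; the collection name is illustrative:

db.createCollection("events", {
   storageEngine: { wiredTiger: { configString: "block_compressor=zstd" } }
})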

storage.directoryPerDB

Another configuration option to help with MongoDB performance is storage.directoryPerDB.

storage:
   directoryPerDB: <true or false>

Similar to the above configuration file option, storage.wiredTiger.engineConfig.directoryForIndexes, setting this value to true creates a separate directory in your storage.dbPath for each database in your MongoDB instance. This allows you to mount separate storage volumes for each database if you wish, spreading disk I/O across the volumes, which can help when you have multiple databases with intensive I/O needs. Additionally, if you use this parameter in tandem with storage.wiredTiger.engineConfig.directoryForIndexes, your directory structure will look like this:

-Database_name
    -Collection
    -Index

net.compression.compressors

Our final configuration option that can help keep your database performant is the net.compression.compressors configuration option.

net:
   compression:
       compressors: <value>

This option allows you to compress the network traffic between your mongos, mongod, and even your mongo shell. There are currently three types of compression available: snappy, zlib, and zstd. Compression has been enabled by default since MongoDB 3.6. In MongoDB 3.6 and 4.0, snappy was the default; since MongoDB 4.2, the default is snappy, zstd, and zlib, in that order. It’s also important to note that both sides of a network conversation must share at least one compressor for compression to happen. For example, if your shell uses zlib compression but your mongod is set to accept only snappy compression, no compression will occur between the two; if both accept zstd compression, then zstd compression would be used between them. Enabling compression can be very helpful in reducing replication lag and overall network latency, as the size of the data moving across the network is decreased, sometimes dramatically. In cloud environments, setting this configuration option can also lead to decreased data transfer costs.
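
For example, a configuration that prefers zstd but stays compatible with older peers might look like this (a sketch; the ordering expresses preference):

net:
   compression:
       compressors: zstd,snappy,zlib

On the client side, the mongo shell accepts a matching option, e.g. mongo --networkMessageCompressors "zstd".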

Summary:

In this blog post, we’ve gone over five MongoDB configuration options that help keep your deployment fast and performing at its peak. We hope they help you avoid slowdowns and bottlenecks. Thanks for reading!
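
For reference, here is a minimal sketch of a mongod.conf combining all five options discussed above. The values are illustrative and should be tuned to your hardware and workload; note that the directory layout options must be set before the instance first creates data:

storage:
   dbPath: /var/lib/mongodb
   directoryPerDB: true
   wiredTiger:
       engineConfig:
           cacheSizeGB: 7.5
           directoryForIndexes: true
       collectionConfig:
           blockCompressor: snappy
net:
   compression:
       compressors: snappy,zstd,zlib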

Additional Resources: MongoDB Best Practices 2020 Edition

Jan
04
2021
--

Percona Monthly Bug Report: December 2020

Percona Bug Report Dec 2020

Here at Percona, we operate on the premise that full transparency makes a product better. We strive to build the best open-source database products, and also to help you manage any issues that arise in any of the databases that we support. And, in true open source form, we report back on any issues or bugs you might encounter along the way.

We constantly update our bug reports and monitor other boards to ensure we have the latest information, but we wanted to make it a little easier for you to keep track of the most critical ones. This monthly post is a central place to get information on the most noteworthy open and recently resolved bugs. 

In this December 2020 edition of our monthly bug report, we have the following list of bugs:

Percona Server for MySQL/MySQL Bugs

PS-7264 (MySQL #83799): ALTER TABLE fails with the following error for a table with a self-referential foreign key (one referring to the same table), leaving the table in an inconsistent state.

 [ERROR] InnoDB: dict_load_foreigns() returned 38 for ALTER TABLE t1

After a mysqld restart, accessing this table will throw a warning about missing foreign key indexes.

Affects Version/s: 5.6,5.7  [Tested/Reported version 5.6.47-87.0, 5.7.30-33]

PS-5641 (MySQL#95863): Deadlock on an MTS replica while running administrative commands, caused by one of the MTS threads and the administrative command thread both waiting for a commit lock. As a result of this bug, you may see replication lag along with the deadlock. This is a design flaw in the way MTS locks interact with non-replication locks like MDL or ACL locks, combined with the lack of deadlock detection infrastructure in replication.

There are three ways of hitting this issue:

  • SET GLOBAL read_only=[ON/OFF]
  • SET GLOBAL super_read_only=[ON/OFF]
  • FLUSH TABLES WITH READ LOCK

 Affects Version/s: 5.7,8.0  [Tested/Reported version 5.7.26, 8.0.15,8.0.20]

PS-7437 (MySQL#101716): We use Performance Schema to check various stats while troubleshooting, but be careful when checking memory stats from the memory_summary_by_account_by_event_name table, which shows an incorrect “current_number_of_bytes_used” value for events.

 Affects Version/s: 5.7,8.0  [Tested/Reported version 5.7.31-34, 8.0.21-12]

Percona XtraDB Cluster

PXC-3418: A PXC cluster node can hang when there is a concurrent read-write load on a table and a parallel in-place ALTER TABLE is running on the same table.

For example, the deadlock can be seen with the following steps:

  • An in-place ALTER TABLE query in one session.
  • A DML statement on the same table from another session.

Affects Version/s: 5.7,8.0

Fixed version/s:  5.7.33, 8.0.21

PXC-3366: After upgrading an SSL-traffic-enabled PXC 5.7 cluster to PXC 8.0, the upgraded node will not show any status/variables; the output is always empty. Also, taking a backup of this upgraded PXC 8.0 node using xtrabackup will fail with a crash due to this bug.

Affects Version/s: 8.0   [Tested/Reported 8.0.19]

Percona XtraBackup

PXB-2314: Percona XtraBackup 8.0.14 is not compatible with MySQL 8.0.22 due to disk format changes introduced in the 8.0.22 release. The issue is fixed in XtraBackup 8.0.22; use XtraBackup 8.0.22 or later for taking backups of Percona Server/MySQL version 8.0.22 and above.

Affects Version/s:  8.0.14  

Fixed version/s:  8.0.22

PXB-2375: In some cases, xtrabackup will write the wrong binlog filename, position, and GTID combination in xtrabackup_binlog_info.

If we use this backup, with the GTID position details in xtrabackup_binlog_info, to create a new replica, then replication will most likely break due to the incorrect GTID position.

It appears the GTID position is not consistent with the binlog file position; they are captured differently and later printed together in the xtrabackup_binlog_info file.

Affects Version/s:  8.0 [Tested/Reported version 8.0.14]

Percona Toolkit

PT-1891: pt-mongodb-summary fails for SSL-enabled MongoDB instances because it cannot connect to them.

Affects Version/s:  3.1.10

Fixed Version/s: 3.3.0

PT-1747: pt-online-schema-change was bringing the database into a broken state when applying the “rebuild_constraints” foreign key modification method if any of the child tables were blocked by a metadata lock.

Affects Version/s:  3.x   [Tested/Reported version 3.0.13]

This is a critical bug since it can cause data inconsistency in the user environment. It potentially affects anyone who rebuilds tables with foreign keys, and in particular those who rebuild tables with a self-referencing foreign key.

PMM  [Percona Monitoring and Management]

PMM-7092:  Upgrade to PMM 2.12.0 fails with the container in an unhealthy state.   

The upgrade process fails with the following error:

/srv/victoriametrics/data with -retentionPeriod=30d: cannot create a directory for the storage at “/srv/victoriametrics/data”: mkdir /srv/victoriametrics: permission denied

Starting with the PMM 2.12.0 release, PMM integrated VictoriaMetrics, which replaced Prometheus as the default method of data ingestion.

There is an easy workaround:

1) docker exec -ti pmm-server mkdir -p /srv/victoriametrics/data

2) docker exec -ti pmm-server chown -R pmm:pmm /srv/victoriametrics/

3) docker restart pmm-server

Affects Version/s:  2.12.0  [Tested/Reported version 2.12.0]

Fixed Version: PMM-2.13.0

PMM-7178: In PMM’s PostgreSQL monitoring, the pg_stat_monitor integration is broken due to missing columns.

The agent’s query fails on the PostgreSQL side because many of the columns it references (for example, total_calls, effected_rows, tables_names) no longer exist in the pg_stat_monitor view.

The bug applies to --query-source=pgstatmonitor; with the default query source, i.e. pg_stat_statements, we don’t see this issue.

PMM-4547: The MongoDB dashboard replication lag count is incorrect.

With a hidden replica (with or without a delay specified), another replica will show a lag of the maximum unsigned integer value (about 136 years).

The problem is that sometimes MongoDB reports (in getDiagnosticData) that the primary is behind the secondary. Since timestamps are unsigned integers, the subtraction overflows, and that’s why we see a 100+ year lag.

Affects Version/s:  2.9  [Tested/Reported version 2.0,2.9.0]

Fixed Version: PMM-2.12.0

Summary

We welcome community input and feedback on all our products. If you find a bug or would like to suggest an improvement or a feature, learn how in our post, How to Report Bugs, Improvements, New Feature Requests for Percona Products.

For the most up-to-date information, be sure to follow us on Twitter, LinkedIn, and Facebook.

Quick References:

Percona JIRA  

MySQL Bug Report

Report a Bug in a Percona Product

___

About Percona:

As the only provider of distributions for all three of the most popular open source databases—PostgreSQL, MySQL, and MongoDB—Percona provides expertise, software, support, and services no matter the technology.

Whether it’s enabling developers or DBAs to realize value faster with tools, advice, and guidance, or making sure applications can scale and handle peak loads, Percona is here to help.

Percona is committed to being open source and preventing vendor lock-in. Percona contributes all changes to the upstream community for possible inclusion in future product releases.

Nov
23
2020
--

Uncommon Sense MySQL – When EXPLAIN Can Trash Your Database

When EXPLAIN Can Trash Your Database

If I ask you whether running EXPLAIN on a query can change your database, you will probably tell me NO; it is common sense. EXPLAIN should show us how the query is executed, not execute the query, hence it can’t change any data.

Unfortunately, this is a case where common sense does not apply to MySQL (at the time of this writing, MySQL 8.0.21 and previous versions) – there are edge cases where EXPLAIN can actually change your database, as this bug illustrates:

DELIMITER $$
CREATE FUNCTION `cleanup`() RETURNS char(50) CHARSET latin1
    DETERMINISTIC
BEGIN 
delete from test.t1;
RETURN 'OK'; 
END $$

Query OK, 0 rows affected (0.01 sec)

DELIMITER ;

mysql> create table t1(i int);
mysql> insert into t1 values(1); 
Query OK, 1 row affected (0.00 sec)

mysql> select * from t1; 
+------+
| i    |
+------+
|    1 |
+------+
1 row in set (0.00 sec)


mysql> explain select * from (select cleanup()) as t1clean; 
+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table      | partitions | type   | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+----------------+
|  1 | PRIMARY     | <derived2> | NULL       | system | NULL          | NULL | NULL    | NULL |    1 |   100.00 | NULL           |
|  2 | DERIVED     | NULL       | NULL       | NULL   | NULL          | NULL | NULL    | NULL | NULL |     NULL | No tables used |
+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+----------------+
2 rows in set, 1 warning (0.00 sec)


mysql> select * from t1;
Empty set (0.00 sec)

The problem is EXPLAIN executes the cleanup() stored function… which is permitted to modify data. This is different from the more sane PostgreSQL behavior which will NOT execute stored functions while running EXPLAIN (it will if you run EXPLAIN ANALYZE).

This decision in the MySQL case comes from trying to do the right thing and provide the most reliable EXPLAIN output (the query execution plan may well depend on what the stored function returns), but it looks like this security tradeoff was not considered.

While this consequence of the current MySQL EXPLAIN design is one of the most severe, there is also the problem that EXPLAIN – which a rational user would expect to be a fast way to check the performance of a query – can take unbounded time to complete, for example:

mysql> explain select * from (select sleep(5000) as a) b;

This will run for more than an hour, creating an additional accidental (or not so accidental) Denial of Service attack vector.

Going Deeper Down the Rabbit Hole

While this behavior is unfortunate, it will happen only if you have unrestricted privileges.  If you have a more complicated setup, the behavior may vary.

If the user lacks EXECUTE privilege, the EXPLAIN statement will fail.

mysql> explain select * from (select cleanup()) as t1clean;
ERROR 1370 (42000): execute command denied to user 'msandbox_ro'@'localhost' for routine 'test.cleanup'

If the user has EXECUTE privilege but the user executing the stored function lacks DELETE privilege, it will fail too:

mysql> explain select * from (select cleanup()) as t1clean;
ERROR 1142 (42000): DELETE command denied to user 'msandbox_ro'@'localhost' for table 't2'

Note: I’m saying the user executing the stored function, rather than the current user, because depending on the SECURITY clause in the stored function definition, it may run either as the definer or as the invoker.

So what can you do if you want to improve EXPLAIN safety, for example, if you’re developing a tool like Percona Monitoring and Management which, among other features, allows users to run EXPLAIN on their queries?

  • Advise users to set up privileges for monitoring correctly.  It should be the first line of defense from this (and many other) issues, however, it is hard to rely on.  Many users will choose the path of simplicity and will use “root” user with full privileges for monitoring.
  • Wrap your EXPLAIN statement in BEGIN … ROLLBACK, which will undo any damage EXPLAIN may have caused. The downside, of course, is that the “work” of deleting the data, and then of undoing it, still gets done. (Note: this only works for transactional tables; if you still run MyISAM… well, in that case you have worse problems to worry about.)
  • Use SET TRANSACTION READ ONLY (or START TRANSACTION READ ONLY) to signal you’re not expecting any writes; an EXPLAIN that tries to write data will then fail without doing any work. Both approaches are sketched below.
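
A minimal sketch of both workarounds, reusing the cleanup() example from above (the rollback variant assumes a transactional storage engine such as InnoDB):

mysql> BEGIN;
mysql> EXPLAIN SELECT * FROM (SELECT cleanup()) AS t1clean;
mysql> ROLLBACK;   -- undoes any rows the function deleted

mysql> START TRANSACTION READ ONLY;
mysql> EXPLAIN SELECT * FROM (SELECT cleanup()) AS t1clean;   -- the write is rejected before any work is done
mysql> ROLLBACK;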

While these workarounds can make tools run EXPLAIN more safely, they do not help users running EXPLAIN directly, and I really hope this issue will be fixed by redesigning EXPLAIN so that it does not try to run stored functions, as PostgreSQL already does.

For those who want to know how the query is executed EXACTLY, there is now EXPLAIN ANALYZE.

Nov
09
2020
--

OpenEBS for the Management of Kubernetes Storage Volumes

OpenEBS for the Management of Kubernetes Storage Volumes

I recently wrote about OpenEBS in regard to local volumes in my post Deploying Percona Kubernetes Operators with OpenEBS Local Storage, and there are also my recent blog posts about Kubernetes, Percona Operators, and storage management problems.

Working with my own deployment of Kubernetes, I struggled for a long time with how to manage storage volumes on my own hardware.

If you are using Kubernetes from a cloud provider, or with a NAS/SAN system that comes with Kubernetes support, this is less of a problem, but I was looking for a widely available (open source) solution that could provide cloud-like flexibility and be feature-rich.

Today I want to take a look at OpenEBS as a general storage management solution for Kubernetes. Besides the already mentioned LocalPV, there are two more storage options: cStor and zfs-localpv.

What is so special about those?

cStor

cStor allows you to combine storage capacity from multiple nodes and provides the following features (I took the info from https://docs.openebs.io/docs/next/casengines.html):

FEATURE                                      CSTOR
Lightweight and completely in user space     Yes
Synchronous replication                      Yes
Snapshots and cloning support                Advanced
Data consistency                             Yes
Suitable for high capacity workloads         Yes
Disk pool or aggregate support               Yes
On-demand capacity expansion                 Yes
Data resiliency (RAID support)               Yes
Near disk performance                        No

As we can see, cStor offers capabilities comparable to the storage volumes offered by cloud providers, such as replication, data snapshots and cloning, aggregation of multiple disks, etc.

The only downside (and it might be an important factor in your considerations) is that performance will be worse compared to local storage. The most limiting factor is network performance: if the storage nodes share the same network with the workload nodes, they will compete for network resources. Also, the IO latency (which is important for databases) will be bounded by network latency.

Zfs-localpv

This is where zfs-localpv comes into play. If you need better performance and still want to use some advanced capabilities like volume snapshots and cloning, you can choose the zfs-localpv storage provisioner.

As the name suggests, this provisioner uses the ZFS filesystem over local storage, where local storage offers high-performance IO and ZFS offers more features than a traditional Linux filesystem.
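
As an illustration of how provisioning looks in practice, here is a minimal sketch of a zfs-localpv StorageClass and a PVC that uses it. It assumes a ZFS pool named zfspv-pool already exists on the node; the names and parameters are illustrative, so check the OpenEBS documentation for your version:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-zfspv
provisioner: zfs.csi.openebs.io
parameters:
  poolname: "zfspv-pool"   # existing ZFS pool on the node
  fstype: "zfs"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-zfs-pvc
spec:
  storageClassName: openebs-zfspv
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi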

There is one more storage engine in development: Mayastor, which MayaData covered in our community blog.

Conclusion

I praise MayaData, the creators of OpenEBS, for taking the lead on providing open source volume management tools for Kubernetes. It is a valid choice for on-premises Kubernetes deployments, or for when you do not want to rely on cloud provider volumes.

Nov
06
2020
--

Various Backup Compression Methods Using Mysqlpump

Backup Compression Methods Using Mysqlpump

Mysqlpump is a client program that was released with MySQL 5.7.8 and performs logical backups with several improvements over mysqldump. Mysqlpump supports parallelism and can create compressed output. Pablo already wrote a blog about this utility (The mysqlpump Utility), and in this blog, I am going to explore the compression techniques available in the mysqlpump utility.

Overview

Mysqlpump has three options related to compressed backups.

--compress: Used to compress all information sent between the client and the server.

--compression-algorithms: Added in MySQL 8.0.18. Defines the permitted compression algorithms for connections to the server (available options: zlib, zstd, uncompressed).

--compress-output: Used to define the compression algorithm for the backup file (available options: lz4, zlib).

Here, --compress-output is the option that defines the compression algorithm for the backup file itself. It supports two algorithms:

  • Lz4
  • Zlib

Lz4: LZ4 is a lossless data compression algorithm that is focused on compression and decompression speed.

Zlib: zlib is a software library used for data compression. zlib compressed data are typically written with a gzip or a zlib wrapper. 

Lab Setup

To experiment with both compression techniques, I have installed the MySQL (8.0.22) server in my local environment. I also created the table percona_test.mp_test, which is 11 GB in size.

[root@localhost]# mysql -e "select @@version, @@version_comment\G"
*************************** 1. row ***************************
        @@version: 8.0.22
@@version_comment: MySQL Community Server - GPL

[root@localhost]# mysql -e "select count(*) from percona_test.mp_test\G"
*************************** 1. row ***************************
count(*): 70698024

[root@localhost percona_test]# ls -lrth
total 11G
-rw-r-----. 1 mysql mysql 11G Oct 23 11:20 mp_test.ibd

Now, I am going to experiment with both compression algorithms.

Compression with Lz4

I am going to take the backup (table: mp_test) using the lz4 compression algorithm.

[root@localhost]# time mysqlpump --set-gtid-purged=off --compress --compress-output=lz4 percona_test mp_test > percona_test.mp_test.lz4
Dump progress: 0/1 tables, 250/70131715 rows
Dump progress: 0/1 tables, 133000/70131715 rows
Dump progress: 0/1 tables, 278500/70131715 rows
...
...
Dump progress: 0/1 tables, 70624000/70131715 rows
Dump completed in 540824
real 9m0.857s

It took about nine minutes to complete. And the file size is 1.1 GB – roughly 10x compression.

[root@dc1 percona_test]# ls -lrth | grep lz4

-rw-r--r--. 1 root  root  1.1G Oct 23 12:47 percona_test.mp_test.lz4

Compression with Zlib

Now, I am going to start the backup with the zlib algorithm.

[root@dc1]# time mysqlpump --set-gtid-purged=off --compress --compress-output=zlib percona_test mp_test > percona_test.mp_test.zlib
Dump progress: 0/1 tables, 250/70131715 rows
Dump progress: 0/1 tables, 133250/70131715 rows
Dump progress: 0/1 tables, 280250/70131715 rows
Dump progress: 0/1 tables, 428750/70131715 rows
...
...
Dump progress: 0/1 tables, 70627000/70131715 rows
Dump completed in 546249
real 10m6.436s

It took just over ten minutes to complete the process. And the file size is the same 1.1 GB (10x compression).

[root@dc1]# ls -lrth | grep -i zlib

-rw-r--r--. 1 root  root  1.1G Oct 23 13:06 percona_test.mp_test.zlib

How to Decompress the Backup

The MySQL community distribution provides two utilities to decompress the backups.

  • zlib_decompress ( for zlib compression files )
  • lz4_decompress ( for lz4 compression files )

lz4_decompress

[root@dc1]# time lz4_decompress percona_test.mp_test.lz4 percona_test.mp_test.sql
real 0m45.287s
user 0m1.114s
sys 0m6.568s
[root@dc1]# ls -lrth | grep percona_test.mp_test.sql
-rw-r--r--. 1 root  root  9.1G Oct 23 13:30 percona_test.mp_test.sql

lz4 took 45 seconds to decompress the backup file.

zlib_decompress

[root@dc1]# time zlib_decompress percona_test.mp_test.zlib percona_test.mp_test.sql
real 0m35.553s
user 0m6.642s
sys 0m7.105s
[root@dc1]# ls -lrth | grep percona_test.mp_test.sql
-rw-r--r--. 1 root  root  9.1G Oct 23 13:49 percona_test.mp_test.sql

zlib took 36 seconds to decompress the backup file.

This is the procedure to compress and decompress backups with mysqlpump. Both algorithms provided roughly 10x compression, and there was not much difference in execution time either, though the gap may widen with a larger dataset.

Nov
05
2020
--

ChaosMesh to Create Chaos in Kubernetes

ChaosMesh to Create Chaos in Kubernetes

In my talk at Percona Live (download the presentation), I spoke about how we can use Percona Kubernetes Operators to deploy our own Database-as-a-Service, based on fully open source components and independent from any particular cloud provider.

Today I want to mention an important tool that I use to test our Operators: ChaosMesh, which is part of the CNCF and recently became GA with version 1.0.

ChaosMesh deploys chaos engineering experiments in Kubernetes, which lets you test how resilient a deployment is against different kinds of failures.

Obviously, this tool is important for Kubernetes database deployments, and I believe it can also be very useful for testing your application deployment, to understand how the application will perform and handle different failures.

ChaosMesh allows you to emulate:

  • Pod Failure: kill pod or error on pod
  • Network Failure: network partitioning, network delays, network corruptions
  • IO Failure: IO delays and IO errors
  • Stress emulation: stress memory and CPU usage
  • Kernel Failure: return errors on system calls
  • Time skew: Emulate time drift on pods

For our Percona Kubernetes Operators, I found Network Failure especially interesting, as clusters that rely on network communication should provide enough resiliency against network issues.

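Before running any experiments, ChaosMesh itself must be installed in the cluster. At the time of the 1.0 release, the project’s quick-start documentation suggested an install script along these lines (verify the exact URL and version against the current ChaosMesh docs before using it):

curl -sSL https://mirrors.chaos-mesh.org/v1.0.0/install.sh | bash
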
Let’s review an example of how we can emulate a network failure on one of the pods. Assume we have cluster2 running:

kubectl get pods              
NAME                                                     READY   STATUS                       RESTARTS   AGE
cluster2-haproxy-0                                       2/2     Running                      1          12d
cluster2-haproxy-1                                       2/2     Running                      2          12d
cluster2-haproxy-2                                       2/2     Running                      2          12d
cluster2-pxc-0                                           1/1     Running                      0          12d
cluster2-pxc-1                                           1/1     Running                      0          12d
cluster2-pxc-2                                           1/1     Running                      0          12d

And we will isolate cluster2-pxc-1 from the rest of the cluster by using the following chaos experiments (one NetworkChaos object per target peer):

apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: pxc-network-delay
spec:
  action: partition # the specific chaos action to inject
  mode: one # the mode to run chaos action; supported modes are one/all/fixed/fixed-percent/random-max-percent
  selector: # pods where to inject chaos actions
    pods:
      pxc: # namespace of the target pods
        - cluster2-pxc-1
  direction: to
  target:
    selector:
      pods:
        pxc: # namespace of the target pods
          - cluster2-pxc-0
    mode: one
  duration: "3s"
  scheduler: # scheduler rules for the running time of the chaos experiments about pods.
    cron: "@every 1000s"
---
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: pxc-network-delay2
spec:
  action: partition # the specific chaos action to inject
  mode: one # the mode to run chaos action; supported modes are one/all/fixed/fixed-percent/random-max-percent
  selector: # pods where to inject chaos actions
    pods:
      pxc: # namespace of the target pods
        - cluster2-pxc-1
  direction: to
  target:
    selector:
      pods:
        pxc: # namespace of the target pods
          - cluster2-pxc-2
    mode: one
  duration: "3s"
  scheduler: # scheduler rules for the running time of the chaos experiments about pods.
    cron: "@every 1000s"

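Assuming both manifests are saved together in one file, say pxc-network-partition.yaml (the filename is my own), the experiments can be started with kubectl:

kubectl apply -f pxc-network-partition.yaml

and, once the testing is done, removed again:

kubectl delete -f pxc-network-partition.yaml
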
This will isolate the pod cluster2-pxc-1 for three seconds. Let’s see what happens with the workload we directed at the cluster2-pxc-0 node (the output is from a sysbench-tpcc benchmark):

1041,56,1232.46,36566.42,16717.16,17383.33,2465.93,90.78,4.99,0.00
1042,56,1305.42,35841.03,16295.74,16934.44,2610.84,71.83,6.01,0.00
1043,56,1084.73,30647.99,14056.49,14422.06,2169.45,68.05,5.99,0.00
1044,56,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
1045,56,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
1046,56,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
1047,56,129.00,4219.97,1926.99,2034.98,258.00,4683.57,0.00,0.00
1048,56,939.41,25800.68,11706.55,12215.31,1878.82,960.30,2.00,0.00
1049,56,1182.09,34390.72,15708.49,16318.05,2364.18,66.84,4.00,0.00

And the log from the cluster2-pxc-1 pod:

2020-11-05T17:36:27.962719Z 0 [Warning] WSREP: Failed to report last committed 133737, -110 (Connection timed out)
2020-11-05T17:36:29.962975Z 0 [Warning] WSREP: Failed to report last committed 133888, -110 (Connection timed out)
2020-11-05T17:36:30.243902Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') turning message relay requesting on, nonlive peers: ssl://192.168.66.9:4567 ssl://192.168.71.201:4567
2020-11-05T17:36:31.161485Z 0 [Note] WSREP: SSL handshake successful, remote endpoint ssl://192.168.66.9:34760 local endpoint ssl://192.168.61.137:4567 cipher: ECDHE-RSA-AES256-GCM-SHA384 compression: none
2020-11-05T17:36:31.162325Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') connection established to 0008bac8 ssl://192.168.66.9:4567
2020-11-05T17:36:31.162694Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') reconnecting to 448e265d (ssl://192.168.71.201:4567), attempt 0
2020-11-05T17:36:31.174019Z 0 [Note] WSREP: SSL handshake successful, remote endpoint ssl://192.168.71.201:4567 local endpoint ssl://192.168.61.137:47252 cipher: ECDHE-RSA-AES256-GCM-SHA384 compression: none
2020-11-05T17:36:31.176521Z 0 [Note] WSREP: SSL handshake successful, remote endpoint ssl://192.168.71.201:56892 local endpoint ssl://192.168.61.137:4567 cipher: ECDHE-RSA-AES256-GCM-SHA384 compression: none
2020-11-05T17:36:31.177086Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') connection established to 448e265d ssl://192.168.71.201:4567
2020-11-05T17:36:31.177289Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') connection established to 448e265d ssl://192.168.71.201:4567
2020-11-05T17:36:34.244970Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') turning message relay requesting off

We can see that the node lost communication for three seconds and then recovered.

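To double-check that the node has rejoined, we can query the Galera status variables from inside the pod. A sketch, assuming the MySQL root credentials are at hand:

kubectl exec -it cluster2-pxc-1 -- mysql -uroot -p -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"

After recovery, wsrep_cluster_size should be back to 3.
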
There is a variable, evs.suspect_timeout, with a default of five seconds, which defines how long the remaining nodes will wait before forming a new quorum without the affected node. So let’s see what happens if we isolate cluster2-pxc-1 for nine seconds (by changing duration: "3s" to duration: "9s" in both manifests above):

369,56,1326.66,38898.39,17789.62,18462.43,2646.33,77.19,5.99,0.00
370,56,1341.82,38812.61,17741.30,18382.65,2688.65,74.46,5.01,0.00
371,56,364.33,11058.76,5070.72,5256.38,731.66,68.05,0.00,0.00
372,56,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
373,56,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
374,56,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
375,56,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
376,56,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
377,56,613.56,17233.62,7862.12,8139.38,1232.12,6360.91,5.00,0.00
378,56,1474.66,43070.96,19684.16,20439.47,2947.33,75.82,4.00,0.00

The workload stalled for about five seconds but continued after that. And we can see from the log what happened with node cluster2-pxc-1. The log is quite verbose, but to summarize what is happening:

  1. After five seconds, the node declared that it had lost connection to the other nodes
  2. It figured out that it was in the minority and could not form a quorum, so it declared itself NON-PRIMARY
  3. After the network was restored, the node reconnected with the cluster
  4. The node caught up with the other nodes using the IST (incremental state transfer) method
  5. The cluster became a three-node cluster again

2020-11-05T17:39:18.282832Z 0 [Warning] WSREP: Failed to report last committed 334386, -110 (Connection timed out)
2020-11-05T17:39:19.283066Z 0 [Warning] WSREP: Failed to report last committed 334516, -110 (Connection timed out)
2020-11-05T17:39:20.768879Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') turning message relay requesting on, nonlive peers: ssl://192.168.66.9:4567 ssl://192.168.71.201:4567 
2020-11-05T17:39:21.769154Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') reconnecting to 0008bac8 (ssl://192.168.66.9:4567), attempt 0
2020-11-05T17:39:21.769544Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') reconnecting to 448e265d (ssl://192.168.71.201:4567), attempt 0
2020-11-05T17:39:24.769604Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') connection to peer 00000000 with addr ssl://192.168.66.9:4567 timed out, no messages seen in PT3S (gmcast.peer_timeout), socket stats: rtt: 0 rttvar: 250000 rto: 2000000 lost: 1 last_data_recv: 2949502432 cwnd: 1 last_queued_since: 2949803921272502 last_delivered_since: 2949803921272502 send_queue_length: 0 send_queue_bytes: 0
2020-11-05T17:39:25.269672Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') connection to peer 00000000 with addr ssl://192.168.71.201:4567 timed out, no messages seen in PT3S (gmcast.peer_timeout), socket stats: rtt: 0 rttvar: 250000 rto: 4000000 lost: 1 last_data_recv: 2949502932 cwnd: 1 last_queued_since: 2949804421325209 last_delivered_since: 2949804421325209 send_queue_length: 0 send_queue_bytes: 0
2020-11-05T17:39:25.879338Z 0 [Note] WSREP: declaring node with index 0 suspected, timeout PT5S (evs.suspect_timeout)
2020-11-05T17:39:25.879373Z 0 [Note] WSREP: declaring node with index 2 suspected, timeout PT5S (evs.suspect_timeout)
2020-11-05T17:39:25.879399Z 0 [Note] WSREP: evs::proto(11fdd640, OPERATIONAL, view_id(REG,0008bac8,3)) suspecting node: 0008bac8
2020-11-05T17:39:25.879414Z 0 [Note] WSREP: evs::proto(11fdd640, OPERATIONAL, view_id(REG,0008bac8,3)) suspected node without join message, declaring inactive
2020-11-05T17:39:25.879431Z 0 [Note] WSREP: evs::proto(11fdd640, OPERATIONAL, view_id(REG,0008bac8,3)) suspecting node: 448e265d
2020-11-05T17:39:25.879445Z 0 [Note] WSREP: evs::proto(11fdd640, OPERATIONAL, view_id(REG,0008bac8,3)) suspected node without join message, declaring inactive
2020-11-05T17:39:26.379920Z 0 [Note] WSREP: declaring node with index 0 inactive (evs.inactive_timeout) 
2020-11-05T17:39:26.379956Z 0 [Note] WSREP: declaring node with index 2 inactive (evs.inactive_timeout) 
2020-11-05T17:39:26.791118Z 0 [Note] WSREP: SSL handshake successful, remote endpoint ssl://192.168.66.9:4567 local endpoint ssl://192.168.61.137:51672 cipher: ECDHE-RSA-AES256-GCM-SHA384 compression: none
2020-11-05T17:39:26.791958Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') connection established to 0008bac8 ssl://192.168.66.9:4567
2020-11-05T17:39:26.879766Z 0 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(NON_PRIM,0008bac8,3)
memb {
        11fdd640,0
        }
joined {
        }
left {
        }
partitioned {
        0008bac8,0
        448e265d,0
        }
)
2020-11-05T17:39:26.879962Z 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2020-11-05T17:39:26.879975Z 0 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(NON_PRIM,11fdd640,4)
memb {
        11fdd640,0
        }
joined {
        }
left {
        }
partitioned {
        0008bac8,0
        448e265d,0
        }
)
2020-11-05T17:39:26.880029Z 0 [Note] WSREP: Flow-control interval: [100, 100]
2020-11-05T17:39:26.880066Z 0 [Note] WSREP: Received NON-PRIMARY.
2020-11-05T17:39:26.880076Z 0 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 334573)
2020-11-05T17:39:26.880095Z 0 [Warning] WSREP: FLOW message from member 139968689209344 in non-primary configuration. Ignored.
2020-11-05T17:39:26.880121Z 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2020-11-05T17:39:26.880134Z 0 [Note] WSREP: Flow-control interval: [100, 100]
2020-11-05T17:39:26.880140Z 0 [Note] WSREP: Received NON-PRIMARY.
2020-11-05T17:39:26.880255Z 2 [Note] WSREP: New cluster view: global state: f2d3cb29-1578-11eb-857b-624f681f446d:334573, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3
2020-11-05T17:39:26.880287Z 2 [Note] WSREP: Setting wsrep_ready to false
2020-11-05T17:39:26.880310Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2020-11-05T17:39:26.880428Z 2 [Note] WSREP: New cluster view: global state: f2d3cb29-1578-11eb-857b-624f681f446d:334573, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3
2020-11-05T17:39:26.880438Z 2 [Note] WSREP: Setting wsrep_ready to false
2020-11-05T17:39:26.880445Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2020-11-05T17:39:27.193945Z 0 [Note] WSREP: SSL handshake successful, remote endpoint ssl://192.168.71.201:57892 local endpoint ssl://192.168.61.137:4567 cipher: ECDHE-RSA-AES256-GCM-SHA384 compression: none
2020-11-05T17:39:27.194926Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') connection established to 448e265d ssl://192.168.71.201:4567
2020-11-05T17:39:27.305150Z 0 [Note] WSREP: SSL handshake successful, remote endpoint ssl://192.168.71.201:4567 local endpoint ssl://192.168.61.137:48990 cipher: ECDHE-RSA-AES256-GCM-SHA384 compression: none
2020-11-05T17:39:27.306328Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') connection established to 448e265d ssl://192.168.71.201:4567
2020-11-05T17:39:27.882743Z 0 [Note] WSREP: declaring 0008bac8 at ssl://192.168.66.9:4567 stable
2020-11-05T17:39:27.882774Z 0 [Note] WSREP: declaring 448e265d at ssl://192.168.71.201:4567 stable
2020-11-05T17:39:27.883565Z 0 [Note] WSREP: Node 0008bac8 state primary
2020-11-05T17:39:27.884475Z 0 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(PRIM,0008bac8,5)
memb {
        0008bac8,0
        11fdd640,0
        448e265d,0
        }
joined {
        }
left {
        }
partitioned {
        }
)
2020-11-05T17:39:27.884499Z 0 [Note] WSREP: Save the discovered primary-component to disk
2020-11-05T17:39:27.885430Z 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 3
2020-11-05T17:39:27.885465Z 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2020-11-05T17:39:27.886654Z 0 [Note] WSREP: STATE EXCHANGE: sent state msg: da55f2d8-1f8d-11eb-80cf-075e56823087
2020-11-05T17:39:27.887174Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: da55f2d8-1f8d-11eb-80cf-075e56823087 from 0 (cluster2-pxc-0)
2020-11-05T17:39:27.887194Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: da55f2d8-1f8d-11eb-80cf-075e56823087 from 1 (cluster2-pxc-1)
2020-11-05T17:39:27.887208Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: da55f2d8-1f8d-11eb-80cf-075e56823087 from 2 (cluster2-pxc-2)
2020-11-05T17:39:27.887225Z 0 [Note] WSREP: Quorum results:
        version    = 6,
        component  = PRIMARY,
        conf_id    = 4,
        members    = 2/3 (primary/total),
        act_id     = 338632,
        last_appl. = 334327,
        protocols  = 0/9/3 (gcs/repl/appl),
        group UUID = f2d3cb29-1578-11eb-857b-624f681f446d
2020-11-05T17:39:27.887244Z 0 [Note] WSREP: Flow-control interval: [173, 173]
2020-11-05T17:39:27.887252Z 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 338632)
2020-11-05T17:39:27.887439Z 2 [Note] WSREP: State transfer required: 
        Group state: f2d3cb29-1578-11eb-857b-624f681f446d:338632
        Local state: f2d3cb29-1578-11eb-857b-624f681f446d:334573
2020-11-05T17:39:27.887476Z 2 [Note] WSREP: REPL Protocols: 9 (4, 2)
2020-11-05T17:39:27.887486Z 2 [Note] WSREP: REPL Protocols: 9 (4, 2)
2020-11-05T17:39:27.887504Z 2 [Note] WSREP: New cluster view: global state: f2d3cb29-1578-11eb-857b-624f681f446d:338632, view# 5: Primary, number of nodes: 3, my index: 1, protocol version 3
2020-11-05T17:39:27.887516Z 2 [Note] WSREP: Setting wsrep_ready to true
2020-11-05T17:39:27.887524Z 2 [Warning] WSREP: Gap in state sequence. Need state transfer.
2020-11-05T17:39:27.887530Z 2 [Note] WSREP: Setting wsrep_ready to false
2020-11-05T17:39:27.887540Z 2 [Note] WSREP: You have configured 'xtrabackup-v2' state snapshot transfer method which cannot be performed on a running server. Wsrep provider won't be able to fall back to it if other means of state transfer are unavailable. In that case you will need to restart the server.
2020-11-05T17:39:27.887556Z 2 [Note] WSREP: Auto Increment Offset/Increment re-align with cluster membership change (Offset: 2 -> 2) (Increment: 3 -> 3)
2020-11-05T17:39:27.887563Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2020-11-05T17:39:27.907686Z 2 [Note] WSREP: Assign initial position for certification: 338632, protocol version: 4
2020-11-05T17:39:27.908853Z 0 [Note] WSREP: Service thread queue flushed.
2020-11-05T17:39:27.909023Z 2 [Note] WSREP: Check if state gap can be serviced using IST
2020-11-05T17:39:27.909165Z 2 [Note] WSREP: IST receiver addr using ssl://192.168.61.137:4568
2020-11-05T17:39:27.909236Z 2 [Note] WSREP: IST receiver using ssl
2020-11-05T17:39:27.910176Z 2 [Note] WSREP: Prepared IST receiver, listening at: ssl://192.168.61.137:4568
2020-11-05T17:39:27.910195Z 2 [Note] WSREP: State gap can be likely serviced using IST. SST request though present would be void.
2020-11-05T17:39:27.922651Z 0 [Note] WSREP: Member 1.0 (cluster2-pxc-1) requested state transfer from '*any*'. Selected 0.0 (cluster2-pxc-0)(SYNCED) as donor.
2020-11-05T17:39:27.922679Z 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 338687)
2020-11-05T17:39:27.922744Z 2 [Note] WSREP: Requesting state transfer: success, donor: 0
2020-11-05T17:39:27.922791Z 2 [Note] WSREP: GCache history reset: f2d3cb29-1578-11eb-857b-624f681f446d:334573 -> f2d3cb29-1578-11eb-857b-624f681f446d:338632
2020-11-05T17:39:27.956992Z 2 [Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): discarded 133734664 bytes
2020-11-05T17:39:27.957016Z 2 [Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): found 1/56 locked buffers
2020-11-05T17:39:27.958791Z 2 [Note] WSREP: Receiving IST: 4059 writesets, seqnos 334573-338632
2020-11-05T17:39:27.958908Z 0 [Note] WSREP: 0.0 (cluster2-pxc-0): State transfer to 1.0 (cluster2-pxc-1) complete.
2020-11-05T17:39:27.958929Z 0 [Note] WSREP: Member 0.0 (cluster2-pxc-0) synced with group.
2020-11-05T17:39:27.958946Z 0 [Note] WSREP: Receiving IST...  0.0% (   0/4059 events) complete.
2020-11-05T17:39:30.770542Z 0 [Note] WSREP: (11fdd640, 'ssl://0.0.0.0:4567') turning message relay requesting off
2020-11-05T17:39:31.851914Z 0 [Note] WSREP: Receiving IST...100.0% (4059/4059 events) complete.
2020-11-05T17:39:31.853178Z 2 [Note] WSREP: IST received: f2d3cb29-1578-11eb-857b-624f681f446d:338632
2020-11-05T17:39:31.854358Z 0 [Note] WSREP: 1.0 (cluster2-pxc-1): State transfer from 0.0 (cluster2-pxc-0) complete.
2020-11-05T17:39:31.854396Z 0 [Note] WSREP: SST leaving flow control
2020-11-05T17:39:31.854406Z 0 [Note] WSREP: Shifting JOINER -> JOINED (TO: 344195)
2020-11-05T17:40:17.927370Z 0 [Warning] WSREP: Trying to continue unpaused monitor
2020-11-05T17:40:26.972878Z 0 [Note] WSREP: Member 1.0 (cluster2-pxc-1) synced with group.
2020-11-05T17:40:26.972913Z 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 407602)
2020-11-05T17:40:27.062892Z 4 [Note] WSREP: Synchronized with group, ready for connections
2020-11-05T17:40:27.062911Z 4 [Note] WSREP: Setting wsrep_ready to true
2020-11-05T17:40:27.062922Z 4 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

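If you need the cluster to tolerate longer partitions before declaring a node inactive, evs.suspect_timeout can be raised through wsrep_provider_options. This is a sketch only; PT15S is an arbitrary example value, and in a Kubernetes deployment this would normally be set through the Operator’s configuration rather than a hand-edited my.cnf:

[mysqld]
# Galera provider options take ISO 8601 durations; PT15S = 15 seconds
wsrep_provider_options="evs.suspect_timeout=PT15S"
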
Conclusion

ChaosMesh is a great tool for testing the resiliency of a deployment, and in my opinion, it can be useful not only for database clusters but also for testing general applications, to make sure an application is able to sustain different failure scenarios.
