Feb
13
2019
--

plprofiler – Getting a Handy Tool for Profiling Your PL/pgSQL Code

plprofiler postgres performance tool

PostgreSQL is emerging as the standard destination for database migrations from proprietary databases. As a consequence, there is an increase in demand for database side code migration and associated performance troubleshooting. One might be able to trace the latency to a plsql function, but explaining what happens within a function could be a difficult question. Things get messier when you know the function call is taking time, but within that function there are calls to other functions as part of its body. It is a very challenging question to identify which line inside a function—or block of code—is causing the slowness. In order to answer such questions, we need to know how much time an execution spends on each line or block of code. The plprofiler project provides great tooling and extensions to address such questions.

Demonstration of plprofiler using an example

The plprofiler source contains a sample for testing plprofiler. This sample serves two purposes. It can be used for testing the configuration of plprofiler, and it is great place to see how to do the profiling of a nested function call. Files related to this can be located inside the “examples” directory. Don’t worry—I’ll be running through the installation of plprofiler later in this article.

$ cd examples/

The example expects you to create a database with name “pgbench_plprofiler”

postgres=# CREATE DATABASE pgbench_plprofiler;
CREATE DATABASE

The project provides a shell script along with a source tree to test plprofiler functionality. So testing is just a matter of running the shell script.

$ ./prepdb.sh
dropping old tables...
....

Running session level profiling

This profiling uses session level local-data. By default the plprofiler extension collects runtime data in per-backend hashtables (in-memory). This data is only accessible in the current session, and is lost when the session ends or the hash tables are explicitly reset. plprofiler’s run command will execute the plsql code and capture the profile information.

This is illustrated by below example,

$ plprofiler run --command "SELECT tpcb(1, 2, 3, -42)" -d pgbench_plprofiler --output tpcb-test1.html
SELECT tpcb(1, 2, 3, -42)
-- row1:
tpcb: -42
----
(1 rows)
SELECT 1 (0.073 seconds)

What happens during above plprofiler command run can be summarised in 3 steps:

  1. A function call with four parameters “SELECT tpcb(1, 2, 3, -42)” is presented to the plprofiler tool for execution.
  2. plprofiler establishes a connection to PostgreSQL and executes the function
  3. The tool collects the profile information captured in the local-data hash tables and generates an HTML report “tpcb-test1.html”

Global profiling

As mentioned previously, this method is useful if we want to profile the function executions in other sessions or on the entire database. During global profiling, data is captured into a shared-data hash table which is accessible for all sessions in the database. The plprofiler extension periodically copies the local-data from the individual sessions into shared hash tables, to make the statistics available to other sessions. See the

plprofiler monitor

  command, below, for details. This data still relies on the local database system catalog to resolve Oid values into object definitions.

In this example, the plprofiler tool will be running in monitor mode for a duration of 60 seconds. Every 10 seconds, the tool copies data from local-data to shared-data.

$ plprofiler monitor --interval=10 --duration=60 -d pgbench_plprofiler
monitoring for 60 seconds ...
done.

For testing purposes you can start executing a few functions at the same time.

Once the data is captured into shared-data, we can generate a report. For example:

$ plprofiler report --from-shared --title=MultipgMax --output=MultipgMax.html -d pgbench_plprofiler

The data in shared-data will be retained until it’s explicitly cleared using the

plprofiler reset

  command

$ plprofiler reset

If there is no profile data present in the shared hash tables, execution of the report will result in error message.

$ plprofiler report --from-shared --title=MultipgMax --output=MultipgMax.html
Traceback (most recent call last):
File "/usr/bin/plprofiler", line 11, in <module>
load_entry_point('plprofiler==4.dev0', 'console_scripts', 'plprofiler')()
File "/usr/lib/python2.7/site-packages/plprofiler-4.dev0-py2.7.egg/plprofiler/plprofiler_tool.py", line 67, in main
return report_command(sys.argv[2:])
File "/usr/lib/python2.7/site-packages/plprofiler-4.dev0-py2.7.egg/plprofiler/plprofiler_tool.py", line 493, in report_command
report_data = plp.get_shared_report_data(opt_name, opt_top, args)
File "/usr/lib/python2.7/site-packages/plprofiler-4.dev0-py2.7.egg/plprofiler/plprofiler.py", line 555, in get_shared_report_data
raise Exception("No profiling data found")
Exception: No profiling data found

Report on profile information

The HTML report generated by plprofiler is a self-contained HTML document and it gives detailed information about the PL/pgSQL function execution. There will be a clickable FlameGraph at the top of the report with details about functions in the profile. The plprofiler FlameGraph is based on the actual Wall-Clock time spent in the PL/pgSQL functions. By default, plprofiler provides details on the top ten functions, based on their self_time (total_time – children_time).

This section of the report is followed by tabular representation of function calls. For example:

This gives a lot of detailed information such as execution counts and time spend against each line of code.

Binary Packages

Binary distributions of plprofiler are not common. However the BigSQL project provides plprofiler packages as an easy to use bundle. Such ready-to-use packages are one of the reasons for BigSQL to remain as one of the most developer friendly PostgreSQL distributions. The first screen of Package manager installation of BigSQL provided me with the information I am looking for:


Appears that there was a recent release of BigSQL packages and plprofiler is an updated package within that.

Installation and configuration is made simple:

$ ./pgc install plprofiler-pg11
['plprofiler-pg11']
File is already downloaded.
Unpacking plprofiler-pg11-3.3-1-linux64.tar.bz2
install-plprofiler-pg11...
Updating postgresql.conf file:
old: #shared_preload_libraries = '' # (change requires restart)
new: shared_preload_libraries = 'plprofiler'

As we can see, even PostgreSQL parameters are updated to have plprofiler as a

shared_preload_library

 .  If need to use plprofiler for investigating code, these binary packages from the BigSQL project are my first preference because everything is ready to use. Definitely, this is developer-friendly.

Creation of extension and configuring the plprofiler tool

At the database level, we should create the plprofiler extension to profile the function execution. This step needs to be performed in both cases, whether we want global profiling where share_preload_libraries are set, or at session level where that is not required

postgres=# create extension plprofiler;
CREATE EXTENSION

plprofiler is not just an extension, but comes with tooling to invoke profiling or to generate reports. These scripts are primarily coded in Python and uses psycopg2 to connect to PostgreSQL. The python code is located inside the “python-plprofiler” directory of the source tree. There are a few perl dependencies too which will be resolved as part of installation

sudo yum install python-setuptools.noarch
sudo yum install python-psycopg2
cd python-plprofiler/
sudo python ./setup.py install

Building from source

If you already have a PostgreSQL instance running using binaries from PGDG repository OR you want to wet your hands by building everything from source, then installation needs a different approach. I have PostgreSQL 11 already running on the system. The first step is to get the corresponding development packages which have all the header files and libraries to support a build from source. Obviously this is the thorough way of getting plprofiler working.

$ sudo yum install postgresql11-devel

We need to have build tools, and since the core of plprofiler is C code, we have to install a C compiler and make utility.

$ sudo yum install gcc make

Preferably, we should build plprofiler using the same OS user that runs PostgreSQL server, which is “postgres” in most environments. Please make sure that all PostgreSQL binaries are available in the path and that you are able to execute the pg_config, which lists out build related information:

$ pg_config
BINDIR = /usr/pgsql-11/bin
..
INCLUDEDIR = /usr/pgsql-11/include
PKGINCLUDEDIR = /usr/pgsql-11/include
INCLUDEDIR-SERVER = /usr/pgsql-11/include/server
LIBDIR = /usr/pgsql-11/lib
PKGLIBDIR = /usr/pgsql-11/lib
LOCALEDIR = /usr/pgsql-11/share/locale
MANDIR = /usr/pgsql-11/share/man
SHAREDIR = /usr/pgsql-11/share
SYSCONFDIR = /etc/sysconfig/pgsql
PGXS = /usr/pgsql-11/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--enable-rpath' '--prefix=/usr/pgsql-11' '--includedir=/usr/pgsql-11/include' '--mandir=/usr/pgsql-11/share/man' '--datadir=/usr/pgsql-11/share' '--with-icu' 'CLANG=/opt/rh/llvm-toolset-7/root/usr/bin/clang' 'LLVM_CONFIG=/usr/lib64/llvm5.0/bin/llvm-config' '--with-llvm' '--with-perl' '--with-python' '--with-tcl' '--with-tclconfig=/usr/lib64' '--with-openssl' '--with-pam' '--with-gssapi' '--with-includes=/usr/include' '--with-libraries=/usr/lib64' '--enable-nls' '--enable-dtrace' '--with-uuid=e2fs' '--with-libxml' '--with-libxslt' '--with-ldap' '--with-selinux' '--with-systemd' '--with-system-tzdata=/usr/share/zoneinfo' '--sysconfdir=/etc/sysconfig/pgsql' '--docdir=/usr/pgsql-11/doc' '--htmldir=/usr/pgsql-11/doc/html' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic' 'LDFLAGS=-Wl,--as-needed' 'PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig'
CC = gcc
...
VERSION = PostgreSQL 11.1

Now we’re ready to get the source code and build it. You should be able to checkout the git repository for plprofiler.

$ git clone https://github.com/pgcentral/plprofiler.git
Cloning into 'plprofiler'...
...

Building against PostgreSQL 11 binaries from PGDG can be a bit complicated because of th JIT feature. Configuration flag

--with-llvm

  will be enabled. So we may have to have LLVM present in the system as detailed in my previous blog about JIT in PostgreSQL11

Once we’re ready, we can move to the plprofiler directory and build it:

$ cd plprofiler
$ make USE_PGXS=1
--- Output ----
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fPIC -I. -I./ -I/usr/pgsql-11/include/server -I/usr/pgsql-11/include/internal -D_GNU_SOURCE -I/usr/include/libxml2 -I/usr/include -c -o plprofiler.o plprofiler.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fPIC -shared -o plprofiler.so plprofiler.o -L/usr/pgsql-11/lib -Wl,--as-needed -L/usr/lib64/llvm5.0/lib -L/usr/lib64 -Wl,--as-needed -Wl,-rpath,'/usr/pgsql-11/lib',--enable-new-dtags
/opt/rh/llvm-toolset-7/root/usr/bin/clang -Wno-ignored-attributes -fno-strict-aliasing -fwrapv -O2 -I. -I./ -I/usr/pgsql-11/include/server -I/usr/pgsql-11/include/internal -D_GNU_SOURCE -I/usr/include/libxml2 -I/usr/include -flto=thin -emit-llvm -c -o plprofiler.bc plprofiler.c

Now we should be able to install this extension:

$ sudo make USE_PGXS=1 install
--- Output ----
/usr/bin/mkdir -p '/usr/pgsql-11/lib'
/usr/bin/mkdir -p '/usr/pgsql-11/share/extension'
/usr/bin/mkdir -p '/usr/pgsql-11/share/extension'
/usr/bin/install -c -m 755 plprofiler.so '/usr/pgsql-11/lib/plprofiler.so'
/usr/bin/install -c -m 644 .//plprofiler.control '/usr/pgsql-11/share/extension/'
/usr/bin/install -c -m 644 .//plprofiler--1.0--2.0.sql .//plprofiler--2.0--3.0.sql .//plprofiler--3.0.sql '/usr/pgsql-11/share/extension/'
/usr/bin/mkdir -p '/usr/pgsql-11/lib/bitcode/plprofiler'
/usr/bin/mkdir -p '/usr/pgsql-11/lib/bitcode'/plprofiler/
/usr/bin/install -c -m 644 plprofiler.bc '/usr/pgsql-11/lib/bitcode'/plprofiler/./

The above command expects all build tools to be in the proper path even with sudo.

Profiling external sessions

To profile a function executed by another session, or all other sessions, we should load the libraries at global level. In production environments, that will be the case. This can be done by adding the extension library to the

shared_preload_libraries

  specification. You won’t need this if you only want to profile functions executed within your session. Session level profiling is generally possible only in Dev/Test environments.

To enable global profiling, verify the current value of

shared_preload_libraries

  and add plprofiler to the list.

postgres=# show shared_preload_libraries ;
shared_preload_libraries
--------------------------
(1 row)
postgres=# alter system set shared_preload_libraries = 'plprofiler';
ALTER SYSTEM
postgres=#

This change requires us to restart the PostgreSQL server

$ sudo systemctl restart postgresql-11

After the restart, it’s a good idea to verify the parameter change

postgres=# show shared_preload_libraries ;
shared_preload_libraries
--------------------------
plprofiler
(1 row)

From this point onwards, the steps are same as those for the binary package setup discussed above.

Summary

plprofiler is a wonderful tool for developers. I keep seeing many users who are in real need of it. Hopefully this blog post will help those who never tried it.

Jan
28
2019
--

Monitor and Optimize Slow Queries with PMM and EverSQL – Part 2

percona_pmm_eversql

EverSQL is a platform that intelligently tunes your SQL queries by providing query optimization recommendations, and feedback on missing indexes. This is the second post of our EverSQL series, if you missed our introductory post take a look there first and then come back to this article.

We’ll use the Stackoverflow data set again as we did in our first post.

Diving into query optimization

We’ll grab the worst performing query in the list from PMM and optimize it. This query builds a list of the top 50 most recent posts which have a score greater than two, and involves joining two large tables – posts and comments. The original runtime of that query is above 20 minutes and causes high load on the server while running.

worst-query-in-PMM

Assuming you have EverSQL’s chrome extension installed, you’ll see a new button in the PMM Query Analytics page, allowing you to send the query and schema structure directly to EverSQL, to retrieve indexing and query optimization recommendations.

eversql recommendations

 

eversql-dashboard1

After implementing EverSQL’s recommendations, the query’s execution duration significantly improved:

improved-query-response-time

Optimization Internals

So what was the actual optimization in this specific case? And why did it work so well? Let’s look at the original query:

SELECT
   p.title
FROM
   so.posts p
       INNER JOIN
   so.comments c ON p.id = c.postid
WHERE
c.score > 2
GROUP BY p.id
ORDER BY p.creationdate DESC
LIMIT 100;

The tables’ structure:

CREATE TABLE `posts` (
  `Id` int(11) NOT NULL,
  `CreationDate` datetime NOT NULL,
  ...
  PRIMARY KEY (`Id`),
  KEY `posts_idx_creationdate` (`CreationDate`),
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `comments` (
  `Id` int(11) NOT NULL,
  `CreationDate` datetime NOT NULL,
  `PostId` int(11) NOT NULL,
  `Score` int(11) DEFAULT NULL,
  ....
  PRIMARY KEY (`Id`),
  KEY `comments_idx_postid` (`PostId`),
  KEY `comments_idx_postid_score` (`PostId`,`Score`),
  KEY `comments_idx_score` (`Score`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

This query will return the post title of the latest 100 stackoverflow posts, which had at least one popular comment (with a score higher than two). The posts table contains 39,646,923 records, while the comments table contains 64,510,258 records.

This is the execution plan MySQL (v5.7.20) chose:

original-execution-plan

One of the challenges with this query is that the GROUP BY and ORDER BY clauses contain different fields, which prevent MySQL from using an index for the ORDER BY. As MySQL’s documentation states:

“In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it may still use indexes to find the rows that match the WHERE clause. Examples:  … The query has different ORDER BY and GROUP BY expressions.”.

Now let’s look into the optimized query:

SELECT
   p.title
FROM
   so.posts p
WHERE
   EXISTS( SELECT
           1
       FROM
           so.comments c
       WHERE
           p.id = c.postid AND c.score > 2)
ORDER BY p.creationdate DESC
LIMIT 100;

Since the comments table is joined in this query only to check for existence of matching records in the posts table, we can use an EXISTS subquery instead. This will allow us to avoid inflating the results (by using JOIN) and then deflating them (by using GROUP BY), which are costly operations.

Now that the GROUP BY is redundant and removed, the database can optionally choose to use an index for the ORDER BY clause.

The new execution plan MySQL chooses is:

As mentioned above, this transformation reduced the query execution duration from ~20 minutes to 370ms. We hope you enjoyed this post, please let us know your experiences using the integration between PMM Query Analytics and EverSQL!

As mentioned above, this transformation reduced the query execution duration from ~20 minutes to 370ms.

We hope you enjoyed this post, please let us know your experiences using the integration between PMM Query Analytics and EverSQL!

Co-Author: Tomer Shay

Tomer Shay, EverSQL

 

Tomer Shay is the Founder of EverSQL. He loves being where the challenge is. In the last 12 years, he had the privilege to code a lot and lead teams of developers, while focusing on databases and performance. He enjoys using technology to bring ideas into reality, help people and see them smile.

Jan
18
2019
--

Replication Manager Works with MariaDB

Complex multi-cluster replication topology

Some time ago I wrote a script to manage asynchronous replication links between Percona XtraDB clusters. The original post can be found here. The script worked well with Percona XtraDB Cluster but it wasn’t working well with MariaDB®.  Finally, the replication manager works with MariaDB.

First, let’s review the purpose of the script. Managing replication links between Galera based clusters is a tedious task. There are many potential slaves and many potential masters. Furthermore, each replication link must have only a single slave. Just try to imagine how you would maintain the following replication topology:

A complex replication topology

The above topology consists of five clusters and four master-to-master links. The replication manager can easily handle this topology. Of course, it is not a fix to the limitations of asynchronous replication. You must make sure your writes are replication safe. You could want, for example, a global user list or to centralize some access logs. Just to refresh memories, here are some of the script highlights:

  • Uses the Galera cluster for Quorum
  • Configurable, arbitrarily complex topologies
  • The script stores the topology in database tables
  • Elects slaves automatically
  • Monitors replication links
  • Slaves can connect to a list of potential masters

As you probably know, MariaDB has a different GTID implementation and syntax for the multi-source replication commands. I took some time to investigate why the script was failing and fixed it. Now, provided you are using MariaDB 10.1.4+ with GTIDs, the replication manager works fine.

You can found the script here. Be aware that although I work for Percona, the script is not officially supported by Percona.

Aug
17
2018
--

Replication from Percona Server for MySQL to PostgreSQL using pg_chameleon

postgres mysql replication using pg_chameleon

postgres mysql replication using pg_chameleonReplication is one of the well-known features that allows us to build an identical copy of a database. It is supported in almost every RDBMS. The advantages of replication may be huge, especially HA (High Availability) and load balancing. But what if we need to build replication between 2 heterogeneous databases like MySQL and PostgreSQL? Can we continuously replicate changes from a MySQL database to a PostgreSQL database? The answer to this question is pg_chameleon.

For replicating continuous changes, pg_chameleon uses the mysql-replication library to pull the row images from MySQL, which are transformed into a jsonb object. A pl/pgsql function in postgres decodes the jsonb and replays the changes into the postgres database. In order to setup this type of replication, your mysql binlog_format must be “ROW”.

A few points you should know before setting up this tool :

  1. Tables that need to be replicated must have a primary key.
  2. Works for PostgreSQL versions > 9.5 and MySQL > 5.5
  3. binlog_format must be ROW in order to setup this replication.
  4. Python version must be > 3.3

When you initialize the replication, pg_chameleon pulls the data from MySQL using the CSV format in slices, to prevent memory overload. This data is flushed to postgres using the COPY command. If COPY fails, it tries INSERT, which may be slow. If INSERT fails, then the row is discarded.

To replicate changes from mysql, pg_chameleon mimics the behavior of a mysql slave. It creates the schema in postgres, performs the initial data load, connects to MySQL replication protocol, stores the row images into a table in postgres. Now, the respective functions in postgres decode those rows and apply the changes. This is similar to storing relay logs in postgres tables and applying them to a postgres schema. You do not have to create a postgres schema using any DDLs. This tool automatically does that for the tables configured for replication. If you need to specifically convert any types, you can specify this in the configuration file.

The following is just an exercise that you can experiment with and implement if it completely satisfies your requirement. We performed these tests on CentOS Linux release 7.4.

Prepare the environment

Set up Percona Server for MySQL

InstallMySQL 5.7 and add appropriate parameters for replication.

In this exercise, I have installed Percona Server for MySQL 5.7 using YUM repo.

yum install http://www.percona.com/downloads/percona-release/redhat/0.1-6/percona-release-0.1-6.noarch.rpm
yum install Percona-Server-server-57
echo "mysql ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
usermod -s /bin/bash mysql
sudo su - mysql

pg_chameleon requires the following the parameters to be set in your my.cnf file (parameter file of your MySQL server). You may add the following parameters to /etc/my.cnf

binlog_format= ROW
binlog_row_image=FULL
log-bin = mysql-bin
server-id = 1

Now start your MySQL server after adding the above parameters to your my.cnf file.

$ service mysql start

Fetch the temporary root password from mysqld.log, and reset the root password using mysqladmin

$ grep "temporary" /var/log/mysqld.log
$ mysqladmin -u root -p password 'Secret123!'

Now, connect to your MySQL instance and create sample schema/tables. I have also created an emp table for validation.

$ wget http://downloads.mysql.com/docs/sakila-db.tar.gz
$ tar -xzf sakila-db.tar.gz
$ mysql -uroot -pSecret123! < sakila-db/sakila-schema.sql
$ mysql -uroot -pSecret123! < sakila-db/sakila-data.sql
$ mysql -uroot -pSecret123! sakila -e "create table emp (id int PRIMARY KEY, first_name varchar(20), last_name varchar(20))"

Create a user for configuring replication using pg_chameleon and give appropriate privileges to the user using the following steps.

$ mysql -uroot -p
create user 'usr_replica'@'%' identified by 'Secret123!';
GRANT ALL ON sakila.* TO 'usr_replica'@'%';
GRANT RELOAD, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'usr_replica'@'%';
FLUSH PRIVILEGES;

While creating the user in your mysql server (‘usr_replica’@’%’), you may wish to replace % with the appropriate IP or hostname of the server on which pg_chameleon is running.

Set up PostgreSQL

Install PostgreSQL and start the database instance.

You may use the following steps to install PostgreSQL 10.x

yum install https://yum.postgresql.org/10/redhat/rhel-7.4-x86_64/pgdg-centos10-10-2.noarch.rpm
yum install postgresql10*
su - postgres
$/usr/pgsql-10/bin/initdb
$ /usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/data start

As seen in the following logs, create a user in PostgreSQL using which pg_chameleon can write changed data to PostgreSQL. Also create the target database.

postgres=# CREATE USER usr_replica WITH ENCRYPTED PASSWORD 'secret';
CREATE ROLE
postgres=# CREATE DATABASE db_replica WITH OWNER usr_replica;
CREATE DATABASE

Steps to install and setup replication using pg_chameleon

Step 1: In this exercise, I installed Python 3.6 and pg_chameleon 2.0.8 using the following steps. You may skip the python install steps if you already have the desired python release. We can create a virtual environment if the OS does not include Python 3.x by default.

yum install gcc openssl-devel bzip2-devel wget
cd /usr/src
wget https://www.python.org/ftp/python/3.6.6/Python-3.6.6.tgz
tar xzf Python-3.6.6.tgz
cd Python-3.6.6
./configure --enable-optimizations
make altinstall
python3.6 -m venv venv
source venv/bin/activate
pip install pip --upgrade
pip install pg_chameleon

Step 2: This tool requires a configuration file to store the source/target server details, and a directory to store the logs. Use the following command to let pg_chameleon create the configuration file template and the respective directories for you.

$ chameleon set_configuration_files

The above command would produce the following output, which shows that it created some directories and a file in the location where you ran the command.

creating directory /var/lib/pgsql/.pg_chameleon
creating directory /var/lib/pgsql/.pg_chameleon/configuration/
creating directory /var/lib/pgsql/.pg_chameleon/logs/
creating directory /var/lib/pgsql/.pg_chameleon/pid/
copying configuration example in /var/lib/pgsql/.pg_chameleon/configuration//config-example.yml

Copy the sample configuration file to another file, lets say, default.yml

$ cd .pg_chameleon/configuration/
$ cp config-example.yml default.yml

Here is how my default.yml file looks after adding all the required parameters. In this file, we can optionally specify the data type conversions, tables to skipped from replication and the DML events those need to skipped for selected list of tables.

---
#global settings
pid_dir: '~/.pg_chameleon/pid/'
log_dir: '~/.pg_chameleon/logs/'
log_dest: file
log_level: info
log_days_keep: 10
rollbar_key: ''
rollbar_env: ''
# type_override allows the user to override the default type conversion into a different one.
type_override:
  "tinyint(1)":
    override_to: boolean
    override_tables:
      - "*"
#postgres  destination connection
pg_conn:
  host: "localhost"
  port: "5432"
  user: "usr_replica"
  password: "secret"
  database: "db_replica"
  charset: "utf8"
sources:
  mysql:
    db_conn:
      host: "localhost"
      port: "3306"
      user: "usr_replica"
      password: "Secret123!"
      charset: 'utf8'
      connect_timeout: 10
    schema_mappings:
      sakila: sch_sakila
    limit_tables:
#      - delphis_mediterranea.foo
    skip_tables:
#      - delphis_mediterranea.bar
    grant_select_to:
      - usr_readonly
    lock_timeout: "120s"
    my_server_id: 100
    replica_batch_size: 10000
    replay_max_rows: 10000
    batch_retention: '1 day'
    copy_max_memory: "300M"
    copy_mode: 'file'
    out_dir: /tmp
    sleep_loop: 1
    on_error_replay: continue
    on_error_read: continue
    auto_maintenance: "disabled"
    gtid_enable: No
    type: mysql
    skip_events:
      insert:
#        - delphis_mediterranea.foo #skips inserts on the table delphis_mediterranea.foo
      delete:
#        - delphis_mediterranea #skips deletes on schema delphis_mediterranea
      update:

Step 3: Initialize the replica using this command:

$ chameleon create_replica_schema --debug

The above command creates a schema and nine tables in the PostgreSQL database that you specified in the .pg_chameleon/configuration/default.yml file. These tables are needed to manage replication from source to destination. The same can be observed in the following log.

db_replica=# \dn
List of schemas
Name | Owner
---------------+-------------
public | postgres
sch_chameleon | target_user
(2 rows)
db_replica=# \dt sch_chameleon.t_*
List of relations
Schema | Name | Type | Owner
---------------+------------------+-------+-------------
sch_chameleon | t_batch_events | table | target_user
sch_chameleon | t_discarded_rows | table | target_user
sch_chameleon | t_error_log | table | target_user
sch_chameleon | t_last_received | table | target_user
sch_chameleon | t_last_replayed | table | target_user
sch_chameleon | t_log_replica | table | target_user
sch_chameleon | t_replica_batch | table | target_user
sch_chameleon | t_replica_tables | table | target_user
sch_chameleon | t_sources | table | target_user
(9 rows)

Step 4: Add the source details to pg_chameleon using the following command. Provide the name of the source as specified in the configuration file. In this example, the source name is mysql and the target is postgres database defined under pg_conn.

$ chameleon add_source --config default --source mysql --debug

Once you run the above command, you should see that the source details are added to the t_sources table.

db_replica=# select * from sch_chameleon.t_sources;
-[ RECORD 1 ]-------+----------------------------------------------
i_id_source | 1
t_source | mysql
jsb_schema_mappings | {"sakila": "sch_sakila"}
enm_status | ready
t_binlog_name |
i_binlog_position |
b_consistent | t
b_paused | f
b_maintenance | f
ts_last_maintenance |
enm_source_type | mysql
v_log_table | {t_log_replica_mysql_1,t_log_replica_mysql_2}
$ chameleon show_status --config default
Source id Source name Type Status Consistent Read lag Last read Replay lag Last replay
----------- ------------- ------ -------- ------------ ---------- ----------- ------------ -------------
1 mysql mysql ready Yes N/A N/A

Step 5: Initialize the replica/slave using the following command. Specify the source from which you are replicating the changes to the PostgreSQL database.

$ chameleon init_replica --config default --source mysql --debug

Initialization involves the following tasks on the MySQL server (source).

1. Flush the tables with read lock
2. Get the master’s coordinates
3. Copy the data
4. Release the locks

The above command creates the target schema in your postgres database automatically.
In the default.yml file, we mentioned the following schema_mappings.

schema_mappings:
sakila: sch_sakila

So, now it created the new schema scott in the target database db_replica.

db_replica=# \dn
List of schemas
Name | Owner
---------------+-------------
public | postgres
sch_chameleon | usr_replica
sch_sakila | usr_replica
(3 rows)

Step 6: Now, start replication using the following command.

$ chameleon start_replica --config default --source mysql

Step 7: Check replication status and any errors using the following commands.

$ chameleon show_status --config default
$ chameleon show_errors

This is how the status looks:

$ chameleon show_status --source mysql
Source id Source name Type Status Consistent Read lag Last read Replay lag Last replay
----------- ------------- ------ -------- ------------ ---------- ----------- ------------ -------------
1 mysql mysql running No N/A N/A
== Schema mappings ==
Origin schema Destination schema
--------------- --------------------
sakila sch_sakila
== Replica status ==
--------------------- ---
Tables not replicated 0
Tables replicated 17
All tables 17
Last maintenance N/A
Next maintenance N/A
Replayed rows
Replayed DDL
Skipped rows

Now, you should see that the changes are continuously getting replicated from MySQL to PostgreSQL.

Step 8:  To validate, you may insert a record into the table in MySQL that we created for the purpose of validation and check that it is replicated to postgres.

$ mysql -u root -pSecret123! -e "INSERT INTO sakila.emp VALUES (1,'avinash','vallarapu')"
mysql: [Warning] Using a password on the command line interface can be insecure.
$ psql -d db_replica -c "select * from sch_sakila.emp"
 id | first_name | last_name
----+------------+-----------
  1 | avinash    | vallarapu
(1 row)

In the above log, we see that the record that was inserted to the MySQL table was replicated to the PostgreSQL table.

You may also add multiple sources for replication to PostgreSQL (target).

Reference : http://www.pgchameleon.org/documents/

Please refer to the above documentation to find out about the many more options that are available with pg_chameleon

The post Replication from Percona Server for MySQL to PostgreSQL using pg_chameleon appeared first on Percona Database Performance Blog.

Aug
13
2018
--

Webinar Tues 8/14: Amazon Migration Service: The Magic Wand to Migrate Away from Your Proprietary Environment to MySQL

Amazon Migration Service to migrate to MySQL

Amazon Migration Service to migrate to MySQLPlease join Percona’s Solution Engineer, Dimitri Vanoverbeke as he presents Amazon Migration Service: The Magic Wand to Migrate Away from Your Proprietary Environment to MySQL on Tuesday, August 14th, 2018 at 7:00 AM PDT (UTC-7) / 10:00 AM EDT (UTC-4).

In this talk, we will learn about the Amazon Migration Tool. The talk will cover the possibilities, potential pitfalls prior to migrating and a high-level overview of its functionalities.

Register Now

The post Webinar Tues 8/14: Amazon Migration Service: The Magic Wand to Migrate Away from Your Proprietary Environment to MySQL appeared first on Percona Database Performance Blog.

Aug
01
2018
--

Percona Monitoring and Management 1.13.0 Is Now Available

Percona Monitoring and Management

Percona Monitoring and ManagementPMM (Percona Monitoring and Management) is a free and open-source platform for managing and monitoring MySQL and MongoDB performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

The most significant feature in this release is Prometheus 2, however we also packed a lot of visual changes into release 1.13:

  • Prometheus 2 – Consumes less resources, and Dashboards load faster!
  • New Dashboard: Network Overview – New dashboard for all things IPv4!
  • New Dashboard: NUMA Overview – New Dashboard! Understand memory allocation across DIMMs
  • Snapshots and Updates Improvements – Clearer instructions for snapshot sharing, add ability to disable update reporting
  • System Overview Dashboard improvements – See high level summary, plus drill in on CPU, Memory, Disk, and Network
  • Improved SingleStat for percentages – Trend line now reflects percentage value

We addressed 13 new features and improvements, and fixed 13 bugs.

Prometheus 2

The long awaited Prometheus 2 release is here!  By upgrading to PMM release 1.13, Percona’s internal testing has shown you will achieve a 3x-10x reduction in CPU usage, which translates into PMM Server being able to handle more instances than you could in 1.12.  You won’t see any gaps in graphs since internally PMM Server will run two instances of Prometheus and leverage remote_read in order to provide consistent graphs!

Our Engineering teams have worked very hard to make this upgrade as transparent as possible – hats off to them for their efforts!!

Lastly on Prometheus 2, we also included a new set of graphs to the Prometheus Dashboard to help you better understand when your PMM Server may run out of space. We hope you find this useful!

Network Overview Dashboard

We’re introducing a new dashboard that focuses on all things Networking – we placed a Last Hour panel highlighting high-level network metrics, and then drill into Network Traffic + Details, then focus on TCP, UDP, and ICMP behavior.

Snapshots and Updates Improvements

Of most interest to current Percona Customers, we’ve clarified the instructions on how to take a snapshot of a Dashboard in order to highlight that you are securely sharing with Percona. We’ve also configured the sharing timeout to 30 seconds (up from 4 seconds) so that we more reliably share useful data to Percona Support Engineers, as shorter timeout led to incomplete graphs being shared.

Packed into this feature is also a change to how we report installed version, latest version, and what’s new information:

Lastly, we modified the behavior of the docker environment option DISABLE_UPDATES to remove the Update button.  As a reminder, you can choose to disable update reporting for environments where you want tighter control over (i.e. lock down) who can initiate an update by launching the PMM docker container along with the environment variable as follows:

docker run ... -e DISABLE_UPDATES=TRUE

System Overview Dashboard Improvements

We’ve updated our System Overview Dashboard to focus on the four criteria of CPU, Memory, Disk, and Network, while also presenting a single panel row of high level information (uptime, count of CPUs, load average, etc)

Our last feature we’re introducing in 1.13 is a fix to SingleStat panels where the percentage value is reflected in the level of the trend line in the background.  For example, if you have a stat panel at 20% and 86%, the line in the background should fill the respective amount of the box:Improved SingleStat for percentages

New Features & Improvements

  • PMM-2225 – Add new Dashboard: Network Overview
  • PMM-2485 – Improve Singlestat for percentage values to accurately display trend line
  • PMM-2550 – Update to Prometheus 2
  • PMM-1667 – New Dashboard: NUMA Overview
  • PMM-1930 – Reduce Durability for MySQL
  • PMM-2291 – Add Prometheus Disk Space Utilization Information
  • PMM-2444 – Increase space for legends
  • PMM-2594 – Upgrade to Percona Toolkit 3.0.10
  • PMM-2610 – Configure Snapshot Timeout Default Higher and Update Instructions
  • PMM-2637 – Check for Updates and Disable Updates Improvements
  • PMM-2652 – Fix “Unexpected error” on Home dashboard after upgrade
  • PMM-2661 – Data resolution on Dashboards became 15sec min instead of 1sec
  • PMM-2663 – System Overview Dashboard Improvements

Bug Fixes

  • PMM-1977 – after upgrade pmm-client (1.6.1-1) can’t start mysql:metrics – can’t find .my.cnf
  • PMM-2379 – Invert colours for Memory Available graph
  • PMM-2413 – Charts on MySQL InnoDB metrics are not fully displayed
  • PMM-2427 – Information loss in CPU Graph with Grafana 5 upgrade
  • PMM-2476 – AWS PMM is broken on C5/M5 instances
  • PMM-2576 – Error in logs for MySQL 8 instance on CentOS
  • PMM-2612 – Wrong information in PMM Scrapes Task
  • PMM-2639 – mysql:metrics does not work on Ubuntu 18.04
  • PMM-2643 – Socket detection and MySQL 8
  • PMM-2698 – Misleading Graphs for Rare Events
  • PMM-2701 – MySQL 8 – Innodb Checkpoint Age
  • PMM-2722 – Memory auto-configuration for Prometheus evaluates to minimum of 128MB in entrypoint.sh

How to get PMM Server

PMM is available for installation using three methods:

The post Percona Monitoring and Management 1.13.0 Is Now Available appeared first on Percona Database Performance Blog.

Jun
27
2018
--

Percona Monitoring and Management 1.12.0 Is Now Available

Percona Monitoring and Management

Percona Monitoring and ManagementPMM (Percona Monitoring and Management) is a free and open-source platform for managing and monitoring MySQL and MongoDB performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

In release 1.12, we invested our efforts in the following areas:

  • Visual Explain in Query Analytics – Gain insight into MySQL’s query optimizer for your queries
  • New Dashboard – InnoDB Compression Metrics – Evaluate effectiveness of InnoDB Compression
  • New Dashboard – MySQL Command/Handler Compare – Contrast MySQL instances side by side
  • Updated Grafana to 5.1 – Fixed scrolling issues

We addressed 10 new features and improvements, and fixed 13 bugs.

Visual Explain in Query Analytics

We’re working on substantial changes to Query Analytics and the first part to roll out is something that users of Percona Toolkit may recognize – we’ve introduced a new element called Visual Explain based on pt-visual-explain.  This functionality transforms MySQL EXPLAIN output into a left-deep tree representation of a query plan, in order to mimic how the plan is represented inside MySQL.  This is of primary benefit when investigating tables that are joined in some logical way so that you can understand in what order the loops are executed by the MySQL query optimizer. In this example we are demonstrating the output of a single table lookup vs two table join:

Single Table Lookup Two Tables via INNER JOIN
SELECT DISTINCT c
FROM sbtest13
WHERE id
BETWEEN 49808
AND 49907
ORDER BY c
SELECT sbtest3.c
FROM sbtest1
INNER JOIN sbtest3
ON sbtest1.id = sbtest3.id
WHERE sbtest3.c ='long-string';

InnoDB Compression Metrics Dashboard

A great feature of MySQL’s InnoDB storage engine includes compression of data that is transparently handled by the database, saving you space on disk, while reducing the amount of I/O to disk as fewer disk blocks are required to store the same amount of data, thus allowing you to reduce your storage costs.  We’ve deployed a new dashboard that helps you understand the most important characteristics of InnoDB’s Compression.  Here’s a sample of visualizing Compression and Decompression attempts, alongside the overall Compression Success Ratio graph:

 

MySQL Command/Handler Compare Dashboard

We have introduced a new dashboard that lets you do side-by-side comparison of Command (Com_*) and Handler statistics.  A common use case would be to compare servers that share a similar workload, for example across MySQL instances in a pool of replicated slaves.  In this example I am comparing two servers under identical sysbench load, but exhibiting slightly different performance characteristics:

The number of servers you can select for comparison is unbounded, but depending on the screen resolution you might want to limit to 3 at a time for a 1080 screen size.

New Features & Improvements

  • PMM-2519: Display Visual Explain in Query Analytics
  • PMM-2019: Add new Dashboard InnoDB Compression metrics
  • PMM-2154: Add new Dashboard Compare Commands and Handler statistics
  • PMM-2530: Add timeout flags to mongodb_exporter (thank you unguiculus for your contribution!)
  • PMM-2569: Update the MySQL Golang driver for MySQL 8 compatibility
  • PMM-2561: Update to Grafana 5.1.3
  • PMM-2465: Improve pmm-admin debug output
  • PMM-2520: Explain Missing Charts from MySQL Dashboards
  • PMM-2119: Improve Query Analytics messaging when Host = All is passed
  • PMM-1956: Implement connection checking in mongodb_exporter

Bug Fixes

  • PMM-1704: Unable to connect to AtlasDB MongoDB
  • PMM-1950: pmm-admin (mongodb:metrics) doesn’t work well with SSL secured mongodb server
  • PMM-2134: rds_exporter exports memory in Kb with node_exporter labels which are in bytes
  • PMM-2157: Cannot connect to MongoDB using URI style
  • PMM-2175: Grafana singlestat doesn’t use consistent colour when unit is of type Time
  • PMM-2474: Data resolution on Dashboards became 15sec interval instead of 1sec
  • PMM-2581: Improve Travis CI tests by addressing pmm-admin check-network Time Drift
  • PMM-2582: Unable to scroll on “_PMM Add Instance” page when many RDS instances exist in an AWS account
  • PMM-2596: Set fixed height for panel content in PMM Add Instances
  • PMM-2600: InnoDB Checkpoint Age does not show data for MySQL
  • PMM-2620: Fix balancerIsEnabled & balancerChunksBalanced values
  • PMM-2634: pmm-admin cannot create user for MySQL 8
  • PMM-2635: Improve error message while adding metrics beyond “exit status 1”

Known Issues

  • PMM-2639: mysql:metrics does not work on Ubuntu 18.04 – We will address this in a subsequent release

How to get PMM Server

PMM is available for installation using three methods:

The post Percona Monitoring and Management 1.12.0 Is Now Available appeared first on Percona Database Performance Blog.

Feb
12
2018
--

Does Percona Monitoring and Management (PMM) Support External Monitoring Services? Yes It Does!

External Monitoring Services

Percona Monitoring and Management (PMM) is a free and open-source platform for managing and monitoring MySQL and MongoDB performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

Starting with version 1.4.0 and improved in 1.7.0, PMM supports external monitoring services. This means you can plug in Prometheus exporters for technologies not directly provided by Percona. For example, you can start monitoring the metrics of your PostgreSQL database host, Memcached or Redis.

Exporters Overview

Applications store their metrics in arbitrary formats, and Prometheus exporters collect them and produce (or export to) a consistent format of key-value pairs. The keys refer to metric types and values are numbers in the float 64 format. Due to the diversity of formats that applications may use, you should program a specific exporter for each format. However, if you decide to make the metrics of your application available via PMM you may consider using one of existing Prometheus exporters.

Currently, PMM offers exporters for MySQL (mysqld_exporter) and MongoDB (mongodb_exporter) database management systems. Built-in exporters also exist for Percona XtraDBCluster, MariaDB, RDS and Aurora via mysqld_exporter and for ProxySQL (via proxysql_exporter). These exporters are made available as monitoring services that you can add or remove as necessary. In addition, PMM includes the node_exporter to capture the host level Linux metrics such as CPU, Load, and disk resources.

Using Exporters

On the computer where the PMM client is installed and connected to a PMM server, make use of the pmm-admin utility to add any built-in monitoring service directly. There is no extra effort in this case: the added monitoring service will run its exporter and all required configuration updates are made automatically to make the metrics available in the web interface for further analysis in Query analytics and Metrics monitor.

In case of external monitoring services, you need to locate, download, set up and run the specific Prometheus exporter to collect metrics. When it is ready, you can add it as a monitoring service:

pmm-admin add external:service job_name [instance] --service-port=PORT_NUMBER

This command adds an external monitoring service bound to the Prometheus job that you specify as the job_name parameter. You should also provide the port associated with this Prometheus job as the value of the service-port parameter. The instance parameter is optional. By default, it is assigned the name of the host where you run pmm-admin.

Example 1: Adding a PostgreSQL Monitoring Service

In order to add an external monitoring service for a PostgreSQL database server, make sure to install and configure your PostgreSQL server. Then, select a PostgreSQL Prometheus exporter from the list available from the  Prometheus site, such as PostgreSQL metric exporter for Prometheus. Refer to the documentation for this exporter for details about how to install and set it up.

As soon as your Prometheus exporter can collect metrics from your PostgreSQL database server,  you are ready to add this exporter as a monitoring service. Make sure that you have access to a configured PMM server and your PMM client has been set up to use it. Use the pmm-admin utility, which is part of PMM client, to add the PostgreSQL monitoring service. Assuming postgresql is the name of this monitoring service, your command should look like this:

pmm-admin add external:service --service-port=PORT_NUMBER postgresql

It is time now to display the metrics on the PMM Server. Open Metrics Monitor and check the Advanced Data Exploration dashboard. This can dashboard visualize a lot of metrics including those exposed by external monitoring services. In the Host field select your host. Use the Metric field to select a metric.

External Monitoring Services
Viewing a metric exposed by a PostgreSQL exporter.

Setting up an external monitoring service requires extra work compared to adding built-in monitoring services. However, by using external monitoring services you can considerably extend the capabilities of your PMM installation.

Note that running the pmm-admin list command lists the added external monitoring services. They also appear in the JSON output, too. To remove an external service use the remove (or its short form rm) command:

pmm-admin rm external:service --service-port=PORT_NUMBER NAME_OF_EXTERNAL_MONITORING_SERVICE

$ sudo pmm-admin list
pmm-admin 1.7.0
PMM Server      | 192.0.2.2 (password-protected)
Client Name     | postgres01
Client Address  | 192.0.2.3
Service Manager | unix-systemv
Job name    Scrape interval  Scrape timeout  Metrics path  Scheme  Target         Labels                   Health
postgresql  1s               1s              /metrics      http    192.0.2.3:9187 instance="postgres01"      UP

Example 2: Adding a Redis Monitoring Service

To start with, you must install a Prometheus exporter for Redis (this exporter is listed on the Prometheus Exporters and Integrations page) on the machine where your PMM client runs. The following command adds this exporter as an external monitoring service (run it as a superuser or use sudo). This time the command has an extra parameter:

$ sudo pmm-admin add external:service redis --service-port 9121 redis01
External service added.

Notice that we use Redis Server as the last parameter passed to pmm-admin add external:service command. The last positional parameter is a label that you assign to this particular instance.

pmm-admin add external:service --service-port=PORT_NUMBER NAME_OF_EXTERNAL_MONITORING_SERVICE [INSTANCE_LABEL]

You may choose any name for this purpose. Make sure to use quotes if you decide to use a label made of two or more words.

$ sudo pmm-admin list
pmm-admin 1.7.0
PMM Server | 127.0.0.1
Client Name | percona
Client Address | 172.17.0.1
Service Manager | linux-systemd
No services under monitoring.
Job name Scrape interval Scrape timeout Metrics path Scheme Target          Labels                  Health
redis    1m0s            10s            /metrics     http   172.17.0.1:9121 instance="redis01"      UP

To view Redis related metrics you need to open the Advanced Data Exploration dashboard on your PMM Server. The redis01 label automatically appears in the Host field in the Advanced Data Exploration dashboard. In the Host field, select the redis01 option and choose a metric to view from the Metric field, such as redis_exporter_scrapes_total.

Other Ways to Add External Services

The pmm-admin add external:service command is the recommended way to add an external monitoring service. There exist other, more specific, methods. The pmm-admin add external:metrics adds external Prometheus exporters job to metrics monitoring.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com