Securing PostgreSQL as an Enterprise-Grade Environment

PostgreSQL enterprise-grade security

PostgreSQL® logoIn this post, we review how you can build an enhanced and secure PostgreSQL database environment using community software. We look at the features that are available in PostgreSQL that, when implemented, provide improved security.

As discussed in the introductory blog post of this series, in our webinar of October 10, 2018 we highlight important aspects an enterprise should consider for their PostgreSQL environments. This series of blogs addressing particular aspects of the enterprise-grade postgres environment complements the webinar. This post addresses security.

Authentication Layer

Client connections to PostgreSQL Server using host based authentication

PostgreSQL uses a host based authentication file (pg_hba.conf) to authorize incoming connections. This file contains entries with a combination of 5 categories: type, database, user, address, and method. A client is allowed to connect to a database only when the combination of username, database and the hostname of the client matches an entry in the pg_hba.conf file.

Consider the following entry in pg_hba.conf file :

host percona pguser md5

This entry says that connections from server are only allowed from user pguser and only to the database percona. The method md5 forces password authentication.

The order of the entries in the pg_hba.conf file matters. If you have an entry that rejects connections from a given server followed by another that allows connections from it, the first entry in the order is considered. So, in this case, the connection is rejected.

This is the first layer of protection in authentication. If this criteria is not satisfied in this Access Control List (ACL), PostgreSQL will discard the request without considering even the server authentication.

Server Authentication

Historically, PostgreSQL uses MD5 digest as a password hash by default. The problem with pure MD5 hashing is that this function will always return the same hash for a given password, which renders a MD5 digest more susceptible for password cracking. Newer versions of PostgreSQL implement SCRAM Authentication (Simple Authentication and Secured Layer) that stores passwords in salted and iterated hash formats to strengthen PostgreSQL against offline attacks. SCRAM-SHA-256 support was introduced in PostgreSQL 10. What matters most in terms of “enterprise-grade” security is that PostgreSQL supports industry-standard authentication methods out of the box, like SSL certificates, PAM/LDAP, Kerberos, etc.

Authorization Layer

User management through roles and privileges

It is always recommended to implement segregation of users through roles and privileges. There may be several user accounts in your PostgreSQL server. Only a few of them may be application accounts while the rest are developers or admin accounts. In such cases, PostgreSQL allows you to create multiple roles. Those can be assigned with a set of privileges. Thus, instead of managing user privileges individually, standard roles can be maintained and the appropriate role from the list can be assigned to a user. Through roles, database access can be standardized, which helps in user management and avoids granting too much or too little privilege to a given user.

For example, we might have six roles:


Now, if you need to create a new dev user who can only have read access, grant one among the appropriate roles, such as dev_read_only:

GRANT dev_read_only to avi_im_developer;

Row level Security

Starting with version 9.5, PostgreSQL implements row level security, which can limit access to only a subset of records/rows in a table. Usually a user is granted a mix of SELECT, INSERT, DELETE and UPDATE privileges on a given table, which allows access to all records in the table. Through row level security, however, such privileges can be restricted to a subset of the records by means of a policy, which in turn can be  assigned to a role.

In the next example, we create an employee table and two manager accounts. We then enable row level security on the table and create a policy that allows the managers to only view/modify their own subordinates’ records:

CREATE TABLE scott.employee (id INT, first_name VARCHAR(20), last_name VARCHAR(20), manager VARCHAR(20));
INSERT INTO scott.employee VALUES (1,'avinash','vallarapu','carina');
INSERT INTO scott.employee VALUES (2,'jobin','augustine','stuart');
INSERT INTO scott.employee VALUES (3,'fernando','laudares','carina');
CREATE ROLE managers;
GRANT managers TO carina, stuart;
GRANT SELECT, INSERT, UPDATE, DELETE ON scott.employee TO managers;
GRANT USAGE ON SCHEMA scott TO managers;
CREATE POLICY employee_managers ON scott.employee TO managers USING (manager = current_user);

In the log we can see that only certain records are visible to each manager:

$ psql -d percona -U carina
psql (10.5)
Type "help" for help.
percona=> select * from scott.employee ;
id | first_name | last_name | manager
 1 | avinash    | vallarapu | carina
 3 | fernando   | laudares | carina
(2 rows)
$ psql -d percona -U stuart
psql (10.5)
Type "help" for help.
percona=> select * from scott.employee ;
id | first_name | last_name | manager
 2 | jobin      | augustine | stuart
(1 row)

You can read more about row level security in the manual page.

Data Security

1. Encryption of data over the wire using SSL

PostgreSQL allows you to use SSL to enable encryption of data in motion. In addition, you may enable certification based authentication to ensure that the communication is happening between trusted parties. SSL is implemented by OpenSSL and thus it requires the OpenSSL package to be installed in your PostgreSQL server and PostgreSQL to be built –with-openssl support.

The following entry in a pg_hba.conf file says that connections to any database and from any user are allowed from server as long as the communication is encrypted over SSL. Also, the connection is only established when a valid client certificate is provided:

hostssl all all md5

Optionally, you may also use Client Certificate Authentication using the following method:

hostssl all all cert clientcert=1

2. Encryption at Rest – pgcrypto

The pgcrypto module provides cryptographic functions for PostgreSQL, allowing certain fields to be stored encrypted. pgcrypto implements PGP encryption, which is part of the OpenPGP (RFC 4880) standard. It supports both symmetric-key and public-key encryption. Besides the advanced features offered by PGP for encryption, pgcrypto also offers functions for running simple encryption based on ciphers. These functions only run a cipher over data.

Accounting and Auditing

Logging in PostgreSQL

PostgreSQL allows you to log either all of the statements or a few statements based on parameter settings. You can log all the DDLs or DMLs or any statement running for more than a certain duration to the log file when logging_collector is enabled. To avoid write overload to the data directory, you may also move your log_directory to a different location. Here’s a few important parameters you should review when logging activities in your PostgreSQL server:


Please note that detailed logging takes additional disk space and may impose an important overhead in terms of write IO depending on the activity in your PostgreSQL server. You should be careful when enabling logging and should only do so after understanding the overhead and performance degradation it may cause to your workload.

Auditing – pgaudit and set_user

Some essential auditing features in PostgreSQL are implemented as extensions, which can be enabled at will on highly secured environments with regulatory requirements.

pgaudit helps to audit the activities happening in the database. If any unauthorized user has intentionally obfuscated the DDL or DML, the statement the user has passed and the sub-statement that was actually executed in the database will be logged in the PostgreSQL log file.


  provides a method of privilege escalations. If properly implemented, it provides the highest level of auditing, which allows the monitoring of even SUPERUSER actions.

You can read more about pgaudit here.

Security Bug Fixes

PostgreSQL Global Development Group (PGDG) considers security bugs seriously. Any security vulnerabilities can be reported directly to The list of security issues fixed for all the supported PostgreSQL versions can be found here. Security fixes to PostgreSQL are made available through minor version upgrades. This is the main reason why it is advised to always maintain PostgreSQL servers upgraded to the latest minor version.

If you liked this post…

Please join Percona’s PostgreSQL Support Technical Lead,  Avinash Vallarapu; Senior Support Engineer, Fernando Laudares; and Senior Support Engineer, Jobin Augustine, on Wednesday, October 10, 2018 at 7:00 AM PDT (UTC-7) / 10:00 AM EDT (UTC-4), as they demonstrate an enterprise-grade PostgreSQL® environment built using a combination of open source tools and extensions.

Register Now

The post Securing PostgreSQL as an Enterprise-Grade Environment appeared first on Percona Database Performance Blog.


Comparing Data At-Rest Encryption Features for MariaDB, MySQL and Percona Server for MySQL

Encryption at rest MariaDB MySQL Percona Server

Encryption at rest MariaDB MySQL Percona ServerProtecting the data stored in your database may have been at the top of your priorities recently, especially with the changes that were introduced earlier this year with GDPR.

There are a number of ways to protect this data, which until not so long ago would have meant either using an encrypted filesystem (e.g. LUKS), or encrypting the data before it is stored in the database (e.g. AES_ENCRYPT or other abstraction within the application). A few years ago, the options started to change, as Alexander Rubin discussed in MySQL Data at Rest Encryption, and now MariaDB®, MySQL®, and Percona Server for MySQL all support encryption at-rest. However, the options that you have—and, indeed, the variable names—vary depending upon which database version you are using.

In this blog post we will take a look at what constitutes the maximum level of at-rest encryption that can be achieved with each of the latest major GA releases from each provider. To allow a fair comparison across the three, we will focus on the file-based key management; keyring_file plugin for MySQL and Percona Server for MySQL along with file_key_management plugin for MariaDB.

MariaDB 10.3

The MariaDB team take the credit for leading the way with at-rest encryption, as most of their features have been present since the 10.1 release (most notably the beta release of 10.1.3 in March 2015). Google donated the tablespace encryption, and eperi donated per-table encryption and key identifier support.

The current feature set for MariaDB 10.3 comprises of the following variables:

Maximising at-rest encryption with MariaDB 10.3

Using the following configuration would give you maximum at-rest encryption with MariaDB 10.3:

plugin_load_add = file_key_management
file_key_management_filename = /etc/mysql/keys.enc
file_key_management_filekey = FILE:/etc/mysql/.key
file_key_management_encryption_algorithm = aes_cbc
innodb_encrypt_log = ON
innodb_encrypt_tables = FORCE
Innodb_encrypt_threads = 4
encrypt_binlog = ON
encrypt_tmp_disk_tables = ON
encrypt_tmp_files = ON
aria_encrypt_tables = ON

This configuration would provide the following at-rest protection:

  • automatic and enforced InnoDB tablespace encryption
  • automatic encryption of existing tables that have not been marked with
  • 4 parallel encryption threads
  • encryption of temporary files and tables
  • encryption of Aria tables
  • binary log encryption
  • an encrypted file that contains the main encryption key

You can read more about preparing the keys, as well as the other key management plugins in the Encryption Key Management docs.

There is an existing bug related to encrypt_tmp_files (MDEV-14884), which causes the use of

mysqld --help --verbose

 to fail, which if you are using the official MariaDB Docker container for 10.3 will cause you to be unable to keep mysqld up and running. Messages similar to these would be visible in the Docker logs:

ERROR: mysqld failed while attempting to check config
command was: "mysqld --verbose --help --log-bin-index=/tmp/tmp.HDiERM4SPx"
2018-08-15 13:38:15 0 [Note] Plugin 'FEEDBACK' is disabled.
2018-08-15 13:38:15 0 [ERROR] Failed to enable encryption of temporary files
2018-08-15 13:38:15 0 [ERROR] Aborting

N.B. you should be aware of the limitations for the implementation, most notably log tables and files are not encrypted and may contain data along with any query text.

One of the key features supported by MariaDB that is not yet supported by the other providers is the automatic, parallel encryption of tables that will occur when simply enabling


 . This avoids the need to mark the existing tables for encryption using


 , although at the same time it also does not automatically add the comment and so you would not see this information. Instead, to check for encrypted InnoDB tables in MariaDB you should check


 , an example query being:

mysql> SELECT SUBSTRING_INDEX(name, '/', 1) AS db_name,
   ->   SUBSTRING_INDEX(name, '/', -1) AS db_table,
   -> FROM information_schema.INNODB_SYS_TABLESPACES
| db_name | db_table             | encrypted |
| mysql   | innodb_table_stats   |      1    |
| mysql   | innodb_index_stats   |      0    |
| mysql   | transaction_registry |      0    |
| mysql   | gtid_slave_pos       |      0    |

As can be inferred from this query, the system tables in MariaDB 10.3 are still predominantly MyISAM and as such cannot be encrypted.


Whilst the enterprise version of MySQL has support for a number of data at-rest encryption features as of 5.7, most of them are not available to the community edition. The latest major release of the community version sees the main feature set comprise of:

The enterprise edition adds the following extra support:

Maximising at-rest encryption with MySQL 8.0

Using the following configuration would give you maximum at-rest encryption with MySQL 8.0:

This configuration would provide the following at-rest protection:

  • optional InnoDB tablespace encryption
  • redo and undo log encryption

You would need to create new, or alter existing tables with the


 option, which would then be visible by examining


 , an example query being:

mysql> SELECT TABLE_SCHEMA AS db_name,
   ->    TABLE_NAME AS db_table,
   ->    CREATE_OPTIONS LIKE '%ENCRYPTION="Y"%' AS encrypted
   -> FROM information_schema.INNODB_TABLESPACES ts
   -> INNER JOIN information_schema.TABLES t ON t.TABLE_SCHEMA = SUBSTRING_INDEX(, '/', 1)
   ->                                        AND t.TABLE_NAME = SUBSTRING_INDEX(, '/', -1);
| db_name | db_table        | encrypted |
| sys     | sys_config      |     1     |

N.B. You are able to encrypt the tablespaces in 5.7, in which case you should use


 as the internal system views on the data dictionary were renamed (InnoDB Changes).

Unfortunately, whilst all of the tables in the mysql schema use the InnoDB engine (except for the log tables), you cannot encrypt them and instead get the following error:

ERROR 3183 (HY000): This tablespace can't be encrypted.

Interestingly, you are led to believe that you can indeed encrypt the




 tables, but this is in fact a bug (#91791).

Percona Server for MySQL

Last, but not least we have Percona Server for MySQL, which, whilst not completely matching MariaDB for features, is getting very close. As we shall see shortly, it does in fact have some interesting differences to both MySQL and MariaDB.

The current feature set for 5.7, which does indeed exceed the features provided by MySQL 5.7 and for the most part 8.0, is as follows:

Maximising at-rest encryption with Percona Server for MySQL 5.7

Using the following configuration would give you maximum at-rest encryption with Percona Server 5.7:

This configuration would provide the following at-rest protection:

  • automatic and enforced InnoDB tablespace encryption
  • encryption of temporary files and tables
  • binary log encryption
  • encryption when performing online DDL

There are some additional features that are due for release in the near future:

  • Encryption of the doublewrite buffer
  • Automatic key rotation
  • Undo log and redo log encryption
  • InnoDB system tablespace encryption
  • InnoDB tablespace and redo log scrubbing
  • Amazon KMS keyring plugin

Just like MySQL, encryption of any existing tables needs to be specified via


 via an


, however new tables are automatically encrypted. Another difference is that in order to check which tables are encrypted you can should the flag set against the tablespace in


, an example query being:

mysql> SELECT SUBSTRING_INDEX(name, '/', 1) AS db_name,
   ->    SUBSTRING_INDEX(name, '/', -1) AS db_table,
   ->    (flag & 8192) != 0 AS encrypted
   -> FROM information_schema.INNODB_SYS_TABLESPACES;
| db_name | db_table                  | encrypted |
| sys     | sys_config                |      1    |
| mysql   | engine_cost               |      1    |
| mysql   | help_category             |      1    |
| mysql   | help_keyword              |      1    |
| mysql   | help_relation             |      1    |
| mysql   | help_topic                |      1    |
| mysql   | innodb_index_stats        |      1    |
| mysql   | innodb_table_stats        |      1    |
| mysql   | plugin                    |      1    |
| mysql   | servers                   |      1    |
| mysql   | server_cost               |      1    |
| mysql   | slave_master_info         |      1    |
| mysql   | slave_relay_log_info      |      1    |
| mysql   | slave_worker_info         |      1    |
| mysql   | time_zone                 |      1    |
| mysql   | time_zone_leap_second     |      1    |
| mysql   | time_zone_name            |      1    |
| mysql   | time_zone_transition      |      1    |
| mysql   | time_zone_transition_type |      1    |
| mysql   | gtid_executed             |      0    |

Here you will see something interesting! We are able to encrypt most of the system tables, including two that are of significance, as they can contain plain text passwords:

| db_name | db_table          | encrypted |
| mysql   | servers           |      1    |
| mysql   | slave_master_info |      1    |

In addition to the above, Percona Server for MySQL also supports using the opensource HashiCorp Vault to host the keyring decryption information using the keyring_vault plugin; utilizing this setup (provided Vault is not on the same device as your mysql service, and is configured correctly) gains you an additional layer of security.

You may also be interested in my earlier blog post on using Vault with MySQL, showing you how to store your credentials in a central location and use them to access your database, including the setup and configuration of Vault with Let’s Encrypt certificates.


There are significant differences both in terms of features and indeed variable names, but all of them are able to provide encryption of the InnoDB tablespaces that will be containing your persistent, sensitive data. The temporary tablespaces, InnoDB logs and temporary files contain transient data, so whilst they should ideally be encrypted, only a small section of data would exist in them for a finite amount of time which is less of a risk, albeit a risk nonetheless.

Here are the highlights:

MariaDB 10.3 MySQL 8.0 Percona Server 5.7
encrypted InnoDB data Y Y Y
encrypted non-InnoDB data Aria-only N N
encrypted InnoDB logs Y Y TBA
automatic encryption Y N Y
enforced encryption Y N Y
automatic key rotation Y N TBA
encrypted binary logs Y N Y
encrypted online DDL ? N Y
encrypted keyring Y Enterprise-only N
mysql.slave_master_info N N Y
mysql.servers N N Y
Hashicorp Vault N N Y
AWS KMS Y Enterprise-only TBA


Extra reading:


The post Comparing Data At-Rest Encryption Features for MariaDB, MySQL and Percona Server for MySQL appeared first on Percona Database Performance Blog.


Another Day, Another Data Leak

another day another data leak Exactis

another day another data leak ExactisIn the last few days, there has been information released about yet another alleged data leak, placing in jeopardy “…[the] personal information on hundreds of millions of American adults, as well as millions of businesses.” In this case, the “victim” was Exactis, for whom data collection and data security are core business functions.

Some takeaways from Exactis

Please excuse the pun! In security, we have few chances to chuckle. In fact, as a Security Architect, I sigh deeply when I read about this kind of issue. Firstly, it’s preventable. Secondly, I worry that if an organization like Exactis is not getting it right, what chance the rest of the world?

As the Wired article notes the tool can be revealing and well worth a look. For example, you can see there are still MANY elasticSearch systems exposed to the public internet here. Why not use shodan to check what everyone else in the world can see openly on your systems ?

Securing databases

Databases in themselves do not need to be at risk, as long as you take the necessary precautions. We discussed this in this blog post that I co-authored last year.

In this latest alleged gaffe, as far as I can discern, had the setup made use of iptables or a similar feature then the breach could not have occurred.

With immaculate timing, my colleague Marco Tusa wrote a post last month on how to set up iptables for Percona XtraDB Cluster, and if you are not sure if or how that applies to your setup, it is definitely worth a read. In fact, you can access all of our security blog posts if you would like some more pointers.

Of course, security does not stop with iptables. Application developers should already be familiar with the need to avoid SQL injection, and there is a decent SQL injection prevention cheat sheet here, offered by The Open Web Application Security Project (OWASP). Even if you don’t fully understand the technical details, a cheat sheet like this might help you to ask the right questions for your application.

MySQL resources

For a more in-depth look at MySQL security, I have two talks up on YouTube. The first of these is a twenty-minute presentation on hardening MySQL and the second on web application security and why you really should review yours. You could also check out our recorded webinar Security and Encryption in the MySQL world presented by Dimitri Vanoverbeke.

MongoDB resources

Of course, security challenges are not unique to SQL databases. If you are a MongoDB user, this webinar MongoDB Security: Making things secure by default might be of interest to you. Or perhaps this one on using LDAP Authentication with MongoDB? Adamo Tonete presents both of these webinars.

For a more widely applicable view, you could try Colin Charles’ recent webinar too.

There are always consequences

As Exactis are no doubt discovering, managing the fallout from such a breach is a challenge. If you are not sure where you stand on security, or what you can do to improve your situation, then audit services such as those we offer could prove to be a valuable investment.

Finally, some of you will be lucky enough to have someone dedicated to IT security in your organizations. Next time you see them, instead of avoiding their steely stare, why not invite them for a coffee* and a chat? It could be enlightening!

*Beer or scotch is also almost always accepted too…

The post Another Day, Another Data Leak appeared first on Percona Database Performance Blog.


Webinar Thurs 6/14: MongoDB Backup and Recovery Field Guide

mongodb backup and recovery field guide

mongodb backup and recovery field guidePlease join Percona’s Sr. Technical Operations Architect, Tim Vaillancourt as he presents MongoDB Backup and Recovery Field Guide on Thursday, June 14, 2018, at 10:00 AM PDT (UTC-7) / 1:00 PM EDT (UTC-4).

This talk will cover backup and recovery solutions for MongoDB replica sets and clusters, focusing on online and low-impact solutions for production systems.

Register for the webinar

Tim Vaillancourt

Senior Technical Operations Architect

With experience operating infrastructures in industries such as government, online marketing/publishing, SaaS and gaming combined with experience tuning systems from the hard disk all the way up to the end-user, Tim has spent time in nearly every area of the modern IT stack with many lessons learned.

Tim is based in Amsterdam, NL and enjoys traveling, coding and music. Prior to Percona Tim was the Lead MySQL DBA of Electronic Arts’ DICE studios, helping some of the largest games in the world (“Battlefield” series, “Mirrors Edge” series, “Star Wars: Battlefront”) launch and operate smoothly while also leading the automation of MongoDB deployments for EA systems. Before the role of DBA at EA’s DICE studio, Tim served as a subject matter expert in NoSQL databases, queues and search on the Online Operations team at EA SPORTS.

Prior to moving to the gaming industry, Tim served as a Database/Systems Admin operating a large MySQL-based SaaS infrastructure at AbeBooks/Amazon Inc.

The post Webinar Thurs 6/14: MongoDB Backup and Recovery Field Guide appeared first on Percona Database Performance Blog.


MongoDB: deploy a replica set with transport encryption (part 3/3)

MongoDB Encryption Replica Sets

MongoDB Encryption Replica SetsIn this third and final post of the series, we look at how to configure transport encryption on a deployed MongoDB replica set. Security vulnerabilities can arise when internal personnel have legitimate access to the private network, but should not have access to the data. Encrypting intra-node traffic ensures that no one can “sniff” sensitive data on the network.

In part 1 we described MongoDB replica sets and how they work.
In part 2 we provided a step-by-step guide to deploy a simple 3-node replica set, including information on replica set configuration.

Enable Role-Based Access Control

In order for the encryption to be used in our replica set, we need first to activate Role-Based Access Control (RBAC). By default, a MongoDB installation permits anyone to connect and see the data, as in the sample deployment we created in part 2. Having RBAC enabled is mandatory for encryption.

RBAC governs access to a MongoDB system. Users are created and assigned privileges to access specific resources, such as databases and collections. Likewise, for carrying out administrative tasks, users need to be created with specific grants. Once activated, every user must authenticate themselves in order to access MongoDB.

Prior to activating RBAC, let’s create an administrative user. We’ll connect to the PRIMARY member and do the following:

rs-test:PRIMARY> use admin
switched to db admin
rs-test:PRIMARY> db.createUser({user: 'admin', pwd: 'secret', roles:['root']})
Successfully added user: { "user" : "admin", "roles" : [ "root" ] }

Let’s activate the RBAC in the configuration file /etc/mongod.conf on each node

      authorization: enabled

and restart the daemon

sudo service mongod restart

Now to connect to MongoDB we issue the following command:

mongo -u admin -p secret --authenticationDatabase "admin"


MongoDB supports X.509 certificate authentication for use with a secure TLS/SSL connection. The members can use X.509 certificates to verify their membership of the replica set.

In order to use encryption, we need to create certificates on all the nodes and have a certification authority (CA) that signs them. Since having a certification authority can be quite costly, we decide to use self-signed certificates. For our purposes, this solution ensures encryption and has no cost. Using a public CA is not necessary inside a private infrastructure.

To proceed with certificate generation we need to have openssl installed on our system and certificates need to satisfy these requirements:

  • any certificate needs to be signed by the same CA
  • the common name (CN) required during the certificate creation must correspond to the hostname of the host
  • any other field requested in the certificate creation should be a non-empty value and, hopefully, should reflect our organization details
  • it is also very important that all the fields, except the CN, should match those from the certificates for the other cluster members

The following guide describes all the steps to configure internal X.509 certificate-based encryption.

1 – Connect to one of the hosts and generate a new private key using openssl

openssl genrsa -out mongoCA.key -aes256 8192

We have created a new 8192 bit private key and saved it in the file mongoCA.key
Remember to enter a strong passphrase when requested.

2 – Sign a new CA certificate

Now we are going to create our “fake” local certification authority that we’ll use later to sign each node certificate.

During certificate creation, some fields must be filled out. We could choose these randomly but they should correspond to our organization’s details.

root@psmdb1:~# openssl req -x509 -new -extensions v3_ca -key mongoCA.key -days 365 -out
    Enter pass phrase for mongoCA.key:
    You are about to be asked to enter information that will be incorporated
    into your certificate request.
    What you are about to enter is what is called a Distinguished Name or a DN. There are quite a few fields but you can leave some blank
    For some fields there will be a default value,
    If you enter '.', the field will be left blank.
    Country Name (2 letter code) [AU]:US
    State or Province Name (full name) [Some-State]:California
    Locality Name (eg, city) []:San Francisco
    Organization Name (eg, company) [Internet Widgits Pty Ltd]:My Company Ltd
    Organizational Unit Name (eg, section) []:DBA
    Common Name (e.g. server FQDN or YOUR name) []:psmdb
    Email Address []

3 – Issue self-signed certificates for all the nodes

For each node, we need to generate a certificate request and sign it using the CA certificate we created in the previous step.

Remember: fill out all the fields requested the same for each host, but remember to fill out a different common name (CN) that must correspond to the hostname.

For the first node issue the following commands.

openssl req -new -nodes -newkey rsa:4096 -keyout psmdb1.key -out psmdb1.csr
openssl x509 -CA mongoCA.crt -CAkey mongoCA.key -CAcreateserial -req -days 365 -in psmdb1.csr -out psmdb1.crt
cat psmdb1.key psmdb1.crt > psmdb1.pem

for the second node

openssl req -new -nodes -newkey rsa:4096 -keyout psmdb2.key -out psmdb2.csr
openssl x509 -CA mongoCA.crt -CAkey mongoCA.key -CAcreateserial -req -days 365 -in psmdb2.csr -out psmdb2.crt
cat psmdb2.key psmdb2.crt > psmdb2.pem

and for the third node

openssl req -new -nodes -newkey rsa:4096 -keyout psmdb3.key -out psmdb3.csr
openssl x509 -CA mongoCA.crt -CAkey mongoCA.key -CAcreateserial -req -days 365 -in psmdb3.csr -out psmdb3.crt
cat psmdb3.key psmdb3.crt > psmdb3.pem

4 – Place the files

We could execute all of the commands in the previous step on the same host, but now we need to copy the generated files to the proper nodes:

  • Copy to each node the CA certifcate file: mongoCA.crt
  • Copy each self signed certifcate <hostname>.pem into the relative member
  • Create on each member a directory that only the MongoDB user can read, and copy both files there
sudo mkdir -p /etc/mongodb/ssl
sudo chmod 700 /etc/mongodb/ssl
sudo chown -R mongod:mongod /etc/mongodb
sudo cp psmdb1.pem /etc/mongodb/ssl
sudo cp mongoCA.crt /etc/mongodb/ssl

Do the same on each host.

5 – Configure mongod

Finally, we need to instruct mongod about the certificates to enable the encryption.

Change the configuration file /etc/mongod.conf on each host adding the following rows:

   port: 27017
      mode: requireSSL
      PEMKeyFile: /etc/mongodb/ssl/psmdb1.pem
      CAFile: /etc/mongodb/ssl/mongoCA.crt
      clusterFile: /etc/mongodb/ssl/psmdb1.pem
      authorization: enabled
      clusterAuthMode: x509

Restart the daemon

sudo service mongodb restart

Make sure to put the proper file names on each host (psmdb2.pem on psmdb2 host and so on)

Now, as long as we have made no mistakes, we have a properly configured replica set that is using encrypted connections.

Issue the following command to connect on node psmdb1:

mongo admin --ssl --sslCAFile /etc/mongodb/ssl/mongoCA.crt
--sslPEMKeyFile /etc/mongodb/ssl/psmdb1.pem
-u admin -p secret --host psmdb1

Access the first two articles in this series

  • Part 1: Introduces basic replica set concepts, how it works and what its main features
  • Part 2:  Provides a step-by-step guide to configure a three-node replica set

The post MongoDB: deploy a replica set with transport encryption (part 3/3) appeared first on Percona Database Performance Blog.


Deploy a MongoDB Replica Set with Transport Encryption (Part 2)


In this article series, we will talk about the basic high availability architecture of a MongoDB: the MongoDB replica set.

  • Part 1 : We introduced basic replica set concepts, how it works and what its main features
  • Part 2 (this post): We’ll provide a step-by-step guide to configure a three-node replica set
  • Part 3: We’ll talk about how to configure transport encryption between the nodes

In part 1 we introduced and described the main features of a MongoDB replica set. In this post, we are going to present a step-by-step guide to deploy a basic and fully operational 3-nodes replica set. We’ll use just regular members, all with priority=1, no arbiter, no hidden or delayed nodes.

The environment

Our example environment is 3 virtual hosts with Ubuntu 16.04 LTS, although the configuration is the same with CentOS or other Linux distributions.

We have installed Percona Server for MongoDB on each node. Hostnames and IPs are:

  • psmdb1 :
  • psmdb2 :
  • psmdb3 :

It is not the goal of this post to provide installation details, but in case you need them you can follow this guide: MongoDB installation from the repository is very easy.


Once we have all the nodes with MongoDB installed, we just need to be sure that each one is accessible by all the others on port 27017, the default port.

Since our members are on the same network we can simply try to test the connectivity between each pair of nodes, connecting the mongo client from one node to each of the others.

psmdb1> mongo --host --port 27017
psmdb1> mongo --host --port 27017
psmdb2> mongo --host --port 27017
psmdb2> mongo --host --port 27017
psmdb3> mongo --host --port 27017
psmdb3> mongo --host --port 27017

If the mongo client is not able to connect, we need to check the network configuration, or to configure or disable the firewall.


Configuring the hostnames into our hosts is not mandatory for the replica set. In fact you can configure the replica set using just the IPs and it’s fine. But we need to define the hostnames because they will be very useful when we discuss how to configure internal encryption in Part 3.

We need to ensure that each member is accessible by way of resolvable DNS or hostnames.

Set up each node in the /etc/hosts file

root@psmdb1:~# cat /etc/hosts       localhost  psmdb1  psmdb2  psmdb3

Choose a name for the replica set

We are now close to finalizing the configuration.

Now we have to choose a name for the replica set. We need to choose one and put t on each member’s configuration file. Let’s say we decide to use rs-test.

Put the replica set name into /etc/mongod.conf (the MongoDB configuration file) on each host. Enter the following:

     replSetName: "rs-test"

Restart the server:

sudo service mongod restart

Remember to do this on all the nodes.

That’s all we need to do to configure the replication at its most basic. There are obviously other configuration parameters we could set, but maybe we’ll talk about them in another post when discussing more advanced features. For this basic deployment we can assume that all the default values are good enough.

Initiate replication

Now we need to connect to one of the nodes. It doesn’t matter which, just choose one of them and launch mongo shell to connect to the local mongod instance.

Then issue the rs.initiate() command to let the replica set know what all the members are.

mongo> rs.initiate( {
      ... _id: “rs-test”,
      ... members: [
      ... { _id: 0, host: “psmdb1:27017” },
      ... { _id: 1, host: “psmdb2:27017” },
      ... { _id: 2, host: “psmdb3:27017” }
      ... ] })

After issuing the command, MongoDB initiates the replication process using the default configuration. A PRIMARY node is elected and all the documents will be created by now will be asynchronously replicated on the SECONDARY nodes.

We don’t need to do any more. The replica set is now working.

We can verify that the replication is working by taking a look at the mongo shell prompt. Once the replica set is up and running the prompt should be like this on the PRIMARY node:


and like this on the SECONDARY nodes:


MongoDB lets you know the replica role of the node that you are connected to.

A couple of useful commands

There are several commands to investigate and to do some administrative tasks on the replica set. Here are a couple of them.

To investigate the replica set configuration you can issue rs.conf() on any node

rs-test:PRIMARY> rs.conf()
 "_id" : "rs-test",
 "version" : 68835,
 "protocolVersion" : NumberLong(1),
 "members" : [
 "_id" : 0,
 "host" : "psmdb1:27017",
 "arbiterOnly" : false,
 "buildIndexes" : true,
 "hidden" : false,
 "priority" : 1,
 "tags" : {
 "slaveDelay" : NumberLong(0),
 "votes" : 1
 "_id" : 1,
 "host" : "psmdb2:27017",
 "arbiterOnly" : false,
 "buildIndexes" : true,
 "hidden" : false,
 "priority" : 1,
 "tags" : {
 "slaveDelay" : NumberLong(0),
 "votes" : 1
 "_id" : 2,
 "host" : "psmdb3:27017",
 "arbiterOnly" : false,
 "buildIndexes" : true,
 "hidden" : false,
 "priority" : 1,
 "tags" : {
 "slaveDelay" : NumberLong(0),
 "votes" : 1
 "settings" : {
 "chainingAllowed" : true,
 "heartbeatIntervalMillis" : 2000,
 "heartbeatTimeoutSecs" : 10,
 "electionTimeoutMillis" : 10000,
 "catchUpTimeoutMillis" : 60000,
 "getLastErrorModes" : {
 "getLastErrorDefaults" : {
 "w" : 1,
 "wtimeout" : 0
 "replicaSetId" : ObjectId("5aa2600d377adb63d28e7f0f")

We can see information about the configured nodes, whether arbiter or hidden, the priority, and other details regarding the heartbeat process.

To investigate the replica set status you can issue rs.status() on any node

rs-test:SECONDARY> rs.status()
 "set" : "rs-test",
 "date" : ISODate("2018-05-14T10:16:05.228Z"),
 "myState" : 2,
 "term" : NumberLong(47),
 "syncingTo" : "psmdb3:27017",
 "heartbeatIntervalMillis" : NumberLong(2000),
 "optimes" : {
 "lastCommittedOpTime" : {
 "ts" : Timestamp(1526292954, 1),
 "t" : NumberLong(47)
 "appliedOpTime" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "durableOpTime" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "members" : [
 "_id" : 0,
 "name" : "psmdb1:27017",
 "health" : 1,
 "state" : 2,
 "stateStr" : "SECONDARY",
 "uptime" : 392,
 "optime" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "optimeDate" : ISODate("2018-05-14T10:16:04Z"),
 "syncingTo" : "psmdb3:27017",
 "configVersion" : 68835,
 "self" : true
 "_id" : 1,
 "name" : "psmdb2:27017",
 "health" : 1,
 "state" : 1,
 "stateStr" : "PRIMARY",
 "uptime" : 379,
 "optime" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "optimeDurable" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "optimeDate" : ISODate("2018-05-14T10:16:04Z"),
 "optimeDurableDate" : ISODate("2018-05-14T10:16:04Z"),
 "lastHeartbeat" : ISODate("2018-05-14T10:16:04.832Z"),
 "lastHeartbeatRecv" : ISODate("2018-05-14T10:16:03.318Z"),
 "pingMs" : NumberLong(0),
 "electionTime" : Timestamp(1526292592, 1),
 "electionDate" : ISODate("2018-05-14T10:09:52Z"),
 "configVersion" : 68835
 "_id" : 2,
 "name" : "psmdb3:27017",
 "health" : 1,
 "state" : 2,
 "stateStr" : "SECONDARY",
 "uptime" : 378,
 "optime" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "optimeDurable" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "optimeDate" : ISODate("2018-05-14T10:16:04Z"),
 "optimeDurableDate" : ISODate("2018-05-14T10:16:04Z"),
 "lastHeartbeat" : ISODate("2018-05-14T10:16:04.832Z"),
 "lastHeartbeatRecv" : ISODate("2018-05-14T10:16:04.822Z"),
 "pingMs" : NumberLong(0),
 "syncingTo" : "psmdb2:27017",
 "configVersion" : 68835
 "ok" : 1

Here we can see for example if nodes are reachable and are running, but in particular we can see the role they have at this moment: which is the PRIMARY and which are SECONDARY

Test replication

Finally, let’s try to test that the replication process is really working as expected.

Connect to the PRIMARY node and create a sample document:

rs-test:PRIMARY> use test
switched to db test
rs-test:PRIMARY> {name:"Bruce", surname:"Dickinson"} )
WriteResult({ "nInserted" : 1 })
    "_id" : ObjectId("5ae05ac27e6680071caf94b7")
    "name" : "Bruce"
    "surname" : "Dickinson"

Then connect to a SECONDARY node and look for the same document.

Remember that you can’t connect to the SECONDARY node to read the data. By default reads and writes are allowed only on the PRIMARY. So, if you want to read data on a SECONDARY node, you first need to issue the rs.slaveOK() command. If you don’t do this you will receive an error.

rs-test:SECONDARY> rs.slaveOK()
rs-test:SECONDARY> show collections
     "_id" : ObjectId("5ae05ac27e6680071caf94b7")
     "name" : "Bruce"
     "surname" : "Dickinson"

As we can see, the SECONDARY node has replicated the creation of the collection foo and the inserted document.

This simple test demonstrates that the replication process is working as expected.

There are more sophisticated features to investigate the replica set, and for troubleshooting, but discussing them it’s not in the scope of this post.

In Part 3, we’ll show how to encrypt the internal replication process we have deployed so far.

Read the first post of this series: Deploy a MongoDB Replica Set with Transport Encryption

The post Deploy a MongoDB Replica Set with Transport Encryption (Part 2) appeared first on Percona Database Performance Blog.


Don’t Get Hit with a Database Disaster: Database Security Compliance

Percona Live 2018 security talks

In this post, we discuss database security compliance, what you should be looking at and where to get more information.

As Percona’s Chief Customer Officer, I get the opportunity to talk with a lot of customers. Hearing about the problems that both their technical teams face, as well as the business challenges their companies experience first-hand is incredibly valuable in terms of what the market is facing in general. Not every problem you see has a purely technical solution, and not every good technical solution solves the core business problem.

Matt Yonkovit, Percona CCOAs database technology advances and data continues to be the core blood of most modern applications, DBA’s will have a say in business level strategic planning more than ever. This coincides with the advances in technology and automation that make many classic manual “DBA” jobs and tasks obsolete. Traditional DBA’s are evolving into a blend of system architect, data strategist and master database architect. I want to talk about the business problems that not only the C-Suite care about, but DBAs as a whole need to care about in the near future.

Let’s start with one topic everyone should have near the top of their list: security.

We did a recent survey of our customers, and their biggest concern right now is security and compliance.

Not long ago, most DBA’s I knew dismissed this topic as “someone else’s problem” (I remember being told that the database is only as secure as the network, so fix the network!). Long gone are the days when network security was enough. Even the DBA’s who did worry about security only did so within the limited scope of what the database system could provide out of the box.  Again, not enough.

So let me run an experiment:

Raise your hand if your company has some bigger security initiative this year. 

I’m betting a lot of you raised your hand!

Security is not new to the enterprise. It’s been a priority for years now. However, it has not been receiving a hyper-focus in the open source database space until the last three years or so. Why? There have been a number of high profile database security breaches in the last year, all highlighting a need for better database security. This series of serious data breaches have exposed how fragile some security protocols in companies are. If that was not enough, new government regulations and laws have made data protection non-optional. This means you have to take the security of your database seriously, or there could be fines and penalties.

Percona Live 2018 security talksGovernment regulations are nothing new, but the breadth and depth of these are growing and are opening up a whole new challenge for databases systems and administrators. GDPR was signed into law two years ago (you can read more here: and and is scheduled to take effect on May 25, 2018. This has many businesses scrambling not only to understand the impact, but figure out how they need to comply. These regulations redefine simple things, like what constitutes “personal data” (for instance, your anonymous buying preferences or location history even without your name).

New requirements also mean some areas get a bit more complicated as they approach the gray area of definition. For instance, GDPR guarantees the right to be forgotten. What does this mean? In theory, it means end-users can request that all their personal information is removed from your systems as if they did not exist. Seems simple, but in reality, you can go as far down the rabbit hole as you want. Does your application support this already? What about legacy applications? Even if the apps can handle it, does this mean previously taken database backups have to forget you as well? There is a lot to process for sure.

So what are the things you can do?

  1. Educate yourself and understand expectations, even if you weren’t involved in compliance discussions before.
  2. Start working on incremental improvements now on your data security. This is especially true in the area’s where you have some control, without massive changes to the application. Encryption at rest is a great place to start if you don’t have it.
  3. Start talking with others in the organization about how to identify and protect personal information.
  4. Look to increase security by default by getting involved in new applications early in the design phase.

The good news is you are not alone in tackling this challenge. Every company must address it. Because of this focus on security, we felt strongly about ensuring we had a security track at Percona Live 2018 this year. These talks from Fastly, Facebook, Percona, and others provide information on how companies around the globe are tackling these security issues. In true open source fashion, we are better when we learn and grow from one another.

What are the Percona Live 2018 security talks?

We have a ton of great security content this year at Percona Live, across a bunch of technologies and open source software. Some of the more interesting Percona Live 2018 security talks are:

Want to attend Percona Live 2018 security talks? Register for Percona Live 2018. Register now to get the best price! Use the discount code SeeMeSpeakPL18 for 10% off.

Percona Live Open Source Database Conference 2018 is the premier open source event for the data performance ecosystem. It is the place to be for the open source community. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference will be April 23-25, 2018 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.


ProxySQL Firewalling

ProxySQL Firewalling

ProxySQL FirewallingIn this blog post, we’ll look at ProxySQL firewalling (how to use ProxySQL as a firewall).

Not long ago we had an internal discussion about security, and how to enforce a stricter set of rules to prevent malicious acts and block other undesired queries. ProxySQL came up as a possible tool that could help us in achieving what we were looking for. Last year I wrote about how to use ProxySQL to stop a single query.

That approach may be good for few queries and as a temporary solution. But what can we do when we really want to use ProxySQL as an SQL-based firewall? And more importantly, how to do it right?

First of all, let us define what “right” can be in this context.

For right I mean an approach that allows us to have rules matching as specifically as possible, and impacting the production system as little as possible.

To make this clearer, let us assume I have three schemas:

  • Shakila
  • World
  • Windmills

I want to have my firewall block/allow SQL access independently by each schema, user, eventually by source, and so on.

There are a few case where this is not realistic, like in SaaS setups where each schema represents a customer. In this case, the application will have exactly the same kind of SQL – just pointing to different schemas depending the customer.

Using ProxySQL

Anyhow… ProxySQL allows you to manage query firewalling in a very simple and efficient way using the query rules.

In the mysql_query_rules table, we can define a lot of important things – one being setting our SQL firewall.


Let us take a look to the mysql_query_rules table:

    active INT CHECK (active IN (0,1)) NOT NULL DEFAULT 0,
    username VARCHAR,
    schemaname VARCHAR,
    client_addr VARCHAR,
    proxy_addr VARCHAR,
    proxy_port INT,
    digest VARCHAR,
    match_digest VARCHAR,
    match_pattern VARCHAR,
    negate_match_pattern INT CHECK (negate_match_pattern IN (0,1)) NOT NULL DEFAULT 0,
    re_modifiers VARCHAR DEFAULT 'CASELESS',
    flagOUT INT,
    replace_pattern VARCHAR,
    destination_hostgroup INT DEFAULT NULL,
    cache_ttl INT CHECK(cache_ttl > 0),
    reconnect INT CHECK (reconnect IN (0,1)) DEFAULT NULL,
    timeout INT UNSIGNED,
    retries INT CHECK (retries>=0 AND retries <=1000),
    delay INT UNSIGNED,
    next_query_flagIN INT UNSIGNED,
    mirror_flagOUT INT UNSIGNED,
    mirror_hostgroup INT UNSIGNED,
    error_msg VARCHAR,
    OK_msg VARCHAR,
    sticky_conn INT CHECK (sticky_conn IN (0,1)),
    multiplex INT CHECK (multiplex IN (0,1,2)),
    log INT CHECK (log IN (0,1)),
    apply INT CHECK(apply IN (0,1)) NOT NULL DEFAULT 0,
    comment VARCHAR)

We can define rules around almost everything: connection source, port, destination IP/Port, user, schema, SQL text or any combination of them.

Given we may have quite a large set of queries to manage, I prefer to logically create “areas” around which add the rules to manage SQL access.

For instance, I may decide to allow a specific set of SELECTs to my schema windmills, but nothing more.

Given that, I allocate the set of rule IDs from 100 to 1100 to my schema, and add my rules in three groups.

  1. The exception that will bypass the firewall
  2. The blocking rule(s) (the firewall)
  3. The managing rules (post-processing, like sharding and so on)

There is a simple thing to keep in mind when you design rules for firewalling: do you need post-processing of the query or not?

In the case that you DON’T need post-processing, the rule can simply apply and exit the QueryProcessor. That is probably the most common scenario, and read/write splits can be defined in the exception rules assigned to the rule for the desired HostGroup.

If you DO need post-processing, the rule MUST have apply=0 and the FLAGOUT must be defined. That allows you to have additional actions once the query is beyond the firewall. An example is in case of sharding, where you need to process the sharding key/comment or whatever.

I will use the simple firewall scenario, given this is the topic of the current article.

The rules

Let us start with the easy one, set 2, the blocking rule:

insert into mysql_query_rules (rule_id,username,schemaname,match_digest,error_msg,active,apply) values(1000,'pxc_test','windmills','.','You cannot pass.....I am a servant of the Secret Fire, wielder of the flame of Anor,. You cannot pass.',1, 1);

In this query rule, I had defined the following:

  • User connecting
  • Schema name
  • Any query
  • Message to report
  • Rule_id

That rule will block ANY query that tries to access the schema windmills from application user pxc_test.

Now in set 1, I will add all the rules I want to let pass. I will report here one only, but all can be found in GitHub here (

insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,schemaname,active,retries,apply,flagout,match_digest) values(101,6033,'pxc_test',52,'windmills',1,3,1,1000,'SELECT,wmillAUTOINC.millid,wmillAUTOINC.location FROM wmillAUTOINC WHERE wmillAUTOINC.millid=.* and*');

That is quite simple and straightforward, but there is an important element that you must note. In this rule, apply must have value of =1 always, to allow the query rule to bypass without further delay the firewall.

(Side Note: if you need post-processing, the flagout needs to have a value (like flagout=1000) and apply must be =0. That allows the query to jump to set 3, the managing rules.)

This is it, ProxySQL will go to the managing rules as soon as it finds a matching rule that allows the application to access my database/schema, or it will exit if apply=1.

A graph will help to understand better:

Rule set 3 has the standard query rules to manage what to do with the incoming connection, like sharding or redirecting SELECT FOR UPDATE, and so on:

insert into mysql_query_rules (rule_id,proxy_port,schemaname,username,destination_hostgroup,active,retries,match_digest,apply,flagin) values(1040,6033,'windmills','pxc_test',50,1,3,'^SELECT.*FOR UPDATE',1,1000);

Please note the presence of the flagin, which matches the flagout above.

Setting rules, sometimes thousands of them, can be very confusing. It is very important to correctly plan what should be in as an excluding rule and what should not. Do not rush, take your time and identify the queries you need to manage carefully.

Once more ProxySQL can help us. Querying the table stats_mysql_query_digest tells us exactly what queries were sent to ProxySQL:

admin@ [main]>select hostgroup,schemaname,digest,digest_text,count_star from stats_mysql_query_digest where schemaname='windmills' order by count_star desc;

The above query shows us all the queries hitting the windmills schema. From there we can decide which queries we want to pass and which not.

>select hostgroup,schemaname,digest,digest_text,count_star from stats_mysql_query_digest where schemaname='windmills' order by count_star desc  limit 1G
*************************** 1. row ***************************
  hostgroup: 50
 schemaname: windmills
     digest: 0x18CA8FF2C9C53276
 count_star: 141

Once we have our set done (check on github for an example), we are ready to check how our firewall works.

By default, I suggest you to keep all the exceptions (in set 1) with active=0, just to test the firewall.

For instance, my application generates the following exception:

com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You cannot pass.....I am a servant of the Secret Fire, wielder of the flame of Anor,. You cannot pass.
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
	at java.lang.reflect.Constructor.newInstance(
	at com.mysql.jdbc.Util.handleNewInstance(
	at com.mysql.jdbc.Util.getInstance(
	at com.mysql.jdbc.SQLError.createSQLException(
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(
	at com.mysql.jdbc.MysqlIO.sendCommand(
	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(
	at com.mysql.jdbc.ConnectionImpl.execSQL(
	at com.mysql.jdbc.ConnectionImpl.execSQL(
	at com.mysql.jdbc.StatementImpl.executeQuery(

Activating the rules, will instead allow your application to work as usual.

What is the impact?

First, let’s define the baseline by running the application without any rule blocking (but only the r/w split (set 3)).


Queries/sec global

Using two application servers:

  • Server A: Total Execution time = 213
  • Server B: Total Execution time = 209

Queries/sec per server

As we can see, queries are almost equally distributed.

QueryProcessor time taken/Query processed total

All queries are processed by QueryProcessor in ~148ms AVG (total).

QueryProcessor efficiency per query

The single query cost is in nanoseconds (avg 10 us).

Use match_digest

Once we’ve defined the baseline, we can go ahead and activate all the rules using the match_digest. Run the same tests again and… :

Using two application servers:

  • Server A: Total Execution time = 207
  • Server B: Total Execution time = 204

First of all, we notice that the execution time did not increase. This is mainly because we have CPU cycles to use in the ProxySQL box:

Here we have a bit of unbalance. We will investigate that in a separate session, but all in all, time/effort looks ok:

Here we have the first thing to notice. Comparing this to the baseline we defined, we can see that using the rules as match_digest significantly increased the execution time to 458ms:

Also notice that if we are in the range of nanoseconds, the cost of processing the query is now three times that of the baseline. Not too much, but if you add a stone to another stone and another stone and another stone … you end up building a huge wall.

So, what to do? Up to now, we saw that managing the firewall with ProxySQL is easy and it can be set at very detailed level – but the cost may not be what we expect it to be.

What can be done? Use DIGEST instead

The secret is to not use match_digest (which implies interpretation of the string) but to use the DIGEST of the query (which is calculated ahead and remains constant for that query).

Let us see what happens if we run the same load using DIGEST in the MYSQL_QUERY_RULES table:

Using two application servers:

  • Server A: Total Execution time = 213
  • Server B: Total Execution time = 209

No, this is not an issue with cut and paste. I had more or less the same execution time as without rules, at the seconds (different millisecond though):

Again, there is some unbalance, but a minor thing:

And we drop to 61ms for execution of all queries. Note that we improve the efficiency of the Query Processor from 148ms AVG to 61ms AVG.

Why? Because our rules using the DIGEST also have the instructions for read/write split, so requests can exit the Query Processor with all the information required at this stage (much more efficient).

Finally, when using the DIGEST the cost for query drops to 4us which is … LOW!

That’s it! ProxySQL using the DIGEST field from mysql_query_rules performs much better given that it doesn’t need to analyze the whole SQL string with regular expressions – it just matches the DIGEST.


ProxySQL can be effectively used as an SQL firewall, but some best practices should be taken in to consideration. First of all, try to use specific rules and be specific on what should be filtered/allowed. Use filter by schema or user or IP/port or combination of them. Always try to avoid match_digest and use digest instead. That allows ProxySQL to bypass the call to the regularExp lib and is far more efficient. Use stats_mysql_query_digest to identify the correct DIGEST.

Regarding this, it would be nice to have a GUI interface that allows us to manage these rules. That would make the usage of the ProxySQL much easier, and the maintenance/creation of rule_chains friendlier.


MySQL community set to meet at Percona Live London 2014

The countdown is on for Europe’s largest annual MySQL event, Percona Live London 2014. The two days of technical tutorials and sessions, November 3-4, will focus on the latest MySQL industry trends, news, best practices – and a look at what’s on the near- and long-term horizon within the global MySQL ecosystem.

Percona Live London 2014 will bring attendees up to date on key areas including MySQL 5.7, database security, database as a service (DBaaS), Hadoop and high availability (HA), disaster recovery, replication and backup, performance and scalability, WebScaleSQL and much, much more.

Team Oracle will be in London, led by Tomas Ulin, vice president of Oracle’s MySQL engineering team, who will explain why MySQL just keeps getting better with the latest news for the MySQL Database, MySQL Cluster, MySQL Workbench… and more. Oracle’s Luis Soares, principle software engineer, and Andrew Morgan, MySQL HA product management, will provide insight into what’s in the latest 5.7 development milestone release and also what’s going on in the labs… particularly around MySQL replication. Seize the opportunity to learn how to leverage MySQL 5.7 replication to grow your business from their session, “MySQL Replication: What’s New in 5.7 and Beyond.”

If anything keeps DBAs up at night it’s database security – especially with recent revelations of vulnerabilities like the POODLE SSLv3 security flaw (CVE-2014-3566) and “Bash Bug,” also known as Shellshock (CVE-2014-6271). Attendees will have the opportunity to talk face-to-face with database security expert David Busby of Percona, who will also lead a session titled, “Security it’s more than just your database you should worry about.”

The official Percona Live London 2014 t-shirt!

The official Percona Live London 2014 t-shirt!
(Click image for larger view)

Observe how to incorporate semi-synchronous replication to achieve failover – without data loss. Facebook’s Yoshinori Matsunobu and Santosh Banda will share how they did it at scale (across data centers) by extending MySQL internals along with some handy new self-made tools.

Meet the next-generation C connector for MySQL: libAttachSQL. It’s a new lightweight async C connector library for MySQL being developed from scratch by HP’s Advanced Technology Group. Andrew Hutchings, principal software engineer at Hewlett-Packard, will be on hand to share the latest on libAttachSQL.

Successful applications often become limited by MySQL performance. But tracking down and fixing those issues can be a huge drain on time and resources. Unless you think smart – spending time on what gives you the best return. Percona CEO Peter Zaitsev will explain how in his session, “Practical MySQL Performance Optimization.”

Percona Live London attendees will also learn from the real-life experiences of MySQL experts who share case studies. Shake hands with Art van Scheppingen, head of database engineering at Spil Games, who will explain how to serve out any page with an HA Sphinx environment.

Save yourself a quarter century by absorbing Tim Callaghan’s MySQL performance benchmarking tips, tricks and lessons learned. Tim, vice president of engineering at Tokutek, with share what he’s learned in the past 25 years maintaining the performance of database applications.

And of course there will be a MySQL community dinner! But be sure to register now for the dinner because space is limited – especially if you want to enjoy a ride to the restaurant on a vintage double-decker London bus (you do not need to attend the conference to join the dinner).

Register now for Percona Live London 2014 and save £30 with discount code “MPB30“. See you in London!

The post MySQL community set to meet at Percona Live London 2014 appeared first on MySQL Performance Blog.


Database security: Why should you review yours?

Ah database security… the black sheep of topics and something you would really rather not have to deal with right?

I mean surely all the fanfare and paranoia is reserved for the neck beards with tinfoil hats whom live in their own D.I.Y Faraday cage … that must be it … it just has to be?

Database Security and why you need to review yoursNo, the hard reality is the world is not rose tinted and “they” are out to get you be it for fun or for profit; from defacements to theft compromising your applications, and more importantly your data is big business. For some these acts are nothing short of sheer entertainment for an otherwise boring evening. (I’ll be speaking about this topic next week in much more detail at the Percona Live MySQL Conference and Expo in Santa Clara, California. My session, “Security and why you need to review yours” will go into much more detail on April 2. Use the code “SeeMeSpeak” on the registration page and save 10 percent.)

Note I’m avoiding discussing corporate espionage/government spying, as this re-enforces the image of it all being cloak and dagger; admittedly some of it is … the part that’s going to affect you, your business, your livelihood isn’t.

It’s time for that wake-up caffeine infusion and drink the “kool aid” – this is not something you can shun and pretend it’s a monster under the bed/in the closet that doesn’t really exist. Unlike the “bogey man” these threats are real, and have real impacts to peoples livelihood.

F.U.D? The fear part I certainly am wanting to portray here; a level of Fear is healthy;  it keeps you alert, the uncertainty and doubt? No these should and will be removed so please allow me to continue.

Removing Uncertainty
As with anything that requires research I’m sure you the reader would carry out proper “Due Diligence.” You wouldn’t want anything to adversely affect your product/business, taking hosting as an example you’ll look at the providers reputation their S.L.A. agreements etc.

What about their Security credentials? PCI / SOX / HIPAA … there’s numerous classifications.

“But I don’t need PCI / SOX / HIPAA!” to this I say you need a compromise of your application/business even less… what’s the harm in asking the provider if they have been through any regulatory compliance? I just don’t get the stigma some people seem to feel when asking a question related to security. Remember when deploying your application you’re building upon your hosting providers infrastructure.

“A foolish man who built his house on sand” in short if your foundations are not sound you’re opening yourself up to failure and compromise, an example Paypal and godaddy socially engineered resulting in the loss of a $50K twitter username and Barclays £1.37M theft both of which are due to the same level of Failure, the term for which is “Social Engineering” … which really is just a new term for conning someone into doing what you want them to do.

“The art of the con” is hardly anything new; and has been around for centuries take Victor Lustig whose infamous example of con artistry was to sell the Eiffel tower for scrap … twice.

Dispelling Doubt
“By failing to prepare you are preparing to fail” – Benjamin Franklin

Let’s look at this a little more with some common misconceptions.

“I don’t really need to look at security, my project business is small and will not be attacked.” I’d liken this statement to saying you don’t need seat belts and air bags because you’re a careful driver; driving at night with no lights on because “I have good night vision.” You have safety and security measures in your everyday life which because they fall as part of the routine are not thought about: locks on doors, car/home/business alarms, cctv, gps locators for phones/cars/tablets/laptops … we need to eliminate this thinking that information security is anything other than a requirement which should form part of our every day “norms”.

“Security is too expensive.” Have you looked at the cost of a compromise of your system, how much is the potential loss of your entire customer base worth to you? …not looking quiet so expensive now is it? Liken an investment in security to an investment in High Availability: you want your application to be “always on” … why do many think “secure” is prohibitively expensive to achieve?

“We simply don’t have the resources to implement security measures.” Yet you have the resources for development, DBA’s, sysadmins? One of the best ways to introduce security into your application is “from the ground up,” so that it becomes part of your general practise – this requires a “state of mind” orientated toward security.

What many fail to realize is assuming your business is successful you already have a state of mind orientated to best practises which work for your business to produce an application/service of value; minor tweaking to this could also introduce a mindset of security.

Remediation – “the action of remedying something, in particular of reversing or stopping environmental damage.”

It’s not going to be a painful or expensive as you may think; the first most powerful step is a minor change of development/sysadmin attitudes to consider security implications of code/services/configurations, let’s bring back the healthy attitude of asking questions.

Do I really need to disable SELinux to get this to work? – the answer is of course no you shouldn’t, this should be the same vein of thought of do I really need to chmod this 777 to make it work?

Does this service really need to be installed? – e.g. bluetoothd doesn’t need to be on your production machines.

We’re adding a user input form, we should really sanitize the input – seems obvious to most now, though this was met with just as much “resistance to change” before it became a best practise standard.

Does MySQL really need to be accessible from everywhere on the internet? – again may seem obvious to most now, though this was and sometimes still met with resistance of “I may need to run queries from home, off my mobile, from the open wifi at the local coffee shop …” (those of a security orientated nature I apologize for this statement and I can sense the cringing now in progress as this is read …).

The above is just a small example of reducing your attack surface. Your attack surface being the potential entry points into your system/network/application which can be potentially attacked.

The thinking behind the need to invest a lot of money into some expensive appliance / magic solution is for the most part misguided and throwing a “security blanket” over or in front of your application isn’t going to be as effective as ensuring consideration of security at every layer of your business.

Over the coming months I will be working on more related blogs, webinars – and a reminder that I’ll also be giving a talk on “Security and why you need to review yours” at Percona Live in Santa Clara this April 2nd.

The post Database security: Why should you review yours? appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by