Another Day, Another Data Leak

another day another data leak Exactis

another day another data leak ExactisIn the last few days, there has been information released about yet another alleged data leak, placing in jeopardy “…[the] personal information on hundreds of millions of American adults, as well as millions of businesses.” In this case, the “victim” was Exactis, for whom data collection and data security are core business functions.

Some takeaways from Exactis

Please excuse the pun! In security, we have few chances to chuckle. In fact, as a Security Architect, I sigh deeply when I read about this kind of issue. Firstly, it’s preventable. Secondly, I worry that if an organization like Exactis is not getting it right, what chance the rest of the world?

As the Wired article notes the tool https://shodan.io/ can be revealing and well worth a look. For example, you can see there are still MANY elasticSearch systems exposed to the public internet here. Why not use shodan to check what everyone else in the world can see openly on your systems ?

Securing databases

Databases in themselves do not need to be at risk, as long as you take the necessary precautions. We discussed this in this blog post that I co-authored last year.

In this latest alleged gaffe, as far as I can discern, had the setup made use of iptables or a similar feature then the breach could not have occurred.

With immaculate timing, my colleague Marco Tusa wrote a post last month on how to set up iptables for Percona XtraDB Cluster, and if you are not sure if or how that applies to your setup, it is definitely worth a read. In fact, you can access all of our security blog posts if you would like some more pointers.

Of course, security does not stop with iptables. Application developers should already be familiar with the need to avoid SQL injection, and there is a decent SQL injection prevention cheat sheet here, offered by The Open Web Application Security Project (OWASP). Even if you don’t fully understand the technical details, a cheat sheet like this might help you to ask the right questions for your application.

MySQL resources

For a more in-depth look at MySQL security, I have two talks up on YouTube. The first of these is a twenty-minute presentation on hardening MySQL and the second on web application security and why you really should review yours. You could also check out our recorded webinar Security and Encryption in the MySQL world presented by Dimitri Vanoverbeke.

MongoDB resources

Of course, security challenges are not unique to SQL databases. If you are a MongoDB user, this webinar MongoDB Security: Making things secure by default might be of interest to you. Or perhaps this one on using LDAP Authentication with MongoDB? Adamo Tonete presents both of these webinars.

For a more widely applicable view, you could try Colin Charles’ recent webinar too.

There are always consequences

As Exactis are no doubt discovering, managing the fallout from such a breach is a challenge. If you are not sure where you stand on security, or what you can do to improve your situation, then audit services such as those we offer could prove to be a valuable investment.

Finally, some of you will be lucky enough to have someone dedicated to IT security in your organizations. Next time you see them, instead of avoiding their steely stare, why not invite them for a coffee* and a chat? It could be enlightening!

*Beer or scotch is also almost always accepted too…

The post Another Day, Another Data Leak appeared first on Percona Database Performance Blog.


Webinar Thurs 6/14: MongoDB Backup and Recovery Field Guide

mongodb backup and recovery field guide

mongodb backup and recovery field guidePlease join Percona’s Sr. Technical Operations Architect, Tim Vaillancourt as he presents MongoDB Backup and Recovery Field Guide on Thursday, June 14, 2018, at 10:00 AM PDT (UTC-7) / 1:00 PM EDT (UTC-4).

This talk will cover backup and recovery solutions for MongoDB replica sets and clusters, focusing on online and low-impact solutions for production systems.

Register for the webinar

Tim Vaillancourt

Senior Technical Operations Architect

With experience operating infrastructures in industries such as government, online marketing/publishing, SaaS and gaming combined with experience tuning systems from the hard disk all the way up to the end-user, Tim has spent time in nearly every area of the modern IT stack with many lessons learned.

Tim is based in Amsterdam, NL and enjoys traveling, coding and music. Prior to Percona Tim was the Lead MySQL DBA of Electronic Arts’ DICE studios, helping some of the largest games in the world (“Battlefield” series, “Mirrors Edge” series, “Star Wars: Battlefront”) launch and operate smoothly while also leading the automation of MongoDB deployments for EA systems. Before the role of DBA at EA’s DICE studio, Tim served as a subject matter expert in NoSQL databases, queues and search on the Online Operations team at EA SPORTS.

Prior to moving to the gaming industry, Tim served as a Database/Systems Admin operating a large MySQL-based SaaS infrastructure at AbeBooks/Amazon Inc.

The post Webinar Thurs 6/14: MongoDB Backup and Recovery Field Guide appeared first on Percona Database Performance Blog.


MongoDB: deploy a replica set with transport encryption (part 3/3)

MongoDB Encryption Replica Sets

MongoDB Encryption Replica SetsIn this third and final post of the series, we look at how to configure transport encryption on a deployed MongoDB replica set. Security vulnerabilities can arise when internal personnel have legitimate access to the private network, but should not have access to the data. Encrypting intra-node traffic ensures that no one can “sniff” sensitive data on the network.

In part 1 we described MongoDB replica sets and how they work.
In part 2 we provided a step-by-step guide to deploy a simple 3-node replica set, including information on replica set configuration.

Enable Role-Based Access Control

In order for the encryption to be used in our replica set, we need first to activate Role-Based Access Control (RBAC). By default, a MongoDB installation permits anyone to connect and see the data, as in the sample deployment we created in part 2. Having RBAC enabled is mandatory for encryption.

RBAC governs access to a MongoDB system. Users are created and assigned privileges to access specific resources, such as databases and collections. Likewise, for carrying out administrative tasks, users need to be created with specific grants. Once activated, every user must authenticate themselves in order to access MongoDB.

Prior to activating RBAC, let’s create an administrative user. We’ll connect to the PRIMARY member and do the following:

rs-test:PRIMARY> use admin
switched to db admin
rs-test:PRIMARY> db.createUser({user: 'admin', pwd: 'secret', roles:['root']})
Successfully added user: { "user" : "admin", "roles" : [ "root" ] }

Let’s activate the RBAC in the configuration file /etc/mongod.conf on each node

      authorization: enabled

and restart the daemon

sudo service mongod restart

Now to connect to MongoDB we issue the following command:

mongo -u admin -p secret --authenticationDatabase "admin"


MongoDB supports X.509 certificate authentication for use with a secure TLS/SSL connection. The members can use X.509 certificates to verify their membership of the replica set.

In order to use encryption, we need to create certificates on all the nodes and have a certification authority (CA) that signs them. Since having a certification authority can be quite costly, we decide to use self-signed certificates. For our purposes, this solution ensures encryption and has no cost. Using a public CA is not necessary inside a private infrastructure.

To proceed with certificate generation we need to have openssl installed on our system and certificates need to satisfy these requirements:

  • any certificate needs to be signed by the same CA
  • the common name (CN) required during the certificate creation must correspond to the hostname of the host
  • any other field requested in the certificate creation should be a non-empty value and, hopefully, should reflect our organization details
  • it is also very important that all the fields, except the CN, should match those from the certificates for the other cluster members

The following guide describes all the steps to configure internal X.509 certificate-based encryption.

1 – Connect to one of the hosts and generate a new private key using openssl

openssl genrsa -out mongoCA.key -aes256 8192

We have created a new 8192 bit private key and saved it in the file mongoCA.key
Remember to enter a strong passphrase when requested.

2 – Sign a new CA certificate

Now we are going to create our “fake” local certification authority that we’ll use later to sign each node certificate.

During certificate creation, some fields must be filled out. We could choose these randomly but they should correspond to our organization’s details.

root@psmdb1:~# openssl req -x509 -new -extensions v3_ca -key mongoCA.key -days 365 -out
    Enter pass phrase for mongoCA.key:
    You are about to be asked to enter information that will be incorporated
    into your certificate request.
    What you are about to enter is what is called a Distinguished Name or a DN. There are quite a few fields but you can leave some blank
    For some fields there will be a default value,
    If you enter '.', the field will be left blank.
    Country Name (2 letter code) [AU]:US
    State or Province Name (full name) [Some-State]:California
    Locality Name (eg, city) []:San Francisco
    Organization Name (eg, company) [Internet Widgits Pty Ltd]:My Company Ltd
    Organizational Unit Name (eg, section) []:DBA
    Common Name (e.g. server FQDN or YOUR name) []:psmdb
    Email Address []:corrado@mycompany.com

3 – Issue self-signed certificates for all the nodes

For each node, we need to generate a certificate request and sign it using the CA certificate we created in the previous step.

Remember: fill out all the fields requested the same for each host, but remember to fill out a different common name (CN) that must correspond to the hostname.

For the first node issue the following commands.

openssl req -new -nodes -newkey rsa:4096 -keyout psmdb1.key -out psmdb1.csr
openssl x509 -CA mongoCA.crt -CAkey mongoCA.key -CAcreateserial -req -days 365 -in psmdb1.csr -out psmdb1.crt
cat psmdb1.key psmdb1.crt > psmdb1.pem

for the second node

openssl req -new -nodes -newkey rsa:4096 -keyout psmdb2.key -out psmdb2.csr
openssl x509 -CA mongoCA.crt -CAkey mongoCA.key -CAcreateserial -req -days 365 -in psmdb2.csr -out psmdb2.crt
cat psmdb2.key psmdb2.crt > psmdb2.pem

and for the third node

openssl req -new -nodes -newkey rsa:4096 -keyout psmdb3.key -out psmdb3.csr
openssl x509 -CA mongoCA.crt -CAkey mongoCA.key -CAcreateserial -req -days 365 -in psmdb3.csr -out psmdb3.crt
cat psmdb3.key psmdb3.crt > psmdb3.pem

4 – Place the files

We could execute all of the commands in the previous step on the same host, but now we need to copy the generated files to the proper nodes:

  • Copy to each node the CA certifcate file: mongoCA.crt
  • Copy each self signed certifcate <hostname>.pem into the relative member
  • Create on each member a directory that only the MongoDB user can read, and copy both files there
sudo mkdir -p /etc/mongodb/ssl
sudo chmod 700 /etc/mongodb/ssl
sudo chown -R mongod:mongod /etc/mongodb
sudo cp psmdb1.pem /etc/mongodb/ssl
sudo cp mongoCA.crt /etc/mongodb/ssl

Do the same on each host.

5 – Configure mongod

Finally, we need to instruct mongod about the certificates to enable the encryption.

Change the configuration file /etc/mongod.conf on each host adding the following rows:

   port: 27017
      mode: requireSSL
      PEMKeyFile: /etc/mongodb/ssl/psmdb1.pem
      CAFile: /etc/mongodb/ssl/mongoCA.crt
      clusterFile: /etc/mongodb/ssl/psmdb1.pem
      authorization: enabled
      clusterAuthMode: x509

Restart the daemon

sudo service mongodb restart

Make sure to put the proper file names on each host (psmdb2.pem on psmdb2 host and so on)

Now, as long as we have made no mistakes, we have a properly configured replica set that is using encrypted connections.

Issue the following command to connect on node psmdb1:

mongo admin --ssl --sslCAFile /etc/mongodb/ssl/mongoCA.crt
--sslPEMKeyFile /etc/mongodb/ssl/psmdb1.pem
-u admin -p secret --host psmdb1

Access the first two articles in this series

  • Part 1: Introduces basic replica set concepts, how it works and what its main features
  • Part 2:  Provides a step-by-step guide to configure a three-node replica set

The post MongoDB: deploy a replica set with transport encryption (part 3/3) appeared first on Percona Database Performance Blog.


Deploy a MongoDB Replica Set with Transport Encryption (Part 2)


In this article series, we will talk about the basic high availability architecture of a MongoDB: the MongoDB replica set.

  • Part 1 : We introduced basic replica set concepts, how it works and what its main features
  • Part 2 (this post): We’ll provide a step-by-step guide to configure a three-node replica set
  • Part 3: We’ll talk about how to configure transport encryption between the nodes

In part 1 we introduced and described the main features of a MongoDB replica set. In this post, we are going to present a step-by-step guide to deploy a basic and fully operational 3-nodes replica set. We’ll use just regular members, all with priority=1, no arbiter, no hidden or delayed nodes.

The environment

Our example environment is 3 virtual hosts with Ubuntu 16.04 LTS, although the configuration is the same with CentOS or other Linux distributions.

We have installed Percona Server for MongoDB on each node. Hostnames and IPs are:

  • psmdb1 :
  • psmdb2 :
  • psmdb3 :

It is not the goal of this post to provide installation details, but in case you need them you can follow this guide: https://www.percona.com/doc/percona-server-for-mongodb/LATEST/install/index.html MongoDB installation from the repository is very easy.


Once we have all the nodes with MongoDB installed, we just need to be sure that each one is accessible by all the others on port 27017, the default port.

Since our members are on the same network we can simply try to test the connectivity between each pair of nodes, connecting the mongo client from one node to each of the others.

psmdb1> mongo --host --port 27017
psmdb1> mongo --host --port 27017
psmdb2> mongo --host --port 27017
psmdb2> mongo --host --port 27017
psmdb3> mongo --host --port 27017
psmdb3> mongo --host --port 27017

If the mongo client is not able to connect, we need to check the network configuration, or to configure or disable the firewall.


Configuring the hostnames into our hosts is not mandatory for the replica set. In fact you can configure the replica set using just the IPs and it’s fine. But we need to define the hostnames because they will be very useful when we discuss how to configure internal encryption in Part 3.

We need to ensure that each member is accessible by way of resolvable DNS or hostnames.

Set up each node in the /etc/hosts file

root@psmdb1:~# cat /etc/hosts       localhost  psmdb1  psmdb2  psmdb3

Choose a name for the replica set

We are now close to finalizing the configuration.

Now we have to choose a name for the replica set. We need to choose one and put t on each member’s configuration file. Let’s say we decide to use rs-test.

Put the replica set name into /etc/mongod.conf (the MongoDB configuration file) on each host. Enter the following:

     replSetName: "rs-test"

Restart the server:

sudo service mongod restart

Remember to do this on all the nodes.

That’s all we need to do to configure the replication at its most basic. There are obviously other configuration parameters we could set, but maybe we’ll talk about them in another post when discussing more advanced features. For this basic deployment we can assume that all the default values are good enough.

Initiate replication

Now we need to connect to one of the nodes. It doesn’t matter which, just choose one of them and launch mongo shell to connect to the local mongod instance.

Then issue the rs.initiate() command to let the replica set know what all the members are.

mongo> rs.initiate( {
      ... _id: “rs-test”,
      ... members: [
      ... { _id: 0, host: “psmdb1:27017” },
      ... { _id: 1, host: “psmdb2:27017” },
      ... { _id: 2, host: “psmdb3:27017” }
      ... ] })

After issuing the command, MongoDB initiates the replication process using the default configuration. A PRIMARY node is elected and all the documents will be created by now will be asynchronously replicated on the SECONDARY nodes.

We don’t need to do any more. The replica set is now working.

We can verify that the replication is working by taking a look at the mongo shell prompt. Once the replica set is up and running the prompt should be like this on the PRIMARY node:


and like this on the SECONDARY nodes:


MongoDB lets you know the replica role of the node that you are connected to.

A couple of useful commands

There are several commands to investigate and to do some administrative tasks on the replica set. Here are a couple of them.

To investigate the replica set configuration you can issue rs.conf() on any node

rs-test:PRIMARY> rs.conf()
 "_id" : "rs-test",
 "version" : 68835,
 "protocolVersion" : NumberLong(1),
 "members" : [
 "_id" : 0,
 "host" : "psmdb1:27017",
 "arbiterOnly" : false,
 "buildIndexes" : true,
 "hidden" : false,
 "priority" : 1,
 "tags" : {
 "slaveDelay" : NumberLong(0),
 "votes" : 1
 "_id" : 1,
 "host" : "psmdb2:27017",
 "arbiterOnly" : false,
 "buildIndexes" : true,
 "hidden" : false,
 "priority" : 1,
 "tags" : {
 "slaveDelay" : NumberLong(0),
 "votes" : 1
 "_id" : 2,
 "host" : "psmdb3:27017",
 "arbiterOnly" : false,
 "buildIndexes" : true,
 "hidden" : false,
 "priority" : 1,
 "tags" : {
 "slaveDelay" : NumberLong(0),
 "votes" : 1
 "settings" : {
 "chainingAllowed" : true,
 "heartbeatIntervalMillis" : 2000,
 "heartbeatTimeoutSecs" : 10,
 "electionTimeoutMillis" : 10000,
 "catchUpTimeoutMillis" : 60000,
 "getLastErrorModes" : {
 "getLastErrorDefaults" : {
 "w" : 1,
 "wtimeout" : 0
 "replicaSetId" : ObjectId("5aa2600d377adb63d28e7f0f")

We can see information about the configured nodes, whether arbiter or hidden, the priority, and other details regarding the heartbeat process.

To investigate the replica set status you can issue rs.status() on any node

rs-test:SECONDARY> rs.status()
 "set" : "rs-test",
 "date" : ISODate("2018-05-14T10:16:05.228Z"),
 "myState" : 2,
 "term" : NumberLong(47),
 "syncingTo" : "psmdb3:27017",
 "heartbeatIntervalMillis" : NumberLong(2000),
 "optimes" : {
 "lastCommittedOpTime" : {
 "ts" : Timestamp(1526292954, 1),
 "t" : NumberLong(47)
 "appliedOpTime" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "durableOpTime" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "members" : [
 "_id" : 0,
 "name" : "psmdb1:27017",
 "health" : 1,
 "state" : 2,
 "stateStr" : "SECONDARY",
 "uptime" : 392,
 "optime" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "optimeDate" : ISODate("2018-05-14T10:16:04Z"),
 "syncingTo" : "psmdb3:27017",
 "configVersion" : 68835,
 "self" : true
 "_id" : 1,
 "name" : "psmdb2:27017",
 "health" : 1,
 "state" : 1,
 "stateStr" : "PRIMARY",
 "uptime" : 379,
 "optime" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "optimeDurable" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "optimeDate" : ISODate("2018-05-14T10:16:04Z"),
 "optimeDurableDate" : ISODate("2018-05-14T10:16:04Z"),
 "lastHeartbeat" : ISODate("2018-05-14T10:16:04.832Z"),
 "lastHeartbeatRecv" : ISODate("2018-05-14T10:16:03.318Z"),
 "pingMs" : NumberLong(0),
 "electionTime" : Timestamp(1526292592, 1),
 "electionDate" : ISODate("2018-05-14T10:09:52Z"),
 "configVersion" : 68835
 "_id" : 2,
 "name" : "psmdb3:27017",
 "health" : 1,
 "state" : 2,
 "stateStr" : "SECONDARY",
 "uptime" : 378,
 "optime" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "optimeDurable" : {
 "ts" : Timestamp(1526292964, 1),
 "t" : NumberLong(47)
 "optimeDate" : ISODate("2018-05-14T10:16:04Z"),
 "optimeDurableDate" : ISODate("2018-05-14T10:16:04Z"),
 "lastHeartbeat" : ISODate("2018-05-14T10:16:04.832Z"),
 "lastHeartbeatRecv" : ISODate("2018-05-14T10:16:04.822Z"),
 "pingMs" : NumberLong(0),
 "syncingTo" : "psmdb2:27017",
 "configVersion" : 68835
 "ok" : 1

Here we can see for example if nodes are reachable and are running, but in particular we can see the role they have at this moment: which is the PRIMARY and which are SECONDARY

Test replication

Finally, let’s try to test that the replication process is really working as expected.

Connect to the PRIMARY node and create a sample document:

rs-test:PRIMARY> use test
switched to db test
rs-test:PRIMARY> db.foo.insert( {name:"Bruce", surname:"Dickinson"} )
WriteResult({ "nInserted" : 1 })
rs-test:PRIMARY> db.foo.find().pretty()
    "_id" : ObjectId("5ae05ac27e6680071caf94b7")
    "name" : "Bruce"
    "surname" : "Dickinson"

Then connect to a SECONDARY node and look for the same document.

Remember that you can’t connect to the SECONDARY node to read the data. By default reads and writes are allowed only on the PRIMARY. So, if you want to read data on a SECONDARY node, you first need to issue the rs.slaveOK() command. If you don’t do this you will receive an error.

rs-test:SECONDARY> rs.slaveOK()
rs-test:SECONDARY> show collections
rs-test:SECONDARY> db.foo.find().pretty()
     "_id" : ObjectId("5ae05ac27e6680071caf94b7")
     "name" : "Bruce"
     "surname" : "Dickinson"

As we can see, the SECONDARY node has replicated the creation of the collection foo and the inserted document.

This simple test demonstrates that the replication process is working as expected.

There are more sophisticated features to investigate the replica set, and for troubleshooting, but discussing them it’s not in the scope of this post.

In Part 3, we’ll show how to encrypt the internal replication process we have deployed so far.

Read the first post of this series: Deploy a MongoDB Replica Set with Transport Encryption

The post Deploy a MongoDB Replica Set with Transport Encryption (Part 2) appeared first on Percona Database Performance Blog.


Don’t Get Hit with a Database Disaster: Database Security Compliance

Percona Live 2018 security talks

In this post, we discuss database security compliance, what you should be looking at and where to get more information.

As Percona’s Chief Customer Officer, I get the opportunity to talk with a lot of customers. Hearing about the problems that both their technical teams face, as well as the business challenges their companies experience first-hand is incredibly valuable in terms of what the market is facing in general. Not every problem you see has a purely technical solution, and not every good technical solution solves the core business problem.

Matt Yonkovit, Percona CCOAs database technology advances and data continues to be the core blood of most modern applications, DBA’s will have a say in business level strategic planning more than ever. This coincides with the advances in technology and automation that make many classic manual “DBA” jobs and tasks obsolete. Traditional DBA’s are evolving into a blend of system architect, data strategist and master database architect. I want to talk about the business problems that not only the C-Suite care about, but DBAs as a whole need to care about in the near future.

Let’s start with one topic everyone should have near the top of their list: security.

We did a recent survey of our customers, and their biggest concern right now is security and compliance.

Not long ago, most DBA’s I knew dismissed this topic as “someone else’s problem” (I remember being told that the database is only as secure as the network, so fix the network!). Long gone are the days when network security was enough. Even the DBA’s who did worry about security only did so within the limited scope of what the database system could provide out of the box.  Again, not enough.

So let me run an experiment:

Raise your hand if your company has some bigger security initiative this year. 

I’m betting a lot of you raised your hand!

Security is not new to the enterprise. It’s been a priority for years now. However, it has not been receiving a hyper-focus in the open source database space until the last three years or so. Why? There have been a number of high profile database security breaches in the last year, all highlighting a need for better database security. This series of serious data breaches have exposed how fragile some security protocols in companies are. If that was not enough, new government regulations and laws have made data protection non-optional. This means you have to take the security of your database seriously, or there could be fines and penalties.

Percona Live 2018 security talksGovernment regulations are nothing new, but the breadth and depth of these are growing and are opening up a whole new challenge for databases systems and administrators. GDPR was signed into law two years ago (you can read more here: https://en.wikipedia.org/wiki/General_Data_Protection_Regulation and https://www.dataiq.co.uk/blog/summary-eu-general-data-protection-regulation) and is scheduled to take effect on May 25, 2018. This has many businesses scrambling not only to understand the impact, but figure out how they need to comply. These regulations redefine simple things, like what constitutes “personal data” (for instance, your anonymous buying preferences or location history even without your name).

New requirements also mean some areas get a bit more complicated as they approach the gray area of definition. For instance, GDPR guarantees the right to be forgotten. What does this mean? In theory, it means end-users can request that all their personal information is removed from your systems as if they did not exist. Seems simple, but in reality, you can go as far down the rabbit hole as you want. Does your application support this already? What about legacy applications? Even if the apps can handle it, does this mean previously taken database backups have to forget you as well? There is a lot to process for sure.

So what are the things you can do?

  1. Educate yourself and understand expectations, even if you weren’t involved in compliance discussions before.
  2. Start working on incremental improvements now on your data security. This is especially true in the area’s where you have some control, without massive changes to the application. Encryption at rest is a great place to start if you don’t have it.
  3. Start talking with others in the organization about how to identify and protect personal information.
  4. Look to increase security by default by getting involved in new applications early in the design phase.

The good news is you are not alone in tackling this challenge. Every company must address it. Because of this focus on security, we felt strongly about ensuring we had a security track at Percona Live 2018 this year. These talks from Fastly, Facebook, Percona, and others provide information on how companies around the globe are tackling these security issues. In true open source fashion, we are better when we learn and grow from one another.

What are the Percona Live 2018 security talks?

We have a ton of great security content this year at Percona Live, across a bunch of technologies and open source software. Some of the more interesting Percona Live 2018 security talks are:

Want to attend Percona Live 2018 security talks? Register for Percona Live 2018. Register now to get the best price! Use the discount code SeeMeSpeakPL18 for 10% off.

Percona Live Open Source Database Conference 2018 is the premier open source event for the data performance ecosystem. It is the place to be for the open source community. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference will be April 23-25, 2018 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.


ProxySQL Firewalling

ProxySQL Firewalling

ProxySQL FirewallingIn this blog post, we’ll look at ProxySQL firewalling (how to use ProxySQL as a firewall).

Not long ago we had an internal discussion about security, and how to enforce a stricter set of rules to prevent malicious acts and block other undesired queries. ProxySQL came up as a possible tool that could help us in achieving what we were looking for. Last year I wrote about how to use ProxySQL to stop a single query.

That approach may be good for few queries and as a temporary solution. But what can we do when we really want to use ProxySQL as an SQL-based firewall? And more importantly, how to do it right?

First of all, let us define what “right” can be in this context.

For right I mean an approach that allows us to have rules matching as specifically as possible, and impacting the production system as little as possible.

To make this clearer, let us assume I have three schemas:

  • Shakila
  • World
  • Windmills

I want to have my firewall block/allow SQL access independently by each schema, user, eventually by source, and so on.

There are a few case where this is not realistic, like in SaaS setups where each schema represents a customer. In this case, the application will have exactly the same kind of SQL – just pointing to different schemas depending the customer.

Using ProxySQL

Anyhow… ProxySQL allows you to manage query firewalling in a very simple and efficient way using the query rules.

In the mysql_query_rules table, we can define a lot of important things – one being setting our SQL firewall.


Let us take a look to the mysql_query_rules table:

    active INT CHECK (active IN (0,1)) NOT NULL DEFAULT 0,
    username VARCHAR,
    schemaname VARCHAR,
    client_addr VARCHAR,
    proxy_addr VARCHAR,
    proxy_port INT,
    digest VARCHAR,
    match_digest VARCHAR,
    match_pattern VARCHAR,
    negate_match_pattern INT CHECK (negate_match_pattern IN (0,1)) NOT NULL DEFAULT 0,
    re_modifiers VARCHAR DEFAULT 'CASELESS',
    flagOUT INT,
    replace_pattern VARCHAR,
    destination_hostgroup INT DEFAULT NULL,
    cache_ttl INT CHECK(cache_ttl > 0),
    reconnect INT CHECK (reconnect IN (0,1)) DEFAULT NULL,
    timeout INT UNSIGNED,
    retries INT CHECK (retries>=0 AND retries <=1000),
    delay INT UNSIGNED,
    next_query_flagIN INT UNSIGNED,
    mirror_flagOUT INT UNSIGNED,
    mirror_hostgroup INT UNSIGNED,
    error_msg VARCHAR,
    OK_msg VARCHAR,
    sticky_conn INT CHECK (sticky_conn IN (0,1)),
    multiplex INT CHECK (multiplex IN (0,1,2)),
    log INT CHECK (log IN (0,1)),
    apply INT CHECK(apply IN (0,1)) NOT NULL DEFAULT 0,
    comment VARCHAR)

We can define rules around almost everything: connection source, port, destination IP/Port, user, schema, SQL text or any combination of them.

Given we may have quite a large set of queries to manage, I prefer to logically create “areas” around which add the rules to manage SQL access.

For instance, I may decide to allow a specific set of SELECTs to my schema windmills, but nothing more.

Given that, I allocate the set of rule IDs from 100 to 1100 to my schema, and add my rules in three groups.

  1. The exception that will bypass the firewall
  2. The blocking rule(s) (the firewall)
  3. The managing rules (post-processing, like sharding and so on)

There is a simple thing to keep in mind when you design rules for firewalling: do you need post-processing of the query or not?

In the case that you DON’T need post-processing, the rule can simply apply and exit the QueryProcessor. That is probably the most common scenario, and read/write splits can be defined in the exception rules assigned to the rule for the desired HostGroup.

If you DO need post-processing, the rule MUST have apply=0 and the FLAGOUT must be defined. That allows you to have additional actions once the query is beyond the firewall. An example is in case of sharding, where you need to process the sharding key/comment or whatever.

I will use the simple firewall scenario, given this is the topic of the current article.

The rules

Let us start with the easy one, set 2, the blocking rule:

insert into mysql_query_rules (rule_id,username,schemaname,match_digest,error_msg,active,apply) values(1000,'pxc_test','windmills','.','You cannot pass.....I am a servant of the Secret Fire, wielder of the flame of Anor,. You cannot pass.',1, 1);

In this query rule, I had defined the following:

  • User connecting
  • Schema name
  • Any query
  • Message to report
  • Rule_id

That rule will block ANY query that tries to access the schema windmills from application user pxc_test.

Now in set 1, I will add all the rules I want to let pass. I will report here one only, but all can be found in GitHub here (https://github.com/Tusamarco/blogs/tree/master/proxysql_firewall).

insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,schemaname,active,retries,apply,flagout,match_digest) values(101,6033,'pxc_test',52,'windmills',1,3,1,1000,'SELECT wmillAUTOINC.id,wmillAUTOINC.millid,wmillAUTOINC.location FROM wmillAUTOINC WHERE wmillAUTOINC.millid=.* and wmillAUTOINC.active=.*');

That is quite simple and straightforward, but there is an important element that you must note. In this rule, apply must have value of =1 always, to allow the query rule to bypass without further delay the firewall.

(Side Note: if you need post-processing, the flagout needs to have a value (like flagout=1000) and apply must be =0. That allows the query to jump to set 3, the managing rules.)

This is it, ProxySQL will go to the managing rules as soon as it finds a matching rule that allows the application to access my database/schema, or it will exit if apply=1.

A graph will help to understand better:

Rule set 3 has the standard query rules to manage what to do with the incoming connection, like sharding or redirecting SELECT FOR UPDATE, and so on:

insert into mysql_query_rules (rule_id,proxy_port,schemaname,username,destination_hostgroup,active,retries,match_digest,apply,flagin) values(1040,6033,'windmills','pxc_test',50,1,3,'^SELECT.*FOR UPDATE',1,1000);

Please note the presence of the flagin, which matches the flagout above.

Setting rules, sometimes thousands of them, can be very confusing. It is very important to correctly plan what should be in as an excluding rule and what should not. Do not rush, take your time and identify the queries you need to manage carefully.

Once more ProxySQL can help us. Querying the table stats_mysql_query_digest tells us exactly what queries were sent to ProxySQL:

admin@ [main]>select hostgroup,schemaname,digest,digest_text,count_star from stats_mysql_query_digest where schemaname='windmills' order by count_star desc;

The above query shows us all the queries hitting the windmills schema. From there we can decide which queries we want to pass and which not.

>select hostgroup,schemaname,digest,digest_text,count_star from stats_mysql_query_digest where schemaname='windmills' order by count_star desc  limit 1G
*************************** 1. row ***************************
  hostgroup: 50
 schemaname: windmills
     digest: 0x18CA8FF2C9C53276
 count_star: 141

Once we have our set done (check on github for an example), we are ready to check how our firewall works.

By default, I suggest you to keep all the exceptions (in set 1) with active=0, just to test the firewall.

For instance, my application generates the following exception:

com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You cannot pass.....I am a servant of the Secret Fire, wielder of the flame of Anor,. You cannot pass.
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
	at com.mysql.jdbc.Util.getInstance(Util.java:386)
	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1054)
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4187)
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4119)
	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2570)
	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2731)
	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2809)
	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2758)
	at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1612)
	at net.tc.stresstool.statistics.providers.MySQLStatus.getStatus(MySQLStatus.java:48)
	at net.tc.stresstool.statistics.providers.MySQLSuper.collectStatistics(MySQLSuper.java:92)
	at net.tc.stresstool.statistics.StatCollector.collectStatistics(StatCollector.java:258)
	at net.tc.stresstool.StressTool.<init>(StressTool.java:198)
	at net.tc.stresstool.StressTool.main(StressTool.java:282)

Activating the rules, will instead allow your application to work as usual.

What is the impact?

First, let’s define the baseline by running the application without any rule blocking (but only the r/w split (set 3)).


Queries/sec global

Using two application servers:

  • Server A: Total Execution time = 213
  • Server B: Total Execution time = 209

Queries/sec per server

As we can see, queries are almost equally distributed.

QueryProcessor time taken/Query processed total

All queries are processed by QueryProcessor in ~148ms AVG (total).

QueryProcessor efficiency per query

The single query cost is in nanoseconds (avg 10 us).

Use match_digest

Once we’ve defined the baseline, we can go ahead and activate all the rules using the match_digest. Run the same tests again and… :

Using two application servers:

  • Server A: Total Execution time = 207
  • Server B: Total Execution time = 204

First of all, we notice that the execution time did not increase. This is mainly because we have CPU cycles to use in the ProxySQL box:

Here we have a bit of unbalance. We will investigate that in a separate session, but all in all, time/effort looks ok:

Here we have the first thing to notice. Comparing this to the baseline we defined, we can see that using the rules as match_digest significantly increased the execution time to 458ms:

Also notice that if we are in the range of nanoseconds, the cost of processing the query is now three times that of the baseline. Not too much, but if you add a stone to another stone and another stone and another stone … you end up building a huge wall.

So, what to do? Up to now, we saw that managing the firewall with ProxySQL is easy and it can be set at very detailed level – but the cost may not be what we expect it to be.

What can be done? Use DIGEST instead

The secret is to not use match_digest (which implies interpretation of the string) but to use the DIGEST of the query (which is calculated ahead and remains constant for that query).

Let us see what happens if we run the same load using DIGEST in the MYSQL_QUERY_RULES table:

Using two application servers:

  • Server A: Total Execution time = 213
  • Server B: Total Execution time = 209

No, this is not an issue with cut and paste. I had more or less the same execution time as without rules, at the seconds (different millisecond though):

Again, there is some unbalance, but a minor thing:

And we drop to 61ms for execution of all queries. Note that we improve the efficiency of the Query Processor from 148ms AVG to 61ms AVG.

Why? Because our rules using the DIGEST also have the instructions for read/write split, so requests can exit the Query Processor with all the information required at this stage (much more efficient).

Finally, when using the DIGEST the cost for query drops to 4us which is … LOW!

That’s it! ProxySQL using the DIGEST field from mysql_query_rules performs much better given that it doesn’t need to analyze the whole SQL string with regular expressions – it just matches the DIGEST.


ProxySQL can be effectively used as an SQL firewall, but some best practices should be taken in to consideration. First of all, try to use specific rules and be specific on what should be filtered/allowed. Use filter by schema or user or IP/port or combination of them. Always try to avoid match_digest and use digest instead. That allows ProxySQL to bypass the call to the regularExp lib and is far more efficient. Use stats_mysql_query_digest to identify the correct DIGEST.

Regarding this, it would be nice to have a GUI interface that allows us to manage these rules. That would make the usage of the ProxySQL much easier, and the maintenance/creation of rule_chains friendlier.


MySQL community set to meet at Percona Live London 2014

The countdown is on for Europe’s largest annual MySQL event, Percona Live London 2014. The two days of technical tutorials and sessions, November 3-4, will focus on the latest MySQL industry trends, news, best practices – and a look at what’s on the near- and long-term horizon within the global MySQL ecosystem.

Percona Live London 2014 will bring attendees up to date on key areas including MySQL 5.7, database security, database as a service (DBaaS), Hadoop and high availability (HA), disaster recovery, replication and backup, performance and scalability, WebScaleSQL and much, much more.

Team Oracle will be in London, led by Tomas Ulin, vice president of Oracle’s MySQL engineering team, who will explain why MySQL just keeps getting better with the latest news for the MySQL Database, MySQL Cluster, MySQL Workbench… and more. Oracle’s Luis Soares, principle software engineer, and Andrew Morgan, MySQL HA product management, will provide insight into what’s in the latest 5.7 development milestone release and also what’s going on in the labs… particularly around MySQL replication. Seize the opportunity to learn how to leverage MySQL 5.7 replication to grow your business from their session, “MySQL Replication: What’s New in 5.7 and Beyond.”

If anything keeps DBAs up at night it’s database security – especially with recent revelations of vulnerabilities like the POODLE SSLv3 security flaw (CVE-2014-3566) and “Bash Bug,” also known as Shellshock (CVE-2014-6271). Attendees will have the opportunity to talk face-to-face with database security expert David Busby of Percona, who will also lead a session titled, “Security it’s more than just your database you should worry about.”

The official Percona Live London 2014 t-shirt!

The official Percona Live London 2014 t-shirt!
(Click image for larger view)

Observe how to incorporate semi-synchronous replication to achieve failover – without data loss. Facebook’s Yoshinori Matsunobu and Santosh Banda will share how they did it at scale (across data centers) by extending MySQL internals along with some handy new self-made tools.

Meet the next-generation C connector for MySQL: libAttachSQL. It’s a new lightweight async C connector library for MySQL being developed from scratch by HP’s Advanced Technology Group. Andrew Hutchings, principal software engineer at Hewlett-Packard, will be on hand to share the latest on libAttachSQL.

Successful applications often become limited by MySQL performance. But tracking down and fixing those issues can be a huge drain on time and resources. Unless you think smart – spending time on what gives you the best return. Percona CEO Peter Zaitsev will explain how in his session, “Practical MySQL Performance Optimization.”

Percona Live London attendees will also learn from the real-life experiences of MySQL experts who share case studies. Shake hands with Art van Scheppingen, head of database engineering at Spil Games, who will explain how to serve out any page with an HA Sphinx environment.

Save yourself a quarter century by absorbing Tim Callaghan’s MySQL performance benchmarking tips, tricks and lessons learned. Tim, vice president of engineering at Tokutek, with share what he’s learned in the past 25 years maintaining the performance of database applications.

And of course there will be a MySQL community dinner! But be sure to register now for the dinner because space is limited – especially if you want to enjoy a ride to the restaurant on a vintage double-decker London bus (you do not need to attend the conference to join the dinner).

Register now for Percona Live London 2014 and save £30 with discount code “MPB30“. See you in London!

The post MySQL community set to meet at Percona Live London 2014 appeared first on MySQL Performance Blog.


Database security: Why should you review yours?

Ah database security… the black sheep of topics and something you would really rather not have to deal with right?

I mean surely all the fanfare and paranoia is reserved for the neck beards with tinfoil hats whom live in their own D.I.Y Faraday cage … that must be it … it just has to be?

Database Security and why you need to review yoursNo, the hard reality is the world is not rose tinted and “they” are out to get you be it for fun or for profit; from defacements to theft compromising your applications, and more importantly your data is big business. For some these acts are nothing short of sheer entertainment for an otherwise boring evening. (I’ll be speaking about this topic next week in much more detail at the Percona Live MySQL Conference and Expo in Santa Clara, California. My session, “Security and why you need to review yours” will go into much more detail on April 2. Use the code “SeeMeSpeak” on the registration page and save 10 percent.)

Note I’m avoiding discussing corporate espionage/government spying, as this re-enforces the image of it all being cloak and dagger; admittedly some of it is … the part that’s going to affect you, your business, your livelihood isn’t.

It’s time for that wake-up caffeine infusion and drink the “kool aid” – this is not something you can shun and pretend it’s a monster under the bed/in the closet that doesn’t really exist. Unlike the “bogey man” these threats are real, and have real impacts to peoples livelihood.

F.U.D? The fear part I certainly am wanting to portray here; a level of Fear is healthy;  it keeps you alert, the uncertainty and doubt? No these should and will be removed so please allow me to continue.

Removing Uncertainty
As with anything that requires research I’m sure you the reader would carry out proper “Due Diligence.” You wouldn’t want anything to adversely affect your product/business, taking hosting as an example you’ll look at the providers reputation their S.L.A. agreements etc.

What about their Security credentials? PCI / SOX / HIPAA … there’s numerous classifications.

“But I don’t need PCI / SOX / HIPAA!” to this I say you need a compromise of your application/business even less… what’s the harm in asking the provider if they have been through any regulatory compliance? I just don’t get the stigma some people seem to feel when asking a question related to security. Remember when deploying your application you’re building upon your hosting providers infrastructure.

“A foolish man who built his house on sand” in short if your foundations are not sound you’re opening yourself up to failure and compromise, an example Paypal and godaddy socially engineered resulting in the loss of a $50K twitter username and Barclays £1.37M theft both of which are due to the same level of Failure, the term for which is “Social Engineering” … which really is just a new term for conning someone into doing what you want them to do.

“The art of the con” is hardly anything new; and has been around for centuries take Victor Lustig whose infamous example of con artistry was to sell the Eiffel tower for scrap … twice.

Dispelling Doubt
“By failing to prepare you are preparing to fail” – Benjamin Franklin

Let’s look at this a little more with some common misconceptions.

“I don’t really need to look at security, my project business is small and will not be attacked.” I’d liken this statement to saying you don’t need seat belts and air bags because you’re a careful driver; driving at night with no lights on because “I have good night vision.” You have safety and security measures in your everyday life which because they fall as part of the routine are not thought about: locks on doors, car/home/business alarms, cctv, gps locators for phones/cars/tablets/laptops … we need to eliminate this thinking that information security is anything other than a requirement which should form part of our every day “norms”.

“Security is too expensive.” Have you looked at the cost of a compromise of your system, how much is the potential loss of your entire customer base worth to you? …not looking quiet so expensive now is it? Liken an investment in security to an investment in High Availability: you want your application to be “always on” … why do many think “secure” is prohibitively expensive to achieve?

“We simply don’t have the resources to implement security measures.” Yet you have the resources for development, DBA’s, sysadmins? One of the best ways to introduce security into your application is “from the ground up,” so that it becomes part of your general practise – this requires a “state of mind” orientated toward security.

What many fail to realize is assuming your business is successful you already have a state of mind orientated to best practises which work for your business to produce an application/service of value; minor tweaking to this could also introduce a mindset of security.

Remediation – “the action of remedying something, in particular of reversing or stopping environmental damage.”

It’s not going to be a painful or expensive as you may think; the first most powerful step is a minor change of development/sysadmin attitudes to consider security implications of code/services/configurations, let’s bring back the healthy attitude of asking questions.

Do I really need to disable SELinux to get this to work? – the answer is of course no you shouldn’t, this should be the same vein of thought of do I really need to chmod this 777 to make it work?

Does this service really need to be installed? – e.g. bluetoothd doesn’t need to be on your production machines.

We’re adding a user input form, we should really sanitize the input – seems obvious to most now, though this was met with just as much “resistance to change” before it became a best practise standard.

Does MySQL really need to be accessible from everywhere on the internet? – again may seem obvious to most now, though this was and sometimes still met with resistance of “I may need to run queries from home, off my mobile, from the open wifi at the local coffee shop …” (those of a security orientated nature I apologize for this statement and I can sense the cringing now in progress as this is read …).

The above is just a small example of reducing your attack surface. Your attack surface being the potential entry points into your system/network/application which can be potentially attacked.

The thinking behind the need to invest a lot of money into some expensive appliance / magic solution is for the most part misguided and throwing a “security blanket” over or in front of your application isn’t going to be as effective as ensuring consideration of security at every layer of your business.

Over the coming months I will be working on more related blogs, webinars – and a reminder that I’ll also be giving a talk on “Security and why you need to review yours” at Percona Live in Santa Clara this April 2nd.

The post Database security: Why should you review yours? appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com