Apr
19
2018
--

Sysbench-tpcc Supports PostgreSQL (No, Really This Time)

Sysbench-tpcc Supports PostgreSQL

Sysbench-tpcc Supports PostgreSQLThis time, we really mean it when we say sysbench-tpcc supports PostgreSQL.

When I initially announced sysbench-tpcc, I mentioned it potentially could run against PostgreSQL, but it was more like wishful thinking than reality. The reality was that even though both databases speak SQL, the difference in dialects was too big and the queries written for MySQL could not run without modification on PostgreSQL.

Well, we introduced needed changes, and now you can use sysbench-tpcc with PostgreSQL. Just try the latest commit to https://github.com/Percona-Lab/sysbench-tpcc.

If you’re interested, here is a quick overview of what changes we had to make:

  1. It appears that PostgreSQL does not support the 
    tinyint

     and

    datetime

     data types. We had to use smallint and

    timestamp

     fields, even if using

    smallint

     makes the database size bigger.

  2. PostgreSQL does not have a simple equivalent for MySQL’s
    SHOW TABLES

    . The best replacement we found is

    select * from pg_catalog.pg_tables where schemaname != 'information_schema' and schemaname != 'pg_catalog'

    .

  3. PostgreSQL does not have a way to disable Foreign Key checks like MySQL:
    SET FOREIGN_KEY_CHECKS=0

    . With PostgreSQL, we needed to create and load tables in a very specific order to avoid Foreign Keys violations.

  4. PostgreSQL requires you to have a unique index name per the whole database, white MySQL requires it only per table. So instead of using:
    CREATE INDEX idx_customer ON customer1 (c_w_id,c_d_id,c_last,c_first)
    CREATE INDEX idx_customer ON customer2 (c_w_id,c_d_id,c_last,c_first)

    We need to use:

    CREATE INDEX idx_customer1 ON customer1 (c_w_id,c_d_id,c_last,c_first)
    CREATE INDEX idx_customer2 ON customer2 (c_w_id,c_d_id,c_last,c_first)
  5. PostgreSQL does not have a 
    STRAIGHT_JOIN

     hint, so we had to remove this from queries. But it is worth mentioning we use

    STRAIGHT_JOIN

     mostly as a hack to force MySQL to use a correct execution plan for one of the queries.

  6. PostgreSQL is very strict on GROUP BY queries. All fields that are not in the GROUP BY clause must use an aggregation function. So PostgreSQL complained on queries like
    SELECT d_w_id,sum(d_ytd)-w_ytd diff FROM district,warehouse WHERE d_w_id=w_id AND w_id=1 GROUP BY d_w_id

     even when we know that only single value for w_ytd is possible. We had to rewrite this query as

    SELECT d_w_id,SUM(d_ytd)-MAX(w_ytd) diff FROM district,warehouse WHERE d_w_id=w_id AND w_id=1 GROUP BY d_w_id

    .

So you can see there was some work involved when we try to migrate even a simple application from MySQL to PostgreSQL.

Hopefully, now sysbench-tpcc supports PostgreSQL, it is a useful tool to evaluate a PostgreSQL performance. If you find that we did not optimally execute some transaction, please let us know!

The post Sysbench-tpcc Supports PostgreSQL (No, Really This Time) appeared first on Percona Database Performance Blog.

Apr
19
2018
--

Congratulations to Our Friends at Oracle with the MySQL 8.0 GA Release!

MySQL 8.0 GA

MySQL 8.0 GAIt is a great today for whole MySQL community: MySQL 8.0 was just released as GA!

Geir Høydalsvik has a great summary in his “What’s New in MySQL 8.0” blog post. You can find additional information about MySQL 8.0 Replication and MySQL 8.0 Document Store that is also worth reading.

If you can’t wait to upgrade to MySQL 8.0, please make sure to read the Upgrading to MySQL 8.0 section in the manual, and pay particular attention to changes to Connection Authentication. It requires special handling for most applications.

Also keep in mind that while MySQL 8.0 passed through an extensive QA process, this is the first GA release. It is not yet as mature and polished as MySQL 5.7. If you’re just now starting application development, however, you should definitely start with MySQL 8.0 — by the time you launch your application, 8.0 will be good. 

All of us at Percona – and me personally – are very excited about this release. You can learn more details about what we expect from it in our Why We’re Excited about MySQL 8.0 webinar recording.    

We also wrote extensively about MySQL 8.0 on our blog. Below are some posts on various features, as well as thoughts on the various RCs, that you might want to review:

The best way to learn about MySQL 8.0, though, is to attend the Percona Live Open Source Database Conference 2018, taking place in Santa Clara, CA next week. We have an outstanding selection of MySQL 8.0 focused talks both from the MySQL Engineering team and the community at large (myself included):

You can still get tickets to the conference. Come by and learn about MySQL 8.0. If you can’t make it, please check back later for slides.

Done reading? Go ahead go download  MySQL 8.0 and check it out!

The post Congratulations to Our Friends at Oracle with the MySQL 8.0 GA Release! appeared first on Percona Database Performance Blog.

Apr
18
2018
--

Why Analyze Raw MySQL Query Logs?

Raw MySQL Query Logs

Raw MySQL Query LogsIn this blog post, I’ll examine when looking at raw MySQL query logs can be more useful than working with tools that only have summary data.

In my previous blog post, I wrote about analyzing MySQL Slow Query Logs with ClickHouse and ClickTail. One of the follow-up questions I got is when do you want to do that compared to just using tools like Percona Monitoring and Management or VividCortex, which provide a beautiful interface for detailed analyses (rather than spartan SQL interface).    

MySQL Logs

A lot of folks are confused about what query logs MySQL has, and what you can use them for. First, MySQL has a “General Query Log”. As the name implies, this is a general-purpose query log. You would think this is the first log you should use, but it is, in fact, pretty useless:

2018-03-31T15:38:44.521650Z      2356 Query SELECT c FROM sbtest1 WHERE id=164802
2018-03-31T15:38:44.521790Z      2356 Query SELECT c FROM sbtest1 WHERE id BETWEEN 95241 AND 95340
2018-03-31T15:38:44.522168Z      2356 Query SELECT SUM(k) FROM sbtest1 WHERE id BETWEEN 1 AND 100
2018-03-31T15:38:44.522500Z      2356 Query SELECT c FROM sbtest1 WHERE id BETWEEN 304556 AND 304655 ORDER BY c
2018-03-31T15:38:44.522941Z      2356 Query SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN 924 AND 1023 ORDER BY c
2018-03-31T15:38:44.523525Z      2356 Query UPDATE sbtest1 SET k=k+1 WHERE id=514

As you can see, it only has very limited information about queries: no query execution time or which user is running the query. This type of log is helpful if you want to see very clean, basic information on what queries your application is really running. It can also help debug MySQL crashes because, unlike other log formats, the query is written to this log file before MySQL attempts to execute the query.

The MySQL Slow Log is, in my opinion, much more useful (especially with Percona Server Slow Query Log Extensions). Again as the name implies, you would think it is only used for slow queries (and by default, it is). However, you can set long_query_time to 0 (with a few other options) to get all queries here with lots of rich information about query execution:

# Time: 2018-03-31T15:48:55.795145Z
# User@Host: sbtest[sbtest] @ localhost []  Id: 2332
# Schema: sbtest  Last_errno: 0 Killed: 0
# Query_time: 0.000143  Lock_time: 0.000047 Rows_sent: 1  Rows_examined: 1 Rows_affected: 0
# Bytes_sent: 188  Tmp_tables: 0 Tmp_disk_tables: 0  Tmp_table_sizes: 0
# QC_Hit: No  Full_scan: No Full_join: No  Tmp_table: No Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0
#   InnoDB_IO_r_ops: 0  InnoDB_IO_r_bytes: 0  InnoDB_IO_r_wait: 0.000000
#   InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
#   InnoDB_pages_distinct: 0
# Log_slow_rate_type: query  Log_slow_rate_limit: 10
SET timestamp=1522511335;
SELECT c FROM sbtest1 WHERE id=2428336;

Finally, there is the MySQL Audit Log, which is part of the MySQL Enterprise offering and format-compatible Percona Server for MySQL Audit Log Plugin. This is designed for auditing access to the server, and as such it has matched details in the log. Unlike the first two log formats, it is designed first and foremost to be machine-readable and supports JSON, XML and CVS output formats:

{"audit_record":{"name":"Query","record":"743017006_2018-03-31T01:03:12","timestamp":"2018-03-31T15:53:42 UTC","command_class":"select","connection_id":"2394","status":0,"sqltext":"SELECT SUM(k) FROM sbtest1 WHERE id BETWEEN 3 AND 102","user":"sbtest[sbtest] @ localhost []","host":"localhost","os_user":"","ip":"","db":"sbtest"}}
{"audit_record":{"name":"Query","record":"743017007_2018-03-31T01:03:12","timestamp":"2018-03-31T15:53:42 UTC","command_class":"select","connection_id":"2394","status":0,"sqltext":"SELECT c FROM sbtest1 WHERE id BETWEEN 2812021 AND 2812120 ORDER BY c","user":"sbtest[sbtest] @ localhost []","host":"localhost","os_user":"","ip":"","db":"sbtest"}}
{"audit_record":{"name":"Query","record":"743017008_2018-03-31T01:03:12","timestamp":"2018-03-31T15:53:42 UTC","command_class":"select","connection_id":"2394","status":0,"sqltext":"SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN 1 AND 100 ORDER BY c","user":"sbtest[sbtest] @ localhost []","host":"localhost","os_user":"","ip":"","db":"sbtest"}}

As you can see, there are substantial differences in the purposes of the different MySQL log formats, along with the information they provide.

Why analyze raw MySQL query logs

In my opinion, there are two main reasons to look directly at raw log files without aggregation (you might find others):

  • Auditing, where the Audit Log is useful (Vadim recently blogged about it)
  • Advanced MySQL/application debugging, where an aggregated summary might not allow you to drill down to the fullest level of detail

When you’re debugging using MySQL logs, the Slow Query Log, set to log all queries with no sampling, is the most useful. Of course, this can cause significant additional overhead in many workloads, so it is best to do it in a development environment (if you can repeat the situation you’re looking to analyze). At the very least, don’t do it during peak time.

For Percona Server for MySQL, these options ensure it logs all queries to the query log with no sampling:

log_output=file
slow_query_log=ON
long_query_time=0
log_slow_rate_limit=1
log_slow_verbosity=full
log_slow_admin_statements=ON
log_slow_slave_statements=ON
slow_query_log_always_write_time=1

Now that we have full queries, we can easily use Linux command line tools like grep and others to look into what is going on. However, many times this isn’t always convenient. This is where loading logs into storage that you can conveniently query is a good solution.

Let’s look into some specific and interesting cases.

Were any queries killed?

SELECT
   _time,
   query,
   query_time
FROM mysql_slow_log
WHERE killed > 0
????????????????_time???query?????????????????????????????????query_time??
? 2018-04-02 19:02:56 ? select benchmark(10000000000,"1+1") ?  10.640794 ?
??????????????????????????????????????????????????????????????????????????
1 rows in set. Elapsed: 0.242 sec. Processed 929.14 million rows, 1.86 GB (3.84                                         billion rows/s., 7.67 GB/s.)

Yes. A query got killed after running for 10 seconds.

Did any query fail? With what error codes?

SELECT
   error_num,
   min(_time),
   max(_time),
   count(*)
FROM mysql_slow_log
GROUP BY error_num
??error_num????????????min(_time)????????????max(_time)?????count()??
?         0 ? 2018-04-02 18:59:49 ? 2018-04-07 19:39:27 ? 925428375 ?
?      1160 ? 2018-04-02 19:02:56 ? 2018-04-02 19:02:56 ?         1 ?
?      1213 ? 2018-04-02 19:00:00 ? 2018-04-07 19:18:14 ?   3709520 ?
?      1054 ? 2018-04-07 19:38:14 ? 2018-04-07 19:38:14 ?         1 ?
?????????????????????????????????????????????????????????????????????
4 rows in set. Elapsed: 2.391 sec. Processed 929.14 million rows, 7.43 GB (388.64 million rows/s., 3.11 GB/s.)

You can resolve error codes with the 

perror

 command:

root@rocky:~# perror 1054
MySQL error code 1054 (ER_BAD_FIELD_ERROR): Unknown column '%-.192s' in '%-.192s'

This command has many uses. You can use it to hunt down application issues (like in this example of a missing column — likely due to bad or old code). It can also help you to spot SQL injection attempts that often cause queries with bad syntax, and troubleshoot deadlocks or foreign key violations.

Are there any nasty, long transactions?

SELECT
   transaction_id,
   max(_time) - min(_time) AS run_time,
   count(*) AS num_queries,
   sum(rows_affected) AS rows_changed
FROM mysql_slow_log
WHERE transaction_id != ''
GROUP BY transaction_id
ORDER BY rows_changed DESC
LIMIT 10
??transaction_id???run_time???num_queries???rows_changed??
? 17E070082      ? 0        ?      1      ? 9999         ?
? 17934C73C      ? 2        ?      6      ? 4            ?
? 178B6D346      ? 0        ?      6      ? 4            ?
? 17C909086      ? 2        ?      6      ? 4            ?
? 17B45EFAD      ? 5        ?      6      ? 4            ?
? 17ABAB840      ? 0        ?      6      ? 4            ?
? 17A36AD3F      ? 3        ?      6      ? 4            ?
? 178E037A5      ? 1        ?      6      ? 4            ?
? 17D1549C9      ? 0        ?      6      ? 4            ?
? 1799639F2      ? 1        ?      6      ? 4            ?
??????????????????????????????????????????????????????????
10 rows in set. Elapsed: 15.574 sec. Processed 930.58 million rows, 18.23 GB (59.75 million rows/s., 1.17 GB/s.)

Finding transactions that modify a lot of rows, like transaction 17E070082 above, can be very helpful to ensure you control MySQL replication slave lag. It is also critical if you’re looking to migrate to MySQL Group Replication or Percona XtraDB Cluster.

What statements were executed in a long transaction?

SELECT
   _time,
   _ms,
   query
FROM mysql_slow_log
WHERE transaction_id = '17E070082'
ORDER BY
   _time ASC,
   _ms ASC
LIMIT 10
????????????????_time??????_ms???query??????????????????????????????????
? 2018-04-07 20:08:43 ? 890693 ? update sbtest1 set k=0 where id<10000 ?
????????????????????????????????????????????????????????????????????????
1 rows in set. Elapsed: 2.361 sec. Processed 931.04 million rows, 10.79 GB (394.27 million rows/s., 4.57 GB/s.)

I used transaction 17E070082 from the previous query above (which modified 9999 rows). Note that this schema improves compression by storing the seconds and microseconds parts of the timestamp in different columns.

Were any queries dumping large numbers of rows from the database?

SELECT
   _time,
   query,
   rows_sent,
   bytes_sent
FROM mysql_slow_log
WHERE rows_sent > 10000
????????????????_time???query??????????????????????????????????????????????rows_sent???bytes_sent??
? 2018-04-07 20:21:08 ? SELECT /*!40001 SQL_NO_CACHE */ * FROM `sbtest1` ?  10000000 ? 1976260712 ?
???????????????????????????????????????????????????????????????????????????????????????????????????
1 rows in set. Elapsed: 0.294 sec. Processed 932.19 million rows, 3.73 GB (3.18 billion rows/s., 12.71 GB/s.)

Did someone Update a record?

SELECT
   _time,
   query
FROM mysql_slow_log
WHERE (rows_affected > 0) AND (query LIKE '%id=3301689%')
LIMIT 1
????????????????_time???query??????????????????????????????????????
? 2018-04-02 19:04:48 ? UPDATE sbtest1 SET k=k+1 WHERE id=3301689 ?
???????????????????????????????????????????????????????????????????
1 rows in set. Elapsed: 0.046 sec. Processed 2.29 million rows, 161.60 MB (49.57 million rows/s., 3.49 GB/s.)

Note that I’m cheating here by assuming we know an update used a primary key, but it is practically helpful in a lot of cases.

These are just some of the examples of what you can find out by querying raw slow query logs. They contain a ton of information about query execution (especially in Percona Server for MySQL) that allows you to use them both for performance analysis and some security and auditing purposes.

The post Why Analyze Raw MySQL Query Logs? appeared first on Percona Database Performance Blog.

Apr
18
2018
--

Restore a MongoDB Logical Backup

MongoDB Logical Backup

MongoDB Logical BackupIn this article, we will explain how to restore a MongoDB logical backup performed via ‘mongodump’ to a mongod instance.

MongoDB logical backup requires the use of the ‘mongorestore‘ tool to perform the restore backup. This article focuses on this tool and process.

Note: Percona develops a backup tool named Percona-Lab/mongodb-consistent-backup, which is a wrapper for ‘mongodump‘, adding cluster-wide backup consistency. The backups created by mongodb_consistent_backup (in Dump/Mongodump mode) can be restored using the exact same steps as a regular ‘mongodump’ backup – no special steps!

Mongorestore Command Flags

–host/–port (and –user/–password)

Required, even if you’re using the default host/port (localhost:27017). If authorization is enabled, add –user/–password flags also.

–drop

This is almost always required. This causes ‘mongodump‘ to drop the collection that is being restored before restoring it. Without this flag, the documents from the backup are inserted one at a time and if they already exist the restore fails.

–oplogReplay

This is almost always required. Replays the oplog that was dumped by mongodump. It is best to include this flag on replset-based backups unless there is a specific reason not to. You can tell if the backup was from a replset by looking for the file ‘oplog.bson‘ at the base of the dump directory.

–dir

Required. The path to the mongodump data.

–gzip

Optional. For mongodump >= 3.2, enables inline compression on the restore. This is required if ‘mongodump‘ used the –gzip flag (look for *.bson.gz files if you’re not sure if the collection files have no .gz suffix, don’t use –gzip).

–numParallelCollections=<number>

Optional. For mongodump >= 3.2 only, sets the number of collections to insert in parallel. By default four threads are used, and if you have a large server and you want to restore faster (more resource usage though), you could increase this number. Note that each thread uncompresses bson if the ‘–gzip‘ flag is used, so consider this when raising this number.

Steps

  1. (Optional) If the backup is archived (mongodb_consistent_backup defaults to creating tar archives), un-archive the backup so that ‘mongorestore‘ can access the .bson/.bson.gz files:
    $ tar -C /opt/mongodb/backup/testbackup/20160809_1306 -xvf /opt/mongodb/backup/testbackup/20160809_1306/test1.tar
    test1/
    test1/dump/
    test1/dump/wikipedia/
    test1/dump/wikipedia/pages.metadata.json.gz
    test1/dump/wikipedia/pages.bson.gz
    test1/dump/oplog.bson

    ** This command un-tars the backup to ‘/opt/mongodb/backup/testbackup/20160809_1306/test1/dump’ **

  2. Check (and then check again!) that you’re restoring the right backup to the right host. When in doubt, it is safer to ask the customer or others.
    1. The Percona ‘mongodb_consistent_backup‘ tool names backup subdirectories by replica set name, so you can ensure you’re restoring the right backup by checking the replica set name of the node you’re restoring to, if it exists.
    2. If you’re restoring to a replica set you will need to restore to the PRIMARY member and there needs to be a majority (so writes are accepted – some exceptions if you override write-concern, but not advised).
  3. Use ‘mongorestore‘ to restore the data by dropping/restoring each collection (–drop flag) and replay the oplog changes (–oplogReplay flag), specifying the restore dir explicitly (–dir flag) to the ‘mongorestore‘ command. In this example I also used authorization (–user/–password flags) and un-compression (–gzip flag):
    $ mongorestore --drop --host localhost --port 27017 --user secret --password secret --oplogReplay --gzip --dir /opt/mongodb/backup/testbackup/20160809_1306/test1/dump
    2016-08-09T14:23:04.057+0200    building a list of dbs and collections to restore from /opt/mongodb/backup/testbackup/20160809_1306/test1/dump dir
    2016-08-09T14:23:04.065+0200    reading metadata for wikipedia.pages from /opt/mongodb/backup/testbackup/20160809_1306/test1/dump/wikipedia/pages.metadata.json.gz
    2016-08-09T14:23:04.067+0200    restoring wikipedia.pages from /opt/mongodb/backup/testbackup/20160809_1306/test1/dump/wikipedia/pages.bson.gz
    2016-08-09T14:23:07.058+0200    [#######.................]  wikipedia.pages  63.9 MB/199.0 MB  (32.1%)
    2016-08-09T14:23:10.058+0200    [###############.........]  wikipedia.pages  127.7 MB/199.0 MB  (64.1%)
    2016-08-09T14:23:13.060+0200    [###################.....]  wikipedia.pages  160.4 MB/199.0 MB  (80.6%)
    2016-08-09T14:23:16.059+0200    [#######################.]  wikipedia.pages  191.5 MB/199.0 MB  (96.2%)
    2016-08-09T14:23:19.071+0200    [########################]  wikipedia.pages  223.5 MB/199.0 MB  (112.3%)
    2016-08-09T14:23:22.062+0200    [########################]  wikipedia.pages  255.6 MB/199.0 MB  (128.4%)
    2016-08-09T14:23:25.067+0200    [########################]  wikipedia.pages  271.4 MB/199.0 MB  (136.4%)
    ...
    ...
    2016-08-09T14:24:19.058+0200    [########################]  wikipedia.pages  526.9 MB/199.0 MB  (264.7%)
    2016-08-09T14:24:22.058+0200    [########################]  wikipedia.pages  558.9 MB/199.0 MB  (280.8%)
    2016-08-09T14:24:23.521+0200    [########################]  wikipedia.pages  560.6 MB/199.0 MB  (281.6%)
    2016-08-09T14:24:23.522+0200    restoring indexes for collection wikipedia.pages from metadata
    2016-08-09T14:24:23.528+0200    finished restoring wikipedia.pages (32725 documents)
    2016-08-09T14:24:23.528+0200    replaying oplog
    2016-08-09T14:24:23.597+0200    done
    1. If you encounter problems with ‘mongorestore‘, carefully read the error message or rerun with several ‘-v‘ flags, e.g.: ‘-vvv‘. Once you have an error, attempt to troubleshoot the cause.
  4. Check to see that you saw “replaying oplog” and “done” after the restore (last two lines in the example). If you don’t see this, there is a problem.

As you notice, using this tool for MongoDB logical backup is very simple. However, when using sharding please note that –oplog is not available and the mongodump uses the primaries for each shard. As this is not advised typically in production, you might consider looking at Percona-Lab/mongodb-consistent-backup to ensure you are consistent and hitting secondary nodes, like mongodump with replica sets, will work.

If MongoDB and topics like this interest you, please see the document below, we are hiring!

{
  hiring: true,
  role: "Consultant",
  tech: "MongoDB",
  location: "USA",
  moreInfo: "https://www.percona.com/about-percona/careers/mongodb-consultant-usa-based"
}

The post Restore a MongoDB Logical Backup appeared first on Percona Database Performance Blog.

Apr
18
2018
--

Webinar Thursday, April 19, 2018: Running MongoDB in Production, Part 1

Running MongoDB

Running MongoDBPlease join Percona’s Senior Technical Operations Architect, Tim Vaillancourt as he presents Running MongoDB in Production, Part 1 on Thursday, April 19, 2018, at 10:00 am PDT (UTC-7) / 1:00 pm EDT (UTC-4).

Are you a seasoned MySQL DBA that needs to add MongoDB to your skills? Are you used to managing a small environment that runs well, but want to know what you might not know yet? This webinar helps you with running MongoDB in production environments.

MongoDB works well, but when it has issues, the number one question is “where should I go to solve a problem?”

This tutorial will cover:

Backups
– Logical vs Binary-level backups
– Sharding and Replica-Set Backup strategies
Security
– Filesystem and Network Security
– Operational Security
– External Authentication features of Percona Server for MongoDB
– Securing connections with SSL and MongoDB Authorization
– Encryption at Rest
– New Security features in 3.6
Monitoring
– Monitoring Strategy
– Important metrics to monitor in MongoDB and Linux
– Percona Monitoring and Management

Register for the webinar now.

Part 2 of this series will take place on Thursday, April 26, 2018, at 10:00 am PDT (UTC-7) / 1:00 pm EDT (UTC-4). Register for the second part of this series here.

Running MongoDBTimothy Vaillancourt, Senior Technical Operations Architect

Tim joined Percona in 2016 as Sr. Technical Operations Architect for MongoDB, with the goal to make the operations of MongoDB as smooth as possible. With experience operating infrastructures in industries such as government, online marketing/publishing, SaaS and gaming combined with experience tuning systems from the hard disk all the way up to the end-user, Tim has spent time in nearly every area of the modern IT stack with many lessons learned. Tim is based in Amsterdam, NL and enjoys traveling, coding and music.

Prior to Percona Tim was the Lead MySQL DBA of Electronic Arts’ DICE studios, helping some of the largest games in the world (“Battlefield” series, “Mirrors Edge” series, “Star Wars: Battlefront”) launch and operate smoothly while also leading the automation of MongoDB deployments for EA systems. Before the role of DBA at EA’s DICE studio, Tim served as a subject matter expert in NoSQL databases, queues and search on the Online Operations team at EA SPORTS. Before moving to the gaming industry, Tim served as a Database/Systems Admin operating a large MySQL-based SaaS infrastructure at AbeBooks/Amazon Inc.

The post Webinar Thursday, April 19, 2018: Running MongoDB in Production, Part 1 appeared first on Percona Database Performance Blog.

Apr
16
2018
--

Binlog and Replication Improvements in Percona Server for MySQL

Percona Server for MySQL

Percona Server for MySQLDue to continuous development and improvement, Percona Server for MySQL incorporates a number of improvements related to binary log handling and replication. This results in replication specifics, distinguishing it from MySQL Server.

Temporary tables and mixed logging format

Summary of the fix:

As soon as some statement involving temporary tables was met when using a mixed binlog format, MySQL switched to row-based logging for all statements until the end of the session (or until all temporary tables used in the session were dropped). This is inconvenient when you have long-lasting connections, including replication-related ones. Percona Server for MySQL fixes the situation by switching between statement-based and row-based logging when necessary.

Details:

The new mixed binary logging format, supported by Percona Server for MySQL, means that the server runs in statement-based logging by default, but switches to row-based logging when replication would be unpredictable. For example, in the case of a nondeterministic SQL statement that could cause data divergence if reproduced on a slave server. The switch is done when matching any condition from a long list, and one of these conditions is the use of temporary tables.

Temporary tables are never logged using row-based format, but any statement that touches a temporary table is logged in row mode. This way, we intercept all the side effects that temporary tables can produce on non-temporary ones.

There is no need to use the row logging format for any other statements, solely because of the temp table presence. However, MySQL undertook such an excessive precaution: once some statement with a temporary table had appeared and the row-based logging was used, MySQL was logging unconditionally put all subsequent statements in row format.

Percona Server for MySQL has implemented more accurate behavior. Instead of switching to row-based logging until the last temporary table is closed, the usual rules of row vs. statement format apply, and we don’t consider the presence of currently opened temporary tables. This change was introduced with the fix of bug #151 (upstream #72475).

Temporary table drops and binloging on GTID-enabled server

Summary of the fix:

MySQL logs DROP statements for all temporary tables regardless of the logging mode under which these tables were created. This produces binlog writes and errand GTIDs on slaves with row and mixed logging. Percona Server for MySQL fixes this by tracking the binlog format at temporary table create time and uses it to decide whether a DROP should be logged or not.

Details:

Even with read_only mode enabled, the server permits some operations, including ones with temporary tables. With the previous fix, temporary table operations are not binlogged in row- or mixed-mode. But MySQL server doesn’t track what the logging mode was when a temporary table was created, and therefore unconditionally logs DROP statements for all temporary tables. These DROP statements receive IF EXISTS addition, which is intended to make them harmless.

Percona Server for MySQL has fixed this with the bug fixes #964, upstream #83003, and upstream #85258. Moreover, with all the binlogging fixes discussed so far nothing involving temporary tables is logged to the binary log in row or mixed format. There is no need to consider CREATE/DROP TEMPORARY TABLE unsafe for use in stored functions, triggers and multi-statement transactions in row/mixed format. Therefore, we introduced an additional fix to mark the creation and drop of temporary tables as unsafe inside transactions in statement-based replication only (the fixed bug is #1816, while the correspondent upstream one is #89467 and it is still open).

Safety of statements with a LIMIT clause

Summary of the fix:

MySQL Server considers all UPDATE/DELETE/INSERT ... SELECT statements with the LIMIT clause unsafe, no matter if they are really producing non-deterministic results or not. Percona Server for MySQL is more accurate because it acknowledges such instructions as safe when they include ORDER BY PK or WHERE condition.

Details:

MySQL Server treats UPDATE/DELETE/INSERT ... SELECT statements with the LIMIT clause as unsafe, considering that they produce an unpredictable number of rows. But some such statements can still produce an absolutely predictable result. One such deterministic case takes place when a statement with the LIMIT clause has an ORDER BY PK or WHERE condition.

The patch, making updates and deletes with a limit to be supposed as safe if they have an ORDER BY pk_column clause, was initially provided on the upstream bug report and incorporated later into Percona Server for MySQL with additional improvements. Bug fixed #44 (upstream #42415).

Performance improvements

There are also two modifications in Percona Server related to multi-source replication that improve performance on slaves.

The first improvement is about relay log position, which was always updated in multi-source replications setups regardless of whether the committed transaction has already been executed or not. Percona Server omits relay log position updates for the already logged GTIDs.

These unconditional relay log position updates caused additional fsync operations in the case of relay-log-info-repository=TABLE. With the higher number of channels transmitting such duplicate (already executed) transactions, the situation became proportionally worse. The problem was solved in Percona Server 5.7.18-14.  Bug fixed  #1786 (upstream #85141).

The second improvement decreases the load on slave nodes configured to update the master status and connection information only on log file rotation. MySQL additionally updated this information in the case of multi-source replication when a slave had to skip the already executed GTID event. This behavior was the cause of substantially higher write loads on slaves and lower replication throughput.

The configuration with master_info_repository=TABLE and sync_master_info=0  makes the slave update the master status and connection information in this table on log file rotation and not after each sync_master_info event, but it didn’t work on multi-source replication setups. Heartbeats sent to the slave to skip GTID events that it had already executed previously were evaluated as relay log rotation events and reacted with mysql.slave_master_info table sync. This inaccuracy could produce a huge (up to five times on some setups) increase in write load on the slave, before this problem was fixed in Percona Server for MySQL 5.7.20-19. Bug fixed  #1812 (upstream #85158).

Current status of fixes

The three issues related to temporary tables that were fixed in Percona Server 5.5 and contributed upstream, and the final fixes of the bugs #72475, #83003, and #85258, have landed into MySQL Server 8.0.4.

The post Binlog and Replication Improvements in Percona Server for MySQL appeared first on Percona Database Performance Blog.

Apr
13
2018
--

MongoDB Replica Set Tag Sets

MongoDB Replica Set Tag Sets

MongoDB Replica Set Tag SetsIn this blog post, we will look at MongoDB replica set tag sets, which enable you to use customized write concern and read preferences for replica set members.

This blog post will cover most of the questions that come to mind before using tag sets in a production environment.

  • What scenarios are these helpful for?
  • Do these tag sets work with all read preferences modes?
  • What if we’re already using maxStalenessSeconds along with the read preferences, can we still use a tag set?
  • How can one configure tag sets in a replica set?
  • Do these tags work identically for custom read preferences and write concerns?

Now let’s answer all these questions one by one.

What scenarios are these helpful for?

You can use tags:

  • If replica set members have different configurations and queries need to be redirected to the specific secondaries as per their purpose. For example, production queries can be redirected to the higher configuration member for faster execution and queries used for internal reporting purpose can be redirected to the low configurations secondaries. This will help improve per node resource utilization.
  • When you use custom read preferences, but the reads are routed to a secondary that resides in another data center to make reads more optimized and cost-effective. You can use tag sets to make sure that specific reads are routed to the specific secondary node within the DC.
  • If you want to use custom write concerns with the tag set for acknowledging writes are propagated to the secondary nodes per the requirements.

Do these tag sets work with all read preferences modes?

Yes, these tag-sets work with all the read preferences — except “primary” mode. “Primary” preferred read preference mode doesn’t allow you to add any tag sets while querying.

replicaTest:PRIMARY> db.tagTest.find().readPref('primary', [{"specs" : "low","purpose" : "general"}])
Error: error: {
	"ok" : 0,
	"errmsg" : "Only empty tags are allowed with primary read preference",
	"code" : 2,
	"codeName" : "BadValue"
}

What if we’re already using maxStalenessSeconds along with the read preferences, can tag set still be used?

Yes, you can use tag sets with a maxStalenessSeconds value. In that case, priority is given to staleness first, then tags, to get the most recent data from the secondary member.

How can one configure tag sets in a replica set?

You can configure tags by adding a parameter in the replica set configuration. Consider this test case with a five members replica set:

"members" : [
		{
			"_id" : 0,
			"name" : "host1:27017",
			"stateStr" : "PRIMARY",
		},
		{
			"_id" : 1,
			"name" : "host2:27017",
			"stateStr" : "SECONDARY",
		},
		{
			"_id" : 2,
			"name" : "host3:27017",
			"stateStr" : "SECONDARY",
		},
		{
			"_id" : 3,
			"name" : "host4:27017",
			"stateStr" : "SECONDARY",
		},
		{
			"_id" : 4,
			"name" : "host5:27017",
			"stateStr" : "SECONDARY",
         }
		]

For our test case, members specification of the host are “specs” and the requirement for the query as per the application is the “purpose,” in order to route queries to specific members in an optimized manner.

You must associate tags to each member by adding it to the replica set configuration:

cfg=rs.conf()
cfg.members[0].tags={"specs":"high","purpose":"analytics"}
cfg.members[1].tags={"specs":"high"}
cfg.members[2].tags={"specs":"low","purpose":"general"}
cfg.members[3].tags={"specs":"high","purpose":"analytics"}
cfg.members[4].tags={"specs":"low"}
rs.reconfig(cfg)

After adding tags, you can validate these changes by checking replica set configurations like:

rs.conf()
	"members" : [
		{
			"_id" : 0,
			"host" : "host1:27017",
			"tags" : {
				"specs" : "high",
				"purpose" : "analytics"
			},
		},
		{
			"_id" : 1,
			"host" : "host2:27017",
			"tags" : {
				"specs" : "high"
			},
		},
		{
			"_id" : 2,
			"host" : "host3:27017",
			"tags" : {
				"specs" : "low",
				"purpose" : "general"
			},
		},
		{
			"_id" : 3,
			"host" : "host4:27017",
			"tags" : {
				"specs" : "high",
				"purpose" : "analytics"
			},
		},
		{
			"_id" : 4,
			"host" : "host5:27017",
			"tags" : {
				"specs" : "low"
			},
		}
	]

Now, we are done with the tag-set configuration.

Do these tags work identically for custom read preferences and write concerns?

No, custom read preferences and write concerns consider tag sets in different ways.

Read preferences routes read operations to a required specific member by following tag values assigned to it, but write concerns follows tag values only to check if the value is unique or not. It will not consider tag values while selecting replica members.

Let us see how to use tag sets with write concerns. As per our test case, we have two unique tag values (i.e., “analytics” and “general”) defined as:

cfg=rs.conf()
cfg.settings={ getLastErrorModes: {writeNode:{"purpose": 2}}}
rs.reconfig(cfg)

You can validate these changes by checking the replica set configuration:

rs.conf()
	"settings" : {
			"getLastErrorModes" : {
			"writeNode" : {
				"purpose" : 2
			}<strong>
		},</strong>
	}

Now let’s try to insert a sample document in the collection named “tagTest” with this write concern:

db.tagTest.insert({name:"tom",tech:"nosql",status:"active"},{writeConcern:{w:"writeNode"}})
WriteResult({ "nInserted" : 1 })

Here, the write concern “writeNode” means the client gets a write acknowledgment from two nodes with unique tag set values. If the value set in the configuration exceeds the count of unique values, then it leads to an error at the time of the write:

cfg.settings={ getLastErrorModes: {writeNode:{"purpose": 4}}}
rs.reconfig(cfg)
db.tagTest.insert({name:"tom",tech:"nosql",status:"active"},{writeConcern:{w:"writeNode"}})
WriteResult({
	"nInserted" : 1,
	"writeConcernError" : {
		"code" : 100,
		"codeName" : "CannotSatisfyWriteConcern",
		"errmsg" : "Not enough nodes match write concern mode "writeNode""
	}
}

You can perform read and write operations with tag sets like this:

db.tagTest.find({name:"tom"}).readPref("secondary",[{"specs":"low","purpose":"general"}])
db.tagTest.insert({name:"john",tech:"rdbms",status:"active"},{writeConcern:{w:"writeNode"}})

I hope this helps you to understand how to configure MongoDB replica set tag sets, how the read preferences and write concerns handle them, and where you can use them

The post MongoDB Replica Set Tag Sets appeared first on Percona Database Performance Blog.

Apr
12
2018
--

Flashback: Another Take on Point-In-Time Recovery (PITR) in MySQL/MariaDB/Percona Server

Point-In-Time Recovery

Point-In-Time RecoveryIn this blog post, I’ll look at point-in-time recovery (PITR) options for MySQL, MariaDB and Percona Server for MySQL.

It is a common good practice to extend data safety by having additional measures apart from regular data backups, such as delayed slaves and binary log backups. These two options provide the ability to restore the data to any given point in time, or just revert from some bad accidents. These methods have their limitations of course: delayed slaves only help if a deadly mistake is noticed fast enough, while full point-in-time recovery (PITR) requires the last full backup and binary logs (and therefore usually takes a lot of time).

How to reverse from disaster faster

Alibaba engineers and the MariaDB team implemented an interesting feature in their version of the mysqlbinlog tool: the --flashback option. Based on ROW-based DML events, it can transform the binary log and reverse purposes. That means it can help undo given row changes extremely fast. For instance, it can change DELETE events to INSERTs and vice versa, and it will swap WHERE and SET parts of the UPDATE events. This simple idea can dramatically speed up recovery from certain types of mistakes or disasters.

The question is whether it works with non-MariaDB variants. To verify that, I tested this feature with the latest available Percona Server for MySQL 5.7 (which is fully compatible with upstream MySQL).

master [localhost] {msandbox} ((none)) > select @@version,@@version_comment;
+---------------+--------------------------------------------------------+
| @@version     | @@version_comment                                      |
+---------------+--------------------------------------------------------+
| 5.7.21-20-log | Percona Server (GPL), Release 20, Revision ed217b06ca3 |
+---------------+--------------------------------------------------------+
1 row in set (0.00 sec)

First, let’s simulate one possible deadly scenario: a forgotten WHERE in DELETE statement:

master [localhost] {msandbox} ((none)) > select count(*) from test.sbtest1;
+----------+
| count(*) |
+----------+
| 200      |
+----------+
1 row in set (0.00 sec)
master [localhost] {msandbox} ((none)) > delete from test.sbtest1;
Query OK, 200 rows affected (0.04 sec)
slave1 [localhost] {msandbox} ((none)) > select count(*) from test.sbtest1;
+----------+
| count(*) |
+----------+
| 0        |
+----------+
1 row in set (0.00 sec

So, our data is lost on both the master and slave!

Let’s start by downloading the latest MariaDB server 10.2.x package, which I’m hoping has a mysqlbinlog tool that works with MySQL 5.7, and unpack it to some custom location:

$ dpkg -x mariadb-server-10.2_10.2.13+maria~wheezy_amd64.deb /opt/maria/
$ /opt/maria/usr/bin/mysqlbinlog --help|grep flash
-B, --flashback Flashback feature can rollback you committed data to a

It has the function we are looking for. Now, we have to find the culprit transaction or set of transactions we want to revert. A simplified example may look like this:

$ mysqlbinlog -v --base64-output=DECODE-ROWS mysql-bin.000002 > mysql-bin.000002.sql
$ less mysql-bin.000002.sql

By searching through the decoded binary log, we are looking for transactions that have wiped out the table test.sbtest1. It looks like this (as the table had 200 rows, it is pretty long, so I’ve pasting only the beginning and the end):

BEGIN
/*!*/;
# at 291
#180314 15:30:34 server id 1  end_log_pos 348 CRC32 0x06cd193e  Table_map: `test`.`sbtest1` mapped to number 111
# at 348
#180314 15:30:34 server id 1  end_log_pos 8510 CRC32 0x064634c5         Delete_rows: table id 111
...
### DELETE FROM `test`.`sbtest1`
### WHERE
###   @1=200
###   @2=101
###   @3='26157116088-21551255803-13077038767-89418462090-07321921109-99464656338-95996554805-68102077806-88247356874-53904987561'
###   @4='51157774706-69740598871-18633441857-39587481216-98251863874'
# at 38323
#180314 15:30:34 server id 1  end_log_pos 38354 CRC32 0x6dbb7127        Xid = 97
COMMIT/*!*/;

It is very important to take the proper start and stop positions. We need the ones exactly after BEGIN and before the final COMMIT. Then, let’s test if the tool produces the reverse statements as expected. First, decode the rows to the .sql file:

$ /opt/maria/usr/bin/mysqlbinlog --flashback -v --base64-output=DECODE-ROWS --start-position=291 --stop-position=38323 mysql-bin.000002 > mysql-bin.000002_flash.sql

Inside, we find 200 of those. Looks good:

### INSERT INTO `test`.`sbtest1`
### SET
### @1=200
...

Since we verified the positions are correct, we can prepare a binary log file:

$ /opt/maria/usr/bin/mysqlbinlog --flashback --start-position=291 --stop-position=38323 mysql-bin.000002 > mysql-bin.000002_flash.bin

and load it back to our master:

master [localhost] {msandbox} (test) > source mysql-bin.000002_flash.bin
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected, 1 warning (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.04 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
master [localhost] {msandbox} (test) > select count(*) from test.sbtest1;
+----------+
| count(*) |
+----------+
| 200      |
+----------+
1 row in set (0.00 sec)

and double check they restored on slaves:

slave1 [localhost] {msandbox} (test) > select count(*) from test.sbtest1;
+----------+
| count(*) |
+----------+
| 200      |
+----------+
1 row in set (0.00 sec)

GTID problem

MariaDB has a completely different GTID implementation from MySQL and Percona Server. You can expect problems when decoding incompatible GTID enabled binary logs with MariaDB. As MariaDB’s mysqlbinlog does not support –start/stop-gtid options (even for its own implementation), we have to take the usual positions anyway. From a GTID-enabled binary log, for example, delete can look like this:

# at 2300
#180315 9:37:31 server id 1 end_log_pos 2365 CRC32 0x09e4d815 GTID last_committed=1 sequence_number=2 rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= '00020996-1111-1111-1111-111111111111:2'/*!*/;
# at 2365
#180315 9:37:31 server id 1 end_log_pos 2433 CRC32 0xac62a20d Query thread_id=4 exec_time=0 error_code=0
SET TIMESTAMP=1521103051/*!*/;
BEGIN
/*!*/;
# at 2433
#180315 9:37:31 server id 1 end_log_pos 2490 CRC32 0x275601d6 Table_map: `test`.`sbtest1` mapped to number 108
# at 2490
#180315 9:37:31 server id 1 end_log_pos 10652 CRC32 0xe369e169 Delete_rows: table id 108
...
# at 42355
#180315 9:37:31 server id 1 end_log_pos 42386 CRC32 0xe01ff558 Xid = 31
COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;

The tool seems to work, and transforms the delete transaction to a sequence of INSERTs. However, the server rejects it when we try to load it on a GTID-enabled master:

master [localhost] {msandbox} ((none)) > source mysql-bin.000003.flash
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected, 1 warning (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
ERROR 1782 (HY000): @@SESSION.GTID_NEXT cannot be set to ANONYMOUS when @@GLOBAL.GTID_MODE = ON.
ERROR 1782 (HY000): @@SESSION.GTID_NEXT cannot be set to ANONYMOUS when @@GLOBAL.GTID_MODE = ON.
ERROR 1782 (HY000): @@SESSION.GTID_NEXT cannot be set to ANONYMOUS when @@GLOBAL.GTID_MODE = ON.
ERROR 1782 (HY000): @@SESSION.GTID_NEXT cannot be set to ANONYMOUS when @@GLOBAL.GTID_MODE = ON.
ERROR 1782 (HY000): @@SESSION.GTID_NEXT cannot be set to ANONYMOUS when @@GLOBAL.GTID_MODE = ON.
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected, 1 warning (0.00 sec)
master [localhost] {msandbox} ((none)) > select count(*) from test.sbtest1;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (0.00 sec)

Unfortunately, the solution here is either to disable GTID mode for the recovery time (which is surely tricky in replicated clusters), or try to add GTID-related information to the resulting binary log with the

--flashback option

. In my case, adding these lines worked (I used the next free available GTID sequence):

$ diff -u mysql-bin.000003.flash mysql-bin.000003.flash.gtid
--- mysql-bin.000003.flash 2018-03-15 10:20:20.080487998 +0100
+++ mysql-bin.000003.flash.gtid 2018-03-15 10:25:02.909953620 +0100
@@ -4,6 +4,10 @@
DELIMITER /*!*/;
#180315 9:32:51 server id 1 end_log_pos 123 CRC32 0x941b189a Start: binlog v 4, server v 5.7.21-20-log created 180315 9:32:51 at startup
ROLLBACK/*!*/;
+# at 154
+#180315 9:37:05 server id 1 end_log_pos 219 CRC32 0x69e4ce26 GTID last_committed=0 sequence_number=1 rbr_only=yes
+/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
+SET @@SESSION.GTID_NEXT= '00020996-1111-1111-1111-111111111111:5'/*!*/;
BINLOG '
sy+qWg8BAAAAdwAAAHsAAAAAAAQANS43LjIxLTIwLWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAACzL6paEzgNAAgAEgAEBAQEEgAAXwAEGggAAAAICAgCAAAACgoKKioAEjQA
@@ -724,6 +728,7 @@
'/*!*/;
COMMIT
/*!*/;
+SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;

master [localhost] {msandbox} ((none)) > source mysql-bin.000003.flash.gtid
(...)
master [localhost] {msandbox} ((none)) > select count(*) from test.sbtest1;
+----------+
| count(*) |
+----------+
| 200      |
+----------+
1 row in set (0.00 sec

Limitations

Obviously, flashback cannot help after DROP/TRUNCATE or other DDL commands. These are not transactional, and affected rows are never recorded in the binary log. It doesn’t work with encrypted or compressed binary logs either. But most importantly, to produce complete events that can reverse bad transactions, the binary format must be ROW. The row image also must be FULL:

master [localhost] {msandbox} ((none)) > select @@binlog_format,@@binlog_row_image;
+-----------------+--------------------+
| @@binlog_format | @@binlog_row_image |
+-----------------+--------------------+
| ROW             | FULL               |
+-----------------+--------------------+
1 row in set (0.00 sec)

If these conditions are not met (or if you’re dealing with a too-complicated GTID issue), you will have to follow the standard point-in-time recovery procedure.

The post Flashback: Another Take on Point-In-Time Recovery (PITR) in MySQL/MariaDB/Percona Server appeared first on Percona Database Performance Blog.

Apr
12
2018
--

Percona Live 2018 Featured Talk: Containerizing Databases at New Relic (What We Learned) with Joshua Galbraith and Bryant Vinisky

Joshua Bryant Percona Live 2018 New Relic

Percona Live 2018 Featured TalkWelcome to another interview blog for the rapidly-approaching Percona Live 2018. Each post in this series highlights a Percona Live 2018 featured talk at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post highlights Joshua Galbraith, Senior Software Engineer and Bryant Vinisky, Site Reliability Engineer at New Relic. Their session talk is titled Containerizing Databases at New Relic: What We Learned. There are many trade-offs when containerizing databases, and there are many open source support options for database containers such as Kubernetes and Apache Mesos. In our conversation, we discussed what containers can bring to a database environment:

Percona: Who are you, and how did you get into databases? What was your path to your current responsibilities?

Joshua: My name is Joshua Galbraith, and I’m a Senior Software Engineer and Technical Product Manager on the Database Engineering team at New Relic. I’ve been working with open-source databases for the past ten years. I started my software engineering career by writing scripts to load several years of high-resolution geophysical sensor data from flat files into a Postgres database. I’ve been writing software that interacts with databases and tools to make operating them easier ever since. I’ve run MySQL, MongoDB, ElasticSearch, Cassandra and Redis in a variety of production environments. A little over three years ago, I co-founded a DBaaS startup and built a container-based platform for Apache Cassandra. I’ve been containerizing databases ever since — most recently by working on a project called Megabase at New Relic.

Joshua Bryant Percona Live 2018 New RelicBryant: Hello, my name is Bryant Vinisky. I currently work on the Database Engineering team at New Relic as a Senior Site Reliability Engineer. My professional experience with databases and the open source ecosystem started over eight years ago when I joined the engineering team at NWEA and helped roll out a SaaS platform for delivering adaptive assessments to students over the web. That platform was backed by a number of different database and related technologies — namely PostgreSQL, MongoDB and Redis.

Though in the beginning, my role didn’t formally involve databases, they were clearly a very important part of the greater system. Armed with an unquenchable curiosity, I was never shy about crossing boundaries into DB land. On the development side, much of the tooling and side projects I worked on over the years frequently touched databases for a storage backend. As the assessment platform scaled out to support hundreds of thousands of concurrent users, my role shifted to a reliability focus, often involving pre-release load testing of the major system components as well as doing follow up for problems that occurred on the production system. Much of the work involved dealing with the databases as a scaling bottleneck.

Containers came into the picture for me over two years ago when I moved to New Relic, where they had been using containers for stateless services for years and largely skipped over VMs. It wasn’t long before we started exploring containers as a solution for stateful database services with our Megabase project, an area where I’ve spent a lot of my time since

Your tutorial is titled “Containerizing Databases at New Relic: What We Learned”. How did you decide on containers as your database solution?

Joshua/Bryant: At New Relic, we had already built a Container Fabric for our stateless services. We knew we needed to provide databases to our internal teams in a way that was fast, cost-efficient and repeatable. Using containers would get us most of the way towards those goals, but we didn’t have a proven pattern that we could follow to reach them. We heard about other companies succeeding in building similar solutions, and we knew that we were moving away from virtual machines. Containers seemed to fit the bill.

What are the benefits and drawbacks of containerizing databases?

Joshua/Bryant: Containers are great for packaging and deployment. They allow us to deploy a known version of configuration, code and environment in a deterministic way. Unfortunately, containers are not perfect mechanisms for resource isolation, especially when large amounts of persistent storage are required. The challenge of containerizing databases is to make the right trade-offs between portability and performance, complexity and ease-of-operation given the specific context and goals of your team and organization.

Were specific database software (MySQL, Redis, PostgreSQL) easier to containerize? Why?

Joshua/Bryant: In general, the databases that are easiest to containerize are the databases that manage their own state and make themselves effectively stateless from the point of the scheduler and container orchestration framework. Databases that provide cluster membership, fault-tolerance and data replication are easier to run with a traditional orchestration framework. Databases that do not do these things on their own require additional “sidecar” services and/or application-specific frameworks to operate reliably.

Why should people attend your talk? What do you hope people will take away from it?

Joshua/Bryant: As a team, we’ve learned a lot of lessons about what to do, and not to do, when running stateful applications in containers. We’d like to pass that knowledge on to our audience. We also want to send people away with enough information to make the right decisions about whether or not to containerize the databases they are responsible for, and how to do it in a way that is successful given their own goals and context. At the very least, we hope it will be entertaining — and maybe we’ll learn some things too.

What are you looking forward to at Percona Live (besides your talk)?

Joshua/Bryant: We’re very excited to hear about open-source tools like Vitess and ProxySQL. I’m also looking forward to hearing about the latest monitoring and performance analysis tools. I always love deep-dives into specific database problems faced by a company, and I love talking to people in the expo hall. I always come away from Percona Live events a little more excited about my day-to-day work and the future of open-source databases.

Want to find out more about this Percona Live 2018 featured talk, and containerizing stateful databases? Register for Percona Live 2018, and see Joshua and Bryant’s session talk Containerizing Databases at New Relic: What We Learned. Register now to get the best price! Use the discount code SeeMeSpeakPL18 for 10% off.

Percona Live Open Source Database Conference 2018 is the premier open source event for the data performance ecosystem. It is the place to be for the open source community. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference will be April 23-25, 2018 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.

The post Percona Live 2018 Featured Talk: Containerizing Databases at New Relic (What We Learned) with Joshua Galbraith and Bryant Vinisky appeared first on Percona Database Performance Blog.

Apr
11
2018
--

ProxySQL Admin Support for Multiple Clusters

ProxySQL Admin

ProxySQL AdminIn this blog post, we demonstrate a new feature in ProxySQL Admin: support for multiple clusters.

In a previous blog post, Ramesh and Roel introduced a new tool that helps configured Percona XtraDB Cluster nodes into ProxySQL. However, at that time it only worked for a single cluster per ProxySQL Admin configuration. Starting from ProxySQL 1.4.6, which comes with an improved ProxySQL Admin tool (proxysql-admin), our tool now supports configuring multiple Percona XtraDB Cluster clusters with ease (PSQLADM-32).

Pre-requisites

  • Cluster name (wsrep_cluster_name) should be unique.
  • proxysql-admin.cnf configuration differences:
    • ProxySQL READ/WRITE hostgroup should be different for each cluster.
    • Application user should be different for each cluster.
  • Host priority feature support only one cluster at a time.

Configuring /etc/proxysql-admin.cnf

As mentioned above, the CLUSTER_APP_USERNAME and the WRITE/READ_HOSTGROUP should be different for each cluster. Wsrep_cluster_name should also be unique for each cluster.

+--------------------+----------+
| Variable_name      | Value    |
+--------------------+----------+
| wsrep_cluster_name | cluster1 |
+--------------------+----------+
+--------------------+----------+
| Variable_name      | Value    |
+--------------------+----------+
| wsrep_cluster_name | cluster2 |
+--------------------+----------+

Sample configuration of /etc/proxysql-admin.cnf for cluster1:

# proxysql admin interface credentials.
export PROXYSQL_DATADIR='/var/lib/proxysql'
export PROXYSQL_USERNAME='admin'
export PROXYSQL_PASSWORD='admin'
export PROXYSQL_HOSTNAME='localhost'
export PROXYSQL_PORT='6032'
# PXC admin credentials for connecting to pxc-cluster-node.
export CLUSTER_USERNAME='root'
export CLUSTER_PASSWORD='sekret'
export CLUSTER_HOSTNAME='10.0.3.41'
export CLUSTER_PORT='3306'
# proxysql monitoring user. proxysql admin script will create this user in pxc to monitor pxc-nodes.
export MONITOR_USERNAME='monitor'
export MONITOR_PASSWORD='monit0r'
# Application user to connect to pxc-node through proxysql
export CLUSTER_APP_USERNAME='cluster1_user'
export CLUSTER_APP_PASSWORD='c1_passw0rd'
# ProxySQL read/write hostgroup
export WRITE_HOSTGROUP_ID='10'
export READ_HOSTGROUP_ID='11'
# ProxySQL read/write configuration mode.
export MODE="singlewrite"
# ProxySQL Cluster Node Priority File
export HOST_PRIORITY_FILE=$PROXYSQL_DATADIR/host_priority.conf

Sample configuration of /etc/proxysql-admin.cnf for cluster2

# proxysql admin interface credentials.
export PROXYSQL_DATADIR='/var/lib/proxysql'
export PROXYSQL_USERNAME='admin'
export PROXYSQL_PASSWORD='admin'
export PROXYSQL_HOSTNAME='localhost'
export PROXYSQL_PORT='6032'
# PXC admin credentials for connecting to pxc-cluster-node.
export CLUSTER_USERNAME='root'
export CLUSTER_PASSWORD='sekret'
export CLUSTER_HOSTNAME='10.0.3.173'
export CLUSTER_PORT='3306'
# proxysql monitoring user. proxysql admin script will create this user in pxc to monitor pxc-nodes.
export MONITOR_USERNAME='monitor'
export MONITOR_PASSWORD='monit0r'
# Application user to connect to pxc-node through proxysql
export CLUSTER_APP_USERNAME='cluster2_user'
export CLUSTER_APP_PASSWORD='c2_passw0rd'
# ProxySQL read/write hostgroup
export WRITE_HOSTGROUP_ID='20'
export READ_HOSTGROUP_ID='21'
# ProxySQL read/write configuration mode.
export MODE="loadbal"
# ProxySQL Cluster Node Priority File
export HOST_PRIORITY_FILE=$PROXYSQL_DATADIR/host_priority.conf

Setting up Percona XtraDB Cluster nodes in ProxySQL

I would add that you have the option to use a single proxysql-admin.cnf file, and just edit the file where changes are appropriate. You could also use two different files to configure ProxySQL. In my example, I used two files with the contents as seen above:

[root@proxysql_multi-pxc ~]# proxysql-admin --config=/etc/proxysql-admin_cluster1.cnf --enable
This script will assist with configuring ProxySQL (currently only Percona XtraDB cluster in combination with ProxySQL is supported)
ProxySQL read/write configuration mode is singlewrite
Configuring ProxySQL monitoring user..
ProxySQL monitor username as per command line/config-file is monitor
User 'monitor'@'10.%' has been added with USAGE privilege
Configuring the Percona XtraDB Cluster application user to connect through ProxySQL
Percona XtraDB Cluster application username as per command line/config-file is cluster1_user
Percona XtraDB Cluster application user 'cluster1_user'@'10.%' has been added with the USAGE privilege, please make sure to the grant appropriate privileges
Adding the Percona XtraDB Cluster server nodes to ProxySQL
Configuring singlewrite mode with the following nodes designated as priority order:
Write node info
+-----------+--------------+------+---------+---------+
| hostname  | hostgroup_id | port | weight  | comment |
+-----------+--------------+------+---------+---------+
| 10.0.3.41 | 10           | 3306 | 1000000 | WRITE   |
+-----------+--------------+------+---------+---------+
ProxySQL configuration completed!
ProxySQL has been successfully configured to use with Percona XtraDB Cluster
You can use the following login credentials to connect your application through ProxySQL
mysql --user=cluster1_user -p  --host=localhost --port=6033 --protocol=tcp

[root@proxysql_multi-pxc ~]# proxysql-admin --config=/etc/proxysql-admin_cluster2.cnf --enable
This script will assist with configuring ProxySQL (currently only Percona XtraDB cluster in combination with ProxySQL is supported)
ProxySQL read/write configuration mode is loadbal
Host priority file (/var/lib/proxysql/host_priority.conf) is already present. Would you like to replace with the new file [y/n] ? n
Host priority file is not deleted. Please make sure you have properly configured /var/lib/proxysql/host_priority.conf
Configuring ProxySQL monitoring user..
ProxySQL monitor username as per command line/config-file is monitor
User 'monitor'@'10.%' has been added with USAGE privilege
Configuring the Percona XtraDB Cluster application user to connect through ProxySQL
Percona XtraDB Cluster application username as per command line/config-file is cluster2_user
Percona XtraDB Cluster application user 'cluster2_user'@'10.%' has been added with the USAGE privilege, please make sure to the grant appropriate privileges
Adding the Percona XtraDB Cluster server nodes to ProxySQL
ProxySQL configuration completed!
ProxySQL has been successfully configured to use with Percona XtraDB Cluster
You can use the following login credentials to connect your application through ProxySQL
mysql --user=cluster2_user -p  --host=localhost --port=6033 --protocol=tcp

Inspect ProxySQL tables

Login to ProxySQL to confirm that the setup is correct:

[root@proxysql_multi-pxc ~]# mysql -uadmin -p -P6032 -h127.0.0.1
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or g.
Your MySQL connection id is 33893
Server version: 5.5.30 (ProxySQL Admin Module)
Copyright (c) 2009-2018 Percona LLC and/or its affiliates
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.
mysql> select * from mysql_users;
+---------------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| username      | password                                  | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
+---------------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| cluster1_user | *448C417D62616B779E789F3BD72AA3DE9C319EA3 | 1      | 0       | 10                |                | 0             | 1                      | 0            | 1       | 1        | 10000           |
| cluster2_user | *AB1E96267D16A9F26A201282F9ED80B50244B770 | 1      | 0       | 20                |                | 0             | 1                      | 0            | 1       | 1        | 10000           |
+---------------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
2 rows in set (0.00 sec)
mysql> select * from mysql_servers;
+--------------+------------+------+--------+---------+-------------+-----------------+---------------------+---------+----------------+-----------+
| hostgroup_id | hostname   | port | status | weight  | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment   |
+--------------+------------+------+--------+---------+-------------+-----------------+---------------------+---------+----------------+-----------+
| 11           | 10.0.3.81  | 3306 | ONLINE | 1000    | 0           | 1000            | 0                   | 0       | 0              | READ      |
| 10           | 10.0.3.41  | 3306 | ONLINE | 1000000 | 0           | 1000            | 0                   | 0       | 0              | WRITE     |
| 11           | 10.0.3.232 | 3306 | ONLINE | 1000    | 0           | 1000            | 0                   | 0       | 0              | READ      |
| 20           | 10.0.3.173 | 3306 | ONLINE | 1000    | 0           | 1000            | 0                   | 0       | 0              | READWRITE |
| 20           | 10.0.3.78  | 3306 | ONLINE | 1000    | 0           | 1000            | 0                   | 0       | 0              | READWRITE |
| 20           | 10.0.3.141 | 3306 | ONLINE | 1000    | 0           | 1000            | 0                   | 0       | 0              | READWRITE |
+--------------+------------+------+--------+---------+-------------+-----------------+---------------------+---------+----------------+-----------+
6 rows in set (0.00 sec)
mysql> select * from scheduler;
+----+--------+-------------+----------------------------------+------+------+------+------+------------------------------------------------------+----------+
| id | active | interval_ms | filename                         | arg1 | arg2 | arg3 | arg4 | arg5                                                 | comment  |
+----+--------+-------------+----------------------------------+------+------+------+------+------------------------------------------------------+----------+
| 6  | 1      | 3000        | /usr/bin/proxysql_galera_checker | 10   | 11   | 1    | 1    | /var/lib/proxysql/cluster1_proxysql_galera_check.log | cluster1 |
| 7  | 1      | 3000        | /usr/bin/proxysql_galera_checker | 20   | 20   | 0    | 1    | /var/lib/proxysql/cluster2_proxysql_galera_check.log | cluster2 |
+----+--------+-------------+----------------------------------+------+------+------+------+------------------------------------------------------+----------+
2 rows in set (0.00 sec)
mysql> select * from mysql_query_rules;
+---------+--------+---------------+------------+--------+-------------+------------+------------+--------+---------------------+---------------+----------------------+--------------+---------+-----------------+-----------------------+-----------+-----------+---------+---------+-------+-------------------+----------------+------------------+-----------+--------+-------------+-----------+-----+-------+---------+
| rule_id | active | username      | schemaname | flagIN | client_addr | proxy_addr | proxy_port | digest | match_digest        | match_pattern | negate_match_pattern | re_modifiers | flagOUT | replace_pattern | destination_hostgroup | cache_ttl | reconnect | timeout | retries | delay | next_query_flagIN | mirror_flagOUT | mirror_hostgroup | error_msg | OK_msg | sticky_conn | multiplex | log | apply | comment |
+---------+--------+---------------+------------+--------+-------------+------------+------------+--------+---------------------+---------------+----------------------+--------------+---------+-----------------+-----------------------+-----------+-----------+---------+---------+-------+-------------------+----------------+------------------+-----------+--------+-------------+-----------+-----+-------+---------+
| 7       | 1      | cluster1_user | NULL       | 0      | NULL        | NULL       | NULL       | NULL   | ^SELECT.*FOR UPDATE | NULL          | 0                    | CASELESS     | NULL    | NULL            | 10                    | NULL      | NULL      | NULL    | NULL    | NULL  | NULL              | NULL           | NULL             | NULL      | NULL   | NULL        | NULL      | NULL | 1     | NULL    |
| 8       | 1      | cluster1_user | NULL       | 0      | NULL        | NULL       | NULL       | NULL   | ^SELECT             | NULL          | 0                    | CASELESS     | NULL    | NULL            | 11                    | NULL      | NULL      | NULL    | NULL    | NULL  | NULL              | NULL           | NULL             | NULL      | NULL   | NULL        | NULL      | NULL | 1     | NULL    |
+---------+--------+---------------+------------+--------+-------------+------------+------------+--------+---------------------+---------------+----------------------+--------------+---------+-----------------+-----------------------+-----------+-----------+---------+---------+-------+-------------------+----------------+------------------+-----------+--------+-------------+-----------+-----+-------+---------+
2 rows in set (0.00 sec)
mysql> exit
Bye

It’s as easy as that! We hope you continue to enjoy using ProxySQL Admin!

The post ProxySQL Admin Support for Multiple Clusters appeared first on Percona Database Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com