Apr 23, 2018
--

Percona Live 2018 Featured Talk: Data Integrity at Scale with Alexis Guajardo


Welcome to another interview blog for the rapidly-approaching Percona Live 2018. Each post in this series highlights a Percona Live 2018 featured talk at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post highlights Alexis Guajardo, Senior Software Engineer at Google.com. His session talk is titled Data Integrity at Scale. Keeping data safe is the top responsibility of anyone running a database. In this session, he dives into Cloud SQL’s storage architecture to demonstrate how they check data down to the disk level:

Percona: Who are you, and how did you get into databases? What was your path to your current responsibilities?

Alexis: I am a Software Engineer on the Cloud SQL team with Google Cloud. I got into databases by using FileMaker. However, the world of database technology has changed many times over since then.

Percona: Your session is titled “Data Integrity at Scale”. Has the importance of data integrity increased over time? Why?

Alexis: Data integrity has always been vital to databases and data in general. The most common method is using checksum validation to ensure data integrity. The challenge that we faced at Cloud SQL on Google Cloud was how to do this for two very popular open source database solutions, and how to do it at scale. The story for MySQL was a bit more straightforward, because of innochecksum. PostgreSQL required our team to create a utility, which is open sourced. The complicated aspect of data corruption is that sometimes it is dormant and discovered at a most inopportune time. What we have instituted are frequent checks for corruption of the entire data set, so if there is a software bug or other issue, we can mitigate it as soon as possible.

Percona: How does scaling affect the ability to maintain data integrity?

Alexis: There is a benefit to working on a team that provides a public cloud. Since Google Cloud is not bound by most of the restrictions that an individual or company would be, we can allocate resources to do data integrity verifications without restriction. If I were to implement a similar system at a smaller company, most likely there would be cost and resource restrictions. However, data integrity is a feature that Google Cloud provides.

Percona: What are three things a DBA should know about ensuring data integrity?

Alexis: I think that the three things can be simplified down to three words: verify your backups.

Even if someone does not use Cloud SQL, it is vital to take backups, maintain them and verify them. Having terabytes of backups, but without verification, leaves open the possibility that a software bug or hardware issue somehow corrupted a backup.

Percona: Why should people attend your talk? What do you hope people will take away from it? 

Alexis: I would say the main reason to attend my talk is to discover more about Cloud SQL. As a DBA or developer, having a managed database as a service solution takes away a lot of the minutia. But there are still the tasks of improving queries and creating applications.  However, having reliable and verified backups is vital. With the addition of high availability and the ability to scale up easily, Cloud SQL’s managed database solution makes life much easier.

Percona: What are you looking forward to at Percona Live (besides your talk)?

Alexis: The many talks about Vitess look very interesting. It is also an open source Google technology, and it will be interesting to see how many companies have adopted it and how they have benefited from its use.

Want to find out more about this Percona Live 2018 featured talk, and data integrity at scale? Register for Percona Live 2018, and see Alexis’ session talk Data Integrity at Scale. Register now to get the best price! Use the discount code SeeMeSpeakPL18 for 10% off.

Percona Live Open Source Database Conference 2018 is the premier open source event for the data performance ecosystem. It is the place to be for the open source community. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference will be April 23-25, 2018 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.

The post Percona Live 2018 Featured Talk: Data Integrity at Scale with Alexis Guajardo appeared first on Percona Database Performance Blog.

Apr 23, 2018
--

This Week In Data with Colin Charles 36: Percona Live 2018


Percona Live Santa Clara 2018! Last week’s column may have somehow not made it to Planet MySQL, so please don’t miss the good links at: This Week in Data with Colin Charles 35: Percona Live 18 final countdown and a roundup of recent news.

Back to Percona Live – I expect people are still going to be registering, right down to the wire! I highly recommend you also register for the community dinner. It routinely sells out and people tend to complain about not being able to join in the fun, so reserve your spot early. Please also be present on Monday, which is not just tutorial day: during the welcoming reception, the most excellent community awards will be presented. In addition, if you don’t find a tutorial you’re interested in (or didn’t get a ticket that included tutorials!), why not check out the China Track, something new and unique that showcases the technology coming out of China.

The biggest news this week? On Thursday, April 19, 2018, MySQL 8.0 became Generally Available with the 8.0.11 release. The release notes are a must read, as is the upgrade guide (this time around, you really want to read it!). Some more digestible links: What’s New in MySQL 8.0? (Generally Available), MySQL 8.0: New Features in Replication, MySQL 8.0 – Announcing GA of the MySQL Document Store. As a bonus, the Hacker News thread is also well worth a read. Don’t forget that all the connectors also got a nice version bump.

The PostgreSQL website has been redesigned – check out PostgreSQL.org.

More open source databases are always a good thing, and it’s great to see Apple open sourcing FoundationDB. Being corporate-backed open source, I have great hopes for what the project can become. The requisite Hacker News thread is also well worth a read.

Releases

  • PostgreSQL 10.3, 9.6.8, 9.5.12, 9.4.17, and 9.3.22 released
  • MariaDB 10.3.6 is another release candidate, with more changes for sql_mode=oracle, changes to the INFORMATION_SCHEMA tables around system versioning, and more. Particularly interesting is the contributor list, which names a total of 34 contributors. Five come from the MariaDB Foundation (including Monty), which is 14%; 17 come from the MariaDB Corporation (including Monty again), which is 50%; two from Tempesta, one from IBM, six from Codership (over 17%!), and four are independent. So nearly 62% of contributions come from the Corporation/Foundation in total.
  • SysbenchRocks, a repository of Sysbench benchmarks, libraries and extensions.

Link List

Upcoming appearances

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

The post This Week In Data with Colin Charles 36: Percona Live 2018 appeared first on Percona Database Performance Blog.

Apr 19, 2018
--

Sysbench-tpcc Supports PostgreSQL (No, Really This Time)


This time, we really mean it when we say sysbench-tpcc supports PostgreSQL.

When I initially announced sysbench-tpcc, I mentioned it potentially could run against PostgreSQL, but it was more like wishful thinking than reality. The reality was that even though both databases speak SQL, the difference in dialects was too big and the queries written for MySQL could not run without modification on PostgreSQL.

Well, we introduced the needed changes, and now you can use sysbench-tpcc with PostgreSQL. Just try the latest commit to https://github.com/Percona-Lab/sysbench-tpcc.
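
If you want to try it against PostgreSQL yourself, here is a hedged sketch of a prepare/run cycle using sysbench’s standard PostgreSQL driver options (host, credentials, thread and scale values are placeholders; check the repository README for the exact options the current tpcc.lua accepts):

$ ./tpcc.lua --db-driver=pgsql --pgsql-host=127.0.0.1 --pgsql-user=sbtest \
    --pgsql-password=sbtest --pgsql-db=sbtest \
    --threads=16 --tables=10 --scale=100 prepare

$ ./tpcc.lua --db-driver=pgsql --pgsql-host=127.0.0.1 --pgsql-user=sbtest \
    --pgsql-password=sbtest --pgsql-db=sbtest \
    --threads=16 --tables=10 --scale=100 --time=300 --report-interval=10 run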

If you’re interested, here is a quick overview of what changes we had to make:

  1. It appears that PostgreSQL does not support the tinyint and datetime data types. We had to use smallint and timestamp fields instead, even if using smallint makes the database size bigger.

  2. PostgreSQL does not have a simple equivalent for MySQL’s SHOW TABLES. The best replacement we found is:

    select * from pg_catalog.pg_tables where schemaname != 'information_schema' and schemaname != 'pg_catalog'

  3. PostgreSQL does not have a way to disable foreign key checks like MySQL’s SET FOREIGN_KEY_CHECKS=0. With PostgreSQL, we needed to create and load tables in a very specific order to avoid foreign key violations.

  4. PostgreSQL requires index names to be unique across the whole database, while MySQL requires them to be unique only per table. So instead of using:

    CREATE INDEX idx_customer ON customer1 (c_w_id,c_d_id,c_last,c_first)
    CREATE INDEX idx_customer ON customer2 (c_w_id,c_d_id,c_last,c_first)

    we need to use:

    CREATE INDEX idx_customer1 ON customer1 (c_w_id,c_d_id,c_last,c_first)
    CREATE INDEX idx_customer2 ON customer2 (c_w_id,c_d_id,c_last,c_first)

  5. PostgreSQL does not have a STRAIGHT_JOIN hint, so we had to remove it from the queries. It is worth mentioning that we use STRAIGHT_JOIN mostly as a hack to force MySQL to use the correct execution plan for one of the queries.

  6. PostgreSQL is very strict about GROUP BY queries: all fields that are not in the GROUP BY clause must use an aggregation function. So PostgreSQL complained about queries like

    SELECT d_w_id,sum(d_ytd)-w_ytd diff FROM district,warehouse WHERE d_w_id=w_id AND w_id=1 GROUP BY d_w_id

    even when we know that only a single value for w_ytd is possible. We had to rewrite this query as

    SELECT d_w_id,SUM(d_ytd)-MAX(w_ytd) diff FROM district,warehouse WHERE d_w_id=w_id AND w_id=1 GROUP BY d_w_id

So you can see there was some work involved when we tried to migrate even a simple application from MySQL to PostgreSQL.

Hopefully, now that sysbench-tpcc supports PostgreSQL, it is a useful tool for evaluating PostgreSQL performance. If you find that we did not execute some transaction optimally, please let us know!

The post Sysbench-tpcc Supports PostgreSQL (No, Really This Time) appeared first on Percona Database Performance Blog.

Apr 19, 2018
--

Congratulations to Our Friends at Oracle with the MySQL 8.0 GA Release!


It is a great day for the whole MySQL community: MySQL 8.0 was just released as GA!

Geir Høydalsvik has a great summary in his “What’s New in MySQL 8.0” blog post. You can find additional information about MySQL 8.0 Replication and MySQL 8.0 Document Store that is also worth reading.

If you can’t wait to upgrade to MySQL 8.0, please make sure to read the Upgrading to MySQL 8.0 section in the manual, and pay particular attention to changes to Connection Authentication. It requires special handling for most applications.
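
For example, MySQL 8.0 switches the default authentication plugin to caching_sha2_password, which some older client libraries and connectors cannot use yet. A hedged sketch of one common workaround (account names and passwords are placeholders):

CREATE USER 'legacy_app'@'%' IDENTIFIED WITH mysql_native_password BY 'secret';
ALTER USER 'existing_app'@'%' IDENTIFIED WITH mysql_native_password BY 'secret';

Alternatively, default_authentication_plugin=mysql_native_password can be set on the server so that newly created accounts keep the old plugin by default.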

Also keep in mind that while MySQL 8.0 passed through an extensive QA process, this is the first GA release. It is not yet as mature and polished as MySQL 5.7. If you’re just now starting application development, however, you should definitely start with MySQL 8.0 — by the time you launch your application, 8.0 will be good. 

All of us at Percona – and me personally – are very excited about this release. You can learn more details about what we expect from it in our Why We’re Excited about MySQL 8.0 webinar recording.    

We also wrote extensively about MySQL 8.0 on our blog. Below are some posts on various features, as well as thoughts on the various RCs, that you might want to review:

The best way to learn about MySQL 8.0, though, is to attend the Percona Live Open Source Database Conference 2018, taking place in Santa Clara, CA next week. We have an outstanding selection of MySQL 8.0 focused talks both from the MySQL Engineering team and the community at large (myself included):

You can still get tickets to the conference. Come by and learn about MySQL 8.0. If you can’t make it, please check back later for slides.

Done reading? Go ahead and download MySQL 8.0 and check it out!

The post Congratulations to Our Friends at Oracle with the MySQL 8.0 GA Release! appeared first on Percona Database Performance Blog.

Apr 18, 2018
--

Why Analyze Raw MySQL Query Logs?


In this blog post, I’ll examine when looking at raw MySQL query logs can be more useful than working with tools that only have summary data.

In my previous blog post, I wrote about analyzing MySQL Slow Query Logs with ClickHouse and ClickTail. One of the follow-up questions I got is when you would want to do that, compared to just using tools like Percona Monitoring and Management or VividCortex, which provide a beautiful interface for detailed analyses (rather than a spartan SQL interface).

MySQL Logs

A lot of folks are confused about what query logs MySQL has, and what you can use them for. First, MySQL has a “General Query Log”. As the name implies, this is a general-purpose query log. You would think this is the first log you should use, but it is, in fact, pretty useless:

2018-03-31T15:38:44.521650Z      2356 Query SELECT c FROM sbtest1 WHERE id=164802
2018-03-31T15:38:44.521790Z      2356 Query SELECT c FROM sbtest1 WHERE id BETWEEN 95241 AND 95340
2018-03-31T15:38:44.522168Z      2356 Query SELECT SUM(k) FROM sbtest1 WHERE id BETWEEN 1 AND 100
2018-03-31T15:38:44.522500Z      2356 Query SELECT c FROM sbtest1 WHERE id BETWEEN 304556 AND 304655 ORDER BY c
2018-03-31T15:38:44.522941Z      2356 Query SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN 924 AND 1023 ORDER BY c
2018-03-31T15:38:44.523525Z      2356 Query UPDATE sbtest1 SET k=k+1 WHERE id=514

As you can see, it only has very limited information about queries: no query execution time or which user is running the query. This type of log is helpful if you want to see very clean, basic information on what queries your application is really running. It can also help debug MySQL crashes because, unlike other log formats, the query is written to this log file before MySQL attempts to execute the query.
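
If you want to reproduce this output, the general log can be enabled at runtime; a minimal sketch (the file path is only an example):

SET GLOBAL log_output = 'FILE';
SET GLOBAL general_log_file = '/var/lib/mysql/general.log';
SET GLOBAL general_log = 'ON';
-- ... run the workload you want to capture, then disable it again, since it adds overhead:
SET GLOBAL general_log = 'OFF';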

The MySQL Slow Log is, in my opinion, much more useful (especially with Percona Server Slow Query Log Extensions). Again as the name implies, you would think it is only used for slow queries (and by default, it is). However, you can set long_query_time to 0 (with a few other options) to get all queries here with lots of rich information about query execution:

# Time: 2018-03-31T15:48:55.795145Z
# User@Host: sbtest[sbtest] @ localhost []  Id: 2332
# Schema: sbtest  Last_errno: 0 Killed: 0
# Query_time: 0.000143  Lock_time: 0.000047 Rows_sent: 1  Rows_examined: 1 Rows_affected: 0
# Bytes_sent: 188  Tmp_tables: 0 Tmp_disk_tables: 0  Tmp_table_sizes: 0
# QC_Hit: No  Full_scan: No Full_join: No  Tmp_table: No Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0
#   InnoDB_IO_r_ops: 0  InnoDB_IO_r_bytes: 0  InnoDB_IO_r_wait: 0.000000
#   InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
#   InnoDB_pages_distinct: 0
# Log_slow_rate_type: query  Log_slow_rate_limit: 10
SET timestamp=1522511335;
SELECT c FROM sbtest1 WHERE id=2428336;

Finally, there is the MySQL Audit Log, which is part of the MySQL Enterprise offering, and the format-compatible Percona Server for MySQL Audit Log Plugin. This is designed for auditing access to the server, and as such it has matching details in the log. Unlike the first two log formats, it is designed first and foremost to be machine-readable and supports JSON, XML and CSV output formats:

{"audit_record":{"name":"Query","record":"743017006_2018-03-31T01:03:12","timestamp":"2018-03-31T15:53:42 UTC","command_class":"select","connection_id":"2394","status":0,"sqltext":"SELECT SUM(k) FROM sbtest1 WHERE id BETWEEN 3 AND 102","user":"sbtest[sbtest] @ localhost []","host":"localhost","os_user":"","ip":"","db":"sbtest"}}
{"audit_record":{"name":"Query","record":"743017007_2018-03-31T01:03:12","timestamp":"2018-03-31T15:53:42 UTC","command_class":"select","connection_id":"2394","status":0,"sqltext":"SELECT c FROM sbtest1 WHERE id BETWEEN 2812021 AND 2812120 ORDER BY c","user":"sbtest[sbtest] @ localhost []","host":"localhost","os_user":"","ip":"","db":"sbtest"}}
{"audit_record":{"name":"Query","record":"743017008_2018-03-31T01:03:12","timestamp":"2018-03-31T15:53:42 UTC","command_class":"select","connection_id":"2394","status":0,"sqltext":"SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN 1 AND 100 ORDER BY c","user":"sbtest[sbtest] @ localhost []","host":"localhost","os_user":"","ip":"","db":"sbtest"}}

As you can see, there are substantial differences in the purposes of the different MySQL log formats, along with the information they provide.

Why analyze raw MySQL query logs

In my opinion, there are two main reasons to look directly at raw log files without aggregation (you might find others):

  • Auditing, where the Audit Log is useful (Vadim recently blogged about it)
  • Advanced MySQL/application debugging, where an aggregated summary might not allow you to drill down to the fullest level of detail

When you’re debugging using MySQL logs, the Slow Query Log, set to log all queries with no sampling, is the most useful. Of course, this can cause significant additional overhead in many workloads, so it is best to do it in a development environment (if you can repeat the situation you’re looking to analyze). At the very least, don’t do it during peak time.

For Percona Server for MySQL, these options ensure it logs all queries to the query log with no sampling:

log_output=file
slow_query_log=ON
long_query_time=0
log_slow_rate_limit=1
log_slow_verbosity=full
log_slow_admin_statements=ON
log_slow_slave_statements=ON
slow_query_log_always_write_time=1

Now that we have full queries, we can easily use Linux command line tools like grep and others to look into what is going on. However, this isn’t always convenient. This is where loading logs into storage that you can conveniently query is a good solution.

Let’s look into some specific and interesting cases.

Were any queries killed?

SELECT
   _time,
   query,
   query_time
FROM mysql_slow_log
WHERE killed > 0
┌───────────────_time─┬─query───────────────────────────────┬─query_time─┐
│ 2018-04-02 19:02:56 │ select benchmark(10000000000,"1+1") │  10.640794 │
└─────────────────────┴─────────────────────────────────────┴────────────┘
1 rows in set. Elapsed: 0.242 sec. Processed 929.14 million rows, 1.86 GB (3.84 billion rows/s., 7.67 GB/s.)

Yes. A query got killed after running for 10 seconds.

Did any query fail? With what error codes?

SELECT
   error_num,
   min(_time),
   max(_time),
   count(*)
FROM mysql_slow_log
GROUP BY error_num
┌─error_num─┬──────────min(_time)─┬──────────max(_time)─┬───count()─┐
│         0 │ 2018-04-02 18:59:49 │ 2018-04-07 19:39:27 │ 925428375 │
│      1160 │ 2018-04-02 19:02:56 │ 2018-04-02 19:02:56 │         1 │
│      1213 │ 2018-04-02 19:00:00 │ 2018-04-07 19:18:14 │   3709520 │
│      1054 │ 2018-04-07 19:38:14 │ 2018-04-07 19:38:14 │         1 │
└───────────┴─────────────────────┴─────────────────────┴───────────┘
4 rows in set. Elapsed: 2.391 sec. Processed 929.14 million rows, 7.43 GB (388.64 million rows/s., 3.11 GB/s.)

You can resolve error codes with the perror command:

root@rocky:~# perror 1054
MySQL error code 1054 (ER_BAD_FIELD_ERROR): Unknown column '%-.192s' in '%-.192s'

This command has many uses. You can use it to hunt down application issues (like in this example of a missing column — likely due to bad or old code). It can also help you to spot SQL injection attempts that often cause queries with bad syntax, and troubleshoot deadlocks or foreign key violations.
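
For example, staying with the ClickHouse table used above, you can pull the actual statements behind a suspicious error code. Error 1064 (ER_PARSE_ERROR) is a common one to check when hunting for SQL injection attempts, and 1213 (ER_LOCK_DEADLOCK) covers deadlocks; a sketch using the same schema as the previous queries:

SELECT
   _time,
   query
FROM mysql_slow_log
WHERE error_num IN (1064, 1213)
ORDER BY _time DESC
LIMIT 10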

Are there any nasty, long transactions?

SELECT
   transaction_id,
   max(_time) - min(_time) AS run_time,
   count(*) AS num_queries,
   sum(rows_affected) AS rows_changed
FROM mysql_slow_log
WHERE transaction_id != ''
GROUP BY transaction_id
ORDER BY rows_changed DESC
LIMIT 10
┌─transaction_id─┬─run_time─┬─num_queries─┬─rows_changed─┐
│ 17E070082      │        0 │           1 │         9999 │
│ 17934C73C      │        2 │           6 │            4 │
│ 178B6D346      │        0 │           6 │            4 │
│ 17C909086      │        2 │           6 │            4 │
│ 17B45EFAD      │        5 │           6 │            4 │
│ 17ABAB840      │        0 │           6 │            4 │
│ 17A36AD3F      │        3 │           6 │            4 │
│ 178E037A5      │        1 │           6 │            4 │
│ 17D1549C9      │        0 │           6 │            4 │
│ 1799639F2      │        1 │           6 │            4 │
└────────────────┴──────────┴─────────────┴──────────────┘
10 rows in set. Elapsed: 15.574 sec. Processed 930.58 million rows, 18.23 GB (59.75 million rows/s., 1.17 GB/s.)

Finding transactions that modify a lot of rows, like transaction 17E070082 above, can be very helpful to ensure you control MySQL replication slave lag. It is also critical if you’re looking to migrate to MySQL Group Replication or Percona XtraDB Cluster.

What statements were executed in a long transaction?

SELECT
   _time,
   _ms,
   query
FROM mysql_slow_log
WHERE transaction_id = '17E070082'
ORDER BY
   _time ASC,
   _ms ASC
LIMIT 10
┌───────────────_time─┬────_ms─┬─query─────────────────────────────────┐
│ 2018-04-07 20:08:43 │ 890693 │ update sbtest1 set k=0 where id<10000 │
└─────────────────────┴────────┴───────────────────────────────────────┘
1 rows in set. Elapsed: 2.361 sec. Processed 931.04 million rows, 10.79 GB (394.27 million rows/s., 4.57 GB/s.)

I used transaction 17E070082 from the previous query above (which modified 9999 rows). Note that this schema improves compression by storing the seconds and microseconds parts of the timestamp in different columns.
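
For reference, here is a heavily simplified, hypothetical sketch of what such a ClickHouse table could look like (only a handful of the columns used in this post are shown; the real ClickTail schema has more):

CREATE TABLE mysql_slow_log (
   _date Date,
   _time DateTime,
   _ms UInt32,
   query String,
   query_time Float64,
   rows_sent UInt64,
   rows_affected UInt64,
   bytes_sent UInt64,
   error_num UInt32,
   killed UInt8,
   transaction_id String
) ENGINE = MergeTree
PARTITION BY toYYYYMM(_date)
ORDER BY _time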

Were any queries dumping large numbers of rows from the database?

SELECT
   _time,
   query,
   rows_sent,
   bytes_sent
FROM mysql_slow_log
WHERE rows_sent > 10000
┌───────────────_time─┬─query────────────────────────────────────────────┬─rows_sent─┬─bytes_sent─┐
│ 2018-04-07 20:21:08 │ SELECT /*!40001 SQL_NO_CACHE */ * FROM `sbtest1`  │  10000000 │ 1976260712 │
└─────────────────────┴───────────────────────────────────────────────────┴───────────┴────────────┘
1 rows in set. Elapsed: 0.294 sec. Processed 932.19 million rows, 3.73 GB (3.18 billion rows/s., 12.71 GB/s.)

Did someone update a record?

SELECT
   _time,
   query
FROM mysql_slow_log
WHERE (rows_affected > 0) AND (query LIKE '%id=3301689%')
LIMIT 1
┌───────────────_time─┬─query─────────────────────────────────────┐
│ 2018-04-02 19:04:48 │ UPDATE sbtest1 SET k=k+1 WHERE id=3301689 │
└─────────────────────┴───────────────────────────────────────────┘
1 rows in set. Elapsed: 0.046 sec. Processed 2.29 million rows, 161.60 MB (49.57 million rows/s., 3.49 GB/s.)

Note that I’m cheating here by assuming we know the update used the primary key, but this is still helpful in a lot of practical cases.

These are just some of the examples of what you can find out by querying raw slow query logs. They contain a ton of information about query execution (especially in Percona Server for MySQL) that allows you to use them both for performance analysis and some security and auditing purposes.

The post Why Analyze Raw MySQL Query Logs? appeared first on Percona Database Performance Blog.

Apr 18, 2018
--

Restore a MongoDB Logical Backup


In this article, we will explain how to restore a MongoDB logical backup performed via ‘mongodump’ to a mongod instance.

MongoDB logical backup requires the use of the ‘mongorestore‘ tool to perform the restore. This article focuses on this tool and process.

Note: Percona develops a backup tool named Percona-Lab/mongodb-consistent-backup, which is a wrapper for ‘mongodump‘, adding cluster-wide backup consistency. The backups created by mongodb_consistent_backup (in Dump/Mongodump mode) can be restored using the exact same steps as a regular ‘mongodump’ backup – no special steps!

Mongorestore Command Flags

--host/--port (and --user/--password)

Required, even if you’re using the default host/port (localhost:27017). If authorization is enabled, add the --user/--password flags also.

--drop

This is almost always required. This causes ‘mongorestore‘ to drop the collection that is being restored before restoring it. Without this flag, the documents from the backup are inserted one at a time and if they already exist, the restore fails.

--oplogReplay

This is almost always required. Replays the oplog that was dumped by mongodump. It is best to include this flag on replset-based backups unless there is a specific reason not to. You can tell if the backup was from a replset by looking for the file ‘oplog.bson‘ at the base of the dump directory.

--dir

Required. The path to the mongodump data.

--gzip

Optional. For mongodump >= 3.2, this enables inline decompression on the restore. It is required if ‘mongodump‘ used the --gzip flag (look for *.bson.gz files if you’re not sure; if the collection files have no .gz suffix, don’t use --gzip).

--numParallelCollections=<number>

Optional. For mongodump >= 3.2 only, sets the number of collections to insert in parallel. By default four threads are used, and if you have a large server and you want to restore faster (at the cost of more resource usage), you could increase this number. Note that each thread uncompresses bson if the ‘--gzip‘ flag is used, so consider this when raising this number.

Steps

  1. (Optional) If the backup is archived (mongodb_consistent_backup defaults to creating tar archives), un-archive the backup so that ‘mongorestore‘ can access the .bson/.bson.gz files:
    $ tar -C /opt/mongodb/backup/testbackup/20160809_1306 -xvf /opt/mongodb/backup/testbackup/20160809_1306/test1.tar
    test1/
    test1/dump/
    test1/dump/wikipedia/
    test1/dump/wikipedia/pages.metadata.json.gz
    test1/dump/wikipedia/pages.bson.gz
    test1/dump/oplog.bson

    ** This command un-tars the backup to ‘/opt/mongodb/backup/testbackup/20160809_1306/test1/dump’ **

  2. Check (and then check again!) that you’re restoring the right backup to the right host. When in doubt, it is safer to ask the customer or others.
    1. The Percona ‘mongodb_consistent_backup‘ tool names backup subdirectories by replica set name, so you can ensure you’re restoring the right backup by checking the replica set name of the node you’re restoring to, if it exists.
    2. If you’re restoring to a replica set you will need to restore to the PRIMARY member and there needs to be a majority (so writes are accepted – some exceptions if you override write-concern, but not advised).
  3. Use ‘mongorestore‘ to restore the data by dropping/restoring each collection (--drop flag) and replaying the oplog changes (--oplogReplay flag), specifying the restore dir explicitly (--dir flag) to the ‘mongorestore‘ command. In this example I also used authorization (--user/--password flags) and un-compression (--gzip flag):
    $ mongorestore --drop --host localhost --port 27017 --user secret --password secret --oplogReplay --gzip --dir /opt/mongodb/backup/testbackup/20160809_1306/test1/dump
    2016-08-09T14:23:04.057+0200    building a list of dbs and collections to restore from /opt/mongodb/backup/testbackup/20160809_1306/test1/dump dir
    2016-08-09T14:23:04.065+0200    reading metadata for wikipedia.pages from /opt/mongodb/backup/testbackup/20160809_1306/test1/dump/wikipedia/pages.metadata.json.gz
    2016-08-09T14:23:04.067+0200    restoring wikipedia.pages from /opt/mongodb/backup/testbackup/20160809_1306/test1/dump/wikipedia/pages.bson.gz
    2016-08-09T14:23:07.058+0200    [#######.................]  wikipedia.pages  63.9 MB/199.0 MB  (32.1%)
    2016-08-09T14:23:10.058+0200    [###############.........]  wikipedia.pages  127.7 MB/199.0 MB  (64.1%)
    2016-08-09T14:23:13.060+0200    [###################.....]  wikipedia.pages  160.4 MB/199.0 MB  (80.6%)
    2016-08-09T14:23:16.059+0200    [#######################.]  wikipedia.pages  191.5 MB/199.0 MB  (96.2%)
    2016-08-09T14:23:19.071+0200    [########################]  wikipedia.pages  223.5 MB/199.0 MB  (112.3%)
    2016-08-09T14:23:22.062+0200    [########################]  wikipedia.pages  255.6 MB/199.0 MB  (128.4%)
    2016-08-09T14:23:25.067+0200    [########################]  wikipedia.pages  271.4 MB/199.0 MB  (136.4%)
    ...
    ...
    2016-08-09T14:24:19.058+0200    [########################]  wikipedia.pages  526.9 MB/199.0 MB  (264.7%)
    2016-08-09T14:24:22.058+0200    [########################]  wikipedia.pages  558.9 MB/199.0 MB  (280.8%)
    2016-08-09T14:24:23.521+0200    [########################]  wikipedia.pages  560.6 MB/199.0 MB  (281.6%)
    2016-08-09T14:24:23.522+0200    restoring indexes for collection wikipedia.pages from metadata
    2016-08-09T14:24:23.528+0200    finished restoring wikipedia.pages (32725 documents)
    2016-08-09T14:24:23.528+0200    replaying oplog
    2016-08-09T14:24:23.597+0200    done
    1. If you encounter problems with ‘mongorestore‘, carefully read the error message or rerun with several ‘-v‘ flags, e.g.: ‘-vvv‘. Once you have an error, attempt to troubleshoot the cause.
  4. Check to see that you saw “replaying oplog” and “done” after the restore (last two lines in the example). If you don’t see this, there is a problem.

As you can see, using this tool for MongoDB logical backup is very simple. However, when using sharding, please note that --oplog is not available and mongodump uses the primaries for each shard. As this is typically not advised in production, you might consider looking at Percona-Lab/mongodb-consistent-backup, which hits secondary nodes (like mongodump with replica sets) and ensures the backup is consistent across the cluster.

If MongoDB and topics like this interest you, please see the document below, we are hiring!

{
  hiring: true,
  role: "Consultant",
  tech: "MongoDB",
  location: "USA",
  moreInfo: "https://www.percona.com/about-percona/careers/mongodb-consultant-usa-based"
}

The post Restore a MongoDB Logical Backup appeared first on Percona Database Performance Blog.

Apr 18, 2018
--

Webinar Thursday, April 19, 2018: Running MongoDB in Production, Part 1


Please join Percona’s Senior Technical Operations Architect, Tim Vaillancourt, as he presents Running MongoDB in Production, Part 1 on Thursday, April 19, 2018, at 10:00 am PDT (UTC-7) / 1:00 pm EDT (UTC-4).

Are you a seasoned MySQL DBA that needs to add MongoDB to your skills? Are you used to managing a small environment that runs well, but want to know what you might not know yet? This webinar helps you with running MongoDB in production environments.

MongoDB works well, but when it has issues, the number one question is “where should I go to solve a problem?”

This tutorial will cover:

Backups
– Logical vs Binary-level backups
– Sharding and Replica-Set Backup strategies
Security
– Filesystem and Network Security
– Operational Security
– External Authentication features of Percona Server for MongoDB
– Securing connections with SSL and MongoDB Authorization
– Encryption at Rest
– New Security features in 3.6
Monitoring
– Monitoring Strategy
– Important metrics to monitor in MongoDB and Linux
– Percona Monitoring and Management

Register for the webinar now.

Part 2 of this series will take place on Thursday, April 26, 2018, at 10:00 am PDT (UTC-7) / 1:00 pm EDT (UTC-4). Register for the second part of this series here.

Timothy Vaillancourt, Senior Technical Operations Architect

Tim joined Percona in 2016 as Sr. Technical Operations Architect for MongoDB, with the goal to make the operations of MongoDB as smooth as possible. With experience operating infrastructures in industries such as government, online marketing/publishing, SaaS and gaming combined with experience tuning systems from the hard disk all the way up to the end-user, Tim has spent time in nearly every area of the modern IT stack with many lessons learned. Tim is based in Amsterdam, NL and enjoys traveling, coding and music.

Prior to Percona Tim was the Lead MySQL DBA of Electronic Arts’ DICE studios, helping some of the largest games in the world (“Battlefield” series, “Mirrors Edge” series, “Star Wars: Battlefront”) launch and operate smoothly while also leading the automation of MongoDB deployments for EA systems. Before the role of DBA at EA’s DICE studio, Tim served as a subject matter expert in NoSQL databases, queues and search on the Online Operations team at EA SPORTS. Before moving to the gaming industry, Tim served as a Database/Systems Admin operating a large MySQL-based SaaS infrastructure at AbeBooks/Amazon Inc.

The post Webinar Thursday, April 19, 2018: Running MongoDB in Production, Part 1 appeared first on Percona Database Performance Blog.

Apr 16, 2018
--

Binlog and Replication Improvements in Percona Server for MySQL


Due to continuous development and improvement, Percona Server for MySQL incorporates a number of improvements related to binary log handling and replication. As a result, its replication behavior differs in some specifics from that of MySQL Server.

Temporary tables and mixed logging format

Summary of the fix:

When using a mixed binlog format, as soon as a statement involving temporary tables was encountered, MySQL switched to row-based logging for all statements until the end of the session (or until all temporary tables used in the session were dropped). This is inconvenient when you have long-lasting connections, including replication-related ones. Percona Server for MySQL fixes the situation by switching between statement-based and row-based logging when necessary.

Details:

The new mixed binary logging format, supported by Percona Server for MySQL, means that the server runs in statement-based logging by default, but switches to row-based logging when replication would be unpredictable. For example, in the case of a nondeterministic SQL statement that could cause data divergence if reproduced on a slave server. The switch is done when matching any condition from a long list, and one of these conditions is the use of temporary tables.

Temporary tables are never logged using row-based format, but any statement that touches a temporary table is logged in row mode. This way, we intercept all the side effects that temporary tables can produce on non-temporary ones.

There is no need to use the row logging format for any other statements solely because of the temporary table’s presence. However, MySQL took this excessive precaution: once a statement with a temporary table had appeared and row-based logging was used, MySQL unconditionally logged all subsequent statements in row format.

Percona Server for MySQL has implemented more accurate behavior. Instead of switching to row-based logging until the last temporary table is closed, the usual rules of row vs. statement format apply, and we don’t consider the presence of currently opened temporary tables. This change was introduced with the fix of bug #151 (upstream #72475).
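
To make the difference concrete, here is a hedged, hypothetical session sketch (table names are made up). With the mixed format, any statement touching the temporary table is row-logged; the difference lies in what happens to later statements that no longer touch it:

SET SESSION binlog_format = 'MIXED';

CREATE TEMPORARY TABLE tmp_ids (id INT PRIMARY KEY);
INSERT INTO tmp_ids SELECT id FROM orders WHERE created_at < '2018-01-01';

-- Touches the temporary table: logged in row format by both servers.
DELETE FROM orders WHERE id IN (SELECT id FROM tmp_ids);

-- Touches no temporary table: stock MySQL keeps logging the session in row
-- format until the temporary table is dropped, while Percona Server applies
-- the usual statement vs. row decision to this statement again.
UPDATE customers SET status = 'inactive' WHERE id = 42;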

Temporary table drops and binlogging on a GTID-enabled server

Summary of the fix:

MySQL logs DROP statements for all temporary tables regardless of the logging mode under which these tables were created. This produces binlog writes and errant GTIDs on slaves with row and mixed logging. Percona Server for MySQL fixes this by tracking the binlog format at temporary table creation time and using it to decide whether a DROP should be logged or not.

Details:

Even with read_only mode enabled, the server permits some operations, including ones with temporary tables. With the previous fix, temporary table operations are not binlogged in row or mixed mode. But MySQL Server doesn’t track what the logging mode was when a temporary table was created, and therefore unconditionally logs DROP statements for all temporary tables. These DROP statements receive an IF EXISTS addition, which is intended to make them harmless.

Percona Server for MySQL has fixed this with the bug fixes #964, upstream #83003, and upstream #85258. Moreover, with all the binlogging fixes discussed so far, nothing involving temporary tables is logged to the binary log in row or mixed format. There is no need to consider CREATE/DROP TEMPORARY TABLE unsafe for use in stored functions, triggers and multi-statement transactions in row/mixed format. Therefore, we introduced an additional fix to mark the creation and drop of temporary tables as unsafe inside transactions in statement-based replication only (the fixed bug is #1816, while the corresponding upstream one is #89467 and it is still open).

Safety of statements with a LIMIT clause

Summary of the fix:

MySQL Server considers all UPDATE/DELETE/INSERT ... SELECT statements with the LIMIT clause unsafe, regardless of whether they really produce non-deterministic results. Percona Server for MySQL is more accurate, because it treats such statements as safe when they include an ORDER BY PK or WHERE condition.

Details:

MySQL Server treats UPDATE/DELETE/INSERT ... SELECT statements with the LIMIT clause as unsafe, considering that they produce an unpredictable number of rows. But some such statements can still produce an absolutely predictable result. One such deterministic case takes place when a statement with the LIMIT clause has an ORDER BY PK or WHERE condition.

The patch, which treats updates and deletes with a LIMIT as safe if they have an ORDER BY pk_column clause, was initially provided on the upstream bug report and later incorporated into Percona Server for MySQL with additional improvements. Bug fixed #44 (upstream #42415).
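
As an illustration (a hypothetical table with id as its primary key), a statement like the following is flagged as unsafe by stock MySQL, while Percona Server treats it as safe, because the ORDER BY on the primary key makes the set of affected rows deterministic:

DELETE FROM orders
WHERE status = 'archived'
ORDER BY id
LIMIT 1000;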

Performance improvements

There are also two modifications in Percona Server related to multi-source replication that improve performance on slaves.

The first improvement concerns the relay log position, which was always updated in multi-source replication setups regardless of whether the committed transaction had already been executed or not. Percona Server omits relay log position updates for already logged GTIDs.

These unconditional relay log position updates caused additional fsync operations in the case of relay-log-info-repository=TABLE. With a higher number of channels transmitting such duplicate (already executed) transactions, the situation became proportionally worse. The problem was solved in Percona Server 5.7.18-14. Bug fixed #1786 (upstream #85141).

The second improvement decreases the load on slave nodes configured to update the master status and connection information only on log file rotation. MySQL additionally updated this information in the case of multi-source replication when a slave had to skip the already executed GTID event. This behavior was the cause of substantially higher write loads on slaves and lower replication throughput.

The configuration with master_info_repository=TABLE and sync_master_info=0 makes the slave update the master status and connection information in this table on log file rotation, and not after each sync_master_info event, but it didn’t work on multi-source replication setups. Heartbeats sent to the slave to skip GTID events that it had already executed were evaluated as relay log rotation events, triggering a sync of the mysql.slave_master_info table. This inaccuracy could produce a huge (up to five times on some setups) increase in write load on the slave, before this problem was fixed in Percona Server for MySQL 5.7.20-19. Bug fixed #1812 (upstream #85158).
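
For reference, a minimal my.cnf sketch of the slave-side settings discussed in this section (values are examples only):

[mysqld]
# crash-safe, transactional repositories for replication metadata
relay_log_info_repository = TABLE
master_info_repository    = TABLE
# update mysql.slave_master_info on relay log rotation rather than on every event
sync_master_info          = 0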

Current status of fixes

The three issues related to temporary tables were fixed in Percona Server 5.5 and contributed upstream, and the final fixes for bugs #72475, #83003, and #85258 have landed in MySQL Server 8.0.4.

The post Binlog and Replication Improvements in Percona Server for MySQL appeared first on Percona Database Performance Blog.

Apr 13, 2018
--

MongoDB Replica Set Tag Sets


In this blog post, we will look at MongoDB replica set tag sets, which enable you to use customized write concern and read preferences for replica set members.

This blog post will cover most of the questions that come to mind before using tag sets in a production environment.

  • What scenarios are these helpful for?
  • Do these tag sets work with all read preferences modes?
  • What if we’re already using maxStalenessSeconds along with the read preferences, can we still use a tag set?
  • How can one configure tag sets in a replica set?
  • Do these tags work identically for custom read preferences and write concerns?

Now let’s answer all these questions one by one.

What scenarios are these helpful for?

You can use tags:

  • If replica set members have different configurations and queries need to be redirected to the specific secondaries as per their purpose. For example, production queries can be redirected to the higher configuration member for faster execution and queries used for internal reporting purpose can be redirected to the low configurations secondaries. This will help improve per node resource utilization.
  • When you use custom read preferences, but the reads are routed to a secondary that resides in another data center to make reads more optimized and cost-effective. You can use tag sets to make sure that specific reads are routed to the specific secondary node within the DC.
  • If you want to use custom write concerns with the tag set for acknowledging writes are propagated to the secondary nodes per the requirements.

Do these tag sets work with all read preferences modes?

Yes, these tag sets work with all the read preference modes except “primary”. The “primary” read preference mode doesn’t allow you to add any tag sets while querying.

replicaTest:PRIMARY> db.tagTest.find().readPref('primary', [{"specs" : "low","purpose" : "general"}])
Error: error: {
	"ok" : 0,
	"errmsg" : "Only empty tags are allowed with primary read preference",
	"code" : 2,
	"codeName" : "BadValue"
}

What if we’re already using maxStalenessSeconds along with the read preferences, can a tag set still be used?

Yes, you can use tag sets with a maxStalenessSeconds value. In that case, priority is given to staleness first, then tags, to get the most recent data from the secondary member.
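
For example, both settings can be combined in a connection string; a hedged sketch (hosts are placeholders, the replica set name matches the shell prompt shown above, and maxStalenessSeconds must be at least 90):

mongodb://host1:27017,host2:27017,host3:27017/test?replicaSet=replicaTest&readPreference=secondary&maxStalenessSeconds=120&readPreferenceTags=specs:low,purpose:general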

How can one configure tag sets in a replica set?

You can configure tags by adding a parameter in the replica set configuration. Consider this test case with a five-member replica set:

"members" : [
		{
			"_id" : 0,
			"name" : "host1:27017",
			"stateStr" : "PRIMARY",
		},
		{
			"_id" : 1,
			"name" : "host2:27017",
			"stateStr" : "SECONDARY",
		},
		{
			"_id" : 2,
			"name" : "host3:27017",
			"stateStr" : "SECONDARY",
		},
		{
			"_id" : 3,
			"name" : "host4:27017",
			"stateStr" : "SECONDARY",
		},
		{
			"_id" : 4,
			"name" : "host5:27017",
			"stateStr" : "SECONDARY",
         }
		]

For our test case, the hardware specification of each member is tagged as “specs” and the application’s requirement for the query is tagged as “purpose”, in order to route queries to specific members in an optimized manner.

You must associate tags to each member by adding it to the replica set configuration:

cfg=rs.conf()
cfg.members[0].tags={"specs":"high","purpose":"analytics"}
cfg.members[1].tags={"specs":"high"}
cfg.members[2].tags={"specs":"low","purpose":"general"}
cfg.members[3].tags={"specs":"high","purpose":"analytics"}
cfg.members[4].tags={"specs":"low"}
rs.reconfig(cfg)

After adding tags, you can validate these changes by checking replica set configurations like:

rs.conf()
	"members" : [
		{
			"_id" : 0,
			"host" : "host1:27017",
			"tags" : {
				"specs" : "high",
				"purpose" : "analytics"
			},
		},
		{
			"_id" : 1,
			"host" : "host2:27017",
			"tags" : {
				"specs" : "high"
			},
		},
		{
			"_id" : 2,
			"host" : "host3:27017",
			"tags" : {
				"specs" : "low",
				"purpose" : "general"
			},
		},
		{
			"_id" : 3,
			"host" : "host4:27017",
			"tags" : {
				"specs" : "high",
				"purpose" : "analytics"
			},
		},
		{
			"_id" : 4,
			"host" : "host5:27017",
			"tags" : {
				"specs" : "low"
			},
		}
	]

Now, we are done with the tag-set configuration.

Do these tags work identically for custom read preferences and write concerns?

No, custom read preferences and write concerns consider tag sets in different ways.

Read preferences route read operations to a specific member by matching the tag values assigned to it, but write concerns use tag values only to require acknowledgment from members carrying distinct values of the tag. They do not consider the tag values themselves when selecting replica set members.

Let us see how to use tag sets with write concerns. As per our test case, we have two unique tag values (i.e., “analytics” and “general”) defined as:

cfg=rs.conf()
cfg.settings={ getLastErrorModes: {writeNode:{"purpose": 2}}}
rs.reconfig(cfg)

You can validate these changes by checking the replica set configuration:

rs.conf()
	"settings" : {
			"getLastErrorModes" : {
			"writeNode" : {
				"purpose" : 2
			}
		},
	}

Now let’s try to insert a sample document in the collection named “tagTest” with this write concern:

db.tagTest.insert({name:"tom",tech:"nosql",status:"active"},{writeConcern:{w:"writeNode"}})
WriteResult({ "nInserted" : 1 })

Here, the write concern “writeNode” means the client gets a write acknowledgment from two nodes with unique tag set values. If the value set in the configuration exceeds the count of unique values, then it leads to an error at the time of the write:

cfg.settings={ getLastErrorModes: {writeNode:{"purpose": 4}}}
rs.reconfig(cfg)
db.tagTest.insert({name:"tom",tech:"nosql",status:"active"},{writeConcern:{w:"writeNode"}})
WriteResult({
	"nInserted" : 1,
	"writeConcernError" : {
		"code" : 100,
		"codeName" : "CannotSatisfyWriteConcern",
		"errmsg" : "Not enough nodes match write concern mode "writeNode""
	}
}

You can perform read and write operations with tag sets like this:

db.tagTest.find({name:"tom"}).readPref("secondary",[{"specs":"low","purpose":"general"}])
db.tagTest.insert({name:"john",tech:"rdbms",status:"active"},{writeConcern:{w:"writeNode"}})

I hope this helps you to understand how to configure MongoDB replica set tag sets, how the read preferences and write concerns handle them, and where you can use them.

The post MongoDB Replica Set Tag Sets appeared first on Percona Database Performance Blog.

Apr 12, 2018
--

Flashback: Another Take on Point-In-Time Recovery (PITR) in MySQL/MariaDB/Percona Server


In this blog post, I’ll look at point-in-time recovery (PITR) options for MySQL, MariaDB and Percona Server for MySQL.

It is a common good practice to extend data safety by having additional measures apart from regular data backups, such as delayed slaves and binary log backups. These two options provide the ability to restore the data to any given point in time, or just revert from some bad accidents. These methods have their limitations of course: delayed slaves only help if a deadly mistake is noticed fast enough, while full point-in-time recovery (PITR) requires the last full backup and binary logs (and therefore usually takes a lot of time).

How to reverse from disaster faster

Alibaba engineers and the MariaDB team implemented an interesting feature in their version of the mysqlbinlog tool: the --flashback option. Based on ROW-based DML events, it can transform binary log events into their opposites. That means it can help undo given row changes extremely fast. For instance, it can change DELETE events to INSERTs and vice versa, and it will swap the WHERE and SET parts of UPDATE events. This simple idea can dramatically speed up recovery from certain types of mistakes or disasters.

The question is whether it works with non-MariaDB variants. To verify that, I tested this feature with the latest available Percona Server for MySQL 5.7 (which is fully compatible with upstream MySQL).

master [localhost] {msandbox} ((none)) > select @@version,@@version_comment;
+---------------+--------------------------------------------------------+
| @@version     | @@version_comment                                      |
+---------------+--------------------------------------------------------+
| 5.7.21-20-log | Percona Server (GPL), Release 20, Revision ed217b06ca3 |
+---------------+--------------------------------------------------------+
1 row in set (0.00 sec)

First, let’s simulate one possible deadly scenario: a forgotten WHERE clause in a DELETE statement:

master [localhost] {msandbox} ((none)) > select count(*) from test.sbtest1;
+----------+
| count(*) |
+----------+
| 200      |
+----------+
1 row in set (0.00 sec)
master [localhost] {msandbox} ((none)) > delete from test.sbtest1;
Query OK, 200 rows affected (0.04 sec)
slave1 [localhost] {msandbox} ((none)) > select count(*) from test.sbtest1;
+----------+
| count(*) |
+----------+
| 0        |
+----------+
1 row in set (0.00 sec)

So, our data is lost on both the master and slave!

Let’s start by downloading the latest MariaDB server 10.2.x package, which I’m hoping has a mysqlbinlog tool that works with MySQL 5.7, and unpack it to some custom location:

$ dpkg -x mariadb-server-10.2_10.2.13+maria~wheezy_amd64.deb /opt/maria/
$ /opt/maria/usr/bin/mysqlbinlog --help|grep flash
-B, --flashback Flashback feature can rollback you committed data to a

It has the function we are looking for. Now, we have to find the culprit transaction or set of transactions we want to revert. A simplified example may look like this:

$ mysqlbinlog -v --base64-output=DECODE-ROWS mysql-bin.000002 > mysql-bin.000002.sql
$ less mysql-bin.000002.sql

By searching through the decoded binary log, we are looking for transactions that have wiped out the table test.sbtest1. It looks like this (as the table had 200 rows, it is pretty long, so I’m pasting only the beginning and the end):

BEGIN
/*!*/;
# at 291
#180314 15:30:34 server id 1  end_log_pos 348 CRC32 0x06cd193e  Table_map: `test`.`sbtest1` mapped to number 111
# at 348
#180314 15:30:34 server id 1  end_log_pos 8510 CRC32 0x064634c5         Delete_rows: table id 111
...
### DELETE FROM `test`.`sbtest1`
### WHERE
###   @1=200
###   @2=101
###   @3='26157116088-21551255803-13077038767-89418462090-07321921109-99464656338-95996554805-68102077806-88247356874-53904987561'
###   @4='51157774706-69740598871-18633441857-39587481216-98251863874'
# at 38323
#180314 15:30:34 server id 1  end_log_pos 38354 CRC32 0x6dbb7127        Xid = 97
COMMIT/*!*/;

It is very important to take the proper start and stop positions. We need the ones exactly after BEGIN and before the final COMMIT. Then, let’s test if the tool produces the reverse statements as expected. First, decode the rows to the .sql file:

$ /opt/maria/usr/bin/mysqlbinlog --flashback -v --base64-output=DECODE-ROWS --start-position=291 --stop-position=38323 mysql-bin.000002 > mysql-bin.000002_flash.sql

Inside, we find 200 of those. Looks good:

### INSERT INTO `test`.`sbtest1`
### SET
### @1=200
...

Since we verified the positions are correct, we can prepare a binary log file:

$ /opt/maria/usr/bin/mysqlbinlog --flashback --start-position=291 --stop-position=38323 mysql-bin.000002 > mysql-bin.000002_flash.bin

and load it back to our master:

master [localhost] {msandbox} (test) > source mysql-bin.000002_flash.bin
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected, 1 warning (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.04 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
master [localhost] {msandbox} (test) > select count(*) from test.sbtest1;
+----------+
| count(*) |
+----------+
| 200      |
+----------+
1 row in set (0.00 sec)

and double-check that they were restored on the slaves:

slave1 [localhost] {msandbox} (test) > select count(*) from test.sbtest1;
+----------+
| count(*) |
+----------+
| 200      |
+----------+
1 row in set (0.00 sec)

GTID problem

MariaDB has a completely different GTID implementation from MySQL and Percona Server. You can expect problems when decoding incompatible GTID-enabled binary logs with MariaDB. As MariaDB’s mysqlbinlog does not support --start/stop-gtid options (even for its own implementation), we have to take the usual positions anyway. From a GTID-enabled binary log, for example, a delete can look like this:

# at 2300
#180315 9:37:31 server id 1 end_log_pos 2365 CRC32 0x09e4d815 GTID last_committed=1 sequence_number=2 rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= '00020996-1111-1111-1111-111111111111:2'/*!*/;
# at 2365
#180315 9:37:31 server id 1 end_log_pos 2433 CRC32 0xac62a20d Query thread_id=4 exec_time=0 error_code=0
SET TIMESTAMP=1521103051/*!*/;
BEGIN
/*!*/;
# at 2433
#180315 9:37:31 server id 1 end_log_pos 2490 CRC32 0x275601d6 Table_map: `test`.`sbtest1` mapped to number 108
# at 2490
#180315 9:37:31 server id 1 end_log_pos 10652 CRC32 0xe369e169 Delete_rows: table id 108
...
# at 42355
#180315 9:37:31 server id 1 end_log_pos 42386 CRC32 0xe01ff558 Xid = 31
COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;

The tool seems to work, and transforms the delete transaction to a sequence of INSERTs. However, the server rejects it when we try to load it on a GTID-enabled master:

master [localhost] {msandbox} ((none)) > source mysql-bin.000003.flash
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected, 1 warning (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
ERROR 1782 (HY000): @@SESSION.GTID_NEXT cannot be set to ANONYMOUS when @@GLOBAL.GTID_MODE = ON.
ERROR 1782 (HY000): @@SESSION.GTID_NEXT cannot be set to ANONYMOUS when @@GLOBAL.GTID_MODE = ON.
ERROR 1782 (HY000): @@SESSION.GTID_NEXT cannot be set to ANONYMOUS when @@GLOBAL.GTID_MODE = ON.
ERROR 1782 (HY000): @@SESSION.GTID_NEXT cannot be set to ANONYMOUS when @@GLOBAL.GTID_MODE = ON.
ERROR 1782 (HY000): @@SESSION.GTID_NEXT cannot be set to ANONYMOUS when @@GLOBAL.GTID_MODE = ON.
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected, 1 warning (0.00 sec)
master [localhost] {msandbox} ((none)) > select count(*) from test.sbtest1;
+----------+
| count(*) |
+----------+
| 0        |
+----------+
1 row in set (0.00 sec)

Unfortunately, the solution here is either to disable GTID mode for the duration of the recovery (which is surely tricky in replicated clusters), or to try to add GTID-related information to the binary log produced with the --flashback option. In my case, adding these lines worked (I used the next free available GTID sequence):

$ diff -u mysql-bin.000003.flash mysql-bin.000003.flash.gtid
--- mysql-bin.000003.flash 2018-03-15 10:20:20.080487998 +0100
+++ mysql-bin.000003.flash.gtid 2018-03-15 10:25:02.909953620 +0100
@@ -4,6 +4,10 @@
DELIMITER /*!*/;
#180315 9:32:51 server id 1 end_log_pos 123 CRC32 0x941b189a Start: binlog v 4, server v 5.7.21-20-log created 180315 9:32:51 at startup
ROLLBACK/*!*/;
+# at 154
+#180315 9:37:05 server id 1 end_log_pos 219 CRC32 0x69e4ce26 GTID last_committed=0 sequence_number=1 rbr_only=yes
+/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
+SET @@SESSION.GTID_NEXT= '00020996-1111-1111-1111-111111111111:5'/*!*/;
BINLOG '
sy+qWg8BAAAAdwAAAHsAAAAAAAQANS43LjIxLTIwLWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAACzL6paEzgNAAgAEgAEBAQEEgAAXwAEGggAAAAICAgCAAAACgoKKioAEjQA
@@ -724,6 +728,7 @@
'/*!*/;
COMMIT
/*!*/;
+SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;

master [localhost] {msandbox} ((none)) > source mysql-bin.000003.flash.gtid
(...)
master [localhost] {msandbox} ((none)) > select count(*) from test.sbtest1;
+----------+
| count(*) |
+----------+
| 200      |
+----------+
1 row in set (0.00 sec)

Limitations

Obviously, flashback cannot help after DROP/TRUNCATE or other DDL commands. These are not transactional, and affected rows are never recorded in the binary log. It doesn’t work with encrypted or compressed binary logs either. But most importantly, to produce complete events that can reverse bad transactions, the binlog format must be ROW. The row image also must be FULL:

master [localhost] {msandbox} ((none)) > select @@binlog_format,@@binlog_row_image;
+-----------------+--------------------+
| @@binlog_format | @@binlog_row_image |
+-----------------+--------------------+
| ROW             | FULL               |
+-----------------+--------------------+
1 row in set (0.00 sec)

If these conditions are not met (or if you’re dealing with a too-complicated GTID issue), you will have to follow the standard point-in-time recovery procedure.
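
For completeness, the standard procedure is roughly: restore the most recent full backup, then replay the binary logs up to just before the offending event. A hedged sketch (positions and timestamps are placeholders):

# after restoring the last full backup, replay binlog events up to the disaster
$ mysqlbinlog --start-position=4 --stop-datetime="2018-03-14 15:30:33" \
    mysql-bin.000002 | mysql -u root -p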

The post Flashback: Another Take on Point-In-Time Recovery (PITR) in MySQL/MariaDB/Percona Server appeared first on Percona Database Performance Blog.
