Apr 18, 2018

Why Analyze Raw MySQL Query Logs?

In this blog post, I’ll examine when looking at raw MySQL query logs can be more useful than working with tools that only have summary data.

In my previous blog post, I wrote about analyzing MySQL Slow Query Logs with ClickHouse and ClickTail. One of the follow-up questions I got was when you would want to do that, rather than just using tools like Percona Monitoring and Management or VividCortex, which provide a beautiful interface for detailed analysis (instead of a spartan SQL interface).

MySQL Logs

A lot of folks are confused about what query logs MySQL has, and what you can use them for. First, MySQL has a “General Query Log”. As the name implies, this is a general-purpose query log. You would think this is the first log you should use, but it is, in fact, pretty useless:

2018-03-31T15:38:44.521650Z      2356 Query SELECT c FROM sbtest1 WHERE id=164802
2018-03-31T15:38:44.521790Z      2356 Query SELECT c FROM sbtest1 WHERE id BETWEEN 95241 AND 95340
2018-03-31T15:38:44.522168Z      2356 Query SELECT SUM(k) FROM sbtest1 WHERE id BETWEEN 1 AND 100
2018-03-31T15:38:44.522500Z      2356 Query SELECT c FROM sbtest1 WHERE id BETWEEN 304556 AND 304655 ORDER BY c
2018-03-31T15:38:44.522941Z      2356 Query SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN 924 AND 1023 ORDER BY c
2018-03-31T15:38:44.523525Z      2356 Query UPDATE sbtest1 SET k=k+1 WHERE id=514

As you can see, it only has very limited information about queries: no query execution time or which user is running the query. This type of log is helpful if you want to see very clean, basic information on what queries your application is really running. It can also help debug MySQL crashes because, unlike other log formats, the query is written to this log file before MySQL attempts to execute the query.
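
If you do want to look at it, the general query log can be switched on at runtime. Here is a minimal sketch using the standard MySQL variables (the file path is only an illustrative example; pick one that suits your system):

SET GLOBAL log_output = 'FILE';                              -- write to a file rather than a table
SET GLOBAL general_log_file = '/var/log/mysql/general.log';  -- example path, adjust as needed
SET GLOBAL general_log = 'ON';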

The MySQL Slow Query Log is, in my opinion, much more useful (especially with the Percona Server Slow Query Log Extensions). Again, as the name implies, you would think it is only used for slow queries (and by default, it is). However, you can set long_query_time to 0 (along with a few other options) to capture all queries here, with lots of rich information about query execution:

# Time: 2018-03-31T15:48:55.795145Z
# User@Host: sbtest[sbtest] @ localhost []  Id: 2332
# Schema: sbtest  Last_errno: 0 Killed: 0
# Query_time: 0.000143  Lock_time: 0.000047 Rows_sent: 1  Rows_examined: 1 Rows_affected: 0
# Bytes_sent: 188  Tmp_tables: 0 Tmp_disk_tables: 0  Tmp_table_sizes: 0
# QC_Hit: No  Full_scan: No Full_join: No  Tmp_table: No Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0
#   InnoDB_IO_r_ops: 0  InnoDB_IO_r_bytes: 0  InnoDB_IO_r_wait: 0.000000
#   InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
#   InnoDB_pages_distinct: 0
# Log_slow_rate_type: query  Log_slow_rate_limit: 10
SET timestamp=1522511335;
SELECT c FROM sbtest1 WHERE id=2428336;

Finally, there is the MySQL Audit Log, which is part of the MySQL Enterprise offering, and the format-compatible Percona Server for MySQL Audit Log Plugin. This log is designed for auditing access to the server, and as such it records the details relevant to that purpose. Unlike the first two log formats, it is designed first and foremost to be machine-readable and supports JSON, XML and CSV output formats:

{"audit_record":{"name":"Query","record":"743017006_2018-03-31T01:03:12","timestamp":"2018-03-31T15:53:42 UTC","command_class":"select","connection_id":"2394","status":0,"sqltext":"SELECT SUM(k) FROM sbtest1 WHERE id BETWEEN 3 AND 102","user":"sbtest[sbtest] @ localhost []","host":"localhost","os_user":"","ip":"","db":"sbtest"}}
{"audit_record":{"name":"Query","record":"743017007_2018-03-31T01:03:12","timestamp":"2018-03-31T15:53:42 UTC","command_class":"select","connection_id":"2394","status":0,"sqltext":"SELECT c FROM sbtest1 WHERE id BETWEEN 2812021 AND 2812120 ORDER BY c","user":"sbtest[sbtest] @ localhost []","host":"localhost","os_user":"","ip":"","db":"sbtest"}}
{"audit_record":{"name":"Query","record":"743017008_2018-03-31T01:03:12","timestamp":"2018-03-31T15:53:42 UTC","command_class":"select","connection_id":"2394","status":0,"sqltext":"SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN 1 AND 100 ORDER BY c","user":"sbtest[sbtest] @ localhost []","host":"localhost","os_user":"","ip":"","db":"sbtest"}}

As you can see, there are substantial differences in the purposes of the different MySQL log formats, along with the information they provide.

Why analyze raw MySQL query logs

In my opinion, there are two main reasons to look directly at raw log files without aggregation (you might find others):

  • Auditing, where the Audit Log is useful (Vadim recently blogged about it)
  • Advanced MySQL/application debugging, where an aggregated summary might not allow you to drill down to the fullest level of detail

When you’re debugging using MySQL logs, the Slow Query Log, set to log all queries with no sampling, is the most useful. Of course, this can cause significant additional overhead in many workloads, so it is best to do it in a development environment (if you can repeat the situation you’re looking to analyze). At the very least, don’t do it during peak time.

For Percona Server for MySQL, these options ensure it logs all queries to the slow query log with no sampling:

log_output=file
slow_query_log=ON
long_query_time=0
log_slow_rate_limit=1
log_slow_verbosity=full
log_slow_admin_statements=ON
log_slow_slave_statements=ON
slow_query_log_always_write_time=1
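
Most of these variables are dynamic, so if you only need a short capture window you can also turn full logging on (and later back off) at runtime. A rough sketch, assuming Percona Server so that the log_slow_* variables exist (note that a GLOBAL long_query_time only affects connections opened after the change):

SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 0;          -- log every query (new connections only)
SET GLOBAL log_slow_rate_limit = 1;      -- no sampling
SET GLOBAL log_slow_verbosity = 'full';  -- include the extended Percona Server fields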

Now that we have full queries, we can easily use Linux command line tools like grep and others to look into what is going on. However, this isn’t always convenient. This is where loading logs into storage that you can conveniently query is a good solution.

Let’s look into some specific and interesting cases.

Were any queries killed?

SELECT
   _time,
   query,
   query_time
FROM mysql_slow_log
WHERE killed > 0
┌───────────────_time─┬─query───────────────────────────────┬─query_time─┐
│ 2018-04-02 19:02:56 │ select benchmark(10000000000,"1+1") │  10.640794 │
└─────────────────────┴─────────────────────────────────────┴────────────┘
1 rows in set. Elapsed: 0.242 sec. Processed 929.14 million rows, 1.86 GB (3.84 billion rows/s., 7.67 GB/s.)

Yes. A query got killed after running for 10 seconds.

Did any query fail? With what error codes?

SELECT
   error_num,
   min(_time),
   max(_time),
   count(*)
FROM mysql_slow_log
GROUP BY error_num
┌─error_num─┬──────────min(_time)─┬──────────max(_time)─┬───count()─┐
│         0 │ 2018-04-02 18:59:49 │ 2018-04-07 19:39:27 │ 925428375 │
│      1160 │ 2018-04-02 19:02:56 │ 2018-04-02 19:02:56 │         1 │
│      1213 │ 2018-04-02 19:00:00 │ 2018-04-07 19:18:14 │   3709520 │
│      1054 │ 2018-04-07 19:38:14 │ 2018-04-07 19:38:14 │         1 │
└───────────┴─────────────────────┴─────────────────────┴───────────┘
4 rows in set. Elapsed: 2.391 sec. Processed 929.14 million rows, 7.43 GB (388.64 million rows/s., 3.11 GB/s.)

You can resolve error codes with the perror command:

root@rocky:~# perror 1054
MySQL error code 1054 (ER_BAD_FIELD_ERROR): Unknown column '%-.192s' in '%-.192s'

This command has many uses. You can use it to hunt down application issues (like in this example of a missing column — likely due to bad or old code). It can also help you to spot SQL injection attempts that often cause queries with bad syntax, and troubleshoot deadlocks or foreign key violations.
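
For example, once you see deadlocks (error 1213) in the breakdown above, a query along these lines against the same mysql_slow_log table shows which statements hit them most often (a sketch using the schema from the examples in this post):

SELECT
   query,
   count(*) AS deadlocks
FROM mysql_slow_log
WHERE error_num = 1213
GROUP BY query
ORDER BY deadlocks DESC
LIMIT 10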

Are there any nasty, long transactions?

SELECT
   transaction_id,
   max(_time) - min(_time) AS run_time,
   count(*) AS num_queries,
   sum(rows_affected) AS rows_changed
FROM mysql_slow_log
WHERE transaction_id != ''
GROUP BY transaction_id
ORDER BY rows_changed DESC
LIMIT 10
┌─transaction_id─┬─run_time─┬─num_queries─┬─rows_changed─┐
│ 17E070082      │        0 │           1 │         9999 │
│ 17934C73C      │        2 │           6 │            4 │
│ 178B6D346      │        0 │           6 │            4 │
│ 17C909086      │        2 │           6 │            4 │
│ 17B45EFAD      │        5 │           6 │            4 │
│ 17ABAB840      │        0 │           6 │            4 │
│ 17A36AD3F      │        3 │           6 │            4 │
│ 178E037A5      │        1 │           6 │            4 │
│ 17D1549C9      │        0 │           6 │            4 │
│ 1799639F2      │        1 │           6 │            4 │
└────────────────┴──────────┴─────────────┴──────────────┘
10 rows in set. Elapsed: 15.574 sec. Processed 930.58 million rows, 18.23 GB (59.75 million rows/s., 1.17 GB/s.)

Finding transactions that modify a lot of rows, like transaction 17E070082 above, can be very helpful to ensure you control MySQL replication slave lag. It is also critical if you’re looking to migrate to MySQL Group Replication or Percona XtraDB Cluster.

What statements were executed in a long transaction?

SELECT
   _time,
   _ms,
   query
FROM mysql_slow_log
WHERE transaction_id = '17E070082'
ORDER BY
   _time ASC,
   _ms ASC
LIMIT 10
┌───────────────_time─┬────_ms─┬─query─────────────────────────────────┐
│ 2018-04-07 20:08:43 │ 890693 │ update sbtest1 set k=0 where id<10000 │
└─────────────────────┴────────┴───────────────────────────────────────┘
1 rows in set. Elapsed: 2.361 sec. Processed 931.04 million rows, 10.79 GB (394.27 million rows/s., 4.57 GB/s.)

I used transaction 17E070082 from the previous query above (which modified 9999 rows). Note that this schema improves compression by storing the seconds and microseconds parts of the timestamp in different columns.

Were any queries dumping large numbers of rows from the database?

SELECT
   _time,
   query,
   rows_sent,
   bytes_sent
FROM mysql_slow_log
WHERE rows_sent > 10000
┌───────────────_time─┬─query────────────────────────────────────────────┬─rows_sent─┬─bytes_sent─┐
│ 2018-04-07 20:21:08 │ SELECT /*!40001 SQL_NO_CACHE */ * FROM `sbtest1`  │  10000000 │ 1976260712 │
└─────────────────────┴───────────────────────────────────────────────────┴───────────┴────────────┘
1 rows in set. Elapsed: 0.294 sec. Processed 932.19 million rows, 3.73 GB (3.18 billion rows/s., 12.71 GB/s.)

Did someone update a record?

SELECT
   _time,
   query
FROM mysql_slow_log
WHERE (rows_affected > 0) AND (query LIKE '%id=3301689%')
LIMIT 1
┌───────────────_time─┬─query─────────────────────────────────────┐
│ 2018-04-02 19:04:48 │ UPDATE sbtest1 SET k=k+1 WHERE id=3301689 │
└─────────────────────┴───────────────────────────────────────────┘
1 rows in set. Elapsed: 0.046 sec. Processed 2.29 million rows, 161.60 MB (49.57 million rows/s., 3.49 GB/s.)

Note that I’m cheating here by assuming we know the update used a primary key lookup, but this approach is still practically helpful in a lot of cases.

These are just some of the examples of what you can find out by querying raw slow query logs. They contain a ton of information about query execution (especially in Percona Server for MySQL) that allows you to use them both for performance analysis and some security and auditing purposes.


Sep 13, 2016

MySQL CDC, Streaming Binary Logs and Asynchronous Triggers

In this post, we’ll look at MySQL CDC, streaming binary logs and asynchronous triggers.

What is Change Data Capture and why do we need it?

Change Data Capture (CDC) tracks data changes (usually in close to real time). In MySQL, the easiest and probably most efficient way to track data changes is to use binary logs. However, other approaches exist. For example:

  • General log or Audit Log Plugin (which logs all queries, not just the changes)
  • MySQL triggers (not recommended, as they can slow down the application; more on this below)

One of the first implementations of CDC for MySQL was the FlexCDC project by Justin Swanhart. Nowadays, there are a lot of CDC implementations (see mysql-cdc-projects wiki for a long list).

CDC can be implemented for various tasks, such as auditing, copying data to another system or processing (and reacting to) events. In this blog post, I will demonstrate how to use a CDC approach to stream MySQL binary logs, process the events and save them (stream them) to another MySQL instance (or MongoDB). In addition, I will show how to implement asynchronous triggers by streaming binary logs.

Streaming binary logs 

You can read binary logs using the mysqlbinlog utility by adding “-vvv” (verbose option), which also makes it show a human-readable version of ROW-based replication events. For example:

# mysqlbinlog -vvv /var/lib/mysql/master.000001
BINLOG '
JxiqVxMBAAAALAAAAI7LegAAAHQAAAAAAAEABHRlc3QAAWEAAQMAAUTAFAY=
JxiqVx4BAAAAKAAAALbLegAAAHQAAAAAAAEAAgAB//5kAAAAedRLHg==
'/*!*/;
### INSERT INTO `test`.`a`
### SET
###   @1=100 /* INT meta=0 nullable=1 is_null=0 */
# at 8047542
#160809 17:51:35 server id 1  end_log_pos 8047573 CRC32 0x56b36ca5      Xid = 24453
COMMIT/*!*/;

Starting with MySQL 5.6, mysqlbinlog can also read the binary log events from a remote master (“fake” replication slave).

Reading binary logs is a great basis for CDC. However, there are still some challenges:

  1. ROW-based replication is probably the easiest way to get the raw changes; otherwise, we would have to parse SQL. At the same time, ROW-based replication binary logs don’t contain the table metadata, i.e., they do not record the field names, only the field numbers (in the example above, “@1” is the first field in table “a”).
  2. We will need to somehow record and store the binary log positions so that the tool can be restarted at any time and proceed from the last position (like a MySQL replication slave).

Maxwell’s daemon (Maxwell = MySQL + Kafka), an application recently released by Zendesk, reads MySQL binlogs and writes row updates as JSON (it can write to Kafka, which is its primary goal, but it can also write to stdout and can be extended for other purposes). Maxwell stores the metadata about MySQL tables and binary log events (and other metadata) inside MySQL, so it solves the potential issues from the above list.
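
If you want to reproduce the demo below, Maxwell needs a MySQL user with replication privileges plus a schema for its own metadata. The grants look roughly like this (check the Maxwell documentation for the exact requirements for your version; the user name and password here simply match the demo command line):

CREATE USER 'maxwell'@'%' IDENTIFIED BY 'maxwell';
GRANT ALL ON maxwell.* TO 'maxwell'@'%';
GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'maxwell'@'%';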

Here is a quick demo of Maxwell:

Session 1 (Insert into MySQL):

mysql> insert into a (i) values (151);
Query OK, 1 row affected (0.00 sec)
mysql> update a set i = 300 limit 5;
Query OK, 5 rows affected (0.01 sec)
Rows matched: 5  Changed: 5  Warnings: 0

Session 2 (starting Maxwell):

$ ./bin/maxwell --user='maxwell' --password='maxwell' --host='127.0.0.1' --producer=stdout
16:00:15,303 INFO  Maxwell - Maxwell is booting (StdoutProducer), starting at BinlogPosition[master.000001:15494460]
16:00:15,327 INFO  TransportImpl - connecting to host: 127.0.0.1, port: 3306
16:00:15,350 INFO  TransportImpl - connected to host: 127.0.0.1, port: 3306, context: AbstractTransport.Context[threadId=9,...
16:00:15,350 INFO  AuthenticatorImpl - start to login, user: maxwell, host: 127.0.0.1, port: 3306
16:00:15,354 INFO  AuthenticatorImpl - login successfully, user: maxwell, detail: OKPacket[packetMarker=0,affectedRows=0,insertId=0,serverStatus=2,warningCount=0,message=<null>]
16:00:15,533 INFO  MysqlSavedSchema - Restoring schema id 1 (last modified at BinlogPosition[master.000001:3921])
{"database":"test","table":"a","type":"insert","ts":1472937475,"xid":211209,"commit":true,"data":{"i":151}}
{"database":"test","table":"a","type":"insert","ts":1472937475,"xid":211209,"commit":true,"data":{"i":151}}
{"database":"test","table":"a","type":"update","ts":1472937535,"xid":211333,"data":{"i":300},"old":{"i":150}}
{"database":"test","table":"a","type":"update","ts":1472937535,"xid":211333,"data":{"i":300},"old":{"i":150}}
{"database":"test","table":"a","type":"update","ts":1472937535,"xid":211333,"data":{"i":300},"old":{"i":150}}
{"database":"test","table":"a","type":"update","ts":1472937535,"xid":211333,"data":{"i":300},"old":{"i":150}}
{"database":"test","table":"a","type":"update","ts":1472937535,"xid":211333,"commit":true,"data":{"i":300},"old":{"i":150}}

As we can see in this example, Maxwell gets the events from the MySQL replication stream and outputs them to stdout (if we change the producer, it can save them to Apache Kafka).

Saving binlog events to MySQL document store or MongoDB

If we want to save the events to some other place, we can use MongoDB or MySQL JSON fields and the document store (as Maxwell will provide us with JSON documents). For a simple proof of concept, I’ve created nodeJS scripts to implement a CDC “pipeline”:

var mysqlx = require('mysqlx');
var mySession =
mysqlx.getSession({
    host: '10.0.0.2',
    port: 33060,
    dbUser: 'root',
    dbPassword: 'xxx'
});
process.on('SIGINT', function() {
    console.log("Caught interrupt signal. Exiting...");
    process.exit()
});
process.stdin.setEncoding('utf8');
process.stdin.on('readable', () => {
  var chunk = process.stdin.read();
  if(chunk != null) {
    process.stdout.write(`data: ${chunk}`);
    mySession.then(session => {
                    session.getSchema("mysqlcdc").getCollection("mysqlcdc")
                    .add(  JSON.parse(chunk)  ) .execute(function (row) {
                            // can log something here
                    }).catch(err => {
                            console.log(err);
                    })
                    .then( function (notices) {
                            console.log("Wrote to MySQL: " + JSON.stringify(notices))
                    });
    }).catch(function (err) {
                  console.log(err);
                  process.exit();
    });
  }
});
process.stdin.on('end', () => {
  process.stdout.write('end');
  process.stdin.resume();
});

And to run it we can use the pipeline:

./bin/maxwell --user='maxwell' --password='maxwell' --host='127.0.0.1' --producer=stdout --log_level=ERROR  | node ./maxwell_to_mysql.js

The same approach can be used to save the CDC events to MongoDB with mongoimport:

$ ./bin/maxwell --user='maxwell' --password='maxwell' --host='127.0.0.1' --producer=stdout --log_level=ERROR |mongoimport -d mysqlcdc -c mysqlcdc --host localhost:27017

Reacting to binary log events: asynchronous triggers

In the above example, we only recorded the binary log events. Now we can add “reactions”.

One of the practical applications is re-implementing MySQL triggers as something more performant. MySQL triggers are executed for each row and are synchronous (the query will not return until the trigger event finishes). This is known to cause poor performance, and can significantly slow down bulk operations (e.g., “load data infile” or “insert into … values (…), (…)”). With triggers, MySQL has to process the “bulk” operations row by row, killing the performance. In addition, when using statement-based replication, triggers on the slave can slow down the replication thread (this is much less relevant nowadays with ROW-based replication and potentially multithreaded slaves).

With the ability to read binary logs from MySQL (using Maxwell), we can process the events and re-implement triggers — now in asynchronous mode — without delaying MySQL operations. As Maxwell gives us a JSON document with the “new” and “old” values (with the default option binlog_row_image=FULL, MySQL records the previous values for updates and deletes) we can use it to create triggers.
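
For this to work, the master has to write row-based events with full row images. If that is not already the case, both settings are dynamic (a sketch; existing sessions keep their old binlog_format, and you should also put the equivalents into my.cnf so they survive a restart):

SET GLOBAL binlog_format = 'ROW';       -- row-based events, required for this kind of CDC
SET GLOBAL binlog_row_image = 'FULL';   -- record previous values for updates and deletes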

Not all triggers can be easily re-implemented based on the binary logs. However, in my experience most of the triggers in MySQL are used for:

  • auditing (if you deleted a row, what was the previous value, and who did it and when)
  • enriching the existing table (i.e., updating a field in the same table)

Here is a quick algorithm for how to re-implement the triggers with Maxwell:

  • Find the trigger table and trigger event text (SQL)
  • Create an app or a script to parse JSON for the trigger table
  • Create a new version of the SQL changing the NEW.<field> to “data.field” (from JSON) and OLD.<field> to “old.field” (from JSON)

For example, if I want to audit all deletes in the “transactions” table, I can do it with Maxwell and a simple Python script (do not use this in production, it is a very basic sample):

import json,sys
line = sys.stdin.readline()
while line:
    print line,
    obj=json.loads(line);
    if obj["type"] == "delete":
        print "INSERT INTO transactions_delete_log VALUES ('" + str(obj["data"]) + "', Now() )"
    line = sys.stdin.readline()

MySQL:

mysql> delete from transactions where user_id = 2;
Query OK, 1 row affected (0.00 sec)

Maxwell pipeline:

$ ./bin/maxwell --user='maxwell' --password='maxwell' --host='127.0.0.1' --producer=stdout --log_level=ERROR  | python trigger.py
{"database":"test","table":"transactions","type":"delete","ts":1472942384,"xid":214395,"commit":true,"data":{"id":2,"user_id":2,"value":2,"last_updated":"2016-09-03 22:39:31"}}
INSERT INTO transactions_delete_log VALUES ('{u'last_updated': u'2016-09-03 22:39:31', u'user_id': 2, u'id': 2, u'value': 2}', Now() )
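
The target table for this sketch can be as simple as the following (the column names here are just an example; adjust them to your needs):

CREATE TABLE transactions_delete_log (
  deleted_data TEXT,     -- string dump of the deleted row, as produced by the script
  deleted_at   DATETIME  -- when the delete event was processed
);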

Maxwell limitations

Maxwell was designed for MySQL 5.6 with ROW-based replication. Although it can work with MySQL 5.7, it does not support the new MySQL 5.7 data types (e.g., JSON fields). Maxwell does not support GTID and can’t fail over based on GTID (it can parse events with GTIDs, though).

Conclusion

Streaming MySQL binary logs (for example, with the Maxwell application) can help implement CDC for auditing and other purposes, and can also be used to implement asynchronous triggers (removing MySQL-level triggers can increase MySQL performance).
