Jan 20, 2015

Identifying useful info from MySQL row-based binary logs

As a MySQL DBA/consultant, it is part of my job to decode the MySQL binary logs – and there are a number of reasons for doing that. In this post, I’ll explain how you can get the important information about your write workload using MySQL row-based binary logs and a simple awk script.

First, it is important to understand that row-based binary logs contain the actual changes made by a query. For example, if I run a DELETE query against a table, the binary log will contain the rows that were deleted. MySQL provides the mysqlbinlog utility to decode the events stored in the binary logs. You can read more about mysqlbinlog in the MySQL Reference Manual.

The following example illustrates how mysqlbinlog displays row events that specify data modifications. These correspond to events with the WRITE_ROWS_EVENT, UPDATE_ROWS_EVENT, and DELETE_ROWS_EVENT type codes.

We will use the following mysqlbinlog options:
  • --base64-output=decode-rows
  • --verbose, -v (passed twice, as -vv, so the output also annotates columns with their data types)
  • --start-datetime="datetime"
  • --stop-datetime="datetime"

We have a server running with row-based binary logging.

mysql> show variables like '%binlog_format%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+
1 row in set (0.00 sec)
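
If a server is not yet running with row-based logging, the format can be switched at runtime for new sessions, or made persistent in the configuration file. A minimal sketch, assuming a user with the SUPER privilege and that binary logging (log-bin) is already enabled:

# Switch the binlog format for sessions that connect from now on
mysql -e "SET GLOBAL binlog_format = 'ROW';"

# To make the change persistent across restarts, add to my.cnf under [mysqld]:
#   log-bin       = mysqld-bin
#   binlog_format = ROW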

We created a test table and executed the following sequence of statements within a transaction.

use test;
CREATE TABLE t
(
  id   INT NOT NULL,
  name VARCHAR(20) NOT NULL,
  date DATE NULL
) ENGINE = InnoDB;
START TRANSACTION;
INSERT INTO t VALUES(1, 'apple', NULL);
UPDATE t SET name = 'pear', date = '2009-01-01' WHERE id = 1;
DELETE FROM t WHERE id = 1;
COMMIT;

Now let’s see how this transaction is represented in the binary log.

# mysqlbinlog --base64-output=decode-rows -vv  --start-datetime="2015-01-12 21:40:00"  --stop-datetime="2015-01-12 21:45:00"  mysqld-bin.000023
/*!*/;
# at 295
#150112 21:40:14 server id 1  end_log_pos 367 CRC32 0x19ab4f0f 	Query	thread_id=108	exec_time=0	error_code=0
SET TIMESTAMP=1421079014/*!*/;
BEGIN
/*!*/;
# at 367
#150112 21:40:14 server id 1  end_log_pos 415 CRC32 0x6b1f2240 	Table_map: `test`.`t` mapped to number 251
# at 415
#150112 21:40:14 server id 1  end_log_pos 461 CRC32 0x7725d174 	Write_rows: table id 251 flags: STMT_END_F
### INSERT INTO `test`.`t`
### SET
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='apple' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3=NULL /* VARSTRING(20) meta=0 nullable=1 is_null=1 */
# at 461
#150112 21:40:14 server id 1  end_log_pos 509 CRC32 0x7e44d741 	Table_map: `test`.`t` mapped to number 251
# at 509
#150112 21:40:14 server id 1  end_log_pos 569 CRC32 0x0cd1363a 	Update_rows: table id 251 flags: STMT_END_F
### UPDATE `test`.`t`
### WHERE
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='apple' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3=NULL /* VARSTRING(20) meta=0 nullable=1 is_null=1 */
### SET
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='pear' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3='2009:01:01' /* DATE meta=0 nullable=1 is_null=0 */
# at 569
#150112 21:40:14 server id 1  end_log_pos 617 CRC32 0xf134ad89 	Table_map: `test`.`t` mapped to number 251
# at 617
#150112 21:40:14 server id 1  end_log_pos 665 CRC32 0x87047106 	Delete_rows: table id 251 flags: STMT_END_F
### DELETE FROM `test`.`t`
### WHERE
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='pear' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3='2009:01:01' /* DATE meta=0 nullable=1 is_null=0 */
# at 665
#150112 21:40:15 server id 1  end_log_pos 696 CRC32 0x85ffc9ff 	Xid = 465
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

  • The line containing "Table_map: `test`.`t`" identifies the table affected by the statement.
  • The Write_rows / Update_rows / Delete_rows lines identify the event type (INSERT, UPDATE or DELETE).
  • Lines that start with "###" show the actual row data that was changed.
  • Columns are referred to by position as @1, @2 and so on (see the sketch below for mapping positions back to column names).
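
Because the decoded row events refer to columns only by position, it can be handy to map @1, @2, ... back to actual column names. Here is a quick sketch (separate from the awk script below), assuming the server that owns the table is reachable; it reads the column order from information_schema:

# Print "@<position> = <column name>" for test.t, in table definition order
mysql -N -e "SELECT CONCAT('@', ORDINAL_POSITION, ' = ', COLUMN_NAME)
             FROM information_schema.COLUMNS
             WHERE TABLE_SCHEMA='test' AND TABLE_NAME='t'
             ORDER BY ORDINAL_POSITION;"

For the table created above, this should print @1 = id, @2 = name and @3 = date.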

Now have a look at our simple awk script that takes the mysqlbinlog output and prints a neat summary of the INSERT/UPDATE/DELETE events in a row-based binary log. Just replace "mysqld-bin.000023" with your binary log file name. The pattern "#15" on the third line of the script matches timestamps from the year 2015; if you are decoding a binary log from 2014, change it to "#14". It is also recommended to use the --start-datetime and --stop-datetime options to decode a specific time range instead of a whole large binary log file.

Script :

mysqlbinlog --base64-output=decode-rows -vv --start-datetime="2015-01-12 21:40:00"  --stop-datetime="2015-01-12 21:45:00" mysqld-bin.000023 | awk \
'BEGIN {s_type=""; s_count=0;count=0;insert_count=0;update_count=0;delete_count=0;flag=0;} \
{if(match($0, /#15.*Table_map:.*mapped to number/)) {printf "Timestamp : " $1 " " $2 " Table : " $(NF-4); flag=1} \
else if (match($0, /(### INSERT INTO .*..*)/)) {count=count+1;insert_count=insert_count+1;s_type="INSERT"; s_count=s_count+1;}  \
else if (match($0, /(### UPDATE .*..*)/)) {count=count+1;update_count=update_count+1;s_type="UPDATE"; s_count=s_count+1;} \
else if (match($0, /(### DELETE FROM .*..*)/)) {count=count+1;delete_count=delete_count+1;s_type="DELETE"; s_count=s_count+1;}  \
else if (match($0, /^(# at) /) && flag==1 && s_count>0) {print " Query Type : "s_type " " s_count " row(s) affected" ;s_type=""; s_count=0; }  \
else if (match($0, /^(COMMIT)/)) {print "[Transaction total : " count " Insert(s) : " insert_count " Update(s) : " update_count " Delete(s) : " \
delete_count "] \n+----------------------+----------------------+----------------------+----------------------+"; \
count=0;insert_count=0;update_count=0; delete_count=0;s_type=""; s_count=0; flag=0} } '

Output :

Timestamp : #150112 21:40:14 Table : `test`.`t` Query Type : INSERT 1 row(s) affected
Timestamp : #150112 21:40:14 Table : `test`.`t` Query Type : UPDATE 1 row(s) affected
Timestamp : #150112 21:40:14 Table : `test`.`t` Query Type : DELETE 1 row(s) affected
[Transaction total : 3 Insert(s) : 1 Update(s) : 1 Delete(s) : 1]
+----------------------+----------------------+----------------------+----------------------+

This awk script returns the following columns as output for every transaction in the binary log.

Timestamp : the timestamp of the event as logged in the binary log.
Table : the affected table, in database.tablename format.
Query Type : the type of query executed against the table and the number of rows it affected.

[Transaction total : 3 Insert(s) : 1 Update(s) : 1 Delete(s) : 1]
The line above is the transaction summary: it displays the total number of rows affected by the transaction and the number of rows affected by each type of query within it.

Let’s execute some more queries against the sakila database and then summarize them with our script. Assume these queries were executed between "2015-01-16 13:30:00" and "2015-01-16 14:00:00".

use sakila;
mysql> update city set city="Acua++" where city_id=4;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
mysql> delete from country where country_id=4;
Query OK, 1 row affected (0.04 sec)
mysql>  insert into country (country_id,country,last_update) values ('4','Angola',now());
Query OK, 1 row affected (0.03 sec)
mysql> create table payment_tmp select * from payment;
Query OK, 16049 rows affected (0.80 sec)
Records: 16049  Duplicates: 0  Warnings: 0
mysql> delete from payment_tmp where payment_id between 10000 and 15000;
Query OK, 5001 rows affected (0.08 sec)
mysql> update payment_tmp set amount=9.99 where payment_id between 1 and 7000;
Query OK, 6890 rows affected (0.08 sec)
Rows matched: 7000  Changed: 6890  Warnings: 0
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)
mysql> delete from country where country_id=5;
Query OK, 1 row affected (0.00 sec)
mysql> delete from city where country_id=5;
Query OK, 1 row affected (0.00 sec)
mysql> delete from address where city_id=300;
Query OK, 2 rows affected (0.00 sec)
mysql> commit;
Query OK, 0 rows affected (0.01 sec)

We will save our script in a file named “summarize_binlogs.sh” and summarize the queries we just executed.

# vim summarize_binlogs.sh
#!/bin/bash
BINLOG_FILE="mysqld-bin.000035"
START_TIME="2015-01-16 13:30:00"
STOP_TIME="2015-01-16 14:00:00"
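# Decode the binary log for the given time range and summarize the row events:
#   - "Table_map" lines provide the timestamp and the affected table name
#   - "### INSERT/UPDATE/DELETE" lines are counted per statement and per transaction
#   - each new "# at" offset line flushes the count for the previous statement
#   - "COMMIT" prints the per-transaction summary and resets all counters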
mysqlbinlog --base64-output=decode-rows -vv --start-datetime="${START_TIME}"  --stop-datetime="${STOP_TIME}" ${BINLOG_FILE} | awk \
'BEGIN {s_type=""; s_count=0;count=0;insert_count=0;update_count=0;delete_count=0;flag=0;} \
{if(match($0, /#15.*Table_map:.*mapped to number/)) {printf "Timestamp : " $1 " " $2 " Table : " $(NF-4); flag=1} \
else if (match($0, /(### INSERT INTO .*..*)/)) {count=count+1;insert_count=insert_count+1;s_type="INSERT"; s_count=s_count+1;}  \
else if (match($0, /(### UPDATE .*..*)/)) {count=count+1;update_count=update_count+1;s_type="UPDATE"; s_count=s_count+1;} \
else if (match($0, /(### DELETE FROM .*..*)/)) {count=count+1;delete_count=delete_count+1;s_type="DELETE"; s_count=s_count+1;}  \
else if (match($0, /^(# at) /) && flag==1 && s_count>0) {print " Query Type : "s_type " " s_count " row(s) affected" ;s_type=""; s_count=0; }  \
else if (match($0, /^(COMMIT)/)) {print "[Transaction total : " count " Insert(s) : " insert_count " Update(s) : " update_count " Delete(s) : " \
delete_count "] \n+----------------------+----------------------+----------------------+----------------------+"; \
count=0;insert_count=0;update_count=0; delete_count=0;s_type=""; s_count=0; flag=0} } '
:wq
# chmod u+x summarize_binlogs.sh

Now we run our script to get a summary of the information logged in the binary log.

# ./summarize_binlogs.sh
Timestamp : #150116 13:41:09 Table : `sakila`.`city` Query Type : UPDATE 1 row(s) affected
[Transaction total : 1 Insert(s) : 0 Update(s) : 1 Delete(s) : 0]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:41:59 Table : `sakila`.`country` Query Type : DELETE 1 row(s) affected
[Transaction total : 1 Insert(s) : 0 Update(s) : 0 Delete(s) : 1]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:07 Table : `sakila`.`country` Query Type : INSERT 1 row(s) affected
[Transaction total : 1 Insert(s) : 1 Update(s) : 0 Delete(s) : 0]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:13 Table : `sakila`.`payment_tmp` Query Type : INSERT 16049 row(s) affected
[Transaction total : 16049 Insert(s) : 16049 Update(s) : 0 Delete(s) : 0]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:20 Table : `sakila`.`payment_tmp` Query Type : DELETE 5001 row(s) affected
[Transaction total : 5001 Insert(s) : 0 Update(s) : 0 Delete(s) : 5001]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:28 Table : `sakila`.`payment_tmp` Query Type : UPDATE 6890 row(s) affected
[Transaction total : 6890 Insert(s) : 0 Update(s) : 6890 Delete(s) : 0]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:42 Table : `sakila`.`country` Query Type : DELETE 1 row(s) affected
Timestamp : #150116 13:42:48 Table : `sakila`.`city` Query Type : DELETE 1 row(s) affected
Timestamp : #150116 13:42:53 Table : `sakila`.`address` Query Type : DELETE 2 row(s) affected
[Transaction total : 4 Insert(s) : 0 Update(s) : 0 Delete(s) : 4]
+----------------------+----------------------+----------------------+----------------------+

See how nicely our script has summarized the information logged in the binary log. For the last transaction, it shows three rows, one for each statement executed within the transaction, followed by the final transaction summary: the transaction affected 4 rows in total and all of them were deleted.

That is how we can summarize MySQL row-based binary logs in a more readable format using a simple awk script. It will save you time whenever you need to decode the binary logs for troubleshooting. Now I’ll show you how this summarized information can help answer some basic but important questions.

Q1: Which tables received the highest number of INSERT/UPDATE/DELETE statements?

./summarize_binlogs.sh | grep Table |cut -d':' -f5| cut -d' ' -f2 | sort | uniq -c | sort -nr
      3 `sakila`.`payment_tmp`
      3 `sakila`.`country`
      2 `sakila`.`city`
      1 `sakila`.`address`

Q2: Which table received the highest number of DELETE queries?

./summarize_binlogs.sh | grep -E 'DELETE' |cut -d':' -f5| cut -d' ' -f2 | sort | uniq -c | sort -nr
      2 `sakila`.`country`
      1 `sakila`.`payment_tmp`
      1 `sakila`.`city`
      1 `sakila`.`address`

Q3: How many INSERT/UPDATE/DELETE queries were executed against the sakila.country table?

./summarize_binlogs.sh | grep -i '`sakila`.`country`' | awk '{print $7 " " $11}' | sort -k1,2 | uniq -c
      2 `sakila`.`country` DELETE
      1 `sakila`.`country` INSERT

Q4: Give me the top 3 statements that affected the maximum number of rows.

./summarize_binlogs.sh | grep Table | sort -nr -k 12 | head -n 3
Timestamp : #150116 13:42:13 Table : `sakila`.`payment_tmp` Query Type : INSERT 16049 row(s) affected
Timestamp : #150116 13:42:28 Table : `sakila`.`payment_tmp` Query Type : UPDATE 6890 row(s) affected
Timestamp : #150116 13:42:20 Table : `sakila`.`payment_tmp` Query Type : DELETE 5001 row(s) affected

See how easy it is to identify the large transactions.

Q5: Find DELETE queries that affected more than 1000 rows.

./summarize_binlogs.sh | grep -E 'DELETE' | awk '{if($12>1000) print $0}'
Timestamp : #150116 13:42:20 Table : `sakila`.`payment_tmp` Query Type : DELETE 5001 row(s) affected

Similarly, if we want to get all queries that affected more than 1000 rows:

./summarize_binlogs.sh | grep -E 'Table' | awk '{if($12>1000) print $0}'
Timestamp : #150116 13:42:13 Table : `sakila`.`payment_tmp` Query Type : INSERT 16049 row(s) affected
Timestamp : #150116 13:42:20 Table : `sakila`.`payment_tmp` Query Type : DELETE 5001 row(s) affected
Timestamp : #150116 13:42:28 Table : `sakila`.`payment_tmp` Query Type : UPDATE 6890 row(s) affected
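
The same approach works on the per-transaction summary lines. For example, a quick sketch to list transactions that affected more than 1000 rows in total (with the workload above, it should return the three payment_tmp transactions):

./summarize_binlogs.sh | grep 'Transaction total' | awk '{if($4>1000) print $0}'
[Transaction total : 16049 Insert(s) : 16049 Update(s) : 0 Delete(s) : 0]
[Transaction total : 5001 Insert(s) : 0 Update(s) : 0 Delete(s) : 5001]
[Transaction total : 6890 Insert(s) : 0 Update(s) : 6890 Delete(s) : 0]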

Conclusion

The information logged in binary logs is really helpful for understanding our write workload. We used mysqlbinlog + awk to present the binary log contents in a more readable format, and then used the summarized information to answer some basic but important questions.

