Let me start by saying a big “thank you” to the staff at Oracle for deciding to open source reducer.sh. It’s a tool I developed whilst I was working for them several years ago. Its sole purpose is to do one thing – but do it well: test-case simplification.
So, let’s say some customer just sent you 120,000 lines of SQL code and affirms that “it definitely causes a crash.” Or maybe you ran RQG (the Random Query Generator) for a while (with the general query log turned on) and now you have a nice SQL trace which may just lead to that crash the run resulted in. Or you’re a DBA testing the company’s usual queries with Valgrind, and noticed that 2 in 1000 queries give a Valgrind warning in the mysqld error log – you’re just not sure which ones. Or maybe you’re a developer, and during testing you saw that a SELECT query output did not look the way it should – the output was “7” where it should have been “5”. The only problem: you have 1000 lines of INSERT statements and are not sure which one caused it. In all of these cases reducer can help.
Here are some of its benefits/features:
It can reduce large amounts of SQL quickly. 40K lines to just a few can usually be done in around 1 hour.
Larger files scale even better – the chunking elimination method automatically adapts to file size.
It can reduce crashes/asserts, Valgrind testcases, mysqld error log messages, and mysql CLI output testcases
Also working (but with a complex setup at the moment) is multi-threaded SQL test-case simplification (ALPHA)
It can reduce sporadic testcases for all of the above (i.e. testcases where the issue does not reproduce every time)
It can reduce sporadic testcases using multiple threads which significantly improves reduction time
It can establish whether or not a testcase is sporadic (and will report this), and will change its behavior accordingly
It is capable (turned on by default) of reducing actual DML/DDL query code after completing line-based reduction
It is capable (turned on by default) of reducing testcases by eliminating columns from tables and INSERT queries
By default reducer.sh uses tmpfs (highly recommended) to ensure testcases are “as reproducible as possible” (disk I/O)
Additional options for mysqld (necessary to reproduce an issue) can easily be listed/added
Regex syntax can be used in search strings (where applicable)
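To make the chunking idea from the list above a bit more concrete, here is a minimal sketch of chunk-based line elimination. It is not reducer.sh's actual code. The function names (still_reproduces, reduce_testcase) are made up for illustration. The "does it still reproduce?" check is a stand-in that only greps the candidate file for an extended regex, whereas a real run would replay the SQL against mysqld and search the error log or CLI output. The halving schedule is also just an assumption about how chunk sizes might adapt to file size.

```shell
#!/bin/sh
# Sketch only: chunk-based testcase reduction (NOT reducer.sh's real code).

# Hypothetical stand-in for "replay the SQL and look for the issue".
# Here the issue "reproduces" if any line matches the extended regex.
still_reproduces() {
  grep -Eq -- "$2" "$1"
}

# Print a reduced copy of file $1 that still matches regex $2.
# Chunks start at half the file and are halved whenever a full pass
# finds nothing more to remove at the current size.
reduce_testcase() {
  rt_work="$(mktemp)"
  rt_cand="$(mktemp)"
  cp "$1" "$rt_work"
  rt_total=$(( $(wc -l < "$rt_work") ))
  rt_chunk=$(( rt_total / 2 ))
  if [ "$rt_chunk" -lt 1 ]; then rt_chunk=1; fi

  while :; do
    rt_start=1
    while [ "$rt_start" -le "$rt_total" ]; do
      rt_end=$(( rt_start + rt_chunk - 1 ))
      # Candidate = working file minus lines rt_start..rt_end
      sed "${rt_start},${rt_end}d" "$rt_work" > "$rt_cand"
      if still_reproduces "$rt_cand" "$2"; then
        # Chunk was not needed: keep the smaller file, retry same offset
        cp "$rt_cand" "$rt_work"
        rt_total=$(( $(wc -l < "$rt_work") ))
      else
        # Chunk is needed (at this size): skip over it
        rt_start=$(( rt_start + rt_chunk ))
      fi
    done
    if [ "$rt_chunk" -eq 1 ]; then break; fi
    rt_chunk=$(( rt_chunk / 2 ))
  done

  cat "$rt_work"
  rm -f "$rt_work" "$rt_cand"
}
```

With these functions loaded, something like reduce_testcase trace.sql 'DROP TABLE t[0-9]+' (file name and pattern are placeholders) prints the lines that are still needed for the pattern to match. Starting with large chunks is what lets big files shrink quickly: most chunks of a long trace are irrelevant and disappear in a single check.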
So, without further ado, let’s have a look at how to get it to do your simplification job.
To get reducer.sh today, use these commands (yum example used, but this can easily be adapted to apt-get):
$ sudo yum install bzr
$ cd ~
$ bzr branch lp:randgen
$ cd randgen/util/reducer/
$ ls *
(You may also want to check out ./status.sh in this directory, which is a handy tool for seeing what reducer.sh is up to when it is doing its first/original attempt to reproduce a given issue.)
And you can get percona-qa (for parse_general_log.sh [and the prepare_reducer.sh code bit if you need it] as shown in the video):
$ cd ~
$ bzr branch lp:percona-qa
$ cd percona-qa
$ ls *