Sysbench has long been established as the de facto standard when it comes to benchmarking MySQL performance. Percona relies on it daily, and even Oracle uses it when blogging about new features in MySQL 8. Sysbench comes with several pre-defined benchmarking tests. These tests are written in an easy-to-understand scripting language called Lua. Some of these tests are called: oltp_read_write, oltp_point_select, tpcc, oltp_insert. There are over ten such scripts to emulate various behaviors found in standard OLTP applications.
But what if your application does not fit the pattern of traditional OLTP? How can you continue to utilize the power of load-testing, benchmarking, and results analysis with sysbench? Just write your own Lua script!
For those that want to jump ahead and see the full source, here you go.
Sysbench API
To start off, each Lua script you create must implement three core sysbench-Lua API functions. These are thread_init, thread_done, and event. You can read the comments in the code below for the meaning of each function and what is happening inside.
-- Called by sysbench one time to initialize this script function thread_init() -- Create globals to be used elsewhere in the script -- drv - initialize the sysbench mysql driver drv = sysbench.sql.driver() -- con - represents the connection to MySQL con = drv:connect() end -- Called by sysbench when script is done executing function thread_done() -- Disconnect/close connection to MySQL con:disconnect() end -- Called by sysbench for each execution function event() -- If user requested to disable transactions, -- do not execute BEGIN statement if not sysbench.opt.skip_trx then con:query("BEGIN") end -- Run our custom statements execute_selects() execute_inserts() -- Like above, if transactions are disabled, -- do not execute COMMIT if not sysbench.opt.skip_trx then con:query("COMMIT") end end
That’s all pretty simple and should function as a good template in your scripts. Now let’s take a look at the rest of the script.
Sanity checks and options
Now let’s get into the core code. At the top you’ll find the following sections:
if sysbench.cmdline.command == nil then error("Command is required. Supported commands: run") end sysbench.cmdline.options = { point_selects = {"Number of point SELECT queries to run", 5}, skip_trx = {"Do not use BEGIN/COMMIT; Use global auto_commit value", false} }
The first section is a sanity check to make sure the user actually wants to run this test. Other test scripts, mentioned above, support commands like prepare, run, and cleanup. Our script only supports run as the data we are using is pre-populated by our core application.
The second section allows us, the script writer, to let the user pass some options specific to our test script. In the code above, we can see an option for the number of SELECT statements that will be ran on each thread/iteration (default is 5) and the other option allows the user to disable BEGIN/COMMIT if they so desire (default is false). If you want more customization in your script, simply add more options. You’ll see how to reference these parameters later on.
The queries
Now it is time to define the custom queries we want to execute in our script.
-- Array of categories to be use in the INSERTs local page_types = { "actor", "character", "movie" } -- Array of COUNT(*) queries local select_counts = { "SELECT COUNT(*) FROM imdb.title" } -- Array of SELECT statements that have 1 integer parameter local select_points = { "SELECT * FROM imdb.title WHERE id = %d", "SELECT * FROM imdb.comments ORDER BY id DESC limit 10", "SELECT AVG(rating) avg FROM imdb.movie_ratings WHERE movie_id = %d", "SELECT * FROM imdb.users ORDER BY RAND() LIMIT 1" } -- Array of SELECT statements that have 1 string parameter local select_string = { "SELECT * FROM imdb.title WHERE title LIKE '%s%%'" } -- INSERT statements local inserts = { "INSERT INTO imdb.users (email_address, first_name, last_name) VALUES ('%s', '%s', '%s')", "INSERT INTO imdb.page_views (type, viewed_id, user_id) VALUES ('%s', %d, %d)" }
The above code defines several arrays/lists of different queries. Why is this necessary? Later on in the code, we will have to parse each SQL statement and populate/replace the various parameters with randomly generated values. It would not do us any good to repeat the same SELECT * FROM fooTable WHERE id = 44 every time, now would it? Certainly not. We want to generate random numbers and have our queries select from the entire dataset.
Some queries have no parameters, some have integer-based, and some string-based. We will handle these differently below, which is why they are in different arrays above. This method also allows for future expansion. When you want to run additional queries within the script, just add another line to each array; no need to change any other code.
Parse and execute
The function below, execute_selects, will be called from the parent function, event, which we discussed earlier in the post. You can see for-loops for each of the three SELECT categories we created above. The comments inline should help explain what is happening. Note the use of the user-provided option –point-selects in the second loop below, which we created previously in the ‘Sanity and Options’ section.
function execute_selects() -- Execute each simple, no parameters, SELECT for i, o in ipairs(select_counts) do con:query(o) end -- Loop for however many queries the -- user wants to execute in this category for i = 1, sysbench.opt.point_selects do -- select random query from list local randQuery = select_points[math.random(#select_points)] -- generate random id and execute local id = sysbench.rand.pareto(1, 3000000) con:query(string.format(randQuery, id)) end -- generate random string and execute for i, o in ipairs(select_string) do local str = sysbench.rand.string(string.rep("@", sysbench.rand.special(2, 15))) con:query(string.format(o, str)) end end
Two more things to mention for this code. First, you will notice the use of sysbench.rand.pareto to generate a random number between 1 and 3,000,000. For our dataset, we know that each table referenced in all queries relating to WHERE id = ? has that many number of rows, at minimum. This is specific to our data. Your values will certainly be different. Second, notice the use of sysbench.rand.string, and string.rep. The string.rep segment will generate a string comprised of ‘@’ symbols, between 2 and 15 characters long. That string of ‘@’ symbols will then be passed to sysbench.rand.string, which will swap out each ‘@’ for a random alphanumeric value. For example, ‘@@@@@@’ could be changed to ‘Hk9EdC’ which will then replace the ‘%s’ inside the query string (string.format) and be executed.
Handle inserts
Our INSERT statements require values. Again, sysbench calls the function execute_inserts from event on each iteration. Inside execute_inserts, we generate some fake string data using built-in functions described above.
Those strings are then formatted into the SQL and executed.
function create_random_email() local username = sysbench.rand.string(string.rep("@",sysbench.rand.uniform(5,10))) local domain = sysbench.rand.string(string.rep("@",sysbench.rand.uniform(5,10))) return username .. "@" .. domain .. ".com" end function execute_inserts() -- generate fake email/info local email = create_random_email() local firstname = sysbench.rand.string("first-" .. string.rep("@", sysbench.rand.special(2, 15))) local lastname = sysbench.rand.string("last-" .. string.rep("@", sysbench.rand.special(2, 15))) -- INSERT for new imdb.user con:query(string.format(inserts[1], email, firstname, lastname)) -- INSERT for imdb.page_view local page = page_types[math.random(#page_types)] con:query(string.format(inserts[2], page, sysbench.rand.special(2, 500000), sysbench.rand.special(2, 500000))) end
Example run
$ sysbench imdb_workload.lua \ --mysql-user=imdb --mysql-password=imdb \ --mysql-db=imdb --report-interval=1 \ --events=0 --time=0 run WARNING: Both event and time limits are disabled, running an endless test sysbench 1.0.17 (using system LuaJIT 2.0.4) Running the test with following options: Number of threads: 1 Report intermediate results every 1 second(s) Initializing random number generator from current time Initializing worker threads... Threads started! [ 1s ] thds: 1 tps: 15.96 qps: 177.54 (r/w/o: 112.71/31.92/32.91) lat (ms,95%): 158.63 err/s: 0.00 reconn/s: 0.00 [ 2s ] thds: 1 tps: 15.01 qps: 169.09 (r/w/o: 109.06/30.02/30.02) lat (ms,95%): 137.35 err/s: 0.00 reconn/s: 0.00 [ 3s ] thds: 1 tps: 26.00 qps: 285.00 (r/w/o: 181.00/52.00/52.00) lat (ms,95%): 108.68 err/s: 0.00 reconn/s: 0.00 [ 4s ] thds: 1 tps: 15.00 qps: 170.00 (r/w/o: 108.00/32.00/30.00) lat (ms,95%): 164.45 err/s: 0.00 reconn/s: 0.00
And there we have it! Custom queries specific to our application and dataset. Most of the sysbench parameters are self-explanatory, but let me talk about –report-interval=1 which shows statistics every 1 second. Normally sysbench does not output stats until the end of the run, however, the example execution will run forever (–events=0 –time=0) so we need stats to show all the time. You can adjust the parameters to your liking. For instance, if you only want to run a test for 5 minutes, set –events=0 –run-time=300.
Conclusion
Sysbench is a very well designed application that allows you to load-test your MySQL instances using pre-defined and custom queries. Using the Lua scripting language, you can create just about any scenario to fit your needs. The above is just one example that we use within Percona’s Training and Education department. It is by no means an exhaustive example of all of the capabilities of sysbench-Lua.
—
Photo by Lachlan Donald on Unsplash