An idea for a benchmark based on the “arrival request” rate that I wrote about in a post headlined “Introducing new type of benchmark” back in 2012 was implemented in Sysbench. However, Sysbench provides only a simple workload, so to be able to compare InnoDB with TokuDB, and later MongoDB with Percona TokuMX, I wanted to use more complicated scenarios. (Both TokuDB and TokuMX are part of Percona’s product line, in the case you missed Tokutek now part of the Percona family.)
Thanks to Facebook – they provide LinkBench, a benchmark that emulates the social graph database workload. I made modifications to LinkBench, which are available here: https://github.com/vadimtk/linkbenchX. The summary of modifications is
- Instead of generating events in a loop, we generate events with
requestrate
and send the event for execution to one of available Requester thread. - At the start, we establish N (
requesters
) connections to database, which are idle by default, and just waiting for an incoming event to execute. - The main output of the benchmark is 99% response time for ADD_LINK (INSERT + UPDATE request) and GET_LINKS_LIST (range SELECT request) operations.
- The related output is Concurrency, that is how many Requester threads are active during the time period.
- Ability to report stats frequently (5-10 sec interval); so we can see a trend and a stability of the result.
Also, I provide a Java package, ready to execute, so you do not need to compile from source code. It is available on the release page at https://github.com/vadimtk/linkbenchX/releases
So the main focus of the benchmark is the response time and its stability over time.
For an example, let’s see how TokuDB performs under different request rates (this was a quick run to demonstrate the benchmark abilities, not to provide numbers for TokuDB).
First graph is the 99% response time (in milliseconds), measured each 10 sec, for arrival rate 5000, 10000 and 15000 operations/sec:
or, to smooth spikes, the same graph, but with Log 10 scale for axe Y:
So there are two observations: the response time increases with an increase in the arrival rate (as it supposed to be), and there are periodical spikes in the response time.
And now we can graph Concurrency (how many Threads are busy working on requests)…
…with an explainable observation that more threads are needed to handle bigger arrival rates, and also during spikes all available 200 threads (it is configurable) become busy.
I am looking to adopt LinkBenchX to run an identical workload against MongoDB.
The current schema is simple
CREATE TABLE `linktable` ( `id1` bigint(20) unsigned NOT NULL DEFAULT '0', `id2` bigint(20) unsigned NOT NULL DEFAULT '0', `link_type` bigint(20) unsigned NOT NULL DEFAULT '0', `visibility` tinyint(3) NOT NULL DEFAULT '0', `data` varchar(255) NOT NULL DEFAULT '', `time` bigint(20) unsigned NOT NULL DEFAULT '0', `version` int(11) unsigned NOT NULL DEFAULT '0', PRIMARY KEY (link_type, `id1`,`id2`), KEY `id1_type` (`id1`,`link_type`,`visibility`,`time`,`id2`,`version`,`data`) ) ENGINE=TokuDB DEFAULT CHARSET=latin1; CREATE TABLE `counttable` ( `id` bigint(20) unsigned NOT NULL DEFAULT '0', `link_type` bigint(20) unsigned NOT NULL DEFAULT '0', `count` int(10) unsigned NOT NULL DEFAULT '0', `time` bigint(20) unsigned NOT NULL DEFAULT '0', `version` bigint(20) unsigned NOT NULL DEFAULT '0', PRIMARY KEY (`id`,`link_type`) ) ENGINE=TokuDB DEFAULT CHARSET=latin1; CREATE TABLE `nodetable` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `type` int(10) unsigned NOT NULL, `version` bigint(20) unsigned NOT NULL, `time` int(10) unsigned NOT NULL, `data` mediumtext NOT NULL, PRIMARY KEY(`id`) ) ENGINE=TokuDB DEFAULT CHARSET=latin1;
I am open for suggestions as to what is the proper design of documents for MongoDB – please leave your recommendations in the comments.
The post LinkBenchX: benchmark based on arrival request rate appeared first on MySQL Performance Blog.