MySQL and the SSB – Part 2 – MyISAM vs InnoDB low concurrency

This blog post is part two in what is now a continuing series on the Star Schema Benchmark.

In my previous blog post I compared MySQL 5.5.30 to MySQL 5.6.10, both with default settings using only the InnoDB storage engine.  In my testing I discovered that innodb_old_blocks_time had an effect on performance of the benchmark.  There was some discussion in the comments and I promised to follow up with more SSB tests at a later date.

I also promised more low concurrency SSB tests when Peter blogged about the importance of performance at low concurrency.

The SSB tests a database’s ability to optimize queries for a star schema. A star schema presents some unique challenge to the database optimizer. The SSB benchmark consists of four sets of queries. Each set is known as a “flight”. I have labeled each query as Q{FLIGHT_NUMBER}.{QUERY_NUMBER}. In general, each flight examines different time periods or different regions. The flights represent the type of investigations and drill-downs that are common in OLAP analysis.

Each query in each flight (Q1.1 for example) is tested with a cold buffer pool. Then the query is tested again without restarting the database. The first test is described as the cold test, and the second as the hot test. The database software is restarted after the hot test. All OS caches are dropped at this time as well.

These set of queries were tested on the SSB at SCALE FACTOR: 20. This means there is approximately 12GB of data in the largest table.

You can find the individual SSB query definitions in my previous blog post.

Test environment
These tests were done on a relatively fast machine with a Xeon E5-2680 (8 cores, 16 threads) with fast IO (OCZ R4 1.6TB) and 128GB memory. For the hot test, the data fits in the buffer pool and has been loaded by the cold test already. The buffer pool and adaptive hash index are cold for the cold test. All tests were done with no concurrency. The hardware for this test was provided by Adotomi. I will be blogging about raw performance of the OCZ card in another post.

Also, while it is labeled on the graphs, it is important to note that in all cases, lower times are better.

SSB Flight #1
Here you will see the start of an interesting trend. MyISAM is faster when the data is not cached (the cold run) but is slower in the hot (cached) run. I did some investigation during the testing and found that InnoDB does more IO than MyISAM when the database is cold, but uses less CPU time when the database is hot. I am only speculating (and I can investigate further), but I believe the adaptive hash index is improving performance of InnoDB significantly during the hot run, as hash indexes are faster than a b-tree index. Also accessing pages from the buffer pool should be faster than getting them from the OS cache, which is another advantage of InnoDB.













SSB Flight #2
Flight #2 is similar to Flight #1. MyISAM is faster than InnoDB when the database is cold, but the opposite is true when the database is hot.













SSB Flight #3
Here in some cases MyISAM is substantially faster than InnoDB both cold and hot.













SSB Flight #4
There is one query in this flight, Q4.3, which is faster using MyISAM than InnoDB. Like the queries in Flight #3 that are faster using MyISAM, Q4.3 examines very little data. It seems that InnoDB performs better when a larger number of rows must be joined together (Q4.1, Q4.2) but worse when small amounts of data are examined.













In some cases MyISAM is faster than InnoDB, but usually only when the buffer pool is cold. Please don’t take away that you should be using MyISAM for everything!. MyISAM may be good for raw performance, but there are limitations which MyISAM imposes that are difficult to work with.  MyISAM does not maintain checksum consistency during regular operations and is not ACID compliant. MyISAM and InnoDB may perform differently under concurrency, which this benchmark does not cover. I will make a follow-up post about concurrency in another blog post in this series. Regardless, when the working set fits in memory, InnoDB almost always performs better, at least for this workload.


MySQL version used: 5.6.11, custom compiled to remove performance_schema

For the InnoDB tests, a 64GB buffer pool was used. O_DIRECT was used so, there was no caching of data at the filesystem level. The InnoDB indexes were built using ALTER TABLE fast index creation (merge sort).

For the MyISAM tests I used a 10GB key buffer. I used ALTER TABLE DISABLE KEYS and built the keys with sort via ALTER TABLE ENABLE KEYS.

# Disabling symbolic-links is recommended to prevent assorted security risks

The post MySQL and the SSB – Part 2 – MyISAM vs InnoDB low concurrency appeared first on MySQL Performance Blog.


Why MySQL Performance at Low Concurrency is Important

PerformanceA few weeks ago I wrote about “MySQL Performance at High Concurrency” and why it is important, which was followed up by Vadim’s post on ThreadPool in Percona Server providing some great illustration on the topic. This time I want to target an opposite question: why MySQL performance at low concurrency is important for you.

I decided to write about this topic as a number of recent blog posts and articles look at MySQL performance starting with certain concurrency as the low point. For example, MySQL 5.6 DBA and Developer Guide has a number of graphs starting with 32 connections at the low point. Mark Callaghan also starts with a concurrency of 8 in some of the results he publishes. All of this may make it look as though single-thread performance is no more a concern. I believe it still is!

First, performance at concurrency of 1 serves as an important baseline that offers the best response times (MySQL server does not use multiple threads to execute query in parallel at this point). Response times at the variety of concurrency levels can be compared with this baseline to see how they are affected and how the system scales with increasing concurrency in effect.

Second, there are many cases in which single-thread execution is what is going to happen — many batch jobs are written to be single-threaded. MySQL replication is single-thread too, and MySQL 5.6 brings some abilities of parallel replication but these apply only to some use cases. Many database maintenance operations such as “alter table” are run in single-thread too.

In fact,  chances are your system actually runs at low concurrency most of the time. The majority of systems I see in the wild have Threads_running on average of 5 or less, at least the system is performing well, pointing to rather low concurrency in the normal system operation. Much higher concurrency in most cases is seen when the system is overloaded and suffering.

So what is the problem with looking at performance at just high-concurrency values? This might mislead you on how your production will be impacted. Think about, for example, a system that is slower at low concurrency but a lot faster at high concurrency (which is the typical situation). If you just look at performance at high concurrency you might not understand why the system is showing worse response times when you are actually running it in real life with low concurrency.

When it comes to benchmarks I would love to see results published starting from concurrency of 1 if possible so we can see the full picture. In terms of benchmarks, Mark Callaghan published an example where 8 sysbench tables are used which means the lowest concurrency you can run with to compare apples to apples is 8. Running a comparable benchmark with concurrency of 1 and the same tool is not possible.

Want to talk more about performance? Come to Percona Live MySQL Conference and Expo April 22-25 in Santa Clara, California, where “performance” is going to be a big topic :)

The post Why MySQL Performance at Low Concurrency is Important appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by