Feb
07
2018
--

ProxySQL Query Cache: What It Is, How It Works


In this blog post, I’ll present the ProxySQL query cache functionality. This is a query caching mechanism on top of ProxySQL. As there are already many how-tos regarding the ProxySQL prerequisites and installation process, we are going to skip these steps. For those who are already familiar with ProxySQL query cache configuration, let’s go directly to the query rules and the performance results.

Before talking about the ProxySQL query cache, let’s take a look at other caching mechanisms available for MySQL environments.

MySQL query cache is a query caching mechanism – deprecated as of MySQL 5.7.20 and removed in MySQL 8.0 – on top of MySQL itself (based on the official MySQL documentation).

The MySQL query cache stores the text of a SELECT statement together with the corresponding result sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again. The query cache is shared among sessions, so a result set generated by one client can be sent in response to the same query issued by another client.

Although the MySQL query cache is supposed to improve performance, there are cases where it does not scale well and can actually degrade performance due to its locking and invalidation algorithms.

You can find a really interesting post regarding MySQL query cache here.

There is also another method to cache results in a MySQL environment. It’s external caching (i.e., Memcached, Redis, etc.), but this also has some drawbacks. Introducing such a mechanism requires some changes on the application side.

So where does the ProxySQL query cache fit in, given that the MySQL query cache and external caching mechanisms already exist? Although we’ve done some tests, we are not going to compare ProxySQL query cache performance against the other caching mechanisms here; we’ll address this in a future blog post. For now, we will focus on ProxySQL itself.

What is ProxySQL Query Cache

ProxySQL query cache is an in-memory key-value storage that uses:

  • as key: a hash derived from the username, the schema name and the query text. Combining these ensures that users access only their own resultsets, and only for the correct schema.
  • as value: the resultset returned by the backend (mysqld or another proxy).

There is some more metadata stored for each resultset:

  • length: length of the resultset
  • expire_ms: defines when the entry will expire
  • access_ms: records the last time an entry was accessed
  • ref_count: a reference count identifying resultsets currently in use

Based on the configuration, the resultsets are cached on the wire while queries are executed, and the resultset is returned to the application. If the application re-executes the same query within the time slot defined by “expire_ms”, the resultset is returned by the embedded ProxySQL query cache.

The only way to invalidate entries from the ProxySQL query cache is through a time-to-live in milliseconds. This is in contrast to MySQL query cache, where the query cache gets invalidated each time a table gets updated. At the moment, it is only possible to tune the total amount of memory used by the query cache, using the variable “mysql-query_cache_size_MB”. The current implementation of mysql-query_cache_size_MB doesn’t impose a hard limit. Instead, it is used as an argument by the purging thread.
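
For reference, here is a sketch of how you could tune that limit from the ProxySQL admin interface (the 256MB value is just an example; as with the query rules shown later, variables must be loaded to runtime to take effect):

Admin> UPDATE global_variables SET variable_value='256' WHERE variable_name='mysql-query_cache_size_MB';
Admin> LOAD MYSQL VARIABLES TO RUNTIME;
Admin> SAVE MYSQL VARIABLES TO DISK;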

It’s obvious that it’s not easy to directly compare these two cache mechanisms, as each of them has its own way to invalidate results. Please also note a significant difference between MySQL and ProxySQL when the query cache is enabled. The ProxySQL query cache may serve stale resultsets due to the way it invalidates cached data (cached data are valid for the interval specified by “cache_ttl”, while MySQL’s cached data get invalidated each time data change). Every query that is cached may return stale data, and this may or may not be acceptable to the application.

How it Works

Before doing any benchmarks, I will give you a short description of how the ProxySQL query cache gets enabled. Unlike the MySQL query cache, where a common caching space exists for all tables, with the ProxySQL query cache we have to define what traffic gets cached. This is done by defining query rules that match the traffic to be cached and setting a “cache_ttl” for the cached results. There are many ways to match incoming traffic, for example by query text or by digest, using patterns. All we need to cache a resultset is to define the matching criteria and the TTL. If a query passes the matching criteria, the resultset is cached, so subsequent requests are served directly from the ProxySQL instance instead of querying the hostgroup DB nodes (as long as cache_ttl has not expired). A pattern-based rule is sketched below.
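
As a sketch, a rule matching by digest pattern could look like the following (the regular expression is illustrative; the actual rules used in this test are defined by exact digest later on):

Admin> INSERT INTO mysql_query_rules (rule_id,active,match_digest,cache_ttl,apply) VALUES (10,1,'^SELECT c FROM sbtest1 WHERE id=\?$',5000,1);
Admin> LOAD MYSQL QUERY RULES TO RUNTIME;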

Let’s use an example to make it clear how ProxySQL query cache is enabled.

In our setup we have three backend DB servers in a master-slave topology running Percona Server for MySQL 5.7, plus a ProxySQL 1.4.3 instance with sysbench 1.0 installed. The backend servers are in the same reader hostgroup, and traffic is balanced among them with equal weight.

As I’ve already said, we won’t cover the ProxySQL installation. There are many topologies you can implement: for example, deploying ProxySQL on each application server, thus removing the “single point of failure” weakness. But in our case, we will demonstrate the ProxySQL query cache with a single instance. In general, you would expect better performance with the ProxySQL instance closer to the application.

Configuration

With the ProxySQL instance up and running, let’s confirm that all servers are OK. Querying ProxySQL admin shows that all servers are ONLINE:

Admin> select * from mysql_servers;
+--------------+-----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname  | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+-----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 2            | 10.0.2.12 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
| 1            | 10.0.2.11 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
| 2            | 10.0.2.13 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
| 2            | 10.0.2.11 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
+--------------+-----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+

As you can see, there are two hostgroups: hostgroup 1 is used for WRITES and hostgroup 2 for READS.

A few random connections prove that traffic is correctly routed to the DB backends:

[RDBA] percona@atsaloux-proxysql: ~ $ mysql -h 127.0.0.1 -P 6033 -e "select @@hostname"
+--------------+
| @@hostname   |
+--------------+
| db1-atsaloux |
+--------------+
[RDBA] percona@atsaloux-proxysql: ~ $ mysql -h 127.0.0.1 -P 6033 -e "select @@hostname"
+--------------+
| @@hostname   |
+--------------+
| db2-atsaloux |
+--------------+
[RDBA] percona@atsaloux-proxysql: ~ $ mysql -h 127.0.0.1 -P 6033 -e "select @@hostname"
+--------------+
| @@hostname   |
+--------------+
| db1-atsaloux |
+--------------+
[RDBA] percona@atsaloux-proxysql: ~ $ mysql -h 127.0.0.1 -P 6033 -e "select @@hostname"
+--------------+
| @@hostname   |
+--------------+
| db3-atsaloux |
+--------------+

Let’s first take a look at some statistics. Before using sysbench, the “stats_mysql_query_digest” table (where digests are stored) is empty:

Admin> SELECT count_star,sum_time,hostgroup,digest,digest_text FROM stats_mysql_query_digest_reset ORDER BY sum_time DESC;
Empty set (0.00 sec)

The “stats_mysql_query_digest” table contains statistics related to the queries routed through the ProxySQL server, such as how many times each query was executed and its total execution time.

Before doing any benchmarks, I had to create some data. The following sysbench commands were used for selects by PK or by RANGE. For simplicity, we are not going to execute benchmarks inside transactions, although the ProxySQL query cache is effective there as well.

The --threads value will be adjusted for each benchmark:

sysbench --threads=16 --max-requests=0 --time=60 --mysql-user=percona --mysql-password=percona --mysql-db=sbtest --mysql-host=127.0.0.1 --mysql-port=6033 --oltp-table-size=1000000 --oltp-read-only=on --oltp-skip-trx=on --oltp-test-mode=simple --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-point-selects=1 --oltp-simple-ranges=0 /usr/share/sysbench/tests/include/oltp_legacy/oltp.lua run
sysbench --threads=16 --max-requests=0 --time=60 --mysql-user=percona --mysql-password=percona --mysql-db=sbtest --mysql-host=127.0.0.1 --mysql-port=6033 --oltp-table-size=1000000 --oltp-read-only=on --oltp-skip-trx=on --oltp-test-mode=simple --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-point-selects=0 --oltp-simple-ranges=1 /usr/share/sysbench/tests/include/oltp_legacy/oltp.lua run

Before running the full benchmark, a short sysbench run was used to collect the query digests needed for the ProxySQL query cache configuration.

After running the first benchmark with the ProxySQL query cache disabled, I queried the “stats_mysql_query_digest” table again and got the following results, which log all executed queries:

Admin> SELECT count_star,sum_time,hostgroup,digest,digest_text FROM stats_mysql_query_digest_reset ORDER BY sum_time DESC;
+------------+-------------+-----------+--------------------+------------------------------------------------+
| count_star | sum_time    | hostgroup | digest             | digest_text                                    |
+------------+-------------+-----------+--------------------+------------------------------------------------+
| 301536     | 20962929791 | 2         | 0xBF001A0C13781C1D | SELECT c FROM sbtest1 WHERE id=?               |
| 3269       | 30200073    | 2         | 0x290B92FD743826DA | SELECT c FROM sbtest1 WHERE id BETWEEN ? AND ? |
+------------+-------------+-----------+--------------------+------------------------------------------------+
2 rows in set (0.01 sec)

Add mysql_query_rules To Be Cached

The output above provides all the information needed to enable the ProxySQL query cache. What we need to do now is add query rules that match the resultsets that should be cached. In this case we match on the query digests captured above and use a cache_ttl of 5000ms. Taking this into consideration, we added the following rules:

Admin> INSERT INTO mysql_query_rules (rule_id,active,digest,cache_ttl,apply) VALUES (1,1,'0xBF001A0C13781C1D',5000,1);
Query OK, 1 row affected (0.00 sec)
Admin> INSERT INTO mysql_query_rules (rule_id,active,digest,cache_ttl,apply) VALUES (2,1,'0x290B92FD743826DA',5000,1);
Query OK, 1 row affected (0.00 sec)

We shouldn’t forget that we must load the query rules to runtime. If we don’t want to lose these rules (e.g., after a ProxySQL restart), we should also save them to disk:

Admin> LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;
Query OK, 0 rows affected (0.00 sec)

Now let’s reset the stats_mysql_query_digest results:

Admin> SELECT 1 FROM stats_mysql_query_digest_reset LIMIT 1; -- we reset the counters
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0.01 sec)

And re-run the benchmarks with the query cache enabled. To confirm what traffic was cached, we have to query the stats_mysql_query_digest table once again:

Admin> SELECT count_star,sum_time,hostgroup,digest,digest_text FROM stats_mysql_query_digest ORDER BY sum_time DESC;
+------------+------------+-----------+--------------------+------------------------------------------------+
| count_star | sum_time   | hostgroup | digest             | digest_text                                    |
+------------+------------+-----------+--------------------+------------------------------------------------+
| 108681     | 6304585632 | 2         | 0x290B92FD743826DA | SELECT c FROM sbtest1 WHERE id BETWEEN ? AND ? |
| 343277     | 0          | -1        | 0x290B92FD743826DA | SELECT c FROM sbtest1 WHERE id BETWEEN ? AND ? |
+------------+------------+-----------+--------------------+------------------------------------------------+
2 rows in set (0.00 sec)

Admin> SELECT count_star,sum_time,hostgroup,digest,digest_text FROM stats_mysql_query_digest ORDER BY sum_time DESC;
+------------+-----------+-----------+--------------------+----------------------------------+
| count_star | sum_time  | hostgroup | digest             | digest_text                      |
+------------+-----------+-----------+--------------------+----------------------------------+
| 79629      | 857050510 | 2         | 0xBF001A0C13781C1D | SELECT c FROM sbtest1 WHERE id=? |
| 441194     | 0         | -1        | 0xBF001A0C13781C1D | SELECT c FROM sbtest1 WHERE id=? |
+------------+-----------+-----------+--------------------+----------------------------------+
2 rows in set (0.00 sec)

Cached queries are the ones marked with the special hostgroup -1 (meaning these queries were not sent to any hostgroup), and the total execution time for cached queries is 0 (meaning the request was served within the same event loop).
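
ProxySQL also exposes query-cache-level counters in the stats_mysql_global table. As a quick check (a sketch; counter names as found in ProxySQL 1.4, where comparing Query_Cache_count_GET_OK against Query_Cache_count_GET gives a rough hit ratio):

Admin> SELECT variable_name, variable_value FROM stats_mysql_global WHERE variable_name LIKE 'Query_Cache%';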

Below you can see the benchmark results. Let’s look at what happens for selects by PK and selects by RANGE:

[Benchmark graph: throughput for selects by PK, with and without the ProxySQL query cache]

[Benchmark graph: throughput for selects by RANGE, with and without the ProxySQL query cache]

Points of Interest

  • In all cases, as threads and backend servers increase, ProxySQL performs better. This is achieved thanks to its connection pooling and multiplexing capabilities.
  • Enabling the ProxySQL query cache provides a significant performance boost.
  • In our tests, the ProxySQL query cache achieved at least a ~2X performance boost.
  • This boost is especially valuable in cases where MySQL performance may drop to 50% (i.e., selects by RANGE).
  • We shouldn’t forget that the results are affected by hardware specs as well, but it’s clear that ProxySQL with the query cache enabled delivers really high throughput.

ProxySQL Query Cache Limitations

Current known limitations:

  • It is not possible to define query cache invalidation other than with cache_ttl.
  • There is no way to enforce query cache purge.
  • mysql-query_cache_size_MB is not strictly enforced, but only used as a metric to trigger automatic purging of expired entries.
  • Although access_ms is recorded, it is not used as a metric to expire unused entries when mysql-query_cache_size_MB is reached.
  • Query cache does not support prepared statements.
  • Query cache may serve stale data.

Conclusion

ProxySQL is generally a very powerful and easy-to-use tool.

With regard to the query cache, it seems to scale very well and achieves a significant performance boost. Although complex configurations (not only for the query cache) can add some overhead, they are easy to maintain.

cache_ttl can be a limitation, but if it is configured correctly, in conjunction with max_replication_lag when configuring nodes in hostgroups, it does not add any significant drawback. In any case, it depends on whether or not stale results are acceptable to the application.

Nov
11
2016
--

Is Your Query Cache Really Disabled?


This blog post was motivated by an internal discussion about how to fully disable query cache in MySQL.

According to the manual, we should be able to disable the query cache on the fly by changing query_cache_type to 0, but as we will show, this is not entirely true. This blog will show you how to properly disable the query cache, and how common practices might not be as good as we think.

Can we just disable it by changing variables, or does it require a restart to avoid the global mutex? Let’s see how it works.

Some Query Cache context

The query cache stores the text of a “Select” statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again. The query cache is shared among sessions, so a result set generated by one client can be sent in response to the same query issued by another client.

But cacheable queries take out an “exclusive lock” on MySQL’s query cache. In addition, any insert, update, delete or other modification to a table causes any relevant entries in the query cache to be flushed. If you see many “Waiting for query cache lock” entries in the processlist, you might be suffering from this exclusive lock. In this blog post, you can see how this global mutex can cause performance degradation under high concurrency.
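
A quick way to check for this on a running server (information_schema.processlist is available from MySQL 5.1 on):

mysql> SELECT id, state, info FROM information_schema.processlist WHERE state LIKE '%query cache%';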

If we are facing this situation, how can we disable it?

Disabling Query Cache

There are two variables that you can change: query_cache_type and query_cache_size.

So if we change query_cache_size to 0, does that mean the cache is disabled? Or do we also have to change query_cache_type? Or both? And does MySQL require a restart to avoid the global mutex?

The source code shows us this:

int Query_cache::send_result_to_client(THD *thd, const LEX_CSTRING &sql)
{
  ulonglong engine_data;
  Query_cache_query *query;
#ifndef EMBEDDED_LIBRARY
  Query_cache_block *first_result_block;
#endif
  Query_cache_block *result_block;
  Query_cache_block_table *block_table, *block_table_end;
  char *cache_key= NULL;
  size_t tot_length;
  Query_cache_query_flags flags;
  DBUG_ENTER("Query_cache::send_result_to_client");
  /*
    Testing 'query_cache_size' without a lock here is safe: the thing
    we may loose is that the query won't be served from cache, but we
    save on mutex locking in the case when query cache is disabled.
    See also a note on double-check locking usage above.
  */
  if (is_disabled() || thd->locked_tables_mode ||
      thd->variables.query_cache_type == 0 || query_cache_size == 0)
    goto err;
...

MySQL checks whether the query cache is enabled before it locks it. It checks four conditions, and (for the cache to be skipped) one of them has to be true. The last three are fairly obvious, but what is the “is_disabled()” function? Following the source code, we find this in sql_cache.h:

void disable_query_cache(void) { m_query_cache_is_disabled= TRUE; }
...
bool is_disabled(void) { return m_query_cache_is_disabled; }

And in sql_cache.cc:

void Query_cache::init()
{
  DBUG_ENTER("Query_cache::init");
  mysql_mutex_init(key_structure_guard_mutex,
                   &structure_guard_mutex, MY_MUTEX_INIT_FAST);
  mysql_cond_init(key_COND_cache_status_changed,
                  &COND_cache_status_changed);
  m_cache_lock_status= Query_cache::UNLOCKED;
  initialized = 1;
  /*
    If we explicitly turn off query cache from the command line query cache will
    be disabled for the reminder of the server life time. This is because we
    want to avoid locking the QC specific mutex if query cache isn't going to
    be used.
  */
  if (global_system_variables.query_cache_type == 0)
    query_cache.disable_query_cache();
  DBUG_VOID_RETURN;
}

If the global_system_variables.query_cache_type == 0 condition is true, init() calls the disable_query_cache() function, which sets m_query_cache_is_disabled = TRUE, so is_disabled() returns true. That means that setting query_cache_type to 0 at runtime should eliminate the global mutex. Let’s run some tests to confirm this and see if the global mutex disappears after changing query_cache_type to 0.

Running tests

Context on the tests:

  1. We ran simple OLTP tests using sysbench as follows:
sysbench --test="/usr/share/doc/sysbench/tests/db/oltp.lua" --report-interval=1 --max-time=120 --oltp-read-only=off --max-requests=0 --num-threads=4 --oltp-table-size=2000000 --mysql-host=localhost --mysql-db=test --db-driver=mysql --mysql-user=root run

  2. Important portion of the my.cnf file (the last line enables counting of query cache mutex waits in performance_schema, which we query in the sketch below):
query_cache_type = 1
query_cache_limit = 1M
query_cache_size = 1G
performance_schema_instrument='wait/synch/%Query_cache%=COUNTED'
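
With that instrument enabled, here is a sketch of how to read the resulting wait counters (the summary table is available in MySQL 5.5+):

mysql> SELECT event_name, count_star FROM performance_schema.events_waits_summary_global_by_event_name WHERE event_name LIKE 'wait/synch/%Query_cache%';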

Disable the Query Cache

So basically the tests were run for two minutes each while playing with query_cache_type and query_cache_size.

  1. Started MySQL with query_cache_type=1 and query_cache_size=1G.

  2. Changed query_cache_type to 0. As we can see, nothing changed: MySQL was still using the query cache.

  3. But when we stopped sysbench and started it again (closing and opening new connections), we could see that no more inserts were going into the query cache. However, queries were still being counted as “Not Cached”, which means that changing query_cache_type applies only to new connections, and we could still see some mutex activity.

  4. Restarted MySQL with query_cache_type=0 and query_cache_size=0. Finally, the query cache was disabled and all the mutex activity disappeared.

  5. Restarted MySQL with the query cache enabled.

  6. Changed query_cache_size to 0. This almost worked: we could disable the query cache on the fly, but there was still some mutex activity.

  7. Changing query_cache_type to 0 and restarting sysbench did not have any effect on the mutex.

So the only way to stop all activity around the query cache is to restart MySQL with query_cache_type=0 and query_cache_size=0. Disabling it, or even setting it to 0 at runtime, does not completely stop the mutex activity.

But why do we still need query_cache_size when, in theory, query_cache_type should be enough?

As referenced above, the manual says if query_cache_type = 0:

Do not cache results in or retrieve results from the query cache. Note that this does not deallocate the query cache buffer. To do that, you should set query_cache_size to 0.

Based on our test, if we change query_cache_type to 0, MySQL still hits the cache.

So you might think, “well, I don’t enable the query cache and use the defaults to keep it disabled.” Keep reading, because you might be wrong. According to the manual, starting from 5.6.8 query_cache_type=0 is set by default, but query_cache_size=1048576 (1MB). This means that if we keep the default configuration, we will still see activity in the query cache, as follows:

mysql -e "show global status like 'qca%';"
+-------------------------+---------+
| Variable_name | Value |
+-------------------------+---------+
| Qcache_free_blocks | 1 |
| Qcache_free_memory | 1031320 |
| Qcache_hits | 0 |
| Qcache_inserts | 0 |
| Qcache_lowmem_prunes | 0 |
| Qcache_not_cached | 423294 |
| Qcache_queries_in_cache | 0 |
| Qcache_total_blocks | 1 |
+-------------------------+---------+

But if we just add query_cache_size=0 to my.cnf and check again (of course, after restarting the server):

mysql -e "show global status like 'qca%';"
+-------------------------+-------+
| Variable_name | Value |
+-------------------------+-------+
| Qcache_free_blocks | 0 |
| Qcache_free_memory | 0 |
| Qcache_hits | 0 |
| Qcache_inserts | 0 |
| Qcache_lowmem_prunes | 0 |
| Qcache_not_cached | 0 |
| Qcache_queries_in_cache | 0 |
| Qcache_total_blocks | 0 |
+-------------------------+-------+

We finally get no query cache related activity at all. How much overhead does this cause? We’re not entirely sure because we didn’t perform benchmarks, but we like to see no activity when we don’t want any. Now we’re wondering if this case warrants a bug report. Stay tuned, we will publish the results in this post soon.

Digging into the code

Let’s have a look at the store_query function, which MySQL uses to store queries in the query cache. Reading the code, we find this:

if (thd->locked_tables_mode || query_cache_size == 0)
    DBUG_VOID_RETURN;

It only checks query_cache_size; it does not check the type. store_query is called in handle_query, which also does not check query_cache_type.

Conclusion

There is some contradiction between checking the query cache and storing data in the query cache, which needs further investigation. But as we can see, it is not possible to fully disable the query cache on the fly by changing query_cache_type and/or query_cache_size to 0. Based on the code and the tests, if you want to make sure the query cache is fully disabled, set both query_cache_size and query_cache_type to 0 and restart MySQL.

It is a known fact that the query cache can be a big point of contention, and we are not trying to benchmark the performance overhead, since this mostly depends on the workload type. However, we can still see some overhead if the query cache is not fully disabled when MySQL is started.

Jun
10
2014
--

Measure the impact of MySQL configuration changes with Percona Cloud Tools

When you make a change to your MySQL configuration in production, it would be great to know the impact (a “before and after” type of picture). Some changes are obvious, and for many variables proper values can be determined beforehand, e.g. innodb_buffer_pool_size or innodb_log_file_size. However, there is one configuration variable that is much less obvious for many people working with MySQL: query_cache.

The idea of the query cache is great; however, there are a lot of issues with the MySQL query cache. One of the most important is the query cache mutex, which can cause severe contention on CPU-bound workloads. In MySQL 5.6 you can remove the mutex by disabling the query cache completely (set query_cache_type=0).

There is a big issue with disabling query_cache, though, and it is not a technical one. The issue is how to convince your boss or your dev/operations team that the query cache is bad. Almost all measurements available in MySQL will tell you that the query cache is utilized: there will be a good number of Qcache_hits. On the other hand, you may see the contention in the processlist, for example as one of the states associated with query cache contention: “Waiting for query cache lock” or “Waiting on query cache mutex” (if you are running Percona Server).

Now you can use Percona Cloud Tools (with Percona Server) to actually see how the query cache is used, globally and on a per-query basis.

Query cache visualization

Percona Server has a very useful additional feature: if you set the log_slow_verbosity option to “microtime,query_plan,innodb”, it will also log whether the query was a query cache hit or a query cache miss. This gives you valuable insight into real query cache utilization.
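
As a sketch, assuming Percona Server (where log_slow_verbosity is a dynamic variable), you can enable this at runtime:

mysql> SET GLOBAL log_slow_verbosity = 'microtime,query_plan,innodb';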

Percona Cloud Tools is really helpful here, as it visualizes this information and lets you actually see a “before and after” graph. (It will set the log_slow_verbosity option to “microtime,query_plan,innodb” for you if you set Slow log verbosity = Full in the agent configuration options.)

Here is an example. I disabled the query cache and then looked at the total query and lock time.

[Graph: total query time and lock time, before and after disabling the query cache]

As we can see, the total query time (across all queries) decreased significantly. The “Average QPS” on this picture is actually a little misleading and should be named “queries running at the same time” (it is calculated as query count / time). The above graph shows clear contention at the query cache level.

Just to confirm, we can look at the number of query cache hits:

[Graph: query cache hits dropping to zero after disabling the query cache]

The number of Query Cache “hits” dropped to 0 when we disabled the query cache.

Please note: if your workload is read-only and you are running the same complex query over and over again, the query cache may be helpful. In normal circumstances, however, the query cache tends to cause contention and hurt response time (as you can see above).

Percona Cloud Tools is a free beta, please give it a try and share your experience in the comments. If you want to learn more about it, join the June 25 webinar hosted by Vadim Tkachenko titled, “Monitoring All (Yes, All!) MySQL Metrics with Percona Cloud Tools.” The webinar, which starts at 10 a.m. Pacific time, is free but I recommend registering now to reserve your spot.

 


Aug
02
2013
--

Advanced MySQL Query Tuning: Webinar followup Q&A

Thanks to all who attended my “MySQL Query Tuning” webinar on July 24. If you missed it, you can download the slides and also watch the recorded video. Thank you for the excellent questions after the webinar as well. Query tuning is a big topic and, due to the limited time, I had to skip some material, especially some of the monitoring. I would like, however, to answer all the questions I did not get to during the webinar session.

Q: Did you reset the query cache before doing your benchmark on your query? 0.00 seconds sounds too good 

A: (This is in response to a couple of slides where the time showed as 0.00.) Yes, MySQL was running with the query cache disabled. The 0.00 just means that the query was executed in less than 0.004 sec. MySQL does not show higher precision if you run the query from the mysql monitor. There are a couple of ways to get the exact query times:

  • MySQL 5.0+: Use the “profiling” feature (see the sketch below): http://dev.mysql.com/doc/refman/5.5/en/show-profile.html
  • MySQL 5.1+: Enable the slow query log with microsecond precision and log the query. To log all queries in the slow query log, you can temporarily set: long_query_time = 0
  • MySQL 5.6: Use the new performance_schema counters
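
As a sketch of the first two approaches (the SELECT is a placeholder for your own query):

mysql> SET profiling = 1;                    -- enable per-session profiling
mysql> SELECT c FROM sbtest1 WHERE id = 42;  -- your query here
mysql> SHOW PROFILES;                        -- durations with microsecond precision
mysql> SET GLOBAL long_query_time = 0;       -- temporarily log all queries to the slow log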

Here is the profile for an example query that shows 0.00 seconds:

mysql> show profile;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000064 |
| checking permissions | 0.000003 |
| checking permissions | 0.000006 |
| Opening tables       | 0.000019 |
| System lock          | 0.000011 |
| init                 | 0.000031 |
| optimizing           | 0.000011 |
| statistics           | 0.000014 |
| preparing            | 0.000011 |
| executing            | 0.000002 |
| Sending data         | 0.002161 |
| end                  | 0.000004 |
| query end            | 0.000002 |
| closing tables       | 0.000007 |
| freeing items        | 0.000012 |
| logging slow query   | 0.000001 |
| cleaning up          | 0.000002 |
+----------------------+----------+

As we can see, sending data is actually 0.002 seconds.

Q: Do you ever see doing a seminar that shows how to leverage parallelization (openCL or CUDA) with databases and the performance differences?

A: MySQL does not support it right now. Usually openCL/CUDA does not help with disk-bound applications like databases. However, some projects in the OLAP space can actually utilize openCL/CUDA; for example, Alenka is a column store that is massively parallel. Scanning, aggregation, sorting, etc. are done in a data-flow manner via CUDA processing.

Q: Is it possible to use this (covering index for ORDER BY – A.R.) with a join? For example, if we want to filter with WHERE on table A and sort by a column from table B?

A: Unfortunately, MySQL does not support that with a covering index. MySQL will only use the index to filter on the WHERE condition (to limit the number of rows), plus a filesort. However, if we have a LIMIT clause, MySQL may be able to use an index for ORDER BY and stop after finding N rows matching the condition. It may not be faster though (as I showed during the webinar), and you may have to use index hints to tell MySQL to use that exact index (which may not be the best approach, as in some cases this index is not the best choice for the query). Example:

mysql> explain select * from City ct join Country cn on (ct.CountryCode = cn.Code) where Continent = 'North America' order by ct.population desc limit 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ct
         type: index
possible_keys: NULL
          key: Population
      key_len: 4
          ref: NULL
         rows: 10
        Extra:
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: cn
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 3
          ref: world.ct.CountryCode
         rows: 1
        Extra: Using where

As we can see, MySQL will use the Population index to read rows in order and avoid a sort for the ORDER BY.

Q: Why are hash indexes not available for the InnoDB engine? Any plans to bring hash indexes?

A: InnoDB uses hash indexes internally for its “Adaptive Hash Index” feature. InnoDB does not support hash indexes as a normal table index. We are not aware of any plans by Oracle’s InnoDB team to bring this feature in.

Please note: MySQL will allow you to use the “USING HASH” keyword when creating an index on an InnoDB table. However, it will create a B-tree index instead, as the sketch below shows.
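
A sketch demonstrating this on a hypothetical table:

mysql> CREATE TABLE t (i INT PRIMARY KEY, c CHAR(32)) ENGINE=InnoDB;
mysql> CREATE INDEX idx_c USING HASH ON t (c);
mysql> SHOW INDEX FROM t;   -- the Index_type column shows BTREE, not HASH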

Q: Will foreign key constraints slow down my queries?

A: It may slow down the queries, as InnoDB will have to

  1. Check the foreign key constraint table
  2. Place a shared lock on the row it will read: 

If a FOREIGN KEY constraint is defined on a table, any insert, update, or delete that requires the constraint condition to be checked sets shared record-level locks on the records that it looks at to check the constraint. InnoDB also sets these locks in the case where the constraint fails. (http://dev.mysql.com/doc/refman/5.5/en/innodb-locks-set.html)

Q: How does the use of an index vary with the number of columns selected in a SELECT query?

A: If we are talking about a covering index: if we select a column that is not part of the covering index, MySQL will not be able to satisfy the query with the index only (“Using index” in the explain plan). This may be slower, especially if MySQL has to read large columns and the data is not cached.

In addition, if we select a TEXT or BLOB column and MySQL needs to create a temporary table, this temporary table will be created on disk. I described this scenario during the webinar. The covering-index behavior is sketched below.
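
A sketch of the covering-index effect, using a hypothetical table t1(a, b, d) with an index on (a, b):

mysql> ALTER TABLE t1 ADD INDEX idx_ab (a, b);
mysql> EXPLAIN SELECT b FROM t1 WHERE a = 5;    -- Extra: Using index (covered)
mysql> EXPLAIN SELECT b, d FROM t1 WHERE a = 5; -- d is not in the index, so no "Using index"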


Feb
01
2013
--

Implications of Metadata Locking Changes in MySQL 5.5

While most of the talk recently has been around the new changes in MySQL 5.6 (and that is understandable), I have lately had some very interesting cases to deal with relating to the metadata locking changes that were introduced in MySQL 5.5.3. It appears that the implications of metadata locking have not been covered well, and since there are still a large number of MySQL 5.0 and 5.1 installations that will upgrade or are in the process of upgrading to MySQL 5.5, I thought it necessary to discuss what these implications exactly are.

To read what Metadata Locking exactly is please read this section here in the MySQL manual.

Let’s start off by having a look at the metadata locking behavior prior to MySQL 5.5.3.

Metadata Locking behavior prior to MySQL 5.5.3

Prior to MySQL 5.5.3, a statement that opened a table only held metadata locks until the end of the statement, not the end of the transaction. This meant that a transaction was not really isolated, because the same query could return different results if executed twice, if a DDL was executed between the query invocations. Let me give you an example via a simple test case where I add a new column to the table while a transaction in REPEATABLE-READ isolation mode is active.

session1 > select @@tx_isolation;
+-----------------+
| @@tx_isolation  |
+-----------------+
| REPEATABLE-READ |
+-----------------+
1 row in set (0.00 sec)

session1 > start transaction;
Query OK, 0 rows affected (0.00 sec)  

session1 > select * from test where id=1;
+----+------+
| id | x    |
+----+------+
|  1 | foo  |
+----+------+
1 row in set (0.00 sec)

session2 > ALTER TABLE test add column c char(32) default 'dummy_text';
Query OK, 2 rows affected (0.57 sec)
Records: 2  Duplicates: 0  Warnings: 0

session1 > select * from test where id=1;
Empty set (0.00 sec)

session1 > rollback;
Query OK, 0 rows affected (0.00 sec)  

session1 > start transaction;
Query OK, 0 rows affected (0.00 sec)  

session1 > select * from test where id=1;
+----+------+------------+
| id | x    | c          |
+----+------+------------+
|  1 | foo  | dummy_text |
+----+------+------------+
1 row in set (0.00 sec)

And you can see how isolation is broken, because the SELECT was not repeatable although a transaction isolation level of REPEATABLE-READ was used. This behavior of versions prior to 5.5 also means that queries could be written to the binary log in a different order, breaking locking semantics and contrary to serialization concepts. For example, take a look at the following excerpt from the binary log of a case where an UPDATE transaction is mixed with an ALTER:

# at 536
#130201 11:21:03 server id 1  end_log_pos 658   Query   thread_id=8     exec_time=0     error_code=0
SET TIMESTAMP=1359714063/*!*/;
ALTER TABLE test add column id_2 int(11) default 0 after id
/*!*/;
# at 658
#130201 11:21:39 server id 1  end_log_pos 726   Query   thread_id=7     exec_time=0     error_code=0
SET TIMESTAMP=1359714099/*!*/;
BEGIN
/*!*/;
# at 726
# at 773
#130201 11:21:35 server id 1  end_log_pos 773   Table_map: `test`.`test` mapped to number 17
#130201 11:21:35 server id 1  end_log_pos 829   Update_rows: table id 17 flags: STMT_END_F

BINLOG '
L5cLURMBAAAALwAAAAUDAAAAABEAAAAAAAEABHRlc3QABHRlc3QAAwMD/gL+CQY=
L5cLURgBAAAAOAAAAD0DAAAAABEAAAAAAAEAA///+AIAAAAAAAAAA2JhcvgCAAAAAAAAAANob3A=
'/*!*/;
### UPDATE test.test
### WHERE
###   @1=2 /* INT meta=0 nullable=0 is_null=0 */
###   @2=0 /* INT meta=0 nullable=1 is_null=0 */
###   @3='bar' /* STRING(9) meta=65033 nullable=1 is_null=0 */
### SET
###   @1=2 /* INT meta=0 nullable=0 is_null=0 */
###   @2=0 /* INT meta=0 nullable=1 is_null=0 */
###   @3='hop' /* STRING(9) meta=65033 nullable=1 is_null=0 */
# at 829
#130201 11:21:39 server id 1  end_log_pos 856   Xid = 85
COMMIT/*!*/;

Note how ALTER is logged before the UPDATE, because ALTER did not block waiting for the transaction to commit.

For the reasons described above, the implementation of metadata locking was changed starting with MySQL 5.5.3. Let’s see how this works now.

Metadata Locking behavior starting MySQL 5.5.3

Starting with 5.5.3, DDL statements that modify table metadata are executed in an isolated fashion consistent with transactional behavior. This means that an open transaction holds metadata locks on every table it has accessed for as long as the transaction is open, and consequently a DDL operation cannot commence until all transactions that accessed that table have completed. Let’s see this in effect via a simple test case:

session1 > start transaction;
Query OK, 0 rows affected (0.00 sec)

session1 > select * from test order by id;
+----+------+
| id | x    |
+----+------+
|  1 | foo  |
|  2 | bar  |
+----+------+
2 rows in set (0.00 sec)

session2 > ALTER TABLE test add column c char(32) default 'dummy_text';

session3 > show processlist;
+----+----------+-----------+------+---------+------+---------------------------------+-------------------------------------------------------------+
| Id | User     | Host      | db   | Command | Time | State                           | Info                                                        |
+----+----------+-----------+------+---------+------+---------------------------------+-------------------------------------------------------------+
|  1 | msandbox | localhost | test | Sleep   |  140 |                                 | NULL                                                        |
|  2 | msandbox | localhost | test | Query   |    3 | Waiting for table metadata lock | ALTER TABLE test add column c char(32) default 'dummy_text' |
|  3 | msandbox | localhost | test | Query   |    0 | NULL                            | show processlist                                            |
+----+----------+-----------+------+---------+------+---------------------------------+-------------------------------------------------------------+
3 rows in set (0.00 sec)

You can see how the ALTER blocks because the transaction in session1 is still open. Once the transaction in session1 is closed, the ALTER proceeds successfully:

session1 > rollback;
Query OK, 0 rows affected (0.00 sec)

session2 > ALTER TABLE test add column c char(32) default 'dummy_text';
Query OK, 2 rows affected (46.77 sec)
Records: 2  Duplicates: 0  Warnings: 0

Let’s see where the ALTER spent most of its time:

session2 > show profiles;
+----------+-------------+-------------------------------------------------------------+
| Query_ID | Duration    | Query                                                       |
+----------+-------------+-------------------------------------------------------------+
|        1 | 46.78110075 | ALTER TABLE test add column c char(32) default 'dummy_text' |
+----------+-------------+-------------------------------------------------------------+
1 row in set (0.00 sec)

session2 > show profile for query 1;
+------------------------------+-----------+
| Status                       | Duration  |
+------------------------------+-----------+
| starting                     |  0.000060 |
| checking permissions         |  0.000003 |
| checking permissions         |  0.000003 |
| init                         |  0.000005 |
| Opening tables               |  0.000045 |
| System lock                  |  0.000006 |
| setup                        |  0.000016 |
| creating table               |  0.168283 |
| After create                 |  0.000061 |
| copy to tmp table            |  0.165808 |
| rename result table          | 46.446738 |
| end                          |  0.000035 |
| Waiting for query cache lock |  0.000003 |
| end                          |  0.000006 |
| query end                    |  0.000003 |
| closing tables               |  0.000008 |
| freeing items                |  0.000016 |
| cleaning up                  |  0.000004 |
+------------------------------+-----------+
18 rows in set (0.00 sec)

So the ALTER waited on the metadata locks just after the table with the new structure had been created and populated with data, but before the old table was swapped with the new one. Note that ALTER is a multi-step process: the old table is locked in shared mode, then something similar to the following steps is taken: a new table with the new structure is created, INSERT INTO new_table SELECT * FROM old_table is run, then RENAME old_table to tmp_table and new_table to old_table, and finally DROP tmp_table.
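
A rough SQL sketch of those steps, using the article’s test table (column types are assumed for illustration; this is not the literal server-internal procedure):

CREATE TABLE test_new (
  id INT NOT NULL PRIMARY KEY,
  x  CHAR(10),
  c  CHAR(32) DEFAULT 'dummy_text'
) ENGINE=InnoDB;
INSERT INTO test_new SELECT id, x, 'dummy_text' FROM test;
RENAME TABLE test TO test_tmp, test_new TO test;  -- this is the step that needs the metadata lock
DROP TABLE test_tmp;
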
Let’s see another example, this time trying a RENAME:

session2 > RENAME TABLE test to test_2;

session3 > show processlist;
+----+----------+-----------+------+---------+------+---------------------------------+-----------------------------+
| Id | User     | Host      | db   | Command | Time | State                           | Info                        |
+----+----------+-----------+------+---------+------+---------------------------------+-----------------------------+
|  1 | msandbox | localhost | test | Sleep   |   49 |                                 | NULL                        |
|  2 | msandbox | localhost | test | Query   |   35 | Waiting for table metadata lock | RENAME TABLE test to test_2 |
|  3 | msandbox | localhost | test | Query   |    0 | NULL                            | show processlist            |
+----+----------+-----------+------+---------+------+---------------------------------+-----------------------------+
3 rows in set (0.00 sec)

And you can see that the RENAME is also blocked, because a transaction that accessed the table “test” is still open.

So we have an interesting conclusion here: the ALTER waits only at the last stage, when it is making changes to the table metadata. An ALTER on a big table can keep executing without any hindrance, copying rows from the table with the old structure to the table with the new structure, and will only wait at the last step, when it is about to change the table metadata.

Let’s see another interesting side-effect of metadata locking.

When can ALTER render the table inaccessible?

There is another interesting side-effect: when the ALTER reaches the state where it needs to wait for metadata locks, it blocks any type of query against the table. We know that writes would be blocked anyhow for the entire duration of the ALTER, but reads are blocked as well while the ALTER is waiting for metadata locks. Let’s see this in action via another test case:

session1 > start transaction;
Query OK, 0 rows affected (0.00 sec)

session1 > select * from test_2 order by id;
+----+------+------------+
| id | x    | c          |
+----+------+------------+
|  1 | foo  | dummy_text |
|  2 | bar  | dummy_text |
+----+------+------------+
2 rows in set (0.00 sec)

session6 > ALTER TABLE test_2 DROP COLUMN c;

session7 > select * from test_2 order by id;
session8 > select * from test_2 order by id;
session9 > select * from test_2 order by id;
session10 > select * from test_2 order by id;

session3 > show processlist;
+----+----------+-----------+------+---------+------+---------------------------------+----------------------------------+
| Id | User     | Host      | db   | Command | Time | State                           | Info                             |
+----+----------+-----------+------+---------+------+---------------------------------+----------------------------------+
|  1 | msandbox | localhost | test | Sleep   |  403 |                                 | NULL                             |
|  3 | msandbox | localhost | test | Query   |    0 | NULL                            | show processlist                 |
|  6 | msandbox | localhost | test | Query   |  229 | Waiting for table metadata lock | ALTER TABLE test_2 DROP COLUMN c |
|  7 | msandbox | localhost | test | Query   |  195 | Waiting for table metadata lock | select * from test_2 order by id |
|  8 | msandbox | localhost | test | Query   |  180 | Waiting for table metadata lock | select * from test_2 order by id |
|  9 | msandbox | localhost | test | Query   |  169 | Waiting for table metadata lock | select * from test_2 order by id |
| 10 | msandbox | localhost | test | Query   |   55 | Waiting for table metadata lock | select * from test_2 order by id |
+----+----------+-----------+------+---------+------+---------------------------------+----------------------------------+
7 rows in set (0.00 sec)

And you can see that the table is blocked for any kind of operation. Let’s see the profiling information for one of the queries that was blocked to see where the query spent most of its time:

session10 > show profile for query 1;
+----------------------+------------+
| Status               | Duration   |
+----------------------+------------+
| starting             |   0.000058 |
| checking permissions |   0.000006 |
| Opening tables       | 213.028481 |
| System lock          |   0.000009 |
| init                 |   0.000014 |
| optimizing           |   0.000002 |
| statistics           |   0.000005 |
| preparing            |   0.000006 |
| executing            |   0.000001 |
| Sorting result       |   0.000002 |
| Sending data         |   0.000040 |
| end                  |   0.000003 |
| query end            |   0.000002 |
| closing tables       |   0.000003 |
| freeing items        |   0.000007 |
| logging slow query   |   0.000002 |
| cleaning up          |   0.000002 |
+----------------------+------------+
17 rows in set (0.00 sec)

And you can see how the query spent nearly all its time waiting in the “Opening tables” state. Now, this behavior of ALTER making the table inaccessible in some cases is not really documented, and as such I have reported a bug: http://bugs.mysql.com/bug.php?id=67647

Metadata locking behaves differently for queries that are serviced from the query cache; let’s see what happens in that case.

Metadata Locking and Query Cache

How does metadata locking behave with the query cache? That is an important question. If the query cache is enabled and the SELECT can be serviced from the query cache, then the SELECT will not block on the ALTER, even though the ALTER is waiting for metadata locks. Why? Because in such a case no table open operation has to be performed. Let’s see this scenario via a test case:

session1 > start transaction;
Query OK, 0 rows affected (0.00 sec)

session1 > select * from test_2 order by id;
+----+------+
| id | x    |
+----+------+
|  1 | foo  |
|  2 | bar  |
+----+------+
2 rows in set (0.00 sec)

session6 > RENAME TABLE test_2 to test;

session10 > select * from test_2 order by id;
+----+------+
| id | x    |
+----+------+
|  1 | foo  |
|  2 | bar  |
+----+------+
2 rows in set (0.00 sec)

session3 > show processlist;
+----+----------+-----------+------+---------+------+---------------------------------+-----------------------------+
| Id | User     | Host      | db   | Command | Time | State                           | Info                        |
+----+----------+-----------+------+---------+------+---------------------------------+-----------------------------+
|  1 | msandbox | localhost | test | Sleep   |   22 |                                 | NULL                        |
|  3 | msandbox | localhost | test | Query   |    0 | NULL                            | show processlist            |
|  6 | msandbox | localhost | test | Query   |    3 | Waiting for table metadata lock | RENAME TABLE test_2 to test |
| 10 | msandbox | localhost | test | Sleep   |   37 |                                 | NULL                        |
+----+----------+-----------+------+---------+------+---------------------------------+-----------------------------+
4 rows in set (0.00 sec)

The query proceeds without being blocked on anything while the RENAME is still waiting for metadata locks. Let’s see the profiling information for this query:

session10 > show profile for query 11;
+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000018 |
| Waiting for query cache lock   | 0.000003 |
| checking query cache for query | 0.000007 |
| checking privileges on cached  | 0.000003 |
| checking permissions           | 0.000005 |
| sending cached result to clien | 0.000011 |
| logging slow query             | 0.000002 |
| cleaning up                    | 0.000002 |
+--------------------------------+----------+
8 rows in set (0.00 sec)

You can see that no table open operation was performed and hence no wait.

Does the fact that the table has already been opened, and its table object is in the table cache, change anything with respect to metadata locks?

Metadata Locking and Table Cache

It does not matter if a connection accesses a table that is already in the table cache: any query to a table that has a DDL operation waiting will block. Why? Because MySQL sees that the old entries in the table cache have to be invalidated; any query that accesses the table will have to reopen the modified table, and there will be new entries in the table cache. Let’s see this phenomenon in action:

session6 > ALTER TABLE test add column c char(32) default 'dummy_text';
Query OK, 2 rows affected (59.80 sec)

session10 > show status like 'Open%tables';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Open_tables   | 30    |
| Opened_tables | 0     |
+---------------+-------+
2 rows in set (0.00 sec)

session10 > select * from test order by id;
+----+------+------------+
| id | x    | c          |
+----+------+------------+
|  1 | foo  | dummy_text |
|  2 | bar  | dummy_text |
+----+------+------------+
2 rows in set (53.78 sec)

session10 > show status like 'Open%tables';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Open_tables   | 30    |
| Opened_tables | 1     |
+---------------+-------+
2 rows in set (0.00 sec)

session10 > show profile for query 18;
+----------------------+-----------+
| Status               | Duration  |
+----------------------+-----------+
| starting             |  0.000059 |
| checking permissions |  0.000010 |
| Opening tables       | 53.786685 |
| System lock          |  0.000009 |
| init                 |  0.000012 |
| optimizing           |  0.000003 |
| statistics           |  0.000007 |
| preparing            |  0.000006 |
| executing            |  0.000001 |
| Sorting result       |  0.000004 |
| Sending data         |  0.000033 |
| end                  |  0.000003 |
| query end            |  0.000002 |
| closing tables       |  0.000004 |
| freeing items        |  0.000009 |
| logging slow query   |  0.000002 |
| cleaning up          |  0.000002 |
+----------------------+-----------+
17 rows in set (0.00 sec)

As you can see, the SELECT query still blocks, and the status counter Opened_tables is incremented once the query finishes.

So much for the explanation; now let’s take a look at the consequences.

Consequences

The consequence of these changes in metadata locking is that if you have some really hot tables (in web applications, for example, it is typical to see a “sessions” table that is accessed on every request), then care should be taken when you have to ALTER the table. Otherwise it can easily cause a stall, as many threads pile up waiting for the table metadata lock, bringing down the MySQL server or depleting all the available connections. A quick way to spot such a pile-up is sketched below.
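
A quick sketch for spotting such a pile-up before it takes the server down:

mysql> SELECT count(*) FROM information_schema.processlist WHERE state = 'Waiting for table metadata lock';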

There are some other interesting consequences as well for applications that were developed against MySQL versions prior to 5.5:

  • I remember a customer case where a reporting slave ran a long transaction daily; this transaction tended to run for hours. Every day one of the tables was renamed and swapped, and that table was the one read by the long-running transaction. As the slave tried to execute the rename query, it would simply block waiting for the long-running transaction to finish. This caused the slave to lag for hours waiting for the transaction to complete; as you know, the slave is single-threaded, so it cannot apply any other event in the meantime. This was never an issue when the application was using a MySQL version < 5.5 as the datastore.
  • There was another interesting case, this time with how ActiveMQ uses MySQL when in HA mode. In HA mode there are two ActiveMQ servers, and both try to do something similar to the following sequence of events:
    session1 > CREATE TABLE t1(i int(11) not null auto_increment primary key) ENGINE=InnoDB;
    Query OK, 0 rows affected (0.09 sec)
    
    session1 > insert into t1 values(null);
    Query OK, 1 row affected (0.21 sec)
    
    session1 > start transaction;
    Query OK, 0 rows affected (0.00 sec)
    
    session1 > select * from t1 where i=1 for update;
    +---+
    | i |
    +---+
    | 1 |
    +---+
    1 row in set (0.00 sec)
    
    session2 > CREATE TABLE t1(i int(11) not null auto_increment primary key) ENGINE=InnoDB;
    

    When using MySQL 5.1, the second CREATE would just fail immediately with the error “ERROR 1050 (42S01): Table ‘t1’ already exists”, but because of how metadata locking works in 5.5 this is no longer the case: the second CREATE will simply block, with unintended consequences. A workaround here would be to set the lock_wait_timeout variable to a very low value and then execute the CREATE TABLE; this makes sure that the CREATE fails immediately (however, due to a different reason):

    session2 > set session lock_wait_timeout=1;CREATE TABLE t1(i int(11) not null auto_increment primary key) ENGINE=InnoDB;
    

    However, I feel that the CREATE TABLE should fail immediately in such a case, when the table already exists and there is no other DDL (such as a DROP TABLE) waiting to run on the same table, and as such I have reported a bug: http://bugs.mysql.com/bug.php?id=67873


Apr
04
2011
--

MySQL caching methods and tips

“The least expensive query is the query you never run.”

Data access is expensive for your application. It often requires CPU, network and disk access, all of which can take a lot of time. Using less computing resources, particularly in the cloud, results in decreased overall operational costs, so caches provide real value by avoiding using those resources. You need an efficient and reliable cache in order to achieve the desired result. Your end users also care about response times because this affects their work productivity or their enjoyment of your service. This post describes some of the most common cache methods for MySQL.

Popular cache methods

The MySQL query cache

When the query cache is enabled, MySQL examines each query to see if its results have been stored in the query cache. If the results have been cached, they are used instead of actually running the query. This improves response time by avoiding the work of the query. If you are getting the impression that I’ve just introduced you to the query performance magic bullet, unfortunately, I haven’t.

The problem with the query cache is “coarse invalidation”. That is, as soon as you change a single row in any table, the query cache entries for every query which accessed that table must be invalidated. If a frequently executed or expensive query is invalidated, response time will be significantly impacted.

The invalidation frequency is controlled by the rate of change in the database tables. This results in unpredictable, and therefore undesirable, performance. The other big problem with the query cache is that it is protected by a single mutex. On servers with many cores, a high volume of queries can cause extensive mutex contention. Percona Server even has a state in the processlist, ‘waiting on query cache mutex’, so that this is easier to spot.
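
To make the invalidation behavior concrete, here is a minimal sketch against a pre-5.7.20 server with the query cache enabled (the orders table is hypothetical):

    SELECT SQL_CACHE COUNT(*) FROM orders;  -- first run: executed, result cached
    SELECT SQL_CACHE COUNT(*) FROM orders;  -- identical text: answered from the cache
    UPDATE orders SET status = 'shipped' WHERE id = 1;
    -- the single-row UPDATE invalidates every cached entry that touches orders
    SELECT SQL_CACHE COUNT(*) FROM orders;  -- must be recomputed from scratch
    SHOW GLOBAL STATUS LIKE 'Qcache_hits';  -- watch this counter to confirm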

External cache (Memcached)

In order to eliminate the unpredictable nature of the query cache, external caches like Memcached are usually employed. When rows in the database change, the old query results remain available in the cache, but they are now “stale”. Until the cache key expires, the contents are available immediately and you avoid performing work in the database.

Eventually the cache contents expire. When this happens, the application attempts to get the cached value and fails. At this time, the application must compute the results, then place them into the cache. Additionally, memory pressure may cause unexpected invalidation of items, once again resulting in unpredictable, and therefore undesirable, performance.

If the cache is emptied (perhaps due to a restart, crash, upgrade or power loss) then all of the results are invalidated, which can cause very poor performance.

Cache invalidation is a problem

With both of these cache methods, once the cache is invalidated or expires the entire result must be recalculated. This recalculation may be very expensive. For example, suppose we need to calculate the total count of items sold. A query may have to access many millions of rows of data to compute that result and this takes time.

Another problem which can occur is the “cache stampede” aka a “miss storm”. This happens when multiple requests need data for the same key, but the key recently expired. The stampede impacts performance because multiple requests try to recompute the contents at the same time. These cache stampedes are essentially the cause of the unpredictable performance of the MySQL query cache, since the rate of invalidation can not be controlled, and multiple cache entries may be invalidated by a single table change.

Peter’s advice in the miss storm blog post suggests that for best performance one should pre-compute the data for expensive keys. That is, the keys which are frequently accessed, or those that are expensive to recompute. For the queries that are expensive to recompute, the pre-computation normally takes the form of summary tables, which are discussed next.

If request frequency is the problem (most likely because the response time goes up due to increased concurrency) but the time to compute the contents is low, then using a mutex around the content generation is a possible solution to the problem. This forces one query to do the computation while others wait for the result. There are also probabilistic methods to enqueue items to be rebuilt with some increasing probability as the request time approaches the expiration time. This does not, however, offer very much improvement for keys which are not accessed very frequently.
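
One way to implement that mutex without extra infrastructure is MySQL’s own named user locks; here is a minimal sketch, with a made-up lock name:

    SELECT GET_LOCK('rebuild:top_sellers', 5);   -- returns 1 for one session; others wait up to 5 seconds
    -- the winning session recomputes the expensive result and stores it in the cache here
    SELECT RELEASE_LOCK('rebuild:top_sellers');  -- waiters wake up and re-check the cache

Sessions that time out waiting can serve a stale value, if one is still available, rather than piling onto the recomputation.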

Use what you need

Both Memcached and the MySQL query cache are limited in size. If you try to cache more information than you can store, space will need to be freed in order to store the new information in the cache. In order to ensure cache efficiency, you must only place information in the cache that you intend to retrieve again. Storing data in the cache that you won’t read again also increases response time in your application because it wastes a round trip to the cache server. It wastes CPU and other resources too.

Pick an efficient cache representation

If you can cache an entire HTML block instead of the rows used to create it, then do so. This avoids the CPU usage on your web server to create the block from the rows again and again. If you are paying for compute cycles in the cloud, this can be very beneficial as it may reduce the number of instances you need.

Don’t make too many round trips

Asking the cache for many different pieces of data to satisfy a single request is not very efficient. Use multi_get when possible. Once again, caching entire portions of pages is a good way to reduce the number of round trips to the cache.

Summary tables

Queries that access a lot of data usually face two bottlenecks: disk IO and CPU usage. Disk IO is expensive, and even if the disk bottleneck is eliminated, the sorting, aggregation and join operations are still CPU intensive and single threaded. In order to avoid these operations, the results can be pre-aggregated into a summary table.

Ideally, the summary tables can be updated with only the information that changed since they were last populated. MySQL includes two statements that make this easier: CREATE TABLE .. SELECT and INSERT .. SELECT. These SQL commands can be used to either replace the contents of, or insert new data into the summary table.
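
For instance, here is a sketch of a full rebuild with CREATE TABLE .. SELECT, swapped in atomically so readers never see a half-built table; it assumes a hypothetical sales table and an existing sales_summary:

    CREATE TABLE sales_summary_new AS
        SELECT product_id, SUM(quantity) AS total_sold
        FROM sales
        GROUP BY product_id;
    RENAME TABLE sales_summary TO sales_summary_old,
                 sales_summary_new TO sales_summary;
    DROP TABLE sales_summary_old;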

One advantage of summary tables is that they are persistent unlike the query cache or Memcached. There is no risk of unexpected invalidation, either. The summary table always exists, and should be fast to access, since it can be indexed appropriately for your queries.

Using INSERT .. SELECT for summary tables

The INSERT .. SELECT approach works best when there is some sort of log table, such as a log of clicks, a web access log, metric data from a monitoring system, or the like, which is to be aggregated over time. With this type of source data, one does not expect to see very many (if any) updates to the data once it has been collected.
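
A sketch of that incremental pattern, run once a day from cron so that it aggregates only the previous, completed day (the clicks log and clicks_daily summary table are hypothetical):

    INSERT INTO clicks_daily (day, page_id, clicks)
        SELECT DATE(clicked_at), page_id, COUNT(*)
        FROM clicks
        WHERE clicked_at >= CURRENT_DATE - INTERVAL 1 DAY
          AND clicked_at <  CURRENT_DATE
        GROUP BY DATE(clicked_at), page_id;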

This method does not usually work well when database tables may be updated or when rows may be deleted. Such changes may happen if a log is accidentally imported twice, for example, and then the duplicate items are deleted. When this happens, decreasing the counts in summary tables may not be possible, and thus they may be out of sync with the underlying data.

If there are frequent changes to the data then other options for maintaining summary tables must be used. Either they must be rebuilt from scratch each time (like a memcache miss) or they must be updated. Updating the summary tables efficiently is a hard problem to solve; my latest post on using Flexviews addresses this problem.

Conclusion

If the least expensive queries are the ones you never run, then the most expensive queries very well may be the ones you have to run when the cache is empty. When talking about cache, the miss path is at least as important as the hit one. In order to make the miss path less expensive, use a layered approach to your caching. Cron jobs and summary tables can be used to make the miss path much less expensive. If you don’t pre-compute, and your website performance is dependent on your cache, then you could have a serious performance problem if your cache is unexpectedly emptied.

Effective caching is important. It is important to only cache things you know you will need again. Cache data in the form that it makes most sense to your application. Don’t cache the results of every query, simply because it is easy to do so.

Jul
05
2010
--

mk-query-digest, query comments and the query cache

I very much like the fact that MySQL allows you to embed comments into SQL statements. These comments are extremely convenient, because they are written into MySQL log files as part of the query. This includes the general log, the binary log and the slow query log. Maatkit includes tools which interact with these logs, including mk-query-digest. This tool, in particular, has a very nice option called –embedded-attributes which can process data embedded in query comments.

The support for embedded attributes makes some cool tricks possible. Peter and I co-presented a talk at this past MySQL Conference and Expo. In this talk I presented my Instrumentation-for-PHP class as a demonstration instrumentation application for the LAMP stack. Instrumentation-for-php collects information about the PHP process such as wall time, cpu time, mysql query times, etc., and automatically places this information into the Apache environment. It also includes support for “augmenting” SQL queries with additional information such as the PHP source line and user session information by embedding this information in a specially formatted SQL comment. This information can then be cross-referenced between MySQL logs and Apache logs.

SQL:

  SELECT 1
  -- File: myfile.php Line: 25 Function: myFunc SessionID: a1b2c3d4e5f6a7b8c9d0aef UserID: swany78

Notice that the query ends with a comment, and that this comment includes important information, such as the PHP session ID and the application user name. This allows mk-query-digest to discover these attributes and use them for aggregation. You could also filter on the embedded information to display only queries from a particular file, session or user, for example.

The --embedded-attributes option takes two parts. The first part is a Perl regular expression which identifies which comment will be treated as the one with attributes. The second part is another expression used to split the comment into attribute key/value pairs.

For example, run against a slow query log (the file name is a placeholder):

mk-query-digest slow.log --embedded-attributes '^-- [^\n]+','(\w+): ([^ ]+)'

The first part, '^-- [^\n]+', matches everything starting with -- and ending at a newline.
The second part includes two expressions, each enclosed in parentheses so that they will be captured. The first capturing group, (\w+), matches the attribute name, one or more word characters followed by a colon and a space. The second, ([^ ]+), captures the value, everything up to the next space.

A problem with this approach is that MySQL treats identical queries with different comments as different queries, which essentially renders the query cache inoperative. A cache that can’t return any results is worse than not having a cache at all: if each query carries some sort of unique identifier, the cache never returns a result, evictions become very frequent, and the query cache consumes additional resources for no effective benefit.
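
To make this concrete: under the stock query cache, the two statements below hash to different entries and can never answer each other, even though their results are identical (table name hypothetical):

    SELECT name FROM users WHERE id = 1; -- SessionID: a1b2c3
    SELECT name FROM users WHERE id = 1; -- SessionID: d4e5f6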

Luckily the problem is mostly* fixed in Percona Server 11.

I wrote a simple test script to demonstrate the difference. The test loops twice over a set of query patterns. Each pattern is executed 1000 times in each iteration of the loop. The queries differ only by comments. In the best case scenario where comments are stripped we would expect 1000 misses and then all hits from this test. You can enable query cache stripping in Percona Server at runtime:

SQL:

  mysql> SET GLOBAL query_cache_strip_comments='ON';
  Query OK, 0 rows affected (0.00 sec)

Query               Stripping           No Stripping
                    Hits      Misses    Hits      Misses
no comment          1000      1000      1000      1000
random at start     2000      0         0         2000
random at middle    2000      0         0         2000
static at start     2000      0         1000      1000

As you can see, the first loop of queries (that is, those with no comment at all) gets 1000 misses; everything beyond that is a hit when stripping is enabled. Contrast this with the results where stripping is not enabled: there are many more misses, due to the random values in the query comments. Also notice the query with a static comment. It gets 1000 hits and 1000 misses, because MySQL treats identical queries with identical comments as the same. When comments are stripped, every one of these queries is answered by the cache.

If you are benefiting from the query cache, but would like better instrumentation in your queries, consider using Percona Server and turning on the query_cache_strip_comments feature.

* Percona Server 11.1 still has a problem with # and -- comments at the start of a query (before SELECT), and in some cases, with varying white-space. These issues will be rectified in the next release. I’ve updated instrumentation-for-php to place the comment at the end of the query to avoid these problems.


Entry posted by Justin Swanhart
