Jun
15
2018
--

This Week in Data with Colin Charles 42: Security Focus on Redis and Docker a Timely Reminder to Stay Alert

Colin Charles

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

Much of last week, there was a lot of talk around this article: New research shows 75% of ‘open’ Redis servers infected. It is a reminder that one should always read beyond the headlines, because they tend to be more sensationalist than the facts warrant. From the author of Redis, I highly recommend reading Clarifications on the Incapsula Redis security report, because in this case it is not just the headline – the content is also suspect. Antirez had to write that post to help the press get it right (we all need to help keep reportage accurate).

Not to depart from the Redis world just yet: Antirez also collaborated with the Apple Information Security Team on the Redis Lua subsystem. The details are pretty interesting, as documented in Redis Lua scripting: several security vulnerabilities fixed; you’ll note that the Alibaba team also found some other issues. Antirez also ensured that the Redis cloud providers (notably: Redis Labs, Amazon, Alibaba, Microsoft, Google, Heroku, Open Redis and Redis Green) got notified first (compose.io was initially missing, as noted in the comments, but has since been added to the list). I do not know if Linux distributions were also informed, but they will probably be rolling out updates soon.

In the “be careful where you get your software” department: some criminals figured out they could distribute crypto-currency mining software pre-installed in the Docker containers they published. They’ve apparently made over $90,000. It is good to note that the backdoored images, downloaded 5 million times, have finally been removed from Docker Hub. They were, however, up on Docker Hub for ten months, racking up over 5 million downloads across 17 images. Know what images you are pulling. Maybe this is yet more reason for software providers to run their own registries?

James Turnbull is out with a new book: Monitoring with Prometheus. It has just been released; I’ve grabbed a copy, and a review will follow shortly. He’s managed all this while pulling off what seems to be yet another great O’Reilly Velocity San Jose Conference.

Releases

A quiet week on this front.

Link List

  • INPLACE upgrade from MySQL 5.7 to MySQL 8.0
  • PostgreSQL relevant: What is the difference between streaming replication vs hot standby vs warm standby?
  • A new paper on Amazon Aurora is out: Amazon Aurora: On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes. It was presented at SIGMOD 2018, and an abstract: “One of the more novel differences between Aurora and other relational databases is how it pushes redo processing to a multi-tenant scale-out storage service, purpose-built for Aurora. Doing so reduces networking traffic, avoids checkpoints and crash recovery, enables failovers to replicas without loss of data, and enables fault-tolerant storage that heals without database involvement. Traditional implementations that leverage distributed storage would use distributed consensus algorithms for commits, reads, replication, and membership changes and amplify cost of underlying storage.” Aurora, as you know, avoids distributed consensus under most circumstances. Short 8-page read.
  • Dormando is blogging again, and this was of particular interest — Caching beyond RAM: the case for NVMe. This is done in the context of memcached, which I am certain many use.
  • It is particularly heartening to note that not only does MongoDB use Linkbench for some of their performance testing, they’re also contributing to making it better via a pull request.

Industry Updates

Trying something new here: coverage of fundraising and people on the move in the database industry.

  • Kenny Gorman, who has been on the program committee for several Percona Live conferences and has spoken at the event multiple times, is the founder and CEO of Eventador, a stream-processing-as-a-service company built on Apache Kafka and Apache Flink. Eventador has just raised $3.8 million in funding to fuel its growth, and is naturally spending some of it on hiring. Read the full press release.
  • Jimmy Guerrero (formerly of MySQL and InfluxDB) is now VP Marketing & Community at YugaByte DB. YugaByte was covered in column 13 as having raised $8 million in November 2017.

Upcoming appearances

  • DataOps Barcelona – Barcelona, Spain – June 21-22, 2018 – code dataopsbcn50 gets you a discount
  • OSCON – Portland, Oregon, USA – July 16-19, 2018
  • Percona webinar on MariaDB Server 10.3 – June 26, 2018

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

The post This Week in Data with Colin Charles 42: Security Focus on Redis and Docker a Timely Reminder to Stay Alert appeared first on Percona Database Performance Blog.

Jun
19
2013
--

What technologies are you running alongside MySQL?

In many environments, MySQL is not the only technology used to store and process data.

Quite frequently, especially with large-scale or complicated applications, we use MySQL alongside other technologies: for reporting, for caching, or as the main data store for portions of the application.

What technologies for data storage and processing do you use alongside MySQL in your environment? Please feel free to elaborate in the comments about your use case and experiences!

Note: there is a poll embedded within this post; please visit the site to participate.

The post What technologies are you running alongside MySQL? appeared first on MySQL Performance Blog.

Mar
29
2013
--

MySQL 5.6 – InnoDB Memcached Plugin as a caching layer

A common practice to offload traffic from MySQL 5.6 is to use a caching layer to store expensive result sets or objects.  Some typical use cases include:

  • Complicated query result set (search results, recent users, recent posts, etc)
  • Full page output (relatively static pages)
  • Full objects (user or cart object built from several queries)
  • Infrequently changing data (configurations, etc)

In pseudo-code, here is the basic approach:

data = fetchCache(key)
if (data) {
  return data
}
data = callExpensiveFunction(params)
storeCache(data, key)
return data

Memcached is a very popular (and proven) option used in production as a caching layer.  While very fast, one major potential shortcoming of memcached is that it is not persistent.  While a common design consideration when using a cache layer is that “data in cache may go away at any point”, this can result in painful warmup time and/or costly cache stampedes.

Cache stampedes can be mitigated through application approaches (semaphores, pre-expiring and populating, etc), but those approaches are more geared towards single key expiration or eviction.  However, they can’t help overall warmup time when the entire cache is cleared (think restarting a memcache node).  This is where a persistent cache can be invaluable.
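
As a minimal PHP sketch of the semaphore approach (key names, timeouts, and the retry policy are illustrative, not a definitive implementation), Memcached::add() works as the lock because only one client can create a given key:

function fetchWithLock($cache, $key, $ttl, $rebuild) {
    $data = $cache->get($key);
    if ($data !== false) {
        return $data;  // normal hit path
    }
    // add() succeeds for exactly one client; everyone else loses the race
    if ($cache->add("lock:$key", 1, 30)) {
        $data = $rebuild();               // only the lock holder recomputes
        $cache->set($key, $data, $ttl);
        $cache->delete("lock:$key");
        return $data;
    }
    usleep(100000);                       // losers wait briefly and retry
    return fetchWithLock($cache, $key, $ttl, $rebuild);  // a real implementation would bound the retries
}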

Enter MySQL 5.6 with the memcached plugin

As part of the standard MySQL 5.6 GA distribution, there is a memcached plugin included in the base plugin directory (/usr/lib64/mysql/plugin/libmemcached.so) that can be stopped and started at runtime.  In a nutshell, here is how one would start the memcached plugin:

mysql> install plugin daemon_memcached soname 'libmemcached.so';

In an effort to not re-invent the wheel, here is a link to the full documentation for setting up the plugin:

http://dev.mysql.com/doc/refman/5.6/en/innodb-memcached-setup.html
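
To give a flavor of what the result looks like, here is a hedged PHP sketch against the defaults described in that documentation (the plugin listening on the standard memcached port, with the sample test.demo_test container; adjust to your own innodb_memcache.containers configuration):

$m = new Memcached();
$m->addServer('127.0.0.1', 11211);  // the plugin speaks the memcached protocol

$m->set('key1', 'hello world');     // persisted as a row in test.demo_test
var_dump($m->get('key1'));          // served back through InnoDB

// The same data is visible with plain SQL:
//   mysql> SELECT c1, c2 FROM test.demo_test WHERE c1 = 'key1';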

As a quick benchmark, I ran some batches of fetch and store against both a standard memcached instance and a minimally tuned MySQL 5.6 instance running the memcached plugin.  Here are some details about the test:

  • Minimal hardware (vBox instances on MacBook Pro)
    • Centos 6.4
    • Single core VM
    • 528M RAM
    • Host-Only network
    • 1 Box with http/php, 1 box with memcache or mysql started
  • PHP script
    • Zend framework
    • libmemcached PECL module
    • Zend_Cache_Backend_Libmemcached

Here is the rough code for this benchmark:

// Identical config/code for memcached vs InnoDB
$frontendOpts = array(
  'caching' => true,
  'lifetime' => 3600,
  'automatic_serialization' => true
);
$memcacheOpts = array(
  'servers' => array(
    array(
      'host'   => '192.168.57.51',
      'port'   => 11211,
      'weight' => 1,
    )
  ),
  'client' => array(
    'compression' => true,
  ),
);
$cache = Zend_Cache::factory('Core', 'Libmemcached', $frontendOpts, $memcacheOpts);

// 100,000 store operations
$timer->start();
for ($i = 0; $i < 100000; $i++) {
  $cache->save(new Object(), "key_$i");
}
$totalTimeStore = $timer->stop();
$avgTimeStore = $totalTimeStore / 100000;

// 10 passes over the same keys = 1,000,000 fetch operations
// (the inner loop needs its own counter, $j, or it would clobber $i
// and the outer loop would run only once)
$timer->start();
for ($i = 0; $i < 10; $i++) {
  for ($j = 0; $j < 100000; $j++) {
    $obj = $cache->load("key_$j");
  }
}
$totalTimeFetch = $timer->stop();
$avgTimeFetch = $totalTimeFetch / 1000000;

While this benchmark doesn’t show any multi-threading or other advanced operation, it is using identical code to eliminate variation due to client libraries.  The only change between runs is on the remote server (stop/start memcached, stop/start plugin).

As expected, there is a slowdown for write operations when using the InnoDB version.  But there is also a slight increase in the average fetch time.  Here are the raw results from this test run (100,000 store operations, 1,000,000 fetch operations):

Standard Memcache:

Storing [100,000] items:
60486 ms total
0.60486 ms per/cmd
0.586 ms min per/cmd
0.805 ms max per/cmd
0.219 ms range per/cmd

Fetching [1,000,000] items:
288257 ms total
0.288257 ms per/cmd
0.2843 ms min per/cmd
0.3026 ms max per/cmd
0.0183 ms range per/cmd

InnoDB Memcache:

Storing [100,000] items:
233863 ms total
2.33863 ms per/cmd
1.449 ms min per/cmd
7.324 ms max per/cmd
5.875 ms range per/cmd

Fetching [1,000,000] items:
347181 ms total
0.347181 ms per/cmd
0.3208 ms min per/cmd
0.4159 ms max per/cmd
0.0951 ms range per/cmd

InnoDB MySQL Select (same table):

Fetching [1,000,000] items:
441573 ms total
0.441573 ms per/cmd
0.4327 ms min per/cmd
0.5129 ms max per/cmd
0.0802 ms range per/cmd

Keep in mind that the entire data set fits into the buffer pool, so there are no reads from disk.  However, there is write activity stemming from the fact that this is using InnoDB under the hood (redo logs, etc).

Based on the above numbers, here are the relative differences:

  • InnoDB store operation was 280% higher (~1.73 ms/op)
  • InnoDB fetch operation was 20% higher (~.06 ms/op)
  • MySQL Select showed 27% increase over InnoDB fetch (~.09 ms/op)
    • This replaced $cache->load() with $db->query("SELECT * FROM memcached.container WHERE id='key_id'");
    • id is PK of the container table

While there are increases in both operations, there are some tradeoffs to consider:

  • Cost of additional memcached hardware
  • Cost of operations time to maintain an additional system
  • Impact of warmup time to application
  • Cost of disk space on database server

Now, there are definitely other NoSQL options for persistent cache out there (Redis, Couchbase, etc), but they are outside the scope of this investigation and would require different client libraries and benchmark methodology.

My goal here was to compare a transparent switch (in terms of code) and experiment with the memcache plugin.  Even the use of HandlerSocket would require coding changes (which is why it was also left out of the discussion).

The post MySQL 5.6 – InnoDB Memcached Plugin as a caching layer appeared first on MySQL Performance Blog.

Apr
04
2011
--

MySQL caching methods and tips

“The least expensive query is the query you never run.”

Data access is expensive for your application. It often requires CPU, network and disk access, all of which can take a lot of time. Using less computing resources, particularly in the cloud, results in decreased overall operational costs, so caches provide real value by avoiding using those resources. You need an efficient and reliable cache in order to achieve the desired result. Your end users also care about response times because this affects their work productivity or their enjoyment of your service. This post describes some of the most common cache methods for MySQL.

Popular cache methods

The MySQL query cache

When the query cache is enabled, MySQL examines each query to see if its results have already been stored in the query cache. If the results have been cached, they are used instead of actually running the query. This improves response time by avoiding the work of the query. If you are getting the impression that I’ve just introduced you to the query performance magic bullet, unfortunately, I haven’t.

The problem with the query cache is “coarse invalidation”. That is, as soon as you change a single row in any table, the query cache entries for every query which accessed that table must be invalidated. If a frequently executed or expensive query is invalidated, response time will be significantly impacted.

The invalidation frequency is controlled by the rate of change in the database tables. This results in unpredictable and therefore, undesirable, performance. The other big problem with the query cache is that it is protected by a single mutex. On servers with many cores, a high volume of queries can cause extensive mutex contention. Percona Server even has a state in the processlist ‘waiting on query cache mutex’ so that this is easier to spot.
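
If you are running with the query cache enabled, the server’s status counters are the quickest way to see whether invalidation is hurting you; for example:

mysql> SHOW VARIABLES LIKE 'query_cache_size';
mysql> SHOW GLOBAL STATUS LIKE 'Qcache%';

Qcache_hits versus Qcache_inserts shows how often cached results are actually reused, and a steadily climbing Qcache_lowmem_prunes means entries are being evicted for space on top of any invalidation from table changes.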

External cache (Memcached)

In order to eliminate the unpredictable nature of the query cache, external caches like Memcached are usually employed. When rows in the database change, the old query results remain available in the cache, but they are now “stale”. Until the cache key expires, the contents are available immediately and you avoid performing work in the database.

Eventually the cache contents expire. When this happens, the application attempts to get the cached value and fails. At this time, the application must compute the results, then place them into the cache. Additionally, memory pressure may cause unexpected invalidation of items, once again resulting in unpredictable, and therefore undesirable performance.

If the cache is emptied (perhaps due to a restart, crash, upgrade or power loss) then all of the results are invalidated, which can cause very poor performance.

Cache invalidation is a problem

With both of these cache methods, once the cache is invalidated or expires the entire result must be recalculated. This recalculation may be very expensive. For example, suppose we need to calculate the total count of items sold. A query may have to access many millions of rows of data to compute that result and this takes time.

Another problem which can occur is the “cache stampede” aka a “miss storm”. This happens when multiple requests need data for the same key, but the key recently expired. The stampede impacts performance because multiple requests try to recompute the contents at the same time. These cache stampedes are essentially the cause of the unpredictable performance of the MySQL query cache, since the rate of invalidation can not be controlled, and multiple cache entries may be invalidated by a single table change.

Peter’s advice in the miss storm blog post suggests that for best performance one should pre-compute the data for expensive keys. That is, the keys which are frequently accessed, or those that are expensive to recompute. For the queries that are expensive to recompute, the pre-computation normally takes the form of summary tables, which are discussed next.

If request frequency is the problem (most likely because response time goes up due to increased concurrency) but the time to compute the contents is low, then using a mutex around the content generation is a possible solution. This forces one request to do the computation while others wait for the result. There are also probabilistic methods that rebuild items with increasing probability as the request time approaches the expiration time. This does not, however, offer much improvement for keys which are not accessed very frequently.
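
As a rough PHP sketch of the probabilistic idea (the function name, payload layout, and the 60-second window are invented for illustration): each reader recomputes early with a probability that grows as the entry approaches expiration, spreading the rebuild out instead of letting every client miss at once.

function probabilisticGet($cache, $key, $ttl, $rebuild) {
    $entry = $cache->get($key);  // value stored together with its logical expiry
    $now = time();
    if ($entry !== false) {
        $timeLeft = $entry['expires_at'] - $now;
        // chance of an early rebuild rises linearly over the last 60 seconds of life
        $p = ($timeLeft < 60) ? (60 - $timeLeft) / 60 : 0;
        if (mt_rand() / mt_getrandmax() >= $p) {
            return $entry['value'];  // most readers keep using the cached copy
        }
    }
    $value = $rebuild();
    // keep the item in memcached a little past its logical expiry as a grace period
    $cache->set($key, array('value' => $value, 'expires_at' => $now + $ttl), $ttl + 60);
    return $value;
}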

Use what you need

Both Memcached and the MySQL query cache are limited in size. If you try to cache more information than you can store, space will need to be freed in order to store the new information in the cache. In order to ensure cache efficiency, you must only place information in the cache that you intend to retrieve again. Storing data in the cache that you won’t read again also increases response time in your application because it wastes a round trip to the cache server. It wastes CPU and other resources too.

Pick an efficient cache representation

If you can cache an entire HTML block instead of the rows used to create it, then do so. This avoids the CPU usage on your web server to create the block from the rows again and again. If you are paying for compute cycles in the cloud, this can be very beneficial, as it may reduce the number of instances you need.

Don’t make too many round trips

Asking the cache for many different pieces of data to satisfy a single request is not very efficient. Use multi_get when possible. Once again, caching entire portions of pages is a good way to reduce the number of round trips to the cache.
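
With the PHP Memcached client, for example, one getMulti() call replaces several round trips (the key names here are hypothetical page fragments):

$m = new Memcached();
$m->addServer('127.0.0.1', 11211);

// one round trip for all three fragments instead of three separate get() calls
$fragments = $m->getMulti(array('page:header', 'page:sidebar', 'page:footer'));
foreach ($fragments as $key => $html) {
    echo $html;  // getMulti() returns only the keys that were found
}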

Summary tables

Queries that access a lot of data usually face two bottlenecks: disk IO and CPU usage. Disk IO is expensive, and even if the disk bottleneck is eliminated, the sorting, aggregation and join operations are still CPU intensive and single threaded. In order to avoid these operations, the results can be pre-aggregated into a summary table.

Ideally, the summary tables can be updated with only the information that changed since they were last populated. MySQL includes two statements that make this easier: CREATE TABLE .. SELECT and INSERT .. SELECT. These SQL commands can be used to either replace the contents of, or insert new data into the summary table.

One advantage of summary tables is that they are persistent unlike the query cache or Memcached. There is no risk of unexpected invalidation, either. The summary table always exists, and should be fast to access, since it can be indexed appropriately for your queries.

Using INSERT .. SELECT for summary tables

The INSERT .. SELECT approach works best when there is some sort of log table, such as a log of clicks, a web access log, metric data from a monitoring system, or the like, which is to be aggregated over time. With this type of source data, one does not expect to see very many (if any) updates to the data once it has been collected.
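
As a sketch (the clicks_log and clicks_summary tables and their columns are made up for illustration), a periodic job can fold only the newest log rows into the summary:

-- aggregate yesterday's clicks into the summary table
INSERT INTO clicks_summary (day, page_id, clicks)
SELECT DATE(clicked_at), page_id, COUNT(*)
FROM clicks_log
WHERE clicked_at >= CURDATE() - INTERVAL 1 DAY
  AND clicked_at <  CURDATE()
GROUP BY DATE(clicked_at), page_id
ON DUPLICATE KEY UPDATE clicks = VALUES(clicks);

With a unique key on (day, page_id), the job can be re-run safely: a duplicate day simply overwrites that day’s counts instead of inserting a second row.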

This method does not usually work well when database tables may be updated or when rows may be deleted. Such changes may happen if, for example, a log is accidentally imported twice and the duplicate items are then deleted. When this happens, decreasing the counts in the summary tables may not be possible, and thus they may fall out of sync with the underlying data.

If there are frequent changes to the data, then other options for maintaining summary tables must be used. Either they must be rebuilt from scratch each time (like a memcache miss) or they must be updated. Updating summary tables efficiently is a hard problem to solve. My latest post, on using Flexviews, addresses this problem.

Conclusion

If the least expensive queries are the ones you never run, then the most expensive queries very well may be the ones you have to run when the cache is empty. When talking about cache, the miss path is at least as important as the hit one. In order to make the miss path less expensive, use a layered approach to your caching. Cron jobs and summary tables can be used to make the miss path much less expensive. If you don’t pre-compute, and your website performance is dependent on your cache, then you could have a serious performance problem if your cache is unexpectedly emptied.

Effective caching is important. It is important to only cache things you know you will need again. Cache data in the form that it makes most sense to your application. Don’t cache the results of every query, simply because it is easy to do so.

Sep
23
2010
--

More on dangers of the caches

I wrote a couple of weeks ago on the dangers of bad cache design. Today I’ve been troubleshooting a production-down case which had a fair number of issues related to how the cache was used.

The deal was as follows: an update to the codebase was performed, and it caused performance issues, so it was rolled back, yet the problem remained. This is a very common case, where you hear the customer telling you everything is the same as it was yesterday… but it does not work today.

When I hear these words I like to tell people that computers are state machines and they work in a predictable way. If it does not work the same today as it worked yesterday, something was changed… it is just that you may not recognize WHAT was changed. It may be something as subtle as a change in a query plan or an increase in search engine bot activity. It may be the RAID writeback cache being disabled due to battery learning, but there must be something. This is actually where trending often comes in handy – graphs will often expose which metrics became different, they just need to be detailed enough.

So back to this case… MySQL was getting overloaded with thousands of identical queries… which corresponded to a cache miss storm. But why was it not a problem before? The answer lies in caching as well. When the software is deployed, memcache is cleared to avoid potential issues with cache content that does not match the new code, so the system has to start with a cold cache, which overloads it, and it never recovers. With an expiration-based cache you increase the chance of conditions under which the system will not gradually recover by populating the cache: if, because of cache misses, performance is so bad that the speed of populating the cache with new items is lower than the speed at which items expire, you may never get the system warmed up.

But wait again… was this the first change? Was the code never updated before? Of course it was. As often happens with serious failures, there is more than one cause pushing the system over the top. During a normal deployment the code change is done at night when traffic is low, so even if the system has higher load and worse response time for several minutes after the code is updated, the traffic is not high enough to push it into conditions from which it is unable to recover. This time the code update was not successful, and by the time the rollback was completed the traffic was already high enough to cause problems.

So the immediate solution to bring the system up was surprisingly simple. We just had to let traffic onto the system in stages, allowing memcache to warm up. There was no code on the application side that would allow us to do this, so we did it on the MySQL side instead: “SET GLOBAL max_connections=20” to limit the number of connections to MySQL, letting the application err when it tries to put too much load on MySQL, then raising the number of connections as the MySQL load stabilizes, until finally all traffic can be served without problems.
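
The ramp-up itself looked roughly like this (the exact steps and values here are illustrative):

mysql> SET GLOBAL max_connections = 20;    -- let a trickle of traffic warm the cache
-- watch load and the memcached hit rate stabilize, then open up in stages
mysql> SET GLOBAL max_connections = 50;
mysql> SET GLOBAL max_connections = 200;
mysql> SET GLOBAL max_connections = 1000;  -- back to the normal production value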

So what can we learn from this, besides the cache design issues I mentioned in the previous post?

Include Rollback in the Maintenance Window Ensure you plan the maintenance window long enough that you can do a rollback inside it, and do not hesitate to perform that rollback if you’re running out of time. Know how long the rollback takes and have it well prepared. Way too often I see people trying to make things work until the time allocated for the operation is up, so the rollback has to be done outside of the allowed window.

Know your Cold Cache Performance and Behavior Know how your application behaves with a cold cache. Does it recover, or does it just die under high traffic? How high is the response time penalty, and how long does it take to reach normal performance?

Have a way to increase traffic gradually There are many reasons beyond caching why you may want to ramp traffic up slowly. Make sure you have some means to do that. I’d recommend doing it per user session, so some users are let in and can use the system completely while others have to wait their turn to get in. That is a lot better than doing it on a per-page basis, where random pages give error messages. In some cases you can also ramp up feature by feature.

Consider Pre-Priming Caches In some cases, when cold-cache response time is too poor, you may want to prime the caches by running or replaying some production workload on the system before it is put online. In this case all the ramp-up and suffering through bad response times is done by a script… which does not care.


Entry posted by peter


Jul
24
2010
--

Caching could be the last thing you want to do

I recently had a run-in with a very popular PHP ecommerce package which makes me want to voice a recurring mistake I see in how many web applications are architected.

What is that mistake?

The ecommerce package I was working with depended on caching.  Out of the box it couldn’t serve 10 pages/second unless I enabled some features which were designed to be “optional” (but clearly they weren’t).

I think with great tools like memcached it is easy to get carried away and use it as the mallet for every performance problem, but in many cases it should not be your first choice.  Here is why:

  • Caching might not work for all visitors – You look at a page, it loads fast. But is this the same for every user? Caching can sometimes be an optimization that makes the average user’s experience faster, but in reality you should care more that all users get a good experience (Peter explains why here, talking about six sigma). In practice it can often be the same user that hits all the cache misses, which can make this problem even worse.
  • Caching can reduce visibility – You look at the performance profile of what takes the most time for a page to load and start trying to apply optimization.  The problem is that the profile you are looking at may skew what you should really be optimizing.  The real need (thinking six sigma again) is to know what the miss path costs, but it is somewhat hidden.
  • Cache management is really hard – have you planned for cache stampeding, or many cache items being invalidated at the same time?

What alternative approach should be taken?

Caching should be seen more as a burden that many applications just can’t live without.  You don’t want that burden until you have exhausted all other easily reachable optimizations.

What other optimizations are possible?

Before implementing caching, here is a non-exhaustive checklist to run through:

  • Do you understand every execution plan of every query? If you don’t, set long_query_time=0 and use mk-query-digest to capture queries, then run them through MySQL’s EXPLAIN command (see the sketch after this list).
  • Do your queries SELECT *, only to use a subset of columns?  Or do you extract many rows, only to use a subset? If so, you are extracting too much data, and (potentially) limiting further optimizations like covering indexes.
  • Do you have information about how many queries were required to generate each page? Or more specifically, do you know that each one of those queries is required, and that none of them could potentially be eliminated or merged?
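
For the first item, the workflow is roughly as follows (the log path is illustrative; mk-query-digest is the Maatkit tool mentioned above):

mysql> SET GLOBAL long_query_time = 0;   -- log every query, not just slow ones
$ mk-query-digest /var/log/mysql/mysql-slow.log > digest.txt
mysql> EXPLAIN SELECT ... \G             -- examine the worst offenders from the digest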

I believe this post can be summed up as “Optimization rarely decreases complexity. Avoid adding complexity by only optimizing what is necessary to meet your goals.”  – a quote from Justin’s slides on instrumentation-for-php.  In terms of future-proofing design, many applications are better off keeping it simple and (at least initially) refusing the temptation to try and solve some problems “like the big guys do”.


Entry posted by Morgan Tocker


Jan
10
2010
--

Active Cache for MySQL

One of the problems I have with Memcache is that this cache is passive: it only stores cached data. This means an application using Memcache has to have special logic to handle misses from the cache, and has to be careful updating the cache – you may have multiple data modifications happening at the same time. Finally, you have to pay with increased latency to construct the items that expired from the cache, while they could have been refreshed in the background. I think all of these problems could be solved with the concept of an active cache.

The idea behind Active Cache is very simple: for any data retrieval operation, the cache would actually know how to construct the object, so you will never get a miss from the cache unless there is an error. Among existing tools, this probably maps best onto registering the jobs with Gearman.

The updates of the data in this case should go through the same system, so you get serialization (or other logic) for your data updates.

You could also use the same functions to update the data when it expires. This could be exposed as explicit logic – something like “expires in 300 seconds, start refresh at 200 seconds” – as well as automated.

The logic for automatic handling could be as follows: after the key has expired, we can purge its value but keep the key in the cache with an “expired” flag. If we see a lot of requests for a key while it is expired, the cache could decide to refresh such keys based on available bandwidth.
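
As a hypothetical PHP sketch of that flow using the Gearman extension (the 'refresh_key' worker and the payload layout are invented for illustration): a read that finds an entry flagged as expired returns the stale value immediately and queues a background rebuild.

$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);
$gearman = new GearmanClient();
$gearman->addServer('127.0.0.1', 4730);

function activeGet($cache, $gearman, $key) {
    $entry = $cache->get($key);
    if ($entry === false) {
        return false;  // in a true active cache this only happens on error
    }
    if (!empty($entry['expired'])) {
        // serve the stale value now; a worker recomputes and re-sets the key
        $gearman->doBackground('refresh_key', $key);
    }
    return $entry['value'];
}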

Another extension to common caching methods I’d like to see is having max_age specified on the GET request. In many applications expiration is not data driven but rather request driven. Consider, for example, posting a comment on this blog. If you’re the user who posted the comment, you have to see it instantly to avoid a bad experience. At the same time, other users can continue reading stale data – if they see the comment appearing 10 seconds later, they will not have any bad user experience.

Finally, I think Active Cache could be very helpful for handling write-back scenarios. There are many cases where a lot of updates happen to data – counters, last login, scores, etc. – which do not really need to be reflected in the database instantly. If the cache itself “knows” how to update the data, you could define policies on how frequently each data object needs to be synced to the database.

I’d like to hear some feedback if you think such concept would be helpful for your applications and if you think there are existing tools and technologies which can be used to conveniently build things like this.


Entry posted by peter


Oct
16
2009
--

MySQL-Memcached or NOSQL Tokyo Tyrant – part 2

Part 1 of our series set up our “test” application and looked at boosting its performance by buffering MySQL with memcached.  Our test application is simple and requires only 3 basic operations per transaction: 2 reads and 1 write.  Using memcached combined with MySQL, we ended up getting nearly a 10X performance boost from the application.  Now we are going to look at what we could achieve if we did not have to write to the database at all.  So let’s look at what happens if we push everything, including writes, into memcached.

Benchmarks if everything is in memcached

Wow, that’s shockingly fast, isn’t it! I guess being completely in memory helps for this app.  What is very interesting is that accessing 100% of the data in memcached gives very similar numbers to accessing 100% of the data in memory in the DB (part 1 benchmarked a 4GB buffer pool as being able to handle 7K TPS)… something is not 100% right here.  It stands to reason that memcached should be faster for this application than the DB: it’s just doing two gets via key and 1 set.  So why the similar numbers?

Well, glad you asked.  It’s the API.  The API in this case was Cache::Memcached; by switching to Cache::Memcached::Fast, look what happens:

Memcached API - Fast

That is a nice jump in performance!

Using Memcached::Fast was kind of a mixed bag when looking at the benchmarks for mixing MySQL and Memcached in my tests:

Sometimes API changes can make a huge difference

In this case I think the Fast API was slower when working with MySQL with a 256M BP because the slower returns from memcached acted as a bottleneck, thinning the demands on MySQL to write data and smoothing out the workload.  When we eliminate this bottleneck with the Fast API, MySQL gets hammered.  This type of thing happens a lot.  For example, an application is CPU bound, so you add more processing power, but then you hit the disks harder and now you’re disk bound.

A couple of good things to remember here:  #1, resolving one bottleneck can open another bottleneck that is much worse.  #2, understand that not all APIs are created equal.  Additionally, the configuration and setup that works well on one system may not work well on another.  Because of this, people often leave lots of performance on the table.  Don’t just trust that your current API or config is optimal; test and make sure it fits your application.

So adding memcached on top of MySQL for our test application can significantly boost performance.  But notice that if we were running 100% in memcached and could cut out MySQL, we could get 2.5X more performance over a mixed solution and 100X over just stock MySQL.  As the number of writes against the database increases, this gap will increase.  So let’s ditch the database!  But wait!  You need the DB for persistence, right?

It depends.  A database may not be the best fit for every application.  There are several “NOSQL” solutions out in the open source space that can give you some of the ease of memcached but with the persistence most people use their database for.   Each application is different, and understanding the application’s requirements is key to picking an appropriate solution.   I am going to look at several database alternatives over the next few months.  I need to start somewhere, so I decided to start with Tokyo Tyrant and Tokyo Cabinet.    So stop in next time for part 3 of this series, where we will run the same tests against Tokyo Tyrant.


Entry posted by matt


Oct
15
2009
--

MySQL-Memcached or NOSQL Tokyo Tyrant – part 1

All too often people force themselves into using a database like MySQL with no thought as to whether it’s the best solution to their problem. Why?  Because their other applications use it, so why not the new application?  Over the past couple of months I have been doing a ton of work for clients who use their database like most people use memcached.  Look up a row based on a key, update the data in the row, stuff the row back in the database.  Rinse and repeat.  Sure these setups vary sometimes, throwing in a “lookup” via username, or even the rare count.  But for the most part they are designed to be simple.

A classic example is a simple online game.  An online game may only require that the application retrieve a single record from the database.  The record may contain all the vital stats for the game, be updated and stuffed back into the database.  You would be surprised how many people use this type of system; I run into this type of application frequently.  Keeping it simple ensures that the application is generally lean and mean and performs well.  The issue is that even this simple design can start to have issues as the data size increases and you blow through your available memory.  Is there a better architecture?  Is there a way to get more scalability out of your database?  Is the database even the best place for this data?

I decided to walk through setting up a very simple application that does what I have seen many clients do.  Using this application I can then compare using MySQL to using MySQL + Memcached, and then to other solutions like Tokyo Tyrant or Cassandra.   My Application does the following:

A.)  Read a row from a database based on an integer-based primary key
B.)  Update data from that row and replace the stored contents on disk
C.)  Use the data from that row to look up a row in another table based on a text field (called email address)

Seems simple enough, right?  My two tables each contain 5M rows of data.  Let’s see what happens:

DB Fits into Memory

Chart of TPS for benchmark application

You can see a dramatic drop off in performance as my data falls out of memory; that’s not cool, is it?  After all, database sizes tend to always grow and very rarely shrink.  Which leads to a challenge faced by almost everyone: how do you maintain your performance in the face of increasing data size?

Here is where people start to scratch their heads.  They naturally assume they need to scale more: we need more memory!   If performance sucks, we must need more.  So here come the bigger boxes, the read-only slaves,  the complex sharding systems, the discussions on cluster, more memcached.  We need to cover up the database’s inefficiencies to ensure that our application scales.

The problem is that for some applications, we are fixing symptoms, not the problem itself.  No matter how much you want it to fit,  some things may not work (like the Godfather 3).    The issue is that people assume data storage has to be in the database.  “It’s data, it needs to go into the database.” is often the battle cry.   But hold on to your hat,  I am going to shock you.  For some applications, putting your data in the database is silly.  Yes, the guy who blogs on bigdbahead.com and is writing this on the mysqlperformanceblog is saying you may not want to use a database.  Heresy, I know!  But many of us already accept storing data (at least temporarily) outside the DB.  Think memcached.

Almost everyone loves memcached: it’s simple, fast, and just works.  When your dataset exceeds your memory limitations or the database simply cannot keep up any more, this solution can really boost performance.  I know you’re thinking my simple key lookup should really benefit from memcached. So let’s try it!  I took the simple app I created that reads two rows and updates one of them, and changed it to read from memcached if available, remove the key on update, and read from the DB only when required.  I tested with a memcached size of 1GB, 2GB, and 4GB.  For these tests I left InnoDB with a 256M buffer pool, or roughly 9% of the total data in memory.
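
The pattern itself is only a few lines; here is a minimal PHP sketch for illustration (the original benchmark used Perl’s Cache::Memcached, and the table and key names here are invented):

function getUser($pdo, $cache, $id) {
    $user = $cache->get("user:$id");
    if ($user !== false) {
        return $user;                           // cache hit
    }
    $stmt = $pdo->prepare('SELECT * FROM users WHERE id = ?');
    $stmt->execute(array($id));
    $user = $stmt->fetch(PDO::FETCH_ASSOC);
    $cache->set("user:$id", $user, 3600);       // populate on miss
    return $user;
}

function updateUser($pdo, $cache, $id, $data) {
    $stmt = $pdo->prepare('UPDATE users SET data = ? WHERE id = ?');
    $stmt->execute(array($data, $id));
    $cache->delete("user:$id");                 // remove on update; next read repopulates
}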

Let’s look at the 1GB setting:

Ensure you have enough memory for memcached

What, a performance regression?  But we threw more memory at it!!   How can that be!

Memcached is not a cure-all.  I have talked to many clients who say “we will just throw memcached at it”.   Sometimes an app will scream, other times it won’t… and yet others require lots and lots of memory allocated to memcached to be successful.    This application selects a random # between 1 and 2 million and looks up the result via that key.  It then uses data from that random row to look up a second piece of information via email address.  Because the entire dataset is about 4GB and only 1GB is in memcached, I keep pushing data out of memcached to make room for new records I am reading from the database. Remember, memcached needs repeatability to be helpful.   I am still getting a really solid # of hits in memcached, but the # of writes in MySQL coupled with the still large # of reads takes its toll.  Another place where I have seen this kill clients is in apps that do some sequential scanning and do not have enough memory for memcached.  For instance, if you have 1,000,000 rows of data, but enough memory to only store 500,000 rows… sequentially accessing this data will destroy the use of the cache:

get record 1, miss, read from disk, cache record 1
….
get record 500,001, miss, read from disk, expunge record 1, cache record 500,001
….
get record 1, miss, read from disk, expunge record 500,001, cache record 1

you keep overwriting the cache before you can use it.  So here the complexity of adding memcached hurts us, because the cache is not actually buying us anything.

Now bumping this up to 2GB actually makes the TPS jump around a lot, sometimes hitting 400 or 500 TPS and other times hitting as high as 1800 TPS.  My guess is the movement is caused by the random #’s being generated and simply the luck of the draw.

Finally, let’s look at when we have 4GB of memory allocated to memcached (the full dataset fits):

Transactions with and without Memcached

Here you can see that our “transactions” per second for this app increased almost 10X by using memcached.  The TPS I get here varies from 1100 TPS to just under 2000 TPS, with the average around 1400 TPS.   I think we would all be very happy if we could get a 10X performance boost from our application.

But wouldn’t it be great if we could get more?  I mean our reads are going pretty fast, but our writes leave a lot to be desired:

Read -vs- write times with memcached + mysql mixed

Over 17 ms to do an update.  Wouldn’t it be great to just eliminate all the updates as well?  What sort of throughput would we get?   I will show you in part 2, which will talk about performance in a 100% pure memcached environment. Part 3 will focus on these same benchmarks in Tokyo Tyrant.


Entry posted by matt

