Hours remain for Percona Live NYC Early Bird Discount

Just few short hours remain for Early Bird Discount for Percona Live, New York City. This is your last chance to grab the ticket at almost 50% discount, for $129 instead of $259. We have a Great Session Lineup and outstanding networking opportunities !


Sysbench with support of multi-tables workload

We just pushed to sysbench support for workload against multiple tables ( traditionally it used only single table).

It is available from launchpad source tree lp:sysbench .

This is set of LUA scripts for sysbench 0.5 ( it supports scripting), and it works following way:

– you should use --test=tests/db/oltp.lua to run OLTP test
i.e. prepare

./sysbench --test=tests/db/oltp.lua --oltp-tables-count=25 prepare


./sysbench --test=tests/db/oltp.lua --oltp-tables-count=25 --num-threads=5 run

oltp.lua should understand most options that available for regular sysbench –test=oltp

there are couple other scripts, like

to support different OLTP modes.

Most interesting: all scripts support --oltp-tables-count=N ( default 1), in this case N sbtest tables will be used.

Additional parameter --oltp-secondary is taken from Mark Callaghan’s sysbench tree , in this case
table sbtest will be created with KEY xid (ID) , instead of PRIMARY KEY (ID)

Now, as we have N tables, we may want to prepare them in parallel, to save some time for preparation
I implemented that in script parallel_prepare.lua

so instead of

./sysbench --test=tests/db/oltp.lua --oltp-tables-count=25 prepare

you may use

./sysbench --test=tests/db/parallel_prepare.lua --oltp-tables-count=64 --num-threads=8 run

( please note that oltp-tables-count should be multiplier of num-threads)


Percona Server 5.5.11-20.2 Stable Release

Percona Server 5.5.11-20.2 is now available for download, including an experimental build for MacOS. Released on April 28, 2011, it is the current stable release in the the 5.5 series.

New Features

  1. HandlerSocket, a NoSQL plugin for MySQL, has been updated to the latest stable version as April 11th, 2011.
  2. InnoDB fast index creation now works with mysqldump, ALTER TABLE and OPTIMIZE TABLE. (Alexey Kopytov)

Variable Changes

  1. Variable innodb_extra_rsegments was removed because the equivalent, ‘innodb_rollback_segments’, has been implemented in MySQL 5.5. (Yasufumi Kinoshita)

Bug Fixes

  1. Bug #757749 – Using ALTER TABLE to convert an InnoDB table to a MEMORY table could fail due to a bug in innodb_expand_fast_index_creation.patch. (Alexey Kopytov)
  2. Bug #764395 – InnoDB crashed with an assertion failure when receiving a signal on pwrite(). The problem was that InnoDB I/O code was not fully conforming to the standard on POSIX systems. Calls to fsync(), pread(), and pwrite() can be interrupted by a signal. In such cases, InnoDB would crash with an assertion failure, rather than just restarting the interrupted call. (Alexey Kopytov)
  3. Bug #766236 – A crash was occurring in some cases when innodb_lazy_drop_table was enabled with very large buffer pools. (Yasufumi Kinoshita)
  4. Bug #733317 – SYS_STATS internal table of XtraDB has been expanded for supporting ‘innodb_stats_method’ from InnoDB-plugin. (Yasufumi Kinoshita)

Please report any bugs found at Bugs in Percona Server.

For more information, please see the following links:


Scalr and MNPP

Good news for all Percona Software users out there. There are more ways how you can get access to Percona Software. If you’re using Scalr (See also Scalr Project on Google Code) you might be interested in Percona Server roles support Scalr team has announced recently. If you’re Mac user and looking for complete development environment for PHP and MySQL you can check out MNPP which also includes lightweight Ngnix web server instead of Apache.

Now about any Open Source projects including or based on Percona Server or Xtrabackup ? Let us know !


The four fundamental performance metrics

There are many ways to slice and aggregate metrics of activity on a system such as MySQL. In the best case, we want to know everything about the system’s activity: we want to know how many things happened, how big they were, and how long they took. We want to know precisely when they happened. We want to know resource consumption, instantaneous status snapshots, and wait analysis and graphs. These things can be expensive to measure, and even more importantly, they can require a lot of work to analyze. And sometimes the analysis is only valid when very stringent conditions are satisfied. What metrics can we gather that provide the most value for the effort?

There’s a relationship known as Little’s Law (sometimes erroneously referred to as the fundamentals of queueing theory), that lays out some simple properties between various metrics. One of the properties of Little’s Law is that it holds regardless of inter-arrival-time distributions and other complex things. And you can build upon it to create surprisingly powerful models of the system.

Let’s begin with a query. It arrives, executes for a bit, and then completes.

Then some time elapses, and another query arrives, executes, and completes.

To add the minimal amount of extra complexity, let’s say that another query arrived while query 2 was executing, and completed after query 2 completed. Now we have the following picture, fully annotated. The arrivals and completions are denoted by “a” and “c”.

If we start at the left end of the Time arrow, and call it the beginning of the observation period, and observe the system until the right end of the arrow, we have our observation interval. If we travel from left to right along the arrow, we can count the “residence time” as the time during which at least one query is running. This is the period from 1a to 1c, plus the period from 2a to 3c. I am in the habit of calling this the “busy time” for historical reasons, too.

It’s easy to compute this by just creating a counter, a running total duration, and a last-event timestamp. Every time there is an arrival or completion, look at the counter of currently executing queries. If it is greater than 0, increment the duration by the time elapsed since the last arrival or completion. Then increment the counter of currently executing queries if the current event is an arrival, and decrement it if it’s a completion. Finally, set the last-event timestamp to the current time.

Now, if we add another duration variable, we can get what I call the “weighted time.” (I have not heard a better phrase for this.) This is similar to the residence time, but instead of just adding the time elapsed since the last event, we multiply the time elapsed by the number of currently executing queries and add it. This is effectively the “area under the curve” — the blue-colored regions in the picture above. (Each query has a height of 1.)

Finally, we can simply count the number of arrivals or completions. It doesn’t matter which; Little’s Law is based on a long-term average, where the two are equal.

All together, we have the following four fundamental metrics:

  1. The observation interval.
  2. The number of queries in the interval.
  3. The total time during which queries resided in the system — the residence time.
  4. The total execution time of all queries — the “weighted time.”

We can maintain these variables in the system, and sample them at will by simply looking at the clock and writing down the three variables at the start and end of whatever observation period we choose.

What is this good for? We can use these together with Little’s Law to derive these additional four long-term average metrics:

  1. Throughput: divide the number of queries in the observation interval by the length of the interval.
  2. Execution time: divide the weighted time by the number of queries in the interval.
  3. Concurrency: divide the weighted time by the length of the interval.
  4. Utilization: divide the residence time by the length of the interval.

This is extremely powerful. These are the key metrics we need to model scalability with the Universal Scalability Law, perform queueing analysis, and do capacity planning. And we obtained all of this without needing to analyze anything complicated at all. This is just addition, subtraction, multiplication, and division of some constantly increasing counters and durations!

There’s more. We derived our four long-term average metrics from the four fundamental metrics, and we derived those in turn from the arrivals and completions. In a future article, I’ll show you what else we can learn by observing arrivals and completions. And I’ll present on this at Percona Live, where I’ll show you how to tie it all together quickly to answer real-world questions about practically any system, whether it’s instrumented with the variables I mentioned or not, with shockingly little effort.

PS: I’ve edited to reflect Dr. Gunther’s comments, which I solicited. Also, readers should know that these are the four metrics that I think of as fundamental — I’m not referring to any established standard. PPS: updated to clarify “counter” versus “duration” as per Mr. Wilson’s comments.


On Tour

I have just finalized the tour schedule for my “SQL Performance Basics Workshop for Developers“.
The most important tips and tricks every developer must know—in just 3 hours. It takes place in your office because every developer—the entire team—should take part.

The tour covers Austria, Germany and Switzerland. Just let me know if there isn’t a scheduled date for your area. I’m sure we can arrange something. Small teams can also take the virtual workshop.

Hope to see you soon.

Written by in: Zend Developer |

Performance Schema overhead

As continuation of my CPU benchmarks it is interesting to see what is scalability limitation in MySQL 5.6.2, and I am going to check that using PERFORMANCE SCHEMA, but before that let’s estimate what is potential overhead of using PERFORMANCE SCHEMA. So I am going to run the same benchmarks (sysbench read-only and read-write) as in previous post with different performance schema options and compare results.

I am going to use Cisco UCS C250
with next settings:

  • PERFORMANCE SCHMEA enabled, with all consumers ON (PS on)
  • PERFORMANCE SCHMEA enabled, but only global_instrumentation consumer enabled. It allows to gather table and index access statistics (PS only global)
  • PERFORMANCE SCHMEA enabled, but all consumers OFF (PS all off)

The full results with details are not our Benchmark Wiki

There is graph for read-only case:

and for read-write:

To have some numeric impression, let’s see ration of result with PS to result without PS

There is table with ratios for read-only case:

Threads PS on PS only global PS all off
1 1.11 1.10 1.13
2 1.13 1.08 1.04
4 1.15 1.07 1.05
8 1.18 1.07 1.03
24 1.21 1.08 1.06
32 1.25 1.10 1.08
48 1.25 1.10 1.08
64 1.23 1.10 1.06
128 1.23 1.10 1.04
256 1.21 1.08 1.04
512 1.18 1.07 1.01
1024 1.17 1.01 0.96

There is table with ratios for read-write case:

Threads PS on PS only global PS all off
1 1.07 0.94 0.98
2 1.11 1.00 1.06
4 1.15 1.04 1.08
8 1.19 1.02 1.08
24 1.17 1.00 1.07
32 1.18 1.07 1.06
48 1.18 1.09 1.13
64 1.17 1.11 1.11
128 1.18 1.09 1.12
256 1.14 1.04 0.99
512 1.17 1.02 1.04
1024 1.21 1.06 1.07

So this allows us to make next summary:

In read-only case, Performance Schema with all consumers gives about 25% overhead,
with “global instrumentation” only –10%, and with all disabled consumers – about 8%.

For read-write case, Performance Schema with all consumers gives about 19% overhead,
with “global instrumentation” only –11%, and it is about the same with all disabled consumers.

Is that big or small ? I leave it for you to decide, I think it may be acceptable in some cases and not in some others.
I wish only that Performance Schema with all disabled consumers gives less overhead, 8-11% seems significant.
If nothing helps I would like to be able to fully disable / enable performance schema in run-time, not at start-time.

As I understand dtrace / systemtap probes can be disabled / enabled at run-time, and when they disabled – it is almost 0% overhead, why Performance Schema can’t do the same ?

(Disclaimer: This benchmark is sponsored by Well Know Social Network, and they are generous to make it public)


MySQL at Facebook, present and future

Harrison Fisk, from the Facebook database team, will be in New York City delivering the closing keynote at Percona Live next month. He’ll be talking about the present status and future plans for MySQL inside Facebook’s environment. As you probably know, Facebook is one of the largest users of MySQL, with their own set of patches, and the Facebook team is very open about how they use MySQL and what their challenges are. This is a great chance to hear from Harrison in person and ask questions — and we have a great lineup of other speakers, too. Check out the list of sessions for more information.

The early-bird discount ticket pricing ends in a few days, so go ahead and register while it’s cheap! (PS: use the Twitter hashtag #perconalive in your Tweets.)


Percona Server and XtraBackup Weekly News, April 25th

Time for our weekly review of the news! The big news this week is that Percona XtraBackup is ready to try out on Windows. Otherwise, there’s just more steady progress.

One quick note, sort of a “to whom it may concern,” is that the proper trademarked names of our products include “Percona Server”, “Percona XtraDB”, and “Percona XtraBackup”. “XtraDB” and “XtraBackup” are also Percona trademarks, but the proper names of the products include “Percona” at the beginning. The word “xtrabackup” alone and lowercased is only proper when referring to the xtrabackup compiled C binary, e.g. in how-to documentation in a context that includes the whole product name. We are happy to grant written permission for appropriate use of the trademarks. We’re working on publishing an official trademark policy on our website.

  • We continue progress towards our Percona Server 5.5 release.
  • Percona XtraBackup is ready to try on Windows.
  • We’re working on porting the Percona XtraBackup test suite to Windows.
  • Our Debian packages now link against readline, not editline.
  • We did a lot of work on testing, merging forward to new versions, and bug fixing last week.

Intel Nehalem vs AMD Opteron shootout in sysbench workload

Having two big boxes in our lab, one based Intel Nehalem (Cisco UCS C250) and second on AMD Opteron (Dell PowerEdge R815), I decided to run some simple sysbench benchmark to compare how both CPUs perform and what kind of scalability we can expect.

It is hard to make apples to apples comparison, but I think it is still interesting.
Cisco UCS C250 has total 12 cores / 24 threads of Intel Nehalem X5670, and Dell PowerEdge R815 has 48 cores of AMD Opteron 6168.
One of biggest difference is that Cisco is running CentOS 5.5 and Dell R815 is based on RedHat EL 6. I will probably will need to rerun benchmark after upgrade Cisco to CentOS 6 ( will be it even released or it is dead ? ).
For benchmark I took sysbench oltp ( both read-only and read-write) and MySQL 5.6.2 as it seems most scalable system at the moment. All data fits into memory, so it is full CPU bound benchmark. (more…)

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com