Sep 15, 2011

Percona Live MySQL Conference and Expo opens Call for Papers

The Call for Papers for the Percona Live MySQL Conference and Expo is finally open! We had a little delay while we set up the software, so thank you for your patience. To make up for the short delay, we are extending the close date to December 5th, 2011. Simply go to our website and click the Log In tab at the top of the page. Please help us make this conference outstanding by submitting your talks!

Sep 15, 2011

Using MySQL as a Queue, and Free Percona Live Tickets

This week’s TGIF give-away contest is a day early, and it’s a guest post over on the Engine Yard blog: 5 subtle ways you’re using MySQL as a queue, and why it’ll bite you. Go there to read the full post, and watch @engineyard’s Twitter feed for the chance to enter the contest for free Percona Live London tickets!

Excerpt from the blog post:

Many people realize the difficulties in building a good job queue or batch processing system, and try not to create one inside MySQL. Although the job queue is a great design pattern from the developer’s point of view, they know it’s often hard to implement well in a relational database. However, experience shows me that job queues sneak up in unexpected ways, even if you’re a seasoned developer. » read more

Sep 14, 2011

SELECT UNION Results INTO OUTFILE

Here’s a quick tip I know some of us have overlooked at some point. When doing SELECT … UNION SELECT, where do you put the INTO OUTFILE clause? On the first SELECT, on the last, or somewhere else? The manual has the answer here, to quote:

Only the last SELECT statement can use INTO OUTFILE. (However, the entire UNION result is written to the file.)

However, I still see queries going to further lengths for the same effect, for example, putting the UNIONs into a subquery and then doing the SELECT INTO OUTFILE from that. Using the employees.employees table:

mysql [localhost] {msandbox} (employees) > EXPLAIN SELECT * INTO OUTFILE '/home/viiin/sandboxes/msb_5_1_58p/outfile.txt'
    -> FROM (
    -> SELECT * FROM employees
    -> WHERE hire_date BETWEEN '1990-01-01' AND '1990-01-02'
    -> UNION ALL
    -> SELECT * FROM employees
    -> WHERE hire_date BETWEEN '1990-01-05' AND '1990-01-06'
    -> UNION ALL
    -> SELECT * FROM employees
    -> WHERE hire_date BETWEEN '1990-01-03' AND '1990-01-04'
    -> ) t;
+------+--------------+------------+-------+---------------+-----------+---------+------+------+-------------+
| id   | select_type  | table      | type  | possible_keys | key       | key_len | ref  | rows | Extra       |
+------+--------------+------------+-------+---------------+-----------+---------+------+------+-------------+
|    1 | PRIMARY      | <derived2> | ALL   | NULL          | NULL      | NULL    | NULL |  421 |             |
|    2 | DERIVED      | employees  | range | hire_date     | hire_date | 3       | NULL |  119 | Using where |
|    3 | UNION        | employees  | range | hire_date     | hire_date | 3       | NULL |  123 | Using where |
|    4 | UNION        | employees  | range | hire_date     | hire_date | 3       | NULL |  179 | Using where |
| NULL | UNION RESULT |            | ALL   | NULL          | NULL      | NULL    | NULL | NULL |             |
+------+--------------+------------+-------+---------------+-----------+---------+------+------+-------------+
5 rows in set (0.00 sec)

Compared to simply using UNION with INTO OUTFILE on the last SELECT:

mysql [localhost] {msandbox} (employees) > EXPLAIN SELECT *
    -> FROM employees
    -> WHERE hire_date BETWEEN '1990-01-01' AND '1990-01-02'
    -> UNION ALL
    -> SELECT *
    -> FROM employees
    -> WHERE hire_date BETWEEN '1990-01-05' AND '1990-01-06'
    -> UNION ALL
    -> SELECT * INTO OUTFILE '/home/viiin/sandboxes/msb_5_1_58p/outfile.txt'
    -> FROM employees
    -> WHERE hire_date BETWEEN '1990-01-03' AND '1990-01-04';
+------+--------------+-----------+-------+---------------+-----------+---------+------+------+-------------+
| id   | select_type  | table     | type  | possible_keys | key       | key_len | ref  | rows | Extra       |
+------+--------------+-----------+-------+---------------+-----------+---------+------+------+-------------+
|    1 | PRIMARY      | employees | range | hire_date     | hire_date | 3       | NULL |  119 | Using where |
|    2 | UNION        | employees | range | hire_date     | hire_date | 3       | NULL |  123 | Using where |
|    3 | UNION        | employees | range | hire_date     | hire_date | 3       | NULL |  179 | Using where |
| NULL | UNION RESULT |           | ALL   | NULL          | NULL      | NULL    | NULL | NULL |             |
+------+--------------+-----------+-------+---------------+-----------+---------+------+------+-------------+
4 rows in set (0.00 sec)

You will notice that the second query does not need the additional <derived2> table holding all the results from the UNION queries, which makes it more efficient than the first. Of course, you can still break all these UNION queries down and SELECT them INTO OUTFILE separately, then concatenate the resulting files offline if you think the number of resulting rows would be too big to bear; otherwise SELECT … UNION SELECT INTO OUTFILE should be convenient.
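
If you do go the separate-files route, it might look roughly like the sketch below; the numbered file names are made up for illustration, and each target file must not already exist, since INTO OUTFILE refuses to overwrite existing files:

SELECT * INTO OUTFILE '/home/viiin/sandboxes/msb_5_1_58p/outfile-1.txt'
FROM employees WHERE hire_date BETWEEN '1990-01-01' AND '1990-01-02';
SELECT * INTO OUTFILE '/home/viiin/sandboxes/msb_5_1_58p/outfile-2.txt'
FROM employees WHERE hire_date BETWEEN '1990-01-05' AND '1990-01-06';
SELECT * INTO OUTFILE '/home/viiin/sandboxes/msb_5_1_58p/outfile-3.txt'
FROM employees WHERE hire_date BETWEEN '1990-01-03' AND '1990-01-04';

The three files can then be concatenated outside the server.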

Lastly, to demonstrate the point of this post:

mysql [localhost] {msandbox} (employees) > SELECT *
    -> FROM employees
    -> WHERE hire_date BETWEEN '1990-01-01' AND '1990-01-02'
    -> UNION ALL
    -> SELECT *
    -> FROM employees
    -> WHERE hire_date BETWEEN '1990-01-05' AND '1990-01-06'
    -> UNION ALL
    -> SELECT * INTO OUTFILE '/home/viiin/sandboxes/msb_5_1_58p/outfile.txt'
    -> FROM employees
    -> WHERE hire_date BETWEEN '1990-01-03' AND '1990-01-04';
Query OK, 421 rows affected (0.00 sec)

Sep 09, 2011

MySQL data via a NoSQL solution, Free ticket to Percona Live London

The much-anticipated ability to access MySQL data via a NoSQL solution has been realized. Using HandlerSocket, significant performance gains are possible for certain workloads. Sound like something you are interested in? Join us at Percona Live London to hear Ryan Lowe, Percona’s Director of American Consulting, speak on this subject. In Ryan’s session he will answer questions like:

  • What is HandlerSocket?
  • Where does HandlerSocket fit in my application stack?
  • Why would I want to use HandlerSocket?
  • How do I use HandlerSocket?

He’ll walk through the relevant configuration & tuning options, and by the end of the presentation attendees will be able to identify candidate use cases for HandlerSocket within their own applications.
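
As a small taste of what the session covers: Percona Server bundles HandlerSocket, and getting it running amounts to pointing a couple of my.cnf settings at the ports it should listen on and then loading the plugin. The sketch below uses the commonly documented defaults (9998 for reads, 9999 for writes) and is no substitute for Ryan’s walkthrough:

-- Typical my.cnf settings under [mysqld], applied before restarting the server:
--   loose_handlersocket_port    = 9998   (read requests)
--   loose_handlersocket_port_wr = 9999   (write requests)

-- Then load the bundled plugin and check that its worker threads appear:
INSTALL PLUGIN handlersocket SONAME 'handlersocket.so';
SHOW PROCESSLIST;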

For a chance to win a free ticket to Percona Live London, where you can hear Ryan and many others speak on a range of MySQL-related topics, simply watch our @Percona Twitter stream and retweet the contest tweet! One lucky winner will be picked every Friday up until the conference. Enjoy your weekend and remember to enter our TGIF contest for a chance to win! If you don’t win this time, try again next Friday, or register and get the early-bird discount (but don’t wait too long: it expires September 18th).

Sep 07, 2011

Percona Toolkit 1.0.1 released

Daniel and I and the Percona release engineering team are proud to announce version 1.0.1 of Percona Toolkit. This toolkit is Percona’s suite of DBA and sysadmin productivity tools for MySQL. It is the successor to the popular Maatkit and Aspersa toolkits. Percona Toolkit is available for immediate download at percona.com/downloads/percona-toolkit/.

Helpful links:

Sep 06, 2011

Dynamic row format for MEMORY tables

The latest Percona Server release has one new feature: now MEMORY tables can have BLOB and TEXT columns, and VARCHAR columns will not waste space due to implicit extension to CHAR.

The lack of these features in the MEMORY storage engine (also known as HEAP) in MySQL has been one of the long-standing limitations caused by the fact that variable-length records were not supported by the engine. This is true for all MySQL versions, including the latest 5.6 development tree as of last week.

There have been efforts in the community to remove this limitation. In 2008 Igor Chernyshev of eBay implemented true-VARCHAR support for MEMORY tables for MySQL 5.0. Although this patch did not add the BLOB and TEXT support, it refactored the code to make it easier to add it. The code was proposed for inclusion into upstream sources but unfortunately never made it there, probably due to some limitations discussed below.

We at Percona took the eBay patch, ported it to 5.5, tested it extensively and fixed all the bugs we found.

There are a few points to keep in mind when using this new feature (and these points are probably the reason why it was not included in MySQL in 2008):

  1. The key columns must come before any non-key variable-length columns. For performance reasons, the key columns are stored in a fixed-length block at the beginning of the row. If any variable-length column sat between them, it would have to be stored in fixed-length format, which would either waste space (VARCHAR) or be outright impossible (BLOB). See the sketch after this list.
  2. Indexes on BLOB columns are not supported.
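
To make limitation 1 concrete, here is a minimal sketch, assuming a Percona Server build with this feature; the table and column names are invented for illustration:

-- Works with the dynamic row format: the indexed column comes first,
-- the variable-length non-key columns follow it.
CREATE TABLE mem_ok (
  id INT NOT NULL,
  note VARCHAR(2048),
  payload BLOB,
  PRIMARY KEY (id)
) ENGINE=MEMORY ROW_FORMAT=DYNAMIC;

-- Runs into limitation 1: a variable-length column placed before the key
-- column cannot be stored in the variable-length part of the row.
-- CREATE TABLE mem_bad (
--   note VARCHAR(2048),
--   id INT NOT NULL,
--   PRIMARY KEY (id)
-- ) ENGINE=MEMORY ROW_FORMAT=DYNAMIC;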

While we do not consider these limitations to be show-stopping, in the future we will work to remove them.

The full usage notes, including a discussion of when the dynamic row format is chosen over the fixed one, how to request it explicitly, and how to set the data block size at CREATE TABLE time, are in the documentation.

What interests us, of course, is the MEMORY engine’s performance with this feature added. For that we ran a few experiments, described next. These tests are not exhaustive, and if you have a performance-critical application depending on MEMORY tables, they are no replacement for application-specific tests.

We start by checking whether there is any performance regression when variable-length records are not used, i.e. whether you pay for what you don’t use. This is a valid question because the MEMORY implementation has changed significantly with this patch, for fixed-length records too.

Default sysbench 0.4.12 schema, one client thread:
Fixed-row format (FRF, Percona Server without this patch applied):
Insertions in 900 seconds: 25962452.
VmPeak: 6267692 kB

Dynamic-row format (DRF):
Insertions in 900 seconds: 25949631 (0.05% slower).
VmPeak: 6267700 kB

Two client threads:
FRF:
Insertions in 900 seconds: 42946880.
VmPeak: 10069040 kB

DRF:
Insertions in 900 seconds: 43325887 (0.88% faster).
VmPeak: 10069048 kB

Here we can see that the performance and memory usage are about the same for both row formats. (Strictly speaking, the VmPeak values are not directly comparable due to the slightly different number of rows inserted.)

In the second test we check whether the dynamic row format delivers the expected memory savings for VARCHAR fields against the fixed row format. For that we modify the default sysbench schema to the following:

CREATE TABLE sbtest (id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
c VARCHAR(4096) NOT NULL DEFAULT '', PRIMARY KEY (id))
ENGINE=MEMORY ROW_FORMAT=DYNAMIC DEFAULT CHARSET=latin1;

We also modify the sysbench non-transactional INSERT test to insert strings of random length between 0 and 4096, uniformly distributed, that is, 2048 characters on average.
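
In SQL terms, each insert is roughly equivalent to the statement below (the real change lives in the sysbench test driver, so this is only an illustration):

INSERT INTO sbtest (c) VALUES (REPEAT('x', FLOOR(RAND() * 4097)));

FLOOR(RAND() * 4097) yields a length uniformly distributed between 0 and 4096.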

The results of this test (two client threads):
FRF:
Insertions in 900 seconds: 9922468
VmPeak: 40871220 kB

DRF:
Insertions in 900 seconds: 9831946 (0.92% less)
VmPeak: 22520888 kB (45% reduction)

Thus the test confirms that the dynamic row format delivers the expected memory savings.

One quite important feature is missing from this release, however: internal temporary tables still fall back to MyISAM in cases where MEMORY tables with the dynamic row format could be used instead, namely for result sets containing large VARCHAR, BLOB, or TEXT columns. Unfortunately, enabling this is not trivial, but we are working on it and expect to deliver it in a future release.
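
In the meantime, a quick way to gauge how often implicit temporary tables are spilling to disk (and therefore to MyISAM) is to compare the standard MySQL counters below; a high ratio of disk temporary tables to total temporary tables often points at result sets with large VARCHAR, BLOB, or TEXT columns:

SHOW GLOBAL STATUS LIKE 'Created_tmp%';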

Sep 06, 2011

Continuing with Data-Clusters

Today’s installment covers the first two parts of Chapter 5, “Clustering Data”. It explains the term “cluster” and demonstrates the impact of clustering data with a simple example.

This will be the basis for the next installment, which explains why adding another WHERE clause may slow down statement execution. Not just a little bit, but by factors. That is actually the question with the highest failure rate in the 3-minute test.

Sep 02, 2011

Understand InnoDB spin waits, win a Percona Live ticket

It’s Friday again (so soon!) and time for our TGIF contest, to give away a free ticket to Percona Live London. Before we do that, though, just what in the world does this output from SHOW INNODB STATUS mean?

Mutex spin waits 5870888, rounds 19812448, OS waits 375285

To understand this text, you have to understand how InnoDB handles mutexes. It tries a two-step approach to getting a lock on a mutex. First, a thread tries to lock the mutex. If the mutex is locked by someone else, then the thread does a so-called spin wait. This means that it repeatedly checks “are you free? are you free? are you free?” in a loop. If this doesn’t work after a while, it gives up and goes to sleep until the mutex is free. I’m simplifying this a lot, perhaps too much, but it’s a topic that would take a long blog post to explain correctly in detail. The related source files are in the sync/ InnoDB source code directory if you want to learn more.

The reason for the two-step approach is that going to sleep and waking up again is slow. That’s why it spends a little time doing a spin-wait first. It’s actually cheaper to keep checking for a little bit than to go to sleep right away.

So, back to the output from SHOW INNODB STATUS. Here’s what it means.

  • Mutex spin waits 5870888 is the number of times a thread tried to get a mutex and it wasn’t available, so it waited in a spin-wait.
  • rounds 19812448 is the number of times threads looped in the spin-wait cycle, checking the mutex.
  • OS waits 375285 is the number of times the thread gave up spin-waiting and went to sleep instead.

So what we have here is really a kind of “mutex miss rate counter.” We don’t have a counter for the number of times the mutex was free and could be grabbed right away, but we have the number of times it wasn’t free, and we have the number of times it wasn’t free even after spinning a little while.
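
For the curious, the quoted line appears in the SEMAPHORES section of the full status output, which you can view on your own server with:

SHOW ENGINE INNODB STATUS\G

Turning the counters above into rough ratios: each contended acquisition spun about 19812448 / 5870888 ≈ 3.4 rounds, and about 375285 / 5870888 ≈ 6% of contended acquisitions still ended up sleeping in an OS wait.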

And now that you know a little bit more about InnoDB locking strategies and how they’re exposed in the status output, wouldn’t you like to come hear Peter Zaitsev talk about InnoDB architecture and performance optimization in London on October 24th? Well, you can, for free! Watch our @Percona Twitter stream and retweet the contest to win a free ticket! We’ll pick a random retweet and give away a free ticket each week. If you don’t win this time, try next Friday or register and get the early-bird discount (but don’t wait too long: it expires September 18th). Our tutorial schedule is 100% complete at this time, and the session schedule already has two full tracks, with the full schedule to be announced very shortly now that the CfP has ended. Don’t miss your opportunity to learn about MySQL in London from world-famous experts!
