Mar
30
2010
--

Can we get faster expression handling in MySQL?

Andrew from Sphinx continues to work on improving SQL (or SphinxQL) support, and he has now published benchmarks comparing arithmetic expression handling in Sphinx to that in MySQL. The result? Sphinx scored 3x to 20x faster. Andrew goes on to explain that the results are not 100% comparable (as we can see in the table, the results even differ) and that some of the performance can be attributed to Sphinx using different typing for expressions. Though I'm wondering how much of the performance difference that really explains; I doubt it is 20x. Andrew also explains there are further optimizations possible for Sphinx, such as JIT compilation and expression optimization, which he claims can increase the lead even further.
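If you want a rough feel for the raw cost of expression evaluation on your own server, MySQL's built-in BENCHMARK() function exercises the expression evaluator with no table access involved. A minimal sketch (the expression itself is just an arbitrary example):

CODE:

-- Evaluate an arithmetic expression 10,000,000 times; BENCHMARK() always
-- returns 0, and the mysql client prints the elapsed wall-clock time.
SELECT BENCHMARK(10000000, 2 * 3.14 * 10 + 7 / 2);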

Andrew is kind enough to provide explanations and benchmark results, though this is far from the first time I have heard of this problem. Many other companies looking to optimize MySQL, especially in the analytics space where expressions must be computed over hundreds of millions of rows, have expressed their concerns with this part of MySQL execution. I'm wondering if we're going to see any improvements in this space from Oracle, MariaDB or Drizzle. I think Drizzle has an advantage here: having simplified MySQL functionality a lot, they can probably also simplify expression handling to make it faster.


Entry posted by peter |
6 comments

Mar
26
2010
--

How well do your tables fit in buffer pool

In XtraDB we have the table INNODB_BUFFER_POOL_PAGES_INDEX, which shows which pages belong to which indexes in which tables. Using this information together with the standard TABLES table, we can see how well different tables fit in the buffer pool.

SQL:

mysql> SELECT d.*,round(100*cnt*16384/(data_length+index_length),2) fit FROM (SELECT schema_name,table_name,count(*) cnt,sum(dirty),sum(hashed) FROM INNODB_BUFFER_POOL_PAGES_INDEX GROUP BY schema_name,table_name ORDER BY cnt DESC LIMIT 20) d JOIN TABLES ON (TABLES.table_schema=d.schema_name AND TABLES.table_name=d.table_name);
+-------------+---------------------+---------+------------+-------------+--------+
| schema_name | table_name          | cnt     | sum(dirty) | sum(hashed) | fit    |
+-------------+---------------------+---------+------------+-------------+--------+
| db          | table1              | 1699133 |      13296 |      385841 |  87.49 |
| db          | table2              | 1173272 |      17399 |      110999 |   8.42 |
| db          | table3              |  916641 |       7849 |      153169 |   4.77 |
| db          | table4              |   86999 |       1555 |       75554 |  87.42 |
| db          | table5              |   32701 |       7997 |       30082 |  91.61 |
| db          | table6              |   31990 |       4495 |       25681 | 102.97 |
| db          | table7              |       1 |          0 |           0 | 100.00 |
+-------------+---------------------+---------+------------+-------------+--------+
7 rows in set (26.45 sec)

You can also see that in one of the cases the value shown is a bit over 100% – I am not sure where that comes from, but more pages are reported to belong to the table in the buffer pool than exist on disk. Still, it seems to work well enough for estimation purposes.
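The same table can answer related questions. For example, here is a minimal sketch, using only the columns from the query above, that shows which tables hold the most dirty pages in the buffer pool:

CODE:

SELECT schema_name, table_name, count(*) cnt,
       sum(dirty) dirty_pages,
       round(100*sum(dirty)/count(*),2) pct_dirty
FROM INNODB_BUFFER_POOL_PAGES_INDEX
GROUP BY schema_name, table_name
ORDER BY dirty_pages DESC LIMIT 20;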


Entry posted by peter |
3 comments


Mar
24
2010
--

RAID throughput on FusionIO

Along with the maximum possible fsync/sec, it is interesting to see how different software RAID modes affect throughput on FusionIO cards.

The short conclusion: the RAID10 modes really disappointed me. The detailed numbers follow.

To get the numbers I ran the sysbench fileio test with a 16KB page size, random reads and writes, 1 and 16 threads, in O_DIRECT mode.

The FusionIO cards are the same as in the previous experiment, and I am running XFS with the nobarrier mount option.

The OS is CentOS 5.3 with the 2.6.18-128.1.10.el5 kernel.

For RAID modes I use:

  • single card (for a baseline)
  • RAID0 over 2 FusionIO cards
  • RAID1 over 2 FusionIO cards
  • RAID1 over 2 RAID0 partitions (4 cards in total)
  • RAID0 over 2 RAID1 partitions (4 cards in total)
  • special RAID10 mode with n2 layout

The last mode can be created as:

mdadm --create --verbose /dev/md0 --level=10 --layout=n2 --raid-devices=4 --chunk=64 /dev/fioa /dev/fiob /dev/fioc /dev/fiod

All modes use a 64KB chunk size (different chunk sizes are an interesting question in their own right).

There is a graph for the 16-thread runs, and the raw results are below.

As expected, RAID1 over 2 disks shows a hit on write throughput compared to a single disk,
but the RAID10 modes over 4 disks surprised me, showing almost 2x drops.

Only in the RAID10 n2 mode do random reads skyrocket, while writes are equal to a single disk.

This makes me question whether RAID1 mode is really usable, and how it performs
on regular hard drives or SSDs.

The performance drop in RAID settings is unexpected. I am working with Fusion-io engineers to figure out the issue.

The next experiment I am going to look into is different page sizes.

Raw results (in requests/sec; more is better):

configuration             | read/1   | read/16  | write/1  | write/16
--------------------------+----------+----------+----------+----------
single disk               | 12765.49 | 31604.86 | 14357.65 | 32447.07
raid0, 2 disks            | 12046.12 | 57410.58 | 12993.91 | 43023.12
raid1, 2 disks            | 11484.17 | 51084.02 |  9821.12 | 15220.57
raid1 over raid0, 4 disks | 10227.13 | 61392.25 |  7395.75 | 13536.86
raid0 over raid1, 4 disks | 10810.08 | 66316.29 |  8830.49 | 18687.97
raid10 n2                 | 11612.89 | 99170.51 | 10634.62 | 31038.50

Script for reference:

CODE:

#!/bin/sh
set -u
set -x
set -e

for size in 50G; do
   for mode in rndrd rndwr; do
   #for mode in rndwr; do
   #for blksize in 512 4096 8192 16384 32768 65536 ; do
   for blksize in 16384 ; do
      sysbench --test=fileio --file-num=64 --file-total-size=$size prepare
      #for threads in 1 4 8; do
      for threads in 1 16 ; do
         echo "====== testing $blksize in $threads threads"
         echo PARAMS $size $mode $threads $blksize > sysbench-size-$size-mode-$mode-threads-$threads-blksz-$blksize
         for i in 1 2 3 ; do
           sysbench --test=fileio --file-total-size=$size --file-test-mode=$mode \
            --max-time=180 --max-requests=100000000 --num-threads=$threads --init-rng=on \
            --file-num=64 --file-extra-flags=direct --file-fsync-freq=0 --file-block-size=$blksize run \
            | tee -a sysbench-size-$size-mode-$mode-threads-$threads-blksz-$blksize 2>&1
         done
      done
      sysbench --test=fileio --file-total-size=$size cleanup
   done
   done
done

Entry posted by Vadim |
5 comments


Mar
24
2010
--

xtrabackup-1.1

Dear Community,

It is time to announce the next version of the backup software from Percona – XtraBackup 1.1.

The list of changes in version 1.1 includes:
Changelog:

  • XtraBackup is built on the base of MySQL 5.1.44 with InnoDB plugin 1.0.6
  • Added --host option
  • tar4ibd can handle files over 64GB
  • tar4ibd is the default method for streaming, even when tar is specified
  • the binary supports compressed tables and the Barracuda format

Fixed bugs:

The binary packages for RHEL 4 and 5, Debian, FreeBSD, Windows and Mac OS, as well as the source code of XtraBackup, are available on http://www.percona.com/percona-builds/XtraBackup/XtraBackup-1.1/.

Debian and RPM packages are available in the Percona repository.

The project lives on Launchpad: https://launchpad.net/percona-xtrabackup and you can report bugs to the Launchpad bug system:
https://launchpad.net/percona-xtrabackup/+filebug. The documentation is available on our Wiki.

For general questions use our Percona-discussions group, and for development questions the Percona-dev group.

For support, commercial and sponsorship inquiries contact Percona.


Entry posted by Aleksandr Kuzminsky |
5 comments


Mar
23
2010
--

fsyncs on software raid on FusionIO

As soon as we got a couple of FusionIO cards, the question arose of how to join them into a single space for the database. FusionIO does not provide any mirroring/striping solution and relies entirely on OS tools there.

So for Linux we have software RAID and LVM. I tried to follow up on my post
How many fsync / sec FusionIO can handle, and check what overhead we can expect when using additional layers over the FusionIO card.

The card I used is a Fusion-io ioDrive Duo 320GB; physically it is two cards on a single board, and it is visible as two cards to the OS.

For some reason I was not able to set up LVM on the cards, so I finished tests only for software RAID0 and RAID1.

I used the XFS filesystem mounted with the "-o nobarrier" option, and I ran the test I used in the previous post on the following configurations:

  • Single card
  • RAID0 over two cards
  • RAID1 over two cards

Here is what I got:

  • Single card: 14050.59 req/sec
  • RAID0: 13039.00 req/sec
  • RAID1: 324.71 req/sec

For a single card I got much better results than in my previous test; presumably
the Duo card has better characteristics.

RAID0 shows some overhead, about 8%, but it is acceptable.

And something is terribly wrong with RAID1: I am getting a 40x drop in the number of fsyncs/sec.
So it seems we can't put InnoDB transactional log files with innodb_flush_log_at_trx_commit=1 on software RAID1 over FusionIO.
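If you are stuck with such a setup, the usual trade-off (standard InnoDB behavior, nothing specific to FusionIO) is to relax the flush setting. With a value of 2 the log is written out on each commit but fsynced only about once per second, so you can lose up to a second of transactions on a crash:

CODE:

SHOW GLOBAL VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
-- Trade durability for fsync rate: write the log on each commit,
-- fsync it only about once per second.
SET GLOBAL innodb_flush_log_at_trx_commit = 2;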

For reference, I used the command:

sysbench --test=fileio --file-num=1 --file-total-size=50G --file-fsync-all=on --file-test-mode=seqrewr --max-time=100 --file-block-size=4096 --max-requests=0 run

The next step is to check how the RAID setup affects IO throughput.


Entry posted by Vadim |
6 comments

Mar
23
2010
--

Too many connections? No problem!

Did that ever happen to you in production?

CODE:

[percona@sandbox msb_5_0_87]$ ./use
ERROR 1040 (00000): Too many connections

Just happened to one of our customers. Want to know what we did?

For demo purposes I'll use a sandbox here (so the ./use is actually executing the mysql cli). Oh, and mind that this is not a general-purpose best practice, but rather a break-and-enter hack for when the server is flooded. So, when this happens in production, the problem is: how do you quickly regain access to the mysql server to see what all the sessions are doing, and how do you do that without restarting the application? Here's the trick:

CODE:

[percona@sandbox msb_5_0_87]$ gdb -p $(cat data/mysql_sandbox5087.pid) \
                                     -ex "set max_connections=5000" -batch
[Thread debugging using libthread_db enabled]
[New Thread 0x2ad3fe33b5c0 (LWP 1809)]
[New Thread 0x4ed19940 (LWP 27302)]
[New Thread 0x41a8b940 (LWP 27203)]
[New Thread 0x42ec5940 (LWP 1813)]
[New Thread 0x424c4940 (LWP 1812)]
0x00000035f36cc4c2 in select () from /lib64/libc.so.6

And here’s the result:

CODE:

[percona@test9 msb_5_0_87]$ ./use
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 5.0.87-percona-highperf-log MySQL Percona High Performance Edition, Revision 61 (GPL)

Type 'help;' or \h for help. Type \c to clear the current input statement.

mysql [localhost] {msandbox} ((none))> select @@global.max_connections;
+--------------------------+
| @@global.max_connections |
+--------------------------+
|                     5000 |
+--------------------------+
1 row in set (0.00 sec)

Credit for the gdb magic goes to Domas.

A few notes:

  • You would usually have one connection reserved for a SUPER user, but that does not help if your application is connecting as a SUPER user (which is a bad idea anyway).
  • This worked for me on 5.0.87-percona-highperf, but use it at your own risk and better test it before you actually have to do it in production.
  • This example assumes you had less than 5000 max_connections configured ;)
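Once you are back in, you can inspect the sessions and adjust the limit from SQL as usual. A quick sketch (the value 200 is just a hypothetical target for illustration):

CODE:

SHOW FULL PROCESSLIST;            -- see what all the sessions are doing
SET GLOBAL max_connections = 200; -- put the limit back to a sane value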

Entry posted by Aurimas Mikalauskas |
19 comments


Mar
20
2010
--

InnoDB TABLE/INDEX stats

In Released and new coming features I did not mention two additional INFORMATION_SCHEMA tables available in XtraDB:

  • INNODB_TABLE_STATS
  • INNODB_INDEX_STATS

These tables show statistics about InnoDB tables (taken from the InnoDB data dictionary).

INNODB_TABLE_STATS contains:

  • table_name – table name in InnoDB internal style ('database/table')
  • rows – estimated number of all rows
  • clust_size – clustered index (table/primary key) size, in pages
  • other_size – size of the other (non-primary-key) indexes, in pages
  • modified – internal counter used to judge whether a statistics recalculation should be done

INNODB_INDEX_STATS contains:

  • table_name – table name in InnoDB internal style ('database/table')
  • index_name – index name
  • fields – how many fields the index key has (this reflects InnoDB's internal structure, so it may be larger than in the 'CREATE TABLE')
  • row_per_keys – estimated rows per key value ([1 column value], [2 columns value], [3 columns value], ...)
  • index_size – index size in pages
  • leaf_pages – number of leaf pages

Using these stats you can estimate how big an index is, and also what the statistics per index are (or at least what InnoDB thinks the statistics are).
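For example, here is a minimal sketch, assuming the default 16KB InnoDB page size, that lists the biggest indexes by their estimated on-disk size:

CODE:

SELECT table_name, index_name,
       index_size * 16384 / 1024 / 1024 AS index_size_mb,
       leaf_pages * 16384 / 1024 / 1024 AS leaf_pages_mb
FROM INFORMATION_SCHEMA.INNODB_INDEX_STATS
ORDER BY index_size DESC LIMIT 10;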


Entry posted by Vadim |
5 comments


Mar
19
2010
--

Is your MySQL Server Loaded?

So you're running a benchmark/stress test – how do you tell if the MySQL server is really loaded? This looks like a trivial question, but in fact, especially when the workload consists of simple queries, I often see the load generator and network putting a lot less load on MySQL than expected. For example, you may have 32 threads (or processes) running queries as fast as they can... does it really mean there are 32 concurrent queries running all the time? It may be the case, or it may not.

Take a look at this server for example:

CODE:

[root@db01 ~]# mysqladmin -i1 extended | grep Threads_running
| Threads_running                       | 1                    |
| Threads_running                       | 1                    |
| Threads_running                       | 1                    |
| Threads_running                       | 19                   |
| Threads_running                       | 1                    |
| Threads_running                       | 1                    |
| Threads_running                       | 20                   |
| Threads_running                       | 1                    |
| Threads_running                       | 1                    |
| Threads_running                       | 1                    |
| Threads_running                       | 11                   |
| Threads_running                       | 1                    |

This corresponds to what is expected to be a stress load, but we can see MySQL is getting only spikes of concurrent query execution, and most of the time there are no queries executing. The value 1 for Threads_running corresponds to the connection which runs "SHOW STATUS", so you need to subtract 1 from the reported amount to see the true number. No wonder in the case above there was a lot of free CPU and IO capacity.

Take a look at another sample:

CODE:

| Threads_running                       | 33                   |
| Threads_running                       | 27                   |
| Threads_running                       | 1                    |
| Threads_running                       | 28                   |
| Threads_running                       | 19                   |
| Threads_running                       | 21                   |
| Threads_running                       | 2                    |
| Threads_running                       | 27                   |
| Threads_running                       | 27                   |
| Threads_running                       | 32                   |
| Threads_running                       | 27                   |
| Threads_running                       | 1                    |
| Threads_running                       | 24                   |
| Threads_running                       | 29                   |

In this case the load is higher and a lot more uniform – there are moments when all 32 queries are active (this is a test with 32 connections) – but you can see that most of the time it is less than that.

Looking at Threads_running is a very simple and powerful way to see whether you're really putting the sustained load on the database that you expect.

It may be worth explaining what the value of Threads_running represents. This is the number of queries currently being processed; the ball is on the server's side. The server has received the query but has not finished sending the response back yet. This is a very broad measure of activity: if a query is waiting on IO, blocked on a mutex, a table lock, a row-level lock, or waiting on innodb_thread_concurrency, it will still be considered running. This is even the case when the result of a large query is being sent back and the send operation is blocked because of a slow network or client. Because the measure is so broad, it is very helpful for seeing whether the client is loading the server well: if it is, the number in Threads_running will be appropriately high.
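For reference, the same check from a SQL prompt; remember that this session itself counts as one running thread:

CODE:

SHOW GLOBAL STATUS LIKE 'Threads_running';
-- Subtract 1 (this connection) to get the true number of active queries.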


Entry posted by peter |
3 comments


Mar
18
2010
--

When the subselect runs faster

A few weeks ago, we had a query optimization request from one of our customers.

The query was very simple, something like:

CODE:

SELECT * FROM `table` WHERE (col1='A'||col1='B') ORDER BY id DESC LIMIT 20 OFFSET 0

The column in the table looks like this:

CODE:

`col1` enum('A','B','C','CD','DE','F','G','HI') default NULL

The table has 549252 rows and, of course, there is an index on col1. MySQL estimated the cardinality of that index as 87, which was of course misleading, as the index cardinality in this case can't be over 9: there are only 8 (+ NULL) different possible values for this column.

CODE:

+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows   | Extra                       |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------------+
|  1 | SIMPLE      | table | range | col1          | col1 | 2       | NULL | 549252 | Using where; Using filesort |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------------+

This query took more than 5 minutes (the rows are large and the table does not fit in cache well).

To run this query, MySQL will first find each row where col1 is A or B using the index. Then it is going to sort them by id using filesort, and finally send the first 20 rows, ignoring the rest.

In this case MySQL has 2 indexes, where one is usable to find rows while the other is usable to return them in the right order. MySQL can choose only one of them to execute the query, and it uses the index to find rows. This is a sensible strategy if there is no LIMIT, but a poor choice if there is one: it is often a lot faster to retrieve rows in order, checking the WHERE clause for each until the required number of rows has been returned, especially when the WHERE clause is not very selective.

So I tried this:

CODE:

select * from table where id in (SELECT id FROM `table` WHERE (col1='A'||col1='B')) ORDER BY id DESC LIMIT 20 OFFSET 0;

In this case we are forcing MySQL to retrieve rows in sorted order and check whether each matches our original WHERE clause using the subselect. It looks scary if we look at EXPLAIN, but in reality the dependent subquery is only executed enough times to produce 20 rows in the result set.

CODE:

+----+--------------------+-------+-----------------+---------------+---------+---------+------+--------+-------------+
| id | select_type        | table | type            | possible_keys | key     | key_len | ref  | rows   | Extra       |
+----+--------------------+-------+-----------------+---------------+---------+---------+------+--------+-------------+
|  1 | PRIMARY            | table | index           | NULL          | PRIMARY | 4       | NULL | 765105 | Using where |
|  2 | DEPENDENT SUBQUERY | table | unique_subquery | PRIMARY,col1  | PRIMARY | 4       | func |      1 | Using where |
+----+--------------------+-------+-----------------+---------------+---------+---------+------+--------+-------------+

The result is a much better execution time:

CODE:

(20 rows in set (0.01 sec))

So by rewriting the query using a subquery we actually improved its performance 100 times. Subqueries are
not always slowing things down.

Even though this proves subqueries are not always slow, this rewrite is not the most optimal one. We do not really need a separate subselect to make MySQL check the WHERE clause while scanning the table in index order; we can just use the FORCE INDEX hint to override MySQL's index choice:

CODE:

mysql> explain select * from table FORCE INDEX(PRIMARY) where (col1='A'||col1='B') order by id desc limit 20 offset 0;
+----+-------------+-------+-------+---------------+---------+---------+------+--------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows   | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+-------------+
|  1 | SIMPLE      | table | index | NULL          | PRIMARY | 4       | NULL | 549117 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+-------------+

mysql> select * from table FORCE INDEX(PRIMARY) where (col1='A'||col1='B') order by id desc limit 20 offset 0;
20 rows in set (0.00 sec)

This approach works well if the WHERE clause is not very selective; otherwise MySQL may need to scan very many rows to find enough matching rows. You can also use another trick Peter
wrote about a couple of years ago.


Entry posted by Istvan Podor |
7 comments


Mar
17
2010
--

Percona-XtraDB-9.1: released and new coming features

Recently Aleksandr announced the new Percona-XtraDB-9.1 release, and now is a good time to summarize the features we have and what is expected soon.

This release contains long-awaited features from 5.0:

  • extended slow.log
  • USER/TABLE/INDEX/CLIENT_STATISTICS + THREAD_STATISTICS ( coming in release-10)

The extended slow.log is now even more extended; there is additional information for each query:

CODE:

# Bytes_sent: 4973  Tmp_tables: 1  Tmp_disk_tables: 1  Tmp_table_sizes: 7808

That is, you can see how many bytes were returned by the query, whether a temporary table was used,
whether it was an on-disk or in-memory table, and how big the temporary table was.

You can also profile each individual query from stored procedures, so you no longer have to
guess what took so long in a stored procedure.
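A hedged sketch of enabling the extra detail: log_slow_verbosity is the variable used by Percona's slow-log patches, but verify the exact name and accepted values against the documentation for your build:

CODE:

-- Assumption: variable and value as in the Percona microslow patches.
SET GLOBAL log_slow_verbosity = 'full'; -- log all extra fields, including those shown above
SET GLOBAL long_query_time = 0;         -- log every query while profiling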

What we are working on right now is:

  • InnoDB page size, so you will be able to specify a 4K, 8K or 16K page size when you create a database
  • InnoDB fast checksums, idea and implementation was taken from Facebook patches
  • InnoDB big log files (bigger than 4GB). As our benchmarks on FusionIO show, you may need
    4GB+ log files to get more performance on fast storage systems
  • SHOW TEMPORARY TABLES, patch from Venu Anuganti
  • INFORMATION_SCHEMA.SYS_TABLES and SYS_INDEXES to show internal InnoDB data dictionary

The full list of XtraDB engine features (what makes it different from the InnoDB plugin) as it stands right now:

Performance improvements

  • Improved buffer_pool scalability
  • Fast recovery
  • Improved IO path
  • Improved rollback segment scalability
  • Separate purge thread
  • Limited size of data dictionary in memory
  • Increased number of concurrent write transactions (undo slots) (up to 4000)
  • Fast checksums (in release process)
  • Support of different page sizes (4K, 8K, 16K) (in release process)

Usability / operations

  • Show content of buffer_pool
  • Import / export of dedicated tables
  • Import / export of buffer_pool
  • Transactional replication
  • Show internal InnoDB data dictionary
  • Show InnoDB locking/io profiling in slow.log

I hope you enjoy our work!


Entry posted by Vadim |
13 comments
