May 31, 2011
--

What’s a good buffer pool read/write ratio?

At Percona Live last week, someone showed me a graph from their Cacti monitoring system, using the templates that I wrote. It was the buffer pool pages read, written, and created. He asked me if the graph was okay. Shouldn’t there be a lot more pages read than written, he asked? It’s a great question.

I’ve blogged before about the danger of trying to interpret ratios. Ratios are not good ways to discover whether systems are healthy. So, why graph them, then?

First, let me say that the graph actually doesn’t show a ratio — it just shows the absolute values of the reads, writes, and creates per second, stacked on top of each other. The person was mentally comparing them and creating a ratio from them. But there’s no ratio on the graph itself.

Regardless of that, some systems ought to have more reads than writes, and others the reverse. So if you’re looking at your graph wondering what it should look like, the answer is probably “it should look exactly as it looks!”

I’ve gotten a lot of questions over time about how to interpret the Cacti graphs, and this person helped me to understand what the questions were really about. People were asking me “when I look at these graphs, how can I tell if anything is wrong with my system?” But that’s not really the most useful way to approach the graphs.

It really comes down to the difference between discovery and diagnosis. In general, it’s best to use the graphs for diagnosing problems that you already know about, not for trying to discover problems. Your monitoring and alerting system (Nagios?) should be trying to discover whether there is a problem. The graphs are there for quickly showing you what has changed. If the website suddenly starts responding very slowly, for example, then you can look at the graphs and see if any of them have sharp increases or decreases. You can use that information to help you diagnose.

But in general, I wouldn’t spend very much time looking at the graphs from day to day. I’d just check them once in a while — maybe once a week I’d look at the monthly view — to see if there were any sharp changes during the past week; I’d ensure that I know why those changes happened if I see any (maybe I deployed a new release); and I’d want to make sure that the graphs are still working, and haven’t gotten broken due to some problem like privileges or firewall rules.

In the ideal world, I’d like to simply collect everything, and not even define any graphs for the metrics. Then I’d like an easy way to make graphs on an ad-hoc basis. But Cacti is designed to have defined graphs, and that makes it tempting for people to spend a lot of time looking at them :-)

May 25, 2011
--

Joining with ORM

Today’s installment is the first part of the chapter about joins. The installment consists of two parts: first, explaining some general characteristics of joins; second, explaining the nested loops join algorithm by example, covering the N+1 problem and other aspects of Object-Relational Mapping (ORM).
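
As a quick illustration of the N+1 problem (the table and column names below are invented for this example, not taken from the installment): an ORM that lazily loads each related row issues one extra query per row, whereas a join retrieves the same data in a single statement.

-- N+1 pattern: one query for the parent rows, then one query per related row
SELECT id FROM orders WHERE customer_id = 42;
SELECT * FROM order_items WHERE order_id = 1001;
SELECT * FROM order_items WHERE order_id = 1002;
-- ... and so on, once for every order returned by the first query

-- Join pattern: the same data in one statement (a nested loops join in MySQL)
SELECT o.id, i.*
FROM orders o
JOIN order_items i ON i.order_id = o.id
WHERE o.customer_id = 42;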

May 25, 2011
--

Just hours left until Percona Live sells out!

We’re seeing a spike of last-minute ticket sales for Percona Live New York. Meanwhile, we’re assembling bags and unpacking t-shirts at the venue. If you want to come, buy your ticket now — we’re rapidly selling out the remaining tickets!

Remember, if all the tickets are sold, you can still come to the evening event — free drinks and food! Free prizes, including a Gold Unlimited support contract from Percona!

May 23, 2011
--

Free Percona Live tickets and Percona Support contract!

Just a couple of announcements for Percona Live: we’re giving away free tickets through Engine Yard, and we’ve arranged our evening open-bar event.

For a chance to win a free ticket, check out the guest blog post I wrote for Engine Yard on how to make Rails migrations faster and safer.

In other news, Percona Live now has an open-bar evening event, sponsored by Clustrix. It is open to all, not just registered attendees of Percona Live. You can come to Mickey Mantle’s legendary Sports Bar and Restaurant for free drinks and food, networking, prizes, and a chance to win an Unlimited Gold MySQL Support contract from Percona, worth over $15,000!

In case you don’t know, Percona Live is a one-day intensive MySQL conference, with four tracks of 100% technical content from world-famous MySQL experts. Tickets are still available at http://www.percona.com/live/. Registration is not required to attend the evening event, so if you can’t attend Percona Live, come meet us afterwards anyway! We’ve arranged for a substantial amount of free food, so you won’t have to leave to find dinner.

The evening event begins at 6pm this Thursday the 26th, and is open-bar (free drinks) until 8pm. Mickey Mantle’s is at 42 Central Park South. From the conference center, just walk west on 59th street. For more information and directions, visit the Percona Live page for venue details.

May 20, 2011
--

Scaling problems still exist in MySQL 5.5 and Percona Server 5.5

MySQL 5.5 and Percona Server 5.5 do not solve all scalability problems, even for read-only workloads. Workloads that have received a lot of attention, such as Sysbench and DBT2/TPC-C, scale pretty well, but there are other quite typical workloads which do not scale nearly as well. This is why it is important to test performance and scalability with your application on your hardware if you really want results most relevant to your situation.

In this example I spotted the query pattern responsible for a large portion of the load in the application by aggregating the query log with mk-query-digest. I then filtered out only the queries of this pattern, which gave me a simplified part of the production workload that focuses on just one query yet mimics the real-world distribution of values.

The query looks something like this:

SELECT *  FROM table WHERE deleted = 0 AND group_id IN (62715996, 62716592, 62717660, 62717716, 62717722, 62717728, 62721292, 62722236, 62722238, 62722240, 62722282, 62722284, 62722286, 62722288, 62722292, 62722294, 62728432, 62728434, 62728436, 62728438, 62728440, 62728442, 62728596, 62728688, 62728882, 62728884, 62728886, 62728888, 62728890, 62728892, 62729006, 62729008, 62729010, 62729012, 62729014, 62729042, 62729044, 62729068, 62729070, 62729072, 62729224, 62729226, 62729382, 62729560, 62729564, 62729618, 62729620, 62729624, 62729640, 62729642, 62729644, 62729750, 62729752, 62729754, 62729756, 62729760, 62729762, 62730046, 62730050, 62730052, 62730054, 62730268, 62730270, 62730272, 62730274, 62730276, 62730278, 62730280, 62730284, 62730286, 62730324, 62730326, 62730668, 62730672, 62730676, 62730678, 62730682, 62730686, 62730688, 62730692, 62730696, 62730710, 62730724, 62732448, 62733114, 62734150, 62734442, 62734446, 62734448, 62734456, 62734460, 62734682, 62734684, 62734842, 62734844, 62734846, 62734848, 62735354, 62735356, 62735358, 62736110, 62736112, 62736114, 62736116, 62736118, 62736120, 62737096, 62737098, 62737100, 62737102, 62737104, 62737106, 62737114, 62737118, 62737120, 62737122, 62737124, 62737170, 62737172, 62737174, 62737198, 62737200, 62737226, 62737240, 62737242, 62737244, 62737246, 62737250, 62737252, 62737254, 62737368, 62737370, 62737372, 62737374, 62737376, 62737378, 62737382, 62737384, 62737650, 62737652, 62737654, 62737922, 62737924, 62737926, 62737928, 62737930, 62737932, 62737934, 62737936, 62739648, 62739652, 62739654, 62739658, 62739662, 62739668, 62739680, 62739844, 62739856, 62739860, 62739868, 62740030, 62740036, 62740042, 62740044, 62740046, 62740154, 62740156, 62740160, 62740164, 62740170, 62740386, 62740388, 62740390, 62740392, 62740394, 62740396, 62740398, 62740400, 62740402, 62740404, 62740406, 62740408, 62740964, 62740966, 62740968, 62740970, 62740972, 62742370, 62742374, 62742380, 62742384, 62742388, 62742394, 62742398, 62742402, 62743720, 62743962, 62744142, 62744144, 62744146)

There is a secondary key defined on (group_id, deleted).
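
For reference, such a key could be added like this (the table name is the placeholder used in the query above; the index name is illustrative):

ALTER TABLE `table` ADD KEY `idx_group_deleted` (group_id, deleted);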

This pattern of medium-size IN lists is typical for a lot of modern applications. The system I used for the test was a 4-socket R910 with X7542 CPUs, giving 24 real cores. The workload was completely in memory in this test. I compared Percona Server 5.1.56-12.7 and Percona Server 5.5.11-20.2, with the latter configured with innodb_buffer_pool_instances=16 and a 400GB buffer pool. Here are the results:

First, it is worth saying that we see great improvements for this workload in Percona Server 5.5, showing up to 2.5 times better performance (150%), which is a great headline number. If we look at the details, however, we can see this is an improvement from a complete disaster to something still absolutely unacceptable. If we look at the gain the system shows from single-thread performance to peak performance, the number is 2.3 for Percona Server 5.1 and around 6 for Percona Server 5.5, which is far too little for what you would expect from a 24-core system. For comparison you can check the Sysbench results which Vadim published: for a read-only workload we get a 17x gain from 1 thread to peak performance on a system with 12 physical cores and 24 threads, which is almost perfect scalability.

So there are workloads where the problem is solved, and there are others where a lot of work still needs to be done and where you still can’t effectively use more than 8 cores (which corresponds to a single CPU socket these days).

Here are some more details. oprofile output:

867964   15.1181  mysqld                   btr_cur_search_to_nth_level
792651   13.8063  mysqld                   btr_search_guess_on_hash
716506   12.4800  mysqld                   mutex_spin_wait
467974    8.1511  mysqld                   rec_get_offsets_func
376898    6.5648  mysqld                   buf_page_get_gen

PMP (poor man’s profiler) output:

     10 libaio::??,os_aio_linux_collect,os_aio_linux_handle,fil_aio_wait,io_handler_thread,start_thread,clone
      5 pthread_cond_wait,os_cond_wait,os_event_wait_low,sync_array_wait_event,rw_lock_s_lock_spin,pfs_rw_lock_s_lock_func,btr_search_build_page_hash_index,btr_search_info_update_slow,btr_search_info_update,btr_cur_search_to_nth_level,btr_pcur_open_with_no_init_func,row_sel_get_clust_rec_for_mysql,row_search_for_mysql,ha_innobase::index_read,handler::read_range_first,handler::read_multi_range_next,QUICK_RANGE_SELECT::get_next,rr_quick,sub_select,do_select,JOIN::exec,mysql_select,handle_select,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone
      4 pthread_cond_wait,os_cond_wait,os_event_wait_low,sync_array_wait_event,mutex_spin_wait,mutex_enter_func,pfs_mutex_enter_func,buf_page_get_mutex_enter,buf_page_get_gen,btr_cur_search_to_nth_level,btr_pcur_open_with_no_init_func,row_search_for_mysql,ha_innobase::index_read,handler::read_range_first,handler::read_multi_range_next,QUICK_RANGE_SELECT::get_next,rr_quick,sub_select,do_select,JOIN::exec,mysql_select,handle_select,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone
      2 pthread_cond_wait,os_cond_wait,os_event_wait_low,sync_array_wait_event,rw_lock_x_lock_func,pfs_rw_lock_x_lock_func,btr_search_build_page_hash_index,btr_search_info_update_slow,btr_search_info_update,btr_cur_search_to_nth_level,btr_pcur_open_with_no_init_func,row_sel_get_clust_rec_for_mysql,row_search_for_mysql,ha_innobase::index_read,handler::read_range_first,handler::read_multi_range_next,QUICK_RANGE_SELECT::get_next,rr_quick,sub_select,do_select,JOIN::exec,mysql_select,handle_select,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone
      2 pthread_cond_wait,os_cond_wait,os_event_wait_low,sync_array_wait_event,rw_lock_s_lock_spin,pfs_rw_lock_s_lock_func,btr_search_build_page_hash_index,btr_search_info_update_slow,btr_search_info_update,btr_cur_search_to_nth_level,btr_pcur_open_with_no_init_func,row_sel_get_clust_rec_for_mysql,row_search_for_mysql,ha_innobase::general_fetch,handler::read_multi_range_next,QUICK_RANGE_SELECT::get_next,rr_quick,sub_select,do_select,JOIN::exec,mysql_select,handle_select,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection,start_thread,clone

P.S. Disabling the adaptive hash index makes things a lot worse in this workload, and innodb_adaptive_hash_index_partitions=16 (available for partitioning the adaptive hash index) does not help, because everything hits the same index.

May 20, 2011
--

InnoDB compression woes

InnoDB compression is getting some traction, and I see quite contradictory opinions: some people have successful deployments in production, while others say that compression in its current implementation is useless.
To get an initial impression of the performance, I decided to run some sysbench multi-table benchmarks.
I was actually preparing to do more complex research, but even the first initial results are quite discouraging.

My setup: a Dell PowerEdge R900 running Percona-Server-5.1.57-rel12.8 (which will be in a public release soon); storage is a FusionIO 320GB MLC card, which does not matter much in this CPU-bound benchmark.

First stage – load the data. The scripts for multi-table sysbench allow loading data in parallel, so let’s load 16 tables in 16 parallel threads, 25,000,000 rows in each table. That gives about 6GB of data per table (uncompressed) and 96GB of data in total.

./sysbench --test=tests/db/parallel_prepare.lua --oltp-tables-count=16 --num-threads=16 --oltp-table-size=25000000 run

Results: load time for regular tables: 19693 sec; for compressed tables: 38278 sec.
Compressed tables are created with ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8, and their final size is 3.1GB per table. So we get a 2x win in space in exchange for a 2x worse load time. Maybe a fair deal.
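
For reference, a sysbench table could be created in (or converted to) this format with something along these lines (a sketch; sbtest1 is the standard sysbench table name, and the compressed row format requires the Barracuda file format with innodb_file_per_table enabled):

-- Convert an existing sysbench table to the compressed row format
ALTER TABLE sbtest1 ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;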

Now let’s run an oltp read-only workload in 16 parallel threads, limiting the dataset to 3,000,000 rows per table, which is in total about an 11GB working set. Using 24GB of memory for the buffer pool gives us a fully in-memory, CPU-bound workload.

Command to run:

./sysbench --test=tests/db/oltp.lua --oltp-tables-count=16 --oltp-table-size=5000000 --oltp-read-only=on --rand-init=on --num-threads=16 --max-requests=0 --rand-type=uniform --max-time=1800 --mysql-user=root --report-interval=10 run

This will report results every 10 seconds.

After an initial warm-up, the throughput for regular tables stabilized at 4650 transactions per second. I expected some overhead for compressed tables, but not this much: the throughput with compressed tables is 30 transactions per second. This is a 150x difference.

As the workload is clearly read-only and CPU-bound, let’s check the CPU stats:

regular tables:

 -----cpu------
 us sy id wa st
  1  0 98  0  0
 85 13  2  0  0
 85 13  2  0  0
 85 13  1  0  0
 85 13  2  0  0

Compressed tables:

 -----cpu------
us sy id wa st
  2  0 97  1  0
  7  0 93  0  0
  7  0 93  0  0
  7  0 93  0  0
  7  0 93  0  0
  7  0 93  0  0
  7  0 93  0  0

With regular tables the CPU is 85% utilized, which is quite a decent number. With compressed tables CPU utilization is 7%. Obviously we have some mutex serialization problem.

Analyzing SHOW INNODB STATUS (the SEMAPHORES section) for the workload with compressed tables, we can see:

      1 Mutex at 0xe13880 '&buf_pool_zip_mutex'
     14 Mutex at 0xe13780 '&LRU_list_mutex'

Apparently, with compressed tables we have very strong contention on LRU_list_mutex.

I should check how compressed tables perform in an IO-bound workload (this is where they should give their main benefit), but for an in-memory load they show a significant scalability problem.

May 18, 2011
--

Flexviews is a working scalable database transactional memory example

http://Flexvie.ws fully implements a method for creating materialized views for MySQL data sets. The tool is for MySQL, but the methods are database agnostic. A materialized view is an analogue of software transactional memory. You can think of this as database transactional memory, or as database state distributed over time, in a way that is easy to manage.

It has been shown that combinatorial algebra can be applied to all aggregate functions (Flexviews uses only composable operations, even for deletes over “non-distributable aggregate functions”), and Flexviews includes this capability today. Because Flexviews can move database objects forward in time synchronously with each other, to specific points in database state, invalid database state is not possible, particularly if there are thorough database checks.
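
To illustrate the idea of composable aggregates (this is a generic sketch of log-based incremental maintenance, not the actual Flexviews API; all table and column names here are hypothetical): signed deltas from a change log can roll a SUM/COUNT materialized view forward to a chosen transaction, and a non-distributable aggregate such as AVG can then be derived from the maintained SUM and COUNT instead of being recomputed from the base table.

-- Hypothetical change log: sign = +1 for inserted rows, -1 for deleted rows.
-- Apply all changes between the last applied transaction and the target snapshot:
UPDATE mv_sales_by_region v
JOIN (
    SELECT region,
           SUM(sign * amount) AS delta_sum,
           SUM(sign)          AS delta_cnt
    FROM sales_changelog
    WHERE txn_id > @last_applied AND txn_id <= @snapshot
    GROUP BY region
) d USING (region)
SET v.sum_amount = v.sum_amount + d.delta_sum,
    v.row_count  = v.row_count  + d.delta_cnt;
-- (A full implementation would also insert rows for new groups, delete rows
--  whose row_count reaches zero, and compute AVG as sum_amount / row_count.)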

Flexviews is:
lock free for read
fully log based
supports all DML
supports all aggregate functions
supports join
supports NULL values
supports simple projection
only rolls forward in time right now
MVCC
row level locking with innodb

Flexviews was based on the following research:
pages.cs.wisc.edu/~beyer/
citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.7788

It also includes my own extensive personal research into combinatorial algebra over non-distributable aggregate functions. I was not aware of the existing research into combinatorial algebra and developed these techniques in isolation from other researchers, as I am an amateur.

May 17, 2011
--

Using sets to solve difficult decision problems – the subset sum problem

In a previous post, I described how to use natural database compression on sets to reduce their footprint in the database by removing duplicates and replacing them with a single row and a count column. I also showed an easy way to detect very efficiently whether any subset of numbers in one set exists in another set, using a table encoded with that method, and a special case that allowed you to find any two numbers that summed to zero. That example was a case of the subset sum problem, one of the classic NP-complete decision problems, and it was a foothold on a way to efficiently test a decision problem on a data set of virtually any size. I then set about solving the subset sum problem because it is easily understood and simple to solve once you understand the method.

Here is a generic method for determining if any subset of numbers in a set adds up to any other number. By naturally compressing the data in a computation-friendly way, we can change the way we work with sets of numbers.

Let’s find out if any of the numbers in our list add up to any other number. First, the set is stored in an RLE-encoded form that can be queried without decompression via relational algebra:

CREATE TABLE `ex2` (
  `val` bigint(20) NOT NULL DEFAULT '0',
  `cnt` bigint(21) NOT NULL DEFAULT '0'
);

mysql> select * from test.ex2;
+-----+----------+
| val | cnt      |
+-----+----------+
| -10 |        1 |
|  -3 |        1 |
|  -2 |        1 |
|   0 |   250000 |
|   4 |  8000000 |
|   7 | 14680064 |
|  14 | 14680064 |
|  15 | 35417088 |
|  16 |        1 |
+-----+----------+
9 rows in set (0.00 sec)

Decoded length:

mysql> select sum(cnt) from ex2;
+----------+
| sum(cnt) |
+----------+
| 73027220 |
+----------+
1 row in set (0.00 sec)

Here we can create a decision table, that is, a set of values to run our check through. My test will check every combination of values in the input set (73M rows) against this decision table. If any combination of values adds up to sum_to_check_for, then we will know quickly.
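
The decision table itself is small; it can be created and populated like this (a sketch reconstructed from its contents shown below):

CREATE TABLE sum_to_check (
  id bigint(20) DEFAULT NULL,
  sum_to_check_for bigint(20) DEFAULT NULL
);

INSERT INTO sum_to_check (id, sum_to_check_for)
VALUES (1, 0), (2, 1), (3, 2), (4, 5), (5, -5);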

mysql> select * from sum_to_check;
+------+------------------+
| id   | sum_to_check_for |
+------+------------------+
|    1 |                0 |
|    2 |                1 |
|    3 |                2 |
|    4 |                5 |
|    5 |               -5 |
+------+------------------+
5 rows in set (0.00 sec)

Here, we can test for any N items to see if they add up:

mysql> select sum_to_check_for,
group_concat(a.val),
group_concat(sum_to_check_for),
sum(a.val) = sum_to_check_for matched
from ex2 a
join sum_to_check
where val in(-2,-3)
group by 1
having sum(a.val) = sum_to_check_for ;
+------------------+---------------------+--------------------------------+---------+
| sum_to_check_for | group_concat(a.val) | group_concat(sum_to_check_for) | matched |
+------------------+---------------------+--------------------------------+---------+
|               -5 | -3,-2               | -5,-5                          |       1 |
+------------------+---------------------+--------------------------------+---------+
1 row in set (0.00 sec)

Now we can check the entire set in a reasonable amount of time, even though it is logically millions of entries long before natural compression:

mysql> select sum_to_check_for,
group_concat(a.val),
 group_concat(sum_to_check_for),
sum(a.val) = sum_to_check_for matched
from ex2 a join sum_to_check where val in(-10) group by 1 having sum(a.val) = sum_to_check_for ;
Empty set (0.00 sec)

mysql> select sum_to_check_for,
group_concat(a.val),
group_concat(sum_to_check_for),
sum(a.val) = sum_to_check_for matched
from ex2 a join sum_to_check where val in(-10,-3) group by 1 having sum(a.val) = sum_to_check_for ;
Empty set (0.00 sec)

mysql> select sum_to_check_for,
group_concat(a.val),
group_concat(sum_to_check_for),
 sum(a.val) = sum_to_check_for matched
from ex2 a join sum_to_check
where val in(-10,-3,-2,0) group by 1
having sum(a.val) = sum_to_check_for ;
Empty set (0.00 sec)

mysql> select sum_to_check_for,
group_concat(a.val),
group_concat(sum_to_check_for),
sum(a.val) = sum_to_check_for matched
from ex2 a join sum_to_check
where val in(-10,-3,-2)
group by 1
having sum(a.val) = sum_to_check_for ;

...

Note that here we search the entire set of millions of items 5 times over (one for each sum_to_check_for):

mysql> select sum_to_check_for,
group_concat(a.val),
group_concat(sum_to_check_for),
sum(a.val) = sum_to_check_for matched
from ex2 a join sum_to_check
where val in(-10,-3,-2,0,4,7,14,15,16)
group by 1
having sum(a.val) = sum_to_check_for ;
Empty set (0.00 sec)

mysql> select sum_to_check_for,
group_concat(a.val),
group_concat(sum_to_check_for),
sum(a.val) = sum_to_check_for matched
from ex2 a join sum_to_check
where val in(-3,-2)
group by 1
having sum(a.val) = sum_to_check_for ;
+------------------+---------------------+--------------------------------+---------+
| sum_to_check_for | group_concat(a.val) | group_concat(sum_to_check_for) | matched |
+------------------+---------------------+--------------------------------+---------+
|               -5 | -3,-2               | -5,-5                          |       1 |
+------------------+---------------------+--------------------------------+---------+
1 row in set (0.00 sec)

Remember that there are a lot of items in my list, a list that would take forever (literally) to process before.

mysql> select sum(cnt) from ex2;
+----------+
| sum(cnt) |
+----------+
| 73027220 |
+----------+
1 row in set (0.00 sec)

mysql> select * from ex2;
+-----+----------+
| val | cnt      |
+-----+----------+
| -10 |        1 |
|  -3 |        1 |
|  -2 |        1 |
|   0 |   250000 |
|   4 |  8000000 |
|   7 | 14680064 |
|  14 | 14680064 |
|  15 | 35417088 |
|  16 |        1 |
+-----+----------+
9 rows in set (0.00 sec)

This works for two reasons. First, I know the contents of the set and its distribution characteristics from the structure of the compressed version of the table; the data is stored as a histogram. This allows me to collapse the set of items to search down to only the distinct values in the set, and then do a linear search. Second, the check is very fast and all of the searches can be done in parallel in a parallel database, in O(N) time, so we can scan a dataset of any size very quickly across many machines.

If we keep the counts with the items (maintaining them as the set changes), this gives us a finite, easily computable, and easily checkable result. We know not to check whether 17 adds up to something, because we know it is not in our set.
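
Here is a sketch of how those counts could be maintained as the underlying set changes, assuming a unique key on val (this is an illustration, not part of the original example):

-- Add one occurrence of a value (creates the row if the value is new):
INSERT INTO ex2 (val, cnt) VALUES (17, 1)
ON DUPLICATE KEY UPDATE cnt = cnt + 1;

-- Remove one occurrence and drop values that are no longer present:
UPDATE ex2 SET cnt = cnt - 1 WHERE val = 17;
DELETE FROM ex2 WHERE val = 17 AND cnt = 0;

Once a value such as 17 gains a row this way, the checks above automatically start considering it.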

May 17, 2011
--

The case for getting rid of duplicate “sets”

The most useful feature of the relational database is that it allows us to easily process data in sets, which can be much faster than processing it serially.

When the relational database was first implemented, write-ahead logging and other technologies did not exist. This made it difficult to implement the database in a way that matched natural set theory. You see, in set theory there is no possibility of duplicates in sets. This seems hard to fathom given the way we are used to dealing with data in a computer, but it is intuitive when you “think about how you think”.

When you say, “I have ten fingers and ten toes”, and I ask, “how many is that combined?”, you say 20, of course. Did you stop to add up the number one ten times, and then ten more times? Of course not. You did simple arithmetic to make the math faster, adding 10 + 10. It is a simple fact that when you distribute computation you work faster.

How can we effectively reduce the data size of any set, possibly by orders of magnitude, with almost no work? Simple: we start thinking in sets, like our brains do.

I am going to use a bigger example than “digits” because this is too small for you to notice an effective change.

Let’s say I have a table of numbers, and I want to sum them all up:

mysql> show create table ex1;
+-------+-----------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                        |
+-------+-----------------------------------------------------------------------------------------------------+
| ex1   | CREATE TABLE `ex1` (
  `val` bigint(20) NOT NULL DEFAULT '0'
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------+-----------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select count(*) from ex1;
+----------+
| count(*) |
+----------+
| 73027220 |
+----------+
1 row in set (0.00 sec)

mysql> select sum(val) from ex1;
+-----------+
| sum(val)  |
+-----------+
| 871537665 |
+-----------+
1 row in set (5.49 sec)

Now, what if I structure my data differently? We can “compress” the table in the database by removing the duplicates:

mysql> create table ex2 as select val, count(*) from ex1 group by val;
Query OK, 9 rows affected (19.51 sec)
Records: 9  Duplicates: 0  Warnings: 0

mysql> select * from ex2;
+-----+----------+
| val | count(*) |
+-----+----------+
| -10 |        1 |
|  -3 |        1 |
|  -2 |        1 |
|   0 |   250000 |
|   4 |  8000000 |
|   7 | 14680064 |
|  14 | 14680064 |
|  15 | 35417088 |
|  16 |        1 |
+-----+----------+
9 rows in set (0.00 sec)

mysql> select sum(`count(*)`) from ex2;
+-----------------+
| sum(`count(*)`) |
+-----------------+
|        73027220 |
+-----------------+
1 row in set (0.00 sec)

This is very useful compression. First, the data is stored as a histogram, so the cardinality and the distribution are always known. Second, computation on the compressed data is done without decompression, because a mathematical transformation does not have to be decompressed. Third, the COUNT acts as RLE compression for the other attributes. Don’t use FLOAT with this method (don’t use it at all).
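
For example, the SUM that took 5.49 seconds over 73 million rows in ex1 can be computed directly from the compressed table by weighting each value by its count, touching only 9 rows:

-- Equivalent to SELECT SUM(val) FROM ex1, computed without decompression
SELECT SUM(val * `count(*)`) AS total FROM ex2;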

We can do even more interesting things with this concept. Let’s say I want to determine whether I have certain numbers in my set.

First, let’s create a “search set” table to hold the set of numbers we are looking for (and their desired minimum cardinality):

CREATE TABLE `search_set` (
  `val` bigint(20) DEFAULT NULL,
  `cnt` bigint(20) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 ;

mysql> insert into search_set values (-2,1),(-3,1),(-10,1),(15,2),(16,1);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

We are looking for at least one minus-two, at least two fifteens, and so on.

mysql> select * from compressed join search_set using (val) where compressed.cnt >= search_set.cnt;
+-----+----------+------+
| val | cnt      | cnt  |
+-----+----------+------+
| -10 |        1 |    1 |
|  -3 |        1 |    1 |
|  -2 |        1 |    1 |
|  15 | 35417088 |    2 |
|  16 |        1 |    1 |
+-----+----------+------+
5 rows in set (0.00 sec)

mysql> delete from compressed where val=16;
Query OK, 1 row affected (0.00 sec)

mysql> select * from compressed join search_set using (val) where compressed.cnt >= search_set.cnt;
+-----+----------+------+
| val | cnt      | cnt  |
+-----+----------+------+
| -10 |        1 |    1 |
|  -3 |        1 |    1 |
|  -2 |        1 |    1 |
|  15 | 35417088 |    2 |
+-----+----------+------+
4 rows in set (0.00 sec)

Great, things are looking good. What if we were to do this on our original relation, before compression? Here is a way to do it without using the “search set” trick above (the uncompressed table is called `data` here):

Also, notice that it takes a long time to delete. There is no index on this table.

mysql> delete from data where val=16;
Query OK, 1 row affected (3.14 sec)

mysql> select val, count(distinct id) from data             where val in (-2,-3,-10,15,15,16)  group by val ;
+-----+--------------------+
| val | count(distinct id) |
+-----+--------------------+
| -10 |                  1 |
|  -3 |                  1 |
|  -2 |                  1 |
|  15 |           35417088 |
+-----+--------------------+
4 rows in set (39.82 sec)

We can also check to see if any two numbers sum to zero:


mysql> select a.val, b.val from ex2 a,ex2 b where a.val + b.val = 0;
+-----+-----+
| val | val |
+-----+-----+
|   0 |   0 |
+-----+-----+
1 row in set (0.00 sec)

mysql> select a.val, b.val, count(*) from ex2 a,ex2 b where a.val + b.val = 0 group by 1,2 ;
+-----+-----+----------+
| val | val | count(*) |
+-----+-----+----------+
|   0 |   0 |        1 |
+-----+-----+----------+
1 row in set (0.00 sec)

mysql> insert into ex2 values (2,1);
Query OK, 1 row affected (0.00 sec)

mysql> select a.val, b.val, count(*) from ex2 a,ex2 b where a.val + b.val = 0 group by 1,2 ;
+-----+-----+----------+
| val | val | count(*) |
+-----+-----+----------+
|  -2 |   2 |        1 |
|   0 |   0 |        1 |
|   2 |  -2 |        1 |
+-----+-----+----------+
3 rows in set (0.00 sec) 

May 16, 2011
--

Percona welcomes Stewart Smith

Percona is pleased to welcome Stewart Smith to the team. Stewart does not need an extended introduction to the MySQL community, but just in case: Stewart has a long history with both the MySQL and Drizzle code bases. He has been one of the core Drizzle developers since the start of the project (working on Drizzle for Sun and then Rackspace), he maintains HailDB, and he previously worked for MySQL (and then Sun) on MySQL Cluster (NDB).

Stewart joins Percona to lead our development team and will take a lead role in both the Percona Server and Percona XtraBackup products. You can certainly expect more mutual exchange of features and ideas between Drizzle, Percona Server, XtraDB, and XtraBackup to make all of the products stronger.

If you want to hear more about Drizzle and Percona Server, you are welcome to attend Percona Live in New York on May 26th, 2011, where Stewart will present “Drizzle 7, GA and Supported: Current & Future Features”. There you will meet the Percona team and have a chance to learn about Percona Server, XtraBackup, and Drizzle first hand from our experts.
