Nov 23, 2018

Compression options in MySQL (part 1)

Over the last year, I have been pursuing a part-time hobby project exploring ways to squeeze as much data as possible into MySQL. As you will see, there are quite a few different ways. Of course, the compression ratio matters a lot, but other aspects, like the performance of inserts, selects and updates, along with the total amount of bytes written, are also important. When you start combining all the possibilities, you end up with a large set of compression options and, of course, I am surely missing a ton. This project has been a great learning opportunity and I hope you’ll enjoy reading about my results. Given the volume of results, I’ll have to write a series of posts; this post is the first of the series. I also have to mention that some of my work overlaps with work done by one of my colleagues, Yura Sorokin, in a presentation he did in Dublin.

The compression options

  • InnoDB page size in {16k, 32k, 64k} (as references)
  • InnoDB barracuda page compression, block_size in {8k, 4k}
  • InnoDB Transparent page compression with punch holes, page size in {16k, 32k, 64k} * compression algo in {LZ4, Zlib}
  • MyISAM, MyISAM Packed, MyISAM on ZFS with recordsize in {16k, 32k}
  • InnoDB on ZFS, ZFS compression algo in {LZ4, Zlib}, ZFS record size in {16k, 32k, 64k, 128k}, InnoDB page size in {16k, 32k, 64k}
  • TokuDB, TokuDB compression algo in {ZLIB, LZMA, QUICKLZ, SNAPPY}
  • TokuDB on ZFS, TokuDB compression algo set to None, ZFS compression Zlib, ZFS record size in {16k, 32k, 64k, 128k}
  • MyRocks, compression algo in {None, ZSTD}
  • MyRocks on ZFS, MyRocks compression algo set to None, ZFS compression Zlib, ZFS record size in {16k, 32k, 64k, 128k}

In many interesting cases, the ZFS experiments have been conducted with and without a SLOG.

The test datasets

In order to cover these solutions efficiently, I used a lot of automation and restricted myself to two datasets. My first dataset consists of nearly 1 billion rows from the Wikipedia access stats. The table schema is:

CREATE TABLE `wiki_pagecounts` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `day` date NOT NULL,
  `hour` tinyint(4) NOT NULL,
  `project` varchar(30) NOT NULL,
  `title` text NOT NULL,
  `request` int(10) unsigned NOT NULL,
  `size` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_time` (`day`,`hour`)
);

and here’s a typical row:

mysql> select * from wiki_pagecounts where id = 16\G
*************************** 1. row ***************************
     id: 16
    day: 2016-01-01
   hour: 0
project: aa
  title: 'File:Wiktionary-logo-en.png'
request: 1
   size: 10752
1 row in set (0.00 sec)

The average length of the title column is above 70 characters and it often contains HTML escape sequences for UTF-8 characters. The actual column content is not really important, but it is not random data. Loading this dataset in plain InnoDB results in a data file of about 113GB.

The second dataset is from the defunct Percona Cloud Tools project and is named “o1543”. Instead of a large number of rows, it has only 77M rows, but this time the table has 134 columns, mostly floats and bigints. The table definition is:

CREATE TABLE `query_class_metrics` (
   `day` date NOT NULL,
   `query_class_id` int(10) unsigned NOT NULL,
   `instance_id` int(10) unsigned NOT NULL,
   `start_ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
   `end_ts` timestamp NOT NULL DEFAULT '1970-01-01 00:00:01',
   `query_count` bigint(20) unsigned NOT NULL,
   `lrq_count` bigint(20) unsigned NOT NULL DEFAULT '0',
   ...
   `Sort_scan_sum` bigint(20) unsigned DEFAULT NULL,
   `No_index_used_sum` bigint(20) unsigned DEFAULT NULL,
   `No_good_index_used_sum` bigint(20) unsigned DEFAULT NULL,
   PRIMARY KEY (`start_ts`,`instance_id`,`query_class_id`),
   KEY `start_ts` (`instance_id`)
 );

When loaded in plain InnoDB, the resulting data file size is slightly above 87GB.

The test queries

The first test query is, of course, the inserts used to load the datasets. These are multi-insert statements in primary key order, as generated by the mysqldump utility.

The second test query is a large range select. For the Wikipedia dataset I used:

select `day`, `hour`, max(request), sum(request), sum(size)
from wikipedia_pagecounts.wiki_pagecounts
where day = '2016-01-05'
group by `day`,`hour`;

while for the o1543 dataset, I used:

select query_class_id, sum(Query_time_sum) as Total_time, sum(query_count), sum(Rows_examined_sum),
  sum(InnoDB_pages_distinct_sum)
from o1543.query_class_metrics
where start_ts between '2014-10-01 00:00:00' and '2015-06-30 00:00:00'
group by query_class_id
order by Total_time desc limit 10;

In both cases, a significant amount of data needs to be processed. Finally, I tested random access with updates. For the Wikipedia dataset, I generated 20k distinct single row updates in random primary key order, like:

update wikipedia_pagecounts.wiki_pagecounts set request = request + 1 where id = 377748793;

For the o1543 dataset, I used the following update statement:

update o1543.query_class_metrics set Errors_sum = Errors_sum + 1
where query_class_id = 472 and start_ts between '2014-10-01 00:00:00' and '2014-12-31 23:59:59';

which ends up updating very close to 20k rows, well spaced in terms of primary key values.

The metrics recorded

In order to compare the compression options, I recorded key metrics.

  • Time: that’s simply the execution time of the queries. It is very difficult to tune a server even minimally for all the different engines, and some also rely on the OS file cache. Take the execution time as a rough performance indicator which could change substantially through targeted tuning.
  • Amount of data read and written by the MySQL process, as reported by /proc/$(pidof mysqld)/io.
  • Amount of data written to the actual physical device where the data is stored, from /sys/block/$datadevice/stat. This matters a lot for flash devices, which have a finite write endurance. Reducing the amount of data written to storage was Facebook’s main motivation for MyRocks.
  • The actual size of the final datasets

Even with these simple metrics, you’ll see there is quite a lot to discuss and learn.

The procedure

For the benchmarks, I used an LXC virtual machine. My main goal was to simulate a dataset much larger than the available memory. I tried to limit the MySQL buffers to 128MB, but in some cases, like with MyRocks, that was pretty unfair and it impacted the performance results. Basically, the procedure was:

  1. Start mysqld (no buffer pool load)
  2. Sync + drop cache
  3. Capture: du -hs the datadir
  4. Capture: cat /proc/$(pidof mysqld)/io
  5. Capture: cat /sys/block/vdb/stat
  6. Capture: show global variables and show global status;
  7. Run the test query
  8. Wait for 30 minutes for any flushing or maintenance to complete
  9. Capture: du -hs the datadir
  10. Capture: cat /proc/$(pidof mysqld)/io
  11. Capture: cat /sys/block/vdb/stat
  12. Capture: show global variables and show global status;
  13. Stop mysqld

As much as possible, I automated the whole procedure. On many occasions, I ran multiple runs of the same benchmark to validate unexpected behaviors.

First results: Traditional storage options

Inserting the data

In this first post of the series, I’ll report on the traditional “built-in” options: InnoDB with Barracuda compression (IBC) and MyISAM with packing. I debated a bit whether to include MyISAM in this post, since tables become read-only once packed, but I have personally implemented solutions using MyISAM packed a few times.
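For reference, here is roughly how the IBC side is enabled. It is configured per table through the ROW_FORMAT and KEY_BLOCK_SIZE attributes (this sketch assumes innodb_file_per_table is on and the Barracuda file format is available, as in MySQL 5.7); MyISAM packing, by contrast, is done offline with the myisampack utility.

-- A sketch: enable Barracuda page compression with an 8KB block size
-- (use KEY_BLOCK_SIZE=4 for the InnoDBCmp4k variant)
ALTER TABLE wikipedia_pagecounts.wiki_pagecounts
  ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;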

The first figure (above) shows the final table sizes in GB for the different options considered in this first round of results. Throughout the series, we’ll use the plain InnoDB results as the reference. The Wikipedia dataset has a final size in InnoDB of 112.6GB, while the o1543 dataset is slightly smaller, at 87.4GB.

The MyISAM sizes are smaller by 10 to 15%, which is expected since InnoDB is page based, with page and row headers, and it doesn’t fully pack its pages. The InnoDB dataset size could have been up to twice as large if the data had been inserted in random primary key order.

Adding Barracuda page compression with an 8KB block size (InnoDBCmp8k), both datasets shrink by close to 50%. Pushing to a block size of 4KB (InnoDBCmp4k), the Wikipedia dataset clearly doesn’t compress much further, but the o1543 dataset is very compressible, by more than 75%. Looking at the MyISAM Packed results, the o1543 dataset compressed to only 8.8GB, a mere 11% of the original MyISAM size. That means the o1543 dataset could have done well with IBC using a block size of 2KB. Such a compression ratio is really exceptional; I’ve rarely encountered a ratio that favorable in a production database.

With IBC, we know the block size is too small for the compressibility of the dataset when the final size no longer follows the ratio of the compressed block size over the original block size. For example, the Wikipedia dataset started at 112.6GB, and a quarter of that (4KB/16KB) is much smaller than the 46.9GB obtained with the 4KB block size. You’ll also see a large number of compression failures in the innodb_cmp table of the information schema.
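As a quick illustration, here is a sketch of such a check against information_schema.INNODB_CMP (the counters are global, one row per compressed block size, cumulative since startup):

-- Compression attempts vs. successes per compressed block size
SELECT page_size,
       compress_ops,
       compress_ops_ok,
       ROUND(100 * (compress_ops - compress_ops_ok) / compress_ops, 1) AS failure_pct
FROM information_schema.INNODB_CMP
WHERE compress_ops > 0;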

When compression fails with IBC, InnoDB splits the page in two and recompresses each half. That process adds an overhead which can be observed in the insertion time figure. While the insertion time of the Wikipedia dataset with InnoDBCmp4k explodes to 2.6 times the uncompressed insertion time, the o1543 dataset takes only 30% more time. I suggest you do not take the times here too formally; the test environment I used was not fully isolated. View these times as trends.

The amount of data written during the inserts, shown on the above figure, bugged me for a while. Let’s consider, for example, the amount of writes needed to insert all the rows of the Wikipedia dataset in plain InnoDB. The total is 503GB for a dataset of 113GB. From the MySQL status variables, I have 114GB written to the data files (Innodb_pages_written * 16KB), 114GB written to the double write buffer (Innodb_dblwr_pages_written * 16KB) and 160GB written to the InnoDB log files (Innodb_os_log_written). If you sum all these values, you get about 388GB, a value short by about… 114GB, too close to the size of the dataset to be an accident. What else is written?

After some research, I finally found it! When a datafile is extended, InnoDB first writes zeros to allocate the space on disk. All the tablespaces have to go through that initial allocation phase, and that accounts for the missing 114GB.
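For what it’s worth, here is a sketch of how the accounted-for portion can be pulled from the status counters. It assumes innodb_page_size is 16KB and uses performance_schema.global_status, available in MySQL 5.7:

-- Approximate GB written to the data files, the double write buffer and the redo log
SELECT variable_name,
       ROUND(CASE WHEN variable_name = 'Innodb_os_log_written'
                  THEN variable_value
                  ELSE variable_value * 16384
             END / POW(1024, 3), 1) AS gb_written
FROM performance_schema.global_status
WHERE variable_name IN ('Innodb_pages_written',
                        'Innodb_dblwr_pages_written',
                        'Innodb_os_log_written');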

Back to the data written during the inserts figure: look at the number of writes required for the Wikipedia dataset when using InnoDBCmp4k. Any idea why it is higher? What if I tell you the double write buffer only writes 16KB pages (actually, the size depends on innodb_page_size)? So, when the pages are compressed, they are padded with zeros when written to the double write buffer. Remember that, when compressing to 4KB, we had many compression misses, so we ended up with many more pages to write. Domas Mituzas filed a bug about this back in 2013. Also, by default (innodb_log_compressed_pages), both the compressed and uncompressed versions of the pages are written to the InnoDB log files. In the end, only the writes to the tablespace are reduced by IBC.
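The variable is dynamic, so checking and changing that behavior is easy; a minimal sketch (weigh the crash recovery implications before disabling it on a production server):

SHOW GLOBAL VARIABLES LIKE 'innodb_log_compressed_pages';
-- Stop logging the compressed copies of the pages to the redo log
SET GLOBAL innodb_log_compressed_pages = OFF;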

MyISAM, since it is not a transactional engine, cheats here: the only write overhead is the updates to the index b-trees. So, at the expense of durability, MyISAM writes much less data in this simple benchmark.

Range selects

So great, we have now inserted a lot of rows in our tables. How fast can we access these rows? The following figure presents the times to perform large range scans on the datasets. In InnoDB, the times are “fairly” stable. For the Wikipedia dataset, InnoDB compression improves the select performance; the time to decompress a page is apparently shorter than the time to read a full page. MyISAM without compression is the fastest solution but, once again, the benchmark offers the most favorable conditions to MyISAM. MyISAM Packed doesn’t fare as well for the large range select, likely because too much data must be decompressed.

20k updates

Moving on to the updates, the time required to perform 20k updates with a single thread is shown on the above figure. For both datasets, InnoDB and InnoDBCmp8k show similar times. There is a divergence with the Wikipedia dataset stored on InnoDBCmp4k: the execution time is 57% larger, essentially caused by a large increase in the number of pages read. MyISAM is extremely efficient dealing with the updates of the o1543 dataset since the record size is fixed and the update process is single threaded.


Finally, let’s examine the number of bytes written per update, as shown on the figure above. I was naively expecting about two pages written per update statement, one for the double write buffer and one for the tablespace file. The Wikipedia dataset shows more like three pages written per update, while the o1543 dataset fits rather well with what I was expecting. To understand why, I had to look at the file_summary_by_instance table of the Performance Schema and at the innodb_metrics table. My updates to the Wikipedia dataset are single row updates executed in autocommit mode, while the updates to the o1543 dataset come from a single statement updating 20k rows. When you do many small transactions, you end up writing much more to the undo log and to the system tablespace. The worst case is when the updates are in separate transactions and a long time is allowed for MySQL to flush the dirty pages.
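As a side note, here is the kind of Performance Schema query that breaks the writes down per file (a sketch; the LIKE patterns depend on your datadir layout):

-- Bytes written per file since the counters were last reset
SELECT file_name,
       count_write,
       sum_number_of_bytes_write
FROM performance_schema.file_summary_by_instance
WHERE file_name LIKE '%ibdata%' OR file_name LIKE '%wiki_pagecounts%'
ORDER BY sum_number_of_bytes_write DESC;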

Here are the writes associated with 30 updates, in autocommit mode, 20s apart:

mysql> select NAME, COUNT_RESET from innodb_metrics where name like '%writt%' and count_reset > 0 ;
+---------------------------------+-------------+
| NAME                            | COUNT_RESET |
+---------------------------------+-------------+
| buffer_pages_written            |         120 |
| buffer_data_written             |     2075136 |
| buffer_page_written_index_leaf  |          30 |
| buffer_page_written_undo_log    |          30 |
| buffer_page_written_system_page |          30 |
| buffer_page_written_trx_system  |          30 |
| os_log_bytes_written            |       77312 |
| innodb_dblwr_pages_written      |         120 |
+---------------------------------+-------------+
8 rows in set (0.01 sec)

The index leaf writes are where the rows are stored in the tablespace; 30 matches the number of rows updated, so each update dirtied one leaf page as expected. The undo log entries, stored in the ibdata1 file, keep the previous version of the rows for rollback. I wrongly assumed these undo entries would not actually be written to disk, and would only live in the buffer pool and be purged before they needed to be flushed. I don’t understand clearly enough what is written to the system page and to the trx system to attempt an explanation for those. The sum of pages to write is 120, four per update, but you need to multiply by two because of the double write buffer. So, in this worst case scenario, a simple single row update may cause up to eight pages to be written to disk.

Grouping the updates in a single transaction basically removes the pages written to the system_page and trx_system as these are per transaction.
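For illustration, the batched form simply wraps the same single row updates in one transaction; a sketch:

START TRANSACTION;
UPDATE wikipedia_pagecounts.wiki_pagecounts SET request = request + 1 WHERE id = 377748793;
-- ... the other single row updates ...
COMMIT;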

Here is the result for the same 30 updates, sent at a 20s interval, but this time in a single transaction:

mysql> select NAME, COUNT_RESET from innodb_metrics where name like '%writt%' and count_reset > 0 ;
+---------------------------------+-------------+
| NAME                            | COUNT_RESET |
+---------------------------------+-------------+
| buffer_pages_written            |          63 |
| buffer_data_written             |     1124352 |
| buffer_page_written_index_leaf  |          30 |
| buffer_page_written_undo_log    |          31 |
| buffer_page_written_system_page |           1 |
| buffer_page_written_trx_system  |           1 |
| os_log_bytes_written            |       60928 |
| innodb_dblwr_pages_written      |          63 |
+---------------------------------+-------------+

The write load, in terms of the number of pages written, is cut by half, to four per update. The most favorable case will be a single transaction with no sleep in between.

For 30 updates in a single transaction with no sleep, the results are:

mysql> select NAME, COUNT_RESET from innodb_metrics where name like '%writt%' and count_reset > 0 ;
+---------------------------------+-------------+
| NAME                            | COUNT_RESET |
+---------------------------------+-------------+
| buffer_pages_written            |          33 |
| buffer_data_written             |      546304 |
| buffer_page_written_index_leaf  |          30 |
| buffer_page_written_undo_log    |           1 |
| buffer_page_written_system_page |           1 |
| buffer_page_written_trx_system  |           1 |
| os_log_bytes_written            |        4608 |
| innodb_dblwr_pages_written      |          33 |
+---------------------------------+-------------+
8 rows in set (0.00 sec)

Now the undo log is flushed only once and we are down to approximately two page writes per update. This is what I was originally expecting. The other results fall well into place if you keep in mind that only the index_leaf writes are compressed. The InnoDBCmp4k results for the Wikipedia dataset are higher, essentially because the updates took much more time and thus more page flushing occurred.

What did we learn?

Everything can be a pretext to explore and learn. Just to summarize, what have we learned in this post?

  • The InnoDB log files contain both the compressed and the uncompressed versions of the pages by default (see innodb_log_compressed_pages)
  • The double write buffer only writes full pages, compressed pages are zero padded
  • With InnoDB, the total amount of data written to disk during the inserts is more than 5 times the final dataset size. Compression worsens that ratio.
  • A single row update causes from two up to eight pages to be written to disk

Not bad in terms of collateral learning…

Next?

In this post, we reviewed the traditional data compression solutions available with MySQL. In future posts, we’ll start looking at the alternatives. In the next one, I will evaluate InnoDB Transparent page compression with punch holes, a feature available since MySQL 5.7.

Jul 12, 2018

Why MySQL Stored Procedures, Functions and Triggers Are Bad For Performance


MySQL stored procedures, functions and triggers are tempting constructs for application developers. However, as I discovered, there can be an impact on database performance when using MySQL stored routines. Not being entirely sure of what I was seeing during a customer visit, I set out to create some simple tests to measure the impact of triggers on database performance. The outcome might surprise you.

Why stored routines are not optimal performance-wise: the short version

Recently, I worked with a customer to profile the performance of triggers and stored routines. Here is what I learned about stored routines: “dead” code (code in a branch that will never run) can still significantly slow down the response time of a function, procedure or trigger. We need to be careful to clean up what we do not need.

Profiling MySQL stored functions

Let’s compare these four simple stored functions (in MySQL 5.7):

Function 1:

CREATE DEFINER=`root`@`localhost` FUNCTION `func1`() RETURNS int(11)
BEGIN
	declare r int default 0;
RETURN r;
END

This function simply declares a variable and returns it. It is a dummy function.

Function 2:

CREATE DEFINER=`root`@`localhost` FUNCTION `func2`() RETURNS int(11)
BEGIN
    declare r int default 0;
    IF 1=2
    THEN
		select levenshtein_limit_n('test finc', 'test func', 1000) into r;
    END IF;
RETURN r;
END

This function calls another function, levenshtein_limit_n (which calculates the Levenshtein distance). But wait: this code will never run, as the condition IF 1=2 will never be true. So it is effectively the same as function 1.

Function 3:

CREATE DEFINER=`root`@`localhost` FUNCTION `func3`() RETURNS int(11)
BEGIN
    declare r int default 0;
    IF 1=2 THEN
		select levenshtein_limit_n('test finc', 'test func', 1) into r;
    END IF;
    IF 2=3 THEN
		select levenshtein_limit_n('test finc', 'test func', 10) into r;
    END IF;
    IF 3=4 THEN
		select levenshtein_limit_n('test finc', 'test func', 100) into r;
    END IF;
    IF 4=5 THEN
		select levenshtein_limit_n('test finc', 'test func', 1000) into r;
    END IF;
RETURN r;
END

Here there are four conditions and none of them will ever be true: there are four calls of “dead” code. The result of calling function 3 will be the same as for function 2 and function 1.

Function 4:

CREATE DEFINER=`root`@`localhost` FUNCTION `func3_nope`() RETURNS int(11)
BEGIN
    declare r int default 0;
    IF 1=2 THEN
		select does_not_exit('test finc', 'test func', 1) into r;
    END IF;
    IF 2=3 THEN
		select does_not_exit('test finc', 'test func', 10) into r;
    END IF;
    IF 3=4 THEN
		select does_not_exit('test finc', 'test func', 100) into r;
    END IF;
    IF 4=5 THEN
		select does_not_exit('test finc', 'test func', 1000) into r;
    END IF;
RETURN r;
END

This is the same as function 3 but the function we are running does not exist. Well, it does not matter, as the select does_not_exit will never run.

So all four functions will always return 0, and we would expect their performance to be the same or very similar. Surprisingly, that is not the case! To measure the performance, I used the BENCHMARK function to run each function 1M times. Here are the results:

+-----------------------------+
| benchmark(1000000, func1()) |
+-----------------------------+
|                           0 |
+-----------------------------+
1 row in set (1.75 sec)
+-----------------------------+
| benchmark(1000000, func2()) |
+-----------------------------+
|                           0 |
+-----------------------------+
1 row in set (2.45 sec)
+-----------------------------+
| benchmark(1000000, func3()) |
+-----------------------------+
|                           0 |
+-----------------------------+
1 row in set (3.85 sec)
+----------------------------------+
| benchmark(1000000, func3_nope()) |
+----------------------------------+
|                                0 |
+----------------------------------+
1 row in set (3.85 sec)

As we can see, func3 (with four dead code calls which will never be executed, but otherwise identical to func1) runs more than twice as slowly as func1(); func3_nope() is identical in terms of response time to func3().

Visualizing all system calls from functions

To figure out what is happening inside the function calls, I used the Performance Schema and the sys schema to create a trace with the ps_trace_thread() procedure:

  1. Get the thread_id for the MySQL connection:
    mysql> select THREAD_ID from performance_schema.threads where processlist_id = connection_id();
    +-----------+
    | THREAD_ID |
    +-----------+
    |        49 |
    +-----------+
    1 row in set (0.00 sec)
  2. Run ps_trace_thread in another connection passing the thread_id=49:
    mysql> CALL sys.ps_trace_thread(49, concat('/var/lib/mysql-files/stack-func1-run1.dot'), 10, 0, TRUE, TRUE, TRUE);
    +--------------------+
    | summary            |
    +--------------------+
    | Disabled 0 threads |
    +--------------------+
    1 row in set (0.00 sec)
    +---------------------------------------------+
    | Info                                        |
    +---------------------------------------------+
    | Data collection starting for THREAD_ID = 49 |
    +---------------------------------------------+
    1 row in set (0.00 sec)
  3. At that point I switched to the original connection (thread_id=49) and run:
    mysql> select func1();
    +---------+
    | func1() |
    +---------+
    |       0 |
    +---------+
    1 row in set (0.00 sec)
  4. The sys.ps_trace_thread procedure collected the data (for 10 seconds, during which I ran the select func1() statement), then it finished its collection and created the dot file:

    +-----------------------------------------------------------------------+
    | Info                                                                  |
    +-----------------------------------------------------------------------+
    | Stack trace written to /var/lib/mysql-files/stack-func3nope-new12.dot |
    +-----------------------------------------------------------------------+
    1 row in set (9.21 sec)
    +-------------------------------------------------------------------------------+
    | Convert to PDF                                                                |
    +-------------------------------------------------------------------------------+
    | dot -Tpdf -o /tmp/stack_49.pdf /var/lib/mysql-files/stack-func3nope-new12.dot |
    +-------------------------------------------------------------------------------+
    1 row in set (9.21 sec)
    +-------------------------------------------------------------------------------+
    | Convert to PNG                                                                |
    +-------------------------------------------------------------------------------+
    | dot -Tpng -o /tmp/stack_49.png /var/lib/mysql-files/stack-func3nope-new12.dot |
    +-------------------------------------------------------------------------------+
    1 row in set (9.21 sec)
    Query OK, 0 rows affected (9.45 sec)

I repeated these steps for all the functions above and then created charts of the commands.

Here are the results:

Func1()

Execution map for func1()

Func2()

Execution map for func2()

Func3()

Execution map for func3()

 

As we can see, there is a sp/jump_if_not call for every “if” check, followed by an “opening tables” statement (which is quite interesting). So parsing the “IF” condition does make a difference.
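If you do not want a full trace, the per-instruction events can also be seen directly in the Performance Schema. This is a sketch, assuming the statement/sp/* instruments and the events_statements_history_long consumer are enabled:

-- Count and time the stored program instructions recorded so far
SELECT event_name,
       COUNT(*) AS executions,
       ROUND(SUM(timer_wait) / 1e9, 2) AS total_ms  -- timer_wait is in picoseconds
FROM performance_schema.events_statements_history_long
WHERE event_name LIKE 'statement/sp/%'
GROUP BY event_name
ORDER BY total_ms DESC;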

For MySQL 8.0, we can also look at the MySQL source code documentation for stored routines, which describes how they are implemented. It reads:

Flow Analysis Optimizations
After code is generated, the low level sp_instr instructions are optimized. The optimization focuses on two areas:

  • Dead code removal,
  • Jump shortcut resolution.
These two optimizations are performed together, as they both are a problem involving flow analysis in the graph that represents the generated code.

The code that implements these optimizations is sp_head::optimize().

However, this does not explain why it executes “opening tables”. I have filed a bug.

When slow functions actually make a difference

Well, if we do not plan to run one million of those stored functions we will never even notice the difference. However, where it will make a difference is … inside a trigger. Let’s say that we have a trigger on a table: every time we update that table it executes a trigger to update another field. Here is an example: let’s say we have a table called “form” and we simply need to update its creation date:

mysql> update form set form_created_date = NOW() where form_id > 5000;
Query OK, 65536 rows affected (0.31 sec)
Rows matched: 65536  Changed: 65536  Warnings: 0

That is good and fast. Now we create a trigger which will call our dummy func1():

CREATE DEFINER=`root`@`localhost` TRIGGER `test`.`form_AFTER_UPDATE`
AFTER UPDATE ON `form`
FOR EACH ROW
BEGIN
	declare r int default 0;
	select func1() into r;
END

Now repeat the update. Remember: it does not change the result of the update as we do not really do anything inside the trigger.

mysql> update form set form_created_date = NOW() where form_id > 5000;
Query OK, 65536 rows affected (0.90 sec)
Rows matched: 65536  Changed: 65536  Warnings: 0

Just adding a trigger that calls our dummy func1() roughly triples the execution time of the update. Even the next trigger, which does not call any function at all, introduces a slowdown:

CREATE DEFINER=`root`@`localhost` TRIGGER `test`.`form_AFTER_UPDATE`
AFTER UPDATE ON `form`
FOR EACH ROW
BEGIN
	declare r int default 0;
END
mysql> update form set form_created_date = NOW() where form_id > 5000;
Query OK, 65536 rows affected (0.52 sec)
Rows matched: 65536  Changed: 65536  Warnings: 0

Now, let’s use func3 (which has “dead” code and is equivalent to func1):

CREATE DEFINER=`root`@`localhost` TRIGGER `test`.`form_AFTER_UPDATE`
AFTER UPDATE ON `form`
FOR EACH ROW
BEGIN
	declare r int default 0;
	select func3() into r;
END
mysql> update form set form_created_date = NOW() where form_id > 5000;
Query OK, 65536 rows affected (1.06 sec)
Rows matched: 65536  Changed: 65536  Warnings: 0

However, inlining the code from func3 into the trigger (instead of calling the function) speeds up the update:

CREATE DEFINER=`root`@`localhost` TRIGGER `test`.`form_AFTER_UPDATE`
AFTER UPDATE ON `form`
FOR EACH ROW
BEGIN
    declare r int default 0;
    IF 1=2 THEN
		select levenshtein_limit_n('test finc', 'test func', 1) into r;
    END IF;
    IF 2=3 THEN
		select levenshtein_limit_n('test finc', 'test func', 10) into r;
    END IF;
    IF 3=4 THEN
		select levenshtein_limit_n('test finc', 'test func', 100) into r;
    END IF;
    IF 4=5 THEN
		select levenshtein_limit_n('test finc', 'test func', 1000) into r;
    END IF;
END
mysql> update form set form_created_date = NOW() where form_id > 5000;
Query OK, 65536 rows affected (0.66 sec)
Rows matched: 65536  Changed: 65536  Warnings: 0

Memory allocation

Even if the code will never run, MySQL still needs to parse the stored routine or trigger code for every execution, which can potentially lead to a memory leak, as described in this bug.

Conclusion

Stored routines and trigger events are parsed when they are executed. Even “dead” code that will never run can significantly affect the performance of bulk operations (e.g. when it runs inside a trigger). That also means that disabling a trigger by setting a “flag” (e.g. if @trigger_disable = 0 then ...) can still affect the performance of bulk operations.


Jun 29, 2018

MySQL 8.0 Hot Rows with NOWAIT and SKIP LOCKED


In MySQL 8.0 there are two new features designed to support lock handling: NOWAIT and SKIP LOCKED. In this post, we’ll look at how MySQL 8.0 handles hot rows. Until now, how have you handled rows that are locked by an active transaction, i.e. hot rows? Most likely, the application attempts to access the data and, if there is a lock on the requested rows, it incurs a timeout and has to retry the transaction. These two new features help you implement more sophisticated lock handling scenarios, allowing you to handle timeouts better and improve the application’s performance.

To demonstrate, I’ll use this product table.

mysql> select @@version;
+-----------+
| @@version |
+-----------+
| 8.0.11    |
+-----------+
1 row in set (0.00 sec)

CREATE TABLE `product` (
`p_id` int(11) NOT NULL AUTO_INCREMENT,
`p_name` varchar(255) DEFAULT NULL,
`p_cost` decimal(19,4) NOT NULL,
`p_availability` enum('YES','NO') DEFAULT 'NO',
PRIMARY KEY (`p_id`),
KEY `p_cost` (`p_cost`),
KEY `p_name` (`p_name`)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Let’s run through an example. The transaction below will lock rows 2 and 3 if they are not already locked. The rows will be released when our transaction does a COMMIT or a ROLLBACK. Autocommit is enabled by default for any transaction and can be disabled either by using the START TRANSACTION clause or by setting autocommit to 0.
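The session below uses START TRANSACTION; for completeness, the alternative form with autocommit disabled would look roughly like this (a minimal sketch):

-- Disable autocommit for the session; locks are then held until COMMIT/ROLLBACK
SET autocommit = 0;
SELECT * FROM mydb.product WHERE p_cost >= 20 AND p_cost <= 30 FOR UPDATE;
-- ... work with the locked rows ...
COMMIT;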

Session 1:

mysql> START TRANSACTION;SELECT * FROM mydb.product WHERE p_cost >=20 and p_cost <=30 FOR UPDATE;
Query OK, 0 rows affected (0.00 sec)
+------+--------+---------+----------------+
| p_id | p_name | p_cost  | p_availability |
+------+--------+---------+----------------+
|    2 | Item2  | 20.0000 | YES            |
|    3 | Item3  | 30.0000 | YES            |
+------+--------+---------+----------------+
2 rows in set (0.00 sec)

InnoDB performs row-level locking in such a way that when it searches or scans a table index, it sets shared or exclusive locks on the index records it encounters. Thus, the row-level locks are actually index-record locks.

We can get the details of a transaction, such as the transaction id and the row lock count, using the InnoDB engine status or by querying the performance_schema.data_locks table. The output of the InnoDB engine status can however be confusing, as we can see below. Our query only locked rows 2 and 3, but the output reports 5 row locks (locked PRIMARY records + locked secondary index (p_cost) records + the supremum pseudo-record). We can see that the row right next to the rows we selected is also reported as locked. This is expected and documented behavior. Since the table is small, with only 5 rows, a full scan of the table is much faster than an index search, and this causes all or most of the rows of the table to end up locked as a result of our query.

InnoDB engine status:

---TRANSACTION 205338, ACTIVE 22 sec
3 lock struct(s), heap size 1136, 5 row lock(s)
MySQL thread id 8, OS thread handle 140220824467200, query id 28 localhost root

performance_schema.data_locks (another new feature in 8.0.1):

mysql> SELECT ENGINE_TRANSACTION_ID,
 CONCAT(OBJECT_SCHEMA, '.',
 OBJECT_NAME)TBL,
 INDEX_NAME,count(*) LOCK_DATA
FROM performance_schema.data_locks
where LOCK_DATA!='supremum pseudo-record'
GROUP BY ENGINE_TRANSACTION_ID,INDEX_NAME,OBJECT_NAME,OBJECT_SCHEMA;
+-----------------------+--------------+------------+-----------+
| ENGINE_TRANSACTION_ID | TBL          | INDEX_NAME | LOCK_DATA |
+-----------------------+--------------+------------+-----------+
|                205338 | mydb.product | p_cost     |         3 |
|                205338 | mydb.product | PRIMARY    |         2 |
+-----------------------+--------------+------------+-----------+
2 rows in set (0.04 sec)

mysql> SELECT ENGINE_TRANSACTION_ID as ENG_TRX_ID,
 object_name,
 index_name,
 lock_type,
 lock_mode,
 lock_data
FROM performance_schema.data_locks WHERE object_name = 'product';
+------------+-------------+------------+-----------+-----------+-------------------------+
| ENG_TRX_ID | object_name | index_name | lock_type | lock_mode | lock_data               |
+------------+-------------+------------+-----------+-----------+-------------------------+
|     205338 | product     | NULL       | TABLE     | IX        | NULL                    |
|     205338 | product     | p_cost     | RECORD    | X         | 0x800000000000140000, 2 |
|     205338 | product     | p_cost     | RECORD    | X         | 0x8000000000001E0000, 3 |
|     205338 | product     | p_cost     | RECORD    | X         | 0x800000000000320000, 5 |
|     205338 | product     | PRIMARY    | RECORD    | X         | 2                       |
|     205338 | product     | PRIMARY    | RECORD    | X         | 3                       |
+------------+-------------+------------+-----------+-----------+-------------------------+
6 rows in set (0.00 sec)

Session 1:

mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)


SELECT FOR UPDATE with innodb_lock_wait_timeout:

The innodb_lock_wait_timeout setting is one mechanism used to handle lock conflicts. The variable has a default value of 50 seconds and causes any transaction that waits for a lock for longer than that to terminate and return a timeout error to the user. The parameter is configurable based on the requirements of the application.
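For example, the timeout can be lowered for a single session without touching the global default; a sketch (the rest of this post keeps the 50 second default):

SET SESSION innodb_lock_wait_timeout = 5;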

Let’s look at how this feature works using an example with a select for update query.

mysql> select @@innodb_lock_wait_timeout;
+----------------------------+
| @@innodb_lock_wait_timeout |
+----------------------------+
|                         50 |
+----------------------------+
1 row in set (0.00 sec)

Session 1:

mysql> START TRANSACTION;SELECT * FROM mydb.product WHERE p_cost >=20 and p_cost <=30 FOR UPDATE;
Query OK, 0 rows affected (0.00 sec)
+------+--------+---------+----------------+
| p_id | p_name | p_cost  | p_availability |
+------+--------+---------+----------------+
|    2 | Item2  | 20.0000 | YES            |
|    3 | Item3  | 30.0000 | YES            |
+------+--------+---------+----------------+
2 rows in set (0.00 sec)

Session 2:

mysql> select now();SELECT * FROM mydb.product WHERE p_id=3 FOR UPDATE;select now();
+---------------------+
| now()               |
+---------------------+
| 2018-06-19 05:29:48 |
+---------------------+
1 row in set (0.00 sec)
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
+---------------------+
| now()               |
+---------------------+
| 2018-06-19 05:30:39 |
+---------------------+
1 row in set (0.00 sec)
mysql>

Autocommit is enabled (by default) and, as expected, the transaction waited for the lock wait timeout and was then aborted with an error.

Session 1:

mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)


NOWAIT:

The NOWAIT clause causes a query to terminate immediately if any candidate row is already locked. Considering the previous example, if the application’s requirement is to not wait for the locks to be released or for a timeout, using the NOWAIT clause is the perfect solution. (Setting innodb_lock_wait_timeout=1 at the session level has a similar effect.)

Session 1:

mysql> START TRANSACTION;SELECT * FROM mydb.product WHERE p_cost >=20 and p_cost <=30 FOR UPDATE;
Query OK, 0 rows affected (0.00 sec)
+------+--------+---------+----------------+
| p_id | p_name | p_cost  | p_availability |
+------+--------+---------+----------------+
|    2 | Item2  | 20.0000 | YES            |
|    3 | Item3  | 30.0000 | YES            |
+------+--------+---------+----------------+
2 rows in set (0.00 sec)

Session 2:

mysql>  SELECT * FROM mydb.product WHERE p_id = 3 FOR UPDATE NOWAIT;
ERROR 3572 (HY000): Statement aborted because lock(s) could not be acquired immediately and NOWAIT is set.
mysql>

Session 1:

mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)


SKIP LOCKED:

The SKIP LOCKED clause asks MySQL to non-deterministically skip over the locked rows and process the remaining rows based on the where clause. Let’s look at how this works using some examples:

Session 1:

mysql> START TRANSACTION;SELECT * FROM mydb.product WHERE p_cost >=20 and p_cost <=30 FOR UPDATE;
Query OK, 0 rows affected (0.00 sec)
+------+--------+---------+----------------+
| p_id | p_name | p_cost  | p_availability |
+------+--------+---------+----------------+
|    2 | Item2  | 20.0000 | YES            |
|    3 | Item3  | 30.0000 | YES            |
+------+--------+---------+----------------+
2 rows in set (0.00 sec)

Session 2:

mysql> SELECT * FROM mydb.product WHERE p_cost = 30 FOR UPDATE SKIP LOCKED;
Empty set (0.00 sec)
mysql>

mysql> SELECT * from mydb.product where p_id IN (1,2,3,4,5) FOR UPDATE SKIP LOCKED;
+------+--------+---------+----------------+
| p_id | p_name | p_cost  | p_availability |
+------+--------+---------+----------------+
|    1 | Item1  | 10.0000 | YES            |
|    5 | Item5  | 50.0000 | YES            |
+------+--------+---------+----------------+
2 rows in set (0.00 sec)

Session 1:

mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)

The first transaction selects rows 2 and 3 for update (i.e. it locks them). The second transaction skips these rows and returns the remaining rows when the SKIP LOCKED clause is used.

Important notes: as the SELECT … FOR UPDATE clause affects concurrency, it should only be used when absolutely necessary. Make sure to index the columns used in the WHERE clause, as SELECT … FOR UPDATE is likely to lock the whole table if proper indexes are not set up. When an index is used, only the candidate rows are locked.
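A quick sanity check is to look at the execution plan of the locking query before relying on it; a sketch using the table from this post:

EXPLAIN SELECT * FROM mydb.product WHERE p_cost = 30 FOR UPDATE;
-- key: p_cost in the output means only the matching index records get locked,
-- while a full table scan would lock essentially every row.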

