Oct
28
2020
--

Say Hello to Libcoredumper – A New Way to Generate Core Dumps, and Other Improvements

Libcoredumper

In a perfect world, we expect all software to run flawlessly and never have problems such as bugs and crashes. We also know that this perfect world doesn’t exist, and we had better be as prepared as possible to troubleshoot those types of situations. Historically, generating core dumps has been a task delegated to the kernel. If you are curious about how to enable it via the Linux kernel, you can check out Getting MySQL Core file on Linux. There are a few drawbacks that pose either a limitation or a huge strain to get it working, such as:

  • System-wide configuration is required. This is not something DBAs always have access to.
  • It is difficult or impossible to enable it for a specific binary only; the standard methods enable it for every piece of software running on the box.
  • Nowadays, with cloud and containers, this task has become even more difficult because it sometimes requires containers to run in privileged mode and the host OS to be properly configured by the provider.

The above issues have driven the exploration of alternative ways to create a core dump to help troubleshoot bugs and crashes. More details can be found in PS-7114.

The Libcoredumper

The libcoredumper is a fork of the archived google-coredumper project. Percona has forked it under Percona-Lab Coredumper, cherry-picked a few improvements from other forks of the same project, and enhanced it to compile and work on newer versions of Linux as well as newer versions of GCC and CLANG.

This project is a Tech Preview, as you may infer from the repository name (Percona Lab). We might not maintain compatibility with future kernel versions and/or patches. One should test the core dumper in their own environment before putting this tool into production. We have tested it on kernel versions up to 5.4.

This functionality is present in all versions of Percona Server for MySQL and Percona XtraDB Cluster starting from 5.7.31 and 8.0.21. If you compile PS/PXC from source, you can control whether the library is compiled by switching -DWITHCOREDUMPER to ON/OFF (default is ON).

How To Configure It

A new variable named coredumper has been introduced. One should include it under the [mysqld] section of my.cnf, and it works independently of the older core-file setting. This new variable can be used either as a boolean (no value specified) or with a value. It follows a few rules (see the example after this list):

  • No value – the core dump will be saved under the MySQL datadir and will be named core.
  • A path ending with / – the core dump will be saved under the specified directory and will be named core.
  • A full path with a filename – the core dump will be saved under the specified directory and will use the specified name.
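
For illustration, here is what the path form might look like in my.cnf. The directory below is an assumption; it must exist and be writable by mysqld:

[mysqld]
# hypothetical directory; core files will be written there as core.<timestamp>
coredumper = /var/lib/mysql-cores/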

Every core file will end with the timestamp of the crash instead of the PID, for two main reasons:

  • It makes it easier to correlate a core dump with a crash, as MySQL always prints a Zulu/UTC timestamp in the log when it crashes:
    10:02:09 UTC - mysqld got signal 11 ;
  • Operators/containers will always be running MySQL (or whatever application they run) as PID 1. If MySQL has crashed multiple times, we don’t want a core dump to get overwritten by the last crash.

How To Know If I Am Using libcoredumper

When MySQL attempts to write a core file, it stamps the log saying so. When it delegates the action to the Linux kernel, you always see a message like the one below:

. . .
Writing a core file

The above behavior remains the same; however, when MySQL is using libcoredumper to generate the core file, one sees a message informing that the library will be responsible for the action:

. . .
Writing a core file using lib coredumper

Other Improvements

Apart from libcoredumper, starting from the same 5.7 and 8.0 releases, the stack trace will also:

  • Print the binary BuildID – This information is very useful for support/development people in case the MySQL binary that crashed is a stripped binary. Stripping is a technique to remove parts of the binary that are not essential for it to run, making the binary occupy less space on disk and in memory. When computers had memory restrictions, this technique was widely used. Nowadays it no longer poses a limitation on most hardware; however, it is becoming popular once again with containers, where image size matters. Stripping the binary removes the symbols table, which is required to resolve a stack trace and read the core dump. The BuildID is how we can link things together again (see the example after this list).
  • Print the server version – This information is also useful to have at a glance. Recent versions of MySQL/Percona Server for MySQL have fixes for many known issues, so having this information helps establish the starting point of an investigation. MySQL only prints the server version when it starts, and by the moment a server crashes, its log may have grown significantly or even been rotated/truncated.
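
For reference, the BuildID can also be read directly from the binary with standard tools; a minimal sketch using readelf (part of binutils):

# print the ELF notes of the binary and filter for the Build ID line
readelf -n /usr/local/mysql/bin/mysqld | grep -i "build id"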

Here is an example of what a crash with a stack trace looks like:

14:23:52 UTC - mysqld got signal 11 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.

Build ID: 55b4b87f230554777d28c6715557ee9538d80115
Server Version: 8.0.21-12-debug Source distribution

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x46000
/usr/local/mysql/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x55) [0x55943894c280]
/usr/local/mysql/bin/mysqld(handle_fatal_signal+0x2e0) [0x559437790768]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x13f40) [0x7f9d413bcf40]
/lib/x86_64-linux-gnu/libc.so.6(__poll+0x49) [0x7f9d40858729]
/usr/local/mysql/bin/mysqld(Mysqld_socket_listener::listen_for_connection_event()+0x64) [0x55943777db6a]
/usr/local/mysql/bin/mysqld(Connection_acceptor<Mysqld_socket_listener>::connection_event_loop()+0x30) [0x55943737266e]
/usr/local/mysql/bin/mysqld(mysqld_main(int, char**)+0x30c6) [0x559437365de1]
/usr/local/mysql/bin/mysqld(main+0x20) [0x559437114005]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7f9d4076db6b]
/usr/local/mysql/bin/mysqld(_start+0x2a) [0x559437113f2a]
Please help us make Percona Server better by reporting any
bugs at https://bugs.percona.com/

You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
Writing a core file using lib coredumper

TL;DR

Libcoredumper serves as an alternative to the current --core-file functionality for generating memory dumps. In case of a MySQL crash, a core dump is written and can later be processed/read via GDB to understand the circumstances of the crash.
Users can enable it by adding the variable below to the [mysqld] section of my.cnf:

[mysqld]
coredumper
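
Once written, the dump can be opened with GDB like any other core file. A sketch, using the binary path from the stack trace above and a hypothetical core file name under the datadir:

# load the core dump against the exact mysqld binary that produced it
gdb /usr/local/mysql/bin/mysqld /var/lib/mysql/core.20201028100209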

Percona Server for MySQL versions starting from 5.7.31 and 8.0.21 include the library by default. Refer to the documentation pages below for more details:

https://www.percona.com/doc/percona-server/5.7/diagnostics/libcoredumper.html

https://www.percona.com/doc/percona-server/5.7/diagnostics/stacktrace.html

Summary

If you have faced any issue or limitation enabling core dumps before, feel free to test the new versions of Percona Server for MySQL/Percona XtraDB Cluster and use libcoredumper. Also, any feedback is very welcome on how we can improve the troubleshooting of bugs/crashes even further.

Feb
22
2019
--

PostgreSQL fsync Failure Fixed – Minor Versions Released Feb 14, 2019

fsync postgresql upgrade

In case you didn’t already see this news, PostgreSQL has got its first minor version release of 2019. This includes minor version updates for all supported PostgreSQL versions. We indicated in our previous blog post that PostgreSQL 9.3 had gone EOL and would not receive any more updates. This release includes the following PostgreSQL major versions:

What’s new in this release?

One of the common fixes applied to all the supported PostgreSQL versions is to panic instead of retrying after an fsync() failure. This fsync failure has been under discussion for a year or two now, so let’s take a look at the implications.

A fix to the Linux fsync issue for PostgreSQL Buffered IO in all supported versions

PostgreSQL performs two types of IO: Direct IO – though almost never – and the much more commonly performed Buffered IO.

PostgreSQL uses O_DIRECT when writing to WALs (Write-Ahead Logs, aka Transaction Logs) only when wal_sync_method is set to open_datasync or open_sync with no archiving or streaming enabled. The default wal_sync_method may be fdatasync, which does not use O_DIRECT. This means that, almost all the time in your production database server, you’ll see PostgreSQL using O_SYNC / O_DSYNC while writing to WALs, whereas writing the modified/dirty buffers from shared buffers to datafiles is always done through Buffered IO. Let’s understand this further.
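
For reference, you can check the relevant settings on a running server; a quick sketch:

-- how WAL writes are synchronized (fdatasync, open_datasync, open_sync, ...)
SHOW wal_sync_method;
-- whether WAL archiving is enabled
SHOW archive_mode;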

Upon checkpoint, dirty buffers in shared buffers are written to the page cache managed by the kernel. Through an fsync(), these modified blocks are applied to disk. If an fsync() call is successful, all dirty pages from the corresponding file are guaranteed to be persisted on the disk. When an fsync() to flush the pages to disk fails, PostgreSQL cannot guarantee a copy of the modified/dirty page. The reason is that writes to storage from the page cache are completely managed by the kernel, and not by PostgreSQL.

This could still be fine if the next fsync retried flushing the dirty page. But, in reality, the data is discarded from the page cache upon an fsync error. And the next fsync would obviously succeed, ignoring the previous errors, because it now includes the next set of dirty buffers that need to be written to disk and not the ones that failed earlier.

To understand it better, consider an example of Linux trying to write dirty pages from the page cache to a USB stick that was removed during an fsync. Neither the ext4 file system nor btrfs nor xfs tries to retry the failed writes. A silently failing fsync may result in data loss, block corruption, tables or indexes out of sync, foreign key or other data integrity issues… and deleted records may reappear.

Until a while ago, when we used local storage or storage using RAID Controllers with write cache, it might not have been a big problem. This issue goes back to the time when PostgreSQL was designed for buffered IO but not Direct IO. Should this now be considered an issue with PostgreSQL and the way it’s designed? Well, not exactly.

All this started with the error handling during a writeback in Linux. A writeback asynchronously performs dirty page writes from the page cache to the filesystem. In ext4-like filesystems, upon a writeback error, the page is marked clean and up to date, and the user space is unaware of the problem.

fsync errors are now detected

Starting from kernel 4.13, we can now reliably detect such errors during fsync. Any open file descriptor to a file includes a pointer to the address_space structure, and a new 32-bit value (errseq_t) has been added that is visible to all the processes accessing that file. With the new minor version for all supported PostgreSQL versions, a PANIC is triggered upon such an error. This causes a database crash and initiates recovery from the last CHECKPOINT. There is a patch expected in PostgreSQL 12 that works with newer kernel versions and modifies the way PostgreSQL handles file descriptors. A long-term solution to this issue may be Direct IO, but you might see a different approach to this in PG 12.

A good amount of work on this issue was done by Jeff Layton on reporting writeback errors, and by Matthew Wilcox. What this patch means is that a writeback error gets reported during an fsync, and it can be seen by another process that opens that file. A new 32-bit value that stores an error code and a sequence number has been added to a new typedef, errseq_t. So, these errors are now in the address_space. But, if the struct inode is gone due to memory pressure, this patch has no value.

Can I enable or disable the PANIC on fsync failure in newer PostgreSQL releases?

Yes. You can leave the parameter data_sync_retry at false (the default), where a PANIC-level error is raised and recovery happens from WAL through a database crash. You must be sure to have a proper high-availability mechanism so that the impact is minimal for your application. You could let your application fail over to a slave, which could minimize the impact.

You can always set data_sync_retry to true if you are sure about how your OS behaves during write-back failures. By setting this to true, PostgreSQL will just report an error and continue to run.
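
For illustration, a minimal sketch of checking and changing this parameter (a server restart is required for the change to take effect):

-- show the current value
SHOW data_sync_retry;
-- persist a new value in postgresql.auto.conf; effective only after a restart
ALTER SYSTEM SET data_sync_retry = on;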

Some of the other possible issues now fixed and common to these minor releases

  1. A lot of features and fixes related to PARTITIONING have been applied in this minor release. (PostgreSQL 10 and 11 only).
  2. Autovacuum has been made more aggressive about removing leftover temporary tables.
  3. Deadlock when acquiring multiple buffer locks.
  4. Crashes in logical replication.
  5. Incorrect planning of queries in which a lateral reference must be evaluated at a foreign table scan.
  6. Fixed some issues reported with ANALYZE and TRUNCATE operations.
  7. Fix to contrib/hstore to calculate correct hash values for empty hstore values that were created in version 8.4 or before.
  8. A fix to pg_dump’s handling of materialized views with indirect dependencies on primary keys.

We always recommend that you keep your PostgreSQL databases updated to the latest minor versions. Applying a minor release might need a restart after updating the new binaries.

Here is the sequence of steps you should follow to upgrade to the latest minor versions after thorough testing (a shell sketch follows the list):

  1. Shut down the PostgreSQL database server
  2. Install the updated binaries
  3. Restart your PostgreSQL database server
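
As a rough sketch on a systemd-based system (the package and service names below are assumptions; adjust them to your distribution and major version):

# stop the PostgreSQL service
sudo systemctl stop postgresql-11
# install the updated minor-version binaries
sudo yum update postgresql11-server
# start the service again
sudo systemctl start postgresql-11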

Most of the time, you can choose to update the minor versions in a rolling fashion in a master-slave (replication) setup because it avoids downtime for both reads and writes simultaneously. For a rolling-style update, you could perform the update on one server after another, but not all at once. However, the best method that we’d almost always recommend is: shutdown, update and restart all instances at once.

If you are currently running your databases on PostgreSQL 9.3.x or earlier, we recommend that you prepare a plan to upgrade your PostgreSQL databases to a supported version ASAP. Please subscribe to our blog posts so that you can hear about the various options for upgrading your PostgreSQL databases to a supported major version.



Sep
25
2018
--

Why Optimization derived_merge can Break Your Queries

MySQL optimizer bugs

Lately, I worked on several queries which started returning wrong results after upgrading MySQL Server to version 5.7. The reason for the failure was the derived_merge optimization, which is one of the default optimizer_switch options. The issues were solved, though at the price of performance, when we turned it OFF. But, more importantly, we could not predict if any other query would start returning incorrect data, in time to fix the application before it was too late. Therefore I tried to find reasons why derived_merge can fail.

Analyzing the problem

In the first run, we turned SQL Mode ONLY_FULL_GROUP_BY on, and this removed most of the problematic queries. That said, a few of the queries that were working successfully with ONLY_FULL_GROUP_BY were still affected.
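
For reference, a minimal sketch of adding ONLY_FULL_GROUP_BY to the global SQL mode (persist it in my.cnf if you want it to survive a restart):

SET GLOBAL sql_mode = CONCAT(@@GLOBAL.sql_mode, ',ONLY_FULL_GROUP_BY');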

A quick search in the MySQL bugs database gave me a not-so-short list of open bugs:

At first glance, the reported queries do not follow any pattern, and we still cannot quickly identify which would break and which would not.

Then I took a second look by running all of the provided test cases in my environment and found that for four of the bugs, the optimizer rewrote the query. For three of the bugs, it rewrote the query in both 5.7 and 8.0, and in one case it rewrote it in 8.0 only.

The remaining three buggy queries (Bug #85117, Bug #91418, Bug #91878) have things in common. Let’s first look at them:

  1. Bug #85117
    select
        temp.sel
    from
        table1 t1
        left join (
            select *,1 as sel from table1_use t1u where t1u.`table1id`=1
        ) temp on temp.table1id = t1.id
    order by t1.value
  2. Bug #91418
    select
        TST.UID ,TST.BID ,TST.THING_NAME ,TST.OTHER_IFO ,vw2.DIST_UID
    from
        TEST_SUB_PROBLEM TST
        join (
            select uuid() as DIST_UID, vw.*
            from (
                select DISTINCT BID, THING_NAME
                from TEST_SUB_PROBLEM
            ) vw
        ) vw2
    on vw2.BID = TST.BID;
  3. Bug #91878
    SELECT
        Virtual_Table.T_FP AS T_FP,
        (
            SELECT COUNT(Virtual_Table.T_FP)
            FROM t1 t
            WHERE t.f1 = Virtual_Table.T_FP
            AND Virtual_Table.T_FP = 731834939448428685
        ) AS Test_Value
    FROM (
        SELECT t.f1 AS T_FP, tv.f1 AS TV_FP
        FROM t1 AS t
        JOIN t2 AS tv
        ON t.f1 = tv.t1_f1
    ) AS Virtual_Table
    GROUP BY
        Virtual_Table.TV_FP
    HAVING
        Test_Value > 0;

Two of the queries use DISTINCT or GROUP BY, and one uses an ORDER BY clause. The cases do not have the same clause in common—which is what I’d expect to see—and so, surprisingly, these are not the cause of the failure. However, all three queries use generated values: a constant in the first one, and the UUID() and COUNT() functions in the second and third respectively. This similarity is something we need to investigate.

To find out why derived_merge might work incorrectly for these queries, we need to understand how this optimization works and why it was introduced.

The intent behind derived_merge

First I recommend checking the official MySQL User Reference Manual and MariaDB knowledge base. It is correct to use both manuals: even if low-level implementations vary, the high-level architecture and the purpose of this optimization are the same.

In short: derived_merge is used for queries that have subqueries in the FROM clause, also called “derived tables”, and practically converts them into JOIN queries. This optimization allows avoiding unnecessary materialization (creating internal temporary tables to hold results). Virtually, this is the same thing as a manual rewrite of a query with a subquery into a query that has JOIN clause(s) only. The only difference is that when we rewrite queries manually, we can compare the expected and actual result, then adjust the resulting query if needed. The MySQL optimizer has to do a correct rewrite at the first attempt. And sometimes this effort fails.

Let’s check why this happens for these particular queries, reported in the MySQL Bugs Database.

Case Study 1: a Query from Bug #85117

Original query

select
      temp.sel
from
    table1 t1
    left join (
         select *,1 as sel from table1_use t1u where t1u.`table1id`=1
    ) temp on temp.table1id = t1.id
order by t1.value

was rewritten to:

Note (Code 1003):
/* select#1 */
select 1 AS `sel`
    from
        `test`.`table1` `t1`
    left join
        (`test`.`table1_use` `t1u`)
    on(((`test`.`t1`.`id` = 1) and (`test`.`t1u`.`table1id` = 1)))
    where 1
    order by `test`.`t1`.`value`;

You can always find the query that the optimizer converts the original one into in the SHOW WARNINGS output following EXPLAIN [EXTENDED] for the query.
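
For example, for the query from Bug #85117 this looks roughly as follows (in 5.7 and later, plain EXPLAIN already produces the extended note; use EXPLAIN EXTENDED on older versions):

mysql> EXPLAIN
    -> select temp.sel
    -> from table1 t1
    -> left join (
    ->     select *,1 as sel from table1_use t1u where t1u.`table1id`=1
    -> ) temp on temp.table1id = t1.id
    -> order by t1.value;
mysql> SHOW WARNINGS\G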

In this case, the original query asks to return all rows from the table table1, but selects only the generated field from the subquery. The subquery selects the only row with table1id=1.

Avoiding derived merge optimization is practically the same as joining table table1 with a table with one row. You can see how it works in this code snippet:

mysql> create temporary table temp as select *,1 as sel from table1_use t1u where t1u.`table1id`=1;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0
mysql> select * from temp;
+----+----------+------+-----+
| id | table1id | uid  | sel |
+----+----------+------+-----+
|  1 |        1 |   99 |   1 |
+----+----------+------+-----+
1 row in set (0.00 sec)
mysql> select temp.sel from table1 t1 left join temp on temp.table1id = t1.id order by t1.value;
+------+
| sel  |
+------+
|    1 |
| NULL |
| NULL |
+------+
3 rows in set (0.00 sec)

However, when the optimizer uses derived merge optimization, it completely ignores the fact that the resulting table has one row, and that the calculated value would be either NULL or 1 depending on whether a row corresponding to table1 exists in the table. That it prints select 1 AS `sel` in the EXPLAIN output while it uses select NULL AS `sel` does not change anything: both are wrong. The correct query without a subquery should look like:

mysql> select if(`test`.`t1u`.`table1id`, 1, NULL) AS `sel`
    -> from `test`.`table1` `t1`
    -> left join (`test`.`table1_use` `t1u`)
    -> on(((`test`.`t1`.`id` = 1) and (`test`.`t1u`.`table1id` = 1)))
    -> where 1
    -> order by `test`.`t1`.`value`;
+------+
| sel  |
+------+
|    1 |
| NULL |
| NULL |
+------+
3 rows in set (0.00 sec)

This report is the easiest of the bugs we will discuss in this post, and is also fixed in MariaDB.

Case Study 2: a Query from Bug #91418

mysql> select * from TEST_SUB_PROBLEM;
+-----+--------+------------+---------------------+
| UID | BID    | THING_NAME | OTHER_IFO           |
+-----+--------+------------+---------------------+
|   1 | thing1 | name1      | look a chicken      |
|   2 | thing1 | name1      | look an airplane    |
|   3 | thing2 | name2      | look a mouse        |
|   4 | thing3 | name3      | look a taperecorder |
|   5 | thing3 | name3      | look an explosion   |
|   6 | thing4 | name4      | look at the stars   |
+-----+--------+------------+---------------------+
6 rows in set (0.00 sec)
mysql> select
    ->     TST.UID ,TST.BID ,TST.THING_NAME ,TST.OTHER_IFO ,vw2.DIST_UID
    -> from
    ->     TEST_SUB_PROBLEM TST
    -> join (
    ->     select uuid() as DIST_UID, vw.*
    ->     from (
    ->         select DISTINCT BID, THING_NAME
    ->         from TEST_SUB_PROBLEM
    ->     ) vw
    -> ) vw2
    -> on vw2.BID = TST.BID;
+-----+--------+------------+---------------------+--------------------------------------+
| UID | BID    | THING_NAME | OTHER_IFO           | DIST_UID                             |
+-----+--------+------------+---------------------+--------------------------------------+
|   1 | thing1 | name1      | look a chicken      | e4c288fd-b29c-11e8-b0d7-0242673a86b2 |
|   2 | thing1 | name1      | look an airplane    | e4c28aef-b29c-11e8-b0d7-0242673a86b2 |
|   3 | thing2 | name2      | look a mouse        | e4c28c47-b29c-11e8-b0d7-0242673a86b2 |
|   4 | thing3 | name3      | look a taperecorder | e4c28d92-b29c-11e8-b0d7-0242673a86b2 |
|   5 | thing3 | name3      | look an explosion   | e4c28ed9-b29c-11e8-b0d7-0242673a86b2 |
|   6 | thing4 | name4      | look at the stars   | e4c29031-b29c-11e8-b0d7-0242673a86b2 |
+-----+--------+------------+---------------------+--------------------------------------+
6 rows in set (0.00 sec)

This query should create a unique DIST_UID for each unique BID name. But, instead, it generates a unique ID for each row.

First, let’s split the query into a couple of queries using temporary tables, to confirm our assumption that it was written correctly in the first place:

mysql> create temporary table vw as select DISTINCT BID, THING_NAME from TEST_SUB_PROBLEM;
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0
mysql> select * from vw;
+--------+------------+
| BID    | THING_NAME |
+--------+------------+
| thing1 | name1      |
| thing2 | name2      |
| thing3 | name3      |
| thing4 | name4      |
+--------+------------+
4 rows in set (0.00 sec)
mysql> create temporary table vw2 as select uuid() as DIST_UID, vw.* from vw;
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0
mysql> select * from vw2;
+--------------------------------------+--------+------------+
| DIST_UID                             | BID    | THING_NAME |
+--------------------------------------+--------+------------+
| eb155f0e-b29d-11e8-b0d7-0242673a86b2 | thing1 | name1      |
| eb158c05-b29d-11e8-b0d7-0242673a86b2 | thing2 | name2      |
| eb159b28-b29d-11e8-b0d7-0242673a86b2 | thing3 | name3      |
| eb15a916-b29d-11e8-b0d7-0242673a86b2 | thing4 | name4      |
+--------------------------------------+--------+------------+
4 rows in set (0.00 sec)
mysql> select
    -> TST.UID ,TST.BID ,TST.THING_NAME ,TST.OTHER_IFO ,vw2.DIST_UID
    -> from TEST_SUB_PROBLEM TST
    -> join vw2
    -> on vw2.BID = TST.BID;
+-----+--------+------------+---------------------+--------------------------------------+
| UID | BID    | THING_NAME | OTHER_IFO           | DIST_UID                             |
+-----+--------+------------+---------------------+--------------------------------------+
|   1 | thing1 | name1      | look a chicken      | eb155f0e-b29d-11e8-b0d7-0242673a86b2 |
|   2 | thing1 | name1      | look an airplane    | eb155f0e-b29d-11e8-b0d7-0242673a86b2 |
|   3 | thing2 | name2      | look a mouse        | eb158c05-b29d-11e8-b0d7-0242673a86b2 |
|   4 | thing3 | name3      | look a taperecorder | eb159b28-b29d-11e8-b0d7-0242673a86b2 |
|   5 | thing3 | name3      | look an explosion   | eb159b28-b29d-11e8-b0d7-0242673a86b2 |
|   6 | thing4 | name4      | look at the stars   | eb15a916-b29d-11e8-b0d7-0242673a86b2 |
+-----+--------+------------+---------------------+--------------------------------------+
6 rows in set (0.01 sec)
mysql> select distinct DIST_UID
    -> from (
    ->     select
    ->         TST.UID ,TST.BID ,TST.THING_NAME ,TST.OTHER_IFO ,vw2.DIST_UID
    ->     from TEST_SUB_PROBLEM TST
    ->     join vw2
    ->     on vw2.BID = TST.BID
    -> ) t;
+--------------------------------------+
| DIST_UID                             |
+--------------------------------------+
| eb155f0e-b29d-11e8-b0d7-0242673a86b2 |
| eb158c05-b29d-11e8-b0d7-0242673a86b2 |
| eb159b28-b29d-11e8-b0d7-0242673a86b2 |
| eb15a916-b29d-11e8-b0d7-0242673a86b2 |
+--------------------------------------+
4 rows in set (0.00 sec)

With temporary tables, we have precisely four unique DIST_UID values, unlike the six values that our original query returned.

Let’s check how the original query was rewritten:

Note (Code 1003):
/* select#1 */
select
    `test`.`TST`.`UID` AS `UID`,
    `test`.`TST`.`BID` AS `BID`,
    `test`.`TST`.`THING_NAME` AS `THING_NAME`,
    `test`.`TST`.`OTHER_IFO` AS `OTHER_IFO`,
    uuid() AS `DIST_UID`
from `test`.`TEST_SUB_PROBLEM` `TST`
join
    (/* select#3 */
    select
        distinct `test`.`TEST_SUB_PROBLEM`.`BID` AS `BID`,
        `test`.`TEST_SUB_PROBLEM`.`THING_NAME` AS `THING_NAME`
    from `test`.`TEST_SUB_PROBLEM`) `vw`
where (`vw`.`BID` = `test`.`TST`.`BID`)

You can see that the optimizer did not wholly remove the subquery here. Let’s run this modified query, and run a test with a temporary table one more time:

mysql> select
    ->     `test`.`TST`.`UID` AS `UID`,
    ->     `test`.`TST`.`BID` AS `BID`,
    ->     `test`.`TST`.`THING_NAME` AS `THING_NAME`,
    ->     `test`.`TST`.`OTHER_IFO` AS `OTHER_IFO`,
    ->     uuid() AS `DIST_UID`
    -> from
    ->     `test`.`TEST_SUB_PROBLEM` `TST`
    -> join
    -> (/* select#3 */
    ->     select
    ->         distinct `test`.`TEST_SUB_PROBLEM`.`BID` AS `BID`,
    ->         `test`.`TEST_SUB_PROBLEM`.`THING_NAME` AS `THING_NAME`
    ->     from
    ->         `test`.`TEST_SUB_PROBLEM`
    -> ) `vw`
    -> where (`vw`.`BID` = `test`.`TST`.`BID`)
    -> ;
+-----+--------+------------+---------------------+--------------------------------------+
| UID | BID    | THING_NAME | OTHER_IFO           | DIST_UID                             |
+-----+--------+------------+---------------------+--------------------------------------+
|   1 | thing1 | name1      | look a chicken      | 12c5f554-b29f-11e8-b0d7-0242673a86b2 |
|   2 | thing1 | name1      | look an airplane    | 12c5f73a-b29f-11e8-b0d7-0242673a86b2 |
|   3 | thing2 | name2      | look a mouse        | 12c5f894-b29f-11e8-b0d7-0242673a86b2 |
|   4 | thing3 | name3      | look a taperecorder | 12c5f9de-b29f-11e8-b0d7-0242673a86b2 |
|   5 | thing3 | name3      | look an explosion   | 12c5fb20-b29f-11e8-b0d7-0242673a86b2 |
|   6 | thing4 | name4      | look at the stars   | 12c5fc7d-b29f-11e8-b0d7-0242673a86b2 |
+-----+--------+------------+---------------------+--------------------------------------+
6 rows in set (0.01 sec)

This time the rewritten query’s result is no different from the one we received from the original query. Let’s manually replace the subquery with temporary tables, and check if it affects the result again.

mysql> create temporary table vw
    -> select
    ->     distinct `test`.`TEST_SUB_PROBLEM`.`BID` AS `BID`,
    ->     `test`.`TEST_SUB_PROBLEM`.`THING_NAME` AS `THING_NAME`
    -> from `test`.`TEST_SUB_PROBLEM`;
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0
mysql> select * from vw;
+--------+------------+
| BID    | THING_NAME |
+--------+------------+
| thing1 | name1      |
| thing2 | name2      |
| thing3 | name3      |
| thing4 | name4      |
+--------+------------+
4 rows in set (0.00 sec)
mysql> select
    ->     `test`.`TST`.`UID` AS `UID`,
    ->     `test`.`TST`.`BID` AS `BID`,
    ->     `test`.`TST`.`THING_NAME` AS `THING_NAME`,
    ->     `test`.`TST`.`OTHER_IFO` AS `OTHER_IFO`,
    ->      uuid() AS `DIST_UID`
    -> from `test`.`TEST_SUB_PROBLEM` `TST`
    -> join vw where (`vw`.`BID` = `test`.`TST`.`BID`) ;
+-----+--------+------------+---------------------+--------------------------------------+
| UID | BID    | THING_NAME | OTHER_IFO           | DIST_UID                             |
+-----+--------+------------+---------------------+--------------------------------------+
|   1 | thing1 | name1      | look a chicken      | e11dbe61-b2a0-11e8-b0d7-0242673a86b2 |
|   2 | thing1 | name1      | look an airplane    | e11dc050-b2a0-11e8-b0d7-0242673a86b2 |
|   3 | thing2 | name2      | look a mouse        | e11dc1af-b2a0-11e8-b0d7-0242673a86b2 |
|   4 | thing3 | name3      | look a taperecorder | e11dc2be-b2a0-11e8-b0d7-0242673a86b2 |
|   5 | thing3 | name3      | look an explosion   | e11dc3a8-b2a0-11e8-b0d7-0242673a86b2 |
|   6 | thing4 | name4      | look at the stars   | e11dc4e9-b2a0-11e8-b0d7-0242673a86b2 |
+-----+--------+------------+---------------------+--------------------------------------+
6 rows in set (0.00 sec)

In this case, the temporary table contains the correct number of rows: 4, but the outer query calculates a UUID value for all rows in the table TEST_SUB_PROBLEM. It does not take into account that the user initially asks for a unique UUID for each unique BID and not for each unique UID. Instead, it just moves the call of the UUID() function into the outer query, which creates a unique value for each row in the table TEST_SUB_PROBLEM. It does not take into account that the temporary table contains only four rows. In this case, it would not be easy to build an effective query that generates distinct UUID values for rows with different BIDs and the same UUID values for rows with the same BID.

Case Study 3: a Query from Bug #91878

This query is supposed to calculate a number of rows based on complex conditions:

SELECT
Virtual_Table.T_FP AS T_FP,
(SELECT COUNT(Virtual_Table.T_FP) FROM t1 t WHERE t.f1 = Virtual_Table.T_FP AND Virtual_Table.T_FP = 731834939448428685) AS Test_Value
FROM
(SELECT t.f1 AS T_FP, tv.f1 AS TV_FP FROM t1 AS t JOIN t2 AS tv ON t.f1 = tv.t1_f1) AS Virtual_Table
GROUP BY Virtual_Table.TV_FP
HAVING Test_Value > 0;

However, it returns no rows when it should return 22 (check the bug report for the full test case).

mysql> SELECT Virtual_Table.T_FP AS T_FP,
    -> (
    ->     SELECT
    ->         COUNT(Virtual_Table.T_FP)
    ->     FROM t1 t
    ->     WHERE
    ->         t.f1 = Virtual_Table.T_FP
    ->     AND
    ->         Virtual_Table.T_FP = 731834939448428685
    -> ) AS Test_Value
    -> FROM (
    ->     SELECT
    ->         t.f1 AS T_FP, tv.f1 AS TV_FP
    ->     FROM t1 AS t
    ->     JOIN t2 AS tv
    ->     ON t.f1 = tv.t1_f1
    -> ) AS Virtual_Table
    -> GROUP BY Virtual_Table.TV_FP
    -> HAVING Test_Value > 0;
Empty set (1.28 sec)

To find out why this happens let’s perform a temporary table check first.

mysql> create temporary table Virtual_Table SELECT t.f1 AS T_FP, tv.f1 AS TV_FP FROM t1 AS t JOIN t2 AS tv ON t.f1 = tv.t1_f1;
Query OK, 18722 rows affected (2.12 sec)
Records: 18722  Duplicates: 0  Warnings: 0
mysql> SELECT Virtual_Table.T_FP AS T_FP,
    -> (SELECT COUNT(Virtual_Table.T_FP) FROM t1 t
    -> WHERE t.f1 = Virtual_Table.T_FP AND Virtual_Table.T_FP = 731834939448428685) AS Test_Value
    -> FROM  Virtual_Table GROUP BY Virtual_Table.TV_FP HAVING Test_Value > 0;
+--------------------+------------+
| T_FP               | Test_Value |
+--------------------+------------+
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
+--------------------+------------+
22 rows in set (1.62 sec)

The rewritten query returned the correct result, as we expected.

To identify why the original query fails, let’s check how the optimizer rewrote it:

Note (Code 1003):
/* select#1 */
select
    `test`.`t`.`f1` AS `T_FP`,
    (/* select#2 */
        select
            count(`test`.`t`.`f1`)
        from
            `test`.`t1` `t`
        where
            (('731834939448428685' = 731834939448428685)
        and (`test`.`t`.`f1` = 731834939448428685))
    ) AS `Test_Value`
    from
        `test`.`t1` `t`
    join
        `test`.`t2` `tv`
    where
        (`test`.`tv`.`t1_f1` = `test`.`t`.`f1`)
    group by `test`.`tv`.`f1`
    having (`Test_Value` > 0)

Interestingly, when I ran this query on the original tables, it returned all 18722 rows that exist in table t2.

This output means that we cannot entirely rely on the EXPLAIN output. But we can still see the same symptoms:

  • The subquery uses a function to generate a value
  • The subquery in the FROM clause is converted into a JOIN, and its values are accessible by the outer query

We also see that the query has GROUP BY and HAVING clauses, thus adding a complication.

The query is almost correct, but in this case, the optimizer mixed aliases: it uses the same alias in the internal query as in the external one. If you change the alias from t to t2 in the subquery, the rewritten query starts returning correct results:

mysql> select
    ->     `test`.`t`.`f1` AS `T_FP`,
    -> (/* select#2 */
    ->     select
    ->         count(`test`.`t`.`f1`)
    ->     from
    ->         `test`.`t1` `t`
    ->     where (
    ->         ('731834939448428685' = 731834939448428685)
    ->     and
    ->         (`test`.`t`.`f1` = 731834939448428685)
    ->     )
    -> ) AS `Test_Value`
    -> from
    ->     `test`.`t1` `t`
    -> join
    ->     `test`.`t2` `tv`
    -> where
    ->     (`test`.`tv`.`t1_f1` = `test`.`t`.`f1`)
    -> group by `test`.`tv`.`f1`
    -> having (`Test_Value` > 0);
...
| 731834939454553991 |          1 |
| 731834939453739998 |          1 |
+--------------------+------------+
18722 rows in set (0.49 sec)
mysql> select
    ->     `test`.`t`.`f1` AS `T_FP`,
    -> (/* select#2 */
    ->     select
    ->         count(`test`.`t`.`f1`)
    ->     from
    ->         `test`.`t1` `t2`
    ->     where (
    ->         (t2.f1=t.f1)
    ->     and
    ->         (`test`.`t`.`f1` = 731834939448428685)
    ->     )
    -> ) AS `Test_Value`
    -> from
    ->     `test`.`t1` `t`
    -> join
    ->     `test`.`t2` `tv`
    -> where
    ->     (`test`.`tv`.`t1_f1` = `test`.`t`.`f1`)
    -> group by `test`.`tv`.`f1`
    -> having (`Test_Value` > 0);
+--------------------+------------+
| T_FP               | Test_Value |
+--------------------+------------+
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
+--------------------+------------+
22 rows in set (1.82 sec)

While the calculated value is not the reason why this query returns incorrect results, it is similar to the previous examples because the optimizer does not take into account that the value of `test`.`t`.`f1` in the outer query is not necessarily equal to 731834939448428685.

It is also interesting that neither Oracle nor PostgreSQL accepts such a query; both complain about improper use of the GROUP BY clause. Meanwhile, MySQL accepts this query even with SQL Mode set to ONLY_FULL_GROUP_BY. Reported as bug #92020.

Conclusion and recommendations

While derived_merge is a very effective optimization, it can rewrite queries destructively. Safety measures when using this optimization are:

  1. Make sure that you use the latest version of MySQL/Percona/MariaDB servers, which include all of the new bug fixes.
  2. Treat generated values in the subquery results, whether constants or values returned by functions, as a red flag.
  3. Relaxing SQL Mode ONLY_FULL_GROUP_BY is always dangerous and should not be used together with derived_merge.

As a last resort, you can consider rewriting queries to use JOIN manually or turning the derived_merge optimization OFF.
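
For reference, a minimal sketch of turning the optimization off (test the performance impact before doing this in production):

-- for the current session only
SET SESSION optimizer_switch = 'derived_merge=off';
-- or server-wide
SET GLOBAL optimizer_switch = 'derived_merge=off';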

 


Apr
27
2018
--

MySQL 8.0 GA: Quality or Not?

MySQL 8.0 GA

What does Anton Ego – a fictional restaurant critic from the Pixar movie Ratatouille – have to do with MySQL 8.0 GA?

When it comes to being a software critic, a lot.

In many ways, the work of a software critic is easy. We risk very little and thrive on negative criticism, which is fun to read and write.

But what about those who give their many hours of code development, and those who have tested such code before release? How about the many people behind the scenes who brought together packaging, documentation, multiple hours of design, marketing, online resources and more?

And all of that, I might add, is open source! Free for the world to take, copy, adapt and even incorporate in full or in part into their own open development.

It is in exactly that area that the team at MySQL shines once again: from their humble beginnings they have built up a colossally powerful database software that handles much of the world’s data, fast.

Used in every area of life – aerospace, defense, education, finances, government, healthcare, pharma, manufacturing, media, retail, telecoms, hospitality, and finally the web – it truly is a community effort.

My little contribution to this effort is first and foremost to say: well done! Well done for such an all-in-all huge endeavor. When I tested MySQL 8.0, I experienced something new: an extraordinarily clean bug report screen when I unleashed our bug hunting rats, ahem, I mean tools. This was somewhat unexpected. Usually, new releases are a fun playground even for seasoned QA engineers who look for the latest toy to break.

I have a suspicion that the team at Oracle either uses newly-improved bug-finding tools or perhaps they included some of our methods and tools in their setup. In either case, it is, was and will be welcome.

When the unexpected occurs, a fight or flight syndrome happens. I tend to be a fighter, so I upped the battle and managed to find about 30 bugs, with 21 bugs logged already. Quite a few of them are Sig 11’s in release builds. Signal 11 exceptions are unexpected crashes, and release builds are the exact same build you would download at dev.mysql.com.

The debug build also had a number of issues, but fewer than expected, leading me to the conclusions drawn above. Since Oracle engineers marked many of the issues logged as security bugs, I didn’t list them here. I’ll give Oracle some time to fix them, but I might add them later.

In summary, my personal recommendation is this: unless you are a funky new web company thriving on the latest technology, give Oracle the opportunity to make a few small point bugfix releases before adopting MySQL 8.0 GA. After that, provided the upgrade prerequisites are met and your software application is compatible, go for it and upgrade.

In the meantime, this is a great time to start checking out the latest and greatest that MySQL 8.0 GA has to offer!

All in all, I like what I saw, and I expect MySQL 8.0 GA to have a bright future.

Signed, a seasoned software critic.


Dec
20
2017
--

Updates to Percona Bug Tracking

Percona Bug Tracking

We’re completing our move of Percona bug tracking into JIRA, and the drop-dead date is December 28, 2017.

For some time now, Percona has maintained both the legacy Launchpad bug tracking system and a JIRA bug tracking system for some of the newer products. The time has come to consolidate everything into the JIRA bug tracking system.

Assuming everything goes according to schedule, on December 28, 2017, we will copy all bug reports in Launchpad into the appropriate JIRA projects (with the appropriate issue state). Each new JIRA issue will link to the original Launchpad issue, and the new JIRA issue link will be added to the original Launchpad issue. Once this is done, we will turn off editing on the Launchpad projects.

Q&A

Which Launchpad projects are affected?
Why are you copying all closed issues from Launchpad?

Copying all Launchpad issues to JIRA makes JIRA the one place to search for previously reported issues, instead of having to search for old issues in Launchpad and new issues in JIRA.

What should I do now to prepare?

Go to https://jira.percona.com/ and create an account.

Thanks for reporting bugs, and post any questions in the comments section.

Feb
08
2017
--

MySQL super_read_only Bugs

super_read_only

In this blog, we describe an issue with MySQL 5.7’s super_read_only feature when used alongside GTID in chained slave instances.

Background

MySQL 5.7.5 and onward introduced the gtid_executed table in the mysql database to store every GTID. This allows slave instances to use the GTID feature regardless of whether the binlog option is set or not. Here is an example of the rows in the gtid_executed table:

mysql> SELECT * FROM mysql.gtid_executed;
+--------------------------------------+----------------+--------------+
| source_uuid                          | interval_start | interval_end |
+--------------------------------------+----------------+--------------+
| 00005730-1111-1111-1111-111111111111 |              1 |            1 |
| 00005730-1111-1111-1111-111111111111 |              2 |            2 |
| 00005730-1111-1111-1111-111111111111 |              3 |            3 |
| 00005730-1111-1111-1111-111111111111 |              4 |            4 |
| 00005730-1111-1111-1111-111111111111 |              5 |            5 |
| 00005730-1111-1111-1111-111111111111 |              6 |            6 |
| 00005730-1111-1111-1111-111111111111 |              7 |            7 |
| 00005730-1111-1111-1111-111111111111 |              8 |            8 |
| 00005730-1111-1111-1111-111111111111 |              9 |            9 |
| 00005730-1111-1111-1111-111111111111 |             10 |           10 |
...

To save space, this table needs to be compressed periodically by replacing the individual GTID rows with a single row that represents the whole interval of identifiers. For example, the above GTIDs can be represented with the following row:

mysql> SELECT * FROM mysql.gtid_executed;
+--------------------------------------+----------------+--------------+
| source_uuid                          | interval_start | interval_end |
+--------------------------------------+----------------+--------------+
| 00005730-1111-1111-1111-111111111111 |              1 |           10 |
...
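
As a side note, how often this compression runs is controlled by the gtid_executed_compression_period variable (1000 transactions by default in MySQL 5.7); you can check it with:

mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_executed_compression_period';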

On the other hand, we have the super_read_only feature: if this option is set to ON, MySQL won’t allow any updates, even from users that have SUPER privileges. It was first implemented in WebScaleSQL and later ported to Percona Server 5.6. MySQL mainstream code implemented a similar feature in version 5.7.8.

The Issue [1]

MySQL’s super_read_only feature won’t allow the compression of the mysql.gtid_executed table. If a high number of transactions run on the master instance, it causes the gtid_executed table to grow to a considerable size. Let’s see an example.

I’m going to use the MySQL Sandbox to quickly set up a Master/Slave configuration, and sysbench to simulate a high number of transactions on the master instance.

First, set up replication using GTID:

make_replication_sandbox --sandbox_base_port=5730 /opt/mysql/5.7.17 --how_many_nodes=1 --gtid

Next, set up the variables for a chained slave instance:

echo "super_read_only=ON" >> node1/my.sandbox.cnf
echo "log_slave_updates=ON" >> node1/my.sandbox.cnf
node1/restart

Now, generate a high number of transactions:

sysbench --test=oltp.lua --mysql-socket=/tmp/mysql_sandbox5730.sock --report-interval=1 --oltp-tables-count=100000 --oltp-table-size=100 --max-time=1800 --oltp-read-only=off --max-requests=0 --num-threads=8 --rand-type=uniform --db-driver=mysql --mysql-user=msandbox --mysql-password=msandbox --mysql-db=test prepare

After running sysbench for a while, we check that the number of rows in the gtid_executed table is increasing quickly:

slave1 [localhost] {msandbox} ((none)) > select count(*) from mysql.gtid_executed ;
+----------+
| count(*) |
+----------+
|   300038 |
+----------+
1 row in set (0.00 sec)

By reviewing SHOW ENGINE INNODB STATUS, we can find a compression thread running and trying to compress the gtid_executed table.

---TRANSACTION 4192571, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
9 lock struct(s), heap size 1136, 1533 row lock(s), undo log entries 1525
MySQL thread id 4, OS thread handle 139671027824384, query id 0 Compressing gtid_executed table

This thread runs and takes ages to complete (or may never complete). It has been reported as #84332.

The Issue [2]

What happens if you have to stop MySQL while the thread compressing the gtid_executed table is running? In this special case, if you run the flush-logs command before or at the same time as mysqladmin shutdown, MySQL will actually stop accepting connections (all new connections hang waiting for the server) and will start to wait for the thread compressing the gtid_executed table to complete its work. Below is an example.

First, execute the flush logs command and obtain ERROR 1290:

$ mysql -h 127.0.0.1 -P 5731 -u msandbox -pmsandbox -e "flush logs ;"
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement

We’ve tried to shut down the instance, but it hangs:

$ mysqladmin -h 127.0.0.1 -P 5731 -u msandbox -pmsandbox shutdown
^CWarning;  Aborted waiting on pid file: 'mysql_sandbox5731.pid' after 175 seconds

This bug has been reported and verified as #84597.

The Workaround

If you already have an established connection to your database with SUPER privileges, you can disable the super_read_only feature dynamically. Once that is done, the pending thread compressing the gtid_executed table completes its work and the shutdown finishes successfully. Below is an example.

We check rows in the gtid_executed table:

$ mysql -h 127.0.0.1 -P 5731 -u msandbox -pmsandbox -e "select count(*) from mysql.gtid_executed ;"
+----------+
| count(*) |
+----------+
|   300038 |
+----------+

We disable the super_read_only feature on an already established connection:

$ mysql> set global super_read_only=OFF ;

We check the rows in the gtid_executed table again, verifying that the compress thread ran successfully.

$ mysql -h 127.0.0.1 -P 5731 -u msandbox -pmsandbox -e "select count(*) from mysql.gtid_executed ;"
+----------+
| count(*) |
+----------+
|        1 |
+----------+

Now we can shutdown the instance without issues:

$ mysqladmin -h 127.0.0.1 -P 5731 -u msandbox -pmsandbox shutdown

You can disable the super_read_only feature before you shut down the instance so that the gtid_executed table can be compressed. If you run into the bug above and don’t have any established connections to your database, the only way to shut down the server is by issuing a kill -9 on the mysqld process.

Summary

As shown in this blog post, some of the mechanics of MySQL 5.7’s super_read_only command are not working as expected. This can prevent some administrative operations, like shutdown, from happening.

If you are using the super_read_only feature on MySQL 5.7.17 or older, including Percona Server 5.7.16 or older (which ports the mainstream implementation, unlike Percona Server 5.6, which ported WebScaleSQL’s super_read_only implementation), don’t use FLUSH LOGS.

Jan
27
2017
--

When MySQL Lies: Wrong seconds_behind_master with slave_parallel_workers > 0

seconds_behind_master

In today’s blog, I will show an issue with seconds_behind_master that one of our clients faced when running slave_parallel_workers > 0. We found out that the seconds_behind_master reported by SHOW SLAVE STATUS was lying. To be more specific, I’m talking about bugs #84415 and #1654091.

The Issue

MySQL will not report the correct slave lag if you have slave_parallel_workers > 0. Let’s show it in practice.

I’ll use MySQL Sandbox to spin up one master and two slaves on MySQL version 5.7.17, and sysbench to populate the database:

# Create sandboxes
make_replication_sandbox /path/to/mysql/5.7.17
# Create table with 1.5M rows on it
sysbench --test=/usr/share/sysbench/tests/db/oltp.lua --mysql-host=localhost --mysql-user=msandbox --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox20192.sock --mysql-db=test --oltp-table-size=1500000 prepare
# Add slave_parallel_workers=5 and slave_pending_jobs_size_max=1G on node1
echo "slave_parallel_workers=5" >> node1/my.sandbox.cnf
echo "slave_pending_jobs_size_max=1G" >> node1/my.sandbox.cnf
node1/restart

Monitor Replication lag via SHOW SLAVE STATUS:

for i in {1..1000};
do
    (
        node1/use -e "SHOW SLAVE STATUS\G" | grep "Seconds_Behind_Master" | awk '{print "Node1: " $2}' &
        sleep 0.1 ;
        node2/use -e "SHOW SLAVE STATUS\G" | grep "Seconds_Behind_Master" | awk '{print "Node2: " $2}' &
    );
    sleep 1;
done

On a separate terminal, DELETE some rows in the test.sbtest1 table on the master, and monitor the above output once the master completes the delete command:

DELETE FROM test.sbtest1 WHERE id > 100;

Here is a sample output:

master [localhost] {msandbox} (test) > DELETE FROM test.sbtest1 WHERE id > 100;
Query OK, 1499900 rows affected (46.42 sec)
. . .
Node1: 0
Node2: 0
Node1: 0
Node2: 48
Node1: 0
Node2: 48
Node1: 0
Node2: 49
Node1: 0
Node2: 50
. . .
Node1: 0
Node2: 90
Node1: 0
Node2: 91
Node1: 0
Node2: 0
Node1: 0
Node2: 0

As you can see, node1 (which is running with slave_parallel_workers = 5) doesn’t report any lag.

The Workaround

We can work around this issue by querying performance_schema.threads:

SELECT PROCESSLIST_TIME FROM performance_schema.threads WHERE NAME = 'thread/sql/slave_worker' AND (PROCESSLIST_STATE IS NULL  or PROCESSLIST_STATE != 'Waiting for an event from Coordinator') ORDER BY PROCESSLIST_TIME DESC LIMIT 1;

Let’s modify our monitoring script, and use the above query to monitor the lag on node1:

for i in {1..1000};
do
    (
        node1/use -BNe "SELECT PROCESSLIST_TIME FROM performance_schema.threads WHERE NAME = 'thread/sql/slave_worker' AND (PROCESSLIST_STATE IS NULL  or PROCESSLIST_STATE != 'Waiting for an event from Coordinator') ORDER BY PROCESSLIST_TIME DESC LIMIT 1 INTO @delay; SELECT IFNULL(@delay, 0) AS 'lag';" | awk '{print "Node1: " $1}' &
        sleep 0.1 ;
        node2/use -e "SHOW SLAVE STATUS\G" | grep "Seconds_Behind_Master" | awk '{print "Node2: " $2}' &
    );
    sleep 1;
done

master [localhost] {msandbox} (test) > DELETE FROM test.sbtest1 WHERE id > 100;
Query OK, 1499900 rows affected (45.21 sec)
Node1: 0
Node2: 0
Node1: 0
Node2: 0
Node1: 45
Node2: 45
Node1: 46
Node2: 46
. . .
Node1: 77
Node2: 77
Node1: 78
Node2: 79
Node1: 0
Node2: 80
Node1: 0
Node2: 81
Node1: 0
Node2: 0
Node1: 0
Node2: 0

Please note that in our query against performance_schema.threads, we are filtering PROCESSLIST_STATE for "NULL" and "!= Waiting for an event from Coordinator". The correct state is "Executing Event", but it seems like it doesn’t correctly report that state (#84655).

Summary

MySQL parallel replication is a nice feature, but we still need to make sure we are aware of any potential issues it might bring. Most monitoring systems use the output of SHOW SLAVE STATUS to verify whether or not the slave is lagging behind the master. As shown above, it has its caveats.

As always, we should test, test and test again before implementing any change like this in production!

May
11
2015
--

How Percona Support handles bugs

One of the great values of a Percona Support contract is that we provide bug fixes for covered software, and not just support in terms of advice on how to use it. This is the skill most likely to be missing in-house for most customers, as it requires a team with code knowledge plus build and test infrastructure – something only a few companies can afford to invest in.

There is a lot of misunderstanding about bugs. What is a bug? What is a feature? What is a repeatable bug? How will Percona troubleshoot the bug? In this post I will answer some of the questions about this.

Bugs vs. Features ? One thing a lot of people have a hard time understanding is the difference between a bug and a feature, or when software was designed to work a certain way which might be unwelcome. There is a gray line here, but you need to expect that some of the things you consider to be bugs will be seen as behavior-change features and will be considered as such.

Unfixable Bugs? There are some behaviors that any sane person would call a bug, but which arise from design limitations or oversights that are impossible to fix in the current GA version without introducing changes that would destabilize it. Such bugs will need to be fixed in the next major GA release, or sometimes even further in the future. Some bugs are not bugs at all, but rather design tradeoffs that were made. These can’t be “fixed” unless different design tradeoffs are chosen.

Workaround ? There are going to be unexpected behaviors, unfixable bugs and bugs that take awhile to fix, so your first practical response to running into the bug is often finding a workaround which does not expose it. The Percona Support team will help find a workaround that causes minimal impact to your business, but be prepared: changes to the application, deployed version, schema or configuration will often be required.

Emergencies ? When you have an emergency, our focus is to restore the system to working order. In a complex system a bug fix can often not be delivered in a short period of time, which typically means finding a workaround.

Bug Turnaround: It is not possible to guarantee the turnaround on a bug fix, as all bugs are different. Some are rather trivial, and we might be able to provide a hotfix 24 hours after we have a repeatable test case. Others are complicated and take weeks of engineering to fix, or might even be impossible to fix in the current GA version.

Verified Bug Fixes: When you submit a bug, we first have to verify that it actually is a bug. In many cases it turns out to be intended behavior; in others, a user mistake. It is also possible that the behavior happened once and can't be repeated. Having a repeatable test case that reveals the bug is the best way to get it fixed quickly. You might be able to create the test case yourself, or our support team might be able to help you create it.

Sporadic Bugs: These are very hard bugs that happen sporadically over a period of time. For example, you might have a system crash once every three months with no way to repeat it. The cause of such bugs can be very complicated; for example, a buffer overrun in one piece of code can cause corruption and a crash somewhere else hours later. A number of diagnostic tools exist for such bugs, but they generally take quite a while to resolve. In addition, without a repeatable test case, it is often impossible to verify that a proposed fix actually resolves the bug.

Environmental Bugs: Some bugs are caused by what can be called your environment: hardware bugs or incompatibilities, a build not quite compatible with your operating system version, operating system bugs, etc. In some cases we can very clearly point to an environment problem. In others we may only suspect the environment and ask you to check whether the bug also happens in a different environment, such as different hardware or a different OS installation.

Hot Fixes: Our default policy is to fix bugs in the next release of our software so the fix can go through the full QA cycle, be properly documented, etc. If you have implemented a workaround and can wait until the next release, this is the best choice. If not, with the Percona Platinum Support contract we can provide you with a hotfix: a special build containing the version of the software you're running with only the bug fix of interest applied. Hotfixes are especially helpful if you're not ready for a full software upgrade, which may jump several revisions, but want to validate the fix with the minimum possible changes. Hotfixes might also differ from the final bug fix that goes into the GA release; with hotfixes, our goal is to provide a working solution for you faster. Afterward we may optimize or re-architect the code, come up with better option names, and so on.

Bug Diagnostics: Depending on the nature of the bug, there are multiple tools our support team will use for diagnostics, that is, for finding a way to fix the bug. To set expectations correctly, this can be a very involved process in which you might need to provide a lot of information or try things on your system, such as:

  • Test case. If you have a test case that can be repeated by the Percona team to trigger the bug, the diagnostic problem is solved from the customer side. Internal debugging starts at this point. Getting to such a test case, however, might not be easy.
  • If we have a crash that we can't repeat on our system, we will often ask you to enable a core file, or to run the program under a debugger, so we can get more information when the crash happens (see the sketch after this list).
  • If the problem is related to performance, you should be ready to gather both MySQL-level information, such as EXPLAIN output, status counters, and data from performance_schema, and system-level information, such as pt-pmp output, pt-stalk collections, oprofile, perf, etc.
  • If the problem is a "deadlock," we often need information from gdb about the full state of the system. Information from the processlist, performance_schema, and SHOW ENGINE INNODB STATUS can also be helpful.
  • It is very helpful to have a test system on which you can repeat the problem and experiment without impacting production. This is not possible in all cases, but it helps a great deal.
  • Sometimes, for hard-to-repeat bugs, we will need to run a special diagnostics build that provides us with additional debug information. In other cases, we might need to run a debug build or run under Valgrind or other software designed to catch bugs. These options often have a large performance impact, so it is good to see how your workload can be scaled down to make this feasible.
  • Depending on your environment, we might need to log in to troubleshoot your bug, or we might request that you upload the data needed to repeat the bug in our lab (assuming it is not too sensitive). In cases where direct login is not possible, we can help you arrive at a repeatable test case via phone, chat, or email. Screen sharing can also be very helpful.
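
As a rough illustration of what the first two bullets can involve in practice, here is a minimal sketch of letting mysqld write a core file on Linux and of grabbing stack traces from a live server. The exact steps are assumptions and depend on your OS, init system, and MySQL version.

# Let mysqld write core files of unlimited size, and keep the core file name simple
ulimit -c unlimited
sudo sysctl -w kernel.core_pattern=core

# In my.cnf, ask the server itself to dump core on a crash:
#   [mysqld]
#   core-file

# For hangs or "deadlocks", aggregate stack traces from the running server
# (pt-pmp attaches to mysqld by default and requires gdb to be installed)
pt-pmp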

Bugs and Non-Percona Software: Percona Support covers some software not produced by Percona. For open source software, if it is not exempt from bug fix support, we will provide a custom build with the bug fix and also submit the suggested fix to the software maintainer for possible inclusion in the next release. For example, if we find a bug in MySQL Community Edition, we will pass our suggested fix to the MySQL Engineering team at Oracle. For software that is not open source, such as Amazon RDS, we can help facilitate the creation and submission of a repeatable test case and workaround, but we can't provide a fix because we do not have access to the source code.

In Conclusion: When I think about software bugs, I find good parallels with human "bugs" (diseases). Some issues are trivial to diagnose and the fix is obvious. Others are very hard to diagnose; many of us have been in a situation where you visit doctor after doctor, describe your symptoms, and they run tests but still can't figure out what's wrong. Even once a diagnosis is made, a "fix" is not always available or feasible, and while a complete cure is preferred, sometimes we have to settle for "managing" the disease, which is our parallel to implementing changes and settling for a workaround. So, just like doctors, we can't guarantee we will get to the root of every problem, or that we will be able to fix every one we do find. However, as with having good doctors, having us on your team maximizes the chance of a successful resolution.

The post How Percona Support handles bugs appeared first on MySQL Performance Blog.

Feb
11
2014
--

WITHer Recursive Queries?

Over the past few years, we’ve seen MySQL technology advance in leaps and bounds, especially when it comes to scalability. But by focusing on the internals of the storage engine for so long, MySQL has fallen behind regarding support for advanced SQL features.

SQLite, another popular open-source SQL database, just released version 3.8.3, including support for recursive SQL queries using the WITH RECURSIVE syntax, in compliance with SQL:1999.

Why is this significant? It means that MySQL is now the only widely-used SQL implementation that does not support recursive queries. Fifteen years after it was defined in the SQL standard, almost every other SQL database of note now supports this feature.

Only Informix among common RDBMS brands lacks support for WITH RECURSIVE, though Informix still supports recursive queries with the non-standard CONNECT BY syntax.

Support for common table expressions using the WITH syntax has been requested of MySQL for a long time; there are long-standing feature requests for it at bugs.mysql.com.

CTE-style queries would allow us to share advanced SQL with users of other database brands, using standard syntax instead of proprietary functions or tricks.

The most common example of a query solved with a recursive CTE is to query a tree of unknown depth. But there are quite a few other useful applications of this form of query, all the way up to fancy stunts like a query that generates a fractal design like the Mandelbrot Set. Recursion is powerful.
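
To illustrate, here is what such a tree query looks like in standard SQL. This is only a sketch: the categories table and its columns are hypothetical, and MySQL does not currently accept this syntax, while the databases mentioned above do.

-- Walk a tree of unknown depth: all descendants of category 1.
-- Hypothetical table: categories(id INT, parent_id INT, name VARCHAR(100))
WITH RECURSIVE subtree AS (
    SELECT id, parent_id, name, 1 AS depth
    FROM categories
    WHERE id = 1                      -- the root of the subtree we want
  UNION ALL
    SELECT c.id, c.parent_id, c.name, s.depth + 1
    FROM categories AS c
    JOIN subtree AS s ON c.parent_id = s.id
)
SELECT id, parent_id, name, depth
FROM subtree
ORDER BY depth, id;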

Is it time for the MySQL community to raise the priority of the CTE feature requests for MySQL? Visit the feature requests mentioned above at bugs.mysql.com, and add your voice by clicking the Affects Me button.

The post WITHer Recursive Queries? appeared first on MySQL Performance Blog.

Nov
01
2013
--

Quick link to Percona Toolkit bugs

In this post I wanted to highlight percona.com/bugs/<tool>, for example: percona.com/bugs/pt-stalk. This only works for Percona Toolkit. We often advise people to check a tool's current bugs, but we don't always say how.

The official link for Percona Toolkit bugs on Launchpad is https://bugs.launchpad.net/percona-toolkit/+bugs, but then you still have to search from there. So this quick link is a lot easier if, for example, pt-stalk fails to parse df with NFS and you wonder if that's a known bug (yes, it is). Please never hesitate to file bugs. We're always eager to improve our software.

 

The post Quick link to Percona Toolkit bugs appeared first on MySQL Performance Blog.
