Sep
25
2018
--

Why Optimization derived_merge can Break Your Queries

MySQL optimizer bugs

MySQL optimizer bugsLately, I worked on several queries which started returning wrong results after upgrading MySQL Server to version 5.7 The reason for the failure was derived merge optimization which is one of the default

optimizer_switch

  options. Issues were solved, though at the price of performance, when we turned it

OFF

 . But, more importantly, we could not predict if any other query would start returning incorrect data, to allow us to fix the application before it was too late. Therefore I tried to find reasons why

derived_merge

  can fail.

Analyzing the problem

In the first run, we turned SQL Mode

ONLY_FULL_GROUP_BY

on, and this removed most of the problematic queries. That said, few of the queries that were successfully working with

ONLY_FULL_GROUP_BY

  were affected.

A quick search in the MySQL bugs database gave me a not-so-short list of open bugs:

At first glance, the reported queries do not follow any pattern, and we still cannot quickly identify which would break and which would not.

Then I took a second look by running all of the provided test cases in my environment and found that for four bugs, the optimizer rewrote the query. For three of the bugs, it rewrote in both 5.7 and 8.0, and one case it rewrote in 8.0 only.

The remaining three buggy queries (Bug #85117, Bug #91418, Bug #91878) have things in common. Let’s first look at them:

  1. Bug #85117
    select
        temp.sel
    from
        table1 t1
        left join (
            select *,1 as sel from table1_use t1u where t1u.`table1id`=1
        ) temp on temp.table1id = t1.id
    order by t1.value
  2. Bug #91418
    select
        TST.UID ,TST.BID ,TST.THING_NAME ,TST.OTHER_IFO ,vw2.DIST_UID
    from
        TEST_SUB_PROBLEM TST
        join (
            select uuid() as DIST_UID, vw.*
            from (
                select DISTINCT BID, THING_NAME
                from TEST_SUB_PROBLEM
            ) vw
        ) vw2
    on vw2.BID = TST.BID;
  3. Bug #91878
    SELECT
        Virtual_Table.T_FP AS T_FP,
        (
            SELECT COUNT(Virtual_Table.T_FP)
            FROM t1 t
            WHERE t.f1 = Virtual_Table.T_FP
            AND Virtual_Table.T_FP = 731834939448428685
        ) AS Test_Value
    FROM (
        SELECT t.f1 AS T_FP, tv.f1 AS TV_FP
        FROM t1 AS t
        JOIN t2 AS tv
        ON t.f1 = tv.t1_f1
    ) AS Virtual_Table
    GROUP BY
        Virtual_Table.TV_FP
    HAVING
        Test_Value > 0;

Two of the queries use

DISTINCT

  or

GROUP BY

 , one uses

ORDER BY

  clause. The cases do not have not the same clause in common—which is what I’d expect to see—and so, surprisingly, these are not the cause of the failure. However, all three queries use generated values: a constant in the first one;

UUID()

  and

COUNT()

  functions in the second and third respectively. This similarity is something we need to investigate.

To find out why

derived_merge

  might work incorrectly for these queries we need to understand how this optimization works and why it was introduced.

The intent behind derived_merge

First I recommend checking the official MySQL User Reference Manual and MariaDB knowledge base. It is correct to use both manuals: even if low-level implementations vary, the high-level architecture and the purpose of this optimization are the same.

In short:

derived_merge

  is used for queries that have subqueries in the 

FROM

  clause,  also called “derived tables” and practically converts them into

JOIN

 queries. This optimization allows avoiding unnecessary materialization (creating internal temporary tables to hold results). Virtually this is the same thing as a manual rewrite of a query with a subquery into a query that has

JOIN

 clause(s) only. The only difference is that when we rewrite queries manually, we can compare the expected and actual result, then adjust the resulting query if needed. The MySQL optimizer has to do a correct rewrite at the first attempt. And sometimes this effort fails.

Let’s check why this happens for these particular queries, reported in the MySQL Bugs Database.

Case Study 1: a Query from Bug #85117

Original query

select
      temp.sel
from
    table1 t1
    left join (
         select *,1 as sel from table1_use t1u where t1u.`table1id`=1
    ) temp on temp.table1id = t1.id
order by t1.value

was rewritten to:

Note (Code 1003):
/* select#1 */
select 1 AS `sel`
    from
        `test`.`table1` `t1`
    left join
        (`test`.`table1_use` `t1u`)
    on(((`test`.`t1`.`id` = 1) and (`test`.`t1u`.`table1id` = 1)))
    where 1
    order by `test`.`t1`.`value`;

You can always find a query that the optimizer converts the original one to in the

SHOW WARNINGS

 output following

EXPLAIN [EXTENDED]

 for the query.

In this case, the original query asks to return all rows from the table

table1

 , but selects only the generated field from the subquery. The subquery selects the only row with

table1id=1

 .

Avoiding derived merge optimization is practically the same as joining table

table1

 with a table with one row. You can see how it works in this code snippet:

mysql> create temporary table temp as select *,1 as sel from table1_use t1u where t1u.`table1id`=1;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0
mysql> select * from temp;
+----+----------+------+-----+
| id | table1id | uid  | sel |
+----+----------+------+-----+
|  1 |        1 |   99 |   1 |
+----+----------+------+-----+
1 row in set (0.00 sec)
mysql> select temp.sel from table1 t1 left join temp on temp.table1id = t1.id order by t1.value;
+------+
| sel  |
+------+
|    1 |
| NULL |
| NULL |
+------+
3 rows in set (0.00 sec)

However, when the optimizer uses derived-merge optimization, it completely ignores the fact that the resulting table has one row, and that the calculated value would be either

NULL

  or 1 depending if a row corresponding to

table1

  exists in the table. That it prints

select 1 AS `sel`

  in the

EXPLAIN

  output while uses

select NULL AS `sel`

  does not change anything: both are wrong. The correct query without a subquery should look like:

mysql> select if(`test`.`t1u`.`table1id`, 1, NULL) AS `sel`
    -> from `test`.`table1` `t1`
    -> left join (`test`.`table1_use` `t1u`)
    -> on(((`test`.`t1`.`id` = 1) and (`test`.`t1u`.`table1id` = 1)))
    -> where 1
    -> order by `test`.`t1`.`value`;
+------+
| sel  |
+------+
|    1 |
| NULL |
| NULL |
+------+
3 rows in set (0.00 sec)

This report is the easiest of the bugs we will discuss in this post, and is also fixed in MariaDB.

Case Study 2: a Query from Bug #91418

mysql> select * from TEST_SUB_PROBLEM;
+-----+--------+------------+---------------------+
| UID | BID    | THING_NAME | OTHER_IFO           |
+-----+--------+------------+---------------------+
|   1 | thing1 | name1      | look a chicken      |
|   2 | thing1 | name1      | look an airplane    |
|   3 | thing2 | name2      | look a mouse        |
|   4 | thing3 | name3      | look a taperecorder |
|   5 | thing3 | name3      | look an explosion   |
|   6 | thing4 | name4      | look at the stars   |
+-----+--------+------------+---------------------+
6 rows in set (0.00 sec)
mysql> select
    ->     TST.UID ,TST.BID ,TST.THING_NAME ,TST.OTHER_IFO ,vw2.DIST_UID
    -> from
    ->     TEST_SUB_PROBLEM TST
    -> join (
    ->     select uuid() as DIST_UID, vw.*
    ->     from (
    ->         select DISTINCT BID, THING_NAME
    ->         from TEST_SUB_PROBLEM
    ->     ) vw
    -> ) vw2
    -> on vw2.BID = TST.BID;
+-----+--------+------------+---------------------+--------------------------------------+
| UID | BID    | THING_NAME | OTHER_IFO           | DIST_UID                             |
+-----+--------+------------+---------------------+--------------------------------------+
|   1 | thing1 | name1      | look a chicken      | e4c288fd-b29c-11e8-b0d7-0242673a86b2 |
|   2 | thing1 | name1      | look an airplane    | e4c28aef-b29c-11e8-b0d7-0242673a86b2 |
|   3 | thing2 | name2      | look a mouse        | e4c28c47-b29c-11e8-b0d7-0242673a86b2 |
|   4 | thing3 | name3      | look a taperecorder | e4c28d92-b29c-11e8-b0d7-0242673a86b2 |
|   5 | thing3 | name3      | look an explosion   | e4c28ed9-b29c-11e8-b0d7-0242673a86b2 |
|   6 | thing4 | name4      | look at the stars   | e4c29031-b29c-11e8-b0d7-0242673a86b2 |
+-----+--------+------------+---------------------+--------------------------------------+
6 rows in set (0.00 sec)

This query should create a unique

DIST_UID

  for each unique

BID

 name. But, instead, it generates a unique

ID

  for each row.

First, let’s split the query into a couple of queries using temporary tables, to confirm our assumption that it was written correctly in the first place:

mysql> create temporary table vw as select DISTINCT BID, THING_NAME from TEST_SUB_PROBLEM;
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0
mysql> select * from vw;
+--------+------------+
| BID    | THING_NAME |
+--------+------------+
| thing1 | name1      |
| thing2 | name2      |
| thing3 | name3      |
| thing4 | name4      |
+--------+------------+
4 rows in set (0.00 sec)
mysql> create temporary table vw2 as select uuid() as DIST_UID, vw.* from vw;
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0
mysql> select * from vw2;
+--------------------------------------+--------+------------+
| DIST_UID                             | BID    | THING_NAME |
+--------------------------------------+--------+------------+
| eb155f0e-b29d-11e8-b0d7-0242673a86b2 | thing1 | name1      |
| eb158c05-b29d-11e8-b0d7-0242673a86b2 | thing2 | name2      |
| eb159b28-b29d-11e8-b0d7-0242673a86b2 | thing3 | name3      |
| eb15a916-b29d-11e8-b0d7-0242673a86b2 | thing4 | name4      |
+--------------------------------------+--------+------------+
4 rows in set (0.00 sec)
mysql> select
    -> TST.UID ,TST.BID ,TST.THING_NAME ,TST.OTHER_IFO ,vw2.DIST_UID
    -> from TEST_SUB_PROBLEM TST
    -> join vw2
    -> on vw2.BID = TST.BID;
+-----+--------+------------+---------------------+--------------------------------------+
| UID | BID    | THING_NAME | OTHER_IFO           | DIST_UID                             |
+-----+--------+------------+---------------------+--------------------------------------+
|   1 | thing1 | name1      | look a chicken      | eb155f0e-b29d-11e8-b0d7-0242673a86b2 |
|   2 | thing1 | name1      | look an airplane    | eb155f0e-b29d-11e8-b0d7-0242673a86b2 |
|   3 | thing2 | name2      | look a mouse        | eb158c05-b29d-11e8-b0d7-0242673a86b2 |
|   4 | thing3 | name3      | look a taperecorder | eb159b28-b29d-11e8-b0d7-0242673a86b2 |
|   5 | thing3 | name3      | look an explosion   | eb159b28-b29d-11e8-b0d7-0242673a86b2 |
|   6 | thing4 | name4      | look at the stars   | eb15a916-b29d-11e8-b0d7-0242673a86b2 |
+-----+--------+------------+---------------------+--------------------------------------+
6 rows in set (0.01 sec)
mysql> select distinct DIST_UID
    -> from (
    ->     select
    ->         TST.UID ,TST.BID ,TST.THING_NAME ,TST.OTHER_IFO ,vw2.DIST_UID
    ->     from TEST_SUB_PROBLEM TST
    ->     join vw2
    ->     on vw2.BID = TST.BID
    -> ) t;
+--------------------------------------+
| DIST_UID                             |
+--------------------------------------+
| eb155f0e-b29d-11e8-b0d7-0242673a86b2 |
| eb158c05-b29d-11e8-b0d7-0242673a86b2 |
| eb159b28-b29d-11e8-b0d7-0242673a86b2 |
| eb15a916-b29d-11e8-b0d7-0242673a86b2 |
+--------------------------------------+
4 rows in set (0.00 sec)

With temporary tables, we have precisely four unique

DIST_UID

  values unlike the six values that our original query returned.

Let’s check how the original query was rewritten:

Note (Code 1003):
/* select#1 */
select
    `test`.`TST`.`UID` AS `UID`,
    `test`.`TST`.`BID` AS `BID`,
    `test`.`TST`.`THING_NAME` AS `THING_NAME`,
    `test`.`TST`.`OTHER_IFO` AS `OTHER_IFO`,
    uuid() AS `DIST_UID`
from `test`.`TEST_SUB_PROBLEM` `TST`
join
    (/* select#3 */
    select
        distinct `test`.`TEST_SUB_PROBLEM`.`BID` AS `BID`,
        `test`.`TEST_SUB_PROBLEM`.`THING_NAME` AS `THING_NAME`
    from `test`.`TEST_SUB_PROBLEM`) `vw`
where (`vw`.`BID` = `test`.`TST`.`BID`)

You can see that the optimizer did not wholly remove the subquery here. Let’s run this modified query, and run a test with a temporary table one more time:

mysql> select
    ->     `test`.`TST`.`UID` AS `UID`,
    ->     `test`.`TST`.`BID` AS `BID`,
    ->     `test`.`TST`.`THING_NAME` AS `THING_NAME`,
    ->     `test`.`TST`.`OTHER_IFO` AS `OTHER_IFO`,
    ->     uuid() AS `DIST_UID`
    -> from
    ->     `test`.`TEST_SUB_PROBLEM` `TST`
    -> join
    -> (/* select#3 */
    ->     select
    ->         distinct `test`.`TEST_SUB_PROBLEM`.`BID` AS `BID`,
    ->         `test`.`TEST_SUB_PROBLEM`.`THING_NAME` AS `THING_NAME`
    ->     from
    ->         `test`.`TEST_SUB_PROBLEM`
    -> ) `vw`
    -> where (`vw`.`BID` = `test`.`TST`.`BID`)
    -> ;
+-----+--------+------------+---------------------+--------------------------------------+
| UID | BID    | THING_NAME | OTHER_IFO           | DIST_UID                             |
+-----+--------+------------+---------------------+--------------------------------------+
|   1 | thing1 | name1      | look a chicken      | 12c5f554-b29f-11e8-b0d7-0242673a86b2 |
|   2 | thing1 | name1      | look an airplane    | 12c5f73a-b29f-11e8-b0d7-0242673a86b2 |
|   3 | thing2 | name2      | look a mouse        | 12c5f894-b29f-11e8-b0d7-0242673a86b2 |
|   4 | thing3 | name3      | look a taperecorder | 12c5f9de-b29f-11e8-b0d7-0242673a86b2 |
|   5 | thing3 | name3      | look an explosion   | 12c5fb20-b29f-11e8-b0d7-0242673a86b2 |
|   6 | thing4 | name4      | look at the stars   | 12c5fc7d-b29f-11e8-b0d7-0242673a86b2 |
+-----+--------+------------+---------------------+--------------------------------------+
6 rows in set (0.01 sec)

This time the changed query result is no different to the one we received from the original one. Let’s manually replace the subquery with temporary tables, and check if it affects the result again.

mysql> create temporary table vw
    -> select
    ->     distinct `test`.`TEST_SUB_PROBLEM`.`BID` AS `BID`,
    ->     `test`.`TEST_SUB_PROBLEM`.`THING_NAME` AS `THING_NAME`
    -> from `test`.`TEST_SUB_PROBLEM`;
Query OK, 4 rows affected (0.01 sec)<br>Records: 4  Duplicates: 0  Warnings: 0
mysql> select * from vw;
+--------+------------+
| BID    | THING_NAME |
+--------+------------+
| thing1 | name1      |
| thing2 | name2      |
| thing3 | name3      |
| thing4 | name4      |
+--------+------------+
4 rows in set (0.00 sec)
mysql> select
    ->     `test`.`TST`.`UID` AS `UID`,
    ->     `test`.`TST`.`BID` AS `BID`,
    ->     `test`.`TST`.`THING_NAME` AS `THING_NAME`,
    ->     `test`.`TST`.`OTHER_IFO` AS `OTHER_IFO`,
    ->      uuid() AS `DIST_UID`
    -> from `test`.`TEST_SUB_PROBLEM` `TST`
    -> join vw where (`vw`.`BID` = `test`.`TST`.`BID`) ;
+-----+--------+------------+---------------------+--------------------------------------+
| UID | BID    | THING_NAME | OTHER_IFO           | DIST_UID                             |
+-----+--------+------------+---------------------+--------------------------------------+
|   1 | thing1 | name1      | look a chicken      | e11dbe61-b2a0-11e8-b0d7-0242673a86b2 |
|   2 | thing1 | name1      | look an airplane    | e11dc050-b2a0-11e8-b0d7-0242673a86b2 |
|   3 | thing2 | name2      | look a mouse        | e11dc1af-b2a0-11e8-b0d7-0242673a86b2 |
|   4 | thing3 | name3      | look a taperecorder | e11dc2be-b2a0-11e8-b0d7-0242673a86b2 |
|   5 | thing3 | name3      | look an explosion   | e11dc3a8-b2a0-11e8-b0d7-0242673a86b2 |
|   6 | thing4 | name4      | look at the stars   | e11dc4e9-b2a0-11e8-b0d7-0242673a86b2 |
+-----+--------+------------+---------------------+--------------------------------------+
6 rows in set (0.00 sec)

In this case, the temporary table contains the correct number of rows: 4, but the outer query calculates a 

UUID

  value for all rows in the table

TEST_SUB_PROBLEM

 . It does not take into account that the user initially asks for a unique

UUID

  for each unique

BID

  and not each unique

UID

 . Instead, it just moves a call of

UUID()

  function into the outer query, which creates a unique value for each row in the table

TEST_SUB_PROBLEM

 . It does not take into account that the temporary table contains only four rows. In this case, it would not be easy to build an effective query that generates distinct

UUID

  values for rows with different

BID

 ‘s and the same

UUID

  values for rows with the same

BID

 .

Case Study 3: a Query from Bug #91878

This query is supposed to calculate a number of rows based on complex conditions:

SELECT
Virtual_Table.T_FP AS T_FP,
(SELECT COUNT(Virtual_Table.T_FP) FROM t1 t WHERE t.f1 = Virtual_Table.T_FP AND Virtual_Table.T_FP = 731834939448428685) AS Test_Value
FROM
(SELECT t.f1 AS T_FP, tv.f1 AS TV_FP FROM t1 AS t JOIN t2 AS tv ON t.f1 = tv.t1_f1) AS Virtual_Table
GROUP BY Virtual_Table.TV_FP
HAVING Test_Value > 0;

However, it returns no rows when it should return 22 (check the bug report for the full test case).

mysql> SELECT Virtual_Table.T_FP AS T_FP,
    -> (
    ->     SELECT
    ->         COUNT(Virtual_Table.T_FP)
    ->     FROM t1 t
    ->     WHERE
    ->         t.f1 = Virtual_Table.T_FP
    ->     AND
    ->         Virtual_Table.T_FP = 731834939448428685
    -> ) AS Test_Value
    -> FROM (
    ->     SELECT
    ->         t.f1 AS T_FP, tv.f1 AS TV_FP
    ->     FROM t1 AS t
    ->     JOIN t2 AS tv
    ->     ON t.f1 = tv.t1_f1
    -> ) AS Virtual_Table
    -> GROUP BY Virtual_Table.TV_FP
    -> HAVING Test_Value > 0;
Empty set (1.28 sec)

To find out why this happens let’s perform a temporary table check first.

mysql> create temporary table Virtual_Table SELECT t.f1 AS T_FP, tv.f1 AS TV_FP FROM t1 AS t JOIN t2 AS tv ON t.f1 = tv.t1_f1;
Query OK, 18722 rows affected (2.12 sec)
Records: 18722  Duplicates: 0  Warnings: 0
mysql> SELECT Virtual_Table.T_FP AS T_FP,
    -> (SELECT COUNT(Virtual_Table.T_FP) FROM t1 t
    -> WHERE t.f1 = Virtual_Table.T_FP AND Virtual_Table.T_FP = 731834939448428685) AS Test_Value
    -> FROM  Virtual_Table GROUP BY Virtual_Table.TV_FP HAVING Test_Value > 0;
+--------------------+------------+
| T_FP               | Test_Value |
+--------------------+------------+
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
+--------------------+------------+
22 rows in set (1.62 sec)

The rewritten query returned the correct result, as we expected.

To identify why the original query fails, let’s check how the optimizer rewrote it:

Note (Code 1003):
/* select#1 */
select
    `test`.`t`.`f1` AS `T_FP`,
    (/* select#2 */
        select
            count(`test`.`t`.`f1`)
        from
            `test`.`t1` `t`
        where
            (('731834939448428685' = 731834939448428685)
        and (`test`.`t`.`f1` = 731834939448428685))
    ) AS `Test_Value`
    from
        `test`.`t1` `t`
    join
        `test`.`t2` `tv`
    where
        (`test`.`tv`.`t1_f1` = `test`.`t`.`f1`)
    group by `test`.`tv`.`f1`
    having (`Test_Value` > 0)

Interestingly, when I run this query on the original tables it returned all 18722 rows that exist in table

t2

 .

This output means that we cannot entirely rely on the 

EXPLAIN

  output. But still we can see the same symptoms:

  • Subquery uses a function to generate a value
  • Subquery in the
    FROM

      clause is converted into a 

    JOIN

    , and its values are accessible by an outer subquery

We also see that the query has

GROUP BY

  and

HAVING

  clauses, thus adding a complication.

The query is almost correct, but in this case, the optimizer mixed aliases: it uses the same alias in the internal query as in the external one. If you change the alias from

t

  to

t2

  in the subquery, the rewritten query starts returning correct results:

mysql> select
    ->     `test`.`t`.`f1` AS `T_FP`,
    -> (/* select#2 */
    ->     select
    ->         count(`test`.`t`.`f1`)
    ->     from
    ->         `test`.`t1` `t`
    ->     where (
    ->         ('731834939448428685' = 731834939448428685)
    ->     and
    ->         (`test`.`t`.`f1` = 731834939448428685)
    ->     )
    -> ) AS `Test_Value`
    -> from
    ->     `test`.`t1` `t`
    -> join
    ->     `test`.`t2` `tv`
    -> where
    ->     (`test`.`tv`.`t1_f1` = `test`.`t`.`f1`)
    -> group by `test`.`tv`.`f1`
    -> having (`Test_Value` > 0);
...
| 731834939454553991 |          1 |
| 731834939453739998 |          1 |
+--------------------+------------+
18722 rows in set (0.49 sec)
mysql> select
    ->     `test`.`t`.`f1` AS `T_FP`,
    -> (/* select#2 */
    ->     select
    ->         count(`test`.`t`.`f1`)
    ->     from
    ->         `test`.`t1` `t2`
    ->     where (
    ->         (t2.f1=t.f1)
    ->     and
    ->         (`test`.`t`.`f1` = 731834939448428685)
    ->     )
    -> ) AS `Test_Value`
    -> from
    ->     `test`.`t1` `t`
    -> join
    ->     `test`.`t2` `tv`
    -> where
    ->     (`test`.`tv`.`t1_f1` = `test`.`t`.`f1`)
    -> group by `test`.`tv`.`f1`
    -> having (`Test_Value` > 0);
+--------------------+------------+
| T_FP               | Test_Value |
+--------------------+------------+
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
| 731834939448428685 |          1 |
+--------------------+------------+
22 rows in set (1.82 sec)

While the calculated value is not the reason why this query returns incorrect results, it is similar to the previous examples because the optimizer does not take in account that the value of

`test`.`t`.`f1`

  in the outer query is not necessarily equal to 731834939448428685.

Is also interesting that neither Oracle nor PostgreSQL accept such a query, and instead complain of improper use of the 

GROUP BY

 clause. Meanwhile, MySQL accepts this query even with SQL Mode set to

ONLY_FULL_GROUP_BY

 . Reported as bug #92020.

Conclusion and recommendations

While

derived_merge

  is a very effective optimization, it can rewrite queries destructively. Safety measures when using this optimization are:

  1. Make sure that you use the latest version of MySQL/Percona/MariaDB servers which include all of the new bug fixes.
  2. Generated values for the subquery results either constant or returned values of functions is the red flag.
  3. Relaxing SQL Mode
    ONLY_FULL_GROUP_BY

      is always dangerous and should not be used together with

    derived_merge

    .

As a last resort, you can consider rewriting queries to

JOIN

  manually or turning

derived_merge

  optimization

OFF

 .

 

The post Why Optimization derived_merge can Break Your Queries appeared first on Percona Database Performance Blog.

Aug
01
2016
--

Introduction into storage engine troubleshooting: Q & A

storage engine troubleshooting

 storage engine troubleshootingIn this blog, I will provide answers to the Q & A for the “Introduction into storage engine troubleshooting” webinar.

First, I want to thank everybody for attending the July 14 webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: At which isolation level do 

pt-online-schema-change

 and 

pt-archive

  copy data from a table?

A: Both tools do not change the server’s default transaction isolation level. Use either

REPEATABLE READ

 or set it in my

.cnf

.

Q: Can I create an index to optimize a query which has group by A and order by B, both from different tables and A column is from the first table in the two table join?

A: Do you mean a query like

SELECT ... FROM a, b GROUP BY a.A ORDER BY b.B

 ? Yes, this is possible:

mysql> explain select A, B, count(*) from a join b on(a.A=b.id) WHERE b.B < 4 GROUP BY a.A, b.B ORDER BY b.B ASC;
+----+-------------+-------+-------+---------------+------+---------+-----------+------+-----------------------------------------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref       | rows | Extra                                                     |
+----+-------------+-------+-------+---------------+------+---------+-----------+------+-----------------------------------------------------------+
|  1 | SIMPLE      | b     | range | PRIMARY,B     | B    | 5       | NULL      |   15 | Using where; Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | a     | ref   | A             | A    | 5       | test.b.id |    1 | Using index                                               |
+----+-------------+-------+-------+---------------+------+---------+-----------+------+-----------------------------------------------------------+
2 rows in set (0.00 sec)

Q: Where can I find recommendations on what kind of engine to use for different application types or use cases?

A: Storage engines are always being actively developed, therefore I suggest that you don’t search for generic recommendations. These can be outdated just a few weeks after they are written. Study engines instead. For example, just a few years ago MyISAM was the only engine (among those officially supported) that could work with FULLTEXT indexes and SPATIAL columns. Now InnoDB supports both: FULLTEXT indexes since version 5.6 and GIS features in 5.7. Today I can recommend InnoDB as a general-purpose engine for all installations, and TokuDB for write-heavy workloads when you cannot use high-speed disks.

Alternative storage engines can help to realize specific business needs. For example, CONNECT brings data to your server from many sources, SphinxSE talks to the Sphinx daemon, etc.

Other alternative storage engines increase the speed of certain workloads. Memory, for example, can be a good fit for temporary tables.

Q: Can you please explain how we find the full text of the query when we query the view ‘statements_with_full_table_Scans’?

A: Do you mean view in sys schema? Sys schema views take information from

summary_*

 and digests it in Performance Schema, therefore it does not contain full queries (only digests). Full text of the query can be found in the

events_statements_*

  tables in the Performance Schema. Note that even the 

events_statements_history_long

  table can be rewritten very quickly, and you may want to save data from it periodically.

Q: Hi is TokuDB for the new document protocol?

A: As Alex Rubin showed in his detailed blog post, the new document protocol just converts NoSQL queries into SQL, and is thus not limited to any storage engine. To use documents and collections, a storage engine must support generated columns (which TokuDB currently does not). So support of X Protocol for TokuDB is limited to relational tables access.

Q: Please comment on “read committed” versus “repeatable read.”
Q: Repeatable read holds the cursor on the result set for the client versus read committed where the cursor is updated after a transaction.

A:

READ COMMITTED

 and

REPEATABLE READ

 are transaction isolation levels, whose details are explained here.
I would not correlate locks set on table rows in different transaction isolation modes with the result set. A transaction with isolation level

REPEATABLE READ

  instead creates a snapshot of rows that are accessed by the transaction. Let’s consider a table:

mysql> create table ti(id int not null primary key, f1 int) engine=innodb;
Query OK, 0 rows affected (0.56 sec)
mysql> insert into ti values(1,1), (2,2), (3,3), (4,4), (5,5), (6,6), (7,7), (8,8), (9,9);
Query OK, 9 rows affected (0.03 sec)
Records: 9  Duplicates: 0  Warnings: 0

Then start the transaction and select a few rows from this table:

mysql1> begin;
Query OK, 0 rows affected (0.00 sec)
mysql1> select * from ti where id < 5;
+----+------+
| id | f1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    4 |
+----+------+
4 rows in set (0.04 sec)

Now let’s update another set of rows in another transaction:

mysql2> update ti set f1 = id*2 where id > 5;
Query OK, 4 rows affected (0.06 sec)
Rows matched: 4  Changed: 4  Warnings: 0
mysql2> select * from ti;
+----+------+
| id | f1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    4 |
|  5 |    5 |
|  6 |   12 |
|  7 |   14 |
|  8 |   16 |
|  9 |   18 |
+----+------+
9 rows in set (0.00 sec)

You see that the first four rows – which we accessed in the first transaction – were not modified, and last four were modified. If InnoDB only saved the cursor (as someone answered above) we would expect to see the same result if we ran 

SELECT * ...

  query in our old transaction, but it actually shows whole table content before modification:

mysql1> select * from ti;
+----+------+
| id | f1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    4 |
|  5 |    5 |
|  6 |    6 |
|  7 |    7 |
|  8 |    8 |
|  9 |    9 |
+----+------+
9 rows in set (0.00 sec)

So “snapshot”  is a better word than “cursor” for the result set. In the case of

READ COMMITTED

, the first transaction would see modified rows:

mysql1> drop table ti;
Query OK, 0 rows affected (0.11 sec)
mysql1> create table ti(id int not null primary key, f1 int) engine=innodb;
Query OK, 0 rows affected (0.38 sec)
mysql1> insert into ti values(1,1), (2,2), (3,3), (4,4), (5,5), (6,6), (7,7), (8,8), (9,9);
Query OK, 9 rows affected (0.04 sec)
Records: 9  Duplicates: 0  Warnings: 0
mysql1> set transaction isolation level read committed;
Query OK, 0 rows affected (0.00 sec)
mysql1> begin;
Query OK, 0 rows affected (0.00 sec)
mysql1> select * from ti where id < 5;
+----+------+
| id | f1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    4 |
+----+------+
4 rows in set (0.00 sec)

Let’s update all rows in the table this time:

mysql2> update ti set f1 = id*2;
Query OK, 9 rows affected (0.04 sec)
Rows matched: 9  Changed: 9  Warnings: 0

Now the first transaction sees both the modified rows with id >= 5 (not in the initial result set), but also the modified rows with id < 5 (which existed in the initial result set):

mysql1> select * from ti;
+----+------+
| id | f1   |
+----+------+
|  1 |    2 |
|  2 |    4 |
|  3 |    6 |
|  4 |    8 |
|  5 |   10 |
|  6 |   12 |
|  7 |   14 |
|  8 |   16 |
|  9 |   18 |
+----+------+
9 rows in set (0.00 sec)

Feb
27
2013
--

MySQL optimizer: ANALYZE TABLE and Waiting for table flush

The MySQL optimizer makes the decision of what execution plan to use based on the information provided by the storage engines. That information is not accurate in some engines like InnoDB and they are based in statistics calculations therefore sometimes some tune is needed. In InnoDB these statistics are calculated automatically, check the following blog post for more information:

http://www.mysqlperformanceblog.com/2011/10/06/when-does-innodb-update-table-statistics-and-when-it-can-bite/

There are some variables to tune how that statistics are calculated but we need to wait until the gathering process triggers again to see if there is any improvement. Usually the first step to try to get back to the previous execution plan is to force that process with ANALYZE TABLE that is usually fast enough to not cause troubles.

Let’s see an example of how a simple and fast ANALYZE can cause a downtime.

Waiting for table flush:

In order to trigger this problem we need:

– Lot of concurrency
– A long running query
– Run an ANALYZE TABLE on a table accessed by the long running query

So first we need a long running query against table t:

SELECT * FROM t WHERE c > '%c%';

Then in our efforts to get a better execution plan for another query we run ANALYZE TABLE:

mysql> analyze table t;
+--------+---------+----------+----------+
| Table  | Op      | Msg_type | Msg_text |
+--------+---------+----------+----------+
| test.t | analyze | status   | OK       |
+--------+---------+----------+----------+
1 row in set (0.00 sec)

Perfect, very fast! But then some seconds later we realize that our application is down. Let’s see the process list. I’ve removed most of the columns to make it clearer:

+------+-------------------------+---------------------------------+
| Time | State                   | Info                            |
|   32 | Writing to net          | select * from t where c > '%0%' |
|   12 | Waiting for table flush | select * from test.t where i=1  |
|   12 | Waiting for table flush | select * from test.t where i=2  |
|   12 | Waiting for table flush | select * from test.t where i=3  |
|   11 | Waiting for table flush | select * from test.t where i=7  |
|   10 | Waiting for table flush | select * from test.t where i=11 |
|   11 | Waiting for table flush | select * from test.t where i=5  |
|   11 | Waiting for table flush | select * from test.t where i=4  |
|   11 | Waiting for table flush | select * from test.t where i=9  |
|   11 | Waiting for table flush | select * from test.t where i=8  |
|   11 | Waiting for table flush | select * from test.t where i=12 |
|   11 | Waiting for table flush | select * from test.t where i=14 |
|   10 | Waiting for table flush | select * from test.t where i=6  |
|   10 | Waiting for table flush | select * from test.t where i=15 |
|   10 | Waiting for table flush | select * from test.t where i=10 |
[...]

The ANALYZE TABLE runs perfect but after it the rest of the threads that are running a query against that table need to wait. This is because MySQL has detected that the underlying table has changed and it needs to close and reopen it using FLUSH. Therefore the table will be locked until all queries that are using that table finish. There are only two solutions to this situation, wait until the long query finishes or kill the query. Also, we have to take in account that killing a query could cause even more troubles. If we are dealing with a write query on InnoDB the rollback process could take even more time to finish than the original query. On the other hand, if the table is MyISAM there will be no rollback process so all the already updated rows can’t be recovered.

This particular example is not only a problem of ANALYZE. Other commands like FLUSH TABLES, ALTER, RENAME, OPTIMIZE or REPAIR can cause threads to wait on “Waiting for tables”, “Waiting for table” and “Waiting for table flush”.

Conclusion

Before running an ANALYZE table or any other command listed before, check the running queries. If the table that you are going to work on is very used the recommendation is to run it during the low peak of load or a maintenance window.

The post MySQL optimizer: ANALYZE TABLE and Waiting for table flush appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com