![MySQL Query Tuning for Dev[Op]s webinar](https://www.percona.com/blog/wp-content/uploads/2021/12/MySQL-Query-Tuning-for-DevOps-webinar-200x112.png)
First, I want to thank everyone who attended my November 9, 2021 webinar “Introduction to MySQL Query Tuning for Dev[Op]s“. Recording and slides are available on our webinars page.
Here are answers to the questions from participants which I was not able to provide during the webinar due to a technical issue we experienced at the end.
Q: If a large/complex SELECT is run from inside a Stored Procedure, will it use/not use the same optimizations as it would if run as a direct query? Is there anything to bear in mind when writing Stored Procedures that need to run large/complex SELECTs from within them?
Just wanted to clarify, using a stored procedure does not utilize any of the indexes on the corresponding tables?
A: Any query inside a stored procedure will be optimized the same way as if it was called outside of the routine. We cannot run
EXPLAIN
on stored procedures but we can prove it with the help of Performance Schema.
For example, let’s take a standard test database employees and perform a quite ineffective query on it:
mysql> SELECT emp_no, first_name, last_name, title, salary FROM employees
-> JOIN titles USING(emp_no) JOIN salaries USING(emp_no)
-> WHERE salary = (SELECT MAX(salary) FROM salaries)
-> OR salary = (SELECT MIN(salary) FROM salaries);
+--------+------------+-----------+------------------+--------+
| emp_no | first_name | last_name | title | salary |
+--------+------------+-----------+------------------+--------+
| 43624 | Tokuyasu | Pesch | Senior Staff | 158220 |
| 43624 | Tokuyasu | Pesch | Staff | 158220 |
| 253406 | Olivera | Baek | Technique Leader | 38623 |
+--------+------------+-----------+------------------+--------+
3 rows in set (4,38 sec)
Then let’s create a stored routine that uses this query:
mysql> \d |
mysql> CREATE PROCEDURE my_test()
-> BEGIN
-> SELECT emp_no, first_name, last_name, title, salary
-> FROM employees JOIN titles USING(emp_no)
-> JOIN salaries USING(emp_no)
-> WHERE salary = (SELECT MAX(salary) FROM salaries)
-> OR salary = (SELECT MIN(salary) FROM salaries);
-> END
-> |
Query OK, 0 rows affected (0,02 sec)
This routine call takes about the same time on my laptop as if I run the query separately:
mysql> CALL my_test();
+--------+------------+-----------+------------------+--------+
| emp_no | first_name | last_name | title | salary |
+--------+------------+-----------+------------------+--------+
| 43624 | Tokuyasu | Pesch | Senior Staff | 158220 |
| 43624 | Tokuyasu | Pesch | Staff | 158220 |
| 253406 | Olivera | Baek | Technique Leader | 38623 |
+--------+------------+-----------+------------------+--------+
3 rows in set (4,35 sec)
Query OK, 0 rows affected (4,35 sec)
If I then enable statement instrumentation in Performance Schema and run the query again I would have the following output:
mysql> USE performance_schema
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> UPDATE setup_instruments SET enabled='yes' WHERE name LIKE 'statement/%';
Query OK, 0 rows affected (0,00 sec)
Rows matched: 213 Changed: 0 Warnings: 0
mysql> UPDATE setup_consumers SET enabled='yes' WHERE name LIKE '%statement%';
Query OK, 1 row affected (0,00 sec)
Rows matched: 4 Changed: 1 Warnings: 0
mysql> USE employees;
Database changed
mysql> TRUNCATE performance_schema.events_statements_history;
Query OK, 0 rows affected (0,01 sec)
mysql> SELECT emp_no, first_name, last_name, title, salary FROM employees
-> JOIN titles USING(emp_no) JOIN salaries USING(emp_no)
-> WHERE salary = (SELECT MAX(salary) FROM salaries)
-> OR salary = (SELECT MIN(salary) FROM salaries);
+--------+------------+-----------+------------------+--------+
| emp_no | first_name | last_name | title | salary |
+--------+------------+-----------+------------------+--------+
| 43624 | Tokuyasu | Pesch | Senior Staff | 158220 |
| 43624 | Tokuyasu | Pesch | Staff | 158220 |
| 253406 | Olivera | Baek | Technique Leader | 38623 |
+--------+------------+-----------+------------------+--------+
3 rows in set (4,41 sec)
mysql> SELECT * FROM performance_schema.events_statements_history
-> WHERE thread_id=(SELECT thread_id FROM performance_schema.threads WHERE processlist_id=@@pseudo_thread_id)
-> AND event_name != 'statement/sql/truncate'\G
*************************** 1. row ***************************
THREAD_ID: 100414
EVENT_ID: 24392721
END_EVENT_ID: 26360924
EVENT_NAME: statement/sql/select
SOURCE: init_net_server_extension.cc:93
TIMER_START: 271361154464092000
TIMER_END: 271365562450546000
TIMER_WAIT: 4407986454000
LOCK_TIME: 325000000
SQL_TEXT: SELECT emp_no, first_name, last_name, title, salary FROM employees JOIN titles USING(emp_no)
JOIN salaries USING(emp_no) WHERE salary = (SELECT MAX(salary) FROM salaries) OR salary = (SELECT MIN(salary) FROM salaries)
DIGEST: 4fe5ea4a8a96539aa9e837031c7c4ea59c1b772272a83c1db5e2833170c2d563
DIGEST_TEXT: SELECT `emp_no` , `first_name` , `last_name` , `title` , `salary` FROM `employees` JOIN `titles` USING ( `emp_no` )
JOIN `salaries` USING ( `emp_no` ) WHERE `salary` = ( SELECT MAX ( `salary` )
FROM `salaries` ) OR `salary` = ( SELECT MIN ( `salary` ) FROM `salaries` )
CURRENT_SCHEMA: employees
OBJECT_TYPE: NULL
OBJECT_SCHEMA: NULL
OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: NULL
MYSQL_ERRNO: 0
RETURNED_SQLSTATE: NULL
MESSAGE_TEXT: NULL
ERRORS: 0
WARNINGS: 0
ROWS_AFFECTED: 0
ROWS_SENT: 3
ROWS_EXAMINED: 11069933
CREATED_TMP_DISK_TABLES: 0
CREATED_TMP_TABLES: 0
SELECT_FULL_JOIN: 0
SELECT_FULL_RANGE_JOIN: 0
SELECT_RANGE: 0
SELECT_RANGE_CHECK: 0
SELECT_SCAN: 3
SORT_MERGE_PASSES: 0
SORT_RANGE: 0
SORT_ROWS: 0
SORT_SCAN: 0
NO_INDEX_USED: 1
NO_GOOD_INDEX_USED: 0
NESTING_EVENT_ID: NULL
NESTING_EVENT_TYPE: NULL
NESTING_EVENT_LEVEL: 0
STATEMENT_ID: 428707
1 row in set (0,00 sec)
The interesting part of the output is the performance fields with non-zero values. Namely,
ROWS_SENT: 3
ROWS_EXAMINED: 11069933
SELECT_SCAN: 3
NO_INDEX_USED: 1
Now let’s call the stored procedure and compare these values.
mysql> TRUNCATE performance_schema.events_statements_history;
Query OK, 0 rows affected (0,00 sec)
mysql> CALL my_test();
+--------+------------+-----------+------------------+--------+
| emp_no | first_name | last_name | title | salary |
+--------+------------+-----------+------------------+--------+
| 43624 | Tokuyasu | Pesch | Senior Staff | 158220 |
| 43624 | Tokuyasu | Pesch | Staff | 158220 |
| 253406 | Olivera | Baek | Technique Leader | 38623 |
+--------+------------+-----------+------------------+--------+
3 rows in set (4,55 sec)
Query OK, 0 rows affected (4,55 sec)
mysql> SELECT * FROM performance_schema.events_statements_history
-> WHERE thread_id=(SELECT thread_id FROM performance_schema.threads WHERE processlist_id=@@pseudo_thread_id)
-> AND event_name != 'statement/sql/truncate'\G
*************************** 1. row ***************************
THREAD_ID: 100414
EVENT_ID: 28336202
END_EVENT_ID: 30292242
EVENT_NAME: statement/sp/stmt
SOURCE: sp_head.cc:2210
TIMER_START: 271549571279270000
TIMER_END: 271554128099300000
TIMER_WAIT: 4556820030000
LOCK_TIME: 139000000
SQL_TEXT: SELECT emp_no, first_name, last_name, title, salary FROM employees JOIN titles USING(emp_no)
JOIN salaries USING(emp_no) WHERE salary = (SELECT MAX(salary) FROM salaries) OR salary = (SELECT MIN(salary) FROM salaries)
DIGEST: NULL
DIGEST_TEXT: NULL
CURRENT_SCHEMA: employees
OBJECT_TYPE: PROCEDURE
OBJECT_SCHEMA: employees
OBJECT_NAME: my_test
OBJECT_INSTANCE_BEGIN: NULL
MYSQL_ERRNO: 0
RETURNED_SQLSTATE: NULL
MESSAGE_TEXT: NULL
ERRORS: 0
WARNINGS: 0
ROWS_AFFECTED: 0
ROWS_SENT: 3
ROWS_EXAMINED: 11069933
CREATED_TMP_DISK_TABLES: 0
CREATED_TMP_TABLES: 0
SELECT_FULL_JOIN: 0
SELECT_FULL_RANGE_JOIN: 0
SELECT_RANGE: 0
SELECT_RANGE_CHECK: 0
SELECT_SCAN: 3
SORT_MERGE_PASSES: 0
SORT_RANGE: 0
SORT_ROWS: 0
SORT_SCAN: 0
NO_INDEX_USED: 1
NO_GOOD_INDEX_USED: 0
NESTING_EVENT_ID: 28336201
NESTING_EVENT_TYPE: STATEMENT
NESTING_EVENT_LEVEL: 1
STATEMENT_ID: 429597
*************************** 2. row ***************************
THREAD_ID: 100414
EVENT_ID: 28336201
END_EVENT_ID: 30292243
EVENT_NAME: statement/sql/call_procedure
SOURCE: init_net_server_extension.cc:93
TIMER_START: 271549571185646000
TIMER_END: 271554128135521000
TIMER_WAIT: 4556949875000
LOCK_TIME: 0
SQL_TEXT: CALL my_test()
DIGEST: 070cb7d4d2eba9690a3a993ec61700fb6d6c3e46166e329408d7284e45fd58c2
DIGEST_TEXT: CALL `my_test` ( )
CURRENT_SCHEMA: employees
OBJECT_TYPE: NULL
OBJECT_SCHEMA: NULL
OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: NULL
MYSQL_ERRNO: 0
RETURNED_SQLSTATE: 00000
MESSAGE_TEXT: NULL
ERRORS: 0
WARNINGS: 0
ROWS_AFFECTED: 0
ROWS_SENT: 0
ROWS_EXAMINED: 0
CREATED_TMP_DISK_TABLES: 0
CREATED_TMP_TABLES: 0
SELECT_FULL_JOIN: 0
SELECT_FULL_RANGE_JOIN: 0
SELECT_RANGE: 0
SELECT_RANGE_CHECK: 0
SELECT_SCAN: 0
SORT_MERGE_PASSES: 0
SORT_RANGE: 0
SORT_ROWS: 0
SORT_SCAN: 0
NO_INDEX_USED: 0
NO_GOOD_INDEX_USED: 0
NESTING_EVENT_ID: NULL
NESTING_EVENT_TYPE: NULL
NESTING_EVENT_LEVEL: 0
STATEMENT_ID: 429596
2 rows in set (0,00 sec)
This time, the select from the performance_schema.events_statements_history returned two rows: one for the
SELECT
statement and one for the
CALL
command. We are interested in the data for the
SELECT
statement. It is:
ROWS_SENT: 3
ROWS_EXAMINED: 11069933
SELECT_SCAN: 3
NO_INDEX_USED: 1
So performance metrics are absolutely the same as for the original query.
OK, this was for the slow, not effective query. But what about the one that uses indexes?
Let’s craft another example. Consider a query:
SELECT COUNT(*), title FROM titles GROUP BY title;
And the stored procedure:
mysql> CREATE PROCEDURE my_second_test()
-> BEGIN
-> SELECT COUNT(*), title FROM titles GROUP BY title;
-> END
-> |
Query OK, 0 rows affected (0,02 sec)
Now let’s call the query and examine data in the Performance Schema:
mysql> TRUNCATE performance_schema.events_statements_history;
Query OK, 0 rows affected (0,00 sec)
mysql> SELECT COUNT(*), title FROM titles GROUP BY title;
+----------+--------------------+
| COUNT(*) | title |
+----------+--------------------+
| 97750 | Senior Engineer |
| 107391 | Staff |
| 115003 | Engineer |
| 92853 | Senior Staff |
| 15128 | Assistant Engineer |
| 15159 | Technique Leader |
| 24 | Manager |
+----------+--------------------+
7 rows in set (0,28 sec)
mysql> SELECT * FROM performance_schema.events_statements_history
-> WHERE thread_id=(SELECT thread_id FROM performance_schema.threads WHERE processlist_id=@@pseudo_thread_id)
-> AND event_name != 'statement/sql/truncate'\G
*************************** 1. row ***************************
THREAD_ID: 100414
EVENT_ID: 30340666
END_EVENT_ID: 30346280
EVENT_NAME: statement/sql/select
SOURCE: init_net_server_extension.cc:93
TIMER_START: 271987423077801000
TIMER_END: 271987697673114000
TIMER_WAIT: 274595313000
LOCK_TIME: 137000000
SQL_TEXT: SELECT COUNT(*), title FROM titles GROUP BY title
DIGEST: 3b63daf99fa8586784dcbc73f82b79e777e4fc074c37de3ebd25753f0d684a46
DIGEST_TEXT: SELECT COUNT ( * ) , `title` FROM `titles` GROUP BY `title`
CURRENT_SCHEMA: employees
OBJECT_TYPE: NULL
OBJECT_SCHEMA: NULL
OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: NULL
MYSQL_ERRNO: 0
RETURNED_SQLSTATE: NULL
MESSAGE_TEXT: NULL
ERRORS: 0
WARNINGS: 0
ROWS_AFFECTED: 0
ROWS_SENT: 7
ROWS_EXAMINED: 443308
CREATED_TMP_DISK_TABLES: 0
CREATED_TMP_TABLES: 1
SELECT_FULL_JOIN: 0
SELECT_FULL_RANGE_JOIN: 0
SELECT_RANGE: 0
SELECT_RANGE_CHECK: 0
SELECT_SCAN: 1
SORT_MERGE_PASSES: 0
SORT_RANGE: 0
SORT_ROWS: 0
SORT_SCAN: 0
NO_INDEX_USED: 0
NO_GOOD_INDEX_USED: 0
NESTING_EVENT_ID: NULL
NESTING_EVENT_TYPE: NULL
NESTING_EVENT_LEVEL: 0
STATEMENT_ID: 431634
1 row in set (0,00 sec)
In this case, the picture is different:
ROWS_SENT: 7
ROWS_EXAMINED: 443308
CREATED_TMP_TABLES: 1
SELECT_SCAN: 1
NO_INDEX_USED: 0
NO_GOOD_INDEX_USED: 0
This is still not a very effective query: it uses an index scan but it certainly uses the index.
Let’s check data for the stored procedure:
mysql> TRUNCATE performance_schema.events_statements_history;
Query OK, 0 rows affected (0,00 sec)
mysql> CALL my_second_test();
+----------+--------------------+
| COUNT(*) | title |
+----------+--------------------+
| 97750 | Senior Engineer |
| 107391 | Staff |
| 115003 | Engineer |
| 92853 | Senior Staff |
| 15128 | Assistant Engineer |
| 15159 | Technique Leader |
| 24 | Manager |
+----------+--------------------+
7 rows in set (0,28 sec)
Query OK, 0 rows affected (0,28 sec)
mysql> SELECT * FROM performance_schema.events_statements_history
-> WHERE thread_id=(SELECT thread_id FROM performance_schema.threads WHERE processlist_id=@@pseudo_thread_id)
-> AND event_name != 'statement/sql/truncate'\G
*************************** 1. row ***************************
THREAD_ID: 100414
EVENT_ID: 30354940
END_EVENT_ID: 30360553
EVENT_NAME: statement/sp/stmt
SOURCE: sp_head.cc:2210
TIMER_START: 272124909174908000
TIMER_END: 272125181823179000
TIMER_WAIT: 272648271000
LOCK_TIME: 222000000
SQL_TEXT: SELECT COUNT(*), title FROM titles GROUP BY title
DIGEST: NULL
DIGEST_TEXT: NULL
CURRENT_SCHEMA: employees
OBJECT_TYPE: PROCEDURE
OBJECT_SCHEMA: employees
OBJECT_NAME: my_second_test
OBJECT_INSTANCE_BEGIN: NULL
MYSQL_ERRNO: 0
RETURNED_SQLSTATE: NULL
MESSAGE_TEXT: NULL
ERRORS: 0
WARNINGS: 0
ROWS_AFFECTED: 0
ROWS_SENT: 7
ROWS_EXAMINED: 443308
CREATED_TMP_DISK_TABLES: 0
CREATED_TMP_TABLES: 1
SELECT_FULL_JOIN: 0
SELECT_FULL_RANGE_JOIN: 0
SELECT_RANGE: 0
SELECT_RANGE_CHECK: 0
SELECT_SCAN: 1
SORT_MERGE_PASSES: 0
SORT_RANGE: 0
SORT_ROWS: 0
SORT_SCAN: 0
NO_INDEX_USED: 0
NO_GOOD_INDEX_USED: 0
NESTING_EVENT_ID: 30354939
NESTING_EVENT_TYPE: STATEMENT
NESTING_EVENT_LEVEL: 1
STATEMENT_ID: 432318
*************************** 2. row ***************************
THREAD_ID: 100414
EVENT_ID: 30354939
END_EVENT_ID: 30360554
EVENT_NAME: statement/sql/call_procedure
<I omitted data for the CALL statement>
Counters again have exactly the same values as for the standalone statement:
ROWS_SENT: 7
ROWS_EXAMINED: 443308
CREATED_TMP_TABLES: 1
SELECT_SCAN: 1
NO_INDEX_USED: 0
NO_GOOD_INDEX_USED: 0
So we have proof that the optimizer creates the same query plan no matter if the query was called inside the stored procedure or not.
Q: A few times when I was building a query for a table with multiple indexes, the EXPLAIN command shows me an index that is not clearly the best choice and I had to use FORCE INDEX in the query. I never understand why sometimes this happens, is it possible that sometimes the engine makes that mistake?
A: The engine can make a mistake for sure. If you want to understand more about such mistakes I recommend you to try
EXPLAIN FORMAT=JSON
as described in these series as well as studying the Optimizer Trace. While the different formats of the
EXPLAIN
do not change the query plan,
EXPLAIN FORMAT=JSON
provides more information on the optimizations that are used to resolve the query.
Q: I have a question about statuses. It’s about the Handler_read you mentioned. It keeps on increasing when queries are being executed. If the server has an uptime of more than 14 days, having 1000qps, the handler will be in the range of millions. Once we do a query tuning workshop, I would like to reset the Handlers counter. How to perform that? (just to see: for the same time range, whether the Handler_read decreases)
A: To reset counters for the session use the
FLUSH STATUS
statement. Then you can run the original statement and examine which job needs to be done inside the engine to resolve it.
Here is the pseudo-code, showing how to do it:
FLUSH STATUS;
<RUN YOUR QUERY>
SHOW STATUS LIKE ‘Handler%’;
Q: Can you talk a bit on explain extended? How can we use that to further help us in tuning queries? Does it show the full query that the server/optimizer will execute in the correct format?
A:
EXPLAIN EXTENDED
is included in the standard output of the
EXPLAIN
command since version 5.7. If you started using MySQL earlier and preferred to run
EXPLAIN
command you would now notice two differences.
First, this is the
“Filtered”
column. This column shows how many rows the optimizer expects to be discarded from the output. For example, for the query:
mysql> EXPLAIN SELECT salary FROM salaries WHERE salary BETWEEN 158000 AND 159000\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: salaries
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 2838426
filtered: 11.11
Extra: Using where
1 row in set, 1 warning (0,00 sec)
It expects that the engine will read 2838426 rows but only 11.11% of them would be used to get the final result. This usually indicates that the query is not effective.
On the other way around, the query:
mysql> EXPLAIN SELECT DISTINCT(title) FROM titles\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: titles
partitions: NULL
type: index
possible_keys: PRIMARY
key: PRIMARY
key_len: 209
ref: NULL
rows: 442189
filtered: 100.00
Extra: Using index; Using temporary
1 row in set, 1 warning (0,00 sec)
Would use all 442189 retrieved rows to create the final result set (filtered: 100.00).
Another feature of the
EXPLAIN EXTENDED
before version 5.7 and regular
EXPLAIN
since version 5.7 is that it returns a query as it was rewritten by the optimizer in its diagnostic area, accessible by the
SHOW WARNINGS
command.
For example, let’s take a query:
SELECT emp_no, first_name, last_name FROM employees
WHERE emp_no IN (SELECT emp_no FROM salaries WHERE salary = (SELECT MAX(salary) FROM salaries));
And then run
EXPLAIN
on it, followed by the
SHOW WARNINGS
:
mysql> EXPLAIN SELECT emp_no, first_name, last_name FROM employees
-> WHERE emp_no IN (SELECT emp_no FROM salaries WHERE salary = (SELECT MAX(salary) FROM salaries))\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: employees
partitions: NULL
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 299113
filtered: 100.00
Extra: NULL
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: <subquery2>
partitions: NULL
type: eq_ref
possible_keys: <auto_distinct_key>
key: <auto_distinct_key>
key_len: 4
ref: employees.employees.emp_no
rows: 1
filtered: 100.00
Extra: NULL
*************************** 3. row ***************************
id: 2
select_type: MATERIALIZED
table: salaries
partitions: NULL
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 2838426
filtered: 10.00
Extra: Using where
*************************** 4. row ***************************
id: 3
select_type: SUBQUERY
table: salaries
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 2838426
filtered: 100.00
Extra: NULL
4 rows in set, 1 warning (0,00 sec)
mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`first_name` AS `first_name`,
`employees`.`employees`.`last_name` AS `last_name` from `employees`.`employees` semi join (`employees`.`salaries`)
where ((`<subquery2>`.`emp_no` = `employees`.`employees`.`emp_no`)
and (`employees`.`salaries`.`salary` = (/* select#3 */ select max(`employees`.`salaries`.`salary`) from `employees`.`salaries`)))
1 row in set (0,00 sec)
In the
SHOW WARNINGS
output, you see that the optimizer used semi-join optimization for the query: it practically converted one of the subqueries into a
JOIN
. The same query could be written as:
SELECT emp_no, first_name, last_name FROM employees JOIN salaries USING(emp_no) WHERE salary = (SELECT MAX(salary) FROM salaries);
If we turn semi-join optimization off we will see a different warning:
mysql> SET optimizer_switch='semijoin=off';
Query OK, 0 rows affected (0,00 sec)
mysql> EXPLAIN SELECT emp_no, first_name, last_name FROM employees WHERE emp_no IN (SELECT emp_no FROM salaries WHERE salary = (SELECT MAX(salary) FROM salaries))\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: employees
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 299113
filtered: 100.00
Extra: Using where
*************************** 2. row ***************************
id: 2
select_type: SUBQUERY
table: salaries
partitions: NULL
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 2838426
filtered: 10.00
Extra: Using where
*************************** 3. row ***************************
id: 3
select_type: SUBQUERY
table: salaries
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 2838426
filtered: 100.00
Extra: NULL
3 rows in set, 1 warning (0,00 sec)
mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: /* select#1 */ select `employees`.`employees`.`emp_no` AS `emp_no`,`employees`.`employees`.`first_name` AS `first_name`,
`employees`.`employees`.`last_name` AS `last_name` from `employees`.`employees`
where <in_optimizer>(`employees`.`employees`.`emp_no`,`employees`.`employees`.`emp_no`
in ( <materialize> (/* select#2 */ select `employees`.`salaries`.`emp_no` from `employees`.`salaries`
where (`employees`.`salaries`.`salary` = (/* select#3 */ select max(`employees`.`salaries`.`salary`) from `employees`.`salaries`)) ),
<primary_index_lookup>(`employees`.`employees`.`emp_no` in <temporary table> on <auto_distinct_key>
where ((`employees`.`employees`.`emp_no` = `<materialized_subquery>`.`emp_no`)))))
1 row in set (0,00 sec)
This feature could help to understand why one or particular optimization was used.
Q: Have you ever used the mysqltuner perl script and if so, would you suggest it as a short-term option?
A: Do you mean https://github.com/major/MySQLTuner-perl ?
I just run it on my laptop and here are the recommendations I got:
——– Recommendations —————————————————————————
General recommendations:
Reduce or eliminate unclosed connections and network issues
Configure your accounts with ip or subnets only, then update your configuration with skip-name-resolve=1
This is, probably, OK, unless you want to provide access to your MySQL server from the outside.
We will suggest raising the ‘join_buffer_size’ until JOINs not using indexes are found.
See https://dev.mysql.com/doc/internals/en/join-buffer-size.html
(specially the conclusions at the bottom of the page).
I do not understand why to raise
join_buffer_size
if it is not required by queries I use. It also could be dangerous if the number of connections increases.
Before changing innodb_log_file_size and/or innodb_log_files_in_group read this:
It is always good to read the user manual before changing options. I recommend you to do it every time you follow performance tuning advice. Even mine.
Variables to adjust:
join_buffer_size (> 256.0K, or always use indexes with JOINs)
Again, this could not be suggested without examining the queries.
innodb_buffer_pool_size (>= 713.2M) if possible.
This conclusion is based on my data size and this is sane for my laptop with 32G RAM. But if the amount of data is larger than the amount of RAM on the machine this advice would not help you to identify the ideal InnoDB buffer pool size. In this case, I recommend you to start from this blog post and follow the links at the end of it.
innodb_log_file_size should be (=16M) if possible, so InnoDB total log files size equals to 25% of buffer pool size.
InnoDB redo log file size should hold as much data, so InnoDB can flush the data from the buffer pool and reuse log space and does not have to perform aggressive flushing.
If you have a small active data set but you write a lot you may have InnoDB’s total log files size greater than 25% of the buffer pool size. Or, if you have a large dataset, but your workload is almost read, you may have very small redo log files. This advice does not make any sense by itself.
In conclusion, I can say that MySQL Tuner is a product that performs static analysis of your MySQL or MariaDB instance and makes suggestions, based on what its authors think the best practices are.
Unfortunately, it is not possible to tune MySQL the same way for all use cases. InnoDB redo log file size above is just one example. There are other options that should be tuned differently depending on which workload you have.
I suggest you study the product you use and understand what and why you are changing. And it is better to start from the question: “Which problem do I need to solve?” instead of modifying random options.
For a general-purpose server, running on the dedicated machine, you can use option –innodb-dedicated-server. Though, in my opinion, it is far from the ideal too.
Q: Mixed based?
A: Is this the question for the slide “Asynchronous Replica”, explaining how binary logging format affects replication performance? Mixed-based binary log format instructs MySQL to log everything in the
STATEMENT
format by default and automatically switch to the
ROW
format when a user issues commands that are not safe for the
STATEMENT
format. This means that safe queries that were slow on the source server and replicated in the
STATEMENT
format, will experience the same performance issues on the replica too. Unsafe queries are not affected by this behavior, but tables that they modify have to have
PRIMARY KEY
defined. Otherwise, the replica would have to do a full table scan for each row updated.