May
20
2016
--

Introduction to Troubleshooting Performance – Troubleshooting Slow Queries webinar: Q & A

Troubleshooting Slow Queries

Troubleshooting Slow QueriesIn this blog, I will provide answers to the Q & A for the Troubleshooting Slow Queries webinar.

First, I want to thank you for attending the April 28 webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: I’ve heard that is a bad idea to use

select *

; what do you recommend?

A: When I used

SELECT *

 in my slides, I wanted to underline the idea that sometimes you need to select all columns from the table. There is nothing bad about it if you need them.

SELECT *

 is bad when you need only a few columns from the table. In this case, you retrieve more data than needed, which affects performance. Another issue that  

SELECT *

 can cause is if you hard-code the statement into your application, then change table definition; the application could start retrieving columns in wrong order and output (e.g., email instead of billing address). Or even worse, it will try to access a non-existent index in the result set array. The best practice is to explicitly enumerate all columns that your application needs.

Q: I heard that using 

index_field

 length will affect the indexing principle during query execution (e.g., one field is varchar and its length is not fixed, some values have short text, some values have long text. at this time). If we use this field as indexing, what happens?

A: I assume you are asking about the ability to create an index with lengths smaller than the column length? They work as follows:

Assume you have a 

TEXT

  field which contains these user questions:

  1. I’ve heard that is a bad idea to use select * what do you recommend?
  2. I heard that using 
    index_field

     length will affect the indexing principle during query execution (e.g., one field is varchar and its length is not fixed, some values have short text, some values have long text. at this time). If we use this field as indexing, what happens?

  3. ….

Since this is a 

TEXT

  field you cannot create and index on it without specifying its length, so you need to make the index as minimal as possible to uniquely identify questions. If you create an index with length 10 it will contain:

  1. I’ve heard
  2. I heard th

You will index only those parts of questions that are not very distinct from each other, and do not contain useful information about what the question is. You can create index of length 255:

  1. I’ve heard that is a bad idea to use select * what do you recommend?
  2. I heard that using index_field length will affect the indexing principle during query execution (e.g., one field is varchar and its length is not fixed, some values have short text, some values have long text. at this time). If we use this field as index

In this case, the index includes the whole first question and almost all the second question. This makes the index too large and requires us to use more disk space (which causes more IO). Also, information from the second question is probably too much.

If make index of length 75, we will have:

  1. I’ve heard that is a bad idea to use select * what do you recommend?
  2. I heard that using index_field length will affect the indexing principle du

This is more than enough for the first question and gives a good idea of what is in the second question. It also potentially will have enough unique entries to make its cardinality look more like the cardinality of real data distribution.

To conclude: choosing the correct index length is something that requires practice and analysis of your actual data. Try to make them as short as possible, but long enough so that the number of unique entries in the index will be similar to a number of unique entries in the table.

Q: Which view can we query to see stats?

A: Do you mean index statistics?

SHOW INDEX FROM table_name

 will do it.

Q: We have an InnoDB table with 47 fields (mostly text); some are ft-indexed. I tried to do an alter table, and it ran for 24 hours. What is the best way to run an alter table to add one extra field? The table has 1.9 M rows and 47 columns with many indexes.

A: Adding a column requires a table copy. Therefore, the speed of this operation depends on the table size and speed of your disk. If you are using version 5.6 and later, adding a column would not block parallel queries (and therefore is not a big deal). If you are using an older version, you can always use the pt-online-schema-change utility from Percona Toolkit. However, it will run even more slowly than the regular

ALTER TABLE

. Unfortunately, you cannot speed up the execution of

ALTER TABLE

 much. The only thing that you can do is to use a faster disk (with options, tuned to explore speed of the disk).

However, if you do not want to have this increased IO affect the production server, you can alter the table on the separate instance, then copy tablespace to production and then apply all changes to the original table from the binary logs. The steps will be something like:

  1. Ensure you use option
    innodb_file_per_table

      and the big table has individual tablespace

  2. Ensure that binary log is enabled
  3. Start a new server (you can also use an existent stand-by slave).
  4. Disable writes to the table
  5. Record the binary log position
  6. Copy the tablespace to the new server as described here.
  7. Enable writes on the production server
  8. Run
    ALTER TABLE

     on the new server you created in step 2 (it will still take 24 hours)

  9. Stop writes to the table on the production server
  10. Copy the tablespace, altered in step 7
  11. Apply all writes to this table, which are in the binary logs after position, recorded in step 4.
  12. Enable writes to the table

This scenario will take even more time overall, but will have minimal impact on the production server

Q: If there is a compound index like index1(emp_id,date), will the following query be able to use index? “select * from table1 where emp_id = 10”

A: Yes. At least it should.

Q: Are 

filesort

 and

temporary

 in extended info for explain not good?

A: Regarding

filesort

: it depends. For example, you will always have the word

filesort

” for tables which perform 

ORDER BY

 and cannot use an index for

ORDER BY

. This is not always bad. For example, in this query:

mysql> explain select emp_no, first_name from employees where emp_no <20000 order by first_nameG
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: employees
   partitions: NULL
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 18722
     filtered: 100.00
        Extra: Using where; Using filesort
1 row in set, 1 warning (0,01 sec)

the primary key used to resolve rows and

filesort

 were necessary and not avoidable. You can read about different

filesort

 algorithms here.

Regarding

Using temporary

: this means what during query execution temporary table will be created. This is can be not good, especially if the temporary table is large and cannot fit into memory. In this case, it would be written to disk and slow down operations. But, again, sometimes creating temporary tables in not avoidable, for example, if you have both

GROUP BY

 and

ORDER BY

 clauses which list columns differently as stated in the user manual.

Q: Is

key_len

 length more of a good thing for query execution?

A:

key_len

 field is not

NULL

 for all queries that use and index, and just shows the length of the key part used. It is not good or bad, it is just for information. You can use this information, for example, to identify which part of combined index is used to resolve the query.

Q: Does an alter query go for an optimizer check?

A: No. You can check it either by enabling optimizer trace, running

ALTER

 and find what trace is empty. Or by enabling the debug option and searching the resulting trace for

optimize

.

Q: A query involves four columns that are all individually covered by an index. The optimizer didn’t merge indexes because of cost, and even didn’t choose the composite index I created.

A: This depends on the table definition and query you used. I cannot provide a more detailed answer based only on this information.

Q cont.: Finally, only certain composite indexes were suitable, the column order in the complex index mattered a lot. Why couldn’t the optimizer merge the four individual single column indexes, and why did the order of the columns in the composite index matter?

A: Regarding why the optimizer could not merge four indexes, I need to see how the table is defined and which data is in these indexed columns. Regarding why the order of the columns in the composite index matters, it depends on the query. Why the optimizer can use an index, say, on

(col1, col2)

 where the conditions

col1=X AND col2=Y

 and

col2=Y AND col2=X

 for the case when you use

OR

, the order is important. For example, for the condition

col1=X OR col2=Y

, where the part

col1=X

 is always executed and the part

col2=Y

  is executed only when

col1=X

 is false. The same logic applies to queries like

SELECT col1 WHERE col2=Y ORDER BY col3

. See the user manual for details.

Q: When I try to obtain the optimizer trace on the console, the result is cut off. Even if I redirect the output to a file, how to overcome that?

A: Which version of MySQL Server do you use? The 

TRACE

 column is defined as

longtext NOT NULL

, and should not cause such issues. If it does with a newer version, report a bug at http://bugs.mysql.com/.

Q: Are there any free graphical visualizers for either EXPLAIN or the optimizer TRACE output?

A: There is graphical visualizer for

EXPLAIN

 in MySQL Workbench. But it works with online data only: you cannot run it on

EXPLAIN

 output, saved into a file. I don’t know about any visualizer for the optimizer

TRACE

 output. However, since it is

JSON

 you can simply save it to file and open in web browser. It will allow a better view than if opened in simple text editor.

Q: When do you use force index instead of

use index

 hints?

A: According to user manual “

USE INDEX (index_list)

 hint tells MySQL to use only one of the named indexes to find rows in the table” and “

FORCE INDEX

  hint acts like

USE INDEX (index_list)

, with the addition that a table scan is assumed to be very expensive . . . a table scan is used only if there is no way to use one of the named indexes to find rows in the table.” This means that when you use

USE INDEX

, you are giving a hint for the optimizer to prefer a particular index to others, but not enforcing index usage if the optimizer prefers a table scan, while

FORCE INDEX

 requires using the index. I myself use only

FORCE

 and

IGNORE

  index hints.

Q: Very informative session. I missed the beginning part. Are you going to distribute the recoded session later?

A: Yes. As usual slides and recording available here.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com