MySQL index is a data structure used to optimize the performance of database queries at the expense of additional writes and storage space to keep the index data structure up to date. It is used to quickly locate data without having to search every row in a table. Indexes can be created using one or more columns of a table, and each index is given a name. Indexes are especially useful for queries that filter results based on columns with a high number of distinct values.
Indexes are useful for our queries, but duplicate, redundant, and unused indexes reduce performance by confusing the optimizer with query plans, requiring the storage engine to maintain, calculate, and update more index statistics, as well as requiring more disk space.
Since MySQL 8.0, indexes can be marked as invisible. In this blog, I will detail how to detect duplicate and underused indexes as well as the new feature of invisible indexes and how it can help with index management.
How to find the duplicate indexes?
pt-duplicate-key-checker is a command-line tool from Percona Toolkit that scans a MySQL database and identifies tables that have duplicate indexes or primary keys. It can help identify potential problems with the schema and provide guidance on how to fix them. For each duplicate key, the tool prints a DROP INDEX statement by default, allowing you to copy-paste the statement into MySQL to get rid of the duplicate key.
For example:
$ pt-duplicate-key-checker --host=localhost --user=percona --ask-pass Enter password: # ######################################################################## # mytestdb.authors_test # ######################################################################## # idx_first_name is a left-prefix of idx_first_name_last_name # Key definitions: # KEY `idx_first_name` (`first_name`), # KEY `idx_first_name_last_name` (`first_name`,`last_name`), # Column types: # `first_name` varchar(50) default null # `last_name` varchar(50) default null # To remove this duplicate index, execute: ALTER TABLE `mytestdb`.`authors_test` DROP INDEX `idx_first_name`; # Key idx_last_name_id ends with a prefix of the clustered index # Key definitions: # KEY `idx_last_name_id` (`last_name`,`id`) # PRIMARY KEY (`id`), # Column types: # `last_name` varchar(50) default null # `id` int not null auto_increment # To shorten this duplicate clustered index, execute: ALTER TABLE `mytestdb`.`authors_test` DROP INDEX `idx_last_name_id`, ADD INDEX `idx_last_name_id` (`last_name`); # ######################################################################## # Summary of indexes # ######################################################################## # Size Duplicate Indexes 4125830 # Total Duplicate Indexes 2 # Total Indexes 129
How to find the unused indexes?
Unused indexes are indexes that are created in a database but are not being used in any queries. These indexes take up space in the database and can slow down query performance if they are not maintained. Unused indexes should be identified and removed to improve the performance of the database.
We can use the sys.schema unused indexes view to see which indexes have not been used since the last time the MySQL server was restarted.
Example :
mysql> select * from sys.schema_unused_indexes where index_name not like 'fk_%' and object_schema='mytestdb' and object_name='testtable'; +---------------+-------------+----------------+ | object_schema | object_name | index_name | +---------------+-------------+----------------+ | mytestdb | testtable | idx_first_name | | mytestdb | testtable | idx_last_name | +---------------+-------------+----------------+ 2 rows in set (0.00 sec)
Since the index idx_first_name was used to retrieve the records, it is not listed.
mysql> select last_name from testtable WHERE first_name=’Arun’; +-----------+ | last_name | +-----------+ | Jith | +-----------+ 1 rows in set (0.00 sec) mysql> select * from sys.schema_unused_indexes where index_name not like 'fk_%' and object_schema='mytestdb' and object_name='testtable'; +---------------+-------------+---------------+ | object_schema | object_name | index_name | +---------------+-------------+---------------+ | mytestdb | testtable | idx_last_name | +---------------+-------------+---------------+ 1 row in set (0.01 sec)
Invisible Indexes
In MySQL 8.0, there is a feature that allows you to have an invisible index. This means that an index is created on a table, but the optimizer does not use it by default. By using this feature, you can test the impacts of removing an index without actually dropping it. If desired, the index can be made visible again, therefore avoiding the time-consuming process of re-adding the index to a larger table.
The SET_VAR(optimizer_switch = ‘use_invisible_indexes=on’) allows the invisible index to be used for specific application activities or modules during a single query while preventing it from being used across the entire application.
Let’s look at a simple example to see how it works.
CREATE TABLE `authors` ( `id` int NOT NULL AUTO_INCREMENT, `first_name` varchar(50), `last_name` varchar(50), `email` varchar(100) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `email` (`email`) ) ENGINE=InnoDB;
The indexes are set to visible by default, and we may use the ALTER TABLE table name ALTER INDEX index name to make it INVISIBLE/VISIBLE.
To find the index details, we can use “SHOW INDEXES FROM table;” or query the INFORMATION SCHEMA.STATISTICS.
mysql> SELECT INDEX_NAME, IS_VISIBLE FROM INFORMATION_SCHEMA.STATISTICS WHERE TABLE_SCHEMA = 'mytestdb' AND TABLE_NAME = 'authors_test'; +------------+------------+ | INDEX_NAME | IS_VISIBLE | +------------+------------+ | email | YES | | PRIMARY | YES | +------------+------------+ 2 rows in set (0.00 sec) mysql> ALTER TABLE authors_test ALTER INDEX email INVISIBLE; Query OK, 0 rows affected (0.11 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> SELECT INDEX_NAME, IS_VISIBLE FROM INFORMATION_SCHEMA.STATISTICS WHERE TABLE_SCHEMA = 'mytestdb' AND TABLE_NAME = 'authors_test'; +------------+------------+ | INDEX_NAME | IS_VISIBLE | +------------+------------+ | email | NO | | PRIMARY | YES | +------------+------------+ 2 rows in set (0.00 sec)
The query cannot utilize the index on the email column since it is invisible.
mysql> explain select email from authors_test WHERE email='amanda31@example.org'\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: authors_test partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 10063 filtered: 10.00 Extra: Using where 1 row in set, 1 warning (0.00 sec)
The SET VAR optimizer hint can be used to temporarily update the value of an optimizer switch and enable invisible indexes for a single query only, as shown below:
mysql> explain select /*+ SET_VAR(optimizer_switch = 'use_invisible_indexes=on') */ first_name from authors_test WHERE email='amanda31@example.org'\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: authors_test partitions: NULL type: const possible_keys: email key: email key_len: 302 ref: const rows: 1 filtered: 100.00 Extra: Using index 1 row in set, 1 warning (0.00 sec)
Please refer to the documentation for more information on invisible indexes.
Summary
Removal of duplicate and unnecessary indexes is suggested to prevent performance degradation in high concurrent workloads with less-than-ideal data distribution on a table. Unwanted keys take up unnecessary disc space, which can cause overhead to DML and read queries. To test the effects of dropping an index without fully removing it, it can be made invisible first.