Jan 22, 2019

Monitor and Optimize Slow Queries with PMM and EverSQL – Part One

PMM EverSQL optimization integration

A common challenge with continuously deployed applications is that new and modified SQL queries are constantly being introduced to the application. Many companies choose to use a database monitoring system (such as PMM) to identify those slow queries. But identifying slow queries is only the start – what about actually optimizing them?

In this post we’ll demonstrate a new way to both identify and optimize slow queries, by utilizing the recent integration of Percona Monitoring and Management with the EverSQL Query Optimizer via a Chrome browser extension. This integration lets you identify slow queries using PMM and optimize them automatically using EverSQL Query Optimizer.

Optimizing queries with PMM & EverSQL

We’re using PMM to monitor our MySQL instance, which was pre-loaded with the publicly available StackOverflow dataset. PMM is configured to monitor for slow queries from MySQL’s slow log file.
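For reference, PMM’s Query Analytics reads the slow log, so MySQL must be configured to write one. A typical configuration looks roughly like this (the file path and threshold are illustrative, not taken from the post):

```ini
[mysqld]
slow_query_log      = ON
slow_query_log_file = /var/log/mysql/mysql-slow.log
# Log any query slower than half a second; lower this to capture more queries.
long_query_time     = 0.5
log_output          = FILE
```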

monitor slow queries dashboard on PMM

We’ll begin with a basic example of how EverSQL can provide value for a simple SELECT statement. In a follow-up blog post we’ll go through a more sophisticated multi-table query to show how response time can be reduced from 20 minutes to milliseconds(!) using EverSQL.

Let’s have a look at one of the slow queries identified by PMM:

PMM EverSQL optimization integration

In this example, the table posts contains two secondary indexes by default (in addition to the primary key): one on the column AnswerCount, and the other on the column CreationDate.

CREATE TABLE `posts` (
 `Id` int(11) NOT NULL,
 `AcceptedAnswerId` int(11) DEFAULT NULL,
 `AnswerCount` int(11) DEFAULT NULL,
 `ClosedDate` datetime DEFAULT NULL,
 ….
 `CreationDate` datetime NOT NULL,
  ….
 `ViewCount` int(11) NOT NULL,
 PRIMARY KEY (`Id`),
 KEY `posts_idx_answercount` (`AnswerCount`),
 KEY `posts_idx_creationdate` (`CreationDate`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

As you can see below, EverSQL identifies that a composite index containing both columns would be more beneficial in this case, and recommends adding an index on posts(AnswerCount, CreationDate).
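In SQL terms, the recommendation corresponds to something like the following statement (the index name here is illustrative):

```sql
ALTER TABLE `posts`
  ADD INDEX `posts_idx_answercount_creationdate` (`AnswerCount`, `CreationDate`);
```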

EverSQL optimization report

After using pt-online-schema-change to apply the schema modification, we can observe in PMM that the query execution time dropped from 3 minutes 40 seconds to 83 milliseconds!
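The schema change was applied online with pt-online-schema-change; the invocation would look roughly like this (the database name and index name are illustrative):

```sh
pt-online-schema-change \
  --alter "ADD INDEX posts_idx_answercount_creationdate (AnswerCount, CreationDate)" \
  D=stackoverflow,t=posts --execute
```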

execution time improvement with EverSQL

 

Note that the extension is available for Chrome from the Chrome Web Store:

EverSQL for Database Monitoring Applications

Summary

If you’re looking for an easy way to both monitor for slow queries and quickly optimize them, consider deploying Percona Monitoring and Management and then integrating it with EverSQL’s Chrome extension!

Co-Author: Tomer Shay

Tomer Shay, EverSQL

 

Tomer Shay is the Founder of EverSQL. He loves being where the challenge is. In the last 12 years, he has had the privilege to write a lot of code and lead teams of developers, focusing on databases and performance. He enjoys using technology to bring ideas to life, help people, and see them smile.

 

Jan 21, 2015

Importing big tables with large indexes with Myloader MySQL tool

Mydumper is known as the faster (much faster) alternative to mysqldump. So, if you take a logical backup, you will choose Mydumper over mysqldump. But what about the restore? Well, who needs to restore a logical backup? It takes ages! Even with Myloader. But this could change a bit if we are able to take advantage of Fast Index Creation.

As you probably know, Mydumper and mysqldump export the structure of a table, with all its indexes and constraints, and of course the data. Then, Myloader and MySQL import the structure of the table and then the data. The most important difference is that you can configure Myloader to import the data using a configurable number of threads. The import steps are:

  1. Create the complete structure of the table
  2. Import the data

When you execute Myloader, it first creates the tables by executing the “-schema.sql” files, then takes all the filenames without “-schema.sql” and puts them in a task queue. Every thread takes a filename from the queue (each file is actually a chunk of the table) and executes it. When it finishes, it takes another chunk from the queue; if the queue is empty, the thread simply ends.
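The scheduling loop above can be sketched as follows (a minimal Python model of the behavior, not the actual C implementation; `execute_chunk` stands in for running one chunk file’s INSERT statements):

```python
import queue
import threading

def myloader_import(chunk_files, num_threads, execute_chunk):
    # Fill the task queue with one entry per chunk file, as Myloader
    # does with all the dump files that are not "-schema.sql" files.
    tasks = queue.Queue()
    for filename in chunk_files:
        tasks.put(filename)

    def worker():
        # Each thread pops a chunk, executes it, and repeats until the
        # queue is empty, at which point the thread ends.
        while True:
            try:
                chunk = tasks.get_nowait()
            except queue.Empty:
                return
            execute_chunk(chunk)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```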

This import procedure works fast for small tables, but for big tables with large indexes the inserts get progressively slower because of the overhead of inserting the new values into the secondary indexes. Another way to import the data is:

  1. Split the table structure into table creation with primary key, indexes creation and constraint creation
  2. Create tables with primary key
  3. Per table do:
    1. Load the data
    2. Create index
  4. Create constraints

This import procedure is implemented in a branch of Myloader that can be downloaded from here, or by checking out the repository directly with bzr:

bzr branch lp:~david-ducos/mydumper/mydumper

The tool reads the schema files and splits them into three separate sets of statements: one that creates the tables with the primary key, one that creates the indexes, and one that creates the constraints. The primary key is kept in the table creation in order to avoid rebuilding the table when a primary key is added later; the “KEY” and “CONSTRAINT” lines are removed from the table creation and added to the index and constraint statements, respectively.
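The splitting step can be sketched like this (an illustrative Python sketch; the real branch is C code inside Myloader, and `split_schema` is a hypothetical helper name):

```python
import re

def split_schema(create_stmt, table):
    """Split a CREATE TABLE statement the way the patched Myloader does:
    the table is created with its PRIMARY KEY, while secondary indexes
    and constraints become separate ALTER TABLE statements."""
    keep, index_defs, constraint_defs = [], [], []
    for line in create_stmt.splitlines():
        stripped = line.strip().rstrip(',')
        if stripped.startswith(('KEY ', 'UNIQUE KEY ', 'FULLTEXT KEY ')):
            index_defs.append(stripped)
        elif stripped.startswith('CONSTRAINT '):
            constraint_defs.append(stripped)
        else:
            keep.append(line)
    # Removing the last KEY line can leave a dangling comma before the
    # closing parenthesis; drop it.
    create_sql = re.sub(r',(\s*\n\))', r'\1', '\n'.join(keep))
    index_sql = ['ALTER TABLE `%s` ADD %s;' % (table, d) for d in index_defs]
    constraint_sql = ['ALTER TABLE `%s` ADD %s;' % (table, d)
                      for d in constraint_defs]
    return create_sql, index_sql, constraint_sql
```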

It processes tables according to their size, starting with the largest, because creating the indexes of a big table can take hours and is single-threaded. While we cannot parallelize a single index build, the remaining threads can work on other tables in the meantime.

It has a new thread (monitor_process) that decides which chunk of data will be put in the task queue, and a communication queue that the task processes use to tell the monitor_process which chunk has been completed.

I ran multiple imports of a single table on an AWS m1.xlarge machine, comparing Myloader with this branch, and found that with large indexes the times were:

myloader

As you can see, below 150M rows, importing the data and then creating the indexes takes longer than importing into a table with all its indexes in place. But everything changes past 150M rows: importing 200M rows takes 64 minutes more with Myloader, but just 24 minutes more with the new branch.

On a table of 200M rows with an integer primary key and 9 integer columns, you can see how the time increases as the indexes get larger:

myloader2

Where:

2-2-0: two 1-column and two 2-column indexes
2-2-1: two 1-column, two 2-column, and one 3-column index
2-3-1: two 1-column, three 2-column, and one 3-column index
2-3-2: two 1-column, three 2-column, and two 3-column indexes

Conclusion

This branch can only apply this same strategy to every table, but with this new logic in Myloader, a future version could choose the best strategy for each table, reducing the restore time considerably.

The post Importing big tables with large indexes with Myloader MySQL tool appeared first on MySQL Performance Blog.
