Apr 09, 2019

Upcoming Webinar Wed 4/10: Extending and Customizing Percona Monitoring and Management

Percona Monitoring and Management 1.17.0

Please join Percona's Product Manager, Michael Coburn, as he presents his talk Extending and Customizing Percona Monitoring and Management on April 10th, 2019, at 10:00 AM PDT (UTC-7) / 1:00 PM EDT (UTC-4).

Register Now

Do you already run stock PMM in your environment and want to learn how you can extend the PMM platform? If so, come learn about:

1. Dashboard Customizations
* How to create a custom dashboard from existing graphs, or build Cross Server Dashboards
2. External Exporters – Monitor any service, anywhere!
* From adding an exporter, to viewing the data in Data Exploration, to deploying a working dashboard
3. Working with custom queries (MySQL and PostgreSQL)
* Execute SELECT statements against your database and store the results in Prometheus
* Build Dashboards relevant to your environment
4. Customizing Exporter Options
* Enable deactivated functionality that applies to your environment
5. Using Grafana Alerting
* How to set up notification channels (SMTP, Slack, etc.)
* How to configure thresholds and alerts
6. Using MySQL/PostgreSQL Data Source
* Execute SELECT statements against your database and plot your application metrics

To learn more, register for Extending and Customizing Percona Monitoring and Management.

Apr 08, 2019

Adding PostgreSQL Tuple Statistics Dashboard to the PMM Plugin


While the PMM Engineering team is putting the final polish on the PMM 2 release, I wanted to share with you some of the dashboard improvements we're making around PostgreSQL, specifically how you can plot Tuple Statistics using PMM. PMM (Percona Monitoring and Management) is a free and open-source platform for managing and monitoring MySQL®, MongoDB®, and PostgreSQL performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL, MongoDB, and PostgreSQL servers to ensure that your data works as efficiently as possible.

This new dashboard is based on data collected from the pg_stat_user_tables view. However, by default, postgres_exporter doesn't collect this information. Instead, we will leverage the exporter's ability to run custom SELECT queries from a file (queries.yaml in our example). These generate metric series in PMM that we can then visualize. Collecting tuple information from the pg_stat_user_tables view requires that databases be identified in queries.yaml, so we've written a helper script that takes multiple database names and formats the queries.yaml file accordingly.
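For context, a single entry in queries.yaml follows postgres_exporter's custom query format: the SELECT statement to run, plus each returned column marked as a LABEL, COUNTER, or GAUGE. The snippet below is a simplified, single-database sketch of what a tuple statistics entry might look like; the file produced by the helper script wraps the query in dblink() calls for every database you name, so its exact contents will differ.

pg_stat_user_tables:
  query: "SELECT schemaname, relname, n_tup_ins, n_tup_upd, n_tup_del, n_live_tup, n_dead_tup FROM pg_stat_user_tables"
  metrics:
    - schemaname:
        usage: "LABEL"
        description: "Name of the schema"
    - relname:
        usage: "LABEL"
        description: "Name of the table"
    - n_tup_ins:
        usage: "COUNTER"
        description: "Rows inserted"
    - n_tup_upd:
        usage: "COUNTER"
        description: "Rows updated"
    - n_tup_del:
        usage: "COUNTER"
        description: "Rows deleted"
    - n_live_tup:
        usage: "GAUGE"
        description: "Estimated number of live rows"
    - n_dead_tup:
        usage: "GAUGE"
        description: "Estimated number of dead rows"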

Setting up the dashboard

First of all, we have to enable the dblink extension. It's a PostgreSQL contrib extension that allows users to connect to other databases and run arbitrary queries in them.

postgres=# CREATE EXTENSION dblink;
CREATE EXTENSION
postgres=# SELECT dblink_connect('dbname=sbtest1');
 dblink_connect
----------------
 OK
(1 row)

The next step is to create a query for collecting data from the tables. We suggest using our bash script, which generates the query and the other fields needed for a custom query file. You have to specify the database name, or names, to be monitored. You can store the script's output in a separate file or append it to an existing queries file.

[root@ip-10-178-1-82 ec2-user]# ./postgres_query_generator.sh sbtest1 sbtest2 sbtest3 >> queries.yaml

Now let's run the exporter with this file, /home/ec2-user/queries.yaml. Please use the extend.query-path flag as in the example below. If you want to change exporter options, the procedure is to remove the exporter and then add it back with the new options specified.

[root@ip-10-178-1-82 ec2-user]# pmm-admin rm postgresql:metrics PS_NODE-1
OK, removed PostgreSQL metrics PS_NODE-1 from monitoring.
[root@ip-10-178-1-82 ec2-user]# pmm-admin add postgresql:metrics --user postgres PS_NODE-1 -- --extend.query-path /home/ec2-user/queries.yaml
OK, now monitoring PostgreSQL metrics using DSN postgresql:***@/postgres

The PostgreSQL Tuple Statistics Dashboard is available for download from Grafana Labs under the perconalab organization.

Here are some screenshots from our test installation.

Mar 12, 2019

PMM’s Custom Queries in Action: Adding a Graph for InnoDB mutex waits

PMM mutex wait graph

One of the great things about Percona Monitoring and Management (PMM) is its flexibility. An example of that is how one can go beyond the exporters to collect data. One approach to achieve that is using textfile collectors, as explained in  Extended Metrics for Percona Monitoring and Management without modifying the Code. Another method, which is the subject matter of this post, is to use custom queries.

While working on a customer’s contention issue I wanted to check the behaviour of InnoDB Mutexes over time. Naturally, I went straight to PMM and didn’t find a graph suitable for my needs. No graph, no problem! Luckily anyone can enhance PMM. So here’s how I made the graph I needed.

The final result will look like this:

Custom Queries

What is it?

Starting with version 1.15.0, PMM gives users the ability to take a SQL SELECT statement and turn the result set into a metric series in PMM. These are called custom queries.

How do I enable that feature?

This feature is ON by default; you only need to edit the configuration file, which uses YAML syntax.

Where is the configuration file located?

The config file is located at /usr/local/percona/pmm-client/queries-mysqld.yml by default. You can change it when adding MySQL metrics via pmm-admin:

pmm-admin add mysql:metrics ... -- --queries-file-name=/usr/local/percona/pmm-client/query.yml

How often is data being collected?

The queries are executed at the LOW RESOLUTION level, which by default is every 60 seconds.

InnoDB Mutex monitoring

The method used to gather mutex status is to query the Performance Schema, as explained here: https://dev.mysql.com/doc/refman/5.7/en/monitor-innodb-mutex-waits-performance-schema.html, but with the SUM_TIMER_WAIT > 0 condition intentionally removed, so the query used looks like this:

SELECT
EVENT_NAME, COUNT_STAR, SUM_TIMER_WAIT
FROM performance_schema.events_waits_summary_global_by_event_name
WHERE EVENT_NAME LIKE 'wait/synch/mutex/innodb/%'

For this query to return data, some requirements need to be met:

  • The most important one: Performance Schema needs to be enabled
  • Consumers for “events_waits” enabled
  • Instruments for ‘wait/synch/mutex/innodb’ enabled.

If performance schema is enabled, the other two requirements are met by running these two queries:

update performance_schema.setup_instruments set enabled='YES' where name like 'wait/synch/mutex/innodb%';
update performance_schema.setup_consumers set enabled='YES' where name like 'events_waits%';

YAML Configuration File

This is where the magic happens. The YAML syntax is covered in depth in the documentation: https://www.percona.com/doc/percona-monitoring-and-management/conf-mysql.html#pmm-conf-mysql-executing-custom-queries

The one used for this issue is:

---
mysql_global_status_innodb_mutex:
    query: "SELECT EVENT_NAME, COUNT_STAR, SUM_TIMER_WAIT FROM performance_schema.events_waits_summary_global_by_event_name WHERE EVENT_NAME LIKE 'wait/synch/mutex/innodb/%'"
    metrics:
      - EVENT_NAME:
          usage: "LABEL"
          description: "Name of the mutex"
      - COUNT_STAR:
          usage: "COUNTER"
          description: "Number of calls"
      - SUM_TIMER_WAIT:
          usage: "GAUGE"
          description: "Duration"

The key info is:

  • The metric name is mysql_global_status_innodb_mutex
  • Since EVENT_NAME is used as a label, it will be possible to have values per event

Remember that this should go in the queries-mysqld.yml file, full path /usr/local/percona/pmm-client/queries-mysqld.yml, on the database node.

Once that is done, you will start to have those metrics available in Prometheus. Now we have a graph to build!

Creating the graph in Grafana

Before jumping into Grafana to add the graph, we need a proper Prometheus query (a.k.a. PromQL). I came up with these two (one for the count_star, one for the sum_timer_wait):

topk(5, label_replace(rate(mysql_global_status_innodb_mutex_COUNT_STAR{instance="$host"}[$interval]), "mutex", "$2", "EVENT_NAME", "(.*)/(.*)" ) or label_replace(irate(mysql_global_status_innodb_mutex_COUNT_STAR{instance="$host"}[5m]), "mutex", "$2", "EVENT_NAME", "(.*)/(.*)" ))

and

topk(5, label_replace(rate(mysql_global_status_innodb_mutex_SUM_TIMER_WAIT{instance="$host"}[$interval]), "mutex", "$2", "EVENT_NAME", "(.*)/(.*)" ) or label_replace(irate(mysql_global_status_innodb_mutex_SUM_TIMER_WAIT{instance="$host"}[5m]), "mutex", "$2", "EVENT_NAME", "(.*)/(.*)" ))

These queries basically say: return the top five rate values of each mutex event for a specific host, and use a regular expression to keep only the name of the event, discarding whatever comes before the last slash character.
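To make the label_replace() part concrete, here is roughly what happens to one series (the event name below is just an illustrative example):

# A series carrying the label
#   EVENT_NAME="wait/synch/mutex/innodb/buf_pool_mutex"
# gains an extra label, because the regex "(.*)/(.*)" is greedy: $1 captures
# everything up to the last slash and $2 captures what follows it, so we get
#   mutex="buf_pool_mutex"
# which is what the {{ mutex }} legend format in the graph panel displays.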

Once we are good with our PromQL queries, we can go and add the graph.

Finally, I got the graph that I needed with very little effort.

The dashboard is also published on the Grafana Labs Community dashboards site.

Summary

PMM's collection of graphs and dashboards is quite complete, but it is also natural that there are specific metrics that might not be covered. For those cases, you can count on the flexibility and ease of use of PMM to collect metrics and create custom graphs. So go ahead, embrace PMM, customize it, make it yours!

The JSON for this graph, so it can be imported easily, is:

{
  "aliasColors": {},
  "bars": false,
  "dashLength": 10,
  "dashes": false,
  "datasource": "Prometheus",
  "fill": 0,
  "gridPos": {
    "h": 18,
    "w": 24,
    "x": 0,
    "y": 72
  },
  "id": null,
  "legend": {
    "alignAsTable": true,
    "avg": true,
    "current": false,
    "max": true,
    "min": true,
    "rightSide": false,
    "show": true,
    "sideWidth": 0,
    "sort": "avg",
    "sortDesc": true,
    "total": false,
    "values": true
  },
  "lines": true,
  "linewidth": 2,
  "links": [],
  "nullPointMode": "null",
  "percentage": false,
  "pointradius": 0.5,
  "points": false,
  "renderer": "flot",
  "seriesOverrides": [
    {
      "alias": "/Timer Wait/i",
      "yaxis": 2
    }
  ],
  "spaceLength": 10,
  "stack": false,
  "steppedLine": false,
  "targets": [
    {
      "expr": "topk(5, label_replace(rate(mysql_global_status_innodb_mutex_COUNT_STAR{instance=\"$host\"}[$interval]), \"mutex\", \"$2\", \"EVENT_NAME\", \"(.*)/(.*)\" )) or topk(5,label_replace(irate(mysql_global_status_innodb_mutex_COUNT_STAR{instance=\"$host\"}[5m]), \"mutex\", \"$2\", \"EVENT_NAME\", \"(.*)/(.*)\" ))",
      "format": "time_series",
      "interval": "$interval",
      "intervalFactor": 1,
      "legendFormat": "{{ mutex }} calls",
      "refId": "A",
      "hide": false
    },
    {
      "expr": "topk(5, label_replace(rate(mysql_global_status_innodb_mutex_SUM_TIMER_WAIT{instance=\"$host\"}[$interval]), \"mutex\", \"$2\", \"EVENT_NAME\", \"(.*)/(.*)\" )) or topk(5, label_replace(irate(mysql_global_status_innodb_mutex_SUM_TIMER_WAIT{instance=\"$host\"}[5m]), \"mutex\", \"$2\", \"EVENT_NAME\", \"(.*)/(.*)\" ))",
      "format": "time_series",
      "interval": "$interval",
      "intervalFactor": 1,
      "legendFormat": "{{ mutex }} timer wait",
      "refId": "B",
      "hide": false
    }
  ],
  "thresholds": [],
  "timeFrom": null,
  "timeShift": null,
  "title": "InnoDB Mutex",
  "tooltip": {
    "shared": true,
    "sort": 2,
    "value_type": "individual"
  },
  "transparent": false,
  "type": "graph",
  "xaxis": {
    "buckets": null,
    "mode": "time",
    "name": null,
    "show": true,
    "values": []
  },
  "yaxes": [
    {
      "format": "short",
      "label": "",
      "logBase": 1,
      "max": null,
      "min": null,
      "show": true
    },
    {
      "decimals": null,
      "format": "ns",
      "label": "",
      "logBase": 1,
      "max": null,
      "min": "0",
      "show": true
    }
  ],
  "yaxis": {
    "align": false,
    "alignLevel": null
  }
}

Mar 07, 2019

Reducing High CPU on MySQL: a Case Study

CPU Usage after query tuning

In this blog post, I want to share a case we worked on a few days ago. I'll show you how we approached the resolution of a MySQL performance issue and used Percona Monitoring and Management (PMM) to support troubleshooting. The customer had noticed consistently high CPU usage in one of their MySQL instances and was not able to figure out why, as there was not much traffic hitting the app. We needed to reduce the high CPU usage on MySQL. The server is a small instance:

Model: 6x Intel(R) Xeon(R) CPU E5-2430 0 @ 2.20GHz
RAM: 10 GB

This symptom can be caused by various reasons. Let's see how PMM can be used to troubleshoot the issue.

CPU

The original issue - CPU usage at almost 100% during application use

It's important to understand where the CPU time is being consumed: user space, system space, iowait, and so on. Here we can see that CPU usage was hitting almost 100%, and the majority of the time was being spent in user space. In other words, the CPU was spending its time executing user code, such as MySQL. Once we determined that the time was being spent in user space, we could discard other possible issues. For example, we could eliminate the possibility that a high number of threads were competing for CPU resources, since that would cause an increase in context switches, which in turn would be taken care of by the kernel – system space.

With that we decided to look into MySQL metrics.

MySQL

Thread activity graph in PMM for MySQL

Queries per second

As expected, there weren't a lot of threads running—10 on average—and MySQL wasn't being hammered with questions/transactions. It was running from 500 to 800 QPS (queries per second). The next step was to check the type of workload that was running on the instance:

All the commands are of a SELECT type, in red in this graph

In red we can see that almost all commands are SELECTS. With that in mind, we checked the handlers using 

SHOW STATUS LIKE 'Handler%'

 to verify whether those SELECTs were doing an index scan, a full table scan, or something else.

Showing that the query was a full table scan

Blue in this graph represents

Handler_read_rnd_next

 , which is the counter MySQL increments every time it reads a row while doing a full table scan. Bingo!!! Around 350 SELECTs were reading 2.5 million rows per second. But wait—why was this causing CPU issues rather than IO issues? If you refer back to the first graph (the CPU graph), there is no iowait to be seen.

That is because the data was stored in the InnoDB Buffer Pool, so instead of having to read those 2.5M rows per second from disk, it was fetching them from memory. The stress had moved from disk to CPU. Now that we had identified that the issue was being caused by a query, or queries, we went to QAN to verify them and check their status:

identifying the long running query in QAN

The first query, a

SELECT

  on table 

store.clients

 was responsible for 98% of the load and was executing in 20+ seconds.

The initial query load

EXPLAIN confirmed our suspicions. The query was accessing the table using type ALL, which is the last type we want, as it means a full table scan. Taking a look at the fingerprint of the query, we identified that it was a simple query:

Fingerprint of query
Indexes on table did not include a key column

The query was filtering clients based on the status field

SELECT * FROM store.clients WHERE status = ?

 As shown in the indexes, that column was not indexed. Talking with the customer, this turned out to be a query that was introduced as part of a new software release.

From that point, we were confident that we had identified the problem. There could be more, but this particular query was definitely hurting the performance of the server. We decided to add an index and also sent an annotation to PMM, so we could refer back to the graphs to check when the index had been added, verify whether CPU usage had dropped, and also check Handler_read_rnd_next.

To run the ALTER we decided to use pt-online-schema-change, as it was a busy table, and the tool has safeguards to prevent the situation from becoming even worse. For example, we wanted to pause or even abort the ALTER if the number of Threads_Running exceeded a certain threshold. The threshold is controlled by

--max-load

  (25 by default) and

--critical-load

  (50 by default):

pmm-admin annotate "Started ALTER store.clients ADD KEY (status)" && \
pt-online-schema-change --alter "ADD KEY (status)" --execute u=root,D=store,t=clients && \
pmm-admin annotate "Finished ALTER store.clients ADD KEY (status)"
Your annotation was successfully posted.
No slaves found. See --recursion-method if host localhost.localdomain has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
analyze_table, 10, 1
copy_rows, 10, 0.25
create_triggers, 10, 1
drop_triggers, 10, 1
swap_tables, 10, 1
update_foreign_keys, 10, 1
Altering `store`.`clients`...
Creating new table...
Created new table store._clients_new OK.
Altering new table...
Altered `store`.`_clients_new` OK.
2019-02-22T18:26:25 Creating triggers...
2019-02-22T18:27:14 Created triggers OK.
2019-02-22T18:27:14 Copying approximately 4924071 rows...
Copying `store`.`clients`: 7% 05:46 remain
Copying `store`.`clients`: 14% 05:47 remain
Copying `store`.`clients`: 22% 05:07 remain
Copying `store`.`clients`: 30% 04:29 remain
Copying `store`.`clients`: 38% 03:59 remain
Copying `store`.`clients`: 45% 03:33 remain
Copying `store`.`clients`: 52% 03:06 remain
Copying `store`.`clients`: 59% 02:44 remain
Copying `store`.`clients`: 66% 02:17 remain
Copying `store`.`clients`: 73% 01:50 remain
Copying `store`.`clients`: 79% 01:23 remain
Copying `store`.`clients`: 87% 00:53 remain
Copying `store`.`clients`: 94% 00:24 remain
2019-02-22T18:34:15 Copied rows OK.
2019-02-22T18:34:15 Analyzing new table...
2019-02-22T18:34:15 Swapping tables...
2019-02-22T18:34:27 Swapped original and new tables OK.
2019-02-22T18:34:27 Dropping old table...
2019-02-22T18:34:32 Dropped old table `store`.`_clients_old` OK.
2019-02-22T18:34:32 Dropping triggers...
2019-02-22T18:34:32 Dropped triggers OK.
Successfully altered `store`.`clients`.
Your annotation was successfully posted.

Results

MySQL Handlers after query tuning MySQL query throughput after query tuning
Query analysis by EXPLAIN in PMM after tuning

As we can see above, CPU usage dropped to less than 25%, which is 1/4 of the previous usage level. Handler_read_rnd_next dropped and we can't even see it once pt-osc has finished. We had a small increase in Handler_read_next, as expected, because now MySQL is using the index to resolve the WHERE clause. One interesting outcome is that the instance was able to increase its QPS by 2x after the index was added, as CPU/full table scans were no longer limiting performance. On average, query time dropped from 20s to only 661ms.

Summary:

  1. Applying the correct troubleshooting steps to your problems is crucial:
    a) Understand what resources have been saturated.
    b) Understand what, if anything, is causing errors.
    c) From there you can dive into the areas related to that resource and start to narrow down the issue.
    d) Tackle the problems bit by bit.
  2. Having the right tools for the job is key to success. PMM is a great example of a tool that can help you quickly identify, drill into, and fix bottlenecks.
  3. Have realistic load tests. In this case, the new release had been tested at a concurrency level that did not match production.
  4. By identifying the culprit query we were able to:
    a) Drop average query time from 20s to 661ms
    b) Increase QPS by 2x
    c) Reduce CPU usage to 1/4 of its level prior to our intervention

Disclosure: for security reasons, sensitive information such as database, table, and column names has been modified, and the graphs recreated, to simulate a similar problem.

Feb 20, 2019

Percona Monitoring and Management (PMM) 1.17.1 Is Now Available


Percona Monitoring and Management

Percona Monitoring and Management (PMM) is a free and open-source platform for managing and monitoring MySQL®, MongoDB®, and PostgreSQL performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL®, MongoDB®, and PostgreSQL® servers to ensure that your data works as efficiently as possible.

In this release, we are introducing support for detection of our upcoming PMM 2.0 release in order to avoid potential version conflicts in the future, as PMM 1.x will not be compatible with PMM 2.x.

Another improvement in this release is that we have updated the tooltips for the MySQL Query Response Time dashboard, providing a description of what the graphs display along with links to related documentation resources. An example of the tooltips in action:

The PMM 1.17.1 release provides fixes for the CVE-2018-16492 and CVE-2018-16487 vulnerabilities related to Node.js modules. The authentication system used in PMM is not susceptible to the attacks described in these CVE reports: PMM does not use client-side data objects to control user access.

In release 1.17.1 we have included two improvements and fixed nine bugs.

Improvements

  • PMM-1339: Improve tooltips for MySQL Query Response Time dashboard
  • PMM-3477: Add Ubuntu 18.10 support

Fixed Bugs

  • PMM-3471: Fix global status metric names in mysqld_exporter for MySQL 8.0 compatibility
  • PMM-3400: Duplicate column in the Query Analytics dashboard Explain section
  • PMM-3353: postgres_exporter does not work with PostgreSQL 11
  • PMM-3188: Duplicate data on Amazon RDS / Aurora MySQL Metrics dashboard
  • PMM-2615: Fix wrong formatting in log which appears if pmm-qan-agent process fails to start
  • PMM-2592: MySQL Replication Dashboard shows error with multi-source replication
  • PMM-2327: Member State Uptime and Max Member Ping time charts on the MongoDB ReplSet dashboard return an error
  • PMM-955: Fix format of User Time and CPU Time Graphs on MySQL User Statistics dashboard
  • PMM-3522: CVE-2018-16492 and CVE-2018-16487

Help us improve our software quality by reporting any Percona Monitoring and Management bugs you encounter using our bug tracking system.

Feb 05, 2019

Upcoming Webinar Wed 2/6: Percona Software News and Roadmap Update

Percona Software News and Roadmap Update Webinar

Join Percona CEO Peter Zaitsev as he presents Percona Software News and Roadmap Update on Wednesday, February 6, 2019, at 11:00 AM PST (UTC-8) / 2:00 PM EST (UTC-5).

Register Now

Come and listen to Percona CEO Peter Zaitsev discuss what’s new in Percona open source software. Topics include Percona Server for MySQL and MongoDB, Percona XtraBackup, Percona Toolkit, Percona XtraDB Cluster and Percona Monitoring and Management.

During this webinar, Peter will talk about newly released features in Percona software. He will also show a few quick demos and share with you highlights from the Percona open source software roadmap.

Peter will also talk about new developments in Percona commercial services and finish with a Q&A.

Register today to join Peter for his Percona Software News and Roadmap Update.

Jan 31, 2019

A New PMM Dashboard to Monitor Memory Usage!

Dashboard to Monitor Memory Usage in Linux

While the PMM team works hard on our PMM 2.0 release, we have been working on a few things in the background which we'd like to show off! In particular, we have developed a new dashboard that displays metrics related to memory usage on Linux systems. The dashboard leverages information collected by node_exporter. The graphs take advantage of /proc filesystem files, specifically:

  • meminfo: Provides information about distribution and utilization of memory. This varies by architecture and compile options.
  • vmstat: Provides information about block IO and CPU activity in addition to memory.
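If you are curious about the raw counters behind these graphs, you can peek at the same files directly on any Linux host; a quick sketch (exact field names vary by kernel version):

grep -E 'MemTotal|MemFree|MemAvailable|Buffers|Cached|Dirty' /proc/meminfo
grep -E '^(pgpgin|pgpgout|pswpin|pswpout|pgfault|pgmajfault) ' /proc/vmstat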

The information is split into five sections:

  1. Total Memory
  2. VMM (Virtual Memory Manager) Statistics
  3. Memory Statistics
  4. Number and Dynamic of Pages
  5. Pages per Zone

The dashboard will be included as part of the PMM 2.0 release. For you early adopters, you can get it from Grafana Labs and install it alongside your existing dashboards – it won't overwrite anything!

Please note that the dashboard can be imported by ID (9692) in Grafana 5.4.2 and later, or downloaded and imported manually in older Grafana versions.

Jan 28, 2019

Monitor and Optimize Slow Queries with PMM and EverSQL – Part 2

percona_pmm_eversql

EverSQL is a platform that intelligently tunes your SQL queries by providing query optimization recommendations and feedback on missing indexes. This is the second post of our EverSQL series; if you missed our introductory post, take a look there first and then come back to this article.

We’ll use the Stackoverflow data set again as we did in our first post.

Diving into query optimization

We'll grab the worst performing query in the list from PMM and optimize it. This query builds a list of the 100 most recent posts that have at least one comment with a score greater than two, and involves joining two large tables – posts and comments. The original runtime of that query is above 20 minutes and it causes high load on the server while running.

worst-query-in-PMM

Assuming you have EverSQL’s chrome extension installed, you’ll see a new button in the PMM Query Analytics page, allowing you to send the query and schema structure directly to EverSQL, to retrieve indexing and query optimization recommendations.

eversql recommendations

 

eversql-dashboard1

After implementing EverSQL’s recommendations, the query’s execution duration significantly improved:

improved-query-response-time

Optimization Internals

So what was the actual optimization in this specific case? And why did it work so well? Let’s look at the original query:

SELECT
   p.title
FROM
   so.posts p
       INNER JOIN
   so.comments c ON p.id = c.postid
WHERE
c.score > 2
GROUP BY p.id
ORDER BY p.creationdate DESC
LIMIT 100;

The tables’ structure:

CREATE TABLE `posts` (
  `Id` int(11) NOT NULL,
  `CreationDate` datetime NOT NULL,
  ...
  PRIMARY KEY (`Id`),
  KEY `posts_idx_creationdate` (`CreationDate`),
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `comments` (
  `Id` int(11) NOT NULL,
  `CreationDate` datetime NOT NULL,
  `PostId` int(11) NOT NULL,
  `Score` int(11) DEFAULT NULL,
  ....
  PRIMARY KEY (`Id`),
  KEY `comments_idx_postid` (`PostId`),
  KEY `comments_idx_postid_score` (`PostId`,`Score`),
  KEY `comments_idx_score` (`Score`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

This query will return the post title of the latest 100 stackoverflow posts, which had at least one popular comment (with a score higher than two). The posts table contains 39,646,923 records, while the comments table contains 64,510,258 records.

This is the execution plan MySQL (v5.7.20) chose:

original-execution-plan

One of the challenges with this query is that the GROUP BY and ORDER BY clauses contain different fields, which prevents MySQL from using an index for the ORDER BY. As MySQL's documentation states:

“In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it may still use indexes to find the rows that match the WHERE clause. Examples:  … The query has different ORDER BY and GROUP BY expressions.”.

Now let’s look into the optimized query:

SELECT
   p.title
FROM
   so.posts p
WHERE
   EXISTS( SELECT
           1
       FROM
           so.comments c
       WHERE
           p.id = c.postid AND c.score > 2)
ORDER BY p.creationdate DESC
LIMIT 100;

Since the comments table is joined in this query only to check for the existence of matching comments for each post, we can use an EXISTS subquery instead. This will allow us to avoid inflating the results (by using JOIN) and then deflating them (by using GROUP BY), which are costly operations.

Now that the GROUP BY is redundant and removed, the database can optionally choose to use an index for the ORDER BY clause.

The new execution plan MySQL chooses is:


As mentioned above, this transformation reduced the query execution duration from ~20 minutes to 370ms.

We hope you enjoyed this post. Please let us know your experiences using the integration between PMM Query Analytics and EverSQL!

Co-Author: Tomer Shay

Tomer Shay, EverSQL

 

Tomer Shay is the Founder of EverSQL. He loves being where the challenge is. In the last 12 years, he had the privilege to code a lot and lead teams of developers, while focusing on databases and performance. He enjoys using technology to bring ideas into reality, help people and see them smile.

Jan 22, 2019

Monitor and Optimize Slow Queries with PMM and EverSQL – Part One

PMM EverSQL optimization integration

A common challenge with continuously deployed applications is that new and modified SQL queries are constantly being introduced to the application. Many companies choose to use a database monitoring system (such as PMM) to identify those slow queries. But identifying slow queries is only the start – what about actually optimizing them?

In this post we’ll demonstrate a new way to both identify and optimize slow queries, by utilizing the recent integration of Percona Monitoring and Management with EverSQL Query Optimizer via Chrome browser extension. This integration allows you to identify slow queries using PMM, and optimize them automatically using EverSQL Query Optimizer.

Optimizing queries with PMM & EverSQL

We’re using PMM to monitor our MySQL instance, which was pre-loaded with the publicly available StackOverflow dataset. PMM is configured to monitor for slow queries from MySQL’s slow log file.

monitor slow queries dashboard on PMM

We’ll begin with a basic example of how EverSQL can provide value for  a simple SELECT statement. In a follow-up blog post we’ll go through a more sophisticated multi-table query to show how response time can be reduced from 20 minutes to milliseconds(!) using EverSQL.

Let’s have a look at one of the slow queries identified by PMM:

PMM EverSQL optimization integration

In this example, the posts table contains two indexes by default (in addition to the primary key): one on the AnswerCount column, and the other on the CreationDate column.

CREATE TABLE `posts` (
 `Id` int(11) NOT NULL,
 `AcceptedAnswerId` int(11) DEFAULT NULL,
 `AnswerCount` int(11) DEFAULT NULL,
 `ClosedDate` datetime DEFAULT NULL,
 ….
 `CreationDate` datetime NOT NULL,
  ….
 `ViewCount` int(11) NOT NULL,
 PRIMARY KEY (`Id`),
 KEY `posts_idx_answercount` (`AnswerCount`),
 KEY `posts_idx_creationdate` (`CreationDate`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

As you can see below, EverSQL identifies that a composite index containing both columns can be more beneficial in this case, and recommends adding an index on posts(AnswerCount, CreationDate).

EverSQL optimization report
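Applying that recommendation with pt-online-schema-change might look roughly like the following sketch; the index name and the DSN values (user, schema, table) are placeholders, so adjust them to your environment:

pt-online-schema-change --alter "ADD KEY posts_idx_answercount_creationdate (AnswerCount, CreationDate)" --execute u=root,D=so,t=posts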

After applying the schema modification with pt-online-schema-change, we are able to observe in PMM that the query execution duration changed from 3m 40s to 83 milliseconds!

execution time improvement with EverSQL

 

Note that this extension is available for Chrome from the Chrome Web Store:

EverSQL for Database Monitoring Applications

Summary

If you’re looking for an easy way to both monitor for slow queries and quickly optimize them, consider deploying Percona Monitoring and Management and then integrating it with EverSQL’s Chrome extension!

Co-Author: Tomer Shay

Tomer Shay, EverSQL

 

Tomer Shay is the Founder of EverSQL. He loves being where the challenge is. In the last 12 years, he had the privilege to code a lot and lead teams of developers, while focusing on databases and performance. He enjoys using technology to bring ideas into reality, help people and see them smile.

 

Jan 15, 2019

Customizing Per-Process Metrics in PMM

Process Memory Usage - a filtered graph in PMM

If you have set up per-process metrics in Percona Monitoring and Management, you may have found yourself needing to tune it further, to not only group processes together but also display some of them in isolation. In this blog post we will explore how to modify the rules for grouping processes, so that you can make the most out of this awesome PMM integration.

Let’s say you have followed the link above on how to set up the per-process metrics integration on PMM, and you have imported the dashboard to show these metrics. You will see something like the following:

PMM database and system monitoring and management software

This is an internal testing server we use, on which you can see a high number of VBoxHeadless (29) and mysqld (99) processes running. All the metrics in the dashboard will be grouped by the name of the command used. But what if we want to see metrics for only one of these processes in isolation? As things stand, we will not be able to do so. It may not make sense in a testing environment, but if you are running multiple mysqld processes (or mongos, postgres, etc.) bound to different ports, you may want to see metrics for each of them separately.

Modifying the configuration file

Enter all.yaml!

In the process-exporter documentation on using a configuration file, we can see the following:

The general format of the -config.path YAML file is a top-level process_names section, containing a list of name matchers. […] A process may only belong to one group: even if multiple items would match, the first one listed in the file wins.

This means that even if we have two rules that would match a process, only the first one will be taken into account. This will allow us to both list processes by themselves, and not miss any non-grouped process. How? Let’s imagine we have the following processes running:

mysqld --port=1
mysqld --port=2
mysqld --port=3
mysqld --port=4

If we wanted to be able to tell apart the instances running on ports 1 and 2 from the others, we could use the following rules:

- name: "mysqld_port_1"
 cmdline:
 - '.*mysqld.*port=1.*'
- name: "mysqld_port_2"
 cmdline:
 - '.*mysqld.*port=2.*'
- name: "{{.Comm}}"
 cmdline:
 - '.+'

In cmdline we need the regular expression to match against the running process's command line. In this case, we made use of the fact that they were using different ports, but any difference in the command strings can be used. The last rule is the one that will default to "anything else" (with a regular expression that matches anything).

The default rule at the end will make sure you don't miss any other process, so unless you want metrics collected for only some processes, you should always include it.

A real life working example of configuring per-process metrics

In case all this generic information didn't make much sense, here is a concrete example, which will hopefully make everything fit together nicely.

In this example we want the mysqld instance using the mysql_sandbox16679.sock socket isolated from all the others, and the VM whose ID ends in 97eafa2795da listed on its own. All other processes are to be grouped together using the basename of the executable.

You can check the output from ps aux to see the full command used. For instance:

shell> ps aux | grep 97eafa2795da
agustin+ 27785  0.7 0.2 5619280 542536 ?      Sl Nov28 228:24 /usr/lib/virtualbox/VBoxHeadless --comment centos_node1_1543443575974_22181 --startvm a0151e29-35dd-4c14-8e37-97eafa2795da --vrde config

So, we can use the following regular expression for it (we use .* to match any string):

.*VBoxHeadless.*97eafa2795da.*

The same applies to the regular expression for the mysqld process.

The configuration file will end up looking like:

shell>  cat /etc/process-exporter/all.yaml
process_names:
 - name: "Custom VBox"
   cmdline:
   - '.*VBoxHeadless.*97eafa2795da.*'
 - name: "Custom MySQL"
   cmdline:
   - '.*mysqld.*mysql_sandbox16679.sock.*'
 - name: "{{.Comm}}"
   cmdline:
   - '.+'

Let's restart the service so that the changes apply, and check the graphs after five minutes. Note that you may have to reload the page for the changes to show up.

shell> systemctl restart process-exporter

After refreshing, we will see the new list of processes in the drop-down list:

A new list of processes in PMM after filtering

And after we select them, we will be able to see data for those processes in particular:

Thanks to the default configuration at the end, we are still capturing data from all the other mysqld processes. However, they will have their own group, as mentioned before:

System Processes Metrics graph in PMM

 
