PMM Dump GUI in Percona Monitoring and Management 2.41.0
A couple of weeks ago, we announced the first GA release of PMM Dump: a new support tool that exports Percona Monitoring and Management (PMM) metrics and Query Analytics (QAN) data so they can be handed off to an expert engineer for review and performance suggestions. That blog post introduced the command-line interface. A week later, PMM 2.41.0 was released with a GUI for PMM Dump.
If you are a database administrator or developer, you may encounter some issues that require external assistance. Whether you seek help from Percona Support or the Community, you must provide sufficient information about your database performance and configuration.
Having all the data at hand is crucial for finding the root cause of an issue and providing the best solution. Without it, Percona experts may have to ask multiple questions and request additional information, which can delay the resolution process and increase your frustration. Gathering such information can be challenging and time-consuming, however, and providing direct access to PMM, even through a VPN, is often impossible.
That’s why we have introduced a new feature for PMM that allows you to collect the necessary data about your database performance with just one click. This feature will save you time and effort and enable Percona experts to diagnose and resolve your problem faster.
By using PMM Dump in PMM, you can avoid back-and-forth communication and get your problem solved as quickly as possible.
You can use this feature when you report a Support case as a Percona customer or when you report a bug in our Jira as a Community user. This blog post will show you how to use this feature and what kind of data it collects.
PMM Dump is included in PMM server distribution, and you can try it straight away.
The PMM Dump menu is located in the bottom-left corner, under the “Help” group:
After you click “PMM Dump,” a new dashboard opens with a “Create dataset” button.
Click the “Create dataset” button to create your first dump.
In the window that opens, you can choose the service names you want to export or leave the default (“All Services”). By default, PMM Dump exports data collected in the last 12 hours (the PMM default); you can change this range by adjusting “Start Time” and “End Time.” To export QAN data, select “Export QAN.”
The “Ignore load” checkbox is there in case PMM Dump cannot finish due to the load protections built into the tool. If you want to keep the protection but raise its limits, use the command-line tool with the custom options --max-load and --critical-load, as described here.
The same applies if you need advanced filtering or any of the other custom options that PMM Dump provides; a sketch of a CLI run follows below. I hope that future versions of PMM will offer full GUI support for all PMM Dump options.
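For illustration, a command-line export with raised load limits might look like the sketch below. Apart from --max-load and --critical-load, which are mentioned above, the export subcommand, --pmm-url, --dump-qan, and the threshold format are assumptions about the CLI, and the URL, credentials, and values are placeholders, so check the PMM Dump documentation for the exact syntax.

# A minimal sketch of a PMM Dump CLI export with raised load-protection limits.
# The URL, credentials, and threshold values are illustrative placeholders;
# verify flag names and syntax against the PMM Dump documentation.
pmm-dump export \
  --pmm-url="http://admin:admin@127.0.0.1" \
  --dump-qan \
  --max-load="CPU=70,RAM=80" \
  --critical-load="CPU=90,RAM=95"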
After you click the “Create dataset” button, a dump will be created and available on the PMM Dump dashboard.
Once the dump is complete, the status changes from “Pending” to “Success.” Here you can see details about your dump:
This will be handy after you create a few of them.
If you click on the dots, you will see the options:
“Download” allows you to download the exported data locally. “View logs” will open a modal window with the PMM Dump log file. “Delete” will remove the dump file.
If you are a Percona Support customer, you can safely upload the dump to the Percona SFTP server by clicking “Send to Support.”
In this case, you need to open a Support case and then create individual credentials for the Percona SFTP server. Enter them into the “Name” and “Password” fields. Put your ticket number into the Directory field. You will find more details in the Knowledge Base inside the Support portal.
We require individual credentials for each ticket for extra protection of customers’ data. Once the issue is resolved and the corresponding ticket is closed, we remove all data provided by customers. You can read more about Percona security and data privacy practices at https://www.percona.com/legal/percona-security-and-data-privacy-practices.
After you create multiple dumps, you may want to perform bulk actions on them.
Choose a few dumps or click on the top checkbox to select all of them, then choose any of the available operations: “Send to Support,” “Download,” or “Delete.”
Dump files in PMM are stored in the pmm-data Docker volume, so keep an eye on how much space they consume and delete old dump files when they are no longer needed.
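If you want a quick way to check this from the host, something like the following sketch can help; the /srv/dump path inside the container is an assumption, so adjust it to wherever your dumps actually live.

# Check overall Docker volume usage from the host:
docker system df -v
# Inspect the dump directory size inside the container
# (the /srv/dump path is an assumption; adjust as needed):
docker exec -t pmm-server du -sh /srv/dump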
You will find more information in the PMM Dump topic of the PMM documentation.
Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.
How to Filter or Customize Alert Notifications in Percona Monitoring and Management (Subject and Body)
In many scenarios, the standard alert notification template in Percona Monitoring and Management (PMM), while comprehensive, may not align perfectly with specific operational needs. This often leads to an excess of details in the notification’s “Subject” and “Body”, cluttering your inbox with information that may not be immediately relevant.
The focus today is on tailoring these notifications to fit your unique requirements. We’ll guide you through the process of editing the “Subject” and “Body” in the PMM UI, ensuring that the alerts you receive are filtered and relevant to your specific business context.
Please note: This post assumes a foundational understanding of basic alerting and configuration in PMM. For those new to these concepts, we recommend consulting the documentation on “SMTP” and “PMM Integrated/Grafana alert” for a primer.
Customizing the “Subject” section of an alert notification
1) The default “Subject” looks something like the example below.
2) Now, let’s edit the “Subject” content.
I) First, we need to create a new message template called “email.subject” in Alerting -> Contact points with the following content.
Template name: email.subject
{{ define "email.subject" }}{{ range .Alerts }} Percona Alert | {{ .Labels.alertname }} | {{ .Labels.node_name }} | {{ .Labels.DB }} {{ end }}{{ end }}
Here, we simply use range to iterate over the alerts, extracting the alert name, node name, and database from each alert’s labels.
The provided template is written in Go’s templating language. For a more detailed understanding of the syntax and usage of templates, please refer to the official manual.
II) Then, we need to edit the default contact point inside “Alerting -> Contact points” and define the following “Subject” under “Optional Email Settings.”
{{ template "email.subject" . }}
III) After successfully testing, we can save the changes.
That’s it. Now, if the alert triggers, we will observe a customized subject in the email.
Example:
Customizing the “Body” section of an alert notification
1) Let’s first see how the notifications appear with the native alerting. This is a basic notification alert that triggers when the database/MySQL is down. As we can see, it includes additional information, such as various labels and a summary.
2) Now, suppose we want to get rid of some content and keep only a few relevant details. This can be achieved by following the steps outlined below.
I) Go to Alerting -> Contact points and add a new “Message template.”
II) Next, create a notification template named “email” with two templates in the content: “email.message_alert” and “email.message”.
The “email.message_alert” template is used to display the labels and values for each firing and resolved alert, while the “email.message” template contains the email’s structure.
Template name: email.message
{{/* These are the key-value pairs that we want to display in our alerts. */}}
{{- define "email.message_alert" -}}
AlertName = {{ index .Labels "alertname" }}{{ "\n" }}
Database = {{ index .Labels "DB" }}{{ "\n" }}
Node_name = {{ index .Labels "node_name" }}{{ "\n" }}
Service_name = {{ index .Labels "service_name" }}{{ "\n" }}
Service Type = MySQL {{ "\n" }}
Severity = {{ index .Labels "severity" }}{{ "\n" }}
TemplateName = {{ index .Labels "template_name" }}{{ "\n" }}
{{- end -}}
{{/* Next, the main section that governs the structure of the message. */}}
{{ define "email.message" }}
There are {{ len .Alerts.Firing }} firing alert(s), and {{ len .Alerts.Resolved }} resolved alert(s){{ "\n" }}
{{/* Finally, the per-alert template is invoked for each firing and resolved alert. */}}
{{ if .Alerts.Firing -}}
Firing alerts:{{ "\n" }}
{{- range .Alerts.Firing }}
- {{ template "email.message_alert" . }}
{{- end }}
{{- end }}
{{ if .Alerts.Resolved -}}
Resolved alerts:{{ "\n" }}
{{- range .Alerts.Resolved }}
- {{ template "email.message_alert" . }}
{{- end }}
{{- end }}
{{ end }}
The above template is written in Go’s templating language. To learn more about the syntax and template usage, refer to the manual.
III) Lastly, simply save the template.
3) Next, we will edit the default “Contact points” and define the content below under “Update contact point -> Optional Email settings -> Message” for email. Similarly, you can add other channels as well, like Telegram, Slack, etc.
Execute the template from the “message” field in your contact point integration.
{{ template "email.message" . }}
Percona Alerting comes with a pre-configured default notification policy. This policy utilizes the grafana-default-email contact point and is automatically applied to all alerts that do not have a custom notification policy assigned to them.
Reference: https://docs.percona.com/percona-monitoring-and-management/use/alerting.html#notification-policies
After verifying a successful test message, we can save the updated contact point.
4) Finally, once the alert is triggered, you will be able to see the customized notification reflecting only the defined key/values.
Moreover, we can also use label loops instead of defining separate key/value pairs as we did in the steps above. This way, all the default parameters are included in the iteration without explicitly defining each of them.
Here, we use a range to iterate over the alerts, so that dot refers to the current alert in the list, and then range over the sorted label pairs, so that dot refers to the current label. Inside the inner range, “.Name” and “.Value” print the name and value of each label.
{{/* Applying the label loop option */}}
{{- define "email.message_alert" -}}
Label Loop: {{ range .Labels.SortedPairs }} {{ .Name }} => {{ .Value }} {{ end }}
{{- end -}}
{{ define "email.message" }}
There are {{ len .Alerts.Firing }} firing alert(s), and {{ len .Alerts.Resolved }} resolved alert(s){{ "\n" }}
{{ if .Alerts.Firing -}}
Firing alerts:{{ "\n" }}
{{- range .Alerts.Firing }}
- {{ template "email.message_alert" . }}
{{- end }}
{{- end }}
{{ if .Alerts.Resolved -}}
Resolved alerts:{{ "\n" }}
{{- range .Alerts.Resolved }}
- {{ template "email.message_alert" . }}
{{- end }}
{{- end }}
{{ end }}
To add more details, such as the summary and description annotations, to the customized alerts, the following template changes can be made.
I) First, you can add/update the “Summary and annotations” section inside the “alert rule” based on your preference.
II) Then, edit the message template below (“email.message”) in Alerting -> Contact points with the updated changes.
Template name: email.message
{{- define "email.message_alert" -}}
AlertName = {{ index .Labels "alertname" }}{{ "\n" }}
Database = {{ index .Labels "DB" }}{{ "\n" }}
Node_name = {{ index .Labels "node_name" }}{{ "\n" }}
Service_name = {{ index .Labels "service_name" }}{{ "\n" }}
Service Type = {{ index .Labels "service_type" }}{{ "\n" }}
Severity = {{ index .Labels "severity" }}{{ "\n" }}
TemplateName = {{ index .Labels "template_name" }}{{ "\n" }}
{{- end -}}
{{ define "email.message" }}
There are {{ len .Alerts.Firing }} firing alert(s), and {{ len .Alerts.Resolved }} resolved alert(s){{ "\n" }}
{{ if .Alerts.Firing -}}
Firing alerts:{{ "\n" }}
{{- range .Alerts.Firing }}
- {{ template "email.message_alert" . }}
- {{ template "alerts.summarize" . }}
{{- end }}
{{- end }}
{{ if .Alerts.Resolved -}}
Resolved alerts:{{ "\n" }}
{{- range .Alerts.Resolved }}
- {{ template "email.message_alert" . }}
- {{ template "alerts.summarize" . }}
{{- end }}
{{- end }}
{{ end }}
{{ define "alerts.summarize" -}}
{{ range .Annotations.SortedPairs }} {{ .Name }} = {{ .Value }} {{ end }}
{{ end }}
Reference: https://grafana.com/blog/2023/04/05/grafana-alerting-a-beginners-guide-to-templating-alert-notifications/
Sometimes, the alert notifications might appear on a single line instead of on separate lines for each key. Although this is not regular behavior, it can be fixed with the changes below.
I) Access the PMM Server:
sudo docker exec -it pmm-server bash
II) Thereafter, edit the file “/usr/share/grafana/public/emails/ng_alert_notification.html” and replace the text between lines 288 and 290 as shown below.
Replace:
{{ if gt (len .Message) 0 }} <div style="white-space: pre-line;" align="left">{{ .Message }} {{ else }}
With:
{{ if gt (len .Message) 0 }} <span style="white-space: pre-line;">{{ .Message }}</span> {{ else }}
Note: Please take a backup before making any changes to the PMM Server files. These changes may also be lost during a PMM upgrade, especially when Grafana is upgraded as part of PMM, so keep a backup of the edited version for later restoration.
III) Finally, you can restart the Grafana service.
supervisorctl restart grafana
Summary
Filtering alert notifications is useful for concealing extraneous information from the relevant users. Only the specified elements are displayed in the notification email, preventing unnecessary clutter in the alert content.
MySQL Capacity Planning
As businesses grow and develop, the requirements for their data platform grow along with them. As such, one of the more common questions I get from my clients is whether their system will be able to endure an anticipated load increase. Or, worse yet, sometimes I get questions about regaining normal operations after a traffic increase has destabilized performance.
As the subject of this blog post suggests, this all comes down to proper capacity planning. Unfortunately, this topic is more of an art than a science, given that there is really no foolproof algorithm or approach that can tell you exactly where you might hit a bottleneck with server performance. But we can discuss common bottlenecks, how to assess them, and have a better understanding as to why proactive monitoring is so important when it comes to responding to traffic growth.
Hardware considerations
The first thing we have to consider here is the resources that the underlying host provides to the database. Let’s take a look at each common resource. In each case, I’ll explain why a 2x increase in traffic doesn’t necessarily mean you’ll have a 2x increase in resource consumption.
Memory
Memory is one of the easier resources to predict and forecast and one of the few places where an algorithm might help you, but for this, we need to know a bit about how MySQL uses memory.
MySQL has two main memory consumers: global caches, like the InnoDB buffer pool and MyISAM key cache, and session-level caches, like the sort buffer, join buffer, random read buffer, etc.
Global memory caches are static in size, as they are defined solely by the configuration of the database itself. This means that if you have a buffer pool set to 64GB, an increase in traffic isn’t going to make it any bigger or smaller. What changes is how session-level caches are allocated, which may result in larger memory consumption.
A tool that was popular at one time for calculating memory consumption was mysqlcalculator.com. Using this tool, you could enter in your values for your global and session variables and the number of max connections, and it would return the amount of memory that MySQL would consume. In practice, this calculation doesn’t really work, and that’s due to the fact that caches like the sort buffer and join buffer aren’t allocated when a new connection is made; they are only allocated when a query is run and only if MySQL determines that one or more of the session caches will be needed for that query. So idle connections don’t use much memory at all, and active connections may not use much more if they don’t require any of the session-level caches to complete their query.
The way I get around this is to estimate the average amount of memory consumed by sessions as follows:
({total memory consumed by MySQL} − {sum of all global caches}) / {average number of active sessions}
Keep in mind that even this isn’t going to be super accurate, but at least it gives you an idea of what common session-level memory usage looks like. If you can figure out what the average memory consumption is per active session, then you can forecast what 2x the number of active sessions will consume.
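To make that concrete, here is a hypothetical worked example; the figures are invented purely for illustration.

# Illustrative numbers only: suppose mysqld consumes 80 GB in total,
# global caches (buffer pool, log buffer, etc.) account for 68 GB,
# and PMM shows an average of 40 active sessions.
echo $(( (80 - 68) * 1024 / 40 ))   # ~307 MB of session memory per active session
# Doubling to 80 active sessions would then forecast roughly another
# 12 GB of session-level memory on top of the global caches.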
This sounds simple enough, but in reality, there could be more to consider. Does your traffic increase come with updated code changes that change the queries? Do these queries use more caches? Will your increase in traffic mean more data, and if so, will you need to grow your global cache to ensure more data fits into it?
With the points above in mind, we can generally predict what MySQL will do with memory under a traffic increase, but unforeseen changes can still alter the amount of memory that sessions use.
The solution is proactive monitoring using time-lapse metrics monitoring like what you would get with Percona Monitoring and Management (PMM). Keep an eye on your active session graph and your memory consumption graph and see how they relate to one another. Checking this frequently can help you get a better understanding of how session memory allocation changes over time and will give you a better understanding of what you might need as traffic increases.
CPU
When it comes to CPU, there are obviously a large number of factors that contribute to usage. The most common is the queries you run against MySQL itself. However, a 2x increase in traffic may not lead to a 2x increase in CPU because, as with memory, it really depends on the queries that are run against the database. In fact, the most common cause of massive CPU increase that I’ve seen isn’t traffic growth; it’s code changes that introduced inefficient revisions to existing queries or new queries. As such, a 0% increase in traffic can result in full CPU saturation.
This is where proactive monitoring comes into play again. Keep an eye on CPU graphs as traffic increases. In addition, you can collect full query profiles on a regular basis and run them through tools like pt-query-digest or look at the Query Analyzer (QAN) in PMM to keep track of query performance, noting where queries may be less performant than they once were, or when new queries have unexpected high load.
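For example, a periodic digest of the slow query log might look like the sketch below; the slow log path and output naming are assumptions about your setup.

# Summarize the top queries from the slow log
# (the log path assumes a typical Linux install; adjust for yours):
pt-query-digest --limit 10 /var/lib/mysql/slow.log > /tmp/digest_$(date +%F).txt
# Compare digests across dates to spot queries whose load has regressed.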
Disk space
A 2x increase in traffic doesn’t mean a 2x increase in disk space consumption. It may increase the rate at which disk space is accumulated, but that also depends on how much of the traffic increase is write-focused. If you have a 4x increase in reads and a 1.05x increase in writes, then you don’t need to be overly concerned about disk space consumption rates.
Once again, we look at proactive monitoring to help us. Using time-lapse metrics monitoring, we can monitor overall disk consumption and the rate at which consumption occurs and then predict how much time we have left before we run out of space.
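A simple way to sample the raw numbers behind those graphs is shown below; the data directory path is an assumption based on a default install.

# Sample free space and data-directory size with a timestamp so you can
# compute the growth rate over time (default datadir path assumed):
date
df -h /var/lib/mysql
du -sh /var/lib/mysql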
Disk IOPS
The amount of disk IOPS your system uses will be somewhat related to how much of your data can fit into memory. Keep in mind that the disk will still need to be used for background operations as well, including writing to the InnoDB redo log, persisting/checkpointing data changes to table spaces from the redo log, etc. But, for example, if you have a large traffic increase that’s read-dependent and all of the data being read in the buffer pool, you may not see much of an IOPS increase at all.
Guess what we should do in this case? If you said “proactive monitoring,” you get a gold star. Keep an eye out for metrics related to IOPS and disk utilization as traffic increases.
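Outside of PMM, a quick spot-check of device IOPS and utilization can be done with iostat from the sysstat package, for example:

# Print extended device statistics every 5 seconds, 3 times;
# watch r/s and w/s (IOPS) and %util for signs of saturation:
iostat -dxm 5 3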
Before we move on to the next section, consider how differently disk space and disk IOPS behave when exhausted. When you saturate disk IOPS, your system is going to run slow. If you fill up your disk, your database will start throwing errors and may stop working completely. It’s important to understand the difference so you know how to act based on the situation at hand.
Database engine considerations
While resource utilization/saturation are very common bottlenecks for database performance, there are limitations within the engine itself. Row-locking contention is a good example, and you can keep an eye on row-lock wait time metrics in tools like PMM. But, much like any other software that allows for concurrent session usage, there are mutexes/semaphores in the code that are used to limit the number of sessions that can access shared resources. Information about this can be found in the semaphores section in the output of the “SHOW ENGINE INNODB STATUS” command.
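For instance, you can pull just the semaphores section out of the engine status with something like the snippet below; the awk range assumes the usual ordering of the status sections.

# Extract the SEMAPHORES section from SHOW ENGINE INNODB STATUS
# (assumes the usual section ordering of the status output):
mysql -e "SHOW ENGINE INNODB STATUS\G" | awk '/SEMAPHORES/,/^TRANSACTIONS/'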
Unfortunately, this is the single hardest bottleneck to predict and is based solely on the use case. I’ve seen systems running 25,000+ queries per second with no issue, and I’ve also seen systems running ~5,000 queries per second that ran into issues with mutex contention.
Keeping an eye on metrics for OS context switching will help with this a little bit, but unfortunately this is a situation where you normally don’t know where the wall is until you run right into it. Adjusting variables like innodb_thread_concurrency can help with this in a pinch, but when you get to this point, you really need to look at query efficiency and horizontal scaling strategies.
Another thing to consider is configurable hard limits like max_connections, where you can limit the upper bound of the number of connections that can connect to MySQL at any given time. Keep in mind that increasing this value can impact memory consumption as more connections will use more memory, so use caution when adjusting upward.
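To see how close you have come to that ceiling historically, compare the peak connection count against the configured maximum, for example:

# Compare the historical connection peak against the configured limit:
mysql -e "SHOW GLOBAL STATUS LIKE 'Max_used_connections'; SELECT @@max_connections;"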
Conclusion
Capacity planning is not something you do once a year or so as part of a general exercise, nor something you do when management calls to let you know a big sale is coming up that will increase the load on the hosts. It’s part of the regular day-to-day activity for anyone operating in a database administrator role.
Proactive monitoring plays a big part in capacity planning. I’m not talking about alert-based monitoring that hits your pager when it’s already too late, but evaluating metrics usage on a regular basis to see what the data platform is doing, how it’s handling its current traffic, etc. In most cases, you don’t see massive increases in traffic all at once; typically, it’s gradual enough that you can monitor as it increases and adjust your system or processes to avoid saturation.
Tools like PMM and the Percona Toolkit play a big role in proactive monitoring, and both are open source and free to use. So if you don’t have tools like this in place, the price point makes integrating them an easy decision to consider.
Also, if you still feel concerned about your current capacity planning, you can always reach out to Percona Managed Services for a performance review or query review that will give you a detailed analysis of the current state of your database along with recommendations to keep it as performant as possible.