At Percona, we’re committed to providing you with the best database monitoring and management tools. With the release of Percona Monitoring and Management 3 (PMM 3), we’re now entering a critical phase in the lifecycle of PMM 2. We understand that PMM 2 remains a vital tool for many of our users, and we want […]
20
2025
Percona Monitoring and Management 2: Clarifying the End-of-Life and Transition to PMM 3
08
2024
Troubleshooting PostgreSQL on Kubernetes With Coroot
Coroot, an open source observability tool powered by eBPF, went generally available with version 1.0 last week. As this tool is cloud-native, we were curious to know how it can help troubleshoot databases on Kubernetes. In this blog post, we will see how to quickly debug PostgreSQL with Coroot and Percona Operator for PostgreSQL. Prepare. Install Coroot. The easiest […]
06
2023
Percona Operators Custom Resource Monitoring With Kube-state-metrics

There are more than 300 Operators in operatorhub, and the number is growing. Percona Operators allow users to easily manage complex database systems in a Kubernetes environment. With Percona Operators, users can easily deploy, monitor, and manage databases orchestrated by Kubernetes, making it easier and more efficient to run databases at scale.
Our Operators come with Custom Resources that have their own statuses and fields to ease monitoring and troubleshooting. For example, the PerconaServerMongoDBBackup resource carries information about the backup, such as whether it succeeded or failed. There are other ways to monitor the backup, such as storage monitoring or Pod status, but why bother if the Operator already provides this information?
In this article, we will see how someone can monitor Custom Resources that are created by the Operators with kube-state-metrics (KSM), a standard and widely adopted service that listens to the Kubernetes API server and generates metrics. These methods can be applied to any Custom Resources.
Please find the code and recipes from this blog post in this GitHub repository.
The problem
Kube-state-metrics talks to the Kubernetes API and captures information about various resources – Pods, Deployments, Services, etc. Once captured, the metrics are exposed over HTTP. In the monitoring pipeline, a tool like Prometheus scrapes the exposed metrics.

The problem is that the Custom Resource manifest structure varies depending on the Operator. KSM does not know what to look for in the Kubernetes API. So, our goal is to explain which fields in the Custom Resource we want kube-state-metrics to capture and expose.
The solution
Kube-state-metrics is designed to be extendable for capturing custom resource metrics. It is possible to specify through the custom configuration the resources you need to capture and expose.
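To make the idea of "telling kube-state-metrics which fields to capture" concrete, here is a minimal Python sketch of resolving a field path inside a Custom Resource manifest. It is only an illustration of the concept, not how kube-state-metrics is implemented; the function name `get_path` and the trimmed-down `example_cr` manifest are assumptions made up for this example.

```python
from typing import Any, Optional, Sequence


def get_path(obj: Any, path: Sequence[str]) -> Optional[Any]:
    """Follow `path` through nested dicts; return None if a key is missing."""
    current = obj
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Hypothetical, trimmed-down PerconaXtraDBCluster manifest for illustration.
example_cr = {
    "metadata": {"name": "cluster1", "namespace": "pxc"},
    "spec": {"crVersion": "1.9.0", "pxc": {"size": 3}},
    "status": {"state": "ready"},
}

if __name__ == "__main__":
    for path in (["spec", "crVersion"], ["spec", "pxc", "size"], ["status", "state"]):
        print(".".join(path), "=", get_path(example_cr, path))
```

Each path is a list of keys, which mirrors the list-style paths (for example [metadata,name]) used in the configuration snippets later in this post.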
Details
Install Kube-state-metrics
To start with, install kube-state-metrics if not done already. We observed issues in scraping custom resource metrics with version 2.5.0. With version 2.8.2 and later, we were able to scrape custom resource metrics without any issues.
Identify the metrics you want to expose along with the path
Custom resources have a lot of fields. You need to choose the fields that need to be exposed.
For example, the Custom Resource “PerconaXtraDBCluster” has plenty of fields: “spec.crVersion” indicates the CR version, and “spec.pxc.size” shows the number of Percona XtraDB Cluster nodes set by the user (we will later look at a better way to monitor the number of nodes in a PXC cluster).
Metrics can also be captured from the status field of the Custom Resource, if present. For example, the following is the status fetched from a PerconaXtraDBCluster Custom Resource. The status.state field indicates the overall state of the Custom Resource, which is very handy information.
$ kubectl get pxc pxc-1 -oyaml | yq 'del(.status.conditions) | .status'
backup: {}
haproxy:
  …
  ready: 3
  size: 3
  status: ready
pxc:
  …
  ready: 3
  size: 3
  status: ready
  version: 8.0.29-21.1
ready: 6
size: 6
state: ready
Decide the type of metrics for the fields identified
As of today, kube-state-metrics supports three of the metric types defined in the OpenMetrics specification:
- Gauge
- StateSet
- Info
Map each selected field to the metric type that best describes how you want to expose it. For example:
spec.crVersion remains constant throughout the lifecycle of the custom resource until it’s upgraded. Metric type “Info” would be a better fit for this.
spec.pxc.size is a number, and it keeps changing based on the number desired by the user and operator configurations. Even though the number is pretty much constant in the later phase of the lifecycle of the custom resource, it can change. “Gauge” is a great fit for this type of metric.
status.state takes one of a fixed set of values (unknown, initializing, paused, ready, or error, as seen in the metrics output later in this post). “StateSet” is a better fit for this type of metric.
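The difference between the three types is easiest to see in the lines they produce. The following Python sketch renders one sample of each type in the text format used by the metrics output later in this post. It is a simplified illustration under our own assumptions (the helper names are invented, label values are not escaped, and `# HELP`/`# TYPE` lines are omitted); it is not kube-state-metrics code.

```python
from typing import Dict, List, Sequence


def _labels(labels: Dict[str, str]) -> str:
    """Format labels as key="value" pairs in insertion order."""
    return ",".join(f'{key}="{value}"' for key, value in labels.items())


def info_lines(name: str, labels: Dict[str, str]) -> List[str]:
    """Info: the data lives in the labels; the sample value is always 1."""
    return [f"{name}{{{_labels(labels)}}} 1"]


def gauge_lines(name: str, labels: Dict[str, str], value: int) -> List[str]:
    """Gauge: a single number that may go up or down over time."""
    return [f"{name}{{{_labels(labels)}}} {value}"]


def stateset_lines(
    name: str,
    labels: Dict[str, str],
    label_name: str,
    current: str,
    states: Sequence[str],
) -> List[str]:
    """StateSet: one series per possible state; only the current one is 1."""
    lines = []
    for state in states:
        series_labels = {**labels, label_name: state}
        flag = 1 if state == current else 0
        lines.append(f"{name}{{{_labels(series_labels)}}} {flag}")
    return lines


if __name__ == "__main__":
    base = {"customresource_kind": "PerconaXtraDBCluster"}
    states = ["error", "initializing", "paused", "ready", "unknown"]
    for line in (
        info_lines("pxc_info", {**base, "version": "1.9.0"})
        + gauge_lines("pxc_size", base, 3)
        + stateset_lines("pxc_status_state", base, "state", "ready", states)
    ):
        print(line)
```

Note how the StateSet produces one series per possible value, which makes it straightforward to alert on a specific state being active.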
Derive the configuration to capture custom resource metrics
As per the documentation, configuration needs to be added to kube-state-metrics deployment to define your custom resources and the fields to turn into metrics.
Configuration derived for the three metrics discussed above can be found here.
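For readers without the linked file at hand, the sketch below reconstructs what such a configuration could look like for the three metrics, based on the metric names and help strings that appear in the output later in this post. This is our reconstruction, not the published configuration: the field names (`kind: CustomResourceStateMetrics`, `each`, `labelsFromPath`, `stateSet`, and so on) follow the kube-state-metrics custom resource state documentation as we understand it, so please verify them against the version you run. The structure is built as a Python dict and printed as JSON, which YAML 1.2 parsers also accept.

```python
import json

# Reconstructed for illustration; not the authors' published configuration.
CONFIG = {
    "kind": "CustomResourceStateMetrics",
    "spec": {
        "resources": [
            {
                "groupVersionKind": {
                    "group": "pxc.percona.com",
                    "version": "v1",
                    "kind": "PerconaXtraDBCluster",
                },
                "metrics": [
                    {
                        # Info: expose spec.crVersion as the "version" label.
                        "name": "pxc_info",
                        "help": "Information of PXC cluster on k8s",
                        "each": {
                            "type": "Info",
                            "info": {
                                "labelsFromPath": {"version": ["spec", "crVersion"]}
                            },
                        },
                    },
                    {
                        # Gauge: expose spec.pxc.size as the sample value.
                        "name": "pxc_size",
                        "help": "Desired size for the PXC cluster",
                        "each": {
                            "type": "Gauge",
                            "gauge": {"path": ["spec", "pxc", "size"]},
                        },
                    },
                    {
                        # StateSet: one series per value listed below.
                        "name": "pxc_status_state",
                        "help": "State of PXC Cluster",
                        "each": {
                            "type": "StateSet",
                            "stateSet": {
                                "labelName": "state",
                                "path": ["status", "state"],
                                "list": [
                                    "unknown",
                                    "initializing",
                                    "paused",
                                    "ready",
                                    "error",
                                ],
                            },
                        },
                    },
                ],
            }
        ]
    },
}


def render_config() -> str:
    """Serialize the configuration as JSON (also valid YAML 1.2)."""
    return json.dumps(CONFIG, indent=2)


if __name__ == "__main__":
    print(render_config())
```

The printed document is the kind of content that would go into the configuration file or ConfigMap discussed in the next section.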
Consume the configuration in kube-state-metrics deployment
As per the official documentation, there are two ways to apply custom configurations:
- Inline: By using --custom-resource-state-config "inline yaml"
- Refer to a file: By using --custom-resource-state-config-file /path/to/config.yaml
Inline is not handy if the configuration is big. Referring to a file is better and gives more flexibility.
It is important to note that the path to file is the path in the container file system of kube-state-metrics. There are several ways to get a file into the container file system, but one of the options is to mount the data of a ConfigMap to a container.
Steps:
1. Create a configmap with the configurations derived
2. Add configmap as a volume to the kube-state-metrics pod.
volumes:
- configMap:
    name: customresource-config-ksm
  name: cr-config
3. Mount the volume to the container. As per the Dockerfile of the kube-state-metrics, path “/go/src/k8s.io/kube-state-metrics/” can be used to mount the file.
volumeMounts:
- mountPath: /go/src/k8s.io/kube-state-metrics/
  name: cr-config
Provide permission to access the custom resources
By default, kube-state-metrics will have permission to access standard resources only as per the ClusterRole. If deployment is done without adding additional privileges, required metrics won’t be scraped.
Add additional privileges based on the custom resource you want to monitor. In this example, we will add additional privileges to monitor PerconaXtraDBCluster, PerconaXtraDBClusterBackup, and PerconaXtraDBClusterRestore.
Apply cluster-role and check the logs to see if custom resources are being captured
Validate the metrics being captured
Check the logs of kube-state-metrics
$ kubectl logs -f deploy/kube-state-metrics
I0706 14:43:25.273822 1 wrapper.go:98] "Starting kube-state-metrics"
. . .
I0706 14:43:28.285613 1 discovery.go:274] "discovery finished, cache updated"
I0706 14:43:28.285652 1 metrics_handler.go:99] "Autosharding disabled"
I0706 14:43:28.288930 1 custom_resource_metrics.go:79] "Custom resource state added metrics" familyNames=[kube_customresource_pxc_info kube_customresource_pxc_size kube_customresource_pxc_status_state]
I0706 14:43:28.411540 1 builder.go:275] "Active resources" activeStoreNames="certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,leases,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments,pxc.percona.com/v1, Resource=perconaxtradbclusters"
Check the kube-state-metrics service to list the metrics scraped.
Open a terminal and keep the port-forward command running:
$ kubectl port-forward svc/kube-state-metrics 8080:8080
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
Handling connection for 8080
In a browser, check for the metrics captured using “127.0.0.1:8080” (remember to keep the terminal running where the port-forward command is running).
Observe the metrics kube_customresource_pxc_info, kube_customresource_pxc_status_state, and kube_customresource_pxc_size being captured.
# HELP kube_customresource_pxc_info Information of PXC cluster on k8s
# TYPE kube_customresource_pxc_info info
kube_customresource_pxc_info{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",version="1.9.0"} 1
# HELP kube_customresource_pxc_size Desired size for the PXC cluster
# TYPE kube_customresource_pxc_size gauge
kube_customresource_pxc_size{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1"} 3
# HELP kube_customresource_pxc_status_state State of PXC Cluster
# TYPE kube_customresource_pxc_status_state stateset
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="error"} 1
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="initializing"} 0
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="paused"} 0
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="ready"} 0
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="unknown"} 0
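Instead of reading this output in a browser, the same check can be scripted. The Python sketch below parses text in the format shown above and reports which state of a StateSet metric is currently set to 1. It is a deliberately simplified parser written for this post's output (no timestamps, no escaped quotes beyond basic handling, comment lines skipped), and the names `parse_samples`, `active_states`, and `SAMPLE` are our own; for production use, a maintained Prometheus client library is a better choice.

```python
import re
from typing import Dict, List, Tuple

# metric_name{label="value",...} value   (labels are optional)
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

Sample = Tuple[str, Dict[str, str], float]


def parse_samples(text: str) -> List[Sample]:
    """Return (name, labels, value) for every sample line; skip comments."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def active_states(text: str, metric: str, label_name: str = "state") -> List[str]:
    """List the state labels whose series currently has the value 1."""
    return [
        labels[label_name]
        for name, labels, value in parse_samples(text)
        if name == metric and value == 1 and label_name in labels
    ]


# Shortened copy of the output above, used as test input.
SAMPLE = '''# HELP kube_customresource_pxc_status_state State of PXC Cluster
# TYPE kube_customresource_pxc_status_state stateset
kube_customresource_pxc_status_state{customresource_kind="PerconaXtraDBCluster",state="error"} 1
kube_customresource_pxc_status_state{customresource_kind="PerconaXtraDBCluster",state="ready"} 0
'''

if __name__ == "__main__":
    print(active_states(SAMPLE, "kube_customresource_pxc_status_state"))
```

To run it against a live cluster, the text served by the port-forwarded kube-state-metrics service would be passed in place of `SAMPLE`.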
Customize the metric name, add default labels
As seen above, the metrics captured had the prefix kube_customresource. What if we want to customize it?
There are some standard labels, like the name and namespace of the custom resource, which might need to be captured as labels for all the metrics related to a custom resource. It’s not practical to add these to every single metric captured. Hence, the identifiers labelsFromPath and metricNamePrefix are used.
In the below snippet, all the metrics captured for the group pxc.percona.com, version v1, kind PerconaXtraDBCluster will have the metric prefix kube_pxc, and all the metrics will have the following labels:
- name – Derived from the path metadata.name of the custom resource
- namespace – Derived from the path metadata.namespace of the custom resource.
spec:
  resources:
    - groupVersionKind:
        group: pxc.percona.com
        version: v1
        kind: PerconaXtraDBCluster
      labelsFromPath:
        name: [metadata,name]
        namespace: [metadata,namespace]
      metricNamePrefix: kube_pxc
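The effect of these two settings can be sketched in a few lines of Python. The sketch assumes, based on the outputs shown in this post, that the final metric name is the prefix joined to the metric's own name with an underscore, and that each labelsFromPath entry is resolved against the custom resource. The function names and the sample manifest are hypothetical.

```python
from typing import Any, Dict, List

DEFAULT_PREFIX = "kube_customresource"  # prefix seen in the earlier output


def full_metric_name(metric_name: str, prefix: str = DEFAULT_PREFIX) -> str:
    """Join the prefix and the metric's own name with an underscore."""
    return f"{prefix}_{metric_name}"


def common_labels(
    resource: Dict[str, Any], labels_from_path: Dict[str, List[str]]
) -> Dict[str, str]:
    """Resolve each label's path in the resource; skip paths that do not exist."""
    labels = {}
    for label, path in labels_from_path.items():
        value: Any = resource
        for key in path:
            value = value.get(key) if isinstance(value, dict) else None
        if value is not None:
            labels[label] = str(value)
    return labels


# Hypothetical manifest fragment for illustration.
example_cr = {"metadata": {"name": "cluster1", "namespace": "pxc"}}

if __name__ == "__main__":
    print(full_metric_name("pxc_info"))
    print(full_metric_name("pxc_info", "kube_pxc"))
    print(
        common_labels(
            example_cr,
            {"name": ["metadata", "name"], "namespace": ["metadata", "namespace"]},
        )
    )
```

This also explains the slightly repetitive names such as kube_pxc_pxc_info: the prefix kube_pxc is prepended to a metric that is itself named pxc_info.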
Change the configuration present in the configmap and apply the new configmap.
When the new configmap is applied, kube-state-metrics should automatically pick up the configuration changes; you can also do a “kubectl rollout restart deploy kube-state-metrics” to expedite the pod restart.
Once the changes are applied, check the metrics by port-forwarding to kube-state-metrics service.
$ kubectl port-forward svc/kube-state-metrics 8080:8080
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
Handling connection for 8080
In a browser, check for the metrics captured using “127.0.0.1:8080” (remember to keep the terminal running where the port-forward command is running).
Observe the metrics:
# HELP kube_pxc_pxc_info Information of PXC cluster on k8s
# TYPE kube_pxc_pxc_info info
kube_pxc_pxc_info{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",version="1.9.0"} 1
# HELP kube_pxc_pxc_size Desired size for the PXC cluster
# TYPE kube_pxc_pxc_size gauge
kube_pxc_pxc_size{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc"} 3
# HELP kube_pxc_pxc_status_state State of PXC Cluster
# TYPE kube_pxc_pxc_status_state stateset
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="error"} 1
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="initializing"} 0
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="paused"} 0
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="ready"} 0
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="unknown"} 0
Labels customization
By default, kube-state-metrics doesn’t capture all the labels of the resources. However, these labels can be handy for correlating custom resources with other Kubernetes objects. To capture additional labels, use the flag --metric-labels-allowlist as mentioned in the documentation.
To demonstrate, changes are made to the kube-state-metrics deployment and applied.
Check the metrics by doing a port-forward to the service as instructed earlier.
Check the labels captured for the pod cluster1-pxc-0:
kube_pod_labels{namespace="pxc",pod="cluster1-pxc-0",uid="1083ac08-5c25-4ede-89ce-1837f2b66f3d",label_app_kubernetes_io_component="pxc",label_app_kubernetes_io_instance="cluster1",label_app_kubernetes_io_managed_by="percona-xtradb-cluster-operator",label_app_kubernetes_io_name="percona-xtradb-cluster",label_app_kubernetes_io_part_of="percona-xtradb-cluster"} 1
Labels of the pod can be checked in the cluster:
$ kubectl get po -n pxc cluster1-pxc-0 --show-labels
NAME             READY   STATUS    RESTARTS   AGE     LABELS
cluster1-pxc-0   3/3     Running   0          3h54m   app.kubernetes.io/component=pxc,app.kubernetes.io/instance=cluster1,app.kubernetes.io/managed-by=percona-xtradb-cluster-operator,app.kubernetes.io/name=percona-xtradb-cluster,app.kubernetes.io/part-of=percona-xtradb-cluster,controller-revision-hash=cluster1-pxc-6f4955bbc7,statefulset.kubernetes.io/pod-name=cluster1-pxc-0
Adhering to the Prometheus conventions, characters that are not valid in a Prometheus label name, such as . (dot), / (slash), and - (hyphen), are replaced with _ (underscore). Only labels mentioned in the --metric-labels-allowlist are captured for the labels info.
Checking for the other pod:
$ kubectl get po -n kube-system kube-state-metrics-7bd9c67f64-46ksw --show-labels
NAME                                  READY   STATUS    RESTARTS      AGE    LABELS
kube-state-metrics-7bd9c67f64-46ksw   1/1     Running   1 (40m ago)   120m   app.kubernetes.io/component=exporter,app.kubernetes.io/name=kube-state-metrics,app.kubernetes.io/version=2.9.2,pod-template-hash=7bd9c67f64
Following are the labels captured in the kube-state-metrics service:
kube_pod_labels{namespace="kube-system",pod="kube-state-metrics-7bd9c67f64-46ksw",uid="d4b30238-d29e-4251-a8e3-c2fad1bff724",label_app_kubernetes_io_component="exporter",label_app_kubernetes_io_name="kube-state-metrics"} 1
As can be seen above, the label app.kubernetes.io/version is not captured because it was not mentioned in the --metric-labels-allowlist flag of kube-state-metrics.
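The renaming and filtering seen in this section can be approximated with the short Python sketch below. It assumes exact-match allowlisting only and a `label_` prefix with every character outside letters, digits, and underscore replaced by an underscore, which matches the kube_pod_labels output shown above. The allowlist itself is inferred from that output rather than copied from the deployment, and the function names are our own.

```python
import re
from typing import Dict, Set


def prometheus_label_name(k8s_key: str) -> str:
    """Prefix with label_ and replace characters invalid in Prometheus names."""
    return "label_" + re.sub(r"[^a-zA-Z0-9_]", "_", k8s_key)


def captured_labels(pod_labels: Dict[str, str], allowlist: Set[str]) -> Dict[str, str]:
    """Keep only allowlisted Kubernetes labels, renamed for Prometheus."""
    return {
        prometheus_label_name(key): value
        for key, value in pod_labels.items()
        if key in allowlist
    }


# Inferred from the kube_pod_labels output above; the real flag value may differ.
ALLOWLIST = {
    "app.kubernetes.io/component",
    "app.kubernetes.io/instance",
    "app.kubernetes.io/managed-by",
    "app.kubernetes.io/name",
    "app.kubernetes.io/part-of",
}

# Labels of the kube-state-metrics pod, as listed by kubectl above.
KSM_POD_LABELS = {
    "app.kubernetes.io/component": "exporter",
    "app.kubernetes.io/name": "kube-state-metrics",
    "app.kubernetes.io/version": "2.9.2",
    "pod-template-hash": "7bd9c67f64",
}

if __name__ == "__main__":
    print(captured_labels(KSM_POD_LABELS, ALLOWLIST))
```

With these inputs, only the component and name labels survive, which is consistent with the kube_pod_labels line for the kube-state-metrics pod.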
Conclusion
- Custom Resource metrics can be captured by modifying kube-state-metrics deployment. Metrics can be captured without writing any code.
- As an alternative to the above method, a custom exporter can be written to expose the metrics, which gives a lot of flexibility. However, this requires writing and maintaining code.
- Metrics can be scraped by Prometheus to derive useful insights combined with the other metrics.
If you want to extend the same process to other custom resources related to Percona Operators, use the following ClusterRole to provide permission to read the relevant custom resources. Configurations for some of the important metrics related to the custom resources are captured in this Configmap for you to explore.
The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.
10
2021
ServiceNow leaps into applications performance monitoring with Lightstep acquisition
This morning ServiceNow announced that it was acquiring Lightstep, an applications performance monitoring startup that has raised over $70 million, according to Crunchbase data. The companies did not share the acquisition price.
ServiceNow wants to take advantage of Lightstep’s capabilities to enhance its IT operations offerings. With Lightstep, the company should be able to provide customers with a way to monitor the performance of applications with the goal of detecting problems before they grow into major issues that take down a website or application.
“With Lightstep, ServiceNow will transform how software solutions are delivered to customers. This will ultimately make it easier for customers to innovate quickly. Now they’ll be able to build and operate their software faster than ever before and take the new era of work head on with confidence,” Pablo Stern, SVP & GM for IT Workflow Products at ServiceNow said in a statement.
Ben Sigelman, founder and CEO at Lightstep, sees the larger organization as a good landing spot for his company. “We’ve always believed that the value of observability should extend across the entire enterprise, providing greater clarity and confidence to every team involved in these modern, digital businesses. By joining ServiceNow, together we will realize that vision for our customers and help transform the world of work in the process […],” Sigelman said in a statement.
Lightstep is part of the application performance monitoring market with companies like DataDog, New Relic and AppDynamics, which Cisco acquired for $3.7 billion in 2017, the week before it was scheduled to IPO. It seems to be an area that is catching the interest of larger enterprise vendors, who are picking off smaller startups in the space.
Last November, IBM bought Instana, an APM startup and then bought Turbonomic for $2 billion at the end of last month as a complementary technology. Being able to monitor apps and keep them up and running is crucial, not only from a business continuity perspective, but also from a brand loyalty one. Even if the app isn’t completely down, but is running slowly or generally malfunctioning in some way, it’s likely to annoy users and could ultimately cause users to jump to a competitor. This type of software gives customers the ability to observe and detect problems before they have an impact on large numbers of users.
Lightstep, which is based in San Jose, California, was founded in 2015. It raised $70 million from investors like Altimeter Capital, Sequoia, Redpoint and Harrison Metal. Customers include GitHub, Spotify and Twilio. The deal is expected to close this quarter.
10
2020
New Relic acquires Kubernetes observability platform Pixie Labs
Two months ago, Kubernetes observability platform Pixie Labs launched into general availability and announced a $9.15 million Series A funding round led by Benchmark, with participation from GV. Today, the company is announcing its acquisition by New Relic, the publicly traded monitoring and observability platform.
The Pixie Labs brand and product will remain in place and allow New Relic to extend its platform to the edge. From the outset, the Pixie Labs team designed the service to focus on providing observability for cloud-native workloads running on Kubernetes clusters. And while most similar tools focus on operators and IT teams, Pixie set out to build a tool that developers would want to use. Using eBPF, a relatively new way to extend the Linux kernel, the Pixie platform can collect data right at the source and without the need for an agent.
At the core of the Pixie developer experience are what the company calls “Pixie scripts.” These allow developers to write their debugging workflows, though the company also provides its own set of these and anybody in the community can contribute and share them as well. The idea here is to capture a lot of the informal knowledge around how to best debug a given service.
“We’re super excited to bring these companies together because we share a mission to make observability ubiquitous through simplicity,” Bill Staples, New Relic’s chief product officer, told me. “[…] According to IDC, there are 28 million developers in the world. And yet only a fraction of them really practice observability today. We believe it should be easier for every developer to take a data-driven approach to building software and Kubernetes is really the heart of where developers are going to build software.”
It’s worth noting that New Relic already had a solution for monitoring Kubernetes clusters. Pixie, however, will allow it to go significantly deeper into this space. “Pixie goes much, much further in terms of offering on-the-edge, live debugging use cases, the ability to run those Pixie scripts. So it’s an extension on top of the cloud-based monitoring solution we offer today,” Staples said.
The plan is to build New Relic integrations into Pixie’s platform and to integrate Pixie use cases with New Relic One as well.
Currently, about 300 teams use the Pixie platform. These range from small startups to large enterprises and, as Staples and Pixie co-founder Zain Asgar noted, there was already a substantial overlap between the two customer bases.
As for why he decided to sell, Asgar — a former Google engineer working on Google AI and adjunct professor at Stanford — told me that it was all about accelerating Pixie’s vision.
“We started Pixie to create this magical developer experience that really allows us to redefine how application developers monitor, secure and manage their applications,” Asgar said. “One of the cool things is when we actually met the team at New Relic and we got together with Bill and [New Relic founder and CEO] Lew [Cirne], we realized that there was almost a complete alignment around this vision […], and by joining forces with New Relic, we can actually accelerate this entire process.”
New Relic has recently done a lot of work on open-sourcing various parts of its platform, including its agents, data exporters and some of its tooling. Pixie, too, will now open-source its core tools. Open-sourcing the service was always on the company’s road map, but the acquisition now allows it to push this timeline forward.
“We’ll be taking Pixie and making it available to the community through open source, as well as continuing to build out the commercial enterprise-grade offering for it that extends the New Relic One platform,” Staples explained. Asgar added that it’ll take the company a little while to release the code, though.
“The same fundamental quality that got us so excited about Lew as an EIR in 2007, got us excited about Zain and Ishan in 2017 — absolutely brilliant engineers, who know how to build products developers love,” Benchmark Ventures General Partner Eric Vishria told me. “New Relic has always captured developer delight. For all its power, Kubernetes completely upends the monitoring paradigm we’ve lived with for decades. Pixie brings the same easy to use, quick time to value, no-nonsense approach to the Kubernetes world as New Relic brought to APM. It is a match made in heaven.”