There are more than 300 Operators on OperatorHub, and the number is growing. Percona Operators allow users to manage complex database systems in a Kubernetes environment. With Percona Operators, users can easily deploy, monitor, and manage databases orchestrated by Kubernetes, making it more efficient to run databases at scale.
Our Operators come with Custom Resources that have their own statuses and fields to ease monitoring and troubleshooting. For example, the PerconaServerMongoDBBackup resource carries information about the backup, such as whether it succeeded or failed. Obviously, there are ways to monitor backups through storage monitoring or Pod statuses, but why bother if the Operator already provides this information?
In this article, we will see how to monitor Custom Resources created by the Operators with kube-state-metrics (KSM), a standard and widely adopted service that listens to the Kubernetes API server and generates metrics. These methods can be applied to any Custom Resource.
Please find the code and recipes from this blog post in this GitHub repository.
The problem
Kube-state-metrics talks to the Kubernetes API and captures information about various resources (Pods, Deployments, Services, etc.). Once captured, the metrics are exposed. In the monitoring pipeline, a tool like Prometheus scrapes the exposed metrics.
The problem is that the Custom Resource manifest structure varies depending on the Operator, so KSM does not know what to look for in the Kubernetes API. Our goal, then, is to tell kube-state-metrics which fields of the Custom Resource to capture and expose.
The solution
Kube-state-metrics is designed to be extendable for capturing custom resource metrics. Through a custom configuration, you can specify which resources and fields you need captured and exposed.
Details
Install Kube-state-metrics
To start, install kube-state-metrics if you haven't already. We observed issues scraping custom resource metrics with version 2.5.0; from version 2.8.2 onwards, we were able to scrape them without any issues.
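If it is not installed yet, one simple way is to deploy the upstream standard manifests (a sketch; the release tag below matches the version seen later in this post, and the manifests land in the kube-system namespace):

$ git clone https://github.com/kubernetes/kube-state-metrics.git
$ cd kube-state-metrics
# any release >= v2.8.2 should work for custom resource metrics
$ git checkout v2.9.2
$ kubectl apply -f examples/standard/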
Identify the metrics you want to expose along with the path
Custom resources have a lot of fields. You need to choose the fields that need to be exposed.
For example, the Custom Resource "PerconaXtraDBCluster" has plenty of fields: "spec.crVersion" indicates the CR version, and "spec.pxc.size" shows the number of Percona XtraDB Cluster nodes requested by the user (we will later look at a better way to monitor the number of nodes in a PXC cluster).
Metrics can also be captured from the status field of a Custom Resource, if present. For example, the following is the fetched status of a PerconaXtraDBCluster Custom Resource; status.state indicates the state of the Custom Resource, which is very handy information.
$ kubectl get pxc pxc-1 -oyaml | yq 'del(.status.conditions) | .status'
backup: {}
haproxy:
  …
  ready: 3
  size: 3
  status: ready
pxc:
  …
  ready: 3
  size: 3
  status: ready
  version: 8.0.29-21.1
ready: 6
size: 6
state: ready
Decide the type of metrics for the fields identified
As of today, kube-state-metrics supports three of the metric types available in the OpenMetrics specification:
- Gauge
- StateSet
- Info
Based on the fields selected, map each field to how you want to expose it. For example:
spec.crVersion remains constant throughout the lifecycle of the custom resource until it is upgraded. The metric type "Info" is a good fit for this.
spec.pxc.size is a number, and it changes based on what the user desires and on the operator configuration. Even though the number is mostly constant later in the lifecycle of the custom resource, it can change, so "Gauge" is a great fit for this type of metric.
status.state can take one of a fixed set of possible values (for example, ready, initializing, paused, error, or unknown), so "StateSet" is a good fit for this type of metric.
Derive the configuration to capture custom resource metrics
As per the documentation, configuration needs to be added to the kube-state-metrics deployment to define your custom resources and the fields to turn into metrics.
The configuration derived for the three metrics discussed above can be found here; a sketch of what it could look like follows.
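Purely as an illustration, this is what such a configuration could look like for the three fields discussed (metric names and help strings mirror the output shown later in this post, and the state list matches the PerconaXtraDBCluster states):

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: pxc.percona.com
        version: v1
        kind: PerconaXtraDBCluster
      metrics:
        # spec.crVersion rarely changes, so expose it as an Info metric
        - name: pxc_info
          help: Information of PXC cluster on k8s
          each:
            type: Info
            info:
              labelsFromPath:
                version: [spec, crVersion]
        # spec.pxc.size is a number that can change, so expose it as a Gauge
        - name: pxc_size
          help: Desired size for the PXC cluster
          each:
            type: Gauge
            gauge:
              path: [spec, pxc, size]
        # status.state takes one of a fixed set of values, so expose it as a StateSet
        - name: pxc_status_state
          help: State of PXC Cluster
          each:
            type: StateSet
            stateSet:
              labelName: state
              path: [status, state]
              list: [ready, initializing, paused, error, unknown]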
Consume the configuration in kube-state-metrics deployment
As per the official documentation, there are two ways to apply custom configurations:
- Inline: by using --custom-resource-state-config "inline yaml"
- Refer to a file: by using --custom-resource-state-config-file /path/to/config.yaml
Inline is not handy if the configuration is big. Referring to a file is better and gives more flexibility.
It is important to note that the path to file is the path in the container file system of kube-state-metrics. There are several ways to get a file into the container file system, but one of the options is to mount the data of a ConfigMap to a container.
Steps:
1. Create a ConfigMap with the derived configuration (see the sketch after these steps).
2. Add the ConfigMap as a volume to the kube-state-metrics Pod.
volumes:
  - configMap:
      name: customresource-config-ksm
    name: cr-config
3. Mount the volume to the container. As per the Dockerfile of kube-state-metrics, the path "/go/src/k8s.io/kube-state-metrics/" can be used to mount the file.
volumeMounts:
  - mountPath: /go/src/k8s.io/kube-state-metrics/
    name: cr-config
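Putting the steps together, a sketch of step 1 could be (the key name config.yaml inside the ConfigMap is an assumption; adjust it to match your setup):

$ kubectl create configmap customresource-config-ksm -n kube-system --from-file=config.yaml

The container args then reference the mounted file; a sketch of the relevant part of the kube-state-metrics container spec:

containers:
  - name: kube-state-metrics
    args:
      - --custom-resource-state-config-file=/go/src/k8s.io/kube-state-metrics/config.yaml
    volumeMounts:
      - mountPath: /go/src/k8s.io/kube-state-metrics/
        name: cr-config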
Provide permission to access the custom resources
By default, kube-state-metrics only has permission to access standard resources, as defined in its ClusterRole. If it is deployed without additional privileges, the required metrics won't be scraped.
Add additional privileges based on the custom resources you want to monitor. In this example, we will add additional privileges to monitor PerconaXtraDBCluster, PerconaXtraDBClusterBackup, and PerconaXtraDBClusterRestore, as sketched below.
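A sketch of the additional rules to append to the existing kube-state-metrics ClusterRole (the resource names are the plural forms of the Percona XtraDB Cluster custom resources):

# Extra rules for the kube-state-metrics ClusterRole
- apiGroups:
    - pxc.percona.com
  resources:
    - perconaxtradbclusters
    - perconaxtradbclusterbackups
    - perconaxtradbclusterrestores
  verbs:
    - get
    - list
    - watch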
Apply the ClusterRole and check the logs to see if custom resources are being captured.
Validate the metrics being captured
Check the logs of kube-state-metrics
$ kubectl logs -f deploy/kube-state-metrics
I0706 14:43:25.273822 1 wrapper.go:98] "Starting kube-state-metrics"
. . .
I0706 14:43:28.285613 1 discovery.go:274] "discovery finished, cache updated"
I0706 14:43:28.285652 1 metrics_handler.go:99] "Autosharding disabled"
I0706 14:43:28.288930 1 custom_resource_metrics.go:79] "Custom resource state added metrics" familyNames=[kube_customresource_pxc_info kube_customresource_pxc_size kube_customresource_pxc_status_state]
I0706 14:43:28.411540 1 builder.go:275] "Active resources" activeStoreNames="certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,leases,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments,pxc.percona.com/v1, Resource=perconaxtradbclusters"
Check the kube-state-metrics service to list the metrics scraped.
Open a terminal and keep the port-forward command running:
$ kubectl port-forward svc/kube-state-metrics 8080:8080
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
Handling connection for 8080
In a browser, check for the metrics captured using “127.0.0.1:8080” (remember to keep the terminal running where the port-forward command is running).
Observe the metrics kube_customresource_pxc_info, kube_customresource_pxc_status_state, and kube_customresource_pxc_size being captured.
# HELP kube_customresource_pxc_info Information of PXC cluster on k8s
# TYPE kube_customresource_pxc_info info
kube_customresource_pxc_info{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",version="1.9.0"} 1
# HELP kube_customresource_pxc_size Desired size for the PXC cluster
# TYPE kube_customresource_pxc_size gauge
kube_customresource_pxc_size{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1"} 3
# HELP kube_customresource_pxc_status_state State of PXC Cluster
# TYPE kube_customresource_pxc_status_state stateset
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="error"} 1
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="initializing"} 0
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="paused"} 0
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="ready"} 0
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="unknown"} 0
Customize the metric name and add default labels
As seen above, the metrics captured have the prefix kube_customresource. What if we want to customize it? There are also some standard labels, like the name and namespace of the custom resource, that we may want on all the metrics related to a custom resource, and it is not practical to add these to every single metric captured. Hence, the identifiers labelsFromPath and metricNamePrefix are used.
In the below snippet, all the metrics captured for group pxc.percona.com, version v1, kind PerconaXtraDBCluster will have the metric prefix kube_pxc, and all the metrics will have the following labels:
- name: derived from the path metadata.name of the custom resource
- namespace: derived from the path metadata.namespace of the custom resource
spec:
  resources:
    - groupVersionKind:
        group: pxc.percona.com
        version: v1
        kind: PerconaXtraDBCluster
      labelsFromPath:
        name: [metadata,name]
        namespace: [metadata,namespace]
      metricNamePrefix: kube_pxc
Change the configuration present in the ConfigMap and apply the new ConfigMap. When the new ConfigMap is applied, kube-state-metrics should automatically pick up the configuration changes; you can also run "kubectl rollout restart deploy kube-state-metrics" to expedite the Pod restart.
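For example (the manifest file name here is only illustrative, and the deployment lives in kube-system in this setup):

$ kubectl apply -f customresource-config-ksm.yaml
$ kubectl rollout restart deploy kube-state-metrics -n kube-system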
Once the changes are applied, check the metrics by port-forwarding to kube-state-metrics service.
$ kubectl port-forward svc/kube-state-metrics 8080:8080
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
Handling connection for 8080
In a browser, check for the metrics captured using “127.0.0.1:8080” (remember to keep the terminal running where the port-forward command is running).
Observe the metrics:
# HELP kube_pxc_pxc_info Information of PXC cluster on k8s
# TYPE kube_pxc_pxc_info info
kube_pxc_pxc_info{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",version="1.9.0"} 1
# HELP kube_pxc_pxc_size Desired size for the PXC cluster
# TYPE kube_pxc_pxc_size gauge
kube_pxc_pxc_size{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc"} 3
# HELP kube_pxc_pxc_status_state State of PXC Cluster
# TYPE kube_pxc_pxc_status_state stateset
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="error"} 1
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="initializing"} 0
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="paused"} 0
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="ready"} 0
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="unknown"} 0
Label customization
By default, kube-state-metrics doesn't capture all the labels of the resources. However, labels can be handy for deriving correlations between custom resources and other Kubernetes objects. To add additional labels, use the --metric-labels-allowlist flag, as mentioned in the documentation.
To demonstrate, we modify the kube-state-metrics deployment and apply the changes; a sketch of the relevant flag is shown below.
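A sketch of what the flag could look like in the kube-state-metrics container args (this particular allowlist is an assumption, chosen to match the labels that appear in the output below):

args:
  - --metric-labels-allowlist=pods=[app.kubernetes.io/component,app.kubernetes.io/instance,app.kubernetes.io/managed-by,app.kubernetes.io/name,app.kubernetes.io/part-of]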
Check the metrics by doing a port-forward to the service as instructed earlier.
Check the labels captured for the Pod cluster1-pxc-0:
kube_pod_labels{namespace="pxc",pod="cluster1-pxc-0",uid="1083ac08-5c25-4ede-89ce-1837f2b66f3d",label_app_kubernetes_io_component="pxc",label_app_kubernetes_io_instance="cluster1",label_app_kubernetes_io_managed_by="percona-xtradb-cluster-operator",label_app_kubernetes_io_name="percona-xtradb-cluster",label_app_kubernetes_io_part_of="percona-xtradb-cluster"} 1
Labels of the pod can be checked in the cluster:
$ kubectl get po -n pxc cluster1-pxc-0 --show-labels
NAME             READY   STATUS    RESTARTS   AGE     LABELS
cluster1-pxc-0   3/3     Running   0          3h54m   app.kubernetes.io/component=pxc,app.kubernetes.io/instance=cluster1,app.kubernetes.io/managed-by=percona-xtradb-cluster-operator,app.kubernetes.io/name=percona-xtradb-cluster,app.kubernetes.io/part-of=percona-xtradb-cluster,controller-revision-hash=cluster1-pxc-6f4955bbc7,statefulset.kubernetes.io/pod-name=cluster1-pxc-0
Adhering to Prometheus conventions, the character . (dot) is replaced with _ (underscore). Only the labels mentioned in --metric-labels-allowlist are captured in the labels metric.
Checking another Pod, the kube-state-metrics Pod itself:
$ kubectl get po -n kube-system kube-state-metrics-7bd9c67f64-46ksw --show-labels
NAME                                  READY   STATUS    RESTARTS      AGE    LABELS
kube-state-metrics-7bd9c67f64-46ksw   1/1     Running   1 (40m ago)   120m   app.kubernetes.io/component=exporter,app.kubernetes.io/name=kube-state-metrics,app.kubernetes.io/version=2.9.2,pod-template-hash=7bd9c67f64
Following are the labels captured in the kube-state-metrics service:
kube_pod_labels{namespace="kube-system",pod="kube-state-metrics-7bd9c67f64-46ksw",uid="d4b30238-d29e-4251-a8e3-c2fad1bff724",label_app_kubernetes_io_component="exporter",label_app_kubernetes_io_name="kube-state-metrics"} 1
As can be seen above, the label app.kubernetes.io/version is not captured because it was not mentioned in the --metric-labels-allowlist flag of kube-state-metrics.
Conclusion
- Custom Resource metrics can be captured by modifying the kube-state-metrics deployment, without writing any code.
- As an alternative, a custom exporter can be written to expose the metrics, which gives a lot of flexibility but requires coding and maintenance.
- The metrics can be scraped by Prometheus and combined with other metrics to derive useful insights; see the example alert rule sketched below.
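As an illustration, a hypothetical Prometheus alerting rule built on the StateSet metric defined earlier could look like this (the duration and severity are assumptions):

groups:
  - name: pxc.rules
    rules:
      - alert: PXCClusterNotReady
        # The StateSet metric exposes 1 for the current state and 0 for all others
        expr: kube_pxc_pxc_status_state{state="ready"} == 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "PXC cluster {{ $labels.name }} in namespace {{ $labels.namespace }} is not ready"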
If you want to extend the same process to other custom resources related to Percona Operators, use this ClusterRole to provide permission to read the relevant custom resources. Configurations for some of the important metrics related to these custom resources are captured in this ConfigMap for you to explore.
The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.