Nov 06, 2023

Percona Operators Custom Resource Monitoring With Kube-state-metrics


There are more than 300 Operators on OperatorHub, and the number keeps growing. Percona Operators allow users to manage complex database systems in a Kubernetes environment. With Percona Operators, users can deploy, monitor, and manage databases orchestrated by Kubernetes, making it easier and more efficient to run databases at scale.

Our Operators come with Custom Resources that have their own statuses and fields to ease monitoring and troubleshooting. For example, the PerconaServerMongoDBBackup resource reports whether a backup succeeded or failed. Obviously, there are ways to monitor a backup through storage monitoring or Pod status, but why bother if the Operator already provides this information?

In this article, we will see how to monitor Custom Resources created by the Operators with kube-state-metrics (KSM), a standard and widely adopted service that listens to the Kubernetes API server and generates metrics. The same methods can be applied to any Custom Resources.

Please find the code and recipes from this blog post in this GitHub repository.

The problem

Kube-state-metrics talks to the Kubernetes API, captures information about various resources (Pods, Deployments, Services, etc.), and exposes it as metrics. Further down the monitoring pipeline, a tool like Prometheus scrapes these metrics.

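For context, a minimal Prometheus scrape configuration for the kube-state-metrics service might look like the sketch below (an illustrative assumption: the service runs in the kube-system namespace and exposes metrics on port 8080, as in the examples later in this post):

scrape_configs:
  - job_name: kube-state-metrics
    # Scrape the kube-state-metrics service directly; in practice this is often
    # done via Kubernetes service discovery or a ServiceMonitor instead.
    static_configs:
      - targets: ["kube-state-metrics.kube-system.svc:8080"]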

The problem is that the Custom Resource manifest structure varies from Operator to Operator, so KSM does not know what to look for in the Kubernetes API. Our goal is to tell kube-state-metrics which fields of a Custom Resource we want it to capture and expose.

The solution

Kube-state-metrics is designed to be extendable for capturing custom resource metrics. Through a custom configuration, you can specify which resources and fields you need to capture and expose.

Details

Install Kube-state-metrics

To start with, install kube-state-metrics if you have not done so already. We observed issues scraping custom resource metrics with version 2.5.0; versions >= 2.8.2 worked without any issues.

Identify the metrics you want to expose along with the path

Custom resources have a lot of fields; you need to choose the ones to expose.

For example, the Custom Resource “PerconaXtraDBCluster” has plenty of fields: “spec.crVersion” indicates the CR version, and “spec.pxc.size” shows the number of Percona XtraDB Cluster nodes set by the user (we will later look at a better way to monitor the number of nodes in a PXC cluster).
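To make this concrete, here is a trimmed PerconaXtraDBCluster manifest sketch showing just these two fields (the values are the ones that show up in the metrics later in this post):

apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBCluster
metadata:
  name: cluster1
spec:
  crVersion: "1.9.0"   # CR version; a good candidate for an Info metric
  pxc:
    size: 3            # desired number of PXC nodes; a good candidate for a Gauge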

Metrics can also be captured from the status field of a Custom Resource, if present. For example, the following is the status of a PerconaXtraDBCluster Custom Resource; status.state indicates the state of the Custom Resource, which is very handy information.

$ kubectl get pxc pxc-1 -oyaml | yq 'del(.status.conditions) | .status'
backup: {}
haproxy:
…
  ready: 3
  size: 3
  status: ready
pxc:
 …
  ready: 3
  size: 3
  status: ready
  version: 8.0.29-21.1
ready: 6
size: 6
state: ready

Decide the type of metrics for the fields identified

As of today, kube-state-metrics supports three metric types from the OpenMetrics specification:

  1. Gauge
  2. StateSet
  3. Info

Map each identified field to the metric type you want to expose it as. For example:


  1. spec.crVersion remains constant throughout the lifecycle of the custom resource until it is upgraded. Metric type “Info” is a good fit for this.

  2. spec.pxc.size is a number that changes based on the size desired by the user and the operator configuration. Even though the number is pretty much constant in the later phases of the custom resource lifecycle, it can change. “Gauge” is a great fit for this type of metric.

  3. status.state can take one of a fixed set of values (initializing, paused, ready, error, unknown). “StateSet” is a good fit for this type of metric.

Derive the configuration to capture custom resource metrics

As per the documentation, a configuration needs to be added to the kube-state-metrics deployment to define your custom resources and the fields to turn into metrics.

Configuration derived for the three metrics discussed above can be found here.
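For reference, a sketch of such a configuration, following the CustomResourceStateMetrics format from the kube-state-metrics documentation (the exact configuration used for this post is in the linked repository; the metric names, help strings, and state list below mirror the output shown later):

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: pxc.percona.com
        version: v1
        kind: PerconaXtraDBCluster
      metrics:
        # spec.crVersion exposed as an Info metric
        - name: pxc_info
          help: Information of PXC cluster on k8s
          each:
            type: Info
            info:
              labelsFromPath:
                version: [spec, crVersion]
        # spec.pxc.size exposed as a Gauge
        - name: pxc_size
          help: Desired size for the PXC cluster
          each:
            type: Gauge
            gauge:
              path: [spec, pxc, size]
        # status.state exposed as a StateSet
        - name: pxc_status_state
          help: State of PXC Cluster
          each:
            type: StateSet
            stateSet:
              labelName: state
              path: [status, state]
              list: [initializing, paused, ready, error, unknown]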

Consume the configuration in kube-state-metrics deployment

As per the official documentation, there are two ways to apply custom configurations:

  1. Inline: by using --custom-resource-state-config "inline yaml"
  2. Refer to a file: by using --custom-resource-state-config-file /path/to/config.yaml

Inline is not handy if the configuration is big. Referring to a file is better and gives more flexibility.

It is important to note that the path to the file is the path inside the kube-state-metrics container file system. There are several ways to get a file into the container file system, but one of the options is to mount the data of a ConfigMap to a container.

Steps:

1. Create a ConfigMap with the configuration derived above.
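A minimal sketch of such a ConfigMap, assuming the configuration is stored under the key config.yaml (the key name is an illustrative choice) and kube-state-metrics runs in the kube-system namespace:

apiVersion: v1
kind: ConfigMap
metadata:
  name: customresource-config-ksm
  namespace: kube-system
data:
  config.yaml: |
    kind: CustomResourceStateMetrics
    spec:
      resources:
        # ... the configuration sketched earlier goes here ...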

2. Add the ConfigMap as a volume to the kube-state-metrics Pod:

volumes:
  - configMap:
      name: customresource-config-ksm
    name: cr-config

3. Mount the volume to the container. As per the Dockerfile of kube-state-metrics, the path “/go/src/k8s.io/kube-state-metrics/” can be used to mount the file:

volumeMounts:
  - mountPath: /go/src/k8s.io/kube-state-metrics/
    name: cr-config
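Finally, point the kube-state-metrics container at the mounted file with the --custom-resource-state-config-file flag discussed earlier (a sketch; the file name config.yaml is the ConfigMap key assumed above):

containers:
  - name: kube-state-metrics
    args:
      # read the custom resource state configuration from the mounted ConfigMap
      - --custom-resource-state-config-file=/go/src/k8s.io/kube-state-metrics/config.yaml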

Provide permission to access the custom resources

By default, the kube-state-metrics ClusterRole grants permission to access standard resources only. If the deployment is done without adding additional privileges, the required metrics won’t be scraped.

Add additional privileges based on the custom resources you want to monitor. In this example, we will add additional privileges to monitor PerconaXtraDBCluster, PerconaXtraDBClusterBackup, and PerconaXtraDBClusterRestore.
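A sketch of the additional rules (the existing rules for standard resources stay as they are; the resource names follow the lowercase plural form seen in the kube-state-metrics logs below):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
  # ... existing rules for standard resources ...
  - apiGroups: ["pxc.percona.com"]
    resources:
      - perconaxtradbclusters
      - perconaxtradbclusterbackups
      - perconaxtradbclusterrestores
    verbs: ["get", "list", "watch"]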

Apply the ClusterRole and check the logs to see if custom resources are being captured.

Validate the metrics being captured

Check the logs of kube-state-metrics

$ kubectl logs -f deploy/kube-state-metrics
I0706 14:43:25.273822       1 wrapper.go:98] "Starting kube-state-metrics"
.
.
.
I0706 14:43:28.285613       1 discovery.go:274] "discovery finished, cache updated"
I0706 14:43:28.285652       1 metrics_handler.go:99] "Autosharding disabled"
I0706 14:43:28.288930       1 custom_resource_metrics.go:79] "Custom resource state added metrics" familyNames=[kube_customresource_pxc_info kube_customresource_pxc_size kube_customresource_pxc_status_state]
I0706 14:43:28.411540       1 builder.go:275] "Active resources" activeStoreNames="certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,leases,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments,pxc.percona.com/v1, Resource=perconaxtradbclusters"

Check the kube-state-metrics service to list the metrics scraped. 

Open a terminal and keep the port-forward command running:

$ kubectl port-forward svc/kube-state-metrics  8080:8080
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
Handling connection for 8080

In a browser, check for the metrics captured using “127.0.0.1:8080” (remember to keep the terminal running where the port-forward command is running).

Observe the metrics kube_customresource_pxc_info, kube_customresource_pxc_status_state, and kube_customresource_pxc_size being captured:

# HELP kube_customresource_pxc_info Information of PXC cluster on k8s
# TYPE kube_customresource_pxc_info info
kube_customresource_pxc_info{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",version="1.9.0"} 1
# HELP kube_customresource_pxc_size Desired size for the PXC cluster
# TYPE kube_customresource_pxc_size gauge
kube_customresource_pxc_size{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1"} 3
# HELP kube_customresource_pxc_status_state State of PXC Cluster
# TYPE kube_customresource_pxc_status_state stateset
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="error"} 1
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="initializing"} 0
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="paused"} 0
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="ready"} 0
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="unknown"} 0

Customize the metric name, add default labels

As seen above, the metrics captured have the prefix kube_customresource. What if we want to customize it?

There are some standard labels, like the name and namespace of the custom resource, which you might want on all the metrics related to a custom resource. It is not practical to add these for every single metric captured. Hence, the identifiers labelsFromPath and metricNamePrefix are used.

In the below snippet, all the metrics captured for the group pxc.percona.com, version v1, kind PerconaXtraDBCluster will have the metric prefix kube_pxc, and all the metrics will have the following labels:

  • name – Derived from the path metadata.name of the custom resource
  • namespace – Derived from the path metadata.namespace of the custom resource

spec:
  resources:
    - groupVersionKind:
        group: pxc.percona.com
        version: v1
        kind: PerconaXtraDBCluster
      labelsFromPath:
        name: [metadata, name]
        namespace: [metadata, namespace]
      metricNamePrefix: kube_pxc

Change the configuration present in the configmap and apply the new configmap.

When the new configmap is applied, kube-state-metrics should automatically pick up the configuration changes; you can also do a “kubectl rollout restart deploy kube-state-metrics” to expedite the pod restart.

Once the changes are applied, check the metrics by port-forwarding to kube-state-metrics service.

$ kubectl port-forward svc/kube-state-metrics  8080:8080
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
Handling connection for 8080

In a browser, check for the metrics captured using “127.0.0.1:8080” (remember to keep the terminal running where the port-forward command is running).

Observe the metrics:

# HELP kube_pxc_pxc_info Information of PXC cluster on k8s
# TYPE kube_pxc_pxc_info info
kube_pxc_pxc_info{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",version="1.9.0"} 1
# HELP kube_pxc_pxc_size Desired size for the PXC cluster
# TYPE kube_pxc_pxc_size gauge
kube_pxc_pxc_size{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc"} 3
# HELP kube_pxc_pxc_status_state State of PXC Cluster
# TYPE kube_pxc_pxc_status_state stateset
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="error"} 1
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="initializing"} 0
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="paused"} 0
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="ready"} 0
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="unknown"} 0

Label customization

By default, kube-state-metrics doesn’t capture all the labels of the resources. However, these labels can be handy for deriving correlations between custom resources and other Kubernetes objects. To capture additional labels, use the --metric-labels-allowlist flag, as mentioned in the documentation.

To demonstrate, we make this change to the kube-state-metrics deployment and apply it.
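A sketch of the relevant container argument (the label list here is an assumption based on the labels that show up in the output below):

args:
  # allow these Pod labels to be exported on the kube_pod_labels metric
  - --metric-labels-allowlist=pods=[app.kubernetes.io/component,app.kubernetes.io/instance,app.kubernetes.io/managed-by,app.kubernetes.io/name,app.kubernetes.io/part-of]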

Check the metrics by doing a port-forward to the service as instructed earlier.

Check the labels captured for pod cluster1-pxc-0:

kube_pod_labels{namespace="pxc",pod="cluster1-pxc-0",uid="1083ac08-5c25-4ede-89ce-1837f2b66f3d",label_app_kubernetes_io_component="pxc",label_app_kubernetes_io_instance="cluster1",label_app_kubernetes_io_managed_by="percona-xtradb-cluster-operator",label_app_kubernetes_io_name="percona-xtradb-cluster",label_app_kubernetes_io_part_of="percona-xtradb-cluster"} 1

Labels of the pod can be checked in the cluster:

$ kubectl get po -n pxc cluster1-pxc-0 --show-labels
NAME             READY   STATUS    RESTARTS         AGE     LABELS
cluster1-pxc-0   3/3     Running   0                3h54m   app.kubernetes.io/component=pxc,app.kubernetes.io/instance=cluster1,app.kubernetes.io/managed-by=percona-xtradb-cluster-operator,app.kubernetes.io/name=percona-xtradb-cluster,app.kubernetes.io/part-of=percona-xtradb-cluster,controller-revision-hash=cluster1-pxc-6f4955bbc7,statefulset.kubernetes.io/pod-name=cluster1-pxc-0

Adhering to Prometheus conventions, the character . (dot) is replaced with _ (underscore). Only the labels mentioned in --metric-labels-allowlist are captured in the labels info metric.

Checking another pod, the kube-state-metrics pod itself:

$ kubectl get po -n kube-system kube-state-metrics-7bd9c67f64-46ksw --show-labels
NAME                                  READY   STATUS    RESTARTS      AGE    LABELS
kube-state-metrics-7bd9c67f64-46ksw   1/1     Running   1 (40m ago)   120m   app.kubernetes.io/component=exporter,app.kubernetes.io/name=kube-state-metrics,app.kubernetes.io/version=2.9.2,pod-template-hash=7bd9c67f64

The following are the labels captured in the kube-state-metrics output:

kube_pod_labels{namespace="kube-system",pod="kube-state-metrics-7bd9c67f64-46ksw",uid="d4b30238-d29e-4251-a8e3-c2fad1bff724",label_app_kubernetes_io_component="exporter",label_app_kubernetes_io_name="kube-state-metrics"} 1

As can be seen above, the label app.kubernetes.io/version is not captured because it was not mentioned in the --metric-labels-allowlist flag of kube-state-metrics.

Conclusion

  1. Custom Resource metrics can be captured by modifying the kube-state-metrics deployment, without writing any code.
  2. Alternatively, a custom exporter can be written to expose the metrics, which gives a lot of flexibility but requires coding and maintenance.
  3. The metrics can be scraped by Prometheus and combined with other metrics to derive useful insights (see the sketch below).
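As an illustration of such an insight, here is a Prometheus alerting rule sketch (the metric name is taken from the output above; the alert name and the 10-minute window are arbitrary choices) that flags a PXC cluster that is not ready:

groups:
  - name: percona-pxc
    rules:
      - alert: PXCClusterNotReady
        # fires when the 'ready' entry of the StateSet metric stays at 0
        expr: kube_customresource_pxc_status_state{state="ready"} == 0
        for: 10m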

If you want to extend the same process to other custom resources related to Percona Operators, use the following ClusterRole to provide permission to read the relevant custom resources. Configurations for some of the important metrics related to the custom resources are captured in this Configmap for you to explore.

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

 

Learn More About Percona Kubernetes Operators
