Dec
04
2024
--

The Evolution of Stateful Applications in Kubernetes

Recently I listened to Lenny Rachitsky’s podcast, where he invited Shreyas Doshi for the second time. The session was titled “4 questions Shreyas Doshi wishes he’d asked himself sooner”. One of the questions Shreyas brought up was, “Do I actually have good taste?” This is an interesting question to ask for an experienced product […]

Nov
04
2024
--

Exposing PostgreSQL with NGINX Ingress Controller

I wrote a blog post in the past about a generic approach on how to expose databases in Kubernetes with Ingress controllers. Today, we will dive deeper into PostgreSQL with ingress and also review two different ways to expose the TCP service. The goal is to expose multiple PostgreSQL clusters through […]

Sep
12
2024
--

Simplify User Management with Percona Operator for MongoDB

Managing database users within complex CI/CD pipelines and GitOps workflows has long been a challenge for MongoDB deployments. With Percona Operator for MongoDB 1.17, we introduce a new feature, currently in technical preview, that streamlines this process. Now, you can create the database users you need directly within the operator, eliminating the need to wait […]

May
29
2024
--

Beyond The Horizon: Mastering Percona Server for MongoDB Exposure in Kubernetes – Part Two – Istio

This is the second part of the series of blog posts unmasking the complexity of MongoDB cluster exposure in Kubernetes with Percona Operator for MongoDB. In the first part, we focused heavily on split horizons and a single replica set. In this part, we will expose a sharded cluster and a single replica set with Istio, […]

May
28
2024
--

Beyond the Horizon: Mastering Percona Server for MongoDB Exposure in Kubernetes – Part One

Running and managing MongoDB clusters in Kubernetes is made easy with the Percona Operator for MongoDB. Some aspects are just easy to grasp as they are well defined in the operator custom resources and documentation, but some are often considered to be a hidden craft. Network exposure in cases of sharded clusters is quite straightforward, […]

May
08
2024
--

Troubleshooting PostgreSQL on Kubernetes With Coroot

Coroot, an open source observability tool powered by eBPF, became generally available with version 1.0 last week. As this tool is cloud-native, we were curious to know how it can help troubleshoot databases on Kubernetes. In this blog post, we will see how to quickly debug PostgreSQL with Coroot and Percona Operator for PostgreSQL. The easiest […]

Jan
09
2024
--

Create an AI Expert With Open Source Tools and pgvector

2023 was the year of Artificial Intelligence (AI). A lot of companies are thinking about how they can improve user experience with AI, and the most usual first step is to use company data (internal docs, ticketing systems, etc.) to answer customer questions faster and/or automatically. In this blog post, we will explain the basic […]

Dec
27
2023
--

Cloud Native Predictions for 2024

Cloud Native Predictions for 2024

The evolution of cloud-native technology has been nothing short of revolutionary. As we step into 2024, the cornerstone of cloud-native technology, Kubernetes, will turn ten years old. It continues to solidify its position and is anticipated to reach USD 5,575.67 million by 2028, with a forecasted Compound Annual Growth Rate (CAGR) of 18.51% in the coming years, as reported by Industry Research Biz.

The Cloud Native landscape continues to encompass both micro-trends and IT macro-trends, influencing and transforming the way businesses operate and deliver value to their customers.

As we at Percona wind down 2023 and look into what the next year holds, our attention is drawn to the cloud-native landscape and how it is maturing, growing, and evolving. 

KubeCon NA 2023 recap

The theme for KubeCon NA was very clear — AI and Large Language Models (LLMs). Keynotes were focused on how Kubernetes and Cloud Native help businesses embrace the new AI era. And it is understandable, as Kubernetes slowly becomes what it is intended to be – the Platform.

The field of Platform Engineering has witnessed significant advancements, as evidenced by the publication of the CNCF platform whitepaper and the introduction of a dedicated Platform Engineering day at the upcoming KubeCon event. At Percona, we observe a growing trend among companies utilizing Kubernetes as a means to offer services to their teams, fostering expedited software delivery and driving business growth.

Declarative GitOps management, with ArgoCD and Flux, is the community way of adding orchestration on top of orchestration. In our conversations with developers and engineers during the conference, we confirmed the CNCF GitOps Microsurvey data – 91% are already using GitOps.

According to the Dynatrace Kubernetes in the Wild 2023 report, a significant 71% (with 48% year-over-year growth!) of respondents are currently utilizing databases in Kubernetes (k8s).  This finding aligns with the observations made at the Data on Kubernetes (DoK) day, where discussions surrounding this topic transitioned from niche, tech-oriented conversations a year ago to more widespread, enterprise-level interest in adopting diverse use cases. These indicators suggest that the adoption of databases on k8s is in its early stages and is likely to continue growing in the future.

Predictions

Multi-cloud is a requirement

While this wave has been building for years, in 2024, we expect it to peak. According to a 2023 Forrester survey commissioned by HashiCorp, 61% of respondents had implemented, were expanding, or were upgrading their multi-cloud strategy. We expect that number to rise higher in 2024.

Nearly every vendor at KubeCon and every person we spoke to had some form of a multi-cloud requirement or strategy. Sometimes, this comes from necessity through acquisition or mergers. Oftentimes, it is a pillar of modern infrastructure strategy to avoid cloud vendor lock-in. At this point, it is ubiquitous, and if it is not part of your strategy, you are falling behind.

The business value of adopting this strategy is multi-fold:

  • Freedom from vendor lock-in, which leads to increased negotiating power
  • Agility in capitalizing on cloud-vendor advancements to innovate faster
  • Improved RPO and RTO for application and database architectures
  • Adhering to security and governance requirements of customers

Percona’s Operators for MySQL, MongoDB, and PostgreSQL are designed with this value in mind. We want adopters of our technology to be able to deploy their critical open source databases and applications across any public or private cloud environment. All of the database automation for running a highly available, resilient, and secure database is built into the operator to simplify the operation and management of your clusters. 

Simplify and secure

Looking through various State of Kubernetes reports (VMware, Red Hat, Spectro Cloud), it becomes clear that complexity and security are the top concerns for platform engineering teams.

Simplification might come from different angles. Deployment is mostly solved already, whereas management and operations are still not. We expect to see various tooling and core patches to automate scaling, upgrades, migrations, troubleshooting, and more. 

Operators are an integral part of solving the complexity problem, where they take away the need for learning k8s primitives and application configuration internals. They also remove toil and allow engineers to focus on application development vs platform engineering work. Not only will new operators appear, but existing operators will mature and provide capabilities that meet or exceed managed services that users can get on public clouds. 

The latest report on Kubernetes adoption, security, and market trends in 2023 revealed that 67% of respondents reported delaying or slowing down deployment due to Kubernetes security concerns. Additionally, 37% of respondents experienced revenue or customer loss due to a container/Kubernetes security incident.

With open source software vulnerabilities among the top concerns, and with the rapid increase in supply chain attacks (the SolarWinds attack and vulnerabilities like Log4Shell and Spring4Shell), container and Kubernetes strategies now come with a growing emphasis on cybersecurity and operational understanding in development.

Another significant issue within security concerns is the escalating complexity of modern systems, especially in platforms like Kubernetes, which highlights the need for unified threat models and scanning tools to address vulnerabilities. Standardization and collaboration are key to sharing common knowledge and patterns across teams and infrastructures, and creating repositories of memory-safe patterns for cloud systems would improve overall security.

A majority of Red Hat’s security research respondents have a DevSecOps initiative underway. Most organizations are embracing DevSecOps, a term that covers processes and tooling enabling security to be integrated into the application development life cycle rather than treated as a separate process. However, 17% of organizations operate security separately from DevOps, lacking any DevSecOps initiatives. Consequently, they might miss out on the benefits of integrating security into the SDLC, such as enhanced efficiency, speed, and quality in software delivery.

AI and MLOps

Kubernetes has become a new web server for many production AI workloads, focusing on facilitating the development and deployment of AI applications, including model training. The newly formed Open Source AI Alliance, led by Meta and IBM, promises to support open source AI. It comprises numerous organizations from various sectors, including software, hardware, nonprofit, public, and academic. The goal is to collaboratively develop tools and programs that facilitate open development and run scalable, distributed training jobs for popular frameworks such as PyTorch, TensorFlow, MPI, MXNet, PaddlePaddle, and XGBoost.

While integrating AI and machine learning into cloud-native architectures, there’s an increasing demand from users for AI to be open and collaborative. The emergence of trends stemming from ‘AI Advancements and Ethical Concerns’ cannot be ignored.

Addressing ethical concerns and biases will necessitate the implementation of transparent AI frameworks and ethical guidelines during application development. Customers will increasingly prioritize AI efficiency and education to tackle legal and ethical concerns. This marks the end of an era of chaos, paving the way for efficiency gains, quicker innovation, and standardized practices.

Conclusion

At Percona, we prioritize staying ahead of market trends by adhering to industry best practices and leveraging our team’s expertise.

We’ve always made sure to focus on security in our software development, and weaving multi-cloud deployment into our products has been a crucial part of our strategy. Our commitment to open source software drives us to take additional precautions, ensuring operational security through best practices and principles such as least privilege, security in layers, and separation of roles/responsibilities through policy and software controls. And with multi-cloud in mind, we consistently incorporate new sharding functionalities into our roadmaps, such as the upcoming Shard-per-location support in the Percona Operator for MongoDB.

At the same time, we are not hesitating to rock the cloud-native community by incorporating top-notch features to address rising trends. We mentioned simplifying Kubernetes? Well, here we are – with storage autoscaling for databases in Kubernetes, slated for release in Q1 2024 after a year of hard work. This fully automated scaling and tuning will enable a serverless-like experience in our Operators and Everest. Developers will receive the endpoint without needing to think about resources and tuning at all. It’s worry-free and doesn’t require human intervention.

Finally, the rising popularity of generative AI and services like OpenAI’s ChatGPT or Google Bard has prompted our team to bring vector-handling capabilities to Percona-powered database software by adding support for the pgvector extension.

Our team always focuses on innovation to accelerate progress for everyone, and we will continue to push the boundaries further for our community and the rest of the world.


Dec
11
2023
--

Storage Strategies for PostgreSQL on Kubernetes

Storage Strategies for PostgreSQL on Kubernetes

Deploying PostgreSQL on Kubernetes is not new and can be easily streamlined through various Operators, including Percona’s. There are a wealth of options on how you can approach storage configuration in Percona Operator for PostgreSQL, and in this blog post, we review various storage strategies — from basics to more sophisticated use cases.

The basics

Setting StorageClass

The StorageClass resource in Kubernetes allows users to set various parameters of the underlying storage. For example, you can choose the public cloud storage type – gp3, io2, etc. – or set the file system.

You can check existing storage classes by running the following command:

$ kubectl get sc
NAME                      PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
premium-rwo               pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   54m
regionalpd-storageclass   pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   false                  51m
standard                  kubernetes.io/gce-pd    Delete          Immediate              true                   54m
standard-rwo (default)    pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   54m

As you can see, standard-rwo is the default StorageClass, meaning that if you don’t specify anything, the Operator will use it.

To instruct Percona Operator for PostgreSQL which storage class to use, set it in the spec.instances[].dataVolumeClaimSpec section:

    dataVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      storageClassName: STORAGE_CLASS_NAME
      resources:
        requests:
          storage: 1Gi

Separate volume for Write-Ahead Logs

Write-Ahead Logs (WALs) keep a record of every transaction in your PostgreSQL deployment. They are useful for point-in-time recovery and minimizing your Recovery Point Objective (RPO). In Percona Operator, it is possible to have a separate volume for WALs to minimize the impact on performance and storage capacity. To set it, use the spec.instances[].walVolumeClaimSpec section:

    walVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      storageClassName: STORAGE_CLASS_NAME
      resources:
        requests:
          storage: 1Gi

If you enable walVolumeClaimSpec, the Operator will create two volumes per replica Pod – one for data and one for WAL:

cluster1-instance1-8b2m-pgdata   Bound    pvc-2f919a49-d672-49cb-89bd-f86469241381   1Gi        RWO            standard-rwo   36s
cluster1-instance1-8b2m-pgwal    Bound    pvc-bf2c26d8-cf42-44cd-a053-ccb6abadd096   1Gi        RWO            standard-rwo   36s
cluster1-instance1-ncfq-pgdata   Bound    pvc-7ab7e59f-017a-4655-b617-ff17907ace3f   1Gi        RWO            standard-rwo   36s
cluster1-instance1-ncfq-pgwal    Bound    pvc-51baffcf-0edc-472f-9c95-7a0cea3e6507   1Gi        RWO            standard-rwo   36s
cluster1-instance1-w4d8-pgdata   Bound    pvc-c60282ed-3599-4033-afc7-e967871efa1b   1Gi        RWO            standard-rwo   36s
cluster1-instance1-w4d8-pgwal    Bound    pvc-ef530cb4-82fb-4661-ac76-ee7fda1f89ce   1Gi        RWO            standard-rwo   36s

Changing storage size

If your StorageClass and storage interface (CSI) support VolumeExpansion, you can just change the storage size in the Custom Resource manifest. The Operator will do the rest and expand the storage automatically. This is a zero-downtime operation and is limited by underlying storage capabilities only.
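For example, a minimal sketch, assuming the cluster was created with the 1Gi request shown earlier and the StorageClass has allowVolumeExpansion: true – bump the request in the Custom Resource and apply it; the Operator resizes the PVCs:

    dataVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      storageClassName: STORAGE_CLASS_NAME
      resources:
        requests:
          storage: 2Gi   # was 1Gi; the Operator expands the underlying PVCs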

Changing storage

It is also possible to change the storage capabilities, such as filesystem, IOPS, and type. Right now, this is done by creating a new storage class and applying it to a new instance group.

spec:
  instances:
  - name: newGroup
    dataVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      storageClassName: NEW_STORAGE_CLASS
      resources:
        requests:
          storage: 2Gi

Creating a new instance group replicates the data to new replica nodes. This is done without downtime, but replication might introduce additional load on the primary node and the network. 

There is work in progress under Kubernetes Enhancement Proposal (KEP) #3780. It will allow users to change various volume attributes on the fly rather than through the storage class.
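For illustration only, a rough sketch of what the KEP-3780 approach (VolumeAttributesClass) could look like once it is available in your cluster; the API version and the parameter names are assumptions that depend on your Kubernetes release and CSI driver:

apiVersion: storage.k8s.io/v1beta1
kind: VolumeAttributesClass
metadata:
  name: fast-io
driverName: pd.csi.storage.gke.io
parameters:
  iops: "4000"          # illustrative; valid keys are defined by the CSI driver
  throughput: "150"

A PVC would then reference it via spec.volumeAttributesClassName instead of being recreated with a new StorageClass.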

Data persistence

Finalizers

By default, the Operator keeps the storage and secret resources if the cluster is deleted. We do it to protect the users from human errors and other situations. This way, the user can quickly start the cluster, reusing the existing storage and secrets.

This default behavior can be changed by enabling a finalizer in the Custom Resource: 

apiVersion: pgv2.percona.com/v2
kind: PerconaPGCluster
metadata:
  name: cluster1
  finalizers:
  - percona.com/delete-pvc
  - percona.com/delete-ssl

This is useful for non-production clusters where you don’t need to keep the data. 

StorageClass data protection

There are extreme cases where human error is inevitable. For example, someone can delete the whole Kubernetes cluster or a namespace. The good thing is that the StorageClass resource comes with a reclaimPolicy option, which can instruct the Container Storage Interface to keep the underlying volumes. This option is not controlled by the Operator, and you should set it for the StorageClass separately.

apiVersion: storage.k8s.io/v1
kind: StorageClass
...
provisioner: pd.csi.storage.gke.io
- reclaimPolicy: Delete
+ reclaimPolicy: Retain

In this case, even if Kubernetes resources are deleted, the physical storage is still there.
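Note that reclaimPolicy from the StorageClass is applied only when a volume is provisioned; for volumes that already exist, you can patch the PersistentVolume directly. A sketch, substituting your own PV name:

$ kubectl patch pv pvc-2f919a49-d672-49cb-89bd-f86469241381 -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'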

Regional disks

Regional disks are available on Azure and Google Cloud but not yet on AWS. In a nutshell, it is a disk that is replicated across two availability zones (AZs).


To use regional disks, you need a storage class that specifies the AZs in which the disk will be available and replicated:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: regionalpd-storageclass
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-balanced
  replication-type: regional-pd
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: topology.gke.io/zone
    values:
    - us-central1-a
    - us-central1-b

There are some scenarios where regional disks can help with cost reduction. Let’s review three PostgreSQL topologies:

  1. Single node with regular disks
  2. Single node with regional disks
  3. PostgreSQL Highly Available cluster with regular disks

If we apply availability zone failure to these topologies, we will get the following:

  1. Single node with regular disks is the cheapest one, but in case of AZ failure, recovery might take hours or even days – depending on the data.
  2. With single node and regional disks, you will not be spending a dime on compute for replicas, but at the same time, you will recover within minutes. 
  3. PostgreSQL cluster provides the best availability, but also comes with high compute costs.
                          Single PostgreSQL node,  Single PostgreSQL node,  PostgreSQL HA,
                          regular disk             regional disks           regular disks
Compute costs             $                        $                        $$
Storage costs             $                        $$                       $$
Network costs             $0                       $0                       $
Recovery Time Objective   Hours                    Minutes                  Seconds

Local storage

One of the ways to reduce your total cost of ownership (TCO) for stateful workloads on Kubernetes and boost your performance is to use local storage as opposed to network disks. Public clouds provide instances with NVMe SSDs that can be utilized in k8s with tools like OpenEBS, Portworx, and more. The way it is consumed is through regular storage classes and deserves a separate blog post.
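As a quick taste, a minimal sketch using the built-in local volume support with static provisioning (tools like OpenEBS or Portworx add dynamic provisioning on top of this idea; the class name here is hypothetical):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-nvme
provisioner: kubernetes.io/no-provisioner   # no dynamic provisioning; PVs are created per local disk
volumeBindingMode: WaitForFirstConsumer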


Conclusion

In this blog post, we discussed the basics of storage configuration and saw how to fine-tune various storage parameters. There are different topologies, needs, and corresponding strategies for running PostgreSQL on Kubernetes, and depending on your cost, performance, and availability needs, you have a wealth of options with Percona Operators. 

Try out the Percona Operator for PostgreSQL by following the quickstart guide here.

Join the Percona Kubernetes Squad – a group of database professionals at the forefront of innovating database operations on Kubernetes within their organizations and beyond. The Squad is dedicated to providing its members with unwavering support as we all navigate the cloud-native landscape.

Nov
06
2023
--

Percona Operators Custom Resource Monitoring With Kube-state-metrics

Percona Operators Custom Resource Monitoring With Kube-state-metrics

There are more than 300 Operators on OperatorHub, and the number is growing. Percona Operators allow users to easily manage complex database systems in a Kubernetes environment. With Percona Operators, users can easily deploy, monitor, and manage databases orchestrated by Kubernetes, making it easier and more efficient to run databases at scale.

Our Operators come with Custom Resources that have their own statuses and fields to ease monitoring and troubleshooting. For example, the PerconaServerMongoDBBackup resource has information about the backup, like its success or failure. Obviously, there are ways to monitor the backup through storage monitoring or Pod status, but why bother if the Operator already provides this information?

In this article, we will see how someone can monitor Custom Resources that are created by the Operators with kube-state-metrics (KSM), a standard and widely adopted service that listens to the Kubernetes API server and generates metrics. These methods can be applied to any Custom Resources.

Please find the code and recipes from this blog post in this GitHub repository.

The problem

Kube-state-metrics talks to the Kubernetes API and captures information about various resources – Pods, Deployments, Services, etc. Once captured, the metrics are exposed. In the monitoring pipeline, a tool like Prometheus scrapes the exposed metrics.
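For context, scraping usually amounts to a small job in the Prometheus configuration pointed at the kube-state-metrics Service. A sketch – the Service name, namespace, and port are assumptions that depend on how KSM was installed:

scrape_configs:
  - job_name: kube-state-metrics
    static_configs:
      - targets: ['kube-state-metrics.kube-system.svc:8080']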


The problem is that the Custom Resource manifest structure varies depending on the Operator. KSM does not know what to look for in the Kubernetes API. So, our goal is to explain which fields in the Custom Resource we want kube-state-metrics to capture and expose.

The solution

Kube-state-metrics is designed to be extendable for capturing custom resource metrics. It is possible to specify through the custom configuration the resources you need to capture and expose.

Details

Install Kube-state-metrics

To start with, install kube-state-metrics if you haven’t done so already. We observed issues in scraping custom resource metrics with version 2.5.0, but were able to scrape them without any issues from version 2.8.2 onwards.
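One common way to install it is via the prometheus-community Helm chart. A sketch, assuming Helm is configured and kube-system is an acceptable namespace:

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm install kube-state-metrics prometheus-community/kube-state-metrics -n kube-system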

Identify the metrics you want to expose along with the path

Custom resources have a lot of fields. You need to choose the fields that need to be exposed.

For example, the Custom Resource PerconaXtraDBCluster has plenty of fields: spec.crVersion indicates the CR version, and spec.pxc.size shows the number of Percona XtraDB Cluster nodes set by the user (we will later look at a better way to monitor the number of nodes in a PXC cluster).

Metrics can be captured from the status field of the Custom Resource, if present. For example, following is the status fetched from a PerconaXtraDBCluster Custom Resource; status.state indicates the state of the Custom Resource, which is very handy information.

$ kubectl get pxc pxc-1 -oyaml | yq 'del(.status.conditions) | .status'
backup: {}
haproxy:
…
  ready: 3
  size: 3
  status: ready
pxc:
 …
  ready: 3
  size: 3
  status: ready
  version: 8.0.29-21.1
ready: 6
size: 6
state: ready

Decide the type of metrics for the fields identified

As of today, kube-state-metrics supports three types of metrics from the OpenMetrics specification:

  1. Gauge
  2. StateSet
  3. Info

Based on the fields selected, map each field to the metric type you want to expose it as. For example:

  1. spec.crVersion remains constant throughout the lifecycle of the custom resource until it’s upgraded. Metric type “Info” would be a better fit for this.

  2. spec.pxc.size is a number, and it keeps changing based on the number desired by the user and operator configurations. Even though the number is pretty much constant in the later phases of the custom resource’s lifecycle, it can change. “Gauge” is a great fit for this type of metric.

  3. status.state can take one of a limited set of possible values. “StateSet” would be a better fit for this type of metric.

Derive the configuration to capture custom resource metrics

As per the documentation, configuration needs to be added to the kube-state-metrics deployment to define your custom resources and the fields to turn into metrics.

Configuration derived for the three metrics discussed above can be found here.
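To give an idea of its shape, here is a sketch of such a configuration for the three metrics above, following the kube-state-metrics custom-resource-state format (the exact configuration used in this post lives in the linked repository; the field paths and help texts here mirror the metrics shown later):

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: pxc.percona.com
        version: v1
        kind: PerconaXtraDBCluster
      metrics:
        - name: pxc_info
          help: Information of PXC cluster on k8s
          each:
            type: Info
            info:
              labelsFromPath:
                version: [spec, crVersion]
        - name: pxc_size
          help: Desired size for the PXC cluster
          each:
            type: Gauge
            gauge:
              path: [spec, pxc, size]
        - name: pxc_status_state
          help: State of PXC Cluster
          each:
            type: StateSet
            stateSet:
              labelName: state
              path: [status, state]
              list: [ready, initializing, paused, error, unknown]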

Consume the configuration in kube-state-metrics deployment

As per the official documentation, there are two ways to apply custom configurations:

  1. Inline: by using --custom-resource-state-config “inline yaml”
  2. Referring to a file: by using --custom-resource-state-config-file /path/to/config.yaml

Inline is not handy if the configuration is big. Referring to a file is better and gives more flexibility.

It is important to note that the path to the file is a path in the container file system of kube-state-metrics. There are several ways to get a file into the container file system, but one of the options is to mount the data of a ConfigMap into the container.

Steps:

1. Create a configmap with the configurations derived 
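For example, a sketch, assuming the configuration was saved locally as config.yaml and kube-state-metrics runs in the kube-system namespace:

$ kubectl create configmap customresource-config-ksm --from-file=config.yaml -n kube-system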

2. Add configmap as a volume to the kube-state-metrics pod

  volumes:
      - configMap:
          name: customresource-config-ksm
        name: cr-config

3. Mount the volume to the container. As per the Dockerfile of kube-state-metrics, the path “/go/src/k8s.io/kube-state-metrics/” can be used to mount the file. 

volumeMounts:
        - mountPath: /go/src/k8s.io/kube-state-metrics/
          name: cr-config

Provide permission to access the custom resources

By default, kube-state-metrics has permission to access standard resources only, as per its ClusterRole. If it is deployed without additional privileges, the required metrics won’t be scraped.

Add additional privileges based on the custom resources you want to monitor. In this example, we will add additional privileges to monitor PerconaXtraDBCluster, PerconaXtraDBClusterBackup, and PerconaXtraDBClusterRestore.
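A sketch of the extra rule to append to the kube-state-metrics ClusterRole; the resource plurals follow the pxc.percona.com/v1 CRDs seen in the logs below:

- apiGroups: ["pxc.percona.com"]
  resources:
    - perconaxtradbclusters
    - perconaxtradbclusterbackups
    - perconaxtradbclusterrestores
  verbs: ["get", "list", "watch"]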

Apply the ClusterRole and check the logs to see if custom resources are being captured.

Validate the metrics being captured

Check the logs of kube-state-metrics

$ kubectl logs -f deploy/kube-state-metrics
I0706 14:43:25.273822       1 wrapper.go:98] "Starting kube-state-metrics"
.
.
.
I0706 14:43:28.285613       1 discovery.go:274] "discovery finished, cache updated"
I0706 14:43:28.285652       1 metrics_handler.go:99] "Autosharding disabled"
I0706 14:43:28.288930       1 custom_resource_metrics.go:79] "Custom resource state added metrics" familyNames=[kube_customresource_pxc_info kube_customresource_pxc_size kube_customresource_pxc_status_state]
I0706 14:43:28.411540       1 builder.go:275] "Active resources" activeStoreNames="certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,leases,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments,pxc.percona.com/v1, Resource=perconaxtradbclusters"

Check the kube-state-metrics service to list the metrics scraped. 

Open a terminal and keep the port-forward command running:

$ kubectl port-forward svc/kube-state-metrics  8080:8080
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
Handling connection for 8080

In a browser, check for the metrics captured using “127.0.0.1:8080” (remember to keep the terminal running where the port-forward command is running).

Observe the metrics kube_customresource_pxc_info, kube_customresource_pxc_status_state, and kube_customresource_pxc_size being captured.

# HELP kube_customresource_pxc_info Information of PXC cluster on k8s
# TYPE kube_customresource_pxc_info info
kube_customresource_pxc_info{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",version="1.9.0"} 1
# HELP kube_customresource_pxc_size Desired size for the PXC cluster
# TYPE kube_customresource_pxc_size gauge
kube_customresource_pxc_size{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1"} 3
# HELP kube_customresource_pxc_status_state State of PXC Cluster
# TYPE kube_customresource_pxc_status_state stateset
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="error"} 1
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="initializing"} 0
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="paused"} 0
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="ready"} 0
kube_customresource_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",state="unknown"} 0

Customize the metric name, add default labels

As seen above, the metrics captured have the prefix kube_customresource. What if we want to customize it?

There are some standard labels, like the name and namespace of the custom resource, which might need to be captured as labels for all the metrics related to a custom resource. It’s not practical to add this for every single metric captured. Hence, the identifiers labelsFromPath and metricNamePrefix are used.

In the below snippet, all the metrics captured for group pxc.percona.com, version v1, kind PerconaXtraDBCluster will have the metric prefix kube_pxc, and all the metrics will have the following labels:

  • name – Derived from the path metadata.name of the custom resource
  • namespace – Derived from the path metadata.namespace of the custom resource.
spec:
  resources:
    - groupVersionKind:
        group: pxc.percona.com
        version: v1
        kind: PerconaXtraDBCluster
      labelsFromPath:
        name: [metadata,name]
        namespace: [metadata,namespace]
      metricNamePrefix: kube_pxc

Change the configuration present in the configmap and apply the new configmap.

When the new configmap is applied, kube-state-metrics should automatically pick up the configuration changes; you can also do a “kubectl rollout restart deploy kube-state-metrics” to expedite the pod restart.

Once the changes are applied, check the metrics by port-forwarding to kube-state-metrics service.

$ kubectl port-forward svc/kube-state-metrics  8080:8080
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
Handling connection for 8080

In a browser, check for the metrics captured using “127.0.0.1:8080” (remember to keep the terminal running where the port-forward command is running).

Observe the metrics:

# HELP kube_pxc_pxc_info Information of PXC cluster on k8s
# TYPE kube_pxc_pxc_info info
kube_pxc_pxc_info{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",version="1.9.0"} 1
# HELP kube_pxc_pxc_size Desired size for the PXC cluster
# TYPE kube_pxc_pxc_size gauge
kube_pxc_pxc_size{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc"} 3
# HELP kube_pxc_pxc_status_state State of PXC Cluster
# TYPE kube_pxc_pxc_status_state stateset
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="error"} 1
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="initializing"} 0
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="paused"} 0
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="ready"} 0
kube_pxc_pxc_status_state{customresource_group="pxc.percona.com",customresource_kind="PerconaXtraDBCluster",customresource_version="v1",name="cluster1",namespace="pxc",state="unknown"} 0

Labels customization

By default, kube-state-metrics doesn’t capture all the labels of the resources. However, this might be handy in deriving correlations from custom resources to the k8s objects. To add additional labels, use the flag --metric-labels-allowlist as mentioned in the documentation.

To demonstrate, changes are made to the kube-state-metrics deployment and applied.
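A sketch of what that change could look like in the deployment’s container args; the label list here is an assumption based on the labels that show up below, so adjust it to your needs:

containers:
  - name: kube-state-metrics
    args:
      - --metric-labels-allowlist=pods=[app.kubernetes.io/component,app.kubernetes.io/instance,app.kubernetes.io/managed-by,app.kubernetes.io/name,app.kubernetes.io/part-of]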

Check the metrics by doing a port-forward to the service as instructed earlier.

Check the labels captured for pod cluster1-pxc-0:

kube_pod_labels{namespace="pxc",pod="cluster1-pxc-0",uid="1083ac08-5c25-4ede-89ce-1837f2b66f3d",label_app_kubernetes_io_component="pxc",label_app_kubernetes_io_instance="cluster1",label_app_kubernetes_io_managed_by="percona-xtradb-cluster-operator",label_app_kubernetes_io_name="percona-xtradb-cluster",label_app_kubernetes_io_part_of="percona-xtradb-cluster"} 1

Labels of the pod can be checked in the cluster:

$ kubectl get po -n pxc cluster1-pxc-0 --show-labels
NAME             READY   STATUS    RESTARTS         AGE     LABELS
cluster1-pxc-0   3/3     Running   0                3h54m   app.kubernetes.io/component=pxc,app.kubernetes.io/instance=cluster1,app.kubernetes.io/managed-by=percona-xtradb-cluster-operator,app.kubernetes.io/name=percona-xtradb-cluster,app.kubernetes.io/part-of=percona-xtradb-cluster,controller-revision-hash=cluster1-pxc-6f4955bbc7,statefulset.kubernetes.io/pod-name=cluster1-pxc-0

Adhering to the Prometheus conventions, the character . (dot) is replaced with _ (underscore). Only labels mentioned in the --metric-labels-allowlist are captured for the labels info.

Checking for the other pod:

$ kubectl get po -n kube-system kube-state-metrics-7bd9c67f64-46ksw --show-labels
NAME                                  READY   STATUS    RESTARTS      AGE    LABELS
kube-state-metrics-7bd9c67f64-46ksw   1/1     Running   1 (40m ago)   120m   app.kubernetes.io/component=exporter,app.kubernetes.io/name=kube-state-metrics,app.kubernetes.io/version=2.9.2,pod-template-hash=7bd9c67f64

Following are the labels captured in the kube-state-metrics service:

kube_pod_labels{namespace="kube-system",pod="kube-state-metrics-7bd9c67f64-46ksw",uid="d4b30238-d29e-4251-a8e3-c2fad1bff724",label_app_kubernetes_io_component="exporter",label_app_kubernetes_io_name="kube-state-metrics"} 1

As can be seen above, the label app.kubernetes.io/version is not captured because it was not mentioned in the --metric-labels-allowlist flag of kube-state-metrics.

Conclusion

  1. Custom Resource metrics can be captured by modifying the kube-state-metrics deployment. Metrics can be captured without writing any code.
  2. Alternatively, a custom exporter can be written to expose the metrics, which gives a lot of flexibility. However, this needs coding and maintenance.
  3. Metrics can be scraped by Prometheus and combined with other metrics to derive useful insights.

If you want to extend the same process to other custom resources related to Percona Operators, use the following ClusterRole to provide permission to read the relevant custom resources. Configurations for some of the important metrics related to the custom resources are captured in this Configmap for you to explore.

