Apr
22
2021
--

Google’s Anthos multicloud platform gets improved logging, Windows container support and more

Google today announced a sizable update to its Anthos multicloud platform that lets you build, deploy and manage containerized applications anywhere, including on Amazon’s AWS and (in preview) on Microsoft Azure.

Version 1.7 includes new features like improved metrics and logging for Anthos on AWS, a new Connect gateway to interact with any cluster right from Google Cloud and a preview of Google’s managed control plane for Anthos Service Mesh. Other new features include Windows container support for environments that use VMware’s vSphere platform and new tools for developers to make it easier for them to deploy their applications to any Anthos cluster.

Today’s update comes almost exactly two years after Google CEO Sundar Pichai originally announced Anthos at its Cloud Next event in 2019 (before that, Google called this project the “Google Cloud Services Platform,” which launched three years ago). Hybrid and multicloud, it’s fair to say, takes a key role in the Google Cloud roadmap — and maybe more so for Google than for any of its competitors. Recently, Google brought on industry veteran Jeff Reed to become the VP of Product Management in charge of Anthos.

Reed told me that he believes that there are a lot of factors right now that are putting Anthos in a good position. “The wind is at our back. We bet on Kubernetes, bet on containers — those were good decisions,” he said. Increasingly, customers are also now scaling out their use of Kubernetes and have to figure out how to best scale out their clusters and deploy them in different environments — and to do so, they need a consistent platform across these environments. He also noted that when it comes to bringing on new Anthos customers, it’s really those factors that determine whether a company will look into Anthos or not.

He acknowledged that there are other players in this market, but he argues that Google Cloud’s take on this is also quite different. “I think we’re pretty unique in the sense that we’re from the cloud, cloud-native is our core approach,” he said. “A lot of what we talk about in [Anthos] 1.7 is about how we leverage the power of the cloud and use what we call “an anchor in the cloud” to make your life much easier. We’re more like a cloud vendor there, but because we support on-prem, we see some of those other folks.” Those other folks being IBM/Red Hat’s OpenShift and VMware’s Tanzu, for example. 

The addition of support for Windows containers in vSphere environments also points to the fact that a lot of Anthos customers are classical enterprises that are trying to modernize their infrastructure, yet still rely on a lot of legacy applications that they are now trying to bring to the cloud.

Looking ahead, one thing we’ll likely see is more integrations with a wider range of Google Cloud products into Anthos. And indeed, as Reed noted, inside of Google Cloud, more teams are now building their products on top of Anthos themselves. In turn, that then makes it easier to bring those services to an Anthos-managed environment anywhere. One of the first of these internal services that run on top of Anthos is Apigee. “Your Apigee deployment essentially has Anthos underneath the covers. So Apigee gets all the benefits of a container environment, scalability and all those pieces — and we’ve made it really simple for that whole environment to run kind of as a stack,” he said.

I guess we can expect to hear more about this in the near future — or at Google Cloud Next 2021.

Apr
20
2021
--

Change Storage Class on Kubernetes on the Fly

Change Storage Class Kubernetes

Change Storage Class KubernetesPercona Kubernetes Operators support various options for storage: Persistent Volume (PV), hostPath, ephemeral storage, etc. In most of the cases, PVs are used, which are provisioned by the Operator through Storage Classes and Persistent Volume Claims.

Storage Classes define the underlying volume type that should be used (ex. AWS, EBS, gp2, or io1), file system (xfs, ext4), and permissions. In various cases, cluster administrators want to change the Storage Class for already existing volumes:

  • DB cluster is underutilized and it is a good cost-saving when switching from io1 to gp2
  • The other way – DB cluster is saturated on IO and it is required to upsize the volumes
  • Switch the file system for better performance (MongoDB is much better with xfs)

In this blog post, we will show what the best way is to change the Storage Class with Percona Operators and not introduce downtime to the database. We will cover the change in the Storage Class, but not the migration from PVC to other storage types, like hostPath.

Changing Storage Class on Kubernetes

Prerequisites:

Goal:

  • Change the storage from pd-standard to pd-ssd without downtime for PXC.

Planning

The steps we are going to take are the following:

  1. Create a new Storage class for pd-ssd volumes
  2. Change the
    storageClassName

    in Custom Resource (CR) 

  3. Change the
    storageClassName

    in the StatefulSet

  4. Scale up the cluster (optional, to avoid performance degradation)
  5. Reprovision the Pods one by one to change the storage
  6. Scale down the cluster

 

Register for Percona Live ONLINE
A Virtual Event about Open Source Databases

 

Execution

Create the Storage Class

By default, standard Storage Class is already present in GKE, we need to create the new

StorageClass

 for ssd:

$ cat pd-ssd.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ssd
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
volumeBindingMode: Immediate
reclaimPolicy: Delete

$ kubectl apply -f pd-ssd.yaml

The new Storage Class will be called

ssd

and will provision the volumes of

type: pd-ssd

.

Change the storageClassName in Custom Resource

We need to change the configuration for Persistent Volume Claims in our Custom Resource. The variable we look for is

storageClassName

and it is located under

spec.pxc.volumeSpec.persistentVolumeClaim

.

spec:
  pxc:
    volumeSpec:
      persistentVolumeClaim:
-       storageClassName: standard
+       storageClassName: ssd

Now apply new cr.yaml:

$ kubectl apply -f deploy/cr.yaml

Change the storageClassName in the StatefulSet

StatefulSets are almost immutable and when you try to edit the object you get the warning:

# * spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden

We will rely on the fact that the operator controls the StatefulSet. If we delete it, the Operator is going to recreate it with the last applied configuration. For us, it means – with the new storage class. But the deletion of the StatefulSet leads to Pods termination, but our goal is 0 downtime. To get there, we will delete the set, but keep the Pods running. It can be done with –cascade flag:

$ kubectl delete sts cluster1-pxc --cascade=orphan

As a result, the Pods are up and the Operator recreated the StatefulSet with new storageClass:

$ kubectl get sts | grep cluster1-pxc
cluster1-pxc       3/3      8s

Change Storage Class on Kubernetes

Scale up the Cluster (Optional)

Changing the storage type would require us to terminate the Pods, which decreases the computational power of the cluster and might cause performance issues. To improve performance during the operation we are going to changing the size of the cluster from 3 to 5 nodes:

spec:
  pxc:
-   size: 3
+   size: 5

$ kubectl apply -f deploy/cr.yaml

As long as we have changed the StatefulSet already, new PXC Pods will be provisioned with the volumes backed by the new

StorageClass

:

$ kubectl get pvc
datadir-cluster1-pxc-0   Bound    pvc-6476a94c-fa1b-45fe-b87e-c884f47bd328   6Gi        RWO            standard       78m
datadir-cluster1-pxc-1   Bound    pvc-fcfdeb71-2f75-4c36-9d86-8c68e508da75   6Gi        RWO            standard       76m
datadir-cluster1-pxc-2   Bound    pvc-08b12c30-a32d-46a8-abf1-59f2903c2a9e   6Gi        RWO            standard       64m
datadir-cluster1-pxc-3   Bound    pvc-b96f786e-35d6-46fb-8786-e07f3097da02   6Gi        RWO            ssd            69m
datadir-cluster1-pxc-4   Bound    pvc-84b55c3f-a038-4a38-98de-061868fd207d   6Gi        RWO            ssd           68m

Reprovision the Pods One by One to Change the Storage

This is the step where underlying storage is going to be changed for the database Pods.

Delete the PVC of the Pod that you are going to reprovision. Like for Pod

cluster1-pxc-2

the PVC is called

datadir-cluster1-pxc-2

:

$ kubectl delete pvc datadir-cluster1-pxc-2

The PVC will not be deleted right away as there is a Pod using it. To proceed, delete the Pod:

$ kubectl delete pod cluster1-pxc-2

The Pod will be deleted along with the PVCs. The StatefulSet controller will notice that the pod is gone and will recreate it along with the new PVC of a new Storage Class:

$ kubectl get pods
...
cluster1-pxc-2                                     0/3     Init:0/1   0          7s

$ kubectl get pvc datadir-cluster1-pxc-2
NAME                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
…
datadir-cluster1-pxc-2   Bound    pvc-08b12c30-a32d-46a8-abf1-59f2903c2a9e   6Gi        RWO            ssd      10s

The STORAGECLASS column indicates that this PVC is of type ssd.

You might face the situation that the Pod is stuck in a pending state with the following error:

  Warning  FailedScheduling  63s (x3 over 71s)  default-scheduler  persistentvolumeclaim "datadir-cluster1-pxc-2" not found

It can happen due to the race condition: Pod was created when old PVC was terminating, and when the Pod is ready to start the PVC is already gone. Just delete the Pod again, so that the PVC is recreated.

Once the Pod is up, the State Snapshot Transfer kicks in and the data is synced from other nodes. It might take a while if your cluster holds lots of data and is heavily utilized. Please wait till the node is fully up and running, sync is finished, and only then proceed to the next Pod.

Scale Down the Cluster

Once all the Pods are running on the new storage it is time to scale down the cluster (if it was scaled up):

spec:
  pxc:
-   size: 5
+   size: 3


$ kubectl apply -f deploy/cr.yaml

Do not forget to clean up the PVCs for nodes 4 and 5. And done!

Conclusion

Changing the storage type is a simple task on the public clouds, but its simplicity is not synchronized yet with Kubernetes capabilities. Kubernetes and Container Native landscape is evolving and we hope to see this functionality soon. The way of changing Storage Class described in this blog post can be applied to both the Percona Operator for PXC and Percona Operator for MongoDB. If you have ideas on how to automate this and are willing to collaborate with us on the implementation, please submit the Issue to our public roadmap on Github.

Apr
07
2021
--

Percona Kubernetes Operators and Azure Blob Storage

Percona Kubernetes Operators and Azure Blob Storage

Percona Kubernetes Operators allow users to simplify deployment and management of MongoDB and MySQL databases on Kubernetes. Both operators allow users to store backups on S3-compatible storage and leverage Percona XtraBackup and Percona Backup for MongoDB to deliver backup and restore functionality. Both backup tools do not work with Azure Blob Storage, which is not compatible with the S3 protocol.

This blog post explains how to run Percona Kubernetes Operators along with MinIO Gateway on Azure Kubernetes Service (AKS) and store backups on Azure Blob Storage:

Percona Kubernetes Operators along with MinIO Gateway

Setup

Prerequisites:

  • Azure account
  • Azure Blob Storage account and container (the Bucket in AWS terms)
  • Cluster deployed with Azure Kubernetes Service (AKS)

Deploy MinIO Gateway

I have prepared the manifest to deploy the MinIO gateway to Kubernetes, you can find them in the Github repo here.

First, create a separate namespace:

kubectl create namespace minio-gw

Create the secret which contains credentials for Azure Blob Storage:

$ cat minio-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: minio-secret
stringData:
  AZURE_ACCOUNT_NAME: Azure_account_name
  AZURE_ACCOUNT_KEY: Azure_account_key

$ kubectl -n minio-gw apply -f minio-secret.yaml

Apply

minio-gateway.yaml

 from the repository. This manifest does two things:

  1. Creates MinIO Pod backed by Deployment object
  2. Exposes this Pod on port 9000 as a ClusterIP through a Service object
$ kubectl -n minio-gw apply -f blog-data/operators-azure-blob/minio-gateway.yaml

It is also possible to use Helm Charts and deploy the Gateway with MinIO Operator. You can read more about it here. Running a MinIO Operator might be a good choice, but it is an overkill for this blog post.

Deploy PXC

Get the code from Github:

git clone -b v1.7.0 https://github.com/percona/percona-xtradb-cluster-operator

Deploy the bundle with Custom Resource Definitions:

cd percona-xtradb-cluster-operator 
kubectl apply -f deploy/bundle.yaml

Create the Secret object for backup. You should use the same Azure Account Name and Key that you used to setup MinIO:

$ cat deploy/backup-s3.yam
apiVersion: v1
kind: Secret
metadata:
  name: azure-backup
type: Opaque
data:
  AWS_ACCESS_KEY_ID: BASE64_ENCODED_AZURE_ACCOUNT_NAME
  AWS_SECRET_ACCESS_KEY: BASE64_ENCODED_AZURE_ACCOUNT_KEY

Add storage configuration into

cr.yaml

 under

spec.backup.storages

.

storages:
  azure-minio:
    type: s3
    s3:
      bucket: test-azure-container
      credentialsSecret: azure-backup
      endpointUrl: http://minio-gateway-svc.minio-gw:9000

  • bucket

    is the container created on Azure Blob Storage.

  • endpointUrl

    must point to the MinIO Gateway service that was created in the previous section.

Deploy the database cluster:

$ kubectl apply -f deploy/cr.yaml

Read more about the installation of the Percona XtraDB Cluster Operator in our documentation.

Take Backups and Restore

To take the backup or restore, follow the regular approach by creating corresponding

pxc-backup

or

pxc-restore

Custom Resources in Kubernetes. For example, to take the backup I use the following manifest:

$ cat deploy/backup/backup.yaml
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterBackup
metadata:
  name: backup1
spec:
  pxcCluster: cluster1
  storageName: azure-minio

This creates the Custom Resource object

pxc-backup

and the Operator uploads the backup to the Container in my Storage account:

Read more about backup and restore functionality in the Percona Kubernetes Operator for Percona XtraDB Cluster documentation.

Conclusion

Even though Azure Blob Storage is not S3-compatible, Cloud Native landscape provides production-ready tools for seamless integration. MinIO Gateway will work for both Percona Kubernetes Operators for MySQL and MongoDB, enabling S3-like backup and restore functionality.

The Percona team is committed to delivering smooth integration for its software products for all major clouds. Adding support for Azure Blob Storage is on the roadmap of Percona XtraBackup and Percona Backup for MongoDB, so as the certification on Azure Kubernetes Service for both operators.

Apr
06
2021
--

Esri brings its flagship ArcGIS platform to Kubernetes

Esri, the geographic information system (GIS), mapping and spatial analytics company, is hosting its (virtual) developer summit today. Unsurprisingly, it is making a couple of major announcements at the event that range from a new design system and improved JavaScript APIs to support for running ArcGIS Enterprise in containers on Kubernetes.

The Kubernetes project was a major undertaking for the company, Esri Product Managers Trevor Seaton and Philip Heede told me. Traditionally, like so many similar products, ArcGIS was architected to be installed on physical boxes, virtual machines or cloud-hosted VMs. And while it doesn’t really matter to end-users where the software runs, containerizing the application means that it is far easier for businesses to scale their systems up or down as needed.

Esri ArcGIS Enterprise on Kubernetes deployment

Esri ArcGIS Enterprise on Kubernetes deployment. Image Credits: Esri

“We have a lot of customers — especially some of the larger customers — that run very complex questions,” Seaton explained. “And sometimes it’s unpredictable. They might be responding to seasonal events or business events or economic events, and they need to understand not only what’s going on in the world, but also respond to their many users from outside the organization coming in and asking questions of the systems that they put in place using ArcGIS. And that unpredictable demand is one of the key benefits of Kubernetes.”

Deploying Esri ArcGIS Enterprise on Kubernetes

Deploying Esri ArcGIS Enterprise on Kubernetes. Image Credits: Esri

The team could have chosen to go the easy route and put a wrapper around its existing tools to containerize them and call it a day, but as Seaton noted, Esri used this opportunity to re-architect its tools and break it down into microservices.

“It’s taken us a while because we took three or four big applications that together make up [ArcGIS] Enterprise,” he said. “And we broke those apart into a much larger set of microservices. That allows us to containerize specific services and add a lot of high availability and resilience to the system without adding a lot of complexity for the administrators — in fact, we’re reducing the complexity as we do that and all of that gets installed in one single deployment script.”

While Kubernetes simplifies a lot of the management experience, a lot of companies that use ArcGIS aren’t yet familiar with it. And as Seaton and Heede noted, the company isn’t forcing anyone onto this platform. It will continue to support Windows and Linux just like before. Heede also stressed that it’s still unusual — especially in this industry — to see a complex, fully integrated system like ArcGIS being delivered in the form of microservices and multiple containers that its customers then run on their own infrastructure.

Image Credits: Esri

In addition to the Kubernetes announcement, Esri also today announced new JavaScript APIs that make it easier for developers to create applications that bring together Esri’s server-side technology and the scalability of doing much of the analysis on the client-side. Back in the day, Esri would support tools like Microsoft’s Silverlight and Adobe/Apache Flex for building rich web-based applications. “Now, we’re really focusing on a single web development technology and the toolset around that,” Esri product manager Julie Powell told me.

A bit later this month, Esri also plans to launch its new design system to make it easier and faster for developers to create clean and consistent user interfaces. This design system will launch April 22, but the company already provided a bit of a teaser today. As Powell noted, the challenge for Esri is that its design system has to help the company’s partners put their own style and branding on top of the maps and data they get from the ArcGIS ecosystem.

 

Apr
01
2021
--

Infinitely Scalable Storage with High Compression Feature

Infinitely Scalable Storage with High Compression Feature

Infinitely Scalable Storage with High Compression FeatureIt is no secret that compute and storage costs are the main drivers of cloud bills. Migration of data from the legacy data center to the cloud looks appealing at first as it significantly reduces capital expense (CapEx) and keeps operational expenses (OpEx) under control. But once you see the bill, the lift and shift project does not look that promising anymore. See Percona’s recent open source survey which shows that many organizations saw an unexpected growth around cloud and data.

Storage growth is an organic process for the expanding business: more customers store more data, and more data needs more backups and disaster recovery storage for low RTO.

Today, the Percona Innovation Team, which is part of the Engineering organization, is proud to announce a new feature – High Compression. With this feature enabled, your MySQL databases will have infinite storage at zero cost.

The Problem

Our research team was digging into the problem of storage growth. They have found that the storage growth of a successful business inevitably leads to the increase of the cloud bill. After two years of research we got the data we need and the problem is now clear, and you can see it on the chart below:

The correlation is clearly visible – the more data you have, the more you pay.

The Solution

Once our Innovation Team received the data, we started working day and night on the solution. The goal was to change the trend and break the correlation. That is how after two years, we are proud to share with the community the High Compression feature. You can see the comparison of the storage costs with and without this new feature below:

Option 100 TB AWS EBS 100 TB AWS S3 for backups 100 TB AWS EBS + High compression 100 TB AWS S3 for backups + High Compression
Annual run rate

$120,000

$25,200

$1.2

< $1

As you see it is a 100,000x difference! What is more interesting, the cost of the storage with the High Compression feature enabled always stays flat and the chart now looks like this:

Theory

Not many people know, but data on disks is stored as bits, which are 0s and 1s. They form the binary sequences which are translated into normal data.

After thorough research, we came to the conclusion that we can replace the 1s with 0s easily. The formula is simple:

f(1) = 0

So instead of storing all these 1s, our High Compression feature stores zeroes only:

 

Implementation

The component which does the conversion is called the Nullifier, and every bit of data goes through it. We are first implementing this feature in Percona Operator for Percona XtraDB Cluster and below is the technical view of how it is implemented in Kubernetes:

As you see, all the data written by the user (all Insert or Update statements) goes through the Nullifier first, and only then are stored on the Persistent Volume Claim (PVC). With the High Compression feature enabled, the size of the PVC can be always 1 GB.

Percona is an open source company and we are thrilled to share our code with everyone. You can see the Pull Request for the High Compression feature here. As you see in the PR, our feature provides the Nullifier through the underestimated and very powerful Blackhole engine.

  if [ "$HIGH_COMPRESSION" == 'yes' ]; then
        sed -r "s|^[#]?default_storage_engine=.*$|default_storage_engine=BLACKHOLE|" ${CFG} 1<>${CFG}
        grep -E -q "^[#]?pxc_strict_mode" "$CFG" || sed '/^\[mysqld\]/a pxc_strict_mode=PERMISSIVE\n' ${CFG} 1<>${CFG}
    fi

The High Compression feature will be enabled by default starting from PXC Operator version 1.8.0, but we have added the flag into

cr.yaml

to disable this feature if needed:

spec.pxc.highCompression: true

.

Backups and with the High Compression feature are blazing fast and take seconds with any amount of data. The challenge our Engineering team is working on now is recovery. The Nullifier does the job, but recovering the data is hard. We are confident that De-Nullifier will be released in 1.8.0 as well.

Conclusion

Percona is spearheading innovation in the database technology field. The High Compression feature solves the storage growth problem and as a result, reduces the cloud bill significantly. The release of the Percona Kubernetes Operator for Percona XtraDB Cluster 1.8.0 is planned for mid-April, but this feature is already available in Tech Preview.

As a quick peek at our roadmap, we are glad to share that the Innovation Team has already started working on the High-Density feature, which will drastically reduce the compute footprint required to run MySQL databases.

Mar
22
2021
--

Storing Kubernetes Operator for Percona Server for MongoDB Secrets in Github

storing kubernetes MongoDB secrets github

storing kubernetes MongoDB secrets githubMore and more companies are adopting GitOps as the way of implementing Continuous Deployment. Its elegant approach built upon a well-known tool wins the hearts of engineers. But even if your git repository is private, it’s strongly discouraged to store keys and passwords in unencrypted form.

This blog post will show how easy it is to use GitOps and keep Kubernetes secrets for Percona Kubernetes Operator for Percona Server for MongoDB securely in the repository with Sealed Secrets or Vault Secrets Operator.

Sealed Secrets

Prerequisites:

  • Kubernetes cluster up and running
  • Github repository (optional)

Install Sealed Secrets Controller

Sealed Secrets rely on asymmetric cryptography (which is also used in TLS), where the private key (which in our case is stored in Kubernetes) can decrypt the message encrypted with the public key (which can be stored in public git repository safely). To make this task easier, Sealed Secrets provides the kubeseal tool, which helps with the encryption of the secrets.

Install kubeseal operator into your Kubernetes cluster:

kubectl apply -f https://github.com/bitnami-labs/sealed-secrets/releases/download/v0.15.0/controller.yaml

It will install the controller into the kube-system namespace and provide the Custom Resource Definition

sealedsecrets.bitnami.com

. All resources in Kubernetes with

kind: SealedSecrets

will be handled by this Operator.

Download the kubeseal binary:

wget https://github.com/bitnami-labs/sealed-secrets/releases/download/v0.15.0/kubeseal-linux-amd64 -O kubeseal
sudo install -m 755 kubeseal /usr/local/bin/kubeseal

Encrypt the Keys

In this example, I intend to store important secrets of the Percona Kubernetes Operator for Percona Server for MongoDB in git along with my manifests that are used to deploy the database.

First, I will seal the secret file with system users, which is used by the MongoDB Operator to manage the database. Normally it is stored in deploy/secrets.yaml.

kubeseal --format yaml < secrets.yaml  > blog-data/sealed-secrets/mongod-secrets.yaml

This command creates the file with encrypted contents, you can see it in the blog-data/sealed-secrets repository here. It is safe to store it publicly as it can only be decrypted with a private key.

Executing

kubectl apply -f blog-data/sealed-secrets/mongod-secrets.yaml

does the following:

  1. A sealedsecrets custom resource (CR) is created. You can see it by executing
    kubectl get sealedsecrets

    .

  2. The Sealed Secrets Operator receives the event that a new sealedsecrets CR is there and decrypts it with the private key.
  3. Once decrypted, a regular Secrets object is created which can be used as usual.

$ kubectl get sealedsecrets
NAME               AGE
my-secure-secret   20m

$ kubectl get secrets my-secure-secret
NAME               TYPE     DATA   AGE
my-secure-secret   Opaque   10     20m

Next, I will also seal the keys for my S3 bucket that I plan to use to store backups of my MongoDB database:

kubeseal --format yaml < backup-s3.yaml  > blog-data/sealed-secrets/s3-secrets.yaml
kubectl apply -f blog-data/sealed-secrets/s3-secrets.yaml

Vault Secrets Operator

Sealed Secrets is the simplest approach, but it is possible to achieve the same result with HashiCorp Vault and Vault Secrets Operator. It is a more advanced, mature, and feature-rich approach.

Prerequisites:

Vault Secrets Operator also relies on Custom Resource, but all the keys are stored in HashiCorp Vault:

Preparation

Create a policy on the Vault for the Operator:

cat <<EOF | vault policy write vault-secrets-operator -
path "kvv2/data/*" {
  capabilities = ["read"]
}
EOF

The policy might look a bit differently, depending on where your secrets are.

Create and fetch the token for the policy:

$ vault token create -period=24h -policy=vault-secrets-operator

Key                  Value                                                                                                                                                                                        
---                  -----                                                                                               
token                s.0yJZfCsjFq75GiVyKiZgYVOm
...

Write down the token, as you will need it in the next step.

Create the Kubernetes Secret so that the Operator can authenticate with the Vault:

export VAULT_TOKEN=s.0yJZfCsjFq75GiVyKiZgYVOm
export VAULT_TOKEN_LEASE_DURATION=86400

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: vault-secrets-operator
type: Opaque
data:
  VAULT_TOKEN: $(echo -n "$VAULT_TOKEN" | base64)
  VAULT_TOKEN_LEASE_DURATION: $(echo -n "$VAULT_TOKEN_LEASE_DURATION" | base64)
EOF

Deploy Vault Secrets Operator

It is recommended to deploy the Operator with Helm, but before we need to create the values.yaml file to configure the operator.

environmentVars:
  - name: VAULT_TOKEN
    valueFrom:
      secretKeyRef:
        name: vault-secrets-operator
        key: VAULT_TOKEN
  - name: VAULT_TOKEN_LEASE_DURATION
    valueFrom:
      secretKeyRef:
        name: vault-secrets-operator
        key: VAULT_TOKEN_LEASE_DURATION
vault:
  address: "http://vault.vault.svc:8200"

Environment variables are pointing to the Secret that was created in the previous chapter to authenticate with Vault. We also need to provide the Vault address for the Operator to retrieve the secrets.

Now we can deploy the Vault Secrets Operator:

helm repo add ricoberger https://ricoberger.github.io/helm-charts
helm repo update

helm upgrade --install vault-secrets-operator ricoberger/vault-secrets-operator -f blog-data/sealed-secrets/values.yaml

Give me the Secret

I have a key created in my HashiCorp Vault:

$ vault kv get kvv2/mongod-secret
…
Key                                 Value
---                                 -----                                                                                                                                                                         
MONGODB_BACKUP_PASSWORD             <>
MONGODB_CLUSTER_ADMIN_PASSWORD      <>
MONGODB_CLUSTER_ADMIN_USER          <>
MONGODB_CLUSTER_MONITOR_PASSWORD    <>
MONGODB_CLUSTER_MONITOR_USER        <>                                                                                                                                                               
MONGODB_USER_ADMIN_PASSWORD         <>
MONGODB_USER_ADMIN_USER             <>

It is time to create the secret out of it. First, we will create the Custom Resource object of

kind: VaultSecret

.

$ cat blog-data/sealed-secrets/vs.yaml
apiVersion: ricoberger.de/v1alpha1
kind: VaultSecret
metadata:
  name: my-secure-secret
spec:
  path: kvv2/mongod-secret
  type: Opaque

$ kubectl apply -f blog-data/sealed-secrets/vs.yaml

The Operator will connect to HashiCorp Vault and create regular Secret object automatically:

$ kubectl get vaultsecret
NAME               SUCCEEDED   REASON    MESSAGE              LAST TRANSITION   AGE
my-secure-secret   True        Created   Secret was created   47m               47m

$ kubectl get secret  my-secure-secret
NAME               TYPE     DATA   AGE
my-secure-secret   Opaque   7      47m

Deploy MongoDB Cluster

Now that the secrets are in place, it is time to deploy the Operator and the DB cluster:

kubectl apply -f blog-data/sealed-secrets/bundle.yaml
kubectl apply -f blog-data/sealed-secrets/cr.yaml

The cluster will be up in a minute or two and use secrets we deployed.

By the way, my cr.yaml deploys MongoDB cluster with two shards. Multiple shards support was added in version 1.7.0of the Operator – I encourage you to try it out. Learn more about it here: Percona Server for MongoDB Sharding.

Mar
10
2021
--

A Peek at Percona Kubernetes Operator for Percona Server for MongoDB New Features

Percona Kubernetes Operator for Percona Server for MongoDB New Features

Percona Kubernetes Operator for Percona Server for MongoDB New FeaturesThe latest 1.7.0 release of Percona Kubernetes Operator for Percona Server for MongoDB came out just recently and enables users to:

Today we will look into these new features, the use cases, and highlight some architectural and technical decisions we made when implementing them.

Sharding

The 1.6.0 release of our Operator introduced single shard support, which we highlighted in this blog post and explained why it makes sense. But horizontal scaling is not possible without support for multiple shards.

Adding a Shard

A new shard is just a new ReplicaSet which can be added under spec.replsets in cr.yaml:

spec:
  ...
  replsets:
  - name: rs0
    size: 3
  ....
  - name: rs1
    size: 3
  ...

Read more on how to configure sharding.

In the Kubernetes world, a MongoDB ReplicaSet is a StatefulSet with a number of pods specified in

spec.replsets.[].size

variable.

Once pods are up and running, the Operator does the following:

  • Initiates ReplicaSet by connecting to newly created pods running mongod
  • Connects to mongos and adds a shard with sh.addShard() command

    adding a shard mongodb operator

Then the output of db.adminCommand({ listShards:1 }) will look like this:

        "shards" : [
                {
                        "_id" : "replicaset-1",
                        "host" : "replicaset-1/percona-cluster-replicaset-1-0.percona-cluster-replicaset-1.default.svc.cluster.local:27017,percona-cluster-replicaset-1-1.percona-cluster-replicaset-1.default.svc.cluster.local:27017,percona-cluster-replicaset-1-2.percona-cluster-replicaset-1.default.svc.cluster.local:27017",
                        "state" : 1
                },
                {
                        "_id" : "replicaset-2",
                        "host" : "replicaset-2/percona-cluster-replicaset-2-0.percona-cluster-replicaset-2.default.svc.cluster.local:27017,percona-cluster-replicaset-2-1.percona-cluster-replicaset-2.default.svc.cluster.local:27017,percona-cluster-replicaset-2-2.percona-cluster-replicaset-2.default.svc.cluster.local:27017",
                        "state" : 1
                }
        ],

Have open source expertise to share? Submit your talk for Percona Live ONLINE!

Deleting a Shard

Percona Operators are built to simplify the deployment and management of the databases on Kubernetes. Our goal is to provide resilient infrastructure, but the operator does not manage the data itself. Deleting a shard requires moving the data to another shard before removal, but there are a couple of caveats:

  • Sometimes data is not moved automatically by MongoDB – unsharded collections or jumbo chunks
  • We hit the storage problem – what if another shard does not have enough disk space to hold the data?

shard does not have enough disk space to hold the data

There are a few choices:

  1. Do not touch the data. The user needs to move the data manually and then the operator removes the empty shard.
  2. The operator decides where to move the data and deals with storage issues by upscaling if necessary.
    • Upscaling the storage can be tricky, as it requires certain capabilities from the Container Storage Interface (CNI) and the underlying storage infrastructure.

For now, we decided to pick option #1 and won’t touch the data, but in future releases, we would like to work with the community to introduce fully-automated shard removal.

When the user wants to remove the shard now, we first check if there are any non-system databases present on the ReplicaSet. If there are none, the shard can be removed:

func (r *ReconcilePerconaServerMongoDB) checkIfPossibleToRemove(cr *api.PerconaServerMongoDB, usersSecret *corev1.Secret, rsName string) error {
  systemDBs := map[string]struct{}{
    "local": {},
    "admin": {},
    "config":  {},
  }

delete a shard

Custom Sidecars

The sidecar container pattern allows users to extend the application without changing the main container image. They leverage the fact that all containers in the pod share storage and network resources.

Percona Operators have built-in support for Percona Monitoring and Management to gain monitoring insights for the databases on Kubernetes, but sometimes users may want to expose metrics to other monitoring systems.  Lets see how mongodb_exporter can expose metrics running as a sidecar along with ReplicaSet containers.

1. Create the monitoring user that the exporter will use to connect to MongoDB. Connect to mongod in the container and create the user:

> db.getSiblingDB("admin").createUser({
    user: "mongodb_exporter",
    pwd: "mysupErpassword!123",
    roles: [
      { role: "clusterMonitor", db: "admin" },
      { role: "read", db: "local" }
    ]
  })

2. Create the Kubernetes secret with these login and password. Encode both the username and password with base64:

$ echo -n mongodb_exporter | base64
bW9uZ29kYl9leHBvcnRlcg==
$ echo -n 'mysupErpassword!123' | base64
bXlzdXBFcnBhc3N3b3JkITEyMw==

Put these into the secret and apply:

$ cat mongoexp_secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: mongoexp-secret
data:
  username: bW9uZ29kYl9leHBvcnRlcg==
  password: bXlzdXBFcnBhc3N3b3JkITEyMw==

$ kubectl apply -f mongoexp_secret.yaml

3. Add a sidecar for mongodb_exporter into cr.yaml and apply:

replsets:
- name: rs0
  ...
  sidecars:
  - image: bitnami/mongodb-exporter:latest
    name: mongodb-exporter
    env:
    - name: EXPORTER_USER
      valueFrom:
        secretKeyRef:
          name: mongoexp-secret
          key: username
    - name: EXPORTER_PASS
      valueFrom:
        secretKeyRef:
          name: mongoexp-secret
          key: password
    - name: POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    - name: MONGODB_URI
      value: "mongodb://$(EXPORTER_USER):$(EXPORTER_PASS)@$(POD_IP):27017"
    args: ["--web.listen-address=$(POD_IP):9216"

$ kubectl apply -f deploy/cr.yaml

All it takes now is to configure the monitoring system to fetch the metrics for each mongod Pod. For example, prometheus-operator will start fetching metrics once annotations are added to ReplicaSet pods:

replsets:
- name: rs0
  ...
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '9216'

PVCs Clean Up

Running CICD pipelines that deploy MongoDB clusters on Kubernetes is a common thing. Once these clusters are terminated, the Persistent Volume Claims (PVCs) are not. We have now added automation that removes PVCs after cluster deletion. We rely on Kubernetes Finalizers – asynchronous pre-delete hooks. In our case we hook the finalizer to the Custom Resource (CR) object which is created for the MongoDB cluster.

PVCs Clean Up

A user can enable the finalizer through cr.yaml in the metadata section:

metadata:
  name: my-cluster-name
  finalizers:
     - delete-psmdb-pvc

Conclusion

Percona is committed to providing production-grade database deployments on Kubernetes. Our Percona Kubernetes Operator for Percona Server for MongoDB is a feature-rich tool to deploy and manage your MongoDB clusters with ease. Our Operator is free and open source. Try it out by following the documentation here or help us to make it better by contributing your code and ideas to our Github repository.

Mar
10
2021
--

Aqua Security raises $135M at a $1B valuation for its cloud native security platform

Aqua Security, a Boston- and Tel Aviv-based security startup that focuses squarely on securing cloud-native services, today announced that it has raised a $135 million Series E funding round at a $1 billion valuation. The round was led by ION Crossover Partners. Existing investors M12 Ventures, Lightspeed Venture Partners, Insight Partners, TLV Partners, Greenspring Associates and Acrew Capital also participated. In total, Aqua Security has now raised $265 million since it was founded in 2015.

The company was one of the earliest to focus on securing container deployments. And while many of its competitors were acquired over the years, Aqua remains independent and is now likely on a path to an IPO. When it launched, the industry focus was still very much on Docker and Docker containers. To the detriment of Docker, that quickly shifted to Kubernetes, which is now the de facto standard. But enterprises are also now looking at serverless and other new technologies on top of this new stack.

“Enterprises that five years ago were experimenting with different types of technologies are now facing a completely different technology stack, a completely different ecosystem and a completely new set of security requirements,” Aqua CEO Dror Davidoff told me. And with these new security requirements came a plethora of startups, all focusing on specific parts of the stack.

Image Credits: Aqua Security

What set Aqua apart, Dror argues, is that it managed to 1) become the best solution for container security and 2) realized that to succeed in the long run, it had to become a platform that would secure the entire cloud-native environment. About two years ago, the company made this switch from a product to a platform, as Davidoff describes it.

“There was a spree of acquisitions by CheckPoint and Palo Alto [Networks] and Trend [Micro],” Davidoff said. “They all started to acquire pieces and tried to build a more complete offering. The big advantage for Aqua was that we had everything natively built on one platform. […] Five years later, everyone is talking about cloud-native security. No one says ‘container security’ or ‘serverless security’ anymore. And Aqua is practically the broadest cloud-native security [platform].”

One interesting aspect of Aqua’s strategy is that it continues to bet on open source, too. Trivy, its open-source vulnerability scanner, is the default scanner for GitLab’s Harbor Registry and the CNCF’s Artifact Hub, for example.

“We are probably the best security open-source player there is because not only do we secure from vulnerable open source, we are also very active in the open-source community,” Davidoff said (with maybe a bit of hyperbole). “We provide tools to the community that are open source. To keep evolving, we have a whole open-source team. It’s part of the philosophy here that we want to be part of the community and it really helps us to understand it better and provide the right tools.”

In 2020, Aqua, which mostly focuses on mid-size and larger companies, doubled the number of paying customers and it now has more than half a dozen customers with an ARR of over $1 million each.

Davidoff tells me the company wasn’t actively looking for new funding. Its last funding round came together only a year ago, after all. But the team decided that it wanted to be able to double down on its current strategy and raise sooner than originally planned. ION had been interested in working with Aqua for a while, Davidoff told me, and while the company received other offers, the team decided to go ahead with ION as the lead investor (with all of Aqua’s existing investors also participating in this round).

“We want to grow from a product perspective, we want to grow from a go-to-market [perspective] and expand our geographical coverage — and we also want to be a little more acquisitive. That’s another direction we’re looking at because now we have the platform that allows us to do that. […] I feel we can take the company to great heights. That’s the plan. The market opportunity allows us to dream big.”

 

Feb
25
2021
--

Why F5 spent $2.2B on 3 companies to focus on cloud native applications

It’s essential for older companies to recognize changes in the marketplace or face the brutal reality of being left in the dust. F5 is an old-school company that launched back in the 90s, yet has been able to transform a number of times in its history to avoid major disruption. Over the last two years, the company has continued that process of redefining itself, this time using a trio of acquisitions — NGINX, Shape Security and Volterra — totaling $2.2 billion to push in a new direction.

While F5 has been associated with applications management for some time, it recognized that the way companies developed and managed applications was changing in a big way with the shift to Kubernetes, microservices and containerization. At the same time, applications have been increasingly moving to the edge, closer to the user. The company understood that it needed to up its game in these areas if it was going to keep up with customers.

Taken separately, it would be easy to miss that there was a game plan behind the three acquisitions, but together they show a company with a clear opinion of where they want to go next. We spoke to F5 president and CEO François Locoh-Donou to learn why he bought these companies and to figure out the method in his company’s acquisition spree madness.

Looking back, looking forward

F5, which was founded in 1996, has found itself at a number of crossroads in its long history, times where it needed to reassess its position in the market. A few years ago it found itself at one such juncture. The company had successfully navigated the shift from physical appliance to virtual, and from data center to cloud. But it also saw the shift to cloud native on the horizon and it knew it had to be there to survive and thrive long term.

“We moved from just keeping applications performing to actually keeping them performing and secure. Over the years, we have become an application delivery and security company. And that’s really how F5 grew over the last 15 years,” said Locoh-Donou.

Today the company has over 18,000 customers centered in enterprise verticals like financial services, healthcare, government, technology and telecom. He says that the focus of the company has always been on applications and how to deliver and secure them, but as they looked ahead, they wanted to be able to do that in a modern context, and that’s where the acquisitions came into play.

As F5 saw it, applications were becoming central to their customers’ success and their IT departments were expending too many resources connecting applications to the cloud and keeping them secure. So part of the goal for these three acquisitions was to bring a level of automation to this whole process of managing modern applications.

“Our view is you fast forward five or 10 years, we are going to move to a world where applications will become adaptive, which essentially means that we are going to bring automation to the security and delivery and performance of applications, so that a lot of that stuff gets done in a more native and automated way,” Locoh-Donou said.

As part of this shift, the company saw customers increasingly using microservices architecture in their applications. This means instead of delivering a large monolithic application, developers were delivering them in smaller pieces inside containers, making it easier to manage, deploy and update.

At the same time, it saw companies needing a new way to secure these applications as they shifted from data center to cloud to the edge. And finally, that shift to the edge would require a new way to manage applications.

Feb
24
2021
--

Google Cloud puts its Kubernetes Engine on autopilot

Google Cloud today announced a new operating mode for its Kubernetes Engine (GKE) that turns over the management of much of the day-to-day operations of a container cluster to Google’s own engineers and automated tools. With Autopilot, as the new mode is called, Google manages all of the Day 2 operations of managing these clusters and their nodes, all while implementing best practices for operating and securing them.

This new mode augments the existing GKE experience, which already managed most of the infrastructure of standing up a cluster. This ‘standard’ experience, as Google Cloud now calls it, is still available and allows users to customize their configurations to their heart’s content and manually provision and manage their node infrastructure.

Drew Bradstock, the Group Product Manager for GKE, told me that the idea behind Autopilot was to bring together all of the tools that Google already had for GKE and bring them together with its SRE teams who know how to run these clusters in production — and have long done so inside of the company.

“Autopilot stitches together auto-scaling, auto-upgrades, maintenance, Day 2 operations and — just as importantly — does it in a hardened fashion,” Bradstock noted. “[…] What this has allowed our initial customers to do is very quickly offer a better environment for developers or dev and test, as well as production, because they can go from Day Zero and the end of that five-minute cluster creation time, and actually have Day 2 done as well.”

Image Credits: Google

From a developer’s perspective, nothing really changes here, but this new mode does free up teams to focus on the actual workloads and less on managing Kubernetes clusters. With Autopilot, businesses still get the benefits of Kubernetes, but without all of the routine management and maintenance work that comes with that. And that’s definitely a trend we’ve been seeing as the Kubernetes ecosystem has evolved. Few companies, after all, see their ability to effectively manage Kubernetes as their real competitive differentiator.

All of that comes at a price, of course, in addition to the standard GKE flat fee of $0.10 per hour and cluster (there’s also a free GKE tier that provides $74.40 in billing credits), plus additional fees for resources that your clusters and pods consume. Google offers a 99.95% SLA for the control plane of its Autopilot clusters and a 99.9% SLA for Autopilot pods in multiple zones.

Image Credits: Google

Autopilot for GKE joins a set of container-centric products in the Google Cloud portfolio that also include Anthos for running in multi-cloud environments and Cloud Run, Google’s serverless offering. “[Autopilot] is really [about] bringing the automation aspects in GKE we have for running on Google Cloud, and bringing it all together in an easy-to-use package, so that if you’re newer to Kubernetes, or you’ve got a very large fleet, it drastically reduces the amount of time, operations and even compute you need to use,” Bradstock explained.

And while GKE is a key part of Anthos, that service is more about brining Google’s config management, service mesh and other tools to an enterprise’s own data center. Autopilot of GKE is, at least for now, only available on Google Cloud.

“On the serverless side, Cloud Run is really, really great for an opinionated development experience,” Bradstock added. “So you can get going really fast if you want an app to be able to go from zero to 1000 and back to zero — and not worry about anything at all and have it managed entirely by Google. That’s highly valuable and ideal for a lot of development. Autopilot is more about simplifying the entire platform people work on when they want to leverage the Kubernetes ecosystem, be a lot more in control and have a whole bunch of apps running within one environment.”

 

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com