Nov
10
2022
--

Run MySQL in Kubernetes: Solutions, Pros and Cons

This blog post continues the series of comparisons of solutions to run databases on Kubernetes.

The initial release of MySQL was in 1995, and Kubernetes was released 19 years later, in 2014. According to DB-Engines, MySQL is today the most popular open source relational database, whereas Kubernetes is the de facto standard for running containerized workloads. The popularity of these two technologies among engineers drove companies to create solutions to run them together. In this blog post, we will review various Kubernetes Operators for MySQL and see how they differ and what capabilities they provide for developers and operations teams.

The summary and comparison table can be found in our documentation.

Notable mentions

Before reviewing the operators, I need to mention a couple of other interesting solutions to run MySQL on Kubernetes.

KubeDB

KubeDB is a Swiss Army knife operator that can deploy and manage multiple databases, including MySQL. However, it follows an open-core model, where the most interesting features are not available in the free version, so I cannot easily try them out. Rest assured it is a viable solution, but in this blog post, I want to focus on open source offerings.

Helm or manual deployment

The desire to run MySQL on Kubernetes was there before the Operator SDK appeared. To address this, the community was very creative and developed numerous ways to do it, ranging from regular manual deployments to more sophisticated Helm charts (e.g., the Bitnami Helm chart).

They do the job: you can deploy a MySQL server. With some digging and tuning, it is even possible to have a cluster. But what these solutions have in common is that they lack the ability to perform day-2 operations: backups, scaling, upgrades, etc. For databases, this can be crucial, because data consistency and safety are at stake. Applying methods that worked in legacy environments might not be safe on Kubernetes.

This is where Operators come into play.

Bitpoke MySQL Operator

Bitpoke is a company that provides tools for WordPress self-hosting on Kubernetes, including MySQL and WordPress operators. Their team developed one of the first MySQL operators and shared it with the community. Developed initially within Presslabs, since 2021 the team and operator have moved to Bitpoke. The operator is used in production by numerous companies. 

It is Apache 2.0 licensed. Interestingly enough, it uses Percona Server for MySQL under the hood “because of backup improvements (eg. backup locks), monitoring improvements, and some serviceability improvements (eg. utility user)”.

Deployment

I followed the documentation:

helm repo add bitpoke https://helm-charts.bitpoke.io
helm install mysql-operator bitpoke/mysql-operator

The operator is up and running. Deploying the cluster:

kubectl apply -f https://raw.githubusercontent.com/bitpoke/mysql-operator/master/examples/example-cluster-secret.yaml
kubectl apply -f https://raw.githubusercontent.com/bitpoke/mysql-operator/master/examples/example-cluster.yaml

This deploys a MySQL 5.7 cluster with asynchronous replication: one main node and one replica node.

Features

Bitpoke operator allows you to back up, restore, scale, and upgrade MySQL on Kubernetes. So regular day-2 operations are available. 

Let’s start with the pros:

  • It is the oldest open source Operator for MySQL. It was battle-tested by Bitpoke and widely adopted by the community and other companies. This gives assurance that there are no critical issues. But do read ahead to the Cons section.
  • Replication lag detection and mitigation: the operator takes lagging nodes out of rotation when the lag is above a set threshold. This is important with asynchronous replication, so that the application is served up-to-date data.
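
To make the lag-threshold idea from the list above concrete, here is a minimal shell sketch. It is not the operator's code: the function name replica_in_rotation and the 30-second threshold are assumptions for illustration only. The lag value is whatever your monitoring reports, for example Seconds_Behind_Master from SHOW SLAVE STATUS in MySQL 5.7.

```shell
#!/usr/bin/env bash
# Illustrative only: decide whether a replica should keep serving reads.
# $1: replication lag in seconds, or NULL/empty when it is unknown
# $2: maximum tolerated lag in seconds (hypothetical threshold)
replica_in_rotation() {
  local lag="$1" max_lag="$2"
  # An empty or non-numeric lag (e.g. NULL) means the replication state
  # is unknown, so the replica is treated as unhealthy.
  case "$lag" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$lag" -le "$max_lag" ]
}

# Example: evaluate three replicas against a 30-second threshold.
for lag in 3 45 NULL; do
  if replica_in_rotation "$lag" 30; then
    echo "lag=$lag: keep in rotation"
  else
    echo "lag=$lag: take out of rotation"
  fi
done
```

Treating an unknown lag as unhealthy is a deliberate choice in this sketch: a replica whose replication threads have stopped reports no lag at all, yet it may serve the most stale data.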

As for cons, it seems that the operator is not actively developed, with only 15 commits in the last year.

Oracle MySQL Operator

This is not the first time Oracle has created an Operator for MySQL, but the difference now is that this Operator made it to the General Availability stage. The operator is distributed under the unusual “Universal Permissive License (UPL)”, which is truly permissive and close to the MIT license.

Deployment

Standard deployment for the operator with helm, no surprises:

helm repo add mysql-operator https://mysql.github.io/mysql-operator/
helm repo update
helm install mysql-operator mysql-operator/mysql-operator --namespace mysql-operator --create-namespace

Now the cluster:

export NAMESPACE=default
helm install my-mysql-innodbcluster mysql-operator/mysql-innodbcluster -n $NAMESPACE \
        --version 2.0.7 \
        --set credentials.root.password=">-0URS4F3P4SS" \
        --set tls.useSelfSigned=true

This deploys a MySQL cluster with three nodes with Group Replication and a single MySQL router Pod in front of it for query routing. 

Features

Even though the operator was promoted to GA, some basic capabilities, such as upgrades, monitoring, and topology flexibility, are either missing or must be implemented by the user.

  • The operator uses MySQL Shell underneath, and its codebase is in Python (rather than the usual Go). Backups and restores also rely on MySQL Shell and use the dumpInstance() method. This is a logical backup, which might be problematic for big databases.
  • The operator supports only MySQL 8.0 and only Group Replication (marketed as InnoDB Cluster), which I think is a logical move.
  • You can use MySQL Community container images, but there is a certain trend from Oracle to push users towards Enterprise. At least it is visible in the last two release notes, where the new additions are all around features available in the Enterprise edition only.

Moco

Similar to the Bitpoke operator, Moco was created by Cybozu for its internal needs and later open-sourced. It is Apache 2.0 licensed, written in Go, and has a good release cadence.

Deployment

As usual, let’s try the quick start guide. Note that cert-manager is required (my curiosity was piqued from the start!).

Install cert-manager and deploy the operator:

curl -fsLO https://github.com/jetstack/cert-manager/releases/latest/download/cert-manager.yaml
kubectl apply -f cert-manager.yaml

helm repo add moco https://cybozu-go.github.io/moco/
helm repo update
helm install --create-namespace --namespace moco-system moco moco/moco

Create the cluster from an example folder:

kubectl apply -f https://raw.githubusercontent.com/cybozu-go/moco/main/examples/loadbalancer.yaml

This deploys a cluster with three nodes and semi-sync replication exposed with a load balancer.

Features

Moco is quite feature-rich and enables users to execute various management tasks. Refreshing solutions and ideas:

  • Documentation. For an in-house grown product, the operator has pretty good and detailed documentation. From design and architecture decisions to how-tos and common use cases.
  • Similar to Oracle’s operator, this one relies on MySQL Shell for backups and restores, but it also supports incremental backups. Binary logs are copied to an S3 bucket only at the time of backup, so this is a one-time dump rather than a continuous binlog copy that could deliver point-in-time recovery.
  • The operator has its own kubectl plugin to smooth onboarding and usage. I’m curious, though, whether such plugins are widely used in production. Please share your thoughts in a comment.

There are some concerns that I have regarding this operator:

  • It works with semi-sync replication, which is not a good fit for workloads that have strong data-consistency requirements. See Face to Face with Semi-Synchronous Replication for more details.
  • It does not have official support, so for businesses, it is a “use at your own risk” type of situation.

Vitess

Vitess is a database clustering system for horizontal scaling of MySQL. It was initially developed at YouTube, where it was widely used. Now it is a CNCF project and is actively developed by PlanetScale and the community. It is open source, but there are some features in Vitess itself that are only available for PlanetScale customers. Interesting fact: Vitess Operator serves as a core component of the PlanetScale DBaaS. So it is a production-grade and battle-tested operator.

Deployment

Going with the quickstart:

git clone https://github.com/vitessio/vitess
cd vitess/examples/operator

kubectl apply -f operator.yaml

Operator is ready. Let’s deploy an intro cluster:

kubectl apply -f 101_initial_cluster.yaml

This deploys the following:

  • Three etcd Pods
  • Various Vitess Pods:
    • Two vttablet
    • vtadmin
    • vtctld
    • vtgate
    • vtorc 

You can read more about Vitess concepts and pieces in architecture documentation.

Features

Sharding is one of the biggest pros of this operator. The only competitor I can think of is TiDB (which should be MySQL protocol compatible, but is not MySQL). There are no other solutions for MySQL sharding in the open source space. But at the same time, it all comes at a price: complexity, which Kubernetes certainly helps to mask. Getting familiar with all the vt-* components can be overwhelming, especially for users who have never used Vitess before.

The operator provides users with all the regular management operations. The only downside is that these operations are not well documented, and you have to discover them through various blog posts, reference docs, and other artifacts. For example, this blog post covers some basics for backups and restores, whereas this document covers basic Vitess operations.

Percona

At Percona we have two operators for MySQL:

  1. Based on Percona XtraDB Cluster (PXC)
  2. Based on Percona Server for MySQL (PS)

In Percona Operator for MySQL – Alpha Release, we explain why we decided to create the new operator. Both operators are fully open source, as are the components they are based on. The one based on PXC is production-ready, whereas the one based on PS is getting there.

Deployment

For Percona Kubernetes Operators we maintain helm charts for ease of onboarding. Deployment is a two-step process.

Deploy the operator:

helm repo add percona https://percona.github.io/percona-helm-charts/
helm install my-op percona/pxc-operator

And the database:

helm install my-db percona/pxc-db

Features

For features, I will focus on PXC as it is production-ready, and we are aiming for the PS Operator to reach parity in the near future.

  • Proxy integration. Along with the database, the operator also deploys proxies. Users can choose from HAProxy and ProxySQL. The choice depends on the use case.
  • Operator allows users to have multi-cluster or multi-regional MySQL deployments. This is useful for migrations or disaster recovery.
  • No other operator for MySQL provides true point-in-time recovery. We developed a solution that continuously uploads binary logs to object storage. Read more about architecture decisions in Point-In-Time Recovery in Percona Operator for MySQL Based on Percona XtraDB Cluster – Architecture Decisions.
  • Operator provides automated and safe upgrades for minor versions of MySQL and its components (like proxies). It is extremely useful to quickly fix critical vulnerabilities and roll out bug fixes with zero downtime.
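
The general idea behind the point-in-time recovery mentioned in the list above can be sketched in shell: restore the newest full backup taken at or before the target time, then replay binary logs up to that moment. This is only an illustration of the technique, not the operator's implementation. The function names pick_base_backup and replay_binlogs and the "epoch name" input format are hypothetical; mysqlbinlog and its --stop-datetime option are standard MySQL tooling.

```shell
#!/usr/bin/env bash
# Step 1: choose the base backup.
# stdin: one full backup per line, "<unix_epoch> <backup_name>"
# $1:    target recovery time as a unix epoch
# Prints the newest backup taken at or before the target time.
pick_base_backup() {
  local target="$1"
  awk -v t="$target" '
    $1 <= t && $1 >= best { best = $1; name = $2 }
    END { if (name != "") print name }
  '
}

# Step 2 (defined but not executed here, needs a running server):
# replay binary logs written after the base backup, stopping at the target.
# $1: stop time as "YYYY-MM-DD HH:MM:SS"; remaining args: binlog files in order
replay_binlogs() {
  local stop="$1"
  shift
  mysqlbinlog --stop-datetime="$stop" "$@" | mysql
}

# Example: three backups, target time between the second and the third.
printf '100 full-a\n200 full-b\n300 full-c\n' | pick_base_backup 250   # prints full-b
```

The sketch also shows why a continuous copy of binary logs matters: step 2 can only reach times for which the binlogs have actually been saved.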

Percona is committed to providing open source products to the community, but we also provide exceptional services for our customers: managed and professional services and support. We have an ecosystem of products — Percona Platform — that brings together our software and services offerings.

Sep
07
2017
--

Always Verify Examples When Comparing DB Products (PostgreSQL and MySQL)

In this blog post, I’ll look at a comparison of PostgreSQL and MySQL.

I came across a post from Hans-Juergen Schoenig, a Postgres consultant at Cybertec. In it, he dismissed MySQL and showed Postgres as better. While his post ignores most of the reasons why MySQL is better, I will focus on where his post is less than accurate. Testing for MySQL was done with Percona Server 5.7, defaults.

Mr. Schoenig complains that MySQL changes data types automatically. He claims inserting 1234.5678 into a numeric(4, 2) column on Postgres produces an error, and that MySQL just rounds the number to fit. In my testing I found this to be a false claim:

mysql> CREATE TABLE data (
    -> id    integer NOT NULL,
    -> data  numeric(4, 2));
Query OK, 0 rows affected (0.07 sec)
mysql> INSERT INTO data VALUES (1, 1234.5678);
ERROR 1264 (22003): Out of range value for column 'data' at row 1
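
The error above follows from simple arithmetic: a numeric(4, 2) column holds four significant digits in total, two of them after the decimal point, so the largest value it can store is 99.99, and 1234.5678 cannot fit however it is rounded. A small shell sketch of that calculation (the helper name numeric_max is made up for this illustration):

```shell
#!/usr/bin/env bash
# Largest value a NUMERIC(precision, scale) column can hold:
# (precision - scale) nines before the decimal point, scale nines after it.
numeric_max() {
  local precision="$1" scale="$2" int_digits frac_digits
  # printf pads an empty string to the requested width; tr turns the pad into 9s.
  int_digits=$(printf '%*s' "$((precision - scale))" '' | tr ' ' '9')
  frac_digits=$(printf '%*s' "$scale" '' | tr ' ' '9')
  if [ -n "$frac_digits" ]; then
    echo "${int_digits:-0}.${frac_digits}"
  else
    echo "${int_digits:-0}"
  fi
}

numeric_max 4 2   # prints 99.99
```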

His next claim is that MySQL allows updating a key column to NULL and silently changes it to 0. This is also false:

mysql> INSERT INTO data VALUES (1, 12);
Query OK, 1 row affected (0.00 sec)
mysql> UPDATE data SET id = NULL WHERE id = 1;
ERROR 1048 (23000): Column 'id' cannot be null

In the original post, we never see the warnings, so we don’t have the full details of his environment. Since he didn’t specify which version he was testing on, I will point out that MySQL 5.7 does a far better job of handling your data out of the box than 5.6 does, and SQL Mode has existed in MySQL for ages. Any user could set it to STRICT_ALL_TABLES or STRICT_TRANS_TABLES and get the behavior that is now the default in 5.7.
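
If you want to check whether a given server runs in strict mode before repeating tests like these, a short shell sketch along the following lines may help. The helper name is_strict_mode is an assumption for illustration; the two sample values are the default sql_mode settings of MySQL 5.7 and 5.6, and the commented commands assume a mysql client that can connect to your server.

```shell
#!/usr/bin/env bash
# Succeeds if a comma-separated sql_mode value enables strict mode.
is_strict_mode() {
  case ",$1," in
    *,STRICT_TRANS_TABLES,*|*,STRICT_ALL_TABLES,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Default sql_mode values of the two versions discussed above.
mode_57='ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'
mode_56='NO_ENGINE_SUBSTITUTION'

is_strict_mode "$mode_57" && echo "5.7 default: strict"
is_strict_mode "$mode_56" || echo "5.6 default: not strict"

# Against a live server (not executed here):
#   mode=$(mysql -N -B -e 'SELECT @@GLOBAL.sql_mode')
#   is_strict_mode "$mode" || echo "strict mode is off"
# To turn it on for new connections (this replaces any modes already set):
#   mysql -e "SET GLOBAL sql_mode = 'STRICT_TRANS_TABLES'"
```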

The author is also focusing on a narrow issue, using it to say Postgres is better. I feel this is misleading. I could point out factors in MySQL that are better than in Postgres as well.

This is another case of “don’t necessarily take our word for it”. A simple test of what you see on a blog can help you understand how things work in your environment and why.
