May
16
2024
--

Valkey/Redis: Not-So-Good Practices

As we’ve been introducing Valkey/Redis these past few weeks, let’s depart from the norm and talk about a few “things you should not do” in Valkey.

No password

By default, Valkey/Redis uses no authentication. This means anyone can connect to your Valkey server and start writing and reading data. Worse yet, no password while binding to all […]
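A minimal mitigation sketch (requirepass and bind are standard Valkey/Redis configuration directives; the password value is a placeholder):

# valkey.conf (works the same in redis.conf)
bind 127.0.0.1
requirepass use-a-long-random-secret-here

With these two lines, the server only listens locally and clients must authenticate before reading or writing.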

Nov
02
2023
--

Top 3 Questions From Percona k8s Squad Ask-Me-Anything Session

On October 24th, the Percona Kubernetes Squad held its first Ask-Me-Anything (AMA) session to address questions about using Kubernetes for database deployment. This blog post outlines the top three questions raised during the session and provides thorough answers from our team of experts.

Q1: When is it appropriate to use Kubernetes for databases, and when is it not recommended?

There are certain scenarios where Kubernetes is well-suited for database deployment and others where it is not the optimal choice.

The good

Ideal use cases for Kubernetes and databases include small databases that require repeatable tasks, such as continuous integration and deployment (CI/CD) or developer-driven tasks.

Another case is when you are struggling to automate your day-to-day routine and your custom scripts have grown to the point where a separate team maintains them. Operators and Kubernetes come in handy here, as they take all this complexity away and remove toil.

The bad

On the other hand, Kubernetes is not designed to handle large databases or mainframes. Although it might still function in such cases, using Kubernetes for these scenarios is not recommended.

Additionally, a manual approach that resists embracing the core principles of Kubernetes and instead insists on applying traditional methods for database management is not suitable.

The ugly

A lack of Kubernetes expertise combined with the desire to use it anyway may play a dirty trick on you. Even though Operators hide the complexity of managing Kubernetes primitives and databases, minimal knowledge of the Cloud Native ecosystem is still required to run them successfully.

Joe Brockmeier gave a talk on exactly this topic earlier this year at Open Source Summit North America in Vancouver: “Are Containers Ready for Production Databases?”.

Q2: How does Kubernetes interact with Jenkins automation?

Kubernetes seamlessly integrates with Jenkins, as well as other automation tools that can interact with it. Since Kubernetes operates in a declarative manner and maintains a state, the choice of tools often aligns with its principles. For instance, popular options include Terraform and GitOps tools like ArgoCD, which adhere to the declarative approach.
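For illustration, this is roughly what the declarative, GitOps-style approach looks like with ArgoCD (a sketch; the repository URL and path are hypothetical, while the Application fields are standard ArgoCD):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: pxc-cluster
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/db-manifests.git  # hypothetical repository
    targetRevision: main
    path: pxc
  destination:
    server: https://kubernetes.default.svc
    namespace: pxc
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

ArgoCD continuously reconciles the cluster against the manifests stored in Git, the same state-driven model Kubernetes itself follows.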

At Percona, our team has successfully utilized Jenkins alongside Kubernetes, ensuring smooth automated testing of our Operators and other products. 

Q3: What performance issues, if any, does Kubernetes introduce?

Kubernetes performance is heavily influenced by the underlying hardware. Running a database on a Kubernetes cluster should deliver similar performance, with less than a 1% difference when compared to running it on standalone hardware. 

However, Kubernetes does introduce additional layers, particularly in storage and networking.

To achieve optimal performance, it is recommended to keep the networking layer as simple as possible, avoiding tunneling at all costs.

For storage, Kubernetes provides various tools that leverage local storage volumes, such as NVMe, to achieve performance that is comparable to bare-metal deployments. Tools like OpenEBS (local path or Mayastor) allow you to do that.
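As an example, a local-storage class built on OpenEBS looks roughly like this (a sketch based on the stock openebs-hostpath class; the BasePath is an assumption and would point at your NVMe mount):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-local-nvme
  annotations:
    openebs.io/cas-type: local
    cas.openebs.io/config: |
      - name: StorageType
        value: hostpath
      - name: BasePath
        value: /mnt/nvme/openebs
provisioner: openebs.io/local
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete

The WaitForFirstConsumer binding mode delays volume creation until the pod is scheduled, so the data lands on that node’s local NVMe device.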

Join the Percona Kubernetes Squad

Join the Percona Kubernetes Squad – a group of database professionals at the forefront of innovating database operations on Kubernetes within their organizations and beyond. The Squad is dedicated to providing its members with unwavering support as we all navigate the cloud-native landscape.

Jan
20
2023
--

Help! I Am Out of Disk Space!

How can we fix a nasty out-of-space issue by leveraging the flexibility of Percona Operator for MySQL?

When planning a database deployment, one of the most challenging factors to consider is the amount of space we need to dedicate to data on disk.

This is even more cumbersome when working on bare metal, as adding space there is more difficult than it is in the cloud.

When using cloud storage like EBS or similar, it is normally easier to extend volumes, which gives us the luxury of planning the space to allocate for data with a good degree of relaxation.

Is this also true when using a solution based on Kubernetes like Percona Operator for MySQL? Well, it depends on where you run it. However, if the platform you choose supports the option to extend volumes, Kubernetes itself gives you the ability to do so as well.

Nonetheless, if it can go wrong, it will, and ending up with a completely full device under MySQL is not a fun experience.

As you know, in normal deployments, when MySQL has no space left on the device, it simply stops working, causing a production-down event, which is of course unfortunate and something we want to avoid at any cost.

This blog is the story of what happened, what was supposed to happen, and why. 

The story

The case was on AWS using EKS.

Given all the above, I was quite surprised when we had a case in which a deployed solution based on Percona Operator for MySQL went out of space. So we started to dig in and review what was going on and why.

The first thing we did was quickly investigate what was really taking up space. That could have been an easy win if most of the space had been taken by some log but, unfortunately, this was not the case: data was really taking all the available space.

The next step was to check what storage class (SC) was used for the PersistentVolumeClaim (PVC):

k get pvc
NAME                         VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS
datadir-mt-cluster-1-pxc-0   pvc-<snip>   233Gi      RWO            io1
datadir-mt-cluster-1-pxc-1   pvc-<snip>   233Gi      RWO            io1
datadir-mt-cluster-1-pxc-2   pvc-<snip>   233Gi      RWO            io1

OK, we use the io1 SC; it is now time to check whether the SC supports volume expansion:

kubectl describe sc io1
Name:            io1
IsDefaultClass:  No
Annotations:     kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"},"name":"io1"},"parameters":{"fsType":"ext4","iopsPerGB":"12","type":"io1"},"provisioner":"kubernetes.io/aws-ebs"}
,storageclass.kubernetes.io/is-default-class=false
Provisioner:           kubernetes.io/aws-ebs
Parameters:            fsType=ext4,iopsPerGB=12,type=io1
AllowVolumeExpansion:  <unset> <------------
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     Immediate
Events:                <none>

And no, it is not enabled. In this case, we cannot just go and expand the volume; we must change the storage class settings first. To enable volume expansion, you need to delete the storage class and recreate it with the option enabled.
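The recreate step looks roughly like this (a sketch; io1-sc.yaml is a hypothetical file name):

kubectl get sc io1 -o yaml > io1-sc.yaml
# edit io1-sc.yaml: add "allowVolumeExpansion: true" as a top-level field
kubectl delete sc io1
kubectl apply -f io1-sc.yaml

Deleting a storage class does not touch existing volumes; it only affects how new PVCs are provisioned.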

Unfortunately, we were unsuccessful in that operation because the storage class kept showing ALLOWVOLUMEEXPANSION as unset.

As said, this was a production-down event, so we could not invest too much time in digging into why the mode was not changing correctly; we had to act quickly.

The only option we had to fix it was:

  • Expand the io1 volumes from the AWS console (or the AWS CLI)
  • Resize the file system
  • Patch the relevant K8s objects so Kubernetes correctly sees the new volume size

Expanding EBS volumes from the console is trivial: go to Volumes, select the volume you want to modify, choose Modify, change the size to the desired one, and done.
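The same change can be made with the AWS CLI (a sketch, reusing the volume ID that appears in the resize2fs output later in this post):

aws ec2 modify-volume --volume-id vol-0ab0db8ecf0293b2f --size 350
# the modification proceeds in the background; track its progress with:
aws ec2 describe-volumes-modifications --volume-ids vol-0ab0db8ecf0293b2f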

Once that is done, connect to the Node hosting the pod which has the volume mounted like this:

k get pods -o wide | grep pxc-0
NAME                                        READY     STATUS    RESTARTS   AGE    IP            NODE             
cluster-1-pxc-0                               2/2     Running   1          11d    10.1.76.189     <mynode>.eu-central-1.compute.internal

Then we need to get the id of the PVC to identify it on the node:

k get pvc
NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS
datadir-cluster-1-pxc-0   Bound    pvc-1678c7ee-3e50-4329-a5d8-25cd188dc0df   233Gi      RWO            io1

One note: when doing this kind of recovery with a Percona XtraDB Cluster-based solution, always recover node-0 first, then the others.

So we connect to <mynode> and identify the volume: 

lsblk | grep pvc-1678c7ee-3e50-4329-a5d8-25cd188dc0df
nvme1n1      259:4    0  350G  0 disk /var/lib/kubelet/pods/9724a0f6-fb79-4e6b-be8d-b797062bf716/volumes/kubernetes.io~aws-ebs/pvc-1678c7ee-3e50-4329-a5d8-25cd188dc0df <-----

At this point we can resize it:

root@ip-<snip>:/# resize2fs  /dev/nvme1n1
resize2fs 1.45.5 (07-Jan-2020)
Filesystem at /dev/nvme1n1 is mounted on /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/eu-central-1a/vol-0ab0db8ecf0293b2f; on-line resizing required
old_desc_blocks = 30, new_desc_blocks = 44
The filesystem on /dev/nvme1n1 is now 91750400 (4k) blocks long.

The good thing is that as soon as you do that, the MySQL daemon sees the space and will restart. However, this happens only on the current pod, and K8s will still see the old size:

k get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                            STORAGECLASS   REASON
pvc-1678c7ee-3e50-4329-a5d8-25cd188dc0df   333Gi      RWO            Delete           Bound    pxc/datadir-cluster-1-pxc-0   io1
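To double-check that MySQL itself already sees the new space, you can inspect the filesystem from inside the pod (a sketch; it assumes the database container is named pxc, as in the Operator defaults):

k exec cluster-1-pxc-0 -c pxc -- df -h /var/lib/mysql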

To make K8s aware of the real size, we must patch the stored information, and the command is the following:

kubectl patch pvc <pvc-name> -n <pvc-namespace> -p '{ "spec": { "resources": { "requests": { "storage": "NEW STORAGE VALUE" }}}}'

i.e.:

kubectl patch pvc datadir-cluster-1-pxc-0 -n pxc -p '{ "spec": { "resources": { "requests": { "storage": "350G" }}}}'

Remember to use as the PVC name the NAME column coming from:

kubectl get pvc

Once this is done, K8s will see the new volume size correctly.

Just repeat the process for node-1 and node-2 and… done, the cluster is up again.

Finally, do not forget to modify your Custom Resource file (cr.yaml) to match the new volume size, e.g.:

volumeSpec:
  persistentVolumeClaim:
    storageClassName: "io1"
    resources:
      requests:
        storage: 350G

The whole process took just a few minutes. It was then time to investigate why the incident happened and why the storage class was not allowing expansion in the first place.

Why it happened

Well, first and foremost, the platform was not correctly monitored. As such, there was a lack of visibility into space utilization and no alert about disk space.

This was easy to solve by enabling the Percona Monitoring and Management (PMM) feature in the cluster Custom Resource and setting alerts in PMM once the nodes join it (see https://docs.percona.com/percona-monitoring-and-management/get-started/alerting.html for details on how to do it).
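Enabling PMM in the Custom Resource is a small change (a sketch; the image tag is only an example, and serverHost must point at your PMM server service):

spec:
  pmm:
    enabled: true
    image: percona/pmm-client:2.39.0
    serverHost: monitoring-service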

The second issue was the problem with the storage class. Once we had the time to carefully review the configuration files, we identified an extra indentation in the SC definition, which was causing K8s to ignore the directive.

It was supposed to be:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: io1
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: "12"
  fsType: ext4 
allowVolumeExpansion: true <----------

It was:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: io1
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: "12"
  fsType: ext4 
  allowVolumeExpansion: true. <---------

What was concerning was the lack of any error returned by the Kubernetes API: the configuration was accepted but not really validated.

In any case, once we had fixed the typo and recreated the SC, the setting for volume expansion was correctly accepted:

kubectl describe sc io1
Name:            io1
IsDefaultClass:  No
Annotations:     kubectl.kubernetes.io/last-applied-configuration={"allowVolumeExpansion":true,"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"},"name":"io1"},"parameters":{"fsType":"ext4","iopsPerGB":"12","type":"io1"},"provisioner":"kubernetes.io/aws-ebs"}
,storageclass.kubernetes.io/is-default-class=false
Provisioner:           kubernetes.io/aws-ebs
Parameters:            fsType=ext4,iopsPerGB=12,type=io1
AllowVolumeExpansion:  True    
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     Immediate
Events:                <none>

What should have happened instead?

If proper monitoring and alerting had been in place, the administrators would have had time to act and extend the volumes without downtime.

However, the procedure for extending volumes on K8s, while not complex, is also not as straightforward as you may think. My colleague Natalia Marukovich wrote a blog post, Percona Operator Volume Expansion Without Downtime, that gives you step-by-step instructions on how to extend volumes without downtime.

Conclusion

Using the cloud, containers, automation, or more complex orchestrators like Kubernetes does not solve everything, does not prevent mistakes from happening, and, more importantly, does not make the right decisions for you.

You must set up a proper architecture that includes backup, monitoring, and alerting. You must set the right alerts and act on them in time.

Finally, automation is cool; however, the devil is in the details, and typos are his day-to-day joy. Be careful and check what you put online; do not rush it. Validate, validate, validate…

Great stateful MySQL to all.

Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.

Download Percona Monitoring and Management Today

Sep
09
2015
--

Percona Server audit log plugin best practices

Auditing your database means tracking access and changes to your data and db objects. The Audit Log Plugin has been shipped with Percona Server since 5.5.37/5.6.17, for a little over 12 months. Prior to the Audit Log Plugin, you had to work in darker ways to achieve some incarnation of an audit trail.

We have seen attempts at creating audit trails using approaches such as ‘sniffing the wire’, init files, in-schema ‘on update’ fields, triggers, proxies, and parsing the traditional logs of MySQL (slow, general, binary, error). All of these attempts miss a piece of the pie: if you’re sniffing TCP traffic you’ll miss local connections, and if you’re parsing binary logs you’re missing any reads. Your reasons for audit logging might come down to compliance requirements (HIPAA, PCI DSS), or you may need a way to examine database activity or track incoming connections.

Over the past months I’ve met many support requests with the answer ‘install an audit plugin’. These requests have been varied, ranging from finding out whether a user is still active and what the impact of decommissioning it would be, to the frequency of specific queries, to checking whether a slave is being written to, to name but a few.

So then, let’s look at installation. In general, we want to install the Audit Log Plugin on an existing instance. As discussed in previous Percona blog posts, installation of the plugin is trivial, but let’s recap. Let’s perform a couple of basic checks before we run the install command from the client. First, query MySQL for the location of the plugins directory;

mysql> show global variables like 'plugin_dir';
+---------------+--------------------------+
| Variable_name | Value                    |
+---------------+--------------------------+
| plugin_dir    | /usr/lib64/mysql/plugin/ |
+---------------+--------------------------+
1 row in set (0.00 sec)

Once that’s known we’ll check that the audit log plugin shared library is present;

[moore@randy ~]$ ls -l /usr/lib64/mysql/plugin/audit*
-rwxr-xr-x. 1 root root 42976 Jul  1 09:24 /usr/lib64/mysql/plugin/audit_log.so

Great, we are in good shape to move to the client and install;

mysql> install plugin audit_log soname 'audit_log.so';
Query OK, 0 rows affected (0.00 sec)
mysql> select * from mysql.plugin;
+-------------------------------+--------------+
| name                          | dl           |
+-------------------------------+--------------+
| audit_log                     | audit_log.so |
...
+-------------------------------+--------------+
8 rows in set (0.00 sec)

Voila! It’s that simple. So, what does that provide us? Well, now, thanks to the default variables, we’ve got the following options set;

mysql> show global variables like 'audit%';
+---------------------------+---------------+
| Variable_name             | Value         |
+---------------------------+---------------+
| audit_log_buffer_size     | 1048576       |
| audit_log_file            | audit.log     |
| audit_log_flush           | OFF           |
| audit_log_format          | OLD           |
| audit_log_handler         | FILE          |
| audit_log_policy          | ALL           |
| audit_log_rotate_on_size  | 0             |
| audit_log_rotations       | 0             |
| audit_log_strategy        | ASYNCHRONOUS  |
| audit_log_syslog_facility | LOG_USER      |
| audit_log_syslog_ident    | percona-audit |
| audit_log_syslog_priority | LOG_INFO      |
+---------------------------+---------------+
12 rows in set (0.00 sec)

What we can tell from that output is that our audit plugin is enabled, it’s logging to the default location ({datadir}/audit.log), and we’re grabbing all events (ALL) on the server and writing the output in XML format (OLD). From the list of variables above, we’ve only got one dynamic variable. This means that to change the logfile location or the format, we need to put these options into our my.cnf and restart the instance. Not very convenient. Personally, it’s my preference to store the audit.log file away from my datadir.
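For illustration, here is what the static/dynamic split looks like in practice (a sketch: audit_log_format is read-only, while audit_log_flush can be set at runtime to make the plugin reopen its log file, e.g. after external rotation):

mysql> set global audit_log_format='JSON';
ERROR 1238 (HY000): Variable 'audit_log_format' is a read only variable
mysql> set global audit_log_flush=ON;
Query OK, 0 rows affected (0.00 sec)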

I also dislike the XML formats in favour of the JSON log format. It is also advisable, especially on busier systems, to enable the rotation options, audit_log_rotate_on_size and audit_log_rotations, so that you don’t end up filling your disk with a huge audit log. Restarting your production instances isn’t terribly convenient, but you’ll be happy to learn there is another way.

Let’s rewind to before we installed the plugin. We had checked the existence of our plugin shared library and were itching to run the install command. Now we can open our my.cnf file and add our preferred options prior to installation. Whilst it’s far from a secret, not many will know that during the plugin installation phase, MySQL re-reads the my.cnf file to check for configuration relevant to the plugin. So let’s add some variables here;

## Audit Logging ##
audit_log_policy=ALL
audit_log_format=JSON
audit_log_file=/var/log/mysql/audit.log
audit_log_rotate_on_size=1024M
audit_log_rotations=10

A quick review of the above: I intend to log all events in JSON format to the /var/log/mysql location. The log will rotate each time the active file hits 1G, circulating 10 files, meaning I will never have more than 10G of audit logs on my filesystem.

Now with our predefined configuration in the my.cnf we can install the plugin from cold and begin with our preferred options;

mysql> show global variables like 'audit%';
Empty set (0.00 sec)
mysql> install plugin audit_log soname 'audit_log.so';
Query OK, 0 rows affected (0.00 sec)
mysql> show global variables like 'audit%';
+---------------------------+--------------------------+
| Variable_name             | Value                    |
+---------------------------+--------------------------+
| audit_log_buffer_size     | 1048576                  |
| audit_log_file            | /var/log/mysql/audit.log |
| audit_log_flush           | OFF                      |
| audit_log_format          | JSON                     |
| audit_log_handler         | FILE                     |
| audit_log_policy          | ALL                      |
| audit_log_rotate_on_size  | 1073741824               |
| audit_log_rotations       | 10                       |
| audit_log_strategy        | ASYNCHRONOUS             |
| audit_log_syslog_facility | LOG_USER                 |
| audit_log_syslog_ident    | percona-audit            |
| audit_log_syslog_priority | LOG_INFO                 |
+---------------------------+--------------------------+
12 rows in set (0.00 sec)

Something to remember: if you add these variables before installing the plugin and you then restart your instance or suffer a crash, your instance will not start.

[moore@randy ~]$ sudo systemctl restart mysql
[moore@randy ~]$ sudo egrep 'ERROR' /var/log/mysqld.log
2015-09-02 11:55:16 8794 [ERROR] /usr/sbin/mysqld: unknown variable 'audit_log_policy=ALL'
2015-09-02 11:55:16 8794 [ERROR] Aborting
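One way to protect against this (a standard mysqld option-prefix feature, not specific to this plugin) is the loose_ prefix, which downgrades unknown-variable errors to warnings so the server still starts even when the plugin is not yet loaded:

## Audit Logging ##
loose_audit_log_policy=ALL
loose_audit_log_format=JSON
loose_audit_log_file=/var/log/mysql/audit.log
loose_audit_log_rotate_on_size=1024M
loose_audit_log_rotations=10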

When everything is up and running, we can check that the content is finding its way to our log file by opening it up and taking a look. Our JSON output will store a new line of JSON per event; here’s an example:

{"audit_record":{"name":"Query","record":"1067824616_2015-09-02T10:04:26","timestamp":"2015-09-02T10:54:53 UTC","command_class":"show_status","connection_id":"6","status":0,"sqltext":"SHOW /*!50002 GLOBAL */ STATUS","user":"pct[pct] @ localhost [127.0.0.1]","host":"localhost","os_user":"","ip":"127.0.0.1"}}

Compare that with the ‘OLD’ XML output format, which spans multiple lines, making parsing a more difficult task:

<AUDIT_RECORD
 NAME="Query"
 RECORD="2745742_2015-09-02T21:12:10"
 TIMESTAMP="2015-09-02T21:12:22 UTC"
 COMMAND_CLASS="show_status"
 CONNECTION_ID="8"
 STATUS="0"
 SQLTEXT="SHOW /*!50002 GLOBAL */ STATUS"
 USER="pct[pct] @ localhost [127.0.0.1]"
 HOST="localhost"
 OS_USER=""
 IP="127.0.0.1"
/>
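The one-event-per-line JSON format also makes ad-hoc analysis straightforward with standard tools (a sketch; it assumes jq is installed and uses the log path from our configuration above):

# count events per user
jq -r '.audit_record.user' /var/log/mysql/audit.log | sort | uniq -c | sort -rn

# show only connection events
jq 'select(.audit_record.name == "Connect")' /var/log/mysql/audit.log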

Cost

One of the common assumptions about invoking the Audit Log Plugin is that it’s going to take an almighty hit on load. Logging all connections, queries, and admin statements… surely? Well, not so. I spent some time observing the impact on resources on a humbly specced home server: a small machine running a quad-core Xeon, 32G of RAM, and a Samsung PRO SSD, with a 7.2k RPM disk for the logs. The graphs below illustrate the impact of turning on audit logging in asynchronous mode; the results encouragingly show little impact from activating full logging. In each image, audit logging was first off and subsequently turned on.

[Graphs: CPU utilization, disk throughput, and overall load, with audit logging off and then on]

Summary

We can install the Percona audit log plugin with our preferred options on a running system, without interrupting it, by adding our variables to the my.cnf. Performing this prior to installing the plugin gives us best-practice options without needing to restart the instance for static variables to take effect. Due to the lightweight nature of the audit plugin, you can add this new log file to track access and changes to your data without the performance hit of the slow or general log. The audit log is a great aid to debugging and can serve as a security measure and malpractice deterrent.
