Top 3 Questions From Percona k8s Squad Ask-Me-Anything Session
On October 24th, the Percona Kubernetes Squad held the first Ask-me-Anything (AMA) session to address inquiries regarding the utilization of Kubernetes for database deployment. This blog post will outline the top three questions raised during the session and provide thorough responses from our team of experts.
Q1: When is it appropriate to use Kubernetes for databases, and when is it not recommended?
There are certain scenarios where Kubernetes is well-suited for database deployment and others where it is not the optimal choice.
The good
Ideal use cases for Kubernetes and databases include small databases that back repeatable tasks, such as continuous integration and deployment (CI/CD) pipelines or developer-driven tasks.
Another case is where you are struggling to automate your day-to-day routine and your custom scripts have grown to the point that a separate team now maintains them. Operators and Kubernetes come in handy here, as they take all this complexity away and remove toil.
The bad
On the other hand, Kubernetes is not designed to handle large databases or mainframes. Although it might still function in such cases, it is not recommended to utilize Kubernetes for these scenarios.
Kubernetes is also a poor fit for a manual approach that resists its core principles and insists on applying traditional methods to database management.
The ugly
A lack of Kubernetes expertise combined with the desire to use it anyway can play a dirty trick on you. Even though Operators hide the complexity of managing Kubernetes primitives and databases, a minimal knowledge of the Cloud Native ecosystem is still required to run them successfully.
Joe Brockmeier gave a talk on exactly this topic earlier this year at Open Source Summit North America in Vancouver: “Are Containers Ready for Production Databases?”
Q2: How does Kubernetes interact with Jenkins automation?
Kubernetes seamlessly integrates with Jenkins, as well as other automation tools that can interact with it. Since Kubernetes operates in a declarative manner and maintains a state, the choice of tools often aligns with its principles. For instance, popular options include Terraform and GitOps tools like ArgoCD, which adhere to the declarative approach.
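As a minimal sketch of what that declarative approach looks like with ArgoCD (the repository URL, paths, and names below are hypothetical), an Application resource points at a Git directory holding the database Operator manifests, and ArgoCD keeps the cluster in sync with whatever is committed there:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: pxc-cluster            # hypothetical application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/db-manifests.git  # hypothetical repo
    targetRevision: main
    path: clusters/production                             # directory with the Operator CRs
  destination:
    server: https://kubernetes.default.svc
    namespace: pxc
  syncPolicy:
    automated:
      prune: true     # remove resources that were deleted from Git
      selfHeal: true  # revert manual drift back to the declared state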
At Percona, our team has successfully utilized Jenkins alongside Kubernetes, ensuring smooth automated testing of our Operators and other products.
Q3: What performance issues, if any, does Kubernetes introduce?
Kubernetes performance is heavily influenced by the underlying hardware. Running a database on a Kubernetes cluster should deliver similar performance, with less than a 1% difference when compared to running it on standalone hardware.
However, Kubernetes does introduce additional layers, particularly in storage and networking.
To achieve optimal performance, it is recommended to keep the networking layer as simple as possible, avoiding tunneling at all costs.
For storage, Kubernetes provides various tools that leverage local storage volumes, such as NVMe, to achieve performance that is comparable to bare-metal deployments. Tools like OpenEBS (local path or Mayastor) allow you to do that.
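As a rough sketch, assuming the OpenEBS Dynamic LocalPV (hostpath) provisioner is installed and its hostpath directory sits on the node's local NVMe, a storage class for such volumes could look like this (the name is hypothetical):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-nvme                         # hypothetical name
provisioner: openebs.io/local              # OpenEBS LocalPV provisioner
volumeBindingMode: WaitForFirstConsumer    # bind only once the pod is scheduled to a node
reclaimPolicy: Delete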
Join the Percona Kubernetes Squad
Join the Percona Kubernetes Squad – a group of database professionals at the forefront of innovating database operations on Kubernetes within their organizations and beyond. The Squad is dedicated to providing its members with unwavering support as we all navigate the cloud-native landscape.
Help! I Am Out of Disk Space!
How can we fix a nasty out-of-space issue leveraging the flexibility of Percona Operator for MySQL?
When planning a database deployment, one of the most challenging factors to consider is the amount of space we need to dedicate to data on disk.
This is even more cumbersome when working on bare metal, as adding space there is more difficult than in the cloud.
When using cloud storage like EBS or similar, it is normally easier to extend volumes, which gives us the luxury of planning the space to allocate for data with a good degree of relaxation.
Is this also true when using a solution based on Kubernetes like Percona Operator for MySQL? Well, it depends on where you run it. However, if the platform you choose supports the option to extend volumes, Kubernetes itself gives you the ability to do so as well.
Nonetheless, if it can go wrong it will, and ending up with a fully filled device with MySQL is not a fun experience.
As you know, in normal deployments, when MySQL has no space left on the device, it simply stops working, causing a production down event, which is of course unfortunate and something we want to avoid at any cost.
This blog is the story of what happened, what was supposed to happen, and why.
The story
The case was on AWS using EKS.
Given all the above, I was quite surprised when we had a case in which a deployed solution based on Percona Operator for MySQL ran out of space. We started to dig in and review what was going on and why.
The first thing we did was quickly investigate what was really taking space. That could have been an easy win if most of the space had been taken by some log, but unfortunately this was not the case; data was really taking all the available space.
The next step was to check what storage class (SC) was used for the PersistentVolumeClaim (PVC):
k get pvc
NAME                         VOLUME       CAPACITY   ACCESS MODES   STORAGECLASS
datadir-mt-cluster-1-pxc-0   pvc-<snip>   233Gi      RWO            io1
datadir-mt-cluster-1-pxc-1   pvc-<snip>   233Gi      RWO            io1
datadir-mt-cluster-1-pxc-2   pvc-<snip>   233Gi      RWO            io1
Ok we use the io1 SC, it is now time to check if the SC is supporting volume expansion:
kubectl describe sc io1
Name:                  io1
IsDefaultClass:        No
Annotations:           kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"},"name":"io1"},"parameters":{"fsType":"ext4","iopsPerGB":"12","type":"io1"},"provisioner":"kubernetes.io/aws-ebs"}
                       ,storageclass.kubernetes.io/is-default-class=false
Provisioner:           kubernetes.io/aws-ebs
Parameters:            fsType=ext4,iopsPerGB=12,type=io1
AllowVolumeExpansion:  <unset>   <------------
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     Immediate
Events:                <none>
And no, it is not enabled. In this case, we cannot simply go and expand the volume; we must change the storage class settings first. To enable volume expansion, you need to delete the storage class and recreate it with the option enabled.
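In practice that operation is just a couple of commands (the manifest file name here is hypothetical):

# Delete the storage class and re-apply it from a manifest
# that includes allowVolumeExpansion: true
kubectl delete sc io1
kubectl apply -f io1-sc.yaml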
Unfortunately, we were unsuccessful in that operation: the storage class kept coming back with ALLOWVOLUMEEXPANSION unset.
As said, this was a production down event, so we could not invest too much time in digging into why the mode was not changing correctly; we had to act quickly.
The only option we had to fix it was:
- Expand the io1 volumes from the AWS console (or AWS client)
- Resize the file system
- Patch the PVC so Kubernetes correctly sees the new volume size
Expanding EBS volumes from the console is trivial: go to Volumes, select the volume you want to modify, choose Modify, change the size to the desired one, and done.
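The same operation can be scripted with the AWS CLI; here is a sketch with a hypothetical volume ID:

# Grow the EBS volume to 350 GiB
aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --size 350
# Watch the modification progress until it reports "optimizing" or "completed"
aws ec2 describe-volumes-modifications --volume-ids vol-0123456789abcdef0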
Once that is done, connect to the node hosting the pod that has the volume mounted:
k get pods -o wide | grep mysql-0
NAME              READY   STATUS    RESTARTS   AGE   IP            NODE
cluster-1-pxc-0   2/2     Running   1          11d   10.1.76.189   <mynode>.eu-central-1.compute.internal
Then we need to get the id of the PVC to identify it on the node:
k get pvc
NAME                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS
datadir-cluster-1-pxc-0   Bound    pvc-1678c7ee-3e50-4329-a5d8-25cd188dc0df   233Gi      RWO            io1
One note: when doing this kind of recovery with a Percona XtraDB Cluster-based solution, always recover node-0 first, then the others.
So we connect to <mynode> and identify the volume:
lsblk | grep pvc-1678c7ee-3e50-4329-a5d8-25cd188dc0df
nvme1n1  259:4  0  350G  0  disk  /var/lib/kubelet/pods/9724a0f6-fb79-4e6b-be8d-b797062bf716/volumes/kubernetes.io~aws-ebs/pvc-1678c7ee-3e50-4329-a5d8-25cd188dc0df <-----
At this point we can resize it:
root@ip-<snip>:/# resize2fs /dev/nvme1n1
resize2fs 1.45.5 (07-Jan-2020)
Filesystem at /dev/nvme1n1 is mounted on /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/eu-central-1a/vol-0ab0db8ecf0293b2f; on-line resizing required
old_desc_blocks = 30, new_desc_blocks = 44
The filesystem on /dev/nvme1n1 is now 91750400 (4k) blocks long.
The good thing is that as soon as you do that, the MySQL daemon sees the space and restarts. However, this happens only on the current pod, and Kubernetes will still see the old size:
k get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                         STORAGECLASS   REASON
pvc-1678c7ee-3e50-4329-a5d8-25cd188dc0df   333Gi      RWO            Delete           Bound    pxc/datadir-cluster-1-pxc-0   io1
To align Kubernetes with the real size, we must patch the stored information with the following command:
kubectl patch pvc <pvc-name> -n <pvc-namespace> -p '{ "spec": { "resources": { "requests": { "storage": "NEW STORAGE VALUE" }}}}'

i.e.:

kubectl patch pvc datadir-cluster-1-pxc-0 -n pxc -p '{ "spec": { "resources": { "requests": { "storage": "350" }}}}'
Remember to use as <pvc-name> the NAME column coming from kubectl get pvc.
Once this is done, Kubernetes will see the new volume size correctly.
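A quick check with the same names as above confirms it:

kubectl get pvc datadir-cluster-1-pxc-0 -n pxc   # CAPACITY should now show the new size
kubectl get pv | grep datadir-cluster-1-pxc-0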
Just repeat the process for node-1 and node-2, and… done, the cluster is up again.
Finally, do not forget to modify your custom resource file (cr.yaml) to match the new volume size, e.g.:
volumeSpec:
  persistentVolumeClaim:
    storageClassName: "io1"
    resources:
      requests:
        storage: 350G
The whole process took just a few minutes. It was then time to investigate why the incident happened and why the storage class was not allowing extension in the first place.
Why it happened
Well, first and foremost, the platform was not correctly monitored. As such, there was a lack of visibility into space utilization and no alert about disk space.
This was easy to solve by enabling the Percona Monitoring and Management (PMM) feature in the cluster cr and setting the alert in PMM once the nodes join it (see https://docs.percona.com/percona-monitoring-and-management/get-started/alerting.html for details on how to do it).
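As a minimal sketch, assuming a PMM server is already reachable inside the cluster, the relevant fragment of the cluster cr looks roughly like this (field names may vary slightly between Operator versions):

spec:
  pmm:
    enabled: true
    image: percona/pmm-client:2      # PMM client image; pin an exact tag in production
    serverHost: monitoring-service   # service name of your PMM server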
The second issue was the problem with the storage class. Once we had time to carefully review the configuration files, we identified a stray trailing character after the allowVolumeExpansion directive (visible below as the period after true), which was causing Kubernetes to ignore it.
It was supposed to be:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: io1
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: "12"
  fsType: ext4
allowVolumeExpansion: true   <----------

It was:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: io1
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: "12"
  fsType: ext4
allowVolumeExpansion: true.   <---------
What was concerning was the lack of an error returned by the Kubernetes API: the configuration was accepted but not really validated.
In any case, once we had fixed the typo and recreated the SC, the setting for volume expansion was correctly accepted:
kubectl describe sc io1
Name:                  io1
IsDefaultClass:        No
Annotations:           kubectl.kubernetes.io/last-applied-configuration={"allowVolumeExpansion":true,"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"},"name":"io1"},"parameters":{"fsType":"ext4","iopsPerGB":"12","type":"io1"},"provisioner":"kubernetes.io/aws-ebs"}
                       ,storageclass.kubernetes.io/is-default-class=false
Provisioner:           kubernetes.io/aws-ebs
Parameters:            fsType=ext4,iopsPerGB=12,type=io1
AllowVolumeExpansion:  True
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     Immediate
Events:                <none>
What should have happened instead?
If proper monitoring and alerting were in place, the administrators would have the time to act and extend the volumes without downtime.
However, the procedure for extending volumes on Kubernetes, while not complex, is also not as straightforward as you may think. My colleague Natalia Marukovich wrote a blog post, Percona Operator Volume Expansion Without Downtime, that gives you step-by-step instructions on how to extend the volumes without downtime.
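For reference, with a storage class that already permits expansion, the online procedure boils down to roughly this sketch (see Natalia's post for the full, safe sequence):

# Allow expansion on the storage class (this field is mutable)
kubectl patch sc io1 -p '{"allowVolumeExpansion": true}'
# Request the new size on the PVC; the CSI driver then grows the volume
# and the filesystem online
kubectl patch pvc datadir-cluster-1-pxc-0 -n pxc \
  -p '{"spec":{"resources":{"requests":{"storage":"350Gi"}}}}'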
Conclusion
Using the cloud, containers, automation, or more complex orchestrators like Kubernetes does not solve everything, does not prevent mistakes from happening, and, more importantly, does not make the right decisions for you.
You must set up a proper architecture that includes backup, monitoring, and alerting. You must set the right alerts and act on them in time.
Finally, automation is cool; however, the devil is in the details, and typos are his day-to-day joy. Be careful and check what you put online; do not rush it. Validate, validate, validate…
Great stateful MySQL to all.
Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.
Percona Server audit log plugin best practices
Auditing your database means tracking access and changes to your data and database objects. The Audit Log Plugin has shipped with Percona Server since 5.5.37/5.6.17, a little over 12 months now. Before the Audit Log Plugin, you had to work in darker ways to achieve some incarnation of an audit trail.
We have seen attempts at creating audit trails using approaches such as ‘sniffing the wire’, init files, in-schema ‘on update’ fields, triggers, proxies, and trying to parse the traditional logs of MySQL (slow, general, binary, error). All of these attempts miss a piece of the pie: if you’re sniffing TCP traffic you’ll miss local connections, and if you’re parsing binary logs you’re missing any reads. Your reasons for audit logging might come down to compliance requirements (HIPAA, PCI DSS), or you may need a way to examine database activity or track incoming connections.
Over the past months I’ve met many support requests with the answer ‘install the audit plugin’. These requests have been varied, ranging from finding out whether a user is still active and what the impact of decommissioning it would be, to the frequency of specific queries, to checking whether a slave is being written to, to name but a few.
So then, let’s look at installation. In general, we want to install the Audit Plugin on an existing instance. As discussed in previous Percona blog posts, installation of the plugin is trivial, but let’s recap. Let’s perform a couple of basic checks before we run the install command from the client. First, query MySQL for the location of the plugins directory:
mysql> show global variables like 'plugin_dir';
+---------------+--------------------------+
| Variable_name | Value                    |
+---------------+--------------------------+
| plugin_dir    | /usr/lib64/mysql/plugin/ |
+---------------+--------------------------+
1 row in set (0.00 sec)
Once that’s known, we’ll check that the audit log plugin shared library is present:
[moore@randy ~]$ ls -l /usr/lib64/mysql/plugin/audit*
-rwxr-xr-x. 1 root root 42976 Jul  1 09:24 /usr/lib64/mysql/plugin/audit_log.so
Great, we are in good shape to move to the client and install:
mysql> install plugin audit_log soname 'audit_log.so';
Query OK, 0 rows affected (0.00 sec)

mysql> select * from mysql.plugin;
+-------------------------------+--------------+
| name                          | dl           |
+-------------------------------+--------------+
| audit_log                     | audit_log.so |
...
+-------------------------------+--------------+
8 rows in set (0.00 sec)
Voila! It’s that simple. So, what does that provide us? Thanks to the default variables, we now have the following options set:
mysql> show global variables like 'audit%';
+---------------------------+---------------+
| Variable_name             | Value         |
+---------------------------+---------------+
| audit_log_buffer_size     | 1048576       |
| audit_log_file            | audit.log     |
| audit_log_flush           | OFF           |
| audit_log_format          | OLD           |
| audit_log_handler         | FILE          |
| audit_log_policy          | ALL           |
| audit_log_rotate_on_size  | 0             |
| audit_log_rotations       | 0             |
| audit_log_strategy        | ASYNCHRONOUS  |
| audit_log_syslog_facility | LOG_USER      |
| audit_log_syslog_ident    | percona-audit |
| audit_log_syslog_priority | LOG_INFO      |
+---------------------------+---------------+
12 rows in set (0.00 sec)
What we can tell from that output is that our audit plugin is enabled, it’s logging to the default location ({datadir}/audit.log), and we’re grabbing all events (ALL) on the server, writing the output in XML format (OLD). Of the variables listed above, only one is dynamic. This means that to change the logfile location or the format, we need to put these options into our my.cnf and restart the instance. Not very convenient. Personally, I prefer to store the audit.log file away from my datadir.
I also dislike the XML formats in favour of the JSON log format. It is also advised, especially on busier systems, to enable the rotation options, audit_log_rotate_on_size and audit_log_rotations, so that you don’t end up filling your disk with a huge audit log. Restarting your production instances isn’t extremely convenient, but you’ll be happy to learn there is another way.
Let’s rewind to before we installed the plugin. We had checked the existence of our plugin shared library and were itching to run the install command. Now we can open our my.cnf file and add our preferred options prior to installation. Whilst it’s far from a secret, not many know that during the plugin installation phase, MySQL will re-read the my.cnf file to check for configuration relevant to the plugin. So let’s add some variables here:
## Audit Logging ##
audit_log_policy=ALL
audit_log_format=JSON
audit_log_file=/var/log/mysql/audit.log
audit_log_rotate_on_size=1024M
audit_log_rotations=10
A quick review of the above: I intend to log all events in JSON format to the /var/log/mysql location. I will rotate each time the active log file hits 1G, rotating through 10 files, meaning I will never have more than 10G of audit logs on my filesystem.
Now, with our predefined configuration in the my.cnf, we can install the plugin from cold and begin with our preferred options:
mysql> show global variables like 'audit%';
Empty set (0.00 sec)

mysql> install plugin audit_log soname 'audit_log.so';
Query OK, 0 rows affected (0.00 sec)

mysql> show global variables like 'audit%';
+---------------------------+--------------------------+
| Variable_name             | Value                    |
+---------------------------+--------------------------+
| audit_log_buffer_size     | 1048576                  |
| audit_log_file            | /var/log/mysql/audit.log |
| audit_log_flush           | OFF                      |
| audit_log_format          | JSON                     |
| audit_log_handler         | FILE                     |
| audit_log_policy          | ALL                      |
| audit_log_rotate_on_size  | 1073741824               |
| audit_log_rotations       | 10                       |
| audit_log_strategy        | ASYNCHRONOUS             |
| audit_log_syslog_facility | LOG_USER                 |
| audit_log_syslog_ident    | percona-audit            |
| audit_log_syslog_priority | LOG_INFO                 |
+---------------------------+--------------------------+
12 rows in set (0.00 sec)
Something to remember: if you add these variables before installing the plugin and then restart your instance or suffer a crash, your instance will not start:
[moore@randy ~]$ sudo systemctl restart mysql
[moore@randy ~]$ sudo egrep 'ERROR' /var/log/mysqld.log
2015-09-02 11:55:16 8794 [ERROR] /usr/sbin/mysqld: unknown variable 'audit_log_policy=ALL'
2015-09-02 11:55:16 8794 [ERROR] Aborting
When all is up and running, we can check that the content is finding its way to our log file by opening it up and taking a look. Our JSON output stores one line of JSON per event; here’s an example:
{"audit_record":{"name":"Query","record":"1067824616_2015-09-02T10:04:26","timestamp":"2015-09-02T10:54:53 UTC","command_class":"show_status","connection_id":"6","status":0,"sqltext":"SHOW /*!50002 GLOBAL */ STATUS","user":"pct[pct] @ localhost [127.0.0.1]","host":"localhost","os_user":"","ip":"127.0.0.1"}}
Compare that with the ‘OLD’ XML output format, which spans multiple lines, making parsing a more difficult task:
<AUDIT_RECORD NAME="Query" RECORD="2745742_2015-09-02T21:12:10" TIMESTAMP="2015-09-02T21:12:22 UTC" COMMAND_CLASS="show_status" CONNECTION_ID="8" STATUS="0" SQLTEXT="SHOW /*!50002 GLOBAL */ STATUS" USER="pct[pct] @ localhost [127.0.0.1]" HOST="localhost" OS_USER="" IP="127.0.0.1" />
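One practical advantage of the one-event-per-line JSON format over the XML one is that it can be queried with standard tools. For example, a couple of sketches, assuming jq is installed and the log path from our my.cnf above:

# Count audit events per user
jq -r '.audit_record.user' /var/log/mysql/audit.log | sort | uniq -c | sort -rn

# Extract the SQL text of a particular command class
jq -r 'select(.audit_record.command_class == "show_status") | .audit_record.sqltext' /var/log/mysql/audit.log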
Cost
One of the common assumptions about enabling the Audit Plugin is that it’s going to take an almighty hit on load. Logging all connections, queries, and admin statements… surely? Well, not so true. I spent some time observing the impact on the resources of a humbly specced home server: a small machine running a quad-core Xeon, 32G of RAM, and a Samsung PRO SSD, with a 7,200 RPM disk for the logs. The graphs I collected showed that turning on audit logging in asynchronous mode had encouragingly little impact, even with full logging enabled; in each test run, audit logging was set off and subsequently on.
Summary
We can install the Percona Audit Plugin with our preferred options on a running system, without interrupting it, by adding our variables to the my.cnf. Performing this prior to installing the plugin gives us best-practice options without needing to restart the instance for static variables to take effect. Due to the lightweight nature of the audit plugin, you can add this new log file to track access and changes to data without the performance hit of the slow or general log. The audit log is a great aid to debugging and can serve as a security measure and malpractice deterrent.