Jan
14
2021
--

How to Store MySQL Audit Logs in MongoDB in a Maintenance-Free Setup


I was once helping one of our customers load MySQL audit logs into a MySQL database and analyze them. But I immediately thought: “Hey, this is not the most efficient solution! MySQL, or a typical RDBMS in general, was not really meant to store logs after all.”

So, I decided to explore an alternative – which seemed more sensible to me – and use MongoDB as the storage for logs, for three main reasons:

  • schema-less nature fits audit logs well, since different event types may use different fields
  • speaks JSON natively, and the audit plugin can use JSON format
  • has the capped collections feature, which avoids additional maintenance overhead

Just to mention, audit logging is available in MySQL Enterprise Edition, but a similar, yet free, solution is available in Percona Server for MySQL. In both cases, it works by installing the audit log plugin.

Ad Hoc Import

The simplest scenario is to just set the audit log format to JSON:

audit_log_format = JSON

And as soon as it collects some data, import the log file into a MongoDB collection via the mongoimport command, like this:

# mongoimport --username percona --password P3rc0n4 --host 10.95.83.225 --port 27017 --db auditlogs --collection audit1 --file /var/lib/mysql/audit.log
2020-12-31T16:24:43.782+0000 connected to: 10.95.83.225:27017
2020-12-31T16:24:44.316+0000 imported 25462 documents

mongo > db.audit1.countDocuments({})
25462

Of course, this works, but I prefer an automated solution, so I looked at available options for live-streaming the logs.

Syslog

The first thing that looked useful was the ability to send the audit log directly to syslog instead of a file. Since both rsyslog and syslog-ng have MongoDB output modules, it felt like a very easy approach. So I installed the rsyslog-mongodb module package on my test Ubuntu VM running Percona Server for MySQL, and configured the audit log with:

[mysqld]
audit_log_handler = syslog
audit_log_format = JSON

Here is the example rsyslog (version 8.2) configuration:

# cat /etc/rsyslog.d/49-ship-syslog.conf
action(type="ommongodb"
uristr="mongodb://percona:P3rc0n4@10.95.83.225:27017/?authSource=auditlogs"
db="auditlogs" collection="mysql_node1_log")

This worked; however, the inserted documents looked like this:

mongo > db.mysql_node1_log.findOne().pretty()
{
"_id" : ObjectId("5fece941f17f487c7d1d158b"),
"msg" : " {\"audit_record\":{\"name\":\"Connect\",\"record\":\"7_1970-01-01T00:00:00\",\"timestamp\":\"2020-12-30T20:55:29Z\",\"connection_id\":\"9\",\"status\":0,\"user\":\"root\",\"priv_user\":\"root\",\"os_login\":\"root\",\"proxy_user\":\"\",\"host\":\"localhost\",\"ip\":\"\",\"db\":\"\"}}"
}

Because syslog escapes the double quote characters, the whole audit record ends up as a single string inside the MongoDB collection instead of a JSON object. No matter what I tried, including custom templates and property values in rsyslog, I could not disable the escaping. Therefore, although feeding MongoDB with audit logs works this way, it becomes pretty useless when it comes to analyzing the logs later. The same issue applies to syslog-ng and the syslog-ng-mod-mongodb module. And since MongoDB does not offer before-insert triggers, I could not easily “fix” the inserted data on the fly.

Fluentd For The Rescue!

This forced me to look for alternative solutions. One of them would be to use a FIFO file, continuously tail the audit log into it, and then read from it to insert the logs into MongoDB. I wanted a more robust way, though, and decided to try Fluentd instead. It was created as a versatile log collector, highly flexible, prepared to work with many different applications out of the box, but most importantly, it is an open source project and speaks JSON natively. Making it do the job I wanted turned out to be easier than I expected.

Here is what I did:

  • Installed the Fluentd package (I chose td-agent variant here for an even easier user experience)
  • Installed MongoDB plugin for Fluentd with (don’t use the usual ‘gem install’ here):
td-agent-gem install fluent-plugin-mongo

  • Configured audit log as a source and output directive for MongoDB:
# cat /etc/td-agent/td-agent.conf
####
...
<source>
 @type tail
 path /var/lib/mysql/audit.log
 pos_file /var/log/td-agent/audit.access_log.pos
 <parse>
  @type json
 </parse>
 tag mongo.audit.log
</source>
<match mongo.audit.log>
 @type mongo
 database auditlogs #(required)
 collection audit_log #(optional; default="untagged")
 capped
 capped_size 100m
 host 10.95.83.225 #(optional; default="localhost")
 port 27017 #(optional; default=27017)
 user percona
 password P3rc0n4
 <buffer>
  flush_interval 1s
 </buffer>
</match>

  • Added the user used by Fluentd to the mysql group to allow it to read the audit log:
# id td-agent
uid=114(td-agent) gid=121(td-agent) groups=121(td-agent)
# usermod -a -G mysql td-agent
# id td-agent
uid=114(td-agent) gid=121(td-agent) groups=121(td-agent),120(mysql)

  • Configured the audit log plugin to write to a file, with rotation:
[mysqld]
audit_log_handler = file
audit_log_format = JSON
audit_log_file = audit.log
audit_log_rotate_on_size = 10M
audit_log_rotations = 3

  • Restarted both services to apply changes:
# systemctl restart mysql
# systemctl restart td-agent

  • Checked the Fluentd log to verify that it reads the audit log as expected, including when Percona Server for MySQL rotates it:
# tail -f /var/log/td-agent/td-agent.log
2020-12-31 02:41:39 +0000 [info]: adding match pattern="mongo.audit.log" type="mongo"
...
2020-12-31 02:41:40 +0000 [info]: #0 following tail of /var/lib/mysql/audit.log
...
2020-12-31 02:52:14 +0000 [info]: #0 detected rotation of /var/lib/mysql/audit.log; waiting 5 seconds
2020-12-31 02:52:14 +0000 [info]: #0 following tail of /var/lib/mysql/audit.log

  • Ran sysbench against the MySQL instance and verified that the new collection in MongoDB gets updated:
mongo > db.audit_log.countDocuments({})
281245

mongo > db.audit_log.stats()
{
 "ns" : "auditlogs.audit_log",
 "size" : 104857293,
 "count" : 281245,
 "avgObjSize" : 372,
 "storageSize" : 26357760,
 "capped" : true,
 "max" : -1,
 "maxSize" : 104857600,
(...)

Yay, it works like a charm! Not only are the audit logs rotated automatically on the Percona Server for MySQL side, but the destination collection size cap works on the MongoDB side as well, so I am safe when it comes to disk space on both hosts!

There is a little caveat here – if for some reason you drop the destination collection manually on MongoDB, incoming inserts will re-create it without the capped setting! Therefore, either let the collection be created by Fluentd on its service startup or create it manually with the capped setting, and don’t drop it later.
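
If you prefer to pre-create the collection yourself, a capped collection can be created like this (the 100MB size simply mirrors the capped_size used in the Fluentd configuration above):

mongo > db.createCollection("audit_log", { capped: true, size: 104857600 })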

Now, we can try some example aggregations to get some useful audit stats:

mongo > db.audit_log.aggregate([ { $group: { _id: {name: "$audit_record.name", command: "$audit_record.command_class"}, count: {$sum:1}}}, { $sort: {count:-1}} ])
{ "_id" : { "name" : "Execute", "command" : "error" }, "count" : 267086 }
{ "_id" : { "name" : "Query", "command" : "begin" }, "count" : 14054 }
{ "_id" : { "name" : "Close stmt", "command" : "error" }, "count" : 76 }
{ "_id" : { "name" : "Query", "command" : "show_variables" }, "count" : 7 }
{ "_id" : { "name" : "Query", "command" : "select" }, "count" : 6 }
{ "_id" : { "name" : "Quit" }, "count" : 5 }
{ "_id" : { "name" : "Query", "command" : "show_tables" }, "count" : 4 }
{ "_id" : { "name" : "Init DB", "command" : "error" }, "count" : 2 }
{ "_id" : { "name" : "Field List", "command" : "show_fields" }, "count" : 2 }
{ "_id" : { "name" : "Query", "command" : "show_databases" }, "count" : 2 }
{ "_id" : { "name" : "Connect" }, "count" : 1 }

mongo > db.audit_log.aggregate([ { $match: { "audit_record.status": {$gt: 0} } }, { $group: { _id: {command_class: "$audit_record.command_class", status: "$audit_record.status"}, count: {$sum:1}}}, { $sort: {count:-1}} ])
{ "_id" : { "command_class" : "error", "status" : 1049 }, "count" : 2 }
{ "_id" : { "command_class" : "show_tables", "status" : 1046 }, "count" : 2 }
{ "_id" : { "command_class" : "create_table", "status" : 1050 }, "count" : 2 }
{ "_id" : { "command_class" : "drop_table", "status" : 1051 }, "count" : 2 }
{ "_id" : { "command_class" : "drop_table", "status" : 1046 }, "count" : 2 }
{ "_id" : { "command_class" : "create_table", "status" : 1046 }, "count" : 1 }
{ "_id" : { "command_class" : "create_table", "status" : 1113 }, "count" : 1 }

References

https://www.percona.com/doc/percona-server/LATEST/management/audit_log_plugin.html
https://dev.mysql.com/doc/refman/8.0/en/audit-log.html
https://www.rsyslog.com/doc/v8-stable/configuration/modules/ommongodb.html
https://docs.fluentd.org/output/mongo

Jan
14
2021
--

Thinking About Deploying MongoDB? Read This First.


Are you thinking about deploying MongoDB? Is it the right choice for you?

Choosing a database is an important step when designing an application. A wrong choice can have a negative impact on your organization in terms of development and maintenance, and it can also lead to poor performance.

Generally speaking, any kind of database can manage any kind of workload, but every database has specific workloads that fit it better than others.

You shouldn’t choose MongoDB just because it’s cool and a lot of companies are already using it. You need to understand whether it fits your workload and expectations. So, choose the right tool for the job.

In this article, we are going to discuss a few things you need to know before choosing and deploying MongoDB.

MongoDB Manages JSON-style Documents and Developers Appreciate That

The basic component of a MongoDB database is the JSON-style document. Technically it is BSON, which contains some extra data types (e.g., datetime) that aren’t valid JSON.

We can consider a document the equivalent of a row in a relational database. Documents are put into a collection, the equivalent of a relational table.

JSON-style documents are widely used by a lot of programmers worldwide to implement web services, applications, and exchange data. Having a database that is able to manage that data natively is really effective.

MongoDB is often appreciated by developers because they can start using it without having specific knowledge about database administration and design and without studying a complex query language. Indeed, the MongoDB query language is also represented by JSON documents.

Developers can create, save, retrieve, and update their JSON-style documents with ease. Great! This usually leads to a significant reduction in development time.

MongoDB is Schemaless

Are you familiar with relational databases? Surely you are, as relational databases have been used and studied for a long time at school and at university. Relational databases are still the most widely used in the market nowadays.

You know that a relational schema needs a predefined and fixed structure for the tables. Any time you add or change a column, you need to run a DDL query, and additional time is necessary to change your application code to manage the new structure. In the case of a massive change that requires multiple column changes and/or the creation of new tables, the application changes could be significant. MongoDB’s lack of schema enforcement means none of that is required. You just insert a document into a collection and that’s all. Let’s suppose you have a collection with user data. If at some point you need to add, for example, a new “date_of_birth” field, you simply start to insert the new JSON documents with the additional field. That’s all. No need to change anything in the schema.
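
For example (the collection and field names here are only illustrative), old and new documents can coexist in the same collection:

mongo > db.users.insertOne({ name: "Alice", email: "alice@example.com" })
mongo > db.users.insertOne({ name: "Bob", email: "bob@example.com", date_of_birth: ISODate("1990-05-17") })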

You can even insert completely different JSON documents, representing different entities, into the same collection. This is technically feasible, but not recommended.

MongoDB greatly shortens the cycle of application development for a non-technology reason – it removes the need to coordinate a schema change migration project with the DBA team. There is no need to wait until the DBA team does a QA dress rehearsal and then the production release (with rollback plans) that, as often as not, requires some production downtime.

MongoDB Has No Foreign Keys, Stored Procedures, or Triggers. Joins Are Supported, but Atypical.

A relational database design typically requires SQL queries to join multiple tables on specific fields. The design may also require foreign keys for assuring the consistency of the data and for running automatic changes on semantically connected fields.

What about stored procedures? They can be useful for embedding some application logic into the database, to simplify some tasks or to improve security.

And what about triggers? They are useful to automatically “trigger” changes on the data based on specific events, like adding/changing/deleting a row. They help to manage the consistency of the data and, in some cases, to simplify the application code.

Well, none of them is available on MongoDB. So, be aware of that.

Note: to be honest, there’s an aggregation stage ($lookup) that can implement the equivalent of a LEFT JOIN, but this is the only case.
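
As a quick illustration of that stage (collection and field names are hypothetical), a left-join-like lookup looks like this:

mongo > db.orders.aggregate([
  { $lookup: {
      from: "customers",          // the collection to join
      localField: "customer_id",  // field in orders
      foreignField: "_id",        // field in customers
      as: "customer"              // output array field
  } }
])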

How to survive without JOIN?

Managing JOINs must be done in your application code. If you need to join two collections, you read the first one, select the join field, and use it to query the second collection, and so on. This seems expensive in terms of application development, and it can also lead to more queries being executed. Indeed it is, but the good news is that in many cases you don’t have to manage joins at all.

Remember that MongoDB is a schemaless database; it doesn’t require normalization. If you are able to properly design your collections, you can embed and duplicate data in a single collection without the need to create an additional collection. This way you won’t need to run any joins, because all the data you need is already in a single collection.

Foreign keys are not available, but as long as you can embed multiple documents into the same collection, you don’t really need them.

Stored procedures can be implemented easily as external scripts written in your preferred language. Triggers can be implemented externally the same way, with the help of the Change Streams feature connected to a collection.
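
As a rough sketch of that trigger-like pattern (the collection name is hypothetical, and note that change streams require a replica set), a change stream can be watched from the shell or from application code:

mongo > const cursor = db.users.watch([ { $match: { operationType: "insert" } } ])
mongo > while (cursor.hasNext()) { printjson(cursor.next()) }   // react to every new insert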

If you have a lot of collections with referenced fields, you have to implement a lot of joins in your code, or do a lot of checks to assure consistency. This is possible, but at a higher cost in terms of development. MongoDB could be the wrong choice in such a case.

MongoDB Replication and Sharding Are Easy to Deploy

MongoDB was not designed as a standalone application. It was designed instead to be a piece of a larger puzzle. A mongod server is able to work together with other mongod instances in order to implement replication and sharding efficiently and without the need for any additional third-party tool.

A Replica Set is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability by design. With caveats regarding potentially stale data, you also get read scalability for free. It should be the basis for all production deployments.

The Sharding Cluster is deployed as a group of several Replica Sets with the capability to split and distribute the data evenly on them. The Sharding Cluster provides write scalability in addition to redundancy, high availability, and read scalability. The sharding topology is suitable for the deployment of very large data sets. The number of shards you can add is, in theory, unlimited.

Both topologies can be expanded at any time by adding more servers and shards. More importantly, no changes are required in the application, since each topology is completely transparent from the application’s perspective.

Finally, the deployment of such topologies is straightforward. You need to spend some time in the beginning to understand a few basic concepts, but then, in a matter of a few hours, you can deploy even a very large sharded cluster. With several servers, instead of doing everything manually, you can automate a lot of the work using Ansible playbooks or other similar tools.

Further readings:

Deploy a MongoDB Replica Set with Transport Encryption (Part 1)

MongoDB Sharding 101 Webinar

MongoDB Has Indexes and They Are Really Important

MongoDB allows you to create indexes on the JSON document’s fields. Indexes are used in the same way as in a relational database. They are useful for solving queries faster and for decreasing the usage of machine resources: memory, CPU time, and disk IOPS.

You should create all the indexes that will help any of the regularly executed queries, updates, or deletes from your application.

MongoDB has really advanced indexing capabilities. It provides TTL indexes, geospatial indexes, indexes on array elements, and partial and sparse indexes. If you need more details about the available index types, you can take a look at the following articles:

MongoDB Index Types and MongoDB explain() (part 1)

Using Partial and Sparse Indexes in MongoDB

Create all the indexes you need for your collections. They will help you a lot to improve the overall performance of your database.
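
For illustration (the collection and field names are hypothetical), a TTL index and a partial index can be created like this:

mongo > db.sessions.createIndex({ created_at: 1 }, { expireAfterSeconds: 3600 })
mongo > db.orders.createIndex({ status: 1 }, { partialFilterExpression: { status: "open" } })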

MongoDB is Memory Intensive

MongoDB is memory intensive; it needs a lot. This is the same for many other databases. Memory is the most important resource, most of the time.

MongoDB uses RAM for caching the most frequently and recently accessed data and indexes. The larger this cache, the better the overall performance will be, because MongoDB will be able to retrieve a lot of data faster. Also, MongoDB writes are only committed to memory before client confirmation is returned, at least by default. Writes to disk are done asynchronously – first to the journal file (typically within 50ms), and later to the normal data files (roughly once per minute).

The default storage engine used by MongoDB is WiredTiger. In the past there was also MMAPv1, but it is no longer available in more recent versions. The WiredTiger storage engine uses an important memory cache (the WiredTiger Cache) for caching data and indexes.

Besides the WiredTiger Cache (WTCache), MongoDB relies on the OS file system cache for accessing the disk pages. This is another important optimization, and significant memory may be required for that as well.

In addition, MongoDB needs memory for managing other things like client connections, in-memory sorting, temporary data used while executing aggregation pipelines, and other minor items.

In the end, be prepared to provide enough memory to MongoDB.

But how much memory do I need? The rule of thumb is to evaluate the “working set” size.

The “working set” is the amount of data that is most frequently requested by your application. Usually, an application needs a limited amount of data; it doesn’t need to read the entire data set during normal operations. For example, in the case of time-series data, you most probably need to read only the last few hours’ or last few days’ entries, and only on a few occasions will you need to read older data. In such a case, your working set is just those few days of data.

Let’s suppose your data set is 100GB and you estimate your working set to be around 20% of it; then you need to provide at least 20GB for the WTCache.

Since MongoDB by default uses 50% of the RAM for the WTCache (and we usually suggest not increasing that significantly), you should provide around 40GB of memory in total for your server.
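
The cache size can also be set explicitly in the mongod configuration file; a minimal sketch matching the 20GB example above:

# /etc/mongod.conf (excerpt)
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 20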

Every case is different, and sometimes it can be difficult to evaluate the working set size correctly. Anyway, the main recommendation is to spend a significant part of your budget on providing as much memory as you can. This will certainly be beneficial for MongoDB.

What Are the Suitable Use Cases for MongoDB?

Actually, a lot. I have seen MongoDB deployed in a wide variety of environments.

For example, MongoDB is suitable for:

  • events logging
  • content management
  • gaming applications
  • payment applications
  • real-time analytics
  • Internet Of Things applications
  • content caching
  • time-series data applications

And many others.

We can say that you can use MongoDB for basically everything; it is a general-purpose database. The key point is the way you use it.

For example, if you plan to use MongoDB the same way as a relational database, with data normalized, a lot of collections around, and a myriad of joins to be managed by the application, then MongoDB is not the right choice for sure. Use a relational database.

The best way to use MongoDB is to adhere to a few best practices and to model the collections keeping in mind some basic rules, like embedding documents instead of creating multiple referenced collections.

Percona Server for MongoDB: The Enterprise-Class Open Source Alternative

Percona develops and deploys its own open source version of MongoDB: the Percona Server for MongoDB (PSMDB).

PSMDB is a drop-in replacement for MongoDB Community and it is 100% compatible. The great advantage provided by PSMDB is that you can get enterprise-class features for free, like:

  • encryption at rest
  • audit logging
  • LDAP Authentication
  • LDAP Authorization
  • Log redaction
  • Kerberos Authentication
  • Hot backup
  • in-memory storage engine

Without PSMDB all these advanced features are available only in the MongoDB Enterprise subscription.

Please take a look at the following links for more details about PSMDB:

Percona Server for MongoDB Feature Comparison

Percona Server for MongoDB

Remember you can get in touch with Percona at any time for any details or for getting help.

Conclusion

Let’s have a look at the following list of the most important things you need to check before choosing MongoDB as the backend database for your applications. The three colored flags indicate whether MongoDB is a good choice: red means it’s not, orange means it could be a good choice but with some limitations or potential bottlenecks, and green means it’s very good.

Your applications primarily deal with JSON documents
Your data has unpredictable and frequent schema changes over time
You have several collections with a lot of external references for assuring consistency, and the majority of the queries need joins
You need to replicate stored procedures and triggers you have in your relational database
You need HA and read scalability
You need to scale your data to a very large size
You need to scale because of a huge amount of writes

 

And finally, remember the following:

Take a look at Percona Server for MongoDB 

Jan
13
2021
--

Running Kubernetes on the Edge


What is the Edge?

Edge is a buzzword that, behind the curtain, means moving private or public clouds closer to the end devices. End devices, such as Internet of Things devices (from a doorbell to a VoIP station), are becoming more complex and require more computational power. The number of connected devices grows constantly, and by the end of 2025 there will be 41.6 billion of them, generating 69.4 zettabytes of data.

Latency, data-processing speed, or security concerns often do not allow computation to happen in the cloud. Businesses instead rely on edge computing, or micro clouds, which can run closer to the end devices. All this constitutes the Edge.

How Kubernetes Helps Here

Containers are portable and quickly becoming a de facto standard to ship software. Kubernetes is a container orchestrator with robust built-in scaling capabilities. This gives the perfect toolset for businesses to shape their Edge computing with ease and without changing existing processes.

The cloud-native landscape has various small Kubernetes distributions that were designed and built for the Edge: k3s, microk8s, minikube, k0s, and the newly released EKS Distro. They are lightweight, can be deployed with a few commands, and are fully conformant. Projects like KubeEdge bring even more simplicity and standardization into the Kubernetes ecosystem on the Edge.

Running Kubernetes on the Edge also poses the challenge of managing hundreds or thousands of clusters. Google Anthos, Azure Arc, and VMware Tanzu allow you to run your clusters anywhere and manage them through a single interface with ease.

Topologies

We are going to review various topologies that Kubernetes provides for the Edge to bring computation and software closer to the end devices.

The end device is a Kubernetes cluster


Some devices run complex software and require multiple components to operate – web servers, databases, built-in data-processing, etc. Using packages is an option, but compared to containers and automated orchestration, it is slow and sometimes turns the upgrade process into a nightmare. In such cases, it is possible to run a Kubernetes cluster on each end device and manage software and infrastructure components using well-known primitives.

The drawback of this solution is the overhead that comes from running etcd and the control plane components on every device.

The end device is a node


In this case, you can manage each end device through a single Kubernetes control plane. Deploying software to support and run your phones, printers or any other devices can be done through standard Kubernetes primitives.

Micro-clouds


This topology is all about moving computational power closer to the end devices by creating micro clouds on the Edge. A micro cloud is formed by Kubernetes nodes on a server farm on the customer premises. Running your AI/ML (like Kubeflow) or any other resource-heavy application in your own micro cloud is done with Kubernetes and its primitives.

How Percona Addresses Edge Challenges

We at Percona continue to invest in the Kubernetes ecosystem and expand our partnership with the community. Our Kubernetes Operators for Percona XtraDB Cluster and MongoDB are open source and enable anyone to run production-ready MySQL and MongoDB databases on the Edge.

Check out how easy it is to deploy our operators on Minikube or EKS Distro (which is similar to microk8s). We are working on further simplifying Day 2 operations, and in future blog posts you will see how to deploy and manage databases on multiple Kubernetes clusters with KubeApps.
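
As a rough starting point, deploying the Percona XtraDB Cluster operator to whichever cluster kubectl currently points at looks roughly like this (the paths follow the operator’s GitHub repository layout at the time of writing, so treat them as an assumption):

# clone the operator repository and deploy it into the current Kubernetes context
git clone https://github.com/percona/percona-xtradb-cluster-operator
cd percona-xtradb-cluster-operator
kubectl apply -f deploy/bundle.yaml   # CRDs, RBAC, and the operator itself
kubectl apply -f deploy/cr.yaml       # a sample PerconaXtraDBCluster custom resource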

Jan
12
2021
--

MySQL Backup and Recovery Best Practices


In this blog, we will review the backup and restore strategies for MySQL – the cornerstones of any application. There are a few options, depending on your topology, MySQL version, etc., and based on that, there are some questions we need to ask ourselves to make sure we make the right choices.

How many backups do we need to keep, and what’s the best retention policy for us?

This means the number of backups to safeguard, whether local or remote (external fileserver, cloud). The retention policy can be daily, weekly, or monthly, depending on the free space available.

What is the Recovery Time Objective?

The Recovery Time Objective (RTO) refers to the amount of time that may pass during a disruption before it exceeds the maximum allowable threshold specified in the Business Continuity Plan.

The key question related to RTO is, “How quickly must the data on this system be restored?”

What is the Recovery Point Objective?

The Recovery Point Objective (RPO) is the duration of time and service level within which a business process must be restored after a disaster in order to avoid unacceptable consequences associated with a break in continuity.

The key question related to RPO is, “How much data can we lose?”

Different Types of Backups

There are two backup types: physical and logical.

  • Physical (Percona XtraBackup, RDS/LVM snapshots, MySQL Enterprise Backup); you can also use the cp or rsync command-line tools to copy the datadir, as long as MySQL is down/stopped.
  • Logical (mysqldump, mydumper, mysqlpump, and MySQL Shell, the latter only for MySQL 8)

It is also recommended to keep a copy of the binlog files. Why? Because they allow us to recover up to the last transaction.

Why are backups needed?

Backups are needed in case of multiple problems:

  • Host failure: We can run into problems from stalled or broken disks, and with cloud services, our DB instance can break and become inaccessible.
  • Corrupted data: This can happen after a power outage, when MySQL wasn’t able to write correctly and close the file; sometimes MySQL cannot start again due to corrupted data that the crash recovery process cannot fix.
  • Inconsistent data: When a human mistake deletes or updates erroneous data on the primary or a replica node.
  • Data center failure: A power outage or internet provider issues.
  • Legislation/regulation: Keeping backups may be required to provide consistent business value and customer satisfaction.

Now let me explain the different types of backups mentioned above. But before I continue, it’s important to note that you should configure a new, dedicated replica node for backup purposes (AKA a backup server), because backups generate high CPU load and running them on a dedicated node avoids issues on any other replica node.

Logical Backup

This is a dump of the logical database structure (CREATE DATABASE, CREATE TABLE statements) and content (INSERT statements). It is recommended for smaller amounts of data. The disadvantage of this method is that it is slower (for both backup and restore) compared with physical backups. With mydumper you can back up and restore a single database or a single table if needed, which is useful for copying some data to a different environment to run tests. Also, mydumper can take a consistent backup (as long as all the tables use the InnoDB engine) and provides accurate master and slave log positions.

The output is larger than for a physical backup, particularly when saved in text format, but it can be compressed on the fly depending on the software you are using: mydumper can compress natively, while mysqldump needs a pipe to redirect the output to gzip, for example.
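
For illustration, a compressed logical backup with each tool could look roughly like this (credentials and paths are placeholders):

# mysqldump: consistent dump of all databases, compressed on the fly
mysqldump --single-transaction --routines --triggers --all-databases | gzip > /backups/full_dump.sql.gz

# mydumper: multi-threaded, consistent, compressed backup
mydumper --user=backup --password=secret --compress --threads=4 --outputdir=/backups/mydumper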

Logical backups are used to address data corruption or the need to restore a subset of tables.

Physical (Raw) Backup

In short, this consists of exact copies of the database directories and files. It can be a copy of all or part of the MySQL datadir. This kind of backup is mostly used to restore or create a new replica node easily and quickly, and it addresses host failure. It’s recommended to restore using the same MySQL version. I recommend using Percona XtraBackup because it can also include related files, such as the cnf configuration files.
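
A minimal Percona XtraBackup flow, for illustration (directories and credentials are placeholders):

# take the physical backup
xtrabackup --backup --user=backup --password=secret --target-dir=/backups/full

# prepare it so the data files are consistent and ready to be restored
xtrabackup --prepare --target-dir=/backups/full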

Snapshot Backups

Some file system implementations enable “snapshots” to be taken. These provide logical copies of the file system at a given point in time, without requiring a physical copy of the entire file system. MySQL itself does not provide the capability for taking file system snapshots but it is available using third-party solutions such as LVM or ZFS.

The disadvantage is that sometimes physical backups do not compress much, because data is usually in a binary format and sometimes the table is already compressed.

Binary Log Backups

Binlog backups specifically address RPO. Binary log files contain records of each SQL query executed that made changes.

From MySQL 5.6 on, you can use mysqlbinlog to stream binary logs from a remote server. You can combine binlog backups with Percona XtraBackup or mydumper backup to allow restoration up to the end of the most-recently-backed-up binary log.
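
A sketch of remote binlog streaming (the host, credentials, and starting binlog file are placeholders):

# continuously stream binary logs from the source server into the current directory
mysqlbinlog --read-from-remote-server --host=source-db --user=backup --password=secret \
            --raw --stop-never mysql-bin.000001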

Incremental / Differential Backups

An incremental backup is a backup of everything that has changed since the last backup (a binary log backup is a special case of an incremental backup). This is a very good option if the dataset size is huge, as you can take a full backup at the beginning of the week and run incremental backups per day. Also, the backup size is smaller than the full backup.

The main risks associated with incremental backups are:

– A single corrupt incremental backup may invalidate all the others

– Incremental backups typically negatively affect the RTO

A differential backup copies the changes since your last full backup. The advantage is that a lot of data does not change from one backup to the next, so the result can be significantly smaller backups, which saves disk space.

Percona XtraBackup supports both incremental and differential backups.
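
An incremental run with Percona XtraBackup could look like this sketch (directories are placeholders):

# full backup taken at the beginning of the week
xtrabackup --backup --target-dir=/backups/full

# daily incremental backup containing only the changes since the full backup
xtrabackup --backup --target-dir=/backups/inc1 --incremental-basedir=/backups/full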

Offsite Storage

It’s highly recommended to copy backups taken with any of these methods to another place, like the cloud or an external file server, so that in case of host failure or data center failure you have another copy.

Not all the backup files need to be uploaded to the cloud; sometimes the time needed to download them is longer than the time you can afford for the recovery process.

A good approach is to keep the last one to seven days of backups locally on the backup server in case a fast recovery is needed; the exact number depends on your business requirements.

Encryption

Backups contain sensitive data, so it’s highly recommended to encrypt them, especially for offsite storage. This adds more time when you need to restore a backup, but it keeps your data safe.

GPG is a good option for encrypting backups. Whether you use it or some other alternative, don’t forget to keep a safe copy of the keys/passphrase. If you lose it, your backups will be useless.
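
For example, a backup file can be symmetrically encrypted and later decrypted with GPG like this (file names are placeholders):

# encrypt the backup with a passphrase using AES256
gpg --symmetric --cipher-algo AES256 --output full_dump.sql.gz.gpg full_dump.sql.gz

# decrypt it when a restore is needed
gpg --decrypt --output full_dump.sql.gz full_dump.sql.gz.gpg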

Restore Testing

Depending on your business, it’s highly recommended to test your backups at least once per month. This validates that your backups are not corrupted and provides critical metrics on recovery time. The process should be automated: take the full backup, restore it, and finally configure the restored server as a replica of the current primary or of another replica. This is also a good way to validate that the replication process has no errors.

Many customers are using this methodology to refresh their QA/STG environment to have fresh data from production backups.

In addition to the above, it is recommended to create manual or automated restore documentation that keeps all the steps together, so that in case of disaster you can follow it without wasting time.

Retention Requirements

Last but not least, it is very important to keep multiple copies of different backup types.

Our best recommendation is:

  • One or two physical backups locally on the backup server (as long as space allows it).
  • Seven daily and four weekly logical backups locally on the backup server.
  • 30 days of binlog backups locally on the backup server.
  • For offsite backups (like S3, Google Cloud, etc.), keep monthly backups for one year or more.

For local backups, keep in mind you will need a minimum of 2.5 times the current dataset size as free disk space to save/meet these retention policies. Don’t forget to encrypt all the backup types!

Legal or regulatory requirements may also dictate how long data must be archived.

Percona Can Help

Percona can help you choose, implement, and optimize the most appropriate MySQL backup and recovery solution for your MySQL ecosystem. If your current solution unexpectedly fails, we can facilitate your recovery with onsite, remote, or emergency consulting services. We can also help you take steps to prevent another occurrence. Every situation is unique and we will work with you to create the most effective solution for your business.

Contact Us

Jan
12
2021
--

Webinar January 26: Using Percona Monitoring and Management to Identify and Troubleshoot Problematic MySQL Queries


Join us as Michael Coburn, Percona Product Manager, discusses two methods to identify and troubleshoot problematic MySQL queries using the RED Method and Percona Monitoring and Management (PMM) Query Analytics. He will also highlight specific Dashboards in PMM that visualize the rate, errors, and duration of MySQL events that may be impacting the stability and performance of your database instance.

Please join Michael Coburn, Product Manager, Percona, on Tuesday, January 26th, 2021 at 2:30 pm for his webinar “Using Percona Monitoring and Management to Identify and Troubleshoot Problematic MySQL Queries”.

Register for Webinar

If you can’t attend, sign up anyway and we’ll send you the slides and recording afterward.

Jan
12
2021
--

The Preview of Database as a Service (DBaaS) in Percona Monitoring and Management is Now Live!


This week we officially kick-off the Preview of Database as a Service (DBaaS) in Percona Monitoring and Management. We are still looking for users to test and provide feedback during this year-long program, and we would love you to participate! 


Our vision is to deliver a truly open source solution that won’t lock you in. A single pane of glass to easily manage your open source database infrastructure, and a self-service experience enabling fast and consistent open source database deployment. 

Our goal is to deliver the enterprise benefits our customers are looking for, including:

  • A single interface to deploy and manage your open source databases on-premises, in the cloud, or across hybrid and multi-cloud environments.
  • The ability to configure a database once and deploy it anywhere. 
  • Critical database management operations, such as backup, recovery, and patching.
  • Enhanced automation and advisory services that allow you to find, eliminate, and prevent outages, security issues, and slowdowns. 
  • A viable alternative to public cloud and large enterprise database vendor DBaaS offerings, allowing you to eliminate vendor lock-in.

Percona applies a user-driven product development process. So, we hope our user community will get involved in the Preview of Database as a Service (DBaaS) in Percona Monitoring and Management and help inform the design and development of this new software functionality.

The Preview is a year-long program consisting of four phases. Each three-month phase will focus on a different area of participant feedback. Preview participants can be involved in as many phases as they like.


Phase One Details for Interested Participants

Phase one will focus on:

  1. Gathering feedback that allows us to understand the applicable user personas and the goals and objectives of their day-to-day roles.
  2. Gathering feedback on the user experience, specifically involving creating, editing, and deleting database clusters and the databases within those clusters. We will also gather feedback on the management of those clusters and the monitoring of added database servers and nodes.

We are starting with a focus on database deployment and management features, as they help users improve their productivity. 

Other details to note…

  • Phase one of the Preview will run until April 2021
  • Phase one requires around 10 hours of self-paced activities, facilitated through the Percona Forum
    • All Community Preview participant feedback will be captured within the Percona Forum
    • Community Preview participant questions will be facilitated through the Percona Forum.

So make sure to sign up to participate in the Preview of Database as a Service (DBaaS) in Percona Monitoring and Management and become a crucial participant in this initiative, helping shape future users’ experience as we develop and test this new software functionality! 

Register Now!

Jan
11
2021
--

MySQL Group Replication – How to Elect the New Primary Node


In this blog, I am going to explain the different ways of electing the PRIMARY node in MySQL group replication. Before MySQL 8.0.2, primary election was based on the member’s UUID, with the lowest UUID elected as the new primary in the event of a failover.

From MySQL 8.0.2: We can select the node to be promoted to PRIMARY using the server weight (group_replication_member_weight). The election happens when the current primary node fails.

From MySQL 8.0.13: We can promote any node to PRIMARY using the function “group_replication_set_as_primary”. This can be done at any time, without any node failure.

Scenario:

I have installed a three-node group replication cluster using Percona Server for MySQL 8.0.22.

mysql> select member_host,member_state,member_role,member_version from performance_schema.replication_group_members;
+---------------+--------------+-------------+----------------+
| member_host   | member_state | member_role | member_version |
+---------------+--------------+-------------+----------------+
| 172.28.128.15 | ONLINE       | SECONDARY   | 8.0.22         |
| 172.28.128.14 | ONLINE       | PRIMARY     | 8.0.22         |
| 172.28.128.13 | ONLINE       | SECONDARY   | 8.0.22         |
+---------------+--------------+-------------+----------------+
3 rows in set (0.03 sec)

I am planning to take the current PRIMARY node “172.28.128.14” into maintenance for OS patching. When I bring the current PRIMARY node down, I need to promote the server “172.28.128.15” to PRIMARY member. Let’s see the ways this can be achieved.

  • Using server weight (group_replication_member_weight)
  • Using function “group_replication_set_as_primary”

Using server weight (group_replication_member_weight):

This approach is not immediate: the new node is promoted to primary member only when the current primary member goes down. Currently, I have the same weight on all my nodes.

[root@innodb1 ~]# mysql -e "select @@hostname, @@group_replication_member_weight\G"
*************************** 1. row ***************************
                       @@hostname: 172.28.128.13
@@group_replication_member_weight: 50

[root@innodb2 ~]#  mysql -e "select @@hostname, @@group_replication_member_weight\G"
*************************** 1. row ***************************
                       @@hostname: 172.28.128.14
@@group_replication_member_weight: 50

[root@innodb3 ~]# mysql -e "select @@hostname, @@group_replication_member_weight\G"
*************************** 1. row ***************************
                       @@hostname: 172.28.128.15
@@group_replication_member_weight: 50

I am going to increase the weight on the server “172.28.128.15” so that it will be elected as the PRIMARY member when “172.28.128.14” is taken down.

At “172.28.128.15”,

mysql> set global group_replication_member_weight = 70;
Query OK, 0 rows affected (0.00 sec)

mysql> select @@group_replication_member_weight;
+-----------------------------------+
| @@group_replication_member_weight |
+-----------------------------------+
|                                70 |
+-----------------------------------+
1 row in set (0.00 sec)

Note: To set the weight, you don’t need to execute the STOP/START GROUP_REPLICATION.
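
If you also want the new weight to survive a MySQL restart, MySQL 8.0 lets you persist it instead of setting it only at runtime; for example:

mysql> set persist group_replication_member_weight = 70;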

The weight was increased to 70 on the server “172.28.128.15”. Now, I am going to bring down the current PRIMARY node “172.28.128.14”.

At “172.28.128.14”,

mysql> stop group_replication;
Query OK, 0 rows affected (4.29 sec)

mysql> select member_host,member_state,member_role,member_version from performance_schema.replication_group_members;
+---------------+--------------+-------------+----------------+
| member_host   | member_state | member_role | member_version |
+---------------+--------------+-------------+----------------+
| 172.28.128.14 | OFFLINE      |             |                |
+---------------+--------------+-------------+----------------+
1 row in set (0.01 sec)

The node has left the cluster.

At “172.28.128.15”,

mysql> select member_host,member_state,member_role,member_version from performance_schema.replication_group_members;
+---------------+--------------+-------------+----------------+
| member_host   | member_state | member_role | member_version |
+---------------+--------------+-------------+----------------+
| 172.28.128.15 | ONLINE       | PRIMARY     | 8.0.22         |
| 172.28.128.13 | ONLINE       | SECONDARY   | 8.0.22         |
+---------------+--------------+-------------+----------------+
2 rows in set (0.04 sec)

You can see that “172.28.128.15” was selected as a new PRIMARY node.

Using function “group_replication_set_as_primary”:

This method is very straightforward, and there is no need to take down the current PRIMARY node to switch the primary member.

+---------------+--------------+-------------+----------------+
| member_host   | member_state | member_role | member_version |
+---------------+--------------+-------------+----------------+
| 172.28.128.15 | ONLINE       | SECONDARY   | 8.0.22         |
| 172.28.128.14 | ONLINE       | PRIMARY     | 8.0.22         |
| 172.28.128.13 | ONLINE       | SECONDARY   | 8.0.22         |
+---------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

Now, we need to execute the function “group_replication_set_as_primary” with the UUID of the member to be promoted.

At “172.28.128.15”,

mysql> show global variables like 'server_uu%';
+---------------+--------------------------------------+
| Variable_name | Value                                |
+---------------+--------------------------------------+
| server_uuid   | c5aed435-d58d-11ea-bb26-5254004d77d3 |
+---------------+--------------------------------------+
1 row in set (0.00 sec)

mysql> select group_replication_set_as_primary('c5aed435-d58d-11ea-bb26-5254004d77d3');
+--------------------------------------------------------------------------+
| group_replication_set_as_primary('c5aed435-d58d-11ea-bb26-5254004d77d3') |
+--------------------------------------------------------------------------+
| Primary server switched to: c5aed435-d58d-11ea-bb26-5254004d77d3         |
+--------------------------------------------------------------------------+
1 row in set (1.03 sec)

mysql> select member_host,member_state,member_role,member_version from performance_schema.replication_group_members;
+---------------+--------------+-------------+----------------+
| member_host   | member_state | member_role | member_version |
+---------------+--------------+-------------+----------------+
| 172.28.128.15 | ONLINE       | PRIMARY     | 8.0.22         |
| 172.28.128.14 | ONLINE       | SECONDARY   | 8.0.22         |
| 172.28.128.13 | ONLINE       | SECONDARY   | 8.0.22         |
+---------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

“172.28.128.15” was elected as a PRIMARY node.

Note: You can execute the function “group_replication_set_as_primary” on any server of the cluster.

This approach is very useful: whether you want to switch the PRIMARY member to a better-configured server or for any other reason, it will be greatly helpful.

Jan
11
2021
--

Full Read Consistency Within Percona Kubernetes Operator for Percona XtraDB Cluster


The aim of Percona Kubernetes Operator for Percona XtraDB Cluster is to be a special type of controller introduced to simplify complex deployments. The Operator extends the Kubernetes API with custom resources. The Operator solution uses Percona XtraDB Cluster (PXC) under the hood to provide a highly available, resilient, and scalable MySQL service in the Kubernetes space.

This solution comes with all the advantages/disadvantages provided by Kubernetes, plus some advantages of its own like the capacity to scale reads on the nodes that are not Primary.

Of course, there are some limitations, like the way PXC handles DDLs, which may impact the service, but there is always a cost to pay to get something; expecting to have it all for free is unreasonable.

In this context, we need to cover what full read consistency means in this solution and why it is important to understand the role it plays.

Stale Reads

When using Kubernetes we should talk about the service and not about the technology/product used to deliver such service. 

In our case, the Percona Operator is there to deliver a MySQL service. We should then see it as a whole, as a single object. To be clear, what we must consider is NOT the fact that we have a cluster behind the service, but that we have a service that, to be resilient and highly available, uses a cluster.

We should not care if a node/pod goes down unless the service is discontinued.

What we have as a plus in the Percona Operator solution is a certain level of READ scalability. This is achieved by optimizing the use of the non-PRIMARY nodes: instead of having them sit there only applying replicated data, the Percona Operator provides access to them to scale the reads.

But… there is always a BUT, right?

Let us start with an image:

 

By design, apply and commit finalization in Galera (PXC) may have (and does have) a delay between nodes. This means that, with the default settings, applications may get inconsistent reads if they access the data from a node other than the Primary.

The Operator provides access using two different solutions:

  • Using HAProxy (default)
  • Using ProxySQL

 

 

When using HAProxy you will have 2 entry points:

  • cluster1-haproxy, which will point to the Primary ONLY, for reads and writes. This is the default entry point for the applications to the MySQL database.
  • cluster1-haproxy-replicas, which will point to all three nodes and is supposed to be used for READS only. This is the PLUS you can use if your application has READ/WRITE separation.

Please note that at the moment there is nothing preventing an application from using cluster1-haproxy-replicas also for writes, but that is dangerous and wrong, because it will generate a lot of certification conflicts and BF aborts, given that it will distribute writes all over the cluster, impacting performance as well (and not giving you any write scaling):

 

[marcotusa@instance-1 ~]$ for i in `seq 1 100`; do mysql -h cluster1-haproxy-replicas -e "insert into test.iamwritingto values(null,@@hostname)";done
+----------------+-------------+
| host           | count(host) |
+----------------+-------------+
| cluster1-pxc-1 |          34 |
| cluster1-pxc-2 |          33 |
| cluster1-pxc-0 |          33 |
+----------------+-------------+

When using ProxySQL, there is a single entry point, but you may define query rules to automatically split the R/W requests coming from the application. This is the preferred method when an application has no way to separate reads from writes.

Here I have done a comparison of the two methods, HAProxy and ProxySQL.

Now, as mentioned above, by default PXC (like any Galera-based solution) comes with some relaxed settings for performance purposes. This is normally fine in many standard cases, but if you use the Percona Operator and take advantage of the PLUS of scaling reads, using the second access point with HAProxy or query rules with ProxySQL, you should NOT have stale reads, given that the service must provide consistent data, as if you were acting on a single node.

To achieve that, you can change the defaults by setting the PXC parameter wsrep_sync_wait.

When changing the parameter wsrep_sync_wait as explained in the documentation, the node initiates a causality check, blocking incoming queries while it catches up with the cluster. 

Once all data on the node receiving the READ request is commit_finalized, the node performs the read.
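
If you decide to enable it, the parameter is dynamic and can be set globally or only for the sessions that need it; a minimal sketch:

mysql> set global wsrep_sync_wait = 3;    -- all new sessions
mysql> set session wsrep_sync_wait = 3;   -- only the current (read) session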

But this has a performance impact, as said before.

What Is The Impact?

To test the performance impact I had used a cluster deployed in GKE, with these characteristics:

  • 3 Main nodes n2-standard-8 (8 vCPUs, 32 GB memory)
  • 1 App node n2-standard-8 (8 vCPUs, 32 GB memory)
  • PXC pods using:
    •  25GB of the 32 available 
    • 6 CPU of the 8 available
  • HAProxy:
    • 600m CPU
    • 1GB RAM
  • PMM agent
    • 500m CPU
    • 500 MB Ram

In the application node, I used sysbench running two instances, one in r/w mode the other only reads. Finally, to test the stale read, I used the stale read test from my test suite.

Given I was looking for results with a moderate load, I just used 68/96/128 threads per sysbench instance. 

Results

So, did we have stale reads or not? Yes, we did:

I had from 0% (with a very light load) up to 37% stale reads with a MODERATE load, where moderate means the 128-thread sysbench run.

Setting wsrep_sync_wait=3, of course, gave me full consistency. But I had a performance loss:

As you can see, I had an average loss of 11% in case of READS:

While for writes the average loss was 16%. 

Conclusions

At this point, we need to stop and think about what is worth doing. If my application is READ-heavy and relies on READ scaling, it is probably worth enabling full synchronicity, given that scaling on the additional nodes allows me to serve 2x or more READs.

If instead my application is write-critical, losing ~16% performance is probably not acceptable.

Finally, if my application is tolerant of stale reads, I will just go with the defaults and get all the benefits without penalties.

Also keep in mind that Percona Kubernetes Operator for Percona XtraDB Cluster is designed to offer a MySQL service, so the state of a single node is not as critical as in a default PXC installation; pods are by nature ephemeral objects, while the service is resilient.

References

Percona Kubernetes Operator for Percona XtraDB Cluster

https://github.com/Tusamarco/testsuite

https://en.wikipedia.org/wiki/Isolation_(database_systems)#Dirty_reads

https://galeracluster.com/library/documentation/mysql-wsrep-options.html#wsrep-sync-wait

https://www.slideshare.net/lefred.descamps/galera-replication-demystified-how-does-it-work

Jan
11
2021
--

Percona Kubernetes Operator for Percona XtraDB Cluster: HAProxy or ProxySQL?


Percona Kubernetes Operator for Percona XtraDB Cluster comes with two different proxies, HAProxy and ProxySQL. While the initial version was based on ProxySQL, in time, Percona opted to set HAProxy as the default proxy for the operator, without removing ProxySQL.

While one of the main goals was to guarantee users 1:1 compatibility with vanilla MySQL in the way the operator handles connections, other factors were also involved in the decision to have two proxies. In this article, I will scratch the surface of why.

Operator Assumptions

When working with the Percona Operator, there are few things to keep in mind:

  • Each deployment has to be seen as a single MySQL service, as if it were a single MySQL instance
  • The technology used to provide the service may change in time
  • Pod resiliency is not guaranteed, service resiliency is
  • Resources to be allocated are not automatically calculated and must be identified at the moment of the deployment
  • In production, you cannot use more than 5 or fewer than 3 nodes when using PXC

There are two very important points in the list above.

The first one is that what you get IS NOT a Percona XtraDB Cluster (PXC), but a MySQL service. The fact that Percona at the moment uses PXC to cover the service is purely accidental and we may decide to change it anytime.

The other point is that the service is resilient while the pod is not. In short, you should expect to see pods stop working and be re-created. What should NOT happen is the service going down. Trying to debug each minor issue per node/pod is not what is expected when you use Kubernetes.

Given the above, review your expectations… and let us go ahead. 

The Plus in the Game (Read Scaling)

As said, what is offered with Percona Operator is a MySQL service. Percona has added a proxy on top of the nodes/pods that help the service to respect the resiliency service expectations. There are two possible deployments:

  • HAProxy
  • ProxySQL

Both allow optimizing one aspect of the Operator, which is read scaling. In fact, what we were thinking was: given we must use a (virtually synchronous) cluster, why not take advantage of that and allow reads to scale to the other nodes when available?

This approach helps everyone using the Operator to get the standard MySQL service, but with a plus.

But with it also come some possible issues, like READ/WRITE splitting and stale reads. See this article about stale reads for how to deal with them.

For R/W splitting, we instead have a totally different approach depending on which kind of proxy is implemented.

If using HAProxy, we offer a second entry point that can be used for READ operations. That entry point will balance the load across all the available nodes.

Please note that at the moment there is nothing preventing an application from also using the cluster1-haproxy-replicas entry point for writes. That is dangerous and wrong, because it will generate a lot of certification conflicts and brute-force (BF) aborts, and because distributing writes all over the cluster impacts performance as well (without giving you any write scaling). It is your responsibility to guarantee that only READS go through that entry point.
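As a minimal illustration (the service names follow the default naming for a cluster called cluster1; user, password, and schema are placeholders), the two entry points would be used like this:

# Writes (and any read that must be strictly consistent) go to the main entry point:
mysql -h cluster1-haproxy -P 3306 -u app -p'app_password' -e "INSERT INTO test.t1 (id) VALUES (1)"

# Read-only traffic can use the replicas entry point, which balances across all available nodes:
mysql -h cluster1-haproxy-replicas -P 3306 -u app -p'app_password' -e "SELECT COUNT(*) FROM test.t1"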

If instead ProxySQL is in use, it is possible to implement automatic R/W splitting. 
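For reference only, R/W splitting in ProxySQL is normally driven by query rules along these lines; the hostgroup IDs and admin credentials below are assumptions for illustration, not the values the operator actually provisions.

# Generic ProxySQL query rules for R/W splitting (hostgroup 10 = writer, 20 = readers are assumed IDs):
mysql -h 127.0.0.1 -P 6032 -u admin -padmin <<'SQL'
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
VALUES (1, 1, '^SELECT .* FOR UPDATE$', 10, 1),
       (2, 1, '^SELECT', 20, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
SQL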

Global Difference and Comparison

At this point, it is useful to have a better understanding of the functional difference between the two proxies, and of the performance difference, if any.

As we know, HAProxy acts as a level 4 proxy when operating in TCP mode. It is also a forward proxy, which means each TCP connection is established by the client directly with the final target, and there is no interpretation of the data flow.

ProxySQL, on the other hand, is a level 7 proxy and a reverse proxy: the client establishes a connection to the proxy, which presents itself as the final backend. Data can be altered on the fly while in transit.

To be honest, it is more complicated than that, but allow me the simplification.

On top of that, there are additional functionalities present in one (ProxySQL) and not in the other. The point is whether they are relevant for use in this context or not. For a short list, see the comparison below (the source is the ProxySQL blog, but the data was removed):

As you may have noticed, HAProxy lacks some of those functionalities, such as R/W splitting, firewalling, and caching, which belong to the level 7 layer implemented in ProxySQL.

The Test Environment

To test the performance impact, I used a cluster deployed in GKE with these characteristics:

  • 3 Main nodes n2-standard-8 (8 vCPUs, 32 GB memory)
  • 1 App node n2-standard-8 (8 vCPUs, 32 GB memory)
  • PXC pods using:
    • 25 GB of the 32 available
    • 6 CPUs of the 8 available
  • HAProxy:
    • 600m CPU
    • 1GB RAM
  • PMM agent
    • 500m CPU
    • 500 MB RAM
  • Tests using sysbench (https://github.com/Tusamarco/sysbench); see GitHub for the command details.

What I did was run several tests with two sysbench instances: one executing only reads, the other executing reads and writes.

In the case of ProxySQL, I had R/W splitting thanks to the query rules, so both sysbench instances pointed to the same address. When testing HAProxy, I used two entry points:

  • cluster1-haproxy – for reads and writes
  • cluster1-haproxy-replicas – for reads only

Then I also compared what happens if all requests hit one node only. For that, I executed one sysbench instance in R/W mode against a single entry point, with NO R/W split for ProxySQL.

Finally, sysbench tests were executed with the --reconnect option to force the tests to establish new connections.
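The exact commands are in the repository linked above; as a rough sketch of the pattern, using the stock sysbench OLTP scripts in place of the customized ones (hosts, credentials, and sizes are placeholders):

# Read/write instance against the main entry point (or the single ProxySQL endpoint):
sysbench oltp_read_write --mysql-host=cluster1-haproxy --mysql-user=sbtest --mysql-password=sbtest \
  --tables=10 --table-size=1000000 --threads=64 --time=600 --reconnect=50 run

# Read-only instance against the replicas entry point (same ProxySQL endpoint when query rules split the traffic):
sysbench oltp_read_only --mysql-host=cluster1-haproxy-replicas --mysql-user=sbtest --mysql-password=sbtest \
  --tables=10 --table-size=1000000 --threads=64 --time=600 --reconnect=50 run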

As usual, tests were executed multiple times, on different days of the week and moments of the day. Data reported is a consolidation of that, and images from Percona Monitoring and Management (PMM) are samples coming from the execution that was closest to the average values. 

Comparing Performance When Scaling Reads

In these tests, one node mainly serves writes while the others serve reads. To avoid affecting performance, and given I was not interested in maintaining full read consistency, the parameter wsrep_sync_wait was kept at its default (0).
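For completeness, this is how you would check the setting, or enforce causality checks per session when a client really needs consistent reads (host and credentials are placeholders):

# wsrep_sync_wait=0 (default) favors throughput; 1 adds a causality check before reads, at a latency cost.
mysql -h cluster1-haproxy -u app -p'app_password' -e "SHOW GLOBAL VARIABLES LIKE 'wsrep_sync_wait'"
mysql -h cluster1-haproxy-replicas -u app -p'app_password' -e "SET SESSION wsrep_sync_wait = 1; SELECT COUNT(*) FROM test.t1"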

[Charts: requests served under increasing load, HAProxy vs ProxySQL]

A first observation shows how ProxySQL seems to keep a more stable level of requests served. The increasing load penalizes HAProxy, reducing the number of operations served at 1024 threads.

[Charts: HAProxy vs ProxySQL, write and read comparison]

Digging a bit more, we can see that HAProxy performs much better than ProxySQL for WRITE operations. The number of writes remains almost steady, with minimal fluctuations. ProxySQL, on the other hand, performs great when the write load is low, then performance drops by 50%.

For reads, we have the opposite. ProxySQL is able to scale in a very efficient way, distributing the load across the nodes and maintaining the level of service despite the load increase.

If we start to take a look at the latency distribution statistics (sysbench histogram information), we can see that:

[Charts: latency distribution, HAProxy vs ProxySQL]

In the case of low load and writes, both proxies stay on the left side of the graph with low values in ms. HAProxy is a bit more consistent, grouped around the 55ms value, while ProxySQL is a bit more sparse, spanning between 190-293ms.

For reads we have similar behavior, with the large majority of both between 28-70ms. The picture changes when the load increases:

ProxySQL has some occurrences where it performs better, but it spans a very large range, from ~2k ms to ~29k ms, while HAProxy is substantially grouped around 10-11k ms. As a result, in this context, HAProxy serves writes under heavy load better than ProxySQL.

Again, a different picture in case of reads.

Here ProxySQL still spans a wide range, ~76ms-1500ms, while HAProxy is more consistent but less efficient, with the majority of the service grouped around 1200ms. This is consistent with the performance loss we saw for READS under high load with HAProxy.

Comparing When Using Only One Node

But let us now see what happens when using only one node, i.e. using the service as intended, without the possible plus of read scaling.

[Charts: HAProxy vs ProxySQL, single entry point]

The first thing I want to mention is a strange behavior that was consistently happening (no matter which proxy was used) at 128 threads. I am investigating it, but I do not have a good answer yet for why the Operator had that significant drop in performance ONLY with 128 threads.

Aside from that, the results consistently showed HAProxy performing better at serving reads/writes. Keep in mind that HAProxy just establishes the connection point-to-point and does nothing else, while ProxySQL is designed to potentially act on the incoming data stream.

This becomes even more evident when reviewing the latency distribution. In this case, no matter what load we have, HAProxy performs better:

As you can see, HAProxy is less tightly grouped than when we have two entry points, but it is still able to serve more efficiently than ProxySQL.

Conclusions

As usual, my advice is to use the right tool for the job, and do not force yourself into something stupid. And as clearly stated at the beginning, Percona Kubernetes Operator for Percona XtraDB Cluster is designed to provide a MySQL SERVICE, not a PXC cluster, and all the configuration and utilization should converge on that.

ProxySQL can help you IF you want to scale a bit more on READS using the possible plus, but this is not guaranteed to work the way it does with standard PXC, and you need a very good understanding of Kubernetes and ProxySQL to avoid issues. With HAProxy you can scale reads as well, but you must be sure you have R/W separation at the application level.

In any case, utilizing HAProxy for the service is the easier way to go. This is one of the reasons why Percona decided to shift to HAProxy. It is the solution that offers the proxy service more in line with the aim of the Kubernetes service concept. It is also the solution that remains closer to how a simple MySQL service should behave.

You need to set your expectations correctly to avoid being in trouble later.

References

Percona Kubernetes Operator for Percona XtraDB Cluster

 

Wondering How to Run Percona XtraDB Cluster on Kubernetes? Try Our Operator!

The Criticality of a Kubernetes Operator for Databases

 

Jan
08
2021
--

Streaming MySQL Backups with Percona XtraBackup – Another Alternative

Today we are going to present another way to make a database copy between servers using our Percona XtraBackup utility. How does it differ from the many others already out there? Not much, but it is interesting in terms of performance and availability.

We will combine the xbstream utility with the power of pigz and socat to take advantage of multi-processing when multiple processors are available and, at the same time, reduce network bandwidth usage in cases where that is the bottleneck. So let's explain each component:

socat: This stands for SOcket CAT. It is a utility for data transfer between two addresses.

What makes socat so versatile is the fact that an address can represent a network socket, any file descriptor, a Unix domain datagram or stream socket, TCP and UDP (over both IPv4 and IPv6), SOCKS 4/4a over IPv4/IPv6, SCTP, PTY, datagram, and stream sockets, named and unnamed pipes, raw IP sockets, OpenSSL, or on Linux even any arbitrary network device.

pigz: This stands for parallel implementation of gzip, and it is a fully functional replacement for gzip that exploits multiple processors and multiple cores to the hilt when compressing data.

xbstream: the streaming utility shipped with Percona XtraBackup, which (with parallelism enabled) can process several files in parallel.

Packages required: pigz, socat, and, of course, Percona XtraBackup

Servers Involved in the Task

Source: the source database where the backup will be taken (MySQL 5.7 installed on CentOS 7.8)

Target: the destination where the backup will be sent (MySQL 5.7 installed on CentOS 7.8)

Steps

1. Install the required packages in the Source and Target nodes:

Source # yum install -y pigz socat
Target # yum install -y pigz socat

If you haven't installed Percona XtraBackup (on both servers), please follow the instructions explained in https://www.percona.com/doc/percona-xtrabackup/2.4/index.html#installation

Make sure you have a user with the proper privileges to take the backup on the Source db:

+---------------------------------------------------------------------------+
| Grants for bkpuser@localhost                                              |
+---------------------------------------------------------------------------+
| GRANT RELOAD, PROCESS, REPLICATION CLIENT ON *.* TO 'bkpuser'@'localhost' |
+---------------------------------------------------------------------------+
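If that user does not exist yet, a minimal sketch for creating it on the Source follows; the user name and password are simply the ones used in the xtrabackup command in step 3.

Source # mysql -u root -p -e "CREATE USER 'bkpuser'@'localhost' IDENTIFIED BY 'Bkpuser123!';
GRANT RELOAD, PROCESS, REPLICATION CLIENT ON *.* TO 'bkpuser'@'localhost';"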

2. On Target, execute:

Stop current database service (if any):

Target # systemctl stop mysqld

Delete datadir contents (assuming it was installed with default settings) and make sure you’re logged in to the Target server!!

Target # rm -rf /var/lib/mysql/*

Finally, we’re going to execute the command to receive the backup from the source (Source):

Target # socat -u TCP-LISTEN:4444,reuseaddr stdio | pigz -dc -p 4 - | xbstream -p 4 -x -C /var/lib/mysql
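One detail worth checking for this step: the Target must accept connections from the Source on the chosen port (4444 here, matching the socat listener above). On a default CentOS 7.8 install with firewalld active, that could mean something like the following sketch; this is an assumption about your firewall setup, so adjust as needed.

Target # firewall-cmd --add-port=4444/tcp
Target # firewall-cmd --add-port=4444/tcp --permanent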

3. On Source, execute the command to send the backup to target (Target).

Source # xtrabackup --defaults-file=/etc/my.cnf --backup --user=bkpuser --password=Bkpuser123! --stream=xbstream --parallel 4 --no-timestamp  --target-dir=/tmp | pigz -k -1 -p4 - | socat -u stdio TCP:Target:4444

You will see an output like this:

xtrabackup: recognized server arguments: --datadir=/var/lib/mysql --server-id=1 --log_bin=/var/lib/mysql/mysql-bin --innodb_log_file_size=200M --innodb_log_files_in_group=2 --open_files_limit=65535 --parallel=4
xtrabackup: recognized client arguments: --backup=1 --user=bkpuser --password=* --stream=xbstream --target-dir=/tmp
200822 11:10:16  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup' as 'bkpuser'  (using password: YES).
200822 11:10:16  version_check Connected to MySQL server
200822 11:10:16  version_check Executing a version check against the server...
200822 11:10:16  version_check Done.
200822 11:10:16 Connecting to MySQL server host: localhost, user: bkpuser, password: set, port: not set, socket: not set
Using server version 5.7.30-log
xtrabackup version 2.4.20 based on MySQL server 5.7.26 Linux (x86_64) (revision id: c8b4056)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql
xtrabackup: open files limit requested 65535, set to 65535
xtrabackup: using the following InnoDB configuration:
xtrabackup:   innodb_data_home_dir = .
xtrabackup:   innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 2
xtrabackup:   innodb_log_file_size = 209715200
InnoDB: Number of pools: 1
200822 11:10:16 >> log scanned up to (6724690490)
xtrabackup: Generating a list of tablespaces
InnoDB: Allocated tablespace ID 2 for mysql/plugin, old maximum was 0
xtrabackup: Starting 4 threads for parallel data files transfer
200822 11:10:16 [01] Streaming ./ibdata1
...etc

4. Once finished with step 3, you will see an output like this on the Target node:

...
MySQL binlog position: filename 'mysql-bin.000091', position '102205647'
200822 11:10:21 [00] Streaming <STDOUT>
200822 11:10:21 [00]        ...done
200822 11:10:21 [00] Streaming <STDOUT>
200822 11:10:21 [00]        ...done
xtrabackup: Transaction log of lsn (4308505553) to (4308505562) was copied.
200822 11:10:21 completed OK!

And step 2 will be finished too, so you must execute the following commands on the Target node:

Target # xtrabackup --prepare --use-memory=1G --target-dir=/var/lib/mysql/

From the documentation:

After you made a backup with the xtrabackup --backup option, you'll first need to prepare it in order to restore it. Data files are not point-in-time consistent until they've been prepared, because they were copied at different times as the program ran, and they might have been changed while this was happening. If you try to start InnoDB with these data files, it will detect corruption and crash itself to prevent you from running on damaged data. The xtrabackup --prepare step makes the files perfectly consistent at a single instant in time, so you can run InnoDB on them.

Target # chown -R mysql:mysql  /var/lib/mysql/*
Target # systemctl start mysqld

And you have cloned a new database!

You can, of course, set up this new database as a replica with these additional steps, executed on the Target node:

Look at the content of the file xtrabackup_binlog_info; it will be something like:

Target # cat /var/lib/mysql/xtrabackup_binlog_info

mysql-bin.000091 102205647

(We are assuming the following user/grants already exist in the Source db; if not, create them as shown right after this output.)

mysql> show grants for replicator;
+----------------------------------------------------+
| Grants for replicator@%                            |
+----------------------------------------------------+
| GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%' |
+----------------------------------------------------+
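If the user does not exist yet, a minimal sketch for creating it on the Source (the password simply matches the one used in the CHANGE MASTER command below) is:

Source # mysql -u root -p -e "CREATE USER 'replicator'@'%' IDENTIFIED BY 'R3pl1c4t10n!';
GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%';"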

Connect to the database, and run:

Target # mysql -u root -p

mysql> change master to master_host='Source',master_port=3306,master_user='replicator',master_password='R3pl1c4t10n!',master_log_file='mysql-bin.000091',master_log_pos=102205647;
Query OK, 0 rows affected (0.00 sec)

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

mysql> pager egrep -i "Master_Host|Master_User|Master_Port|file|behind"

mysql> show slave status\G
                  Master_Host: master
                  Master_User: replicator
                  Master_Port: 3306
              Master_Log_File: mysql-bin.000091
               Relay_Log_File: relay.000001
        Relay_Master_Log_File: mysql-bin.000091
               Until_Log_File:
           Master_SSL_CA_File:
        Seconds_Behind_Master: 0
             Master_Info_File: mysql.slave_master_info

1 row in set (0.00 sec)

And that’s all.  Happy streaming!
