Jul 29, 2022

How to Benchmark Replication Performance in MySQL


In this blog, I will cover important aspects that you need to test when benchmarking a replication setup. MySQL has great tools that can be used to test its performance. They include:

sysbench – https://github.com/akopytov/sysbench

BMK-kit – http://dimitrik.free.fr/blog/posts/mysql-perf-bmk-kit.html

mysqlslap – https://dev.mysql.com/doc/refman/8.0/en/mysqlslap.html

LinkBench – https://github.com/facebookarchive/linkbench

I will not describe how to use them here, as you can find instructions on the provided links or in the Percona blog by browsing tags sysbench, benchmark, benchmarks, and the category benchmarks.

However, most of the instructions take into account standalone MySQL servers. If you want to test how your replication setup performs, you may want to perform additional checks.
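For instance, a source-side sysbench run might look like the following (a sketch only: host, credentials, table counts, and durations are placeholder values to adjust for your environment):

# Prepare test tables on the source, then run a write-heavy workload
sysbench oltp_read_write --mysql-host=source-host --mysql-user=sbtest \
  --mysql-password=secret --mysql-db=sbtest --tables=16 --table-size=1000000 prepare
sysbench oltp_read_write --mysql-host=source-host --mysql-user=sbtest \
  --mysql-password=secret --mysql-db=sbtest --tables=16 --table-size=1000000 \
  --threads=32 --time=600 run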

Can the replica catch up to the source server?

To answer this question, you need to run a load on the source machine and monitor the replica delay. You can do this by examining the SHOW REPLICA STATUS (SHOW SLAVE STATUS) output, particularly the value of the Seconds_Behind_Source (Seconds_Behind_Master) parameter. The same parameter is monitored by the “MySQL Replication Lag” graph under the “MySQL Replication Summary” dashboard in Percona Monitoring and Management (PMM).
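If you want to track the lag over the course of a benchmark run rather than checking it manually, a simple shell loop is enough (a minimal sketch; it assumes client credentials are available, for example via ~/.my.cnf, and uses the 8.0.22+ column name):

# Sample the replica lag every five seconds; on versions before 8.0.22,
# grep for Seconds_Behind_Master instead
while true; do
  mysql -e "SHOW REPLICA STATUS\G" | grep Seconds_Behind_Source
  sleep 5
done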

A more advanced check is to compare the Exec_Source_Log_Pos (Exec_Master_Log_Pos), Relay_Log_File pair with the Read_Source_Log_Pos (Read_Master_Log_Pos), Source_Log_File pair, since Seconds_Behind_Source may be affected by long-running commands and return wrong values.

You may tune your replica server by adding parallelization if you see increasing lag. Check also how the binlog_transaction_dependency_tracking option works: https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#sysvar_binlog_transaction_dependency_tracking
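For example, a parallel applier could be enabled on the replica along these lines (a sketch with illustrative values; the right worker count depends on your workload):

-- On the replica: apply transactions with four parallel workers
STOP REPLICA SQL_THREAD;
SET GLOBAL replica_parallel_type = 'LOGICAL_CLOCK';
SET GLOBAL replica_parallel_workers = 4;
SET GLOBAL replica_preserve_commit_order = ON;
START REPLICA SQL_THREAD;
-- On the source: WRITESET dependency tracking usually exposes more parallelism
SET GLOBAL binlog_transaction_dependency_tracking = 'WRITESET';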

Can the replica run queries while applying updates from the source server?

Once you are happy with the replication lag, you may start adding the load you plan to run on this replica and test its performance. For example, if you plan to use the replica for handling read-only queries, you can run a read-only benchmark on it. You need to watch the performance of the read-only benchmark and replication lag.

If you plan to run analytical queries on the replica, test them. If you plan to use this replica for taking backups, take the backup and measure how it affects replication performance.

Synchronous replication

Synchronous replication solutions such as Galera, Percona XtraDB Cluster (PXC), and Group Replication are affected by the flow control paused time. You can use PMM to monitor it.

For PXC/Galera clusters, watch the “Flow Control Paused Time” graph under the “PXC/Galera Cluster Summary” dashboard.

Or you can watch the value of the wsrep_flow_control_paused variable. See Galera Flow Control in Percona XtraDB Cluster for MySQL for more details.

For Group Replication, you need to check the value of the COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE column in the performance_schema.replication_group_member_stats table. It shows how many transactions are waiting in the queue on the secondary node to be applied.
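For example, you could sample the queue size with a query like this:

SELECT MEMBER_ID, COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE
FROM performance_schema.replication_group_member_stats;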

You may also watch the “Replication Lag” graph under the “MySQL Group Replication” dashboard in PMM.

Your best test is your production

As Dimitri Kravtchuk said at FOSDEM 2020, your best benchmark is your production workload.

While it is hard to create such a benchmark on the server that accepts writes from users, for the replica it is comparatively easy. All you need to do is create a replica of your production server and not make it available to the application yet. First, watch whether it can handle the replication lag while idle, then start adding load to it. You may also use the mirroring feature of ProxySQL. Note, however, that you need to check whether the limitations described in this post still apply.

How fast will the replica catch up?

One more thing you may want to test is how fast the replica will recover after a delay in replicating updates from the source server. To perform this test, stop the replica while running the source server’s load, restart it after a certain amount of time, and watch replication lag. If you are testing a replica that serves read-only load, let the load test continue.

It is better to perform stop, wait, and start in one series of commands, so you will have control over how long the replica was stopped. For example, to stop replica for one hour, use the following commands:

STOP REPLICA;
SELECT SLEEP(3600);
START REPLICA;

Then wait until the replica is running:

SELECT SERVICE_STATE FROM performance_schema.replication_applier_status;

Once this query returns ON, start monitoring the value of Seconds_Behind_Source in the SHOW REPLICA STATUS output until it becomes 0. Once that happens, record the time. This way, you will know how much time is needed to apply one hour of changes.
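A small shell sketch of this measurement (illustrative only; it assumes client credentials are configured and uses the 8.0.22+ column name):

# Record the start time, then poll until the replica reports zero lag
# (Seconds_Behind_Source may briefly be NULL while the threads start)
start=$(date +%s)
until [ "$(mysql -e 'SHOW REPLICA STATUS\G' | awk '/Seconds_Behind_Source/ {print $2}')" = "0" ]; do
  sleep 10
done
echo "Replica caught up in $(( $(date +%s) - start )) seconds"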

Conclusion

You can use the same benchmarking tools that are recommended for standalone MySQL servers to perform performance tests against replication setups. You only need to monitor replication lag together with other performance characteristics.

Jul 28, 2022

PostgreSQL for MySQL DBAs Episode 8: Materialized Views


Episode 8 in the PostgreSQL for MySQL DBAs series covers materialized views. MySQL has had views for many years, and they are very useful, but it has never had materialized views. Those who use materialized views appreciate their utility, and here we will cover how they are used. This episode starts with regular, non-materialized views to help those who are not used to using them and then moves on to materialized views so you can see the difference.

Quick recap: Why views?

Views are often used to hide underlying column and table names from users. This obfuscation works well, as the user may have permission to query the view while being denied access to the base tables. Having users standardize on the view to get just the needed columns simplifies things: it handles the joins and presents multiple tables as a single virtual table. The view becomes a ‘common’ shorthand for required data. It can also encapsulate the details of the structure of your tables, which might change as your application evolves, behind a consistent interface.

Views are great places for aggregate values like SUM(), AVG(), etc., to keep the calculations separate from the ‘pure’ data. They take up little space as compared to a table. Views can be built on top of other views, and views can be used like a table.

A basic view

Let’s start with a simple table and corresponding simple data.

test=# create table base (a int, b int, c int);
CREATE TABLE
test=# insert into base values (1,2,3),(4,5,6),(7,8,9);
INSERT 0 3

Our first view draws values from the above table.

test=# create view v1 as SELECT a, b, c*4 from base;
CREATE VIEW

The view returns the values of columns a and b plus the value of c multiplied by four.

test=# select * from v1;
a | b | ?column?
---+---+----------
1 | 2 | 12
4 | 5 | 24
7 | 8 | 36
(3 rows)

I can use this view like a table by using a WHERE clause.

test=# select * from v1 where a > 6;
a | b | ?column?
---+---+----------
7 | 8 | 36
(1 row)

Materialized views

A materialized view is a static snapshot of the data, with the two keywords being ‘static’ and ‘snapshot’. If many users frequently ask for the same data and you do not want them all hitting the base tables, you can use materialized views as a cache. You may be familiar with stock quotation services that only show updated values every five or so minutes. This can take a major load off the database instance that handles requests against the base tables.

test=# create materialized view v2 as SELECT a, b, c*4 from base;
SELECT 3
test=# select * from v2 where a > 6;
a | b | ?column?
---+---+----------
7 | 8 | 36
(1 row)

The keyword MATERIALIZED is added to the view definition.  In the above example, the materialized view uses the same logic as the previous non-materialized view.  This caches the values for later use.

But be careful.  This materialized view is a snapshot of a particular point in time.  Why be careful?  Consider the following situation where we update a value in the table.

test=# update base set b=99 where a = 7;
UPDATE 1
test=# select * from v2 where a > 6;
a | b | ?column?
---+---+----------
7 | 8 | 36
(1 row)

The value for b is updated in the base table; however, the materialized view is not updated.

But it is easy to ‘update’ the materialized view with the REFRESH keyword.

test=# refresh materialized view v2;
REFRESH MATERIALIZED VIEW
test=# select * from v2 where a > 6;
a | b | ?column?
---+----+----------
7 | 99 | 36
(1 row)

Materialized views are a great way to cache answers but you have to be aware that the data may be out of date.
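One refinement worth knowing: PostgreSQL can refresh a materialized view without blocking readers if the view has a unique index. A quick sketch using our v2 example (the index name is illustrative):

test=# create unique index v2_a_idx on v2 (a);
CREATE INDEX
test=# refresh materialized view concurrently v2;
REFRESH MATERIALIZED VIEW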

Need materialized views for MySQL today?

MySQL and MariaDB do not have materialized views.  You could make a feature request that expresses your desire for this feature with those vendors, but you will not see materialized views implemented in the next release or so.

Another option is EdgeDB, which has materialized views along with a columnar storage engine that is compatible with both MySQL and MariaDB.  This builds on the previous work of Justin Swanhart in the areas of materialized views and the WARP storage engine.

Summary

Now you know how materialized views work and how they differ from regular views.

The past videos for PostgreSQL for MySQL Database Administrators (DBA) can be found here: episode one, episode two, episode three, episode four, episode five, episode six, and episode seven.

Jul 27, 2022

A Quick Peek At MySQL 8.0.30


MySQL 8.0 is now over four years old, and Oracle released the latest quarterly offering today with MySQL 8.0.30, which comes with lots of bug fixes and some interesting additions. This is my take on the release notes for those who do not have the time to wade through them; comments in italics are mine and reflect my views alone.

This is an interesting release with a good many bug fixes, and I urge those who need fixes to upgrade as soon as they can. For everyone else, the TL;DR is that unless you are on Oracle Cloud Infrastructure or one of the fixed bugs is causing you problems, you can upgrade at your leisure.

Deprecation and removals

Setting the replica_parallel_workers system variable to 0 is now deprecated; to use single threading, set replica_parallel_workers=1 instead. So 1 is the new zero for this setting?!

The --skip-host-cache server option is deprecated and will be removed in a future release. Use SET GLOBAL host_cache_size = 0 at runtime, or set host_cache_size = 0 at server startup, instead.

New stuff

Tables without primary keys are a big problem for Oracle’s MySQL Database Service, whose cloud expects every table to have one. A previous version of the server added support for invisible primary keys to help with that issue. Now we get Generated Invisible Primary Keys (GIPK) to automatically add an invisible primary key to any InnoDB table without a primary key.

The GIPK column is defined as:

my_row_id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT INVISIBLE PRIMARY KEY

The generated primary key is always named my_row_id; you cannot change this while GIPK mode is turned on. Nor can you use my_row_id as a column name in a CREATE TABLE statement that creates a new InnoDB table, unless the statement includes an explicit primary key. GIPKs are not enabled by default; enable the feature by setting the sql_generate_invisible_primary_key server system variable to ON. Replication is not affected: the setting has no effect on replication applier threads, and a replica never generates a primary key for a replicated table that was not created on the source with a primary key.
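A quick sketch of what this looks like in practice (table and column names are illustrative):

-- Enable GIPK generation for this session, then create a table with no primary key
SET sql_generate_invisible_primary_key = ON;
CREATE TABLE t1 (c1 INT);
-- The generated my_row_id column now appears in the table definition
SHOW CREATE TABLE t1;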

A GIPK can only be altered to toggle its visibility, using ALTER TABLE tbl ALTER COLUMN my_row_id SET VISIBLE and ALTER TABLE tbl ALTER COLUMN my_row_id SET INVISIBLE. GIPKs are visible to SHOW CREATE TABLE and SHOW INDEX plus the information_schema, but can be hidden by setting show_gipk_in_create_table_and_information_schema to OFF. I will have to try a generated invisible primary key based on a generated column just on general principles.

And mysqldump and mysqlpump will skip GIPKs by setting the --skip-generated-invisible-primary-key option. Are these GIPKs skipped on restore and, if not, how much extra load is this going to generate?

AWS keyring

An updated keyring_aws plugin is now available to use the latest AWS Encryption SDK for C version 1.9.186.

SQL syntax notes

There are two new options for REVOKE that let you determine whether a REVOKE statement with issues raises an error or a warning. IF EXISTS causes REVOKE to raise a warning rather than an error when the target user or role does not exist, while IGNORE UNKNOWN USER causes REVOKE to raise a warning instead of an error if the target user or role is not known. There are lots of areas where it would be nice to be able to choose between an error and a warning like this.
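For example (user and database names are hypothetical):

-- With IF EXISTS, a REVOKE that would otherwise fail raises a warning instead
REVOKE IF EXISTS SELECT ON app.* FROM 'reporting'@'%';
-- IGNORE UNKNOWN USER downgrades the unknown-user error to a warning
REVOKE SELECT ON app.* FROM 'ghost_user'@'%' IGNORE UNKNOWN USER;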

XA transaction fix

Previously, replication recovery was not guaranteed when a server node in a replication topology unexpectedly halted while executing XA PREPARE, XA COMMIT, or XA ROLLBACK. MySQL now maintains a consistent XA transaction state across a topology using either MySQL “classic” replication or MySQL Group Replication when a server node is lost from the topology and then rejoins. This also means that the XA transaction state is now propagated so that nodes do not diverge while doing work within a given transaction in the event that a server node halts, recovers, and rejoins the topology. Who does not like better transactions!

InnoDB changes

The innodb_doublewrite system variable gains two new settings, DETECT_ONLY and DETECT_AND_RECOVER. With the DETECT_ONLY setting, database page content is not written to the doublewrite buffer, and recovery does not use the doublewrite buffer to fix incomplete page writes. This lightweight setting is intended for detecting incomplete page writes only. The DETECT_AND_RECOVER setting is equivalent to the existing ON setting.

Redo log capacity can now be configured dynamically: the innodb_redo_log_capacity system variable can be set at runtime to increase or decrease the total amount of disk space occupied by the redo log files. And redo logs get a new home! InnoDB now maintains 32 redo log files in an #innodb_redo directory in the data directory instead of two files in the data directory itself. I wonder how moving these 32 files to another drive would impact performance. And do we really get the disk space back?
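For example, resizing the redo log space becomes a single statement (the value here is arbitrary):

-- Set the total redo log capacity to 8 GiB without a server restart
SET GLOBAL innodb_redo_log_capacity = 8589934592;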

22.04 support

Ubuntu 22.04 is now supported. 

MySQL schema

Host name and user name columns now come first in the primary keys of the mysql.db, mysql.tables_priv, mysql.columns_priv, and mysql.procs_priv tables to avoid full table scans. When you upgrade, these tables are modified in the second step of the MySQL upgrade process.

MyISAM

The myisam_repair_threads system variable and myisamchk --parallel-recover option were removed. The future is InnoDB, but I still have some fond memories of MyISAM and hate to see tools stripped of features.

Who is on first

One of the reasons I write these quick peeks is to keep you from having to read the curious wordings found in the release notes where technical writing collides with work log notes to create items such as the following that I quote directly.

  • Connections whose users have the CONNECTION_ADMIN privilege are not terminated when MySQL Server is set to offline mode, which is done by changing the value of the offline_mode system variable to ON. Previously, checking for connections that had the CONNECTION_ADMIN privilege could cause a race condition because it involved accessing other threads. Now, a flag for each thread caches whether or not the user for the thread has the CONNECTION_ADMIN privilege. The flag is updated if the user privilege changes. When offline mode is activated for the server, this flag is checked for each thread, rather than the security context of another thread. This change makes the operation threadsafe. In addition, when offline mode is activated, connections whose users have the SYSTEM_USER privilege are now only terminated if the user that runs the operation also has the SYSTEM_USER privilege. Users that only have the SYSTEM_VARIABLES_ADMIN privilege, and do not have the SYSTEM_USER privilege, can set the offline_mode system variable to ON to activate offline mode. However, when they run the operation, any sessions whose users have the SYSTEM_USER privilege remain connected, in addition to any sessions whose users have the CONNECTION_ADMIN privilege. This only applies to existing connections at the time of the operation; users with the SYSTEM_USER privilege but without the CONNECTION_ADMIN privilege cannot make new connections to a system in offline mode.

The above is a hybrid of the nightmares I had after business law, symbolic logic, and English class assignments from my distant past. I had to resist making a flow chart when reading it.

Performance Schema

Performance Schema provides instrumentation for performance monitoring of Group Replication memory usage. To get a peek, use:

SELECT NAME, ENABLED FROM performance_schema.setup_instruments WHERE NAME LIKE 'memory/group_rpl/%';

Mainframe news

Added a cycle timer for the s390x architecture.

MySQL 8.0.31 should be out in late October.

Jul 25, 2022

Managing MySQL Configurations with the PXC Kubernetes Operator V1.10.0 Part 1: Introduction


Introduction/FAQ

Question: I need to run a production-grade open source MySQL DB.

Answer: Percona to the rescue! Percona XtraDB Cluster (PXC) is an open source enterprise MySQL solution that helps you ensure data availability for your applications while improving security and simplifying the development of new applications in the most demanding public, private, and hybrid cloud environments.

Question: I forgot to mention that I need to run it on Kubernetes.

Answer: Percona to the rescue again! Percona Distribution for MySQL Operator based on Percona XtraDB Cluster contains everything you need to quickly and consistently deploy and scale Percona XtraDB Cluster instances in a Kubernetes-based environment on-premises or in the cloud.

Question: I have a lot of MySQL configurations to manage.

Answer: The PXC Operator makes it easy to manage MySQL configurations. Let’s explore.

For the rest of the article, the name of the PXC cluster is assumed to be cluster1, and this can be modified based on the user preference.

How can I change the MySQL configurations?

If you have not done it already, the first thing to do is install the PXC operator. Our Quickstart guide gives detailed instructions on how to get started. 

There are three possible ways to modify the MySQL configurations as described in the Documentation for MySQL options:

  1. Custom Resource PerconaXtraDBCluster (pxc/pxcs/perconaxtradbclusters)
  2. Config map with name cluster1-pxc
  3. Secret with name cluster1-pxc

Which option should I choose for managing configurations?

The choice of using the above options depends on the use case and the user’s preferences.

Following are some examples:

Using ConfigMap

  1. If the MySQL configuration is pretty big, and/or if you want to maintain the configuration separately rather than updating everything in the PXC object.
  2. If you want to provide permission to change MySQL configurations but not the other properties of PXC objects like resources, affinity, etc. K8s RBAC can be used to achieve this: a Role/ClusterRole can be created that provides access only to the ConfigMap used for MySQL configuration.

Using Secrets

  1. If there is any sensitive information that needs to be used in the configuration, secrets are recommended. Even though k8s secrets are just base64-encoded data, they have the advantage of integrating well with vaults, and it is always best practice to use k8s secrets rather than ConfigMaps when there is sensitive data.
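As an illustration of the ConfigMap route above, the change might look like this (a sketch; check the operator documentation linked earlier for the exact file key it expects):

# Write the desired MySQL options to a file, then wrap it in a ConfigMap
# whose name must match <clusterName>-pxc (cluster1-pxc in our example)
cat <<EOF > my.cnf
[mysqld]
max_connections=250
EOF
kubectl create configmap cluster1-pxc --from-file=my.cnf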

What happens when I change MySQL configuration?

Any changes in MySQL configurations will generally recycle the pods in reverse order if the RollingUpdate strategy is used.

Example: If three replicas are used for the PXC cluster, cluster1-pxc-[0,1,2] pods would be created. When MySQL configuration is changed, cluster1-pxc-2 will be terminated first and the system will wait till the new pod cluster1-pxc-2 starts running and becomes healthy, then cluster1-pxc-1 will be terminated, and so on.

Following are the changes observed with the watch command.

# kubectl get po -l app.kubernetes.io/component=pxc --watch
NAME             READY   STATUS    RESTARTS   AGE
cluster1-pxc-0   3/3     Running   0          8m23s
cluster1-pxc-1   3/3     Running   0          10m
cluster1-pxc-2   3/3     Running   0          13m
cluster1-pxc-2   3/3     Terminating   0          13m
cluster1-pxc-2   0/3     Terminating   0          14m
cluster1-pxc-2   0/3     Pending       0          0s
cluster1-pxc-2   0/3     Init:0/1      0          1s
cluster1-pxc-2   0/3     PodInitializing   0          8s
cluster1-pxc-2   2/3     Running           0          10s
cluster1-pxc-2   3/3     Running           0          2m
cluster1-pxc-1   3/3     Terminating       0          14m
cluster1-pxc-1   0/3     Terminating       0          14m
cluster1-pxc-1   0/3     Pending           0          0s
cluster1-pxc-1   0/3     Init:0/1          0          1s
cluster1-pxc-1   0/3     PodInitializing   0          6s
cluster1-pxc-1   2/3     Running           0          8s
cluster1-pxc-1   3/3     Running           0          2m1s
cluster1-pxc-0   3/3     Terminating       0          13m
cluster1-pxc-0   0/3     Terminating       0          14m
cluster1-pxc-0   0/3     Pending           0          0s
cluster1-pxc-0   0/3     Init:0/1          0          0s
cluster1-pxc-0   0/3     PodInitializing   0          6s
cluster1-pxc-0   2/3     Running           0          8s
cluster1-pxc-0   3/3     Running           0          2m

In the upcoming post, we will see the precedence and the changes happening while modifying MySQL configurations. Stay tuned!

Jul 25, 2022

Helm Chart in Percona Monitoring and Management, Percona Distribution for MongoDB 4.4.15-15: Release Roundup July 25, 2022


It’s time for the release roundup!

Percona is a leading provider of unbiased open source database solutions that allow organizations to easily, securely, and affordably maintain business agility, minimize risks, and stay competitive.

Our Release Roundups showcase the latest Percona software updates, tools, and features to help you manage and deploy our software. They offer highlights and critical information, as well as links to the full release notes and direct links to download the software or service itself.

Today’s post includes those releases and updates that have come out since July 11, 2022. Take a look!

Percona Monitoring and Management 2.29.0

Percona Monitoring and Management 2.29.0 (PMM) was released on July 19, 2022. It is an open source database monitoring, management, and observability solution for MySQL, PostgreSQL, and MongoDB. A release highlight is a short in-app tutorial that automatically pops up when you first launch the application. If you are a new user, this is your virtual unboxing of key features to help you get started easier. If you are an Intermediate or Advanced user, use the tour as a checklist of features worth exploring to ensure you’re making the most out of your PMM.

Also, starting with PMM 2.29.0, we have introduced a Helm chart supported by Percona to seamlessly deploy your PMM instances on Kubernetes (k8s), a prominent container orchestration system. With the Helm chart, you can customize your deployment by setting various parameters.

Download Percona Monitoring and Management 2.29.0

 

Percona Distribution for MongoDB 4.4.15-15

On July 19, 2022, we released Percona Distribution for MongoDB 4.4.15-15. It is a freely available MongoDB database alternative, giving you a single solution that combines enterprise components from the open source community, designed and tested to work together. The aim of Percona Distribution for MongoDB is to enable you to run and operate your MongoDB efficiently with the data being consistently backed up.

Download Percona Distribution for MongoDB 4.4.15-15

 

Percona Server for MongoDB 4.4.15-15

Percona Server for MongoDB 4.4.15-15 was released on July 19, 2022. It is an enhanced, source available, and highly-scalable database that is a fully-compatible, drop-in replacement for MongoDB 4.4.15 Community Edition. It supports MongoDB 4.4.15 protocols and drivers. Release highlights include the support of multiple KMIP servers which adds failover to your data-at-rest encryption setup, and users can set KMIP client certificate password through a flag to simplify the migration from MongoDB Enterprise to Percona Server for MongoDB.

Download Percona Server for MongoDB 4.4.15-15

 

Percona Backup for MongoDB 1.8.1

Percona Backup for MongoDB 1.8.1 was released on July 12, 2022. It is a distributed, low-impact solution for consistent backups of MongoDB sharded clusters and replica sets. This is a tool for creating consistent backups across a MongoDB sharded cluster (or a non-sharded replica set), and for restoring those backups to a specific point in time. Release highlights include:

  • PBM-871 – Fixed the restore failure on a different cluster. Now the UUID of users and system collections are not preserved when replaying the oplog.
  • PBM-881 – The PITR chunks display is now consistent in both pbm status and pbm list outputs.

Download Percona Backup for MongoDB 1.8.1

 

Percona Distribution for MySQL 8.0.28 (Percona XtraDB Cluster-based variant)

Percona Distribution for MySQL 8.0.28 (PXC-based variant) was released on July 19, 2022. It is a single solution with the most critical enterprise components from the MySQL open source community, designed and tested to work together. This release of Percona Distribution for MySQL is focused on the Percona XtraDB Cluster-based deployment variant and is based on Percona XtraDB Cluster 8.0.28-19.1.

Download Percona Distribution for MySQL 8.0.28 (PXC-based variant)

 

Percona XtraDB Cluster 8.0.28-19.1

We released Percona XtraDB Cluster 8.0.28-19.1 on July 19, 2022. It is a free, open source, enterprise-grade solution that includes the high availability and security features your business requires to meet your customer expectations and business goals. Release highlights include improvements and bug fixes introduced by Oracle for MySQL 8.0.28 and included in Percona Server for MySQL, such as the following:

  • The ASCII shortcut for CHARACTER SET latin1 and UNICODE shortcut for CHARACTER SET ucs2 are deprecated and raise a warning to use CHARACTER SET instead. The shortcuts will be removed in a future version.
  • A stored function and a loadable function with the same name can share the same namespace. Add the schema name when invoking a stored function in the shared namespace. The server generates a warning when function names collide.
  • InnoDB supports ALTER TABLE ... RENAME COLUMN operations when using ALGORITHM=INSTANT.
  • The limit for innodb_open_files now includes temporary tablespace files. The temporary tablespace files were not counted in the innodb_open_files in previous versions.

Download Percona XtraDB Cluster 8.0.28-19.1

 

Percona XtraBackup 8.0.29-22

On July 19, 2022, we released Percona XtraBackup 8.0.29-22, which enables MySQL backups without blocking user queries. Percona XtraBackup is ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups. Release highlights include new redo log types to support the changes in the INSTANT algorithm behavior. MySQL 8.0.29 extended the support for ALGORITHM=INSTANT to allow columns to be added to any position in a table and column drops. Older versions of Percona XtraBackup are incompatible with MySQL 8.0.29 because of this new functionality.

Download Percona XtraBackup 8.0.29-22

 

That’s it for this roundup, and be sure to follow us on Twitter to stay up-to-date on the most recent releases! Percona is a leader in providing best-of-breed enterprise-class support, consulting, managed services, training, and software for MySQL, MongoDB, PostgreSQL, MariaDB, and other open source databases in on-premises and cloud environments.

Jul 21, 2022

Preventing Stale Reads on Percona XtraDB Cluster by Enforcing Causality Checks on a Per-Query Basis


When we run a SELECT on a replica server and it returns a different result from the one we would have obtained had we queried the source server instead, due to changes in the dataset that have not yet been replicated or synchronized to the replica, we get what is known as a stale read. My colleague Marcelo has blogged about Stale Reads and How to Fix Them before.

The need to avoid stale reads is a constraint that leads people into migrating to Percona XtraDB Cluster (PXC) / Galera Cluster or, more recently, Group Replication. I am focusing on PXC and Galera in this short blog post, which, similar to Group Replication, implements not exactly a synchronous replication model but what is commonly referred to as a “virtually synchronous” replication (but not without some polemic). In short, it means a transaction that is committed in one node is guaranteed (and there’s a certification process behind this) to have “reached” other nodes, but that doesn’t mean the changes this transaction carries with it have been applied on the other nodes already. Thus, stale reads can happen in PXC too. For the long version, see the manual page about Certification in Percona XtraDB Cluster.

Why does PXC implement virtually synchronous replication instead of fully synchronous replication? There are a few challenges to accomplish this but we can nail it down to performance constraints. Fully synchronous replication requires not only for the transactions to reach the other nodes but for the transactions to be applied to them too. And then they have to report back the success of the operation. For each transaction. This takes time and increases the overall commit time, which in practice makes writing to the database slow and impacts overall concurrency.

Evolution of “synchronous” replication on PXC

To be clear, no matter how you configure PXC, it won’t become a fully-synchronous system (neither will Group Replication, for that matter). BUT, it is possible to make it behave in such a way that it enforces data consistency. That’s what the variable wsrep_causal_reads has been used for. When this setting was introduced, it worked like a switch that one could use to kind of enable “synchronous” replication (as in data consistency) across the cluster – either for all transactions (globally) or at a connection basis (per session). In fact, what this setting does in practice is to enforce causality checks; quoting the manual: “When this variable is set to ON, the replica will wait until that event is applied before doing any other queries”. By waiting, it means blocking new writes until the expected consistency is reached.

Being able to enable “synchronous” replication for selected connections brought the potential to mix the best of both modes: operate with virtually synchronous replication by default (good performance) and use “synchronous” replication (once more, enforcing data consistency) when stale reads cannot be afforded.

The problem with this approach is that it requires some planning in advance to determine which connections should be configured with wsrep_causal_reads enabled. Without carefully planning ahead, what ends up happening in practice in many cases is having more connections configured with wsrep_causal_reads enabled than necessary. As a result, the cluster performance gets closer to that of operating with that setting applied globally.

The switch button that was wsrep_causal_reads evolved into the more granular wsrep_sync_wait, which allows the configuration of causality checks for different combinations of statements.
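For example, a session that only needs read consistency can enable just the READ check (bit 1 of the bitmask) instead of a more aggressive combination:

-- Enforce causality checks for reads only, in the current session
SET SESSION wsrep_sync_wait = 1;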

The cherry on top

While wsrep_sync_wait expanded the possibilities for causality checks, it still requires the “planning ahead” discussed above. I recently worked on the troubleshooting of a slow cluster and found it was configured with wsrep_sync_wait set to three (“Perform checks for READ, UPDATE, and DELETE statements”) all around, which is an even more “aggressive” configuration than the legacy wsrep_causal_reads. The justification for operating this way was that it was not possible to identify upfront which connection would be issuing a particular query that could not afford stale reads.

The good news is that starting with Percona XtraDB Cluster 8.0.26-16, which was released earlier this year, we can set wsrep_sync_wait on a per-query basis through the use of the optimizer hint SET_VAR: “The SET_VAR hint sets the session value of a system variable temporarily (for the duration of a single statement)”.

Here’s a simple example of employing SET_VAR to enforce the highest level of causality checks for a simple SELECT:

SELECT /*+ SET_VAR(wsrep_sync_wait=7) */ balance FROM accounts WHERE id=100;

If you find it is complicated to determine beforehand which connections should be configured to avoid stale reads, know that now you have the option to literally hard-code this constraint in your query. And if you end up deploying your database in a regular MySQL server, which is shipped without any wsrep_ settings, the optimizer hint is simply ignored and the query is processed normally.

Jul 21, 2022

How to Resolve “Error Reading Relay Log Event” After Relay Log Corruption


In this blog, I explain how to recover from a replication failure caused by a corrupted relay log file.

A MySQL replica stores data received from its source’s binary log in the relay log file. This file can become corrupted for various reasons, mostly hardware failures. If this happens, replication stops working, and the error log on the replica will have entries similar to:

2022-05-12T12:32:07.282374Z 2 [ERROR] Error in Log_event::read_log_event(): 'Event too small', data_len: 0, event_type: 0
2022-05-12T12:32:07.282386Z 2 [ERROR] Error reading relay log event for channel '': slave SQL thread aborted because of I/O
...
2022-05-12T12:32:07.282396Z 2 [ERROR] Slave SQL for channel '': Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: 1594

Once you perform a suggested check and find that the reason for the failure was, indeed, corruption of the relay log file, you may recover replication by resetting the replica.

First, you must ensure that the source server’s binary log is not corrupted. You can do this with the help of the mysqlbinlog command. Simply run it on the binary log and ensure it does not fail with an error.
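For example (the path is a placeholder for your binary log location):

# Decode the entire binary log and discard the output; a clean exit with
# no error message suggests the file is readable end to end
mysqlbinlog /var/lib/mysql/mysql-bin.000002 > /dev/null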

To find out which binary log is current, run the SHOW REPLICA STATUS command (or SHOW SLAVE STATUS if you are running a version of MySQL older than 8.0.22).

Then find the value of Relay_Source_Log_File (Relay_Master_Log_File):

Relay_Source_Log_File: mysql-bin.000002

This will be the binary log from which the replica SQL thread executed the last statement.

Also, note the value of Exec_Source_Log_Pos (Exec_Master_Log_Pos): the latest executed position. This will be necessary for the next step.

If you are using GTIDs, you need to find the binary log that contains the last GTID in the Executed_Gtid_Set.

Once you ensure that the source’s binary log file is healthy, you can run the RESET REPLICA (RESET SLAVE) statement. As described at https://dev.mysql.com/doc/refman/8.0/en/reset-replica.html, “it clears the replication metadata repositories, deletes all the relay log files, and starts a new relay log file. It also resets to 0 the replication delay specified with the SOURCE_DELAY | MASTER_DELAY option of the CHANGE REPLICATION SOURCE TO statement (from MySQL 8.0.23) or CHANGE MASTER TO statement (before MySQL 8.0.23).”

Therefore, you need to run the CHANGE REPLICATION SOURCE TO (or CHANGE MASTER TO) command after it. If you use position-based replication, point the replica to the Relay_Source_Log_File and Exec_Source_Log_Pos recorded in the previous step. For GTID-based replicas, use SOURCE_AUTO_POSITION=1 (MASTER_AUTO_POSITION=1).
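Putting it together for a position-based replica, the recovery sequence looks roughly like this (the file name and position are placeholders for the values recorded earlier):

STOP REPLICA;
RESET REPLICA;
CHANGE REPLICATION SOURCE TO
  SOURCE_LOG_FILE = 'mysql-bin.000002',
  SOURCE_LOG_POS = 12345;
START REPLICA;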

Conclusion

The relay log file on the replica stores changes that can be retrieved again from the source server. Therefore, it is safe to remove corrupted relay log files with the help of the RESET REPLICA statement and then allow replication to reload the data from the binary log files on the source server. Be sure to check that the source server has not purged the required binary logs before performing this operation.

Jul 19, 2022

Percona XtraBackup 8.0.29 and INSTANT ADD/DROP Columns


Oracle’s MySQL 8.0.29 release extended the support for ALTER TABLE … ALGORITHM=INSTANT to 1) allow users to instantly add columns in any position of the table, and 2) instantly drop columns. As part of this work, the InnoDB redo log format has changed for all DML operations on the server. This new redo log format introduced a design flaw that can cause data corruption for tables with INSTANT ADD/DROP COLUMNS.

The corruption happens when InnoDB crash recovery takes place. InnoDB applies redo logs at startup. Percona XtraBackup copies the redo log during backup and applies it as part of the --prepare step to bring the backup to a consistent state.

Percona fixed the corruption issue and several other issues with the INSTANT ADD/DROP column feature in the upcoming Percona Server for MySQL 8.0.29 (check PS-8291 / PS-8292 / PS-8303 for more details). We also raised and provided patches for those issues as a contribution to Community MySQL (see 107613 / 107611 / 107854 for details). Percona XtraBackup 8.0.29 can take backups of Percona Server for MySQL 8.0.29 with tables that have INSTANT ADD/DROP COLUMNS. However, the current version of Community MySQL 8.0.29 still has this flaw, making it unsafe to take backups.

It is impossible for XtraBackup to deal with the corrupted redo log generated by Community MySQL 8.0.29. For this reason, XtraBackup 8.0.29 will not take backups if it detects tables with INSTANT ADD/DROP columns; instead, it raises an error with a list of the affected tables and provides instructions to convert them to regular tables.

Please avoid ALTER TABLE ADD/DROP COLUMN without an explicit ALGORITHM=INPLACE. The default ALGORITHM is INSTANT, so an ALTER TABLE without the ALGORITHM keyword uses the newly added INSTANT algorithm. For example:

Working:

ALTER TABLE tb1 ADD/DROP COLUMN [...], ALGORITHM=INPLACE/COPY;

Not Working:

ALTER TABLE tb1 ADD/DROP COLUMN [...], ALGORITHM=INSTANT;
ALTER TABLE tb1 ADD/DROP COLUMN [...], ALGORITHM=DEFAULT;
ALTER TABLE tb1 ADD/DROP COLUMN [...];

If you already have such tables (see below for how to find them), you are advised to run OPTIMIZE TABLE against these tables before taking backups.

Find all tables with INSTANT ADD/DROP COLUMNS:

mysql> SELECT NAME FROM INFORMATION_SCHEMA.INNODB_TABLES WHERE TOTAL_ROW_VERSIONS > 0;
+---------+
| NAME    |
+---------+
| test/t1 |
| test/t2 |
| test/t3 |
+---------+
3 rows in set (0.02 sec)

If this query shows an empty result set, you are all good. Percona XtraBackup will take backups of your MySQL 8.0.29 servers. If not, please run OPTIMIZE TABLE on the list of tables before taking a backup.
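Using the sample output above, the cleanup can be done in one statement (note that OPTIMIZE TABLE rebuilds each table, so expect extra load on large tables):

-- Rebuild the affected tables so they no longer carry row versions
OPTIMIZE TABLE test.t1, test.t2, test.t3;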

Percona XtraBackup error message

If Percona XtraBackup detects that a MySQL 8.0.29 server has tables with instant add/drop columns, it aborts with the following error message:

2022-07-01T15:18:35.127689+05:30 0 [ERROR] [MY-011825] [Xtrabackup] Found tables with row versions due to INSTANT ADD/DROP columns
2022-07-01T15:18:35.127714+05:30 0 [ERROR] [MY-011825] [Xtrabackup] This feature is not stable and will cause backup corruption.
2022-07-01T15:18:35.127714+05:30 0 [ERROR] [MY-011825] [Xtrabackup] Please check https://docs.percona.com/percona-xtrabackup/8.0/em/instant.html for more details.
2022-07-01T15:18:35.127723+05:30 0 [ERROR] [MY-011825] [Xtrabackup] Tables found:
2022-07-01T15:18:35.127730+05:30 0 [ERROR] [MY-011825] [Xtrabackup] test/t1
2022-07-01T15:18:35.127737+05:30 0 [ERROR] [MY-011825] [Xtrabackup] test/t2
2022-07-01T15:18:35.127744+05:30 0 [ERROR] [MY-011825] [Xtrabackup] test/t3
2022-07-01T15:18:35.127752+05:30 0 [ERROR] [MY-011825] [Xtrabackup] Please run OPTIMIZE TABLE or ALTER TABLE ALGORITHM=COPY on all listed tables to fix this issue.

Summary

Algorithm INSTANT is the new default in 8.0.29. If you do not specify an algorithm, all ALTER TABLE ADD/DROP COLUMN statements will use the default algorithm. 

The INSTANT algorithm is considered unstable at this point.

Percona XtraBackup will refuse to take backups from MySQL 8.0.29 tables that have been modified using this algorithm. Running OPTIMIZE TABLE on affected tables will bring them back to a safe state.

Percona XtraBackup is able to take backups seamlessly from Percona Server for MySQL, as the corruption issues have been fixed in the upcoming release of Percona Server for MySQL 8.0.29.

Jul 18, 2022

Talking Drupal #356 – The Book Module

Welcome to Talking Drupal. Today we are talking about The Book Module with Chad Hester.

www.talkingDrupal.com/356

Topics

  • Book Module
  • High level overview
  • Unique about the module
  • Real world use cases
  • Why not entity reference
  • Should it be in contrib
  • Where does it not make sense
  • Future?

Resources

Guests

www.chadkhester.com @chadkhester

Hosts

Nic Laflin – www.nLighteneddevelopment.com @nicxvan
John Picozzi – www.epam.com @johnpicozzi
Ryan Price – ryanpricemedia.com @liberatr

MOTW

Purge
The Purge module facilitates cleaning external caching systems, reverse proxies, and CDNs as content actually changes. This allows external caching layers to keep unchanged content cached infinitely, making content delivery more efficient, resilient, and better guarded against traffic spikes.
