Sep 25, 2023

Talking Drupal #417 – The Recipes Initiative

Today we are talking about The Recipes Initiative, the future of install profiles, if distros are still a thing, and answering a bunch of listener questions with our guest Jim Birch. We’ll also cover Quick Links Kit as our module of the week.

For show notes visit: www.talkingDrupal.com/417

Topics

  • What are recipes
  • How do you use recipes
  • Is it a module, configuration or something else
  • How do recipes compare to install profiles
  • Are you stuck with them
  • What happens if the config is changed
  • Are there namespace collisions
  • How do recipes compare with Distributions
  • Can you include content
  • Listener James: Can recipes uninstall modules
  • Can we use recipes now
  • When will recipes be in core
  • Can recipes be used by tests
  • Listener Andy: Can recipes and starterkits interact
  • Can themes require recipes
  • Listener Matthieu: How do recipes compare with Symfony recipes
  • Listener James: How easy will it be to make custom recipes
  • Listener Matthieu: Should contrib maintainers be watching recipes
  • How can we get involved

Resources

Guests

Jim Birch – @jimbirch

Hosts

Nic Laflin – nLighteneddevelopment.com nicxvan
John Picozzi – epam.com johnpicozzi
Jen Lampton – jenlampton.com jenlampton

MOTW Correspondent

Martin Anderson-Clutz – @mandclu

Quick Links Kit

  • Brief description:
    • Have you ever needed to add a set of quick links, essentially visual navigation prompts, to the home page or section pages of your Drupal site?
    • Quick Links Kit (different from the Quicklink module created by last week’s guest, Mike Herchel)
  • Brief history
    • How old: I created it in April 2021
  • Versions available:
    • 1.0.6 that supports Drupal 8, 9, and 10, and 2.0.1 that supports only D10
  • Maintainership
    • Sporadically maintained, but a pretty simple module
  • Number of open issues:
    • 1, and it’s not a bug
  • Usage stats:
    • 1 site
  • Maintainer(s):
    • Me!
  • Module features and usage
    • The module is really just a set of configuration, with an optional submodule that sets everything up (including placing the block on the home page) for sites using Olivero as their theme, so it’s perfect for a fresh install of Drupal
    • It allows for SVG icons to be set for each link, and sets their fill to inherit from the link style
    • The links can be created and managed without leaving the page on which they’re used, by using the settings tray, though it would be a quick configuration change to use a modal or a separate page instead, if preferred.
    • The 2.0 version also makes use of Drupal 10’s new responsive grid views display, so if you’ve been wanting to try that out, this is an easy way to get started
    • I thought this module was appropriate for today’s episode because it’s an example of a module that will be a recipe once the infrastructure for them is ready. That said, the Olivero submodule does currently contain a little CSS to improve the display of the links, but that could easily be copied into your site’s custom theme.
Sep 20, 2023

How to Read Simplified SHOW REPLICA STATUS Output

SHOW REPLICA STATUS

As a MySQL database administrator, you’re likely familiar with the SHOW REPLICA STATUS command. It is an important command for monitoring the replication status on your MySQL replicas. However, its output can be overwhelming for beginners, especially regarding the binary log coordinates. I have seen new DBAs confused about which binary log file and position represent what in replication.

In this guide, we’ll simplify the SHOW REPLICA STATUS output, focusing on the critical binary log coordinates essential for troubleshooting and managing replication.

The key binlog coordinates

Before we delve into the output, let’s understand the key binlog coordinates we’ll be working with:

  • Master_Log_File: This is the name of the primary binary log file that the I/O thread is currently reading from.
  • Read_Master_Log_Pos: It represents the position up to which the I/O thread has read in the current primary binary log file.
  • Relay_Log_File: This is the name of the relay log file that the SQL thread is currently processing.
  • Relay_Log_Pos: It shows the position up to which the SQL thread has finished processing in the current relay log file.
  • Relay_Master_Log_File: This is the name of the primary binary log file that contains the most recent event executed by the SQL thread.
  • Exec_Master_Log_Pos: It indicates the position up to which the SQL thread has processed in the current primary binary log file. It can be used to start a new replica from a current replica with the CHANGE MASTER TO … MASTER_LOG_POS option.
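
To make these fields concrete, here is a minimal sketch in Python (an illustration only, not a DBA tool) that pulls the key coordinates out of the text produced by SHOW REPLICA STATUS\G; the sample values below are hypothetical.

```python
# Illustrative sketch: extract the key binlog coordinates from the
# \G-style text output of SHOW REPLICA STATUS (sample values are made up).
SAMPLE = """\
             Master_Log_File: mysql-bin.000042
         Read_Master_Log_Pos: 1024
              Relay_Log_File: relay-bin.000007
               Relay_Log_Pos: 512
       Relay_Master_Log_File: mysql-bin.000041
         Exec_Master_Log_Pos: 768
"""

def parse_replica_status(text):
    """Return a dict of field -> value from \\G-style output."""
    status = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            status[key.strip()] = value.strip()
    return status

status = parse_replica_status(SAMPLE)
# The I/O thread's bookmark (what it is reading from the primary):
io_pos = (status["Master_Log_File"], int(status["Read_Master_Log_Pos"]))
# The SQL thread's bookmark, expressed in primary coordinates:
sql_pos = (status["Relay_Master_Log_File"], int(status["Exec_Master_Log_Pos"]))
print(io_pos, sql_pos)
```

Note how the two bookmarks can legitimately point at different binary log files: the I/O thread is usually ahead of the SQL thread.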

Simplified SHOW REPLICA STATUS output

The following image illustrates the binary log coordinates in the SHOW REPLICA STATUS output.

show replica status

Decoding the SHOW REPLICA STATUS output

Now, let’s break down the SHOW REPLICA STATUS output to understand these binlog coordinates:

  • Master_Log_File and Read_Master_Log_Pos: These values tell you which primary binary log file and position the I/O thread is currently reading. It’s like a bookmark in a book, showing you where the replication process is in the primary’s log.
  • Relay_Log_File and Relay_Log_Pos: These values reveal the name of the relay log file and the position up to which the SQL thread has processed. Think of it as the progress report of the SQL thread.
  • Relay_Master_Log_File and Exec_Master_Log_Pos: These parameters are essential when you need to reset replication on a replica.
    • Relay_Master_Log_File specifies the name of the primary binary log file containing the most recent event executed by the SQL thread.
    • Exec_Master_Log_Pos tells you the precise position within that file. Together, they enable you to pick up where replication left off.

Troubleshooting and managing replication

Understanding these binlog coordinates simplifies troubleshooting and managing replication:

  • When replication breaks, you can use Relay_Master_Log_File and Exec_Master_Log_Pos to identify the exact location and resume from there.
  • Monitoring Master_Log_File and Read_Master_Log_Pos helps you keep track of the I/O thread’s progress on the primary.
  • Checking Relay_Log_File and Relay_Log_Pos lets you know how far the SQL thread has come in processing events.
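
As a small illustration of the first point, the statement for repointing a new replica can be built from the two executed-position fields. This Python sketch only assembles the SQL text (the file name and position are hypothetical); the post’s CHANGE MASTER TO syntax is used as-is.

```python
# Illustrative sketch: given the executed coordinates from an existing
# replica, build the CHANGE MASTER TO statement that would start a new
# replica from the same point (values here are hypothetical).
def build_change_master(relay_master_log_file, exec_master_log_pos):
    # Relay_Master_Log_File / Exec_Master_Log_Pos are primary-side
    # coordinates, which is why they feed MASTER_LOG_FILE/POS directly.
    return (
        "CHANGE MASTER TO "
        f"MASTER_LOG_FILE='{relay_master_log_file}', "
        f"MASTER_LOG_POS={exec_master_log_pos};"
    )

stmt = build_change_master("mysql-bin.000041", 768)
print(stmt)
```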

By grasping these key binlog coordinates, you can confidently manage MySQL replication, resolve issues efficiently, and keep your databases in sync.

Quick tip for DBAs

We know monitoring is one of the most important components of your architecture. We recommend using Percona Monitoring and Management, the best monitoring tool for your open source databases.

The MySQL Replication Summary dashboard comes in really handy when monitoring the replication status. Please find the sample snapshot from our test monitoring node:

mysql replication dashboard - PMM

Conclusion

The SHOW REPLICA STATUS output shouldn’t be confusing. By focusing on the binlog coordinates, you gain valuable insights into the replication process. Whether you’re troubleshooting a broken replica or monitoring ongoing replication, these coordinates guide you in managing MySQL replication successfully. Next time you encounter the SHOW REPLICA STATUS output, remember that it’s simply telling you where your replication stands in the grand scheme of your MySQL environment.

Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.

 

Download Percona Monitoring and Management Today

Sep 20, 2023

Avoid Surprises When Restarting MySQL — Ensure Dynamic Changes Won’t Be Lost

restarting mysql

If you’re a DBA, one of your “easiest” tasks is to stop/start MySQL during a maintenance window, but even that could lead to unwanted scenarios if you modify some dynamic parameters at some point in your instance.

Here’s a brief story of how this could happen, to make it clearer:

You’re a DBA managing a few MySQL servers. The application using one of them starts having issues on a Friday night, right before it’s time to leave; after a quick check, you notice the app is requesting more connections, and the hotfix is to raise max_connections. You change it dynamically, the fire is out, and so are you; let your future self worry about finding the root cause and fixing it properly the following Monday.

 

But life happens; Monday arrives with new challenges, and you’ve already forgotten about the connections issue… A few months later, a MySQL restart is required, and surprise, surprise: right after the restart, the issue is back “unexpectedly.” Now you have to troubleshoot, losing time working out what happened and fixing it.

Side note: This is not advice on how to fix things; the story only introduces stuff that can happen when restarting MySQL and how to prevent it. This could also be a tale of why doing that kind of “administration” is bad.

Introduction of pt-config-diff and the situation

As you might know, the Percona Toolkit is a collection of advanced command-line tools used to perform various MySQL, MongoDB, and system tasks that are too difficult or complex to perform manually.

One of those tools is pt-config-diff; this tool can show differences within MySQL configuration files and server variables.

You can use it to compare two config files (i.e., you want to ensure your new server has the same settings as the old one) or one config file and a running MySQL instance (this is what we’ll use here).

Below are the commands and output that lead to the different values in the story.

First, let’s see the connection settings in the configuration file.

$ grep conn /etc/my.cnf
max_connections=200

And before the dynamic change, MySQL is running with the same value.

$ mysql -e "SELECT @@max_connections"
+-------------------+
| @@max_connections |
+-------------------+
|               200 |
+-------------------+

That Friday night, the value was increased to 500:

$ mysql -vv -e "SET GLOBAL max_connections=500; SELECT @@GLOBAL.max_connections"

--------------
SET GLOBAL max_connections=500
--------------
Query OK, 0 rows affected (0.01 sec)


--------------
SELECT @@GLOBAL.max_connections
--------------
+--------------------------+
| @@GLOBAL.max_connections |
+--------------------------+
|                      500 |
+--------------------------+
1 row in set (0.00 sec)

Bye

At this point, the setting in the configuration file and the runtime value for max_connections are different, and the dynamic change will be lost after a MySQL restart.

Side note: In case you missed it, with MySQL 8, you can now persist dynamic changes as well to avoid manual modification to the cnf file. See more in this blog post: Using MySQL 8 Persisted System Variables.

How to check dynamic changes

When it’s the moment to reboot MySQL, you can use pt-config-diff to validate if the runtime settings are the same as in the configuration file.

Using the same example, here’s the output:

$ pt-config-diff h=localhost /etc/my.cnf

1 config difference

Variable                  localhost.localdomain /etc/my.cnf
========================= ===================== ===========
max_connections           500                   200

Now you know the difference between the config file and your runtime settings and can prepare accordingly (modify the value in your config file or use SET PERSIST).

Let’s modify the configuration file manually:

$ grep conn /etc/my.cnf
max_connections=500

And re-checking with the same command as above, there’s now no output (meaning no differences were found).

$ pt-config-diff h=localhost /etc/my.cnf
$

Things to be aware of

The tool will only compare the values for parameters present in both sources.

Using the same example, if the max_connections parameter is not present in the configuration file (MySQL was running with the default value), pt-config-diff won’t show the value in the output.

Example:

The parameter is not present in the configuration file.

$ grep conn /etc/my.cnf
$

And MySQL is running with the default value:

$ mysql -e "SELECT @@max_connections"
+-------------------+
| @@max_connections |
+-------------------+
|               151 |
+-------------------+

Modifying the value dynamically (using the same commands as in the previous example) and running pt-config-diff, the output is empty.

$ mysql -e "SELECT @@max_connections"
+-------------------+
| @@max_connections |
+-------------------+
|               500 |
+-------------------+


$ pt-config-diff h=localhost /etc/my.cnf
$

As mentioned above, this happens because there’s nothing to compare to; the configuration file doesn’t have a value for the max_connections parameter.
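
The comparison rule described above — a variable is reported only if it exists in both sources and the values differ — can be sketched in a few lines of Python. This is an illustration of the rule, not pt-config-diff’s actual implementation.

```python
# Illustrative sketch of the comparison rule: a variable is reported
# only if it exists in BOTH sources and the values differ.
def config_diff(runtime, config_file):
    common = runtime.keys() & config_file.keys()
    return {k: (runtime[k], config_file[k])
            for k in common if runtime[k] != config_file[k]}

runtime = {"max_connections": "500", "innodb_buffer_pool_size": "128M"}

# max_connections is missing from the file, so it is never reported,
# even though the runtime value (500) differs from the default (151).
cnf = {"innodb_buffer_pool_size": "128M"}
print(config_diff(runtime, cnf))

# Once the file sets a different value, the difference is reported.
cnf["max_connections"] = "200"
print(config_diff(runtime, cnf))
```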

Another thing to keep in mind: if you have more than one config file (using !include or !includedir, or config files in more than one of the default places MySQL looks), you can’t compare all of them at once.

Ensure you compare all the required config files if that’s your case.

Final thoughts

First, take this as a reminder to always persist your changes, either by manually adding them to your configuration files or by using SET PERSIST (note: this command will create/use mysqld-auto.cnf inside the datadir).

That said, even after checking with pt-config-diff, some differences could still be lost after a restart, but using this tool can help you prevent unwanted surprises and highlight possible issues before a restart.

As always, even when performing the “simplest” task, it’s recommended to have some pre-checks to ensure the activity goes smoothly.

Also, if you identified yourself with the story at the beginning of this article and feel you could use a hand, remember you can always rely on our Managed Services offering, where you can have a team of DBAs managing the instances for you.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

 

Try Percona Distribution for MySQL today!

Sep 19, 2023

Percona Distribution for PostgreSQL 16 Is Now Available


The PostgreSQL Community released PostgreSQL 16 on September 14, 2023. In years past, we’ve released our Distribution for PostgreSQL a few months later. We wanted to improve in this regard and establish a new release baseline: releasing faster while maintaining the same rigorous QA. It looks like we succeeded, and I’m happy to announce that Percona Distribution for PostgreSQL 16 is now available!

The rocky road to a faster release

It took effort across the Build and QA teams at Percona. We’ve improved the packaging build pipelines and tremendously reworked and improved the automation of our QA pipelines. The effort has definitely paid off, as we can now release shortly after upstream.

Why should you care about version 16?

PostgreSQL 16 comes packed with features. The community has been working hard on these, and the list of delivered improvements is really impressive. To highlight some:

  • Logical replication from standby servers is now supported natively by PostgreSQL.
  • Parallel application of large transactions by logical replication subscribers.
  • Parallelization of FULL and internal right OUTER hash joins is now allowed.
  • The pg_stat_io view, providing I/O statistics for your monitoring needs.
  • New SQL/JSON constructors and identity functions.
  • Improved vacuum freezing performance.
  • Added support for regular expression matching of user and database names in pg_hba.conf, and user names in pg_ident.conf.

To learn more about these, check out the upstream release notes and the Percona distribution release notes for more details!

What’s next in store?

We’re working hard on closing the gaps for data at rest encryption (DARE), as reported by both our community of users and our customers. While there are proprietary tools out there, as well as limited capabilities thanks to pgcrypto, it is still far from the expectations of enterprise customers who need to comply with regulations and internal policies.

We plan to make Percona images for PostgreSQL available on Docker Hub shortly. The plan is to start with version 16.

We’re hoping to share some good news in this regard and provide an MVP for all of you to try and test by the end of this year. As we are committed to open source, we hope to involve the community in discussing the best approaches for transparent data encryption (TDE) in PostgreSQL.

 

Learn more about Percona for PostgreSQL

Sep 19, 2023

Use Physical Backups With MySQL InnoDB Redo Log Archiving

InnoDB Redo Log Archiving

In the world of data backup and security, physical backups play an extremely important role. Physical backup methods are faster than logical because they involve only file copying without conversion. This type of backup is suitable for large, important databases that need to be recovered quickly when problems occur.

Physical backups are backups that consist of raw copies of the directories and files that store database contents. In addition to databases, the backup can include any related files, such as log or configuration files. Since backup speed and compactness are important for busy, important databases, Percona’s open source physical backup solution, Percona XtraBackup (PXB), takes all these aspects into account and benefits the MySQL world with its exceptional capabilities!

This blog post will walk you through how PXB uses MySQL’s InnoDB redo log archiving to manage the redo log files on the busiest systems and looks at how a new feature released in PXB version 8.0.34-29 lets you set the redo log archive directory while performing a backup.

InnoDB redo log archiving

Backup utilities that copy redo log records may sometimes fail to keep pace with redo log generation while a backup operation is in progress, resulting in lost redo log records due to overwritten records. This issue most often occurs when there is significant MySQL server activity during the backup operation, and the redo log file storage media operates faster than the backup storage media. The redo log archiving feature, introduced in MySQL 8.0.17, addresses this issue by sequentially writing redo log records to an archive file and the redo log files. Backup utilities can copy redo log records from the archive file as necessary, thereby avoiding the potential loss of data.

Enabling redo log archiving on the server requires setting a value for the innodb_redo_log_archive_dirs system variable. The value is specified as a semicolon-separated list of labeled redo log archive directories. The label:directory pair is separated by a colon (:). For example:

mysql> SET GLOBAL innodb_redo_log_archive_dirs='label1:directory_path1[;label2:directory_path2;…]';
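
The value format (a semicolon-separated list of label:directory pairs) can be illustrated with a small Python parser. This is just a sketch for understanding the syntax, not server code.

```python
# Illustrative sketch: parse an innodb_redo_log_archive_dirs value
# ("label1:dir1;label2:dir2") into a label -> directory mapping.
def parse_archive_dirs(value):
    dirs = {}
    for pair in filter(None, value.split(";")):
        # Split on the FIRST colon only; the directory path follows it.
        label, _, directory = pair.partition(":")
        dirs[label] = directory
    return dirs

print(parse_archive_dirs("backup1:/var/lib/mysql-redo-archive/"))
```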

First, we are going to demonstrate how we tested the new InnoDB redo log archiving on Percona Server for MySQL 8.0.33-25 (GPL), using MySQL Shell (mysqlsh) to enable it.

  1. Create new directories for storing redo log archives.
[root@ip-xxx-xx-xx-xx ~]# mkdir -p /var/lib/mysql-redo-archive/backup1
[root@ip-xxx-xx-xx-xx ~]# chown mysql. -R /var/lib/mysql-redo-archive
[root@ip-xxx-xx-xx-xx ~]# chmod -R 700 /var/lib/mysql-redo-archive/

  2. Log in to mysqlsh as the root user and check the variable innodb_redo_log_archive_dirs.
[root@ip-xxx-xx-xx-xx ~]# mysqlsh -uroot -p
MySQL Shell 8.0.33
Copyright (c) 2016, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.
Type 'help' or '?' for help; 'quit' to exit.
WARNING: Using a password on the command line interface can be insecure.
Creating a session to 'root@localhost'
Fetching schema names for auto-completion... Press ^C to stop.
Your MySQL connection id is 10 (X protocol)
Server version: 8.0.33-25 Percona Server (GPL), Release 25, Revision 60c9e2c5
No default schema selected; type use <schema> to set one.
MySQL localhost:33060+ ssl  SQL >
MySQL localhost:33060+ ssl JS > \sql
Switching to SQL mode... Commands end with ;
Fetching global names for auto-completion... Press ^C to stop.

MySQL localhost:33060+ ssl SQL > show global variables like 'innodb_redo_log_ar%';
+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| innodb_redo_log_archive_dirs |       |
+------------------------------+-------+
1 row in set (0.0031 sec)

  3. Set the value of the variable innodb_redo_log_archive_dirs to the directory we created for redo log archives, as below.
MySQL  localhost:33060+ ssl  SQL > set persist innodb_redo_log_archive_dirs='backup1:/var/lib/mysql-redo-archive/';
Query OK, 0 rows affected (0.0019 sec)
MySQL  localhost:33060+ ssl  SQL >  show global variables like 'innodb_redo_log_ar%';
+------------------------------+--------------------------------------+
| Variable_name                | Value                                |
+------------------------------+--------------------------------------+
| innodb_redo_log_archive_dirs | backup1:/var/lib/mysql-redo-archive/ |
+------------------------------+--------------------------------------+
1 row in set (0.0025 sec)

  4. Now it’s ready to work, but it’s not enabled. It will be enabled when the session initializing the backup invokes innodb_redo_log_archive_start().
MySQL  localhost:33060+ ssl  SQL > select innodb_redo_log_archive_start('backup1', 'backup1');
+-----------------------------------------------------+
| innodb_redo_log_archive_start('backup1', 'backup1') |
+-----------------------------------------------------+
|                                                   0 |
+-----------------------------------------------------+
1 row in set (0.0229 sec)

How to check if the redo log archiving is active

MySQL  localhost:33060+ ssl  SQL > select * from performance_schema.file_instances where event_name like '%::redo_log_archive_file'\G
*************************** 1. row ***************************
FILE_NAME: /var/lib/mysql-redo-archive/backup1/archive.f255a32d-2fb4-11ee-889e-0242ac110005.000001.log
EVENT_NAME: wait/io/file/innodb/meb::redo_log_archive_file
OPEN_COUNT: 1
1 row in set (0.0015 sec)

This alone is not enough to ensure that redo log archiving is active. We can also check whether the archiver thread is running using this query:

MySQL  localhost:33060+ ssl  SQL > select thread_id, name, type from performance_schema.threads  where name like '%redo_log_archive%';
+-----------+-----------------------------------------------------+------------+
| thread_id | name                                                | type       |
+-----------+-----------------------------------------------------+------------+
|        48 | thread/innodb/meb::redo_log_archive_consumer_thread | BACKGROUND |
+-----------+-----------------------------------------------------+------------+
1 row in set (0.0021 sec)

If a row is returned, it means that the redo log archiving is enabled and active.

If, instead, you wish to make the setting persist across service restarts, add it to my.cnf under [mysqld]:

[mysqld]
innodb_redo_log_archive_dirs = redo-archived1:/mysql/archived-redo1

How to stop redo log archiving

MySQL  localhost:33060+ ssl  SQL > select innodb_redo_log_archive_stop();
+--------------------------------+
| innodb_redo_log_archive_stop() |
+--------------------------------+
|                              0 |
+--------------------------------+
1 row in set (0.0009 sec)

[root@ip-xxx-xx-xx-xx ~]# ls -ltrh /var/lib/mysql-redo-archive/backup1/
total 24K
-r--r-----. 1 mysql mysql 24K Aug 21 11:22 archive.94f4ab58-3d1c-11ee-9e4f-0af1fd7c44c9.000001.log

Revert back the changes

To revert back and disable redo log archiving:

mysql> SET GLOBAL innodb_redo_log_archive_dirs='';

How does Percona XtraBackup deal with redo log files?

PXB records the log sequence number and starts copying the data files when a backup is performed. The backup operation on InnoDB is non-blocking, so the ibd files are allowed to change while the backup process is running. This means that while PXB is copying the files, they reflect the state of the database at different times. PXB therefore needs the redo log records for every change made to the data files since it began execution.

PXB runs a background process that watches the transaction log files and copies changes from them. It needs to do this continually because InnoDB writes redo logs to disk circularly, and the files can be reused after a while. Prior to PXB version 8.0.34-29, if InnoDB redo log archiving was not enabled and redo logs were written faster than PXB could copy them, records could be overwritten before PXB read them. It is possible to receive the below message:

xtrabackup: error: log block numbers mismatch:
xtrabackup: error: expected log block no. X, but got no. Y from the log file.
xtrabackup: error: it looks like InnoDB log has wrapped around before xtrabackup could process all records due to either log copying being too slow, or log files being too small.
xtrabackup: Error: xtrabackup_copy_logfile() failed.

Before the redo log archiving feature was released in MySQL 8.0.17, the potential solutions to the above problem were:

  • Increasing the size of the redo logs so they are not wrapped while PXB is working.
  • Addressing slow read speed, which is usually a sign of I/O congestion.
  • Addressing slow write speed, usually a sign of I/O or network congestion.
  • Taking the backup off-peak, while system saturation is lower.

However, now, with the redo log archiving feature introduced in MySQL 8.0.17, you can overcome this problem and efficiently perform backups even during peak hours.

Also, PXB version 8.0.34-29 has introduced a new option, --redo-log-arch-dir, that can be used to set the redo log archiving directory if it is not already set in MySQL.

Using MySQL redo log archiving for Percona XtraBackup

So now we have a redo log file archived in the folder we created in the demonstration above.

[root@ip-xxx-xx-xx-xx ~]# ls -ltrh /var/lib/mysql-redo-archive/backup1/
total 24K
-r--r-----. 1 mysql mysql 24K Aug 21 11:22 archive.94f4ab58-3d1c-11ee-9e4f-0af1fd7c44c9.000001.log

Test with Percona XtraBackup

  1. Run PXB as the owner of the mysqld process, typically the MySQL OS user. We will be using version 8.0.34-29.
[root@ip-xxx-xx-xx-xx ~]# xtrabackup --version
2023-09-11T16:07:17.696628-00:00 0 [Note] [MY-011825] [Xtrabackup] recognized server arguments: --datadir=/var/lib/mysql
xtrabackup version 8.0.34-29 based on MySQL server 8.0.34 Linux (x86_64) (revision id: 5ba706ee)

This is required for PXB to use the InnoDB archiving feature. Otherwise, it will encounter a “Permission denied” error and will not use archiving as below:

2023-08-21T13:27:59.293520-00:00 0 [Note] [MY-011825] [Xtrabackup] xtrabackup redo_log_arch_dir is set to backup1:/var/lib/mysql-redo-archive/
2023-08-21T13:27:59.295379-00:00 0 [ERROR] [MY-011825] [Xtrabackup] failed to fetch query result select innodb_redo_log_archive_start('backup1', '1692710879293') : Cannot create redo log archive file '/var/lib/mysql-redo-archive/1692710879293/archive.94f4ab58-3d1c-11ee-9e4f-0af1fd7c44c9.000001.log' (OS errno: 13 - Permission denied)
2023-08-21T13:27:59.295429-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo Log Archiving is not used.

PXB will create a temporary directory for the archive file. Since redo log archiving is a MySQL process, it needs to be owned by the MySQL OS user to be able to write to it.

1.1) Option 1: Login as the MySQL OS user:

shell> sudo -H -u mysql bash

Execute:

shell> xtrabackup --no-lock=1 --compress --parallel=4 --host=localhost --user=root --password='password_string' --backup=1 --target-dir=/Backup/21Aug 2> /tmp/b0-with-redo-archiving-as-mysql-os-user.log

1.2) Option 2: Directly execute as the root OS user for the MySQL user.

You must add -c to the command, like:

shell> sudo -H -u mysql bash -c '<COMMAND>' 2> /tmp/backup.log

Execute:

shell> sudo -H -u mysql bash -c "xtrabackup --no-lock=1 --compress --parallel=4 --host=localhost --user=root --password='password_string' --backup=1 --target-dir=/Backup/21Aug" 2> /tmp/b0-with-redo-archiving-as-mysql-os-user.log

2) Verify archiving is used:

shell> less /tmp/b0-with-redo-archiving-as-mysql-os-user.log 
2023-08-22T13:36:02.773345-00:00 0 [Note] [MY-011825] [Xtrabackup] recognized server arguments: --datadir=/var/lib/mysql  
2023-08-22T13:36:02.773543-00:00 0 [Note] [MY-011825] [Xtrabackup] recognized client arguments: --no-lock=1 --compress --parallel=4 --host=localhost --user=root --password=* --backup=1 --target-dir=/Backup/22Aug  xtrabackup version 8.0.34-29 based on MySQL server 8.0.34 Linux (x86_64) (revision id: 5ba706ee) 230822 13:36:02  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;host=localhost' as 'root'  (using password: YES). 230822 13:36:02  version_check Connected to MySQL server 230822 13:36:02  version_check Executing a version check against the server... 230822 13:36:02  version_check Done. 
2023-08-22T13:36:02.899397-00:00 0 [Note] [MY-011825] [Xtrabackup] Connecting to MySQL server host: localhost, user: root, password: set, port: not set, socket: not set 2023-08-22T13:36:02.906059-00:00 0 [Note] [MY-011825] [Xtrabackup] Using server version 8.0.34 
.... 
.... 
2023-08-22T13:36:02.980089-00:00 0 [Note] [MY-011825] [Xtrabackup] Connecting to MySQL server host: localhost, user: root, password: set, port: not set, socket: not set 
2023-08-22T13:36:02.986698-00:00 0 [Note] [MY-011825] [Xtrabackup] xtrabackup redo_log_arch_dir is set to backup1:/var/lib/mysql-redo-archive/ 
2023-08-22T13:36:03.015207-00:00 0 [Note] [MY-011825] [Xtrabackup] Waiting for archive file '/var/lib/mysql-redo-archive//1692711362986/archive.94f4ab58-3d1c-11ee-9e4f-0af1fd7c44c9.000001.log' 
2023-08-22T13:36:03.091572-00:00 1 [Note] [MY-011825] [Xtrabackup] >> log scanned up to (19630018) 
.... 
.... 
2023-08-22T13:36:04.569714-00:00 0 [Note] [MY-011825] [Xtrabackup] Done: Compressing file /Backup/22Aug/xtrabackup_info.zst 
2023-08-22T13:36:05.583075-00:00 0 [Note] [MY-011825] [Xtrabackup] Transaction log of lsn (19630018) to (19630028) was copied. 
2023-08-22T13:36:05.790293-00:00 0 [Note] [MY-011825] [Xtrabackup] completed OK!

How to set --redo-log-arch-dir=name in the PXB command

PXB 8.0.34-29 introduces a new feature, --redo-log-arch-dir=name. This option sets the redo log archive directory if the directory is not already set on the server. Here is a short example demonstrating the functioning of the --redo-log-arch-dir parameter.

Suppose innodb_redo_log_archive_dirs is NULL, which means the redo log archive directory is not initially set in MySQL, as shown below.

mysql>  select @@innodb_redo_log_archive_dirs;
+--------------------------------+
| @@innodb_redo_log_archive_dirs |
+--------------------------------+
| NULL                           |
+--------------------------------+
1 row in set (0.01 sec)

Create a redo log file directory and give it the appropriate permissions:

[root@ip-xxx-xx-xx-xx ~]# cd /var/lib/mysql-redo-archive/
[root@ip-xxx-xx-xx-xx ~]# mkdir redo_arch_backup

### Check mysql-redo-archive folder Permissions
[root@ip-xxx-xx-xx-xx ~]# ls -ltrh /var/lib/ |grep archive 
drwx------.  4 mysql  mysql     51 Sep 11 17:16 mysql-redo-archive

As you can see, PXB is all set to use the new redo archive directory just by passing the archive directory path to the --redo-log-arch-dir parameter:

--redo-log-arch-dir=redo_arch_backup:/var/lib/mysql-redo-archive/

[root@ip-xxx-xx-xx-xx ~]# sudo -H -u mysql bash
bash-5.1$ xtrabackup --no-lock=1 --compress --parallel=4 --host=localhost --user=root --password='password_string' --backup=1 --target-dir=/Backup/11Sept23_bkp --redo-log-arch-dir=redo_arch_backup:/var/lib/mysql-redo-archive/ 2> /tmp/a1-with-redo-archiving-as-mysql-os-user.log

Verifying the PXB logs:

[root@ip-xxx-xx-xx-xx ~]# cat /tmp/a1-with-redo-archiving-as-mysql-os-user.log
2023-09-11T17:16:47.836951-00:00 0 [Note] [MY-011825] [Xtrabackup] recognized server arguments: --datadir=/var/lib/mysql
2023-09-11T17:16:47.837105-00:00 0 [Note] [MY-011825] [Xtrabackup] recognized client arguments: --no-lock=1 --compress --parallel=4 --host=localhost --user=root --password=* --backup=1 --target-dir=/Backup/11Sept23_bkp --redo_log_arch_dir=redo_arch_backup:/var/lib/mysql-redo-archive/
xtrabackup version 8.0.34-29 based on MySQL server 8.0.34 Linux (x86_64) (revision id: 5ba706ee)
230911 17:16:47  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;host=localhost' as 'root'  (using password: YES).
230911 17:16:47  version_check Connected to MySQL server
230911 17:16:47  version_check Executing a version check against the server...
230911 17:16:47  version_check Done.
2023-09-11T17:16:47.956847-00:00 0 [Note] [MY-011825] [Xtrabackup] Connecting to MySQL server host: localhost, user: root, password: set, port: not set, socket: not set
....
....
2023-09-11T17:16:48.028506-00:00 0 [Note] [MY-011825] [Xtrabackup] Connecting to MySQL server host: localhost, user: root, password: set, port: not set, socket: not set
2023-09-11T17:16:48.034515-00:00 0 [Note] [MY-011825] [Xtrabackup] xtrabackup redo_log_arch_dir is set to redo_arch_backup:/var/lib/mysql-redo-archive/
2023-09-11T17:16:48.056504-00:00 0 [Note] [MY-011825] [Xtrabackup] Waiting for archive file '/var/lib/mysql-redo-archive//1694452608034/archive.94f4ab58-3d1c-11ee-9e4f-0af1fd7c44c9.000001.log'
2023-09-11T17:16:48.137941-00:00 1 [Note] [MY-011825] [Xtrabackup] >> log scanned up to (19816218)
2023-09-11T17:16:48.139886-00:00 0 [Note] [MY-012953] [InnoDB] Disabling background ibuf IO read threads.
2023-09-11T17:16:48.343117-00:00 0 [Note] [MY-011825] [Xtrabackup] Generating a list of tablespaces
....
....
2023-09-11T17:16:49.496740-00:00 0 [Note] [MY-011825] [Xtrabackup] Compressing /Backup/11Sept23_bkp/backup-my.cnf.zst
2023-09-11T17:16:49.496789-00:00 0 [Note] [MY-011825] [Xtrabackup] Done: Compressing file /Backup/11Sept23_bkp/backup-my.cnf.zst
2023-09-11T17:16:49.498483-00:00 0 [Note] [MY-011825] [Xtrabackup] Compressing /Backup/11Sept23_bkp/xtrabackup_info.zst
2023-09-11T17:16:49.498521-00:00 0 [Note] [MY-011825] [Xtrabackup] Done: Compressing file /Backup/11Sept23_bkp/xtrabackup_info.zst
2023-09-11T17:16:50.501818-00:00 0 [Note] [MY-011825] [Xtrabackup] Transaction log of lsn (19816218) to (19816228) was copied.
2023-09-11T17:16:50.708435-00:00 0 [Note] [MY-011825] [Xtrabackup] completed OK!

mysql>  select @@innodb_redo_log_archive_dirs;
+-----------------------------------------------+
| @@innodb_redo_log_archive_dirs                |
+-----------------------------------------------+
| redo_arch_backup:/var/lib/mysql-redo-archive/ |
+-----------------------------------------------+
1 row in set (0.00 sec)

This means that if innodb_redo_log_archive_dirs is not set in MySQL, PXB 8.0.34-29 will set it through the --redo-log-arch-dir option and start using the redo log archive when a backup is initiated. When the option is not set explicitly, PXB will, by default, first check innodb_redo_log_archive_dirs. If this variable is set in MySQL, as demonstrated previously, PXB will use the redo log archiving directory assigned in the innodb_redo_log_archive_dirs variable while performing the backup.

Conversely, if the MySQL variable innodb_redo_log_archive_dirs is NULL and you did not explicitly specify --redo-log-arch-dir while taking the backup, PXB won't use the redo log archiving feature.
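The precedence just described can be summarized in a toy shell sketch. This is my own illustration of the behavior, not Percona's code:

```shell
#!/bin/sh
# Toy model of which archive directory wins, per the behavior above:
# a server-side innodb_redo_log_archive_dirs takes effect if already set;
# otherwise an explicit --redo-log-arch-dir is applied; otherwise archiving
# is simply skipped.
choose_arch_dir() {
  server_dirs="$1"   # innodb_redo_log_archive_dirs ("" when NULL)
  cli_opt="$2"       # --redo-log-arch-dir value ("" when not passed)
  if [ -n "$server_dirs" ]; then
    echo "$server_dirs"
  elif [ -n "$cli_opt" ]; then
    echo "$cli_opt"
  else
    echo "archiving-disabled"
  fi
}

choose_arch_dir "" ""                                               # archiving-disabled
choose_arch_dir "" "redo_arch_backup:/var/lib/mysql-redo-archive/"  # option applied
choose_arch_dir "backup1:/var/lib/mysql-redo-archive/" ""           # server value used
```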

Conclusion

This feature is crucial for heavy workloads where the backup storage does not have the same capabilities as the production storage and cannot keep up with the speed of the writes.

When enabled by the DBA, Percona XtraBackup will detect the archived redo log files and use them automatically. To verify whether the redo log archiving process was started and is still active, you can check the performance_schema.threads table as illustrated above. Alternatively, you can enable this feature explicitly in Percona XtraBackup with --redo-log-arch-dir=name.

Percona XtraBackup is a free, open source, complete online backup solution for all versions of Percona Server for MySQL and MySQL.

 

Try Percona XtraBackup today!

 

Related links:

https://docs.percona.com/percona-xtrabackup/8.0/release-notes/8.0/8.0.34-29.0.html

https://docs.percona.com/percona-xtrabackup/8.0/xtrabackup-option-reference.html#-redo-log-arch-dirname

https://dev.mysql.com/doc/refman/8.0/en/innodb-redo-log.html#innodb-redo-log-archiving

https://www.percona.com/blog/innodb-redo-log-archiving/

 

Sep
18
2023
--

Talking Drupal #416 – Single Directory Components

On today’s show we are talking about Single Directory Components in Drupal, how they differ from Web Components, and what their benefits are, with guests Mateu Bosch and Mike Herchel. We’ll also cover Component Libraries: Theme Server as our module of the week.

For show notes visit: www.talkingDrupal.com/416

Topics

  • What are Single Directory Components?
  • Where did the idea of adding Single Directory Components to Drupal come from?
  • Where does support for this stand in Drupal Core? Fully supported? Still need a contrib module?
  • How do they differ from Web Components? (Mike will take this one)
  • How do Single Directory Components make Drupal theme development easier?
  • What is the point of creating a schema for an SDC?
  • Can modules or themes override SDCs? How?
  • Can SDC be integrated into component library systems like Storybook? How?
  • Any other helpful contrib modules that enhance SDCs?
  • Does this at all help a headless setup?
  • How can someone get involved or help contribute to Single Directory Components?

Resources

Single Directory Components https://www.drupal.org/project/sdc JSON Schema https://json-schema.org/ SDC Display https://www.drupal.org/project/sdc_display SDC Styleguide https://www.drupal.org/project/sdc_styleguide Cl Devel https://www.drupal.org/project/cl_devel CL Server https://www.drupal.org/project/cl_server CL Generator https://www.drupal.org/project/cl_generator SDC Documentation https://www.drupal.org/project/drupal/issues/3345922 Mike’s blog https://herchel.com/ SDC Slack Channel (Components channel in Drupal Slack) #components https://drupal.slack.com/archives/C4EDNHFGS Drupal Board Elections https://www.drupal.org/association/board/elections

Guests

Mike Herchel – herchel.com @mikeherchel Mateu Bosch – mateuaguilo.com

Hosts

Nic Laflin – nLighteneddevelopment.com nicxvan John Picozzi – epam.com johnpicozzi Andy Blum – andy-blum.comandy_blum

Module of the Week

with Martin Anderson-Clutz – @mandclu

Component Libraries: Theme Server

This module lets you use component libraries, like Storybook, in your Drupal project, without Twig.js!

Sep
18
2023
--

New Versions of Percona Distribution for MongoDB, Percona Operator for MySQL: Release Roundup September 18, 2023

Percona Releases

Percona is a leading provider of unbiased, performance-first, open source database solutions that allow organizations to easily, securely, and affordably maintain business agility, minimize risks, and stay competitive, free from vendor lock-in. Percona software is designed for peak performance, uncompromised security, limitless scalability, and disaster-proofed availability.

Our Release Roundups showcase the latest Percona software updates, tools, and features to help you manage and deploy our software. It offers highlights, critical information, links to the full release notes, and direct links to the software or service itself to download.

Today’s post includes those releases and updates that have come out since September 4, 2023. Take a look.

Percona Distribution for MongoDB 6.0.9

Percona Distribution for MongoDB 6.0.9 was released on September 14, 2023. It is a freely available MongoDB database alternative, giving you a single solution that combines enterprise components from the open source community, designed and tested to work together.

Bug fixes and improvements introduced in MongoDB and included in Percona Distribution for MongoDB are the following:

  • Fixed the flow for converting a replica set into a sharded cluster by adding support for the drivers to communicate the signed $clusterTimes to shardsvr replica sets before and after the addShard command is run.
  • Fixed the issue with the incorrect output for the query where the $or operator rewrites the $elemMatch extra condition.
  • Blocked the $group min/max rewrite in timestamp if there is a non-meta filter.
  • Disallowed the retry to forcibly evict the page during reconciliation.

Percona Backup for MongoDB 2.2.1 provides the ability to increase the wait time for physical backup to start. This eliminates the PBM failure when creating $backupCursor takes longer than usual.

Download Percona Distribution for MongoDB 6.0.9

Percona Distribution for MongoDB 5.0.20

On September 7, 2023, Percona Distribution for MongoDB 5.0.20 was released. The bug fixes and improvements provided by MongoDB and included in Percona Server for MongoDB are the following:

  • Fixed the issue with the incorrect output for the query where the $or operator rewrites the $elemMatch extra condition.
  • Fixed commit point propagation for exhaust oplog cursors.
  • Blocked the $group min/max rewrite in timestamp if there is a non-meta filter.
  • Improved the reconciliation time and slow eviction for pages with lots of updates by avoiding saving the update chain when there are no updates to be written to the history store.
  • Fixed the Rollback to Stable behavior to skip tables with no time window information in the checkpoint.

Percona Backup for MongoDB 2.2.1 provides the ability to increase the wait time for physical backup to start. This eliminates the PBM failure when creating $backupCursor takes longer than usual.

Download Percona Distribution for MongoDB 5.0.20

Percona Distribution for MongoDB 4.4.24

Percona Distribution for MongoDB 4.4.24 was released on September 11, 2023. The bug fixes and improvements provided by MongoDB and included in Percona Server for MongoDB are the following:

  • Improved issue identification during oplog application on secondary nodes by recording the number of writeConflicts in serverStatus.
  • Added taking snapshot of LDAP host timing data before sorting.
  • Fixed the Rollback to Stable behavior to skip tables with no time window information in the checkpoint.

Download Percona Distribution for MongoDB 4.4.24

Percona Server for MongoDB 6.0.9-7

Percona Server for MongoDB 6.0.9-7 was released on September 19, 2023. It is an enhanced, source-available, and highly-scalable database that is a fully-compatible, drop-in replacement for MongoDB 6.0.9 Community Edition, based on MongoDB 6.0.9 Community Edition, and supports the upstream protocols and drivers. Improvements and bug fixes, provided by MongoDB and included in Percona Server for MongoDB are the following:

  • SERVER-60466 – Fixed the flow for converting a replica set into a sharded cluster by adding support for the drivers to communicate the signed $clusterTimes to shardsvr replica sets before and after the addShard command is run.
  • SERVER-74954 – Fixed the issue with the incorrect output for the query where the $or operator rewrites the $elemMatch extra condition.
  • SERVER-79136 – Blocked the $group min/max rewrite in timestamp if there is a non-meta filter.
  • WT-10759 – During reconciliation do not retry to forcibly evict the page.

Download Percona Server for MongoDB 6.0.9-7

Percona Server for MongoDB 5.0.20-17

Percona Server for MongoDB 5.0.20-17 was released on September 7, 2023. It’s a fully-compatible, drop-in replacement for MongoDB 5.0.20 Community Edition, supporting MongoDB 5.0.20 protocols and drivers. Percona Server for MongoDB 5.0.20-17 features a Docker image for ARM64 architectures.

Download Percona Server for MongoDB 5.0.20-17

Percona Server for MongoDB 4.4.24-23

On September 11, 2023, we released Percona Server for MongoDB 4.4.24-23, supporting protocols and drivers of MongoDB 4.4.24.

Download Percona Server for MongoDB 4.4.24-23

Percona Operator for MySQL based on Percona Server for MySQL 0.6.0

On September 5, 2023, we released Percona Operator for MySQL (PS-based) 0.6.0. It allows users to deploy MySQL clusters with both asynchronous and group replication topology. This release includes various stability improvements and bug fixes, getting the Operator closer to the General Availability stage. Version 0.6.0 of the Percona Operator for MySQL is still a tech preview release, and it is not recommended for production environments.

Download Percona Operator for MySQL (PS-based) 0.6.0

That’s it for this roundup, and be sure to follow us on Twitter to stay up-to-date on the most recent releases! Percona is a leader in providing best-of-breed enterprise-class support, consulting, managed services, training, and software for MySQL, MongoDB, PostgreSQL, MariaDB, and other open source databases in on-premises and cloud environments and is trusted by global brands to unify, monitor, manage, secure, and optimize their database environments.

Sep
15
2023
--

Convert MariaDB Binary Log File and Position-Based Replication to GTID Replication

Convert MariaDB Binary Log File

At Percona Managed Services, we manage Percona Server for MySQL, Community MySQL, and MariaDB. There are slight differences when configuring and managing MariaDB GTID replication. In this blog, we’ll show you how to convert MariaDB binary log file and position-based replication to GTID replication.

In my lab, we have two test nodes; both servers run Debian 11 and MariaDB 10.5.19 with mariabackup installed.

The PRIMARY server is deb11m8m (IP: 192.168.56.190), and the REPLICA server name is deb11m8s (IP: 192.168.56.191).

1. Install MariaDB on Debian 11 on PRIMARY and REPLICA

wget https://downloads.mariadb.com/MariaDB/mariadb_repo_setup
chmod +x  ./mariadb_repo_setup
./mariadb_repo_setup --mariadb-server-version="mariadb-10.5.19"
 apt-get install mariadb-server mariadb-backup
systemctl daemon-reload
root@deb11m8m:~# wget https://downloads.mariadb.com/MariaDB/mariadb_repo_setup
Connecting to downloads.mariadb.com (downloads.mariadb.com)|104.17.191.14|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 36188 (35K) [application/octet-stream]
Saving to: ‘mariadb_repo_setup’
2023-08-30 14:42:51 (26.0 MB/s) - ‘mariadb_repo_setup’ saved [36188/36188]
root@deb11m8m:~#  chmod +x  ./mariadb_repo_setup
root@deb11m8m:~#  ./mariadb_repo_setup --mariadb-server-version="mariadb-10.5.19"
# [info] Checking for script prerequisites.
# [warning] Found existing file at /etc/apt/sources.list.d/mariadb.list. Moving to /etc/apt/sources.list.d/mariadb.list.old_1
# [info] MariaDB Server version 10.5.19 is valid
# [info] Repository file successfully written to /etc/apt/sources.list.d/mariadb.list
# [info] Adding trusted package signing keys...
# [info] Running apt-get update…
# [info] Done adding trusted package signing keys
root@deb11m8m:~#
root@deb11m8m:~#  apt-get install mariadb-server mariadb-backup
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages were automatically installed and are no longer required:
 libconfig-inifiles-perl libmecab2 libopengl0
Use 'apt autoremove' to remove them.
The following additional packages will be installed:
  libdbd-mariadb-perl libmariadb3 mariadb-client-10.5 mariadb-client-core-10.5 mariadb-common mariadb-server-10.5 mariadb-server-core-10.5
Suggested packages:
  mailx mariadb-test netcat-openbsd
Selecting previously unselected package mariadb-client-core-10.5.
Preparing to unpack .../mariadb-client-core-10.5_1%3a10.5.19+maria~deb11_amd64.deb ...
Unpacking mariadb-client-core-10.5 (1:10.5.19+maria~deb11) ...
……
Setting up mariadb-server (1:10.5.19+maria~deb11) ...
Processing triggers for man-db (2.9.4-2) ...
Processing triggers for libc-bin (2.31-13+deb11u6) ...
root@deb11m8m:~#

1.1 Reset root user password and create a replication user

MariaDB [(none)]> ALTER USER 'root'@'localhost' IDENTIFIED BY '#######';
MariaDB [(none)]> GRANT PROXY ON ''@'%' TO 'root'@'localhost' WITH GRANT OPTION;
MariaDB [(none)]> FLUSH PRIVILEGES;
MariaDB [(none)]> CREATE USER 'repl'@'%' IDENTIFIED BY '#######';
MariaDB [(none)]> GRANT RELOAD, SUPER, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'repl'@'%';
MariaDB [(none)]> FLUSH PRIVILEGES;

1.2 Configure my.cnf on PRIMARY and REPLICA

PRIMARY

[mysqld]
pid-file                = /run/mysqld/mysqld.pid
basedir                 = /usr
server_id=1
bind-address=192.168.56.190
log-bin = /var/lib/mysql/bin_log
expire_logs_days        = 10
character-set-server  = utf8mb4
collation-server      = utf8mb4_general_ci

REPLICA

[mysqld]
pid-file                = /run/mysqld/mysqld.pid
basedir                 = /usr
server_id=2
bind-address=192.168.56.191
log-bin = /var/lib/mysql/bin_log
expire_logs_days        = 10
character-set-server  = utf8mb4
collation-server      = utf8mb4_general_ci

2. Let’s set up binary log and position-based replication between PRIMARY and REPLICA with the stream copy method.

2.1 on REPLICA

root@deb11m8s: nc -l -p 2222 | mbstream -x -C /var/lib/mysql

2.2 on PRIMARY

root@deb11m8m:/var/lib/mysql# mariabackup --stream=mbstream --parallel=4 --backup | nc -w 2 192.168.56.191 2222
[00] 2023-08-30 18:02:08 Connecting to server host: localhost, user: root, password: set, port: not set, socket: /run/mysqld/mysqld.sock
[00] 2023-08-30 18:02:08 Using server version 10.5.19-MariaDB-1:10.5.19+maria~deb11-log
 mariabackup based on MariaDB server 10.5.19-MariaDB debian-linux-gnu (x86_64)
[00] 2023-08-30 18:02:08 uses posix_fadvise().
[00] 2023-08-30 18:02:08 cd to /var/lib/mysql/
 [00] 2023-08-30 18:02:08 open files limit requested 0, set to 1024
 [00] 2023-08-30 18:02:08 mariabackup: using the following InnoDB configuration:
 [00] 2023-08-30 18:02:08 innodb_data_home_dir =
 [00] 2023-08-30 18:02:08 innodb_data_file_path = ibdata1:12M:autoextend
 [00] 2023-08-30 18:02:08 innodb_log_group_home_dir = ./
 [00] 2023-08-30 18:02:08 InnoDB: Using Linux native AIO
 2023-08-30 18:02:08 0 [Note] InnoDB: Number of pools: 1
 [00] 2023-08-30 18:02:08 mariabackup: Generating a list of tablespaces
 [00] 2023-08-30 18:02:08 >> log scanned up to (10912597)
 [00] 2023-08-30 18:02:08 mariabackup: Starting 4 threads for parallel data files transfer
[03] 2023-08-30 18:02:08 Streaming ./mysql/innodb_index_stats.ibd
[04] 2023-08-30 18:02:08 Streaming ibdata1
……..
'/var/lib/mysql/xtrabackup_backupfiles/'
 [00] 2023-08-30 18:02:10 MySQL binlog position: filename 'bin_log.000002', position '326', GTID of the last change ''
 [00] 2023-08-30 18:02:10 Streaming backup-my.cnf
[00] 2023-08-30 18:02:10 Streaming xtrabackup_info
[00] 2023-08-30 18:02:10 Redo log (from LSN 10912585 to 10912597) was copied.
 [00] 2023-08-30 18:02:10 completed OK!
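The mbstream-over-nc pipeline above is a generic stream-copy pattern: a producer writes an archive to stdout, a consumer unpacks it from stdin. A minimal local sketch of the same pattern, using tar in place of mariabackup/mbstream and a plain pipe in place of nc (all paths illustrative):

```shell
#!/bin/sh
# Stream a "backup" from a source dir straight into a destination dir,
# mirroring the PRIMARY | nc | REPLICA pipeline without a server or network.
SRC=/tmp/streamdemo/src
DST=/tmp/streamdemo/dst
mkdir -p "$SRC" "$DST"
echo 'sample tablespace bytes' > "$SRC/ibdata1"
# Producer archives to stdout; consumer unpacks from stdin.
( cd "$SRC" && tar cf - . ) | ( cd "$DST" && tar xf - )
cat "$DST/ibdata1"    # prints: sample tablespace bytes
```

The advantage of streaming is that no intermediate backup file ever lands on the source host's disk.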

2.3 On REPLICA, get binlog information

root@deb11m8s:/var/lib/mysql# cat xtrabackup_binlog_info
bin_log.000002 326
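These two values feed straight into the upcoming CHANGE MASTER TO statement. A small sketch of pulling them out in shell (the sample file is written locally here for illustration; on the replica you would read the real /var/lib/mysql/xtrabackup_binlog_info):

```shell
#!/bin/sh
# Recreate a sample xtrabackup_binlog_info (filename <tab> position) and
# extract the coordinates needed for CHANGE MASTER TO.
printf 'bin_log.000002\t326\n' > /tmp/xtrabackup_binlog_info
read -r binlog_file binlog_pos < /tmp/xtrabackup_binlog_info
echo "MASTER_LOG_FILE='$binlog_file', MASTER_LOG_POS=$binlog_pos"
```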

2.4 On REPLICA, prepare the backup

root@deb11m8s:/var/lib/mysql# mariabackup --prepare --use-memory=1G --target-dir=/var/lib/mysql
mariabackup based on MariaDB server 10.5.19-MariaDB debian-linux-gnu (x86_64)
[00] 2023-08-30 18:08:32 cd to /var/lib/mysql/
[00] 2023-08-30 18:08:32 open files limit requested 0, set to 1024
.
2023-08-30 18:08:32 0 [Note] InnoDB: Initializing buffer pool, total size = 1073741824, chunk size = 1073741824
2023-08-30 18:08:32 0 [Note] InnoDB: Completed initialization of buffer pool
2023-08-30 18:08:32 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=10912585,10912585
[00] 2023-08-30 18:08:32 Last binlog file , position 0
[00] 2023-08-30 18:08:32 completed OK!
root@deb11m8s:/var/lib/mysql#

2.5 Set up the replication 

root@deb11m8s:/var/lib/mysql# chown -R mysql:mysql /var/lib/mysql
root@deb11m8s:/var/lib/mysql#  systemctl start mysql
root@deb11m8s:/var/lib/mysql# mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 6
Server version: 10.5.19-MariaDB-0+deb11u2-log Debian 11
MariaDB [(none)]> stop slave; reset slave; reset slave all; reset master;
Query OK, 0 rows affected, 1 warning (0.000 sec)
Query OK, 0 rows affected (0.001 sec)
Query OK, 0 rows affected (0.000 sec)
Query OK, 0 rows affected (0.009 sec)
MariaDB [(none)]> CHANGE MASTER TO
    MASTER_HOST='192.168.56.190',
    MASTER_USER='repl',
    MASTER_PASSWORD='#######',
    MASTER_PORT=3306,
    MASTER_LOG_FILE='bin_log.000002',
   MASTER_LOG_POS=326,
   MASTER_CONNECT_RETRY=10;
Query OK, 0 rows affected (0.022 sec)
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.003 sec)
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
            Slave_IO_State: Waiting for master to send event
                 Master_Host: 192.168.56.190
                  Master_User: repl
                 Master_Port: 3306
                Connect_Retry: 10
              Master_Log_File: bin_log.000002
          Read_Master_Log_Pos: 326
                Relay_Log_File: mysqld-relay-bin.000002
                 Relay_Log_Pos: 553
       Relay_Master_Log_File: bin_log.000002
       Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
           …
        Exec_Master_Log_Pos: 326
           Relay_Log_Space: 863
         ….
          Master_SSL_Crlpath:
            Using_Gtid: No
                 Gtid_IO_Pos:
   Replicate_Do_Domain_Ids:
  Replicate_Ignore_Domain_Ids:
                Parallel_Mode: optimistic
                SQL_Delay: 0
         SQL_Remaining_Delay: NULL
     Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
             Slave_DDL_Groups: 0
Slave_Non_Transactional_Groups: 0
   Slave_Transactional_Groups: 0
1 row in set (0.000 sec)

We have now set up binary log and position-based replication between the PRIMARY and REPLICA servers. Our next step is to convert this replication to GTID-based replication.

MariaDB and MySQL have different GTID implementations.

MariaDB GTID consists of three numbers separated with dashes ‘-‘. For example:

0-1-10: The first number, 0, is the domain ID, the second number is the server ID, and the third number is the sequence number.

MySQL GTID has two parts, source_id  and transaction_id, separated by a colon character (:) .

e.g., 3E11FA47-71CA-11E1-9E33-C80AA9429562:23

The source_id  is the source server’s server_uuid, and the transaction_id is the transaction sequence number committed on the source.
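The two formats are easy to take apart with plain string handling (nothing MariaDB-specific here, just a sketch of the structure described above):

```shell
#!/bin/sh
# MariaDB GTID: domain-server_id-sequence
mariadb_gtid="0-1-10"
IFS=- read -r domain server_id seq <<EOF
$mariadb_gtid
EOF
echo "domain=$domain server_id=$server_id seq=$seq"

# MySQL GTID: source_uuid:transaction_id (the UUID itself contains dashes,
# so we split on the colon instead)
mysql_gtid="3E11FA47-71CA-11E1-9E33-C80AA9429562:23"
source_id=${mysql_gtid%:*}
txn_id=${mysql_gtid##*:}
echo "source_id=$source_id txn_id=$txn_id"
```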

Let’s continue.

On PRIMARY, we enable GTID by running the below command.

MariaDB [(none)]> SET GLOBAL gtid_domain_id = 1;
MariaDB [(none)]> SET GLOBAL gtid_strict_mode = 1;

Persist these settings in my.cnf, and verify GTID is enabled with the query below.

MariaDB [(none)]> select @@server_id,@@bind_address,@@log_bin,@@gtid_strict_mode,@@gtid_domain_id;
+-------------+----------------+-----------+--------------------+------------------+
| @@server_id | @@bind_address | @@log_bin | @@gtid_strict_mode | @@gtid_domain_id |
+-------------+----------------+-----------+--------------------+------------------+
|           1 | 192.168.56.190 |         1 |                  1 |                1 |
+-------------+----------------+-----------+--------------------+------------------+
1 row in set (0.000 sec)

ON REPLICA

MariaDB [(none)]> SET GLOBAL gtid_domain_id = 2;
MariaDB [(none)]> SET GLOBAL gtid_strict_mode = 1;

Persist these settings in my.cnf, and verify GTID is enabled with the query below.

MariaDB [(none)]> select @@server_id,@@bind_address,@@log_bin,@@gtid_strict_mode,@@gtid_domain_id;
+-------------+----------------+-----------+--------------------+------------------+
| @@server_id | @@bind_address | @@log_bin | @@gtid_strict_mode | @@gtid_domain_id |
+-------------+----------------+-----------+--------------------+------------------+
|           2 | 192.168.56.191 |         1 |                  1 |                2 |
+-------------+----------------+-----------+--------------------+------------------+
1 row in set (0.000 sec)

On REPLICA, we need to stop replication and extract the values of Relay_Master_Log_File and Exec_Master_Log_Pos from the output.

MariaDB [test_db]> STOP SLAVE;
Query OK, 0 rows affected (0.007 sec)
MariaDB [test_db]> show slave status\G
*************************** 1. row ***************************
            Slave_IO_State:
               Master_Host: 192.168.56.190
            Master_User: repl
                Master_Port: 3306
               Connect_Retry: 10
             Master_Log_File: bin_log.000002
        Read_Master_Log_Pos: 1206
               Relay_Log_File: mysqld-relay-bin.000007
              Relay_Log_Pos: 553
   Relay_Master_Log_File: bin_log.000002
             Slave_IO_Running: No
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
      Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
 Replicate_Wild_Ignore_Table:
                  Last_Errno: 0
                  Last_Error:
                Skip_Counter: 0
 Exec_Master_Log_Pos: 1206
           Relay_Log_Space: 863
    ……..
Slave_Non_Transactional_Groups: 0
   Slave_Transactional_Groups: 0
1 row in set (0.000 sec)

In our case, the Relay_Master_Log_File value is bin_log.000002, and the Exec_Master_Log_Pos is 1206.

On PRIMARY, we could use the below query to get the GTID position that corresponds to these binary log coordinates on PRIMARY.

MariaDB [test_db]> SELECT BINLOG_GTID_POS('bin_log.000002', 1206),@@hostname;
+-----------------------------------------+------------+
| BINLOG_GTID_POS('bin_log.000002', 1206) | @@hostname |
+-----------------------------------------+------------+
| 1-1-1,0-1-4                             | deb11m8m   |
+-----------------------------------------+------------+
1 row in set (0.000 sec)

We can now modify the replica configuration on the REPLICA server deb11m8s by executing the following statements:

MariaDB [test_db]> SET GLOBAL gtid_slave_pos = '1-1-1,0-1-4';
MariaDB [test_db]> CHANGE MASTER TO master_use_gtid=slave_pos;
MariaDB [test_db]> START SLAVE;
MariaDB [test_db]> show slave status\G
    *************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                     Master_Host: 192.168.56.190
                    Master_User: repl
                      Master_Port: 3306
                 Connect_Retry: 10
              Master_Log_File: bin_log.000002
          Read_Master_Log_Pos: 1391
                 Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 910
         Relay_Master_Log_File: bin_log.000002
 Slave_IO_Running: Yes
                  Slave_SQL_Running: Yes
                Replicate_Do_DB:
               Replicate_Ignore_DB:
              Replicate_Do_Table:
           Replicate_Ignore_Table:
         Replicate_Wild_Do_Table:
     Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                    Skip_Counter: 0
              Exec_Master_Log_Pos: 1391
                Relay_Log_Space: 1220
      …….
              Master_SSL_Key:
           Seconds_Behind_Master: 0
  Master_SSL_Verify_Server_Cert: No
                  Last_IO_Errno: 0
                 Last_IO_Error:
                   Last_SQL_Errno: 0
                    Last_SQL_Error:
       Replicate_Ignore_Server_Ids:
                 Master_Server_Id: 1
                  Master_SSL_Crl:
              Master_SSL_Crlpath:
                      Using_Gtid: Slave_Pos
                      Gtid_IO_Pos: 1-1-2,0-1-4
           Replicate_Do_Domain_Ids:
       Replicate_Ignore_Domain_Ids:
                    Parallel_Mode: optimistic
                         SQL_Delay: 0
               SQL_Remaining_Delay: NULL
     Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
                Slave_DDL_Groups: 0
   Slave_Non_Transactional_Groups: 0
        Slave_Transactional_Groups: 1
   1 row in set (0.000 sec)

The SHOW SLAVE STATUS output (Using_Gtid: Slave_Pos and Gtid_IO_Pos: 1-1-2,0-1-4) indicates that the replica uses GTIDs to track replication. We can use the same commands to set up GTID replication on other replicas as well.

When running CHANGE MASTER TO master_use_gtid=slave_pos; and setting the MASTER_USE_GTID replication parameter, you have the option of enabling Global Transaction IDs to use either the current_pos or slave_pos values.

Using the value current_pos causes the replica to set its position based on the gtid_current_pos system variable, which is a union of gtid_binlog_pos and gtid_slave_pos. Using the value slave_pos causes the replica to instead set its position based on the gtid_slave_pos system variable. You may run into issues when you use the value current_pos if you write any local transactions on the replica. For details, please refer to https://mariadb.com/kb/en/gtid/.

On Community MySQL or Percona Server for MySQL, the parameters to enable GTID replication are
gtid_mode=ON and enforce_gtid_consistency=ON.

The command is:

change master to
master_host = '192.168.1.120',
master_port=3306,
master_user = 'repl',
master_password = 'password',
master_auto_position=1;

Conclusion

We can see that the procedure to enable GTIDs on MariaDB replication is simple, but the commands and parameters are a bit different.

As mentioned, Percona offers Support and Managed Services for Community versions of MariaDB. At the same time, we encourage MariaDB users to explore Percona Server for MySQL as an alternative. Should you make the decision to migrate to Percona Software for MySQL, our experts will assist throughout the migration process and support you after the migration is complete.

 

Contact us to discuss migration options

 

Hope this is helpful for your daily MySQL and MariaDB management.

Sep
14
2023
--

Complete Walkthrough: MySQL to ClickHouse Replication Using MaterializedMySQL Engine

MySQL to ClickHouse Replication

MySQL is an outstanding open source transactional database used by most web-based applications and is very good at handling OLTP workloads. However, modern business is very much dependent on analytical data. ClickHouse is a columnar database that handles analytical workloads quickly. I recommend you read our previous blog, Using ClickHouse as an Analytic Extension for MySQL, from Vadim to know more about this.

In this blog post, I will show you how to replicate MySQL data in real-time to ClickHouse. I am going to use the MaterializedMySQL engine for this purpose. The blog post contains the following topics.

  • MaterializedMySQL Engine – Overview
  • Prerequisites for the Replication
  • Replication setup
  • Replication testing
  • Understanding ReplacingMergeTree
  • What happens if the Replication event fails?
  • What happens if MySQL or ClickHouse restarted?
  • Conclusion

MaterializedMySQL Engine – Overview

  • MaterializedMySQL engine is an experimental release from the ClickHouse team.
  • By default, it will create a database on ClickHouse with all the tables and their data. (We can do the filter as well.)
  • Schema conversion from MySQL to ClickHouse will be taken care of by the engine. No manual work!
  • Once we have configured the setup, the ClickHouse server will act as a replica.
  • ClickHouse will read the MySQL binary logs and perform the replication.
  • Replication will work well with DDL and DML statements.
  • ReplacingMergeTree will be used to perform the replication.

Prerequisites for the replication

We must meet certain requirements to configure the replication from MySQL to ClickHouse. I have split them into two parts: steps to be done on the MySQL end and steps to be done on the ClickHouse end.

From the MySQL end

MySQL should be configured with the following parameters:

[mysqld]

gtid_mode = ON
enforce_gtid_consistency = ON
binlog_format = ROW
binlog_row_image = FULL
server-id = 100

To configure the replication, we have to create a MySQL user with the “REPLICATION SLAVE, REPLICATION CLIENT, SELECT, RELOAD” privileges. Make sure the user is created with “mysql_native_password” authentication. When you initiate the replication, ClickHouse will copy the existing data from the MySQL table to the ClickHouse table and then start the replication from the binary logs. That is why we need the SELECT privilege as well.

percona labs > create user ch_repl@'192.168.64.13' identified with mysql_native_password by 'Percona$331';
Query OK, 0 rows affected (0.01 sec)

percona labs > grant replication slave, replication client, reload, select on *.* to ch_repl@'192.168.64.13';
Query OK, 0 rows affected (0.02 sec)

percona labs > flush privileges;
Query OK, 0 rows affected (0.01 sec)

From the ClickHouse end

Make sure you have MySQL port access from the ClickHouse server to the MySQL server.

root@clickhouse:~# telnet 192.168.64.7 3306
Trying 192.168.64.7...
Connected to 192.168.64.7.

The MaterializedMySQL engine is an experimental release, so you should enable the parameter (allow_experimental_database_materialized_mysql) in the ClickHouse configuration. This can be enabled in the user profile (file: /etc/clickhouse-server/users.xml).

#vi /etc/clickhouse-server/users.xml
    <profiles>
        <default>
           <!-- Allowing Experimental feature to enable the replication from MySQL to ClickHouse -->
           <allow_experimental_database_materialized_mysql>1</allow_experimental_database_materialized_mysql>
        </default>
    </profiles>

Activating this parameter does not require a ClickHouse restart. You can verify it using the following command.

clickhouse :) select name, value,changed,description from settings where name = 'allow_experimental_database_materialized_mysql'\G
Query id: 6ad9c836-1c95-48a0-bb2f-ef89474618bf
Row 1:
──────
name:        allow_experimental_database_materialized_mysql
value:       1
changed:     1
description: Allow to create database with Engine=MaterializedMySQL(...).
1 row in set. Elapsed: 0.007 sec.

Now, you can configure the replication from MySQL to ClickHouse.

Replication setup

To configure the replication, I have created two servers with the following hostnames. Both servers are running their respective databases.

  • mysql
  • clickhouse

At MySQL, I have created the table “percona.herc” which has five records. 

percona labs > use percona
Database changed

percona labs > show tables;
+-------------------+
| Tables_in_percona |
+-------------------+
| herc              |
+-------------------+
1 row in set (0.00 sec)

percona labs > select * from herc;
+----+------+---------------------+--------+
| id | name | c_time              | active |
+----+------+---------------------+--------+
|  1 | ab   | 2023-08-19 01:28:32 | 1      |
|  2 | cd   | 2023-08-19 01:28:36 | 1      |
|  3 | ef   | 2023-08-19 01:28:39 | 1      |
|  4 | gh   | 2023-08-19 01:28:42 | 1      |
|  5 | ij   | 2023-08-19 01:28:45 | 1      |
+----+------+---------------------+--------+
5 rows in set (0.00 sec)

To configure the replication, we have to run the following command on the ClickHouse instance.

Syntax:

create database <database_name> engine = MaterializeMySQL('source host', 'source database', 'source user', 'source password') settings <options>;

In action:

clickhouse :) create database perconaCH engine = MaterializeMySQL('192.168.64.7', 'percona', 'ch_repl', 'Percona$331') settings allows_query_when_mysql_lost=true, max_wait_time_when_mysql_unavailable=10000, materialized_mysql_tables_list='herc';
CREATE DATABASE perconaCH
ENGINE = MaterializeMySQL('192.168.64.7', 'percona', 'ch_repl', 'Percona$331')
SETTINGS allows_query_when_mysql_lost = 1, max_wait_time_when_mysql_unavailable = 10000, materialized_mysql_tables_list = 'herc'
Query id: 499f2057-e879-43ae-b406-eb7cdecd1e82
Ok.
0 rows in set. Elapsed: 0.046 sec.

Let’s verify the data in ClickHouse.

clickhouse :) use perconaCH
USE perconaCH
Query id: 2d534f39-871a-4718-8f24-daa855fe3311
Ok.
0 rows in set. Elapsed: 0.002 sec.

clickhouse :) show tables;
SHOW TABLES
Query id: 6b0763ef-d78e-41aa-8e33-1b7bd1b047da
┌─name─┐
│ herc │
└──────┘
1 row in set. Elapsed: 0.004 sec.

clickhouse :) select * from herc;
SELECT *
FROM herc
Query id: 84bed28e-3db6-4711-b5cf-ad858eebdce2
┌─id─┬─name─┬──────────────c_time─┬─active─┐
│  1 │ ab   │ 2023-08-19 01:28:32 │ 1      │
│  2 │ cd   │ 2023-08-19 01:28:36 │ 1      │
│  3 │ ef   │ 2023-08-19 01:28:39 │ 1      │
│  4 │ gh   │ 2023-08-19 01:28:42 │ 1      │
│  5 │ ij   │ 2023-08-19 01:28:45 │ 1      │
└────┴──────┴─────────────────────┴────────┘
5 rows in set. Elapsed: 0.005 sec.

As you can see, the table “herc” was created, and the data has been copied. (Note: You can use a different database name in ClickHouse.)

ClickHouse has its own set of data types. When we initiate the replication process, ClickHouse automatically converts the MySQL data types to the closest ClickHouse equivalents. For example, below, I am sharing the table structure from MySQL and from ClickHouse; you can compare the data types for the table we created. You can find more details here.

MySQL

percona labs > show create table herc\G
*************************** 1. row ***************************
       Table: herc
Create Table: CREATE TABLE `herc` (
  `id` int NOT NULL AUTO_INCREMENT,
  `name` varchar(16) DEFAULT NULL,
  `c_time` datetime DEFAULT CURRENT_TIMESTAMP,
  `active` enum('1','0') DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=18 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

ClickHouse

clickhouse :) show create table herc\G
Row 1:
──────
statement: CREATE TABLE perconaCH.herc
(
    `id` Int32,
    `name` Nullable(String),
    `c_time` Nullable(DateTime),
    `active` Nullable(Enum8('1' = 1, '0' = 2)),
    `_sign` Int8 MATERIALIZED 1,
    `_version` UInt64 MATERIALIZED 1,
    INDEX _version _version TYPE minmax GRANULARITY 1
)
ENGINE = ReplacingMergeTree(_version)
PARTITION BY intDiv(id, 4294967)
ORDER BY tuple(id)
SETTINGS index_granularity = 8192
1 row in set. Elapsed: 0.003 sec.

Note: There are two new columns, “_sign” and “_version.” They are created to manage data deduplication by ReplacingMergeTree. We will look at this later in this post.

To understand how ClickHouse copies the existing data from MySQL, I am sharing the logs collected from MySQL’s general log. You can see the SELECT statement that is used to copy the data. Before copying the data, it used “FLUSH TABLES WITH READ LOCK → REPEATABLE READ → START TRANSACTION → UNLOCK TABLES”.

2023-08-19T14:14:38.543090Z   12 Connect ch_repl@192.168.64.13 on percona using SSL/TLS
2023-08-19T14:14:38.544934Z   12 Query SET NAMES utf8mb4
2023-08-19T14:14:38.546775Z   12 Query SHOW VARIABLES
2023-08-19T14:14:38.557433Z   12 Query SHOW VARIABLES
2023-08-19T14:14:38.560367Z   12 Query SHOW GRANTS FOR CURRENT_USER()
2023-08-19T14:14:38.561910Z   12 Query FLUSH TABLES
2023-08-19T14:14:38.568580Z   12 Query FLUSH TABLES WITH READ LOCK
2023-08-19T14:14:38.569364Z   12 Query SHOW MASTER STATUS
2023-08-19T14:14:38.570542Z   12 Query SHOW VARIABLES WHERE Variable_name = 'binlog_checksum'
2023-08-19T14:14:38.573362Z   12 Query SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ
2023-08-19T14:14:38.574353Z   12 Query START TRANSACTION /*!40100 WITH CONSISTENT SNAPSHOT */
2023-08-19T14:14:38.575577Z   12 Query SELECT TABLE_NAME AS table_name FROM INFORMATION_SCHEMA.TABLES  WHERE TABLE_TYPE != 'VIEW' AND TABLE_SCHEMA = 'percona'
2023-08-19T14:14:38.577692Z   12 Query SHOW CREATE TABLE percona.herc
2023-08-19T14:14:38.579019Z   12 Query UNLOCK TABLES
2023-08-19T14:14:38.587415Z   12 Query SELECT COLUMN_NAME AS column_name, COLUMN_TYPE AS column_type FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = 'percona' AND TABLE_NAME = 'herc' ORDER BY ORDINAL_POSITION
2023-08-19T14:14:38.589956Z   12 Query SELECT `id`,`name`,`c_time`,`active` FROM percona.herc
2023-08-19T14:14:38.591538Z   12 Query COMMIT
2023-08-19T14:14:38.592842Z   13 Connect ch_repl@192.168.64.13 on  using TCP/IP
2023-08-19T14:14:38.593196Z   13 Query SET @master_binlog_checksum = 'CRC32'
2023-08-19T14:14:38.593536Z   13 Query SET @master_heartbeat_period = 1000000000
2023-08-19T14:14:38.594339Z   13 Binlog Dump GTID Log: '' Pos: 4 GTIDs: '030f9a68-3dfc-11ee-b900-525400edf935:1-44'
2023-08-19T14:14:38.595585Z   12 Query SHOW VARIABLES WHERE Variable_name = 'binlog_checksum'

The following two threads will be initiated at the MySQL end: one to copy the data and one for replication purposes.

| 12 | ch_repl         | 192.168.64.13:44142 | percona | Sleep            |   479 |                                                                 | NULL                                         | PRIMARY          |
| 13 | ch_repl         | 192.168.64.13:44152 | NULL    | Binlog Dump GTID |   479 | Source has sent all binlog to replica; waiting for more updates | NULL                                         | PRIMARY          |

On the ClickHouse end, the logs are very verbose trace logs, and we can find some useful information there (file: /var/log/clickhouse-server/clickhouse-server.log). For example:

2023.08.19 03:50:22.755656 [ 4787 ] {} <Debug> mysqlxx::Pool: Connecting to percona@192.168.64.7:3306 as user ch_repl
2023.08.19 03:50:22.755840 [ 4787 ] {} <Debug> mysqlxx::Pool: Creating a new MySQL connection to percona@192.168.64.7:3306 as user ch_repl with settings: connect_timeout=60, read_write_timeout=1800
2023.08.19 19:44:38.557707 [ 5088 ] {76b80228-9870-4b41-b1ad-1d47d1cdda65} <Debug> executeQuery: (internal) /*Materialize MySQL step 1: execute MySQL DDL for dump data*/ EXTERNAL DDL FROM MySQL(perconaCH, percona) CREATE TABLE `herc` ( `id` int NOT NULL AUTO_INCREMENT, `name` varchar(16) DEFAULT NULL, `c_time` datetime DEFAULT CURRENT_TIMESTAMP, `active` enum('1','0') DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=18 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci (stage: Complete)
2023.08.19 19:44:38.558200 [ 5088 ] {76b80228-9870-4b41-b1ad-1d47d1cdda65} <Debug> executeQuery: (internal) /* Rewritten MySQL DDL Query */ CREATE TABLE perconaCH.herc (`id` Int32, `name` Nullable(String), `c_time` Nullable(DateTime), `active` Nullable(Enum8('1' = 1, '0' = 2)), `_sign` Int8() MATERIALIZED 1, `_version` UInt64() MATERIALIZED 1, INDEX _version _version TYPE minmax GRANULARITY 1) ENGINE = ReplacingMergeTree(_version) PARTITION BY intDiv(id, 4294967) ORDER BY tuple(id) (stage: Complete)
2023.08.19 19:44:38.560678 [ 5088 ] {76b80228-9870-4b41-b1ad-1d47d1cdda65} <Debug> perconaCH.herc (14fd18a9-7030-44a8-ad21-13d0b7c6bfd3): Loading data parts

You can see the CREATE TABLE statement from MySQL and how ClickHouse rewrites it from the above logs.

So, all set! We were able to configure the replication, and the data has been copied from MySQL to ClickHouse. As the next step, we can start testing the replication.

Replication testing

UPDATE on MySQL

In MySQL

percona labs > select * from herc where id = 5;
+----+------+---------------------+--------+
| id | name | c_time              | active |
+----+------+---------------------+--------+
|  5 | ij   | 2023-08-19 01:28:45 | 1      |
+----+------+---------------------+--------+
1 row in set (0.00 sec)

percona labs > update herc set name = 'Percona' where id = 5;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

percona labs > select * from herc where id = 5;
+----+---------+---------------------+--------+
| id | name    | c_time              | active |
+----+---------+---------------------+--------+
|  5 | Percona | 2023-08-19 01:28:45 | 1      |
+----+---------+---------------------+--------+
1 row in set (0.00 sec)

At ClickHouse

clickhouse :) select * from herc where id = 5;
SELECT *
FROM herc
WHERE id = 5
Query id: a087e52e-cc58-4057-b608-ad41b594a937
┌─id─┬─name────┬──────────────c_time─┬─active─┐
│  5 │ Percona │ 2023-08-19 01:28:45 │ 1      │
└────┴─────────┴─────────────────────┴────────┘
1 row in set. Elapsed: 0.006 sec.

We can see the data has been updated in ClickHouse. Let’s try to get more details from the ClickHouse log file regarding the UPDATE statement (file: /var/log/clickhouse-server/clickhouse-server.log).

=== XID_EVENT ===
Timestamp: 1692455972
Event Type: XID_EVENT
Server ID: 100
Event Size: 31
Log Pos: 1906
Flags: 0
XID: 108

2023.08.19 20:09:33.075649 [ 5088 ] {b1733f81-e3a9-40b9-a8f9-6df6ceaceea5} <Debug> executeQuery: (internal) /*Materialize MySQL step 1: execute dump data*/ INSERT INTO `herc` (id, name, c_time, active, _sign, _version) VALUES (stage: Complete)
2023.08.19 20:09:33.076804 [ 5088 ] {b1733f81-e3a9-40b9-a8f9-6df6ceaceea5} <Trace> perconaCH.herc (14fd18a9-7030-44a8-ad21-13d0b7c6bfd3): Trying to reserve 1.00 MiB using storage policy from min volume index 0
2023.08.19 20:09:33.078745 [ 5088 ] {} <Information> MaterializedMySQLSyncThread: MySQL executed position:

=== Binlog Position ===
Binlog: binlog.000007
Position: 1906
GTIDSets: 030f9a68-3dfc-11ee-b900-525400edf935:1-46

As you can see above, logs were generated for the single UPDATE statement. You can find more useful information related to the binary logs: the position, GTID, XID number, etc. This can be used to debug and monitor the replication events.

And, as you can see above, the UPDATE statement was converted to an INSERT statement with the “_sign” and “_version” columns. To understand this, we have to understand how ReplacingMergeTree works.

Understanding ReplacingMergeTree

ReplacingMergeTree deduplicates the data using the “_version” and “_sign” columns in the background. Let’s look at this in a practical way.

Creating a new record in MySQL

percona labs > insert into herc values (7,'testRep','2023-08-19 01:29:11',1);
Query OK, 1 row affected (0.01 sec)

In ClickHouse, this time, I will use all the columns in the SELECT statement instead of “SELECT *.”

clickhouse :) select id,name,c_time,active,_sign, _version from herc where id = 7;
Query id: fc272a7c-9c03-4061-b76d-2728aadbec31
┌─id─┬─name────┬──────────────c_time─┬─active─┬─_sign─┬─_version─┐
│  7 │ testRep │ 2023-08-19 01:29:11 │ 1      │     1 │        1 │
└────┴─────────┴─────────────────────┴────────┴───────┴──────────┘
1 row in set. Elapsed: 0.006 sec.

You can see this is the first version of data “_version = 1”.

Let’s do UPDATE on MySQL:

percona labs > update herc set name = 'UPDtestRep' where id = 7;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

At ClickHouse:

clickhouse :) select id,name,c_time,active,_sign, _version from herc where id = 7;
Query id: f498186c-cc9d-4ce1-9613-af491b3850e8
┌─id─┬─name────┬──────────────c_time─┬─active─┬─_sign─┬─_version─┐
│  7 │ testRep │ 2023-08-19 01:29:11 │ 1      │     1 │        1 │
└────┴─────────┴─────────────────────┴────────┴───────┴──────────┘
┌─id─┬─name───────┬──────────────c_time─┬─active─┬─_sign─┬─_version─┐
│  7 │ UPDtestRep │ 2023-08-19 01:29:11 │ 1      │     1 │        2 │
└────┴────────────┴─────────────────────┴────────┴───────┴──────────┘
2 rows in set. Elapsed: 0.005 sec.

You can see two records for “id = 7”. This means it is keeping the previous version of data “_version = 1, name = testRep”, as well as the latest version “_version = 2, name = UPDtestRep”.

Now, let’s execute the DELETE statement on MySQL.

percona labs > delete from herc where id = 7;
Query OK, 1 row affected (0.01 sec)

At ClickHouse:

clickhouse :) select id,name,c_time,active,_sign, _version from herc where id = 7;
Query id: e3aab7e8-1ac8-4e7a-8ec7-c1cd2a63a24d
┌─id─┬─name────┬──────────────c_time─┬─active─┬─_sign─┬─_version─┐
│  7 │ testRep │ 2023-08-19 01:29:11 │ 1      │     1 │        2 │
└────┴─────────┴─────────────────────┴────────┴───────┴──────────┘
┌─id─┬─name───────┬──────────────c_time─┬─active─┬─_sign─┬─_version─┐
│  7 │ UPDtestRep │ 2023-08-19 01:29:11 │ 1      │     1 │        3 │
└────┴────────────┴─────────────────────┴────────┴───────┴──────────┘
┌─id─┬─name───────┬──────────────c_time─┬─active─┬─_sign─┬─_version─┐
│  7 │ UPDtestRep │ 2023-08-19 01:29:11 │ 1      │    -1 │        4 │
└────┴────────────┴─────────────────────┴────────┴───────┴──────────┘
3 rows in set. Elapsed: 0.013 sec.

Now, you can see three records (three versions).

  • The first record is the one we initially created with the INSERT ( _sign = 1, name = testRep )
  • The second record was created by the UPDATE ( _sign = 1, name = UPDtestRep )
  • The third record was created by the DELETE. You can see that _sign gets a negative value for a DELETE ( _sign = -1, name = UPDtestRep )

So, here, the latest version carries the negative value “_sign = -1”, which means the data was deleted. You can verify this by executing the following query.

clickhouse :) select * from herc where id = 7;
Query id: f7634bbf-a12a-4d9a-bea1-cd917f72d581
Ok.
0 rows in set. Elapsed: 0.009 sec.

clickhouse :) select id, name , c_time, active from herc where id = 7;
Query id: bf27b945-ec11-4caa-9942-ef7e35bd2bb4
Ok.
0 rows in set. Elapsed: 0.008 sec.

When you query without the “_sign” and “_version” columns, or simply use “SELECT *”, ClickHouse returns only the latest version of each row, and deleted rows are hidden 🙂
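The resolution logic above can be simulated with a short script. This is a minimal illustrative sketch, not ClickHouse code, and the version numbers are illustrative: for each id, only the row with the highest “_version” survives, and it is hidden entirely if that version carries “_sign = -1”.

```python
# Sketch of ReplacingMergeTree-style resolution: keep the highest
# _version per id, and hide the row if that version is a delete marker.
def final_state(rows):
    latest = {}
    for row in rows:  # row: (id, name, _sign, _version)
        rid = row[0]
        if rid not in latest or row[3] > latest[rid][3]:
            latest[rid] = row
    # Rows whose latest version has _sign = -1 were deleted
    return [r for r in latest.values() if r[2] != -1]

# The three versions of id = 7 from the example above
rows = [
    (7, "testRep",     1, 1),  # INSERT
    (7, "UPDtestRep",  1, 2),  # UPDATE
    (7, "UPDtestRep", -1, 3),  # DELETE (negative _sign)
]
print(final_state(rows))  # → [] : id = 7 is gone, matching "0 rows in set"
```

Dropping the DELETE row from the input would instead return the UPDATE version, just as the earlier SELECT showed before the DELETE was replicated.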

What happens if the replication event fails?

Not all MySQL statements are supported by ClickHouse with the MaterializedMySQL engine, so it is necessary to monitor the replication failure events. (I will write another blog post to cover more on this topic.) You can use the ClickHouse server and error logs to monitor the replication failure events. For example, at MySQL:

percona labs > create user 't2'@localhost identified by 'Aer@321';
Query OK, 0 rows affected (0.01 sec)

At ClickHouse logs:

=== GTID_EVENT ===
Timestamp: 1692394221
Event Type: GTID_EVENT
Server ID: 100
Event Size: 79
Log Pos: 10313
Flags: 0
GTID Next: 030f9a68-3dfc-11ee-b900-525400edf935:35
2023.08.19 03:00:21.267928 [ 3861 ] {9fea936d-8f9e-4c7d-9075-39fe4876610c} <Debug> executeQuery: (internal)  (stage: Complete)
2023.08.19 03:00:21.268093 [ 3861 ] {9fea936d-8f9e-4c7d-9075-39fe4876610c} <Error> MaterializedMySQLSyncThread(perconaCH): Query EXTERNAL DDL FROM MySQL(perconaCH, percona) CREATE USER 't2'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*A907AFD142E6FA14B293E13D508A8F505B7119AE' wasn't finished successfully: Code: 62. DB::Exception: Syntax error: failed at position 114 ('USER'): USER 't2'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*A907AFD142E6FA14B293E13D508A8F505B7119AE'. Expected one of: TEMPORARY, TABLE. (SYNTAX_ERROR), Stack trace (when copying this message, always include the lines below):

You can see the log mentions: “Query … wasn’t finished successfully.” This can be used to monitor the failure events. The replication chain will not be broken if an event fails to execute; ClickHouse will just skip that event.
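One simple way to automate this check is to scan the ClickHouse server log for that message. The snippet below is an illustrative sketch; the marker string and log path match the examples above, but adjust them for your environment.

```python
# Sketch: find ClickHouse log lines reporting a failed MaterializedMySQL event.
# The marker string matches the error message shown above; adjust as needed.
FAILURE_MARKER = "wasn't finished successfully"

def failed_replication_events(log_lines):
    """Return the log lines that report a failed MaterializedMySQL event."""
    return [line for line in log_lines
            if "MaterializedMySQLSyncThread" in line and FAILURE_MARKER in line]

# Typical usage (path from the examples above):
# with open("/var/log/clickhouse-server/clickhouse-server.log") as f:
#     for event in failed_replication_events(f):
#         print(event.strip())
```

A cron job or monitoring agent could run this periodically and alert when new failure lines appear.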

What happens if MySQL or ClickHouse is restarted?

In case of a failure of MySQL or ClickHouse, replication will be reinitiated once the server is alive again. For example, if MySQL is restarted, you can find the following logs:

2023.08.19 03:50:12.737040 [ 4787 ] {} <Debug> mysqlxx::Pool: Connecting to percona@192.168.64.7:3306 as user ch_repl
2023.08.19 03:50:12.737204 [ 4787 ] {} <Debug> mysqlxx::Pool: Creating a new MySQL connection to percona@192.168.64.7:3306 as user ch_repl with settings: connect_timeout=60, read_write_timeout=1800
2023.08.19 03:50:12.739134 [ 4787 ] {} <Error> mysqlxx::Pool: mysqlxx::ConnectionFailed
2023.08.19 03:50:22.850810 [ 4787 ] {} <Trace> mysqlxx::Pool: Entry(connection 8): sending PING to check if it is alive.
2023.08.19 03:50:22.851138 [ 4787 ] {} <Trace> mysqlxx::Pool: Entry(connection 8): PING ok.
2023.08.19 03:50:22.851155 [ 4787 ] {} <Trace> mysqlxx::Query: Running MySQL query using connection 8
2023.08.19 19:44:38.504900 [ 4533 ] {499f2057-e879-43ae-b406-eb7cdecd1e82} <Debug> mysqlxx::Pool: Connecting to percona@192.168.64.7:3306 as user ch_repl
2023.08.19 19:44:38.504922 [ 4533 ] {499f2057-e879-43ae-b406-eb7cdecd1e82} <Debug> mysqlxx::Pool: Creating a new MySQL connection to percona@192.168.64.7:3306 as user ch_repl with settings: connect_timeout=60, read_write_timeout=1800

From the above logs, you can see that ClickHouse keeps checking whether MySQL is alive. Once MySQL is alive, it will initiate the connection and resume the replication; events that were already executed will not be executed again.

If ClickHouse is restarted

As you can see below, ClickHouse was terminated. After some time, when it starts again, ClickHouse initiates the connection to MySQL and then resumes the replication.

2023.08.19 20:52:59.656085 [ 4532 ] {} <Trace> BaseDaemon: Received signal 15
2023.08.19 20:52:59.656190 [ 4532 ] {} <Information> Application: Received termination signal (Terminated)
…………
2023.08.19 20:52:59.825884 [ 5672 ] {} <Information> Application: Starting ClickHouse 23.7.4.5 (revision: 54476, git hash: bd2fcd445534e57cc5aa8c170cc25b7479b79c1c, build id: F73D00BCFC4671837E6C185DB350C813D958BD25), PID 5672
2023.08.19 20:52:59.826045 [ 5672 ] {} <Information> Application: starting up
2023.08.19 20:52:59.952675 [ 5929 ] {} <Trace> mysqlxx::Pool: (percona@192.168.64.7:3306 as user ch_repl): Iterating through existing MySQL connections
2023.08.19 20:52:59.952692 [ 5929 ] {} <Trace> mysqlxx::Pool: Entry(connection 11): sending PING to check if it is alive.
2023.08.19 20:52:59.953015 [ 5929 ] {} <Trace> mysqlxx::Pool: Entry(connection 11): PING ok.
2023.08.19 20:52:59.953023 [ 5929 ] {} <Trace> mysqlxx::Query: Running MySQL query using connection 11

Conclusion

Having analytical data available in real time helps improve your business. The MaterializedMySQL engine is one of the components that can be used to replicate MySQL data to ClickHouse. But remember, it is still in the experimental phase. Altinity’s Sink Connector is another good solution to explore. I will try to write about that in the future.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

 

Try Percona Distribution for MySQL today!

Sep
12
2023
--

How To Make Schema Changes and Not Die Trying

Make Schema Changes

Schema changes are required to add new features or to fix bugs in an application. However, there is no standard procedure for making the changes in a quick and safe manner. If the changes are made without the necessary precautions, you may face unwanted database outages that can cause serious problems for your business. In this blog post, I will delve into the most important things to consider while preparing a schema change.

Table size and concurrency

When assessing a schema change, one of the most important things to consider is the table’s size and concurrency. For small tables, the ALTER operation usually takes anywhere from a few milliseconds up to a few seconds. This is where concurrency plays another important role: if the table has periods of low concurrency during the day and the application can tolerate having it locked for a few seconds or minutes, then it is a good idea to consider a direct ALTER during that period.

How can you define whether a table is small, medium, or large? Well, there is no strict rule of thumb, but we usually consider a table of less than 1GB in total size to be small. A table of less than 100GB is medium-sized, and tables bigger than 100GB are considered large. However, if you have a 100MB table that is highly concurrent, it is better to wait for a low-concurrency period to execute the direct ALTER operation, or else use pt-online-schema-change to avoid locking the table for a long time. You can get an estimate of the table size by executing the following query:

mysql> select table_schema, table_name, table_rows, round(data_length / 1024 / 1024) DATA_MB, round(index_length / 1024 / 1024) INDEX_MB, round(data_free / 1024 / 1024) FREE_MB, round(data_length / 1024 / 1024)+round(index_length / 1024 / 1024)+round(data_free / 1024 / 1024) TOTAL_MB from information_schema.tables where table_name = 'sbtest1' and table_schema = 'sbsmall';

### Example output
+--------------+------------+------------+---------+----------+---------+----------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS | DATA_MB | INDEX_MB | FREE_MB | TOTAL_MB |
+--------------+------------+------------+---------+----------+---------+----------+
| sbsmall      | sbtest1    |     643705 |     101 |        0 |       4 |      105 |
+--------------+------------+------------+---------+----------+---------+----------+
1 row in set (0.01 sec)

Metadata locks

There are several blog posts about this locking mechanism and the common problems you may encounter when executing schema changes; my colleagues have published a number of them on this blog.

The most important takeaway from all these posts is that metadata locks are required for database consistency, and there is no way to avoid them, not even by using tools like pt-online-schema-change or gh-ost, as those also require a brief lock on the table in order to function. Another important fact to consider is that a metadata lock on a highly concurrent table can cause major locking issues in the database and can lead to outages.

If the metadata locks cannot be avoided, what can we do to minimize the impact? Well, one of the first things you should do is check whether there are long-running transactions in the database that can block the acquisition of the metadata lock. Once you identify the long-running transactions, evaluate whether they can be killed. To check whether you have a long-running transaction, you can execute the following command; it will return the top five longest-running transactions:

SELECT 
    trx.trx_id, 
    CONCAT(trx.trx_started, ' ', @@time_zone, '/', @@system_time_zone) AS trx_started, 
    TIMEDIFF(NOW(), trx.trx_started) AS trx_length, 
    IF(trx.trx_state='LOCK WAIT', CONCAT(trx.trx_state, ' This statement has been waiting for a lock for ', TIMEDIFF(NOW(), trx.trx_wait_started)), trx.trx_state) AS trx_state, 
    trx.trx_isolation_level, 
    trx.trx_operation_state, 
    IF(trx.trx_is_read_only=1, 'This transaction was started as READ ONLY.', '0') AS trx_is_read_only, 
    CONCAT('This transaction has modified and/or inserted ', trx.trx_rows_modified, ' row(s).') AS trx_rows_modified, 
    CONCAT('This transaction has altered and/or locked ~', trx.trx_weight, ' row(s).') AS trx_weight,
    trx.trx_mysql_thread_id AS connection_id, 
    CONCAT(pl.user, '@', pl.host) AS user, 
    pl.command, 
    pl.time, 
    CONCAT('This statement has locked ', trx.trx_rows_locked, ' row(s) in ', trx.trx_tables_locked, ' table(s).') AS stmt_current_locks,
    trx.trx_query 
FROM 
    information_schema.innodb_trx trx 
    JOIN information_schema.processlist pl ON trx.trx_mysql_thread_id = pl.id 
ORDER BY 3 DESC LIMIT 5\G

### Example output
*************************** 1. row ***************************
             trx_id: 6798
        trx_started: 2023-08-16 16:00:00 SYSTEM/UTC
         trx_length: 00:06:30
          trx_state: RUNNING
trx_isolation_level: REPEATABLE READ
trx_operation_state: NULL
   trx_is_read_only: 0
  trx_rows_modified: This transaction has modified and/or inserted 1 row(s).
         trx_weight: This transaction has altered and/or locked ~3 row(s).
      connection_id: 14
               user: root@localhost
            command: Sleep
               time: 390
 stmt_current_locks: This statement has locked 1 row(s) in 1 table(s).
          trx_query: NULL
1 row in set (0.00 sec)

Once you have checked for long-running transactions, you can proceed to execute the ALTER statement directly or use a tool like pt-online-schema-change. When using a direct ALTER command, make sure you first set the session variable “lock_wait_timeout” to a low value before running the ALTER statement. In the following example, the variable is set to five seconds, and the session will abort the ALTER command if it cannot acquire the metadata lock within five seconds:

mysql> SET SESSION lock_wait_timeout=5;
Query OK, 0 rows affected (0.00 sec)

mysql> alter table sbsmall.sbtest1 add column r int null;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

When using pt-online-schema-change, you can also set this variable to a low value to avoid major blocking issues when creating the triggers required by the tool. Please notice the “--set-vars” option in the following example; it sets the variable “lock_wait_timeout” to one second.

pt-online-schema-change --print --progress time,10 \
--alter="ADD COLUMN r INT NULL" \
--set-vars tx_isolation='READ-COMMITTED',lock_wait_timeout=1 \
h=localhost,D=sbsmall,t=sbtest1 \
--execute

Type of topology

The topology is important because it is not the same as executing the schema change on a stand-alone server, a classic replication topology, or a Galera or Percona XtraDB Cluster (PXC). When the change is executed on a stand-alone server, there are a few permutations of how to execute the change, but you need to consider the previous points mentioned above.

When executing the change on a replication topology, it is important to consider all nodes when checking for long-running transactions, as those might prevent the change from executing on one of the replicas and cause replication lag. If you are using pt-online-schema-change and low replication lag is crucial for your application, then you should consider adding the “--recursion-method” and “--max-lag” options to your command. With those settings in place, the tool will check the replication lag and throttle the execution to avoid increasing the lag. You can find more information about these settings in the documentation, but here is a simple example of the usage of such options:

pt-online-schema-change --print --progress time,10 \
--alter="ADD COLUMN r INT NULL" \
--set-vars tx_isolation='READ-COMMITTED',lock_wait_timeout=5 \
h=localhost,D=sbsmall,t=sbtest1 \
--recursion-method=dsn=h=localhost,D=percona,t=dsns \
--max-lag 60 --execute

If you decide to execute a direct ALTER rather than using pt-online-schema-change, and the table is not small or has high concurrency, you can execute the change as a rolling upgrade, applying the change on each replica node first using the following high-level steps:

  • Stop the application traffic to the node
  • Stop replication if needed
  • Set variable “sql_log_bin” to OFF to avoid recording the change in the binlog
  • Execute the ALTER statement
  • Set variable “sql_log_bin” to ON
  • Start replication if it was stopped
  • Resume application traffic

Once all replica nodes are updated, you can execute a failover to promote one of the replica nodes as the new primary node and execute the above steps on the former primary node.
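The per-node steps above can be sketched as an ordered statement sequence. This is an illustrative helper, not a complete automation: “STOP REPLICA”/“START REPLICA” is the MySQL 8.0.22+ syntax (older versions use STOP SLAVE/START SLAVE), and whether replication must be stopped depends on your setup.

```python
# Sketch: the ordered per-replica statement sequence for a rolling ALTER.
# sql_log_bin is disabled so the change is not written to the binary log.
def rolling_alter_sequence(alter_sql, stop_replication=True):
    stmts = []
    if stop_replication:
        stmts.append("STOP REPLICA;")
    stmts += [
        "SET SESSION sql_log_bin = OFF;",
        alter_sql,
        "SET SESSION sql_log_bin = ON;",
    ]
    if stop_replication:
        stmts.append("START REPLICA;")
    return stmts

for stmt in rolling_alter_sequence(
        "ALTER TABLE sbsmall.sbtest1 ADD COLUMN r INT NULL;"):
    print(stmt)
```

You would run this sequence on one replica at a time, after draining application traffic from that node.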

Finally, when executing the schema change on a Galera or PXC cluster, consider that if you are going to execute the change using pt-online-schema-change, you will want to use the “--max-flow-ctl” option, as it checks for flow control events and throttles the execution to reduce them and allow the cluster to catch up. Here is a simple example of the usage of this option:

pt-online-schema-change --print --progress time,10 \
--alter="ADD COLUMN r INT NULL" \
--set-vars tx_isolation='READ-COMMITTED',lock_wait_timeout=5 \
h=localhost,D=sbsmall,t=sbtest1 \
--recursion-method=none --max-flow-ctl=0 \
--execute

Another thing to consider when altering tables on a Galera or PXC cluster is that a direct ALTER tends to stall the whole cluster while the change is applied on all nodes. Consider following the "Manual RSU" approach to perform schema modifications under this topology type. The following article is a great resource for implementing it: How to Perform Compatible Schema Changes in Percona XtraDB Cluster (Advanced Alternative)?
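At a high level, the "Manual RSU" approach desyncs one node, applies the change locally with replication disabled for the session, and then lets the node catch up before moving on to the next one. The sketch below shows the core statements, assuming the table from the earlier examples; see the linked article for the full procedure and its caveats (the change must be compatible with the old schema while the rollout is in progress):

```sql
SET GLOBAL wsrep_desync = ON;   -- take the node out of flow control
SET SESSION wsrep_on = OFF;     -- do not replicate the following statement
ALTER TABLE sbsmall.sbtest1 ADD COLUMN r INT NULL;
SET SESSION wsrep_on = ON;
SET GLOBAL wsrep_desync = OFF;  -- let the node rejoin and catch up
```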

Conclusion

There is no free lunch when applying schema changes; you need to analyze and prepare for all the possible drawbacks you might encounter during the implementation. I hope you find this article interesting and helpful, and that it keeps you from finding yourself in the middle of an outage caused by a simple ALTER on the database.

If you still have doubts about implementing the changes on your production databases, consider asking for help. Here at Percona, we have several services to help you with your database-specific needs. For example, with Managed Services, we can take care of the difficult changes on your databases while you focus on the most important thing: growing your business. If you have your own DBA team but you need help with more challenging tasks, then Support would be a really good fit for your needs, as you’ll have access to top-class professionals willing to help you troubleshoot the most difficult problems. Finally, if you have a one-off complex task or project that you need help with, contact our Consulting services. Our group of experts will guide you through the whole process to make your project thrive.

Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.


Download Percona Monitoring and Management Today
