Oct
06
2021
--

Reminder: TokuDB Storage Engine Will Be Disabled by Default in Percona Server for MySQL 8.0.26


As we’ve communicated in our blog post in May, the TokuDB Storage Engine has been marked as “deprecated” in Percona Server for MySQL 8.0. It will be removed in a future version (Percona Server for MySQL 8.0.28, expected to ship in Q1 2022).

With the release of Percona Server for MySQL 8.0.26, the storage engine will still be included in the binary builds and packages but will be disabled by default. If you are upgrading from a previous version, the TokuDB Storage Engine plugin will fail with an error message at server startup if it is installed.

You will still be able to re-enable it manually so that you can perform the necessary migration steps.

Re-enabling the TokuDB Engine in Percona Server for MySQL 8.0.26 is fairly straightforward. You need to add the

tokudb_enabled=TRUE

and

tokudb_backup_enabled=TRUE

options to your my.cnf file and restart your server instance.

Once the server is up and running again, you can migrate your data to an alternative storage engine like RocksDB or InnoDB before disabling and removing TokuDB. See the chapter Removing the TokuDB storage engine in our documentation for details.
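For illustration only, here is a minimal sketch of what that migration could look like for a single table once TokuDB is re-enabled (the database and table names are hypothetical; see the documentation linked above for the complete procedure):

SHOW ENGINES; -- after the restart, TokuDB should be listed again as a supported engine
SELECT table_schema, table_name FROM information_schema.tables WHERE engine = 'TokuDB'; -- list tables still using TokuDB
ALTER TABLE mydb.mytable ENGINE=InnoDB; -- convert one (hypothetical) table; repeat for each remaining table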

Let us know if you have any questions or comments about this process!

If you need any assistance with migrating your data or have any questions or concerns about this, don’t hesitate to reach out to us – our experts are here to help!

Complete the 2021 Percona Open Source Data Management Software Survey

Have Your Say!

Sep
16
2021
--

Repoint Replica Servers in MySQL/Percona Server for MySQL 8.0


When doing migrations or failovers in MySQL, there is usually a need to do a topology change and repoint replica servers to obtain replication data from a different server.

For example, given servers {A, B, and C} and the following topology:

MySQL Topology

If you need to repoint C to be a replica of B, i.e.:

repoint mysql

You can follow these steps:

Note: log_replica_updates should be enabled on the soon-to-be primary as it is a prerequisite for chain replication.

Note: It is assumed that both replicas only stream from Server A and there are no conflicting replication filters in place that might break replication later on.
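A quick way to verify the log_replica_updates prerequisite on B (a sketch; on releases older than 8.0.26 the variable is named log_slave_updates, and since it is read-only, enabling it requires a configuration change and a restart):

SELECT @@global.log_replica_updates; -- must return 1 (ON) on the soon-to-be intermediate primary (B)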

If Using File/Position-Based Replication:

1) Stop B and C

STOP REPLICA;

2) If replicas are multi-threaded, correct MTS gaps and make them single-threaded until all changes are applied. To do so, execute the following commands on BOTH nodes:

START REPLICA UNTIL SQL_AFTER_MTS_GAPS;
SHOW REPLICA STATUS\G -- repeat this until you see "Replica_SQL_Running: No"
STOP REPLICA;
SELECT @@global.replica_parallel_workers; -- take note to restore later
SET GLOBAL replica_parallel_workers=0; -- disable MTS during the operations

3) Then check which is the node that is more up to date by looking at Relay_Source_Log_File and Exec_Source_Log_Pos. Run on BOTH nodes:

SHOW REPLICA STATUS\G
# Take note of Relay_Source_Log_File/Exec_Source_Log_Pos from the most up to date node.

4) Sync the replicas with UNTIL. Run on the most delayed node, using the coordinates noted above:

START REPLICA UNTIL SOURCE_LOG_FILE='<Relay_Source_Log_File>', SOURCE_LOG_POS=<Exec_Source_Log_Pos>;
SHOW REPLICA STATUS\G -- repeat this until you see "Replica_SQL_Running: No"

5) If you followed the above steps, at this point both replicas should have the exact same data set and be in sync at the same point in time. Double-check that both replicas are stopped and at the same coordinates, as doing topology changes while replication is ongoing, or with different coordinates, can cause inconsistencies:

SHOW REPLICA STATUS\G
# Replica_IO_Running must be “NO” in both replicas
# Replica_SQL_Running must be “NO” in both replicas
# Relay_Source_Log_File must match in both replicas
# Exec_Source_Log_Pos must match in both replicas

6) Get current coordinates from B (new intermediate primary). Execute on B:

SHOW MASTER STATUS \G
# Take note of File and Position

7) Repoint C to B. Execute on C with coords from previous step:

CHANGE REPLICATION SOURCE TO SOURCE_HOST='<ip-address-of-B>', SOURCE_LOG_FILE='<File>', SOURCE_LOG_POS=<Position>;

8) If you had disabled MTS, you should re-enable it here for both B and C:

SET GLOBAL replica_parallel_workers=X; -- see output of step 2 for correct value

9) Restart replication normally. Run on both nodes:

START REPLICA;

If Using GTID-Based Replication:

1) Stop B and C:

STOP REPLICA;

2) If replicas are multi-threaded, correct MTS gaps and make them single-threaded until all changes are applied. Run on BOTH nodes:

START REPLICA UNTIL SQL_AFTER_MTS_GAPS;
SHOW REPLICA STATUS\G -- repeat this until you see "Replica_SQL_Running: No"
STOP REPLICA;
SELECT @@global.replica_parallel_workers; -- take note to restore later
SET GLOBAL replica_parallel_workers=0; -- disable MTS during the operations

3) Then check which node is more up to date by looking at the sequence numbers in Executed_Gtid_Set. Run on BOTH nodes:
SHOW REPLICA STATUS\G
# Take note of the Executed_Gtid_Set with the largest sequence number. If there is a mismatch between the GTID sets, it means there were either local writes or writes coming from some other server. In that case, you should check data consistency between the servers, for example with pt-table-checksum, and then fix the GTID differences by either restoring the replica from scratch or fixing the errant transactions as explained in this other blog post.
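As a quick sanity check before repointing, the built-in GTID functions can help compare the two replicas (a sketch; substitute the actual Executed_Gtid_Set values captured above):

SELECT GTID_SUBTRACT('<Executed_Gtid_Set_of_C>', '<Executed_Gtid_Set_of_B>') AS only_on_C; -- an empty string means C has no transactions that B is missing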

4) Bring up all nodes to the same point in time. Run on the node with the smallest GTID sequence number:

START REPLICA UNTIL SQL_AFTER_GTIDS='<Executed_Gtid_Set>';
SHOW REPLICA STATUS\G -- repeat this until you see "Replica_SQL_Running: No"

5) If you followed the above steps, at this point both replicas should have the exact same data set and be in sync at the same point in time. Double-check that both replicas are stopped and at the same coordinates, as doing topology changes while replication is ongoing, or with different coordinates, can cause inconsistencies:

SHOW REPLICA STATUS\G
# Replica_IO_Running must be “NO” in both replicas
# Replica_SQL_Running must be “NO” in both replicas
# Executed_Gtid_Set must match in both replicas

6) Now both replicas have identical data, so you can re-point C to replicate from B. Run on C:

CHANGE REPLICATION SOURCE TO SOURCE_HOST='<ip-address-of-B>';

7) If you had disabled MTS, you should re-enable it here for both B and C:

SET GLOBAL replica_parallel_workers=X; -- see output of step 2 for correct value

8) Restart replication normally. Run on both nodes:

START REPLICA;

Making the opposite change, going from chain replication (A->B->C) back to one primary with two replicas, should be simpler:

If Using File/Position-Based Replication:

1) Stop replication on B and make sure B is not receiving any write activity:

STOP REPLICA;

2) Check current binary log position on B:

SHOW MASTER STATUS \G

3) On C, check the replication status until C catches up with B:

SHOW REPLICA STATUS \G

# For C to have caught up with B, the following conditions should be met:
# “File” from B in step 2) should match Relay_Source_Log_File from step 3)
# “Position” from B in step 2) should match Exec_Source_Log_Pos from step 3)

# Once caught up, both servers will be in sync with the same data set.

4) Check the current replication coordinates on B:

SHOW REPLICA STATUS \G
# Write down Relay_Source_Log_File and Exec_Source_Log_Pos from B, as we will use these coordinates on C

5) Repoint C to replicate from A. The file and position used should be the ones taken from B in the last step:

CHANGE REPLICATION SOURCE TO SOURCE_HOST='<ip-address-of-A>', SOURCE_LOG_FILE='<File>', SOURCE_LOG_POS=<Position>;

6) Restart replication normally. Run on both nodes:

START REPLICA;

If Using GTID-Based Replication:

1) Stop replication on B and make sure B is not receiving any write activity:

STOP REPLICA;

2) Check current binary log position on B:

SHOW MASTER STATUS \G

3) On C, check the replication status until C catches up with B:

SHOW REPLICA STATUS \G

# For C to have caught up with B, the following conditions should be met:
# Executed_Gtid_Set from B in step 2) should match Executed_Gtid_Set from step 3)
# Once caught up, both servers will be in sync with the same data set.

4) Repoint C to replicate from A:

CHANGE REPLICATION SOURCE TO SOURCE_HOST='<ip-address-of-A>';

5) Restart replication normally. Run on both nodes:

START REPLICA;

Conclusion:

Doing topology changes might seem hard at first, but with the above procedure, it should be easy and error-free! If you do not want to take the manual approach, you can consider tools like Orchestrator, which allow for automatic failovers and promotions.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure, open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Download Percona Distribution for MySQL Today

Aug
13
2021
--

Get an Amazon Gift Card for a Review!


The next 10 user reviews for Percona Server for MySQL will receive a $10 Amazon gift card from Sourceforge. Add yours now!

Why does Percona value honest user reviews? They provide valuable information for engineers to guide product development. They help more users become aware of Percona software.  They help inspire our development process!

If you are uncertain what to write about, here are some ideas. Remember to describe your environment and use case if relevant.

  1. What features are most useful for you?
  2. What missing features matter most to you?
  3. How useful do you find Percona Server for MySQL in your circumstances?

To grab your $10 bonus, submit a review on Sourceforge for Percona Server for MySQL using this referral link. Hurry up; only the next 10 reviewers will receive the gift! This offer is managed by Sourceforge and is wholly independent of Percona.

Sourceforge, Capterra, and G2 also allow you to leave reviews of other Percona products, but those reviews are not eligible for the Amazon gift card giveaway. See these links to all major review sites:

  • Percona Monitoring and Management: Capterra, G2, SourceForge
  • Percona Server for MySQL: Capterra, G2, SourceForge
  • Percona XtraDB Cluster: Capterra, G2, SourceForge
  • Percona XtraBackup: Capterra, G2, SourceForge
  • Percona Distribution for PostgreSQL: G2, SourceForge
  • Percona Backup for MongoDB: G2, SourceForge
  • Percona Server for MongoDB: G2, SourceForge
  • Percona Kubernetes Operator: G2, SourceForge

Percona gratefully reads all received reviews and greatly appreciates them. If you have any questions or concerns, don’t hesitate to get in touch with us at community-team@percona.com.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure, open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Download Percona Distribution for MySQL Today

Jun
01
2021
--

Skipping Percona Server for MySQL Version 8.0.24 and Releasing 8.0.25 Next


After Oracle released MySQL version 8.0.24 on April 20, 2021, our engineering team got started right away with merging our enhancements to prepare the corresponding 8.0.24 version of Percona Server for MySQL.

However, Oracle released MySQL version 8.0.25 shortly afterward on May 11, 2021, to fix a critical bug that we also observed during our initial testing and reported back to them.

Therefore, we have decided to skip releasing Percona Server for MySQL 8.0.24, both as a standalone product and as a distribution, as well as the matching Percona XtraDB Cluster release.

Our next public release after Percona Server for MySQL version 8.0.23 will be version 8.0.25, so don’t be surprised if you’re looking for version 8.0.24 and can’t find it. Thank you!

Apr
15
2021
--

Online DDL with Group Replication In Percona Server for MySQL 8.0.22 (and MySQL 8.0.23)


While I was working on my grFailOver POC, I also did some additional parallel testing. One of the tests was to see how online DDL is executed inside a Group Replication cluster.

The online DDL feature provides support for instant and in-place table alterations and concurrent DML. Checking the Group Replication (GR) official documentation, I was trying to identify if any limitations exist, but the only thing I found was this:

“Concurrent DDL versus DML Operations.  Concurrent data definition statements and data manipulation statements executing against the same object but on different servers is not supported when using multi-primary mode. During execution of Data Definition Language (DDL) statements on an object, executing concurrent Data Manipulation Language (DML) on the same object but on a different server instance has the risk of conflicting DDL executing on different instances not being detected.”

This limitation applies only when you have a multi-primary scenario, which is NOT recommended and is not my case. So, in theory, GR should be able to handle online DDL without problems.

My scenario:

group replication MySQL

I have two DCs, and I am going to perform actions on DC1 to see how they propagate and what impact they will have.

The Test

To do the test, I will run an INSERT ... SELECT:

insert into windmills_test  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmills7 limit 1;

And a SELECT, on my Primary node gr1, while in another connection I execute the ALTER:

ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=NONE;

As you may have noticed, I am EXPLICITLY asking for INPLACE and LOCK=NONE. So in this case, if MySQL cannot satisfy the request, it should exit and not execute the command.

In the meantime, on all other nodes, I will run a check command to see WHEN my ALTER is taking place. Let us roll the ball:

On my Primary, the command to insert the data:

[root@gr1 grtest]# while [ 1 = 1 ];do da=$(date +'%s.%3N');/opt/mysql_templates/mysql-8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_s -e "insert into windmills_test  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmills7 limit 1;" -e "select count(*) from windmills_s.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done

Again on Primary another session to execute the ALTER:

DC1-1(root@localhost) [windmills_s]>ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=NONE;

On other nodes to monitor when ALTER will start:

while [ 1 = 1 ];do echo "$(date +'%T.%3N')";/opt/mysql_templates/mysql-8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_s -e "show processlist;"|grep -i "alter";sleep 1;done

What Happens

Data is inserted by the loop.

Group Replication MySQL ALTER

ALTER starts, but I can still insert data in my table, and most importantly, the data is propagated to all nodes of the DC1 cluster.

No ALTER action on the other nodes.

.559
.502
.446
.529
.543
.553
.533
.602
.458  <---- end of the alter locally

Once ALTER is complete on the local node (Primary) it is then executed (broadcast) to all the nodes participating in the cluster.

ERROR 1205 (HY000) at line 1: Lock wait timeout exceeded; try restarting transaction   <--- waiting for handler commit; no INSERTs are allowed

But writes are suspended, waiting for:

37411 | root            | localhost          | windmills_s | Query            |    19 | Waiting for table metadata lock                                 | insert into windmills_test  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype

And eventually, it will timeout.

The other point is that any write hangs until the slowest node has applied the ALTER. It is important to note that all nodes, not only the PRIMARY, remain pending, waiting for the slow node: the slowest drives all.
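If you want to confirm what the suspended writes are blocked on, one option (assuming performance_schema is enabled with the metadata lock instrument, wait/lock/metadata/sql/mdl) is a query like the following; the table name matches the test above:

SELECT object_schema, object_name, lock_type, lock_status, owner_thread_id
FROM performance_schema.metadata_locks
WHERE object_name = 'windmills_test'; -- rows with lock_status = 'PENDING' are the writes queued behind the ALTER's metadata lock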

GR3:

11:01:28.649  48 system user windmills_s Query 171 altering table ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=NONE
11:01:29.674  48 system user windmills_s Query 172 waiting for handler commit ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=NONE

GR2:

Start 11:00:14.438  18 system user windmills_s Query 97 altering table ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=NONE
Ends 11:02:00.107  18 system user windmills_s Query 203 altering table ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=NONE

Finally, when the last node in the GR cluster has applied the ALTER, the writes will resume, and the Replica node on DC2 will start its ALTER operation on PRIMARY first, then on the other nodes.

Summarizing:

  1. Writes are executed on the Primary
  2. ALTER is executed on the Primary
    • The DDL does not impact the write operations and respects the non-blocking directive.
  3. ALTER is completed on the Primary and passed to all nodes
    • A metadata lock is raised on the nodes
  4. The whole cluster waits for the slowest node to complete
  5. When all is done in DC1, the action is replicated to DC2
    • Go to point 2

Conclusion

It seems that at the moment we have partial coverage of the online DDL feature when using Group Replication. Of course, having to wait for the SECONDARY nodes is better and less impactful than waiting for the PRIMARY first and then the SECONDARIES.

But it is confusing, given I was expecting to have either full online coverage (I had explicitly asked for that in the DDL command) or a message telling me it cannot be executed online.  Of course, I would prefer to have FULL online coverage. ;0) 

Keep in mind my setup was also pretty standard and that changing group_replication_consistency does not affect the outcome. I am not sure I can classify this as a bug; it is more an unexpected, undesirable behavior.

Apr
05
2021
--

Which Version of MySQL Should I Use for MyRocks?


As database footprints continue to explode, many companies are looking for ways to deal with such rapid growth.  One approach is to refactor traditional relational databases to fit into a NoSQL engine, where horizontal scalability is easier.  However, in many cases, this is in no way a trivial undertaking.

Another approach that has been gaining interest is the use of MyRocks as an alternative storage engine to the traditional InnoDB.  While not for everyone, in certain use cases it could be a potential solution.  As with so many things open source, the next standard questions are: which version should I use?  Any differences with the engine if I use MyRocks with MySQL 5.7 vs 8.0?

In this post, I wanted to touch on this and give some high-level thoughts on MyRocks when it comes to the version of MySQL.

Engine Overview

At a high level, RocksDB is an embedded key-value database with data stored in a log-structured merge tree (LSM).  MyRocks is an abstraction layer that allows RocksDB to serve as a MySQL storage engine.  With RocksDB as the underlying storage layer, there are numerous advantages including faster replication, better compression, and faster data loading.

In contrast, InnoDB is a B-Tree-based storage engine.  MySQL was designed using a plugin architecture that separates the storage engine logic from the main server functionality.  This allows users to choose which storage engine they want based on their use case.  Historically, this was the MyISAM storage engine.  In recent years, InnoDB replaced MyISAM as the de facto standard and later the default engine.
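As a side note on that plugin architecture: once the MyRocks plugin is enabled in Percona Server for MySQL (for example, via the bundled ps-admin helper), verifying the engine and creating a table on it is plain SQL. The table below is purely illustrative:

SHOW ENGINES; -- ROCKSDB should appear with Support = YES once the plugin is loaded
CREATE TABLE rocks_demo (
  id BIGINT NOT NULL PRIMARY KEY,
  payload VARBINARY(255)
) ENGINE=ROCKSDB; -- hypothetical table, just to confirm the engine is usable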

Which Engine is Better?

Choosing which engine to use is very use case-specific.  Even at Facebook, where MyRocks was first developed, it was not meant to be a universal replacement for InnoDB.  Features like gap locks remain absent from MyRocks.  If a workload is dependent on gap locking, it will be a non-starter with the MyRocks engine.

Also, data access patterns should dictate the engine choice.  If the workload is write-intensive with limited range scans, it may be a good fit for MyRocks.  If you have a traditional OLTP workload with several reporting access patterns, InnoDB would remain a better option.  As mentioned in MyRocks Engine: Things to Know Before You Start:

“MyRocks is not an enhanced InnoDB, nor a one-size-fits-all replacement for InnoDB. It has its own pros/cons just like InnoDB. You need to decide which engine to use based on your application’s data access patterns.”

Which Version of MySQL Should I Use?

Assuming you have done your research and found MyRocks would be a good fit, the next choice is which version to use.  Facebook runs MyRocks on a heavily patched, internal version of MySQL 5.6. Percona Server for MySQL includes MyRocks in both 5.7 and 8.0.  So what are the differences?

At a high level, the MyRocks/RocksDB code is essentially the same in all three versions.  Percona Server for MySQL uses the upstream version of RocksDB and only changes the linked version when Facebook MySQL updates the version.  This is due to the fact that it inherits the integration testing between RocksDB and MyRocks from Facebook.

As such, the biggest differences are based solely on the server version and not the server version and MyRocks combination.  If your infrastructure is already running with MySQL 5.7 in production and not ready to finalize a move to 8.0, there would be no discernible difference running the MyRocks engine with your current version.  MyRocks running on Percona Server for MySQL 5.7 is stable and shouldn’t be discarded as an option.  You should still plan to upgrade to 8.0, but there shouldn’t be any unique challenges in the process compared to using InnoDB.  It would just require the standard upgrade process and significant testing.

Moving forward (as 5.7 approaches EOL in 2023), you can expect to see fewer enhancements with MyRocks on Percona Server for MySQL 5.7.  Most new active development will be done against 8.0 while 5.7 will remain primarily in extended support, which includes critical bug fixes and security fixes only.

Conclusion

With all the version combinations and options, it can be overwhelming to pick the best option for your organization.  For a new deployment, I would recommend starting with MyRocks on MySQL 8.0.  With an existing MySQL 5.7 deployment (while you should be actively working towards an upgrade to 8.0), MyRocks is a viable and stable option as well.  The biggest driver for MyRocks should be space concerns combined with the workload.  As nice as it would be to say MyRocks is the silver bullet that works for everything, that just isn’t the case.  You can always reach out to the Professional Services team at Percona and let us help you determine if MyRocks would be a good fit for your team!

Apr
01
2021
--

Working to Validate MyRocks in the Enterprise with Dropbox


Percona Technical Account Managers get the privilege of working with some of our largest enterprise clients day in and day out.  As such, we get to really focus on how to best leverage our technology to generate measurable benefits for our users.  While it is fun to “nerd out” and always strive to use the latest and greatest, we need to stay focused on demonstrating business value and a genuine need.  Over the past few months, I’ve been working with one of my larger clients, Dropbox, along with our professional services team to validate the use of Percona Server for MySQL with the MyRocks storage engine over a large portion of their MySQL infrastructure.

Please note – this is not meant to be a deep dive into the technical details around MyRocks or the implementation.  Rather, it is meant to show how we determined the need, potential solution, and the planning that has started us down this path.  Look for a more detailed case study in the coming months as we hope to push this solution forward!

The Problem

In a single word, space.  When a client reaches a certain scale, space becomes a real concern.  With 10-20 servers, having an extra index or choosing the wrong data type can be noticeable.  However, with 10,000 servers, there is much less margin for error.  The challenge with this client is that the schema has already been sharded and optimized. Even with that, the aggregate dataset is still on the petabyte scale.  Think about what that means:

  • Large storage footprint per server (2+TB)
  • Replication multiplies that footprint for each “cluster”
  • Multiple backups per cluster result in a huge (PB+) backup footprint

There are some additional challenges created at this scale, too.  To mitigate the risk of filling a disk, there is a soft cap of 75 percent full on each server.  When a cluster hits this level, the shard is split to give each half some additional runway.  While this is great operationally, splitting shards results in doubling the cost for the cluster.  Again, on a scale of 1000s of servers, a “split” means an increase in cost in the hundreds of thousands of dollars.  This is not trivial and puts pressure on the team to delay splits as long as possible while maintaining high availability, durability, and performance.

MyRocks Use Case

After much discussion and understanding that space (and, in turn, cost) is a major business driver, we decided to investigate a change to MyRocks as the storage engine.  While standard InnoDB compression provided a small bandaid, the thinking was that MyRocks would give substantial savings.  At a very high level, MyRocks is an LSM tree-based storage engine built on top of RocksDB.  This type of storage has numerous advantages over traditional B+Tree storage engines (like InnoDB), including a smaller disk footprint and reduced write amplification.  This can translate to business value in several ways, including:

  • Decreased cloud/HDFS storage cost for backups
  • Reduced Write Amplification results in a longer disk lifetime, reducing fleet turnover
  • Decreased instance space requires fewer splits and increases runway

After several rounds of discussion around the benefits and any blockers, we decided to begin testing the feasibility of converting from InnoDB to MyRocks.

Initial Testing

While the use case seemed like a solid fit, extensive testing is always needed.  So far, the initial testing looks promising.  We’ve seen a large reduction in space combined with an increase in performance.  One example cluster saw nearly a 75 percent reduction in space, to the point that we were able to test running two replica instances on a single server in parallel.

What makes this interesting is that the test cluster in question has replicas that periodically show lag and generally are close to the replication limit.  In contrast, the two MyRocks replicas showed no lag during the same test period despite running two instances on one physical server.

While this isn’t something that would be done in production, it was impressive to see double the workload operating so efficiently on the same hardware.  I/O and CPU utilization were both noticeably lower than the single replica running InnoDB.  This shows the potential of increased server longevity and less frequent splits that we were hoping to see.  If these numbers and this performance were to hold into production, we could see savings on the order of millions of dollars across the entire fleet.

Note – this early testing has been limited to replicas only so we can’t yet validate this performance at the production level of concurrency of a primary server.  These results are only with four replica threads, so contention has been minor.

Looking Forward

So far, our limited testing has shown some real promise.  There are still some blockers that we need to overcome, but early results are encouraging. Currently, the biggest technical challenges in this project include:

  • MyRocks currently lacks pt-table-checksum support (for replica consistency)
  • Modifying and validating existing backup/restore/clone automation

Once we are able to fully validate this solution and solve the existing challenges (both fixes have been identified and are on the project roadmap) in the coming months, look for a more detailed case study.  While not every organization needs to make such a drastic change, this use case is one that is becoming more common at the enterprise level as data sets continue to explode.

Want to learn more? Check out the Percona MyRocks Introduction page on our website!

Mar
10
2021
--

How to Build Percona Server for MySQL From Sources


Lately, the number of questions about how to build Percona software has increased. More and more people try to add their own patches, add some modifications, and build the software themselves. But this raises the question of how to do this in the same way Percona does, as sometimes a compiler flag can make a drastic impact on the final binary.

First of all, let’s talk about the stages of compiling software.

I would say that at the beginning you need to prepare the build environment, install all the needed dependencies, and so on. For each version, the dependency list will be different. How do you get the correct dependency list? You can get all build requirements from the spec file (on rpm-based systems) or from the control file (on deb-based systems).

The next stage is to get the source code of Percona Server for MySQL. You can do it in different ways:

  • Get source tarball from the website

We publish a source tarball for each release we issue, and you can easily get it for any released version (this is the link for the latest 8.0.22 version).

  • Download tarball from GitHub

Here is the link: https://github.com/percona/percona-server/releases/tag/Percona-Server-8.0.22-13

Once you have prepared the build environment, you need to decide what result you should get: binary tarball, rpm, or deb package.

So everything looks easy. But what is the way we use, internally, to prepare our release packages? As I mentioned earlier, each compiler option can have a significant effect.

Everyone tries to make life easier and automate all tasks. That is great, as automation is one of the keys to success: you can work on other tasks while the release build is in progress. So we have created a builder script that is used for making builds. It can be found in the Percona Server GitHub repo and can be used by anyone to make their own builds.


This script can install all needed dependencies for the build environment and create source RPMs and debs, the RPM and deb packages themselves, and binary tarballs, so it covers the whole build cycle. So how do you use it?

The script has various parameters:

        --builddir=DIR      Absolute path to the dir where all actions will be performed
        --get_sources       Source will be downloaded from github
        --build_src_rpm     If it is 1 src rpm will be built
        --build_source_deb  If it is 1 source deb package will be built
        --build_rpm         If it is 1 rpm will be built
        --build_deb         If it is 1 deb will be built
        --build_tarball     If it is 1 tarball will be built
        --install_deps      Install build dependencies (root privileges are required)
        --branch            Branch for build
        --repo              Repo for build
        --rpm_release       RPM version (default = 1)
        --deb_release       DEB version (default = 1)
        --debug             Build debug tarball

So let’s see how we can make the build process easier:

1. Download build script:

wget https://raw.githubusercontent.com/percona/percona-server/8.0/build-ps/percona-server-8.0_builder.sh

2. Create a build directory where you are going to perform all build actions:

mkdir /tmp/BUILD_PS

3. Install dependencies (please note root permissions are required):

sudo ./percona-server-8.0_builder.sh --builddir=/tmp/BUILD_PS --install_deps=1

4. Download source code:

– From Percona repo (it is used by default):

./percona-server-8.0_builder.sh --builddir=/tmp/BUILD_PS --get_sources=1 --branch=Percona-Server-8.0.22-13

– From your own repo and branch:

./percona-server-8.0_builder.sh --builddir=/tmp/BUILD_PS --get_sources=1 --branch=<your_branch_name> --repo=<link_to_your_repo_on_github>

5. Prepare src rpm:

./percona-server-8.0_builder.sh --builddir=/tmp/BUILD_PS --build_src_rpm=1

Please note, if you already have your source tarball, just create a directory named source_tarball in the build directory and put the tarball there.

6. Prepare source deb:

./percona-server-8.0_builder.sh --builddir=/tmp/BUILD_PS --build_source_deb=1

Please note, if you already have your source tarball, just create a directory named source_tarball in the build directory and put the tarball there.

7. Prepare rpm:

./percona-server-8.0_builder.sh --builddir=/tmp/BUILD_PS --build_rpm=1

8. Prepare deb:

./percona-server-8.0_builder.sh --builddir=/tmp/BUILD_PS --build_deb=1

9. Prepare tarball:

./percona-server-8.0_builder.sh --builddir=/tmp/BUILD_PS --build_tarball=1

So as you can see, the build procedure becomes easier, and you don’t need to think about which dependencies are needed, which cmake parameters should be used, etc. This build script will do everything for you and will use all the build flags (if you didn’t change them in your sources) that we use for release builds.

Dec
22
2020
--

Redesign of --lock-ddl-per-table in Percona XtraBackup


MySQL 5.7, alongside many other improvements, brought bulk load for creating an index (WL#7277 to be specific), which made ADD INDEX operations much faster by disabling redo logging and making the changes directly to tablespace files. This change requires extra care for backup tools. To block DDL statements on an instance, Percona Server for MySQL implemented LOCK TABLES FOR BACKUP. Percona XtraBackup (PXB) uses this lock for the duration of the backup. This lock does not affect DML statements.

MySQL 5.7 doesn’t have an option to block DDL on an instance while still allowing all DML. Thus, Percona XtraBackup also implemented --lock-ddl-per-table. Before we go into other details, let’s understand how --lock-ddl-per-table has worked until now:

  1. PXB starts its work by parsing and copying all redo logs after the checkpoint mark.
  2. Fork a dedicated thread that will keep following new redo log entries.
  3. Get a list of all tablespaces that it will require to copy.
  4. Iterate through the list of tablespaces; for each tablespace, it does the following:
    • Query INFORMATION_SCHEMA.INNODB_SYS_TABLES (or, in the case of an 8.0 server, INFORMATION_SCHEMA.INNODB_TABLES) to check which table or tables belong to that tablespace ID, and take an MDL on the underlying table or tables in the case of a shared tablespace.
    • Copy the tablespace .ibd file.

This approach works on the premise that if an MLOG_INDEX_LOAD event (a redo log event generated by bulk load operations to notify backup tools that changes to data files have been omitted from the redo log) is encountered by the redo-follow thread, it is safe to continue, as the tablespaces that we have already copied are protected by MDL and the MLOG_INDEX_LOAD event is for a tablespace that is yet to be copied.

This premise is not always correct and can lead to an inconsistent backup; here are a few examples:

Full-Text Index

Full-Text Index has its own tablespace:

mysql> SELECT table_id, name, space from INFORMATION_SCHEMA.INNODB_SYS_TABLES WHERE name LIKE '%FTS%';
+----------+----------------------------------------------------+-------+
| table_id | name                                               | space |
+----------+----------------------------------------------------+-------+
|     1169 | test/FTS_000000000000002e_0000000000000508_INDEX_1 |  1157 |
|     1170 | test/FTS_000000000000002e_0000000000000508_INDEX_2 |  1158 |
|     1171 | test/FTS_000000000000002e_0000000000000508_INDEX_3 |  1159 |
|     1172 | test/FTS_000000000000002e_0000000000000508_INDEX_4 |  1160 |
|     1173 | test/FTS_000000000000002e_0000000000000508_INDEX_5 |  1161 |
|     1174 | test/FTS_000000000000002e_0000000000000508_INDEX_6 |  1162 |
|     1175 | test/FTS_000000000000002e_BEING_DELETED            |  1163 |
|     1176 | test/FTS_000000000000002e_BEING_DELETED_CACHE      |  1164 |
|     1177 | test/FTS_000000000000002e_CONFIG                   |  1165 |
|     1178 | test/FTS_000000000000002e_DELETED                  |  1166 |
|     1179 | test/FTS_000000000000002e_DELETED_CACHE            |  1167 |
+----------+----------------------------------------------------+-------+
11 rows in set (0.01 sec)

With the current approach, PXB will try to run a SELECT on FTS_000000000000002e_0000000000000508_INDEX_1 for example, which is not something we can do. Here the underlying table that this FTS belongs to may or may not have been protected by MDL, which can cause the FTS index to be copied without protection.

Full-Text Indexes follow a defined naming scheme; in the above situation, we can easily extract the table ID by translating the 16 hex characters after FTS_ to decimal, which gives us the underlying table that the FTS belongs to:

session 1> CREATE FULLTEXT INDEX full_index on joinit2 (s);
session 2> SELECT table_id, name, space from INFORMATION_SCHEMA.INNODB_SYS_TABLES WHERE table_id >= 1319;
+----------+----------------------------------------------------+-------+
| table_id | name                                               | space |
+----------+----------------------------------------------------+-------+
|     1321 | test/#sql-ib1320-2000853746                        |  1309 |
|     1322 | test/FTS_0000000000000529_00000000000005b6_INDEX_1 |  1310 |
|     1323 | test/FTS_0000000000000529_00000000000005b6_INDEX_2 |  1311 |
|     1324 | test/FTS_0000000000000529_00000000000005b6_INDEX_3 |  1312 |
|     1325 | test/FTS_0000000000000529_00000000000005b6_INDEX_4 |  1313 |
|     1326 | test/FTS_0000000000000529_00000000000005b6_INDEX_5 |  1314 |
|     1327 | test/FTS_0000000000000529_00000000000005b6_INDEX_6 |  1315 |
|     1328 | test/FTS_0000000000000529_BEING_DELETED            |  1316 |
|     1329 | test/FTS_0000000000000529_BEING_DELETED_CACHE      |  1317 |
|     1330 | test/FTS_0000000000000529_CONFIG                   |  1318 |
|     1331 | test/FTS_0000000000000529_DELETED                  |  1319 |
|     1332 | test/FTS_0000000000000529_DELETED_CACHE            |  1320 |
|     1320 | test/joinit2                                       |  1308 |
+----------+----------------------------------------------------+-------+

FTS_0000000000000529 translates to table_id 1321; however, as you can see above, the FTS in question belongs to a temporary table. That is because when an FTS index is created for the first time, the server has to rebuild the table to add the FTS_DOC_ID column. Again, it is not possible to copy these files under MDL protection.
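For reference, that hex-to-decimal translation can be done directly in SQL:

SELECT CONV('0000000000000529', 16, 10) AS table_id; -- returns 1321, matching the table_id shown above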

New Table Added in the Middle of the Backup

Because of the per-table MDL acquisition, if a table has been created after PXB has gathered the list of tablespaces, its .ibd file will not be copied. Instead, it will be recreated as part of the --prepare phase based on the data added to the redo logs. As you can imagine, if changes are redo-skipped after the table has been recreated from redo information, the table will be incomplete.

Shared Tablespaces

Once a shared tablespace is processed by the lock-ddl-per-table function, it gets a list of all tables created in that tablespace and acquires the MDL on those tables; however, there is no tablespace-level MDL, which means nothing blocks a new table from being created in this tablespace. If the tablespace has already been copied, this follows the previous point: the table will be recreated at the --prepare phase by parsing redo logs.

Best/Worst Case Scenario

The outcome of such inconsistent backups can be unknown. In the best-case scenario, you will get a crash either in the backup/prepare phase or when using the server. Yes, a crash is a best-case scenario because you will notice the issue right away.

In the worst-case scenario, data will be missing without you noticing it. Instead of explaining, let’s reproduce it.

Get a brand-new instance of MySQL/Percona Server for MySQL 5.7 and download a Percona XtraBackup 2.4 release prior to 2.4.21 (the same is reproducible with MySQL/Percona Server for MySQL 8.0 and PXB 8.0).

In order to reproduce this scenario, we will be using gdb to pause the PXB backup execution at a certain point; however, you can achieve the same with a big table, which will take some time to copy.

gdb xtrabackup -ex 'set args --backup --lock-ddl-per-table --target-dir=/tmp/backup' -ex 'b mdl_lock_table' -ex 'r'

This will start the backup. On a separate session, connect to MySQL and execute:

CREATE DATABASE a;
USE a;
CREATE TABLE tb1 (ID INT PRIMARY KEY, name CHAR(1));
INSERT INTO tb1 VALUES (3,'c'), (4, 'd'), (5, 'e');
CREATE INDEX n_index ON tb1(name);

Back in the gdb session, execute:

disa 1
c
quit

Your backup will complete. You can now prepare it and restore it on MySQL. Now try to query a.tb1 using the below queries:

USE a;
SELECT * FROM tb1;
SELECT * FROM tb1 FORCE INDEX(PRIMARY);
SELECT * FROM tb1 FORCE INDEX(n_index);
SELECT * FROM tb1 WHERE name = 'd';

Here is an example:

mysql> SELECT * FROM tb1;
Empty set (0.00 sec)

mysql> SELECT * FROM tb1 FORCE INDEX(PRIMARY);
+----+------+
| ID | name |
+----+------+
|  3 | c    |
|  4 | d    |
|  5 | e    |
+----+------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM tb1 FORCE INDEX(n_index);
Empty set (0.00 sec)

mysql> SELECT * FROM tb1 WHERE name = 'd';
Empty set (0.00 sec)

As you can see, querying using the PK shows that the data is indeed present in the table. The index n_index exists, but the corresponding index entries are not present, leading to inconsistent results, which may cause a lot of harm before you notice something is wrong.

Redesign of --lock-ddl-per-table

With the above points in mind, --lock-ddl-per-table has been reworked in order to guarantee a consistent backup. Percona XtraBackup 2.4.21 and Percona XtraBackup 8.0.22 have the following main changes:

  • MDL lock is now performed at the beginning of the backup before we do the first part/catch-up of redo logs.
  • Then the first scan of redo logs happens, and in this first scan, we can still have this MLOG_INDEX_LOAD event recorded in case a CREATE INDEX has happened right before the backup started. For now, it’s still safe to parse and accept it.
  • Once the first scan has finished, the thread responsible for following new redo log events is started and this thread will now abort the backup in case of encountering MLOG_INDEX_LOAD events.
  • Gather the list of tablespaces to copy.
  • Start to copy the actual .ibd files.

Other improvements have also been performed:

  • We now skip trying to run a SELECT if the .ibd file belongs to a temporary table as the SELECT query will never work.
  • Since we are taking the MDL before we copy individual files, we also skip the lock in case we are dealing with an FTS. For an FTS we will eventually take (or have already taken) an MDL on the base table, making it safe to skip the work for those files too.
  • The query that takes the MDL has been improved to not retrieve any data, since a SELECT 1 FROM table LIMIT 0 is enough to acquire an MDL lock on the table.
  • --lock-ddl-per-table is a workaround for the changes done in WL#7277 when they happen in the middle of a backup. In Percona Server for MySQL 5.7 we have implemented LOCK TABLES FOR BACKUP (which gets executed with the --lock-ddl parameter of PXB), which acquires an instance-level MDL lock. If a user runs --lock-ddl-per-table on Percona Server for MySQL 5.7, a warning is thrown advising that --lock-ddl should be used instead.
  • For MySQL 8.0, upstream has implemented LOCK INSTANCE FOR BACKUP, which works similarly to Percona Server’s LOCK TABLES FOR BACKUP. In the case of Percona XtraBackup 8.0, --lock-ddl-per-table has been deprecated; it still works, but a warning is also issued advising users to switch to --lock-ddl.

Summary

As described throughout this post, WL#7277 brought some real challenges that can affect backups in different ways, some of them not easy to spot. Percona XtraBackup will always favor consistency and has been enhanced to work around those limitations when possible (taking MDL earlier in the process) and aborting the backup when an inconsistency is inevitable.

Percona Server for MySQL users and MySQL 8.0 users should always use --lock-ddl, which is a more robust and safer lock for these types of situations.

Oct
16
2020
--

Danger of Changing Default of log_error_verbosity on MySQL/Percona Server for MySQL 5.7


Changing the default value (3) of log_error_verbosity in MySQL/Percona Server for MySQL 5.7 can have a hidden unintended effect! What does log_error_verbosity do exactly? As per the documentation:

The log_error_verbosity system variable specifies the verbosity for handling events intended for the error log.

Basically, a value of 1 logs only [Error] messages; 2 logs 1) plus [Warning] messages; and 3 logs 2) plus [Note] messages.

For example, one might be tempted to change the default of log_error_verbosity since the error.log might be bloated with thousands (or millions) of lines with [Notes], such as:

2020-10-08T17:02:56.736179Z 3 [Note] Access denied for user 'root'@'localhost' (using password: NO)
2020-10-08T17:04:48.149038Z 4 [Note] Aborted connection 4 to db: 'unconnected' user: 'root' host: 'localhost' (Got timeout reading communication packets)

(P.S. You can read more about those Notes in these other Percona blog posts):

Fixing MySQL 1045 Error: Access Denied

MySQL “Got an error reading communication packet”

The problem is, after lowering log_error_verbosity to 1 or 2, no messages about server startup or shutdown will be printed to the log! That can make troubleshooting really hard in the event of issues or system failure.  For completeness, in the error.log of 5.7 at startup with the default log_error_verbosity, the following should be seen:

2020-10-08T16:36:07.302168Z 0 [Note] InnoDB: PUNCH HOLE support available
2020-10-08T16:36:07.302188Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
...
2020-10-08T16:36:07.303998Z 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2020-10-08T16:36:07.307823Z 0 [Note] InnoDB: Completed initialization of buffer pool
...
2020-10-08T16:36:07.497571Z 0 [Note] /usr/sbin/mysqld: ready for connections.

And on shutdown:

2020-10-08T16:36:10.447002Z 0 [Note] Giving 0 client threads a chance to die gracefully
2020-10-08T16:36:10.447022Z 0 [Note] Shutting down slave threads
2020-10-08T16:36:10.447027Z 0 [Note] Forcefully disconnecting 0 remaining clients
…
2020-10-08T16:36:12.104099Z 0 [Note] /usr/sbin/mysqld: Shutdown complete

With log_error_verbosity = 2, there won’t be messages about MySQL startup, but some warnings that are only printed at startup might give a hint about the time of a server restart, such as:

2020-10-08T16:30:21.966844Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2020-10-08T16:30:22.181367Z 0 [Warning] CA certificate ca.pem is self signed.
2020-10-08T16:30:22.221732Z 0 [Warning] Changed limits: table_open_cache: 431 (requested 6000)

If the error log contains no information about a server restart, the system logs can be checked for it:

# cat /var/log/messages 
...
Oct  8 16:31:25 carlos-tutte-latest57-standalone-1 systemd: Starting MySQL Server...
Oct  8 16:31:26 carlos-tutte-latest57-standalone-1 systemd: Started MySQL Server.

If there is still no clue about when MySQL was last started, checking the “Uptime” status variable can help in calculating the time of the last startup.
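For example, here is a rough way to turn Uptime into an approximate startup timestamp (a sketch using performance_schema, available on both 5.7 and 8.0):

SELECT NOW() - INTERVAL CAST(variable_value AS UNSIGNED) SECOND AS approximate_last_startup
FROM performance_schema.global_status
WHERE variable_name = 'Uptime';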

The problem does NOT occur on MySQL/Percona Server for MySQL 8.0 since even with log_error_verbosity = 1, the following startup/shutdown lines are printed on the error.log:

2020-10-08T16:31:54.532504Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.20-11) starting as process 1052
2020-10-08T16:31:54.540342Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2020-10-08T16:31:55.026815Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2020-10-08T16:31:55.136125Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Socket: '/var/lib/mysql/mysqlx.sock' bind-address: '::' port: 33060
2020-10-08T16:31:55.270669Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.0.20-11'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Percona Server (GPL), Release 11, Revision 5b5a5d2.
2020-10-08T16:32:01.708932Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.20-11)  Percona Server (GPL), Release 11, Revision 5b5a5d2.

In conclusion, if possible, avoid changing the default of log_error_verbosity on MySQL/Percona Server for MySQL 5.7. And if you need to change it, do it online with SET GLOBAL instead of through the config file; that way, if the server restarts, the default value is restored and the startup messages are still logged.
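For example, to lower the verbosity only for the running instance:

SET GLOBAL log_error_verbosity = 2; -- reverts to the default (3) at the next restart, so startup and shutdown messages are still logged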
