Sep
28
2022
--

Percona Monthly Bug Report: September 2022

Percona September 2022 Bug Report

Here at Percona, we operate on the premise that full transparency makes a product better. We strive to build the best open-source database products, but also to help you manage any issues that arise in any of the databases that we support. And, in true open source form, we report back on any issues or bugs you might encounter along the way.

We constantly update our bug reports and monitor other boards to ensure we have the latest information, but we wanted to make it a little easier for you to keep track of the most critical ones. These posts are a central place to get information on the most noteworthy open and recently resolved bugs. 

In this September 2022 edition of our monthly bug report, we have the following list of bugs:

 

Percona Server for MySQL/MySQL Bugs

PS-8328 (MySQL#108316): Some queries that combine GROUP_CONCAT() with GROUP BY … WITH ROLLUP can either crash the server or return unexpectedly wrong results.

The issue occurs in 8.0, while 5.7 does not reproduce this behavior.
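For illustration only, a generic query of the shape described above; the table and columns are invented and this is not the reproducer attached to the bug report:

SELECT category,
       GROUP_CONCAT(item ORDER BY item) AS items
FROM sales
GROUP BY category WITH ROLLUP;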

Reported Affected Version/s: 8.0.28, 8.0.29, 8.0.30

 

MySQL#96071: According to Performance Schema, the new TempTable engine uses about three times as much memory as the MEMORY engine for the internal temporary table created by Query 10 (Q10) of DBT3.

Q10:

mysql> select c_custkey, c_name, sum(l_extendedprice * (1 - l_discount)) as revenue, c_acctbal, n_name, c_address, c_phone, c_comment from customer, orders, lineitem, nation where c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate >= '1993-11-01' and o_orderdate < date_add( '1993-11-01' ,interval '3' month) and l_returnflag = 'R' and c_nationkey = n_nationkey group by c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_comment order by revenue desc LIMIT 20;

MySQL 8.0 has introduced TempTable as the new, default internal temporary table storage engine to speed up query processing.

Please note that this behavior is not reproducible in MySQL 8.0.29.
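If you want to verify how much memory the TempTable engine is using on your own server, the Performance Schema memory instruments are one place to look (shown here only as a pointer for checking the reported behavior):

SELECT event_name,
       current_number_of_bytes_used,
       high_number_of_bytes_used
FROM performance_schema.memory_summary_global_by_event_name
WHERE event_name LIKE 'memory/temptable/%';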

Reported Affected Version/s: 8.0.16

 

PS-8327: Running ALTER TABLE … CHECK PARTITION inside a stored procedure causes the server to log “Aborted connection” and “Got an error writing communication packets” in 5.7, while in 8.0 the client ends up with “Trying to reconnect”.
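A hedged sketch of the statement pattern described above; the table, partition, and procedure names are invented, and t1 is assumed to be a partitioned table:

DELIMITER $$
CREATE PROCEDURE check_part()
BEGIN
  ALTER TABLE t1 CHECK PARTITION p0;
END$$
DELIMITER ;

CALL check_part();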

Reported Affected Version/s: 5.7.38, 8.0.28

 

ZFS Bug (MySQL#108012):

A failure was reported with an InnoDB cluster in multi-primary mode: after upgrading from 8.0.29 to 8.0.30, one of the nodes was restarted after it had successfully rejoined the cluster, and it then failed to start with the error “The redo log file ./#innodb_redo/#ib_redo1 size 647168 is not a multiple of innodb_page_size”.

The issue lies with the FALLOC_FL_ZERO_RANGE flag: the documentation states that not all filesystems support it and that an error should be returned when it is not supported. With ZFS, however, no error is returned; if FALLOC_FL_ZERO_RANGE is replaced with 0 in f.cpp, the file is created properly.

fallocate() fails on ZFS when used with FALLOC_FL_ZERO_RANGE but does not even return an error, so the problem lies in the kernel/glibc/driver, not on the MySQL side.

In other words, the fallocate() implementation is broken in the sense that it returns 0 (success) even though it has not changed the (uncompressed) size of the file (which is what the experiments with f.cpp show). Because of this, using the glibc 2.12 variant of MySQL could have saved you from experiencing this bug, while the glibc 2.17 variant (which is built on a machine that supports FALLOC_FL_ZERO_RANGE) exposes you to the problem.

For a workaround and a detailed explanation of this issue, please refer to https://bugs.mysql.com/bug.php?id=108012

Reported Affected Version/s: 8.0.30

Please note that this is a ZFS bug, not a MySQL bug, and affected users should review the available workarounds.

 

Percona XtraDB Cluster

PXC-4012: A node leaves the Percona XtraDB Cluster while running concurrent CREATE USER IF NOT EXISTS statements when the password_history option is used.

Reported Affected Version/s: 8.0.28

 

PXC-3944: DDLs replicated with Total Order Isolation (TOI) constantly fail with a “Deadlock found when trying to get lock; try restarting transaction” error after setting repl.max_ws_size. However, the problem is not seen after setting wsrep_max_ws_size (maximum write-set size) from a different session.
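For reference, the two ways of adjusting the write-set size limit mentioned above look roughly like this (the values are examples only):

SET GLOBAL wsrep_provider_options = 'repl.max_ws_size=1073741824';

SET GLOBAL wsrep_max_ws_size = 1073741824;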

Reported Affected Version/s: 8.0.27

 

Percona Toolkit

PT-2072: pt-duplicate-key-checker chokes on columns that contain a line-break

Reported Affected Version/s: 3.4.0

 

PT-2066: Running pt-online-schema-change against a MySQL instance in AWS RDS that is listening on a non-default port raises the following error:

Cannot connect to MySQL: DBI connect(‘my_db;host=[db instance].rds.amazonaws.com;mysql_read_default_group=client’,’admin’,…) failed: Can’t connect to MySQL server on ‘[db instance].rds.amazonaws.com:3306’ (110) at /usr/bin/pt-online-schema-change line 2345.

As a workaround, the connection options can be passed via a DSN.

Please note that it is a private bug that won’t be available publicly. 

Reported Affected Version/s: 3.3.1

 

Percona Monitoring and Management (PMM)

PMM-10075: With at least one service added to PMM, changing the password to one that contains quote characters (for example, “admin123” including the quotes) causes subsequent logins to fail with an invalid user or password error.

Reported Affected Version/s: 2.28.0

 

PMM-10454: Hovering over a panel’s title on any dashboard, choosing “Share,” and clicking “Direct link rendered image” logs the error “Failed to load resource: the server responded with a status of 404 (Not Found)”, and the rendered dashboard image shows a 404 error message along with the Welcome panel.

Reported Affected Version/s: 2.29.0

 

PMM-10364: After upgrading percona-pg_stat_monitor to version 1.0.1, pmm-agent raises errors such as “getNewBuckets failed: failed to query pg_stat_monitor: pq: pg_stat_monitor: incorrect number of output arguments…”

Reported Affected Version/s: 2.28.0

Percona XtraBackup

PXB-2797: When importing a single table (IMPORT TABLESPACE) from a backup made with XtraBackup, if the table contains a full-text index the import fails with “ERROR 1808 (HY000) at line 132: Schema mismatch (Index x field y is ascending which does not match metadata file which is descending)”.
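For context, a single-table import from a prepared XtraBackup copy generally follows the transportable-tablespace sequence below (a simplified sketch; the table name is an example), and PXB-2797 hits the final step when a FULLTEXT index is present:

ALTER TABLE t1 DISCARD TABLESPACE;

-- Copy t1.ibd (and t1.cfg, if present) from the prepared backup into the
-- database directory and fix file ownership, then:

ALTER TABLE t1 IMPORT TABLESPACE;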

Reported Affected Version/s: 8.0.28

 

Summary

We welcome community input and feedback on all our products. If you find a bug or would like to suggest an improvement or a feature, learn how in our post, How to Report Bugs, Improvements, and New Feature Requests for Percona Products.

For the most up-to-date information, be sure to follow us on Twitter, LinkedIn, and Facebook.

Quick References:

Percona JIRA

MySQL Bug Report

Report a Bug in a Percona Product

___

About Percona:

As the only provider of distributions for all three of the most popular open source databases—PostgreSQL, MySQL, and MongoDB—Percona provides expertise, software, support, and services no matter the technology.

Whether it’s enabling developers or DBAs to realize value faster with tools, advice, and guidance, or making sure applications can scale and handle peak loads, Percona is here to help.

Percona is committed to being open source and preventing vendor lock-in. Percona contributes all changes to the upstream community for possible inclusion in future product releases.

Sep
27
2022
--

Give Me Some Latitude… and Longitude

Geo locations in MySQL

Geo locations are a cornerstone of modern applications. Whether you’re a food delivery business or a family photographer, knowing the closest “something” to you or your clients can be a great feature.

In our ‘Scaling and Optimization’ training class for MySQL, one of the things we discuss is column types. The spatial types are only mentioned in passing, as less than 0.5% of MySQL users know of their existence (that’s a wild guess, with no factual basis). In this post, we briefly discuss the POINT type and how it can be used to calculate distances to the closest public park.

Import the data

To start off, we need a few tables and some data. The first table will hold the mapping between the zip code and its associated latitude/longitude. GeoNames has this data under the Creative Commons v3 license, available here for most countries. The data files are CSV and the readme.txt explains the various columns. We are only interested in a few of the columns.

CREATE TABLE usazips (
 id int unsigned NOT NULL AUTO_INCREMENT,
 zipCode int unsigned NOT NULL COMMENT 'All USA postal codes are integers',
 state varchar(20) NOT NULL,
 placeName varchar(200) NOT NULL,
 placeDesc varchar(100) DEFAULT NULL,
 latLong point NOT NULL /*!80003 SRID 4326 */,
 PRIMARY KEY (id));

There are a couple of things about this schema that are not typical in MySQL.

Firstly, the latLong column type is POINT which can store an X and Y coordinate. By default, these coordinates could be on any plane. They could represent latitude and longitude, but they could also be centimeters up and down on the surface of your desk, or yards left and right of the biggest tree at your local park. How do you know which? You need a Spatial Reference System. Thankfully, MySQL comes preloaded with about 5,100 of these systems (See INFORMATION_SCHEMA.ST_SPATIAL_REFERENCE_SYSTEMS). Each SRS has an associated Spatial Reference ID (SRID).
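You can look up any of these in the information schema; for example, the entry for SRID 4326, which is used below:

mysql> SELECT srs_name, srs_id, organization
       FROM information_schema.ST_SPATIAL_REFERENCE_SYSTEMS
       WHERE srs_id = 4326;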

The SQL comment on the POINT column above ties the data in this column to a specific SRID, 4326. We can see in the I_S table noted earlier that this SRID maps to the ‘World Geodetic System 1984’ (aka WGS84), which is the same SRS used in the GeoNames dataset. By having our column and the dataset aligned, we won’t need to do any geo-transformation later on.

The second thing to note is SPATIAL indexing. A SPATIAL index in MySQL is created using R-Trees, which are geospatial-specific data structures. We could add one on the latLong column, but as discussed in the conclusion, it would not help the distance query used in this post.

Let’s load this data into the table. There are 12 fields in the CSV but we only care about six of them. The LOAD DATA command allows you to associate each CSV field, positionally, with either the column name in the table or a user variable. The first field in our CSV is the country code, which we don’t care about, so we assign that field to the @dummy variable. The second field is the zip code, and we want that to go directly into the table-column so we specify the column name. We do the same for the third, fifth, sixth, and ninth fields.

The 10th and 11th CSV fields are the latitude and longitude. We need to convert those two VARCHAR fields into a POINT. We can apply some SQL transformations using the SET command which can reference user variables assigned earlier. These two fields are assigned to @lat and @lon. The remaining fields all go into @dummy since they are not used.

LOAD DATA INFILE '/var/lib/mysql-files/US.txt'
INTO TABLE usazips
FIELDS TERMINATED BY '\t' (@dummy, zipCode, placeName, @dummy, state, placeDesc, @dummy, @dummy, @dummy, @lat, @lon, @dummy)
SET id = NULL, latLong = ST_PointFromText(CONCAT('POINT(', @lat, ' ', @lon, ')'), 4326);

Unfortunately, the POINT() function in MySQL always returns an SRID of 0. Since we specified our column to be a specific SRID, a direct import will fail.

mysql> LOAD ... SET latLong = POINT(@lat, @lon);
ERROR 3643 (HY000): The SRID of the geometry does not match the SRID of the column 'latLong'.
The SRID of the geometry is 0, but the SRID of the column is 4326. Consider changing the SRID
of the geometry or the SRID property of the column.

Instead, we must “go the long route” by creating a string representation of a POINT object in the Well-Known-Text format, which MySQL can parse into a Point column type with an associated SRID.

mysql> LOAD ... SET latLong = ST_PointFromText(CONCAT('POINT(', @lat, ' ', @lon, ')'), 4326);

Yes, this was very confusing to me as well:

“POINT(123 56)” <— WKT Format (a string)
POINT(123, 56) <— MySQL POINT() function (returns geometry data, with SRID 0)
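If in doubt, ST_SRID() shows the difference between the two; the first expression below returns 0 and the second returns 4326 (the coordinates match the Gustine, TX row used later):

mysql> SELECT ST_SRID(POINT(31.8468, -98.4017)) AS srid_from_function,
              ST_SRID(ST_PointFromText('POINT(31.8468 -98.4017)', 4326)) AS srid_from_wkt;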

Here’s a quick verification of our data with an additional column showing the lat-long binary data being converted back into WKT format.

mysql> SELECT *, ST_AsText(latLong) FROM usazips WHERE zipCode = 76455;
+-------+---------+-------+-----------+-----------+------------------------------------------------------+-------------------------+
| id    | zipCode | state | placeName | placeDesc | latLong                                              | ST_AsText(latLong)      |
+-------+---------+-------+-----------+-----------+------------------------------------------------------+-------------------------+
| 34292 |   76455 | TX    | Gustine   | Comanche  | 0xE6100000010100000068B3EA73B59958C0B84082E2C7D83F40 | POINT(31.8468 -98.4017) |
+-------+---------+-------+-----------+-----------+------------------------------------------------------+-------------------------+
1 row in set (0.04 sec)

More data to load

Now that we have all this zip code + latitude and longitude data, we next need the locations of where we want to find our distances. This could be a list of grocery stores or coffee shops. In this example, we will use a list of public parks in San Antonio, TX. Thankfully, San Antonio has all of this data openly available. I downloaded the ‘Park Boundaries’ dataset in GeoJSON format since the CSV did not contain any latitude/longitude coordinates.

CREATE TABLE parks (
 parkId int unsigned NOT NULL PRIMARY KEY,
 parkName varchar(100) NOT NULL,
 parkLocation point NOT NULL /*!80003 SRID 4326 */);

The GeoJSON data is one giant JSON string with all of the needed data nested in various JSON objects and arrays. Instead of writing some Python script to ETL the data, I decided to try out the JSON import feature of the MySQL Shell and first import the JSON directly into a temporary table.

$ mysqlsh appUser@127.0.0.1/world --import Park_Boundaries.geojson tempParks
Importing from file "Park_Boundaries.geojson" to collection `world`.`tempParks` in MySQL Server at 127.0.0.1:33060
..1..1
Processed 3.29 MB in 1 document in 0.3478 sec (1.00 document/s)
Total successfully imported documents 1 (1.00 document/s)

The MySQL Shell utility created a new table called ‘tempParks’, in the ‘world’ database with the following schema. One row was inserted, which was the entire dataset as one JSON object.

CREATE TABLE `tempParks` (
 `doc` json DEFAULT NULL,
 `_id` varbinary(32) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,_utf8mb4'$._id'))) STORED NOT NULL,
 `_json_schema` json GENERATED ALWAYS AS (_utf8mb4'{"type":"object"}') VIRTUAL,
 PRIMARY KEY (`_id`),
 CONSTRAINT `$val_strict_75A2A431C77036365C11677C92B55F4B307FB335` CHECK (json_schema_valid(`_json_schema`,`doc`)) /*!80016 NOT ENFORCED */
);

From here, I extracted the needed information from within the JSON and inserted it into the final table, shown above. This SQL uses the new JSON_TABLE function in MySQL 8 to create a result set from nested JSON data (strangely, there were a couple of duplicate parks in the dataset, which is why you see the ON DUPLICATE modifier). Note that this dataset stores longitude at index 0 and latitude at index 1.

INSERT INTO parks
  SELECT k.parkID, k.parkName, ST_PointFromText(CONCAT('POINT(', parkLat, ' ', parkLon, ')'), 4326)
  FROM tempParks, JSON_TABLE(
    tempParks.doc,
    "$.features[*]" COLUMNS (
      parkID int PATH "$.properties.ParkID",
      parkName varchar(150) PATH "$.properties.ParkName",
      parkLat varchar(20) PATH "$.geometry.coordinates[0][0][0][1]",
      parkLon varchar(20) PATH "$.geometry.coordinates[0][0][0][0]"
    )
  ) AS k ON DUPLICATE KEY UPDATE parkId = k.parkID;

Query OK, 391 rows affected (0.11 sec)
Records: 392  Duplicates: 0  Warnings: 0

What’s the closest park?

Now that all the information is finally loaded, let’s find the nearest park to where you might live, based on your zip code.

First, get the reference location (as MySQL POINT data) for the zip code in question:

mysql> SELECT latLong INTO @zipLocation FROM usazips WHERE zipCode = 78218;

Then, calculate the distance between our reference zip code and all of the park locations. Show the five closest (not including any school parks):

mysql> SELECT p.parkName, ST_AsText(p.parkLocation) AS location, ST_Distance_Sphere(@zipLocation, p.parkLocation) AS metersAway
FROM parks p WHERE parkName NOT LIKE '%School%' ORDER BY metersAway LIMIT 5;
+-----------------------+----------------------------------------------+--------------------+
| parkName              | location                                     | metersAway         |
+-----------------------+----------------------------------------------+--------------------+
| Robert L B Tobin Park | POINT(29.501796110000043 -98.42111988)       |   1817.72881969296 |
| Wilshire Terrace Park | POINT(29.486397887000063 -98.41776879999996) | 1830.8541086553364 |
| James Park            | POINT(29.48168424800008 -98.41725954999998)  |  2171.268012559491 |
| Perrin Homestead Park | POINT(29.524477369000067 -98.41258342199997) |  3198.082796589456 |
| Tobin Library Park    | POINT(29.510899863000077 -98.43343147299998) |  3314.044809806559 |
+-----------------------+----------------------------------------------+--------------------+
5 rows in set (0.01 sec)

The closest park is 1,817.7 meters away! Time for a picnic!

Conclusion

Geo locations in MySQL are not difficult. The hardest part was just finding the data and ETL’ing it into some tables. MySQL’s native column types and support for the OpenGIS standards make working with the data quite easy.

Several things to note:

1) The reference location for your zip code may be several hundred or even a thousand meters away from where you actually are. Because of this, the park that is right behind your house might not show up as the closest one, because another park is closer to the reference location.

2) No SPATIAL indexes were added to the tables, as they would be of no use here: we must execute the distance function against all parks (i.e., all rows) to find the closest one, so the ‘metersAway’ query will always perform a full table scan.

3) All USA postal codes are integers. You can expand this to your country by simply changing the column type.

Sep
26
2022
--

MySQL in Microservices Environments

The microservice architecture is not a new pattern but has become very popular lately, mainly for two reasons: cloud computing and containers. That combo helped increase adoption by tackling the two main concerns of every infrastructure: cost reduction and infrastructure management.

However, all that beauty hides a dark truth:

The hardest part of microservices is the data layer

And that is especially true when it comes to classic relational databases like MySQL. Let’s figure out why that is.

MySQL and the microservice

Following the same two pillars of microservices (cloud computing and containers), what can one do with that in the MySQL space? What do cloud computing and containers bring to the table?

Cloud computing

The magic of the cloud is that it allows you to be cost savvy by letting you easily SCALE UP/SCALE DOWN the size of your instances. No more wasted money on big servers that are underutilized most of the time. What’s the catch? It’s gotta be fast. Quick scale up to be ready to deal with traffic and quick scale down to cut costs when traffic is low. 

Containers

The magic of containers is that one can slice hardware to the resource requirements. The catch here is that containers were traditionally used on stateless applications. Disposable containers.

Relational databases in general, and MySQL in particular, are not fast to scale and are stateful. However, they can be adapted to the cloud and used for the data layer of microservices.

The Scale Cube

The book “The Art of Scalability” by Abbott and Fisher describes a really useful, three-dimensional scalability model: the Scale Cube. In this model, scaling an application by running clones behind a load balancer is known as X-axis scaling. The other two kinds of scaling are Y-axis scaling and Z-axis scaling. The microservice architecture is an application of Y-axis scaling: it defines an architecture that structures the application as a set of loosely coupled, collaborating services.

  • X-Axis: Horizontal Duplication and Cloning of services and data (READ REPLICAS)
  • Y-Axis: Functional Decomposition and Segmentation – (MULTI-TENANT)
  • Z-Axis: Service and Data Partitioning along Customer Boundaries – (SHARDING)

On microservices, each service has its own database in order to be decoupled from other services. In other words: a service’s transactions only involve its database; data is private and accessible only via the microservice API.

It’s natural that the first approach to divide the data is by using the multi-tenant pattern:

Actually, before trying multi-tenant, one can use a tables-per-service model where each service owns a set of tables that must only be accessed by that service. But with that “soft” division, the temptation to skip the API and directly access other services’ tables is huge.

Schema-per-service, where each service has a database schema that is private to that service, is appealing since it makes ownership clearer. It is easy to create a user per database, with specific grants to limit database access.
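A minimal sketch of such per-schema grants; the user, password, and schema names are invented for illustration:

CREATE USER 'orders_svc'@'%' IDENTIFIED BY 'change_me';
GRANT SELECT, INSERT, UPDATE, DELETE ON orders_db.* TO 'orders_svc'@'%';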

This pattern of “shared database” however comes with some drawbacks like:

  • Single hardware: a failure in your database will hurt all the microservices
  • Resource-intensive tasks related to a particular database will impact the other databases (think of DDLs)
  • Shared resources: disk latency, IOPS, and I/O bandwidth need to be shared, as well as other resources like CPU, network bandwidth, etc.

The alternative is to go “Database per service”

Database per service

Share nothing. Cleaner logical separation. Isolated issues. Services are loosely coupled. In fact, this opens the door for microservices to use a database that best suits their needs, like a graph db, a document-oriented database, etc. But as with everything, this also comes with drawbacks:

  • The most obvious: cost. More instances to deploy
  • The most critical: distributed transactions. As we mentioned before, microservices collaborate with each other, and that means transactions span several services.

The simplistic approach is to use a two-phase commit implementation. But that solution is just an open invitation to a huge amount of locking issues. It just doesn’t scale. So what are the alternatives?

  • Implementing transactions that span services: The Saga pattern
  • Implementing queries that span services: API composition or Command Query Responsibility Segregation (CQRS)

A saga is a sequence of local transactions. Each local transaction updates the database and publishes messages or events that trigger the next local transaction in the saga. If a local transaction fails for whatever reason, then the saga executes a series of compensating transactions that undo the changes made by the previous transactions. More on Saga here: https://microservices.io/patterns/data/saga.html

An API composition is just a composer that invokes queries on each microservice and then performs an in-memory join of the results:
https://microservices.io/patterns/data/api-composition.html

CQRS means keeping one or more materialized views that contain data from multiple services. This avoids the need to do joins on the query side: https://microservices.io/patterns/data/cqrs.html

What do all these alternatives have in common? They are handled at the API level: it becomes the responsibility of the developer to implement and maintain them. The data layer continues to hold data, not information.

Make it cloud

There are means for your MySQL to be cloud-native: Easy to scale up and down fast; running on containers, a lot of containers; orchestrated with Kubernetes; with all the pros of Kubernetes (health checks, I’m looking at you).

Percona Operator for MySQL based on Percona XtraDB Cluster

A Kubernetes Operator is a special type of controller introduced to simplify complex deployments. Operators provide full application lifecycle automation and make use of Kubernetes primitives to build and manage your application.

Percona Operator for MySQL

Our blog post “Introduction to Percona Operator for MySQL Based on Percona XtraDB Cluster” covers an overview of the operator and its benefits. However, it’s worth mentioning what makes it cloud native:

  • It takes advantage of cloud computing, scaling up and down
  • Runs on containers
  • Is orchestrated by the cloud orchestrator itself: Kubernetes

Under the hood is a Percona XtraDB Cluster running on pods. It is easy to scale out (increase the number of nodes: https://www.percona.com/doc/kubernetes-operator-for-pxc/scaling.html) and can be scaled up by giving more resources to the pod definition (without downtime).

Give it a try https://www.percona.com/doc/kubernetes-operator-for-pxc/index.html and unleash the power of the cloud on MySQL.

Sep
23
2022
--

Scaling MySQL – A Good Problem to Have

scaling MySQL

When you develop an application you expect success, and often success comes with growth problems. These problems especially show themselves in the area of data storage, where the stateful parts are not as easy to scale as the stateless parts of the application.

There are several stages of approaching database scalability:

  1. Configuration and query optimization. This step can help a lot, and I would recommend a recent book by Daniel Nichter “Efficient MySQL Performance: Best Practices and Techniques”, which goes into this topic.
  2. If #1 is done and you continue to push the limits of your database, the next step is to improve the hardware: adding extra memory, improving storage throughput (regular SSD, NVMe storage layer, etc.), or increasing the size of the cloud instances (this is what I call “optimization by credit card”). This typically should help, but only to a certain limit. And there is only so much memory or storage you can push into a server before it becomes very expensive very quickly, never mind the physical limits of how much memory you can fit into a single server.
  3. Distributing the workload across multiple servers when the limit of a single server is reached. A read-intensive workload can typically be handled by regular MySQL source->replica replication; however, if you need to scale writes, it becomes quite complicated.

I would like to provide an excerpt from the Square Cash Engineering blog, which describes their experience (Sharding Cash | Square Corner Blog (squareup.com)):

“Cash was growing tremendously and struggled to stay up during peak traffic. We were running through the classic playbook for scaling out: caching, moving out historical data, replica reads, buying expensive hardware. But it wasn’t enough. Each one of these things bought us time, but we were also growing really fast and we needed a solution that would let us scale out infinitely. We had one final item left in the playbook.”

Before jumping to the solution that Square Cash is using, I should mention the traditional approach to sharding (this is how we name distributing a MySQL workload across multiple smaller pieces, i.e., servers): sharding at the application level, where the application decides which server to use to execute a query. However, there is a constantly increasing need to keep the sharding logic separate from the application, so that developers can focus on creating business outcomes rather than solving database workload distribution problems again and again. Therefore, there is a need to move the “distribute the workload” logic from the application level to middleware or even to the database level.

There is already an available middleware that helps with these problems (and this is what Square Cash is using): Vitess (Vitess | A database clustering system for horizontal scaling of MySQL)

Vitess is open source software that works like a proxy and helps distribute a MySQL workload across multiple servers (for more details about the Vitess architecture, see Sugu Sougoumarane’s presentation: 2019-sugu-highload.pdf (vitess.io) and our introduction: Introduction to Vitess on Kubernetes for MySQL – Part I of III – Percona Database Performance Blog).

Originally, Vitess was developed to scale YouTube, although I’ve heard recent rumors that Google is migrating YouTube to a different backend. Later, it was adopted not only by Square Cash but also by companies like Slack, Pinterest, GitHub, and HubSpot. They are all looking to solve MySQL scalability problems.

I wrote this blog post to collect your opinions:

  • Have you tried scale-out solutions, and can you share your experience?
  • Would you be interested in hearing more about scale-out from Percona, such as basic usage, setup, and general guidance?
  • Would you be interested in Percona providing support for scale-out products?

Please share your thoughts with us in this survey.

Sep
20
2022
--

Percona Server for MySQL – Automatic Cloud Deployment with Terraform

Automatic Cloud Deployment with Terraform MySQL

We are looking to provide simplified ways to deploy Percona software in cloud environments, especially for more advanced scenarios like replication and multi-node clusters (in the case of Percona XtraDB Cluster).

For this, I propose trying out our new Terraform provider, which has the following capabilities:

  • Deploy in AWS or GCP Cloud. The provider will automatically create instances and deploy Percona Server for MySQL
  • Deploy either Percona Server for MySQL or Percona XtraDB Cluster
  • Choose instance size
  • Choose storage volume size, type, and IOPs
  • Multi-node async replication or multi-node Percona XtraDB Cluster
  • Customize MySQL configuration file
  • Capability to deploy MyRocks Engine

To get a better understanding, let’s review some examples. But before that, here is where you can obtain the provider:

The provider is available from the Terraform registry:

https://registry.terraform.io/providers/Percona-Lab/percona/0.9.0

And source code at GitHub:

Percona-Lab/terraform-provider-percona: Terraform modules to deploy Percona Server and Percona XtraDB Cluster (github.com)

Keep in mind that this is still EXPERIMENTAL software and is not covered by Percona Support.

Examples:

As a single server is quite trivial, let’s jump to a more complex scenario: asynchronous replication between two nodes.

Here is our main.tf file:

# AWS provider configuration
provider "percona" {
  region  = "us-east-2"
  profile = "default"
  cloud   = "aws"
}
 
 
resource "percona_ps" "psrepl2" {
  instance_type            = "t3.micro" # for AWS
  key_pair_name            = "sshKey2"
  password                 = "password"
  replica_password         = "replicaPassword"
  cluster_size             = 2
  volume_size              = 30
}

And we apply with:

terraform apply
…
percona_ps.psrepl2: Still creating... [4m20s elapsed]
percona_ps.psrepl2: Still creating... [4m30s elapsed]
percona_ps.psrepl2: Still creating... [4m40s elapsed]
percona_ps.psrepl2: Creation complete after 4m47s [id=JZnEGllTfJIOqjgMyyRl]

So five minutes later we have the following:

  • We created and initialized two t3.micro AWS instances in the us-east-2 region
  • Each instance uses a 30GB volume
  • We can log in to the instances with an SSH key (which will be created if it didn’t exist before)
  • Each instance has the latest 8.0 version of Percona Server for MySQL installed
  • The instances are connected by source->replica MySQL replication

Similarly, for deploying a 3 node Percona XtraDB Cluster, we use:

# AWS provider configuration
provider "percona" {
  region  = "us-east-2"
  profile = "default"
  cloud   = "aws"
}
 
 
resource "percona_pxc" "pxc3" {
  instance_type            = "t3.micro" # for AWS
  key_pair_name            = "sshKey2"
  password                 = "password"
  cluster_size             = 3
  volume_size              = 30
}

We can also show an example for Google Cloud:

provider "percona" {
  region  = "us-central1"
  zone    = "us-central1-a"
  project = "cto-lab-284219"
  cloud =   "gcp"
}
 
 
resource "percona_ps" "ps3" {
  instance_type            = "e2-highmem-2" 
  key_pair_name            = "sshKey2"
  password                 = "password"
  cluster_size             = 3
  volume_type              = "pd-ssd"
  volume_size              = 40
}

In this case, we will use more powerful instances with dedicated SSD volumes of 40GB each.

The script will deploy instances and install three Percona Server for MySQL servers connected in replication (one source and two replicas).

In conclusion:

We offer a Terraform provider that simplifies the deployment of Percona Server for MySQL and Percona XtraDB Cluster in cloud environments and offers various customizations for instances and replication configurations.

Try out our new Terraform provider

Sep
20
2022
--

Column-Level Encryption in MySQL

In a post written earlier this year, Percona Server for MySQL Encryption Options and Choices, I discussed some of the options around encryption in MySQL.  Being such a complex topic, that post was meant to clarify and highlight various aspects of “encryption” at different levels.  I recently had this topic come up again, specifically around column-level encryption and the various options, so I wanted to touch on this in more detail.

As of the current release of Percona Server for MySQL, there is no built-in way to define a single column as encrypted.  Ideally, there could be some metadata passed in a create statement and this would just automatically happen, such as this:
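Purely hypothetical syntax follows; no such column attribute exists today, it only illustrates the kind of metadata that would be convenient:

CREATE TABLE users (
  id  INT UNSIGNED NOT NULL PRIMARY KEY,
  ssn VARBINARY(64) NOT NULL ENCRYPTED USING 'my-key-alias'  -- hypothetical attribute, not valid MySQL
);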

Unfortunately, this option isn’t available and we need to do some data manipulation at or prior to read/write time.

Built-in MySQL encryption functions

One of the most common approaches is to use the built-in MySQL encryption functions described here: https://dev.mysql.com/doc/refman/8.0/en/encryption-functions.html.  The standard flows look roughly like this:
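A minimal sketch of that flow, assuming a users table with a VARBINARY ssn column; the table, column, key, and values are placeholders, and hard-coding a key like this is exactly the concern discussed next:

INSERT INTO users (id, ssn)
VALUES (1, AES_ENCRYPT('123-45-6789', 'my_secret_key'));

SELECT id, CAST(AES_DECRYPT(ssn, 'my_secret_key') AS CHAR) AS ssn
FROM users
WHERE id = 1;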

This works perfectly fine and the data will be stored encrypted for that column.  If an unauthorized person were to gain access to the running table, they would be unable to read that column without the key.  The biggest concern with this approach is that the plaintext key and plaintext data are BOTH specified in the query.  This leads to potential leaks in log files (slow query, binary log, etc) as well as potential sniffing over non-secure networks (i.e. connections not using SSL).

Also, key storage can become more cumbersome.  If you plan to share the same key for the entire table, key rotation and management can become non-trivial.  General best practices recommend that you rotate keys at least once a year.  On large tables, this could be a massive undertaking.  Let’s look at the envelope encryption pattern as an alternative.

Envelope encryption

In contrast to using built-in encryption, envelope encryption uses the concept of individual data keys.  While there are many ways to approach this, my personal experience uses the AWS KMS service.  In KMS, you can create what is called a “Customer Master Key” (CMK).  This is great for encrypting small strings like passwords but is limited to encrypting strings up to 4KB.

A more flexible approach is to use individual data keys, encrypt the data locally in the application, and then store the encrypted data along with the encrypted data key.  This can be done at various levels of granularity – from the per-row level to the table level, to the database level.   Here is the general process for envelope encryption:
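A minimal SQL-side sketch of the storage for this pattern; the table, column names, and the session variables standing in for values produced by the application and KMS are all assumptions for illustration:

-- 1. Request a data key from KMS: it returns a plaintext copy and a copy
--    encrypted under the master key (CMK).
-- 2. Encrypt the sensitive value in the application with the plaintext data
--    key, then discard that plaintext key.
-- 3. Store the ciphertext and the encrypted data key side by side.
CREATE TABLE user_secrets (
  id            INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  ssn_encrypted VARBINARY(512) NOT NULL,  -- value encrypted with the data key
  data_key_enc  VARBINARY(512) NOT NULL   -- data key encrypted with the CMK
);

INSERT INTO user_secrets (ssn_encrypted, data_key_enc)
VALUES (@ciphertext_from_app, @encrypted_data_key_from_kms);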

When you need to decrypt the data, you first decrypt the key and then use that to decrypt the data:
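Again only a sketch against the assumed table above; the KMS call and the actual decryption happen in the application, not in SQL:

SELECT ssn_encrypted, data_key_enc
FROM user_secrets
WHERE id = 42;

-- 1. Ask KMS to decrypt data_key_enc, which returns the plaintext data key.
-- 2. Use that key to decrypt ssn_encrypted locally, then discard the key.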

Because both the data key and the data are encrypted, both are safe to store together.  This is one of the main benefits of this method.  You can have hundreds to millions of data keys independent of one another but protected by a single master key.  This allows you to deactivate all of the individual keys by disabling one single master key.  

It also simplifies key rotation – you can simply rotate the master key and start using the new key to generate new data keys. Note that this doesn’t re-encrypt the data, but it does allow you to follow the best practice of periodically rotating keys.  As long as you don’t delete any of the old keys, KMS can determine which key to use for decryption from the key itself and automatically decrypt your data (i.e. you don’t have to specify which key encrypted the data).  

I’ve included a link to a sample script in Python that shows this process in more detail.  While there isn’t actually an active database in the demo, the code prints out the INSERT and SELECT statements that could be run against a live server: https://github.com/mbenshoof/kms-envelope-demo

Challenges with column-level encryption

Searching

It should go without saying that introducing column-level encryption isn’t a trivial task.  One of the biggest challenges is reviewing how encrypted data is retrieved.  For example, if I store social security numbers encrypted individually, then how do I search for the user with an SSN of 123-45-6789?  If I use a shared key for the entire table/schema, then it is possible with proper indexing.  I just have to pass the encrypted value to the where clause and if it exists, it should be found.

However, on a per-row model where each row uses a unique key, this is no longer possible.  As I don’t know the key the value was encrypted with, I can’t search for the encrypted value in the table.  In cases like this, you might consider a one-way hash field that could be searched against.  For example, you could store the SHA256 hash of an SSN as an additional column for searching but then decrypt any other sensitive information.
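A hedged sketch of that idea against the assumed user_secrets table: a deterministic SHA-256 hash column is added purely for lookups while the SSN itself stays encrypted (the column and index names are invented):

ALTER TABLE user_secrets
  ADD COLUMN ssn_hash BINARY(32),
  ADD INDEX idx_ssn_hash (ssn_hash);

SELECT id, ssn_encrypted, data_key_enc
FROM user_secrets
WHERE ssn_hash = UNHEX(SHA2('123-45-6789', 256));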

CPU and space overhead

The other challenge is adding additional write/read overhead to handle encryption and decryption.  While this may or may not be an issue depending on the use case, the extra CPU needed either on the application side or MySQL side could come into play.  You will need to consider the extra processing required and factor that in when planning the size of your resources.

Additionally, depending on the encryption libraries used, there can be additional space needed to store encrypted values.  In some cases, the encrypted value (when stored in base64 specifically) may end up requiring a higher storage footprint.  The space could be compounded if using an index on an additional hash column.  For small values (like SSN), the hash may be much larger than the actual data.  This can result in a much higher storage footprint when applied to millions of records. 
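A quick way to see the overhead for yourself (the key and value are placeholders, and the ciphertext length assumes the default block_encryption_mode): an 11-byte SSN becomes 16 bytes of ciphertext, 24 bytes once base64-encoded, and its hex SHA-256 hash is 64 bytes.

SELECT
  LENGTH('123-45-6789')                                        AS plain_len,
  LENGTH(AES_ENCRYPT('123-45-6789', 'example_key'))            AS cipher_len,
  LENGTH(TO_BASE64(AES_ENCRYPT('123-45-6789', 'example_key'))) AS base64_len,
  LENGTH(SHA2('123-45-6789', 256))                             AS hex_hash_len;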

Wrapping up

Encryption and security are very important and complicated topics.  When considering column-level encryption in MySQL, you definitely have some options.  The easiest way would be to just leverage the built-in encryption functions in MySQL.  However, you can take things a step further and handle all encryption and decryption in your application.  

As is always the case with complicated topics like this, the choice and approach depend entirely on your requirements and use case.  Just know that with the flexibility of MySQL, there is most likely a design and approach that works for you!

Sep
15
2022
--

Percona XtraDB Cluster on Amazon EC2 and Two Interesting Changes in PXC 8.0

Percona XtraDB Cluster on Amazon EC2

This article outlines the basic configurations for setting up and deploying Percona XtraDB Cluster 8.0 (PXC) on Amazon EC2, as well as what is new in the setup compared to Percona XtraDB Cluster 5.7.

What is Percona XtraDB Cluster an ideal fit for?

Percona XtraDB Cluster is a cost-effective, high-performance clustering solution for mission-critical data. It combines all the improvements and functionality found in MySQL 8 with Percona Server for MySQL’s enterprise features and Percona’s upgraded Galera library.

A Percona XtraDB Cluster environment is an ideal fit for applications requiring five-nines (99.999%) uptime with high read workloads, and for industries like finance or healthcare that require in-house or externally dedicated database resources.

How is a three-node cluster configured in an EC2 environment?

To describe the setup procedures, I’ll be using Amazon EC2 instances to build the environment; depending on business requirements, alternative infrastructure may be used to build the cluster.

Amazon EC2 settings are designed to provide uptime and high availability. When designing architecture in an EC2 environment, it is preferable to have nodes situated in different Availability Zones to avoid the loss of an entire AZ and its data.

If a different region is planned for a node, we can also protect against the loss of an entire region and its data. The nodes and regions should have appropriate network connectivity, because network latency between the two regions affects synchronous replication write latency. Alternatively, an async replica in a different region is an option.

I’m not going into too much depth on Amazon EC2 to keep this blog brief and readable. 

To build the three-node Percona XtraDB 8.0 cluster environment we first spin up the following three nodes in EC2. I’m using Amazon Linux but you can also use Ubuntu or any of the Percona-supported operating systems.

It is advised that a cluster’s nodes all have the same configuration. 

PXCNode1   IP Address: 172.31.18.119

PXCNode2   IP Address: 172.31.86.114

PXCNode3  IP Address: 172.31.26.152

To install the Percona repository on all three nodes, use the following command.

$ sudo yum install -y https://repo.percona.com/yum/percona-release-latest.noarch.rpm

Enable the Percona XtraDB Cluster 8.0 repository on all three nodes by running the following command.

$ sudo percona-release setup pxc-80
* Disabling all Percona Repositories
* Enabling the Percona XtraDB Cluster 8.0 repository
* Enabling the Percona Tools repository
<*> All done!
$

Using the following command, install the Percona XtraDB Cluster packages and software on all three nodes.

$ sudo yum install percona-xtradb-cluster

Before starting the nodes, update the basic variables listed below on each node.

The following default variables must be modified for a first installation. Those coming from PXC 5.7 might wonder why wsrep_sst_auth is missing: the variable was removed in PXC 8 since it caused security concerns, as the user and password were saved in the config file and easily visible to OS users.

In PXC 8, a temporary user is created when a new node joins the existing cluster. For additional details on this security enhancement, check this article.

$ vi /etc/my.cnf
..
######## wsrep ###############
wsrep_provider=/usr/lib64/galera4/libgalera_smm.so
wsrep_cluster_address=gcomm://172.31.18.119,172.31.86.114,172.31.26.152
wsrep_slave_threads=8
wsrep_log_conflicts
wsrep_node_address=172.31.18.119
wsrep_cluster_name=pxc-cluster
#If wsrep_node_name is not specified,  then system hostname will be used
wsrep_node_name=pxc-cluster-node-1
wsrep_sst_method=xtrabackup-v2
..

Percona XtraDB Cluster nodes utilize the following ports by default, so we must open them and ensure that the nodes can communicate with one another.

    3306 is used for MySQL client connections and SST (State Snapshot Transfer) via mysqldump.

    4444 is used for SST via Percona XtraBackup.

    4567 is used for write-set replication traffic (over TCP) and multicast replication (over TCP and UDP).

    4568 is used for IST (Incremental State Transfer).

For example, to test access between nodes:

Node 1
#  socat - TCP-LISTEN:4444
hello

Node 2
# echo "hello" | socat - TCP:172.31.18.119:4444
#

How is the first node bootstrapped?

After configuring each PXC node, you must bootstrap the cluster starting with the first node. All of the data that you wish to replicate to additional nodes must be present on the first node. 

Run the command to bootstrap the first node.

# systemctl start mysql@bootstrap.service
#

Since this is a brand new install, a temporary password is generated for the ‘root’ MySQL user, which we can find in mysqld.log:

# grep -i "A temporary password is generated " /var/log/mysqld.log
2022-09-13T06:52:37.700818Z 6 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: PmqQiGf*#6iy
#

Reset the temporary password using the following statement.

$ mysql -uroot -p
Enter password:
mysql> SET PASSWORD = 'GdKG*12#ULmE';
Query OK, 0 rows affected (0.03 sec)

How can the cluster’s remaining nodes be joined?

Before starting node2, you must copy the SSL certificates from node1 to node2 (and to node3). PXC 8 by default encrypts all replication communication, so this is a critical step that most users miss, causing cluster startup failures.

PXCNode1# scp /var/lib/mysql/*.pem 172.31.86.114:/var/lib/mysql/
PXCNode2# chown -R mysql.mysql /var/lib/mysql/*.pem

Start the node.

PXCNode2# systemctl start mysql
#

Verify the following status variables and make sure they appear as below once the node has been added to the cluster.

PXCNode2$ mysql -uroot -p -e " show global status where variable_name IN ('wsrep_local_state','wsrep_local_state_comment','wsrep_cluster_size','wsrep_cluster_status','wsrep_connected','wsrep_ready');"
Enter password:
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| wsrep_cluster_size        | 2       |
| wsrep_cluster_status      | Primary |
| wsrep_connected           | ON      |
| wsrep_local_state         | 4       |
| wsrep_local_state_comment | Synced  |
| wsrep_ready               | ON      |
+---------------------------+---------+

The third node may be added to the cluster using the same procedures, and its status will then look as follows.

PXCNode3$ mysql -uroot -p -e " show global status where variable_name IN ('wsrep_local_state','wsrep_local_state_comment','wsrep_cluster_size','wsrep_cluster_status','wsrep_connected','wsrep_ready');"
Enter password:
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| wsrep_cluster_size        | 3       |
| wsrep_cluster_status      | Primary |
| wsrep_connected           | ON      |
| wsrep_local_state         | 4       |
| wsrep_local_state_comment | Synced  |
| wsrep_ready               | ON      |
+---------------------------+---------+

Additional supporting factors

Additionally, the use of failover technologies like ProxySQL is advised; they assist in removing failing nodes from the active read pool and, in some cases, shifting the primary.

It is advised to have a backup in place; the open-source tool Percona XtraBackup takes physical copies of the dataset, which are significantly faster to recover. It is also strongly advised to back up binary logs using mysqlbinlog in order to allow point-in-time recovery. Backups should be encrypted, compressed, and transferred to S3 as soon as possible.

To copy binlogs to an S3 bucket, the command looks like this:

aws s3 sync /backups/binlogs/ s3://backup-bucket/
upload: ../../../../backups/binlogs/binlog.000001.gz.gpg to 
s3://backup-bucket/binlog.000001.gpg

For query analytics and time-based database performance insights, the open-source tool Percona Monitoring and Management is highly recommended. It can be deployed from the Amazon Marketplace and needs to run on a separate host. It monitors operating system and MySQL metrics and provides sophisticated query analytics.

Sep
08
2022
--

Enabling ProcFS UDF in Percona Monitoring and Management

In my previous blog post, ProcFS UDF: A Different Approach to Agentless Operating System Observability in Your Database, I wrote about the ProcFS UDF MySQL plugin, which allows you to get operating system stats through the MySQL database, without shell access to the server or any local agent installation.

Some of you wondered whether there is a way to use this goodness in Percona Monitoring and Management (PMM), and this blog post will show you exactly how to do that.

Unfortunately, at this point, Percona Monitoring and Management does not support the ProcFS UDF MySQL plugin out of the box. It is in the backlog, along with many other cool things. However, if this particular feature would be valuable to you, please let us know. 

That said, Percona Monitoring and Management is extensible, so you can actually make things work with a little bit of elbow grease using the external exporter functionality.

Here’s how:

1. Configure the MySQL host you wish to monitor with ProcFS UDF as described in this blog post.

2. Add this MySQL server as a remote instance using “Add Instance,” available in the PMM menu in the top right corner.

PMM Add Instance

3. Pick the host to capture metrics.

While you do not need any agent installed on the MySQL server you’re looking to monitor, you’ll need a host to capture metrics from it and pass them to the PMM server. This host will need the PMM client installed and configured. You will also need to install the Node Exporter build with ProcFS UDF support:

docker/podman run -p 9100:9100 -d docker.io/perconalab/node_exporter:procfs --collector.mysqlprocfs="MYSQLUSER:MYSQLPASSWORD@tcp(MYSQLHOST:3306)"

If you do not want to use docker, instructions on how to compile patched Node Exporter are included in the previously mentioned ProcFS UDF Introduction blog post.

You can use one host to monitor multiple MySQL servers — just run multiple Node Exporters on different ports.

4. Configure passing information to PMM.

Now that we have MySQL metrics flowing to PMM as a remote instance and Node Exporter running on a different host, which is ready to provide us metrics, how do we establish a connection so that those “node metrics” are attached to the correct host?

First, we need to find the node_id of the remote node we’ve added:

root@client1:# pmm-admin inventory list nodes
Nodes list.

Node type     Node name            Address           Node ID
GENERIC_NODE  mysql1               66.228.62.195
GENERIC_NODE  client1              50.116.36.182     /node_id/9ba48cd4-a7c2-43f6-9fa6-9571d1f0aebf     
….
REMOTE_NODE   procfstest           173.230.136.197   /node_id/29800e10-53fc-43f7-bba6-27c22ab3a483

Second, we need to add the external service for this node:

root@client1:~# pmm-admin inventory add service external --name=procfstest-node --node-id=/node_id/29800e10-53fc-43f7-bba6-27c22ab3a483
External Service added.
Service ID     : /service_id/c477453f-29fb-41e1-aa64-84ee51c38cd8
Service name   : procfstest-node
Node ID        : /node_id/29800e10-53fc-43f7-bba6-27c22ab3a483
Environment    :
Cluster name   :
Replication set:
Custom labels  : map[]
Group          : external

Note: The Node ID we use here is the Node ID of the remote node we are monitoring.

This creates the external service, but at this point it is really an orphan: there is no agent to supply the data.

root@client1:~# pmm-admin inventory add agent external --runs-on-node-id=/node_id/9ba48cd4-a7c2-43f6-9fa6-9571d1f0aebf --service-id=/service_id/c477453f-29fb-41e1-aa64-84eec38cd8 --listen-port=9100
External Exporter added.
Agent ID              : /agent_id/93e3856a-6d74-4f62-8e8d-821f7de73977
Runs on node ID       : /node_id/9ba48cd4-a7c2-43f6-9fa6-9571d1f0aebf
Service ID            : /service_id/c477453f-29fb-41e1-aa64-84ee51c38cd8
Username              :
Scheme                : http
Metrics path          : /metrics
Listen port           : 9100

Disabled              : false
Custom labels         : map[]

This command specifies that the external service we’ve created for our remote node will get data from the agent, which runs on the other node and uses the port specified by the listen-port option. This is the port our ProcFS-enabled Node Exporter is listening on.

After you’ve done these steps, you should see OS data for the remote host appear on the home dashboard.

PMM Remote Host Percona

And even more important, you’ll have OS metrics populated in the Node Summary dashboard.

Node Summary dashboard PMM

Summary

While this is much harder than it has to be, I think it serves as a great proof of concept (POC) of what is possible with the ProcFS UDF MySQL plugin: getting the full power of operating system and MySQL observability without requiring any shell access to the MySQL host.

I think this can be extremely valuable for MySQL provided as a Database as a Service (DBaaS), as well as for enterprises practicing great separation between their teams responsible for operating system and database operations!

Learn more about ProcFS UDF

Sep
06
2022
--

MySQL Replicate From Unsigned-int to Unsigned-bigint

MySQL Replicate

We often see an int column of a table that needs to be changed to unsigned-int and then unsigned-bigint because its values run out of range. Sometimes, there may even be blockers that prevent us from directly altering the table or applying pt-online-schema-change on the primary, which requires the rotation solution: apply the change on the replica first, switch the writes over to the replica, and then apply the change on the previous primary. In this case, MySQL will have to replicate unsigned-int to unsigned-bigint for a while.

One might think it is obvious and straightforward that MySQL should be able to replicate unsigned-int to unsigned-bigint, because unsigned-bigint has a larger size (8 bytes), which covers unsigned-int (4 bytes). That is partly true, but there are some tricks in practice. This blog will show you those tricks through a few scenarios.

Let’s understand the scenarios and issues that one may face when replicating from unsigned-int in primary to unsigned-bigint in replica. For scenarios one and two, we will only focus on the binlog_format=ROW, because with binlog_format=STATEMENT,  “If the statement run on the source would also execute successfully on the replica, it should also replicate successfully” – MySQL doc. However, for scenario three, we tested for both binlog_format=ROW and binlog_format=STATEMENT.

First, set up the table. Here we have a table test_replication_differ_type with unsigned int in primary and unsigned bigint in replica.

Primary: Server version: MySQL 8.0.28

Table: 

CREATE TABLE `test_replication_differ_type` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `replicate_col` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Replica: Server version: MySQL 8.0.28

Table: 

CREATE TABLE `test_replication_differ_type` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `replicate_col` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Scenario one: Insert into an unsigned int in primary and replicate to unsigned bigint

on Primary:

mysql> insert into test.test_replication_differ_type(id,replicate_col) values(1,1);
Query OK, 1 row affected (0.00 sec)
mysql> select * from test.test_replication_differ_type;
+----+---------------+
| id | replicate_col |
+----+---------------+
|  1 |             1 |
+----+---------------+

on Replica:

root@localhost [test]> show replica status\G
...
Last_SQL_Errno: 1677
Last_SQL_Error: Column 1 of table 'test.test_replication_differ_type' cannot be converted from type 'int' to type 'bigint(20)'

Analysis and solution:

Immediately we see an error stopping us from replicating. There is a configuration option, slave_type_conversions, which controls the type conversion mode used on the replica. The reason for this replication error is that, on the replica, the slave_type_conversions variable is NOT set by default.

root@localhost [test]> show variables like 'slave_type%';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| slave_type_conversions |       |
+------------------------+-------+

Setting slave_type_conversions to the mode ALL_LOSSY will not work because only cases “requiring either lossy conversions or no conversion at all are permitted; for example, enabling only this mode permits an INT column to be converted to TINYINT (a lossy conversion), but not a TINYINT column to an INT column (non-lossy).” – MySQL doc

root@localhost [test]> set global slave_type_conversions='ALL_LOSSY';
Query OK, 0 rows affected (0.00 sec)
root@localhost [test]> start slave;
Query OK, 0 rows affected (0.01 sec)
root@localhost [test]> show slave status\G
...
Last_Errno: 1677
Last_Error: Column 1 of table 'test.test_replication_differ_type' cannot be converted from type 'int' to type 'bigint(20)'

Setting slave_type_conversions to the mode ALL_NON_LOSSY will work because “This mode permits conversions that do not require truncation or other special handling of the source value; that is, it permits conversions where the target type has a wider range than the source type.” – MySQL doc

root@localhost [test]> set global slave_type_conversions='ALL_NON_LOSSY';
Query OK, 0 rows affected (0.00 sec)
root@localhost [test]> start slave;
Query OK, 0 rows affected (0.01 sec)
root@localhost [test]> show slave status\G
...
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
...
Seconds_Behind_Master: 0
root@localhost [test]> select * from test.test_replication_differ_type;
+----+---------------+
| id | replicate_col |
+----+---------------+
|  1 |             1 |
+----+---------------+

Don’t be confused by the names: ALL_LOSSY and ALL_NON_LOSSY are not mutually exclusive; they can be combined to permit both kinds of conversion:

On Replica:

root@localhost [test]> set global slave_type_conversions='ALL_NON_LOSSY, ALL_LOSSY';

On Primary:

mysql> insert into test.test_replication_differ_type(id,replicate_col) values(2,2);
Query OK, 1 row affected (0.00 sec)
root@localhost [test]> select * from test.test_replication_differ_type;
+----+---------------+
| id | replicate_col |
+----+---------------+
|  1 |             1 |
|  2 |             2 |
+----+---------------+

On Replica:

root@localhost [test]> select * from test.test_replication_differ_type;
+----+---------------+
| id | replicate_col |
+----+---------------+
|  1 |             1 |
|  2 |             2 |
+----+---------------+

Thus, if you want to replicate between columns of different data types, add the ALL_NON_LOSSY and/or ALL_LOSSY modes to the global variable slave_type_conversions as appropriate; a sketch of the lossy case follows below.
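For the reverse direction, where the target column is narrower than the source (for example, bigint unsigned on the primary and int unsigned on the replica), the lossy mode would be the relevant one. This is a minimal sketch of a hypothetical reverse setup, not something tested in this post:

-- Hypothetical reverse setup, run on the replica: the target type is narrower,
-- so lossy conversions must be allowed (out-of-range values get clamped)
SET GLOBAL slave_type_conversions = 'ALL_LOSSY';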

Scenario two: Insert a value outside the signed int range and replicate to unsigned bigint

The range of a signed int is -2147483648 to 2147483647. Let’s try to insert 2147483647+1 (2147483648) into the unsigned int column on the primary and observe how it replicates. That should be okay, right?

On Primary:

mysql> insert into test.test_replication_differ_type(id, replicate_col) values(3, 2147483647);
Query OK, 1 row affected (0.00 sec)
mysql> insert into test.test_replication_differ_type(id, replicate_col) values(4, 2147483648);
Query OK, 1 row affected (0.00 sec)
mysql> select * from test.test_replication_differ_type;
+----+---------------+
| id | replicate_col |
+----+---------------+
|  1 |             1 |
|  2 |             2 |
|  3 |    2147483647 |
|  4 |    2147483648 |
+----+---------------+
4 rows in set (0.00 sec)

On Replica:

root@localhost [test]> select * from test.test_replication_differ_type;
+----+---------------+
| id | replicate_col |
+----+---------------+
|  1 |             1 |
|  2 |             2 |
|  3 |    2147483647 |
|  4 |             0 |
+----+---------------+
4 rows in set (0.00 sec)

Uh-oh, we inserted 2147483648 on the primary but got 0 on the replica.

Analysis and solution:

Checking the binary log on the primary, we can see that the unsigned int value 2147483648 was written as @2=-2147483648 (2147483648), as shown below.

root@db2:/home/vagrant# mysqlbinlog -vvv /var/lib/mysql/mysql-bin.000010 > test_convert.sql
root@db2:/home/vagrant# cat test_convert.sql
...
# at 2635
#220821  4:55:56 server id 2  end_log_pos 2679 CRC32 0x85dfff8a Write_rows: table id 113 flags: STMT_END_F
BINLOG '
3LoBYxMCAAAASAAAAEsKAAAAAHEAAAAAAAEABHRlc3QAHHRlc3RfcmVwbGljYXRpb25fZGlmZmVy
X3R5cGUAAgMDAAKLYMRs
3LoBYx4CAAAALAAAAHcKAAAAAHEAAAAAAAEAAgAC//wEAAAAAAAAgIr/34U=
'/*!*/;
### INSERT INTO `test`.`test_replication_differ_type`
### SET
###   @1=4 /* INT meta=0 nullable=0 is_null=0 */
###   @2=-2147483648 (2147483648) /* INT meta=0 nullable=1 is_null=0 */
# at 2679
#220821  4:55:56 server id 2  end_log_pos 2710 CRC32 0x532d66ec Xid = 89
COMMIT/*!*/;

However, “When an integer type is promoted, its signedness is not preserved. By default, the replica treats all such values as signed” – MySQL doc. So -2147483648 (2147483648) is treated as the signed value -2147483648, which ends up as 0 because 0 is the minimum of unsigned bigint.
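To make the two’s-complement overlap concrete, here is a small ad-hoc query (not from the original test) showing that 2147483648 and -2147483648 share the same low 32-bit pattern, which is all a 4-byte INT column carries in the row image:

-- Both columns print 80000000: the 4-byte pattern is identical,
-- only the signed/unsigned interpretation differs
-- (4294967295 = 0xFFFFFFFF masks the low 32 bits)
SELECT HEX(2147483648)               AS unsigned_pattern,
       HEX(-2147483648 & 4294967295) AS signed_low_32_bits;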

We need to tell the MySQL replica to treat the value -2147483648 (2147483648) as an unsigned int instead of a signed int by adding the mode ALL_UNSIGNED to the variable slave_type_conversions, as in the example below:

On Replica

root@localhost [test]> set global slave_type_conversions='ALL_NON_LOSSY,ALL_UNSIGNED';
Query OK, 0 rows affected (0.00 sec)

On Primary

mysql> insert into test.test_replication_differ_type(id,replicate_col) values(5,2147483649);
Query OK, 1 row affected (0.00 sec)
mysql> select * from test.test_replication_differ_type;
+----+---------------+
| id | replicate_col |
+----+---------------+
|  1 |             1 |
|  2 |             2 |
|  3 |    2147483647 |
|  4 |    2147483648 |
|  5 |    2147483649 |
+----+---------------+
5 rows in set (0.00 sec)

On Replica

root@localhost [test]> select * from test.test_replication_differ_type;
+----+---------------+
| id | replicate_col |
+----+---------------+
|  1 |             1 |
|  2 |             2 |
|  3 |    2147483647 |
|  4 |             0 |
|  5 |    2147483649 |
+----+---------------+
5 rows in set (0.00 sec)

We can also add the mode ALL_SIGNED, but unlike ALL_LOSSY and ALL_NON_LOSSY, these two are not simply additive: ALL_SIGNED has a higher priority than ALL_UNSIGNED, which means MySQL will first treat the value as signed if possible and only otherwise treat it as unsigned. That is why, in the test below, the out-of-range value 2147483650 is again treated as signed and ends up as 0 on the replica.

On Replica

root@localhost [test]> set global slave_type_conversions='ALL_NON_LOSSY,ALL_LOSSY,ALL_UNSIGNED,ALL_SIGNED';
Query OK, 0 rows affected (0.00 sec)

On Primary

mysql> insert into test.test_replication_differ_type(id,replicate_col) values(6,2147483650);
Query OK, 1 row affected (0.01 sec)
mysql> select * from test.test_replication_differ_type;
+----+---------------+
| id | replicate_col |
+----+---------------+
|  1 |             1 |
|  2 |             2 |
|  3 |    2147483647 |
|  4 |    2147483648 |
|  5 |    2147483649 |
|  6 |    2147483650 |
+----+---------------+
6 rows in set (0.00 sec)

On Replica

root@localhost [test]> select * from test.test_replication_differ_type;
+----+---------------+
| id | replicate_col |
+----+---------------+
|  1 |             1 |
|  2 |             2 |
|  3 |    2147483647 |
|  4 |             0 |
|  5 |    2147483649 |
|  6 |             0 |
+----+---------------+
6 rows in set (0.00 sec)

Scenario three: The replica has an extra column and not all common columns use the same data type

The documentation reads: “In addition, when the replica’s copy of the table has more columns than the source’s copy, each column common to the tables must use the same data type in both tables.” – MySQL doc

Let us verify if that’s the behavior.

On Replica

root@localhost [test]> set global slave_type_conversions='ALL_NON_LOSSY,ALL_LOSSY,ALL_UNSIGNED';
Query OK, 0 rows affected (0.00 sec)
root@localhost [test]> alter table test.test_replication_differ_type add column extra_col int unsigned;
Query OK, 0 rows affected (0.04 sec)
Records: 0  Duplicates: 0  Warnings: 0
root@localhost [test]> show create table test.test_replication_differ_type\G
*************************** 1. row ***************************
      Table: test_replication_differ_type
Create Table: CREATE TABLE `test_replication_differ_type` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `replicate_col` bigint(20) unsigned DEFAULT NULL,
  `extra_col` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

On Primary

mysql> show create table test.test_replication_differ_type\G
*************************** 1. row ***************************
      Table: test_replication_differ_type
Create Table: CREATE TABLE `test_replication_differ_type` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `replicate_col` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
mysql> insert into test.test_replication_differ_type(id,replicate_col) values(7,2147483651);
Query OK, 1 row affected (0.00 sec)
root@localhost [test]> select * from test.test_replication_differ_type;
+----+---------------+
| id | replicate_col |
+----+---------------+
|  1 |             1 |
|  2 |             2 |
|  3 |    2147483647 |
|  4 |    2147483648 |
|  5 |    2147483649 |
|  6 |    2147483650 |
|  7 |    2147483651 |
+----+---------------+
7 rows in set (0.00 sec)

On Replica

root@localhost [test]> select * from test.test_replication_differ_type;
+----+---------------+-----------+
| id | replicate_col | extra_col |
+----+---------------+-----------+
|  1 |             1 |      NULL |
|  2 |             2 |      NULL |
|  3 |    2147483647 |      NULL |
|  4 |             0 |      NULL |
|  5 |    2147483649 |      NULL |
|  6 |             0 |      NULL |
|  7 |    2147483651 |      NULL |
+----+---------------+-----------+
7 rows in set (0.00 sec)

As our experiment shows, the statement in the documentation is not accurate. Even though replicate_col is bigint(20) unsigned on the replica rather than int(10) unsigned as on the primary, and the replica has an extra column extra_col, replication still works fine.

I have already filed a bug report with Oracle MySQL.

Conclusion

In this blog, I illustrated how to set the variable slave_type_conversions and explained the purpose of, and the differences between, the ALL_NON_LOSSY and ALL_LOSSY modes, as well as ALL_UNSIGNED and ALL_SIGNED.

Specifically, in order to have MySQL replicate from unsigned int to unsigned bigint correctly, we need to set the variable slave_type_conversions='ALL_NON_LOSSY,ALL_UNSIGNED' or slave_type_conversions='ALL_NON_LOSSY,ALL_LOSSY,ALL_UNSIGNED'.
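If you rely on this setting, you will likely also want it to survive a restart. This is a minimal option-file sketch, my own assumption of a typical layout rather than something from the original post, assuming MySQL/Percona Server 8.0:

[mysqld]
# Same conversions used at runtime above; on 8.0.26 and later the newer name
# replica_type_conversions is also available in place of slave_type_conversions
slave_type_conversions = ALL_NON_LOSSY,ALL_UNSIGNED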

Also note that when the replica has an extra column compared to the primary, replication can still work as long as the common columns have the same names, appear in the same order, and come before the extra column, even when a common column’s data type differs from the primary’s.

Sep
01
2022
--

Getting Started with RocksDB in Percona Server for MySQL

You may have read MyRocks Use Case: Big Dataset and been intrigued enough to want to evaluate RocksDB alongside InnoDB. It has several advantages, including requiring less storage space.

So how do you start?

On a fresh install of Percona Server for MySQL, install RocksDB by entering sudo apt install percona-server-rocksdb:

percona@DellXPS:~$ sudo apt install percona-server-rocksdb
[sudo] password for percona:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
percona-server-rocksdb
0 upgraded, 1 newly installed, 0 to remove and 32 not upgraded.
Need to get 65.3 MB of archives.
After this operation, 292 MB of additional disk space will be used.
Get:1 http://repo.percona.com/ps-80/apt focal/main amd64 percona-server-rocksdb amd64 8.0.29-21-1.focal [65.3 MB]
Fetched 65.3 MB in 8s (8531 kB/s)
Selecting previously unselected package percona-server-rocksdb.
(Reading database ... 78308 files and directories currently installed.)
Preparing to unpack .../percona-server-rocksdb_8.0.29-21-1.focal_amd64.deb ...
Unpacking percona-server-rocksdb (8.0.29-21-1.focal) ...
Setting up percona-server-rocksdb (8.0.29-21-1.focal) ...


* This release of Percona Server is distributed with RocksDB storage engine.
* Run the following script to enable the RocksDB storage engine in Percona Server:

ps-admin --enable-rocksdb -u <mysql_admin_user> -p[mysql_admin_pass] [-S <socket>] [-h <host> -P <port>]

Next, we can enable RocksDB by running sudo ps-admin --enable-rocksdb -u root -p:

percona@DellXPS:~$ sudo ps-admin --enable-rocksdb -u root -p
Enter password:

Checking if RocksDB plugin is available for installation ...
INFO: ha_rocksdb.so library for RocksDB found at /usr/lib/mysql/plugin/ha_rocksdb.so.

Checking RocksDB engine plugin status...
INFO: RocksDB engine plugin is not installed.

Installing RocksDB engine...
INFO: Successfully installed RocksDB engine plugin.

Now it is time to fire up the MySQL CLI and run SHOW ENGINES to double-check that RocksDB is ready.

mysql> show engines;
+--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                    | Transactions | XA   | Savepoints |
+--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                         | NO           | NO   | NO         |
| ROCKSDB            | YES     | RocksDB storage engine                                                     | YES          | YES  | YES        |
| InnoDB             | DEFAULT | Percona-XtraDB, Supports transactions, row-level locking, and foreign keys | YES          | YES  | YES        |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                  | NO           | NO   | NO         |
| MyISAM             | YES     | MyISAM storage engine                                                      | NO           | NO   | NO         |
| FEDERATED          | NO      | Federated MySQL storage engine                                             | NULL         | NULL | NULL       |
| MRG_MYISAM         | YES     | Collection of identical MyISAM tables                                      | NO           | NO   | NO         |
| BLACKHOLE          | YES     | /dev/null storage engine (anything you write to it disappears)             | NO           | NO   | NO         |
| CSV                | YES     | CSV storage engine                                                         | NO           | NO   | NO         |
| ARCHIVE            | YES     | Archive storage engine                                                     | NO           | NO   | NO         |
+--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+
10 rows in set (0.01 sec)

And now we can use RocksDB.

mysql> create schema rocks;
Query OK, 1 row affected (0.01 sec)

mysql> use rocks;
Database changed
mysql> create table r1 (id int, foo char(25)) engine=rocksdb;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into r1 (id,foo) values (1,'test'),(2,'wow'),(3,'q');
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from r1;
+------+------+
| id   | foo  |
+------+------+
|    1 | test |
|    2 | wow  |
|    3 | q    |
+------+------+
3 rows in set (0.00 sec)

Now, how does this compare to InnoDB? I created a table like the one above but using InnoDB (sketched below) and populated it with the same three rows of data; even at this tiny size the results show a difference. I am sure that with larger data sets the savings with RocksDB would be substantially larger.
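The InnoDB comparison table is not shown in the original post; this is a minimal sketch of what it presumably looked like, reusing the i1 name that appears in the size query below:

-- Presumed InnoDB twin of the r1 table, with the same three rows
CREATE TABLE rocks.i1 (id int, foo char(25)) ENGINE=InnoDB;
INSERT INTO rocks.i1 (id, foo) VALUES (1,'test'),(2,'wow'),(3,'q');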

mysql> SELECT table_name AS `Table`, 
              round(((data_length + index_length) / 1024 / 1024), 2) `Size (MB)` 
       FROM information_schema.TABLES 
       WHERE table_schema = "rocks";
+-------+-----------+
| Table | Size (MB) |
+-------+-----------+
| i1    |      0.02 |
| r1    |      0.00 |
+-------+-----------+
2 rows in set (0.00 sec)

So give it a try and let me know your results.
