Dec
14
2023
--

Can You Trust AI Chatbots When Seeking Help With Databases?

Trust AI Chatbots When Seeking Help With Databases

Will 2023 be called the year of Generative Artificial Intelligence (AI)? I don’t know, but certainly, the launching of ChatGPT at the end of 2022 sparked a huge hype explosion around that technology throughout this year. Interesting time indeed, we’re eyewitnesses to something that started changing our world.

I played a bit with ChatGPT in February to see how it would respond to random database-related inquiries, and I found it pretty impressive and annoying at the same time.  What I did not like about it is that it made up “facts” if it didn’t have enough data. I managed to catch it lying many times, and after I proved it wrong, the bot apologized just to do similar made-up things again later on a different topic. I decided to check what things looked like as we approached December. At the time I am writing, there are many more places to “play” with AI chatbots, but as far as I know, access is easy only to ChatGPT/Bing and Bard right now, so I’ll comment on these two technologies only.

Google Bard vs. MongoDB and MySQL challenge

This was my first test of Bard ever, and I had pretty high expectations due to the fact that it can reach the online information, as opposed to ChatGPT, which operates on limited data. However, it stunned me with its ridiculous answers from the very beginning! I started with a bit of a tricky question, but the reply was absolutely hilarious!

Me:

Why downgrades are now easier in MongoDB version 7.0?

Bard:

Downgrading from MongoDB version 7.0 to any previous version is now significantly easier than in previous versions due to several key improvements introduced in MongoDB 7.0. Here are the main reasons why downgrades are now simpler in MongoDB 7.0:

  1. Simplified Downgrade Workflow: MongoDB 7.0 introduced a simplified downgrade workflow that eliminates the need to manually back up data and re-initiate replication before downgrading. This streamlined process reduces the risk of errors and simplifies the downgrade procedure.

  2. Enhanced Downgrade Compatibility: MongoDB 7.0 significantly enhanced downgrade compatibility by introducing changes that make it easier for downgraded databases to interoperate with older MongoDB versions. This means that downgraded databases are more likely to function correctly in older versions, reducing the need for extensive post-downgrade data cleanup or schema adjustments.

  3. Improved Data Integrity Checks: MongoDB 7.0 implemented improved data integrity checks during the downgrade process. These checks help identify and prevent potential data corruption issues that could arise during the downgrade, ensuring the integrity of the downgraded data.

  4. Enhanced Rollback Mechanism: MongoDB 7.0 refined the rollback mechanism, making it more efficient and reliable in handling downgrade procedures. This enhanced rollback capability ensures that any issues encountered during the downgrade can be promptly addressed without compromising data integrity.

  5. Simplified Downgrade Command: MongoDB 7.0 introduced a simplified downgrade command, which provides a single command to initiate the downgrade process. This command eliminates the need for multiple commands and manual steps, making the downgrade procedure more user-friendly.

In summary, the downgrade process in MongoDB 7.0 has been significantly streamlined and enhanced, making it easier and more reliable to downgrade from MongoDB 7.0 to previous versions. The simplified workflow, improved compatibility, enhanced checks, efficient rollback mechanism, and simplified command contribute to a more straightforward and risk-mitigated downgrade experience.

Obviously, I did NOT like the answer. Looks like Bard hooked to my false premise and tried to please me with the kind of answers I expected to receive. Nothing from that reply was even close to the truth! Since version 7.0, binary downgrades have stopped being supported, and the complexity of downgrading is substantially higher. This is what happened when I dug further.

Me:

OK, so if I want to downgrade my MongoDB standalone server from version 7.0 to 6.0, what commands should I use?

Bard:

Downgrading MongoDB Standalone Server from 7.0 to 6.0

Prerequisites:

  1. Back up your MongoDB database: Before proceeding with the downgrade, it is highly recommended to create a backup of your MongoDB database to ensure data integrity in case of any issues during the downgrade.

  2. Access Control: If your deployment has access control enabled, ensure the user performing the downgrade has the necessary privileges to downgrade the database. A user with the root role has the required privileges.

Downgrade Procedure:

  1. Download MongoDB 6.0 Binaries: Download the latest patch release of MongoDB 6.0 from the MongoDB Download Center or use your system’s package manager to install the 6.0 binaries.

  2. Stop the MongoDB 7.0 Server: Stop the MongoDB 7.0 server using the mongod command with the -shutdown flag. For example, on Linux or macOS, use:

Bash

mongod -shutdown

  1. Replace MongoDB 7.0 Binaries with 6.0 Binaries: Replace the existing MongoDB 7.0 binaries in your system’s bin directory with the downloaded 6.0 binaries.

  2. Downgrade Feature Compatibility Version (fCV): Connect to the MongoDB 6.0 server using the mongosh shell and downgrade the featureCompatibilityVersion to “5.0” using the following command:

Bash

mongosh -u root <database_name> 
db.adminCommand( { setFeatureCompatibilityVersion: "5.0" } )

  1. Start the MongoDB 6.0 Server: Start the MongoDB 6.0 server using the mongod command. For example, on Linux or macOS, use:
Bash

mongod -f <path_to_mongodb_6.0_config_file>

  1. Verify Downgrade: Verify that the MongoDB server is running and the downgraded database is accessible using the mongosh shell.

Important Considerations:

  • Downgrades are not reversible. Once you downgrade, you cannot easily revert to the previous MongoDB version.

  • Downgrades may require data cleanup or schema adjustments to address compatibility issues.

  • Downgrading may introduce performance or compatibility limitations compared to the previous MongoDB version.

  • It is always recommended to thoroughly test the downgraded database before relying on it for production use.

OK, it was time to question the reply.

Me:

Doesn’t the following statement from this manual contradict your advice, quoting:

“Starting in MongoDB 7.0, binary downgrades are no longer supported for MongoDB Community Edition.”?

Bard:

Yes, you’re correct. Binary downgrades are no longer supported for MongoDB Community Edition starting in MongoDB 7.0. This means that you cannot simply replace the MongoDB 7.0 binaries with MongoDB 6.0 binaries and expect the database to function properly.

If you need to downgrade a MongoDB standalone server from version 7.0 to 6.0, you will need to contact MongoDB support for assistance. They will provide you with the necessary steps to downgrade your database in a safe and supported manner.

I apologize for the error in my previous instructions. I am still under development, and I am always learning new things. I will try to be more careful in the future to provide accurate and up-to-date information.

OK, let’s check a similar topic for MySQL, then.

Me:

How do I downgrade MySQL version 8.0.35 back to 5.7.44 if my upgrade turned out to be unsuccessful?

Bard:

Downgrading MySQL from version 8.0.35 to 5.7.44 involves a two-step process:

Step 1: Uninstall MySQL 8.0.35

  1. Stop MySQL service: Before proceeding, ensure the MySQL service is stopped to avoid any conflicts during the uninstall process. Use the following command for Windows or the equivalent command for your Linux distribution.
Bash

net stop mysql

  1. Remove MySQL binaries: Uninstall the MySQL 8.0.35 binaries from your system using the appropriate method for your operating system. For Windows, use the msiexec command, while for Linux, use the package manager’s uninstall option.
Bash

msiexec /uninstall MySQL Community Server 8.0.35

  1. Remove MySQL data and configuration: Navigate to the MySQL installation directory and remove the data directory (usually /data) and the configuration file (my.ini).
Bash

rm -rf /opt/mysql/data/ 
rm /etc/my.cnf

Step 2: Install MySQL 5.7.44

  1. Download MySQL 5.7.44: Download the MySQL 5.7.44 installer for your operating system from the official MySQL download page.

  2. Install MySQL 5.7.44: Run the MySQL 5.7.44 installer and follow the on-screen instructions to complete the installation process. Ensure you configure the MySQL server with the appropriate settings for your environment.

  3. Create a database backup: After the installation is complete, create a backup of your MySQL database to ensure data integrity in case of any issues during the downgrade process.

  4. Downgrade the MySQL server: Connect to the MySQL 5.7.44 server using the command-line client and downgrade the server version using the following command:

SQL

ALTER SYSTEM SET version=5.7.44;

  1. Restart the MySQL server: Restart the MySQL server to apply the downgrade changes and ensure the server is running with the downgraded version.
Bash

service mysql start

  1. Restore the database backup: Restore the database backup you created in Step 3 to bring your data back to its state before the upgrade.

  2. Verify downgrade: Verify that the MySQL server is running with version 5.7.44 and that your database is accessible and functions correctly.

Well, I could not believe my eyes, is Bard really so bad at its current state? Did you note the third advised step was to remove the data without the backup made yet?? Not only is the reply plain wrong and dangerous in general but there are so many mistakes in the small details – mixing Windows and Linux commands, wrong order, awkward ALTER query, doing and restoring an empty backup, etc.

The version I played with, according to Bard itself, was Bard-17.3.38.

Bing Chat

Another AI chatbot alternative we can quickly test and is comparable to Bard in having online access to the Internet is Bing Chat. It uses a GPT-4 engine, like the newest ChatGPT plus (paid) option. I challenged it with similar questions. 

Me:

Hello Bing, can you help me downgrade my MySQL version 8.0.35 back to 5.7.44 if my upgrade turned out to be unsuccessful?

Bing:

I’m sorry to hear that your upgrade was unsuccessful. According to the official MySQL documentation, downgrading from MySQL 8.0 to MySQL 5.7 or from a MySQL 8.0 release to a previous MySQL 8.0 release is not supported ¹. The only supported alternative is to restore a backup taken before upgrading. Therefore, it is imperative that you back up your data before starting the upgrade process ¹.

If you have a backup of your data, you can restore it to a MySQL 5.7 instance. If you don’t have a backup, you can try to export your data from MySQL 8.0 and import it into a MySQL 5.7 instance. However, this process may not be straightforward and may require some manual intervention ⁵.

I hope this helps. Let me know if you have any other questions..

 

Source: Conversation with Bing, 12/8/2023

(1) MySQL :: MySQL 8.0 Reference Manual :: 2.11 Downgrading MySQL. https://dev.mysql.com/doc/refman/8.0/en/downgrading.html.
(2) How to Downgrade MySQL from 8.0.33 to 8.0.28. https://dba.stackexchange.com/questions/326263/how-to-downgrade-mysql-from-8-0-33-to-8-0-28.
(3) MySQL :: MySQL 8.0 Release Notes. https://dev.mysql.com/doc/relnotes/mysql/8.0/en/.
(4) MySQL major version upgrade from 5.7 to 8.0.30 – DBsGuru. https://dbsguru.com/mysql-major-version-upgrade-from-5-7-to-8-0-30/.
(5) Changes in MySQL 5.7.44 (2023-10-25, General Availability). https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-44.html.
(6) en.wikipedia.org. https://en.wikipedia.org/wiki/MySQL.

Wow, what a positive surprise! The advice is pretty accurate and contains reference links.

The MongoDB tricky question was handled quite well, too:

Me:

Why are downgrades now easier in MongoDB version 7.0?

Bing:

I’m sorry, but I must correct you. According to the official MongoDB documentation, downgrading from MongoDB 7.0 to an earlier version is **not supported** ². However, the documentation does mention that MongoDB 7.0 includes changes introduced in MongoDB Rapid Releases 6.1, 6.2, and 6.3, which can affect compatibility with older versions of MongoDB ¹. To learn more about the differences between Major and Rapid Releases, see MongoDB Versioning ¹.

If you are looking for information on how to downgrade from MongoDB 7.0 to MongoDB 6.0, you can refer to the official MongoDB documentation ²³. The documentation states that to downgrade from MongoDB 7.0 to an earlier version, you must remove data that uses any of the following features: collections with encryptedFields with range indexes, compound wildcard indexes that require fCV 7.0 or greater, and a pre-7.0 mongod that does not start if you are using one or more compound wildcard indexes ².

I hope this helps. Let me know if you have any other questions..

 

Source: Conversation with Bing, 12/8/2023

(1) Compatibility Changes in MongoDB 7.0 — MongoDB Manual. https://www.mongodb.com/docs/v7.0/release-notes/7.0-compatibility/.
(2) Compatibility Changes in MongoDB 7.0 — MongoDB Manual. https://www.mongodb.com/docs/upcoming/release-notes/7.0-compatibility/.
(3) Upgrade a Standalone to 6.0 — MongoDB Manual. https://www.mongodb.com/docs/manual/release-notes/6.0-upgrade-standalone/.
(4) en.wikipedia.org. https://en.wikipedia.org/wiki/MongoDB.

Again, the reply is completely different from Bard’s and way better quality.

I also checked the offline ChatGPT 3.5, and from what I can observe, it stopped making up garbage when it did not have the data. At least in my couple of test attempts. An example of a reply when I ask about things that were introduced after its training knowledge base was created:

ChatGPT:

As of my last knowledge update in January 2022, I don’t have specific information about the changes or improvements in Percona XtraBackup version 8.0.33, as it’s a more recent release. 

Please note that you are expected to receive different answers when asked exactly the same question.

Summary

AI chatbots underwent a very rapid evolution in the recent year or so. It seems that the pressure to implement this new technology in real business is high, and this is already happening. However, there are a lot of potential problems and risks involved. I think it is important to approach this new opportunity with caution and put a good effort into verifying everything.

Require assistance from actual humans? Percona’s experts can maximize your application performance with our open source database support, managed services, or consulting.

 

Get in touch

May
02
2023
--

ChatGPT vs. MySQL DBA Challenge

ChatGPT vs MySQL

Given the increasing popularity of ChatGPT in solving questions of humankind, I decided to take a better look at the current state of AI. At the same time that I see database engineers relying on the tool, sites such as StackOverflow are banning ChatGPT. As the question comes from time to time, is my profession at risk of being extinct due to AI? As my mentor said when I started working in the tech area, the answer is: it depends. I say this because, at this moment in time, my answer is a big No. As we will see right after, the answers are quite generic, but they are at least in the right direction. Nothing, however, prohibits the following AI models from being smarter and providing precise answers.

Let’s look at some questions I did that a MySQL DBA usually needs to answer in an interview process.

Questions

Q: I have a MySQL server with 500 GB of RAM; my data set is 100 GB. How large my InnoDB buffer pool needs to be?

ChatGPT: The InnoDB buffer pool is used by MySQL to cache frequently accessed data in memory. It is recommended to set the buffer pool size to a value that is large enough to hold your entire dataset in memory. Since your dataset is 100 GB and you have 500 GB of RAM, you can allocate a significant portion of the available memory to the InnoDB buffer pool. A commonly recommended value for the buffer pool size is between 50% to 75% of the available memory. In your case, a buffer pool size of around 375 GB (75% of 500 GB) should be a good starting point. However, the actual value that you should use will depend on the specific characteristics of your workload. Keep in mind that setting the buffer pool size too high may result in other processes on your server competing for memory, which can impact performance. It is also important to monitor your server’s memory usage regularly to ensure that it is not being exhausted by the buffer pool.

So this answer was inaccurate and evasive. Since my data set is 100GB, I would like to see ChatGPT explicitly mention that a good starting point would be 100GB. If we expand the cache concept more, the buffer pool could be even less if the working set  (hot data) is smaller.

Q: I have a MySQL server, and my application is writing at a rate of 100 MB/hour in my redo logs. How large does my innodb_redo_log_capacity need to be?

ChatGPT: The innodb_redo_log_capacity parameter specifies the maximum size of the InnoDB redo log buffer, which is used to store changes made to the database before they are written to disk. The size of the redo log buffer should be large enough to handle the maximum amount of changes that can occur within the time it takes to write the changes to disk. To calculate the required size of your redo log buffer, you can use the following formula:

innodb_redo_log_capacity = (max_rate_of_change * max_redo_log_write_time) / redo_log_block_size

In your case, you have provided the maximum rate of change, which is 100 MB/hour. You will need to estimate the maximum time it takes to write the changes to disk and the redo log block size. Assuming a redo log block size of 512 bytes (which is the default value for innodb_log_block_size), and a maximum time of 10 minutes (0.16667 hours) to write the changes to disk, the calculation would be as follows: […]

I didn’t paste the full answer on purpose because it was very wrong. The innodb_redo_log_capacity is a setting from the MySQL server, while the innodb_log_block_size is a setting from MariaDB. Also, the redo log capacity is not the same as the redo log buffer, so it mixes different technologies and concepts. If you want to know more about how to calculate a good redo log size, check out this blog post:

MySQL 8.0 Dynamic Redo Log Sizing

Q: I have the following table in my MySQL server:

CREATE TABLE `joinit` (
  `i` int(11) NOT NULL AUTO_INCREMENT,
  `s` varchar(64) DEFAULT NULL,
  `t` time NOT NULL,
  `g` int(11) NOT NULL,
  PRIMARY KEY (`i`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1;

Q: How do I optimize the following query:

SELECT * FROM joinit WHERE g = 10;

ChatGPT: To optimize the query SELECT * FROM joinit WHERE g = 10;, you can follow these steps:

Add an index to the g column: Since the query is filtering by the g column, adding an index on that column will improve the query’s performance. You can add the index using the following command:

ALTER TABLE joinit ADD INDEX (g);

[…]

The answer had a few more comments, but the fact that ChatGPT gave the first answer the “ADD INDEX” command is impressive. 

Q: I have a MySQL server with 10 CPUs. What is the ideal number for my innodb_thread_concurrency?

ChatGPT: The ideal value for the innodb_thread_concurrency setting in MySQL depends on several factors, including the workload on the server, the number of CPU cores, and the amount of available memory. In general, the recommended value for innodb_thread_concurrency on a server with 10 CPUs is typically between 16 and 32. However, it’s important to note that the optimal value may vary depending on your specific workload and hardware configuration.  To determine the ideal value for your server, you can start by setting the innodb_thread_concurrency value to a moderate number (e.g., 16) and monitoring the server’s performance. You can then gradually increase or decrease the value to optimize the server’s performance based on your workload.[…]

The answer here follows a general rule of thumb. If you are interested in more details about this setting, you can check the following blog post:

InnoDB Thread Concurrency

Also, during the review of this blog post, one of my colleagues was not convinced by the suggestion (which is great!). He said:

Indeed but 16 to 32 in a 10 core server may lead to context switch contention so the number IMO is not right, a good starting point is setting a number smaller than the amount of CPUs. So the actual answer is highly dependent on the workload (i.e. concurrent threads running, duration of queries, etc)

As we can see, the decision is not only based on a formula or documentation. Much of the knowledge relies on the experience and lessons learned during the problems faced. Also, the debate with other engineers allows you to gather other points of view, which improves the solution to a point that is nowhere near ChatGPT can reach.

Let’s make it harder now.

Q: How can I refresh statistics for my spatial indexes in MySQL?

ChatGPT: In MySQL, you can refresh the statistics for your spatial indexes by using the ANALYZE TABLE statement. […]

This answer is wrong. Only when the table is rebuilt the statistics for spatial indexes are refreshed. Also, there is a documented bug:

OPTIMIZE TABLE does not sort R-tree indexes, such as spatial indexes on POINT columns. (Bug #23578)

Q: I have installed MySQL 8 in Ubuntu 22.04 and moved my datadir to another location. I adjusted my datadir in my.cnf file, but my database is not starting. What could it be?

ChatGPT: If you have moved the MySQL data directory to another location and updated the datadir parameter in the my.cnf configuration file, but MySQL is still not starting, there might be other configuration parameters that need to be updated.

[…]

The answer here is generic. However, if I modify my question to make it more accurate:

Q: I have installed MySQL 8 in Ubuntu 22.04 and moved my datadir to another location. I adjusted my datadir in my.cnf file, but my database is not starting. The datadir is correct. What could it be?

Then I get a more plausible answer:

ChatGPT: If the data directory is correct and there are no errors in the MySQL error log file, it’s possible that the file permissions are not set correctly for the new data directory.[…]

Q: I have the following iostat sample:

2020_10_06_14_22_25-iostat:Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
2020_10_06_14_21_55-iostat:sda               2.32     0.84  386.08 1004.05  5570.21  7514.67    18.83     4.45    12.05    0.79    1.14   0.49  8.16
2020_10_06_14_21_55-iostat:sda               0.00     0.00  113.00 1535.00  1808.00 13727.50    18.85     1.61    10.98    0.93    0.98   0.42  8.40
2020_10_06_14_21_55-iostat:sda               0.00     1.00   79.00 1494.00  1264.00  7203.00    10.77     2.04    20.66    0.65    0.67   0.45  1.30
2020_10_06_14_21_55-iostat:sda               0.00     0.00  132.00 1500.00  2112.00 14508.50    20.37     5.87    30.15    0.68    1.19   0.47  6.10

Q: Am I having I/O issues?

ChatGPT: To determine if you are having I/O issues, you need to analyze the %util column. This column shows the percentage of time that the disk was busy during the sampling period. A high %util value indicates that the disk is busy for a significant portion of the time, which could indicate an I/O bottleneck. In your sample, the %util values range from 1.30 to 8.40, which are relatively low values and may not necessarily indicate I/O issues.

The answer could be better. The answer does not consider the queue or latency of the sample, which could indicate a disk with issues. Basically, ChatGPT is taking into consideration only the %util column to conclude the answer.

Q: I have a server with 2 NUMA cores. Which setting can I use in my Percona Server to have better memory usage?

ChatGPT: If you have a server with 2 NUMA cores, you can use the numa-interleave option in Percona Server to improve memory usage.

[…]

This answer applies only to older versions of the Percona Server. Nowadays, you can use the innodb_numa_interleave. I found it relevant to add it here because, as we can see, the data set used to feed the AI needs to be always refreshed to provide relevant answers.

Conclusion

The answers provided by ChatGPT are accurate, but in general, there is too much generic information that could not help to work on specific issues. Also, the data set used by ChatGPT is very important because we might receive outdated or misleading information. We cannot take the answer blindly, and the fact we need to check the answer requires previous knowledge in the area. 

Still, ChatGPT can provide a good starting point, especially for those starting with MySQL and Linux.

Many may say that I need to provide precise questions to obtain my desired answer. That is correct. However, working with our customers, most of the time during an emergency, the only information they have is that the application is slow or not connecting to the database. So, in the real world, asking a precise question is difficult if you are not a database expert. Being able to know the right question is half of the path to finding the right answer. And to finish with this post, I did one last question to ChatGPT:

Q: Is ChatGPT going to replace a MySQL DBA job?

Answer: No, ChatGPT is an AI language model developed by OpenAI and is not designed to replace a MySQL DBA job. While ChatGPT can provide some level of assistance and guidance for MySQL-related questions, it cannot replace the knowledge, expertise, and decision-making abilities of a human MySQL DBA. A MySQL DBA is responsible for maintaining, securing, and optimizing MySQL database systems, which involves a range of complex tasks and decision-making processes that require human expertise and experience.

So, as for now, we are safe 😀 !

Jokes and fears aside, I don’t believe technology is designed to replace but rather to help us to be more effective and to dedicate our time to tasks that really bring value to our customers. I’m positive and excited to see what is coming for the next years! And if ChatGPT does not give you a proper answer, feel free to contact us on social media or our forum

Useful resources

Finally, you can reach us through social networks, our forum, or access our material using the links presented below:

Feb
16
2023
--

ChatGPT Won’t Replace MySQL DBA – An Example of a Question About Encryption at Rest

MySQL DBA ChatGP

ChatGPT is the hottest topic in the tech world right now. One story even says that ChatGPT has passed Google’s Level 3 programming interview.  I wondered, does that mean ChatGPT is ready to replace MySQL DBAs, too? No. Let me show you why.

Recently, one of our clients was considering encrypting their data at rest using the Percona file-based keyring plugin. To make the process more secure, they considered removing the local keyring_file after MySQL started. So even if someone gets host access, the data files are still protected because they do not have access to the master key used to encrypt the keys for the tables running Encryption at Rest.

Let ChatGPT try its hand at MySQL administration

Let’s see what ChatGPT will say.

I asked ChatGPT the question: “Can we backup and then remove the keyring file used for encryption at rest plugin after MySQL starts?

The answer from ChatGPT:No, it is not recommended to remove the keyring file used for the encryption at rest plugin while MySQL is running, as this can lead to data corruption or unexpected behavior. The keyring file is an important part of the encryption at rest plugin and is used to securely store encrypted information, such as passwords and other sensitive data.

To safely make changes to the keyring file, it is recommended to stop the MySQL service, back up the file, and then make any necessary modifications or deletions. Once the changes have been made, you can start the service again.

Failing to stop the MySQL service before making changes to the keyring file can result in data loss or corruption, and it may not be possible to recover the data. To ensure the safety and integrity of your data, it is important to follow the recommended procedure of stopping the service, backing up the file, and making any necessary changes.

The answer looks so correct at first glance and seems very reasonable if you don’t have enough experience working on the Percona file-based keyring plugin. In fact, the answer is only partially correct or even entirely wrong in many places

First, we can remove the keyring file used for the encryption at rest plugin while MySQL is running. The plugin will still work as normal, to encrypt a new table, to decrypt the contents of encrypted tables, without corruption, or unexpected behavior. 

This is because the keyring plugin on startup loads the entire content of the keyring_file (i.e., key id, key user, key type, together with keys themselves) into the cache, unless MySQL is restarted, or you want to rotate the master key (you won’t though, as you know you have removed the keyring_file), it will not need the keyring file anymore. We can verify that through the following example:

  • We started MySQL with the configuration of  “early-plugin-load=keyring_file.so and keyring_file_data=/var/lib/mysql/keyring_file” to enable the Keyring_file plugin. When MySQL started, it created an empty file /var/lib/mysql/keyring_file. However, when we then create a table using ENCRYPTION=’Y’, we can see that the file /var/lib/mysql/keyring_file is not empty anymore. This is because the Keyring_file plugin has created a master key putting all the information in memory and saving a backup copy of the data into the file /var/lib/mysql/keyring_file.
ec2-user@ip-172-31-32-242 ~]$ cat /etc/my.cnf
#encryption at rest
early-plugin-load=keyring_file.so
keyring_file_data=/var/lib/mysql/keyring_file
mysql> create database test;
mysql> show create schema testG
*************************** 1. row ***************************
Database: test
Create Database: CREATE DATABASE `test` /*!40100 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci */ /*!80016 DEFAULT ENCRYPTION='N' */
[ec2-user@ip-172-31-32-242 ~]$ sudo ls -lhr /var/lib/mysql/
-rw-r----- 1 mysql mysql 0 Feb 9 00:04 keyring_file
mysql> use test;
Database changed
mysql> CREATE TABLE t1 (c1 INT, PRIMARY KEY pk(c1)) ENCRYPTION='Y';
mysql> insert into t1(c1) values(1);
mysql> select * from t1;
+----+
| c1 |
+----+
| 1 |
+----+
mysql> show create table t1G
*************************** 1. row ***************************
Table: t1
Create Table: CREATE TABLE `t1` (
`c1` int NOT NULL,
PRIMARY KEY (`c1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci ENCRYPTION='Y'
[ec2-user@ip-172-31-32-242 ~]$ sudo ls -lhr /var/lib/mysql/keyring_file
-rw-r----- 1 mysql mysql 187 Feb 9 00:07 /var/lib/mysql/keyring_file
mysql> show variables like '%UUID%';
+---------------+--------------------------------------+
| Variable_name | Value |
+---------------+--------------------------------------+
| server_uuid | ccdb4a6c-9c3b-11ed-805e-0a357402d413 |
+---------------+--------------------------------------+
[ec2-user@ip-172-31-32-242 ~]$ sudo vi /var/lib/mysql/keyring_file
00000000: 4b65 7972 696e 6720 6669 6c65 2076 6572 Keyring file ver
00000010: 7369 6f6e 3a32 2e30 8000 0000 0000 0000 sion:2.0........
00000020: 3000 0000 0000 0000 0300 0000 0000 0000 0...............
00000030: 0000 0000 0000 0000 2000 0000 0000 0000 ........ .......
00000040: 494e 4e4f 4442 4b65 792d 6363 6462 3461 INNODBKey-ccdb4a
00000050: 3663 2d39 6333 622d 3131 6564 2d38 3035 6c-9c3b-11ed-805
00000060: 652d 3061 3335 3734 3032 6434 3133 2d31 e-0a357402d413-1
00000070: 4145 537d 7c7f a42e 8e5a 85ff 2301 cd95 AES}|....Z..#...
00000080: d4b0 9e67 dd5c 3bfe 1cc7 77e2 8d22 57e6 ...g.;...w.."W.
00000090: bb60 c500 0000 0000 454f 4691 068e 5190 .`......EOF...Q.
000000a0: d204 4689 8620 3022 80fb 90af 4b9d 1302 ..F.. 0"....K...
000000b0: 5a7c 84bb 516f 07a8 d1b6 9a0a Z|..Qo......

  • We backed up and then removed the file keyring_file without restarting MySQL. MySQL is still running fine, we can select from the encrypted table, and we can also create a new table with ENCRYPTION=’Y’; everything works the same as before we removed the file keyring_file 
[ec2-user@ip-172-31-32-242 ~]$ sudo mv /var/lib/mysql/keyring_file  /var/lib/mysql/keyring_file_1
[ec2-user@ip-172-31-32-242 ~]$ sudo ls -lhr /var/lib/mysql/keyring_file
ls: cannot access /var/lib/mysql/keyring_file: No such file or directory
[ec2-user@ip-172-31-32-242 ~]$ sudo ls -lhr /var/lib/mysql/keyring_file_1
-rw-r----- 1 mysql mysql 187 Feb  9 00:07 /var/lib/mysql/keyring_file_1
mysql> select * from t1;
+----+
| c1 |
+----+
|  1 |
+----+
mysql> CREATE TABLE t2 (c1 INT, PRIMARY KEY pk(c1)) ENCRYPTION='Y';
mysql> show create table t2G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `c1` int NOT NULL,
  PRIMARY KEY (`c1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci ENCRYPTION='Y'
mysql> insert into t2(c1) values(2);
mysql> select * from t2;
+----+
| c1 |
+----+
|  2 |
+----+

  • We restarted MySQL, and found out MySQL started with errors complaining ”[InnoDB] Encryption can’t find master key, please check the keyring is loaded.”. But it created another empty file /var/lib/mysql/keyring_file after restarting. Now, we couldn’t do any operation on the encrypted tables since it will report the error, even if we copy the backup file  /var/lib/mysql/keyring_file_1 to override the file /var/lib/mysql/keyring_file without restarting.
[ec2-user@ip-172-31-32-242 ~]$ sudo systemctl stop mysql
[ec2-user@ip-172-31-32-242 ~]$ sudo systemctl start mysql
[ec2-user@ip-172-31-32-242 ~]$ mysql -u root -p
mysql> select * from t1;
ERROR 3185 (HY000): Can't find master key from keyring, please check in the server log if a keyring is loaded and initialized successfully.
mysql> select * from t2;
ERROR 3185 (HY000): Can't find master key from keyring, please check in the server log if a keyring is loaded and initialized successfully.
mysql> show create table t1G
ERROR 3185 (HY000): Can't find master key from keyring, please check in the server log if a keyring is loaded and initialized successfully.
mysql> show create table t2G
ERROR 3185 (HY000): Can't find master key from keyring, please check in the server log if a keyring is loaded and initialized successfully.
mysql>
[ec2-user@ip-172-31-32-242 ~]$ sudo cat /var/log/mysqld.log
...
2023-02-09T00:14:56.719139Z 1 [ERROR] [MY-012657] [InnoDB] Encryption can't find master key, please check the keyring is loaded.
2023-02-09T00:14:56.719170Z 1 [ERROR] [MY-012226] [InnoDB] Encryption information in datafile: ./test/t1.ibd can't be decrypted, please confirm that keyring is loaded.
2023-02-09T00:14:56.720098Z 1 [ERROR] [MY-012657] [InnoDB] Encryption can't find master key, please check the keyring is loaded.
2023-02-09T00:14:56.720117Z 1 [ERROR] [MY-012226] [InnoDB] Encryption information in datafile: ./test/t2.ibd can't be decrypted, please confirm that keyring is loaded.
[ec2-user@ip-172-31-32-242 ~]$ sudo ls -lhr /var/lib/mysql/keyring_file
-rw-r----- 1 mysql mysql 0 Feb  9 00:14 /var/lib/mysql/keyring_file
[ec2-user@ip-172-31-32-242 ~]$ sudo mv /var/lib/mysql/keyring_file_1  /var/lib/mysql/keyring_file
[ec2-user@ip-172-31-32-242 ~]$ sudo ls -lhr /var/lib/mysql/keyring_file
-rw-r----- 1 mysql mysql 187 Feb  9 00:07 /var/lib/mysql/keyring_file
mysql> select * from t1;
ERROR 3185 (HY000): Can't find master key from keyring, please check in the server log if a keyring is loaded and initialized successfully.
mysql> select * from t2;
ERROR 3185 (HY000): Can't find master key from keyring, please check in the server log if a keyring is loaded and initialized successfully.

  • We restarted MySQL again, with the backup file  /var/lib/mysql/keyring_file_1 overriding the file /var/lib/mysql/keyring_file. Now everything is back to normal.
[ec2-user@ip-172-31-32-242 ~]$ sudo systemctl stop mysql
[ec2-user@ip-172-31-32-242 ~]$ sudo systemctl start mysql
[ec2-user@ip-172-31-32-242 ~]$ mysql -u root -p
mysql> select * from t1;
+----+
| c1 |
+----+
|  1 |
+----+
mysql> select * from t2;
+----+
| c1 |
+----+
|  2 |
+----+

Second, we are not supposed to make any changes to the keyring file. Instead, we are supposed to rotate the key via the plugin, and we need to back up the key before we rotate it. However, the answer provided by ChatGPT implies that we can make some changes to the keyring file, which is terribly wrong.

Conclusion

ChatGPT may provide only partially correct, or even worse, entirely wrong, answers to your question, which still requires expertise to double-check the correctness. Compared to the other search engines, which provide possible links to the answer to your question, relying on expertise to see which one is useful, and/or how to use the information. You will have to have a DBA for your MySQL database no matter with ChatGPT and/or other search engines.

P.S.: When I was done with the blog, I asked the ChatGPT again, and here was what I got:

Ask: “Can we back up and then remove the keyring file, which is used for the Percona plugin of encryption at rest, after MySQL starts?”

ChatGPT: ”Yes, you can backup and then remove the keyring file used by the Percona encryption at rest plugin after MySQL has started. However, removing the keyring file will result in loss of encryption for data at rest, so it is important to make sure that you have backed up the encryption key and have a plan for re-encrypting the data before removing the keyring file to prevent data loss or corruption.

Ask:Can we back up and remove the keyring file used for encryption at rest plugin after MySQL starts?

ChatGPT: “Yes, it is possible to backup and remove the keyring file used for encryption at rest plugin after MySQL has started. However, removing the keyring file will result in loss of encryption for data at rest, so it is important to backup the encryption key and have a plan for re-encrypting the data before removing the keyring file to avoid data loss or corruption.

The next day, I got another answer again:

Ask:Can we back up and remove the keyring file, which is used for the Percona plugin of encryption at rest, after MySQL starts?”

ChatGPT:The keyring file is an important component of the Percona encryption at rest plugin, and it is used to securely store encryption keys. Backing up the keyring file before removing it would ensure that you have a copy of the encryption keys and would be able to restore them if necessary. However, removing the keyring file while MySQL is running is not recommended, as it could cause unexpected behavior or data corruption. It is important to stop MySQL and back up the keyring file before removing it to avoid any potential issues.”

So, ChatGPT may give different answers to the same question at different times. It makes ChatGPT more unreliable, especially if you want it to be your DBA. In fact, people ask questions to ChatGPT and vote ? or ? next to the answer to help the AI learn. In other words, we are looking at an alpha version, and we are helping it become better.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com