Oct
28
2020
--

Say Hello to Libcoredumper – A New Way to Generate Core Dumps, and Other Improvements

Libcoredumper

LibcoredumperIn a perfect world, we expect all software to run flawlessly and never have problems such as bugs and crashes. We also know that this perfect world doesn’t exist and we better be as prepared as possible to troubleshoot those types of situations. Historically, generating core dumps has been a task delegated to the kernel. If you are curious about how to enable it via Linux kernel, you can check out Getting MySQL Core file on Linux. There are a few drawbacks that pose either a limitation or a huge strain to get it working, such as:

  • System-wide configuration required. This is not something DBA always has access to.
  • Inability or very difficult to enable it for a specific binary only. Standards ways enable it for every software running on the box.
  • Nowadays, with cloud and containers, this task has become even more difficult because it sometimes requires containers to be running on privileged mode and host OS to be properly configured by the provider.

The above issues have driven exploration of alternative ways to do create a core dump to help troubleshooting bugs and crashes. More details can be found at PS-7114 .

The Libcoredumper

The libcoredumper is a fork of the archived project google-coredumper. Percona has forked it under Percona-Lab Coredumper, cherry-picked a few improvements from other forks of the same project, and enhanced it to compile and work on newer versions of Linux as well on newer versions of GCC and CLANG.

This project is a Tech Preview, as you may infer from the repository name (Percona Lab). We might not maintain compatibility with future kernel versions and/or patches. One should test the core dumper on their environment before putting this tool into production. We have tested on kernel versions up to 5.4.

This functionality is present on all versions of Percona Server for MySQL and Percona XtraDB Cluster starting from 5.7.31 and 8.0.21. If you compile PS/PXC from source, you can control if the library will be compiled by switching -DWITHCOREDUMPER to ON/OFF (default is ON).

How To Configure It

A new variable named coredumper has been introduced. One should include it under the [mysqld] section of my.cnf and it works independently of the older configuration core-file. This new variable can either be a boolean (no value specified) or with value. It follows a few rules:

  • No value – core dump will have saved under MySQL datadir and will be named core.
  • A path ending with  /  – core dump will be saved under the specified directory and will be named core.
  • A full path with filename  – core dump will be saved under the specified directory and will use the specified name.

Every core file will end with the timestamp of the crash instead of PID, for two main reasons:

  • Make it easier to correlate a core dump with a crash, as MySQL always print a Zulu/UTC timestamp on the logs when it crashes:
    10:02:09 UTC - mysqld got signal 11 ;
  • Operators / Containers will always be running MySQL (or whatever application it is running) as PID 1. If MySQL has crashed multiple times, we don’t want to core-dump to get overwritten by the last crash.

How To Know If I Am Using libcoredumper

When MySQL attempts to write a core file it stamps the log saying it will write a core file. When it does it delegating the action to Linux kernel, you always see a message like below:

. . .
Writing a core file

The above behavior remains the same, however, when MySQL is using libcoredumper to generate the core file, one should see that message informing that the library will be responsible for the action:

. . .
Writing a core file using lib coredumper

Other Improvements

Apart from libcoredumper, starting from the same 5.7 and 8.0 releases a stack trace will also:

  • Print binary BuildID – This information is very useful for support/development people in case the MySQL binary that crashed is a stripped binary. Stripped binaries are a technique to remove part of the binaries that are not essential for it to run, making the binary occupy less space in disk and in memory. When computers had a restriction on memory, this technique was widely used. Nowadays this doesn’t pose a limitation anymore on most of the hardware, however, it is becoming popular once again with containers where image size matters. Stripping the binary removed the binary symbols table, which is required to resolve a stack trace and lets you read the core dump. BuildID is how we can link things together again.
  • Print the server Version – This information is also useful to have at glance. Recent versions of MySQL/Percona Server for MySQL have a fix for many know issues. Having this information helps to establish the starting point investigation. MySQL only prints the server version when it starts, and by the moment a server crashes, its log may have grown significantly or even got rotated/truncated.

Here is one example of how a crash with stack trace will look like:

14:23:52 UTC - mysqld got signal 11 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.

Build ID: 55b4b87f230554777d28c6715557ee9538d80115
Server Version: 8.0.21-12-debug Source distribution

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x46000
/usr/local/mysql/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x55) [0x55943894c280]
/usr/local/mysql/bin/mysqld(handle_fatal_signal+0x2e0) [0x559437790768]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x13f40) [0x7f9d413bcf40]
/lib/x86_64-linux-gnu/libc.so.6(__poll+0x49) [0x7f9d40858729]
/usr/local/mysql/bin/mysqld(Mysqld_socket_listener::listen_for_connection_event()+0x64) [0x55943777db6a]
/usr/local/mysql/bin/mysqld(Connection_acceptor<Mysqld_socket_listener>::connection_event_loop()+0x30) [0x55943737266e]
/usr/local/mysql/bin/mysqld(mysqld_main(int, char**)+0x30c6) [0x559437365de1]
/usr/local/mysql/bin/mysqld(main+0x20) [0x559437114005]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7f9d4076db6b]
/usr/local/mysql/bin/mysqld(_start+0x2a) [0x559437113f2a]
Please help us make Percona Server better by reporting any
bugs at https://bugs.percona.com/

You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
Writing a core file using lib coredumper

TL;DR

Libcoredumper serves as an alternative for current –core-file functionality for generating memory dumps. In case of any crash of MySQL, a core dump is written and can be later processed /read via GDB to understand the circumstances of such a crash.
Users can enable it by adding the below variable to [mysqld] section of my.cnf:

[mysqld]
coredumper

Percona Server for MySQL versions starting from 5.7.31 and 8.0.31 include the library by default. Refer to below documentation pages for more details:

https://www.percona.com/doc/percona-server/5.7/diagnostics/libcoredumper.html

https://www.percona.com/doc/percona-server/5.7/diagnostics/stacktrace.html

Summary

If you faced any issue or limitation on enabling core dumps before feel free to test new versions of Percona Server for MySQL/Percona XtraDB Cluster and use libcoredumper. Also, any feedback is very welcome on how we can improve the troubleshooting of bugs/crashes even further.

Sep
22
2017
--

How to Deal with XA Transactions Recovery

XA Transactions

XA TransactionsFor most people (including me until recently) database XA transactions are a fuzzy concept. In over eight years with Percona, I have never had to deal with XA transactions. Then a few weeks ago I got two customers having issues with XA transactions. That deserves a post.

XA 101

What are XA transactions? XA transactions are useful when you need to coordinate a transaction between different systems. The simplest example could be simply two storage engines within MySQL. Basically, it follows this sequence:

  1. XA START
  2. Some SQL statements
  3. XA END
  4. XA PREPARE
  5. XA COMMIT or ROLLBACK

Once prepared, the XA transaction survives a MySQL crash. Upon restart, you’ll see something like this in the MySQL error log:

2017-08-23T14:53:54.189068Z 0 [Note] Starting crash recovery...
2017-08-23T14:53:54.189204Z 0 [Note] InnoDB: Starting recovery for XA transactions...
2017-08-23T14:53:54.189225Z 0 [Note] InnoDB: Transaction 45093 in prepared state after recovery
2017-08-23T14:53:54.189244Z 0 [Note] InnoDB: Transaction contains changes to 2 rows
2017-08-23T14:53:54.189257Z 0 [Note] InnoDB: 1 transactions in prepared state after recovery
2017-08-23T14:53:54.189267Z 0 [Note] Found 1 prepared transaction(s) in InnoDB
2017-08-23T14:53:54.189312Z 0 [Warning] Found 1 prepared XA transactions
2017-08-23T14:53:54.189329Z 0 [Note] Crash recovery finished.
2017-08-23T14:53:54.189472Z 0 [Note] InnoDB: Starting recovery for XA transactions...
2017-08-23T14:53:54.189489Z 0 [Note] InnoDB: Transaction 45093 in prepared state after recovery
2017-08-23T14:53:54.189501Z 0 [Note] InnoDB: Transaction contains changes to 2 rows
2017-08-23T14:53:54.189520Z 0 [Note] InnoDB: 1 transactions in prepared state after recovery
2017-08-23T14:53:54.189529Z 0 [Note] Found 1 prepared transaction(s) in InnoDB
2017-08-23T14:53:54.189539Z 0 [Warning] Found 1 prepared XA transactions

The command

xa recover

 shows you an output like:

mysql> xa recover;
+----------+--------------+--------------+-----------+
| formatID | gtrid_length | bqual_length | data      |
+----------+--------------+--------------+-----------+
|     1234 |            4 |            5 |  bqual |
+----------+--------------+--------------+-----------+
1 row in set (0.00 sec)

There are some binary data that can’t be shown in HTML. The XA Xid is made of three fields: gtrid (global trx id), bqual (branch qualifier) and formatId. Java applications use all three fields. For my example above, I used “X’01020304′,’bqual’,1234”. You can trust Java application servers to be creative with Xid values. With MySQL 5.7, you can output the data part in hex with

convert xid

 :

mysql> xa recover convert xid;
+----------+--------------+--------------+----------------------+
| formatID | gtrid_length | bqual_length | data                 |
+----------+--------------+--------------+----------------------+
|     1234 |            4 |            5 | 0x01020304627175616C |
+----------+--------------+--------------+----------------------+
1 row in set (0.01 sec)

The Problem

If you do nothing, the prepared transaction stays there forever and holds locks and a read view open. As a consequence, the history list grows without bound along with your ibdata1 file, where the undo entries are kept. If you have slaves, they all have the prepared transaction too (at least with 5.7). No fun.

As a consequence, if you are using XA transactions, you MUST check if there are prepared transactions pending after the server or mysqld restarted. If you find such transactions, you need to commit or roll them back, depending on what is involved.

But how do you commit these XA transactions? The problem here is the output of

xa recover

. As it is, the output is unusable if there is a bqual field or non-default formatID field:

mysql> xa commit 0x01020304627175616C;
ERROR 1397 (XAE04): XAER_NOTA: Unknown XID

The Fix

Looking back at the

xa recover convert xid

 output above, the gtrid_length and bqual_length are provided. With the use of these values, you can extract the parts of the data field which gives us:

  • gtrid = 0x01020304
  • bqual = 0x627175616C

And, of course, the formatID is 1234. Altogether, we have:

mysql> xa commit 0x01020304,0x627175616C,1234;
Query OK, 0 rows affected (0.15 sec)

Which finally works! On 5.6 the

convert xid

 option is not available. You have to be a bit more creative:

root@master57:/var/lib/mysql# mysql -r -e 'xa recoverG' | hexdump -C
00000000  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |****************|
00000010  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 20 31 2e 20 72  |*********** 1. r|
00000020  6f 77 20 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |ow *************|
00000030  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 0a 20  |**************. |
00000040  20 20 20 66 6f 72 6d 61  74 49 44 3a 20 31 32 33  |   formatID: 123|
00000050  34 0a 67 74 72 69 64 5f  6c 65 6e 67 74 68 3a 20  |4.gtrid_length: |
00000060  34 0a 62 71 75 61 6c 5f  6c 65 6e 67 74 68 3a 20  |4.bqual_length: |
00000070  35 0a 20 20 20 20 20 20  20 20 64 61 74 61 3a 20  |5.        data: |
00000080  01 02 03 04 62 71 75 61  6c 0a                    |....bqual.|
0000008a

But there is a limitation in 5.6: you can only XA commit/rollback transactions that belong to your session. That means after a crash you are out of luck. To get rid of these you need to promote a slave or perform a logical dump and restore. The best plan is to avoid the use of XA transactions with 5.6.

I submitted this bug to Percona Server for MySQL in order to get a usable output out of

xa recover convert xid

. If you think this is important, vote for it!

Dec
13
2016
--

MongoDB 3.4: Facet Aggregation Features and SERVER-27395 Mongod Crash

Mongod Crash

This blog discusses MongoDB 3.4 GA facet aggregation features and the SERVER-27395 mongod crash bug.

As you may have heard, in late November MongoDB 3.4 GA was released. One feature that stuck out for me, a Lucene enthusiast, was the addition of powerful grouping and faceted search features in MongoDB 3.4.

Faceted Search

For those unfamiliar with the term faceted search, this is a way of grouping data using one or many different grouping criteria over a large result. It’s a tough idea to define Mongod Crashspecifically, but the aim of a faceted search is generally to show the most relevant information possible to the user and allow them to further filter what is usually a very large result of a given search criteria.

The most common day-to-day example of a faceted search is performing a search for a product on an e-commerce website such as eBay, Amazon, etc. As e-commerce sites commonly have the challenge of supplying a massive range of items to users that often provide limited search criteria, it is rare to see an online store today that does not have many “filters” in the right-side of their website to further narrow down a given product search.

Here is an example of me searching the term “mongodb” on a popular auction site:

Mongod CrashWhile this may seem like a specific search to some, at large volume this search term might not immediately show something relevant to some users. What if the user only wants a “used” copy of a MongoDB book from a specific year? What if the user was looking for a MongoDB sticker and not a book at all? This is why you’ll often see filters alongside search results (which we can call “facets”) showing item groupings such as different store departments, different item conditions (such as used/new), publication years, price ranges, review ratings, etc.

In some traditional databases, to get this kind of result we might need to issue many different expensive “GROUP BY” queries that could be painful for a database to process. Each of these queries would independently scan data, even if all queries are summarizing the same “result set.” This is very inefficient. A faceted search offers powerful groupings using a single operation on result data.

When I made my search for “mongodb”, under a faceted search model the page of items (in this case MongoDB books) and all the different groupings of departments, condition, rrice, etc., are performed as a single grouping operation in one “pass” of the data. The result from a faceted search contain items matching the search criteria AND the grouping results of the matched items as a single response.

Traditionally faceted searches were mostly limited to Lucene-based search engines such as Apache Solr, Elasticsearch and various closed-source solutions. With the release of MongoDB 3.4, this has changed!

The new Aggregation Pipeline features named $bucket and $bucketAuto provide functionality for processing groupings of result data in a single aggregation stage, and $facet allows the processing of many aggregation pipelines on the same result for even more complex cases.

New Facetting Features

MongoDB 3.4 introduces these new Aggregation Pipeline operators, allowing some advanced grouping and faceted-search-like features:

  1. $facet – Processes multiple aggregation pipelines within a single stage on the same set of input documents. Each sub-pipeline has its own field in the output document where its results are stored as an array of documents.
  2. $bucket – Categorizes incoming documents into groups, called buckets, based on a specified expression and bucket boundaries.
  3. $bucketAuto – Similar to $bucket, however bucket boundaries are automatically determined in an attempt to evenly distribute the documents into the specified number of buckets.

As a very basic example, let’s consider this collection of store items:

> db.items.find()
{ "_id" : ObjectId("58502ade9a49537a011226fb"), "name" : "scotch", "price_usd" : 90, "department" : "food and drinks" }
{ "_id" : ObjectId("58502ade9a49537a011226fc"), "name" : "wallet", "price_usd" : 95, "department" : "clothing" }
{ "_id" : ObjectId("58502ade9a49537a011226fd"), "name" : "watch", "price_usd" : 900, "department" : "clothing" }
{ "_id" : ObjectId("58502ade9a49537a011226fe"), "name" : "flashlight", "price_usd" : 9, "department" : "hardware" }

From this example data, I’d like to gather a count of items in buckets by price (field ‘price_usd’):

  1. $0.99 to $9.99
  2. $9.99 to $99.99
  3. $99.99 to $999.99

For each price-bucket, I would also like a list of unique “department” names for the matches. Here is how I would do this with $bucket (and the result):

> db.items.aggregate([
...   { $bucket: {
...     groupBy: "$price_usd",
...     boundaries: [ 0.99, 9.99, 99.99, 999.99 ],
...     output: {
...       count: { $sum: 1 },
...       departments: { $addToSet: "$department" }
...     }
...   } }
... ])
{ "_id" : 0.99, "count" : 1, "departments" : [ "hardware" ] }
{ "_id" : 9.99, "count" : 2, "departments" : [ "clothing", "food and drinks" ] }
{ "_id" : 99.99, "count" : 1, "departments" : [ "clothing" ] }

If you wanted to do something more complex, you have the flexibility of either making the $bucket stage more complex or you can even chain multiple stages together with $facet!

Mongod Crash: SERVER-27395

As I mentioned in my explanation of faceted search, it is a very complex/advanced feature that – due to the implementation challenges – is bound to have some bugs and inefficiencies.

During the evaluation of these new features, I noticed a very serious issue: I was able to crash the entire MongoDB 3.4.0 database instance using the $bucketAuto feature in combination with an $addToSet accumulator in the output definition. This is very serious!

This the example output from my issue reproduction script, responsible for sending the $bucketAuto query to the mongo instance and then checking if it crashed:

$ bash -x ./run.sh
+ js='db.tweets.aggregate([
  { $bucketAuto: {
    groupBy: "$user.location",
    buckets: 1,
    output: {
      count: { $sum: 1 },
      location: { $addToSet: "$user.location" }
    }
  } }
])'
+ echo '### Running crashing $bucketAuto .aggregate() query'
### Running crashing $bucketAuto .aggregate() query
+ /opt/mongodb-linux-x86_64-3.4.0/bin/mongo --port=27017 '--eval=db.tweets.aggregate([
  { $bucketAuto: {
    groupBy: "$user.location",
    buckets: 1,
    output: {
      count: { $sum: 1 },
      location: { $addToSet: "$user.location" }
    }
  } }
])' test
MongoDB shell version v3.4.0
connecting to: mongodb://127.0.0.1:27017/test
MongoDB server version: 3.4.0
2016-12-13T12:59:10.066+0100 E QUERY    [main] Error: error doing query: failed: network error while attempting to run command 'aggregate' on host '127.0.0.1:27017'  :
DB.prototype.runCommand@src/mongo/shell/db.js:132:1
DB.prototype.runReadCommand@src/mongo/shell/db.js:109:16
DBCollection.prototype._dbReadCommand@src/mongo/shell/collection.js:183:12
DBCollection.prototype.aggregate/doAgg<@src/mongo/shell/collection.js:1298:30
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1301:15
@(shell eval):1:1
+ sleep 1
++ tail -1 mongod.log
+ '[' '-----  END BACKTRACE  -----' = '-----  END BACKTRACE  -----' ']'
+ echo '###  Crashed mongod 3.4.0!'
###  Crashed mongod 3.4.0!

As you can see above, a full server crash occurred in my test when using $bucketAuto with $addToSet accumulators. The “network error” is caused by the MongoDB shell losing connection to the now-crashed server.

The mongod log file reports the following lines before the crash (and backtrace):

2016-12-13T12:59:10.048+0100 F -        [conn2] Invalid operation at address: 0x7f1d43ba990a
2016-12-13T12:59:10.061+0100 F -        [conn2] Got signal: 8 (Floating point exception).
 0x7f1d443e0f91 0x7f1d443e0089 0x7f1d443e06f6 0x7f1d42153100 0x7f1d43ba990a 0x7f1d43ba91df 0x7f1d43bc8d2e 0x7f1d43bcae3a 0x7f1d43bce255 0x7f1d43ca4492 0x7f1d43a3b0a5 0x7f1d43a3b29c 0x7f1d43a3b893 0x7f1d43d3c31a 0x7f1d43d3cc3b 0x7f1d4398447b 0x7f1d439859a9 0x7f1d438feb2b 0x7f1d438ffd70 0x7f1d43f12afd 0x7f1d43b1c54d 0x7f1d4371082d 0x7f1d4371116d 0x7f1d4435ec22 0x7f1d4214bdc5 0x7f1d41e78ced

This has been reported as the ticket SERVER-27395, and exists in MongoDB 3.4.0. Please see the ticket for more details, updates and a full issue reproduction: https://jira.mongodb.org/browse/SERVER-27395. If this issue is important to you, please vote for this issue at the ticket URL.

This highlights the importance of testing new features with your exact application usage pattern, especially during a major version release such as MongoDB 3.4.0. With all the new exciting ways one can aggregate data in MongoDB 3.4.0, and the infinite ways to stitch those features together in a pipeline, there are bound to be some cases where the code needs improvement.

Nonetheless, I am very excited to see the addition of these powerful new features and I look forward to them maturing.

Links

  1. https://docs.mongodb.com/manual/reference/operator/aggregation/facet/
  2. https://docs.mongodb.com/manual/reference/operator/aggregation/bucket/
  3. https://docs.mongodb.com/manual/reference/operator/aggregation/bucketAuto/
  4. https://docs.mongodb.com/manual/release-notes/3.4/#aggregation
  5. https://docs.mongodb.com/v3.4/core/aggregation-pipeline/
  6. https://en.wikipedia.org/wiki/Faceted_search
  7. https://jira.mongodb.org/browse/SERVER-27395
Jul
10
2015
--

How to create a rock-solid MySQL database backup & recovery strategy

Percona MySQL and MongoDB WebinarsHave you ever wondered what could happen if your MySQL database goes down?

Although it’s evident such a crash will cause downtime – and surely some business impact in terms of revenue – can you do something to reduce this impact?

The simple answer is “yes” by doing regular backups (of course) but are you 100% sure that your current backup strategy will really come through when an outage occurs? And how much precious time will pass (and how much revenue will be lost) before you get your business back online?

I usually think of backups as the step after HA fails. Let’s say we’re in M<>M replication and something occurs that kills the db but the HA can’t save the day. Let’s pretend that the UPS fails and those servers are completely out. You can’t failover; you have to restore data. Backups are a key piece of “Business Continuity.” Also factor in the frequent need to restore data that’s been altered by mistake. No ‘WHERE’ clause or DROP TABLE in prod instead of DEV. These instances are where backups are invaluable.

Let’s take some time and discuss the possible backup strategies with MySQL…  how to make backups efficiently and also examine the different tools that are available. We’ll cover these topics and more during my July 15  webinar: “Creating a Best-in-Class Backup and Recovery System for Your MySQL Environment” starting at 10 a.m. Pacific time.

On a related note, did you know that most online backups are possible with mysqldump and you can save some space on backups by using simple Linux tools? I’ll also cover this so be sure to join me next Wednesday. Oh, and it’s a free webinar, too!

Stay tuned!

The post How to create a rock-solid MySQL database backup & recovery strategy appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com