Feb
21
2012
--

DBD::mysql 4.014 breaks pt-table-checksum 2.0

DBD::mysql 4.014 breaks pt-table-checksum 2.0.  The cause is unknown, but the effect is a lot of errors like:

DBD::mysql::st execute failed: called with 2 bind variables when 6 are needed [for Statement "..." with ParamValues: ...] at ./pt-table-checksum line 7216.

The fix is simple: upgrade (or even downgrade) DBD::mysql to any version except 4.014. To see which version of DBD::mysql a system has, execute:

perl -MDBD::mysql -e 'print $DBD::mysql::VERSION, "\n";'

This bug may affect other Percona Toolkit tools, but currently pt-table-checksum 2.0 is the only victim. This bug does not affect pt-table-checksum 1.0, and it cannot be worked around in pt-table-checksum 2.0 because the bug in in DBD::mysql.

This bug affects pt-table-checksum 2.0 but not 1.0 because the newer version uses prepared statements with parameter values, whereas the older version does not. It seems, although I have not verified this, that DBD::mysql 4.014 has some sort of caching mechanism which causes it to use the wrong prepared statement.

Feb
20
2012
--

How to recover deleted rows from an InnoDB Tablespace

In my previous post I explained how it could be possible to recover, on some specific cases, a single table from a full backup in order to save time and make the recovery process more straightforward. Now the scenario is worse because we don’t have a backup or the backup restore process doesn’t work. How can I recover deleted rows?

We’re going to follow the same example as in my previous post so we need to delete the records of the employee 10008 from the table “salaries”. After the “accidental” deletion of rows you should stop MySQL, take a copy of the salaries.ibd and start it again. Later, we’ll extract those deleted rows from the ibd file and import them into the database. The time between the deletion of rows and the database stop is crucial. If pages are reused you can’t recover the data.

The tool that we’re going to use to achieve this objetive is Percona Data Recovery Tool for InnoDB. This is the tool that we use in our data recovery service and it’s open source.

I’m going to explain the full process in four different steps to make it clearer:

1- Extract all the InnoDB pages from the tablespace:

First we need to download Percona Data Recovery Tool and compile all the tools using the “make” command. In this example I’m going to install the tools in /root/recovery-tool folder and the data like tablespaces and recovered rows in /root/recovery-tool/data.

After the compile process we need to copy the salaries.ibd tablespace to the recovery-tool’s data directory. In order to extract all the pages we’ll use the page_parser tool. This tool will find and extract all the pages of the tablespace to an output directory. We only need to specify the row format (-5) and where is our tablespace located (-f)

The row format can be -4 (REDUNDANT) or -5 (COMPACT). From 5.0.3 the default format is COMPACT. More information about row format on the following link:

http://dev.mysql.com/doc/refman/5.5/en/innodb-physical-record.html

You can also get the table row format from the Information Schema:

mysql (information_schema) > SELECT ROW_FORMAT from TABLES WHERE TABLE_SCHEMA='employees' AND TABLE_NAME='salaries';
+------------+
| ROW_FORMAT |
+------------+
| Compact |
+------------+

~/recovery-tool# ./page_parser -5 -f data/salaries.ibd
Opening file: data/salaries.ibd:
[...]
71.43% done. 2012-02-14 13:10:08 ETA(in 00:00 hours). Processing speed: 104857600 B/sec

All pages are stored in a single directory with some subdirectories inside, one for every single index in the table:

~/recovery-tool# ls pages-1329221407/FIL_PAGE_INDEX/
0-26 0-27

In this case, there are two indexes with the IDs 0-26 and 0-27. InnoDB has Clustered Primary Key, that is, the data is organized along with the primary key. Therefore if we want to extract the rows data we need to identify which of those two indexes is the Primary Key. This is our next step.

2- Identifying the Primary Key

There are different methods to find the correct index and here I’m going to explain three of them:

A) INNODB_SYS_INDEXES

Percona Server has some extra tables in INFORMATION_SCHEMA that can help us to find the different indexes and types.

mysql (information_schema) > select i.INDEX_ID, i.NAME FROM INNODB_SYS_INDEXES as i INNER JOIN INNODB_SYS_TABLES as t USING(TABLE_ID) WHERE t.NAME='salaries';
+----------+---------+
| INDEX_ID | NAME |
+----------+---------+
| 26 | PRIMARY |
| 27 | emp_no |
+----------+---------+

B) InnoDB Table Monitor

The index information can also be taken directly from MySQL using the InnoDB Tablespace Monitor. This monitor will write all the information related with tables and indexes (with their IDs) to the error log.

mysql (employees) > CREATE TABLE innodb_table_monitor (id int) ENGINE=InnoDB;

TABLE: name employees/salaries, id 18, flags 1, columns 7, indexes 2, appr.rows 2844513
[...]
INDEX: name PRIMARY, id 26, fields 2/6, uniq 2, type 3
root page 3, appr.key vals 2844513, leaf pages 6078, size pages 6120
FIELDS: emp_no from_date DB_TRX_ID DB_ROLL_PTR salary to_date
INDEX: name emp_no, id 27, fields 1/2, uniq 2, type 0
root page 4, appr.key vals 306195, leaf pages 2189, size pages 2212
FIELDS: emp_no from_date
[...]

The second method has the same result, 0-26 is our primary key. After identifying the Primary Key don’t forget to remove the innodb_table_monitor.

C) Check the size on disk of every indes

This is very dependent of the table schema, but normally the primary key will be larger on disk because it stores also the row itself.

~/recovery-tool/pages-1329221407/FIL_PAGE_INDEX# du -hs 0-26/
96M 0-26/
~/recovery-tool/pages-1329221407/FIL_PAGE_INDEX# du -hs 0-27/
35M 0-27/

In our examples 0-26 seems to be the Primary Key.

3- Extract the rows

We know in which index is the data so the next step is clear, extract the rows from it. To accomplish this task we’ll use constraint_parser command. In order to use it the tool needs to know the table schema structure, that is, column types, names and attributes. This information needs to be available on the header file recovery-tools/include/table_defs.h so it will be necessary to recompile the tool.

To help us in the task of converting the schema definition to a C header file there is another tool with the name create_defs.pl. This one will connect to the database in order to examine the table and create the table_defs.h content.

~/recovery-tool# ./create_defs.pl --host 127.0.0.1 --port 5520 --user root --password msandbox --db employees --table salaries > include/table_defs.h

More information about the table_defs.h format in the following link:

http://www.percona.com/docs/wiki/innodb-data-recovery-tool:mysql-data-recovery:generating_a_table_definition

With the table definition on table_defs.h it’s the moment to compile again all the tools with “make” command. After the compile process we can use constraints_parser to recover the rows in a human readable format.

~/recovery-tool# ./constraints_parser -5 -D -f pages-1329221407/FIL_PAGE_INDEX/0-26/ > data/salaries.recovery

With -D option we are asking to the constraints_parser to recover only deleted pages. -5 and -f are the same options that we’ve used before with page_parser.

In the salaries.recovery you can find lot of deleted rows, not only rows deleted accidentally. You should manually find the data you need to recover and save it in another file. This is the output of our example:

~/data-recovery# cat data/salaries.recovery
salaries 10008 "1998-03-11" 46671 "1999-03-11"
salaries 10008 "1999-03-11" 48584 "2000-03-10"
salaries 10008 "2000-03-10" 52668 "2000-07-31"

4- Import the rows

With the data in our hands the last step is to import it in our database:

mysql (employees) > LOAD DATA INFILE '/root/recovery-tool/data/salaries.recovery' REPLACE INTO TABLE `salaries` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'salaries\t' (emp_no, from_date, salary, to_date);
Query OK, 3 rows affected (0.01 sec)
Records: 3 Deleted: 0 Skipped: 0 Warnings: 0

mysql (employees) > select * from salaries where emp_no=10008;
+--------+--------+------------+------------+
| emp_no | salary | from_date | to_date |
+--------+--------+------------+------------+
| 10008 | 46671 | 1998-03-11 | 1999-03-11 |
| 10008 | 48584 | 1999-03-11 | 2000-03-10 |
| 10008 | 52668 | 2000-03-10 | 2000-07-31 |
+--------+--------+------------+------------+

Data recovered! 10008 has recovered his salary again :)

Conclusion

With InnoDB the deleted rows are not lost. You can recover them from the original tablespace and also from an ibd file if you have a binary backup. Just use constraint_parser without the -D option (deleted) and you will recover all the data that exists inside that tablespace

Feb
19
2012
--

Why don’t our new Nagios plugins use caching?

In response to the release of our new MySQL monitoring plugins on Friday, one commenter asked why the new Nagios plugins don’t use caching. It’s worth answering in a post rather than a comment, because there is an important principle that needs to be understood to monitor servers correctly. But first, some history.

When I wrote a set of high-quality Cacti templates for MySQL a few years ago (which are now replaced by the new project), making the Cacti templates use caching was important for two reasons:

  1. Performance. Cacti runs some of its polling processes serially, so if each graph has to reach out to the MySQL server and retrieve a bunch of data, the polling can take too long. I’ve seen cases where a Cacti server that’s graphing too many MySQL servers doesn’t finish the current iteration before the next one starts. Making sure the data isn’t retrieved from MySQL repeatedly cuts a lot of latency out of the process. In fact, I went further than that; I didn’t cache the data retrieved from MySQL, I cached the resulting metrics. This avoids having to do InnoDB parsing repeatedly, too. With this approach, Cacti is capable of monitoring many more servers.
  2. Consistency. Graphs are supposed to be generated at a single instant in time, but as I just said, they actually aren’t. Graphs that show slightly different data, or greatly different data in some cases I observed, are frustrating at best and wrong or misleading at worst. When you’re using these graphs for forensic or diagnostic purposes, this is very bad. Caching the results from the first call and reusing it for all subsequent polls avoids this problem. All of the graphs are generated from one and only one set of data retrieved from MySQL.

It’s worth noting that Cacti doesn’t do this itself, which I consider to be a Cacti design shortcoming. I did not like writing my own cache management code. Handling cached data correctly is one of the two hard things in programming (choosing proper variable names is the other one, of course).

Now, somewhere along the line “you should cache when you are monitoring a system for failures” became a tacit best practice. I don’t know why, but I can speculate that there might have been a misunderstanding — perhaps caching was understood to reduce load on MySQL, rather than reducing load on the Cacti server and speeding up the poller.

This is a very bad idea! I do NOT advocate caching for Nagios checks. Why? Exactly the inverse of the two reasons that graphing should be cached!

  1. Performance doesn’t matter. First, most Nagios checks are run once every five minutes, or once a minute at most. Running SHOW STATUS infrequently doesn’t add load to the server. Most real production installations I see actually have multiple people running innotop or what have you, once a second! Second, if your monitoring is to be at all useful, you should not monitor too many things inside of MySQL. If you take the typical mass-market monitoring program’s silly “alert on 99 different kinds of cache hit ratio thresholds” approach, you’ll soon have sysadmins with /dev/null email filters, which will make your monitoring system utterly useless. However, even if you have multiple checks running once a minute, it’s still a trivial amount of load.
  2. Consistency is just another word for staleness. Great, your monitoring system thinks that the server is fine, based on a three-minute-old file — what good is that? You should be monitoring the server, as of NOW, rather than monitoring the contents of some file as of several minutes ago.

In conclusion, you should cache for historical metrics monitoring, but not for fault detection monitoring. That’s why the new Nagios plugins don’t have any caching functionality.

Feb
17
2012
--

Announcing MySQL Monitoring Plugins from Percona

We’ve released a new set of monitoring plugins for MySQL servers and related software. With these plugins, you can set up world-class graphing and monitoring for your MySQL servers, using your own on-premises Cacti and Nagios software. The Cacti plugins are derived from an existing set of templates we’ve been using for several years, but the Nagios check plugins are brand new. They are informed by the research we did into the causes and preventions of MySQL downtime.

Like all Percona software, the plugins are open-source and free, licensed under the GNU GPL. The source code and issue tracker are hosted at Launchpad. The 0.9.0 release is ready to download now, and the reference manual is also online. In the next release we will integrate this into our package management, so you can install through YUM and APT repositories.

The new monitoring plugins add first-rate support for MySQL to popular enterprise opensource monitoring systems, and that’s why we’ve added them to the list of software included in our support contracts. If you’re an existing customer of Percona’s MySQL Support services, your contract automatically covers these new plugins, too; you don’t need a new contract for that. Under the support contract, you’ll get help installing, configuring, troubleshooting, and administering your monitoring plugins, as well as being entitled to bug fixes. And as always, we’re also available to extend the plugins to support your environment or broaden the range of systems and software that they can monitor.

Please use the Launchpad bug tracker to report issues. Happy monitoring!

Feb
15
2012
--

Optimize Your SQL With Percona’s Online Query Advisor!

Wouldn’t it be nice if you could get expert advice on your SQL queries to find problems in them, the same way that programmers can use lint-check tools to warn about bugs in their C?

if ( execute = 1 ) {
   launch_missile();
}


Such a simple mistake, but it’s the kind of thing that James Bond movies are made of, isn’t it? Well, a lot of SQL queries have similar bugs, and thanks to Miguel Trias, now there’s a tool to help you find them. This is the second addition to our online suite of tools for MySQL users. You paste a query, it tells you what’s wrong with it. Simple as that.

Find the bug in this query:

select * from t1
   left join t2 using(id)
where t2.created_date < 2012-02-15;


Do you see it? Congratulations! I've analyzed that query with the tool, and shared the results with you. Click here to see if you were right.

That demonstrates another feature of the tool: you can share queries so they're public, and other people can see them. The tool uses the same account that you might have already created for our online MySQL Configuration Wizard, which helps you choose good settings for your server.

Give it a try, and tell your friends! Enjoy hunting for bugs and problems in your SQL queries!

Feb
15
2012
--

Troubleshooting MySQL Upgrade Performance Regressions

So lets say you upgraded from MySQL 5.1 to Percona Server 5.5 and instead of expected performance improvement you see your performance being worse. What should you do ?
First if you followed MySQL upgrade best practices such as testing your workload with pt-upgrade the chances of this happening are rather slim. But lets assume you have not followed these recommendations to the book or some things just slipped through.

First lets talk about what you should not do. You should not panic and go change all kind of configuration options in case you get lucky. Sometimes you do but more often you just waste your time. Instead try to understand what has changed and what exactly changed for worse.

I am speaking about MySQL upgrade – version change but in a lot of cases this will happen when settings are changed at the same time, operating system is upgraded or system is moved to “better” hardware. It is best if you can isolate the problem to one of these – make sure changing only MySQL version on the same hardware with same settings causes regression. You can always improve them later after you got well performing baseline. Do not change storage engine at the same time eather.

Make sure you’re looking at fully warmed up system. Rather often people panic and what thought to be slow down with version upgrade is rather system warming up. You need to compare apples to apples such as both systems after restart or even better both systems fully warmed up.

To spot what exactly is causing the problem it is good to use pt-query-digest to compare workloads on servers. Best if you can analyze full query logs but if this is not possible at least look at network traffic from tcpdump. This should help you to spot queries which are a lot slower on new MySQL version.

Once you have the queries you will most likely able to repeat the problem by running query on old and new MySQL Server version and observing the difference. Sometimes though it might not show performance difference ran alone on idle system – in this case you might be dealing with concurrency problem and you might need to create a more complicated test which runs this query concurrently many times or even workload consisting of several query types. Though it is rather rare for this to be needed so lets not go there.

It is ideal if you can repeat the problem on “test server” where you can analyze it without side load. Setting up 2 MySQL Servers side by side (for example with MySQL Sandbox) can especially be helpful.

Once you have spotted the query which performs differently between MySQL Server versions you should:

Check Query Plan Run EXPLAIN to see if plans for the query are the same. Changing Query Execution plans is the most common regression problem. If plan has changed check
if you can get the plan manually to the old one by using hints such as STRAIGHT_JOIN, FORCE INDEX, BIG_RESULT/SMALL_RESULT. Check whatever stats are the same (run SHOW INDEXES FROM

for tables involved and check cardinality) Different stats can often cause different plans. Run ANALYZE TABLE on both MySQL Versions to see if you can get statistics to be same or close. I should warn you though. In some cases the faster plan may be wrong plan from optimizer standpoint and updating stats may cause old MySQL Version to get slower plan instead of new one getting faster. Check stats sampling settings such as innodb-stats-method and myisam-stats-method. For Innodb storage engines it is possible for stats to be different for the same data because it uses random sampling to update the stats. You may consider disabling innodb_stats_on_metadata and increasing innodb-stats-sample-pages to get more accurate results. In recent Percona Server versions and MySQL 5.6 you can also store innodb stats in table so you have more control over them.

In the end the tricky thing about Query Plan is you might not be able to get new version to pick the old plan no matter how you try and you might be forced to change your application by changing how query is written and provide the hints.

Check Query Execution If query execution plan looks the same it is good to check whenever it seems to be executing same on low level. FLUSH STATUS; SHOW STATUS can show whenever internal operation is different, you can also check SHOW PROFILE to see where differences are. Running query in the loop and getting o_profile sample can also be a good idea. If you found query is executing differently even if the plan is the same it could be because changes to the options (might be even defaults) or some more subtle plan changes which are not seen in explain. It also could be some additional internal overhead being added. If you have query with same plan taking a lot longer to execute you might be up for deep investigation. You may also wish to create a repeatable test case which does not use any sensitive data and file the bug at this point.

Check Binaries Make sure you’re not using debug binary or differently optimized binaries.

When you’re about to file a bug or report the problem through MySQL support channel if often helps to know whenever it is the problem in stick MySQL changes between versions on Percona enhancements. Before reporting bug to Percona Server bug tracking system check whenever the bug is in Percona Server indeed by running the same test on MySQL server of the same version. If you’re Support customer you do not need to do it as Support Engineer can do it for you provided we have complete test case provided.

Feb
10
2012
--

Why not make a tool to improve existing configurations?

A couple of users of our very popular MySQL Configuration Wizard have submitted feedback such as “I’d love to input my existing server settings and get suggestions on how to improve it.”

This sounds like it would be great, doesn’t it? We’ve considered doing this, and even partially implemented it. But during our pre-release testing, we found a lot of potentially serious problems with the idea. It turns out to be very subtle — perhaps too subtle to be done with any computer program, no matter how smart, because there might be a lot of assumptions it’s forced to make, which could turn out to be dangerous. Every approach we considered carried a high risk of de-optimizing a server so it performs worse than before.

So in the end, although our tool is excellent for creating a starting my.cnf, my old blog post about the ultimate my.cnf tuner tool, which claims that an expert human is the only safe way to do this, might be an evergreen truth.

PS: my favorite user feedback so far is this one:

Great job, very good results! Now please take it offline before I am rendered obsolete. :(

Feb
06
2012
--

Announcing Percona Toolkit Release 2.0.3

We’ve released Percona Toolkit 2.0.3, with a couple of major improvements and many minor ones. You can download it, read the documentation, and get support for it.

What’s new? You can read the changelog for the details, but here are the highlights:

Brand new pt-diskstats, thanks to Brian Fraser. This tool is completely rewritten, and it’s finally the iostat replacement I always wanted. Not only does it have the functionality I want (interactive, slice and dice, smart defaults) but it has the detailed statistics on I/O, so you can see whether your reads are slow versus your writes, and whether things are waiting on the disk or waiting on the queue scheduler (cfq, hint hint). Finally, it has transparency, so you can read the documentation and understand, really, what it’s doing at the low level and what that means for your server. We really need specific, precise information on exactly how the I/O is behaving so we can make good decisions when there are problems or when doing things like capacity planning.

Brand new pt-stalk, courtesy of Daniel Nichter. This tool is also completely rewritten. Instead of a Bash script that you have to configure with environment variables and run in a screen session, this is now a first-class fault detection daemon. Everyone needs post-mortem forensic data when there is a problem, and pt-stalk aims to be a core part of your infrastructure that fills this gap. It now supports things a “real” Percona Toolkit tool ought to have, such as command-line options and a configuration file. In addition, we merged pt-collect into it, so as of Percona Toolkit 2.0.3, there isn’t a separate pt-collect tool anymore.

There’s a lot more to this release, but those are the major points. Download it and let us know how it works, please! If you find bugs, file them on Launchpad, and if you need support, you know where to get it.

Feb
02
2012
--

STOP: DELETE IGNORE on Tables with Foreign Keys Can Break Replication

DELETE IGNORE suppresses errors and downgrades them as warnings, if you are not aware how IGNORE behaves on tables with FOREIGN KEYs, you could be in for a surprise.

Let’s take a table with data as example, column c1 on table t2 references column c1 on table t1 – both columns have identical set of rows for simplicity.

CREATE TABLE `t1` (
`t1_c1` int(10) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`t1_c1`)
) ENGINE=InnoDB;

CREATE TABLE `t2` (
`t2_c1` int(10) unsigned NOT NULL,
PRIMARY KEY (`t2_c1`),
CONSTRAINT `t2_ibfk_1` FOREIGN KEY (`t2_c1`) REFERENCES `t1` (`t1_c1`) ON UPDATE CASCADE
) ENGINE=InnoDB;

[revin@forge rsandbox_5_5_17]$ for int in {1..2000}; do ./master/use test -e "insert into t1 values($int)"; done
[revin@forge rsandbox_5_5_17]$ ./master/use test -e "insert into t2 select * from t1"

master [localhost] {msandbox} (test) > SELECT COUNT(*) FROM t1;
+----------+
| COUNT(*) |
+----------+
| 2000 |
+----------+
1 row in set (0.00 sec)

master [localhost] {msandbox} (test) > SELECT COUNT(*) FROM t2;
+----------+
| COUNT(*) |
+----------+
| 2000 |
+----------+
1 row in set (0.00 sec)

An expected behavior for DELETE IGNORE is that if the statement fails to delete all rows, none should be deleted at all, after all this is InnoDB right? Wrong, take a look at bug 44987. As stated on the bug, only 5.0 exhibits the above mentioned behavior on 5.1 and 5.5, MySQL will stop deleting rows if it encounters constraint errors.

To demonstrate on 5.5.17:

I delete some rows from t2 so some rows on t1 does not have a constraint.

master [localhost] {msandbox} (test) > DELETE FROM t2 WHERE t2_c1 BETWEEN 201 AND 400;
Query OK, 200 rows affected (0.00 sec)

Now I try to DELETE IGNORE rows 301 to 500 on t1, note rows 301 to 400 does not have any existing constraints from t2 as we deleted them from above.

master [localhost] {msandbox} (test) > DELETE IGNORE FROM t1 WHERE t1_c1 BETWEEN 301 AND 500;
Query OK, 100 rows affected, 1 warning (0.00 sec)

master [localhost] {msandbox} (test) > SHOW WARNINGS \G
*************************** 1. row ***************************
  Level: Error
   Code: 1451
Message: Cannot delete or update a parent row: a foreign key constraint fails (`test`.`t2`, CONSTRAINT `t2_ibfk_1` FOREIGN KEY (`t2_c1`) REFERENCES `t1` (`t1_c1`) ON UPDATE CASCADE)
1 row in set (0.00 sec)

As expected a warning is generated because rows 201 to 300 on t1 still has referencing foreign keys from t2. However, 100 has been deleted! Let’s see.

master [localhost] {msandbox} (test) > SELECT COUNT(*) FROM t1;
+----------+
| COUNT(*) |
+----------+
|     1900 |
+----------+
1 row in set (0.00 sec)

Now let’s check the slave.

[revin@forge rsandbox_5_5_17]$ ./node1/use test
...
slave1 [localhost] {msandbox} (test) > SELECT COUNT(*) FROM t1;
+----------+
| COUNT(*) |
+----------+
|     2000 |
+----------+
1 row in set (0.00 sec)

Uh oh, now the slave is out of sync, because the statement failed to delete all intended rows it was not written to the binary log and consequently not reaching the slave.

So how can you workaround this? Simple, 1) do not use IGNORE, be critical about your data 2) use ROW* based replication. When using the latter, MySQL will log separate statements for each row that is deleted – so if the first 100 rows was successfully deleted then those 100 events are logged and eventually replicated.

[revin@forge rsandbox_5_5_170]$ ./master/use test
...
master [localhost] {msandbox} (test) > SELECT @@binlog_format;
+-----------------+
| @@binlog_format |
+-----------------+
| ROW             |
+-----------------+
1 row in set (0.00 sec)

master [localhost] {msandbox} (test) > DELETE FROM t2 WHERE t2_c1 BETWEEN 201 AND 400;
Query OK, 200 rows affected (0.00 sec)

master [localhost] {msandbox} (test) > DELETE IGNORE FROM t1 WHERE t1_c1 BETWEEN 301 AND 500;
Query OK, 100 rows affected, 1 warning (0.01 sec)

master [localhost] {msandbox} (test) > SELECT COUNT(*) FROM t1;
+----------+
| COUNT(*) |
+----------+
|     1900 |
+----------+
1 row in set (0.00 sec)

[revin@forge rsandbox_5_5_170]$ ./node1/use test
...
slave1 [localhost] {msandbox} (test) > SELECT COUNT(*) FROM t1;
+----------+
| COUNT(*) |
+----------+
|     1900 |
+----------+
1 row in set (0.00 sec)

* MIXED mode will not work since the query in this example will be considered STATEMENT (http://dev.mysql.com/doc/refman/5.1/en/binary-log-mixed.html) thus failure to execute the query successfully means it will not get logged.

Feb
01
2012
--

Verifying backup integrity with CHECK TABLES

An attendee to Espen’s recent webinar asked how to check tables for corruption. This kind of ties into my recent post on InnoDB’s handling of corrupted pages, because the best way to check for corruption is with CHECK TABLES, but if a page is corrupt, InnoDB will crash the server to prevent access to the corrupt data. As mentioned in that post, this can only be changed by changing InnoDB.

So how are you supposed to check for corruption that might be introduced by bad hardware, a bug, or so forth?

It’s a great question. The answer I would give for most cases is “check your backups for corruption instead of your live server.” You need to do this anyway — a backup that isn’t checked is a ticking time bomb. You need to verify (at least periodically) that your backups are recoverable.

The usual procedure goes like this: copy your backup somewhere, start a server instance on it, and run CHECK TABLES. You can use the mysqlcheck program to do this conveniently.

You could also use innochecksum, which doesn’t require starting the server. But it only verifies that each page’s checksum matches the page’s data, it doesn’t do all the other checks that are built into InnoDB (making sure that the LSNs are sane, for example).

How often? As often as possible. Some people refresh their dev/staging environment every day with last night’s backup, which is a great way to make failures obvious, as long as you verify that it truly does happen (e.g. what if it fails and you keep running with yesterday’s without knowing it?). If you can’t do it daily, then weekly is perfectly acceptable to most people. I’m not saying a specific interval should/ought to be your goal, I’m just remarking on what a lot of people seem to feel good about.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com