In my previous post I explained how it could be possible to recover, on some specific cases, a single table from a full backup in order to save time and make the recovery process more straightforward. Now the scenario is worse because we don’t have a backup or the backup restore process doesn’t work. How can I recover deleted rows?
We’re going to follow the same example as in my previous post so we need to delete the records of the employee 10008 from the table “salaries”. After the “accidental” deletion of rows you should stop MySQL, take a copy of the salaries.ibd and start it again. Later, we’ll extract those deleted rows from the ibd file and import them into the database. The time between the deletion of rows and the database stop is crucial. If pages are reused you can’t recover the data.
The tool that we’re going to use to achieve this objetive is Percona Data Recovery Tool for InnoDB. This is the tool that we use in our data recovery service and it’s open source.
I’m going to explain the full process in four different steps to make it clearer:
1- Extract all the InnoDB pages from the tablespace:
First we need to download Percona Data Recovery Tool and compile all the tools using the “make” command. In this example I’m going to install the tools in /root/recovery-tool folder and the data like tablespaces and recovered rows in /root/recovery-tool/data.
After the compile process we need to copy the salaries.ibd tablespace to the recovery-tool’s data directory. In order to extract all the pages we’ll use the page_parser tool. This tool will find and extract all the pages of the tablespace to an output directory. We only need to specify the row format (-5) and where is our tablespace located (-f)
The row format can be -4 (REDUNDANT) or -5 (COMPACT). From 5.0.3 the default format is COMPACT. More information about row format on the following link:
You can also get the table row format from the Information Schema:
mysql (information_schema) > SELECT ROW_FORMAT from TABLES WHERE TABLE_SCHEMA='employees' AND TABLE_NAME='salaries';
| ROW_FORMAT |
| Compact |
~/recovery-tool# ./page_parser -5 -f data/salaries.ibd
Opening file: data/salaries.ibd:
71.43% done. 2012-02-14 13:10:08 ETA(in 00:00 hours). Processing speed: 104857600 B/sec
All pages are stored in a single directory with some subdirectories inside, one for every single index in the table:
~/recovery-tool# ls pages-1329221407/FIL_PAGE_INDEX/
In this case, there are two indexes with the IDs 0-26 and 0-27. InnoDB has Clustered Primary Key, that is, the data is organized along with the primary key. Therefore if we want to extract the rows data we need to identify which of those two indexes is the Primary Key. This is our next step.
2- Identifying the Primary Key
There are different methods to find the correct index and here I’m going to explain three of them:
Percona Server has some extra tables in INFORMATION_SCHEMA that can help us to find the different indexes and types.
mysql (information_schema) > select i.INDEX_ID, i.NAME FROM INNODB_SYS_INDEXES as i INNER JOIN INNODB_SYS_TABLES as t USING(TABLE_ID) WHERE t.NAME='salaries';
| INDEX_ID | NAME |
| 26 | PRIMARY |
| 27 | emp_no |
B) InnoDB Table Monitor
The index information can also be taken directly from MySQL using the InnoDB Tablespace Monitor. This monitor will write all the information related with tables and indexes (with their IDs) to the error log.
mysql (employees) > CREATE TABLE innodb_table_monitor (id int) ENGINE=InnoDB;
TABLE: name employees/salaries, id 18, flags 1, columns 7, indexes 2, appr.rows 2844513
INDEX: name PRIMARY, id 26, fields 2/6, uniq 2, type 3
root page 3, appr.key vals 2844513, leaf pages 6078, size pages 6120
FIELDS: emp_no from_date DB_TRX_ID DB_ROLL_PTR salary to_date
INDEX: name emp_no, id 27, fields 1/2, uniq 2, type 0
root page 4, appr.key vals 306195, leaf pages 2189, size pages 2212
FIELDS: emp_no from_date
The second method has the same result, 0-26 is our primary key. After identifying the Primary Key don’t forget to remove the innodb_table_monitor.
C) Check the size on disk of every indes
This is very dependent of the table schema, but normally the primary key will be larger on disk because it stores also the row itself.
~/recovery-tool/pages-1329221407/FIL_PAGE_INDEX# du -hs 0-26/
~/recovery-tool/pages-1329221407/FIL_PAGE_INDEX# du -hs 0-27/
In our examples 0-26 seems to be the Primary Key.
3- Extract the rows
We know in which index is the data so the next step is clear, extract the rows from it. To accomplish this task we’ll use constraint_parser command. In order to use it the tool needs to know the table schema structure, that is, column types, names and attributes. This information needs to be available on the header file recovery-tools/include/table_defs.h so it will be necessary to recompile the tool.
To help us in the task of converting the schema definition to a C header file there is another tool with the name create_defs.pl. This one will connect to the database in order to examine the table and create the table_defs.h content.
~/recovery-tool# ./create_defs.pl --host 127.0.0.1 --port 5520 --user root --password msandbox --db employees --table salaries > include/table_defs.h
More information about the table_defs.h format in the following link:
With the table definition on table_defs.h it’s the moment to compile again all the tools with “make” command. After the compile process we can use constraints_parser to recover the rows in a human readable format.
~/recovery-tool# ./constraints_parser -5 -D -f pages-1329221407/FIL_PAGE_INDEX/0-26/ > data/salaries.recovery
With -D option we are asking to the constraints_parser to recover only deleted pages. -5 and -f are the same options that we’ve used before with page_parser.
In the salaries.recovery you can find lot of deleted rows, not only rows deleted accidentally. You should manually find the data you need to recover and save it in another file. This is the output of our example:
~/data-recovery# cat data/salaries.recovery
salaries 10008 "1998-03-11" 46671 "1999-03-11"
salaries 10008 "1999-03-11" 48584 "2000-03-10"
salaries 10008 "2000-03-10" 52668 "2000-07-31"
4- Import the rows
With the data in our hands the last step is to import it in our database:
mysql (employees) > LOAD DATA INFILE '/root/recovery-tool/data/salaries.recovery' REPLACE INTO TABLE `salaries` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'salaries\t' (emp_no, from_date, salary, to_date);
Query OK, 3 rows affected (0.01 sec)
Records: 3 Deleted: 0 Skipped: 0 Warnings: 0
mysql (employees) > select * from salaries where emp_no=10008;
| emp_no | salary | from_date | to_date |
| 10008 | 46671 | 1998-03-11 | 1999-03-11 |
| 10008 | 48584 | 1999-03-11 | 2000-03-10 |
| 10008 | 52668 | 2000-03-10 | 2000-07-31 |
Data recovered! 10008 has recovered his salary again
With InnoDB the deleted rows are not lost. You can recover them from the original tablespace and also from an ibd file if you have a binary backup. Just use constraint_parser without the -D option (deleted) and you will recover all the data that exists inside that tablespace