Aug
31
2011
--

Percona Server 5.5.15-21.0

Percona is glad to announce the release of Percona Server 5.5.15-21.0 on August 31, 2011 (Downloads are available here and from the Percona Software Repositories).

Based on MySQL 5.5.15, including all the bug fixes in it, Percona Server 5.5.15-21.0 is now the current stable release in the 5.5 series. All of Percona’s software is open source and free. All the details of the release can be found in the 5.5.15-21.0 milestone at Launchpad.

Improvements

Improved MEMORY Storage Engine

As of MySQL 5.5.15, a Fixed Row Format (FRF) is still used in the MEMORY storage engine. The fixed row format restricts the column types that can be used, because it allocates a fixed amount of memory per row in advance. In practice this turns a VARCHAR field into a CHAR field and makes it impossible to have a TEXT or BLOB field with that engine implementation.

To overcome this limitation, the Improved MEMORY Storage Engine is introduced in this release. It supports true VARCHAR, VARBINARY, TEXT and BLOB fields in MEMORY tables.

This implementation is based on the Dynamic Row Format (DRF) introduced by the mysql-heap-dynamic-rows patch.

DRF stores column values in a variable-length form, which helps to decrease the memory footprint of those columns and makes possible BLOB and TEXT fields as well as real VARCHAR and VARBINARY.

For performance reasons, a mixed solution is implemented: the fixed format is used at the beginning of the row, while the dynamic one is used for the rest of it. All values for columns used in indexes are stored in fixed format in the first block of the row, and the following columns are handled with DRF.

More information about the usage and implementation of the Improved MEMORY Storage Engine can be found in its documentation.


Aug
30
2011
--

Explaining Indexes with a Library Metaphor

My favorite metaphor for explaining indexes is comparing them to index cards in an old library. In an old library, you used to (or still do) have index cards at the front desk which have some brief description of the books in the library. They also used to be categorized alphabetically.

(image taken from http://www.flickr.com/photos/reedinglessons/2239767394/)

Let’s pretend that you are simulating an application that is trying to find a book with a certain title in the library.

Not using an index

If you are not using the index cards, you would have to go shelf by shelf and row by row, look at each book’s title and see if it’s the one you need. This is very time consuming and is similar to how a database looks at blocks on the hard disk when it’s not using an index.

Using an index

You are interested in a book by J.R. Hartley. You go to the index cards, look at section H (categorized by last name of author) and browse through the list until you find Hartley, J.R. You get back the description of the shelf and row in which these books can be found. You then walk over to the right shelf and the right row, and go over the books in that row until you find the ones you want. I’m sure you can imagine that this would be much faster.

Using an index with range scans

You would like to count how many books J.R. Hartley has ever written. You go to the index cards, look up section H, find Hartley and count all the index cards that list Hartley, J.R. as the author. You then find that J.R. Hartley has written 4 books. You have your result and you have saved yourself from walking over to the shelves.

Using a covering index

You would like to know how many pages a certain book by J.R. Hartley has. You go over to the index cards, open section H, and go through the cards until you find the right book. On these particular cards, the number of pages of the book is written down. You get your result and have saved yourself from walking over to the right shelf, finding the book and seeing how many pages it has.
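In SQL terms, the cases above might look roughly like the statements below. This is only a sketch: the books table, its columns (author, title, pages, shelf, shelf_row) and the index names are hypothetical, and the shell function simply prints the SQL so it can be reviewed or piped to a client.

```shell
# Print example SQL for each case in the library metaphor.
# Table, column and index names are made up for illustration.
library_sql() {
    cat <<'SQL'
-- Not using an index: no index on title, so every row is read.
SELECT shelf, shelf_row FROM books WHERE title = 'Example Title';

-- Using an index: look up the author, then fetch the matching rows.
CREATE INDEX idx_author ON books (author);
SELECT title, shelf, shelf_row FROM books WHERE author = 'Hartley, J.R.';

-- Counting from the index alone: no need to visit the rows.
SELECT COUNT(*) FROM books WHERE author = 'Hartley, J.R.';

-- Covering index: the page count is stored in the index itself.
CREATE INDEX idx_author_title_pages ON books (author, title, pages);
SELECT pages FROM books WHERE author = 'Hartley, J.R.' AND title = 'Example Title';
SQL
}

# Usage (assumes a test schema containing such a table):
#   library_sql | mysql test
```

Prefixing each SELECT with EXPLAIN is one way to see which of the cases the server actually chooses.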

 

Conclusion

If you imagine yourself doing these tasks, you can imagine how long they might take to complete. It is not entirely different from the tasks the database has to do.

I hope this metaphor has helped you understand indexes better (if you were not sure about them before) and I also hope that you will enjoy using it in the future.

 

Aug
30
2011
--

Percona Training In Sydney

For those that missed it – we added training in Sydney to our website.

We’ve booked a training venue near the Museum of Contemporary Art (200 George Street). Some minor logistical changes:

  • The class will now run from 9am to 5pm

We look forward to seeing you there!

Aug
27
2011
--

Recovering Linux software RAID, RAID5 Array

When you work with MySQL, you may need to deal with RAID recovery every so often. Sometimes the client lacks a proper backup, and sometimes recovering the RAID array gives a better result than the backup would: for example, you might get point-in-time recovery, while the backup setup only takes you to the point where the last binary log was backed up. I had wanted a chance to write up recovery instructions for a long time, and I finally got one when I ran into problems with the ReadyNAS Pro 6 which I was setting up and testing at home for backups. It was doing its initial sync when it spotted a problem with one other drive, and as a result the RAID volume failed. ReadyNAS runs Debian inside, and since you can get a root login via SSH, it can be recovered like any generic Linux server.

When you restart the system, a RAID5 volume which has more than one failed hard drive will be completely inaccessible. This happens even if the failure is just one bad sector on the disk. This “paranoid” behavior helps to preserve consistency, but it can scare the hell out of you, as it gives you no access to the data at all. Not all hardware RAID controllers behave this way.

First let’s see what the status of the array is:

ReadyNAS1:~# mdadm -Q --detail /dev/md2
/dev/md2:
Version : 1.2
Creation Time : Fri Aug 26 13:51:11 2011
Raid Level : raid5
Used Dev Size : -1
Raid Devices : 6
Total Devices : 5
Persistence : Superblock is persistent

Update Time : Fri Aug 26 22:11:26 2011
State : active, FAILED, Not Started
Active Devices : 4
Working Devices : 5
Failed Devices : 0
Spare Devices : 1

Layout : left-symmetric
Chunk Size : 64K

Name : 001F33EABA01:2
UUID : 01a26106:50b297a8:1d542f0a:5c9b74c6
Events : 83

Number Major Minor RaidDevice State
0 8 3 0 active sync /dev/sda3
1 8 19 1 active sync /dev/sdb3
2 8 35 2 active sync /dev/sdc3
3 0 0 3 removed
4 8 67 4 active sync /dev/sde3
5 8 83 5 spare rebuilding /dev/sdf3

In this case I know /dev/sdf3 was being rebuilt when /dev/sdd3 developed problems. If you do not know which disk was being resynced (or failed first), you can check by examining all the member devices separately:

ReadyNAS1:~# mdadm --examine /dev/sdf3
/dev/sdf3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x2
Array UUID : 01a26106:50b297a8:1d542f0a:5c9b74c6
Name : 001F33EABA01:2
Creation Time : Fri Aug 26 13:51:11 2011
Raid Level : raid5
Raid Devices : 6

Avail Dev Size : 5851089777 (2790.02 GiB 2995.76 GB)
Array Size : 29255447040 (13950.08 GiB 14978.79 GB)
Used Dev Size : 5851089408 (2790.02 GiB 2995.76 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
Recovery Offset : 5631463944 sectors
State : clean
Device UUID : 77ea1f91:5d4915c3:5cd17402:7f1ecafb

Update Time : Fri Aug 26 22:11:26 2011
Checksum : d9052ded - correct
Events : 83

Layout : left-symmetric
Chunk Size : 64K

Device Role : Active device 5
Array State : AAA.AA ('A' == active, '.' == missing)

Note the Update Time here. The disk that failed first will have the earliest update time. Here is a nice article explaining it in more detail.
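To compare all members at once, something like the following could be used. It is a minimal sketch that assumes the mdadm --examine output has the “Update Time” and “Events” fields in the form shown above; examine_summary is a hypothetical helper name.

```shell
# Summarize `mdadm --examine` output read from stdin.
# Prints "<Events> | <Update Time>" for one member device.
examine_summary() {
    awk -F' : ' '
        /Update Time/ { t = $2 }
        /Events/      { e = $2 }
        END           { print e " | " t }
    '
}

# Usage (needs root and real member devices, so it is left as a comment):
#   for d in /dev/sd[a-f]3; do
#       printf '%s: ' "$d"
#       mdadm --examine "$d" | examine_summary
#   done
```

A member with a lower event count or an earlier update time than the others is the likely candidate for the drive that dropped out first.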

I was looking for a way to tell mdadm to change the status of the rebuilding drive to “failed” and the status of the removed drive to “active sync”, but I could not find one. This could be by design, as using such commands the wrong way can ruin your RAID array. What you can do instead is re-create the RAID array in the same configuration as it was initially created, marking the drive you want skipped (the first failed drive, or the drive being resynced) as missing:

ReadyNAS1:/# mdadm --stop /dev/md2
mdadm: stopped /dev/md2

mdadm --verbose --create /dev/md2 --chunk=64 --level=5 --raid-devices=6 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 /dev/sde3 missing

mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: /dev/sda3 appears to be part of a raid array:
level=raid5 devices=6 ctime=Fri Aug 26 13:51:11 2011
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdb3 appears to be part of a raid array:
level=raid5 devices=6 ctime=Fri Aug 26 13:51:11 2011
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdc3 appears to be part of a raid array:
level=raid5 devices=6 ctime=Fri Aug 26 13:51:11 2011
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdd3 appears to be part of a raid array:
level=raid5 devices=6 ctime=Fri Aug 26 13:51:11 2011
mdadm: layout defaults to left-symmetric
mdadm: /dev/sde3 appears to be part of a raid array:
level=raid5 devices=6 ctime=Fri Aug 26 13:51:11 2011
mdadm: size set to 2925544704K
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata

Note that it is very important to mark one specific drive as missing in this case. If you do not, one of the drives will be picked to be resynced from the others, and if it is the wrong drive you will lose your data. Creating the array this way also lets you check whether you have guessed correctly: if you created the array the wrong way, you will probably be unable to mount it, find LVM volumes on it, and so on, and you can go back and correct your error.
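Because the device order and the position of missing matter so much, it may help to build the command as text and review it before running anything. The helper below is a hypothetical sketch that only prints the command and never runs mdadm; it assumes you already know the original slot order (the RaidDevice column in the mdadm --detail output above).

```shell
# Print (do not run) an mdadm --create command with one slot left as "missing".
# Usage: recreate_cmd ARRAY CHUNK_KB LEVEL MISSING_SLOT MEMBER...
# MEMBER... are the surviving devices in their original slot order;
# MISSING_SLOT is the 0-based RaidDevice number to leave empty.
recreate_cmd() {
    array=$1; chunk=$2; level=$3; missing_slot=$4
    shift 4
    members=""
    slot=0
    for dev in "$@"; do
        if [ "$slot" -eq "$missing_slot" ]; then
            members="$members missing"
            slot=$((slot + 1))
        fi
        members="$members $dev"
        slot=$((slot + 1))
    done
    # The missing slot may come after the last surviving member.
    if [ "$slot" -eq "$missing_slot" ]; then
        members="$members missing"
        slot=$((slot + 1))
    fi
    echo "mdadm --verbose --create $array --chunk=$chunk --level=$level --raid-devices=$slot$members"
}

# Example matching the array in this post (slot 5 left out):
#   recreate_cmd /dev/md2 64 5 5 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 /dev/sde3
```

Printing the command first gives you a chance to compare the chunk size, level and device order against the saved mdadm --examine output before anything is written to the disks.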

ReadyNAS1:/# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid5 sde3[4] sdd3[3] sdc3[2] sdb3[1] sda3[0]
14627723520 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/5] [UUUUU_]

So you can see the RAID array is active now, though it is missing one of the disks and is running on 5 instead of 6. Before we start the resync, let’s validate that it was assembled correctly:

ReadyNAS1:/# mount /dev/c/c
mount: special device /dev/c/c does not exist
ReadyNAS1:/# lvdisplay
--- Logical volume ---
LV Name /dev/c/c
VG Name c
LV UUID Rd66bT-qF3P-MgES-F9jK-zQ01-t0qo-qbo070
LV Write Access read/write
LV Status NOT available
LV Size 13.61 TB
Current LE 223041
Segments 1
Allocation inherit
Read ahead sectors 0

So we have the RAID volume back, but LVM shows the logical volume as NOT available, so it can’t be mounted. Happily, this is easily fixed:

ReadyNAS1:/# vgchange -a y
1 logical volume(s) in volume group "c" now active

Let’s check whether the file system is in good shape:

ReadyNAS1:/# fsck /dev/c/c
fsck 1.41.14 (22-Dec-2010)
e2fsck 1.41.14 (22-Dec-2010)
/dev/c/c: clean, 25/228395008 files, 14579695/3654303744 blocks

Good, so now we are pretty sure it is in good shape. I also mounted it and checked a couple of files to be sure.

Let’s try to add the last drive to the volume so it can attempt to resync. I should give you a warning, though. In many cases this is a valid thing to do even without replacing any hard drives: in my experience, at least 50% of failed hard drives in a RAID array are false positives, and either simply adding the drive back or re-seating the hot-swap drive solves the problem. If the drive has bad blocks, though, the resync is likely to read them, and then the array will fail again.

If this is what is happening, you have a couple of options. First, you can just copy the data from the RAID array, bypassing any files which have bad sectors. Often this will be a file or two. The second way is to get something like Drive Fitness Test (different hard drive vendors have different versions). Such tools often have functionality to scan the hard drive for bad blocks and remap them if there are spare sectors available. The tool might be able to read the data from the original sectors, or it might fail to do that, in which case the sectors will be zeroed out and your data potentially corrupted. This is why I prefer to know which files are affected by bad blocks in the first place.

ReadyNAS1:/# mdadm -a /dev/md2 /dev/sdf3
mdadm: added /dev/sdf3

ReadyNAS1:/# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid5 sdf3[6] sde3[4] sdd3[3] sdc3[2] sdb3[1] sda3[0]
14627723520 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/5] [UUUUU_]
[==>..................] recovery = 12.6% (371104968/2925544704) finish=488.9min speed=87075K/sec

As you can see, the drive is being rebuilt now, though we have yet to see whether it runs into problems with the volume again.

Aug
26
2011
--

Getting MySQL Core file on Linux

A core file can be quite helpful for troubleshooting MySQL crashes, yet it is not always easy to get, especially with recent Linux distributions which have security features that prevent core files from being dumped by setuid processes (and MySQL Server is most commonly run changing user from “root” to “mysql”). Before you embark on enabling core files you should consider two things: disk space and restart time. The core file will dump all MySQL Server memory content, including the buffer pool, which can take tens or even hundreds of GB of disk space. It can also take a very long time to write this amount of data to disk. If you are using “pid” with core files, which you probably should, as having several different samples often helps developers find what is wrong, you may be looking at many times the amount of memory MySQL consumes in disk space.

You have to make a couple of changes to enable core files. First you need to add the “core-file” option to my.cnf, which instructs MySQL to dump core on crash. This alone is unlikely to work, though. I found you need to make several other changes:

echo 2 > /proc/sys/fs/suid_dumpable
mkdir /tmp/corefiles
chmod 777 /tmp/corefiles
echo "/tmp/corefiles/core" > /proc/sys/kernel/core_pattern
echo 1 > /proc/sys/kernel/core_uses_pid

First we enable dumping cores of suid applications, then we create a separate directory for core files. This is a good idea anyway, as you can put it on a different partition so you do not risk running out of space, but the real reason is that I could not get core dumped to /var/lib/mysql (datadir) on my system (Ubuntu). You might be lucky and it might work on your system. I also enable multiple “versions” of core files here with different pid numbers, which I think can be quite helpful.
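Since these settings live under /proc/sys and do not survive a reboot on their own, it can be useful to check them before relying on a core dump. The function below is a minimal sketch, not part of any MySQL tooling; check_core_settings and its warnings are hypothetical, and it only reads the three files written by the commands above.

```shell
# Print the kernel core-dump settings and warn about common problems.
# The optional argument is the sysctl root (default /proc/sys); pointing it
# at a scratch directory lets you try the function without touching /proc.
check_core_settings() {
    sysroot=${1:-/proc/sys}
    rc=0
    dumpable=$(cat "$sysroot/fs/suid_dumpable")
    pattern=$(cat "$sysroot/kernel/core_pattern")
    uses_pid=$(cat "$sysroot/kernel/core_uses_pid")
    echo "suid_dumpable=$dumpable core_pattern=$pattern core_uses_pid=$uses_pid"

    if [ "$dumpable" = "0" ]; then
        echo "WARN: suid_dumpable is 0, setuid processes will not dump core" >&2
        rc=1
    fi
    case $pattern in
        /*)
            # Absolute path: the target directory has to exist.
            if [ ! -d "$(dirname "$pattern")" ]; then
                echo "WARN: directory for core_pattern does not exist" >&2
                rc=1
            fi
            ;;
        "|"*)
            # A leading pipe hands the core to a helper program instead.
            echo "WARN: core_pattern pipes cores to a helper program" >&2
            rc=1
            ;;
    esac
    return $rc
}

# Usage: check_core_settings && echo "core settings look sane"
```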

After you have configured core dumping, I suggest you test it, for example on a test box which has the same operating system. This is important. There have been many changes to core file handling on Linux, and what worked on one system might not be enough on another.

To check whether it works you can run kill -sigsegv `pidof mysqld`, which triggers the same code as if MySQL had crashed accessing the wrong memory area. You will even see a stack trace, probably something like this from the main thread:

stack_bottom = (nil) thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x39)[0x7c4bf9]
/usr/sbin/mysqld(handle_segfault+0x464)[0x518414]
/lib/libpthread.so.0(+0xf8f0)[0x7fbb86de98f0]
/lib/libc.so.6(__poll+0x53)[0x7fbb86066f93]
/usr/sbin/mysqld(_Z26handle_connections_socketsv+0x124)[0x51ba64]
/usr/sbin/mysqld(_Z11mysqld_mainiPPc+0xcfc)[0x521b0c]
/lib/libc.so.6(__libc_start_main+0xfd)[0x7fbb85fabc4d]
/usr/sbin/mysqld[0x516a61]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file
Segmentation fault (core dumped)

Note the end of this message: you should see Segmentation fault (core dumped) after Writing a core file. If the core file was not written you will just have “Segmentation fault” with no “core dumped” attached to it.

If you’re looking for more core file options and some more explanations, check out this Fromdual page. It has a lot of good information.

Now that I have explained how you can get core files from MySQL, I should say they are often impractical: waiting, sometimes over half an hour, for the core dump to complete and then having a huge file to work with is not very convenient. The alternative in many cases is to attach with “gdb -p `pidof mysqld`”, type “continue” and let MySQL run. If it crashes you will have the process ready to work with in GDB, which is even more helpful than a core file. The disadvantage, of course, is that you can’t restart the server while debugging it.
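The attach step can be wrapped in a small function. This is a sketch under assumptions: attach_gdb is a hypothetical name, it relies on the -s option of pidof to get a single pid, and it uses gdb’s -ex option to issue “continue” right after attaching.

```shell
# Attach gdb to a running process and let it continue until it stops.
# The optional argument is the process name (default: mysqld).
attach_gdb() {
    name=${1:-mysqld}
    # -s asks pidof for a single pid in case several processes match.
    pid=$(pidof -s "$name") || { echo "$name is not running" >&2; return 1; }
    # gdb stays in the foreground; when the process crashes you land
    # at the (gdb) prompt with the failing thread selected.
    gdb -p "$pid" -ex continue
}

# Usage (run as root or as the user that owns the process, ideally in screen):
#   attach_gdb mysqld
```

Keep in mind that the server is paused for a moment while gdb attaches and loads symbols.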

P.S. Also do not forget to install the “debuginfo” package if you expect to do any MySQL profiling or deal with crashes. It does not slow down MySQL, yet it is very helpful when working with gdb, oprofile etc.

Aug
26
2011
--

Return of the Query Cache, win a Percona Live ticket

It’s Friday again, and time for another TGIF give-away of a Percona Live London ticket! But first, what’s new with the MySQL query cache? You may know that it still has the same fundamental architecture that it’s always had, and that this can cause scalability problems and locking, but there have been some important changes recently. Let’s take a look at those.

The first important change is that both Percona and Oracle actually built some code improvements into the query cache and the interface between it and MySQL. It’s now possible to completely disable it, for example. This used to be possible only by eliminating it at compile time. If you didn’t do that, then there was still a query-cache single choke-point in the server. Now that’s gone. As of MySQL 5.5, the query cache mutex isn’t hit at all if query_cache_type is zero. We made some related changes in Percona Server 5.1 a while ago. I don’t recall the differences between Oracle’s changes and ours, but theirs was better than ours. When they released this fix it obsoleted ours, and we didn’t port our fix forward to Percona Server 5.5, and instead backported the 5.5 fix to our 5.1 branch and replaced our fix. There are some other query cache improvements in MySQL 5.5 as well, but on big hardware with a write-heavy workload that doesn’t benefit from the query cache, the only possible improvement is to disable the cache completely, and people don’t care beyond that. Here’s my favorite query cache tuning guide.

The second important change is really minor. It didn’t improve anything in the server’s performance, but it improved transparency to the user. This is a feature that we introduced, also later superseded by Oracle, that changes the thread’s status in SHOW PROCESSLIST to “Waiting on query cache mutex” when the query cache mutex is taken. This makes it really obvious when the query cache is a bottleneck. Oracle released a similar change pretty soon afterwards, but theirs changed the wording to “Waiting on query cache lock” instead. This is more intuitive for non-programmers anyway. Regardless, the end effect is the same thing: just as you used to be able to see that you had MyISAM problems really easily when you had a screen full of threads in Locked status, now you get a screen full of query cache locks. It makes it impossible to miss serious contention when it happens. For example, it alerted a user on our forum to the fact that the query cache was configured far too large. This might have taken a lot longer to discover otherwise.
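A quick way to put a number on this is to count the threads in that state. The sketch below reads processlist text from stdin; count_qcache_waiters is a hypothetical helper, and the pattern deliberately accepts both “on” and “for” as well as both “mutex” and “lock”, on the assumption that the exact wording can differ between server versions.

```shell
# Count processlist rows that are waiting on the query cache.
# Reads the processlist text from stdin and prints the number of matches.
count_qcache_waiters() {
    # grep exits non-zero when nothing matches, so mask that with "|| true".
    grep -c -E 'Waiting (on|for) query cache (lock|mutex)' || true
}

# Usage (assumes the mysql client can log in without prompting):
#   mysql -e 'SHOW FULL PROCESSLIST' | count_qcache_waiters
```

Sampling this every few seconds during peak load gives a rough picture of how often the query cache is the thing threads are queuing on.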

And now the moment you’ve been waiting for: round two of Percona’s TGIF contest is underway! Watch our @Percona Twitter stream and retweet the contest to win a free ticket to Percona Live London on October 24-25! We’ll pick a random retweet and give away a free ticket each week. If you don’t win this time, try next Friday or register and get the early-bird discount (but don’t wait too long: it expires September 18th). Our tutorial schedule is 100% complete at this time; don’t miss your opportunity to learn subjects from NDB Cluster to InnoDB Architecture and Performance Optimization. These detailed presentations will give you the hands-on experience you need to take your MySQL understanding to the next level.

Aug
25
2011
--

Percona Live MySQL Conference & Expo committee published

Just a quick update on April’s conference planning. We’ve published the speaker committee, and some featured speakers are on the conference’s home page. These include names like Mårten Mickos and Jeremy Zawodny.

I got a question from a community member that I thought was very important to answer publicly: what is the role and authority of the committee in choosing talks and guiding the conference program? The answer is simple: we did not create the committee to make us look good, we created the committee to really guide us. We expect to just be rubber-stamping Brian Aker’s decisions as Chair of the committee.

There are several outstanding invitations to other prominent organizations to have a seat in the committee. If we get positive RSVPs from those organizations, we’ll update the committee web page with the representative they choose.

I’m withholding this post from Planet MySQL because I don’t want to spam you, but I thought it was important to put on our blog, because there’s nowhere else prominent enough.

Aug
24
2011
--

Tutorial Insights for Percona Live, London

Percona Live MySQL Conference, London, Oct 24th and 25th, 2011. We have a great lineup of tutorials at Percona Live, London. I hand-picked a number of them after seeing outstanding performances by the speakers in other places. Let me tell you in a little more detail about the people we have invited and their talks.

Yoshinori Matsunobu’s talk on Linux Hardware and Optimizations for MySQL at the O’Reilly MySQL Conference and Expo was phenomenal. I wrote about it before, and I’m very happy Yoshinori is able to come to London and talk more about this topic. Yoshinori has probably put months of testing and research into his talk, so you will see a lot of real numbers, not just the general guidance you will see in many other places.

Alexey Rybak is going to talk about Scaling LAMP, focusing on other parts of the stack. I have attended Alexey’s talks at the HighLoad conference in Russia for a number of years. He has a wealth of practical experience designing and operating at large scale, most recently at Badoo, which is approximately the 100th site in the world by Alexa and one of the Top 20 Facebook applications. What I especially like about Alexey is his practical approach and his ability to explain why something is a good solution rather than just stating that this is what is being done in his organization. A lot of the approaches Alexey is going to talk about are not only practical but very efficient: he will teach you how to do a lot with less hardware, time and staff.

Florian Haas, with his talk MySQL High Availability Sprint: Launch the Pacemaker!, will talk about MySQL high availability using DRBD and Pacemaker (Heartbeat), which have been the high availability solution of choice for MySQL with the Innodb storage engine in cases where no transaction loss can be tolerated, such as when dealing with financial data. Florian is probably the most accomplished and experienced person you can find to talk about high availability with DRBD, and with his experience as both Senior Consultant and Instructor he knows how to run a good class. Another thing which makes this class attractive is that the technologies described are not only helpful for your MySQL Server but can be used for other applications, such as designing highly available storage systems.

Johan Andersson and Yves Trudeau will run the NDB Cluster Tutorial, a full-day, in-depth tutorial focused on NDB Cluster/MySQL Cluster technology, which is rapidly gaining popularity for building high performance and highly available systems with MySQL. Initially designed for telecom applications, MySQL Cluster has been slowly adding features that make it more and more useful for a broad set of applications. If you’re looking to get started with this new technology, this is surely the class to attend. Johan has a unique perspective on MySQL Cluster: he was a core developer, so he is familiar with the in-depth operation of the engine; he then moved to consulting, where he gained lots of practical experience deploying and operating MySQL Cluster; and finally, at SeveralNines, he is working on software that simplifies deployment and management of MySQL Cluster installations. Yves is perhaps one of the most experienced MySQL Cluster consultants in the United States, having worked with MySQL Cluster deployments in many high-scale, mission-critical applications. Yves specializes in MySQL Cluster performance optimization.

I think this is a great lineup and you will learn a lot! See you in London!

Aug
24
2011
--

Make your file system error resilient

One of the typical problems I see when setting up ext2/3/4 file systems is sticking to the defaults when it comes to behavior on errors. By default these filesystems are configured to Continue when an error (such as an IO error or metadata inconsistency) is discovered, which can keep spreading corruption. This manifests itself in the worst way when a device has “flapping” problems and returns errors every so often, as this causes random pieces of data and metadata to be lost. Not good for a system running MySQL Server. As far as I understand, this problem is limited to EXT2/3/4, while other file systems like XFS will not continue if consistency problems are discovered.

So how can you check what error behavior mode your file system has? Run dumpe2fs /dev/sda1 and you will get something like this:

dumpe2fs 1.41.14 (22-Dec-2010)
Filesystem volume name:
Last mounted on: /mnt/data
Filesystem UUID: f9f7a0c3-0350-46d5-9930-29c3ac1f4b32
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 226918400
Block count: 3630694400
Reserved block count: 0
Free blocks: 3616208434
Free inodes: 226918374
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 316
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 2048
Inode blocks per group: 128
RAID stride: 8
RAID stripe width: 80
Flex block group size: 16
Filesystem created: Mon Aug 22 23:03:21 2011
Last mount time: Mon Aug 22 23:18:25 2011
Last write time: Wed Aug 24 00:01:56 2011
Mount count: 2
Maximum mount count: -1
Last checked: Wed Aug 24 00:01:56 2011
Check interval: 0 ()
Lifetime writes: 54 GB
Reserved blocks uid: 0 (user unknown)
Reserved blocks gid: 0 (group unknown)
First inode: 11

This has a lot of interesting items and I’ll get to some of them in a moment. What we’re concerned with right now is Errors behavior: Continue. We can change the behavior to remount-ro, which will cause the filesystem to become read-only, or to panic, which will cause a kernel panic. I believe remount-ro is the best option for a database server, though panic might be a good option in a high availability setup, as it causes the server to crash instead of continuing in a half-working mode, throwing errors etc. (depending on which filesystem became read-only).
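To audit several file systems at once, the relevant line can be pulled out of the dumpe2fs output. The helpers below are a minimal sketch with hypothetical names (errors_behavior, ext_devices); they assume the “Errors behavior:” line looks as shown above and that mounted file systems are listed in the usual /proc/mounts format.

```shell
# Extract the "Errors behavior" value from dumpe2fs output on stdin.
errors_behavior() {
    awk -F': *' '/^Errors behavior/ { print $2 }'
}

# List devices of mounted ext2/3/4 file systems from mounts-format text on stdin.
ext_devices() {
    awk '$3 ~ /^ext[234]$/ { print $1 }'
}

# Usage (needs root; -h limits dumpe2fs to the superblock information):
#   for dev in $(ext_devices < /proc/mounts); do
#       printf '%s: ' "$dev"
#       dumpe2fs -h "$dev" 2>/dev/null | errors_behavior
#   done
```

Any device that still reports Continue is a candidate for the tune2fs change.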

To set the error behavior to a different value, run tune2fs -e remount-ro /dev/sda1, which should output something like:

tune2fs 1.41.14 (22-Dec-2010)
Setting error behavior to 2

It is worth noting that when an error is discovered during operation, an EXT3 or EXT4 filesystem will force a file system check on the next startup, which is handy.

Now I know some people are concerned about setting filesystem behavior to remount-ro or panic, because this means even a minor error in filesystem data structures, which may affect only one file, will take out the whole file system. I do not think these concerns are valid. First, with recent Linux versions and quality hardware the EXT3 filesystem is extremely stable (EXT4 is good too, though it is newer and I have a shorter history with it). So if you have an error popping up, you are very likely looking at hardware issues, which can cause all kinds of other nasty problems, especially for a database server. Second, the question comes down to what you care about most: consistency or availability? Are you ready to risk some data becoming inconsistent, and increased data loss, for the system to be “up” (potentially serving wrong data) a little bit longer? For most systems it is not a worthwhile tradeoff. What is more, if you’re running Innodb, chances are it will not buy you more “up time” either, as Innodb is very sensitive to corruption, and if any file system errors are reported back to MySQL/Innodb it will assert and restart.

Now let’s look at a couple of other options you might want to tune with tune2fs:

Reserved block count: 0 Number of blocks reserved for root. It often defaults to 5% of total blocks, which is probably not needed for the partition you store MySQL data on: chances are the MySQL server is the only one doing writes on this partition anyway, so the reserved space would just be wasted. Some people like to keep it at some number so they have a space reserve, and if their database runs out of space they can buy a little bit of time before they find a more permanent solution.

Maximum mount count: -1 and Check interval: 0 () These correspond to the automatic file system check on startup, which is normally done once per so many mounts or so many days. Large partitions with many files can take a lot of time to check, and this can cause an unwanted surprise when you’re restarting a server and expecting it to be back in 5 minutes, yet it takes 30+ because it has to check file systems. I believe it is much better to disable both of these auto-check functions for your data partition and just check it manually as needed, the same as you would every so often check MySQL tables for corruption.

To change those options you can run tune2fs -m0 -i0 -c -1 /dev/sda1 changing reserved block percent, check interval and mount count appropriately.
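To see what the reserved blocks are worth on a large volume, the arithmetic can be done from the dumpe2fs numbers. This is a small worked sketch using the Block count (3630694400) and Block size (4096) from the output above; the function names are hypothetical.

```shell
# Number of blocks reserved for root at a given percentage.
# Usage: reserved_blocks BLOCK_COUNT PERCENT
reserved_blocks() {
    echo $(( $1 * $2 / 100 ))
}

# Convert a block count to whole GiB (rounded down).
# Usage: blocks_to_gib BLOCKS BLOCK_SIZE_BYTES
blocks_to_gib() {
    echo $(( $1 * $2 / 1073741824 ))
}

# For the file system shown in this post:
#   reserved_blocks 3630694400 5      # prints 181534720
#   blocks_to_gib 181534720 4096      # prints 692
```

So on this volume the 5% default would set aside roughly 692 GiB that MySQL could not use, which is why -m0 (or a much smaller percentage) is worth considering for a dedicated data partition.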

Aug
19
2011
--

Win a free ticket to Percona Live London!

Win a free ticket to Percona Live London on October 24-25! Watch @percona on Twitter, and retweet our TGIF contest tweet to enter. We’ll pick a random retweet and give away a free ticket each week. If you don’t win this time, try again, or register and get the early-bird discount (but don’t wait too long: it expires September 18th).

Percona Live London is the can’t-miss event for MySQL in Europe this autumn.  We will have top MySQL speakers from around the world, including Peter Zaitsev and Yoshinori Matsunobu.  The full agenda will be posted soon, but you can see the preliminary agenda now. Reserve your ticket now for a discount.  Don’t wait to book your hotel and flight — now is the best time to save money on travel costs!

Our call for proposals is still open until September 18th. Please submit your technical MySQL-related talks! Tips on content and topics we’d like to see are here.  We need your help to make this conference the best it can be!
