Sep
09
2016
--

Don’t Spin Your Data, Use SSDs!

ssds

ssdsThis blog post discussed the advantages of SSDs over HDDs for database environments.

For years now, I’ve been telling audiences for my MySQL Performance talk the following: if you are running an I/O-intensive database on spinning disks you’re doing it wrong. But there are still a surprising number of laggards who aren’t embracing SSD storage (whether it’s for cost or reliability reasons).

Let’s look at cost first. As I write this now (September 2016), high-performance server-grade spinning hard drives run for about $240 for 600GB (or $0.40 per GB).  Of course, you can get an 8TB archive drive at about same price (about $0.03 per GB), but it isn’t likely you’d use something like that for your operational database. At the same time, you can get a Samsung 850 EVO drive for approximately $300 (or $0.30 per GB), which is cheaper than the server-grade spinning drive!  

While it’s not the best drive money can buy, it is certainly an order of magnitude faster than any spinning disk drive!

(I’m focusing on the cost per GB rather than the cost of the number of IOPS per drive as SSDs have overtaken HDDs years ago when it comes to IOPS/$.)

If we take a look at Amazon EBS pricing, we will find that Amazon has moved to SSD volumes by default as “General Purpose” storage (gp2). Prices for this volume type run about 2x higher per GB than high-performance HDD-based volumes (st1) and provisioned IOPs volumes. The best volumes for databases will likely run you 4x higher than HDD.

This appears to be a significant cost difference, but keep in mind you can get much more IOPS at much better latency from these volumes. They also handle IO spikes better, which is very important for real workloads.

Whether we’re looking at a cloud or private environment, it is wrong just to look at the cost of the storage alone – you must look at the whole server cost. When using an SSD, you might not need to spend also buy a RAID card with battery-backed-up (BBU) cache, as many SSDs have similar functions built in.

(For some entry-level SSDs, there might be an advantage to purchasing a RAID with BBU, but it doesn’t affect performance nearly as much as for HDDs. This works out well, however, as entry level SSDs aren’t going to cost that much to begin with and won’t make this setup particularly costly, relative to a higher-end SSD.)  

Some vendors can charge insane prices for SSDs, but this is where you should negotiate and your alternative vendor choice powers.

Some folks are concerned they can’t get as much storage per server with SSDs because they are smaller. This was the case a few years back, but not any more. You can find a 2TB 2.5” SSD drive easily, which is larger than the available 2.5” spinning drives. You can go as high as 13TB in the 2.5” form factor

There is a bit of challenge if you’re looking at the NVMe (PCI-E) cards, as you typically can’t have as many of those per server as you could using spinning disks, but the situation is changing here as well with the 6.4TB SX300 from Sandisk/FusionIO or the PM1725 from Samsung. Directly attached storage provides extremely high performance and 10TB-class sizes.  

To get multiple storage units together, you can use hardware RAID, software RAID, LVM striping or some file systems (such as ZFS) can take care of it for you.    

Where do we stand with SSD reliability? In my experience, modern SSDs (even inexpensive ones) are pretty reliable, particularly for online data storage. The shelf life of unpowered SSDs is likely to be less than HDDs, but we do not really keep servers off for long periods of time when running database workloads. Most SSDs also do something like RAID internally (it’s called RAIN) in addition to error correction codes that protect your data from a full single flash chip.

In truth, focusing on storage-level redundancy is overrated for databases. We want to protect most critical applications from complete database server failure, which means using some form of replication, storing several copies of data. In this case, you don’t need bulletproof storage on a single server – just a replication setup where you won’t lose the data and any server loss is easy to handle. For MySQL, solutions like Percona XtraDB Cluster come handy. You can use external tools such as Orchestrator or MHA to make MySQL replication work.  

When it comes to comparing SSD vs. HDD performance, whatever you do with SSDs they will likely still perform better than HDDs. Your RAID5 and RAID6 arrays made from SSDs will beat your RAID10 and RAID0 made from HDDs (unless your RAID card is doing something nasty).

Another concern with SSD reliability is write endurance. SSDs indeed have a specified amount of writes they can handle (after which they are likely to fail). If you’re thinking about replacing HDDs with SSDs, examine how long SSDs would endure under a comparable write load.  

If we’re looking at a high HDD write workload, a single device is likely to handle 200 write IOPS of 16KB (when running InnoDB). Let’s double that. That comes to 6.4MB/sec, which gives us  527GB/day (doing this 24/7). Even with the inexpensive Samsung 850 Pro we get 300TB of official write endurance – enough for 1.5 years. And in reality, drives tend to last well beyond their official specs.    

If you don’t like living on the edge, more expensive server-grade storage options have much better endurance. For example, 6.4TB SX300 offers almost 100x more endurance at 22 Petabytes written.

In my experience, people often overestimate how many writes their application performs on a sustained basis. The best approach is to do the math, but also monitor the drive status with a SMART utility or vendor tool. The tools can alert you in advance when drive wears out.

Whatever your workload is, you will likely find an SSD solution that offers you enough endurance while significantly exceeding the performance of an HDD-based solution.

Finally, there is a third and very important component of SSD reliability for operational database workloads: not losing your data during a power failure. Many “consumer-grade” SSDs come with drive write cache enabled by default, but without proper power loss protection. This means you can lose some writes during a power failure, causing data loss or database corruption.

Disabling write cache is one option, though it can severely reduce write performance and does not guarantee data won’t be lost. Using enterprise-grade SSDs from a reputable vendor is another option, and testing SSDs yourself might be a good idea if you’re on a budget.  

Conclusion

When it comes to operational databases, whether your workload is on-premises or in the cloud,  Don’t spin your data – use SSD. There are choices and options for almost any budget and every workload.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com