People outside the IT industry occasionally ask me what RAID is all about. I finally took the time to write it up.
RAID is an acronym for Redundant Array of Independent Disks. In the original 1980s paper, the acronym stood for “Redundant Arrays of Inexpensive Disks.” The idea was to combine inexpensive commodity disks into an array that would be more powerful than the expensive mainframe disks of the day. But the authors apparently had to change "inexpensive" to "independent" because the controller logic to make it all work was expensive. I found copies of the original paper at MIT and CMU. Wikipedia also has a nice article.
Disks have multiple cylinders. Cylinders contain multiple tracks. Tracks contain sectors, also called blocks. This Wikipedia article has more on disk block addressing.
Let’s say a disk has three blocks. Real disks have billions of blocks, but small numbers are easier to visualize. Disk firmware makes blocks appear sequential, and so raw disks look like this:
Disk
1
2
3
Raw disks not attached to any array are called JBOD, for Just a Bunch Of Disks.
RAID mostly depends on a concept called striping, or RAID 0. Striping spreads blocks across all the disks in an array, putting multiple disks to work. A two-disk stripe set looks like this:
Disk 1   Disk 2
1        2
3        4
5        6
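The striping pattern above is simple round-robin arithmetic. Here is a minimal sketch, assuming one block per stripe unit and 1-based numbering to match the table; the function name `locate_block` is made up for illustration, and real controllers typically stripe in larger chunks.

```python
def locate_block(block: int, num_disks: int) -> tuple[int, int]:
    """Map a 1-based logical block to (disk, row), both 1-based."""
    if block < 1 or num_disks < 1:
        raise ValueError("block and num_disks must be positive")
    index = block - 1                # switch to 0-based for the arithmetic
    disk = index % num_disks + 1     # round-robin across the disks
    row = index // num_disks + 1     # how far down each disk we are
    return disk, row


# Reproduce the two-disk stripe set: blocks 1..6 spread over 2 disks.
for block in range(1, 7):
    disk, row = locate_block(block, 2)
    print(f"block {block} -> disk {disk}, row {row}")
```

With two disks, odd blocks land on Disk 1 and even blocks on Disk 2, which is the layout shown in the table.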
Assuming a random access pattern, striping across two disks should theoretically double the performance because two disks are working instead of one. A three-disk stripe set should theoretically triple performance. Striping is a performance win, but a reliability loss, because any disk failure takes down the whole stripe set, and each new disk raises the odds that some disk in the set fails. Striping also costs more than the combined cost of all the disks because it needs a hardware or software controller smart enough to deal with stripes.
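To put a rough number on the reliability cost, here is a back-of-the-envelope sketch. It assumes each disk fails independently with the same probability over some period, which real disks do not strictly obey, and the 5% figure is a made-up value for illustration only.

```python
def stripe_failure_probability(p_disk: float, num_disks: int) -> float:
    """Chance that at least one disk fails, assuming independent failures.

    A plain stripe set is lost as soon as any single member fails.
    """
    return 1.0 - (1.0 - p_disk) ** num_disks


# Hypothetical 5% chance that any one disk fails during the period.
for n in (1, 2, 3, 4):
    print(f"{n} disk(s): {stripe_failure_probability(0.05, n):.4f}")
```

Under these assumptions the risk grows with every disk added, though slightly less than linearly.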
Mitigate the reliability risk by adding a parity disk. Parity is a way to sum the bits, so if a disk fails, an array can rebuild itself based on the surviving disks. Striping with a dedicated parity disk is called RAID 3 (byte-level striping) or RAID 4 (block-level striping). It looks like this:
Disk 1   Disk 2   Disk 3
1        2        P
3        4        P
5        6        P
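The usual way to "sum the bits" is XOR. The sketch below is an illustration rather than any particular controller's implementation: it computes a parity block from two data blocks, then rebuilds a lost block from the survivors. The byte values and the helper name `xor_blocks` are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


disk1 = b"\x0f\xa0"                      # data block on Disk 1
disk2 = b"\x33\x55"                      # data block on Disk 2
parity = xor_blocks([disk1, disk2])      # what the parity disk stores

# Pretend Disk 2 has failed: XOR the survivors to recover its contents.
rebuilt = xor_blocks([disk1, parity])
print(rebuilt == disk2)
```

This works because XOR is its own inverse: XORing the parity with every surviving data block leaves exactly the missing block.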
The dedicated parity disk mitigates the reliability problem, but introduces an unacceptable performance bottleneck. It is rarely used because every write must also update the single parity disk, so writes perform worse than JBOD. It needs one more optimization, called RAID 5, or striping with distributed parity. This spreads the write workload across the array and looks like this:
Disk 1   Disk 2   Disk 3
1        2        P
3        P        4
P        5        6
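The rotation in the table can be generated with a few lines of code. This sketch assumes the parity block starts on the last disk and moves one disk to the left on each row, which matches the table above; real controllers use several different rotation schemes, and `raid5_layout` is a name invented for this example.

```python
def raid5_layout(num_disks: int, num_rows: int) -> list[list[str]]:
    """Return rows of cells, each either a data block number or "P"."""
    rows = []
    next_block = 1
    for row in range(num_rows):
        # Parity starts on the last disk and shifts left on each row.
        parity_disk = (num_disks - 1 - row) % num_disks
        cells = []
        for disk in range(num_disks):
            if disk == parity_disk:
                cells.append("P")
            else:
                cells.append(str(next_block))
                next_block += 1
        rows.append(cells)
    return rows


for cells in raid5_layout(3, 3):
    print("   ".join(cells))
```

Because the parity cell lands on a different disk in each row, parity writes are shared across all disks instead of queuing up on one.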
RAID 5 sets can continue operating when one disk in the set fails. When a disk fails, quickly replace it and initiate a rebuild. If another disk fails before the rebuild finishes, the whole array fails.
To continue operating after two disk failures, another RAID level called RAID 6 keeps two independent sets of parity, which costs two disks' worth of capacity. RAID 6 would have made my Christmas 2014 less stressful.
Finally, RAID 1 is mirroring, meaning a set of identical disks. RAID 0+1 is another common label and means striping with mirroring.
Which is Best?
Storage choices balance three attributes: performance, price, and reliability.
JBOD is the lowest cost. RAID 5 adds one disk over JBOD to tolerate one disk failure, but small writes are slow because each one must read the old data and old parity, calculate the new parity, and write both back. Writes with RAID 6 are even slower, because every write must update two parity sectors instead of one. Good RAID controllers mitigate this read-modify-write bottleneck by caching writes.
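The read-modify-write sequence for a single-block RAID 5 write can be sketched with XOR parity. This is an illustration under that assumption, not a controller implementation; the function name and byte values are hypothetical. The new parity is the old parity with the old data XORed out and the new data XORed in, so the other data disks never need to be read.

```python
def updated_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """New parity = old parity XOR old data XOR new data."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


# A stripe with two data blocks and their XOR parity.
d1 = b"\x01\x02"
d2 = b"\x10\x20"
parity = bytes(a ^ b for a, b in zip(d1, d2))

# Overwrite d1: two reads (old d1, old parity), then two writes.
new_d1 = b"\xff\x00"
new_parity = updated_parity(d1, new_d1, parity)

# Same answer as recomputing parity from every data block in the stripe.
print(new_parity == bytes(a ^ b for a, b in zip(new_d1, d2)))
```

So a single logical write turns into two reads and two writes, which is the penalty a write cache tries to hide.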
RAID 0+1 performs the best, but needs at least twice as many disks as JBOD to mirror everything. It can tolerate some multiple-disk failures, but losing the wrong combination of disks is catastrophic. Adding more mirrors, at additional cost, further improves reliability.
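To see which double failures are survivable, the sketch below enumerates every two-disk failure in a four-disk array. It assumes RAID 0+1 means two stripe sets mirroring each other, so the array lives as long as one stripe set is fully intact; arrays built as a stripe across mirrored pairs tolerate a different set of combinations. The disk numbering and function name are made up for the example.

```python
from itertools import combinations


def survives_raid01(failed: set[int], stripe_sets: list[set[int]]) -> bool:
    """True if at least one stripe set has no failed member."""
    return any(not (members & failed) for members in stripe_sets)


# Assumed layout: disks 1-2 form one stripe set, disks 3-4 mirror it.
stripe_sets = [{1, 2}, {3, 4}]
all_disks = sorted(set().union(*stripe_sets))

pairs = list(combinations(all_disks, 2))
survivable = [pair for pair in pairs if survives_raid01(set(pair), stripe_sets)]

print(f"{len(survivable)} of {len(pairs)} two-disk failures are survivable")
print(survivable)
```

Under this assumption, the array only survives a second failure if it hits the stripe set that is already broken.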
Make your best choice by optimizing the two most important attributes for your workload.