RAID Levels

This is a summary of standard and nested RAID levels that I might implement either in a small home/office environment or with equipment from enterprise vendors. I’m avoiding the rarely seen levels like 2, 3, and 4 as well as vendor-specific “RAID” implementations like RAID-S or X-RAID. Fortunately, many major vendors like Dell, Supermicro, and Silicon Mechanics are rigorous about following standards. RAID is formally defined in the SNIA Common RAID Disk Data Format specification, latest release March 2009.

RAID Levels Comparison Table

The following table uses these definitions:

  • Min. disks: Minimum number of disks needed to implement the given RAID level.
  • Max. disks: Maximum number of disks the given RAID level can accommodate. In practical terms, the infinity symbol (∞) means as many disks as the controller or the local power supply can support. The number in parentheses is the recommended maximum.
  • Available storage: Amount of disk space left after RAID overhead is subtracted. If you’re using drives of different sizes, many hardware RAID controllers limit every member to the size of the smallest disk in the array; the calculations below assume this behavior (a short calculation sketch also follows the comparison table). There are methods that allow disks of differing sizes to be utilized fully: one is to combine disks into virtual volumes with Linux LVM before implementing software RAID, another is to implement RAID using the RAID-Z or mirroring capabilities of Sun/Oracle’s ZFS (or OpenZFS).
  • Tolerance: The maximum number of simultaneous disk failures the array can withstand before data is lost.
  • “d” is number of disks in the array.
  • “s” is the storage capacity of the smallest disk in the array.
RAID Level   Min. Disks   Max. Disks   Available Storage                  Tolerance
0            2            ∞ (2)        s * d                              0
1            2            ∞ (2)        s                                  d – 1
5            3            ∞ (4)        (s * d) – s                        1 disk
6            4            ∞ (8)        (s * d) – (s * 2)                  2 disks
10           4            ∞ (4)        (d * s) / 2                        d – 1 per RAID 1 set
50           6            ∞ (9)        (s * d) – s per RAID 5 set         1 disk per RAID 5 set
60           8            ∞ (16)       (s * d) – (s * 2) per RAID 6 set   2 disks per RAID 6 set

RAID 0 is a set of two or more striped disks with no parity or mirroring: there is no “redundancy” here, so the failure of one disk means loss of the array. Since data is “split” between the available disks, allowing them to work simultaneously, RAID 0 offers superior read and write performance. You can implement RAID 0 with as many disks as you like, each of which further improves read/write speed, but of course every disk you add increases the chance of disk failure and loss of the array. Storage is limited to s * d, so 3 x 1 TB disks here gives you the full 3 TB, while adding a 500 GB disk to an existing RAID 0 array consisting of 2 x 250 GB disks gives you 250 GB * 3 = 750 GB total. Since this “RAID” solution actually increases the likelihood of catastrophic failure, I wouldn’t bother unless you can get a deal on a couple of fast hard drives and want to set up a workstation or gaming system with a separate backup solution.

RAID 1 is two or more disks with block-level mirroring: all disks in the array are just clones of each other, so you can lose all of them except one before you lose your data. Few controller cards seem to support RAID 1 with more than two disks, but there’s actually nothing in the RAID 1 spec that limits the number of disks per array, so you can use as many as you want with implementations like Linux software RAID. Beyond two or three disks this gets wasteful, though, since you gain no additional space. Performance is excellent: writes roughly match a single disk, since there’s no overhead for striping or parity, and many implementations can spread reads across the mirrors. Rebuilds are fast since they’re a simple cloning operation. Available storage is typically defined by the smallest disk in the array: a RAID 1 array consisting of 3 x 250 GB disks and 1 x 400 GB disk is a 250 GB array. A cheap way to implement basic redundancy.

RAID 5 is block-level striping with distributed parity. Unlike RAID 1, disks in RAID 5 aren’t merely clones of each other, so you get better space utilization than mirroring, though parity consumes one disk’s worth of capacity and you need at least three disks. Fault tolerance is one disk per RAID 5 array. A RAID 5 array consisting of 3 x 1 TB disks gives you (1 TB * 3) – 1 TB = 2 TB total, while 1 x 250 GB disk and 2 x 400 GB disks give you (250 GB * 3) – 250 GB = 500 GB. Read performance is good while write performance suffers due to the parity overhead, although this is compensated for to varying degrees in modern controllers. Rebuild time is a reasonable compromise between RAID 1 and RAID 6. Keep in mind that since RAID 5 has to read from all disks during a rebuild operation, a surviving disk that is marginal can be strained to the point of failure, resulting in loss of the array. You may hear of the RAID 5 “write hole”, but this is unlikely to occur in real-world operation, and modern controllers can mitigate it by resynchronizing the array to fix parity errors.

RAID 6 is block-level striping with dual distributed parity, giving you double the parity overhead of RAID 5 with double the fault tolerance. A RAID 6 array consisting of 5 x 1 TB disks gives you (1 TB * 5) – (1 TB * 2) = 3 TB usable space, while a RAID 6 array consisting of 2 x 250 GB and 2 x 300 GB disks gives you (250 GB * 4) – (250 GB * 2) = 500 GB usable space. As with RAID 5, read performance is excellent while writes are slowed by parity calculations. Rebuilds can take a while, and the same rebuild stress that threatens RAID 5 applies here too, although RAID 6 can absorb one additional disk failure during a rebuild. Still preferable to RAID 5 if you can afford it.

RAID 10 is two or more RAID 1 mirrored pairs striped together (RAID 1+0). Read and write performance is excellent, and rebuild operations are fast since data can be copied directly from the survivor of a mirrored pair to the new disk. Space efficiency is relatively poor since half of all available space goes to mirroring. A set of 6 x 250 GB drives configured as RAID 10 offers (6 x 250 GB) – (3 x 250 GB) = 750 GB usable space, as compared with (6 x 250 GB) – (2 x 250 GB) = 1 TB usable space for RAID 6.

RAID 50 is two or more RAID 5 sets striped together (RAID 5+0). RAID 50 offers excellent read performance and, because of its RAID 0 striping, better write performance than standalone RAID 5. In terms of redundancy, RAID 50 is of dubious value over RAID 5, since you can still lose only one disk per RAID 5 set before loss of the array. The space overhead is higher than RAID 5’s but lower than RAID 10’s: a RAID 50 array of 6 x 1 TB disks in two sets gives you (3 TB – 1 TB) + (3 TB – 1 TB) = 4 TB usable space. Rebuild times with RAID 50 can be considerable.

RAID 60 is two or more RAID 6 sets striped together (RAID 6+0). Read performance is excellent, though write performance is impacted more significantly than in RAID 50 due to the double parity per RAID 6 set. Redundancy is substantially better than RAID 50, although rebuilds can seemingly take forever. As with the other nested RAID levels (10, 50), RAID 60 can get expensive fast, as it requires a minimum of 8 disks: 8 x 1 TB in RAID 60 (two sets) gives you (4 TB – 2 TB) + (4 TB – 2 TB) = 4 TB usable space.
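
To make the available-storage formulas concrete, here is a minimal Python sketch (my own illustration, not part of any RAID tooling) that applies the same arithmetic as the table, where d is the number of disks and s is the size of the smallest disk; the nested levels assume equally sized sets, as in the examples above.

    # Minimal sketch: usable capacity and fault tolerance per the table above.
    # Sizes are in GB; the array is assumed to be limited to the smallest disk.
    def usable_capacity(level, disk_sizes_gb, sets=1):
        """Return (usable_gb, tolerated_failures) for the given RAID level."""
        d = len(disk_sizes_gb)
        s = min(disk_sizes_gb)
        if level == 0:
            return s * d, 0
        if level == 1:
            return s, d - 1
        if level == 5:
            return (s * d) - s, 1
        if level == 6:
            return (s * d) - (s * 2), 2
        if level == 10:                      # striped mirrored pairs
            return (d * s) // 2, "1 per mirrored pair"
        if level == 50:                      # 'sets' RAID 5 groups striped together
            per_set = d // sets
            return ((s * per_set) - s) * sets, "1 per RAID 5 set"
        if level == 60:                      # 'sets' RAID 6 groups striped together
            per_set = d // sets
            return ((s * per_set) - (s * 2)) * sets, "2 per RAID 6 set"
        raise ValueError(f"unsupported RAID level: {level}")

    # Examples from the table:
    print(usable_capacity(5, [1000, 1000, 1000]))      # (2000, 1)           -> 2 TB
    print(usable_capacity(6, [250, 250, 300, 300]))    # (500, 2)            -> 500 GB
    print(usable_capacity(10, [250] * 6))              # (750, '1 per ...')  -> 750 GB
    print(usable_capacity(60, [1000] * 8, sets=2))     # (4000, '2 per ...') -> 4 TB

The results match the worked examples in the descriptions above.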

In most environments, particularly given the falling cost of storage, I wouldn’t consider anything besides RAID 5 or 6, and in practice I wouldn’t implement RAID 5 with more than 4 disks or RAID 6 with more than 8. If more space is needed, I’d suggest creating a series of RAID 5 or 6 arrays and striping them (RAID 50 or 60), but at that point you should probably be thinking about enterprise-grade solutions like a standalone SAN/NAS device.

RAID is Limited

Above all, RAID in itself isn’t a solution for data backups. RAID was merely intended to mitigate data loss caused by individual disk failures. Disks in a RAID array are never truly “independent”, and multiple simultaneous failures are not uncommon, particularly when you consider:

  1. The disks in a single array probably came off the same assembly line, maybe even in sequence, so they’re likely subject to the same manufacturing defects, real-world MTBF, and environmental stresses
  2. A RAID rebuild operation can strain an already marginal disk to the point of failure. (I’ve had this happen in small servers, albeit in a RAID 6 group, so I didn’t lose the array.)

A rough RAID disk failure calculator can help put numbers on these risks.
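
As a back-of-the-envelope illustration (not the calculator referenced above), the sketch below estimates the chance of losing an array while it rebuilds, assuming independent failures with a constant annual failure rate; as the points above make clear, real failures are correlated, so treat the result as optimistic. The function name and the example numbers are mine, purely for illustration.

    # Rough failure estimate: probability that, during a rebuild window, more of
    # the surviving disks fail than the array can still tolerate. Assumes
    # independent failures with a constant annual failure rate (AFR), which
    # understates correlated failures from shared batches and environments.
    from math import comb, exp

    def p_array_loss(surviving_disks, failures_still_tolerated, afr, window_hours):
        # Per-disk probability of failing within the window (exponential model).
        p = 1 - exp(-afr * window_hours / (365 * 24))
        # Binomial tail: probability that too many of the survivors fail.
        return sum(comb(surviving_disks, k) * p**k * (1 - p)**(surviving_disks - k)
                   for k in range(failures_still_tolerated + 1, surviving_disks + 1))

    # Illustrative numbers: one disk of an 8-disk RAID 6 has already failed,
    # leaving 7 disks that can absorb 1 more failure, 3% AFR, 24-hour rebuild.
    print(f"{p_array_loss(7, 1, 0.03, 24):.2e}")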

On Linux, a software RAID array can be examined with mdadm --detail /dev/mdX (or, on older raidtools-based systems, lsraid -a /dev/mdX), where mdX is your RAID device as defined in /etc/mdadm.conf or the legacy /etc/raidtab.
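
As a small companion sketch (my own, not part of mdadm), the same basic status can also be pulled programmatically from /proc/mdstat, which the Linux md driver exposes for every software RAID device; the loose parsing below is illustrative only, since the file’s exact layout varies by kernel and RAID personality.

    # Minimal sketch: list Linux software RAID (md) devices by reading
    # /proc/mdstat. Illustrative only; `mdadm --detail` reports far more.
    from pathlib import Path

    def md_devices(mdstat="/proc/mdstat"):
        """Yield (device, summary) pairs loosely parsed from /proc/mdstat."""
        lines = Path(mdstat).read_text().splitlines()
        for i, line in enumerate(lines):
            if line.startswith("md"):        # e.g. "md0 : active raid6 sdb1[1] sda1[0]"
                detail = lines[i + 1].strip() if i + 1 < len(lines) else ""
                yield line.split()[0], f"{line.strip()} | {detail}"

    if __name__ == "__main__":
        for device, summary in md_devices():
            print(device, "->", summary)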
