A redundant array of independent disks (more commonly known as a RAID) is a system of using multiple hard drives to share or replicate data among the drives. Depending on the level chosen, the benefit of RAID is one or more of increased data integrity, fault tolerance, throughput, or capacity compared to a single drive. In its original implementations (in which it was an abbreviation for "Redundant Array of Inexpensive Disks"), its key advantage was the ability to combine multiple low-cost devices using older technology into an array that together offered greater capacity, reliability, and/or speed than was affordably available in single devices using the newest technology.
At the very simplest level, RAID is one of many ways to combine multiple hard drives into one single logical unit. Thus, instead of seeing several different hard drives, the operating system sees only one. RAID is typically used on server computers, and is usually implemented with identically sized disk drives. With decreases in hard drive prices and wider availability of RAID options built into motherboard chipsets, RAID is also being offered as an option in higher-end consumer computers, especially those dedicated to storage-intensive tasks, such as video and audio editing.
The most popular RAID levels are RAID 0, RAID 1, and RAID 5, with RAID 6 becoming more widespread.
Below are explanations of the RAID levels and of the advantages and disadvantages of implementing them.
JBOD (Just a Bunch of Disks) - This is simply a set of disks with no RAID structure applied to them.
It is a popular method for combining multiple physical disk drives into a single virtual one. As the name implies, disks are merely concatenated together, end to beginning, so they appear to be a single large disk.
In this sense, concatenation is akin to the reverse of partitioning. Whereas partitioning takes one physical drive and creates two or more logical drives, JBOD uses two or more physical drives to create one logical drive.
In that it consists of an Array of Inexpensive Disks with no redundancy, it can be thought of as a distant relation of RAID. JBOD is sometimes used to turn several odd-sized drives into one useful drive. For example, JBOD could combine 3 GB, 15 GB, 5.5 GB, and 12 GB drives into one 35.5 GB logical drive, arguably more useful than the individual drives separately.
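The concatenation described above amounts to a simple address translation from a logical address to a position on one member drive. A minimal sketch, purely illustrative, using the drive sizes from the example:

```python
# Sketch (not any vendor's implementation) of JBOD address translation:
# drives are concatenated end to beginning, so a logical address is
# walked through the drive sizes until it lands on one member.
DISK_SIZES_GB = [3, 15, 5.5, 12]  # the drives from the example above

def locate(logical_gb):
    """Return (disk_index, offset_gb) for a logical address."""
    offset = logical_gb
    for i, size in enumerate(DISK_SIZES_GB):
        if offset < size:
            return i, offset
        offset -= size
    raise ValueError("address beyond end of array")

total = sum(DISK_SIZES_GB)   # 35.5 GB, as in the text
print(total)                 # 35.5
print(locate(20.0))          # lands on the third drive: (2, 2.0)
```

Reversing this mapping is how a recovery tool would decide which physical drive holds a given piece of the logical volume.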
A RAID 0 (also known as a striped set) splits data evenly across two or more disks with no parity information for redundancy. It is important to note that RAID 0 was not one of the original RAID levels, and is not redundant. RAID 0 is normally used to increase performance, although it is also a useful way to create a small number of large virtual disks out of a large number of small physical ones. Although RAID 0 was not specified in the original RAID paper, an idealized implementation of RAID 0 would split I/O operations into equal-sized blocks and spread them evenly across two disks. RAID 0 implementations with more than two disks are also possible; however, the reliability of a given RAID 0 set is equal to the average reliability of each disk divided by the number of disks in the set. That is, reliability (as measured by mean time between failures (MTBF)) is inversely proportional to the number of members, so a set of two disks is half as reliable as a single disk. The reason is that the file system is distributed across all disks: when a drive fails, the file system cannot cope with such a large loss of data and coherency, since the data is "striped" across all drives. Data can be recovered using special tools, but it will be incomplete and most likely corrupt.
While the block size can technically be as small as a byte it is almost always a multiple of the hard disk sector size of 512 bytes. This lets each drive seek independently when randomly reading or writing data on the disk. If all the accessed sectors are entirely on one disk then the apparent seek time would be the same as a single disk. If the accessed sectors are spread evenly among the disks then the apparent seek time would be reduced by half for two disks, by two-thirds for three disks, etc. assuming identical disks. For normal data access patterns the apparent seek time of the array would be between these two extremes. The transfer speed of the array will be the transfer speed of all the disks added together.
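The round-robin block placement described above can be sketched as follows. This is an idealized illustration, not any particular controller's layout:

```python
# Sketch of idealized RAID 0 block placement: logical blocks are dealt
# round-robin across the member disks, like cards around a table.
def raid0_place(block, n_disks):
    """Map a logical block number to (disk, stripe) coordinates."""
    return block % n_disks, block // n_disks

# With two disks, consecutive blocks alternate between them:
layout = [raid0_place(b, 2) for b in range(4)]
print(layout)  # [(0, 0), (1, 0), (0, 1), (1, 1)]
```

Because consecutive blocks land on different disks, a large sequential read keeps every spindle busy at once, which is where the additive transfer rate comes from.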
RAID 1. - A RAID 1 creates an exact copy (or mirror) of all data on two or more disks.
This is useful for setups where redundancy is more important than using all the disks' maximum storage capacity. The array can only be as big as the smallest member disk, however. An ideal RAID 1 set contains two disks, which increases reliability by a factor of two over a single disk, but it is possible to have many more than two copies. Since each member can be addressed independently if another fails, reliability is a linear multiple of the number of members. To get the full redundancy benefits of RAID 1, independent disk controllers are recommended, one for each disk. Some refer to this practice as splitting or duplexing.
When reading, both disks can be accessed independently. As with RAID 0, the average seek time is reduced by half when reading randomly, but because each disk holds exactly the same data the requested sectors can always be split evenly between the disks, so the seek time remains low. The transfer rate is also doubled. For three disks the seek time would be a third and the transfer rate would be tripled; the only limit is how many disks can be connected to the controller and its maximum transfer speed. Most IDE RAID 1 cards use a broken implementation and read from only one disk, so their read performance is that of a single disk. Some older RAID 1 implementations would read both disks simultaneously and compare the data to catch errors; the error detection and correction on modern disks makes this no longer necessary. When writing, the array acts like a single disk, as all writes must be written to all disks.
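The read-splitting idea can be sketched as a simple scheduler. This is a deliberately naive round-robin illustration; real controllers use more sophisticated queue-depth-aware scheduling:

```python
# Sketch: splitting random reads across the members of a RAID 1 mirror.
# Every member holds a full copy, so any disk can service any read;
# round-robin dispatch halves the average queue per disk for two disks.
from itertools import cycle

def schedule_reads(requests, n_mirrors):
    """Assign each requested sector to a mirror member, round-robin."""
    disks = cycle(range(n_mirrors))
    return [(next(disks), sector) for sector in requests]

print(schedule_reads([10, 99, 4, 57], 2))
# [(0, 10), (1, 99), (0, 4), (1, 57)]
```

Note there is no equivalent trick for writes: every write must go to every member, which is why RAID 1 write performance is that of a single disk.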
RAID 1 has many administrative advantages. For instance, in some 24x365 environments, it is possible to "split the mirror": declare one disk as active, do a backup of the inactive disk, and then "rebuild" the mirror. This procedure is less critical where the filesystem offers a "snapshot" feature, in which some space is reserved for changes, presenting a static point-in-time view of the filesystem.
Also, one common practice is to create an extra mirror of a volume (also known as a Business Continuance Volume or BCV) which is meant to be split from the source RAID set and used independently. In some implementations, these extra mirrors can be split and then incrementally re-established, instead of requiring a complete RAID set rebuild.
RAID 2 - This stripes data at the bit (rather than block) level. Not currently used.
RAID 3. - A RAID 3 uses byte-level striping with a dedicated parity disk.
A RAID 3 uses byte-level striping with a dedicated parity disk. RAID 3 is very rare in practice. One of the side effects of RAID 3 is that it generally cannot service multiple requests simultaneously. This comes about because any single block of data will by definition be spread across all members of the set and will reside in the same location, so any I/O operation requires activity on every disk.
In our example, below, a request for block "A1" would require all three data disks to seek to the beginning and reply with their contents. A simultaneous request for block B1 would have to wait.
A RAID 4 uses block-level striping with a dedicated parity disk. RAID 4 looks similar to RAID 3 except that it stripes at the block, rather than the byte level. This allows each member of the set to act independently when only a single block is requested. If the disk controller allows it, a RAID 4 set can service multiple read requests simultaneously. Network Appliance uses RAID 4 on their Filer line of network storage servers.
RAID 5. - This uses block-level striping with parity data distributed across all disks.
A RAID 5 uses block-level striping with parity data distributed across all member disks. RAID 5 is one of the most popular RAID levels, and is frequently used in both hardware and software implementations. Virtually all storage arrays offer RAID 5.
In our example, below, a request for block "A1" would be serviced by disk 1. A simultaneous request for block B1 would have to wait, but a request for B2 could be serviced concurrently.
Every time a data "block" (sometimes called a "chunk") is written on a disk in an array, a parity block is generated within the same stripe. (A block or chunk is often composed of many consecutive sectors on a disk, sometimes as many as 256 sectors. A series of chunks [a chunk from each of the disks in an array] is collectively called a "stripe".) If another block, or some portion of a block is written on that same stripe, the parity block (or some portion of the parity block) is recalculated and rewritten. The disk used for the parity block is staggered from one stripe to the next, hence the term "distributed parity blocks". This means, of course, that the controller software becomes more complex.
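The staggering of the parity disk from stripe to stripe can be sketched as follows. The "left-asymmetric" rotation shown here is one common layout choice, not the only one:

```python
# Sketch of RAID 5's rotating ("distributed") parity: in each stripe,
# one member holds the parity block and the rest hold data, and the
# parity member rotates from stripe to stripe.
def parity_disk(stripe, n_disks):
    """Left-asymmetric rotation: parity walks backwards through the disks."""
    return (n_disks - 1 - stripe) % n_disks

for stripe in range(4):
    p = parity_disk(stripe, 4)
    data = [d for d in range(4) if d != p]
    print(f"stripe {stripe}: parity on disk {p}, data on disks {data}")
```

Distributing parity this way spreads the parity-update write load over all members, instead of concentrating it on one dedicated disk as RAID 4 does.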
Interestingly, the parity blocks are not read on data reads, since this would be unnecessary overhead and would diminish performance. The parity blocks are read, however, when a read of a data sector results in a cyclic redundancy check (CRC) error. In that case, the sectors in the same relative position within the remaining data blocks of the stripe and within the stripe's parity block are used to reconstruct the errant sector. The CRC error is thus hidden from the main computer. Likewise, should a disk fail in the array, the parity blocks from the surviving disks are combined mathematically with the data blocks from the surviving disks to reconstruct the data from the failed drive "on the fly".
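The reconstruction described above relies on XOR arithmetic: parity is the XOR of the data blocks in a stripe, so any single missing block can be recomputed from the survivors. A minimal sketch, operating on toy 4-byte blocks (real controllers do this per sector, in hardware or driver code):

```python
# Sketch of the XOR arithmetic behind RAID 5 recovery.
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

d1, d2, d3 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks([d1, d2, d3])

# Disk 2 fails: rebuild its block from the survivors plus parity.
rebuilt = xor_blocks([d1, d3, parity])
print(rebuilt == d2)  # True
```

The same computation serves both cases in the text: masking a single CRC error and regenerating an entire failed member during Interim Data Recovery Mode.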
This is sometimes called Interim Data Recovery Mode. The computer knows that a disk drive has failed, but this is only so that the operating system can notify the administrator that a drive needs replacement; applications running on the computer are unaware of the failure. Reading and writing to the drive array continues seamlessly, though with some performance degradation. In Interim Data Recovery Mode, RAID 5 may be slightly faster than RAID 4: because parity is distributed, the stripes whose parity block was on the failed disk need no reconstruction calculation at all, whereas with RAID 4, if a data disk fails, the reconstruction calculation must be performed for every read of that disk.
In RAID 5 arrays, which have only one parity block per stripe, the failure of a second drive results in total data loss.
The maximum number of drives is theoretically unlimited, but it is common practice to keep it to 14 or fewer for RAID 5 implementations that have only one parity block per stripe. The reason for this restriction is that there is a greater likelihood of two drives in an array failing in rapid succession when there is a greater number of drives. As the number of disks in a RAID 5 increases, the MTBF for the array as a whole can even become lower than that of a single disk. This happens when the likelihood of a second disk failing out of the remaining N-1 disks, within the time it takes to detect, replace, and rebuild the first failed disk, becomes larger than the likelihood of a single disk failing.
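This argument can be made concrete with the standard back-of-envelope approximation for a single-parity array: MTBF(array) ~ MTBF(disk)^2 / (N * (N-1) * MTTR), where MTTR is the time to detect, replace, and rebuild a failed disk. The figures below are illustrative assumptions, not vendor data:

```python
# Back-of-envelope sketch of the reasoning above; numbers are assumed
# for illustration (100,000 h per-disk MTBF, 100 h rebuild window).
def raid5_array_mtbf(disk_mtbf_h, n_disks, rebuild_h):
    """Approximate MTBF (hours) of an N-disk single-parity array."""
    return disk_mtbf_h ** 2 / (n_disks * (n_disks - 1) * rebuild_h)

disk_mtbf = 100_000.0   # hours, assumed
rebuild = 100.0         # hours to detect, replace, and rebuild, assumed
for n in (4, 14, 40):
    print(n, round(raid5_array_mtbf(disk_mtbf, n, rebuild)))
```

With these assumed numbers, a 4-disk set is still vastly more reliable than a lone disk, but by 40 disks the array MTBF has dropped below the single-disk figure, which is the crossover the text describes.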
One should be aware that many disks together increase heat, which lowers the real-world MTBF of each disk. Additionally, a group of disks bought at the same time may reach the end of their Bathtub curve together, noticeably lowering the effective MTBF of the disks during that time.
In implementations with greater than 14 drives, or in situations where extreme redundancy is needed, RAID 5 with dual parity (also known as RAID 6) is sometimes used, since it can survive the failure of two disks.
RAID 6. - This uses block-level striping with parity data distributed twice across all disks.
A RAID 6 uses block-level striping with parity data distributed twice across all member disks. It was not one of the original RAID levels.
In RAID 6, parity is generated and written to two distributed parity stripes, on two separate drives, using a different parity stripe in each two dimensional "direction".
RAID 6 is very inefficient when used with a small number of drives. But as drives become bigger, arrays gain more drives, and rebuild times skyrocket, the extra redundancy of RAID 6 over RAID 5 becomes more and more attractive, and makes more sense than keeping a "hot spare" disk. See also Double parity below for another, more redundant implementation.
Nested RAID Levels
Many storage controllers allow RAID levels to be nested. That is, one RAID can use another as its basic element, instead of using physical disks. You can think of the RAID arrays as layered on top of each other, with physical disks at the bottom.
RAID 0+1. - This is a RAID used for both replicating and sharing data among disks.
A RAID 0+1 (also called RAID 01, though it shouldn't be confused with RAID 1) is a RAID used for both replicating and sharing data among disks. The difference between RAID 0+1 and RAID 10 is the order in which the two levels are layered: RAID 0+1 is a mirror of stripes.
In the example, the maximum storage space is 360 GB, spread across two arrays. The advantage is that when a hard drive fails in one of the RAID 0 stripes, the missing data can be read from the other array. However, adding capacity requires adding two hard drives at a time, to keep storage balanced between the arrays.
It is not as robust as RAID 10: it cannot tolerate two simultaneous disk failures unless they are in the same stripe. That is to say, once a single disk fails, every disk in the other stripe becomes an individual single point of failure. Also, once the single failed disk is replaced, all the disks in the array must participate in rebuilding its data.
To add to the confusion, some controllers that run in RAID 0+1 mode combine the striping and mirroring into a single operation. The layouts of the blocks for RAID 0+1 and RAID 10 are identical except that the disks are in a different order. To such a controller this does not matter, so it gains all the benefits of RAID 10 while still being labelled as supporting only RAID 0+1 in its documentation.
RAID 10. - This is similar to a RAID 0+1 but reversed.
A RAID 10, sometimes called RAID 1+0, is similar to a RAID 0+1 except that the RAID levels are applied in the reverse order: RAID 10 is a stripe of mirrors.
One drive from each RAID 1 set could fail without damaging the data. However, if the failed drive is not replaced, the single working hard drive in the set then becomes a single point of failure for the entire array. If that single hard drive then fails, all data stored in the entire array is lost.
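Which two-disk failure combinations a small RAID 10 survives can be enumerated directly. A sketch for a four-disk stripe of two mirror pairs (the pairing shown is an assumption for illustration):

```python
# Sketch: RAID 10 survives any set of failures that leaves at least one
# working member in every mirror pair. Enumerate two-disk failures in a
# four-disk stripe of mirrors (assumed pairs: {0,1} and {2,3}).
from itertools import combinations

PAIRS = [(0, 1), (2, 3)]

def survives(failed):
    """True if every mirror pair still has at least one working member."""
    return all(any(d not in failed for d in pair) for pair in PAIRS)

for failed in combinations(range(4), 2):
    print(failed, "survives" if survives(failed) else "data loss")
```

Of the six possible two-disk failures, only the two that wipe out a whole mirror pair lose data, which is why RAID 10 is considered more robust than RAID 0+1.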
Extra 120 GB hard drives could be added to any one of the RAID 1 sets to provide extra redundancy. Unlike RAID 0+1, the "sub-arrays" do not all have to be upgraded at once.
RAID10 is often the primary choice for high-load databases, because of its faster write speeds since there is no parity to calculate.
RAID 50. - This uses block-level striping with distributed parity of RAID 5 and RAID 0.
A RAID 50 combines the block-level striping with distributed parity of RAID 5, with the straight block-level striping of RAID 0. This is a RAID 0 array striped across RAID 5 elements.
One drive from each of the RAID 5 sets could fail without damaging the data. However, if the failed drive is not replaced, the remaining working drives in that set become a single point of failure for the entire array. If one of those drives then fails, all data stored in the entire array is lost. The time spent in recovery (detecting and responding to a drive failure, and rebuilding onto the newly inserted drive) represents a period of vulnerability for the RAID set.
In the example below, datasets may be striped across both RAID sets. A dataset with 5 blocks would have 3 blocks written to the 1st RAID set, and the next 2 blocks written to RAID set 2.
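The top-level RAID 0 striping in this example can be sketched as round-robin placement across the underlying RAID 5 sets. An illustration only; the block numbering is assumed:

```python
# Sketch of the RAID 0 layer of a RAID 50: logical blocks are dealt
# round-robin across the underlying RAID 5 sets, so a 5-block dataset
# puts 3 blocks on one set and 2 on the other.
def raid50_top_level(n_blocks, n_sets=2):
    """Return which logical blocks land on which RAID 5 set."""
    placement = {s: [] for s in range(n_sets)}
    for block in range(n_blocks):
        placement[block % n_sets].append(block)
    return placement

print(raid50_top_level(5))  # {0: [0, 2, 4], 1: [1, 3]}
```

Each RAID 5 set then applies its own internal striping and parity to the blocks it receives, independently of the other sets.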
The configuration of the RAID sets affects the overall fault tolerance. A construction of three seven-drive RAID 5 sets has higher capacity and storage efficiency, but can tolerate at most three drive failures (one per set). A construction of seven three-drive RAID 5 sets can handle as many as seven drive failures but has lower capacity and storage efficiency.
RAID 50 improves upon the performance of RAID 5 particularly during writes, and provides better fault tolerance than a single RAID level does. This level is recommended for applications that require high fault tolerance, capacity and random positioning performance.
As the number of drives in a RAID set increases, and as drive capacities increase, fault-recovery time increases correspondingly, because rebuilding the RAID set takes longer.
Proprietary RAID levels
Although all implementations of RAID differ from the idealized specification to some extent, some companies have developed entirely proprietary RAID implementations that differ substantially from the rest of the crowd.
One common addition to the existing RAID levels is double parity, sometimes implemented and known as diagonal parity. As in RAID 6, there are two sets of parity check information created. Unlike RAID 6, however, the second set is not a mere "extra copy" of the first. Rather, most implementations of double parity calculate the extra parity against a different group of blocks. Whereas traditional RAID 5 and RAID 6 calculate parity against a single group of blocks (for example, all the A-lettered blocks A1, A2, A3, producing the parity block AP), double parity calculates parity against multiple groupings: one can calculate parity across all A-lettered blocks and also across all 1-numbered blocks.
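The idea of computing parity over two different block groupings can be sketched over a small grid of byte-sized "blocks". This is a simplified illustration; real diagonal-parity schemes such as NetApp's RAID-DP are more involved:

```python
# Sketch of double parity over two groupings: on a grid of blocks,
# compute one parity per row (the "A-lettered" groups) and one per
# column (the "1-numbered" groups).
def xor_bytes(values):
    """XOR a sequence of byte values together."""
    out = 0
    for v in values:
        out ^= v
    return out

grid = [[0x11, 0x22, 0x33],   # toy data: 2 "letter" rows x 3 "number" columns
        [0x44, 0x55, 0x66]]

row_parity = [xor_bytes(row) for row in grid]          # one per letter group
col_parity = [xor_bytes(col) for col in zip(*grid)]    # one per number group
print([hex(p) for p in row_parity])
print([hex(p) for p in col_parity])
```

Because each block belongs to two independent parity groups, a second failed block can often be recovered through the grouping that is still intact, which single-grouping RAID 5 cannot do.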
RAID 1.5 - This is essentially a correct implementation of RAID 1: when reading, data is read from both disks simultaneously, and most of the work is done in hardware rather than in the driver.
Quick compare chart
RAID level   Min disks required   Typical uses
RAID 1       at least 2           Small databases, database logs, critical information (sequential reads: good; transactional reads: very good)
RAID 5       at least 3           Databases and other read-intensive transactional uses
RAID 3       at least 3           Data-intensive environments (large records)
RAID 10      at least 4           Medium-sized transactional or data-intensive uses
N = the amount of GB disks you need. X = Number of RAID sets
Don't know which connections are on your Mac? This site has lots of information about the different types of Mac.
Are you having a problem with a product?
Check the connections and whether any drivers need to be installed.
If it is a hard drive, make sure it has been formatted correctly.
Check the manufacturer's website for support, as other people may have had this problem before.
Check the manufacturer's website for any firmware or software upgrades.
If you are sure there is a fault with the product, check whether the manufacturer has a direct replacement service. This is normally the quickest method.
Alternatively, get in touch with us via the "contact us" page, by email, phone, or in person.
We can advise if there is a solution, or if a replacement will be needed.
We will then give you an RMA number, so you can send the product back to us for testing and replacement.