Redundant array of independent disks (RAID) is a technique for storing the same data on multiple hard disks to increase read performance and fault tolerance. In a properly configured RAID storage system, the loss of any single disk will not interfere with users' ability to access the data stored on the failed disk.
RAID has become a standard but transparent feature in both enterprise-class and consumer-class storage products. RAID storage systems aren't new or sexy, so they've fallen out of the limelight as a feature that vendors promote. "When you go to a car dealership, how often do they tell you about the tires on the car?" asked Greg Schulz, founder and senior analyst with StorageIO Group, a Stillwater, Minn.-based technology analyst and consulting firm. "RAID's taken for granted. It's a feature. It's a standard function."
There are a number of definitions and terms that are important to understand when it comes to RAID. Striping is the process of dividing data into blocks and spreading the data blocks across several disks. Striping is how RAID technology increases data retrieval performance by allowing multiple data readers and writers to work on a single data set at the same time.
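As a rough illustration of striping, the sketch below deals fixed-size blocks out to a set of disks in round-robin order and then reads them back. The function names, the in-memory lists standing in for disks, and the tiny block size are assumptions made for the example, not how any particular array implements it.

```python
def stripe(data: bytes, num_disks: int, block_size: int) -> list[list[bytes]]:
    """Split data into blocks and deal them out to the disks in turn."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Block 0 goes to disk 0, block 1 to disk 1, and so on, wrapping around.
        disks[(offset // block_size) % num_disks].append(block)
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the original data by reading one block per disk, row by row."""
    rows = max(len(disk) for disk in disks)
    blocks = []
    for row in range(rows):
        for disk in disks:
            if row < len(disk):
                blocks.append(disk[row])
    return b"".join(blocks)


disks = stripe(b"ABCDEFGHIJ", num_disks=3, block_size=2)
restored = unstripe(disks)
```

Because consecutive blocks sit on different disks, a real array can service them in parallel, which is where the performance gain comes from.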
Parity is a technique of checking whether data has been successfully transmitted between computers, or in this particular case, between a computer and storage array. RAID technology uses parity calculations to ensure that complete data sets can be retrieved from an array even if one or more disks in the array fail.
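To make the parity idea concrete, here is a minimal sketch that assumes a single XOR parity block, the scheme commonly associated with single-parity RAID levels. The helper name and the sample bytes are invented for the example. With one parity block, exactly one missing block can be recomputed; surviving more failures requires additional parity.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, each imagined to live on a different disk (hypothetical values).
data_blocks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]

# The parity block is stored on a further disk.
parity = xor_blocks(data_blocks)

# Suppose the disk holding data_blocks[1] fails. XOR-ing the surviving
# data blocks with the parity block recomputes the missing block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

The rebuild works because XOR-ing a value with itself cancels out, so everything except the missing block drops away.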
Then there is mirroring, which is the process of copying data from one disk to one or more additional disks so that the data is available from more than one place. It's another technology used to ensure data availability in the event of disk failures.
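Mirroring can be sketched in a few lines as well. The class below is a toy model, assuming in-memory dictionaries in place of disks and hypothetical method names: every write goes to all healthy copies, and a read is served from the first copy that is still available.

```python
class MirroredVolume:
    """Toy mirror: each write lands on every healthy copy."""

    def __init__(self, copies: int = 2) -> None:
        # Each "disk" maps block numbers to data; None marks a failed disk.
        self.disks: list[dict[int, bytes] | None] = [{} for _ in range(copies)]

    def write(self, block_no: int, data: bytes) -> None:
        for disk in self.disks:
            if disk is not None:
                disk[block_no] = data

    def fail_disk(self, index: int) -> None:
        self.disks[index] = None

    def read(self, block_no: int) -> bytes:
        for disk in self.disks:
            if disk is not None and block_no in disk:
                return disk[block_no]
        raise IOError(f"block {block_no} is not available on any copy")


volume = MirroredVolume(copies=2)
volume.write(0, b"payroll")
volume.fail_disk(0)            # simulate losing the first disk
surviving = volume.read(0)     # still served from the second copy
```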
RAID storage systems are still alive
But RAID remains relevant in the sophisticated storage market because as we gather, save, and rely on more and more data, the risk and consequences of losing that data increase as well. "RAID is very much alive," Schulz said. "And the beauty of it is that it continues to evolve. RAID in many ways is a paradigm; it's an umbrella that is implemented in different ways by different vendors, some in hardware, some in software. RAID doesn't have to be implemented just in a controller. If you look in a volume manager that can mirror across two different volumes, that's RAID."
One of the industry concerns about RAID is the continued growth in disk sizes. Current systems tout 2 TB drives and, according to Schulz, we'll see 8 TB drives in the not-too-distant future. Will so much data on large, individual drives diminish RAID's effectiveness?
Schulz says no. Today's RAID controllers rebuild data sets after drive failures faster than ever. "They are rebuilding a given drive capacity faster than they were in previous generations with even smaller drives," Schulz said.
And today's drives are not only bigger, but more reliable. That's not to say that administrators should consider intermittent drive failures a thing of the past. "Drives will fail," Schulz said. "It's not a matter of if; it's a matter of when." And to address the threat of multiple large-drive failures, Schulz said he believes the industry will develop systems with triple parity and triple mirroring capabilities. This would offer protection against a second drive failure before the system has a chance to rebuild the data set from an initial disk failure.
Some vendors have moved in different directions and are developing "post-RAID" products. One example is IBM Corp.'s XIV storage system, which relies on a grid architecture and active-active N+1 redundancy for data protection.
Another vendor touting a post-RAID architecture is Omneon Inc., which developed a distributed file system grid with interconnected nodes in its MediaGrid storage system for data protection instead of implementing RAID technology.
RAID levels
There are six basic RAID levels, and many more combinations available from most storage array vendors. The six levels define separate techniques for striping data, mirroring data, and using parity calculations for better read and write performance and fault tolerance in different storage environments and use cases. Determining the right RAID technology for your situation will require balancing equipment and technology costs, performance requirements, data availability requirements, and capacity needs along with your organizational goals.
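One part of that balancing act, capacity, is easy to estimate. The sketch below uses the commonly cited rules of thumb for four widely used levels (RAID 0, 1, 5 and 6) and assumes identical disks; the article itself does not enumerate the levels, and real arrays reserve extra space for spares and metadata, so treat the numbers as approximations. The function name and the example disk counts are made up for illustration.

```python
def usable_capacity_tb(level: str, num_disks: int, disk_tb: float) -> float:
    """Approximate usable capacity for an array of identical disks."""
    if level == "0":               # striping only, no redundancy
        return num_disks * disk_tb
    if level == "1":               # every disk holds a full copy
        return disk_tb
    if level == "5":               # one disk's worth of capacity goes to parity
        if num_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (num_disks - 1) * disk_tb
    if level == "6":               # two disks' worth of capacity goes to parity
        if num_disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return (num_disks - 2) * disk_tb
    raise ValueError(f"level {level!r} is not covered by this sketch")


# Four hypothetical 2 TB drives under each scheme.
capacities = {level: usable_capacity_tb(level, 4, 2.0) for level in ("0", "1", "5", "6")}
```

Comparing the results side by side shows the basic trade: the more capacity given over to redundancy, the more failures the array can absorb.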
First and foremost, RAID should be easy. Schulz said RAID technology in today's storage arrays should be transparent and easy to use and manage. When looking at storage arrays and RAID technologies, "don't get hung up on the RAID levels and the stripe size and the chunks and all that," Schulz said. You should ask vendors if the system will figure out what RAID levels you should use based on your business needs and data protection requirements, and guide you in setting the system up.
The right RAID disk array for you
If you feel comfortable asking more specific questions about RAID technology, Schulz recommends asking the following questions to help you find the right RAID disk array for your needs:
- What RAID levels do the storage arrays you're considering support? If the arrays support multiple levels, do they operate concurrently and over what number and types of disk drives?
- How are the RAID arrays optimized for your environment's I/O operations?
- Are the reads and writes random or sequential?
- Is a RAID offload or accelerator engine being used?
- Does the RAID controller accelerate parity calculations and data movement to reduce the time and data availability exposure during drive rebuilds?
- How is the disk cache integrated with the RAID controller to improve read and write performance?
If you plan to tier your data storage, you should know what's involved in migrating data from one volume to another and between RAID levels.
It's also important to stay focused on your organization's goals related to application performance and availability, and how the various RAID technologies can help you meet those goals.
And don't forget that RAID only protects data on your primary storage system. "The big thing is to make sure you know that RAID is not a replacement for backup," Schulz said. "RAID needs to be combined with some form of time-based backup." RAID protects your data from a disk failure on your primary storage system, and a reliable backup strategy protects your data in case the primary system fails.
This was first published in September 2010