Using global spare drives to increase SAN reliability
What you will learn:
How to use global spare drives to create a more reliable SAN.
Using hot spare drives is an efficient technique for building a highly reliable storage area network (SAN). A highly reliable SAN is different from a high availability SAN, which must meet specific availability levels and requires more than a couple of hot spares. Hot spare drives allow an array to start rebuilding as soon as a failed drive is detected. This restores the array to health faster and reduces the storage administrator's workload.
Originally, the hot spare had to be included in the array being protected, but through advances in controller design and SAN architecture, several arrays on the same SAN can now share hot spares. Using these global spare drives saves money by reducing the number of spares on the system.
When a drive fails in a hot spare/global spare system, the array controller looks for a hot spare in the array. If it doesn't find one, it looks for a global spare and uses that.
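The lookup order described above can be sketched in a few lines of Python. This is only an illustration, not any vendor's controller logic: the `Drive` and `Array` classes and the `pick_spare` function are hypothetical names, and the rule that a spare must be at least as large as the array's member drives is an assumption that some controllers tighten to an exact size match.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Drive:
    name: str
    capacity_gb: int


@dataclass
class Array:
    name: str
    member_capacity_gb: int
    local_spares: List[Drive] = field(default_factory=list)


def pick_spare(array: Array, global_spares: List[Drive]) -> Optional[Drive]:
    """Return a replacement for a failed member drive, or None if none fits."""
    # Check the array's own hot spares first, then the shared global pool.
    for pool in (array.local_spares, global_spares):
        for drive in pool:
            # Assumption: a spare must be at least as large as the members.
            if drive.capacity_gb >= array.member_capacity_gb:
                pool.remove(drive)  # the spare is now in use
                return drive
    return None


# Two arrays sharing one global spare.
shared_pool = [Drive("global-0", 500)]
array_a = Array("array-a", 300, [Drive("local-0", 300)])
array_b = Array("array-b", 500)

print(pick_spare(array_a, shared_pool).name)  # local-0
print(pick_spare(array_b, shared_pool).name)  # global-0
print(pick_spare(array_b, shared_pool))       # None
```

The last call returns `None` because the only global spare has already been consumed, which is the situation a shared pool that is too small can leave you in.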
Global spares are easy to implement. Once the drive or drives are installed, the controller's management software typically allows administrators to designate drives as global spares with a few mouse clicks.
Here's a checklist for storage administrators who are considering deploying global spares. (Note: These factors say nothing about how useful global spares are; they are simply the points to check before deploying them.)
Global spares: Make sure that the array controller can support global spares. Some older models don't.
Drive size: A spare can often be larger than the drives in the array, but some configurations require it to be the same size. This can cause a problem if the arrays on the SAN use drives of different capacities. Check with your vendor to make sure your proposed global spare configuration will work.
Architecture: Most controllers can't share global spares outside their own group of arrays. You need to make sure that your spares are available to all the arrays you want to protect.
Number of spares: If you have only two or three arrays in the group, you may be able to get away with a single global spare drive. However, if you have 10 or 20 arrays, a single spare is inadequate. Rebuild time enters into this as well: as disk capacities grow, so does the time it takes to rebuild the array. Until the array is rebuilt, a second drive failure in that array can render it nonfunctional at most RAID levels. The major exception is RAID 6, which can survive two drive failures.
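To see how rebuild time scales with capacity, here is a rough back-of-the-envelope estimate in Python. It assumes the rebuild is limited only by a sustained write rate to the spare; the `rebuild_hours` function and the 50 MB/s figure are illustrative assumptions, not measurements, and real rebuilds usually run slower while the array is also serving host I/O.

```python
def rebuild_hours(capacity_gb: float, rate_mb_per_s: float) -> float:
    """Estimate hours needed to rewrite a whole drive at a sustained rate."""
    seconds = capacity_gb * 1000 / rate_mb_per_s  # decimal units: 1 GB = 1000 MB
    return seconds / 3600


# Assumed sustained rebuild rate of 50 MB/s (hypothetical).
for capacity_gb in (250, 500, 1000):
    print(f"{capacity_gb} GB: about {rebuild_hours(capacity_gb, 50):.1f} hours")
```

At a fixed rate the estimate grows linearly with capacity, so doubling the drive size doubles the window during which the array is exposed to a second failure.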
About the author: Rick Cook specializes in writing about issues related to storage and storage management.
This was first published in November 2007