Managing enterprise data storage more efficiently, Part 2: Reclaim storage and consolidate data

Beth Pariseau, Senior News Writer

To reclaim storage and consolidate data, enterprise data storage administrators are increasingly turning to techniques that get the most out of their storage arrays. Using tiered storage, data deduplication, space-efficient snapshots, thin provisioning and wide striping, solid-state drives (SSDs) and energy-efficiency measures, they can store as much data as possible in the smallest and least-expensive amount of space.

Tiered storage

Tiered storage has been in vogue for a while, but tiering projects are becoming more popular as companies seek to avoid purchasing more disk capacity.

University HealthSystem Consortium (UHC), an Oak Brook, Ill.-based industry group for U.S. academic hospitals and medical centers that collects data and coordinates cooperation between healthcare facilities, has been aggressive with its tiered storage strategy. Instead of placing new data on tier 1 and demoting it through lower tiers as it ages, UHC starts all data on tier 2 and uses Hitachi Data Systems' Tuning Manager to identify data that needs to move up to tier 1 for performance reasons.
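UHC's promote-on-demand policy can be sketched in a few lines of Python. The sketch below is illustrative only: the thresholds, metric names and Volume structure are assumptions, not Tuning Manager's actual interface. Tuning Manager reports the performance data; an administrator acts on it.

```python
from dataclasses import dataclass

# Hypothetical thresholds -- real promotion decisions would be based on
# performance reports, not these made-up numbers.
IOPS_PROMOTE_THRESHOLD = 2000      # sustained IOPS that justify tier 1
LATENCY_PROMOTE_MS = 20.0          # response time too slow for the workload

@dataclass
class Volume:
    name: str
    tier: int             # 1 = fastest, higher = slower/cheaper
    avg_iops: float       # observed average I/O operations per second
    avg_latency_ms: float # observed average response time

def plan_promotions(volumes):
    """All data starts on tier 2; promote only what performance demands."""
    promotions = []
    for v in volumes:
        if v.tier == 2 and (v.avg_iops > IOPS_PROMOTE_THRESHOLD
                            or v.avg_latency_ms > LATENCY_PROMOTE_MS):
            promotions.append(v.name)
    return promotions

if __name__ == "__main__":
    vols = [Volume("erp_db", 2, 3500, 12.0),     # hot: promote to tier 1
            Volume("file_share", 2, 150, 8.0)]   # cold: stays on tier 2
    print(plan_promotions(vols))                 # ['erp_db']
```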

UHC purchased Hitachi Data Systems' Universal Storage Platform V (USP V) storage virtualization controller in 2008 when it forecast 400% data growth. Steven Carlberg, UHC's principal network administrator, said he chose the USP V system over systems from EMC Corp., Hewlett-Packard (HP) Co., IBM and Sun Microsystems Inc. because he anticipated that virtualizing arrays behind the controller would make future migrations easier.

"Obviously, the main cabinet would take a forklift upgrade to replace, but I don't foresee having to swap it out for a long, long time. It can scale up to 247 petabytes," he said. "Going forward, on lower tiers, we can change out obsolete hardware without a lot of pain."

Data reduction maximizes tiered storage efficiency

Clackamas County in northwestern Oregon deployed tiered storage by using F5 Networks Inc.'s ARX file virtualization switch and Data Domain's DD565 data deduplication disk arrays in an effort to put off adding capacity to its tier 1 SAS-based iSCSI storage-area network (SAN).

"Our tiered storage policy will free up about 10 TB of tier 2 storage. That, in turn, frees up SAS capacity used by database servers and our email archiving system," said Christopher Fricke, Clackamas County's senior IT administrator. "It helps us not have to chase capacity while we go through a budget crunch. We can focus on performance rather than capacity. In a couple of budget cycles, we should be able to look at replacing our current tier 1 with solid-state disk [SSD]."

Similarly, Vancouver, B.C.-based Rainmaker Entertainment Inc. will use an ECO appliance from Ocarina Networks with network-attached storage (NAS) systems from BlueArc Corp. and Isilon Systems Inc. to reduce primary storage and keep more archival data online.

Data reduction technologies originally deployed for backup are making their way into primary and nearline storage. Cornell University's Center for Advanced Computing (CAC) and e-commerce site Shopzilla are among the early adopters of primary storage data reduction to consolidate their storage and keep up with data growth. Both use compression appliances from startups: Ithaca, N.Y.-based Cornell runs Ocarina Networks ECOsystem appliances, and Shopzilla has Storwize Inc.'s STN-6000 device.

CAC is also trying to create economies of scale for the entire campus by getting other departments in the university to store research data on its storage. "We're hoping researchers will put their data on centralized storage devices as opposed to spending excessive amounts of money on terabyte USB drives that are deployed in silos," said David Lifka, director of CAC. "Siloed technologies cost the university money. They reduce scalability and cost more to maintain."

With Ocarina Networks' compression, the effective cost of a terabyte is approximately $500, a price point that appeals to CAC's clients. At the rate researchers are signing on, Lifka said, CAC may have to add another 100 TB to its DataDirect Networks Inc. arrays this summer.
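The arithmetic behind that effective price is simple: compression multiplies each purchased terabyte. In the sketch below, the $1,500/TB raw cost and 3:1 reduction ratio are hypothetical; only the roughly $500 effective figure is Lifka's.

```python
def effective_cost_per_tb(raw_cost_per_tb, reduction_ratio):
    """Effective $/TB = raw $/TB divided by the data reduction ratio."""
    return raw_cost_per_tb / reduction_ratio

# Hypothetical example: $1,500/TB raw capacity at a 3:1 reduction ratio
# lands near the ~$500/TB effective price CAC cites.
print(effective_cost_per_tb(1500, 3.0))  # 500.0
```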

Space-efficient snapshots

Justin Bell, network administrator at Madison, Wis.-based engineering firm Strand Associates Inc., said space-efficient snapshots save valuable space on his storage arrays.

"We don't need deduplication," Bell said. Strand only sends changes to data on the company's servers at the main and branch offices to the backup server with FalconStor Software Inc.'s DiskSafe product. That reduces 12 TB total capacity, and 2 TB logical snapshot capacity (about four months' worth of snapshots, according to Bell) takes up 660 GB with only changes saved.

"We have the space to keep a year-and-a-half's worth of backup snapshots on all production drives without deduplication," he said.

The company uses Riverbed Technology Inc.'s Steelhead wide-area network (WAN) optimization devices to centralize storage of production files among the branch offices, but Bell said Riverbed's data deduplication algorithms "don't touch the FalconStor stuff." Riverbed has its own compression and protocol optimization for sending data over a WAN, but "that wouldn't do much, since FalconStor is rarely sending the same data twice," Bell said.

Thin provisioning and wide striping

UHC also uses Hitachi Dynamic Provisioning (HDP) to wide-stripe and thin-provision volumes. "We can run at 125% over-allocation fairly safely," UHC's Carlberg said. With the most recent release, volumes can be dynamically expanded.
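Running over-allocated safely comes down to watching two ratios: provisioned (logical) capacity against the physical pool, and data actually written against the pool. A hedged sketch of that check follows; the 80% alert threshold and function shape are assumptions, not HDP's interface.

```python
def pool_status(physical_tb, provisioned_tb, written_tb, alert_pct=80.0):
    """Report the over-allocation ratio and whether written data is
    nearing the physical pool -- the number that matters for safety."""
    overallocation = provisioned_tb / physical_tb * 100  # e.g. 125%
    pool_used = written_tb / physical_tb * 100
    return {
        "overallocation_pct": overallocation,
        "pool_used_pct": pool_used,
        "expand_pool": pool_used >= alert_pct,
    }

# Hypothetical pool: 100 TB physical, 125 TB provisioned (125% over-
# allocated), 70 TB actually written -- over-committed but still safe.
print(pool_status(100, 125, 70))
```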

"It's saved us countless hours of management," Carlberg said. "Instead of struggling to manage with almost two full-time engineers [FTEs], we're now at one-half of an FTE with 400% data growth."

He would like to see Hitachi Data Systems go further, however. "I wish we had zero page reclaim when we were migrating volumes off [our older array]," Carlberg said. "We had to write the full volumes to the USP V whether they were really full or not."
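Zero page reclaim addresses exactly that complaint: during a migration, pages containing only zeros need not be copied or allocated on a thin-provisioned target. A minimal sketch of the idea, with a hypothetical 4 KB page size (the USP V's real allocation unit differs):

```python
PAGE_SIZE = 4096  # hypothetical page size in bytes

def migrate_with_zero_reclaim(source_pages, write_page):
    """Copy a volume page by page, skipping all-zero pages so the thin
    target allocates space only for pages that hold real data."""
    written = skipped = 0
    for index, page in enumerate(source_pages):
        if page == b"\x00" * PAGE_SIZE:   # zero page: nothing to store
            skipped += 1
            continue
        write_page(index, page)           # allocate and copy real data
        written += 1
    return written, skipped

# Hypothetical half-empty volume: only 2 of 4 pages consume space.
pages = [b"\x00" * PAGE_SIZE, b"data" + b"\x00" * (PAGE_SIZE - 4)] * 2
print(migrate_with_zero_reclaim(pages, lambda i, p: None))  # (2, 2)
```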

Carlberg added that he hopes Hitachi Data Systems will add the ability to dynamically shrink over-allocated volumes if necessary, and he may get his wish.

"We're working with vendors like Symantec [Corp.] and industry standards groups to address automatic space reclamation and improve the ability for file systems to tell us when we can unmap storage and return it for better uses," wrote John Harker, Hitachi Data Systems' spokesman, in an email to SearchStorage.com. "We expect to share some news before the end of the year."

SSDs and energy efficiency

The cost per gigabyte of solid-state drives (SSDs) remains a barrier to adoption for most users in the enterprise data storage market. But Bill Konkol, vice president of technology at broadcast advertising design firm Marketing Architects Inc. in Minneapolis, said he has met performance requirements more cheaply with a Compellent Technologies Inc. Storage Center SAN than he could with short-stroked traditional hard disk drives.

To get the required performance, Konkol estimated he would have to short-stroke 28 Fibre Channel (FC) disks, which would cost $75,000 more than the Compellent SAN he bought with less than 500 GB of STEC Inc. SSDs. Adding the solid-state drives has boosted performance enough so that searches that previously took hours now take approximately 15 minutes, he said.
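The math behind Konkol's estimate is straightforward: short-stroking confines I/O to each disk's outer tracks to raise per-spindle IOPS, so the performance target dictates the spindle count regardless of capacity needs. In the sketch below, the 7,000 IOPS target and 250 IOPS per short-stroked disk are hypothetical numbers chosen to reproduce his 28-disk figure:

```python
import math

def disks_needed(target_iops, iops_per_short_stroked_disk):
    """Short-stroking uses only the outer tracks of each disk, so the
    IOPS target sets the spindle count, not the capacity requirement."""
    return math.ceil(target_iops / iops_per_short_stroked_disk)

# Hypothetical: ~250 IOPS per short-stroked FC disk and a 7,000 IOPS
# target give the 28 spindles Konkol estimated; a few SSDs can deliver
# the same IOPS from far less (and, in his case, far cheaper) hardware.
print(disks_needed(7000, 250))  # 28
```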

Scott Stellmon, systems architect at Palo Alto, Calif.-based The Children's Health Council (CHC), said he's looking to deploy an Infortrend Inc. EonStor B12F modular storage system, which is scaled in 1U, 12-drive units. Infortrend's system can also accommodate SSDs in 2.5-inch drive slots, which Stellmon said he'd like to add for primary databases.

"We haven't made the final decision yet, but because some of our data sets are small enough, it won't take much scale in SSD capacity," he said.

An 80 GB PCI Express solid-state drive costs approximately $3,000, and most of CHC's databases are between 5 GB and 8 GB.

This will be cheaper and draw less power than adding enough disk drives to the system to keep databases performing fast. "If I can justify saving money on electrical costs [with SSDs], I can turn around and use that money to buy more storage gear," Stellmon said.
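The back-of-the-envelope power case looks like this; every figure in the sketch is a hypothetical estimate, not a CHC measurement:

```python
def annual_power_cost(drive_watts, drive_count, dollars_per_kwh=0.10):
    """Rough yearly electricity cost for drives spinning 24x7."""
    kwh_per_year = drive_watts * drive_count * 24 * 365 / 1000
    return kwh_per_year * dollars_per_kwh

# Hypothetical comparison: 20 x 12 W spinning disks vs. 1 x 8 W SSD.
print(annual_power_cost(12, 20))  # ~$210/year for the disk option
print(annual_power_cost(8, 1))    # ~$7/year for a single SSD
```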

The final part of our series will examine what storage administrators are doing to reduce operating expenses.