Primary storage data reduction advancing via data deduplication, compression

16 Dec 2009 | Carol Sliwa, Features Writer

While not as hot as data deduplication for backup, primary storage data reduction, which includes data deduplication and data compression techniques, is getting warmer thanks to a scattering of products that try to shrink the data footprint on tier 1 disk.

The companies with offerings in this space are taking a variety of approaches to the problem. For instance, one primary storage data-reduction approach searches for duplicates at the file level, while others are more granular, comparing fixed- or variable-size data blocks or byte streams. Some work post-process, storing the data writes before starting the data deduplication process. And one compression specialist operates inline, in the data path.
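The difference between file-level and block-level comparison can be illustrated with a minimal Python sketch. It is not any vendor's implementation: the function names, the 4 KB block size, the use of SHA-256 fingerprints and the sample files are all assumptions made for illustration.

```python
import hashlib


def file_level_dedupe(files):
    """Keep one copy of each distinct file, keyed by a hash of the whole file."""
    store = {}
    for data in files.values():
        store.setdefault(hashlib.sha256(data).hexdigest(), data)
    return sum(len(d) for d in store.values())


def block_level_dedupe(files, block_size=4096):
    """Keep one copy of each distinct fixed-size block across all files."""
    store = {}
    for data in files.values():
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            store.setdefault(hashlib.sha256(block).hexdigest(), block)
    return sum(len(b) for b in store.values())


# Hypothetical sample data: three files that share a 4 KB header.
header = bytes(range(256)) * 16          # 4,096 bytes
body_a = b"A" * 4096
body_b = b"B" * 4096
files = {
    "report_v1": header + body_a,
    "report_v2": header + body_b,        # same header, different body
    "report_copy": header + body_a,      # exact copy of report_v1
}

raw_bytes = sum(len(d) for d in files.values())
file_bytes = file_level_dedupe(files)    # only the exact copy is eliminated
block_bytes = block_level_dedupe(files)  # the shared header is also stored once

print(raw_bytes, file_bytes, block_bytes)
```

In this toy data set, file-level comparison removes only the exact copy, while block-level comparison also stores the shared header once. The finer granularity comes at the cost of more fingerprints to compute and track.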

NetApp Inc.'s deduplication, which operates at the block level, is the most prominent of the offerings taking aim at primary storage. The company claims that more than 8,000 customers have licensed its free deduplication technology since its 2007 release.

Rival EMC Corp. followed NetApp into primary storage deduplication in early 2009 with the release of its Celerra Data Deduplication, which actually performs compression before tackling deduplication on file-based data.

Ocarina Networks Inc. also does both compression and deduplication but takes a different path than EMC. Ocarina's ECOsystem first extracts and decompresses file-based data, then deduplicates on a variable- or sliding-block basis before compressing it.
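Why variable-size blocks can matter is easiest to see when a file is edited near its start. The Python sketch below is a generic illustration of content-defined chunking followed by compression, not Ocarina's algorithm: the rolling-sum boundary test, the window and mask values, the chunk size limits, the sample text and the use of zlib are all assumptions.

```python
import hashlib
import random
import zlib

WINDOW = 16   # bytes covered by the rolling sum (assumed value)
MASK = 0x3F   # cut when the low 6 bits are all ones, about 1 position in 64


def variable_chunks(data, min_size=32, max_size=1024):
    """Split data where a rolling sum of the last WINDOW bytes hits a pattern."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]   # slide the window forward
        size = i - start + 1
        if (size >= min_size and (rolling & MASK) == MASK) or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])       # trailing partial chunk
    return chunks


def fixed_chunks(data, size=64):
    """Split data into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def unique_chunks(chunk_lists):
    """Keep one copy of each distinct chunk, keyed by its SHA-256 digest."""
    store = {}
    for chunks in chunk_lists:
        for chunk in chunks:
            store.setdefault(hashlib.sha256(chunk).digest(), chunk)
    return store


# Hypothetical sample: a text-like file and a copy with one byte inserted.
rng = random.Random(0)
words = [b"tier", b"disk", b"block", b"file", b"dedupe", b"array", b"cache", b"volume"]
original = b" ".join(rng.choice(words) for _ in range(4000))
edited = b"X" + original   # the insert shifts every fixed block boundary

fixed_store = unique_chunks([fixed_chunks(original), fixed_chunks(edited)])
variable_store = unique_chunks([variable_chunks(original), variable_chunks(edited)])

fixed_stored = sum(len(c) for c in fixed_store.values())
variable_stored = sum(len(c) for c in variable_store.values())
# Compress only after deduplication, so each unique chunk is compressed once.
compressed_stored = len(zlib.compress(b"".join(variable_store.values())))

print(len(original) + len(edited), fixed_stored, variable_stored, compressed_stored)
```

Because the chunk boundaries follow the content rather than fixed offsets, the chunks after the inserted byte line up again and are stored once, whereas every fixed-size block shifts and looks new. Production systems typically use stronger rolling hashes than the simple byte sum assumed here.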

Finishing up the list of current entrants in this space is Storwize Inc., which has a compression-only offering. Storwize CEO Ed Walsh contends that primary storage is not the appropriate place for data deduplication.

"You dedupe what's repetitive, and you don't find in primary data the same repetition that you see in a backup data flow," said Walsh, who was formerly the CEO of Avamar, a deduplication backup software vendor acquired by EMC in 2006.

More primary storage deduplication products on the horizon

Yet some industry analysts expect more vendors to turn their attention to primary storage deduplication. Permabit Technologies Corp., for instance, offers inline, sub-file-level deduplication. Permabit targets its dedupe at archiving but claims some customers use it for primary storage. Sun Microsystems Inc. recently added built-in deduplication to its ZFS file system, and other vendors that employ the open-source ZFS technology are likely to exploit it.

"Vendors that have solutions today in the market may not be the ones you'll see in five years," said Lauren Whitehouse, a senior analyst at Enterprise Strategy Group. "It's not that they're going to go away, but I don't think they'll even be the top ones. It might be the application vendor or the operating system vendor, someone closer to the creation of data, the storage of that data, policies around that data."

Valdis Filks, a research director for storage technologies and strategies at Gartner Inc., said he expects two or three more vendors to offer deduplication for primary storage in 2010, with more to follow in 2011. By 2012, primary storage dedupe will be ubiquitous, he predicted.

"Sometimes we say a technology turns the industry upside down or on its head. Allegorically and technically, dedupe on primary storage does that," Filks said. "We are so used to writing the data to a backup dedupe device and deduping it there. If everything is deduped at source, I expect the back-end dedupe vendors to start to have lots of trouble, and they will obviously have a marketing offensive saying primary deduplication is the wrong place."

Filks said software-intensive, modern-design storage devices with a file system and intelligent block-based architectures, which have the ability to store metadata pertaining to each data block, will be best suited to primary storage data deduplication. Performance issues can be overcome through a combination of multi-core high-speed processors, low-cost DRAM for cache and solid-state drive technology, he added.
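The kind of per-block metadata Filks refers to can be pictured as a fingerprint table with reference counts. The Python sketch below is a toy in-memory model under assumed names (`BlockMeta`, `DedupeVolume`), not a description of any shipping product.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class BlockMeta:
    """Metadata kept for each unique block: its bytes and a reference count."""
    data: bytes
    refcount: int = 0


class DedupeVolume:
    """Toy block store that shares identical blocks between files."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}   # fingerprint -> BlockMeta
        self.files = {}    # file name -> ordered list of fingerprints

    def write(self, name, data):
        if name in self.files:
            self.delete(name)             # overwrite releases the old blocks
        fingerprints = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            fp = hashlib.sha256(block).digest()
            meta = self.blocks.setdefault(fp, BlockMeta(block))
            meta.refcount += 1            # one more reference to this block
            fingerprints.append(fp)
        self.files[name] = fingerprints

    def read(self, name):
        return b"".join(self.blocks[fp].data for fp in self.files[name])

    def delete(self, name):
        for fp in self.files.pop(name):
            meta = self.blocks[fp]
            meta.refcount -= 1
            if meta.refcount == 0:        # last reference gone: free the block
                del self.blocks[fp]


volume = DedupeVolume(block_size=4)
volume.write("a", b"AAAABBBB")
volume.write("b", b"AAAACCCC")            # shares the AAAA block with "a"
print(len(volume.blocks))
```

The reference count is what lets a shared block survive the deletion of one of the files that points to it. Keeping this table fast is where the cache and solid-state components Filks mentions would come into play.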

"Designers have more performance-accelerating components in storage than they have ever had before, at an affordable price," Filks said.

In the meantime, the majority of end users have been content to hold off on primary storage data reduction.

"People are OK just buying more disk drives," said Greg Schulz, founder and senior analyst at StorageIO Group. "People understand and realize that they can go in and archive, pull the data out of databases, out of email, out of file systems, and then back it up onto a deduped disk or onto a compressed tape."

How primary storage deduplication products work

Understanding how the current crop of primary storage data-reduction products works, and where each product's sweet spot lies, can help an IT organization decide whether the technology might be a good fit to help curb the explosive growth of storage.

"Vendors doing sub-file reduction have a much higher hurdle to get over because they have to demonstrate that they can do that with very little performance impact in primary storage use cases," said Jeff Boles, senior analyst and director of validation services at Taneja Group.

All Rights Reserved, Copyright 2008 - 2010, TechTarget