All-in-One Buying Guides: Tiered Storage Buying Guide:
Tiered storage data classification tools


Data classification tool purchase considerations

03 Jul 2007 | Stephen J. Bigelow

Data classification is generally a two-part process. An organization must first understand the business value of applications and data, then store and protect that data at the appropriate service levels. In effect, data classification is used to align business applications with the storage infrastructure. While this may sound simple on the surface, data classification is actually one of the most difficult initiatives for an organization to successfully complete -- often because an organization cannot locate all of its data, categorize it properly or determine its business value. Data classification tools can overcome some of these obstacles by helping organizations locate their data and then organize the data based on user-defined rules. After classification, many tools can also help to move and migrate data to the appropriate storage subsystem.

Still, data classification is hardly a ubiquitous process, and it is certainly not a task for IT to shoulder alone. Proper classification depends on a thorough understanding of the data's business value, which normally requires the involvement of various business units, such as legal, manufacturing, human resources and finance. Data must be classified "on paper" first; data classification tools can then bring efficiency and automation to the process. There are numerous tools to choose from, and many offer features such as indexing, search, policy management and migration. Now that you've reviewed the essential issues involved in any tiered storage acquisition, this segment covers specific considerations for data classification tools. After that, you'll find a series of specifications to help make on-the-spot product comparisons between vendors, including Abrevity Inc., EMC Corp., Index Engines Inc. and StoredIQ Inc.

Consider the product's scope. Understand the number of different file types that you will need to support and select a data classification tool that can handle that scope. Otherwise, some file types may be left unclassified and will probably be stored improperly. Select tools that fully support both structured and unstructured data. Tools that handle only one of the two, or that are intended only for certain applications such as databases, may not meet your objectives. Most products handle a vast array of structured and unstructured file types. For example, FileData Classifier from Abrevity claims to handle over 300 file types, including Microsoft Office files, PDFs, email files, databases like SQL or Access and a variety of media file types.
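
One way to gauge the scope you need is to inventory the file types you already hold and compare them with a vendor's supported list. The following Python sketch is illustrative only: the function names and the supported-extension set are assumptions made for this example, not part of any product mentioned here.

```python
from collections import Counter
from pathlib import Path


def count_extensions(file_names):
    """Count files by lowercase extension ('' means no extension)."""
    counts = Counter()
    for name in file_names:
        counts[Path(name).suffix.lower()] += 1
    return counts


def unsupported_types(counts, supported):
    """Return the extensions in `counts` that a candidate tool does not list."""
    return {ext: n for ext, n in counts.items() if ext not in supported}


if __name__ == "__main__":
    # Hypothetical supported list; in practice, take it from the vendor's spec sheet.
    supported = {".doc", ".xls", ".pdf", ".pst", ".mdb"}
    # Walk the current directory as a stand-in for a real file share.
    names = [p.name for p in Path(".").rglob("*") if p.is_file()]
    print(unsupported_types(count_extensions(names), supported))
```

Anything the second function reports is a file type that the candidate tool would probably leave unclassified.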

Evaluate the support for various rule sets and automation. All data classification products rely on a set of rules that drive the classification engine. Early data classification tools relied almost entirely on rule sets created in-house, but most of today's tools can import established rule sets, often to support the medical or legal industries. However, it is important to determine whether imported rule sets can be modified or adapted to your specific needs. For example, the auto-stor product from Arkivio Inc. includes standard classification categories out of the box, but classes can be adapted and new classes can be created as needs change. Also check whether the tool supports manual classification, which is not universally available. For example, the Information Server from Kazeon Systems Inc. allows manual classification to be performed by the end user or administrator on a set of files defined by a search query or a report, while Infoscape from EMC Corp. does not support manual classification.
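
To make the distinction between imported rules, adapted rules and manual classification concrete, here is a minimal Python sketch. It does not reflect how any named product works; the `Rule` type, the sample patterns and the first-match-wins behavior are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str     # class label assigned when the rule matches
    pattern: str  # regular expression applied to the file's text


# Hypothetical "imported" rule set, standing in for a vendor-supplied one.
IMPORTED_RULES = [
    Rule("health-record", r"\bpatient\b|\bdiagnosis\b"),
    Rule("legal", r"\bcontract\b|\bsubpoena\b"),
]


def adapt_rules(imported, overrides):
    """Replace imported rules that share a name with an override; append new ones."""
    by_name = {rule.name: rule for rule in imported}
    by_name.update({rule.name: rule for rule in overrides})
    return list(by_name.values())


def classify(text, rules, manual_label=None):
    """Return the manual label if given, else the first matching rule's name."""
    if manual_label is not None:
        return manual_label
    for rule in rules:
        if re.search(rule.pattern, text, flags=re.IGNORECASE):
            return rule.name
    return "unclassified"


if __name__ == "__main__":
    rules = adapt_rules(IMPORTED_RULES, [Rule("finance", r"\binvoice\b")])
    print(classify("Invoice for July", rules))
    print(classify("Team lunch menu", rules, manual_label="hr"))
```

When evaluating a product, the equivalent questions are whether something like `adapt_rules` is possible at all, and whether an administrator or end user can supply the manual label.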

Consider support for tiered storage and migration. Up to 20% of corporate data is under-protected, so it's not available at the service level needed by the business. For example, it may take too long to recover that under-protected data, and this translates into risk for the business. Conversely, up to 60% of data is overprotected, so it's being kept on expensive storage and probably replicated too much relative to its business value. This results in excess storage expense. Shop for a data classification tool that can migrate data between storage tiers so each data type receives the appropriate service level. This maintains adequate storage performance while minimizing costs. If the tool does not natively support data migration, be sure that it can support a third-party data mover.
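
The relationship between classification and migration can be sketched as a simple policy lookup. The Python example below is a rough illustration under stated assumptions: the tier names, the class-to-tier policy and the choice to send unknown classes to the lowest tier are all hypothetical, and the output is only a plan that a separate data mover would have to carry out.

```python
# Hypothetical tiers, ordered from most to least protected (and most to least costly).
TIERS = ["tier1-replicated", "tier2-midrange", "tier3-archive"]

# Hypothetical policy: the tier each data class should live on.
POLICY = {
    "finance": "tier1-replicated",
    "legal": "tier2-midrange",
    "unclassified": "tier3-archive",
}


def plan_migrations(files, policy=POLICY, tiers=TIERS):
    """Return (name, current, target, reason) for every file on the wrong tier.

    `files` is an iterable of (name, data_class, current_tier) tuples.
    Classes missing from the policy default to the last (cheapest) tier.
    """
    plan = []
    for name, data_class, current in files:
        target = policy.get(data_class, tiers[-1])
        if target == current:
            continue
        # A lower index means a more protected tier.
        if tiers.index(current) > tiers.index(target):
            reason = "under-protected"
        else:
            reason = "over-protected"
        plan.append((name, current, target, reason))
    return plan


if __name__ == "__main__":
    sample = [
        ("ledger.xls", "finance", "tier3-archive"),
        ("memo.doc", "unclassified", "tier1-replicated"),
    ]
    for step in plan_migrations(sample):
        print(step)
```

A tool without native migration would stop at producing such a plan, which is why support for a third-party data mover matters.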

Evaluate the product's performance and scalability. A large company may need to classify and migrate millions, hundreds of millions or even billions of files. Since data classification products generally have a practical limit to the number of files that they support, select a product that can accommodate that volume while still providing an acceptable level of performance. Further, you should understand how the tool handles data in terms of file count and size. For example, some tools may be adept at handling a large number of small files, while other tools may be better suited for fewer large files. Remember that data volumes are growing at a very fast pace, so the tool should accommodate both current and future data volumes.
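
Before comparing vendor limits, it helps to know your own file count, size mix and likely growth. The Python sketch below shows one possible way to summarize those figures; the 64 KB threshold for a "small" file and the growth rate used in the example are arbitrary assumptions, not recommendations.

```python
def size_profile(sizes_in_bytes, small_limit=64 * 1024):
    """Summarize file sizes as counts and total bytes, split at `small_limit`."""
    profile = {"small_count": 0, "small_bytes": 0, "large_count": 0, "large_bytes": 0}
    for size in sizes_in_bytes:
        kind = "small" if size < small_limit else "large"
        profile[kind + "_count"] += 1
        profile[kind + "_bytes"] += size
    return profile


def projected_count(current_count, annual_growth, years):
    """Project a file count forward under compound annual growth."""
    return round(current_count * (1 + annual_growth) ** years)


if __name__ == "__main__":
    # Hypothetical sample: two small files and one large file.
    print(size_profile([100, 200, 1_000_000]))
    # Hypothetical growth assumption: 50% per year for two years.
    print(projected_count(1_000_000, 0.5, 2))
```

A profile dominated by small files stresses per-file overhead, while one dominated by large files stresses throughput, so the same product may behave very differently against the two.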

Evaluate the level of heterogeneity for your environment. The selection of a data classification tool is not just a matter of accommodating file types; the tool must also interface with other platforms in your environment. For example, a data classification tool without migration capability will need to interface with another policy manager or data mover. Similarly, the tool should support the storage platforms that you are currently using. For example, a data classification tool might be used to identify financial or health data types and then move those data types to an EMC Centera or another content-addressed storage (CAS) device. Lab testing is always recommended to verify performance and interoperability.

The data classification product specifications page in this chapter covers the following products:

  • Abrevity Inc.; FileData Classifier and FileData Manager
  • Arkivio Inc.; auto-stor software
  • Brocade Communications Systems Inc.; StorageX software
  • Brocade Communications Systems Inc.; File Lifecycle Manager (FLM) software
  • EMC Corp.; Infoscape
  • IBM; IBM Classification Module for OmniFind Discovery Edition
  • Index Engines Inc.; ILM and Data Classification appliance
  • Kazeon Systems Inc.; Information Server software
  • Scentric; Destiny software

All Rights Reserved, Copyright 2008 - 2010, TechTarget