Many organizations might wonder why they should bother with storage performance testing. The issue is simple: testing is the only way to answer some important questions. Are we getting the best throughput from a storage array? Do sudden performance changes indicate a fault? Does our new product measure up to the competition? The answers are out there, but the onus is on an IT department to find them -- or turn to a third party that can help. The best way to judge the value of storage performance testing is to see how it's used in the real world.
Optimizing media performance
Storage systems must often service a large number of network users, so performance plays a huge role in storage operations. For T-Systems Enterprise Services GmbH, the goal is to configure the storage system for optimum media file performance across its environment. Testing takes place in a dedicated lab using a two-switch fabric with an assortment of disk arrays, tape libraries, initiators and storage routers. All nodes are interconnected with 2 Gbps Fibre Channel (FC), and the entire storage system provides about 3 TB to 4 TB of capacity.
"I simulate the typical workloads and test my improvements on the subsystem, host bus adapters, file system and FC with Iometer," says Jawid Mahmoodzada, a storage specialist currently working at T-Systems. "The target is to get cheap and fast storage for the customers."
Mahmoodzada uses both Iometer and Iozone for testing.
As you'd expect, there are advantages and disadvantages to Iometer. Mahmoodzada sees the separation between the workload generator and reporting features as a strong positive, along with the versatility in workload configurations. Tackling the optimization work in-house also offers a cost benefit to the company. "I think it is definitely cheaper than engaging an outside company for this job," he says. His only disappointment is the reliance on Microsoft applications to analyze the log files, and he suggests moving to an open-source format compatible with applications other than Microsoft tools such as Excel or Access.
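The Excel/Access dependency Mahmoodzada describes can be sidestepped with a small script. As a minimal sketch -- the column names below are simplified placeholders, not Iometer's actual export schema -- a few lines of Python can pull outliers out of a CSV results file:

```python
import csv
import io

# Simplified stand-in for an exported results file; a real Iometer
# CSV uses a different, more verbose layout.
sample = """target,iops,mbps,avg_response_ms
Disk0,4200,65.6,1.9
Disk1,3900,60.9,2.1
Disk2,1150,18.0,8.7
"""

def summarize(csv_text, response_limit_ms=5.0):
    """Return the targets whose average response time exceeds the limit."""
    slow = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if float(row["avg_response_ms"]) > response_limit_ms:
            slow.append(row["target"])
    return slow

print(summarize(sample))  # Disk2 is the outlier in this sample
```

The same approach works for any tool that exports delimited text, which is the practical upside of keeping the workload generator and the reporting step separate.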
Maintaining the baseline
The key to any troubleshooting is to understand differences. When you can point to a substantial difference between the way something "is" working and the way something "should" work, you have the foundation of a diagnosis. However, in order to know how something should work, it's important to find a baseline and establish the characteristics that are considered acceptable. "It's nice to have a set baseline -- what we should be seeing in throughput, response times and so on," says Tom Becchetti, senior capacity planner at MoneyGram International and chairman of the Minneapolis/St. Paul chapter of the Computer Measurement Group. "Troubleshooting is a time of duress, so it's good to know what's normal and what's not normal."
Baselines are essential for Becchetti, who is responsible for a large quantity of EMC Corp. storage hardware. Primary storage includes both a primary and a backup EMC DMX system -- each with 15 TB. Disk backup storage includes multiple Clariion SATA systems with 10 TB in Becchetti's office and 20 TB in the backup location, as well as two more Clariions in a Colorado office with 4 TB for main use and 10 TB for backup, respectively. Iometer is the preferred tool for general-purpose Windows server benchmarking, but Becchetti leverages a wide range of tools, including FDR/Upstream from Innovation Data Processing, along with utilities native to individual hardware like EMC systems and McData switches. Baselines are typically established before a system is placed into service, or when major changes are made, and routine monitoring is accomplished with Spotlight from Quest Software Inc.
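Becchetti's practice of checking current readings against a recorded baseline can be sketched in a few lines. The metric names and the 20% tolerance below are illustrative assumptions, not values from his environment:

```python
# Hypothetical baseline captured when the system went into service.
baseline = {"throughput_mbps": 180.0, "response_ms": 1.0}

def deviations(current, baseline, tolerance=0.20):
    """Flag metrics that drift more than `tolerance` from the baseline,
    returning the fractional change for each flagged metric."""
    flagged = {}
    for metric, base in baseline.items():
        change = (current[metric] - base) / base
        if abs(change) > tolerance:
            flagged[metric] = round(change, 2)
    return flagged

# A reading taken during troubleshooting.
current = {"throughput_mbps": 95.0, "response_ms": 8.7}
print(deviations(current, baseline))
```

A check like this is only as good as the baseline behind it, which is why capturing numbers before a system enters service -- rather than during an outage -- matters.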
Becchetti has tremendous respect for the power of analytical tools, citing one particular instance when Iometer revealed unacceptable response times over a length of dense wavelength division multiplexing (DWDM) optical fiber. After identifying the anomaly and working with product technicians, he discovered that an optical transceiver was operating improperly. "When they fixed it, my response time went from 8.7 milliseconds (msec) down to under 1.0 msec response time," he says. "So our backup times went from a couple of hours to under 20 minutes -- that's mainly what we were using the DWDM link for."
Becchetti appreciates the fact that Iometer is free and likes the versatility to modify the workload between reads, writes and random operation. He also notes that Iometer is quick and easy to install, operates unobtrusively on servers alongside other applications and is compatible with NAS products. One key disadvantage is the learning curve, and Becchetti feels that the product could have been more intuitive to use. "I actually had to read the manual a little bit," he says.
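The read/write and random/sequential mix that Becchetti tunes in Iometer can be mimicked in miniature. This is a generic sketch against a scratch file, not Iometer's access-specification format; the block size, operation count and mix ratios are arbitrary assumptions:

```python
import os
import random
import tempfile
import time

def run_mix(path, ops=200, block=4096, read_pct=0.7, random_pct=0.5):
    """Issue a mixed read/write workload against a scratch file and
    return the average per-operation latency in milliseconds."""
    size = block * 256  # 1 MB scratch region
    with open(path, "wb") as f:
        f.write(os.urandom(size))
    latencies = []
    with open(path, "r+b") as f:
        offset = 0
        for _ in range(ops):
            # Choose random or sequential placement for this operation.
            if random.random() < random_pct:
                offset = random.randrange(0, size - block, block)
            else:
                offset = (offset + block) % (size - block)
            start = time.perf_counter()
            f.seek(offset)
            # Choose read or write according to the configured mix.
            if random.random() < read_pct:
                f.read(block)
            else:
                f.write(os.urandom(block))
            latencies.append((time.perf_counter() - start) * 1000)
    return sum(latencies) / len(latencies)

tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.close()
try:
    print(f"avg latency: {run_mix(tmp.name):.3f} ms")
finally:
    os.unlink(tmp.name)
```

A toy like this goes through the operating system's page cache, so its numbers say nothing about the underlying array -- which is precisely why a purpose-built generator such as Iometer, with raw-device access and controlled queue depths, is the right tool for real baselines.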
Establishing credibility in the marketplace
When a vendor develops a new product, it's important to establish a set of performance characteristics for the device. Testing is typically outsourced to a third party to develop an independent set of performance benchmarks and compare a product's performance to other similar products in the marketplace. Network Appliance Inc. (NetApp) is one of many vendors to seek third-party performance evaluations. "When you as a company generate your own performance numbers, they don't really have a lot of credibility out there in the marketplace," says Dan Morgan, director of workload engineering at NetApp. "We like to use a third party for an independent analysis of our performance."
NetApp looks to VeriTest (a Lionbridge Technologies Inc. brand) for third-party services and has already worked through two major engagements with VeriTest -- the most recent project involved comparative testing between an EMC CX500 and NetApp's FAS 3020. Morgan clearly believes in the value of third-party testing. "You certainly pay a hefty price to get the work done," he says. "But when I look at the value-add and the feedback that I get across the organization on that collateral [testing results], it's certainly worth our efforts and the money we're putting into them." Morgan declined to comment on specific costs but noted that the project took about six weeks from start to finish.
After comparing several third-party service providers, Morgan is particularly impressed with the quality and completeness of VeriTest's reporting, noting that VeriTest's work included a critique of the documentation and other user materials involved with the product. However, it's important to note that VeriTest did not provide any suggestions or recommendations for actual engineering improvements. Another tangible value for Morgan was VeriTest's familiarity with competing products (such as the EMC CX500). "We relied on their expertise to deploy that [competing] product in a real-world scenario," he says. "We just don't have that expertise in-house."
Perhaps the only area of concern for Morgan is a question of attentiveness from VeriTest. "They handle so many requests that I think they're strapped and a little thinned in areas," he says. "Sometimes they're off busy on other engagements, and it's hard to get a quick turnaround." Morgan hopes that VeriTest would supplement its staff to improve turnaround times and commit a dedicated contact to handle NetApp testing projects. In spite of this concern, Morgan expects future testing engagements with VeriTest.
Go to the next part of this article: Storage performance testing: Future directions