Founder Hitz talks NetApp strategy for clustering, convergence

Dave Raffo

Dave Hitz, NetApp Inc.'s executive vice president, was one of the vendor’s founders in 1992 and has had a front-row seat to a raft of changes in the data storage industry for 20 years. We recently caught up with him to talk about NetApp storage strategy and industry trends, including clustering, converged infrastructure, storage for virtualization, primary deduplication and cloud storage.

What is your current role at NetApp?

Hitz: My role is chief gadfly, possibly visionary. I started as a programmer. I’ve worked as outbound product evangelist and ran engineering for a while. I took a year off and wrote the book, got involved with transition planning from [CEOs] Dan [Warmenhoven] to Tom [Georgens], and started writing papers called future histories.

We’ve done a number of major transitions of the company. In the mid- to late-'90s we took a big bet on the Internet and grew up as a company. After the crash hit, we re-oriented again and based ourselves on virtualized applications. Then we placed a real big bet on server virtualization. Every five years we’ve re-invented ourselves as a company.

The bet we’re taking this time is on large-scale infrastructure for data, in particular.

What role does storage play in this large-scale infrastructure?

Hitz: The world of data has changed a lot over the past few years. Ten years ago, it was mostly boring corporate stuff … schematics, payroll, budget. Now we’re talking about storing baby’s first steps, letters to our spouse, medical records -- deeply personal stuff. So the space of data is really changing.

The other thing that’s happening with data is the place where most data management used to happen has disappeared. Data management used to mostly be on the physical server. These days with server virtualization, the physical server is not really much of a place. The virtual machines are flitting from one physical server to another. And the virtual servers aren't a good place for data management because typically you have multiple virtual servers sharing data connecting into the same database. So the question is 'Where does the data management go?' And increasingly, it’s been moving into the storage layer. The effect of that is storage is starting to become an infrastructure for data. What I mean by infrastructure is that it's a resource that’s shared, used by a lot of different people but managed centrally.

How is this reflected in NetApp's strategy for its storage?

Hitz: A big part of that is the clustering we’ve released in Ontap 8. We think clustering is really foundational to be able to have these characteristics where you can take out a server and install a different server. How do you make that happen when there’s data on that storage server that needs to be there? The answer is you need to have these things clustered together with the ability to just transparently migrate data from one place to another.

NetApp has worked on clustering for quite a while, really since acquiring Spinnaker Networks in 2004. Data Ontap 8 was supposed to be the version with full clustering, but you didn’t talk about that much when 8.0 came out. And you still offered Ontap in Version 7 mode, without clustering. Why have you waited until Ontap 8.1.1 to really emphasize clustering?

Hitz: When we shipped 8.0, it was a pretty radical upgrade of Ontap, kind of corresponding to when Microsoft came out with Windows NT. I mean, big architectural changes. We took a different approach -- when Windows NT came out there was a classic mode, but that wasn’t the default. When we came out in 8.0, it ran by default in 7 mode. That means we did this radical architectural upgrade with all the foundations for clustering in place but didn’t ship it with all that stuff turned on. And we didn’t make all that big of a deal of it at the time because we wanted to get all the pieces ready so that people would be able to flip it on.

So 8.1 is actually the place where we’re telling the story of clustering, when we’ve got proof cases installed now and customers using it. So we took more of a ‘Let’s get the infrastructure in place, get people running on it and then make a lot of noise [approach].' So here we are in 8.1 and we’re telling the story of 8.0. Maybe we’re too conservative [in our] marketing, but we like to know what we have before we blow the bugles.

How many NetApp storage customers are using clustering?

Hitz: It’s a relatively small percentage, single digits. Our goal was first to get 8.0 deployed with a whole new architecture, and make sure that whole new architectural framework was stable.

While you were working on building clustering into Ontap, EMC bought Isilon and had a mature clustering shipping product. How does NetApp’s clustering stack up vs. Isilon?

Hitz: You have a really hard choice if you go with Isilon. Because on the one hand, EMC has a lot of interesting capabilities: snapshots, cloning, dedupe, compression, a rich set of data management features. The problem with EMC is that for each of those features you have to say ‘Wait a minute, which product has it?’ And it’s not uniform across them. You say ‘I want to use SATA drives.’ Well, you can do that for Isilon; if you want to do this other thing, it’s VMAX and [for] this other thing it's VNX.

If you look at features EMC can support, you end up with a complete list. If you break apart their architectures and look at the same feature list by architecture, you end up finding the main feature Isilon has is clustering, which is great. Unfortunately, it’s not in combination with the full suite of rich data management capabilities. That’s the No. 1 difference Ontap has -- it’s the same Ontap that has all this cool stuff in it.

Second, what people find with Isilon is that you can scale it, but when it comes time for a hardware refresh, you’re talking downtime. You don’t take down your network when you put a new router in place. You shouldn’t have to take down your new data infrastructure when you put new storage systems in place.

A lot of IT vendors are moving toward a converged infrastructure combining storage, networking and compute into one offering. NetApp does this through its FlexPod reference architecture. Will convergence be a dominant trend in storage implementation?

Hitz: There are two competing trends. I have this picture in my head of the stack of stuff you need to put together a data center. There’s the CPU chip itself, the operating system, applications like databases, capabilities like networking and storage. You can think of that as a vertical stack. On one hand there’s pressure to get that whole vertical stack integrated to make it easier for customers.

There’s another trend, horizontalization, with companies thinking of their own particular layer. When I think of that trend, I think of companies like Intel with the chip or Microsoft with the operating system or Cisco with networking. The question is which of these trends is winning? It looks to me, in terms of IP ownership and the actual technology development, the horizontal model is winning. Oracle’s database beats any of the old server guys’ database; Cisco’s networking beats any of the old server guys’ networking; and EMC, Hitachi Data Systems [HDS] and NetApp are two-thirds of the storage market.

You mention EMC, NetApp and HDS dominating the storage market, but there's still a wide gap between you and market leader EMC. How can you close that gap?

Hitz: Ontap is the No. 1 storage architecture in the market. EMC is still a good bit larger than us in storage market share, but EMC is deploying lots of different architectures. They’ve got Enginuity for Symmetrix and separate architectures for Clariion/Celerra, Data Domain and Isilon. If you look at market share by architecture, NetApp is No. 1, EMC is No. 2, 3 and 4, and then NetApp is No. 5 with the E-Series.

Even though EMC is selling more storage, it’s not the best building block for infrastructure because you always have to stop and say, ‘Wait a minute, did I want it to be Celerra style, VNX style, VNXe style, VMAX style or Isilon style?' All of those are different things.

We have the E-Series for extreme cases. We say the E is for Extreme -- super-high bandwidth, Teradata-style storage. But in general, we tell them, 'Ontap’s the answer.' It’s a simple answer.

NetApp boasts of its efficiency with VMware. Now you have vendors building virtual machine storage systems from the ground up. Are they a threat to NetApp?

Hitz: Around 2007, we made a big bet on VMware. We restructured our company around it. We’ve been doing VMware for a long time now. We’ve partnered effectively with VMware, more so than EMC’s own storage people. I don’t know what new guys are doing that’s designed from the ground up, but we have years of maturing in that space.

NetApp was the first to offer deduplication for primary data. Are most of your customers using it yet?

Hitz: We took a look at the adoption rate of dedupe around the world. The worldwide average penetration of dedupe is about 25%, but by geographies, some are enormously higher. In Denmark, around 65% of all NetApp capacity installed was deduped. Nebraska was 2%. I started thinking, ‘What’s wrong with Nebraska or special about Denmark?’ What happens is these technologies go viral in an area, and then they spread. I thought 25% worldwide, maybe that’s the best you can do. But if it’s 65% in Denmark, why not other places? It turned out, our channel partners in the Nordic region just got the religion on dedupe and made it standard. They said ‘We turn it on by default; if customers insist they don’t want it we turn it off.’ But usually they don’t ask to turn it off.

If we really want customers to take advantage, our goal needs to be to make them buy it on default. That’s where we’re headed for dedupe and compression. Software has to be smart enough to look at the use case and have more or less aggressive dedupe and the system can auto-tune.

What is the NetApp strategy toward cloud storage?

Hitz: People use the term 'cloud' in lots of ways. Sometimes people use cloud to mean they have a bunch of VMware. A lot of people mean a large-scale shared infrastructure data center. The other meaning of cloud is as a business model. Do I want to build a new data center myself or let somebody else build a data center and I’ll rent it from them? The question is, 'Who’s going to build the data center: the customer or the cloud provider?' We focus a lot on cloud providers as a business model.

When we look at people using 50 petabytes-plus, most are doing some style of cloud computing either as an infrastructure play or a more targeted way of building out an internal service.

This story was originally published on