IDC has recently been promoting the idea of file- and object-based storage as a single segment. While market segmentation is more art than science, putting file and object storage in the same segment obscures the true shape of the storage industry.
Why this matters
Done well, market segmentation reveals the underlying dynamics of marketplace activity. Who buys what? How do they acquire it? What are the margins? What problems are they addressing?
There are many ways to segment markets: price, technology, customer type, distribution, and more.
For example, segmenting by price bands can reflect channel realities. Products under $50,000 are rarely sold by a direct sales force because salespeople are too expensive, so those products go through the channel instead.
For Seagate and Western Digital, this means that their burgeoning storage system businesses can move much further upmarket. EMC and NetApp can’t retaliate because their expensive direct sales forces cannot compete in the under-$50,000 price band – as NetApp discovered with its StoreVault failure.
That strategic reality is only visible when you correlate the segments with the underlying sales economics. And that is why IDC’s conflation of file and object storage is a disservice to the industry: it conceals rather than reveals key market forces.
The rise of object storage
Gartner recently estimated that the major cloud service providers have spent more than $50 billion on cloud infrastructure in the last eight years. That figure is undoubtedly low – it leaves out the substantial private cloud spend – but it reveals the rise of an enormous storage market that has been outside the purview of established storage companies.
As StorageMojo noted over seven years ago in So Mr. Tucci, Where Are EMC’s Google Application Notes?:
I’ve been puzzled for years over why cheap, high volume storage hasn’t made it into the data center as so many other high volume consumer technologies have. In Google I think I have my answer: it has, using hardware so cheap that the people who build it can’t afford slick “application notes”, big user groups, fat contracts for the “independent” analysts and four color ads in all the IT publications. Not to mention that Google has no incentive to give their secrets away.
One of Google’s secrets? Object storage in the Google File System. Every other major cloud storage provider has followed their lead.
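For readers who haven’t worked with one: an object store replaces the hierarchical tree and open/seek/rewrite semantics of a file system with a flat namespace of named, variable-length blobs behind a handful of verbs. Here is a minimal sketch of that interface – the class and method names are illustrative only, not GFS’s or any vendor’s actual API:

    class ObjectStore:
        """Toy in-memory object store: a flat namespace of whole-object blobs."""

        def __init__(self):
            self._objects = {}  # key -> bytes; no directories, no inodes

        def put(self, key, data):
            # Whole-object write: no seek, no partial in-place update.
            self._objects[key] = data

        def get(self, key):
            return self._objects[key]

        def delete(self, key):
            del self._objects[key]

        def list(self, prefix=""):
            # "Folders" are just a naming convention on flat keys.
            return sorted(k for k in self._objects if k.startswith(prefix))

That whole-object, flat-namespace model is a large part of why cloud providers can scale to billions of objects on cheap hardware: there is far less shared state to coordinate than in a POSIX file system.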
The StorageMojo take
Public and private cloud object storage is redefining enterprise storage – including file servers – and doing it outside the channel and pricing models – 60%+ gross margins – of the current leaders in block and file storage. Munging current file storage sales and vendors with object storage sales does almost everyone a disservice – except those vendors who want to be part of a discussion they’ve mostly ignored for the last 7 years.
Object storage differs from file storage in many important ways, just as file storage differs from block storage, even though all three store files. Due to its pivotal role in the cloud upheaval, object storage is arguably the most important of the three.
I’d like to hear from IDC on their segmentation rationale. It may be as simple as manpower – their analyst ranks have been decimated in the last 10 years – or maybe it’s an honest judgment call.
Courteous comments welcome, of course. I’ve just written a short white paper for DDN intended to introduce senior managers to object storage. If you are evangelizing object stores in your company, you should find it useful.
I think IDC got the segmentation right. Object semantics are pretty much a subset of file semantics. This places object-storage and file-storage systems/vendors in direct competition for each other’s customers. Yes, full filesystem semantics can be a within-category differentiator. So can tiering, erasure coding, retention/deletion policies, dedup, crypto, multi-DC capability, and so on. Still, file and object storage have much more in common with each other than either has with block storage. Customers see that. To them, two ways of organizing millions (to billions, or in S3’s case trillions) of named variable-length objects are sufficiently similar and interchangeable that they constitute a single market segment.
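(To make the subset claim concrete, here is a rough sketch of the same object verbs expressed directly as POSIX file operations – the layout and names are mine, not any shipping product’s. Everything the object interface promises maps onto a strict subset of file semantics; what object stores drop is partial overwrite, append, rename and locking.)

    import os
    from pathlib import Path

    class FileBackedObjectStore:
        """Object verbs implemented on an ordinary file system."""

        def __init__(self, root):
            self.root = Path(root)
            self.root.mkdir(parents=True, exist_ok=True)

        def put(self, key, data):
            path = self.root / key
            path.parent.mkdir(parents=True, exist_ok=True)
            tmp = path.with_name(path.name + ".tmp")
            tmp.write_bytes(data)   # whole-object write...
            os.replace(tmp, path)   # ...published atomically, as object PUTs are

        def get(self, key):
            return (self.root / key).read_bytes()

        def delete(self, key):
            (self.root / key).unlink()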
In your white paper you had two nearly identical sections about the DDN product not requiring an underlying file system and thus having a 512-byte block granularity. Most hard drives sold today have a default sector size of 4096 bytes instead of the old 512-byte size. You might want to mention this.
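(For anyone who wants to check what their drives actually report, here is a minimal Linux-only sketch that reads the kernel’s sysfs entries – the device name “sda” is a placeholder:)

    from pathlib import Path

    def sector_sizes(dev="sda"):
        """Return (logical, physical) sector sizes the kernel reports via sysfs."""
        q = Path("/sys/block") / dev / "queue"
        logical = int((q / "logical_block_size").read_text())
        physical = int((q / "physical_block_size").read_text())
        return logical, physical

    print(sector_sizes())  # e.g. (512, 4096) on a 512e Advanced Format drive

Advanced Format “512e” drives typically report a 512-byte logical sector emulated on a 4096-byte physical sector, which is why software that assumes 512-byte granularity still works, if not always optimally.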
So far, almost all object storage is in the cloud. I met with you a couple years ago in Las Vegas and showed you a very early version of my new “Didget Management System” which is designed to bring object storage to local hard drives and flash drives in addition to all those cloud storage systems. While progress has been slow since I just work on the project in my spare time, the system is now light years ahead of that first demo. I hope to introduce it to the market soon (probably through a Kickstarter campaign).
I’m surprised you would say that NetApp cannot compete in the under-$50K price segment. 2000-series FAS filers are a great deal for the price, and are well south of $50K. VNXe systems can’t yet touch them. Who even offers a similar product with similar features at that price?
Mr. Harris – My response here:
http://goo.gl/W1oKrv
– Disclosure: NetApp employee; these opinions are my own –
Robin,
part of my job is to review the IDC figures for NetApp in ANZ and report back to management on the implications of that data, so I come to this with a certain amount of experience.
In some respects I do agree with you. For example, I’ve never thought that Centera or DataDomain belonged in the NAS category. From my perspective, when people talk about NAS they intuitively mean something that serves out group shares and home directories.
Having said that, if you’re going to push this argument, you’d probably need to ditch the NAS/SAN/object categories entirely and segment the market by workload. For example, when you use NFS to host an Oracle database, or SMB3 to host Hyper-V, you could reasonably argue that those are really “SAN” workloads. This trouble with taxonomies built on technical implementations is partly why even Dave Hitz was fuzzy on the definitions, as he wrote in a blog post back in 2007:
https://communities.netapp.com/community/netapp-blogs/dave/blog/2007/04/19/is-iscsi-san-or-is-iscsi-nas-i-don-t-know
The trouble is that getting data on how customers actually use their storage is challenging. Even when you do have a good picture through things like NetApp’s auto-support, the metrics IDC could get from surveys and vendor-supplied data would be even more subject to argument than their current taxonomy.
When I review the IDC data, I do additional research and spot checks, along with applying my own judgement and other analyst data, but even so I always start with the IDC data, especially at the aggregated broad upper layers where you can see the forest for the trees. As you get more and more detailed (slicing up by submarket, geography, protocol, etc.), the data gets fuzzier and relatively small variances can skew the figures; you can even start to see where IDC appears to be making assumptions rather than gathering and reporting raw data. Even then, I trust that they’ve done as thorough a job as I would have done with the same information at my disposal, and that’s all I can ask.
Even with its flaws, the IDC data highlights the big trends, and where it doesn’t do so directly, it gives enough other data for me to form my own taxonomies (which I do). For that reason, I think they offer a very valuable service, and from my personal perspective, I think you’ve probably been a little too harsh here.
On the other hand, it’s worth noting that when a vendor starts crowing about being #1 in a given IDC category, you really need to be aware that IDC’s definition of that category may not agree with your own, and that statistics can be used in many “creative” ways. As careful as IDC is with people using and reporting their data, there is always the age-old problem: there are lies … there are damn lies … and then there are statistics.
Regards
John Martin