Deep file analytics

by Robin Harris on Friday, 5 September, 2014

A new storage market is being born. Will it survive?

As infrastructure continues to adjust to a data-centric world, the ability to manage data – not just storage – is poised to become a must-have capability. Traditionally, of course, data management has meant databases. But file data – confusingly called unstructured data – is by far the predominant business data type today.

Reading the tea leaves
The recent launch of DataGravity – which performs deep file inspection in its array – and a discussion with Quaddra Software’s John Howarth and Marc Farley suggest a mini-wave of activity in this area.

This is not a new idea. I talked to a startup a decade ago that proposed to do the same thing across an entire LAN – and then went nowhere.

ZL Technologies is 15 years old, prospering in the enterprise archive space. Their web site says:

ZL Technologies’ Unified Archive® utilizes a unique, unified architecture that breaks down data siloes in favor of one robust, centralized repository for managing all enterprise unstructured data and performing records management, eDiscovery, and compliance functions.

The thread
All three companies handle both metadata and content. Users can sort by file type – PDFs, MP3s – as well as by content – social security or credit card numbers – within those files.
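A minimal sketch of that kind of classification – assuming simple regex detectors, which is far cruder than what products like DataGravity or ZL actually ship (real tools add Luhn checks, OCR, format parsers, and so on):

```python
import re

# Illustrative patterns for the content types mentioned above:
# US social security numbers and credit card numbers.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def classify(filename, text):
    """Return (file type, list of sensitive-content tags) for one file."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    tags = []
    if SSN_RE.search(text):
        tags.append("ssn")
    if CARD_RE.search(text):
        tags.append("credit-card")
    return ext, tags
```

With an index of such (type, tags) pairs per file, sorting by type or surfacing every file that contains a credit card number becomes a trivial query – the hard part, as the companies above know, is doing this at scale across billions of files.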

The use cases are similar as well. DataGravity and ZL handle legal discovery directly, while Quaddra is looking to empower others to deliver that service. Chargeback is another common use.

An open question of growing importance is the data generated outside of corporate storage: tweets; Facebook entries; IMs; voicemails; and other types and formats that may not exist today. And you thought email was hard.

The StorageMojo take
What makes business markets take flight? I like the pain theory: when enough people feel the pain AND that pain is high relative to other pains, then people seek relief.

We’re reaching that point with deep file analytics. E-discovery is one driver. Sheer volume of data is another, which, in part, drives chargeback.

The bottom line: with commodity and virtualized compute, cheaper networks, security issues and the newly visible costs of storage – thanks, AWS! – the pain of storage, which was always there, is now more visible and higher relative to other pains.

There will be backing and filling over the next 5 years, but deep file analytics is here to stay.

Courteous comments welcome, of course. Agree or disagree as you like – just tell me your reasons.

Comments

John (other John) September 12, 2014 at 3:35 pm

This is one area where I almost wish Microsoft would revert to an aggressive “embrace, extend, extinguish” business model. They seem to play a very gentle game of footsy with the companies that provide the tools to keep their OSs running, especially in the collapsed-datacenter, datacenter-in-a-box, cluster-in-a-cabinet space.

As is my focus, I’m concerned that smaller companies need these tools now. These tools, and the companies behind them, are not on the radar of smaller businesses, nor is there much economic slack to pay for deep storage analytics in a typical engagement with a smaller outfit. But most definitely, if a business owns a few filers – and I use that term broadly, not to imply NetApp or another top-tier supplier, but to include any large dedicated store – analytics can be the key to reclaiming performance. My suggestion for Microsoft: if they did provide similar tools, license them on a per-core model, because you will deploy compute resources to run them. Another layer of licensing, ugh, yes, but it’s specific enough to call it a new server edition: Microsoft Data Performance Server. Promote it to any company with large, important stores, plus the usual media-type companies, and possibly smaller “big data” shops running complex databases.

As an aside, MS have, I think, shot themselves in the foot by retiring the “Microsoft Certified Master” exams. Those were the closest they had to putting specialists in the field with credentials covering what storage specialists do. The storage specialist world needs people coming up from application domains, and especially from sectors that don’t sell that expertise – take any Microsoft Gold Partner reseller…

Generally, I simply think the sales cycle is already too complex and convoluted to tap a wider market for storage specialties. It’s a “big company advantage divide” problem.

PB May 26, 2015 at 7:48 pm

What would you consider a killer feature or a game changer for file analytics?
