“One infrastructure to rule them all” discussed the emerging enterprise need for a single, scalable file storage infrastructure. But what infrastructure?
Block and file
For decades direct-attached block-based storage was the only option. The ’80s introduced file-based storage. Much of storage systems growth in the last 15 years has been in file servers.
New systems, be they video, sensor or social, are producing massive collections of files at an accelerating rate. The rapid development of lower-cost mobile computing devices – smartphones, iPads, netbooks and Android tablets – means that content consumption and production will be a major source of file growth. The long tail of content demand means that the variety of online content will grow – especially as the cost of storage declines.
The larger issue is the need to keep this fast-growing information online for years, despite rapid change in the underlying storage, network and computing infrastructures. File data must become independent of our storage and server choices.
As stores grow, data migration becomes less feasible. Rip ‘n replace gives way to in-place upgrades.
Achieving that means moving to an object storage paradigm. How do we know this will happen? Because it already has.
Object stores at Google and Amazon Web Services are already among the largest storage infrastructures in the world. AWS alone stores over 100 billion objects today. Hundreds of millions of people use object storage every day – and don’t even know it.
What is object storage?
Object storage instantiations vary in detail and supported features. However, all object storage has two key characteristics:
–Individual objects are accessed by a global handle. The handle may, for example, be a hash, a key or something like a URL.
–Extended metadata. The extended metadata content goes beyond that of traditional file systems and may include additional security and content validation as well as presentation, decompression or other information relating to the content, production or value of the enclosed file.
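To make those two characteristics concrete, here is a minimal in-memory sketch. It is not any vendor's implementation: the `ObjectStore` and `StoredObject` names are made up for illustration, and using a SHA-256 content hash as the handle is just one of the options mentioned above.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    # Extended metadata: free-form key/value pairs that travel with the data.
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Toy object store (hypothetical): one flat namespace keyed by handle."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        # Assumption: the global handle is the SHA-256 hex digest of the content.
        handle = hashlib.sha256(data).hexdigest()
        self._objects[handle] = StoredObject(data, dict(metadata))
        return handle

    def get(self, handle: str) -> StoredObject:
        # Access is by handle only; there is no path or directory lookup.
        return self._objects[handle]


store = ObjectStore()
handle = store.put(b"hello", content_type="text/plain", owner="alice")
obj = store.get(handle)
```

A side effect of hashing the content is that the handle doubles as a content validation check, which is one reason hashes are a popular choice.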
Like files, objects contain data. But they lack key features that would make them files. They don’t have:
-Hierarchy. Not only are all objects created equal, they all remain at the same level. You can’t put one object inside another.
-Names. At least, not human-type names like Claudia_Schiffer or 2006_Taxes.
A user-facing component provides those missing elements. You decide which files belong in which folders. You give the files names. You decide which users have access to which files and what those users can do with those files.
Those choices are embedded in the object metadata so they can be presented as you have organized them. But if you have the object’s handle you can access it directly.
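A rough sketch of that user-facing layer, under some loud assumptions: the `path` and `readers` metadata fields, and the `put` and `list_folder` helpers, are invented here for illustration. The store itself stays flat; folders and names exist only in the metadata.

```python
import hashlib

# Flat store: handle -> (data, metadata). No object contains another.
objects = {}


def put(data, path, readers):
    """Store data under a content-hash handle; name and access live in metadata."""
    handle = hashlib.sha256(data).hexdigest()
    objects[handle] = (data, {"path": path, "readers": set(readers)})
    return handle


def list_folder(folder, user):
    """Present objects as named files in a folder, rebuilt from metadata."""
    prefix = folder.rstrip("/") + "/"
    names = []
    for _data, meta in objects.values():
        path = meta["path"]
        if path.startswith(prefix) and user in meta["readers"]:
            names.append(path[len(prefix):])
    return sorted(names)


taxes = put(b"1040 form data", "/finance/2006_Taxes", ["alice"])
photo = put(b"jpeg bytes", "/photos/beach.jpg", ["alice", "bob"])

print(list_folder("/finance", "alice"))  # ['2006_Taxes']
print(list_folder("/finance", "bob"))    # []
print(objects[taxes][0])                 # direct access by handle
```

Rearranging folders in this scheme means rewriting metadata, not moving data.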
All objects look alike. Some are bigger and some are smaller, but until we get them dressed and named, they aren’t files. Yet they are a lot closer to files than blocks are. Which means that if you choose to manage objects you no longer have to worry about blocks.
Essentially then, objects are files with an address – instead of a pathname – and extra metadata. That metadata travels with the object, unlike distributed file systems, where the metadata is stored in a separate metadata server that keeps track of the location of the data on the storage servers.
Some file storage systems are built on object storage repositories. Legacy APIs make file access a requirement for many applications, but URL-style access through HTTP is more flexible in the long run.
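As a small illustration of URL-style addressing, the sketch below maps a handle to a URL and back. The host `objects.example.com` and both helper functions are hypothetical; real object stores define their own URL layouts.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit

BASE_HOST = "objects.example.com"  # hypothetical endpoint, not a real service


def handle_to_url(handle):
    """Build an HTTP(S) address for an object from its handle."""
    # Percent-encode everything, including "/", so the handle stays one path segment.
    return urlunsplit(("https", BASE_HOST, "/" + quote(handle, safe=""), "", ""))


def url_to_handle(url):
    """Recover the handle from an object URL built by handle_to_url."""
    return unquote(urlsplit(url).path.lstrip("/"))


url = handle_to_url("ab/c d")
print(url)                 # https://objects.example.com/ab%2Fc%20d
print(url_to_handle(url))  # ab/c d
```

Any HTTP client, on any platform, can fetch an object addressed this way without mounting a file system first.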
Crossing the implementation chasm
While the economics of objects are obvious at scale, they are less compelling at the beginning of a typical enterprise project. It is easier to buy another file server than to worry about long-term architecture.
Here’s a rough diagram of the relative scalability of storage options:
When under-12-month paybacks are expected, who will buy an object storage infrastructure? The simple answer is that as object stores become better known and startup costs are reduced, more companies will buy them. Archives will be the first market. The longer answer is that as public cloud projects are brought inside, object stores will receive them.
The StorageMojo take
As organizations amass large file collections, the economies of scale and management for object storage will become apparent. Savvy architects will add commodity-based scale-out object storage to their tool kit.
HDS, NetApp and HP have recently added modern object stores to their product lines. And rumor has it EMC will too, either by getting Atmos to work or by buying Isilon.
Courteous comments welcome, of course. Still don’t like the name object, but I’ll get over it.