Seagate’s Kinetic open storage vision

by Robin Harris on Thursday, 21 November, 2013

Seagate is proposing to turn drives into object-based storage servers in massively parallel configurations – a vision it calls Seagate Kinetic Open Storage.

Today’s scale-out infrastructures are universally object based, but the legacy infrastructure used to build object stores carries many layers and inefficiencies.

Legacy app servers are laden with a database, filesystem, volume manager and drivers. The storage server has a RAID controller and cache that speaks SCSI to drives.

Seagate proposes a radically stripped-down architecture that includes:

  • A new API and associated libraries.
  • An Ethernet backbone.
  • New hard, hybrid and SSD drives with an Ethernet interface that implements a key/value store interface – gets, puts, deletes – and handles block management internally.
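The drive-level interface in that last bullet reduces to a simple key/value protocol. Here is a minimal sketch of the idea; the class and method names are illustrative assumptions, not Seagate’s actual libkinetic API, and a real drive would speak this protocol over Ethernet while managing block placement internally:

```python
# Hypothetical sketch of a key/value drive interface (gets, puts, deletes).
# KineticDrive is an invented name for illustration, not Seagate's API.

class KineticDrive:
    def __init__(self):
        self._store = {}          # stands in for the drive's internal block layout

    def put(self, key: bytes, value: bytes) -> None:
        self._store[key] = value  # the drive decides where the bytes land on media

    def get(self, key: bytes) -> bytes:
        return self._store[key]

    def delete(self, key: bytes) -> None:
        self._store.pop(key, None)

drive = KineticDrive()
drive.put(b"user/42/avatar", b"\x89PNG...")
assert drive.get(b"user/42/avatar") == b"\x89PNG..."
drive.delete(b"user/42/avatar")
```

The point of the design is that block management – the mapping from keys to sectors – disappears behind this interface, so the filesystem, volume manager and RAID layers above it become unnecessary.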

There’s much more. Especially interesting is peer-to-peer communication among drives, enabling recovery from full or partial drive failures without involving a storage server.

Seagate cites several advantages of this architecture.

  • Lower capital expense. An entire layer of hardware and software is removed from the data path.
  • Labor. Removing the hardware layer also reduces technical support needs and the opportunity for human error.
  • Power consumption. Less hardware, less power.

Seagate also cites potential improvements to data center security.

  • Authentication. Full cryptographic authentication for servers that access the drive.
  • Integrity. Embedded checksums for commands and data.
  • Authorization. Clear roles by server as to what each application is allowed to do.
  • Transport layer security. An industry-standard TLS suite protects sensitive data and management commands in transit.
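The authentication and integrity bullets amount to each authorized server holding a key and each command carrying a MAC over both the command and its data. A minimal sketch of that idea, using HMAC-SHA256 – the message layout here is my assumption, not the Kinetic wire format:

```python
import hmac
import hashlib

SHARED_KEY = b"per-server-secret"   # provisioned to an authorized server

def sign_command(op: bytes, key: bytes, value: bytes = b"") -> bytes:
    # Embedded checksum/MAC covering both the command and the data.
    msg = op + b"\x00" + key + b"\x00" + value
    return hmac.new(SHARED_KEY, msg, hashlib.sha256).digest()

def drive_verifies(op: bytes, key: bytes, value: bytes, tag: bytes) -> bool:
    # The drive recomputes the MAC and rejects any tampered command or payload.
    return hmac.compare_digest(sign_command(op, key, value), tag)

tag = sign_command(b"PUT", b"obj1", b"payload")
assert drive_verifies(b"PUT", b"obj1", b"payload", tag)
assert not drive_verifies(b"PUT", b"obj1", b"tampered", tag)
```

Because the MAC is checked at the drive, a compromised intermediate host can’t silently rewrite commands or data on their way to the media.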

The StorageMojo take
Improving object storage – the fastest growing form of storage today – is a worthy goal. But object storage is already way more efficient – for large data volumes, starting at ≈250TB – than RAID arrays or NAS, so major advances mean radical rethinking. Seagate’s adoption of massive parallelism is a good start.

RAID controllers were added to the storage hierarchy because they increased storage bandwidth, IOPS, capacity and resilience. But until RAID, the secular trend was putting more intelligence into drives. After a 20+ year hiatus, Seagate is resurrecting that trend.

Getting RAID controllers out of the stack reduces latency and eliminates a major cost and bug pool – a Very Good Thing. It also lets drive vendors reclaim margin lost to falling enterprise drive sales, and gives them a much-needed excuse to rewrite the spaghetti code inside drives. They need both, and so does IT.

But there are concerns. Seagate has never lacked for good ideas, but execution has been uneven.

Performance is likely concern #1. But if servers shard large objects across many drives, performance may be preserved over GigE links while reducing costs and improving data integrity.
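The sharding idea is simple: split a large object into fixed-size pieces and spread puts across many drives, so aggregate throughput isn’t limited by any single drive’s GigE link. A toy illustration – the shard size and hash-based drive selection are my assumptions, not part of any Kinetic spec:

```python
import hashlib

SHARD_SIZE = 1 << 20  # 1 MiB shards; a tunable, not a Kinetic constant

def shard_object(name: str, data: bytes, num_drives: int):
    """Yield (drive_index, shard_key, shard_bytes) for each piece of the object."""
    for i in range(0, len(data), SHARD_SIZE):
        shard_key = f"{name}/shard/{i // SHARD_SIZE}"
        # Pick a drive by hashing the shard key; puts to distinct drives
        # can then proceed in parallel over separate Ethernet links.
        drive = int.from_bytes(
            hashlib.sha256(shard_key.encode()).digest()[:4], "big") % num_drives
        yield drive, shard_key, data[i:i + SHARD_SIZE]

obj = b"x" * (3 * SHARD_SIZE + 5)
shards = list(shard_object("video123", obj, num_drives=16))
assert len(shards) == 4
assert b"".join(s for _, _, s in shards) == obj
```

With shards spread over 16 drives, a reader can fetch pieces concurrently; per-shard checksums would also localize corruption to a single piece rather than the whole object.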

Distributing the needed intelligence to the lowest possible level – the drive – should be more scalable and economical than current DAS-based server models. The tipping point is whether the value storage servers add – aggregation, caching and low-cost disk connectivity, since network bandwidth is far more costly than internal DAS bandwidth – outweighs the advantages of removing the storage server tier.

I’d like to see more data on that, like Seagate’s pricing for a start. But at a first approximation Seagate’s vision is promising.

Courteous comments welcome, of course. What do you think?


Anonymous November 22, 2013 at 8:24 am

This seems like a great way to lock people into Seagate at a very low level.

Brent Welch November 29, 2013 at 9:22 pm

The move toward a higher-level interface to a disk started in the 1990’s with Garth Gibson’s NASD (Network-Attached Secure Disk) work, and that evolved into an ANSI T10 standard command set for iSCSI called OSD, for Object-Based Storage Devices. Panasas, founded by Gibson, created a high-performance POSIX file system layered over OSD, built from “blades” containing two high-capacity SATA drives, an SSD, and an Intel motherboard. So, this was an Ethernet-based, Object Storage interface built into a useful system by combining it with HA features, scalable metadata, and a file system interface. They also worked to standardize that client in the NFS4.1 standard, pNFS.

Meanwhile, the web 2.0 folks created services with an HTTP protocol for a similar “object storage”. Those services feature replication, web-based access, account management, and a billing model. There are now a few HTTP-based object protocols, and it appears that Seagate has created yet another one.

While I applaud the approach of building a higher-level building block by a mass producer, I want to emphasize that a pile of bricks doesn’t self-assemble into a wall, or house, or patio. To get to where customers want to go, a non-trivial amount of infrastructure will need to be layered over these storage devices. The web 2.0 guys put a dirt-simple HTTP interface over a sophisticated back-end. The interface is so simple it doesn’t really matter, as long as the back-end is super robust. Similarly, PanFS put a fairly sophisticated POSIX file system in front of OSD, and enabled high performance file systems for super computers. Seagate still has a long way to go.

Jason Ozolins November 29, 2013 at 10:09 pm

A few things come to mind:
I’m not convinced that this drive-local object storage is a better overall architecture than low powered, dense microservers with faster Ethernet interfaces and multiple drive connections offering up exactly the same object storage interface and network technology.
But from Seagate’s point of view, Kinetic-style object storage seems to be a very good match for the shingled recording drives they want to sell. Seagate has been granted a patent (8,482,874) on presenting shingled drives as arrays of tapes, or as a WORM or CD drive; but it’s easy to see that the complete abstraction of object placement in the Kinetic API allows a drive manufacturer to use append-or-wipe style allocation on shingled recording zones without having to expose any details of the shingled zones’ sizes, locations and wipe/reset mechanisms to a block storage client. Garbage collection on shingled zones with mostly dead objects would remove the need for slow mid-shingle data rewrites.

But I find one part of this article pretty worrying. Frankly, if Seagate really need the incentive of a new, higher-margin storage product to motivate them to clean up their “spaghetti code” firmware, rather than, say, any obligation to their existing Enterprise SAS/SATA customers, then they should make that clear on public roadmaps so that those customers can plan for the future.
I work at a site which deployed 50 JBODs totalling 1200 Barracuda ES.2 1TB drives in 2009, and we saw significant firmware-related issues in this Enterprise SATA variant of the Barracuda 7200.11 drive. Failing drives would go offline and come back online so chaotically as to cause the SAS HBA firmware/drivers to lose access to whole JBODs, or create new device entries for the existing drives. Replacing failed drives became an involved and somewhat stressful exercise instead of a routine maintenance operation.
From that experience, if the only way to get better-architected, systematically cleaned up drive firmware is for customers to bend their requirements to suit this vendor’s wishes, then I hope that the other big player has a better attitude towards supporting their existing storage markets, and that this competition provides some motivation to Seagate. I do hope that reliable, well-behaved block storage devices still have a place in Seagate’s future product lineup.

Tim Wessels January 4, 2014 at 10:42 am

Well, you seem to overlook the significance of Seagate Kinetic in the first sentence. Kinetic HDDs eliminate storage servers and other cruft like POSIX and RAID. Kinetic dis-intermediates the relationship between applications and the disk storage they use. What is innovative about Seagate Kinetic is the combination of Ethernet AND a key/value API. The “connection layer” between applications and the Kinetic HDDs is entirely open, as it consists of the Seagate LibKinetic API (which is being open-sourced by Seagate), Google Protocol Buffers, and TCP/IP over GbE. Curiously, LibKinetic is the first bit of software ever open-sourced by Seagate. It is unlikely that Seagate will charge a premium for their Kinetic HDDs as they are using the existing Terascale (formerly Constellation CS) line with slight modifications. Prices for Kinetic “trays” have yet to be announced by a group of manufacturers which includes Xyratex (purchased by Seagate), Newisys, Supermicro and Dell. For applications that use object storage, the Kinetic Open Storage Platform is breaking new ground, but the proof will be who decides to deploy it. To date, Basho (Riak/Riak CS), SwiftStack and Scality (RING) are making their object storage software work with Kinetic HDDs. Seagate will have a Kinetic SDK, which includes a 4-Kinetic HDD enclosure, available in Q1 2014.

Patrick February 4, 2014 at 11:33 am

This is apparently an attempt by a hard drive manufacturer to lock in customers at the “software level”, which has happened many times in history. However, in this particular case, it will probably be very hard for Seagate. It is not a software company, and it does not know much about cloud infrastructure; a lot of design choices for Kinetic have already gone in the wrong direction. Most importantly, it does not even have a real demo to show, but already cannot wait to create a buzz.

Robin Harris February 10, 2014 at 1:28 pm

Patrick, how is Seagate’s software-level lock-in any different than any other company’s efforts to lock customers in? We may not like it, but a degree of lock-in is an almost inevitable result of having stored data, especially when you’re talking about petabytes. If the spec is open, what’s the problem?


Charles March 8, 2014 at 9:31 am

Is anyone questioning the wisdom of placing a completely flat architecture at the lowest level? 5000 hard drives with minimal computation power is a lot different from 100 powerful servers each with 50 dumb hard drives on SAS. Until Seagate has a large-scale demo with at least a few thousand drives, I am not convinced.
