Seagate is proposing to turn drives into object-based storage servers in massively parallel configurations, a vision it calls the Seagate Kinetic Open Storage Vision.

Today’s scale-out infrastructures are universally object based, but the legacy infrastructure used to build object stores has many layers and inefficiencies.

Legacy app servers are laden with a database, filesystem, volume manager and drivers. The storage server has a RAID controller and cache that speaks SCSI to drives.

Seagate proposes a radically stripped-down architecture that includes:

  • A new API and associated libraries.
  • An Ethernet backbone.
  • New hard, hybrid and SSD drives with an Ethernet interface that implements a key/value store interface – gets, puts, deletes – and handles block management internally.
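Such a drive would expose only a handful of verbs. Here is a minimal sketch of that interface as an in-process stand-in; the class and method names are illustrative assumptions, not Seagate's actual API, and the internal block management is reduced to a dict:

```python
class KineticStyleDrive:
    """Hypothetical stand-in for a drive exposing a key/value interface.

    A real Kinetic-style drive would speak this protocol over Ethernet and
    manage its own blocks; here a dict stands in for both, for illustration.
    """

    def __init__(self):
        self._store = {}  # key (bytes) -> value (bytes)

    def put(self, key, value):
        # The drive, not a host filesystem, decides where the bytes land.
        self._store[key] = value

    def get(self, key):
        # Returns None when the key is absent.
        return self._store.get(key)

    def delete(self, key):
        # Returns True if the key existed and was removed.
        return self._store.pop(key, None) is not None


drive = KineticStyleDrive()
drive.put(b"object-001", b"hello")
assert drive.get(b"object-001") == b"hello"
assert drive.delete(b"object-001") is True
assert drive.get(b"object-001") is None
```

The point of the narrow verb set is that everything a filesystem and volume manager used to do on the host moves behind this interface, inside the drive.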

There’s much more. Especially interesting is peer-to-peer communication among drives, enabling recovery from full or partial drive failures without involving a storage server.
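The peer-to-peer recovery idea can be sketched in a few lines: replicas are placed across drives, and when a drive fails its surviving peers repopulate a spare with no storage server in the loop. The replication factor, placement scheme and repair protocol below are all illustrative assumptions, not Seagate's design:

```python
# Each drive is modeled as a dict; peers rebuild a failed drive onto a spare.

def replicate(drives, key, value, copies=2):
    """Place `copies` replicas of a key on distinct, consecutive drives."""
    n = len(drives)
    start = hash(key) % n
    for i in range(copies):
        drives[(start + i) % n][key] = value

def recover(failed, survivors, spare):
    """Surviving peers copy every key the failed drive held onto the spare."""
    for key in failed:
        for peer in survivors:
            if key in peer:
                spare[key] = peer[key]
                break

drives = [dict() for _ in range(4)]
for k in (b"a", b"b", b"c"):
    replicate(drives, k, b"data-" + k)

failed = drives.pop(0)          # drive 0 fails
spare = dict()
recover(failed, drives, spare)  # peers rebuild it, no storage server involved

for key in failed:
    assert spare[key] == failed[key]
```

With two replicas on distinct drives, every key lost with drive 0 survives on at least one peer, so the spare ends up complete.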

Seagate cites several advantages of this architecture.

  • Lower capital expense. An entire layer of hardware and software is removed from the data path.
  • Labor. Removing the hardware layer also cuts technical support needs and the opportunity for human error.
  • Power consumption. Less hardware, less power.

Seagate also cites potential improvements to data center security.

  • Authentication. Full cryptographic authentication for servers that access the drive.
  • Integrity. Embedded checksums for commands and data.
  • Authorization. Clear roles by server as to what each application is allowed to do.
  • Transport layer security. An industry-standard TLS suite protects sensitive data and/or management commands.
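One common way to get per-command authentication and integrity of the kind listed above is an HMAC over the command bytes under a per-server secret. This is a sketch using Python's standard `hmac` library; the key handling and wire format are illustrative assumptions, not the Kinetic protocol itself:

```python
import hmac
import hashlib

SECRET = b"per-server shared secret"  # illustrative; provisioned out of band

def sign_command(command):
    """Server side: tag a command so the drive can verify sender and integrity."""
    return hmac.new(SECRET, command, hashlib.sha256).digest()

def verify_command(command, tag):
    """Drive side: constant-time compare against the recomputed tag."""
    expected = hmac.new(SECRET, command, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

cmd = b"PUT object-001"
tag = sign_command(cmd)
assert verify_command(cmd, tag)                    # authentic, intact command
assert not verify_command(b"DEL object-001", tag)  # tampered command rejected
```

A scheme like this covers the authentication and integrity bullets; role-based authorization and TLS would layer on top.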

The StorageMojo take
Improving object storage – the fastest growing form of storage today – is a worthy goal. But object storage is already way more efficient – for large data volumes, starting at ≈250TB – than RAID arrays or NAS, so major advances mean radical rethinking. Seagate’s adoption of massive parallelism is a good start.

RAID controllers were added to the storage hierarchy because they increased storage bandwidth, IOPS, capacity and resilience. But until RAID, the secular trend was putting more intelligence in drives. After a 20+ year hiatus, Seagate is resurrecting that trend.

Getting RAID controllers out of the stack reduces latency and eliminates a major cost and bug pool – a Very Good Thing. It also lets drive vendors reclaim margin lost to falling enterprise drive sales and gives them a much-needed excuse to rewrite the spaghetti code inside drives. They need both, and so does IT.

But there are concerns. Seagate has never lacked for good ideas, but execution has been uneven.

Performance is likely the #1 concern. But if servers shard large objects across many drives, performance may be preserved over GigE links while reducing cost and improving data integrity.
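The sharding idea is straightforward: split a large object into chunks, put each chunk on a different drive so reads and writes run over many GigE links in parallel, and reassemble on read. A sketch, with dicts standing in for drives; the chunk size and key scheme are illustrative assumptions:

```python
CHUNK = 4  # bytes per shard here; real deployments would use MBs per drive

def put_sharded(drives, name, blob):
    """Stripe an object across drives, round-robin, one chunk per key."""
    chunks = [blob[i:i + CHUNK] for i in range(0, len(blob), CHUNK)]
    for i, chunk in enumerate(chunks):
        key = name + b"/%d" % i
        drives[i % len(drives)][key] = chunk
    return len(chunks)

def get_sharded(drives, name, nchunks):
    """Reassemble the object by fetching each chunk from its drive."""
    parts = []
    for i in range(nchunks):
        key = name + b"/%d" % i
        parts.append(drives[i % len(drives)][key])
    return b"".join(parts)

drives = [dict() for _ in range(3)]
n = put_sharded(drives, b"obj", b"hello kinetic world")
assert get_sharded(drives, b"obj", n) == b"hello kinetic world"
```

Because each chunk lives on a different drive, aggregate throughput scales with the number of drive links rather than being capped by a single GigE port.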

Distributing the needed intelligence to the lowest possible level – the drive – should be more scalable and economical than current DAS-based server models. The tipping point is the value of the aggregation, caching and low-cost disk connectivity of storage servers – network bandwidth is far more costly than internal DAS bandwidth – versus the advantages of removing the storage server tier.

I’d like to see more data on that, like Seagate’s pricing for a start. But at a first approximation Seagate’s vision is promising.

Courteous comments welcome, of course. What do you think?