Amazon Web Services architect James Hamilton has been posting on network issues for over a year and researching them much longer. As Ethernet becomes the de facto SAN technology, his views become more relevant to the larger storage market.
Part of Mr. Hamilton’s concern is the structure of the networking industry: the high margins; the dominance of a single player, Cisco; the closed technology; and the heavy vertical integration. All antithetical to the dynamics that have driven server costs down so successfully in the last 20 years.
These are issues the storage industry knows too well. But Mr. Hamilton is more concerned about the waste the current high-cost industry structure causes.
The cost of network bandwidth leads to network over-subscription. Networks are configured as tree topologies: the farther you move from the end nodes, the worse the over-subscription becomes.
As described in the 2009 Microsoft Research paper VL2: A Scalable and Flexible Data Center Network:
. . . the capacity between different branches of the tree is typically over-subscribed by factors of 1:5 or more, with paths through the highest levels of the tree oversubscribed by factors of 1:80 to 1:240. This limits communication between servers to the point that it fragments the server pool — congestion and computation hot-spots are prevalent even when spare capacity is available elsewhere.
This throttles data center performance by limiting server-to-server bandwidth, fragmenting resources and reducing network utilization. The latter reflects the redundant paths needed in case of switch failure: ≈50% or more of costly data center bandwidth goes unused.
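To see how those ratios arise, the over-subscription at a switch tier is just downstream demand divided by upstream capacity, and ratios multiply along the path between branches. A minimal sketch, with hypothetical port counts and link speeds (not taken from any particular data center):

```python
# Rough illustration of how over-subscription compounds in a tree network.
# Port counts and link speeds are hypothetical examples.

def oversubscription(down_links, down_gbps, up_links, up_gbps):
    """Ratio of downstream demand to upstream capacity at one switch tier."""
    return (down_links * down_gbps) / (up_links * up_gbps)

# Top-of-rack switch: 40 servers at 1 Gb/s sharing 2 x 10 Gb/s uplinks.
tor = oversubscription(40, 1, 2, 10)    # 2.0 -> 1:2 at the rack

# Aggregation switch: 20 such racks (20 Gb/s of uplink each)
# sharing 4 x 10 Gb/s links to the core.
agg = oversubscription(20, 20, 4, 10)   # 10.0 -> 1:10 above the racks

# Two servers in different branches see the product of every tier
# between them.
path = tor * agg                        # 20.0 -> 1:20 end to end
print(tor, agg, path)
```

Stack a couple more tiers like this and the 1:80 to 1:240 figures from the paper stop looking surprising.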
As might be expected, big Internet data centers like Amazon’s have complex and unpredictable workloads. They need lots of bandwidth between all servers all the time.
The VL2 paper describes an experimental solution to these problems that includes location-specific and application-specific addressing, multi-path traffic load balancing and a novel directory design that efficiently handles lookups and updates to network mappings.
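The key trick in the addressing scheme is separating a server's permanent application address from its current location address, with a directory service mapping one to the other. A toy sketch of that idea (my own illustration of the concept, not VL2's actual implementation or identifiers):

```python
# Toy sketch of VL2-style address separation: servers keep stable
# application addresses (AAs) while the fabric routes on
# location-specific addresses (LAs). A directory maps AA -> LA, so a
# server or VM can move (new LA) without changing its AA.
# Illustrative only; names and formats are invented.

class Directory:
    def __init__(self):
        self._aa_to_la = {}

    def update(self, aa, la):
        """Register or move a server: bind its AA to its current LA."""
        self._aa_to_la[aa] = la

    def lookup(self, aa):
        """Resolve an AA to the LA the fabric should route toward."""
        return self._aa_to_la[aa]

d = Directory()
d.update("10.0.0.5", "tor-3:port-12")   # server lives under ToR switch 3
print(d.lookup("10.0.0.5"))             # tor-3:port-12

d.update("10.0.0.5", "tor-9:port-4")    # VM migrates; AA is unchanged
print(d.lookup("10.0.0.5"))             # tor-9:port-4
```

Because senders address the stable AA, the directory update is the only state that changes on a migration — the appeal for a cloud provider shuffling VMs around is obvious.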
In a 75-node test cluster the design moved 2.75TB of data in 395 seconds – 94% of maximum network bandwidth – at a fraction of the cost of current enterprise networks. The paper calculates that a cloud-service scale network with no over-subscription could be built with commodity switches at 1/14th the cost of a traditional data center Ethernet.
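A quick back-of-envelope check on those shuffle numbers, assuming TB means 10^12 bytes and that the 75 nodes contribute roughly equally (both my assumptions, not stated in the paper):

```python
# Back-of-envelope check on the reported 2.75TB-in-395-seconds shuffle.
# Assumes TB = 10^12 bytes and equal contribution from all 75 nodes.

data_bytes = 2.75e12   # 2.75 TB moved
seconds = 395          # elapsed time
nodes = 75

aggregate_gbps = data_bytes * 8 / seconds / 1e9   # cluster-wide throughput
per_node_gbps = aggregate_gbps / nodes            # per-server average

print(round(aggregate_gbps, 1))   # ~55.7 Gb/s across the cluster
print(round(per_node_gbps, 2))    # ~0.74 Gb/s per server
```

Roughly 0.74 Gb/s sustained per server is consistent with gigabit-attached hosts running near their NIC limits for the whole run – exactly the "full bandwidth between all servers" behavior a tree network can't deliver.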
The StorageMojo take
VC and engineering dollars follow high-growth markets. What Google, Amazon and Microsoft want, they get. With the rapid growth of public cloud services the network over-subscription problem will get solved.
Merchant silicon from Broadcom, Intel and Marvell is making a tried-and-true Moore’s Law attack on hardware cost. The protocol stack is tougher, but several open-source industry initiatives are under way with support from major companies. Progress will be slower than hoped, but within 3 years we’ll have a viable stack to build on.
Where does this leave the networking industry? That depends on where you sit.
Cisco will be the biggest loser, because they’ve been the biggest winner with the current model. They may need to pull an IBM and move big into services if they want to stick around. Ironically, Cisco’s UCS product line – which bakes in the tree-structured network – has further motivated broader industry action.
The rest of the industry can go after this emerging market with a lower-gross-margin business model. Not all of them will, but it will be a critical success factor.
The big winner will be storage. Scale-out storage relies on spraying data across multiple racks for maximum availability, utilization and performance. Cheaper, faster, better scale-out networks will only drive storage demand.
For most of us this is an academic problem today. Lightly used systems – such as for backup and archiving – don’t see Amazon’s problems. But in 5 years this will be common even outside the public cloud providers.
Just as IT users have benefited from Google’s push on energy efficiency and much more, they will also benefit from much lower cost and more scalable networks.
Courteous comments welcome, of course. I can’t help but continue to marvel at how dumb Cisco’s UCS has turned out to be. It’s a gift that keeps on giving.