I’ve liked InfiniBand ever since I learned about it at YottaYotta in 2000. The switches are fast and cheap, the latency is very low, and the bandwidth – 6 GB/sec full-duplex at 12x – is stunning. (Cisco has an excellent technical introduction here.)
One thing it didn’t do, though, was handle distance. Even fiber-based IB was limited to a few hundred meters. A great computer room interconnect, but not so good for the disaster-tolerant configurations that YottaYotta’s cluster-based RAID controller was hoping to address.
YY made do with gigE links, and managed some impressive demonstrations of terabyte long-distance data transfers. Just the thing for a long weekend at the lake.
Of course, there is a downside
InfiniBand was designed to be more a fixed resource like Fibre Channel than an easy-come, easy-go network like Ethernet. Five years ago the management was less than optimal. Some 3rd-party tools were available from Voltaire – hey, guess who’s going public! – but most folks ended up writing their own management tools. But if you want an “always on” network, this isn’t a big problem.
Ideally, InfiniBand would at least offer metro area networking for redundancy. I don’t think you can buy it yet, but long-haul I-band may be coming.
Enter Obsidian Research
Meanwhile, up in northern Alberta, one of YY’s former whizzes, David Southwell, formed Obsidian Research, dedicated to taking I-band long-haul. The company says:
Longbow XR allows arbitrarily distant InfiniBand fabrics to communicate at full bandwidth through 10Gbits/s Wide Area Networks. The WAN connection is managed out of band, and except for flight time induced latency is transparent to the InfiniBand hardware, stacks, operating systems and applications.
XR achieves flow control by shaping WAN traffic and managing buffer credits to ensure extremely high efficiency bulk data transfers — including RDMAs — making the system a highly effective transport mechanism for very large data sets between geographically separated InfiniBand equipment.
In switch mode, Longbow XR looks like a 2-port switch to the InfiniBand subnet manager. A point-to-point WAN link presents as a pair of serially connected 2-port InfiniBand switches spanning the conventional InfiniBand fabrics at each site. A single subnet spans the Wide Area Network connection, unifying what were separate subnets at each site.
Longbow XR also provides an InfiniBand router mode — improving global system manageability, scalability and robustness. In this mode, each site remains a separate subnet, with an independent subnet manager, easing possible security and performance concerns associated with remote subnet management. 4x SDR InfiniBand provides just 8Gbits/s of data payload bandwidth; two totally independent Gigabit Ethernet links are also encapsulated across the WAN link to make full use of the extra bandwidth.
Longbow XR communicates over IPv6 Packet Over SONET (POS), ATM, and 10Gb Ethernet, as well as dark fiber applications.
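Those figures are easy to sanity-check. Here is a rough sketch in Python; the per-lane signaling rate, the 8b/10b encoding overhead and the 100 km example span are standard InfiniBand SDR numbers plus my own assumptions, not anything taken from Obsidian’s description.

# Back-of-the-envelope check of the Longbow XR bandwidth figures.
# Assumptions: standard 4x SDR InfiniBand (2.5 Gbit/s signaling per lane,
# 8b/10b line encoding) and a hypothetical 100 km span; not Obsidian's data.

LANES = 4
SIGNALING_PER_LANE_GBPS = 2.5      # SDR signaling rate per lane
ENCODING_EFFICIENCY = 8 / 10       # 8b/10b: 8 data bits per 10 line bits

ib_payload_gbps = LANES * SIGNALING_PER_LANE_GBPS * ENCODING_EFFICIENCY   # 8 Gbit/s
wan_gbps = 10.0                    # nominal 10 Gbit/s WAN link
spare_gbps = wan_gbps - ib_payload_gbps                                   # 2 Gbit/s left for the two GigE links

# Why buffer-credit management matters: keeping the pipe full needs at least
# bandwidth x round-trip time of buffering, far more than an ordinary IB switch has.
span_km = 100
rtt_s = 2 * span_km * 5e-6         # ~5 microseconds per km of fiber, each way
buffer_bytes = ib_payload_gbps * 1e9 / 8 * rtt_s

print(f"4x SDR payload: {ib_payload_gbps:.0f} Gbit/s, spare WAN capacity: {spare_gbps:.0f} Gbit/s")
print(f"Buffering needed to fill a {span_km} km link: ~{buffer_bytes / 1e6:.1f} MB of credits")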
Southwell is one of the smartest hardware engineers I’ve ever worked with. If he says he can do this, I’m willing to believe he can, given enough time. And if he’ll stop “improving” it and just ship.
The StorageMojo take
I-band has knocked about the industry for some time, a solution looking for that special problem that would provide volume and profits. With the growth of clusters – compute and storage – I believe it has found its niche. Long-haul I-band doesn’t solve distance latency problems, but it sure can move boatloads of data. As Google and others reach for 100x scaling, long-haul I-band could be a helpful tool.
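To put the latency point in rough numbers (the ~5 µs/km figure is the usual rule of thumb for light in fiber; the distances are just illustrative):

# One-way flight time through fiber; no amount of bandwidth makes this go away.
US_PER_KM = 5.0   # ~5 microseconds per km (refractive index ~1.5), rule of thumb

for km in (10, 100, 1000, 4000):
    one_way_ms = km * US_PER_KM / 1000
    print(f"{km:>5} km: ~{one_way_ms:.2f} ms one way, ~{2 * one_way_ms:.2f} ms round trip")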
After seeing that someone linked to this year-old post, I took a look and discovered some needed edits and broken links, which I’ve fixed.
Comments welcome, of course. What is the state of InfiniBand today?
I was really pleased to see this post.
InfiniBand has been a favorite of mine since I first discovered it. I learned my lesson about “holding my breath” waiting for its arrival from HIPPI.
By the time the political battles cleared over HIPPI, SCSI was faster.
For years I put aside all the designs I had for using “InfiniBand on the inside and Broadband on the outside” while the I-band political wars raged and B-band matured. Looks like they may be permanently on the shelf.
Perhaps I am missing something. A few hundred meters should be more than enough for I-band to do its work. At least the work I had in mind for it.
My plan was to use I-band as a giant “TOE” (TCP offload engine) so the stack could service B-band traffic requests. If you have a local, low-cost “data pump,” then I-band can handle all the local information requests, on demand.
The B-band should handle the long haul just fine. The devil is in the information delivery design. There is a trade-off between network gear cost and I-band cost; long-haul I-band might be more cost-effective here.
Now there is a new player in the game. After surviving all manner of political and technical assaults, I-band must now survive the “green” cost. Faster usually means more power consumption. I can recall a couple of “Dr. Who” technologies that dimmed the lights when they were turned on. The Speed Limit of the Information Universe is affected again.
I come from an HPC environment, and we’ve had a good look at the Obsidian products. My colleagues and I came away very impressed.
Where I work, we use iSCSI over IB/SDP for our parallel filesystem, and we hope to expand the availability of our filesystem from our machine room to our campus area network to other IB-enabled clusters. Our tests showed that the Obsidian gear excels in this application.
We view the capability to run parallel jobs over Obsidian-linked clusters as an interesting bonus. The effects of distance through fiber are observable with latency-sensitive applications like MPI, but it still beats the pants off of gigabit ethernet.
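A toy model of that trade-off, with assumed numbers (a 100 km campus link, ~0.5 ms of one-way fiber flight time, 8 Gbit/s of IB payload against 1 Gbit/s Ethernet; illustrative values, not measured results):

# Simple message-transfer model: time ~= flight-time latency + size / bandwidth.
# Same 100 km link carried over gigabit Ethernet vs. long-haul InfiniBand;
# small MPI messages are latency-bound either way, bulk transfers are not.

FLIGHT_TIME_MS = 0.5               # ~0.5 ms one way through 100 km of fiber

def transfer_ms(size_bytes, bandwidth_gbps, latency_ms=FLIGHT_TIME_MS):
    return latency_ms + (size_bytes * 8) / (bandwidth_gbps * 1e6)

for label, size in (("8 KB MPI message", 8 * 1024), ("1 GB bulk transfer", 1024 ** 3)):
    gige = transfer_ms(size, 1.0)  # gigabit Ethernet
    ib = transfer_ms(size, 8.0)    # 4x SDR IB payload over the WAN
    print(f"{label:>18}: GigE {gige:9.2f} ms  vs  IB {ib:9.2f} ms")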
RE: “but it still beats the pants off of gigabit ethernet”
Are you talking overall performance, price or price/performance?
The HPC environment you describe is ideally suited to longer-haul IB.
At what price though?
I am a BIG FAN of IB. I have wanted it for years.
I found your article with some good numbers at:
Obsidian Longbow Delivers InfiniBand Storage
To achieve the performance numbers mentioned in the article using gigabit, or even 10x gigabit, requires link aggregation, which can be a management nightmare. It was the only option without IB.
I was a BIG fan (still am) of the orange light in the fibre cable called DWDM.
The red, violet and blue are there too. We just liked the orange best.
We could never convince anyone below the government and mega-multinational corporations to invest in DWDM. For a while it looked like Corvus (gone and forgotten), Juniper Networks (still going strong), YottaYotta, GiantLoop, etc., would manage to pull off reasonable-cost long-haul by lighting a lot of dark fibre. Then the wheels came off.
I did have an interesting experience in 2003. I missed a job opportunity by proposing an all-“switch” design (lowest cost) for a mega-multinational corporation. Turns out they knew what network directors were and wanted THE BEST. The fastest backbone money could buy, they said.
I had never met anyone willing to spend that kind of money on a LAN. They were not an HPC environment; they were real-time. Being a “poor-boy” cost me.
Perhaps my technology is dated, but I’m glad to hear about Obsidian.
Thanks for the feedback.
RE: “For a while it looked like Corvus (gone and forgotten)”
Should read:
“For a while it looked like Corvis, which became Broadwing and was later sold to Level3 (gone and forgotten?)”
Apparently only by me. I met David Huber at a seminar one time around 2000 and he made such an impression on me that I lost a bundle in telecom.
I still like his message.
His message was that telecom was poised at the beginning of a 25-year growth cycle like the one computers had been in. I’m still not sure why it didn’t come true. It may be coming true now. I sure hope so.