I wrote a short piece on ZDNet about Los Alamos National Lab’s new Cell Broadband Engine based supercomputer, Roadrunner. With ~14k v.3 Cell processors – an earlier version powers the PS3 game console – and another ~7k dual core Opterons, the Roadrunner’s ~3,250 compute nodes pack a lot of compute cycles.
The key compute element, a new version of the PS3 chip called the PowerXCell 8i, features 8x faster double-precision floating point and over 25 GB/sec of memory bandwidth. It can address 64 GB of RAM. There are four 8i’s per compute node.
Nothing I read mentioned the disk storage – until the friendly Panasas PR person suggested I talk to Larry Jones, VP Product Marketing. Panasas is providing the back end storage for Roadrunner.
I did, and here’s what I learned.
LANL storage infrastructure
LANL’s 6 supercomputers + Roadrunner share the Panasas storage through LANL-developed I/O nodes. While Roadrunner itself uses double-data-rate (DDR) 4x Infiniband for internode communication, the I/O nodes attach to Panasas through trunked GigE.
The advantage of the I/O nodes is that the entire Panasas storage pool is available to each supercomputer. Lots of bandwidth.
Roadrunner currently has about 80TB of RAM, roughly 24 GB per compute node. That works out to about 4 GB RAM per processor.
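The per-node and per-processor figures can be checked with a few lines of Python. This is only a back-of-envelope sketch using the approximate numbers quoted in the post. The two Opteron sockets per node is an assumption, inferred from ~7k Opterons spread over ~3,250 nodes, and the variable names are purely illustrative.

```python
# Approximate figures quoted in the post.
total_ram_tb = 80
compute_nodes = 3250
cells_per_node = 4

# Assumption: ~7k dual-core Opterons over ~3,250 nodes is about 2 sockets per node.
opterons_per_node = 2

# Decimal units: 1 TB = 1000 GB.
ram_per_node_gb = total_ram_tb * 1000 / compute_nodes
processors_per_node = cells_per_node + opterons_per_node
ram_per_processor_gb = ram_per_node_gb / processors_per_node

print(f"RAM per node:      {ram_per_node_gb:.1f} GB")
print(f"RAM per processor: {ram_per_processor_gb:.1f} GB")
```

That gives about 24.6 GB per node and about 4.1 GB per processor, in line with the rounded numbers above.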
The jobs these machines run are huge. A simulation can run 6 months or more. Depending on criticality a job gets checkpointed every hour or maybe once a day.
The Panasas installation at LANL, begun in 2003, is currently 2 PB. Assuming an average of 500 GB drives, that means 4,000 disk drives.
Panasas uses 5 trunked GigE links to each of the 8 controllers in a single rack. They are now in beta for 10 GigE, which will reduce the link count from 40 to 8 per rack while doubling bandwidth.
The hot rodders at LANL should like that.
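For anyone who wants to check the drive-count and link arithmetic, here is a rough sketch. The 500 GB average drive size is the guess made above, and one 10 GigE link per controller is an assumption that matches the 8-links-per-rack figure. Nominal line rates are used, not measured throughput.

```python
# Drive count: 2 PB installed, assumed 500 GB average drive (decimal units).
installed_pb = 2
avg_drive_gb = 500
drive_count = installed_pb * 1_000_000 // avg_drive_gb

# Per-rack links today: 5 trunked GigE links to each of 8 controllers.
controllers_per_rack = 8
gige_links_per_controller = 5
gige_links = controllers_per_rack * gige_links_per_controller
gige_gbps = gige_links * 1  # nominal 1 Gb/s per GigE link

# Assumption: with 10 GigE, one link per controller.
ten_gige_links = controllers_per_rack * 1
ten_gige_gbps = ten_gige_links * 10  # nominal 10 Gb/s per link

print(f"Drives: {drive_count}")
print(f"GigE:    {gige_links} links, {gige_gbps} Gb/s per rack")
print(f"10 GigE: {ten_gige_links} links, {ten_gige_gbps} Gb/s per rack")
```

Under those assumptions a rack goes from 40 links at a nominal 40 Gb/s to 8 links at a nominal 80 Gb/s, which is the doubling mentioned above.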
The StorageMojo take
Roadrunner’s 80 TB RAM is a sizable storage infrastructure in its own right. Keeping it fed and backed up is a major job.
Consumerization of IT is a common concept – but what we see here is the consumerization of HPC: Playstation CPUs; SATA drives; Linux OS; air cooling. The old model of highly customized kit for HPC is dead.
Which is a good thing for the rest of us. We get some of the smartest people in computing working on platforms that we might also use, developing applications that otherwise would never be available to the consumer market.
I’ll never run molecular dynamics codes, but maybe my kids will. After all, I can now edit feature length movies on my desktop. Who would have believed that just 20 years ago?
Comments welcome, of course. Disclosure: I did some work for Panasas last year and – who knows? – might do some more in the future. I like the team and the way they are pushing pNFS.
Well, that’s not quite the same stuff you have on your desktop. For example, the PowerXCell8i (there’s a mouthful) isn’t quite the same processor that’s in the Playstation, any more than the engine in a Corvette Z06 is the same as the engine in a Cobalt or Aveo (even though they’re all Chevy). I don’t see too many people running DDR IB in their homes, or even trunked GigE to custom disk controllers running proprietary software. Roadrunner might be running RHEL on the Opterons, but like most Linux distributions RHEL is limited to x86 so what’s on the more numerous (and computationally more important) Cells? Most likely it’s some variant of the BG/L microkernel, so I don’t think you could really say Roadrunner as a whole is running the same OS as the rest of us. BG/L isn’t exactly commodity stuff, neither is Ranger, and Baker won’t be either. There are quite a few Cray systems out there using Catamount and SeaStar, and of course there’s another company near and dear to my heart which makes not-quite-commodity HPC systems. 😉
There’s some commoditization going on, there always has been and always will be, but I think it’s less than you make out. Once you factor out the changes that are actually side effects of the shift from shared-memory systems (the most complex and expensive piece of custom stuff in the old days) to clusters, I’d say the custom/commodity balance hasn’t changed all that much. If there is a commoditization trend, Roadrunner is bucking it on the storage side by using a proprietary solution and home-grown I/O nodes instead of an open-source filesystem and COTS storage servers directly on the IB fabric.
“If there is a commoditization trend, Roadrunner is bucking it on the storage side by using a proprietary solution and home-grown I/O nodes instead of an open-source filesystem and COTS storage servers directly on the IB fabric.”
I don’t know what the I/O nodes do, but I cannot agree with the above sentence. Panasas is a turnkey appliance. One can hardly be more commoditized than that. “Proprietary” does not make something not a commodity.
Are the I/O nodes Infiniband gateways? Seems likely.
Joe.
Jeff,
True, auto engines aren’t as commoditized as CPUs.
But if you compare where HPC was 20 years ago – hand-tooled from the CPU to the OS and compilers to low-volume networks like FDDI – today’s HPC is a commodity play.
Not as commoditized as the data center, which is behind the consumer, but the trend is clear.
And who knows, maybe the PowerXCell 8i will show up in the PS4?
Robin
“Panasas is a turnkey appliance. One can hardly be more commoditized than that.”
I guess we just disagree on what “commodity” means, then. To me, a commodity is something that you can buy on an open market without needing to consider the source, like wheat or pork bellies. Secondarily, competition between practically-anonymous sources means reduced prices and trivial replacement of one supplier with another. *None* of that happens if there’s a single source, and “turnkey” has absolutely nothing to do with it. The old “pre-commoditization” HPC systems that Robin mentions were often presented as “turnkey” too. There’s nothing wrong with Panasas, I was an architect for one of their direct technological predecessors, but what they produce is not a commodity. It’s a specialized product that even HPC buyers consider high-priced, and if you knew HPC buyers like I know HPC buyers you’d find that pretty noteworthy.
“If you compare where HPC was 20 years ago – hand-tooled from the CPU to the OS and compilers to low-volume networks like FDDI – today’s HPC is a commodity play.”
FDDI was more of a commodity than IB is. No matter whose nameplate is on an IB switch, the chips inside will all have “Mellanox” stamped on the top. At least FDDI had multiple suppliers from sand to switches. Yes, you can mix and match CPUs and operating systems and interconnects and storage more now in HPC. There’s less vendor lock-in across the entire system, and that’s one aspect of commoditization. If you look at the markets individually, though, only some of them have been commoditized. The OS on the Opterons is a case in point, and parallel filesystems are heading that way because there are already two significant open-source competitors (Lustre and PVFS) and open-source pNFS coming Real Soon Now. On the other hand, a custom processor running a specialized OS on a custom blade connected via a single-source interconnect to a home-grown I/O node bridging to a proprietary storage system doesn’t say “commoditization” to me.
Nice analysis, and thanks for all the info on the IO complex.
As for commoditization, I think it all misses the point. The real commodities are DRAM die and silicon wafers. I know, from having performed the experiment, that “custom” silicon can compete in HPC against mass produced processors.
The problem with building things out of “PC” components often boils down to things that LINPACK won’t show: reliability, memory bandwidth, and communication performance. Only the last of these is a strong lever against LINPACK. There’s a big difference between a component or node designed for a desktop and one designed for a 3000 node cluster. In the first case, an MTBF of longer than two years is irrelevant, in the latter it is woefully inadequate.
RoadRunner is a really interesting experiment, and it will surely succeed in its special purpose deployment. However, its balance is a little odd. With a peak compute rate of 100GF per chip, and “only” 24GB/s of memory bandwidth, the FP to MEM ratio is better than many x86 solutions, but a little light for many HPTC applications.
Disclosure: I’m Chief Engineer at SiCortex and blog at http://www.bigncomputing.org
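Two of the numbers in the comment above are easy to sanity-check. The sketch below is illustrative only. It assumes node failures are independent with a constant rate, so that system MTBF is roughly node MTBF divided by node count, and it uses the 100 GF and 24 GB/s figures exactly as quoted.

```python
# Reliability: a 2-year node MTBF spread across a 3000-node cluster.
# Assumption: independent failures at a constant rate, so
# system MTBF ~= node MTBF / number of nodes.
node_mtbf_hours = 2 * 365 * 24
node_count = 3000
cluster_mtbf_hours = node_mtbf_hours / node_count

# Balance: quoted peak of 100 GFLOP/s and 24 GB/s of memory bandwidth per chip.
peak_gflops = 100
mem_bandwidth_gbs = 24
bytes_per_flop = mem_bandwidth_gbs / peak_gflops
flops_per_byte = peak_gflops / mem_bandwidth_gbs

print(f"Cluster MTBF: {cluster_mtbf_hours:.2f} hours")
print(f"Balance: {bytes_per_flop:.2f} bytes/flop ({flops_per_byte:.1f} flops per byte)")
```

On those assumptions, a part that fails once every two years on a desktop turns into a failure somewhere in the cluster roughly every six hours, and each chip gets about a quarter of a byte of memory traffic per peak floating-point operation.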
Actually, RHEL is available for several non-x86 systems: Cell, Power4/5/6, System z, Itanic, etc. I suspect LANL uses Rocks or CAOS or something, though.
I guess we just disagree on what “commodity” means, then. To me, a commodity is something that you can buy on an open market without needing to consider the source,
Well. Point to you.
When, and not until, pNFS is ratified and a primary part of their product, then it will be a commodity… to the extent that there are substitutes. That’s realistically 12-18 months away.
Just don’t kid yourself. The TCO of a hand-rolled Lustre cluster is astronomical… for those of us who distinguish CAPEX vs OPEX, those “open source” clusters are looking not so attractive.
While I understand the conversation was introduced as pertaining to an HPC product, some of us have production “five nine” interests in these technologies.
Joe Kraska
San Diego, CA
USA
@Jeff Darcy:
Uhhh… RHEL runs on cell processors…
http://www.linuxdevices.com/news/NS5481356092.html
Yellow Dog Linux has been running on the PS3 since release.
@jeff darcy:
Oh, and QLogic switches most definitely do NOT have mellanox stamped on the chips inside their switches. They DO however have QLogic stamped on them.
Yes, QLogic sells their own IB gear, but are you sure about what’s in all of them? According to QLogic’s own 1H08 earnings-call transcript, they’ve shipped a whopping 40K DDR ports including HCAs. According to Mellanox’s May 5 filing, QLogic accounts for 11% of their (Mellanox’s) sales. Where are those chips going? At least some of the IB gear QLogic is selling must be Mellanox’s after all, no matter what the stamp says, and they must be selling less than 40K of their own chips. Even if half of what’s left is in switches rather than HCAs, that’s not enough to make a dent in what I was saying about commoditization and competition.