StorageMojo




Robin Harris    




David Caminer: app design for 1st business computer

June 29th, 2008 by Robin Harris in Enterprise

Sometimes we forget how young the computer revolution is. The death 10 days ago of David Caminer, who led the application programming for the world’s first business computer, the Lyons Electronic Office (LEO), is a reminder.

LEO performed its first business calculation - with 2,000 words of memory - on November 17, 1951, evaluating costs and margins on baked goods for J. Lyons & Company, a British chain of tea shops. Mr. Caminer was the systems analyst for the project, which grew into an early computer company that eventually became part of ICL.

From the obituary in the Independent

In 1947 a Lyons fact-finding team visited the United States to catch up on new developments in office methods. They learned for the first time about the newly invented electronic computer. No machine had yet been built, but they learned that Maurice Wilkes at Cambridge University was as far ahead as anyone in constructing a machine. On its return to England, the team made contact with Wilkes, who agreed to supply the design information to Lyons, and Lyons agreed to provide some additional finance and manpower to the project.

The Cambridge machine sprang into life in May 1949, and Lyons then proceeded to construct a copy of the machine. A Cambridge engineer, John Pinkerton, led on the hardware side, while Caminer was put in charge of application development.

As today, many early computer projects went disastrously wrong. Not so at Lyons. Although the technology was radical and innovative, Caminer’s approach to the computerisation of business processes was utterly conservative. He assumed that what could go wrong would go wrong. He therefore set out on a learning curve – computerising simple jobs first, and gradually taking on ones that were critical to the business, such as payroll and stock control. Caminer was an early advocate of management by exception, using the computer to bring critical issues to the attention of management.

Like some current computer industry luminaries, Mr. Caminer was politically active, campaigning against British Fascist Oswald Mosley in the 30s and 40s and apartheid later, welcoming Bishop Desmond Tutu to his Borough.

Read more. The New York Times obit. An appreciation from Frank Land at the Leo Computers Society web site.

The first LEO ran for over 13 years - presaging IT’s “if it ain’t broke, why fix it?” mentality.


A LEO computer [courtesy the LEO Computers Society]

The StorageMojo take
As with so many revolutionary 20th century technologies - jet aircraft, radar, antibiotics - the British had an early lead that Americans eventually erased. Arguably the British lead in commercial business computers was the largest of all.

Given Mr. Caminer’s success in bringing large IT projects in on time, we should probably be sorry that we didn’t learn more from him and his methods.

Comments welcome, of course.

Optimism and manycore computing

June 26th, 2008 by Robin Harris in Architecture, Clusters, Future Tech

The parallel computing/manycore initiatives may be missing the point. The challenge of manycore computing is to burn up as many CPU cycles as possible doing things that we don’t do today because the computational cost is too great. Making existing apps go faster is secondary.

Today’s focus on creating manycore development platforms like OS X.vi server’s Grand Central may be a subset of where the real action will be. Maybe current levels of parallelization are good enough for most apps. So what does that leave?

How else can we use manycore computing?
Some thoughts:

Application speed up. That won’t be the big win for current apps - most feel current processors are fast enough - look at the popularity of the Eee. But I’d love Handbrake to rip my DVDs faster.

Advanced UI capabilities such as voice recognition that are loosely coupled independent processes. Your application won’t run any faster, but it will be easier to use. This is an area Microsoft is looking at. Historically, the UI has been a major consumer of improved CPU and display capability.

New forms of communication and entertainment, such as 3D virtual worlds. This is an extension of the video editing market. And just think of the storage requirements!

Communities of cellular automata. One core, one or a few automata. For example, Brian Tung’s and Leonard Kleinrock’s 1996 paper Using Finite State Automata to Produce Self-Optimization and Self-Control discusses using automata to guide a group of agents to cooperate on a task in a distributed systems environment.

Optimistic computing, defined by David Jefferson in a 1990 ACM paper titled Virtual Time II: Storage Management in Distributed Simulation as:

An optimistic simulation mechanism is one that takes risks by performing speculative computation, which, if subsequently determined to be correct, saves time, but which, if incorrect, must be rolled back.
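To make that definition concrete, here is a minimal Python sketch of the speculate-then-roll-back idea. It is not Jefferson's Time Warp mechanism, just a toy: the class name OptimisticProcess, the order-dependent state update, and the event values are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OptimisticProcess:
    """Toy logical process (hypothetical). State depends on event order."""
    state: int = 0
    clock: int = 0  # local virtual time: timestamp of the latest event applied
    # One entry per applied event: (timestamp, value, state_before_event).
    log: list = field(default_factory=list)
    rollbacks: int = 0

    def _apply(self, ts, value):
        self.log.append((ts, value, self.state))
        self.state = self.state * 2 + value  # order-dependent on purpose
        self.clock = ts

    def receive(self, ts, value):
        if ts >= self.clock:
            # Optimistic path: gamble that no earlier event is still in flight.
            self._apply(ts, value)
            return
        # The gamble failed: a straggler arrived. Undo every later event...
        self.rollbacks += 1
        undone = []
        while self.log and self.log[-1][0] > ts:
            later_ts, later_value, state_before = self.log.pop()
            self.state = state_before
            undone.append((later_ts, later_value))
        self.clock = self.log[-1][0] if self.log else 0
        # ...then apply the straggler and redo the undone work in time order.
        self._apply(ts, value)
        for later_ts, later_value in reversed(undone):
            self._apply(later_ts, later_value)


proc = OptimisticProcess()
for ts, value in [(1, 1), (3, 3), (2, 2)]:  # event 2 arrives late
    proc.receive(ts, value)
print(proc.state, proc.rollbacks)
```

When events arrive in order the process never pays for the saved history. The late event at time 2 costs one rollback, after which the state matches what strict in-order processing would have produced.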

Update: Rethinking virtualization because once a core costs $3 and you’ve got 32 or 64 of them in a $2k server, why would you spend hundreds of dollars on software to create virtual machines when you’ve got dozens of real ones?

There’s value in easy migration of virtual machines from one physical server to another. A “thin” virtualization layer atop a manycore OS - Windows 7? - could enable Microsoft to take back VMware’s market cap and reassert control of the entire OS stack.
End update.

High desert optimist
Many performance enhancements already use optimistic concepts. But the ability to throw massive computes from networks on a chip - oh, and how about reconfiguring those on-chip networks on the fly - could take us in directions we, or at least I, can’t imagine.

The StorageMojo take
The first effort with any new technology is to recreate what you could do with the old technology. It is only with the 2nd generation that the truly innovative stuff enabled by the new technology gets built.

Consider this an effort to short-circuit that historical process.

Comments welcome, of course. Thanks to Prof. West for pointing out the Jefferson paper to me.

IT is a factory; the Web is a playground

June 24th, 2008 by Robin Harris in Architecture, Enterprise

Over on O’Reilly radar, Nat Torkington does a neat riff on the enterprise SOA movement. He likens enterprise IT to a stern father:

. . . with strict rules, transgressors to be punished;. . .

while the Web is:

. . . the nurturing parent (the API provider) who encourages experimentation, self-development, and happiness.

It is an amusing read, but like lots of developers and engineers, Nat misunderstands enterprise IT’s motivation. They aren’t into control for the sake of control. (Well, some of them are, because some people are like that. But that isn’t the key reason.)

Control is a means to an end. The goal is production. Enterprise IT is a factory. The Web is a playground.

Expecting the two to be similar is a fundamental confusion. If you were put in charge of Goldman’s IT, you’d turn into a control freak too.

Statistical process control
Factories produce more and higher quality goods by reducing variability. Variability creates problems that cost money, either warranty costs or greater downtime/setup costs.
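For readers who haven't met statistical process control, here is a rough Python sketch of its simplest tool, a control chart with limits set a few standard deviations from a baseline mean. The function names, the 3-sigma default, and the batch-job runtimes are assumptions chosen for illustration, not data from any real shop.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (low, high) limits computed from a baseline of normal runs."""
    center = mean(baseline)
    spread = stdev(baseline)
    return center - sigmas * spread, center + sigmas * spread


def out_of_control(samples, limits):
    """Return the samples that fall outside the control limits."""
    low, high = limits
    return [x for x in samples if not low <= x <= high]


# Hypothetical nightly batch-job runtimes, in minutes.
baseline_runtimes = [10, 12, 11, 9, 10, 11, 10, 9]
limits = control_limits(baseline_runtimes)
print(out_of_control([10, 11, 25, 9], limits))
```

The tighter the baseline, the narrower the limits and the sooner a glitch stands out. That is the factory's argument for reducing variability.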

Enterprise IT is a factory
I first learned this truth when I was selling to engineers for development and to manufacturing for MRP. The engineers were all about the money and the freedom to tinker.

The manufacturing guys just wanted it to work. Save a few bucks on a 3rd party expansion rack? Why? Any glitch would wipe out the savings. So they wouldn’t go there.

The Web is a playground
Sure, there are people, like me, for whom the Web is instrumental in their work. I have backups for everything. The big destination sites do the same.

But for most of us the Web is something more casual: entertainment; shopping; news; communication. As long as it usually works we’re fine. The local cable loop goes down for a couple of hours and we’ll survive.

The StorageMojo take
The engineering and manufacturing cultures are very different, even though both groups are technical. This is why the gap between Silicon Valley and enterprise IT is so wide: the SV engineers think they get IT. And they don’t.

If you can show IT how your product reduces variability in their environment, giving them more certainty about production, you will have their attention. NUMA architectures, for example, add variability, despite higher average performance on tuned workloads.

So you could predict they wouldn’t be successful in the enterprise.

Words like “flexibility,” “experimentation” and “mashup” just don’t compute in the enterprise infrastructure. I’ve been as frustrated by the IT mindset as anyone, but complaining won’t change it. They are doing the best they can with the tools they have.

Want to do something great? Give IT better tools for managing variability.

Comments welcome, of course.

Short videos from Seattle Scalability Conference

June 20th, 2008 by Robin Harris in Off-Topic

I’ve put together a couple of ~3 minute video excerpts from the Seattle Scalability Conference last Saturday. I’ve edited them to be useful standalone intros. Maybe they’ll entice you to learn more.

Chapel: productive parallel programming at scale
Bradford Chamberlain of Cray talks about a new language that he and his colleagues are developing. It isn’t released to the public yet, but he is looking for collaborators interested in moving it beyond a pure HPC focus.

Chapel appears to dramatically simplify parallel programming, if the code samples are any indication.

This is only 3 minutes out of 30, so if this whets your appetite be sure to look for the full video - shot on better equipment - on YouTube. As of this writing it isn’t up yet.

Carmen: a scalable science cloud
This is 3 minutes from early in a talk that Paul Watson of Newcastle University gave on cloud computing for neuroscience research. Neuroscience has a number of issues - including 100,000 researchers worldwide - that lend themselves to a cloud approach.

The full talk is up on Google Video.

Commenters on my ZDnet blog
inform me that Microsoft has solved all these multicore programming problems. Maybe the next scalability conference should be held in Redmond.

It’s official: ZFS in Mac OS 10.6 server

June 19th, 2008 by Robin Harris in Architecture, Information Management

Can single-user OS X be far behind?
Here’s the official Apple announcement:

For business-critical server deployments, Snow Leopard Server adds read and write support for the high-performance, 128-bit ZFS file system, which includes advanced features such as storage pooling, data redundancy, automatic error correction, dynamic volume expansion, and snapshots.

The StorageMojo take
Cool! And only 2 years later than I’d predicted. I’m an optimist.

As I noted almost 2 years ago:

StorageMojo.com has devoted time to this issue because today’s computer business is largely driven by consumer computing, not enterprise computing. Putting a really modern integrated file and storage management system on a consumer OS would raise the bar for everyone else.

I stand by that.

Comments welcome, of course.
Want to know more about ZFS? I’ve been hot on it for over a year. See:

Cloud computing podcast

June 16th, 2008 by Robin Harris in Future Tech

Gary Orenstein has published a podcast of a discussion we had a couple of weeks ago about cloud computing.

Cloudy days on the hype cycle
Cloud computing and storage is still climbing the hype cycle. Remember client-server computing? It was going to change the world. It did, but not as we expected. Now it is an invisible part of the infosphere.

Likewise cloud computing. It is another arrow in the quiver, not a howitzer. The critical issue is how creatively and transparently we utilize it. No doubt many of us will be surprised.

In 15 years cloud computing will be as obvious to users as client-server is today.

The StorageMojo take
The podcast discusses other issues in cloud computing and storage. Kudos to Gary for putting on the cloud computing series.

Comments welcome, of course. I’ve done work for Gary’s employer, Gear6, in the past. This discussion was conducted gratis.

Seattle Scalability Conference quick take

June 16th, 2008 by Robin Harris in Architecture, Clusters, Future Tech

I’m relaxing in beautiful Port Townsend, Washington today, under the gray skies of the coldest June in almost 100 years. The fire in the wood-burning stove and Frank’s strong coffee provide the good cheer.

Temporal compare
My comments are more impressionistic than considered. No “best of” selections now.

Comparing this year’s conference to last year’s is tricky. The Googlers who selected the papers didn’t profess a theme, choosing what they found interesting. So it may be a Rorschach inkblot test to see a pattern in the 2 conferences, but I do.

Last year’s conference focused on cluster scalability - building really big clusters that go beyond the 8,000 or so node clusters Google uses. Jeffrey Dean last year was open about Google’s desire to knit their data centers into a single global name space.

This year the focus moved up the stack to file systems and programming languages. The problem of multi-core chips seemed especially pertinent.

Bradford Chamberlain’s Chapel language attacks the issue of programming multicore/processor systems and sounded promising [download a technical pdf on Chapel here].

Vijay Menon’s “Scalable multiprocessor programming via transactional memory” seeks to replace clustering’s traditional reliance on threads and locks with an atomic transactional model of memory access. He noted that Azul Systems uses hardware transactional memory in their 800+ core Java servers.
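For a feel of what an atomic transactional model looks like to the programmer, here is a toy software sketch in Python. It is not Menon's system or Azul's hardware. The names TVar, atomically, and increment are assumptions, and the single commit lock is a simplification that real transactional memory is designed to avoid.

```python
import threading


class TVar:
    """A shared cell with a version number bumped on every commit."""
    def __init__(self, value):
        self.value = value
        self.version = 0


_commit_lock = threading.Lock()  # simplification: one global commit point


def atomically(transaction):
    """Run transaction(read, write), retrying until it commits cleanly."""
    while True:
        reads = {}   # TVar -> version first seen by this attempt
        writes = {}  # TVar -> value to publish if the attempt commits

        def read(tvar):
            if tvar in writes:
                return writes[tvar]
            # Record the version before reading the value, so a commit that
            # sneaks in between is caught when we validate.
            reads.setdefault(tvar, tvar.version)
            return tvar.value

        def write(tvar, value):
            writes[tvar] = value

        result = transaction(read, write)
        with _commit_lock:
            if all(tvar.version == seen for tvar, seen in reads.items()):
                for tvar, value in writes.items():
                    tvar.value = value
                    tvar.version += 1
                return result
        # Someone else committed first: discard this attempt and retry.


def increment(counter):
    def txn(read, write):
        write(counter, read(counter) + 1)
    atomically(txn)
```

The application code in increment never takes a lock. It states what should happen atomically and lets the runtime detect conflicts and retry, which is the same speculate-and-roll-back bargain as optimistic computing.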

And there was more.

The StorageMojo take
Scalability is a key problem. The Googlers’ desire to involve industry as well as academe gives this conference a dual personality that I like. At its best we see ideas beginning to morph into platforms.

The slow take will be coming as I look further into the papers that were presented. In the meantime Garth Gibson, CMU prof and RAID paper co-author, made some interesting comments on the earlier Scalability Conference post.

Comments welcome, of course. Looking forward to returning to NoAZ tomorrow.

Off to Seattle

June 12th, 2008 by Robin Harris in Off-Topic

That’s right: the second Seattle Conference on Scalability - sponsored by Google - is this Saturday [see a couple of posts back for more info]. I’m also attending the bonus meeting in Fremont Friday evening.

I’m bringing the video production backpack and I’ll try to get some video clips up if I capture something short & interesting. Sunday I’m going to get some Father’s Day love and then up to charming Port Townsend for a couple of days R&R with Frank.

If you’ve spent time in PT, you know Frank. So no guarantees on the video.

The StorageMojo take
The StorageMojo team has been celebrating the 500 post mark - by not posting. But now it’s back to work.

If you’re at the Conference look me up. Always pleased to meet StorageMojo readers - even occasional ones - or people who could be StorageMojo readers.

Roadrunner’s backing store

June 11th, 2008 by Robin Harris in Architecture, Clusters, Disk, NAS, IP, iSCSI, SAN, FC

I wrote a short piece on ZDnet about Los Alamos National Laboratory’s new Cell Broadband Engine-based supercomputer, Roadrunner. With ~14k v.3 Cell processors - an earlier version powers the PS3 game console - and another ~7k dual core Opterons, the Roadrunner’s ~3,250 compute nodes pack a lot of compute cycles.

The key compute element, the new version of the PS3 chip called the PowerXCell 8i processor, features 8x faster double-precision floating point and over 25 GB/sec of memory bandwidth. And it can address 64 GB RAM. There are 4 8i’s per compute node.

Nothing I read mentioned the disk storage - until the friendly Panasas PR person suggested I talk to Larry Jones, VP Product Marketing. Panasas is providing the back end storage for Roadrunner.

I did, and here’s what I learned.

LANL storage infrastructure
LANL’s 6 supercomputers + Roadrunner share the Panasas storage through LANL-developed I/O nodes. While Roadrunner itself uses dual-data-rate 4x Infiniband for internode communication, the I/O nodes attach to Panasas through trunked GigE.

The advantage of the I/O nodes is that the entire Panasas storage pool is available to each supercomputer. Lots of bandwidth.

Roadrunner currently has about 80TB of RAM, roughly 24 GB per compute node. That works out to about 4 GB RAM per processor.
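A quick back-of-envelope check of those RAM figures, as a Python sketch. The node count comes from the ~3,250 quoted earlier. Two things are my assumptions: decimal units (1 TB = 1,000 GB), and that "processor" means the 4 Cells plus 2 Opterons per node, which is how ~7k Opterons spread over ~3,250 nodes works out.

```python
TOTAL_RAM_TB = 80
COMPUTE_NODES = 3250       # approximate, from the text
CELLS_PER_NODE = 4         # from the text
OPTERONS_PER_NODE = 2      # assumption: ~7k Opterons / ~3,250 nodes
GB_PER_TB = 1000           # assumption: decimal units


def ram_per_node_gb():
    return TOTAL_RAM_TB * GB_PER_TB / COMPUTE_NODES


def ram_per_processor_gb():
    return ram_per_node_gb() / (CELLS_PER_NODE + OPTERONS_PER_NODE)


print(round(ram_per_node_gb(), 1), round(ram_per_processor_gb(), 1))
```

Both results land close to the rounded figures quoted above.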

The jobs these machines run are huge. A simulation can run 6 months or more. Depending on criticality a job gets checkpointed every hour or maybe once a day.

The Panasas installation at LANL, begun in 2003, is currently 2 PB. Assuming an average of 500 GB drives, that means 4,000 disk drives.

Panasas uses 5 trunked GigE links to each of the 8 controllers in a single rack. They are now in beta for 10 GigE, which will reduce link count from 40 to 8 per rack while doubling bandwidth.

The hot rodders at LANL should like that.
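The drive and link numbers above are easy to sanity-check. This Python sketch uses nominal link speeds and decimal units, and it assumes one 10 GigE link per controller, which is what 8 links per rack implies. The function names are mine.

```python
def drive_count(capacity_pb, avg_drive_gb):
    """Drives needed for a given raw capacity (decimal units assumed)."""
    return round(capacity_pb * 1_000_000 / avg_drive_gb)


def rack_links(controllers, links_per_controller, gbit_per_link):
    """Return (link count, nominal aggregate Gbit/s) for one rack."""
    links = controllers * links_per_controller
    return links, links * gbit_per_link


print(drive_count(2, 500))     # 2 PB on 500 GB drives
print(rack_links(8, 5, 1))     # today: 5 trunked GigE per controller
print(rack_links(8, 1, 10))    # beta: assumed one 10 GigE per controller
```

Nominal bandwidth goes from 40 to 80 Gbit/s per rack, which is the doubling, with one-fifth the cables.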

The StorageMojo take
Roadrunner’s 80 TB RAM is a sizable storage infrastructure in its own right. Keeping it fed and backed up is a major job.

Consumerization of IT is a common concept - but what we see here is the consumerization of HPC: Playstation CPUs; SATA drives; Linux OS; air cooling. The old model of highly customized kit for HPC is dead.

Which is a good thing for the rest of us. We get some of the smartest people in computing working on platforms that we might also use, developing applications that otherwise would never be available to the consumer market.

I’ll never run molecular dynamics codes, but maybe my kids will. After all, I can now edit feature length movies on my desktop. Who would have believed that just 20 years ago?

Comments welcome, of course. Disclosure: I did some work for Panasas last year and - who knows? - might do some more in the future. I like the team and the way they are pushing pNFS.

EMC’s vision for Pi Corp

June 3rd, 2008 by Robin Harris in Future Tech, Information Management, SOHO/SMB

Consumerization is the ultimate scale-out application
I spoke to EMC’s CTO, Jeff Nick, at EMC World and video’d his comments. I didn’t know what to expect, as some past EMC CTOs have been lightweights whose insight wasn’t up to Silicon Valley standards.

But Nick is different: a former IBM distinguished engineer Fellow - their highest technical level; holder of many patents; leader of IBM’s grid initiatives. He swims in the deep end of the pool. Once he realized I’d done some homework he proved voluble and insightful.

We discussed several topics, including why Maui is late (short answer: productizing advanced technology is hard). The best part was describing what Paul Maritz’ Pi Corp brings to the table.

Here’s the video:

The StorageMojo take
EMC is taking Dell’s purchase of EqualLogic seriously. They are intent on building EMC into a trusted consumer brand for personal information storage in the cloud.

That is easier said than done. Yet Google - the obvious 1st choice for this market - has hurt their brand by dithering on privacy issues. Why trust your most private data to a company that makes its money selling your information? 

EMC is unleashing a triple whammy on its traditional competitors

  • Leading edge technology in Maui
  • Consumer-focused services with Mozy and Iomega
  • A next-gen software infrastructure in Pi that - if it delivers - will change how consumers manage their data forever

These are all game-changers. Together they bring on the consumerization of IT - storage industry division - at a fast pace. While cloud storage must still overcome the Internet’s 3 9s availability, EMC’s added-value approach is promising.

Comments welcome, of course.


