Concall today with Bryan Cantrill, the smart guy behind DTrace. DTrace was the engine behind
Sun’s (now Oracle’s) Fishworks server and application monitor. DTrace has also been incorporated into OS X.
Bryan left Oracle last week and started Monday at Joyent, the cloud infrastructure provider, as VP of engineering. Why?
Bryan is an instrumentation geek. He really wants to know what’s going on. Instrumentation in the cloud is the next big challenge.
That makes sense: there are so many moving parts that understanding and resolving performance and availability issues will be critical to the widespread adoption of cloud.
Bryan described 3 technology epiphanies that he’s enjoyed. The 1st was when he saw Java for the first time back in 1995. The 2nd was when he saw a Ruby on Rails video about deploying a web app.
We know that server I/O latency can kill performance. It’s even worse in the cloud.
A single bad drive can hose a server if the app is holding locks. What if you have a webpage that relies on five different Web services, or as many Amazon pages do, 150 services?
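To put rough numbers on that question, here is a back-of-the-envelope sketch. It assumes each service call is independently slow 1% of the time; that figure and the function name `chanceOfSlowPage` are illustrative assumptions, not anything Bryan quoted.

```javascript
// Probability that at least one of `services` independent calls is slow,
// when each call is slow with probability `pSlow`.
// Assumption: calls are independent and share the same pSlow.
function chanceOfSlowPage(services, pSlow) {
  return 1 - Math.pow(1 - pSlow, services);
}

// 0.01 (1% slow calls) is an assumed figure for illustration only.
for (const n of [1, 5, 150]) {
  const pct = (chanceOfSlowPage(n, 0.01) * 100).toFixed(1);
  console.log(`${n} services: ${pct}% of page loads hit a slow call`);
}
```

Under those assumptions a five-service page hits a slow call on roughly 5% of loads, while a 150-service page does so on roughly 78% of them. Rare latency becomes the common case once the fan-out gets wide.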
You need an infrastructure that is resilient in the face of long latency while maintaining high throughput. Bryan says that most failures are not hard failures but are latency bubbles that cascade out and lock up the rest of the infrastructure.
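One common way to keep a latency bubble from cascading is to put a deadline on every dependency and fall back to a degraded answer. The sketch below is not Bryan's or Joyent's method, just a minimal illustration; `withTimeout`, the simulated services, and the 100 ms deadline are all hypothetical.

```javascript
// Resolve with `fallback` if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, fallback) {
  let timer;
  const deadline = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Simulated services (hypothetical): one answers in 10 ms, one never answers.
const fast = new Promise((resolve) => setTimeout(() => resolve("reviews"), 10));
const stuck = new Promise(() => {});

const results = [];
Promise.all([
  withTimeout(fast, 100, "reviews unavailable"),
  withTimeout(stuck, 100, "recommendations unavailable"),
]).then((parts) => {
  results.push(...parts);
  console.log(parts); // the stuck service is replaced by its fallback
});
```

The page still renders after about 100 ms with one section degraded, rather than waiting forever on the stuck service.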
Ryan Dahl, the creator of node.js, did a fine job introducing it in a one-hour Google Tech Talk last week. He outlined how to build a server that can handle 10,000 or more users. His goal with node.js was to make it easy to write high-performance servers.
There is an arms race out there for performance – Google, Apple, Mozilla, Opera, Microsoft – to win the hearts and eyeballs of hundreds of millions of consumers. Fickle consumers.
Node.js only exposes nonblocking asynchronous interfaces to the programmer. It has very few abstractions. Its power lies in steering you away from interfaces, such as synchronous I/O, that you shouldn’t use.
You don’t have to worry about some event completing and taking over while you’re in the middle of something else. Each node.js instance is a single thread. If you want to do more work, you start multiple node.js instances and let the kernel do the load balancing.
Memory isolation is enforced at the process boundary. The kernel manages it, not the coder. That’s a good thing.
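Here is a sketch of the multiple-instances idea using the `cluster` module that ships with current Node.js (16 or later for `cluster.isPrimary`). The `workerCount` helper, port 8000, and the `START_CLUSTER` switch are assumptions for illustration. Setting `cluster.SCHED_NONE` leaves connection distribution to the operating system instead of Node's default round-robin scheduler.

```javascript
const cluster = require("cluster");
const http = require("http");
const os = require("os");

// Hypothetical helper: one worker per core, never fewer than one.
function workerCount(cores) {
  return Math.max(1, Math.floor(cores));
}

function start(port) {
  if (cluster.isPrimary) {
    // Let the OS hand out connections; must be set before the first fork().
    cluster.schedulingPolicy = cluster.SCHED_NONE;
    const n = workerCount(os.cpus().length);
    for (let i = 0; i < n; i++) cluster.fork();
  } else {
    // Each worker is a separate single-threaded process with its own memory.
    http.createServer((req, res) => {
      res.end(`handled by process ${process.pid}\n`);
    }).listen(port);
  }
}

// Only launch when explicitly asked, e.g. START_CLUSTER=1 node server.js
if (process.env.START_CLUSTER === "1") start(8000);
```

Because every worker is its own process, a bug that corrupts one worker's heap cannot touch the others. That is the isolation the kernel enforces for free.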
The StorageMojo take
Latency is the app killer of the cloud. The current cloud focus on write once/read never apps reflects that.
The fight against latency proceeds on many fronts: storage, network, CPU, and software. Asankya and others have good ideas for reducing Internet latency. Flash architectures are undergoing rapid evolution. Multicore and multiprocessor servers are attacking throughput.
Node.js is a big step in the right direction. Removing the dependencies that synchronous I/O creates means a more resilient and higher-performance infrastructure. Ryan reports that a Japanese website is already running several hundred thousand users on node.js instances.
As for Bryan, he’ll bring the same intelligence and energy to Joyent that he brought to DTrace and Fishworks. Expect more great things.
Courteous comments welcome, of course. Update: The other smart guys behind DTrace are the redoubtable Adam Leventhal and Mike Shapiro.