StorageMojo: on the road again

by Robin Harris on Thursday, 26 August, 2010

StorageMojo’s peripatetic Global HQ will be spending a couple of days in Edmonton, AB before lighting out for Glacier NP and other scenic points south.

Which means StorageMojo will not be attending VMworld. Not that I don’t want to. The timing just didn’t work out.

Maybe next year.

The StorageMojo take
Sometimes you should stop and smell the flowers. Or take a 4400 mile road trip. Or both.

In the meantime the gnomes of Sedona are hard at work – I hope – on new price lists.

Courteous comments welcome, of course.

{ 2 comments }

3Par-ty tonight – hangover tomorrow?

by Robin Harris on Wednesday, 25 August, 2010

Wondering about HP’s counter for 3Par. Dell’s offer makes sense: they’re building expertise and a high-margin business. Dell and longtime supplier EMC are growing apart as Dell realizes that the relationship is doing little for them – and a lot for EMC.

But HP? OK, they may want a replacement for their high-end Hitachi boxes. They might be thinking that 3Par’s architecture will be a useful bridge to its scale-out architecture offerings from IBRIX and Lefthand. Maybe they see a major opportunity in storage for their industry-leading blade servers.

But a $1.6B bridge?

3Par’s growth hit a wall this last year compared to the rapid growth of ’08 to ’09 – reaching $194 million for the year ended March 31. They also lost money due to some non-cash expenses such as stock-based compensation. Not bad considering the times, but not what Data Domain was doing either.

New product purgatory
Let’s say HP can take 3Par’s gross margins to 60% after the obligatory year in HP’s integration purgatory. Code reviews, testing and integration, part numbering, pricing, sales and service training. Figure 9-12 months before they start selling in earnest.

In the meantime 3Par sales and margins plunge. HDS customers go into fight-or-flight mode – and HDS sales rise. EMC and other competitor reps go into overdrive to unhook HP’s storage business. Company bloggers spread rumors, half-truths, maybe even truths.

All good fun. But how does HP monetize this massive acquisition?

The StorageMojo take
They have to grow the business. My back of the envelope SWAG is they have to grow it to over $1.5B in 3-5 years, assuming overall profitability similar to EMC’s last full fiscal year.

That’s 7x growth. Not impossible, but not easy either. HP’s sales force already has a lot on its plate.
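To put rough numbers on that SWAG, here is a minimal sketch. It assumes the $194 million figure above as the starting point, $1.5B as the target, and growth compounding at a constant annual rate, which is a simplification. The function name is mine.

```python
def required_annual_growth(base, target, years):
    """Constant compound annual growth rate needed to go from base to target."""
    return (target / base) ** (1.0 / years) - 1.0

BASE_REVENUE_M = 194.0     # 3Par revenue for the year ended March 31 ($ millions)
TARGET_REVENUE_M = 1500.0  # back-of-the-envelope target ($ millions)

multiple = TARGET_REVENUE_M / BASE_REVENUE_M
print(f"growth multiple: {multiple:.1f}x")
for years in (3, 4, 5):
    rate = required_annual_growth(BASE_REVENUE_M, TARGET_REVENUE_M, years)
    print(f"{years} years: {rate:.0%} per year")
```

Under those assumptions the business would have to grow roughly 50% a year for five years, or nearly double every year for three.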

The bigger issue: the high-end block storage business isn’t growing. It is one thing to grow in a growing market – much harder in a static or shrinking market. Customers already have vendor relationships and don’t see the shrinking market as strategic.

Yes, EMC, the bellwether of high-end block storage, is showing growth. But looking at the numbers it isn’t the base product that is driving the growth, but add-ons like V-Max and new products like Data Domain.

The high-end block business is stalled, probably permanently. Partly that is due to the recession, but it also reflects the secular trend away from block, despite virtualization’s positive impact on block demand.

Thus HP – Donatelli, really – is placing a big bet on a slowing market. If HP/3Par can take a chunk out of EMC’s business, it may pay off. But it won’t be easy or quick.

Update: Dell updated their bid to slightly above HP’s, a strong signal they won’t go much higher. Expect HP to counter and close the deal. End update.

Courteous comments welcome, of course.

{ 20 comments }

3bpc flash debut

by Robin Harris on Wednesday, 18 August, 2010

I’m in Silicon Valley at the Flash Memory Summit. The big news so far: Intel and Micron announced they are sampling an 8 GB, 3 bit-per-cell (3bpc) NAND flash.

Commercial availability is expected in Q4, but don’t be surprised if that date slips. Process tweaks needed to reach full spec parts in the next 6 months aren’t guaranteed.

Bang/$
3bpc means 50% more capacity per $. That’s good.

The tradeoff: it will only handle about 1,000 writes before failing. Which, as I pointed out 4 years ago in an early post about server flash SSDs, isn’t much of a problem when you have lots of capacity.

This flash will 1st go into products like USB thumb drives or SD cards, where you won’t write to it 1,000 times anyway. With wear-leveling you’ll have to write 8,000,000 MB of data to an 8 GB part before it croaks. That is 2 million 4 MB JPEGs. That’s a lot of snapshots.

Since most of these will go into devices that hold 16, 32 or even 64 GB of flash, multiply 2 million by 2, 4 or 8 and it is obvious that casual users will never wear out 3bpc flash.
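The arithmetic above can be sketched in a few lines. This assumes ideal wear-leveling, no write amplification, decimal units (1 GB = 1000 MB) and the 1,000-write and 4 MB figures from the text. The function names are mine.

```python
def total_write_budget_mb(capacity_gb, write_cycles):
    """Total data (MB) writable before wear-out, assuming ideal wear-leveling."""
    return capacity_gb * 1000 * write_cycles  # decimal units: 1 GB = 1000 MB

def photos_before_wearout(capacity_gb, write_cycles=1000, photo_mb=4):
    """How many photos of photo_mb each fit into the total write budget."""
    return total_write_budget_mb(capacity_gb, write_cycles) // photo_mb

for capacity_gb in (8, 16, 32, 64):
    print(f"{capacity_gb} GB: {photos_before_wearout(capacity_gb):,} photos")
```

Real devices will do worse than this ideal case, since controllers write more to the flash than the host sends, but the order of magnitude is the point.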

Write performance will suffer as well, but how much remains to be seen. The effect will depend on how many flash chips are on the device: flash controllers write data in parallel, so the more chips the faster the write.

The StorageMojo take
3bpc products will have a ripple effect on other flash parts. Price sensitive products – most flash parts – will want to move to 3bpc ASAP, but the volumes won’t be there. But as production ramps, 2bpc flash will face big price pressure as vendors with older fabs try to keep them running.

Translation: expect that flash prices will resume their downward path after 3 quarters of flat prices. Yay!

Comments welcome, of course. SSD vendors shipped about 10 million SSDs in the last 12 months. Sounds good, but tiny compared to over 500 million hard drives.

{ 3 comments }

@Flash Memory Summit

by Robin Harris on Tuesday, 17 August, 2010

In Silicon Valley for this week’s Flash Memory Summit. Looking forward to meeting flash heavies from Intel, Micron and some startups.

Flash’s impact on architecture is entering a new phase. On the consumer side shipping 3-bit MLC will drive cost down and adoption up. That’s the volume side of the market that keeps multi-billion dollar fabs busy. And a headache for drive vendors.

On the system side the quick-easy-cheap SSD swap-ins are joined by products that capitalize on flash benefits at a deeper level. A little non-volatility can go a long way in making current products faster-better-cheaper.

The StorageMojo take
There have been some teasing pre-summit emails that intrigue. NVDIMMs?

It looks like flash prices have started declining again. Good news for broader adoption, especially on the consumer side. 3-bit MLC will accelerate that.

Hope to see greater advantage taken of increasing flash storage capacities: as capacities increase the importance of write-cycles declines. I’ll be looking for other creative takes on flash today.

Courteous comments welcome, of course.

{ 0 comments }

Cloud’s app killer



by Robin Harris on Thursday, 5 August, 2010

Concall today with Bryan Cantrill, the smart guy behind Dtrace. Dtrace was the engine behind Sun’s (now Oracle’s) Fishworks server and application monitor. Dtrace has also been incorporated into OS X.

Bryan left Oracle last week and started Monday at Joyent, the cloud infrastructure provider, as VP of engineering. Why?

Bryan is an instrumentation geek. He really wants to know what’s going on. Instrumentation in the cloud is the next big challenge.

That makes sense: there are so many moving parts that understanding and resolving performance and availability issues will be critical to the widespread adoption of cloud.

Tech epiphanies
Bryan described 3 technology epiphanies that he’s enjoyed. The 1st was when he saw Java for the first time back in 1995. The 2nd was when he saw a Ruby on Rails video about deploying a web app.

His 3rd epiphany came recently when he saw something called node.js. Developed by Ryan Dahl it turns the JavaScript paradigm on its head: node.js runs on the server, not the client.

Latency bubbles
We know that server I/O latency can kill performance. It’s even worse in the cloud.

A single bad drive can hose a server if the app is holding locks. What if you have a webpage that relies on five different Web services, or as many Amazon pages do, 150 services?

You need an infrastructure that is resilient in the face of long latency while maintaining high throughput. Bryan says that most failures are not hard failures but are latency bubbles that cascade out and lock up the rest of the infrastructure.
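A quick way to see why fan-out makes this worse: a minimal sketch, assuming each service call independently has a small chance of hitting a latency bubble. The 1% figure is my assumption for illustration, not a measured number, and real failures are rarely independent.

```python
def chance_of_slow_page(num_services, p_slow):
    """Probability that at least one of num_services calls is slow,
    assuming each call is independently slow with probability p_slow."""
    return 1.0 - (1.0 - p_slow) ** num_services

P_SLOW = 0.01  # assumed: 1 call in 100 hits a latency bubble

for n in (1, 5, 150):
    share = chance_of_slow_page(n, P_SLOW)
    print(f"{n:>3} services: {share:.1%} of page loads wait on a slow call")
```

With that assumed rate, about 5% of five-service pages wait on a slow call, versus roughly 78% of 150-service pages.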

Ryan took Google’s V8 JavaScript engine and extended it so you can handle long latency events. Without locking up the server.

Ryan did a fine job introducing node.js in a 1 hour Google Tech Talk last week. He outlined how to build a server that can handle 10,000 or more users. His goal with node.js was to make it easy to write high-performance servers.

There is an arms race out there for performance – Google, Apple, Mozilla, Opera, Microsoft – to win the hearts and eyeballs of hundreds of millions of consumers. Fickle consumers.

Node.js only exposes nonblocking asynchronous interfaces to the programmer. It has very few abstractions. Its power lies in the fact that it moves you away from certain interfaces like synchronous I/O that you shouldn’t do.

You don’t have to worry about some event completing and taking over while you’re in the middle of something else. Each node.js instance is a single thread. If you want to do more work you start multiple node.js instances and let the kernel do the load balancing.

Memory isolation is enforced at the process boundary. The kernel manages it, not the coder. That’s a good thing.
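The single-threaded, non-blocking model is easier to see in code. Node.js itself is JavaScript; what follows is only an analogy using Python's asyncio, not node.js code. The function names and delay values are made up for illustration.

```python
import asyncio
import time

async def handle_request(request_id, backend_delay_s):
    """Simulate one request that waits on a slow backend without blocking the thread."""
    await asyncio.sleep(backend_delay_s)  # stand-in for a non-blocking I/O call
    return request_id

async def serve(delays):
    """Run all requests concurrently on one thread; results come back in request order."""
    tasks = [handle_request(i, d) for i, d in enumerate(delays)]
    return await asyncio.gather(*tasks)

def run_demo():
    delays = [0.05] * 100 + [0.5]  # 100 fast requests plus one latency bubble
    start = time.perf_counter()
    results = asyncio.run(serve(delays))
    elapsed = time.perf_counter() - start
    return results, elapsed

if __name__ == "__main__":
    results, elapsed = run_demo()
    print(f"{len(results)} requests in {elapsed:.2f}s on one thread")
```

Handled one at a time with blocking calls, these requests would take about 5.5 seconds. On the event loop the total should be close to the duration of the single slowest request, because the slow one never holds up the others.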

The StorageMojo take
Latency is the app killer of the cloud. The current cloud focus on write once/read never apps reflects that.

The fight against latency proceeds on many fronts: storage; network; CPU; and software. Asankya and others have good ideas for reducing Internet latency. Flash architectures are undergoing rapid evolution. Multicore and multiprocessor servers are attacking throughput.

Node.js is a big step in the right direction. Removing the dependencies that synchronous I/O creates means a more resilient and higher performance infrastructure. Ryan reports that a Japanese website is already running several hundred thousand users on node.js instances.

As for Bryan, he’ll bring the same intelligence and energy to Joyent that he brought to Dtrace and Fishworks. Expect more great things.

Courteous comments welcome, of course. Update: The other smart guys behind Dtrace are the redoubtable Adam Leventhal and Mike Shapiro.

{ 3 comments }

Dear StorageMojo: low-cost archive storage?

by Robin Harris on Wednesday, 4 August, 2010

This came in over the transom from a semiconductor engineer. He wants home archive storage and is wondering why no one seems to sell it. I’ve been grappling with the same issue.

Here’s an edited-for-length excerpt from his letter:

I use RAID server products from Netgear and QNAP and have been searching for my ideal server product. I don’t understand why it doesn’t exist. My hunt is for a server with integrated error checking to ensure that bit errors can be caught and rectified. My goal is a system that secures the integrity of the files stored upon it. As far as I am aware this kind of functionality does not exist and isn’t discussed anywhere.

For example, once I have a video file (which can be the result of many hours of editing) it isn’t subsequently modified, just read on occasion as required. I don’t want any bit errors on this file – every change is just a corruption. I backup my files, I just want to make sure that what I am backing up is the same as when it was first written.

I don’t want to run a piece of software over the network, I want the server to run a check and fix any errors that it finds. It is something that I would gladly pay a premium for. I think there is a market for this as there must be a lot of users who have files that never change.

Frustrated in California

Dear Frustrated
You’ve identified the key problem: RAID systems aren’t for archives. RAID systems keep your data available after a disk failure – sometimes 2 disk failures – but they do not ensure long-term data integrity. Or even short-term integrity. Not their thing.

This is what archives – traditionally tape – are for.

But tape is a tough sell to the home & SOHO market. Low-end drives – DAT, SVR, VXA – cost several hundred dollars plus the tapes. DLT/LTO drives start around $1200 with $40 tapes.

You can buy an external Blu-ray burner for those prices and 50 GB media for a few bucks each. On sale BR media is starting to reach the 5¢/GB level of 2 TB drives, and the longevity should be better.

But I have over 500 GB of video alone. Shuffling 10 or more BR media – 20 if I’m paranoid – reminds me of floppy backups. Yuck.

The current plan:

  1. Create zip archives of files and folders I want to preserve.
  2. Back them up to 2 local hard drives.
  3. And ship them off to my online backup provider.

What I don’t know is how robust zip archives are. There is a 32-bit CRC, but what does that do for a 10 GB folder of PDFs?

Also, I wonder about the advisability of zipping compressed formats such as video and audio files. It might be worth the computational overhead and the possibly larger files if the zip file is robust.
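One way to find out whether an archive has rotted is to check it periodically. A minimal sketch using Python's standard zipfile and hashlib modules: the zip format keeps a CRC-32 for each member file, not one for the whole archive, and testzip() re-reads every member against it. The SHA-256 digest of the whole zip file is my own addition, recorded when the archive is written and compared later. Both checks only detect corruption; neither repairs it. The helper names are hypothetical.

```python
import hashlib
import zipfile

def first_corrupt_member(zip_path):
    """Return the name of the first member that fails its check, or None if all pass."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.testzip()

def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def archive_is_intact(zip_path, expected_sha256):
    """True if the archive still matches the digest recorded when it was written."""
    return sha256_of_file(zip_path) == expected_sha256
```

Run against each of the local copies, a mismatch on one tells you which copy to replace from the other. That is the repair step the zip format itself does not provide.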

The StorageMojo take
Frustrated isn’t the first home user to want an archive and he won’t be the last. Hundreds of millions of home users will see the need over the next decade.

The question is whether or not someone can design a commercially viable system for home and SOHO use. It is obvious that drive vendors have the cost advantage, especially with the advent of easy and cheap USB drive docks, if they build a disk drive designed for that purpose.

An archive drive can be slower – 4200 or even 3600 RPM – and less dense. Optimized for large transfers. Slower, cheaper actuators and drive electronics.

Single platter 2.5″ 7mm drives could be the sweet spot: minimal head cost; slim cartridge-like form factor; and much faster than optical. Then it is just a matter of getting the volumes up and the costs down.

But that’s just one idea. Please comment on how you would solve the home and SOHO archive problem.

Courteous comments welcome, of course.

{ 23 comments }

What is “primary” storage?

by Robin Harris on Monday, 26 July, 2010

A commenter recently asked

Archivas was focused on archive, do you expect the new solution to sustain performance for primary storage as well?

Which is a good question, if you know what “primary” means. Do we?

Tiers of a clown
10 years ago we all agreed on 1st tier or primary storage: block-based; RAID 5; enterprise FC or SCSI drives; SCSI, FC or ESCON host connects; optimized for transactional workloads; and large mirrored (with 1 notable exception) caches. When SANs took off we stuck FC switches in front of the boxes and called it good.

But something happened to that consensus: iSCSI; NFS; CIFS; SSD; MEMcache; Internet scale-out; Infiniband; 10GigE; storage & processor virtualization; CDNs; web-serving; pNFS; and lower-cost out-sourced high-scale infrastructure (i.e. cloud). And more – such as non-SQL data management – is coming.

Will the real primary storage please stand up?
Amazon runs a high-growth $25B/yr business on scale-out storage, servicing millions of customers, taking real money and shipping real goods, 7x24x365. Smells like enterprise spirit.

Is Amazon’s storage “primary” and, if so, what makes it primary?

Yes, it is primary storage. No, it isn’t the logo that makes it so.

Workload & service level
It’s tempting to consider workload, but what workload? IOPS? Bandwidth?

How about parallelism? Web service is highly parallel. ACID database updates less so.

And what about files vs blocks? Blocks don’t require as much processing as files, as the host is handling the file system.

It is clear that most files aren’t often accessed. Does primary storage for files mean availability and reasonable performance? Or is there little difference between archive and primary for files?

NetApp is deduping primary storage. Others will follow, whether it makes sense or not, at least in messaging. Skeptics ask “If it is deduped, is it really primary?”

The StorageMojo take
We do a disservice to customers if we talk about “primary” storage as a class of equipment. It isn’t.

Primary storage is whatever works as primary storage for your application. Anything from bare SATA drives Velcro’d to motherboards to a big cluster of DMXs. Both are in use in major enterprises for mission critical applications – and they both work.

The 60 year secular trend to cooler data is the cause – an inverse of Moore’s Law. As the average number of accesses to data declines, technologies that meet the need at a lower cost become attractive, find a market, and grow. Niche products become mainstream – and perhaps “primary” – for their markets.

At the same time Moore’s Law is working its magic: creaky slow 10Mbit Ethernet becomes 10GigE. Board level controllers become chips. Storage software migrates from firmware to a stack running on commodity processors. Yesterday’s “archive” storage is tomorrow’s “primary” storage for the right apps.

Even the term “enterprise” is losing its meaning. As firms begin the 10 year migration to private clouds for cooler data, commodity hardware – servers, unmanaged switches, SATA drives – will be knit by cluster software that may even be open source. It is “enterprise” because an enterprise is using it.

This is why all the big iron vendors are migrating their software from embedded firmware to stacks running on commodity processors and operating systems. For the mainstream market the commodities are fast enough and the economics are compelling.

If it works for you, it’s primary.

Courteous comments welcome, of course. BTW, I’m getting a briefing from HDS on the old Archivas product, so maybe I’ll have more to say RSN.

{ 5 comments }

HDS: masters of stealth marketing

by Robin Harris on Thursday, 22 July, 2010

Winding up the week – it is Friday here – in Japan as a guest of Hitachi Data Systems. Fine hospitality from my American and Japanese hosts in steamy mid-summer Tokyo. Looking forward to Arizona.

The practitioners in the group – one who loves XIV, others with EMC and NetApp kit – were surprised by what the HDS stuff does. Such as virtualizing and managing your current storage platforms, regardless of vendor.

Seems like the big guys have been promising that for years. HDS delivered? Whoa.

A couple of things impressed me:

  • The senior Japanese execs weren’t the starchy, face-saving guys I’d expected. The Chairman of Hitachi made a speech to about 10,000 people without a tie, and all the other execs I spoke to followed suit. Even giving careful non-answers they came across as relaxed and realistic. Are they also decisive? We’ll see.
  • HDS has a clustered object store. I hope to get briefed on it next month.
  • The parent company has a vision for using massive amounts of data to improve our quality of life. Since they also produce power systems and high-speed trains they have a direct line into some critical issues.

The StorageMojo take
HDS is a multi-billion dollar company with some leading edge products and technologies. They’re about the size of NetApp – and I know you’ve heard of them.

As their OEM relationship with Sun winds down – or at least I expect it to – they’ll have more direct contact with a new group of customers. Now is the time for HDS to sharpen their messaging and turn up the volume.

Sadly that isn’t likely. The internal dynamics of the company seem to lead to generic messaging that fails to plant a hook. Maybe it is a consensus thing. But they aren’t doing customers any favors.

Courteous comments welcome, of course. Any recent experience with HDS?

{ 13 comments }