Update on advertising

by Robin Harris on Tuesday, 30 June, 2009

Close observers might have noticed that StorageMojo hasn’t had any ads for a month or so. I hope that is about to change.

For readers who care about such things you can now read the StorageMojo advertising policy and rate card by clicking on Advertising or on the nav bar above.

The gist of it:

StorageMojo gets requests for advertising, and as a long-time capitalist roader, I’m pleased. But StorageMojo’s relationship with advertising is a bit ticklish.

I don’t mind ads and I like the money, but dealing with advertisers is time-consuming and may give readers the feeling that StorageMojo’s independence is compromised. After all, StorageMojo was founded to offer vigorous independent analysis and opinion, not sell ads.

Thus the public page on advertising. You can read it and decide for yourself what, if any, influence advertisers may exercise.

My experience is that companies intrepid enough to advertise on StorageMojo are worth checking out. They like the site and the audience and often share some of the concerns I have about the industry. Like StorageMojo readers, a cool bunch.

Courteous comments welcome, of course.

{ 0 comments }

Cold storage

by Robin Harris on Monday, 29 June, 2009

As the economics of data storage push more and more data onto disks, the energy efficiency of data storage is ever more critical. Storage is anti-entropic, so keeping bits organized requires energy. How can we minimize that energy input?

Data cooling is the major reason disk drives have remained a viable storage strategy for 50 years. The IOPS/MB has dropped steadily for decades, yet disks remain the preferred tool outside of very low latency or high-bandwidth applications.

Looking forward to massive scale-out storage infrastructures, the data will get even cooler. Copan’s MAID architecture, which turns disks off when not in use, is a rational extension of the cool data concept.

As data continues to cool we will eventually see millions of disk drives – along with tapes – sitting idle. But even if we have cold archive disks, one of disk’s big advantages over tape is the ease with which data can be spread over multiple drives for data protection.

Not RAID 5
You can’t count on any one hard drive actually restarting after a few months or years of idle time. Nor can you expect that any specific sector will be readable. Cold data requires even more advanced – energy efficient, disaster-tolerant – storage techniques than RAID arrays offer today.

Oh, and they need to be cheap too. Which means RAID arrays won’t get this business. What about open source software?

Erasure coding
In A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage (pdf) James S. Plank, Jianqiang Luo, Catherine D. Schuman, Lihao Xu, and Zooko Wilcox-O’Hearn examine 5 open source implementations of 5 different erasure codes: Reed-Solomon, Cauchy Reed-Solomon, Even-Odd, Row Diagonal Parity and Minimal Density RAID 6 codes.

[Figure: Typical storage system with erasure coding – figure from the paper]

Several companies – including Cleversafe, NetApp and Panasas – use erasure codes today to ensure higher data availability. What Plank et al. wanted to know is how well these codes work and what system designers need to know to use them effectively.
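To make the idea concrete, here is a minimal sketch of the simplest possible erasure code: a single XOR parity block over k data blocks, which survives the loss of any one block. This is not one of the five codes the paper benchmarks, all of which tolerate two or more failures. The function names are illustrative and are not taken from any of the libraries.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks plus one parity block (k data + 1 coding)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the block at lost_index by XORing all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# Three data "disks" and one parity "disk"; every block is the same length.
data = [b"cold", b"data", b"bits"]
stripe = encode(data)

# Pretend disk 1 failed to spin up; rebuild it from the other three.
rebuilt = recover(stripe, lost_index=1)
```

Losing two blocks at once defeats this scheme, which is why the RAID 6 and Reed-Solomon style codes in the study add more coding blocks.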

The OSS implementations tested are:

  • Luby, a C version of CRS.
  • Zfec, a highly tuned Reed-Solomon library.
  • Jerasure, a GNU LGPL C library that includes RS, CRS and 3 MDR6 among others.
  • Cleversafe released an open source version of their dispersed storage system, from which the authors used just the erasure coding parts.
  • EVENODD/RDP, patented codes not available to the public and included for performance comparison.

Most important result
The study found that while tuning boosts performance and some architectures are much faster than others,

Given the speeds of current disks, the libraries explored here perform at rates that are easily fast enough to build high performance, reliable storage systems.

Translation: this isn’t string theory.

Other findings include:

  • The RAID 6 codes out-perform the general purpose codes.
  • For non-RAID 6 codes, the Cauchy Reed-Solomon performs much better than straight RS.
  • CPU architectural features, such as cache size and memory behavior, make it hard to predict an optimal data structure for a given code configuration.
  • The code’s memory and cache footprint can have a large impact on performance.
  • Specialized RAID 6 codes hold promise for creating efficient storage that can withstand numerous concurrent disk failures.
  • Multicore performance issues are largely unexplored.

The StorageMojo take
The architect for a planned commercial 200 PB cold-storage infrastructure confessed that he can see how to get to 25 PB today, but not beyond. Yet they have no choice but to start building now.

This market’s eventual structure may parallel that of today’s tape silo market: several hundred large customers who are continuously churning through rolling upgrades of media and servers.

Right now, tape silos still enjoy an economic advantage over disks. But it looks like disks have more degrees of freedom to improve their cold storage economics than tape.

In just 5 years the first exabyte cold storage systems will be on the drawing boards. It is time for disk companies to get serious about a tape-replacing archival disk. And for clever startups to focus on this emerging market.

Courteous comments welcome, of course.

{ 4 comments }

Not a filesystem, not a database.

by Robin Harris on Wednesday, 17 June, 2009

Jeff Darcy has a good post on key/value data stores, like Amazon’s Dynamo, and how they differ from filesystems and databases. He relates his transition from a filesystem purist to a more flexible perspective.

The thing that really changed my mind about this was an observation in the Dynamo paper: strong consistency reduces availability. I’ve always thought of data availability in terms of data not being lost or stranded on the other side of a failed network connection. The Dynamo insight is that many applications have to do a lot of work within a small acceptable-response-time window, and to make sure that they fit into that window they have to impose deadlines on all sub-operations including data access. If consistency issues make data unavailable within that deadline then they’ve made it unavailable period, with practically the same effect as if the data were unavailable in any other sense.

In short, while there is a class of applications where traditional consistency is important, there is an emerging class where strong consistency isn’t affordable or necessary. Good stuff.
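Jeff’s point about deadlines can be shown with a toy model. The sketch below is not Dynamo’s actual protocol; the latencies, the deadline and the function name are invented for illustration. It only shows that requiring every replica to answer makes a read fail when a single replica is slow, while requiring fewer replies lets the same read succeed inside the deadline.

```python
def read_succeeds(replica_latencies_ms, required_replies, deadline_ms):
    """True if enough replicas answer before the deadline expires.

    replica_latencies_ms: simulated response time of each replica.
    required_replies: how many replicas must answer for the read to count.
    """
    on_time = [t for t in replica_latencies_ms if t <= deadline_ms]
    return len(on_time) >= required_replies


# Three replicas; one is slow (busy disk, congested link, etc.).
latencies = [12, 18, 450]
deadline = 100  # ms the application can spare for this sub-operation

# Strong consistency: wait for all three replicas.
strict = read_succeeds(latencies, required_replies=3, deadline_ms=deadline)

# Relaxed consistency: any two replies are good enough.
relaxed = read_succeeds(latencies, required_replies=2, deadline_ms=deadline)

print(strict, relaxed)  # False True
```

The data on the slow replica is not lost, but from the application’s point of view the strict read is as unavailable as if it were.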

Another point
Many of the features that make up these non-FS/non-DB stores seem to have a lot in common with object storage. In a highly mobile world the whole idea of placing a file in cyberspace by a path name is anachronistic at best: it could be, physically, almost anywhere and is most likely in several places at once.

The StorageMojo take
While the name “object” is problematic for market acceptance, the concept of managing objects in a flat address space – like the web itself – is a better fit for a mobile networked world. There is a major opportunity to move file management infrastructure forward to reflect the world we now live in rather than a 35 year old server environment.

Courteous comments welcome, of course. Thanks to Wes Felter’s Hack the Planet blog for the link to Jeff’s post.

{ 8 comments }

Outrageously cool new hard drive

by Robin Harris on Monday, 15 June, 2009

DataSlide has come out of stealth mode with a very creative SSD replacement technology. They call it a Hard Rectangular Disk or HRD.

Here’s their quick overview:

DataSlide applies technology in new, patented ways to achieve unprecedented high performance 160,000 IOPS & 500MB/sec and low power <4 Watts for a magnetic storage device:

  1. A piezoelectric actuator keeps the rectangular media in precise motion
  2. A diamond solid lubricant coating protects the surfaces for years of worry free service
  3. A massively parallel 2D array of magnetic heads reads from or writes to up to 64 embedded
    heads at a time

Here’s a diagram, courtesy DataSlide:

But that’s not all. According to the redoubtable Chris Mellor at The Register a

. . . 2-dimensional array of 64 read-write heads, operating in parallel, . . . positioned above an piezo-electric-driven oscillating rectangular recording surface. . . .

The data organization compared to a disk drive looks like this:
courtesy DataSlide

Chris also reports that Oracle’s Embedded Global Business Unit is working with DataSlide to incorporate a database to create a “smart” storage device for use in I/O intensive “multiple concurrent stream” applications.

The company says the drive is at the prototype stage and uses existing high-volume production technologies, including perpendicular recording media, semiconductor lithographic heads and LCD glass treatments.

The StorageMojo take
DataSlide has taken much from IBM’s Millipede concept and reimagined it using common technologies. While much remains to be done to productize the prototype, the fact of such architectural creativity should spur new thinking at the hard drive companies.

Of course, just like SSDs, with such low latencies it doesn’t make much sense to stick the device at the end of a long, complex, high-latency interconnect chain. PCI-e HRD card, anyone?

Also, the relatively low capacity – 36GB – of the prototype device suggests it may slot in between larger capacity SSDs and DRAM. Until we know the economics though that is almost baseless speculation.

Let’s hope they can get it to market in less than 3 years. And let the based speculation begin!

Courteous comments welcome, of course. This post was updated from the original with the diagrams and some minor edits.

{ 8 comments }

Atmos gets no love from EMC sales

by Robin Harris on Tuesday, 9 June, 2009

A couple of reliable informants tell me the same story: EMC’s Atmos is in a fight for its life. Symm and Clariion sales people are treating the newborn product as a competitor, not another EMC product.

The dozen or so Atmos sales people – yes, they have a tiny dedicated salesforce – are finding the well poisoned almost anywhere they go. Issues such as performance, stability, quality and future support of Atmos are reportedly being raised. Perfectly fair questions for any v.1.0 product – but they’re usually asked by competitors.

To be fair to the EMC field, the Atmos product web page is not up to EMC’s usual standards – a customer testimonial is conspicuous by its absence. Nor is there much on the business case for cloud storage.

Sales people don’t get medals for being the first to sell a radical new product. The experienced ones stay away until they get good reports from someone they trust. With Atmos that could be a while.

The StorageMojo take
In EMC’s famously sales-driven culture the local offices are used to doing as they please. As long as they make their numbers Hopkinton doesn’t mind.

But Atmos is different. Scale-out architectures are the future of the industry and Atmos is EMC’s entry into the race. But EMC’s sales force doesn’t want to sell an immature product – or an architecture that will replace much of their current revenue with cheap commodity capacity.

The Atmos team is rumored to have a 1 year dispensation from making money or even many sales. That may need to be extended a couple of years.

At some point EMC’s sales force will need to get on board with Atmos. EMC better hope that some other scale-out vendor doesn’t get in those accounts first.

Courteous comments welcome, of course.

{ 7 comments }

Configure a 100 TB HD video infrastructure

by Robin Harris on Sunday, 7 June, 2009

The video folks have an interesting set of problems: large needs; major bandwidth; time-critical collaboration; lots of metadata; and more. Like budgets. I do some video production myself and empathize.

They are today where most of us will be in 10 years: lots of large files; local and remote sharing; processor and bandwidth intensive operations; large archives of wanted and rarely accessed files. Today high-end video folks are working at 2k, 4k and, sometimes, 8k video resolutions – and 10 years from now I wouldn’t be surprised if home users were too.

What prompts this is a note I received from, well, I’ll let him introduce himself.

I have a boutique post-production company and I’m a filmmaker. We are small, under a dozen, but swell to a few times that size with freelancers on a project-by-project basis. Because we work with very high resolution media, we need a lot of space, and very high throughput to each user. . . . [W]e’re all working with 2K and 4K media (300 and 1200MBps respectively to EACH user) and 3D animation rendering. . . . We use a mix of Linux, Windows, and OS X clients. In total, we could easily make use of 100TB+ right now, and prefer to stop archiving everything to tape and deleting it, but rather migrate to another tier of storage but keep in one global namespace with the tape just for disaster recovery. We also need security administration.

I can’t find a storage system that does all this. DataDirect Networks seems to be the du jour high-end storage for my industry, and supposing I’m willing to finance that big-ticket brand, they still don’t have a filing system answer. They’re suggesting StorNext or CXFS, and I know the multi-user scalability and expansion limitations well (can anybody say “forklift”?).

The closest I’ve come is Lustre. It seems like it would fit the bill nicely, especially since we’re savvy to integrate in-house, except that it is Linux only, and NFS/CIFS gateways don’t seem like a great idea. I keep hearing they’re working on at least a Windows client, but who knows when it will be ready?

Can you help at all? What have I overlooked? Doesn’t anyone make what I’m looking for?

Short answer to last question:
No.

Longer answer:
No. But there are workarounds.

For those new to video, here’s an abbreviated chart of some video rates in megabytes per second:
[Chart: video data rates. Adapted from Integrity Data Systems, which offers the whole chart. Aspect ratios and frame rates left out.]
Update: Larry Jordan, a writer and trainer in video editing, graciously wrote to let me know that the above data rates are uncompressed – and that most production houses would use compressed data. The amount of compression varies based on the codec as Larry explains in this informative post. End update.
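For readers who want to see where uncompressed numbers like these come from, here is a rough sketch of the arithmetic: pixels per frame times bytes per pixel times frames per second. The frame sizes, the 4 bytes per pixel and the 24 fps are my assumptions (full-aperture film scans with 10-bit RGB packed into 32 bits), not figures from the chart, but they land close to the 300 and 1200 MB/s the reader quotes for 2K and 4K.

```python
def uncompressed_mb_per_sec(width, height, bytes_per_pixel, fps):
    """Uncompressed video data rate in megabytes (10**6 bytes) per second."""
    return width * height * bytes_per_pixel * fps / 1_000_000


# Assumed frame sizes: full-aperture film scans, 10-bit RGB packed into
# 4 bytes per pixel, 24 frames per second.
FORMATS = {
    "2K": (2048, 1556),
    "4K": (4096, 3112),
}

rates = {
    name: uncompressed_mb_per_sec(w, h, bytes_per_pixel=4, fps=24)
    for name, (w, h) in FORMATS.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.0f} MB/s")
# 2K: 306 MB/s
# 4K: 1224 MB/s
```

Doubling both dimensions quadruples the rate, which is why each step up in resolution hurts so much.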

Issue 1: Interconnects
GigE won’t even handle 32-bit RGB standard def video. And when you get into HD video it gets hairier fast. Trunk multiple GigE’s? 10GbE? 4x Infiniband? FC? eSATA or PCI-e direct attached storage?
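A back-of-the-envelope sketch of the interconnect problem, using the per-user stream rates from the reader’s note. The link figures are nominal line rates (bits per second divided by 8); real-world throughput is lower once protocol overhead is counted, so treat the results as upper bounds. The names are illustrative.

```python
# Nominal line rates in MB/s (bits per second / 8); real throughput is lower.
LINKS_MB_S = {
    "GigE": 1_000 / 8,      # 125 MB/s
    "10GbE": 10_000 / 8,    # 1250 MB/s
}

# Per-user stream rates quoted by the reader, in MB/s.
STREAMS_MB_S = {"2K": 300, "4K": 1200}


def streams_per_link(link_mb_s, stream_mb_s):
    """Whole number of concurrent streams a link can carry at line rate."""
    return int(link_mb_s // stream_mb_s)


capacity = {
    (link, fmt): streams_per_link(link_rate, stream_rate)
    for link, link_rate in LINKS_MB_S.items()
    for fmt, stream_rate in STREAMS_MB_S.items()
}

for (link, fmt), count in capacity.items():
    print(f"{link} carries {count} x {fmt}")
```

Under these assumptions a single GigE link carries no 2K or 4K streams at all, and a 10GbE link carries four 2K streams or one 4K stream at best.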

Issue 2: Virtualization
A single address space is a wonderful thing. You’ll need a software layer that clusters multiple boxes. You’ll also probably want to build an archive infrastructure that is distinct from your higher performance working set storage, but some vendors will disagree.

Likely software suspects include IBRIX, Parascale, Caringo, MatrixStore, Bycast and Permabit.

On the combined HW/SW side there’s Panasas and Isilon. Something tells me there are some other options, like HP’s Extreme Data Storage 9100, that are also applicable.

Lustre is not a product I would recommend since it was designed for HPC, a market where PhDs work as sysadmins. Sun may have tamed it since they bought it, but it is a non-trivial piece of software.

Come one, come all
StorageMojo readers are invited to offer their 2¢ worth. Architecting is non-trivial, especially if money is an object.

Update:
Our interlocutor wrote in to add some detail:

thanks for the response. Here’s some answers:

– We can manage expensive interfaces like 10GigE and Infiniband QDR. We’ve been paying for dual-channel 4Gb FC for the past few years, after all. I just want to also allow standard Gigabit connections to the cheap seats without a lot of complexity. So I guess the jargon for that would be “multiprotocol” switching?

– The large naming space might be a luxury. The fact is that jobs come in one of three general sizes, and we could have volumes of that size waiting to take on new jobs as they come in, so at least there is one namespace per job. As you said, capacity is cheap…

– Truth is I am pretty savvy, but other than that we have a lot of power desktop users but not sysadmin types. I contract some people with steady part-time work, but it has been our business model to try to keep as many of our full-time people on the creative and producing side as possible, and not in support/administration.

The one thing I don’t understand is what you say about Infiniband not being so great when there’s lots of node churn?

I know what you mean about DAS, but I think I’ve ruled out distributing the data through push/pull from a central repository. The fact is jobs just move too fast through here for that, and we often have about two seconds notice that we need to bring a certain job’s data to System X, Y or Z to do work on it. It’s very dynamic.

I see some brands in your blog post I haven’t checked on yet.

What turned me onto Lustre is that Frantic Films in London has deployed it. They’re the only ones AFAIK.
End update.

The StorageMojo take
Some thoughts on the infrastructure issues.

Capacity is cheap, network bandwidth is expensive. Raw SATA disk is less than $0.10/GB. 10GbE switch ports are over a grand apiece. Infiniband is better from a price/performance perspective, but not as friendly for networks where there is much node churn – unless that’s been fixed in the last few years.

Direct attached storage will give you the best performance – especially with 4k. The new PCI-e attached arrays from JMR and others can offer up to 4,000 MB/sec bandwidth. Stripe across 4 of those and you’ll be able to handle 8k.

Transaction processing is well on its way to niche status, like mainframes and hierarchical databases that once ruled the earth. It is a big file world out there and the files are getting bigger every year.

Courteous comments welcome, of course. I’ve done work for many of these folks – but not all – at one time or another.

{ 25 comments }

It’s official: Data Domain’s board doesn’t like EMC

by Robin Harris on Thursday, 4 June, 2009

Joe, how about “Hawaiian shirt Fridays?”
Data Domain’s board has rejected EMC’s all-cash offer in favor of NetApp’s enhanced cash + stock offer. But shareholders get the ultimate say.

EMC is continuing with its tender offer for outstanding stock at $30/share through June 29. If it gets a majority of the ~62 million shares, it’s all over.

Except for the fallout.

No non-competes in California
While everyone has been polite so far, EMC’s hostile offer will turn up the heat. And the all-cash offer and EMC’s stagnant stock price mean that DD employees have to consider their options – which, BTW, are usually fully vested in the event of an acquisition.

NetApp has nothing to lose by formally pushing the anti-trust issues. Engineers are getting calls from old friends. Business plans are being sketched on cocktail napkins at Birk’s.

Saw VI – “DDUP, are you grateful to be alive now?”
EMC could end up with a shell company: some patents, products and a brand, but much of the newly-affluent talent gone. And maybe they can earn their cash back in the next few years before something better comes along.

The StorageMojo take
EMC has irritated more folks than the occasional blogger. Their sales-driven culture doesn’t mesh well with Valley technophiles. And their stock price is a turn-off.

Emotions will cool. We’ll have the answer by month’s end – if EMC doesn’t extend their offer. There’s been so much turnover in DDUP stock in the last 2 weeks that it’s hard to know – for outsiders – who still owns what.

Financial investors will take EMC’s money if they haven’t already sold. People who see the chance to rebuild NetApp’s fortunes – and stock price – haven’t.

Hostile takeovers have a poor record in high-tech. But that’s a risk Tucci is willing to take.

Courteous comments welcome, of course. StorageMojo will be returning to its irregularly unscheduled programming shortly.

{ 10 comments }

NetApp matches – really? – EMC’s offer

by Robin Harris on Wednesday, 3 June, 2009

Chris Mellor has the story on NetApp’s new bid. Basically NetApp has sweetened their offer of combined cash and stock to match EMC’s all-cash offer.

I commented yesterday that

If NetApp merely matches EMC’s offer they may still win the deal. Valley folks generally view them more favorably than they do EMC.

This being finance, however, the emotional decision needs to be covered in a cloth of gold. As NetApp’s CEO said in a letter to Data Domain’s Chairman, the new proposal:

. . . offers Data Domain’s stockholders a superior combination of risk-adjusted value and transaction certainty than EMC’s unsolicited acquisition proposal.

Let’s parse that.
“Transaction certainty” is NetApp’s polite way of promising to raise anti-trust issues should EMC’s bid be accepted. It’s doubtful that anti-trust would derail the deal – after all, there are a number of dedup vendors and no one is dominant – but given a new administration it might take longer than usual for Washington to OK the deal, and some strings might get attached along the way.

EMC will say “Hey! it’s all cash, so who cares if the deal closes now or in 9 months?”

The nut of NetApp’s argument is the “superior combination of risk-adjusted value” line. The cash in the offer means “value certainty.” The stock in the deal offers “. . . the potential for long-term value upside through the ongoing ownership of NetApp stock.”

There are 2 elements in favor of stock. First, the stock portion is tax-free, so if you’ve already made a ton of money on DDUP you won’t have to take your capital gains all at once. Second, maybe the 2 companies really will make a killing and drive the stock price up substantially.

If you like your DDUP stock now, NetApp offers a way to keep some of it. Both companies have traded in the $40s in the last 3 years; maybe DDUP+NTAP could be trading in the – simplistically – $80s in another 3 years.

Sure, you could buy EMC stock, but other than the unwarranted VMware mania in late ’07, EMC stock has been stuck in neutral for years.

NetApp’s checkered acquisition history
EMC partisans will point to NetApp’s acquisition problems: Internet Middleware (later NetCache); Spinnaker Networks; and Topio. Somehow, NetApp was never able to make those go.

The difference here is that both companies sell appliances. This isn’t a technology integration play or an orthogonal business venture.

NetApp’s pitch should be “look, NAS is a faster growing market than EMC’s block-based storage and DDUP makes a perfect back-end for our customer base. We’ll make a killing!”

And they probably will.

The StorageMojo take
Data Domain’s top 5 investors hold more than 50% of the shares. This isn’t going to turn into a long, drawn-out proxy fight.

If they like NetApp more than EMC – which they probably do – a few head nods around a conference table will seal the deal. EMC can counter again, but if NetApp’s cash+stock offer is attractive now, it will only get more so if they match EMC again.

NetApp will close this deal and EMC will be left looking a little foolish. The good news: there are plenty of other good acquisition candidates that could power EMC’s growth in the next decade.

Courteous comments welcome, of course.

{ 10 comments }