Great work at FAST ’11

by Robin Harris on Thursday, 17 February, 2011

After a quick scan of the paper titles I wasn’t impressed. But after seeing presentations and posters I am.

Here are some I found interesting. I’ll be posting longer pieces on some of these.

  • A Study of Practical Deduplication Full paper *Best Paper Winner*
  • Tradeoffs in Scalable Data Routing for Deduplication Clusters Full paper
  • Exploiting Half-Wits: Smarter Storage for Low-Power Devices Full paper
  • Reliably Erasing Data from Flash-Based Solid State Drives Full paper
  • Scale and Concurrency of GIGA+: File System Directories with Millions of Files Full paper
  • Emulating Goliath Storage Systems with David Full paper *Best Paper Winner*

An excellent conference. NetApp, EMC, Microsoft and IBM were recruiting.

The StorageMojo take
We’re still learning about flash, and the research presented here is a substantial addition to our meager knowledge.

Microsoft tells me they’re delivering major improvements to NTFS and Windows Server later this year. I’m looking forward to that briefing.

And it’s always a pleasure catching up with the people who, for some reason, never come to Sedona.

Courteous comments welcome, as always.

{ 3 comments… read them below or add one }

Ryan Friday, 18 February, 2011 at 9:55 am

I’ve been wondering for a while when Microsoft will update NTFS… it’s starting to look pretty dated compared to ZFS, BTRFS, etc., and the various size limits are becoming an issue more and more frequently (as well as associated issues, e.g. the maximum size for a VHD is 2 TB).
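(The 2 TB VHD ceiling falls out of the format’s 32-bit sector addressing – my reading of the VHD spec, not anything from Microsoft’s briefing: with 512-byte sectors, a 32-bit sector number tops out at 2 TiB.)

```python
# VHD addresses sectors with 32-bit numbers; sectors are 512 bytes.
SECTOR_SIZE = 512          # bytes per sector
MAX_SECTORS = 2 ** 32      # 32-bit sector addressing

max_bytes = MAX_SECTORS * SECTOR_SIZE
print(max_bytes // 1024 ** 4, "TiB")   # -> 2 TiB
```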

When is the briefing? Are you able to share anything? We’re making storage decisions right now that would greatly benefit from an NTFS roadmap. For example, one solution (Nexenta) would involve layering NTFS over ZFS to take advantage of the advanced caching options. It’d be great to avoid that kind of complexity.

John (other John) Saturday, 19 February, 2011 at 4:48 pm

Ryan,

look up tNTFS.

Unfortunately there’s not a lot to read. Where’s Russinovich’s book when you need it?

Copies already do a kind of transactional integrity. Mea culpa – I checked the link again and they claim full ACID.

http://technet.microsoft.com/en-us/library/cc730726(WS.10).aspx

Now I know why it’s hard to find – it’s also referred to as “Transacted NTFS” and “TxF”.

Also look up “self healing NTFS”
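For anyone without a Windows box handy: TxF itself is Windows-only, exposed through the Kernel Transaction Manager, so no portable code can call it. But the core guarantee – an all-or-nothing file update – can be sketched portably with the classic write-temp-then-rename pattern. A rough approximation of what TxF gives you, not its API:

```python
import os
import tempfile

def atomic_write(path: str, data: bytes) -> None:
    """All-or-nothing file update: readers see either the old
    contents or the new, never a partial write."""
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())    # make the temp file durable first
        os.replace(tmp, path)       # atomic rename on POSIX and NTFS
    except BaseException:
        os.unlink(tmp)              # failed partway: discard the temp file
        raise
```

TxF goes further, of course – it spans multiple files and registry keys in one transaction – which a rename trick can’t do.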

Was on an upgrade cycle anyway, but Win7 is causing far less bother on desktops which are running heavier stuff. Methinks, no coincidence. I recently started thinking, if only we knew more, there could be much heavier lift done on the current Windows.

I don’t for a moment think this is petascale ready, but there’s a lot more going on under the hood. MSFT has to save cra**y programmers from knowing about I/O, ya know, so they had to do something :-)

It only gets funny when you think you’re going to keep a namespace or even contiguous bare space across one of their “clusters”. But when people finally clamour for that they’ll just expose some old Cutler code stashed away in the kernel :-)

It’s like serializing a novel (awful pun, sorry), to sell the newspaper . . .

Face it, the thumper / OpenSolaris-derived kit has been giving people real headaches. Is BTRFS even production code? Does Oracle care, when they built a layer to hit drive firmware direct? Do I get someone to really blame if I go for a “thin” object layer or whatever, who prices in the same way the big boys do?

This necessary revolution will be invisible, even if thanks to Robin and others, it is not silent. (checks pension policy date to estimate timetable!)

Cheers for the paper links, Robin – at first skim these are truly useful. Thanks for passing them on.

– j

John (other John) Saturday, 19 February, 2011 at 5:43 pm

Ryan,

I think the NTFS roadmap briefing was in 1995, wasn’t it? :-)

(as in “yeah, we’re putting SQLserver on the disk, yay!”)

sorry, couldn’t help myself . . .

Looked at NTFS over Nexenta here as well. Bear in mind what we’re playing with does not enjoy network latencies, and wants fat pipes, so that layer isn’t cheap. Better to get one mammoth box plus DAS and remote in. But just for the additional complexity and licensing, and possible dependency on an OS fork, I think Nexenta is fundamentally a no-go (more below). Unless you find some way to direct-attach a Nexenta box to Windows, caching won’t help much unless you buy low-latency switches – in any use case, I believe, save maybe lots of common hot data. That’s expensive, and if your cable puller isn’t serious, you’ll lose on the jitter.

Do you write your own I/O? Have enough stable libraries you can aggressively test against? T10 DIF is the goal. (This is the firmware checksum Oracle talks to; on Linux you can use their stuff at no charge.)

As far as I’ve got is ferreting out some likely suspects to tell me how T10 DIF is or isn’t supplied by Hitachi kit. On the specs, it seems standard now.

But think about it: if current NTFS is doing ACID and integrity / error scanning, it will make some assumptions, which will surely break if you put in another layer trying to do the same thing in different ways.
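(For anyone unfamiliar with it: T10 DIF extends each 512-byte sector with an 8-byte protection tuple – a 2-byte guard checksum, using CRC16 with polynomial 0x8BB7, a 2-byte application tag and a 4-byte reference tag carrying the low bits of the LBA – so the drive, the HBA and the OS can each verify data on the way through. A sketch of that layout; the function names are mine, not the standard’s:)

```python
import struct

T10_POLY = 0x8BB7  # CRC16 polynomial specified for the T10 DIF guard

def crc16_t10dif(data: bytes) -> int:
    """Bitwise (slow but clear) CRC16 over the sector data."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def protect_sector(data: bytes, lba: int, app_tag: int = 0) -> bytes:
    """512 data bytes + 8-byte DIF tuple = 520-byte protected sector."""
    assert len(data) == 512
    guard = crc16_t10dif(data)
    ref_tag = lba & 0xFFFFFFFF       # low 32 bits of the LBA
    return data + struct.pack(">HHI", guard, app_tag, ref_tag)

sector = protect_sector(b"\x00" * 512, lba=1234)
print(len(sector))  # -> 520
```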

Got the same problem here, deciding what to do for an NTFS store. But what’s your scale, traffic, usage patterns, centile loads? Do you need serious data integrity? Do you throw massive files about? (Those last two are the same question, usually.)

Personally, I think the only time you want NAS is when you must: clusters, virtual desktops, physical security, or when you have average light load coming from lots of clients, or wildly duplicated data. May have missed some use cases.

I really think you should explain your problems to a big name, and, having prepared a few foolscaps of good Qs backed with some of your stats, sit back and enjoy the hospitality. The only way to find answers is to hang around like you’re some two-bit lady of the night, and ignore the ones who pounce. You can only do that if you go say “Hi”. Will be useful, and if you’re prepared, possibly illuminating.

(Sorry for the rude analogy, but I’ve felt treated like that before, so I decided instead of taking offense, I’d get everyone to laugh about it. Defusing sales types is the first objective in securing the Real Price.*)

If you’re doing network I/O heavily, or tons of virtualization, this Nate guy’s writing was very helpful to me, so ask him maybe.

Please excuse my waffle. I’m very keen on DIY, but DIY only adds up on some occasions – often very nicely – and I guessed from your language you are after something specific. When it does add up, and you do get that big win, you get that glow of satisfaction, like your muscles pleasantly aching after digging a tunnel with a spoon :-)

best to all,

– j

*I like to think I could sell, once!
