StorageMojo




Robin Harris    


Storage Management Declining & Not A Moment Too Soon

August 24th, 2006 by Robin Harris in Enterprise, SAN, FC

The good people at Computer Economics have recently published an article entitled Storage Management Disciplines are Declining. Which is the best thing I’ve heard all week. Unless your job title includes the words “Storage Manager”.

Unused Capacity: Horrors!
CE has looked at data center benchmarks for disk capacity utilization and percent of capacity accessed within the last 15 days over the last decade and found “. . . by both of these measures, disk storage management practices have declined over the past decade, in mainframe, UNIX, and Windows server environments.”

Here are some stats about mainframes, which are all that is available in the freebie version on their website. I think it is safe to assume the trends are worse for the lower-cost systems. I’m in touch with CE, so maybe I’ll be able to get more detailed info into a future post.

Mainframe Average                  1997       2006
Installed MIPS                     575        2028
Installed Disk                     2.2 TB     11.6 TB
Allocated Disk                     68.5%      61.3%
Unused Disk                        0.7 TB     4.5 TB
Disk Accessed in Past 15 Days      78.8%      72.6%
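
As a quick sanity check on the table above, the unused-disk row follows from the installed and allocated rows. This is only a sketch in Python using the figures as published; the variable and function names are mine, not CE's.

```python
# Figures copied from the Computer Economics mainframe averages above.
installed_tb = {1997: 2.2, 2006: 11.6}
allocated_pct = {1997: 68.5, 2006: 61.3}

def unused_tb(year):
    """Unused capacity = installed capacity x (100% - allocated %)."""
    return installed_tb[year] * (100.0 - allocated_pct[year]) / 100.0

for year in (1997, 2006):
    print(f"{year}: {unused_tb(year):.1f} TB unused")

# How much the idle capacity grew over the decade.
growth = unused_tb(2006) / unused_tb(1997)
print(f"Unused capacity grew roughly {growth:.1f}x")
```

The computed values round to the 0.7 TB and 4.5 TB in the table, so the average shop went from well under a terabyte of idle mainframe disk to several terabytes.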

Unix & Windows Capacity Allocation

                  UNIX       Windows
Local Attach      56.6%      46.6%
SAN Attach        75.5%      55.8%

Fuel Tank Management Practices Decline
The storage management hairball has always puzzled me. Why do we manage storage? Because it is a precious resource, right? But prices have been dropping like a rock for decades, so why is it precious? Enterprises everywhere are happy to invest in personal and server resources whose utilization ranges from ~3% (PCs) to ~20% (servers). So why does storage need to be at 60-70%?

You don’t spend a lot of time worrying about how much fuel is in your car’s tank. When it gets “low” - however you define that - you fill it up. Storage systems are the inverse: when they get “too full” you add capacity or archive. When I was a starving student, I managed my fuel tank pretty closely to conserve cash. And now that gas is $3 a gallon, I’m more thoughtful about it again. But storage capacity is as cheap as it’s ever been and getting cheaper. So why do we have to manage it? The one good argument is that since prices are dropping so fast you can save a lot of money by delaying purchases. On the other hand you are spending a lot of money to manage what should be a cheap resource. Why?

Problem or Opportunity?
The fuel tank analogy is instructive. Unlike adding gas to a car, adding capacity to a SAN or an array is an involved process. You have no choice but to “manage” the storage and risk your data. Compare that to ZFS, which has implemented storage pools, so new disks are automatically added to the available capacity. Google’s GFS has a similar capability.

Continuing the automotive analogy: one of my first cars had a manual choke. Chokes control a carburetor’s fuel-air ratio for starting the car when the engine is cold. You had to remember to reduce the choke as the engine warmed up or else you’d be burning a lot of gas and gunking up the engine. Manual chokes got replaced with automatic chokes that had bi-metallic coils that would control the choke plate. They’d usually work but could stick, so you’d be burning a lot of gas and gunking up the engine and not even know it. The solution: get rid of the carburetor - a mechanical analog computer - and replace it with fuel injection.

Storage Injection: It’s What The Big Boys Use
Right now we are using expensive people and expensive software to manage cheap disks wrapped in expensive cabinets and controllers. Rather than an automatic choke, why don’t we go to fuel injection, i.e. storage pooling?

Net Net
This decline in storage management is good news. It demonstrates that as storage prices decline, practitioners are making the rational economic decision to trade opex for capex: investing more in hardware and less in management.

Memo To Storage Titans
To: Storage Titans
From: StorageMojo.com
First thing this morning: wake up, smell coffee, change course. What this research says is that if you make storage cheaper and easier to use, people will buy and “waste” a lot more of it, just as they do with PC CPU cycles. Someone is going to be the Toyota of next-gen storage. Why not you?

Comments welcome, of course.

An Open-Source SAN

August 17th, 2006 by Robin Harris in NAS, IP, iSCSI, SAN, FC, SOHO/SMB

Update: Over at TechRepublic, Scott Lowe offers another view of AoE here. If I were an SMB VAR, I’d be checking AoE out.

It Is About Time
Here’s a potential game-changer - especially for the SMB market. It is low-cost SAN functionality based on local Ethernet. From a company named Coraid. Available for Windows, Linux, Solaris, FreeBSD and Mac OS X.

Wait a minute? Isn’t that iSCSI? It is a block device after all. Nope. Different. IMHO, better. There are some Don’t Gets, and a lot of Don’t Needs.

Putting Local - And Storage - Back In LAN
Coraid’s innovation is the open ATA over Ethernet (AoE) protocol. The big Don’t Get is that the protocol isn’t routable - it is strictly local - no IP involved. So the Don’t Needs include TCP/IP overhead, TCP/IP offload engines, and CPU-cycle-sucking, latency-inducing TCP/IP stacks. AoE sits right on the data link layer - layer two - of the OSI network model, so with a switched LAN - is there any other kind these days? - you get very low latency and full network bandwidth across a low-cost, industry-standard LAN.
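
To make "no IP involved" concrete, here is a rough Python sketch of what an AoE frame header looks like on the wire: an ordinary Ethernet header followed directly by a small AoE header, with nothing in between. The field layout and constants reflect my reading of the published AoE specification and should be checked against it; the function name and the example MAC address are made up for illustration.

```python
import struct

AOE_ETHERTYPE = 0x88A2    # EtherType registered for ATA over Ethernet
AOE_VERSION = 1
CMD_ATA = 0               # issue an ATA command
CMD_QUERY_CONFIG = 1      # query config information (used for discovery)

def build_aoe_header(dst_mac, src_mac, major, minor, command, tag, flags=0):
    """Pack a 14-byte Ethernet header plus the 10-byte AoE common header.

    No IP addresses and no TCP ports appear anywhere: a target is addressed
    only by MAC address and a shelf (major) / slot (minor) pair.
    """
    ver_flags = (AOE_VERSION << 4) | (flags & 0x0F)
    error = 0
    return struct.pack(
        "!6s6sHBBHBBI",
        dst_mac, src_mac, AOE_ETHERTYPE,            # Ethernet header
        ver_flags, error, major, minor, command, tag,  # AoE common header
    )

# Hypothetical discovery header: broadcast MAC, all shelves, all slots.
hdr = build_aoe_header(
    dst_mac=b"\xff" * 6,
    src_mac=bytes.fromhex("020000000001"),  # made-up locally administered MAC
    major=0xFFFF,
    minor=0xFF,
    command=CMD_QUERY_CONFIG,
    tag=1,
)
print(len(hdr), hdr.hex())
```

This builds only the common header; a real request carries command-specific fields after it. The point is the size: 24 bytes of framing before the payload, and since there is no layer-three header there is nothing for a router to route.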

The other big Don’t Get: expensive and finicky Fibre Channel HBAs, switches and storage, along with the extra bandwidth FC offers. Like FC, AoE appears to make very effective use of available bandwidth - maxing it out with storage traffic. You’ll want a dedicated storage network to run AoE across.

Practice Makes Perfect
Even though it is cleared for use with Oracle, it probably isn’t a solution, today, for habitually late adopters. You’ll need to think through your security and system management processes to ensure that data doesn’t get munged by an inattentive sysadmin. A dedicated AoE SAN is a start, and VLAN techniques can help partition off potential damage-doers. The key: it just looks like a disk, and anything goofy you can do to a disk you can do over AoE.

Write Once, Read Never?
So far it appears that Coraid is the only company building AoE hardware. It doesn’t appear they are trying to keep anyone else from doing it; it just hasn’t happened yet. That might be a worry for some folks. So in a smart move, Coraid has a Linux tool called srcat, a tool for recovering data from the raw disks on a Coraid JBOD or array. So if the company goes belly up, or a controller breaks and no replacements are available, you can still pull the drives and use srcat to pull the data off. Neat.

StorageMojo.com Take
Congrats to Coraid for a creative way to bring the benefits of network economics to storage networks, just as some of us thought FC would 10 years ago. By creating an open platform and protocol, they’ve started the open-source equivalent of a SAN. If you require - or would like to be able to afford - a lot of storage capacity, you should certainly check these guys out.

Update: Over at Tech Republic, Scott Lowe offers some more info on AoE. The (literal) money quote:

AoE is cheap! An array capable of supporting up to 11.25 TB from Coraid starts at less than $4,000 without disks. Today’s price for a 750-GB disk at NewEgg.com is $400 and the unit supports 15 disks. So, for less than 10 grand, you can get 11.25 TB of shared block storage. If you do the math, that runs at about $888/TB or $0.87/GB. Not bad!
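
The arithmetic in that quote is easy to check. A small Python sketch, treating the "less than $4,000" chassis price as exactly $4,000 (so the result is an upper bound) and using variable names of my own choosing:

```python
chassis_usd = 4000   # "less than $4,000 without disks", taken as the ceiling
disk_usd = 400       # quoted NewEgg price for one 750 GB disk
disk_gb = 750
slots = 15

capacity_tb = slots * disk_gb / 1000          # decimal terabytes
total_usd = chassis_usd + slots * disk_usd
usd_per_tb = total_usd / capacity_tb

print(f"Capacity: {capacity_tb} TB")
print(f"Total:    ${total_usd}")
print(f"Per TB:   ${usd_per_tb:.2f}")
print(f"Per GB:   ${usd_per_tb / 1024:.2f} (at 1024 GB per TB)")
print(f"Per GB:   ${usd_per_tb / 1000:.2f} (at 1000 GB per TB)")
```

The quoted $0.87/GB matches dividing the per-TB figure by 1024; with 1000 GB per TB it comes out nearer $0.89. Either way, it is well under a dollar a gigabyte for shared block storage.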

ZFS Performance Versus Hardware RAID

August 15th, 2006 by Robin Harris in Enterprise, Future Tech, SAN, FC

Over at the OpenSolaris Forums (zfs » discuss), Robert Milkowski has posted some promising test results.

Hard vs. Soft
Possibly the longest running battle in RAID circles is which is faster, hardware RAID or software RAID. Before RAID was RAID, software disk mirroring (RAID 1) was a huge profit generator for system vendors, who sold it as an add-on to their operating systems. With the advent of hardware RAID systems the battle was joined until the hardware array emerged victorious. Software RAID has been relegated to low-end, low-cost applications where folks didn’t want to spend even a few hundred dollars for a PCI RAID controller.

It’s All Software RAID
Yet the fact is that it is all software RAID - it is just a question of where the software runs. Throw a lot of hardware (and cash) at a problem and even dodgy code runs acceptably. Yet the investment that requires is also the Achilles heel of hardware RAID: once you get everything working right on a specific platform you want to just keep selling it, even as the hardware becomes technologically obsolete. It is no accident that EMC’s capacity-based pricing tiers made it uneconomic to fully expand a Symmetrix. The platform would max out well before capacity limits were reached because it was running on microprocessors that might be five years old.

Let The Battle Begin, Again!
So I’m excited to see the battle joined again. Server processors usually advance much faster than the add-on co-processors - with the major exception of graphics processors where gamer demand has driven incredible progress - so host-based RAID has a lot of built-in hardware investment behind it. ZFS offers a fundamentally re-architected RAID that is designed to overcome the traditional limitations of host-based RAID - which lacks non-volatile cache - by smart engineering.

So Does It Work, Already?
Short answer: yes. It is still early, both in ZFS development and in testing, but some highly suggestive numbers have been published here and here.

Robert tested against a modern, modular storage array, the Sun StorageTek 3510 FC Array, which offers a gigabyte of cache and 2Gb FC. Not an HDS Tagma, but I’d guess that in performance it is pretty close and that what it mostly lacks is the scalability of the larger enterprise systems.

Results:

With Hardware RAID
Robert ran these tests on a Sun Fire V440 Server. He first ran the filebench varmail tests using ZFS on the hardware RAID LUNs the 3510 provides, and ran each test twice:

IO Summary: 499078 ops 8248.0 ops/s, 40.6mb/s, 6.0ms latency
IO Summary: 503112 ops 8320.2 ops/s, 41.0mb/s, 5.9ms latency

Then he ran the same tests using the 3510s as Just a Bunch of Disks (JBODs) and got these results:

IO Summary: 558331 ops 9244.1 ops/s, 45.2mb/s, 5.2ms latency
IO Summary: 537542 ops 8899.9 ops/s, 43.5mb/s, 5.4ms latency

Net Net
A strong showing by ZFS: ~10% more IOPS; ~10% lower latency; ~10% more bandwidth. Equivalent performance at a much lower cost. Promising news for ZFS adopters and those of us cheering from the sidelines.
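
For anyone who wants to check the "~10%" figures, here is a short Python sketch that averages the two runs of each configuration and compares them. The four summary lines are copied verbatim from the results above; the parsing pattern and helper names are mine.

```python
import re
from statistics import mean

HW_RAID_LUNS = [
    "IO Summary: 499078 ops 8248.0 ops/s, 40.6mb/s, 6.0ms latency",
    "IO Summary: 503112 ops 8320.2 ops/s, 41.0mb/s, 5.9ms latency",
]
JBOD = [
    "IO Summary: 558331 ops 9244.1 ops/s, 45.2mb/s, 5.2ms latency",
    "IO Summary: 537542 ops 8899.9 ops/s, 43.5mb/s, 5.4ms latency",
]

PATTERN = re.compile(r"([\d.]+) ops/s, ([\d.]+)mb/s, ([\d.]+)ms latency")

def averages(lines):
    """Return (ops/s, MB/s, latency ms), each averaged over the runs."""
    rows = [tuple(float(x) for x in PATTERN.search(line).groups())
            for line in lines]
    return tuple(mean(col) for col in zip(*rows))

def pct_change(old, new):
    return (new - old) / old * 100.0

hw = averages(HW_RAID_LUNS)
jbod = averages(JBOD)
changes = [pct_change(o, n) for o, n in zip(hw, jbod)]

for name, change in zip(("ops/s", "MB/s", "latency"), changes):
    print(f"{name:8s} {change:+.1f}%")
```

With only two runs per configuration this is a rough comparison and not a statistically careful one, but the averages land within a point or two of 10% on all three measures, which is what the summary claims.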

Brocade Buys McData: Yawn.

August 11th, 2006 by Robin Harris in Enterprise, NAS, IP, iSCSI, SAN, FC

From the Too-Little, Too-Late Department
Eyes glazed over at the news that Brocade is buying McData. The Wall Street Journal reported (subscription required) that Brocade CEO Mike Klayko told analysts that “customers are frustrated by equipment that doesn’t work together well.”

Well, duh. Network equipment? Double duh.

The Wages Of Sin
Fibre Channel has never fulfilled its early promise. Partly because it isn’t quite a network - it’s a channel - and mostly because everyone imported storage business tactics. The chief tactic: minimal interoperability with other storage vendors to ensure lock-in.

The problem is that applied to a network, vendor lock-in means you don’t get the advantages of network economics. In a nutshell, the value of the network increases as it grows while the cost of connecting drops. That is why all networks get linked: the interconnection cost is cheap compared to what has already been spent, while the benefits are huge. Have you heard of Cisco?

Virtuous Cycle Of Network Economics
A single telephone is worthless. Two connected telephones are more valuable. A billion connected telephones are invaluable. And due to learning curve effects the cost of that billionth telephone is much lower than the first.
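
One common way to put numbers on the telephone example is to count possible pairwise connections (the idea behind what is usually called Metcalfe's law) and to model unit cost with a learning curve (Wright's law). Neither model is named in this post, and the 80% progress ratio and $100 first-unit cost below are purely hypothetical, so treat this Python sketch as an illustration only.

```python
import math

def possible_links(n):
    """Distinct pairwise connections among n endpoints: n(n-1)/2."""
    return n * (n - 1) // 2

def unit_cost(n, first_unit_cost, progress_ratio):
    """Learning-curve cost of the n-th unit.

    Each doubling of cumulative output multiplies the unit cost by
    progress_ratio (0.8 means a 20% drop per doubling).
    """
    return first_unit_cost * n ** math.log2(progress_ratio)

for n in (1, 2, 1_000_000_000):
    links = possible_links(n)
    cost = unit_cost(n, first_unit_cost=100.0, progress_ratio=0.8)
    print(f"{n:>13,} phones: {links:,} possible links, unit cost ${cost:,.2f}")
```

One phone has zero possible links, two have one, and a billion have roughly half a quintillion, while the modeled cost of each additional phone keeps falling. Lock-in gives up both effects at once.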

Fibre Channel Inflection Point
Consolidation usually occurs in maturing industries, as it has in disk drives, for one or more of several reasons, such as increasing capital intensity (semiconductors), economies of scale (automobiles), or acquiring customers (soft drinks). In this case though it is happening because Fibre Channel is beginning a long decline.

Customers have seen an anemic ROI for their billions in FC investment. Without network economics, FC cannot compete with Ethernet over the long term. And now the long term has arrived.

Can This Technology Be Saved?
Not likely. Ultimately, it is the folks that connect to the network who must decide that compatibility is in their interest. Remember IBM’s very silly anti-Ethernet Token Ring network? IBM pushed it hard and lots of their most trusting customers bought it, only to face a painful migration a few years later. That is how you turn trusting customers into suspicious customers.

Storage vendors do not believe in interoperability, do not support it, and have no interest in encouraging mixed vendor FC infrastructures. So design and management are unnecessarily painful and expensive.

On the Ethernet/IP/iSCSI side of the house, however, compatibility with the network is the only option. Network and semiconductor economics are implacable. In ten years, Fibre Channel will be one of those legacy technologies used only where niche economics or customer sentiment dictate.

Apple Mojo In High-End Storage

June 27th, 2006 by Robin Harris in Future Tech, SAN, FC

As an aside, StorageMojo.com noted yesterday (see “Bring Me The Head Of WinFS”) that with the demise of WinFS in Vista, Apple has an even greater opportunity to stick it to Redmond by incorporating the leading edge open source ZFS file system cum storage manager into the next major release (Leopard) of Mac OS X (for more on ZFS see ZFS: Threat or Menace?).

Quality Low-Cost Storage For the Rest of Us
Apple’s rumored adoption of ZFS would also add significant Mojo to its server and its Fibre Channel Xserve RAID storage business. Xserve RAID has Apple’s typical design goodness and management simplicity, combined with industry-leading pricing of less than $2k per Terabyte. To do that the Xserve RAID’s RAID controllers dropped the expensive and tricky dual ported cache that enables controller failover. Unless you use server-based RAID, the loss of a controller means the loss of access to the disks behind that controller, so Xserve RAID isn’t enterprise class. With ZFS they’d have high-performance dual-parity RAID in software that would make a virtue of the Xserve’s RAID architecture.

This would be too smart for words. Steve Jobs’ modest investment in Xserve RAID and Xsan shows he is willing to push the envelope on Apple’s high-end storage as long as it doesn’t cost anything. ZFS support in Leopard would fit the bill perfectly.

Oh, And One More Thing
Expect Apple to announce soon a 10.5 Terabyte Xserve RAID configuration for $15k, dropping it to under $1500/TB. This translates to 147 TB in a 42U rack - over 12 TB per square foot. Will it go to 4Gb Fibre Channel and SATA drives? Stay tuned.
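
The numbers above hang together if you assume each Xserve RAID enclosure takes 3U of rack space, which is my assumption here and not something stated in the post. A quick Python check:

```python
config_tb = 10.5
config_usd = 15_000
rack_units = 42
enclosure_units = 3      # assumption: one Xserve RAID enclosure is 3U

usd_per_tb = config_usd / config_tb
enclosures_per_rack = rack_units // enclosure_units
rack_tb = enclosures_per_rack * config_tb

# Largest rack footprint that still gives "over 12 TB per square foot".
max_footprint_sqft = rack_tb / 12

print(f"${usd_per_tb:,.0f} per TB")
print(f"{enclosures_per_rack} enclosures, {rack_tb} TB per rack")
print(f"Footprint must be under {max_footprint_sqft:.2f} sq ft")
```

Fourteen enclosures at 10.5 TB each gives the 147 TB figure, and the density claim holds for any rack footprint under about 12 square feet.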

Update 2.0
For more on what ZFS on Leopard would mean see this post.

Update 1.0
Alert reader ZDigital pointed out that Xserve RAID does have dual-redundant RAID controllers, so I modified the post.

The original post:

To do that Apple dropped dual-redundant controllers in favor of server-based RAID, so Xserve RAID isn’t enterprise class.

The correction:

To do that the Xserve RAID’s RAID controllers dropped the expensive and tricky dual ported cache that enables controller failover. Unless you use server-based RAID, the loss of a controller means the loss of access to the disks behind that controller, so Xserve RAID isn’t enterprise class.

Comments Welcome

ZFS: Threat or Menace? Pt. I

May 26th, 2006 by Robin Harris in Enterprise, Future Tech, NAS, IP, iSCSI, SAN, FC

IMHO, both. In a storage industry where the hardware cost to protect data keeps rising, ZFS represents a software solution to the problem of wobbly disks and data corruption. Thus it is a threat to the hardened disk array model of very expensive engineering on the outside to protect the soft underbelly of ever-cheaper disks on the inside.

It’s the Software Version of the Initiation Rite in A Man Called Horse
Before I jump into the review of ZFS, let me share what I like best about it, from a slide in the modestly titled “ZFS, The Last Word In Filesystems” presentation:

ZFS Test Methodology

  • A Product is only as good as its test suite [amen, brother!]
    • ZFS designed to run in either user or kernel context
    • Nightly “ztest” program does all of the following in parallel:
      • Read, write, create and delete files and directories
      • Create and destroy entire filesystem and storage pools
      • Turn compression on and off (while FS is active)
      • Change checksum algorithm (while FS is active)
      • Add and remove devices (while pool is active)
      • Change I/O caching and scheduling policies (while pool is active)
      • Scribble random garbage on one side of live mirror to test self-healing data
      • Force violent crashes to simulate power loss, then verify pool integrity
    • Probably more abuse in 20 seconds than you’d see in a lifetime
    • ZFS has been subjected to over a million forced, violent crashes without losing data integrity or leaking a single block
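
The "scribble random garbage on one side of live mirror" item is the easiest one to picture in code. What follows is a toy Python model of checksum-verified mirroring, not ZFS's actual implementation: the class, its methods, and the choice of SHA-256 are all mine. The one idea borrowed from ZFS is that checksums are kept apart from the data they cover, so a bad copy can be told from a good one.

```python
import hashlib
import os

class ToyMirror:
    """Two-way mirror that keeps one checksum per block, apart from the data."""

    def __init__(self, nblocks, blocksize=16):
        empty = bytes(blocksize)
        self.sides = [[empty] * nblocks for _ in range(2)]
        self.sums = [hashlib.sha256(empty).digest()] * nblocks

    def write(self, i, data):
        for side in self.sides:
            side[i] = data
        self.sums[i] = hashlib.sha256(data).digest()

    def read(self, i):
        """Return the first copy that matches its checksum, repairing the rest."""
        for side in self.sides:
            if hashlib.sha256(side[i]).digest() == self.sums[i]:
                good = side[i]
                for other in self.sides:
                    if other[i] != good:
                        other[i] = good      # self-heal the damaged copy
                return good
        raise IOError(f"block {i}: no copy matches its checksum")

# A ztest-style abuse step: scribble garbage on one side, then read.
m = ToyMirror(nblocks=4)
m.write(2, b"important data!!")
m.sides[0][2] = os.urandom(16)
recovered = m.read(2)
print(recovered, m.sides[0][2] == m.sides[1][2])
```

A plain mirror without checksums has no way to know which side is telling the truth. Here the read returns the good copy and quietly rewrites the damaged one, which is the behavior that slide item is testing.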

Read The Rest Of ZFS: Threat or Menace? Pt. I

EMC Investment in YottaYotta Confirmed

May 1st, 2006 by Robin Harris in Enterprise, Future Tech, SAN, FC

Unregenerate storage geeks talk about creating globally coherent distributed block storage services, but YottaYotta has done something about it. The busy elves at YY’s Edmonton HQ have done it, and now they have EMC’s investment to prove it.

Senior EMC executives today confirmed that EMC has invested, so far, about US$4 million in the Canadian company. In addition, there are several on-going projects at large EMC customers. As these demonstrate the utility of YY’s products, EMC expects to make another, larger investment. Concurrent with that investment it is expected that EMC will ink an OEM deal with YY.

The lead VC for YY, Technocap, a Montreal-based firm, badly needs YY to be a success since so many of their other dot-bomb era investments, including the very expensive Hyperchip, have gone bust. Richard Prytula, Technocap’s president, has privately described EMC as a “dream” partner while publicly remaining mum about the EMC relationship.

I hate to rain on their parade, but EMC has turned out to be more of a nightmare for many of those who cut deals with them. Ask HP. I personally worked at EMC’s largest reseller where I found that getting them to return a phone call was a minor victory. And all the honeyed words about cooperation and partnership from corporate at contract signing time were only dimly heard in the local sales offices, and then thoroughly ignored.

Congratulations on the EMC investment, YY. And good luck on making the relationship work for you as well as it works for EMC.

So Mr. Tucci, Where Are EMC’s Google Application Notes?

April 29th, 2006 by Robin Harris in Enterprise, Future Tech, SAN, FC

They’ve come out of nowhere and in a few short years built one of the world’s largest always-on data centers supporting data- and compute-intensive applications such as search, mail, chat, mapping, blogging and much more. They roll out new applications faster than anyone in the business, including such deep-pocketed and savvy competitors as Microsoft and Yahoo.

Someone has to ask: how could more than 20,000 terabytes of mission-critical storage be built and operated WITHOUT any of the “big iron” storage vendors? Google collectively probably hasn’t spent two minutes thinking about EMC, but the gnomes of Hopkinton are praying that their big customers don’t notice. Fat frickin’ chance!

With all the chatter about SOA and Web 2.0, Google is exhibit “A” for someone who is doing it, not for a specialized application with a single dominant data type, but for a dozen widely divergent data types, 7×24. The Google platform is clearly an incredible competitive advantage.

So where are all the proud “application notes” that vendors buff up to show just how indispensable they are to the creation of these vital money-spinning applications?

EMC? HP? IBM? Sun? Anyone?

They don’t exist.

I’ll be exploring the implications in future posts, but consider this:

  • Numbers are scarce (no application notes?) but the best estimates are that Google currently is managing well in excess of 20,000 Terabytes of storage. Only the NSA’s domestic surveillance program is likely to be in significant excess of that, and they don’t have to file quarterly reports with the SEC.
  • Other than a possible version of software disk mirroring (RAID 1) it appears that Google does this without any big iron RAID boxes.
  • This is less certain, but it also appears that Google has built their platform on PATA and SATA disks. You know, the ones that are so flaky that the Conventional Wisdom is stampeding to RAID 6 as the only (and surprisingly expensive) way to safely incorporate them into mission-critical 7 x 24 production infrastructures like, you know, everyone but Google needs.

I’ve been puzzled for years over why cheap, high volume storage hasn’t made it into the data center as so many other high volume consumer technologies have. In Google I think I have my answer: it has, using hardware so cheap that the people who build it can’t afford slick “application notes”, big user groups, fat contracts for the “independent” analysts and four color ads in all the IT publications. Not to mention that Google has no incentive to give their secrets away.

Developing . . . .

EMC Buys Distributed Caching Technology For Coming Google Battle

April 28th, 2006 by Robin Harris in Enterprise, Future Tech, SAN, FC, Security & Public Policy

EMC’s GM of the Grid & Utility Computing, Ian Baird, mentioned at EMC World in Boston this week that EMC had invested in distributed caching technology developed by YottaYotta, a Canadian startup, for their “Grid Storage” strategic direction.

Distributed caching technology is crucial to creating WAN-based storage infrastructures that operate as if local, despite being spread over thousands of miles, where normal network latency would cripple response times. YottaYotta, an $80M startup based in Edmonton, Alberta, has been working on the technology since its founding in early 2000.

With Google’s gdrive initiative, as well as similar expected services from other major players, EMC is facing the threat of losing many petabytes of data now hosted on expensive EMC gear to web-based free or low-cost storage. As broadband adoption rates and quality improve, users will have less incentive to leave data in disaster-vulnerable single locations.

By promoting a network or grid based architecture to their lucrative corporate customers, EMC is hoping to stave off the fate of so many other big iron companies: being rendered irrelevant by high-volume, low-cost alternatives. This happened to minicomputers, 9″ and 5.25″ disks, mainframes, proprietary OS’s such as VMS, and countless other computer products. Storage hardware is one of the most profitable hardware businesses in the industry, helping keep otherwise money-losing computer hardware vendors afloat.

EMC, which is almost entirely a storage hardware and software vendor, except for their recent acquisition of VMware, has been moving downmarket with lower cost Clariion products marketed by Dell and Intel. However, the grid computing paradigm threatens to give big corporate customers a new and lower cost way to deliver storage services (such as IBM’s fledgling download grid) just as the web storage services like gdrive encourage downmarket customers to cut down on their storage consumption.

EMC’s biggest problem may be that grid architectures require greater integration between servers, networks and storage, integration that EMC can’t easily deliver since it has no server business. Maybe they’ll buy staggering Sun to get that piece.

What I don’t know, and would like to, is whether EMC has an exclusive with YY. Distributed caching is really hard, and as other vendors realize they need it perhaps they will also go trotting off to scenic Edmonton (home of the free world’s biggest shopping mall) and lay their money down.

BTW, EMC spent $24,524M in 2005 and $20,297M in 2004 on strategic investments in private companies, which is the pot the YY money would have come from. No idea on how many other strategics they shared the money with.

I asked YY for comment, but no one was answering their phones. I hope to have more later.

Note: I was once employed by YottaYotta and still hold shares in the company.

Update:
A senior EMC technology manager would only confirm this morning that “We do have a grid project underway at an enduser site with YottaYotta, to leverage their caching know-how.” So it may be that no money has changed hands between EMC and YY. Investment contingent on successful trial? Or an NRE investment by EMC? In any case, congratulations to the engineering team at YY whose 6 long years of work on a very complex problem is finally bearing commercial fruit.

Gee, “Users Cite ILM Shortfalls” - Maybe ILM IS Bunk

April 19th, 2006 by Robin Harris in Enterprise, SAN, FC

Byte and Switch has posted an article on ILM (Information Lifecycle Management) where users note that it is very difficult to classify data and to move large quantities from more expensive to less expensive storage. These problems are consequences of the industry’s flawed approach to helping customers manage ever larger data sets.

As I’ve noted before ILM is bunk because it is built on two broken ideas:

  • That IT organizations actually own the data instead of being custodians of the data for users.
  • That HSM (Hierarchical Storage Management) systems make more sense now than in all the years past that people weren’t buying them.

The data ownership issue is critical to the classification problem. Users know what is important and IT doesn’t, so until IT can get users to classify data themselves, classification will only work for certain large, well-defined data sets.

Yet even if the classification problem is solved, there is still the problem that ILM is just HSM with a new coat of paint. And HSM has never taken off for a very good reason: given the choice between investing big bucks in HSM and just buying some more disks, most people found it made more sense to just buy the disks. They felt guilty about sweeping the problem under the rug, sure, but it got them through the year without having to buy and learn another complicated storage product.

So why all the hype about ILM? There could be a lot of reasons, but how about: some big vendors see it as a way to make a lot of money off customers?

Here’s my logic: Let’s say you sell the most expensive storage in the industry, you like what that does for your margins, and you don’t want to lower your prices too fast. Customers are moaning about TCO (total cost of ownership) and you want to become a software company because the margins are even better. So you task your marketing people with coming up with an excuse to sell more hardware and software to customers under the guise of caring deeply about helping them cut costs. Is this ringing any bells, EMC? ILM is the marketing gloss, and EMC has put its entire corporate sales and marketing energy behind it.

Too bad ILM is still broken.

This is where a bomb throwing (note to all NSA & CIA monitors: I mean that figuratively) company like the young Sun could kick some serious butt. As Jim Gray noted a couple of years ago, there is now no reason that all company data cannot be on line. In fact, there is no reason that all company data cannot be placed on cheap SATA drives and mirrored, so all the scary stuff about SATA reliability is put to rest. And then, of course, you don’t need big expensive controllers that do all kinds of fancy RAID magic. You may want a couple of large SSDs on your SAN for database acceleration, but with the continued cooling of corporate data, you probably don’t need much more. CDP? Ok. Remote vaulting? Sure. Disk to disk backup? Absolutely.

Since Sun won’t do it, maybe some other hungry company will. For SMB’s the lesson is simple: keep your storage structure simple and low-cost. Stay away from Fibre Channel SANs. And when that nice storage salesman drops by, hold onto your wallet.

ILM is Bunk

October 12th, 2004 by Robin Harris in Enterprise, Future Tech, SAN, FC

The forces of flaky marketing have been pushing mightily on the concept of Information Lifecycle Management. With much heavy lifting the marketing mavens have gotten ILM fairly high on the Hype Cycle. I keep wondering when someone will wake up to the simple fact that ILM is stupid.

There are several ways to skin this particular cat. At one level, which many observers have pointed out, ILM is simply HSM with a shiny new coat of paint. HSM has never really taken off, despite decades of devoted work, for one simple reason: when faced with the choice between implementing a $100,000 (or more) HSM system that will pay off sometime next year unless disk prices plunge, or just buying $100,000 (or less) of new disks and solving today’s problem today, the huge majority buy the disks. They feel a little guilty about it because it isn’t architecturally elegant and merely delays solving the problem, but sometimes band-aids are all you need. After all, you’ll replace your entire storage infrastructure with marginally faster, much cheaper, much higher density and higher reliability gear in three years anyway, so why rush?

ILM is bunk for another reason, which goes to the heart of the utility computing paradigm as well. ILM assumes something that isn’t true: that IT owns the data. When you get into the details of ILM implementation there is some reference to working with the actual owners of the data to get them to agree to and abide by ILM metrics. But the simple fact is that few end-users give a damn about IT’s storage problems. There are some offerings out there that look at file systems and migrate the least-recently-used files off to cheaper storage, so end-users don’t need to be involved, but smart (i.e. risk-averse) IT guys have to balance the money they might save against the grief they are sure to get if that cheaper storage burps just when some high-ranking end-user wants to see that data. Just how this ties into utility computing I’ll leave for a later note, but I will say that there needs to be a simple, understandable business process that ties end-user behavior to IT infrastructure costs.

Yet another perspective is one inspired by an interview that Dave Patterson, an inventor of RAID among other smart things, had last year with Jim Gray, another all-around really smart guy. In it, among many other interesting things, Jim said “The two things that are going to be real shifts in storage are tertiary storage going online so there is no distinction; and intelligent storage, so that we raise the level above SCSI.” Essentially what Jim is saying is that because disks are now as cheap as tape, the role of tape libraries will be absorbed by disks. So the storage taxonomy will look like this: fast disks for databases; slow cheap disks for everything else. What does ILM mean in a two tier, application driven infrastructure?

What it means is: time to get a new marketing bandwagon rolling, ’cause the wheels on this one are about to fall off!


