Architecting & integrating flash into enterprise storage

by Robin Harris on Thursday, 16 May, 2013

Have you ever noticed that it is difficult to get good information about how flash works? The vendors know but they’ve never been terribly forthcoming.

For example, how does flash wear out? When most things break you lose their contents. But once flash stops working your data is still there. Huh?

And the fact that flash is a wearing medium spooks many people. How should we think about flash? Can we live with a wearing medium?

Or write amplification? How does that work? What can be done to reduce it?

That’s why it was a pleasure to sit down with Rob Ober of LSI. Rob is an LSI Fellow and system architect with deep technical knowledge of flash and how it interacts with systems and applications.

Rob holds dozens of patents and is articulate and open. Plus he’s a very nice guy.

I distilled down what I learned and some of Rob’s key points into a StorageMojo video white paper that LSI commissioned. If you are curious about flash, how it works, how it fails and how it can be turned into an enterprise class storage medium, you’ll find the video informative.

At least I did my level best to make it so, including video from Wilson Canyon, one of my favorite local hikes. Here’s the video:

The StorageMojo take
As a thought experiment I sometimes wonder about how storage would be different if IBM had invented flash back in 1956 instead of the RAMAC disk drive. What if reads were fast and free while writes were expensive?

That’s essentially the problem we’re trying to solve today. Except today we have an installed base of a couple billion disk drives and decades of driver, OS and application development all predicated on disk performance.

We’re still in the early days of flash integration, even though forward-leaning architects have been working on it for 6 years or more. Thanks to flash – and cloud – storage has never been more vibrant or exciting.

Courteous comments welcome, of course. Feel free to ask about anything in the video that wasn’t clear or didn’t go deep enough. Your questions help me understand what you find valuable.

{ 0 comments }

Cloud money: flip a Bitcoin

by Robin Harris on Saturday, 11 May, 2013

Digital coinage can’t do everything a physical coin can do, but that’s not stopping people from signing up – or going to conferences. There’s one in Silicon Valley next week and the elite StorageMojo analyst crew will be there in force.

Digital currency as a store of value?
Today Bitcoin and other digital currencies look more like stocks than bonds because of their volatility. But with Amazon and eBay looking at accepting them, they could become more like money. If that seems unlikely, recall that much of what you use now as “money” is simply electronic transfers: credit cards; debit cards; PayPal.

These non-national currencies have much in common with how the US currency system used to work. Each local bank issued its own currency supported – in theory – by the deposits of customers.

To redeem the currency you’d go to the bank and exchange the notes for coin. Since 19th century travel was often difficult, the bank notes would depreciate with distance from the issuing bank.

With the frequent business crashes of those years, people were alert to the possibility that the issuing bank could fail, leaving the notes worthless. Thus if business tanked people would “run” to the bank to exchange their notes for specie. Since banks borrow short term and loan long term, they would often run out of coinage and close.

That’s why the US has a national currency and a Federal Reserve Bank, and insures bank deposits (FDIC). But digital currencies have none of these protections.

The StorageMojo take
A recurring strand of American thought is that we should go back on the gold standard rather than letting the dollar “float” against other currencies. After all, advocates contend, without gold the dollar isn’t backed by anything at all.

And yet the dollar remains the worldwide currency of choice, not only for B2B but as a store of value and convertibility as hard cash. Most US currency circulates outside the US – us locals would rather use credit cards.

Since the US dollar isn’t backed by gold, and since the Fed can loan as much money as it wants to banks that can use it as reserves against loans – if only they were making loans! – why exactly do we ascribe value to the dollar? Global acceptance and ready convertibility are two major reasons.

Which is where the value proposition for digital currencies makes the most sense. So can a digital currency replace – or at least supplement – national currencies? Yes.

It isn’t that different from what we used not so long ago – or what we use today. Digital currency is the new frontier in more ways than one.

Courteous comments welcome, of course. Any StorageMojo readers going? Look me up!

{ 1 comment }

EMC and the 7 dwarves – pt 2

by Robin Harris on Friday, 3 May, 2013

Note: This post got so long it needed to be posted in 2 parts. Part 1 is here. And while I promised this 2nd part “tomorrow” the editing took much longer than expected. End note.

HP has made the most dramatic bet with their 3PAR-based converged storage line. While the rapid growth of the new products is overwhelmed by the even quicker decline of older products like EVA, they’re off to a good start, claiming over 1200 new customers.

The challenge for former EMC’er Dave Donatelli and 3PAR’s David Scott isn’t technology but sales. Do you have field storage sales specialists and, if so, how do you compensate everyone so account managers are comfortable bringing them into large deals? Or do you flog marketing to make selling converged storage so simple and remunerative that server guys do it themselves?

Ideally both, but the HP go-to marketing strategy is mind-numbing detail and then tossing it over the wall for sales to figure out. That’s not a winning strategy these days.

Bottom line: HP’s lead in physical servers makes them a threat for many add-on storage sales. But EMC’s storage-focused sales force keeps winning deals.

IBM’s storage business is at risk. While they have great technology, none of their hardware products are even a strong #2. Two thirds of IBM’s Systems and Technology group business is servers, and it is clear that IBM management isn’t happy with how some product lines are performing.

IBM top management is not sentimental. Given the disposal of once-core assets, such as disk drives, and IBM’s shrinking storage market share, it is clear that IBM will likely sell off or shutter some hardware product lines to focus on higher-margin storage. The Texas Memory Systems solid state arrays, storage software and mainframe storage are likely safe.

But at some point – and that point will come sooner rather than later due to the cloud’s competitive pressure – IBM’s management will decide that the overhead and investment required to support 3rd place and worse products isn’t worth it. That’s likely behind the **BILLION DOLLAR ALL FLASH** data center initiative: a desperate attempt to stay relevant in a rapidly changing storage world.

Expect to see less-than-competitive hardware lines put on life-support with cheap-to-implement HW upgrades – new, bigger disks! now with SSD added! – and new OS quals, until most of the base has migrated or the profits are gone. It is a sad comedown for the company that created and dominated the storage business for decades and then lost it to an upstart named EMC.

Who is the final dwarf?
Well, that’s 6 companies. Who’s the final dwarf? Fujitsu? NEC? Cisco? Huawei? Yes, Huawei is in the storage business.

I like Fujitsu and their Eternus system for the final spot, although NEC’s HYDRAstor has some great scale-out technology. But is NEC getting traction? If they are it’s a well-kept secret.

Eternus is probably doing best in Japan, a large enough market to keep them going for years. But don’t expect any game-changers.

The StorageMojo take
The most important impact of cloud infrastructure is that it gives CFOs a yardstick to measure their internal IT against. And they aren’t happy with what they’re seeing.

For decades IT has been the wayward child in the corporate family. Cost overruns, service delays, unfathomable mumbo-jumbo, security lapses – IT is a never ending string of expensive problems – as the short life of CIOs attests.

But the need for IT infrastructure is only growing. It’s the means that are changing. That’s where the aggressive, new architecture storage companies like Fusion-io/NexGen, Tegile, Nimble, Avere, Kaminario, DDN, Tintri, Nutanix and now Exablox come in.

Since the advent of RAID systems in the early 90s we haven’t seen much turnover in the major players. That’s about to change.

While it will be wrenching for those involved, it will be good for the industry and for the myriad organizations that need reliable, cost-effective storage. People will buy large arrays for decades to come – just as they still buy mainframes – but they’ll have many more options for non-transactional storage.

Courteous comments welcome, of course. Texas Memory Systems was a long-time advertiser on StorageMojo until recently. HP flew me to their corporate trade show in Europe last year to get the converged storage story direct from David Scott and others. I’ve done work for some of the other firms mentioned.

{ 0 comments }

EMC and the 7 dwarves – part 1

by Robin Harris on Thursday, 25 April, 2013

EMC has been gaining market share over the last several years. The world’s largest data storage company is getting larger.

Why?

IBM and the 7 dwarves
Back when mainframes ruled the earth, IBM faced a hardy band of competitors that had their own processor architectures and operating systems. Originally known as the 7 dwarves – Burroughs, UNIVAC, Control Data, NCR, GE, RCA and Honeywell – these companies rode the computing boom with varying success until the early 70s. Then the mainframe business matured and started consolidating. By the late 80s the 7 dwarves were taking significant share from IBM thanks to its bloated prices and conservative hardware design.

After Lou Gerstner took over he lowered IBM’s pricing and reinvigorated its engineering to make life difficult for the dwarves. IBM now has a nice multi-billion dollar annuity mainframe business.

The same, only different
EMC doesn’t control the OLTP large storage array business the way IBM drove plug compatible mainframes. But the pressure on competitors will be no less intense.

EMC’s position is analogous to IBM’s in the 70s: EMC has the most successful scale-up OLTP arrays; offers better support; and keeps adding useful features.

Because of its size and growing share, EMC’s Symmetrix VMAX business will out-invest their competitors, increasing their functional lead in features and performance. As once-reasonable competitors like HP’s EVA fall by the wayside there will be fewer reasons to choose anything else for high-capacity OLTP.

The cloud onslaught
With the coming tidal wave of cloud-based storage options it is clear that the industry cannot support all the big iron array companies we currently have. There are several implications to EMC’s dominance in the traditional storage business.

  • Large storage arrays for OLTP are no longer a major pain point for customers. They have bigger problems now with massive amounts of file data, streaming data and the scale of public and private cloud storage.
  • Another is that customers are no longer looking solely to big iron arrays for high performance. DRAM and flash arrays are taking over the nose-bleed end of the storage performance envelope, leaving less latency-critical applications for traditional storage.
  • As competition decreases, expect EMC to treat its flagship arrays as cash cows as it invests in newer technologies and companies.

What changed?
Expect to see several of the dwarves leave the big iron storage array business. Let’s look at each of the competitors in turn.

Oracle/Sun. Sun’s storage business is the obvious weak sister among major vendors, as it has been for over 20 years. Oracle is having some success with its database optimized storage offerings, where its focus is on IBM.

They’ve got a tin-wrapped software strategy. They aren’t seeking to challenge EMC and will remain a niche player closely aligned with Oracle’s database business.

Hitachi Data Systems saw the writing on the wall several years ago with their acquisition of Archivas. They’ve been busy turning it into a credible cloud storage alternative. With their global distribution, quality reputation and OEM relationships, they have a better than even chance of making the transition to the brave new world of storage.

Dell is not in the big iron storage array business today but they’ve been working to build a significant business. Unfortunately their operations focused culture – and years of dependence on EMC – leaves them poorly prepared to enter the mainstream enterprise storage business.

Dell is leveraging their low-cost supply chain to build a scale-out storage business. They’ll succeed with cloud service providers, but they’re unlikely to win in the enterprise. Providing reliable and low-cost hardware only gets you so far in the enterprise: support and a knowledgeable sales force mean even more.

NetApp has done a good job putting financial and marketing daylight between themselves and the other dwarves. But their one-size-fits-all strategy is bumping up against the reality that it doesn’t.

Buying Bycast was a smart move, but like the Spinnaker acquisition they’ve been slow to capitalize on the little-known scale-out market leader. They’ll hang in there, but unless they adopt a more flexible strategy and product mix, their days of heady growth are behind them.

They need to reinvigorate the company with a major cultural shift that enables them to market and sell multiple product lines, something some longtime senior execs – and a too-comfortable sales force – are dead set against. They don’t need to go as far afield as EMC has, but with their global sales and support they are well positioned to take a leaf from EMC’s technology publishing model.

Tomorrow: HP, IBM, the 7th dwarf and the StorageMojo take.

Update: As alert reader John Verity noted in the comments, my memory unit conflated the 7 dwarves with the PCM vendors – most famously Amdahl – in the first version of this post. Luckily correcting this makes the argument stronger. In the interest of transparency I struck out the wrong parts, but if it makes it unreadable I may just pull it. If I do, I’ll update this note. Sorry!

{ 5 comments }

Is F5’s ARX file virtualization a success?

by Robin Harris on Monday, 22 April, 2013

In response to the post on Avere’s architecture for fronting backend NAS filers – where StorageMojo said that no front-end to NAS boxes has succeeded – alert reader Jacob Marley asked “What about F5’s ARX to stitch/balance storage across multiple filers?”


Good question! What can we deduce from publicly available sources?

The F5 ARX product line is billed as an “intelligent file virtualization solution” that “. . .preserves the logical access to files regardless of their current location on storage.” Like earlier file switches:

The ARX device does not introduce a new file system; rather it acts as a proxy to the file systems that are already there.

ARX is not a storage device itself but a load-balancer for NAS filers. Then, per Mr Marley’s question, is ARX not a success?
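The proxy idea can be sketched in a few lines. This is a toy illustration only, not F5's implementation: the class, the filer names and the paths are all hypothetical. Clients keep using one logical path while the mapping behind it is repointed to wherever the file now lives.

```python
class FileVirtualizer:
    """Toy namespace proxy: clients see logical paths, filers hold the data."""

    def __init__(self):
        # logical path -> (filer name, physical path); all names are hypothetical
        self._locations = {}

    def add(self, logical_path, filer, physical_path):
        """Register a file that already exists on a backend filer."""
        self._locations[logical_path] = (filer, physical_path)

    def resolve(self, logical_path):
        """Return where a logical path currently lives."""
        return self._locations[logical_path]

    def migrate(self, logical_path, new_filer, new_physical_path):
        """Repoint a logical path after its data has been moved.

        Copying the data between filers is out of scope for this sketch.
        """
        if logical_path not in self._locations:
            raise KeyError(logical_path)
        self._locations[logical_path] = (new_filer, new_physical_path)


vfs = FileVirtualizer()
vfs.add("/projects/report.doc", "filer-old", "/vol1/report.doc")
vfs.migrate("/projects/report.doc", "filer-new", "/vol7/report.doc")
print(vfs.resolve("/projects/report.doc"))
```

The client-visible path never changes, which is the point of the quoted claim about preserving logical access regardless of location.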

Competitive analysis
First up, let’s take a look at the latest quarterly 10-Q report, courtesy of the SEC’s EDGAR database.

In “Management’s Discussion and Analysis of Financial Condition and Results of Operations” they describe their product revenues as

The majority of our revenues are derived from sales of our application delivery networking (ADN) products including our high end VIPRION chassis and related software modules; BIG-IP Local Traffic Manager, BIG-IP Global Traffic Manager, BIG-IP Link Controller, BIG-IP Application Security Manager, BIG-IP Edge Gateway, BIG-IP WAN Optimization module, BIG-IP Access Policy Manager, WebAccelerator; FirePass SSL VPN appliance; Traffix diameter signaling products; and ARX file virtualization products.

Unless this is a last-but-not-least ordering, it looks like management is not leading with the ARX products. But let’s look for more evidence of management’s priorities.

Combing through the F5 newsroom, for instance, we find that the last press release on ARX is almost 2 years old. It is titled “F5’s New ARX Platforms Help Organizations Reap the Benefits of File Virtualization” and, surprisingly, later press releases don’t call out other success stories.

The most recent ARX white papers, “Reducing Storage Costs with F5 ARX” and “Enabling Flexibility with Intelligent File Virtualization” are both dated 2011. The ARX data sheet is from 2013 though.

The StorageMojo take
It’s clear that F5 has backed away from the ARX technology – which they acquired with Acopia in 2007 – in favor of the Application Delivery Controller market. But does that mean that the Acopia/F5 ARX didn’t succeed?

Clearly, ARX succeeded for a while: F5 bought them after all. And the F5 PR archives have several success stories from 2009.

But within the current F5 context – where they have several high-growth segments – ARX is getting little investment. At another company, perhaps, ARX would be a success, but at F5 it clearly is not.

If I were a customer I would certainly look at ARX if I wanted to virtualize disparate NAS filers. But I’d be sure to have some contingency plans in place if F5 decided to end-of-life the product in the next 2 years.

Courteous comments welcome, of course. Fun fact: F5’s name was inspired by one of my favorite movies – and an AFI top 100 selection – Twister, that popularized the Fujita scale (now the Enhanced Fujita scale) for tornado intensity.

{ 7 comments }

Fronting NAS for fun and profit

by Robin Harris on Friday, 19 April, 2013

The traditional model of NAS filers is handy if you only have a few. But once you get to 8 or 10 NAS filers your life gets complicated.

Your oldest data is on the oldest filer and your active data is on the newest. If that new filer bottlenecks your entire system slows down.

Hence the old saw “you’ll love your first filer and hate your tenth.” System administrators will load balance by moving data back and forth, an inherently wasteful and error-prone exercise.

A history of failure
A number of start ups – such as Z-force and Zambeel – have attempted a fix. The general idea is a switch that virtualizes the backend filers to create a single pool.

While the concept sounds good, results have been dismal. No storage system whose primary function was to front-end existing NAS boxes has succeeded.

Once more unto the breach
Now another entrant enters the fray. Avere Systems has raised $50 million and is on v3 of their tin-wrapped software.

At NAB 2013 they announced the FXT 3800 Edge Filer. The 3800 tiers across RAM, SSD, SAS, backend NAS and cloud across one namespace.

They’re understandably proud of their new SPECsfs2008 NFS result with an FXT 3800 32 Node Cluster that reached 1,592,334 Ops/Sec. That beats NetApp, Isilon, Hitachi/BlueArc and everybody else, except Huawei’s OceanStor 8500, which used 24 file systems and more than twice the number of SSDs to get over 3 million Ops/Sec.

Oh, and they included a transcontinental latency in the network. As you might see using a cloud provider like Amazon Web Services, which was showing an Avere proto version in their NAB booth.
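For a rough sense of scale, the headline SPECsfs2008 number can be divided across the cluster. This is back-of-the-envelope arithmetic that assumes the load was spread evenly over the 32 nodes, something the published figure doesn't itself say.

```python
total_ops_per_sec = 1_592_334  # SPECsfs2008 NFS result quoted above
nodes = 32                     # FXT 3800 cluster size quoted above

# Assumption: throughput is spread evenly across the nodes
ops_per_node = total_ops_per_sec / nodes
print(f"{ops_per_node:,.0f} ops/sec per node")
```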

The hard question
After the briefing by Avere I asked Ron Bianchini, the CEO and cofounder, why Avere would escape the fate of their erstwhile predecessors.

I boiled his answer down to 4 points:

  • Avere’s appliance is a read and write cache, so hot data I/O is handled directly and not routed to the backend filers. Typically, he says, only 1 out of 50 I/Os leaves Avere for backend NAS, and for some workloads it is as few as 1 out of 200.
  • Their file system is the client of the backend filers, so they know exactly where the data is at all times. Furthermore, they’ve certified vendors like NetApp, so they handle the inevitable corner cases.
  • The system moves data across 4 tiers – DRAM, SSD, SAS, SATA – and the backend filers, so it is capable of extremely high performance, unlike products that relied upon backend performance.
  • They also manage blocks within files, so a change in a file doesn’t require rewriting the entire file, a popular feature in large file applications.
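To see what those 1-in-50 and 1-in-200 figures mean for the filers behind the cache, here is a minimal sketch. The 100,000 ops/sec client load is a made-up number for illustration and not something from the briefing; only the two ratios come from Bianchini's answer.

```python
def backend_ops(front_end_ops, one_in_n):
    """Ops/sec that reach the backend filers when 1 in `one_in_n` I/Os leaves the cache."""
    return front_end_ops / one_in_n


front_end = 100_000  # hypothetical client load in ops/sec (an assumption)

for one_in_n in (50, 200):
    reaching = backend_ops(front_end, one_in_n)
    share = 100 / one_in_n  # percentage of client I/O that hits the filers
    print(f"1 in {one_in_n}: {reaching:,.0f} ops/sec reach the filers ({share:.1f}% of load)")
```

Under that assumed load the filers see 2,000 ops/sec in the typical case and 500 in the best case, which is why a cache in front can extend the life of older boxes.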

The StorageMojo take
Rip and replace has never been popular. With today’s data volumes it is ever more unwieldy.

Avere’s performance and cost-effectiveness make it more than a simple pooling of NAS capacity: by reducing the load on current filers it extends their economic life while eliminating hot-spots and bottlenecks. You keep what you’ve got and make it faster and easier to manage.

Since most disk-based systems are way over-configured on capacity, this also means reduced CapEx and OpEx as fewer new filers are bought and less floor space, power and maintenance is needed. Given their scale-out architecture – minimum config is 3 nodes – you can add performance without adding more filers.

Bottom line: Avere, using 21st century technology, has built a new way to utilize existing resources while improving performance and reducing costs. That’s something no other NAS front-end ever managed.

They’ll do well.

Courteous comments welcome, of course. Any Avere users want to comment on their experience? I haven’t done any work for Avere, but that could change.

{ 2 comments }

StorageMojo @NAB 2013 next week

by Robin Harris on Tuesday, 2 April, 2013

It’s spring, and a young man’s fancy turns to Las Vegas and the National Association of Broadcasters annual price-is-no-object tech toy fair. Shaking off a long winter’s chills the StorageMojo analyst army is getting ready to ride off into the hills to see the bright lights of the LVCC.

Arriving Wednesday on the show floor and leaving Thursday afternoon. Will you be there? Let’s meet up. Comment below.

The StorageMojo take
NAB is a favorite show because it is further upstream than CES: no consumption without production; and this is where you find out about what can be produced. Plus the toys are more fun, even if many are priced for mega-corps, not small producers.

As a video producer, playing with – and writing about – the storage-intensive tech is both fun and profitable. The changes the industry has gone through in the last 10 years are amazing, and I suspect the fun is just beginning!

Courteous comments welcome, of course.

{ 0 comments }

Build a 3PB storage solution

by Robin Harris on Monday, 1 April, 2013

Choice is a great thing, unless there’s too much of it. And choice is what we have a lot of in today’s data storage market.

A longtime StorageMojo reader has an interesting problem: architect a 3PB data storage facility. Can you help?

Here’s what he wrote to StorageMojo. His email has been slightly edited for clarity and length.

One of my current problems is to design one of the nodes for a large research data storage facility. I’ve had to do this stuff in varying degrees, varying modalities and varying tech in times gone by.

I’ve been given a number and “capacity” to look into – somewhere near or around 3PB to begin with. We won’t even go down the path of discussing workloads or disk technology fit for purpose at this stage, but, something has struck me as interesting.

There is this clear divergence in disk technologies at the moment and I’m finding it hard to resolve what is the “right” one for the task.

Currently, I see:

  • Heavy-end storage virtualisation frames [VSP, Symmetrix et al]
  • Big grid-ish things [IBM XIV etc]
  • Weird “stacked” commodity LSI Silicon [NetApp E5400/5500, SGI IS5500/IS5600, Dell MD3660F etc – all the same silicon I think?!]
  • Quasi virtualisation arrays with modular form factors (Hitachi’s HUS-VM?)
  • High performance dense trays in modular form factors [DDN's SFA-12K Exa and Grid scaler tech?]
  • Bog-standard performance dense trays in modular form factors [Hitachi HUS, EMC VNX, HP EVA, Dell Compellent etc etc]
  • That wild crazy pure flash/RAM/SSD/NAND world that guys like Violin inhabit.

Currently I’m trying to rationalise what I should be using for a storage platform that needs to scale big, but do it from a sensible economic standpoint, with density, performance of interconnect and throughput with gross mixed workloads being all big factors.

Some folks suggest to me that I should be happy enough with the LSI horizontally stacked 60-drive trays, but I am not sure the technology is tracking too well in terms of performance or density (Hitachi, DDN and maybe some others can now do 84 drives in as little as 4-RU!).

I guess my question to you is – where do you see that dense high performance market heading? I know the guys at the LLNL over your way were crowing about the NetApp E5400 LSI stuff where they managed their “1TB/sec” file system (I think it was Lustre based?), but I have to wonder if that could have been more efficiently carried out using a DDN GridScaler/SFA-12K-E etc.

The StorageMojo take
Two issues here: is the segmentation our correspondent offers realistic and helpful? And what are the core architectural issues he needs to think about?

For the first issue an object store or a highly parallel NFS – like Panasas – seems to be indicated.

Given that this is a general purpose high-performance system, the critical problem seems to be how the system – however architected – handles file creation/update/deletion metadata. String enough disks together – 1,000 to 2,000 – and you can get a reasonable # of IOPS and, if you need more, put some SSDs in front.

There are a number of scale-out storage systems that will credibly and economically grow to 3PB. Metadata is often the bottleneck, as Isilon buyers have found when creating many small files.

A maximum performance spec – including file creation etc. rates – will probably help eliminate likely laggards, while a budget $ per usable TB/PB will eliminate the uneconomic products.
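The capacity, IOPS and $-per-usable-TB arithmetic is easy to rough out. The sketch below is a sizing toy, not a design: the 3 TB drive size, the 75% usable fraction, the 75 IOPS per drive and the $3M budget are all assumptions picked for illustration, and the 3PB target is treated as usable capacity.

```python
import math


def size_system(usable_pb, drive_tb, usable_fraction, iops_per_drive):
    """Return (drive count, aggregate raw disk IOPS) for a disk-only build."""
    usable_tb = usable_pb * 1000          # decimal units: 1 PB = 1000 TB
    raw_tb = usable_tb / usable_fraction  # capacity lost to parity, spares, etc.
    drives = math.ceil(raw_tb / drive_tb)
    return drives, drives * iops_per_drive


def cost_per_usable_tb(total_cost, usable_pb):
    """Budget yardstick: dollars per usable TB."""
    return total_cost / (usable_pb * 1000)


# Every input below is an assumption, not a vendor figure
drives, iops = size_system(usable_pb=3, drive_tb=3, usable_fraction=0.75, iops_per_drive=75)
print(f"{drives} drives, roughly {iops:,} raw disk IOPS")
print(f"${cost_per_usable_tb(3_000_000, 3):,.0f} per usable TB")
```

With these inputs the drive count lands inside the 1,000 to 2,000 range mentioned above. Change the assumptions and the picture moves quickly, which is why a written performance spec and a budget per usable TB are worth nailing down before talking to vendors.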

Vendors are welcome to offer their perspectives. Please just identify your company so we know where you’re coming from.

Practitioners who’ve done this, or something similar, are encouraged to share their hard-earned wisdom. 3PB is non-trivial today.

Courteous comments welcome, of course. I’m going to start offering almost-free consulting for end-users. Stay tuned!

{ 12 comments }