Hike blogging: July 4th, 2014

by Robin Harris on Saturday, 5 July, 2014

A new hike on July 4th. Hiked around Twin Buttes, about 7-8 miles. Heavy cloud cover heralding the start of the summer monsoon season. Finally got some sun breaking through the clouds and took this picture looking south.

[Image: Broken_Arrow_07-04-2014-3947]

Courteous comments welcome, of course. Readers, what say you on hike blogging? Like, don’t like, don’t care?


Data services more important than latency? Not!

by Robin Harris on Thursday, 3 July, 2014

Yesterday’s post on IOPS vs latency provoked some controversy on Twitter. Kappy, CTO of a midwestern IT consultancy, asserted:

@storagemojo Most AFA users don’t even care about latency. Sure there are latency sensitive apps, but data services are more important.

When asked what services, Kappy said:

@lleung @storagemojo standard issue stuff. Snaps, replication, deep VMware integration, rich API access, etc. See: http://t.co/7MIaaZ4eIP

I replied:

@Kappy Data services more important than performance? Really? What about the massive savings from lower latency?

Kappy replied:

@StorageMojo can you give more specifics as to what you see as “massive savings”?

[Image: the storage pyramid]

A surprising question. As I said in the StorageMojo post, though not in the tweet:

Lower latency means fewer inflight I/Os, less server I/O overhead and more server capacity. The systems can handle more work. With fewer servers, software licenses, network ports. Less power, cooling, floor space, maintenance.

Those are the massive savings. If flash weren’t making something better, why would it have remade the storage industry in the last 10 years?

The StorageMojo take
Availability and performance are two sides of the same coin: low performance = low availability. That’s why tape’s high latency is slowly pushing it out of a market it once owned.

Most of the data services Kappy mentioned are designed to ensure availability, because that’s what customers need. And the problem most of them help manage is simple: data corruption and loss.

There’s a reason we have a storage pyramid: if the cheapest storage were also the fastest, that’s all we’d use. Flash has inserted itself between DRAM and disk arrays because it makes our systems perform better for a reasonable cost.

Data services are a Good Thing. But ask a customer if she’d rather have availability and performance or data services and you’ll find out quickly enough what customers care about.

Twitter isn’t the best place for a thorough airing of technical issues. But it can be educational nonetheless.

Courteous comments welcome, of course.


IOPS is not the key number

by Robin Harris on Wednesday, 2 July, 2014

Americans love round numbers. 600HP. 200MPH. $1,000,000,000. 15% capital gains tax rate. 10Gb/s. 6TB drive. 1,000,000 IOPS. $2/GB.

Those are brawny, manly numbers that red-blooded Americans can relate to. Not fussy little decimals, sliding further into irrelevancy with each succeeding digit.

We’re even less fond of ratios. A 600HP car: what’s the power-to-weight ratio?

Yet, if pushed, we’ll go with ratios, like a PUE of 1.1, but we won’t get excited about them.

But our love of round numbers can lead us astray. Why is 1,000,000 IOPS the go-to number for all-flash arrays?

Few applications require anywhere near 1 million IOPS. Quick, how much bandwidth do 1 million IOPS require? Who cares about IOPS when virtually every AFA has more than enough?
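To put a number on the bandwidth question: assuming a 4KB I/O size (my assumption; a common benchmark block size), the arithmetic is simple:

```python
# Back-of-the-envelope: bandwidth needed to sustain 1 million IOPS.
# The 4KB I/O size is an assumption -- a common benchmark block size.
iops = 1_000_000
io_size_bytes = 4 * 1024

bandwidth_gb_per_sec = iops * io_size_bytes / 1e9
print(f"{bandwidth_gb_per_sec:.1f} GB/s")  # ~4.1 GB/s
```

That’s about 4GB/s of sustained bandwidth for a workload few applications will ever generate.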

Fair point. You might argue that demand spikes of 5 or 10x normal could require that magical number. But that’s an argument for over-provisioning and, thanks to AWS, CFOs grow less interested every day in funding over-provisioned systems.

But there’s one thing storage systems never get enough of: less.

As in less latency. Lower latency means fewer inflight I/Os, less server I/O overhead and more server capacity. The systems can handle more work. With fewer servers, software licenses, network ports. Less power, cooling, floor space, maintenance.
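Little’s Law makes the in-flight point concrete: outstanding I/Os equal throughput times latency. A minimal sketch, with illustrative numbers rather than figures from any particular array:

```python
# Little's Law: in-flight I/Os = throughput (IOPS) x latency (seconds).
# The workload and latency numbers are illustrative, not measured.
def inflight_ios(iops: float, latency_s: float) -> float:
    return iops * latency_s

workload_iops = 100_000
print(inflight_ios(workload_iops, 0.005))   # 5 ms latency -> 500.0 in flight
print(inflight_ios(workload_iops, 0.0005))  # 0.5 ms latency -> 50.0 in flight
```

Cut latency 10x and the servers track a tenth as many outstanding I/Os for the same work.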

All we need is a round number to describe latency. Good luck though, because average latency is a trap. Average latency can be low, but if there are long tails – where latency goes from microseconds to many seconds – that’s a problem.
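To see how the trap springs, here’s a toy example with made-up numbers: a 1% population of stalled I/Os barely moves the average while dominating the 99th percentile:

```python
import statistics

# Made-up sample: 99% of I/Os complete in 0.5 ms, 1% stall for 2 seconds.
latencies_ms = [0.5] * 990 + [2000.0] * 10

mean = statistics.mean(latencies_ms)
p99 = sorted(latencies_ms)[int(len(latencies_ms) * 0.99)]
print(f"mean = {mean:.1f} ms, p99 = {p99:.0f} ms")
# mean = 20.5 ms, p99 = 2000 ms: the average hides the tail
```

The average looks two orders of magnitude better than what the unluckiest I/Os actually see.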

Maybe a diagram would work. Like the eye diagrams used in signal engineering, which measure the effect of channel noise and intersymbol interference on channel performance.
[Image: eye diagram, courtesy Hardwareonkel]

The StorageMojo take
Bottom line: it would help the AFA market to get away from the pointless IOPS number. It made more sense with disk arrays, since disks were the ultimate limiting factor.

More disks, more IOPS. More cache, lower (average) latency.

But flash arrays are basically all cache, all the time. Yes, there are caches, but they’re there more to improve endurance than performance. Nothing to be gained by reminding customers of endurance issues.

StorageMojo readers: how best to describe latency? Get as close to one round number as you can!

Courteous comments welcome, of course. Or if IOPS or latency aren’t critical, what is?


Hike blogging

by Robin Harris on Wednesday, 2 July, 2014

This morning, looking west from the crest of Cibola trail, near the start of a 6-mile hike. Click on it for a larger version.

[Image: Brin_loop_7-2-14-3797]

The StorageMojo take
Psychic income isn’t taxable. Earn all you can!

Courteous comments welcome, of course.


Crossbar shows ultra-dense RRAM architecture

by Robin Harris on Monday, 30 June, 2014

Crossbar, the resistance RAM (RRAM) startup, opened the kimono a little wider today with the announcement of their “1TnR” architecture, which they have implemented on pre-production test chips.

Unpacking 1TnR: 1 Transistor drives n RRAM cells. How many? Crossbar reports that a single transistor can drive over 2,000 memory cells at very low power and very high density.

How dense? They say 1TB on a single die, with 3D stacking of memory cells. Point: this is on-chip deposition of multiple layers of memory cells, not the mechanical 3D of NAND flash that requires VIAs and precise positioning of multiple dies.
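A back-of-the-envelope on what 1TnR buys, assuming one bit per cell and the reported n of 2,000 (the die capacity is Crossbar’s figure; the arithmetic is mine):

```python
# Select-transistor count for a 1TB die: conventional 1T1R vs 1TnR.
# Assumes one bit per cell; n = 2,000 cells per transistor, per Crossbar.
cells = 8 * 10**12        # 1TB = 8 terabits of single-bit cells
n = 2000

print(f"1T1R: {cells:.1e} select transistors")       # 8.0e+12
print(f"1TnR: {cells // n:.1e} select transistors")  # 4.0e+09
```

Three orders of magnitude fewer select transistors is what makes the density and power claims plausible.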

Pictures
The Crossbar memory cell uses a metallic nano-filament in a non-conductive layer that can be built on current CMOS fabs. Here’s a Crossbar diagram showing the cell’s various states:

[Image: Crossbar diagram of memory cell states]

TE = Top Electrode
SM = Switching Medium
BE = Bottom Electrode

As StorageMojo related last week, Crossbar co-founder Wei Lu, a University of Michigan professor, made a breakthrough discovery about how their RRAM works: it moves metal particles through a solid. That’s new.

The StorageMojo take
Crossbar is in the process of licensing its technology to some major fabs. The simplicity and scalability of its process means it can be built on fully depreciated CMOS lines using old technology. Large feature sizes aren’t a big problem when you can stack multiple cell layers on a single die.

Commercial shipments are planned for 2017 assuming all goes well. But if the cost and density predictions pan out, Crossbar’s RRAM will be a game changer for SSDs, NVDIMMs and, eventually, enterprise storage. Keep an eye on them.

Courteous comments welcome, of course. What other info would you like to see from Crossbar?


Crossbar founder finds metals move – in a solid

by Robin Harris on Friday, 27 June, 2014

NVRAM maker Crossbar’s co-founder, University of Michigan professor Wei Lu, has published a paper describing a never-before-seen phenomenon: metal nanoparticles moving in a solid. Crossbar is pushing RRAM – Resistance RAM – but there is a problem with most RRAM implementations: no one knows how they work.

RRAMs have been made to work and seen to work, and they have great properties that promise to supersede NAND flash in enterprise applications. But not knowing their mechanics in detail is a problem: for example, it’s hard to optimize the production process if you don’t know what the underlying physics are.

According to the press release (the paper is behind a paywall):

Lu, who led the project, and colleagues at U-M and the Electronic Research Centre Jülich in Germany used transmission electron microscopes to watch and record what happens to the atoms in the metal layer of their memristor when they exposed it to an electric field. The metal layer was encased in the dielectric material silicon dioxide, which is commonly used in the semiconductor industry to help route electricity. They observed the metal atoms becoming charged ions, clustering with up to thousands of others into metal nanoparticles, and then migrating and forming a bridge between the electrodes at the opposite ends of the dielectric material.

They demonstrated this process with several metals, including silver and platinum. And depending on the materials involved and the electric current, the bridge formed in different ways.

The bridge, also called a conducting filament, stays put after the electrical power is turned off in the device. So when researchers turn the power back on, the bridge is there as a smooth pathway for current to travel along. Further, the electric field can be used to change the shape and size of the filament, or break the filament altogether, which in turn regulates the resistance of the device, or how easy current can flow through it.

The StorageMojo take
With the many compromises required to use NAND flash for enterprise storage, its declining durability as feature sizes shrink, and the relatively small cost of media in high-performance storage, it seems likely NAND flash will not be the medium of choice in 10 years’ time. Consumer and mobile apps will continue to use it, but mass storage reliability and economics favor a more robust medium.

But RRAM has a steep hill to climb. Billions have been spent on flash factories and hundreds of millions on making it robust enough for high-performance use.

That said, Crossbar is taking the right approach to RRAM: their process is compatible with today’s CMOS foundries; they can produce 3D chips more simply than 3D NAND; and, as this paper demonstrates, they probably know more about this technology than anyone else. That’s a good start.

Courteous comments welcome, of course. Other RRAM teams are welcome to comment.


Competing with the cloud: Achieving high efficiency

by Robin Harris on Monday, 23 June, 2014

A post in the occasional Competing with the Cloud series intended for enterprise IT.

In the last post StorageMojo discussed HP’s POD systems, which have a PUE (Power Usage Effectiveness) as low as 1.1, competitive with Google and Amazon.

But what about your existing data centers? Can you reduce their PUE?

Yes, you can.

StorageMojo spoke to Chris Yetman, SVP of Operations for Vantage Data Centers, whose Santa Clara data center is LEED platinum certified, about how to achieve an ultra-low PUE.

Why PUE is important
If your IT group is being told to do more with less, join the crowd. Improving PUE is an effective way to do just that. Why?

If your PUE is currently around 2, you’re competing with Google, which is at 1.1 today. To put that in perspective, for every megawatt in, Google puts about 909KW to work, while you’ll get only 500KW of work done.

Get your PUE down to 1.2 though and you’ll have 833KW to do work with, for only the cost of the improvements. That’s doing more with less.
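The arithmetic behind those numbers is just the PUE definition turned around; a trivial sketch:

```python
# IT power delivered per megawatt of facility power, for a given PUE.
# PUE = total facility power / IT equipment power.
def it_power_kw(facility_kw: float, pue: float) -> float:
    return facility_kw / pue

for pue in (2.0, 1.2, 1.1):
    print(f"PUE {pue}: {it_power_kw(1000, pue):.0f}KW of IT load per MW")
# PUE 2.0: 500KW; PUE 1.2: 833KW; PUE 1.1: 909KW
```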

Key tips
Mr. Yetman has decades of experience in hosting, and Vantage is serious about power efficiency, as their LEED platinum cert attests. He is well-versed in the literature and practice of PUE.

Here are his top tips:

Be brave. Many enterprises have a narrow and costly view of proper data center conditions: 65-80F temps; 42-60% humidity; and a dew point up to 58F. But Amazon, Google and others have proven that temps from 59-90F, humidity from 20-80% and a dew point up to 63F – all ASHRAE allowed – are very workable and much more efficient.

Even if you suffer more failures, they will cost you much less than maintaining lower temperatures. Which brings up the next tip.

Embrace failure. Instead of trying to build a bulletproof infrastructure – a costly and self-defeating effort – challenge IT ops to configure robust systems that survive the inevitable failures. Software people like to boast that software eats hardware. Make them prove it.

Challenge vendors to write better software to handle hardware failures non-disruptively. Then test it.

High voltage power distribution. Every transformer wastes power, so use fewer of them. Deliver 480V to racks, convert once to 12VDC, and be done. Higher voltages mean lower currents, and so lower resistive losses (see the sketch below); the entire system also uses less copper, an expensive metal.

Also, stop buying those big diesel generator sets and UPSs. Put 12V batteries in racks instead: much cheaper and simpler to maintain.
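To see why 480V distribution wins, here’s a rough resistive-loss sketch; the rack load and wiring resistance are illustrative assumptions, not Vantage’s figures:

```python
# I^2 * R loss for the same rack load at two distribution voltages.
# Load and wiring resistance are illustrative assumptions.
def i2r_loss_w(load_w: float, volts: float, wire_ohms: float) -> float:
    amps = load_w / volts
    return amps ** 2 * wire_ohms

load_w = 10_000      # a 10KW rack
wire_ohms = 0.05     # assumed end-to-end wiring resistance

print(f"208V: {i2r_loss_w(load_w, 208, wire_ohms):.0f}W lost")  # ~116W
print(f"480V: {i2r_loss_w(load_w, 480, wire_ohms):.0f}W lost")  # ~22W
```

Same load, less than a fifth of the wiring loss, before counting the transformer stages eliminated.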

Forget raised floors. They’re expensive and unnecessary.

High efficiency mechanical & electrical equipment. Make efficient trade-offs. On the mechanical side: high-efficiency fan motors with direct drive. Don’t lose efficiency to gears and belts.

Proper containment. Plug the gaps created by cable runs or removed servers in racks. A good hot aisle creates a chimney effect that reduces fan use.

Measurement. Understand transition points, such as filter walls, and measure them. The hot aisle should be at lower pressure than the cold aisle. Measure outlet temps, not inlet temps.

Reduce pressure drops. High pressure variances usually mean wasted energy. Measure air pressure on inlet and outlet sides.

The StorageMojo take
Predictions that IT will go 100% cloud are overblown. Besides the problem of legacy apps, there are real advantages to local production and control.

But IT has to be competitive with cloud vendors. PUE isn’t the biggest issue – management cost is – but showing your CFO that you can compete on PUE and other metrics is key to making the case to keep vital functions in-house.

Staying inefficient because “we’ve always done it that way” ensures a short and unhappy career in today’s competitive environment.

Courteous comments welcome, of course. What other tips do you have?


Competing with the cloud: PUE

by Robin Harris on Thursday, 19 June, 2014

A post in the occasional Competing with the Cloud series intended for enterprise IT.

Ten years ago few datacenter managers considered PUE (Power Usage Effectiveness, the ratio of total facility power to IT equipment power) a competitive advantage. Everyone used the same equipment, at the same temperatures, so there was little difference to exploit.

But with the advent of warehouse-scale computing, PUE suddenly became important to Google and Amazon. Google pushed the adoption of efficient power supplies and analyzed power distribution infrastructure.

Five years ago the average data center PUE was in the realm of 2.5, while Google was achieving 1.2. To put that in perspective, for every megawatt in, Google put 833KW to work, while the average data center put 400KW to work. Since power infrastructure costs as much as 10 years of power, the capex efficiency and opex reduction meant that enterprises had no chance to compete on power.

Until now.

At the recent HP Discover – where I was a guest of HP – I got a whirlwind tour of their HP 20ce Performance Optimized Datacenter – POD – from Wade Vinson, an HP Distinguished Technologist.

Wade outlined several benefits of the POD concept:

  • Efficiency. Depending on locale and equipment, PUEs under 1.1 are feasible.
  • Code-compliance. The PODs are UL listed.
  • Fast delivery. Available built to your specifications in 8-12 weeks, fully tested, software loaded, ready to plug in and go.
  • Dense. Up to 450KW in the 40′ POD; 290KW in the 20′ POD.
  • Flexible. Put PODs in a warehouse or outside in their weatherproof containers.

The PODs are available in 20 and 40 foot lengths and offer PUEs as low as 1.1 – competitive with best-in-class webscale data centers. They have a host of features to make operating them as practical as buying them. For example, on the 20ce POD, the entire rack slides forward to enable space-efficient access to the rear of equipment.

[Image: HP PODs]

The StorageMojo take
IT vendors are facing their greatest challenge ever: the enormous scale and flexibility of cloud infrastructure. If they want to be in business in 10 years, they need to make enterprise scale infrastructure competitive with the cloud.

The PUE of HP’s PODs – competitive with Google and Amazon – and their quick delivery, make them a viable alternative to building greenfield datacenters. If you can predict your application load 3 months ahead, they are also an alternative to IaaS vendors.

PODs aren’t the only option, but they are solid evidence that major IT vendors can help enterprise IT compete with cloud. If you thought it was game over for small-scale datacenters, think again.

Courteous comments welcome, of course. More coming soon on another option to cloud IaaS.
