<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>StorageMojo &#187; SSD/Flash Disk</title>
	<atom:link href="http://storagemojo.com/category/ssdflash-disk/feed/" rel="self" type="application/rss+xml" />
	<link>http://storagemojo.com</link>
	<description>Data storage info &#38; analysis</description>
	<lastBuildDate>Fri, 20 Jan 2012 06:10:36 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>SSDs and the TPC-C top 10</title>
		<link>http://storagemojo.com/2012/01/19/ssds-and-the-tpc-c-top-10/</link>
		<comments>http://storagemojo.com/2012/01/19/ssds-and-the-tpc-c-top-10/#comments</comments>
		<pubDate>Thu, 19 Jan 2012 23:54:24 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Disk]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2574</guid>
		<description><![CDATA[If SSDs are so great, shouldn&#8217;t we see the results in TPC-C benchmarks? They are, and we do. But there are some surprises. Cost Looking at the TPC-C top 10 performance results showed the dramatic impact SSDs have had on the cost per thousand transactions (tpmC). There are no top-10 disk-only results after 2009. The [...]]]></description>
			<content:encoded><![CDATA[<p>If SSDs are so great, shouldn&#8217;t we see the results in TPC-C benchmarks? They are, and we do.</p>
<p>But there are some surprises.</p>
<p><strong>Cost</strong><br />
A look at the <a href="http://www.tpc.org/tpcc/results/tpcc_perf_results.asp" target="_blank">TPC-C top 10</a> performance results shows the dramatic impact SSDs have had on the cost per thousand transactions per minute (tpmC).</p>
<ul>
<li>There are no top-10 disk-only results after 2009.</li>
<li>The most expensive top-10 SSD result is some 15% cheaper than the least expensive disk-based result &#8211; and the other SSD results cost much less.</li>
<li>No top-10 results were posted during 2009 &#8211; the depth of the Great Recession.</li>
</ul>
<p><a href="http://storagemojo.com/wp-content/uploads//2012/01/Screen-Shot-2012-01-19-at-4.45.25-PM.png"><img src="http://storagemojo.com/wp-content/uploads//2012/01/Screen-Shot-2012-01-19-at-4.45.25-PM.png" alt="" title="" width="449" height="346" class="aligncenter size-full wp-image-2575" /></a></p>
<p><strong>Capacity</strong><br />
The conventional wisdom has it that disks must be way over-configured to get enough IOPS. You&#8217;d expect to see disk solutions have a lot more capacity than SSD solutions in top-10 results.</p>
<p>But we don&#8217;t:</p>
<p><a href="http://storagemojo.com/wp-content/uploads//2012/01/Screen-Shot-2012-01-19-at-4.45.44-PM.png"><img src="http://storagemojo.com/wp-content/uploads//2012/01/Screen-Shot-2012-01-19-at-4.45.44-PM.png" alt="" title="" width="464" height="341" class="aligncenter size-full wp-image-2576" /></a><br />
The highest capacity &#8211; 1760 TB &#8211; is for an Oracle SSD-based solution. Yet the lowest capacity solution &#8211; 83 TB &#8211; is also SSD-based and is also the cheapest per tpmC.</p>
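<p>To make the conventional wisdom concrete, here is a minimal Python sketch of IOPS-driven over-configuration. The per-disk figures (180 IOPS, 0.3 TB) and the 40,000 IOPS, 10 TB workload are illustrative assumptions, not numbers taken from any TPC-C result.</p>

```python
import math

def disks_for_iops(target_iops, iops_per_disk):
    """Spindles needed to hit an IOPS target, ignoring RAID and cache effects."""
    return math.ceil(target_iops / iops_per_disk)

def overprovision_factor(target_iops, needed_tb, iops_per_disk, tb_per_disk):
    """Ratio of capacity bought to satisfy IOPS to capacity actually needed."""
    disks = disks_for_iops(target_iops, iops_per_disk)
    return disks * tb_per_disk / needed_tb

# Hypothetical workload: 40,000 IOPS against 10 TB of data,
# served by assumed 180-IOPS, 0.3 TB disks.
disks = disks_for_iops(40_000, 180)
factor = overprovision_factor(40_000, 10, 180, 0.3)
print(disks, round(factor, 2))
```

<p>Under these assumed numbers the disk configuration carries several times the capacity the data needs, which is the effect one would expect to see in the chart above.</p>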
<p>Are we seeing issues with the rest of the infrastructure?</p>
<p><strong>The StorageMojo take</strong><br />
I&#8217;ll be taking a deeper dive into the data, but perceptions may be at odds with what this limited set of performance-focused benchmarks is showing us.</p>
<p>Readers: what do you think?</p>
<p><strong>Courteous comments welcome, of course.</strong> Events beyond my control have reduced StorageMojo&#8217;s usual posting frequency. Hope to get things back to normal over the next several weeks.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2012/01/19/ssds-and-the-tpc-c-top-10/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Learning from customers</title>
		<link>http://storagemojo.com/2011/12/07/learning-from-customers/</link>
		<comments>http://storagemojo.com/2011/12/07/learning-from-customers/#comments</comments>
		<pubDate>Wed, 07 Dec 2011 20:05:07 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2563</guid>
		<description><![CDATA[EMC&#8217;s Chuck Hollis blogged about The Vendor Beating a couple of months ago. The unspoken question in the post is &#8220;how do we understand what customers are telling us?&#8221; He writes As an employee of a large IT vendor, I&#8217;ve been at the receiving end of a reasonable number of vendor beatings. Occasionally it&#8217;s richly [...]]]></description>
			<content:encoded><![CDATA[<p>EMC&#8217;s Chuck Hollis <a href="http://chucksblog.emc.com/chucks_blog/2011/09/the-vendor-beating.html" target="_blank">blogged</a> about <i>The Vendor Beating</i> a couple of months ago. The unspoken question in the post is &#8220;how do we understand what customers are telling us?&#8221;</p>
<p>He writes</p>
<blockquote><p>
As an employee of a large IT vendor, I&#8217;ve been at the receiving end of a reasonable number of vendor beatings.</p>
<p><i>Occasionally it&#8217;s richly deserved</i>. But, sometimes, it&#8217;s masking a deeper set of issues that have very little to do [with] any vendor whatsoever.
</p></blockquote>
<p>Unhappy customers, like unhappy families, are all unhappy in their own way. This customer appeared to be overstaffed, under-skilled and poorly managed.</p>
<p><strong>Interpretation</strong><br />
Interpreting customer complaints and behavior is hard. When companies can&#8217;t decipher what customers want &#8211; which is usually what the company <i>isn&#8217;t</i> selling &#8211; it is easy and dangerous to tune them out. </p>
<p>Customers can tell you things about your company and products that you can&#8217;t directly discover for yourself, but what customers say may be different from what they think. And both are influenced by the customer&#8217;s context, which can include company politics, prior vendor experiences, knowledge deficits and employee level.</p>
<p><strong>Diagnosis</strong><br />
Steve Jobs once said that customers don&#8217;t know what they want until you show it to them. Customers know what would improve the current product in the current use case, but they can&#8217;t imagine bringing multiple novel technologies to bear on a much broader problem.</p>
<p>Tablet computers flopped for years until the iPad crystallized the market. Everyone saw the tablet problems: thick; heavy; slow; clunky UI; poor battery life; and, thanks to low volumes, cost. Incremental improvements &#8211; faster processors, more RAM, larger disks &#8211; didn&#8217;t help.</p>
<p>Tablets required a deep rethinking and application of several novel technologies &#8211; flash, gestures, CNC case milling, an app store and an energy-efficient OS &#8211; to create a compelling user experience. </p>
<p>The iPad illustrates the problem of listening to customers: they described symptoms and suggested fixes, but couldn&#8217;t articulate the underlying problem: how the use case differs from desktop and notebook PCs. That requires an act of imagination, not transcription.</p>
<p><strong>The StorageMojo take</strong><br />
In Chuck&#8217;s post an EMC presales engineer identified the root cause of the customer&#8217;s pain:</p>
<blockquote><p>
. . . the database environment had grown willy-nilly over the years &#8212; it wasn&#8217;t laid out well, the queries weren&#8217;t particularly well written, and so on.</p>
<p>Sure, there were things we could do on the storage side (e.g. faster storage, better layouts, etc.), but it was a bigger issue than just storage performance.
</p></blockquote>
<p>But the larger question is: with high-speed and high-capacity SSDs, why isn&#8217;t this customer moving to an infrastructure that doesn&#8217;t need this fancy tuning? EMC can&#8217;t manage the fight between DBAs and storage admins, but they could be making it less contentious.</p>
<p>From within the EMC ecosystem the solution is clear: more training, professional services and faster gear. But from the outside the question is: who is building &#8220;it just works&#8221; high performance storage? </p>
<p><strong>Courteous comments welcome, of course.</strong> I admire Tucci&#8217;s innovative EMC business model: outbid everyone else for chasm-crossing companies; give them global distribution and support; and watch the bucks roll in. It may not be innovative <i>technically</i> but it is innovative.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/12/07/learning-from-customers/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Ask StorageMojo: 80,000 mailboxes need help</title>
		<link>http://storagemojo.com/2011/11/02/ask-storagemojo-80000-mailboxes-need-help/</link>
		<comments>http://storagemojo.com/2011/11/02/ask-storagemojo-80000-mailboxes-need-help/#comments</comments>
		<pubDate>Wed, 02 Nov 2011 16:00:28 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2543</guid>
		<description><![CDATA[A StorageMojo reader has a problem. Can you help? Our mail hub (80,000+ mailboxes) is virtualized with vSphere 4.1 with Red Hat Enterprise Linux 5 x64 and Dovecot 2.0 [an open source IMAP/POP3 email server for Linux/UNIX-like systems]. We are using HP LeftHand Networks P4300 iSCSI storage in a &#8220;network RAID10 setup of RAID10 storage&#8221; [...]]]></description>
			<content:encoded><![CDATA[<p>A StorageMojo reader has a problem. Can you help?</p>
<blockquote><p>
Our mail hub (80,000+ mailboxes) is virtualized with vSphere 4.1 with Red Hat Enterprise Linux 5 x64 and <a href="http://dovecot.org/index.html" target="_blank">Dovecot 2.0</a> [an open source IMAP/POP3 email server for Linux/UNIX-like systems]. We are using HP LeftHand Networks P4300 iSCSI storage in a &#8220;network RAID10 setup of RAID10 storage&#8221; for Dovecot indexes and multiple &#8220;networks RAID1 of RAID5 storage&#8221; for actual mailboxes.</p>
<p>This is my take: our Dovecot indexes are getting hammered with lots of small I/O requests, about 8,000 IOPS continuous during 8-hour working days, 75% write. Indexes are fairly small (50 GB) and expected to grow to 100-150 GB, but need a lot of random I/O. We need real-time replication in storage (LeftHand is ok for us) and we think that SSD should shine in this situation. Bandwidth is not a problem (200-300 megabits of index traffic), but we need more IOPS.</p>
<p>The problem is the indexes, but our total mailbox capacity is expected to grow to 6 TB compressed using zlib compression in Dovecot.</p>
<p>We want to buy a storage appliance with the following requirements:</p>
<ul>
<li>vSphere 4.1 &#038; 5 certified storage, VAAI enabled (if possible)</li>
<li>iSCSI (1 Gbps)</li>
<li>High number of IOPS (at least 12,000+, most of them writes)</li>
<li>Small size (200 GB)</li>
<li>Fault tolerant (RAID, battery-backed write cache, power supply, fans, multiple gigabit uplinks, synchronous replication)</li>
<li>Cheap (less than $30k the full setup)</li>
</ul>
<p>We want to buy at the beginning of 2012. Any product that fits?
</p></blockquote>
<p><strong>The StorageMojo take</strong><br />
I suspect price will be the most significant limiter. But the reader only needs index storage, not the whole shooting match. He&#8217;s pretty happy with LeftHand for mailbox storage.</p>
<p>But if we can solve both problems for him, why not? If he should relax some constraint, feel free to suggest it.</p>
<p>He&#8217;ll be watching the comments, so if you have questions please ask them. I&#8217;ll be following the comments as well.</p>
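<p>For anyone sizing a proposal, here is a rough Python sketch of the arithmetic behind the reader&#8217;s numbers. It assumes the 200-300 megabits figure means megabits per second, and that a network mirror of local mirrors costs four disk writes per front-end write. Both are assumptions on my part, not details from the email.</p>

```python
def avg_io_bytes(megabits_per_sec, iops):
    """Average I/O size implied by a bandwidth figure and an IOPS figure."""
    return megabits_per_sec * 1_000_000 / 8 / iops

def backend_iops(front_iops, write_fraction, copies_per_write):
    """Disk-level IOPS if every front-end write lands on `copies_per_write` disks."""
    writes = front_iops * write_fraction
    reads = front_iops - writes
    return reads + writes * copies_per_write

# Figures from the reader's email: 8,000 IOPS, 75% write, 200-300 megabits.
small = avg_io_bytes(200, 8_000)
large = avg_io_bytes(300, 8_000)

# Assumption: mirror-of-mirrors means 4 disk writes per front-end write.
today = backend_iops(8_000, 0.75, 4)
target = backend_iops(12_000, 0.75, 4)

# Wire load in Mbit/s at the 12,000 IOPS target, using the larger I/O size.
target_mbits = 12_000 * large * 8 / 1_000_000
print(small, large, today, target, target_mbits)
```

<p>On these assumptions the average I/O is only 3-5 KB, and the 12,000 IOPS target still fits within a single 1 Gbps iSCSI link, which is consistent with the reader&#8217;s view that IOPS, not bandwidth, is the constraint.</p>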
<p><strong>Courteous comments welcome, of course.</strong> His email was edited for clarity.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/11/02/ask-storagemojo-80000-mailboxes-need-help/feed/</wfw:commentRss>
		<slash:comments>47</slash:comments>
		</item>
		<item>
		<title>RAMCloud is the new flash</title>
		<link>http://storagemojo.com/2011/10/05/ramcloud-is-the-new-flash/</link>
		<comments>http://storagemojo.com/2011/10/05/ramcloud-is-the-new-flash/#comments</comments>
		<pubDate>Thu, 06 Oct 2011 01:03:30 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Clusters]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2529</guid>
		<description><![CDATA[Sometimes in the midst of the endless tweaking needed to maximize storage performance one just wants to say &#8220;screw it! Put everything in RAM!&#8221; And that&#8217;s just what RAMCloud does. Disk is the new tape, flash the new disk, DRAM the new flash. RAMCloud is a research paper (pdf) and an open software project. The [...]]]></description>
			<content:encoded><![CDATA[<p>Sometimes in the midst of the endless tweaking needed to maximize storage performance one just wants to say &#8220;screw it! Put everything in RAM!&#8221; And that&#8217;s just what RAMCloud does.</p>
<p><strong>Disk is the new tape, flash the new disk, DRAM the new flash.</strong><br />
RAMCloud is a <a href="http://www.stanford.edu/~ouster/cgi-bin/papers/ramcloud.pdf" target="_blank">research paper</a> (pdf) and an <a href="http://fiz.stanford.edu:8081/display/ramcloud/Home" target="_blank">open software project</a>. The goal is enterprise-class availability with every bit of active data stored in DRAM, not disk or flash, for maximum performance. It is a key-value object store today, though as pure software that could change.</p>
<p>It&#8217;s the brainchild of John Ousterhout, a Stanford prof who invented Tcl back in the 80s at Berkeley. </p>
<p><strong>Isn&#8217;t DRAM volatile and costly?</strong><br />
Right on both counts, grasshopper, so RAMCloud isn&#8217;t a 1 for 1 disk-style architecture. No Google FS-style triple replication here, or RAID-style erasure coding.</p>
<p>Instead RAMCloud uses <i>buffered logging</i>:</p>
<blockquote><p>
. . . a single copy of each object is stored in DRAM of a primary server and copies are kept on the disks of two or more backup servers; each server acts as both primary and backup. However, the disk copies are not updated synchronously during write operations. Instead, the primary server updates its DRAM and forwards log entries to the backup servers, where they are stored temporarily in DRAM.
</p></blockquote>
<p>Instead of working around crashes &#8211; using multiple object copies as scale-out storage does &#8211; RAMCloud recovers lost data from the DRAM logs or disk drives to replicate the lost data at high speed. That&#8217;s possible because all the log data is in DRAM or spread across many disks. </p>
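<p>The write path and recovery idea described above can be sketched in a few lines of Python. This is a toy model, not RAMCloud&#8217;s code: the class and method names are hypothetical, and recovery here replays a single backup&#8217;s log, whereas the real system reads from many backups in parallel.</p>

```python
class Backup:
    """Backup server: buffers log entries in DRAM, flushes to disk later."""
    def __init__(self):
        self.dram_buffer = []   # volatile staging area
        self.disk_log = []      # stand-in for durable storage

    def append(self, entry):
        self.dram_buffer.append(entry)

    def flush(self):
        """Asynchronous step: move buffered entries to disk in one batch."""
        self.disk_log.extend(self.dram_buffer)
        self.dram_buffer.clear()

class Primary:
    """Primary server: holds the single DRAM copy of each object."""
    def __init__(self, backups):
        self.dram = {}
        self.backups = backups

    def write(self, key, value):
        self.dram[key] = value
        for b in self.backups:          # forward the log entry; no disk I/O here
            b.append((key, value))

    @staticmethod
    def recover(backups):
        """Rebuild a lost primary by replaying one backup's log: disk, then DRAM."""
        dram = {}
        b = backups[0]
        for key, value in b.disk_log + b.dram_buffer:
            dram[key] = value
        return dram

backups = [Backup(), Backup()]
primary = Primary(backups)
primary.write("a", 1)
backups[0].flush()                      # first backup has written "a" to disk
primary.write("a", 2)
primary.write("b", 3)
recovered = Primary.recover(backups)    # the primary's DRAM is assumed lost
print(recovered)
```

<p>The point of the sketch is that a write never waits on a disk: the disk copy catches up later, and replaying the log in order restores the latest value of every object.</p>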
<p>In a recent paper (<a href="http://www.stanford.edu/~ouster/cgi-bin/papers/ramcloud-recovery.pdf" target="_blank">Fast Crash Recovery in RAMCloud</a>) (pdf) Diego Ongaro, Stephen M. Rumble, Ryan Stutsman, John Ousterhout, and Mendel Rosenblum (co-founder of VMware) go into more detail on this critical feature.</p>
<p>The key elements are:</p>
<ul>
<li><strong>Scale.</strong> Servers scatter their backup data across all other servers so thousands of disks can serve the recovery.</li>
<li><strong>Log-structure.</strong> Reduces complexity and offers high performance.</li>
<li><strong>Randomization.</strong> Many decisions need to be made in a large cluster. Rather than CPU-, time- and bandwidth-consuming determinism, injecting randomization speeds decisions with less overhead.</li>
<li><strong>Dynamic tablets.</strong> The key-value store tracks resource usage within a single table and ensures that no single partition is too large for fast restores.</li>
</ul>
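<p>Here is a small Python sketch of how random placement can scatter backup segments evenly without central coordination. The best-of-three refinement is an illustrative assumption of mine, not a description of RAMCloud&#8217;s actual placement algorithm, and the cluster size and segment count are made up.</p>

```python
import random

def scatter(n_segments, n_backups, candidates, rng):
    """Assign segments to backups: sample `candidates` backups at random,
    then send the segment to the least-loaded backup in the sample."""
    load = [0] * n_backups
    for _ in range(n_segments):
        sample = [rng.randrange(n_backups) for _ in range(candidates)]
        best = min(sample, key=lambda b: load[b])
        load[best] += 1
    return load

rng = random.Random(42)
naive = scatter(10_000, 100, 1, rng)      # one random pick, no refinement
refined = scatter(10_000, 100, 3, rng)    # pick the best of three

# Spread between the busiest and idlest backup under each policy.
print(max(naive) - min(naive), max(refined) - min(refined))
```

<p>Each decision looks at only a handful of servers, yet the load ends up far more even than with a single random pick. An even spread matters because recovery is only as fast as the most heavily loaded backup.</p>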
<p>DRAM is volatile so the log replication data is spread to other servers on other racks for redundancy before being committed to disk. Still, total system write throughput is limited by the disk write speed, whose limits are a key reason people are moving from disks. Flash drives may help, but other techniques, such as log truncation and sharding, make it possible to get good performance from several thousand SATA drives.</p>
<p>How good? The team reports that in a 60 node cluster they recover 35GB in 1.6 seconds. With more nodes larger partitions should be restored even faster. Scale is good.</p>
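<p>A quick back-of-the-envelope check in Python shows what those figures imply. The 100 MB/s single-disk streaming rate is my assumption for comparison, not a number from the paper.</p>

```python
def recovery_rates(gigabytes, seconds, nodes):
    """Aggregate and per-node recovery throughput in GB/s."""
    aggregate = gigabytes / seconds
    return aggregate, aggregate / nodes

# Reported result: 35 GB recovered in 1.6 seconds on a 60-node cluster.
aggregate, per_node = recovery_rates(35, 1.6, 60)

# Assumption: a single disk streaming at roughly 100 MB/s (0.1 GB/s).
single_disk_seconds = 35 / 0.1

print(aggregate, per_node * 1000, single_disk_seconds)
```

<p>That works out to roughly 22 GB/s in aggregate, or about 365 MB/s per node, against nearly six minutes if one disk had to stream the same data alone.</p>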
<p><strong>Lights out!</strong><br />
Power failures wipe all the data in DRAM. The obvious defense is to avoid failures: combine battery backup with diesel generator sets. Power ride-through will handle interruptions into the hundreds of milliseconds.</p>
<p>But who is going to trust that? That&#8217;s why future commercial implementations will insist on logging to stable storage, such as flash SSDs.</p>
<p>They&#8217;re getting cheaper fast &#8211; faster than DRAM &#8211; which will make this a common approach. </p>
<p><strong>Cost</strong><br />
Professor Ousterhout kindly sent a short note about cost, correctly noting that</p>
<blockquote><p>
. . . if you measure cost/operation, DRAM is roughly 100x cheaper than disk, since a disk can only perform about 100-200 operations/second.  This is why RAMCloud makes sense for data-intensive applications. . . .
</p></blockquote>
<p>While you and I might find that persuasive, too many enterprises don&#8217;t. The deep conservatism of the storage culture &#8211; both figuratively and literally &#8211; makes cost a good excuse to stay with the tried and true, and easy to explain to CFOs. </p>
<p>The good news for the company I hope he is starting is that the primacy of $/GB is slowly eroding as customers see the system level savings from fast storage. SSD vendors and companies like TMS and Kaminario are breaking trail for RAMCloud.</p>
<p><strong>The StorageMojo take</strong><br />
Make no mistake: RAMCloud is a research project, not a commercial product, years and million$ away from commercial application. But the concept is promising.</p>
<p>Imagine a world where data layout doesn&#8217;t matter, where apps are optimized for sub-millisecond storage, where 100-byte I/Os are faster and just as efficient as 8KB I/Os. The architectural implications are huge and would take a decade or more to absorb.</p>
<p>RAMCloud raises the thorny issue of tiering: getting hot data on the hot storage and everything else off to disk. There are OK answers for tiering but nothing insanely great. </p>
<p>RAMCloud shows we&#8217;re far from the end of the line in what storage can do. Faster, better, arguably cheaper: 2 out of 3 ain&#8217;t bad.</p>
<p><strong>Courteous comments welcome, of course.</strong> A shorter version of this post appeared on <a href="http://www.zdnet.com/blog/storage/ramcloud-puts-everything-in-dram/1546" target="_blank">ZDNet</a>.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/10/05/ramcloud-is-the-new-flash/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Storage @VMworld 2011</title>
		<link>http://storagemojo.com/2011/09/12/storage-vmworld-2011/</link>
		<comments>http://storagemojo.com/2011/09/12/storage-vmworld-2011/#comments</comments>
		<pubDate>Mon, 12 Sep 2011 16:53:32 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Cloud computing & storage]]></category>
		<category><![CDATA[Clusters]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2519</guid>
		<description><![CDATA[VMworld is the best storage show I&#8217;ve seen in years. VMware&#8217;s severe storage problems leave users hungry for solutions &#8211; and your friendly neighborhood storage industry is happy to oblige. It&#8217;s almost as if VMware were owned by a storage company. Flash everywhere Fusion-io, Nimble Storage, Nimbus Data, Avere, Pure and more were talking about [...]]]></description>
			<content:encoded><![CDATA[<p>VMworld is the best storage show I&#8217;ve seen in years. VMware&#8217;s severe storage problems leave users hungry for solutions &#8211; and your friendly neighborhood storage industry is happy to oblige.</p>
<p>It&#8217;s almost as if VMware were owned by a storage company.</p>
<p><strong>Flash everywhere</strong><br />
<a href="http://www.fusionio.com/" target="_blank">Fusion-io</a>, <a href="http://www.nimblestorage.com/" target="_blank">Nimble Storage</a>, <a href="http://www.nimbusdata.com/" target="_blank">Nimbus Data</a>, <a href="http://www.averesystems.com/" target="_blank">Avere</a>, <a href="http://www.purestorage.com/" target="_blank">Pure</a> and more were talking about how well flash supports VMware. Flash fixes VDI boot storms, deduped VMDKs, I/O-bound servers and much more.</p>
<p><strong>Pure Storage</strong><br />
Here is <a href="http://www.purestorage.com/" target="_blank">Pure&#8217;s</a> Matt Kixmoeller giving a nifty demo in this 50 second video:</p>
<p><object width="500" height="306"><param name="movie" value="http://www.youtube.com/v/7_7ps2ci8tk?version=3"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/7_7ps2ci8tk?version=3" type="application/x-shockwave-flash" width="500" height="306" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>Not exactly sure what those thousand VMs were doing. Maybe Pure will comment.</p>
<p><strong>Falconstor</strong><br />
I lost track of <a href="http://www.falconstor.com/" target="_blank">Falconstor</a> due to their OEM focus and sprawling product line. New CEO Jim McNiel has refocused the company &#8211; with the help of former Cheyenne teammates &#8211; on backup, business continuity/DR, dedup and virtualization.</p>
<p>Their clustered Network Storage Server turns all of Fstor&#8217;s products into tin-wrapped software suitable for channel partners. Takeaway: forget what you knew about them; they are a new company.</p>
<p><strong><a href="http://www.virsto.com/" target="_blank">Virsto</a></strong><br />
While the release of their storage hypervisor for VMware makes them seem like a new company, Virsto has been shipping product for over a year, but on Hyper-V, not VMware. Microsoft lost interest in server virtualization and Virsto moved on.</p>
<p>Their product is a virtual appliance that:</p>
<blockquote><p>
. . . runs in each host, creating a transparent virtual storage layer that is thin provisioned, fully cluster-aware, supports very rapid snapshot and clone creation, and scales to support tens of thousands of high performance snapshots and clones.</p>
<p>Virsto . . . decouple[s] application performance from any dependence on the rotational latencies and seek times of underlying disk associated with random writes. All random writes are sequentialized and written directly to a transparent logging device . . . and then asynchronously de-staged to primary storage. . . .
</p></blockquote>
<p>Net/net: high performance virtual storage regardless of underlying physical storage. Virsto offers a free trial &#8211; if you try it, let me know how it works.</p>
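<p>The log-then-destage idea in the quote can be illustrated with a toy Python model. This is only a sketch of the general technique: the class name, the in-memory structures and the block addresses are hypothetical, and nothing here reflects Virsto&#8217;s actual implementation.</p>

```python
class WriteLog:
    """Toy write log: random writes append sequentially, destage happens later."""
    def __init__(self):
        self.log = []        # sequential log device: (lba, data) in arrival order
        self.primary = {}    # stand-in for primary storage, keyed by block address

    def write(self, lba, data):
        self.log.append((lba, data))     # always an append, whatever the address

    def read(self, lba):
        for logged_lba, data in reversed(self.log):   # newest logged copy wins
            if logged_lba == lba:
                return data
        return self.primary.get(lba)

    def destage(self):
        """Asynchronously apply the log to primary storage in address order."""
        latest = dict(self.log)          # later entries overwrite earlier ones
        for lba in sorted(latest):
            self.primary[lba] = latest[lba]
        self.log.clear()
        return sorted(latest)

store = WriteLog()
for lba, data in [(907, "x"), (12, "y"), (455, "z"), (12, "y2")]:
    store.write(lba, data)
before = store.read(12)      # served from the log before destage
order = store.destage()      # blocks reach primary storage sorted by address
print(before, order, store.read(12))
```

<p>The application sees only fast appends. The seeks are deferred to the destage pass, which can batch, coalesce overwrites and sort by address.</p>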
<p><strong>But wait! There&#8217;s more!</strong><br />
Cloud-related products from <a href="http://www.storsimple.com/" target="_blank">StorSimple</a>, <a href="http://amax.com/default.asp" target="_blank">AMAX</a> and <a href="http://raidundant.com/v2/" target="_blank">Raidundant</a> continue to pick at the problem of how/when/where cloud integrates with the enterprise.</p>
<p><strong>The StorageMojo take</strong><br />
Many cool products and ideas. The storage problems of many virtual machines are not unlike those of earlier time-shared virtual memory systems, but the scale is much greater. </p>
<p>And when the scale is greater the problem is fundamentally different. As virtualization grows we&#8217;ll need to see more creative answers beyond deduplication and flash.</p>
<p><strong>Courteous comments welcome, of course.</strong> Message to SNIA: storage networking is passé. Time to retool for the world of virtual machines, NoSQL databases, scale-out storage and flash-enabled architectures. A new name would be a start.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/09/12/storage-vmworld-2011/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Flash cheaper than disk? Really?</title>
		<link>http://storagemojo.com/2011/08/29/flash-cheaper-than-disk-really/</link>
		<comments>http://storagemojo.com/2011/08/29/flash-cheaper-than-disk-really/#comments</comments>
		<pubDate>Mon, 29 Aug 2011 15:12:18 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2515</guid>
		<description><![CDATA[Pure Storage, a well-funded ($55M) valley startup, came out of hiding last week with a startling claim: enterprise flash that is cheaper1,2,3 than disk. 1Cheaper after compressing and deduping the data. 2Cheaper after using almost all the flash capacity, which you can&#8217;t do with disks because performance suffers. 3Cheaper compared to the most expensive disk-based [...]]]></description>
			<content:encoded><![CDATA[<p>Pure Storage, a well-funded ($55M) valley startup, came out of hiding last week with a startling claim: enterprise flash that is cheaper<sup>1,2,3</sup> than disk.</p>
<p><strong><sup>1</sup></strong>Cheaper after compressing and deduping the data.</p>
<p><strong><sup>2</sup></strong>Cheaper after using almost all the flash capacity, which you can&#8217;t do with disks because performance suffers.</p>
<p><strong><sup>3</sup></strong>Cheaper compared to the most expensive disk-based enterprise storage you can buy. </p>
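<p>The three footnotes combine into one simple formula, sketched here in Python. Every number below is hypothetical and chosen only to show how the comparison works. None of them are Pure&#8217;s or any vendor&#8217;s actual prices, reduction ratios or utilization figures.</p>

```python
def effective_cost_per_gb(raw_cost_per_gb, reduction_ratio, usable_fraction):
    """Cost per GB of data actually stored.

    reduction_ratio: logical bytes per physical byte after dedup and compression.
    usable_fraction: share of raw capacity that can be filled in practice.
    """
    return raw_cost_per_gb / (reduction_ratio * usable_fraction)

# Hypothetical flash array: $10/GB raw, 5:1 data reduction, 90% of capacity used.
flash = effective_cost_per_gb(10.0, 5.0, 0.9)

# Hypothetical high-end disk array: $5/GB raw, no reduction, 50% of capacity used.
disk = effective_cost_per_gb(5.0, 1.0, 0.5)

print(round(flash, 2), round(disk, 2))
```

<p>With these made-up inputs flash wins comfortably. Drop the reduction ratio to 1:1 and it loses, which is why each footnote matters.</p>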
<p><strong>Your mileage will vary</strong><br />
In 2008 an EMC VP <a href="http://storagemojo.com/2008/05/19/emc-flash-replaces-high-end-disks-in-2010/" target="_blank">predicted that flash would replace high-end disks in 2010</a>. That didn&#8217;t happen. Why?</p>
<p>After 5 years of hype, enterprises are still leery of flash. Endurance, reliability, data integrity, security, integration &#8211; all unanswered questions. At least by the other IT guys in town, even if vendors think they&#8217;ve nailed it. </p>
<p>So people buy the known quantity: high-end drives. </p>
<p><strong>The StorageMojo take</strong><br />
Kudos to Pure&#8217;s marketing for making a bold, attention-grabbing statement. Too often marketing falls back on the trite-and-true &#8220;faster, better, cheaper.&#8221;</p>
<p>But IT wants to solve old problems while not introducing new ones. The 10x performance boost would be enough, if IT believed.</p>
<p>Pure&#8217;s challenge &#8211; and that of other companies with similar products &#8211; is to convince IT not only that flash is ready for primetime, but that compression and dedup are too.</p>
<p>And once they do that, why not use them with disks, as well? As Nimble Storage has found, inline compression is now easily handled in software on multi-core chips.</p>
<p>With raw SATA drives down to 3¢/GB, storage vendors have ample opportunity to squeeze costs out. The flash/disk competition will be good for all of us.</p>
<p><strong>Courteous comments welcome, of course.</strong> The storage high-end is more active than it&#8217;s been in 15 years. Good!</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/08/29/flash-cheaper-than-disk-really/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>De-dup: too much of good thing?</title>
		<link>http://storagemojo.com/2011/06/27/de-dup-too-much-of-good-thing/</link>
		<comments>http://storagemojo.com/2011/06/27/de-dup-too-much-of-good-thing/#comments</comments>
		<pubDate>Mon, 27 Jun 2011 18:52:40 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2434</guid>
		<description><![CDATA[A post last month in ACM&#8217;s Queue raised a disturbing point around block-level deduplication in flash SSDs: it could hose your file system. De-dup is a Good Thing, right? Researchers found that at least 1 Sandforce SSD controller &#8211; the SF1200 &#8211; does block-level deduplication by default. Many file systems write critical metadata to multiple [...]]]></description>
			<content:encoded><![CDATA[<p>A post last month in <a href="http://queue.acm.org/detail.cfm?id=1985003" target="_blank">ACM&#8217;s Queue</a> raised a disturbing point around block-level deduplication in flash SSDs: it could hose your file system.</p>
<p><strong>De-dup is a Good Thing, right?</strong><br />
Researchers found that at least 1 Sandforce SSD controller &#8211; the SF-1200 &#8211; does block-level deduplication by default. Many file systems write critical metadata to multiple blocks in case one copy gets corrupted. But what if, unbeknownst to you, your SSD de-duplicates that block, leaving your file system with only 1 copy?</p>
<p>Yup, corruption of 1 block could wipe out your entire file system. And since all the &#8220;copies&#8221; point to the same corrupted block, there&#8217;s no way to recover. </p>
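<p>To make the failure mode concrete, here is a minimal sketch of a toy content-addressed block store. It is not how the SF-1200 or any real controller works; the class name, method names and block addresses are all hypothetical. It only shows how three logical copies can collapse into one physical block.</p>

```python
import hashlib


class DedupStore:
    """Toy block store: one physical copy per unique content fingerprint."""

    def __init__(self):
        self.physical = {}  # fingerprint -> block bytes
        self.logical = {}   # logical block address -> fingerprint

    def write(self, lba, data):
        fingerprint = hashlib.sha256(data).hexdigest()
        # Identical content is stored once; later writes only add a pointer.
        self.physical.setdefault(fingerprint, data)
        self.logical[lba] = fingerprint

    def read(self, lba):
        return self.physical[self.logical[lba]]

    def corrupt(self, lba):
        # Simulate media corruption of the single physical block behind lba.
        fingerprint = self.logical[lba]
        self.physical[fingerprint] = b"\x00" * len(self.physical[fingerprint])


store = DedupStore()
superblock = b"critical file system metadata"
copies = (0, 8192, 16384)  # hypothetical addresses of the redundant copies
for lba in copies:
    store.write(lba, superblock)

physical_copies = len(store.physical)
store.corrupt(0)
surviving = [lba for lba in copies if store.read(lba) == superblock]
print(physical_copies, surviving)
```

<p>The file system believes it wrote three copies, but the store holds one, so damaging it leaves no intact copy to recover from.</p>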
<p>Most Unix superblock-based FSs and ZFS could be pooched by loss of a single block. NTFS also mirrors critical metafile info and could be vulnerable as well.</p>
<p>To be fair, AFAIK no one has reported this failure in the wild, so it is conjecture today. That said, it may have happened to people who didn&#8217;t realize what went wrong.</p>
<p>But in the world of storage, if something can happen it will, usually at the worst possible time.  Have you seen a total data loss on an otherwise functioning SSD?</p>
<p><strong>The StorageMojo take</strong><br />
I&#8217;ve made calls to a number of vendors to get their responses, including Sandforce, Intel, Texas Memory Systems and OCZ. With any luck we&#8217;ll soon have a 1st pass on who does what to your data. </p>
<p>Don&#8217;t panic: not all SSD controllers do this. Texas Memory Systems controllers don&#8217;t, partly because they don&#8217;t use MLC flash and partly because minimizing capacity use and maximizing data availability are conflicting goals, and they chose availability over capacity.</p>
<p>Also note that the SF-1200 is offered as a consumer grade controller. Not clear what Sandforce does with the rest of their line, but their site does repeatedly reference their &#8220;DuraWrite&#8221; technology which appears to include block-level dedup. </p>
<p>Just last week StorageMojo recommended faster adoption of SSDs in the enterprise &#8211; and still does. But this once again underlines the need for mirroring. The sooner we find these issues, the sooner they&#8217;ll be fixed.</p>
<p>Watch the comments for vendor info, and I&#8217;ll update this post with more info if and when it develops. </p>
<p><strong>Update:</strong> Here is the Sandforce response:</p>
<blockquote><p>
In the recent article by David Rosenthal he mentions a conversation with Kirk McKusik and the ZFS team at Sun Microsystems (Oracle). That conversation explains why it is critical that meta data not be lost or corrupted. He goes on to say that &#8220;If the stored metadata gets corrupted, the corruption will apply to all copies, so recovery is impossible.&#8221;</p>
<p>SandForce employs a feature called DuraWrite which enables flash memory to last longer through innovative patent pending techniques. Although SandForce has not disclosed the specific operation of DuraWrite and its 100% lossless write reduction techniques, the concept of deduplication, compression, and data differencing is certainly related. Through all the years of development and OEM testing with our SSD manufacturers and top tier storage users, there has not been a single reported failure of the DuraWrite engine. There is no more likelihood of DuraWrite loosing data than if it was not present.</p>
<p>We completely agree that any loss of metadata is likely to corrupt access to the underlying data. That is why SandForce created RAISE (Redundant Array of Independent Silicon Elements) and includes it on every SSD that uses a SandForce SSD Processor. All storage devices include ECC protection to minimize the potential that a bit can be lost and corrupt data. Not only do SandForce SSD Processors employ ECC protection enabling an UBER (Uncorrectable Bit Error Rate) of greater than 10^-17, if the ECC engine is unable to correct the bit error RAISE will step in to correct a complete failure of an entire sector, page, or block. </p>
<p>This combination of ECC and RAISE protection provides a resulting UBER of 10^-29 virtually eliminates the probabilities of data corruption. This combined protection is much higher than any other currently shipping SSD or HDD solution we know about. The fact that ZFS stores up to three copies of the metadata and optionally can replicate user data is not an issue. All data stored on a SandForce Driven SSD is viewed critical and protected with the highest level of certainty.
</p></blockquote>
<p>Readers: how does that sound to you?<br />
<strong>End update.</strong><br />
<strong>Update 2:</strong> Oddly enough, the Sandforce web site specifies the SF-1200 controller at</p>
<blockquote><p>
ECC Recovery: Up to 24 bytes correctable per 512-byte sector<br />
Unrecoverable Read Errors: Less than 1 sector per 10<sup>16</sup> bits read
</p></blockquote>
<p>which is about where many enterprise disk drives are spec&#8217;d &#8211; and far short of 10<sup>-29</sup>. Hmm-m.<br />
<strong>End update 2.</strong></p>
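<p>A back-of-the-envelope comparison of the two figures, as a sketch. It assumes both can be read as &#8220;bits read per unrecoverable event,&#8221; which is a simplification: the web site spec counts sectors and the response quotes a bit error rate. Decimal petabytes are also an assumption.</p>

```python
BITS_PER_BYTE = 8

spec_bits_per_event = 10**16     # web site spec: 1 bad sector per 10^16 bits read
claimed_bits_per_event = 10**29  # vendor response: combined UBER of 10^-29


def petabytes(bits):
    """Convert a bit count to decimal petabytes (10^15 bytes)."""
    return bits / BITS_PER_BYTE / 10**15


pb_read_per_event = petabytes(spec_bits_per_event)
gap = claimed_bits_per_event // spec_bits_per_event

print(pb_read_per_event)  # petabytes read per expected unrecoverable sector
print(gap)                # how many times better the claim is than the spec
```

<p>Under those assumptions the published spec works out to one unrecoverable sector per 1.25 PB read, thirteen orders of magnitude short of the figure in the response.</p>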
<p><strong>Update 3:</strong><br />
Spoke to James Myers of Intel. He said that no current Intel SSD uses any form of compression, including dedup. He also cautioned against making too much of the risk: after all, you&#8217;d have to have an unrecoverable read error AND it would have to be that 1 critical block. Perhaps, he suggested, file systems that do use multiple copies of critical FS metadata could slightly alter the copies to eliminate the possibility of deduplication.<br />
<strong>End update 3.</strong></p>
<p><strong>Courteous comments welcome, of course.</strong> TMS has been advertising on StorageMojo for a couple of years. </p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/06/27/de-dup-too-much-of-good-thing/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Can flash SSDs be trusted?</title>
		<link>http://storagemojo.com/2011/06/20/can-flash-ssds-be-trusted/</link>
		<comments>http://storagemojo.com/2011/06/20/can-flash-ssds-be-trusted/#comments</comments>
		<pubDate>Mon, 20 Jun 2011 22:15:18 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2404</guid>
		<description><![CDATA[IT pros are always skeptical about new technology. Is it surprising that flash SSD&#8217;s are getting the gimlet eye? The big worry seems to be endurance. Nobody wants to buy an expensive SSD and have it fail after a year on the job. But IT infrastructures are designed to manage endurance failures. LTO tape, for [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>IT pros are always skeptical about new technology. Is it surprising that flash SSDs are getting the gimlet eye?</p>
<p>The big worry seems to be endurance. Nobody wants to buy an expensive SSD and have it fail after a year on the job.</p>
<p>But IT infrastructures are designed to manage endurance failures. LTO tape, for example, is specified for a few hundred head passes. Yet tape is the paragon of data persistence.</p>
<p>Hard drive failure rates aren&#8217;t low enough for any of us to consider storing important data on one without backup. So why are IT pros so skittish about flash SSDs?</p>
<p><strong>Experience</strong><br />
Or rather, lack of experience. Flash SSDs are evolving rapidly, with new generations arriving every 12 to 18 months.</p>
<p>It takes time for experience with new models to percolate. In the meantime, bad experiences with earlier generation drives continue to circulate.</p>
<p>Vendor secrecy about failure rates and modes doesn&#8217;t help. Until the Bianca Schroeder/Google/CMU <a href="http://storagemojo.com/2007/02/19/googles-disk-failure-experience/" target="_blank">disk drive studies</a> were released 4 years ago, we had no independent large-scale reliability data.</p>
<p>I hope it won&#8217;t take 20 years before we get that information on SSDs. How about it, vendors?</p>
<p><strong>Reliability</strong><br />
SSDs may turn out to be more reliable than hard drives, but I won&#8217;t believe it until I see independent data. The lack of moving parts is a plus, but about half the failures in disk drives come from the electronics, not the spinning bits. SSDs have most of the same electronics.</p>
<p><strong>SSD equivalent of a head disk assembly</strong><br />
Plane failures are a major trouble spot. Each die consists of two planes. These planes are prone to sudden failure, wiping out half the data on a die.</p>
<p>Most chip carriers contain multiple stacked dies, so a plane failure will remove anywhere from a quarter to an eighth of the chip&#8217;s total storage. Most flash controllers lay out the data in ways similar to a RAID array to guard against data loss.</p>
<p><strong>What to look for</strong><br />
Since Maxtor&#8217;s well-deserved demise we&#8217;ve had reasonable parity between disk drives and disk drive vendors. But that is not the case with the still maturing flash drive market.</p>
<p>Storage Newsletter recently <a href="http://www.storagenewsletter.com/news/flash/90-ssd-manufacturers-in-the-world-document" target="_blank">published</a> a list of 85 SSD vendors, most of whom none of us have heard of. Many are focused on the embedded systems market, but the list is also long because the barriers to entry are low: buy a controller chip; buy flash on the spot market; gen up a PC board and <i>voilà</i>, you are in the SSD market.</p>
<p>But flash that ends up on the spot market at rock-bottom prices is often marginal. The big buyers, like Apple, get first dibs on the best.</p>
<p>SSDs made with spot-market flash and a no-name &#8211; USB thumb drive? &#8211; controller will have a lot more problems. Which is to say that in today&#8217;s SSD market brand names count.</p>
<p>One thing to look for is a guarantee of total write capacity. Another is a statement of how much over-provisioning the drive has. </p>
<p>Even better: a five-year guarantee such as Seagate popularized with disks and that Intel just started offering on one of its SSD lines.</p>
<p><strong>The StorageMojo take</strong><br />
I have been as skeptical as anyone on SSDs &#8211; read some of my earliest posts &#8211; but the time for skepticism has passed. Of course, perform careful evals on any new IT product. But the best flash SSDs are ready for the enterprise today.</p>
<p>And here&#8217;s an even more radical conclusion: the best consumer SSDs are ready for the enterprise as well. Using any SATA drives in your enterprise?</p>
<p>The key: how is the SSD architected into the system? If it is a storage tier, the data has to be protected just like a RAID array. If it is a cache you have more flexibility &#8211; as long as the data is also on disk.</p>
<p>Yes, it&#8217;s more difficult to separate the wheat from the chaff in the SSD market today. But there are quality products available.</p>
<p><strong>Courteous comments welcome, of course.</strong><br />
Started thinking about this as a result of the research project I did a few months ago. Leading-edge storage managers with workloads that would benefit enormously from flash SSDs weren&#8217;t seriously evaluating them. Big surprise. What do you think?</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/06/20/can-flash-ssds-be-trusted/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Webinar Q&amp;A: flash SSD performance &amp; reliability</title>
		<link>http://storagemojo.com/2011/06/07/webinar-qa-flash-ssd-performance-reliability/</link>
		<comments>http://storagemojo.com/2011/06/07/webinar-qa-flash-ssd-performance-reliability/#comments</comments>
		<pubDate>Tue, 07 Jun 2011 17:02:03 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[SOHO/SMB]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2387</guid>
		<description><![CDATA[I was surprised by the number of questions at last week&#8217;s webinar &#8211; many more than we could get to &#8211; so I&#8217;m answering a few here. Performance Q: Can Robin talk about performance and how does flash help solve I/O bottleneck? NAND flash is very good at random reads, and a good SSD can [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>I was surprised by the number of questions at last week&#8217;s webinar &#8211; many more than we could get to &#8211; so I&#8217;m answering a few here.</p>
<p><strong>Performance</strong><br />
<i>Q: Can Robin talk about performance and how does flash help solve I/O bottleneck?</i></p>
<p>NAND flash is very good at random reads, and a good SSD can handle thousands per second, compared to a disk drive&#8217;s 150-500 (for a high-end drive). That&#8217;s one reason arrays are popular: they provide higher random I/O performance because multiple heads are seeking. But that&#8217;s also why capacity utilization is so low: the disks come with more capacity than most applications use.</p>
<p>So not only are you buying multiples of the most expensive disks, but then you only use a fraction of their capacity. This is why flash SSDs are predicted to kill the high-end drives even though SSDs cost much more per GB: a single SSD can eliminate 6-10 hard drives. That is a major cost saving.</p>
<p><i>Q: So the low cost assumes that the cache is read only, otherwise it needs to be RAID? That comes off as misleading.</i></p>
<p>While flash SSDs are much better at random reads than at random writes, they still beat several high-end disks at writes. </p>
<p>Since most workloads are 80%-95% reads, an SSD that can handle 1,000 writes per second can handle a lot of work. Disks are still the most cost-effective solution for large sequential workloads because their performance is close to SSDs and they are so much cheaper. </p>
<p><strong>Reliability</strong><br />
<i>Q: What are Robin&#8217;s thoughts on the reliability of SSDs? We have seen failure rates of over 10% on drives less than two months old.</i></p>
<p>Flash SSD reliability today is all over the map. As flash SSD technology matures, I&#8217;d expect to see drive reliability rates converge. 5 years ago disk reliability was fairly similar with the glaring exception of Maxtor. </p>
<p>That said, it&#8217;s useful to recognize that there&#8217;s a lot more design and sourcing variability in SSDs. If someone uses the cheapest parts &#8211; and there are plenty available &#8211; they can offer good specs but highly variable reliability. </p>
<p>If they leave out too much redundancy they&#8217;ll have a cost advantage but will be more vulnerable to chip and plane failures. The market will eventually settle on similar specs for each application, but we&#8217;re years away from that.</p>
<p><i>Q: You mentioned 10,000 writes and failure can begin, what is that in years?</i></p>
<p>Like so many storage specs, that 10k write spec for MLC flash is a statistical one that can be improved upon by more robust ECC, as this chart from SNIA shows:</p>
<p><a href="http://storagemojo.com/wp-content/uploads//2011/06/ecc_flash_reliability.jpg"><img src="http://storagemojo.com/wp-content/uploads//2011/06/ecc_flash_reliability.jpg" alt="" title="ecc_flash_reliability" width="480" height="343" class="aligncenter size-full wp-image-2388" /></a></p>
<p>But the most important way to improve upon it is by increasing the capacity of the SSD. Double the size of the SSD and you double the total write capacity. </p>
<p>As to what that is in years, the industry is still figuring out how to spec that. The best vendor spec I&#8217;ve seen so far has been from Intel &#8211; 5 years at 20 GB of writes per day. </p>
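<p>The arithmetic behind those two answers can be sketched as follows. The function names, the 100 GB and 200 GB capacities and the write amplification factor of 10 are hypothetical, and the model assumes perfect wear leveling, so treat it as an illustration rather than a vendor formula.</p>

```python
def rated_total_writes_gb(years, gb_per_day):
    """Total host writes implied by an 'N years at X GB/day' rating."""
    return years * 365 * gb_per_day


def endurance_years(capacity_gb, pe_cycles, gb_per_day, write_amplification=1.0):
    """Years until every cell has used its program/erase cycles.

    Assumes perfect wear leveling. write_amplification is the ratio of
    flash writes to host writes (an assumed figure, not a spec).
    """
    total_host_writes_gb = capacity_gb * pe_cycles / write_amplification
    return total_host_writes_gb / (gb_per_day * 365)


# The Intel-style rating from the text: 5 years at 20 GB of writes per day.
intel_total_gb = rated_total_writes_gb(5, 20)

# Hypothetical 10k-cycle MLC drives at the same 20 GB/day workload.
small = endurance_years(100, 10_000, 20, write_amplification=10.0)
large = endurance_years(200, 10_000, 20, write_amplification=10.0)

print(intel_total_gb)                   # GB of host writes over the rating
print(round(small, 1), round(large, 1))
```

<p>The rating works out to 36,500 GB, or 36.5 TB, of total writes. In this model doubling the capacity doubles the lifetime at a given daily write load, which is the point made above.</p>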
<p><strong>Courteous comments welcome, of course.</strong> I enjoyed the webinar &#8211; a new experience for me &#8211; and not just because I got paid. The crew at Nimble was a pleasure to work with. Here&#8217;s a <a href="http://www.nimblestorage.com/resources/robin-harris-ssd-webinar/" target="_blank">link</a> to the webinar. </p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/06/07/webinar-qa-flash-ssd-performance-reliability/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>StorageMojo webinar Thursday</title>
		<link>http://storagemojo.com/2011/05/23/storagemojo-webinar-thursday/</link>
		<comments>http://storagemojo.com/2011/05/23/storagemojo-webinar-thursday/#comments</comments>
		<pubDate>Tue, 24 May 2011 01:26:18 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2381</guid>
		<description><![CDATA[A new media adventure StorageMojo is decamping to Silicon Valley tomorrow. On Thursday I&#8217;ll be doing my first webinar &#8211; ever &#8211; with Dan Leary of Nimble Storage on the topic Not Just a Flash in the Pan: Proven Strategies for Successful SSD Deployment in the Midsize Enterprise. Didn&#8217;t realize how long that title was [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><strong>A new media adventure</strong><br />
StorageMojo is decamping to Silicon Valley tomorrow. On Thursday I&#8217;ll be doing my first webinar &#8211; ever &#8211; with Dan Leary of Nimble Storage on the topic <a href="http://info.nimblestorage.com/robin-harris-webinar-storage-mojo.html" target="_blank">Not Just a Flash in the Pan: Proven Strategies for Successful SSD Deployment in the Midsize Enterprise</a>.</p>
<p>Didn&#8217;t realize how long that title was until I tried to put it on a slide.</p>
<p><strong>The StorageMojo take</strong><br />
My focus is on addressing IT concerns about flash-based products. One part carrot &#8211; look at the massive benefits that flash SSDs bring to storage architectures &#8211; and one part medicine &#8211; IT folks&#8217; attitudes towards flash tend to be gloomier than the product&#8217;s rapid progress justifies. </p>
<p>My own flash posts, nearly 5 years ago, were downbeat &#8211; too many unanswered questions &#8211; and as in any new area some companies made better decisions than others. The resulting noise made &#8220;go <i>really</i> slow&#8221; the IT default. In the recent survey on StorageMojo I found that IT remains more skeptical than I&#8217;d expected.</p>
<p>But we&#8217;re moving past the fear and loathing stage with 4th generation flash SSDs. The question now: how do new system architectures enabled by flash help me do more with less? Nimble has a pretty good answer to that question.</p>
<p><strong>Courteous comments welcome, of course.</strong> I have some free time on Wednesday if anyone would like to chat. And I&#8217;ll be hitting the Ryowa ramen house in Mountain View at least once.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/05/23/storagemojo-webinar-thursday/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Flash and the re-architecting of storage</title>
		<link>http://storagemojo.com/2011/05/17/flash-and-the-re-architecting-of-storage/</link>
		<comments>http://storagemojo.com/2011/05/17/flash-and-the-re-architecting-of-storage/#comments</comments>
		<pubDate>Tue, 17 May 2011 18:44:11 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[SOHO/SMB]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2374</guid>
		<description><![CDATA[Jim Gray&#8217;s comment that disk is the new tape is truer today than it was 8 years ago. We&#8217;ve been adding caches, striping disks, modifying applications and performing other unnatural acts to both reduce and accommodate random reads and writes to disk. Flash changes the calculus of 20 years of storage engineering. Flash gives us [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>Jim Gray&#8217;s comment that <a href="http://queue.acm.org/detail.cfm?id=864078" target="_blank">disk is the new tape</a> is truer today than it was 8 years ago. We&#8217;ve been adding caches, striping disks, modifying applications and performing other unnatural acts to both reduce and accommodate random reads and writes to disk.</p>
<p>Flash changes the calculus of 20 years of storage engineering. Flash gives us abundant random reads &#8211; something hard drives are poor at &#8211; and reasonable random writes to whatever hot data we choose. </p>
<p>In a feverish burst of design and investment we&#8217;ve tried flash everywhere in the storage stack: disks; PCI cards; motherboards; controllers; built-in tiering; and appliances.  These products have been focused on enterprise datacenters or very targeted applications where the cost of flash was justifiable.</p>
<p>But clarity is emerging. It isn&#8217;t so much where you put the flash as what you ask the flash to do. There are three requirements:</p>
<ul>
<li>Valuable data. Flash is an order of magnitude more costly than disk. </li>
<li>Often accessed. If not, leave it on disk.</li>
<li>Enables new functionality and/or lowers cost. If it doesn&#8217;t, why bother?</li>
</ul>
<p><strong>The buyer&#8217;s burden</strong><br />
These requirements frame a basic point: optimizing for flash requires a systems level approach. Adding flash can make current architectures go faster, but that isn&#8217;t the big win.</p>
<p>Buyers looking for an economic edge must make a cognitive leap: <i>the old ways are no longer best</i>. Flash enables efficiencies and capabilities in smaller systems that only costly enterprise gear had a few years ago.</p>
<p><strong>Tiering</strong><br />
Tiered flash solutions are the most common approach today. Tiering software has improved in recent years, making the movement of data between flash and disk safe, fast and granular. </p>
<p>We’ve started to at least see interest in the midsize enterprise, like the <a href="http://www.equallogic.com/products/default.aspx?id=9511" target="_blank">EqualLogic hybrid SAS/SSD</a> array in VDI deployments.</p>
<p><strong>Metadata and cache</strong><br />
The best fit for flash today is metadata and caching. These best meet the requirements for value, access and functionality.</p>
<p>Once metadata is freed from disk constraints we can combine it with caching to build high-performance systems on commodity hardware. The win for innovators is to design new metadata structures and caching algorithms for flash. </p>
<p>They can design the (write) data layouts to best take advantage of the physics of disk and flash. <a href="http://storagemojo.com/2010/11/08/jack-be-nimble/" target="_blank">Nimble Storage’s CASL architecture</a>, which combines a large flash cache with full-stripe writes, is one example.</p>
<p>Flash is also an important enabler for low-cost de-duplication because it&#8217;s cheaper to keep block metadata &#8211; fingerprints or hash codes &#8211; in flash than it is in RAM. Some vendors are encouraging the use of de-duplicated storage for midrange primary storage, enabled by flash indexes or caches that make it feasible to reconstruct files on-the-fly. </p>
<p><strong>The StorageMojo take</strong><br />
Shaking off the effects of 50 years of disk-based limitations isn&#8217;t easy. Our disk-based orthodoxy is ingrained in architectures and our thinking. </p>
<p>But buyers face a difficult job: evaluating architectures and algorithms to choose products for eval. A shortcut: look for architectures that collapse existing storage stovepipes to reduce cost, total data stored and operational complexity. The three are related and offer the big wins. </p>
<p>In the last 10 years raw disk capacity cost has dropped to less than a 10th of what it was, but the cost of traditional storage systems hasn&#8217;t. The culprits: operating costs; storage network infrastructure costs; and capacity requirements that have risen faster than management productivity. </p>
<p>The flood of data continues to rise, but cost and complexity don&#8217;t have to rise with it. We can do better &#8211; and are.</p>
<p><strong>Courteous comments welcome, of course.</strong> I&#8217;ve been working with Nimble Storage lately and like what they&#8217;ve done.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/05/17/flash-and-the-re-architecting-of-storage/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>All de-dup works</title>
		<link>http://storagemojo.com/2011/05/03/all-de-dup-works/</link>
		<comments>http://storagemojo.com/2011/05/03/all-de-dup-works/#comments</comments>
		<pubDate>Tue, 03 May 2011 22:48:13 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Backup]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[SOHO/SMB]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2364</guid>
		<description><![CDATA[Forget the flame wars over moving window versus fixed block de-duplication. A recent paper, A Study of Practical Deduplication (pdf) from William J. Bolosky of Microsoft Research and Dutch T. Meyer of the University of British Columbia found that whole file deduplication achieves about 75% of the space savings of the most aggressive block level [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>Forget the flame wars over moving window versus fixed block de-duplication. A recent paper, <a href="http://www.usenix.org/events/fast11/tech/full_papers/Meyer.pdf" target="_blank">A Study of Practical Deduplication</a> (pdf) from William J. Bolosky of Microsoft Research and Dutch T. Meyer of the University of British Columbia found that whole file deduplication achieves about 75% of the space savings of the most aggressive block level de-dup for live filesystems and 87% of the savings for backup images.</p>
<p>Presented at <a href="http://www.usenix.org/events/fast11/" target="_blank">FAST 11</a> &#8211; and winner of a &#8220;Best Paper&#8221; award &#8211; the study looked at file systems from 857 Microsoft desktop computers over 4 weeks. The researchers asked permission to install rather invasive scanning software.</p>
<p>The scanner took a snapshot using Windows&#8217; Volume Shadow Copy Service and then recorded metadata about the file system itself. The scanner recorded each file&#8217;s metadata, retrieval and allocation pointers as well as the computer&#8217;s hardware and systems configuration. They excluded the pagefile, hibernation file, the scanner itself and the VSS snapshots the scanner created. </p>
<p>During scanning each file was broken into chunks using both fixed-block chunking and Rabin fingerprinting. They also identified whole file duplicates.</p>
<p>Rabin uses dynamically variable block sizes to maximize compression. Figuring out where to break the file adds to the overhead.</p>
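<p>The difference between the two chunking styles can be sketched in a few lines. This is not the paper&#8217;s scanner: a simple rolling byte sum stands in for a true Rabin polynomial fingerprint, the window and mask values are arbitrary assumptions, and real systems also enforce minimum and maximum chunk sizes.</p>

```python
import random


def fixed_chunks(data, size=64):
    """Cut every `size` bytes, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data, window=16, mask=0x3F):
    """Cut wherever a rolling sum over the last `window` bytes matches a pattern.

    Boundaries depend only on nearby content, so they move with the data.
    """
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # drop the byte leaving the window
        if (rolling & mask) == mask:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start != len(data):
        chunks.append(data[start:])      # trailing partial chunk
    return chunks


rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(4096))
shifted = b"\x00" + original             # one byte inserted at the front

shared_fixed = len(set(fixed_chunks(original)) & set(fixed_chunks(shifted)))
shared_cdc = len(set(content_defined_chunks(original))
                 & set(content_defined_chunks(shifted)))
print(shared_fixed, shared_cdc)
```

<p>Inserting one byte shifts every fixed block, so nothing matches. The content-defined boundaries resynchronize after the insertion and most chunks still deduplicate. The price is the per-byte hashing, which is the overhead mentioned above.</p>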
<p>The resulting data set was 4.1 TB compressed &#8211; too large to import into a database &#8211; and was further groomed to lose unneeded data.</p>
<p><strong>De-dup issues</strong><br />
De-duplication is expensive. You&#8217;re giving up direct access to the data to save capacity.</p>
<p>The expense is in I/Os and CPU cycles. Comparing each chunk&#8217;s fingerprint to all other chunks is nontrivial. De-duplication indirection adds to I/O latency. A file&#8217;s chunks are scattered around, requiring small and expensive random I/Os to read. </p>
<p>Older techniques, such as sparse files and Single Instance Storage, are more economical even if their compression ratios aren&#8217;t as high. Fewer CPU cycles, less indirection and good compression.</p>
<p><strong>The StorageMojo take</strong><br />
If capacity is expensive &#8211; read &#8220;enterprise&#8221; &#8211; and I/Os cheap &#8211; SSD or NVRAM in the mix &#8211; fancy dedup can make sense. It is at the margin of capacity cost and I/O availability that the value prop gets dicey.  </p>
<p>Low duty cycle storage &#8211; SOHO &#8211; with plenty of excess CPU and light transactions could use deduped primary storage. But with 10 TB of data to back up, most users wouldn&#8217;t notice the difference between whole file and 8KB Rabin. </p>
<p>It&#8217;s the price tag and user reviews the SOHO/SMB crowd will be looking at. </p>
<p><strong>Courteous comments welcome, of course.</strong> The paper also included some interesting historical data about Windows file systems that I covered on <a href="http://www.zdnet.com/blog/storage/10-years-of-windows-file-changes/1372" target="_blank">ZDNet</a>.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/05/03/all-de-dup-works/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Anobit&#8217;s voodoo that they do</title>
		<link>http://storagemojo.com/2011/04/11/anobits-voodoo-that-they-do/</link>
		<comments>http://storagemojo.com/2011/04/11/anobits-voodoo-that-they-do/#comments</comments>
		<pubDate>Mon, 11 Apr 2011 17:17:18 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2345</guid>
		<description><![CDATA[It sounds unlikely: technology that makes MLC flash as reliable and long-lived as more expensive SLC? Right! Or even better: make 3 bit per cell flash equivalent to 2 bit MLC? What are these guys smoking? Whatever it is, I&#8217;d like to try some myself. Because it looks like they have the goods. Flash problems [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>It sounds unlikely: technology that makes MLC flash as reliable and long-lived as more expensive SLC? Right!</p>
<p>Or even better: make 3 bit per cell flash equivalent to 2 bit MLC? What are these guys smoking?</p>
<p>Whatever it is, I&#8217;d like to try some myself. Because it looks like they have the goods.</p>
<p><strong>Flash problems</strong><br />
Flash vendors are close-mouthed about flash problems, not wanting to kill the goose that&#8217;s laying billion$ in revenue. But the problem most people worry about &#8211; life span &#8211; isn&#8217;t much of a problem. </p>
<p>So what is? Errors. The good news: most NAND flash errors have predictable causes. And what can be predicted can be corrected. </p>
<p>That&#8217;s Avraham Meir&#8217;s story and he&#8217;s the CTO of <a href="http://anobit.com/default.asp?PageID=3" target="_blank">Anobit</a>. We had a chat at SNW.</p>
<p>According to their web site:</p>
<blockquote><p>
Mr. Avraham Meir is an internationally recognized authority in NAND Flash technology and products. Prior to joining Anobit, Mr. Meir was VP Corporate Engineering at SanDisk (NASDAQ: SNDK), and the CTO at M-Systems (NASDAQ: FLSH, acquired by SanDisk).
</p></blockquote>
<p>Flash errors are commonly due to cross coupling between adjacent cells, read disturbs and program disturbs, and reading or programming the wrong cell. Retention impairments and endurance impairments are more common as geometries shrink. </p>
<p>Data retention phenomena and endurance effects are not random: normally high levels go to lower levels. In adjacent cells, one high and the other low, voltage will leak across to reduce the higher voltage.</p>
<p>So the first stage in reducing errors is knowing how flash impairments behave in order to predict an error. Then, working with flash vendors, they can repair the impairment in the wild.</p>
<p>By reducing the errors first and then applying signal processing and error correction, Anobit makes less-reliable flash look like more reliable flash.</p>
<p>Extending endurance has another effect: with flash, as the number of writes increases, so does the error rate. And as the endurance increases, the data retention time drops to as little as a few months. Write once and your data may last for years. But don&#8217;t trust that old thumb drive.</p>
<p><strong>The StorageMojo take</strong><br />
Anobit has an impressive IP portfolio: go to the <a href="http://appft1.uspto.gov/netahtml/PTO/search-bool.html" target="_blank">USPTO patent application search engine</a>, type in anobit and you&#8217;ll see what I mean. And they have about a dozen granted already.</p>
<p>What is evident is that flash as a medium is still in its infancy. If it turns out, as some predict, that flash will hit a geometry wall in a generation or two, Anobit&#8217;s technology looks to be transferable to whatever next-gen non-volatile device wins. In storage, reliability and endurance are problems that will never go away.</p>
<p><strong>Courteous comments welcome, of course.</strong>  </p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/04/11/anobits-voodoo-that-they-do/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Flash isn&#8217;t storage!</title>
		<link>http://storagemojo.com/2011/03/02/flash-isnt-storage/</link>
		<comments>http://storagemojo.com/2011/03/02/flash-isnt-storage/#comments</comments>
		<pubDate>Wed, 02 Mar 2011 20:55:55 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[Marketing]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2325</guid>
		<description><![CDATA[It isn&#8217;t about the capacity. It&#8217;s the performance. A few weeks ago I asked StorageMojo readers to help out Jim Handy of Objective Analysis, a semiconductor research firm. They did, and Jim got some surprising results. Now that I have finished interviewing 21 PCIe SSD users (both Fusion-io and Texas Memory Systems users, thanks to [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>It isn&#8217;t about the capacity. It&#8217;s the performance.</p>
<p>A few weeks ago I asked StorageMojo readers to help out Jim Handy of <a href="http://objective-analysis.com/" target="_blank">Objective Analysis</a>, a semiconductor research firm. They did, and Jim got some surprising results. </p>
<blockquote><p>
Now that I have finished interviewing 21 PCIe SSD users (both Fusion-io and Texas Memory Systems users, thanks to help from StorageMojo) I can step back and see the common threads.  One surprises me a bit.</p>
<p>The almost universal reply from users is that, although they were able to reduce the use of other hardware by adding flash to their system, they were unhappy that the price per gigabyte for flash is so much higher than HDDs.</p>
<p>In one or two cases the SSD displaced very expensive hardware.  One company was able to re-deploy a $500K high-speed SAN which it now uses to perform backups.  </p>
<p>A couple of others slashed their DRAM sizes by as much as 2/3, leading to power savings.  Answers.com decreased the server count in each of the company&#8217;s five data centers from four to one by eliminating the need to shard its databases.</p>
<p>There are side benefits to such moves.  </p>
<ul>
<li>Any future system expansion that uses the new approach will be able to get by with less hardware, saving money along the way.</li>
<li>A smaller system is usually easier to manage than a large one, saving in HR costs.</li>
<li>In some cases fewer software licenses will be required, cutting operating costs appreciably. </li>
<li>Smaller systems are likely to consume less power than their predecessors.</li>
</ul>
<p>This last may seem to be a small point until its effect over time is considered.  The last three all accumulate over time to become important sums.</p>
<p>Yes, this is a TCO argument, I agree, but it’s a GOOD TCO argument.  Although the cost per gigabyte of an SSD may be an order of magnitude greater than that of hard drives, very many systems can save money by deploying these devices.  Unfortunately it’s difficult to tell exactly how this will come about until after the SSD is deployed.</p>
<p>As I knuckle down to compile the survey data into a report I’ll be considering the impact of SSDs on the data center, and how the findings of these 21 companies can help others who have not yet adopted the technology.  </p>
<p>This exercise has reinforced my position that flash will become widespread in the data center, but not as storage; it will rather be thought of as “something else” that brings cost and power efficiencies that can’t be attained through more established methods.
</p></blockquote>
<p><strong>The StorageMojo take</strong><br />
Jim&#8217;s findings surprised me. But thinking a bit I&#8217;m surprised that I&#8217;m surprised.</p>
<p>Why? Back in &#8217;06 I wrote about the <a href="http://storagemojo.com/2006/10/03/utilization-vs-cost-the-capacity-illusion/" target="_blank">Capacity Illusion</a>, the storage industry equivalent of the economist&#8217;s &#8220;money illusion.&#8221;</p>
<p>The capacity illusion: most customers <i>need</i> IOPS, but they <i>buy</i> capacity. And then they moan about how much unused capacity they have or, in the SSD case, how much SSD capacity costs.</p>
<p>While faster and cheaper SSDs are rewiring data center architectures, something more powerful is needed to rewire customers. A concerted industry effort to classify and promote on IOPS, perhaps?</p>
<p><strong>Courteous comments welcome, of course.</strong> Speaking of research, if you haven&#8217;t done so yet, please support StorageMojo by completing the <a href="https://www.surveygizmo.com/s3/476397/ecdedba37108" target="_blank">Internal Storage Survey</a>. More about it in the previous post. Thanks!</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/03/02/flash-isnt-storage/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
	</channel>
</rss>

