<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>StorageMojo &#187; NAS, IP, iSCSI</title>
	<atom:link href="http://storagemojo.com/category/nas/feed/" rel="self" type="application/rss+xml" />
	<link>http://storagemojo.com</link>
	<description>Data storage info &#38; analysis</description>
	<lastBuildDate>Thu, 16 May 2013 18:38:20 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Is F5&#8217;s ARX file virtualization a success?</title>
		<link>http://storagemojo.com/2013/04/22/is-f5s-arx-file-virtualization-a-success/</link>
		<comments>http://storagemojo.com/2013/04/22/is-f5s-arx-file-virtualization-a-success/#comments</comments>
		<pubDate>Mon, 22 Apr 2013 17:35:55 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[Management]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2908</guid>
		<description><![CDATA[In response to the post on Avere&#8217;s architecture for fronting backend NAS filers &#8211; where StorageMojo said that no front-end to NAS boxes has succeeded &#8211; alert reader Jacob Marley asked &#8220;What about F5&#8217;s ARX to stitch/balance storage across multiple filers?&#8221; Good question! What can we deduce from publicly available sources? The F5 ARX product [...]]]></description>
				<content:encoded><![CDATA[<p>In response to the <a href="http://storagemojo.com/2013/04/19/fronting-nas-for-fun-and-profit/" target="_blank">post on Avere&#8217;s architecture</a> for fronting backend NAS filers &#8211; where StorageMojo said that no front-end to NAS boxes has succeeded &#8211; alert reader Jacob Marley asked &#8220;What about F5&#8217;s ARX to stitch/balance storage across multiple filers?&#8221;</p>
<p><a href="http://storagemojo.com/wp-content/uploads//2013/04/f5-fullcolor-lg.jpgwad58b3e319a2f6d68.jpeg"><img src="http://storagemojo.com/wp-content/uploads//2013/04/f5-fullcolor-lg.jpgwad58b3e319a2f6d68-e1366651712877.jpeg" alt="f5-fullcolor-lg.jpg;wad58b3e319a2f6d68" width="125" height="112" class="alignleft size-full wp-image-2911" /></a></p>
<p>Good question! What can we deduce from publicly available sources?</p>
<p>The F5 <a href="http://www.f5.com/products/hardware/arx-hardware/" target="_blank">ARX product line</a> is billed as an &#8220;intelligent file virtualization solution&#8221; that &#8220;. . .preserves the logical access to files regardless of their current location on storage.&#8221; Like earlier file switches:</p>
<blockquote><p>
The ARX device does not introduce a new file system; rather it acts as a proxy to the file systems that are already there.
</p></blockquote>
<p>ARX is not a storage device itself but a load-balancer for NAS filers. Then, per Mr Marley&#8217;s question, is ARX not a success?</p>
<p><strong>Competitive analysis</strong><br />
First up, let&#8217;s take a look at the latest quarterly 10-Q report, courtesy of the <a href="http://www.sec.gov/edgar.shtml" target="_blank">SEC&#8217;s EDGAR database</a>.</p>
<p>In &#8220;Management’s Discussion and Analysis of Financial Condition and Results of Operations&#8221; they describe their product revenues as</p>
<blockquote><p>
The majority of our revenues are derived from sales of our application delivery networking (ADN) products including our high end VIPRION chassis and related software modules; BIG-IP Local Traffic Manager, BIG-IP Global Traffic Manager, BIG-IP Link Controller, BIG-IP Application Security Manager, BIG-IP Edge Gateway, BIG-IP WAN Optimization module, BIG-IP Access Policy Manager, WebAccelerator; FirePass SSL VPN appliance; Traffix diameter signaling products; and <strong>ARX file virtualization products</strong>.
</p></blockquote>
<p>Unless this is a last-but-not-least ordering, it looks like management is not leading with the ARX products. But let&#8217;s look for more evidence of management&#8217;s priorities.</p>
<p>Combing through the <a href="http://www.f5.com/about/news/press/" target="_blank">F5 newsroom</a>, for instance, we find that the last <a href="http://www.f5.com/about/news/press/2011/20110711/" target="_blank">press release</a> on ARX, titled &#8220;F5’s New ARX Platforms Help Organizations Reap the Benefits of File Virtualization&#8221;, is almost 2 years old. It is surprising that later press releases don&#8217;t call out other success stories.</p>
<p>The most recent ARX white papers, &#8220;Reducing Storage Costs with F5 ARX&#8221; and &#8220;Enabling Flexibility with Intelligent File Virtualization&#8221; are both dated 2011. The ARX data sheet is from 2013 though.</p>
<p><strong>The StorageMojo take</strong><br />
It&#8217;s clear that F5 has backed away from the ARX technology &#8211; which they acquired with Acopia in 2007 &#8211; in favor of the Application Delivery Controller market. But does that mean that the Acopia/F5 ARX didn&#8217;t succeed?</p>
<p>Clearly, ARX succeeded for a while: F5 bought them after all. And the F5 PR archives have several success stories from 2009. </p>
<p>But within the current F5 context &#8211; where they have several high-growth segments &#8211; ARX is getting little investment. At another company, perhaps, ARX would be a success, but at F5 it clearly is not.</p>
<p>If I were a customer I would certainly look at ARX if I wanted to virtualize disparate NAS filers. But I&#8217;d be sure to have some contingency plans in place if F5 decided to end-of-life the product in the next 2 years.</p>
<p><strong>Courteous comments welcome, of course.</strong> Fun fact: F5&#8217;s name was inspired by one of my favorite movies &#8211; and an AFI top 100 selection &#8211; <i>Twister</i>, which popularized the Fujita scale (now the Enhanced Fujita scale) for tornado intensity.</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2013/04/22/is-f5s-arx-file-virtualization-a-success/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Fronting NAS for fun and profit</title>
		<link>http://storagemojo.com/2013/04/19/fronting-nas-for-fun-and-profit/</link>
		<comments>http://storagemojo.com/2013/04/19/fronting-nas-for-fun-and-profit/#comments</comments>
		<pubDate>Fri, 19 Apr 2013 23:47:25 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Cloud computing & storage]]></category>
		<category><![CDATA[Clusters]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[Management]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2902</guid>
		<description><![CDATA[The traditional model of NAS filers is handy if you only have a few. But once you get to 8 or 10 NAS filers your life gets complicated. Your oldest data is on the oldest filer and your active data is on the newest. If that new filer bottlenecks your entire system slows down. Hence [...]]]></description>
				<content:encoded><![CDATA[<p>The traditional model of NAS filers is handy if you only have a few. But once you get to 8 or 10 NAS filers your life gets complicated. </p>
<p>Your oldest data is on the oldest filer and your active data is on the newest. If that new filer bottlenecks, your entire system slows down.</p>
<p>Hence the old saw &#8220;you&#8217;ll love your first filer and hate your tenth.&#8221; System administrators will load balance by moving data back and forth, an inherently wasteful and error-prone exercise.</p>
<p><strong>A history of failure</strong><br />
A number of startups &#8211; such as Z-force and Zambeel &#8211; have attempted a fix. The general idea is a switch that virtualizes the backend filers to create a single pool. </p>
<p>While the concept sounds good, results have been dismal. No storage system whose primary function was to front-end existing NAS boxes has succeeded.</p>
<p><strong>Once more unto the breach</strong><br />
Now another contender enters the fray. <a href="http://www.averesystems.com" target="_blank">Avere Systems</a> has raised $50 million and is on v3 of their tin-wrapped software. </p>
<p>At NAB 2013 they announced the FXT 3800 Edge Filer. The 3800 tiers across RAM, SSD, SAS, backend NAS <a href="http://www.averesystems.com/News_PressReleases.aspx?ID=65" target="_blank">and cloud</a> in one namespace. </p>
<p>They&#8217;re understandably proud of their <a href="http://www.spec.org/sfs2008/results/res2013q2/sfs2008-20130318-00218.html" target="_blank">new SPECsfs2008 NFS result</a> with an FXT 3800 32 Node Cluster that reached 1,592,334 Ops/Sec. That beats NetApp, Isilon, Hitachi/BlueArc and everybody else, except Huawei&#8217;s OceanStor 8500, which used 24 file systems and more than twice the number of SSDs to get over 3 million Ops/Sec.</p>
<p>Oh, and they included transcontinental latency in the network, as you might see when using a cloud provider like Amazon Web Services, which was showing an Avere prototype version in their NAB booth.</p>
<p><strong>The hard question</strong><br />
After the briefing by Avere I asked Ron Bianchini, the CEO and cofounder, why Avere would escape the fate of their erstwhile predecessors.</p>
<p>I boiled his answer down to 4 points:</p>
<ul>
<li>Avere&#8217;s appliance is a read and write cache, so hot data I/O is handled directly and not routed to the backend filers. Typically, he says, only 1 out of 50 I/Os leave Avere for backend NAS, and for some workloads it is as little as 1 out of 200.</li>
<li>Their file system is the client of the backend filers, so they know exactly where the data is at all times. Furthermore, they&#8217;ve certified vendors like NetApp, so they handle the inevitable corner cases.</li>
<li>The system moves data across 4 tiers &#8211; DRAM, SSD, SAS, SATA &#8211; and the backend filers, so it is capable of extremely high performance, unlike products that relied upon backend performance.</li>
<li>They also manage blocks within files, so a change in a file doesn&#8217;t require rewriting the entire file, a popular feature in large file applications.</li>
</ul>
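<p>To put the first point in numbers, here is a minimal sketch of the cache offload arithmetic. The 1-in-50 and 1-in-200 ratios are the ones Bianchini quoted; the client load of 100,000 IOPS and the function names are hypothetical, chosen only for illustration.</p>

```python
def backend_share(one_in_n: int) -> float:
    """Fraction of client I/Os that are forwarded to the backend filers."""
    if one_in_n < 1:
        raise ValueError("one_in_n must be a positive integer")
    return 1 / one_in_n


def backend_iops(client_iops: float, one_in_n: int) -> float:
    """Backend I/O rate when only 1 in `one_in_n` client I/Os leaves the cache."""
    if one_in_n < 1:
        raise ValueError("one_in_n must be a positive integer")
    return client_iops / one_in_n


if __name__ == "__main__":
    client_iops = 100_000  # hypothetical client load, not an Avere figure
    for ratio in (50, 200):  # ratios quoted in the briefing
        print(
            f"1 in {ratio}: {backend_iops(client_iops, ratio):,.0f} backend IOPS "
            f"({backend_share(ratio):.1%} of client I/O)"
        )
```

<p>Under those assumptions the backend filers see 2% or 0.5% of the client I/O, which is why the appliance does not depend on backend performance.</p>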
<p><strong>The StorageMojo take</strong><br />
Rip and replace has never been popular. With today&#8217;s data volumes it is ever more unwieldy. </p>
<p>Avere&#8217;s performance and cost-effectiveness make it more than a simple pooling of NAS capacity: by reducing the load on current filers it extends their economic life while eliminating hot-spots and bottlenecks. You keep what you&#8217;ve got and make it faster and easier to manage. </p>
<p>Since most disk-based systems are way over-configured on capacity, this also means reduced CapEx and OpEx as fewer new filers are bought and less floor space, power and maintenance are needed. Given their scale-out architecture &#8211; minimum config is 3 nodes &#8211; you can add performance without adding more filers.</p>
<p>Bottom line: Avere, using 21st century technology, has built a new way to utilize existing resources while improving performance and reducing costs. That&#8217;s something no other NAS front-end ever managed.</p>
<p>They&#8217;ll do well.</p>
<p><strong>Courteous comments welcome, of course.</strong> Any Avere users want to comment on their experience? I haven&#8217;t done any work for Avere, but that could change.</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2013/04/19/fronting-nas-for-fun-and-profit/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Build a 3PB storage solution</title>
		<link>http://storagemojo.com/2013/04/01/build-a-3pb-storage-solution/</link>
		<comments>http://storagemojo.com/2013/04/01/build-a-3pb-storage-solution/#comments</comments>
		<pubDate>Mon, 01 Apr 2013 23:36:34 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Clusters]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>
		<category><![CDATA[SAN, FC]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2882</guid>
		<description><![CDATA[Choice is a great thing, unless there&#8217;s too much of it. And choice is what we have a lot of in today&#8217;s data storage market. A longtime StorageMojo reader has an interesting problem: architect a 3PB data storage facility. Can you help? Here&#8217;s what he wrote to StorageMojo. His email has been slightly edited for [...]]]></description>
				<content:encoded><![CDATA[<p>Choice is a great thing, unless there&#8217;s too much of it. And choice is what we have a lot of in today&#8217;s data storage market. </p>
<p>A longtime StorageMojo reader has an interesting problem: architect a 3PB data storage facility. Can you help?</p>
<p>Here&#8217;s what he wrote to StorageMojo. His email has been slightly edited for clarity and length.</p>
<blockquote><p>
One of my current problems is to design one of the nodes for a large research data storage facility. I&#8217;ve had to do this stuff in varying degrees, varying modalities and varying tech in times gone by.</p>
<p>I&#8217;ve been given a number and &#8220;capacity&#8221; to look into – somewhere near or around 3PB to begin with. We won&#8217;t even go down the path of discussing workloads or disk technology fit for purpose at this stage, but, something has struck me as interesting.</p>
<p>There is this clear divergence in disk technologies at the moment and I&#8217;m finding it hard to resolve what is the &#8220;right&#8221; one for the task.</p>
<p>Currently, I see:</p>
<ul>
<li>Heavy-end storage virtualisation frames [VSP, Symmetrix et al]</li>
<li>Big grid-ish things [IBM XIV etc]</li>
<li>Weird &#8220;stacked&#8221; commodity LSI Silicon [NetApp E5400/5500, SGI IS5500/IS5600, Dell MD3660F etc – all the same silicon I think?!]</li>
<li>Quasi virtualisation arrays with modular form factors (Hitachi&#8217;s HUS-VM?)</li>
<li>High performance dense trays in modular form factors [DDN's SFA-12K Exa and Grid scaler tech?]</li>
<li>Bog-standard performance dense trays in modular form factors [Hitachi HUS, EMC VNX, HP EVA, Dell Compellent etc.]</li>
<li>That wild crazy pure flash/RAM/SSD/NAND world that guys like Violin inhabit.</li>
</ul>
<p>Currently I&#8217;m trying to rationalise what I should be using for a storage platform that needs to scale big, but do it from a sensible economic standpoint, with density, performance of interconnect and throughput with gross mixed workloads being all big factors. </p>
<p>Some folks suggest to me that I should be happy enough with the LSI horizontally stacked 60-drive trays, but I am not sure the technology is tracking too well in terms of performance or density (Hitachi, DDN and maybe some others can now do 84 drives in as little as 4-RU!).</p>
<p>I guess my question to you is – where do you see that dense high performance market heading? I know the guys at the LLNL over your way were crowing about the NetApp E5400 LSI stuff where they managed their &#8220;1TB/sec&#8221; file system (I think it was Lustre based?), but I have to wonder if that could have been more efficiently carried out using a DDN GridScaler/SFA-12K-E etc.
</p></blockquote>
<p><strong>The StorageMojo take</strong><br />
Two issues here: is the segmentation our correspondent offers realistic and helpful? And what are the core architectural issues he needs to think about?</p>
<p>For the first issue an object store or a highly parallel NFS &#8211; like Panasas &#8211; seems to be indicated.</p>
<p>Given that this is a general purpose high-performance system, the critical problem seems to be how the system &#8211; however architected &#8211; handles file creation/update/deletion metadata. String enough disks together &#8211; 1,000 to 2,000 &#8211; and you can get a reasonable number of IOPS and, if you need more, put some SSDs in front. </p>
<p>There are a number of scale-out storage systems that will credibly and economically grow to 3PB. Metadata is often the bottleneck, as Isilon buyers have found when creating many small files.</p>
<p>A maximum performance spec &#8211; including rates for file creation and similar metadata operations &#8211; will probably help eliminate likely laggards, while a budget in dollars per usable TB or PB will eliminate the uneconomic products. </p>
<p>Vendors are welcome to offer their perspectives. Please just identify your company so we know where you&#8217;re coming from.</p>
<p>Practitioners who&#8217;ve done this, or something similar, are encouraged to share their hard-earned wisdom. 3PB is non-trivial today.</p>
<p><strong>Courteous comments welcome, of course.</strong> I&#8217;m going to start offering almost-free consulting for end-users. Stay tuned!</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2013/04/01/build-a-3pb-storage-solution/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Dear StorageMojo: should I go all SSD?</title>
		<link>http://storagemojo.com/2013/03/19/dear-storagemojo-should-i-go-all-ssd/</link>
		<comments>http://storagemojo.com/2013/03/19/dear-storagemojo-should-i-go-all-ssd/#comments</comments>
		<pubDate>Tue, 19 Mar 2013 16:33:03 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2874</guid>
		<description><![CDATA[This came in this morning&#8217;s email from a reader I&#8217;ll call Perplexed. How would you advise Perplexed? I&#8217;m looking at a new iSCSI storage system for two sites with ~ 20 servers each &#8211; 10TB each should do it. Picture two fairly usual manufacturing/mining sites, 200-500 users, email, file/Print, finance and production database services, MS [...]]]></description>
				<content:encoded><![CDATA[<p>This came in this morning&#8217;s email from a reader I&#8217;ll call Perplexed. How would you advise Perplexed?</p>
<blockquote><p>
I&#8217;m looking at a new iSCSI storage system for two sites with ~ 20 servers each &#8211; 10TB each should do it. Picture two fairly usual manufacturing/mining sites, 200-500 users, email, file/Print, finance and production database services, MS Domain etc.</p>
<p>Looking at IOPS &#8211; we would be serviced by 24 x 2.5&#8243; SAS 10K disks in a RAID6 array. So &#8211; the thought occurs &#8211; that SSD would easily match that performance with far fewer devices. </p>
<p>Say 15 VMs and 5 Servers per location. Requirement for about 5TB of data with limited growth &#8211; let&#8217;s say 10TB storage and under 1000 IOPS. </p>
<p>Throughput is not an issue except for backup and DR. If we can saturate 2 or 3 Gigabit Ethernet links that is adequate.</p>
<p>This would be served comfortably by 24 x 10K 2.5&#8243; RAID6 arrays at each location. Two of them for redundancy.</p>
<p>But &#8211; a single Intel 710 SSD could meet that IOPS rate and probably throughput as well. One SSD disk replacing an entire 24 disk array!</p>
<p>I would then ask, why have RAID at all? RAID is based on spindles being the smallest block for failure. With SSD, that block could be much smaller. The controller is already doing some ECC for wear management with overprovisioning. </p>
<p>Is there a new paradigm where the granularity is no longer a &#8220;spindle&#8221;? Should we simply over-provision by 50%? SSD generally comes with provisioned spare capacity &#8211; starting to sound like redundancy and error correction is built into the controllers to some degree already. </p>
<p>What would be ideal is a 1RU box full of 10TB solid state storage with 10G iSCSI &#8211; no separate disks.</p>
<p>Has SSD let us start to move beyond RAID? With the death of spindles and the huge IOPS available, is the entire R1, R5, R6, R10 debate finished? Does RAID have its place in a box full of chips, and if yes, does it look the same as what we know?</p>
<p>Has the world started to change in storage, or is SSD still just non-moving spindles?</p>
<p>10TB, 1000IOPS, 10G iSCSI &#8211; how would you buy it?
</p></blockquote>
<p><strong>Readers, what say you?</strong><br />
What suggestions do you have for Perplexed? The IOPS are low and he doesn&#8217;t suggest heavy bandwidth requirements either. But he does seem very interested in reliability.</p>
<p><strong>Update:</strong> Vendors are welcome to comment. I only ask that you identify yourself as such. <strong>End update.</strong></p>
<p><strong>The StorageMojo take</strong><br />
Aside from cost &#8211; I&#8217;d expect a minimum of $4-$5 per gigabyte or ≈$100k+ for the storage &#8211; the low IOPS requirement means SSD could be overkill. Perhaps a hybrid SSD/disk solution? SSDs can and do fail, so relying on a single SSD is as dangerous as relying on a single HDD.</p>
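<p>The cost estimate is simple arithmetic, sketched below. The $4-$5 per gigabyte range is from the paragraph above and the 10TB at each of two sites is from Perplexed&#8217;s email; decimal terabytes (1TB = 1,000GB) and the function name are assumptions made for illustration, and the redundant second array per site is left out.</p>

```python
def flash_cost_usd(capacity_tb: float, usd_per_gb: float, sites: int = 1) -> float:
    """Raw media cost for `sites` locations of `capacity_tb` each.

    Assumes decimal units: 1 TB = 1,000 GB.
    """
    if capacity_tb < 0 or usd_per_gb < 0 or sites < 0:
        raise ValueError("inputs must be non-negative")
    return capacity_tb * 1000 * usd_per_gb * sites


if __name__ == "__main__":
    # 10 TB per site at two sites, priced at the low and high end of the range
    for price in (4.0, 5.0):
        total = flash_cost_usd(10, price, sites=2)
        print(f"${price:.2f}/GB: ${total:,.0f}")
```

<p>That gives a range of $80,000 to $100,000 before controllers, enclosures or any redundant copies, so the total only goes up from there.</p>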
<p>A number of companies might be appropriate, including Nexsan, Nimble, TwinStrata, Nexenta, Nutanix, Tintri, Violin, Pure, Nimbus, Tegile and Avere among others. Some have features, such as WAN replication or cloud backup, that might prove useful. Others have VM support, but not with iSCSI.</p>
<p>Performance isn&#8217;t likely to be an issue with any of these vendors, so I&#8217;d focus on availability, management, support and then look at cost.</p>
<p><strong>Courteous comments welcome, of course.</strong> I&#8217;ve done work for some of the companies mentioned.</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2013/03/19/dear-storagemojo-should-i-go-all-ssd/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Amplidata&#8217;s distributed object store</title>
		<link>http://storagemojo.com/2012/04/17/amplidatas-distributed-object-store/</link>
		<comments>http://storagemojo.com/2012/04/17/amplidatas-distributed-object-store/#comments</comments>
		<pubDate>Tue, 17 Apr 2012 18:19:05 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Clusters]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2647</guid>
		<description><![CDATA[Our digital civilization requires data integrity and long-term preservation, and neither is assured by our current storage infrastructure. But progress continues. Latest case in point: Amplidata. This 4 year old company, based in Belgium with a growing US footprint, brings a new level of erasure code goodness to both problems with a cluster-based object [...]]]></description>
				<content:encoded><![CDATA[<p>Our digital civilization requires data integrity and long-term preservation, and neither is assured by our current storage infrastructure. But progress continues.</p>
<p>Latest case in point: Amplidata. This 4 year old company, based in Belgium with a growing US footprint, brings a new level of erasure code goodness to both problems with a cluster-based object store.</p>
<p>What erasure code goodness, you ask? The first &#8211; AFAIK &#8211; rateless erasure code, AKA fountain code, storage system in production use.</p>
<p><strong>And that is a good thing because?</strong><br />
Robustness and efficiency. </p>
<p>Amplidata claims storage durability well beyond RAID 6: 10 9&#8217;s of durability when data is spread across 16 drives with up to 4 failures &#8211; though the spreads can be much larger logically and geographically. They do this by breaking the data object into segments and adding redundancy data. </p>
<p>The redundancy data adds about 50% to the object size &#8211; more efficient than mirroring or triple replication. The benefit is that the system can lose hundreds of segments and still reconstruct the data.</p>
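<p>A rough comparison of the raw capacity each scheme needs per usable terabyte is sketched below. The 50% figure is the one Amplidata quotes above; mirroring and triple replication are taken as one and two full extra copies. The names are illustrative and nothing here reflects Amplidata&#8217;s actual encoding.</p>

```python
def raw_tb_needed(usable_tb: float, redundancy_fraction: float) -> float:
    """Raw capacity required when redundancy adds `redundancy_fraction` of the data size."""
    if usable_tb < 0 or redundancy_fraction < 0:
        raise ValueError("inputs must be non-negative")
    return usable_tb * (1 + redundancy_fraction)


# Redundancy added on top of the data itself, as a fraction of the data size.
SCHEMES = {
    "erasure code (about 50% added)": 0.5,  # figure quoted by Amplidata
    "mirroring (1 extra copy)": 1.0,
    "triple replication (2 extra copies)": 2.0,
}

if __name__ == "__main__":
    usable = 100  # hypothetical usable capacity in TB
    for name, fraction in SCHEMES.items():
        print(f"{name}: {raw_tb_needed(usable, fraction):.0f} TB raw")
```

<p>For the hypothetical 100TB of usable data that is 150TB raw versus 200TB for mirroring and 300TB for triple replication.</p>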
<p>Checksums protect each object against more than 1000 simultaneous bit errors. And each write goes to at least two controllers before it is committed.</p>
<p>What kind of monster controller is able to perform all this magic? The minimum configuration is 3 Xeon-based commodity controller nodes with as many 10-drive Atom-based storage nodes as you need.</p>
<p>Amplidata is optimized for bandwidth, not IOPS. With their latest software update they now spec each controller at 750MB/sec, and you can have as many controllers as you can afford.</p>
<p><strong>Sounds like Cleversafe</strong><br />
Cleversafe thought so too, and they&#8217;ve sued Amplidata for patent infringement. But Intel &#8211; who knows about patents and due diligence &#8211; invested after the suit. </p>
<p>Like NetApp&#8217;s suit against ZFS, this seems like a vanity project. Surely Cleversafe has more important things to invest in. If they don&#8217;t they&#8217;re in bigger trouble than we know.</p>
<p><strong>The StorageMojo take</strong><br />
The need for robust, inexpensive and massive storage has been a theme of StorageMojo&#8217;s for years. Object storage is the best solution to the problem of scale, while the kind of redundancy and end-to-end checksumming that Amplidata uses seems as robust as anything on the market today.</p>
<p>As for inexpensive, that is in the eye of the beholder, but Amplidata tells me that their newest storage node lists for less than $0.60/GB while consuming only 60 watts. That should be attractive to people running tape silos who want faster access and better redundancy.</p>
<p><strong>Courteous comments welcome, of course.</strong> I&#8217;m working with Amplidata to produce a video white paper on their technology, so stay tuned for more info on a promising company.</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2012/04/17/amplidatas-distributed-object-store/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Gridstore snags Geoff Barrall</title>
		<link>http://storagemojo.com/2012/01/10/gridstore-snags-geoff-barrall/</link>
		<comments>http://storagemojo.com/2012/01/10/gridstore-snags-geoff-barrall/#comments</comments>
		<pubDate>Tue, 10 Jan 2012 17:09:37 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Clusters]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>
		<category><![CDATA[SOHO/SMB]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2568</guid>
		<description><![CDATA[BlueArc and Drobo founder Geoff Barrall has a new perch: Gridstore, one of the companies I&#8217;ve been following for almost 3 years. Geoff is the new executive chairman. Formal announcement is expected this week. Gridstore&#8217;s concept is a low-cost scale-out NAS appliance designed for office environments. Each box is a small, low-power node with a [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.bluearc.com/" target="_blank">BlueArc</a> and <a href="http://www.drobo.com/" target="_blank">Drobo</a> founder Geoff Barrall has a new perch: <a href="http://gridstore.com/" target="_blank">Gridstore</a>, one of the companies I&#8217;ve been <a href="http://www.zdnet.com/blog/storage/google-style-storage-comes-to-the-smb/1323" target="_blank">following</a> for almost 3 years. Geoff is the new executive chairman. Formal announcement is expected this week.</p>
<p>Gridstore&#8217;s concept is a low-cost scale-out NAS appliance designed for office environments. Each box is a small, low-power node with a couple of TB. Stack &#8216;em for as much redundancy, capacity and performance as you want.</p>
<p>Think of it as the consumerization of hyper-scale technology. <a href="http://www.nutanix.com/" target="_blank">Nutanix</a> writ small.</p>
<p><strong>Gridstore details</strong><br />
Gridstore is offering a low-cost, scale-out network file server for $500 a node. That is too cheap for the enterprise storage companies to sell directly.</p>
<p>Founded 5 years ago, Gridstore got a beta out in 2010, and has been shipping for well over a year. They are a Microsoft CIFS protocol file server, using Microsoft’s storage server software. Running on small, 25 watt Atom-based boxes, a 6 node configuration is the size of a bread box.</p>
<p>Like other scale-out NAS systems, the Gridstore NAS has no single point of failure and can survive multiple node failures without going down or losing data.</p>
<p>They call their redundancy scheme RAIDg. When you set up a volume you dial in how many faults you want to survive and the software handles the rest.</p>
<p>Today the number of faults they can handle is limited to half the number of nodes minus one. If you have a 6 node configuration it can handle the loss of 2 nodes. They expect to relax that requirement in the future.</p>
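<p>As a quick sketch of that limit: the rule is read here as (nodes / 2) minus one, which matches the 6 node example. Rounding down for odd node counts is an assumption, since Gridstore&#8217;s behavior there isn&#8217;t stated, and the function name is made up for illustration.</p>

```python
def max_tolerated_faults(nodes: int) -> int:
    """Largest fault count allowed under the 'half the nodes minus one' rule.

    Integer division for odd node counts is an assumption, not a Gridstore spec.
    """
    if nodes < 1:
        raise ValueError("a grid needs at least one node")
    return max(nodes // 2 - 1, 0)


if __name__ == "__main__":
    for n in (2, 4, 6, 8, 12):
        print(f"{n} nodes: survives up to {max_tolerated_faults(n)} node failures")
```

<p>By this reading a 4 node grid survives one failure and a 12 node grid survives five.</p>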
<p><strong>The StorageMojo take</strong><br />
Haven&#8217;t spoken to Geoff about this, but Gridstore seems like a natural for him. If there&#8217;s a theme to his many endeavors, it&#8217;s making advanced NAS technology more accessible.</p>
<p>Gridstore fits the bill nicely. If there&#8217;s one complaint about Drobo, it&#8217;s the lack of box-level redundancy. Gridstore answers this objection, at a higher price point.</p>
<p>Drobo &#8211; over 200,000 units sold &#8211; has blazed a trail for bringing advanced storage technology to the masses at affordable prices. They may be the first, but as Gridstore and others demonstrate, they won&#8217;t be the last.</p>
<p><strong>Courteous comments welcome, of course.</strong> Hoping to make it to CES later this week. Readers: anyone I should make a point to see?</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2012/01/10/gridstore-snags-geoff-barrall/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Ask StorageMojo: 80,000 mailboxes need help</title>
		<link>http://storagemojo.com/2011/11/02/ask-storagemojo-80000-mailboxes-need-help/</link>
		<comments>http://storagemojo.com/2011/11/02/ask-storagemojo-80000-mailboxes-need-help/#comments</comments>
		<pubDate>Wed, 02 Nov 2011 16:00:28 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2543</guid>
		<description><![CDATA[A StorageMojo reader has a problem. Can you help? Our mail hub (80,000+ mailboxes) is virtualized with vSphere 4.1 with Red Hat Enterprise Linux 5 x64 and Dovecot 2.0 [an open source IMAP/POP3 email server for Linux/UNIX-like systems]. We are using HP LeftHand Networks P4300 iSCSI storage in a &#8220;network RAID10 setup of RAID10 storage&#8221; [...]]]></description>
				<content:encoded><![CDATA[<p>A StorageMojo reader has a problem. Can you help?</p>
<blockquote><p>
Our mail hub (80,000+ mailboxes) is virtualized on vSphere 4.1, running Red Hat Enterprise Linux 5 x64 and <a href="http://dovecot.org/index.html" target="_blank">Dovecot 2.0</a> [an open source IMAP/POP3 email server for Linux/UNIX-like systems]. We are using HP LeftHand Networks P4300 iSCSI storage in a &#8220;network RAID10 setup of RAID10 storage&#8221; for Dovecot indexes and multiple &#8220;network RAID1 of RAID5 storage&#8221; setups for actual mailboxes.</p>
<p>This is my take: our Dovecot indexes are getting hammered with lots of small I/O requests, about 8,000 IOPS continuously through an 8-hour working day, 75% of them writes. The indexes are fairly small (50 GB) and expected to grow to 100-150 GB, but need a lot of random I/O. We need real-time replication in storage (LeftHand is ok for us) and we think that SSD should shine in this situation. Bandwidth is not a problem (200-300 megabits per second of index traffic), but we need more IOPS.</p>
<p>The problem is the indexes, but our total mailbox capacity is expected to grow to 6 TB compressed using zlib compression in Dovecot.</p>
<p>We want to buy a storage appliance with the following requirements:</p>
<ul>
<li>vSphere 4.1 &#038; 5 certified storage, VAAI enabled (if possible)</li>
<li>iSCSI (1 Gbps)</li>
<li>High number of IOPS (at least 12,000+, most of them writes)</li>
<li>Small size (200 GB)</li>
<li>Fault tolerant (RAID, battery-backed write cache, power supply, fans, multiple gigabit uplinks, synchronous replication)</li>
<li>Cheap (less than $30k for the full setup)</li>
</ul>
<p>We want to buy at the beginning of 2012. Any product that fits?
</p></blockquote>
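<p>For readers who want to size this, here is the arithmetic behind the note as a small Python sketch. It assumes decimal units (1 megabit = 1,000,000 bits, 1 KB = 1,000 bytes) and that the bandwidth and IOPS figures describe the same index traffic; the variable and function names are illustrative.</p>

```python
total_iops = 8000        # continuous load quoted by the reader
write_fraction = 0.75    # 75% of requests are writes
target_iops = 12000      # minimum asked of the new appliance

write_iops = total_iops * write_fraction
read_iops = total_iops - write_iops

def avg_io_kb(megabits_per_sec, iops):
    """Average I/O size in KB implied by a bandwidth and an IOPS figure."""
    bytes_per_sec = megabits_per_sec * 1_000_000 / 8
    return bytes_per_sec / iops / 1000

low_kb = avg_io_kb(200, total_iops)
high_kb = avg_io_kb(300, total_iops)
headroom = target_iops / total_iops

print("writes/sec:", write_iops, "reads/sec:", read_iops)
print("average I/O size, KB:", low_kb, "to", high_kb)
print("target vs. current load:", headroom)
```

<p>The implied average request is only about 3 to 5 KB, which is why this is an IOPS problem and not a bandwidth problem. The 12,000 IOPS target is 1.5x the current load.</p>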
<p><strong>The StorageMojo take</strong><br />
Suspect price will be the most significant limiter. But the reader only needs index storage, not the whole shooting match. He&#8217;s pretty happy with LeftHand for mailbox storage.</p>
<p>But if we can solve both problems for him, why not? If he should relax some constraint, feel free to suggest it.</p>
<p>He&#8217;ll be watching the comments, so if you have questions please ask them. I&#8217;ll be following the comments as well.</p>
<p><strong>Courteous comments welcome, of course.</strong> His email was edited for clarity.</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/11/02/ask-storagemojo-80000-mailboxes-need-help/feed/</wfw:commentRss>
		<slash:comments>47</slash:comments>
		</item>
		<item>
		<title>Dear StorageMojo: make NFS go fast!</title>
		<link>http://storagemojo.com/2010/12/10/dear-storagemojo-make-nfs-go-fast/</link>
		<comments>http://storagemojo.com/2010/12/10/dear-storagemojo-make-nfs-go-fast/#comments</comments>
		<pubDate>Fri, 10 Dec 2010 15:26:04 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2226</guid>
		<description><![CDATA[Most of us know what it is like when a relationship goes bad: the sinking feeling that this just isn&#8217;t going to work. Can this configuration be saved? Dear StorageMojo: I joined a company last year that is running Oracle 10g on a NetApp NAS/SAN. Immediately I asked why they were not using Clustering, Oracle [...]]]></description>
				<content:encoded><![CDATA[<p>Most of us know what it is like when a relationship goes bad: the sinking feeling that this just isn&#8217;t going to work. </p>
<p><strong>Can this configuration be saved?</strong><br />
Dear StorageMojo:</p>
<blockquote><p>
I joined a company last year that is running Oracle 10g on a NetApp NAS/SAN.</p>
<p>Immediately I asked why they were not using Clustering, Oracle RAC, Oracle ASM or Fibre Channel. No answer.</p>
<p>Fast forward to a year later and they are asking me to deploy this to an I/O bound customer with hundreds of connections and lots of transactions to their DB over NFS.</p>
<p>Long story short it&#8217;s slow-w-w-w. They tried trunking multiple network connections. They tried tuning. They tried a bunch of stuff. And it&#8217;s still a dog.</p>
<p>How slow?</p>
<p>I have a screaming Dell R710 running a 7TB database attached over SAS to a set of MD3000 storage arrays. I am getting 450 MB/sec&#8230;..and this barely suffices&#8230;..</p>
<p>The &#8220;new&#8221; system they showed me gets 50 MB/sec&#8230;the same screaming Dell R710 but connected over NFS (instead of SAS) to the NetApp NAS.</p>
<p>Do you have any suggestions?</p>
<p>Thank you for reading this nightmare.<br />
Bob
</p></blockquote>
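<p>To put Bob&#8217;s numbers in perspective, a back-of-the-envelope sketch in Python. Bob doesn&#8217;t say what speed his network links are, so the 1 Gbit/s figure below is purely an assumption, as are the variable names.</p>

```python
import math

sas_mb_per_sec = 450   # Bob's current SAS-attached throughput
nfs_mb_per_sec = 50    # what the NFS-attached system delivers

# Assumption: each network link is 1 Gbit/s Ethernet, which is 125 MB/sec
# before protocol overhead. The letter does not state the link speed.
gige_raw_mb_per_sec = 1000 / 8

slowdown = sas_mb_per_sec / nfs_mb_per_sec
links_to_match_sas = math.ceil(sas_mb_per_sec / gige_raw_mb_per_sec)
single_link_utilization = nfs_mb_per_sec / gige_raw_mb_per_sec

print("NFS is slower by a factor of", slowdown)
print("GigE links needed to match SAS:", links_to_match_sas)
print("Share of one GigE link in use:", single_link_utilization)
```

<p>If the links really are GigE, 50 MB/sec is only 40% of a single link, which hints that the wire may not be the bottleneck. Matching the SAS setup would take at least 4 trunked links before overhead.</p>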
<p>Poor Bob! He&#8217;ll be getting grief from the client for months, maybe years to come, unless this gets fixed.</p>
<p><strong>The StorageMojo take</strong><br />
Maybe Bob could have been better about developing a relationship with the guys configuring the systems. More questions, fewer conclusions, at first.</p>
<p>Suggestions to the customer for acceptance testing might be in order. </p>
<p>But there are 2 problems here:</p>
<ol>
<li>What to do now.</li>
<li>How to keep this from happening again.</li>
</ol>
<p>What would you suggest to Bob on either or both topics? I&#8217;ve asked him to watch the comments, so if more info would be useful, I hope he&#8217;ll provide it.</p>
<p><strong>Courteous comments <strike>welcome</strike> needed.</strong> When a multi-billion dollar near-sighted telescope can get sent into orbit, it is surprising more IT projects don&#8217;t go wrong.</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2010/12/10/dear-storagemojo-make-nfs-go-fast/feed/</wfw:commentRss>
		<slash:comments>26</slash:comments>
		</item>
		<item>
		<title>Jack be Nimble</title>
		<link>http://storagemojo.com/2010/11/08/jack-be-nimble/</link>
		<comments>http://storagemojo.com/2010/11/08/jack-be-nimble/#comments</comments>
		<pubDate>Mon, 08 Nov 2010 21:43:03 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Backup]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2197</guid>
		<description><![CDATA[Talked to Nimble Storage a few months ago. The 1st time they sounded cool and now I know why. What they do Nimble builds a converged storage appliance out of commodity hard drives and SSDs that offers high performance &#8211; is there any other kind? &#8211; and iSCSI, backup, a form of dedup and WAN [...]]]></description>
				<content:encoded><![CDATA[<p>Talked to Nimble Storage a few months ago. The 1st time they sounded cool and now I know why.</p>
<p><strong>What they do</strong><br />
Nimble builds a converged storage appliance out of commodity hard drives and SSDs that offers high performance &#8211; is there any other kind? &#8211; and iSCSI, backup, a form of dedup and WAN replication. The pitch is EqualLogic &#038; Data Domain merged into a single low-cost appliance. Only better.</p>
<ul>
<li>iSCSI + dedup</li>
<li>Capacity-optimized snapshots</li>
<li>SATA + flash instead of high-rpm drives</li>
<li>Can run off a remote snapshot</li>
</ul>
<p>EL &#038; DD sell a lot of kit, so this could work.</p>
<p><strong>Claim to fame</strong><br />
Cache Accelerated Sequential Layout is what Nimble calls their secret sauce.</p>
<p>CASL combines a variable block size, in-line compression, application-specific block sizes and checksum and compression data kept in the block header. They coalesce the blocks and only write in full stripes to disk.</p>
<p>The box has a large flash-based cache where the full stripe writes are also written, overcoming the small write performance hit that flash shares with parity RAID. This also ensures a high percentage of cache hits on the first read.</p>
<p>The system maintains an index of where all the blocks are written. Typically, this index is also held in flash for maximum lookup performance.</p>
<p><strong>App-specific block sizes</strong><br />
Nimble uses variable block sizes to improve performance. For example, the last three versions of Exchange have all used different block sizes. CASL recognizes the different versions of Exchange and dynamically adjusts its block size to the best fit.</p>
<p>They claim a 2x performance advantage on Exchange databases.</p>
<p><strong>Coalesce</strong><br />
They take the variable-size blocks, coalesce them into big chunks and write those out: full block writes to flash and full stripe writes to disk. Result: fast reads &#038; writes across both media.</p>
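<p>A toy illustration of the coalescing idea in Python: compress variable-size blocks, append them to a log, keep an index of where each one landed, and emit only full stripes. This is not Nimble&#8217;s code. The 64KB stripe size, the use of zlib and every name here are assumptions made for the sketch.</p>

```python
import os
import zlib

STRIPE_BYTES = 64 * 1024   # hypothetical full-stripe size, not Nimble's

def coalesce(blocks, stripe_bytes=STRIPE_BYTES):
    """Compress variable-size blocks and pack them into full stripes.

    Returns (stripes, index, remainder). Each stripe is exactly
    stripe_bytes long, index maps a block id to its (offset, length)
    in the packed log, and remainder holds the bytes still waiting
    for enough data to fill the next stripe.
    """
    log = bytearray()
    index = {}
    for block_id, data in blocks:
        packed = zlib.compress(data)
        index[block_id] = (len(log), len(packed))
        log += packed
    full = len(log) // stripe_bytes * stripe_bytes
    stripes = [bytes(log[i:i + stripe_bytes]) for i in range(0, full, stripe_bytes)]
    return stripes, index, bytes(log[full:])

# Five blocks of different sizes, filled with random bytes for the demo.
sizes = [4096, 8192, 32768, 65536, 16384]
blocks = [(n, os.urandom(size)) for n, size in enumerate(sizes)]
stripes, index, remainder = coalesce(blocks)
print(len(stripes), "full stripes,", len(remainder), "bytes still buffered")
```

<p>The index plays the role of the block map mentioned above: given a block id, it says where in the log to read and how many bytes to decompress.</p>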
<p>Their page sizes are variable but small, ranging from 4KB to 64KB. The finer granularity means that frequent snapshots are much smaller than on large-page-size systems like EqualLogic.</p>
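<p>Why page size matters for snapshots, as a worst-case sketch in Python. It assumes every small write dirties a different page, so the snapshot has to keep one whole page per write. The write count is hypothetical, and the 16MB page is a stand-in for a large-page system, not a vendor spec.</p>

```python
def snapshot_delta_mb(scattered_writes, page_kb):
    """Worst-case snapshot growth when every write dirties its own page."""
    return scattered_writes * page_kb / 1024

writes = 10000   # hypothetical number of small, scattered overwrites
for page_kb in (4, 64, 16 * 1024):   # 16 MB stands in for a large-page system
    print(page_kb, "KB pages:", snapshot_delta_mb(writes, page_kb), "MB")
```

<p>Same writes, very different snapshot sizes: the page size sets the floor on how much data each changed byte drags along.</p>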
<p><strong>The StorageMojo take</strong><br />
There&#8217;s no reason that data protection should be separate from data storage. We&#8217;ve been moving towards integration since the CDP craze. </p>
<p>The average business wants to store and protect their data and they don&#8217;t want to spend much time or money on it. Nor should they. </p>
<p>With powerful commodity processors and nickel-per-GB storage there&#8217;s a huge market for a box &#8211; or 2 or 3 boxes &#8211; that:
<ul>
<li>Stores terabytes of data</li>
<li>Protects that data with local replication and frequent snapshots</li>
<li>Auto-connects to cloud storage for DR and archiving</li>
<li>Doesn&#8217;t confuse users with LUNs and stripes</li>
<li>Offers Time Machine-like data recovery to end users</li>
</ul>
<p>It will look like magic &#8211; as any sophisticated technology should &#8211; and you&#8217;ll buy it at Office Max. As with any volume product the key will be architecting to maximize the user experience at an affordable price point.</p>
<p>Nimble certainly has the right idea.</p>
<p><strong>Courteous comments welcome, of course.</strong> I don&#8217;t know which analyst the Nimble guys are blowing their money on, but it isn&#8217;t me.</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2010/11/08/jack-be-nimble/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Ask StorageMojo: EqualLogic vs LeftHand &amp; more</title>
		<link>http://storagemojo.com/2009/10/21/ask-storagemojo-equallogic-vs-lefthand-more/</link>
		<comments>http://storagemojo.com/2009/10/21/ask-storagemojo-equallogic-vs-lefthand-more/#comments</comments>
		<pubDate>Wed, 21 Oct 2009 20:29:11 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[NAS, IP, iSCSI]]></category>
		<category><![CDATA[SOHO/SMB]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=1658</guid>
		<description><![CDATA[These requests came in over the transom in the last couple of days. Maybe some StorageMojo readers have wisdom to share. I have a question I hope you can help me with. My boss asked me . . . to research HP Left-hand SANs and Dell Equallogic SANs. Do you have any special knowledge of [...]]]></description>
				<content:encoded><![CDATA[<p>These requests came in over the transom in the last couple of days. Maybe some StorageMojo readers have wisdom to share. </p>
<blockquote><p>
I have a question I hope you can help me with.  My boss asked me . . . to research HP LeftHand SANs and Dell EqualLogic SANs.  Do you have any special knowledge of these products and, if so, would you make an informal recommendation?
</p></blockquote>
<p>What say you, StorageMojo readers? If you evaluated both, why did you make the choice you did? Vendors welcome to comment, but please identify yourself as such. </p>
<p><strong>The StorageMojo take</strong><br />
AFAIK, both products are good iSCSI systems. Both are backed by major corporations. EqualLogic may be stronger in the channel today, but HP has channel chops as well. HP&#8217;s blade servers may be a more expandable platform, but EqualLogic&#8217;s software portfolio may be more affordable.</p>
<p>Translation: you could do worse than either of these. </p>
<p><strong>Part II</strong><br />
Another customer perplexity: service.</p>
<blockquote><p>
We have a pair of HP disk arrays, EVA 8000 and 6000, and I am looking for a consultant to help us with storage planning.  Do you do such work or could you recommend someone to me?  I am looking for someone who goes beyond just being a seller; I have plenty of potential sellers already.
</p></blockquote>
<p>The writer is in a small city in the Mountain West, so you should be used to working remotely with clients. No, not in Arizona.</p>
<p><strong>The StorageMojo take</strong><br />
HP folks may be wondering: why doesn&#8217;t he call HP? My guess: not big enough  for a direct engagement.</p>
<p><strong>Courteous comments welcome, of course.</strong>  </p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2009/10/21/ask-storagemojo-equallogic-vs-lefthand-more/feed/</wfw:commentRss>
		<slash:comments>43</slash:comments>
		</item>
		<item>
		<title>Cloud storage for $100 a terabyte</title>
		<link>http://storagemojo.com/2009/09/01/cloud-storage-for-100-a-terabyte/</link>
		<comments>http://storagemojo.com/2009/09/01/cloud-storage-for-100-a-terabyte/#comments</comments>
		<pubDate>Tue, 01 Sep 2009 12:50:37 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Cloud computing & storage]]></category>
		<category><![CDATA[Future Tech]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=1555</guid>
		<description><![CDATA[Imagine cloud storage that didn&#8217;t cost much more than bare drives. High density storage with RAID 6 protection, reasonable bandwidth and web-friendly HTTPS access. And really, really cheap. Raw disk cost is only 5-10% of a RAID systems cost. The rest goes for corporate jets, sales commissions, 3 martini lunches, tradeshows, sheetmetal, 2 Intel x86 [...]]]></description>
				<content:encoded><![CDATA[<p>Imagine cloud storage that didn&#8217;t cost much more than bare drives. High density storage with RAID 6 protection, reasonable bandwidth and web-friendly HTTPS access.</p>
<p>And really, really cheap. </p>
<p>Raw disk cost is only 5-10% of a RAID system&#8217;s cost. The rest goes for corporate jets, sales commissions, 3 martini lunches, tradeshows, sheetmetal, 2 Intel x86 mobos, obscene profits and a few pale and blinking engineers in a windowless lab who make the whole thing work. </p>
<p><strong>Storage for ascetics</strong><br />
But let&#8217;s say you didn&#8217;t want the 3 martini lunch or the barely-clad booth babes. All you want is really <strike>cheap</strike> economical, reasonably reliable storage.</p>
<p>You aren&#8217;t running the global financial system &#8211; what&#8217;s left of it anyway &#8211; and you don&#8217;t have a 2500 person call center hammering on a few dozen Oracle databases 7 x 24. No, you&#8217;re thinking a quiet cloud storage business for SMBs, maybe backup and some light file sharing, that will give you a nifty little revenue stream with annual renewals so you can see trouble coming 12 months in advance.</p>
<p>Enough redundancy so when something breaks you can wait until morning to fix it instead of an 0300 pajama run to the data center. Easy connectivity so you aren&#8217;t blowing the savings on Cisco switches. </p>
<p><strong>Bliss</strong><br />
Well, you aren&#8217;t the only one. <a href="https://www.backblaze.com/" target="_blank">Backblaze</a>, a new online backup provider, designed the Storage Pod for their own use and are sharing it with everyone. They aren&#8217;t in the hardware business and I think they figured sharing it would be a nice little attention-getting device.</p>
<p>It worked for me.  Here&#8217;s the box &#8211; which they are using in production.</p>
<p><a href="http://storagemojo.com/wp-content/uploads//2009/08/backblaze_box.jpg"><img src="http://storagemojo.com/wp-content/uploads//2009/08/backblaze_box.jpg" alt="backblaze_box" title="backblaze_box" width="480" height="324" class="aligncenter size-full wp-image-1562" /></a></p>
<p>Here&#8217;s an exploded diagram with a simplified BOM:</p>
<p><a href="http://storagemojo.com/wp-content/uploads//2009/08/backblaze_storage_pod_bom.jpg"><img src="http://storagemojo.com/wp-content/uploads//2009/08/backblaze_storage_pod_bom.jpg" alt="backblaze_storage_pod_bom" title="backblaze_storage_pod_bom" width="480" height="670" class="aligncenter size-full wp-image-1563" /></a></p>
<p>And then there&#8217;s the (free) software. 64-bit Debian Linux, IBM&#8217;s open source JFS file system and HTTPS access. Put a stateless webserving front end on it and you&#8217;re good to go. Scale out the webserver and add Storage Pods to grow the system.</p>
<p><strong>The StorageMojo take</strong><br />
This isn&#8217;t general purpose or high-performance storage.  Nor is it backed by a global network of 7 x 24 service professionals. But there are a lot of applications out there that just need a big bit bucket. This is it.</p>
<p>No one is manufacturing this for you either &#8211; which is a good thing. If you don&#8217;t know what you&#8217;re doing you can get in a lot of trouble with a lot of data real fast. Want to be the Bernie Madoff of cloud storage? </p>
<p>But the density is good, the performance is reasonable, the availability is decent and the price is right. This is a DC-3, not a 747. It is all you need for the right application.</p>
<p><strong>Courteous comments welcome, of course.</strong>  See the plans and get the box model in the <a href="https://www.backblaze.com/petabytes-on-a-budget-how-to-build-cheap-cloud-storage.html" target="_blank">paper</a> Backblaze put together.</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2009/09/01/cloud-storage-for-100-a-terabyte/feed/</wfw:commentRss>
		<slash:comments>49</slash:comments>
		</item>
		<item>
		<title>Configure a 100 TB HD video infrastructure</title>
		<link>http://storagemojo.com/2009/06/07/configure-a-100-tb-hd-video-infrastructure/</link>
		<comments>http://storagemojo.com/2009/06/07/configure-a-100-tb-hd-video-infrastructure/#comments</comments>
		<pubDate>Mon, 08 Jun 2009 01:20:37 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Clusters]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>
		<category><![CDATA[Video]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=1409</guid>
		<description><![CDATA[The video folks have an interesting set of problems: large needs; major bandwidth; time-critical collaboration; lots of metadata; and more. Like budgets. I do some video production myself and empathize. They are today where most of us will be in 10 years: lots of large files; local and remote sharing; processor and bandwidth intensive operations; [...]]]></description>
				<content:encoded><![CDATA[<p>The video folks have an interesting set of problems: large needs; major bandwidth; time-critical collaboration; lots of metadata; and more. Like budgets. I do some <a href="http://www.youtube.com/user/StorageMojo" target="_blank">video production</a> myself and empathize.</p>
<p>They are today where most of us will be in 10 years: lots of large files; local and remote sharing; processor and bandwidth intensive operations; large archives of wanted but rarely accessed files.  Today high-end video folks are working at 2k, 4k and, sometimes, 8k video resolutions &#8211; and 10 years from now I wouldn&#8217;t be surprised if home users were too.</p>
<p>What prompts this is a note I received from, well, I&#8217;ll let him introduce himself.</p>
<blockquote><p>
I have a boutique post-production company and I&#8217;m a filmmaker. We are small, under a dozen, but swell to a few times that size with freelancers on a project-by-project basis. Because we work with very high resolution media, we need a lot of space, and very high throughput to each user.  . . . [W]e&#8217;re all working with 2K and 4K media (300 and 1200MBps respectively to EACH user) and 3D animation rendering. . . . We use a mix of Linux, Windows, and OS X clients. In total, we could easily make use of 100TB+ right now, and prefer to stop archiving everything to tape and deleting it, but rather migrate to another tier of storage but keep in one global namespace with the tape just for disaster recovery. We also need security administration.</p>
<p>I can&#8217;t find a storage system that does all this. DataDirect Networks seems to be the du jour high-end storage for my industry, and supposing I&#8217;m willing to finance that big-ticket brand, they still don&#8217;t have a filing system answer. They&#8217;re suggesting StorNext or CXFS, and I know the multi-user scalability and expansion limitations well (can anybody say &#8220;forklift&#8221;?). </p>
<p>The closest I&#8217;ve come is Lustre. It seems like it would fit the bill nicely, especially since we&#8217;re savvy to integrate in-house, except that it is Linux only, and NFS/CIFS gateways don&#8217;t seem like a great idea. I keep hearing they&#8217;re working on at least a Windows client, but who knows when it will be ready?</p>
<p>Can you help at all? What have I overlooked? Doesn&#8217;t anyone make what I&#8217;m looking for?
</p></blockquote>
<p><strong>Short answer to last question:</strong><br />
No.</p>
<p><strong>Longer answer:</strong><br />
No. But there are workarounds.</p>
<p>For those new to video, here&#8217;s an abbreviated chart of some video rates in megabytes per second:<br />
<a href="http://storagemojo.com/wp-content/uploads//2009/06/video_data_rates1.png"><img src="http://storagemojo.com/wp-content/uploads//2009/06/video_data_rates1.png" alt="video_data_rates1" title="video_data_rates1" width="471" height="268" class="aligncenter size-full wp-image-1420" /></a> [Adapted from <a href="http://www.integritydatasystems.net/Video_Data_Rates.htm" target="_blank">Integrity Data Systems</a> which offers the whole chart. Aspect ratios and frame rates left out.]<br />
<strong>Update:</strong> Larry Jordan, a writer and trainer in video editing, graciously wrote to let me know that the above data rates are uncompressed &#8211; and that most production houses would use compressed data. The amount of compression varies based on the codec as Larry explains in this <a href="http://www.larryjordan.biz/articles/lj_video_data_rates.html" target="_blank">informative post</a>.<strong> End update.</strong></p>
<p><strong>Issue 1: Interconnects</strong><br />
GigE won&#8217;t even handle 32-bit RGB standard def video. And when you get into HD video it gets hairier fast. Trunk multiple GigE&#8217;s? 10GbE? 4x Infiniband? FC? eSATA or PCI-e direct attached storage? </p>
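<p>To see how quickly the interconnect question gets ugly, a small Python sketch using the per-user rates from the note above. Line rates are raw (bits divided by 8) and ignore protocol overhead, so real numbers will be worse; the names are illustrative.</p>

```python
import math

# Per-user rates quoted in the filmmaker's note, in megabytes per second.
stream_mb_per_sec = {"2K": 300, "4K": 1200}

# Raw line rates in megabytes per second, before any protocol overhead.
link_mb_per_sec = {"GigE": 125, "10GbE": 1250}

def links_per_stream(stream, link):
    """Whole links needed to carry one user's stream at raw line rate."""
    return math.ceil(stream_mb_per_sec[stream] / link_mb_per_sec[link])

for stream in stream_mb_per_sec:
    for link in link_mb_per_sec:
        print(stream, "over", link, "needs", links_per_stream(stream, link), "link(s) per user")
```

<p>One 4K seat would eat 10 trunked GigE links, or nearly all of a single 10GbE port, before overhead. That is per user.</p>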
<p><strong>Issue 2: Virtualization</strong><br />
A single address space is a wonderful thing. You&#8217;ll need a software layer that clusters multiple boxes. You&#8217;ll also probably want to build an archive infrastructure that is distinct from your higher performance working set storage, but some vendors will disagree.</p>
<p>Likely software suspects include <a href="http://www.ibrix.com/" target="_blank">IBRIX</a>, <a href="http://www.parascale.com/" target="_blank">Parascale</a>, <a href="http://www.caringo.com/" target="_blank">Caringo</a>,  <a href="http://www.object-matrix.com/" target="_blank">MatrixStore</a>, <a href="http://www.bycast.com/" target="_blank">Bycast</a> and <a href="http://www.permabit.com/" target="_blank">Permabit</a>.</p>
<p>On the combined HW/SW side there&#8217;s <a href="http://www.panasas.com/" target="_blank">Panasas</a> and <a href="http://www.isilon.com/" target="_blank">Isilon</a>.  Something tells me there are some other options, like HP&#8217;s Extreme Data Storage 9100, that are also applicable. </p>
<p>Lustre is not a product I would recommend since it was designed for HPC, a market where PhDs work as sysadmins. Sun may have tamed it since they bought it, but it is a non-trivial piece of software. </p>
<p><strong>Come one, come all</strong><br />
StorageMojo readers are invited to offer their 2¢ worth. Architecting is non-trivial, especially if money is an object. </p>
<p><strong>Update:</strong><br />
Our interlocutor wrote in to add some detail:</p>
<blockquote>
<p>thanks for the response. Here&#8217;s some answers:</p>
<p> &#8211; We can manage expensive interfaces like 10GigE and Infiniband QDR. We&#8217;ve been paying for dual-channel 4Gb FC for the past few years, after all. I just want to also allow standard Gigabit connections to the cheap seats without a lot of complexity. So I guess the jargon for that would be &#8220;multiprotocol&#8221; switching?</p>
<p> &#8211; The large naming space might be a luxury. The fact is that jobs come in one of three general sizes, and we could have volumes of that size waiting to take on new jobs as they come in, so at least there is one namespace per job. As you said, capacity is cheap&#8230;</p>
<p> &#8211; Truth is I am pretty savvy, but other than that we have a lot of power desktop users but not sysadmin types. I contract some people with steady part-time work, but it has been our business model to try to keep as many of our full-time people on the creative and producing side as possible, and not in support/administration. </p>
<p>The one thing I don&#8217;t understand is what you say about Infiniband not being so great when there&#8217;s lots of node churn?</p>
<p>I know what you mean about DAS, but I think I&#8217;ve ruled out distributing the data through push/pull from a central repository. The fact is jobs just move too fast through here for that, and we often have about two seconds notice that we need to bring a certain job&#8217;s data to System X, Y or Z to do work on it. It&#8217;s very dynamic.</p>
<p>I see some brands in your blog post I haven&#8217;t checked on yet.</p>
<p>What turned me onto Lustre is that Frantic Films in London has deployed it. They&#8217;re the only ones AFAIK.<br />
<strong>End update.</strong></p>
</blockquote>
<p><strong>The StorageMojo take</strong><br />
Some thoughts on the infrastructure issues.</p>
<p>Capacity is cheap, network bandwidth is expensive. Raw SATA disk is less than $0.10/GB. 10GbE switch ports are over a grand apiece. Infiniband is better from a price/performance perspective, but not as friendly for networks where there is much node churn &#8211; unless that&#8217;s been fixed in the last few years.</p>
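<p>The same point as arithmetic, in a Python sketch. The prices are the bounds quoted above; the seat count of 12 is a hypothetical taken from &#8220;under a dozen&#8221; staff with one port each, and storage-side ports are not counted.</p>

```python
disk_cost_per_gb = 0.10   # upper bound quoted for raw SATA, dollars
port_cost = 1000          # lower bound quoted for a 10GbE switch port, dollars

capacity_tb = 100         # the filmmaker's stated need
seats = 12                # hypothetical: one 10GbE port per full-time seat

raw_disk_cost = capacity_tb * 1000 * disk_cost_per_gb
client_port_cost = seats * port_cost

print("100 TB of raw SATA, at most:", round(raw_disk_cost))
print("12 client 10GbE ports, at least:", client_port_cost)
```

<p>On these numbers the switch ports for a dozen desks cost more than 100 TB of bare disk, and that is before freelancers show up.</p>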
<p>Direct attached storage will give you the best performance &#8211; especially with 4k. The new PCI-e attached arrays from <a href="http://www.jmr.com/" target="_blank">JMR</a> and others can offer up to 4,000 MB/sec bandwidth. Stripe across 4 of those and you&#8217;ll be able to handle 8k.</p>
<p>Transaction processing is well on its way to niche status, like mainframes and hierarchical databases that once ruled the earth. It is a big file world out there and the files are getting bigger every year.</p>
<p><strong>Courteous comments welcome, of course.</strong>  I&#8217;ve done work for many of these folks &#8211; but not all &#8211; at one time or another. </p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2009/06/07/configure-a-100-tb-hd-video-infrastructure/feed/</wfw:commentRss>
		<slash:comments>28</slash:comments>
		</item>
		<item>
		<title>HP/LeftHand: cluster market shapes up</title>
		<link>http://storagemojo.com/2008/10/08/hplefthand-cluster-market-shapes-up/</link>
		<comments>http://storagemojo.com/2008/10/08/hplefthand-cluster-market-shapes-up/#comments</comments>
		<pubDate>Thu, 09 Oct 2008 01:33:04 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Clusters]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>
		<category><![CDATA[SOHO/SMB]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=966</guid>
		<description><![CDATA[Hewlett-Packard&#8217;s acquisition of LeftHand Networks shows how cluster storage is going mainstream &#8211; and how HP plans to be right in the middle of it. First PolyServe and now LeftHand. This is about commodity-based clusters Not iSCSI or GigE or 10 GigE as a storage interconnect. Fibre Channel&#8217;s failure to move downmarket &#8211; and [...]]]></description>
				<content:encoded><![CDATA[<p></p><p>Hewlett-Packard&#8217;s acquisition of LeftHand Networks shows how cluster storage is going mainstream &#8211; and how HP plans to be right in the middle of it. First <a href="http://storagemojo.com/2007/03/12/hps-bold-move-into-storage-clusters/" target="_blank">PolyServe</a> and now LeftHand. </p>
<p><strong>This is about commodity-based clusters</strong><br />
Not iSCSI or GigE or 10 GigE as a storage interconnect. Fibre Channel&#8217;s failure to move downmarket &#8211; and InfiniBand&#8217;s similar problem &#8211; means GigE is the only game in town. </p>
<p>Reaching the huge, not currently imploding, SMB market requires meeting people where they live. SMBs don&#8217;t live in Fibre Channel glass houses. GigE isn&#8217;t ideal, but it&#8217;s cheap and it works.</p>
<p><strong>Did HP overpay?</strong><br />
$360 million isn&#8217;t pocket change, but it is only about 4x the $86 million investors put in. The investors get some nice coin, but it isn&#8217;t the 10-bagger they were hoping for. </p>
<p>Once the Lefties go through the interminable internal HP meat grinder, sales will grow rapidly. I suspect they weren&#8217;t up to Isilon&#8217;s $100M in sales &#8211; maybe $70M &#8211; but LeftHand was much closer to profitability. Net net: the price looks fair for a market leader in a high-growth market.</p>
<p><strong>HP vs EMC</strong><br />
Battle of the competing cluster storage visions. PolyServe handles files; LeftHand handles blocks. EMC&#8217;s Maui is aimed at large-scale distributed file storage, a utility that ISPs might resell to SMBs, but nothing an SMB would implement on its own.</p>
<p>Which will win &#8211; and there&#8217;s room for both &#8211; rests on the answer to the question <a href="http://storagemojo.com/2008/09/18/are-there-economies-of-scale-in-storage/" target="_blank">Are there economies of scale in storage?</a> If there are, small-scale cluster sales will suffer and Maui should win. </p>
<p><strong>The StorageMojo take</strong><br />
This is cluster storage market skirmishing, not a pitched battle. That will come, but right now everyone is feeling their way, coming into the market from different directions, waiting to see what clicks. </p>
<p>Right now, though, HP seems to have the strongest position. XIV is too new; Maui even newer; Lustre too complex; Isilon is digging out of a big hole. HP has the pole position with implementable products today and the services to back them up. Should be a powerful combination.</p>
<p><strong>Courteous comments welcome, of course.</strong> Disclosure: I&#8217;ve done some work for HP, Isilon and Sun.</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2008/10/08/hplefthand-cluster-market-shapes-up/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Our changing file workloads</title>
		<link>http://storagemojo.com/2008/09/09/our-changing-file-workloads/</link>
		<comments>http://storagemojo.com/2008/09/09/our-changing-file-workloads/#comments</comments>
		<pubDate>Wed, 10 Sep 2008 04:34:49 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[NAS, IP, iSCSI]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=928</guid>
		<description><![CDATA[StorageMojo has long held the view that our storage workloads are changing: more file storage, less block storage; larger file sizes; and cooler data. While all the indicators said this was happening it&#8217;s good to find a study that confirmed this intuition. In the Measurement And Analysis Of Large-Scale Network File System Workloads (pdf) researchers [...]]]></description>
				<content:encoded><![CDATA[<p></p><p>StorageMojo has long held the view that our storage workloads are changing: more file storage, less block storage; larger file sizes; and cooler data. While all the indicators said this was happening it&#8217;s good to find a study that confirmed this intuition.</p>
<p>In the <a href="http://www.ssrc.ucsc.edu/Papers/leung-usenix08.pdf" target="_blank">Measurement And Analysis Of Large-Scale Network File System Workloads</a> (pdf) researchers Andrew W. Leung and Ethan L. Miller from UC Santa Cruz and Shankar Pasupathy and Garth Goodson of NetApp measured 2 large file servers for 4 months. Their results are worth reviewing, since so many of the optimizations in storage infrastructures rely on workload assumptions. </p>
<p><strong>Unstudied CIFS</strong><br />
The authors point out that there have been no major studies of the CIFS protocol, which is odd since it is the default on Windows systems. Furthermore, the last major study of network file loads was performed in 2001 &#8211; seven years ago &#8211; an interval in which average disk drive sizes have gone from 20 GB to 500 GB and network speeds from 100 Mbit/s to 1 Gbit/s. </p>
<p>Most surprising, however, is that no published study has ever analyzed large-scale enterprise file system workloads. Researchers have studied workloads closer to home: university and engineering workloads. </p>
<p><strong>Enterprise workloads</strong><br />
One was a midrange file server with 3 TB of capacity, almost all of it used, serving over 1000 marketing, sales and finance employees. The second server was a high-end NetApp filer with 28 TB capacity &#8211; 19 TB used &#8211; supporting 500 engineering employees. </p>
<p>Yes, marketers, engineers get the good toys. You can cry about it over your next 3-martini lunch.</p>
<p>Some significant differences from prior studies:</p>
<ul>
<li><strong>Workloads more write oriented.</strong> Read/write byte ratios are now only 2 to 1, compared to the 4 to 1 or higher ratios reported earlier.</li>
<li><strong>Workloads less read-centric.</strong> Read/write workloads are now 30x more common.</li>
<li><strong>Most bytes transferred sequentially.</strong> These runs are 10x the length found in the old studies.</li>
<li><strong>Files 10x bigger.</strong></li>
<li><strong>Files live 10x longer.</strong> Less than half are deleted within a day of creation.</li>
</ul>
<p><strong>Cool new findings</strong></p>
<ul>
<li><strong>Files rarely re-opened. </strong>Over 66% are re-opened once and 95% fewer than 5 times.</li>
<li><strong>Over 60% of file re-opens are within a minute of the first open.</strong></li>
<li><strong>Less than 1% of clients account for 50% of requests.</strong></li>
<li><strong>Infrequent file sharing.</strong> Over 76% of files are opened by just 1 client.</li>
<li><strong>Concurrent file sharing very rare.</strong> Even among the files opened by multiple clients, only 5% are opened concurrently, and 90% of the sharing is read only.</li>
<li><strong>Most file types have no common access pattern.</strong></li>
</ul>
<p>And there&#8217;s this: <strong>over 90% of the active storage was untouched during the study.</strong> That makes it official: data is getting cooler.</p>
<p>Another interesting finding: 91% of VMware Virtual Disk (vmdk) file accesses were small sequential reads &#8211; not the larger sequential accesses I&#8217;d expect.</p>
<p><strong>The StorageMojo take</strong><br />
The writers rightly suggest that, given the rarity of file reads after creation, it makes sense to migrate files to cheap storage sooner rather than later.</p>
<p>Perhaps primary file storage should be thought of as a large FIFO buffer &#8211; tossing 3-month-old files to an archive for long-term storage. A data flow architecture instead of a series of ever-larger buckets.</p>
<p>Kudos to NetApp and UCSC for this work. It seems like NetApp has been doing the best job of leveraging academic researchers lately. I&#8217;d like to see them get more marketing mileage out of their good work.</p>
<p><strong>Courteous comments welcome, of course.</strong>  </p>
]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2008/09/09/our-changing-file-workloads/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
	</channel>
</rss>
