<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>StorageMojo &#187; Future Tech</title>
	<atom:link href="http://storagemojo.com/category/future-tech/feed/" rel="self" type="application/rss+xml" />
	<link>http://storagemojo.com</link>
	<description>Data storage info &#38; analysis</description>
	<lastBuildDate>Fri, 20 Jan 2012 06:10:36 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>The network is choking our storage</title>
		<link>http://storagemojo.com/2011/10/20/the-network-is-choking-our-storage/</link>
		<comments>http://storagemojo.com/2011/10/20/the-network-is-choking-our-storage/#comments</comments>
		<pubDate>Thu, 20 Oct 2011 17:03:08 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Cloud computing & storage]]></category>
		<category><![CDATA[Clusters]]></category>
		<category><![CDATA[Future Tech]]></category>
		<category><![CDATA[SAN, FC]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2533</guid>
		<description><![CDATA[Amazon Web Services architect James Hamilton has been posting on network issues for over a year and researching them much longer. As Ethernet becomes the de facto SAN technology, his views become more relevant to the larger storage market. Critique Part of Mr. Hamilton&#8217;s concern is the structure of the networking industry: the high margins; [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>Amazon Web Services architect James Hamilton has been <a href="http://perspectives.mvdirona.com/2011/10/01/ChangesInNetworkingSystems.aspx" target="_blank">posting</a> on network issues for over a year and researching them much longer. As Ethernet becomes the <i>de facto</i> SAN technology, his views become more relevant to the larger storage market.</p>
<p><strong>Critique</strong><br />
Part of Mr. Hamilton&#8217;s concern is the structure of the networking industry: the high margins; the dominance of a single player, Cisco; the closed technology; and the heavy vertical integration. All antithetical to the dynamics that have driven server costs down so successfully in the last 20 years.</p>
<p>These are issues the storage industry knows too well. But Mr. Hamilton is more concerned about the waste the current high-cost industry structure causes.</p>
<p>Waste?</p>
<p><strong>Workload placement</strong><br />
The cost of network bandwidth leads to network over-subscription. Networks are configured as tree topologies: the further you move from end nodes, the worse the over-subscription.</p>
<p>As described in the 2009 Microsoft Research paper <a href="http://research.microsoft.com/pubs/80693/vl2-sigcomm09-final.pdf" target="_blank">VL2: A Scalable and Flexible Data Center Network</a>:</p>
<blockquote><p>
. . . the capacity between different branches of the tree is typically oversubscribed by factors of 1:5 or more, with paths through the highest levels of the tree oversubscribed by factors of 1:80 to 1:240. This limits communication between servers to the point that it fragments the server pool — congestion and computation hot-spots are prevalent even when spare capacity is available elsewhere.
</p></blockquote>
<p>This throttles data center performance by limiting server-to-server bandwidth, fragmenting resources and reducing network utilization. The latter reflects the redundant paths needed in case of switch failure: ≈50% or more of costly data center bandwidth goes unused.</p>
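<p>To make those ratios concrete, here is a rough back-of-the-envelope sketch. The 1 Gbit/s server link speed is an assumption chosen for illustration, not a figure from the paper, and <code>worst_case_share</code> is a hypothetical helper name.</p>

```python
# Worst-case bandwidth per server when every server under an
# over-subscribed link transmits at once.
# ASSUMPTION: 1 Gbit/s server links (illustrative, not from the paper).
LINK_MBPS = 1000.0

def worst_case_share(link_mbps: float, oversubscription: int) -> float:
    """Per-server bandwidth in Mbit/s at a 1:N over-subscription ratio."""
    return link_mbps / oversubscription

for ratio in (5, 80, 240):
    share = worst_case_share(LINK_MBPS, ratio)
    print(f"1:{ratio} over-subscription: {share:.1f} Mbit/s per server")
```

<p>Under that assumption, traffic that has to cross the top of the tree can be squeezed to a few Mbit/s per server, which is why workload placement matters so much.</p>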
<p>As might be expected, big Internet data centers like Amazon&#8217;s have complex and unpredictable workloads. They need lots of bandwidth between all servers all the time.</p>
<p><strong>A solution</strong><br />
The VL2 paper describes an experimental solution to these problems that includes <i>location-specific</i> and <i>application-specific</i> addressing, multi-path traffic load balancing and a novel directory design that efficiently handles lookups and updates to network mappings.</p>
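<p>A toy sketch of how those pieces could fit together, not the paper's implementation: every class, function, address and switch name below is hypothetical. A directory maps an application address (AA) to the location address (LA) of the switch it currently sits behind, and each flow bounces off a randomly chosen intermediate switch to spread load across paths.</p>

```python
import random

class Directory:
    """Toy mapping from application address (AA) to location address (LA)."""

    def __init__(self) -> None:
        self._aa_to_la: dict[str, str] = {}

    def update(self, aa: str, la: str) -> None:
        # Called when a server or VM is placed or moved.
        self._aa_to_la[aa] = la

    def lookup(self, aa: str) -> str:
        return self._aa_to_la[aa]

def pick_path(directory: Directory, dst_aa: str, intermediates: list[str],
              rng: random.Random) -> tuple[str, str]:
    """Return (intermediate switch, destination LA) for one flow.

    Choosing the intermediate at random spreads flows across paths.
    """
    return rng.choice(intermediates), directory.lookup(dst_aa)

directory = Directory()
directory.update("10.0.0.5", "tor-17")      # hypothetical addresses
rng = random.Random(0)
print(pick_path(directory, "10.0.0.5", ["core-1", "core-2", "core-3"], rng))

directory.update("10.0.0.5", "tor-42")      # the service moved racks
print(directory.lookup("10.0.0.5"))         # its AA is unchanged
```

<p>The point of the split is that a service keeps its application address wherever it runs; only the directory entry changes.</p>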
<p>In a 75-node test cluster the design moved 2.75TB of data in 395 seconds &#8211; 94% of maximum network bandwidth &#8211; at a fraction of the cost of current enterprise networks. The paper calculates that a cloud-service scale network with no over-subscription could be built with commodity switches at <strong>1/14th the cost</strong> of a traditional data center Ethernet.</p>
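<p>For scale, the average payload rate implied by the test figures quoted above can be worked out directly. This is plain arithmetic assuming decimal terabytes; it does not reproduce how the paper derives its 94% figure.</p>

```python
# Average payload rate implied by the shuffle figures quoted above.
# ASSUMPTION: decimal units (1 TB = 10**12 bytes).
DATA_BYTES = 2.75e12
SECONDS = 395
NODES = 75

aggregate_gbps = DATA_BYTES * 8 / SECONDS / 1e9
per_node_gbps = aggregate_gbps / NODES

print(f"aggregate: {aggregate_gbps:.1f} Gbit/s")
print(f"per node:  {per_node_gbps:.2f} Gbit/s")
```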
<p>Whoa!</p>
<p><strong>The StorageMojo take</strong><br />
VC and engineering dollars follow high-growth markets. What Google, Amazon and Microsoft want, they get. With the rapid growth of public cloud services the network over-subscription problem will get solved. </p>
<p>Merchant silicon from Broadcom, Intel and Marvell is making a tried-and-true Moore&#8217;s Law attack on hardware cost. The protocol stack is tougher, but several open-source industry initiatives are under way with support from major companies. Progress will be slower than hoped, but within 3 years we&#8217;ll have a viable stack to build on.</p>
<p>Where does this leave the networking industry? That depends on where you sit.</p>
<p>Cisco will be the biggest loser, because they&#8217;ve been the biggest winner with the current model. They may need to pull an IBM and move big into services if they want to stick around. Ironically, Cisco&#8217;s UCS product line &#8211; which bakes in the tree-structured network &#8211; has further motivated broader industry action.</p>
<p>The rest of the industry can go after this emerging market with a lower gross-margin business model. Not all of them will, but it will be a critical success factor.</p>
<p>The big winner will be storage. Scale-out storage relies on spraying data across multiple racks for maximum availability, utilization and performance. Cheaper, faster, better scale-out networks will only drive storage demand.</p>
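<p>A minimal sketch of what spraying data across racks can look like, assuming a simple round-robin placement rule. The function name, rack names and node names are hypothetical; real systems use more elaborate placement.</p>

```python
from itertools import cycle, islice

def place_replicas(block_id: int, racks: dict[str, list[str]],
                   copies: int = 3) -> list[str]:
    """Pick one node from each of `copies` different racks."""
    rack_names = sorted(racks)
    if copies > len(rack_names):
        raise ValueError("not enough racks for the requested copies")
    start = block_id % len(rack_names)
    chosen = islice(cycle(rack_names), start, start + copies)
    return [racks[r][block_id % len(racks[r])] for r in chosen]

racks = {
    "rack-a": ["a1", "a2"],
    "rack-b": ["b1", "b2"],
    "rack-c": ["c1", "c2"],
    "rack-d": ["d1", "d2"],
}
print(place_replicas(5, racks))
```

<p>Every write in a scheme like this crosses rack boundaries, so the cost and capacity of inter-rack bandwidth directly limit what the storage layer can do.</p>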
<p>For most of us this is an academic problem today. Lightly used systems &#8211; such as for backup and archiving &#8211; don&#8217;t see Amazon&#8217;s problems. But in 5 years this will be common even outside the public cloud providers.</p>
<p>Just as IT users have benefited from Google&#8217;s push on energy efficiency and much more, they will also benefit from much lower cost and more scalable networks.</p>
<p><strong>Courteous comments welcome, of course.</strong> I can&#8217;t help but continue to marvel at how dumb Cisco&#8217;s UCS has turned out to be. It&#8217;s a gift that keeps on giving.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/10/20/the-network-is-choking-our-storage/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>NoSQL in the metadata engine room</title>
		<link>http://storagemojo.com/2011/10/03/nosql-in-the-metadata-engine-room/</link>
		<comments>http://storagemojo.com/2011/10/03/nosql-in-the-metadata-engine-room/#comments</comments>
		<pubDate>Mon, 03 Oct 2011 18:59:44 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Clusters]]></category>
		<category><![CDATA[Future Tech]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2525</guid>
		<description><![CDATA[One more datapoint and we&#8217;ll have a trend: NoSQL databases managing metadata. It&#8217;s obvious in retrospect: use a scalable big data tool to handle scale-out metadata. Maybe not a requirement today, but surely will be with even bigger data tomorrow. Metadata is a fraction of the user data set, but it gets hammered much more. [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>One more datapoint and we&#8217;ll have a trend: NoSQL databases managing metadata. It&#8217;s obvious in retrospect: use a scalable big data tool to handle scale-out metadata. Maybe not a requirement today, but surely will be with even bigger data tomorrow.</p>
<p>Metadata is a fraction of the user data set, but it gets hammered much more. As more metadata is found useful the hammering will get more insistent.</p>
<p><strong>Nutanix</strong><br />
<a href="http://www.nutanix.com/" target="_blank">Nutanix</a>, whose CTO and co-founder, Mohit Aron, was a developer of the Google File System, uses MapReduce. Nutanix achieves its scale due to its distributed-metadata, masterless architecture &#8211; powered by MapReduce jobs that run in the background.</p>
<p><strong>Druva</strong><br />
<a href="http://www.druva.com/" target="_blank">Druva</a>, a backup company for mobile devices, also uses a NoSQL database to manage storage metadata. They say they&#8217;ve found that NoSQL scales over an order of magnitude better than relational in similar applications.</p>
<p><strong>Somebody else</strong><br />
A company that shall remain nameless is porting Hadoop to their backend. The customer won&#8217;t be able to access Hadoop for their work &#8211; it is strictly for the system&#8217;s internal use.</p>
<p>It is a proof of concept so it isn&#8217;t a 3rd data point, but they see the potential advantages. Call it data point 2½. </p>
<p><strong>The StorageMojo take</strong><br />
Small advances are the building blocks of disruption. RAID made it possible to build available storage using cheap disks. Consumer adoption of PCs made disks even cheaper. Moore&#8217;s Law made RAID controllers cheaper and faster, or faster and more capable. </p>
<p>A virtuous circle of disruption.</p>
<p>The basic architecture of scale-out storage systems &#8211; purpose-built software on clustered commodity hardware &#8211; has been stable. But this is the beginning of scale-out storage 2.0: taking scale-out technology developed for users and incorporating it into the storage infrastructure itself.</p>
<p>These ideas are bubbling up among the latest startups and among the establishment players. At some point the old RAID architectures will be well and truly broken, able to compete in smaller and smaller niches until the revenue can&#8217;t justify more investment. </p>
<p>Of course vendors have been making RAID controllers out of servers for years now, and those servers can run any software they want. But at some point the explicit and implicit assumptions in the old architecture crash into current realities &#8211; either in cost, development time, feature completeness or management overhead &#8211; and then we move on.</p>
<p><strong>Courteous comments welcome, of course.</strong> I learned about Nutanix at the last <a href="http://techfieldday.com/" target="_blank">Tech Field Day</a> &#8220;The Independent IT Influencer Event&#8221; which paid for my travel expenses to Silicon Valley.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/10/03/nosql-in-the-metadata-engine-room/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>StorageMojo at NAB &#8217;11</title>
		<link>http://storagemojo.com/2011/04/10/storagemojo-at-nab-11/</link>
		<comments>http://storagemojo.com/2011/04/10/storagemojo-at-nab-11/#comments</comments>
		<pubDate>Mon, 11 Apr 2011 03:52:12 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Future Tech]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2343</guid>
		<description><![CDATA[CES is a lot of fun, but my favorite toy trade show is the National Association of Broadcasters (NAB) convention. I arrive tomorrow and return Wednesday and hope &#8211; on the way back &#8211; to walk across the new bridge that is 900 feet above the Colorado River at Hoover Dam. I do video work [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>CES is a lot of fun, but my favorite <strike>toy</strike> trade show is the National Association of Broadcasters (NAB) convention. I arrive tomorrow and return Wednesday and hope &#8211; on the way back &#8211; to walk across the new bridge that is 900 feet above the Colorado River at Hoover Dam.</p>
<p>I do video work today and did FM broadcasting decades ago. Today we&#8217;re all narrowcasting, but digital has made the tech now so much better &#8211; and cheaper! &#8211; than what broadcasters had 10 years ago that the possibilities are endless.</p>
<p><strong>Pre-show expectations</strong><br />
Rumor has it that Apple will announce the newest version of Final Cut Studio &#8211; my preferred editing platform &#8211; on Tuesday. I&#8217;m also looking for any and all Thunderbolt peripherals, although the pre-show PR hasn&#8217;t mentioned it. I hope Promise and LaCie have something to show, and perhaps Sony as well.</p>
<p>Object storage should be more visible this year as well. And where will USB 3.0 show up &#8211; if it shows up anywhere &#8211; in pro gear?</p>
<p><strong>The StorageMojo take</strong><br />
There&#8217;s something about pro gear &#8211; even though I can&#8217;t afford 98% of it and wouldn&#8217;t use it to best advantage if I could &#8211; that fascinates. Built into all the features and specs is deep knowledge of the technology and its limitations. </p>
<p>There are people who spend $5k for a microphone to record a single instrument. Their ears can discern the differences between equally high-end mics and they know how to mix them to get the sound they want. </p>
<p>Then we listen to all that painstaking work through crummy little earbuds. Oh well!</p>
<p>If you have something to share please contact me through the comments. I&#8217;m looking for cool stuff.</p>
<p><strong>Courteous comments welcome, of course.</strong> </p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/04/10/storagemojo-at-nab-11/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>StorageMojo @Big Data NYC next week</title>
		<link>http://storagemojo.com/2011/03/15/storagemojo-big-data-nyc-next-week/</link>
		<comments>http://storagemojo.com/2011/03/15/storagemojo-big-data-nyc-next-week/#comments</comments>
		<pubDate>Tue, 15 Mar 2011 22:10:42 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Future Tech]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2329</guid>
		<description><![CDATA[The younger StorageMojo analysts have heard that New York is a den of sin and depravity and can&#8217;t wait to try some. With new Wrangler jeans and polished silver bolos they are ready to par-tay with some big city gals. Yee-ha! Should I tell them that it will be 95% male? Nah. The event is [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>The younger StorageMojo analysts have heard that New York is a den of sin and depravity and can&#8217;t wait to try some. With new Wrangler jeans and polished silver bolos they are ready to par-tay with some big city gals. Yee-ha!</p>
<p>Should I tell them that it will be 95% male? Nah. </p>
<p>The event is GigaOm&#8217;s <a href="http://bigdata2011vip.eventbrite.com/" target="_blank">Structure Big Data 2011</a> conference on Wednesday the 23rd. Haven&#8217;t been to it before, but I was overdue to see what Mr. Malik has cooked up.</p>
<p><strong>The StorageMojo take</strong><br />
With the gradual slowing of improvement in hardware and big plans for massive data collection, the world of storage is looking at accelerating change. Drive and CPU vendors won&#8217;t be doing all the heavy lifting. </p>
<p>That means that architecture will become even more critical to successful products. Lots of room for creativity and innovation because there is a ready audience that needs something better than what they have today.</p>
<p>Looking forward to meeting new people and learning about new technologies and markets. I&#8217;ll have some time Thursday morning too, if anyone is eager to bend my ear. Leave a comment with your preferred means of contact.</p>
<p><strong>Courteous comments welcome, of course.</strong>  </p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/03/15/storagemojo-big-data-nyc-next-week/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Show StorageMojo some love</title>
		<link>http://storagemojo.com/2011/02/28/show-storagemojo-some-love-2/</link>
		<comments>http://storagemojo.com/2011/02/28/show-storagemojo-some-love-2/#comments</comments>
		<pubDate>Tue, 01 Mar 2011 06:22:52 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Future Tech]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2314</guid>
		<description><![CDATA[Update: The survey is now closed. Thanks to everyone who responded! I&#8217;ll have more later on the results. End update. StorageMojo would like your help. TechnoQWAN LLC, StorageMojo&#8217;s publisher, is a research and analysis firm. An IT supplier has retained us to help plan their internal storage strategy. You&#8217;ll be anonymous, unless you choose [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><strong>Update:</strong> The survey is now closed. Thanks to everyone who responded!</p>
<p>I&#8217;ll have more later on the results. <strong>End update.</strong></p>
<p>StorageMojo would like your help. </p>
<p>TechnoQWAN LLC, StorageMojo&#8217;s publisher, is a research and analysis firm. An IT supplier has retained us to help plan their internal storage strategy.</p>
<p>You&#8217;ll be anonymous, unless you choose not to be. This is a research project, not a marketing Trojan horse.</p>
<p><strong>How you can help</strong><br />
Please donate 5 to 8 minutes to complete a survey. You&#8217;ll help keep StorageMojo the independent source of storage coolness it has always strived to be.</p>
<p>We&#8217;d like to get a couple of hundred respondents RSN. Can you do it now?</p>
<p>Here&#8217;s the link to the <a href="https://www.surveygizmo.com/s3/476397/ecdedba37108" target="_blank">Internal Storage Survey</a>.</p>
<p>And please pass the link on to likely friends and colleagues. The more the merrier!</p>
<p><strong>The StorageMojo take</strong><br />
After we&#8217;ve looked at the data I plan to write about what I&#8217;ve found interesting in the results. You&#8217;ll get to learn something about the rest of the StorageMojo community.</p>
<p>And I&#8217;ll get to learn some more about you.</p>
<p>If, perchance, you&#8217;re passionate about the topic, you&#8217;ll be able to volunteer for a more in-depth discussion. Trust me: the sponsor can make a real difference in the servers you buy in a couple of years. </p>
<p><strong>Courteous comments welcome, of course.</strong> Completed surveys even more so! BTW, QWAN = Quality Without A Name.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/02/28/show-storagemojo-some-love-2/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Great work at FAST &#8217;11</title>
		<link>http://storagemojo.com/2011/02/17/great-work-at-fast-11/</link>
		<comments>http://storagemojo.com/2011/02/17/great-work-at-fast-11/#comments</comments>
		<pubDate>Fri, 18 Feb 2011 06:42:47 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Future Tech]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2296</guid>
		<description><![CDATA[After a quick scan of the paper titles I wasn&#8217;t impressed. But after seeing presentations and posters I am. Here&#8217;s some I found interesting. I&#8217;ll be posting longer pieces on some of these. A Study of Practical Deduplication Full paper *Best Paper Winner* Tradeoffs in Scalable Data Routing for Deduplication Clusters Full paper Exploiting Half-Wits: [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>After a quick scan of the paper titles I wasn&#8217;t impressed. But after seeing presentations and posters I am.</p>
<p>Here are some I found interesting. I&#8217;ll be posting longer pieces on some of these.</p>
<ul>
<li><strong>A Study of Practical Deduplication <a href="http://www.usenix.org/events/fast11/tech/full_papers/Meyer.pdf">Full paper</a> <em>*Best Paper Winner*</em></strong></li>
<li><strong>Tradeoffs in Scalable Data Routing for Deduplication Clusters <a href="http://www.usenix.org/events/fast11/tech/full_papers/Dong.pdf">Full paper</a></strong></li>
<li><strong>Exploiting Half-Wits: Smarter Storage for Low-Power Devices <a href="http://www.usenix.org/events/fast11/tech/full_papers/Salajegheh.pdf">Full paper</a></strong></li>
<li><strong>Reliably Erasing Data from Flash-Based Solid State Drives <a href="http://www.usenix.org/events/fast11/tech/full_papers/Wei.pdf">Full paper</a></strong></li>
<li><strong>Scale and Concurrency of GIGA+: File System Directories with Millions of Files <a href="http://www.usenix.org/events/fast11/tech/full_papers/Patil.pdf">Full paper</a></strong></li>
<li><strong>Emulating Goliath Storage Systems with David <a href="http://www.usenix.org/events/fast11/tech/full_papers/Agrawal.pdf">Full paper</a> <em>*Best Paper Winner*</em></strong></li>
</ul>
<p>An excellent conference. NetApp, EMC, Microsoft and IBM were recruiting.</p>
<p><strong>The StorageMojo take</strong><br />
We&#8217;re still learning about flash, and the research presented here is a substantial addition to our meager knowledge.</p>
<p>Microsoft tells me they&#8217;re delivering major improvements to NTFS and Windows Server later this year. I&#8217;m looking forward to that briefing.</p>
<p>And it&#8217;s always a pleasure catching up with the people who, for some reason, never come to Sedona.</p>
<p><strong>Courteous comments welcome, as always.</strong></p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/02/17/great-work-at-fast-11/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>StorageMojo @FAST &#8217;11</title>
		<link>http://storagemojo.com/2011/02/11/storagemojo-fast-11/</link>
		<comments>http://storagemojo.com/2011/02/11/storagemojo-fast-11/#comments</comments>
		<pubDate>Fri, 11 Feb 2011 20:26:33 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Future Tech]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2290</guid>
		<description><![CDATA[It&#8217;s that time of the year again: the Usenix File and Storage Technologies conference in San Jose next Tuesday thru Thursday. The elite StorageMojo analyst SWAT unit will be there, in color-coordinated Kevlar and Spandex, rappelling into the Marriott ballroom. So come by and say hello. The StorageMojo take No obvious must-reads among the [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>It&#8217;s that time of the year again: the Usenix <a href="http://www.usenix.org/events/fast11/calendar.html" target="_blank">File and Storage Technologies</a> conference in San Jose next Tuesday thru Thursday.</p>
<p>The elite StorageMojo analyst SWAT unit will be there, in color-coordinated Kevlar and Spandex, rappelling into the Marriott ballroom. So come by and say hello.</p>
<p><strong>The StorageMojo take</strong><br />
No obvious must-reads among the papers this year, so we&#8217;ll have to wait and see if lightning strikes. </p>
<p><strong>Courteous comments welcome, of course.</strong> </p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/02/11/storagemojo-fast-11/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Will algorithms leap Moore&#8217;s Wall?</title>
		<link>http://storagemojo.com/2011/01/30/will-algorithms-leap-moores-wall/</link>
		<comments>http://storagemojo.com/2011/01/30/will-algorithms-leap-moores-wall/#comments</comments>
		<pubDate>Mon, 31 Jan 2011 00:23:49 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Future Tech]]></category>
		<category><![CDATA[Security & Public Policy]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2259</guid>
		<description><![CDATA[The performance increase in individual CPUs is slowing to a crawl. All the easy wins &#8211; higher clock speeds, wider datapaths, more DRAM, larger registers and caches, 2-4 cores &#8211; have been exploited. Doctor, is there any hope? In the recent PCAST report on Federal technological initiatives (see Fed funding for our digital future) (pdf) [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>The performance increase in individual CPUs is <a href="http://storagemojo.com/2010/11/29/moores-wall-the-end-of-moores-law/" target="_blank">slowing to a crawl</a>. All the easy wins &#8211; higher clock speeds, wider datapaths, more DRAM, larger registers and caches, 2-4 cores &#8211; have been exploited. </p>
<p><strong>Doctor, is there any hope?</strong><br />
In the recent PCAST report on Federal technological initiatives (see <a href="http://storagemojo.com/2010/12/28/fed-funding-for-our-digital-future/" target="_blank">Fed funding for our digital future</a>) (pdf) one sidebar suggested &#8220;Progress in Algorithms Beats Moore’s Law.&#8221;</p>
<blockquote><p>
. . . in many areas, performance gains due to improvements in algorithms have vastly exceeded even the dramatic performance gains due to increased processor speed.</p>
<p>The algorithms that we use today for speech recognition, for natural language translation, for chess playing, for logistics planning, have evolved remarkably in the past decade. It’s difficult to quantify the improvement, though, because it is as much in the realm of quality as of execution time.</p>
<p>In the field of numerical algorithms, however, the improvement can be quantified. Here is just one example . . . a benchmark production planning model solved using linear programming would have taken 82 years to solve in 1988, using the computers and the linear programming algorithms of the day. Fifteen years later – in 2003 – this same model could be solved in roughly 1 minute, an improvement by a factor of roughly 43 million. Of this, a factor of roughly 1,000 was due to increased processor speed, whereas a factor of roughly 43,000 was due to improvements in algorithms!
</p></blockquote>
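<p>The quoted figures are easy to sanity-check. A quick sketch, assuming a 365.25-day year:</p>

```python
# Check the quoted speedup figures.
# ASSUMPTION: a year of 365.25 days.
MINUTES_PER_YEAR = 365.25 * 24 * 60

total_speedup = 82 * MINUTES_PER_YEAR / 1      # 82 years down to 1 minute
hardware, algorithms = 1_000, 43_000

print(f"82 years / 1 minute = {total_speedup:,.0f}x")
print(f"hardware x algorithms = {hardware * algorithms:,}x")
```

<p>Both routes land at roughly 43 million, so the report&#8217;s numbers are internally consistent.</p>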
<p><strong>The StorageMojo take</strong><br />
Let&#8217;s file this one under &#8220;Wishful thinking&#8221; along with &#8220;US housing prices will never decline.&#8221; Piecemeal enhancements of specific application areas cannot replace the generalized performance improvements we&#8217;ve seen for decades.</p>
<p>No doubt there are important algorithmic improvements to be made, and in certain problem spaces those speedups will far exceed Moore&#8217;s Law &#8211; even though the Law is about transistor count, not performance.</p>
<p>That doesn&#8217;t change the fact of computation today: the era of predictable and rapid performance improvement is over. Like a vein of rich ore that thins out, our computers will still improve, but the effort needed to do so is rising fast.</p>
<p>Cheap(er) SSDs, larger memories and caches are helping mask the performance plateau by increasing system performance, but reduced I/O latency and increased bandwidth will only take us so far. The way forward is a game of wringing out single-digit percent improvements, not the 2-3 year doubling of the last 60 years.</p>
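<p>To put the two regimes side by side, here is a small sketch that converts a doubling period into a compound annual rate and back. The sample percentages are illustrative assumptions, not measurements.</p>

```python
import math

def annual_rate(doubling_years: float) -> float:
    """Compound annual improvement implied by a doubling period."""
    return 2 ** (1 / doubling_years) - 1

def doubling_time(rate: float) -> float:
    """Years needed to double at a fixed annual improvement rate."""
    return math.log(2) / math.log(1 + rate)

for years in (2, 3):
    print(f"doubling every {years} years = {annual_rate(years):.0%} per year")
for pct in (0.03, 0.05, 0.09):
    print(f"{pct:.0%} per year doubles in {doubling_time(pct):.1f} years")
```

<p>A 2-3 year doubling works out to roughly 26% to 41% a year; at 5% a year the same doubling takes about 14 years.</p>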
<p><strong>Courteous comments welcome, of course.</strong> The professor whose work the PCAST quote refers to is Martin Grötschel of Konrad-Zuse-Zentrum in Berlin. He&#8217;s been doing <a href="http://www.zib.de/groetschel/research/Musterbiblio.html" target="_blank">brilliant work on optimization problems</a> &#8211; including the traveling salesman problem and data network design &#8211; for decades.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/01/30/will-algorithms-leap-moores-wall/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Hyder: a flash-based scale-out database</title>
		<link>http://storagemojo.com/2011/01/24/hyder-a-flash-based-scale-out-database/</link>
		<comments>http://storagemojo.com/2011/01/24/hyder-a-flash-based-scale-out-database/#comments</comments>
		<pubDate>Mon, 24 Jan 2011 07:36:35 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Clusters]]></category>
		<category><![CDATA[Future Tech]]></category>
		<category><![CDATA[Information Management]]></category>
		<category><![CDATA[SSD/Flash Disk]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2239</guid>
		<description><![CDATA[Talked to a company last week whose cloud app handles several billion transactions per month on a cluster. Sounds like SSDs could help them but how? In a paper from the latest 5th Biennial Conference on Innovative Data Systems Research (CIDR &#8217;11) researchers Philip A. Bernstein and Colin W. Reid of Microsoft and Sudipto Das [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>Talked to a company last week whose cloud app handles several billion transactions per month on a cluster. Sounds like SSDs could help them but how?</p>
<p>In a paper from the recent <a href="http://www.cidrdb.org/cidr2011/" target="_blank">5th Biennial Conference on Innovative Data Systems Research</a> (CIDR &#8217;11) researchers Philip A. Bernstein and Colin W. Reid of Microsoft and Sudipto Das of UC Santa Barbara have a suggestion: <a href="http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper2.pdf" target="_blank">Hyder – A Transactional Record Manager for Shared Flash</a> (pdf).</p>
<p>As underlying hardware changes &#8211; faster networks, large memories, multi-core CPUs and SSDs &#8211; database software architectures may change too. <i>Hyder</i> architecture supports</p>
<blockquote><p>
. . . reads and writes on indexed records within classical multi-step transactions. It is designed to run on a cluster of servers that have shared access to a large pool of network-addressable raw flash chips. . . . Hyder uses a data-sharing architecture that scales out without partitioning the database or application.
</p></blockquote>
<p><strong>No partition scale-out</strong><br />
Today, most popular database clusters partition the database across multiple servers. Done well this works, but at some cost. The database design is non-trivial: cross-partition transactions, cache coherence, load balancing, scaling and multi-server debugging are knotty issues that translate into higher design and operation costs.</p>
<p>Hyder eliminates partitioning, distributed programming, layers of cache, remote procedure calls and load balancing. All servers can read and write the entire database &#8211; so any server can execute any transaction. Load-balancing is simple: direct new transactions to lightly-loaded servers.</p>
<p>Each update transaction runs on one machine and writes to a shared log &#8211; so there&#8217;s no 2-phase commit. And no 2-phase <strike>commit</strike> locking, which can force performance off a cliff when workloads spike.</p>
<p>The 3 main components of Hyder are the <i>log</i>, the <i>index</i> and the <i>roll-forward algorithm</i>.</p>
<p><strong>Log</strong><br />
The log runs on multiple flash devices &#8211; chips, DIMMs or ??? &#8211; and writes multi-page log records across multiple devices with parity to enable log recovery after device failures.</p>
<p>Hyder uses a <i>multi-versioned</i> database &#8211; old record versions aren&#8217;t updated in place, and only the most recent version of a record is used &#8211; which has several useful properties:</p>
<ul>
<li>Server caches are inherently coherent since only the most recent versions of records are used.</li>
<li>Data can be read while writes are in progress.</li>
<li>Queries that can be decomposed can be run across multiple servers concurrently for a faster response time.</li>
</ul>
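<p>To make the multi-version idea concrete, here is a minimal Python sketch. It is my own toy, not code from the Hyder paper: the <i>VersionedStore</i> class and its method names are hypothetical, and a real system keeps versions in the shared log rather than in a dictionary. It shows only why a reader holding a snapshot is undisturbed by a concurrent write.</p>

```python
class VersionedStore:
    """Toy multi-version store: writes append new versions, never overwrite.

    Hypothetical illustration only; not the data layout used by Hyder.
    """

    def __init__(self):
        self._versions = {}   # key -> list of (commit_seq, value), oldest first
        self._seq = 0         # sequence number of the last committed write

    def write(self, key, value):
        """Append a new version and return its commit sequence number."""
        self._seq += 1
        self._versions.setdefault(key, []).append((self._seq, value))
        return self._seq

    def snapshot(self):
        """A snapshot is just the sequence number current at this moment."""
        return self._seq

    def read(self, key, as_of=None):
        """Return the newest value committed at or before `as_of`."""
        as_of = self._seq if as_of is None else as_of
        for seq, value in reversed(self._versions.get(key, [])):
            if seq <= as_of:
                return value
        return None


store = VersionedStore()
store.write("balance", 100)
snap = store.snapshot()                   # a reader takes a snapshot
store.write("balance", 250)               # a writer proceeds without blocking it
old = store.read("balance", as_of=snap)   # the reader still sees its own version
new = store.read("balance")               # a fresh read sees the latest version
```

<p>The reader never waits on the writer because nothing it can see is ever modified.</p>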
<p>[This may seem like voodoo to ACIDheads. A good technical intro to multi-versioning concurrency control (MVCC) is <a href="http://www.rtcmagazine.com/articles/view/101612" target="_blank">Multi-core software: to gain speed, eliminate resource contention</a>.]</p>
<p>Servers run a cache update process that keeps them current with updated records. Server caches don&#8217;t have to be identical and the cache invalidate messages that most clusters use for cache coherency aren&#8217;t needed.</p>
<p>All log writes are idempotent appends, so if a write fails the server can simply reissue the write. The authors describe several error modes and how Hyder handles them.</p>
<p><strong>Index</strong><br />
The index stores the database as a search tree with each node a [key, payload] pair. The tree can store, for example, a relational database. The index tree is also represented in the log.</p>
<p>Tree nodes are not updated in place. When node <i>n</i> is updated, a new copy &#8211; <i>n&#8217;</i> &#8211; is created. Then, of course, the parent node must be updated and so on up the tree. </p>
<p>A binary tree minimizes the number of node updates, but can be processor intensive. The optimal tree structure for Hyder is not yet resolved.</p>
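<p>The copy-on-update behavior described above is usually called path copying. Here is a rough Python sketch, assuming a plain unbalanced binary search tree; the <i>Node</i>, <i>insert</i> and <i>lookup</i> names are mine, and the paper&#8217;s actual tree structure is, as noted, still unresolved. Only the nodes between the root and the changed node are copied; everything else is shared between the old and new versions.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable [key, payload] tree node (hypothetical layout)."""
    key: int
    payload: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key, payload):
    """Return a new root, copying only the nodes on the path to `key`."""
    if root is None:
        return Node(key, payload)
    if key < root.key:
        return Node(root.key, root.payload, insert(root.left, key, payload), root.right)
    if key > root.key:
        return Node(root.key, root.payload, root.left, insert(root.right, key, payload))
    # Same key: create the new version n' and keep the existing children.
    return Node(key, payload, root.left, root.right)


def lookup(root, key):
    """Walk down from a given root; older roots keep seeing older data."""
    while root is not None:
        if key == root.key:
            return root.payload
        root = root.left if key < root.key else root.right
    return None


v1 = None
for k, p in [(50, "a"), (30, "b"), (70, "c")]:
    v1 = insert(v1, k, p)

v2 = insert(v1, 30, "b-updated")   # new root; the old root v1 remains a valid snapshot
```

<p>After the update, <i>v1</i> still returns the old payload for key 30 while <i>v2</i> returns the new one, and the untouched right subtree is the same object in both versions.</p>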
<p>Garbage collection is an issue. Each node pointer includes the ID of the oldest reachable data element. An element older than any that is pointed to by a node is garbage.</p>
<p><strong>Roll-forward algorithm</strong><br />
This is the key process of Hyder.</p>
<p>When a record update begins, one server executes the transaction. The server is given a copy of  the latest database root, a static snapshot of the entire database.</p>
<p>The updates are stored in a local cache and after execution the after-images are gathered into an <i>intention</i> record, which is broadcast to all servers and appended to the log. The update&#8217;s readset is included in the intention record, to ensure all intentions are properly ordered, none are lost, and the offset is made known to all servers.</p>
<p>Each server can assemble a local copy of the tail of the log, which is used to determine if there are conflicting updates. The <i>meld</i> procedure manages conflicting updates.</p>
<p>Appending the intention to the database log doesn&#8217;t commit the transaction. The intention references the static snapshot of the latest database root. The meld procedure determines if any committed transactions since the snapshot conflict with the intention. </p>
<p>If they don&#8217;t, all is good. If they do, the transaction is aborted.</p>
<p>All servers roll forward using meld and don&#8217;t message each other about committed and failed transactions. Therefore there is no lock manager and no 2-phase commit.</p>
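<p>Here is a heavily simplified Python sketch of that roll-forward idea. It is not the paper&#8217;s meld algorithm, which works on the tree structure itself; I am assuming a flat key-value state and a simple rule: an intention aborts if any key it read or wrote was committed after its snapshot. The <i>Intention</i> and <i>Server</i> classes and the <i>snapshot_seq</i> field are hypothetical. The point is that every server reaches the same commit and abort decisions from the log alone.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Intention:
    snapshot_seq: int   # how many log records the transaction's snapshot reflected
    readset: set        # keys the transaction read
    writes: dict        # after-images: key -> new value


@dataclass
class Server:
    state: dict = field(default_factory=dict)        # latest committed values
    last_writer: dict = field(default_factory=dict)  # key -> log position of last committed write
    applied: int = 0                                  # log records melded so far

    def meld(self, log):
        """Roll forward over unseen log records and return the decisions made."""
        decisions = []
        for seq in range(self.applied, len(log)):
            intent = log[seq]
            touched = intent.readset | set(intent.writes)
            # Conflict: some touched key was committed at or after the snapshot point.
            conflict = any(
                self.last_writer.get(k, -1) >= intent.snapshot_seq for k in touched
            )
            if not conflict:
                self.state.update(intent.writes)
                for k in intent.writes:
                    self.last_writer[k] = seq
            decisions.append((seq, not conflict))
        self.applied = len(log)
        return decisions


log = [
    Intention(snapshot_seq=0, readset={"x"}, writes={"x": 1}),
    Intention(snapshot_seq=0, readset={"x"}, writes={"x": 2}),  # raced with the first
    Intention(snapshot_seq=0, readset={"y"}, writes={"y": 9}),
]

a, b = Server(), Server()
decisions_a = a.meld(log)   # server a melds the whole log at once
b.meld(log[:2])             # server b melds in two steps...
b.meld(log)                 # ...and still ends up in the same state
```

<p>The second intention aborts on both servers because the first one committed a write to the same key after its snapshot. No messages about the outcome are exchanged.</p>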
<p><strong>Contention</strong><br />
Losing the lock manager and 2-phase commit should help performance unless other points of contention throttle the system. Hyder&#8217;s points of contention include appending intentions to the log, melding the log at each server, and aborting transactions.</p>
<p>Intention appends are serial, so the lower the write latency, the more appends can be written per second. A 10 &#181;s write latency caps the log at 100k TPS.</p>
<p>Network latency adds to write latency. Faster switches improve append performance.</p>
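<p>The arithmetic behind that ceiling is simple enough to sketch in Python. The function name is mine and the 10 &#181;s network figure is a made-up example, not a number from the paper; the model assumes strictly serial appends with no batching.</p>

```python
def max_appends_per_second(write_latency_us, network_latency_us=0.0):
    """Upper bound on strictly serial appends: one append per total latency."""
    total_us = write_latency_us + network_latency_us
    return 1_000_000 / total_us


flash_only = max_appends_per_second(10)
# Hypothetical: add 10 microseconds of switch and wire latency to each append.
with_network = max_appends_per_second(10, network_latency_us=10)
```

<p>Under these assumptions, doubling the per-append latency halves the append ceiling, which is why switch latency matters as much as flash latency.</p>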
<p>The abort rate depends on the number of concurrent transactions that conflict. Fast transactions reduce the probability of aborts by reducing the number of concurrent transactions. </p>
<p>The worst case is a record subject to multiple updates from different servers. Detecting high-conflict transactions and serializing them by forcing them onto 1 server would reduce the hot data performance hit.</p>
<p><strong>Performance</strong><br />
The authors model Hyder&#8217;s performance with a focus on the high-contention corner cases. In general, the tests show linear scaling as servers are added. </p>
<p>The problems come when the underlying hardware limits are exceeded. Increasing execution times mean more aborts and performance falls off a cliff. From the paper:</p>
<p><a href="http://storagemojo.com/wp-content/uploads//2011/01/hyder_thrashing.jpg"><img src="http://storagemojo.com/wp-content/uploads//2011/01/hyder_thrashing.jpg" alt="" title="hyder_thrashing" width="475" height="286" class="aligncenter size-full wp-image-2240" /></a></p>
<p><strong>The StorageMojo take</strong><br />
We&#8217;ve been building disk workarounds for decades. We now tend to assume those workarounds are fundamental architectural requirements rather than hacks. </p>
<p>The <i>Hyder</i> paper asks us to imagine a world where non-volatile mass storage is fast and cheap &#8211; and how we could re-architect basic systems to be faster and cheaper too.</p>
<p>The authors&#8217; conclusion is a fair assessment:</p>
<blockquote><p>
Many variations of the Hyder architecture and algorithms would be worth exploring. There may also be opportunities to use Hyder’s logging and meld algorithms with some modification in other contexts, such as file systems and middleware. We suggested a number of directions for future work throughout the paper. No doubt there are many other directions as well.
</p></blockquote>
<p><strong>Courteous comments welcome, of course.</strong> I hope to get to some of the other CIDR papers before <a href="" target="_blank">FAST &#8217;11</a> snows me under.  <strong>Update:</strong> Phil Bernstein was kind enough to scan the post and I&#8217;ve updated 1 minor error. He also mentioned that it won the Best Paper award at the conference. Those CIDR folks have great taste in papers, don&#8217;t they?</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2011/01/24/hyder-a-flash-based-scale-out-database/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Moore&#8217;s Wall: the end of Moore&#8217;s Law</title>
		<link>http://storagemojo.com/2010/11/29/moores-wall-the-end-of-moores-law/</link>
		<comments>http://storagemojo.com/2010/11/29/moores-wall-the-end-of-moores-law/#comments</comments>
		<pubDate>Mon, 29 Nov 2010 18:41:27 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Future Tech]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2207</guid>
		<description><![CDATA[CPU performance and clock speed have leveled out over the last several years. What does this mean for the industry? Moore&#8217;s law Strictly speaking, Moore&#8217;s law says that the number of transistors on a chip will double every 18 to 24 months. And that&#8217;s been true for the last 40 years. And it appears set [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>CPU performance and clock speed have leveled out over the last several years. What does this mean for the industry?</p>
<p><strong>Moore&#8217;s law</strong><br />
Strictly speaking, Moore&#8217;s law says that the number of transistors on a chip will double every 18 to 24 months. And that&#8217;s been true for the last 40 years. And it appears set to continue for another decade. </p>
<p>But Moore&#8217;s observation has been simplified to mean a <i>doubling of performance</i> every 18 to 24 months. And that too has been true. But not anymore.</p>
<p>Transistors and performance do not have a one-to-one relationship. Yes, clock speeds have improved from the 1 MHz 6502 processor in the original Apple II to over 3 GHz in the latest and greatest. But we&#8217;ve reached the end of the line in clock speed improvements: in a third of a nanosecond light moves about 4 inches or 10 cm. Not much distance when chips have miles of internal wiring.</p>
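<p>For anyone who wants to check the light-speed figure, here is a quick Python calculation. The function name is mine. It uses the speed of light in a vacuum; signals in on-chip wiring travel slower, so the real budget is tighter than this.</p>

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def light_distance_cm(clock_hz):
    """Distance light travels in a vacuum during one clock cycle, in cm."""
    return SPEED_OF_LIGHT_M_PER_S / clock_hz * 100


per_cycle_3ghz = light_distance_cm(3e9)   # one cycle at 3 GHz is a third of a nanosecond
per_cycle_1mhz = light_distance_cm(1e6)   # one cycle of the Apple II's 1 MHz 6502
```

<p>At 3 GHz the answer is just under 10 cm per cycle; at 1 MHz it is roughly 300 meters.</p>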
<p>But clock speed isn&#8217;t the whole story. Chips now move data in 64 and 128-bit chunks, rather than the 6502&#8217;s 8-bit chunks. While there are experiments with Very Long Word computer architectures, as a practical matter we are at the end of the line for wider data paths as well: 256 bits is as wide as personal and commercial processors can reasonably use.</p>
<p>More RAM? We&#8217;ve also been adding ever-larger on-chip caches that improve performance. But inevitably the hit-ratio gains from each increase in cache size shrink, and so do the performance benefits. </p>
<p><strong>Multicore</strong><br />
We can&#8217;t make processors go faster. We can&#8217;t process more data per clock cycle. So how do we put twice as many transistors to work?</p>
<p>Stuffing more processors on a chip. And right now many of the brightest minds in computer science are struggling with the problem of getting usable work out of 8, 12 or 16 core CPUs.</p>
<p>Dual and quad core processors work pretty well because our multitasking operating systems run a lot of background threads. Spreading those threads across multiple cores improves performance for everyone.</p>
<p>But outside video, image, voice and scientific apps, most of what we do today &#8211; word processing, e-mail, web surfing, spreadsheets and presentations &#8211; doesn&#8217;t need multicore architectures. Certainly not 8 or more cores. Humans aren&#8217;t good multi-taskers.</p>
<p><strong>The wall</strong><br />
We&#8217;ve hit a technology wall. We can still double the number of transistors on a chip every couple of years. We can still double disk drive capacity every 2 to 3 years. We can build faster interconnects, such as QuickPath, Light Peak and 10 Gb Ethernet.</p>
<p>But the easy wins are over. Going forward, performance gains will be measured in single-digit percents each year.</p>
<p><strong>Implications</strong><br />
Information technology, like most of the US economy, is driven by consumer spending. So what happens when a new PC is only 20% faster than your fully paid for three-year-old PC?</p>
<p>Digital Equipment Corporation, which pioneered the minicomputer in the 1960s, had a simple model for product improvements. A successful product would add functionality and performance at a constant cost. And they would offer the same functionality and performance at a declining cost. Here&#8217;s a graph of their model:</p>
<p><a href="http://storagemojo.com/wp-content/uploads//2010/11/dec_product_strategy.png"><img src="http://storagemojo.com/wp-content/uploads//2010/11/dec_product_strategy.png" alt="" title="dec_product_strategy" width="475" height="451" class="aligncenter size-full wp-image-2208" /></a></p>
<p>At some point the cost of producing a given level of functionality would be so low that distribution and marketing costs would dominate. Then volumes would migrate to the price-performance sweet spot and lower-volume products would die.</p>
<p>Today, we can no longer count on performance increases to open up new application territory. Therefore we will see differentiation move to what were once considered secondary characteristics.</p>
<ul>
<li><strong>Power.</strong> The server space is grappling with the implications of greater power efficiency, but the mobile space has been pushing this metric for the last 15 years. That will continue for years to come.</li>
<li><strong>Integration.</strong> Open up an iPad or a MacBook Air and what do you see? A tiny PC board, a few chips and a huge set of batteries. Long battery life is what makes these products so convenient that they become part of everyday life.</li>
<li><strong>Functionality.</strong> Creatively integrating multiple applications, each with their own dedicated core, may enable consumer devices to collapse multistep workflows into a single handy device. Combine image capture, voice recognition, editing and compression into a single device that would enable consumers to capture, edit and post video from a single candy bar sized device, editing on-the-fly with spoken commands.</li>
<li><strong>Cost.</strong> The first low-res digital cameras cost hundreds of dollars, but today we build them into cheap cell phones.</li>
</ul>
<p><strong>The StorageMojo take</strong><br />
The days of Moore&#8217;s Law-driven application growth are over. The next step is to use our still growing technical capabilities to refine what we already do.</p>
<p>The good news for the storage industry is that new data production will continue to grow rapidly. Always on, always available consumer data systems will create ever more demand for storage.</p>
<p>This is also another nail in the coffin of the RAID controller paradigm. Distributed multicore processing power requires distributed data protection and storage architectures. </p>
<p>When you can&#8217;t scale up, you have to scale out. Decomposable storage architectures will inevitably come to the fore.</p>
<p><strong>Courteous comments welcome, of course.</strong> Oddly enough, the Apple ][ motherboard's style was the same as today's MacBook Air: a few chips on a PC board. Friends were always startled to see how empty my Apple ]['s case was.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2010/11/29/moores-wall-the-end-of-moores-law/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>OpenStorage Summit next week</title>
		<link>http://storagemojo.com/2010/10/22/openstorage-summit-next-week/</link>
		<comments>http://storagemojo.com/2010/10/22/openstorage-summit-next-week/#comments</comments>
		<pubDate>Fri, 22 Oct 2010 15:37:24 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Future Tech]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2194</guid>
		<description><![CDATA[StorageMojo&#8217;s Global HQ is pulling up stakes and traveling to wilds of Palo Alto for the OpenStorage Summit 2010. I&#8217;m hoping to hear more about the future of ZFS and other storage stacks. I&#8217;ll arrive Tuesday afternoon and will leave Thursday. If you&#8217;re in the neighborhood please stop by and say hello. The StorageMojo take [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>StorageMojo&#8217;s Global HQ is pulling up stakes and traveling to the wilds of Palo Alto for the <a href="http://nexenta-summit2010.eventbrite.com/" target="_blank">OpenStorage Summit 2010</a>. I&#8217;m hoping to hear more about the future of ZFS and other storage stacks.</p>
<p>I&#8217;ll arrive Tuesday afternoon and will leave Thursday. If you&#8217;re in the neighborhood please stop by and say hello.</p>
<p><strong>The StorageMojo take</strong><br />
The rapidly growing cloud storage market is slowly but surely changing the dynamics of the storage market. Private clouds will never be as flexible and cheap as public clouds, but the closer they get the more their economics will improve. It&#8217;s a game played at the margins.</p>
<p>That bodes well for enterprise adoption of open storage for private cloud infrastructure. Not everyone, but those with the scale and the moxie to make it work will find a significant competitive advantage.</p>
<p><strong>Courteous comments welcome, of course.</strong>  </p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2010/10/22/openstorage-summit-next-week/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Objectively speaking: the future of objects</title>
		<link>http://storagemojo.com/2010/10/18/objectively-speaking-the-future-of-objects/</link>
		<comments>http://storagemojo.com/2010/10/18/objectively-speaking-the-future-of-objects/#comments</comments>
		<pubDate>Mon, 18 Oct 2010 23:08:20 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Cloud computing & storage]]></category>
		<category><![CDATA[Enterprise]]></category>
		<category><![CDATA[Future Tech]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2184</guid>
		<description><![CDATA[One infrastructure to rule them all discussed the emerging enterprise need for a single, scalable file storage infrastructure. But what infrastructure? Some background to this is last year&#8217;s Cloud Quadrant and this year&#8217;s Why private clouds are part of the future. Block and file For decades direct-attached block-based storage was the only option. The &#8217;80s [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><a href="http://storagemojo.com/2010/09/10/one-infrastructure-to-rule-them-all/" target="_blank">One infrastructure to rule them all</a> discussed the emerging enterprise need for a single, scalable file storage infrastructure. But what infrastructure?</p>
<p>Some background to this is last year&#8217;s <a href="http://storagemojo.com/2009/09/28/the-cloud-quadrant/" target="_blank">Cloud Quadrant</a> and this year&#8217;s <a href="http://storagemojo.com/2010/02/05/why-private-clouds-are-part-of-the-future/" target="_blank">Why private clouds are part of the future</a>. </p>
<p><a href="http://storagemojo.com/wp-content/uploads//2010/10/Cloud-quadrant-plain-diagram.jpg"><img src="http://storagemojo.com/wp-content/uploads//2010/10/Cloud-quadrant-plain-diagram.jpg" alt="" title="Cloud quadrant plain diagram" width="480" height="464" class="aligncenter size-full wp-image-2188" /></a></p>
<p><strong>Block and file</strong><br />
For decades direct-attached block-based storage was the only option. The &#8217;80s introduced file-based storage. Much of storage systems growth in the last 15 years has been in file servers.</p>
<p>New systems, be they video, sensor or social, are producing massive collections of files at an accelerating rate. The rapid development of lower cost mobile computing devices – smartphones, iPads, netbooks and Android tablets – means that content consumption and production will be a major source of file growth. The long tail of content demand means that the variety of online content will grow &#8211; especially as the cost of storage declines.</p>
<p><strong>Private cloud</strong><br />
The larger issue is the need to keep this fast-growing information online for years, despite rapid change in the underlying storage, network and computing infrastructures. File data must become independent of our storage and server choices. </p>
<p>As stores grow, data migration becomes less feasible. Rip &#8216;n replace gives way to in-place upgrades. </p>
<p>Achieving <i>that</i> means moving to an object storage paradigm. How do we know this will happen? Because it already has. </p>
<p>Object stores at Google and Amazon Web Services are already among the largest storage infrastructures in the world. AWS alone stores over 100 billion objects today. Hundreds of millions of people use object storage every day &#8211; and don&#8217;t even know it.</p>
<p><strong>What is object storage? </strong><br />
Object storage instantiations vary in detail and supported features. However, all object storage has two key characteristics:<br />
	–Individual objects are accessed by a global handle. The handle may, for example, be a hash, a key or something like a URL.<br />
	–Extended metadata. The extended metadata content goes beyond that of traditional file systems and may include additional security and content validation as well as presentation, decompression or other information relating to the content, production or value of the enclosed file.</p>
<p>Like files, objects contain data. But they lack key features that would make them files. They don’t have:<br />
	-Hierarchy. Not only are all objects created equal, they all remain at the same level. You can’t put one object inside another.<br />
	-Names. At least, not human-type names like Claudia_Schiffer or 2006_Taxes.</p>
<p>A user-facing component provides those missing elements. You decide which files belong in which folders. You give the files names. You decide which users have access to which files and what those users can do with those files. </p>
<p>Those choices are embedded in the object metadata so they can be presented as you have organized them. But if you have the object&#8217;s handle you can access it directly.</p>
<p>All objects look alike. Some are bigger and some are smaller, but until we get them dressed and named, they aren’t files. Yet they are a lot closer to files than blocks are. Which means that if you choose to manage objects you no longer have to worry about blocks.</p>
<p>Essentially then, objects are files with an address &#8211; instead of a pathname &#8211; and extra metadata. That is unlike distributed file systems, where the metadata is stored in a metadata server that keeps track of the location of the data on the storage servers.</p>
<p>Some file storage systems are built on object storage repositories. Legacy APIs make file access a requirement for many applications, but URL-style access through HTTP is more flexible in the long run.</p>
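<p>A toy Python sketch may help pin down the split between the flat object store and the user-facing layer that supplies names and folders. Everything here is hypothetical: the <i>ObjectStore</i> and <i>Namespace</i> classes are mine, the handle is a SHA-256 content hash (one of several possible handle schemes), and a real store would spread objects across many servers.</p>

```python
import hashlib


class ObjectStore:
    """Flat store: objects are addressed by a handle and carry their own metadata."""

    def __init__(self):
        self._objects = {}   # handle -> (data, metadata); no hierarchy at all

    def put(self, data, **metadata):
        handle = hashlib.sha256(data).hexdigest()   # here the handle is a content hash
        self._objects[handle] = (data, dict(metadata))
        return handle

    def get(self, handle):
        return self._objects[handle]


class Namespace:
    """User-facing layer that supplies the names and folders objects lack."""

    def __init__(self, store):
        self._store = store
        self._paths = {}     # human-readable path -> handle

    def save(self, path, data, **metadata):
        # The user's naming choice is recorded in the object's metadata.
        handle = self._store.put(data, path=path, **metadata)
        self._paths[path] = handle
        return handle

    def open(self, path):
        data, _metadata = self._store.get(self._paths[path])
        return data


store = ObjectStore()
files = Namespace(store)
handle = files.save("/taxes/2006_Taxes.pdf", b"form data", owner="robin")

data, meta = store.get(handle)   # direct access by handle, bypassing the namespace
```

<p>The same bytes are reachable by the human-friendly path or directly by the handle, and the metadata travels with the object.</p>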
<p><strong>Crossing the implementation chasm</strong><br />
While the economics of objects are obvious at scale, they are less compelling at the beginning of a typical enterprise project. It is easier to buy another file server than to worry about long-term architecture. </p>
<p>Here&#8217;s a rough diagram of the relative scalability of storage options:</p>
<p><a href="http://storagemojo.com/wp-content/uploads//2010/10/cloud_quadrant_object.jpg"><img src="http://storagemojo.com/wp-content/uploads//2010/10/cloud_quadrant_object.jpg" alt="" title="cloud_quadrant_object" width="485" height="452" class="aligncenter size-full wp-image-2190" /></a></p>
<p>When under-12-month paybacks are expected, who will buy an object storage infrastructure? The simple answer is that as object stores become better known and startup costs are reduced, more companies will buy them. Archives will be the first market. The longer answer is that as public cloud projects are brought inside, object stores will receive them. </p>
<p><strong>The StorageMojo take</strong><br />
As organizations amass large file collections, the economies of scale and management for object storage will become apparent. Savvy architects will add commodity-based scale-out object storage to their tool kit. </p>
<p>HDS, NetApp and HP have recently added modern object stores to their product lines. And rumor has it EMC will too, either by getting Atmos to work or by buying Isilon. </p>
<p><strong>Courteous comments welcome, of course.</strong> Still don&#8217;t like the name object, but I&#8217;ll get over it.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2010/10/18/objectively-speaking-the-future-of-objects/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
		</item>
		<item>
		<title>Calling all grad students</title>
		<link>http://storagemojo.com/2010/10/04/calling-all-grad-students/</link>
		<comments>http://storagemojo.com/2010/10/04/calling-all-grad-students/#comments</comments>
		<pubDate>Mon, 04 Oct 2010 12:20:12 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Cloud computing & storage]]></category>
		<category><![CDATA[Future Tech]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2160</guid>
		<description><![CDATA[The friendly folks at Scality have put up $100,000 to encourage open source development of useful cloud storage bits. It&#8217;s open to anyone, not just grad students. Yup, it&#8217;s corporate self-interest at work &#8211; Scality sells object-based cloud storage software &#8211; but they&#8217;re taking an enlightened approach. The resulting code will be open source and [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>The friendly folks at <a href="http://www.scality.com/" target="_blank">Scality</a> have put up $100,000 to encourage open source development of useful cloud storage bits. It&#8217;s open to anyone, not just grad students.</p>
<p>Yup, it&#8217;s corporate self-interest at work &#8211; Scality sells object-based cloud storage software &#8211; but they&#8217;re taking an enlightened approach. The resulting code will be open source and is intended to work with a variety of object-based cloud storage services &#8211; such as S3 &#8211; not just theirs. </p>
<p><strong>Real money</strong><br />
The defined projects have bonuses of $2,000 to $10k.  The projects include a <a href="http://scop.scality.com/2010/09/gallery-3-sd_gallery.html" target="_blank">Gallery plugin</a> for the </p>
<blockquote><p>
. . . full replacement of the underlying filesystem based storage of content/objects with object storage using the REST interface . . . .
</p></blockquote>
<p>There&#8217;s a <a href="http://scop.scality.com/2010/09/wordpress-plugin.html" target="_blank">WordPress plugin</a> project to add an object storage backend to the popular CMS. The $10,000 prize goes for a <a href="http://scop.scality.com/2010/09/kvm-virtualization-storage-engine-sd_linuxkvm.html" target="_blank">KVM virtualization storage engine</a> for the Linux Kernel-based Virtual Machine that provides:</p>
<blockquote><p>
. . . block level storage volumes that can be attached to KVM virtual machines. The solution should not require any central node, for example, no central meta-data server and provide a completely stateless operation model.
</p></blockquote>
<p>Here&#8217;s the <a href="http://scop.scality.com/scop-drops-bounty-list.html" target="_blank">list of defined projects</a>.</p>
<p><strong>Even better</strong><br />
And almost ⅔ of the money remains uncommitted. If you have an idea for domesticating object storage in the cloud &#8211; propose it.</p>
<p><strong>The StorageMojo take</strong><br />
Objects are the future of large-scale storage. If cutting edge stuff gets your heart pumping, this is a good place to start.</p>
<p>Or just collect some cash and move on. Your choice.</p>
<p>Feel free to ask questions in the comments. I&#8217;ll ping the Scality guys to get answers.</p>
<p><strong>Courteous comments welcome, of course.</strong> I&#8217;ve done some work for Scality and like the team. I&#8217;m also wondering why Amazon hasn&#8217;t done something like this.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2010/10/04/calling-all-grad-students/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Cloud&#8217;s app killer  </title>
		<link>http://storagemojo.com/2010/08/05/clouds-app-killer%e2%80%a8%e2%80%a8/</link>
		<comments>http://storagemojo.com/2010/08/05/clouds-app-killer%e2%80%a8%e2%80%a8/#comments</comments>
		<pubDate>Fri, 06 Aug 2010 04:23:35 +0000</pubDate>
		<dc:creator>Robin Harris</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Cloud computing & storage]]></category>
		<category><![CDATA[Future Tech]]></category>

		<guid isPermaLink="false">http://storagemojo.com/?p=2105</guid>
		<description><![CDATA[Concall today with Bryan Cantrill, the smart guy behind Dtrace. Dtrace was the engine behind Sun&#8217;s Oracle&#8217;s Fishworks server and application monitor. Dtrace has also been incorporated into OS X. Bryan left Oracle last week and started Monday at Joyent the cloud infrastructure provider, as VP of engineering. Why? Bryan is an instrumentation geek. He [...]]]></description>
			<content:encoded><![CDATA[<p>Concall today with Bryan Cantrill, the smart guy behind <a href="http://en.wikipedia.org/wiki/DTrace" target="_blank">DTrace</a>. DTrace was the engine behind <strike>Sun&#8217;s</strike> Oracle&#8217;s Fishworks server and application monitor. DTrace has also been incorporated into OS X.</p>
<p>Bryan left Oracle last week and started Monday at <a href="http://www.joyent.com/" target="_blank">Joyent</a>, the cloud infrastructure provider, as VP of engineering. Why?</p>
<p>Bryan is an instrumentation geek. He really wants to know what&#8217;s going on. Instrumentation in the cloud is the next big challenge.</p>
<p>That makes sense: there are so many moving parts that understanding and resolving performance and availability issues will be critical to the widespread adoption of cloud. </p>
<p><strong>Tech epiphanies</strong><br />
Bryan described 3 technology epiphanies that he&#8217;s enjoyed. The 1st was when he saw Java for the first time back in 1995. The 2nd was when he saw a Ruby on Rails video about deploying a web app.</p>
<p>His 3rd epiphany came recently when he saw something called node.js. Developed by Ryan Dahl, it turns the JavaScript paradigm on its head: node.js runs on the server, not the client.</p>
<p><strong>Latency bubbles</strong><br />
We know that server I/O latency can kill performance. It&#8217;s even worse in the cloud.</p>
<p>A single bad drive can hose a server if the app is holding locks. What if you have a webpage that relies on five different Web services or, as many Amazon pages do, 150 services?</p>
<p>You need an infrastructure that is resilient in the face of long latency while maintaining high throughput. Bryan says that most failures are not hard failures but are latency bubbles that cascade out and lock up the rest of the infrastructure.</p>
<p>Ryan took Google&#8217;s V8 JavaScript engine and extended it so you can handle long-latency events without locking up the server.</p>
<p>Ryan does a fine job <a href="http://www.youtube.com/watch?v=F6k8lTrAE2g" target="_blank">introducing node.js</a> in a one-hour Google Tech Talk last week. He outlined how to build a server that can handle 10,000 or more users. His goal with node.js was to make it easy to write high-performance servers.</p>
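<p>A minimal sketch of that idea, assuming a modern Node.js runtime. The names <code>callService</code> and <code>fanOut</code> are hypothetical, and <code>setTimeout</code> stands in for real service calls. The slow call is issued first, yet the fast replies come back without waiting for it.</p>

```javascript
// Hypothetical stand-in for a remote service: calls back after `ms` milliseconds.
function callService(name, ms, callback) {
  setTimeout(() => callback(name), ms);
}

// Issue every call up front, then collect names in the order the replies arrive.
function fanOut(services, done) {
  const finished = [];
  services.forEach(([name, ms]) => {
    callService(name, ms, (reply) => {
      finished.push(reply);
      if (finished.length === services.length) done(finished);
    });
  });
}

// The slow call goes out first, but it does not hold up the fast ones.
fanOut([['slow-disk', 200], ['cart', 10], ['reviews', 20]], (order) => {
  console.log(order); // [ 'cart', 'reviews', 'slow-disk' ]
});
```

<p>The total wait is set by the slowest reply, not by the sum of all of them, which is the property you want when a page depends on many services.</p>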
<p><a href="http://storagemojo.com/wp-content/uploads//2010/08/nodejs_architecture1.jpg"><img src="http://storagemojo.com/wp-content/uploads//2010/08/nodejs_architecture1.jpg" alt="" title="nodejs_architecture" width="470" height="404" class="aligncenter size-full wp-image-2113" /></a></p>
<p>There is an arms race out there for JavaScript performance – Google, Apple, Mozilla, Opera, Microsoft – to win the hearts and eyeballs of hundreds of millions of consumers. Fickle consumers.</p>
<p>Node.js exposes only nonblocking, asynchronous interfaces to the programmer. It has very few abstractions. Its power lies in the fact that it moves you away from interfaces, like synchronous I/O, that you shouldn&#8217;t use.</p>
<p>You don&#8217;t have to worry about some event completing and taking over while you&#8217;re in the middle of something else. Each node.js instance is a single thread. If you want to do more work, you start multiple node.js instances and let the kernel do the load balancing.</p>
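<p>To make the nonblocking style concrete, here is a rough sketch using Node&#8217;s <code>fs.writeFile</code> and <code>fs.readFile</code> calls. The helper name <code>readWithoutBlocking</code> and the scratch file name are made up for illustration.</p>

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: write a scratch file, read it back, never block the thread.
function readWithoutBlocking(file, done) {
  const events = [];
  fs.writeFile(file, 'hello', (writeErr) => {
    if (writeErr) return done(writeErr);
    fs.readFile(file, 'utf8', (readErr, data) => {
      if (readErr) return done(readErr);
      events.push('read finished: ' + data);
      done(null, events);
    });
  });
  // Runs before either callback fires: the I/O calls above returned immediately.
  events.push('request accepted');
}

const scratch = path.join(os.tmpdir(), 'node-async-sketch.txt');
readWithoutBlocking(scratch, (err, events) => {
  if (err) throw err;
  console.log(events); // [ 'request accepted', 'read finished: hello' ]
});
```

<p>The thread is free to accept other work while the disk does its thing. The result shows up later in a callback.</p>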
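<p>Here is a small sketch of that run-to-completion behavior, assuming a modern Node.js runtime. <code>handleRequest</code> is a hypothetical name. The timer comes due while the handler is still busy, but its callback cannot cut in.</p>

```javascript
// Hypothetical handler: records what happens, in order, on the single thread.
function handleRequest(done) {
  const log = [];

  // Ask for a callback as soon as possible.
  setTimeout(() => {
    log.push('timer callback');
    done(log);
  }, 0);

  log.push('handler started');
  // Spin past the timer's due time; the callback still has to wait its turn.
  const deadline = Date.now() + 20;
  while (deadline > Date.now()) { /* busy work */ }
  log.push('handler finished');
}

handleRequest((log) => {
  console.log(log); // [ 'handler started', 'handler finished', 'timer callback' ]
});
```

<p>The flip side is that the busy loop holds up everything else in that instance, which is why more work means more instances.</p>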
<p>Memory isolation is enforced at the process boundary. The kernel manages it, not the coder. That&#8217;s a good thing.</p>
<p><strong>The StorageMojo take</strong><br />
Latency is the app killer of the cloud. The current cloud focus on write once/read never apps reflects that.</p>
<p>The fight against latency proceeds on many fronts: storage, network, CPU, and software. <a href="http://www.asankya.com/" target="_blank">Asankya</a> and others have good ideas for reducing Internet latency. Flash architectures are undergoing rapid evolution. Multicore and multiprocessor servers are attacking throughput.</p>
<p>Node.js is a big step in the right direction. Removing the dependencies that synchronous I/O creates means a more resilient and higher-performance infrastructure. Ryan reports that a Japanese website is already running several hundred thousand users on node.js instances.</p>
<p>As for Bryan, he&#8217;ll bring the same intelligence and energy to Joyent that he brought to DTrace and Fishworks. Expect more great things.</p>
<p><strong>Courteous comments welcome, of course.</strong> <strong>Update:</strong> The other smart guys behind DTrace are the redoubtable <a href="http://blogs.sun.com/ahl/category/DTrace" target="_blank">Adam Leventhal</a> and Mike Shapiro.</p>
			]]></content:encoded>
			<wfw:commentRss>http://storagemojo.com/2010/08/05/clouds-app-killer%e2%80%a8%e2%80%a8/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

