<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: De-duplicating primary storage</title>
	<atom:link href="http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/feed/" rel="self" type="application/rss+xml" />
	<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/</link>
	<description>Data storage info &#38; analysis</description>
	<lastBuildDate>Fri, 19 Mar 2010 09:23:11 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: HPCC - HPCC - DELL COMMUNITY</title>
		<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/comment-page-1/#comment-200356</link>
		<dc:creator>HPCC - HPCC - DELL COMMUNITY</dc:creator>
		<pubDate>Wed, 15 Apr 2009 13:31:34 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=957#comment-200356</guid>
		<description>[...] after a small window of time, data is rarely reopened or touched. Robin Harris then wrote another blog that talked about some implications for this study on de-duplication. In particular, he made a [...]</description>
		<content:encoded><![CDATA[<p>[...] after a small window of time, data is rarely reopened or touched. Robin Harris then wrote another blog that talked about some implications for this study on de-duplication. In particular, he made a [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: dirkmeister.de &#187; Blog Archive &#187; Deduplication as Primary Data Storage</title>
		<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/comment-page-1/#comment-198340</link>
		<dc:creator>dirkmeister.de &#187; Blog Archive &#187; Deduplication as Primary Data Storage</dc:creator>
		<pubDate>Wed, 05 Nov 2008 08:29:48 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=957#comment-198340</guid>
		<description>[...] The StorageMojo blog writes in an article:  So what percentage de-dup compression of unstructured data is feasible? That is the key to [...]</description>
		<content:encoded><![CDATA[<p>[...] The StorageMojo blog writes in an article:  So what percentage de-dup compression of unstructured data is feasible? That is the key to [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeremy</title>
		<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/comment-page-1/#comment-198187</link>
		<dc:creator>Jeremy</dc:creator>
		<pubDate>Mon, 27 Oct 2008 17:21:34 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=957#comment-198187</guid>
		<description>I was involved in a project evaluating dedupe for backup, but we ended up moving in the direction of DataDomain&#039;s inline deduplication. In a proof of concept using DataDomain, we were able to get their advertised 1TB/hr rate. We experimented with direct database backups even though DataDomain usually seems to target VTL solutions. We chatted about deduped primary storage, but I haven&#039;t personally been involved in any projects yet to actually try it. And NetApp probably has a better proposition for that; I&#039;m just guessing, but inline dedupe is probably too computationally expensive at the moment to be feasible.</description>
		<content:encoded><![CDATA[<p>I was involved in a project evaluating dedupe for backup, but we ended up moving in the direction of DataDomain&#8217;s inline deduplication. In a proof of concept using DataDomain, we were able to get their advertised 1TB/hr rate. We experimented with direct database backups even though DataDomain usually seems to target VTL solutions. We chatted about deduped primary storage, but I haven&#8217;t personally been involved in any projects yet to actually try it. And NetApp probably has a better proposition for that; I&#8217;m just guessing, but inline dedupe is probably too computationally expensive at the moment to be feasible.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Joe Kraska</title>
		<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/comment-page-1/#comment-197949</link>
		<dc:creator>Joe Kraska</dc:creator>
		<pubDate>Sun, 12 Oct 2008 02:21:26 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=957#comment-197949</guid>
		<description>The guarantee is mostly there to provide comfort to buyers. Most of our virtual machine volumes are at or near 80% recoup rates from NetApp&#039;s dedup.

Joe.</description>
		<content:encoded><![CDATA[<p>The guarantee is mostly there to provide comfort to buyers. Most of our virtual machine volumes are at or near 80% recoup rates from NetApp&#8217;s dedup.</p>
<p>Joe.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: NetApp&#8217;s 50% Guarantee : techmute.com</title>
		<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/comment-page-1/#comment-197908</link>
		<dc:creator>NetApp&#8217;s 50% Guarantee : techmute.com</dc:creator>
		<pubDate>Mon, 06 Oct 2008 03:38:00 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=957#comment-197908</guid>
		<description>[...] Robin Harris (Independent Analyst):  Robin didn&#8217;t discuss the guarantee, other than use it as a jumping-off point for primary storage de-dup.  &#8220;If the feature is free, de-duping some primary storage will be standard practice in most data centers within 5 years. As the de-dup technology improves and Moore’s Law drives performance, more and more unstructured data will be de-dup’d as a matter of course.&#8221; [...]</description>
		<content:encoded><![CDATA[<p>[...] Robin Harris (Independent Analyst):  Robin didn&#8217;t discuss the guarantee, other than use it as a jumping-off point for primary storage de-dup.  &#8220;If the feature is free, de-duping some primary storage will be standard practice in most data centers within 5 years. As the de-dup technology improves and Moore’s Law drives performance, more and more unstructured data will be de-dup’d as a matter of course.&#8221; [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Joe Kraska</title>
		<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/comment-page-1/#comment-197898</link>
		<dc:creator>Joe Kraska</dc:creator>
		<pubDate>Sun, 05 Oct 2008 14:48:04 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=957#comment-197898</guid>
		<description>We have NetApp systems running dedup on primary storage in our environment. This doesn&#039;t slow things down in any appreciable manner at all. I believe NetApp is saying that the 7.2.4 release will contain changes to facilitate dups and cache hits, which could very well end up providing performance *increases* in a highly duplicative environment, as with VMware.

I only wish I&#039;d known about the 2TB limit long ago. We have some &gt;2TB volumes, and migrating off of them would be... painful.

Joe Kraska</description>
		<content:encoded><![CDATA[<p>We have NetApp systems running dedup on primary storage in our environment. This doesn&#8217;t slow things down in any appreciable manner at all. I believe NetApp is saying that the 7.2.4 release will contain changes to facilitate dups and cache hits, which could very well end up providing performance *increases* in a highly duplicative environment, as with VMware.</p>
<p>I only wish I&#8217;d known about the 2TB limit long ago. We have some &gt;2TB volumes, and migrating off of them would be&#8230; painful.</p>
<p>Joe Kraska</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ausmith1</title>
		<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/comment-page-1/#comment-197882</link>
		<dc:creator>Ausmith1</dc:creator>
		<pubDate>Fri, 03 Oct 2008 00:18:38 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=957#comment-197882</guid>
		<description>Here is the sanitized output from &#039;df -s -h&#039; on one of our filers; it houses about 250 Windows-based ESX development VMs on VMFS volumes.
Filesystem      used      saved     %saved
/vol/vol0/      648MB     0MB       0%
/vol/vol1/      731GB     1230GB    63%
/vol/vol2/      356GB     299GB     46%
/vol/vol3/      9639MB    10GB      53%
/vol/vol4/      108GB     1302GB    92%
/vol/vol5/      158GB     500GB     76%
/vol/vol6/      176GB     903GB     84%
/vol/vol7/      186GB     290GB     61%
/vol/vol8/      148GB     36GB      20%
/vol/vol9/      71GB      53GB      43%
/vol/vola/      150GB     236GB     61%
/vol/volb/      268GB     397GB     60%
/vol/volc/      146GB     42GB      22%

That makes 2.5TB of disk space used and 5.3TB saved by my count.

There are some volumes that ASIS is not enabled on, so I have not included them in this output. The only reason ASIS is not enabled on them is that they are large (&gt;2TB) volumes created before ASIS was freely available. Whether ASIS can be enabled on a volume depends on the size of the volume relative to the RAM available in the filer; i.e., the largest volume this particular filer can handle is 2TB. A 6000 series filer can handle 16TB ASIS volumes.</description>
		<content:encoded><![CDATA[<p>Here is the sanitized output from &#8216;df -s -h&#8217; on one of our filers; it houses about 250 Windows-based ESX development VMs on VMFS volumes.<br />
Filesystem			used      	saved    	%saved<br />
/vol/vol0/          648MB     	0MB         0%<br />
/vol/vol1/      	731GB     	1230GB      63%<br />
/vol/vol2/      	356GB      	299GB       46%<br />
/vol/vol3/     		9639MB      10GB        53%<br />
/vol/vol4/      	108GB     	1302GB      92%<br />
/vol/vol5/       	158GB      	500GB       76%<br />
/vol/vol6/      	176GB      	903GB       84%<br />
/vol/vol7/      	186GB      	290GB       61%<br />
/vol/vol8/       	148GB     	36GB        20%<br />
/vol/vol9/       	71GB       	53GB        43%<br />
/vol/vola/          150GB      	236GB       61%<br />
/vol/volb/      	268GB      	397GB       60%<br />
/vol/volc/      	146GB       42GB        22%</p>
<p>That makes 2.5TB of disk space used and 5.3TB saved by my count.</p>
<p>There are some volumes that ASIS is not enabled on, so I have not included them in this output. The only reason ASIS is not enabled on them is that they are large (&gt;2TB) volumes created before ASIS was freely available. Whether ASIS can be enabled on a volume depends on the size of the volume relative to the RAM available in the filer; i.e., the largest volume this particular filer can handle is 2TB. A 6000 series filer can handle 16TB ASIS volumes.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Are you Content Aware? &#171; Storage Optimization</title>
		<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/comment-page-1/#comment-197880</link>
		<dc:creator>Are you Content Aware? &#171; Storage Optimization</dc:creator>
		<pubDate>Thu, 02 Oct 2008 18:10:43 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=957#comment-197880</guid>
		<description>[...] October 2, 2008 Tags: NetApp, Robin Harris, StorageMojo   Storage analyst Robin Harris commented on the storage story of the week&#8211;NetApp&#8217;s Guarantee that virtualization will mean a 50% gain in storage capacity for its [...]</description>
		<content:encoded><![CDATA[<p>[...] October 2, 2008 Tags: NetApp, Robin Harris, StorageMojo   Storage analyst Robin Harris commented on the storage story of the week&#8211;NetApp&#8217;s Guarantee that virtualization will mean a 50% gain in storage capacity for its [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: max</title>
		<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/comment-page-1/#comment-197876</link>
		<dc:creator>max</dc:creator>
		<pubDate>Wed, 01 Oct 2008 22:30:45 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=957#comment-197876</guid>
		<description>FWIW

Have ASIS running w/ ESX on a (primary storage w/ ASIS). In my experience, the 50% is a very low bar for NetApp with this setup in a hosted ESX environment (~400 VMs).</description>
		<content:encoded><![CDATA[<p>FWIW</p>
<p>Have ASIS running w/ ESX on a (primary storage w/ ASIS). In my experience, the 50% is a very low bar for NetApp with this setup in a hosted ESX environment (~400 VMs).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: open systems storage guy</title>
		<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/comment-page-1/#comment-197875</link>
		<dc:creator>open systems storage guy</dc:creator>
		<pubDate>Wed, 01 Oct 2008 20:10:19 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=957#comment-197875</guid>
		<description>I&#039;ve used it; it&#039;s not for all workloads, but it&#039;s a nice feature for low-use file systems and whatnot. I wouldn&#039;t suggest it on anything that really hits the controllers heavily, because every time a write is done, a process running on the filer hashes the data, which creates something like a 5% processor overhead. During idle times, it goes up considerably, as the algorithm will do a byte-by-byte comparison of all suspected duplicate data chunks before pointing both sections of the volume to the same chunk.

NetApp filers use the overhead everyone&#039;s been complaining about to save space in the end. If you have to clone databases, can thin provision, take snapshots, and have heavily duplicated files, you&#039;ll probably end up with more data stuffed into your filer than you could get in an equivalent traditional disk box. If you don&#039;t, however, then you&#039;ll need more disks in your filer than you would otherwise.</description>
		<content:encoded><![CDATA[<p>I&#8217;ve used it; it&#8217;s not for all workloads, but it&#8217;s a nice feature for low-use file systems and whatnot. I wouldn&#8217;t suggest it on anything that really hits the controllers heavily, because every time a write is done, a process running on the filer hashes the data, which creates something like a 5% processor overhead. During idle times, it goes up considerably, as the algorithm will do a byte-by-byte comparison of all suspected duplicate data chunks before pointing both sections of the volume to the same chunk.</p>
<p>NetApp filers use the overhead everyone&#8217;s been complaining about to save space in the end. If you have to clone databases, can thin provision, take snapshots, and have heavily duplicated files, you&#8217;ll probably end up with more data stuffed into your filer than you could get in an equivalent traditional disk box. If you don&#8217;t, however, then you&#8217;ll need more disks in your filer than you would otherwise.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Cinetica Blog &#187; Deduplication sullo storage primario</title>
		<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/comment-page-1/#comment-197869</link>
		<dc:creator>Cinetica Blog &#187; Deduplication sullo storage primario</dc:creator>
		<pubDate>Wed, 01 Oct 2008 12:36:25 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=957#comment-197869</guid>
		<description>[...] I have confirmation, for the umpteenth time, also from a post on storagemojo that I read [...]</description>
		<content:encoded><![CDATA[<p>[...] I have confirmation, for the umpteenth time, also from a post on storagemojo that I read [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Steven Schwartz</title>
		<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/comment-page-1/#comment-197865</link>
		<dc:creator>Steven Schwartz</dc:creator>
		<pubDate>Tue, 30 Sep 2008 22:26:14 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=957#comment-197865</guid>
		<description>Come on, Robin, did you read the NetApp release?  Everyone has written about it already: they never claim a 50% reduction in storage required due to deduplication; the claim is based on several things...I posted a silly but funny corollary on my blog.

http://thesantechnologist.com/?p=122</description>
		<content:encoded><![CDATA[<p>Come on, Robin, did you read the NetApp release?  Everyone has written about it already: they never claim a 50% reduction in storage required due to deduplication; the claim is based on several things&#8230;I posted a silly but funny corollary on my blog.</p>
<p><a href="http://thesantechnologist.com/?p=122" rel="nofollow">http://thesantechnologist.com/?p=122</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: TylerB</title>
		<link>http://storagemojo.com/2008/09/30/de-duplicating-primary-storage/comment-page-1/#comment-197864</link>
		<dc:creator>TylerB</dc:creator>
		<pubDate>Tue, 30 Sep 2008 22:12:20 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=957#comment-197864</guid>
		<description>Robin-
 (disclaimer: I work for an NTAP Partner)
This does work and we have a ton of customers using it. While unstructured data is decent (30% is common), VMware is THE killer app for primary storage dedupe. We have plenty of customers at 70, 80, and even 90% dedupe rates. The beauty of it is that since it&#039;s post-process, it has no noticeable effect on the live data.
Basically we&#039;ve either been installing new NetApp arrays or fronting older ones with v-series all over the place.</description>
		<content:encoded><![CDATA[<p>Robin-<br />
 (disclaimer: I work for an NTAP Partner)<br />
This does work and we have a ton of customers using it. While unstructured data is decent (30% is common), VMware is THE killer app for primary storage dedupe. We have plenty of customers at 70, 80, and even 90% dedupe rates. The beauty of it is that since it&#8217;s post-process, it has no noticeable effect on the live data.<br />
Basically we&#8217;ve either been installing new NetApp arrays or fronting older ones with v-series all over the place.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
