<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: FastMail fights data corruption</title>
	<atom:link href="http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/feed/" rel="self" type="application/rss+xml" />
	<link>http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/</link>
	<description>Data storage info &#38; analysis</description>
	<pubDate>Mon, 13 Oct 2008 14:17:05 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.2</generator>
		<item>
		<title>By: Robert Milkowski</title>
		<link>http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/#comment-159449</link>
		<dc:creator>Robert Milkowski</dc:creator>
		<pubDate>Sat, 29 Dec 2007 11:51:20 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/#comment-159449</guid>
		<description>I know first hand of a rather large email system (3GB free account, 4+ million active users) which is running on ZFS.</description>
		<content:encoded><![CDATA[<p>I know first hand of a rather large email system (3GB free account, 4+ million active users) which is running on ZFS.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: xfer_rdy</title>
		<link>http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/#comment-157493</link>
		<dc:creator>xfer_rdy</dc:creator>
		<pubDate>Thu, 20 Dec 2007 22:30:42 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/#comment-157493</guid>
		<description>Hi Bron,

I'm confused... ZFS (Sun's) doesn't have a built-in file replication engine, only file-system-to-disk integrity.

If your underlying execution platform is having data integrity errors, like the one you have described (it must be Linux), the integrity of the application must be called into question. File systems like Ext3 in Linux have some very famous bugs, including throwing data away without informing the application. While there are programmatic techniques to detect these errors, most developers do not apply them. The Linux swap file system does NO end-to-end integrity checking. The fact that the mail system's program modules are executing on a platform with a poor level of data integrity brings the integrity of the executing application environment into question.

It's great to have data integrity checking, and most people agree not to rely on a single point for any system feature. However, most developers ignore the application execution space when it comes to integrity validation. It wouldn't be the first time I've seen multiple copies of files corrupted by the application that wrote them. Most data recovery actions are not due to hardware failures, but due to operator error and application defects - yes, both copies.

Before being lulled into a false sense of security in your application, realize there are other issues much more significant, in terms of "true" integrity, other than some hash value with the data.</description>
		<content:encoded><![CDATA[<p>Hi Bron,</p>
<p>I&#8217;m confused&#8230; ZFS (Sun&#8217;s) doesn&#8217;t have a built-in file replication engine, only file-system-to-disk integrity.</p>
<p>If your underlying execution platform is having data integrity errors, like the one you have described (it must be Linux), the integrity of the application must be called into question. File systems like Ext3 in Linux have some very famous bugs, including throwing data away without informing the application. While there are programmatic techniques to detect these errors, most developers do not apply them. The Linux swap file system does NO end-to-end integrity checking. The fact that the mail system&#8217;s program modules are executing on a platform with a poor level of data integrity brings the integrity of the executing application environment into question.</p>
<p>It&#8217;s great to have data integrity checking, and most people agree not to rely on a single point for any system feature. However, most developers ignore the application execution space when it comes to integrity validation. It wouldn&#8217;t be the first time I&#8217;ve seen multiple copies of files corrupted by the application that wrote them. Most data recovery actions are not due to hardware failures, but due to operator error and application defects - yes, both copies.</p>
<p>Before being lulled into a false sense of security in your application, realize there are other issues much more significant, in terms of &#8220;true&#8221; integrity, other than some hash value with the data.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bron Gondwana</title>
		<link>http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/#comment-157178</link>
		<dc:creator>Bron Gondwana</dc:creator>
		<pubDate>Thu, 20 Dec 2007 00:46:10 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/#comment-157178</guid>
		<description>Disclaimer: I'm the FastMail developer who wrote most of our integrity checking systems.

Data integrity is the responsibility of every layer.  The underlying storage (like ZFS indeed) can only guarantee the integrity of what it's given.  If your replication engine makes a mistake, or your email software contains a bug somewhere which causes it to go on an index hosing spree, then your filesystem can present a perfectly good copy of the hosed files.  It doesn't understand the format enough to know that they're wrong.

This is the same reason that we don't use block level replication for our filesystem, because we have seen filesystem bugs before, and we don't want a filesystem bug to lose both copies.  Better to have the replication engine bail out saying the sha1 doesn't match when it tries to copy the message.

So yeah, it's great to have a filesystem that does integrity checking for you, but concepts like "defence in depth" from the security world map very nicely to the data integrity world as well.  Don't trust everything to one layer.</description>
		<content:encoded><![CDATA[<p>Disclaimer: I&#8217;m the FastMail developer who wrote most of our integrity checking systems.</p>
<p>Data integrity is the responsibility of every layer.  The underlying storage (like ZFS indeed) can only guarantee the integrity of what it&#8217;s given.  If your replication engine makes a mistake, or your email software contains a bug somewhere which causes it to go on an index hosing spree, then your filesystem can present a perfectly good copy of the hosed files.  It doesn&#8217;t understand the format enough to know that they&#8217;re wrong.</p>
<p>This is the same reason that we don&#8217;t use block level replication for our filesystem, because we have seen filesystem bugs before, and we don&#8217;t want a filesystem bug to lose both copies.  Better to have the replication engine bail out saying the sha1 doesn&#8217;t match when it tries to copy the message.</p>
<p>So yeah, it&#8217;s great to have a filesystem that does integrity checking for you, but concepts like &#8220;defence in depth&#8221; from the security world map very nicely to the data integrity world as well.  Don&#8217;t trust everything to one layer.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robert Chien</title>
		<link>http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/#comment-157152</link>
		<dc:creator>Robert Chien</dc:creator>
		<pubDate>Wed, 19 Dec 2007 21:50:26 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/#comment-157152</guid>
		<description>Shouldn't data integrity be the responsibility of the underlying storage (like ZFS) and not the application (messaging in this case) that runs on top?</description>
		<content:encoded><![CDATA[<p>Shouldn&#8217;t data integrity be the responsibility of the underlying storage (like ZFS) and not the application (messaging in this case) that runs on top?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gimlet</title>
		<link>http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/#comment-157139</link>
		<dc:creator>Gimlet</dc:creator>
		<pubDate>Wed, 19 Dec 2007 20:02:04 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/#comment-157139</guid>
		<description>Hi Robin,

Sendmail (and Postfix as well) is strictly an MTA (Mail Transfer Agent), that is, it really only moves mail from one host to another.  Once at the right host, it will hand the mail to a backend to deliver the mail to the user's inbox.  So, the question then becomes, how does the backend handle it?</description>
		<content:encoded><![CDATA[<p>Hi Robin,</p>
<p>Sendmail (and Postfix as well) is strictly an MTA (Mail Transfer Agent), that is, it really only moves mail from one host to another.  Once at the right host, it will hand the mail to a backend to deliver the mail to the user&#8217;s inbox.  So, the question then becomes, how does the backend handle it?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nathan</title>
		<link>http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/#comment-157104</link>
		<dc:creator>Nathan</dc:creator>
		<pubDate>Wed, 19 Dec 2007 14:55:39 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2007/12/19/fastmail-fights-data-corruption/#comment-157104</guid>
		<description>Seems like there is an "About" page (http://www.fastmail.fm/pages/fastmail/docs/about.html). They just look like another hosted mail provider. They give you 10M of storage space for free and IMAP access. Whoo.</description>
		<content:encoded><![CDATA[<p>Seems like there is an &#8220;About&#8221; page (http://www.fastmail.fm/pages/fastmail/docs/about.html). They just look like another hosted mail provider. They give you 10M of storage space for free and IMAP access. Whoo.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

