<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Latent sector errors in disk drives</title>
	<atom:link href="http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/feed/" rel="self" type="application/rss+xml" />
	<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/</link>
	<description>Data storage info &#38; analysis</description>
	<pubDate>Fri, 16 May 2008 14:18:01 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
		<item>
		<title>By: Bill Todd</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-176621</link>
		<dc:creator>Bill Todd</dc:creator>
		<pubDate>Sat, 01 Mar 2008 02:25:19 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-176621</guid>
		<description>"Yes, there are “bad” disks: 0.2% of the drives had more than 1000 errors."

You seem to have skimmed the paper a bit too quickly:  0.2% *of the 3.45% of the disks that had errors* had more than 1000 errors (i.e., about one drive in 14,000).

"file systems that replicate critical data across the disk are much less likely to lose your data than those, like ReiserFS, place critical structures in one contiguous area"

Hogwash:  you're almost exactly as likely to lose data with one as with the other, because the chances that an LSE (or even a bunch of adjacent LSEs) will affect more than data in a single user file is minute.

The most you could say is that you're far less likely to lose *all* your data due to LSEs when critical metadata is replicated across the disk, but since the likelihood that LSEs will just happen to hit such critical metadata is already infinitesimal further reducing it (even by orders of magnitude) doesn't count for much (investing in a meteor shield might make about as much sense).

And if you introduce any disk-level redundancy (mirrored or parity) into your system the likelihood that you'll lose more than data in a single file due to LSEs (even if you don't scrub at all) becomes pretty much indistinguishable from zero.

The reason that file system designers sometimes distribute multiple copies of critical metadata across a single disk is because it doesn't cost much (in terms of space or performance) to do and helps give users a warm, fuzzy feeling, not because it's likely to impact availability in any significant way with respect to LSE problems (though it can help with more drastic events such as head crashes that create a far wider path of devastation while still leaving portions of the disk accessible).

"In the analyzed drives over 60% of LSE were found by scrubbing."

Another instance of too quick a skim, I suspect:  while this casual observation does occur in section 6.2, the more detailed presentation in section 5.5 states that 61.5% of the LSEs in *enterprise* drives were discovered by scrubbing while 86.6% of the LSEs in *nearline* disks were discovered by scrubbing, for an overall average of 77.4%.

"Scrubbing is a high-end feature that works."

Perhaps a great deal better than you realize, even after taking the above corrections into account:  apparently the only reason that scrubbing did not detect 100% of the LSEs was because user read and write operations discovered them before the scrub had a chance to, though it would have been nice to see it stated explicitly that no unrecoverable sectors (at least within the 0.1% rounding error) occurred during reconstruction after a disk failure during the test period.

"I think drive vendors could get a nice premium on LER - low error rate - drives, if they positioned them correctly."

That suggestion is just as silly as it was when I responded to it 7 months ago in http://storagemojo.com/2007/07/19/why-arent-disk-reads-more-reliable/ - where I didn't even mention the fact that demanding a *premium* for such drives would make them even less competitive (compared with using conventional drives in larger numbers to attain at least comparable reliability with potentially better performance in the bargain).

Hell, it's going to be tough enough preserving a marketable distinction between 'nearline' and 'enterprise' drives (at least with anything like their current price difference:  SATA drives can usually clobber enterprise drives already on price for a given level of aggregate performance in most environments), without attempting to introduce an *additional* new tier.

"after a RAID 5 disk failure, your chances of seeing an unrecoverable read error on the remaining SATA drives is high."

Not if the RAID-group size is reasonable (say, no more than 9 drives - 5 or 6 would be more typical) and you scrub reasonably frequently.

More specifically, this paper found that 91.5% of the nearline (SATA) disks developed no LSEs at all over the 32-month test period.  Of the remaining 8.5% that developed at least one LSE over the 32-month period, the fact that 3.15% developed at least one LSE within 12 months suggests that the rate of incidence of disks with at least one LSE (though not necessarily the total number of LSEs) was roughly linear in time, which would imply that in any two-week interval between complete scrubs (the frequency cited in the paper) there would be about a 0.12% chance that any given SATA drive would develop at least one LSE.

So if you had a 9-disk group and one failed, if you scrubbed every two weeks (and on average the failure occurred half-way through that period) there'd be about a 0.5% chance of encountering one or more LSEs (most disks that had any LSEs at all had only a very few) during reconstruction.  With a more typical 5-disk array there'd be about a 0.25% chance.

While these probabilities are non-negligible, I wouldn't call them 'high' - and in particular, for the typical (5-disk) RAID-5 array the likelihood of encountering *any* unrecoverable data due to LSEs during reconstruction is only 4x as high as it would be for an equivalent RAID-1/RAID-10 (and effectively only around 2.5x as high, considering that you'd need 60% more mirrored disks to achieve the same storage capacity and thus have something like a 60% higher probability of having a disk fail in the first place).
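
The arithmetic behind these estimates can be sketched roughly as follows. This is only an illustration under stated assumptions: incidence is taken as linear in time, disks are treated as independent, and the small per-disk chances are simply added. The function and constant names are hypothetical and do not come from the paper.

```python
# Rough reproduction of the estimate above. Assumptions (not from the paper):
# incidence is linear in time, disks are independent, and per-disk chances
# are small enough to add rather than compound.

ANNUAL_LSE_FRACTION = 0.0315   # nearline disks with at least one LSE within 12 months
WEEKS_PER_YEAR = 52


def per_disk_chance(scrub_interval_weeks):
    """Chance that one disk develops an LSE within a full scrub interval."""
    return ANNUAL_LSE_FRACTION * scrub_interval_weeks / WEEKS_PER_YEAR


def rebuild_chance(group_size, scrub_interval_weeks=2):
    """Chance of meeting an LSE on any surviving disk during reconstruction,
    with the failure landing half-way through the scrub interval on average."""
    surviving = group_size - 1
    exposure = per_disk_chance(scrub_interval_weeks) / 2
    return surviving * exposure


# 9-disk and 5-disk RAID-5 groups, then a 2-disk mirror for comparison.
for disks in (9, 5, 2):
    print(f"{disks}-disk group: {rebuild_chance(disks):.2%}")
```

Under those assumptions the 9-disk, 5-disk and mirrored cases come out near 0.48%, 0.24% and 0.06%, which is where the "about 0.5%", "about 0.25%" and "4x" figures come from; dividing the 4x by the 1.6x disk count needed for mirroring gives the 2.5x.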

You've been clamoring about the dangers of RAID-5 long enough, Robin:  start listening when people try to give you a clue.  You've taken a legitimate problem (the fact that LSEs can interact with disk failures to cause data loss in RAID with non-negligible probability), ignored the major impact that scrubbing can have in reducing that problem (by over two orders of magnitude if you scrub every 2 weeks during a disk's nominal 5-year service life - and that doesn't take into account the impact that regular scrubbing has by causing the disk itself to revector failing but still readable sectors *before* they generate LSEs, which this paper alludes to but does not quantify), and ignored the fact that mirroring suffers from it not much less than typical parity RAIDs do.

"When your RAID controller encounters one it has to say it can’t recover the data, so now it is time to recover from a backup, meaning all the RAID recovery time was wasted."

Only if your controller is brain-damaged:  otherwise, it should report the problem, recover the rest of your data, and leave the unrecoverable (logical) sector marked 'bad' (probably waiting for a subsequent write to 'revector' it) - just as would happen with a single drive when it encounters an unreadable sector (it doesn't just take all its marbles and go home if that happens, and neither should the array).  The likelihood is *overwhelmingly* high that such a bad sector will occur in the data of some single file rather than cause wider damage by occurring in some critical metadata of more global significance.

Of course, as explained above if you scrub reasonably often the chances that your RAID will encounter *any LSEs at all* during reconstruction after a disk failure are quite small (well under 1%), and the chances that it will encounter any that would affect more than a single file are several (usually many) orders of magnitude smaller still.

"if a RAID system knew about files it would be a file system"

So would a disk - see my comment just above:  why, exactly, do you think they should behave differently in this area?

"This paper sounds like a big ad for ZFS/Btrfs. :-) But maybe that’s just my bias talking."

I suspect the latter:  the only advantage that ZFS has in this area over a conventional (non-brain-damaged) RAID-plus-scrubbing approach is in its additional replication of metadata, and the chances that an LSE problem during reconstruction will hit critical metadata (rather than plain old user data) are infinitesimal.

- bill</description>
		<content:encoded><![CDATA[<p>&#8220;Yes, there are “bad” disks: 0.2% of the drives had more than 1000 errors.&#8221;</p>
<p>You seem to have skimmed the paper a bit too quickly:  0.2% *of the 3.45% of the disks that had errors* had more than 1000 errors (i.e., about one drive in 14,000).</p>
<p>&#8220;file systems that replicate critical data across the disk are much less likely to lose your data than those, like ReiserFS, place critical structures in one contiguous area&#8221;</p>
<p>Hogwash:  you&#8217;re almost exactly as likely to lose data with one as with the other, because the chances that an LSE (or even a bunch of adjacent LSEs) will affect more than data in a single user file is minute.</p>
<p>The most you could say is that you&#8217;re far less likely to lose *all* your data due to LSEs when critical metadata is replicated across the disk, but since the likelihood that LSEs will just happen to hit such critical metadata is already infinitesimal further reducing it (even by orders of magnitude) doesn&#8217;t count for much (investing in a meteor shield might make about as much sense).</p>
<p>And if you introduce any disk-level redundancy (mirrored or parity) into your system the likelihood that you&#8217;ll lose more than data in a single file due to LSEs (even if you don&#8217;t scrub at all) becomes pretty much indistinguishable from zero.</p>
<p>The reason that file system designers sometimes distribute multiple copies of critical metadata across a single disk is because it doesn&#8217;t cost much (in terms of space or performance) to do and helps give users a warm, fuzzy feeling, not because it&#8217;s likely to impact availability in any significant way with respect to LSE problems (though it can help with more drastic events such as head crashes that create a far wider path of devastation while still leaving portions of the disk accessible).</p>
<p>&#8220;In the analyzed drives over 60% of LSE were found by scrubbing.&#8221;</p>
<p>Another instance of too quick a skim, I suspect:  while this casual observation does occur in section 6.2, the more detailed presentation in section 5.5 states that 61.5% of the LSEs in *enterprise* drives were discovered by scrubbing while 86.6% of the LSEs in *nearline* disks were discovered by scrubbing, for an overall average of 77.4%.</p>
<p>&#8220;Scrubbing is a high-end feature that works.&#8221;</p>
<p>Perhaps a great deal better than you realize, even after taking the above corrections into account:  apparently the only reason that scrubbing did not detect 100% of the LSEs was because user read and write operations discovered them before the scrub had a chance to, though it would have been nice to see it stated explicitly that no unrecoverable sectors (at least within the 0.1% rounding error) occurred during reconstruction after a disk failure during the test period.</p>
<p>&#8220;I think drive vendors could get a nice premium on LER - low error rate - drives, if they positioned them correctly.&#8221;</p>
<p>That suggestion is just as silly as it was when I responded to it 7 months ago in <a href="http://storagemojo.com/2007/07/19/why-arent-disk-reads-more-reliable/" rel="nofollow">http://storagemojo.com/2007/07/19/why-arent-disk-reads-more-reliable/</a> - where I didn&#8217;t even mention the fact that demanding a *premium* for such drives would make them even less competitive (compared with using conventional drives in larger numbers to attain at least comparable reliability with potentially better performance in the bargain).</p>
<p>Hell, it&#8217;s going to be tough enough preserving a marketable distinction between &#8216;nearline&#8217; and &#8216;enterprise&#8217; drives (at least with anything like their current price difference:  SATA drives can usually clobber enterprise drives already on price for a given level of aggregate performance in most environments), without attempting to introduce an *additional* new tier.</p>
<p>&#8220;after a RAID 5 disk failure, your chances of seeing an unrecoverable read error on the remaining SATA drives is high.&#8221;</p>
<p>Not if the RAID-group size is reasonable (say, no more than 9 drives - 5 or 6 would be more typical) and you scrub reasonably frequently.</p>
<p>More specifically, this paper found that 91.5% of the nearline (SATA) disks developed no LSEs at all over the 32-month test period.  Of the remaining 8.5% that developed at least one LSE over the 32-month period, the fact that 3.15% developed at least one LSE within 12 months suggests that the rate of incidence of disks with at least one LSE (though not necessarily the total number of LSEs) was roughly linear in time, which would imply that in any two-week interval between complete scrubs (the frequency cited in the paper) there would be about a 0.12% chance that any given SATA drive would develop at least one LSE.</p>
<p>So if you had a 9-disk group and one failed, if you scrubbed every two weeks (and on average the failure occurred half-way through that period) there&#8217;d be about a 0.5% chance of encountering one or more LSEs (most disks that had any LSEs at all had only a very few) during reconstruction.  With a more typical 5-disk array there&#8217;d be about a 0.25% chance.</p>
<p>While these probabilities are non-negligible, I wouldn&#8217;t call them &#8216;high&#8217; - and in particular, for the typical (5-disk) RAID-5 array the likelihood of encountering *any* unrecoverable data due to LSEs during reconstruction is only 4x as high as it would be for an equivalent RAID-1/RAID-10 (and effectively only around 2.5x as high, considering that you&#8217;d need 60% more mirrored disks to achieve the same storage capacity and thus have something like a 60% higher probability of having a disk fail in the first place).</p>
<p>You&#8217;ve been clamoring about the dangers of RAID-5 long enough, Robin:  start listening when people try to give you a clue.  You&#8217;ve taken a legitimate problem (the fact that LSEs can interact with disk failures to cause data loss in RAID with non-negligible probability), ignored the major impact that scrubbing can have in reducing that problem (by over two orders of magnitude if you scrub every 2 weeks during a disk&#8217;s nominal 5-year service life - and that doesn&#8217;t take into account the impact that regular scrubbing has by causing the disk itself to revector failing but still readable sectors *before* they generate LSEs, which this paper alludes to but does not quantify), and ignored the fact that mirroring suffers from it not much less than typical parity RAIDs do.</p>
<p>&#8220;When your RAID controller encounters one it has to say it can’t recover the data, so now it is time to recover from a backup, meaning all the RAID recovery time was wasted.&#8221;</p>
<p>Only if your controller is brain-damaged:  otherwise, it should report the problem, recover the rest of your data, and leave the unrecoverable (logical) sector marked &#8216;bad&#8217; (probably waiting for a subsequent write to &#8216;revector&#8217; it) - just as would happen with a single drive when it encounters an unreadable sector (it doesn&#8217;t just take all its marbles and go home if that happens, and neither should the array).  The likelihood is *overwhelmingly* high that such a bad sector will occur in the data of some single file rather than cause wider damage by occurring in some critical metadata of more global significance.</p>
<p>Of course, as explained above if you scrub reasonably often the chances that your RAID will encounter *any LSEs at all* during reconstruction after a disk failure are quite small (well under 1%), and the chances that it will encounter any that would affect more than a single file are several (usually many) orders of magnitude smaller still.</p>
<p>&#8220;if a RAID system knew about files it would be a file system&#8221;</p>
<p>So would a disk - see my comment just above:  why, exactly, do you think they should behave differently in this area?</p>
<p>&#8220;This paper sounds like a big ad for ZFS/Btrfs. <img src='http://storagemojo.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> But maybe that’s just my bias talking.&#8221;</p>
<p>I suspect the latter:  the only advantage that ZFS has in this area over a conventional (non-brain-damaged) RAID-plus-scrubbing approach is in its additional replication of metadata, and the chances that an LSE problem during reconstruction will hit critical metadata (rather than plain old user data) are infinitesimal.</p>
<p>- bill</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Keith S.</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-174720</link>
		<dc:creator>Keith S.</dc:creator>
		<pubDate>Sun, 24 Feb 2008 05:05:56 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-174720</guid>
		<description>It screams ZFS to me too.  That a read or write would fail would seem right up the alley of ZFS's checksum error handling.  An example of ZFS handling what sounds like the same thing this report describes: http://blogs.sun.com/elowe/entry/zfs_saves_the_day_ta</description>
		<content:encoded><![CDATA[<p>It screams ZFS to me too.  That a read or write would fail would seem right up the alley of ZFS&#8217;s checksum error handling.  An example of ZFS handling what sounds like the same thing this report describes: <a href="http://blogs.sun.com/elowe/entry/zfs_saves_the_day_ta" rel="nofollow">http://blogs.sun.com/elowe/entry/zfs_saves_the_day_ta</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Liam Newcombe</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173634</link>
		<dc:creator>Liam Newcombe</dc:creator>
		<pubDate>Wed, 20 Feb 2008 10:15:54 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173634</guid>
		<description>Robin,
Another interesting article;

BER as influenced by 'latent' sector failures is one of the key aspects in understanding the Mean Time to Data Loss (MTTDL) for disk arrays. Once the weighting factors for disk BER, End User failure rate versus manufacturer, failure correlation (which is alluded to in the articles you link) and Batch Diversity are included, the apparent 'reliability' of many large disk arrays is substantially lower than claimed. A simple rule of thumb is that a RAID 6(0) array built up out of sensible size RAID 6 groups (&lt;20) is likely to achieve a MTTDL slightly less (/10) than the unadjusted RAID 5 MTBF based on hardware failures only. This is good news for people who have thrown out their big iron and are using smart file systems that actually talk to the disk in place of a volume manager that obscures disk events, also those file systems that use file level CRC. One key differentiator is that some file systems also read the parity stripe whenever they read the data, which improves the error detection rate. There have been many posts by new ZFS users expressing their horror at the data error rate this class of file system exposes on their existing hardware; previously we probably blamed the data errors that were exposed on the application instead of the real culprit.
Latent sector failure is not a new issue, Hannu H Kari published a paper covering this issue in 1997, "Latent Sector Faults and Reliability of Disk Arrays" where the benefits of scrub technologies were well explored, the data from the paper referenced in this article seems to support his findings.

The claimed 'reliability' of large disk arrays is largely an illusion once you understand that the calculated MTBF from most vendors does not include this type of data loss failure: it only considers simple hardware failure (not maintainability issues created by unnecessary complexity and the loss of fault containment with an uber SAN) and takes no account of the huge performance degradation that many arrays suffer during parity rebuild (think those whose performance tanks when a snapshot is open). The achieved performance availability of many systems is several orders of magnitude lower than claimed, driving yet more expenditure on expensive, power consuming kit to maintain performance.
In terms of mitigating the impacts of the latent sector failure component of BER as described here smart background scan algorithms are undoubtedly effective but a cheap alternative for those running arrays with cheaper controllers or file system / volume managers that do not include this functionality is to simply back up the entire binary volume to dev null. Perversely a weekly backup to dev null, through forcing a complete surface read, can substantially improve the achieved data reliability by forcing the disks to detect sectors that took multiple read attempts and remap them elsewhere on the disk before the data is unreadable.
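
A minimal sketch of that kind of full-surface read, written in Python, might look like the following. It is an illustration only: the function name and the device path mentioned afterwards are hypothetical, and it simply reads and discards every byte, much as copying the volume to dev null would.

```python
def read_whole_device(path, chunk_size=1024 * 1024):
    """Read every byte of `path` sequentially and discard it.

    Returns the number of bytes read. An unreadable sector normally
    surfaces as an OSError raised by read(), which stops the pass.
    """
    total = 0
    # buffering=0 gives a raw file object, so each read() is one OS-level read.
    with open(path, "rb", buffering=0) as dev:
        while True:
            chunk = dev.read(chunk_size)
            if not chunk:      # empty bytes object means end of device
                break
            total += len(chunk)
    return total
```

Calling it on a raw volume (for example a placeholder path such as /dev/sdX) usually needs administrator rights, and a pass over a large disk takes hours. Reads served from the operating system's cache never touch the platters, so this only exercises the surface when the volume is much larger than memory or the cache is bypassed.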

The final issue here is that latent sector failures raise another serious question about the viability of MAID disk arrays. As the data on the disks slowly degrades on the platters and this is kept in check by reading the disks to detect the re-reads as the data decays this does not bode well for a disk array that is designed to spend as much of its time as possible asleep. As many of the MAID platforms contain very large disks, which even at their maximum transfer rate in linear read can take many hours or even days to read the entire surface the weekly or so background scans could well use up much of the 'Idle' time of the MAID array simply preserving the data. This suggests that in place of MAID technology a non volatile media could be a far better option. Of course, smart file systems such as ZFS and the NetApp equivalent that only have to manage the actual data area have a significant advantage here over dumb volume managers and hardware controllers.</description>
		<content:encoded><![CDATA[<p>Robin,<br />
Another interesting article;</p>
<p>BER as influenced by &#8216;latent&#8217; sector failures is one of the key aspects in understanding the Mean Time to Data Loss (MTTDL) for disk arrays. Once the weighting factors for disk BER, End User failure rate versus manufacturer, failure correlation (which is alluded to in the articles you link) and Batch Diversity are included, the apparent &#8216;reliability&#8217; of many large disk arrays is substantially lower than claimed. A simple rule of thumb is that a RAID 6(0) array built up out of sensible size RAID 6 groups (&lt;20) is likely to achieve a MTTDL slightly less (/10) than the unadjusted RAID 5 MTBF based on hardware failures only. This is good news for people who have thrown out their big iron and are using smart file systems that actually talk to the disk in place of a volume manager that obscures disk events, also those file systems that use file level CRC. One key differentiator is that some file systems also read the parity stripe whenever they read the data, which improves the error detection rate. There have been many posts by new ZFS users expressing their horror at the data error rate this class of file system exposes on their existing hardware; previously we probably blamed the data errors that were exposed on the application instead of the real culprit.<br />
Latent sector failure is not a new issue, Hannu H Kari published a paper covering this issue in 1997, &#8220;Latent Sector Faults and Reliability of Disk Arrays&#8221; where the benefits of scrub technologies were well explored, the data from the paper referenced in this article seems to support his findings.</p>
<p>The claimed &#8216;reliability&#8217; of large disk arrays is largely an illusion once you understand that the calculated MTBF from most vendors does not include this type of data loss failure: it only considers simple hardware failure (not maintainability issues created by unnecessary complexity and the loss of fault containment with an uber SAN) and takes no account of the huge performance degradation that many arrays suffer during parity rebuild (think those whose performance tanks when a snapshot is open). The achieved performance availability of many systems is several orders of magnitude lower than claimed, driving yet more expenditure on expensive, power consuming kit to maintain performance.<br />
In terms of mitigating the impacts of the latent sector failure component of BER as described here smart background scan algorithms are undoubtedly effective but a cheap alternative for those running arrays with cheaper controllers or file system / volume managers that do not include this functionality is to simply back up the entire binary volume to dev null. Perversely a weekly backup to dev null, through forcing a complete surface read, can substantially improve the achieved data reliability by forcing the disks to detect sectors that took multiple read attempts and remap them elsewhere on the disk before the data is unreadable.</p>
<p>The final issue here is that latent sector failures raise another serious question about the viability of MAID disk arrays. As the data on the disks slowly degrades on the platters and this is kept in check by reading the disks to detect the re-reads as the data decays this does not bode well for a disk array that is designed to spend as much of its time as possible asleep. As many of the MAID platforms contain very large disks, which even at their maximum transfer rate in linear read can take many hours or even days to read the entire surface the weekly or so background scans could well use up much of the &#8216;Idle&#8217; time of the MAID array simply preserving the data. This suggests that in place of MAID technology a non volatile media could be a far better option. Of course, smart file systems such as ZFS and the NetApp equivalent that only have to manage the actual data area have a significant advantage here over dumb volume managers and hardware controllers.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Wes Felter</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173507</link>
		<dc:creator>Wes Felter</dc:creator>
		<pubDate>Wed, 20 Feb 2008 01:41:13 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173507</guid>
		<description>Nathan is right on the money about the transitive marketing. In my low-end storage worldview, NetApp doesn't even exist. So when someone tells me to checksum my data, naturally I would reach for ZFS.</description>
		<content:encoded><![CDATA[<p>Nathan is right on the money about the transitive marketing. In my low-end storage worldview, NetApp doesn&#8217;t even exist. So when someone tells me to checksum my data, naturally I would reach for ZFS.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nathan</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173458</link>
		<dc:creator>Nathan</dc:creator>
		<pubDate>Tue, 19 Feb 2008 23:00:58 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173458</guid>
		<description>Robin:
Hm, interesting, I was unaware of that. Knowing this, your article you linked earlier makes a lot more sense. For my case, I'd rather the RAID controller continue with the rebuild--I'd rather get random corruption on one sector than not be able to recover *any* of my data. But I see your point.

Sounds to me then like a home RAID like I have set up just serves to give a false sense of security. That's too bad, because it's really easy to just keep all my important stuff on the fileserver and assume my data are safe. 

Thanks for the info.
Nathan</description>
		<content:encoded><![CDATA[<p>Robin:<br />
Hm, interesting, I was unaware of that. Knowing this, your article you linked earlier makes a lot more sense. For my case, I&#8217;d rather the RAID controller continue with the rebuild&#8211;I&#8217;d rather get random corruption on one sector than not be able to recover *any* of my data. But I see your point.</p>
<p>Sounds to me then like a home RAID like I have set up just serves to give a false sense of security. That&#8217;s too bad, because it&#8217;s really easy to just keep all my important stuff on the fileserver and assume my data are safe. </p>
<p>Thanks for the info.<br />
Nathan</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robin Harris</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173444</link>
		<dc:creator>Robin Harris</dc:creator>
		<pubDate>Tue, 19 Feb 2008 21:53:18 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173444</guid>
		<description>Nathan, if a RAID system knew about files it would be a file system. Some cheapo RAID controllers will report an error and keep recovering, but then you have to figure out what is missing. The honest thing for the controller to do is to stop. It doesn't know if it has a database or a Paris Hilton video. Better to assume the former.

Transitive property! I like that.

Allen, on ZDnet someone advocated buying drives from different vendors to put in a RAID array to avoid that very problem. My concern is that then you are looking at problems from untested corner cases of controller/drive interaction x 6 or whatever. 

I agree that the level of data disclosure in the paper wasn't as deep as I would like. How about a follow on that goes out to 44 months? And gives mean, median and std dev? Maybe NetApp will surprise us at FAST next week.

Robin</description>
		<content:encoded><![CDATA[<p>Nathan, if a RAID system knew about files it would be a file system. Some cheapo RAID controllers will report an error and keep recovering, but then you have to figure out what is missing. The honest thing for the controller to do is to stop. It doesn&#8217;t know if it has a database or a Paris Hilton video. Better to assume the former.</p>
<p>Transitive property! I like that.</p>
<p>Allen, on ZDnet someone advocated buying drives from different vendors to put in a RAID array to avoid that very problem. My concern is that then you are looking at problems from untested corner cases of controller/drive interaction x 6 or whatever. </p>
<p>I agree that the level of data disclosure in the paper wasn&#8217;t as deep as I would like. How about a follow on that goes out to 44 months? And gives mean, median and std dev? Maybe NetApp will surprise us at FAST next week.</p>
<p>Robin</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nathan</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173432</link>
		<dc:creator>Nathan</dc:creator>
		<pubDate>Tue, 19 Feb 2008 20:50:57 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173432</guid>
		<description>Robin: The unrecoverable read error just affects a single sector, right? I realize this means you have file-corruption, which is clearly not good, but it doesn't mean that all the RAID recovery time is wasted--you just have to recover a single file from backup. Unless I'm missing something? (I have a RAID-5 running at home, so I'm personally very interested in whether I should switch over to RAID-6 or just give up altogether and use backups).

Also:
&#62; ZFS!?! Coming from NetApp - well, somehow I don’t think so!
It *is* kind of an ad for ZFS, though, as Wes said; NetApp alleges that ZFS stole from WAFL, and it's an ad for WAFL, so by the transitive property... :)</description>
		<content:encoded><![CDATA[<p>Robin: The unrecoverable read error just affects a single sector, right? I realize this means you have file-corruption, which is clearly not good, but it doesn&#8217;t mean that all the RAID recovery time is wasted&#8211;you just have to recover a single file from backup. Unless I&#8217;m missing something? (I have a RAID-5 running at home, so I&#8217;m personally very interested in whether I should switch over to RAID-6 or just give up altogether and use backups).</p>
<p>Also:<br />
&gt; ZFS!?! Coming from NetApp - well, somehow I don’t think so!<br />
It *is* kind of an ad for ZFS, though, as Wes said; NetApp alleges that ZFS stole from WAFL, and it&#8217;s an ad for WAFL, so by the transitive property&#8230; <img src='http://storagemojo.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Allen Cole</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173397</link>
		<dc:creator>Allen Cole</dc:creator>
		<pubDate>Tue, 19 Feb 2008 18:59:18 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173397</guid>
		<description>Looking at disk errors without also examining drive firmware versions misses the point.  My experience in seeing thousands of arrays is that the vast majority of failures happen with specific firmware versions.  The symptom is that the array stops reading and writing data at random times.  An average is not the right measure if most of the errors fall into specific groups that are not measured.  Another fact to consider is that if you buy multiple drives of the same size at the same time, chances are good that they will have the same firmware.

Thanks--Allen</description>
		<content:encoded><![CDATA[<p>Looking at disk errors without also examining drive firmware versions misses the point.  My experience in seeing thousands of arrays is that the vast majority of failures happen with specific firmware versions.  The symptom is that the array stops reading and writing data at random times.  An average is not the right measure if most of the errors fall into specific groups that are not measured.  Another fact to consider is that if you buy multiple drives of the same size at the same time, chances are good that they will have the same firmware.</p>
<p>Thanks&#8211;Allen</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robin Harris</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173368</link>
		<dc:creator>Robin Harris</dc:creator>
		<pubDate>Tue, 19 Feb 2008 18:09:14 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173368</guid>
		<description>Joe, I agree that vendor-sponsored research is always  suspect. But whether the extra features are "worth" the cost is something that each buyer needs to decide for themselves based on their application and business requirements.

Daniel, the errors that weren't found by scrubbing were found by either failed reads or writes in roughly 50-50 proportion. The issue of "wrong data" is quite another kettle of fish.

Nathan, after a RAID 5 disk failure, your chances of seeing an unrecoverable read error on the remaining SATA drives are high. When your RAID controller encounters one, it has to say it can't recover the data, so now it is time to recover from a backup, meaning all the RAID recovery time was wasted. See &lt;a href="http://blogs.zdnet.com/storage/?p=162" target="_blank" rel="nofollow"&gt;Why RAID 5 stops working in 2009&lt;/a&gt; on my ZDnet blog.

Robin</description>
		<content:encoded><![CDATA[<p>Joe, I agree that vendor-sponsored research is always  suspect. But whether the extra features are &#8220;worth&#8221; the cost is something that each buyer needs to decide for themselves based on their application and business requirements.</p>
<p>Daniel, the errors that weren&#8217;t found by scrubbing were found by either failed reads or writes in roughly 50-50 proportion. The issue of &#8220;wrong data&#8221; is quite another kettle of fish.</p>
<p>Nathan, after a RAID 5 disk failure, your chances of seeing an unrecoverable read error on the remaining SATA drives are high. When your RAID controller encounters one, it has to say it can&#8217;t recover the data, so now it is time to recover from a backup, meaning all the RAID recovery time was wasted. See <a href="http://blogs.zdnet.com/storage/?p=162" target="_blank" rel="nofollow">Why RAID 5 stops working in 2009</a> on my ZDnet blog.</p>
<p>Robin</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nathan</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173311</link>
		<dc:creator>Nathan</dc:creator>
		<pubDate>Tue, 19 Feb 2008 15:54:19 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-173311</guid>
		<description>Why no RAID 5 on SATA? What is recommended instead?</description>
		<content:encoded><![CDATA[<p>Why no RAID 5 on SATA? What is recommended instead?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Smith</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-172980</link>
		<dc:creator>Daniel Smith</dc:creator>
		<pubDate>Mon, 18 Feb 2008 22:10:45 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-172980</guid>
		<description>So if 60% of errors were found by scrubbing, does that mean that 40% of errors returned wrong data without the drive reporting it?</description>
		<content:encoded><![CDATA[<p>So if 60% of errors were found by scrubbing, does that mean that 40% of errors returned wrong data without the drive reporting it?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: joe m.</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-172978</link>
		<dc:creator>joe m.</dc:creator>
		<pubDate>Mon, 18 Feb 2008 21:55:28 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-172978</guid>
		<description>A NetApp-sponsored study that seems to indicate that NetApp's disk scrubbing and hand-picked 'enterprise' drives really are worth 100x more than the cost of consumer/commodity drives and storage systems.  Somehow I am not surprised.</description>
		<content:encoded><![CDATA[<p>A NetApp-sponsored study that seems to indicate that NetApp&#8217;s disk scrubbing and hand-picked &#8216;enterprise&#8217; drives really are worth 100x more than the cost of consumer/commodity drives and storage systems.  Somehow I am not surprised.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robin Harris</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-172958</link>
		<dc:creator>Robin Harris</dc:creator>
		<pubDate>Mon, 18 Feb 2008 20:00:13 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-172958</guid>
		<description>Wes,

ZFS!?! Coming from NetApp - well, somehow I don't think so!

For decades Detroit was convinced that quality didn't sell, so they ignored it. One of the key trends in the consumerization of IT is that as we all become more dependent on our systems, quality becomes more important. I think drive vendors could get a nice premium on LER - low error rate - drives, if they positioned them correctly.

Robin</description>
		<content:encoded><![CDATA[<p>Wes,</p>
<p>ZFS!?! Coming from NetApp - well, somehow I don&#8217;t think so!</p>
<p>For decades Detroit was convinced that quality didn&#8217;t sell, so they ignored it. One of the key trends in the consumerization of IT is that as we all become more dependent on our systems, quality becomes more important. I think drive vendors could get a nice premium on LER - low error rate - drives, if they positioned them correctly.</p>
<p>Robin</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Wes Felter</title>
		<link>http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-172956</link>
		<dc:creator>Wes Felter</dc:creator>
		<pubDate>Mon, 18 Feb 2008 19:50:38 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/2008/02/18/latent-sector-errors-in-disk-drives/#comment-172956</guid>
		<description>This paper sounds like a big ad for ZFS/Btrfs. :-) But maybe that's just my bias talking.

Speaking of ECC, I read that the codes are more efficient for larger sector sizes, so future disks with 4KB sectors may be more reliable. (Or vendors may cheap out and use less ECC to achieve the same level of (un)reliability they have now.)</description>
		<content:encoded><![CDATA[<p>This paper sounds like a big ad for ZFS/Btrfs. <img src='http://storagemojo.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> But maybe that&#8217;s just my bias talking.</p>
<p>Speaking of ECC, I read that the codes are more efficient for larger sector sizes, so future disks with 4KB sectors may be more reliable. (Or vendors may cheap out and use less ECC to achieve the same level of (un)reliability they have now.)</p>
]]></content:encoded>
	</item>
</channel>
</rss>

