<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: TPC-C: comparing SSD &amp; disk</title>
	<atom:link href="http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/feed/" rel="self" type="application/rss+xml" />
	<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/</link>
	<description>Data storage info &#38; analysis</description>
	<lastBuildDate>Sun, 01 Aug 2010 02:16:15 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
	<item>
		<title>By: KD Mann</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-206422</link>
		<dc:creator>KD Mann</dc:creator>
		<pubDate>Fri, 06 Nov 2009 23:00:06 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-206422</guid>
		<description>Taylor, re:

&quot;...seems pretty clear to me.  Compare the #1 and #2 (or #3 or #4) TPC-C results....&quot;.

The Flash SSD value proposition is cost/performance. In this light, the &quot;PR stunt configurations&quot; (I think Steve Jones above called them &quot;insane&quot;) are not very relevant to real world customers. This is evidenced first and foremost by observing that the cost-per-transaction-per-minute in the stunt-class averages 5x higher than that of the top-10 cost/performance systems in the &quot;real-world&quot; class. 

http://www.tpc.org/tpcc/results/tpcc_price_perf_results.asp

Getting back to the real world, we need to look at the top systems from a cost-performance perspective, and see whether Flash can deliver any improvement.

We need to look at systems like these instead:

http://www.tpc.org/results/individual_results/HP/HPML350G6OELTPCC_ES.pdf 

In the lowest cost-per-TPMc system (above), the entire storage infrastructure, connectivity included, costs about $430/spindle, and each spindle is good for 1,160 TPMc.

In the Sun F5100 SSD result, the total storage infrastructure costs  (including the two other disk tiers and connectivity) are about $2,000 per SSD, and 4,800 SSDs were supporting only 1,583 TPMc each.

Given these numbers, I&#039;m pretty sure you can&#039;t plug $2,000 SSDs into the leading cost-performance system without quadrupling the price. Haven&#039;t done the power-savings calculations, but I&#039;m pretty sure it would take more than 10 years for the SSDs to pay for themselves.</description>
		<content:encoded><![CDATA[<p>Taylor, re:</p>
<p>&#8220;&#8230;seems pretty clear to me.  Compare the #1 and #2 (or #3 or #4) TPC-C results&#8230;.&#8221;.</p>
<p>The Flash SSD value proposition is cost/performance. In this light, the &#8220;PR stunt configurations&#8221; (I think Steve Jones above called them &#8220;insane&#8221;) are not very relevant to real world customers. This is evidenced first and foremost by observing that the cost-per-transaction-per-minute in the stunt-class averages 5x higher than that of the top-10 cost/performance systems in the &#8220;real-world&#8221; class. </p>
<p><a href="http://www.tpc.org/tpcc/results/tpcc_price_perf_results.asp" rel="nofollow">http://www.tpc.org/tpcc/results/tpcc_price_perf_results.asp</a></p>
<p>Getting back to the real world, we need to look at the top systems from a cost-performance perspective, and see whether Flash can deliver any improvement.</p>
<p>We need to look at systems like these instead:</p>
<p><a href="http://www.tpc.org/results/individual_results/HP/HPML350G6OELTPCC_ES.pdf" rel="nofollow">http://www.tpc.org/results/individual_results/HP/HPML350G6OELTPCC_ES.pdf</a> </p>
<p>In the lowest cost-per-TPMc system (above), the entire storage infrastructure, connectivity included, costs about $430/spindle, and each spindle is good for 1,160 TPMc.</p>
<p>In the Sun F5100 SSD result, the total storage infrastructure costs  (including the two other disk tiers and connectivity) are about $2,000 per SSD, and 4,800 SSDs were supporting only 1,583 TPMc each.</p>
<p>Given these numbers, I&#8217;m pretty sure you can&#8217;t plug $2,000 SSDs into the leading cost-performance system without quadrupling the price. Haven&#8217;t done the power-savings calculations, but I&#8217;m pretty sure it would take more than 10 years for the SSDs to pay for themselves.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Taylor</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-206145</link>
		<dc:creator>Taylor</dc:creator>
		<pubDate>Wed, 21 Oct 2009 23:34:09 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-206145</guid>
		<description>It seems pretty clear to me. Compare the #1 and #2 (or #3 or #4) TPC-C results. The IBM disk array is 68 racks full of spindles. The Sun solution is 7 racks. Storage system COGS works out about the same ($9m) given the 57% discount from retail on the IBM systems. I&#039;d wager that the Sun system consumes a wee bit less power overall. 

The Sun solution costs 6% more initially but performs 28% better. It takes up (looking at storage only here) 90% less floor space.

The Bull solution uses roughly the same disk setup as the IBM. The HP after that uses 2.5&quot; drives, taking up 44 racks.</description>
		<content:encoded><![CDATA[<p>It seems pretty clear to me. Compare the #1 and #2 (or #3 or #4) TPC-C results. The IBM disk array is 68 racks full of spindles. The Sun solution is 7 racks. Storage system COGS works out about the same ($9m) given the 57% discount from retail on the IBM systems. I&#8217;d wager that the Sun system consumes a wee bit less power overall. </p>
<p>The Sun solution costs 6% more initially but performs 28% better. It takes up (looking at storage only here) 90% less floor space.</p>
<p>The Bull solution uses roughly the same disk setup as the IBM. The HP after that uses 2.5&#8243; drives, taking up 44 racks.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: KD Mann</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-206123</link>
		<dc:creator>KD Mann</dc:creator>
		<pubDate>Tue, 20 Oct 2009 15:56:36 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-206123</guid>
		<description>Rockmelon,

Your comments on the Microsoft paper contain a number of errors and misinterpretations. I only have time to touch on a few of these, and encourage others here to look at the paper itself.

Regarding your assertion of inflated SSD costs in the Microsoft paper due to faulty &quot;wear out&quot; calculations, it appears neither you nor EMC&#039;s &quot;Storage Anarchist&quot; read the paper. It says:

&quot;However, even if we conservatively assume a high overhead of 50% (one background write for every two foreground writes), the majority of volumes have wear-out times exceeding 100 years. All volumes with the exception of one small 10GB volume have wear-out times of 5 years or more. Hence, we do not expect that wear will be a major
contributor to the total cost of SSD-based storage.&quot;

From this it&#039;s absolutely clear that Barry Burke did not plug any STEC  numbers into Microsoft&#039;s model, wear, pricing or otherwise. 

As regards the cost of the Memoright SSD vs. Intel; $23/GB vs  $15/GB, this also doesn&#039;t change results of the cost-benefit equation for any of the applications modeled. Far more importantly though, the &quot;enterprise class SSD&quot; market is typified by EMC/STEC at ~$180/GByte and by Sun at $80/GByte (FMOD landed in an F5100 socket). Intel&#039;s SSDs are not currently part of that landscape -- the only player of note who tried them, Pillar Data, recently dumped Intel for STEC. 

Meanwhile...the STEC and Sun prices represent some pretty outrageous margins on SLC Flash -- the kind of margins that drive hype-cycles.

As regards the performance numbers you provide for X25-E,  &quot;...60X the thruput and 180X better latency.&quot;, these are quite similar to typical manufacturer claims -- claims that reliably fall apart in real-world tests and audited benchmarks. For example STEC claims 33K read IOPS and 17K write IOPS at 4KBytes, 250MBytes/sec. and with microseconds latency, and &quot;200x faster than HDD&quot;.

When run against SPC-1, the $13,000 STEC units deliver only 3.4K IOPS and 28Mbytes/Sec. with response time only about 1.4x (not 180X) better than HDD. I&#039;m wide awake, thank you, and those numbers are orders-of-magnitude worse than either you or the manufacturers are claiming...I don&#039;t smell any roses.

The numbers Microsoft reported for their SLC example device are much closer to how these devices actually perform in real applications than the IOmeter benchmarketing numbers you and Intel quote for X25-E. Given Pillar Data&#039;s &quot;Oracle heritage&quot;, I doubt that Pillar would have dumped Intel for STEC if Intel had been delivering the goods -- and we have yet to see the first audited benchmark on an Intel SSD.

Regarding &quot;85 racks of 160 drives&quot; and &quot;3KW per rack (160 drives)&quot;: as you were reviewing TPC results you might have noticed that nobody uses 3.5&quot; HDDs anymore. Nowadays, these are 2.5&quot; HDDs that use 5-7W, and 40-60 of them fit in 3RU. Your rough calculation of racks required is off by roughly 10x.

If you base the numbers on current HDD technology you&#039;ll see that it would take more than 10 yrs for the Sun Flash SSD setup to pay for itself in energy savings. That analysis is also included in the Microsoft Research model -- real-world energy savings from Flash are orders-of-magnitude smaller than the costs incurred in provisioning.</description>
		<content:encoded><![CDATA[<p>Rockmelon,</p>
<p>Your comments on the Microsoft paper contain a number of errors and misinterpretations. I only have time to touch on a few of these, and encourage others here to look at the paper itself.</p>
<p>Regarding your assertion of inflated SSD costs in the Microsoft paper due to faulty &#8220;wear out&#8221; calculations, it appears neither you nor EMC&#8217;s &#8220;Storage Anarchist&#8221; read the paper. It says:</p>
<p>&#8220;However, even if we conservatively assume a high overhead of 50% (one background write for every two foreground writes), the majority of volumes have wear-out times exceeding 100 years. All volumes with the exception of one small 10GB volume have wear-out times of 5 years or more. Hence, we do not expect that wear will be a major<br />
contributor to the total cost of SSD-based storage.&#8221;</p>
<p>From this it&#8217;s absolutely clear that Barry Burke did not plug any STEC  numbers into Microsoft&#8217;s model, wear, pricing or otherwise. </p>
<p>As regards the cost of the Memoright SSD vs. Intel; $23/GB vs  $15/GB, this also doesn&#8217;t change results of the cost-benefit equation for any of the applications modeled. Far more importantly though, the &#8220;enterprise class SSD&#8221; market is typified by EMC/STEC at ~$180/GByte and by Sun at $80/GByte (FMOD landed in an F5100 socket). Intel&#8217;s SSDs are not currently part of that landscape &#8212; the only player of note who tried them, Pillar Data, recently dumped Intel for STEC. </p>
<p>Meanwhile&#8230;the STEC and Sun prices represent some pretty outrageous margins on SLC Flash &#8212; the kind of margins that drive hype-cycles.</p>
<p>As regards the performance numbers you provide for X25-E,  &#8220;&#8230;60X the thruput and 180X better latency.&#8221;, these are quite similar to typical manufacturer claims &#8212; claims that reliably fall apart in real-world tests and audited benchmarks. For example STEC claims 33K read IOPS and 17K write IOPS at 4KBytes, 250MBytes/sec. and with microseconds latency, and &#8220;200x faster than HDD&#8221;.</p>
<p>When run against SPC-1, the $13,000 STEC units deliver only 3.4K IOPS and 28Mbytes/Sec. with response time only about 1.4x (not 180X) better than HDD. I&#8217;m wide awake, thank you, and those numbers are orders-of-magnitude worse than either you or the manufacturers are claiming&#8230;I don&#8217;t smell any roses.</p>
<p>The numbers Microsoft reported for their SLC example device are much closer to how these devices actually perform in real applications than the IOmeter benchmarketing numbers you and Intel quote for X25-E. Given Pillar Data&#8217;s &#8220;Oracle heritage&#8221;, I doubt that Pillar would have dumped Intel for STEC if Intel had been delivering the goods &#8212; and we have yet to see the first audited benchmark on an Intel SSD.</p>
<p>Regarding &#8220;85 racks of 160 drives&#8221; and &#8220;3KW per rack (160 drives)&#8221;: as you were reviewing TPC results you might have noticed that nobody uses 3.5&#8243; HDDs anymore. Nowadays, these are 2.5&#8243; HDDs that use 5-7W, and 40-60 of them fit in 3RU. Your rough calculation of racks required is off by roughly 10x.</p>
<p>If you base the numbers on current HDD technology you&#8217;ll see that it would take more than 10 yrs for the Sun Flash SSD setup to pay for itself in energy savings. That analysis is also included in the Microsoft Research model &#8212; real-world energy savings from Flash are orders-of-magnitude smaller than the costs incurred in provisioning.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: rockmelon</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-206118</link>
		<dc:creator>rockmelon</dc:creator>
		<pubDate>Tue, 20 Oct 2009 03:36:34 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-206118</guid>
		<description>KD,

that would be the Microsoft paper which looked at several of their in-house
apps (notably none of them OLTP though one called &quot;websql&quot;) and concluded:

  &quot;Depending on the workload, the capacity/dollar of SSDs needs to improve 
   by a factor of 3–3000 for SSDs to be able to replace disks.&quot;

Well 3X is not a huge leap before some of their apps become economically
viable to be deployed on SSDs.

The Storage Anarchist (Sr. Director and Chief Strategy Officer for the Symmetrix
 Product Group within EMC&#039;s Storage Division) made these comments about
that paper in http://blogs.netapp.com/shadeofblue/2009/02/emc-says-ssd-is.html

&quot;And here&#039;s but one flaw in Microsoft&#039;s analysis: they presumed an expected wear-out of flash chips after 100,000 erase+write cycles - built it right into their models and calculations.
...
Plug in the real-world cell wear numbers of the ZeusIOPS 146GB or 300GB FC-based SSD and today&#039;s pricing, and Microsoft&#039;s math changes dramatically.&quot;

The Microsoft Tech Report was published in April and considers only a single
SSD - the 32GB Memoright MR 25.2 which cost $23/GB.  Here we are in
October and a better buy would be the Intel X25-E at $15/GB
http://www.cdw.com/shop/products/default.aspx?EDC=1774816

Additionally, the SLC-based X25-E offers 10X the read IOPS and 5X the
write IOPS afforded by the Memoright.   Most Enterprise-focussed users
would stick with SLC, but MLC can be had for $5/GB, e.g. Intel X25-M  -
http://www.cdw.com/shop/products/default.aspx?EDC=1736412

Again, with 5X the read IOPS and 10X the write IOPS afforded by the
Memoright.  For the light-duty IO loads of the Microsoft servers, MLC would
probably suffice.

Another interesting takeaway from the Microsoft report was:

&quot;Of the three enterprise-class devices shown Table 4, the
Cheetah 10K disk was the best choice for all 49 volumes.&quot;

I checked the executive summaries for the next 5 highest throughput TPC-C
publications and note that they exclusively use 15k rpm drives.  Does this
mean that the benchmark engineers at IBM, HP &amp; Fujitsu are clueless or that
the fileservers traced by Microsoft are not as demanding in IOPS as TPC-C ?

Another thing the Microsoft report acknowledges ignoring:

&quot;The approach in this paper is to find the least-cost configuration that meets
known targets for performance (as well as capacity and fault-tolerance) rather
than maximizing performance while keeping within a known cost budget.&quot;

What if you are trying to manage response times downwards as well ?

This lack of response time consideration also feeds into their quantitative model
where this important parameter is ignored.  For example, a Seagate Cheetah 15K
drive is considered good for 384 Read IOPS, but does not factor in that the
response time from the drive at this throughput is probably 30ms.  I&#039;m watching
an Intel X25-E at the moment doing a little over 24k IOPS of 4KB random
read, average queue length a little under 4 and the response time is 160-170
_micro_seconds.  60X the thruput and 180X better latency....

Wake up and smell the roses!

One thing Microsoft got right was to recognize the multiple dimensions
of price performance which they show in Figure 3:  IOPS/$, GB/$ and MB/s/$.

Not even the most ardent Enterprise Flash protagonist would suggest deploying 
SSDs everywhere.  Flash has its place, disk has its place and so does tape.
Database logging does not need IOPS, it needs Gigabytes, so disk makes
economic sense there. 

I&#039;ll finish with a rough calculation of my own.  If Sun/Oracle had used 
conventional disk instead of Flash, they would have needed something
like 85 racks of 160 drives. Allowing 3kW/rack and 20 square feet/rack,
going with flash saved 1700 sq.ft of floor space at 150 Watts/sq.ft.  The cost
to construct that floor space at that power density would be around 
$2000/sq.ft - 3.4 M$.</description>
		<content:encoded><![CDATA[<p>KD,</p>
<p>that would be the Microsoft paper which looked at several of their in-house<br />
apps (notably none of them OLTP though one called &#8220;websql&#8221;) and concluded:</p>
<p>  &#8220;Depending on the workload, the capacity/dollar of SSDs needs to improve<br />
   by a factor of 3–3000 for SSDs to be able to replace disks.&#8221;</p>
<p>Well 3X is not a huge leap before some of their apps become economically<br />
viable to be deployed on SSDs.</p>
<p>The Storage Anarchist (Sr. Director and Chief Strategy Officer for the Symmetrix<br />
 Product Group within EMC&#8217;s Storage Division) made these comments about<br />
that paper in <a href="http://blogs.netapp.com/shadeofblue/2009/02/emc-says-ssd-is.html" rel="nofollow">http://blogs.netapp.com/shadeofblue/2009/02/emc-says-ssd-is.html</a></p>
<p>&#8220;And here&#8217;s but one flaw in Microsoft&#8217;s analysis: they presumed an expected wear-out of flash chips after 100,000 erase+write cycles &#8211; built it right into their models and calculations.<br />
&#8230;<br />
Plug in the real-world cell wear numbers of the ZeusIOPS 146GB or 300GB FC-based SSD and today&#8217;s pricing, and Microsoft&#8217;s math changes dramatically.&#8221;</p>
<p>The Microsoft Tech Report was published in April and considers only a single<br />
SSD &#8211; the 32GB Memoright MR 25.2 which cost $23/GB.  Here we are in<br />
October and a better buy would be the Intel X25-E at $15/GB<br />
<a href="http://www.cdw.com/shop/products/default.aspx?EDC=1774816" rel="nofollow">http://www.cdw.com/shop/products/default.aspx?EDC=1774816</a></p>
<p>Additionally, the SLC-based X25-E offers 10X the read IOPS and 5X the<br />
write IOPS afforded by the Memoright.   Most Enterprise-focussed users<br />
would stick with SLC, but MLC can be had for $5/GB, e.g. Intel X25-M  -<br />
<a href="http://www.cdw.com/shop/products/default.aspx?EDC=1736412" rel="nofollow">http://www.cdw.com/shop/products/default.aspx?EDC=1736412</a></p>
<p>Again, with 5X the read IOPS and 10X the write IOPS afforded by the<br />
Memoright.  For the light-duty IO loads of the Microsoft servers, MLC would<br />
probably suffice.</p>
<p>Another interesting takeaway from the Microsoft report was:</p>
<p>&#8220;Of the three enterprise-class devices shown Table 4, the<br />
Cheetah 10K disk was the best choice for all 49 volumes.&#8221;</p>
<p>I checked the executive summaries for the next 5 highest throughput TPC-C<br />
publications and note that they exclusively use 15k rpm drives.  Does this<br />
mean that the benchmark engineers at IBM, HP &amp; Fujitsu are clueless or that<br />
the fileservers traced by Microsoft are not as demanding in IOPS as TPC-C ?</p>
<p>Another thing the Microsoft report acknowledges ignoring:</p>
<p>&#8220;The approach in this paper is to find the least-cost configuration that meets<br />
known targets for performance (as well as capacity and fault-tolerance) rather<br />
than maximizing performance while keeping within a known cost budget.&#8221;</p>
<p>What if you are trying to manage response times downwards as well ?</p>
<p>This lack of response time consideration also feeds into their quantitative model<br />
where this important parameter is ignored.  For example, a Seagate Cheetah 15K<br />
drive is considered good for 384 Read IOPS, but does not factor in that the<br />
response time from the drive at this throughput is probably 30ms.  I&#8217;m watching<br />
an Intel X25-E at the moment doing a little over 24k IOPS of 4KB random<br />
read, average queue length a little under 4 and the response time is 160-170<br />
_micro_seconds.  60X the thruput and 180X better latency&#8230;.</p>
<p>Wake up and smell the roses!</p>
<p>One thing Microsoft got right was to recognize the multiple dimensions<br />
of price performance which they show in Figure 3:  IOPS/$, GB/$ and MB/s/$.</p>
<p>Not even the most ardent Enterprise Flash protagonist would suggest deploying<br />
SSDs everywhere.  Flash has its place, disk has its place and so does tape.<br />
Database logging does not need IOPS, it needs Gigabytes, so disk makes<br />
economic sense there. </p>
<p>I&#8217;ll finish with a rough calculation of my own.  If Sun/Oracle had used<br />
conventional disk instead of Flash, they would have needed something<br />
like 85 racks of 160 drives. Allowing 3kW/rack and 20 square feet/rack,<br />
going with flash saved 1700 sq.ft of floor space at 150 Watts/sq.ft.  The cost<br />
to construct that floor space at that power density would be around<br />
$2000/sq.ft &#8211; 3.4 M$.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: KD Mann</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-206109</link>
		<dc:creator>KD Mann</dc:creator>
		<pubDate>Mon, 19 Oct 2009 16:09:53 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-206109</guid>
		<description>rockmelon,

Regarding &quot;I think you need to give Enterprise Flash a fair run for it’s money.&quot;

I&#039;ve taken up a countervailing position vs. the prevailing hype. What&#039;s not fair? The Flash SSD value proposition, as it has been presented, always plays on some variation of &quot;replace tens or hundreds of spinning disks with SSD&quot;, and &quot;the much higher cost-per-GB of SSD is justified by much lower cost per IOP&quot;. The basis for both the outright replacement scenario and the &quot;Flash Tier&quot; scenario is that cost-per-IOP is one or two orders of magnitude cheaper than HDD. It&#039;s not.

The business case for Flash falls apart if cost/IOP is not significantly lower than HDD in...and here&#039;s the key...real world, real application workload scenarios. Over the past several weeks, we have seen the first audited, application benchmarks utilizing Flash SSD for realistic &quot;transactional&quot; workloads. 

None of them even demonstrates Cost/IOP parity with HDD, much less a cost advantage. In other words, Enterprise Flash Reality is at least an order of magnitude different from Enterprise Flash Hype. 

 - In SPC-1C/E, SSD cost/IOP was slightly higher than HDD, not lower.
 - In SPC-1, SSD cost/IOP was 2x-3x higher than HDD
 - In TPC-C, the three-tiered SSD/HDD/HDD setup resulted in storage cost/performance ~50% higher than a single tier of HDD.

I stand by my remark...if Sun had attempted to run the entire TPC database on a single tier of Flash, the cost/TPMc would have been at least 5x higher than HDD. Even with a three-tier setup, Flash still increases storage provisioning costs dramatically. Given that we&#039;ve been told for so long that Flash SSD was going to reduce storage costs for I/O intensive applications -- I&#039;d say &quot;jaw dropping&quot; is a reasonable way to describe the size of the gap between hype and reality.

I&#039;d add another observation. These first-ever audited benchmark results align quite perfectly with the (widely ignored) conclusions reached here, which also happen to be my own conclusions:

http://research.microsoft.com/en-us/um/people/antr/ms/ssd.pdf</description>
		<content:encoded><![CDATA[<p>rockmelon,</p>
<p>Regarding &#8220;I think you need to give Enterprise Flash a fair run for it’s money.&#8221;</p>
<p>I&#8217;ve taken up a countervailing position vs. the prevailing hype. What&#8217;s not fair? The Flash SSD value proposition, as it has been presented, always plays on some variation of &#8220;replace tens or hundreds of spinning disks with SSD&#8221;, and &#8220;the much higher cost-per-GB of SSD is justified by much lower cost per IOP&#8221;. The basis for both the outright replacement scenario and the &#8220;Flash Tier&#8221; scenario is that cost-per-IOP is one or two orders of magnitude cheaper than HDD. It&#8217;s not.</p>
<p>The business case for Flash falls apart if cost/IOP is not significantly lower than HDD in&#8230;and here&#8217;s the key&#8230;real world, real application workload scenarios. Over the past several weeks, we have seen the first audited, application benchmarks utilizing Flash SSD for realistic &#8220;transactional&#8221; workloads. </p>
<p>None of them even demonstrates Cost/IOP parity with HDD, much less a cost advantage. In other words, Enterprise Flash Reality is at least an order of magnitude different from Enterprise Flash Hype. </p>
<p> &#8211; In SPC-1C/E, SSD cost/IOP was slightly higher than HDD, not lower.<br />
 &#8211; In SPC-1, SSD cost/IOP was 2x-3x higher than HDD<br />
 &#8211; In TPC-C, the three-tiered SSD/HDD/HDD setup resulted in storage cost/performance ~50% higher than a single tier of HDD.</p>
<p>I stand by my remark&#8230;if Sun had attempted to run the entire TPC database on a single tier of Flash, the cost/TPMc would have been at least 5x higher than HDD. Even with a three-tier setup, Flash still increases storage provisioning costs dramatically. Given that we&#8217;ve been told for so long that Flash SSD was going to reduce storage costs for I/O intensive applications &#8212; I&#8217;d say &#8220;jaw dropping&#8221; is a reasonable way to describe the size of the gap between hype and reality.</p>
<p>I&#8217;d add another observation. These first-ever audited benchmark results align quite perfectly with the (widely ignored) conclusions reached here, which also happen to be my own conclusions:</p>
<p><a href="http://research.microsoft.com/en-us/um/people/antr/ms/ssd.pdf" rel="nofollow">http://research.microsoft.com/en-us/um/people/antr/ms/ssd.pdf</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: rockmelon</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-206101</link>
		<dc:creator>rockmelon</dc:creator>
		<pubDate>Mon, 19 Oct 2009 09:04:32 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-206101</guid>
		<description>KD,

I think you need to give Enterprise Flash a fair run for its money.

In these comments some 5 or 6 weeks ago, you were claiming that we would
never see an audited TPC benchmark using Flash because it would be jaw-
droppingly expensive and that Flash $/IO would be no cheaper than conventional
HDDs for real workloads.  You really ought to test some Intel X25-E or Sun
SSD products (I have) before making these assertions.

Now that there is such an audited benchmark, you see fit to bash it because
it was not done with 100% Flash.

Every viable storage product has its niche ...

Looking at the benchmark executive summary, I see 8.35 M$ of server storage,
of which 6.62 M$ is F5100s.  (Incidentally, 2.08 M$ of server hardware
and 7.88 M$ of server software [Oracle licenses] rounds out the 18 M$
total system cost.)

I really doubt that Sun/Oracle would configure 80% of their storage budget
on Flash unless it was beneficial to performance and price/performance.

The 508 page full disclosure report may be somewhat daunting, but thankfully
the juicy parts relating to the storage configuration are detailed within the first
20 pages.

The next biggest item of the 8.35 M$ server storage was 0.95 M$ spent on
24 x ST6140 conventional disk arrays.  On page 12 of the full disclosure report
we see that these were connected via 4Gb FC, two per database server node,
as Oracle log files. Each ST6140 was comprised of 16 x 300GB SAS 
drives and they were mirrored by the database.  A generation ago, databases
could log to tape (log IO is sequential), so disk really is the new tape :-)

It makes economic sense to log to conventional HDDs, where the important
metrics are MB/sec/$ and MB/$.

Sun/Oracle would have put 80% of their storage $$s behind Flash and 20% 
behind HDDs because it made the most sense to optimize the benchmark
metrics.</description>
		<content:encoded><![CDATA[<p>KD,</p>
<p>I think you need to give Enterprise Flash a fair run for its money.</p>
<p>In these comments some 5 or 6 weeks ago, you were claiming that we would<br />
never see an audited TPC benchmark using Flash because it would be jaw-<br />
droppingly expensive and that Flash $/IO would be no cheaper than conventional<br />
HDDs for real workloads.  You really ought to test some Intel X25-E or Sun<br />
SSD products (I have) before making these assertions.</p>
<p>Now that there is such an audited benchmark, you see fit to bash it because<br />
it was not done with 100% Flash.</p>
<p>Every viable storage product has its niche &#8230;</p>
<p>Looking at the benchmark executive summary, I see 8.35 M$ of server storage,<br />
of which 6.62 M$ is F5100s.  (Incidentally, 2.08 M$ of server hardware<br />
and 7.88 M$ of server software [Oracle licenses] rounds out the 18 M$<br />
total system cost.)</p>
<p>I really doubt that Sun/Oracle would configure 80% of their storage budget<br />
on Flash unless it was beneficial to performance and price/performance.</p>
<p>The 508 page full disclosure report may be somewhat daunting, but thankfully<br />
the juicy parts relating to the storage configuration are detailed within the first<br />
20 pages.</p>
<p>The next biggest item of the 8.35 M$ server storage was 0.95 M$ spent on<br />
24 x ST6140 conventional disk arrays.  On page 12 of the full disclosure report<br />
we see that these were connected via 4Gb FC, two per database server node,<br />
as Oracle log files. Each ST6140 was comprised of 16 x 300GB SAS<br />
drives and they were mirrored by the database.  A generation ago, databases<br />
could log to tape (log IO is sequential), so disk really is the new tape <img src='http://storagemojo.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>It makes economic sense to log to conventional HDDs, where the important<br />
metrics are MB/sec/$ and MB/$.</p>
<p>Sun/Oracle would have put 80% of their storage $$s behind Flash and 20%<br />
behind HDDs because it made the most sense to optimize the benchmark<br />
metrics.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: KD Mann</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-205990</link>
		<dc:creator>KD Mann</dc:creator>
		<pubDate>Wed, 14 Oct 2009 23:47:00 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-205990</guid>
		<description>The world&#039;s first TPC-C result running on Flash SSD is out, courtesy of Sun and Oracle.

Well...not exactly. What we&#039;re really seeing is the world&#039;s first TPC-C running on a THREE-tiered storage solution: Flash SSD, 15K RPM HDDs and cheap SATA disk.

I haven&#039;t finished sifting through the ~500 pages of the Full Disclosure Report, but one thing is clear so far. Even with the most write-intensive part of the workload kept far away from the SSDs (log files are striped on 384 fast spinners), the Flash SSDs didn&#039;t even reach 3x IOPS/Disk improvement over HDD.

HDDs are reliably good for about 600 TPMc per disk in these large scale systems. In this test the 4,800 SSDs were only 2.5x better than HDD -- around 1,500 TPMc/SSD.

Oh well...so much for replacing hundreds or dozens...or even a handful of spinning disks with a single Flash SSD.

As for storage provisioning costs, the 3-Tier approach looks to be about 1.5x more expensive than a single tier of HDD, using Oracle and HP&#039;s previous big machine for comparison. Of course that doesn&#039;t include the costs of managing three islands of storage instead of one.

http://www.tpc.org/results/FDR/TPCC/Sun_T5440_TPC-C_Cluster_FDR_101109.pdf</description>
		<content:encoded><![CDATA[<p>The world&#8217;s first TPC-C result running on Flash SSD is out, courtesy of Sun and Oracle.</p>
<p>Well&#8230;not exactly. What we&#8217;re really seeing is the world&#8217;s first TPC-C running on a THREE-tiered storage solution: Flash SSD, 15K RPM HDDs and cheap SATA disk.</p>
<p>I haven&#8217;t finished sifting through the ~500 pages of the Full Disclosure Report, but one thing is clear so far. Even with the most write-intensive part of the workload kept far away from the SSDs (log files are striped on 384 fast spinners), the Flash SSDs didn&#8217;t even reach 3x IOPS/Disk improvement over HDD.</p>
<p>HDDs are reliably good for about 600 TPMc per disk in these large scale systems. In this test the 4,800 SSDs were only 2.5x better than HDD &#8212; around 1,500 TPMc/SSD.</p>
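<p>The per-drive comparison above is simple arithmetic. A minimal sketch, using only the round numbers quoted in this comment (the variable names are illustrative, and the implied totals are derived from those round numbers, not read from the report):</p>

```python
# Round figures quoted in the comment above; names are hypothetical.
ssd_count = 4800
tpmc_per_ssd = 1500   # approximate TPMc delivered per SSD
tpmc_per_hdd = 600    # typical TPMc per HDD in large-scale systems

# How much better one SSD is than one HDD on this metric.
ssd_vs_hdd = tpmc_per_ssd / tpmc_per_hdd

# Throughput implied by the round numbers, and the HDD count that
# would deliver the same throughput at 600 TPMc per disk.
implied_total_tpmc = ssd_count * tpmc_per_ssd
equivalent_hdds = implied_total_tpmc // tpmc_per_hdd

print(f"SSD vs HDD: {ssd_vs_hdd:.1f}x")
print(f"Implied total: {implied_total_tpmc:,} TPMc")
print(f"Equivalent HDD count: {equivalent_hdds:,}")
```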
<p>Oh well&#8230;so much for replacing hundreds or dozens&#8230;or even a handful of spinning disks with a single Flash SSD.</p>
<p>As for storage provisioning costs, the 3-Tier approach looks to be about 1.5x more expensive than a single tier of HDD, using Oracle and HP&#8217;s previous big machine for comparison. Of course that doesn&#8217;t include the costs of managing three islands of storage instead of one.</p>
<p><a href="http://www.tpc.org/results/FDR/TPCC/Sun_T5440_TPC-C_Cluster_FDR_101109.pdf" rel="nofollow">http://www.tpc.org/results/FDR/TPCC/Sun_T5440_TPC-C_Cluster_FDR_101109.pdf</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: KD Mann</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-205036</link>
		<dc:creator>KD Mann</dc:creator>
		<pubDate>Wed, 09 Sep 2009 17:30:57 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-205036</guid>
		<description>Steve, Robin:

I think there is one really simple reason why we have not seen TPC-C or TPC-E running on SSD.

If we were to see an actual, audited cost/performance number on a transactional database system, the vast majority of even really smart storage and database guys would be picking their jaws up from the floor after seeing how expensive these things are on a cost/transaction basis compared to HDD.

FYI, IBM already built a HUGE SSD-based system in Q3&#039;08. Its configuration was identical to those that IBM uses for its TPC-C testing. It was called &quot;quicksilver&quot;, and after IBM published a &quot;million IOPS&quot; number (IOmeter), lots of us were waiting for the TPC results.

They never came. Quicksilver has not been heard from again. This silence (combined with the fact that IBM is the most prolific publisher of TPC benchmarks on the planet) speaks volumes.

I DID however recently see a shocker from IBM on SPC-1C/E, though almost nobody noticed. When IBM ran STEC SSDs (again, on a configuration that was identical to their TPC-C/E setups), the STEC SSDs:

(a) delivered only about 12% of STEC&#039;s advertised IOPS (while HDDs usually deliver slightly more than advertised IOPS on SPC-1)
(b) were no cheaper per IOP than spinning disk (actually slightly more expensive when compared to Seagate 10K SFF disks)
(c) cost 135X as much per GB as enterprise-class spinning disk
The net here is that there is no market for Enterprise Flash SSD when dollars/IOP in REAL workloads are no cheaper than HDD, and costs/GB are more than two orders of magnitude higher than HDD.

This is EXACTLY the situation today, and an audited TPC-C/E &quot;Cost per TPMc&quot; result would illustrate this clearly and spell disaster for the Enterprise Flash Hype Party.

This would not be Good. EMC (and a few others) are making obscene margins on the SSDs that are already obscenely profitable for STEC (at $23,000 for 146GBytes!!!!). In this context, it&#039;s not surprising that the overwhelming majority of sales that STEC is reporting are from EMC. This particular hype-cycle is a hugely profitable one.

It will be interesting to look at EMC&#039;s inventory levels at the end of STEC&#039;s fiscal year, to see how much sell-thru is really going on.</description>
		<content:encoded><![CDATA[<p>Steve, Robin:</p>
<p>I think there is one really simple reason why we have not seen TPC-C or TPC-E running on SSD.</p>
<p>If we were to see an actual, audited cost/performance number on a transactional database system, the vast majority of even really smart storage and database guys would be picking their jaws up from the floor after seeing how expensive these things are on a cost/transaction basis compared to HDD.</p>
<p>FYI, IBM already built a HUGE SSD-based system in Q3&#8217;08. Its configuration was identical to those that IBM uses for its TPC-C testing. It was called &#8220;quicksilver&#8221;, and after IBM published a &#8220;million IOPS&#8221; number (IOmeter), lots of us were waiting for the TPC results.</p>
<p>They never came. Quicksilver has not been heard from again. This silence (combined with the fact that IBM is the most prolific publisher of TPC benchmarks on the planet) speaks volumes.</p>
<p>I DID however recently see a shocker from IBM on SPC-1C/E, though almost nobody noticed. When IBM ran STEC SSDs (again, on a configuration that was identical to their TPC-C/E setups), the STEC SSDs:</p>
<p>(a) delivered only about 12% of STEC&#8217;s advertised IOPS (while HDDs usually deliver slightly more than advertised IOPS on SPC-1)<br />
(b) were no cheaper per IOP than spinning disk (actually slightly more expensive when compared to Seagate 10K SFF disks)<br />
(c) cost 135X as much per GB as enterprise-class spinning disk<br />
The net here is that there is no market for Enterprise Flash SSD when dollars/IOP in REAL workloads are no cheaper than HDD, and costs/GB are more than two orders of magnitude higher than HDD.</p>
<p>This is EXACTLY the situation today, and an audited TPC-C/E &#8220;Cost per TPMc&#8221; result would illustrate this clearly and spell disaster for the Enterprise Flash Hype Party.</p>
<p>This would not be Good. EMC (and a few others) are making obscene margins on the SSDs that are already obscenely profitable for STEC (at $23,000 for 146GBytes!!!!). In this context, it&#8217;s not surprising that the overwhelming majority of sales that STEC is reporting are from EMC. This particular hype-cycle is a hugely profitable one.</p>
<p>It will be interesting to look at EMC&#8217;s inventory levels at the end of STEC&#8217;s fiscal year, to see how much sell-thru is really going on.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robin Harris</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-204674</link>
		<dc:creator>Robin Harris</dc:creator>
		<pubDate>Thu, 13 Aug 2009 14:01:11 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-204674</guid>
		<description>David,

$50/GB is steep, no doubt about it, and probably not what anyone considers reasonable - unless you are the salesman. Nonetheless, the one constant over the last decade is that for protected storage 90-95% of the cost is for all the stuff around the capacity, while the capacity itself is only 5-10% of the storage cost. 

I keep wondering when that will change. Guess I have to keep wondering.

Robin</description>
		<content:encoded><![CDATA[<p>David,</p>
<p>$50/GB is steep, no doubt about it, and probably not what anyone considers reasonable &#8211; unless you are the salesman. Nonetheless, the one constant over the last decade is that for protected storage 90-95% of the cost is for all the stuff around the capacity, while the capacity itself is only 5-10% of the storage cost. </p>
<p>I keep wondering when that will change. Guess I have to keep wondering.</p>
<p>Robin</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Garvie</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-204671</link>
		<dc:creator>David Garvie</dc:creator>
		<pubDate>Wed, 12 Aug 2009 13:14:32 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-204671</guid>
		<description>I am continually amazed that $20M is still considered a reasonable price for ~400TB of high performance disk. Since I can put that capacity (usable, RAID-5 w/ 160 hot spares) on commodity 2.5&quot; 10K SAS drives and controllers in 4 racks (100 TB per rack) for less than $2M list, one really wonders where all that extra cash is going.

Not an apples-to-apples, to be sure, but what price/IOP are you willing to spend, what ROI are you getting for the extra $18.5M, and to what standards of support, uptime, and technology refresh timeframes are you holding your &quot;enterprise&quot; storage vendors for that kind of CAPEX?  I sure hope it&#039;s worth it.

-D</description>
		<content:encoded><![CDATA[<p>I am continually amazed that $20M is still considered a reasonable price for ~400TB of high performance disk. Since I can put that capacity (usable, RAID-5 w/ 160 hot spares) on commodity 2.5&#8243; 10K SAS drives and controllers in 4 racks (100 TB per rack) for less than $2M list, one really wonders where all that extra cash is going.</p>
<p>Not an apples-to-apples, to be sure, but what price/IOP are you willing to spend, what ROI are you getting for the extra $18.5M, and to what standards of support, uptime, and technology refresh timeframes are you holding your &#8220;enterprise&#8221; storage vendors for that kind of CAPEX?  I sure hope it&#8217;s worth it.</p>
<p>-D</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Fazal Majid</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-204645</link>
		<dc:creator>Fazal Majid</dc:creator>
		<pubDate>Sun, 09 Aug 2009 17:48:19 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-204645</guid>
		<description>@David - SSDs are indeed too expensive to be used for an entire database. Sun&#039;s ZFS hybrid storage pool technology is very cool but it is not directly relevant to OLTP databases.
Typically you will use SSDs to optimize access to critical tables and indexes, and sometimes transaction journal or redo logs (although disks are very good at sequential I/O and SSDs are best deployed for random I/O, as Steve notes).
This brings up an interesting phenomenon - SSDs&#039; extremely low latency reveals bottlenecks in database engines themselves, e.g. low lock granularity. You need to benchmark your system to ensure there are no priority inversion effects where a query involving both SSD and HDD holds up a higher-priority query (or one that has a more stringent latency SLA) that uses SSD exclusively.</description>
		<content:encoded><![CDATA[<p>@David &#8211; SSDs are indeed too expensive to be used for an entire database. Sun&#8217;s ZFS hybrid storage pool technology is very cool but it is not directly relevant to OLTP databases.<br />
Typically you will use SSDs to optimize access to critical tables and indexes, and sometimes transaction journal or redo logs (although disks are very good at sequential I/O and SSDs are best deployed for random I/O, as Steve notes).<br />
This brings up an interesting phenomenon &#8211; SSDs&#8217; extremely low latency reveals bottlenecks in database engines themselves, e.g. low lock granularity. You need to benchmark your system to ensure there are no priority inversion effects where a query involving both SSD and HDD holds up a higher-priority query (or one that has a more stringent latency SLA) that uses SSD exclusively.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: xfer_rdy</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-204644</link>
		<dc:creator>xfer_rdy</dc:creator>
		<pubDate>Sun, 09 Aug 2009 17:40:18 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-204644</guid>
		<description>I for one would like to see a full TPC-H benchmark for TMS&#039;s 6200 for large datasets. I&#039;d also like to see smallest dataset versus media life.

If media life is reasonable, they may have just carved out a niche for emerging multi-tenant providers.

@Fazal: I agree with you about FusionIO&#039;s product, it is very fast. Most don&#039;t realize it&#039;s one of the few products that are true &quot;packet&quot; storage devices. Now we just have to wait 20 years until the motherboard architectures and operating systems catch up with its potential.

:)</description>
		<content:encoded><![CDATA[<p>I for one would like to see a full TPC-H benchmark for TMS&#8217;s 6200 for large datasets. I&#8217;d also like to see smallest dataset versus media life.</p>
<p>If media life is reasonable, they may have just carved out a niche for emerging multi-tenant providers.</p>
<p>@Fazal: I agree with you about FusionIO&#8217;s product, it is very fast. Most don&#8217;t realize it&#8217;s one of the few products that are true &#8220;packet&#8221; storage devices. Now we just have to wait 20 years until the motherboard architectures and operating systems catch up with its potential.</p>
<p> <img src='http://storagemojo.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Steve Jones</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-204639</link>
		<dc:creator>Steve Jones</dc:creator>
		<pubDate>Sun, 09 Aug 2009 07:29:07 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-204639</guid>
		<description>I missed the comment about the nature of the RamSan storage. Despite the name, the RamSan 6200 (and its related products) are all flash-, not RAM-based. If you tried to build a 100TB storage device using DRAM then the cost, footprint and power consumption would be enormous.

Certainly there was a time when the only available solid state storage was of that type, but flash-based SSD has essentially taken over, apart from small, extremely fast requirements where NV RAM may still play a part.</description>
		<content:encoded><![CDATA[<p>I missed the comment about the nature of the RamSan storage. Despite the name, the RamSan 6200 (and its related products) are all flash-, not RAM-based. If you tried to build a 100TB storage device using DRAM then the cost, footprint and power consumption would be enormous.</p>
<p>Certainly there was a time when the only available solid state storage was of that type, but flash-based SSD has essentially taken over, apart from small, extremely fast requirements where NV RAM may still play a part.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Steve Jones</title>
		<link>http://storagemojo.com/2009/08/07/tpc-c-comparing-ssd-disk/comment-page-1/#comment-204638</link>
		<dc:creator>Steve Jones</dc:creator>
		<pubDate>Sun, 09 Aug 2009 07:16:58 +0000</pubDate>
		<guid isPermaLink="false">http://storagemojo.com/?p=1525#comment-204638</guid>
		<description>I&#039;d seen the TPC-H result, but hadn&#039;t analysed it in comparison with a disk alternative. TPC-H has a very different I/O profile to TPC-C and it was really the transactional profile I was interested in. The low latency of these SSD drives is particularly relevant to the issues I see with many systems of that sort.

I&#039;ve also seen what SUN are up to with SSD caches in their unified storage device. I believe all but the lowest-end appliance use very large and relatively slow disks (1TB 5,400 RPM drives at launch, with plans to go to 2TB). Large enterprise arrays include large non-volatile memory (in the tens, or even hundred+ GB region), albeit at a very high cost. This large NV memory is used for cache, including such hidden things as the maintenance of pointers and maps for various replication capabilities. In many ways SUN&#039;s use of cache is an alternative to that large NV cache, although it is organised in a fundamentally different way and has both write- and read-optimised parts.

In our experience, all these cached systems hit limits on very large databases with random access. The very good cache prospects tend to be grabbed by the database, leaving the storage array to cope with only writes and the remaining &quot;dross&quot;. Writes are easy to deal with (provided the ratio isn&#039;t too high). It&#039;s the random reads on large data sets that are the real problem.

It&#039;s an old story with cache - the law of diminishing returns sets in as you add more, and you eventually get limited by the small proportion of misses. We have systems with 99.8%+ database cache hits and they still get I/O bound on random reads, as the remaining rump of 15,000 random IOPS is still the dominant factor.

I think the SUN box has certainly got a place for many workloads. I suspect it will make a very good general-purpose file server, and it will use its combination of SSD and large, slow disks to hit some really good price/performance points, especially with various replication capabilities built-in. However, it isn&#039;t going to be a good device if your requirement is for a large (multi-TB) heavily randomly-accessed data store, as you&#039;ll get I/O bound on those slow disks.</description>
		<content:encoded><![CDATA[<p>I&#8217;d seen the TPC-H result, but hadn&#8217;t analysed it in comparison with a disk alternative. TPC-H has a very different I/O profile to TPC-C and it was really the transactional profile I was interested in. The low latency of these SSD drives is particularly relevant to the issues I see with many systems of that sort.</p>
<p>I&#8217;ve also seen what SUN are up to with SSD caches in their unified storage device. I believe all but the lowest-end appliance use very large and relatively slow disks (1TB 5,400 RPM drives at launch, with plans to go to 2TB). Large enterprise arrays include large non-volatile memory (in the tens, or even hundred+ GB region), albeit at a very high cost. This large NV memory is used for cache, including such hidden things as the maintenance of pointers and maps for various replication capabilities. In many ways SUN&#8217;s use of cache is an alternative to that large NV cache, although it is organised in a fundamentally different way and has both write- and read-optimised parts.</p>
<p>In our experience, all these cached systems hit limits on very large databases with random access. The very good cache prospects tend to be grabbed by the database, leaving the storage array to cope with only writes and the remaining &#8220;dross&#8221;. Writes are easy to deal with (provided the ratio isn&#8217;t too high). It&#8217;s the random reads on large data sets that are the real problem.</p>
<p>It&#8217;s an old story with cache &#8211; the law of diminishing returns sets in as you add more, and you eventually get limited by the small proportion of misses. We have systems with 99.8%+ database cache hits and they still get I/O bound on random reads, as the remaining rump of 15,000 random IOPS is still the dominant factor.</p>
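<p>To put rough numbers on the diminishing-returns point, here is a minimal sketch. It assumes the 15,000 random IOPS are exactly the cache misses at a 99.8% hit rate (the figures above say "99.8%+", so this is a simplification), and the function name is illustrative.</p>

```python
from fractions import Fraction

# Assumption for this sketch: the 15,000 random IOPS reaching storage are
# the cache misses at exactly a 99.8% database cache hit rate.
hit_rate = Fraction("0.998")
miss_iops = 15_000

# Total logical reads per second implied by that miss traffic.
logical_reads = int(miss_iops / (1 - hit_rate))

def misses_at(rate: str) -> int:
    """Physical random reads per second left over at a given hit rate."""
    return int(logical_reads * (1 - Fraction(rate)))

print(f"Implied logical reads/sec: {logical_reads:,}")
for rate in ("0.998", "0.999", "0.9995"):
    print(f"hit rate {float(Fraction(rate)):.2%}: {misses_at(rate):,} random IOPS to storage")
```

<p>Under that assumption, even pushing the hit rate from 99.8% to 99.9% still leaves several thousand random reads per second for the disks to serve.</p>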
<p>I think the SUN box has certainly got a place for many workloads. I suspect it will make a very good general-purpose file server, and it will use its combination of SSD and large, slow disks to hit some really good price/performance points, especially with various replication capabilities built-in. However, it isn&#8217;t going to be a good device if your requirement is for a large (multi-TB) heavily randomly-accessed data store, as you&#8217;ll get I/O bound on those slow disks.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
