Herewith continues NAND – an engineer’s perspective.
Begin part two.
. . . tested application performance hardly changes either . . . .
Actually, this makes sense. If you are accessing 4k of data, then both HDD and SSD are fast enough and you don’t care. If you are accessing a 1MB file, that is 256 x 4k sector accesses, and the sectors will be laid out one after the other, which is where HDDs perform well. SSDs shine when you need to do 256 x 4k sector accesses and the sectors you are accessing are scattered across the disk, but as far as I know this access pattern is not common except on servers.
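A back-of-envelope sketch makes the point. All the latency and bandwidth numbers below are round illustrative assumptions, not vendor specs:

```python
# Time to read 1MB as 256 x 4kB sectors, sequential vs. scattered.
# All numbers are illustrative assumptions, not measured specs.
SECTORS = 256
SECTOR_BYTES = 4 * 1024

HDD_SEEK_S = 0.010      # ~10 ms average seek + rotational latency (assumed)
HDD_XFER_BPS = 80e6     # ~80 MB/s streaming transfer (assumed)
SSD_ACCESS_S = 0.0001   # ~0.1 ms per random access (assumed)
SSD_XFER_BPS = 100e6    # ~100 MB/s transfer (assumed)

def hdd_time(scattered: bool) -> float:
    seeks = SECTORS if scattered else 1   # sequential: one seek, then stream
    return seeks * HDD_SEEK_S + SECTORS * SECTOR_BYTES / HDD_XFER_BPS

def ssd_time(scattered: bool) -> float:
    # No moving parts: scattered and sequential cost about the same.
    return SECTORS * SSD_ACCESS_S + SECTORS * SECTOR_BYTES / SSD_XFER_BPS

for scattered in (False, True):
    label = "scattered" if scattered else "sequential"
    print(f"{label:10s} HDD {hdd_time(scattered)*1000:7.1f} ms   "
          f"SSD {ssd_time(scattered)*1000:7.1f} ms")
```

Under these assumptions the HDD takes a couple of seconds for the scattered case and a couple dozen milliseconds for the sequential one, while the SSD barely notices the difference – which is the whole argument in two functions.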
And what about the 4-bit MLC that Toshiba is counting on to drive costs down?
I’m a NAND flash fan, but this is scary stuff for me. To store 1 bit in a bit cell, you need to distinguish between two voltage levels. To store 2 bits, you need to distinguish 4 levels. For 3 bits, 8 levels. For 4 bits, 16 levels. I think at the 4 bit/16 level point, we’re down to where 10-20 individual electrons can make the difference in the bits read out.
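The arithmetic behind that worry is simple to sketch. The 4 V usable voltage window below is an assumed round number, not a datasheet value:

```python
def levels(bits: int) -> int:
    """Distinct voltage levels needed to store `bits` bits in one cell."""
    return 2 ** bits

def margin_mv(bits: int, window_v: float = 4.0) -> float:
    """Spacing between adjacent levels, assuming a fixed usable voltage
    window. The 4 V window is an illustrative assumption, not a spec."""
    return window_v / (levels(bits) - 1) * 1000

for b in range(1, 5):
    print(f"{b} bit/cell: {levels(b):2d} levels, ~{margin_mv(b):.0f} mV apart")
```

Each extra bit doubles the level count, so the margin between adjacent levels shrinks roughly by half each time – and that margin is what the sense amps, and ultimately a handful of electrons, have to resolve.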
This will be less durable than current SLC. How do you explain that to consumers?
The answer is easy, but doing it is hard. You have to make it so that the issues are completely invisible to consumers.
Note that this has been done successfully with flash for years. Most of the memory cards (SD, MMC, etc) that people have been buying for years use MLC flash.
Flash has read errors – that’s why vendors implement error detection.
NAND chips are generally organized in write pages, with a spare area for each page – typically a 2kB page with 64B of spare area. The spare area is used to store ECC parity data and metadata (more about this shortly).
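To make the ECC idea concrete, here is a toy single-error-correcting code in the spirit of what lives in the spare area. Real controllers use much stronger codes (Hamming or BCH over the full page); this is only a sketch with assumed page geometry:

```python
# Toy single-error-correcting code, in the spirit of the ECC that NAND
# controllers keep in the per-page spare area. Real parts use stronger
# codes; this sketch handles exactly one flipped bit.
PAGE_BYTES = 2048    # typical write page
SPARE_BYTES = 64     # holds ECC parity plus FTL metadata

def ecc(page: bytes) -> int:
    # XOR together (1 + position) of every set bit. If exactly one bit
    # flips later, the code changes by exactly that bit's (1 + position).
    acc = 0
    for byte_i, b in enumerate(page):
        for bit_i in range(8):
            if (b >> bit_i) & 1:
                acc ^= byte_i * 8 + bit_i + 1
    return acc

def correct(page: bytearray, stored: int) -> bytearray:
    """Repair at most one flipped bit; multi-bit errors need a real code."""
    delta = ecc(bytes(page)) ^ stored
    if delta:                      # non-zero: bit (delta - 1) was flipped
        pos = delta - 1
        page[pos // 8] ^= 1 << (pos % 8)
    return page
```

A flipped bit toggles its own (1 + position) in or out of the XOR, so the difference between the stored code and the recomputed one points straight at the bad bit.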
HDDs have read errors as well; they also write their data to the platter using ECC, plus other encodings that make it easier to recover the bit clock and align the heads when reading the data back.
But flash has a problem disks don’t: flash drives move your data around a lot more often than disks do. Every time a flash drive writes a page, it has to erase the entire block that page is in.
Not quite right. Generally, a page can only be written once, and has to be erased before it can be written again. And unfortunately, erases can only be done on an erase block, which is usually 64 write pages. If you have to erase a page, then you might have to move 63 other pages to free up the erase block – yuck! It happens sometimes, but the FTL (flash translation layer) software that manages all of this is usually optimized to avoid this situation as much as possible.
The normal scenario is that you write a page, and the FTL just puts the new data in a new page somewhere, and marks the old page as obsolete. Once the FTL runs low on space, it needs to do garbage collection, but if you put a little extra NAND in your system so that even a full filesystem has some empty pages, you can make that pretty rare.
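The out-of-place write and the last-resort garbage collection can be sketched in a few dozen lines. This is a toy model under assumed geometry – real FTLs also wear-level, run GC in the background, and persist their map – but the mechanics are the ones described above:

```python
PAGES_PER_BLOCK = 64

class ToyFTL:
    """Sketch of out-of-place writes with last-resort garbage collection.
    Geometry and policy are illustrative, not any vendor's design."""

    def __init__(self, blocks: int):
        self.free = list(range(blocks * PAGES_PER_BLOCK))  # free physical pages
        self.map = {}          # logical page -> physical page
        self.data = {}         # physical page -> contents
        self.obsolete = set()  # physical pages holding stale copies

    def read(self, lpage):
        return self.data[self.map[lpage]]

    def write(self, lpage, value):
        if not self.free:
            self._garbage_collect()
        ppage = self.free.pop()          # always land in a fresh page
        old = self.map.get(lpage)
        if old is not None:
            self.obsolete.add(old)       # old copy is now stale, not erased
            del self.data[old]
        self.map[lpage] = ppage
        self.data[ppage] = value

    def _garbage_collect(self):
        def block(p):
            return p // PAGES_PER_BLOCK
        # Victim: the block with the most stale pages (cheapest to reclaim).
        counts = {}
        for p in self.obsolete:
            counts[block(p)] = counts.get(block(p), 0) + 1
        victim = max(counts, key=counts.get)
        victim_pages = set(range(victim * PAGES_PER_BLOCK,
                                 (victim + 1) * PAGES_PER_BLOCK))
        # Any live pages in the victim must be moved first -- the "yuck" case.
        live = [(l, p) for l, p in self.map.items() if p in victim_pages]
        saved = [(l, self.data.pop(p)) for l, p in live]
        for l, _ in live:
            del self.map[l]
        # "Erase": the whole block's pages become writable again.
        self.obsolete -= victim_pages
        self.free.extend(sorted(victim_pages))
        for l, value in saved:
            self.write(l, value)
```

Note that a rewrite never touches the old page in place – it just drops new data into a free page and updates the map, which is why the extra over-provisioned NAND keeps garbage collection rare.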
No hard numbers from the vendors – depends on how good their signal processing algorithms are – but it could easily be 5,000 writes – down from 10,000 today.
Actually, some of the NAND vendors are already at 5k erase/write cycles today. This and slow write speeds are definitely the weak links for MLC NAND.
I believe that it is possible to do a good enough job with caches in the computer DRAM, and in the FTL to make a system built from 5k endurance work for a very long time.
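The lifetime arithmetic backs this up. Every number below is an assumed round figure (drive size, daily host writes, write amplification), not a measurement:

```python
# Rough drive-lifetime arithmetic under assumed numbers: a 64 GB drive,
# 5,000 erase/write cycles per block, 10 GB of host writes per day, and
# a write-amplification factor of 2 (each host write causes about twice
# that much internal flash writing). All figures are illustrative.
DRIVE_GB = 64
CYCLES = 5000
HOST_GB_PER_DAY = 10
WRITE_AMP = 2.0

def lifetime_years() -> float:
    total_writable_gb = DRIVE_GB * CYCLES       # assumes perfect wear leveling
    flash_gb_per_day = HOST_GB_PER_DAY * WRITE_AMP
    return total_writable_gb / flash_gb_per_day / 365

print(f"~{lifetime_years():.0f} years")
```

Even with generous write amplification, 5k-endurance parts spread over a whole drive outlast the machine they are in, provided the wear leveling is competent.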
Note that the 5k number is a statistical thing – this is the number of cycles at which about x% of the blocks will have failed (I think x% = 50%, but I didn’t look it up). This means that some blocks might fail when the part is new, and some might last a lot longer. If the software is done right, then the amount of available storage space will gradually shrink as blocks fail, and the entire drive won’t suddenly fail.
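A quick simulation shows what "statistical" means here. The lognormal endurance distribution is purely an assumption for illustration, not vendor data:

```python
# Give each block its own random endurance (lognormal around a median of
# roughly 5k cycles -- an assumed distribution, not vendor data) and watch
# usable capacity shrink gradually instead of failing all at once.
import random

random.seed(1)
BLOCKS = 1000
endurance = [random.lognormvariate(8.5, 0.4) for _ in range(BLOCKS)]
# median = e^8.5, about 4,900 cycles

for cycles in (1000, 5000, 10000, 20000):
    alive = sum(e > cycles for e in endurance)
    print(f"after {cycles:6d} cycles: {alive:4d}/{BLOCKS} blocks still usable")
```

A few blocks die early, a few last far past the rating, and the drive's capacity tapers off rather than dropping to zero – exactly the behavior good FTL software should expose gracefully.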
The map that keeps track of where your data is rapidly gets very complex – and itself is regularly read and rewritten. How well protected is this critical data structure? If it isn’t bulletproof you can kiss your data goodbye.
All true. But you can also write metadata information in the spare area, to allow you to rebuild the FTL map if something goes horribly wrong.
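The rebuild trick is simple in principle: if each physical page's spare bytes record which logical page it holds and a write sequence number, a full scan of the chip recovers the map by keeping only the newest copy of each logical page. The field names below are illustrative:

```python
# Sketch of rebuilding the FTL map from spare-area metadata after a
# crash. Assumes each physical page's spare bytes hold (logical page,
# write sequence number) -- an illustrative layout, not a real format.
def rebuild_map(pages):
    """pages: iterable of (physical_page, logical_page, seq) tuples
    read from the spare areas. Returns logical -> physical map."""
    best = {}   # logical page -> (seq, physical page)
    for ppage, lpage, seq in pages:
        if lpage not in best or seq > best[lpage][0]:
            best[lpage] = (seq, ppage)
    return {l: p for l, (s, p) in best.items()}

# Three copies of logical page 7 exist on flash; only the newest counts.
spare = [(10, 7, 1), (11, 7, 2), (12, 3, 3), (13, 7, 4)]
print(rebuild_map(spare))
```

The scan is slow, so it is a recovery path rather than the normal boot path, but it means a corrupted in-flight map is not automatically fatal.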
Also, HDD-based systems have the same problem with their FAT tables, or the modern equivalent. That map is normally stored on the disk and in the computer’s RAM, with the disk copy being a little out of date. Lose power at the wrong moment, and bad things can happen.
The StorageMojo take
Many thanks to the anonymous contributor. Net/net this points again to the suitability of flash drives for servers – and not so much for notebooks – the original subject.
The larger issue is the lack of transparency on the part of NAND SSD vendors. Until their architectures can be independently reviewed, we all have to rely upon marketing assurances – not! – and the useful but skimpy testing provided by sites like Anandtech.
The server-side SSD market can work with those limits. After all, the vendor of the complete system has to stand behind it.
But that is a tiny fraction of the total available market. The big win is on the consumer side: 100+ million units, if the product delivers.
Samsung, Toshiba: your current strategy is doomed. You need to engage at the consumer’s level instead of relying on the usual marketing hype. Your product is too costly, now and 3 years from now, to succeed without delivering real benefits.
You aren’t there yet.
Comments welcome, of course.