I got into it today on ZDnet with one of the other bloggers, George Ou, who published Why dumb-downed no-RAID storage is bad for consumers. As I believe that RAID is an idea whose time is coming to a close, I responded with Why home RAID won’t fly.
So far, ZDnet readers seem more persuaded by George.
I’m in my trailer, sulking. How could they?
The exchange has sharpened my thinking, as George and some other folks came back with some good comments, and a couple of the more perceptive – obviously – folks came to my defense.
While I like the Drobo storage robot concept and Geoff Barrall personally, I’ll be very interested to see what kind of market they develop. Which is marketing-speak for “I’m dubious.”
Why?
The secular trend in computers is that technologies scale up from consumers – not scale down from the enterprise. But so what? The real question is why.
Because consumer stuff is cheap and enterprise stuff is expensive. Because one is high volume and the other low volume. Because volume enables low-cost experimentation and improvement. Because building cheap stuff usually forces people to focus on what really works for customers who won’t open a manual.
Home RAID? I don’t think so
Why not? Let me count the ways:
- Complexity: RAID fails ugly. Pick the wrong drive to pull or copy and your protected data is no more. And because redundancy means more drives and more parts, a RAID system sees component failures much more often than a single disk does.
- Completeness: while RAID solves some problems, it isn’t a substitute for a backup. Getting customers to understand that is hard. Not all the ZDnet readers get it.
- Cost: HW RAID means a controller, a chassis. A lot of money before you buy the first disk. SW RAID is cheaper – with Intel’s ICH8 chip almost free – and consumers still need to understand why they are buying a second drive and not getting more capacity.
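A rough sketch of the arithmetic behind the complexity point, in Python. The 3% annual failure rate is an assumed figure for illustration only, and the model assumes drives fail independently, which real drives from the same batch may not.

```python
def prob_any_drive_fails(n_drives: int, annual_failure_rate: float) -> float:
    """Chance that at least one of n independent drives fails within a year."""
    if n_drives < 0:
        raise ValueError("n_drives must be non-negative")
    if not 0.0 <= annual_failure_rate <= 1.0:
        raise ValueError("annual_failure_rate must be between 0 and 1")
    # P(at least one failure) = 1 - P(no drive fails)
    return 1.0 - (1.0 - annual_failure_rate) ** n_drives


if __name__ == "__main__":
    ASSUMED_AFR = 0.03  # hypothetical annual failure rate per drive
    for n in (1, 2, 4):
        p = prob_any_drive_fails(n, ASSUMED_AFR)
        print(f"{n} drive(s): {p:.1%} chance of at least one failure per year")
```

With the assumed 3% rate, a four-drive array comes out near 11.5%, almost four times the single-drive figure. That counts drive failure events, not data loss: the array is supposed to survive the first one, provided the owner then does the right thing.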
The vast fetid swamp of consumer ignorance
In my small town I often help people with computer problems. Often these are small business people who’ve been using computers for years. What I’ve found is that these people don’t have the faintest idea how their computer works or how the components work together. To most people computers are magick.
Case in point: a professional photographer lives across the street. Two Macs, scanner, several photo quality printers, a couple of fancy digital SLRs. One Mac does color correction. The other is her main machine. Photoshop and a bunch of other image processing software that she knows how to use. Pretty sharp lady.
And she doesn’t know the difference between disk and RAM. It is all “memory” to her. She never added RAM to the skimpy amount Apple provided, so her disks are thrashed all day. She’d let the disks fill up, not realizing that she needs at least 10% free space just for the OS to use. A few hundred megabytes sounds like a lot to her.
This is the person you are going to sell RAID to? She’s your target market, with hundreds of gigabytes of valuable digital assets to protect. How would you start the conversation?
She does understand the value and process of making copies, which she would still need to do even if she bought your RAID gizmo. So how do you explain your value-add?
The StorageMojo take
Home RAID for the masses is an uphill battle. Backup is the battle the industry can win. What kind is the issue. Across the net to Mozy, Carbonite or some more fully featured option? Local backup to a DAS hard drive or to a simple USB-attached NAS drive? Those “one-touch” Maxtor drives?
Comments welcome, of course. Leaving for Boston today. If you’re in the neighborhood this weekend, send me an email and we’ll do coffee. I’ll be staying at Copley Square. Moderation may be a little slower than usual, but moderate I will.
{ 29 comments }
“Backup is the battle the industry can win. ”
Getting a company to implement a proper backup solution is hard enough. The average home user – that’s a long way off. Home users have a ton of technology at their fingertips that they never use: Windows Snapshots, System Restore, NT Backup. All of those are already there and comparatively easy (next to enterprise tools), but users don’t use them.
Users will start using RAID or regular backups when it takes more effort to not use them than to just leave them enabled.
I don’t want to sound like a jerk or anything – but your photography friend who lives across the street needs to learn more about the technology she bases her business on. I’m not saying she deserves to lose her data, but that kind of wilful ignorance is pretty indefensible. Nothing stopping her taking 5 minutes to learn what the difference between RAM and HD is.
I’m gonna have to disagree with you on this, btw. Backing up hundreds of gigabytes of data in any kind of timely and cost-effective manner is very hard to do. I don’t do it and I’m an “IT” person. I’m gonna have to go with RAID too, and if you really need offsite backup – and I’m not denying the value of backup, just emphasising its difficulties – then pay for an online solution. As a techie, I know how to rsync my whole home directory to a remote server I control. Most people don’t, however, so they should pay for a turnkey solution. Online services have the economy of scale and concentration of expertise to do backup right, and I’d go for that over messing around with a pile of mislabelled tapes any day. Needless to say there is no HD on the market large enough for me to back all my data up onto.
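For readers wondering what “rsync my whole home directory to a remote server” looks like in practice, here is a minimal Python sketch that only builds and optionally runs the command. The host name and paths are hypothetical placeholders; `-a`, `-z`, `--delete` and `--dry-run` are standard rsync options, but the right set for any real setup is a judgment call.

```python
import subprocess


def build_rsync_command(source_dir: str, destination: str, dry_run: bool = True) -> list[str]:
    """Build an rsync command that mirrors source_dir to a remote destination."""
    cmd = ["rsync", "-a", "-z", "--delete"]  # archive mode, compress, remove stale files
    if dry_run:
        cmd.append("--dry-run")  # show what would change without copying anything
    # A trailing slash makes rsync copy the directory's contents, not the directory itself.
    cmd += [source_dir.rstrip("/") + "/", destination]
    return cmd


def run_backup(source_dir: str, destination: str, dry_run: bool = True) -> int:
    """Run rsync and return its exit code (0 means success)."""
    return subprocess.run(build_rsync_command(source_dir, destination, dry_run)).returncode


if __name__ == "__main__":
    # Hypothetical destination: a server the user controls, reachable over ssh.
    print(" ".join(build_rsync_command("/home/me", "me@backup.example.org:backups/home/")))
```

Note that `--delete` propagates deletions to the remote copy, so on its own this is a mirror rather than a history of versions.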
What to say to these people? Here’s the analogy I’ve made in the past – RAID is like a spare tire for your car. You’ve got 4 disks already – and the spare’s just sitting there in case one of the main ones fails.
The answer to the question of which should you use, RAID or backup, is always “both”.
It’s not a contest between RAID and backup. Backup gets your files back, after something (human, mechanical, or software-ish) went poof. RAID provides a way (although complex) to keep your system running in the meantime.
There is no contest between the two, there is just a discussion each home user needs to have about how much they need. Most home users will be ok with a good backup, and a day or two of downtime to restore things in the event of major failure.
Good comments.
True, true. Which is why I like Carbonite and Mozy. Once you install them they just work, in the background, and you don’t need to know a thing. Your data is encrypted and ready to download.
Yes, it is pretty shocking to me too, but she isn’t the only one. To most users this stuff is just meaningless words. And I think you underestimate what is required. Once I understand RAM and disk, then I need to understand something about virtual memory to get the relationship between the two. Once I get that then I can make an informed buying decision if I can figure out what I need to buy. These people have lives, too.
I run Carbon Copy Cloner – which front ends rsync on the Mac – to a FireWire drive I store in a safe and I’ll start using Mozy as soon as they get it working on the Mac. Plus I leave all my email on servers for 30 days. So I’m a belt and suspenders kind of guy, at least for data.
While I agree with you that RAID + backup is ideal, most people aren’t going to spend the money for both. So what do you recommend first? Backup.
True, another take on why backup is the first choice.
Cheers,
Robin
I completely agree that RAID is low on the list of things to be worried about in a home server. The first priority should be backups. RAID only prepares you for one specific failure, while backup prepares you for almost any incident.
The problem as I see it is that new tech people who have come into the profession in the last 10 years have had it drilled into their heads that RAID and SERVER always go together. It’s a crutch: they feel safe suggesting RAID because it’s always done that way. They can’t seem to think outside of the box. Just look at the arguments for using RAID; most of them seem to be saying that it’s a good idea because enterprise users are using it. Then there are the comments from some users that clearly show they have no understanding of the deeper details of RAID 5. RAID 5 is, and always has been, a compromise.
The first line of defense is always backup, after that RAID is a luxury.
It’s called a data management plan. Depending on your needs, you need backup or RAID or other things. The real question is what is the goal you are trying to reach. Worried about having different versions over time – that’s one problem. Worried about your data dying on a hard disk – that’s another problem. Worried about your data surviving your house fire – that’s yet another problem. Worried about access to your data while you are out of the house – one more.
What would I like to see? An online service that keeps the metadata on my data and tracks it – what it is, when it was created, how big it is, where I have it backed up. I would like to use plain old CDs kept at my office for some stuff, Flickr/Google for others, some on my PC only, etc. Sort of like Drobo but not disk oriented.
A combo of raid and a good backup system would be the best thing.
Currently building a media center with four 500 GB disks in RAID 5, giving a total of 1.5 TB (with the ICH8 chip, which I found can do only up to 4 disks). For the backup/media server, Windows Home Server: its “virtual file system” will duplicate data over multiple disks, and it also has a built-in network backup function so you can restore a system in your network. It’s beta at the moment and I’m still having some issues with it, but it’s promising.
And as final backup, a 500 GB external disk (for the really important stuff).
I’m too lazy to burn DVDs, although I occasionally will do the family photos.
A final solution will be offsite backup through a service provider; a major ISP in the Netherlands is already providing such a service for small businesses.
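The 1.5 TB figure for four 500 GB disks in RAID 5 follows from giving up one drive’s worth of space to parity. A small Python sketch of the usual capacity rules, assuming equal-sized drives and the textbook definitions of each level (the function and level names are my own):

```python
def usable_capacity_gb(level: str, n_drives: int, drive_gb: float) -> float:
    """Usable capacity of an array of equal-sized drives, by textbook RAID level."""
    if level == "raid0":  # striping only, no redundancy
        minimum, usable_drives = 2, n_drives
    elif level == "raid1":  # every drive holds the same data
        minimum, usable_drives = 2, 1
    elif level == "raid5":  # one drive's worth of parity
        minimum, usable_drives = 3, n_drives - 1
    elif level == "raid6":  # two drives' worth of parity
        minimum, usable_drives = 4, n_drives - 2
    elif level == "raid10":  # striped mirror pairs
        if n_drives % 2:
            raise ValueError("raid10 needs an even number of drives")
        minimum, usable_drives = 4, n_drives // 2
    else:
        raise ValueError(f"unknown RAID level: {level}")
    if n_drives < minimum:
        raise ValueError(f"{level} needs at least {minimum} drives")
    return usable_drives * drive_gb


if __name__ == "__main__":
    print(usable_capacity_gb("raid5", 4, 500))  # the array described above
```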
Beepo,
As a guy who remembers the world before RAID, people did just fine with software mirroring. I’d forgotten that if you came into the business in the last 12 years or so you’d have RAID drilled into you. Good point.
Data management plan? Good idea. Then I see the interviews with people who lost everything in a fire, flood, tornado or hurricane, even when they had warning, and I think it is not going to happen for most folks. Plus I noticed you said “backup or RAID” and I wonder, is there a case where you would recommend RAID by itself?
Intellectually the RAID + backup idea appeals. Then I go back and look at Desktop RAID is a bad idea and I have to wonder. For the average home user the RAID will probably break 4-5x more often than a single big hard drive. Is it worth it?
Cheers,
Robin
We went with a hybrid backup solution to cover our bases. With just under 100 employees we do not have the resources for the pricey EMC backup solutions. Instead, we opted for a combination of managed backup via a company called AmeriVault and a RAID system optimized with high-capacity 1 TB SATA drives. The backup service allows us to do high-volume test recoveries and gives the security of having data stored offsite. We also continue to use our 4 TB RAID server to centralize our backup data onsite as an extra measure of protection. The backup service uses a backup software platform by Asigra which has been around for 20 years and provides all the bells and whistles of expensive enterprise software such as CDP. There are probably hundreds of backup service providers offering managed backup so pricing is competitive.
“While I agree with you that RAID + backup is ideal, most people aren’t going to spend the money for both. So what do you recommend first? Backup.”
Robin,
I agree backup is “better” than RAID. The problem is that “backup” takes DOUBLE the amount of total capacity to store the same data. You’re effectively talking about a far-apart mirroring system. While that’s a lot more resilient against things like fire than RAID fault tolerance, mirroring is the MOST expensive form of data redundancy and remote mirroring makes it even more expensive. RAID isn’t anywhere as resilient against environmental dangers, but you can effectively use a single drive as an insurance policy to cover 3 or more hard drives.
I really like Sho’s spare tire analogy. RAID is like a spare tire for your car while backup is like a spare car. Of course we would all like a spare car parked safely two blocks from our home in case our main car dies, but who can afford it? But most of us can afford a spare tire for our car. Ideally we’d have a 5th spare tire AND a spare car, but most people can never afford that.
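The cost argument above can be put in numbers. This Python sketch compares how many extra drives each approach needs to protect a given number of data drives; the scheme names are my own labels, and it assumes equal-sized drives and ignores controllers, enclosures and bandwidth.

```python
def extra_drives_needed(data_drives: int, scheme: str) -> int:
    """Extra drives required to protect data_drives worth of data."""
    if data_drives < 1:
        raise ValueError("need at least one data drive")
    if scheme == "full_copy":  # backup or mirror: a second copy of everything
        return data_drives
    if scheme == "single_parity":  # RAID 5 style: one drive of parity
        return 1
    raise ValueError(f"unknown scheme: {scheme}")


def redundancy_overhead(data_drives: int, scheme: str) -> float:
    """Fraction of all purchased capacity that is spent on protection."""
    extra = extra_drives_needed(data_drives, scheme)
    return extra / (data_drives + extra)


if __name__ == "__main__":
    for scheme in ("full_copy", "single_parity"):
        print(scheme, extra_drives_needed(3, scheme), f"{redundancy_overhead(3, scheme):.0%}")
```

For three data drives that is three extra drives (50% of everything bought) for a full copy against one extra drive (25%) for single parity. What the numbers leave out is that the two do not protect against the same failures.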
Looking at it from a real-world perspective, if a user has a RAID in their home server with no backup, and the 13-year-old of the house decides to delete the family photos to make room for videos of him and his friends skateboarding, what is the solution?
I’m not sure they are going to be able to set quotas and permissions if they can’t grasp RAID and backups.
RAID and Backup serve two different masters.
RAID is to support “24×7xforeverxforever” operation.
Backup is to support you if your RAID fails completely (a Local Disaster), your site disappears for some reason (a Local Disaster), your site disappears and all your remote sites disappear (a Global Disaster). If you have a Global Disaster and your remote tape storage disappears you are one unlucky person.
Surprisingly enough there are Strategies that will handle all these situations within the limits of your available funds.
Most Internet sites don’t have backups of the front-end Information from which they make most of their money. They can re-create it faster from scratch than they can restore from tape. And tape restore costs them more revenue. They go to a “reduced” billing algorithm defined in the customer SLA.
Financial Services companies do Backups but only to look good to customers and for Disaster Recovery. If they lose a site they want to be able to restore it fully to re-establish their redundancy. Their time constraints are such that tape is too slow for recovery. Actually disk is too.
The Strategy is to never go down. Lots of RAID and lots of duplications so there are no SPOF’s (Single Points Of Failure)
Because few of these situations occur in the areas of Personal Computing or the SOHO and because the “value” of the Information lost is not life threatening there is little incentive to do much. Insurance is such a burden.
Ever own a beach house? Live on a known fault line? Your thinking changes.
The difference seems to be that the Personal Computing/SOHO area is a good playground for techies. RAID, LVM, you name it. Great place to play.
But let’s say you make your living from your Personal Computing/SOHO.
These people would like a solution much like they get driving their Lexus, Toyota, or Honda. Mercedes are not as feature rich as these cars and Toyota is the most reliable car delivered anywhere in the world according to all the rating services.
What does Storage have to offer along these lines?
I don’t care how my Toyota works. I don’t care how my Information is safeguarded as long as the price is in line with what I think my Information is worth to me.
ISPs have had a golden opportunity to provide all these services but because they are controlled by some of the dumbest, greediest management in existence they blew off all these value-add services.
Ever wonder why it costs you cell phone minutes when someone calls you? The one I love is being charged for “Not using my AT&T Long Distance”. I pay $3.55 per month, of which $0.99 is a “Carrier Cost Recovery Fee”, for not using any Long Distance. What a rip! What a waste of a beautiful opportunity to offer services I would be willing to pay for.
I say “Welcome”… it’s about time.
Can we clarify some terms? When you say ‘consumer RAID’, what do you mean? RAID on their desktop’s DAS? Or a consumer RAID NAS? Or what? It makes a difference. I think my current favorite scheme is: desktops unRAID’d, but a cheapish SOHO NAS (Iomega or Buffalo or the like) that gets backed up to Amazon S3 fairly often. You can either teach users to keep ‘valuable’ data on the NAS drive, or you can cherrypick files from the desktop to ‘back up’ by copying them to the NAS. Speaking of which, got any recommendations on a good SOHO NAS? I’ve heard good things about the Iomega line, Buffalo’s Terastations, and D-Link’s DNS-323.
I am one of the designers of the HP Media Vault, a NAS storage device targeted at home and small business users. I also run a user group and FAQ website for it. I was a big proponent of RAID until I found that our customers were placing so much faith in RAID that they were putting all their data on the NAS and then _deleting_ it from ALL other locations. In many cases, they had no off-site storage strategy for their data.
There are multiple ways to lose both drives at once, as we’re finding. For example, in the case of a lightning strike, we had a customer lose both drives, and his mirrored data, even though the PC, which was on a surge protector, was unscathed. There’s also the case where if you accidentally delete or corrupt a file on one drive, the mirroring function will dutifully delete or corrupt it on the other. RAID cannot protect you from data loss in the event of fire, flood or theft. So I’m now of the opinion that a safer solution is to have an off-site strategy with drives that can be periodically cycled rather than using a RAID-only solution for home users.
It’s not that I don’t like RAID. I do use it myself. It’s just that sometimes we think of a hard drive crash as the only way to lose data and forget about all the other ways the data can be lost.
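The mirroring behavior described above can be shown with a toy model. This Python sketch is not how any real RAID or backup product is implemented; the classes are hypothetical stand-ins, with dictionaries playing the role of disks.

```python
class Mirror:
    """Toy two-disk mirror: every write and delete hits both disks."""

    def __init__(self):
        self.disks = [{}, {}]

    def write(self, name, data):
        for disk in self.disks:
            if disk is not None:
                disk[name] = data

    def delete(self, name):
        for disk in self.disks:
            if disk is not None:
                disk.pop(name, None)  # the deletion is mirrored too

    def fail_disk(self, index):
        self.disks[index] = None  # simulate a dead drive

    def read(self, name):
        for disk in self.disks:
            if disk is not None and name in disk:
                return disk[name]
        return None

    def files(self):
        for disk in self.disks:
            if disk is not None:
                return dict(disk)
        return {}


class SnapshotBackup:
    """Toy backup that keeps point-in-time copies."""

    def __init__(self):
        self.snapshots = []

    def take_snapshot(self, files):
        self.snapshots.append(dict(files))

    def restore(self, name):
        for snapshot in reversed(self.snapshots):  # newest copy first
            if name in snapshot:
                return snapshot[name]
        return None


if __name__ == "__main__":
    mirror, backup = Mirror(), SnapshotBackup()
    mirror.write("wedding.jpg", b"pixels")
    backup.take_snapshot(mirror.files())
    mirror.fail_disk(0)
    print("after disk failure:", mirror.read("wedding.jpg"))
    mirror.delete("wedding.jpg")
    print("after accidental delete:", mirror.read("wedding.jpg"))
    print("from backup:", backup.restore("wedding.jpg"))
```

The mirror shrugs off the dead drive but faithfully carries out the accidental delete; only the snapshot taken earlier still has the file.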
I think this discussion is a bit confused as we all seem to be talking about different things!
When it comes to home users and their data I would like to suggest that there are three main concerns:
1. protection against accidental deletion
2. protection against HD crash
3. protection against site/location disaster
and the solutions have to be
4. foolproof
5. proactive
6. affordable
1. this is probably the most common. A daily backup, if it is maintained, is the best defense. But since it is rarely maintained even at businesses, what would be ideal is some sort of snapshotting service, hopefully enabled by default on a turnkey solution.
2. protection against HD crash – this is probably the second most common cause of data loss and is the chief reason for the existence of RAID. Despite some of the comments above, RAID is in fact a pretty good defense against HD failure (else, uh, no-one would use it) with the advantages of proactive user alert for drive failure (on good devices, anyway) and – importantly – no downtime or reinstallations necessary if failure does occur.
3. Location disaster ie fire or flood – nothing can protect against this but a current offsite backup (I don’t share Robin’s faith in fire safes ..!)
and to the necessary attributes of the service …
4. Foolproof. Backup’s achilles heel. Backups, when they aren’t just a single HD copied over each night, require media changes. Testing. A labelling system. A schedule. Testing. Buying expensive new media when the old stuff wears out, which nobody notices anyway without …. Testing. Did I mention testing? Doesn’t really matter how many times I mention it actually, no home user will ever test their backup system.
Backups are absolutely necessary. They are also high-maintenance and require user attention. This is why I would basically not recommend anything except an online automated backup system. That’s the only way I can imagine Mrs “What’s RAM?” actually having an extant current backup somewhere…
5. Proactive – I like RAID because it’s proactive. When something goes wrong, it tells you, assuming it’s anything like a decent system. The problem with backup is it’s reactive. The typical user will be reminded of the fact she hasn’t updated her backups in 6 months by nothing but the actual failure of her disk. Software warnings are way too easy to ignore/dismiss, and it’ll become a habit.
6. Affordable – the tapes, fragile mechanical interiors and pure cost in time and worry of a home tape backup system will cost you more than a good RAID would anyway.
OK, this comment is getting too long, let me get to the point. The perfect home backup system, as I see it, is this:
- a RAIDed “big drive”, with some kind of version control/snapshotting system on it – choose your poison, VSC or whatever. I use svn ; ) .. all the important data goes on here. Bonus points for a bootable image for the client sitting on there as well, as well as full client backups. A prominent visible position, warning lights clearly visible and audio alerts on, and a spare drive sitting next to it.
- an automated “lazy realtime” (ie, leisurely asynchronous) online backup for the RAID (and the client backups on it!) If the online service can do versioning and snapshots, all the better.
The key is they need to be running all the time behind the scenes, and make a loud beeping sound if something goes wrong. If the backup company doesn’t receive any data for 24 hours the person’s phone should be ringing. If the RAID drops a drive it should sound like a smoke alarm.
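A minimal Python sketch of the “make noise when something goes wrong” rule described above. The 24-hour threshold comes from the comment; the function name, its parameters and the alert wording are hypothetical, and a real service would need some way to actually learn the last backup time and the drive count.

```python
from datetime import datetime, timedelta

MAX_BACKUP_AGE = timedelta(hours=24)


def collect_alerts(last_backup: datetime, now: datetime,
                   drives_online: int, drives_expected: int,
                   max_age: timedelta = MAX_BACKUP_AGE) -> list[str]:
    """Return the alerts that should be raised right now (empty list means all is well)."""
    alerts = []
    age = now - last_backup
    if age > max_age:
        hours = age.total_seconds() / 3600
        alerts.append(f"No backup data received for {hours:.0f} hours: phone the owner")
    if drives_online < drives_expected:
        missing = drives_expected - drives_online
        alerts.append(f"RAID degraded, {missing} drive(s) offline: sound the alarm")
    return alerts


if __name__ == "__main__":
    now = datetime(2007, 7, 10, 12, 0)
    for message in collect_alerts(now - timedelta(hours=30), now, 1, 2):
        print(message)
```

The point of the design is that silence means success: the user hears nothing unless the backup has gone stale or the array has lost a drive.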
That’s my opinion of pretty much the only way to guarantee a good backup. Sorry for taking so long, thanks for reading : )
I am a storage geek and work for a major storage vendor. I am well versed in RAID. I don’t want it for my home system. Too cumbersome, too costly, too painful to set up and manage. Yes, even I have pulled the wrong drive (once). I don’t need the high availability that RAID offers – even though I’m running a web site. As Robin suggests, I’m looking for the perfect backup system for my 4 home PCs. My choice is NAS with enough capacity for backups and general storage. Most current home NAS systems are easy to install. I haven’t found the perfect one yet, but I’m sure it’s out there or on its way. So I’ll perform my backups to NAS, and for those files that can’t be lost at any cost (archive), I copy those off to DVD and put them in the safe deposit box. Sort of like what many businesses are doing today by doing “backup to disk” and then to tape for offsite storage.
High availability with RAID is a bit too much for home use – for me.
Don’t feel too bad. George is the same guy who thinks RAID 10 and RAID 01 are the same and RAID 10 is worse in IOPS and throughput than RAID 1, but he can’t explain why that’s the case. He then tells you “the whole industry must be wrong”. Does anyone else see the irony here?
George, RAID 10 is not the same thing as RAID 01. Furthermore, RAID 1 may have better performance if you configure it for a particular application: one that can do parallel reads and writes through two separate disk controllers to two separate RAID 1 volumes. However, as a general purpose read/write filesystem on a single controller you can’t beat RAID 10, otherwise the whole storage industry must be wrong.
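One way to see that RAID 10 and RAID 01 differ is to count which two-drive failures each survives. This Python sketch uses the conventional four-drive layouts and assumes a simple controller that takes a whole stripe set offline as soon as one of its members fails; real controllers may behave differently.

```python
from itertools import combinations

GROUPS = [(0, 1), (2, 3)]  # four drives arranged as two groups of two


def raid10_survives(failed: set) -> bool:
    """RAID 10: each group is a mirror pair, data is striped across the pairs.
    The array dies only if both drives of the same mirror pair are gone."""
    return all(not (a in failed and b in failed) for a, b in GROUPS)


def raid01_survives(failed: set) -> bool:
    """RAID 01: each group is a stripe set, and the two stripe sets mirror each other.
    The array lives only while at least one stripe set is fully intact."""
    return any(all(drive not in failed for drive in group) for group in GROUPS)


def surviving_double_failures(survives) -> int:
    """Count the two-drive failure combinations (out of six) the layout survives."""
    return sum(1 for pair in combinations(range(4), 2) if survives(set(pair)))


if __name__ == "__main__":
    print("RAID 10 survives", surviving_double_failures(raid10_survives), "of 6")
    print("RAID 01 survives", surviving_double_failures(raid01_survives), "of 6")
```

Both layouts survive any single failure and give the same capacity, which is probably why they get confused. Under the stated assumption they part ways on the second failure.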
Forgot to add, I’m a storage architect and manager for a large corporation. We have three dozen high end NetApps, EMC clariions and HP EVAs. They all serve different functions and provide various level SLAs and performance.
At home I just use a cheap $15 USB metal case with my old 5400 RPM hard drive. I use an XCOPY .bat script, plus cygwin and ssh to copy over my Linux tar backups, to back up my data. While the solution is not optimal, it’s all automated and has worked without any issues.
Does RAID have a place in your home env? Sure, if you have the knowledge, enthusiasm, need for performance and a particular RTO, why not? But do my folks or friends, or even I, need RAID just to keep spreadsheets and documents, pics and occasional vacation videos? I don’t think so.
Now the Drobo type thing is pretty cool since to a machine it’s just another disk drive, but I don’t see myself or anyone in my family shelling out $400+ for a ‘reliable USB drive’.
I have raid on my home machine. All I needed to buy was a second disk and a $40 software raid controller. A mirror is simple. If either disk works, you don’t have to restore from backup. Ugly failures are pretty rare with a raid mirror. If a software raid controller is built on the motherboard it would probably add $5 to the cost of the machine. That leaves the only real cost as the extra drive. As long as the consumer can switch from two copies of my data to one copy of my data easily I don’t see the problem. Given that most consumer machines have no backup, the possibility of a raid mirror at the cost of a second drive would be a step forward.
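For anyone who wants the same “copy what changed to the USB drive” idea without a .bat file, here is a rough Python equivalent. It is a sketch of the general technique, not a reconstruction of the commenter’s script, and the example paths at the bottom are hypothetical.

```python
import shutil
from pathlib import Path


def backup_tree(source: Path, dest: Path) -> list:
    """Copy files from source to dest, skipping files whose copy is already up to date.
    Returns the list of destination paths that were written."""
    copied = []
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        target = dest / src.relative_to(source)
        # Skip when the existing copy is at least as new as the original.
        if target.exists() and target.stat().st_mtime >= src.stat().st_mtime:
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)  # copy2 also preserves the modification time
        copied.append(target)
    return copied


if __name__ == "__main__":
    # Hypothetical locations: a documents folder and a mounted USB drive.
    source, dest = Path.home() / "Documents", Path("/mnt/usb-backup/Documents")
    if source.is_dir() and dest.parent.is_dir():
        print(f"copied {len(backup_tree(source, dest))} file(s)")
```

Like the original, this never deletes anything from the backup drive, so a file removed from the source stays recoverable until the drive fills up.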
p.s. Most dual failures happen when you buy a drive from a manufacturer that sells drives with higher failure rates. Some of these defects can be fixed with drive firmware updates, but unless that is an automatic process most consumers or even businesses will get the ugly failure.
p.p.s. Raid 6 does not fix the problem of “oops! All my drives have decided to stop reading or writing data”.
Your pro photographer friend is in the majority of pro and amateur digital photographers. 100 years from now most families will have very few photos from this period in time due to ignorance of maintaining data. It’s the dirty little secret the digital camera companies don’t want to talk about.
RE: “most families will have very few photos from this period in time due to ignorance of maintaining data”
I made the following post originally on another blog in response to a CAS post. No one ever replied. Maybe it’s worthless, but I believe the “thousands of pictures” example is very much to the point.
Neither RAID nor backup would have helped without a Strategy and some Manageware skills.
Posted in response to:
“Oh, and one more thing. Robert Pearson’s first comment above ["CAS (Content Addressed Storage) requires organizing the Storage by Content. This requires some, if not a whole lot of, advance knowledge of the Information being stored."] truly invites a reaction, as the exact opposite is true. Real CAS explicitly does **not** require organizing Storage by Content.”
I guess I am a little confused by all of this?
You have my respect as an acknowledged expert in CAS.
I don’t claim to be.
My goal is to build consensus and make CAS better and more viable.
I fully agree with your statement about the Information, once it is in the CAS. My statement was directed to these areas:
1) Determining Stored Information that is a candidate for CAS
2) Getting it in the CAS
3) Once the Information is in the CAS, dealing with the same issues “non-CAS” Storage has, like “hot-spots”, bottlenecks, updates, migrations, replications, and synchronizations.
Is CAS exempt from these “Management” issues?
How long would it take to migrate 30 TB of Seismic Information to CAS?
The same as for “regular” Storage?
I can tell you how long a 30 TB “snapshot” takes to commit and that is not being hashed for Content.
Here is an example:
A real world example is when my Uncle died. His wife preceded him by a few years. My Uncle loved to take pictures. He had thousands of pictures.
He had some made into movies. He loved to show these at the family reunions. One whole room was his for the slide shows and the home theatre. And he took more pictures all the time.
He knew all the people in the pictures and had stories to tell about the person, the place, and the picture.
I helped his children transfer all these pictures to CD. Not every recipient had DVD capability. But we really didn’t know what to do with them. They seemed very valuable to us somehow. Like a valuable piece of history.
They should have been made with a VCR, or transferred to a VCR, so there would be audio but that technology wasn’t available until much later.
Between all of us we could probably identify 10% of the people and places.
We physically archived the pictures, movies and master DVD, made the CDs and disbanded.
That video Information has become valueless.
The technology is there to view them at any time. To what purpose?
Nobody derives any value from viewing them.
We put them on the CAS.
Suddenly they became hugely popular as background for TV shows. They were constantly being accessed. The TV people wanted to edit them and add comments.
Now I’ve got many versions of the originals that have become “fixed” that need to go on the CAS. I am out of CAS.
Does CAS offer versioning software?
Does it work with popular versioning software?
Then I got a chance to buy hundreds of hours of old TV sitcom video. We put it on the CAS figuring it would only be read.
Wrong! Thousands of “stills” have been edited out and produced.
These need to be stored in a related fashion to the CAS originals.
Is the hash on a still from a video the same as the video hash?
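A small Python sketch of content addressing may help with that question. It assumes SHA-256 as the hash and an in-memory dictionary as the store, both chosen for illustration; actual CAS products may use other hash functions and will certainly have more machinery.

```python
import hashlib


def content_address(data: bytes) -> str:
    """The address of an object is simply the hash of its bytes."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Toy content-addressed store: identical bytes are stored once."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        address = content_address(data)
        self._objects.setdefault(address, data)  # a second put of the same bytes is a no-op
        return address

    def get(self, address: str) -> bytes:
        return self._objects[address]

    def __len__(self) -> int:
        return len(self._objects)


if __name__ == "__main__":
    store = ContentStore()
    video = b"frame-1|frame-2|frame-3"  # stand-in bytes for a video file
    still = video[:7]                   # stand-in for a still cut from it
    print("video:", store.put(video)[:16])
    print("still:", store.put(still)[:16])
    print("objects stored:", len(store))
```

In this model a still has different bytes from the video it came from, so it gets an unrelated address. Anything tying the two together has to live in separate metadata, which is the management burden the comment is describing.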
There are all kinds of Information related to the original Content that is stored on the CAS in a Content related fashion.
I keep hearing database. It was in a database.
Plus they want to insert “human realistic” pellet people in place of the original actors to avoid paying any royalties. So now I have the original, modified originals and “human realistic” copies. They are all the same content but play at different levels of value. Maybe these are not good candidates for CAS?
What about the level of Information High Availability? Information Integrity? Disaster Recovery? Business Continuance?
Worst of all, Findability?
Each Unit of Information stored on the CAS is subject to all these demands.
“Vendor A” said CAS was the wrong solution. What I needed was “5 nines (99999)” active Storage. And lots of it. He bought me lunch.
IMHO, RAID on servers is to protect the running OS, not the data on the server. I rarely ever keep data on the os drive if I can help it. While I do sometimes provide a raid 0 copy of the data drive, it’s always just for convenience. Data backup belongs on another machine, ideally at a different location. In business, we can do that many ways. Cross backup between different sites. Contracts to store tapes, optical media or drives in a secure location. But we never confuse backup with RAID. Why it has been some jumbled in this discussion is from the amateur point of view of the originator of the article. Just because you know what a piece of technology is and does, does not mean you have any comprehension what is actually used for in the real world. Your real world not withstanding, is not the only one.
RAID could quietly reside inside a consumer appliance or home server and never be a problem. The user interface could direct the consumer to replace the BLUE drive and not the RED drive in such a device. Or better yet, let the meathead down at Best Buy take the heat for not knowing the difference between the blue and red drives when the light on the front of the user’s “Walmart Storage Server” starts flashing.
And as for all you guys who are so happy paying $20 or $50 a month for your web-based offsite backup via Carbonite, etc.: did the dot-com bust not teach you anything? Online companies can be Fortune 500 today and penny stocks tomorrow. And all that data you feel is so safe stored on their servers will go the way of any reformatted drive when Google buys them out and decides to use the company’s assets for more YouTube movie bites… If you don’t have a copy of your data in your control, you don’t have your data. The cheapest way for most consumers is to burn data to quality DVDs and put them somewhere safe outside the home. That might be a safety deposit box if you can afford it, or a lock box at your parents’ house. You might be able to safely store your backup at your place of employment (clarify that with your employer so you don’t end up losing it when you change jobs).
Currently, I’m working with the beta of Microsoft’s Home Server. Its utility for backing up all the home PCs to the server makes it really easy to be sure that all the pictures, videos, and other data your family has taken the time to put on their PCs get copied to a central location. The next obvious step is a backup utility for that server. Since the SDK is based on Windows Small Business Server 2003, there are already several solutions for making a safe backup of the server that we techies can use. When the product comes to market, I expect one of the backup companies to have a fairly mature direct-to-DVD backup product on offer. WHS already uses a RAID-like disk utility to make adding space easy for the home server administrator.
Not to go too far down Microsoft’s road, but I would imagine there are other vendors with similar products waiting for Microsoft to bloody the waters with the release of its usual unfinished product. They will then jump in with something Linux-based that’s cheaper and has things like server backup built in.
The current selection of consumer NAS boxes is not worth the time to set up, IMHO. They provide some useful services, but fail to provide true “server” abilities.
I’ve worked with a large number of home and small business users and, to quote a friend, “Selling backup to SOHO (Small Office, Home Office) is like selling life insurance to teenagers.” The amount of time and money SOHO users are willing to spend on backup is almost zero until they lose data. Of course, by that time it’s not only too late, but also the IT guy’s fault for not taking the zero dollars and putting in place an enterprise-level backup system. Backup is key, but it has to be automated and as transparent as possible or SOHO users just won’t do it.
For RAID to work in home environments, you’ll need to delete the word RAID: too much information. There are some pro home users active in hobbies or trades such as digital video, where programs such as Avid Xpress Pro, Apple Final Cut Pro, Autodesk Maya and other digital media applications demand high-performance hardware such as a high-throughput video server powered by a high-end video RAID controller (e.g., ATTO). The average home user, though, will typically not venture into such technology. I have heard of vendors working on emerging consumer RAID solutions where there will be no mention of the word RAID, with devices so easy that grandma can use them. Once these begin hitting the streets, at a price digestible by the public, we may see greater consumer interest. Until that time, it’s prosumers and up.
Gentoo Linux has a wonderful HOWTO Backup Guide, which discusses full, differential, and incremental backups. You can set it up to automate all three on a regular schedule, at a time that is convenient for you. E.g., full backups every month, differential backups every week, incremental backups every day. Of course, the minute I said Gentoo, you knew that isn’t for the typical user. But automated full, differential, and incremental backups sound good for the regular user. Have it run at 3 AM or something. I think the people at Apple really know what they’re doing. That Time Machine thing is exactly what consumers want: something that’s transparent. Hopefully, it can back up to 2nd or external drives.
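To make the monthly/weekly/daily rotation above concrete, here is a minimal sketch in Python. It is not taken from the Gentoo guide; the function names and the particular policy (full on the 1st of the month, differential on Sundays, incremental otherwise) are assumptions chosen purely for illustration. The second function shows the one real difference between the two partial backup types: a differential copies everything changed since the last full backup, while an incremental copies only what changed since the last backup of any kind.

```python
from datetime import date


def backup_kind(day: date) -> str:
    """Pick the backup type for a calendar day.

    Assumed policy (illustrative only): full on the 1st of the month,
    differential on Sundays, incremental on every other day.
    """
    if day.day == 1:
        return "full"
    if day.weekday() == 6:  # Monday is 0, Sunday is 6
        return "differential"
    return "incremental"


def files_to_copy(kind, mtimes, last_full, last_any):
    """Return the sorted paths that a backup of the given kind would copy.

    mtimes maps path -> last-modified timestamp. last_full is the time of
    the last full backup; last_any is the time of the last backup of any kind.
    """
    if kind == "full":
        return sorted(mtimes)
    # Differential compares against the last full backup,
    # incremental against the most recent backup of any kind.
    cutoff = last_full if kind == "differential" else last_any
    return sorted(path for path, mtime in mtimes.items() if mtime > cutoff)


if __name__ == "__main__":
    print(backup_kind(date(2007, 7, 9)))
    print(files_to_copy("differential", {"a": 5, "b": 15, "c": 25}, 10, 20))
```

In practice a scheduler (cron, or whatever the OS provides) would call something like `backup_kind` once a night and hand the result to the actual backup tool.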
I’m a “prosumer” user, so I’m a little out of step with normal users, but I understand them. I had HD reliability issues on the main HD in my laptop. So what I did was buy a new 2nd HD, put it in the optical bay, and move all my data there. My main HD filled up, and became less and less reliable (running WinXP BS, as I needed it for business school). Eventually, WinXP wouldn’t boot. Great! Got a 120GB replacement drive, and am now operating from the Ubuntu Linux CD. Haven’t bothered installing yet (actually, I did, but ran into a stupid error 97% of the way through about the GRUB install to the MBR failing, then it borked out; haven’t had time to mess around with it yet). BTW, Ubuntu running from a CD boots up faster and loads programs faster than Winblows XP did for me for the last couple of years (from the HD). That’s really sad. And worse yet, apparently Win Vista is a downgrade from XP in terms of performance.
In the meanwhile, my data is pretty safe on that data drive, which is now wrapped in the static-proof bag that my new HD came in. And I have a lot of my photos on DVDs as well.
A few other comments…in RAID, drive failures tend to be correlated due to similar use-patterns. This greatly diminishes the value of RAID for home-users. It just doesn’t seem worth it. Because RAID isn’t a backup. So then, to really be protected, you need 3 hard-drives: 1 original, 1 for RAID, and 1 for backup. You’ll have a tough enough time convincing people to get 1 extra HD, let alone 2.
Ideally, backups to Gold Archival DVDs should be done whenever you know something is fixed (e.g., not a work in progress) and you have enough to fill up a DVD. But, although I’m willing to spend $80 for a 100-pack of gold archival DVDs, the average consumer will wonder why they should spend that kind of $$$ when a 100-pack of normal DVDs is like $20 or whatever.
There are other considerations when backing up, as well. A few ideas of mine:
* OS and apps should never be on the same HD as data. At the very least, they should be on a different partition. If on a different HD, the OS and apps would ideally be on a faster, smaller HD…maybe even a small SAS drive instead of SATA for better performance. Also, if you want to make use of solid state drives (like mtron), you can get small drives for your virtual memory, for your temporary files, and even for your /var files (if you’re a Linux person).
* Your backup HD should be separated from your main HDs. Being inside the computer case means more heat, which is bad for HDs. Also, if possible, it should be powered down when not in use by the automatic backup system (e.g., have software do this). That way, it has even less wear and tear.
* Use an ultra-high-performance HD for your OS and apps, like a small Raptor or Cheetah.
* If possible, use an even better HD, like the mtron SSDs, for your swap file, temp directory, /var directory, and anything else that gets a lot of reads/writes all the time.
* Use an affordable high-performance, high-capacity drive for your data, like the Hitachi Deskstar 7K1000 or Samsung Spinpoint F1. You can get these affordably in the 500, 750, or 1000GB sizes (by affordably, I mean more than 3.8 GB/$).
* Use an enterprise-class rated HD for your data backups. Something that’s rated for continuous 24×7 usage (even though you’ll have it powered down when not in use). E.g., the Hitachi Ultrastar, or Seagate Barracuda ES.2 SAS drives. These also come with 5-year warranties. Use them in an external enclosure. Or use an eSATA HD.
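The “more than 3.8 GB/$” rule of thumb in the list above is easy to turn into a price ceiling per drive size. The sketch below is only arithmetic on that one figure; the function names are made up for illustration, and the prices in the usage lines are hypothetical, not quotes for any real drive.

```python
def gb_per_dollar(capacity_gb: float, price_usd: float) -> float:
    """Capacity bought per dollar spent."""
    return capacity_gb / price_usd


def is_affordable(capacity_gb: float, price_usd: float,
                  threshold: float = 3.8) -> bool:
    """True if the drive beats the commenter's 3.8 GB/$ rule of thumb."""
    return gb_per_dollar(capacity_gb, price_usd) > threshold


def price_ceiling(capacity_gb: float, threshold: float = 3.8) -> float:
    """Price at which a drive of this size sits exactly on the threshold."""
    return capacity_gb / threshold


if __name__ == "__main__":
    for size in (500, 750, 1000):
        print(f"{size} GB: affordable below ${price_ceiling(size):.2f}")
    # Hypothetical prices, for illustration only.
    print(is_affordable(1000, 250.0))
    print(is_affordable(500, 150.0))
```

By this measure a 1000GB drive counts as affordable at anything under roughly $263.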
Ideally, all this stuff I’ve talked about should be done automatically, although that seems near-impossible. But Apple is always great with this stuff, making it easy for consumers. And they could even make all that hardware selection easy by selling premium reliability/performance grade Macs that come with multiple HDs, and the system set up to use them appropriately for everyday use vs. backup.
PS: I find the arrogance displayed by some here flabbergasting. People are chastising the photographer woman for not knowing the difference between RAM and hard-drive? Ok, sure, she’s just a few notches above great-grandpa in terms of computer knowledge. But I bet a lot of the geeks harping about how imperative it is that people understand their computers, don’t know how to change a tire on a car, or deal with routine house-maintenance, or know a god-damned thing about their car for that matter. Also, I bet that that photographer — if she’s a professional — knows a lot more about Photoshop than anyone on this forum. Maybe if she saw people here doing some silly thing in Photoshop, she’d comment on how ignorant we were!
You don’t need to convince anybody of the usefulness or the necessity of RAID; they come crying to you once their data is gone. Regardless of complexity, when the choice is RAID and backup (both) vs. losing all your beloved digital pictures, music, games, contacts, mail, etc., the answer is a resounding YES, I don’t care about the extra $100-$200 for the extra HDD. Don’t try to convince anybody; let them crash once, and they learn forever.
Right now I have a 7-drive RAID 5 at home (3.75TB, 1 hot spare). I’m a computer guru so the normal pitfalls are not issues for me.
But I can find no other method for keeping tabs on 3TB of data that is nearly as cost-effective or poses fewer headaches. To date I have lost data, but only from optical media failure. Short of a DLT drive I can’t think of any viable option.
Sure, online backup is great if you have a few hundred gigs of data at most. Once you’re in the TB range it’s completely pointless for both speed of access and cost.
My only future idea is to go from RAID 5 to multiple mirrored drives to spread the URE possibility much thinner.
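A rough sketch of why mirrors spread the URE (unrecoverable read error) risk thinner than one big RAID 5. Everything numeric here is an assumption for illustration: 750GB drives (one reading of “7 drives, 1 hot spare, 3.75TB”: six array members, five drives’ worth of usable space), a URE rate of one per 10^14 bits read (a figure commonly quoted on consumer drive spec sheets), and errors treated as independent per bit. The point is the comparison, not the exact percentages.

```python
import math


def p_ure_during_read(bytes_read: int, ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one unrecoverable read error while reading
    bytes_read bytes, treating every bit as an independent trial."""
    bits = bytes_read * 8
    # 1 - (1 - p)**bits, computed in a numerically stable way for tiny p.
    return -math.expm1(bits * math.log1p(-ure_per_bit))


# ASSUMPTION: 750 GB drives (decimal gigabytes), not stated by the commenter.
DRIVE_BYTES = 750 * 10**9

# RAID 5 rebuild after one failure: every surviving member (5 of the
# 6 array drives) must be read in full to reconstruct the lost drive.
raid5_rebuild = p_ure_during_read(5 * DRIVE_BYTES)

# Mirror rebuild: only the single surviving copy has to be read.
mirror_rebuild = p_ure_during_read(1 * DRIVE_BYTES)

if __name__ == "__main__":
    print(f"RAID 5 rebuild: {raid5_rebuild:.1%} chance of hitting a URE")
    print(f"Mirror rebuild: {mirror_rebuild:.1%} chance of hitting a URE")
```

Under these assumptions the RAID 5 rebuild has to read five times as much data as a mirror rebuild, so its chance of tripping over a URE is several times higher. Real drives often do better than their spec-sheet rate, so treat the numbers as an upper-end illustration.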
I don’t normally respond to many blogs but this one is hitting home to me today. I have a “significant” home network due to my wife and I both having been developers, web and otherwise, for over 10 years. We have eight machines on a 100MBit network with two laptops that are 90% wireless. A little over a year ago we decided to get some NAS storage. I went with a Maxtor solution featuring three 500GB shared storage devices. At the time they were highly rated by users and tech evaluators alike, the cost was reasonable and they installed and were managed easily enough.
It is 14 months later now. One died this weekend (the disk does not seem to be spinning at all), another has a noisy fan (luckily it’s in the “server room” in the basement) and the one on my desk has intermittently been making some really strange noises of late. To say I’m somewhat disappointed in a Maxtor product that was highly rated last year is an understatement.
I was just looking into buying a RAID box for here at home, or stripping down one of the lesser-used machines and building a RAID out of that, when I came across this blog. While both my wife and I are tech-savvy enough to manage and secure our network, neither of us is a network admin/specialist. So I was looking for one of those “dumbed-down” RAIDs. I know that I have to have twice the space and am willing to accept that, as it is part of the price you pay to get RAID.
We had, as of last week, close to 3TB of disk storage on the network and, as is almost always the case, the disk that died this weekend is the “critical” one for right now. I’m headed off to a data recovery specialist later this morning to see what the prognosis for getting that data back is.
I’m not a huge fan of the “online” backup solution, as it puts my customers’ data on servers that I have no control over from a security or reliability point of view, and upload speeds on a cable modem are just not that great with my provider.
Over the next week or two I’m going to be doing some research (until reading some of these posts, and the ones on ZDnet, I was just going to buy/build a RAID box) to see what would work best for us. I would be interested to hear what you and your readers think.
Firstly, may I say this debate is a point I have pondered for several years. I work for a major IT vendor supporting many large blue chip clients and have a strong Solaris background (note that I don’t use the term GURU here). I also have good familiarity with many other operating systems.
I have witnessed many catastrophic failures which have been recoverable only because of the elaborate recovery strategy in place. The clients’ production systems tend to feature RAID 1 for the OS/Management Framework storage as a matter of course, and the data/third-party apps are either run from NAS or SAN, with a complete backup solution for both areas. Obviously this is a very expensive solution and would not be expected of a home user/small business.
One of the key points in the argument against RAID here was made by David J. Heinrich in regard to “drive failures tend to be correlated due to similar use-patterns”. For the uninitiated, this basically means that the likelihood of your RAID drives failing simultaneously is increased due to the nature of how RAID works … an ideal RAID requires all of your RAID drives to be of the same model and firmware revision to be most effective. Let’s go back to the tyre analogy provided by George Ou: yes, you may have 4 tyres and a spare, but the spare is also being used at the same time, the tread depth wears out at the same rate, and you are then running on 5 bald tyres rather than just 1. All mechanical drives share this same weakness, and coupled with the precision of mass manufacture, the likelihood of multiple drive failure is compounded. In essence the greatest failing of RAID is its synchronicity. This statement is also true of mirroring.
Another, less noted, point here is the problems with RAID 5. RAID 5 and its derivatives have the beauty of redundancy and speed built in; however, as anyone who has dealt with the loss of a disk in a RAID 5 configuration can vouch, the performance stinks when a drive fails and you have all the issues of the resync to boot. Lose a second disk during this vulnerable phase and bang …. all your data is lost.
RAID 5 and striping (striping cannot really be considered true RAID, as there is no redundancy): Plus point – very fast access; with more disks, the access speed is just short of the number of drives multiplied by the access speed of one drive. Minus point – increasing the size of storage requires you to destroy what you have and rebuild from scratch. Also, the need for this kind of performance shouldn’t really apply to the majority of SOHO users.
Taking all of the above into account, I would look at the alternative of snapshots to alternative local storage to maintain data integrity. This would be similar to RAID 1, where you use a second disk for redundancy; however, only one drive would be actively providing access to the filesystem/disk. The other drive could then be brought online in the event of a major failure on the first, and service would be restored with minimal data loss and a fast restore time. To further improve performance, you could also have a matching partition layout on both drives (for simplicity we’ll just have 2). On the first disk, the first partition could hold one set of data, with an equal amount of space on the second disk’s first partition being used to back this up. Meanwhile, on the second disk, the second partition could be used to hold a second set of active data, while the corresponding partition on the first disk could be used to hold the backup for this. The upshot is that you can have two sets of data being accessed at a similar rate, as opposed to contending for the same drive. Hence the user sees a performance increase as well as having redundancy, meaning that they get some additional bang for their buck.
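The crossed layout described above can be written down as a small table, which makes it easier to check that every data set still has a home after either disk dies. This is only a sketch of the idea in Python; the disk names, the data set names and the `serving_after_failure` helper are all hypothetical, and nothing here touches real partitions.

```python
# (disk, partition) -> (role, data set). Each disk holds the active copy of
# one data set and the snapshot backup of the other.
LAYOUT = {
    ("disk1", 1): ("active", "set_a"),
    ("disk2", 1): ("backup", "set_a"),
    ("disk2", 2): ("active", "set_b"),
    ("disk1", 2): ("backup", "set_b"),
}


def serving_after_failure(layout, failed_disk=None):
    """Return {data_set: (disk, partition)} showing where each data set is
    served from once failed_disk (if any) is out of the picture."""
    serving = {}
    for (disk, part), (role, data_set) in layout.items():
        if disk == failed_disk:
            continue
        # Prefer the active copy; fall back to the backup only if the
        # active copy lived on the failed disk.
        if role == "active" or data_set not in serving:
            serving[data_set] = (disk, part)
    return serving


if __name__ == "__main__":
    print(serving_after_failure(LAYOUT))           # normal operation
    print(serving_after_failure(LAYOUT, "disk1"))  # disk1 has died
```

In normal operation the two active data sets sit on different spindles, which is where the performance gain comes from. After a failure, both sets are served from the surviving disk, one of them from its last snapshot, so anything written since that snapshot is what you lose.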