Backup Paranoia and Letting Go
Personal backup looms large this morning. I spent some time reading up on it in light of my own concerns. Here is the official StorageMojo.com cheat sheet on personal and SOHO backup.
How Much Data Do You Produce?
This determines the scale of your backup solution. Devoted videographer? You could easily produce hundreds of GB of raw digital video a year. A writer? A few MB a year will cover your needs. Archive of family photos? A few GB. The issue is scale. I don’t know about you, but for me more than three or four pieces of media is more trouble than it’s worth.
There Are No Good Solutions
Only less bad. What can you live with? Are you the kind of person who will meticulously go through aging media and rewrite it onto new media? Each year? Are you sure? CD = 700 MB. DVD = 4+ GB. Hard drive up to 750 GB.
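To make the scale question concrete, here is a rough sketch that counts how many pieces of each medium a given amount of data would need. The capacities come from the figures above, except that the DVD value assumes a single-layer 4.7 GB disc (the text only says "4+ GB"), and the three sample data sizes are illustrative guesses, not measurements.

```python
import math

# Nominal capacities in GB. CD and hard drive figures are from the text;
# the DVD figure is an assumption (single-layer disc, 4.7 GB).
CAPACITY_GB = {"CD": 0.7, "DVD": 4.7, "hard drive": 750.0}


def pieces_needed(data_gb: float) -> dict[str, int]:
    """Return how many pieces of each medium are needed to hold data_gb."""
    return {name: math.ceil(data_gb / cap) for name, cap in CAPACITY_GB.items()}


if __name__ == "__main__":
    # Hypothetical yearly volumes for the three kinds of user mentioned above.
    for label, size_gb in [("writer", 0.01), ("family photos", 5), ("videographer", 200)]:
        print(label, pieces_needed(size_gb))
```

Anything that comes out above three or four pieces is, by my own rule of thumb, probably the wrong medium for the job.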
Safety Through Replication
Media goes bad. Interfaces change. File formats lose support. Read errors. Head crashes. Wildfires, tornados, earthquakes, lightning strikes, floods, volcanos, hurricanes, blizzards, tsunami. One copy is the minimum. Two copies are better.
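The arithmetic behind replication can be sketched in a few lines. This assumes each copy fails independently with the same probability, which is a simplification: the 5% figure below is a made-up number for illustration, and copies stored in the same house are not independent when the house floods or burns.

```python
def chance_of_losing_everything(p_single: float, copies: int) -> float:
    """Probability that every copy fails, assuming independent failures.

    p_single is the (hypothetical) chance that any one copy is lost
    over some period; copies is the number of copies kept.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_single ** copies


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "copies:", chance_of_losing_everything(0.05, n))
```

The independence assumption is the reason to keep at least one copy somewhere else: a second copy on the same shelf does little against the disasters listed above.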
Proprietary File Formats = Mark Of The Beast
If only one program can read it, you are SOL if that program goes away. Therefore:
- Text: use .txt, .rtf, or pdf. That’s right, save your Word docs in these formats. You can also use Word to pull text out of obsolete document files from no longer available word processors.
- Pictures: use .jpg, .tiff, or .pdf. Sure, .psd is handier, but will you *always* have a copy of Photoshop when you need it?
- Music: high-bit-rate (256 kbit/s or above) .mp3 sounds good and will play anywhere. Burn your protected iTunes downloads to CD and then use Tunatic to recover titles, artists and the like.
- Video: edit it down and burn it to a DVD. Keep a copy of the edited version on a backup hard drive as well. True, you won’t be saving every precious second, but DV files are huge, so just the time it takes to read the data will be daunting in a year or two. Re-burn the DVD every year or two.
- DO NOT USE commercial backup software. Sure, they may make it easy, but 1) if the backups aren’t working you won’t usually know until it is too late and 2) they use proprietary formats which = mark of the beast.
- DO USE standard backup formats like tar or just back up files.
- Encryption is another file format. If you must encrypt, use a widely available package like PGP or .zip. But think twice. Maybe a good lock on the door would be better.
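As a minimal sketch of the "use tar and check that it worked" advice in the list above, the following uses Python’s standard `tarfile` module to write a plain, uncompressed tar archive and then compares a SHA-256 hash of every archived file against the original. The function names are my own, and the sketch reads whole files into memory and ignores symlinks, so treat it as an illustration rather than a finished tool.

```python
import hashlib
import tarfile
from pathlib import Path


def make_tar_backup(source_dir: str, archive_path: str) -> None:
    """Write source_dir into a plain (uncompressed) tar archive."""
    root = Path(source_dir).resolve()
    with tarfile.open(archive_path, "w") as tar:
        # Store paths relative to the directory name, e.g. "docs/a.txt".
        tar.add(str(root), arcname=root.name)


def verify_tar_backup(source_dir: str, archive_path: str) -> bool:
    """Return True if every regular file matches between source and archive."""
    root = Path(source_dir).resolve()
    with tarfile.open(archive_path, "r") as tar:
        members = [m for m in tar.getmembers() if m.isfile()]
        for member in members:
            original = root.parent / member.name
            archived = tar.extractfile(member)
            if archived is None or not original.is_file():
                return False
            # Whole-file reads keep the sketch short; stream for large files.
            if hashlib.sha256(archived.read()).digest() != hashlib.sha256(
                original.read_bytes()
            ).digest():
                return False
    # Also catch files that exist on disk but never made it into the archive.
    source_files = [p for p in root.rglob("*") if p.is_file() and not p.is_symlink()]
    return len(members) == len(source_files)
```

A typical use would be to call `make_tar_backup("Documents", "documents.tar")` and then `verify_tar_backup("Documents", "documents.tar")` right away, before the backup disk goes on the shelf. Both paths here are placeholders.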
What I Do
I live on my computer. I’ve hosed drives before, so I understand what can happen. I use a hard drive as my primary backup medium. It’s fast, easy and reliable. I’m on a Mac so I’ve created an external, bootable disk that has all my apps and files. If my laptop dies or walks out, I can get a new one and have all my data in minutes.
I back up monthly. I leave email on my servers for 30 days so email doesn’t have to be backed up any more frequently. This blog is hosted externally at DreamHost (use coupon stomojo to save $15 off any of their plans), who backs up the data as well. I back up the entire site locally every week too.
Then Let It Go
Do what you can, and then stop worrying. If all your data went away it would be a massive pain, but you’d still be alive. You can re-create most of it. And as for what you can’t re-create, how much of it do you really need? Personal data is just stuff. You’ll miss it for a while, but in a few years you won’t even remember what all the fuss was about.
At least, that’s what I tell myself.
Update On Drive Life
I read somewhere that leaving hard drives on a shelf doesn’t work because the lubricant inside dries out and the drive won’t spin up. So I asked the folks who should know at Seagate if that is true. They assure me it isn’t.
For personal and SOHO backup, disk drives are almost as cheap as tape cartridges, offer high speed random access, have lower error rates, and don’t rely upon flaky backup software to work.
Nice post, Robin. One question about proprietary formats: Isn’t PDF itself a proprietary format, controlled by Adobe? Is there a risk in saving something as PDF document, if Adobe is bought out, goes under, or simply stops supporting it?
PDF is Adobe proprietary. And the reader for it is freely licensed so there are many applications, besides Adobe Reader, that can open those files. You may not be able to write or edit it, but you will be able to access the data. Since the reader technology is widely distributed I don’t think there is a risk if Adobe founders.
Excellent, excellent, post. Especially for a HO-(power)-user, with too little technical knowledge.
Can I add one more question to the list? “How much bandwidth are you willing to use, or can you use?” Some online/offsite backup requires substantial upload amounts.
(Testing this out right now and coming to the conclusion that my ISP’s 2GB upload limit really isn’t that big of a deal)
Keep up the good work,
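The bandwidth question is easy to put rough numbers on. The sketch below assumes the 2GB upload limit is a monthly cap (the comment does not say what period it covers), and the 5 GB data size and 1 Mbit/s uplink speed are made-up figures for illustration.

```python
import math


def periods_to_upload(data_gb: float, cap_gb_per_period: float) -> int:
    """Billing periods needed to push data_gb through an upload cap."""
    return math.ceil(data_gb / cap_gb_per_period)


def hours_to_upload(data_gb: float, uplink_mbit_s: float) -> float:
    """Hours of continuous upload, using decimal GB and ignoring overhead."""
    seconds = data_gb * 8 * 1000 / uplink_mbit_s
    return seconds / 3600


if __name__ == "__main__":
    print("periods for first full backup:", periods_to_upload(5, 2))
    print("hours on the wire:", round(hours_to_upload(5, 1.0), 1))
```

Only the first full upload is this expensive. If later runs send just the changed files, a small cap may well be livable, which matches the conclusion above.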
Consider using duplicity, buying 2x as much disk as you need, and creating a peering backup with a friend. It’s all encrypted, and stored as a tar archive.
Also, if you’re going to use encryption, consider GPG in symmetric mode (-c, or --symmetric) and using the AES256 algorithm. For heaven’s sake, do not use public-key crypto; when you need to retrieve the data, your key will have expired, or you won’t be able to remember the passphrase to unlock the secret key, and besides, PK is probably the weakest part of a hybrid system like PGP. The hard part will be coming up with a good long passphrase with 256 bits of entropy, since that’s 128-256 characters of English prose. And test your backups. Don’t file it away without testing it. I recall a story about a backup operator who took the tapes home in his car, which had heated seats; the coils generated enough henries to corrupt the tapes beyond recognition.
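The "128-256 characters" figure follows from assuming English prose carries roughly 1 to 2 bits of entropy per character. A quick sketch of that arithmetic follows. The per-character estimates for prose are that assumption, and the random-lowercase row is added only for comparison.

```python
import math


def chars_needed(target_bits: int, bits_per_char: float) -> int:
    """Characters required to reach target_bits at a given entropy rate."""
    return math.ceil(target_bits / bits_per_char)


if __name__ == "__main__":
    # Assumed entropy rates; prose estimates vary by source and by writer.
    rates = {
        "English prose, low estimate": 1.0,
        "English prose, high estimate": 2.0,
        "uniformly random lowercase letters": math.log2(26),
    }
    for label, rate in rates.items():
        print(f"{label}: {chars_needed(256, rate)} characters")
```

The comparison shows why randomly generated passphrases can be much shorter than memorable prose for the same strength, at the cost of being harder to remember years later.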
Also, you might want to use crypto that doesn’t propagate errors, so if part of the backup becomes unreadable due to errors, you can recover the rest. PGP/GPG operates in a variant of CFB mode and by default compresses before encrypting, which in practice means no random access and one error can ruin the rest of the tape (IIRC). Consider CTR mode. Also check the SISWG for new storage encryption standards.
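To show why counter (CTR) mode limits the damage from a bad sector, here is a toy sketch. It is not AES and not suitable for real backups: it builds the keystream from HMAC-SHA256 over a nonce and a block counter, using only the Python standard library, and it has no integrity protection. The point is only the structure: each block’s keystream depends on its position, not on neighboring ciphertext.

```python
import hashlib
import hmac

BLOCK = 32  # bytes per keystream block (the SHA-256 output size)


def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    """Toy PRF: HMAC-SHA256 over nonce plus an 8-byte block counter."""
    return hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()


def ctr_xor(key: bytes, nonce: bytes, data: bytes, start_block: int = 0) -> bytes:
    """Encrypt or decrypt (same operation) beginning at block start_block."""
    out = bytearray()
    for offset in range(0, len(data), BLOCK):
        ks = keystream_block(key, nonce, start_block + offset // BLOCK)
        chunk = data[offset:offset + BLOCK]
        out.extend(b ^ k for b, k in zip(chunk, ks))
    return bytes(out)


if __name__ == "__main__":
    key, nonce = b"k" * 32, b"n" * 8  # placeholder values, never reuse a nonce
    plaintext = bytes(range(256)) * 2
    ciphertext = bytearray(ctr_xor(key, nonce, plaintext))
    ciphertext[100] ^= 0xFF  # simulate one corrupted byte on the medium
    recovered = ctr_xor(key, nonce, bytes(ciphertext))
    print("damaged bytes:", sum(a != b for a, b in zip(recovered, plaintext)))
    # Random access: decrypt only block 3 without touching the rest.
    print(ctr_xor(key, nonce, bytes(ciphertext[96:128]), start_block=3)[:4])
```

One corrupted ciphertext byte damages exactly one plaintext byte, and any block can be decrypted on its own. For actual use, a vetted library and an authenticated mode would be the safer choice.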
Sorry, PDF is not proprietary. Adobe published the specs for everyone to use. You can find open source products that can open and print PDF files. Apple’s Mac OS X can open and print PDF files without a hint of Adobe software (no, they didn’t license it from Adobe). A subset of PDF, PDF/A (for Archive) was recently approved as an archival standard for use by courts and others.
I’m easy. The only thing I don’t get is why, if PDF is as open as you say, Adobe threatened Microsoft with a lawsuit and Microsoft backed down. Doesn’t seem all that non-proprietary to me.