Key Limits Of Apple’s Time Machine

by Robin Harris on Wednesday, 9 August, 2006

So I’m trying to make some lemonade to go with my crow steak after mistakenly tying Time Machine to Sun’s very cool ZFS. Given that Apple has incorporated Sun’s DTrace into the Darwin kernel, I still have hope they will do the same with ZFS. In my optimistic view, they just haven’t yet. As one of the clearly in-the-know commenters stated:

ZFS is not in the Leopard discussed at WWDC in any capacity.

Which is a lot different than saying it will never be in Leopard.

Yet what some of the clearly better-informed-than-me readers say is valuable in and of itself.

What Is The Role Of Journaling?
One reader stated that Time Machine is enabled by journaling. Journaling has been available, in some way, shape, or form, since Mac OS X Server 10.2.2. Yet the modest claims Apple makes for HFS+ journaling don’t square with the reader’s assertion:

Journaling accelerates the recovery time after an unexpected shutdown, significantly improving the availability of server and storage systems. When journaling is turned on on a storage volume, the server automatically tracks file system operations and maintains a continuous record of these transactions in a separate file, called a journal. The operating system can use the journal to return the file system to a known, consistent state after a failure.

In essence then, journaling speeds up the process of consistency checking. There is nothing in journaling that squashes the standard processes of bit rot. There is nothing in journaling per se that enables Time Machine. It isn’t clear that even joining the record of operations with a backup application is all that important. So I don’t get where journaling fits in.
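To make the point concrete, here is a minimal sketch of generic write-ahead journal recovery. It is not the HFS+ on-disk format; the record layout, the `recover` function, and the block names are all assumptions made for illustration. It shows why a journal restores consistency without preserving history: committed transactions are replayed, incomplete ones are dropped, and nothing older survives.

```python
def recover(disk, journal):
    """Replay a toy journal onto a toy disk after a crash.

    disk    -- dict mapping block name to contents (hypothetical model)
    journal -- ordered records: ("write", txid, block, value) or ("commit", txid)
    Only transactions that reached their commit record are applied.
    """
    pending = {}
    for record in journal:
        kind, txid = record[0], record[1]
        if kind == "write":
            _, _, block, value = record
            pending.setdefault(txid, []).append((block, value))
        elif kind == "commit":
            # The transaction is complete, so its writes are safe to apply.
            for block, value in pending.pop(txid, []):
                disk[block] = value
    # Anything still in `pending` never committed and is simply discarded.
    return disk
```

In this model the result is a consistent file system, not an older version of it, which is a different job from backup.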

It’s A Backup Application With A Pretty Front End
More likely is this reader’s comment:

It works just like Backup 3. It creates a sparse image and then copies the files onto it in stages. It’s not ZFS or using any special filesystem tricks; it’s just copying files incrementally on a schedule.

This seems to get to the heart of the matter. Time Machine is a pretty front end to a backup application. The presentation focused on finding lost files, and said almost nothing about full restores, but they are implied.
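If the reader is right, the engine could be as simple as the sketch below. This is an assumption about the general technique (copy whatever is new or changed since the last run), not Apple’s implementation; the `incremental_backup` name and the size-plus-modification-time test are mine.

```python
import os
import shutil


def incremental_backup(src, dst):
    """Copy files from src to dst that are new or changed since the last run.

    Hypothetical helper: a file is treated as unchanged when the backup copy
    has the same size and a modification time at least as recent.
    Returns the sorted relative paths that were copied.
    """
    copied = []
    for root, _dirs, files in os.walk(src):
        for name in files:
            source = os.path.join(root, name)
            rel = os.path.relpath(source, src)
            target = os.path.join(dst, rel)
            s_stat = os.stat(source)
            if os.path.exists(target):
                t_stat = os.stat(target)
                if (t_stat.st_size == s_stat.st_size
                        and t_stat.st_mtime >= s_stat.st_mtime):
                    continue  # unchanged since the previous backup
            os.makedirs(os.path.dirname(target), exist_ok=True)
            shutil.copy2(source, target)  # copy2 also preserves timestamps
            copied.append(rel)
    return sorted(copied)
```

Run on a schedule, a scheme like this only ever sees the state of the disk at the moment it runs, so a file created and deleted between two runs is never captured.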

The Limits Of Time Machine
Until I get my hands on it I won’t know for sure, but based on what readers have said I would guess these assertions are likely to be true:

  • You can’t boot off a Time Machine backup
  • If you create and delete a file before the nightly backup Time Machine can’t recover it
  • If your system disk fails you must replace it AND perform the restore – which would take hours
  • Requiring a second drive to find lost files is overkill for laptop users

Verdict
I don’t think Time Machine will cause me to change my backup strategy. The fundamental problem of disks going away is a greater concern to me than any one file. That is why I use Carbon Copy Cloner to create bootable backups, because if my laptop drive fails I can be back up in minutes, not hours or possibly days – not a lot of SATA notebook drives in small towns.

Requiring a second drive to use Time Machine will be easy for Mac Pro users with multiple drives, but not for my MacBook. Time Machine is a great front end with a weak engine. Let’s hope ZFS is in the works.

{ 5 comments… read them below or add one }

PJ August 9, 2006 at 11:20 pm

Journalling fits in because if you’ve got a starting point disk image and a journal that’s a timestamped list of what changed on that image, you then have the ability to pick a point in the time period that your journal covers and recreate the disk image as it was at that point in time. This is also related to how the big filer companies offer a way to be able to back up rapidly-changing files in a coherent fashion… they pick a time in the journal of that file and make that version available.

Robin Harris August 10, 2006 at 11:06 am

PJ, now I am getting confused. If journaling is used to create the disk image as it was at some point in time, why does Time Machine specify that it does backups at midnight?

A journaled file system – unlike the journaling done in a database – does not equal Continuous Data Protection or snapshot copy – because if it did we wouldn’t need CDP or snapshots. I think the Apple Knowledge Base article I quoted is correct: HFS+ journaling accelerates recovery time by eliminating the need for a lengthy fsck of the file system.

ZDigital August 10, 2006 at 2:04 pm

I thought the main purpose of a journaling filesystem was to provide consistency in the filesystem by checking a list of transactions and then rolling forward uncommitted transactions to the disk after a system crash. This was to prevent uncommitted data from being lost in the event of a power failure, crash, etc., which is what traditional filesystems were doing before journaling. To my way of thinking, Time Machine sounds more like rsync or even a simplified Subversion repository, and if Apple is using the journaling function of HFS+ there must be a separate log that contains the transactions so that you can go back much further in time. As I have read, the log for a traditional journaling FS is circular and has a limited length of life and would not be well suited to Time Machine. Let me know your thoughts.
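The circular-log point can be illustrated with a toy model. This is only a sketch of a fixed-size journal, not the HFS+ structure; the `CircularJournal` class and its capacity are hypothetical.

```python
from collections import deque


class CircularJournal:
    """Toy fixed-capacity journal: new records overwrite the oldest ones."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest entry when full.
        self.entries = deque(maxlen=capacity)

    def record(self, seq, operation):
        self.entries.append((seq, operation))

    def oldest_seq(self):
        """Sequence number of the earliest record still available."""
        return self.entries[0][0] if self.entries else None
```

With a capacity of three, recording operations 1 through 5 leaves only 3, 4, and 5, so there is no way to go back to the state before operation 3.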

Robert Pearson August 10, 2006 at 8:31 pm

I believe Time Machine is a wonderful product for the Personal Computer working environment. It is actually better than it needs to be. Over time it should improve, provided Apple continues to be as responsive to user feedback as it has been.
The visual frontend is very similar to a product called Scope I evaluated years ago. The UX (User Experience) was fantastic. Scope visualized all your files. It also sent copies of all of them back to the author without your permission. At least the eval did. Unfortunately Scope relied on Microsoft Outlook, which is a pig you don’t want to plan your dreams on.

I once was rejected in a job interview over the journaling vs. non-journaling question. When I was asked why you would want to use a journaling file system I gave a reply very much like ZDigital.
The face of the VP of Technology I was interviewing with went blank. He closed my paperwork and thanked me for coming in. I thought I had the job! When I asked him when I should report he looked at me like I was a complete fool. When I asked him how he would answer the question he almost screamed at me, “Because it doesn’t have to run fsck!!!” Many people believe this to be the case. It is an obvious feature. This company was out of business less than 3 years after this. Nice guys, not a clue.

Think about 24x7xForever-Forever operations. If you reboot, it is true you would like to avoid the lengthy fsck. You pray to God that you don’t have to reboot. But if you do, you hope that some form of “checkpointing” is available.
Corrupted Information is a special problem. There are many kinds of corruption. The worst is from hardware failure. The corruption can be massive and undetected until reboot. It is hard to diagnose because there is no easily discernible pattern to it. FSCK can choke on these errors. Journaling seems to handle it better. Just an opinion. Nothing is foolproof.

PJ August 15, 2006 at 8:18 am

| PJ, now I am getting confused. If journaling is used to create the disk image as it was at some point in time, why does Time Machine specify that it does backups at midnight?

Because storing all the changepoints is much more expensive than storing some of them, so it would save a lot of journal space to consolidate all the changes made in some time period (a day?) into one change set. If in a 5 minute period you change A to a to b to B, that could be stored as either 3 changes ( A->a->b->B) or as one (A->B). One changeset is going to be much smaller than 3, and if there’s a way to access that timepoint, or even just view the filesystem as it was at that timepoint, you’ve got a good way to make sure you get a consistent view of the filesystem, which is a prereq for doing a good backup.
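The consolidation described here can be sketched in a few lines. This is an illustration of the idea only, with a hypothetical `coalesce` function and a made-up change format of (path, old value, new value); it says nothing about how Time Machine actually stores changes.

```python
def coalesce(changes):
    """Collapse an ordered list of (path, old, new) changes into net changes.

    Returns a dict mapping each path to (first old value, last new value),
    so A->a, a->b, b->B becomes the single change A->B.
    """
    net = {}
    for path, old, new in changes:
        if path in net:
            # Keep the original starting value, update only the end state.
            net[path] = (net[path][0], new)
        else:
            net[path] = (old, new)
    return net
```

For example, `coalesce([("f", "A", "a"), ("f", "a", "b"), ("f", "b", "B")])` yields one entry for "f" going from "A" to "B". The intermediate versions "a" and "b" are gone, which is the space saving and also the loss.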

|A journaled file system – unlike the journaling done in a database – does not equal Continuous Data Protection or snapshot copy – because if it did we wouldn’t need CDP or snapshots.

Sure – those features take some extra work, but a journaled filesystem provides all the information and facilities needed to implement CDP or snapshots, and I suspect that Time Machine is Apple’s version of those kinds of features; as usual though, the interface is the problem, and that’s one place Apple excels.
