Shortlog - a log of everyday things

2010-12-18

Today I realized that my ideal home directory system (a network-mounted homedir with offline access) is in fact impossible to implement. Provably so, thanks to the CAP theorem, also known as Brewer's theorem.

So: I have to choose to lose one of: Consistency, Availability, or Partition-tolerance. Let's review the options:

  1. Give up consistency. You keep a cache of the most recently seen copy of every file, and work with that. You can synchronize when the link is up, and you can always access some version of the homedir (albeit not necessarily the official global copy). As I understand it, this is what Dropbox does; rsync plus some clever scripts would get you something similar.
  2. Give up availability. Now, you lose access to your files under some conditions (notably, network outages). This is pretty uncool, and applications don't like this.
  3. Give up partition-tolerance. This is the classic homedir-is-on-the-network-drive system. Now, you lose the ability to cope with network outages. This is not good if you're using this system for your home directory, and in particular, is unacceptable if you store your wireless connection secrets on the network drive itself, since you would need the network to read the very secrets that get you onto the network.
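
To make option 1 concrete, here is a minimal Python sketch of the cache-and-sync idea. Everything in it is made up for illustration: `RemoteStore` is a stand-in for the network copy (a dict plus a `link_up` flag), `CachedHome` is the local side, and conflicts are resolved by letting the local change overwrite the remote one, which is an assumption on my part and not a claim about how Dropbox or any real tool behaves.

```python
class RemoteStore:
    """Hypothetical stand-in for the authoritative network copy."""

    def __init__(self):
        self.files = {}
        self.link_up = True

    def read(self, path):
        if not self.link_up:
            raise ConnectionError("network is down")
        return self.files[path]

    def write(self, path, data):
        if not self.link_up:
            raise ConnectionError("network is down")
        self.files[path] = data


class CachedHome:
    """Option 1: always answer from a local cache, sync when the link is up."""

    def __init__(self, remote):
        self.remote = remote
        self.cache = {}     # most recently seen copy of each file
        self.dirty = set()  # paths changed locally but not yet pushed

    def read(self, path):
        # Unpushed local changes win; otherwise try to refresh from remote.
        if path not in self.dirty:
            try:
                self.cache[path] = self.remote.read(path)
            except (ConnectionError, KeyError):
                pass  # offline or unknown remotely: use the cached copy
        return self.cache[path]  # may be stale; that is the trade-off

    def write(self, path, data):
        # Writes always succeed locally, even when the network is down.
        self.cache[path] = data
        self.dirty.add(path)
        self.sync()

    def sync(self):
        """Push pending changes; return True if nothing is left to push."""
        for path in list(self.dirty):
            try:
                # Assumption: the local version simply overwrites the remote.
                self.remote.write(path, self.cache[path])
            except ConnectionError:
                return False
            self.dirty.discard(path)
        return True
```

Reads and writes never fail because of the network, so availability and partition-tolerance are kept. The price is that `read` can return a copy that differs from the remote one until `sync` succeeds, which is exactly the consistency being given up.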

Conclusion: the best of all worlds is impossible. If you can guarantee network stability, then by all means, give up partition-tolerance, and use a pure network filesystem with a single authoritative copy of the data. If you expect network outages, or need access to the data before the network is up, give up consistency, and use something like Dropbox that keeps a local cache and syncs it with the remote "authoritative" copy.

It's interesting to note that the best possible solutions to this problem already exist (and see wide deployment). Thus, I claim that network-based storage and synchronization are, for all intents and purposes, a solved problem.