Care must be taken when planning a filesystem structure for concurrent use by multiple redundant services. The extra performance cost of shared storage should also be taken into account.

As I see it, files can be divided into the following categories:

Static Content

The majority of our storage space is filled up with media files, HTML, PDFs, etc. These can easily be served up from multiple simultaneous nodes. A shared filesystem is enough to keep all nodes synchronized, and failover (and back) is trivial.

Live Content

Usually database files, though any files which are held open for writing for extended periods of time can fit here. Because of the extended locking, we cannot serve the same file from multiple nodes simultaneously. It makes the most sense to keep live files on a local, non-shared filesystem. Frequent archive dumps to the shared fs (and copying from the shared fs on startup) will keep the services backed up and (somewhat) in sync. Failover and back (and/or live mirroring) will need to be dealt with on a service-by-service basis.
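
As a rough illustration of the dump-and-restore cycle described above, here is a minimal Python sketch. It assumes the service has already written a consistent dump into a local directory (copying database files that are still open would not be safe), and the function names, directory layout and timestamp format are all made up for the example.

```python
import shutil
from pathlib import Path


def archive_to_shared(dump_dir, shared_dir, node, stamp):
    """Copy a finished local dump to the shared fs.

    `stamp` is assumed to be a sortable timestamp string such as
    "20081017T182909"; it goes first so names sort by time.
    """
    dest = Path(shared_dir) / f"{stamp}-{node}"
    shutil.copytree(dump_dir, dest)
    return dest


def restore_latest(shared_dir, local_dir):
    """On startup, copy the newest dump from the shared fs to local disk.

    Returns the dump directory that was used, or None if there is none.
    """
    shared = Path(shared_dir)
    if not shared.is_dir():
        return None
    dumps = sorted(p for p in shared.iterdir() if p.is_dir())
    if not dumps:
        return None
    latest = dumps[-1]
    # dirs_exist_ok needs Python 3.8 or later.
    shutil.copytree(latest, local_dir, dirs_exist_ok=True)
    return latest
```

How often to call `archive_to_shared`, and how to import the restored dump, still depends on the individual service.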

Configuration Files

Binary executable files and other detritus needed to run each service should be installed and configured locally on each node. Where possible, configuration files should be kept on the shared fs and simply 'included' in the local configuration.

Log Files

Logs should be kept on the shared filesystem, but with filenames or paths that include the serving node's name.
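
One possible way to build such a path, sketched in Python. The `logs/<service>/<node>.log` layout and the function name are assumptions for illustration, not an existing convention.

```python
import socket
from pathlib import Path


def node_log_path(shared_root, service, node=None):
    """Return a log path on the shared fs that is unique to this node."""
    # Fall back to the local hostname when no node name is given.
    node = node or socket.gethostname()
    return Path(shared_root) / "logs" / service / f"{node}.log"
```

Because every node writes to its own file, no two nodes ever hold the same log open for writing.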

Seeded Files

Seeded files are generated specifically for one node based on shared data, and should not be on the shared fs. Unlike live content, the seed data rarely changes. The generated file can be rebuilt easily, but generally does not need to be. An example might be a config file, with %HOSTNAME% replaced with the name of the serving node.
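
The %HOSTNAME% example could be generated with something like the following Python sketch. The function names and the idea of keeping the template on the shared fs and the output on local disk are assumptions made for the example.

```python
import socket
from pathlib import Path


def render_seed(template_text, node):
    """Replace the %HOSTNAME% placeholder with the node's name."""
    return template_text.replace("%HOSTNAME%", node)


def seed_file(template_path, output_path, node=None):
    """Generate a node-specific file from a shared template.

    `template_path` is assumed to live on the shared fs and
    `output_path` on the node's local filesystem.
    """
    node = node or socket.gethostname()
    text = Path(template_path).read_text()
    Path(output_path).write_text(render_seed(text, node))
    return Path(output_path)
```

Since the seed data rarely changes, running this once at install time, or again whenever the template is edited, is probably enough.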

Notes on building a filesystem for the purpose

I wrote ["lmfs"] for this purpose, but FUSE is simply not robust enough.

I recently came across ["http://btrfs.wiki.kernel.org/index.php/Main_Page" btrfs], which offers several features that might come in handy here, in particular its ability to make filesystem snapshots. A snapshot is a 'copy' of the entire filesystem as it was at the time the snapshot was taken. This can be incredibly useful for any kind of live backup, since it ensures data integrity across the entire filesystem. In btrfs, snapshots are not read-only; each one is a writable copy-on-write copy of the entire fs.

DistributedServer/FileSystem (last edited 2008-10-17 18:29:09 by calin)