We want to run the various SplitReflection services on two (or more) servers, each with their own separate internet connection.
We hope to gain:
- Data Redundancy
- Automatic Failover
- Load Balancing
- Clustering certain services
Ideally, we'd like fully bi-directional data redundancy. Changes made on either server should be replicated to the other.
Fully capable options include:
- Coda - a good feature set, but it may not be production-ready or fully POSIX-compliant
- A homebrew project I started specifically for this purpose, using [http://fuse.sourceforge.net FUSE] and Perl
We should do some testing and research to see whether any of these options can cope with a slow, unstable internet connection. If none can, we will probably need a writable master server and a read-only slave server.
We could accomplish this with a periodic rsync. There may also be a way to monitor the filesystem for changes and trigger the copy on demand.
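The watch-and-copy idea above can be sketched as a polling loop: snapshot the file tree's mtimes and sizes, and shell out to rsync whenever the snapshot changes. This is a minimal sketch, not the project's actual tooling; the directory path, remote target, and polling interval are all placeholders.

```python
import os
import subprocess
import time

def snapshot(root):
    """Map each file under root to its (mtime, size) so changes are detectable."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between walk() and stat()
            state[path] = (st.st_mtime, st.st_size)
    return state

def sync_loop(root, remote, interval=60):
    """Poll root every `interval` seconds; rsync to the slave when it changes."""
    last = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        if current != last:
            # -a preserves permissions/times, -z compresses for the slow link,
            # --delete removes files on the slave that were removed on the master
            subprocess.run(["rsync", "-az", "--delete", root + "/", remote])
            last = current

# sync_loop("/srv/data", "slave.example.com:/srv/data")  # hypothetical hosts/paths
```

A kernel-level change notification mechanism (such as Linux's inotify) would avoid the polling cost, but the polling version works on anything rsync runs on.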
Take a look at DistributedServer/FileSystem for structure notes.
In the event that one server is unreachable, clients should automatically be directed to the other. This is mostly a matter of DNS, but DNS does not support conditional resolution so far as I know. The member servers will therefore need to determine which members are alive and update the DNS records accordingly.
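The liveness check each member would run could be as simple as attempting a TCP connection to its peers. The sketch below assumes hypothetical member hostnames; how the surviving member then pushes new DNS records (nsupdate, a provider API, etc.) is left open, since it depends on who hosts the zone.

```python
import socket

# Hypothetical member hostnames; the real list would come from configuration.
MEMBERS = ["server1.splitreflection.com", "server2.splitreflection.com"]

def is_alive(host, port=80, timeout=5):
    """A member counts as alive if it accepts a TCP connection on the given port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def live_members(members, port=80):
    """Return the subset of members currently accepting connections."""
    return [m for m in members if is_alive(m, port)]
```

Each member would run this periodically and, on seeing a peer go dead, update DNS to point clients only at the live set.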
Incoming client requests should be spread across the available member servers. This can be done with round-robin DNS entries.
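In zone-file terms, round-robin just means publishing multiple A records for the same name; resolvers rotate the order they return. A sketch, using placeholder documentation IPs — the low TTL also helps the failover case above, since record changes propagate quickly:

```
; round-robin: two A records for one name; resolvers rotate the answer order
www.splitreflection.com.  300  IN  A  198.51.100.10   ; member server 1 (placeholder IP)
www.splitreflection.com.  300  IN  A  203.0.113.20    ; member server 2 (placeholder IP)
```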
Clustering Certain Services
Unfortunately, some services require state to be maintained on the server, some per-session and some globally. Some examples:
- httpsocket proxy
- Each session starts up a process on the server, which lives for the duration of the session. Subsequent requests must be made to the same server
- MUCKs and Databases
- Users connecting to different servers for these services will not have access to the same data
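For the per-session case, one stateless way to keep a session on a single server is to hash the session id onto the member list: every member computes the same answer, so no shared routing table is needed. This is only a sketch under assumed hostnames, and it has the obvious weakness that sessions pinned to a dead member are lost until it returns.

```python
import hashlib

# Hypothetical member hostnames; the real list would come from configuration.
SERVERS = ["server1.splitreflection.com", "server2.splitreflection.com"]

def server_for_session(session_id, servers=SERVERS):
    """Deterministically pin a session to one server by hashing its id.

    All members agree on the mapping without exchanging any state, so a
    request arriving at the "wrong" member can be proxied or redirected
    to the right one.
    """
    digest = hashlib.sha256(session_id.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```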
Some of these problems might be solved with special DNS entries: muck.splitreflection.com, for example, would point to whichever server is currently running the muck. These entries should only change when the active server becomes unreachable, and the service should be disabled on the inactive servers.
In some cases, the application itself supports clustering. MySQL might do this, and protomuck might be made to.