Disaster Recovery

From LogicalDOC Community Wiki
Revision as of 07:13, 10 March 2011 by Car031 (talk | contribs) (External procedure with Amazon S3)

The Aim: Mirroring LogicalDOC resources

As a DMS, LogicalDOC has to handle large amounts of sensitive and confidential documents. From the user's perspective it is important to guarantee information resilience even after a server failure or, more generally, a disaster that may involve the building or the whole geographical area. We also need a tool to quickly restore the data once the runtime environment becomes available again after the failure. It is important to note that what is described here is NOT a backup; it only describes a mirroring system used to quickly restore documents after a disaster.

External procedure with Amazon S3

We are trying to develop a mirroring procedure that puts LogicalDOC resources into a specific Amazon S3 bucket using Java and the AWS SDK. Below is a rough description of the main cases to handle when inspecting a single resource:

1) If the resource is locally new, simply execute a PUT

  For each inspected folder, a special cache file is created that stores each file name and its MD5 hash

2) If the resource was locally modified, execute a DELETE followed by a PUT (we don't want to handle versions)

  To check whether a resource was modified, it is enough to compare its MD5 against the one saved in the cache file

3) If the resource was locally deleted, execute a DELETE

  To detect deletions we could use a special file in each parent folder listing the deleted resources.
  If the deleted resource is a folder, all remote elements inside this folder must also be deleted remotely.
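
The three cases above can be sketched as a pure decision step that compares the cached MD5 values with the current ones. This is only a minimal sketch under stated assumptions: the class name MirrorPlanner, the Action enum and the in-memory maps are hypothetical, and deletions are inferred here from names that are in the cache but no longer present locally, which is a simplification of the deleted-resources file proposed above. The actual S3 calls are left out; with the AWS SDK they would presumably map to the client's putObject and deleteObject operations.

```java
import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of LogicalDOC: decides what to do with each resource.
class MirrorPlanner {

    // One remote action per resource, matching cases 1), 2) and 3).
    enum Action { PUT, DELETE_THEN_PUT, DELETE }

    // Computes the MD5 of a local file as a 32-character lowercase hex string.
    static String md5(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        return String.format("%032x", new BigInteger(1, digest.digest()));
    }

    // cached:  file name -> MD5 stored in the folder's cache file by the previous run
    // current: file name -> MD5 computed from the local folder in this run
    // Unchanged resources do not appear in the result.
    static Map<String, Action> plan(Map<String, String> cached, Map<String, String> current) {
        Map<String, Action> actions = new TreeMap<>();
        for (Map.Entry<String, String> entry : current.entrySet()) {
            String previous = cached.get(entry.getKey());
            if (previous == null) {
                actions.put(entry.getKey(), Action.PUT);              // case 1: locally new
            } else if (!previous.equals(entry.getValue())) {
                actions.put(entry.getKey(), Action.DELETE_THEN_PUT);  // case 2: locally modified
            }
        }
        for (String name : cached.keySet()) {
            if (!current.containsKey(name)) {
                actions.put(name, Action.DELETE);                     // case 3: locally deleted
            }
        }
        return actions;
    }
}
```

Keeping the decision separate from the upload code makes it possible to test the three cases without touching the bucket. Deleting a whole remote folder would still need an extra step that lists and removes every remote element under that folder.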


The procedure will take all its configuration parameters from the central context.properties file and from additional properties files.
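
As an illustration of how the configuration could be layered, the sketch below loads the central properties first and lets an additional properties source override them. The class name MirrorConfig and the keys mirror.s3.bucket and mirror.source.dir are assumptions made for this example only; the real key names are not defined on this page.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

// Hypothetical configuration holder for the mirroring procedure.
class MirrorConfig {
    final String bucket;     // target Amazon S3 bucket (assumed key: mirror.s3.bucket)
    final String sourceDir;  // local folder to mirror   (assumed key: mirror.source.dir)

    MirrorConfig(Properties props) {
        this.bucket = require(props, "mirror.s3.bucket");
        this.sourceDir = require(props, "mirror.source.dir");
    }

    // Values from 'extra' win; anything missing there falls back to 'central'.
    static Properties load(Reader central, Reader extra) throws IOException {
        Properties base = new Properties();
        base.load(central);
        Properties merged = new Properties(base); // 'base' acts as the defaults
        merged.load(extra);
        return merged;
    }

    // Fails early when a mandatory parameter is missing or blank.
    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing configuration parameter: " + key);
        }
        return value.trim();
    }
}
```

In a real run the two readers would be opened on context.properties and on the additional file, for example with Files.newBufferedReader.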

What about the partial file problem?

When the procedure inspects a folder, how can it tell whether a file is complete or still being written? We can set the question aside: if this happens, a second run will solve the issue, because the MD5 will have changed and the remote resource will be replaced by the correct local binary. This minimalistic approach is simple, but it does not guarantee a consistent remote mirror at any given point in time.
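
The reasoning can be illustrated with a small sketch. The class PartialFileCheck and its methods are hypothetical names, and byte arrays stand in for the file contents seen by two consecutive runs. The first run caches the MD5 of the partial content; on the second run the MD5 of the complete content no longer matches, so the resource is replaced.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo of why a second run repairs a partially written file.
class PartialFileCheck {

    // MD5 of the given bytes as a 32-character lowercase hex string.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            return String.format("%032x", new BigInteger(1, digest.digest(data)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // True when the local content no longer matches the MD5 saved in the cache file.
    static boolean needsReplace(String cachedMd5, byte[] localContent) {
        return !cachedMd5.equals(md5Hex(localContent));
    }

    public static void main(String[] args) {
        byte[] partial = "first half".getBytes(StandardCharsets.UTF_8);
        byte[] complete = "first half, second half".getBytes(StandardCharsets.UTF_8);

        // Run 1 mirrored the partial file and cached its MD5.
        String cachedAfterFirstRun = md5Hex(partial);

        // Run 2: the file is now complete, so the hashes differ and it is replaced.
        System.out.println(needsReplace(cachedAfterFirstRun, complete));

        // Run 3: nothing changed since run 2, so nothing is uploaded.
        System.out.println(needsReplace(md5Hex(complete), complete));
    }
}
```

Between the first and second run the bucket still holds the truncated binary, which is exactly the window in which the mirror is not consistent.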