FIXME: THIS CHAPTER IS BROKEN.
Inserting a file into the database is conceptually a search of a "representation space" for the file.
For any given file there are many reasonable representations, such as:
There are also many unreasonable representations, such as:
The search should find the reasonable representations and avoid the unreasonable ones. It should also explore possibilities such as replacing an existing file with a reverse delta.
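As a rough illustration of what "searching the representation space" could mean, the sketch below encodes the same bytes several ways and keeps one of them. Everything in it is an assumption made for illustration: the chapter does not say how candidates are ranked, so the sketch simply prefers the smallest result, and the `CANDIDATES` table and the `smallest_representation` function are hypothetical names, not part of the database.

```python
import bz2
import zlib

# Hypothetical candidate encoders, keyed by an illustrative name.
# "identity" stores the input unchanged.
CANDIDATES = {
    "identity": lambda data: data,
    "zlib": zlib.compress,
    "bz2": bz2.compress,
}


def smallest_representation(data: bytes):
    """Return (name, encoded) for the candidate that yields the fewest bytes.

    Ranking by size is an assumption of this sketch, not a rule
    stated in the chapter.
    """
    encoded = {name: encode(data) for name, encode in CANDIDATES.items()}
    best = min(encoded, key=lambda name: len(encoded[name]))
    return best, encoded[best]


if __name__ == "__main__":
    name, blob = smallest_representation(b"a" * 10000)
    print(name, len(blob))
```

A real search would also have to consider deltas against files already in the database, which this sketch leaves out.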
Since all of the representations are transformations of other representations, it makes sense to arrange them in a tree. Thus:
or
The configuration file for this looks like:
# transform INPUT-TYPE OUTPUT-TYPE NAMES...
# endpoints INPUT-TYPE OUTPUT-TYPE
# Note NAMES are tried in the order listed.
# If no NAMES are given, the transform is an identity transform.
# Identity transforms are handy (necessary!) if more than one
# output type can be an endpoint.

# Where we start and where we want to go:
endpoints input storage
# Use this for encryption:
endpoints input encrypted-storage

# Identity transforms
transform input storage
transform delta storage
transform compressed storage

# Use these for encryption
transform compressed encrypted gpg
transform delta encrypted gpg
transform input encrypted gpg
transform encrypted encrypted-storage

# Compression algorithms (in desired search order)
transform input compressed bzlib zlib bzip2 gzip
transform delta compressed bzlib zlib bzip2 gzip

# Delta compression
transform input delta xdelta

# Storage and retrieval methods
# (try them all in order until one succeeds)
storage berkeley
storage file
storage some-plugin-or-other
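To make the shape of this configuration concrete, here is a minimal sketch of how the directives might be read and how the chains of transforms between two endpoints could be enumerated. It is not the database's actual parser: the function names `parse_config` and `paths`, the handling of `#` as a comment marker anywhere on a line, and the depth-first traversal in configuration order are all assumptions made for illustration. The sample text is a subset of the listing above with the encryption lines left out.

```python
from collections import defaultdict

# A subset of the configuration shown above (no encryption).
CONFIG = """\
endpoints input storage
transform input storage
transform delta storage
transform compressed storage
transform input compressed bzlib zlib bzip2 gzip
transform delta compressed bzlib zlib bzip2 gzip
transform input delta xdelta
storage berkeley
storage file
"""


def parse_config(text):
    """Parse directives into (endpoints, transforms, storage methods)."""
    endpoints = []                  # [(input type, output type)]
    transforms = defaultdict(list)  # input type -> [(output type, names)]
    storage = []                    # storage methods, in listed order
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumed comment syntax
        if not line:
            continue
        keyword, *args = line.split()
        if keyword == "endpoints":
            endpoints.append((args[0], args[1]))
        elif keyword == "transform":
            # An empty name list marks an identity transform.
            transforms[args[0]].append((args[1], args[2:]))
        elif keyword == "storage":
            storage.extend(args)
        else:
            raise ValueError(f"unknown directive: {keyword}")
    return endpoints, transforms, storage


def paths(transforms, start, goal, visited=()):
    """Yield each chain of types from start to goal, in config order."""
    visited = visited + (start,)
    if start == goal:
        yield visited
        return
    for output, _names in transforms.get(start, []):
        if output not in visited:  # never revisit a type on one chain
            yield from paths(transforms, output, goal, visited)


if __name__ == "__main__":
    endpoints, transforms, storage = parse_config(CONFIG)
    for start, goal in endpoints:
        for chain in paths(transforms, start, goal):
            print(" -> ".join(chain))
```

Each chain printed is one branch of the tree of representations. Choosing among the NAMES listed for a transform, and among the storage methods, is left out of the sketch.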