WORK IN PROGRESS: this is a toy project provided without any guarantee.
This project aims to deal with the headache created by the uncontrolled growth of cache files, as raised by one user:
In ~/.unison, files starting with ar and fp accumulate as one changes roots over the years. It's natural to want to delete ones that are unused, but it's somewhat difficult to understand what each file is for.
(See Issue 495.)
This project provides a temporary solution by providing a wrapper, called uwrapper, which manually reads the configuration file, manages the corresponding archive files in a dedicated directory. To achieve this for remote hosts, the wrapper also upload or download the files accordingly.
Problem
Unison use archive files to store information about a synchronisation. These files exist on both roots. Using for a long time creates a significant amount of archive files, resulting in confusion and a cluttered ~/.unison folder.
Archive files
Assume there are two roots: A and B, there will be two archive files for A, located on A's machine's ~/.unison folder, and named arX and fpX, where X is a 32 length 32 alphanumeric in lower case. Similarly, for B, we have two archive files located similarly.
The names of roots are canonized, and used to compute name of the archive files, i.e., the X above. Note that the name of a root contains both a hostname and the full path. On both the local machine and the remote, the hostname is first determined by the environment variable UNISONLOCALHOSTNAME. If the environment variable is missing, it is then determined by a standard procedure. See Appendix for details.
Solution
For each profile file, the wrapper creates a folder with the same prefix (i.e. without the .prf extension). This folder stores the archive files corresponding to the profile file.
- When the wrapper starts, it creates a clean
~/.unisonfolder for both local host and remote host (if exists). Existing folders are renamed before for backup purposes. They will NOT be restored after the wrapper, because restoring them would encourage using pre-existing archives and would bring complexity in maintaining this wrapper. - When the wrapper starts, it populates
~/.unisonwith the required profile file and archive files. When it ends, it moves those files back to the folder and remove the temporary~/.unison. - The local archives are stored in the
localsub-folder, remote ones in theremotesub-folder.
Currently, this wrapper is developed with a MacOS local host, and can support a Linux host in principle. It has been tested against a Linux remote as well as a Windows remote. Additionally, the communication with remote is implemented through simple calls to ssh commands and executing standard shell/powershell scripts on the remote platform, which can be problematic if not supported remotely.
Please first clone this package, and install it locally:
- Install:
pipx install ./. - Upgrade:
pipx upgrade unison-wrapper.
To use this package:
- To start preparing for a profile:
uwrapper start profile_name.prf. - After this, one can execute unison as normal:
unison profile_name.prf. - Once the synchronisation is finished, it is important to restore the state so that the updated archive files are properly copied back:
unison restore profile_name.prf.
- Use a file to record the state so that we don't need to specify "start/restore".
- Keep a record of
UNISONLOCALHOSTNAME. - Extend to work on Windows machine (local host is Windows).
- Test it on Linux host.
From Unison's documentation:
The name of the archive file on each replica is calculated from
* the canonical names of all the hosts (short names like saul are
converted into full addresses like saul.cis.upenn.edu),
* the paths to the replicas on all the hosts (again, relative
pathnames, symbolic links, etc. are converted into full, absolute
paths), and
* an internal version number that is changed whenever a new Unison
release changes the format of the information stored in the
archive.
This method should work well for most users. However, it is
occasionally useful to change the way archive names are generated.
Unison provides two ways of doing this.
The function that finds the canonical hostname of the local host (which
is used, for example, in calculating the name of the archive file used
to remember which files have been synchronized) normally uses the
gethostname operating system call. However, if the environment variable
UNISONLOCALHOSTNAME is set, its value will be used instead. This makes
it easier to use Unison in situations where a machine’s name changes
frequently (e.g., because it is a laptop and gets moved around a lot).
A more powerful way of changing archive names is provided by the
rootalias preference. The preference file may contain any number of
lines of the form:
rootalias = //hostnameA//path-to-replicaA -> //hostnameB/path-to-replicaB
When calculating the name of the archive files for a given pair of
roots, Unison replaces any root that matches the left-hand side of any
rootalias rule by the corresponding right-hand side.
So, if you need to relocate a root on one of the hosts, you can add a
rule of the form:
rootalias = //new-hostname//new-path -> //old-hostname/old-path
Note that root aliases are case-sensitive, even on case-insensitive
file systems.