git-annex allows managing files with git, without checking the file contents into git. While that may seem paradoxical, it is useful when dealing with files larger than git can currently easily handle, whether due to limitations in memory, time, or disk space.
Get all content from the backup remote:
git annex get * --from backup
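Check file integrity and show only the results that are not ok, first with a fast check (which skips checksum verification), then with a full check:
time git annex fsck --fast | grep -A 10 -v "ok$"
time git annex fsck | grep -A 10 -v "ok$"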
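For long fsck runs a short tally can be easier to read than the filtered output. The following is a minimal sketch, not part of git-annex: the helper name fsck_summary is hypothetical, and it assumes that each checked file ends with a line terminated by "ok" or "failed", and that git-annex's own closing summary line starts with "git-annex:".

```shell
# Hypothetical helper: tally `git annex fsck` output read from stdin.
# Assumption: every checked file ends with a line terminated by "ok" or "failed".
fsck_summary() {
  awk '
    /^git-annex:/ { next }        # skip the closing summary line (assumed prefix)
    /ok$/         { ok++ }
    /failed$/     { failed++ }
    END           { printf "ok=%d failed=%d\n", ok, failed }
  '
}
```

It could be used as: git annex fsck --fast | fsck_summary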
List files whose content is not present in <remote>:
git annex find --not --in=<remote> .
Check how much disk space the content from the backup remote will use when fetched:
git annex info . --not --in here
To migrate from the older SHA1E backend to the newer SHA256E backend (the default for new repos):
git annex migrate --backend SHA256E *
After migration you might need to run git annex unused and git annex dropunused.
git annex unused
git annex unused | grep -o -P "^ [0-9]+" | xargs git annex dropunused
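The grep in the pipeline above depends on the exact indentation of the listing. A slightly more tolerant variant can be sketched with awk. The helper name unused_numbers is hypothetical, and the sketch assumes that each unused entry is printed as a line with exactly two fields, a number followed by a key.

```shell
# Hypothetical helper: print the entry numbers from `git annex unused`
# output (read from stdin), one per line.
# Assumption: entries look like "<spaces>NUMBER<spaces>KEY".
unused_numbers() {
  awk 'NF == 2 && $1 ~ /^[0-9]+$/ { print $1 }'
}
```

It could be used as: git annex unused | unused_numbers | xargs git annex dropunused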
If there are still files in a specific backend:
$ git annex info
...
backend usage:
SHA1E: 29
SHA256E: 973
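To check for leftover backends in a script, the "backend usage:" section can be filtered. This is a rough sketch only: the helper name other_backends is hypothetical, and it assumes the counts appear as "NAME: COUNT" lines directly after the "backend usage:" line, as in the output shown above.

```shell
# Hypothetical helper: read `git annex info` output from stdin and print
# every backend other than the wanted one ($1), with its file count.
# Assumption: counts follow "backend usage:" as "NAME: COUNT" lines.
other_backends() {
  awk -v wanted="$1" '
    /^backend usage:/ { in_section = 1; next }
    in_section && NF == 2 && $1 ~ /^[A-Z0-9_]+:$/ && $2 ~ /^[0-9]+$/ {
      name = substr($1, 1, length($1) - 1)   # strip the trailing colon
      if (name != wanted) print name, $2
      next
    }
    { in_section = 0 }                       # any other line ends the section
  '
}
```

It could be used as: git annex info | other_backends SHA256E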
Show which remotes contain files with backend=SHA1E:
$ git annex list --inbackend=SHA1E
here
|backup
||github
|||origin
||||web
|||||bittorrent
||||||
_X____ foobar1.txt
...
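The column layout of git annex list is compact but hard to read once there are many remotes. The sketch below turns it into one line per file with the names of the repositories that hold it. The helper name list_locations is hypothetical, and it assumes the layout shown above: one header line per repository, a line consisting only of "|" characters, then one row per file where "X" marks presence.

```shell
# Hypothetical helper: convert `git annex list` output (stdin) into
# "file<TAB>repo1,repo2" lines.
# Assumption: header lines name the repositories, a line of only "|"
# ends the header, and "X" in column i means repository i has the file.
list_locations() {
  awk '
    !in_rows && /^[|]+$/ { in_rows = 1; next }
    !in_rows {
      name = $0
      gsub(/^[|]+/, "", name)            # drop the leading column markers
      remotes[++n] = name
      next
    }
    {
      flags = substr($0, 1, n)           # one flag character per repository
      file = substr($0, n + 2)           # file name follows a single space
      where = ""
      for (i = 1; i <= n; i++)
        if (substr(flags, i, 1) == "X")
          where = where (where == "" ? "" : ",") remotes[i]
      print file "\t" (where == "" ? "-" : where)
    }
  '
}
```

It could be used as: git annex list --inbackend=SHA1E | list_locations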
To mark a repository that has been lost as dead:
git annex dead <UUID>
To prune all history relating to all dead remotes:
git annex forget --drop-dead
You need to be running a git-annex version that supports this on all computers you use the repos on, or the pruned history will get merged back in.
To recycle the UUID of a dead repo, first clone the repo to a new location:
git clone foo
Now set "annex.uuid" in the freshly created .git/config to the UUID of the dead repo you want to recycle.
Do this before you run any git annex command. Now run:
git annex init
git annex fsck
git annex semitrust <uuid>
Sync with master.
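The recycling steps above can be sketched as a single function. This is only an illustration under stated assumptions: the function names is_uuid and recycle_uuid and their three arguments are hypothetical, "annex.uuid" is written with git config rather than by editing .git/config by hand, and the format check assumes the usual 8-4-4-4-12 hexadecimal UUID layout.

```shell
# Hypothetical helper: succeed if the argument looks like a UUID
# (assumed layout: 8-4-4-4-12 hexadecimal digits).
is_uuid() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Hypothetical helper: clone SOURCE into DEST and give the clone the UUID
# of a dead repo. All three arguments are supplied by the caller.
recycle_uuid() {
  is_uuid "$3" || { echo "not a UUID: $3" >&2; return 1; }
  git clone "$1" "$2" || return 1
  (
    cd "$2" || exit 1
    # annex.uuid must be set before any git annex command runs
    git config annex.uuid "$3" &&
      git annex init &&
      git annex fsck &&
      git annex semitrust "$3"
  )
}
```

It could be used as: recycle_uuid foo foo-recycled <uuid>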
Git-annex's metadata works best when files have a lot of useful metadata attached to them. To make git-annex automatically set the year and month when adding files, run:
git config annex.genmetadata true
A git commit hook can be set up to extract lots of metadata from files like photos, mp3s, etc. Install the extract utility, from libextractor.
Download pre-commit-annex and install it in your git-annex repository as '.git/hooks/pre-commit-annex'. Remember to make the script executable! Run:
git config metadata.extract "artist album title camera_make camera_model orientation video_dimensions image_dimensions"
To run the hook for the first time on content that is already annexed:
git annex find --format='${file}\n' | sort | \
awk -vRS= -vFS='\n' '{for (i = 2; i <= NF; i++) print $i}' | \
xargs -d '\n' bash -x .git/hooks/pre-commit-annex
Now any fields you list in metadata.extract will be extracted and stored when files are committed.
To get a list of all possible fields, run:
libextractor-extract -L | sed 's/ /_/g'
By default, if git-annex already has a metadata field for a file, its value will not be overwritten with metadata taken from files. To allow overwriting, run:
git config metadata.overwrite true