This is a wrinkle that people often don't get about the ident
attribute: in CVS or Subversion, it would be legitimate to have a VERSION
file containing nothing but $Id$
as a way of tracking that the right push had happened; if you did that with git, you'd get nothing useful. If you run the script below (or just read it and the output that follows), you can see why.
(Please direct comments on this to https://plus.google.com/+TreyHarris/posts/KRgNG3zDk9x.)
Setting it up in Git is relatively simple. You create (or edit) the file .gitattributes and add a line telling Git to replace $Id$
when found in files that match certain filenames:
*.txt ident
This tells Git to check all files with a .txt extension. Two things to note:
- For some reason (performance, most likely), you can't just wildcard
* ident
there to get all files to be scanned. You have to specify the files for which ident substitution will happen; all others are ignored.
- Unlike CVS or SVN, Git does not modify the files with
$Id$
tags at commit time. It will only do the substitution when writing out the file in the course of a checkout. In the example, you'll see how I worked around this: after creating and checking in the files, I just deleted them all and used git reset --hard
to force Git to recreate them, this time with the $Id$
substitution.
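If you want to confirm which paths a pattern actually covers before committing anything, git check-attr can report it. The following is only a minimal sketch in plain POSIX shell (it should also run under zsh); the scratch directory and the file names notes.txt and README.md are made up for illustration.

```shell
#!/bin/sh
# Sketch: ask Git which paths the ident attribute applies to.
# The scratch directory and file names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
echo '*.txt ident' > .gitattributes

# check-attr matches on path names, so the files need not exist yet.
txt_attr=$(git check-attr ident -- notes.txt)
md_attr=$(git check-attr ident -- README.md)
echo "$txt_attr"
echo "$md_attr"
```

The .txt path should be reported as "set" and the other as "unspecified", which is a quick way to catch a pattern that doesn't match what you expected.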
Here's an example script, which you can download from here. It uses a little utility library I wrote (to do output like below, letting you describe the commands in the script as it's running) that can be downloaded here. You don't need to download it to try the script, though, as this version just pulls it down from a gist.
Note that this is a Z-shell script and probably won't work with bash or any other shell.
#!/bin/zsh
# Make any non-zero return fatal just in case
setopt err_exit
source =(curl https://gist.githubusercontent.com/treyharris/c486c9f8776802e270b7/raw/2d0519826f3b8ca8446f59211ecd0dc738221d3d/showshell.zsh 2> /dev/null)
local dir=/tmp/ident-test
local files=5
show "Setting up ${dir}" "
rm -rf ${dir}
mkdir ${dir}
cd ${dir}
"
show "Setting up git" "
git init
echo '*.txt ident' > .gitattributes
"
for file ({1..${files}}.txt) {
echo '$Id$' > ${file}
git add "${file}"
git commit -m "Adding ${file}"
}
show "Show we do have ${files} commits for each file:" "
git log --stat --oneline --decorate
"
show "Show the current contents" "
cat *.txt
"
show "Git doesn't replace the \$Id\$ token until it must write the file" "
rm *
git reset --hard
cat *.txt
"
And the results are:
==============================================================================================================================================================================================================
Setting up /tmp/ident-test
==============================================================================================================================================================================================================
> rm -rf /tmp/ident-test
> mkdir /tmp/ident-test
> cd /tmp/ident-test
==============================================================================================================================================================================================================
Setting up git
==============================================================================================================================================================================================================
> git init
Initialized empty Git repository in /private/tmp/ident-test/.git/
> git email-work
[email protected]
> echo '*.txt ident'
[master (root-commit) 47e610b] Adding 1.txt
1 file changed, 1 insertion(+)
create mode 100644 1.txt
[master 5ffefaf] Adding 2.txt
1 file changed, 1 insertion(+)
create mode 100644 2.txt
[master 3262753] Adding 3.txt
1 file changed, 1 insertion(+)
create mode 100644 3.txt
[master a5f485b] Adding 4.txt
1 file changed, 1 insertion(+)
create mode 100644 4.txt
[master 27b67fa] Adding 5.txt
1 file changed, 1 insertion(+)
create mode 100644 5.txt
==============================================================================================================================================================================================================
Show we do have 5 commits for each file:
==============================================================================================================================================================================================================
> git log --stat --oneline --decorate
27b67fa (HEAD, master) Adding 5.txt
5.txt | 1 +
1 file changed, 1 insertion(+)
a5f485b Adding 4.txt
4.txt | 1 +
1 file changed, 1 insertion(+)
3262753 Adding 3.txt
3.txt | 1 +
1 file changed, 1 insertion(+)
5ffefaf Adding 2.txt
2.txt | 1 +
1 file changed, 1 insertion(+)
47e610b Adding 1.txt
1.txt | 1 +
1 file changed, 1 insertion(+)
==============================================================================================================================================================================================================
Show the current contents
==============================================================================================================================================================================================================
> cat 1.txt 2.txt 3.txt 4.txt 5.txt
$Id$
$Id$
$Id$
$Id$
$Id$
==============================================================================================================================================================================================================
Git doesn't replace the $Id$ token until it must write the file
==============================================================================================================================================================================================================
> rm 1.txt 2.txt 3.txt 4.txt 5.txt
> git reset --hard
HEAD is now at 27b67fa Adding 5.txt
> cat 1.txt 2.txt 3.txt 4.txt 5.txt
$Id: 055c8729cdcc372500a08db659c045e16c4409fb $
$Id: 055c8729cdcc372500a08db659c045e16c4409fb $
$Id: 055c8729cdcc372500a08db659c045e16c4409fb $
$Id: 055c8729cdcc372500a08db659c045e16c4409fb $
$Id: 055c8729cdcc372500a08db659c045e16c4409fb $
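All the $Id$ tags expand to the same SHA, 055c872, which doesn't correspond to any commit. It is the hash of the blob holding the file's unexpanded contents, and since all five files have identical contents, they all share it. That SHA isn't even particularly useful: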
$ git show 055c872
$Id$
The only thing this sha is useful for is to have a readily-visible signature of a single file's contents. In dealing with batch pushes that may partially succeed, this could be useful, but it shouldn't be thought to be able to stand in for a version number.
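To see that the expansion really is just a content hash, you can hash the same bytes by hand. This is a small sketch, assuming a Git that uses the default SHA-1 object format; it feeds the unexpanded token plus a trailing newline (what echo wrote in the script above) to git hash-object.

```shell
#!/bin/sh
# Sketch: the ident expansion is the blob hash of the unexpanded contents.
# Assumes the default SHA-1 object format.
set -e
# Hash the same bytes the script committed: the literal token plus a newline.
blob=$(printf '$Id$\n' | git hash-object --stdin)
echo "$blob"
```

Under that assumption this should print the same 055c8729cdcc372500a08db659c045e16c4409fb value that showed up in every file, with no repository or commit involved at all.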
Still, while it's definitely not the same as CVS/SVN $Id$, and its scope is much more limited, that doesn't mean it's totally useless.
At a few sites I'm aware of, Git $Id$ is used for non-code things where a “build” consists basically of pulling files from the Git repo; for example, a repo for versioning and storing network gear configurations. When “build and install” consists entirely of downloading the version in Git, the $Id$ expansion is a useful thing. (Especially on network gear where it isn’t possible to re-download a configuration from the box in a form that can be diffed against the original.)
For the networking gear I worked with at a prior job, the device was "configured" by issuing a set of commands, not with a configuration file. So the Git-controlled "configuration file" was actually a dumb batch script.
So, at the beginning of an update, one custom variable, call it $cf_pushing, was set to the expansion of $Id$, and another variable, $cf_version, was set to the value ${cf_version}-dirty; at the end of the update $cf_version was reset to the value of $cf_pushing and $cf_pushing was nulled. So you could tell not only what version had been pushed most recently, but whether the push had completed, and it was easy to set up monitoring to alarm if a push didn't go smoothly, just by polling that variable for the substring dirty. (Of course, in real life you'd probably want to alarm only after X consecutive polls showed dirty, so that a normal push in-progress wouldn't set off alarms.)
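The "X consecutive polls" rule might be sketched like this. It is not the monitoring we actually ran; should_alarm is a hypothetical helper, and the polled values are passed in as arguments rather than read from a real device.

```shell
#!/bin/sh
# Sketch of the "alarm only after X consecutive dirty polls" idea.
# should_alarm is a hypothetical helper; the polled values are
# passed as arguments instead of being read from a device.

# Usage: should_alarm THRESHOLD VALUE...
# Returns 0 if THRESHOLD or more consecutive values contain "dirty".
should_alarm() {
    threshold=$1
    shift
    streak=0
    for value in "$@"; do
        case $value in
            *dirty*) streak=$((streak + 1)) ;;
            *)       streak=0 ;;
        esac
        if [ "$streak" -ge "$threshold" ]; then
            return 0
        fi
    done
    return 1
}

# A normal push: dirty for two polls, then clean again.
if should_alarm 3 abc123 abc123-dirty abc123-dirty def456; then
    normal=alarm
else
    normal=ok
fi

# A stuck push: dirty for three polls in a row.
if should_alarm 3 abc123 abc123-dirty abc123-dirty abc123-dirty; then
    stuck=alarm
else
    stuck=ok
fi

echo "normal push: $normal"
echo "stuck push: $stuck"
```

Resetting the streak on any clean poll is what keeps an ordinary in-progress push from paging anyone; only an uninterrupted run of dirty values reaches the threshold.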
The ident actually comes from RCS, from a time when there wasn't a concept of a working copy, so you would need the ident string in order to know where that file actually came from. It ended up being cargo-culted onto CVS (because CVS was just a wrapper around RCS anyway), and later onto SVN (because SVN was a CVS replacement).
One aspect, however, makes the ident string even less useful in Git: the file itself doesn't have a history. If two independent branches get to the same contents before being merged, what would the semantics of that even be?
One more interesting thing, however, is that git archive supports substitutions when generating the output, which is probably going to be more useful if you're trying to create a release tarball...
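For anyone who wants to try that, the mechanism is the export-subst attribute, which makes git archive expand $Format:...$ placeholders using the same format codes as git log. Here is a rough sketch; the scratch repo, the throwaway identity, and the VERSION file name are all just for illustration.

```shell
#!/bin/sh
# Sketch: let git archive stamp the commit hash into a VERSION file.
# The scratch repo, identity, and file name are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# export-subst makes git archive expand $Format:...$ placeholders.
echo 'VERSION export-subst' > .gitattributes
echo '$Format:%H$' > VERSION
git add .gitattributes VERSION
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m 'Add VERSION'

head=$(git rev-parse HEAD)
# Archive the commit and extract only VERSION to stdout.
archived=$(git archive HEAD | tar -xOf - VERSION)
echo "HEAD:     $head"
echo "archived: $archived"
```

Unlike ident, the value stamped into the archive is the commit hash, and the file in the working tree keeps its unexpanded placeholder.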