- You own a Git repository server and the developers do not have access to it (i.e. they can only read from and write to the repo, but not `gc` it).
- You had a developer who wrote a project for you.
- He got angry for whatever reason and deleted all branches from the remote repo. He also `push -f`ed the `master` branch, leaving only one silly commit there.
- He fled the country, leaving you without any code at all (at least this is what he believes).
- You have never cloned the repo to another machine. There were only two copies of it: the developer's and the server's.
You want to recover the work deleted by the developer.
- Does `git clone`ing such a repo also clone orphaned Git objects?
- How do you find the latest commit in the repo without knowing any branch?
- Are there any other ways for the developer to delete the commit objects without access to the physical `.git` remote repository or the command line of the server that hosts the repo?
Git 2.15
# you create a repo on the remote, probably through some GUI
git init --bare repo.git
# the developer clones it
git clone repo.git repo
# and works...
cd repo
echo "Commit 1" > A.txt
git add A.txt
git commit -m "The first commit"
echo "Commit 2" >> A.txt
git commit -am "The second commit"
echo "Commit 3" >> A.txt
git commit -am "The third commit"
# he shares the work with your remote
git push -u origin master
# but then he gets angry and tries to delete all of the commits
git checkout --orphan angry
git commit --allow-empty -m "The project was here haha"
git push -f origin HEAD:master
cd ..
rm -fr repo
Now `repo.git` is a bare (remote) repository that contained all the commits. The developer's only copy has been deleted. `repo.git` should still contain all the commit objects, because `push -f` only forced the `master` label to move. But is it possible to find them after cloning the repo again?
λ git clone repo.git repo
λ cd repo
λ git log
commit a2f0bf377ac1ac57f3c58a70ca229912ddf6f20e (HEAD -> master, origin/master, origin/HEAD)
Author: Wojciech Frącz <test@gmail.com>
Date: Fri Apr 27 20:54:26 2018 +0200
A project were here haha
The `master` branch indeed contains only the one empty commit. Let's see what's in Git's database:
λ ls .git/objects
14/ 77/ 82/ 85/ 8c/ 9c/ a2/ b0/ e2/ info/ pack/
It seems there are more objects than there should be for a single commit. Let's see what they are.
λ git cat-file --batch-check --batch-all-objects
14fbddbeba88315d645e02aaf9d09b842285acbc tree 33
77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f commit 247
821c92b2726661eb88dfa64fa569ab1dbabd3309 commit 248
85f127471a40127d4207df5a89968c07c0b47ae1 tree 33
8cbd46332fa568feeec1f98058fdaf9614b8c304 blob 36
9c9016c31def8d882871761376869aea2a4d058d blob 24
a26aa380ab24aee257be62c5a65c04d5fcbeb1b1 tree 33
a2f0bf377ac1ac57f3c58a70ca229912ddf6f20e commit 207
b0393ccdcde08ebc16093b45cab53be7e5e0df92 commit 199
e2ef58fc757b94d7a4ddca1459fc4b858f119d53 blob 12
Looks like there are 4 commit objects (three with work and one "erasing"), 3 blobs containing the `A.txt` contents and 3 trees specifying the `A.txt` path. The fourth, erasing commit added no tree or blob of its own (every commit points to a tree, so it must reuse one of the existing three). Perfect.
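To narrow such a listing down to commits only, the output of `git cat-file` can be filtered on its second column. The following is a minimal sketch run against a throwaway repository; the directory name `demo` and the identity are placeholders, not part of the setup above.

```shell
#!/bin/sh
set -eu

# Throwaway repository (hypothetical name) so the sketch is self-contained.
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

echo "Commit 1" > A.txt
git add A.txt
git commit -q -m "The first commit"
echo "Commit 2" >> A.txt
git commit -q -am "The second commit"

# --batch-check prints "<sha> <type> <size>"; keep only the commit objects.
commits=$(git cat-file --batch-check --batch-all-objects |
    awk '$2 == "commit" { print $1 }')

# Show date and subject for each commit object, reachable or not.
for sha in $commits; do
    git show -s --format='%h %ci %s' "$sha"
done

commit_count=$(printf '%s\n' "$commits" | wc -l | tr -d ' ')
echo "commit objects: $commit_count"
```

Because the listing comes from the object database and not from any branch, it also includes commits that no ref points to any more.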
It's clear now that no commits have been deleted at all and that they were cloned along with the repo. But how do you find the most recent commit?
Let's try to find dangling commits in the repo. There should be only one dangling commit: the last commit on the `master` branch as it was before the removal. Let's use the powerful `git fsck` for this:
λ git fsck --lost-found
Checking object directories: 100% (256/256), done.
dangling commit 77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f
Found you?
λ git show 77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f
commit 77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f
Author: Wojciech Frącz <test@gmail.com>
Date: Fri Apr 27 20:50:09 2018 +0200
The third commit
diff --git a/A.txt b/A.txt
index 9c9016c..8cbd463 100644
--- a/A.txt
+++ b/A.txt
@@ -1,2 +1,3 @@
"Commit 1"
"Commit 2"
+"Commit 3"
Oh yes. So let's fix the remote (after banning the nasty dev).
λ git branch found 77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f
λ git checkout found
λ git log
commit 77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f (HEAD -> found)
Author: Wojciech Frącz <test@gmail.com>
Date: Fri Apr 27 20:50:09 2018 +0200
The third commit
commit 821c92b2726661eb88dfa64fa569ab1dbabd3309
Author: Wojciech Frącz <test@gmail.com>
Date: Fri Apr 27 20:50:07 2018 +0200
The second commit
commit b0393ccdcde08ebc16093b45cab53be7e5e0df92
Author: Wojciech Frącz <test@gmail.com>
Date: Fri Apr 27 20:49:54 2018 +0200
The first commit
λ git push -f origin HEAD:master
λ git checkout master
λ git reset --hard origin/master
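If several branches were deleted, `git fsck` reports one dangling commit per lost branch tip, and you have to pick the right one. A rough sketch of one way to do it, sorting the dangling commits by committer date; the helper `commit_at`, the branch names and the fixed dates are made up for this example, and the commits are empty to keep it short.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity
git symbolic-ref HEAD refs/heads/master   # fix the initial branch name

# Hypothetical helper: empty commit with a fixed date, so the ordering is deterministic.
commit_at() {
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
        git commit -q --allow-empty -m "$2"
}

commit_at "2018-04-27 20:49:54 +0200" "The first commit"
commit_at "2018-04-27 20:50:07 +0200" "The second commit"
git branch side
commit_at "2018-04-27 20:50:09 +0200" "The third commit"
third=$(git rev-parse HEAD)
git checkout -q side
commit_at "2018-04-27 20:50:08 +0200" "A side commit"

# Drop every ref to the old history, as in the scenario.
git checkout -q --orphan angry
commit_at "2018-04-27 20:54:26 +0200" "The project was here haha"
git branch -q -D master side

# --no-reflogs is needed only because this scratch repo still has reflogs
# that keep the old commits reachable; a fresh clone has none for them.
newest=$(git fsck --no-reflogs 2>/dev/null |
    awk '$1 == "dangling" && $2 == "commit" { print $3 }' |
    while read -r sha; do
        git show -s --format='%ct %H' "$sha"
    done |
    sort -rn | head -n 1 | cut -d ' ' -f 2)

# Re-attach the newest dangling commit to a branch.
git branch found "$newest"
git log --format='%h %s' found
```

Sorting by date is only a heuristic: commit dates can be set arbitrarily, so it is worth inspecting each candidate with `git show` before force-pushing anything.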
Sadly, a "real world example" showed me that a repo cloned over the network does not contain the dangling commits from the remote; the clone above picked them up only because it was made on the same filesystem. So the only way to retrieve them is to run the above commands directly on the Git server's command line. Do it before the gc prune period elapses (2 weeks by default). See below for more information about GC.
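The difference between a same-filesystem clone and a clone that goes through the regular transport can be reproduced on one machine with `git clone --no-local`. The sketch below is an illustration under assumptions, not the original experiment: the directory names, the identity and the `has_commit` helper are hypothetical. It also shows the server-side recovery, run against the bare repository itself.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# "Server" side: a bare repository, as in the walkthrough.
git init -q --bare repo.git
git -C repo.git symbolic-ref HEAD refs/heads/master

# Developer side: one real commit, then the force-pushed orphan commit.
git clone -q repo.git dev 2>/dev/null
cd dev
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity
git symbolic-ref HEAD refs/heads/master
echo "Commit 1" > A.txt
git add A.txt
git commit -q -m "The first commit"
lost=$(git rev-parse HEAD)
git push -q origin master
git checkout -q --orphan angry
git commit -q --allow-empty -m "The project was here haha"
git push -q -f origin HEAD:master
cd ..

# A plain path clone copies (or hardlinks) the whole object directory...
git clone -q repo.git local-clone
# ...while --no-local forces the regular transport, as over SSH/HTTP.
git clone -q --no-local repo.git transport-clone

# Hypothetical helper: does the given clone contain the lost commit?
has_commit() {
    if git -C "$1" cat-file -e "$lost^{commit}" 2>/dev/null; then
        echo yes
    else
        echo no
    fi
}
local_has=$(has_commit local-clone)
transport_has=$(has_commit transport-clone)
echo "local clone has the lost commit:     $local_has"
echo "transport clone has the lost commit: $transport_has"

# Server-side recovery: find the dangling commit in the bare repo
# and re-attach it to a branch so that it can be fetched again.
found=$(git -C repo.git fsck 2>/dev/null |
    awk '$1 == "dangling" && $2 == "commit" { print $3 }' | head -n 1)
git -C repo.git branch recovered "$found"
```

Once the `recovered` branch exists on the server, any ordinary clone or fetch transfers the old history again, because the commits are reachable from a ref.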
Some hosting services, e.g. GitHub, offer APIs that allow you to inspect dangling commits on the remote. If the repository server also supports PRs, the most recent commit should be referenced in one of the recently closed PRs, as suggested here.
Surprisingly, manually running `git gc` in the repo does not delete the orphaned commit objects either. They are packed (so there are no more easy-to-inspect `.git/objects/XX` directories), but the commits are still there:
λ git gc
Counting objects: 3, done.
Writing objects: 100% (3/3), done.
Total 3 (delta 0), reused 0 (delta 0)
λ ls .git\objects\
info/ pack/
λ git fsck --lost-found
Checking object directories: 100% (256/256), done.
Checking objects: 100% (3/3), done.
dangling commit 77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f
You have to pass `--prune=now` to force Git to really delete them:
λ git gc --prune=now
Counting objects: 3, done.
Writing objects: 100% (3/3), done.
Total 3 (delta 0), reused 3 (delta 0)
λ git fsck --lost-found
Checking object directories: 100% (256/256), done.
Checking objects: 100% (3/3), done.
If you do this in the remote repo, the work is really gone.
According to the docs, a plain `git gc` removes dangling commits only once they are more than 2 weeks old (the default of `gc.pruneExpire`).
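The two `gc` behaviours can be compared on a scratch bare repository. This is a sketch under assumptions (hypothetical directory names, placeholder identity, a made-up `exists` helper) that recreates the force-push and then checks whether the lost commit survives each kind of `gc`.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Recreate the situation: a bare "server" repo with one orphaned commit.
git init -q --bare repo.git
git -C repo.git symbolic-ref HEAD refs/heads/master
git clone -q repo.git dev 2>/dev/null
cd dev
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity
git symbolic-ref HEAD refs/heads/master
echo "Commit 1" > A.txt
git add A.txt
git commit -q -m "The first commit"
lost=$(git rev-parse HEAD)
git push -q origin master
git checkout -q --orphan angry
git commit -q --allow-empty -m "The project was here haha"
git push -q -f origin HEAD:master
cd ..

# Hypothetical helper: is the lost commit still in the server's object database?
exists() {
    if git -C repo.git cat-file -e "$lost^{commit}" 2>/dev/null; then
        echo yes
    else
        echo no
    fi
}

# Plain gc: unreachable objects younger than the prune expiry are kept.
git -C repo.git gc -q 2>/dev/null
after_plain_gc=$(exists)

# gc with --prune=now: unreachable objects are dropped regardless of age.
git -C repo.git gc -q --prune=now 2>/dev/null
after_prune_now=$(exists)

echo "after plain gc:       $after_plain_gc"
echo "after gc --prune=now: $after_prune_now"
```

The check is done on the commit object only, because the orphan commit in this sketch still references the same tree and blob.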
- Does `git clone`ing such a repo also clone orphaned Git objects? Yes, if cloning on the same filesystem. No, if cloning over SSH/HTTP.
- How do you find the latest commit in the repo without knowing any branch? Provided that you are looking for an intentionally deleted branch, look for dangling commits with `git fsck --lost-found`.
- Are there any other ways for the developer to delete the commit objects without access to the physical `.git` remote repository? Not that I'm aware of. He would need direct access to the command line of the remote's OS to run `git gc --prune=now` and delete the unwanted commits immediately. No GUI that I know of supports this.
Try `git reflog` locally; it should show you the previous repository states (as seen locally). Once you have found an interesting entry, just `git checkout HEAD@{NUMBER}`, replacing `NUMBER` with the index of the reflog entry you found.
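A small sketch of the reflog approach, assuming the commit was lost in a repository you still have locally (here simulated with `git reset --hard`); the repository name, identity and branch name `found` are placeholders.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

echo "Commit 1" > A.txt
git add A.txt
git commit -q -m "The first commit"
echo "Commit 2" >> A.txt
git commit -q -am "The second commit"
second=$(git rev-parse HEAD)

# Simulate losing the latest commit locally.
git reset -q --hard HEAD~1

# The reflog still lists the previous positions of HEAD.
git --no-pager reflog

# HEAD@{1} is where HEAD pointed before the reset; give it a branch name.
git branch found 'HEAD@{1}'
git log --format='%h %s' found
```

Creating a branch is an alternative to `git checkout HEAD@{NUMBER}` that avoids ending up on a detached HEAD. The reflog is purely local, so this helps only on a machine whose clone actually saw the lost commits.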