@fracz
Last active May 2, 2023 07:25
Restore intentionally deleted commits in Git

Restore intentionally deleted commits in Git (remote)

Situation

  1. You own a Git repository server and the developers do not have access to it (i.e. they can only read & write to the repo, but not gc it).
  2. You had a developer that wrote a project for you.
  3. He got angry for whatever reason and deleted all branches from the remote repo. He also push -fed the master branch leaving only one silly commit there.
  4. He fled the country, leaving you without any code at all (at least that is what he believes).
  5. You have never cloned the repo to another machine. There were only two copies of it: the developer's one and the server's one.

You want to recover the work deleted by the developer.

Questions

  1. Does git cloning such a repo also clone orphaned Git objects?
  2. How do you find the latest commit in the repo without knowing any branch?
  3. Are there any other ways for the developer to actually delete the commit objects without access to the physical .git remote repository or to the command line of the server that hosts the repo?

Environment

Git 2.15

Problem simulation

# you create a repo on the remote, probably through some GUI
git init --bare repo.git
# the developer clones it
git clone repo.git repo
# and works...
cd repo
echo "Commit 1" > A.txt
git add A.txt
git commit -m "The first commit"
echo "Commit 2" >> A.txt
git commit -am "The second commit"
echo "Commit 3" >> A.txt
git commit -am "The third commit"
# he shares the work with your remote
git push -u origin master
# but then he gets angry and tries to delete all of the commits
git checkout --orphan angry
git commit --allow-empty -m "The project was here haha"
git push -f origin HEAD:master
cd ..
rm -fr repo

Now repo.git is a bare (remote) repository that once contained all the commits. The only developer copy has been deleted. repo.git should still contain all the commit objects, because push -f only forced the master label to move. But is it possible to find them after cloning the repo again?

Cloning & investigating the repo

λ git clone repo.git repo
λ cd repo
λ git log
commit a2f0bf377ac1ac57f3c58a70ca229912ddf6f20e (HEAD -> master, origin/master, origin/HEAD)
Author: Wojciech Frącz <test@gmail.com>
Date:   Fri Apr 27 20:54:26 2018 +0200

    A project were here haha

The master branch indeed contains only that one commit. Let's see what's in Git's database:

λ ls .git/objects
14/  77/  82/  85/  8c/  9c/  a2/  b0/  e2/  info/  pack/

Seems like there are more objects than there should be for a single commit. Let's see what they are.

λ git cat-file --batch-check --batch-all-objects
14fbddbeba88315d645e02aaf9d09b842285acbc tree 33
77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f commit 247
821c92b2726661eb88dfa64fa569ab1dbabd3309 commit 248
85f127471a40127d4207df5a89968c07c0b47ae1 tree 33
8cbd46332fa568feeec1f98058fdaf9614b8c304 blob 36
9c9016c31def8d882871761376869aea2a4d058d blob 24
a26aa380ab24aee257be62c5a65c04d5fcbeb1b1 tree 33
a2f0bf377ac1ac57f3c58a70ca229912ddf6f20e commit 207
b0393ccdcde08ebc16093b45cab53be7e5e0df92 commit 199
e2ef58fc757b94d7a4ddca1459fc4b858f119d53 blob 12

Looks like there are 4 commit objects (three with work and one "erasing"), 3 blobs holding the successive contents of A.txt and 3 trees recording the A.txt path. The fourth, erasing commit added no tree or blob of its own: git checkout --orphan keeps the index, so that commit simply points at the same tree as the third one. Perfect.
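The object count can be cross-checked by comparing the tree of an "erasing" commit with the tree of the last real commit. This is a minimal sketch, assuming a throwaway repository under mktemp -d; the directory name work and the variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch: show that a commit made right after `git checkout --orphan`
# reuses the tree of the commit that was checked out before.
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

demo=$(mktemp -d)                 # throwaway location (assumption)
cd "$demo"
git init -q work
cd work
echo "Commit 1" > A.txt
git add A.txt
git commit -q -m "The first commit"
last_real=$(git rev-parse HEAD)

# --orphan starts a new, parentless history but leaves A.txt staged
git checkout -q --orphan angry
git commit -q --allow-empty -m "erasing commit"
erasing=$(git rev-parse HEAD)

real_tree=$(git rev-parse "$last_real^{tree}")
erasing_tree=$(git rev-parse "$erasing^{tree}")
echo "real tree:    $real_tree"
echo "erasing tree: $erasing_tree"

# The raw commit object has a "tree" line but no "parent" line
git cat-file -p "$erasing"
```

Both tree IDs should come out identical, which is why the object listing shows three trees rather than four.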

It's clear now that no commits have been deleted at all and that they are cloned along with the repo. But how do you find the most recent commit?

Let's try to find dangling commits in the repo. There should be only one dangling commit: the last commit on the master branch as it was before the removal. Let's use the powerful git fsck for this:

λ git fsck --lost-found
Checking object directories: 100% (256/256), done.
dangling commit 77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f

Found you?

λ git show 77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f
commit 77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f
Author: Wojciech Frącz <test@gmail.com>
Date:   Fri Apr 27 20:50:09 2018 +0200

    The third commit

diff --git a/A.txt b/A.txt
index 9c9016c..8cbd463 100644
--- a/A.txt
+++ b/A.txt
@@ -1,2 +1,3 @@
 "Commit 1"
 "Commit 2"
+"Commit 3"

Oh yes. So let's fix the remote (after banning the nasty dev).

λ git branch found 77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f
λ git checkout found
λ git log
commit 77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f (HEAD -> found)
Author: Wojciech Frącz <test@gmail.com>
Date:   Fri Apr 27 20:50:09 2018 +0200

    The third commit

commit 821c92b2726661eb88dfa64fa569ab1dbabd3309
Author: Wojciech Frącz <test@gmail.com>
Date:   Fri Apr 27 20:50:07 2018 +0200

    The second commit

commit b0393ccdcde08ebc16093b45cab53be7e5e0df92
Author: Wojciech Frącz <test@gmail.com>
Date:   Fri Apr 27 20:49:54 2018 +0200

    The first commit
λ git push -f origin HEAD:master
λ git checkout master
λ git reset --hard origin/master
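If several branches were deleted, git fsck reports one dangling commit per lost branch tip, and you still have to pick the newest. The following is a rough sketch of sorting them by commit date, assuming a scratch repository; the branch names feature-a and feature-b and the fixed timestamps are invented for the demo.

```shell
#!/bin/sh
# Sketch: list dangling commits newest first.
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

demo=$(mktemp -d)                 # throwaway location (assumption)
cd "$demo"
git init -q repo
cd repo
git commit -q --allow-empty -m "base"
main_branch=$(git symbolic-ref --short HEAD)

# Two branches with commits at different (made-up) times
git checkout -q -b feature-a
GIT_AUTHOR_DATE="1524855000 +0200" GIT_COMMITTER_DATE="1524855000 +0200" \
  git commit -q --allow-empty -m "older work"
git checkout -q -b feature-b "$main_branch"
GIT_AUTHOR_DATE="1524856000 +0200" GIT_COMMITTER_DATE="1524856000 +0200" \
  git commit -q --allow-empty -m "newer work"

# Simulate the deletion of both branches
git checkout -q "$main_branch"
git branch -q -D feature-a feature-b

# --no-reflogs: ignore reflog entries, which would otherwise keep the
# commits "reachable" in a working clone
newest_first=$(
  for c in $(git fsck --no-reflogs --dangling 2>/dev/null |
             awk '$1 == "dangling" && $2 == "commit" { print $3 }'); do
    git show -s --format='%ct %H %s' "$c"   # committer timestamp, hash, subject
  done | sort -rn
)
echo "$newest_first"
```

The first line printed is the most recent lost branch tip; pass its hash to git branch as shown above.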

Cloning over SSH/HTTPS

Sadly, a real-world test showed that a repo cloned over SSH or HTTPS does not contain the dangling commits from the remote. In that case the only way to retrieve them is to run the commands above directly on the Git server's command line. Do it before the gc prune period elapses (2 weeks by default). See below for more information about GC.
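The difference can be reproduced on one machine, because a file:// URL makes Git use the same transfer mechanism as a network clone, whereas a plain path copies the object store. This is a sketch under that assumption; the clone directory names and the has_object helper are hypothetical.

```shell
#!/bin/sh
# Sketch: a plain-path clone keeps dangling commits, a transport clone does not.
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

demo=$(mktemp -d)                 # throwaway location (assumption)
cd "$demo"
git init -q --bare repo.git
git --git-dir=repo.git symbolic-ref HEAD refs/heads/master

# The developer pushes real work, then force-pushes an orphan commit
git clone -q repo.git dev
cd dev
echo "Commit 1" > A.txt
git add A.txt
git commit -q -m "The first commit"
lost=$(git rev-parse HEAD)
git push -q origin HEAD:refs/heads/master
git checkout -q --orphan angry
git commit -q --allow-empty -m "erasing commit"
git push -q -f origin HEAD:refs/heads/master
cd ..

git clone -q repo.git local-clone                     # copies/hard-links objects
git clone -q "file://$demo/repo.git" transport-clone  # transfers reachable objects only

# Hypothetical helper: does repository $1 contain commit $2?
has_object() { git -C "$1" cat-file -e "$2^{commit}" 2>/dev/null; }

for c in local-clone transport-clone; do
  if has_object "$c" "$lost"; then
    echo "$c: lost commit present"
  else
    echo "$c: lost commit missing"
  fi
done
```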

Some hosting services expose APIs that allow inspecting dangling commits on the remote, e.g. GitHub. If the repository server also supports PRs, the most recent commit should be referenced in one of the recently closed PRs, as suggested here.
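On a server you control, the recovery can be done inside the bare repository itself, without cloning. The sketch below simulates the server locally and assumes exactly one dangling commit; the branch name recovered is made up.

```shell
#!/bin/sh
# Sketch: recover a force-pushed-away commit directly in the bare repo.
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

demo=$(mktemp -d)                 # stands in for the server's disk (assumption)
cd "$demo"
git init -q --bare repo.git
git --git-dir=repo.git symbolic-ref HEAD refs/heads/master

# Simulate the developer's work and the destructive force push
git clone -q repo.git dev
cd dev
echo "Commit 1" > A.txt
git add A.txt
git commit -q -m "The first commit"
echo "Commit 2" >> A.txt
git commit -q -am "The second commit"
echo "Commit 3" >> A.txt
git commit -q -am "The third commit"
git push -q origin HEAD:refs/heads/master
git checkout -q --orphan angry
git commit -q --allow-empty -m "erasing commit"
git push -q -f origin HEAD:refs/heads/master
cd ..
rm -rf dev

# "On the server": find the dangling commit and give it a branch name
cd repo.git
lost=$(git fsck --no-reflogs --dangling 2>/dev/null |
       awk '$1 == "dangling" && $2 == "commit" { print $3 }')
git update-ref refs/heads/recovered "$lost"
git log --oneline recovered
```

Once the ref exists, the commits are reachable again, so any client can fetch the recovered branch over SSH or HTTPS and gc will no longer prune them.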

GC

Surprisingly, manually running git gc in the repo does not delete the orphaned commit objects either. They are packed (so the easy-to-inspect .git/objects/XX directories are gone), but the commits are still there:

λ git gc
Counting objects: 3, done.
Writing objects: 100% (3/3), done.
Total 3 (delta 0), reused 0 (delta 0)
λ ls .git\objects\
info/  pack/
λ git fsck --lost-found
Checking object directories: 100% (256/256), done.
Checking objects: 100% (3/3), done.
dangling commit 77bf4d1ec7b29aca81c7a90830b5ae2e3294dc9f

You have to pass --prune=now to force Git to really delete them:

λ git gc --prune=now
Counting objects: 3, done.
Writing objects: 100% (3/3), done.
Total 3 (delta 0), reused 3 (delta 0)
λ git fsck --lost-found
Checking object directories: 100% (256/256), done.
Checking objects: 100% (3/3), done.

If you do this in the remote repo, the work is really gone.

According to the docs, a plain git gc removes unreachable objects only once they are older than two weeks (the default of gc.pruneExpire is 2.weeks.ago).
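While you investigate, you may want to stop the server-side repository from pruning anything. This is a small sketch, assuming a scratch bare repository and an unreferenced blob as a stand-in for the lost commits; it sets gc.pruneExpire to never and disables automatic gc with gc.auto 0.

```shell
#!/bin/sh
# Sketch: keep unreachable objects through gc until you explicitly prune.
set -e
demo=$(mktemp -d)                 # throwaway location (assumption)
cd "$demo"
git init -q --bare repo.git
cd repo.git

# An unreferenced object standing in for the lost commits
blob=$(echo "unreferenced" | git hash-object -w --stdin)

git config gc.pruneExpire never   # plain `git gc` will not prune by age
git config gc.auto 0              # no automatic gc runs

git gc -q
git cat-file -e "$blob" && echo "still there after plain gc"

# An explicit --prune=now overrides the configured expiry
git gc -q --prune=now
git cat-file -e "$blob" 2>/dev/null || echo "gone after gc --prune=now"
```

Remember to restore the settings once the lost commits are attached to a branch again.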

Answers

  1. Does git cloning such a repo also clone orphaned Git objects? Yes, if cloning from a path on the same filesystem. No, if cloning over SSH/HTTP.
  2. How do you find the latest commit in the repo without knowing any branch? Provided that you are looking for an intentionally deleted branch, look for dangling commits with git fsck --lost-found.
  3. Are there any other ways for the developer to actually delete the commit objects without access to the physical .git remote repository? Not that I'm aware of. He would need direct access to the command line of the remote's OS to run git gc --prune=now and delete the unwanted commits immediately. No GUI that I know of supports running it.