@piki
Last active November 19, 2024 17:21
Script to push (mirror) a large repo to GitHub
#!/bin/bash -ex
#
# Push the current repository to GitHub, in small enough chunks that it
# won't exceed the pack-size limit

# Commit to start with, counting from the oldest. If the process fails,
# you can change this variable to restart from where it failed.
START_COMMIT=1000

# Number of commits to push at a time, counting from the oldest. If a
# push fails because the pack file is too big, try using a smaller number.
COMMIT_STEP=1000

# List every commit on the current branch, oldest first, one hash per line.
git log --reverse --pretty=%H > commits
# Arithmetic expansion strips the padding that some wc versions add.
COMMIT_COUNT=$(($(wc -l < commits)))

for i in $(seq "$START_COMMIT" "$COMMIT_STEP" "$COMMIT_COUNT"); do
  echo "====== $i"
  # The i-th oldest commit is the tip of this batch.
  COMMIT=$(sed -n "${i}p" commits)
  # Move the temporary tag to the batch tip and push it.
  git tag -d foo || true
  git tag foo "$COMMIT"
  git push -f origin foo
done

# The tag does not exist if the loop never ran (fewer commits than START_COMMIT).
git tag -d foo || true
git push origin HEAD
git push --mirror
@richban

richban commented Oct 4, 2022

Thanks for this script. Is my understanding correct that the for loop pushes all the commits of the currently checked-out branch, and that git push --mirror then pushes all the remaining git objects?

@piki
Author

piki commented Oct 4, 2022

Mostly right. It pushes all the commits from the current branch, in batches, under the tag foo. Then git push origin HEAD creates the default branch on the server, which should just push the ref, no objects, since all the objects already got pushed. Then git push --mirror pushes all other branches, including any objects that are only in those branches, not the checked-out branch.
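
To make the batching concrete, here is a minimal sketch of which commit positions the loop tags and pushes. It assumes the same meaning for START_COMMIT, COMMIT_STEP, and the total commit count as in the script; the function name batch_tips is hypothetical and not part of the script.

```shell
# Hypothetical helper: print the position (counting from the oldest commit)
# of each batch tip, given a start position, a step, and the total count.
batch_tips() {
  start=$1 step=$2 count=$3
  i=$start
  while [ "$i" -le "$count" ]; do
    echo "$i"
    i=$((i + step))
  done
}

# Example: a branch with 3500 commits, pushed 1000 at a time.
batch_tips 1000 1000 3500
```

With 3500 commits this prints 1000, 2000, and 3000. Commits 3001 through 3500 are not covered by the loop, so in that case the later git push origin HEAD would carry them along with the ref.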

I think I've only tested it with main checked out, but it should work OK with another branch or even a detached head. ymmv.

It could fail if you have an old branch checked out and main has > 100MB of objects that are not on the old branch. Less likely, it could fail if any of your other branches have > 100MB total of objects that aren't found somewhere on main.
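
If you want a rough idea of how much data lives only outside the checked-out branch before running the script, something like the sketch below may help. It is an estimate, not what the script itself does: it sums the on-disk sizes of objects reachable from any ref but not from HEAD, and on-disk sizes of packed objects can understate what a push would send. The function name bytes_outside_head is hypothetical.

```shell
# Hypothetical helper: approximate bytes of objects reachable from any ref
# but not from HEAD. Run it inside the repository you plan to push.
bytes_outside_head() {
  git rev-list --objects --all --not HEAD |
    cut -d' ' -f1 |
    git cat-file --batch-check='%(objectsize:disk)' |
    awk '{ total += $1 } END { print total + 0 }'
}
```

Calling bytes_outside_head prints a single number of bytes; 0 means every object is already reachable from the checked-out branch.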

@richban

richban commented Oct 4, 2022

Thanks for the clarification!

I have just migrated a 10GB repo from GitLab to GitHub. First I had issues with the 100MB size limit, which I fixed with git lfs. Then I ran into the issue with pushes >2GB, but with your script it worked like a charm ;)

@nkitagawa-venn

Thank you for this @piki ! I was able to use this to mirror a large repo with a long commit history.

FYI: I did notice what appears to be a small bug in the script - it appears that lines 8 and 12 are reversed w.r.t. their comments. (I saw this because I did have to tune the commit step size.)

@piki
Author

piki commented Mar 3, 2024

Good catch, @nkitagawa-venn. Fixed it!

@mejuliver

mejuliver commented Nov 19, 2024

I get this error, any idea? I'm trying to push a 2.6GB repo to an empty repository that belongs to a private org on the Team plan.

error: tag 'foo' not found.
fatal: --mirror can't be combined with refspecs
Enumerating objects: 17288, done.
Counting objects: 100% (17288/17288), done.
Delta compression using up to 12 threads
Compressing objects: 100% (11057/11057), done.
Writing objects: 100% (17288/17288), 2.48 GiB | 185.90 MiB/s, done.
Total 17288 (delta 5173), reused 17288 (delta 5173), pack-reused 0 (from 0)
error: RPC failed; HTTP 500 curl 92 HTTP/2 stream 5 was not closed cleanly: CANCEL (err 8)
send-pack: unexpected disconnect while reading sideband packet
fatal: the remote end hung up unexpectedly
Everything up-to-date

@piki
Author

piki commented Nov 19, 2024

@mejuliver Is the repository you're pushing a mirror of something else? You can run git config remote.origin.mirror to find out. If it's true, you need to run git config --unset remote.origin.mirror to set it not to be.

You don't seem to be running the script with bash -ex that's in the #! line. If you do, it will show each command as it runs and will stop at the first error, which is also helpful for debugging.
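
As a minimal sketch of that check, the two git config commands can be wrapped so the flag is only unset when it is actually true. The function name clear_mirror_flag is hypothetical, and it assumes the remote is called origin unless you pass another name.

```shell
# Hypothetical helper: unset remote.<name>.mirror if it is currently "true".
# Run it inside the repository you are pushing from.
clear_mirror_flag() {
  remote=${1:-origin}
  if [ "$(git config --get "remote.$remote.mirror")" = "true" ]; then
    git config --unset "remote.$remote.mirror"
    echo "unset remote.$remote.mirror"
  else
    echo "remote.$remote.mirror is not set to true; nothing to do"
  fi
}
```

Run clear_mirror_flag (or clear_mirror_flag some-other-remote) before re-running the script with bash -ex.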

@mejuliver

mejuliver commented Nov 19, 2024

Running git config remote.origin.mirror returns true, so it's a mirror repo.
