@goodmami
Last active February 23, 2023 17:39
Converting ACE's Subversion repository to Git

Converting ACE from Subversion to Git

The ace-svn-to-git.sh script will use git-svn to convert ACE's Subversion repository to Git with the --stdlayout flag so the trunk, tags, and branches are handled mostly as expected (more below). The --prefix=svn/ option puts all of those tags and branches under the svn reference namespace, and the --authors-file option maps the Subversion author names to the current GitHub profiles of the three authors in ACE's history.
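Since a malformed authors file only shows up once the clone is under way, it may be worth checking the file first. The following is a minimal sketch, assuming the usual one-mapping-per-line shape `svnuser = Full Name <email>`; the function name `check_authors` is hypothetical and not part of this gist.

```shell
# Hypothetical helper, not part of the gist: succeed only if every line of
# the given authors file looks like "svnuser = Full Name <email>".
# Lines that do not match are printed to stderr.
check_authors() {
    [ -r "$1" ] || return 2          # missing or unreadable file
    if grep -Ev '^[^=]+ = .+ <.*>$' "$1" >&2; then
        return 1                     # grep printed at least one bad line
    fi
    return 0
}

# Usage (assumes authors.txt is in the current directory):
#   check_authors authors.txt && echo "authors.txt looks fine"
```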

Initial Conversion

Just run:

./ace-svn-to-git.sh

Note that it will create an authors.txt file in the current directory (overwriting any existing file with the same name) and it will create the repository in an ace/ subdirectory.

Subsequent Cleanup

The conversion works mostly as expected, but git-svn imports Subversion's tags as remote branches rather than as Git tags. This is because a tag in Subversion is just a copy, like a branch, and can still be modified. The script assumes that the tags will never be modified: it converts each one to a Git tag and then deletes the corresponding remote (tag) branch.
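The script places each Git tag on the parent of the tag branch's tip (`$REF^`), on the assumption that the tip commit only records the creation of the tag. One way to test that assumption is to compare each tip with its parent before the tag branches are deleted. This is a sketch only; `check_tag_tips` is a hypothetical name, and it assumes the tag branches still exist under refs/remotes/svn/tags.

```shell
# Sketch: for each imported tag branch, report whether its tip commit
# changes any files relative to its parent (the commit the script tags).
check_tag_tips() {
    git for-each-ref --format='%(refname)' refs/remotes/svn/tags |
    while read -r ref; do
        # git diff --quiet exits 0 only when the two commits have the same content
        if git diff --quiet "$ref^" "$ref"; then
            echo "ok: ${ref##*/} adds no changes over its parent"
        else
            echo "differs: ${ref##*/} modifies files relative to its parent"
        fi
    done
}
```

Any tag reported as "differs" was probably modified after it was created (or has no parent at all), so tagging its parent would drop those changes.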

The trunk remote branch is then created as a regular Git branch so it can be checked out.

Finally, the master branch is renamed to main, following the current convention.
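If the cleanup might be run more than once, the rename can be guarded so that it only happens while the current branch is still called master. A minimal sketch; the function name is hypothetical:

```shell
# Sketch: rename the current branch to main only if it is still named master,
# so that running the cleanup a second time does nothing.
rename_master_to_main() {
    if [ "$(git symbolic-ref --short HEAD)" = "master" ]; then
        git branch -m main
    fi
}
```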

Pushing to GitHub

Once everything looks good, push it to GitHub by setting the remote and pushing all branches. Note that git push --all pushes only local branches (refs under refs/heads/), so remote (svn/) branches that have not been converted to Git branches are left behind. It does not push tags either; git push --tags takes care of those.

git remote add origin https://github.com/delph-in/ace.git
git push -u origin --all
git push -u origin --tags
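If other Subversion branches should also end up on GitHub, they need local counterparts before pushing. The following is a sketch rather than part of the conversion script: it assumes the refs live under refs/remotes/svn/ (as produced by --prefix=svn/), and `localize_svn_branches` is a hypothetical name.

```shell
# Sketch: create a local branch for each remaining refs/remotes/svn/* ref so
# that `git push --all` includes it. Skips trunk (the script creates that
# branch itself), tag branches, and git-svn's "name@revision" refs.
localize_svn_branches() {
    git for-each-ref --format='%(refname)' refs/remotes/svn |
    while read -r ref; do
        name="${ref#refs/remotes/svn/}"
        case "$name" in
            trunk|tags/*|*@*) continue ;;
        esac
        # leave existing local branches untouched; --no-track avoids any
        # attempt to configure an upstream for the new branch
        git show-ref --verify --quiet "refs/heads/$name" ||
            git branch --no-track "$name" "$ref"
    done
}
```

Run it inside the converted repository before git push -u origin --all.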

Syncing to Subversion

The trunk branch should never be committed to directly. Instead, it should be kept up-to-date with the Subversion repository as follows:

git checkout trunk
git svn rebase
git checkout main  # don't remain on trunk to avoid accidental commits

The main branch can then merge in commits from trunk, e.g., when a new release is ready.

Note that the above steps only work if the local repository has the relevant SVN metadata (e.g., if it was the one that originally cloned from SVN). If not, the update.sh script in this gist will attempt to reconfigure Git to know about the SVN repo and rebuild the metadata. This should also work from a fresh clone from GitHub.
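A quick way to tell which situation applies is to check whether the repository has an svn-remote URL configured, since that is one of the settings update.sh writes. This is only a rough sketch: the function name is hypothetical, and a configured URL does not by itself guarantee that the rest of the metadata is intact.

```shell
# Sketch: succeed only if this repository has an svn-remote URL configured,
# which `git svn rebase` needs in order to know where to fetch from.
has_svn_remote() {
    git config --get svn-remote.svn.url >/dev/null
}

if has_svn_remote; then
    echo "svn-remote configured: $(git config --get svn-remote.svn.url)"
else
    echo "no svn-remote configured; run update.sh from the trunk branch first"
fi
```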

Notes

The GitHub Importer tool does a great job of converting a Subversion repository to a conventional-looking Git repository. It does not, however, keep the metadata needed to sync the Git repo with subsequent changes to the Subversion repo. Part of this metadata is a git-svn-id: line appended to every commit message imported from Subversion, and another part lives in the .git/svn/ subdirectory.
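To make the commit-message half of that metadata concrete, the sketch below pulls the Subversion revision number out of a git-svn-id: line. It assumes the usual `git-svn-id: <url>@<revision> <uuid>` layout; the revision number and UUID in the sample message are made up, and `svn_revision` is a hypothetical helper.

```shell
# Sketch: read a commit message on stdin and print the Subversion revision
# recorded in its git-svn-id line, if there is one.
svn_revision() {
    sed -n 's/^git-svn-id: .*@\([0-9][0-9]*\) .*$/\1/p'
}

# Sample message; the revision number and UUID here are made up.
printf '%s\n' \
    'Fix a typo' \
    '' \
    'git-svn-id: http://sweaglesw.org/svn/ace/trunk@1234 00000000-0000-0000-0000-000000000000' |
    svn_revision   # prints 1234
```

In a converted repository the same helper can be fed from `git log -1 --format=%B`.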

ace-svn-to-git.sh

#!/bin/bash
cat <<EOF > authors.txt
sweaglesw = Woodley Packard <[email protected]>
oe = Stephan Oepen <[email protected]>
djwong = Darrick Wong <[email protected]>
(no author) = (no author) <(no author)>
EOF
git svn clone http://sweaglesw.org/svn/ace --stdlayout --prefix=svn/ --authors-file=authors.txt
# change to cloned repo directory
pushd ace
# convert SVN tag branches to Git tags, then delete the branch
# thanks: https://stackoverflow.com/a/14800155/1441112
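git for-each-ref --format="%(refname:short) %(objectname)" refs/remotes/svn/tags \
  | while read -r BRANCH REF
do
    # the tag name is the last path component of the ref name
    TAG_NAME="${BRANCH##*/}"
    # reuse the SVN tag commit's message as the tag annotation
    BODY="$(git log -1 --format=format:%B "$REF")"
    echo "ref=$REF parent=$(git rev-parse "$REF^") tagname=$TAG_NAME body=$BODY" >&2
    # tag the parent: the tip commit only records the creation of tags/...
    git tag -a -m "$BODY" "$TAG_NAME" "$REF^" &&
        git branch -r -d "$BRANCH"
done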
# rename master to main
git branch -m main
# create trunk branch
git branch trunk refs/remotes/svn/trunk
popd
rm authors.txt
echo "Now: create an empty repository on GitHub (e.g., delph-in/ace)"
echo "Run the following:"
echo " git remote add origin https://github.com/delph-in/ace.git"
echo " git push -u origin --all"
echo " git push -u origin --tags"

update.sh

#!/bin/bash
if [ "$( git branch --show-current )" != "trunk" ]; then
    echo "Run from the trunk branch: git checkout trunk"
    exit 1
fi
# ensure the trunk remote ref points to the head of the trunk branch
git update-ref refs/remotes/svn/trunk refs/heads/trunk
# ensure the config is set appropriately
git config --local svn-remote.svn.url http://sweaglesw.org/svn/ace
git config --local svn-remote.svn.fetch trunk:refs/remotes/svn/trunk
# quote the refspecs so the shell cannot expand the * globs
git config --local svn-remote.svn.branches 'branches/*:refs/remotes/svn/*'
git config --local svn-remote.svn.tags 'tags/*:refs/remotes/svn/tags/*'
# create the authors file and set its config path
if [ ! -f ./authors.txt ]; then
cat <<EOF > ./authors.txt
sweaglesw = Woodley Packard <[email protected]>
EOF
fi
git config --local svn.authorsfile ./authors.txt
# fetch and rebase from SVN
git svn rebase
echo "Done. Don't forget to switch out of the trunk branch to avoid accidental commits."
@arademaker

Hi @goodmami, in the script above (https://gist.github.com/goodmami/b2e70fe2fd47fb92bb27576d8c59f758#file-ace-svn-to-git-sh-L39), after the git push -u origin --all...

In my local machine I see the branches from ERG. The ones that I got from the --stdlayout --prefix=svn/ ... But they were not pushed to the GitHub https://github.com/arademaker/erg/branches/all

% git branch -a
* main
  remotes/origin/main
  remotes/svn/bd
  remotes/svn/dp
  remotes/svn/efficiency
  remotes/svn/lp
  remotes/svn/mo
  remotes/svn/mo@28390
  remotes/svn/trunk

@arademaker

https://www.gitkraken.com/blog/migrating-git-svn

So the remote branches need to be converted to local branches before the push! Fine.

But I still have a question:

https://gist.github.com/goodmami/b2e70fe2fd47fb92bb27576d8c59f758#file-ace-svn-to-git-sh-L23 why we need to REF^? How can I double check with the SVN tags?

@goodmami
Author

@arademaker I'm not sure I understand the issue. As described in the Subsequent Cleanup section of the README, SVN tags are imported as Git branches, so for ACE I convert them to Git tags (assuming the SVN tags won't be modified further) and delete the branches. You might do something similar for the ERG.

why we need to REF^?

Actually it's $REF^. REF is a variable created above like this:

$ git for-each-ref --format="%(refname:short) %(objectname)" refs/remotes/svn/tags \
  | while read BRANCH REF

So REF is the objectname, or commit hash, of a ref under refs/remotes/svn/tags. The ^ after dereferencing $REF is Git syntax to get the parent commit. I think this is because the import creates a commit just for the branch and I want the last contentful commit, which is the previous one.

In my local machine I see the branches from ERG. The ones that I got from the --stdlayout --prefix=svn/ ... But they were not pushed to the GitHub https://github.com/arademaker/erg/branches/all

Unfortunately I cannot see your repo. Maybe it's private.

In any case, and I apologize if this is obvious, but did you run the git push commands yourself? Note that the script only echos out the commands you are supposed to run yourself.

And finally, is GitHub's SVN importer not sufficient? If you're moving to GitHub and don't want to try to continually keep an SVN and Git repo in sync, then maybe we don't need something so complicated?

@goodmami
Author

@arademaker For issues specifically about importing the ERG, please continue at delph-in/erg#40.

@arademaker

The ^ after dereferencing $REF is Git syntax to get the parent commit. I think this is because the import creates a commit just for the branch and I want the last contentful commit, which is the previous one.

how can we check if the REF^ is pointing to the right object? You said “I think..”

@arademaker

did you run the git push commands yourself? Note that the script only echos out the commands you are supposed to run yourself.

Of course I executed the push. But even so, remote branches need first to become local before pushing them. I have shared the repo with you. I am just trying to get one extra review to ensure I am not missing anything.

@arademaker

And finally, is GitHub's SVN importer not sufficient?

I tried twice. The ERG repo has some complications: the profiles make it really huge and the SVN server times out. Dan agreed to separate them from the grammar code.

@goodmami
Author

how can we check if the REF^ is pointing to the right object? You said “I think..”

When someone creates a tag in Subversion (which is done with svn copy into tags/, as there is no separate tag command), it creates a commit that adds the new tags/... subdirectory. In Git, we don't need that commit, because what we want is merely the commit it is based on. If you want to have those commits, then by all means keep them. I removed them because I wanted the repo to follow Git conventions, where a tag is simply a pointer to the commit we want tagged and not a commit itself. This was explained in the link in the script comment above: https://stackoverflow.com/a/14800155/1441112

Of course I executed the push. But even so, remote branches need first to become local before pushing them.

The clone creates a lot of branches that aren't necessary in a regular Git repo. E.g., if the SVN repo has 10 branches, you might have 20+ branches created in the Git repo converted from SVN, with some just for tracking the upstream SVN repo, I think. The git-svn docs will explain it better: https://git-scm.com/docs/git-svn.
