This process has one critical flaw (see the disadvantages listed near the end), and you probably don't want to use it. git-svn is simpler.
You need:
svn-all-fast-export (from the Debian/Ubuntu package of the same name; upstream homepage is http://gitorious.org/svn2git, not related to a Ruby tool of the same name)
To avoid segfaults caused by authz filtering, build your own svn-all-fast-export from https://github.com/mgedmin/svn2git. (Four revisions on svn.zope.org are not available to the general public: r129027, r129030, r129031, and r129032. Curiously, they are visible via the ViewCVS web interfaces, where the filtering is apparently not applied.)
a copy of the Subversion repository
Good thing I have one set up, using svnsync:
svnadmin create /stuff/zope-mirror
svnadmin setuuid /stuff/zope-mirror 62d5b8a3-27da-0310-9561-8e5933582275
vi /stuff/zope-mirror/hooks/pre-revprop-change # see the http link above
svnsync init file:///stuff/zope-mirror svn://svn.zope.org/repos/main/
svnsync sync file:///stuff/zope-mirror # repeat last command periodically
It needs about 3.3 gigs of disk space.
a copy of authors.txt that maps svn usernames to real names and emails (ask Tres or Jim; Marius has a copy too but he isn't going to share it without explicit permission of the Zope Foundation)
an empty repository on Github (ask Tres or Stephan or Marius or Jim to create one at https://github.com/zopefoundation)
NB: after creating the repository make sure you go to Settings -> Teams, and add zopefoundation/developers and zopefoundation/administrators. And set up e-mail.
https://github.com/zopefoundation/zope.githubsupport can do all that for you!
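Before converting, it can help to check that authors.txt covers every committer of the package. The following is a minimal sketch, assuming authors.txt uses the git-svn format (`loginname = Real Name <email>`), which is what the `git svn ... -A authors.txt` alternative at the end of this page expects. The function names `svn_log_authors` and `missing_authors` are hypothetical helpers, not part of any tool.

```shell
# Print the svn usernames found in `svn log -q` output (read from stdin),
# one per line, without duplicates.
svn_log_authors() {
    awk -F' [|] ' '/^r[0-9]+ [|] / { print $2 }' | sort -u
}

# Print the usernames read from stdin that have no "loginname = ..." line
# in the authors file given as $1.
missing_authors() {
    awk -F' = ' 'NR == FNR { known[$1] = 1; next } !($0 in known)' "$1" -
}
```

Possible usage against the local mirror: `svn log -q file:///stuff/zope-mirror/zope.dottedname | svn_log_authors | missing_authors authors.txt`. Empty output means every committer has a mapping.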
The conversion process goes like this:
write a rules.txt like this one I used for zope.dottedname:
create repository zope.dottedname
end repository

# feel free to create multiple repositories in one go
# order of matches matters in this file
# trailing slashes in match rules are very important

match /(zope\.dottedname)/trunk/
  repository \1
  branch master
end match

match /(zope\.dottedname)/branches/([^/]+)/
  repository \1
  branch \2
end match

match /(zope\.dottedname)/tags/([^/]+)/
  repository \1
  branch refs/tags/\2
end match

match /
  # ignore all other projects
end match
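To get a feel for how the rules above route svn paths, here is a rough sketch that imitates them with sed. It is an approximation under stated assumptions: sed's extended regexes stand in for the tool's own regex engine, the patterns are assumed to be anchored at the start of the path, and the first matching rule wins. The function name `classify_path` is hypothetical.

```shell
# Print "<repository> <branch>" for an svn path, or "ignored" if only the
# catch-all rule matches. The rules are tried in file order; sed's `t`
# command stops at the first substitution that succeeds.
classify_path() {
    printf '%s\n' "$1" | sed -E \
        -e 's#^/(zope\.dottedname)/trunk/.*#\1 master#; t' \
        -e 's#^/(zope\.dottedname)/branches/([^/]+)/.*#\1 \2#; t' \
        -e 's#^/(zope\.dottedname)/tags/([^/]+)/.*#\1 refs/tags/\2#; t' \
        -e 's#.*#ignored#'
}
```

For example, `classify_path /zope.dottedname/tags/3.4.1/setup.py` prints `zope.dottedname refs/tags/3.4.1`, while a path without the trailing slash after `trunk` falls through to the catch-all rule.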
run svn-all-fast-export --identity-map=authors.txt --rules=rules.txt --stats /path/to/your/zope-svn-mirror
You can also pass --svn-branches for a slightly more accurate conversion (branch merge commits do not go away, even when the diff is empty), if I understand it correctly.
And if you pass --add-metadata-notes, you'll get to see svn path and revno attached to a note on each commit. These are shown by git log.
These notes are easy to lose (git push --all/--tags doesn't push them; git clone doesn't fetch them). Read more about them at http://git-scm.com/2010/08/25/notes.html
The notes are shown on Github like this: https://github.com/zopefoundation/zope.traversing/commit/c10f103#gitnotes
wait a bit
The first time I ran it, it took ~18 wall clock minutes (~4 CPU minutes) and ended in a segfault.
The second run took 12 wall clock minutes (hot disk cache, I suppose) and also ended in a segfault.
Then I discovered that if I don't remove the git repository, svn-all-fast-export will resume the process (a few thousand revisions before it crashed, or maybe that just happened to be the last successfully converted commit before the crash), which is considerably faster than starting from scratch.
I tried to add some min-revision/max-revision based rules to skip the broken commits in my svn mirror, but that didn't fix the segfaults. Luckily, conversion succeeds if I just run the tool twice (without removing intermediate results).
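Since re-running the tool without removing the intermediate results resumes the conversion, the "run it twice" workaround can be wrapped in a small retry loop. This is a minimal sketch; `run_with_retries` is a hypothetical helper, and the retry count is whatever you are comfortable with.

```shell
# Run a command up to $1 times, stopping at the first success.
# A segfault typically shows up in the shell as exit status 139 (128 + SIGSEGV).
run_with_retries() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        else
            status=$?
        fi
        echo "attempt $attempt failed with exit status $status" >&2
        attempt=$((attempt + 1))
    done
    return 1
}
```

Possible usage: `run_with_retries 3 svn-all-fast-export --identity-map=authors.txt --rules=rules.txt --stats /path/to/your/zope-svn-mirror`. Do not delete the partially converted git repository between attempts, or the tool will start from scratch.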
inspect ./zope.dottedname for sanity
I recommend tig as a very nice console-mode interactive git history viewer. Try tig --all. Or, if you prefer a GUI, try gitk --all.
As an example of things to inspect: there was a 3.4.1 tag deleted in http://zope3.pov.lt/trac/changeset/80495, which shouldn't have been deleted according to http://zope3.pov.lt/trac/changeset/80499, so I re-created the tag from refs/backups/r80495/tags/3.4.1, which was left by the conversion tool:
git tag 3.4.1 refs/backups/r80495/tags/3.4.1
Sometimes the conversion tool produces strands of unrelated history. tig --all interleaves them, which makes this hard to notice; gitk --all shows them separately.
You can identify all the root commits with
git log --all --oneline --decorate --max-parents=0
then see which branches began with these with
git branch --contains $commit_id
and then see what the parent revision of each of these ought to be by looking at the commit note of $commit_id, getting svn path and revno, then looking at http://zope3.pov.lt/trac/log/{PATH}?rev={REVNO}
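The lookup from commit note to trac URL can be scripted. The sketch below assumes the note written by --add-metadata-notes is a single line of the form `svn path=/some/path/; revision=12345`; check this against `git notes show` on one of your commits before relying on it. The function name `note_to_trac_url` is hypothetical.

```shell
# Read a commit note from stdin and print the matching trac log URL.
# Prints nothing if the note does not have the assumed format.
note_to_trac_url() {
    sed -n -E 's#^svn path=/?([^;]*); revision=([0-9]+)$#http://zope3.pov.lt/trac/log/\1?rev=\2#p'
}
```

Possible usage: `git notes show $commit_id | note_to_trac_url`.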
If you identify a missing commit parent, you can fix it up by creating a grafts file (info/grafts, each line contains "$commit_id $parent_id ..."), and you can make the connections permanent (I don't think the grafts file survives a git push) by running
git filter-branch
with no arguments.
A good way to check if your authors.txt was complete and correct is to run 'git shortlog --all -s' on the result.
if you want to dig deeper, add some more rules (put them before the final catch-all "match /" rule, since the order of matches matters):
match /Zope3/trunk/src/zope/dottedname/
  repository zope.dottedname
  branch monolithic-zope3
end match

match /Zope3/branches/([^/]+)/src/zope/dottedname/
  repository zope.dottedname
  branch monolithic-zope3-\1
end match

# Zope/DottedName never existed, this applies to other packages
match /Zope3/trunk/lib/python/Zope/DottedName/
  repository zope.dottedname
  branch ancient-zope3
end match

match /Zope3/branches/([^/]+)/lib/python/Zope/DottedName/
  repository zope.dottedname
  branch ancient-zope3-\1
end match
nuke the old repository (and log-zope.dottedname*), re-run the conversion tool, get a new repo, inspect, and don't forget the tag resurrection:
git tag 3.4.1 refs/backups/r80495/tags/3.4.1
upload to github:
git remote add origin git@github.com:zopefoundation/zope.dottedname.git
git push -u origin --mirror
remove old code from Subversion:
svn rm *
echo 'See https://github.com/zopefoundation/zope.dottedname' > MOVED_TO_GITHUB
svn add MOVED_TO_GITHUB
svn ci -m "Moved to github"
update any buildouts that used to check code out from svn, e.g. wineggbuilder:
svn co svn+ssh://svn.zope.org/repos/main/zope.wineggbuilder/trunk zope.wineggbuilder
cd zope.wineggbuilder
vim project-list.cfg
  # replace
  #   zope.dottedname,svn://svn.zope.org/repos/main/
  # with
  #   zope.dottedname,git://github.com/zopefoundation/zope.dottedname.git
svn ci -m "zope.dottedname moved to github"
update zopetoolkit too:
svn co svn+ssh://svn.zope.org/repos/main/zopetoolkit/trunk zopetoolkit
cd zopetoolkit
vim ztk-sources.cfg
  # replace
  #   zope.dottedname = svn ${buildout:svn-zope-org}/zope.dottedname/trunk
  # with
  #   zope.dottedname = git ${buildout:github}/zope.dottedname
svn ci -m "zope.dottedname moved to github"
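If you are moving several packages, the ztk-sources.cfg edit can be done without an editor. This is a minimal sketch that assumes the line for the package looks exactly like the one shown above; lines that differ in any way are passed through unchanged. The function name `ztk_source_to_git` is hypothetical.

```shell
# Read a ztk-sources.cfg from stdin and write it to stdout, switching the
# source line of package $1 from the svn checkout to the GitHub one.
# The comparison is a literal string match, so no regex escaping is needed.
ztk_source_to_git() {
    awk -v pkg="$1" '
        BEGIN {
            old = pkg " = svn ${buildout:svn-zope-org}/" pkg "/trunk"
            new = pkg " = git ${buildout:github}/" pkg
        }
        $0 == old { print new; next }
        { print }
    '
}
```

Possible usage: `ztk_source_to_git zope.dottedname < ztk-sources.cfg > ztk-sources.cfg.new`, then review the diff and move the new file into place before committing.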
Advantages of svn-all-fast-export
- it's fast (<5 CPU minutes for 129128 svn revisions; I wish I had an SSD on the server that hosts my svn mirror -- speaking of which, I do have an SSD on my laptop, where the conversion takes <4 wall clock minutes!)
- it can simultaneously convert multiple packages (add more 'create repository/end repository' statements to rules.txt, and extend the match regexps to catch the packages you're interested in)
- it's very flexible and can handle gnarly repository history, if you write rules for it
- it was written for and used by the KDE project to convert and explode their humongous svn repository into a multitude of git projects, so it's been stress-tested rather well
Disadvantages of svn-all-fast-export
- it requires a local copy of the entire subversion repository
- it doesn't produce error messages if something's wrong; instead it segfaults
- it doesn't notice commits that copy or move parts of the tree outside of your rules; this means you may be missing commits or entire files added in those commits, if those files weren't modified since. (This is the critical flaw)
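One way to find out whether a package is affected by the critical flaw is to scan the verbose svn log for copies whose source lies outside the paths your rules cover. The following is a rough sketch that handles a single path prefix; `copies_from_outside` is a hypothetical helper, and it only reports candidates for manual review.

```shell
# Read `svn log -v` output from stdin and report every added or replaced
# path under prefix $1 that was copied from a source outside that prefix.
copies_from_outside() {
    awk -v prefix="$1" '
        /^r[0-9]+ \|/ { rev = $1 }
        /^   [A-Z] \/.* \(from \/.*:[0-9]+\)$/ {
            line = $0
            sub(/^   [A-Z] /, "", line)     # "/dest (from /src:REV)"
            i = index(line, " (from ")
            dest = substr(line, 1, i - 1)
            src = substr(line, i + 7)       # skip over " (from "
            sub(/:[0-9]+\)$/, "", src)      # drop the ":REV)" suffix
            if (index(dest, prefix) == 1 && index(src, prefix) != 1)
                print rev ": " dest " <- " src
        }
    '
}
```

Possible usage: `svn log -v file:///stuff/zope-mirror/zope.dottedname | copies_from_outside /zope.dottedname/`. Each reported revision is a place where the converted git history may be missing files.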
Alternative method:
git svn clone file://$PWD/zope-mirror/$package $package --stdlayout -A authors.txt