Versions used:
$ svn --version
svn, version 1.6.17 (r1128011)
   compiled Nov 20 2011, 03:42:58
...
$ git --version
git version 1.7.10
$ wget https://raw.github.com/brosner/django-git-authors/master/authors.txt
$ wget https://www.djangoproject.com/m/data/django-svn.svndump.bz2
$ bunzip2 --stdout django-svn.svndump.bz2 > django-svn.svndump
$ svnadmin create svn-repo
$ svnadmin load svn-repo < django-svn.svndump
(wait a few hours for the entire SVN history to be replayed)
Important: In the next command, note the trailing django/ in the SVN repository path.
$ git svn init \
    --rewrite-root=http://code.djangoproject.com/svn \
    --trunk=trunk \
    --branches=branches/releases \
    --branches=branches/features \
    --branches=branches/soc2009 \
    --branches=branches/soc2010 \
    --branches=branches/attic \
    --branches={0.90-bugfixes,0.91-bugfixes,0.95-bugfixes,0.96-bugfixes} \
    file://`pwd`/svn-repo/django/ \
    django-dry-run
$ cd django-dry-run
$ git svn fetch --quiet --authors-file=../authors.txt
(wait a few hours for the entire SVN repository history to be imported)
Track the remote branches, and make use of the chance to rename them:
$ git checkout --track -b stable/0.90.X remotes/0.90-bugfixes
$ git checkout --track -b stable/0.91.X remotes/0.91-bugfixes
$ git checkout --track -b stable/0.95.X remotes/0.95-bugfixes
$ git checkout --track -b stable/0.96.X remotes/0.96-bugfixes
$ git checkout --track -b stable/1.0.X remotes/1.0.X
$ git checkout --track -b stable/1.1.X remotes/1.1.X
$ git checkout --track -b stable/1.2.X remotes/1.2.X
$ git checkout --track -b stable/1.3.X remotes/1.3.X
$ git checkout --track -b stable/1.4.X remotes/1.4.X
$ git checkout --track -b soc2009/i18n-improvements remotes/i18n-improvements
$ git checkout --track -b soc2009/model-validation remotes/model-validation
$ git checkout --track -b soc2009/multidb remotes/multidb
$ git checkout --track -b soc2009/admin-ui remotes/admin-ui
$ git checkout --track -b soc2009/http-wsgi-improvements remotes/http-wsgi-improvements
$ git checkout --track -b soc2009/test-improvements remotes/test-improvements
$ git checkout --track -b soc2010/app-loading remotes/app-loading
$ git checkout --track -b soc2010/query-refactor remotes/query-refactor
$ git checkout --track -b soc2010/test-refactor remotes/test-refactor
$ git checkout --track -b attic/boulder-oracle-sprint remotes/boulder-oracle-sprint@11505
$ git checkout --track -b attic/full-history remotes/full-history
$ git checkout --track -b attic/generic-auth remotes/generic-auth
$ git checkout --track -b attic/gis remotes/gis@11507
$ git checkout --track -b attic/i18n remotes/i18n@11508
$ git checkout --track -b attic/magic-removal remotes/magic-removal@11509
$ git checkout --track -b attic/multi-auth remotes/multi-auth@11510
$ git checkout --track -b attic/multiple-db-support remotes/multiple-db-support
$ git checkout --track -b attic/new-admin remotes/new-admin@11512
$ git checkout --track -b attic/newforms-admin remotes/newforms-admin@11514
$ git checkout --track -b attic/per-object-permissions remotes/per-object-permissions
$ git checkout --track -b attic/queryset-refactor remotes/queryset-refactor@11516
$ git checkout --track -b attic/schema-evolution remotes/schema-evolution
$ git checkout --track -b attic/schema-evolution-ng remotes/schema-evolution-ng
$ git checkout --track -b attic/search-api remotes/search-api
$ git checkout --track -b attic/sqlalchemy remotes/sqlalchemy
$ git checkout --track -b attic/unicode remotes/unicode@11521
# Get back to trunk
$ git checkout master
Fork the official Django GitHub repository in your account using the GitHub Web UI.
Push the branches to it:
$ git remote add mine https://[email protected]/ramiro/django.git
$ git push mine $(git branch |grep -v ^\*\ master)
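The branch filter in the push command can be tried on its own before anything is pushed. The following is a minimal sketch; the function names and the sample branch listing are made up for illustration, and only the grep pattern mirrors the command above.

```shell
# Hypothetical sample of `git branch` output; the asterisk marks the
# checked-out branch.
sample_branches() {
    printf '%s\n' '  attic/gis' '* master' '  soc2009/multidb' '  stable/1.4.X'
}

# Same filter as in the push command: drop the "* master" line.
branches_to_push() {
    grep -v '^\* master'
}

sample_branches | branches_to_push
```

Note that the filter relies on master being the checked-out branch. If another branch were checked out, its line would carry the asterisk and would pass through the filter.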
We will be roughly following Adrian's instructions from http://www.holovaty.com/writing/django-github/
Get the latest authors.txt authors map:
$ wget https://raw.github.com/brosner/django-git-authors/master/authors.txt
Get a SVN dump from our Open Data page (https://code.djangoproject.com/wiki/OpenData):
$ wget https://www.djangoproject.com/m/data/django-svn.svndump.bz2
Prepare it:
$ bunzip2 --stdout django-svn.svndump.bz2 > django-svn.svndump
Create and populate the SVN repo:
$ svnadmin create django-svn
$ svnadmin load django-svn < django-svn.svndump
(wait a few hours for the entire SVN history to be replayed)
Init the Git repo and git-svn metadata/configuration.
Important: Note the trailing django/ in the SVN repository path.
$ git svn init \
    --rewrite-root=http://code.djangoproject.com/svn \
    --trunk=trunk \
    --branches=branches/releases \
    --branches=branches/features \
    --branches=branches/soc2009 \
    --branches=branches/soc2010 \
    --branches=branches/attic \
    --branches={0.90-bugfixes,0.91-bugfixes,0.95-bugfixes,0.96-bugfixes} \
    file:///path/to/local/SVN/repo/django/ \
    django-dry-run
This will create a [svn-remote ...] section in the .git/config file similar to this:
[svn-remote "svn"]
    url = file:///path/to/local/SVN/repo
    fetch = django/trunk:refs/remotes/trunk
    branches = django/branches/releases/*:refs/remotes/*
    branches = django/branches/features/*:refs/remotes/*
    branches = django/branches/attic/*:refs/remotes/*
    branches = django/branches/soc2009/*:refs/remotes/*
    branches = django/branches/soc2010/*:refs/remotes/*
    branches = django/branches/{0.90-bugfixes,0.91-bugfixes,0.95-bugfixes,0.96-bugfixes}:refs/remotes/*
Perform the actual SVN -> Git cloning:
$ cd django-dry-run
$ git svn fetch --quiet --authors-file=../authors.txt
This took approximately three and a half hours.
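Since git svn fetch aborts when it meets an SVN committer that is missing from the authors file, a quick format check before such a long run may save time. The sketch below is an assumption-laden helper, not part of the original procedure: the function name and the sample entries are hypothetical, and the pattern is only a rough approximation of the "login = Full Name <email>" layout.

```shell
# Print the lines (read from stdin) that do not look like
# "svnlogin = Full Name <email>".
check_authors() {
    grep -v -E '^[^=]+=[^<]*<[^<>]*>[[:space:]]*$'
}

# Hypothetical sample entries; the second one is deliberately malformed.
printf '%s\n' 'jdoe = Jane Doe <jdoe@example.com>' 'broken-entry' | check_authors
```

Against the real file this could be run as check_authors < ../authors.txt; no output would mean every line matched the rough pattern.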
Verify branches:
Check that trunk is known as master:
$ git branch
* master
Check that remote branches were created for the SVN branches:
$ git branch -r
  0.90-bugfixes
  0.90-bugfixes@3590
  0.91-bugfixes
  0.91-bugfixes@3571
  0.95-bugfixes
  0.95-bugfixes@4358
  0.96-bugfixes
  0.96-bugfixes@6603
  1.0.X
  1.1.X
  1.2.X
  1.3
  1.3.X
  1.4.X
  admin-ui
  app-loading
  boulder-oracle-sprint
  boulder-oracle-sprint@11505
  full-history
  full-history@11500
  full-history@11501
  generic-auth
  generic-auth@11506
  gis
  gis@11507
  http-wsgi-improvements
  i18n
  i18n-improvements
  i18n@11508
  magic-removal
  magic-removal@11509
  model-validation
  multi-auth
  multi-auth@11510
  multidb
  multiple-db-support
  multiple-db-support@11511
  new-admin
  new-admin@11512
  newforms-admin
  newforms-admin@11514
  per-object-permissions
  per-object-permissions@11515
  query-refactor
  queryset-refactor
  queryset-refactor@11516
  schema-evolution
  schema-evolution-ng
  schema-evolution-ng@11518
  schema-evolution@11517
  search-api
  search-api@11519
  sqlalchemy
  sqlalchemy@11520
  test-improvements
  test-refactor
  trunk
  unicode
  unicode@11521
Branches that were moved to the Attic are represented by two (or more) remote branches. The extra ones are suffixed with '@revnumber' and represent points in those branches' histories right before they were moved under branches/attic/. The branch without such a suffix in its name is the terminal one, after the move. This distinction could be important in the next step.

Also, the GSoC 2009 i18n-improvements, model-validation and multidb branches weren't merged using SVN facilities. A mentor performed a manual, local merge and then a plain commit.

Create local branches from the remote ones.
I think that for branches that:
- Got merged back to trunk (see section at the end of this document) and
- Later were moved to the Attic (see above)
we need to use the 'branchname@revnumber' branch instead of the 'branchname' one. This will provide for an easier and more realistic scenario later when we try to convert these merges into Git merges.
# Release maintenance branches
$ git checkout --track -b releases/1.0.X remotes/1.0.X
$ git checkout --track -b releases/1.1.X remotes/1.1.X
$ git checkout --track -b releases/1.2.X remotes/1.2.X
$ git checkout --track -b releases/1.3.X remotes/1.3.X
$ git checkout --track -b releases/1.4.X remotes/1.4.X
# Branches that got merged back into trunk
$ git checkout --track -b boulder-oracle-sprint remotes/boulder-oracle-sprint@11505
$ git checkout --track -b gis remotes/gis@11507
$ git checkout --track -b i18n remotes/i18n@11508
$ git checkout --track -b magic-removal remotes/magic-removal@11509
$ git checkout --track -b multi-auth remotes/multi-auth@11510
$ git checkout --track -b new-admin remotes/new-admin@11512
$ git checkout --track -b newforms-admin remotes/newforms-admin@11514
$ git checkout --track -b queryset-refactor remotes/queryset-refactor@11516
$ git checkout --track -b unicode remotes/unicode@11521
$ git checkout --track -b soc2009/i18n-improvements remotes/i18n-improvements
$ git checkout --track -b soc2009/model-validation remotes/model-validation
$ git checkout --track -b soc2009/multidb remotes/multidb
# Branches for GSoC student work, abandoned
$ git checkout --track -b soc2009/admin-ui remotes/admin-ui
$ git checkout --track -b soc2009/http-wsgi-improvements remotes/http-wsgi-improvements
$ git checkout --track -b soc2009/test-improvements remotes/test-improvements
$ git checkout --track -b soc2010/app-loading remotes/app-loading
$ git checkout --track -b soc2010/query-refactor remotes/query-refactor
$ git checkout --track -b soc2010/test-refactor remotes/test-refactor
# Abandoned branches
$ git checkout --track -b attic/full-history remotes/full-history
$ git checkout --track -b attic/generic-auth remotes/generic-auth
$ git checkout --track -b attic/multiple-db-support remotes/multiple-db-support
$ git checkout --track -b attic/per-object-permissions remotes/per-object-permissions
$ git checkout --track -b attic/schema-evolution remotes/schema-evolution
$ git checkout --track -b attic/schema-evolution-ng remotes/schema-evolution-ng
$ git checkout --track -b attic/search-api remotes/search-api
$ git checkout --track -b attic/sqlalchemy remotes/sqlalchemy
# The future
$ git checkout --track -b features/py3k remotes/py3k
# Get back to trunk
$ git checkout master
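To decide between 'branchname' and 'branchname@revnumber' it helps to see which remote refs carry the suffix at all. The following is a small sketch of one way to list them; the function name is hypothetical and the sample input is a shortened, hand-typed excerpt of a branch listing.

```shell
# From a list of remote ref names on stdin, print each branch that
# exists with an '@revnumber' suffix, followed by that suffixed ref.
refs_with_revision_suffix() {
    grep '@[0-9][0-9]*$' | sed 's/^[[:space:]]*//' | while IFS= read -r ref; do
        printf '%s %s\n' "${ref%@*}" "$ref"
    done
}

# Hand-typed excerpt of a `git branch -r` listing.
printf '%s\n' '  gis' '  gis@11507' '  multidb' '  unicode' '  unicode@11521' |
    refs_with_revision_suffix
```

Inside the converted repository the same helper could be fed directly with git branch -r | refs_with_revision_suffix.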
THIS STEP ISN'T NEEDED ANYMORE -- We avoid it by using the --rewrite-root=http://code.djangoproject.com/svn command line option when running git svn init in step 7 above.

Fix the SVN repo URLs.
This will correct the commit IDs so they are identical to the ones in our official GitHub repository:
$ git filter-branch --msg-filter \
    "sed \"s|^git-svn-id: file:///path/to/local/SVN/repo/django|git-svn-id: http://code.djangoproject.com/svn/django|g\"" -- --all
Rewrite xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx (nnnnn/17593)
Ref 'refs/heads/attic/full-history' was rewritten
...
Ref 'refs/remotes/unicode@11521' was rewritten
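The sed expression used in the message filter can be checked on a single line before running it over the whole history. This is only a sketch: the function name is hypothetical, and the revision number and UUID in the sample trailer are placeholders, not values from the real repository.

```shell
# The same substitution as in the msg-filter, as a stand-alone filter.
rewrite_svn_id() {
    sed 's|^git-svn-id: file:///path/to/local/SVN/repo/django|git-svn-id: http://code.djangoproject.com/svn/django|g'
}

# Placeholder trailer: revision and UUID are made up.
echo 'git-svn-id: file:///path/to/local/SVN/repo/django/trunk@12345 00000000-0000-0000-0000-000000000000' |
    rewrite_svn_id
```

Lines that do not start with the git-svn-id prefix pass through unchanged, so the rest of each commit message is left alone.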
Note: Consider using the -d switch for filter-branch and point it to a RAM disk.

Warning: Note that git filter-branch doubles the number of commits in the repository as it creates the modified ones but doesn't delete the original ones. It seems the original commits can be removed by a do-nothing git filter-branch run or by cloning the repository to another one.

Perform some basic sanity checks, e.g. compare:
- The commit ID
- The full commit message
of this migrated commit: https://github.com/django/django/commit/ddc5d59c6a547f76797d99510df8c3cec61e5f89 and the corresponding commit in our django-dry-run Git repo. They should be identical.

Play with trying to get SVN branch merges as nice Git merges.
Unfortunately, it seems that grafts created in the .git/info/grafts file can't be transferred to other repos, and incorporating them into the repository so that they can be transferred changes the hashes of all the subsequent commits (in Git, the hash of a commit is calculated from, among other things, the IDs of its parents).

Some useful links:
- http://justatheory.com/computers/vcs/git/bricolage-to-git.html
- http://simeonpilgrim.com/blog/2009/11/17/complex-svn-repository-conversion-to-git/
- http://blog.johngoulah.com/2009/11/migrating-svn-to-git/
- http://stackoverflow.com/questions/79165/how-to-migrate-svn-with-history-to-a-new-git-repository
- http://jausoft.com/blog/2009/07/08/svn-to-git-migration-1/
- http://ben.straubnet.net/post/939181602/git-grafting-repositories
- http://evan-tech.livejournal.com/255341.html
Push to GitHub for review.
AKA SVN archeology.
- Boulder Oracle sprint
  - Branch name: boulder-oracle-sprint
  - Merged in: r5519 (https://code.djangoproject.com/changeset/5519) -- 06/23/07 11:16:00
  - Merge commit: ac64e91a0cadc57f4bc5cd5d66955832320ca7a1
  - Parent commit in trunk: 553a20075e6991e7a60baee51ea68c8adc520d9a
  - Parent commit in branch: 0cb8e31823b2e9f05c4ae868c19f5f38e78a5f2e
- GIS
  - Branch name: gis
  - Merged in: r8219 (https://code.djangoproject.com/changeset/8219) -- 08/05/08 15:13:06
  - Merge commit: 79e68c225b926302ebb29c808dda8afa49856f5c
  - Parent commit in trunk: d0f57e7c7385a112cb9e19d314352fc5ed5b0747
  - Parent commit in branch: aa239e3e5405933af6a29dac3cf587b59a099927
- i18n (hugo's original work)
  - Branch name: i18n
  - Merged in: r1068 (https://code.djangoproject.com/changeset/1068) -- 11/04/05 01:59:46
  - Merge commit: 5cf8f684237ab5addaf3549b2347c3adf107c0a7
  - Parent commit in trunk: cb45fd0ae20597306cd1f877efc99d9bd7cbee98
  - Parent commit in branch: e27211a0deae2f1d402537f0ebb64ad4ccf6a4da
- Magic removal
  - Branch name: magic-removal
  - Merged in: r2809 (https://code.djangoproject.com/changeset/2809) -- 05/01/06 22:31:56
  - Merge commit: f69cf70ed813a8cd7e1f963a14ae39103e8d5265
  - Parent commit in trunk: d5dbeaa9be359a4c794885c2e9f1b5a7e5e51fb8
  - Parent commit in branch: d2fcbcf9d76d5bb8a661ee73dae976c74183098b
- multi-auth
  - Branch name: multi-auth
  - Merged in: r3226 (https://code.djangoproject.com/changeset/3226) -- 06/28/06 13:37:02
  - Merge commit: aab3a418ac9293bb4abd7670f65d930cb0426d58
  - Parent commit in trunk: 4ea7a11659b8a0ab07b0d2e847975f7324664f10
  - Parent commit in branch: adf4b9311d5d64a2bdd58da50271c121ea22e397
- Alex's Multiple DB support (GSoC 2009) (MANUAL SVN MERGE)
  - Branch name: multidb
  - Merged in: r11952 (https://code.djangoproject.com/changeset/11952) -- 12/22/09
  - Merge commit: ff60c5f9de3e8690d1e86f3e9e3f7248a15397c8
  - Parent commit in trunk: 7ef212af149540aa2da577a960d0d87029fd1514
  - Parent commit in branch: 45b4288bb66a3cda401b45901e85b645674c3988
- rjwittams' first admin refactoring
  - Branch name: new-admin
  - Merged in: r1434 (https://code.djangoproject.com/changeset/1434) -- 11/25/05 18:20:09
  - Merge commit: 9dda4abee1225db7a7b195b84c915fdd141a7260
  - Parent commit in trunk: 4fe5c9b7ee09dc25921918a6dbb7605edb374bc9
  - Parent commit in branch: 3a7c14b583621272d4ef53061287b619ce3c290d
- Second admin refactoring
  - Branch name: newforms-admin
  - Merged in: r7967 (https://code.djangoproject.com/changeset/7967) -- 07/18/08 20:54:34
  - Merge commit: a19ed8aea395e8e07164ff7d85bd7dff2f24edca
  - Parent commit in trunk: dc375fb0f3b7fbae740e8cfcd791b8bccb8a4e66
  - Parent commit in branch: 42ea7a5ce8aece67d16c6610a49560c1493d4653
- Malcolm's QuerySet refactor
  - Branch name: queryset-refactor
  - Merged in: r7477 (https://code.djangoproject.com/changeset/7477) -- 04/26/08 23:50:16
  - Merge commit: 9c52d56f6f8a9cdafb231adf9f4110473099c9b5
  - Parent commit in trunk: c91a30f00fd182faf8ca5c03cd7dbcf8b735b458
  - Parent commit in branch: 4a5c5c78f2ecd4ed8859cd5ac773ff3a01bccf96
- Unicode
  - Branch name: unicode
  - Merged in: r5609 (https://code.djangoproject.com/changeset/5609) -- 07/04/07 09:11:04
  - Merge commit: 953badbea5a04159adbfa970f5805c0232b6a401
  - Parent commit in trunk: 4c958b15b250866b70ded7d82aa532f1e57f96ae
  - Parent commit in branch: 5664a678b29ab04cad425c15b2792f4519f43928
- model validation (GSoC 2009) (MANUAL SVN MERGE)
  - Branch name: model-validation
  - Merged in: r12098 (https://code.djangoproject.com/changeset/12098) -- 09/11/09 18:23:55
  - Merge commit: 471596fc1afcb9c6258d317c619eaf5fd394e797
  - Parent commit in trunk: 4e89105d64bb9e04c409139a41e9c7aac263df4c
  - Parent commit in branch: 3e9035a9625c8a8a5e88361133e87ce455c4fc13
- i18n-improvements (GSoC 2009) (MANUAL SVN MERGE)
  - Branch name: i18n-improvements
  - Merged in: r11964 (https://code.djangoproject.com/changeset/11964) -- 12/22/09 14:58:49
  - Merge commit: 9233d0426537615e06b78d28010d17d5a66adf44
  - Parent commit in trunk: 6632739e94c6c38b4c5a86cf5c80c48ae50ac49f
  - Parent commit in branch: 18e151bc3f8a85f2766d64262902a9fcad44d937
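The three hashes recorded for each merge map directly onto one line of a grafts file, where a commit ID is followed by the IDs of the parents it should appear to have. A minimal sketch, assuming a hypothetical helper name and reusing the values listed for the Unicode branch:

```shell
# Print one grafts-file line: the commit followed by its desired parents.
# Arguments: merge commit, parent in trunk, parent in branch.
graft_line() {
    printf '%s %s %s\n' "$1" "$2" "$3"
}

# Values copied from the Unicode entry in the list above.
graft_line 953badbea5a04159adbfa970f5805c0232b6a401 \
           4c958b15b250866b70ded7d82aa532f1e57f96ae \
           5664a678b29ab04cad425c15b2792f4519f43928
```

Inside the converted repository, the output could be appended to .git/info/grafts to experiment with how a merge would look. As noted earlier, such grafts stay local to that repository.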