This gist explains how to migrate a repository from SVN to Git. A significant portion of my research work, papers, and source code has lived in Subversion (SVN) since 2015. Recently we decided to migrate everything to a private GitHub repository. I am detailing the steps below in case they are helpful. First, check out the SVN repository:
svn co https://svn.engr.arizona.edu/svn/jms/jmsgroup/trunk/wildcat wildcat_SVN
This saves the Subversion working copy in wildcat_SVN.
Run the following inside the wildcat_SVN directory to list every author who has committed so far:
cd wildcat_SVN
svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors-transform.txt
This saves the author information in the file authors-transform.txt. In my case, the contents of authors-transform.txt were:
wildcat = wildcat <wildcat>
wilma = wilma <wilma>
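To see what the awk pipeline does without touching a real server, here is a minimal sketch that feeds it a made-up `svn log -q` listing. The revision numbers and dates are assumptions for illustration; only the author names come from the example above.

```shell
# Made-up `svn log -q` output: one "rN | author | date" line per revision
sample_log='------------------------------------------------------------------------
r3 | wilma | 2016-02-01 10:00:00 -0700 (Mon, 01 Feb 2016)
------------------------------------------------------------------------
r2 | wildcat | 2015-06-12 09:30:00 -0700 (Fri, 12 Jun 2015)
------------------------------------------------------------------------
r1 | wildcat | 2015-06-11 16:45:00 -0700 (Thu, 11 Jun 2015)
------------------------------------------------------------------------'

# Same awk program as above: take the second |-separated field of each
# revision line, trim the surrounding spaces, and emit "name = name <name>"
authors=$(printf '%s\n' "$sample_log" \
  | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' \
  | sort -u)
printf '%s\n' "$authors"
```

The `sort -u` at the end is what collapses the two `wildcat` revisions into a single line.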
To map commits to existing GitHub users, I edited authors-transform.txt to contain the following:
wildcat = Wilma Wildcat <[email protected]>
wilma = Wilbur Wildcat <[email protected]>
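With more than a handful of authors it is easy to miss one while editing. A possible sanity check, sketched here on a throwaway copy of the file, lists every SVN user whose entry still has no email address between the angle brackets. The `example.com` address and the temporary directory are hypothetical.

```shell
# Throwaway file: one entry already edited, one still in its generated form
workdir=$(mktemp -d)
cat > "$workdir/authors-transform.txt" <<'EOF'
wildcat = Wilma Wildcat <wilma@example.com>
wilma = wilma <wilma>
EOF

# Keep lines whose <...> part contains no "@", then print the SVN user name
unmapped=$(grep -v '<[^<>]*@[^<>]*>$' "$workdir/authors-transform.txt" | cut -d ' ' -f 1)
printf 'still unmapped: %s\n' "$unmapped"
```

If nothing is printed after the colon, every author has an email address.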
Now we use git-svn to check out the SVN repo as a Git repo. First, install git-svn:
sudo apt-get install git-svn
Then clone the repo using git svn:
cd ..
git svn clone https://svn.engr.arizona.edu/svn/jms/jmsgroup/trunk/wildcat --authors-file=wildcat_SVN/authors-transform.txt wildcat_GIT
This will take a while, so grab a cup of coffee or watch an episode of your favorite Korean drama and come back later. Next, convert the SVN ignore properties into a .gitignore file:
cd wildcat_GIT
git svn show-ignore > .gitignore
Then we need to convert all of the SVN tags into proper Git tags. The following command does so:
for t in $(git for-each-ref --format='%(refname:short)' refs/remotes/tags); do git tag ${t/tags\//} $t && git branch -D -r $t; done
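The tag name is derived from the ref name by cutting off the `tags/` part. Depending on the git-svn version and options, the short ref names may also carry a remote prefix such as `origin/`. The sketch below uses the POSIX expansion `${t##*tags/}`, which strips everything up to and including `tags/`; this is a swapped-in variant, not the bash substitution used in the loop above, and the two ref names are hypothetical.

```shell
# Hypothetical short ref names, with and without a remote prefix
names=$(for t in tags/v1.0 origin/tags/v1.1; do
  # Remove the longest leading match of "*tags/" to get the bare tag name
  printf '%s -> %s\n' "$t" "${t##*tags/}"
done)
printf '%s\n' "$names"
```

Checking the derived names this way before running the real loop avoids creating Git tags with an unintended prefix.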
We can also create a local branch for each of the remote refs with the following command:
for b in $(git for-each-ref --format='%(refname:short)' refs/remotes); do git branch $b refs/remotes/$b && git branch -D -r $b; done
At this point, we need to take care of files larger than 100 MB, because GitHub does not accept them in a regular push. For that, I installed Git LFS. Download and install it:
cd ..
wget https://github.com/git-lfs/git-lfs/releases/download/v2.12.0/git-lfs-linux-amd64-v2.12.0.tar.gz
mkdir -p git-lfs
tar -xzvf git-lfs-linux-amd64-v2.12.0.tar.gz -C ./git-lfs
cd git-lfs
sudo ./install.sh
Once Git LFS is installed, go back to the git-svn folder:
cd ../wildcat_GIT
In my case, .fig files are larger than 100 MB, so I used the following commands to have Git LFS manage them:
git lfs track "*.fig"
git add .gitattributes
If only a couple of files exceed 100 MB, you can specify them directly. For example, if figures/gradient.fig is larger than 100 MB, I would specify it as follows:
git lfs track "figures/gradient.fig"
git add .gitattributes
Note that git lfs track only affects files added from this point on. Large files already stored in earlier commits are not converted; Git LFS provides git lfs migrate import for rewriting existing history.
You can run the following command to find files larger than 99 MB in your Git directory:
find ./ -size +99M
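The plain `find` above also descends into `.git`, so it may report Git's own internal files. A minimal sketch of a variant that skips `.git` and lists only regular files is shown below. It builds a throwaway directory with sparse files so it can be tried safely; the layout and file names are hypothetical, and it assumes `truncate` from GNU coreutils is available.

```shell
# Throwaway directory standing in for the Git working tree
repo=$(mktemp -d)
mkdir -p "$repo/figures" "$repo/.git/objects"
truncate -s 101M "$repo/figures/gradient.fig"   # sparse file, uses no real disk space
truncate -s 101M "$repo/.git/objects/pack.bin"  # stands in for Git's internal data
printf 'small\n' > "$repo/notes.txt"

cd "$repo" || exit 1
# Prune .git, then print regular files larger than 99 MiB
large=$(find . -path ./.git -prune -o -type f -size +99M -print)
printf '%s\n' "$large"
```

Only `./figures/gradient.fig` is reported; the equally large file under `.git` is pruned and the small text file is below the threshold.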
Once this is ready, we are almost able to push. One more thing needs care: my Subversion repository had a lot of commits, so I could not push them all at once and had to push in steps. First, I saved the full commit log:
git log > all_commits.txt
Then I parsed all_commits.txt to get the list of commit hashes:
awk -F'commit ' '{print $2}' all_commits.txt > commitsHash.txt
The file lists the newest commit hash at the top and the oldest at the bottom, so I needed to reverse it. The following command saves the commit hashes in reverse order:
sed '1!G;h;$!d' commitsHash.txt > Reverse_commitsHas.txt
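The two steps above can be tried on a small made-up log. With the field separator `commit `, every non-commit line produces an empty line, and a message line that happens to contain the word "commit " would leak into the output. The sketch below therefore swaps in a stricter awk pattern that only matches lines starting with `commit `, then reverses with the same sed command. The hashes, authors, and messages are hypothetical sample data.

```shell
# Hypothetical hashes, newest first, as `git log` would list them
h3=3333333333333333333333333333333333333333
h2=2222222222222222222222222222222222222222
h1=1111111111111111111111111111111111111111

workdir=$(mktemp -d)
cat > "$workdir/all_commits.txt" <<EOF
commit $h3
Author: Wilma Wildcat <wilma@example.com>
Date:   Mon Feb 1 10:00:00 2016 -0700

    fix commit hook

commit $h2
Author: Wilbur Wildcat <wilbur@example.com>
Date:   Fri Jun 12 09:30:00 2015 -0700

    add figures

commit $h1
Author: Wilbur Wildcat <wilbur@example.com>
Date:   Thu Jun 11 16:45:00 2015 -0700

    initial import
EOF

# Keep only header lines that start with "commit ", print the hash,
# then reverse the line order so the oldest commit comes first
hashes=$(awk '/^commit / {print $2}' "$workdir/all_commits.txt" | sed '1!G;h;$!d')
printf '%s\n' "$hashes"
```

In a real repository, `git rev-list --reverse HEAD` prints the same oldest-first list of hashes directly, without the intermediate files.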
Now create a Git repo in your GitHub.com account. I created a GitHub repo at https://github.com/catgroup/wildcat and added it as the remote URL of the wildcat_GIT folder:
git remote add origin https://github.com/catgroup/wildcat
Then I wrote the following bash script to push sequentially from the oldest commit to the latest one:
#!/bin/bash
filename="Reverse_commitsHas.txt"
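while IFS= read -r line
do
    # Skip the blank lines left over from the awk step
    length=${#line}
    if [[ "$length" == 0 ]]; then
        continue
    fi
    # Push everything up to this commit to master on GitHub
    git push origin --force "$line":refs/heads/master
    echo "Pushed commit $line"
done < "$filename"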
Save this file as push_to_github.sh and make it executable:
chmod +x push_to_github.sh
Then execute the bash script:
./push_to_github.sh
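Because the script force-pushes, it can be worth previewing what it will do first. The following dry-run sketch reads the same kind of file but only prints each push command instead of running it. The shortened hashes and the temporary directory are hypothetical stand-ins for the real Reverse_commitsHas.txt.

```shell
# Hypothetical sample data so the sketch runs on its own
workdir=$(mktemp -d)
cd "$workdir" || exit 1
filename="Reverse_commitsHas.txt"
printf '%s\n' 1111111 '' 2222222 > "$filename"

# Same loop shape as the script, but print the command instead of pushing
planned=$(while IFS= read -r line; do
  [ -z "$line" ] && continue   # skip blank lines, as the script does
  printf 'git push origin --force %s:refs/heads/master\n' "$line"
done < "$filename")
printf '%s\n' "$planned"
```

The printed list should contain one push per commit, oldest first, with no blank or malformed entries.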
Pushing commit by commit takes a while, so go grab a cup of coffee or watch another episode of your favorite Korean drama and come back later. Once it finishes, push the tags as well:
git push origin --tags
You are now officially done. Check your repo on GitHub.