Import from SVN
This guide explains how to migrate a project from SVN to Git.
Contents
- 1 Configure your system to allow Migration Facilities to properly work
- 2 Extract the author(s) information
- 3 SVN Repository Layouts
- 4 Clone the SVN Repository
- 5 Change dir in the cloned folder
- 6 Rename “trunk” branch to “master” (if needed)
- 7 Check the imported history
- 8 Add code-repo’s Git repository as new remote
- 9 Push the local repository to the new remote
- 10 Change the SCM Connection
Configure your system to allow Migration Facilities to properly work
Download migration utilities
For easy reference in this guide, the migration scripts are downloaded to the home folder:
cd $HOME && wget https://code-repo.d4science.org/gCubeSystem/Configs/raw/branch/master/SVN/svn-migration-scripts.jar
Install git-svn (Ubuntu)
On Ubuntu, if you get this output:
$ java -jar ./svn-migration-scripts.jar verify
svn-migration-scripts: using version 0.1.56bbc7f
Git: using version 2.17.1
Subversion: using version 1.9.7
git: 'svn' is not a git command. See 'git --help'.
The most similar commands are
	fsck
	mv
	show
git-svn: ERROR: Unable to determine version.
You must install the git-svn package by running:
$ sudo apt install git-svn
Now, you will get something like this:
$ java -jar ./svn-migration-scripts.jar verify
svn-migration-scripts: using version 0.1.56bbc7f
Git: using version 2.17.1
Subversion: using version 1.9.7
git-svn: using version 2.17.1
Mount a case-sensitive disk image (for Mac OS)
Check if this step is needed by running:
java -jar ~/svn-migration-scripts.jar verify
svn-migration-scripts: using version 0.1.56bbc7f
Git: using version 2.11.0
Subversion: using version 1.9.4
git-svn: using version 2.11.0
You appear to be running on a case-insensitive file-system. This is unsupported, and can result in data loss.
Following the warning, we create a disk image dedicated to the migration activities:
java -jar ~/svn-migration-scripts.jar create-disk-image 5 GitMigration
created: /Users/manuelesimi/GitMigration.sparseimage
/dev/disk2 GUID_partition_scheme
/dev/disk2s1 EFI
/dev/disk2s2 Apple_HFS /Users/manuelesimi/GitMigration
The disk image was created successfully and mounted as: /Users/manuelesimi/GitMigration
Extract the author(s) information
SVN identifies the author of a commit by username, whereas Git uses a name and an email address. The authors file maps each SVN username to a Git identity, so that every commit in the imported history is attributed to the right person.
It is possible to get the author list for the current repository only or use the global author list (extracted for your convenience).
Get authors info from the desired repository
cd ~/GitMigration
java -jar ~/svn-migration-scripts.jar authors http://svn.research-infrastructures.eu/public/d4science/gcube/trunk/Common/gxREST > authors.txt
About to create the authors file.
Alternative (pure SVN):
svn co https://svn.d4science.research-infrastructures.eu/gcube/trunk/Common/gxREST
cd gxREST
svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors.txt
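To see what the awk filter produces without contacting the SVN server, the following sketch feeds it a hand-written sample of `svn log -q` output. The revision numbers and one of the dates are invented for illustration; the function names `sample_log` and `extract_authors` are hypothetical helpers, not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical sample of `svn log -q` output (invented revisions);
# a real log comes from running `svn log -q` in a working copy.
sample_log() {
cat <<'EOF'
------------------------------------------------------------------------
r3 | manuele.simi | 2019-03-31 03:39:06 +0000 (Sun, 31 Mar 2019)
------------------------------------------------------------------------
r2 | luca.frosini | 2019-03-30 19:58:21 +0000 (Sat, 30 Mar 2019)
------------------------------------------------------------------------
r1 | manuele.simi | 2019-03-29 10:00:00 +0000 (Fri, 29 Mar 2019)
------------------------------------------------------------------------
EOF
}

# Same filter as in the command above: keep only revision lines,
# trim the spaces around the username (second '|'-separated field),
# and emit one "user = user <user>" mapping per distinct author.
extract_authors() {
  awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u
}

sample_log | extract_authors
```

Each distinct username appears once, in the form `user = user <user>`, so the name and the email address on the right-hand side still have to be edited by hand.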
Edit the authors file
You need to edit the file and add the correct email address (i.e. the address configured in the Git service) for each listed author:
$ cat authors.txt
luca.frosini = luca.frosini <luca.frosini@mycompany.com>
lucio.lelii = lucio.lelii <lucio.lelii@mycompany.com>
manuele.simi = manuele.simi <manuele.simi@mycompany.com>
$ vi authors.txt
$ cat authors.txt
luca.frosini = Luca Frosini <luca.frosini@isti.cnr.it>
lucio.lelii = Lucio Lelii <lucio.lelii@isti.cnr.it>
manuele.simi = Manuele Simi <manuele.simi@isti.cnr.it>
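If all authors share the same email domain, you can replace the placeholder domain with a single command (on Mac OS, the bundled sed requires an explicit backup suffix, so use sed -i '' instead of sed -i):
$ sed -i 's/@mycompany\.com/@isti.cnr.it/g' authors.txt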
Use the global authors list
You can download the global authors mapping file as follows:
$ wget -O authors.txt https://code-repo.d4science.org/gCubeSystem/Configs/raw/branch/master/SVN/all-svn-authors.txt
BEFORE USING THE FILE, PLEASE DOUBLE-CHECK THAT YOUR INFORMATION IS CORRECT. IF YOU NOTICE AN ERROR FOR ANY AUTHOR, PLEASE CORRECT IT IN THE SOURCE FILE.
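Whichever way the authors file is obtained, a quick format check before cloning can catch typos such as a missing closing bracket. The following is a minimal sketch, assuming every entry should look like `svn_user = Full Name <email@domain>`; the function name `check_authors_file` and the regular expression are illustrative choices, not part of git-svn.

```shell
#!/usr/bin/env bash
# Print every line of an authors file that does not look like
#   svn_user = Full Name <email@domain>
# and return non-zero if at least one such line is found.
check_authors_file() {
  file="$1"
  # -v selects non-matching lines, -n prefixes line numbers, -E enables extended regex.
  # grep exits with 0 when it printed something, i.e. when a malformed line exists.
  if grep -vnE '^[^ =]+ = [^<>]+ <[^<>@ ]+@[^<>@ ]+>$' "$file"; then
    echo "authors file has malformed entries" >&2
    return 1
  fi
  return 0
}
```

Calling `check_authors_file authors.txt` prints the offending lines with their line numbers, or nothing if every entry matches the assumed format.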
SVN Repository Layouts
Depending on the structure of your SVN repo, the git-svn (see below) command needs to be configured differently. There are two possible layouts.
Standard Layout
The SVN project uses the standard /trunk, /branches, and /tags directory layout. This is the recommended way to organize a repository:
- a trunk directory to hold the “main line” of development,
- a branches directory to contain branch copies,
- and a tags directory to contain tag copies.
In the standard layout, these are top-level directories.
Non-Standard Layout
The SVN project uses a custom layout.
Clone the SVN Repository
The git svn clone command transforms the trunk, branches, and tags in your SVN repository into a new Git repository. Depending on the structure of the SVN repo, the command needs to be configured differently.
> git svn clone --authors-file=authors.txt --follow-parent http://svn.research-infrastructures.eu/public/d4science/gcube/trunk/Common/gxREST \
  --username manuele.simi gxRest
Note: --follow-parent makes it slower, but it’s needed if the SVN folder has been moved around.
Initialized empty Git repository in /Users/manuelesimi/GitMigration/gxRest/.git/
This may take a while on large repositories
Checked through r173000
A .classpath
A pom.xml
A gxJRS/.classpath
A gxJRS/.project
A gxJRS/distro/profile.xml
A gxJRS/distro/LICENSE
A gxJRS/distro/changelog.xml
A gxJRS/distro/README
A gxJRS/src/test/java/org/gcube/common/gxrest/request/GXWebTargetAdapterRequestTest.java
A gxJRS/src/test/java/org/gcube/common/gxrest/request/GXHTTPStringRequestTest.java
A gxJRS/src/test/resources/logback-test.xml
A gxJRS/src/main/java/org/gcube/common/gxrest/methods/package-info.java
A gxJRS/src/main/java/org/gcube/common/gxrest/request/GXHTTPStreamRequest.java
A gxJRS/src/main/java/org/gcube/common/gxrest/request/package-info.java
A gxJRS/src/main/java/org/gcube/common/gxrest/request/GXWebTargetAdapterRequest.java
[omitted output]
A gxJRS/src/main/java/org/gcube/common/gxrest/response/entity/SerializableErrorEntityTextWriter.java
A gxJRS/src/main/java/org/gcube/common/gxrest/response/entity/SerializableErrorEntityTextReader.java
M gxJRS/src/main/java/org/gcube/common/gxrest/response/entity/SerializableErrorEntity.java
r178787 = ac04855b00de818f2095d0784eb68c51a6ec9f77 (refs/remotes/git-svn)
Checked out HEAD: https://svn.d4science.research-infrastructures.eu/gcube/trunk/Common/gxREST r178787
creating empty directory: gxHTTP/src/main/resources
creating empty directory: gxJRS/src/main/resources
creating empty directory: gxJRS/src/test/java/org/gcube/common/gxrest/response
Note that the command above does not automatically import the SVN branches, because the repository does not have the standard SVN layout.
If the SVN repository doesn’t have a standard layout and you want to import everything, you need to provide the locations of your trunk, branches, and tags using the --trunk, --branches, and --tags command line options. See the git-svn documentation for further options.
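As an illustration of those options, the sketch below assembles a clone command for a repository with a custom layout. The URL and the directory names (`main`, `dev-branches`, `releases`) are placeholders invented for this example and must be replaced with the real locations in your repository. The command is only printed, so it can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
# Hypothetical values: replace them with your repository URL and actual directories.
SVN_URL="https://svn.example.org/repos/myproject"  # placeholder, not a real server
TRUNK_DIR="main"              # directory holding the main line (assumed name)
BRANCHES_DIR="dev-branches"   # directory holding branch copies (assumed name)
TAGS_DIR="releases"           # directory holding tag copies (assumed name)
TARGET_DIR="myproject"        # local folder that will contain the Git repository

# Dry run: the leading `echo` prints the command instead of executing it.
# Remove `echo` to actually run the clone.
clone_cmd() {
  echo git svn clone \
    --authors-file=authors.txt \
    --trunk="$TRUNK_DIR" \
    --branches="$BRANCHES_DIR" \
    --tags="$TAGS_DIR" \
    "$SVN_URL" "$TARGET_DIR"
}

clone_cmd
```

For a repository that does follow the standard layout, git-svn offers `--stdlayout` as a shorthand for `--trunk=trunk --branches=branches --tags=tags`.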
Change dir in the cloned folder
The last parameter of the "git svn clone" command is the name of the folder ("gxRest" in the previous example) where the SVN repository is cloned and converted to Git. All the git commands in the next sections must be executed inside the cloned folder, so first move into it:
> cd <cloned repo>
Rename “trunk” branch to “master” (if needed)
> git branch
* trunk
> git branch -m trunk master
> git branch
* master
Check the imported history
> git log -10
commit ac04855b00de818f2095d0784eb68c51a6ec9f77
Author: manuele.simi <manuele.simi@isti.cnr.it>
Date: Sun Mar 31 03:39:06 2019 +0000

    Add JAX-RS MessageBodyWriter/Reader responsible for converting SerializableErrorEntity to/from a stream.

    git-svn-id: https://svn.d4science.research-infrastructures.eu/gcube/trunk/Common/gxREST@178787 82a268e6-3cf1-43bd-a215-b396298e98cf

commit 6eb3f608dbe31c50578e46c53e16c469a0cc7f0c
Author: manuele.simi <manuele.simi@isti.cnr.it>
Date: Sat Mar 30 19:58:21 2019 +0000

    Tweak some javadoc.

    git-svn-id: https://svn.d4science.research-infrastructures.eu/gcube/trunk/Common/gxREST@178786 82a268e6-3cf1-43bd-a215-b396298e98cf
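Besides reading the log, it can help to summarize which author identities ended up in the imported history. The following is a minimal sketch to be run inside the cloned folder; the helper names `list_authors` and `check_no_placeholder` are hypothetical, and the `mycompany.com` domain is the placeholder that appears in the generated authors file earlier in this guide.

```shell
#!/usr/bin/env bash
# List every distinct "Name <email>" found in the history of all branches,
# with the number of commits for each, most active author first.
list_authors() {
  git log --all --format='%an <%ae>' | sort | uniq -c | sort -rn
}

# Return non-zero if any commit still carries the placeholder email domain.
check_no_placeholder() {
  if git log --all --format='%ae' | grep -q '@mycompany\.com$'; then
    echo "some commits still use a placeholder email" >&2
    return 1
  fi
}
```

If `check_no_placeholder` fails, the likely cause is an authors file that was not fully edited before cloning; in that case it is usually simpler to fix the file and repeat the clone than to rewrite the history afterwards.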
Add code-repo’s Git repository as new remote
Do note that, before running this step, you need to create a new Git repository.
> git remote add origin https://code-repo.d4science.org/manuele.simi/gxRest.git
> git remote -v
origin https://code-repo.d4science.org/manuele.simi/gxRest.git (fetch)
origin https://code-repo.d4science.org/manuele.simi/gxRest.git (push)
Push the local repository to the new remote
> git push --set-upstream --force origin master
Counting objects: 168, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (125/125), done.
Writing objects: 100% (168/168), 39.47 KiB | 0 bytes/s, done.
Total 168 (delta 44), reused 0 (delta 0)
remote: Resolving deltas: 100% (44/44), done.
To https://code-repo.d4science.org/manuele.simi/gxRest.git
 + 4ec6b48...ac04855 master -> master (forced update)
Branch master set up to track remote branch master from origin.
Change the SCM Connection
The SCM section of the POM must be changed to reflect that we are now working with a Git repository.
For instance, the following section:
<scm>
  <connection>scm:svn:http://svn.d4science.research-infrastructures.eu/gcube/trunk/distributions/${project.artifactId}</connection>
  <developerConnection>scm:svn:https://svn.d4science.research-infrastructures.eu/gcube/trunk/distributions/${project.artifactId}</developerConnection>
  <url>http://svn.d4science.research-infrastructures.eu/gcube/trunk/distributions/${project.artifactId}</url>
</scm>
must be changed to:
<scm>
  <connection>scm:git:https://code-repo.d4science.org/gCubeSystem/${project.artifactId}.git</connection>
  <developerConnection>scm:git:https://code-repo.d4science.org/gCubeSystem/${project.artifactId}.git</developerConnection>
  <url>https://code-repo.d4science.org/gCubeSystem/${project.artifactId}</url>
</scm>
NB. This works only when the ${project.artifactId} declared in the POM is equal to the name of the repository on Git.
Otherwise, you must declare the Git repository URL explicitly, as follows:
<scm>
  <connection>scm:git:[YOUR_GIT_REPOSITORY_URL].git</connection>
  <developerConnection>scm:git:[YOUR_GIT_REPOSITORY_URL].git</developerConnection>
  <url>[YOUR_GIT_REPOSITORY_URL]</url>
</scm>
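After editing the POM, a simple text search can confirm that no SVN reference was left behind. This is a minimal sketch, assuming the POM is named `pom.xml` unless another path is given; the function name `check_scm_migrated` is a hypothetical helper, and the check is purely textual (it does not parse the XML or resolve Maven properties).

```shell
#!/usr/bin/env bash
# Return non-zero if the given POM still contains an SVN-style SCM URL
# or a reference to the old SVN host used in this guide.
check_scm_migrated() {
  pom="${1:-pom.xml}"
  # -n prints the line numbers of the leftover references.
  if grep -nE 'scm:svn:|svn\.d4science\.research-infrastructures\.eu' "$pom"; then
    echo "$pom still references SVN" >&2
    return 1
  fi
  echo "$pom: no SVN references found"
}
```

Running `check_scm_migrated` in the project root prints the offending lines, if any, so they can be corrected before the change is committed and pushed.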