Import from SVN


This guide explains how to migrate a project from SVN to Git.

Configure your system so that the migration utilities work properly

Download migration utilities

For ease of reference in this guide, the migration scripts are downloaded to the home folder:

 cd $HOME &&  wget https://code-repo.d4science.org/gCubeSystem/Configs/raw/branch/master/SVN/svn-migration-scripts.jar

Install git-svn (Ubuntu)

On Ubuntu, if you get this output:

$ java -jar ./svn-migration-scripts.jar verify  
svn-migration-scripts: using version 0.1.56bbc7f 
Git: using version 2.17.1  
Subversion: using version 1.9.7
git: 'svn' is not a git command. See 'git --help'.
The most similar commands are
fsck
mv
show
git-svn: ERROR: Unable to determine version.

You must install the git-svn package by running:

$ sudo apt install git-svn

Now, you will get something like this:

$ java -jar ./svn-migration-scripts.jar verify
svn-migration-scripts: using version 0.1.56bbc7f
Git: using version 2.17.1
Subversion: using version 1.9.7
git-svn: using version 2.17.1

Mount a case-sensitive disk image (for Mac OS)

Check if this step is needed by running:

java -jar ~/svn-migration-scripts.jar verify
 
svn-migration-scripts: using version 0.1.56bbc7f
Git: using version 2.11.0
Subversion: using version 1.9.4
git-svn: using version 2.11.0
You appear to be running on a case-insensitive file-system. This is unsupported, and can result in data loss.

If the warning above appears, create a disk image dedicated to the migration activities (here, 5 GB in size and named GitMigration):

java -jar ~/svn-migration-scripts.jar create-disk-image 5 GitMigration
 
created: /Users/manuelesimi/GitMigration.sparseimage
/dev/disk2     		 GUID_partition_scheme     		 
/dev/disk2s1   		 EFI                       		 
/dev/disk2s2   		 Apple_HFS                 		 /Users/manuelesimi/GitMigration
The disk image was created successfully and mounted as: /Users/manuelesimi/GitMigration
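
The case-sensitivity check can also be done by hand, without the migration utility. The following is a minimal sketch, assuming a POSIX shell and a writable directory; the function name is_case_insensitive is hypothetical and not part of the migration scripts:

```shell
# Hypothetical helper: returns 0 if the given directory is on a
# case-insensitive file system, 1 if it is case-sensitive, and 2 if the
# probe file cannot be created.
is_case_insensitive() {
    dir="${1:-.}"
    probe="$dir/CaseProbe.$$"
    lower="$dir/caseprobe.$$"
    touch "$probe" 2>/dev/null || return 2
    # On a case-insensitive file system the lowercase name resolves
    # to the file that was just created.
    if [ -e "$lower" ]; then result=0; else result=1; fi
    rm -f "$probe"
    return "$result"
}

rc=0
is_case_insensitive "$PWD" || rc=$?
case "$rc" in
    0) echo "case-insensitive: create a disk image first" ;;
    1) echo "case-sensitive: no disk image needed" ;;
    *) echo "could not write a probe file in $PWD" >&2 ;;
esac
```

Run it from the directory where you plan to clone the repository, since different volumes on the same machine may behave differently.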

Extract the author(s) information

SVN identifies the author of a commit by username, whereas Git uses a name and an email address. The authors file is required to associate each commit in the history with the right person.

You can either extract the author list from the repository being migrated or use the global author list (extracted for your convenience).

Get authors info from the desired repository

 cd ~/GitMigration
 java -jar ~/svn-migration-scripts.jar authors http://svn.research-infrastructures.eu/public/d4science/gcube/trunk/Common/gxREST  > authors.txt
About to create the authors file.


Alternative (pure SVN):

svn co https://svn.d4science.research-infrastructures.eu/gcube/trunk/Common/gxREST
cd gxREST
svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors.txt
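
To see what the awk pipeline above produces, it can be run on a sample of `svn log -q` output instead of a live repository. This is only an illustration: the first two log entries reuse revisions from this guide, the third entry is invented, and sample_log is a hypothetical helper.

```shell
# Sample `svn log -q` output: one "rN | user | date" line per revision,
# separated by dashed lines. The last entry is invented for illustration.
sample_log() {
cat <<'EOF'
------------------------------------------------------------------------
r178787 | manuele.simi | 2019-03-31 03:39:06 +0000 (Sun, 31 Mar 2019)
------------------------------------------------------------------------
r178786 | manuele.simi | 2019-03-30 19:58:21 +0000 (Sat, 30 Mar 2019)
------------------------------------------------------------------------
r178000 | luca.frosini | 2019-03-01 10:00:00 +0000 (Fri, 01 Mar 2019)
------------------------------------------------------------------------
EOF
}

# Same awk program as above: keep lines starting with "r", take the second
# "|"-separated field, trim the surrounding spaces, and emit one mapping
# line per distinct user.
authors=$(sample_log | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u)
echo "$authors"
# prints:
# luca.frosini = luca.frosini <luca.frosini>
# manuele.simi = manuele.simi <manuele.simi>
```

Note that this variant writes the bare username between the angle brackets, so every address has to be filled in when editing the file.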

Edit the authors file

You need to edit the file and add the correct email address (i.e. the address configured in the Git service) for each listed author:

$ cat authors.txt
luca.frosini = luca.frosini <luca.frosini@mycompany.com>
lucio.lelii = lucio.lelii <lucio.lelii@mycompany.com>
manuele.simi = manuele.simi <manuele.simi@mycompany.com>
$ vi authors.txt
 
$ cat authors.txt
luca.frosini = Luca Frosini <luca.frosini@isti.cnr.it>
lucio.lelii = Lucio Lelii <lucio.lelii@isti.cnr.it>
manuele.simi = Manuele Simi <manuele.simi@isti.cnr.it>

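You can also replace the placeholder domain for all authors with a single command (the display names still need to be edited by hand; on macOS, BSD sed requires -i '' instead of -i):

$ sed -i 's/@mycompany\.com/@isti.cnr.it/g' authors.txt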
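
Before cloning, it can help to verify that every line of the authors file has the form "svn-user = Full Name <email>" and that no placeholder address is left. The following is a minimal sketch; check_authors is a hypothetical helper, and the default placeholder domain is the one from the example above:

```shell
# Hypothetical helper: prints the offending lines (with line numbers) and
# returns non-zero if any line is malformed or still uses the placeholder
# domain. Blank lines are reported as malformed too.
check_authors() {
    file="$1"
    placeholder="${2:-@mycompany.com}"
    bad=0
    # Lines that do not match "user = Name <address@domain>".
    if grep -n -v -E '^[^=]+ = [^<>]+ <[^<>@ ]+@[^<>@ ]+>$' "$file"; then
        bad=1
    fi
    # Lines that still contain the placeholder domain.
    if grep -n -F "$placeholder" "$file"; then
        bad=1
    fi
    return "$bad"
}

if [ -f authors.txt ]; then
    check_authors authors.txt || echo "fix the lines listed above" >&2
fi
```

A line with a missing closing bracket, for example, is reported by the first check.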

Use the global authors list

You can get the global authors mapping file as follows:

$ wget -O authors.txt https://code-repo.d4science.org/gCubeSystem/Configs/raw/branch/master/SVN/all-svn-authors.txt


BEFORE USING THE FILE, PLEASE DOUBLE-CHECK THAT YOUR INFORMATION IS CORRECT. IF YOU NOTICE AN ERROR FOR ANY AUTHOR, PLEASE CORRECT IT IN THE SOURCE FILE.

SVN Repository Layouts

Depending on the structure of your SVN repository, the git svn command (see below) needs to be configured differently. There are two possible layouts.

Standard Layout

The SVN project uses the standard /trunk, /branches, and /tags directory layout. This is the recommended way to organize a repository:

  • a trunk directory to hold the “main line” of development,
  • a branches directory to contain branch copies,
  • and a tags directory to contain tag copies.

In the standard layout, these are top-level directories.

Figure: standard SVN layout (SNVStardardLayout.png)

Non-Standard Layout

The SVN project uses a custom layout.

Figure: non-standard SVN layout (SVNNonStandardLayout.png)

Clone the SVN Repository

The git svn clone command transforms the trunk, branches, and tags in your SVN repository into a new Git repository. Depending on the structure of the SVN repo, the command needs to be configured differently.

> git svn clone --authors-file=authors.txt --follow-parent   http://svn.research-infrastructures.eu/public/d4science/gcube/trunk/Common/gxREST \
  --username manuele.simi gxRest
 
Note: --follow-parent makes the clone slower, but it is needed if the SVN folder has been moved around within the repository.
 
Initialized empty Git repository in /Users/manuelesimi/GitMigration/gxRest/.git/
 
This may take a while on large repositories
Checked through r173000
 
    A    .classpath
 
    A    pom.xml
    A    gxJRS/.classpath
    A    gxJRS/.project
    A    gxJRS/distro/profile.xml
    A    gxJRS/distro/LICENSE
    A    gxJRS/distro/changelog.xml
    A    gxJRS/distro/README
    A    gxJRS/src/test/java/org/gcube/common/gxrest/request/GXWebTargetAdapterRequestTest.java
    A    gxJRS/src/test/java/org/gcube/common/gxrest/request/GXHTTPStringRequestTest.java
    A    gxJRS/src/test/resources/logback-test.xml
    A    gxJRS/src/main/java/org/gcube/common/gxrest/methods/package-info.java
    A    gxJRS/src/main/java/org/gcube/common/gxrest/request/GXHTTPStreamRequest.java
    A    gxJRS/src/main/java/org/gcube/common/gxrest/request/package-info.java
    A    gxJRS/src/main/java/org/gcube/common/gxrest/request/GXWebTargetAdapterRequest.java
 
    [omitted output]
 
    A    gxJRS/src/main/java/org/gcube/common/gxrest/response/entity/SerializableErrorEntityTextWriter.java
    A    gxJRS/src/main/java/org/gcube/common/gxrest/response/entity/SerializableErrorEntityTextReader.java
    M    gxJRS/src/main/java/org/gcube/common/gxrest/response/entity/SerializableErrorEntity.java
r178787 = ac04855b00de818f2095d0784eb68c51a6ec9f77 (refs/remotes/git-svn)
Checked out HEAD:
  https://svn.d4science.research-infrastructures.eu/gcube/trunk/Common/gxREST r178787
creating empty directory: gxHTTP/src/main/resources
creating empty directory: gxJRS/src/main/resources
creating empty directory: gxJRS/src/test/java/org/gcube/common/gxrest/response

Note that the command above does NOT automatically import the SVN branches, because the repository does not use the standard SVN layout.

If the SVN repository doesn’t have a standard layout and you want to import everything, you need to provide the locations of your trunk, branches, and tags using the --trunk, --branches, and --tags command line options. See git-svn for further options.
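
The two cases can be sketched as follows. The repository URL, the target folder, and the custom directory names (main, dev-branches, releases) are invented for illustration; only the git svn options themselves (--stdlayout, --trunk, --branches, --tags, --authors-file) are real. The script only prints the commands, so nothing is cloned until you run the one that matches your layout.

```shell
# Hypothetical values: adjust them to your repository.
SVN_ROOT="https://svn.example.org/repos/myproject"   # assumed URL
AUTHORS="authors.txt"
TARGET="myproject"

# Standard layout: --stdlayout is shorthand for
# --trunk=trunk --branches=branches --tags=tags.
STD_CMD="git svn clone --stdlayout --authors-file=$AUTHORS $SVN_ROOT $TARGET"

# Non-standard layout: each path is relative to the SVN root URL.
# The directory names below are invented for illustration.
CUSTOM_CMD="git svn clone --trunk=main --branches=dev-branches --tags=releases --authors-file=$AUTHORS $SVN_ROOT $TARGET"

# Dry run: print both commands instead of executing them.
echo "$STD_CMD"
echo "$CUSTOM_CMD"
```

In both cases the URL must point at the project root that contains the trunk, branches, and tags directories, not at the trunk itself.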

Change directory to the cloned folder

The last parameter of the "git svn clone" command is the name of the folder ("gxRest" in the previous example) into which the SVN repository is cloned and converted to Git. All the git commands in the next sections are executed in the cloned folder, so first run:

> cd <cloned repo>

Rename “trunk” branch to “master” (if needed)

> git branch
* trunk
> git branch -m trunk master
> git branch
* master
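
Since the rename is only needed in some cases, it can be guarded by a check. The following is a minimal sketch; rename_trunk_to_master is a hypothetical helper that renames the branch only when a local "trunk" branch exists and no "master" branch does:

```shell
# Hypothetical helper: run it inside the cloned folder.
rename_trunk_to_master() {
    # git show-ref --verify --quiet succeeds only if the exact ref exists.
    if git show-ref --verify --quiet refs/heads/trunk &&
       ! git show-ref --verify --quiet refs/heads/master; then
        git branch -m trunk master
    fi
}
```

Calling rename_trunk_to_master in a repository that already has a master branch leaves it untouched.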

Check the imported history

> git log -10
commit ac04855b00de818f2095d0784eb68c51a6ec9f77
Author: manuele.simi <manuele.simi@isti.cnr.it>
Date:   Sun Mar 31 03:39:06 2019 +0000
 
	Add JAX-RS MessageBodyWriter/Reader responsible for converting SerializableErrorEntity to/from a stream.
 
	git-svn-id: https://svn.d4science.research-infrastructures.eu/gcube/trunk/Common/gxREST@178787 82a268e6-3cf1-43bd-a215-b396298e98cf
 
commit 6eb3f608dbe31c50578e46c53e16c469a0cc7f0c
Author: manuele.simi <manuele.simi@isti.cnr.it>
Date:   Sat Mar 30 19:58:21 2019 +0000
 
	Tweak some javadoc.
 
	git-svn-id: https://svn.d4science.research-infrastructures.eu/gcube/trunk/Common/gxREST@178786 82a268e6-3cf1-43bd-a215-b396298e98cf

Add code-repo’s Git repository as new remote

Note that, before running this step, you must create the new Git repository on code-repo.

> git remote add origin https://code-repo.d4science.org/manuele.simi/gxRest.git
> git remote -v
origin    https://code-repo.d4science.org/manuele.simi/gxRest.git (fetch)
origin    https://code-repo.d4science.org/manuele.simi/gxRest.git (push)

Push the local repository to the new remote

> git push --set-upstream --force origin master
Counting objects: 168, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (125/125), done.
Writing objects: 100% (168/168), 39.47 KiB | 0 bytes/s, done.
Total 168 (delta 44), reused 0 (delta 0)
remote: Resolving deltas: 100% (44/44), done.
To https://code-repo.d4science.org/manuele.simi/gxRest.git
 + 4ec6b48...ac04855 master -> master (forced update)
Branch master set up to track remote branch master from origin.

Change the SCM Connection

The SCM section of the POM must be changed to reflect that we are now working with a Git repository.

For instance, the following section:

<scm>
    <connection>scm:svn:http://svn.d4science.research-infrastructures.eu/gcube/trunk/distributions/${project.artifactId}</connection>
    <developerConnection>scm:svn:https://svn.d4science.research-infrastructures.eu/gcube/trunk/distributions/${project.artifactId}</developerConnection>
    <url>http://svn.d4science.research-infrastructures.eu/gcube/trunk/distributions/${project.artifactId}</url>
</scm>

must be changed to:

<scm>
    <connection>scm:git:https://code-repo.d4science.org/gCubeSystem/${project.artifactId}.git</connection>
    <developerConnection>scm:git:https://code-repo.d4science.org/gCubeSystem/${project.artifactId}.git</developerConnection>
    <url>https://code-repo.d4science.org/gCubeSystem/${project.artifactId}</url>
</scm>

NB: This form works only when the ${project.artifactId} declared in the POM is equal to the name of the Git repository.

Otherwise, you must declare the Git repository URL explicitly, like this:

<scm>
    <connection>scm:git:[YOUR_GIT_REPOSITORY_URL].git</connection>
    <developerConnection>scm:git:[YOUR_GIT_REPOSITORY_URL].git</developerConnection>
    <url>[YOUR_GIT_REPOSITORY_URL]</url>
</scm>
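
After editing, a quick check can confirm that no SVN connection is left in the POM. The following is a minimal sketch; scm_points_to_git is a hypothetical helper, and it relies on plain text matching rather than XML parsing:

```shell
# Hypothetical helper: succeeds only if the POM declares a Git SCM
# connection and contains no leftover SVN one.
scm_points_to_git() {
    pom="${1:-pom.xml}"
    if grep -q 'scm:svn:' "$pom"; then
        echo "$pom still contains an scm:svn connection" >&2
        return 1
    fi
    grep -q '<connection>scm:git:' "$pom"
}

if [ -f pom.xml ]; then
    scm_points_to_git pom.xml && echo "SCM section points to Git"
fi
```

In a multi-module project, run the check on every POM that declares its own scm section.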


Back to the CI guide.