Migrating multiple SVN repos into a single Git repo?

I have multiple projects hosted OnDemand, A and B, each with its own Subversion repository.

A has an svn:externals reference to B, so checking out A gives:

A/B/a

A/B/b

A/c

A/d

etc

As part of the migration to Git on Bitbucket, I would like to combine the two Subversion repositories into a single Git repository (A.git), such that checking out A.git gives the same layout as SVN did.

Since we no longer need to check out B on its own, I don't see the reason to propagate the modules approach.

I need each SVN commit to remain as an identifiable single git commit.

What solution does Atlassian recommend?

2 answers

1 vote

Hi Richard,

I wouldn't go so far as to call this the Atlassian recommended approach, but I'll try to answer your question.

The Git feature that roughly maps to svn:externals (with some drawbacks) is submodules (http://www.kernel.org/pub/software/scm/git/docs/git-submodule.html). Submodules let you keep one Git repository as a subdirectory of another Git repository, each with its own commit history. Since you clearly stated that checking out B on its own is no longer required and that you don't want to propagate the module approach, git submodules are not necessarily required.
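For comparison, here is a minimal sketch of what the submodule route would look like. It uses two throwaway local repositories standing in for A and B; the paths, file names and demo identity are made up for illustration. The `protocol.file.allow=always` setting is only there because recent Git versions refuse local-path submodule clones by default, and it should not be needed with a real Bitbucket URL.

```shell
set -eu
work=$(mktemp -d)   # scratch area, hypothetical layout

# Demo identity so the commits work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for project B.
git init -q "$work/B"
echo b > "$work/B/b.txt"
git -C "$work/B" add b.txt
git -C "$work/B" commit -q -m "B: initial commit"

# Stand-in for project A.
git init -q "$work/A"
echo c > "$work/A/c.txt"
git -C "$work/A" add c.txt
git -C "$work/A" commit -q -m "A: initial commit"

# Register B as a submodule of A in the subdirectory ./B.
cd "$work/A"
git -c protocol.file.allow=always submodule add "$work/B" B
git commit -q -m "Add B as a submodule"

# A now records B's URL in .gitmodules and pins one specific commit of B.
git submodule status
```

Anyone cloning A afterwards would need `git clone --recursive` (or `git submodule update --init` after a plain clone) to get B's files, which is one of the drawbacks compared to svn:externals.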

One option that retains the history of commits is the subtree command that is available in git/contrib:

(https://github.com/git/git/blob/master/contrib/subtree/git-subtree.txt)

Note: Git installed via Homebrew on Mac OS X includes the `subtree` command automatically; otherwise you need to follow the installation instructions at https://github.com/git/git/blob/master/contrib/subtree/INSTALL.

The subtree command is only needed to create the merged history. After pushing the changes, other git clients will get the merged history without requiring this subcommand to be available. Further down I show how the same result can be achieved without having to install the subtree command.

This example will merge two git repositories into one, with the second ending up in a subdirectory of the first:

$> git clone git@bitbucket.org:ssaasen/git-pastiche.git
Cloning into 'git-pastiche'...
$> cd git-pastiche
$> ls -l
total 24
-rw-r--r--   1 ssaasen  staff   815  9 Feb 12:34 Makefile
-rw-r--r--   1 ssaasen  staff  1269  9 Feb 12:34 README.md
drwxr-xr-x  10 ssaasen  staff   340  9 Feb 12:34 bin
-rw-r--r--   1 ssaasen  staff  1214  9 Feb 12:34 build.hs
drwxr-xr-x  10 ssaasen  staff   340  9 Feb 12:34 man

$> git remote add spy -f git@bitbucket.org:ssaasen/spy.git
warning: no common commits
remote: Counting objects: 231, done.
remote: Compressing objects: 100% (20
...
$> git subtree add --prefix=spy spy/master
Added dir 'spy'

$> ls -l
total 24
-rw-r--r--   1 ssaasen  staff   815  9 Feb 12:34 Makefile
-rw-r--r--   1 ssaasen  staff  1269  9 Feb 12:34 README.md
drwxr-xr-x  10 ssaasen  staff   340  9 Feb 12:34 bin
-rw-r--r--   1 ssaasen  staff  1214  9 Feb 12:34 build.hs
drwxr-xr-x  10 ssaasen  staff   340  9 Feb 12:34 man
drwxr-xr-x  13 ssaasen  staff   442  9 Feb 12:35 spy <=====

This adds the project git@bitbucket.org:ssaasen/spy.git in the subdirectory ./spy with its history retained.

Pushing this combined repository to its new location will make the combined history available in a single repository.

There are a few caveats though:

  • Although commits from both sides are retained, showing per-file history does not return what you'd expect.
    E.g. `git log -- ./spy` only shows the merge commit but not the history of the files in the spy directory.
    Traversing the full history of either of the two repositories is still possible though.
  • Paths in the commits of the imported repository are still in the context of the original repository. E.g. `git show b303486^2` shows file changes as if they were made in the project root directory (which is true for the original repository) although the files are now in the ./spy subdirectory.
  • The new repository contains all the tags from both repositories which might cause conflicts.
  • The subdirectory only contains one branch of the merged repository ("spy/master" in the example above). Other branches of the merged repository are not easily available.
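For the tag caveat, one possible workaround (not part of the approach above, just a sketch) is to add the remote with `--no-tags`, so that fetching the second repository does not bring its tags along. The two repositories, their file names and the `v1.0` tag below are invented for the demo.

```shell
set -eu
work=$(mktemp -d)   # scratch area, hypothetical layout

# Demo identity so the commits work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Two unrelated repositories that both carry a tag named v1.0.
for name in A B; do
  git init -q "$work/$name"
  echo "$name" > "$work/$name/file-$name.txt"
  git -C "$work/$name" add .
  git -C "$work/$name" commit -q -m "$name: initial commit"
  git -C "$work/$name" tag v1.0
done

cd "$work/A"
tag_before=$(git rev-parse v1.0)

# --no-tags keeps B's tags out of A; -f fetches B's branches right away.
git remote add --no-tags -f spy "$work/B"

tag_after=$(git rev-parse v1.0)
git tag -l
```

With this, `v1.0` in the combined repository still refers to A's commit. If B's tags are needed later, they would have to be fetched under a different name or namespace, which this sketch does not cover.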

If it's important to be able to reference not only the commits of the merged repository but also its branches and tags, git submodules might be a better option for your use case, even though you wanted to move away from a module approach.

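The same result can be achieved using subtree merging directly, without the subtree command installed (on Git 2.9 and later, add `--allow-unrelated-histories` to the `git merge` step, because the two repositories share no common commits):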

$> git clone git@bitbucket.org:ssaasen/git-pastiche.git
$> cd git-pastiche/
$> git remote add spy -f git@bitbucket.org:ssaasen/spy.git
$> git merge -s ours --no-commit spy/master
$> git read-tree --prefix=spy/ -u spy/master
$> git commit -m "Add ./spy as a subtree merging git@bitbucket.org:ssaasen/spy.git"
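To try these steps without touching a real repository, here is a self-contained sketch that runs them against two throwaway local repositories. The repository names mirror the example, but the files, commit messages and demo identity are made up. It passes `--allow-unrelated-histories` to the merge (needed on Git 2.9 and later for repositories with no shared commits) and writes the prefix with a trailing slash.

```shell
set -eu
work=$(mktemp -d)   # scratch area, hypothetical layout

# Demo identity so the commits work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build two unrelated repositories with two commits each.
for name in pastiche spy; do
  git init -q "$work/$name"
  cd "$work/$name"
  git symbolic-ref HEAD refs/heads/master   # fixed branch name for the demo
  for n in 1 2; do
    echo "$name $n" > "notes-$n.txt"
    git add "notes-$n.txt"
    git commit -q -m "$name: commit $n"
  done
done

# Subtree merge of spy into pastiche, following the steps above.
cd "$work/pastiche"
git remote add -f spy "$work/spy"
git merge -s ours --no-commit --allow-unrelated-histories spy/master
git read-tree --prefix=spy/ -u spy/master
git commit -q -m "Add ./spy as a subtree"

# Count commits: whole history, history limited to ./spy, imported branch.
total=$(git rev-list --count HEAD)
under_spy=$(git rev-list --count HEAD -- spy)
imported=$(git rev-list --count spy/master)
echo "total=$total under_spy=$under_spy imported=$imported"
```

The counts illustrate the first caveat: every original commit is still reachable from the merged history, but limiting the log to `./spy` only finds the merge commit, while `git log spy/master` walks the imported commits.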

Hope this helps.

Cheers,

Stefan

0 votes
Gary Sackett Atlassian Team Feb 08, 2013

Hi Richard,

The SVN OnDemand structure actually consists of one repo separated by directories, so in theory you should be able to migrate directly to Git. For any SVN externals, we recommend using submodules.

Cheers,

Gary
