Migrating from SVN
Workflow
If you have experience with ATLAS' previous source code system, Subversion, and its associated workflow (SVN repository layout, package tags, Tag Collector, projects, etc.), then git can seem intimidatingly different at first. Don't worry, it will pass... and most people do end up really appreciating git's power and flexibility.
Here we provide a guide to thinking like a git (pun intended) when you work with ATLAS code. It maps the old workflow onto the new one as far as possible and explains why the new workflow is the way it is.
We go through the workflow top to bottom, from a developer's point of view:
Repository preamble
In SVN there was a single repository and everyone worked from it. This was inflexible: committing code to the main repository was the only general way to let others look at it or test it.
With git we use personal or group copies (forks) of the main repository (aka the upstream repository). So there is an extra step here, which is to take your own fork of the main repository. However, this only needs to be done once: the cost is small and the benefits are great.
Get a copy of the code locally to work on
Assuming that your basic account was set up correctly, in SVN developers would check out code directly from the single SVN master:
svn co $SVNOFF/Tools/PyJobTransforms/trunk Tools/PyJobTransforms
(Of course, as that was tedious, svnco was used as a wrapper.)
Instead with git you clone the repository:
git clone https://:@gitlab.cern.ch:8443/YOUR_USER_NAME/athena.git
cd athena
git remote add upstream https://:@gitlab.cern.ch:8443/atlas/athena.git
The second git command makes sure your local repository knows about the main repository; it's important to be able to synchronise directly with changes there.
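The fork-plus-upstream layout can be tried out without touching GitLab. The following is only a sketch: it uses throwaway local repositories in a temporary directory as stand-ins for the main repository and for your fork, and the branch name `main`, the file contents and the directory names are assumptions made for illustration.

```shell
set -eu
sandbox="$(mktemp -d)"
# Fixed identity so commits work on a machine with no git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

# Stand-in for the main repository: one commit on a branch called "main".
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/main
mkdir -p Tools/PyJobTransforms
echo "# transform code" > Tools/PyJobTransforms/trf.py
git add Tools
git commit -q -m "Initial import"
git clone -q --bare "$sandbox/seed" "$sandbox/upstream.git"

# Stand-in for your fork: a server-side copy of the main repository.
git clone -q --bare "$sandbox/upstream.git" "$sandbox/fork.git"

# The developer's steps: clone the fork, then register the main repository.
git clone -q "$sandbox/fork.git" "$sandbox/athena"
cd "$sandbox/athena"
git remote add upstream "$sandbox/upstream.git"
git fetch -q upstream

git remote -v    # lists both "origin" (the fork) and "upstream" (main repo)
git branch -r    # remote-tracking branches from both remotes
```

After this, `origin` is where you push your own work and `upstream` is where you fetch everyone else's changes from.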
Tip
The git checkout is a fully functioning repository on its own - so many operations like diffing code, searching or viewing history become extremely fast. The default checkout is also of all packages, so you can see the whole of the code at once (and, of course, if you don't want that then just use a sparse checkout).
Danger
Making a local copy of the repository does take up more space - about 200MB for a sparse checkout, 800MB for a complete one...
Tips
...however, it is possible to keep a lot of development fed from one checkout by using branches and fetching updates - so reuse it.
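For readers who have not met sparse checkouts, here is a minimal sketch using plain git's `core.sparseCheckout` setting on a throwaway repository. It is not necessarily how ATLAS tooling sets up a sparse checkout; the second package, `Control/AthenaCommon`, and the file names are stand-ins chosen for illustration.

```shell
set -eu
sandbox="$(mktemp -d)"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

# Stand-in repository containing two packages.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
mkdir -p Tools/PyJobTransforms Control/AthenaCommon
echo "# transform code" > Tools/PyJobTransforms/trf.py
echo "# common code" > Control/AthenaCommon/common.py
git add Tools Control
git commit -q -m "Initial import"

git clone -q "$sandbox/seed" "$sandbox/athena"
cd "$sandbox/athena"

# Keep only one package in the working directory.
git config core.sparseCheckout true
mkdir -p .git/info
echo "Tools/PyJobTransforms/" > .git/info/sparse-checkout
git read-tree -mu HEAD    # re-apply the checkout using the sparse patterns

ls Tools                                    # only PyJobTransforms is on disk
git log --oneline -- Control/AthenaCommon   # history is still fully available
```

Only the working directory shrinks; the repository itself still holds the full history of every package.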
Develop new code
Actual code development is pretty independent of the SCM that ATLAS uses. However, a fully functioning local repository gives you more options than an SVN working copy did.
Tip
You can save local snapshots of your code as a work in progress as often as you like (just git commit whenever). You can even use side branches of your main topic branch for isolating different sub-developments.
You can also trivially diff your code against any version in the repository. This is really powerful when you do incremental cycles of local development and testing.
In git you also create a branch for your development at this point, which is equivalent to a set of package tags plus a Tag Collector bundle. However, it's much easier and faster to do:
git fetch upstream # Sync with latest changes in main repo
git checkout -b my-new-development upstream/[PARENT_BRANCH] --no-track
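To see topic branches, work-in-progress snapshots and side branches together, here is a sketch on a throwaway repository. The parent branch name `main`, the side branch name and the file contents are assumptions for illustration, and for brevity the stand-in main repository is cloned directly with its remote named `upstream`.

```shell
set -eu
sandbox="$(mktemp -d)"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

# Stand-in for the main repository with a parent branch called "main".
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/main
mkdir -p Tools/PyJobTransforms
echo "# transform code" > Tools/PyJobTransforms/trf.py
git add Tools
git commit -q -m "Initial import"
git clone -q --bare "$sandbox/seed" "$sandbox/upstream.git"

# Shortcut for the sketch: clone it directly, naming the remote "upstream".
git clone -q -o upstream "$sandbox/upstream.git" "$sandbox/athena"
cd "$sandbox/athena"

# Start a topic branch from the latest state of the parent branch.
git fetch -q upstream
git checkout -q -b my-new-development upstream/main --no-track

# Snapshot work in progress as often as you like.
echo "# step 1" >> Tools/PyJobTransforms/trf.py
git commit -q -a -m "WIP: first step"

# Isolate an experiment on a side branch, then return to the topic branch.
git checkout -q -b my-new-development-experiment
echo "# experiment" >> Tools/PyJobTransforms/trf.py
git commit -q -a -m "WIP: try an alternative"
git checkout -q my-new-development

git diff --stat upstream/main           # what changed relative to the parent
git log --oneline upstream/main..HEAD   # commits made on the topic branch
```

None of these commits leaves your machine until you push, so they cost nothing to make and can be tidied up later.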
Commit your code
svn commit ...
becomes
git add ...
git commit ...
git push -u origin ...
Why the difference? Git distinguishes between your local repository and the GitLab one, so you can keep everything local (git commit) until it's fully ready (git push). Git also has the concept of a staging area, which is why git add is needed. (Although there are more steps, git gives you a lot more flexibility.)
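The three steps can be followed on a throwaway repository. This is a sketch only: a local bare repository stands in for your GitLab fork, and the file names and commit message are made up for illustration.

```shell
set -eu
sandbox="$(mktemp -d)"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

# Stand-in for your fork on GitLab ("origin"), seeded with two files.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
mkdir -p Tools/PyJobTransforms
echo "# transform code" > Tools/PyJobTransforms/trf.py
echo "# helper code" > Tools/PyJobTransforms/util.py
git add Tools
git commit -q -m "Initial import"
git clone -q --bare "$sandbox/seed" "$sandbox/fork.git"

git clone -q "$sandbox/fork.git" "$sandbox/athena"
cd "$sandbox/athena"
git checkout -q -b my-new-development

# Edit two files, but stage only the one that is ready.
echo "# finished fix" >> Tools/PyJobTransforms/trf.py
echo "# half-done idea" >> Tools/PyJobTransforms/util.py
git add Tools/PyJobTransforms/trf.py
git status --short    # "M " = staged, " M" = modified but not staged

# The commit is purely local and records only what was staged.
git commit -q -m "Fix transform"

# Publishing to the fork is a separate, explicit step.
git push -q -u origin my-new-development
```

The half-finished edit to `util.py` stays in your working directory, out of both the commit and the push.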
Conflict resolution
In SVN updating to HEAD is pretty easy:
svn update
In git the process is just to apply the changes from the upstream branch onto your topic branch:
git fetch upstream # Get all upstream changes
# (does not touch your checkout!)
git merge upstream/[PARENT_BRANCH]
If there is a line-by-line conflict you must resolve it by hand. Git has much more powerful facilities for rolling back out of a failed merge (see git status during the merge).
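As an illustration of backing out of a failed merge, this sketch provokes a conflict on a throwaway repository and then aborts the merge. The branch name `main`, the file `config.py` and its contents are invented for the example.

```shell
set -eu
sandbox="$(mktemp -d)"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

# Stand-in for the main repository.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/main
echo "threshold = 1" > config.py
git add config.py
git commit -q -m "Initial import"
git clone -q --bare "$sandbox/seed" "$sandbox/upstream.git"

# Your checkout, with a topic branch that edits the line.
git clone -q -o upstream "$sandbox/upstream.git" "$sandbox/athena"
cd "$sandbox/athena"
git checkout -q -b my-new-development upstream/main --no-track
echo "threshold = 2" > config.py
git commit -q -a -m "Raise threshold to 2"

# Meanwhile someone else changes the same line in the main repository.
cd "$sandbox/seed"
echo "threshold = 3" > config.py
git commit -q -a -m "Raise threshold to 3"
git push -q "$sandbox/upstream.git" main

# Back in your checkout: fetch, then try to merge.
cd "$sandbox/athena"
git fetch -q upstream
if git merge --no-edit upstream/main; then
    echo "Merged cleanly"
else
    git status --short    # "UU" marks a file with unresolved conflicts
    git merge --abort     # back out: the checkout returns to its pre-merge state
fi
```

To resolve the conflict instead of aborting, you would edit `config.py` to the intended content, then run `git add config.py` and `git commit`.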
Tagging
svn cp -r 123456 $SVNOFF/Tools/PyJobTransforms/trunk $SVNOFF/Tools/PyJobTransforms/tags/PyJobTransforms-01-02-03
becomes... nothing at all, because the change is already uniquely identified in git by the commits made during development and by the topic branch that was used.
Info
Just creating a branch in git is easier, faster and less error prone than the fiddly SVN equivalent.
Requesting a tag
In the old workflow a developer would request that a package tag be incorporated into a release by either
- Using the Tag Collector,
- Sending an email to the release coordinators.
In git this is done from GitLab by just clicking on the Merge Request
button in the GitLab web page. Handling merge requests in GitLab is
vastly superior, not just because the user interface is a thing of joy
compared to Tag Collector:
- Code discussion can be far more detailed (line by line discussions in the diff are possible)
- The code can be updated to address points of concern, before acceptance - we can finally review code properly
- If the code is rejected the main repository was never polluted by it
- Discussion and outcome are archived for history
Tag bundles
When packages needed to be changed in concert they needed to be handled specially in Tag Collector using a bundle. This was an awkward step and caused a lot of problems, especially if packages needed to be swept from one release to another as the bundle identity was lost when the bundle was accepted.
However, in git, a merge request will encapsulate coherently all changes, regardless of package boundaries. The merge request can be coherently cherry picked between release branches as well. (In fact, one can think of a topic branch in git as being inherently like a bundle, implemented correctly.)
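As a rough sketch of why a merged topic branch behaves like a bundle, the following uses a throwaway repository in which a change spanning two packages is merged into one branch and then cherry-picked as a unit onto another. The branch names `main` and `21.0`, the package names and the local `git merge --no-ff` standing in for an accepted merge request are all assumptions for illustration; how release coordinators actually sweep changes may differ.

```shell
set -eu
sandbox="$(mktemp -d)"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

git init -q "$sandbox/athena"
cd "$sandbox/athena"
git symbolic-ref HEAD refs/heads/main
mkdir -p PackageA PackageB
echo "a = 1" > PackageA/a.py
echo "b = 1" > PackageB/b.py
git add PackageA PackageB
git commit -q -m "Initial import"
git branch 21.0    # hypothetical release branch

# One topic branch carrying coordinated changes to two packages.
git checkout -q -b my-coherent-change
echo "a = 2" > PackageA/a.py
git commit -q -a -m "PackageA: new interface"
echo "b = 2" > PackageB/b.py
git commit -q -a -m "PackageB: use new interface"

# Stand-in for the merge request being accepted.
git checkout -q main
git merge -q --no-ff -m "Merge branch 'my-coherent-change'" my-coherent-change
merge_commit="$(git rev-parse HEAD)"

# Carry the whole change to the release branch in one step.
git checkout -q 21.0
git cherry-pick -m 1 "$merge_commit"    # -m 1: diff against the first parent
git show --stat --format=%s HEAD        # one commit touching both packages
```

Because the merge commit records the whole topic branch, both packages travel together and the grouping is never lost.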
Other Issues
Where are my package tags...?
Probably the single most unsettling thing for developers used to the old workflow is the lack of package tags. These were used in SVN to snapshot a particular version of a particular package, then collected by Tag Collector to assemble a release.
We already discussed the superiority of GitLab merge requests to a Tag Collector workflow, but people will probably still ask: where are my package tags?
- Git does have tags, however they are tags that snapshot the
state of the entire repository. But this is a good thing
because it's trivial to look at the code differences between
releases without needing to go via Tag Collector:
git diff release/21.0.1..release/21.0.8
This probably shows more changes than you want, so give a path to only see the changes for a package, e.g.,
git diff release/21.0.1..release/21.0.8 Tools/PyJobTransforms
- We will also make lightweight tags per nightly so you can
see changes between any two nightly builds, e.g.,
git diff nightly/21.0/2016-12-01T2230..nightly/21.0/2016-12-07T2230 Tools/PyJobTransforms
- Further, you can use git log PATH to see only commits that affected a particular PATH (which can be a package, but more powerfully can be single files or domain areas). Then follow up with a git show to see the commit log and diff for a particular commit.
- Finally, do remember that git can also diff or log between
any arbitrary commit ids and any developer can make their
own private tags between significant development points, e.g.,
git tag my_package/some_label
(These tags will not be in the main repository, although other developers can get a copy of them if they add your private fork as a remote.)
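The commands in the list above can be tried on a throwaway repository. In this sketch the two release tags are created locally as stand-ins for real release tags, and the second package, `Control/AthenaCommon`, and all file contents are invented for illustration.

```shell
set -eu
sandbox="$(mktemp -d)"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

git init -q "$sandbox/athena"
cd "$sandbox/athena"
mkdir -p Tools/PyJobTransforms Control/AthenaCommon
echo "# transform code" > Tools/PyJobTransforms/trf.py
echo "# common code" > Control/AthenaCommon/common.py
git add Tools Control
git commit -q -m "Initial import"
git tag release/21.0.1    # stand-in for a release tag

echo "# fix" >> Tools/PyJobTransforms/trf.py
git commit -q -a -m "PyJobTransforms: fix"
echo "# tweak" >> Control/AthenaCommon/common.py
git commit -q -a -m "AthenaCommon: tweak"
git tag release/21.0.8    # stand-in for a later release tag

# Everything that changed between the two releases...
git diff --stat release/21.0.1..release/21.0.8
# ...and only what changed in one package.
git diff --stat release/21.0.1..release/21.0.8 -- Tools/PyJobTransforms

# Commits that touched one package, then the detail of the latest commit.
git log --oneline -- Tools/PyJobTransforms
git show --stat HEAD

# A private tag marking a significant point in your own development.
git tag my_package/some_label
git tag -l
```

The `--` before the path is optional here; it just tells git unambiguously that what follows is a path and not a revision.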
We have a collection of hints that will provide more recipes that can be used.
svnpull.py
The import of ATLAS code from SVN was only made for SVN tags that were part of a release (or for dev, at least validated). If there is code you need to take from SVN into git then you can use the svnpull.py script to achieve that.
Usage is very simple: giving a package name will import the current SVN trunk; giving a package tag will import that tag. Use svnpull.py --help for full usage, which allows you to import any SVN path into git, also restricting to only some files or using an arbitrary SVN revision.
The script will only copy the code from SVN into your git checkout. It is then up to you to add and commit when you have reviewed the change and are ready to make a merge request.
Some very important notes when importing from SVN:
- Manually remove the cmt directory, or at least remove the files cmt/Makefile.RootCore and cmt/requirements
- Mention the tag you pulled in inside your git commit message. This is needed for us to check whether all tags in the SVN based release have made it into the git based release.
- Copy the relevant sections of the ChangeLog into the git commit message.
- When pulling multiple packages make a separate commit for every package. They can be in a single merge request, but make them separate commits.
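The notes above could translate into something like the following sketch. It does not run svnpull.py: the helper `fake_import` just creates files by hand as a stand-in for what the script would leave in your checkout, and `commit_package`, the second package tag and the ChangeLog placeholder text are hypothetical names used only for illustration.

```shell
set -eu
sandbox="$(mktemp -d)"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

git init -q "$sandbox/athena"
cd "$sandbox/athena"

# Hypothetical stand-in for the files an SVN import leaves in the checkout.
fake_import() {    # usage: fake_import PACKAGE_PATH
    mkdir -p "$1/cmt" "$1/python"
    echo "package $(basename "$1")" > "$1/cmt/requirements"
    echo "# code" > "$1/python/__init__.py"
}

# Drop the cmt directory, then make one commit for this package that names
# the SVN tag and carries the relevant ChangeLog text.
commit_package() {    # usage: commit_package PACKAGE_PATH SVN_TAG CHANGELOG_TEXT
    rm -rf "$1/cmt"
    git add "$1"
    git commit -q -m "Import $2 from SVN" -m "$3"
}

fake_import Tools/PyJobTransforms
fake_import Tools/PyJobTransformsCore

commit_package Tools/PyJobTransforms PyJobTransforms-01-02-03 \
    "ChangeLog: (paste the relevant entries here)"
commit_package Tools/PyJobTransformsCore PyJobTransformsCore-01-00-00 \
    "ChangeLog: (paste the relevant entries here)"

git log --format=%s    # one commit per package, each naming its SVN tag
```

Both commits can then go into a single merge request, while each package's SVN tag stays individually traceable in the history.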
Additional notes
svnpull.py has been added to lsetup, but it requires python 2.7, so run lsetup git python. The code is maintained on the svnpull branch in this GitLab repo.
Cheat Sheet
This table tries to summarise what was described above. However, it's just not possible to map a git command to an SVN one and plug this into the new workflow (which was possible with the CVS to SVN migration), because SVN and git are just not two beasts from the same stable. So, instead, we try to map some of the key concepts and give commands or procedures that are similar.
It's also useful to keep our git cheat sheet close at hand.
To... | SVN | git |
---|---|---|
Checkout code | svn co | git clone |
Show differences | svn diff | git diff |
Update | svn update | git fetch; git merge |
Commit code | svn ci | git add; git commit; git push |
Identify code patch for a release | SVN package tag: svn cp | git topic branch: git checkout -b new_branch_name |
Request tags | Tag Collector or Email | GitLab merge request |
Tag bundles | Tag Collector | unnecessary |