InfoQ

Article: Distributed Version Control Systems - a guide

Posted by Niclas Nilsson on May 09, 2008

Since Linus Torvalds's presentation at Google about git in May 2007, adoption of and interest in Distributed Version Control Systems have been rising steadily. In this article, Sebastien Auvray introduces the concept of Distributed Version Control, looks at when to use it and why it may be better than what you're currently using, and examines three players in the area: git, Mercurial and Bazaar.

Sebastien starts out by comparing Distributed Version Control Systems with centralized VCSs:

Or a more precise question: Why Central VCS (and notably Subversion) are not satisfying? Several things are blamed on Subversion:

* Major reason is that branching is easy but merging is a pain (but one doesn't go without the other). And it's likely that any consequent project you'll work on will need easy gymnastic with splits, dev, test branches. Subversion has no History-aware merge capability, forcing its users to manually track exactly which revisions have been merged between branches making it error-prone.
* No way to push changes to another user (without submitting to the Central Server).
* Subversion fails to merge changes when files or directories are renamed.
* The trunk/tags/branches convention can be considered misleading.
* Offline commits are not possible.
* .svn files pollute your local directories.
* svn:external can be harmful to handle.
* Performance

After that, Sebastien moves on to describe how DVCSs work and what the open source options are, before coming to an excellent comparison of git, Mercurial and Bazaar. The comparison covers features, model, web access, integration, performance, hosting options and more.

If you’re having trouble deciding which DVCS to pick, this is the article for you.

28 comments

Issue Tracker Integration for git by Roman Heinrich Posted May 7, 2008 6:20 AM
Excellent article by Surya De Posted May 7, 2008 3:07 PM
Kind of.... by chris songer Posted May 7, 2008 4:01 PM
Re: Kind of.... by Sebastien Auvray Posted May 7, 2008 6:20 PM
Re: Kind of.... by sax maniac Posted Nov 25, 2009 3:18 AM
Outstanding breakdown by Kurt Christensen Posted May 7, 2008 4:21 PM
my 2c by Bela Babik Posted May 7, 2008 8:17 PM
BitKeeper by Robert Sullivan Posted May 7, 2008 10:00 PM
SVK - Subversion decentralized - http://svk.bestpractical.com/view/HomePage by Ludolph Neethling Posted May 8, 2008 2:19 AM
Rebase plugin available for Bazaar by Jelmer Vernooij Posted May 8, 2008 4:28 AM
Re: Rebase plugin available for Bazaar by Sebastien Auvray Posted May 8, 2008 4:46 AM
Top mark against Subversion going away by Ray Davis Posted May 8, 2008 5:07 PM
git repository size by dhamma vicaya Posted May 9, 2008 2:05 AM
SVK by Jean Seurin Posted May 9, 2008 7:00 AM
Usability/colour impairment by Damien Warman Posted May 10, 2008 6:10 AM
Article Updated by Sebastien Auvray Posted May 12, 2008 2:28 PM
Good but... by David H. Posted May 15, 2008 3:41 PM
Choices by Bruno Vernay Posted May 20, 2008 5:38 AM
Update on Bazaar performance by Ian Clatworthy Posted May 23, 2008 8:22 AM
Branches in Hg are supported by Stepan Koltsov Posted May 26, 2008 3:50 PM
MySQL uses Bazaar by Robin Stocker Posted Jul 6, 2008 4:35 PM
history model by bgeron bgeron Posted Sep 21, 2008 12:04 PM
Re: history model by Bhaskar Rimal Posted Nov 28, 2008 6:09 AM
Re: history model by Sebastien Auvray Posted Dec 22, 2008 4:26 PM
Re: history model by bhaskar rimal Posted Jan 3, 2009 7:26 AM
Revision naming in Mercurial by Dirkjan Ochtman Posted Jan 19, 2009 5:31 AM
Re: Revision naming in Mercurial by Mark Anderson Posted Jan 27, 2009 11:13 AM
You forgot git-bisect. by John Q. Public Posted May 19, 2009 1:22 AM
  1. Issue Tracker Integration for git

    May 7, 2008 6:20 AM by Roman Heinrich

    Git integrates well with Redmine (www.redmine.org/), a Rails-based Issue Tracker. Just had to add this one ;)

  2. Excellent article

    May 7, 2008 3:07 PM by Surya De

    I thoroughly enjoyed this. And I learned a lot from this as well. Somehow our switch from CVS to Subversion makes less sense now after reading this.

  3. Kind of....

    May 7, 2008 4:01 PM by chris songer

    The problem with this analysis, and those like it, is that they assume SVN as the state of the art. SVN is the state of the "free" art, but it is missing a lot of features that Perforce offers.

    Now Perforce is pretty costly, but when broadly comparing "centralized vs. decentralized" it's probably not quite on target to use the "best free" rather than the "best." Many of the issues cited (and thereby implied to be issues with centralized solutions)

    For example, Perforce does not pollute your directories with control files. Perforce does a fantastic job of maintaining merges between branches and keeping merge history.

    Indeed, most of the arguments put forward against "central" tend to be limitations in the SVN implementation of centralized SCM rather than limitations in the state of the art.

  4. Outstanding breakdown

    May 7, 2008 4:21 PM by Kurt Christensen

    Dude, this was a very, very nice article. Thanks for the rollup.

  5. Re: Kind of....

    May 7, 2008 6:20 PM by Sebastien Auvray

    Hi Chris,
    I agree with you that Perforce has some good points that SVN does not support. But as you said, 1) you need to pay for it, and 2) you really need to take great care of your production scalability (as with any central server...), otherwise it becomes a big bottleneck and you're blocked at each branch creation... Some editors, like IDEA, add some intelligence to that, such as an offline mode.

  6. my 2c

    May 7, 2008 8:17 PM by Bela Babik

    DVCSs are young, but very capable. They are tools rather than systems for managing the code. Users need to come up with their own workflow, and that's what a lot of them do not get.

    It can be frustrating when someone tries out one of them and it seems complicated after CVS/SVN. The benefits are not so obvious (what are the benefits for a developer who has never done any branching and merging in CVS/SVN?).


    > Mercurial SLOC (without Test src)
    Mercurial's core is python + c only (there are 4 c files).

    What I don't like about Git:
    - it is a mess: C + Perl + bash
    - my environment gets heavily polluted by it (at least in msys git, there is a huge number of aliases)
    - the native Windows port is really far from being production ready. (I cannot even use it from behind a firewall.)

    What I don't like about Mercurial:
    - tree handling is not really good. It's not an easy call to implement it right, but it is important.
    - editing of the history is considered harmful and not really supported (can be done though through extensions, but not as nice as in git)

    What I like about Mercurial:
    - hgbook is awesome
    - easy to extend (the included extensions are good starting point)
    - clean python
    - very flexible, you can build nice workflows around it
    - mq is evil but very handy once you understand what it is doing and how

  7. BitKeeper

    May 7, 2008 10:00 PM by Robert Sullivan

    Great article, nice comparison & research. I've been very interested in BitKeeper and git since reading about the controversy with Linux. Very cool idea, and if anyone's interested there is some very good doc out there explaining how BitKeeper works.

    And I find it amazing that when Larry McVoy pulled the plug on the Linux licenses, Linus (or someone) shook git out of their sleeve in a short time. Impressive.

    Now that said, Torvalds has an incredible knack for making incredibly insulting comments. SVN is pointless? When 59% of git users also use SVN or CVS? And anyone who uses CVS is stupid - uh - this coming from someone who didn't use *any* version control until forced to do so because it was impacting the work on Linux? Now that was nuts. Now, SCM is a given, CVS is probably one of the granddaddies, we stand on the shoulders of giants. And SVN's aim was to fix issues with CVS, nothing more. Isn't git pretty much a clone of BitKeeper, like Linux is a clone of Unix? Hopefully for Linus, the successors to Linux will have a more admirable view of their forebears.

  8. SVK - Subversion decentralized - http://svk.bestpractical.com/view/HomePage

    May 8, 2008 2:19 AM by Ludolph Neethling

    Thanks for the informative article. Another option for a DVCS may be SVK. I haven't tried it myself, so feedback or a review of SVK would be appreciated.

    Copied from Wikipedia:

    "SVK (also written svk) is a decentralized version control system written in Perl, with a hierarchical distributed design comparable to centralized deployment of BitKeeper and GNU arch.

    SVK uses the Subversion filesystem but provides additional features:

    * Offline operations like checkin, log, merge.
    * Distributed branches.
    * Lightweight checkout copy management (no .svn directories).
    * Advanced merge algorithms, like star-merge and cherry picking.
    * Changeset signing and verification.
    * Can mirror and operate on Subversion, Perforce and CVS repositories."

    I do think it addresses some of the problems of Subversion, but since I haven't used it I don't know what new problems it creates.

    Regards,

  9. Rebase plugin available for Bazaar

    May 8, 2008 4:28 AM by Jelmer Vernooij

    Your overview lists Bazaar as not having support for Rebase and queues. However, there is a rebase plugin available for Bazaar (bazaar-vcs.org/Rebase) and a queues one (launchpad.net/bzr-loom).

  10. Re: Rebase plugin available for Bazaar

    May 8, 2008 4:46 AM by Sebastien Auvray

    Hi Jelmer,

    You're right, I'll update this asap.

    I'll also take into consideration some interesting remarks from reddit.

    Thanks.

  11. Top mark against Subversion going away

    May 8, 2008 5:07 PM by Ray Davis

    Nice overview, thanks. One thing, though -- the "major reason" you list for moving from Subversion is the difficulty of merging, but history-aware merge is the major improvement being delivered in Subversion 1.5. That's mentioned in the pages you link to at the end of your article, but given the importance attached to the feature, I thought it might be worth calling out.

  12. git repository size

    May 9, 2008 2:05 AM by dhamma vicaya

    git gc by default uses a conservative window size to save memory. For relatively large imports from foreign repositories you should run:

    git repack -a -f -d --window=100 --depth=100

    a couple of times, until the repository doesn't get any smaller.
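
The repeat-until-stable advice above can be sketched as a small script. This is only an illustration: the helper names (`repo_size`, `repack_once`, `repack_until_stable`) and the `max_rounds` cap are assumptions, not part of git. Only the `git repack` flags come from the comment.

```python
import subprocess
from pathlib import Path


def repo_size(git_dir):
    """Total size in bytes of all files under git_dir (hypothetical helper)."""
    return sum(p.stat().st_size for p in Path(git_dir).rglob("*") if p.is_file())


def repack_once(repo):
    """Run the repack command quoted in the comment, then measure .git."""
    subprocess.run(
        ["git", "repack", "-a", "-f", "-d", "--window=100", "--depth=100"],
        cwd=repo,
        check=True,
    )
    return repo_size(Path(repo) / ".git")


def repack_until_stable(repo, step=repack_once, max_rounds=5):
    """Repeat `step` until the size stops shrinking or max_rounds is reached.

    `max_rounds` is an assumed safety cap. Returns the sizes seen per round.
    """
    sizes = [step(repo)]
    while len(sizes) < max_rounds:
        size = step(repo)
        sizes.append(size)
        if size >= sizes[-2]:  # no further shrinkage: stop
            break
    return sizes
```

Calling `repack_until_stable("/path/to/repo")` would return one size per round. It assumes a non-bare repository with a `.git` directory and `git` available on the PATH. The `step` parameter exists only so the loop can be exercised without touching a real repository.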

  13. SVK

    May 9, 2008 7:00 AM by Jean Seurin

    Being a novice in the matter, I'd be very interested in having some feedback on how SVK compares to the DVCSs mentioned.
    I had identified it as a potential solution to my SVN problems, mainly off-line commits and .svn files.

    I'm very happy for the new perspectives the article gave me. Thanks for the hard work!

  14. Usability/colour impairment

    May 10, 2008 6:10 AM by Damien Warman

    This otherwise rather interesting article is for me and for more than 10% of readers fatally undermined by the choice to use generic icons differing only in red/green colour choice in the first comparison table. Simple check/dash/x symbology, or even a tooltip, would dramatically increase its usefulness.

  15. Article Updated

    May 12, 2008 2:28 PM by Sebastien Auvray

    [Article updated on 20080512 according to the comments here and from Ian Clatworthy and reddit]:

    * Bzr plugins and Windows GUI added: rebase, ..., Wildcat BZR, ...
    * Hg Shelve added.
    * SLOC for Hg updated (HTML doc used to be counted; I kept contrib, which is responsible for the presence of Lisp and Tcl/Tk).
    * Repository size for git updated after running the proper repack command (git repack -a -f -d --window=100 --depth=100 until size becomes constant), thanks to the comment by dhamma vicaya.

  16. Good but...

    May 15, 2008 3:41 PM by David H.

    Hello.
    A nice article, but I am a bit disappointed. First of all, there is no mention of www.monotone.ca/, which is a well-recognised DVCS nowadays. It also fails to mention that git is traditionally fastest because, on Linux, good old Linus Torvalds is using all the low-level filesystem tricks you can possibly think of.

  17. Choices

    May 20, 2008 5:38 AM by Bruno Vernay

    @Perforce : That's why Open Source (and free beer) is relevant : you get considered.


    Developers' choices are based not on which tool is best but on what their community is using, as the author noticed:

    it seems as if some choices have emerged based on the language used by the communities: Java / Sun related developments seem to be interested more in Mercurial while C / Linux / Ruby / Rails related projects are attracted by git.

    But overall, the point is that your SCM tool should support your workflow and processes. It may be easier to change the tool than it is to change the processes.

  18. Update on Bazaar performance

    May 23, 2008 8:22 AM by Ian Clatworthy

    My measurements have 'bzr clone master feature-1' coming in around 22 secs. A patch is available to reduce this to 16 secs. See this email for further details.


    If you want to save further time and space when cloning in Bazaar, use the --hardlink option. It cuts the time to 11.2 secs (vs 11.1 secs for git on my computer) and reduces space usage across the working trees, which is where most of the disk space gets consumed in tools as efficient at historical storage as these.

  19. Branches in Hg are supported

    May 26, 2008 3:50 PM by Stepan Koltsov

    Seems like branches are supported in hg:

    hgbook.red-bean.com/hgbookch8.html

  20. MySQL uses Bazaar

    Jul 6, 2008 4:35 PM by Robin Stocker

    There's now a big project which decided to use Bazaar, namely MySQL. Here's the announcement.

  21. history model

    Sep 21, 2008 12:04 PM by bgeron bgeron

    Sorry, but the history model for Mercurial is the same as for Git and Bazaar. A changeset/changegroup/commit/revision is in all systems a name for a snapshot. :)

  22. Re: history model

    Nov 28, 2008 6:09 AM by Bhaskar Rimal

    This is a very impressive article and a good analysis. Could anyone explain DVCS with UML diagrams of each process and step, so that I can understand all its micro-processes visually?

  23. Re: history model

    Dec 22, 2008 4:26 PM by Sebastien Auvray

    Hi Bhaskar,
    It's very difficult to gather information on the micro-processes of the various VCSs available. I can only recommend Scott Chacon's presentation about Git at RailsConf and its slides.

  24. Re: history model

    Jan 3, 2009 7:26 AM by bhaskar rimal

    Thank you very much

  25. Revision naming in Mercurial

    Jan 19, 2009 5:31 AM by Dirkjan Ochtman

    There's been some confusion among people who think Mercurial doesn't use SHA1 for revision identification, because this article suggests its naming is simpler. While we gladly accept the notion that our revision identification scheme is simpler to use than that of other DVCSs, it's still fundamentally based on SHA1 hashes. Please note this in the table, if you can.

  26. Re: Revision naming in Mercurial

    Jan 27, 2009 11:13 AM by Mark Anderson

    As much as I appreciated articles like this in the past, I must say we did ourselves a favor by skipping the interim solution of going from CVS and VSS to another open source tool and choosing what we believe is the last tool we'll have to use, period (www.accurev.com). I just don't get what all the excitement is over a 'free' tool that will only lead to more problems down the road.

  27. You forgot git-bisect.

    May 19, 2009 1:22 AM by John Q. Public

    Generally a very good article. But...

    Widely considered one of Git's killer features, git-bisect is worth mentioning.

    Basically, it automates searching through the revision history for the version that introduced a regression. Git's speed lets you make frequent small commits, so the change that introduced the regression can be very small, and as a result the problem is easy to spot.

    For distributed development, it's particularly nice because it lets even a relatively unskilled tester find the exact commit that introduced the regression and send the complaint straight to the relevant developer.

    (It has, of course, been copied by Hg and bzr, so it's no longer unique to git. Still something well worth knowing about.)
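
The search that the comment describes is essentially a binary search over history. A minimal sketch of that idea follows, assuming a purely linear history, a known-good first commit and a known-bad last commit. The function name `first_bad` and the `is_bad` callback are made up for illustration, and this is not git's actual implementation, which also has to cope with branching history.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad(commit) is true.

    Assumes `commits` is ordered oldest to newest, commits[0] is known
    good, commits[-1] is known bad, and once a commit is bad every later
    commit is bad too.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]
```

Each round halves the remaining range, so the number of versions that need testing grows only logarithmically with the length of the history. That is why even a long history can be searched by a tester who only needs to answer "good" or "bad" for each candidate.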

  28. Re: Kind of....

    Nov 25, 2009 3:18 AM by sax maniac

    But centralization is the main weakness of Perforce - you cannot work while you are offline, as you need to check out files on the server, which is unnecessary and annoying. That is why SVN beats Perforce (from a developer's point of view).
