Distributed Version Control Systems: A Not-So-Quick Guide Through

Posted by Sebastien Auvray on May 07, 2008

Since Linus Torvalds' presentation about git at Google in May 2007, adoption of and interest in Distributed Version Control Systems have risen steadily. We will introduce the concept of Distributed Version Control, see when to use it and why it may be better than what you're currently using, and look at three players in the area: git, Mercurial and Bazaar.

What?

A Version Control System (VCS, also called SCM) is responsible for keeping track of successive revisions of the same unit of information. It is commonly used in software development to manage the source code of a project. Historically, the first VCS of choice for projects was CVS, started in 1986. Since then many other SCMs have flourished, each with specific advantages over CVS: Subversion (2000), Perforce (1995), CVSNT (1998), ...

In December 1999, in order to manage the mainline kernel sources, Linus chose BitKeeper, described as "the best tool for the job". Prior to this, Linus was integrating each patch manually. While all its predecessors worked in a client/(central) server model, BitKeeper was the first VCS to allow a truly distributed system in which everybody owns their own master copy. Due to licensing conflicts, BitKeeper was later abandoned in favor of git (Apr, 2005). Other systems following the same model are available: Mercurial (Apr, 2005), Bazaar (Mar, 2005), darcs (Nov, 2004), Monotone (Apr, 2003).

Why?

Or, more precisely: why are Central VCSs (and notably Subversion) not satisfying?
Several things are blamed on Subversion:

  • The major reason is that branching is easy but merging is a pain, and one is of little use without the other. Any sizeable project you work on is likely to need easy juggling of split, dev and test branches. Subversion has no history-aware merge capability, which forces its users to track manually exactly which revisions have been merged between branches, an error-prone process.
  • No way to push changes to another user (without submitting to the Central Server).
  • Subversion fails to merge changes when files or directories are renamed.
  • The trunk/tags/branches convention can be considered misleading.
  • Offline commits are not possible.
  • .svn files pollute your local directories.
  • svn:externals can be awkward to handle.
  • Performance
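To make the "history-aware merge" point concrete, here is a minimal Python sketch (my own illustration, not how any of the tools is implemented). It models history as a graph of revisions with parent links and looks for the most recent common ancestor of two revisions, which is where a merge has to start from. The revision names and both histories are hypothetical.

```python
from collections import deque


def ancestors(parents, rev):
    """Return rev plus every revision reachable through parent links."""
    seen = set()
    queue = deque([rev])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(parents.get(current, ()))
    return seen


def merge_bases(parents, a, b):
    """Return the most recent common ancestors of revisions a and b."""
    common = ancestors(parents, a) & ancestors(parents, b)
    # Drop every common ancestor that is itself an ancestor of another one.
    return sorted(
        c for c in common
        if not any(c != other and c in ancestors(parents, other) for other in common)
    )


# Hypothetical history: branch b1..b3 forked from r2 and was merged into trunk at m1.
recorded = {
    "r1": [], "r2": ["r1"], "r3": ["r2"],
    "b1": ["r2"], "b2": ["b1"], "b3": ["b2"],
    "m1": ["r3", "b2"],  # the merge remembers both of its parents
    "r4": ["m1"],
}
# Same history, but the merge did not record where the merged changes came from.
unrecorded = dict(recorded, m1=["r3"])

print(merge_bases(recorded, "r4", "b3"))    # ['b2']
print(merge_bases(unrecorded, "r4", "b3"))  # ['r2']
```

When the merge is recorded, the next merge of the branch only has to consider what happened after b2. When it is not, the tool falls back to r2 and the user has to remember that b1 and b2 were already merged.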

 

Modern DVCSs fixed those issues both through their own implementation tricks and through the fact that they are distributed. But as we will see in the conclusion, Subversion has not given up yet.

How?

Decentralization

Distributed Version Control Systems take advantage of the peer-to-peer approach. Clients can communicate with each other and maintain their own local branches without having to go through a Central Server/Repository. Synchronization then takes place between peers, who decide which changesets to exchange.

This results in some striking differences from, and advantages over, a centralized system:

  • No canonical, reference copy of the codebase exists by default; only working copies.
  • Disconnected operations: common operations such as commits, viewing history, diffs and reverting changes are fast because there is no need to communicate with a central server. Even if a central server exists (for a stable, reference or backup version), it should be queried far less than in a CVCS setup when distribution is used well.
  • Each working copy is effectively a remote backup of the codebase and change history, providing natural security against data loss.
  • Experimental branches: creating and destroying branches are simple and fast operations.
  • Collaboration between peers made easy.
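The peer-to-peer exchange described above can be sketched in a few lines of Python. This is a toy model under strong assumptions: the `Repository` class, its methods and the changeset ids are all hypothetical, and real tools also track parent relationships between changesets, which this sketch ignores.

```python
class Repository:
    """Toy model of one peer: a full local copy of the changesets it knows."""

    def __init__(self, name):
        self.name = name
        self.changesets = {}  # changeset id -> description

    def commit(self, cs_id, description):
        """Record a changeset locally; no server or network is involved."""
        self.changesets[cs_id] = description

    def missing_from(self, other):
        """Ids of changesets that `other` has and this repository lacks."""
        return sorted(set(other.changesets) - set(self.changesets))

    def pull(self, other, wanted=None):
        """Copy changesets from `other`; `wanted` optionally restricts which ones."""
        ids = self.missing_from(other)
        if wanted is not None:
            ids = [cs_id for cs_id in ids if cs_id in wanted]
        for cs_id in ids:
            self.changesets[cs_id] = other.changesets[cs_id]
        return ids


alice = Repository("alice")
bob = Repository("bob")
alice.commit("a1", "fix typo")        # offline commit on Alice's machine
bob.commit("b1", "add parser")        # Bob works independently
bob.commit("b2", "experimental idea")

pulled = alice.pull(bob, wanted={"b1"})  # Alice decides what to take from Bob
print(pulled)
print(alice.missing_from(bob))
```

Note that no central repository appears anywhere: each peer holds its own history and chooses what to exchange with whom.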

 

For an introduction to DVCS collaboration practices, you might have a look at the Intro to Distributed Version Control (Illustrated) or at possible Collaboration workflows.

You should also be aware that opting for a DVCS has some disadvantages, notably in terms of complexity: the decentralized view is very different from the centralized world, and your developers might need some time to get used to it. Tracking changesets instead of files can also be confusing, even though it is very powerful and makes it theoretically possible to track a method as it moves between files.

Who?

The battle rages on! Some of the Good and the Bad.

The good and the bad come essentially from an updated compilation of blogs (updated because some old arguments no longer hold) and from my personal experience.
 Note that this is a very short list of features (git alone has more than 150 commands), and that some issues might be more critical than others.

  git Mercurial Bzr
 
Project      
Maintainer Junio C Hamano Matt Mackall Canonical Ltd. - Became GNU project
Concurrency model Merge Merge Merge
License GPL GPL GPL
Platforms supported POSIX, Windows, Mac OS X Unix-like, Windows, Mac OS X Unix-like, Windows, Mac OS X
Cost Free Free Free
Maturity      
Version  > 1.0 (1.5.5)  > 1.0 (1.0)  > 1.0 (1.3.1)
Project Start Apr, 2005 Apr, 2005 Mar, 2005
Implementation      
SLOC (without Test src)
SLOC Count 130550 38172 79864
Test Suites  ~20% of sources dedicated to Tests  ~25% of sources dedicated to Tests  ~50% of sources dedicated to Tests
History model Snapshot Changeset Snapshot
Repo. growth O(patch) O(patch) O(patch)
Network protocols HTTP, FTP, email bundles, custom, ssh, rsync HTTP, ssh, email HTTP, SFTP, FTP, ssh, custom, email bundles
Basic Features      
Atomic commits
File renames  implicit
Merge file renames
Symbolic links
Pre/post-event hooks
Signed revisions  Partial / Manual verification
Merge tracking
End of line conversions  Planned (1.6)
Tags
International Support  Planned
Partial checkout  Use submodules instead  Planned  Planned
Model / Architecture      
File Single top-level .git directory Single top-level .hg directory Single top-level .bzr directory
Model   Simple branch model (a clone is a branch) Simple branch model (a clone is a branch)
Repository Specificities     Shared repositories for sharing revisions between branches.
Supposed-to-be Better Storage Model
Directories versionable
Submodules  Submodule support via git-submodule  Submodule support via the Forest extension (as used by OpenJDK) Workaround with 3rd Party tool ConfigManager
Per file commit  Goes against architecture  Goes against architecture  Goes against architecture
Rebase / Queue  rebase  Mercurial Queues  Rebase plugin, Loom plugin (comp. with Quilt)
Web Access      
   Note: Repository can also be shared read-only via static files over HTTP.  Note: Repository can also be shared read-only via static files over HTTP.  Not as good as the 2 others. Faster Smart Server now available.
  gitweb, wit, cgit hgweb (single rep), hgwebdir (multi rep) webserve, trac, Loggerhead
Integration      
Integration-ability  git is more scriptable than integrable through API (even if there are some frontend api like Ruby/Git)  Rich API
Migration  Good. git-svn is also a very powerful and easy to put in place bi-directional gateway between Subversion and Git allowing you to use Git over an existing Subversion Repository.  Good. hgsvn not as polished as git-svn.  Well covered but slow.
Issue Tracker Integration  Trac Versioning System Backend Plugin avail. Bugzilla workaround. No JIRA Plugin.  Trac Versioning System Backend Plugin avail. Bugzilla avail. JIRA.  Trac Versioning System Backend Plugin avail. Bugzilla avail. No JIRA Plugin.
IDE Plugins  Existing dev versions: Idea, Eclipse, NetBeans  Existing dev versions: Idea, Eclipse, NetBeans  Existing dev versions: Idea, Eclipse. Missing: NetBeans
Plugins  Emacs / Vim / ...  Emacs / Vim / ...  Emacs / Vim / ...
Performance      
   git has always been historically faster than its competitors    bzr has historically been the slowest of the 3.
Advanced Features      
   With more than 150 binaries it's hard not to find the killer command you always dreamt of (even if this increases complexity)    
Complexity      
      Bzr aims to hide complexity by keeping a clean User Interface while adapting to the different collaboration workflows and their evolution in a team.
Revision Naming  Git revisions are named by SHA-1 hashes, which makes them less user-friendly when doing a diff between two revisions. This was chosen to guarantee the safety and integrity of data, and it also happens to avoid collisions when merging with other peers.  Simple naming  Simple revision id naming r1, r2, etc...
Commands  Familiar, with some specificities such as the rename command, which differs from other SCMs (this won't be changed, for backward compatibility).
git is the most advanced SCM in terms of commands, but if you add up all possible commands and their options, you end up with a huge number of possibilities that is hard to master.
The fact that a tool like Easy Git exists means that Git can be considered quite complex.
 Familiar (not far from subversion)  Familiar
UserBase      
   Large userbase / Numerous (and large) Projects running git and interest in user feedback  Large userbase / Numerous (and large) Projects running hg.  Smallest Market Share: Apart from Canonical projects (Ubuntu, Launchpad), no big names are using it yet. Bazaar is also less well-known.
  Linux kernel, Cairo, Wine, X.Org, Rails, Rubinius, Compiz Fusion Xine, OpenJDK, OpenSolaris, NetBeans, (Part of) Mozilla (Part of) Ubuntu, Launchpad
Documentation  Good documentation. Very good man pages (with a lot of examples)  Good documentation.  Good documentation.
Platforms      
    Poor Windows support   Cross Platform   Cross Platform
Additional Misc. Good Points      
  git is scriptable over pluginable which is a good and bad point (easy entry point through script, all tests are done in bash script by the way).
Very tunable for advanced administration: staging area, dangling objects, detached heads, plumbing vs porcelain, reflogs.
Local branches are possible.
Robust Renaming. Robust Renaming.
Supports lightweight checkouts (without history).
Bound branches.
Local branches are possible.
Patch Queue Manager (manages several branches, performing merges for developers)
Some very cool commands / extensions: git-stash (when interrupted for a quick bug-fix on another project), git-cherry-pick (for picking only single commits, rather than complete branches), git-rebase (to forward port your local commits. Quilt-like changeset like Mercurial Queues).
git add -i (equivalent of Mercurial RecordExtension).
There's hardly a command you dreamt of that git doesn't have.
RecordExtension (it lets you choose which parts of the changes in a working directory you'd like to commit).
Hg Shelve Extension (same as git-stash: to interactively select changes to set aside).
Shelf plugin (same as git-stash: when interrupted for a quick bug-fix on another project, latest rel on Jan. 2007).
bzr-dbus (for broadcasting hooks and revisions).
Additional Misc. Bad points      
  Renaming not handled as well as in bzr (Test Case).
Read-only static HTTP setup is a bit obtuse (--bare and update-server-info).
Handling of Unicode (UTF-16 encoded) files.
Storage Model. Git stores each newly created object as a separate file; these can be packed into a single file, delta-compressed against each other. This forces you to do some administration and run the pack command on a regular basis.
Mixture of C, Perl and bash script, which makes it far less obvious to port to other systems while maintaining the same feature set.
Renaming not handled as well as in bzr. [FIXED]
Local branches are not possible, clone is used instead.
To avoid wasting space, Hg uses hardlinks, which causes problems when pulling (and also under Windows).
Forest extension (submodules) not native and not well documented.
 
Gui      
Windows    TortoiseHg  Complicated. TortoiseBzr (no submit on launchpad project since Aug 2007, but the project is still active). WildcatBZR.
Linux  gitk, git-gui, tig, ...  TortoiseHg, Hgtk, hgct  bzr-gtk, ...
Installation      
   You'll need either cygwin installed or an alternative git installation such as Git on MSys
Free Hosting Available      
  GitHub, gitorious FreeHg Launchpad
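
The "Revision Naming" row above notes that git names things by SHA-1. As a small illustration, the Python sketch below computes the id git gives to a file's content (a "blob"): the SHA-1 of a short header followed by the bytes. Commit ids are built the same way over a commit object, which is not reproduced here; the function name is my own.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the id git assigns to a blob holding `content`."""
    # git hashes a header "blob <size>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_blob_id(b"hello world\n"))
```

Because the name is derived from the content, two peers that create the same content independently end up with the same id, and any corruption changes the id. That is the safety and collision-avoidance argument in the table, at the price of names that are far less readable than r1, r2, etc.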

Some debates are left open, like the fact that in bzr directories are branches, not branch containers as in git.
The fact that Mercurial used external tools to do merges was also criticized by Bzr; this is no longer true as of Mercurial v1.0.
You'll find other, biased comparisons made by the Bazaar team: Bazaar vs Git, Bazaar vs Mercurial, and the associated reply from Mercurial.

 

 
Some User Statistics from Git Survey 2007

 

Note that the survey offered no option to choose Ruby as a language of proficiency. It would be interesting to add it to the 2008 survey.

It's also funny to see that ~1/3 of people use Distributed VCS (here git) in collaboration with ... 0 or 1 person!

Guis

gitk on Linux TortoiseHg on Windows OliveGtk on Linux

The GUIs look nearly the same, with a preference for the effectiveness of gitk. TortoiseHg (with folder watch activated) was really slow with a big repository like Mozilla.

A quick and non-exhaustive look at performance

Conditions of the bench

git is still leading the performance battle, but Hg and Bzr have made great improvements in the past year.

Note that Mercurial doubles the number of files in your repository (the history is kept per file in .hg/store/data). This doesn't seem to be a good choice for Windows systems running on NTFS.
It's also interesting to see that git takes great advantage of the operating system when executing commands. While Hg and Bzr do not spend a large proportion of their time in the system, git can spend 10-40% of its CPU time in system calls, which raises the question of how it will perform on Windows, where the git developers won't have access to all the system performance tricks they are used to on Linux.
Single merges and merge queues should also be tested; this is a tedious part to benchmark.

Benchmarks should also be run on Windows as:

  1. Even if your server is running on *nix, many developers still have a Windows environment at work, and a DVCS transfers more processing to the developer's workstation.
  2. Performance might be really different on a Windows machine.

 

When?

Experience stories.

I had the chance to catch up with Kelly O'Hair from Sun about Sun's choice of Hg for OpenJDK.

Sebastien Auvray: I read the reasons for migrating from TeamWare to Mercurial but had remaining questions. Did you simply follow the OpenSolaris choice?

Kelly O'Hair: To some degree yes, but the OpenSolaris choice also became the Sun wide choice to any Sun Software teams having to convert. The OpenSolaris investigation was pretty complete and they had all the exact requirements we had. We had to convert for OpenJDK, because TeamWare was unacceptable for an open source project, the answer of Mercurial was pretty obvious for us.

Or did you run a fresh tournament and try the other DVCSs again (git, ...)?

We did not do a detailed re-investigation, that seemed like a waste of time. The only other possible choice in my view was git, and git wasn't giving Windows a priority, which we needed. Again the choice was obvious.

The OpenSolaris reports date from April 2006, which is 2 years ago.

Understood. Some things may have changed, git has improved, but the ball was rolling, and Mercurial was improving too.

Also did you encounter any specific problems in the migration?

File permissions and ownership can be a problem in sharing a repository via an NFS or UFS file system, so we finally set up a server to handle the shared repositories, the better way. That could be made easier.
The other issue is that using hooks to rollback or filter pushes creates a window where someone could accidentally pull changes that will be rolled back, so you have to use a pair of repositories, one for pushes and one for pulls, with an automatic sync after the hooks run to sync them up.
Using forests also introduces a problem because a forest push is just a set of individual pushes, and if one push failed, technically you would want to rollback all other pushes. Nobody is doing this, and just taking their chances. If the repositories in the forest are fairly independent, this is not a real problem.
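
(Editor's illustration, not part of Kelly's answer.) The push/pull repository pair he describes can be modelled roughly as follows in Python. Everything here is hypothetical, including the class name and the hook: it only shows why readers of the pull side never see changesets that a hook later rejects.

```python
class GatedRepository:
    """Toy model: pushes land on one side, readers pull from the other."""

    def __init__(self, hook):
        self.hook = hook      # returns True when a changeset is acceptable
        self.push_side = []   # changesets accepted so far
        self.pull_side = []   # what people pulling can see

    def push(self, changesets):
        """Accept all changesets or none, then sync the pull side."""
        changesets = list(changesets)
        if not all(self.hook(cs) for cs in changesets):
            return False      # rolled back: neither side changes
        self.push_side.extend(changesets)
        self.pull_side = list(self.push_side)  # automatic sync after the hooks ran
        return True


# Hypothetical hook: refuse changesets whose message still contains "WIP".
gate = GatedRepository(hook=lambda message: "WIP" not in message)
print(gate.push(["fix build"]))
print(gate.push(["add feature", "WIP debugging"]))
print(gate.pull_side)
```

The same all-or-nothing idea is what a forest push would need across several repositories, which is the gap Kelly points out.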

In the day-to-day usage?

Remains to be seen. Change like this is easy for some, harder for others. Given time, I think most people have and will adapt and learn to love it.
The concept of "working set files" (having to do 'hg update') and having to merge changesets that don't seem to merge anything is confusing to people. Also, the idea that they are pushing changesets and not files is something people have a problem with, "Why can't I just bringover this one file?".

What is better than TeamWare?

Much much much faster than TeamWare. Our teams in China and Russia are looking forward to full deployment because they don't need to keep mirrors of integration areas. Refreshes (pulls) are very fast over slow connections.
The state of the repository in Mercurial is well-defined, unlike TeamWare which allowed for partial workspaces, TeamWare was just a loose bag of individually managed files (SCCS files).
The changeset concept was missing in TeamWare, along with the concept of well known simple state of the entire repository (a simple changeset id).

Is there anything you're missing from TeamWare?

People are missing the email notifications and putback/bringover transaction history, but the changeset provides much of that.
What may be missing is some kind of repository transaction history, but again, email archives of Mercurial events could provide this.

Is Hg becoming the VCS of choice for Sun including internal projects? Or is Sun using it only for public projects that need openness?

Both internal and external projects are converting, where it makes sense.
I've seen a big increase in interest from internal projects that are taking the plunge.

 

I also caught up with Pierre d'Herbemont from VLC to get their opinion about git.

Sebastien Auvray: Firstly what was the version control system you were using prior to using Git?

Pierre d'Herbemont: SVN and a git-svn mirror.

When did you migrate?

We opened a git mirror of the svn tree, to ease VLC Google Summer of Code projects. So that was back then. Then we totally migrated to git on March 1st-2nd 2008.

Why did you choose Git over its competitors?

 

  • Over SVN: Git is fast. Branching is cheap. Atomic commits. Rebasing on top of another tree.
  • Over other distributed systems: Proven user base (Linux Kernel). I have been successfully using it while working on Wine. Git is sexy. And some core developers had experience with Git, whereas no one had any with Mercurial and such. Nothing technical there.

 

Also did you encounter any specific problems in the migration? In the day-to-day usage?

We encountered some trouble with Trac and Buildbot. Their support for Git is really minimal, especially in their released versions. We had to check out the latest Buildbot trunk. For Trac we are using a crippled Git plugin. The Trac Git Plugin needs Trac 0.11. But Trac 0.11 isn't stable and has some known memory leaks that prevent us from switching. So basically we are waiting for them to fix that...
It took some time for some committers to get accustomed to Git. But after two days, everything seemed fine. And some Git beginners are starting to really enjoy Git.

So what?

Choosing between a Distributed VCS and a Central VCS is far from easy. A DVCS will definitely change the way you work and collaborate. Subversion, one of the leading Central VCSs, has not given up the performance and features battle, and version 1.5 should bring good compromises. It can count on its existing user base and on its simplicity (at the cost of some pain). In very specific cases, such as projects dealing with large opaque binary files, Subversion is a better fit than a DVCS because client-side space usage stays constant. Likewise, if you use partial checkouts heavily, svn will perform better (although massive use of them reveals a problem in how your modules are set up).

Once you have chosen between a Distributed and a Central solution, it is still hard to compare the competitors within each area, as implementations, commands and ultimately performance can differ widely. And there are no real benchmarks for the common operations.
In this hard battle, Bazaar lost many influential early adopters (Mozilla, Solaris, OpenJDK) because of its poor early performance. It also has to be said that the Bazaar website is a lot more marketing-oriented: it publicizes not-entirely-true differences with its competitors, and it publishes benchmark comparisons only about space efficiency, with no timing comparison of daily commands such as diff, add, ...
I feel that even though the 3 projects started out at nearly the same time, bzr faced many more performance and design problems early on, which makes it a bit less mature than its competitors today.
A phenomenon not seen before: some choices seem to have emerged based on the language used by the communities. Java / Sun related developments seem to be more interested in Mercurial, while C / Linux / Ruby / Rails related projects are attracted by git.

I hope this article enlightened you; your experiences and feedback are always welcome!

Credits:
People who kindly accepted my interview: Kelly O'Hair, Pierre d'Herbemont.
Ian Clatworthy for his help and responsiveness on the conversion of the Mozilla Hg Repository to Bzr.
#git, #mercurial, #bzr on Freenode IRC, #mozilla on Mozilla IRC.
Athletics picture by Antonio Juez

Random quotes:
Linus Torvalds: "Subversion has been the most pointless project ever started". "If you like using CVS, you should be in some kind of mental institution or somewhere else".
Mark Shuttleworth (Ubuntu / Canonical Ltd.):  "Merging is the key to software developer collaboration."
Ian Clatworthy (Canonical / Bazaar): "By 2011-2012, I predict this technology will be widely adopted and many teams will wonder how they once managed without it."
Assaf Arkin in Git forking for fun and profit originally: "Apache built a great infrastructure around SVN, lots of sweat and tears went into making it happen, and at first I felt like we’re circumventing all of that. But the longer I thought about it, the more I realized that Git is just more social than SVN, and that’s exactly what Apache is about."

[Article updated on 20080512 according to the comments here and from Ian Clatworthy and reddit]:

  • Bzr plugins and Windows Gui added: rebase, ..., Wildcat BZR, ...
  • Hg Shelve added.
  • SLOC for Hg updated (HTML doc used to be counted, I kept contrib which is responsible for the presence of Lisp and Tcl/Tk).
  • Repository size for git updated after doing proper repack command (git repack -a -f -d --window=100 --depth=100 until size becomes constant) (Thanks to the comment by dhamma vicaya).
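
For readers who want to script the "repack until the size becomes constant" step, here is a minimal Python sketch. The git command is the one quoted above; the helper names, the way the size is measured (summing file sizes under the repository directory) and the `max_rounds` safety limit are my own assumptions.

```python
import os
import subprocess

# Command quoted in the update note above.
REPACK_CMD = ["git", "repack", "-a", "-f", "-d", "--window=100", "--depth=100"]


def dir_size(path):
    """Total size in bytes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def run_git_repack(repo):
    """Run the repack command inside the repository (requires git)."""
    subprocess.run(REPACK_CMD, cwd=repo, check=True)


def repack_until_stable(repo, repack=run_git_repack, measure=dir_size, max_rounds=10):
    """Repack repeatedly until the size stops shrinking; return the sizes seen."""
    size = measure(repo)
    history = [size]
    for _ in range(max_rounds):  # safety limit, an assumption of this sketch
        repack(repo)
        new_size = measure(repo)
        history.append(new_size)
        if new_size >= size:     # no further gain: stop
            break
        size = new_size
    return history
```

Calling `repack_until_stable("/path/to/repo")` on a real repository returns the list of sizes observed; the `repack` and `measure` parameters exist only so the loop can be tried without git.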

 

Apologies:
darcs and Monotone were not taken into account in this comparison because it was already hard work to gather all this information and to actually test these 3 DVCSs. Strangely, even though they are the oldest on the DVCS scene, the focus is more on the DVCSs I reviewed here (which doesn't help shift the focus, I admit, but darcs and Monotone users/developers are welcome to post comments and advertising here!).

References:
The very exhaustive Wikipedia page about Git.
Distributed Revision Control Wikipedia page.
Comparison of Revision Control Software Wikipedia page.

Distributed Version Control - Why and How by Ian Clatworthy, Canonical (Bazaar).
Intro to Distributed Version Control (Illustrated) by Kalid Azad.
Distributed Version Control Systems by Dave Dribin (who finally chose Mercurial).
Why Distributed Version Control by Wincent Colaiuta.
Source Code Management for OpenSolaris. OpenSolaris SCM Project History (2005).
Mercurial OpenJDK Questions by Kelly O'Hair, Sun.
Why I chose git by Aristotle Pagaltzis.
Distributed SCM by Gnome crew.
FreeBSD SCM Requirements.
Open Office Requirements.
Mozilla VCS Requirements.
Use Mercurial you git! by Ian Dees.
What a DVCS gets you (maybe) by Bill de hÓra.
The Differences Between Mercurial and Git.
And all URLs referenced in this article.

Cheat Sheets:
Git Cheat Sheet
Mercurial Cheat Sheet
Bazaar Quick Start Card

Benchmark conditions.
The benchmark was run on an AMD Athlon(tm) 64 Processor 3500+ with 1GB RAM, under Linux Kubuntu 6.10 Edgy x86_64 with an ext3 filesystem.
Each command was run 8 times (the best and worst times were discarded). The runs were done locally through the filesystem (tests over other protocols should definitely be done: even if DVCSs are not coupled to a central server, badly implemented network communication can lower the performance users experience).
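
The measurement procedure can be sketched as follows in Python. This is not the script used for the benchmark: the function names are mine, and averaging the six remaining runs is an assumption, since only the trimming of the best and worst times is stated above.

```python
import time


def trimmed_mean(samples):
    """Average the samples after discarding the single best and worst one."""
    if len(samples) < 3:
        raise ValueError("need at least 3 samples to drop the best and the worst")
    kept = sorted(samples)[1:-1]
    return sum(kept) / len(kept)


def bench(action, runs=8):
    """Time `action` `runs` times and return the trimmed mean in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        samples.append(time.perf_counter() - start)
    return trimmed_mean(samples)


# Eight hypothetical timings in seconds: 1 and 9 are discarded.
print(trimmed_mean([5, 1, 2, 3, 4, 9, 2, 2]))  # 3.0
```

To time a real command, `action` could wrap a call such as `subprocess.run(["hg", "status"], check=True)`.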

Version used are:

The repository consists of a snapshot of 12456 changesets (from 20080303, 70853 total revisions from the hg Repository) and ~30000 files from the Mozilla Repository (originally in hg format, translated into a git repository with hg-fast-export.sh, and into a Bazaar repository with hg-fast-export.sh coupled with the fast-import plugin).
Default file formats were used, and the git repository size remained the same after running git-gc (which can be considered normal for a freshly migrated repository). One file was modified (dom/src/base/nsDOMClassInfo.cpp), just like in a benchmark test done by Jst 1.5 years ago.

 

About the Author

Sébastien Auvray is a senior software designer and technology enthusiast. After being forced to use CVS and then svn, he now has to suffer the daily usage of Perforce at work. Sébastien is also one of the Ruby editors of InfoQ.


Issue Tracker Integration for git by Roman Heinrich

Git integrates well with Redmine (www.redmine.org/), a Rails-based Issue Tracker. Just had to add this one ;)

Excellent article by Surya De

I thoroughly enjoyed this. And I learned a lot from this as well. Somehow our switch from CVS to Subversion makes less sense now after reading this.

Kind of.... by chris songer

The problem with this analysis, and those like it, is that they assume SVN as the state of the art. SVN is the state of the "free" art; but is missing a lot of features that perforce offers.

Now Perforce is pretty costly, but when broadly comparing "centralized vs. decentralized" it's probably not quite on target to use the "best free" rather than the "best." Many of the issues cited (and thereby implied to be issues with centralized solutions)

For example, perforce does not pollute your directories with control files. Perforce does a fantastic job of maintaining merges between branches and keeping merge history.

Indeed, most of the arguments put forward against "central" tend to be limitations in the SVN implementation of centralized SCM rather than limitations in the state of the art.

Outstanding breakdown by Kurt Christensen

Dude, this was a very, very nice article. Thanks for the rollup.

Re: Kind of.... by Sebastien Auvray

Hi Chris,
I agree with you that Perforce got some good points that SVN do not support. But as you said 1) you need to pay for it 2) you really need to take a big care of your production scalability (as any central server...) else it's becoming a big bottleneck and you're blocked at each branch creation... Some Editors like Idea add some intelligence to that like adding the Offline mode.

my 2c by Bela Babik

DVCS's are young, but very capable. They are rather tools not systems to manage the code. Users need to come up with their own workflow and thats what a lot of them do not get.

It can be frustrating when someone tries out one of them and it seems complicated after CVS/SVN. The benefits are not so obvious (what are the benefits for a developer who has never ever did any branching and merging in CVS/SVN?).


> Mercurial SLOC (without Test src)
Mercurial's core is python + c only (there are 4 c files).

What I don't like about Git:
- is that it is a mess. c+perl+bash
- my environment is polluted by it really hardly (at least in msys git, there are huge amount of aliases)
- native windows port is really far from being production ready. (I can not even use it from behind firewall.)

What I don't like about Mercurial:
- tree handling is not really good. It's not an easy call to implement it right, but it is important.
- editing of the history is considered harmful and not really supported (can be done though through extensions, but not as nice as in git)

What I like about Mercurial:
- hgbook is awesome
- easy to extend (the included extensions are good starting point)
- clean python
- very flexible, you can build nice workflows around it
- mq is evil but very handy once you understand what and how is it doing

BitKeeper by Robert Sullivan

Great article, nice comparison & research. I've been very interested in BitKeeper and git since reading about the controversy with Linux. Very cool idea, and if anyone's interested there is some very good doc out there explaining how BitKeeper works.

And I find it amazing that when Larry McEvoy pulled the plug on the Linux licenses, Linus (or someone) shook git out of their sleeve in a short time. Impressive.

Now that said, Torvalds has an incredible knack for saying incredibly insulting comments. SVN is pointless? When 59% of git users also use SVN or CVS? and anyone who uses CVS is stupid - uh - this coming from someone who didn't use *any* version control until forced to do so because it was impacting the work on Linux? Now that was nuts. Now, SCM is a given, CVS is probably one of the grandaddy, we stand on the shoulders of giants. And SVN's aim was to fix issues with CVS, nothing more, isn't git pretty much a clone of BitKeeper, like Linux a clone of Unix? Hopefully for Linus, the successors to Linux will have a more admirable view of their forebearers.

SVK - Subversion decentralized - http://svk.bestpractical.com/view/HomePage by Ludolph Neethling

Thanks for the informative article. Another option for a DVCS may be SVK. I haven't tried it myself, so feedback or a review of SVK would be appreciated.

Copied from wikipedia:

"SVK (also written svk) is a decentralized version control system written in Perl, with a hierarchical distributed design comparable to centralized deployment of BitKeeper and GNU arch.

SVK uses the Subversion filesystem but provides additional features:

* Offline operations like checkin, log, merge.
* Distributed branches.
* Lightweight checkout copy management (no .svn directories).
* Advanced merge algorithms, like star-merge and cherry picking.
* Changeset signing and verification.
* Can mirror and operate on Subversion, Perforce and CVS repositories."

I do think it addresses some of the problems of subversion, but since I haven't used it I don't know what new problems it creates.

Regards,

Rebase plugin available for Bazaar by Jelmer Vernooij

Your overview lists Bazaar as not having support for Rebase and queues. However, there is a rebase plugin available for Bazaar ( bazaar-vcs.org/Rebase) and a queues one (launchpad.net/bzr-loom).

Re: Rebase plugin available for Bazaar by Sebastien Auvray

Hi Jelmer,

You're right, I'll update this asap.

I'll also take into consideration some interesting remarks from reddit.

Thanks.

Top mark against Subversion going away by Ray Davis

Nice overview, thanks. One thing, though -- the "major reason" you list for moving from Subversion is the difficulty of merging, but history-aware merge is the major improvement being delivered in Subversion 1.5. That's mentioned in the pages you link to at the end of your article, but given the importance attached to the feature, I thought it might be worth calling out.

git repository size by dhamma vicaya

git gc by default uses a conservative window size to save memory. For a relatively large import from a foreign repository, you should run:

git repack -a -f -d --window=100 --depth=100

Repeat this a couple of times, until the repository doesn't get any smaller.
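
The "repeat until it stops shrinking" advice can be sketched as a small loop. This is only an illustration: the helper names (repo_size, repack_until_stable, repack_repository) and the max_rounds safety limit are assumptions made for this sketch, not part of git. Only the repack command line comes from the comment above.

```python
import subprocess
from pathlib import Path
from typing import Callable

# The command quoted in the comment above.
REPACK_CMD = ["git", "repack", "-a", "-f", "-d", "--window=100", "--depth=100"]


def repo_size(git_dir: str) -> int:
    """Total size in bytes of all regular files under git_dir."""
    return sum(p.stat().st_size for p in Path(git_dir).rglob("*") if p.is_file())


def repack_until_stable(
    measure: Callable[[], int],
    repack: Callable[[], object],
    max_rounds: int = 10,  # hypothetical safety limit, not a git setting
) -> int:
    """Call repack() until measure() stops decreasing; return the last size."""
    size = measure()
    for _ in range(max_rounds):
        repack()
        new_size = measure()
        if new_size >= size:
            # No further gain: stop here.
            return new_size
        size = new_size
    return size


def repack_repository(repo: str) -> int:
    """Repack the git repository at `repo` until its .git directory stops shrinking."""
    git_dir = str(Path(repo) / ".git")
    return repack_until_stable(
        lambda: repo_size(git_dir),
        lambda: subprocess.run(REPACK_CMD, cwd=repo, check=True),
    )
```

Calling repack_repository("path/to/repo") would run the command at least once and stop as soon as a round brings no further reduction.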

SVK by Seurin Jean

Being a novice in the matter, I'd be very interested in some feedback on how SVK compares to the DVCSs mentioned.
I had identified it as a potential solution to my SVN problems, mainly offline commits and the .svn files.

I'm very happy with the new perspectives the article gave me. Thanks for the hard work!

Usability/colour impairment by Damien Warman

This otherwise rather interesting article is for me and for more than 10% of readers fatally undermined by the choice to use generic icons differing only in red/green colour choice in the first comparison table. Simple check/dash/x symbology, or even a tooltip, would dramatically increase its usefulness.

Article Updated by Sebastien Auvray

[Article updated on 20080512 according to the comments here and from Ian Clatworthy and reddit]:

  • Bzr plugins and Windows GUI added: rebase, ..., Wildcat BZR, ...
  • Hg Shelve added.
  • SLOC for Hg updated (HTML doc used to be counted, I kept contrib which is responsible for the presence of Lisp and Tcl/Tk).
  • Repository size for git updated after running the proper repack command (git repack -a -f -d --window=100 --depth=100, repeated until the size becomes constant). Thanks to the comment by dhamma vicaya.

Good but... by David H.

Hello.
A nice article, but I am a bit disappointed. First of all, there is no mention of www.monotone.ca/, which is a well-recognised DVCS nowadays. It also fails to mention that git is traditionally the fastest because, on Linux, good old Linus Torvalds uses all the low-level filesystem tricks you can possibly think of.

Choices by Bruno Vernay

@Perforce: That's why open source (and free beer) is relevant: you get considered.

Developers' choices are based not on which tool is best but on what their community is using, as the author noticed:
it seems as if some choices have emerged based on the language used by the communities: Java / Sun related developments seem to be interested more in Mercurial while C / Linux / Ruby / Rails related projects are attracted by git.

But overall, the point is that your SCM tool should support your workflow and processes. It may be easier to change the tool than it is to change the processes.

Update on Bazaar performance by Ian Clatworthy

My measurements have 'bzr clone master feature-1' coming in around 22 secs. A patch is available to reduce this to 16 secs. See this email for further details.


If you want to save further time and space when cloning in Bazaar, use the --hardlink option. It cuts the time to 11.2 secs (vs 11.1 secs for git on my computer) and reduces space usage across the working trees, which is where most of the disk space gets consumed in tools as efficient at historical storage as these.

Branches in Hg are supported by Stepan Koltsov

Seems like branches are supported in hg:

hgbook.red-bean.com/hgbookch8.html

MySQL uses Bazaar by Robin St

There's now a big project which decided to use Bazaar, namely MySQL. Here's the announcement.

history model by bgeron bgeron

Sorry, but the history model for Mercurial is the same as for Git and Bazaar. A changeset/changegroup/commit/revision is in all systems a name for a snapshot. :)

Re: history model by Bhaskar Rimal

This is a very impressive article and a good analysis. Could anyone explain how a DVCS works using UML diagrams of each of its processes and steps, so that I can understand all its micro-processes visually?

Re: history model by Sebastien Auvray

Hi Bhaskar,
It's very difficult to gather information on the micro-processes of the various VCSs available. I can only recommend Scott Chacon's presentation about Git at RailsConf, and its slides.

Re: history model by bhaskar rimal

Thank you very much

Revision naming in Mercurial by Dirkjan Ochtman

There's been some confusion among people who think Mercurial doesn't use SHA1 for revision identification, because this article suggests its naming is simpler. While we gladly accept the notion that our revision identification scheme is simpler to use than that of other DVCSs, it's still fundamentally based on SHA1 hashes. Please note this in the table, if you can.
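
The point about identifiers can be illustrated with a short sketch: the full identifier is a 40-digit hexadecimal SHA-1, and a shorter prefix is accepted as long as it is unambiguous. The functions below (changeset_id, resolve) are hypothetical names for this illustration, and the bytes being hashed are arbitrary; this is not Mercurial's actual storage format or API.

```python
import hashlib
from typing import Iterable


def changeset_id(content: bytes) -> str:
    """Full 40-character hexadecimal SHA-1 of some (illustrative) content."""
    return hashlib.sha1(content).hexdigest()


def resolve(prefix: str, known_ids: Iterable[str]) -> str:
    """Return the single known ID that starts with prefix, else raise ValueError."""
    matches = [i for i in known_ids if i.startswith(prefix)]
    if len(matches) != 1:
        raise ValueError(f"{prefix!r} matches {len(matches)} changesets")
    return matches[0]


if __name__ == "__main__":
    ids = [changeset_id(b"first change"), changeset_id(b"second change")]
    short = ids[0][:12]  # a short form is easier to type than 40 digits
    print(short, "->", resolve(short, ids))
```

The naming is simpler to use because a short form stands in for the full hash, not because the hash is absent.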

Re: Revision naming in Mercurial by Mark Anderson

As much as I appreciated articles like this in the past, I must say we did ourselves a favor by skipping the interim solution of going from CVS and VSS to another open source tool and chose what we believe is the last tool we'll have to use, period. www.accurev.com I just don't get what all the excitement is over a 'free' tool that will only lead to more problems down the road?

You forgot git-bisect. by John Q. Public

Generally a very good article. But...

Widely considered one of Git's killer features, git-bisect is worth mentioning.

Basically, it automates searching through the revision history for the version that introduced a regression. Git's speed lets you make frequent small commits, so the change that introduced the regression can be very small, and the problem is then easy to spot.

For distributed development, it's particularly nice because it lets even a relatively unskilled tester find the exact commit that introduced the regression and send the complaint straight to the relevant developer.

(It has, of course, been copied by Hg and bzr, so it's no longer unique to git. Still something well worth knowing about.)
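
The search described above is essentially a binary search over the history. Here is a minimal sketch of that idea, assuming a linear history listed oldest first, a known-good first revision and a known-bad last one. The function name first_bad and the is_bad callback are made up for this example; the real git-bisect also handles branching and merged histories, which this sketch does not.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad revision of a linear history (oldest first).

    Assumes revisions[0] is good, revisions[-1] is bad, and that every
    revision after the first bad one is bad as well.
    """
    good, bad = 0, len(revisions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(revisions[mid]):
            bad = mid   # regression is at mid or earlier
        else:
            good = mid  # regression is after mid
    return revisions[bad]


if __name__ == "__main__":
    history = [f"r{i}" for i in range(100)]
    # Hypothetical test: everything from r37 onwards shows the regression.
    print(first_bad(history, lambda rev: int(rev[1:]) >= 37))
```

Each test halves the range of suspect revisions, so the number of builds to check grows only with the logarithm of the history length.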

Re: Kind of.... by sax maniac

But centralization is the main weakness of Perforce: you cannot work while you are offline, because you need to check out files on the server, which is unnecessary and annoying. That is why SVN beats Perforce (from a developer's point of view).
