Article: Distributed Version Control Systems - a guide
Since Linus Torvalds' presentation about git at Google in May 2007, adoption of and interest in Distributed Version Control Systems have been rising steadily. In this article, Sebastien Auvray introduces the concept of distributed version control, looks at when to use it and why it may be better than what you're currently using, and examines three players in the area: git, Mercurial and Bazaar.
Sebastien starts out by comparing Distributed Version Control Systems with centralized VCSs:
Or a more precise question: why are central VCSs (and notably Subversion) not satisfying? Several things are blamed on Subversion:
* The major reason is that branching is easy but merging is a pain (and one doesn't go without the other). It's likely that any substantial project you work on will need easy gymnastics with splits and dev and test branches. Subversion has no history-aware merge capability, forcing its users to manually track exactly which revisions have been merged between branches, which makes it error-prone.
* No way to push changes to another user (without submitting to the Central Server).
* Subversion fails to merge changes when files or directories are renamed.
* The trunk/tags/branches convention can be considered misleading.
* Offline commits are not possible.
* .svn files pollute your local directories.
* svn:external can be harmful to handle.
After that, Sebastien moves on to describe how a DVCS works and what the open source options are, before coming to an excellent comparison between git, Mercurial and Bazaar. The comparison covers things like features, model, web access, integration, performance, hosting options and many other things.
If you’re having trouble deciding which DVCS to pick, this is the article for you.
Issue Tracker Integration for git
Now Perforce is pretty costly, but when broadly comparing "centralized vs. decentralized" it's probably not quite on target to use the "best free" rather than the "best." Many of the issues cited (and thereby implied to be issues with centralized solutions)
For example, Perforce does not pollute your directories with control files. Perforce does a fantastic job of maintaining merges between branches and keeping merge history.
Indeed, most of the arguments put forward against "central" tend to be limitations in the SVN implementation of centralized SCM rather than limitations in the state of the art.
Re: Kind of....
I agree with you that Perforce has some good points that SVN does not support. But as you said, 1) you need to pay for it, and 2) you really need to take great care of your production scalability (as with any central server...), or else it becomes a big bottleneck and you're blocked at each branch creation... Some editors like IDEA add some intelligence on top of that, such as an offline mode.
It can be frustrating when someone tries out one of them and it seems complicated after CVS/SVN. The benefits are not so obvious (what are the benefits for a developer who has never done any branching and merging in CVS/SVN?).
> Mercurial SLOC (without Test src)
Mercurial's core is Python + C only (there are 4 C files).
What I don't like about Git:
- it is a mess: C + Perl + bash
- it pollutes my environment really badly (at least in msys git, there is a huge number of aliases)
- the native Windows port is really far from being production ready. (I cannot even use it from behind a firewall.)
What I don't like about Mercurial:
- tree handling is not really good. It's not an easy call to implement it right, but it is important.
- editing of the history is considered harmful and not really supported (can be done though through extensions, but not as nice as in git)
What I like about Mercurial:
- hgbook is awesome
- easy to extend (the included extensions are good starting point)
- clean python
- very flexible, you can build nice workflows around it
- mq is evil but very handy once you understand what it is doing and how
And I find it amazing that when Larry McVoy pulled the plug on the BitKeeper licenses used for Linux development, Linus (or someone) shook git out of their sleeve in a short time. Impressive.
Now that said, Torvalds has an incredible knack for making incredibly insulting comments. SVN is pointless? When 59% of git users also use SVN or CVS? And anyone who uses CVS is stupid? This coming from someone who didn't use *any* version control until forced to do so because it was impacting the work on Linux? Now that was nuts. Now, SCM is a given, and CVS is probably one of the granddaddies; we stand on the shoulders of giants. SVN's aim was to fix issues with CVS, nothing more. And isn't git pretty much a clone of BitKeeper, like Linux is a clone of Unix? Hopefully for Linus, the successors to Linux will have a more admirable view of their forebears.
SVK - Subversion decentralized - http://svk.bestpractical.com/view/HomePage
Copied from wikipedia:
"SVK (also written svk) is a decentralized version control system written in Perl, with a hierarchical distributed design comparable to centralized deployment of BitKeeper and GNU arch.
SVK uses the Subversion filesystem but provides additional features:
* Offline operations like checkin, log, merge.
* Distributed branches.
* Lightweight checkout copy management (no .svn directories).
* Advanced merge algorithms, like star-merge and cherry picking.
* Changeset signing and verification.
* Can mirror and operate on Subversion, Perforce and CVS repositories."
I do think it addresses some of the problems of Subversion, but since I haven't used it I don't know what new problems it creates.
Rebase plugin available for Bazaar
Re: Rebase plugin available for Bazaar
You're right, I'll update this asap.
I'll also take into consideration some interesting remarks from reddit.
Top mark against Subversion going away
git repository size
git repack -a -f -d --window=100 --depth=100
A couple of times until the repository doesn't get any smaller.
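The "repeat until it stops shrinking" advice can be scripted. The following is only a rough sketch, assuming a POSIX shell and a git recent enough to support `git -C`; the function names `git_dir_kb` and `repack_until_stable` are made up for illustration, and the size is measured simply as the disk usage of the `.git` directory.

```shell
#!/bin/sh
# Sketch: repeat the aggressive repack until .git stops shrinking.
# git_dir_kb and repack_until_stable are hypothetical helper names.

# Print the size of the repository's .git directory in KiB.
git_dir_kb() {
    du -sk "$1/.git" | cut -f1
}

# Repack the repository at $1 (default: current directory) until a
# pass no longer makes it smaller, then print the final size.
repack_until_stable() {
    repo=${1:-.}
    prev=$(git_dir_kb "$repo")
    while :; do
        git -C "$repo" repack -a -f -d --window=100 --depth=100 || return 1
        cur=$(git_dir_kb "$repo")
        # Stop as soon as a pass brings no further reduction.
        if [ "$cur" -ge "$prev" ]; then
            break
        fi
        prev=$cur
    done
    echo "final size: ${cur} KiB"
}

# Usage (not run here): repack_until_stable /path/to/repo
```

Each pass with these window and depth values can be slow on a large repository, so this is something to run once before measuring, not as part of everyday work.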
I had identified it as a potential solution to my SVN problems, mainly off-line commits and .svn files.
I'm very happy for the new perspectives the article gave me. Thanks for the hard work!
* Bzr plugins and Windows Gui added: rebase, ..., Wildcat BZR, ...
* Hg Shelve added.
* SLOC for Hg updated (HTML doc used to be counted, I kept contrib which is responsible for the presence of Lisp and Tcl/Tk).
* Repository size for git updated after doing proper repack command (git repack -a -f -d --window=100 --depth=100 until size becomes constant) (Thanks to the comment by dhamma vicaya).
A nice article, but I am a bit disappointed. First of all, there is no mention of www.monotone.ca/, which is a well-recognised DVCS nowadays. It also fails to mention that git is traditionally the fastest because, on Linux, good old Linus Torvalds is using all the low-level filesystem tricks you can possibly think of.
Developers' choices are based not on which tool is best, but on what their community is using, as the author noticed:
it seems as if some choices have emerged based on the language used by the communities: Java / Sun related developments seem to be interested more in Mercurial while C / Linux / Ruby / Rails related projects are attracted by git.
But overall, the point is that your SCM tool should support your workflow and processes. It may be easier to change the tool than it is to change the processes.
Update on Bazaar performance
If you want to save further time and space when cloning in Bazaar, use the --hardlink option. It cuts the time to 11.2 secs (vs 11.1 secs for git on my computer) and reduces space usage across the working trees, which is where most of the disk space gets consumed in tools as efficient at historical storage as these.
Branches in Hg are supported
MySQL uses Bazaar
Re: history model
Re: history model
Revision naming in Mercurial
Re: Revision naming in Mercurial
You forgot git-bisect.
John Q. Public
Widely considered one of Git's killer features, git-bisect is worth mentioning.
Basically, it automates searching through the revision history for a version that introduced a regression. Git's speed lets you make frequent small commits, so the change that introduced the regression can be very small, and the result is the problem is easy to spot.
For distributed development, it's particularly nice because it lets even a relatively unskilled tester find the exact commit that introduced the regression and send the complaint straight to the relevant developer.
(It has, of course, been copied by Hg and bzr, so it's no longer unique to git. Still something well worth knowing about.)
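To make the idea concrete: in git the workflow is driven with `git bisect start`, `git bisect bad`, `git bisect good <rev>` and, for a fully automatic run, `git bisect run <script>`. The search those commands perform is essentially a binary search over history. The sketch below illustrates only that search, under simplifying assumptions: revisions are plain numbers, the history is linear, and `is_bad` is a hypothetical stand-in for "check out this revision and run the test" (`first_bad` is likewise a made-up name).

```shell
#!/bin/sh
# Sketch of the search that bisecting automates. Revisions are numbered
# 1..N; the first is known good and the last is known bad.
# first_bad and is_bad are hypothetical names used only for illustration.

# Stand-in for "build and test revision $1". Here the regression is
# assumed to have landed in revision 13.
is_bad() {
    [ "$1" -ge 13 ]
}

# Print the first bad revision between a known-good revision ($1)
# and a known-bad revision ($2).
first_bad() {
    good=$1
    bad=$2
    # Halve the suspect range until good and bad are adjacent.
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        if is_bad "$mid"; then
            bad=$mid
        else
            good=$mid
        fi
    done
    echo "$bad"
}

first_bad 1 20
```

With 20 revisions the loop tests only four of them (10, 15, 12, 13) before pointing at revision 13, which is why even a long history can be searched in a handful of test runs.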
InfoQ Sep 01, 2015