InfoQ

ThoughtWorks’ Developers Favor Distributed Version Control Systems

Posted by Abel Avram on Mar 18, 2010


Martin Fowler has conducted a survey on ThoughtWorks’ software development mailing list to determine how developers perceive several version control systems (VCS). He also wrote a review of the most prominent VCSes, comparing centralized and distributed systems.

The results of the survey were:

Tool        Best  OK  Problematic  Dangerous  No Opinion  Active Responses  Approval %
git           65  19            1          0          14                85         99%
Mercurial     33  27            2          0          36                62         97%
Subversion    20  72            6          1           0                99         93%
Bazaar         1  13            3          0          80                17         82%
Perforce       1  26           16          1          54                44         61%
CVS            0  14           59         11          15                84         17%
ClearCase      0   3           14         41          41                58          5%
VSS            1   1           11         64          22                77          3%
TFS            0   0           32         22          44                54          0%

The participants had the option to rate 9 VCS as Best, OK, Problematic, Dangerous or No Opinion, the last meaning they haven’t used the respective VCS. “Active Responses” is the total number of respondents excluding those with “No Opinion”, while “Approval %” is (Best + OK)/Active Responses.
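As a quick check on how the two derived columns follow from the raw votes, here is a minimal Python sketch. The vote counts are copied from the table above; the names `SURVEY`, `active_responses` and `approval_percent` are illustrative and not part of Fowler's survey, and rounding to the nearest whole percent is an assumption that happens to match the published figures.

```python
# Vote counts copied from the survey table, in the order:
# (Best, OK, Problematic, Dangerous, No Opinion)
SURVEY = {
    "git": (65, 19, 1, 0, 14),
    "Mercurial": (33, 27, 2, 0, 36),
    "Subversion": (20, 72, 6, 1, 0),
    "Bazaar": (1, 13, 3, 0, 80),
    "Perforce": (1, 26, 16, 1, 54),
    "CVS": (0, 14, 59, 11, 15),
    "ClearCase": (0, 3, 14, 41, 41),
    "VSS": (1, 1, 11, 64, 22),
    "TFS": (0, 0, 32, 22, 44),
}


def active_responses(best, ok, problematic, dangerous, no_opinion):
    """Respondents who expressed an opinion (everything except No Opinion)."""
    return best + ok + problematic + dangerous


def approval_percent(best, ok, problematic, dangerous, no_opinion):
    """(Best + OK) / Active Responses, rounded to a whole percentage."""
    active = active_responses(best, ok, problematic, dangerous, no_opinion)
    return round(100 * (best + ok) / active)


if __name__ == "__main__":
    for tool, votes in SURVEY.items():
        print(f"{tool:<11}{active_responses(*votes):>4}{approval_percent(*votes):>5}%")
```

For git, for example, this gives 85 active responses and (65 + 19) / 85, which rounds to 99%.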

Although the survey covers a small number of respondents, the results reflect the subjective perception within one organization, ThoughtWorks, and may hint at how these tools are perceived in other organizations.

git and Mercurial, two open source distributed version control systems (DVCS), had the highest approval rates, over 95%. Subversion, an open source centralized VCS, also had a very good approval rate at 93%. The worst rated were commercial VCSes: IBM’s ClearCase, and Microsoft’s VSS and TFS.

All 99 respondents have used Subversion, 85 have used git and 84 CVS, while only 17 have used Bazaar. Notably, VSS has been used by a fair number of respondents (77) but collected only 3% approval.

These results support Fowler’s initial remarks on version control systems. Based on discussions with other ThoughtWorks employees and collaborators, Fowler concluded that three VCSes are generally accepted and can therefore be recommended: Subversion, git and Mercurial. The choice is then between a centralized VCS and a distributed one. According to Fowler, a distributed VCS is better because it “opens up lots of flexibility in work-flow, but that flexibility can be dangerous if you don't have the maturity to use it well”. Subversion is good because it:

encourages a simple central repository model, discouraging large scale branching. In an environment that's using Continuous Integration, which is how most of my friends like to work, that model fits reasonably well. As a result Subversion is a good choice for most environments.

And although DVCSs give you lots of flexibility in how you arrange your work-flows, most people I know still base their work patterns on the notion of a shared mainline repository that's used with Continuous Integration. Although modern VCS have almost magic tools to merge different people's changes, these merges are still just merging text. Continuous Integration is still necessary to get semantic consistency. So as a result even a team using DVCS usually still has the notion of the central master repository.

Fowler’s review continues by noting the strengths of DVCS over centralized VCS:

  • Speed – because it keeps no local copy of the repository, Subversion is slower, especially when working with a server located on a different continent.
  • Connection – DVCS can be used even when a network connection is missing.
  • Branching – DVCS encourage branching:

DVCS encourages quick branching for experimentation. You can do branches in Subversion, but the fact that they are visible to all discourages people from opening up a branch for experimental work. Similarly a DVCS encourages check-pointing of work: committing incomplete changes, that may not even compile or pass tests, to your local repository. Again you could do this on a developer branch in Subversion, but the fact that such branches are in the shared space makes people less likely to do so.

Fowler noted that centralized systems are better at handling binary files:

There is one particular case where Subversion is the better choice even for a team that skilled at using a DVCS. This case is where the artifacts you're collaborating on are binary and cannot be merged by the VCS - for example Word documents or presentation decks. In this case you need to revert to pessimistic locking with single-writer checkouts - and that requires a centralized system.
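To illustrate why single-writer checkouts depend on a central authority, here is a toy Python sketch of pessimistic locking. The `LockRegistry` class and its methods are hypothetical and are not how Subversion implements locks; the point is only that one shared lock table must decide who may edit an unmergeable file, which a set of independent local repositories cannot do on their own.

```python
class LockRegistry:
    """Hypothetical central lock table: at most one writer per file."""

    def __init__(self):
        self._holders = {}  # path -> user currently holding the lock

    def checkout(self, path, user):
        """Grant the lock if the file is free or already held by this user."""
        holder = self._holders.setdefault(path, user)
        return holder == user

    def checkin(self, path, user):
        """Release the lock; only the current holder may do so."""
        if self._holders.get(path) != user:
            raise PermissionError(f"{user} does not hold the lock on {path}")
        del self._holders[path]


if __name__ == "__main__":
    server = LockRegistry()
    print(server.checkout("deck.ppt", "alice"))  # alice gets the lock
    print(server.checkout("deck.ppt", "bob"))    # bob is refused while alice holds it
    server.checkin("deck.ppt", "alice")
    print(server.checkout("deck.ppt", "bob"))    # bob can now take the lock
```

The second writer is refused until the first checks the file back in, so conflicting edits to a binary file never have to be merged.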

Fowler expressed a low opinion of ClearCase and TFS:

Two in particular generate a lot of criticism: ClearCase (from IBM) and TFS (from Microsoft). One reason they get a lot of criticism is that they are very popular on client sites, often with company policies mandating their use (I'll describe a coping strategy for that at the end).

… developers I respect have worked extensively with, and do not recommend, these products.

VSS was considered the worst choice:

Before I finish with those behind the threshold, I just want to say a few things about a particularly awful tool: Visual Source Safe, or as I call it: Visual Source Shredder. We see this less often now, thank goodness, but if you are using it we'd strongly suggest you get off it. Now. Not just is it a pain to use, I've heard too many tales of repository corruption to trust it with anything more valuable than foo.txt.

This is a subjective opinion on VCS and does not compare the actual capabilities of these tools. The Better SCM Initiative makes an in-depth comparison of 28 version control systems, but it does not rank them. It lets the user discover which is the best VCS based on some key features: Atomic Commits, Files and Directories Moves or Renames, Intelligent Merging after Moves or Renames, File and Directory Copies, Ability to Work only on One Directory of the Repository, Tracking Uncommitted Changes, Documentation, Ease of Deployment, and others.

Scott Chacon wrote a post defending git and explaining why he believes git is better than Mercurial, Bazaar, or SVN: cheap local branching, everything is local, git is fast, git works with any workflow, and a few other reasons.

Of course, every developer or team has its favorite VCS. What is yours? What are your great stories or horror stories to tell?

  1. TFS is not VSS

    by ebru cucen

    How sad to keep TFS and VSS at the same level; TFS is a much more powerful tool than VSS. Having had VSS as a nightmare once upon a time should not keep TFS from being included in the list.

  2. Doubtful

    by Sakkraya !

    I've extensively used Perforce, Subversion, CVS, VSS and TFS. In my experience, TFS is not as bad as rated here.

  3. Worst Survey Ever

    by Morgoth Melkor

    Now now... This is really a very shallow review, and I think Mr. Fowler, being a Chief Scientist, needs to stop sending surveys around and start doing some serious work on his own.

    --Morgoth
