Pros and Cons of Subversion over CVS
This page is a hastily written overview of the relative benefits of
Subversion and CVS. It is not a finished article, and may never
become one. It was something I wrote in an email once, and turned
into a web page when I found myself wanting to quote it to a second
person.
I have written a proper article about Subversion, here.
That's not a comparison document as such; it documents my
experiences of migrating from CVS to Subversion, and while it
touches on some of the points below, it doesn't provide what I'd
call a comprehensive list of them.
This list is written from the point of view of Subversion. That is,
everything labelled "pro" below is an advantage of Subversion
over CVS, and everything labelled "con" is a disadvantage of
Subversion.
- Most obvious change is of course the single revision number.
- pro: makes it easy to know what went into a build
- pro: largely removes the need for date-based checkouts, since remembering one revision number is even easier than remembering a date. Just arrange for the revision number to be embedded in every binary you build, and you're done.
- The branching model is extremely different. In CVS, branches are represented as a fork in the time dimension; both branches of a file are seen at the same absolute pathname, and you switch between them by deciding which revision number branch you're working on. In SVN, the time dimension is unbroken and linear; an SVN repository is simply an ordinary-looking file system seen at a number of snapshots throughout its history. Branches are not officially recognised by the software at all; to create a branch you just copy (say) the path "/myproj" to "/myproj-branch" and do further development by checking out from there instead (exactly as you might have done if you wanted to branch the software in the absence of any version control). SVN's support for cheap file and directory copies makes this internally efficient, and the "svn switch" command is provided as a means of conveniently moving a single working copy directory between branches.
- pro: creating a branch is now a version-controlled operation, meaning you get to track who did it, when and why
- pro: "svn ls" can easily show you what branches exist and are active, which was always hard to keep track of in CVS
- pro: you can delete a branch when you've finished with it (although if anyone later needs to recover it, they can always do so by using "svn cp" from an earlier revision)
- con: the benefit of the single revision number is
partially undone by this branching model, because to specify what
files went into a build you now have to specify a pathname within
the repository as well as the revision number. That said, at least
you can do that at all – CVS didn't even permit you to give a
branch tag and a date tag together!
- con: there is no way to bring together all the various
branched versions of a single source file, if (for example) you're
trying to remember which branch a particular change was
made on. CVS didn't exactly support this either, but it was at least
practically feasible in some cases by grepping the ,v file itself.
- con: it's also rather heavyweight if you only wanted to branch one single file for some unofficial or private purpose.
- SVN stores a pristine copy of every checked-out source file on the client side, in a subdirectory of ".svn".
- pro: this allows common operations such as "svn status" and some forms of "svn diff" to work entirely locally without needing to talk to the SVN server.
- con: but of course your working copies take up twice as
much space. Where I work this is a significant issue, since we have so much source!
- con: also, if your working copies are stored on an NFS
volume, you might not even save much time, since it doesn't make
much difference whether you're comparing your working files with
pristine copies on an NFS server or on an SVN server – the network
RTT and transfer rate are the limiting factors either way. So if you
use network file servers, you don't even get much benefit in return
for the space cost.
- con: recursive greps of source directories now turn up
lots of bogus hits in ".svn" subdirectories. I don't
doubt that at some point someone will provide a recursive grep which
leaves out .svn directories without having to be told to every time,
but I haven't seen it yet.
- pro: despite the large number of cons to go with the one
pro, I will stress that the pro is a big pro in many circumstances and often outweighs all of the cons!
- pro: rumour has it that at some point the local pristine copies might become optional, allowing users to choose their own tradeoff between local storage and network utilisation
- improved security model as discussed at length in my article
- pro: even better still in SVN 1.3, with proper access control even in svnserve
- repository storage mechanism is totally different
- con: Berkeley DB format is a horrid mess, self-admittedly prone to corruption incidents requiring manual recovery, and IME terribly slow
- pro: fortunately, they saw the light and implemented fsfs, which is outstandingly cool and now also the default
- pro: fsfs is extremely friendly to incremental backup strategies
- pro: fsfs permits safe (race-condition-free) read-only access to a repository
- con: neither of SVN's formats provides the simplicity of
CVS's from the point of view of someone poking around the on-disk
repository by hand. Many functions not officially implemented in CVS
could be worked around by hand-grepping – or occasionally
hand-editing – the ,v files. This is not feasible in SVN, so you're stuck with only the repository search and manipulation tools provided. Fortunately, these are by and large good enough. (Also, CVS's conceptually simple storage format was a large part of what made it hard to support file renames.)
- lots of small but important ways in which SVN has fixed CVS's annoying brokennesses
- pro: genuinely atomic commits (CVS can fail part-way through, and also tags the various files in a single commit with independently generated timestamps, so that a date-based checkout can occasionally give you only half of a big change)
- pro: support for moving and renaming files
- pro: separate "status" from "update" command, so you can quickly see what you've changed locally without having to risk making a mess with conflict markers. "cvs -n up" would have done this too, but not so nicely, and nobody I know ever remembered to use it.
- pro: when conflicts do show up during "svn update", all three versions of the file are preserved (the local version, the new remote version, and their common parent the old remote version), which is occasionally a lifesaver
- con: SVN will now not automatically cope once you've
removed the conflict markers from a file. It will have marked the
file as "conflicted" when it first displayed the C status, and when
you've sorted it out you have to manually tell it "resolved". OTOH,
this could be seen as a pro, since CVS's failure to track this
allowed you to check in conflict markers by mistake.
- pro: the all-important "revert" command is
something that it ought to have been a criminally culpable oversight
to have left out of CVS, or at least to have continued to leave out of it after the first half-hour of use!
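
As an aside on the recursive-grep complaint above (bogus hits in ".svn" subdirectories): a small wrapper that prunes those directories by default is easy enough to write. What follows is only a minimal sketch in Python, not an existing tool; the function name grep_skip_svn and its skip_dirs parameter are made up for illustration, and it assumes the files are text that can be read line by line.

```python
import os
import re


def grep_skip_svn(root, pattern, skip_dirs=(".svn",)):
    """Yield (path, line_number, line) for each line under root matching pattern.

    Directories whose names appear in skip_dirs are pruned from the walk,
    so the pristine copies kept by SVN are never opened.
    """
    regex = re.compile(pattern)
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps binary-ish files from raising.
                with open(path, "r", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if regex.search(line):
                            yield path, number, line.rstrip("\n")
            except OSError:
                # Unreadable files (permissions, dangling symlinks) are skipped.
                continue


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python grep_skip_svn.py PATTERN [DIRECTORY]
    if len(sys.argv) >= 2:
        start = sys.argv[2] if len(sys.argv) > 2 else "."
        for path, number, line in grep_skip_svn(start, sys.argv[1]):
            print("%s:%d:%s" % (path, number, line))
```

The pruning relies on os.walk's documented behaviour of honouring in-place changes to the directory list during a top-down walk, so the cost of reading the pristine copies is avoided entirely rather than filtered out afterwards.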