I have been exploring distributed revision control systems for almost a year now.
While I am impressed by the speed and features of git, I still prefer user-friendly interfaces such as the ones provided by darcs and bzr.
So I decided to compare the performance of git and bzr on a well-known example of a large source tree, the Linux kernel. I downloaded linux-2.6.0 and the latest version, linux-2.6.15.4.
These are the versions of git and bzr that I used:
$ git --version
git version 0.99.9c
$ bzr --version
bzr (bazaar-ng) 0.7pre
First I did an init in the 2.6.0 directory:
$ time bzr init
real 0m1.593s
user 0m0.140s
sys 0m0.047s
$ time git-init-db
real 0m0.161s
user 0m0.000s
sys 0m0.006s
Then I added all files:
$ time bzr add > /tmp/bzr-add
real 0m31.870s
user 0m31.072s
sys 0m0.520s
$ time git-add > /tmp/git-add
real 0m42.121s
user 0m32.428s
sys 0m3.208s
To my surprise, bzr was quite a bit faster than git here (about 32 seconds versus 42).
Then I did a cp -r ../linux-2.6.15.4/* . inside the linux-2.6.0 directory. Now, how about doing a diff?
$ time bzr diff > /tmp/bzr-diff
real 1m13.869s
user 0m26.168s
sys 0m2.860s
$ time git-diff > /tmp/git-diff
real 2m26.982s
user 1m48.952s
sys 0m39.048s
Again, bzr is faster than git.
Let’s commit the initial revision:
$ time bzr commit -m "" > /tmp/bzr-commit
real 2m4.757s
user 1m16.578s
sys 0m6.195s
$ time git-commit -a -m "dummy." > /tmp/git-commit
real 0m54.964s
user 0m49.719s
sys 0m3.297s
As you can see, committing a large tree is where git really shines.
Next, I did a stupid test: I wanted a diff of all changes after the commit. Since I hadn’t changed anything, the diff would be empty:
$ time bzr diff > /tmp/bzr-diff-after-commit
real 3m51.918s
user 0m7.216s
sys 0m1.970s
$ time git-diff > /tmp/git-diff-after-commit
real 0m0.057s
user 0m0.009s
sys 0m0.047s
Git knows there have been no changes, so it immediately returns. Bzr on the other hand searches the whole tree again to see if there are changes, which is of course very slow. Git is a clear winner here.
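To make the difference concrete, here is a minimal Python sketch of the two strategies. It is not code from git or bzr; the function names (stat_snapshot, hash_snapshot, changed_paths) are made up for illustration. It assumes the fast path works the way git's index does in general terms: remember each file's size and modification time at commit time, and later compare that metadata instead of re-reading file contents.

```python
import hashlib
import os
import tempfile


def stat_snapshot(root):
    """Map each file's relative path to (size, mtime_ns) without reading it."""
    snap = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            snap[os.path.relpath(path, root)] = (st.st_size, st.st_mtime_ns)
    return snap


def hash_snapshot(root):
    """Map each file's relative path to a SHA-1 of its contents (reads every byte)."""
    snap = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                snap[os.path.relpath(path, root)] = hashlib.sha1(fh.read()).hexdigest()
    return snap


def changed_paths(old, new):
    """Paths added, removed, or recorded differently between two snapshots."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "MAINTAINERS"), "w") as fh:
            fh.write("original\n")
        cached = stat_snapshot(root)  # taken at "commit" time
        print(changed_paths(cached, stat_snapshot(root)))  # nothing touched yet
        with open(os.path.join(root, "MAINTAINERS"), "a") as fh:
            fh.write("my name\n")  # size changes, so the stat data differs
        print(changed_paths(cached, stat_snapshot(root)))
```

On a tree the size of the kernel, the stat-only comparison touches just directory entries and inode metadata, while the hashing version has to pull every file through the disk cache, which is the kind of gap the timings above suggest.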
Let’s give bzr another chance: what about a status report? This is something developers do often; they want to know which files were modified, added, deleted, and so on.
$ time bzr status > /tmp/bzr-status-after-commit
real 0m19.711s
user 0m15.180s
sys 0m1.178s
$ time git-status > /tmp/git-status-after-commit
real 0m0.442s
user 0m0.256s
sys 0m0.202s
It’s not as bad as the diff, but bzr is still way too slow here. Git, as expected, can easily check if there have been changes, so it immediately returns.
Now, let’s actually do a change. I added my name to the MAINTAINERS file, and tried to commit it.
$ time bzr commit -m "bla" > /tmp/bzr-commit-after-change
real 2m6.685s
user 0m31.734s
sys 0m3.458s
$ time git-commit -a -m "bla" > /tmp/git-commit-after-change
real 0m7.364s
user 0m6.936s
sys 0m0.430s
It takes git only 7 seconds to update its data structure, so that it can easily check at a later time whether there has been a change. Bzr, however, seems to be traversing the whole tree again.
In conclusion, we can say that until the initial commit, bzr is very fast (mostly even faster than git). However, after committing, git can easily check against the committed version, while it takes bzr very long to do that. Performing a diff, getting the status, or committing again is very slow compared to git.
Without an initial commit, bzr diff is twice as fast as git-diff. Adding is also quite a bit faster.
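For anyone who wants the ratios rather than my impressions, this small Python sketch computes them from the wall-clock ("real") times reported above. The table is copied from the runs in this post; the helper names (seconds, bzr_over_git) are just illustrative. A ratio below 1 means bzr was faster, above 1 means git was faster.

```python
import re

# Wall-clock ("real") times from the runs in this post: (bzr, git).
REAL = {
    "init": ("0m1.593s", "0m0.161s"),
    "add": ("0m31.870s", "0m42.121s"),
    "diff before commit": ("1m13.869s", "2m26.982s"),
    "initial commit": ("2m4.757s", "0m54.964s"),
    "diff after commit": ("3m51.918s", "0m0.057s"),
    "status": ("0m19.711s", "0m0.442s"),
    "commit after change": ("2m6.685s", "0m7.364s"),
}


def seconds(stamp):
    """Convert a `time`-style stamp such as '1m13.869s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def bzr_over_git(step):
    """Ratio of bzr's wall-clock time to git's for one step."""
    bzr, git = REAL[step]
    return seconds(bzr) / seconds(git)


if __name__ == "__main__":
    for step in REAL:
        print(f"{step:20s} bzr/git = {bzr_over_git(step):8.2f}")
```

Keep in mind that these are single runs on one machine, so the ratios are only a rough indication.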
Of course, this was not a fair comparison, since the bzr developers are not optimizing for speed at the moment. Speed improvements are planned for bzr 2.0. And I can’t blame them; wasn’t it Tony Hoare (or Donald Knuth, who put it in print) who stated: “premature optimization is the root of all evil”?
I think bzr will certainly do for my projects, and can only get better in terms of performance. Its user experience is excellent as opposed to git’s. It also has good support for Windows, which is an important factor for general adoption in my opinion.