I have been exploring distributed revision control systems for almost a year now.
While I am impressed by the speed and features of git, I still prefer user-friendly interfaces such as the ones provided by darcs and bzr.
So I decided to compare the performance of git and bzr on a well-known example of a large source tree, the Linux kernel. I downloaded linux-2.6.0, and the latest version, linux-126.96.36.199.
These are the versions of git and bzr that I used:
$ git --version
git version 0.99.9c
$ bzr --version
bzr (bazaar-ng) 0.7pre
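Every step below is run under the shell's time. For anyone who wants to repeat the comparison with labelled results, here is a minimal sketch of a helper that does roughly the same thing. The name bench is my own invention, it only measures whole seconds, and it relies on date +%s, which is widely available but not guaranteed on every system:

```shell
#!/bin/sh
# bench LABEL CMD [ARGS...]
# Hypothetical helper: runs CMD once, discards its output, and prints
# "LABEL <seconds>s (exit <status>)". Resolution is whole seconds only.
bench() {
    label=$1
    shift
    start=$(date +%s)
    "$@" > /dev/null 2>&1
    status=$?
    end=$(date +%s)
    echo "$label $((end - start))s (exit $status)"
    return "$status"
}

# Harmless stand-in command; for the comparison one would pass
# something like `bzr status` or `git-status` instead.
bench "sleep-1" sleep 1
```

For sub-second steps this is too coarse, and plain time is the better choice.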
First I did an init in the 2.6.0 directory:
$ time bzr init
$ time git-init-db
Then I added all files:
$ time bzr add > /tmp/bzr-add
$ time git-add > /tmp/git-add
To my surprise bzr was quite a bit faster than git here.
Then I did a cp -r ../linux-188.8.131.52/* . inside the linux-2.6.0 directory. Now, how about doing a diff?
$ time bzr diff > /tmp/bzr-diff
$ time git-diff > /tmp/git-diff
Again, bzr is faster than git.
Let’s commit the initial revision:
$ time bzr commit -m "" > /tmp/bzr-commit
$ time git-commit -a -m "dummy." > /tmp/git-commit
As you can see, committing a large tree is where git really shines.
Next, I did a stupid test: I wanted a diff of all changes since the commit. Since I hadn’t changed anything, the diff should be empty:
$ time bzr diff > /tmp/bzr-diff-after-commit
$ time git-diff > /tmp/git-diff-after-commit
Git knows there have been no changes, so it immediately returns. Bzr on the other hand searches the whole tree again to see if there are changes, which is of course very slow. Git is a clear winner here.
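I have not studied either code base, so treat this as an illustration of the general idea rather than of what git or bzr actually do internally. A tool that records file metadata at commit time can later find changed files from timestamps alone, while a tool without such a record has to read and compare every file. In the sketch below a stamp file and find -newer stand in for the recorded metadata; the directory layout, file names and function names are all made up:

```shell
#!/bin/sh
# Illustration only: a stamp file stands in for metadata recorded at commit time.
work=$(mktemp -d)
mkdir "$work/tree" "$work/snapshot"
echo "one" > "$work/tree/a.txt"
echo "two" > "$work/tree/b.txt"

# "Commit": keep a copy of the tree and give files and stamp known times.
cp "$work/tree/a.txt" "$work/tree/b.txt" "$work/snapshot/"
touch -t 200001010000 "$work/tree/a.txt" "$work/tree/b.txt"
touch -t 200001010001 "$work/stamp"

# Strategy 1: metadata only. Lists files modified after the stamp
# without opening any of them.
changed_by_mtime() {
    (cd "$work/tree" && find . -type f -newer "$work/stamp")
}

# Strategy 2: full comparison. Reads every file in both trees.
changed_by_content() {
    (cd "$work/tree" && for f in *; do
        cmp -s "$f" "$work/snapshot/$f" || echo "./$f"
    done)
}

changed_by_mtime      # prints nothing: no file is newer than the stamp
echo "three" >> "$work/tree/b.txt"
changed_by_mtime      # prints ./b.txt
changed_by_content    # prints ./b.txt
```

Both strategies give the same answer here, but the first only looks at directory entries, which is why it stays cheap on a tree the size of the kernel.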
Let’s give bzr another chance: what about a status report? This is something developers do often, because they want to know which files were modified, added, deleted, and so on.
$ time bzr status > /tmp/bzr-status-after-commit
$ time git-status > /tmp/git-status-after-commit
It’s not as bad as the diff, but bzr is still way too slow here. Git, as expected, can easily check if there have been changes, so it immediately returns.
Now, let’s actually make a change. I added my name to the MAINTAINERS file, and tried to commit it.
$ time bzr commit -m "bla" > /tmp/bzr-commit-after-change
$ time git-commit -a -m "bla" > /tmp/git-commit-after-change
It takes git only 7 seconds to update its data structures, so that it can cheaply check for changes at a later time. Bzr, however, seems to traverse the whole tree again.
In conclusion, we can say that until the initial commit, bzr is very fast (mostly even faster than git). However, after committing, git can easily check against the committed version, while bzr takes very long to do that. With bzr, performing a diff, getting the status or committing again is very slow compared to git.
Without an initial commit, bzr diff is twice as fast as git-diff. Adding is also quite a bit faster.
Of course this was not a fair comparison, since the bzr developers have not been optimizing for speed so far. Speed improvements are planned for bzr 2.0. And I can’t blame them. Wasn’t it Tony Hoare who stated:
premature optimization is the root of all evil?
I think bzr will certainly do for my projects, and it can only get better in terms of performance. Its user experience is excellent, unlike git’s. It also has good support for Windows, which in my opinion is an important factor for general adoption.