I have been exploring distributed revision control systems for almost a year now.
While I am impressed by the speed and features of git, I still prefer user-friendly interfaces such as the ones provided by darcs and bzr.
So I decided to compare the performance of git and bzr on a well-known example of a large source tree, the Linux kernel. I downloaded linux-2.6.0 and the latest version, linux-2.6.15.4.
These are the versions of git and bzr that I used:
    $ git --version
    git version 0.99.9c
    $ bzr --version
    bzr (bazaar-ng) 0.7pre
First I did an init in the 2.6.0 directory:
    $ time bzr init
    real 0m1.593s
    user 0m0.140s
    sys 0m0.047s

    $ time git-init-db
    real 0m0.161s
    user 0m0.000s
    sys 0m0.006s
Then I added all files:
    $ time bzr add > /tmp/bzr-add
    real 0m31.870s
    user 0m31.072s
    sys 0m0.520s

    $ time git-add > /tmp/git-add
    real 0m42.121s
    user 0m32.428s
    sys 0m3.208s

To my surprise, bzr was quite a bit faster than git here: about 32 seconds of wall-clock time against 42.
Then I did a cp -r ../linux-2.6.15.4/* . inside the linux-2.6.0 directory. Now, how about doing a diff?
    $ time bzr diff > /tmp/bzr-diff
    real 1m13.869s
    user 0m26.168s
    sys 0m2.860s

    $ time git-diff > /tmp/git-diff
    real 2m26.982s
    user 1m48.952s
    sys 0m39.048s

Again, bzr is faster than git, by roughly a factor of two in wall-clock time.
Let’s commit the initial revision:
    $ time bzr commit -m "" > /tmp/bzr-commit
    real 2m4.757s
    user 1m16.578s
    sys 0m6.195s

    $ time git-commit -a -m "dummy." > /tmp/git-commit
    real 0m54.964s
    user 0m49.719s
    sys 0m3.297s
As you can see, committing a large tree is where git really shines.
Next, I did a stupid test: I wanted a diff of all changes after the commit. Since I hadn’t changed anything, the diff would be empty:
    $ time bzr diff > /tmp/bzr-diff-after-commit
    real 3m51.918s
    user 0m7.216s
    sys 0m1.970s

    $ time git-diff > /tmp/git-diff-after-commit
    real 0m0.057s
    user 0m0.009s
    sys 0m0.047s
Git knows there have been no changes, so it returns immediately. Bzr, on the other hand, searches the whole tree again to see if there are changes, which is of course very slow. Git is the clear winner here.
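One common way for a tool to know quickly that nothing has changed is to cache each file's stat data (size and modification time) and compare it on the next run instead of reading file contents. Git's index works along these lines, but the following is only a minimal Python sketch of the general idea, not git's actual index format. The function names `snapshot` and `changed_files` are made up for this illustration.

```python
import os
import tempfile

def snapshot(root):
    """Record (size, mtime_ns) for every file under root, keyed by relative path."""
    cache = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat so a broken symlink does not raise
            cache[os.path.relpath(path, root)] = (st.st_size, st.st_mtime_ns)
    return cache

def changed_files(root, cache):
    """Return the sorted paths that were added, removed, or whose stat data differs."""
    current = snapshot(root)
    changed = {path for path, meta in current.items() if cache.get(path) != meta}
    removed = set(cache) - set(current)
    return sorted(changed | removed)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tree:
        target = os.path.join(tree, "MAINTAINERS")
        with open(target, "w") as handle:
            handle.write("original maintainers\n")
        cache = snapshot(tree)
        print("before edit:", changed_files(tree, cache))
        with open(target, "a") as handle:
            handle.write("one more name\n")
        print("after edit:", changed_files(tree, cache))
```

The comparison only calls stat on each file, so an unchanged tree can be checked without opening a single file. A real implementation also has to deal with edits that keep the same size within one timestamp tick, which this sketch ignores.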
Let’s give bzr another chance: what about a status report? This is something developers do often, because they want to know which files were modified, added, deleted, and so on.
    $ time bzr status > /tmp/bzr-status-after-commit
    real 0m19.711s
    user 0m15.180s
    sys 0m1.178s

    $ time git-status > /tmp/git-status-after-commit
    real 0m0.442s
    user 0m0.256s
    sys 0m0.202s
It’s not as bad as the diff, but bzr is still way too slow here. Git, as expected, can easily check if there have been changes, so it returns immediately.
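Each number in this post comes from a single run, and single runs can vary with what happens to be in the disk cache. For read-only commands such as status or diff, repeating the measurement is cheap. Here is a small Python sketch of how that could be done. The helper name `time_command` is hypothetical, and the command in the demo is a placeholder so that the script runs on any machine.

```python
import statistics
import subprocess
import sys
import time

def time_command(argv, cwd=None, repeats=3):
    """Run argv `repeats` times and return the wall-clock seconds of each run."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded; a non-zero exit status is ignored (check=False).
        subprocess.run(argv, cwd=cwd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        samples.append(time.perf_counter() - start)
    return samples

if __name__ == "__main__":
    # Placeholder command; substitute e.g. ["bzr", "status"] and pass the
    # working tree as cwd to time a real status call.
    samples = time_command([sys.executable, "-c", "pass"])
    print(f"median {statistics.median(samples):.3f}s over {len(samples)} runs")
```

This measures only wall-clock time, the equivalent of the "real" line, and it is not suitable for commands such as init or commit that change the repository on every run.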
Now, let’s actually make a change. I added my name to the MAINTAINERS file and tried to commit it.
    $ time bzr commit -m "bla" > /tmp/bzr-commit-after-change
    real 2m6.685s
    user 0m31.734s
    sys 0m3.458s

    $ time git-commit -a -m "bla" > /tmp/git-commit-after-change
    real 0m7.364s
    user 0m6.936s
    sys 0m0.430s
It takes git only 7 seconds to update its data structure, so that it can easily check later whether there has been a change. Bzr, however, seems to be traversing the whole tree again.
In conclusion, up to the initial commit bzr is very fast (mostly even faster than git). After committing, however, git can easily check against the committed version, while it takes bzr very long to do that. Performing a diff, getting the status, or committing again is then very slow compared to git.
Without an initial commit, bzr diff is twice as fast as git-diff. Adding is also quite a bit faster.
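To make these comparisons concrete, the wall-clock ("real") values from the runs above can be converted to seconds and divided. This is a minimal Python sketch. The helper names `to_seconds` and `ratios` are made up for this illustration, and the table simply copies the figures reported in this post.

```python
import re

def to_seconds(stamp):
    """Convert a shell `time` value such as '1m13.869s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

# Wall-clock times copied from the measurements in this post: (bzr, git).
REAL_TIMES = {
    "init": ("0m1.593s", "0m0.161s"),
    "add": ("0m31.870s", "0m42.121s"),
    "diff before commit": ("1m13.869s", "2m26.982s"),
    "initial commit": ("2m4.757s", "0m54.964s"),
    "diff after commit": ("3m51.918s", "0m0.057s"),
    "status after commit": ("0m19.711s", "0m0.442s"),
    "commit after change": ("2m6.685s", "0m7.364s"),
}

def ratios(times):
    """Return {step: bzr_seconds / git_seconds}; values below 1 mean bzr was faster."""
    return {step: to_seconds(bzr) / to_seconds(git)
            for step, (bzr, git) in times.items()}

if __name__ == "__main__":
    for step, ratio in ratios(REAL_TIMES).items():
        print(f"{step:22s} bzr/git = {ratio:8.2f}")
```

A ratio close to 0.5 for the diff before the commit corresponds to bzr being about twice as fast there, while ratios far above 1 mark the steps where git wins.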
Of course, this was not a fair comparison, since the bzr developers have not been optimizing for speed so far. Speed improvements are planned for bzr 2.0. And I can’t blame them; wasn’t it Tony Hoare who stated that “premature optimization is the root of all evil”?
I think bzr will certainly do for my projects, and can only get better in terms of performance. Its user experience is excellent as opposed to git’s. It also has good support for Windows, which is an important factor for general adoption in my opinion.