Coding

Authors ≠ Committers: why most Open Source projects should switch to distributed revision control

I had been thinking about the idea behind this post for a while now, but reading this post about getting newbies involved in open source just convinced me to write it down.

Being a concept developed in the Open Source world, it is no wonder that distributed revision control systems give their best in that context. There are many pros and cons, that other people described in detail better than I can do. Of all the features they offer, however, the one I prefer is the least technical one, and it is related to the way they encourage new developers to contribute to open source projects. In that perspective, git and mercurial are a lot more effective than svn, for example.

It all comes as a side effect of authors and committers being two different roles. This can encourage new contributors, who are approaching a new project for the first time, and individuals who may not have the time and energy to dedicate long periods of their time to a project, but may be able to contribute with just a few patches.

A commit from GitHub How GitHub displays both the author and the committer of a single change. Oh, and yes, there is something wrong with the dates. 😉

Think about that. Recognition is one of the most important drivers for Open Source contributors but, unfortunately, centralized revision control (subversion, CVS and the like) doesn’t help in giving credit to newcomers or occasional contributors.

That’s because, generally, sending a single patch (or even a few of them) is not enough to be granted commit access to a project repository (and rightly so) and the commit itself must be done by a project member with enough privileges.

Failure is an option

As Software Engineers, we often tend to be overly optimistic about software. In particular, it often happens that we underestimate the probability of systems and components failures and the impact this kind of events can have on our applications.

We usually tend to dismiss failure events as random, unlikely and sporadic. And, often, we are proven wrong.

Systems do fail indeed. Moreover, when something goes wrong, either it’s barely noticeable, or it leads to extreme consequences. Take the example of the recent AWS outage: everything was caused by a mistake during a routine network change.

Right now, some days after the event, post-mortem analyses and survival stories count in the dozens. There is one recurring lesson that can be learned from what happened.

What I learnt from coding offline

During the last weeks, I’ve been writing a lot of code while commuting.

Since I was working on an application that uses Facebook’s and Amazon’s APIs and I was working offline, I often found myself unable to test the code I was writing against live data.

I must confess I struggled at the beginning.

I had no chance to try out my small application while I was working on it and, generally, I had to code during 1-hour trips, waiting until I got to work/home to see whether everything was performing as expected.

The outcome? I was surprisingly productive. And I’m sure it was not just because of the absence of common distraction sources (IMs, Email, Twitter and the like).

So, what made the difference? Here are some of the insights I got from the experience.