If it wasn’t already clear, I dig code review. I think it really helps get a team of developers more in tune with what’s going on in their repository, and is an easy way to weed out mistakes and bad design in small chunks of code.
But there’s a fly in the soup.
This semester, the MarkUs project has been using ReviewBoard to review all commits to the repository. We’ve caught quite a few things, and we’ve been learning a lot.
Now for that fly:
A developer recently found a typo in our code base. An I18n variable was misspelled in one of our Views as I18n.t(:no_students_wihtou_a_group_message). One of our developers saw this, fixed the typo, and submitted the diff for review.
I was one of the reviewers. And I gave it the green light. It made sense – I18n.t(:no_students_wihtou_a_group_message) is clearly wrong, and I18n.t(:no_students_without_a_group_message) is clearly what was meant here.
So the review got the “Ship It”, and the code was committed. What we didn’t catch, however, was that the locale string was actually named “no_student_without_a_group_message” in the translation file, not “no_students_without_a_group_message”. So the fix didn’t work.
This is important: the diff looked good, but the bug remained because we didn’t have more information on the context of the bug. We had no information about I18n.t(:no_students_without_a_group_message) besides the fact that I18n.t(:no_students_wihtou_a_group_message) looked wrong.
Which brings me back to the conversation we had yesterday: while it seems plausible that code review helps catch defects in small code blocks, does the global defect count on the application actually decrease? Since ReviewBoard doesn’t have any static analysis tools to check what our diffs are doing, isn’t it plausible that while our diffs look good, we’re not preventing ourselves from adding new bugs into the code base?
So, the question is: does light-weight code review actually decrease the defect count across an application as a whole?
If not, can we augment these code review tools so that they’re more sensitive to the context of the diffs that are being reviewed?