The Achilles’ Heel of Light-Weight Code Review

So I had my weekly meeting with my supervisor and with fellow students Zuzel and Jon Pipitone.  Something interesting popped up, and I thought I’d share it.

If it wasn’t already clear, I dig code review.  I think it really helps get a team of developers more in tune with what’s going on in their repository, and is an easy way to weed out mistakes and bad design in small chunks of code.

But there’s a fly in the soup.

This semester, the MarkUs project has been using ReviewBoard to review all commits to the repository.  We’ve caught quite a few things, and we’ve been learning a lot.

Now for that fly:

A developer recently found a typo in our code base:  an I18n locale key was misspelled in one of our Views as I18n.t(:no_students_wihtou_a_group_message).  The developer fixed the typo and submitted the diff for review.

I was one of the reviewers.  And I gave it the green light.  It made sense – I18n.t(:no_students_wihtou_a_group_message) is clearly wrong, and I18n.t(:no_students_without_a_group_message) is clearly what was meant here.

So the review got the “Ship It”, and the code was committed.  What we didn’t catch, however, was that the key was actually named “no_student_without_a_group_message” in the translation file, not “no_students_without_a_group_message”.  So the fix didn’t work.
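
To make the mismatch concrete, here’s roughly what the two sides looked like (the “translation missing” behaviour is stock Rails I18n; the locale file contents are my reconstruction, not the actual MarkUs source):

    # config/locales/en.yml (reconstructed for illustration):
    #
    #   en:
    #     no_student_without_a_group_message: "..."   # note: singular "student"
    #
    # Meanwhile, the "fixed" View asks for the pluralized key.  With default
    # I18n settings, nothing blows up – the user just sees:
    I18n.t(:no_students_without_a_group_message)
    # => "translation missing: en.no_students_without_a_group_message"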

This is important:  the diff looked good, but the bug survived because we had no context beyond the diff itself.  All we knew was that I18n.t(:no_students_wihtou_a_group_message) looked wrong; nothing in the diff told us whether I18n.t(:no_students_without_a_group_message) actually existed in the translation file.

Which brings me back to the conversation we had yesterday:  while it seems plausible that code review helps catch defects in small blocks of code, does the global defect count of the application actually decrease?  ReviewBoard doesn’t run any static analysis on our diffs, so isn’t it plausible that a diff can look good in review while still adding new bugs to the code base?

So, the question is:  does light-weight code review actually decrease the defect count across an application as a whole?

If not, can we augment these code review tools so that they’re more sensitive to the context of the diffs that are being reviewed?
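
For this particular class of bug, the tooling wouldn’t even need real static analysis.  A quick script that cross-references the I18n keys used in our Views against the keys defined in the locale file would have flagged the mismatch immediately.  Here’s a rough sketch – the paths, the flat key structure, and the regex are assumptions on my part, not how MarkUs is actually laid out:

    #!/usr/bin/env ruby
    # Sketch:  flag I18n keys referenced in Views that aren't defined in the
    # English locale file.  Paths and the key regex are assumptions.
    require 'yaml'

    # Collect every key defined under "en:" (assumes a flat key structure).
    defined_keys = YAML.load_file('config/locales/en.yml')['en'].keys

    # Scan the Views for I18n.t(:some_key) calls and report unknown keys.
    Dir.glob('app/views/**/*.erb') do |view|
      File.read(view).scan(/I18n\.t\(:(\w+)\)/).flatten.each do |key|
        puts "#{view}: no such locale key '#{key}'" unless defined_keys.include?(key)
      end
    end

Hooked into the review process – even as a humble pre-commit hook – something like this would hand the reviewer exactly the context we were missing.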

Research Question Idea #3

When we started using ReviewBoard with MarkUs a few months back, commits to the repository suddenly seemed to slow down:  we would take more time cleaning up and polishing our code for others to see.

Our commits were usually quite large, too.  This is because we were all working on different sections of the code, and we wanted to commit stuff that “instantly worked” and was “instantly perfect”.  So after days of silence, 1000 lines of code would suddenly go up for review…and as Jason Cohen can probably tell you, the number of defects found during review drops as the amount of code under review grows.  So the reviewer would skim through 1000 lines, assume most of it was OK, and give it the Ship It.

Yeah, I know.  Awful.  I wonder if this is a standard newbie mistake for student groups just starting out with code review…

So, study idea:

Have two separate groups work on the same assignment.  Have Group 1 commit to their repository without any review process.  Have Group 2 do pre-commit reviews using a tool like ReviewBoard.

Now check out the size, frequency, and readability of the repository diffs of each group.  Might generate some interesting data.
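
The size and frequency numbers, at least, would be cheap to collect.  Here’s a sketch of the sort of thing I mean, assuming each group’s repository is in git (readability would still need human raters):

    #!/usr/bin/env ruby
    # Sketch:  per-commit churn and commit count for one repository.
    # Assumes git; run from the repository root.

    # "@<timestamp>" marks each commit; --numstat then emits one
    # "<added>\t<removed>\t<file>" line per touched file.
    entries = `git log --pretty=format:'@%at' --numstat`.split('@').reject(&:empty?)

    stats = entries.map do |entry|
      lines = entry.lines
      time  = Time.at(lines.first.to_i)
      churn = lines.drop(1).sum { |l| l.split("\t").first(2).sum(&:to_i) }
      [time, churn]
    end

    stats.each { |t, churn| puts "#{t.strftime('%Y-%m-%d')}  #{churn} lines changed" }
    puts "Total commits: #{stats.size}"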

Anyhow, in our defence, we seem to have calmed down on MarkUs.  Diffs up for review are pretty small, and get posted relatively frequently.  Using ReviewBoard on MarkUs has made me a believer.  Testify!

Research Question Idea #2

(Read this if you have no idea what I’m talking about)

Why not go right for the throat?

How about I just round up all of the instructors who teach courses with group assignments, and ask them why code review tools aren’t provided or encouraged?  Maybe they’ve tried, but ran into a stumbling block.  Or perhaps the whole idea of using code review tools flies in the face of some important teaching method.

I won’t know until I ask.  So why not just ask?

It might not be a quick, sharp, clever scientific study, but it sure might generate some interesting material for examination.

Coming Up with Interesting Research Questions, and Idea #1

So my supervisor Greg Wilson has challenged me and fellow grad student Zuzel to come up with one study idea per day until our next meeting.

I’ve been researching the use of code reviews in CS undergraduate classes, and that’s what my ideas will center on.

My first idea is a knock-off of one that Jorge Aranda performed a while back:

Take a group of students and tell them that they will all be working together on an assignment.  Give them a spec for the assignment, and get an estimate, in hours, of how long they think they will need to complete it.

Take a second group of students (who were not present when the first group was around) and tell them the same thing, with one addition:  they will need to use a peer code review tool like ReviewBoard for every commit.  Give them the same spec the first group had, and get the same kind of time estimate.

Compare the two sets of estimates.

I predict that the second set of estimates will be higher.  That wouldn’t surprise me.  What I’m more interested in is how much higher.

I remember my first reaction to using ReviewBoard on MarkUs:  This’ll slow us down.  We don’t have time for this.

I’m curious if others feel the same way.

alertCheck Grows Up

Remember that Firefox add-on I wrote up a while back – the one that allows you to suppress annoying alert popups?

Some updates:

  • alertCheck just went public on Mozilla Add-ons after being reviewed by an add-ons editor
  • As of this writing, alertCheck has 740 downloads
  • As of this writing, alertCheck has 4 reviews – all positive

Not bad for my first add-on!