Category Archives: Computer Science

The Achilles’ Heel of Light-Weight Code Review

So I had my weekly meeting with my supervisor, and fellow students Zuzel and Jon Pipitone.  Something interesting popped up, and I thought I’d share it.

If it wasn’t already clear, I dig code review.  I think it really helps get a team of developers more in tune with what’s going on in their repository, and is an easy way to weed out mistakes and bad design in small chunks of code.

But there’s a fly in the soup.

This semester, the MarkUs project has been using ReviewBoard to review all commits to the repository.  We’ve caught quite a few things, and we’ve been learning a lot.

Now for that fly:

A developer recently found a typo in our code base.  An I18n variable was misspelled in one of our Views as I18n.t(:no_students_wihtou_a_group_message).  One of our developers saw this, fixed the typo, and submitted the diff for review.

I was one of the reviewers.  And I gave it the green light.  It made sense – I18n.t(:no_students_wihtou_a_group_message) is clearly wrong, and I18n.t(:no_students_without_a_group_message) is clearly what was meant here.

So the review got the “Ship It”, and the code was committed.  What we didn’t catch, however, was that the locale string was actually named “no_student_without_a_group_message” in the translation file, not “no_students_without_a_group_message”.  So the fix didn’t work.

This is important:  the diff looked good, but the bug remained because we didn’t have more information on the context of the bug.  We had no information about I18n.t(:no_students_without_a_group_message) besides the fact that I18n.t(:no_students_wihtou_a_group_message) looked wrong.

Which brings me back to the conversation we had yesterday:  while it seems plausible that code review helps catch defects in small code blocks, does the global defect count on the application actually decrease?  Since ReviewBoard doesn’t have any static analysis tools to check what our diffs are doing, isn’t it plausible that while our diffs look good, we’re not preventing ourselves from adding new bugs into the code base?

So, the question is:  does light-weight code review actually decrease the defect count across an application as a whole?

If not, can we augment these code review tools so that they’re more sensitive to the context of the diffs that are being reviewed?

Research Question Idea #3

When we started using ReviewBoard with MarkUs a few months back, all of a sudden, commits to the repository seemed to slow down: we would take more time cleaning up our code, and polishing it for others to see.

Our commits were usually quite large too.  This is because we were all working on different sections of the code, and we wanted to commit stuff that “instantly worked” and was “instantly perfect”.  So after days of silence, 1000 lines of code would suddenly go up for review…and as Jason Cohen can probably tell you, the number of defects found during review decreases as the amount of code to look at increases.  So, the reviewer would skip through 1000 lines, assume most of it was OK, and give it the Ship It.

Yeah, I know.  Awful.  I wonder if this is a standard newbie mistake for student groups just starting out with code review…

So, study idea:

Have two separate groups working on some assignment.  Have Group 1 commit to their repository without any review process.  Have Group 2 do pre-commit reviews using a tool like ReviewBoard.

Now check out the size, frequency, and readability of the repository diffs of each group.  Might generate some interesting data.

Anyhow, in our defence, we seem to have calmed down on MarkUs.  Diffs up for review are pretty small, and get posted relatively frequently.  Using ReviewBoard on MarkUs has made me a believer.  Testify!

Research Question Idea #2

(Read this if you have no idea what I’m talking about)

Why not go right for the throat?

How about I just round up all of the instructors who teach courses with group assignments, and ask them why code review tools aren’t provided or encouraged.  Or maybe they’ve tried, but they ran into a stumbling block.  Or perhaps the whole idea of using code review tools flies in the face of some important teaching method.

I won’t know until I ask.  So why not just ask?

It might not be a quick, sharp, clever scientific study, but it sure might generate some interesting material for examination.

Coming Up with Interesting Research Questions, and Idea #1

So my supervisor Greg Wilson has challenged myself and fellow grad student Zuzel to try to come up with one idea for a study per day until our next meeting.

I’ve been researching the use of code reviews in CS undergraduate classes, and this is what my ideas will center around.

My first idea is a knock-off of one that Jorge Aranda performed a while back:

Take a group of students, and tell them that they will all be working together on an assignment. Give them a spec for their assignment.  Get a time estimate in hours as to how long they think they will need to complete the assignment.

Take a second group of students (who were not present when the first group was around), and tell them that they will be working together on an assignment.  Give them the same spec that the first group had.  Tell them that they will need to use a peer code review tool like ReviewBoard for every commit.  Get a time estimate on how long they will need to complete the assignment.

Compare the two sets of estimates.

I predict that the second set will have a higher range of values.  I wouldn’t find that surprising.  I’m more interested in how much higher the estimates are.

I remember my first reaction to using ReviewBoard on MarkUs:  This’ll slow us down.  We don’t have time for this.

I’m curious if others feel the same way.

Learning to Read Things, and Code as Art

A while back, I went on a little rant about how CS students don’t learn how to read and critique code, and that because of this, they’re missing out on a huge opportunity for learning.  They’re not exercising their abilities to comprehend other people’s code.

But let’s step back for a second.  Forget code for a minute, and let’s just think about reading in general.

Check out this article that Greg Wilson forwarded to Zuzel and I.

The take home message:  reading is not a skill that can be sharpened in a vacuum.  One of the keys to increasing reading comprehension is to increase the variety of reading material to practice with.  This will help build the foundation needed to understand more sophisticated material.

Actually, this really doesn’t come as a surprise to me.  In the Drama department, before diving into a new script, we are encouraged to briefly study the time period (slang, accents, and turns of phrase in particular) and the subject matter of the play.  This builds a foundation of context, so that the script makes some sense the first time it is read.  And the first time a script is read is very important, because for me,  it lays down many of the assumptions that I carry throughout my time working with the play.  Every subsequent reading of that play builds off of the first reading.

And have you ever read T.S. Elliot’s Wasteland?  Good luck making sense of that unless you have a study guide, because that guy references a ton of stuff, and expects you to fill the gaps.

So let’s direct this back to reading source code.

Taking that article (and this video) under consideration, it seems that just slapping some code up on a projector, and getting people to talk about it might not be good enough.

Something like this might be a better idea:  check out this post by Jason Montojo.  If his CS art history course ever takes off, I’ll be first in line to sign up.

The idea of studying code as art is intruiging…maybe I’m not the only one who thinks of it like sculpture.