Tag Archives: code review

Code Reviews and Predictive Impact Analysis

A few posts ago, I mentioned what I think of as the Achilles’ Heel of light-weight code review: the lack of feedback about dependencies that can or will be affected by a posted change. I believe this lack of feedback can give software developers the false impression that proposed code is sound, and thus allow bugs to slip through the review process. This has happened more than once with the MarkUs project, where we are using ReviewBoard.

Wouldn’t it be nice…

Imagine that you’re working on a “Library” Rails project. Imagine that you’re about to make an update to the Book model within the MVC framework: you’ve added a more efficient class method to the Book model that allows you to check out large quantities of Books from the Library all at once, rather than one at a time. Cool. You update the BookController to use the new method, run your regression tests (which are admittedly incomplete, but pass with flying colours), and post your code for review.

Your code review tool takes the change you’re suggesting, and notices a trend: in the past, when the “checkout” methods in the Book model have been updated, the BookController has usually been changed, and a particular area of the en.yml locale file has usually been updated too. The code review tool notices that in this latest change, nothing has been changed in en.yml.

The code review tool raises its eyebrow. “I wonder if they forgot something…”, it ponders.

Now imagine that someone logs in to review the code. Along with the proposed changes, the code review tool suggests that the reviewer also take a peek at en.yml just in case the submitter has missed something. The reviewer notices that, yes, a translation string for an error message in en.yml no longer makes sense with the new method. The reviewer writes a comment about it, and submits the review.

The reviewee looks at the review and goes, “Of course! How could I forget that?”, and updates the en.yml before updating the diff under review.

Hm. It’s like a recommendation engine for code reviews… “by the way, have you checked…?”

I wonder if this would be useful…

Mining Repositories for Predicting Impact Analysis

This area of research is really new to me, so bear with me as I stumble through it.

It seems like it should be possible to predict what methods/files have dependencies on other files based on static analysis, as well as VCS repository mining. I believe this has been tried in various forms.
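To make the repository-mining half of that idea a bit more concrete, here is a minimal sketch of a co-change check, written in Ruby since the example project is a Rails app. Everything in it is an assumption made for illustration: the file paths, the hard-coded HISTORY (a real tool would mine this from the VCS log), the suggest_files helper, and the 60% threshold. It is not how any existing review tool works, and it does no static analysis at all.

```ruby
require "set"

# Hypothetical commit history: each entry is the set of files touched by one
# past commit. A real tool would build this from the VCS log.
HISTORY = [
  Set["app/models/book.rb", "app/controllers/book_controller.rb", "config/locales/en.yml"],
  Set["app/models/book.rb", "app/controllers/book_controller.rb", "config/locales/en.yml"],
  Set["app/models/book.rb", "config/locales/en.yml"],
  Set["app/models/book.rb", "app/views/books/index.html.erb"],
  Set["README"]
].freeze

# For each file in the proposed change, look at the past commits that touched
# it and count which other files changed alongside it. A file is suggested
# when it appeared in at least `min_confidence` of those commits but is
# missing from the proposed change.
def suggest_files(history, changed, min_confidence: 0.6)
  changed = changed.to_set
  suggestions = {}
  changed.each do |file|
    past = history.select { |commit| commit.include?(file) }
    next if past.empty?

    counts = Hash.new(0)
    past.each { |commit| (commit - changed).each { |other| counts[other] += 1 } }

    counts.each do |other, n|
      confidence = n.to_f / past.size
      next if confidence < min_confidence
      # Keep the strongest evidence seen for each suggested file.
      suggestions[other] = [suggestions.fetch(other, 0.0), confidence].max
    end
  end
  # Strongest suggestions first, ties broken by file name.
  suggestions.sort_by { |file, confidence| [-confidence, file] }
end

proposed = ["app/models/book.rb", "app/controllers/book_controller.rb"]
suggest_files(HISTORY, proposed).each do |file, confidence|
  puts format("by the way, have you checked %s? (co-changed in %d%% of past commits)",
              file, (confidence * 100).round)
end
```

With this made-up history, the only file the sketch flags for the Book change is the locale file. Whether a simple frequency threshold like this would be accurate enough on a real repository is exactly the open question.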

But I don’t think anything like this has been integrated into a code review tool. Certainly not any of the ones that I listed earlier.

I wonder if such a tool would be accurate… and, again, would it be useful? Could it help catch more of the bugs that the standard light-weight code review process misses?

Thoughts?

Treasure Hunting, and Research Idea #4

Remember those research questions Greg had me and Zuzel come up with? He’s asked us to select one, and find some tools that could help us in an experiment to answer that question.

Originally, my favourite question was my second one: in the courses where students work in teams, why aren’t the instructors providing or encouraging code review tools?

I’ve already received two answers to this question from instructors here at UofT.

Greg has warned me that this line of inquiry might be a bit too shallow. He warns that it’s possible that, when asked about code review, the instructors will shrug their shoulders and say that they just don’t have time to teach it on top of everything else. Karen Reid’s response echoes this warning, somewhat.

And maybe Steve Easterbrook has a point – that code review is hard to assess as an assignment. It seems he’s at least tried it. However, it appears that he was using fragments of Fagan Inspection reports as his measuring stick rather than the reviews themselves. I assert that this is where light-weight code review tools could be of some service: to actually see the conversation going on about the code. To see the review. I also assert that such a conversation is hard to fake, or at least, to fake well.

So, just go with me on this for a second: let’s pretend that Steve is going to teach his course again. Except this time, instead of collecting fragments of Fagan Inspection reports, he uses something like ReviewBoard, Crucible, or Code Collaborator to collect reviews and conversations. Would it be worth his time? Would it benefit the students? Would they see the value of code review?

Reading this blog post, it seems that the Basie folk first got on the ReviewBoard bandwagon because Blake Winton set a good example as a code reviewer. I remember talking to Basie developer Bill Konrad this summer, and him telling me that he’d never go back after seeing the improvement in code quality.

Because that’s the clincher – getting the students to see the value. You can’t make a horse drink, and you can’t get students to use a tool unless they find it useful. And how do you show that to them? How do you show them the value without having to call in Blake Winton? How do you make them see? And how do you make the process painless enough so that instructors don’t have to worry about teaching a new, confusing tool?

One of the comments on Greg’s post says the following:

My feeling is that the easier it is to review code, the more interest you’ll see from students.

Maybe that’s really all you need.

So, how about this for an experiment – take a class of students who are working on group assignments, and set them up with a copy of a light-weight code review tool. Get one of the first assignments in the class to involve a review so that they need to use the software at least once. Now track the usage for the rest of the semester… do they still use it? Do some use it, and not others? If so, do the ones who use it tend to receive higher grades? Or is there no difference? What is the quality of their reviews when the reviews are not being marked? What will the students think of the review tool once the course is completed?

I think it’s simple enough, and I can’t believe I didn’t think about it earlier.

Some of the software I could use:

Quite a few choices for the review tool. And I wasn’t even digging that deeply. Perhaps I should do a quick survey across all of them to see which one would be the best fit for a CS course. Perhaps.

The Achilles’ Heel of Light-Weight Code Review

So I had my weekly meeting with my supervisor, and fellow students Zuzel and Jon Pipitone. Something interesting popped up, and I thought I’d share it.

If it wasn’t already clear, I dig code review. I think it really helps get a team of developers more in tune with what’s going on in their repository, and is an easy way to weed out mistakes and bad design in small chunks of code.

But there’s a fly in the soup.

This semester, the MarkUs project has been using ReviewBoard to review all commits to the repository. We’ve caught quite a few things, and we’ve been learning a lot.

Now for that fly:

A developer recently found a typo in our code base. An I18n variable was misspelled in one of our Views as I18n.t(:no_students_wihtou_a_group_message). One of our developers saw this, fixed the typo, and submitted the diff for review.

I was one of the reviewers. And I gave it the green light. It made sense – I18n.t(:no_students_wihtou_a_group_message) is clearly wrong, and I18n.t(:no_students_without_a_group_message) is clearly what was meant here.

So the review got the “Ship It”, and the code was committed. What we didn’t catch, however, was that the locale string was actually named “no_student_without_a_group_message” in the translation file, not “no_students_without_a_group_message”. So the fix didn’t work.

This is important: the diff looked good, but the bug remained because we didn’t have more information on the context of the bug. We had no information about I18n.t(:no_students_without_a_group_message) besides the fact that I18n.t(:no_students_wihtou_a_group_message) looked wrong.
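As a rough illustration of the missing context, here is a minimal Ruby sketch of a check that would have flagged this particular fix. It is only a sketch under stated assumptions: LOCALE_YAML and VIEW_SOURCE are hypothetical stand-ins for our locale file and view (the message text is a placeholder), referenced_keys and missing_keys are names I made up, and the pattern only understands the simple I18n.t(:key) form with top-level keys. Nothing like this is built into ReviewBoard as far as I know.

```ruby
require "yaml"

# Hypothetical stand-in for the locale file; the message text is a placeholder.
LOCALE_YAML = <<~LOCALE
  en:
    no_student_without_a_group_message: "placeholder message"
LOCALE

# Hypothetical line from a view, as it would look after the "fix".
VIEW_SOURCE = '<%= I18n.t(:no_students_without_a_group_message) %>'

# Collect the symbol keys passed to I18n.t(...) in a piece of source text.
# Only the simple I18n.t(:key) form is recognised.
def referenced_keys(source)
  source.scan(/I18n\.t\(\s*:(\w+)/).flatten
end

# Return the referenced keys that have no top-level entry under `locale`.
def missing_keys(source, locale_yaml, locale: "en")
  defined = YAML.safe_load(locale_yaml).fetch(locale, {}).keys
  referenced_keys(source) - defined
end

missing_keys(VIEW_SOURCE, LOCALE_YAML).each do |key|
  puts "I18n key not found in locale file: #{key}"
end
```

Run against the diff under review, a check like this reports the key as missing both before and after the typo fix, which is the piece of information the reviewers did not have.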

Which brings me back to the conversation we had yesterday: while it seems plausible that code review helps catch defects in small code blocks, does the global defect count on the application actually decrease? Since ReviewBoard doesn’t have any static analysis tools to check what our diffs are doing, isn’t it plausible that while our diffs look good, we’re not preventing ourselves from adding new bugs into the code base?

So, the question is: does light-weight code review actually decrease the defect count across an application as a whole?

If not, can we augment these code review tools so that they’re more sensitive to the context of the diffs that are being reviewed?

Research Question Idea #3

When we started using ReviewBoard with MarkUs a few months back, all of a sudden, commits to the repository seemed to slow down: we would take more time cleaning up our code, and polishing it for others to see.

Our commits were usually quite large too. This is because we were all working on different sections of the code, and we wanted to commit stuff that “instantly worked” and was “instantly perfect”. So after days of silence, 1000 lines of code would suddenly go up for review… and as Jason Cohen can probably tell you, the number of defects found during review decreases as the amount of code to look at increases. So, the reviewer would skim through 1000 lines, assume most of it was OK, and give it the Ship It.

Yeah, I know. Awful. I wonder if this is a standard newbie mistake for student groups just starting out with code review…

So, study idea:

Have two separate groups working on some assignment. Have Group 1 commit to their repository without any review process. Have Group 2 do pre-commit reviews using a tool like ReviewBoard.

Now check out the size, frequency, and readability of the repository diffs of each group. Might generate some interesting data.
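Size and frequency, at least, are easy to compute once the commit data has been pulled out of each repository. Here is a minimal Ruby sketch; the Commit struct, the diff_stats helper, and the sample timestamps and line counts are all invented for illustration and are not real data from any group. Readability is not covered, since I don't have an obvious way to measure it.

```ruby
# Hypothetical commit record: when it landed, and how many lines it added
# and removed.
Commit = Struct.new(:time, :added, :removed)

# Summarise one group's repository: number of commits, mean lines changed
# per commit, and mean hours between consecutive commits.
def diff_stats(commits)
  return nil if commits.empty?

  sorted = commits.sort_by(&:time)
  sizes = sorted.map { |c| c.added + c.removed }
  # Time - Time yields seconds as a Float; convert each gap to hours.
  gaps = sorted.each_cons(2).map { |a, b| (b.time - a.time) / 3600.0 }
  {
    commits: sorted.size,
    mean_lines_changed: sizes.sum.to_f / sizes.size,
    mean_hours_between_commits: gaps.empty? ? nil : gaps.sum / gaps.size
  }
end

# Made-up sample data, only to show the shape of the input.
SAMPLE_COMMITS = [
  Commit.new(Time.utc(2009, 1, 1, 9), 30, 10),
  Commit.new(Time.utc(2009, 1, 1, 15), 20, 0),
  Commit.new(Time.utc(2009, 1, 2, 9), 50, 10)
].freeze

p diff_stats(SAMPLE_COMMITS)
```

Running diff_stats once per group would give two summaries to compare side by side.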

Anyhow, in our defence, we seem to have calmed down on MarkUs. Diffs up for review are pretty small, and get posted relatively frequently. Using ReviewBoard on MarkUs has made me a believer. Testify!

Research Question Idea #2

(Read this if you have no idea what I’m talking about)

Why not go right for the throat?

How about I just round up all of the instructors who teach courses with group assignments, and ask them why code review tools aren’t provided or encouraged? Maybe they’ve tried, but they ran into a stumbling block. Or perhaps the whole idea of using code review tools flies in the face of some important teaching method.

I won’t know until I ask. So why not just ask?

It might not be a quick, sharp, clever scientific study, but it sure might generate some interesting material for examination.

A Blog by Mike Conley

The personal blog of a Toronto-based software mechanic, musician, sound designer, and theatre enthusiast.