Monthly Archives: October 2009

Augmenting Code Review Tools: Screen Age and Retina Burn…

Two more ideas to augment the code review process:

Screen Age

Imagine you’re reviewing a piece of code.  There’s something like…500 lines to look at.  So you’re scrolling through the diff, checking things here, checking things there…

And in the background, the review software is recording which sections of the code you’re looking at.

Hm.

I wonder if it’d be a useful metric to know what portions of the code were being looked at the most during a review?  I believe it’s possible (at least within Firefox) to determine the scroll position of a window.  I believe it’s also possible to determine the dimensions of the entire diff page, and the offset coordinates of elements on that page.

So shouldn’t it be possible to determine which elements were on the screen, and for how long?

Imagine seeing a “heat map” over the diff, showing where the reviewer spent most of their time looking.
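
A minimal sketch of the bookkeeping behind such a heat map, in Python rather than browser code.  It assumes the review page can report a time-ordered list of scroll positions, the viewport height, and the top offset and height of each diff element; the names `Element`, `screen_age`, and the sample data are hypothetical, not part of any existing review tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A block of the diff page, positioned in page coordinates (pixels)."""
    name: str
    top: float
    height: float


def screen_age(elements, scroll_samples, viewport_height):
    """Return the seconds each element spent at least partly on screen.

    scroll_samples is a time-ordered list of (timestamp_seconds, scroll_top)
    pairs. Each sample is assumed to hold until the next one arrives, so the
    final sample only marks the end of the recording.
    """
    ages = {element.name: 0.0 for element in elements}
    for (start, scroll_top), (end, _) in zip(scroll_samples, scroll_samples[1:]):
        view_bottom = scroll_top + viewport_height
        for element in elements:
            element_bottom = element.top + element.height
            # Two intervals overlap when each starts before the other ends.
            if element.top < view_bottom and element_bottom > scroll_top:
                ages[element.name] += end - start
    return ages


# Hypothetical diff with three hunks and a 500-pixel-tall window.
hunks = [
    Element("hunk_1", top=0, height=400),
    Element("hunk_2", top=400, height=400),
    Element("hunk_3", top=800, height=400),
]
samples = [(0.0, 0), (10.0, 300), (25.0, 900), (30.0, 900)]
ages = screen_age(hunks, samples, viewport_height=500)
print(ages)
```

The per-element totals are what a heat map would colour by.  Note that this counts an element as "seen" whenever any part of it is in the window, which is exactly the weakness discussed next.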

Jason Cohen wrote that code reviews should take at least 5 minutes, and at most 60-90 minutes.  I wonder if collecting this information would help determine whether or not a review was performed carefully enough.

Now, granted, there are plenty of ways to add noise to the data.  If I’m doing a review, and stand up to go get a sandwich, and then my neighbour visits, etc…my computer sits there, gazing at those elements, and the data is more or less worthless…

Which brings me to:

Retina Burn

Eye tracking is not a new technology.  One of my roommates works at a lab where they do experiments with eye tracking tools.  From what he tells me, the eye tracking gear in the lab is pretty expensive and heavy-weight.

But check out these three demos: [embedded videos]

So it looks like a webcam and some clever software can do the eye tracking trick too.  This stuff is probably far less accurate than what my roommate uses – but it’s lighter, and cheaper, and therefore more likely to find its way into homes.  So it looks like the day is coming where I’ll eventually be able to use my eyes to control my mouse cursor.

But, more interestingly, this more or less solves the problem with my Screen Age idea:  this technology can tell when you’re looking at the screen.  And it can give a pretty good guess about what area of the screen you’re looking at.

I wonder if collecting this information from code reviewers would be useful – what exact parts of the code have they looked at?  And for how long?  What have they missed?  What did they gloss over?
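
A rough sketch of how gaze data might sharpen the Screen Age numbers, assuming an eye tracker that reports a vertical gaze position in window pixels, or nothing at all when the reviewer is not looking at the screen.  The function name `gaze_age`, the tuple layout, and the sample data are all made up for illustration; no real eye tracking API is being described.

```python
def gaze_age(elements, samples):
    """Attribute viewing time to the element under the reviewer's gaze.

    elements: list of (name, top, height) in page pixels.
    samples:  time-ordered (timestamp_seconds, scroll_top, gaze_y) tuples,
              where gaze_y is in viewport pixels, or None when the tracker
              says the reviewer is not looking at the screen.
    Returns (seconds per element, seconds spent looking away).
    """
    ages = {name: 0.0 for name, _, _ in elements}
    away = 0.0
    for (start, scroll_top, gaze_y), (end, _, _) in zip(samples, samples[1:]):
        duration = end - start
        if gaze_y is None:
            # Sandwich break: the page is on screen, but nobody is reading it.
            away += duration
            continue
        page_y = scroll_top + gaze_y  # convert window to page coordinates
        for name, top, height in elements:
            if top <= page_y < top + height:
                ages[name] += duration
                break
    return ages, away


# Hypothetical session: the reviewer leaves for 52 seconds in the middle.
hunks = [("hunk_1", 0, 400), ("hunk_2", 400, 400)]
samples = [
    (0.0, 0, 100),
    (5.0, 0, 450),
    (8.0, 0, None),
    (60.0, 200, 100),
    (62.0, 200, 300),
    (65.0, 200, 300),
]
ages, away = gaze_age(hunks, samples)
print(ages, away)
```

Elements whose total stays at zero are candidates for "what have they missed?", and the away time no longer pollutes the per-element totals.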

UPDATE: Karen Reid has also brought to my attention the possibility of using this technology to see how TAs grade assignments by tracking what they’re looking at.  Hm…

Limits of the i>Clicker

This post comes from an idea that Karen Reid has posed to me for potential research…

i>Clickers have been around for a few years.  I’ve never had to buy or use one in any of my classes, but it seems like more and more courses are starting to find it useful.

So what is this i>Clicker thing?

An i>Clicker is a handheld wireless device that essentially brings the “ask the audience” portion of Who Wants to Be a Millionaire into the classroom.

So, each student buys his or her own personal i>Clicker, and registers it for any classes that require it.  During one of those classes, the instructor could throw up a slide that quizzes the students on what was just taught.  Students key in their responses on the i>Clicker, and the results are then displayed up on the screen.

From what I can tell, the idea is that the i>Clicker should encourage more class participation because:

  1. Students answer simultaneously – so instead of instructors choosing a raised hand from the class, the entire class gets polled
  2. Results displayed to the class can be anonymous.  So, instead of remaining silent among your peers out of fear of being publicly wrong, all students can submit an answer, and get feedback that helps them learn

The i>Clicker can also be used to poll students and give the instructor feedback. For example, an instructor could put up a slide that says “How was my lecture today?” and get some anonymous feedback there.

Well, not exactly anonymous.  See, the instructor has the ability to see who submitted what, and when…so if you repeatedly answer quiz questions incorrectly, the instructor can probably detect that you’re misunderstanding, guessing, or just don’t care.

Anyhow, that’s the basic idea behind the i>Clicker.  It’s used in a few classes here at UofT, and I know people who’ve had to purchase ($35+) and use them.

Click here to visit the i>Clicker website

The Limits of the i>Clicker

The amount of data that students can provide through the i>Clicker is pretty limited.  Here’s a photo of the device:

The iClicker.

Ta da.

Students have a maximum of 5 choices that they can make while being polled.  Instructors are restricted to multiple-choice questions.

Hm.  Can’t we do any better?

Turning Smartphones into an i>Clicking Device

Wi-Fi-enabled “smartphones” are becoming part of everyday life.  It seems like I can’t walk half a block without seeing somebody whip out their iPhone and do something really freakin’ cool with it.

So it’s not really a far fetched idea to imagine that, some day, every student will possess one of these things.

Certainly, something like the iPhone could act as a multiple choice interface.  But is there a way of turning some of that cool touch/gesture/accelerometer stuff into useful polling feedback for students and instructors?

Some Ideas

  1. Instructor puts a graphic up on the board, and asks the students “what’s wrong with this picture?”.  Students look at the picture on their SmartPhone, and use their finger to indicate the portion of the picture that they’re interested in.  After a few seconds, the instructor displays the results – which is a semi-transparent overlay on the image, showing all of the areas that students indicated.  Areas that are of interest to more students are emphasized.  I can see this being useful for code reading classes.  The instructor splats a piece of code up on the screen, and asks the students to indicate where the bug is.
  2. Students were asked to mock-up a paper prototype for an interface that they are designing.  The instructor asks all of the students to take a picture of their paper prototype, and submit it on their SmartPhone.  The instructor is then able to put all of the photos up on the screen for discussion.  This could nicely tie in with idea #1.
  3. Students are polled to see how many years they have been programming for.  Students simply type in the number of years, and submit it.  While the i>Clicker restricts the answer to such a question to 5 ranges, the SmartPhone submits the actual answer.  Once collected, the submissions could be displayed on a histogram to give students an accurate impression of the level of experience in the classroom.
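
Ideas #1 and #3 both come down to aggregating many small submissions.  Here is a minimal sketch in Python, assuming each tap arrives as an (x, y) pair scaled to the range 0 to 1 across the image, and each "years programming" answer arrives as a whole number.  The function names and the sample submissions are hypothetical.

```python
from collections import Counter


def tap_heat_grid(taps, rows, cols):
    """Count taps per cell of a rows-by-cols grid laid over the image."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in taps:
        # Clamp so a tap on the right or bottom edge (1.0) lands in the last cell.
        col = min(int(x * cols), cols - 1)
        row = min(int(y * rows), rows - 1)
        grid[row][col] += 1
    return grid


def years_histogram(answers):
    """Return (years, number_of_students) pairs, sorted by years."""
    return sorted(Counter(answers).items())


# Hypothetical submissions from a small class.
taps = [(0.1, 0.1), (0.15, 0.2), (0.9, 0.9), (0.12, 0.05), (1.0, 1.0)]
grid = tap_heat_grid(taps, rows=2, cols=2)

answers = [0, 1, 1, 3, 3, 3, 10]
histogram = years_histogram(answers)

print(grid)
print(histogram)
```

The grid counts could drive the opacity of the overlay in idea #1, with a finer grid giving a smoother picture.  The histogram keeps the exact answers that five fixed ranges would throw away.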

Any other ideas?

Code Reviews and Predictive Impact Analysis

A few posts ago, I mentioned what I think of as the Achilles’ Heel of light-weight code review:  the lack of feedback over dependencies that can/will be impacted by a posted change.  I believe this lack of feedback can potentially give software developers the false impression that proposed code is sound, and thus allow bugs to slip through the review process.  This has happened more than once with the MarkUs project, where we are using ReviewBoard.

Wouldn’t it be nice…

Imagine that you’re working on a “Library” Rails project.  Imagine that you’re about to make an update to the Book model within the MVC framework:  you’ve added a more efficient static class method to the Book model that allows you to check out large quantities of Books from the Library all at once, rather than one at a time.  Cool.  You update the BookController to use the new method, run your regression tests (which are admittedly incomplete, and pass with flying colours), and post your code for review.

Your code review tool takes the change you’re suggesting, and notices a trend:  in the past, when the “checkout” methods in the Book model have been updated, the BookController is usually changed, and a particular area in the en.yml locale file is usually updated too.  The code review tool notices that in this latest change, nothing has been changed in en.yml.

The code review tool raises its eyebrow.  “I wonder if they forgot something…”, it ponders.

Now imagine that someone logs in to review the code.  Along with the proposed changes, the code review tool suggests that the reviewer also take a peek at en.yml just in case the submitter has missed something.  The reviewer notices that, yes, a translation string for an error message in en.yml no longer makes sense with the new method.  The reviewer writes a comment about it, and submits the review.

The reviewee looks at the review and goes, “Of course!  How could I forget that?”, and updates the en.yml before updating the diff under review.

Hm.  It’s like a recommendation engine for code reviews… “by the way, have you checked…?”

I wonder if this would be useful…

Mining Repositories for Predicting Impact Analysis

This area of research is really new to me, so bear with me as I stumble through it.

It seems like it should be possible to predict what methods/files have dependencies on other files based on static analysis, as well as VCS repository mining.  I believe this has been tried in various forms.
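
As a rough illustration of the repository mining half, here is a sketch in Python that works at the file level only.  It assumes the commit history has already been extracted as one set of changed file paths per commit.  The paths, the thresholds, and the names `co_change_counts` and `suggest_missing` are hypothetical, and this is a simple co-change count, not a description of how any existing tool does it.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each file, and each pair of files, changed together."""
    file_counts = Counter()
    pair_counts = Counter()
    for files in commits:
        file_counts.update(files)
        for pair in combinations(sorted(files), 2):
            pair_counts[pair] += 1
    return file_counts, pair_counts


def suggest_missing(changed, commits, min_confidence=0.6, min_support=2):
    """Suggest files that usually change with `changed` but are absent from it.

    Confidence is the fraction of past commits touching a changed file that
    also touched the suggested file.
    """
    file_counts, pair_counts = co_change_counts(commits)
    suggestions = {}
    for changed_file in changed:
        if file_counts[changed_file] == 0:
            continue  # no history for this file, so nothing to learn from
        for other in file_counts:
            if other in changed:
                continue
            together = pair_counts[tuple(sorted((changed_file, other)))]
            if together < min_support:
                continue
            confidence = together / file_counts[changed_file]
            if confidence >= min_confidence:
                suggestions[other] = max(confidence, suggestions.get(other, 0.0))
    return sorted(suggestions.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical history for the "Library" project described above.
history = [
    {"app/models/book.rb", "app/controllers/book_controller.rb", "config/locales/en.yml"},
    {"app/models/book.rb", "app/controllers/book_controller.rb", "config/locales/en.yml"},
    {"app/models/book.rb", "app/controllers/book_controller.rb"},
    {"app/models/book.rb", "config/locales/en.yml"},
    {"README"},
]
proposed = {"app/models/book.rb", "app/controllers/book_controller.rb"}
suggestions = suggest_missing(proposed, history)
print(suggestions)
```

With this made-up history, the proposed change touches the model and the controller but not en.yml, so en.yml is the one file flagged for the reviewer.  A real tool would likely need method-level tracking and some static analysis on top of this.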

But I don’t think anything like this has been integrated into a code review tool.  Certainly not any of the ones that I listed earlier.

I wonder if such a tool would be accurate…  and, again, would it be useful?  Could it help catch more of the bugs that the standard light-weight code review process misses?

Thoughts?

Teaching Your Dog Tricks Through a Glass Window: The 10/GUI Interface as an Adventure Game Controller

Adventure game developers have tried a bunch of ways of making it easy to control characters in 3d environments.  Myst stuck with the basic point and click.  Grim Fandango used the numpad on the keyboard.  Gabriel Knight 3 used both the mouse and keyboard:  the mouse moved the player, and the keyboard controlled the camera.

With all of these…it’s always felt a little bit…restrictive.

A few weeks ago, Ben Croshaw reviewed the new Monkey Island game.  Here’s the review.  Pay particular attention to 1:50 when he talks about the mouse controlling a character in a 3d environment.

NSFW (language):

That’s exactly right.  He’s hit the nail on the head.  It’s like trying to teach your dog something through a glass window.

The 2D interface of a mouse or keyboard makes actions in a 3d world awkward.

Multi-touch improves on 2d interfaces by giving us more bandwidth (multiple fingers = multiple cursors).  If multi-touch interfaces like 10/GUI are the next step in desktop computing, what does that mean for 3d adventure games?  How can 3d adventure games leverage multi-touch?

For that matter, how can 2d adventure games leverage multi-touch?  Think of all of the puzzles…

Treasure Hunting, and Research Idea #4

Remember those research questions Greg had me and Zuzel come up with?  He’s asked us to select one, and find some tools that could help us in an experiment to answer that question.

Originally, my favourite question was my second one:  in the courses where students work in teams, why aren’t the instructors providing or encouraging code review tools?

I’ve already received two answers to this question from instructors here at UofT.

Greg has warned me that this line of inquiry might be a bit too shallow.  He warns that it’s possible that, when asked about code review, instructors will shrug their shoulders and say that they just don’t have time to teach it on top of everything else.  Karen Reid’s response echoes this warning, somewhat.

And maybe Steve Easterbrook has a point – that code review is hard to assess as an assignment.  It seems he’s at least tried it. However, it appears that he was using fragments of Fagan Inspection reports as his measuring stick rather than the reviews themselves. I assert that this is where light-weight code review tools could be of some service: to actually see the conversation going on about the code.  To see the review.  I also assert that such a conversation is hard to fake, or at least, to fake well.

So, just go with me on this for a second:  let’s pretend that Steve is going to teach his course again.  Except this time, instead of collecting fragments of Fagan Inspection reports, he uses something like ReviewBoard, Crucible, or Code Collaborator to collect reviews and conversations.  Would it be worth his time?  Would it benefit the students?  Would they see the value of code review?

Reading this blog post, it seems that the Basie folk first got on the ReviewBoard bandwagon because Blake Winton set a good example as a code reviewer.  I remember talking to Basie developer Bill Konrad this summer, and him telling me that he’d never go back after seeing the improvement in code quality.

Because that’s the clincher – getting the students to see the value.  You can’t make a horse drink, and you can’t get students to use a tool unless they find it useful.  And how do you show that to them?  How do you show them the value without having to call in Blake Winton?  How do you make them see?  And how do you make the process painless enough so that instructors don’t have to worry about teaching a new, confusing tool?

One of the comments on Greg’s post says the following:

My feeling is that the easier it is to review code, the more interest you’ll see from students.

Maybe that’s really all you need.

So, how about this for an experiment – take a class of students who are working on group assignments, and set them up with a copy of a light-weight code review tool.  Get one of the first assignments in the class to involve a review so that they need to use the software at least once.  Now track the usage for the rest of the semester…do they still use it?  Do some use it, and not others?  If so, do the ones who use it tend to receive higher grades?  Or is there no difference?  What is the quality of their reviews when the reviews are not being marked?  What will the students think of the review tool once the course is completed?
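
A minimal sketch of the first pass such an analysis might take, assuming the review tool can export how many reviews each student posted after the mandatory assignment.  The record layout, the function name `compare_groups`, and every number below are invented for illustration; they are not results.

```python
from statistics import mean


def compare_groups(records, min_reviews=1):
    """Split students by voluntary tool usage and summarise final grades.

    records: list of (student, voluntary_reviews, final_grade).
    Returns the group size and mean grade for users and non-users.
    """
    users = [grade for _, reviews, grade in records if reviews >= min_reviews]
    non_users = [grade for _, reviews, grade in records if reviews < min_reviews]
    return {
        "users": (len(users), mean(users) if users else None),
        "non_users": (len(non_users), mean(non_users) if non_users else None),
    }


# Invented records: (student, reviews posted voluntarily, final grade).
records = [
    ("s1", 4, 82),
    ("s2", 0, 71),
    ("s3", 2, 78),
    ("s4", 0, 75),
]
summary = compare_groups(records)
print(summary)
```

A gap between the two means would only be a starting point, since students who choose to keep reviewing may differ from the others in ways that also affect their grades.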

I think it’s simple enough, and I can’t believe I didn’t think about it earlier.

Some of the software I could use:

Quite a few choices for the review tool.  And I wasn’t even digging that deeply.  Perhaps I should do a quick survey across all of them to see which one would be the best fit for a CS course.  Perhaps.