Exploring Peer Review in the Computer Science Classroom
by Scott Turner and Manuel A. Pérez Quiñones
I’m new at reading papers, so I’ve gotten used to 5 or 10 page-ers. This looks like the big one, though – 69 pages.
I assume they have something significant to say. The title sure sounds interesting, especially considering what I’m looking for.
Anyhow, it’s a big paper, and there’s a lot to go through. Let’s get started.
Right off the bat, I can see that they’re interested in the same problem that I am:
Peer review, while it has many known benefits (Zeller 2000; Papalaskari 2003; Wolfe 2004; Hamer, Ma et al. 2005; Trytten 2005), and is used extensively in other fields (Falchikov and Goldfinch 2000; Topping, Smith et al. 2000; Liu and Hansen 2002; Dossin 2003; Carlson and Berry 2005) and in industry (Anderson and Shneiderman 1977; Anewalt 2005; Hundhausen, Agrawal et al. 2009), is not as widely used in the computer science curriculum. This may be due, in part, to a lack of information about what, who and when to review in order to achieve specific goals in computer science.
This next part got my attention:
That is not to say that the literature is silent on these issues. The studies make these choices but there are few reasons given for the decisions and even fewer comparisons performed to show relative value of those options.
Holy smokes. A pretty bold critique of those previous papers (at least one of which I've already reviewed). It sounds like Scott and Manuel were as disappointed in some of the peer code review literature as I was. If I were part of an audience being read this paper aloud, I would "wooo!" at this point.
What is needed is a clearer understanding of the requirements that the discipline imposes on the peer review process so that it may be effectively used.
Cool – I’m looking forward to seeing what they dug up.
Reading onwards through the introduction, I’m seeing the same basic arguments for peer code review that I’ve seen elsewhere. I’ll summarize:
- PCR (peer code review) involves active use of the higher levels of Bloom's Taxonomy: analysis, synthesis, and evaluation, both for reviewers and review-ees
- PCR prepares students for industry, since code review is (or should be) a common part of professional software development
Soon after, a good point is brought up – PCR is potentially a beneficial learning activity, but it all depends on the goals of a particular assignment. A particular review structure may be better for improving code quality, and another might be better for increasing student motivation. These considerations need to be taken into account when choosing a PCR structure.
By PCR structure, I think the writers mean:
- What is being reviewed – source code vs design diagrams, for example
- Who is being reviewed – students could review one another, or they could all review a piece of code provided by the instructor
- When the review occurs – what level of students should take part in PCR? Is there a minimum grade level that must be achieved before PCR is effective? And when, during a project, should PCR happen? Early in the design process, or after-the-fact – like a “code autopsy”?
These are good questions. No wonder this paper is so long – I think their experiment is going to try to answer them all.
And just when things are starting to get good…they go into a literature review. Yech. I know it’s important to lay the groundwork, but I think I just felt my eyes turn to oatmeal.
The literature review starts by briefly listing off those sources again – past papers that have tried to deal with this topic. The authors categorize them based on what each paper tries to deliver, and then give them a light slam for lacking a scientific basis for the design of their PCR structures. I heartily agree.
The first part of the literature review discusses the benefits of peer review in other fields. Papers as far back as the 1950's are cited. I think it goes without saying that having other people critique your work can be a great way of receiving constructive feedback (ask any playwright, for example). I guess these fellas feel they have to be pretty rigorous though, and really ground their argument in some solid past work. Power to them.
An interesting notion is brought up – peer review conducted over an extended length of time within the same groups helps cultivate "interaction between…peers and for the building of knowledge". A one-time review, however, "[lends] itself more to a cognitive approach…more attention can be paid to the changes in the students' thought processes". Hm.
This fellow named Topping seems to be quite popular with these guys. Apparently, he came up with something called "Topping's Peer Review Typology", and I get the feeling that he is one of their primary sources in figuring out different ways of constructing PCR structures.
Oh, and an important morsel just got snuck in:
We are interested in how the student is affected by the acts of reviewing and being reviewed rather than by the social interaction occurring during the process.
So that sense of community that that other paper was talking about – not being looked at here.
The paper then goes on to chisel down some of the jargon from Topping's Typology, and make it fit the field of Computer Science. By doing this, the writers simply reiterate the variables they're going to be playing with: what, who, and when.
The paper then dives into a two page summary of some peer review papers, and the various results that they found. They note a few instances where peer review seemed to improve student performance, and other cases where peer review resulted in semi-disaster. There are just as many theories as there are conflicting accounts. From reading this stuff, it seems that peer review is a vast and complicated topic, and nobody seems to really have a firm grasp on it just yet.
Likewise, rubric creation seems to be a bit of a contentious topic:
While there seems to be a general consensus that rubrics are important and that they improve the peer review activity, there is not as much agreement on how they should be implemented.
However, it is clear that rubrics are useful in peer review for novice reviewers:
Rubrics can supply the guidance students need to learn how to evaluate an assignment. It provides the needed scaffolding until the students are comfortable with the process and the domain to make correct judgements.
Makes sense to me.
The paper then asks an important question: what makes a “successful” review in the education context? Greater understanding? Learning new concepts? Improved grades? Better designs? Fewer code defects?
The answer: it really depends on the instructor, and what the course wants from the PCR process. For example, what is more important – having the reviewer learn to review? Or having the review-ee receive good feedback? Or both? PCR is complex, and has lots of things to consider…
The next section highlights how technology is used to support peer review. One particularly interesting example is a Moodle module that allows for peer reviews on assignments. Apparently, the authors are fans. I've never used Moodle before, and haven't yet found the module they're talking about, but it sounds worth investigating.
The very next section details their experiment – their method, their data, and their results. However, this post is getting a bit long, so I’m going to stop here, and continue on in a second post.