Does Peer Grading Make Students Better Programmers?

The Question

My past few blog posts have been concerned with the usefulness of peer grading.  Steve Joordens showed that peer grading was pedagogically useful for first-year psych students…but what about computer science students?  Would they learn from it?  Would they become better programmers?

We don’t know.

Maybe it’s time to find out.

The Experiment

It’s pretty simple, actually.

I have two groups of students.  Let’s call them groups A and B.

For each student in A, have them complete a simple programming assignment (call it P1).  Once they’re finished, have them complete a second simple programming assignment (call it P2).

For each student in B, have them complete P1.  Once that’s done, have them view 5 or 6 different mocked-up submissions, also for P1.  For each submission, have the students fill out a rubric and assign a grade.  Once finished, the students then complete P2.

Then, I get some fellow graduate students to mark my mocked-up submissions, the group A P1/P2 submissions, and the group B P1/P2 submissions.

If grading made the students better programmers, we should see higher P2 marks for the students in group B than for the students in group A.

Bonuses, and Other Concerns

This experiment is nice and simple. And, besides showing if peer grading makes students better programmers, it gives us a couple of bonuses:

  • It tells us if graduate students tend to agree on what marks to give to submissions.  If they don’t agree, and the marks wildly differ…we might have a problem.
  • It tells us if some number of students can, on average, approximate the grade a TA would give on a submission
  • It can tell us the average inspection rate for both students and TAs
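
To make the first two bonuses a bit more concrete, here is a minimal sketch of how they might be measured for a single mocked-up submission. The function names, the choice of standard deviation as the "agreement" measure, and all of the marks are my own assumptions for illustration, not settled parts of the experiment.

```python
from statistics import mean, pstdev

def ta_spread(ta_marks):
    """Population standard deviation of the TA marks for one submission.

    A large value suggests the TAs do not agree on the mark.
    """
    return pstdev(ta_marks)

def student_vs_ta_gap(student_marks, ta_marks):
    """Absolute gap between the average student mark and the average TA mark."""
    return abs(mean(student_marks) - mean(ta_marks))

# Hypothetical marks (out of 10) for one mocked-up submission.
ta_marks = [7, 8, 7]
student_marks = [6, 8, 9, 7, 7]

print(f"TA spread: {ta_spread(ta_marks):.2f}")
print(f"Student/TA gap: {student_vs_ta_gap(student_marks, ta_marks):.2f}")
```

A proper inter-rater statistic would probably be a better fit than a raw standard deviation, but this is enough to spot marks that wildly differ.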

I’ll have to do randomization here and there to eliminate ordering effects – for example, randomizing the criteria on the rubric, randomizing which assignments go first and second, randomizing the order in which the mock-up submissions are shown, etc.
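
The randomization described above could be sketched roughly like this. It assumes one seeded random generator per participant so that each ordering can be reproduced later; the function name, the criteria, and the submission labels are all hypothetical placeholders.

```python
import random

def randomized_session(criteria, assignments, mock_submissions, seed):
    """Return independently shuffled orderings for one participant.

    Seeding per participant keeps every ordering reproducible.
    """
    rng = random.Random(seed)
    return {
        # sample(k=len) returns a shuffled copy and leaves the input untouched.
        "rubric_order": rng.sample(criteria, k=len(criteria)),
        "assignment_order": rng.sample(assignments, k=len(assignments)),
        "mockup_order": rng.sample(mock_submissions, k=len(mock_submissions)),
    }

# Hypothetical rubric criteria and mock-up labels, for illustration only.
criteria = ["correctness", "style", "documentation", "design"]
assignments = ["P1", "P2"]
mockups = ["M1", "M2", "M3", "M4", "M5"]

session = randomized_session(criteria, assignments, mockups, seed=42)
print(session)
```

Using the participant number as the seed would be one simple way to get a different but repeatable ordering for everyone.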

One thing to consider though:  what effect does simply seeing the rubric have on students?

I’ve been in courses where I’ve not been allowed to see the marking rubric for some assignment.  It’s frustrating.  Seeing the rubric helps me focus on the areas that I’ll be marked on.

So what if just seeing the rubric makes the students “better programmers”?  One way to counteract this would be to have the rubric for the second assignment be quite a bit different than the one for the first assignment.


Oh yeah.  Stats.  Not my strongest subject.  I’m going to have to brush up on this (and probably enlist some help within the department) if I’m going to do this properly.  I’m probably not going to get as many participants as I think I will…so I have to accommodate small N.  Hrm.
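
One option that might cope with small N is a permutation test on the P2 marks, since it does not lean on the marks being normally distributed. The sketch below is only an assumption about how the analysis could go, not a decision: the function name and the marks are made up, and the test is one-sided (it asks whether group B did better than group A).

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_resamples=10_000, seed=0):
    """Estimate a one-sided p-value for mean(B) - mean(A).

    Group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(group_b) - mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = mean(pooled[n_a:]) - mean(pooled[:n_a])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)

# Hypothetical P2 marks (out of 10), made up purely for illustration.
group_a_p2 = [6, 7, 5, 8, 6]
group_b_p2 = [7, 8, 6, 9, 8]
print(permutation_p_value(group_a_p2, group_b_p2))
```

A small p-value would suggest the gap between the groups is unlikely to be an accident of who landed in which group, though with very few participants even a real effect may not show up.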

Anyhow, this is where my summer experiment seems to be heading.  What do you think?  I’m all ears.

4 thoughts on “Does Peer Grading Make Students Better Programmers?”

  1. George

    What about a third group of students that just looked at other solutions that their peers had created without attempting to grade them? I don’t really care about a comparison between students that do nothing extra and students that do something extra. I fully expect the ones that do extra to be slightly better. I might want to know if grading specifically is a useful activity for them.

  2. Nelle

    Stats would help discuss that matter.

    I think that if grades given by students *really* differ from the ones given by graders, it means that students didn’t understand the problem: they either didn’t understand *what* to grade, or what the “proper solution” is. For example, you ask a student to grade a Japanese assignment. If the student gives very different grades than the graders, it could mean two things: he didn’t understand the assignment (or the rubric criteria) OR he didn’t understand the answer.

    I’m not sure I’m really clear on this…

  3. Karen

    I like the experiment. I have two concerns. The first is that differences between the two assignments should be taken into account. If the second assignment is substantially different than the first you may get different results than if they are similar. Second, I would like to see some qualitative analysis. The grade they receive on the second assignment may not reflect their greater understanding of the first.

    It would also be interesting to try to factor out the “extra work” advantage. What if the students were given an annotated solution, rather than the peer evaluation, or what if they had to write a comment about a peer’s piece of work rather than give a grade?

  4. Severin

    I agree with Karen. The difference between assignments might become significant. In order to be able to conclude that peer review makes better programmers, I expect them to be similar or closely related. What do you think?

    In terms of qualitative analysis:

    Wouldn’t it also be interesting to see the graders’ reasoning as to why each submission got the mark they gave it? I wonder if the reasons grad students give are significantly different from the reasoning of peer students. BTW, by reasoning I do not mean something like “matched selected criterion best”. Another interesting variable in this regard might be to what extent a good rubric (i.e. a good explanation of what the corresponding grade levels mean) influences the reasoning of the grader. Best case, the common denominator of students’ reasoning and TAs’ reasoning match and lead effectively to a good rubric (for instructors). Also, the required justification for a given grade might give you hints about how well students understand the problem of the assignment. 🙂

    Different question:
    Do you plan to measure time reviewers spend on one submission? Although one might expect people to spend more time on the first submissions, it would be interesting to know what the average/mean time difference between TAs grading submissions and students grading submissions is.
