
Starting My Thesis

So I’ve been given the go-ahead to start writing my thesis. I was going to post up some more exciting numbers/findings from my experiment, but that’ll have to wait – the thesis beckons.

I’ve started writing it, and holy smokes, it’s hard. It’s hard because I have to zoom out from my current perspective, and start right from scratch, explaining where every single decision came from.

And I have to do it in a formal, academic tone – without awesome photos.

Plan of Attack

I think I’m going to go with Alecia on this one, and start with my outline. That’s what I always did for any of my Drama classes where I had to write a big essay: start with the outline, and treat it like the skeleton…then slowly put more flesh on the skeleton. Keep fleshing it out, throw on some skin, some clothes, a lick of varnish, and bam: it’s all done.

Anyhow, that’s my plan of attack. So I need an outline. Let me show you what I have.

Tentative Outline

Intro
1. Title Page
2. Abstract
3. Acknowledgments
4. Table of Contents
5. List of Tables (where applicable)
6. List of Plates (where applicable)
7. List of Figures
8. List of Appendices (where applicable)
The Meat
1. Background
  1. Code Review
    1. What it is, how it is commonly used in industry
    2. Proven to be effective (Jason Cohen study)
    3. Helps to spread learning in a development team
  2. If code review is so good at spreading learning, why isn’t it part of the pedagogy in the undergrad curriculum?
    1. How do we teach it?
    2. The curriculum is already packed – how do we fit it in?
    3. Joordens and Paré’s peerScholar approach
  3. The idea:
    1. Have students evaluate one another after assignments, and give them a code review grade based on agreement with the TA grades.
2. Unanswered questions:
  1. Would students actually benefit from this idea?
  2. What is the relationship between the marks given by TAs, and the marks given by student evaluators?
  3. How would students feel about grading one another?
3. The experiment
  1. Terminology
    1. Assignment specification
    2. Submission
    3. Subject
    4. Grader
    5. Peer Grader
    6. Marking
    7. Marking Rubric
    8. Peer Average
    9. Agreement
  2. Design
    1. Single-blind, with two groups (control and treatment)
      1. In both groups, subjects would:
        1. fill out brief questionnaire
        2. work on two programming assignments
        3. have a maximum of half an hour to complete each assignment
        4. perform another activity during the time between assignments, dependent on their particular group:
          1. treatment group would perform some grading
          2. control group would work on a vocabulary exercise
    2. Subjects in the treatment group would then fill out a post-experiment questionnaire to get their feedback on their marking experience
    3. Counter-balancing?
    4. Graders would mark shuffled submissions
    5. Graders would choose their preferred submission
  3. Instruments
    1. Pre-experiment Questionnaire
    2. Assignment Specifications
      1. Flights and Passengers
      2. Decks and Cards
    3. Assignment Rubrics
    4. Mock-ups
    5. Vocabulary Exercise
    6. Post-experiment Questionnaire
    7. Working Environment
      1. IDE
      2. Count-down widget
      3. Screen capture
  4. Subjects
    1. Undergraduates with 4+ months of Python programming experience
    2. Months as a unit of experience
    3. The two graders
  5. Assignment Sessions
    1. Greeting, informed consent, withdrawal rights
    2. Pre-experiment questionnaire
    3. First Assignment Rules
      1. 30 minutes maximum – finish early, let me know
      2. full access to Internet
      3. work may or may not be seen by other participants in the study
      4. may ask for clarification
    4. First Assignment begins
      1. Timer widget starts
      2. Screen capture begins
      3. Subject left alone
    5. Marking / vocabulary phase
      1. Treatment group
        1. Would be given 5 submissions (secretly mock-ups), given 5 rubrics, asked to fill out as much as possible
        2. 30 minute time limit
      2. Control group
        1. Given links to 5 vocabulary exercises found online
        2. Asked to complete as much as possible, and to self-report results on a sheet of paper
        3. 30 minute time limit
    6. Second Assignment Rules
      1. Same as first, but repeated for emphasis
    7. Second Assignment begins
      1. Timer widget starts
      2. Screen capture begins
      3. Subject left alone
    8. Control group subjects released
    9. Treatment group subjects fill out post-experiment questionnaire
  6. Grading
    1. Initial meeting, and then hand-off of submissions / rubrics
    2. Hands-off approach
  7. Choosing Phase
    1. Submissions for each assignment were paired by the subject that wrote them
    2. Mock-ups not included
    3. Graders were asked to choose which one they preferred, and give a rating of the difference
4. Analysis
  1. Pearson’s Correlation Coefficient as a measure of agreement
  2. Fisher’s z-score
5. Results
  1. On grader vs. grader agreement
  2. On grader vs. peer average agreement
  3. On treatment vs. control
    1. Difference in average
    2. Grader preference
  4. On student opinion wrt peer grading
6. Discussion
7. Threats to validity
  1. The 30 minute time limit
  2. A rigid rubric
8. Future work
9. Conclusion
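
To make item 4 of the outline a little more concrete, here is a minimal sketch of Pearson’s correlation coefficient used as an agreement measure between two sets of marks, plus Fisher’s z-transformation. The outline does not say how the z-score is used; comparing two independent correlations is one common use, and that is an assumption here. The function names and the marks are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length mark lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 marks")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


def fisher_z(r):
    """Fisher's z-transformation of a correlation (only defined for -1 < r < 1)."""
    return math.atanh(r)


def compare_correlations(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations.

    Assumed use of Fisher's z; n1 and n2 are the sample sizes behind r1 and r2.
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each sample needs more than 3 observations")
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / standard_error


# Made-up marks out of 10, purely for illustration (not experiment data).
ta_marks = [7, 5, 9, 6, 8]
peer_average = [6.5, 5.5, 8.5, 6.0, 9.0]
agreement = pearson_r(ta_marks, peer_average)
```

A value of `agreement` near 1 would mean the peer average ranks submissions much like the TA does; it says nothing about whether the two sets of marks sit on the same absolute scale.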

That’s the current structure of it. I’m meeting my supervisor tomorrow and getting feedback, so this might change. Stay tuned.

My Experiment Apparatus: The Assignments, Rubrics and Mock-Ups

If you’ve read about my experiment, you’ll know that there were two Python programming assignments that my participants worked on, and a rubric for each assignment.

There were also 5 mock-up submissions for each assignment that I had my participants grade. I developed these mock-ups after a few consultations with some of our undergraduate instructors, which gave me a sense of the kind of code that undergraduate programmers tend to submit.

I’ve decided to post these materials to this blog, in case somebody wants to give them a once over. Just thought I’d open my science up a little bit.

So here they are:

Flights and Passengers

Cards and Decks

Peruse at your leisure.

Ladies and Gentlemen, Step Right Up: A Call For Participants

Some Notes About My Experiment Design

This is mostly a reminder note to myself, but I thought I’d post it publicly.

So, remember the experiment I’m conducting? I’ve been testing out various components of it while I wait for ethics board approval, and some interesting questions have come up. Some of these questions have already been raised in other posts (and in comment threads – thanks for the discussion, all!), but I want to summarize them here.

Access to the Internet

When participants are working on the assignments, they will be given full access to the Internet.

I had a bunch of conversations with undergrad instructors here at UofT, and across the board, during programming exercises, students have full access to the Internet. They’re prohibited from just copying and pasting code from somewhere else into their assignments, but they can certainly look at online documentation and examples to get some ideas. So I’m going to allow this as well, in order to better model an actual undergraduate assignment.

I’m also considering writing a script that will take a screen capture every 30 seconds or so in the background. This way, I can quickly get a sense of the participants’ activity during the assignments. This should hopefully give me some idea of how much time they’re spending in the browser, and how much time they’re spending writing Python.
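
A rough sketch of what such a background script could look like is below. The actual screen grab is left as a hypothetical `grab(path)` callable, since the post doesn’t say which tool would be used (something wrapping Pillow’s `ImageGrab.grab()` would be one option, but that is an assumption). The file naming and the `max_shots` and `sleep` parameters are also made up for illustration.

```python
import time
from datetime import datetime
from pathlib import Path


def capture_loop(grab, out_dir, interval=30.0, max_shots=None, sleep=time.sleep):
    """Call grab(path) once per interval, saving numbered, timestamped files.

    grab is a hypothetical callable that writes one screenshot to `path`.
    max_shots=None keeps going until the process is stopped.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    while max_shots is None or len(saved) < max_shots:
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        # The counter keeps file names unique even if two stamps collide.
        path = out_dir / f"capture-{len(saved):05d}-{stamp}.png"
        grab(path)
        saved.append(path)
        sleep(interval)
    return saved
```

Passing the grab and sleep functions in as parameters keeps the loop easy to dry-run without touching the real screen or waiting 30 seconds between shots.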

Participant Programming Ability

The only prerequisite for my participants is that they must have 4+ months of Python programming experience. I’m not filtering out the stronger or weaker programmers. I’m taking them all.

Why? Because I think this will give me a more accurate model of the composition of a first or second year programming course. From my experience, students in first and second year programming courses have a wide spectrum of strengths and abilities.

But what if they don’t complete any of my assignments? What if they stare blankly at the screen, or give up, or just surf Reddit because they can’t understand what I’m asking of them? That’s OK – completion of the assignments is not necessary, and again, this relaxation helps to better model a real programming course (because there are invariably students who don’t start, or don’t finish, a programming assignment).

And it’s OK if they don’t do well. I’m taking a reading of their programming abilities with the first assignment, getting them to do peer grading, and then taking a reading with a second assignment. I don’t care what absolute grades they get, I care about what happens after they’ve done the grading.

I might also see some interesting trends – for example, participants who perform poorly on the first assignment might benefit greatly from the peer grading, whereas participants who perform strongly on the first assignment might see little benefit. Or vice-versa. Or neither. Granted, I probably won’t be able to collect enough participants to make such statements with statistical strength, but if a signal appears to be there, it’d be an interesting direction for future work.
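
As a sketch of the “before and after” comparison described above: take each participant’s change from the first assignment to the second, average it per group, and compare the groups. The treatment and control split comes from the experiment design in the thesis outline; the marks and the `mean_gain` helper below are invented purely to show the arithmetic.

```python
from statistics import mean


def mean_gain(scores):
    """Average change from first to second assignment for (first, second) pairs."""
    return mean(second - first for first, second in scores)


# Hypothetical (first, second) marks, invented only to show the calculation.
treatment = [(4, 6), (7, 7), (5, 8)]
control = [(5, 5), (6, 7), (4, 4)]

treatment_gain = mean_gain(treatment)
control_gain = mean_gain(control)

# A positive value would mean the graders improved more than the non-graders.
difference = treatment_gain - control_gain
```

With so few participants, a difference like this would only hint at a trend; it would not be a statistically strong result on its own.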

30 Minute Time Limit

Students will have 30 minutes maximum to complete each assignment. And that’s absolute – it includes reading the assignment, surfing the Internet, and coding it up.

This is because a hard time limit like this, again, better models the state of classrooms as they are. In an open-book exam, you’re not timed only on how long you’ve spent writing, with the clock paused as you glance at your textbook. You have a firm time limit, and that’s it.

In an ideal world, students would be given a more personal level of teaching, as opposed to this mechanical, factory-floor approach. But if I (or anybody else) am ever going to implement these ideas in a real classroom setting, I’ll want to make sure they can be easily adapted to the teaching environment as it already is.

Ping!

I’ve done it again: I’ve let dust gather on my blog.

Quick update:

I’ve finished my courses for this semester, and have gone into full-blown research mode.
My research proposal is going through ethics review, in order to make sure that I’m not going to blow things up (or hurt anybody if I do).
While my paperwork is reviewed, I’m refining my procedure and apparatus. Better and better.
I’ve been accepted into Google Summer of Code this year – I’ll be working on Review Board. Details about my project will be the subject of an upcoming post, which I will toss up shortly.
I may or may not be co-directing a radio play. I’ll let you know.
The MarkUs team is about to release version 0.7, and a fresh batch of Summer students will soon be here at UofT to work on it!
I have not forgotten about the UCDP trip to Poland. I still have to tell you what we saw and did at Auschwitz. Cripes – it’s almost a year since I returned, and I’m only half-way through the whole story. And there’s a ton more to tell. Coming soon.

Stay tuned.

A Blog by Mike Conley

The personal blog of a Toronto based software mechanic, musician, sound designer, and theatre enthusiast.