{"id":1623,"date":"2010-08-11T12:10:43","date_gmt":"2010-08-11T17:10:43","guid":{"rendered":"http:\/\/mikeconley.ca\/blog\/?p=1623"},"modified":"2023-12-20T16:25:15","modified_gmt":"2023-12-20T21:25:15","slug":"research-experiment-a-recap","status":"publish","type":"post","link":"https:\/\/mikeconley.ca\/blog\/2010\/08\/11\/research-experiment-a-recap\/","title":{"rendered":"Research Experiment:  A Recap"},"content":{"rendered":"<p>Before I start diving into results, I&#8217;m just going to recap my experiment so we&#8217;re all up to speed.<\/p>\n<p>I&#8217;ll try to keep it short, sweet, and punchy &#8211; but remember, this is a couple of months of work right here.<\/p>\n<p>Ready?\u00a0 Here we go.<\/p>\n<h3>What I was looking for<\/h3>\n<h4>A quick refresher on what code review is<\/h4>\n<p>Code review is like the software industry equivalent of a taste test.\u00a0 A developer makes a change to a piece of software, puts that change up for review, and a few reviewers take a look at that change to make sure it&#8217;s up to snuff.\u00a0 If some issues are found during the course of the review, the developer can go back and make revisions.\u00a0 Once the reviewers give it the thumbs up, the change is put into the software.<\/p>\n<p>That&#8217;s an oversimplified description of code review,\u00a0 but it&#8217;ll do for now.<\/p>\n<h4>So what?<\/h4>\n<p>What&#8217;s important is to know that <em>it works.<\/em> <a href=\"http:\/\/mikeconley.ca\/blog\/2009\/09\/14\/smart-bear-cisco-and-the-largest-study-on-code-review-ever\/\">Jason Cohen showed<\/a> that <strong>code review reduces the number of defects that enter the final software product.<\/strong> That&#8217;s great!<\/p>\n<p>But there are some other cool advantages to doing code review as well.<\/p>\n<ol>\n<li>It helps to train up new hires.\u00a0 They can lurk during reviews to see how more experienced developers look at the code.\u00a0 They get to see what&#8217;s happening in other parts of the 
software.\u00a0 They get their code reviewed, which means direct, applicable feedback.\u00a0 All good things.<\/li>\n<li>It helps to clean and homogenize the code.\u00a0 Since the code will be seen by their peers, developers are generally compelled to not put up &#8220;embarrassing&#8221; code (or, if they do, to at least try to explain why they did).\u00a0 Code review is a great way to compel developers to keep their code readable and consistent.<\/li>\n<li>It helps to spread knowledge and good practices around the team.\u00a0 New hires aren&#8217;t the only ones to benefit from code reviews.\u00a0 There&#8217;s always something you can learn from another developer, and code review is where that will happen.\u00a0 And I believe this is true not just for those who receive the reviews, but also for those who perform the reviews.<\/li>\n<\/ol>\n<p>That last one is important.\u00a0 Code review sounds like <strong>an excellent teaching tool<\/strong>.<\/p>\n<p>So why isn&#8217;t code review part of the standard undergraduate computer science education?\u00a0 Greg and I hypothesized that the reason that code review isn&#8217;t taught is because <a href=\"http:\/\/mikeconley.ca\/blog\/2010\/03\/14\/lessons-from-peerscholar-an-approach-to-teaching-code-review\/\">we don&#8217;t know how to teach it<\/a>.<\/p>\n<p>I&#8217;ll quote myself:<\/p>\n<blockquote><p>What if peer code review isn\u2019t taught in undergraduate courses  because we just don\u2019t know how to teach it?\u00a0 We don\u2019t know how to fit it  in to a curriculum that\u2019s already packed to the brim.\u00a0 We don\u2019t know  how to get students to take it seriously.\u00a0 We don\u2019t know if there\u2019s  pedagogical value, let alone how to show such value to the students.<\/p><\/blockquote>\n<h4>The idea<\/h4>\n<p>Inspired by work by <a 
href=\"http:\/\/absurdium.utsc.utoronto.ca\/peerScholar\/peerScholar_site\/Publications\/Peering%20into%20large%20lectures_examining%20peer%20and%20expert%20mark%20agreement%20using%20peerScholar,%20an%20on%20line%20peer%20assessment%20tool.pdf\">Joordens and Pare<\/a>, Greg and I developed an approach to teaching code review that integrates itself nicely into the current curriculum.<\/p>\n<p>Here&#8217;s the basic idea:<\/p>\n<p>Suppose we have a computer programming class.\u00a0 Also suppose that after each assignment, each student is randomly presented with anonymized assignment submissions from some of their peers.\u00a0 Students will then be asked to anonymously peer grade these assignment submissions.<\/p>\n<p>Now, before you go howling your head off about the inadequacy \/ incompetence of student markers, or <a href=\"http:\/\/www.cupe3902.org\/settlement-in-peer-scholar-case\">the PeerScholar debacle<\/a>, read this next paragraph, because there&#8217;s a twist.<\/p>\n<p>The assignment submissions will still be marked by TA&#8217;s as usual.\u00a0 The grades that a student receives from her peers will not directly affect her mark.\u00a0 Instead, <strong>the student is graded based on how well they graded their peers.<\/strong> The peer reviews that a student completes will be compared with the grades that the TA&#8217;s delivered.\u00a0 The closer a student is to the TA, the better the mark they get on their &#8220;peer grading&#8221; component (which is distinct from the mark they receive for their programming assignment).<\/p>\n<p>Now, granted, the idea still needs some fleshing out, but already, we&#8217;ve got some questions that need answering:<\/p>\n<ol>\n<li>Joordens and Pare showed that for short written assignments, you need about 5 peer reviews to predict the mark that the TA will give.\u00a0 Is this also true for computer programming assignments?<\/li>\n<li>Grading students based on how much their peer grading matches TA grading assumes that the 
TA is an infallible point of reference.\u00a0 How often do TA&#8217;s disagree amongst themselves?<\/li>\n<li>Would peer grading like this actually make students better programmers?\u00a0 Is there a significant difference in the quality of their programming after they perform the grading?<\/li>\n<li>What would students think of peer grading computer programming assignments?\u00a0 How would they feel about it?<\/li>\n<\/ol>\n<p>So those were my questions.<\/p>\n<h3>How I went about looking for the answers<\/h3>\n<p>Here&#8217;s the design of the experiment in a nutshell:<\/p>\n<h4>Writing phase<\/h4>\n<p>I have a treatment group and a control group.\u00a0 Both groups are composed of undergraduate students.\u00a0 After writing a short pre-experiment questionnaire, participants in both groups will have half an hour to work on a short programming assignment.\u00a0 The treatment group will then have another half an hour to peer grade some submissions for the assignment they just wrote.\u00a0 The submissions that they mark will be mocked up by me, and will be the same for each participant in the treatment group.\u00a0 The control group will not perform any grading &#8211; instead, they will do an unrelated vocabulary exercise for the same amount of time.\u00a0 Then, participants in both groups will have another half an hour to work on the second short programming assignment. 
Participants in my treatment group will write a short post-experiment questionnaire to get their impressions on their peer grading experience.\u00a0 Then the participants are released.<\/p>\n<p>Here&#8217;s a picture to help you visualize what you just read.<\/p>\n<p style=\"text-align: center;\"><a href=\"http:\/\/mikeconley.ca\/blog\/wp-content\/uploads\/2010\/08\/workflow.png\"><img loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1628\" data-permalink=\"https:\/\/mikeconley.ca\/blog\/2010\/08\/11\/research-experiment-a-recap\/workflow\/\" data-orig-file=\"https:\/\/mikeconley.ca\/blog\/wp-content\/uploads\/2010\/08\/workflow.png\" data-orig-size=\"505,797\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;}\" data-image-title=\"Group Tasks\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/mikeconley.ca\/blog\/wp-content\/uploads\/2010\/08\/workflow.png\" class=\"size-full wp-image-1628 aligncenter\" title=\"Group Tasks\" src=\"http:\/\/mikeconley.ca\/blog\/wp-content\/uploads\/2010\/08\/workflow.png\" alt=\"Tasks for each group in my experiment.\" width=\"505\" height=\"797\" srcset=\"https:\/\/mikeconley.ca\/blog\/wp-content\/uploads\/2010\/08\/workflow.png 505w, https:\/\/mikeconley.ca\/blog\/wp-content\/uploads\/2010\/08\/workflow-190x300.png 190w\" sizes=\"auto, (max-width: 505px) 100vw, 505px\" \/><\/a><\/p>\n<p>So now I&#8217;ve got two piles of submissions &#8211; one for each assignment, 60 submissions in total.\u00a0 I add my mock-ups to each pile.\u00a0 That means 35 submissions in each pile, and 70 submissions in total.<\/p>\n<h4>Marking phase<\/h4>\n<p>I assign ID 
numbers to each submission, shuffle them up, and hand them off to some graduate-level TA&#8217;s that I hired.\u00a0 The TA&#8217;s will grade each assignment using the same marking rubric that the treatment group used to peer grade.\u00a0 They will not know if they are grading a treatment group submission, a control group submission, or a mock-up.<\/p>\n<h4>Choosing phase<\/h4>\n<p>After the grading is completed, I remove the mock-ups and pair up submissions across both piles based on who wrote them.\u00a0 So now I&#8217;ve got 30 pairs of submissions:\u00a0 one for each student.\u00a0 I then ask my graders to look at each pair, knowing that they&#8217;re both written by the same student, to choose which one they think is better coded, and to rate and describe the difference (if any) between the two.\u00a0 This is an attempt to catch possible improvements in the treatment group&#8217;s code that might not be captured in the marking rubric.<\/p>\n<h3>So that&#8217;s what I did<\/h3>\n<p>So everything you&#8217;ve just read is what I&#8217;ve just finished doing.<\/p>\n<p>Once the submissions are marked, I&#8217;ll analyze the marks for the following:<\/p>\n<ol>\n<li>Comparing the two groups, is there any significant improvement in the marks from the first assignment to the second in the treatment group?\n<ol>\n<li>If there was an improvement, on which criteria?\u00a0 And how much of an improvement?<\/li>\n<\/ol>\n<\/li>\n<li>How did the students do at grading my mock-ups?\u00a0 How similar were their peer grades to what the TAs gave?<\/li>\n<li><a href=\"http:\/\/mikeconley.ca\/blog\/2010\/08\/24\/some-more-results-did-the-graders-agree\/\">How much did my two graders agree with one another?<\/a><\/li>\n<li>During the choosing phase, did my graders tend to choose the second assignment over the first assignment more often for the treatment group?<\/li>\n<\/ol>\n<p>And <a href=\"http:\/\/mikeconley.ca\/blog\/2010\/08\/11\/some-preliminary-results\/\">I&#8217;ll also 
analyze the post-experiment questionnaire to get student feedback on their grading experience.<\/a><\/p>\n<p>Ok, so that&#8217;s where I&#8217;m at.\u00a0 Stay tuned for results.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Before I start diving into results, I&#8217;m just going to recap my experiment so we&#8217;re all up to speed. I&#8217;ll try to keep it short, sweet, and punchy &#8211; but remember, this is a couple of months of work right here. Ready?\u00a0 Here we go. What I was looking for A quick refresher on what [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[454,626],"tags":[807,804,501,806,58,496,802,426,460,800,803,801,475,494,642,799,502,635,470,645,805],"class_list":["post-1623","post","type-post","status-publish","format-standard","hentry","category-code-reviews","category-research-computer-science-technology","tag-analysis","tag-choosing","tag-code-review","tag-control-group","tag-education","tag-experiment","tag-grading","tag-greg-wilson","tag-jason-cohen","tag-joordens","tag-marking","tag-pare","tag-peer-code-review","tag-peer-review","tag-peerscholar","tag-proof","tag-science","tag-statistics","tag-study","tag-teaching","tag-treatment-group"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/prmTy-qb","jetpack_sharing_enabled":true,"_links"
:{"self":[{"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/posts\/1623","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/comments?post=1623"}],"version-history":[{"count":31,"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/posts\/1623\/revisions"}],"predecessor-version":[{"id":3160,"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/posts\/1623\/revisions\/3160"}],"wp:attachment":[{"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/media?parent=1623"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/categories?post=1623"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/tags?post=1623"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}