{"id":1750,"date":"2010-08-30T21:16:33","date_gmt":"2010-08-31T02:16:33","guid":{"rendered":"http:\/\/mikeconley.ca\/blog\/?p=1750"},"modified":"2023-12-20T16:25:15","modified_gmt":"2023-12-20T21:25:15","slug":"some-more-results-did-the-graders-agree-part-2","status":"publish","type":"post","link":"https:\/\/mikeconley.ca\/blog\/2010\/08\/30\/some-more-results-did-the-graders-agree-part-2\/","title":{"rendered":"Some More Results: Did the Graders Agree? &#8211; Part 2"},"content":{"rendered":"<p><a href=\"http:\/\/mikeconley.ca\/blog\/2010\/08\/24\/some-more-results-did-the-graders-agree\/\">(Click here to read the first part of the story)<\/a><\/p>\n<p>I&#8217;m just going to come right out and say it: \u00a0I&#8217;m no stats buff.<\/p>\n<p>Actually, maybe that&#8217;s giving myself too much credit. \u00a0I barely scraped through my\u00a0compulsory statistics course. \u00a0In my defense, the teaching was abysmal, and the class average was in the sewer the entire time.<\/p>\n<p>So, unfortunately, I don&#8217;t have the statistical chops that a real scientist should.<\/p>\n<p>But, today, I learned a new trick.<\/p>\n<h3>Pearson&#8217;s Correlation Co-efficient<\/h3>\n<p>Joordens and Pare gave me the idea while I was reviewing <a href=\"http:\/\/absurdium.utsc.utoronto.ca\/peerScholar\/peerScholar_site\/Publications\/Peering%20into%20large%20lectures_examining%20peer%20and%20expert%20mark%20agreement%20using%20peerScholar,%20an%20on%20line%20peer%20assessment%20tool.pdf\">their paper<\/a> for the Related Work section of my thesis. \u00a0They used it to inspect mark agreement between their expert markers.<\/p>\n<p><a href=\"http:\/\/mikeconley.ca\/blog\/2010\/08\/24\/some-more-results-did-the-graders-agree\/\">In my last post on Grader agreement<\/a>, I was looking at mark agreement at the equivalence level. 
\u00a0Pearson&#8217;s Correlation Co-efficient should (I think) let me inspect mark agreement at the &#8220;shape&#8221; level.<\/p>\n<p>And by shape level, I mean this: \u00a0if Grader 1 gives a high mark for a participant, then Grader 2 gives a high mark. \u00a0If Grader 1 gives a low mark for the next participant, then Grader 2 gives a low mark. \u00a0These high and low marks might not be equal, but the basic shape of the thing is there.<\/p>\n<p>And <a href=\"http:\/\/www.gifted.uconn.edu\/siegle\/research\/correlation\/alphaleve.htm\">this page<\/a>, with <a href=\"http:\/\/www.gifted.uconn.edu\/siegle\/research\/correlation\/corrchrt.htm\">its useful table<\/a>, tells me how I can tell whether the correlation co-efficient that I find is significant. \u00a0Awesome.<\/p>\n<p>At least, that&#8217;s my interpretation of Pearson&#8217;s Correlation Co-efficient. \u00a0Maybe I&#8217;ve got it wrong. \u00a0Please let me know if I have.<\/p>\n<p>Anyhow, it can&#8217;t hurt to look at some more tables. \u00a0Let&#8217;s do that.<\/p>\n<h3>About these tables&#8230;<\/h3>\n<p>As in my previous post on graders, I&#8217;ve organized my data into two tables &#8211; one for each assignment.<\/p>\n<p>Each table has a row for each of that assignment&#8217;s criteria.<\/p>\n<p>Each table has two columns &#8211; the first is strictly to list the assignment criteria. \u00a0The second column gives the Pearson Correlation Co-efficient for each criterion. 
\u00a0The correlation measurement is between the marks that my two Graders gave on that criterion across all 30 submissions for that assignment.<\/p>\n<p>I hope that makes sense.<\/p>\n<p>Anyways, here goes&#8230;<\/p>\n<h3>Da-ta!<\/h3>\n<h4>Decks and Cards Grader Correlation Table<\/h4>\n<p>[table id=8 \/]<\/p>\n<h4>Flights and Passengers Grader Correlation Table<\/h4>\n<p>[table id=9 \/]<\/p>\n<h3>What does this tell us?<\/h3>\n<p>Well, first off, remember that for each assignment, for each criterion, there were 30 submissions.<\/p>\n<p>So N = 30.<\/p>\n<p>In order to determine if the correlation co-efficients are significant, we look at <a href=\"http:\/\/www.gifted.uconn.edu\/siegle\/research\/correlation\/corrchrt.htm\">this table<\/a>, and find N &#8211; 2 (the degrees of freedom, 28) down the left-hand side:<\/p>\n<blockquote><p>28 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 .306 \u00a0 \u00a0.361 \u00a0 \u00a0.423 \u00a0 \u00a0.463<\/p><\/blockquote>\n<p>Those 4 values on the right are the critical values that a co-efficient has to reach for significance at p &lt; 0.10, 0.05, 0.02 and 0.01, from left to right.<\/p>\n<p>Good news! \u00a0<strong>All<\/strong> of the correlation co-efficients reach at least the lowest critical value of .306, and most of them pass the highest one, .463. 
\u00a0So now, I&#8217;ll show you their significance by level:<\/p>\n<h4>p &lt; 0.10<\/h4>\n<ul>\n<li>Design of __str__ in Decks and Cards assignment<\/li>\n<\/ul>\n<h4>p &lt; 0.05<\/h4>\n<ul>\n<li>Design of deal method in Decks and Cards assignment<\/li>\n<\/ul>\n<h4>p &lt; 0.02<\/h4>\n<ul>\n<li>Design of heaviest_passenger method in Flights and Passengers<\/li>\n<\/ul>\n<h4>p &lt; 0.01<\/h4>\n<h5>Decks and Cards<\/h5>\n<ul>\n<li>Design of Deck constructor<\/li>\n<li>Style<\/li>\n<li>Internal Comments<\/li>\n<li>__str__ method correctness<\/li>\n<li>deal method correctness<\/li>\n<li>Deck constructor correctness<\/li>\n<li>Docstrings<\/li>\n<li>shuffle method correctness<\/li>\n<li>Design of shuffle method<\/li>\n<li>Design of cut method<\/li>\n<li>cut method correctness<\/li>\n<li>Error checking<\/li>\n<\/ul>\n<h5>Flights and Passengers<\/h5>\n<ul>\n<li>Design of __str__ method<\/li>\n<li>Design of lightest_passenger method<\/li>\n<li>Style<\/li>\n<li>Design of Flight constructor<\/li>\n<li>Internal comments<\/li>\n<li>Design of add_passenger method<\/li>\n<li>__str__ method correctness<\/li>\n<li>Error checking<\/li>\n<li>heaviest_passenger method correctness<\/li>\n<li>Docstrings<\/li>\n<li>lightest_passenger method correctness<\/li>\n<li>Flight constructor correctness<\/li>\n<li>add_passenger method correctness<\/li>\n<\/ul>\n<p><strong>Wow!<\/strong><\/p>\n<h3>Correlation of Mark Totals<\/h3>\n<p>Joordens and Pare ran their correlation statistics on assignments that were marked on a scale from 1 to 10. 
\u00a0I can do the same type of analysis by simply running Pearson&#8217;s on the totals for each participant by each Grader.<\/p>\n<p>Drum roll, please&#8230;<\/p>\n<h4>Decks and Cards<\/h4>\n<p>r(28) = 0.89, p &lt; 0.01<\/p>\n<h4>Flights and Passengers<\/h4>\n<p>r(28) = 0.92, p &lt; 0.01<\/p>\n<p><strong>Awesome!<\/strong><\/p>\n<h3>Summary \/ Conclusion<\/h3>\n<p>I already showed before that my two Graders rarely agreed mark for mark, and that one Grader tended to give higher marks than the other.<\/p>\n<p>The analysis with Pearson&#8217;s correlation co-efficient seems to suggest that, while there isn&#8217;t one-to-one agreement, there is certainly a significant correlation &#8211; with the majority of the criteria having a correlation with <strong>p &lt; 0.01!<\/strong><\/p>\n<p>The total marks also show a very strong, significant, positive correlation.<\/p>\n<p>Ok, so that&#8217;s the conclusion here: \u00a0<strong>the Graders&#8217; marks do not match, but show moderate to high positive correlation to a significant degree.<\/strong><\/p>\n<h3>How&#8217;s My Stats?<\/h3>\n<p>Did I screw up somewhere? \u00a0Am I making fallacious claims? \u00a0Let me know &#8211; post a comment!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>(Click here to read the first part of the story) I&#8217;m just going to come right out and say it: \u00a0I&#8217;m no stats buff. Actually, maybe that&#8217;s giving myself too much credit. \u00a0I barely scraped through my\u00a0compulsory statistics course. 
\u00a0In my defense, the teaching was abysmal, and the class average was in the sewer the [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[626],"tags":[469,840,802,841,803,836,837,838,839,635],"class_list":["post-1750","post","type-post","status-publish","format-standard","hentry","category-research-computer-science-technology","tag-data","tag-grader-agreement","tag-grading","tag-marker-agreement","tag-marking","tag-pearsons-correlation-co-efficient","tag-r","tag-significance","tag-statistical-significance","tag-statistics"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/prmTy-se","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/posts\/1750","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/comments?post=1750"}],"version-history":[{"count":14,"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/posts\/1750\/revisions"}],"predecessor-version":[{"id":1759,"href":"https:\/\/mikeconley.ca\/b
log\/wp-json\/wp\/v2\/posts\/1750\/revisions\/1759"}],"wp:attachment":[{"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/media?parent=1750"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/categories?post=1750"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mikeconley.ca\/blog\/wp-json\/wp\/v2\/tags?post=1750"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}