Any tips on how to analyze the level of agreement when:
1) there are many raters, say students randomly selected from a larger population,
2) the raters examine many different subjects, randomly selected from a larger population of interest; not every subject is rated by every rater, and some subjects are rated by more than one rater,
3) the score is binary, and
4) the comparison is against a single gold-standard rater who evaluated all subjects?
I'm interested in the level of agreement between the students (as representing the student body in general) and the gold standard.
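For two fixed raters I would normally report raw percent agreement or a chance-corrected statistic such as Cohen's kappa, but I'm not sure how either extends to this unbalanced, many-rater design. For reference, with p_o the observed proportion of agreement and p_e the proportion expected by chance:

```latex
% Cohen's kappa: chance-corrected agreement between two raters.
% p_o = observed proportion of agreement with the gold standard
% p_e = proportion of agreement expected by chance from the marginal rates
\kappa = \frac{p_o - p_e}{1 - p_e}
```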
Thanks in advance!!
Mark
The data would look something like:
student_id | subject | rating | gold_stand_rating |
1          | 1       | 0      | 0                 |
1          | 2       | 1      | 1                 |
2          | 1       | 0      |                   |
2          | 3       | 0      | 0                 |
2          | 4       | 1      | 0                 |
3          | 1       | 0      |                   |
4          | 3       | 0      |                   |
4          | 5       | 1      | 1                 |
4          | 6       | 1      | 0                 |
4          | 7       | 0      | 0                 |
5          | 2       | 0      |                   |
5          | 5       | 0      |                   |
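For concreteness, here is a minimal sketch of how I picture the analysis data set (Python/pandas here, but any stats package would do). It assumes the blank gold-standard cells simply mean the subject's gold rating was already shown on an earlier row (the gold rater scored every subject), and it computes only naive per-student and pooled percent agreement, with no chance correction and no adjustment for the clustering by rater or subject:

```python
import pandas as pd

# Example data from the table above; None marks the blank gold-standard
# cells (the gold rating is assumed to be shown only the first time a
# subject appears).
rows = [
    (1, 1, 0, 0), (1, 2, 1, 1),
    (2, 1, 0, None), (2, 3, 0, 0), (2, 4, 1, 0),
    (3, 1, 0, None),
    (4, 3, 0, None), (4, 5, 1, 1), (4, 6, 1, 0), (4, 7, 0, 0),
    (5, 2, 0, None), (5, 5, 0, None),
]
df = pd.DataFrame(rows, columns=["student_id", "subject", "rating", "gold_stand_rating"])

# The gold-standard rater evaluated every subject, so propagate each
# subject's gold rating to the rows where the cell was left blank.
df["gold_stand_rating"] = df.groupby("subject")["gold_stand_rating"].transform(
    lambda s: s.ffill().bfill()
)

# Naive summaries: proportion of student ratings matching the gold standard,
# per student and pooled (no chance correction, no clustering adjustment).
df["agree"] = (df["rating"] == df["gold_stand_rating"]).astype(int)
print(df.groupby("student_id")["agree"].mean())  # per-student agreement
print(df["agree"].mean())                        # pooled agreement
```

That gives one agreement proportion per student plus a pooled proportion; my question is really how to summarize and draw inferences from these properly, given that both the students and the subjects are random samples.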