Hi all,

I am trying to calculate inter-rater reliability for a study with a complicated design and data structure. Below is a (fake) example that illustrates the structure for 2 targets:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte target str1 type byte rater str3 ph byte outcome
1 "A"  3 "MP1" 90
1 "B"  3 "MP1" 78
1 "A"  3 "MP2" 46
1 "B"  3 "MP2" 30
1 "A"  5 "MP1" 20
1 "B"  5 "MP1" 45
1 "A"  5 "MP2" 23
1 "B"  5 "MP2" 12
1 "A"  7 "MP1" 20
1 "B"  7 "MP1" 45
1 "A"  7 "MP2" 23
1 "B"  7 "MP2" 12
1 "A"  9 "MP1" 20
1 "B"  9 "MP1" 45
1 "A"  9 "MP2" 23
1 "B"  9 "MP2" 12
2 "A"  9 "MP1" 98
2 "B"  9 "MP1" 99
2 "A"  9 "MP2" 34
2 "B"  9 "MP2" 23
2 "A" 10 "MP1" 67
2 "B" 10 "MP1" 79
2 "A" 10 "MP2" 90
2 "B" 10 "MP2" 45
2 "A" 11 "MP1" 24
2 "B" 11 "MP1" 34
2 "A" 11 "MP2" 23
2 "B" 11 "MP2" 34
2 "A" 12 "MP1" 52
2 "B" 12 "MP1" 14
2 "A" 12 "MP2" 12
2 "B" 12 "MP2" 12
end


Characteristics of study:
  • Every target is rated by 4 raters. Note that the set of raters differs from target to target.
  • Each rater rates ALL the data for 2 targets.

My analysis model is below (indeps is a placeholder for my independent variables). It has 3 crossed random intercepts: rater, target, and ph.

mixed outcome indeps || _all: R.rater || _all: R.target || _all: R.ph
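
For context, my understanding is that a reliability coefficient could in principle be built from the variance components of this model. Below is a rough sketch only. The symbols are my own notation (not from the manual) for the estimated variances of the three random intercepts and the residual, and I am assuming an absolute-agreement definition in which rater variance counts as error:

```latex
\[
\mathrm{ICC}_{\text{target}}
  = \frac{\sigma^2_{\text{target}}}
         {\sigma^2_{\text{target}} + \sigma^2_{\text{rater}} + \sigma^2_{\text{ph}} + \sigma^2_{\varepsilon}}
\]
```

Whether the rater and ph variances belong in the denominator is part of what I am unsure about.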

Looking at the manual for the icc command, the options include:
    1. one-way random-effects model: In the one-way random-effects model, each target is rated by a different set of k independent raters, who are randomly drawn from the population of raters. The target is the only random effect in this model; the effects due to raters and possibly due to rater-and-target interaction cannot be separated from random error.
    2. two-way random-effects model: In the two-way random-effects model, each target is rated by the same set of k independent raters, who are randomly drawn from the population of raters. The random effects in this model are target and rater and possibly their interaction, although in the absence of repeated measurements for each rater on each target, the effect of an interaction cannot be separated from random error.
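
For reference, the two calls I am choosing between would look roughly like this with the variable names from the example above. This is only a sketch: the restriction to a single type/ph slice is my own assumption, made so that each rater contributes one rating per target, and it is not part of my actual analysis plan.

```stata
* One-way random-effects ICC: target is the only random effect.
* The if-condition keeps one rating per rater per target (illustrative assumption).
icc outcome target if type == "A" & ph == "MP1"

* Two-way random-effects ICC: assumes the same raters rate every target,
* which does not hold in my design, so it is left commented out.
* icc outcome target rater if type == "A" & ph == "MP1"
```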

Questions:
1. In the two-way random-effects model, each target is rated by the same set of raters. That is not true in my study, so I don't think I can use it. In the one-way random-effects model, each target is rated by a different set of raters. In my study, however, each rater rates ALL the data for 2 targets, so each rater is linked to 2 targets, and that doesn't seem quite right either. My question: is it OK to use the one-way random-effects model here? Or does the design of this study make calculating inter-rater reliability impossible or inadvisable?

2. When calculating inter-rater reliability for a study with multiple outcome variables, does one typically calculate a separate inter-rater reliability score for each outcome measure? Or does one typically choose a single measure?
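
To make the per-outcome option concrete, it might look like the sketch below. The names outcome1 through outcome3 are hypothetical (my fake example has only one outcome variable), and the one-way form of icc is used purely for illustration.

```stata
* Hypothetical outcome names; replace with the real outcome variables.
foreach v of varlist outcome1 outcome2 outcome3 {
    display as text "ICC for `v'"
    * One-way random-effects ICC for this outcome, with target as the grouping variable
    icc `v' target
}
```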

Thank you!