Hi everyone,

I am trying to calculate inter-rater reliability for my data but am struggling because of missing data.

In this dummy data (very similar to my own data but with a smaller sample), I have 9 raters (ID 1-9) who have each scored (score) 4 vignettes (Vignette 1-4) out of 100. The same 9 raters are used throughout, but not all of them completed the questionnaire, so some vignettes have been rated by only 7 or 8 raters. My data is currently in long format.

e.g.
ID Vignette score
1 1 8
1 2 32
1 3 8
1 4 65
2 1 16
2 2 16

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float ID byte Vignette float score
1 1  8
1 2 32
1 3  8
1 4 65
2 1 16
2 2 16
2 3  6
2 4 50
3 1 14
3 2 14
3 3 14
3 4 32
4 1  8
4 2  8
4 3 16
4 4 32
5 1  0
5 2  0
5 3 32
5 4  .
6 1 14
6 2 32
6 3 14
6 4 16
7 1  0
7 2 16
7 3  8
7 4 60
8 1 16
8 2 14
8 3  0
8 4 65
9 1  8
9 2  0
9 3  .
9 4  .
end
(Apologies if this is not the correct way to post my data. Please let me know!)
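
To show the extent of the missingness, the number of non-missing scores per vignette can be checked with something like the sketch below. It assumes the dummy data above has been loaded exactly as posted. In this dummy data the counts should be 9, 9, 8 and 7 for vignettes 1 to 4.

```stata
* Count the non-missing scores (i.e. the raters who responded) for each vignette
tabstat score, by(Vignette) statistics(count)
```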

My initial thought, before encountering the missing data, was to calculate the ICC using a two-way random-effects model. However, Stata excludes two vignettes because of the missing values. In my real dataset I have more vignettes, and in some cases the majority are excluded due to missing data.
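
For reference, the two-way random-effects ICC described above would be requested with something like the following. This is a sketch that assumes Vignette is the target and ID is the rater, using the variable names from the dummy data.

```stata
* Two-way random-effects ICC: score is the rating, Vignette the target, ID the rater
* -icc- only uses targets rated by every rater, so vignettes 3 and 4 are left out here
icc score Vignette ID
```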

What is the best way to calculate inter-rater reliability for this data, taking into consideration the missing values?

Thank you so much,
Olivia