Dear clever statistics people,

I am writing a paper using an RD design on three different data sets. The first data set has five times as many observations as the other two. The first data set yields significant results that confirm my hypothesis. The other two data sets yield results that are mostly insignificant, but some of the significant ones point in the opposite direction of my hypothesis.

How do I interpret and discuss these results?

I am fairly convinced that the insignificant results in the smaller data sets are due to Type II errors, because smaller sample sizes generally lead to larger standard errors that make it difficult to detect significant effects (Kellstedt & Whitten 2013: 141). This is especially true for RD designs, which require a lot of statistical power (Cook & Wong 2008; Deke & Dragoset 2012).
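To illustrate what I mean, here is a minimal simulation sketch of the power problem. It is not my actual RD estimator, just a two-group mean comparison with an assumed true effect of 0.2 standard deviations (a hypothetical number): with a large sample the effect is detected almost every time, while with a small sample most runs come out insignificant even though the effect is real.

```python
import numpy as np

rng = np.random.default_rng(0)
true_effect = 0.2   # hypothetical true treatment effect (in SD units)
sigma = 1.0
n_sims = 10_000

def share_significant(n):
    """Fraction of simulated two-group comparisons with |z| > 1.96,
    when the true effect is real but small and each group has n observations."""
    treat = rng.normal(true_effect, sigma, size=(n_sims, n))
    ctrl = rng.normal(0.0, sigma, size=(n_sims, n))
    diff = treat.mean(axis=1) - ctrl.mean(axis=1)
    se = np.sqrt(treat.var(axis=1, ddof=1) / n + ctrl.var(axis=1, ddof=1) / n)
    return np.mean(np.abs(diff / se) > 1.96)

print(share_significant(1000))  # large sample: power close to 1
print(share_significant(100))   # small sample: most runs are Type II errors
```

With these assumed numbers the large-sample design detects the effect in roughly 99% of runs, while the small-sample design detects it in only about a third, which matches the pattern I see across my data sets.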

The significant but opposite results boggle my mind, however. I am absolutely certain that they are not correct, since I study school reforms that raise the compulsory schooling age, yet these results suggest that the reforms lower attained education. I therefore conclude that they must be Type I errors. I suspect that the small sample size can explain these Type I errors, but I am unable to find any papers or books that confirm this relationship.
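Extending the same illustrative sketch (again a plain two-group comparison with a hypothetical small positive true effect, not my RD estimator): when power is very low, a non-trivial share of the runs that do reach significance have the wrong sign. This is sometimes called a sign error rather than a Type I error, since the null is false but the estimate points the wrong way.

```python
import numpy as np

rng = np.random.default_rng(1)
true_effect = 0.1   # hypothetical small positive true effect (in SD units)
sigma = 1.0
n = 20              # very small sample per group
n_sims = 100_000

treat = rng.normal(true_effect, sigma, size=(n_sims, n))
ctrl = rng.normal(0.0, sigma, size=(n_sims, n))
diff = treat.mean(axis=1) - ctrl.mean(axis=1)
se = np.sqrt(treat.var(axis=1, ddof=1) / n + ctrl.var(axis=1, ddof=1) / n)
z = diff / se

significant = np.abs(z) > 1.96
wrong_sign = significant & (z < 0)  # significant, but opposite the true effect

print(significant.mean())                    # overall power is very low
print(wrong_sign.sum() / significant.sum())  # share of significant runs with the wrong sign
```

Under these assumed numbers only a few percent of runs are significant at all, and of those, a sizable fraction point in the wrong direction, which is exactly the pattern in my two smaller data sets.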

Can you help me?

Are my Type I errors due to small sample sizes?

What kind of literature is there on this issue?

Or is it something else, perhaps something to do with the RD design itself?

Thank you very much,
Andreas Esbjørnsen
Department of Political Science
University of Copenhagen