Dear all,

I am a first-year Master's student learning how to conduct a regression discontinuity design (RDD), in this case a fuzzy RDD. I have read several guidebooks on RDD as well as related journal articles, but there are basic things I do not fully understand, especially the meaning of the terms controlled for in a standard fuzzy RDD. I would be grateful if anyone could help me understand a fuzzy RDD specification. Following the books and articles I have read, the standard fuzzy RDD is set up as follows:

Let's say we want to examine whether more educated individuals tend to consume more health care services than less educated individuals. There was also a policy change in the education system, and we will exploit this plausibly exogenous variation to instrument for years of education.

First stage
Edu = alpha1 + alpha2*D + alpha3*(X-c) + alpha4*D*(X-c) + error1, (1)

Second stage
Health = beta1 + beta2*Eduhat + beta3*(X-c) + beta4*D*(X-c) + error2 (2)

where Edu is years of education; Health is outpatient service utilization; D is the treatment status, coded as 1 if the individual is assigned to the treatment group and 0 otherwise; X is the age of the individual at the time the survey was conducted; c is the threshold such that an individual whose age exceeds it may have been exposed to the policy change in education; Eduhat is the fitted value of Edu from equation (1); and error1 and error2 are the error terms.
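
To make the setup concrete, here is a minimal simulation sketch of equations (1) and (2) in Python, estimated by two-stage least squares done by hand. Everything numeric in it is an assumption for illustration only: the cutoff c = 40, the window of 10 years on either side, the coefficient values, and the unobserved confounder u are all made up, and ols is a hypothetical helper written for this sketch, not a function from any package. It uses only the standard library.

```python
import random

random.seed(0)
C = 40.0    # hypothetical age cutoff c (assumption, not from any real policy)
N = 20000   # simulated sample size (assumption)

def ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Simulated data: age within 10 years of the cutoff, D = 1 if age exceeds c.
age = [random.uniform(C - 10, C + 10) for _ in range(N)]
xc = [a_i - C for a_i in age]                    # centred running variable X - c
d = [1.0 if v > 0 else 0.0 for v in xc]
u = [random.gauss(0, 1) for _ in range(N)]       # unobserved confounder

# Made-up "true" parameters: alpha2 = 1.5 (jump in Edu), beta2 = 0.5.
edu = [10 + 1.5 * di + 0.1 * v - 0.05 * di * v + 0.8 * ui + random.gauss(0, 1)
       for di, v, ui in zip(d, xc, u)]
health = [2 + 0.5 * e + 0.05 * v + 0.02 * di * v + ui + random.gauss(0, 1)
          for e, di, v, ui in zip(edu, d, xc, u)]

# First stage, equation (1): Edu on 1, D, (X-c), D*(X-c).
Z = [[1.0, di, v, di * v] for di, v in zip(d, xc)]
alpha = ols(Z, edu)
edu_hat = [sum(a_j * z_j for a_j, z_j in zip(alpha, row)) for row in Z]

# Second stage, equation (2): Health on 1, Eduhat, (X-c), D*(X-c).
W = [[1.0, eh, v, di * v] for eh, di, v in zip(edu_hat, d, xc)]
beta = ols(W, health)

print("alpha1..alpha4:", [round(a_j, 3) for a_j in alpha])
print("beta1..beta4:  ", [round(b_j, 3) for b_j in beta])
```

In this sketch beta[1] is the estimate of beta2 and alpha[1] the estimate of alpha2. Standard errors computed from a second stage run by hand like this are not valid, so for real data a dedicated instrumental-variables routine would be needed for inference.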

What I do not understand is the meaning of alpha3 and alpha4 in equation (1). Why do we need to include those terms, and what do they mean in the fuzzy RDD specification above? The books and journal articles I have read give only general justifications: the (X-c) term with coefficient alpha3 corrects for selection bias due to selection on observables or age trends, while the D*(X-c) term with coefficient alpha4 accounts for the fact that the treatment may affect not only the intercept but also the slope of the regression line. These explanations are quite technical for me, so I hope someone here can help with a more detailed explanation of alpha3 and alpha4.
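
One way to see what the two justifications above amount to is a small simulation sketch of equation (1) alone. Under the assumptions below (a made-up cutoff, window, and coefficient values; fit_line is a hypothetical helper written for this sketch), fitting a separate straight line to Edu on each side of c is equivalent to estimating equation (1): alpha3 is the age slope below the cutoff, alpha3 + alpha4 is the slope above it, and alpha2 is the gap between the two lines at X = c. Dropping the (X-c) terms and simply comparing group means mixes the age trend into the estimated jump.

```python
import random

random.seed(1)
N = 20000  # simulated sample size (assumption)

def fit_line(x, y):
    """Intercept and slope of a simple least-squares line y = a + b*x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return my - slope * mx, slope

# Centred running variable X - c within 10 years of a hypothetical cutoff.
xc = [random.uniform(-10, 10) for _ in range(N)]
d = [1 if v > 0 else 0 for v in xc]

# Made-up truth: alpha1 = 10, alpha2 = 1.0, alpha3 = 0.2, alpha4 = -0.15.
edu = [10 + 1.0 * di + 0.2 * v - 0.15 * di * v + random.gauss(0, 1)
       for di, v in zip(d, xc)]

left = [(v, e) for v, e, di in zip(xc, edu, d) if di == 0]
right = [(v, e) for v, e, di in zip(xc, edu, d) if di == 1]

a_left, b_left = fit_line([v for v, _ in left], [e for _, e in left])
a_right, b_right = fit_line([v for v, _ in right], [e for _, e in right])

alpha1 = a_left              # level of Edu just below the cutoff
alpha2 = a_right - a_left    # jump in Edu at X = c
alpha3 = b_left              # age slope below the cutoff
alpha4 = b_right - b_left    # change in the age slope above the cutoff

# Ignoring age entirely: difference in mean Edu between the two groups.
naive_jump = (sum(e for _, e in right) / len(right)
              - sum(e for _, e in left) / len(left))

print("alpha2 (jump at c):   ", round(alpha2, 3))
print("alpha3 (slope below): ", round(alpha3, 3))
print("alpha4 (slope change):", round(alpha4, 3))
print("naive mean difference:", round(naive_jump, 3))
```

With these made-up numbers the naive mean difference comes out well above the true jump of 1.0, because people above the cutoff are on average ten years older and Edu trends with age in the simulation. The (X-c) and D*(X-c) terms are what remove that trend, so that alpha2 compares individuals right at the threshold.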

Thank you.