Hi Statalist members,
This is my first post here so pardon me if I deviate from the established etiquette for the forum. I shall cut right to the chase.
I am trying to understand and replicate the analysis of Abadie, Athey, Imbens, and Wooldridge (2017) (https://arxiv.org/abs/1710.02926), particularly what was presented at the Chamberlain Seminar last year (https://www.google.com/url?q=https%3...xpGV5v9dU8jDBi). I am running into issues setting up the Monte Carlo simulation.
The regression is of an outcome on only a constant and the treatment assignment variable (W). The outcome is drawn from a normal distribution whose mean is alpha for control units and alpha + tau for treated units. Alpha and tau vary across clusters, with means 9.9 and 0.4 and variances 0.15 and 0.12 respectively.
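To make the setup concrete, here is a minimal sketch of this data-generating process, written in Python rather than Stata. The cluster size, the unit-level noise standard deviation of 1, the constant assignment probability of 0.55, and the function names are placeholder assumptions for illustration, not values taken from the paper.

```python
import math
import random

def simulate_population(n_clusters=52, units_per_cluster=200, noise_sd=1.0, seed=0):
    """Draw (cluster, W, Y) rows with cluster-specific alpha and tau.

    Assumed for illustration only: units_per_cluster, noise_sd, and a
    constant assignment probability of 0.55 in every cluster.
    """
    rng = random.Random(seed)
    rows = []
    for cluster in range(n_clusters):
        # gauss() takes a standard deviation, so convert the stated variances
        alpha_c = rng.gauss(9.9, math.sqrt(0.15))
        tau_c = rng.gauss(0.4, math.sqrt(0.12))
        for _ in range(units_per_cluster):
            w = 1 if rng.random() < 0.55 else 0
            y = rng.gauss(alpha_c + tau_c * w, noise_sd)
            rows.append((cluster, w, y))
    return rows

def ols_slope_on_binary(rows):
    """OLS slope of Y on a constant and binary W, i.e. the difference in means."""
    treated = [y for _, w, y in rows if w == 1]
    control = [y for _, w, y in rows if w == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)
```

With a single binary regressor and a constant, the OLS coefficient on W equals the treated-minus-control difference in mean outcomes, which is why no matrix algebra is needed here.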
Firstly, the treatment assignment variable (W). This should be drawn from a binomial distribution, since we want W to be a binary variable with mean 0.55. My understanding is that W1 should be a 52x1 vector holding the mean of W in each cluster. W1 is then used to generate W within each cluster by drawing from a binomial distribution with probability W1i, where i = 1, ..., 52. The parameter sigmaK that Abadie et al. vary should be the variance of W1. To put it simply, the assignment probabilities across clusters should have mean 0.55 and variance sigmaK.

My problem is that I am drawing W1 from a normal distribution with mean 0.55 and standard deviation sigmaK. When sigmaK is less than approximately 0.23, the draws all fall within (0,1). The draws need to lie between 0 and 1 because they are the probability values for the binomial distribution. Abadie et al. have a case of highly correlated assignment probability where sigmaK = 0.6, and at that value draws from the normal distribution fall outside (0,1). So my question is: what should I be doing to get the correct form of W?
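To illustrate the problem, here is a minimal Python sketch (not Stata) of the two-stage draw described above. The function names, the seeds, and the number of units per cluster are illustrative assumptions. It only reproduces the issue: it does not claim to be the procedure used in the paper.

```python
import random

def draw_cluster_probs(n_clusters, mean, sd, seed):
    """Stage 1: draw one assignment probability per cluster from a normal."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n_clusters)]

def share_outside_unit_interval(probs):
    """Fraction of draws that cannot serve as probabilities."""
    bad = sum(1 for p in probs if p <= 0.0 or p >= 1.0)
    return bad / len(probs)

def draw_assignments(probs, units_per_cluster, seed):
    """Stage 2: binary W within each cluster; fails on an invalid probability."""
    rng = random.Random(seed)
    assignments = []
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{p:.3f} is not a valid probability")
        assignments.append([1 if rng.random() < p else 0
                            for _ in range(units_per_cluster)])
    return assignments

# A small spread works; a spread of 0.6 produces invalid probabilities.
ok_probs = draw_cluster_probs(52, 0.55, 0.05, seed=1)
w = draw_assignments(ok_probs, units_per_cluster=100, seed=2)
bad_probs = draw_cluster_probs(52, 0.55, 0.6, seed=1)
print(share_outside_unit_interval(ok_probs), share_outside_unit_interval(bad_probs))
```

Calling `draw_assignments` on `bad_probs` raises a `ValueError`, which is the point at which my simulation breaks down.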
Secondly, the simulation results report the true standard deviation. I would naively take this to be the standard error from an OLS regression that uses the whole population as the sample. But this does not sit right, because the standard error (variance) is a function of q (the proportion of clusters observed) and should vary accordingly. What would they be treating as the true standard error?
Thirdly, in generating the outcome variable, they mention that it is drawn from a distribution with variance estimated on the original data. This might be a long shot, but I do not have (or know) the exact data they use. Could there be a workaround? For now I draw the outcome variable from a multivariate normal with variances 1 and covariances 0.5.
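For reference, one way to produce errors with exactly this covariance structure within a cluster is to add a shared cluster-level component with variance 0.5 to an independent unit-level component with variance 0.5. The Python sketch below shows this. The function name and the assumption that errors are independent across clusters are illustrative choices, not something taken from the paper.

```python
import math
import random

def draw_cluster_errors(cluster_size, variance=1.0, covariance=0.5, rng=None):
    """Equicorrelated normal errors for one cluster.

    Each error is shared + idiosyncratic, so Var = covariance + (variance - covariance)
    and Cov between any two units in the cluster = covariance.
    """
    if rng is None:
        rng = random.Random()
    if not 0.0 <= covariance <= variance:
        raise ValueError("need 0 <= covariance <= variance")
    shared = rng.gauss(0.0, math.sqrt(covariance))
    idio_sd = math.sqrt(variance - covariance)
    return [shared + rng.gauss(0.0, idio_sd) for _ in range(cluster_size)]
```

This avoids building a full covariance matrix per cluster, which matters when clusters are large.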
Any help in correcting my understanding would be highly appreciated.
Regards,
Abbas