Below, I calculated that interval first using a regression and then through simulation of that regression. The 95% CIs for the treat coefficient are very similar: [-0.023, 0.072] and [-0.024, 0.073], respectively.
0) Does this seem like a reasonable procedure?
1) Is one of these intervals better to use? Why aren't they identical?
2) Am I correct that this design should be able to detect any effect outside that CI?
Stata code:
set seed 10011979
/* Regression Approach */
sysuse nlsw88.dta, clear
gen log_wage = ln(wage)
gen treat = runiform()>0.5   // random 50/50 treatment assignment
tab treat
reg log_wage i.treat, level(95) robust
save "my_nlsw88.dta", replace
/* Simulation Approach */
capture program drop my_nlsw88_reg
program my_nlsw88_reg, rclass
version 16.0
use "my_nlsw88.dta", clear
bsample 2246 // obs in original dataset
reg log_wage i.treat, robust
return scalar lift = _b[1.treat]   // treatment coefficient from this bootstrap sample
end
simulate lift = r(lift), reps(10000) dots(10000) saving("mde_sim.dta", replace): my_nlsw88_reg
sum lift, meanonly
local mean = r(mean)                   // mean of the simulated lift estimates
_pctile lift, percentile(2.5 97.5)     // 2.5th and 97.5th percentiles of lift
return list
di "MDE is " %-9.3f r(r1) " to " %-9.3f r(r2)