This question investigates whether restricting youth access to alcohol affects motor vehicle death rates for young people. We restrict attention to death rates of those aged 18-20 (the age group affected by the MLDA, the minimum legal drinking age). The key variable is `legal1820`, the fraction of 18-20 year olds in a state who can buy alcohol legally. It equals 1 if the MLDA is 18 and 0 if it is 21 for the entire year. For states that changed the law partway through a year, the variable is scaled accordingly. Many states had an MLDA between 18 and 21. We exploit the over-time, within-state variation in a difference-in-differences design.
## Difference in Difference
Since the data are a panel of states that varied their drinking age limit, a difference-in-differences strategy to estimate the effect of drinking age limits on death rates seems natural here.
Long data is great for figures, but doesn't always work for tables or regressions. Thus, I convert it to wide form here. This creates a separate variable for each cause of death. The main dependent variable will be `MVA`, deaths from motor vehicles.
```{r}
# Reshape long to wide: one column of death rates (mrate) per cause (dtype)
df <- df %>%
  ungroup() %>%
  pivot_wider(names_from = dtype, values_from = mrate,
              id_cols = c(state, year, pop, legal1820, legal, beertaxa,
                          beerpercap, winepercap, spiritpercap, totpercap)) %>%
  rename(other_external = `other external`) %>%
  group_by(state) %>%
  # treat = 1 if legal1820 in the state's first row differs from its 1979 value
  mutate(treat = ifelse(first(legal1820) != legal1820[year == 1979], 1, 0))
```
1. I have created a variable called `treat` that is equal to 1 for states that responded to the 1971 constitutional change and 0 otherwise. Add two more variables: `post`, equal to 1 if the year is `>= 1975` and 0 otherwise, and an interaction between `post` and `treat`.
```{r}
# treat already exists from the chunk above; add post and the interaction
df <- df %>%
  mutate(post = ifelse(year >= 1975, 1, 0),
         interaction = post * treat)
```
(I am not sure how to do this.)
2. Run a simple difference in difference regression where the dependent variable is `MVA` and the right hand side has `post`, `treat`, and the interaction term you created above. Interpret your result: do states that lower their drinking age have more motor vehicle deaths?
```{r}
```
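For the simple regression in question 2, a minimal sketch on simulated data may help show the shape of the call. Everything below is an assumption for illustration: the data frame `sim`, the number of states, and the "true" effect of 3 are made up and are not the assignment data. With the real data, the same `lm()` formula would be run on `df` once `post` and `interaction` exist.

```r
# Simulated state-year panel (hypothetical numbers, not the assignment data)
set.seed(1)
sim <- expand.grid(state = 1:50, year = 1970:1983)
sim$treat <- as.numeric(sim$state <= 25)   # half the states count as treated
sim$post  <- as.numeric(sim$year >= 1975)  # post-period dummy
sim$interaction <- sim$post * sim$treat

# Outcome built so that the true difference-in-differences effect is 3
sim$MVA <- 30 + 5 * sim$treat - 2 * sim$post + 3 * sim$interaction +
  rnorm(nrow(sim), sd = 1)

# Simple DiD: the coefficient on `interaction` is the DiD estimate
did <- lm(MVA ~ post + treat + interaction, data = sim)
summary(did)

did_effect <- unname(coef(did)["interaction"])
```

In this setup the coefficient on `treat` is the pre-period gap between the two groups, the coefficient on `post` is the change over time in untreated states, and the coefficient on `interaction` is the extra change in treated states. A positive and significant interaction would suggest that states that lowered their drinking age saw more motor vehicle deaths.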
3. The simple difference in difference above doesn't use all the information available. Instead of including a dummy variable `treat`, we could include state fixed effects. Likewise, instead of a dummy variable `post`, we could include year fixed effects. The variable `legal1820` varies across states and over time, so it uses the data more efficiently than a post-treatment dummy. Using the data frame `df`, convert two variables to factors inside `mutate`: `year = as.factor(year)` and, similarly, `state = as.factor(state)`. These can now be added easily to a regression as categorical variables. Run a difference-in-difference regression of `MVA` on `legal1820` and state and year fixed effects. Save this as `mod1`.
4. Repeat the above regression, but weight it by the variable `pop` using the weight option: `lm( y ~ x, data = df, weight = pop)`. Save this as `mod2`.
5. Repeat your above two regressions (with and without weights) using the control variables `beertaxa`, `beerpercap`, `winepercap`, `spiritpercap`, `totpercap`. Save these as `mod3` and `mod4`.
6. Output your regression results using `stargazer`, but only keep the variable `legal1820` using the `stargazer` option `keep`. Interpret the output from your table.
7. Repeat the above steps, but use the dependent variable `internal`. This is deaths from internal causes, which are thought to be unrelated to alcohol consumption. Thus, it serves as a **falsification** test. We should not find that drinking laws are correlated with internal-cause death rates.
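The fixed-effects regressions, the weighted versions, the table, and the falsification test described above could be sketched as follows. This is only an illustration under stated assumptions: `sim` is simulated data with made-up effects (4 for `MVA`, 0 for `internal`), only one control (`beertaxa`) is simulated to keep the example short, and the `stargazer` call is skipped if that package is not installed.

```r
# Simulated panel (hypothetical numbers, not the assignment data)
set.seed(2)
sim <- expand.grid(state = 1:40, year = 1970:1983)
n <- nrow(sim)

# legal1820 switches on in half the states for 1973-1978 (an assumption)
sim$legal1820 <- as.numeric(sim$state <= 20 & sim$year >= 1973 & sim$year <= 1978)
sim$pop       <- round(runif(40, 1e5, 1e6))[sim$state]  # one population per state
sim$beertaxa  <- runif(n)                               # stand-in control

state_fe <- rnorm(40, sd = 3)[sim$state]        # state-level shifts
year_fe  <- rnorm(14, sd = 2)[sim$year - 1969]  # year-level shifts
sim$MVA      <- 30 + state_fe + year_fe + 4 * sim$legal1820 +
  0.5 * sim$beertaxa + rnorm(n)
sim$internal <- 20 + state_fe + year_fe + rnorm(n)  # no effect by construction

# Factors make lm() add one dummy per state and per year (fixed effects)
sim$state <- as.factor(sim$state)
sim$year  <- as.factor(sim$year)

mod1 <- lm(MVA ~ legal1820 + state + year, data = sim)
mod2 <- lm(MVA ~ legal1820 + state + year, data = sim, weights = pop)
mod3 <- lm(MVA ~ legal1820 + beertaxa + state + year, data = sim)
mod4 <- lm(MVA ~ legal1820 + beertaxa + state + year, data = sim, weights = pop)

# Falsification: same specification, outcome that should not respond
fals <- lm(internal ~ legal1820 + state + year, data = sim)

# Table showing only the legal1820 row, if stargazer is available
if (requireNamespace("stargazer", quietly = TRUE)) {
  stargazer::stargazer(mod1, mod2, mod3, mod4, fals,
                       type = "text", keep = "legal1820")
}
```

With the real data, the same formulas would be run on `df`, adding the remaining controls (`beerpercap`, `winepercap`, `spiritpercap`, `totpercap`) to `mod3` and `mod4`. A coefficient on `legal1820` near zero in the `internal` regression is what the falsification test hopes to see.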
I am a beginner at regression, and these problems are very difficult for me. I do not know where to find the basic code, so any advice would be appreciated.