I am investigating the effect of air pollution (measured by pollutant concentration) on health (proxied by number of hospital visits) through a 2SLS regression. Here's a sample of my data:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(year month week) int hosp_visits float(pm25 radiatpw temp prec windsp)
2012  8 34 2669      24.3  6.699171 27.797144 4.5666666 7.519048
2012  9 35 2533  26.57143  6.798932 26.885714  7.595238 6.380952
2012  9 36 2343 33.914284 33.018562  27.67857 2.8619046 8.088095
2012  9 37 2619  27.97143  6.873756  27.59524 2.9285715 6.266667
2012  9 38 2579 35.685715 13.724868 28.207144  3.447619 6.157143
2012  9 39 2517 27.685715 17.431175     27.87  6.035714 6.381905
2012 10 40 2575      28.8  4.580237  28.14857 2.3809524  6.17619
2012 10 41 2638  30.34286  .7070137  27.52238 4.4690475 5.316667
2012 10 42 2695      23.8  .3770984  26.83524 13.909524  5.27619
2012 10 43 2768  18.22857 1.0381021  28.19762  1.545238 6.088095
2012 11 44 2605 14.942857  .7895027  27.38524  5.447619 6.454762
2012 11 45 2578 13.085714 .23479086 26.830954  8.711905 5.733333
2012 11 46 2581 17.114286 .43059835 27.064285 14.583333 5.254762
2012 11 47 2504 17.657143  .2927277 27.190475 11.383333 5.442857
2012 12 48 2542  19.77143 .54017836 27.140953  8.892857 4.780952
2012 12 49 2681 12.857142 1.0204597  26.94286 14.038095 5.707143
2012 12 50 2604  14.17143 1.4329002  26.43333 14.869048 5.057143
2012 12 51 2497 13.685715  2.502648  26.22143 14.228572 4.795238
2012 12 52 2812 11.342857  3.386352 26.524763  2.990476 5.947619
2013  1  1 3022      11.6  3.368256  26.80857 16.009523 6.016667
end
hosp_visits is the number of hospital visits for respiratory conditions in the week. pm25 is the average pollutant concentration. radiatpw is a measure of the radiative power of fire hot spots. temp is the average temperature. prec is the average amount of precipitation. windsp is the average wind speed.
For one of my specifications I plan to estimate something like this: hosp_visits = pm25 + pm25^2 + pm25^3 + pm25(t-1) + pm25(t-1)^2 + pm25(t-1)^3 + weather controls + time fixed effects + constant, where pm25(t-1) is the first lag of pm25 (the time unit is the week). I use windsp and radiatpw as instruments, on the assumption that they affect hosp_visits only through pm25. I would also like to capture the non-linear and lingering effect of forest fires on pollution by including the square of radiatpw and the first lag of radiatpw along with its square.
I would like to check whether this specification is sound. If it is, what is the proper way to implement it in Stata?
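Written out more formally, the specification might look like the following. The coefficient symbols, the control vector X_t (temp and prec, as an assumption), and the month and year effects mu and lambda are notation introduced here for illustration only; Z_t collects the excluded instruments described above.

```latex
\begin{aligned}
\text{hosp\_visits}_t &= \beta_0
  + \sum_{k=1}^{3} \beta_k \,\text{pm25}_t^{\,k}
  + \sum_{k=1}^{3} \gamma_k \,\text{pm25}_{t-1}^{\,k}
  + X_t'\delta + \mu_{m(t)} + \lambda_{y(t)} + \varepsilon_t, \\
Z_t &= \bigl(\text{radiatpw}_t,\ \text{radiatpw}_t^{2},\
  \text{radiatpw}_{t-1},\ \text{radiatpw}_{t-1}^{2},\ \text{windsp}_t\bigr).
\end{aligned}
```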
This is my clunky attempt at the code:
Code:
ivregress 2sls hosp_visits `weather' i.month i.year (c.pm25##c.pm25##c.pm25 L.c.pm25##L.c.pm25##L.c.pm25 = c.radiatpw##c.radiatpw L.c.radiatpw##L.c.radiatpw windsp), r
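The command above relies on two things not shown in the post: the data must be declared as time series before the L. operator can be used, and the local macro weather must be defined. A minimal setup sketch follows, assuming one observation per week with no gaps and assuming the weather controls are temp and prec. The variable name t and the macro contents are hypothetical.

```stata
* Assumption: exactly one row per week and no missing weeks, so a
* running index can stand in for a weekly time variable.
sort year week
generate t = _n
tsset t

* Hypothetical contents of the weather macro; windsp is left out
* because it is used as an excluded instrument.
local weather temp prec
```

Counting terms in the command as written, there appear to be six endogenous terms (pm25 and its lag, each up to the cube) against five excluded instruments (radiatpw and its lag, each with a square, plus windsp), so the order condition for identification seems worth checking before relying on the results.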