I have run into several general issues while starting to work with DiD.
My data are repeated cross-sections covering 12 cities, of which 3 are in the treatment group and 9 are in the control group. I want to estimate the effect of a student grant on enrollment. A sample of my dataset is attached below.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long pid int year byte city float income byte female float(age ability) byte educ int grant byte enrolled float d
1204550 2010 3 30835.06 0 17.654997 103.11912 10 0 1 0
319284 2014 3 31932.877 0 18.130793 103.68096 14 0 1 0
889613 2013 3 36795.105 1 17.410234 105.94727 18 0 1 0
978314 2014 11 34354.633 1 18.826042 109.90023 13 0 1 1
522308 2018 12 39353.02 0 16.946407 94.4239 11 0 1 0
188084 2014 7 34108.605 0 16.74036 105.9109 12 0 1 0
807206 2016 12 28902.074 0 16.833967 113.8728 13 0 0 0
320333 2013 7 27654.77 0 17.068989 102.17348 8 0 1 0
450150 2010 12 32522.38 1 17.06066 99.98117 13 0 0 0
18346 2016 2 36964.234 1 17.876476 107.94476 15 0 1 0
365140 2010 7 32668.863 1 17.623026 92.24822 14 0 1 0
890188 2018 7 34312.56 0 17.601595 109.98123 13 0 1 0
646758 2018 1 33627.96 0 18.01731 120.2906 10 0 1 0
459585 2015 7 29370.793 1 17.249372 87.39782 6 0 0 0
935203 2013 12 32445.84 0 17.77647 75.570305 18 0 1 0
787581 2011 10 30065.26 1 18.051292 99.88745 5 0 1 1
498375 2015 7 35290.19 1 16.93023 93.36523 13 0 1 0
174629 2019 6 30601.344 0 16.877481 75.047035 9 0 0 0
611611 2011 5 30250.344 0 17.525812 90.92487 14 0 0 0
838826 2016 7 30751.756 0 17.068407 107.4832 14 0 1 0
1083592 2012 6 32664.69 1 17.389822 106.0466 11 0 1 0
242215 2015 7 30082.6 0 17.077396 72.571175 12 0 1 0
510105 2015 5 36238.57 0 17.273191 93.56672 14 0 1 0
1171168 2018 1 27488.486 1 17.905424 104.80125 8 0 1 0
505633 2013 10 38493.77 1 18.317528 109.59848 10 0 1 1
1073150 2010 4 31469.16 0 17.581125 98.7337 10 0 1 0
287828 2018 1 30329.86 1 17.092825 85.97654 17 0 0 0
891757 2017 12 35884.47 0 18.140705 111.96175 12 0 1 0
913149 2019 5 35003.055 0 16.631374 122.54363 11 0 1 0
991665 2015 6 33503.066 1 17.866404 70.98935 18 0 1 0
243071 2011 11 31379.88 1 17.084621 103.64464 14 0 1 1
250937 2017 9 35780.746 0 16.826647 86.37576 13 0 1 0
1027638 2018 12 33101.484 1 17.773394 91.35767 10 0 1 0
22953 2013 10 29485.53 1 17.464046 96.355 13 0 1 1
153108 2018 7 29512.227 1 17.275154 94.25189 12 0 1 0
658013 2013 2 27794.816 1 16.913126 82.87864 10 0 1 0
1180143 2013 5 32402.127 1 16.620207 92.83389 10 0 0 0
312950 2010 7 28712.87 1 17.36845 89.96732 12 0 0 0
13933 2013 2 33346.375 0 17.85618 99.07447 12 0 1 0
461750 2010 12 31624.84 0 18.187187 119.1078 12 0 1 0
1067041 2011 3 28147.123 1 17.787865 108.14761 18 0 0 0
807595 2015 9 31974.695 0 16.861078 114.08148 10 0 1 0
598028 2018 9 35101.645 1 18.408142 100.17933 15 0 0 0
898812 2012 7 34147.723 1 18.142551 88.43004 12 0 1 0
498185 2015 12 33769.85 1 18.160706 76.81758 11 0 1 0
284335 2015 7 27923.943 1 17.350409 105.08566 18 0 0 0
528101 2011 9 30789.766 1 17.574682 111.9395 17 0 0 0
120755 2015 3 38303.02 1 17.399488 102.59693 11 0 1 0
931243 2013 5 26780.69 1 17.589457 112.46512 16 0 1 0
13793 2013 8 33066.164 1 17.885649 78.38871 6 0 0 1
1089259 2019 7 31339.92 0 17.514992 120.0189 17 0 1 0
1157963 2013 3 34078.273 0 17.390617 83.69234 16 0 1 0
864745 2015 12 31167.443 0 17.64933 97.17146 9 0 0 0
597375 2015 11 32886.715 1 17.705078 90.70517 13 0 1 1
652215 2015 11 28142.48 0 17.882828 88.85712 11 0 1 1
228472 2012 12 29800.863 1 17.13096 93.62843 18 0 1 0
138830 2010 3 32122.283 1 18.19804 106.25028 7 0 1 0
744471 2011 11 32415.04 0 17.548534 104.27207 18 0 1 1
598536 2016 2 34121.395 1 16.673536 89.31037 15 0 1 0
1052614 2014 8 27814.113 1 17.010475 108.49964 10 0 1 1
649531 2011 7 27315.727 1 18.158386 81.48443 18 0 1 0
1021262 2012 7 30865.465 1 18.37912 91.87194 16 0 1 0
745194 2014 7 29216.043 1 17.223442 98.18081 15 0 0 0
1178861 2011 8 27306.32 0 17.345772 88.04616 16 0 1 1
1139135 2015 4 31997.08 0 17.50336 82.63182 18 0 1 0
549748 2018 12 31119.97 0 16.786953 90.79367 18 0 1 0
749490 2010 7 25319.2 0 17.834885 85.50574 12 0 0 0
1050823 2013 3 31714.156 1 17.77309 106.7881 12 0 0 0
719246 2016 2 30775.043 0 17.12436 94.05766 7 0 1 0
1005210 2010 8 30388.896 1 17.395172 98.40136 13 0 0 1
447412 2012 9 31698.453 0 17.66549 111.54143 9 0 1 0
1083881 2011 11 35093.87 1 17.310112 102.2826 14 0 0 1
1017055 2015 12 30806.945 1 18.521137 104.66325 11 0 1 0
992772 2012 12 27001.717 0 17.590246 112.75533 14 0 1 0
890703 2013 8 26494.21 1 16.931662 95.20155 12 0 1 1
281129 2019 6 37380.133 0 16.40155 91.67627 18 0 1 0
668979 2019 3 36621.348 1 18.151688 95.8932 14 0 0 0
967675 2015 4 34753.12 0 17.897005 113.03803 18 0 1 0
348630 2010 5 32139.91 1 17.538477 104.08703 10 0 1 0
539534 2014 9 32938.26 0 18.574429 87.49246 11 0 1 0
125183 2013 8 31783.246 1 18.586412 115.38096 16 0 1 1
1083222 2012 8 32871.15 1 17.557425 98.71842 16 0 0 1
808739 2019 4 31418.37 1 16.910383 73.486244 18 0 0 0
972730 2010 3 27586.594 1 17.133444 102.77448 15 0 1 0
177327 2017 9 32548.914 0 16.684296 93.75323 15 0 0 0
103276 2016 1 30006.35 0 16.82226 103.4052 8 0 1 0
856478 2018 7 34106.258 0 17.852732 107.36507 14 0 1 0
923792 2012 6 35160.816 0 17.40168 96.59897 14 0 1 0
1064345 2015 12 34054.652 1 17.536524 101.16731 16 0 1 0
522532 2012 3 32600.11 1 17.823885 95.66457 16 0 1 0
407449 2019 1 29082.324 1 17.626932 94.40669 14 0 0 0
819533 2013 3 30923.36 0 18.071976 100.04102 13 0 1 0
315863 2013 1 30462.123 0 16.875221 93.62828 16 0 1 0
56267 2017 7 31048.93 1 17.411005 99.2887 17 0 0 0
174233 2013 7 28117.346 0 17.959694 112.06963 11 0 1 0
459831 2011 7 34452.543 0 18.144615 105.93893 15 0 1 0
515840 2010 8 32178.18 0 17.602175 89.60943 12 0 1 1
186889 2019 7 32696.75 0 17.649858 95.25651 16 0 1 0
672054 2014 3 35514.21 1 17.791594 93.148 6 0 1 0
900622 2012 3 29506.32 0 17.503372 93.44817 18 0 1 0
end
These are the questions I have about DiD.
1. According to Scott Cunningham's "Causal Inference", errors are in most cases correlated within groups. Is there a way to test for this, analogous to the White test for heteroskedasticity? If not, should clustered standard errors be used regardless?
2. The most important assumption for DiD is parallel trends. Is it nevertheless useful to check how well balanced the two groups are, and if so, what would be the appropriate code for that?
3. Most importantly, which method would give me the extent to which the student grant matters? (I want to find out whether student grants have an intensive effect on enrollment.)
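To make the three questions concrete, this is roughly the kind of code I have in mind. It is only a sketch under assumptions that are not established above: that d equals 1 for the treated cities, and that the grant starts in 2015, which is a placeholder year used purely for illustration. The variables post and did are names I made up for this example.

```stata
* Sketch only; the start year below is a placeholder, not the true policy date.
local start 2015
generate byte post = year >= `start'
generate byte did  = d * post

* Question 1: how strongly is the outcome correlated within city?
* -loneway- reports the intraclass correlation of enrolled by city.
loneway enrolled city

* Question 2: compare pre-period covariates across treated and control cities.
tabstat income female age ability educ if post == 0, by(d) statistics(mean sd n)

* Question 3: 2x2 DiD as a linear probability model,
* with standard errors clustered at the city level.
regress enrolled d post did income female age ability, vce(cluster city)
```

In this specification the coefficient on did would be the DiD estimate. With only 12 cities I am unsure how reliable the clustered standard errors are, which is part of what I am asking in question 1.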
Kindest regards