I would like to know if there is a Stata program to help me find the most appropriate lag in a panel. For a reproducible example, not at all related to my research, let's use the following code:
Code:
import delimited https://covid.ourworldindata.org/data/owid-covid-data.csv, clear
egen country = group(iso_code)
rename date datestr
gen double date = clock(datestr, "YMD")
format date %tc+CCYY-NN-DD
cap drop new_cases_avg
gen double new_cases_avg = new_cases_smoothed / 50
xtset country date
preserve
keep if iso_code == "GBR" | iso_code == "USA" | iso_code == "BRA" | iso_code == "ITA"
twoway (tsline new_cases_avg) (tsline new_deaths_smoothed), ///
    ytitle(Covid-19 Cases (÷50) and Deaths) ///
    ytitle(, size(vsmall)) ///
    ylabel(, labsize(default)) ///
    ttitle(Days) ///
    ttitle(, size(vsmall)) ///
    tlabel(#10, angle(forty_five)) ///
    legend(size(vsmall)) ///
    name(lags, replace) ///
    scale(.5) ///
    by(iso_code, total iscale(*.5))
exit
[Graph: Covid-19 cases (÷50) and deaths over time, by country (BRA, GBR, ITA, USA, and total)]
Let's suppose a researcher wants to know how long it takes for a spike in the number of cases to trigger a spike in the number of deaths. You can see that a rise in Covid cases is followed by a rise in Covid deaths, at least in some of the countries in this dataset. Granted, the rate of change in deaths is declining relative to the rate of change in cases, whether because of better preparedness of the medical community or because of the effects of the vaccines. The researcher does not care about that. They only care about the lag length.
First, what is the average lag size for all the countries?
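One crude possibility is a sketch like the following (my own illustration, not an established procedure): for each country, find the lag k at which the correlation between cases lagged k periods and current deaths peaks, then average those lags. The 30-day search window and the minimum-observations cutoff are assumptions. Note also that the example above xtsets on a %tc clock variable, so time-series lag operators would step in milliseconds; it helps to convert to a daily date first:

Code:
* Sketch: per-country lag at which corr(L(k).cases, deaths) peaks.
* Assumes the data from the example above are still in memory.
gen ddate = dofc(date)            // lag operators need a daily date, not %tc
xtset country ddate
tempname h
postfile `h' country bestlag using bestlags, replace
levelsof country, local(ids)
foreach c of local ids {
    local best = .
    local bestcorr = -2
    forvalues k = 0/30 {          // assumed maximum lag of 30 days
        quietly capture corr L`k'.new_cases_avg new_deaths_smoothed ///
            if country == `c'
        if _rc == 0 & r(N) > 30 & r(rho) > `bestcorr' {
            local bestcorr = r(rho)
            local best = `k'
        }
    }
    post `h' (`c') (`best')
}
postclose `h'
preserve
use bestlags, clear
summarize bestlag                 // the mean is one notion of "average lag"
restore

This treats the peak cross-correlation as "the" lag for each country, which ignores sampling error; I would welcome a more principled approach.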
Second, how can one estimate or test whether the lag differs across at least some of the 100+ countries in the dataset?
I know this is a large-N, large-T dataset, but let me know even if you are only aware of a solution for large-N, small-T data (which is my actual case).
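One direction I have considered (an assumption about fit, not a prescription) is formal lag-order selection in a panel VAR: the community-contributed pvar suite by Abrigo and Love, available from SSC, includes pvarsoc, which reports MMSC-based selection criteria and is built around GMM estimation aimed at large-N, small-T panels. A sketch, where maxlag() and the instrument lags are placeholders:

Code:
* Community-contributed; install once with: ssc install pvar
* Sketch only: maxlag() and instlags() values here are placeholders.
pvarsoc new_deaths_smoothed new_cases_avg, maxlag(4) pvaropts(instlags(1/4))

I am unsure whether this is the right tool for the question above, so pointers to alternatives are appreciated.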
Best regards,
Iuri.