Dear members,

I would like to know if there is a Stata program to help me find the most appropriate lag length in a panel dataset. For a reproducible example, not at all related to my research, let's use the following code:

Code:
import delimited https://covid.ourworldindata.org/data/owid-covid-data.csv, clear
egen country=group(iso_code)
rename date datestr
gen date = daily(datestr,"YMD")
format date %tdCCYY-NN-DD
cap drop new_cases_avg
gen double new_cases_avg = new_cases_smoothed / 50
xtset country date
preserve

keep if iso_code == "GBR" | iso_code == "USA" | iso_code == "BRA" | iso_code == "ITA"

twoway (tsline new_cases_avg) (tsline new_deaths_smoothed), ///
    ytitle("Covid-19 cases (÷50) and deaths") ///
    ytitle(, size(vsmall)) ///
    ylabel(, labsize(default)) ///
    ttitle(Days) ///
    ttitle(, size(vsmall)) ///
    tlabel(#10, angle(forty_five)) ///
    legend(size(vsmall)) ///
    name(lags, replace) ///
    scale(.5) ///
    by(iso_code, total iscale(*.5))

exit
The plot is the following:

[tsline plot of smoothed cases (÷50) and deaths, by country (BRA, GBR, ITA, USA) plus a total panel]

Suppose a researcher wants to know how long it takes for a spike in the number of cases to trigger a spike in the number of deaths. You can see that a rise in Covid cases is followed by a rise in Covid deaths in at least some of the countries in this dataset. Granted, the rate of change in deaths is decreasing relative to the rate of change in cases, whether because of better preparedness of the medical community or because of the effects of the vaccines; the researcher does not care about that. They only care about the lag size.

First, what is the average lag size across all the countries?

Second, how can one estimate or test whether the lag differs across at least some of the 100+ countries in the dataset?
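For what it's worth, for a single country I can get something like this from -xcorr-, which reports the cross-correlogram between the two series (this is only a per-country sketch, not the panel answer I am after; the country and the lag window are arbitrary choices):

Code:
```
* Cross-correlogram of cases vs. deaths for one country only (GBR).
* The lag at which the cross-correlation peaks is a rough estimate
* of how long a case spike takes to show up in deaths.
preserve
keep if iso_code == "GBR"
tsset date
xcorr new_cases_smoothed new_deaths_smoothed, lags(30) table
restore
```

But doing this country by country does not give me an average lag with a standard error, nor a test of whether the lag differs across countries.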

I know this is a large-N, large-T dataset, but please let me know even if you are only aware of a solution for large-N, small-T data (which is my actual case).

Best regards,

Iuri.