Hi Statalists,

I have two datasets and need to calculate two variables: the Roll (1984) liquidity measure and the 4-year rolling-window standard deviation of the residuals from a cross-sectional regression.

1) For Roll liquidity:

Roll liquidity = 2*sqrt(-cov(price_change_t, price_change_{t-1}))

I have an unbalanced panel of daily stock prices (i.e. id date price). I use rangestat to calculate the covariance over 21 days. The dataset is quite large, with nearly 5,300,000 rows (about 800 firms over 14 years). The command has been running for a very long time, and I cannot tell when it will finish.

Code:
encode id, gen(firm)
sort firm date
format %td date

by firm: gen obs_count=_n
xtset firm obs_count
* xtset already handles the panel structure, so L. needs no by-prefix
gen change_prc = prc - L.prc

gen lag_change_prc = L.change_prc

drop if year(date)<2004
drop if year(date)>2017

ssc install rangestat
rangestat (cov) lag_change_prc change_prc, by(firm) interval(obs_count -20 0)
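
* A minimal sketch of the last step: turning the rolling covariance into the Roll measure.
* Assumption: rangestat stores the covariance in a variable named cov_x (check with -describe-).
* Assumption: windows with a non-negative covariance are left missing, because the
* square root is undefined there. The name roll is just a placeholder.
gen double roll = 2*sqrt(-cov_x) if cov_x < 0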

2) For the 4-year rolling-window standard deviation of the residuals from a cross-sectional regression (unbalanced panel with over 35,000 obs):
First, I run the cross-sectional regression by year and industry: reg accruals cf_1lag cf cf_1lead rev ppe
I use runby, as suggested in some earlier posts here, to get the residuals, and then I want to calculate their standard deviation over a rolling 4-year window.
Again, it takes a very long time to produce results.

Code:
ssc install runby
capture program drop one_regression
program define one_regression
    if _N > 10 {
        capture noisily reg accruals cf_1lag cf cf_1lead rev ppe, noconstant
        if c(rc) == 0 { // REGRESSION WENT OK
            predict r, residuals
        }
        else if inlist(c(rc), 2000, 2001) { // NO OR INSUFFICIENT OBSERVATIONS
            gen r = .
        }
        else { // THERE WAS AN UNEXPECTED PROBLEM
            gen comment = "Unexpected error `c(rc)'"
        }
    }
    exit
end

runby one_regression, by(year industry) status
replace r=0 if missing(r)
rename r residuals

* Use asrol to obtain the standard deviation of the residuals over a rolling 4-year window
sort firm year
bys firm: gen t=_n
tsset firm t
asrol residuals, w(year 4) s(sd) g(sd)

I cannot upload a sample of the data because the run time of rangestat and asrol depends on the full sample size.

Could anyone please tell me whether I did something wrong in the code? And how can I check when the commands will finish?
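
The only idea I have so far, which I have not verified, is to time the same command on a small subsample and scale the result up. This is a rough sketch, assuming that run time grows roughly in proportion to the number of firms and that the firm codes created by encode run 1, 2, 3, ... (the cutoff of 20 firms is arbitrary):

Code:
preserve
* keep an arbitrary small subsample of 20 firms
keep if firm <= 20
timer clear 1
timer on 1
rangestat (cov) lag_change_prc change_prc, by(firm) interval(obs_count -20 0)
timer off 1
* show elapsed seconds for the subsample
timer list 1
restore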

I really appreciate your help.

Kind regards,
Ken