I have annual stock returns for a number of firms over about 20 years, roughly 15k firm-year observations in total. I want to draw 100k random samples without replacement, each containing, say, 20% of the observations in each year. I currently use a forvalues loop that draws the sample by year, computes the portfolio mean by year, adds a column identifying the simulation index, and appends the result to a file. The problem is that this takes far too long, and I need a sleep call to avoid a read-only error when saving.
Is there a better way to do this? For example, I could first use expand to create the 100k replicas and then compute returns by simulation index and year. A large file is fine if it reduces the runtime.
The data set looks like this:
fid ayr ret
abc 2001 0.012
abc 2002 0.014
abc .....
abc 2020 0.032
xyz 2005 0.265
xyz 2006 0.023
.....
The code I am using right now:
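save yr_ret.dta, replace
local flag = 1
set seed 1234
forvalues i=1/100000 {
    display "starting sample `i'"
    * reload the full panel (use takes the clear option, not replace)
    use yr_ret.dta, clear
    * draw 20% of the observations within each year, without replacement
    sample 20, by(ayr)
    collapse (mean) eqret=ret (count) n=ret, by(ayr)
    gen indx=`i'
    * append the results accumulated so far, except on the first pass
    if `flag'!=1 {
        append using eq_ret_ranpf.dta
    }
    save eq_ret_ranpf.dta, replace
    sleep 500
    local flag = 0
}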
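To make the expand idea concrete, here is a rough, untested sketch of what I have in mind. It assumes that fid and ayr together uniquely identify an observation in yr_ret.dta. The batch size of 1000, the helper variable u, and the tempfile name results are my own choices for illustration. Expanding all 100k replicas at once would mean about 1.5 billion rows, so the sketch works in batches. Each replica gets a uniform random number per observation, and the 20% of observations with the smallest draws in each year are kept, which is sampling without replacement. The rounding of the per-year sample size may differ slightly from what sample does.

```stata
set seed 1234
local batch  1000                     // replicas per batch (hypothetical value)
local nbatch 100                      // 100 batches x 1000 replicas = 100000 samples
tempfile results

forvalues b = 1/`nbatch' {
    use yr_ret.dta, clear
    expand `batch'                    // `batch' copies of every firm-year
    * number the copies; assumes fid and ayr uniquely identify an observation
    bysort fid ayr: gen long indx = (`b' - 1) * `batch' + _n
    gen double u = runiform()         // random sort key within replica and year
    * keep the 20% of observations with the smallest draws in each replica-year
    bysort indx ayr (u): keep if _n <= round(0.20 * _N)
    collapse (mean) eqret=ret (count) n=ret, by(indx ayr)
    if `b' > 1 {
        append using `results'
    }
    save `results', replace
}
save eq_ret_ranpf.dta, replace
```

With this layout the results are written 100 times instead of 100,000 times, and only to a temporary file until the end, so I would expect the sleep call to become unnecessary, though I have not confirmed that.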
I would appreciate any help.