I have annual stock returns for a number of firms over about 20 years, roughly 15k firm-year observations in total. I want to draw 100k random samples, each containing, say, 20% of the observations in each year. I am using a forvalues loop to draw a random sample by year, compute portfolio means by year, add a column identifying the simulation index, and store the result. My concern is that this takes too much time, and a sleep call is needed to avoid a read-only error while saving.
I was wondering whether there is a more efficient way to do this, for example using expand to create the 100k replicas first and then computing returns by simulation index and year. I am fine with a large file if it reduces the runtime.
The data set looks like this:
fid ayr ret
abc 2001 0.012
abc 2002 0.014
abc .....
abc 2020 0.032
xyz 2005 0.265
xyz 2006 0.023
.....
This is the code I am using right now:
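* save the firm-year panel once so that each draw can reload it
save yr_ret.dta, replace
local flag = 1
set seed 1234
forvalues i = 1/100000 {
    display "starting sample `i'"
    use yr_ret.dta, clear
    * draw 20% of the observations within each year, without replacement
    sample 20, by(ayr)
    collapse (mean) eqret=ret (count) n=ret, by(ayr)
    gen indx = `i'
    * append the results accumulated so far (skipped on the first pass)
    if `flag' != 1 {
        append using eq_ret_ranpf.dta
    }
    save eq_ret_ranpf.dta, replace
    sleep 500
    local flag = 0
}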
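For reference, the expand idea might be sketched roughly as follows, in batches rather than all at once, since 100k replicas of 15k observations would be about 1.5 billion rows. This is an untested sketch under several assumptions: the batch size of 500 is arbitrary, fid and ayr are assumed to uniquely identify an observation, and the cutoff round(0.2 * _N) is a guess that may not match exactly how sample rounds the 20%.

```stata
* Sketch only: batch size and the rounding rule are assumptions
use yr_ret.dta, clear
set seed 1234
local nsim  = 100000    // total number of random portfolios
local batch = 500       // replicas built per pass (assumed; tune to memory)
local npass = `nsim' / `batch'

tempfile base results
save `base'

forvalues b = 1/`npass' {
    use `base', clear
    * make `batch' copies of every firm-year observation
    expand `batch'
    * number the copies so that each one belongs to a distinct simulation
    bysort fid ayr: gen long indx = (`b' - 1) * `batch' + _n
    gen double u = runiform()
    * within each simulation-year, keep the 20% of firms with the smallest draws
    bysort indx ayr (u): keep if _n <= round(0.2 * _N)
    collapse (mean) eqret=ret (count) n=ret, by(indx ayr)
    * append the results accumulated so far (skipped on the first pass)
    if `b' > 1 {
        append using `results'
    }
    save `results', replace
}
save eq_ret_ranpf.dta, replace
```

With these numbers the loop runs 200 passes instead of 100,000, and each pass holds about 7.5 million rows in memory before the collapse. Sorting on a uniform draw within each simulation-year and keeping the first 20% is a sample without replacement, which is what sample does, but the random-number stream differs, so the portfolios will not match the ones from the loop above.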
I would appreciate it if someone could help.