Hi all!
I need to split my dataset into 10 folds to do cross-validation "manually" (I know about the crossfold command, but it won't work for what I need).
I need a loop that runs 10 cycles. In each one, I need to select the training set (9 folds) and the validation set (1 fold). I will then fit my model on the training set and use the predict command to get the model's predictions for the validation set.
This is what I have come up with so far, which I know is wrong:
sysuse auto, clear
generate prp=0
* Variable with the fold number. I know that R has a command (cvfolds) that splits n observations into K groups for (repeated) K-fold cross-validation. Any suggestions for doing this in Stata?
egen split = seq(), f(1) t(10)
forvalues i = 1/10{
reg price mpg headroom if split != `i'
predict p if split == `i'
replace prp = p if split == `i'
drop p
}
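For reference, one way the loop above could be made to run is sketched here. It is only a sketch under a few assumptions that are not in the original post: the seed value is arbitrary, the fold variable is built from a random sort order rather than with egen's seq() (which assigns folds in the current data order, not at random), and the names u, fold, and sqerr are made up for illustration. The two changes to the loop itself are the equality test, which in Stata needs == rather than =, and the if qualifier on replace, without which each pass would overwrite the predictions stored for the other folds.

```stata
sysuse auto, clear
set seed 12345                        // arbitrary seed, assumed here for reproducibility

* Randomly assign each observation to one of 10 roughly equal folds
generate double u = runiform()
sort u
generate byte fold = mod(_n - 1, 10) + 1

* Out-of-fold predictions: fit on 9 folds, predict on the held-out fold
generate double prp = .
forvalues i = 1/10 {
    quietly regress price mpg headroom if fold != `i'
    quietly predict double p if fold == `i', xb
    quietly replace prp = p if fold == `i'
    drop p
}

* Summarize the out-of-sample squared error
generate double sqerr = (price - prp)^2
quietly summarize sqerr
display "CV RMSE = " sqrt(r(mean))
```

After the loop, prp holds for every observation a prediction from a model that did not see that observation, so it can be compared with price directly.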
Thanks a lot in advance!