BJ Data Tech Solution

Specializing in data processing, data management implementation plans, data collection tools (electronic and paper-based), data cleaning specifications, data extraction, data transformation, data loading, analytical datasets, and data analysis. BJ Data Tech Solutions teaches the design and development of electronic data collection tools using CSPro, and STATA commands for data manipulation. It also sets up data management systems using modern data technologies such as relational databases, C#, PHP and Android.

Split dataset for cross-validation

Hi all!

I need to split my dataset into 10 folds to do cross-validation "manually" (I know about the crossfold command, but it won't work for what I need).

I need a loop that runs 10 cycles. In each cycle, I need to select the training set (9 parts) and the validation set (1 part). Then I will fit my model on the training set and use the predict command to get that model's predictions for the validation set.

This is what I have come up with so far, which I know is wrong:

sysuse auto, clear
generate prp=0

* Variable with the fold number. I know that in R there is a command that
* allows me to split n observations into K groups to be used for (repeated)
* K-fold cross-validation (cvfolds). Any suggestion?
egen split = seq(), f(1) t(10)

forvalues i = 1/10 {
    reg price mpg headroom if split != `i'
    predict p if split =`i'
    replace prp=p
    drop p
}

Thanks a lot in advance!
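
One way the loop described above could be made to work is sketched below. It is only a sketch under a few assumptions that are not in the original post: folds are assigned at random (the seed value 12345 and the helper variables u and sqerr are arbitrary, illustrative choices), and prp starts as missing rather than 0. The main change is that the held-out predictions are copied with an if qualifier. Since predict leaves p missing outside the held-out fold, an unqualified replace prp = p overwrites the predictions stored in earlier cycles.

```stata
sysuse auto, clear

* Assumption: random fold assignment; the seed value is arbitrary
set seed 12345
generate double u = runiform()
sort u

* After the random sort, cycle 1..10 down the rows for near-equal folds
egen split = seq(), from(1) to(10)

* Start with missing so any unpredicted observation is easy to spot
generate double prp = .

forvalues i = 1/10 {
    * Fit the model on the nine training folds
    quietly regress price mpg headroom if split != `i'

    * Predict only for the held-out fold; p is missing elsewhere
    predict double p if split == `i', xb

    * Copy only the held-out predictions, leaving earlier folds untouched
    replace prp = p if split == `i'
    drop p
}

* Illustrative summary: out-of-sample RMSE over all folds
generate double sqerr = (price - prp)^2
quietly summarize sqerr
display "Cross-validated RMSE: " sqrt(r(mean))
```

After the loop, every observation in prp should hold a prediction from a model that was fit without that observation's fold. Running count if missing(prp) is a quick check that no fold was skipped.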
