Hi,
I have a data set on mortgage lending for single-family homes with a total of 10,000 observations. My dependent variable is default, a binary variable that equals 1 if the borrower's payment was 90+ days late and 0 otherwise. I also have a list of x variables such as adjustable-rate mortgage, refinance, etc.
My question is: how can I randomly split my observations into two groups, training and testing, with 6,000 observations assigned to the training data set? I need to tabulate my dependent variable default for both the training and testing data sets.
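To make the split I am after concrete, here is a minimal sketch in Python (standard library only) rather than a definitive implementation. It assumes the goal is exactly 6,000 training observations chosen at random without replacement. The `default` values are simulated stand-ins for my real data, and the name `split_train_test`, the seeds, and the 10% default rate are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def split_train_test(n_obs, n_train, seed=12345):
    """Label each observation 'train' or 'test', with exactly n_train in 'train'."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    indices = list(range(n_obs))
    rng.shuffle(indices)               # random order of observation indices
    labels = ["test"] * n_obs
    for i in indices[:n_train]:        # first n_train shuffled indices go to training
        labels[i] = "train"
    return labels


# Hypothetical stand-in for the real data: 10,000 binary default outcomes.
data_rng = random.Random(2024)
default = [1 if data_rng.random() < 0.10 else 0 for _ in range(10_000)]

group = split_train_test(len(default), 6_000)

# Tabulate default separately within the training and testing groups.
tab = {"train": Counter(), "test": Counter()}
for g, d in zip(group, default):
    tab[g][d] += 1

for g in ("train", "test"):
    print(g, "default=0:", tab[g][0], "default=1:", tab[g][1])
```

Shuffling the indices and taking the first 6,000 gives an exact 6,000/4,000 split, whereas drawing an independent random number per observation and comparing it to 0.6 would only give roughly 60% in training. Fixing the seed means the same observations land in the same group on every run.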