Hi,
I have a data set on mortgage lending for single-family homes with a total of 10,000 observations. My dependent variable is default, a binary variable that equals 1 if the borrower's payment was 90+ days late and 0 otherwise. I also have a list of x variables such as adjustable-rate mortgage, refinance, etc.

My question is: how can I randomly split my observations into two groups, a training set with 6,000 observations and a testing set with the remaining 4,000? I need to tabulate my dependent variable default for both the training and testing data sets.
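
To make the goal concrete, here is a rough sketch of the kind of split I mean, written in Python using only the standard library. The choice of language, the helper name split_train_test, the seed value, and the made-up default column are all assumptions for illustration. The real data would replace the placeholder list.

```python
import random
from collections import Counter


def split_train_test(n_obs, n_train, seed=12345):
    """Randomly partition the row indices 0..n_obs-1 into train and test.

    Hypothetical helper: shuffles all indices once, then takes the first
    n_train as the training set and the rest as the testing set.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    idx = list(range(n_obs))
    rng.shuffle(idx)
    return sorted(idx[:n_train]), sorted(idx[n_train:])


# Placeholder data only: a made-up 0/1 default column standing in for the
# real variable (the 10% default rate here is arbitrary).
data_rng = random.Random(0)
default = [1 if data_rng.random() < 0.10 else 0 for _ in range(10_000)]

train_idx, test_idx = split_train_test(len(default), 6_000)

# Tabulate default (counts of 0 and 1) separately in each group.
train_tab = Counter(default[i] for i in train_idx)
test_tab = Counter(default[i] for i in test_idx)

print("training:", dict(sorted(train_tab.items())))
print("testing: ", dict(sorted(test_tab.items())))
```

Every observation lands in exactly one group, so the two tabulations together should add up to the full 10,000 rows.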