Hi,
I have a data set on mortgage lending for single-family homes with a total of 10,000 observations. My dependent variable is default, a binary variable equal to 1 if the borrower's payment was 90+ days late and 0 otherwise. I also have a list of x variables, such as adjustable-rate mortgage, refinance, etc.
My question is: how can I randomly split my observations into two groups, a training data set with 6,000 observations and a testing data set with the remaining 4,000? I need to tabulate my dependent variable default for both the training and testing data sets.
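To make the goal concrete, here is a minimal Python sketch of the kind of split I mean. It is only an illustration under assumptions: the `default` values are made-up stand-in data (the 5% default rate is invented), the seed is arbitrary, and names such as `is_train` and `tabulate` are hypothetical. The idea is to shuffle the row indices once, mark the first 6,000 as training, and then count the 0/1 values of default in each group.

```python
import random
from collections import Counter

N_OBS = 10_000    # total observations, as in the post
N_TRAIN = 6_000   # training-set size, as in the post

# Fixed seed (arbitrary value) so the same split can be reproduced later.
rng = random.Random(12345)

# Hypothetical stand-in data: one 0/1 default flag per loan.
# The 5% default rate is invented purely for illustration.
default = [1 if rng.random() < 0.05 else 0 for _ in range(N_OBS)]

# Randomly permute the row indices; the first 6,000 form the training set.
indices = list(range(N_OBS))
rng.shuffle(indices)
train_idx = set(indices[:N_TRAIN])

# Indicator per observation: True = training, False = testing.
is_train = [i in train_idx for i in range(N_OBS)]


def tabulate(flags):
    """Count how many observations have default == 0 and default == 1."""
    counts = Counter(flags)
    return {0: counts.get(0, 0), 1: counts.get(1, 0)}


train_table = tabulate(d for d, t in zip(default, is_train) if t)
test_table = tabulate(d for d, t in zip(default, is_train) if not t)

print("training:", train_table)
print("testing: ", test_table)
```

Shuffling the indices rather than drawing each row independently gives exactly 6,000 training observations, and keeping the indicator alongside the data means both tabulations come from the same data set.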