Hi,

I estimated the probability of default of loans using borrower characteristics. First, I ran a probit regression. The dependent variable is a dummy that equals one if the loan defaulted and zero otherwise. The independent variables are borrower characteristics, such as a homeowner dummy, credit history length, etc. I then used the "predict" command in Stata to estimate each loan's probability of default as a percentage.
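
For reference, a minimal sketch of this first step, using the variable names from the data sample in this post. The names pd_full and pd_full_pct are placeholders made up for illustration, and the multiplication by 100 assumes that "percentage" means the predicted probability rescaled from the 0 to 1 range.

```stata
* Probit of the default dummy on borrower characteristics (full sample)
probit default_1 homeowner_1 amount_delinquent bankcard_utilization ///
    revolving_balance credit_history_length delinquencies_over60_days

* Predicted probability of default for each loan (a value between 0 and 1)
predict double pd_full, pr

* Rescale to a percentage (pd_full_pct is a hypothetical variable name)
generate double pd_full_pct = 100 * pd_full
```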

I'm now trying to run the probit regression on only half of the loans in my sample and use the coefficients from that regression to predict the probability of default, as a percentage, for the whole sample. How can I do this? Please see the following data sample.

clear
input float(default_1 homeowner_1) long amount_delinquent float bankcard_utilization long revolving_balance float credit_history_length byte delinquencies_over60_days
0 0 0 .83 22144 19.164955 0
0 1 0 .45 23427 21.08145 3
1 1 0 .4 29815 26.55989 0
0 1 0 .2 7484 16.347708 0
0 1 0 .75 3622 21.141684 0
0 0 0 .2 331 14.557153 0
1 1 0 .67 67001 18.283367 0
0 1 0 .95 62094 17.18549 0
0 1 0 0 1092 12.364134 0
0 1 0 .5 15593 23.140314 1
0 1 0 .33 14145 28.249144 0
0 1 0 .77 50439 27.54278 0
0 1 0 .69 27053 24.0219 1
1 1 0 .71 15149 33.99589 2
0 1 232 .75 58265 19.28268 0
0 0 0 .01 29 12.427105 0
1 0 0 .59 55194 22.29432 0
end
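
My current guess at how the split-sample step might look is sketched below, and I would appreciate confirmation or correction. It assumes a random 50/50 split is acceptable. The indicator est_half, the seed value, and the names pd_half and pd_half_pct are all made up for illustration. I have not run this on the small sample above, which is probably too small for the probit to estimate reliably.

```stata
* Reproducible random split: flag roughly half of the loans for estimation
* (est_half and the seed value are hypothetical choices)
set seed 12345
generate byte est_half = runiform() < 0.5

* Fit the probit on the flagged half only
probit default_1 homeowner_1 amount_delinquent bankcard_utilization ///
    revolving_balance credit_history_length delinquencies_over60_days ///
    if est_half

* Without an if qualifier, predict is computed for every observation
* in memory that has nonmissing regressors, not only the estimation sample
predict double pd_half, pr

* Rescale to a percentage (pd_half_pct is a hypothetical variable name)
generate double pd_half_pct = 100 * pd_half
```

If the prediction should instead be limited to the loans used in estimation, adding "if e(sample)" to the predict line would do that, which is the opposite of what I want here.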