Hi Statalist,

I'm estimating a probit model of the likelihood that a patent is granted, using only a small handful of firm and patent characteristics. Presenting the probit results themselves is relatively straightforward, but because the majority of my variables are dummy variables, I'm having a few problems with the post-estimation predictions. Here is a copy of my data.

Sealed  logavgmktcap  patents_on_issue  ipcA  ipcB  ipcC  ipcD  ipcE  ipcF  ipcG  ipcH
1       23.73145      4                 0     0     0     0     0     0     1     0
0       26.00386      6                 0     0     1     0     0     0     0     0
1       26.2179       1                 0     0     0     0     0     1     0     0
1       24.43442      10                0     0     1     0     0     0     0     0
1       26.354576     1                 0     0     1     0     0     0     0     0
0       26.117344     8                 0     0     1     0     0     0     0     0
0       25.99842      11                0     0     1     0     0     0     0     0
1       25.34848      4                 0     0     1     0     0     0     0     0
1       26.00386      6                 0     0     0     0     0     0     0     1
1       26.117344     8                 0     0     0     0     0     1     0     0
1       25.68289      7                 0     0     0     0     1     0     0     0
0       25.68289      7                 0     0     1     0     0     0     0     0
0       25.99842      11                0     0     0     0     0     0     1     0
1       25.34848      4                 0     0     1     0     0     0     0     0
1       25.99842      11                0     0     0     0     0     1     0     0
1       24.97608      6                 0     0     1     0     0     0     0     0
1       24.97608      6                 0     0     1     0     0     0     0     0

The dependent variable, Sealed, is an indicator that takes the value 1 if a patent is granted. ipcA through ipcH are dummy variables identifying the patent's technology field (physics, materials, etc.). Naturally, I had to drop one of the IPC class variables to avoid perfect collinearity in the probit regression.

How would I go about estimating the (conditional) probability of a grant for each unique combination of the above variables, and saving those probabilities as a new variable to include in a separate regression? My concern is that the dummy I dropped will not appear in the list of covariates after estimation, so I am unsure how to obtain predictions for that category.
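To make the question concrete, here is a minimal sketch of how I understand a probit prediction is formed. It is written in Python rather than Stata, purely to show the arithmetic. The coefficient values are invented for illustration and are not estimates from my data, and I am assuming ipcC is the omitted base category. If this is right, an observation in the omitted field simply has all included dummies equal to zero, so its probability comes from the constant and the continuous covariates alone.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF, the link function used by a probit model."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# HYPOTHETICAL coefficients, made up for illustration only.
# ipcC is assumed to be the omitted (base) category, so it has no entry here.
COEF = {
    "_cons": 2.0,
    "logavgmktcap": -0.05,
    "patents_on_issue": -0.08,
    "ipcE": 0.4,
    "ipcF": 0.9,
    "ipcG": -0.3,
    "ipcH": 0.6,
}

def predicted_probability(row):
    """Return Phi(x'b) for one observation given as a dict of covariate values.

    Covariates without a coefficient (such as the omitted dummy) are ignored,
    and covariates missing from the row are treated as zero.
    """
    index = COEF["_cons"]
    for name, beta in COEF.items():
        if name != "_cons":
            index += beta * row.get(name, 0.0)
    return normal_cdf(index)

# Two observations with the same continuous covariates but different fields.
base_row = {"logavgmktcap": 25.34848, "patents_on_issue": 4, "ipcC": 1}
field_f_row = {"logavgmktcap": 25.34848, "patents_on_issue": 4, "ipcF": 1}

p_base = predicted_probability(base_row)
p_field_f = predicted_probability(field_f_row)
```

Here `p_base` is the predicted probability for the omitted field and `p_field_f` shifts the index by the (hypothetical) ipcF coefficient. What I would like is the Stata equivalent that stores this value for every observation.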

Thanks,

Andy