I am a novice user in Stata. I'm working with Stata.14 and Windows 7.
I'm working on a Panel Data Set for all commerical banks in the U.S. for the period 1995 - 2018 (time variable). So I have data on a bank-year level. I created the ID Variable with the variables bank name and cert I already calculated four bank risk proxies: Z-Score, NPA (non-performing assets), LLP (loan loss provisions) and LLR (loan loss reserves) on a bank-year level.
I calculated the Risk Proxy Z_score and I would like to run the binary probability model explaining the occurrence of a bank failure ( Failure = 1, Active = 0) with the risk proxy (lagged by one year). My "Guiding Paper" reports marginal effects of probit regressions.
This is the dataset with 172 431 observations and the Variable "status" if a banke failed = 1 or if it is still active = 0
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float(year id status Z_score) 1995 7 1 -1.4038005 1995 10 0 -1.434213 1995 11 0 -1.5771302 1995 14 0 -1.758422 1995 16 0 -1.4295077 1995 21 0 -1.329172 1995 27 0 -1.3730284 1995 32 0 -1.3627455 1995 34 0 -1.908463 1995 38 0 -1.8048723 1995 41 0 -1.5905398 1995 46 0 -1.533159 1995 47 0 -1.663955 1995 48 0 -1.518417 1995 52 0 -1.3701818 1995 53 0 -1.485621 1995 56 0 -1.63249 1995 59 0 -1.476241 1995 76 0 -1.3577138 1995 82 0 -1.3661845 1995 84 0 -1.3885205 1995 85 0 -1.5949416 1995 87 0 -2.0597448 1995 99 0 -2.2821965 1995 101 0 -1.5937258 1995 104 0 -1.6237373 end
- As I use data for all U.S. commercial banks between 1995 and 2018 I have Panel Data, so I have to use xtprobit
- As my model is Pr(Failure = 1) = F(x´it-1 ß) and my "Guiding Paper" is saying "we report marginal effects of probit regressions" I think I have to use the Population-averaged (PA) model instead of the Random-effects (RE) model. After the literature review I came to the conclusion, that I would like to have a model which should explain the occurence of a bank failure of the entire "banking population" and not of a specific bank (e.g. Deutsche Bank).
So I was running the command with the correct correction of the standard error with vce(robust):
Code:
xtset id year panel variable: id (unbalanced) time variable: year, 1995 to 2018, but with gaps delta: 1 unit
Code:
xtprobit status L.Z_score, pa vce(robust)
Code:
Iteration 1: tolerance = .86103329 Iteration 2: tolerance = .503125 Iteration 3: tolerance = .29113738 Iteration 4: tolerance = .22620098 Iteration 5: tolerance = .14532103 Iteration 6: tolerance = .10679244 Iteration 7: tolerance = .06929776 Iteration 8: tolerance = .05079473 Iteration 9: tolerance = .03304497 Iteration 10: tolerance = .02339874 Iteration 11: tolerance = .01565962 Iteration 12: tolerance = .01090075 Iteration 13: tolerance = .0073888 Iteration 14: tolerance = .00510128 Iteration 15: tolerance = .00347811 Iteration 16: tolerance = .00239194 Iteration 17: tolerance = .00163531 Iteration 18: tolerance = .00112255 Iteration 19: tolerance = .00076844 Iteration 20: tolerance = .00052703 Iteration 21: tolerance = .000361 Iteration 22: tolerance = .00024749 Iteration 23: tolerance = .00016957 Iteration 24: tolerance = .00011623 Iteration 25: tolerance = .00007964 Iteration 26: tolerance = .00005459 Iteration 27: tolerance = .00003741 Iteration 28: tolerance = .00002564 Iteration 29: tolerance = .00001757 Iteration 30: tolerance = .00001204 Iteration 31: tolerance = 8.251e-06 Iteration 32: tolerance = 5.655e-06 Iteration 33: tolerance = 3.875e-06 Iteration 34: tolerance = 2.656e-06 Iteration 35: tolerance = 1.820e-06 Iteration 36: tolerance = 1.247e-06 Iteration 37: tolerance = 8.548e-07 GEE population-averaged model Number of obs = 156,147 Group variable: id Number of groups = 14,692 Link: probit Obs per group: Family: binomial min = 1 Correlation: exchangeable avg = 10.6 max = 23 Wald chi2(1) = 24.18 Scale parameter: 1 Prob > chi2 = 0.0000 (Std. Err. adjusted for clustering on id) ------------------------------------------------------------------------------ | Semirobust status | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- Z_score | L1. | .0335066 .0068144 4.92 0.000 .0201506 .0468626 | _cons | -1.658208 .0214663 -77.25 0.000 -1.700282 -1.616135 ------------------------------------------------------------------------------
In my "Guiding Paper" they assess the Z-Score Model on its Pseudo R2. I know that the "normal" probit regeression Output shows me the Pseudo R2 and there is a way to calculate the Pseudo R2 for the xtprobit Panel Data Probit Regression. I know that the pseudo R2 is stored with e(r2 p).
How can I calculate the Pseudo R Squared for the Population Averaged Model?
Thank you very much for your support!
0 Response to Pseudo R squared for a Population-averaged Model
Post a Comment