Thank you in advance for your time and effort! I really appreciate it!
I have some big issues with my two-way fixed effects model that I can't solve, and I am getting very frustrated, to be honest. I have an unbalanced panel data set with N=33 and T=7, and I am using the xtreg, fe command in Stata 16.
My problem at this point is that the within R-squared is always greater than 90% and sometimes even greater than 95%. This seems way too good, and I cannot believe that it is a sign of a good model. I would rather say that it shows that my model is extremely bad.
After looking for possible causes of such a value, I concluded that it must be the result of a spurious regression.
Causes of spurious regression, according to my knowledge, are: multicollinearity (but all VIFs are smaller than 5); overfitting (but when I discard multiple independent variables, dummies or control variables, the R-squared still stays around 90%); identical functional forms of DV and IV (not the case, since only 2 IVs are log-transformed); chance correlation (I really hope this isn't the case); and time trends or non-stationarity.
What I detected when going through my model:
There is a high correlation between the independent variable "lnTotalAssets" (proxy for size) and the dependent variable "lnGrossLoans".
-> When I discard lnTotalAssets, the R-squared decreases to 83%, which is still very high, and it then stays around this number when I try to decrease it further by deleting other variables, so this does not seem to be the root of my problem.
With regard to time trends, I originally assumed that the time dummies account for those trends and that, especially in an N > T dataset, I do not have to worry too much about them.
When I looked at my time dummies, I noticed that my macroeconomic variables are correlated with my year dummies, and Stata omits either many of the time dummies or the macroeconomic variables. But when I discard the macroeconomic variables, so that no year dummies are omitted, nothing changes and the R-squared stays at 95%.
My questions at this point are:
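The comparison described above can be sketched as follows. This is only an illustration under assumptions: it assumes the panel is declared with a numeric Entity and Year, it uses a shortened regressor list rather than my full specification, and the year-dummies-only run is an extra check that is not part of what I reported above. It relies on e(r2_w), the within R-squared that xtreg, fe stores after estimation.

```stata
* Assumption: Entity is a numeric panel identifier and Year the time variable
xtset Entity Year

* Shortened specification (illustrative subset of the regressors)
xtreg lnGrossLoans lnTotalAssets LiquidityRatio EquityRatio i.Year, fe vce(cluster Entity)
display "within R2 with lnTotalAssets:    " %6.4f e(r2_w)

* Same specification without the size proxy
xtreg lnGrossLoans LiquidityRatio EquityRatio i.Year, fe vce(cluster Entity)
display "within R2 without lnTotalAssets: " %6.4f e(r2_w)

* Year dummies only: how much within variation common time effects explain alone
xtreg lnGrossLoans i.Year, fe vce(cluster Entity)
display "within R2, year dummies only:    " %6.4f e(r2_w)
```

If the last number is already high, a large share of the within R-squared would come from the common time pattern rather than from the bank-level regressors.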
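A minimal sketch of how the overlap between a macroeconomic variable and the year dummies could be checked, using RealGDPgrowthofChina from my model as the example. The helper variable sd_gdp is a hypothetical name introduced only for this check, and the sketch assumes the macro series takes one value per year for all entities.

```stata
* Within-year standard deviation of the macro variable across entities
* (sd_gdp is a hypothetical helper name)
bysort Year: egen double sd_gdp = sd(RealGDPgrowthofChina)
summarize sd_gdp
* A maximum of 0 means the variable does not vary across entities within a year

* Regress the macro variable on the full set of year dummies
regress RealGDPgrowthofChina i.Year
display "R2 on year dummies: " %6.4f e(r2)
* An R2 of 1 means the variable is an exact linear combination of the dummies

drop sd_gdp
```

If the R-squared is 1, the variable carries no information beyond the year dummies, which would explain why Stata has to omit one or the other.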
1) Do you have other ideas besides my guess about spurious regression?
2) Do you think it is necessary to check for stationarity and cointegration? If yes, would it be enough to take first differences of the dependent variable (if it is non-stationary) and of those independent variables that are non-stationary, or do I have to transform all variables in such a case? (So basically: is it statistically valid to transform only some variables, e.g. the ones I do not want to interpret?)
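To make question 2 concrete, here is a rough sketch of what I have in mind. It is not meant as a recommended procedure: the lag choice, the Fisher-type test and the shortened regressor list are all assumptions for illustration. The Fisher-type test in xtunitroot is used because it accepts unbalanced panels, although with T=7 its power is presumably very limited.

```stata
* Assumption: Entity is a numeric panel identifier and Year the time variable
xtset Entity Year

* Fisher-type panel unit-root test based on ADF regressions, no lagged differences
xtunitroot fisher lnGrossLoans, dfuller lags(0)
xtunitroot fisher lnTotalAssets, dfuller lags(0)

* First differences via the D. time-series operator (illustrative regressor subset);
* observations next to a gap in the panel are dropped because D. is missing there
regress D.lnGrossLoans D.lnTotalAssets LiquidityRatio EquityRatio i.Year, vce(cluster Entity)
```

In this sketch only the two log variables are differenced and the ratios enter in levels, which is exactly the kind of mixed transformation I am unsure about.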
Code:
xtreg lnGrossLoans lnTotalAssets LiquidityRatio EquityRatio lnNPLratio Depositratio ROAE2 NetFeesCommissionsNI BaselIIIDummy NPLDummy LDDummy MandatoryReservesDep RealGDPgrowthofChina ShanghaiCompositeStockMarket i.Year i.ReportingStand, fe cluster(Entity) vce(bootstrap, rep(50) seed(20))
Joan
P.S. I do not yet understand how to show my regression output, since screenshots are not allowed. Otherwise I would provide it right away.