My last post got hardly any reaction, and I was advised to be more specific, so I'm trying again with a similar question. If I got any technicalities wrong, please tell me; I'm eager to get this right.
I'm using Stata 13.
My variables are as follows:
Dependent:
par30 - portfolio at risk > 30 (percentage of loans overdue more than 30 days)
Independent:
perfem - percentage of female borrowers
TAK - total assets (in $1000)
PSK - average portfolio size per borrower (in $1000)
MFIage - indicates whether an MFI is new, young or mature (1 = 1-4 years old, 2 = 4-8 years old, 3 = 8+ years old)
- I used a different variable beforehand (0 for new, +1 for each year the MFI is active); however, I encountered huge problems with linearity, so I decided to use this one instead
MFI = Microfinance Institution, also called financial service provider
Concerning the group variable, my data contains a variable called mfiname, which holds the individual name of each MFI, so I did:
Code:
. egen numMFI = group(mfiname)
. xtset numMFI fiscalyear
panel variable: numMFI (unbalanced)
time variable: fiscalyear, 2003 to 2012, but with gaps
delta: 1 unit
Code:
. quietly xtreg par30 perfem TAK PSK MFIage
Code:
. nlcheck perfem
Nonlinearity test:
chi2( 9) = 10.11
Prob > chi2 = 0.3414
Code:
. nlcheck TAK
Nonlinearity test:
chi2( 9) = 7.91
Prob > chi2 = 0.5434
Code:
. nlcheck PSK
Nonlinearity test:
chi2( 9) = 20.61
Prob > chi2 = 0.0145
Code:
. nlcheck MFIage
Nonlinearity test:
chi2( 1) = 0.13
Prob > chi2 = 0.7229
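Since MFIage only takes three values, an alternative I could try is to treat it as categorical rather than as a linear term. This is only a minimal sketch, assuming the panel has already been declared with -xtset- as above; it uses Stata's factor-variable notation and is not something I have run on this data yet.

```stata
* Sketch: enter MFIage as a factor variable (category 1 is the base level)
* Assumes: xtset numMFI fiscalyear has already been run
xtreg par30 perfem TAK PSK i.MFIage

* Joint Wald test of the MFIage indicators
testparm i.MFIage
```

With this specification no linear spacing between "new", "young" and "mature" is imposed, at the cost of one extra parameter.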
Code:
. quietly regress par30 perfem TAK PSK MFIage
. predict r, resid
Code:
. acprplot perfem , lowess
. acprplot TAK , lowess
. acprplot PSK , lowess
. acprplot MFIage , lowess
Code:
. acprplot perfem , lowess
Code:
. acprplot PSK , lowess
Code:
. kdensity PSK, normal
Code:
. summarize PSK, detail
                             PSK
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .0266504       .0149792
 5%     .0484809       .0195255
10%      .061839       .0266504       Obs                 209
25%     .1161648       .0330702       Sum of Wgt.         209

50%     .2528287                      Mean            .431633
                        Largest       Std. Dev.      .4882636
75%     .5544006       2.138109
90%     1.126634       2.336616       Variance       .2384014
95%     1.406009        2.36487       Skewness       2.467618
99%     2.336616       3.398852       Kurtosis       11.24819
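With a skewness of about 2.5, PSK is clearly right-skewed. As a rough sketch of how candidate transformations could be compared before picking one, Stata's ladder-of-powers commands can be run on the variable. This only looks at the marginal distribution of PSK, so it is a starting point and does not by itself say anything about linearity in the regression.

```stata
* Sketch: compare ladder-of-powers transformations of PSK
* (cubic, square, identity, square root, log, reciprocal roots, ...)
ladder PSK

* Histograms of each transformation with a normal density overlaid
gladder PSK
```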
1. -nlcheck- indicated that perfem is linear; however, the plot doesn't look too great to me. Any thoughts on that?
2. Any ideas on transforming PSK? I tried taking the natural log, which gave me the following results:
Code:
. generate lnPSK = ln(PSK)
. quietly regress par30 perfem TAK lnPSK MFIage
. predict r, resid
(81 missing values generated)
Code:
. acprplot lnPSK , lowess
Code:
. kdensity lnPSK, normal
This does look better, I guess; however, -nlcheck- still rejects the null of linearity:
Code:
. quietly xtreg par30 perfem TAK lnPSK MFIage
. nlcheck lnPSK
Nonlinearity test:
chi2( 9) = 20.58
Prob > chi2 = 0.0146
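One further option I am considering, since the log alone does not seem to be enough, is to allow for curvature in lnPSK with a squared term. The following is only a sketch under the assumption that a quadratic is a reasonable functional form here; I do not know whether it would satisfy -nlcheck-.

```stata
* Sketch: add a quadratic term in lnPSK using factor-variable notation
* Assumes lnPSK was generated as above and the data are xtset
xtreg par30 perfem TAK c.lnPSK##c.lnPSK MFIage

* Wald test of the squared term; a small p-value would point to curvature
test c.lnPSK#c.lnPSK
```

Writing the square as c.lnPSK##c.lnPSK, rather than generating a separate squared variable, lets -margins- account for both terms when computing the effect of lnPSK.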
Thanks in advance!