As my last post did not get much of a reaction, and I was advised to be more specific, I'm trying again with a similar question. If I got any technicalities wrong, please tell me; I'm eager to get this right.
I'm using Stata 13.
My variables are as follows:
Dependent:
par30 - portfolio at risk > 30 (percentage of loans overdue more than 30 days)
Independent:
perfem - percentage of female borrowers
TAK - total assets (in $1000)
PSK - average portfolio size per borrower (in $1000)
MFIage - indicates whether an MFI is new, young or mature (1 = 1-4 years old, 2 = 4-8 years old, 3 = 8+ years old)
- I previously used a different variable (0 for new, +1 for each year the MFI has been active), but I ran into serious problems with the linearity assumption, so I decided to use this one instead
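A minimal sketch of how a three-level variable like MFIage could be built from an age-in-years variable. The name ageyears and the toy values are hypothetical (not from my dataset), and sending the boundary years 4 and 8 to the higher category is an assumption, since the ranges above overlap at those values:

```stata
* Toy data; ageyears is a hypothetical variable name
clear
input ageyears
0
3
4
7
8
15
.
end

* 1 = new (under 4 years), 2 = young (4 to under 8), 3 = mature (8+)
* Assumption: boundary years 4 and 8 fall into the higher category
generate byte MFIage = cond(ageyears < 4, 1, cond(ageyears < 8, 2, 3)) if !missing(ageyears)

* Attach value labels so tables and plots are readable
label define agecat 1 "new" 2 "young" 3 "mature"
label values MFIage agecat
tabulate MFIage, missing
```

The if !missing(ageyears) qualifier matters because Stata treats missing values as larger than any number, so without it a missing age would be coded as mature.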
MFI = Microfinance Institution, also called financial service provider
Concerning the group variable: my data contains a variable called mfiname, which holds the individual name of each MFI, so I did the following:
Code:
. egen numMFI = group(mfiname)
. xtset numMFI fiscalyear
       panel variable:  numMFI (unbalanced)
        time variable:  fiscalyear, 2003 to 2012, but with gaps
                delta:  1 unit
Code:
. quietly xtreg par30 perfem TAK PSK MFIage
Code:
. nlcheck perfem
Nonlinearity test:
           chi2(  9) =   10.11
         Prob > chi2 =  0.3414
Code:
. nlcheck TAK
Nonlinearity test:
           chi2(  9) =    7.91
         Prob > chi2 =  0.5434
Code:
. nlcheck PSK
Nonlinearity test:
           chi2(  9) =   20.61
         Prob > chi2 =  0.0145
Code:
. nlcheck MFIage
Nonlinearity test:
           chi2(  1) =    0.13
         Prob > chi2 =  0.7229
Code:
. quietly regress par30 perfem TAK PSK MFIage
. predict r, resid
Code:
. acprplot perfem , lowess
. acprplot TAK , lowess
. acprplot PSK , lowess
. acprplot MFIage , lowess
Code:
. acprplot perfem , lowess
Code:
. acprplot PSK , lowess
Code:
. kdensity PSK, normal
Code:
. summarize PSK, detail

                             PSK
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .0266504       .0149792
 5%     .0484809       .0195255
10%      .061839       .0266504       Obs                 209
25%     .1161648       .0330702       Sum of Wgt.         209

50%     .2528287                      Mean            .431633
                        Largest       Std. Dev.      .4882636
75%     .5544006       2.138109
90%     1.126634       2.336616       Variance       .2384014
95%     1.406009        2.36487       Skewness       2.467618
99%     2.336616       3.398852       Kurtosis       11.24819
1. -nlcheck- did not reject linearity for perfem (p = 0.3414), but the plot doesn't look too great to me. Any thoughts on that?
2. Any ideas on transforming PSK? I tried taking its natural log, which gave me the following results:
Code:
. generate lnPSK = ln(PSK)
. quietly regress par30 perfem TAK lnPSK MFIage
. predict r, resid
(81 missing values generated)
Code:
. acprplot lnPSK , lowess
Code:
. kdensity lnPSK, normal
That does look better, I guess; however, -nlcheck- still rejects the null of linearity (p = 0.0146):
Code:
. quietly xtreg par30 perfem TAK lnPSK MFIage
. nlcheck lnPSK
Nonlinearity test:
           chi2(  9) =   20.58
         Prob > chi2 =  0.0146
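As a rough illustration of what the natural log is meant to achieve for a right-skewed variable like PSK, here is a sketch on simulated data. This is not my data: the seed, the variable names x and lnx, and the lognormal parameters are all made up for the example.

```stata
* Simulated right-skewed variable (hypothetical, lognormal); not the real PSK
clear
set seed 12345
set obs 209
generate x = exp(rnormal(-1.4, 1))
generate lnx = ln(x)

* Compare skewness before and after the log transform
quietly summarize x, detail
scalar skew_raw = r(skewness)
quietly summarize lnx, detail
scalar skew_log = r(skewness)
display "skewness of x:   " skew_raw
display "skewness of lnx: " skew_log
```

A more symmetric distribution for the regressor does not by itself make its relationship with par30 linear, which may be part of why -nlcheck- still rejects after the transformation.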
Thanks in advance!