I want to examine the relationship between the blood pressure components (systolic blood pressure, SBP; diastolic blood pressure, DBP) and a binary outcome.
I set up the dependent variable, independent variables, and models as follows.
<Dependent variable>
outcome (0: negative, 1: positive)
<Independent variable>
(main predictor)
gr_sbp5 (5 level categorical variable, ref=1)
gr_dbp5 (5 level categorical variable, ref=1)
Because SBP showed a non-linear (U-shaped) relationship with the outcome, I converted both SBP and DBP into categorical variables.
(covariate)
gr_bmi (5 level categorical variable, ref=1)
uob, dz_cvd, dz_dm (binary categorical variable, ref=0)
wbc, hb, glu10, chol10, gfr10, u_ph (continuous variable)
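The post does not say how the five blood pressure groups were built; the labels (2nd to 5th) suggest quintiles. A minimal sketch of that construction, assuming the raw measurements are stored in variables named sbp and dbp (hypothetical names, as are sbp_q5 and dbp_q5):

```stata
* Assumption: sbp and dbp hold the raw measurements in mmHg (not shown in the post)
xtile sbp_q5 = sbp, nquantiles(5)
xtile dbp_q5 = dbp, nquantiles(5)

* Check the range and size of each group
tabstat sbp, by(sbp_q5) statistics(min max n)
tabstat dbp, by(dbp_q5) statistics(min max n)

* Joint distribution of the two groupings; sparse cells matter once an interaction is added
tabulate sbp_q5 dbp_q5
```

The cross-tabulation is worth inspecting before fitting an interaction, because cells with few observations (for example, the highest SBP group with the lowest DBP group) give very imprecise interaction estimates.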
<Multivariate model>
model1 : base model + SBP
model2 : base model + DBP
model3 : base model + SBP + DBP
model4 : base model + SBP + DBP + Interaction term(SBP*DBP)
Code:
. logistic outcome i.gr_bmi uob dz_cvd dz_dm wbc hb glu10 chol10 gfr10 u_ph i.gr_sbp5

Logistic regression                             Number of obs     =    307,996
                                                LR chi2(17)       =    2818.92
                                                Prob > chi2       =     0.0000
Log likelihood = -19670.63                      Pseudo R2         =     0.0669

--------------------------------------------------------------------------------
       outcome | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
        gr_bmi |
    18.5-22.9  |   .2274956   .0089222   -37.75   0.000     .2106636    .2456724
      23-24.9  |   .1281581   .0080337   -32.77   0.000     .1133413     .144912
      25-29.9  |   .1386081   .0080557   -34.00   0.000     .1236853    .1553315
          >30  |   .2342154   .0169318   -20.08   0.000     .2032735    .2698672
               |
           uob |   4.106947   .2508892    23.12   0.000     3.643511     4.62933
        dz_cvd |   1.542221    .310151     2.15   0.031     1.039835    2.287331
         dz_dm |   11.24978   1.953465    13.94   0.000     8.004562    15.81069
           wbc |   1.038053   .0095519     4.06   0.000     1.019499    1.056944
            hb |   1.063652   .0198393     3.31   0.001     1.025469    1.103255
         glu10 |   .9808376   .0096815    -1.96   0.050     .9620446    .9999978
        chol10 |    1.02705   .0061042     4.49   0.000     1.015155    1.039084
         gfr10 |   .8396113   .0108823   -13.49   0.000      .818551    .8612135
          u_ph |   .8203162   .0242938    -6.69   0.000     .7740568    .8693401
               |
       gr_sbp5 |
 2nd(115-123)  |    .920193   .0441731    -1.73   0.083     .8375634    1.010974
3rd (123-130)  |   .9555914   .0479349    -0.91   0.365     .8661115    1.054316
4th (130-136)  |   .8897213   .0479283    -2.17   0.030     .8005726    .9887973
   5th (>136)  |   .9576887   .0531013    -0.78   0.436     .8590679    1.067631
               |
         _cons |   .2932171   .1202545    -2.99   0.003     .1312483    .6550658
--------------------------------------------------------------------------------

. testparm i.gr_sbp5
 -(omitted)-
           chi2(  4) =    5.84
         Prob > chi2 =    0.2117

.
. logistic outcome i.gr_bmi uob dz_cvd dz_dm wbc hb glu10 chol10 gfr10 u_ph i.gr_dbp5

Logistic regression                             Number of obs     =    307,996
                                                LR chi2(17)       =    2875.08
                                                Prob > chi2       =     0.0000
Log likelihood = -19642.551                     Pseudo R2         =     0.0682

------------------------------------------------------------------------------
     outcome | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gr_bmi |
  18.5-22.9  |   .2246587   .0087555   -38.31   0.000     .2081373    .2424915
    23-24.9  |   .1232036   .0076423   -33.76   0.000     .1090997    .1391308
    25-29.9  |   .1283074   .0072706   -36.24   0.000     .1148201    .1433791
        >30  |   .2004536   .0140431   -22.94   0.000     .1747357    .2299567
             |
         uob |   4.106937   .2509981    23.11   0.000     3.643311     4.62956
      dz_cvd |   1.543388   .3104973     2.16   0.031     1.040474    2.289386
       dz_dm |   11.95432   2.058213    14.41   0.000     8.530423    16.75247
         wbc |    1.03261   .0095297     3.48   0.001     1.014101    1.051458
          hb |   1.042993   .0194805     2.25   0.024     1.005503    1.081882
       glu10 |    .973856   .0098721    -2.61   0.009      .954698    .9933984
      chol10 |   1.023836   .0060904     3.96   0.000     1.011968    1.035843
       gfr10 |     .83826   .0108643   -13.61   0.000     .8172345    .8598264
        u_ph |   .8177969   .0242136    -6.79   0.000     .7716899    .8666587
             |
     gr_dbp5 |
2nd (67-72)  |    1.08831   .0574541     1.60   0.109     .9813323     1.20695
3rd (72-77)  |   1.191713   .0636803     3.28   0.001     1.073216    1.323295
4th (77-82)  |   1.122692   .0625467     2.08   0.038     1.006559    1.252225
  5th (>82)  |   1.493472   .0821815     7.29   0.000     1.340781    1.663552
             |
       _cons |   .4072293   .1672724    -2.19   0.029     .1820547    .9109117
------------------------------------------------------------------------------

. testparm i.gr_dbp5
 -(omitted)-
           chi2(  4) =   63.82
         Prob > chi2 =    0.0000

.
. logistic outcome i.gr_bmi uob dz_cvd dz_dm wbc hb glu10 chol10 gfr10 u_ph i.gr_sbp5 i.gr_dbp5

Logistic regression                             Number of obs     =    307,996
                                                LR chi2(21)       =    2926.78
                                                Prob > chi2       =     0.0000
Log likelihood = -19616.696                     Pseudo R2         =     0.0694

--------------------------------------------------------------------------------
       outcome | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
        gr_bmi |
    18.5-22.9  |   .2326459   .0091386   -37.12   0.000     .2154066    .2512648
      23-24.9  |   .1316977   .0082627   -32.31   0.000     .1164593    .1489302
      25-29.9  |    .140246   .0081583   -33.77   0.000     .1251339    .1571832
          >30  |   .2230054   .0162042   -20.65   0.000     .1934037    .2571378
               |
           uob |   4.109485   .2512613    23.12   0.000     3.645385     4.63267
        dz_cvd |     1.5523   .3124149     2.18   0.029     1.046317    2.302968
         dz_dm |   11.64676   2.016673    14.18   0.000     8.295009    16.35284
           wbc |   1.033418   .0095374     3.56   0.000     1.014893    1.052281
            hb |   1.045843   .0195443     2.40   0.016      1.00823    1.084859
         glu10 |   .9761361   .0098164    -2.40   0.016     .9570847    .9955667
        chol10 |   1.024662   .0060943     4.10   0.000     1.012786    1.036676
         gfr10 |   .8390002   .0108785   -13.54   0.000     .8179475    .8605948
          u_ph |   .8232253   .0243883    -6.57   0.000     .7767865    .8724405
               |
       gr_sbp5 |
 2nd(115-123)  |   .8176541   .0417093    -3.95   0.000      .739859    .9036291
3rd (123-130)  |     .77717    .043948    -4.46   0.000     .6956354    .8682611
4th (130-136)  |   .6656185   .0421356    -6.43   0.000     .5879518    .7535446
   5th (>136)  |   .6369883   .0439094    -6.54   0.000     .5564879    .7291336
               |
       gr_dbp5 |
  2nd (67-72)  |   1.179404   .0642708     3.03   0.002      1.05993    1.312346
  3rd (72-77)  |   1.411342   .0839404     5.79   0.000     1.256049    1.585835
  4th (77-82)  |   1.398934   .0897421     5.23   0.000     1.233651    1.586361
    5th (>82)  |   1.981014   .1352353    10.01   0.000     1.732924     2.26462
               |
         _cons |    .373708   .1536299    -2.39   0.017     .1669578    .8364846
--------------------------------------------------------------------------------

.
. logistic outcome i.gr_bmi uob dz_cvd dz_dm wbc hb glu10 chol10 gfr10 u_ph i.gr_sbp5##i.gr_dbp5

Logistic regression                             Number of obs     =    307,996
                                                LR chi2(37)       =    2941.57
                                                Prob > chi2       =     0.0000
Log likelihood = -19609.303                     Pseudo R2         =     0.0698

--------------------------------------------------------------------------------------------
                   outcome | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------------+----------------------------------------------------------------
                    gr_bmi |
                18.5-22.9  |   .2325644   .0091394   -37.12   0.000      .215324    .2511851
                  23-24.9  |   .1316118   .0082597   -32.31   0.000     .1163792    .1488382
                  25-29.9  |    .140364   .0081682   -33.74   0.000     .1252338    .1573221
                      >30  |   .2224026    .016214   -20.62   0.000     .1927898    .2565639
                           |
                       uob |   4.106376   .2511678    23.09   0.000     3.642459    4.629379
                    dz_cvd |   1.556121   .3132275     2.20   0.028     1.048835    2.308764
                     dz_dm |    11.6862   2.024792    14.19   0.000     8.321299    16.41177
                       wbc |   1.033174   .0095377     3.54   0.000     1.014649    1.052038
                        hb |   1.045368   .0195416     2.37   0.018      1.00776    1.084379
                     glu10 |   .9760597   .0098105    -2.41   0.016     .9570196    .9954787
                    chol10 |   1.024647   .0060968     4.09   0.000     1.012767    1.036666
                     gfr10 |   .8390364   .0108798   -13.53   0.000     .8179811    .8606337
                      u_ph |   .8235468   .0244002    -6.55   0.000     .7770854    .8727861
                           |
                   gr_sbp5 |
             2nd(115-123)  |   .8260038   .0802381    -1.97   0.049     .6828043    .9992355
            3rd (123-130)  |     .85602   .1312613    -1.01   0.311     .6338135    1.156129
            4th (130-136)  |   .9636551   .2558051    -0.14   0.889     .5727534    1.621346
               5th (>136)  |   .6518757   .3799761    -0.73   0.463     .2079708    2.043277
                           |
                   gr_dbp5 |
              2nd (67-72)  |   1.157018   .0853214     1.98   0.048     1.001315    1.336934
              3rd (72-77)  |   1.525007   .1551635     4.15   0.000     1.249296    1.861565
              4th (77-82)  |   1.778231   .2553033     4.01   0.000     1.342084    2.356115
                5th (>82)  |    2.00592   .4725639     2.95   0.003     1.264106    3.183051
                           |
           gr_sbp5#gr_dbp5 |
 2nd(115-123)#2nd (67-72)  |   1.033055   .1328648     0.25   0.800     .8028741    1.329228
 2nd(115-123)#3rd (72-77)  |   .9158564   .1363203    -0.59   0.555     .6841174    1.226095
 2nd(115-123)#4th (77-82)  |   .7906024   .1473929    -1.26   0.208     .5486149    1.139327
   2nd(115-123)#5th (>82)  |   1.041386   .2948189     0.14   0.886     .5979081    1.813798
3rd (123-130)#2nd (67-72)  |   .9068627   .1664451    -0.53   0.594     .6328671    1.299483
3rd (123-130)#3rd (72-77)  |   .8529421   .1616351    -0.84   0.401      .588321    1.236587
3rd (123-130)#4th (77-82)  |   .7350782   .1601667    -1.41   0.158      .479584    1.126685
  3rd (123-130)#5th (>82)  |   .9090021   .2643235    -0.33   0.743     .5141015    1.607241
4th (130-136)#2nd (67-72)  |    .885609   .2623133    -0.41   0.682     .4955872    1.582574
4th (130-136)#3rd (72-77)  |   .6956298   .2030264    -1.24   0.214     .3925967    1.232565
4th (130-136)#4th (77-82)  |   .5237542   .1607292    -2.11   0.035     .2870195    .9557482
  4th (130-136)#5th (>82)  |   .6361281   .2281251    -1.26   0.207     .3149858     1.28469
   5th (>136)#2nd (67-72)  |   1.055758   .6622867     0.09   0.931     .3087432    3.610198
   5th (>136)#3rd (72-77)  |   .7535714   .4546771    -0.47   0.639     .2309622    2.458714
   5th (>136)#4th (77-82)  |   .7626797   .4613183    -0.45   0.654     .2330666    2.495768
     5th (>136)#5th (>82)  |   1.012943   .6364825     0.02   0.984     .2956191    3.470862
                           |
                     _cons |   .3704338   .1525055    -2.41   0.016      .165301    .8301289
--------------------------------------------------------------------------------------------

. testparm i.gr_sbp5
 -(omitted)-
           chi2(  4) =    4.81
         Prob > chi2 =    0.3073

. testparm i.gr_sbp5#i.gr_dbp5
 -(omitted)-
           chi2( 16) =   15.15
         Prob > chi2 =    0.5140
In model 1, SBP was not significant except for the 4th (130-136) group, and the overall effect of gr_sbp5 was also non-significant (testparm: chi2(4) = 5.84, p = 0.2117).
In model 2, DBP was significant except for the 2nd (67-72) group.
In model 3, every SBP group became significant once the DBP groups were added to the model.
In model 4, SBP was again non-significant except for the 2nd (115-123) group, and the overall effect of gr_sbp5 was also non-significant (testparm: chi2(4) = 4.81, p = 0.3073).
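One point that may explain the pattern in model 4: with i.gr_sbp5##i.gr_dbp5, the gr_sbp5 rows are the SBP odds ratios within the reference DBP group only, so testparm i.gr_sbp5 after model 4 does not test the overall SBP effect. From the log likelihoods reported above, the likelihood-ratio statistic for model 4 versus model 3 is 2 × (19616.696 − 19609.303) ≈ 14.79 on 16 degrees of freedom, close to the Wald chi2(16) = 15.15. A minimal sketch of these comparisons, using only the variables in the post; m3 and m4 are arbitrary names for the stored estimates:

```stata
* Covariates of the base model, as listed in the post
local base i.gr_bmi uob dz_cvd dz_dm wbc hb glu10 chol10 gfr10 u_ph

* Model 3: main effects only; joint Wald tests of each blood pressure factor
logistic outcome `base' i.gr_sbp5 i.gr_dbp5
estimates store m3
testparm i.gr_sbp5
testparm i.gr_dbp5

* Model 4: adds the 16 interaction terms
logistic outcome `base' i.gr_sbp5##i.gr_dbp5
estimates store m4

* Likelihood-ratio test of the interaction (model 3 is nested in model 4)
lrtest m3 m4

* Simple effects of SBP within each DBP group, on the log-odds scale
contrast gr_sbp5@gr_dbp5
```

Because the local macro is defined at the top, the block should be run as a whole from a do-file.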
I suspected collinearity between SBP and DBP,
so I checked the collinearity among the independent variables.
Code:
. collin gr_bmi uob dz_cvd dz_dm wbc hb glu10 chol10 gfr10 u_ph gr_dbp5 gr_sbp5
(obs=307,996)

  Collinearity Diagnostics

                        SQRT                   R-
  Variable      VIF     VIF    Tolerance    Squared
----------------------------------------------------
    gr_bmi      1.29    1.14    0.7729      0.2271
       uob      1.00    1.00    0.9956      0.0044
    dz_cvd      1.00    1.00    0.9996      0.0004
     dz_dm      1.26    1.12    0.7938      0.2062
       wbc      1.07    1.03    0.9352      0.0648
        hb      1.08    1.04    0.9242      0.0758
     glu10      1.31    1.14    0.7646      0.2354
    chol10      1.13    1.06    0.8886      0.1114
     gfr10      1.03    1.01    0.9714      0.0286
      u_ph      1.01    1.01    0.9863      0.0137
   gr_dbp5      1.89    1.37    0.5303      0.4697
   gr_sbp5      2.00    1.41    0.5002      0.4998
----------------------------------------------------
  Mean VIF      1.26

                           Cond
        Eigenval          Index
---------------------------------
    1     9.2823          1.0000
    2     1.0005          3.0460
    3     0.9954          3.0538
    4     0.9757          3.0844
    5     0.4311          4.6403
    6     0.1192          8.8239
    7     0.0938          9.9464
    8     0.0481         13.8984
    9     0.0240         19.6677
    10    0.0142         25.5434
    11    0.0090         32.1367
    12    0.0057         40.4496
    13    0.0011         92.5860
---------------------------------
 Condition Number        92.5860
 Eigenvalues & Cond Index computed from scaled raw sscp (w/ intercept)
 Det(correlation matrix)    0.2703
All VIFs were at or below 2.00, but the condition number was 92.59, well above the commonly cited cutoff of 30.
When I added the interaction term between SBP and DBP, both the VIFs and the condition index increased.
When I removed gr_sbp5, the VIF of gr_dbp5 decreased to 1.17, but the condition index was still 88.68.
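The footnote of the collin output says the eigenvalues and condition indexes are computed from the scaled raw cross-product matrix with the intercept included. A condition number computed that way can be driven by covariates whose means are far from zero relative to their spread, not only by overlap between SBP and DBP; the fact that it stays at 88.68 without gr_sbp5 is consistent with that reading. A minimal sketch of two checks, assuming collin (a user-written command) is installed as in the post; the c_ prefix is a hypothetical naming choice:

```stata
* 1) How strongly do the two blood pressure groupings overlap?
*    The V option reports Cramer's V for the two-way table.
tabulate gr_sbp5 gr_dbp5, V

* 2) Mean-center the continuous covariates and rerun the diagnostics.
*    If the condition number drops sharply, it mostly reflected the
*    location of these variables relative to the intercept.
foreach v of varlist wbc hb glu10 chol10 gfr10 u_ph {
    quietly summarize `v', meanonly
    generate double c_`v' = `v' - r(mean)
}
collin gr_bmi uob dz_cvd dz_dm c_wbc c_hb c_glu10 c_chol10 c_gfr10 c_u_ph gr_dbp5 gr_sbp5
```

Centering changes only the intercept of the logistic model, not the odds ratios of the covariates, so it is a diagnostic step and not a change to the analysis.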
My questions are:
1) How should I deal with the collinearity indicated by the condition index? Can I ignore the condition index because every VIF is below 10?
2) In model 4, the interaction term was not significant, but it changed the coefficients (odds ratios) and p-values of SBP.
Which model should I select and report: models 1 and 2, model 3, or model 4?
Do I need to use the margins command? If so, at what level should I fix each blood pressure group?
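On the margins question, a minimal sketch assuming model 4 is the model of interest. By default margins after logistic reports predicted probabilities and averages over the observed values of the other covariates, so no blood pressure group has to be fixed by hand:

```stata
* Refit model 4 so that margins has the interaction model in memory
local base i.gr_bmi uob dz_cvd dz_dm wbc hb glu10 chol10 gfr10 u_ph
logistic outcome `base' i.gr_sbp5##i.gr_dbp5

* Average adjusted predicted probability for each of the 25 SBP-by-DBP cells
margins gr_sbp5#gr_dbp5

* Plot them with SBP group on the x axis and one line per DBP group
marginsplot, xdimension(gr_sbp5)

* Difference in predicted probability for each SBP group versus the
* reference SBP group, separately within each DBP group
margins gr_dbp5, dydx(gr_sbp5)
```

With roughly 308,000 observations these commands may take a while to run. After the main-effects model (model 3), margins gr_sbp5 and margins gr_dbp5 would give the corresponding averages for each factor on its own.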