I am looking for advice from the Stata experts here on what is happening with my mixed-effects model.
I compared a mixed-effects model with a random intercept (Model B) against one without a random intercept (Model A). Model A may be equivalent to OLS.
The data are longitudinal, with 1,272 respondents, 35 waves, and 22,777 records in total.
The data are unbalanced: some respondents have 35 spells, while others participated in only two waves (2 spells).
I reshaped this data into the person-age form. Age ranges from 20 to 60 years old.
The dependent variable is the physical strenuousness of work (PSW).
The data look like this:
Code:
     +---------------------------+
     |    id   age          PSW |
     |---------------------------|
 43. | 10037    23         73.25 |
 44. | 10037    24         73.25 |
 45. | 10037    25         73.25 |
 46. | 10037    26         73.25 |
 47. | 10037    27        75.375 |
     |---------------------------|
 48. | 10037    28        75.375 |
 49. | 10037    29        75.375 |
 50. | 10037    30        75.375 |
 51. | 10037    31        75.375 |
 52. | 10037    32      22.43056 |
     |---------------------------|
 53. | 10037    33      22.43056 |
 54. | 10037    34      22.43056 |
 55. | 10037    35      22.43056 |
 56. | 10037    36        75.375 |
 57. | 10037    37        75.375 |
     |---------------------------|
 58. | 10037    38        75.375 |
 59. | 10037    39      22.43056 |
 60. | 10037    40      22.43056 |
 64. | 10037    50         37.25 |
 65. | 10037    52         37.25 |
     |---------------------------|
 76. | 10038    48      31.89583 |
 77. | 10038    49      31.89583 |
 78. | 10038    50        20.075 |
 79. | 10038    51        20.075 |
 80. | 10038    53        20.075 |
     |---------------------------|
 81. | 10038    55        20.075 |
Model A) mixed psw age c.age#c.age c.age#c.age#c.age, mle
Model B) mixed psw age c.age#c.age c.age#c.age#c.age || id: , mle
The results for each model are:
Code:
Model A

Mixed-effects ML regression                     Number of obs     =     22,777
                                                Wald chi2(3)      =     580.52
Log likelihood = -102576.8                      Prob > chi2       =     0.0000

-----------------------------------------------------------------------------------
              psw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------+----------------------------------------------------------------
              age |  -3.906402   .6288462    -6.21   0.000    -5.138918   -2.673887
                  |
      c.age#c.age |   .0742086   .0162178     4.58   0.000     .0424222     .105995
                  |
c.age#c.age#c.age |  -.0004612   .0001344    -3.43   0.001    -.0007245   -.0001978
                  |
            _cons |   108.8749    7.81452    13.93   0.000     93.55875    124.1911
-----------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
               var(Residual) |   477.7911   4.477182       469.096    486.6473
------------------------------------------------------------------------------

Model B

Mixed-effects ML regression                     Number of obs     =     22,777
Group variable: idintnum68                      Number of groups  =      1,272

                                                Obs per group:
                                                              min =          1
                                                              avg =       17.9
                                                              max =         35

                                                Wald chi2(3)      =     589.43
Log likelihood = -92923.381                     Prob > chi2       =     0.0000

-----------------------------------------------------------------------------------
              psw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------+----------------------------------------------------------------
              age |  -2.768387   .3943091    -7.02   0.000    -3.541218   -1.995555
                  |
      c.age#c.age |    .050902   .0101317     5.02   0.000     .0310442    .0707597
                  |
c.age#c.age#c.age |  -.0002928   .0000837    -3.50   0.000    -.0004568   -.0001288
                  |
            _cons |   91.63064     4.9491    18.51   0.000     81.93058    101.3307
-----------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
idintnum68: Identity         |
                  var(_cons) |   317.9105   13.30189      292.8796    345.0806
-----------------------------+------------------------------------------------
               var(Residual) |   170.1462   1.640699      166.9607    173.3925
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 19306.83      Prob >= chibar2 = 0.0000
I then plotted the predicted scores from both models, together with the observed mean of the dependent variable at each age for comparison.
In the attached graph, the red dots are the observed mean values, the blue line is from Model B, and the gray line is from Model A.
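For anyone who wants to reproduce a graph like this, here is a rough sketch of one way to build it. It assumes the person-age data are in memory with variables named id, age, and psw as in the commands above (the output shows the group variable as idintnum68, so adjust the name as needed), and it assumes the plotted predictions are the fixed-portion linear predictions (xb). The new variable names xb_a, xb_b, obs_mean, and tag_age are only illustrative.

```stata
* Assumption: data in memory with variables id, age, psw (names are illustrative)

* Model A: no random intercept
mixed psw age c.age#c.age c.age#c.age#c.age, mle
predict double xb_a, xb          // fixed-portion linear prediction

* Model B: random intercept for each respondent
mixed psw age c.age#c.age c.age#c.age#c.age || id: , mle
predict double xb_b, xb          // fixed portion only, random intercept excluded

* Observed mean of psw at each age
bysort age: egen double obs_mean = mean(psw)

* Flag one record per age so each point is drawn once
egen byte tag_age = tag(age)

twoway (scatter obs_mean age if tag_age, mcolor(red))          ///
       (line xb_b age if tag_age, sort lcolor(blue))           ///
       (line xb_a age if tag_age, sort lcolor(gray)),          ///
       legend(order(1 "Observed mean" 2 "Model B" 3 "Model A")) ///
       ytitle("PSW") xtitle("Age")
```

Using xb for both models keeps the two curves comparable, since each is then a function of age alone.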
As the graph shows, the predicted value of the dependent variable differs depending on whether I include the random intercept.
It makes sense that adding a random intercept makes a difference.
My questions are:
Why are Model B's predicted scores lower than Model A's at age 20?
Why do Model B's predicted scores become higher than Model A's at around age 30, and why does the gap widen with age?
Could these differences be due to something in my raw data, such as the unbalanced structure?
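One possible way to look into the unbalanced structure is sketched below. It only describes the data and is not a formal test. It assumes the same variable names as above (id, age, psw); n_obs, psw_i, and tag_id are illustrative names. The idea is to check whether respondents with more records tend to have systematically higher or lower PSW, and how many respondents contribute at each age.

```stata
* Assumption: data in memory with variables id, age, psw (names are illustrative)

* Number of person-age records contributed by each respondent
bysort id: gen int n_obs = _N

* Each respondent's own mean PSW
bysort id: egen double psw_i = mean(psw)

* Flag one record per respondent for person-level summaries
egen byte tag_id = tag(id)

* Distribution of records per respondent
summarize n_obs if tag_id, detail

* Is a respondent's PSW level related to how often they are observed?
correlate psw_i n_obs if tag_id

* Number of records (one per respondent per age) available at each age
tabulate age
```

If the person-level mean is clearly related to the number of records, or if the set of respondents observed changes a lot across ages, that would be one candidate explanation for the two curves diverging.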
My concern is that the model with the random intercept might be biased, because the increasing pattern of PSW after the 40s (the gray line) does not seem to make sense and does not fit the observed pattern.
Any kind of comments will be greatly helpful.