I am having trouble producing Kaplan-Meier estimates after declaring my data with the stset command. In my study I want to estimate rates of first marriage at each age among three kinds of migrants (variable mig_type_19). I observe individuals multiple times throughout the study, and the variable person_id uniquely identifies individuals. Individuals can enter the study at age 15, 16, or 17, which means there is some left truncation. Individuals leave the study after age 24, or upon getting married.
I have a number of questions about the output of the sts list command, and about whether I am using the stset command correctly. First, when I run stset and then sts list over my three migration types, I get the following:
Code:
stset age, id(person_id) failure(mar_ind_r)

Survival-time data settings

           ID variable: person_id
         Failure event: mar_ind_r!=0 & mar_ind_r<.
Observed time interval: (age[_n-1], age]
     Exit on or before: failure

--------------------------------------------------------------------------
     16,692  total observations
      1,053  observations begin on or after (first) failure
--------------------------------------------------------------------------
     15,639  observations remaining, representing
      3,700  subjects
        713  failures in single-failure-per-subject data
     80,222  total analysis time at risk and under observation
                                                At risk from t =         0
                                     Earliest observed entry t =         0
                                          Last observed exit t =        24

. sts list, f by(mig_type_19)

        Failure _d: mar_ind_r
  Analysis time _t: age
       ID variable: person_id

Kaplan–Meier failure function
By variable: mig_type_19

             At            Net     Failure      Std.
  Time     risk   Fail    lost    function     error     [95% conf. int.]
------------------------------------------------------------------------
Rural Stayer
    15      493      3       1      0.0061    0.0035     0.0020    0.0187
    16      489      3       9      0.0122    0.0049     0.0055    0.0269
    17      477      5      29      0.0225    0.0067     0.0125    0.0403
    18      443     16      33      0.0578    0.0108     0.0400    0.0833
    19      394     22      32      0.1104    0.0149     0.0846    0.1436
    20      340     20      23      0.1628    0.0181     0.1307    0.2018
    21      297     29      15      0.2445    0.0218     0.2049    0.2903
    22      253     21      37      0.3072    0.0239     0.2631    0.3568
    23      195     18      71      0.3712    0.0260     0.3226    0.4245
    24      106     23      83      0.5076    0.0324     0.4460    0.5725
Rural Mover
    15     2748      2       6      0.0007    0.0005     0.0002    0.0029
    16     2740      3      21      0.0018    0.0008     0.0008    0.0044
    17     2716     23     161      0.0103    0.0019     0.0071    0.0148
    18     2532     35     197      0.0240    0.0030     0.0188    0.0306
    19     2300     37     164      0.0397    0.0039     0.0327    0.0481
    20     2099     54     106      0.0644    0.0050     0.0552    0.0750
    21     1939     48     103      0.0875    0.0059     0.0766    0.0999
    22     1788     65     163      0.1207    0.0070     0.1077    0.1351
    23     1560     88     682      0.1703    0.0084     0.1546    0.1874
    24      790     72     718      0.2459    0.0114     0.2244    0.2691
Urban Stayer
    15      459      2       0      0.0044    0.0031     0.0011    0.0173
    16      457      1       2      0.0065    0.0038     0.0021    0.0201
    17      454     12       6      0.0328    0.0083     0.0199    0.0538
    18      436     15      13      0.0661    0.0117     0.0467    0.0931
    19      408     10      13      0.0890    0.0134     0.0660    0.1193
    20      385     17      13      0.1292    0.0160     0.1011    0.1643
    21      355     14      13      0.1635    0.0178     0.1318    0.2019
    22      328     17      22      0.2069    0.0197     0.1712    0.2488
    23      289     18     119      0.2563    0.0217     0.2166    0.3017
    24      152     20     132      0.3541    0.0278     0.3027    0.4114
------------------------------------------------------------------------
Note: Net lost equals the number lost minus the number who entered.
However, if I tabulate the number of observations and the proportion married at each age over the same three groups, I get the following:
Code:
tab age mig_type_19,

           |     Migration Type in 2019
       Age | Rural Sta  Rural Mov  Urban Sta |     Total
-----------+---------------------------------+----------
        15 |       269      1,521        267 |     2,057
        16 |       276      1,420        260 |     1,956
        17 |       254      1,526        272 |     2,052
        18 |       262      1,453        261 |     1,976
        19 |       217      1,240        242 |     1,699
        20 |       217      1,060        194 |     1,471
        21 |       184        994        205 |     1,383
        22 |       198        986        187 |     1,371
        23 |       167      1,016        211 |     1,394
        24 |       174        956        203 |     1,333
-----------+---------------------------------+----------
     Total |     2,218     12,172      2,302 |    16,692

. tab age mig_type_19, sum(mar_ind_r) nost nofreq

                Means of First Marriage

           |     Migration Type in 2019
       Age | Rural Sta  Rural Mov  Urban Sta |     Total
-----------+---------------------------------+----------
        15 | .01115242  .00131492  .00749064 | .00340301
        16 | .01811594   .0028169  .01153846 | .00613497
        17 | .03543307  .01703801  .05514706 | .02436647
        18 | .08778626  .03234687  .09578544 | .04807692
        19 | .15668203  .05806452  .12809917 | .08063567
        20 | .23041475  .09339623  .19587629 | .12712441
        21 | .32065217  .12977867  .25365854 | .17353579
        22 | .38888889    .163286  .30481283 | .21517141
        23 | .43113772   .1988189  .32701422 | .24605452
        24 | .52298851  .24895397  .34975369 | .30007502
-----------+---------------------------------+----------
     Total | .19071235  .08051265  .15768897 | .10579919
Second, there are only 269 observations of rural stayers at age 15, but the corresponding at-risk population in the sts list output is 493. Why is there a difference between these two outputs? What does the “at risk” population refer to?
Third, which is correct: the sts list output or the cross-tabulations? Based on the information I have provided, are there any additional options that I need to include in the stset command?
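As a check on how the sts list columns relate to one another, the first rows of the Rural Stayer panel can be reproduced by hand. The sketch below assumes the usual Kaplan-Meier product-limit formula and that the at-risk count at the next age equals the current count minus failures minus net lost; every number is copied from the sts list output above.

```stata
* Rural Stayer, ages 15-17: values copied from the sts list output
* Failure function = 1 - product over ages of (1 - fail / at risk)
display %6.4f 1 - (1 - 3/493)
display %6.4f 1 - (1 - 3/493)*(1 - 3/489)
display %6.4f 1 - (1 - 3/493)*(1 - 3/489)*(1 - 5/477)

* At-risk count at the next age = at risk - failures - net lost
display 493 - 3 - 1
display 489 - 3 - 9
```

These reproduce the listed values 0.0061, 0.0122, and 0.0225, and the at-risk counts 489 and 477. The age-15 at-risk counts of the three groups (493 + 2,748 + 459) also sum to the 3,700 subjects reported by stset, which is consistent with the line "At risk from t = 0": each subject appears to be counted as at risk from time 0, not from the age at which they were first observed.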
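One option that may be relevant here is the enter() option of stset, which declares when each subject comes under observation. The following is only a minimal sketch under stated assumptions: entry_age is a hypothetical helper variable that is not in my data, and subtracting 1 assumes that a record at a given age covers the year ending at that age, which may not match how the data were actually collected.

```stata
* Hypothetical helper: age at each person's first record (not in the original data)
bysort person_id (age): generate entry_age = age[1]

* Assumption: a record at age a covers the interval (a-1, a], so a person
* first seen at entry_age comes under observation at entry_age - 1
stset age, id(person_id) failure(mar_ind_r) enter(time entry_age - 1)

* Inspect entry times and the risk sets before relying on the estimates
stdescribe
sts list, failure by(mig_type_19)
```

If this declaration is appropriate, the stset summary should no longer report an earliest observed entry of t = 0, and the at-risk counts at age 15 should reflect only those individuals already under observation at that age.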
Thank you