I am having trouble producing Kaplan-Meier estimates after declaring my data with the stset command. In my study I want to estimate rates of first marriage at each age among three kinds of migrants (variable mig_type_19). I observe individuals multiple times over the study, and the variable person_id uniquely identifies each individual. Individuals can enter the study at age 15, 16, or 17, which means there is some left truncation. Individuals leave the study after age 24 or upon getting married.
I have several questions about the output of the sts list command and about whether I am using the stset command correctly. First, when I run stset and then sts list by my three migration types, I get the following:
Code:
. stset age, id(person_id) failure(mar_ind_r)
Survival-time data settings
ID variable: person_id
Failure event: mar_ind_r!=0 & mar_ind_r<.
Observed time interval: (age[_n-1], age]
Exit on or before: failure
--------------------------------------------------------------------------
16,692 total observations
1,053 observations begin on or after (first) failure
--------------------------------------------------------------------------
15,639 observations remaining, representing
3,700 subjects
713 failures in single-failure-per-subject data
80,222 total analysis time at risk and under observation
At risk from t = 0
Earliest observed entry t = 0
Last observed exit t = 24
. sts list, f by(mig_type_19)
Failure _d: mar_ind_r
Analysis time _t: age
ID variable: person_id
Kaplan–Meier failure function
By variable: mig_type_19
             At           Net     Failure      Std.
  Time     risk   Fail   lost    function     error     [95% conf. int.]
------------------------------------------------------------------------
Rural Stayer
    15      493      3      1      0.0061    0.0035     0.0020    0.0187
    16      489      3      9      0.0122    0.0049     0.0055    0.0269
    17      477      5     29      0.0225    0.0067     0.0125    0.0403
    18      443     16     33      0.0578    0.0108     0.0400    0.0833
    19      394     22     32      0.1104    0.0149     0.0846    0.1436
    20      340     20     23      0.1628    0.0181     0.1307    0.2018
    21      297     29     15      0.2445    0.0218     0.2049    0.2903
    22      253     21     37      0.3072    0.0239     0.2631    0.3568
    23      195     18     71      0.3712    0.0260     0.3226    0.4245
    24      106     23     83      0.5076    0.0324     0.4460    0.5725
Rural Mover
    15     2748      2      6      0.0007    0.0005     0.0002    0.0029
    16     2740      3     21      0.0018    0.0008     0.0008    0.0044
    17     2716     23    161      0.0103    0.0019     0.0071    0.0148
    18     2532     35    197      0.0240    0.0030     0.0188    0.0306
    19     2300     37    164      0.0397    0.0039     0.0327    0.0481
    20     2099     54    106      0.0644    0.0050     0.0552    0.0750
    21     1939     48    103      0.0875    0.0059     0.0766    0.0999
    22     1788     65    163      0.1207    0.0070     0.1077    0.1351
    23     1560     88    682      0.1703    0.0084     0.1546    0.1874
    24      790     72    718      0.2459    0.0114     0.2244    0.2691
Urban Stayer
    15      459      2      0      0.0044    0.0031     0.0011    0.0173
    16      457      1      2      0.0065    0.0038     0.0021    0.0201
    17      454     12      6      0.0328    0.0083     0.0199    0.0538
    18      436     15     13      0.0661    0.0117     0.0467    0.0931
    19      408     10     13      0.0890    0.0134     0.0660    0.1193
    20      385     17     13      0.1292    0.0160     0.1011    0.1643
    21      355     14     13      0.1635    0.0178     0.1318    0.2019
    22      328     17     22      0.2069    0.0197     0.1712    0.2488
    23      289     18    119      0.2563    0.0217     0.2166    0.3017
    24      152     20    132      0.3541    0.0278     0.3027    0.4114
------------------------------------------------------------------------
Note: Net lost equals the number lost minus the number who entered.
However, if I tabulate the proportion married by age over the same three groups, I get the following:
Code:
. tab age mig_type_19
           |     Migration Type in 2019
       Age | Rural Sta  Rural Mov  Urban Sta |     Total
-----------+---------------------------------+----------
        15 |       269      1,521        267 |     2,057
        16 |       276      1,420        260 |     1,956
        17 |       254      1,526        272 |     2,052
        18 |       262      1,453        261 |     1,976
        19 |       217      1,240        242 |     1,699
        20 |       217      1,060        194 |     1,471
        21 |       184        994        205 |     1,383
        22 |       198        986        187 |     1,371
        23 |       167      1,016        211 |     1,394
        24 |       174        956        203 |     1,333
-----------+---------------------------------+----------
     Total |     2,218     12,172      2,302 |    16,692
. tab age mig_type_19, sum(mar_ind_r) nost nofreq
Means of First Marriage
           |     Migration Type in 2019
       Age | Rural Sta  Rural Mov  Urban Sta |     Total
-----------+---------------------------------+----------
        15 | .01115242  .00131492  .00749064 | .00340301
        16 | .01811594   .0028169  .01153846 | .00613497
        17 | .03543307  .01703801  .05514706 | .02436647
        18 | .08778626  .03234687  .09578544 | .04807692
        19 | .15668203  .05806452  .12809917 | .08063567
        20 | .23041475  .09339623  .19587629 | .12712441
        21 | .32065217  .12977867  .25365854 | .17353579
        22 | .38888889    .163286  .30481283 | .21517141
        23 | .43113772   .1988189  .32701422 | .24605452
        24 | .52298851  .24895397  .34975369 | .30007502
-----------+---------------------------------+----------
     Total | .19071235  .08051265  .15768897 | .10579919
Second, the cross-tab shows only 269 rural stayers at age 15, but the corresponding at-risk count in sts list is 493. Why is there a difference between these two outputs? What does the "at risk" population refer to?
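To make the discrepancy concrete, the first Rural Stayer figures can be reproduced by hand from the counts printed in the two outputs. This is only a sketch of the arithmetic, and it assumes the failure function in sts list is one minus the usual product-limit survivor estimate:

Code:
```stata
* Age 15, sts list: 3 failures over 493 at risk (reported as 0.0061)
display 3/493
* Age 15, cross-tab: the reported mean .01115242 equals 3 over 269 observations
display 3/269
* Age 16, sts list: product-limit update over ages 15 and 16 (reported as 0.0122)
display 1 - (1 - 3/493)*(1 - 3/489)
```

So the two outputs appear to share the age-15 numerator and differ only in the denominator, which is why I am asking what the at-risk count represents.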
Which of the two is correct, the sts list output or the cross-tab? Based on the information I have provided, are there any additional options that I need to include in the stset command?
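One option I have considered, but am unsure about, is declaring delayed entry explicitly, since the stset output reports "At risk from t = 0" and "Earliest observed entry t = 0" even though nobody is observed before age 15. A minimal sketch of what I mean follows. The variable first_age is hypothetical (it is not in my data and is built here for illustration), and treating a person as entering one year before the first observed age is an assumption on my part:

Code:
```stata
* Hypothetical helper: age at each person's first observation
bysort person_id (age): generate first_age = age[1]
* Assumption: a person comes under observation one year before the first
* observed age, so the first record starts at first_age - 1 rather than at 0
stset age, id(person_id) failure(mar_ind_r) enter(time first_age - 1)
sts list, failure by(mig_type_19)
```

I have not verified that this is the right specification for my design, so please correct me if enter() is not the appropriate option here.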
Thank you