I am interested in analyzing a longitudinal dataset with several observations per individual. However, individuals may start on different dates and may have different numbers of observations (similar to an unbalanced panel). My problem is that I do not know how to read the data in Stata in order to analyze it.
Let's say that I would like to run a regression of y on x1 x2 x3 x4 by fixed-effects estimation (in a panel this would be: xtreg y x1 x2 x3 x4, fe).
If the time dimension were one yearly observation per individual, I would do:
Code:
xtset id1 timein
A different complication arises if two or more observations of the same individual fall on the same date. In this case I would have to delete those repeated observations in order to use xtset id1 timein. However, I might be losing relevant observations. My second question is whether, instead of using the daily time variable (timein), I could use an occasion variable. That is, I would first sort the data by id1 and timein, and then build an occasion variable equal to 1 for the first observation within an individual, 2 for the second one, and so on.
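To make the occasion-variable idea concrete, here is a minimal sketch in Stata, not a definitive recipe. It uses four toy rows (the tied date for individual 4 is invented purely for illustration), and the names ndup and occ2 are hypothetical. It counts and flags observations that share an individual and a date, then builds the within-individual counter.

```stata
* Toy data: the repeated date for id1 == 4 is made up for illustration only
clear
input id1 timein
4 15036
4 15036
4 15051
6 15168
end
format %td timein

* How many observations share the same individual and date?
duplicates report id1 timein

* Flag them without dropping anything (0 = unique id1-timein pair)
duplicates tag id1 timein, generate(ndup)

* Occasion counter: 1 for the first observation within individual, 2 for the second, ...
* The order among tied dates is arbitrary unless another sort key is added
bysort id1 (timein): generate occ2 = _n

* The counter is unique within individual, so xtset accepts it despite the tied dates
xtset id1 occ2
```

With the tied rows kept, xtset id1 timein would be refused because of repeated time values within panel, whereas the counter is accepted. The cost is that the counter carries no calendar information.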
My concern is that, in this case, an individual whose first observation is on 2/2/2005 would be compared with an individual whose first observation is on 5/5/2015 (ten years later). So, if I include occasion dummy variables in the regression, they won't capture the same effect as yearly dummy variables in a panel dataset (please correct me if I am wrong).
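The contrast between occasion dummies and calendar-year dummies can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated (30 hypothetical individuals with 10 randomly dated observations each), only x1 is used as a regressor, and year is a hypothetical variable name. It relies on the fact that xtreg, fe needs only the panel identifier to be declared, so no time variable is set.

```stata
* Simulated data for illustration only; none of these values come from the real dataset
clear
set seed 12345
set obs 300
generate id1 = ceil(_n/10)
generate timein = mdy(1, 1, 2005) + floor(3650*runiform())
format %td timein
generate x1 = rnormal()
generate y = 0.5*x1 + rnormal()

* Occasion counter within individual, and calendar year taken from the daily date
bysort id1 (timein): generate occ = _n
generate year = year(timein)

* Declare only the panel identifier; repeated dates are then not an issue
xtset id1

* (a) occasion dummies: 1st, 2nd, ... observation, regardless of calendar date
xtreg y x1 i.occ, fe

* (b) calendar-year dummies: shocks common to all individuals in a given year
xtreg y x1 i.year, fe
```

Specification (a) lines individuals up by observation order, while (b) lines them up by calendar time, so the two sets of dummies answer different questions.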
Is this (occasion variable) a possible approach? Or should I drop the repeated observations and use the daily time variable (xtset id1 timein)?
Thanks a lot in advance for your help. Here is an example of the dataset:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte y double x1 byte(x2 x3) float(timein occ id1)
0 . 23 46 15036 1 4
0 . 23 93 15051 2 4
0 . 23 78 15901 3 4
0 . 23 78 15932 4 4
0 21 23 93 16302 5 4
0 27 23 73 16406 6 4
0 . 23 55 16580 7 4
0 3 33 85 16712 8 4
0 . 23 93 16953 9 4
0 4.041666666666667 23 93 17080 10 4
0 . 23 84 17120 11 4
0 . 55 84 17532 12 4
0 20 22 81 15168 1 6
0 14 22 87 15427 2 6
0 19.03125 22 47 15538 3 6
0 . 22 47 15545 4 6
0 19.038194444444446 22 47 15553 5 6
0 14 22 81 15812 6 6
0 30 23 78 15866 7 6
0 17 22 87 15968 8 6
0 . 22 87 16071 9 6
0 25.5 22 87 16619 10 6
0 . 22 87 16628 11 6
0 . 22 87 16983 12 6
0 . 22 87 17018 13 6
0 . 22 87 17029 14 6
0 . 54 86 17594 15 6
0 . 54 86 17720 16 6
0 . 54 86 17994 17 6
0 . 54 86 18051 18 6
0 . 54 86 18234 19 6
0 20 51 47 16250 1 7
0 . 51 56 16628 2 7
0 20 51 47 16740 3 7
0 . 51 86 17001 4 7
0 6 54 86 17438 5 7
0 25 54 96 17440 6 7
0 2 54 86 17475 7 7
1 6 54 . 17622 8 7
0 . 54 86 17658 9 7
0 . 54 86 17843 10 7
0 . 54 86 17947 11 7
0 . 54 86 18057 12 7
0 . 54 86 18176 13 7
0 23.333333333333336 54 87 18480 14 7
0 13.583333333333334 54 87 18513 15 7
0 . 55 86 18546 16 7
0 . 54 86 18597 17 7
0 . 23 96 20636 18 7
0 . 54 84 15536 1 18
0 . 54 84 15567 2 18
0 . 54 86 15585 3 18
1 . 54 . 16315 4 18
0 6 23 85 16530 5 18
0 10 23 85 16559 6 18
0 10 23 85 16561 7 18
0 10 23 85 16580 8 18
0 10 23 85 16589 9 18
0 6 23 85 16600 10 18
0 10 23 85 16601 11 18
0 10 23 85 16617 12 18
0 10 23 85 16699 13 18
0 10 23 85 16713 14 18
0 10 23 85 16727 15 18
0 2 23 85 16748 16 18
0 10 23 85 16783 17 18
0 . 55 47 17841 18 18
0 . 55 41 18163 19 18
0 . 55 41 19178 20 18
0 30 55 85 19267 21 18
0 28 55 85 19617 22 18
0 . 23 85 20509 23 18
0 . 23 85 20515 24 18
0 20 54 78 20698 25 18
0 32.63993055555556 55 87 20752 26 18
0 . 55 87 20755 27 18
0 37.333333333333336 55 96 21118 28 18
0 3 33 85 21284 29 18
0 4.25 33 85 21339 30 18
0 20 80 47 18079 1 21
0 30 51 47 18198 2 21
1 . 51 . 18320 3 21
0 16 55 47 19650 4 21
0 . 55 47 19932 5 21
0 16 51 78 21067 6 21
0 28 51 78 21117 7 21
0 20 51 46 21148 8 21
0 18.889930555555555 59 47 21430 9 21
0 . 32 78 21458 10 21
0 10 51 78 21535 11 21
0 20 51 78 21598 12 21
0 10 51 78 21609 13 21
0 10 51 78 21626 14 21
0 . 12 41 21668 15 21
0 12 51 78 21703 16 21
0 . 32 78 21705 17 21
0 18 51 78 21724 18 21
0 20 51 78 21826 19 21
0 20 51 78 21878 20 21
0 30 23 56 19909 1 24
end
format %td timein