My main database contains longitudinal, yearly information covering a span of years (e.g., 2007-2013), and I am using ISEI (an occupational status score, a continuous variable) as the dependent variable.
Since not all individuals took part in every year, I am not interested in a panel analysis and have opted for a pooled cross-sectional analysis instead. For this, I don't want to pick one particular year. To avoid any time variance, I would rather use, for each individual, the observation from the moment he or she reached the maximum level of ISEI.
So my question is: is there any way to tell Stata to keep, for each individual, only the row in which the dependent variable reached its maximum, and, if that maximum occurs more than once, the row in which it first happened?
Here is a visual example:
row | ID | year | ISEI | IV1 | IV2
1   | 1  | 2008 | 35   | 0   | ..
2   | 1  | 2009 | 40   | 1   |
3   | 1  | 2011 | 55   | 1   |
4   | 1  | 2012 | 60   | 1   |
5   | 2  | 2010 | 55   | 0   |
6   | 2  | 2011 | 56   | 1   |
7   | 3  | 2012 | 25   | 0   |
8   | 3  | 2013 | 30   | 0   |
9   | 3  | 2014 | 30   | 1   |
10  | 4  | 2009 | 40   | 0   |
11  | 4  | 2010 | 41   | 0   |
12  | 4  | 2013 | 35   | 1   |
13  | 5  |      |      |     |
14  | .. |      |      |     |
By doing this I could create a cross-sectional dataset containing only the selected rows (in the example, rows 4, 6, 8, and 11, i.e., the first row in which each individual reaches his or her maximum ISEI).
I tried a couple of things with the "collapse" command, but I got stuck, and besides I am not sure whether it is the best approach.
I hope you can help me out and please let me know if you need more information.
Thank you
Cheers,
Esteban
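
For reference, here is a minimal sketch of one way the selection described above might be done without "collapse". It assumes the variables are named ID, year and ISEI as in the example, and the small dataset typed in with "input" is only a hypothetical reconstruction of the example rows (IV1 and IV2 are left out for brevity). It is one possible approach, not necessarily the best one.

```stata
* Hypothetical reconstruction of the example data (variable names assumed)
clear
input ID year ISEI
1 2008 35
1 2009 40
1 2011 55
1 2012 60
2 2010 55
2 2011 56
3 2012 25
3 2013 30
3 2014 30
4 2009 40
4 2010 41
4 2013 35
end

* Per-individual maximum of ISEI (egen's max() ignores missing values)
bysort ID: egen max_isei = max(ISEI)

* Keep only the rows at the maximum ...
keep if ISEI == max_isei

* ... and, among ties, only the earliest year
bysort ID (year): keep if _n == 1

drop max_isei
list, clean
```

On the example data this should leave one row per individual: 2012 for ID 1, 2011 for ID 2, 2013 for ID 3 and 2010 for ID 4. Unlike "collapse", this keeps all other variables (such as IV1 and IV2) exactly as they were observed in the selected year. Note that "keep" discards the other rows from memory, so it may be safer to run this on a copy of the data (e.g., inside preserve/restore).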