Hey Stata users,

I have panel data on subjects and am interested in how their wages change over time. For each subject, I want a variable giving the number of years since wage data became available. If all subjects had wage information in their first year, this would be easy: I could generate a variable by subject ID equal to _n.
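To make the easy case concrete, here is a minimal sketch using only subjects 1 and 3 from the example data further down, since both have wages from their first panel year. The lowercase variable names `year` and `wages` follow my code rather than the table header, so treat them as an assumption about how the data are actually stored.

```stata
* Subjects 1 and 3 only: wages are observed from their first panel year
clear
input ID year wages
1 2001 5
1 2002 5
1 2003 7
3 2001 7
end

* Sort by year within subject, then number the rows 1, 2, 3, ...
bysort ID (year): gen wage_count = _n
list, sepby(ID) noobs
```

Putting `year` in parentheses makes Stata sort on it within ID without treating it as part of the by-group, so _n runs in calendar order within each subject.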

However, for many subjects, wage data only become available some years after the panel begins, as for subjects 2, 4, and 5 in this example:

ID Year Wages
1 2001 5
1 2002 5
1 2003 7
2 2001 .
2 2002 .
2 2003 5
3 2001 7
4 2002 .
4 2003 3
5 2001 .
5 2002 .
5 2003 4
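
In case it helps anyone reproduce this, here is the same example as an `input` block. The lowercase names `year` and `wages` match my code below. The extra `want` column is only an illustration of the count I am hoping for, and leaving it missing before wages become available is an assumption on my part, not a requirement.

```stata
* Example data from the table above, plus the count I would like to end up with
clear
input ID year wages want
1 2001 5 1
1 2002 5 2
1 2003 7 3
2 2001 . .
2 2002 . .
2 2003 5 1
3 2001 7 1
4 2002 . .
4 2003 3 1
5 2001 . .
5 2002 . .
5 2003 4 1
end

list, sepby(ID) noobs
```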

I want to start counting in the first year in which Wages is available. Most of what I have tried starts counting in the first observation year, and I cannot adjust this. Even when I made a variable for the first year in which wages were available and tried to sort by this variable, my count variable still did not come out right. Here is my code:

Code:
* Flag years with wage data: 1 if wages is positive and non-missing, else 0
gen wage_avail = wages > 0 & !missing(wages)

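* First year with wage data per subject (year / 0 is missing, which min() ignores)
egen first_wage_year = min(year / (wage_avail == 1)), by(ID)
* This is the step that still starts counting at each subject's first observation
by ID first_wage_year, sort: gen wage_count = _n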

Your informed thoughts are appreciated.