Dear Statalist,

I have an unbalanced panel with gaps in years, containing worker-level information on occupation (code) and firm for several years.

A given worker may be observed in several firms within a year if she moves, and even twice in the same firm if she changes occupation, but she can only be observed once per year in a given occupation code within a given firm. Hence, I am using a combination of person_id, occupation and firm as the unique identifier (id).

I have some gap years and was using tsfill to fill them, but then realized this was causing problems.

To show why, here is my dataset after tsfill for one worker who changes occupation only within the same firm:
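
For reference, the setup is roughly the following. This is only a sketch: the variable names are assumed to match the column headers in the example below (lower-cased), and egen's group() function is just one way of building such an identifier.

```stata
* Sketch only: variable names are assumed
* One panel unit per worker-firm-occupation combination
egen id = group(person_id firm_id occupation)

* Declare the panel; year is unique within id by construction
xtset id year

* Add a row for every missing year between the first and last year of each id
tsfill
```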




Person_ID  Firm_ID  Estab_id  year  Occupation  Team_id  id
035        553      353       2003  1222        571254   14
                              2004                       14
                              2005                       14
                              2006                       14
                              2007                       14
                              2008                       14
                              2009                       14
                              2010                       14
                              2011                       14
                              2012                       14
                              2013                       14
035        553      353       2014  1222        571254   14
035        553      353       2002  2145        571267   15
                              2003                       15
035        553      353       2004  2145        571267   15
035        553      353       2005  2145        571267   15
                              2006                       15
                              2007                       15
035        553      353       2008  2145        571267   15
035        553      353       2009  2145        571267   15
                              2010                       15
035        553      353       2011  2145        571267   15
035        553      353       2012  2145        571267   15



In 2002 she was working in occupation 2145, and in 2003 she changed to occupation 1222. The following year she went back to occupation 2145, where she stayed until 2014, when she returned to occupation 1222.
Because she is observed in the same occupation (1222) in both 2003 and 2014, Stata creates rows for all the years in between under id 14, even though we know she was employed in a different occupation in several of those years.
I know this happens because of the way I created the unique identifier, but since I have multiple observations per worker per year, this was the only way I could find to create a unique id.
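
To make the problem visible, the filled-in rows that fall in years where the worker is actually observed under another id could be flagged along these lines. This is a sketch starting from the data before tsfill; person_id, firm_id, occupation and year are assumed variable names, and observed, seen and spurious are hypothetical helper variables. It only flags the rows and is not meant as a fix.

```stata
* Sketch only: person_id, firm_id, occupation and year are assumed names
gen byte observed = 1                      // mark rows present before filling
egen id = group(person_id firm_id occupation)
xtset id year
tsfill
replace observed = 0 if missing(observed)  // rows created by tsfill

* tsfill leaves person_id missing on new rows; copy it from the first
* row of each id, which is always an original observation
bysort id (year): replace person_id = person_id[1]

* Flag filled rows for years in which the worker is observed under another id
bysort person_id year: egen byte seen = max(observed)
gen byte spurious = observed == 0 & seen == 1
tabulate spurious
```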

Any ideas on how to solve this problem with tsfill?

Thanks a lot!