I have a longitudinal dataset with 55,000 observations that looks like this:

Array

I would like to:
1) Remove all the IDs that are not present throughout 2006-2010 (here, remove B and C)
2) Create a dummy variable that takes the value 1 if the ID has recovered its profitability from 2006 and 0 if the ID has not been able to do so (here, A would be 1 and D would be 0)
3) Create a variable that measures the time it took an ID to recover (here, for A it would be 3 years)
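
To make the three steps concrete, here is a minimal sketch of the logic in plain Python (not Stata). Everything in it is an assumption made for illustration: the toy values, the names `rows` and `summarize`, the reading of "present from 2006-2010" as "observed in every one of those years", and the definition of recovery as the first year, after a drop below the 2006 level, in which profitability is back at or above that level.

```python
from collections import defaultdict

BASE_YEAR = 2006
REQUIRED_YEARS = range(2006, 2011)  # 2006..2010 inclusive

# Hypothetical toy panel (id, year, profitability); values are invented.
rows = [
    ("A", 2006, 10.0), ("A", 2007, 8.0), ("A", 2008, 6.0),
    ("A", 2009, 11.0), ("A", 2010, 12.0),
    ("B", 2006, 5.0), ("B", 2007, 4.0),
    ("C", 2008, 7.0), ("C", 2009, 7.5), ("C", 2010, 8.0),
    ("D", 2006, 9.0), ("D", 2007, 7.0), ("D", 2008, 5.0),
    ("D", 2009, 6.0), ("D", 2010, 8.0),
]


def summarize(rows):
    """Return {id: {"recovered": 0/1, "time_to_recover": years or None}}."""
    # Group the long-format rows into one {year: profitability} dict per ID.
    panel = defaultdict(dict)
    for firm, year, profit in rows:
        panel[firm][year] = profit

    result = {}
    for firm, series in panel.items():
        # Step 1: drop IDs that are missing any year in 2006-2010.
        if not all(year in series for year in REQUIRED_YEARS):
            continue

        base = series[BASE_YEAR]
        dipped = False
        recovery_year = None
        # Steps 2 and 3 (assumed definition): find the first year after a
        # drop below the 2006 level where profitability is >= that level.
        for year in sorted(y for y in series if y > BASE_YEAR):
            if series[year] < base:
                dipped = True
            elif dipped:
                recovery_year = year
                break

        recovered = recovery_year is not None
        result[firm] = {
            "recovered": int(recovered),
            "time_to_recover": recovery_year - BASE_YEAR if recovered else None,
        }
    return result


result = summarize(rows)
for firm, info in result.items():
    print(firm, info)
```

With these invented values, B and C are dropped, A gets a dummy of 1 with a recovery time of 3 years, and D gets a dummy of 0 with no recovery time, which matches the outcome described in the list.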

The idea is to run a regression of the form: xtgee ln(time_to_recover) = profitability + other variables (R&D expenses, debt, etc.), so that I can assess the characteristics that drive the resilience (recovery) of companies.