Dear members,

I am running a discrete-time survival analysis on unbalanced panel data using the xtcloglog command in Stata.

xtcloglog dv iv cv i.industry_id, cluster(firm_id)

The final step prior to estimation is to choose a functional form for the baseline hazard function. I used a non-parametric baseline for which I created dummy variables, one for each spell year at risk.

ta j, ge(d)

The problem is that the origin year for the firms in my data is 1980, but some firms enter late relative to that origin; for example, a firm may only start operating in 2002. When I create the dummies from j, d1 equals 1 in each firm's first year at risk, so it marks 1980 for one firm and 2002 for another, which does not seem correct. How should I handle this late entry?
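To make the question concrete, here is a minimal sketch of the two ways the spell-year counter could be built. It assumes one row per firm-year and a calendar-year variable called year; the names year, j_entry and j_origin are hypothetical and only for illustration. It shows the two candidates, not which one is right.

```stata
* Hypothetical variable names: firm_id (firm identifier), year (calendar year)

* Counter measured from each firm's own first observed year:
* j_entry = 1 in 1980 for one firm and in 2002 for another
bysort firm_id (year): generate j_entry = year - year[1] + 1

* Counter measured from the common 1980 origin:
* a firm first observed in 2002 starts at j_origin = 23
generate j_origin = year - 1980 + 1

* Duration dummies built from either counter, as in my code above
tabulate j_origin, generate(d_origin)
```

With j_entry, d1 mixes different calendar years across firms, which is what I described above. With j_origin, a late-entering firm simply has no rows for the early intervals.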

I am following Professor Stephen P. Jenkins's online material, in which he notes: "Remember that we do not have to stset the data for estimation because we do not use the st commands – they are for the continuous-time case." So, since my analysis is in discrete time, I can't use stset.


Thanks for helping me,