I am conducting an analysis using a longitudinal dataset from an ongoing cohort with over 5.5 years of follow-up. Children's development is evaluated every few months over this period, and each child has been evaluated up to 8 times. At each evaluation, a child is categorized as "delayed" or "not delayed" in their development. Children do not exit the study when they are categorized as "delayed"; they continue to be followed up. Children entered the study at different times and leave it at different times (only if they are lost to follow-up).

My research question is how development in a group of individuals has changed over this time period. I want to know whether Poisson regression is an appropriate technique, given that my outcome is in the form of counts. I would like to follow the same children throughout the entire study period to understand their rates of developmental delay over time and at each evaluation. However, when using Stata, I have always thought that a timevar had to be specified, and I don't have an exit variable in my study because children never exit unless they are lost to follow-up:

stset **timevar**, fail(delay) origin(dob) enter(date_first_eval) id(id_child) scale(365.25)

Is there a way to create a time variable when there IS loss to follow-up, or do you think I should use a different statistical method?

I would then want to assume that the rates of delay are constant within age bands for each individual in the dataset. I would divide my dataset into parts, each referring to the follow-up of a single participant through a single age band, by creating a current_age variable with the stsplit command. This would let me control for current age in the analysis, see the rates of delay at each age or evaluation, and obtain a rate ratio that takes current age into account.
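
As a rough sketch of the steps described above, and not a tested analysis: this assumes the data are in long format with one row per evaluation per child, and that each row carries a date_eval variable holding the evaluation date. date_eval, the one-year band width, and pyears are hypothetical names and choices used only for illustration.

```stata
* Sketch only: assumes long-format data, one row per evaluation per child.
* date_eval is a hypothetical variable holding the date of each evaluation.
* exit(time .) keeps a child under observation after a "delayed" evaluation.
stset date_eval, failure(delay) origin(time dob) enter(time date_first_eval) ///
    id(id_child) scale(365.25) exit(time .)

* Split each child's follow-up into one-year age bands (band width is an assumption).
stsplit current_age, at(0(1)6)

* Crude rate of "delayed" evaluations per child-year within each age band.
strate current_age

* Poisson model with follow-up time as the exposure; irr reports rate ratios by age band.
* Clustering on id_child allows for repeated evaluations of the same child.
generate double pyears = _t - _t0
poisson _d i.current_age if _st == 1, exposure(pyears) irr vce(cluster id_child)
```

With this setup the analysis time is age in years, since the origin is the date of birth and the scale is 365.25 days. One caveat I am unsure about: because entry is set to date_first_eval, a child's first evaluation would seem to contribute no follow-up time of its own.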

Do you think the method outlined above is appropriate for my analysis? After understanding the rates of delay (whether they have increased or decreased over the study period), I would like to examine whether different factors are associated with delay, e.g. family income level, mother's age, and mother's education level. I was considering adding these as independent variables in my Poisson regression model.

I tried using logistic regression, but found that it isn't ideal if I want to treat counts as my outcome and take time into account. I have now tried to use Poisson regression, but could not define a time variable.

If you have any suggestions, please let me know.

Thank you for reading!