Hi all,

I'm having difficulty creating indicator variables in a long-format dataset in Stata. The data contain an ID, a date of treatment, and a treatment code, for example (all made-up data):

ID  Date      Treatment
1   1Jan1990  D
1   1Feb1991  D
1   1Mar1992  T
1   1Feb1993  F
1   1Mar1994  D
2   1Apr1990  D
2   1Feb1992  D
2   1Feb1995  D

Before I convert to wide format, I want to create a new indicator variable that is 1 if the patient has EVER had treatment "T" and 0 if the patient has never received treatment "T". In the example above, I would want the indicator variable to show the following:


ID  Date      Treatment  Indicator
1   1Jan1990  D          1
1   1Feb1991  D          1
1   1Mar1992  T          1
1   1Feb1993  F          1
1   1Mar1994  D          1
2   1Apr1990  D          0
2   1Feb1992  D          0
2   1Feb1995  D          0

The difficulty I am having is that treatment "T" is not necessarily the last treatment code for a given individual, and each individual can have a different number of treatments.
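A minimal sketch of one possible approach to this first indicator, not necessarily the best one. It assumes the variables are named id and treatment (hypothetical names, with treatment stored as a string) and relies on egen's max() function, which takes the maximum of a 0/1 expression within each patient, so the position or number of "T" rows should not matter:

```stata
* Made-up example data; the variable names are assumptions
clear
input id str9 date_str str1 treatment
1 "1Jan1990" "D"
1 "1Feb1991" "D"
1 "1Mar1992" "T"
1 "1Feb1993" "F"
1 "1Mar1994" "D"
2 "1Apr1990" "D"
2 "1Feb1992" "D"
2 "1Feb1995" "D"
end

* (treatment == "T") is 1 on "T" rows and 0 otherwise;
* the within-patient maximum is therefore 1 if any row is "T"
bysort id: egen byte ever_T = max(treatment == "T")

list id date_str treatment ever_T, sepby(id) noobs
```

Because the flag is constant within each patient, it should carry over unchanged if the data are later reshaped to wide format with id as the identifier.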

Similarly, I would also like to create a variable that records the most recent treatment a patient had received as of exactly 1 year after their first treatment. For the made-up dataset below:

ID  Date      Treatment
1   1Jan1990  H
1   1Feb1991  P
1   1Mar1992  T
1   1Feb1993  F
1   1Mar1994  H
2   1Jan1990  H
2   1Feb1990  P
2   1Feb1995  H
3   1Jan1993  H
3   1Dec1993  T

3 1Jan1993 H

3 1Dec1993 T

I want to create an indicator variable:

ID  Date      Treatment  Indicator of treatment at 1yr after first treatment
1   1Jan1990  H          H
1   1Feb1991  P          H
1   1Mar1992  T          H
1   1Feb1993  F          H
1   1Mar1994  H          H
2   1Jan1990  H          P
2   1Feb1990  P          P
2   1Feb1995  H          P
3   1Jan1993  H          T
3   1Dec1993  T          T
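A rough sketch of how this second variable might be built, using the ID, Date, and Treatment columns of the table above. The variable names (id, date_str, date, cutoff, in_window, trt_1yr) are hypothetical, and the sketch assumes "at 1 year" means the same day and month one calendar year after the first treatment, with a treatment given exactly on that day counted as received:

```stata
* Made-up example data; the variable names are assumptions
clear
input id str9 date_str str1 treatment
1 "1Jan1990" "H"
1 "1Feb1991" "P"
1 "1Mar1992" "T"
1 "1Feb1993" "F"
1 "1Mar1994" "H"
2 "1Jan1990" "H"
2 "1Feb1990" "P"
2 "1Feb1995" "H"
3 "1Jan1993" "H"
3 "1Dec1993" "T"
end

* Convert the string dates to Stata daily dates
gen date = date(date_str, "DMY")
format date %td

* Cutoff: same day and month, one year after each patient's first treatment.
* Caveat: mdy() returns missing if the first treatment falls on 29 February.
bysort id (date): gen cutoff = mdy(month(date[1]), day(date[1]), year(date[1]) + 1)
format cutoff %td

* Flag treatments given on or before the cutoff
gen byte in_window = date <= cutoff

* Within each patient, sort flagged rows last and by date, so the final
* row (_N) is the most recent treatment on or before the cutoff
bysort id (in_window date): gen trt_1yr = treatment[_N]

sort id date
list id date treatment trt_1yr, sepby(id) noobs
```

The key idea is explicit subscripting under by:, where date[1] and treatment[_N] refer to the first and last rows of each patient's block after sorting. If "1 year" should instead mean 365 days, the cutoff could be computed as date[1] + 365.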

Does anyone know if there is a way of achieving this in Stata and, if so, where I can learn more about doing this?

Kind regards,

B