Dear Stata users,

I am working on a data set that currently looks like this (the data are already sorted using: sort id date_start):


id date_start date_end policy1 policy2 policy3 policy4
1 1 Apr 2020 3 Apr 2020 1 0 0 0
1 1 Apr 2020 3 Apr 2020 0 2 0 0
1 1 Apr 2020 2 Apr 2020 0 0 1 0
1 2 Apr 2020 . 0 0 0 1
1 3 Apr 2020 . 0 0 0 0
1 3 Apr 2020 . 0 0 0 0
2 1 Apr 2020 2 Apr 2020 0 1 0 0
2 2 Apr 2020 3 Apr 2020 1 0 0 0
2 2 Apr 2020 6 Apr 2020 0 0 0 1
2 3 Apr 2020 1 May 2020 0 0 1 0


I started by coding a qualitative data set in which policies are described in written form and each observation (i.e. line) describes one policy. I generated new variables (policy1, policy2, etc.), coded either as dummies or as variables taking ordinal values. The data cover a large number of different policies in different countries (id), the date on which each policy was introduced (date_start) and the date on which it ends (date_end). A missing value in date_end means that the policy was in place for one day only.

I would like to create a panel data set with id as the panel variable and date as the time variable, and to carry the value of each policy forward from the date it is introduced until the date it ends.

I have already managed to copy the policy values across all rows that share the same id and start date, so that every policy introduced on that date appears on the same line. My idea was to be able to drop the duplicate observations in terms of date_start later. My data now look like this:

id date_start date_end policy1 policy2 policy3 policy4
1 1 Apr 2020 3 Apr 2020 1 2 1 0
1 1 Apr 2020 3 Apr 2020 1 2 1 0
1 1 Apr 2020 2 Apr 2020 1 2 1 0
1 2 Apr 2020 . 0 0 0 1
1 3 Apr 2020 . 0 0 0 0
1 3 Apr 2020 1 Apr 2020 0 0 0 0
2 1 Apr 2020 2 Apr 2020 0 1 0 0
2 2 Apr 2020 3 Apr 2020 1 0 0 1
2 2 Apr 2020 6 Apr 2020 1 0 0 1
2 3 Apr 2020 1 May 2020 0 0 1 0

From here, however, I cannot work out how to carry the value of each policy forward until the date reaches the date_end recorded when the policy was introduced. For example, policy1 for id 1 should take the value 1 on 1 Apr 2020 and remain 1 until 3 Apr 2020.
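
To make the intended logic concrete, here is a rough, untested sketch of what I have in mind, starting from the original layout with one row per policy (the first listing above). It assumes that date_start and date_end are numeric Stata daily dates, that the end date is inclusive, that the policy variables can be matched with policy*, and that 0 means "no policy in force". The names spell, ndays and date are placeholders of my own.

```stata
* Untested sketch, not a definitive solution.
* Assumes: one row per policy, numeric daily dates, inclusive end date.
* spell, ndays and date are placeholder names.

* A missing end date means the policy lasted one day only
replace date_end = date_start if missing(date_end)

* Turn each policy spell into one row per day it is in force
gen long spell = _n
gen long ndays = date_end - date_start + 1
expand ndays
bysort spell: gen long date = date_start + _n - 1
format date %td

* Reduce to one row per country and day;
* max keeps the highest value among the policies in force that day
collapse (max) policy*, by(id date)

* Declare the panel, fill in days without any policy, and set those to 0
xtset id date
tsfill
foreach v of varlist policy* {
    replace `v' = 0 if missing(`v')
}
```

Taking the maximum in collapse is only an assumption about how overlapping spells of the same policy should be combined; for the ordinal variables a different rule (for example, the most recently introduced value) might be more appropriate.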

I would be grateful for any help with this.

Many thanks and best,
Sophie