Dear Statalisters,

I have an administrative dataset that is updated on the 1st and 15th day of every month.
It looks somewhat like this:
Code:
clear
input long id  str10( var1 begin end spell )
1 x 01.01.2006 14.01.2006 1
1 y 15.01.2006 28.02.2006 2
1 z 01.03.2006 31.12.2006 3
2 a 01.01.2006 29.02.2008 1
end
gen SIFbegin=date(begin, "DMY")
gen SIFend=date(end, "DMY")
format SIFbegin %td
format SIFend %td
Hovewer, unlike a panel, if nothing changes (e.g. var 1 stays the same) the endate of the spell is just updated to the day before the next update and no new entry is made (i.e. no observation for the preceding time period).If something had changed within that ~15 day period from the previous update, the enddate would be set to end the day before the update and the change in var1 would get a new entry (new spell) with the begin date set to the date of the update and the enddate to the day before the next update. I now want to bring this into a panel form that includes the update cycles where "nothing happened" and would ideally look like this:

Code:
clear
input long id  str10( var1 begin end  spell)
1 x 01.01.2006 14.01.2006 1
1 y 15.01.2006 31.01.2006 2
1 y 01.02.2006 14.02.2006 2
1 y 15.02.2006 28.02.2006 2
1 z 01.03.2006 14.03.2006 3
1 z 15.03.2006 31.03.2006 3
1 z 01.04.2006 14.04.2006 3
1 z 15.04.2006 30.04.2006 3
. . . . .
. . . . .
. . . . .
1 z 15.12.2006 31.12.2006 4
2 a 01.01.2006 14.01.2006 1
. . . . .
. . . . .
. . . . .
2 a 15.02.2008 29.02.2008 1
end
gen SIFbegin=date(begin, "DMY")
gen SIFend=date(end, "DMY")
format SIFbegin %td
format SIFend %td
NOTE: the dots are not missing values, just me skipping over the interjacend cycles

My first(admittedly somewhat naive) attempt at doing this transformation relied on using the stata week-date format and expanding by the difference (in weeks) between begin and end date:
Code:
clear
input long id  str10( var1 begin end spell )
1 x 01.01.2006 14.01.2006 1
1 y 15.01.2006 28.02.2006 2
1 z 01.03.2006 31.12.2006 3
2 a 01.01.2006 29.02.2008 1
end
gen SIFbegin=date(begin, "DMY")
gen SIFend=date(end, "DMY")
format SIFbegin %td
format SIFend %td

gen WEEK_begin=wofd(SIFbegin)
gen WEEK_end=wofd(SIFend)
format WEEK_begin %tw
format WEEK_end %tw
gen duration=WEEK_end-WEEK_begin
replace duration=duration/2
expand duration
But at this point I realized this would of course give me paneldates that do not coincide with my update cycles, as calenderweeks do not really overlap with them and that of course these update cycles can have varying lenghts of 13, 14, 15 or 16 days (depending on the month and wether or not there are leap days):

lenght 13: 15th-28th of a regular february
lenght 14: 1st-14th of a month and 15th-29th in february of a leap year
lenght 15: 15th-30th of a month
lenght 16: 15th-31st of a month

And now I am at a loss for a straightforward viable solution. The Problem is that my paneldates have to exactly fit the update cycle as I want to merge this data with another such dataset (with the same problem) but the id only uniquely identifies observations within one update cycle (i.e. between one begin and end date)..
I somehow have a feeling that my problem is trivial and the solution is just arround the corner and I am just not seeing it.

I would greatly appreciate any help and feedback!

Cheers,

Franz