I wrote the following test code, which marks an id as balanced when its number of nonmissing observations equals the maximum number of observations satisfying the if condition. It fails when there is a date mismatch, as in the last case. That case is invalid because no id has valid data at both t=5 and t=8 (or at the minimum and maximum dates of the panel, if the program is used without a condition). I could add a failure check that calculates the in-sample minimum and maximum of the date, but I wanted to ask whether there is a more natural way to do this. I assume this is a fairly common concern, but I don't have a good sense of the best way to approach it.
Code:
clear
input float(id date variable)
1 5 .88
1 6 .2
1 7 .89
2 5 .58
2 6 .37
2 7 .85
3 5 .39
3 6 .12
3 7 .
4 6 .7
4 7 .69
4 8 .93
end

capture program drop balanced
program define balanced
    syntax varlist [if], Generate(string)
    marksample touse
    tempvar obs
    // running count of usable observations within each id
    by id (date): gen `obs' = sum(`touse')
    qui sum `obs', meanonly
    local maxobs = `r(max)'
    // drop every id whose usable count is below the largest count
    qui by id (date): replace `touse' = 0 if `obs'[_N] != `maxobs'
    gen `generate' = `touse'
end

tsset id date
balanced variable if inrange(date,5,7), g(bal57)
balanced variable if inrange(date,5,6), g(bal56)
balanced variable if inrange(date,6,7), g(bal67)
balanced variable if inrange(date,5,8), g(bal58) // produces an incorrect result; should probably raise an error
Code:
     +------------------------------------------------------+
     | id   date   variable   bal57   bal56   bal67   bal58 |
     |------------------------------------------------------|
  1. |  1      5        .88       1       1       0       1 |
  2. |  1      6         .2       1       1       1       1 |
  3. |  1      7        .89       1       0       1       1 |
  4. |  2      5        .58       1       1       0       1 |
  5. |  2      6        .37       1       1       1       1 |
  6. |  2      7        .85       1       0       1       1 |
  7. |  3      5        .39       0       1       0       0 |
  8. |  3      6        .12       0       1       0       0 |
  9. |  3      7          .       0       0       0       0 |
 10. |  4      6         .7       0       0       1       1 |
 11. |  4      7        .69       0       0       1       1 |
 12. |  4      8        .93       0       0       0       1 |
     +------------------------------------------------------+
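For concreteness, here is a minimal sketch of the failure check mentioned above. It is not a tested solution. The program name balanced_chk is hypothetical, and the sketch assumes, like the code above, that the panel and time variables are literally named id and date. It only checks the endpoints: an id is kept if its usable observations start at the in-sample minimum date, end at the in-sample maximum date, and match the largest usable count. It does not by itself guarantee that interior dates line up across ids.

```stata
capture program drop balanced_chk
program define balanced_chk
    // Hypothetical variant of -balanced- with an endpoint check.
    // Assumes variables named id and date exist (not general).
    syntax varlist [if], Generate(name)
    marksample touse
    tempvar obs first last

    // in-sample first and last date among usable observations
    quietly summarize date if `touse', meanonly
    local tmin = r(min)
    local tmax = r(max)

    // per id: number of usable observations and their first/last date
    bysort id (date): egen `obs'   = total(`touse')
    bysort id (date): egen `first' = min(cond(`touse', date, .))
    bysort id (date): egen `last'  = max(cond(`touse', date, .))

    quietly summarize `obs', meanonly
    local maxobs = r(max)

    // keep an id only if it has the largest count and spans both endpoints
    quietly replace `touse' = 0 if `obs' != `maxobs' | `first' != `tmin' | `last' != `tmax'

    // fail loudly if no id covers the whole requested window
    quietly count if `touse'
    if r(N) == 0 {
        display as error "no id has usable data at both date `tmin' and date `tmax'"
        exit 459
    }
    generate byte `generate' = `touse'
end
```

Under these assumptions, a call such as balanced_chk variable if inrange(date,5,8), g(chk58) should stop with an error on the example data instead of creating a misleading marker, because no id has usable data at both t=5 and t=8.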