Hello Statalisters,

I am trying to select a subset of a large clinical dataset based on a set of criteria. Right from the start of my do-file, I
get a different result each time I run it. My commands for one particular criterion are below:

Note: lab_date: date when a glucose test was done
food_start_date: date when clients started a particular food combination

Code:
******Checking for and dropping  duplicates 

sort id_client food food_start_date lab_date
quietly by id_client food food_start_date lab_date:  gen dup = cond(_N==1,0,_n)
br if dup>0

drop if dup>1

bysort id_client lab_date: gen visit=_n

*based on selection criteria:*

***keeping only patients who started "FO"-containing food between 01/01/14 and 31/12/14
bysort id_client visit: gen condition1 = 1 if strpos(food, "FO") > 0 ///
& inrange(food_start_date,td(01jan2014),td(31dec2014)) 

       *Capturing entire follow-up period (using lab_dates as reference) of each observation on 
       by id_client: mipolate condition1 visit , gen(condition2) forward
       keep if condition2==1

The number of observations kept by this last step varies between 60610 and 60622 each time I run this part of the do-file.

Any reason why this is the case?

I know that sorting with the "stable" option (-sort, stable-) would only mask the problem.

Any suggestions?

Thanks so much for your assistance.

Regards

Adrian