Hi Listers,

I have collected data at 4 regular intervals on people who want to join the gym. At each time point I ask them whether they have joined and whether they are interested in joining. I would like to see whether those who say they are interested in joining go on to report that they have joined the gym at the next follow-up, and I am using a mixed-model approach to establish this relationship.

If they report joining at one time point, I don't think I should include their response to the want question (which may still be yes). I decided to code want as 999 in this case and then exclude those observations from the analysis; however, the results change depending on how I do this!

This is my code:

* set value of want to be 999 if they have joined already at the previous session
replace want2 = 999 if joined1==1
replace want3 = 999 if joined2==1
replace want4 = 999 if joined3==1

reshape long want joined, i(id) j(time)

* First approach: I set an if statement within xtlogit to exclude the cases with 999
xtset id time
xtlogit joined l1.want i.time if want!=999, or i(id)

* Second approach: I drop them before running the model
drop if want==999
xtset id time
xtlogit joined l1.want i.time, or i(id)

The first approach gives a positive association between want and joined, but I find no association using the second approach. Quite different results!
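
One check I could run to narrow this down is to look at what the lagged predictor actually contains under each approach. This is only a sketch: lagwant1 and lagwant2 are names made up for illustration, it assumes the reshaped long data with nothing dropped yet, and I have not confirmed that this is the cause of the difference.

```stata
* Sketch only: lagwant1 and lagwant2 are hypothetical names for illustration.
* Run on the reshaped long data, before anything has been dropped.
xtset id time

* Approach 1: the lag is taken from the full panel, so it may still equal 999
generate lagwant1 = l1.want
tabulate lagwant1 if want != 999, missing

* Approach 2: the lag is taken after the 999 rows are gone, so gaps give missing
preserve
drop if want == 999
xtset id time
generate lagwant2 = l1.want
tabulate lagwant2, missing
restore
```

If the two tables differ (for example, 999 appearing as a value of the lag in the first, or extra missing values in the second), then the two models are not being fitted to the same predictor values.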

What's causing such difference and which one is the correct approach?


Thanks in advance,
Laura