Hi everyone,

I'm evaluating whether increased dosage (number of completed sessions) of a therapeutic behaviour-change program is associated with cessation of that behaviour in the 12 months after the program ends. The outcome is binary (to be modelled with a logistic model): engaged in the behaviour vs. did not engage in the behaviour.

Unfortunately, the program was not run as an RCT; I only have a naturalistic/observational design. Another group, those who were referred but never commenced the program, will act as the control group. I'm planning to estimate propensity scores and use them for inverse probability of treatment weighting (IPTW), so that the two groups are balanced in terms of their likelihood of participating in the program to begin with. I am seeking the average treatment effect on the treated (ATT), relative to never having participated in the program.

I have two questions:

1. I was planning to fit an -xtmelogit- model of program participation, followed by -predict prob_participation- and then:

gen iptw=1/prob_participation
summ iptw
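
I'm also unsure whether 1/prob_participation is the right weight for the ATT. My understanding, which may be wrong, is that for the ATT the treated keep a weight of 1 and the controls are weighted by p/(1-p). A minimal sketch of what I think that would look like, assuming a 0/1 indicator called treated (a hypothetical variable name, 1 = commenced the program) and prob_participation as predicted above:

```stata
* Sketch only: treated is a hypothetical 0/1 indicator (1 = commenced the program)
* prob_participation is the predicted probability of participation from the model above
gen att_w = cond(treated == 1, 1, prob_participation/(1 - prob_participation)) ///
    if !missing(treated, prob_participation)

* Inspect the control-group weights for extreme values
summ att_w if treated == 0, detail
```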


However, I've also seen people use -pscore- to estimate propensity scores. Is it okay to use the former method, or is it less efficient than -pscore-?

2. When choosing the covariates to include in the propensity score model, is it better practice to include only personal characteristics of the sample (e.g., age, gender, education), or is it also okay to include broader operational factors (e.g., program trainer identity, timing of the start of the program)? I've seen it written in some places that including more 'operational factors' can be problematic because they are often more strongly related to treatment than to the outcome, so they may not be true confounders.


I'm not sure if it's helpful information, but my data is in wide format.


Thanks in advance! I've never used IPTW before, so it's very new to me.

Marlee