I have data on 150 survey respondents, collected for each of the years 2012-2016, covering four measures: Infection (binary outcome of interest), Hormonal BC (binary predictor #1 of interest), Non-Hormonal BC (binary predictor #2 of interest), and Age. I have been asked to determine the risk of infection associated with hormonal BC use compared to non-hormonal BC use, adjusting for age.

My problem is that both the outcome and the predictors vary over time, and the predictors can overlap: a respondent can be on hormonal and non-hormonal BC in the same year. For example, a respondent could have an infection in 2012 while on hormonal BC, have an infection in 2013 while on non-hormonal BC, and have no infection in 2014 while on both (see the table below). This means I have to account not only for repeated measures, but also for the interaction between the two types of BC use and for the correlation between BC use and infection in the same year.

How should I set up this analysis? This is way over my head, and I don't think -xtgee- with StudyID as the panel variable is quite right.

StudyID Year Infection Age Hormonal NonHorm
1 2012 0 24 0 0
1 2013 1 25 0 0
1 2014 0 26 0 0
1 2015 1 27 1 0
1 2016 1 28 0 0
2 2012 1 26 1 1
2 2013 1 27 1 1
2 2014 1 28 1 1
2 2015 1 29 0 0
2 2016 1 30 0 0
3 2012 1 41 0 0
3 2013 1 42 0 0
3 2014 1 43 0 0
3 2015 1 44 0 0
3 2016 1 45 0 0
4 2012 0 23 0 1
4 2013 0 24 0 1
4 2014 0 25 0 1
4 2015 0 26 0 1
4 2016 0 27 0 1
5 2012 0 27 1 0
5 2013 0 28 1 0
5 2014 0 29 1 0
5 2015 0 30 0 0
5 2016 1 31 1 1
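
To make the overlap concrete, here is a minimal Python sketch that collapses the two BC indicators into a single four-level exposure and counts infections and person-years in each level, using only the sample rows above. The category labels and the function name are made up for illustration. This is a descriptive tabulation only: it ignores the repeated measures within StudyID and does not adjust for age.

```python
from collections import Counter

# Rows copied from the table above:
# (StudyID, Year, Infection, Age, Hormonal, NonHorm)
ROWS = [
    (1, 2012, 0, 24, 0, 0), (1, 2013, 1, 25, 0, 0), (1, 2014, 0, 26, 0, 0),
    (1, 2015, 1, 27, 1, 0), (1, 2016, 1, 28, 0, 0),
    (2, 2012, 1, 26, 1, 1), (2, 2013, 1, 27, 1, 1), (2, 2014, 1, 28, 1, 1),
    (2, 2015, 1, 29, 0, 0), (2, 2016, 1, 30, 0, 0),
    (3, 2012, 1, 41, 0, 0), (3, 2013, 1, 42, 0, 0), (3, 2014, 1, 43, 0, 0),
    (3, 2015, 1, 44, 0, 0), (3, 2016, 1, 45, 0, 0),
    (4, 2012, 0, 23, 0, 1), (4, 2013, 0, 24, 0, 1), (4, 2014, 0, 25, 0, 1),
    (4, 2015, 0, 26, 0, 1), (4, 2016, 0, 27, 0, 1),
    (5, 2012, 0, 27, 1, 0), (5, 2013, 0, 28, 1, 0), (5, 2014, 0, 29, 1, 0),
    (5, 2015, 0, 30, 0, 0), (5, 2016, 1, 31, 1, 1),
]

# Hypothetical labels for the four possible (Hormonal, NonHorm) combinations
EXPOSURE_LABELS = {
    (0, 0): "none",
    (1, 0): "hormonal_only",
    (0, 1): "nonhormonal_only",
    (1, 1): "both",
}


def summarize(rows):
    """Return {exposure label: (infections, person-years)} for the given rows."""
    person_years = Counter()
    infections = Counter()
    for _study_id, _year, infection, _age, hormonal, nonhorm in rows:
        label = EXPOSURE_LABELS[(hormonal, nonhorm)]
        person_years[label] += 1       # each row is one respondent-year
        infections[label] += infection  # Infection is coded 0/1
    return {
        label: (infections[label], person_years[label])
        for label in EXPOSURE_LABELS.values()
    }


summary = summarize(ROWS)
for label, (n_infections, n_person_years) in summary.items():
    print(f"{label:17s} {n_infections} infections / {n_person_years} person-years")
```

Treating exposure as one four-level category is just one possible way to represent the overlap; it carries the same information as the two indicators plus their interaction.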