I would appreciate your input on how best to use the exposure() option in xtgee and glm analyses.

My outcome variable is a count. However, my unit of analysis is the neighborhood, and because neighborhoods vary in size, I think it is best to model a density variable, defined as count / neighborhood area.

My question: is it better to capture the differences in neighborhood area by using the exposure() option and, if so, should I do this in addition to or in lieu of making the density variable the outcome?
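
For reference, here is my understanding of what exposure() does, which may well be incomplete: it adds ln(area) to the linear predictor with its coefficient fixed at 1. Assuming a log link (not the identity link used in my examples), that would turn a model for the count into a model for the density. The symbols x_{it} and \beta are my own notation for the predictors and their coefficients, with i indexing neighborhoods and t indexing years.

```latex
\begin{align*}
% log link with exposure(area): the offset ln(area) enters with coefficient 1
\ln \mathrm{E}[\text{count}_{it}] &= x_{it}\beta + \ln(\text{area}_i) \\
% exponentiate and divide by area: the model describes expected density
\Longleftrightarrow \quad \frac{\mathrm{E}[\text{count}_{it}]}{\text{area}_i} &= \exp(x_{it}\beta) \\[6pt]
% identity link with the same option: ln(area) is only added to the mean
\mathrm{E}[\text{count}_{it}] &= x_{it}\beta + \ln(\text{area}_i)
\end{align*}
```

If this is right, the last line suggests that with an identity link the option shifts the expected count by ln(area) rather than rescaling it by area, which is part of why I am unsure how to combine exposure() with a density outcome.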

Example 1: xtgee
xtset neighborhood year
xtgee count predictors, family(gaussian) link(identity) corr(ar1) exposure(area of neighborhood) vce

Or

xtgee density predictors, family(gaussian) link(identity) corr(ar1) exposure(area of neighborhood) vce

Example 2: GLM

glm count predictors, family(gamma) link(identity) corr(ar1) exposure(area of neighborhood) vce

Or

glm density predictors, family(gamma) link(identity) corr(ar1) exposure(area of neighborhood) vce


I’m leaning towards using xtgee with density as the outcome plus exposure(), and glm with count as the outcome, but I am unclear how to account for the differences in area when using glm. Could I still use glm with a gamma distribution if my outcome is density?

Thank you in advance for your insights.