Hi group,
I'm running an OLS model where the dependent variable is each patient's average HbA1c value. Among the predictors is a group of measures from the American Community Survey (ACS) that report the percentage of people in a census tract who have attained various levels of education. I matched each patient's census tract to the ACS census tract for the relevant years to obtain these measures, so the values describe the entire census tract rather than the individual person (it's the best we can do, since we don't have a measure of individual patients' education).

There are 5 of these measures: (1) % < high school education, (2) % high school education/GED, (3) % some college, (4) % bachelor's degree, (5) % master's degree or higher. These five measures sum to 100%.

I don't have any experience modeling something like this, where there are separate measures relating to the same thing. We discussed instituting a cut-point to create an indicator variable, such as high_school_educ = 1 if % high school education/GED is > 50% for that particular census tract, but then we get into the business of 'choosing' the cut-point. Does anyone have advice or experience working with something like this?
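
To make the cut-point idea concrete, here is a minimal Python sketch of what we discussed. The column names, tract IDs, and percentages are all made up for illustration (they are not our data or actual ACS field names), and the 45% comparison is only there to show how sensitive the indicator is to the choice of threshold.

```python
# Hypothetical column names for the five ACS education shares (percent of tract).
EDU_COLS = [
    "pct_lt_hs",
    "pct_hs_ged",
    "pct_some_college",
    "pct_bachelors",
    "pct_masters_plus",
]

# Made-up example tracts; the five shares in each row sum to 100.
tracts = [
    {"tract_id": "A", "pct_lt_hs": 10.0, "pct_hs_ged": 55.0,
     "pct_some_college": 20.0, "pct_bachelors": 10.0, "pct_masters_plus": 5.0},
    {"tract_id": "B", "pct_lt_hs": 5.0, "pct_hs_ged": 48.0,
     "pct_some_college": 22.0, "pct_bachelors": 15.0, "pct_masters_plus": 10.0},
    {"tract_id": "C", "pct_lt_hs": 20.0, "pct_hs_ged": 30.0,
     "pct_some_college": 25.0, "pct_bachelors": 15.0, "pct_masters_plus": 10.0},
]


def shares_sum_to_100(tract, tol=0.5):
    """Check that the five education shares add up to 100% (within rounding)."""
    return abs(sum(tract[c] for c in EDU_COLS) - 100.0) <= tol


def high_school_indicator(tract, cut_point=50.0):
    """Return 1 if % high school education/GED exceeds the cut-point, else 0."""
    return 1 if tract["pct_hs_ged"] > cut_point else 0


# Compare two candidate cut-points to see which tracts change category.
for cut in (45.0, 50.0):
    flags = {t["tract_id"]: high_school_indicator(t, cut) for t in tracts}
    print(f"cut-point {cut}%: {flags}")
```

In this toy example, tract B (48% high school/GED) is coded 1 at a 45% cut-point but 0 at 50%, which is exactly the arbitrariness that worries us.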
Thanks.