I successfully ran the model below and created a nice chart showing trends in gender over time for people in a specific department and position in my data.
Gender is the dependent variable, and time (year) is the independent variable.
Code:
mlogit gender_n year, vce(cluster person)
margins, at(year = (2008(1)2013))
marginsplot
Within a specific department and position, there are NO duplicate people in a given year (although the same person can appear in more than one year). Data sample:
Department A, Position H
person | year | gender
2 | 2009 | M
2 | 2010 | M
3 | 2010 | F
3 | 2011 | F
I would like to produce the same chart for EVERY department and position in my data. However, in the full dataset there are duplicate people within a given YEAR, because the same person can be associated with more than one position and more than one department. Data sample:
All Data
person | year | position | department | gender
1 | 2009 | H | A | M
1 | 2009 | H | B | M
1 | 2009 | L | C | M
1 | 2009 | L | B | M
Is this a case where it is justifiable to run separate regression models for each combination of department and position? Or is it better to do a model on all data combined, including department and position and potentially their interaction as independent variables?
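For concreteness, the two alternatives could be sketched roughly as follows. This is only an illustration under stated assumptions: `cell`, `dept_n`, and `pos_n` are hypothetical variable names that are not in the data shown above, and the sketch assumes `department` and `position` are stored as strings (if they are already numeric, the `encode` lines would be unnecessary and would fail).

```stata
* Option 1: a separate model for each department-position combination.
* "cell" is a hypothetical grouping variable created for illustration.
egen cell = group(department position), label
levelsof cell, local(cells)
foreach c of local cells {
    * capture noisily: keep looping even if a sparse cell fails to estimate
    capture noisily {
        mlogit gender_n year if cell == `c', vce(cluster person)
        margins, at(year = (2008(1)2013))
        marginsplot, name(cell`c', replace)
    }
}

* Option 2: one pooled model, with year interacted with department and position.
* "dept_n" and "pos_n" are hypothetical numeric encodings of the string variables.
encode department, generate(dept_n)
encode position, generate(pos_n)
mlogit gender_n c.year##i.dept_n##i.pos_n, vce(cluster person)
margins dept_n#pos_n, at(year = (2008(1)2013))
marginsplot, by(dept_n pos_n)
```

Neither sketch settles which approach is preferable; they only show what each option would involve in practice.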
Any help is much appreciated. (Note: I am using Stata 15.)