I’m analyzing applicants for grants over several years. Within a given year, duplicate people (those who submitted more than one application that year) have been dropped. However, it is quite common for the same person to appear in multiple years, and the number of years in the data varies across people.
The question I am trying to answer: is there a "statistically significant" linear trend over time in the percentage of females, the percentage of males, and the percentage unknown? I’d also like to show the regression trends in a graph with confidence intervals. I realize that, because there is an unknown category, increases in the % female and % male over time need to be interpreted with caution.
Proposed setup: three separate logistic regression models. The outcomes are 1) female (vs. not female), 2) male (vs. not male), and 3) unknown (vs. not unknown).
The explanatory variable is year, coded as 1, 2, 3, etc. (used to estimate the linear trend).
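To make the coding concrete, here is a minimal sketch of the three binary outcomes and the yearly percentages they correspond to. The records, the tuple layout `(person_id, year, gender)`, and the function names are all made up for illustration; they are not my actual data or variable names.

```python
from collections import Counter, defaultdict

# Hypothetical toy records (NOT real data): (person_id, year, gender).
records = [
    (1, 1, "female"), (2, 1, "male"), (3, 1, "unknown"), (4, 1, "male"),
    (1, 2, "female"), (5, 2, "female"), (2, 2, "male"), (6, 2, "unknown"),
    (1, 3, "female"), (5, 3, "female"), (7, 3, "female"), (2, 3, "male"),
]

CATEGORIES = ("female", "male", "unknown")


def indicator_rows(records, category):
    """Return (person_id, year, y) with y = 1 if gender == category, else 0."""
    return [(pid, year, int(gender == category)) for pid, year, gender in records]


def yearly_percentages(records):
    """Percentage of each gender category among the applicants of each year."""
    counts = defaultdict(Counter)
    for _pid, year, gender in records:
        counts[year][gender] += 1
    result = {}
    for year in sorted(counts):
        total = sum(counts[year].values())
        result[year] = {c: 100.0 * counts[year][c] / total for c in CATEGORIES}
    return result


shares = yearly_percentages(records)
for year, pct in shares.items():
    print(year, {c: round(p, 1) for c, p in pct.items()})
```

Each of the three models would then use the `y` column from `indicator_rows(records, category)` as its outcome and `year` as the single predictor.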
Question:
1) Does one need to account for the fact that the same person can appear in different years? For my purposes, I just want to know whether the overall percentage increased over the years, regardless of whether some applicants were the same people. Also, the outcome (gender) does not change over time within a given person. It therefore seems that my goal is to treat the observations as independent, even though some of the same people appear in multiple years. Can one use a regular logit, or does one need something like a GEE to account for the panel structure?
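To show what the two options would actually change, here is a rough sketch that fits the year-only logit by Newton-Raphson and then computes the slope's standard error two ways: the usual model-based one, and a sandwich version that sums score contributions within person before squaring. This is a simplified stand-in for a cluster-robust variance, not a GEE, and it applies no finite-sample correction, so packaged routines may report somewhat different numbers. The toy panel and all function names are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical toy panel (NOT real data): person_id -> (gender, years applied).
people = {
    1: ("female", [1, 2, 3, 4]), 2: ("male", [1, 2, 3]), 3: ("male", [1, 2]),
    4: ("unknown", [1]), 5: ("female", [2, 3, 4]), 6: ("male", [2, 4]),
    7: ("female", [3, 4]), 8: ("unknown", [3]), 9: ("female", [4]),
    10: ("male", [1, 4]),
}


def make_rows(people, category):
    """One row per person-year: (person_id, year, 1 if gender == category else 0)."""
    return [(pid, year, int(gender == category))
            for pid, (gender, years) in people.items() for year in years]


def pieces(rows, b0, b1):
    """Per-row score contributions and the 2x2 information matrix at (b0, b1)."""
    scores, h00, h01, h11 = [], 0.0, 0.0, 0.0
    for pid, x, y in rows:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = p * (1.0 - p)
        scores.append((pid, y - p, (y - p) * x))
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return scores, (h00, h01, h11)


def fit_logit(rows, n_iter=25):
    """Newton-Raphson for logit P(y=1) = b0 + b1 * year, starting from zero."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        scores, (h00, h01, h11) = pieces(rows, b0, b1)
        g0 = sum(s0 for _, s0, _ in scores)
        g1 = sum(s1 for _, _, s1 in scores)
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} g for the symmetric 2x2 matrix H.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def slope_standard_errors(rows, b0, b1):
    """Return (model-based SE, person-clustered sandwich SE) for the year slope."""
    scores, (h00, h01, h11) = pieces(rows, b0, b1)
    det = h00 * h11 - h01 * h01
    a10, a11 = -h01 / det, h00 / det  # second row of the inverse information
    naive = math.sqrt(a11)
    by_person = defaultdict(lambda: [0.0, 0.0])
    for pid, s0, s1 in scores:
        by_person[pid][0] += s0
        by_person[pid][1] += s1
    # Sandwich variance of the slope: sum over persons of (row of H^{-1} . score)^2.
    clustered = math.sqrt(sum((a10 * s[0] + a11 * s[1]) ** 2
                              for s in by_person.values()))
    return naive, clustered


for category in ("female", "male", "unknown"):
    rows = make_rows(people, category)
    b0, b1 = fit_logit(rows)
    naive, clustered = slope_standard_errors(rows, b0, b1)
    print(f"{category}: slope={b1:.3f} naive_se={naive:.3f} clustered_se={clustered:.3f}")
```

The point estimate of the slope is the same under both approaches in this sketch; only the standard error, and hence the significance test and the confidence band in the graph, differs.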
Note, question cross posted here (no replies as of now): https://stats.stackexchange.com/ques...-for-clusterin