Hi all, I am working on a study of grant applications and selections. I have about 8 years' worth of data, and am looking at Principal Investigators only. The goal is to see if characteristics of the PI predict the likelihood of getting a grant.
It is common for the same person to apply for multiple grants and to be selected for multiple grants. The same person can apply for several grants within the same year, but I don’t have a more refined time unit than year, so I don’t know the time order of the different applications within a year.
Sample sizes:
about 8,000 applications
about 3,000 unique applicants
I am interested in a model predicting the probability of being awarded the grant. Currently I am running a model of applications clustered within applicants using the following code:
xtlogit awarded yearvariables independentvariables, i(applicant_id)
My questions:
1) Does this sound like the correct model for the data structure, with applications clustered within applicants? Specifically, I’m wondering whether I need to account for the particular grant topic/area the person applied for, since some areas will have higher award rates than others. Within the same grant area, the same person can apply multiple times and be selected multiple times, so one person can hold multiple awards in the same area. Across the 8 years there are about 150 year-by-grant-area combinations, which seems like too many to include as dummy variables in the model. Could this instead be a cross-classified model, with applications nested in applicants, but a given applicant linked to multiple grants?
2) Is it a problem that the same person can apply multiple times in a year, and I don’t have a more refined time unit than year?
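To make question 1 concrete, the cross-classified specification I have in mind would look something like the sketch below. I have not run this, so treat it as an illustration rather than a working solution: year and grant_area are hypothetical variable names for the application year and grant topic/area, area_year is a hypothetical identifier built from them, and yearvariables and independentvariables are the same placeholders as in my xtlogit call.

```stata
* Sketch only, not run: year, grant_area and area_year are hypothetical names
* Build one identifier for each of the ~150 year-by-grant-area combinations
egen area_year = group(year grant_area)

* Crossed random intercepts: one for the year-grant-area cell, one for the applicant
melogit awarded yearvariables independentvariables || _all: R.area_year || applicant_id:
```

I put area_year in the _all: R. term because it has far fewer levels (about 150) than applicant_id (about 3,000), which as I understand it should keep estimation more manageable.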
Any advice would be much appreciated!
Thank you!!
MJ