I am using Stata/MP 14.2, and my installation does not have internet access, so I cannot copy code or output to this forum.
My data are longitudinal: 128 "zones", each with around 700 daily observations recording whether or not a pipe break occurred on that day. My covariates are time-varying factors specific to each zone, such as water demand and pressure measurements. One issue is that pressure measurements (a key variable) are available for only 42 zones and are severely unbalanced. Time-invariant factors are ignored because there are too many of them for us to quantify.
Initially the idea was to do a regression on breaks per mile of pipe, but it has since come to light that the miles used in that calculation are unreliable estimates. So, a binary outcome of whether or not a break happened seems reasonable.
xtlogit offers random-effects, conditional fixed-effects, and population-averaged estimators, but I am not sure which would be best.
As I understand it, random effects is only valid when the panels are a random sample from a larger population. Since the population-averaged approach is similar, is it excluded on the same grounds? Also, since we don't quantify the time-invariant variables, isn't random effects invalid, and does that objection apply to the population-averaged approach too?
That leaves conditional fixed effects, but xtlogit with the fe option doesn't offer cluster-robust standard errors. I could use the bootstrap option (the docs don't explicitly say that this would be sufficient, although threads on this forum suggest it is) or fit the model with clogit and robust standard errors. But according to the docs, clogit is for matched case-control data...
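For concreteness, here is a minimal sketch of the candidate specifications discussed above. The variable names (zone, anybreak, demand, pressure) are placeholders and not the names in my dataset, and the number of bootstrap replications and the seed are arbitrary.

```stata
* Placeholder variable names (assumptions, not the real dataset):
*   zone     - panel identifier (128 zones)
*   anybreak - 1 if a pipe break occurred in that zone on that day, else 0
*   demand, pressure - time-varying covariates
xtset zone

* Random-effects logit
xtlogit anybreak demand pressure, re

* Population-averaged (GEE) logit with robust standard errors
xtlogit anybreak demand pressure, pa vce(robust)

* Conditional fixed-effects logit with bootstrap standard errors
* (reps and seed chosen arbitrarily for illustration)
xtlogit anybreak demand pressure, fe vce(bootstrap, reps(200) seed(12345))

* Conditional logit grouped on zone, with standard errors clustered on zone
clogit anybreak demand pressure, group(zone) vce(cluster zone)
```

The last two commands should give the same point estimates, since both are grouped on zone; they differ only in how the standard errors are computed.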
What is the most appropriate approach? Hosmer, Lemeshow, and Sturdivant (2013) mention a "cluster-specific" model, but I don't see that language anywhere in the Stata documentation.
Hosmer Jr., D. W., Lemeshow, S., & Sturdivant, R. X. (2013). Applied Logistic Regression. Wiley series in probability and statistics. Hoboken, NJ, USA: John Wiley & Sons, Inc.