I am using Stata/MP 14.2, and my installation does not have internet access, so I cannot copy code or output to this forum.
My data are longitudinal, with 128 "zones" and around 700 daily observations per zone recording whether or not a pipe break occurred on that day. My covariates are time-varying factors specific to each zone, such as water demand and pressure measurements. One issue is that pressure measurements (a key variable) are available for only 42 zones and are severely unbalanced. Time-invariant factors are ignored because there are too many that we cannot quantify.
The initial idea was to run a regression on breaks per mile of pipe, but it has since come to light that the mileage figures used in that calculation are unreliable estimates. A binary outcome indicating whether or not a break happened therefore seems reasonable.
xtlogit offers random-effects, conditional fixed-effects, and population-averaged estimators, but I am not sure which would be best.
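Typed by hand for concreteness, the three variants I am weighing would look roughly like the sketch below. The variable names (zone, date, anybreak, demand, pressure) are placeholders rather than my actual variable names, and the exchangeable correlation structure for the population-averaged model is only an assumption for illustration.

```stata
* Declare the panel structure: zone is the panel id, date is a daily date variable
xtset zone date

* Random-effects logit (zone-specific random intercept)
xtlogit anybreak demand pressure, re
estimates store re_model

* Conditional fixed-effects logit
* (zones whose outcome never varies drop out of the estimation)
xtlogit anybreak demand pressure, fe
estimates store fe_model

* Population-averaged (GEE) logit with robust standard errors
xtlogit anybreak demand pressure, pa corr(exchangeable) vce(robust)
estimates store pa_model
```

Because pressure is observed for only 42 zones, any specification that includes it will be estimated on that subset alone.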
As I understand it, random effects is only valid when the panels are a random sample from a larger population. Since the population-averaged approach is similar, does that rule it out too? Also, since we do not quantify the time-invariant variables, isn't random effects invalid? Does that apply to the population-averaged approach as well?
That leaves conditional fixed effects, but xtlogit, fe does not offer cluster-robust standard errors. I could use the bootstrap option (the docs do not explicitly say this would be sufficient, although threads on this forum suggest it is), or I could fit the model with clogit and robust standard errors. But according to the docs, clogit is intended for matched case-control data...
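The two workarounds I have in mind would look something like this. Again, the variable names are placeholders, and the number of replications and the seed are arbitrary choices for illustration, not recommendations.

```stata
* Option 1: conditional fixed-effects logit with bootstrap standard errors
* (reps and seed values are arbitrary here)
xtset zone date
xtlogit anybreak demand pressure, fe vce(bootstrap, reps(200) seed(12345))

* Option 2: the same conditional likelihood via clogit,
* with standard errors clustered on zone
clogit anybreak demand pressure, group(zone) vce(cluster zone)
```

Both commands should give the same point estimates, so the comparison of interest is only between the two sets of standard errors.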
What is the most appropriate approach? Hosmer, Lemeshow, and Sturdivant (2013) mention a "cluster-specific" model, but I don't see that language anywhere in the Stata documentation.
Hosmer, D. W., Jr., Lemeshow, S., & Sturdivant, R. X. (2013). Applied Logistic Regression. Wiley Series in Probability and Statistics. Hoboken, NJ: John Wiley & Sons.