Dear all,
I am writing to ask whether there is any rule of thumb on the minimum cell size (N) for running a regression model, and specifically a multilevel ordered logistic regression model.
I want to estimate the effect of a dependent variable (DV) on an independent variable (IV) by age group (8 age groups). My IV is a Likert scale (life satisfaction), and my DV is a categorical variable taking the values: every day; 2-3 times a week; about once a week; 2-3 times a month; less often; never. Because the distribution of the DV differs greatly across age groups, and for ease of interpretation, I had initially recoded the IV and DV as two dummy variables, i.e. satisfied=1, not satisfied=0; use=1, never use=0. I then estimated a set of multilevel logit models, one for each of the 8 age groups of interest. The article reviewer is not satisfied with this modelling strategy, as it loses potentially relevant information.
I thus tried to fit a set of ordered logit models with the original variable coding, i.e. the IV as a Likert scale with 4 categories and the DV in 6 categories, again by age group. Cross-tabulating my IV and DV by age group, I noticed that I end up with some cells containing very few (0-20) cases.
I then tried to compromise by categorising my DV in three categories and my IV in 4, but I still have some cells with N<30; the same applies (though to a lesser extent) if I categorise the DV in two categories (please see pictures below). While the model runs and I do find some significant effects, I am worried about the sample size in some cells. Is there a rule of thumb on the minimum sample size?
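To make the cell check concrete, here is a minimal Python sketch of the cross-tabulation described above. It is not my actual code or data: the function name `small_cells`, the field names `age`, `iv` and `dv`, and the toy rows are all hypothetical, and the cut-off of 30 is simply the figure I mention above, not an established rule.

```python
from collections import Counter
from itertools import product


def small_cells(rows, iv_levels, dv_levels, age_groups, min_n=30):
    """Return {(age, iv, dv): count} for every cell with fewer than min_n cases.

    Cells that never occur in the data are included with a count of 0,
    which a plain frequency table would silently omit.
    """
    counts = Counter((r["age"], r["iv"], r["dv"]) for r in rows)
    return {
        cell: counts[cell]
        for cell in product(age_groups, iv_levels, dv_levels)
        if counts[cell] < min_n
    }


# Toy data, made up purely for illustration.
rows = (
    [{"age": "16-24", "iv": 4, "dv": "every day"}] * 40
    + [{"age": "16-24", "iv": 1, "dv": "never"}] * 5
)

flagged = small_cells(
    rows,
    iv_levels=[1, 4],
    dv_levels=["every day", "never"],
    age_groups=["16-24"],
    min_n=30,  # assumed threshold, taken from the N<30 mentioned above
)

for cell, n in sorted(flagged.items()):
    print(cell, n)
```

Listing every combination of levels, rather than only the observed ones, is what makes the empty cells show up.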
Also, I considered using interactions with age group instead of splitting the sample. However, am I right in thinking that with interactions, roughly speaking, the model is still relying on the small cells (as it estimates the effect of each category of the DV within each single age group)?
Thank you in advance for your help,
Best
Alessandra
[Two attached images not reproduced here]