Hi everyone,
I hope I'm posting this in the right place.
Here is my question:
My teachers have always told me that when you use a qualitative explanatory variable (binary or with more categories), each modality (category) of this variable must represent at least 5% of the total population. But what happens if one doesn't? What if one of the modalities represents less than 5% of the total sample?
I remember something like "standard errors are greater, hence the robustness of the estimated coefficient is poorer...".
But is it really that bad? What if my modality has A LOT of observations (like 1,000, 10,000 or 100,000) but still falls under that 5% threshold?
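To make the question concrete, here is a rough sketch of what I have in mind. It assumes the simplest possible case, which is my own simplification: a linear regression y = a + b*d + e with a single binary dummy d and homoskedastic errors of known standard deviation sigma. In that case the standard error of b is sigma * sqrt(1/n_rare + 1/n_rest). The function name `dummy_coef_se` and the sample sizes are made up for illustration.

```python
import math

def dummy_coef_se(n_total, share, sigma=1.0):
    """Standard error of the coefficient on a binary dummy in y = a + b*d + e,
    assuming homoskedastic errors with known standard deviation sigma."""
    n_rare = n_total * share      # observations in the rare modality
    n_rest = n_total - n_rare     # observations in the reference modality
    return sigma * math.sqrt(1.0 / n_rare + 1.0 / n_rest)

# Same 1% share each time, but very different absolute counts
# in the rare modality (hypothetical sample sizes).
for n_total in (200, 20_000, 2_000_000):
    se = dummy_coef_se(n_total, share=0.01)
    rare = round(n_total * 0.01)
    print(f"n = {n_total:>9,}  rare obs = {rare:>6,}  SE = {se:.4f}")
```

If this simplified formula is the right way to think about it, the standard error seems to be driven by the absolute number of observations in the rare modality rather than by its share, which is why I wonder whether the 5% rule still matters with large samples.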
Thank you very much for your help and guidance.
Jordan.