Dear Statalist Community,
I am looking for a command that tells me how many observations are lost with each variable that my regression contains.
Let's say my regression would look like this:
reg goals rankdiff teamvalue coachexperience weather
Let's also assume that the dataset contains several similarly defined variables, and that I am looking for the ones that leave me with the highest number of observations.
So far, I have been excluding single variables from the regression and checking how the number of observations changes, to see which combination of variables causes the largest drop.
I am imagining a command that gives me something like:
------------------
1. goals - 300k observations left (=100%)
2. rankdiff - 280k observations left
3. teamvalue - 270k observations left
4. coachexperience - 110k observations left
5. weather - 100k observations left
------------------
In this scenario, I would then proceed to look for a good replacement for "coachexperience", as it seems to have too many missing values in data rows where the other variables contain values.
The real dataset and the regression are bigger than this example, so finding out by hand which variables reduce the number of observations the most is much more tedious.
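To make the request concrete, the table above could be produced by hand with a loop along the following lines. This is only a minimal sketch, assuming the variable names from the example regression and that a row counts as "left" once every variable listed so far is non-missing. The counts are cumulative, so they depend on the order of the variables in the loop.

```stata
* Sketch: cumulative complete-case counts, adding one variable at a time.
* Variable names are taken from the example regression above.
tempvar touse
generate byte `touse' = 1

foreach v of varlist goals rankdiff teamvalue coachexperience weather {
    * Drop rows from the running sample where the current variable is missing
    quietly replace `touse' = 0 if missing(`v')
    quietly count if `touse'
    display as text %-20s "`v'" as result %12.0fc r(N) as text " observations left"
}
```

The last count should match the N that reg reports for the full model, provided no if/in restrictions or weights are used. The built-in misstable summarize and misstable patterns commands report missing values per variable and by pattern, which may cover part of this as well.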
I would appreciate any help regarding this matter.
Thank you very much,
Björn.