Dear Statalist Community,
I am looking for a command that tells me how many observations are lost because of each variable in my regression.
Let's say my regression would look like this:
reg goals rankdiff teamvalue coachexperience weather
Let's assume I am working with a dataset that contains several similarly defined variables, and I am looking for the ones that leave me with the highest number of observations.
So far, I have played around with excluding single variables to see how the number of observations changes and which combination of variables in the regression causes the largest drop in observations.
I am imagining a command that gives me something like:
------------------
1. goals - 300k observations left (=100%)
2. rankdiff - 280k observations left
3. teamvalue - 270k observations left
4. coachexperience - 110k observations left
5. weather - 100k observations left
------------------
In this scenario, I would then look for a good replacement for "coachexperience", as it seems to have too many missing values in rows where the other variables are non-missing.
The real dataset and regression are larger than this example, so finding out which variables reduce the number of observations the most is more tedious.
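For reference, the kind of cumulative count described above could be sketched as a small loop. This is only an illustration, not an existing command: it assumes the variable names from the example regression, and it counts the rows that are still complete as each variable is added in the order listed, so the result depends on that order.

```stata
* Sketch only: variable names are taken from the example regression above.
* Run as a do-file so the temporary variable is cleaned up afterwards.
tempvar touse
generate byte `touse' = 1

foreach v of varlist goals rankdiff teamvalue coachexperience weather {
    * drop rows from the running sample where the current variable is missing
    quietly replace `touse' = 0 if missing(`v')
    * count rows that are complete for all variables listed so far
    quietly count if `touse'
    display as text %-20s "`v'" as result %12.0fc r(N) as text " observations left"
}
```

The last number printed should match the N that regress reports for the full model, provided no if/in restrictions or weights are used. Stata's built-in misstable summarize and misstable patterns commands may also be worth a look, since they describe missing values per variable and by pattern.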
I would appreciate any help regarding this matter.
Thank you very much,
Björn.