Hi all
I am trying to understand the nature of my wage data and whether Tobit regression (or some other method suited for censored data) is more appropriate than OLS regression. The variable 'wage' shows the income of the individuals in my data set. 7% of the individuals have a wage of '0' (because they are unemployed). In this case, are the data censored? (The mean is approx. 420.000, the median is approx. 388.000, skewness = -0.53, kurtosis = 5.27.)
This link provides an example of censored data as: "There are a number of customers in a mall (buyers and non-buyers). In censored data, non-buyers value will be counted as zero while buyers consumption will be observed. In truncated data only buyers data will be in the sample."
In my understanding, data are censored when the information we have on a variable is inexact above/below some threshold. So an example of a censored wage variable could be a variable containing the exact wage of all individuals except those who earn less than 25.000 a year, whose values would just be recorded as '<25.000'. But in my example and the example provided in the link, the information is not inexact, because the individuals in my data are indeed earning '0' because they are unemployed, and the customers in the mall are indeed buying '0'. This makes me conclude that my data and the data in the example are not censored, but maybe I am getting something wrong here.
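To make my understanding concrete, here is a minimal sketch in Python of what I mean by censoring and truncation at a threshold. The wage values, the threshold of 25.000, and the function names are all made up for illustration; this is not my actual data or any library's definition.

```python
# Hypothetical yearly wages; the first individual truly earns 0.
wages = [0, 12_000, 24_000, 25_000, 388_000, 420_000]
THRESHOLD = 25_000  # assumed reporting threshold, for illustration only


def censor_below(values, threshold):
    """Censored sample: everyone stays in the sample, but a wage below the
    threshold is recorded as the threshold, with a flag saying it was masked."""
    return [(max(v, threshold), v < threshold) for v in values]


def truncate_below(values, threshold):
    """Truncated sample: individuals below the threshold are dropped entirely."""
    return [v for v in values if v >= threshold]


censored = censor_below(wages, THRESHOLD)
truncated = truncate_below(wages, THRESHOLD)

# Censoring keeps all six individuals but hides three exact wages.
print(len(censored), sum(flag for _, flag in censored))
# Truncation keeps only the individuals at or above the threshold.
print(truncated)
```

In the censored list the recorded value differs from the true wage for the flagged individuals, whereas in my data a recorded '0' is the true wage, which is why I doubt that my zeros are censored in this sense.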
Could anyone explain whether my data and the data in the example are censored?