Hi all

I am trying to understand the nature of my wage data and whether Tobit regression (or some other method suited to censored data) is more appropriate than OLS regression. The variable 'wage' records the income of the individuals in my data set. 7% of the individuals have a wage of '0' (because they are unemployed). In this case, is the data censored? (The mean is approx. 420.000, the median is approx. 388.000, skewness = -0.53, kurtosis = 5.27.)

This link provides an example of censored data as: "There are a number of customers in a mall (buyers and non-buyers). In censored data, non-buyers' value will be counted as zero while buyers' consumption will be observed. In truncated data only buyers' data will be in the sample."

In my understanding, data is censored when the information we have on a variable is inexact above or below some threshold. An example of a censored wage variable would be one that contains the exact wage of every individual except those who earn less than 25.000 a year, whose values are recorded only as '<25.000'. But in my data and in the example from the link, the information is not inexact: the individuals in my data really do earn '0' because they are unemployed, and the customers in the mall really do buy '0'. This makes me conclude that my data and the data in the example are not censored, but maybe I am getting something wrong here.
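
To make my understanding concrete, here is a minimal sketch in Python (NumPy) of what I mean by censored versus truncated data. Everything in it is an assumption for illustration only: the wages are simulated rather than taken from my data set, and the distribution, the sample size, and the names `latent`, `threshold`, `censored` and `truncated` are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" yearly wages (simulated, NOT my actual data).
latent = rng.normal(loc=400_000, scale=150_000, size=1_000)

# Hypothetical reporting threshold: wages below it are only known as '<25.000'.
threshold = 25_000

# Censored sample: every individual stays in the sample, but anyone below the
# threshold has the exact value replaced by the threshold itself.
is_censored = latent < threshold
censored = np.maximum(latent, threshold)

# Truncated sample: individuals below the threshold are dropped entirely.
truncated = latent[latent >= threshold]

print("individuals in full sample:     ", latent.size)
print("individuals in censored sample: ", censored.size)
print("values recorded as '<25.000':   ", int(is_censored.sum()))
print("individuals in truncated sample:", truncated.size)
```

In this sketch the censored sample keeps everyone but loses the exact value below the threshold, while the truncated sample loses those individuals altogether. My question is whether a wage of exactly '0' for unemployed people belongs in the first category at all, since nothing about that value is hidden.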

Could anyone explain whether my data and the data in the example are censored?