Hi Stata Forum
I have an issue with a tobit regression that predicts values that are too high.
As part of our master's thesis, my colleague and I are trying to use a tobit regression to predict the proportion of Special Items (an accounting line item) that can be considered opportunistic.
Inspired by an American research paper, we are using Compustat data for all public Nordic companies as the basis for our predictions. We have recreated the variables from the American paper, and our coefficients resemble the original ones when we run the tobit regression.
The company observations are grouped by industry and year, and we have created the variable IndYear to identify these groups. IndYear groups with fewer than 30 observations are dropped.
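The grouping step described above could look roughly like the sketch below. This is only an illustration under assumptions: Industry, FiscalYear and CellN are hypothetical names, not the variables actually used in the thesis data.

```stata
* Hypothetical names: Industry (industry code), FiscalYear, CellN
* Count the observations in each industry-year cell
bysort Industry FiscalYear: gen long CellN = _N
* Drop cells with fewer than 30 observations
drop if CellN < 30
* Number the remaining cells 1, 2, ... so that forvalues can loop over them
egen IndYear = group(Industry FiscalYear)
```

Creating IndYear after the drop keeps the group numbers consecutive, which is what a forvalues loop over 1/62 relies on.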
We run a forvalues loop over the IndYear groups to make IndYear-specific predictions, but the predicted SI (PredSI) is higher than the actual SI 75% of the time. In the American paper, the prediction was too high only 7% of the time.
The model should predict the “real” SI, and the residual represents the opportunistic part, so the residuals should not be negative.
We are using the following commands for the regression loop:
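* Container for the IndYear-specific predictions
gen PredSI = .
forvalues i = 1/62 {
    tempvar test
    * Fit the tobit separately within each IndYear group
    tobit SI $cntrl if IndYear == `i', vce(cluster CompanyName)
    * Expected value of SI censored to the interval [0, .5]
    predict `test', ystar(0,.5)
    replace PredSI = `test' if IndYear == `i'
}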
PredSI = Predicted Special Items
SI = Special Items
$cntrl holds 14 variables
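The share of over-predictions mentioned above can be checked with something like the following sketch. It assumes PredSI has already been filled by the loop; Resid and OverPred are hypothetical names introduced only for this check.

```stata
* Assumes PredSI has been filled by the loop above
* Resid and OverPred are hypothetical names used only for this check
gen double Resid = SI - PredSI
* The if qualifier keeps missing values from being counted as "greater than"
gen byte OverPred = (PredSI > SI) if !missing(PredSI, SI)
* The mean of OverPred is the share of over-predicted observations
summarize OverPred
* The same share within each IndYear, to see whether a few groups drive it
tabstat OverPred, by(IndYear) statistics(mean n)
```

Looking at the share by IndYear may help to tell whether the over-prediction is spread evenly or concentrated in a handful of industry-years.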
All variables have been winsorized by IndYear using winsor2, with cuts(1 99), or with cuts(0 99) for variables that have a lower bound.
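For readers unfamiliar with winsor2, the winsorizing step might be written along these lines. This is a sketch only: Var1, Var2 and BoundedVar are hypothetical placeholders for the actual variables, and the replace option is an assumption about how the winsorized values were stored.

```stata
* winsor2 is community-contributed: ssc install winsor2
* Var1, Var2 and BoundedVar are hypothetical placeholder names
* Two-sided winsorizing at the 1st and 99th percentiles within each IndYear
winsor2 Var1 Var2, cuts(1 99) by(IndYear) replace
* Upper tail only, for a variable that has a lower bound
winsor2 BoundedVar, cuts(0 99) by(IndYear) replace
```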
Hope somebody can help!
Best regards
Mathias