I'm somewhat puzzled by the results from two very simple intercept-only models, one fitted with the - bayes - prefix and the other with the plain - regress - command.
Below, a toy example:
Code:
. sysuse auto
(1978 Automobile Data)

. su price mpg

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       price |         74    6165.257    2949.496       3291      15906
         mpg |         74     21.2973    5.785503         12         41

. regress price

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(0, 73)        =      0.00
       Model |           0         0           .   Prob > F        =         .
    Residual |   635065396        73  8699525.97   R-squared       =    0.0000
-------------+----------------------------------   Adj R-squared   =    0.0000
       Total |   635065396        73  8699525.97   Root MSE        =    2949.5

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   6165.257   342.8719    17.98   0.000     5481.914      6848.6
------------------------------------------------------------------------------

. regress mpg

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(0, 73)        =      0.00
       Model |           0         0           .   Prob > F        =         .
    Residual |  2443.45946        73  33.4720474   R-squared       =    0.0000
-------------+----------------------------------   Adj R-squared   =    0.0000
       Total |  2443.45946        73  33.4720474   Root MSE        =     5.7855

------------------------------------------------------------------------------
         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |    21.2973   .6725511    31.67   0.000      19.9569    22.63769
------------------------------------------------------------------------------

. */ Let's check a Bayesian approach

. bayes: regress mpg

Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood:
  mpg ~ regress({mpg:_cons},{sigma2})

Priors:
  {mpg:_cons} ~ normal(0,10000)
      {sigma2} ~ igamma(.01,.01)
------------------------------------------------------------------------------

Bayesian linear regression                       MCMC iterations  =     12,500
Random-walk Metropolis-Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =         74
                                                 Acceptance rate  =       .421
                                                 Efficiency:  min =      .2193
                                                              avg =      .2246
Log marginal likelihood = -244.90098                          max =      .2299

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
mpg          |
       _cons |   21.3047   .6779059   .014139   21.30774   19.93476   22.61922
-------------+----------------------------------------------------------------
      sigma2 |  34.26699   5.788511   .123607   33.69243   24.66202   47.59297
------------------------------------------------------------------------------
Note: Default priors are used for model parameters.

. */ so far, so good
. */ let's check with another variable

. bayes: regress price

Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood:
  price ~ regress({price:_cons},{sigma2})

Priors:
  {price:_cons} ~ normal(0,10000)
       {sigma2} ~ igamma(.01,.01)
------------------------------------------------------------------------------

Bayesian linear regression                       MCMC iterations  =     12,500
Random-walk Metropolis-Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =         74
                                                 Acceptance rate  =      .6707
                                                 Efficiency:  min =     .01234
                                                              avg =     .09647
Log marginal likelihood = -763.50183                          max =      .1806

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
price        |
       _cons |  99.62768   100.5112   2.36511   102.1462  -98.38006   295.3294
-------------+----------------------------------------------------------------
      sigma2 |  4.66e+07    7703121    693447   4.60e+07   3.36e+07   6.30e+07
------------------------------------------------------------------------------
Note: Default priors are used for model parameters.
Note: Adaptation tolerance is not met in at least one of the blocks.

. */ ops, that's quite a difference! We see there is a problem with tolerance...
. */ let's try to "fix" this with a block for the variance plus extra burn-in, etc.

. bayes, block({sigma2}) burnin(10000) mcmcsize(20000) adaptation(tolerance(0.8)) gibbs: regress price

Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood:
  price ~ normal({price:_cons},{sigma2})

Priors:
  {price:_cons} ~ normal(0,10000)
       {sigma2} ~ igamma(.01,.01)
------------------------------------------------------------------------------

Bayesian linear regression                       MCMC iterations  =     30,000
Metropolis-Hastings and Gibbs sampling           Burn-in          =     10,000
                                                 MCMC sample size =     20,000
                                                 Number of obs    =         74
                                                 Acceptance rate  =          1
                                                 Efficiency:  min =    .001135
                                                              avg =      .5006
Log marginal likelihood = -860.1099                           max =          1

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
price        |
       _cons |   483.244   95.47607   .675118   482.3642   295.6941    670.667
-------------+----------------------------------------------------------------
      sigma2 |   8699171   90.86103    19.072    8699157    8699031    8699347
------------------------------------------------------------------------------
Note: Default priors are used for model parameters.
Note: There is a high autocorrelation after 500 lags.

. */ we see there is high autocorrelation. Let's try to tackle this issue

. bayes, block({sigma2}) burnin(20000) mcmcsize(30000) adaptation(tolerance(0.8)) gibbs: regress price

Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood:
  price ~ normal({price:_cons},{sigma2})

Priors:
  {price:_cons} ~ normal(0,10000)
       {sigma2} ~ igamma(.01,.01)
------------------------------------------------------------------------------

Bayesian linear regression                       MCMC iterations  =     50,000
Metropolis-Hastings and Gibbs sampling           Burn-in          =     20,000
                                                 MCMC sample size =     30,000
                                                 Number of obs    =         74
                                                 Acceptance rate  =          1
                                                 Efficiency:  min =    .001019
                                                              avg =      .5005
Log marginal likelihood = -859.21765                          max =          1

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
price        |
       _cons |  483.0878     96.4316   .556748   483.0736   291.8467   671.6999
-------------+----------------------------------------------------------------
      sigma2 |   8699465     218.056   39.4317    8699483    8699049    8699774
------------------------------------------------------------------------------
Note: Default priors are used for model parameters.
Note: There is a high autocorrelation after 500 lags.

. */ I know I can enlarge the thinning. In order to decrease the time of analysis, I selected a short MCMC size and burn-in period, but I chose Gibbs sampling and a large thinning.

. bayes, block({sigma2}) burnin(250) mcmcsize(1000) adaptation(tolerance(0.8)) thinning(600) gibbs: regress price
note: discarding every 599 sample observations; using observations 1 and 601

Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood:
  price ~ normal({price:_cons},{sigma2})

Priors:
  {price:_cons} ~ normal(0,10000)
       {sigma2} ~ igamma(.01,.01)
------------------------------------------------------------------------------

Bayesian linear regression                       MCMC iterations  =    599,651
Metropolis-Hastings and Gibbs sampling           Burn-in          =        250
                                                 MCMC sample size =      1,000
                                                 Number of obs    =         74
                                                 Acceptance rate  =          1
                                                 Efficiency:  min =    .005903
                                                              avg =       .503
Log marginal likelihood = -858.13439                          max =          1

------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
price        |
       _cons |   481.385   98.38691   3.11127   481.6254   294.3904   686.2036
-------------+----------------------------------------------------------------
      sigma2 |   8699642   620.4574   255.378    8699604    8698622    8700712
------------------------------------------------------------------------------
Note: Default priors are used for model parameters.
Note: Adaptation continues during simulation.
That being said, with these commands I found a difference of roughly 1000% for the variable price between - regress - and - bayes: regress -: the OLS constant is 6165.257, while the posterior means above range from about 100 to about 483.
High autocorrelation seems to be one of the culprits, at least in part.
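For reference, here is a minimal way to look at that autocorrelation directly (assuming one of the - bayes: regress price - fits above is still the active estimation):
Code:
. * sketch: visual convergence and autocorrelation checks for the current bayes fit
. bayesgraph diagnostics {price:_cons} {sigma2}
The trace, autocorrelation, and density panels make the slow mixing of {sigma2} visible at a glance.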
According to the Stata Manual:
Once convergence is established, the presence of high autocorrelation will
typically mean low precision for some parameter estimates in the model.
Depending on the magnitudes of the parameters and your research objective,
you may be satisfied with the obtained precision, in which case you can
ignore the reported note. If the level of precision is unacceptable, you
may try to reduce autocorrelation in your model. We recommend you try to
do it even if the level of precision is acceptable to you.
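The precision the manual refers to can be quantified after any of the fits above; a minimal sketch, assuming the last - bayes - fit is still in memory:
Code:
. * effective sample sizes, correlation times, and efficiencies per parameter
. bayesstats ess {price:_cons} {sigma2}
A tiny effective sample size for {sigma2} would turn the "low precision" warning into a concrete number.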
That being said, the diagnostics indicate that the chain converged.
Even so, the disparity is enormous, to say the least.
I wonder why such a large disparity occurs with price but not with mpg, in the very same dataset.
Also, I'd like to know what to do in such cases, for example, when we wish to compare several variables under an uninformative "default" prior.
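For instance, one thing I could imagine trying (only a sketch, and whether it is sensible is part of my question) is stating vague priors explicitly instead of relying on the defaults, assuming the - flat - and - jeffreys - prior specifications of - bayesmh - carry over to the - bayes - prefix; the seed is arbitrary:
Code:
. * sketch: explicitly vague priors instead of the defaults, same seed for both
. bayes, prior({price:_cons}, flat) prior({sigma2}, jeffreys) rseed(17): regress price
. bayes, prior({mpg:_cons}, flat) prior({sigma2}, jeffreys) rseed(17): regress mpg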