This is something that has puzzled me for a while but I haven't pursued it because I use rstan for Bayesian modelling. However, Stata appears to have a lot of capability with Bayesian commands so I am keen to use it. I always seem to get wildly different answers with Stata which don't make sense so I'm obviously doing something wrong. I've had to reproduce a whole dataset here to illustrate the problem:
Code:
* Example generated by -dataex-. For more info, type help dataex clear input int time byte discoveries float exp_time 1860 5 1 1861 3 1 1862 0 1 1863 2 1 1864 0 1 1865 3 1 1866 2 1 1867 3 1 1868 6 1 1869 1 1 1870 2 1 1871 1 1 1872 2 1 1873 1 1 1874 3 1 1875 3 1 1876 3 1 1877 5 1 1878 2 1 1879 4 1 1880 4 1 1881 0 1 1882 2 1 1883 3 1 1884 7 1 1885 12 1 1886 3 1 1887 10 1 1888 9 1 1889 2 1 1890 3 1 1891 7 1 1892 7 1 1893 2 1 1894 3 1 1895 3 1 1896 6 1 1897 2 1 1898 4 1 1899 3 1 1900 5 1 1901 2 1 1902 2 1 1903 4 1 1904 0 1 1905 4 1 1906 2 1 1907 5 1 1908 2 1 1909 3 1 1910 3 1 1911 6 1 1912 5 1 1913 8 1 1914 3 1 1915 6 1 1916 6 1 1917 0 1 1918 5 1 1919 2 1 1920 2 1 1921 2 1 1922 6 1 1923 3 1 1924 4 1 1925 4 1 1926 2 1 1927 2 1 1928 4 1 1929 7 1 1930 5 1 1931 3 1 1932 3 1 1933 0 1 1934 2 1 1935 2 1 1936 2 1 1937 1 1 1938 3 1 1939 4 1 1940 2 1 1941 2 1 1942 1 1 1943 1 1 1944 1 1 1945 2 1 1946 1 1 1947 4 1 1948 4 1 1949 3 1 1950 2 1 1951 1 1 1952 4 1 1953 1 1 1954 1 1 1955 1 1 1956 0 1 1957 0 1 1958 2 1 1959 0 1 end
Rstan with a gamma prior on the mean of the Poisson likelihood gives a sensible answer of 3.11 or 3.12. I've reproduced the code and the results for this here for those unfamiliar with rstan:
Code:
library(rstan)
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)
model_string <- "
data {
int n1;
int y1[n1];
}
parameters {
real<lower=0> lambda1;
}
model {  
lambda1 ~ gamma(3,0.5);
y1 ~ poisson(lambda1);
}
generated quantities {
}
"
data_list <- list(y1 = problem10_3_1$discoveries,n1 = length(problem10_3_1$discoveries))
# Compiling and producing posterior samples from the model.
stan_samples <- stan(model_code = model_string, data = data_list)
stan_samplesCode:
Inference for Stan model: 67a64cf3a1044b4ab88f343ac21d4848.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.
         mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
lambda1  3.11    0.00 0.17  2.78  3.00  3.11  3.22  3.46  1457    1
lp__    42.11    0.02 0.69 40.08 41.96 42.38 42.54 42.58  1967    1
Samples were drawn using NUTS(diag_e) at Wed Apr 28 15:33:48 2021.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).Code:
poisson discoveries,exposure(exp_time) irr
Code:
bayes, prior({discoveries:_cons}, gamma(3,1)) : poisson discoveries, exposure(exp_time)[CODE][bayes : poisson discoveries, exposure(exp_time)/CODE]
or
Code:
bayesmh discoveries, likelihood(poisson) prior({discoveries: _cons}, flat)Code:
bayesmh discoveries, likelihood(poisson) prior({discoveries: _cons}, gamma(9,3))Sorry for the long post but any advice about getting started with Bayes in Stata would be appreciated. I think there is a StataCorp course-probably should take that!
Regards
Chris
0 Response to Doing Bayesian modelling in Stata
Post a Comment