This is something that has puzzled me for a while but I haven't pursued it because I use rstan for Bayesian modelling. However, Stata appears to have a lot of capability with Bayesian commands so I am keen to use it. I always seem to get wildly different answers with Stata which don't make sense so I'm obviously doing something wrong. I've had to reproduce a whole dataset here to illustrate the problem:
Code:
* Example generated by -dataex-. For more info, type help dataex clear input int time byte discoveries float exp_time 1860 5 1 1861 3 1 1862 0 1 1863 2 1 1864 0 1 1865 3 1 1866 2 1 1867 3 1 1868 6 1 1869 1 1 1870 2 1 1871 1 1 1872 2 1 1873 1 1 1874 3 1 1875 3 1 1876 3 1 1877 5 1 1878 2 1 1879 4 1 1880 4 1 1881 0 1 1882 2 1 1883 3 1 1884 7 1 1885 12 1 1886 3 1 1887 10 1 1888 9 1 1889 2 1 1890 3 1 1891 7 1 1892 7 1 1893 2 1 1894 3 1 1895 3 1 1896 6 1 1897 2 1 1898 4 1 1899 3 1 1900 5 1 1901 2 1 1902 2 1 1903 4 1 1904 0 1 1905 4 1 1906 2 1 1907 5 1 1908 2 1 1909 3 1 1910 3 1 1911 6 1 1912 5 1 1913 8 1 1914 3 1 1915 6 1 1916 6 1 1917 0 1 1918 5 1 1919 2 1 1920 2 1 1921 2 1 1922 6 1 1923 3 1 1924 4 1 1925 4 1 1926 2 1 1927 2 1 1928 4 1 1929 7 1 1930 5 1 1931 3 1 1932 3 1 1933 0 1 1934 2 1 1935 2 1 1936 2 1 1937 1 1 1938 3 1 1939 4 1 1940 2 1 1941 2 1 1942 1 1 1943 1 1 1944 1 1 1945 2 1 1946 1 1 1947 4 1 1948 4 1 1949 3 1 1950 2 1 1951 1 1 1952 4 1 1953 1 1 1954 1 1 1955 1 1 1956 0 1 1957 0 1 1958 2 1 1959 0 1 end
Rstan with a gamma prior on the mean of the Poisson likelihood gives a sensible answer of 3.11 or 3.12. I've reproduced the code and the results for this here for those unfamiliar with rstan:
Code:
library(rstan) options(mc.cores = parallel::detectCores()) rstan_options(auto_write = TRUE) model_string <- " data { int n1; int y1[n1]; } parameters { real<lower=0> lambda1; } model { lambda1 ~ gamma(3,0.5); y1 ~ poisson(lambda1); } generated quantities { } " data_list <- list(y1 = problem10_3_1$discoveries,n1 = length(problem10_3_1$discoveries)) # Compiling and producing posterior samples from the model. stan_samples <- stan(model_code = model_string, data = data_list) stan_samples
Code:
Inference for Stan model: 67a64cf3a1044b4ab88f343ac21d4848. 4 chains, each with iter=2000; warmup=1000; thin=1; post-warmup draws per chain=1000, total post-warmup draws=4000. mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat lambda1 3.11 0.00 0.17 2.78 3.00 3.11 3.22 3.46 1457 1 lp__ 42.11 0.02 0.69 40.08 41.96 42.38 42.54 42.58 1967 1 Samples were drawn using NUTS(diag_e) at Wed Apr 28 15:33:48 2021. For each parameter, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1).
Code:
poisson discoveries,exposure(exp_time) irr
Code:
bayes, prior({discoveries:_cons}, gamma(3,1)) : poisson discoveries, exposure(exp_time)
[CODE][bayes : poisson discoveries, exposure(exp_time)/CODE]
or
Code:
bayesmh discoveries, likelihood(poisson) prior({discoveries: _cons}, flat)
Code:
bayesmh discoveries, likelihood(poisson) prior({discoveries: _cons}, gamma(9,3))
Sorry for the long post but any advice about getting started with Bayes in Stata would be appreciated. I think there is a StataCorp course-probably should take that!
Regards
Chris
0 Response to Doing Bayesian modelling in Stata
Post a Comment