## Sunday, September 28, 2014

### Bayesian models in R

There are many ways to run general Bayesian calculations in or from R. The best known are JAGS, OpenBUGS and STAN. Then some time ago Rasmus Bååth had a post Three ways to run Bayesian models in R in which he mentioned LaplacesDemon (not on CRAN) on top of those. A check of the Bayes task view gives 'MCMCpack (...) contains a generic Metropolis sampler that can be used to fit arbitrary models', 'The mcmc package consists of an R function for a random-walk Metropolis algorithm for a continuous random vector' and 'Note that rcppbugs is a package that attempts to provide a pure R alternative to using OpenBUGS/WinBUGS/JAGS for MCMC'.
Clearly there is a wealth of approaches and each will have its strengths and weaknesses. To get an idea on their comparison I decided to run a number of calculations through all of them. The source of these calculations will be the fine SAS manual, where PROC MCMC has 21 examples. The first example, which I will try this week, is drawing samples from a number of distributions, from which I selected the mixture of three normal distributions.

Draw samples from a mixture of three normal distributions. Let me start by stating that I would not use anything as complex as an MCMC sampler for this task, my personal SAS solution would be:
data p1;
do i=1 to 1000;
s=rand('table', .3, .4);
select (s);
when (1) r=rand('normal', -3, 2);
when (2) r=rand('normal', 2, 1);
when (3) r=rand('normal', 10, 4);
end;
output;
end;
keep r;
run;

proc sgplot data=p1;
density r / type=kernel (c=.5);
run;
The equivalent R code:
library(plyr)
library(dplyr)
nu <- c(-3,2,10)
sd <- c(2,1,4)
tau <- sd^-2
wgt <- c(.3,.4,.3)

sampls <- sample(1:3,1000,replace=TRUE,wgt)
sampls <- rnorm(1000,nu[sampls],sd[sampls])
density(sampls) %>%    plot

#### MCMCpack

Just having a density was enough for MCMCpack. The samples have quite a long autocorrelation but do follow the distribution(not shown).
library(MCMCpack)
mydens <- function(x,...) {
step1 <- sum(wgt*dnorm(x,nu,sd))
log(step1)
}
mcmcpack <- MCMCmetrop1R(mydens,theta.init=1)
acf(mcmcpack)

#### mcmc

mcmc has the property that its result is not the samples, but rather summary statistics of the samples. Hence it does not give samples from the desired distribution.

#### JAGS

The JAGS samples had no autocorrelation and followed the distribution.
library(R2jags)
jmodel <- function() {
i ~ dcat(wgt)
nunu <- (i==1)*nu+
(i==2)*nu+
(i==3)*nu
tautau <- (i==1)*tau+
(i==2)*tau+
(i==3)*tau

p ~ dnorm(nunu,tautau)
}

datain <- list(nu=nu,tau=tau,wgt=wgt)
parameters <- c('i','p')

jj <- jags(model.file=jmodel,data=datain,parameters=parameters,DIC=FALSE)

#### STAN

I did not manage to get STAN to give me the samples.

#### rccpbugs

I did not manage to get rccpbugs to give me the samples.

#### LaplacesDemon

LaplacesDemon feels like the most complex of direct R solutions.
library('LaplacesDemon')
mon.names <- "LP"
parm.names <- 'x'
MyData <- list(mon.names=mon.names, parm.names=parm.names)
N <-1
Model <- function(parm, Data)
{
x <- parm
LL <- sum(wgt*dnorm(x,nu,sd)) %>% log
Modelout <- list(LP=LL, Dev=-2*LL, Monitor=LL,
yhat=x, parm=parm)
return(Modelout)
}
Initial.Values <- 1
Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values)