Sunday, September 28, 2014

Bayesian models in R

There are many ways to run general Bayesian calculations in or from R. The best known are JAGS, OpenBUGS and STAN. Then some time ago Rasmus Bååth had a post Three ways to run Bayesian models in R in which he mentioned LaplacesDemon (not on CRAN) on top of those. A check of the Bayes task view gives 'MCMCpack (...) contains a generic Metropolis sampler that can be used to fit arbitrary models', 'The mcmc package consists of an R function for a random-walk Metropolis algorithm for a continuous random vector' and 'Note that rcppbugs is a package that attempts to provide a pure R alternative to using OpenBUGS/WinBUGS/JAGS for MCMC'. 
Clearly there is a wealth of approaches and each will have its strengths and weaknesses. To get an idea on their comparison I decided to run a number of calculations through all of them. The source of these calculations will be the fine SAS manual, where PROC MCMC has 21 examples. The first example, which I will try this week, is drawing samples from a number of distributions, from which I selected the mixture of three normal distributions.

Task

Draw samples from a mixture of three normal distributions. Let me start by stating that I would not use anything as complex as an MCMC sampler for this task, my personal SAS solution would be:
data p1;
    do i=1 to 1000;
        s=rand('table', .3, .4);
        select (s);
            when (1) r=rand('normal', -3, 2);
            when (2) r=rand('normal', 2, 1);
            when (3) r=rand('normal', 10, 4);
        end;
        output;
    end;
    keep r;
run;

proc sgplot data=p1;
    density r / type=kernel (c=.5);
run;
The equivalent R code:
library(plyr)
library(dplyr)
nu <- c(-3,2,10)
sd <- c(2,1,4)
tau <- sd^-2
wgt <- c(.3,.4,.3)

sampls <- sample(1:3,1000,replace=TRUE,wgt)
sampls <- rnorm(1000,nu[sampls],sd[sampls])
density(sampls) %>%    plot

MCMCpack

Just having a density was enough for MCMCpack. The samples have quite a long autocorrelation but do follow the distribution(not shown).
library(MCMCpack)
mydens <- function(x,...) {
    step1 <- sum(wgt*dnorm(x,nu,sd))
    log(step1)
}
mcmcpack <- MCMCmetrop1R(mydens,theta.init=1)
acf(mcmcpack)

mcmc

mcmc has the property that its result is not the samples, but rather summary statistics of the samples. Hence it does not give samples from the desired distribution.

JAGS

The JAGS samples had no autocorrelation and followed the distribution.
library(R2jags)
jmodel <- function() {
    i ~ dcat(wgt)
    nunu <- (i==1)*nu[1]+
            (i==2)*nu[2]+
            (i==3)*nu[3]
        tautau <- (i==1)*tau[1]+
            (i==2)*tau[2]+
            (i==3)*tau[3]
       
    p ~ dnorm(nunu,tautau)
}

datain <- list(nu=nu,tau=tau,wgt=wgt)
parameters <- c('i','p')

jj <- jags(model.file=jmodel,data=datain,parameters=parameters,DIC=FALSE)

STAN

I did not manage to get STAN to give me the samples.

rccpbugs

I did not manage to get rccpbugs to give me the samples.

LaplacesDemon

LaplacesDemon feels like the most complex of direct R solutions.  
library('LaplacesDemon')
mon.names <- "LP"
parm.names <- 'x'
MyData <- list(mon.names=mon.names, parm.names=parm.names)
N <-1
Model <- function(parm, Data)
{
    x <- parm[1]
    LL <- sum(wgt*dnorm(x,nu,sd)) %>% log
    Modelout <- list(LP=LL, Dev=-2*LL, Monitor=LL,
        yhat=x, parm=parm)
    return(Modelout)
}
Initial.Values <- 1
Fit <- LaplacesDemon(Model, Data=MyData, Initial.Values)

No comments:

Post a Comment