Wiekvoet: JAGS model Fe concentration in rainwater including values below detection level

In my previous post I ignored the fact that some data are below the detection level. I would not know how to handle those in a mixed model from lme4 or nlme. However, JAGS can handle these values. Next to that I kept the usual independent variables, such as random effects for location, machine etc.

Data

Data has been acquired and processed similar to last week. For script at the bottom of the page.

detection level

In the spreadsheet a '4' is used to indicate a value below detection limit. (* een " 4 "-teken in kolom C geeft aan dat de analyseresultaten beneden de detectielimiet liggen). There is something strange there. The detection limit seems to be 0.4, but there is data under 0.4 which is not flagged. In addition, some of the data is below 0, which seems improbable. In the end I flagged all data under 0.4 as NA.

Fe <- rf2[rf2$Compound==levels(rf2$Compound)[9],]

library(lattice)

densityplot( ~ Amount | machine + Status,

groups=Location,data=Fe[!is.na(Fe$Status)

& Fe$Amount<.8,],

panel=function(...) {

panel.densityplot(...)

panel.abline(v=c(0,0.4))

},adjust=.5

)

Analysis

It is a reasonable standard JAGS model. The only tricky thing is initialization. When the parameters are too far away an error is thrown. Hence 'intercept' is initialized as -1. In text the model and inits are given, at the end the full code.

FeModel <- function() {

for (i in 1:nobs) {

BLZ[i] ~ dinterval( LAmount[i] , -0.4034029 )

ExpLAmount[i] <- intercept +

rate*time[i] +

rateL[Location[i]]*time[i]+

FMach[Machine[i]] +

FLoc[Location[i]]

LAmount[i] ~ dnorm(ExpLAmount[i],tau[Machine[i]])

}

for (i in 1:2) {

tau[i] <- pow(sd[i],-2)

sd[i] ~ dunif(0,1000)

}

intercept ~ dnorm(0,.0001)

FMach[1] ~dnorm(0,tauMachShift)

FMach[2] <- -FMach[1]

tauMachShift <- pow(sdMachShift,-2)

sdMachShift ~ dunif(0,10)

for (iL in 1:nloc) {

FLoc[iL] ~ dnorm(0,tauLoc)

rateL[iL] ~dnorm(0,tauRate)

yearlyL[iL] <- pow(pow(10,rateL[iL]+rate),365)

}

tauLoc <- pow(sdLoc,-2)

sdLoc ~ dunif(0,100)

tauRate <- pow(sdRate,-2)

sdRate ~ dunif(0,100)

rate ~ dnorm(0,.0001)

dailydec <- pow(10,rate)

yearlydec <- pow(dailydec,365)

}

inits <- function() {

list(intercept = -1,#rnorm(1,.45,.04),

rate=rnorm(0,.01),

sd=runif(2,0,50),

sdMachShift=runif(1,0,1),

sdLoc=runif(1,0,1))

}

Results

The model shows the decrease of 5% per year which was expected.

Machine 1 (Eigenbrodt NSA 181 KHT) is slightly less accurate (was also observed last week) and has slightly lower values than 2 (Van Essen/ECN vanger). Variation between locations is less than variation between machines.

Inference for Bugs model at "C:/.../AppData/Local/Temp/Rtmpma5UEx/modelc40761211ba.txt", fit using jags,

4 chains, each with 5000 iterations (first 2500 discarded), n.thin = 2

n.sims = 5000 iterations saved

mu.vect sd.vect 2.5% 25% 50% 75% 97.5%

FLoc[1] 0.023 0.070 -0.110 -0.022 0.021 0.065 0.169

FLoc[2] 0.139 0.081 -0.038 0.091 0.147 0.195 0.280

FLoc[3] -0.041 0.069 -0.180 -0.083 -0.041 0.002 0.098

FLoc[4] -0.103 0.080 -0.289 -0.147 -0.094 -0.048 0.032

FLoc[5] -0.072 0.072 -0.209 -0.117 -0.075 -0.027 0.078

FLoc[6] -0.036 0.070 -0.173 -0.079 -0.036 0.008 0.102

FLoc[7] -0.016 0.069 -0.154 -0.058 -0.016 0.027 0.117

FLoc[8] 0.278 0.079 0.127 0.229 0.274 0.323 0.453

FLoc[9] -0.061 0.089 -0.243 -0.118 -0.060 0.000 0.109

FLoc[10] 0.052 0.069 -0.091 0.009 0.052 0.095 0.186

FLoc[11] 0.040 0.079 -0.131 -0.008 0.048 0.096 0.172

FLoc[12] -0.102 0.077 -0.245 -0.153 -0.106 -0.054 0.064

FLoc[13] -0.054 0.082 -0.210 -0.107 -0.056 -0.005 0.117

FLoc[14] 0.027 0.067 -0.112 -0.013 0.028 0.070 0.155

FLoc[15] -0.031 0.070 -0.164 -0.076 -0.033 0.011 0.116

FLoc[16] 0.088 0.070 -0.050 0.045 0.087 0.131 0.227

FLoc[17] -0.154 0.076 -0.293 -0.203 -0.157 -0.106 0.006

FMach[1] -0.053 0.017 -0.087 -0.065 -0.053 -0.041 -0.018

FMach[2] 0.053 0.017 0.018 0.041 0.053 0.065 0.087

intercept 0.401 0.081 0.240 0.345 0.402 0.455 0.559

sd[1] 0.436 0.018 0.402 0.423 0.436 0.448 0.471

sd[2] 0.386 0.007 0.372 0.381 0.386 0.391 0.401

sdLoc 0.135 0.036 0.077 0.110 0.130 0.154 0.216

sdMachShift 1.854 2.456 0.033 0.170 0.675 2.587 8.820

yearlyL[1] 0.944 0.007 0.931 0.940 0.944 0.948 0.957

yearlyL[2] 0.950 0.007 0.938 0.945 0.949 0.954 0.965

yearlyL[3] 0.945 0.006 0.932 0.941 0.945 0.949 0.958

yearlyL[4] 0.949 0.007 0.936 0.944 0.948 0.953 0.963

yearlyL[5] 0.943 0.007 0.929 0.939 0.944 0.948 0.956

yearlyL[6] 0.945 0.006 0.932 0.941 0.945 0.949 0.957

yearlyL[7] 0.946 0.006 0.933 0.941 0.946 0.950 0.958

yearlyL[8] 0.945 0.006 0.932 0.941 0.945 0.949 0.957

yearlyL[9] 0.944 0.007 0.929 0.940 0.945 0.949 0.958

yearlyL[10] 0.946 0.006 0.933 0.942 0.946 0.950 0.958

yearlyL[11] 0.950 0.006 0.938 0.946 0.950 0.954 0.964

yearlyL[12] 0.942 0.007 0.928 0.938 0.943 0.947 0.955

yearlyL[13] 0.944 0.007 0.930 0.939 0.944 0.948 0.956

yearlyL[14] 0.947 0.006 0.935 0.943 0.947 0.951 0.959

yearlyL[15] 0.944 0.006 0.931 0.940 0.944 0.948 0.956

yearlyL[16] 0.946 0.006 0.934 0.942 0.946 0.950 0.958

yearlyL[17] 0.942 0.007 0.926 0.938 0.943 0.947 0.956

yearlydec 0.945 0.005 0.936 0.942 0.945 0.949 0.955

deviance 1729.603 52.705 1627.020 1693.551 1728.751 1765.143 1831.422

Rhat n.eff

FLoc[1] 1.002 2200

FLoc[2] 1.001 4000

FLoc[3] 1.001 5000

FLoc[4] 1.001 5000

FLoc[5] 1.001 5000

FLoc[6] 1.002 1500

FLoc[7] 1.001 5000

FLoc[8] 1.001 4400

FLoc[9] 1.002 1600

FLoc[10] 1.001 5000

FLoc[11] 1.001 2800

FLoc[12] 1.002 2200

FLoc[13] 1.001 5000

FLoc[14] 1.001 5000

FLoc[15] 1.001 2800

FLoc[16] 1.001 4800

FLoc[17] 1.004 740

FMach[1] 1.004 760

FMach[2] 1.004 760

intercept 1.001 3100

sd[1] 1.007 380

sd[2] 1.002 1500

sdLoc 1.003 1600

sdMachShift 1.001 5000

yearlyL[1] 1.001 3000

yearlyL[2] 1.001 4100

yearlyL[3] 1.002 1600

yearlyL[4] 1.001 2800

yearlyL[5] 1.002 1400

yearlyL[6] 1.002 2300

yearlyL[7] 1.001 5000

yearlyL[8] 1.001 5000

yearlyL[9] 1.001 4000

yearlyL[10] 1.001 3500

yearlyL[11] 1.001 5000

yearlyL[12] 1.002 1400

yearlyL[13] 1.001 5000

yearlyL[14] 1.002 1400

yearlyL[15] 1.001 3100

yearlyL[16] 1.002 1500

yearlyL[17] 1.004 850

yearlydec 1.002 1800

deviance 1.003 900

For each parameter, n.eff is a crude measure of effective sample size,

and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)

pD = 1385.1 and DIC = 3114.7

DIC is an estimate of expected predictive error (lower deviance is better).

Code

library(R2jags)

# http://www.lml.rivm.nl/data/gevalideerd/data/lmre_1992-2005.xls

# http://www.r-bloggers.com/read-excel-files-from-r/

library(XLConnect)

readsheet <- function(wb,cursheet) {

df = readWorksheet(wb, sheet = cursheet , header = TRUE)

topline <- 8

test <- which(df[6,]=='C')+1

numin <- df[topline:nrow(df),test]

names(numin) <- make.names( gsub(' ','',df[6,test]))

for(i in 1:ncol(numin)) numin[,i] <- as.numeric(gsub(',','.',numin[,i]))

status = df[topline:nrow(df),test-1]

names(status) <- paste('C',make.names( gsub(' ','',df[6,test])),sep='')

numin$StartDate <- as.Date(substr(df[topline:nrow(df),1],1,10))

numin$EndDate <- as.Date(substr(df[topline:nrow(df),2],1,10))

numin <- cbind(numin,status)

tall <- reshape(numin,

varying=list(make.names( gsub(' ','',df[6,test])),paste('C',make.names( gsub(' ','',df[6,test])),sep='')),

v.names=c('Amount','Status'),

timevar='Compound',

idvar=c('StartDate','EndDate'),

times=paste(df[6,test],'(',df[5,test],')',sep=''),

direction='long')

tall$Compound <- factor(gsub(' )',')',gsub(' +',' ',tall$Compound)))

row.names(tall)<- NULL

tall$Location <- cursheet

tall

}

readfile <- function(fname) {

wb <- loadWorkbook(fname)

print(wb)

sheets <- getSheets(wb)[-1]

tt <- lapply(sheets,readsheet,wb=wb)

tt2 <- do.call(rbind,tt)

tt2$Location <- factor(tt2$Location)

tt2$fname <- fname

tt2

}

fnames <- dir(pattern='*.xls') #.\\. ?

rf <- lapply(fnames,readfile)

rf2 <- do.call(rbind,rf)

rf2$machine <- factor(ifelse(rf2$fname=="lmre_1992-2005.xls",

'Van Essen/ECN vanger',

' Eigenbrodt NSA 181 KHT'))

Fe <- rf2[rf2$Compound==levels(rf2$Compound)[9],]

library(lattice)

densityplot( ~ Amount | machine + Status,

groups=Location,data=Fe[!is.na(Fe$Status)

& Fe$Amount<.8,],

panel=function(...) {

panel.densityplot(...)

panel.abline(v=c(0,0.4))

},adjust=.5

)

Fe$LAmount <- log10(Fe$Amount)

Fe$LAmount[Fe$Amount<0.4] <- NA

Fe2 <- Fe[!is.na(Fe$Status),]

datain <- list(

LAmount = Fe2$LAmount,

BLZ = !is.na(Fe2$LAmount),

Machine=(Fe2$machine=='Van Essen/ECN vanger')+1,

Location=(1:nlevels(Fe2$Location))[Fe2$Location],

time=as.numeric(Fe2$StartDate),

nobs =nrow(Fe2),

nloc=nlevels(Fe2$Location)

)