Friday, December 27, 2013

Sinterklaas and Santa Claus gave presents

In December my nice little netbook acquired a shiny new openSUSE 13.1, including all the goodies I might want to use in 2014: R, Stan, Julia and Jags. So here is what you might get when you set up a computer for statistics today.

Jags

I have been using Jags for quite some time, because I had the (incorrect) impression that openBugs development had stopped. I never noticed, but since September the most current version of Jags is 3.4. What did I miss since 3.3? This is how the NEWS file starts.

New features

  • You can now set a monitor of type "mean" for any node, which records the running mean.
  • The order function in the bugs module returns a permutation that sorts  its argument in ascending order (the inverse of the permutation provided by the rank function).
  • The Windows installer now offers a choice of personal or global installation for users with Admin access.
  • The negative binomial distribution is extended to allow parameters size=0 and probability=1.
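The order/rank relationship in the second bullet is easy to try out, since R's own order() and rank() functions behave the same way (this is plain R, not JAGS code):

```r
# order() returns the permutation that sorts x ascending;
# rank() returns the position of each element in that sorted order.
x <- c(30, 10, 20)
order(x)                # 2 3 1: x[order(x)] is sorted
rank(x)                 # 3 1 2
# For distinct values the two are each other's inverse permutation:
stopifnot(identical(order(order(x)), as.integer(rank(x))))
```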

Bug fixes

  • LogicalNode::isClosed could throw a logic_error due to a bug in the checkLinear function (Linear.cc)
  • Setting a monitor with thinning interval 0 prints an error message but no longer throws a logic_error exception.
  • The command line interface could print a spurious warning that ".RNG.seed" or ".RNG.state" was unused.
  • A function or distribution with zero arguments caused a segmentation fault. This now gives a compilation error.
  • The "dround" function (bugs module) was incorrectly calling fprec (round to n significant figures) when it should have been calling fround (round to n decimal places).
  • The sd() function divided by n (sample size) instead of (n-1). 
  • The Windows uninstaller did not correctly uninstall personal installations for users with administrative access.
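The sd() fix is the one that might silently change results. R's sd() has always used the unbiased n-1 divisor; a divide-by-n helper (sd_n is my own name, written to mimic what pre-3.4 JAGS effectively computed) shows the size of the difference:

```r
# R's sd() uses the n-1 divisor; a divide-by-n version for comparison:
sd_n <- function(x) sqrt(sum((x - mean(x))^2) / length(x))

x <- c(2, 4, 4, 4, 5, 5, 7, 9)
sd_n(x)                           # 2
sd(x)                             # 2.13809
# They differ by a factor sqrt((n-1)/n), which vanishes for large samples:
all.equal(sd_n(x), sd(x) * sqrt(7 / 8))
```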

R

Nothing new there compared to my Windows machine (version 3.0.2 (2013-09-25) -- "Frisbee Sailing"). It was built on the netbook rather than taken from the repository. Getting it running under Eclipse and StatET did take some effort, but that is the environment I prefer.

Julia

In my mind Julia is the response to R's lack of speed. It is a fresh start: a new language based on a modern understanding of how an interpreter should work, while keeping some of the good things of R.
Julia code promises to be as fast as C code. The last time I looked was August 2012. Seeing changes was no surprise; the amount of change was. Version 0.2 is out, including Windows installers (32 and 64 bit). There are packages to add to it. And there is an IPython interface which runs in the browser. This gives IJulia, see screen shot. While IPython came out of the box (repository), getting Julia running in it took some fiddling around. I played around a bit further than 4+5=9, but still wonder how it will work when running real computations. I'll have to try some Julia pretty soon.

Stan

Stan is an alternative to Jags/Bugs. It is not compatible: Stan code is converted to C++ and then compiled. This may be the solution for bigger or more complex models. Stan was one of the reasons for moving to Linux after a Windows period; I never could get a compiler to work with R on my Windows 7. Looking at the Stan website, I must say there is a load of documentation (a 400+ page manual). Examples? All the Bugs examples plus all examples from the ARM book. I'll have to convert some of my Jags code to Stan soon. The one example I ran to prove the software was working showed that the compilation gives a fair bit of overhead, but that is only to be expected.

Others

Engauge digitizer and plotdigitizer, for converting plots back to data. I saw them when browsing the repository. You never know when you will need to extract data from a plot, so these are installed too. Not tested yet, though.
NumPy, SciPy and Octave. I pulled them from the repository just in case; they might sit there unused for a long time.
Openbugs. It seems development continued after all. It can be built under Linux too, but on 64-bit Linux the 32-bit C development packages are required, which is the one thing stopping me. Updating my Windows computer seems to be in order, though.

Friday, December 20, 2013

Merry Christmas

Merry Christmas for all readers. Please enjoy my tree. 

code

part <- list(x0=0,y0=0,x1=0,y1=1,
    branch1=NULL,branch2=NULL,extend=NULL,
    lwd=1,depth=0,col='springgreen')

par(mfrow=c(1,1),mar=c(5, 4, 4, 2) + 0.1)
segplot <- function(tree) {
  if (is.null(tree)) return()
  segments(tree$x0,tree$y0,tree$x1,tree$y1,
      col=tree$col,
      lwd=tree$lwd)
  segplot(tree$branch1)
  segplot(tree$branch2)
  segplot(tree$extend)
}
#segplot(part)

grow <- function(tree) {
  if (is.null(tree) ) return(NULL)
  
  tree$lwd=tree$lwd*1.2
  
  if (tree$lwd>2.5) tree$col <- 'brown'
  if (is.null(tree$extend)) {
    tree$extend <- list(
        x0=tree$x1,
        y0=tree$y1,
        x1=rnorm(1,1,.03)*(2*tree$x1-tree$x0),
        y1=(rnorm(1,.98,.02)+.02*(tree$x1==tree$x0))*(2*tree$y1-tree$y0),
        branch1=NULL,
        branch2=NULL,
        extend=NULL,
        lwd=1,
        depth=tree$depth,
        col=tree$col
    )
    length=sqrt((tree$x1-tree$x0)^2 + (tree$y1-tree$y0)^2)
    angle <- asin((tree$x1-tree$x0)/length)
    branch <- list(
        x0=(tree$x1+tree$x0)/2,
        y0=(tree$y1+tree$y0)/2,
        branch1=NULL,
        branch2=NULL,
        extend=NULL,
        lwd=1,
        depth=tree$depth,
        col=tree$col
    )
    shift <- rnorm(2,.5,.1)
    branch$x0 <- shift[1]*tree$x1+(1-shift[1])*tree$x0
    branch$y0 <- shift[1]*tree$y1+(1-shift[1])*tree$y0
    length=length*rnorm(1,.5,.05)
    co <- runif(1,.35,.45)
    branch$x1 <- branch$x0+sin(angle+co)*length
    branch$y1 <- branch$y0+cos(angle+co)*length
    tree$branch1 <- branch
    branch$x0 <- shift[2]*tree$x1+(1-shift[2])*tree$x0
    branch$y0 <- shift[2]*tree$y1+(1-shift[2])*tree$y0
    co <- runif(1,.35,.45)
    branch$x1 <- branch$x0+sin(angle-co)*length
    branch$y1 <- branch$y0+cos(angle-co)*length
    tree$branch2 <- branch    
  } else {
    tree$branch1 <- grow(tree$branch1)
    tree$branch2 <- grow(tree$branch2)
    tree$extend <- grow(tree$extend)
  }
  tree$depth <- tree$depth+1
  if (tree$depth>2)  tree$col <- 'green'
  if (tree$depth>4)  tree$col <- 'darkgreen'
  if (tree$depth>6)  tree$col <- 'brown'
  
  tree
}
tree <- part
for (i in 1:9) tree <- grow(tree) 
par(mar=c(0,0,0,0))
plot(x=c(-3,3),y=c(0,9),type='n',axes=FALSE,xlab='',ylab='')
segplot(tree)

Sunday, December 15, 2013

plotting y and log(y) in one figure

Sometimes I want to plot data both on the linear and on the log scale. Simply stacking two separate figures wastes space: I want to reuse the x-axis, legend and title. This post examines possibilities to do so with standard plot tools, lattice and ggplot2.

Data

Data is completely artificial.
library(ggplot2)
library(lattice)

datastart <- data.frame(x=rep(1:5,2),
    y=c(1,2,10,50,1, .1,9,8,20,19),
    type=rep(c('a','b'),each=5))
datastart
   x    y type
1  1  1.0    a
2  2  2.0    a
3  3 10.0    a
4  4 50.0    a
5  5  1.0    a
6  1  0.1    b
7  2  9.0    b
8  3  8.0    b
9  4 20.0    b
10 5 19.0    b

standard plot tools

The trick here is to make two plots. The top plot has no x-axis, the bottom one no title. To make the two plot surfaces equal in size, the room reserved for the x-axis and the title is kept the same.

par(mfrow=c(2,1),mar=c(0,4.1,4,2))
plot(y~x,
    data=datastart,
    axes=FALSE,
    frame.plot=TRUE,
    xlab='',
    main='bla bla',
    col=c('red','green')[datastart$type])
legend(x='topleft',
    legend=c('a','b'),
    col=c('red','green'),
    pch=1)
axis(side=2,las=1)
par(mar=c(4,4.1,0,2))
plot(y~x,
    data=datastart,
    axes=FALSE,
    frame.plot=TRUE,
    xlab='x',
    log='y',
    ylab='log(y)',
    col=c('red','green')[datastart$type]
)
axis(side=2,las=1)
axis(side=1)

lattice

As far as I understand, lattice does not have the tools for this kind of extreme fiddling with axes. The trick then is to add a copy of the data on the logarithmic scale and control the labels manually. Lattice does not understand that the x-axes are the same, but some suppression is possible.
data1=datastart
data2=datastart
data1$lab='linear'
data2$lab='log'
data2$y <- log10(data2$y)
dataplot <- rbind(data1,data2)

at2 <- axisTicks(range(data2$y),log=TRUE,nint=4)
at1 <- axisTicks(range(data1$y),log=FALSE,nint=5)
atx <- axisTicks(range(data1$x),log=FALSE,nint=5)
dataplot$lab <- factor(dataplot$lab,levels=c('linear','log'))
xyplot(y  ~ x | lab,groups=type,data=dataplot,layout=c(1,2),
    main='bla bla',
    as.table=TRUE,
    auto.key=TRUE,
    scales=list(relation='free',
        x=list(at=list(NULL,atx)),
        y=list(at=list(at1,log10(at2)),
            labels=list(format(at1),format(at2))
        )
    ))

ggplot2

The trick here is that ggplot2 can have a free y-axis, but you cannot set the labels per panel. Instead there is one common y-axis with adapted labels. ggplot chooses the range for the y-axis itself, so you have to make sure that the labels you feed it match that range. To make that fit, in the end I just shifted the whole log part to a different range. Some of the lattice code is re-used.

dataplot2 <- dataplot
offset <- -10
breaks=c(-11,-10,-9)
dataplot2$y[dataplot2$lab=='log'] <- dataplot2$y[dataplot2$lab=='log'] +offset 
p <- ggplot(dataplot2, aes(x, y,colour=type)) + geom_point()
p + facet_grid(lab ~ .,
        scales='free_y') +
    labs(title = 'bla bla') +
    scale_y_continuous(
        breaks = c(breaks,at1),
        labels=c(format(10^(breaks-offset)),format(at1)))

Sunday, December 8, 2013

Fukushima region radiation decrease

Fukushima is still a topic which gets headlines, and somewhere in a comment was a link to current and historical radiation data: the Fukushima prefecture radioactivity measurement map. It takes a bit of time to load, but then you have a map of the Fukushima region with current radiation levels at measurement stations. Somewhere on that website there is data from the past year; this post examines the decrease at four of the stations.

Data

If you look at the map, there are loads of blue dots, indicating low radiation, and a line of yellow, orange and red dots striking north-east from the reactor. It is an interactive map: you can click on any dot and see some details for that region, and at the bottom of that is a link to a detail page. On that page there are, among other things, graphs such as this one.
直近90日間の空間線量の推移 (transition of air dose rates over the most recent 90 days)
One of these plots shows the 'transit of air dose rates in the most recent 365 days'. It is a dynamic plot; the link to it contains the data, just like the plot above, and it is pretty easy to extract the data from the link. Since getting the links was manual work, I decided to restrict my data to four dots. To discriminate between locations I copied some header text and a number, for example 【相双地方】室原公民館の測定値 (Google Translate: Measurements [phase] of bi-district community center Murohara) with number 704. The complete data with location information are at the bottom of the page.
There may be a simpler way to get the data, but since I don't speak Japanese, I cannot say.
Edit 9 December 2013: Ishigami, a Japanese reader, sent me better translations, so this post has been edited.

Analysis

Since the data look pretty choppy, I wanted a robust model. To quote the Robust task view: "robust depends on robustbase, the former providing convenient routines for the casual user where the latter will contain the underlying functionality, and provide the more advanced statistician with a large range of options for robust modeling." Regarding robust modeling I am certainly the casual user. Estimating a proportional decrease, via a linear model on log-transformed data, seemed an appropriate modelling strategy.
The plot below contains the data and the predictions. To me, the predictions look just like the ones I would make by hand, ignoring some of the crazy jumps.
The data from the different stations show a clear correlation: the same drop mid-January and a jump in September, though I do not know the explanation for either.

Half-lives can be estimated from the models. Two of them are around 2 years. Cesium-134 has a half-life of about 2.06 years and has apparently been released, so it is easy to speculate that cesium-134 is part of the source of this radiation.
                                           Lower CI     mean Upper CI
Maenori meeting place in Soso district     1.922496 2.074042 2.251524
Murohara community center in Soso district 3.171938 4.463582 7.529780
Sugiuchi meeting place in Soso district    1.764994 1.817172 1.872530
township village center                    2.602491 2.854096 3.159558
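The half-life arithmetic behind this table is just an inversion of the fitted log10 slope. A sketch on simulated data, using plain lm() instead of lmrob() for brevity (all numbers here are made up):

```r
set.seed(42)
# Simulated series with a true half-life of 2 years, one reading per day
relday <- -(365:1)                       # days relative to today, as in the data
slope.true <- log10(0.5) / (2 * 365)     # log10 decrease per day
ldose <- 1 + slope.true * relday + rnorm(365, 0, 0.01)
fit <- lm(ldose ~ relday)                # the post fits lmrob() from package robust
halflife <- log10(0.5) / coef(fit)['relday'] / 365   # back to years
unname(halflife)                         # close to the true 2 years
```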

code

#【相双地方】室原公民館の測定値       
#704 
# Measurements [phase] of bi-district community center Murohara
# Measurements of Murohara community center in Soso district.
ys704 <- "http://chart.apis.google.com/chart?chd=t:433,432,433,410,431,432,426,430,441,438,435,432,435,416,422,431,422,422,407,425,405,421,419,424,421,418,415,428,412,402,414,404,413,412,401,402,406,413,424,276,310,334,352,357,370,389,396,398,402,405,404,396,397,405,404,404,407,417,419,411,413,411,332,409,398,400,409,388,402,402,403,394,387,400,408,394,394,389,405,384,391,387,399,404,407,416,407,403,409,409,412,423,428,415,411,408,416,431,403,411,415,420,436,426,420,407,416,414,411,411,404,402,415,406,405,393,420,403,389,401,403,390,397,403,410,388,393,388,395,412,403,408,413,391,395,383,377,392,401,399,402,386,394,392,405,403,387,392,394,404,406,411,391,406,401,408,387,389,392,401,399,391,390,405,402,411,408,409,408,407,404,411,413,417,405,390,399,393,392,401,402,405,411,410,405,414,414,411,405,404,417,395,396,401,403,396,406,398,401,403,407,408,388,390,393,396,398,402,405,392,404,409,406,410,405,419,423,420,417,400,402,398,400,387,395,389,397,399,386,395,386,396,396,386,392,383,392,390,380,379,391,392,385,390,400,405,408,416,417,415,415,415,417,417,419,419,420,405,411,405,401,406,408,411,403,403,408,417,402,401,395,405,395,389,394,395,381,389,392,395,399,390,396,388,370,380,383,383,405,391,387,384,389,382,369,374,377,384,379,382,375,374,372,365,366,374,379,388,378,378,373,366,368,343,355,353,358,361,347,351,334,346,340,334,327,332,334,343,331,337,335,338,344,331,333,336,341,331,336,338,322,320,322,331,330,330,331,335,326,331,325,325,327,328,331,325,331,324,324,320,323,322,327,324,323&chxl=0:|365|330|300|270|240|210|180|150|120|90|60|30|1|2:|%CE%BCSv%2Fh&chxp=0|2,0&chxr=0,0,100|1,0,50&chxs=0,676767,9,0,lt,676767|1,676767,9,0,l,676767&chxtc=0,3&chxt=x,y,t&chs=300x150&cht=lc&chco=0067B7&chds=0,5000&chls=1&chma=1,1,1,1&chg=8.333,10,1,1"
#【相双地方】杉内集会所の測定値       
#628 
# Measured value of the bi-phase region] Sugiuchi meeting place
# Measured value of Sugiuchi meeting place in Soso district.
ys628 <- "http://chart.apis.google.com/chart?chd=t:280,281,281,269,278,279,277,279,282,278,281,273,275,275,274,276,267,267,269,270,271,272,261,267,269,269,270,272,270,266,269,270,270,268,270,269,268,270,268,183,183,197,201,203,208,216,223,234,243,246,253,250,251,254,255,255,256,262,265,266,258,263,201,242,245,250,255,256,255,258,259,252,253,256,258,260,259,258,259,256,257,258,261,261,262,258,263,262,263,264,264,268,269,265,265,265,265,268,265,265,267,268,267,266,265,265,266,265,263,261,262,247,253,254,253,247,253,244,235,246,249,232,242,247,250,239,245,240,246,251,250,251,254,246,244,234,234,240,243,233,242,230,239,242,244,243,234,241,244,247,248,248,246,247,248,248,227,231,236,238,240,236,239,243,241,242,242,246,245,245,245,245,247,247,239,221,231,230,227,233,235,237,238,238,227,236,237,238,226,224,232,218,223,226,225,220,224,216,217,224,227,230,217,218,220,220,224,226,230,218,224,227,227,228,222,230,231,235,236,222,223,222,224,225,217,218,222,223,214,218,211,216,215,210,214,209,214,214,207,209,214,215,210,211,217,221,223,228,229,231,232,232,235,235,237,238,238,226,227,227,225,228,230,233,221,225,226,231,222,223,223,228,215,214,215,216,215,216,217,219,221,213,218,210,204,209,211,212,215,217,215,214,216,203,205,207,209,211,208,206,201,205,202,197,197,201,204,207,206,205,206,206,206,190,194,196,197,198,187,190,185,188,183,185,186,190,191,193,194,195,195,195,196,192,193,195,194,193,194,194,188,190,191,192,192,191,193,193,194,194,194,194,194,194,195,189,190,191,190,190,191,190,191,188,188&chxl=0:|365|330|300|270|240|210|180|150|120|90|60|30|1|2:|%CE%BCSv%2Fh&chxp=0|2,0&chxr=0,0,100|1,0,50&chxs=0,676767,9,0,lt,676767|1,676767,9,0,l,676767&chxtc=0,3&chxt=x,y,t&chs=300x150&cht=lc&chco=0067B7&chds=0,5000&chls=1&chma=1,1,1,1&chg=8.333,10,1,1"
# 【相双地方】前乗集会所の測定値   
#741 
# Measurement of the meeting place to the power [bi-phase region] before
# Measurement of Maenori meeting place in Soso district.
ys741 <- 'http://chart.apis.google.com/chart?chd=t:114,113,112,110,98,101,102,105,109,112,115,114,114,114,105,105,104,105,102,92,91,93,92,95,99,100,100,103,99,98,97,100,99,98,99,98,98,99,100,77,56,56,56,51,48,49,50,48,48,49,49,46,43,45,45,46,46,49,66,67,68,70,59,68,64,65,68,66,67,64,66,63,61,62,65,64,65,63,66,64,57,58,59,64,68,75,77,78,81,84,90,96,99,98,98,97,100,101,99,100,101,103,104,103,105,101,103,102,102,102,100,100,103,102,101,99,102,100,94,97,99,96,96,98,99,98,98,95,97,101,100,101,102,97,98,92,64,90,96,96,97,94,97,98,98,99,95,96,97,98,99,100,98,100,101,102,95,98,99,100,100,97,98,100,98,99,99,100,100,100,100,101,101,101,96,93,96,96,95,97,98,99,100,99,93,96,98,97,97,96,97,90,91,93,93,90,92,92,93,93,94,94,87,90,90,90,91,92,92,88,90,90,91,91,88,91,92,92,92,87,89,87,90,85,86,88,89,90,87,90,87,88,88,85,86,85,85,86,84,84,87,88,85,84,88,89,89,91,91,92,92,93,94,94,94,94,95,88,90,91,89,90,87,87,87,89,90,91,87,86,87,87,87,84,86,86,83,85,86,86,87,87,88,84,83,85,85,86,88,88,87,86,87,86,83,84,85,86,85,85,83,83,83,81,81,82,84,85,85,86,84,85,85,79,80,80,81,82,77,79,77,78,78,78,78,79,79,80,79,79,80,80,81,79,79,80,79,79,79,78,76,76,76,78,78,77,78,79,78,79,78,78,78,78,75,77,77,76,76,76,76,76,77,76,76&chxl=0:|365|330|300|270|240|210|180|150|120|90|60|30|1|2:|%CE%BCSv%2Fh&chxp=0|2,0&chxr=0,0,100|1,0,50&chxs=0,676767,9,0,lt,676767|1,676767,9,0,l,676767&chxtc=0,3&chxt=x,y,t&chs=300x150&cht=lc&chco=0067B7&chds=0,5000&chls=1&chma=1,1,1,1&chg=8.333,10,1,1'
# 【相双地方】町区集落センターの測定値  
#1376 
# Measured value of the bi-phase region] township village center
ys1376 <- 'http://chart.apis.google.com/chart?chd=t:1023,1010,991,987,953,1027,1007,1008,1006,957,963,980,1000,899,1003,994,920,995,886,1008,884,1003,910,915,918,904,977,1006,881,993,888,984,908,976,986,861,869,995,986,612,651,683,711,709,728,733,770,813,834,853,849,851,981,969,994,968,963,986,978,988,967,965,742,967,833,855,946,973,978,853,970,836,821,840,975,835,841,977,854,971,970,824,982,849,984,885,971,861,976,985,883,902,985,878,865,983,986,992,846,867,987,901,909,987,887,984,966,875,983,957,849,841,953,855,849,814,875,956,827,844,849,825,830,849,865,826,838,825,834,869,853,872,949,854,834,1003,1004,1041,1042,1012,1024,865,989,1009,1015,1003,981,997,1024,902,1029,1027,1005,899,894,908,975,986,1000,1009,1014,996,985,998,1003,999,991,1008,1020,988,1003,1016,919,921,984,847,872,981,980,966,996,991,898,890,977,988,984,979,875,874,989,861,862,873,965,858,867,846,844,860,960,972,845,847,854,855,869,869,879,850,865,867,872,877,859,881,873,884,885,849,853,850,849,828,837,829,841,849,826,840,823,833,839,818,834,814,829,830,816,813,827,831,825,822,837,845,856,866,930,860,861,862,869,873,872,878,881,847,851,855,847,850,854,861,846,847,855,870,862,858,860,870,849,840,837,841,818,833,839,844,851,834,848,828,803,814,820,821,854,834,836,826,837,811,796,799,809,825,817,813,801,803,797,784,790,808,813,820,787,795,780,772,774,743,739,743,750,754,729,817,791,793,795,797,769,782,784,808,793,805,790,799,812,783,779,789,805,779,793,800,747,752,748,771,768,769,772,785,766,777,782,779,774,767,777,763,761,758,759,747,764,758,763,746,745&chxl=0:|365|330|300|270|240|210|180|150|120|90|60|30|1|2:|%CE%BCSv%2Fh&chxp=0|2,0&chxr=0,0,100|1,0,50&chxs=0,676767,9,0,lt,676767|1,676767,9,0,l,676767&chxtc=0,3&chxt=x,y,t&chs=300x150&cht=lc&chco=0067B7&chds=0,5000&chls=1&chma=1,1,1,1&chg=8.333,10,1,1'
linktodata <- function(ys,i) {
  ys <- gsub('.*t:','',ys)
  ys <- gsub('&chxl.*','',ys)
  dose <- scan(textConnection(ys),sep=',')
  data.frame(dose=dose,station=i,relday = -(length(dose):1))
}

locations <-data.frame(station=c(704,628,741,1376),
    loc=factor(c('Murohara community center in Soso district','Sugiuchi meeting place in Soso district',
            'Maenori meeting place in Soso district', 'township village center')))

r1 <- rbind(linktodata(ys704,704),
    linktodata(ys628,628),
    linktodata(ys1376,1376),
    linktodata(ys741,741))
r1$doseus <- r1$dose/100
r1$day <- as.Date("05-12-2013",format='%d-%m-%Y')+r1$relday
Sys.setlocale("LC_TIME",'C')

r1 <- merge(r1,locations)
r1$status <- 'Observed'
library(robust)

models <- lapply(levels(r1$loc),function(st) {
      r2 <- r1[r1$loc==st,]
      lmrob(log10(doseus) ~ relday,data=r2)
    })

topred <- r1[r1$station==r1$station[1],c('relday','day')]

preds <- lapply(1:nlevels(r1$loc),function(i) {
      preds <- topred
      preds$doseus <- 10^predict(models[[i]],preds)
      preds$status <- 'Predicted'
      preds$loc <- levels(r1$loc)[i]
      preds
    }
)
preds <- do.call(rbind,preds)
preds <- merge(preds,locations)
all <- rbind(subset(r1,select=-dose),preds)

library(lattice)    
png('fuk2.png')
xyplot(doseus ~ day | loc,groups= status,data=all,type='l',
    scales=list(y=list(log=10)),
        par.strip.text=list(cex=.7),
    ylab=expression(mu * "Sv per hour" ))
dev.off()    

halflives <-     t(sa <- sapply(models,function(x) {
          point <- coef(x)[2]
          sd <- sqrt(vcov(x)[2,2])
          yy <- c(log10(.5)/c(`0.025`=point-1.96*sd,mean=point,`0.975`=point+1.96*sd))/365
          names(yy) <- c('Lower CI','mean','Upper CI')
          yy
        }))
rownames(halflives) <- levels(r1$loc)
halflives

Sunday, December 1, 2013

JAGS model Fe concentration in rainwater including values below detection level

In my previous post I ignored the fact that some data are below the detection level. I would not know how to handle those in a mixed model from lme4 or nlme; however, JAGS can handle these values. Apart from that I kept the usual independent variables, such as random effects for location, machine etc.

Data

Data have been acquired and processed similarly to last week. The full script is at the bottom of the page.

detection level

In the spreadsheet a '4' is used to flag a value below the detection limit (* een " 4 "-teken in kolom C geeft aan dat de analyseresultaten beneden de detectielimiet liggen; that is, a '4' in column C indicates that the analysis results are below the detection limit). There is something strange there: the detection limit seems to be 0.4, but there is unflagged data under 0.4. In addition, some of the data is below 0, which seems impossible for a concentration. In the end I flagged all data under 0.4 as NA.
Fe <- rf2[rf2$Compound==levels(rf2$Compound)[9],]
library(lattice)
densityplot( ~ Amount |  machine + Status,
    groups=Location,data=Fe[!is.na(Fe$Status) 
            & Fe$Amount<.8,],
    panel=function(...) {
      panel.densityplot(...)
      panel.abline(v=c(0,0.4))
    },adjust=.5
)

Analysis

It is a reasonably standard JAGS model. The only tricky thing is initialization: when the starting values are too far off, an error is thrown. Hence 'intercept' is initialized at -1. The model and inits are given here in the text; the full code is at the end.
FeModel <- function() {
  for (i in 1:nobs) {
    BLZ[i] ~ dinterval( LAmount[i] , -0.4034029 )
    ExpLAmount[i] <- intercept + 
        rate*time[i] + 
        rateL[Location[i]]*time[i]+
        FMach[Machine[i]] + 
        FLoc[Location[i]]
    LAmount[i] ~ dnorm(ExpLAmount[i],tau[Machine[i]])
  }
  for (i in 1:2) {
    tau[i] <- pow(sd[i],-2)
    sd[i] ~ dunif(0,1000)
  }
  intercept ~ dnorm(0,.0001)
  FMach[1] ~dnorm(0,tauMachShift)
  FMach[2] <- -FMach[1]
  tauMachShift <- pow(sdMachShift,-2)
  sdMachShift ~ dunif(0,10)
  for (iL in 1:nloc) {
    FLoc[iL] ~ dnorm(0,tauLoc)
    rateL[iL] ~dnorm(0,tauRate)
    yearlyL[iL] <- pow(pow(10,rateL[iL]+rate),365)
  }
  tauLoc <- pow(sdLoc,-2)
  sdLoc ~ dunif(0,100)
  tauRate <- pow(sdRate,-2)
  sdRate ~ dunif(0,100)
  rate ~ dnorm(0,.0001)
  dailydec <- pow(10,rate)
  yearlydec <- pow(dailydec,365)
}

inits <- function() {
  list(intercept = -1,#rnorm(1,.45,.04),
      rate=rnorm(1, 0, .01),
      sd=runif(2,0,50),
      sdMachShift=runif(1,0,1),
      sdLoc=runif(1,0,1))
}

Results

The model shows the expected decrease of 5% per year.
Machine 1 (Eigenbrodt NSA 181 KHT) is slightly less accurate (as was also observed last week) and gives slightly lower values than machine 2 (Van Essen/ECN vanger). Variation between locations is less than variation between machines.
Inference for Bugs model at "C:/.../AppData/Local/Temp/Rtmpma5UEx/modelc40761211ba.txt", fit using jags,
 4 chains, each with 5000 iterations (first 2500 discarded), n.thin = 2
 n.sims = 5000 iterations saved
             mu.vect sd.vect     2.5%      25%      50%      75%    97.5%
FLoc[1]        0.023   0.070   -0.110   -0.022    0.021    0.065    0.169
FLoc[2]        0.139   0.081   -0.038    0.091    0.147    0.195    0.280
FLoc[3]       -0.041   0.069   -0.180   -0.083   -0.041    0.002    0.098
FLoc[4]       -0.103   0.080   -0.289   -0.147   -0.094   -0.048    0.032
FLoc[5]       -0.072   0.072   -0.209   -0.117   -0.075   -0.027    0.078
FLoc[6]       -0.036   0.070   -0.173   -0.079   -0.036    0.008    0.102
FLoc[7]       -0.016   0.069   -0.154   -0.058   -0.016    0.027    0.117
FLoc[8]        0.278   0.079    0.127    0.229    0.274    0.323    0.453
FLoc[9]       -0.061   0.089   -0.243   -0.118   -0.060    0.000    0.109
FLoc[10]       0.052   0.069   -0.091    0.009    0.052    0.095    0.186
FLoc[11]       0.040   0.079   -0.131   -0.008    0.048    0.096    0.172
FLoc[12]      -0.102   0.077   -0.245   -0.153   -0.106   -0.054    0.064
FLoc[13]      -0.054   0.082   -0.210   -0.107   -0.056   -0.005    0.117
FLoc[14]       0.027   0.067   -0.112   -0.013    0.028    0.070    0.155
FLoc[15]      -0.031   0.070   -0.164   -0.076   -0.033    0.011    0.116
FLoc[16]       0.088   0.070   -0.050    0.045    0.087    0.131    0.227
FLoc[17]      -0.154   0.076   -0.293   -0.203   -0.157   -0.106    0.006
FMach[1]      -0.053   0.017   -0.087   -0.065   -0.053   -0.041   -0.018
FMach[2]       0.053   0.017    0.018    0.041    0.053    0.065    0.087
intercept      0.401   0.081    0.240    0.345    0.402    0.455    0.559
sd[1]          0.436   0.018    0.402    0.423    0.436    0.448    0.471
sd[2]          0.386   0.007    0.372    0.381    0.386    0.391    0.401
sdLoc          0.135   0.036    0.077    0.110    0.130    0.154    0.216
sdMachShift    1.854   2.456    0.033    0.170    0.675    2.587    8.820
yearlyL[1]     0.944   0.007    0.931    0.940    0.944    0.948    0.957
yearlyL[2]     0.950   0.007    0.938    0.945    0.949    0.954    0.965
yearlyL[3]     0.945   0.006    0.932    0.941    0.945    0.949    0.958
yearlyL[4]     0.949   0.007    0.936    0.944    0.948    0.953    0.963
yearlyL[5]     0.943   0.007    0.929    0.939    0.944    0.948    0.956
yearlyL[6]     0.945   0.006    0.932    0.941    0.945    0.949    0.957
yearlyL[7]     0.946   0.006    0.933    0.941    0.946    0.950    0.958
yearlyL[8]     0.945   0.006    0.932    0.941    0.945    0.949    0.957
yearlyL[9]     0.944   0.007    0.929    0.940    0.945    0.949    0.958
yearlyL[10]    0.946   0.006    0.933    0.942    0.946    0.950    0.958
yearlyL[11]    0.950   0.006    0.938    0.946    0.950    0.954    0.964
yearlyL[12]    0.942   0.007    0.928    0.938    0.943    0.947    0.955
yearlyL[13]    0.944   0.007    0.930    0.939    0.944    0.948    0.956
yearlyL[14]    0.947   0.006    0.935    0.943    0.947    0.951    0.959
yearlyL[15]    0.944   0.006    0.931    0.940    0.944    0.948    0.956
yearlyL[16]    0.946   0.006    0.934    0.942    0.946    0.950    0.958
yearlyL[17]    0.942   0.007    0.926    0.938    0.943    0.947    0.956
yearlydec      0.945   0.005    0.936    0.942    0.945    0.949    0.955
deviance    1729.603  52.705 1627.020 1693.551 1728.751 1765.143 1831.422
             Rhat n.eff
FLoc[1]     1.002  2200
FLoc[2]     1.001  4000
FLoc[3]     1.001  5000
FLoc[4]     1.001  5000
FLoc[5]     1.001  5000
FLoc[6]     1.002  1500
FLoc[7]     1.001  5000
FLoc[8]     1.001  4400
FLoc[9]     1.002  1600
FLoc[10]    1.001  5000
FLoc[11]    1.001  2800
FLoc[12]    1.002  2200
FLoc[13]    1.001  5000
FLoc[14]    1.001  5000
FLoc[15]    1.001  2800
FLoc[16]    1.001  4800
FLoc[17]    1.004   740
FMach[1]    1.004   760
FMach[2]    1.004   760
intercept   1.001  3100
sd[1]       1.007   380
sd[2]       1.002  1500
sdLoc       1.003  1600
sdMachShift 1.001  5000
yearlyL[1]  1.001  3000
yearlyL[2]  1.001  4100
yearlyL[3]  1.002  1600
yearlyL[4]  1.001  2800
yearlyL[5]  1.002  1400
yearlyL[6]  1.002  2300
yearlyL[7]  1.001  5000
yearlyL[8]  1.001  5000
yearlyL[9]  1.001  4000
yearlyL[10] 1.001  3500
yearlyL[11] 1.001  5000
yearlyL[12] 1.002  1400
yearlyL[13] 1.001  5000
yearlyL[14] 1.002  1400
yearlyL[15] 1.001  3100
yearlyL[16] 1.002  1500
yearlyL[17] 1.004   850
yearlydec   1.002  1800
deviance    1.003   900

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 1385.1 and DIC = 3114.7
DIC is an estimate of expected predictive error (lower deviance is better).

Code

library(R2jags)

# http://www.lml.rivm.nl/data/gevalideerd/data/lmre_1992-2005.xls
# http://www.r-bloggers.com/read-excel-files-from-r/
library(XLConnect)

readsheet <- function(wb,cursheet) {
  df = readWorksheet(wb, sheet = cursheet , header = TRUE)
  topline <- 8
  test <- which(df[6,]=='C')+1
  numin <- df[topline:nrow(df),test]
  names(numin) <- make.names( gsub(' ','',df[6,test]))
  for(i in 1:ncol(numin)) numin[,i] <- as.numeric(gsub(',','.',numin[,i]))
  status = df[topline:nrow(df),test-1]
  names(status) <- paste('C',make.names( gsub(' ','',df[6,test])),sep='')
  
  numin$StartDate <- as.Date(substr(df[topline:nrow(df),1],1,10))
  numin$EndDate <-  as.Date(substr(df[topline:nrow(df),2],1,10))
  numin <- cbind(numin,status)
  tall <- reshape(numin,
      varying=list(make.names( gsub(' ','',df[6,test])),paste('C',make.names( gsub(' ','',df[6,test])),sep='')),
      v.names=c('Amount','Status'),
      timevar='Compound',
      idvar=c('StartDate','EndDate'),
      times=paste(df[6,test],'(',df[5,test],')',sep=''),
      direction='long')
  tall$Compound <- factor(gsub(' )',')',gsub(' +',' ',tall$Compound)))
  row.names(tall)<- NULL
  tall$Location <- cursheet
  tall
}

readfile <- function(fname) {
  wb <- loadWorkbook(fname)
  print(wb)
  sheets <- getSheets(wb)[-1]
  tt <- lapply(sheets,readsheet,wb=wb)
  tt2 <- do.call(rbind,tt) 
  tt2$Location <- factor(tt2$Location)
  tt2$fname <- fname
  tt2
}

fnames <- dir(pattern='\\.xls$')
rf <- lapply(fnames,readfile)
rf2 <- do.call(rbind,rf)

rf2$machine <- factor(ifelse(rf2$fname=="lmre_1992-2005.xls",
        'Van Essen/ECN vanger',
        ' Eigenbrodt NSA 181 KHT'))

Fe <- rf2[rf2$Compound==levels(rf2$Compound)[9],]
library(lattice)
densityplot( ~ Amount |  machine + Status,
    groups=Location,data=Fe[!is.na(Fe$Status) 
            & Fe$Amount<.8,],
    panel=function(...) {
      panel.densityplot(...)
      panel.abline(v=c(0,0.4))
    },adjust=.5
)

Fe$LAmount <- log10(Fe$Amount)
Fe$LAmount[Fe$Amount<0.4] <- NA
Fe2 <- Fe[!is.na(Fe$Status),]

datain <- list(
    LAmount = Fe2$LAmount,
    BLZ = !is.na(Fe2$LAmount),
    Machine=(Fe2$machine=='Van Essen/ECN vanger')+1,
    Location=(1:nlevels(Fe2$Location))[Fe2$Location],
    time=as.numeric(Fe2$StartDate),
    nobs =nrow(Fe2),
    nloc=nlevels(Fe2$Location)
)

FeModel <- function() {
  for (i in 1:nobs) {
    BLZ[i] ~ dinterval( LAmount[i] , -0.4034029 )
    ExpLAmount[i] <- intercept + 
        rate*time[i] + 
        rateL[Location[i]]*time[i]+
        FMach[Machine[i]] + 
        FLoc[Location[i]]
    LAmount[i] ~ dnorm(ExpLAmount[i],tau[Machine[i]])
  }
  for (i in 1:2) {
    tau[i] <- pow(sd[i],-2)
    sd[i] ~ dunif(0,1000)
  }
  intercept ~ dnorm(0,.0001)
  FMach[1] ~dnorm(0,tauMachShift)
  FMach[2] <- -FMach[1]
  tauMachShift <- pow(sdMachShift,-2)
  sdMachShift ~ dunif(0,10)
  for (iL in 1:nloc) {
    FLoc[iL] ~ dnorm(0,tauLoc)
    rateL[iL] ~dnorm(0,tauRate)
    yearlyL[iL] <- pow(pow(10,rateL[iL]+rate),365)
  }
  tauLoc <- pow(sdLoc,-2)
  sdLoc ~ dunif(0,100)
  tauRate <- pow(sdRate,-2)
  sdRate ~ dunif(0,100)
  rate ~ dnorm(0,.0001)
  dailydec <- pow(10,rate)
  yearlydec <- pow(dailydec,365)
}

parameters <- c('intercept','yearlydec','sd',
    'sdMachShift','FMach',
    'sdLoc','FLoc',
    'yearlyL'    )
inits <- function() {
  list(intercept = -1,#rnorm(1,.45,.04),
      rate=rnorm(1,0,.01),
      sd=runif(2,0,50),
      sdMachShift=runif(1,0,1),
      sdLoc=runif(1,0,1))
}

# estimate        
jj <- jags(data=datain, model.file=FeModel,
    parameters.to.save=parameters, inits=inits, n.iter=5000,
    progress.bar="gui", n.chains=4)
jj
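
As an aside, the dinterval construct in the model handles the left-censored observations. A minimal sketch (with made-up values, not data from the post) of how datain encodes this:

```r
# dinterval(x, cut) evaluates to 0 when x <= cut and to 1 when x > cut.
# So BLZ = 1 flags an observed log10 amount, while BLZ = 0 flags a value
# below the detection limit; the censored LAmount is passed as NA and
# imputed by the sampler.
LAmount <- c(NA, -0.2, -0.1)        # hypothetical log10 amounts; first censored
BLZ <- as.numeric(!is.na(LAmount))  # 0 = censored, 1 = observed
BLZ
# [1] 0 1 1
```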

Sunday, November 24, 2013

Dutch Rainwater Composition 1992-2011

Last week I examined rainwater composition from 1992 to 2005. There is additional data, but the National Institute for Public Health changed equipment in 2005. This week I will add those data.

Data

The data are in a number of spreadsheets. The script is a variation on last week's, with an extra wrapper to loop over the spreadsheets. The final parts merge locations (stations Leiduin and Witteveen have been replaced by De Zilk and Valthermond respectively) and add the equipment variable. The code is at the bottom of the post.

pH

It seems pH is still slightly increasing. It also seems there are two distinct pH levels. For instance, in Gilze-Rijen the pH is either around 5 or around 6, while 5.5 is less probable.

Iron

For iron there are just four locations with complete data, so only these are displayed. It would seem the downward trend continues.

Iron model

Since I wanted to give each of the two machines a separate variance, I opted to use nlme for this particular calculation. In addition, unlike the plot, all locations are used and the replacement stations are not merged. The machines seem to give similar variation, which probably means the variation in iron concentration is much larger than the machine accuracy. The decrease estimate is still around 5% per year.
Linear mixed-effects model fit by REML
 Data: Fe2 
       AIC      BIC    logLik
  3135.969 3184.729 -1559.984

Random effects:
 Formula: ~StartDate | Location
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev       Corr  
(Intercept) 1.018836e-05 (Intr)
StartDate   6.680830e-06 -0.05 

 Formula: ~1 | machine %in% Location
        (Intercept)  Residual
StdDev:  0.09556604 0.3791318

Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | machine 
 Parameter estimates:
   Van Essen/ECN vanger  Eigenbrodt NSA 181 KHT 
               1.000000                1.049395 
Fixed effects: LAmount ~ StartDate 
                 Value  Std.Error   DF    t-value p-value
(Intercept)  0.4373752 0.06188798 3258   7.067208       0
StartDate   -0.0000640 0.00000549 3258 -11.645303       0
 Correlation: 
          (Intr)
StartDate -0.887

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max 
-3.752718189 -0.655477969 -0.003628072  0.645709629  4.226135302 

Number of Observations: 3280
Number of Groups: 
             Location machine %in% Location 
                   17                    21 
#decrease estimate
> (10^fixef(lme1)['StartDate'])^365
StartDate 
0.9476403 
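
The arithmetic behind this: the StartDate coefficient is a slope on the log10 scale per day, so the yearly multiplicative change is 10 raised to 365 times the slope. A quick check with the rounded coefficient from the summary above:

```r
# Yearly change implied by a daily slope on the log10 scale.
slope <- -0.0000640        # StartDate fixed effect from summary(lme1)
yearly <- 10^(slope * 365) # multiplicative change per year
round(yearly, 4)
# [1] 0.9476
```

This matches the roughly 5% yearly decrease quoted in the text.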

Code

Import data

library(XLConnect)

readsheet <- function(wb,cursheet) {
  df = readWorksheet(wb, sheet = cursheet , header = TRUE)
  topline <- 8
  test <- which(df[6,]=='C')+1
  numin <- df[topline:nrow(df),test]
  names(numin) <- make.names( gsub(' ','',df[6,test]))
  for(i in 1:ncol(numin)) numin[,i] <- as.numeric(gsub(',','.',numin[,i]))
  status = df[topline:nrow(df),test-1]
  names(status) <- paste('C',make.names( gsub(' ','',df[6,test])),sep='')
  
  numin$StartDate <- as.Date(substr(df[topline:nrow(df),1],1,10))
  numin$EndDate <-  as.Date(substr(df[topline:nrow(df),2],1,10))
  numin <- cbind(numin,status)
  tall <- reshape(numin,
      varying=list(make.names( gsub(' ','',df[6,test])),paste('C',make.names( gsub(' ','',df[6,test])),sep='')),
      v.names=c('Amount','Status'),
      timevar='Compound',
      idvar=c('StartDate','EndDate'),
      times=paste(df[6,test],'(',df[5,test],')',sep=''),
      direction='long')
  tall$Compound <- factor(gsub(' )',')',gsub(' +',' ',tall$Compound)))
  row.names(tall)<- NULL
  tall$Location <- cursheet
  tall
}

readfile <- function(fname) {
  wb <- loadWorkbook(fname)
  print(wb)
  sheets <- getSheets(wb)[-1]
  tt <- lapply(sheets,readsheet,wb=wb)
  tt2 <- do.call(rbind,tt) 
  tt2$Location <- factor(tt2$Location)
  tt2$fname <- fname
  tt2
}

fnames <- dir(pattern='\\.xls$') # dir() takes a regular expression, not a glob
rf <- lapply(fnames,readfile)
rf2 <- do.call(rbind,rf)

(ll <- levels(rf2$Location))
ll[c(13,17)] <- 'V&W'
ll[c(9,4)] <- 'L&DZ'
rf2$Location2 <- factor(ll[rf2$Location])
head(rf2)
rf2$machine <- factor(ifelse(rf2$fname=="lmre_1992-2005.xls",
        'Van Essen/ECN vanger',
        ' Eigenbrodt NSA 181 KHT'))

pH plot


library(ggplot2)
library(scales)
levels(rf2$Compound)
(cc <- levels(rf2$Compound)[22])
rf2$Amount2 <- rf2$Amount
rf2$Amount2[rf2$Amount<1 & rf2$Compound==cc] <- NA
rf2[rf2$Amount < 1 & rf2$Compound==cc & !is.na(rf2$Amount),]
pp <- ggplot(rf2[rf2$Compound==cc & !(rf2$Location2 %in% ll[c(2,5,7,15)]),], 
    aes( StartDate,Amount2,col=machine))
pp + geom_point() +
    facet_wrap(~Location2) +
    ylab(cc) + xlab('Year') + ylim(c(4,7)) +
    scale_x_date(breaks=as.Date(paste(c('1992','2000','2010'),'-01-01',sep=''),
            format=('%Y-%m-%d')),
        labels = date_format("%Y")) +
     theme(legend.position = "bottom")


Fe Plot


(cc <- levels(rf2$Compound)[9])
#levels(rf2$Location2)
pp <- ggplot(rf2[rf2$Compound==cc & (rf2$Location2 %in% 
            levels(rf2$Location2)[c(7,8,10,13)]),], 
  aes( StartDate,Amount,color=machine))
pp + geom_point() +
    scale_y_log10() +
    facet_wrap(~Location2) +
    ylab(cc) + xlab('Year') +
    scale_x_date(breaks=as.Date(paste(c('1992','2000','2010'),'-01-01',sep=''),
            format=('%Y-%m-%d')),
        labels = date_format("%Y")) +
        theme(legend.position = "bottom")

Fe model


Fe <- rf2[rf2$Compound==levels(rf2$Compound)[9],]
# plot(density(Fe$Amount[!is.na(Fe$Status) & Fe$Status!=0],adjust=.5))
Fe$LAmount <- log10(Fe$Amount)
Fe$LAmount[Fe$Amount<=0] <- NA
Fe$Location <- factor(Fe$Location)
#nlme
library(nlme)
Fe2 <- Fe[complete.cases(Fe),]
lme1 <- lme(LAmount ~  StartDate,
 #   random= ~ 1|machine, # machine = groups
 #   random=~StartDate | Location, # location=groups
     random=list(Location=~StartDate, machine=~1),   
    weights=varIdent(form=~1 |  machine),
    data=Fe2 )    
summary(lme1)