Wiekvoet: Mixed models; Random Coefficients, part 1

Continuing with my exploration of mixed models I am now at the first part of random coefficients: example 59.5 for proc mixed (page 5034 of the SAS/STAT 12.3 Manual). This means I skipped examples 59.3 (plotting the likelihood) and 59.4 (known G and R). The latter I might want to do later, though I find this to be quite a strong prior.

Data

To quote the SAS/STAT manual 'The observed responses are replicate assay results, expressed in percent of label claim, at various shelf ages, expressed in months. The desired mixed model involves three batches of product that differ randomly in intercept (initial potency) and slope (degradation rate).'

rc1 <- read.table(textConnection('

1 0 101.2 103.3 103.3 102.1 104.4 102.4

1 1 98.8 99.4 99.7 99.5 . .

1 3 98.4 99.0 97.3 99.8 . .

1 6 101.5 100.2 101.7 102.7 . .

1 9 96.3 97.2 97.2 96.3 . .

1 12 97.3 97.9 96.8 97.7 97.7 96.7

2 0 102.6 102.7 102.4 102.1 102.9 102.6

2 1 99.1 99.0 99.9 100.6 . .

2 3 105.7 103.3 103.4 104.0 . .

2 6 101.3 101.5 100.9 101.4 . .

2 9 94.1 96.5 97.2 95.6 . .

2 12 93.1 92.8 95.4 92.2 92.2 93.0

3 0 105.1 103.9 106.1 104.1 103.7 104.6

3 1 102.2 102.0 100.8 99.8 . .

3 3 101.2 101.8 100.8 102.6 . .

3 6 101.1 102.0 100.1 100.2 . .

3 9 100.9 99.5 102.2 100.8 . .

3 12 97.8 98.3 96.9 98.4 96.9 96.5

'),

na.strings='.',

col.names=c('Batch','Month',paste('Y',1:6,sep='')))

rc2 <- reshape(rc1,

direction='long',

varying=list(Y=paste('Y',1:6,sep='')),

v.names='Y',

timevar='i',

times=1:6)

rc2$Batch <- factor(rc2$Batch)

rc2 <- rc2[complete.cases(rc2),]

Analysis

In this post the first model will be attempted, for nlme, lme4 and MCMCglmm. MCMCglmm was clearly the most difficult to formulate, especially with regards to the prior. It appeared the models have different results for the significance of the fixed month effect. Model quote 'The two random effects are Int and Month, modeling random intercepts and slopes, respectively. Note that Intercept and Month are used as both fixed and random effects.'

lme4

The basic call is not difficult at all. The summary also shows residual variance and variances for intercept and month (within batch).

library(lme4)

lmer1 <- lmer(Y ~ Month + (Month|Batch),data=rc2)

summary(lmer1)

Linear mixed model fit by REML

Formula: Y ~ Month + (Month | Batch)

Data: rc2

AIC BIC logLik deviance REMLdev

362.3 376.9 -175.2 348.4 350.3

Random effects:

Groups Name Variance Std.Dev. Corr

Batch (Intercept) 0.976805 0.98833

Month 0.037166 0.19278 -0.548

Residual 3.293234 1.81473

Number of obs: 84, groups: Batch, 3

Fixed effects:

Estimate Std. Error t value

(Intercept) 102.7038 0.6456 159.08

Month -0.5259 0.1194 -4.41

Correlation of Fixed Effects:

(Intr)

Month -0.580

There is not as much default output as PROC MIXED. Extra output for variances is needed to retrieve the Month-Intercept covariance.

VarCorr(lmer1)

$Batch

(Intercept) Month

(Intercept) 0.9768046 -0.10448120

Month -0.1044812 0.03716553

attr(,"stddev")

(Intercept) Month

0.9883343 0.1927836

attr(,"correlation")

(Intercept) Month

(Intercept) 1.0000000 -0.5483579

Month -0.5483579 1.0000000

attr(,"sc")

[1] 1.814727

Significance of Month effect can be calculated via anova and simulation. As I cannot see the point in estimating significance of intercept effect when a main effect is present this is skipped. Anova makes it clearly non-significant, while simulation is close to the p-value PROC MIXED provides, but there are 30 simulations with convergence problems.

lmer1b <- lmer(Y ~ 1 + (Month|Batch),data=rc2)

anova(lmer1,lmer1b)

Data: rc2

Models:

lmer1b: Y ~ 1 + (Month | Batch)

lmer1: Y ~ Month + (Month | Batch)

Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)

lmer1b 5 365.36 377.51 -177.68

lmer1 6 360.44 375.03 -174.22 6.914 1 0.008552 **

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

pboot <- function(m0,m1) {

s <- simulate(m0)

L0 <- logLik(refit(m0,s))

L1 <- logLik(refit(m1,s))

2*(L1-L0)

}

obsdev <- 2*( logLik(lmer1)-logLik(lmer1b))

set.seed(1001)

reps <- replicate(1000,pboot(lmer1b,lmer1))

mean(reps>obsdev)

[1] 0.055

Random effects are same:

ranef(lmer1)

$Batch

(Intercept) Month

1 -1.0010539 0.12865763

2 0.3934426 -0.20598652

3 0.6076112 0.07732889

nlme

library(nlme)

lme1 <- lme(Y ~ Month,

random= ~Month|Batch,

data=rc2)

summary(lme1)

Linear mixed-effects model fit by REML

Data: rc2

AIC BIC logLik

362.3281 376.7685 -175.1641

Random effects:

Formula: ~Month | Batch

Structure: General positive-definite, Log-Cholesky parametrization

StdDev Corr

(Intercept) 0.9883331 (Intr)

Month 0.1927833 -0.548

Residual 1.8147270

Fixed effects: Y ~ Month

Value Std.Error DF t-value p-value

(Intercept) 102.70380 0.6456110 80 159.08001 0

Month -0.52594 0.1193732 80 -4.40589 0

Correlation:

(Intr)

Month -0.58

Standardized Within-Group Residuals:

Min Q1 Med Q3 Max

-1.85442335 -0.68272916 -0.09994707 0.62635493 2.64425526

Number of Observations: 84

Number of Groups: 3

Random effects are slightly different displayed but the same.

#variances

c(.9883331,.1927844,1.814727)^2

[1] 0.97680232 0.03716582 3.29323408

#covariance

.9883331*.1927844*-.548

[1] -0.1044133

Fixed effects are same as in PROC MIXED, but degrees of freedom are off, making the month effect significant. Random effects are exactly the same

ranef(lme1)

(Intercept) Month

1 -1.0010483 0.12868117

2 0.3934057 -0.20599206

3 0.6076426 0.07731089

MCMCglmm

MCMCglmm has me baffled for this one:. After reviewing the manual the obvious model was not so difficult to formulate. However,a prior is needed. Following MCMCglmm course notes example, page 78 I copied a prior. The standard errors are larger than PROC MIXED. In addition, month is not a significant fixed effect.

prior1 <- list(R=list(V=1e-7,nu=-2),

G=list(G1=list(V=diag(2),nu=2) )

)

MCMCglmm1 <- MCMCglmm(Y ~ Month,

random= ~ us(Month+1):Batch,

pr=TRUE,

data=rc2,family='gaussian',

nitt=500000,thin=20,burnin=50000,

prior=prior1,

verbose=FALSE)

summary(MCMCglmm1)

Iterations = 50001:499981

Thinning interval = 20

Sample size = 22500

DIC: 346.489

G-structure: ~us(Month + 1):Batch

post.mean l-95% CI u-95% CI eff.samp

(Intercept):(Intercept).Batch 4.3457 0.1418 11.519 22500

Month:(Intercept).Batch -0.3678 -4.8534 3.988 22500

(Intercept):Month.Batch -0.3678 -4.8534 3.988 22500

Month:Month.Batch 2.0747 0.1247 5.766 22500

R-structure: ~units

post.mean l-95% CI u-95% CI eff.samp

units 3.474 2.456 4.658 22500

Location effects: Y ~ Month

post.mean l-95% CI u-95% CI eff.samp pMCMC

(Intercept) 102.7132 100.5880 105.0134 22257 <4e-05 ***

Month -0.5211 -2.0054 1.0162 22500 0.356

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

colMeans(MCMCglmm1$Sol)

(Intercept) Month Batch.(Intercept).Batch.1

102.71316233 -0.52109218 -1.00404017

Batch.(Intercept).Batch.2 Batch.(Intercept).Batch.3 Batch.Month.Batch.1

0.38305996 0.60774619 0.12522274

Batch.Month.Batch.2 Batch.Month.Batch.3

-0.22493139 0.08335665

apply(MCMCglmm1$Sol,2,sd)

(Intercept) Month Batch.(Intercept).Batch.1

1.1851105 0.8264831 1.2192814

Batch.(Intercept).Batch.2 Batch.(Intercept).Batch.3 Batch.Month.Batch.1

1.2049515 1.2168346 0.8271416

Batch.Month.Batch.2 Batch.Month.Batch.3

0.8274522 0.8288230

To return to this all important prior, where experimenting showed a different choice clearly had influence, we can plot it. This shows that the variance is suggested to be close to 1 and lower than 10.

nu.ast <- prior1$G$G1$nu-dim(prior1$G$G1$V)[1]+1

V.ast <- prior1$G$G1$V[1,1]*(prior1$G$G1$nu/nu.ast)

xv=seq(1e-16,10,length.out=100)

dv=MCMCpack::dinvgamma(xv,shape=nu.ast/2,

scale=(nu.ast*V.ast)/2)

plot(dv~xv,type='l')