## Saturday, September 14, 2013

### Mixed models; Random Coefficients, part 2

Continuing from random coefficients part 1, it is time for the second part. To quote the SAS/STAT manual 'a random coefficients model with error terms that follow a nested structure'. The additional random variable is monthc, which is a factor containing the months and nested under batch. Hence there is one additional statement in generating the data
rc2\$Monthc <- factor(rc2\$Month)

## lme4

Running the analysis in lme4 is not difficult. However, it is noticed the random effect for month:batch provides an estimation problem. Somewhere between the month fixed effect and monthc:batch random effect lme4 has no room left for the month:batch effect. However, it is still needed later on. During simulation full against restricted model the restricted model lacks the month fixed effect and hence the random month:batch effect is not equal to 0. Hence this term has influence on significance of the fixed term.
lmer1 <- lmer(Y ~ Month +(Month-1|Batch) + (1|Batch:Monthc),data=rc2)
summary(lmer1)
Linear mixed model fit by REML
Formula: Y ~ Month + (Month - 1 | Batch) + (1 | Batch:Monthc)
Data: rc2
AIC   BIC logLik deviance REMLdev
287.3 299.4 -138.6    275.2   277.3
Random effects:
Groups       Name        Variance   Std.Dev.
Batch:Monthc (Intercept) 4.1669e+00 2.0413e+00
Batch        Month       1.1006e-11 3.3175e-06
Residual                 7.9670e-01 8.9258e-01
Number of obs: 84, groups: Batch:Monthc, 18; Batch, 3

Fixed effects:
Estimate Std. Error t value
(Intercept) 102.5558     0.7674  133.64
Month        -0.5002     0.1140   -4.39

Correlation of Fixed Effects:
(Intr)
Month -0.768

#### Detail on random effects

It is almost the same solution as PROC MIXED when compensated for continuous month effect. For instance, the last number (0.885355535) is close to 12*0.07600+0.002293= 0.914293 from PROC MIXED.

ranef(lmer1)

\$`Batch:Monthc`

(Intercept)
1:0   0.220538869
1:1  -2.582118495
1:3  -2.319238635
1:6   1.880673690
1:9  -1.244284881
1:12  0.772296027
2:0  -0.005588532
2:1  -2.295803984
2:3   2.905996019
2:6   1.642080584
2:9  -2.103224341
2:12 -3.330296971
3:0   1.964950249
3:1  -0.816514780
3:3   0.520042537
3:6   1.236464739
3:9   2.668672370
3:12  0.885355535

\$Batch
Month
1 -2.779928e-11
2 -6.342200e-09
3  6.369999e-09

#### Fixed effects

Fixed effects are easy after all these exercises. Again we see a discrepancy between anova and simulation, with simulation (p=0.033) making the month effect just significant and anova making it highly significant.
lmer1b <- lmer(Y ~ 1 +(Month-1|Batch)+ (1|Batch:Monthc) ,data=rc2)
anova(lmer1,lmer1b)
Data: rc2
Models:
lmer1b: Y ~ 1 + (Month - 1 | Batch) + (1 | Batch:Monthc)
lmer1: Y ~ Month + (Month - 1 | Batch) + (1 | Batch:Monthc)
Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
lmer1b  4 290.91 300.63 -141.45
lmer1   5 285.18 297.33 -137.59 7.7251      1   0.005446 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
pboot <- function(m0,m1) {
s <- simulate(m0)
L0 <- logLik(refit(m0,s))
L1 <- logLik(refit(m1,s))
2*(L1-L0)
}
obsdev <- 2*( logLik(lmer1)-logLik(lmer1b))
#
set.seed(1001)
reps <- replicate(1000,pboot(lmer1b,lmer1))
mean(reps>obsdev)
[1] 0.033

## nlme

It is not obvious how to run this model in nlme. It appears (r-help post) that one can make a groupedData object. The random effect in this object gets used in the model, even though not explicitly in the model call. Then there is some trickery in using pdBlocked. However, the output is very straightforward. The month effect is highly significant.
rc3 <- groupedData(formula=Y ~ Month | Batch,
data=rc2)
lme1 <- lme(Y ~ Month,
random= pdBlocked(list(pdIdent(~ Monthc - 1),pdIdent(~ Month - 1))),
data=rc3)
summary(lme1)
Linear mixed-effects model fit by REML
Data: rc3
AIC      BIC    logLik
286.901 298.9346 -138.4505

Random effects:
Composite Structure: Blocked

Block 1: Monthc0, Monthc1, Monthc3, Monthc6, Monthc9, Monthc12
Formula: ~Monthc - 1 | Batch
Structure: Multiple of an Identity
Monthc0  Monthc1  Monthc3  Monthc6  Monthc9 Monthc12
StdDev: 1.934191 1.934191 1.934191 1.934191 1.934191 1.934191

Block 2: Month
Formula: ~Month - 1 | Batch
Month  Residual
StdDev: 0.111472 0.8926711

Fixed effects: Y ~ Month
Value Std.Error DF   t-value p-value
(Intercept) 102.55643 0.7287180 80 140.73542   0e+00
Month        -0.50034 0.1259348 80  -3.97302   2e-04
Correlation:
(Intr)
Month -0.66

Standardized Within-Group Residuals:
Min         Q1        Med         Q3        Max
-2.0466072 -0.6347078  0.0557648  0.4304314  2.4608117

Number of Observations: 84
Number of Groups: 3
Conversion of standard deviations to variances for checking:
c(1.934191,0.111472, 0.8926711)^2
[1] 3.74109482 0.01242601 0.79686169

#### Details on Random effects

Exactly the same as in PROC MIXED.
ranef(lme1)
Monthc0    Monthc1    Monthc3   Monthc6   Monthc9     Monthc12
1  0.219126119 -2.5690021 -2.3067185 1.8726057 -1.235035  0.773610734
2 -0.006207775 -2.2125547  3.1063194 2.0649346 -1.444999 -2.440480024
3  1.957416153 -0.8849589  0.3005989 0.7971893  2.005861  0.002293605
Month
1 -0.0002840204
2 -0.0757124467
3  0.0759964671

## MCMCglmm

For once this seems most easy. Notice that monthc is not used in the model. Absence of us() around month is sufficient for MCMCglm to interpret it as levels rather than continuous. By now I am not surprised by a mismatch between random effects between PROC MIXED and MCMCglmm. The month effect is significant.
library(MCMCglmm)
MCMCglmm1 <- MCMCglmm(Y ~ Month,
random= ~ us(Month):Batch + Month:Batch, # two terms
pr=TRUE,
data=rc2,family='gaussian',
nitt=1000000,thin=20,burnin=100000,
verbose=FALSE)
summary(MCMCglmm1)

Iterations = 100001:999981
Thinning interval  = 20
Sample size  = 45000

DIC: 238.8215

G-structure:  ~us(Month):Batch

post.mean  l-95% CI u-95% CI eff.samp
Month:Month.Batch   0.02221 1.386e-17  0.03053    45000

~Month:Batch

post.mean l-95% CI u-95% CI eff.samp
Month:Batch     4.695    1.741    8.649    45000

R-structure:  ~units

post.mean l-95% CI u-95% CI eff.samp
units    0.8223   0.5518    1.114    45000

Location effects: Y ~ Month

post.mean l-95% CI u-95% CI eff.samp   pMCMC
(Intercept)  102.5577 100.9234 104.1365    45961 < 2e-05 ***
Month         -0.4994  -0.7563  -0.2401    45000 0.00467 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#### Details on random effects.

Again a small calculation for the last item shows 0.737419049+12*0.011875820= 0.8799289 against 0.91 and 0.88 for PROX MIXED and lme4 respectively. The random effects are quire close.

colMeans(MCMCglmm1\$Sol)

(Intercept)               Month Batch.Month.Batch.1 Batch.Month.Batch.2

102.557650677        -0.499353562        -0.000757751        -0.013238892
Batch.Month.Batch.3     Month:Batch.0.1     Month:Batch.1.1     Month:Batch.3.1
0.011875820         0.218504027        -2.577571780        -2.317749883
Month:Batch.6.1     Month:Batch.9.1    Month:Batch.12.1     Month:Batch.0.2
1.873068384        -1.245499371         0.768104528        -0.008015210
Month:Batch.1.2     Month:Batch.3.2     Month:Batch.6.2     Month:Batch.9.2
-2.278469327         2.929929878         1.705485501        -1.995548530
Month:Batch.12.2     Month:Batch.0.3     Month:Batch.1.3     Month:Batch.3.3
-3.183017853         1.960380450        -0.828147383         0.483852028
Month:Batch.6.3     Month:Batch.9.3    Month:Batch.12.3
1.158358351         2.551876100         0.737419049

## Difference in fixed effects between all functions

All functions now make the fixed month effect at approximately -0.5. All have it significant at the conventional 5% level. There are still quite some differences. PROC MIXED and lme4 simulations have p-values close to 5%. Lme4 anova and MCMCglmm have the p-value in the 0.005 region. nlme has 0.0002. This is worrying; the moment I am performing such calculations with real data I need to be more certain on my p-values than implied here.