Sunday, September 8, 2013

Mixed models; Random Coefficients, part 1

Continuing with my exploration of mixed models, I am now at the first part of random coefficients: example 59.5 for PROC MIXED (page 5034 of the SAS/STAT 12.3 Manual). This means I skipped examples 59.3 (plotting the likelihood) and 59.4 (known G and R). I might want to do the latter at some point, though I find treating G and R as known to be quite a strong prior.

Data 

To quote the SAS/STAT manual 'The observed responses are replicate assay results, expressed in percent of label claim, at various shelf ages, expressed in months. The desired mixed model involves three batches of product that differ randomly in intercept (initial potency) and slope (degradation rate).'
rc1 <- read.table(textConnection('
1 0 101.2 103.3 103.3 102.1 104.4 102.4
1 1 98.8 99.4 99.7 99.5 . .
1 3 98.4 99.0 97.3 99.8 . .
1 6 101.5 100.2 101.7 102.7 . .
1 9 96.3 97.2 97.2 96.3 . .
1 12 97.3 97.9 96.8 97.7 97.7 96.7
2 0 102.6 102.7 102.4 102.1 102.9 102.6
2 1 99.1 99.0 99.9 100.6 . .
2 3 105.7 103.3 103.4 104.0 . .
2 6 101.3 101.5 100.9 101.4 . .
2 9 94.1 96.5 97.2 95.6 . .
2 12 93.1 92.8 95.4 92.2 92.2 93.0
3 0 105.1 103.9 106.1 104.1 103.7 104.6
3 1 102.2 102.0 100.8 99.8 . .
3 3 101.2 101.8 100.8 102.6 . .
3 6 101.1 102.0 100.1 100.2 . .
3 9 100.9 99.5 102.2 100.8 . .
3 12 97.8 98.3 96.9 98.4 96.9 96.5
'),
    na.strings='.',
    col.names=c('Batch','Month',paste('Y',1:6,sep='')))
rc2 <- reshape(rc1,
    direction='long',
    varying=list(Y=paste('Y',1:6,sep='')),
    v.names='Y',
    timevar='i',
    times=1:6)
rc2$Batch <- factor(rc2$Batch)
rc2 <- rc2[complete.cases(rc2),]
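As a quick illustration of what the reshape step does, here is a minimal sketch on a made-up two-row data frame. The names toy and toy.long and the three replicate columns are hypothetical and only serve the example; the pattern is the same wide-to-long conversion followed by dropping missing replicates as used for rc1 above.

```r
# Toy wide data: one row per Batch/Month, replicates in columns Y1..Y3.
# This is a small invented example, not the full stability data.
toy <- data.frame(Batch=c(1,1), Month=c(0,1),
    Y1=c(101.2,98.8), Y2=c(103.3,99.4), Y3=c(103.3,NA))
# Wide to long: the three Y columns are stacked into one column Y,
# with i recording which replicate each value came from.
toy.long <- reshape(toy,
    direction='long',
    varying=list(Y=paste('Y',1:3,sep='')),
    v.names='Y',
    timevar='i',
    times=1:3)
# Drop the rows where a replicate was missing.
toy.long <- toy.long[complete.cases(toy.long),]
nrow(toy.long)
```

Of the six stacked rows, the one with the missing replicate is dropped, so five remain.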

Analysis

In this post the first model is fitted with nlme, lme4 and MCMCglmm. MCMCglmm was clearly the most difficult to formulate, especially with regard to the prior. The three approaches turned out to give different results for the significance of the fixed Month effect. To quote the manual on the model: 'The two random effects are Int and Month, modeling random intercepts and slopes, respectively. Note that Intercept and Month are used as both fixed and random effects.'

lme4

The basic call is not difficult at all. The summary also shows residual variance and variances for intercept and month (within batch).
library(lme4)
lmer1 <- lmer(Y ~ Month + (Month|Batch),data=rc2)
summary(lmer1)
Linear mixed model fit by REML 
Formula: Y ~ Month + (Month | Batch) 
   Data: rc2 
   AIC   BIC logLik deviance REMLdev
 362.3 376.9 -175.2    348.4   350.3
Random effects:
 Groups   Name        Variance Std.Dev. Corr   
 Batch    (Intercept) 0.976805 0.98833         
          Month       0.037166 0.19278  -0.548 
 Residual             3.293234 1.81473         
Number of obs: 84, groups: Batch, 3

Fixed effects:
            Estimate Std. Error t value
(Intercept) 102.7038     0.6456  159.08
Month        -0.5259     0.1194   -4.41

Correlation of Fixed Effects:
      (Intr)
Month -0.580
There is less default output than in PROC MIXED. An extra call is needed to retrieve the Month-Intercept covariance.
VarCorr(lmer1)
$Batch
            (Intercept)       Month
(Intercept)   0.9768046 -0.10448120
Month        -0.1044812  0.03716553
attr(,"stddev")
(Intercept)       Month 
  0.9883343   0.1927836 
attr(,"correlation")
            (Intercept)      Month
(Intercept)   1.0000000 -0.5483579
Month        -0.5483579  1.0000000

attr(,"sc")
[1] 1.814727
Significance of the Month effect can be assessed via anova and via simulation. As I cannot see the point in testing the intercept when a main effect is present, this is skipped. Anova makes the Month effect clearly significant (p=0.009), while simulation gives a borderline p=0.055, which is close to the p-value PROC MIXED provides, but there were 30 simulations with convergence problems.
lmer1b <- lmer(Y ~ 1 + (Month|Batch),data=rc2)
anova(lmer1,lmer1b)
Data: rc2
Models:
lmer1b: Y ~ 1 + (Month | Batch)
lmer1: Y ~ Month + (Month | Batch)
       Df    AIC    BIC  logLik Chisq Chi Df Pr(>Chisq)   
lmer1b  5 365.36 377.51 -177.68                           
lmer1   6 360.44 375.03 -174.22 6.914      1   0.008552 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
pboot <- function(m0,m1) {
  s <- simulate(m0)
  L0 <- logLik(refit(m0,s))
  L1 <- logLik(refit(m1,s))
  2*(L1-L0)
}
obsdev <- 2*( logLik(lmer1)-logLik(lmer1b))
#
set.seed(1001)
reps <- replicate(1000,pboot(lmer1b,lmer1))
mean(reps>obsdev)
[1] 0.055
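Because some of the simulated refits had convergence problems, it can help to make the handling of such replicates explicit when turning the simulated deviances into a p-value. The helper below is only a sketch and not part of lme4: boot.p is a hypothetical name, failed replicates are assumed to have been recorded as NA, and the +1 in numerator and denominator is a common convention for Monte Carlo p-values rather than what mean(reps>obsdev) computes.

```r
# Hypothetical helper: Monte Carlo p-value from simulated deviances.
# reps: simulated 2*(L1-L0) values, NA where a refit failed (assumption)
# obs:  observed deviance difference
boot.p <- function(reps, obs) {
  ok <- reps[!is.na(reps)]          # keep only the usable replicates
  c(p = (1 + sum(ok >= obs)) / (1 + length(ok)),
    n.used = length(ok),
    n.failed = sum(is.na(reps)))
}
# Made-up values for illustration, not output from the models above
res <- boot.p(c(0.2, 1.5, NA, 7.3, 0.9, 3.1, NA, 0.4, 2.2, 5.0), obs = 4.8)
res
```

Reporting the number of failed replicates next to the p-value shows how much of the simulation the result actually rests on.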
The random effects are the same:
ranef(lmer1)
$Batch
  (Intercept)       Month
1  -1.0010539  0.12865763
2   0.3934426 -0.20598652
3   0.6076112  0.07732889

nlme

library(nlme)
lme1 <- lme(Y ~ Month,
    random= ~Month|Batch,
    data=rc2)
summary(lme1)
Linear mixed-effects model fit by REML
 Data: rc2 
       AIC      BIC    logLik
  362.3281 376.7685 -175.1641

Random effects:
 Formula: ~Month | Batch
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev    Corr  
(Intercept) 0.9883331 (Intr)
Month       0.1927833 -0.548
Residual    1.8147270       

Fixed effects: Y ~ Month 
                Value Std.Error DF   t-value p-value
(Intercept) 102.70380 0.6456110 80 159.08001       0
Month        -0.52594 0.1193732 80  -4.40589       0
 Correlation: 
      (Intr)
Month -0.58 

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-1.85442335 -0.68272916 -0.09994707  0.62635493  2.64425526 

Number of Observations: 84
Number of Groups: 3 
The random effects are displayed slightly differently, but they are the same.
#variances
c(.9883331,.1927844,1.814727)^2
[1] 0.97680232 0.03716582 3.29323408
#covariance
.9883331*.1927844*-.548
[1] -0.1044133
Fixed effects are the same as in PROC MIXED, but the degrees of freedom (80 here) are off, making the Month effect significant. Random effects are exactly the same.
ranef(lme1)
  (Intercept)       Month
1  -1.0010483  0.12868117
2   0.3934057 -0.20599206
3   0.6076426  0.07731089
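To see how much the denominator degrees of freedom matter, the t-value for Month from the lme summary can be converted to a two-sided p-value under different df. The 80 is what lme reports. The 2 is an assumption made here for illustration (three batches minus one), chosen because the average slope is effectively estimated from only three batches.

```r
# t-value for Month taken from the lme summary above
t.month <- -4.40589
# Two-sided p-value for a given denominator df
p.two.sided <- function(t, df) 2*pt(-abs(t), df=df)
# df=80 as reported by lme; df=2 is an assumed alternative
# (number of batches minus one), used only for comparison
p.df80 <- p.two.sided(t.month, 80)
p.df2 <- p.two.sided(t.month, 2)
c(df80=p.df80, df2=p.df2)
```

With 80 df the p-value is far below 0.05, while with 2 df it lands just under 0.05, in the same range as the simulation result from lme4.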

MCMCglmm

MCMCglmm has me baffled for this one. After reviewing the manual, the obvious model was not so difficult to formulate. However, a prior is needed. Following the example on page 78 of the MCMCglmm course notes, I copied a prior. The standard errors are larger than in PROC MIXED. In addition, Month is not a significant fixed effect.
library(MCMCglmm)
prior1 <- list(R=list(V=1e-7,nu=-2),
    G=list(G1=list(V=diag(2),nu=2) )
)

MCMCglmm1 <- MCMCglmm(Y ~ Month, 
    random= ~ us(Month+1):Batch,
    pr=TRUE,
    data=rc2,family='gaussian',
    nitt=500000,thin=20,burnin=50000,
    prior=prior1,
    verbose=FALSE)
summary(MCMCglmm1)
 Iterations = 50001:499981
 Thinning interval  = 20
 Sample size  = 22500 

 DIC: 346.489 

 G-structure:  ~us(Month + 1):Batch

                              post.mean l-95% CI u-95% CI eff.samp
(Intercept):(Intercept).Batch    4.3457   0.1418   11.519    22500
Month:(Intercept).Batch         -0.3678  -4.8534    3.988    22500
(Intercept):Month.Batch         -0.3678  -4.8534    3.988    22500
Month:Month.Batch                2.0747   0.1247    5.766    22500

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units     3.474    2.456    4.658    22500

 Location effects: Y ~ Month 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)  102.7132 100.5880 105.0134    22257 <4e-05 ***
Month         -0.5211  -2.0054   1.0162    22500  0.356    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
colMeans(MCMCglmm1$Sol)
              (Intercept)                     Month Batch.(Intercept).Batch.1 
             102.71316233               -0.52109218               -1.00404017 
Batch.(Intercept).Batch.2 Batch.(Intercept).Batch.3       Batch.Month.Batch.1 
               0.38305996                0.60774619                0.12522274 
      Batch.Month.Batch.2       Batch.Month.Batch.3 
              -0.22493139                0.08335665 
apply(MCMCglmm1$Sol,2,sd)
              (Intercept)                     Month Batch.(Intercept).Batch.1 
                1.1851105                 0.8264831                 1.2192814 
Batch.(Intercept).Batch.2 Batch.(Intercept).Batch.3       Batch.Month.Batch.1 
                1.2049515                 1.2168346                 0.8271416 
      Batch.Month.Batch.2       Batch.Month.Batch.3 
                0.8274522                 0.8288230 
To return to this all-important prior: experimenting showed that a different choice clearly influenced the results, so it is worth plotting. The plot suggests that the prior expects the variance to be close to 1 and lower than 10.
nu.ast <- prior1$G$G1$nu-dim(prior1$G$G1$V)[1]+1
V.ast <- prior1$G$G1$V[1,1]*(prior1$G$G1$nu/nu.ast)
xv=seq(1e-16,10,length.out=100)
dv=MCMCpack::dinvgamma(xv,shape=nu.ast/2,
    scale=(nu.ast*V.ast)/2)
plot(dv~xv,type='l')
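The plot only covers the range up to 10, so it does not show how much prior mass lies beyond that. A minimal sketch to check this uses the relation between the inverse gamma and gamma distributions: if X is inverse gamma with a given shape and scale, then 1/X is gamma with the same shape and with rate equal to that scale. The function pinvgamma is a helper defined here in base R, not one taken from MCMCpack, and the shape and scale are the values the code above yields for prior1.

```r
# Marginal prior implied by prior1: nu.ast = 2-2+1 = 1, V.ast = 1*(2/1) = 2
nu.ast <- 1
V.ast <- 2
shape <- nu.ast/2
scale <- (nu.ast*V.ast)/2
# CDF of the inverse gamma via the gamma distribution:
# P(X <= x) = P(1/X >= 1/x), with 1/X ~ Gamma(shape, rate=scale)
pinvgamma <- function(x, shape, scale)
    pgamma(1/x, shape=shape, rate=scale, lower.tail=FALSE)
# Prior probability that the variance is below 1 and below 10
p.below <- pinvgamma(c(1,10), shape=shape, scale=scale)
p.below
# Mode of the density, which is where the plotted curve peaks
prior.mode <- scale/(shape+1)
prior.mode
```

If a noticeable share of the mass turns out to sit above 10, the prior is heavier tailed than the plot alone suggests, which may be part of why the choice of prior matters so much with only three batches.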


2 comments:

  1. Am I missing something, doesn't the anova show the main effect of month as significant? And the result looks significant in the nlme model as well?

    1. It depends:
      lme4: yes if you look at anova, borderline for simulation. nlme: yes, but those df? They are the reason why the result is different from PROC MIXED.
      MCMCglmm: no, not with p=0.356.
