Sunday, May 5, 2013

Simulation shows gain of clmm over ANOVA is small

After last post's setting up for a simulation, it is now time to look how the models compare. To my disappointment with my simple simulations of assessors behavior the gain is minimal. Unfortunately, the simulation took much more time than I expected, so I will not expand it.


I have been looking at the ordered logistic model in a number of postings. The reason is that in general I find people use ANOVA for analyzing data on a nine point scale, whereas you would think an ordered logistic model works better. Two posts (first and second) showed the current methods in R, and and JAGS are easy to use and with some tweaking provide suitable output to present. Subsequently I made the simulated data generator. Now it is time to make the final comparison.  


The core of the simulator is explained elsewhere so I won't explain here again. I did however notice a small error, so the corrected code is given here. Some new parts are added, wrappers around the data generator and the analysis. And to my big disappointment I could not even build that as desired. The call anova(Res.clmm2,Res.clmm) with subsequent extraction has been replaced by the ugly pchisq(2*(Res.clmm$logLik-Res.clmm2$logLik),2,lower.tail=FALSE ). 2 represents the degrees of freedom for products in my simulations. Somehow that call to ANOVA did not run within a function, after trying too many variations I choose the short cut.


num2scale <- function(x,scale) {
pop.limits2ind.limits <- function(scale,sd=.5) {
  newscale <- scale+rnorm(length(scale),sd=sd)

obs.score <- function(obs,pop.score,pop.limits,,,,center.scale=5,
    labels=LETTERS[1:length(pop.score)]) {
  # individual sensitivity (multiplicative)
  obs.errorfreeintensity <- center.scale + 
  #individual (additive) 
  obs.errorfreeintensity <- obs.errorfreeintensity +
  # individual observation error 
  obs.intensity <- obs.errorfreeintensity+
  # individual cut offs between categories  
  obs.limits <- pop.limits2ind.limits(pop.limits)
  obs.score <- num2scale(obs.intensity,obs.limits)
      score = obs.score,

panel.score <- function(n=100,pop.score,pop.limits,,,,center.scale=5,
    labels=LETTERS[1:length(pop.score)]) {
  la <- lapply(1:n,function(x) {
  dc <-,la)
  dc$obs <- factor(dc$obs)

overallP <- function(scores) {
  Res.aov <- aov( numresponse ~ obs +  product , data=scores)
  paov <- anova(Res.aov)['product','Pr(>F)']
  Res.clmm <- clmm(score  ~ product + (1|obs),data=scores)
  Res.clmm2 <- clmm(score ~ 1 + (1|obs),data=scores)
    clmm = pchisq(2*(Res.clmm$logLik-Res.clmm2$logLik),2,lower.tail=FALSE ) #.1687

onesim <- function(prodmeans,pop.limits,center.scale, {
  scores <- panel.score(40,prodmeans,pop.limits,
  scores$numresponse <- as.numeric(levels(scores$score))[scores$score]

Simulation I, 5 categories

The first simulation is with 5 categories. This represents the Just About Right (JAR) scale. The final plot shows the difference between ANOVA and clmm is minimal.

pop.limits <- c(1,2.5,4.5,6)

nsim <- 250
sim5cat1 <- lapply(seq(0,.6,.05),function(dif) {
    sa <- sapply(1:nsim,function(x) onesim(prodmeans,pop.limits))

sim5cat1tb <-,sim5cat1)
ggplot(sim5cat1tb, aes(dif,nreject/nsim ,colour=method)) +
     geom_line() + xlab('Difference between products') + 
     ylab('Proportion significant (at 5% Test)') +
     theme(legend.position = "bottom") + ylim(c(0,1)) +
     guides(colour = guide_legend('Analysis Method'))

Simulation 2, 9 categories

This simulation represents the intensity and liking scales. Again the difference between ANOVA and clmm are minimal.

pop.limits <- c(1,3:8,10)
prodmeans <- c(7,7,7)
scores <- panel.score(40,prodmeans,pop.limits,
scores$numresponse <- as.numeric(levels(scores$score))[scores$score]
nsim <- 250
sim9cat <- lapply(seq(0.,.6,.05),function(dif) {
      sa <- sapply(1:nsim,function(x) onesim(prodmeans,pop.limits,
sim9cattb <-,sim9cat)
ggplot(sim9cattb, aes(dif,(nsim-nsign)/nsim ,colour=method)) +
    geom_line() + xlab('Difference between products') + 
    ylab('Proportion significant (at 5% Test)') +
    theme(legend.position = "bottom") + ylim(c(0,1)) +
    guides(colour = guide_legend('Analysis Method'))


The differences between clmm and ANOVA seem to be minimal. I had not expected large differences, but especially at 5 categories my expectation were to find real differences as the continuous data is more violated. Obviously, a load more simulations would be needed to draw final conclusions. This is beyond the scope of my blog. 
To conclude, in theory clmm is much more suitable than ANOVA for ordinal data. There are no reasons in terms of presentation to prefer ANOVA over clmm. But in practice the difference may be minimal.

1 comment:

  1. This doesn't surprise me at all. You'd expect similar performance between logistic regression and regression in terms of its use as a significance test when the means you are comparing are between 0.2 and 0.8 and the residuals of the regression are approximately normal. Extending that logic to the ordinal case you'd tend to want few categories (e.g., 3-7) and decidedly non-normal residuals (so responses that don't sit in the middle of the ranges) and possibly also small small-ish samples.

    However, the real gain for logistic regression and ordinal logistic regression is in terms of prediction. The regression/ANOVA approach will have symmetrical confidence or prediction intervals and will frequently predict impossible values (though it can be OK in large samples where responses sit in the middle of the range).