Sunday, March 10, 2013

More sequential testing for triangle tests

I looked before at triangle tests and at sequential testing in triangle tests (blog entry). The latter post demonstrated that a sequential test is possible without any cost in the desired error of the first kind. This is because the desired alpha (e.g. 5%) is never obtained exactly in a test based on the binomial distribution. So, e.g. an attained alpha of 3% leaves 2% to spare. This new post demonstrates that a similar intermediate decision is possible with regard to H1, and ends with a simulation combining both.
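For illustration, the attained alpha for a few panel sizes can be computed directly from the binomial distribution; it always falls below the nominal 5%:

# attained alpha of the one-sided binomial test (guessing probability 1/3)
n <- c(20, 25, 30, 35)
crit <- qbinom(.95, n, 1/3) # reject when more than crit correct
data.frame(n=n, critical=crit,
    alpha=pbinom(crit, n, 1/3, lower.tail=FALSE))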

Triangle tests

As mentioned before, a triangle test is a simple and frequently used sensory test. The aim is to determine whether two products are different. Subjects get three samples: two of one kind, one of the other kind. Each subject has to determine which of the three samples is the odd one. If a sufficient number of subjects give the correct answer, the products are declared different.
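As a point of reference, a single fixed-sample triangle test is just a one-sided binomial test. A minimal sketch, with a hypothetical 18 correct out of 35 triangles:

# hypothetical data: 18 correct out of 35 triangles;
# under H0 every subject guesses, so the probability of correct is 1/3
binom.test(18, 35, p=1/3, alternative='greater')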

True discriminators

One additional concept is used: true discriminators. This is the idea that there are two kinds of subjects: people who recognize the odd product, and people who make a random guess. The former always select the odd sample; the latter are correct one third of the time. It is easy to convert from true discriminators to correct choices. True discriminators is simply the scale sensory specialists like to use.

library(ggplot2)
# proportion correct given proportion true discriminators, and its inverse
True2Correct <- function(x) 1/3 + x*2/3
Correct2True <- function(x) x-(1-x)/2
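As a quick check on these functions: one third true discriminators corresponds to 1/3 + (1/3)*2/3 = 5/9 correct choices, and the two functions are each other's inverse.

True2Correct(1/3)
[1] 0.5555556
Correct2True(True2Correct(1/3))
[1] 0.3333333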

Power for 35 subjects

I chose 35 subjects since this is quite a normal panel size. It gives an alpha of 4.4% and, with 33% true discriminators, a power of 84%. The plot shows that 35 subjects is reasonable for detecting 30% to 50% true discriminators, which is more or less the region of interest.
# choose H1: one third true discriminators
PropH1TDisc <- 1/3
nObs <- 35
(limit <- qbinom(.95,nObs,1/3))
[1] 16
1-pbinom(limit,nObs,1/3) # alpha
[1] 0.04419916
# plot power
qplot(y=1-pbinom(limit,nObs,True2Correct(seq(0,.6,length.out=30))),
        x=seq(0,.6,length.out=30),geom='path',
        xlab='Proportion True Discriminators',
        ylab='Power') +
    geom_vline(xintercept = PropH1TDisc,colour='blue')
ggsave('een.png')

# power under H1
1-pbinom(limit,nObs,True2Correct(PropH1TDisc))
[1] 0.8417316

Adding a hypothesis test after 16 triangles

First the expected distributions under H0 and H1 are plotted. The plot shows that after 16 triangle tests there are outcomes which occur almost only under H0, and outcomes which occur almost only under H1.
Subsequently a routine is made which gives, for a given number correct after the first 16 triangles, the probability of still getting over the limit (16 correct out of 35) at the end of the full test. For example, for 0 correct this is the probability of getting 0 correct in the first 16 triangles, times the probability of getting more than the remaining 16-0=16 correct in the second set of 19 triangles. Adding the values for 0 and 1 gives the probability of getting over the limit given 0 or 1 correct in the first phase. The cumulative of these values thus shows the cost in power when a given cut-off after 16 triangles is used to decide that H0 will not be rejected (the products are not different).
It appears that up to 5 correct this cost is low; the remaining power is still 83%.
nObs1 <- 16
# expected distributions
qplot(y=dbinom(0:nObs1,nObs1,True2Correct(PropH1TDisc)),
        x=0:nObs1,colour='H1, 33% Discriminators',
        xlab='Number of correct',
        ylab='Probability')  +
    geom_point(aes(y=dbinom(0:nObs1,nObs1,1/3),
            x=0:nObs1,colour='H0, No Discriminators')) +
    labs(colour = "Distribution") +
    theme(legend.position='bottom')
ggsave('twee.png')

Powercost <- function(nstep1,nObs1,limit,nObs,PropH1TDisc) {
  # power lost under H1 by stopping after nObs1 triangles
  # when nstep1 correct are observed in the first phase
  nObs2 <- nObs-nObs1
  Correct2Limit <- limit-nstep1
  ProbCorrect <- True2Correct(PropH1TDisc)
  ifelse (Correct2Limit < 0,0, 
      dbinom(nstep1,nObs1,ProbCorrect) *
          pbinom(Correct2Limit,nObs2,ProbCorrect,lower.tail=FALSE) 
  )
}

PowerExpect <- data.frame(
    nCorrect1=0:nObs1,
    # pH1: cumulative probability of nCorrect1 under H1
    pH1=pbinom(0:nObs1,nObs1,True2Correct(PropH1TDisc)),
    Powercost=cumsum(Powercost(0:nObs1,nObs1,limit,nObs,PropH1TDisc))
)

PowerExpect$Powerremaining <- 1-pbinom(limit,nObs,True2Correct(PropH1TDisc))-PowerExpect$Powercost    
PowerExpect[1:9,]

  nCorrect1          pH1    Powercost Powerremaining
1         0 2.317820e-06 4.111782e-09      0.8417316
2         1 4.867422e-05 4.110800e-07      0.8417312
3         2 4.832655e-04 1.396838e-05      0.8417176
4         3 3.018381e-03 2.294403e-04      0.8415021
5         4 1.331729e-02 2.139068e-03      0.8395925
6         5 4.421401e-02 1.247786e-02      0.8292537
7         6 1.150190e-01 4.884815e-02      0.7928834
8         7 2.414565e-01 1.359299e-01      0.7058016
9         8 4.192592e-01 2.832904e-01      0.5584411
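As a check on Powercost, the values over all possible phase 1 outcomes must sum to the power of the full test, since together they cover every way of getting over the limit:

sum(Powercost(0:nObs1,nObs1,limit,nObs,PropH1TDisc))
[1] 0.8417316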

A simulation

Validation of the calculation comes from a simulation. Two functions are used. First, a function simulating the number of correct triangle tests for a given proportion of true discriminators. The second function runs the triangles in two phases and gives the verdict: either a decision after 16 triangles, or a decision after the full run. After 16 triangles, H0 is not rejected when fewer than 6 correct are observed, H0 is rejected when more than 9 correct are observed (the limit of 9 is from the previous sequential testing post), and otherwise the remainder of the data is used. Finally, this function is called a hundred thousand times under both distributions and the results are tabulated.
It appears that almost half of the time a decision can be made after 16 tests. The chance of an error of the first kind is approximately 5%, and the power is approximately 83%.
rNCTriangle <- function(nObs,PropTDisc) {
  # simulate the number correct out of nObs triangles
  PropCorrect <- True2Correct(PropTDisc)
  sum(sample(0:1,nObs,prob=c(1-PropCorrect,PropCorrect),replace=TRUE))
}

TrialRun <- function(PropTDisc) {
  # phase 1: 16 triangles, phase 2: the remaining 19
  test1 <- rNCTriangle(16,PropTDisc)
  test2 <- rNCTriangle(19,PropTDisc)
  ans <- if (test1+test2>limit) 'Reject H0 end' else 'Not Reject H0 end'
  # early decision after phase 1: fewer than 6 or more than 9 correct
  if (test1<6) ans <- 'Not Reject H0 phase 1'
  if (test1>9) ans <- 'Reject H0 phase 1'
  ans
}

Nsamp <- 1e5
DecisionH0 <- sapply(1:Nsamp,function(x) TrialRun(0))
as.data.frame(table(DecisionH0)/Nsamp)

             DecisionH0    Freq
1     Not Reject H0 end 0.40487
2 Not Reject H0 phase 1 0.54435
3         Reject H0 end 0.03488
4     Reject H0 phase 1 0.01590
DecisionH1 <- sapply(1:Nsamp,function(x) TrialRun(1/3))

as.data.frame(table(DecisionH1)/Nsamp)

             DecisionH1    Freq
1     Not Reject H0 end 0.11983
2 Not Reject H0 phase 1 0.04316
3         Reject H0 end 0.45319
4     Reject H0 phase 1 0.38382
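The phase 1 frequencies can also be compared with the exact binomial probabilities; they agree with the simulated tables above:

# exact probabilities of a phase 1 decision
pbinom(5,16,1/3)                                # not reject H0 phase 1, under H0
pbinom(9,16,1/3,lower.tail=FALSE)               # reject H0 phase 1, under H0
pbinom(5,16,True2Correct(1/3))                  # not reject H0 phase 1, under H1
pbinom(9,16,True2Correct(1/3),lower.tail=FALSE) # reject H0 phase 1, under H1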
