Wiekvoet: A small comparison of bio-equivalence calculations.

Last week I looked at two-way cross-over studies and followed the example of Schütz (http://bebac.at/) in the analysis. Since the EU has its on opinions (Questions & Answers: Positions on specific questions addressed to the pharmacokinetics working party) and two example data sets, I was wondering how the various computations compared.

Data

There are 6 data-sets in the analysis.

The first data from Schütz.
The 'Dataset from Chow & Liu'. This is a two-period data set. Schütz did these data in three versions. This version is complete (balanced).
Data set 2; incomplete (one subject misses one evaluation). Note the EU guide states: 'subjects in a crossover trial who do not provide evaluable data for both of the test and reference products should not be included'.
Data set 2; unbalanced (one subject is missing).
EU data set 1: 'a four period crossover study where subjects receive both test and reference twice, with some subjects providing data for only a subset of the treatment periods'
EU data set 2: 'a three period crossover study where all subjects receive reference twice and test once'

Bio-equivalence analysis

There are five algorithms. I wrapped each in a function and made a unified print() method, so as to ease comparison. These functions have two input parameters; the data and the index for the level of the reference treatment.

The code from Schütz slide 14. Note this only did two period data, and not the incomplete data.
ANOVA. In contrast to my previous post subject is now fixed. For the interval I use just the residual degrees of freedom.
nlme This should deliver what proc mixed would deliver
MCMCglmm. A Bayesian MCMC sampler.
lme4. This is traditional mixed model with point estimate and interval via a simulation

Result comparison

Data set 1

Method Point_estimate p5CI p95CI CVIntra

[1,] "Schutz" 100.8168 95.47312 106.4596 7.370138

[2,] "ANOVA" 100.8168 95.47312 106.4596 7.370138

[3,] "nlme" 100.8168 95.47312 106.4596 7.370138

[4,] "MCMCglmm" 100.8011 95.38906 106.5084 8.336442

[5,] "lme4" 100.71 96.03672 105.9925 7.370137

Data set 2

Method Point_estimate p5CI p95CI CVIntra

[1,] "Schutz" 97.17545 88.3128 106.9275 19.47357

[2,] "ANOVA" 97.17545 88.3128 106.9275 19.47357

[3,] "nlme" 97.17545 88.31279 106.9275 19.47359

[4,] "MCMCglmm" 97.09287 86.66635 108.7019 24.21701

[5,] "lme4" 97.51018 89.10346 106.7388 19.47351

Data set 3: Inbalanced

Method Point_estimate p5CI p95CI CVIntra

[1,] "Schutz" 95.79137 87.02926 105.4356 19.05692

[2,] "ANOVA" 95.60894 86.86353 105.2348 19.05692

[3,] "nlme" 95.60894 86.86352 105.2349 19.05693

[4,] "MCMCglmm" 95.6052 85.20512 107.4927 24.13791

[5,] "lme4" 95.80032 86.8732 103.6471 19.05691

Data set 4: Incomplete

Method Point_estimate p5CI p95CI CVIntra

[1,] "ANOVA" 95.60894 86.86353 105.2348 19.05692

[2,] "nlme" 96.46814 87.6157 106.215 19.2212

[3,] "MCMCglmm" 96.78571 86.78392 108.6978 23.77217

[4,] "lme4" 96.76121 88.38716 105.469 19.22122

Data set 5: four period crossover

Method Point_estimate p5CI p95CI CVIntra

[1,] "ANOVA" 115.6587 107.1057 124.8948 41.65396

[2,] "nlme" 115.7298 107.1707 124.9725 41.66876

[3,] "MCMCglmm" 115.7334 107.0849 125.0422 41.87724

[4,] "lme4" 115.7831 106.6829 124.6334 41.66873

Data set 6: three period crossover

Method Point_estimate p5CI p95CI CVIntra

[1,] "ANOVA" 102.2644 97.31555 107.4649 11.85557

[2,] "nlme" 102.2644 97.31555 107.4649 11.85557

[3,] "MCMCglmm" 102.2624 97.29063 107.4589 12.17123

[4,] "lme4" 102.3081 97.30974 107.192 11.85559

From this small analysis one can observe

Only to use the Schütz code for balanced two period crossover data. But in that case nlme and ANOVA provide the same answers. Hence it is purely educational
Given the width of the intervals, the point estimates are close enough so differences can be ignored.
For all data except the incomplete data the difference between nlme and ANOVA is absent or very small.
MCMCglmm has some differences, compared to other methods it over estimates CVIntra. MCMCglmm gives a slightly wider confidence interval for bioequivalence.
lme4 had in most cases a slightly narrower interval.

Code

###############################################################################

# Data

###############################################################################

#Schutz 1

T1 <- c(28.39,33.36,24.29,28.61,59.49,42.30)

T2 <- c(49.42,36.78,34.81,45.54,28.23,25.71)

R1 <- c(39.86,32.75,34.97,45.44,27.87,24.26)

R2 <- c(35.44,33.40,24.65,31.77,65.29,37.01)

S1 <- data.frame(

aval=c(T1,R2,R1,T2),

subjid=factor(c(

rep(c(1,4,6,7,9,12),2),

rep(c(2,3,5,8,10,11),2))),

seqa=rep(c('TR','RT'),each=12),

trta=rep(c('T','R','R','T'),each=6),

aperiod=factor(rep(c(1,2,1,2),each=6))

)

S1 <- S1[order(S1$subjid,S1$aperiod),]

S1$logAval <- log(S1$aval)

S2 <- read.table(textConnection('

Subject Period Sequence Formulation AUC Subject Period Sequence Formulation AUC

1 1 RT Reference 74.675 13 1 TR Test 122.45

1 2 RT Test 73.675 13 2 TR Reference 124.975

2 1 TR Test 74.825 14 1 TR Test 99.075

2 2 TR Reference 37.35 14 2 TR Reference 85.225

3 1 TR Test 86.875 15 1 RT Reference 69.725

3 2 TR Reference 51.925 15 2 RT Test 59.425

4 1 RT Reference 96.4 16 1 RT Reference 86.275

4 2 RT Test 93.25 16 2 RT Test 76.125

5 1 RT Reference 101.95 17 1 TR Test 86.35

5 2 RT Test 102.125 17 2 TR Reference 95.925

6 1 RT Reference 79.05 18 1 TR Test 49.925

6 2 RT Test 69.45 18 2 TR Reference 67.1

7 1 TR Test 81.675 19 1 RT Reference 112.675

7 2 TR Reference 72.175 19 2 RT Test 114.875

8 1 TR Test 92.7 20 1 RT Reference 99.525

8 2 TR Reference 77.5 20 2 RT Test 116.25

9 1 TR Test 50.45 21 1 TR Test 42.7

9 2 TR Reference 71.875 21 2 TR Reference 59.425

10 1 TR Test 66.125 22 1 TR Test 91.725

10 2 TR Reference 94.025 22 2 TR Reference 114.05

11 1 RT Reference 79.05 23 1 RT Reference 89.425

11 2 RT Test 69.025 23 2 RT Test 64.175

12 1 RT Reference 85.95 24 1 RT Reference 55.175

12 2 RT Test 68.7 24 2 RT Test 74.575

'),header=TRUE,stringsAsFactors=FALSE)

S2 <- with(S2,data.frame(

aval=c(AUC,AUC.1),

subjid=factor(c(Subject,Subject.1)),

seqa=factor(c(Sequence,Sequence.1)),

trta=factor(c(Formulation,Formulation.1)),

aperiod=factor(c(Period,Period.1))))

S2$logAval <- log(S2$aval)

S2InC <- S2[!(S2$subjid=='24' & S2$aperiod=='2'),]

S2InB <- S2[!(S2$subjid=='24'),]

##
#The EU data sets I copied-pasted into .csv prior to analysis.

#EU1

EU1 <- read.csv('eu1.csv')

EU1$subjid <- factor(EU1$SUBJECT)

EU1$aperiod <- factor(EU1$PERIOD)

EU1$aval <- EU1$DATA

EU1$seqa <- EU1$SEQUENCE

EU1$trta <- EU1$FORMULATION

EU1$logAval <- log(EU1$aval)

#EU2

EU2 <- read.csv('eu2.csv')

EU2$subjid <- factor(EU2$SUBJECT)

EU2$aperiod <- factor(EU2$PERIOD)

EU2$aval <- EU2$DATA

EU2$seqa <- EU2$SEQUENCE

EU2$trta <- EU2$FORMULATION

EU2$logAval <- log(EU2$aval)

###############################################################################

# Analysis functions

###############################################################################

print.BEQ <- function(x,...) {

result <- paste(

" Back transformed (raw data scale)\n",

"Point estimate:",

round(x$Point_estimate, digits=2),"%\n",

"90 % confidence interval:",

round(x$p5CI,digits=2), "% to",

round(x$p95CI, digits=2),"%\n",

"CVIntra:",round(x$CVIntra,digits=2),

"%\n")

cat(result)

invisible(x)

}

BESchutz <- function(datain,iref) {

prepdata <- datain[order(datain$seqa,datain$subjid,datain$aperiod),]

T1 <- with(prepdata,aval[seqa==levels(seqa)[iref] & aperiod==1])

R1 <- with(prepdata,aval[seqa==levels(seqa)[3-iref] & aperiod==1])

R2 <- with(prepdata,aval[seqa==levels(seqa)[iref] & aperiod==2])

T2 <- with(prepdata,aval[seqa==levels(seqa)[3-iref] & aperiod==2])

RT <- log(R1) - log(T2)

TR <- log(R2) - log(T1)

n1 <- length(RT)

mRT <- mean(RT)

vRT <- var(RT)

n2 <- length(TR)

mTR <- mean(TR)

vTR <- var(TR)

mD <- mean(log(c(T1,T2))) - mean(log(c(R1,R2)))

MSE <- (((n1-1)*vRT + (n2-1)*vTR)/(n1+n2-2))/2

alpha <- 0.05

lo <- mD - qt(1-alpha,n1+n2-2)*sqrt(MSE)*

sqrt((1/(2*n1) + 1/(2*n2)))

hi <- mD + qt(1-alpha,n1+n2-2)*sqrt(MSE)*

sqrt((1/(2*n1) + 1/(2*n2)))

#result <- paste(

# paste(" Back transformed (raw data scale)\n",

# "Point estimate:",

# round(100*exp(mD), digits=2),"%\n"),

# paste("90 % confidence interval:"),

# paste(round(100*exp(lo), digits=2), "% to"),

# paste(round(100*exp(hi), digits=2),"%\n",

# paste("CVintra:",round(100*sqrt(exp(MSE)-1),

# digits=2),"%\n")))

result <- list(Method='Schutz',

Point_estimate=100*exp(mD),

p5CI=100*exp(lo),

p95CI=100*exp(hi),

CVIntra=100*sqrt(exp(MSE)-1))

class(result) <- c('BEQ',class(result))

result

}

#######

# plain anova

#######

BEanova <- function(datain,iref) {

trtastring <- paste('trta',levels(datain$trta)[iref],sep='')

A1 <- aov(logAval ~ aperiod +seqa+ trta + subjid,

data=datain)

SE <- sqrt(vcov(A1)[trtastring,trtastring])

td <- qt(.05,A1$df.residual)

result <- list(Method='ANOVA',

Point_estimate=100*exp(coef(A1)[trtastring]),

p5CI=100*exp(coef(A1)[trtastring]+td*SE),

p95CI=100*exp(coef(A1)[trtastring]-td*SE),

CVIntra=100*sqrt(exp(sum(A1$residuals^2)/A1$df.residual)-1))

class(result) <- c('BEQ',class(result))

result

}

########

# nlme

########

library(nlme)

BEnlme <- function(datain,iref) {

trtastring <- paste('trta',levels(datain$trta)[iref],sep='')

L1 <- lme(logAval ~ aperiod + seqa + trta,

random=~1|subjid,

data=datain)

coL1 <- fixef(L1)

SEL1 <- sqrt(vcov(L1)[trtastring,trtastring])

dfL1 <- L1$fixDF$terms[names(L1$fixDF$terms)=='trta']

tdL1 <- qt(.05,dfL1)

result <- list(Method='nlme',

Point_estimate=100*exp(coL1[trtastring]),

p5CI=100*exp(coL1[trtastring]+tdL1*SEL1),

p95CI=100*exp(coL1[trtastring]-tdL1*SEL1),

CVIntra=100*sqrt(exp(as.numeric(VarCorr(L1)['Residual','Variance']))-1))

class(result) <- c('BEQ',class(result))

result

}

######

# MCMCglm

######

library(MCMCglmm)

BEmcmcglmm <- function(datain,iref) {

trtastring <- paste('trta',levels(datain$trta)[iref],sep='')

M1 <- MCMCglmm(logAval ~ aperiod + seqa + trta,

random= ~ subjid ,

data=datain,family='gaussian',nitt=500000,thin=20,

burnin=50000,verbose=FALSE)

M1est <- 100*exp(quantile(M1$Sol[,trtastring],c(.05,.5,.95)))

result=list(Method='MCMCglmm',

Point_estimate=M1est[2],

p5CI=M1est[1],

p95CI=M1est[3],

CVIntra= 100*sqrt(exp(mean(M1$VCV[,'units']))-1))

class(result) <- c('BEQ',class(result))

result

}

#######

# lme4

######

library(lme4)

BElmer <- function(datain,iref) {

trtastring <- paste('trta',levels(datain$trta)[iref],sep='')

R1 <- lmer(logAval ~ aperiod + seqa + trta + (1|subjid),

data=datain)

simeff <- function(m0) {

s <- simulate(m0)

fixef(refit(m0,s))[trtastring]

}

ff <- replicate(1000,simeff(R1))

R1est <- 100*exp(quantile(ff,c(.05,.5,.95)))

result <- list(Method='lme4',

Point_estimate=R1est[2],

p5CI=R1est[1],

p95CI=R1est[3],

CVIntra=100*sqrt(exp(sigma(R1)^2)-1))

class(result) <- c('BEQ',class(result))

result

}

###############################################################################

# Result summary

###############################################################################

t(cbind(BESchutz(S1,2)

, BEanova(S1,2)

, BEnlme(S1,2)

, BEmcmcglmm(S1,2)

, BElmer(S1,2)

))

t(cbind(BESchutz(S2,2)

, BEanova(S2,2)

, BEnlme(S2,2)

, BEmcmcglmm(S2,2)

, BElmer(S2,2)

))

t(cbind( BESchutz(S2InB,2)

, BEanova(S2InB,2)

, BEnlme(S2InB,2)

, BEmcmcglmm(S2InB,2)

, BElmer(S2InB,2)

))

t(cbind( BEanova(S2InC,2)

, BEnlme(S2InC,2)

, BEmcmcglmm(S2InC,2)

, BElmer(S2InC,2)

))

t(cbind( BEanova(EU1,2)

, BEnlme(EU1,2)

, BEmcmcglmm(EU1,2)

, BElmer(EU1,2)

))

t(cbind( BEanova(EU2,2)

, BEnlme(EU2,2)

, BEmcmcglmm(EU2,2)

, BElmer(EU2,2)

))

2 comments:

Unknown05 December, 2013 18:42
Hi,

I don’t think that your setup of the replicated studies (data sets 5 & 6) is correct. See the results reported in EMA’s Q&A-document for ‘Method C’ – which is also the coding preferred by the FDA.

Data set 5
PE 115.6577% (90% CI: 107.1044 – 124.8939%)
Data set 6
PE 102.2644% (90% CI: 97.0532 – 107.7554%)

These results are obtained in SAS and Pharsight’s Phoenix/WinNonlin.

R is a nasty beast.

Helmut

Sunday, November 10, 2013

A small comparison of bio-equivalence calculations.

Data

Bio-equivalence analysis

Result comparison

Data set 1

Data set 2

Data set 3: Inbalanced

Data set 4: Incomplete

Data set 5: four period crossover

Data set 6: three period crossover

Code

2 comments: