library(knitr);library(CTP) knitr::opts_chunk$set(collapse = TRUE, out.width="100%" ,fig.width=12,fig.height=8,dev.args = list(pointsize=25) )
The closure principle is a way to protect the type-I error from multiple testing. Here, we follow the description in [@Bretz2011]. It consists of four steps:
Definition of a set ${H} = {H_1,\ldots, H_n}$ of elementary hypotheses.
Construction of the closure set ("Hypothesis Tree").
$$\overline{H} = \left { H_I =\bigcap_{i \in I}H_i : \quad I \subseteq {1,\ldots,n} \right } $$ $$(\text{all intersection hypotheses} H_I ).$$
Construction of a local level-$\alpha$ test for each $H_I \in \overline{H}$.
Rejection of $H_i$, if all null hypotheses $H_I \in \overline{H}$ with $i \in I$ are rejected at at the local level $\alpha$.
As the null hypothesis $H_i$ is rejected only if the null hypotheses $H_I \in \overline{H}$ with $i \in I$ are rejected (see point 4. above), the adjusted p-value $p_{adj;i}$ for $H_i$ is defined as:
The package was designed in partcular for treatment comparisons in ANOVA-like situations.
The hypothesis tree of the closed testing procedure is created using the function IntersectHypotheses
.
In the case of single hypotheses (i.e. if the hypothesis can be described by a single integer vector e.g. (1,3,5) the test (F-Test, Kruskal-Wallis-test, probability test, logrank test, ....) is applied directly.
For combined hypotheses (i.e. for hypotheses described by several non-overlapping integer vectors eg. (1,2), (3,4), The procedure differs for (generalised) linear hypotheses and other tests.
In the case of generalised linear hypotheses, the contrast matrices for the single hypotheses included are combined and these contrasts are tested simultaneously. Henceforth, functions from the package emmeans
[@emmeans] are used, as for all othe linear and generalised linear hypotheses. For all other tests, first the p-values $p_1, p_2, \ldots ,p_m$ for the single hypotheses are calculated, and then these are combined by Fisher's combination rule:
If all $m$ hypotheses are assumed to be independent, the test statistics $X$ follows under $H_0$ a $\chi^2$-distribution with $2m$ degrees of freedom: $$ X=-2\sum_{i=1}^{m}\ln(p_i) \sim \chi_{2m}^2$$ from which a p-value for the global hypothesis can be easily obtained.
In the case of trend tests, the same type of test is applied for all intermediate single tests.
Finally the p-values for the elementary hypotheses are adjusted by calculating the maximum of the p-values from the hypotheses in the testing set of the respective hypothesis.
The function AnalyseCTP
calculates all local p-values and the adjusted p-values for all elementary hypotheses.
With the function Adjust_raw
, it is also possible to use p-values that have been calculated by other functions or software to calculate the adjusted p-values.
The testing set for a specific elementary hyothesis can be printed by the function TestingSet
.
library(CTP) Pairwise <- IntersectHypotheses(list(c(1,2), c(1,3), c(1,4), c(2,3), c(2,4), c(3,4))) Set24 <- TestingSet(Pairwise,"[24]") Set24
The dataframe pasi
comprises the changes in PASI-score (Psoriasis Area and Severity Index) from baseline
within two month in 72 patients treated with three different doses of Etretin or Placebo in a double blind study.
The elementary hypotheses 1:2, 1:3, 1:4 are tested simultaneously using the F-Test i.e. $H_1: \mu_1=\mu_2$, $H_2: \mu_1=\mu_3$ and $H_3: \mu_1=\mu_4$ simultaneously. The groups with levels 2,3 and 4 are compared to the control (Placebo) group (level 1). In this specific example, the adjusted and unadjusted p-values are the same. All doses show a significant effect compared to Placebo.
library(CTP) data(pasi) three.to.first <- IntersectHypotheses(list(1:2,c(1,3),c(1,4))) Display(three.to.first,Type="s",arrow=TRUE) pasi.ctp.F1 <- AnalyseCTP(three.to.first,pasi.ch~dose,pasi) summary(pasi.ctp.F1) Display(pasi.ctp.F1)
Testing the elementary hypotheses 1:2, 2:3, 3:4 simultaneously using the F-Test, i.e. testing $H_1: \mu_1=\mu_2$, $H_2: \mu_2=\mu_3$ and $H_3: \mu_3=\mu_4$ simultaneously. This provides quite different results (compared to pasi.ctp.F1
): No further improvement for higher doses.
dose.steps4 <- IntersectHypotheses(list(1:2,2:3,3:4)) Display(dose.steps4,arr=TRUE) pasi.ctp.F2 <- AnalyseCTP(dose.steps4,pasi.ch~dose,pasi) summary(pasi.ctp.F2) Display(pasi.ctp.F2)
For the same hypothesis structure, other tests can also be used:
As an example, a positive response is defined as a change from baseline PASI score greater than 50. The new variable Resp
has then the value 1 if pasi.ch > 50 or 0 otherwise. The corresponding model is chosen to be a generalised linear model with logit-link, as implemented in glm
.
pasi$Resp <- ifelse(pasi$pasi.ch > 50,1,0) pasi.ctp_bin <-AnalyseCTP(three.to.first,Resp~dose,pasi,test.name="glm",family="binomial") summary(pasi.ctp_bin)
pasi.ctp.K <- AnalyseCTP(dose.steps4,pasi.ch~dose,pasi, test="kruskal") summary(pasi.ctp.K) Display(pasi.ctp.K)
pasi.ctp.J1 <- AnalyseCTP(dose.steps4,pasi.ch~dose,pasi, test="jonckheere",alternative="increasing") summary(pasi.ctp.J1)
The data set colorectal
contains the response rates from a dose finding study in metastatic colorectal cancer. Two doses of the experimental drug were compared to the standard treatment. The response rates in the two dose groups are compared to the control responder rate using both, the $\chi^2$-test and Fisher's exact test.
two.to.first<- IntersectHypotheses(list(1:2,c(1,3))) Display(two.to.first,Type="s",main="two vs control",arrow=TRUE) #The two elementary hypotheses are tested after comparing the three proportions globally. data(colorectal) colorectal.ctp <-AnalyseCTP(two.to.first,responder~dose,data=colorectal, test="prob") summary(colorectal.ctp) Display(colorectal.ctp,Type="t") colorectal.chisq <-AnalyseCTP(two.to.first,responder~dose,data=colorectal, test="chisq") summary(colorectal.chisq, digits=1)
This example uses the sample dataset ovarian
from the package survival
. The overall survival curves of the two treatments rx
do not differ significantly:
```r library(survival) data(ovarian)
print(survdiff(Surv(futime,fustat)~rx, data=ovarian))
Together with the performance subgroups `ecog=1` and `ecog=2` , a factor "subgroups" defined by the combinations of the performance measure `ecog.ps` and the treatment `rx`. ```r ovarian$subgroups <- as.factor(10*ovarian$ecog.ps+ovarian$rx) print(head(ovarian))
Then, the treatment differences within the performance subgroups ecog=1 and ecog=2 are compared. I.e. the elementary hypotheses are subgroup11=subgroup12 and subgroup21=subgroup22 or ${(1,2),(3,4)}$.
comb.sub <- IntersectHypotheses(list(c(1,2),c(3,4))) #Display(comb.sub) ovar.ctp <-AnalyseCTP(comb.sub,Surv(futime,fustat)~subgroups, ovarian, test="lgrank") summary(ovar.ctp) Display(ovar.ctp)
In a study with diabetes type II patients (dataset glucose
), three doses of a drug are compared to a placebo. The primary variable is the change of fasting plasma glucose from baseline. Fasting plasma glucose at baseline is included into the model as covariate (only implemented for linear and generalised linear models).
data(glucose) glucose.ctp <- AnalyseCTP(three.to.first,GLUCOSE.CHANGE~GLUCOSE.BLA+DOSE, data=glucose, factor.name="DOSE") summary(glucose.ctp) Display(glucose.ctp,Type="s")
Whith an increasing number of hypotheses to test, the graphical display may become quite confusing:
G <- factor(rep(1:5,each=4) ) y <- rnorm(20) Y <- data.frame(G,y) xxx <- IntersectHypotheses(list(1:2,c(1,3),c(1,4),c(1,5),c(2,5),c(3,4))) summary(xxx) Display(xxx)
It is possible to:
IntersectHypotheses
to create the closure set and Display
to plot the corresponding hypothesis tree.Adjust_raw
to calculte the adjusted p-values for the elementary hypotheses. Adjust_raw
delivers an object of class ctp
.summary
and Display
on this object.Pairwise <- IntersectHypotheses(list(c(1,2), c(1,3), c(1,4), c(2,3), c(2,4), c(3,4))) Display(Pairwise) summary(Pairwise) # the vector of p-values calculated by another software # (Example from Prof. John M. Lachin, The Biostatistics Center Rockville MD) p.val <- c( 0.4374, 0.6485, 0.4103, 0.2203, 0.1302, 0.6725, 0.4704, 0.3173, 0.6762, 0.7112, 0.2866, 0.3362, 0.2871, 0.4633) result <- Adjust_raw(Pairwise, p.value=p.val) summary(result,digits=3) # details may be documented result <- Adjust_raw(Pairwise, p.value=p.val ,dataset.name="my Data", factor.name="Factor" ,factor.levels=c("A","B","C","D"), model=y~Factor ,test.name="my Test") summary(result,digits=3)
nocite: | @Gabriel1976, @Bauer1991, @Dmit2010 ...
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.