This example comes from Chapter 6, page 305-307.
We'll consider an application concerning a case/control study in which the dependent variable is the haplotypical expression in a chromosome 17 marker. The goal of the study is to estimating the association between the group variable (Cases/Controls) and the marker (the categorical variable).
library(permuter) library(knitr) set.seed(50) data(chrom17m) chrom17m
The association test was performed using $\chi^2$ (asymptotic and exact) and $T_{SM}$ tests (Sham and Curtis, 1995). Moreover, the analysis was developed following the multifocus approach. Ten thousand CMC iterations were carried out for the four tests based on a permutation strategy.
This explanation makes no sense. what 4 tests?
The permutation proceeds as follows: the number of cases and controls (n1
and n2
) remains fixed. We permute the group labels and then recompute the $k$-by-2 table. The test statistic is something for each of the $k$ categories.
B <- 1000 X1 <- rep(chrom17m$Groups, chrom17m$Cases) X2 <- rep(chrom17m$Groups, chrom17m$Controls) X <- as.factor(c(X1, X2)) k <- nrow(chrom17m) n1 <- sum(chrom17m$Cases) n2 <- sum(chrom17m$Controls) n <- n1 + n2 l <- rep(c(1, 2), c(n1, n2)) data <- chrom17m[, -1] marginal <- rowSums(chrom17m[, -1]) observed <- rep(0, k) distr <- matrix(0, nrow = B, ncol = k) for (kk in 1:k) { observed[kk] <- (chrom17m[kk, 2] - marginal[kk] * n1/n)^2/(marginal[kk] * n1/n) for (bb in 1:B) { X.star <- sample(X) data.star <- table(X.star, l) marginal.star <- rowSums(data.star) distr[bb, kk] <- (data.star[kk, 1] - marginal.star[kk] * n1/n)^2/(marginal.star[kk] * n1/n) } }
We compute the p-values for each of the $k$ groups individually. Then, since we want to make inferences for individual alleles, we need to adjust for multiple tests. We use fwe.minp
to do this.
partial_pvalues <- sapply(1:k, function(i) t2p(observed[i], distr[, i], alternative = "greater")) names(partial_pvalues) <- chrom17m$Groups adjusted_pvalues <- fwe_minp(partial_pvalues, distr) names(adjusted_pvalues) <- chrom17m$Groups kable(cbind(partial_pvalues, adjusted_pvalues), row.names = TRUE, col.names = c("Partial P-values", "Adjusted P-values"))
The table above shows the raw p-values and the adjusted p-values for multiplicity using Tippett's combination for each category. The alleles A and C are individually significant at a level of $\alpha = 0.05$. After adjustment for multiplicity (Tippett's step-down procedure; Finos et al., 2003) only allele C remains significant.
Finally, suppose we want a global p-value for the hypothesis test. We'll use the NPC methodology.
null_pvalues <- apply(distr, 2, pvalue_distr) # get a p-value corresponding to each null test statistic tippett_distr <- apply(null_pvalues, 1, tippett) fisher_distr <- apply(null_pvalues, 1, fisher) liptak_distr <- apply(null_pvalues, 1, liptak) global_pvalue_tippett <- t2p(tippett(partial_pvalues), tippett_distr, alternative = "greater") global_pvalue_fisher <- t2p(fisher(partial_pvalues), fisher_distr, alternative = "greater") global_pvalue_liptak <- t2p(liptak(partial_pvalues), liptak_distr, alternative = "greater") globals <- c(global_pvalue_tippett, global_pvalue_fisher, global_pvalue_liptak) kable(t(globals), row.names = F, col.names = c("Tippett", "Fisher", "Liptak"))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.