This is the main function implementing the Convex Hierarchical Testing (CHT) procedure. The CHT procedure produces a set of test statistics for both main effects and interactions with the property that an interaction's statistic is never larger than the larger of the statistics of its two main effects. This is accomplished by formulating a convex optimization problem that enforces a hierarchical sparsity relationship between the main effects and interactions. The result is that interactions whose main effects are large receive a "boost" relative to interactions whose main effects are small.
x | n by p design matrix
y | binary (0 or 1) vector of length n indicating class
type | determines whether Fisher transform should be applied to interaction contrasts. See below for explanation. Default is Fisher and is the recommended choice.
The Convex Hierarchical Testing test statistics are the knots of the CHT optimization problem. That is, the statistic for a given main effect or interaction is the value of lambda at which the corresponding parameter becomes nonzero in the regularization path. Theorem 1 of the CHT paper gives the closed form expression used to compute these knots (recall that for the interaction test statistics, one takes the maximum of the two corresponding knots).
In Section 2.1 of the CHT paper, the raw main effect and interaction contrasts are defined. These are referred to as "w" and "z" in the paper. The main effect contrast "w" is the standard two-sample t-statistic. The interaction contrast "z" is the normalized difference of the Fisher transformed sample correlations between the two classes. If one instead uses type="simple", we simply take for "z" a two-sample statistic on the products of features. We recommend that type="Fisher" be used instead of "simple".
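To make the two raw contrasts concrete, here is a minimal sketch in base R. The helper names main_contrast and interaction_contrast are hypothetical and are not part of the package. The sketch assumes a pooled-variance t-statistic for "w" and the usual large-sample standard error sqrt(1/(n1 - 3) + 1/(n0 - 3)) for the Fisher-transformed correlations in "z"; the normalization used in the paper and the package may differ.

```r
# Hypothetical helpers for illustration; not the package's internal code.

# Main effect contrast "w": two-sample t-statistic for each feature
# (pooled variance is an assumption made here).
main_contrast <- function(x, y) {
  apply(x, 2, function(col) {
    unname(t.test(col[y == 1], col[y == 0], var.equal = TRUE)$statistic)
  })
}

# Interaction contrast "z": difference between classes of the
# Fisher-transformed (atanh) sample correlations, divided by an
# assumed standard error of sqrt(1/(n1 - 3) + 1/(n0 - 3)).
interaction_contrast <- function(x, y) {
  x1 <- x[y == 1, , drop = FALSE]
  x0 <- x[y == 0, , drop = FALSE]
  r1 <- cor(x1)
  r0 <- cor(x0)
  # Zero the diagonals first so that atanh(1) = Inf never appears.
  diag(r1) <- 0
  diag(r0) <- 0
  z <- (atanh(r1) - atanh(r0)) /
    sqrt(1 / (nrow(x1) - 3) + 1 / (nrow(x0) - 3))
  diag(z) <- NA  # a feature has no interaction with itself
  z
}

# Small usage example on pure-noise data.
set.seed(1)
n <- 200
p <- 5
y <- rep(0:1, each = n / 2)
x <- matrix(rnorm(n * p), n, p)
w <- main_contrast(x, y)         # vector of length p
z <- interaction_contrast(x, y)  # symmetric p by p matrix
```

Here w[j] is the contrast for feature j and z[j, k] is the contrast for the pair (j, k). These are only the raw inputs; the CHT statistics themselves are the knots described above.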
A hiertest object, which consists of an ordered list of the main effects and interactions and a vector indicating which of these are interactions.
Bien, Simon, and Tibshirani (2015) Convex Hierarchical Testing of Interactions. Annals of Applied Statistics. Vol. 9, No. 1, 27-42.
# generate some data according to the backward model:
set.seed(1)
n <- 200
p <- 50
y <- rep(0:1, each=n/2)
x <- matrix(rnorm(n*p), n, p)
colnames(x) <- c(letters,LETTERS)[1:p]
# make some interactions between several pairs of variables:
R <- matrix(0.3, 5, 5)
diag(R) <- 1
x[y==1, 1:5] <- x[y==1, 1:5] %*% R
# and a main effect for variables 1 through 5:
x[y==1, 1:5] <- x[y==1, 1:5] + 0.5
testobj <- hiertest(x=x, y=y, type="Fisher")
# look at test statistics
print(testobj)
plot(testobj)
## Not run:
lamlist <- seq(5, 2, length=100)
estfdr <- estimate.fdr(x, y, lamlist, type="Fisher", B=200)
plot(estfdr)
print(estfdr)
# the cutoff lamlist[70] is estimated to have roughly 10% FDR:
estfdr$fdr[70]
# this allows us to reject this many interactions:
nrejected <- estfdr$ncalled[70]
# These are the interactions rejected:
interactions.above(testobj, lamlist[70])
## End(Not run)
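The example above reads the cutoff index 70 off the printed FDR estimates. One way to pick such an index programmatically is sketched below. The helper pick_cutoff is hypothetical and not part of the package, and the FDR values in the demonstration are made up for illustration. Taking the smallest cutoff whose estimated FDR is at most the target is one possible convention, not necessarily the one the authors intend.

```r
# Hypothetical helper, not part of the package: return the index of the
# smallest cutoff in lamlist whose estimated FDR is at most `target`.
# Returns NA if no cutoff meets the target.
pick_cutoff <- function(lamlist, fdr, target = 0.10) {
  # Entries with a missing or undefined FDR estimate are skipped.
  ok <- which(!is.na(fdr) & fdr <= target)
  if (length(ok) == 0) return(NA_integer_)
  ok[which.min(lamlist[ok])]
}

# Made-up values, for illustration only.
lam_demo <- c(5, 4, 3, 2)
fdr_demo <- c(NaN, 0.02, 0.09, 0.35)
idx <- pick_cutoff(lam_demo, fdr_demo)
lam_demo[idx]
```

With a real fit one would call pick_cutoff(lamlist, estfdr$fdr) and pass lamlist at the returned index to interactions.above.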