Hierarchical testing procedure according to Meinshausen (2008) screening for groups of related variables within a hierarchy instead of screening individual variables independently. Groups are tested by the generalized GlobalAncova approach. The family-wise error rate is simultaneously controlled over all levels of the hierarchy. In order to reduce computational complexity for large hierarchies, a "short cut" is implemented, where the testing procedure is applied separately to K sub-hierarchies. The p-values are adjusted such that they are identical to the ones obtained when testing the complete hierarchy.
gGlobalAncova.hierarchical(data, H, formula.full, formula.red=~1, model.dat, sumstat=sum,
    alpha=0.05, K, perm=10000, returnPermstats=FALSE, permstats)
data |
H | dendrogram object specifying the hierarchy of the variables
formula.full | model formula for the full model
formula.red | model formula for the reduced model (that does not contain the terms of interest)
model.dat |
sumstat | function for summarizing univariate test statistics; default is sum
alpha | global significance level
K | optional integer; if this is specified, the "short cut" on hierarchical testing will be applied separately to K sub-hierarchies
perm | number of permutations
returnPermstats | if
permstats | if variable-wise permutation statistics were calculated previously, they can be provided in order not to repeat permutation testing (but only the hierarchical procedure); useful e.g. if procedure is run again with different
The hierarchical procedure starts by testing the global null hypothesis that none of the variables is associated with the design of interest, and then moves down the given hierarchy, testing subclusters of variables. A subcluster is only tested if the null hypothesis corresponding to its ancestor cluster could be rejected. The p-values are adjusted for multiple testing according to cluster size: p_{C,adj} = p_C \cdot m/|C|, where m is the total number of variables and |C| is the number of variables in cluster C.
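The size-based adjustment can be illustrated with a few lines of base R. This is a minimal sketch, not the package's internal code: the helper name adjust_by_cluster_size and the toy p-values and cluster sizes are hypothetical, and capping the result at 1 is an added assumption so that it remains a valid p-value.

```r
# Hypothetical helper (not part of GlobalAncova): size-based adjustment
# p_{C,adj} = p_C * m / |C|; the cap at 1 is an assumption of this sketch.
adjust_by_cluster_size <- function(p, cluster_size, m) {
  stopifnot(all(cluster_size >= 1), all(cluster_size <= m))
  pmin(p * m / cluster_size, 1)
}

# Toy values: m = 8 variables, the root cluster and two subclusters
m <- 8
p_raw <- c(root = 0.001, left = 0.004, right = 0.2)
sizes <- c(root = 8, left = 5, right = 3)

p_adj <- adjust_by_cluster_size(p_raw, sizes, m)
print(p_adj)
```

The root cluster, with |C| = m, keeps its raw p-value, while smaller clusters receive a larger multiplier.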
If K is specified and the procedure is split into K sub-hierarchies containing m_1, …, m_K variables, p-values are additionally adjusted by τ = m / m_k, k = 1, …, K, such that the resulting p-values are identical to the ones obtained when testing the complete hierarchy:
p_{C,adj,k} \cdot τ = p_C \cdot m_k/|C| \cdot m/m_k = p_C \cdot m/|C| = p_{C,adj}
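A small numerical check of this identity, written as a sketch in base R. All values (the total m, the sub-hierarchy sizes, the cluster size and the raw p-value) are made up for illustration and do not come from the package.

```r
# Toy check of the short-cut identity; all numbers are hypothetical.
m   <- 12                      # total number of variables
m_k <- c(5, 4, 3)              # sizes of K = 3 sub-hierarchies
stopifnot(sum(m_k) == m)

# One cluster C inside sub-hierarchy k = 2, with raw p-value p_C
k      <- 2
size_C <- 2
p_C    <- 0.003

p_adj_full <- p_C * m / size_C        # adjustment within the complete hierarchy
p_adj_sub  <- p_C * m_k[k] / size_C   # adjustment within sub-hierarchy k only
tau        <- m / m_k[k]              # additional short-cut factor

# Multiplying by tau recovers the adjustment for the complete hierarchy
stopifnot(isTRUE(all.equal(p_adj_sub * tau, p_adj_full)))
```

The factor m_k cancels, so the result does not depend on how the hierarchy was split.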
an object of class GAhier
Manuela Hummel m.hummel@dkfz.de
Meinshausen N, 2008. Hierarchical testing of variable importance. Biometrika, 95(2):265
gGlobalAncova, GAhier, Plot.hierarchy
library(GlobalAncova)
data(bindata)
X <- as.matrix(bindata[,-1])
# get a hierarchy for variables
dend <- as.dendrogram(hclust(dist(t(X))))
# hierarchical test
set.seed(555)
res <- gGlobalAncova.hierarchical(X, H = dend, formula.full = ~group, model.dat = bindata, alpha = 0.05, perm = 1000)
res
results(res)
# get names of significant clusters
sigEndnodes(res)
# visualize results
Plot.hierarchy(res, dend)
# starting with 3 sub-hierarchies
set.seed(555)
res2 <- gGlobalAncova.hierarchical(X, H = dend, K = 3, formula.full = ~group, model.dat = bindata, alpha = 0.05, perm = 1000)
results(res2)