gGlobalAncova.hierarchical: Hierarchical testing procedure using generalized GlobalAncova


View source: R/gGlobalAncova.hierarchical.R

Description

Hierarchical testing procedure according to Meinshausen (2008), which screens for groups of related variables within a hierarchy instead of screening individual variables independently. Groups are tested with the generalized GlobalAncova approach. The family-wise error rate is controlled simultaneously over all levels of the hierarchy. To reduce computational complexity for large hierarchies, a "short cut" is implemented in which the testing procedure is applied separately to K sub-hierarchies. The p-values are adjusted so that they are identical to those obtained when testing the complete hierarchy.

Usage

gGlobalAncova.hierarchical(data, H, formula.full, formula.red=~1, model.dat, sumstat=sum,
                           alpha=0.05, K, perm=10000, returnPermstats=FALSE, permstats)

Arguments

data

data.frame of variables (columns=variables) to be tested hierarchically; (multi-) categorical variables should be factors, ordinal variables should be ordered factors

H

dendrogram object specifying the hierarchy of the variables; labels(H) has to coincide with colnames(data)

formula.full

model formula for the full model

formula.red

model formula for the reduced model (that does not contain the terms of interest)

model.dat

data.frame of regressors, containing variables specified in formula.full and formula.red

sumstat

function for summarizing univariate test statistics; default is sum

alpha

global significance level

K

optional integer; if specified, the "short cut" is used, i.e. the hierarchical testing procedure is applied separately to K sub-hierarchies

perm

number of permutations

returnPermstats

if TRUE, the variable-wise statistics for all permutations are returned

permstats

if variable-wise permutation statistics were calculated previously, they can be provided so that only the hierarchical procedure is rerun and the permutation testing is not repeated; useful e.g. if the procedure is run again with a different alpha and/or hierarchy H; NOTE: data, formula.full and formula.red must be identical to the previous call

Details

The hierarchical procedure starts by testing the global null hypothesis that none of the variables is associated with the design of interest, and then moves down the given hierarchy, testing subclusters of variables. A subcluster is only tested if the null hypothesis corresponding to its ancestor cluster could be rejected. The p-values are adjusted for multiple testing according to cluster size, p_{C,adj} = p_C \cdot m/|C|, where m is the total number of variables and |C| is the number of variables in cluster C.
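The size-based adjustment can be illustrated with plain arithmetic. The following is a minimal sketch that does not call gGlobalAncova.hierarchical; the cluster names, cluster sizes and raw p-values are made up for illustration, and the rejection rule "adjusted p-value <= alpha" is an assumption based on the usual reading of the procedure.

```r
# Hypothetical path down a hierarchy: root -> A -> A1 (all values made up)
m      <- 12                                     # total number of variables
size_C <- c(root = 12, A = 6, A1 = 2)            # |C| for each cluster
p_C    <- c(root = 0.001, A = 0.004, A1 = 0.03)  # raw cluster p-values

# Adjustment by cluster size: p_{C,adj} = p_C * m / |C|
p_adj <- p_C * m / size_C

# Assumed decision rule: reject if the adjusted p-value does not exceed alpha
alpha    <- 0.05
rejected <- p_adj <= alpha

print(data.frame(size = size_C, p = p_C, p_adj = p_adj, rejected = rejected))
```

In this toy path the smallest cluster receives the largest multiplier (m/|C| = 6) and is not rejected, so its subclusters would not be tested.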

If K is specified and the procedure is split into K sub-hierarchies containing m_1, …, m_K variables, the p-values are additionally multiplied by τ = m/m_k, k = 1, …, K, so that the resulting p-values are identical to those obtained when testing the complete hierarchy:

p_{C,adj,k} \cdot τ = p_C \cdot m_k/|C| \cdot m/m_k = p_{C,adj}
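The identity above can be checked numerically. This is a small sketch with made-up numbers (sub-hierarchy sizes, cluster size and raw p-value are all hypothetical); it only evaluates the formulas from this section and does not reproduce the package's internal code.

```r
# Hypothetical setup: m = 12 variables split into K = 3 sub-hierarchies
m   <- 12
m_k <- c(5, 4, 3)          # sizes m_1, ..., m_K; they sum to m
stopifnot(sum(m_k) == m)

k      <- 2                # cluster C lies in the second sub-hierarchy
size_C <- 2                # |C|
p_C    <- 0.01             # raw p-value of cluster C

# Adjustment when testing the complete hierarchy
p_adj_full <- p_C * m / size_C

# Adjustment within sub-hierarchy k, then correction by tau = m / m_k
p_adj_k <- p_C * m_k[k] / size_C
tau     <- m / m_k[k]

# Both routes give the same adjusted p-value (up to floating-point error)
print(c(full = p_adj_full, shortcut = p_adj_k * tau))
```

The factor m_k cancels, which is why the result does not depend on how the variables are distributed over the sub-hierarchies.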

Value

an object of class GAhier

Author(s)

Manuela Hummel m.hummel@dkfz.de

References

Meinshausen N, 2008. Hierarchical testing of variable importance. Biometrika, 95(2):265-278.

See Also

gGlobalAncova, GAhier, Plot.hierarchy

Examples

library(GlobalAncova)
data(bindata)
X <- as.matrix(bindata[,-1])

# get a hierarchy for variables
dend <- as.dendrogram(hclust(dist(t(X))))

# hierarchical test
set.seed(555)
res <- gGlobalAncova.hierarchical(X, H = dend, formula.full = ~group,
                                  model.dat = bindata, alpha = 0.05, perm = 1000)
res
results(res)

# get names of significant clusters
sigEndnodes(res)

# visualize results
Plot.hierarchy(res, dend)

# starting with 3 sub-hierarchies
set.seed(555)
res2 <- gGlobalAncova.hierarchical(X, H = dend, K = 3, formula.full = ~group,
                                   model.dat = bindata, alpha = 0.05, perm = 1000)

results(res2)

GlobalAncova documentation built on Nov. 8, 2020, 8:10 p.m.