View source: R/testonlyhierarchy.R
Hierarchical testing given the output of the function multisplit.
test_only_hierarchy(x, y, dendr, res.multisplit, clvar = NULL,
  family = c("gaussian", "binomial"), alpha = 0.05,
  global.test = TRUE, agg.method = c("Tippett", "Stouffer"),
  verbose = FALSE, sort.parallel = TRUE, parallel = c("no",
  "multicore", "snow"), ncpus = 1L, cl = NULL, check.input = TRUE,
  unique.colnames.x = NULL)

x 
a matrix or list of matrices for multiple data sets. The matrix or matrices have to be of type numeric and are required to have column names / variable names. The rows and the columns represent the observations and the variables, respectively. 
y 
a vector, a matrix with one column, or list of the aforementioned
objects for multiple data sets. The vector, vectors, matrix, or matrices
have to be of type numeric. For 
dendr 
the output of one of the functions cluster_var or cluster_position. 
res.multisplit 
the output of the function multisplit. 
clvar 
a matrix or list of matrices of control variables. 
family 
a character string naming a family of the error distribution; either "gaussian" or "binomial". 
alpha 
the significance level at which the FWER is controlled. 
global.test 
a logical value indicating whether the global test should be performed. 
agg.method 
a character string naming an aggregation method which aggregates the p-values over the different data sets for a given cluster; either "Tippett" or "Stouffer". 
verbose 
a logical value indicating whether the progress of the computation should be printed in the console. 
sort.parallel 
a logical indicating whether the values are sorted with respect to the size of the block. This can reduce the run time for parallel computation. 
parallel 
type of parallel computation to be used. See the 'Details' section. 
ncpus 
number of processes to be run in parallel. 
cl 
an optional parallel or snow cluster used if parallel = "snow". 
check.input 
a logical value indicating whether the function should
check the input. This argument is used to call

unique.colnames.x 
a character vector containing the unique column names of x. 
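The input format for multiple data sets can be sketched in base R as follows. The data, the helper make_x, and the variable names are made up for illustration; whether your data sets may differ in their variables is not stated here, so the last line only shows one way a vector of unique column names could be collected.

```r
set.seed(1)

# Hypothetical helper: a numeric matrix with column names (variable names).
make_x <- function(n, vars) {
  m <- matrix(rnorm(n * length(vars)), nrow = n)
  colnames(m) <- vars
  m
}

# x: a list of numeric matrices, one per data set (rows = observations).
x_list <- list(make_x(50, paste0("Var", 1:10)),
               make_x(60, paste0("Var", 6:15)))

# y: a list of numeric vectors, one per data set, matching the rows of x.
y_list <- lapply(x_list, function(m) m[, 1] + rnorm(nrow(m)))

# One way to collect the unique column names across all data sets.
unique_names <- unique(unlist(lapply(x_list, colnames)))
```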
The function test_only_hierarchy requires the output of one of the functions cluster_var or cluster_position as an input (argument dendr). Furthermore it requires the output of the function multisplit as an input (argument res.multisplit).
Hierarchical testing is performed by going top down through the hierarchical
tree. Testing only continues if at least one child of a given cluster is significant.
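The top-down idea can be illustrated with a toy tree in base R. This is a simplified sketch, not the procedure implemented in hierinf: the tree, the helper names node and top_down, and the p-values are hypothetical, and the p-values are taken as given rather than computed or adjusted.

```r
# Toy cluster tree: every node carries a (hypothetical) p-value.
node <- function(name, p, children = list()) {
  list(name = name, p = p, children = children)
}

tree <- node("root", 0.001, list(
  node("A", 0.01, list(node("A1", 0.02), node("A2", 0.60))),
  node("B", 0.40, list(node("B1", 0.03)))
))

# Descend only below significant clusters; report a cluster when none of
# its children is significant (or when it has no children).
top_down <- function(nd, alpha = 0.05) {
  if (nd$p > alpha) return(character(0))
  found <- unlist(lapply(nd$children, top_down, alpha = alpha))
  if (length(found) == 0) nd$name else found
}

result <- top_down(tree)
```

In this toy tree, cluster B is not significant, so its child B1 is never tested even though its p-value is small.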
If the argument block was supplied when building the hierarchical tree (i.e. in the call to either cluster_var or cluster_position), so that the second level of the hierarchical tree is given, the hierarchical testing step can be run in parallel across the different blocks by specifying the arguments parallel and ncpus. There is an optional argument cl if parallel = "snow". There are three possibilities to set the argument parallel: parallel = "no" for serial evaluation (default), parallel = "multicore" for parallel evaluation using forking, and parallel = "snow" for parallel evaluation using a parallel socket cluster. It is recommended to select RNGkind("L'Ecuyer-CMRG") and set a seed to ensure that the parallel computing of the package hierinf is reproducible. This way each processor gets a different substream of the pseudo random number generator stream, which makes the results reproducible if the arguments (such as sort.parallel and ncpus) remain unchanged. See the vignette or the reference for more details.
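The role of the "L'Ecuyer-CMRG" generator can be illustrated with the parallel package that ships with R. This sketch does not show the internals of hierinf; the worker count and the helper draw_from_stream are hypothetical. It only demonstrates that separate streams derived from one seed give different but repeatable draws.

```r
library(parallel)

RNGkind("L'Ecuyer-CMRG")
set.seed(42)

# Derive one independent stream per (hypothetical) worker from the seed.
n_workers <- 2L
streams <- vector("list", n_workers)
streams[[1]] <- .Random.seed
for (i in seq_len(n_workers)[-1]) {
  streams[[i]] <- nextRNGStream(streams[[i - 1]])
}

# Draw random numbers starting from a given stream state.
draw_from_stream <- function(stream, n = 3) {
  assign(".Random.seed", stream, envir = .GlobalEnv)
  runif(n)
}

run1 <- lapply(streams, draw_from_stream)
run2 <- lapply(streams, draw_from_stream)
identical(run1, run2)  # same seed and same number of streams: same draws
```

Changing the number of streams changes which stream each task receives, which is consistent with the advice above to keep arguments such as ncpus fixed when reproducibility matters.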
Note that if Tippett's aggregation method is applied for multiple data sets, then very small p-values are set to machine precision. This is due to rounding in floating point arithmetic.
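The rounding issue can be reproduced in base R. The sketch below uses the textbook form of Tippett's rule for k p-values, 1 - (1 - min(p))^k; the function names and the choice of .Machine$double.eps as the lower bound are assumptions for illustration, not necessarily what hierinf uses.

```r
# Tippett's rule for k p-values: 1 - (1 - min(p))^k.
tippett_naive <- function(p) 1 - (1 - min(p))^length(p)

# Hypothetical guard: never return less than machine precision.
tippett_clamped <- function(p) max(tippett_naive(p), .Machine$double.eps)

p <- c(1e-20, 0.3, 0.7)
tippett_naive(p)    # 1 - 1e-20 rounds to 1 in double precision, so this is 0
tippett_clamped(p)  # bounded below by .Machine$double.eps
```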
The returned value is an object of class "hierT", consisting of two elements, the result of the multi-sample splitting step "res.multisplit" and the result of the hierarchical testing "res.hierarchy".
The result of the multi-sample splitting step is a list with one element per data set. Each element contains a list with two matrices. The first matrix contains the indices of the second half of the observations (which were not used to select the variables). The second matrix contains the column names / variable names of the selected variables.
The result of the hierarchical testing is a data frame of significant clusters with the following columns:
block 

p.value 
The p-value of the significant cluster. 
significant.cluster 
The column names of the members of the significant cluster. 
There is a print method for this class; see print.hierT.
Renaux, C. et al. (2018), Hierarchical inference for genome-wide association studies: a view on methodology with software. (arXiv:1805.02988)
cluster_var, cluster_position, multisplit, test_hierarchy, and compute_r2.
library(hierinf)
library(MASS)

# Simulate a data set with three active variables.
n <- 200
p <- 500
set.seed(3)
x <- mvrnorm(n, mu = rep(0, p), Sigma = diag(p))
colnames(x) <- paste0("Var", 1:p)
beta <- rep(0, p)
beta[c(5, 20, 46)] <- 1
y <- x %*% beta + rnorm(n)

# Build the hierarchical tree and run the multi-sample splitting step.
dendr1 <- cluster_var(x = x)
set.seed(76)
res.multisplit1 <- multisplit(x = x, y = y, family = "gaussian")

sign.clusters1 <- test_only_hierarchy(x = x, y = y, dendr = dendr1,
                                      res.multisplit = res.multisplit1,
                                      family = "gaussian")

## With block
# The column names of the data frame block are optional.
block <- data.frame("var.name" = paste0("Var", 1:p),
                    "block" = rep(c(1, 2), each = p/2),
                    stringsAsFactors = FALSE)
dendr2 <- cluster_var(x = x, block = block)

# The output res.multisplit1 can be used since the multi-sample splitting
# step is the same with or without blocks.
sign.clusters2 <- test_only_hierarchy(x = x, y = y, dendr = dendr2,
                                      res.multisplit = res.multisplit1,
                                      family = "gaussian")

# Access part of the object
sign.clusters2$res.hierarchy[, "block"]
sign.clusters2$res.hierarchy[, "p.value"]
# Column names or variable names of the significant cluster in the first row.
sign.clusters2$res.hierarchy[[1, "significant.cluster"]]