aggregate_features: Aggregate rare features hierarchically with False Split Rate...

View source: R/aggfeature.R

aggregate_features    R Documentation

Aggregate rare features hierarchically with False Split Rate control

Description

This function aggregates rare features while controlling the False Split Rate (FSR) at a target level. The aggregation proceeds in two steps: (1) generate a p-value for each interior node of the tree; (2) test the nodes sequentially along the tree.

Usage

aggregate_features(y, X, sigma = NULL, tree = NULL, alpha)

Arguments

y

A response vector of length n-observation that guides the aggregation.

X

An n-observation-by-n-feature design matrix. Each row corresponds to a subject, and each column stores the observations of one feature across subjects.

sigma

Standard deviation of the noise. If NULL (the default), the algorithm estimates sigma.

tree

An object encoding the tree structure, in one of three formats: (1) an hclust object (if the tree is binary), (2) a dendrogram, or (3) a generalization of an hclust object to non-binary trees, which we call an hc_list object. An hc_list object is a list of length num_interior_nodes whose ith item contains the child nodes of the ith interior node of the tree. Negative values in the list indicate leaf nodes.

alpha

A user-specified target FSR level.
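The three tree formats described for the tree argument can be illustrated with base R alone. The following is a minimal sketch, not the package's own code: hclust_to_hc_list is a hypothetical helper, and it assumes that positive entries in an hc_list refer to earlier interior nodes, as they do in the merge component of an hclust object (the text above only states that negative values are leaves).

```r
# Toy data: 5 features (columns) observed on 20 subjects (rows).
set.seed(1)
X <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)

# (1) A binary tree over the features, stored as an hclust object.
hc <- hclust(dist(t(X)), method = "average")

# (2) The same tree as a dendrogram.
dend <- as.dendrogram(hc)

# (3) An hc_list-style list built from the hclust merge matrix.
# Hypothetical helper; ASSUMES the hclust sign convention:
# negative = leaf, positive = index of an earlier interior node.
hclust_to_hc_list <- function(hc) {
  lapply(seq_len(nrow(hc$merge)), function(i) hc$merge[i, ])
}
hc_list <- hclust_to_hc_list(hc)

# A hand-written non-binary tree on the same 5 leaves:
# node 1 has leaves 1, 2, 3; node 2 (the root) has node 1 and leaves 4, 5.
hc_list_nonbinary <- list(c(-1, -2, -3), c(1, -4, -5))
```

The non-binary list cannot be expressed as an hclust object, which is presumably why the third format exists.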

Value

Returns the aggregation result, with the following components:

alpha

The target FSR level.

groups

A length-n-feature vector of integers indicating the cluster to which each feature is allocated.

rejections

A length-(num_interior_nodes) vector indicating whether each interior node is rejected.

p_vals

A length-(num_interior_nodes) vector of p-values, one per interior node (note: all are computed, although not all are used in the sequential testing procedure).
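As an illustration of how the groups component might be used downstream, the sketch below collapses the columns of a design matrix according to a cluster assignment. The groups vector here is made up for the example, and summing within clusters is only one plausible choice of aggregation (natural for count data); this page does not specify how aggregated features should be formed.

```r
# Toy design matrix: 4 subjects (rows), 5 features (columns).
X <- matrix(1:20, nrow = 4, ncol = 5)

# Hypothetical cluster assignment of the same form as the `groups` component:
# features 1-2 form cluster 1, feature 3 is alone, features 4-5 form cluster 3.
groups <- c(1, 1, 2, 3, 3)

# rowsum() sums rows that share a group label, so transpose first
# to sum feature columns within each cluster, then transpose back.
X_agg <- t(rowsum(t(X), group = groups))

# Number of original features in each cluster.
cluster_sizes <- table(groups)
```

X_agg has one column per cluster, so it can replace X in a subsequent model fit.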

Examples

## See vignette for a small data example.
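Until the vignette is at hand, the following sketch shows what a call might look like on simulated data. Several things here are assumptions: the package name hat is inferred from the repository name, the tree is built by clustering the features purely for illustration (in practice it would usually encode prior knowledge about the features), and the result is assumed to allow access to groups with `$`. The call is guarded so that the script still runs when the package is not installed.

```r
set.seed(42)
n <- 50  # number of observations
p <- 8   # number of features

# Sparse binary features, standing in for rare features.
X <- matrix(rbinom(n * p, size = 1, prob = 0.2), nrow = n, ncol = p)

# Simulated response: features 1-4 share one coefficient, 5-8 another.
beta <- c(rep(1, 4), rep(-1, 4))
y <- as.numeric(X %*% beta + rnorm(n))

# Illustrative binary tree over the features (an hclust object).
tree <- hclust(dist(t(X)))

# ASSUMPTION: the package is named "hat" and exports aggregate_features().
if (requireNamespace("hat", quietly = TRUE)) {
  fit <- hat::aggregate_features(y, X, sigma = NULL, tree = tree, alpha = 0.1)
  print(table(fit$groups))  # sizes of the resulting clusters
}
```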

simone0628/hat documentation built on June 1, 2024, 9 a.m.