aggregate_observations: Aggregate observations hierarchically with False Split Rate...

View source: R/aggobs.R

aggregate_observations    R Documentation

Aggregate observations hierarchically with False Split Rate control

Description

This function aggregates observations that share the same mean while controlling the False Split Rate (FSR) at a target level. The aggregation proceeds in two steps: (1) generate a p-value for each interior node of the tree with an ANOVA test, and (2) test the nodes sequentially on the tree.
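As a rough illustration of step (1) when sigma is known, the sketch below computes a chi-squared p-value for a single interior node by comparing the means of its child subtrees. The function name node_p_value_known_sigma, its arguments, and the exact form of the statistic are assumptions made for illustration; the statistic used inside the package may differ.

```r
# Hypothetical helper (not part of the package): one-way ANOVA-style
# p-value for a single interior node when the noise level sigma is known.
# `children` is a list of integer index vectors, one per child subtree,
# giving the positions in `y` of the leaves under that child.
node_p_value_known_sigma <- function(y, children, sigma) {
  sizes <- lengths(children)
  child_means <- vapply(children, function(idx) mean(y[idx]), numeric(1))
  grand_mean <- sum(sizes * child_means) / sum(sizes)
  # Between-child sum of squares, scaled by the known noise variance
  stat <- sum(sizes * (child_means - grand_mean)^2) / sigma^2
  # Under equal means the statistic is chi-squared with K - 1 degrees of freedom
  pchisq(stat, df = length(children) - 1, lower.tail = FALSE)
}

# Two children with clearly different means give a small p-value
p_far <- node_p_value_known_sigma(c(0, 0, 10, 10), list(1:2, 3:4), sigma = 1)
# Two children with identical means give a p-value of 1
p_same <- node_p_value_known_sigma(c(0, 0, 0, 0), list(1:2, 3:4), sigma = 1)
```

A small p-value suggests the children of the node do not share a common mean, which is the evidence the sequential step would use to split the node.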

Usage

aggregate_observations(y, sigma = NULL, tree = NULL, alpha)

Arguments

y

A vector of n observations.

sigma

Standard deviation of the noise. If given, the algorithm computes nodewise p-values with chi-squared statistics. If sigma is unknown (NULL, the default), the algorithm computes p-values with F-test statistics.

tree

An object encoding the tree structure. It can be in one of three formats: (1) an hclust object (if the tree is binary), (2) a dendrogram, or (3) a generalization of an hclust object to non-binary trees, which we call an hc_list object. An hc_list object is a list of length num_interior_nodes in which the ith item contains the child nodes of the ith node in the tree. Negative values in the list indicate leaf nodes.

alpha

A user-specified target FSR level
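To make the hc_list format more concrete, here is a minimal sketch that builds one from the merge matrix of an hclust object and writes a small non-binary tree by hand. It assumes that hc_list follows the same indexing convention as hclust's merge matrix (negative values are leaves, positive values refer to earlier interior nodes, and the last item is the root); that convention is an assumption, not something stated above.

```r
set.seed(1)
hc <- hclust(dist(rnorm(6)))

# Assumed convention: row i of hc$merge lists the two children of interior
# node i, so each row becomes one list item.
hc_list <- lapply(seq_len(nrow(hc$merge)), function(i) hc$merge[i, ])

# A hand-written non-binary tree over 5 leaves:
#   node 1 joins leaves 1, 2, 3
#   node 2 joins leaves 4, 5
#   node 3 (the root) joins interior nodes 1 and 2
hc_list_nonbinary <- list(c(-1, -2, -3), c(-4, -5), c(1, 2))

# Recover the leaf labels from the negative entries
children <- unlist(hc_list)
leaf_ids <- sort(-children[children < 0])
```

Under this convention, a binary tree over n leaves yields a list of n - 1 items, each of length 2, while a non-binary tree has fewer items and some of them are longer.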

Value

Returns the aggregation result.

alpha

The target FSR level.

groups

A vector of n integers, one per observation, indicating the cluster to which each observation is allocated.

rejections

A vector of length num_interior_nodes indicating whether each interior node is rejected.

p_vals

A vector of length num_interior_nodes containing the p-value of each interior node (note: all p-values are computed, although not all are used in the sequential testing procedure).
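The relationship between rejections and groups can be sketched as a top-down walk of the tree: a rejected node is split into its children, and a node that is not rejected (or a leaf) becomes one group. The helper groups_from_rejections below is hypothetical and assumes an hc_list whose last item is the root and whose positive entries index interior nodes; the package may number nodes and groups differently.

```r
# Hypothetical helper (not part of the package): derive group labels from
# per-node rejection decisions on an hc_list-style tree with n leaves.
groups_from_rejections <- function(hc_list, rejections, n) {
  groups <- integer(n)
  next_id <- 0L
  # Collect the leaf labels under a node (negative values are leaves)
  leaves_under <- function(node) {
    if (node < 0) return(-node)
    unlist(lapply(hc_list[[node]], leaves_under))
  }
  visit <- function(node) {
    if (node > 0 && rejections[node]) {
      # Rejected interior node: split it and descend into each child
      for (child in hc_list[[node]]) visit(child)
    } else {
      # Leaf or non-rejected node: all leaves below form one group
      next_id <<- next_id + 1L
      groups[leaves_under(node)] <<- next_id
    }
  }
  visit(length(hc_list))  # assumed: the last item is the root
  groups
}

tree <- list(c(-1, -2), c(-3, -4), c(1, 2))
g_split <- groups_from_rejections(tree, c(FALSE, FALSE, TRUE), n = 4)
g_none <- groups_from_rejections(tree, c(FALSE, FALSE, FALSE), n = 4)

# Aggregated estimate: the mean of the observations in each group
y_toy <- c(1.0, 1.2, 5.0, 5.4)
group_means <- tapply(y_toy, g_split, mean)
```

Once a groups vector is available, summaries such as tapply(y, groups, mean) give one aggregated value per group.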

Examples

set.seed(123)
# Build a binary tree over 20 roughly evenly spaced points
hc <- hclust(dist((1:20) + runif(20) / 20), method = "complete")
k <- 4 # 4 true groups
groups <- cutree(hc, k = k)
# Give each true group its own mean, then add noise
theta <- runif(k, 0, 10)[groups]
y <- theta + runif(20, 0, 1)
aggregate_observations(y, sigma = 1, tree = hc, alpha = 0.1)

simone0628/hat documentation built on June 1, 2024, 9 a.m.