aggDS: Aggregate observed data based on a tree

View source: R/aggDS.R

aggDSR Documentation

Aggregate observed data based on a tree

Description

Aggregate observed values based on a column (sample) tree, e.g. for differential state analysis. The returned object will contain one abundance matrix for each node in the tree, with columns corresponding to sample IDs and rows corresponding to the same features as the rows of the input object. The nodes correspond to either the original sample clusters, or larger metaclusters encompassing several of the original clusters (defined by the provided column tree).

Usage

aggDS(
  TSE,
  assay = "counts",
  sample_id = "sample_id",
  group_id = "group_id",
  cluster_id = "cluster_id",
  FUN = sum,
  message = FALSE
)

Arguments

TSE

A TreeSummarizedExperiment object. Rows represent variables (e.g., genes) and columns represent observations (e.g., cells). The object must contain a column tree, whose tips represent initial cell clusters (the cluster_id annotation indicates which of these clusters a cell belongs to). The internal nodes represent increasingly coarse partitions of the cells obtained by successively merging the original clusters according to the provided column tree.

assay

The name or index of the assay from TSE to aggregate values from.

sample_id

A character scalar indicating the column of colData(TSE) that corresponds to the sample ID. These will be the columns of the output object.

group_id

A character scalar indicating the column of colData(TSE) that corresponds to the group/condition. This information will be propagated to the aggregated object.

cluster_id

A character scalar indicating the column of colData(TSE) that corresponds to the initial cluster ID for each cell.

FUN

The aggregation function.

message

A logical scalar, indicating whether progress messages should be printed to the console.

Value

A SummarizedExperiment object. Each assay represents the aggregated values for one node in the tree.

Author(s)

Ruizhu Huang, Charlotte Soneson

Examples

suppressPackageStartupMessages({
    library(TreeSummarizedExperiment)
    library(ape)
    library(ggtree)
})

set.seed(1L)
tr <- rtree(3, tip.label = LETTERS[seq_len(3)])
ggtree(tr) +
    geom_text(aes(label = node), hjust = -1, vjust = 1) +
    geom_text(aes(label = label), hjust = -1, vjust = -1)

cc <- matrix(rpois(60, 10), nrow = 6)
rownames(cc) <- paste0("gene", seq_len(6))
colnames(cc) <- paste0("cell", seq_len(10))
cd <- data.frame(sid = rep(seq_len(2), each = 5),
                 gid = rep(letters[seq_len(2)], each = 5),
                 cid = sample(LETTERS[seq_len(3)], size = 10,
                              replace = TRUE),
                 stringsAsFactors = FALSE)
tse <- TreeSummarizedExperiment(assays = list(counts = cc),
                                colTree = tr,
                                colNodeLab = cd$cid,
                                colData = cd)

out <- aggDS(TSE = tse, assay = "counts", sample_id = "sid",
             group_id = "gid", cluster_id = "cid")

## Aggregated counts for the node 5
SummarizedExperiment::assay(out, "alias_5")
## This is equal to the sum of the counts from nodes 1 and 2
SummarizedExperiment::assay(out, "alias_1")
SummarizedExperiment::assay(out, "alias_2")


fionarhuang/treeclimbR documentation built on Jan. 1, 2025, 9:02 p.m.