splitOn: Split 'TreeSummarizedExperiment' column-wise or row-wise...

splitOn R Documentation

Split TreeSummarizedExperiment column-wise or row-wise based on grouping variable

Description

Split a TreeSummarizedExperiment column-wise or row-wise based on a grouping variable.

Usage

splitOn(x, ...)

## S4 method for signature 'SummarizedExperiment'
splitOn(x, f = NULL, ...)

## S4 method for signature 'SingleCellExperiment'
splitOn(x, f = NULL, ...)

## S4 method for signature 'TreeSummarizedExperiment'
splitOn(x, f = NULL, update_rowTree = FALSE, ...)

unsplitOn(x, ...)

## S4 method for signature 'list'
unsplitOn(x, update_rowTree = FALSE, ...)

## S4 method for signature 'SimpleList'
unsplitOn(x, update_rowTree = FALSE, ...)

## S4 method for signature 'SingleCellExperiment'
unsplitOn(x, altExpNames = names(altExps(x)), keep_reducedDims = FALSE, ...)

Arguments

x

A SummarizedExperiment object or a list of SummarizedExperiment objects.

...

Arguments passed to the mergeRows/mergeCols functions for SummarizedExperiment objects, and to other functions. See mergeRows for more details.

  • use_names A single boolean value specifying whether to name the list elements by their group names.

f

A single character value selecting the grouping variable from rowData or colData, or a factor or vector with the same length as one of the dimensions. If f matches both dimensions, MARGIN must be specified. Splitting by columns is discouraged, since the result is not compatible with storage in altExps.

update_rowTree

TRUE or FALSE: Should the rowTree be updated based on the split data? This option is available when x is a TreeSummarizedExperiment object or a list of such objects. (By default: update_rowTree = FALSE)

altExpNames

A character vector specifying the alternative experiments to be unsplit. (By default: altExpNames = names(altExps(x)))

keep_reducedDims

TRUE or FALSE: Should the reducedDims(x) be transferred to the result? Please note that this breaks the link between the reduced dims and the data used to calculate them. (By default: keep_reducedDims = FALSE)

Details

splitOn splits data based on a grouping variable. Splitting can be done column-wise or row-wise. The returned value is a list of SummarizedExperiment objects, each element containing the members of one group.
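The idea of splitting along one margin by a grouping variable can be illustrated with a plain matrix in base R. This is only a conceptual sketch, not the implementation used by splitOn: the toy matrix counts and the grouping vectors row_group and col_group are made-up names for illustration, and no SummarizedExperiment machinery (rowData, colData, rowTree) is involved.

```r
# Toy assay: 4 features (rows) x 3 samples (columns); values are made up.
counts <- matrix(1:12, nrow = 4,
                 dimnames = list(paste0("feat", 1:4), paste0("sample", 1:3)))

# Hypothetical grouping variables, one per margin.
row_group <- factor(c("a", "b", "a", "b"))
col_group <- factor(c("x", "y", "x"))

# Row-wise split: one sub-matrix per level of row_group.
by_row <- lapply(split(seq_len(nrow(counts)), row_group),
                 function(i) counts[i, , drop = FALSE])

# Column-wise split: one sub-matrix per level of col_group.
by_col <- lapply(split(seq_len(ncol(counts)), col_group),
                 function(j) counts[, j, drop = FALSE])

# Recombine the row-wise pieces and restore the original row order.
recombined <- do.call(rbind, by_row)[rownames(counts), , drop = FALSE]
```

Every element of by_row keeps all columns, which is why row-wise pieces can sit side by side as alternative experiments, whereas the elements of by_col each hold a different set of samples.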

Value

For splitOn: SummarizedExperiment objects in a SimpleList.

For unsplitOn: x, with rowData and assay data replaced by the unsplit data. colData of x is kept, and any existing rowTree is dropped, since existing rowLinks are no longer valid.

Author(s)

Leo Lahti and Tuomas Borman. Contact: microbiome.github.io

See Also

splitByRanks, unsplitByRanks, mergeRows, sumCountsAcrossFeatures, agglomerateByRank, altExps, splitAltExps

Examples

library(mia)
data(GlobalPatterns)
tse <- GlobalPatterns
# Split data based on SampleType. 
se_list <- splitOn(tse, f = "SampleType")

# List of SE objects is returned. 
se_list

# Create arbitrary groups
rowData(tse)$group <- sample(1:3, nrow(tse), replace = TRUE)
colData(tse)$group <- sample(1:3, ncol(tse), replace = TRUE)

# Split based on rows
# Each element is named after its group. If you don't want to name the
# elements, use use_names = FALSE. Since "group" can be found in both rowData
# and colData, you must specify MARGIN.
se_list <- splitOn(tse, f = "group", use_names = FALSE, MARGIN = 1)

# When column names are shared between elements, you can store the list in altExps
altExps(tse) <- se_list

altExps(tse)

# If you want to split on columns and update rowTree, you can do
se_list <- splitOn(tse, f = colData(tse)$group, update_rowTree = TRUE)

# If you want to combine the groups back together, you can use unsplitOn
unsplitOn(se_list)


microbiome/mia documentation built on April 27, 2024, 4:04 a.m.