se_collapse_by_column: Collapse SummarizedExperiment data by column
In jmw86069/jamses: Jam SummarizedExperiment Stats

se_collapse_by_column

R Documentation

Collapse SummarizedExperiment data by column

Description

Collapse SummarizedExperiment data by column

Usage

se_collapse_by_column(
  se,
  columns = colnames(se),
  column_groups,
  assay_names = NULL,
  colDataColnames = colnames(SummarizedExperiment::colData(se)),
  keepNULLlevels = FALSE,
  groupFunc = jamba::rowGroupMeans,
  noise_floor = 0,
  noise_floor_value = 0,
  rmOutliers = FALSE,
  madFactor = 5,
  useMedian = FALSE,
  verbose = FALSE,
  ...
)

Arguments

`se`	`SummarizedExperiment` object
`columns`	`character` vector of `colnames(se)` to include in the process.
`column_groups`	`character` vector of column groupings, or `character` vector of `colnames(colData(se))` used to define the column groupings.
`assay_names`	`character` vector with one or more `assayNames(se)` to apply the column grouping calculation defined in `groupFunc`. By default, all assay names in `assayNames(se)` are used.
`colDataColnames`	`character` vector of `colData(se)` colnames to be included in the returned `SummarizedExperiment` after the column grouping. This argument is used to subset the columns, in cases where some columns do not need to be combined and returned in the output data.
`keepNULLlevels`	`logical` indicating whether to return empty columns for factor levels that are not present in the data. This option is intended for cases where `column_groups` references a `factor` whose levels are not all present in the current data, for example after subsetting with `columns`. When `keepNULLlevels=TRUE`, any missing levels are returned as columns of `NA` values, which can be helpful for generating a consistent output.
`groupFunc`	`function` used to perform row group calculations on a `numeric` matrix. The default is `jamba::rowGroupMeans()`, but it can be substituted with another row-based function. The function must accept arguments `x` and `groups`; the other arguments are passed only if `groupFunc` permits these argument names, or accepts `...`. Arguments: `x`, a `numeric` matrix (required); `groups`, a `character` vector of column groups, in order of `colnames(x)` (required); `rmOutliers`, a `logical` indicating whether to apply outlier removal, though the function can ignore this value (optional); `madFactor`, a `numeric` value indicating the MAD threshold used when `rmOutliers=TRUE`, though again the function can ignore this value (optional); `useMedian`, a `logical`, where `useMedian=FALSE` takes the group `mean()` value instead of the `median()` value per group; `...`, additional arguments, passed only if permitted by `groupFunc`.
`noise_floor`	`numeric` value indicating the minimum numeric value permitted; values at or below this value are replaced with `noise_floor_value`. With the defaults `noise_floor=0` and `noise_floor_value=0`, all values at or below zero are therefore set to zero. An alternative is to replace abnormally low values such as zero `0` with `NA`, so these values are not treated as actual measurements during the group summary calculation. This value and its replacement should be adjusted with caution. Use `noise_floor=NULL` or `noise_floor=-Inf` to disable this step.
`noise_floor_value`	`numeric` or `NA` used as a replacement for `numeric` values at or below `noise_floor`, which occurs prior to calling the `groupFunc` summary calculation.
`rmOutliers`, `madFactor`	`logical` and `numeric`, respectively, passed to `groupFunc` which by default is `jamba::rowGroupMeans()`.
`useMedian`	`logical` passed to `groupFunc()`, intended to be used by `jamba::rowGroupMeans()`; the default `useMedian=FALSE` takes the mean and not the median value per row group.
`verbose`	`logical` indicating whether to print verbose output.
`...`	additional arguments are passed to `groupFunc`, only if permitted by `groupFunc`.
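
The `noise_floor` step and the `groupFunc` contract described above can be sketched in base R. This is only an illustration of the documented behavior, not the package's implementation: `apply_noise_floor()` and `row_group_medians()` are hypothetical helper names, and the rows-by-groups return shape is an assumption modeled on what a row group mean would produce.

```r
# Hypothetical helper: replace values at or below the floor.
# NULL disables the step, mirroring the documented noise_floor=NULL.
apply_noise_floor <- function(x, noise_floor = 0, noise_floor_value = 0) {
  if (length(noise_floor) == 0) {
    return(x)
  }
  x[!is.na(x) & x <= noise_floor] <- noise_floor_value
  x
}

# Hypothetical group function accepting the two required arguments,
# `x` and `groups`; `...` absorbs any optional arguments it ignores.
row_group_medians <- function(x, groups, ...) {
  group_levels <- unique(as.character(groups))
  out <- do.call(cbind, lapply(group_levels, function(g) {
    apply(x[, groups == g, drop = FALSE], 1, median, na.rm = TRUE)
  }))
  dimnames(out) <- list(rownames(x), group_levels)
  out
}

m <- matrix(
  c(0, 4, 6, 8,
    2, 0, 10, 12),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("gene1", "gene2"), c("s1", "s2", "s3", "s4")))
groups <- c("A", "A", "B", "B")

# Treat zeros as missing so they do not pull the group summary down.
m_floor <- apply_noise_floor(m, noise_floor = 0, noise_floor_value = NA)
grouped <- row_group_medians(m_floor, groups)
print(grouped)
```

With zeros replaced by `NA`, group A for `gene1` is summarized from the single remaining value `4` rather than from `0` and `4`, which is the reason the argument description advises adjusting these values with caution.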

Details

The purpose is to collapse columns of a SummarizedExperiment object, where measurements for a given group, typically replicate samples, are spread across multiple columns in the source data. The output of this function should be measurements appropriately summarized to the group level.

The driving use case is slightly different from that of se_collapse_by_row(). In this case the function is mostly a convenient method to calculate group mean values in the context of a SummarizedExperiment object, so the result can be used with jamses::heatmap_se(), for example.

This function retains associated column annotations colData(se), after combining multiple values in an appropriate manner.
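
The idea of combining column annotations per group can be sketched in base R as follows. The helper `collapse_unique()` is hypothetical and only a simple stand-in for the idea of keeping unique values; it makes no claim about how the package performs this step.

```r
# Toy column annotations: four samples in two groups.
col_data <- data.frame(
  group = c("A", "A", "B", "B"),
  batch = c("b1", "b2", "b1", "b1"),
  row.names = paste0("s", 1:4),
  stringsAsFactors = FALSE)

# Hypothetical helper: keep unique values, comma-delimited.
collapse_unique <- function(v) {
  paste(unique(v), collapse = ",")
}

# One row per group, each annotation reduced to its unique values.
collapsed <- aggregate(
  col_data[, "batch", drop = FALSE],
  by = list(group = col_data$group),
  FUN = collapse_unique)
print(collapsed)
```

Group A spans two batches, so its collapsed annotation lists both, while group B keeps its single batch value.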

Optionally, this function will detect and remove individual outlier values before calculating the group mean.
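
As a rough illustration of MAD-based outlier removal before taking a group mean, the base R sketch below flags values that lie more than `madFactor` MAD units from the median. This is one common rule, written under the assumption that it resembles what `rmOutliers` and `madFactor` control; the exact rule applied by `jamba::rowGroupMeans()` may differ, and `rm_mad_outliers()` is a hypothetical name.

```r
# Hypothetical helper: set values far from the median to NA.
rm_mad_outliers <- function(x, madFactor = 5) {
  center <- median(x, na.rm = TRUE)
  spread <- mad(x, center = center, na.rm = TRUE)
  is_outlier <- which(abs(x - center) > madFactor * spread)
  x[is_outlier] <- NA
  x
}

# Replicate measurements for one row within one group.
values <- c(10, 11, 9, 10, 50)
cleaned <- rm_mad_outliers(values, madFactor = 5)

mean(values)                 # group mean including the extreme value
mean(cleaned, na.rm = TRUE)  # group mean after removing it
```

A larger `madFactor` makes the filter more permissive, so fewer values are treated as outliers.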

Value

SummarizedExperiment object with these changes:

columns will be collapsed by column_groups, for each assays(se) numeric matrix defined by assay_names.
colData(se) will also be collapsed by shrinkDataFrame() to combine unique values from each column annotation.
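
A minimal call might look like the sketch below, which uses only the arguments listed under Usage. The toy matrix, the `group` and `batch` annotation columns, and the assay name `counts` are invented for illustration. The call is guarded because it assumes the `jamses`, `SummarizedExperiment`, and `S4Vectors` packages are installed.

```r
# Toy data: 3 genes measured in 4 samples from two groups.
m <- matrix(
  c(1, 3, 10, 12,
    5, 7, 20, 22,
    2, 4, 6, 8),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))
sample_info <- data.frame(
  group = c("A", "A", "B", "B"),
  batch = c("b1", "b2", "b1", "b2"),
  row.names = colnames(m))

if (requireNamespace("jamses", quietly = TRUE) &&
    requireNamespace("SummarizedExperiment", quietly = TRUE) &&
    requireNamespace("S4Vectors", quietly = TRUE)) {
  se <- SummarizedExperiment::SummarizedExperiment(
    assays = list(counts = m),
    colData = S4Vectors::DataFrame(sample_info))

  # Group columns by the "group" annotation in colData(se).
  se_grouped <- jamses::se_collapse_by_column(
    se,
    column_groups = "group")

  print(dim(se_grouped))
  print(SummarizedExperiment::assays(se_grouped)$counts)
}
```

Here `column_groups` names a column of `colData(se)`, one of the two input forms described under Arguments; a `character` vector of group labels is the other.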
