groupFeatures-similar-rtime: Group features based on approximate retention times
In rformassspectrometry/MsFeatures: Functionality for Mass Spectrometry Features

Description

Group features based on similar retention times. This method is intended as an initial, crude grouping of LC-MS features based on the median retention time of all their chromatographic peaks. All features whose retention times differ by <= diffRt (a setting of the parameter object) are grouped together.

If object is a SummarizedExperiment::SummarizedExperiment() and a column "feature_group" is found in its SummarizedExperiment::rowData(), the feature groups defined in that column are further sub-grouped with this method. See groupFeatures() for the general concept of this feature grouping. It might also be necessary to specify the column in the object's rowData that contains the retention times with the rtime parameter (which defaults to rtime = "rtime").
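The sub-grouping of pre-defined feature groups can be pictured with a small base R sketch. The data frame `rd` is a made-up stand-in for the object's rowData, and the helper `chain_group()`, the function `sub_group()` and the label format `<group>.<index>` are hypothetical choices for illustration; they may differ from what groupFeatures() does internally.

```r
## Made-up stand-in for rowData: retention times and pre-defined feature groups
rd <- data.frame(
    rtmed = c(10, 11, 50, 12, 52, 90, 30),
    feature_group = c("FG.1", "FG.1", "FG.1", "FG.2", "FG.2", "FG.2", NA),
    stringsAsFactors = FALSE
)

## Hypothetical grouping helper: a new group starts whenever the gap between
## neighbouring sorted values exceeds diff_rt
chain_group <- function(x, diff_rt) {
    ord <- order(x)
    grp <- integer(length(x))
    grp[ord] <- cumsum(c(TRUE, diff(x[ord]) > diff_rt))
    grp
}

## Sub-group each pre-defined group separately; features whose group is NA
## are skipped and keep NA
sub_group <- function(rt, fg, diff_rt) {
    res <- fg
    for (g in unique(fg[!is.na(fg)])) {
        idx <- which(fg == g)
        res[idx] <- paste0(g, ".", chain_group(rt[idx], diff_rt))
    }
    res
}

rd$feature_group <- sub_group(rd$rtmed, rd$feature_group, diff_rt = 5)
rd
```

In this sketch the features at 10 and 11 (initially in "FG.1") and the feature at 12 (initially in "FG.2") end up in different groups despite their similar retention times, because grouping is only performed within each pre-defined group.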

Parameter groupFun allows specifying the function that should be used for the actual grouping. Possible choices are:

groupFun = groupClosest (the default): this method creates groups of features with the smallest differences in retention times between the individual group members. All differences between group members are < diffRt (in contrast to the other grouping functions listed below). See groupSimilarityMatrix() (which is used for the actual grouping on pairwise retention time differences) for more details.
groupFun = groupConsecutive: the groupConsecutive() function groups values together if their difference is smaller than diffRt. This function iterates over the sorted retention times, starting the grouping from the lowest value. If the difference of a feature to more than one group is smaller than diffRt, it is assigned to the group whose mean retention time is closest to its own retention time. This leads to smaller group sizes. Be aware that with this grouping, differences in retention times between individual features within a group can still be larger than diffRt. See groupConsecutive() for details and examples.
groupFun = MsCoreUtils::group: this function consecutively groups elements together if their difference in retention time is smaller than diffRt. If two features are grouped into one group, all other features with a retention time within the defined window of either of the two features are also included in the feature group. This grouping is expanded recursively, which can lead, depending on diffRt, to very large feature groups spanning a large retention time window. See MsCoreUtils::group() for details.
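
The practical difference between a chained grouping and a grouping with bounded within-group differences can be sketched in base R. The helpers `chain_group()` and `window_group()` below are hypothetical illustrations, not the implementations of MsCoreUtils::group() or groupClosest(). In particular, `window_group()` is a simple greedy window, which is a different algorithm from the similarity-matrix approach used by groupClosest().

```r
## Hypothetical helper: chained grouping. A new group starts whenever the gap
## between neighbouring sorted values exceeds diff_rt.
chain_group <- function(x, diff_rt) {
    ord <- order(x)
    grp <- integer(length(x))
    grp[ord] <- cumsum(c(TRUE, diff(x[ord]) > diff_rt))
    grp
}

## Hypothetical helper: greedy window grouping. A new group starts whenever a
## value is more than diff_rt away from the first (lowest) value of the
## current group, so the span of a group never exceeds diff_rt.
window_group <- function(x, diff_rt) {
    ord <- order(x)
    xs <- x[ord]
    grp_sorted <- integer(length(xs))
    current <- 0L
    start <- -Inf
    for (i in seq_along(xs)) {
        if (xs[i] - start > diff_rt) {
            current <- current + 1L
            start <- xs[i]
        }
        grp_sorted[i] <- current
    }
    grp <- integer(length(x))
    grp[ord] <- grp_sorted
    grp
}

x <- c(2, 3, 4, 5, 10, 11, 12, 14, 15)
chained <- chain_group(x, 2)
windowed <- window_group(x, 2)

## Retention time span (max - min) covered by each group
chain_span <- vapply(split(x, chained), function(z) diff(range(z)), numeric(1))
window_span <- vapply(split(x, windowed), function(z) diff(range(z)), numeric(1))
```

With `diff_rt = 2`, the chained grouping yields two groups, the second of which spans 5 time units (10 to 15), whereas each group of the window grouping spans at most 2.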

Other grouping functions might be added in the future. Alternatively, it is also possible to provide a custom function for the grouping operation.

Usage

SimilarRtimeParam(diffRt = 1, groupFun = groupClosest)

## S4 method for signature 'numeric,SimilarRtimeParam'
groupFeatures(object, param, ...)

## S4 method for signature 'SummarizedExperiment,SimilarRtimeParam'
groupFeatures(object, param, rtime = "rtime", ...)

Arguments

`diffRt`	`numeric(1)` defining the retention time window within which features should be grouped. All features with a retention time difference smaller than or equal to `diffRt` are grouped.
`groupFun`	`function` that can be used to group values. Defaults to `groupFun = groupClosest`. See description for details and alternatives.
`object`	input object that provides the retention times that should be grouped. The `MsFeatures` package defines a method for `object` being either a `numeric` or a `SummarizedExperiment`.
`param`	`SimilarRtimeParam` object with the settings for the method.
`...`	additional parameters passed to the `groupFun` function.
`rtime`	for `object` being a `SummarizedExperiment::SummarizedExperiment()`: `character(1)` specifying the column in `rowData(object)` that contains the retention time values.

Value

Depending on parameter object:

for object being a numeric: returns a factor defining the feature groups.
for object being SummarizedExperiment: returns the input object with the feature group definition added to a column "feature_group" in the result object's rowData.

Author(s)

Johannes Rainer

Examples


## Load the package that provides groupFeatures() and SimilarRtimeParam()
library(MsFeatures)

## Simple grouping of a numeric vector.
##
## Define a numeric vector that could represent retention times of features
x <- c(2, 3, 4, 5, 10, 11, 12, 14, 15)

## Group the values using the `MsCoreUtils::group` function. This will create
## larger groups.
groupFeatures(x, param = SimilarRtimeParam(2, MsCoreUtils::group))

## Group values using the default `groupClosest` function. This creates
## smaller groups in which all elements have a difference smaller than the
## defined `diffRt` with each other.
groupFeatures(x, param = SimilarRtimeParam(2, groupClosest))

## Grouping on a SummarizedExperiment
##
## load the test SummarizedExperiment object
library(SummarizedExperiment)
data(se)

## No feature groups defined yet
featureGroups(se)

## Determine the column that contains retention times
rowData(se)

## Column "rtmed" contains the (median) retention time for each feature
## Group features that are eluting within 10 seconds
res <- groupFeatures(se, SimilarRtimeParam(10), rtime = "rtmed")

featureGroups(res)

## Evaluating differences between retention times within each feature group
rts <- split(rowData(res)$rtmed, featureGroups(res))
lapply(rts, function(z) abs(diff(z)) <= 10)

## One feature group ("FG.053") has elements with a difference larger than 10:
rts[["FG.053"]]
abs(diff(rts[["FG.053"]]))

## But the difference between the **sorted** retention times is < 10:
abs(diff(sort(rts[["FG.053"]])))

## Feature grouping with pre-defined feature groups: groupFeatures will
## sub-group the pre-defined feature groups; features for which the feature
## group is `NA` are skipped. Below we perform the feature grouping only on
## features 40 to 70
fgs <- rep(NA, nrow(rowData(se)))
fgs[40:70] <- "FG"
featureGroups(se) <- fgs
res <- groupFeatures(se, SimilarRtimeParam(10), rtime = "rtmed")
featureGroups(res)

rformassspectrometry/MsFeatures documentation built on June 15, 2025, 12:55 a.m.