clusterFeatures: Feature clustering
In HimesGroup/qmtools: Quantitative Metabolomics Data Processing Tools

clusterFeatures

R Documentation

Feature clustering

Description

Clusters the LC-MS features stored in a SummarizedExperiment according to their retention times and their intensity correlations across samples.

Usage

clusterFeatures(
  x,
  i,
  rtime_var = "rtime",
  rt_cut = 10,
  cor_cut = 0.7,
  rt_grouping = c("hclust", "closest", "consecutive"),
  cor_grouping = c("louvain", "SimilarityMatrix", "connected", "none"),
  cor_use = c("everything", "all.obs", "complete.obs", "na.or.complete",
    "pairwise.complete.obs"),
  cor_method = c("pearson", "kendall", "spearman"),
  log2 = FALSE,
  hclust_linkage = "complete"
)

Arguments

`x`	A SummarizedExperiment object.
`i`	A string or integer value specifying which assay values to use.
`rtime_var`	A string specifying the name of the variable in `rowData(x)` that contains a numeric vector of retention times.
`rt_cut`	A numeric value specifying a cut-off for the retention-time based feature grouping.
`cor_cut`	A numeric value specifying a cut-off for the correlation-based feature grouping.
`rt_grouping`	A string specifying which method to use for the retention-time based feature grouping.
`cor_grouping`	A string specifying which method to use for the correlation-based feature grouping.
`cor_use`	A string specifying how correlations are computed in the presence of missing values. Refer to `?cor` for details.
`cor_method`	A string specifying which correlation coefficient is to be computed. See `?cor` for details.
`log2`	A logical specifying whether feature intensities need to be log2-transformed before calculating a correlation matrix.
`hclust_linkage`	A string specifying the linkage method to be used when `rt_grouping` is "hclust".
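
The `cor_use`, `cor_method`, and `log2` arguments correspond to options of base R's `cor`. The following is a minimal sketch of how they might affect a feature-by-feature correlation matrix. It assumes features in rows and samples in columns; the intensity values are made up for illustration, and the code is not taken from the package itself.

```r
# Hypothetical intensities: 3 features (rows) x 5 samples (columns),
# with one missing value in feature f2
int <- rbind(
  f1 = c(1024, 2048, 4096, 8192, 16384),
  f2 = c(510, 1030, NA, 4100, 8200),
  f3 = c(900, 300, 1200, 200, 700)
)

# Equivalent of log2 = TRUE: transform before computing correlations
log_int <- log2(int)

# cor() correlates columns, so transpose to correlate features
c_everything <- cor(t(log_int), use = "everything", method = "pearson")
c_pairwise <- cor(t(log_int), use = "pairwise.complete.obs", method = "pearson")

# With "everything", any pair involving the missing value is NA;
# with "pairwise.complete.obs", the incomplete sample is dropped per pair
c_everything["f1", "f2"]
c_pairwise["f1", "f2"]
```

With missing values present, a correlation of NA cannot be compared against `cor_cut`, so a choice such as "pairwise.complete.obs" may be worth considering when the assay has not been imputed.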

Details

For soft ionization methods (e.g., LC/ESI-MS) commonly used in metabolomics, one or more ions can be generated from an individual compound upon ionization. This redundancy in the feature data needs to be addressed, since we are typically interested in compounds rather than in different ion species. This function attempts to identify groups of features that originate from the same compound with the following steps:

1. Features are grouped by their retention times to identify co-eluting compounds.
2. For each retention time-based group, features are further clustered by patterns of the intensity correlations across samples to identify a subset of features from the same compound.

The retention time-based grouping is performed using either hierarchical clustering via hclust or the methods available in the MsFeatures package via MsFeatures::groupClosest and MsFeatures::groupConsecutive. For rt_grouping = "hclust", complete-linkage clustering is conducted by default using the Manhattan distance (i.e., the absolute difference in retention times), where the distance between two clusters is defined as the difference in retention times between the farthest pair of elements in the two clusters. Group memberships are assigned by cutting the resulting tree at the height given by rt_cut. Other linkage methods can be specified with hclust_linkage; please refer to ?hclust for details. For "closest" and "consecutive", please refer to ?MsFeatures::groupClosest and ?MsFeatures::groupConsecutive for details of the algorithms.
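
The "hclust" grouping described above can be sketched with base R alone. This is an illustration under stated assumptions, not the package's internal code: the retention times are hypothetical, and `rt_cut` is assumed to act as the cut height.

```r
# Hypothetical retention times (in seconds) for six features
rt <- c(100.2, 101.5, 103.0, 250.1, 251.0, 400.7)
rt_cut <- 10

# For one-dimensional data, the Manhattan distance is the absolute
# difference in retention times
d <- dist(rt, method = "manhattan")

# Complete linkage: the distance between two clusters is the largest
# pairwise retention-time difference between their members
hc <- hclust(d, method = "complete")

# Cut the tree at height rt_cut to assign group memberships
rt_group <- cutree(hc, h = rt_cut)
rt_group
```

Here the first three features fall into one group, the next two into a second, and the last feature forms its own group. With complete linkage, no two features in the same group differ in retention time by more than the cut height.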

For the correlation-based grouping, cor_grouping = "connected" creates an undirected graph using feature correlations as an adjacency matrix (i.e., correlations serve as edge weights). Edges whose weights are below the cut-off specified by cor_cut are removed from the graph, separating features into several disconnected subgroups. Features in the same subgroup are assigned to the same feature cluster. For "louvain", the function further applies the Louvain algorithm to the graph via igraph::cluster_louvain in order to identify densely connected features. For "SimilarityMatrix", MsFeatures::groupSimilarityMatrix is used for feature grouping; please refer to ?MsFeatures::groupSimilarityMatrix for details of the algorithm.
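
The graph-based grouping can be illustrated with igraph as follows. This is a rough sketch rather than the package's implementation: the intensity matrix is made up, and thresholding the correlation matrix before building the graph is an assumption about how the edge removal might be carried out.

```r
library(igraph)

# Hypothetical intensities for five co-eluting features across eight samples
a <- c(1, 2, 3, 4, 5, 6, 7, 8)
b <- c(5, 1, 7, 2, 8, 3, 6, 4)
e <- rep(c(0.2, -0.2), 4)          # small perturbation
int <- rbind(
  f1 = a,
  f2 = 2 * a + e,                  # tracks f1 closely
  f3 = b,
  f4 = 3 * b - e,                  # tracks f3 closely
  f5 = c(2, 7, 4, 1, 8, 5, 3, 6)   # weakly correlated with all others
)
cor_cut <- 0.7

# Feature-by-feature correlations (cor() works on columns, hence t())
adj <- cor(t(int), method = "pearson")

# Drop edges below the cut-off and self-loops
adj[adj < cor_cut] <- 0
diag(adj) <- 0

# Undirected graph with the remaining correlations as edge weights
g <- graph_from_adjacency_matrix(adj, mode = "undirected",
                                 weighted = TRUE, diag = FALSE)

# "connected": one cluster per connected component
grp_connected <- components(g)$membership

# "louvain": community detection on the same thresholded graph
grp_louvain <- membership(cluster_louvain(g))

grp_connected
grp_louvain
```

In this small example f1 and f2 end up together, as do f3 and f4, while f5 remains on its own. On larger, denser groups the Louvain step may split a single connected component into several communities.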

Value

A SummarizedExperiment object with the grouping results added to columns "rtime_group" (initial grouping on retention times) and "feature_group" in its rowData.

References

Johannes Rainer (2022). MsFeatures: Functionality for Mass Spectrometry Features. R package version 1.3.0. https://github.com/RforMassSpectrometry/MsFeatures

Vincent D. Blondel, Jean-Loup Guillaume, Renaud Lambiotte, Etienne Lefebvre: Fast unfolding of communities in large networks. J. Stat. Mech. (2008) P10008

Csardi G, Nepusz T: The igraph software package for complex network research, InterJournal, Complex Systems 1695. 2006. https://igraph.org

Examples


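library(qmtools)
library(SummarizedExperiment)  # provides rowData()

data(faahko_se)

## Cluster features using the retention times stored in rowData(faahko_se)$rtmed
se <- clusterFeatures(faahko_se, i = "knn_vsn", rtime_var = "rtmed")
rowData(se)[, c("rtmed", "rtime_group", "feature_group")]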
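
As a further illustration, the call below swaps in non-default grouping methods and cut-offs. It uses only arguments listed under Usage; the specific values of `rt_cut` and `cor_cut` are arbitrary choices for demonstration and are not recommendations.

```r
library(qmtools)
library(SummarizedExperiment)

data(faahko_se)

## Group by closest retention times, then by connected components of
## the thresholded correlation graph (cut-off values are arbitrary)
se2 <- clusterFeatures(
  faahko_se,
  i = "knn_vsn",
  rtime_var = "rtmed",
  rt_cut = 5,
  cor_cut = 0.8,
  rt_grouping = "closest",
  cor_grouping = "connected"
)

## Number of features assigned to each feature group
table(rowData(se2)$feature_group)
```

Comparing the resulting group sizes with those from the default settings can help gauge how sensitive the clustering is to the chosen methods and cut-offs.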

HimesGroup/qmtools documentation built on April 16, 2023, 8 p.m.
