rc.ramclustr: rc.ramclustr

View source: R/rc.ramclustr.R

rc.ramclustr    R Documentation

rc.ramclustr

Description

Main clustering function for grouping features based on their analytical behavior.

Usage

rc.ramclustr(
  ramclustObj = NULL,
  st = NULL,
  sr = NULL,
  maxt = NULL,
  deepSplit = FALSE,
  blocksize = 2000,
  mult = 5,
  hmax = NULL,
  collapse = TRUE,
  minModuleSize = 2,
  linkage = "average",
  cor.method = "pearson",
  rt.only.low.n = TRUE
)

Arguments

ramclustObj

ramclustR object: contains ungrouped features. Constructed by rc.get.xcms.data, for example.

st

numeric: sigma t - time similarity decay value

sr

numeric: sigma r - correlational similarity decay value

maxt

numeric: maximum time difference for which retention time similarity is calculated - all pairs with a larger time difference are assigned a similarity of zero

deepSplit

logical: controls how aggressively the HCA tree is cut - see ?cutreeDynamicTree

blocksize

integer: number of features (scans?) processed in one block. The default in the usage above is 2000.

mult

numeric: internal value, can be used to influence processing speed/ram usage

hmax

numeric: precut the tree at this height, default 0.3 (the usage above shows NULL, so 0.3 is presumably applied when no value is supplied) - see ?cutreeDynamicTree

collapse

logical: if TRUE (default), feature quantitative values are collapsed into spectrum-level quantitative values.

minModuleSize

integer: how many features must be part of a cluster to be returned? default = 2

linkage

character: hierarchical clustering linkage method - see ?hclust

cor.method

character: correlation method used to calculate 'r' - see ?cor

rt.only.low.n

logical: default = TRUE. At low injection numbers, correlational relationships of peak intensities may be unreliable. By default, ramclustR ignores the correlational r value and clusters on retention time alone. To use correlation at n < 5, set this value to FALSE.

Details

Main clustering function. See the references for a description of the algorithm, or vignette('RAMClustR') for a walk-through. batch.qc normalization requires three input vectors: (1) batch, (2) order, and (3) qc. It is a feature-centric normalization approach that adjusts signal intensities in two steps. First, for each feature (one feature at a time), the batch median QC signal intensity is compared with the full-dataset median to correct for systematic batch effects. Second, a local QC median is compared with the global median to correct for run-order effects.
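
As a rough illustration of how st, sr, maxt, linkage, and rt.only.low.n could interact, the sketch below assumes a Gaussian decay on both the retention time difference and on (1 - r), which is our reading of the first reference and not a copy of the package source. The function name ramclust_similarity, the toy data, and the fixed-height cutree() call are hypothetical; cutree() stands in for cutreeDynamicTree so that only base R is needed.

```r
# Hypothetical helper: pairwise feature similarity under an assumed Gaussian decay.
# rt        : numeric vector of feature retention times
# intensity : matrix with samples in rows and features in columns
ramclust_similarity <- function(rt, intensity, st, sr, maxt,
                                cor.method = "pearson", rt.only = FALSE) {
  # Absolute retention time difference for every feature pair
  dt <- abs(outer(rt, rt, FUN = "-"))
  # Time similarity decays with sigma t (st)
  sim_t <- exp(-(dt^2) / (2 * st^2))
  # Pairs further apart than maxt are assigned a similarity of zero
  sim_t[dt > maxt] <- 0
  if (rt.only) {
    # Mirrors the idea behind rt.only.low.n: ignore r and use time alone
    return(sim_t)
  }
  # Correlation between feature columns; similarity decays with sigma r (sr)
  r <- cor(intensity, method = cor.method)
  sim_r <- exp(-((1 - r)^2) / (2 * sr^2))
  sim_t * sim_r
}

# Toy data: f1 and f2 co-elute and co-vary, f3 co-elutes but is
# anti-correlated, f4 co-varies with f1 but elutes much later.
rt <- c(100, 100.5, 101, 250)
f1 <- c(10, 20, 30, 40, 50, 60)
f2 <- f1 * 0.5 + c(1, -1, 1, -1, 1, -1)
f3 <- rev(f1)
f4 <- f1 * 2
intensity <- cbind(f1, f2, f3, f4)

sim <- ramclust_similarity(rt, intensity, st = 2, sr = 0.5, maxt = 20)

# Cluster on dissimilarity (1 - similarity) with average linkage, then cut
# at a fixed height as a simple stand-in for the dynamic tree cut.
tree <- hclust(as.dist(1 - sim), method = "average")
groups <- cutree(tree, h = 0.3)
print(round(sim, 3))
print(groups)
```

In this toy case only f1 and f2 end up in the same group: f3 is separated by its correlation term and f4 by the maxt cutoff, even though f4 correlates perfectly with f1.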

Value

$featclus: integer vector of cluster membership for each feature

$clrt: cluster retention time

$clrtsd: retention time standard deviation of all the features that comprise that cluster

$nfeat: number of features in the cluster

$nsing: number of 'singletons' - that is the number of features which clustered with no other feature

$cmpd: compound name. C#### names are assigned in order of output by dynamicTreeCut. The compound with the most features is named C0001, and so on.

$ann: annotation. By default, annotation names are identical to 'cmpd' names. This slot is a placeholder for when annotations are provided

$SpecAbund: the cluster intensities after collapsing features to clusters

$SpecAbundAve: the cluster intensities after averaging all samples with identical sample names
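
To make the relationship between these slots more concrete, the sketch below derives them from mock data in base R. It is not the package's implementation: the input values are invented, treating cluster 0 as "unclustered" is an assumption, taking the mean retention time for the cluster retention time is an assumption, and the plain row mean used for collapsing is a stand-in for whatever weighting the package applies.

```r
# Mock inputs (hypothetical values, not real rc.ramclustr output)
featclus <- c(1, 1, 1, 2, 2, 0, 0)   # assumed: 0 marks features that joined no cluster
rt <- c(100, 100.4, 99.8, 250, 251, 300, 410)
intensity <- rbind(
  s1 = c(10, 20, 30, 5, 15, 7, 9),
  s2 = c(20, 40, 60, 10, 30, 1, 2)
)

clustered <- featclus > 0
ids <- sort(unique(featclus[clustered]))

# Number of features per cluster and number of singletons
nfeat <- as.vector(table(featclus[clustered]))
nsing <- sum(!clustered)

# Cluster retention time (mean, by assumption) and its standard deviation
clrt <- as.vector(tapply(rt[clustered], featclus[clustered], mean))
clrtsd <- as.vector(tapply(rt[clustered], featclus[clustered], sd))

# Compound names C0001, C0002, ...; annotations start out identical
cmpd <- sprintf("C%04d", seq_along(ids))
ann <- cmpd

# Collapse feature intensities to one value per sample and cluster
# (plain mean used here as a stand-in)
SpecAbund <- sapply(ids, function(k) {
  rowMeans(intensity[, featclus == k, drop = FALSE])
})
colnames(SpecAbund) <- cmpd
print(SpecAbund)
```

With these mock inputs the result is two clusters and two singletons, and SpecAbund has one row per sample and one column per compound.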

Author(s)

Corey Broeckling

References

Broeckling CD, Afsar FA, Neumann S, Ben-Hur A, Prenni JE. RAMClust: a novel feature clustering method enables spectral-matching-based annotation for metabolomics data. Anal Chem. 2014 Jul 15;86(14):6812-7. doi: 10.1021/ac501530d. Epub 2014 Jun 26. PubMed PMID: 24927477.

Broeckling CD, Ganna A, Layer M, Brown K, Sutton B, Ingelsson E, Peers G, Prenni JE. Enabling Efficient and Confident Annotation of LC-MS Metabolomics Data through MS1 Spectrum and Time Prediction. Anal Chem. 2016 Sep 20;88(18):9226-34. doi: 10.1021/acs.analchem.6b02479. Epub 2016 Sep 8. PubMed PMID: 27560453.


cbroeckl/RAMClustR documentation built on Sept. 1, 2024, 1:50 a.m.