rc.get.df.data: rc.get.df.data


rc.get.df.data    R Documentation

rc.get.df.data

Description

Extracts data frame input in preparation for normalization and clustering.

Usage

rc.get.df.data(
  ms1_featureDefinitions = NULL,
  ms1_featureValues = NULL,
  ms2_featureDefinitions = NULL,
  ms2_featureValues = NULL,
  phenoData = NULL,
  ExpDes = NULL,
  featureNamesColumnIndex = 1,
  st = NULL,
  ensure.no.na = TRUE
)

Arguments

ms1_featureDefinitions

data frame of feature metadata for the MS data, with columns: mz, rt, feature names

ms1_featureValues

data frame of MS data, with rownames = sample names and colnames = feature names

ms2_featureDefinitions

data frame of feature metadata for the MSMS data, with columns: mz, rt, feature names

ms2_featureValues

data frame of MSMS data, with rownames = sample names and colnames = feature names

phenoData

dataframe containing phenoData

ExpDes

an R ExpDes (experimental design) object: data used for record keeping and labelling msp spectral output

featureNamesColumnIndex

integer: which column in 'ms1_featureDefinitions' contains feature names?

st

numeric: sigma t - time similarity decay value

ensure.no.na

logical: if TRUE, any 'NA' values in msint and/or msmsint are replaced with numerical values based on 10 percent of feature min plus noise. Used to ensure that spectra are not written with NA values.
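
The replacement rule described for ensure.no.na can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helper name fill_na_sketch, the choice of uniform noise, and the noise scale are all hypothetical, and intensities are assumed to be positive.

```r
# Hypothetical helper: replace NA values in each feature (column) with
# 10 percent of that feature's minimum observed value plus a little noise.
# The uniform noise and its scale are assumptions made for illustration.
fill_na_sketch <- function(x, fraction = 0.1, noise_scale = 0.01) {
  x <- as.matrix(x)
  for (j in seq_len(ncol(x))) {
    missing <- is.na(x[, j])
    # Skip columns with nothing to fill, or with no observed minimum to use.
    if (!any(missing) || all(missing)) next
    base_value <- fraction * min(x[, j], na.rm = TRUE)
    noise <- stats::runif(sum(missing), min = 0, max = noise_scale * base_value)
    x[missing, j] <- base_value + noise
  }
  x
}

set.seed(1)
toy <- data.frame(
  feat_1 = c(100, NA, 300),
  feat_2 = c(NA, 50, 80),
  row.names = c("sample_A", "sample_B", "sample_C")
)
filled <- fill_na_sketch(toy)
```

Observed values are left unchanged; only the missing cells receive a small positive value, so that spectra written later contain no NA entries.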

Details

This function creates a ramclustObj which will be used as input for clustering.
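
As a rough guide to the input shapes described under Arguments, the following base-R sketch builds toy data frames. All names and values (feat_1, sample_A, the mz and rt numbers, the column order) are invented for illustration; the example data shipped with the package remains the authoritative reference for layout.

```r
# Toy feature definitions: one row per feature, with feature names in
# column 1 (matching the default featureNamesColumnIndex = 1), plus mz and rt.
toy_definitions <- data.frame(
  feature = c("feat_1", "feat_2", "feat_3"),
  mz = c(150.0551, 301.1412, 455.2903),
  rt = c(62.4, 62.6, 180.1),
  stringsAsFactors = FALSE
)

# Toy feature values: rows are samples, columns are features.
toy_values <- data.frame(
  feat_1 = c(1200, 1350, 990),
  feat_2 = c(560, 610, 480),
  feat_3 = c(8700, 9100, 7900),
  row.names = c("sample_A", "sample_B", "sample_C")
)

# Sanity check: every feature named in the definitions has a column of values.
feature_names <- toy_definitions[[1]]
stopifnot(setequal(feature_names, colnames(toy_values)))
```

Data frames shaped like these correspond to ms1_featureDefinitions and ms1_featureValues; whether these exact toy frames are accepted by rc.get.df.data has not been verified here.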

Value

An empty ramclustR object. This object is formatted as an hclust object with additional slots for holding feature and compound data. Details on these slots are given below.

$frt: feature retention time, in whatever units were fed in

$fmz: feature m/z, reported to the number of decimal places selected in the ramclustR function

$ExpDes: the experimental design object used when running ramclustR. List of two dataframes.

$MSdata: the MS dataset provided by either xcms or csv input

$MSMSdata: the (optional) DIA (MSe, MSall, AIF, etc.) dataset

$xcmsOrd: original xcms order of features, for back-referencing when necessary

$msint: weighted.mean intensity of feature in MS level data

$msmsint: weighted.mean intensity of feature in MSMS level data
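
One way to confirm which of the slots listed above are present on a given object is a small check like the sketch below. The helper check_slots and the stand-in list mock_obj are hypothetical and exist only for illustration; the slot names are taken from the list above.

```r
# Slot names as documented in the Value section.
documented_slots <- c("frt", "fmz", "ExpDes", "MSdata", "MSMSdata",
                      "xcmsOrd", "msint", "msmsint")

# Hypothetical helper: report which documented slots exist in an object.
check_slots <- function(obj, slots = documented_slots) {
  present <- slots %in% names(obj)
  stats::setNames(present, slots)
}

# Stand-in list used only to demonstrate the helper; it is not a real
# ramclustR object.
mock_obj <- list(
  frt = c(62.4, 62.6),
  fmz = c(150.0551, 301.1412),
  MSdata = matrix(1:4, nrow = 2)
)
slot_report <- check_slots(mock_obj)
print(slot_report)
```

With the package installed, the same helper could be applied to the object created in the Examples section to see which slots are populated at this stage.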

Author(s)

Zargham Ahmad, Helge Hecht, Corey Broeckling

References

Broeckling CD, Afsar FA, Neumann S, Ben-Hur A, Prenni JE. RAMClust: a novel feature clustering method enables spectral-matching-based annotation for metabolomics data. Anal Chem. 2014 Jul 15;86(14):6812-7. doi: 10.1021/ac501530d. Epub 2014 Jun 26. PubMed PMID: 24927477.

Broeckling CD, Ganna A, Layer M, Brown K, Sutton B, Ingelsson E, Peers G, Prenni JE. Enabling Efficient and Confident Annotation of LC-MS Metabolomics Data through MS1 Spectrum and Time Prediction. Anal Chem. 2016 Sep 20;88(18):9226-34. doi: 10.1021/acs.analchem.6b02479. Epub 2016 Sep 8. PubMed PMID: 27560453.

Examples

## Feature metadata for the MS data (columns: mz, rt, feature names)
df1 <- readRDS(system.file("extdata", "featDefinition.rds", package = "RAMClustR", mustWork = TRUE))
## MS feature values (rownames = sample names, colnames = feature names)
df2 <- readRDS(system.file("extdata", "featValues.rds", package = "RAMClustR", mustWork = TRUE))
## phenoData for the samples
df3 <- readRDS(system.file("extdata", "phenoData_df.rds", package = "RAMClustR", mustWork = TRUE))

## Build the ramclustR object from the three data frames
ramclustr <- rc.get.df.data(
  ms1_featureDefinitions = df1,
  ms1_featureValues = df2,
  phenoData = df3,
  st = 5
)


cbroeckl/RAMClustR documentation built on Sept. 1, 2024, 1:50 a.m.