preProc: preProc - combined pre-processing steps of the MetMSLine...

Description

Wrapper function that performs all multiparametric preprocessing steps for large-scale high-resolution LC-MS metabolomic datasets. It combines the zeroFill, signNorm, loessSmooth, blankSub, cvCalc and logTrans functions.

Usage

preProc(peakTable = NULL, obsNames = NULL, sampNames = NULL,
  qcNames = NULL, blankNames = NULL, zeroFillValue = NULL,
  normMethod = NULL, cvThresh = NULL, nCores = NULL, outputDir = NULL,
  smoothSpan = NULL, folds = 7, baseLogT = exp(1),
  blankFCmethod = "mean", blankFCthresh = 2)

Arguments

peakTable

either a data.frame or the full file path, as a character string, to a .csv file of a peak table with observations (samples) in columns and variables (mass spectral signals) in rows. If the argument is not supplied, a GUI file selection window will open so that a .csv file can be selected.

obsNames

character vector of observation (i.e. sample/ QC/ Blank) names to identify appropriate observation (sample) columns.

sampNames

character vector of sample names to identify appropriate observation (sample) columns. If either the sampNames or qcNames arguments are not supplied then the pooled QC loessSmooth function is not performed.

qcNames

character vector of quality control (QC) names to identify appropriate observation (quality control) columns. If the qcNames argument is not supplied, neither the loessSmooth nor the cvCalc function will be carried out.

blankNames

character vector of blank (i.e. negative control) names to identify appropriate observation (blank) columns. If this and the sampNames argument are supplied, the preProc function will conduct a background subtraction (samples:blanks). For each LC-MS feature, a fold change is calculated (using either the mean or the median: see blankFCmethod) with the samples (see sampNames argument) as the numerator and the blanks as the denominator. Any LC-MS feature with a fold change lower than the threshold (blankFCthresh) will be removed. The fold change threshold can be set as required (default = 2, see blankFCthresh).
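The fold change filter described for blankNames can be sketched in base R as follows. This is a minimal illustration on a made-up table, not the blankSub source code: the toy data and the variable names sampCols, blankCols, foldChange and kept are hypothetical, and treating a fold change exactly equal to the threshold as retained is an assumption based on the wording "lower than the threshold will be removed".

```r
# Toy peak table (hypothetical data): rows are LC-MS features, columns are observations
toy <- data.frame(
  sample_1 = c(100, 10, 50),
  sample_2 = c(120, 12, 70),
  blank_1  = c(10, 9, 20),
  blank_2  = c(14, 11, 40)
)
sampCols  <- c("sample_1", "sample_2")
blankCols <- c("blank_1", "blank_2")

# sample:blank fold change per feature, using the mean as the summary statistic
foldChange <- rowMeans(toy[, sampCols]) / rowMeans(toy[, blankCols])

# retain features whose fold change is not lower than the threshold
blankFCthresh <- 2
kept <- toy[foldChange >= blankFCthresh, ]
```

In this toy table the second feature has nearly the same mean intensity in samples and blanks, so it is the only one dropped.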

zeroFillValue

numeric value to fill zero/ missing values (NA). By default, half the minimum non-zero observed peak intensity is used by the zeroFill function.
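The default zeroFillValue behaviour can be sketched in base R as follows. This is an illustration only: the toy matrix is hypothetical, and taking the minimum over the whole table (rather than per feature) is an assumption, since this page does not say which zeroFill uses.

```r
# Toy intensity matrix (hypothetical data) containing zeros and a missing value
toy <- matrix(c(0, 4, NA, 8, 2, 0), nrow = 2,
              dimnames = list(c("feat_1", "feat_2"), c("s1", "s2", "s3")))

# half the minimum non-zero observed intensity (assumed here to be table-wide)
nonZero   <- toy[!is.na(toy) & toy > 0]
fillValue <- min(nonZero) / 2

# replace zeros and NAs with the fill value
filled <- toy
filled[is.na(filled) | filled == 0] <- fillValue
```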

normMethod

either "medFC" for median fold change or "totIon" for total ion signal normalization. A custom vector of factors, equal in length to the obsNames argument, with which to normalize the data can also be supplied. default = "medFC". Argument for the signNorm function.
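The three normMethod options can be illustrated with base R. The sketch below uses one common formulation of each method and is not taken from the signNorm source: the toy matrix, the choice of the per-feature median as the reference profile, and rescaling to the mean column total are all assumptions.

```r
# Toy matrix (hypothetical data): s2 is a 2x more concentrated copy of s1, s3 a 0.5x copy
toy <- matrix(c(10, 20, 30, 20, 40, 60, 5, 10, 15), nrow = 3,
              dimnames = list(paste0("feat_", 1:3), c("s1", "s2", "s3")))

# total ion signal: scale each observation so that all column totals are equal
totals <- colSums(toy)
totIon <- sweep(toy, 2, totals / mean(totals), "/")

# median fold change: compare each observation with a reference profile
# (assumed here to be the per-feature median) and divide by the median ratio
refProfile <- apply(toy, 1, median)
medFactors <- apply(toy / refProfile, 2, median)
medFC <- sweep(toy, 2, medFactors, "/")

# custom factors: one user-supplied divisor per observation
customFactors <- c(1, 2, 0.5)
custom <- sweep(toy, 2, customFactors, "/")
```

Because the toy columns differ only by a dilution factor, the median fold change and custom-factor results coincide here.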

cvThresh

numeric CV% threshold used to decide whether an LC-MS variable is retained (default = NULL). If this argument and qcNames are not supplied, the CV% will not be calculated from the quality control samples and no LC-MS features will be removed on the basis of CV%. Argument for the cvCalc function.

outputDir

optional directory path in which to save output images before and after QC smoothing. A subdirectory will be created in which to save the png images. Argument for the loessSmooth function.
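The CV% filtration controlled by cvThresh can be sketched in base R as follows. The toy QC matrix is hypothetical, and retaining features whose CV% is at or below the threshold is an assumption based on the usual convention; the cvCalc source is not reproduced here.

```r
# Toy QC intensities (hypothetical data): rows are features, columns are pooled QC injections
qc <- matrix(c(100, 50, 110, 90, 90, 10), nrow = 2,
             dimnames = list(c("feat_1", "feat_2"), c("QC_1", "QC_2", "QC_3")))

# coefficient of variation per feature, as a percentage
cvPercent <- apply(qc, 1, sd) / rowMeans(qc) * 100

# assumed convention: retain features whose QC CV% does not exceed the threshold
cvThresh <- 30
keep <- cvPercent <= cvThresh
```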

smoothSpan

numeric (values between 0-1) fixed smoothing span. If supplied, this fixed smoothing span parameter will override the cross validated feature-by-feature smoothing span optimization. Argument for the loessSmooth function.
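A fixed-span, QC-based loess correction for a single feature can be sketched with stats::loess. This is a simplified illustration, not the loessSmooth implementation: the simulated drift, the QC spacing, the rescaling to the QC median and the use of surface = "direct" are all assumptions, and the cross validated span optimization is not shown.

```r
# Simulated single feature (hypothetical data): intensity drifts linearly with injection order
injOrder  <- 1:36
isQC      <- injOrder %% 5 == 1              # pooled QCs at injections 1, 6, ..., 36
intensity <- 1000 * (1 - 0.01 * injOrder)

# fit the drift curve on the QC injections only, using a fixed smoothing span
qcData <- data.frame(injOrder = injOrder[isQC], intensity = intensity[isQC])
fit <- loess(intensity ~ injOrder, data = qcData, span = 0.75, degree = 2,
             control = loess.control(surface = "direct"))

# predict the drift at every injection and rescale to the QC median intensity
driftCurve <- predict(fit, newdata = data.frame(injOrder = injOrder))
corrected  <- intensity / driftCurve * median(qcData$intensity)
```

With this noise-free toy signal the corrected intensities are flat across the run, which is the behaviour the smoothing step aims for on real data.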

folds

numeric (default=7, i.e. 7-fold cross validation) n-fold cross validation. Argument for the loessSmooth function.

baseLogT

the base with respect to which logarithms are computed. Defaults to e=exp(1). Argument for the logTrans function.
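The effect of baseLogT can be illustrated with the base R log function. Whether logTrans calls log in exactly this way is an assumption; the values below are made up.

```r
# natural logarithm (the default base, e = exp(1))
natural <- log(c(1, exp(1), exp(2)), base = exp(1))

# the same transformation with base 2
base2 <- log(c(1, 2, 8), base = 2)
```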

blankFCmethod

character either 'median' or 'mean' fold change calculation. Argument for the blankSub function.

blankFCthresh

numeric sample:blank fold change cut-off. Any LC-MS features below this threshold will be removed. Argument for the blankSub function.

Details

The wrapper function performs the following data pre-processing steps, some of which are optional. Which steps are carried out depends on which function arguments are supplied:

1. zero filling (see: zeroFill for further details).

2. signal normalization (optional, see: signNorm for further details).

3. blank subtraction (optional, see: blankSub for further details).

4. pooled QC-based loess smoothing (optional, see: loessSmooth for further details).

5. Coefficient of variation calculation and filtration (optional, see: cvCalc for further details).

6. log transformation (see: logTrans for further details).

Value

a data frame identical to the peakTable argument (with potentially fewer rows/ LC-MS features if blank subtraction or CV% calculation/ threshold filtration has been performed, see: blankSub, cvCalc).

Examples

# load the package that provides preProc and the example data
library(MetMSLine)
# read in table of synthetic MS1 metabolomic profiling data
peakTable <- system.file("extdata", "synthetic_MS_data.csv", package = "MetMSLine")
peakTable <- read.csv(peakTable, header = TRUE, stringsAsFactors = FALSE)
# all observation names
obsNames <- grep("QC_|sample_|blank_", colnames(peakTable), value = TRUE)
# quality control names
qcNames <- grep("QC_", colnames(peakTable), value = TRUE)
# drop the first nine QCs so that only the last column-conditioning QC remains
qcNames <- qcNames[-(1:9)]
# sample names: keep only those bounded by QCs (drop the last four samples)
sampNames <- grep("sample_", colnames(peakTable), value = TRUE)
sampNames <- head(sampNames, -4)
# blank (negative control) names
blankNames <- grep("blank_", colnames(peakTable), value = TRUE)
# detect number of cores using parallel package
nCores <- parallel::detectCores()
# conduct LC-MS data preprocessing
preProc_peakTable <- preProc(peakTable, obsNames, sampNames, qcNames, blankNames,
                             cvThresh = 30, nCores = nCores)

WMBEdmands/MetMSLine documentation built on May 9, 2019, 10:03 p.m.