peakPantheR_singleFileSearch: Search, integrate and report targeted features in a raw...


peakPantheR_singleFileSearch    R Documentation

Search, integrate and report targeted features in a raw spectra file

Description

Reports, for a single raw spectra file, the TIC, the acquisition time, the integrated targeted features, and the fitted curves and raw data points for each region of interest (ROI). Optimised to reduce the number of file accesses. Features that are not detected can be integrated using fallback integration regions (FIR).

Usage

peakPantheR_singleFileSearch(
    singleSpectraDataPath,
    targetFeatTable,
    peakStatistic = FALSE,
    plotEICsPath = NA,
    getAcquTime = FALSE,
    FIR = NULL,
    centroided = TRUE,
    curveModel = "skewedGaussian",
    verbose = TRUE,
    ...
)

Arguments

singleSpectraDataPath

(str) path to netCDF or mzML raw data file (centroided, only with the channel of interest).

targetFeatTable

a data.frame of compounds to target as rows. Columns: cpdID (str), cpdName (str), rtMin (float in seconds), rt (float in seconds, or NA), rtMax (float in seconds), mzMin (float), mz (float or NA), mzMax (float).

peakStatistic

(bool) If TRUE, calculates additional peak statistics: 'ppm_error', 'rt_dev_sec', 'tailingFactor' and 'asymmetryFactor'.

plotEICsPath

(str or NA) If not NA, saves a .png of all ROI EICs at the path provided ('filepath/filename.png' expected). If NA, no plot is saved.

getAcquTime

(bool) If TRUE will extract sample acquisition date-time from the mzML metadata (the additional file access will impact run time)

FIR

(data.frame or NULL) If not NULL, integrate Fallback Integration Regions (FIR) when a feature is not found. Compounds as rows must be identical to those of targetFeatTable; columns are rtMin (float in seconds), rtMax (float in seconds), mzMin (float), mzMax (float).

centroided

(bool) TRUE if the data is centroided; the value is passed to readMSData when reading the raw data file.

curveModel

(str) specify the peak-shape model to fit, by default skewedGaussian. Accepted values are skewedGaussian and emgGaussian

verbose

(bool) If TRUE, reports calculation progress, time taken and number of features found as messages.

...

Passes arguments to findTargetFeatures to alter peak-picking parameters (e.g. curveModel, sampling, params as a list of parameters for each ROI or 'guess',...)
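
The targetFeatTable and FIR inputs described above can also be built directly with data.frame(), which keeps the numeric columns numeric from the start. The following is a minimal sketch in base R: the compound values are those used in the Examples section, the FIR windows are illustrative values chosen for this sketch, and checkColumns is a hypothetical helper that is not part of peakPantheR.

```r
# Targeted features: one compound per row, retention times in seconds
targetFeatTable <- data.frame(
    cpdID   = c("ID-1", "ID-2"),
    cpdName = c("Cpd 1", "Cpd 2"),
    rtMin   = c(3310, 3280),
    rt      = c(3344.888, 3385.577),
    rtMax   = c(3390, 3440),
    mzMin   = c(522.194778, 496.195038),
    mz      = c(522.2, 496.2),
    mzMax   = c(522.205222, 496.204962),
    stringsAsFactors = FALSE
)

# Fallback integration regions: same compounds, in the same row order.
# These windows are illustrative assumptions, not recommended values.
FIR <- data.frame(
    rtMin = c(3326, 3365),
    rtMax = c(3365, 3405),
    mzMin = c(522.194778, 496.195038),
    mzMax = c(522.205222, 496.204962)
)

# Hypothetical helper (not part of peakPantheR): stop early if a required
# column is missing or if a window is inverted.
checkColumns <- function(tbl, required) {
    missingCols <- setdiff(required, colnames(tbl))
    if (length(missingCols) > 0) {
        stop("missing column(s): ", paste(missingCols, collapse = ", "))
    }
    if (any(tbl$rtMin >= tbl$rtMax) || any(tbl$mzMin >= tbl$mzMax)) {
        stop("each window needs rtMin < rtMax and mzMin < mzMax")
    }
    invisible(TRUE)
}

checkColumns(targetFeatTable,
             c("cpdID", "cpdName", "rtMin", "rt", "rtMax",
               "mzMin", "mz", "mzMax"))
checkColumns(FIR, c("rtMin", "rtMax", "mzMin", "mzMax"))
stopifnot(nrow(FIR) == nrow(targetFeatTable))
```

Both tables could then be supplied as the targetFeatTable and FIR arguments; checking them beforehand is optional and only meant to surface typos in column names early.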

Value

a list with the following elements:

list()$TIC (int) TIC value
list()$peakTable (data.frame) targeted features results (see Details)
list()$curveFit (list) a peakPantheR_curveFit or NA for each ROI
list()$acquTime (POSIXct or NA) date-time of sample acquisition from mzML metadata
list()$ROIsDataPoint (list) one data.frame of raw data points per ROI, with one row per data point and columns for retention time 'rt', mass 'mz' and intensity 'int'

Details:

The returned peakTable data.frame is structured as follows:

cpdID database compound ID
cpdName compound name
found whether the peak was found
rt retention time of peak apex (sec)
rtMin leading edge of peak retention time (sec) determined at 0.5% of apex intensity
rtMax trailing edge of peak retention time (sec) determined at 0.5% of apex intensity
mz weighted (by intensity) mean of peak m/z across scans
mzMin m/z peak minimum (between rtMin, rtMax)
mzMax m/z peak maximum (between rtMin, rtMax)
peakArea integrated peak area
peakAreaRaw integrated peak area from raw data points
maxIntMeasured maximum peak intensity in raw data
maxIntPredicted maximum peak intensity based on curve fit
is_filled logical indicating whether the feature was integrated using FIR (Fallback Integration Region)
ppm_error difference in ppm between the expected and measured m/z
rt_dev_sec difference in seconds between the expected and measured rt
tailingFactor the tailing factor is a measure of peak tailing. It is defined as the distance from the front slope of the peak to the back slope divided by twice the distance from the center line of the peak to the front slope, with all measurements made at 5% of the maximum peak height. The tailing factor of a peak will typically be similar to the asymmetry factor for the same peak, but the two values cannot be directly converted
asymmetryFactor the asymmetry factor is a measure of peak tailing. It is defined as the distance from the center line of the peak to the back slope divided by the distance from the center line of the peak to the front slope, with all measurements made at 10% of the maximum peak height. The asymmetry factor of a peak will typically be similar to the tailing factor for the same peak, but the two values cannot be directly converted
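
The tailingFactor and asymmetryFactor definitions above can be illustrated on a synthetic peak. The sketch below is written in base R and only follows the wording of the two definitions; it is not the package's internal implementation. halfWidthsAt is a hypothetical helper, and the peak is an artificial Gaussian with a wider back half, chosen so that the expected ratios are easy to reason about.

```r
# Hypothetical helper (not part of peakPantheR): distances from the apex to
# the front slope (a) and to the back slope (b) at a fraction of apex height.
halfWidthsAt <- function(rt, int, fraction) {
    apexIdx  <- which.max(int)
    level    <- fraction * int[apexIdx]
    frontIdx <- seq_len(apexIdx)
    backIdx  <- apexIdx:length(int)
    # Linear interpolation of rt as a function of intensity on each slope
    frontRt <- approx(x = int[frontIdx], y = rt[frontIdx], xout = level)$y
    backRt  <- approx(x = int[backIdx],  y = rt[backIdx],  xout = level)$y
    c(a = rt[apexIdx] - frontRt, b = backRt - rt[apexIdx])
}

# Synthetic tailing peak: Gaussian with a wider back half (sigma 3 s vs 2 s)
rt    <- seq(80, 130, by = 0.05)
sigma <- ifelse(rt < 100, 2, 3)
int   <- 1e6 * exp(-0.5 * ((rt - 100) / sigma)^2)

w5  <- halfWidthsAt(rt, int, 0.05)   # widths at 5% of apex height
w10 <- halfWidthsAt(rt, int, 0.10)   # widths at 10% of apex height

# Tailing factor: full width at 5% height over twice the front half-width
tailingFactor   <- unname((w5["a"] + w5["b"]) / (2 * w5["a"]))
# Asymmetry factor: back half-width over front half-width at 10% height
asymmetryFactor <- unname(w10["b"] / w10["a"])
```

For this synthetic peak the back half is 1.5 times wider than the front half at any height, so asymmetryFactor comes out close to 1.5 and tailingFactor close to 1.25. Both describe the same tailing, yet the values differ, and on real peaks the two are also measured at different heights.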

See Also

Other peakPantheR: peakPantheRAnnotation, peakPantheR_parallelAnnotation()

Other parallelAnnotation: peakPantheRAnnotation, peakPantheR_parallelAnnotation()

Examples

if(requireNamespace('faahKO')){
## Load data
library(faahKO)
netcdfFilePath <- system.file('cdf/KO/ko15.CDF', package = 'faahKO')

## targetFeatTable
targetFeatTable <- data.frame(matrix(vector(), 2, 8, dimnames=list(c(),
                    c('cpdID','cpdName','rtMin','rt','rtMax','mzMin','mz',
                    'mzMax'))), stringsAsFactors=FALSE)
targetFeatTable[1,] <- c('ID-1', 'Cpd 1', 3310., 3344.888, 3390., 522.194778,
                        522.2, 522.205222)
targetFeatTable[2,] <- c('ID-2', 'Cpd 2', 3280., 3385.577, 3440., 496.195038,
                        496.2, 496.204962)
targetFeatTable[,c(3:8)] <- vapply(targetFeatTable[,c(3:8)], as.numeric,
                                    FUN.VALUE=numeric(2))

res <- peakPantheR_singleFileSearch(netcdfFilePath,targetFeatTable,
                                    peakStatistic=TRUE)
# Polarity can not be extracted from netCDF files, please set manually the
#    polarity with the 'polarity' method.
# Reading data from 2 windows
# Data read in: 0.16 secs
# Warning: rtMin/rtMax outside of ROI; datapoints cannot be used for
#   mzMin/mzMax calculation, approximate mz and returning ROI$mzMin and
#   ROI$mzMax for ROI #1
# Found 2/2 features in 0.05 secs
# Peak statistics done in: 0 secs
# Feature search done in: 0.75 secs

res
# $TIC
# [1] 2410533091
#
# $peakTable
#   found    rtMin       rt    rtMax    mzMin    mz    mzMax peakArea
# 1  TRUE 3309.759 3346.828 3385.410 522.1948 522.2 522.2052 26133727
# 2  TRUE 3345.377 3386.529 3428.279 496.2000 496.2 496.2000 35472141
#   peakAreaRaw maxIntMeasured maxIntPredicted cpdID cpdName is_filled
# 1    26071378         889280        901015.8  ID-1   Cpd 1     FALSE
# 2    36498367        1128960       1113576.7  ID-2   Cpd 2     FALSE
#    ppm_error   rt_dev_sec  tailingFactor  asymmetryFactor
# 1 0.02337616    1.9397590       1.015357         1.026824
# 2 0.02460103    0.9518072       1.005378         1.009318
#
# $acquTime
# [1] NA
#
#
# $curveFit
# $curveFit[[1]]
# $amplitude
# [1] 162404.8
# 
# $center
# [1] 3341.888
# 
# $sigma
# [1] 0.07878613
# 
# $gamma
# [1] 0.00183361
# 
# $fitStatus
# [1] 2
# 
# $curveModel
# [1] 'skewedGaussian'
# 
# attr(,'class')
# [1] 'peakPantheR_curveFit'
# 
# $curveFit[[2]]
# $amplitude
# [1] 199249.1
# 
# $center
# [1] 3382.577
# 
# $sigma
# [1] 0.07490442
# 
# $gamma
# [1] 0.00114719
# 
# $fitStatus
# [1] 2
# 
# $curveModel
# [1] 'skewedGaussian'
# 
# attr(,'class')
# [1] 'peakPantheR_curveFit'
#
#
# $ROIsDataPoint
# $ROIsDataPoint[[1]]
#          rt    mz    int
# 1  3315.154 522.2   2187
# 2  3316.719 522.2   3534
# 3  3318.284 522.2   6338
# 4  3319.849 522.2  11718
# 5  3321.414 522.2  21744
# 6  3322.979 522.2  37872
# 7  3324.544 522.2  62424
# 8  3326.109 522.2  98408
# 9  3327.673 522.2 152896
# 10 3329.238 522.2 225984
# ...
#
# $ROIsDataPoint[[2]]
#          rt    mz     int
# 1  3280.725 496.2    1349
# 2  3290.115 496.2    2069
# 3  3291.680 496.2    3103
# 4  3293.245 496.2    5570
# 5  3294.809 496.2   10730
# 6  3296.374 496.2   20904
# 7  3297.939 496.2   38712
# 8  3299.504 496.2   64368
# 9  3301.069 496.2   97096
# 10 3302.634 496.2  136320
# ...
}
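
The printed peakTable above can be post-processed like any data.frame. The following sketch runs without peakPantheR: it rebuilds a few columns from the values printed in the example, keeps the features that were detected directly (not filled from a FIR), and recomputes the retention time deviation. The sign convention (measured minus expected) is an assumption inferred from the printed rt_dev_sec values, and the variable names are illustrative.

```r
# Values copied from the printed peakTable in the example above
peakTable <- data.frame(
    cpdID     = c("ID-1", "ID-2"),
    found     = c(TRUE, TRUE),
    rt        = c(3346.828, 3386.529),
    peakArea  = c(26133727, 35472141),
    is_filled = c(FALSE, FALSE),
    stringsAsFactors = FALSE
)

# Expected retention times (column rt of targetFeatTable in the example)
expectedRt <- c(3344.888, 3385.577)

# Keep features that were detected directly rather than filled from a FIR
keep     <- peakTable$found & !peakTable$is_filled
detected <- peakTable[keep, ]

# Retention time deviation in seconds, assuming measured minus expected
rtDevSec <- detected$rt - expectedRt[keep]
```

The recomputed deviations agree with the printed rt_dev_sec column to within the rounding of the displayed rt values. With a real result, the same filtering would be applied to res$peakTable.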


phenomecentre/peakPantheR documentation built on Feb. 29, 2024, 9:07 p.m.