proc.txt: Filter noise and detect peaks from LC/MS data in text format

View source: R/proc.txt.R

proc.txt (R Documentation)

Filter noise and detect peaks from LC/MS data in text format

Description

This function applies the run filter to remove noise from LC/MS data. In the same step, data points are grouped into extracted ion chromatograms (EICs).

Usage

proc.txt(filename, min.pres=0.5, min.run=12, tol=NA, find.tol.maxd=1e-4, baseline.correct.noise.percentile=0, baseline.correct=NA)
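As a minimal sketch of preparing an input file for this call, the base R code below writes a tiny tab-separated table in the layout given under Details (retention time, m/z, intensity, with a header row) and reads it back to check the structure. The column names, the values, and the temporary file name are made up for illustration. The proc.txt call itself is left commented out, since it needs the apLCMS package and a real data file.

```r
# Build a tiny tab-separated file with the three expected columns.
# Column names and values are invented for illustration only.
toy <- data.frame(
  rt        = c(10.0, 10.5, 11.0, 11.5),
  mz        = c(200.1001, 200.1002, 200.1001, 200.1003),
  intensity = c(1500, 4200, 3900, 1200)
)
path <- file.path(tempdir(), "toy_lcms.txt")
write.table(toy, path, sep = "\t", row.names = FALSE, quote = FALSE)

# Read the file back to confirm the layout: a header row plus three
# numeric columns.
check <- read.table(path, sep = "\t", header = TRUE)
str(check)

# With apLCMS installed and a real data file, the call would look like this
# (commented out because the toy file is far too small to contain peaks):
# library(apLCMS)
# eic <- proc.txt(path, min.pres = 0.5, min.run = 12, tol = 1e-5)
```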

Arguments

filename

The text file name. If the file is not in the working directory, the path needs to be given.

min.pres

Run filter parameter. The minimum proportion of presence in the time period for a series of signals grouped by m/z to be considered a peak.

min.run

Run filter parameter. The minimum length of elution time for a series of signals grouped by m/z to be considered a peak.

tol

m/z tolerance level for the grouping of data points, expressed as a fraction of the m/z value. This value, multiplied by the m/z value, gives the cutoff level. The recommended value is the machine's nominal accuracy level: divide the accuracy in ppm by 1e6 (for example, 10 ppm corresponds to 1e-5). For FTMS, 1e-5 is recommended.

find.tol.maxd

Maximum distance between data points that is allowed in the procedure to find the tolerance.

baseline.correct

After grouping the observations, the highest intensity in each group is found. If this maximum is lower than baseline.correct, the entire group is deleted. The default value is NA, in which case the program uses the 75th percentile of the height of the noise groups.

baseline.correct.noise.percentile

The percentile of signal strength of the EICs that do not pass the run filter, to be used as the baseline threshold of signal strength.
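The tol argument above is a fraction of the m/z value rather than a ppm figure. The short sketch below shows the arithmetic for converting a ppm accuracy into tol and for the absolute cutoff that results at a given m/z. The helper ppm_to_tol is a hypothetical name introduced here and is not part of apLCMS.

```r
# Hypothetical helper (not part of apLCMS): convert an instrument accuracy
# given in ppm to the fractional tolerance expected by the tol argument.
ppm_to_tol <- function(ppm) ppm / 1e6

tol <- ppm_to_tol(10)   # 10 ppm expressed as a fraction of m/z
mz <- 500               # an example m/z value
cutoff <- tol * mz      # absolute m/z cutoff at m/z 500

print(tol)
print(cutoff)
```

Because the cutoff scales with m/z, the same tol gives a wider absolute window at higher masses.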

Details

The input must be a tab-separated text file. The first row holds the column labels rather than values. The first column is the retention time, the second column the m/z value, and the third column the intensity.

The m/z values are grouped according to the tolerance level using multi-stage smoothing and peak finding. Non-parametric density estimation is used in both the m/z dimension and the elution time dimension to fine-tune the signal grouping. A run filter is applied: a "true peak" must span a minimum length in the retention time dimension (parameter: min.run) and must be detected in at least a given proportion of the time points within that period (parameter: min.pres).
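To make the roles of min.run and min.pres concrete, here is a simplified sketch of a run filter for a single m/z group. It is an illustration under stated assumptions and not the package's implementation: the function passes_run_filter is hypothetical, min.run is assumed to be in the same units as the retention times, and the scan times and detections are invented.

```r
# Simplified run filter for one m/z group (illustration only, not apLCMS code).
# scan_times: retention times of all scans in the run.
# detected:   logical vector, TRUE where the group has a signal in that scan.
passes_run_filter <- function(scan_times, detected,
                              min.run = 12, min.pres = 0.5) {
  for (start in scan_times) {
    # Skip windows that would run past the end of the acquisition.
    if (start + min.run > max(scan_times)) next
    in_window <- scan_times >= start & scan_times <= start + min.run
    # Keep the group if signals are present in at least min.pres of the
    # scans inside some window of length min.run.
    if (mean(detected[in_window]) >= min.pres) return(TRUE)
  }
  FALSE
}

scan_times <- seq(0, 60, by = 1)               # one scan per time unit
peak  <- scan_times >= 20 & scan_times <= 35   # 16 consecutive detections
noise <- scan_times %in% c(5, 30, 52)          # three isolated detections

passes_run_filter(scan_times, peak)    # TRUE
passes_run_filter(scan_times, noise)   # FALSE
```

The sustained series passes because a full window of length min.run is covered by detections, while the isolated detections never reach the min.pres proportion in any window.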

Value

A matrix with four columns: m/z value, retention time, intensity, and group number.

Author(s)

Tianwei Yu <tyu8@emory.edu>


yufree/apLCMS documentation built on May 19, 2024, 1:22 p.m.