mzpart: Divisive partitioning of raw LC-HRMS measurements


Description

Divisive, recursive partitioning of LC-HRMS measurements. Preparatory step for mzclust and mzpick; alternative to mzagglom. Requires an MSlist initialized by readMSdata as input.

Usage

	mzpart(MSlist, dmzgap = 10, drtgap = 500, ppm = TRUE,
	minpeak = 4, peaklimit = 2500, cutfrac = 0.1, drtsmall = 50,
	progbar = FALSE, stoppoints = 2e+05)
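
A minimal calling sketch with the defaults above spelled out. The file name sample.mzXML is a hypothetical placeholder, and readMSdata is assumed to accept the file path as its first argument; the calls are guarded so nothing runs unless the package and the file are present.

```r
# Hypothetical input path; replace with your own centroided .mzXML file.
infile <- "sample.mzXML"

# Arguments with the default values listed in the Usage section.
part_args <- list(dmzgap = 10, drtgap = 500, ppm = TRUE, minpeak = 4,
                  peaklimit = 2500, cutfrac = 0.1, drtsmall = 50,
                  progbar = FALSE, stoppoints = 2e+05)

if (requireNamespace("enviPick", quietly = TRUE) && file.exists(infile)) {
  # Assumption: the file path is the first argument of readMSdata.
  MSlist <- enviPick::readMSdata(infile)
  # mzpart returns the same list with the partitioning entries filled in.
  MSlist <- do.call(enviPick::mzpart, c(list(MSlist), part_args))
}
```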

Arguments

MSlist

MSlist generated by readMSdata

dmzgap

m/z gap width for partitioning

drtgap

RT gap width for partitioning

ppm

dmzgap given in ppm (TRUE) or as absolute value (FALSE)?

minpeak

Minimum number of measurements in a partition

peaklimit

Maximum number of measurements in a partition

cutfrac

Fraction of low density measurements to be discarded

drtsmall

RT tolerance used to estimate density

progbar

Show progress while running (TRUE) or not (FALSE)? Mainly for debugging; see Warning.

stoppoints

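Maximum number of measurements handled by the routine that discards measurements; execution aborts above this value. Mainly for debugging; see Warning.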
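
To make the ppm switch concrete, the following sketch converts a dmzgap setting into an absolute m/z width using the standard parts-per-million definition. The helper mz_gap_width is hypothetical and not part of enviPick, and it assumes the tolerance is taken relative to the m/z value being compared.

```r
# Hypothetical helper, not an enviPick function: absolute m/z width
# implied by a dmzgap setting at a given m/z.
mz_gap_width <- function(mz, dmzgap = 10, ppm = TRUE) {
  if (ppm) {
    mz * dmzgap / 1e6         # parts per million of the m/z value
  } else {
    rep(dmzgap, length(mz))   # absolute width, independent of m/z
  }
}

mz_gap_width(c(100, 500, 1000))                 # 0.001 0.005 0.010
mz_gap_width(c(100, 500, 1000), 0.005, FALSE)   # 0.005 0.005 0.005
```

With ppm = TRUE the same setting therefore tolerates wider absolute gaps at high m/z than at low m/z.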

Details

This function searches recursively for gaps in retention time (RT) and m/z in the LC-HRMS measurements and thus partitions (and re-sorts) the matrix contained in MSlist[[4]]. If neither partitioning by RT nor by m/z yields a small enough partition of <= peaklimit measurements, a fraction cutfrac of the lowest-density measurements is discarded and the partitioning procedure is resumed. Measurement-wise density is based on a Gaussian kernel density estimate scaled to dmzgap and drtsmall, i.e., to the local neighbourhood of each measurement.
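
The gap search can be pictured in one dimension as follows. This is an illustrative base-R sketch, not the package's implementation; split_by_gap is a hypothetical helper. Applying such a split repeatedly to RT and m/z within each resulting group gives a recursive partitioning of the kind described above.

```r
# Illustrative only: split values into groups wherever two sorted
# neighbours are further apart than `gap`.
split_by_gap <- function(x, gap) {
  ord <- order(x)
  # A new group starts after every gap wider than the threshold.
  grp <- cumsum(c(TRUE, diff(x[ord]) > gap))
  out <- integer(length(x))
  out[ord] <- grp   # map group ids back to the input order
  out
}

rt <- c(10, 12, 700, 11, 705, 1500)
split_by_gap(rt, gap = 500)   # 1 1 2 1 2 3
```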

Partitioning is necessary to speed up the clustering procedure of mzclust. Hence, there is a trade-off: large values of peaklimit lead to faster execution of mzpart but to slower computation of mzclust, and vice versa.

Value

Returns the argument MSlist, with entries made:

Parameters

MSlist[[2]]: saves the parameter settings.

Scans

MSlist[[4]]: matrix with raw measurements and tags, re-sorted by partition.

Partition_Index

MSlist[[5]]: Index assigning partitions to sections in the raw measurements of MSlist[[4]]; required for fast (random) access.
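
Why an index over a partition-sorted matrix allows fast access can be shown on toy data. The layout used here (one row per partition holding its first and last row number) is an assumption made for illustration only; the actual column layout of MSlist[[5]] is not specified on this page, and get_partition is a hypothetical helper.

```r
# Toy stand-in for MSlist[[4]]: rows already sorted by partition.
scans <- cbind(mz = c(100.01, 100.02, 250.10, 250.11, 250.12),
               rt = c(30, 31, 400, 401, 402))

# Assumed index layout (hypothetical): one row per partition with the
# first and last row it occupies in `scans`.
part_index <- cbind(start = c(1, 3), end = c(2, 5))

# Fetch one partition by slicing a contiguous row range, no searching.
get_partition <- function(scans, part_index, i) {
  scans[part_index[i, "start"]:part_index[i, "end"], , drop = FALSE]
}

get_partition(scans, part_index, 2)   # the three rows near m/z 250.1
```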

Note

Do not set minpeak larger than its counterpart in mzclust or mzpick. To adjust all function arguments consistently in one place, use enviPickwrap instead.

Warning

Despite optimized code, this function can run for an intolerably long time or run out of memory if (a) the parameters are set wrongly, (b) the .mzML/.mzXML file was not centroided or (c) the underlying data is inadequate for this peak picker. With regard to (a), do not assume gaps to be larger than those actually present. Instead, use plotMSlist to inspect the data contained in MSlist after loading it with readMSdata; set progbar=TRUE to monitor where a function fails. Once settled, set progbar=FALSE for faster execution.

To avoid running out of memory, stoppoints sets the maximum number of measurements that can be handled in the routines to delete those of lowest intensity (in cases where peaklimit cannot be reached by partitioning by dmzgap and drtgap alone). If the number of measurements exceeds stoppoints, execution aborts.
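
The interplay of cutfrac and stoppoints can be sketched in base R, following the density-based discard described under Details. This is a naive illustration under stated assumptions, not the package's routine: discard_low_density is a hypothetical helper, the density is an unnormalized sum of Gaussian kernels, and the pairwise matrices are simply the easiest way to write it down. In this naive form memory grows quadratically with the number of measurements, which illustrates one reason a cap such as stoppoints is useful.

```r
# Illustrative sketch, not the package's routine: flag the `cutfrac`
# share of points with the lowest Gaussian kernel density for removal.
discard_low_density <- function(mz, rt, dmz, drt, cutfrac = 0.1,
                                stoppoints = 2e+05) {
  n <- length(mz)
  if (n > stoppoints) stop("more than 'stoppoints' measurements; aborting")
  # Pairwise differences scaled by the two tolerances (n x n matrices).
  zmz <- outer(mz, mz, "-") / dmz
  zrt <- outer(rt, rt, "-") / drt
  # Unnormalized density: sum of Gaussian kernels over all points.
  dens <- rowSums(exp(-0.5 * (zmz^2 + zrt^2)))
  ncut <- floor(cutfrac * n)
  keep <- rep(TRUE, n)
  if (ncut > 0) keep[order(dens)[seq_len(ncut)]] <- FALSE
  keep   # TRUE for measurements that are retained
}

mz <- c(100.000, 100.001, 100.002, 100.001, 100.500)
rt <- c(50, 52, 51, 53, 900)
# The isolated fifth point has the lowest density and is flagged.
discard_low_density(mz, rt, dmz = 0.001, drt = 50, cutfrac = 0.2)
```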

Author(s)

Martin Loos

See Also

mzclust


enviPick documentation built on May 1, 2019, 8:05 p.m.