baselineCorrectionQuant: Baseline correction - quantiles method
In acinostroza/TargetSearch: A package for the analysis of GC-MS metabolite profiling data

baselineCorrectionQuant

R Documentation

Baseline correction - quantiles method

Description

This function performs baseline correction using a quantile-based moving window algorithm.

Usage

    baselineCorrectionQuant(peaks, time, smooth=0, qntl=0.50, width=30,
                            unit=c("seconds", "points"), steps=10)

Arguments

`peaks`	Either a matrix of spectra peak intensities to be baseline corrected, where the rows are retention times and the columns are mass traces; or a named list containing an element called `"Peaks"` that holds such a matrix and another called `"Time"` with the retention times in seconds. The list can be generated by `peakCDFextraction`.
`time`	A vector of retention times in seconds. This parameter is used if `peaks` is a matrix. Otherwise, the element called `"Time"` is used instead and this parameter is ignored.
`smooth`	An integer. Smooth each signal by this number of points using a moving average. Smoothing is disabled if this value is less than or equal to 1. Note that the smoothing is applied after the baseline correction.
`qntl`	Numeric scalar. The quantile for baseline estimation. The value must be in `[0, 1]`.
`width`	Numeric scalar. The size of the window centered around a scan for baseline estimation. The size depends on the parameter `unit` below.
`unit`	A string that selects whether `width` is given in points (scans) or in seconds.
`steps`	Integer scalar greater than zero. To speed up computation, the baseline algorithm does not compute the baseline estimate at every single scan, but at intervals of `steps` scans. The intermediate points are estimated by linear interpolation.
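The `smooth` argument describes a moving average over a fixed number of points. A minimal base R sketch of that idea follows. The helper name `moving_average` is hypothetical and not part of TargetSearch, and the handling of the trace edges is an assumption, so the package may treat them differently.

```r
# Hypothetical helper (not part of TargetSearch): centered moving average
# over k points using stats::filter().
# Assumption: the first and last points are left as NA here because the
# window is incomplete at the edges; the package may handle edges differently.
moving_average <- function(x, k) {
  if (k <= 1) return(x)   # mirrors "smoothing is disabled if smooth <= 1"
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 2))
}

x <- c(0, 0, 9, 0, 0, 0)
smoothed <- moving_average(x, 3)   # the spike is spread over three points
unchanged <- moving_average(x, 1)  # returned as is
```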

Details

Applies a quantile-based baseline estimation method. The method is applied to each ion mass trace (column of `peaks`) individually. For each data point of the trace, it simply computes the `qntl` quantile (for example the 50% quantile, i.e., the median) of all the points that are within a `width` distance of it.
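The windowed quantile described above can be sketched in base R for a single trace. This is an illustration only, not the package's implementation: the helper `rolling_quantile`, its `half_width` argument (in scans), the clipping of the window at the trace edges, and the assumption that the correction is a plain subtraction of the estimate are all choices made for this example.

```r
# Hypothetical helper (not part of TargetSearch): baseline of one trace as
# the `qntl` quantile of the points within `half_width` scans of each scan.
rolling_quantile <- function(x, half_width, qntl = 0.5) {
  n <- length(x)
  vapply(seq_len(n), function(i) {
    lo <- max(1, i - half_width)   # clip the window at the trace edges
    hi <- min(n, i + half_width)
    unname(quantile(x[lo:hi], probs = qntl))
  }, numeric(1))
}

# Synthetic trace: constant offset of 100 plus one narrow Gaussian peak
scans <- 1:200
trace <- 100 + 500 * exp(-((scans - 100)^2) / (2 * 3^2))

baseline  <- rolling_quantile(trace, half_width = 30, qntl = 0.5)
corrected <- trace - baseline      # assumed: correction is a subtraction
```

Because the window is much wider than the peak, the median stays close to the constant offset and the peak height is preserved in `corrected`.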

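For the method to work well, choose a `width` much larger than the widest peak. Otherwise the window around a scan inside a peak is dominated by peak signal, the quantile follows the peak itself, and part of the peak is removed along with the baseline.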
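The effect of the window size can be seen with a small synthetic example. The sketch below uses `stats::runmed()`, a running median from base R, as a stand-in for the `qntl = 0.5` case. It is not the routine used by the package, and the trace and window sizes are made up for illustration.

```r
# Synthetic trace: flat baseline of 100 plus a peak roughly 40 scans wide
scans <- 1:300
trace <- 100 + 500 * exp(-((scans - 150)^2) / (2 * 7^2))

# stats::runmed() computes a running median; the window size k must be odd
narrow <- runmed(trace, k = 11)    # window narrower than the peak
wide   <- runmed(trace, k = 151)   # window much wider than the peak

# Peak height that survives after subtracting each baseline estimate
height_narrow <- max(trace - narrow)
height_wide   <- max(trace - wide)
```

With the narrow window the running median tracks the peak, so most of the peak is subtracted away. With the wide window the estimate stays near the flat baseline and the peak keeps almost its full height.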

For efficiency, and assuming that the baseline is a smooth curve, the quantiles are computed only every `steps` points. For example, if `steps=3`, the quantiles are computed at every third scan instead of at every point. If instead `steps=1`, they are computed at every scan. The baseline at the points in between (if `steps > 1`) is approximated by linear interpolation.
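The speed-up described above can be sketched as follows: evaluate the windowed quantile only at every `steps`-th scan and fill in the rest with `approx()`. The helper `sparse_baseline`, its `half_width` argument, and the choice to always include the last scan as an evaluation point are assumptions for this example and may not match what the package does internally.

```r
# Hypothetical helper (not the package's implementation): windowed quantile
# evaluated at every `steps`-th scan, linearly interpolated in between.
sparse_baseline <- function(x, half_width, qntl = 0.5, steps = 10) {
  n <- length(x)
  # Scans where the quantile is actually computed; the last scan is always
  # included so that the interpolation covers the whole trace
  knots <- unique(c(seq(1, n, by = steps), n))
  est <- vapply(knots, function(i) {
    win <- x[max(1, i - half_width):min(n, i + half_width)]
    unname(quantile(win, probs = qntl))
  }, numeric(1))
  # Linear interpolation between the computed scans
  approx(knots, est, xout = seq_len(n))$y
}

set.seed(1)
trace <- 100 + 0.2 * (1:500) + rnorm(500, sd = 2)   # slowly drifting baseline

dense  <- sparse_baseline(trace, half_width = 25, steps = 1)
sparse <- sparse_baseline(trace, half_width = 25, steps = 10)

# Largest disagreement between the dense and the interpolated estimate
max_diff <- max(abs(dense - sparse))
```

For a smooth baseline such as this one, the interpolated estimate stays close to the dense one while computing roughly one tenth of the quantiles.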

Value

Returns a list with the same elements as the input, but with the element `"Peaks"` containing the baseline-corrected values. If `peaks` is a matrix, a matrix of the same dimensions is returned instead.

Author(s)

Alvaro Cuadros-Inostroza

Examples

  # get a random sample CDF from TargetSearchData
  require(TargetSearchData)
  cdffile <- sample(tsd_cdffiles(), 1)
  pdata <- peakCDFextraction(cdffile)

  # restrict mass range to reduce computing time (not needed for
  # actual data)
  pdata$Peaks <- pdata$Peaks[, 1:10] ; pdata$massRange <- c(85, 94)

  # make a fake baseline as constant + noise (the CDF files have been
  # already baseline corrected by the vendor software).
  nscans <- length(pdata$Time)
  noise <- as.integer(1000 + rnorm(nscans, sd=5))
  pdata$Peaks <- pdata$Peaks + noise

  # change parameters and see how the results change. Note that the default
  # width of 30 seconds might be too small
  pdata1 <- baselineCorrectionQuant(pdata, steps=5)
  pdata2 <- baselineCorrectionQuant(pdata, width=50, steps=5)

  # pick a trace k and compare the corrected values
  k <- 6
  m <- cbind(pdata$Peaks[, k] - noise, pdata1$Peaks[, k], pdata2$Peaks[, k])
  matplot(pdata$Time, m, type='l', lty=1, xlab='time', ylab='intensity')
  legend('topleft', c('original', 'base correct 1', 'base correct 2'),
         col=1:3, lty=1, lwd=1)

acinostroza/TargetSearch documentation built on July 5, 2025, 1:19 a.m.