msPeakCWT: Peak Detection via the Continuous Wavelet Transform

Description Usage Arguments Value References See Also Examples

Description

This function isolates peaks of interest in the input mass spectrum by way of a continuous wavelet transform (CWT). The method is scale-selective, i.e., the user can specify the scale range of interest in defining peaks. The basic algorithm works as follows:

CWT

a continuous wavelet transform of the input spectrum y is calculated over a broad range of scales.

CWT Tree

a CWT tree is formed by dividing the set of local maxima in the CWT time-scale plane into branches. Each branch contains a list of maxima whose neighbors are close in time as scales are traversed from coarse to fine. The m/z value at the smallest scale identifies the peak location in the original input spectrum.

Pruning

The collection of branches (and corresponding peak locations) is pruned in numerous ways (see the wavCWTPeaks function for details.)

In addition to peak identification, this function also calculates an estimate of the Holder exponent associated with each peak. Qualitatively speaking, the magnitude (of the modulus) of the Holder exponent is proportional to the the sharpness of the corresponding peak. See the holderSpectrum function for more information. The Holder exponents are packed into the output data.frame along with other peak-related information.

Usage

1
2
3
4
msPeakCWT(x, y, n.scale = 100, snr.min = 3, scale.min = 4,
          length.min = 10, noise.span = NULL, noise.fun =
          "quantile", noise.min = NULL, n.octave.min = 1,
          tolerance = 0, holder = TRUE, process = "msPeakCWT")

Arguments

x

A numeric vector representing the m/z values of a spectrum.

y

A numeric vector representing the intensity values of the spectrum.

length.min

The minimum number of points along a CWT tree branch and within the specified scale.range needed in order for that branches peak to be considered a peak candidate. See the wavCWTPeaks function for more details. Default: 10.

n.octave.min

A pruning factor for excluding non-persistent branches. If a branch of connected extrema does not span this number of octaves, it is excluded from the tree. See the wavCWTTree function for more details. Default: 1.

n.scale

The (maximum) number of logarithmically distributed scales over which to evaluate the CWT. For a uniformly sampled time series (x), the supported range of CWT scales is deltat(x) * c(1, length(x)). where deltat(x) is the sampling interval. However, a mass spectrum is generally viewed as a non-uniformly sampled time series because the difference in successive m/z values is non-constant due to the quadratic mapping of the TOF values to the m/z domain. For the purpose of peak detection, however, the sampling interval can be set to unity. Thus, the possible range of CWT scales is 1 to length(x) and the scale values are integers. See the wavCWT function for more details on CWT scales. Default: 100.

holder

a logical value. If TRUE, the holder exponents corresponding to the peaks are also calculated.

noise.fun

A character string defining the function to apply to the local noise estimates in order to sumarize and quantify the local noise level into a scalar value. See the DETAILS section for more information. Supported values are

"quantile"

quantile(x, probs=0.95)

"sd"

sd(x)

"mad"

mad(x, center=0)

where x is a vector of smallest-scale CWT coefficients whose time indices are near that of the branch termination time. See the wavCWTPeaks function for more details. Default: "quantile".

noise.min

The minimum allowed estimated local noise level. Values must be between 0 and 1 inclusive. Default NULL corresponds to 0.05. It is converted to raw scale before passed to wavCWTPeaks: quantile(attr(x,"noise"), prob=0.05), where x is the input wavCWTTree object.

noise.span

The span in time surrounding each branche's temrination point to use in forming local noise estimates and (ultimately) peak SNR estimates. See the wavCWTPeaks function for more details. Default: NULL,max(0.01 * diff(range(times)), 5*sampling.interval), where times and sampling.interval are attributes of the input wavCWTTree object.

process

A character string denoting the name of the process to register with the (embedded) event history object of the input after processing the input data. Default: "msPeakCWT".

scale.min

The minimum allowable value of the scale associated with a CWT peak. See the scale.range argument of the wavCWTPeaks function for more details. Default: 4.

snr.min

The minimum allowed peak signal-to-noise ratio. See the wavCWTPeaks function for more details. Default: 3.

tolerance

A tolerance vector used to find CWT extrema. This vector must be as long as there are scales in the CWT such that the jth element defines the tolerance to use in finding modulus maxima at the jth scale of the CWT. If not, the last value is replicated appropriately. See the wavCWTTree function for more details. Default: 0.

Value

A data.frame with 11 columns: peak class location, left bound, right bound and peak span in both clock tick ("tick.loc", "tick.left", "tick.right", "tick.span") and mass measure ("mass.loc", "mass.left", "mass.right", "mass.span"), and peak signal-to-noise ratio and intensity ("snr", "intensity"). The final column is "holder", representing the estimated Holder exponent asscoiated with each peak. Since noise.local is NULL, "snr" is the same as ("intensity").

References

Pan Du, Warren A. Kibbe, and Simon M. Lin, “Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching", Bioinformatics, 22, 2059–2065 (2006).

D. B. Percival and A. T. Walden, Wavelet Methods for Time Series Analysis, Cambridge University Press, 2000.

See Also

wavCWT, wavCWTTree, msPeak, msPeakInfo.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
if (!exists("qcset")) data("qcset", package="msProcess")

## extract a subset of a single spectrum 
mz  <- qcset$mz
imz <- mz > 3000 & mz < 5000
z   <- as.vector(qcset[imz, 1]$intensity)
mz  <- mz[imz]
plot(mz, z, cex=0.5, ylab="Intensity", xlab="m/z", type="l")

## estimate the peak locations using various 
## minimum CWT scales and overlay the plot 
## locations 
scale.min <- c(2,4,8)
col <- c("green","blue","red")

for (i in seq(along=scale.min)){
    p <- msPeakCWT(mz, z, scale.min=scale.min[i])
    ipeak <- p[["tick.loc"]]
    points(mz[ipeak], z[ipeak], cex=i, col=col[i], pch=1)
}

## add a legend 
if (is.R()){
  legend(3000,10000,pch=1,col=col,legend=paste("scale.min =", scale.min))
} else {
  legend(3000,10000,marks=1,col=col,legend=paste("scale.min =", scale.min))
}

## plot the Holder exponents corresponding to each 
## peak 
hx <- p[["mass.loc"]]
hy <- abs(p[["holder"]])
plot(hx, hy, type="h", xlim=range(mz), lty="dashed", col="blue", xlab="m/z", ylab="|Holder exponent|")
points(hx, hy, col="red", cex=1.2)
lines(mz, rescale(z,c(0,0.5)), lwd=3)
abline(h=0, lty="dotted")
legend(3000,1,col=c("blue","black"), lwd=c(1,3), lty=c("dashed","solid"),
    legend=c("|Holder exponent|", "Scaled spectrum"))

zeehio/msProcess documentation built on May 4, 2019, 10:15 p.m.