Computes an estimated baseline curve for a spectrum using the “BXR algorithm,” a method of Xi and Rocke generalized by Barkauskas and Rocke.
spect | vector containing the intensities of the spectrum
init.bd | initial value for baseline; default is flat baseline at median height
sm.par | smoothing parameter for baseline calculation
sm.ord | order of derivative to penalize in baseline analysis
max.iter | convergence criterion in baseline calculation (maximum number of iterations)
tol | convergence criterion; see below
sm.div | smoothness divisor in baseline calculation
sm.norm.by | method for smoothness penalty in baseline analysis
neg.div | negativity divisor in baseline calculation
neg.norm.by | method for negativity penalty in baseline analysis
rel.conv.crit | logical; whether convergence criterion should be relative to size of current baseline estimate
zero.rm | logical; whether to replace zeros with average of surrounding values
halve.search | logical; whether to use a halving-line search if step leads to smaller value of function
If the spectrum is given by y[i], then the algorithm works by maximizing the objective function
F({b[i]}) = sum_{i=1}^{n} b[i] - sum_{i=2}^{n-1} A[1,i]*(b[i-1] - 2*b[i] + b[i+1])^2 - sum_{i=1}^{n} A[2,i]*[max{b[i]-y[i], 0}]^2
using Newton's method (with embedded halving line search if halve.search == TRUE) using starting value b[i] = init.bd[i] for all i. The middle term controls the smoothness of the baseline and the last term applies a “negativity penalty” when the baseline is above the spectrum.
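As a rough illustration of the objective function above, the following sketch evaluates F for a candidate baseline. The function name bxr_objective and the arguments A1 and A2 (vectors holding A[1,i] and A[2,i]) are hypothetical and not part of the package; the sketch only mirrors the formula as written, with its second-difference smoothness term.

```r
# Hypothetical helper (not part of the package): evaluates the objective F
# for a candidate baseline b, spectrum y, and weight vectors A1 and A2.
bxr_objective <- function(b, y, A1, A2) {
  n <- length(b)
  stopifnot(length(y) == n, length(A1) == n, length(A2) == n, n >= 3)
  # Second differences b[i-1] - 2*b[i] + b[i+1] for i = 2, ..., n-1
  d2 <- diff(b, differences = 2)
  smooth_pen <- sum(A1[2:(n - 1)] * d2^2)
  # The negativity penalty is active only where the baseline exceeds the spectrum
  neg_pen <- sum(A2 * pmax(b - y, 0)^2)
  sum(b) - smooth_pen - neg_pen
}

y <- c(5, 6, 9, 6, 5)
# A flat baseline below the spectrum incurs no penalty, so F = sum(b) = 25
bxr_objective(rep(5, 5), y, A1 = rep(1, 5), A2 = rep(1, 5))
```

Raising the flat baseline to 7 increases the first term but triggers the negativity penalty wherever b[i] > y[i], which is the trade-off the maximization balances.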
The smoothing factor sm.par corresponds to A[1]^{*} in Barkauskas (2009a) and controls how large the estimated nth derivative of the baseline is allowed to be (for sm.ord = n). From a practical standpoint, values of sm.ord larger than two do not seem to adequately smooth the baseline because the Hessian becomes computationally singular for any reasonable value of sm.par.
The parameters sm.div, sm.norm.by, neg.div, and neg.norm.by determine the methods used to normalize the smoothness and negativity terms. The general forms are A[1,i] = n^4 * A[1]^{*}/M[i]/p and A[2,i] = 1/M[i]/p. Here, n = length(spect); p is sm.div or neg.div, as appropriate; and M[i] is determined by sm.norm.by or neg.norm.by, as appropriate. Values of "baseline" make M[i] = b[i]', where b[i]' is the currently estimated value of the baseline; values of "overestimate" make M[i] = b[i]'-y[i]; and values of "constant" make M[i] = σ, where σ is an estimate of the noise standard deviation.
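The general forms above can be sketched as follows. The helper names norm_term and bxr_weights are hypothetical, sigma is assumed to be supplied by the caller (how the package estimates it is not shown here), and no guarding against zero or negative values of M[i] is attempted.

```r
# Hypothetical helper: normalization term M[i] for one of the three methods.
# b is the current baseline estimate, y the spectrum, sigma a noise SD
# estimate supplied by the caller.
norm_term <- function(method = c("baseline", "overestimate", "constant"),
                      b, y, sigma) {
  method <- match.arg(method)  # partial matching of the method name
  switch(method,
         baseline = b,
         overestimate = b - y,
         constant = rep(sigma, length(b)))
}

# Hypothetical helper: weights A[1,i] and A[2,i] from the general forms
# A[1,i] = n^4 * sm.par / M[i] / sm.div and A[2,i] = 1 / M[i] / neg.div.
bxr_weights <- function(b, y, sigma, sm.par, sm.div, neg.div,
                        sm.norm.by = "baseline", neg.norm.by = "baseline") {
  n <- length(y)
  list(A1 = n^4 * sm.par / norm_term(sm.norm.by, b, y, sigma) / sm.div,
       A2 = 1 / norm_term(neg.norm.by, b, y, sigma) / neg.div)
}

w <- bxr_weights(b = c(2, 4), y = c(1, 1), sigma = 0.5,
                 sm.par = 0.5, sm.div = 2, neg.div = 0.5)
w$A1  # 2 1
w$A2  # 1.0 0.5
```

Because M[i] depends on the current baseline estimate for "baseline" and "overestimate", these weights would need to be recomputed as the estimate changes; with "constant" they stay fixed.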
The values of sm.norm.by and neg.norm.by can be abbreviated and both have default value "baseline". The default values of NA for sm.div and neg.div are translated to sm.div = 0.5223145 and neg.div = 0.4210109, which are the appropriate parameters for the FT-ICR mass spectrometry machine that generated the spectra that were used to develop this package. It is distinctly possible that other machines will require different parameters, and almost certain that other spectroscopic technologies will require different parameters; see Barkauskas (2009a) for a description of how these parameters were obtained.
If zero.rm == TRUE and y[a],…,y[a+k] = 0, then these values of the spectrum are set to be (y[a-1]+y[a+k+1])/2. (For typical MALDI FT-ICR spectra, a spectrum value of zero indicates an erased harmonic and should not be considered a real data point.)
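The zero-replacement rule can be sketched as below. The function fill_zero_runs is a hypothetical illustration, not the package's code; in particular, runs of zeros that touch either end of the vector are left unchanged here, since the text does not say how those are handled.

```r
# Hypothetical helper: replaces each run of zeros y[a], ..., y[a+k] with
# (y[a-1] + y[a+k+1]) / 2. Runs touching either end of the vector are
# left unchanged (an assumption of this sketch).
fill_zero_runs <- function(y) {
  runs <- rle(y == 0)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  n <- length(y)
  for (j in which(runs$values)) {
    a <- starts[j]
    e <- ends[j]
    if (a > 1 && e < n) {
      y[a:e] <- (y[a - 1] + y[e + 1]) / 2
    }
  }
  y
}

fill_zero_runs(c(4, 0, 0, 8, 3, 0, 5))
# 4 6 6 8 3 4 5
```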
A list containing the following items:
baseline | The computed baseline
iter | The number of iterations for convergence
changed | Numeric vector of length
hs | Numeric vector of length
The original algorithm was developed by Yuanxin Xi and David Rocke. The code in this package was first adapted from a Matlab program by Yuanxin Xi, then modified to account for the new methodology in Barkauskas (2009a).
halve.search = FALSE is recommended unless both sm.norm.by == "constant" and neg.norm.by == "constant".
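For readers unfamiliar with the term, the following is a generic sketch of Newton's method for maximization with a halving line search: the full Newton step is halved repeatedly while it would make the objective smaller. It is not the package's implementation; the function newton_max and its arguments f, grad, hess, and max.halvings are assumptions made for illustration.

```r
# Generic Newton maximizer with optional step halving; not the package's code.
# f, grad, and hess return the objective, its gradient, and its Hessian matrix.
newton_max <- function(f, grad, hess, x0, max.iter = 50, tol = 1e-8,
                       halve.search = TRUE, max.halvings = 30) {
  x <- x0
  iter <- 0
  for (iter in seq_len(max.iter)) {
    step <- -solve(hess(x), grad(x))  # full Newton step
    if (halve.search) {
      h <- 0
      # Halve the step while it would decrease the objective
      while (f(x + step) < f(x) && h < max.halvings) {
        step <- step / 2
        h <- h + 1
      }
    }
    x <- x + step
    if (max(abs(step)) < tol) break
  }
  list(x = as.vector(x), iter = iter)
}

# Concave quadratic with maximum at (1, -3)
f <- function(x) -(x[1] - 1)^2 - 2 * (x[2] + 3)^2
grad <- function(x) c(-2 * (x[1] - 1), -4 * (x[2] + 3))
hess <- function(x) diag(c(-2, -4))
newton_max(f, grad, hess, x0 = c(0, 0))$x
```

On a quadratic objective the full step never needs halving; the search matters only when a full Newton step overshoots.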
Don Barkauskas (barkda@wald.ucdavis.edu)
Barkauskas, D.A. and Rocke, D.M. (2009a) “A general-purpose baseline estimation algorithm for spectroscopic data”. To appear in Analytica Chimica Acta. doi:10.1016/j.aca.2009.10.043
Barkauskas, D.A. et al. (2009b) “Analysis of MALDI FT-ICR mass spectrometry data: A time series approach”. Analytica Chimica Acta, 648:2, 207–214.
Barkauskas, D.A. et al. (2009c) “Detecting glycan cancer biomarkers in serum samples using MALDI FT-ICR mass spectrometry data”. Bioinformatics, 25:2, 251–257.
Xi, Y. and Rocke, D.M. (2008) “Baseline Correction for NMR Spectroscopic Metabolomics Data Analysis”. BMC Bioinformatics, 9:324.