Takes the file generated by run.cluster.matrix and tests the peaks using Benjamini-Hochberg to control the False Discovery Rate.
run.analysis(form, covariates, FDR = 0.1, norm.post.repl = FALSE,
             norm.peaks = c("common", "all", "none"), normalization,
             add.norm = TRUE, repl.method = "max", use.model = "lm",
             pval.fcn = "default", lrg.only = TRUE, masses = NA,
             isotope.dist = 7, root.dir = ".", lrg.dir,
             lrg.file = "lrg_peaks.RData", res.dir,
             res.file = "analyzed.RData", overwrite = FALSE,
             use.par.file = FALSE, par.file = "parameters.RData",
             bhbysubj = TRUE, subs, ...)
form |
object of class “ |
covariates |
data frame containing covariates used in analysis |
FDR |
False Discovery Rate in Benjamini-Hochberg test |
norm.post.repl |
logical; whether to normalize after combining replicates |
norm.peaks |
which peaks to use in normalization |
normalization |
type of normalization to use on spectra before statistical analysis; kept for compatibility (see below) |
add.norm |
logical; whether to normalize additively or multiplicatively on the log scale |
repl.method |
function or string representing the name of a function; how to deal with replicates |
use.model |
function or string representing the name of a function; what test to apply to data |
pval.fcn |
function to extract p-values; default is overall p-value of test |
lrg.only |
logical; whether to consider only peaks that have at least one “large” peak; i.e., identified by |
masses |
specific masses to test |
isotope.dist |
maximum distance for declaring isotopes |
root.dir |
directory for parameters file and raw data |
lrg.dir |
directory for large peaks file; default is |
lrg.file |
name of file to store large peaks in |
res.dir |
directory for results file; default is |
res.file |
name for results file |
overwrite |
logical; whether to replace existing files with new ones |
use.par.file |
logical; if |
par.file |
string containing name of parameters file |
bhbysubj |
logical; whether to look for number of large peaks by subject (i.e., combining replicates) or by spectrum |
subs |
subset of spectra to use for analysis; see below |
... |
additional parameters to be passed to |
Reads in information from the file created by run.cluster.matrix and creates a file named res.file in directory res.dir which contains the following variables:
amps | matrix of transformed amplitudes of alignment peaks |
bysubjvar | a vector which tells which rows of covariates are identified as the same subject |
centers | matrix of calculated masses of alignment peaks |
clust.mat | matrix of transformed amplitudes of peaks used in statistical testing |
min.FDR | FDR level required to get at least one significant test given the starting set of peaks |
sigs | matrix containing all tests which are significant under at least one scenario |
which.sig | matrix containing all peaks tested |
parameter.list | if use.par.file = TRUE , a list generated by extract.pars ; otherwise not defined |
No value returned; the file is simply created.
If use.par.file == TRUE and other parameters are entered into the function call, then the parameters entered in the function call override those read in from the file. Note that this is the opposite of the behavior in FTICRMS versions 0.7 and earlier.
norm.peaks determines the peaks used for normalization: "common" normalizes each spectrum using the average peak height of the alignment peaks from that spectrum in amps; "all" normalizes each spectrum using the average peak height of all peaks in that spectrum.
normalization is obsolete but is included for compatibility with previous versions of the package. The valid normalization schemes translate to the new scheme as follows: "common" is norm.post.repl = FALSE and norm.peaks = "common"; "postbase" is norm.post.repl = FALSE and norm.peaks = "all"; "postrepl" is norm.post.repl = TRUE and norm.peaks = "all"; and "none" is norm.peaks = "none" (and norm.post.repl = FALSE, although this value is irrelevant).
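The translation above can be written down as a small lookup. This is only a sketch: legacy_to_new is a hypothetical helper invented for illustration and is not a function in the FTICRMS package.

```r
# Hypothetical helper (not part of FTICRMS): translate a legacy
# `normalization` value into the newer pair of arguments.
legacy_to_new <- function(normalization) {
  switch(normalization,
    common   = list(norm.post.repl = FALSE, norm.peaks = "common"),
    postbase = list(norm.post.repl = FALSE, norm.peaks = "all"),
    postrepl = list(norm.post.repl = TRUE,  norm.peaks = "all"),
    # norm.post.repl is irrelevant when no normalization is done
    none     = list(norm.post.repl = FALSE, norm.peaks = "none"),
    stop("unknown normalization scheme: ", normalization)
  )
}

legacy_to_new("postrepl")
```

The returned list names match the arguments of run.analysis, so the result shows which values to supply in place of the obsolete argument.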
Replicates for the same subject are assumed to be determined by the unique values of covariates$subj. (Future implementations will allow for other methods of defining this.) To analyze replicates as independent samples, use repl.method = "none". This will also speed up the run time if there are no replicates in the data set.
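As a rough illustration of what combining replicates by subject means for the default repl.method = "max", the sketch below takes the per-subject maximum of each peak. The matrix layout (spectra in rows, peaks in columns) and all object names are assumptions made for this example; the package's internal representation may differ.

```r
# Toy data: three spectra, two peaks; s1 and s2 are replicates of subject A.
# Layout and names are assumptions for illustration only.
subj <- c("A", "A", "B")
amps <- matrix(c(1.0, 1.5, 0.5,
                 2.0, 1.0, 3.0),
               nrow = 3,
               dimnames = list(c("s1", "s2", "s3"), c("peak1", "peak2")))

# For each peak (column), keep the largest amplitude seen for each subject.
combined <- apply(amps, 2, function(col) tapply(col, subj, max))
combined
```

Any function that reduces a vector of replicate amplitudes to a single number (for example mean or median) could be substituted for max in the same pattern.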
The argument subs can be logical or numeric or character; if it is defined, then covariates is modified to covariates[subs,,drop=F].
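The three kinds of index accepted for subs behave like ordinary data frame row indexing. The toy data frame below is made up for illustration; only the indexing expression comes from the text above.

```r
# Toy covariates; the row names and columns are invented for this example.
covariates <- data.frame(
  subj  = c("A", "A", "B", "C"),
  group = c("case", "case", "control", "control"),
  row.names = c("s1", "s2", "s3", "s4")
)

# The same two spectra selected three ways.
by_logical   <- covariates[covariates$group == "control", , drop = FALSE]
by_numeric   <- covariates[c(3, 4), , drop = FALSE]
by_character <- covariates[c("s3", "s4"), , drop = FALSE]

rownames(by_logical)
rownames(by_numeric)
rownames(by_character)
```

drop = FALSE matters when covariates has a single column: without it, the result would collapse to a vector instead of remaining a data frame.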
If masses is not NULL, then the listed masses plus anything that could be in the first isotope.dist - 1 isotope peaks of each mass are tested.
If something other than the p-value for the overall test statistic is needed, then the user-defined function for pval.fcn should have the form pval.fcn = function(x){...}, where x is a model object of the type returned by use.model, and should return the desired p-value.
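A minimal sketch of such a function for use.model = "lm", extracting the p-value of a single coefficient instead of the overall F test. The data, the variable names (y, group, age) and the function name group.pval are all hypothetical; only summary() and lm() from base R are relied on.

```r
# Toy data; `y`, `group` and `age` are illustrative names only.
set.seed(1)
toy <- data.frame(
  y     = rnorm(20),
  group = rep(c("case", "control"), each = 10),
  age   = runif(20, 30, 60)
)

# Hypothetical pval.fcn: p-value for the `group` term of an lm fit.
group.pval <- function(x) {
  coefs <- summary(x)$coefficients
  coefs["groupcontrol", "Pr(>|t|)"]
}

fit <- lm(y ~ group + age, data = toy)
group.pval(fit)
```

A function of this shape would presumably be supplied as pval.fcn = group.pval; the coefficient row name ("groupcontrol" here) depends on the factor levels in the covariates and needs to be adapted to the actual model.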
If use.model evaluates to t.test, then the difference between the two groups for each peak is recorded in which.sig$Delta and sigs$Delta; otherwise, these columns consist entirely of NA entries.
Each rowname of sigs and which.sig represents the range of masses that were used to form that peak. The columns of those objects give the p-value of the peaks in each row, the number of samples that had large peaks for each row, and the significance of each test, coded as
NA | peak not eligible for B-H |
0 | peak eligible for B-H but not declared significant |
1 | peak declared significant |
The “S” labels refer to the number of large peaks that were necessary for a row to be eligible. For example, the column labeled S5 in sigs used as its starting set of p-values all rows which had which.sig$num.lrg >= 5. If bhbysubj == TRUE, then the entries of num.lrg are obtained by going subject-by-subject and for each mass counting the number of subjects who had at least one spectrum with a large peak at that mass; otherwise, num.lrg for each mass is simply the total number of spectra that had a large peak at that mass.
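The two ways of counting num.lrg, and the NA/0/1 coding of an “S” column, can be sketched with base R. Everything below is illustrative: the toy data, the object names and the helper code_column are assumptions, and p.adjust(method = "BH") stands in for whatever the package does internally.

```r
# Toy input: one row per spectrum, one column per peak; TRUE marks a large peak.
subj <- c("A", "A", "B", "B", "C")
lrg  <- cbind(
  peak1 = c(TRUE, TRUE,  FALSE, TRUE,  FALSE),
  peak2 = c(TRUE, FALSE, FALSE, FALSE, FALSE),
  peak3 = c(TRUE, TRUE,  TRUE,  TRUE,  TRUE)
)
pvals <- c(peak1 = 0.004, peak2 = 0.03, peak3 = 0.2)

# bhbysubj = FALSE: count spectra with a large peak.
num.lrg.spectrum <- colSums(lrg)
# bhbysubj = TRUE: a subject counts once if any of its spectra has a large peak.
num.lrg.subject <- colSums(rowsum(lrg * 1, subj) > 0)

# Hypothetical helper: code one "S<k>" column as NA / 0 / 1.
code_column <- function(pvals, num.lrg, k, FDR = 0.1) {
  eligible <- num.lrg >= k
  out <- rep(NA_integer_, length(pvals))
  names(out) <- names(pvals)
  # Benjamini-Hochberg is applied only to the eligible p-values.
  out[eligible] <- as.integer(p.adjust(pvals[eligible], method = "BH") <= FDR)
  out
}

S2 <- code_column(pvals, num.lrg.subject, k = 2)
S2
```

Because the starting set of p-values changes with the threshold, the same peak can be significant in one “S” column and not in another.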
Don Barkauskas (barkda@wald.ucdavis.edu)
Barkauskas, D.A. and D.M. Rocke. (2009a) “A general-purpose baseline estimation algorithm for spectroscopic data”. to appear in Analytica Chimica Acta. doi:10.1016/j.aca.2009.10.043
Barkauskas, D.A. et al. (2009b) “Analysis of MALDI FT-ICR mass spectrometry data: A time series approach”. Analytica Chimica Acta, 648:2, 207–214.
Barkauskas, D.A. et al. (2009c) “Detecting glycan cancer biomarkers in serum samples using MALDI FT-ICR mass spectrometry data”. Bioinformatics, 25:2, 251–257.
Benjamini, Y. and Hochberg, Y. (1995) “Controlling the false discovery rate: a practical and powerful approach to multiple testing.” J. Roy. Statist. Soc. Ser. B, 57:1, 289–300.