spectraData | R Documentation |
As detailed in the documentation of the Spectra class, a Spectra object is a container for mass spectrometry (MS) data that includes both the mass peaks data (or peaks data, generally m/z and intensity values) and spectra metadata (so-called spectra variables). Spectra variables generally define one value per spectrum, while peaks variables define one value per mass peak and hence multiple values per spectrum (depending on the number of mass peaks of a spectrum).
Data can be extracted from a Spectra object using dedicated accessor functions or the $ operator. Depending on the backend class used by the Spectra to represent the data, data can also be added or replaced (again, using dedicated functions or $<-).
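The two access styles described above can be sketched as follows. This is a minimal illustration, not part of the original examples: the object name sps and all values are made up, and it assumes the default in-memory backend, which supports replacing values.

```r
library(Spectra)
library(S4Vectors)

## Toy data: two spectra with made-up values (illustration only).
spd <- DataFrame(msLevel = c(1L, 2L), rtime = c(1.1, 1.2))
spd$mz <- list(c(100, 103.2), c(45.6, 120.4, 190.2))
spd$intensity <- list(c(200, 400), c(12.3, 15.2, 6.8))
sps <- Spectra(spd)

## Extract a spectra variable with the accessor function or with `$`.
rtime(sps)
sps$rtime

## Replace values with the replacement function or with `$<-`.
rtime(sps) <- c(2.1, 2.2)
sps$rtime <- c(3.1, 3.2)
rtime(sps)
```

With a read-only backend the replacement calls may fail, so whether the last three lines work depends on the backend in use.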
asDataFrame(
  object,
  i = seq_along(object),
  spectraVars = spectraVariables(object)
)
## S4 method for signature 'Spectra'
acquisitionNum(object)
## S4 method for signature 'Spectra'
centroided(object)
## S4 replacement method for signature 'Spectra'
centroided(object) <- value
## S4 method for signature 'Spectra'
collisionEnergy(object)
## S4 replacement method for signature 'Spectra'
collisionEnergy(object) <- value
coreSpectraVariables()
## S4 method for signature 'Spectra'
dataOrigin(object)
## S4 replacement method for signature 'Spectra'
dataOrigin(object) <- value
## S4 method for signature 'Spectra'
dataStorage(object)
## S4 method for signature 'Spectra'
intensity(object, f = processingChunkFactor(object), ...)
## S4 method for signature 'Spectra'
ionCount(object)
## S4 method for signature 'Spectra'
isCentroided(object, ...)
## S4 method for signature 'Spectra'
isEmpty(x)
## S4 method for signature 'Spectra'
isolationWindowLowerMz(object)
## S4 replacement method for signature 'Spectra'
isolationWindowLowerMz(object) <- value
## S4 method for signature 'Spectra'
isolationWindowTargetMz(object)
## S4 replacement method for signature 'Spectra'
isolationWindowTargetMz(object) <- value
## S4 method for signature 'Spectra'
isolationWindowUpperMz(object)
## S4 replacement method for signature 'Spectra'
isolationWindowUpperMz(object) <- value
## S4 method for signature 'Spectra'
length(x)
## S4 method for signature 'Spectra'
lengths(x, use.names = FALSE)
## S4 method for signature 'Spectra'
msLevel(object)
## S4 method for signature 'Spectra'
mz(object, f = processingChunkFactor(object), ...)
## S4 method for signature 'Spectra'
peaksData(
  object,
  columns = c("mz", "intensity"),
  f = processingChunkFactor(object),
  ...,
  BPPARAM = bpparam()
)
## S4 method for signature 'Spectra'
peaksVariables(object)
## S4 method for signature 'Spectra'
polarity(object)
## S4 replacement method for signature 'Spectra'
polarity(object) <- value
## S4 method for signature 'Spectra'
precScanNum(object)
## S4 method for signature 'Spectra'
precursorCharge(object)
## S4 method for signature 'Spectra'
precursorIntensity(object)
## S4 method for signature 'Spectra'
precursorMz(object)
## S4 replacement method for signature 'Spectra'
precursorMz(object, ...) <- value
## S4 method for signature 'Spectra'
rtime(object)
## S4 replacement method for signature 'Spectra'
rtime(object) <- value
## S4 method for signature 'Spectra'
scanIndex(object)
## S4 method for signature 'Spectra'
smoothed(object)
## S4 replacement method for signature 'Spectra'
smoothed(object) <- value
## S4 method for signature 'Spectra'
spectraData(object, columns = spectraVariables(object))
## S4 replacement method for signature 'Spectra'
spectraData(object) <- value
## S4 method for signature 'Spectra'
spectraNames(object)
## S4 replacement method for signature 'Spectra'
spectraNames(object) <- value
## S4 method for signature 'Spectra'
spectraVariables(object)
## S4 method for signature 'Spectra'
tic(object, initial = TRUE)
## S4 method for signature 'Spectra'
uniqueMsLevels(object, ...)
## S4 method for signature 'Spectra'
x$name
## S4 replacement method for signature 'Spectra'
x$name <- value
## S4 method for signature 'Spectra'
x[[i, j, ...]]
## S4 replacement method for signature 'Spectra'
x[[i, j, ...]] <- value
object |
A Spectra object. |
i |
For |
spectraVars |
|
value |
A vector with values to replace the respective spectra variable. Needs to be of the correct data type for the spectra variable. |
f |
For |
... |
Additional arguments. |
x |
A Spectra object. |
use.names |
For |
columns |
For |
BPPARAM |
Parallel setup configuration. See |
initial |
For |
name |
For |
j |
For |
A common set of core spectra variables is defined for Spectra. These have a pre-defined data type and each Spectra will return a value for these if requested. If no value for a spectra variable is defined, a missing value (of the correct data type) is returned. The list of core spectra variables and their respective data type is:
acquisitionNum integer(1): the index of acquisition of a spectrum during an MS run.
centroided logical(1): whether the spectrum is in profile or centroid mode.
collisionEnergy numeric(1): collision energy used to create an MSn spectrum.
dataOrigin character(1): the origin of the spectrum's data, e.g. the mzML file from which it was read.
dataStorage character(1): the (current) storage location of the spectrum data. This value depends on the backend used to handle and provide the data. For an in-memory backend like the MsBackendDataFrame this will be "<memory>", for an on-disk backend such as the MsBackendHdf5Peaks it will be the name of the HDF5 file where the spectrum's peak data is stored.
isolationWindowLowerMz numeric(1): lower m/z for the isolation window in which the (MSn) spectrum was measured.
isolationWindowTargetMz numeric(1): the target m/z for the isolation window in which the (MSn) spectrum was measured.
isolationWindowUpperMz numeric(1): upper m/z for the isolation window in which the (MSn) spectrum was measured.
msLevel integer(1): the MS level of the spectrum.
polarity integer(1): the polarity of the spectrum (0 and 1 representing negative and positive polarity, respectively).
precScanNum integer(1): the scan (acquisition) number of the precursor for an MSn spectrum.
precursorCharge integer(1): the charge of the precursor of an MSn spectrum.
precursorIntensity numeric(1): the intensity of the precursor of an MSn spectrum.
precursorMz numeric(1): the m/z of the precursor of an MSn spectrum.
rtime numeric(1): the retention time of a spectrum.
scanIndex integer(1): the index of a spectrum within a (raw) file.
smoothed logical(1): whether the spectrum was smoothed.
For each of these spectra variables a dedicated accessor function is defined (such as msLevel() or rtime()) that extracts the values of that spectra variable for all spectra in a Spectra object. Replacement functions are also defined, but not all backends support replacing values for spectra variables. As described above, additional spectra variables can be defined or added. The spectraVariables() function can be used to list the spectra variables available in a Spectra object.
Values for multiple spectra variables, or for all spectra variables, can be extracted with the spectraData() function.
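The behaviour described above can be sketched with a small in-memory object. The object name sps and the values are made up for illustration; only msLevel and rtime are set, so the other core spectra variables are expected to come back as missing values.

```r
library(Spectra)
library(S4Vectors)

## Toy data (illustration only); only msLevel and rtime are defined.
spd <- DataFrame(msLevel = c(1L, 2L), rtime = c(1.1, 1.2))
spd$mz <- list(c(100, 103.2), c(45.6, 120.4, 190.2))
spd$intensity <- list(c(200, 400), c(12.3, 15.2, 6.8))
sps <- Spectra(spd)

## Core spectra variables and their expected data types.
coreSpectraVariables()

## Spectra variables available in this particular object.
spectraVariables(sps)

## Core variables without a defined value return missing values.
precursorMz(sps)
polarity(sps)

## Extract several spectra variables at once as a DataFrame.
spectraData(sps, columns = c("msLevel", "rtime", "precursorMz"))
```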
Spectra also provide mass peak data with the m/z and intensity values being the core peaks variables:
intensity numeric: intensity values for the spectrum's peaks.
mz numeric: the m/z values for the spectrum's peaks.
Values for these can be extracted with the mz() and intensity() functions, or the peaksData() function. The former functions return a NumericList with the respective values, while the latter returns a List with numeric two-column matrices. The list of peaks matrices can also be extracted using as(x, "list") or as(x, "SimpleList") with x being a Spectra object.
Some Spectra/backends also provide values for additional peaks variables. The set of available peaks variables can be extracted with the peaksVariables() function.
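A minimal sketch of the peaks data accessors described above, again on a made-up in-memory object (the name sps and all values are illustrative only):

```r
library(Spectra)
library(S4Vectors)

## Toy data: two spectra with made-up peaks (illustration only).
spd <- DataFrame(msLevel = c(1L, 2L), rtime = c(1.1, 1.2))
spd$mz <- list(c(100, 103.2), c(45.6, 120.4, 190.2))
spd$intensity <- list(c(200, 400), c(12.3, 15.2, 6.8))
sps <- Spectra(spd)

## Available peaks variables; "mz" and "intensity" are always provided.
peaksVariables(sps)

## m/z and intensity values, one list element per spectrum.
mz(sps)
intensity(sps)

## Peaks data as a list of two-column numeric matrices.
pks <- peaksData(sps)
pks[[1L]]

## The same peaks matrices obtained through coercion.
as(sps, "list")
```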
The functions available to extract data from, or set data in, a Spectra object are listed below (in alphabetical order). Note that there are also other functions to extract information from a Spectra object documented in addProcessing().
$, $<-: gets (or sets) a spectra variable for all spectra in object. See examples for details. Note that replacing values of a peaks variable is not supported with a non-empty processing queue, i.e. if any filtering or data manipulation on the peaks data was performed. In these cases applyProcessing() needs to be called first to apply all cached data operations.
[[, [[<-: access or set/add a single spectrum variable (column) in the backend.
acquisitionNum(): returns the acquisition number of each spectrum. Returns an integer of length equal to the number of spectra (with NA_integer_ if not available).
asDataFrame(): converts the Spectra to a DataFrame (in long format) containing all data. Returns a DataFrame.
centroided(), centroided<-: gets or sets the centroiding information of the spectra. centroided() returns a logical vector of length equal to the number of spectra with TRUE if a spectrum is centroided, FALSE if it is in profile mode and NA if it is undefined. See also isCentroided() for estimating from the spectrum data whether the spectrum is centroided. value for centroided<- is either a single logical or a logical of length equal to the number of spectra in object.
collisionEnergy(), collisionEnergy<-: gets or sets the collision energy for all spectra in object. collisionEnergy() returns a numeric with length equal to the number of spectra (NA_real_ if not present/defined), collisionEnergy<- takes a numeric of length equal to the number of spectra in object.
coreSpectraVariables(): returns the core spectra variables along with their expected data type.
dataOrigin(), dataOrigin<-: gets or sets the data origin for each spectrum. dataOrigin() returns a character vector (same length as object) with the origin of the spectra. dataOrigin<- expects a character vector (same length as object) with the replacement values for the data origin of each spectrum.
dataStorage(): returns a character vector (same length as object) with the data storage location of each spectrum.
intensity(): gets the intensity values from the spectra. Returns an IRanges::NumericList() of numeric vectors (intensity values for each spectrum). The length of the list is equal to the number of spectra in object.
ionCount(): returns a numeric with the sum of intensities for each spectrum. If the spectrum is empty (see isEmpty()), NA_real_ is returned.
isCentroided(): a heuristic approach assessing if the spectra in object are in profile or centroided mode. The function takes the qtl-th quantile top peaks, then calculates the difference between adjacent m/z values and returns TRUE if the first quartile is greater than k. (See Spectra:::.isCentroided() for the code.)
isEmpty(): checks whether a spectrum in object is empty (i.e. does not contain any peaks). Returns a logical vector of length equal to the number of spectra.
isolationWindowLowerMz(), isolationWindowLowerMz<-: gets or sets the lower m/z boundary of the isolation window.
isolationWindowTargetMz(), isolationWindowTargetMz<-: gets or sets the target m/z of the isolation window.
isolationWindowUpperMz(), isolationWindowUpperMz<-: gets or sets the upper m/z boundary of the isolation window.
length(): gets the number of spectra in the object.
lengths(): gets the number of peaks (m/z-intensity values) per spectrum. Returns an integer vector (length equal to the number of spectra). For empty spectra, 0 is returned.
msLevel(): gets the spectra's MS level. Returns an integer vector (names being spectrum names, length equal to the number of spectra) with the MS level for each spectrum.
mz(): gets the mass-to-charge ratios (m/z) from the spectra. Returns an IRanges::NumericList() of length equal to the number of spectra, each element a numeric vector with the m/z values of one spectrum.
peaksData(): gets the peaks data for all spectra in object. Peaks data consist of the m/z and intensity values as well as possible additional annotations (variables) of all peaks of each spectrum. The function returns an S4Vectors::SimpleList() of two dimensional arrays (either matrix or data.frame), with each array providing the values for the requested peak variables (by default "mz" and "intensity").
Optional parameter columns is passed to the backend's peaksData() function to allow the selection of specific (or additional) peaks variables (columns) that should be extracted (if available). Importantly, it is not guaranteed that each backend supports this parameter (while each backend must support extraction of "mz" and "intensity" columns). Parameter columns defaults to c("mz", "intensity") but any value returned by peaksVariables(object) is supported.
Note also that it is possible to extract the peak data with as(x, "list") and as(x, "SimpleList") as a list and SimpleList, respectively. Note however that, in contrast to peaksData(), as() does not support the parameter columns.
peaksVariables(): lists the available variables for mass peaks provided by the backend. Default peak variables are "mz" and "intensity" (which all backends need to support and provide), but some backends might provide additional variables. These variables correspond to the column names of the peak data array returned by peaksData().
polarity(), polarity<-: gets or sets the polarity for each spectrum. polarity() returns an integer vector (length equal to the number of spectra), with 0 and 1 representing negative and positive polarities, respectively. polarity<- expects an integer vector of length 1 or equal to the number of spectra.
precursorCharge(), precursorIntensity(), precursorMz(), precScanNum(), precAcquisitionNum(): gets the charge (integer), intensity (numeric), m/z (numeric), scan index (integer) and acquisition number (integer) of the precursor for MS level >= 2 spectra from the object. Returns a vector of length equal to the number of spectra in object. NA is reported for MS1 spectra or if no precursor information is available.
rtime(), rtime<-: gets or sets the retention times (in seconds) for each spectrum. rtime() returns a numeric vector (length equal to the number of spectra) with the retention time for each spectrum. rtime<- expects a numeric vector with length equal to the number of spectra.
scanIndex(): returns an integer vector with the scan index for each spectrum. This represents the relative index of the spectrum within each file. Note that this can be different from the acquisitionNum of the spectrum, which represents the index of the spectrum during acquisition/measurement (as reported in the mzML file).
smoothed(), smoothed<-: gets or sets whether a spectrum is smoothed. smoothed() returns a logical vector of length equal to the number of spectra. smoothed<- takes a logical vector of length 1 or equal to the number of spectra in object.
spectraData(): gets general spectrum metadata (annotation, also called header). spectraData() returns a DataFrame. Note that by default this method does not return m/z or intensity values.
spectraData<-: replaces the full spectra data of the Spectra object with the one provided with value. The spectraData<- function expects a DataFrame to be passed as value with the same number of rows as there are spectra in object. Note that replacing values of peaks variables is not supported with a non-empty processing queue, i.e. if any filtering or data manipulation on the peaks data was performed. In these cases applyProcessing() needs to be called first to apply all cached data operations and empty the processing queue.
spectraNames(), spectraNames<-: gets or sets the spectra names.
spectraVariables(): returns a character vector with the spectra variables (columns, fields or attributes of each spectrum) available in object. Note that spectraVariables() does not list the peak variables ("mz", "intensity" and any additional annotations for each MS peak). Peak variables are returned by peaksVariables().
tic(): gets the total ion current/count (sum of signal of a spectrum) for all spectra in object. By default, the value reported in the original raw data file is returned. For an empty spectrum, 0 is returned.
uniqueMsLevels(): gets the unique MS levels available in object. This function is supposed to be more efficient than unique(msLevel(object)).
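A few of the summary functions listed above can be sketched on a small made-up object (the name sps and all values are illustrative only). The call to tic() assumes that initial = FALSE recalculates the value from the peaks data rather than returning the value reported in the raw file, which the description of the default suggests but does not state explicitly.

```r
library(Spectra)
library(S4Vectors)

## Toy data: two spectra with made-up peaks (illustration only).
spd <- DataFrame(msLevel = c(1L, 2L), rtime = c(1.1, 1.2))
spd$mz <- list(c(100, 103.2), c(45.6, 120.4, 190.2))
spd$intensity <- list(c(200, 400), c(12.3, 15.2, 6.8))
sps <- Spectra(spd)

## Number of spectra and number of peaks per spectrum.
length(sps)
lengths(sps)

## Whether any spectrum has no peaks.
isEmpty(sps)

## MS levels present in the object.
uniqueMsLevels(sps)

## Sum of intensities per spectrum.
ionCount(sps)

## Total ion current; assumed here to be recalculated from the peaks
## data when `initial = FALSE`.
tic(sps, initial = FALSE)
```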
Sebastian Gibb, Johannes Rainer, Laurent Gatto, Philippine Louail
addProcessing() for functions to analyze Spectra.
Spectra for a general description of the Spectra object.
## Create a Spectra from mzML files and use the `MsBackendMzR` on-disk
## backend.
sciex_file <- dir(system.file("sciex", package = "msdata"),
                  full.names = TRUE)
sciex <- Spectra(sciex_file, backend = MsBackendMzR())
sciex
## Get the number of spectra in the data set
length(sciex)
## Get the number of mass peaks per spectrum - limit to the first 6
lengths(sciex) |> head()
## Get the MS level for each spectrum - limit to the first 6 spectra
msLevel(sciex) |> head()
## Alternatively, we could also use $ to access a specific spectra variable.
## This could also be used to add additional spectra variables to the
## object (see further below).
sciex$msLevel |> head()
## Get the intensity and m/z values.
intensity(sciex)
mz(sciex)
## Convert a subset of the Spectra object to a long DataFrame.
asDataFrame(sciex, i = 1:3, spectraVars = c("rtime", "msLevel"))
## Create a Spectra providing a `DataFrame` containing the spectrum data.
spd <- DataFrame(msLevel = c(1L, 2L), rtime = c(1.1, 1.2))
spd$mz <- list(c(100, 103.2, 104.3, 106.5), c(45.6, 120.4, 190.2))
spd$intensity <- list(c(200, 400, 34.2, 17), c(12.3, 15.2, 6.8))
s <- Spectra(spd)
s
## List all available spectra variables (i.e. spectrum data and metadata).
spectraVariables(s)
## For all *core* spectrum variables accessor functions are available. These
## return NA if the variable was not set.
centroided(s)
dataStorage(s)
rtime(s)
precursorMz(s)
## The core spectra variables are:
coreSpectraVariables()
## Add an additional metadata column.
s$spectrum_id <- c("sp_1", "sp_2")
## List spectra variables, "spectrum_id" is now also listed
spectraVariables(s)
## Get the values for the new spectra variable
s$spectrum_id
## Extract specific spectra variables.
spectraData(s, columns = c("spectrum_id", "msLevel"))
## -------- PEAKS VARIABLES AND DATA --------
## Get the peak data (m/z and intensity values).
pks <- peaksData(s)
pks
pks[[1]]
pks[[2]]
## Note that we could get the same result by coercing the `Spectra` to
## a `list` or `SimpleList`:
as(s, "list")
as(s, "SimpleList")
## Or use `mz()` and `intensity()` to extract the m/z and intensity values
## separately
mz(s)
intensity(s)
## Some `MsBackend` classes provide support for arbitrary peaks variables
## (in addition to the mandatory `"mz"` and `"intensity"` values). Below
## we create a simple data frame with an additional peak variable `"pk_ann"`
## and create a `Spectra` with a `MsBackendMemory` for that data.
## Importantly the number of values (per spectrum) need to be the same
## for all peak variables.
tmp <- data.frame(msLevel = c(2L, 2L), rtime = c(123.2, 123.5))
tmp$mz <- list(c(103.1, 110.4, 303.1), c(343.2, 453.1))
tmp$intensity <- list(c(130.1, 543.1, 40), c(0.9, 0.45))
tmp$pk_ann <- list(c(NA_character_, "A", "P"), c("B", "P"))
## Create the Spectra. With parameter `peaksVariables` we can define
## the columns in `tmp` that contain peaks variables.
sps <- Spectra(tmp, source = MsBackendMemory(),
               peaksVariables = c("mz", "intensity", "pk_ann"))
peaksVariables(sps)
## Extract just the m/z and intensity values
peaksData(sps)[[1L]]
## Extract the full peaks data
peaksData(sps, columns = peaksVariables(sps))[[1L]]
## Access just the pk_ann variable
sps$pk_ann