Spectra | R Documentation |
The Spectra
class encapsules spectral mass spectrometry (MS) data and
related metadata. The MS data is represented by a backend extending the
virual MsBackend class which provides the data to the Spectra
object.
The Spectra
class implements only data accessor, filtering and analysis
methods for the MS data and relies on its backend to provide the MS data.
This allows to change data representations of a Spectra
object depending
on the user's needs and properties of the data. Different backends and
their properties are explained in the MsBackend documentation.
Documentation on other topics and functionality of Spectra
can be found in:
spectraData()
for accessing and using MS data through Spectra
objects.
filterMsLevel()
to subset and filter Spectra
objects.
plotSpectra()
for visualization of Spectra
objects.
processingChunkSize()
for information on parallel and chunk-wise data
processing.
combineSpectra()
for merging, aggregating and splitting of Spectra
objects.
combinePeaks()
for merging and aggregating Spectra
's mass peaks data.
addProcessing()
for data analysis functions.
compareSpectra()
for spectra similarity calculations.
## S4 method for signature 'missing'
Spectra(
object,
processingQueue = list(),
metadata = list(),
...,
backend = MsBackendMemory(),
BPPARAM = bpparam()
)
## S4 method for signature 'MsBackend'
Spectra(
object,
processingQueue = list(),
metadata = list(),
...,
BPPARAM = bpparam()
)
## S4 method for signature 'character'
Spectra(
object,
processingQueue = list(),
metadata = list(),
source = MsBackendMzR(),
backend = source,
...,
BPPARAM = bpparam()
)
## S4 method for signature 'ANY'
Spectra(
object,
processingQueue = list(),
metadata = list(),
source = MsBackendMemory(),
backend = source,
...,
BPPARAM = bpparam()
)
## S4 method for signature 'Spectra,MsBackend'
setBackend(
object,
backend,
f = processingChunkFactor(object),
...,
BPPARAM = bpparam()
)
## S4 method for signature 'Spectra'
export(object, backend, ...)
## S4 method for signature 'Spectra'
dataStorageBasePath(object)
## S4 replacement method for signature 'Spectra'
dataStorageBasePath(object) <- value
object |
For |
processingQueue |
For |
metadata |
For |
... |
Additional arguments. |
backend |
For |
BPPARAM |
Parallel setup configuration. See |
source |
For |
f |
For |
value |
For |
The Spectra
class uses by default a lazy data manipulation strategy,
i.e. data manipulations such as performed with replaceIntensitiesBelow()
are not applied immediately to the data, but applied on-the-fly to the
spectrum data once it is retrieved. This enables data manipulation
operations also for read only data representations. For some backends that
allow to write data back to the data storage (such as the
MsBackendMemory()
, MsBackendDataFrame()
and MsBackendHdf5Peaks()
) it
is possible to apply to queue with the applyProcessing()
function (see
the applyProcessing()
function for details).
Clarifications regarding scan/acquisition numbers and indices:
A spectrumId
(or spectrumID
) is a vendor specific field in
the mzML file that contains some information about the
run/spectrum, e.g.: controllerType=0 controllerNumber=1 scan=5281 file=2
acquisitionNum
is a more a less sanitize spectrum id generated
from the spectrumId
field by mzR
(see
here).
scanIndex
is the mzR
generated sequence number of the
spectrum in the raw file (which doesn't have to be the same as
the acquisitionNum
)
See also this issue.
Spectra
objectThe Spectra
object is a container for MS data that includes mass peak
data (m/z and related intensity values, also referred to as peaks data
in the context of Spectra
) and metadata of individual spectra (so called
spectra variables). While a core set of spectra variables (the
coreSpectraVariables()
) are guaranteed to be provided by a
Spectra
, it is possible to add arbitrary additional spectra variables to
a Spectra
object.
The Spectra
object is designed to contain MS data of a (large) set of mass
spectra. The data is organized linearly and can be thought of a list of
mass spectra, i.e. each element in the Spectra
is one spectrum.
Spectra
classes can be created with the Spectra()
constructor function
which supports the following formats:
parameter object
is a data.frame
or DataFrame
containing the
full spectrum data (spectra variables in columns as well as columns
with the individual MS peak data, m/z and intensity). The provided
backend
(by default a MsBackendMemory) will be initialized
with that data.
parameter object
is a MsBackend (assumed to be already
initialized).
parameter object
is missing, in which case it is supposed that the data
is provided by the MsBackend class passed along with the backend
argument.
parameter object
is of type character
and is expected to be the file
names(s) from which spectra should be imported. Parameter source
allows
to define a MsBackend that is able to import the data from the
provided source files. The default value for source
is MsBackendMzR()
which allows to import spectra data from mzML, mzXML or CDF files.
With ...
additional arguments can be passed to the backend's
backendInitialize()
method. Parameter backend
allows to specify which
MsBackend should be used for data representation and storage.
Spectra
The MS data which can be accessed through the Spectra
object is
represented by its backend, which means that this backend defines how
and where the data is stored (e.g. in memory or on disk). The Specrta
object relies on the backend to provide the MS data whenever it needs it
for data processing.
Different backends with different properties, such as minimal memory
requirement or fast data access, are defined in the Spectra package or
one of the MsBackend* packages. More information on backends and their
properties is provided in the documentation of MsBackend.
On-disk backends keep only a limited amount of data in memory retrieving
most of the data (usually the MS peak data) upon request on-the-fly from
their on-disk data representations. Moving the on-disk data storage of such
a backend or a serialized object to a different location in the file
system will cause data corruption. The dataStorageBasePath()
and
dataStorageBasePath<-
functions allow in such cases (and if thebackend
classes support this operation), to get or change the base
path to the directory of the backend's data storage. In-memory backends
such as MsBackendMemory or MsBackendDataFrame keeping all MS data in
memory don't support, and need, this function, but for MsBackendMzR this
function can be used to update/adapt the path to the directory containing
the original data files. Thus, for Spectra
objects (using this backend)
that were moved to another file system or computer, these functions allow to
adjust/adapt the base file path.
Spectra
The data representation, i.e. the backend of a Spectra
object can be
changed with the setBackend()
method that takes an instance of the new
backend as second parameter backend
. A call to
setBackend(sps, backend = MsBackendDataFrame())
would for example change the backend of sps
to the in-memory
MsBackendDataFrame
. Changing to a backend is only supported if that
backend has a data
parameter in its backendInitialize()
method and if
supportsSetBackend()
returns TRUE
for that backend. setBackend()
will
transfer the full spectra data from the originating backend as a DataFrame
to the new backend.
Generally, it is not possible to change to a read-only backend such as
the MsBackendMzR()
backend.
The definition of the function is:
setBackend(object, backend, ..., f = dataStorage(object), BPPARAM = bpparam())
and its parameters are:
object
: the Spectra
object.
backend
: an instance of the new backend, e.g. [MsBackendMemory()]
.
f
: factor allowing to parallelize the change of the backends. By
default the process of copying the spectra data from the original to the
new backend is performed separately (and in parallel) for each file. Users
are advised to use the default setting.
...
: optional additional arguments passed to the backendInitialize()
method of the new backend
.
BPPARAM
: setup for the parallel processing. See
BiocParallel::bpparam()
for details.
Spectra
objectData from a Spectra
object can be exported to a file with the
export()
function. The actual export of the data is performed by
the export
method of the MsBackend class defined with the mandatory
parameter backend
which defines also the format in which the data
is exported. Note however that not all backend classes support
export of data. From the MsBackend
classes in the Spectra
package
currently only the MsBackendMzR
backend supports data export (to
mzML/mzXML file(s)); see the help page of the MsBackend for
information on its arguments or the examples below or the vignette
for examples.
The definition of the function is
export(object, backend, ...)
and its
parameters are:
object
: the Spectra
object to be exported.
backend
: instance of a class extending MsBackend which supports export
of the data (i.e. which has a defined export
method).
...
: additional parameters specific for the MsBackend
passed with
parameter backend
.
Sebastian Gibb, Johannes Rainer, Laurent Gatto, Philippine Louail
## -------- CREATION OF SPECTRA OBJECTS --------
## Create a Spectra providing a `DataFrame` containing the spectrum data.
spd <- DataFrame(msLevel = c(1L, 2L), rtime = c(1.1, 1.2))
spd$mz <- list(c(100, 103.2, 104.3, 106.5), c(45.6, 120.4, 190.2))
spd$intensity <- list(c(200, 400, 34.2, 17), c(12.3, 15.2, 6.8))
data <- Spectra(spd)
data
## Create a Spectra from mzML files and use the `MsBackendMzR` on-disk
## backend.
sciex_file <- dir(system.file("sciex", package = "msdata"),
full.names = TRUE)
sciex <- Spectra(sciex_file, backend = MsBackendMzR())
sciex
## -------- CHANGING DATA REPRESENTATIONS --------
## The MS data is on disk and will be read into memory on-demand. We can
## however change the backend to a MsBackendMemory backend which will
## keep all of the data in memory.
sciex_im <- setBackend(sciex, MsBackendMemory())
sciex_im
## The `MsBackendMemory()` supports the `setBackend()` method:
supportsSetBackend(MsBackendMemory())
## Thus, it is possible to change to that backend with `setBackend()`. Most
## read-only backends however don't support that, such as the
## `MsBackendMzR` and `setBackend()` would fail to change to that backend.
supportsSetBackend(MsBackendMzR())
## The on-disk object `sciex` is light-weight, because it does not keep the
## MS peak data in memory. The `sciex_im` object in contrast keeps all the
## data in memory and its size is thus much larger.
object.size(sciex)
object.size(sciex_im)
## The spectra variable `dataStorage` returns for each spectrum the location
## where the data is stored. For in-memory objects:
head(dataStorage(sciex_im))
## While objects that use an on-disk backend will list the files where the
## data is stored.
head(dataStorage(sciex))
## The spectra variable `dataOrigin` returns for each spectrum the *origin*
## of the data. If the data is read from e.g. mzML files, this will be the
## original mzML file name:
head(dataOrigin(sciex))
head(dataOrigin(sciex_im))
## -------- DATA EXPORT --------
## Some `MsBackend` classes provide an `export()` method to export the data
## to the file format supported by the backend.
## The `MsBackendMzR` for example allows to export MS data to mzML or
## mzXML file(s), the `MsBackendMgf` (defined in the MsBackendMgf R package)
## would allow to export the data in mgf file format.
## Below we export the MS data in `data`. We call the `export()` method on
## this object, specify the backend that should be used to export the data
## (and which also defines the output format) and provide a file name.
fl <- tempfile()
export(data, MsBackendMzR(), file = fl)
## This exported our data in mzML format. Below we read the first 6 lines
## from that file.
readLines(fl, n = 6)
## If only a single file name is provided, all spectra are exported to that
## file. To export data with the `MsBackendMzR` backend to different files, a
## file name for each individual spectrum has to be provided.
## Below we export each spectrum to its own file.
fls <- c(tempfile(), tempfile())
export(data, MsBackendMzR(), file = fls)
## Reading the data from the first file
res <- Spectra(backendInitialize(MsBackendMzR(), fls[1]))
mz(res)
mz(data)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.