MsBackendPy | R Documentation |
The MsBackendPy backend provides direct access from R to MS data stored as
matchms.Spectrum or spectrum_utils.spectrum.MsmsSpectrum objects from the
matchms or spectrum_utils Python library, respectively. The MS data (peaks
data or spectra variables) are translated on-the-fly when accessed. Thus, the
MsBackendPy allows a seamless integration of Python MS data structures into
Spectra::Spectra() based analysis workflows.
The MsBackendPy
object is considered read-only, i.e. it does not provide
functionality to replace the peaks data from R. However, it is possible to
directly change the data in the referenced Python variable.
## S4 method for signature 'MsBackendPy'
backendInitialize(
object,
pythonVariableName = character(),
spectraVariableMapping = defaultSpectraVariableMapping(),
pythonLibrary = c("matchms", "spectrum_utils"),
...,
data
)
## S4 method for signature 'MsBackendPy'
length(x)
## S4 method for signature 'MsBackendPy'
spectraVariables(object)
## S4 method for signature 'MsBackendPy'
spectraData(object, columns = spectraVariables(object), drop = FALSE)
## S4 method for signature 'MsBackendPy'
peaksData(object, columns = c("mz", "intensity"), drop = FALSE)
## S4 method for signature 'MsBackendPy'
x$name
## S4 replacement method for signature 'MsBackendPy'
spectraVariableMapping(object) <- value
## S4 replacement method for signature 'Spectra'
spectraVariableMapping(object) <- value
reindex(object)
object |
A |
pythonVariableName |
For |
spectraVariableMapping |
For |
pythonLibrary |
For |
... |
Additional parameters. |
data |
For |
x |
A |
columns |
For |
drop |
For |
name |
For |
value |
Replacement value(s). |
The MsBackendPy
keeps only a reference to the MS data in Python (i.e. the
name of the variable in Python) as well as an index pointing to the
individual spectra in Python but no other data. Any data requested from
the MsBackendPy
is accessed and translated on-the-fly from the Python
variable. The MsBackendPy
is thus an interface to the MS data, but not
a data container. All changes to the MS data in the Python variable
(performed e.g. in Python) immediately affect any MsBackendPy
instances
pointing to this variable.
Special care must be taken if the MS data structure in Python is subset or
its order is changed (e.g. by another process). In that case it may be
necessary to re-index the backend using the reindex() function:
object <- reindex(object). This updates (replaces) the index to the
individual spectra in Python that is stored within the backend.
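The reference-plus-index design described above can be illustrated with a small toy model in base R. This is only a sketch under stated assumptions, not the SpectriPy implementation: an R environment stands in for the Python session, and the names `py_env`, `toy_backend`, `toy_length()`, `toy_get()` and `toy_reindex()` are hypothetical helpers invented for this illustration.

```r
## Toy model (NOT the SpectriPy implementation): an R environment mimics
## the Python session that holds the MS data in a variable named "s_p".
py_env <- new.env()
py_env$s_p <- list(a = c(100.1, 200.2), b = 150.5, c = c(99.9, 300.3))

## The hypothetical backend stores only the variable name and an index
## pointing to the individual spectra, but no MS data.
toy_backend <- list(var = "s_p", index = seq_along(py_env$s_p))

## Data is looked up each time it is requested; nothing is cached.
toy_length <- function(be) length(be$index)
toy_get <- function(be) get(be$var, envir = py_env)[be$index]

## Hypothetical counterpart of reindex(): reset the index to 1:n, with n
## being the current number of spectra in the referenced variable.
toy_reindex <- function(be) {
    be$index <- seq_along(get(be$var, envir = py_env))
    be
}

## Another process subsets the data without the backend noticing.
py_env$s_p <- py_env$s_p[c(1, 3)]

## The stored index still refers to three spectra and is now stale ...
toy_length(toy_backend)

## ... until the backend is re-indexed.
toy_backend <- toy_reindex(toy_backend)
toy_length(toy_backend)
names(toy_get(toy_backend))
```

The toy backend never copies the data, so any change to `py_env$s_p` is visible on the next access; only the stored index can become outdated, which is the situation the re-indexing step addresses.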
See description of individual functions for their return values.
MsBackendPy methods
The MsBackendPy supports all methods defined by the Spectra::MsBackend()
interface for access to MS data. Details on the individual functions can also
be found in the main documentation in the Spectra package (i.e. for
Spectra::MsBackend()). Here we provide information for functions with
specific properties of the backend.
backendInitialize()
: initializes the backend with information from the
referenced Python variable (attribute). The name of this attribute,
ideally stored in the associated Python session, is expected to be
provided with the pythonVariableName
parameter. The optional spectraVariableMapping parameter allows providing
an additional, or alternative, mapping of Spectra's spectra variables to
metadata in the matchms.Spectrum objects. See
defaultSpectraVariableMapping() (the default) for more information.
Parameter pythonLibrary
must be used
to specify the Python library representing the MS data in Python. It can
be either pythonLibrary = "matchms"
(the default) or
pythonLibrary = "spectrum_utils"
. The function returns an initialized
instance of MsBackendPy
.
peaksData()
: extracts the peaks data matrices from the backend. Python
code is applied to the data structure in Python to
extract the m/z and intensity values as a list of (numpy) arrays. These
are then translated into an R list
of two-column numeric
matrices.
Because (numpy) arrays have no column names, an additional
loop in R is required to set the column names to "mz" and "intensity".
spectraData()
: extracts the spectra data from the backend. Which spectra
variables are translated and retrieved from the Python objects depends on
the backend's spectraVariableMapping()
. All metadata names defined in it are
retrieved and added to the returned DataFrame (with any missing
core spectra variables filled with NA).
spectraVariables()
: retrieves available spectra variables, which include
the names of all metadata attributes in the matchms.Spectrum
objects
and the core spectra variables Spectra::coreSpectraVariables()
.
spectraVariableMapping<-
: replaces the spectraVariableMapping
of the
backend (see setSpectraVariableMapping()
for details and description
of the expected format).
reindex()
: updates the internal index to match 1:length(object)
.
This function is useful if the original data referenced by the backend was
subset or re-ordered by a different process (or a function in Python).
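Two of the post-processing steps mentioned in the method descriptions above, naming the columns of the peaks matrices (peaksData()) and filling missing core spectra variables with NA (spectraData()), can be sketched in base R. This is a minimal sketch under assumptions, not the package's code: all values are made up, a plain data.frame stands in for the returned DataFrame, and `core_na` is a hypothetical helper listing only a small subset of the variables reported by Spectra::coreSpectraVariables().

```r
## Part 1: stand-in for the translated Python result, a list of unnamed
## two-column numeric matrices (one per spectrum; made-up values).
pks <- list(
    cbind(c(100.1, 200.2, 300.3), c(10, 50, 5)),
    cbind(c(150.5, 250.5), c(20, 80))
)

## Loop over the list and assign the column names "mz" and "intensity".
pks <- lapply(pks, function(z) {
    colnames(z) <- c("mz", "intensity")
    z
})
pks[[1L]]

## Part 2: hypothetical metadata retrieved for the same two spectra.
sdata <- data.frame(msLevel = c(2L, 2L), precursorMz = c(321.1, 456.2))

## Typed NA values for a few core spectra variables (illustrative subset).
core_na <- list(msLevel = NA_integer_, rtime = NA_real_,
                precursorMz = NA_real_, precursorCharge = NA_integer_)

## Add every core variable that is missing, filled with NA of its type.
for (v in setdiff(names(core_na), colnames(sdata)))
    sdata[[v]] <- rep(core_na[[v]], nrow(sdata))
sdata
```

Using typed NA values keeps the column types stable, so downstream code that expects, for example, a numeric retention time does not receive a logical column.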
As mentioned in the details section, the MS data is stored entirely in
Python and the backend only references this data through the name of
the variable in Python. Thus, each time MS data is requested from the
backend, it is retrieved in its current state.
If, for example, data was transformed or metadata was added or removed in the
Python object, this immediately affects the Spectra/backend.
Johannes Rainer and the EuBIC hackathon team
## Loading an example MGF file provided by the SpectriPy package.
## As an alternative, the data could also be imported directly in Python
## using:
## import matchms
## from matchms.importing import load_from_mgf
## s_p = list(load_from_mgf(r.fl))
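## SpectriPy provides MsBackendPy and rspec_to_pyspec(); reticulate
## provides `py` and py_set_attr().
library(SpectriPy)
library(reticulate)
library(Spectra)
library(MsBackendMgf)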
fl <- system.file("extdata", "mgf", "test.mgf", package = "SpectriPy")
s <- Spectra(fl, source = MsBackendMgf())
s
## Translating the MS data to Python and assigning it to a variable
## named "s_p" in the (*reticulate*'s) `py` Python environment. Assigning
## the variable to the Python environment has performance advantages, as
## any Python code applied to the MS data does not require any data
## conversions.
py_set_attr(py, "s_p", rspec_to_pyspec(s))
## Create a `MsBackendPy` representing an interface to the data in the
## "s_p" variable in Python:
be <- backendInitialize(MsBackendPy(), "s_p")
be
## Create a Spectra object with this backend:
s_2 <- Spectra(be)
s_2
## Available spectra variables: these include, next to the *core* spectra
## variables, also the names of all metadata stored in the `matchms.Spectrum`
## objects.
spectraVariables(s_2)
## Get the full peaks data:
peaksData(s_2)
## Get the peaks from the first spectrum
peaksData(s_2)[[1L]]
## Get the full spectra data:
spectraData(s_2)
## Get the m/z values
mz(s_2)
## Plot the first spectrum
plotSpectra(s_2[1L])
########
## Using the spectrum_utils Python library
## Below we convert the data to a list of `MsmsSpectrum` objects from the
## spectrum_utils library.
py_set_attr(py, "su_p", rspec_to_pyspec(s,
    spectraVariableMapping("spectrum_utils"), "spectrum_utils"))
## Create a MsBackendPy representing this data. Importantly, we need to
## specify the Python library using the `pythonLibrary` parameter and
## ideally also set the `spectraVariableMapping` to the one specific for
## that library.
be <- backendInitialize(MsBackendPy(), "su_p",
    spectraVariableMapping = spectraVariableMapping("spectrum_utils"),
    pythonLibrary = "spectrum_utils")
be
## Get the peaks data for the first 3 spectra
peaksData(be[1:3])
## Get the full spectraData
spectraData(be)
## Extract the precursor m/z
be$precursorMz