MsBackendMassbankSql provides access to mass spectrometry data from
MassBank by directly accessing its
MySQL/MariaDb database. In addition it supports adding new spectra variables
or locally changing spectra variables provided by MassBank (without
changing the original values in the database).
MsBackendMassbankSql requires a local installation of the
MassBank database since direct database access is not supported for the
main MassBank instance.
Also, some of the fields in the MassBank database are not directly compatible
with Spectra, such as the collision energy, which is not available as a
numeric value. The collision energy as available in MassBank is reported as
spectra variable "collision_energy_text". Also, precursor m/z values
reported for some spectra can not be converted to a numeric and hence
NA is reported with the spectra variable precursorMz for these spectra. The
spectra variable "precursor_mz_text" can be used to get the original precursor
m/z reported in MassBank.
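The relation between the numeric and the text variables described above can be inspected side by side. The following is a minimal sketch, assuming the small SQLite test database bundled with MsBackendMassbank (the same file used in the examples of this page) and assuming that both text variables are part of the default output of spectraData:

```r
library(MsBackendMassbank)
library(RSQLite)

## Connect to the tiny SQLite test database shipped with the package
con <- dbConnect(SQLite(), system.file("sql", "minimassbank.sqlite",
                                       package = "MsBackendMassbank"))
be <- backendInitialize(MsBackendMassbankSql(), dbcon = con)

## Fetch the full spectra data once, then keep the numeric variables
## together with the original text values reported by MassBank
sd <- spectraData(be)
cols <- c("precursorMz", "precursor_mz_text",
          "collisionEnergy", "collision_energy_text")
head(sd[, cols])
```

Rows in which precursorMz is NA but "precursor_mz_text" is not would be the spectra whose precursor m/z could not be converted to a numeric.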
MsBackendMassbankSql does not support parallel processing
because the database connection stored within the object can not be
shared across parallel processes. All functions on
MsBackendMassbankSql will (silently) disable parallel processing
even if the user provides a dedicated parallel processing setup with
the BPPARAM parameter.
MsBackendMassbankSql()

## S4 method for signature 'MsBackendMassbankSql'
backendInitialize(object, dbcon, ...)

## S4 method for signature 'MsBackendMassbankSql'
peaksData(object, columns = peaksVariables(object))

## S4 method for signature 'MsBackendMassbankSql'
dataStorage(object)

## S4 replacement method for signature 'MsBackendMassbankSql'
intensity(object) <- value

## S4 replacement method for signature 'MsBackendMassbankSql'
mz(object) <- value

## S4 method for signature 'MsBackendMassbankSql'
reset(object)

## S4 method for signature 'MsBackendMassbankSql'
spectraData(object, columns = spectraVariables(object))

## S4 method for signature 'MsBackendMassbankSql'
spectraNames(object)

## S4 replacement method for signature 'MsBackendMassbankSql'
spectraNames(object) <- value

## S4 method for signature 'MsBackendMassbankSql'
tic(object, initial = TRUE)

## S4 method for signature 'MsBackendMassbankSql'
x[i, j, ..., drop = FALSE]

## S4 method for signature 'Spectra'
compounds(object, ...)

## S4 method for signature 'MsBackendMassbankSql'
compounds(object, ...)

## S4 replacement method for signature 'MsBackendMassbankSql'
x$name <- value

## S4 method for signature 'MsBackendMassbankSql'
precScanNum(object)

## S4 method for signature 'MsBackendMassbankSql'
backendBpparam(object, BPPARAM = bpparam())
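The parallel processing restriction described above can be checked with backendBpparam. This is a minimal sketch, assuming the bundled SQLite test database and using SnowParam from BiocParallel purely as an illustrative parallel setup (creating the parameter object does not start any workers):

```r
library(MsBackendMassbank)
library(RSQLite)
library(BiocParallel)

## Connect to the tiny SQLite test database shipped with the package
con <- dbConnect(SQLite(), system.file("sql", "minimassbank.sqlite",
                                       package = "MsBackendMassbank"))
be <- backendInitialize(MsBackendMassbankSql(), dbcon = con)

## Request a two-worker setup and ask the backend which setup it
## actually supports
res <- backendBpparam(be, BPPARAM = SnowParam(workers = 2))
class(res)
```

Based on the description of the backend, the returned setup is expected to be a serial one regardless of the setup passed in.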
value: replacement value for <- methods.
name: name of the variable to replace for <- methods.
See documentation of respective function.
The following functions are supported by MsBackendMassbankSql:
[: subset the backend. Only subsetting by element (row/i) is allowed.
$, $<-: access or set/add a single spectrum variable (column) in the
backend.
acquisitionNum: returns the acquisition number of each
spectrum. Returns an integer of length equal to the number of
spectra (with NA_integer_ if not available).
peaksData: returns a list with the spectra's peak data. The length of
the list is equal to the number of spectra in object. Each element of
the list is a matrix with columns "mz" and "intensity". For an empty
spectrum, a matrix with 0 rows and two columns (named mz and
intensity) is returned. Parameter columns allows selecting which peaks
variables to return, but currently supports only "mz" and "intensity".
backendBpparam: whether the backend supports parallel processing. Takes
a MsBackendMassbankSql and a parallel processing setup (see bpparam()
for details) as input and always returns a SerialParam(). This
function can be used to test whether a provided parallel processing setup
is supported by the backend and returns the supported setup.
backendInitialize: initialises the backend by retrieving the IDs of all
spectra in the database. Parameter
dbcon with the connection to the
MassBank MySQL database is required.
dataOrigin: gets a character of length equal to the number of spectra
in object with the data origin of each spectrum. This could e.g. be
the mzML file from which the data was read.
dataStorage: returns "<MassBank>" for all spectra.
centroided, centroided<-: gets or sets the centroiding
information of the spectra. centroided returns a logical
vector of length equal to the number of spectra with TRUE if a
spectrum is centroided, FALSE if it is in profile mode and NA
if it is undefined. See also isCentroided for estimating from
the spectrum data whether the spectrum is centroided. The value for
centroided<- is either a single logical or a logical of
length equal to the number of spectra in object.
collisionEnergy, collisionEnergy<-: gets or sets the
collision energy for all spectra in object. collisionEnergy returns a
numeric with length equal to the number of spectra
(NA_real_ if not present/defined), collisionEnergy<- takes a
numeric of length equal to the number of spectra in object. Note that
the collision energy descriptions from MassBank are provided as spectra
variable "collision_energy_text".
intensity: gets the intensity values from the spectra. Returns a list
of numeric vectors (intensity values for each
spectrum). The length of the list is equal to the number of
spectra in object.
ionCount: returns a numeric with the sum of intensities for
each spectrum. If the spectrum is empty (see isEmpty),
NA_real_ is returned.
isCentroided: a heuristic approach assessing if the spectra in
object are in profile or centroided mode. The function takes
the qtl-th quantile top peaks, then calculates the difference
between adjacent m/z values and returns TRUE if the first
quartile is greater than the threshold k.
isEmpty: checks whether a spectrum in object is empty
(i.e. does not contain any peaks). Returns a logical vector of
length equal to the number of spectra.
isolationWindowLowerMz, isolationWindowLowerMz<-: gets or sets the
lower m/z boundary of the isolation window.
isolationWindowTargetMz, isolationWindowTargetMz<-: gets or sets the
target m/z of the isolation window.
isolationWindowUpperMz, isolationWindowUpperMz<-: gets or sets the
upper m/z boundary of the isolation window.
isReadOnly: returns a logical(1) indicating whether the backend is
read-only or also allows writing/updating data.
length: returns the number of spectra in the object.
lengths: gets the number of peaks (m/z-intensity values) per
spectrum. Returns an
integer vector (length equal to the
number of spectra). For empty spectra,
0 is returned.
msLevel: gets the spectra's MS level. Returns an integer
vector (of length equal to the number of spectra) with the MS
level for each spectrum (or NA_integer_ if not available).
mz: gets the mass-to-charge ratios (m/z) from the
spectra. Returns a NumericList() of length equal to the number of
spectra, each element a numeric vector with the m/z values of
one spectrum.
polarity, polarity<-: gets or sets the polarity for each
spectrum. polarity returns an integer vector (length equal
to the number of spectra), with 0 and 1 representing negative
and positive polarities, respectively. polarity<- expects an
integer vector of length 1 or equal to the number of spectra.
precursorCharge, precursorIntensity, precursorMz, precScanNum,
precAcquisitionNum: get the charge (integer), intensity (numeric),
m/z (numeric), scan index (integer) and acquisition number (integer)
of the precursor for MS level 2 and above spectra from the object.
Returns a vector of length equal to the number of spectra in
object. NA are reported for MS1 spectra or if no precursor
information is available.
reset: restores the backend to its original state, i.e. deletes all
locally modified data and reinitializes the backend to the full data
available in the database.
rtime, rtime<-: gets or sets the retention times for each
spectrum (in seconds). rtime returns a numeric vector (length equal to
the number of spectra) with the retention time for each spectrum.
rtime<- expects a numeric vector with length equal to the
number of spectra.
scanIndex: returns an
integer vector with the scan index
for each spectrum. This represents the relative index of the
spectrum within each file. Note that this can be different to the
acquisitionNum of the spectrum which is the index of the
spectrum as reported in the mzML file.
selectSpectraVariables: reduces the information within the backend to
the selected spectra variables.
smoothed, smoothed<-: gets or sets whether a spectrum is
smoothed. smoothed returns a logical vector of length equal
to the number of spectra. smoothed<- takes a logical vector
of length 1 or equal to the number of spectra in object.
spectraData: gets general spectrum metadata (annotation, also called
header). spectraData returns a DataFrame. Note that replacing the
spectra data with spectraData<- is not supported.
spectraNames: returns a character vector with the names of
the spectra in object.
spectraVariables: returns a character vector with the
available spectra variables (columns, fields or attributes)
in object. This should return all spectra variables which
are present in object, also "mz" and "intensity" (which are by
default not returned by the spectraVariables,Spectra method).
tic: gets the total ion current/count (sum of signal of a
spectrum) for all spectra in object. By default, the value
reported in the original raw data file is returned. For an empty
spectrum, NA_real_ is returned.
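The interplay of local changes and reset listed above can be sketched as follows. This is an illustrative example only, assuming the bundled SQLite test database; the variable name new_variable is made up for the example:

```r
library(MsBackendMassbank)
library(RSQLite)

## Connect to the tiny SQLite test database shipped with the package
con <- dbConnect(SQLite(), system.file("sql", "minimassbank.sqlite",
                                       package = "MsBackendMassbank"))
be <- backendInitialize(MsBackendMassbankSql(), dbcon = con)

## Local changes: add a (hypothetical) spectra variable and subset.
## Neither operation modifies the database itself.
be$new_variable <- "b"
be_sub <- be[c(3, 1)]
length(be_sub)

## reset discards local changes and restores the full data from
## the database
be_reset <- reset(be_sub)
length(be_reset) == length(be)
"new_variable" %in% spectraVariables(be_reset)
```

After reset, the backend should again contain all spectra of the database and no longer list the locally added variable.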
Functions that would change the original data (such as spectraData<-,
see above) are not supported by MsBackendMassbankSql, since
the original data can not be changed.
While compound annotations are also provided via the spectraVariables of
the backend, it is also possible to use the
compounds function on a
Spectra object (that uses a
MsBackendMassbankSql backend) to retrieve
compound annotations for the specific spectra.
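Retrieving compound annotations, either from the backend directly or from a Spectra object that wraps it, might look like the following minimal sketch. It assumes the bundled SQLite test database and relies only on the two compounds methods listed in the usage section:

```r
library(MsBackendMassbank)
library(RSQLite)

## Connect to the tiny SQLite test database shipped with the package
con <- dbConnect(SQLite(), system.file("sql", "minimassbank.sqlite",
                                       package = "MsBackendMassbank"))
be <- backendInitialize(MsBackendMassbankSql(), dbcon = con)

## Compound annotations directly from the backend
cmps <- compounds(be)
cmps

## The same call on a Spectra object that uses this backend
sps <- Spectra(be)
cmps_sps <- compounds(sps)
cmps_sps
```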
## Create a connection to a database with MassBank data - in the present
## example we connect to a tiny SQLite database bundled in this package
## as public access to the MassBank MySQL is not (yet) supported. See the
## vignette for more information on how to install MassBank locally and
## enable MySQL database connections
library(MsBackendMassbank)
library(RSQLite)
con <- dbConnect(SQLite(), system.file("sql", "minimassbank.sqlite",
    package = "MsBackendMassbank"))

## Given that we have the connection to a MassBank database we can
## initialize the backend:
be <- backendInitialize(MsBackendMassbankSql(), dbcon = con)
be

## Access MS level
msLevel(be)
be$msLevel

## Access m/z values
be$mz

## Access the full spectra data (including m/z and intensity values)
spectraData(be)

## Add a new spectra variable
be$new_variable <- "b"
be$new_variable

## Subset the backend
be_sub <- be[c(3, 1)]

spectraNames(be)
spectraNames(be_sub)