BiodbConn: The mother abstract class of all database connectors.
In pkrog/biodb: biodb, a library and a development framework for connecting to chemical and biological databases

BiodbConn

R Documentation

The mother abstract class of all database connectors.

Description

The mother abstract class of all database connectors.

Details

This is the superclass of all connector classes. Every connector class inherits from this abstract class, so all methods defined here are common to all connectors.

See section Fields for a list of the constructor's parameters. Concrete classes may have direct web services methods or other specific methods implemented, in which case they will be described inside the documentation of the concrete class. Please refer to the documentation of each concrete class for more information. The database direct web services methods will be named "ws.*".

The constructor has the following arguments:

id: The identifier of the connector.

cache.id: The identifier used in the disk cache.

Super class

biodb::BiodbConnBase -> BiodbConn

Methods

Public methods

BiodbConn$new()
BiodbConn$getBiodb()
BiodbConn$getId()
BiodbConn$print()
BiodbConn$correctIds()
BiodbConn$getEntry()
BiodbConn$getCacheFile()
BiodbConn$getEntryContent()
BiodbConn$getEntryContentFromDb()
BiodbConn$getEntryContentRequest()
BiodbConn$getEntryIds()
BiodbConn$getNbEntries()
BiodbConn$isEditable()
BiodbConn$editingIsAllowed()
BiodbConn$allowEditing()
BiodbConn$disallowEditing()
BiodbConn$setEditingAllowed()
BiodbConn$addNewEntry()
BiodbConn$isWritable()
BiodbConn$allowWriting()
BiodbConn$disallowWriting()
BiodbConn$setWritingAllowed()
BiodbConn$writingIsAllowed()
BiodbConn$write()
BiodbConn$isSearchableByField()
BiodbConn$getSearchableFields()
BiodbConn$searchForEntries()
BiodbConn$searchByName()
BiodbConn$isDownloadable()
BiodbConn$isDownloaded()
BiodbConn$requiresDownload()
BiodbConn$getDownloadPath()
BiodbConn$setDownloadedFile()
BiodbConn$isExtracted()
BiodbConn$download()
BiodbConn$isRemotedb()
BiodbConn$isCompounddb()
BiodbConn$searchCompound()
BiodbConn$annotateMzValues()
BiodbConn$isMassdb()
BiodbConn$checkDb()
BiodbConn$getAllVolatileCacheEntries()
BiodbConn$getAllCacheEntries()
BiodbConn$deleteAllEntriesFromVolatileCache()
BiodbConn$deleteAllEntriesFromPersistentCache()
BiodbConn$deleteWholePersistentCache()
BiodbConn$deleteAllCacheEntries()
BiodbConn$getCacheId()
BiodbConn$makesRefToEntry()
BiodbConn$makeRequest()
BiodbConn$getEntryImageUrl()
BiodbConn$getEntryPageUrl()
BiodbConn$getChromCol()
BiodbConn$getMatchingMzField()
BiodbConn$setMatchingMzField()
BiodbConn$getMzValues()
BiodbConn$getNbPeaks()
BiodbConn$filterEntriesOnRt()
BiodbConn$searchForMassSpectra()
BiodbConn$searchMsEntries()
BiodbConn$searchMsPeaks()
BiodbConn$msmsSearch()
BiodbConn$collapseResultsDataFrame()
BiodbConn$searchMzRange()
BiodbConn$searchMzTol()
BiodbConn$clone()

Method `new()`

New instance initializer. Connector objects must not be created directly. Instead, you create new connector instances through the BiodbFactory instance.

Usage

BiodbConn$new(id = NA_character_, cache.id = NA_character_, bdb, ...)

Arguments

id: The ID of the connector instance.
cache.id: The Cache ID of the connector instance.
bdb: The BiodbMain instance.
...: Remaining arguments will be passed to the constructor of the super class.

Returns

Nothing.

Method `getBiodb()`

Returns the biodb main class instance to which this object is attached.

Usage

BiodbConn$getBiodb()

Returns

The main biodb instance.

Method `getId()`

Get the identifier of this connector.

Usage

BiodbConn$getId()

Returns

The identifier of this connector.

Method `print()`

Prints a description of this connector.

Usage

BiodbConn$print()

Returns

Nothing.

Method `correctIds()`

Correct a vector of IDs by formatting them to the database official format, if required and possible.

Usage

BiodbConn$correctIds(ids)

Arguments

ids: A character vector of IDs.

Returns

The vector of IDs corrected.

Method `getEntry()`

Return the entry corresponding to this ID. You can pass a vector of IDs, and you will get a list of entries.

Usage

BiodbConn$getEntry(id, drop = TRUE, nulls = TRUE)

Arguments

id: A character vector containing entry identifiers.
drop: If set to TRUE and only one entry is requested, then the returned value will be a single BiodbEntry object, otherwise it will be a list of BiodbEntry objects.
nulls: If set to TRUE, NULL entries are preserved. This ensures that the output list has the same length as the input vector id. Otherwise they are removed from the final list.

Returns

A list of BiodbEntry objects, the same length as the vector of IDs. The list will contain NULL values for invalid IDs. If drop is set to TRUE and only one entry was requested, then a single BiodbEntry is returned instead of a list.
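
The interaction between drop and nulls can be illustrated with a small base R sketch. It does not call biodb: the `known` list and the `get_entry_sketch()` function are hypothetical stand-ins that only mimic the documented shape of the returned value, with plain strings in place of BiodbEntry objects.

```r
# Hypothetical stand-in for a database: a named list of "entries".
# Real connectors return BiodbEntry objects, not strings.
known <- list(A1 = "entry A1", B2 = "entry B2")

get_entry_sketch <- function(id, drop = TRUE, nulls = TRUE) {
  # lapply() keeps NULL results, so there is one slot per requested ID
  entries <- lapply(id, function(i) known[[i]])
  if (!nulls) {
    # Remove the slots of IDs that were not found
    entries <- Filter(Negate(is.null), entries)
  }
  if (drop && length(id) == 1 && length(entries) == 1) {
    # A single requested entry is returned as is, not wrapped in a list
    return(entries[[1]])
  }
  entries
}

res <- get_entry_sketch(c("A1", "ZZ", "B2"))               # list of 3, NULL in slot 2
res_compact <- get_entry_sketch(c("A1", "ZZ", "B2"), nulls = FALSE)  # list of 2
single <- get_entry_sketch("A1")                           # the entry itself
```

Keeping nulls = TRUE is convenient when the result must stay aligned with the input vector of IDs.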

Method `getCacheFile()`

Get the path to the persistent cache file.

Usage

BiodbConn$getCacheFile(entry.id)

Arguments

entry.id: The identifiers (e.g.: accession numbers) as a character vector of the database entries.

Returns

A character vector, the same length as the vector of IDs, containing the paths to the cache files corresponding to the requested entry IDs.

Method `getEntryContent()`

Get the contents of database entries from IDs (accession numbers).

Usage

BiodbConn$getEntryContent(id)

Arguments

id: A character vector of entry IDs.

Returns

A character vector containing the contents of the requested IDs. If no content is available for an entry ID, then NA will be used.

Method `getEntryContentFromDb()`

Get the contents of entries directly from the database. A direct request or an access to the database will be made in order to retrieve the contents. No access to the biodb cache system will be made.

Usage

BiodbConn$getEntryContentFromDb(entry.id)

Arguments

entry.id: A character vector with the IDs of entries to retrieve.

Returns

A character vector, the same length as entry.id, with contents of the requested entries. An NA value will be set for the content of each entry for which the retrieval failed.

Method `getEntryContentRequest()`

Gets the URL to use in order to get the contents of the specified entries.

Usage

BiodbConn$getEntryContentRequest(entry.id, concatenate = TRUE, max.length = 0)

Arguments

entry.id: A character vector with the IDs of entries to retrieve.
concatenate: If set to TRUE, then try to build as few URLs as possible, sending requests with several identifiers at once.
max.length: The maximum length of the URLs to return, in number of characters.

Returns

A vector of URL strings.
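
As an illustration of what concatenate and max.length are meant to achieve, here is a base R sketch that greedily packs IDs into as few URLs as possible. The base URL, the comma separator and the function name `build_urls_sketch()` are assumptions made for the example, as is the reading of max.length = 0 as "no limit". Each connector builds its own URLs, and this is not the biodb implementation.

```r
# Hypothetical URL builder: packs IDs into comma-separated query strings.
build_urls_sketch <- function(entry.id,
                              base = "https://example.org/entries?ids=",
                              concatenate = TRUE, max.length = 0) {
  if (!concatenate) {
    # One URL per identifier
    return(paste0(base, entry.id))
  }
  urls <- character()
  current <- character()
  for (id in entry.id) {
    candidate <- paste0(base, paste(c(current, id), collapse = ","))
    if (max.length > 0 && length(current) > 0 &&
        nchar(candidate) > max.length) {
      # Adding this ID would exceed the limit: close the current URL
      urls <- c(urls, paste0(base, paste(current, collapse = ",")))
      current <- id
    } else {
      current <- c(current, id)
    }
  }
  if (length(current) > 0) {
    urls <- c(urls, paste0(base, paste(current, collapse = ",")))
  }
  urls
}

ids <- c("1001", "1002", "1003")
one_url <- build_urls_sketch(ids)                      # all IDs in one URL
per_id <- build_urls_sketch(ids, concatenate = FALSE)  # one URL per ID
limited <- build_urls_sketch(ids, max.length = 42)     # split to respect the limit
```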

Method `getEntryIds()`

Get entry identifiers from the database. More arguments can be given, depending on implementation in specific databases. For mass databases the ms.level argument can also be set.

Usage

BiodbConn$getEntryIds(max.results = 0, ...)

Arguments

max.results: The maximum number of elements to return from the method.
...: Arguments specific to connectors.

Returns

A character vector containing entry IDs from the database. An empty vector for a remote database may mean that the database does not support requesting for entry accessions.

Method `getNbEntries()`

Get the number of entries contained in this database.

Usage

BiodbConn$getNbEntries(count = FALSE)

Arguments

count: If set to TRUE and no straightforward way exists to get number of entries, count the output of getEntryIds().

Returns

The number of entries in the database, as an integer.

Method `isEditable()`

Tests if this connector is able to edit the database (i.e.: the connector class implements the interface BiodbEditable). If this connector is editable, then you can call allowEditing() to enable editing.

Usage

BiodbConn$isEditable()

Returns

Returns TRUE if the database is editable.

Method `editingIsAllowed()`

Tests if editing is allowed.

Usage

BiodbConn$editingIsAllowed()

Returns

TRUE if editing is allowed for this database, FALSE otherwise.

Method `allowEditing()`

Allows editing for this database.

Usage

BiodbConn$allowEditing()

Returns

Nothing.

Method `disallowEditing()`

Disallows editing for this database.

Usage

BiodbConn$disallowEditing()

Returns

Nothing.

Method `setEditingAllowed()`

Allow or disallow editing for this database.

Usage

BiodbConn$setEditingAllowed(allow)

Arguments

allow: A logical value.

Returns

Nothing.

Method `addNewEntry()`

Adds a new entry to the database. The passed entry must have been previously created from scratch using BiodbFactory$createNewEntry() or cloned from an existing entry using BiodbEntry$clone().

Usage

BiodbConn$addNewEntry(entry)

Arguments

entry: The new entry to add. It must be a valid BiodbEntry object.

Returns

Nothing.

Method `isWritable()`

Tests if this connector is able to write into the database. If this connector is writable, then you can call allowWriting() to enable writing.

Usage

BiodbConn$isWritable()

Returns

Returns TRUE if the database is writable.

Method `allowWriting()`

Allows the connector to write into this database.

Usage

BiodbConn$allowWriting()

Returns

Nothing.

Method `disallowWriting()`

Disallows the connector to write into this database.

Usage

BiodbConn$disallowWriting()

Returns

Nothing.

Method `setWritingAllowed()`

Allows or disallows writing for this database.

Usage

BiodbConn$setWritingAllowed(allow)

Arguments

allow: If set to TRUE, allows writing.

Returns

Nothing.

Method `writingIsAllowed()`

Tests if the connector is allowed to write into the database.

Usage

BiodbConn$writingIsAllowed()

Returns

TRUE if writing is allowed for this database, FALSE otherwise.

Method `write()`

Writes into the database. All modifications made to the database since the last time write() was called will be saved.

Usage

BiodbConn$write()

Returns

Nothing.

Method `isSearchableByField()`

Tests if a field can be used to search entries when using method searchForEntries().

Usage

BiodbConn$isSearchableByField(field = NULL, field.type = NULL)

Arguments

field: The name of the field.
field.type: The field type.

Returns

Returns TRUE if the database is searchable using the specified field or searchable by any field of the specified type, FALSE otherwise.

Method `getSearchableFields()`

Get the list of all searchable fields.

Usage

BiodbConn$getSearchableFields()

Returns

A character vector containing all searchable fields for this connector.

Method `searchForEntries()`

Searches the database for entries whose field values match the filters specified in fields. Returns a character vector of entry IDs.

Usage

BiodbConn$searchForEntries(fields = NULL, max.results = 0)

Arguments

fields: A list of fields on which to filter entries. To get a match, all fields must be matched (i.e. logical AND). The keys of the list are the entry field names on which to filter, and the values are the filtering parameters. For character fields, the filter parameter is a character vector in which all strings must be found inside the field's value. For numeric fields, the filter parameter is either a list specifying a min-max range (list(min=1.0, max=2.5)) or a value with a tolerance in delta (list(value=2.0, delta=0.1)) or ppm (list(value=2.0, ppm=1.0)).
max.results: If set, the number of returned IDs is limited to this number.

Returns

A character vector of entry IDs whose field values match the requested filters.
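
The structure of the fields argument is easier to see on an example. The base R sketch below builds such a filter list and applies the matching rules described above to a single entry represented as a plain named list. The function `matches_sketch()` is hypothetical and does not call biodb. Case-sensitive substring matching and a ppm tolerance taken relative to the requested value are assumptions.

```r
# A filter list combining a character filter and a numeric range filter
fields <- list(
  name = c("acid", "amino"),
  monoisotopic.mass = list(min = 100.0, max = 200.0)
)

# Hypothetical matcher for one entry given as a named list of field values.
matches_sketch <- function(entry, fields) {
  for (f in names(fields)) {
    value <- entry[[f]]
    filt <- fields[[f]]
    if (is.null(value)) return(FALSE)
    if (is.character(filt)) {
      # Every string must be found inside the field's value
      ok <- all(vapply(filt,
                       function(s) any(grepl(s, value, fixed = TRUE)),
                       logical(1)))
    } else if (!is.null(filt[["min"]])) {
      # Min-max range
      ok <- any(value >= filt[["min"]] & value <= filt[["max"]])
    } else if (!is.null(filt[["delta"]])) {
      # Value with an absolute tolerance
      ok <- any(abs(value - filt[["value"]]) <= filt[["delta"]])
    } else {
      # Value with a ppm tolerance (assumed relative to the requested value)
      ok <- any(abs(value - filt[["value"]]) <=
                  filt[["value"]] * filt[["ppm"]] * 1e-6)
    }
    # All fields must match (logical AND)
    if (!ok) return(FALSE)
  }
  TRUE
}

e <- list(name = "2-amino acid", monoisotopic.mass = 150.0)
hit <- matches_sketch(e, fields)
```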

Method `searchByName()`

DEPRECATED. Use searchForEntries() instead.

Usage

BiodbConn$searchByName(name, max.results = 0)

Arguments

name: A character value to search inside name fields.
max.results: If set, the number of returned IDs is limited to this number.

Returns

A character vector of entry IDs whose name matches the requested name.

Method `isDownloadable()`

Tests if the connector can download the database.

Usage

BiodbConn$isDownloadable()

Returns

Returns TRUE if the database is downloadable.

Method `isDownloaded()`

Tests if the database has been downloaded.

Usage

BiodbConn$isDownloaded()

Returns

TRUE if the database content has already been downloaded.

Method `requiresDownload()`

Tests if the connector requires the download of the database.

Usage

BiodbConn$requiresDownload()

Returns

TRUE if the connector requires download of the database.

Method `getDownloadPath()`

Gets the path where the downloaded content is written.

Usage

BiodbConn$getDownloadPath()

Returns

The path where the downloaded database is written.

Method `setDownloadedFile()`

Set the downloaded file into the cache.

Usage

BiodbConn$setDownloadedFile(src, action = c("copy", "move"))

Arguments

src: Path to the downloaded file.
action: Specifies if files have to be moved or copied into the cache.

Returns

Nothing.

Method `isExtracted()`

Tests if the downloaded database has been extracted (in case the database needs extraction).

Usage

BiodbConn$isExtracted()

Returns

TRUE if the downloaded database content has been extracted, FALSE otherwise.

Method `download()`

Downloads the database content locally.

Usage

BiodbConn$download()

Returns

Nothing.

Method `isRemotedb()`

Tests if the connector is connected to a remote database.

Usage

BiodbConn$isRemotedb()

Returns

Returns TRUE if the database is a remote database.

Method `isCompounddb()`

Tests if the connector's database is a compound database.

Usage

BiodbConn$isCompounddb()

Returns

Returns TRUE if the database is a compound database.

Method `searchCompound()`

This method is deprecated. Use searchForEntries() instead. Searches for compounds by name and/or by mass. At least one of name or mass must be set.

Usage

BiodbConn$searchCompound(
  name = NULL,
  mass = NULL,
  mass.field = NULL,
  mass.tol = 0.01,
  mass.tol.unit = "plain",
  max.results = 0
)

Arguments

name: The name of a compound to search for.
mass: The searched mass.
mass.field: For searching by mass, you must indicate a mass field to use ('monoisotopic.mass', 'molecular.mass', 'average.mass' or 'nominal.mass').
mass.tol: The tolerance value on the molecular mass.
mass.tol.unit: The type of mass tolerance. Either 'plain' or 'ppm'.
max.results: The maximum number of matches to return.
description: A character vector of words or expressions to search for inside description field. The words will be searched in order. A match will be made only if all words are inside the description field.

Returns

A character vector of entry IDs.

Method `annotateMzValues()`

Annotates a mass spectrum with the database. For each matching entry the entry field values will be set inside columns appended to the data frame. Names of these columns will use a common prefix in order to distinguish them from other data from the input data frame.

Usage

BiodbConn$annotateMzValues(
  x,
  mz.tol,
  ms.mode,
  mz.tol.unit = c("plain", "ppm"),
  mass.field = "monoisotopic.mass",
  max.results = 3,
  mz.col = "mz",
  fields = NULL,
  prefix = NULL,
  insert.input.values = TRUE,
  fieldsLimit = 0
)

Arguments

x: Either a data frame or a numeric vector containing the M/Z values.
mz.tol: The tolerance on the M/Z values.
ms.mode: The MS mode. Set it to either 'neg' or 'pos'.
mz.tol.unit: The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.
mass.field: The mass field to use for matching M/Z values. One of: 'monoisotopic.mass', 'molecular.mass', 'average.mass', 'nominal.mass'.
max.results: If set, it is used to limit the number of matches found for each M/Z value. To get all the matches, set this parameter to NA_integer_. Default value is 3.
mz.col: The name of the column where to find M/Z values in case x is a data frame.
fields: A character vector containing the additional entry fields you would like to get for each matched entry. Each field will be output in a different column.
prefix: A prefix that will be inserted before the name of each added column in the output. By default it will be set to the name of the database followed by a dot.
insert.input.values: Insert input values at the beginning of the result data frame.
fieldsLimit: The maximum number of values to output for fields with multiple values. Set it to 0 to get all values.

Returns

A data frame containing the input values, and annotation columns appended at the end. The first annotation column contains the IDs of the matched entries. The following columns contain the fields you have requested through the fields parameter.

Method `isMassdb()`

Tests if the connector's database is a mass spectra database.

Usage

BiodbConn$isMassdb()

Returns

Returns TRUE if the database is a mass database.

Method `checkDb()`

Checks that the database is correct by trying to retrieve all its entries.

Usage

BiodbConn$checkDb()

Returns

Nothing.

Method `getAllVolatileCacheEntries()`

Get all entries stored in the memory cache (volatile cache).

Usage

BiodbConn$getAllVolatileCacheEntries()

Returns

A list of BiodbEntry instances.

Method `getAllCacheEntries()`

This method is deprecated. Use getAllVolatileCacheEntries() instead.

Usage

BiodbConn$getAllCacheEntries()

Returns

All entries cached in memory.

Method `deleteAllEntriesFromVolatileCache()`

Delete all entries from the volatile cache (memory cache).

Usage

BiodbConn$deleteAllEntriesFromVolatileCache()

Returns

Nothing.

Method `deleteAllEntriesFromPersistentCache()`

Delete all entries from the persistent cache (disk cache).

Usage

BiodbConn$deleteAllEntriesFromPersistentCache(deleteVolatile = TRUE)

Arguments

deleteVolatile: If TRUE, also deletes all entries from the volatile cache (memory cache).

Returns

Nothing.

Method `deleteWholePersistentCache()`

Delete all files associated with this connector from the persistent cache (disk cache).

Usage

BiodbConn$deleteWholePersistentCache(deleteVolatile = TRUE)

Arguments

deleteVolatile: If TRUE, also deletes all entries from the volatile cache (memory cache).

Returns

Nothing.

Method `deleteAllCacheEntries()`

Delete all entries from the memory cache. This method is deprecated, please use deleteAllEntriesFromVolatileCache() instead.

Usage

BiodbConn$deleteAllCacheEntries()

Returns

Nothing.

Method `getCacheId()`

Gets the ID used by this connector in the disk cache.

Usage

BiodbConn$getCacheId()

Returns

The cache ID of this connector.

Method `makesRefToEntry()`

Tests if some entry of this database makes reference to another entry of another database.

Usage

BiodbConn$makesRefToEntry(id, db, oid, any = FALSE, recurse = FALSE)

Arguments

id: A character vector of entry IDs from the connector's database.
db: Another database connector.
oid: An entry ID from database db.
any: If set to TRUE, returns a single logical value: TRUE if any entry contains a reference to oid, FALSE otherwise.
recurse: If set to TRUE, the algorithm will follow all references to entries from other databases, to see if it can establish an indirect link to oid.

Returns

A logical vector, the same size as id, with TRUE for each entry making reference to oid, and FALSE otherwise.

Method `makeRequest()`

Makes a BiodbRequest instance using the passed parameters, and sets itself as the associated connector.

Usage

BiodbConn$makeRequest(...)

Arguments

...: Those parameters are passed to the initializer of BiodbRequest.

Returns

The BiodbRequest instance.

Method `getEntryImageUrl()`

Gets the URL to a picture of the entry (e.g.: a picture of the molecule in case of a compound entry).

Usage

BiodbConn$getEntryImageUrl(entry.id)

Arguments

entry.id: A character vector containing entry IDs.

Returns

A character vector, the same length as entry.id, containing for each entry ID either a URL or NA if no URL exists.

Method `getEntryPageUrl()`

Gets the URL to the page of the entry on the database web site.

Usage

BiodbConn$getEntryPageUrl(entry.id)

Arguments

entry.id: A character vector with the IDs of entries to retrieve.

Returns

A list of BiodbUrl objects, the same length as entry.id.

Method `getChromCol()`

Gets a list of chromatographic columns contained in this database.

Usage

BiodbConn$getChromCol(ids = NULL)

Arguments

ids: A character vector of entry identifiers (i.e.: accession numbers). Used to restrict the set of entries on which to run the algorithm.

Returns

A data.frame with two columns, one for the ID 'id' and another one for the title 'title'.

Method `getMatchingMzField()`

Gets the field to use for M/Z matching.

Usage

BiodbConn$getMatchingMzField()

Returns

The name of the field (one of peak.mztheo or peak.mzexp).

Method `setMatchingMzField()`

Sets the field to use for M/Z matching.

Usage

BiodbConn$setMatchingMzField(field = c("peak.mztheo", "peak.mzexp"))

Arguments

field: The field to use for matching.

Returns

Nothing.

Method `getMzValues()`

Gets a list of M/Z values contained inside the database.

Usage

BiodbConn$getMzValues(
  ms.mode = NULL,
  max.results = 0,
  precursor = FALSE,
  ms.level = 0
)

Arguments

ms.mode: The MS mode. Set it to either 'neg' or 'pos' to limit the output to one mode.
max.results: If set, it is used to limit the size of the output.
precursor: If set to TRUE, then restrict the search to precursor peaks.
ms.level: The MS level to which you want to restrict your search. 0 means that you want to search in all levels.

Returns

A numeric vector containing M/Z values.

Method `getNbPeaks()`

Gets the number of peaks contained in the database.

Usage

BiodbConn$getNbPeaks(mode = NULL, ids = NULL)

Arguments

mode: The MS mode. Set it to either 'neg' or 'pos' to limit the counting to one mode.
ids: A character vector of entry identifiers (i.e.: accession numbers). Used to restrict the set of entries on which to run the algorithm.

Returns

The number of peaks, as an integer.

Method `filterEntriesOnRt()`

Filters a list of entries on retention time values.

Usage

BiodbConn$filterEntriesOnRt(
  entry.ids,
  rt,
  rt.unit,
  rt.tol,
  rt.tol.exp,
  chrom.col.ids,
  match.rt
)

Arguments

entry.ids: A character vector of entry IDs.
rt: A vector of retention times to match. Used if input.df is not set. Unit is specified by rt.unit parameter.
rt.unit: The unit for submitted retention times. Either 's' or 'min'.
rt.tol: The plain tolerance (in seconds) for retention times: input.rt - rt.tol <= database.rt <= input.rt + rt.tol.
rt.tol.exp: A special exponent tolerance for retention times: input.rt - input.rt ** rt.tol.exp <= database.rt <= input.rt + input.rt ** rt.tol.exp. This exponent is applied on the RT value in seconds. If both rt.tol and rt.tol.exp are set, the inequality expression becomes input.rt - rt.tol - input.rt ** rt.tol.exp <= database.rt <= input.rt + rt.tol + input.rt ** rt.tol.exp.
chrom.col.ids: IDs of chromatographic columns on which to match the retention time.
match.rt: If set to TRUE, filters on RT values, otherwise does not do any filtering.

Returns

A character vector containing entry IDs after filtering.
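
The retention time window defined by rt.tol and rt.tol.exp can be computed by hand to check which database values would pass the filter. The following base R sketch follows the combined inequality given for these two parameters. The function `rt_window_sketch()` is a hypothetical helper written for this illustration, not part of biodb, and it assumes that values given in minutes are converted to seconds before the tolerances are applied.

```r
# Hypothetical helper: accepted RT interval, in seconds, for each input RT.
rt_window_sketch <- function(rt, rt.unit = c("s", "min"),
                             rt.tol = NULL, rt.tol.exp = NULL) {
  rt.unit <- match.arg(rt.unit)
  # Tolerances apply to the RT value in seconds
  rt.sec <- if (rt.unit == "min") rt * 60 else rt
  half <- 0
  if (!is.null(rt.tol)) half <- half + rt.tol
  if (!is.null(rt.tol.exp)) half <- half + rt.sec ^ rt.tol.exp
  data.frame(rt.min = rt.sec - half, rt.max = rt.sec + half)
}

# 2 minutes with a plain tolerance of 5 seconds
w <- rt_window_sketch(2, "min", rt.tol = 5)

# 100 seconds with both tolerances: half-width is 5 + 100 ** 0.5
w2 <- rt_window_sketch(100, "s", rt.tol = 5, rt.tol.exp = 0.5)
```

The exponent tolerance makes the window grow with the retention time, which the plain tolerance alone does not.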

Method `searchForMassSpectra()`

Searches for entries (i.e.: spectra) that contain a peak around the given M/Z value. Entries can also be filtered on RT values. You can input either a list of M/Z values through mz argument and set a tolerance with mz.tol argument, or two lists of minimum and maximum M/Z values through mz.min and mz.max arguments.

Usage

BiodbConn$searchForMassSpectra(
  mz.min = NULL,
  mz.max = NULL,
  mz = NULL,
  mz.tol = NULL,
  mz.tol.unit = c("plain", "ppm"),
  rt = NULL,
  rt.unit = c("s", "min"),
  rt.tol = NULL,
  rt.tol.exp = NULL,
  chrom.col.ids = NULL,
  precursor = FALSE,
  min.rel.int = 0,
  ms.mode = NULL,
  max.results = 0,
  ms.level = 0,
  include.ids = NULL
)

Arguments

mz.min: A vector of minimum M/Z values.
mz.max: A vector of maximum M/Z values. Its length must be the same as mz.min.
mz: A vector of M/Z values.
mz.tol: The M/Z tolerance, whose unit is defined by mz.tol.unit.
mz.tol.unit: The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.
rt: A vector of retention times to match. Used if input.df is not set. Unit is specified by rt.unit parameter.
rt.unit: The unit for submitted retention times. Either 's' or 'min'.
rt.tol: The plain tolerance (in seconds) for retention times: input.rt - rt.tol <= database.rt <= input.rt + rt.tol.
rt.tol.exp: A special exponent tolerance for retention times: input.rt - input.rt ** rt.tol.exp <= database.rt <= input.rt + input.rt ** rt.tol.exp. This exponent is applied on the RT value in seconds. If both rt.tol and rt.tol.exp are set, the inequality expression becomes input.rt - rt.tol - input.rt ** rt.tol.exp <= database.rt <= input.rt + rt.tol + input.rt ** rt.tol.exp.
chrom.col.ids: IDs of chromatographic columns on which to match the retention time.
precursor: If set to TRUE, then restrict the search to precursor peaks.
min.rel.int: The minimum relative intensity, in percentage (i.e.: float number between 0 and 100).
ms.mode: The MS mode. Set it to either 'neg' or 'pos'.
max.results: If set, it is used to limit the number of matches found for each M/Z value.
ms.level: The MS level to which you want to restrict your search. 0 means that you want to search in all levels.
include.ids: A list of IDs to which to restrict the final results. All IDs that are not in this list will be excluded.

Returns

A character vector of spectra IDs.

Method `searchMsEntries()`

DEPRECATED. Use searchForMassSpectra() instead.

Usage

BiodbConn$searchMsEntries(
  mz.min = NULL,
  mz.max = NULL,
  mz = NULL,
  mz.tol = NULL,
  mz.tol.unit = c("plain", "ppm"),
  rt = NULL,
  rt.unit = c("s", "min"),
  rt.tol = NULL,
  rt.tol.exp = NULL,
  chrom.col.ids = NULL,
  precursor = FALSE,
  min.rel.int = 0,
  ms.mode = NULL,
  max.results = 0,
  ms.level = 0
)

Arguments

mz.min: A vector of minimum M/Z values.
mz.max: A vector of maximum M/Z values. Its length must be the same as mz.min.
mz: A vector of M/Z values.
mz.tol: The M/Z tolerance, whose unit is defined by mz.tol.unit.
mz.tol.unit: The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.
rt: A vector of retention times to match. Used if input.df is not set. Unit is specified by rt.unit parameter.
rt.unit: The unit for submitted retention times. Either 's' or 'min'.
rt.tol: The plain tolerance (in seconds) for retention times: input.rt - rt.tol <= database.rt <= input.rt + rt.tol.
rt.tol.exp: A special exponent tolerance for retention times: input.rt - input.rt ** rt.tol.exp <= database.rt <= input.rt + input.rt ** rt.tol.exp. This exponent is applied on the RT value in seconds. If both rt.tol and rt.tol.exp are set, the inequality expression becomes input.rt - rt.tol - input.rt ** rt.tol.exp <= database.rt <= input.rt + rt.tol + input.rt ** rt.tol.exp.
chrom.col.ids: IDs of chromatographic columns on which to match the retention time.
precursor: If set to TRUE, then restrict the search to precursor peaks.
min.rel.int: The minimum relative intensity, in percentage (i.e.: float number between 0 and 100).
ms.mode: The MS mode. Set it to either 'neg' or 'pos'.
max.results: If set, it is used to limit the number of matches found for each M/Z value.
ms.level: The MS level to which you want to restrict your search. 0 means that you want to search in all levels.

Returns

A character vector of spectra IDs.

Method `searchMsPeaks()`

For each M/Z value, searches for matching MS spectra and returns the matching peaks.

Usage

BiodbConn$searchMsPeaks(
  input.df = NULL,
  mz = NULL,
  mz.tol = NULL,
  mz.tol.unit = c("plain", "ppm"),
  min.rel.int = 0,
  ms.mode = NULL,
  ms.level = 0,
  max.results = 0,
  chrom.col.ids = NULL,
  rt = NULL,
  rt.unit = c("s", "min"),
  rt.tol = NULL,
  rt.tol.exp = NULL,
  precursor = FALSE,
  precursor.rt.tol = NULL,
  insert.input.values = TRUE,
  prefix = NULL,
  compute = TRUE,
  fields = NULL,
  fieldsLimit = 0,
  input.df.colnames = c(mz = "mz", rt = "rt"),
  match.rt = FALSE
)

Arguments

input.df: A data frame taken as input for searchMsPeaks(). It must contain an 'mz' column, and optionally an 'rt' column.
mz: A vector of M/Z values to match. Used if input.df is not set.
mz.tol: The M/Z tolerance, whose unit is defined by mz.tol.unit.
mz.tol.unit: The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.
min.rel.int: The minimum relative intensity, in percentage (i.e.: float number between 0 and 100).
ms.mode: The MS mode. Set it to either 'neg' or 'pos'.
ms.level: The MS level to which you want to restrict your search. 0 means that you want to search in all levels.
max.results: If set, it is used to limit the number of matches found for each M/Z value.
chrom.col.ids: IDs of chromatographic columns on which to match the retention time.
rt: A vector of retention times to match. Used if input.df is not set. Unit is specified by rt.unit parameter.
rt.unit: The unit for submitted retention times. Either 's' or 'min'.
rt.tol: The plain tolerance (in seconds) for retention times: input.rt - rt.tol <= database.rt <= input.rt + rt.tol.
rt.tol.exp: A special exponent tolerance for retention times: input.rt - input.rt ** rt.tol.exp <= database.rt <= input.rt + input.rt ** rt.tol.exp. This exponent is applied on the RT value in seconds. If both rt.tol and rt.tol.exp are set, the inequality expression becomes input.rt - rt.tol - input.rt ** rt.tol.exp <= database.rt <= input.rt + rt.tol + input.rt ** rt.tol.exp.
precursor: If set to TRUE, then restrict the search to precursor peaks.
precursor.rt.tol: The RT tolerance used when matching the precursor.
insert.input.values: Insert input values at the beginning of the result data frame.
prefix: Add prefix on column names of result data frame.
compute: If set to TRUE, use the computed values when converting found entries to data frame.
fields: A character vector of field names to output. The data frame output will be restricted to this list of fields.
fieldsLimit: The maximum number of values to output for fields with multiple values. Set it to 0 to get all values.
input.df.colnames: Names of the columns in the input data frame.
match.rt: If set to TRUE, match also RT values.

Returns

A data frame with at least the input M/Z and RT columns, and annotation columns prefixed with prefix if set. One row is output for each match found. Thus if n matches are found for M/Z value x, there will be n rows for x, one per match. The number of matches found for each M/Z value is limited to max.results.

Method `msmsSearch()`

Searches MSMS spectra matching a template spectrum. The mz.tol parameter is applied on the precursor search.

Usage

BiodbConn$msmsSearch(
  spectrum,
  precursor.mz,
  mz.tol,
  mz.tol.unit = c("plain", "ppm"),
  ms.mode,
  npmin = 2,
  dist.fun = c("wcosine", "cosine", "pkernel", "pbachtttarya"),
  msms.mz.tol = 3,
  msms.mz.tol.min = 0.005,
  max.results = 0
)

Arguments

spectrum: A template spectrum to match inside the database.
precursor.mz: The M/Z value of the precursor peak of the mass spectrum.
mz.tol: The M/Z tolerance, whose unit is defined by mz.tol.unit.
mz.tol.unit: The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.
ms.mode: The MS mode. Set it to either 'neg' or 'pos'.
npmin: The minimum number of peaks to detect a match (2 is recommended).
dist.fun: The distance function used to compute the distance between two mass spectra.
msms.mz.tol: M/Z tolerance to apply while matching MSMS spectra. In PPM.
msms.mz.tol.min: Minimum of the M/Z tolerance (plain unit). If the M/Z tolerance computed with msms.mz.tol is lower than msms.mz.tol.min, then msms.mz.tol.min will be used.
max.results: If set, it is used to limit the number of matches found for each M/Z value.

Returns

A data frame with columns id, score and peak.*. Each peak.* column corresponds to a peak in the input spectrum, in the same order and gives the number of the peak that was matched with it inside the matched spectrum whose ID is inside the id column.
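
The relation between msms.mz.tol (in ppm) and msms.mz.tol.min (plain) amounts to taking the larger of two values. The base R sketch below computes the effective tolerance for a few M/Z values. The function name `msms_tol_sketch()` is hypothetical, and computing the ppm tolerance relative to the peak M/Z value is an assumption.

```r
# Hypothetical helper: effective plain tolerance used for each peak.
msms_tol_sketch <- function(mz, msms.mz.tol = 3, msms.mz.tol.min = 0.005) {
  # Convert the ppm tolerance to a plain value, then apply the floor
  pmax(mz * msms.mz.tol * 1e-6, msms.mz.tol.min)
}

# At M/Z 100 the ppm tolerance (0.0003) is below the floor, so 0.005 is used.
# At M/Z 2000 the ppm tolerance (0.006) is above the floor and is kept.
tols <- msms_tol_sketch(c(100, 2000))
```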

Method `collapseResultsDataFrame()`

Collapses rows of a results data frame, outputting a data frame with only one row for each MZ/RT value.

Usage

BiodbConn$collapseResultsDataFrame(
  results.df,
  mz.col = "mz",
  rt.col = "rt",
  sep = "|"
)

Arguments

results.df: Results data frame.
mz.col: The name of the M/Z column in the results data frame.
rt.col: The name of the RT column in the results data frame.
sep: The separator used to concatenate values, when collapsing results data frame.

Returns

A data frame with rows collapsed.
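To make the collapsing concrete, here is a minimal base R sketch of the idea. It is not the biodb implementation: the function `collapse_rows`, the sample data and the choice to drop duplicate values before concatenating are all assumptions made for illustration.

```r
# Hypothetical results data frame: two matches for the first MZ/RT pair.
results <- data.frame(
  mz = c(100.5, 100.5, 200.2),
  rt = c(12.0, 12.0, 30.5),
  accession = c("A1", "A2", "B7"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of biodb): keeps one row per MZ/RT pair
# and concatenates the values of all other columns with `sep`.
collapse_rows <- function(df, mz.col = "mz", rt.col = "rt", sep = "|") {
  keys <- paste(df[[mz.col]], df[[rt.col]], sep = "\r")
  groups <- split(df, factor(keys, levels = unique(keys)))
  rows <- lapply(groups, function(g) {
    out <- g[1, , drop = FALSE]
    for (col in setdiff(names(g), c(mz.col, rt.col))) {
      # Assumption: duplicated values are merged before concatenation.
      out[[col]] <- paste(unique(g[[col]]), collapse = sep)
    }
    out
  })
  collapsed <- do.call(rbind, rows)
  rownames(collapsed) <- NULL
  collapsed
}

collapsed <- collapse_rows(results)
print(collapsed)
```

Note that the concatenated columns become character columns, while the M/Z and RT columns keep their original type.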

Method `searchMzRange()`

Finds spectra in the given M/Z range. Returns a character vector of spectra IDs.

Usage

BiodbConn$searchMzRange(
  mz.min,
  mz.max,
  min.rel.int = 0,
  ms.mode = NULL,
  max.results = 0,
  precursor = FALSE,
  ms.level = 0
)

Arguments

mz.min: A vector of minimum M/Z values.
mz.max: A vector of maximum M/Z values. Its length must be the same as mz.min.
min.rel.int: The minimum relative intensity, in percentage (i.e.: float number between 0 and 100).
ms.mode: The MS mode. Set it to either 'neg' or 'pos'.
max.results: If set, it is used to limit the number of matches found for each M/Z value.
precursor: If set to TRUE, then restrict the search to precursor peaks.
ms.level: The MS level to which you want to restrict your search. 0 means that you want to search in all levels.

Returns

A character vector of spectra IDs.
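The semantics of mz.min, mz.max and min.rel.int can be sketched on a toy peak table in base R. The table layout, the column names and the function `search_mz_range` are hypothetical; a real connector queries its own database and may behave differently in details such as boundary inclusion.

```r
# Hypothetical peak table: one row per peak, with the ID of its spectrum.
peaks <- data.frame(
  id = c("S1", "S1", "S2", "S3"),
  mz = c(73.03, 145.05, 145.10, 301.20),
  rel.int = c(100, 40, 5, 100),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of biodb): returns the IDs of spectra that
# have at least one peak inside any [mz.min[i], mz.max[i]] interval, with a
# relative intensity of at least min.rel.int.
search_mz_range <- function(peaks, mz.min, mz.max, min.rel.int = 0) {
  stopifnot(length(mz.min) == length(mz.max))
  hit <- rep(FALSE, nrow(peaks))
  for (i in seq_along(mz.min)) {
    hit <- hit | (peaks$mz >= mz.min[i] & peaks$mz <= mz.max[i])
  }
  hit <- hit & peaks$rel.int >= min.rel.int
  unique(peaks$id[hit])
}

ids.all <- search_mz_range(peaks, mz.min = 145, mz.max = 145.2)
ids.strong <- search_mz_range(peaks, mz.min = 145, mz.max = 145.2,
                              min.rel.int = 10)
```

Here the second call discards spectrum S2, because its only peak in the range has a relative intensity of 5.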

Method `searchMzTol()`

Finds spectra containing a peak around the given M/Z value. Returns a character vector of spectra IDs.

Usage

BiodbConn$searchMzTol(
  mz,
  mz.tol,
  mz.tol.unit = "plain",
  min.rel.int = 0,
  ms.mode = NULL,
  max.results = 0,
  precursor = FALSE,
  ms.level = 0
)

Arguments

mz: A vector of M/Z values.
mz.tol: The M/Z tolerance, whose unit is defined by mz.tol.unit.
mz.tol.unit: The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.
min.rel.int: The minimum relative intensity, in percentage (i.e.: float number between 0 and 100).
ms.mode: The MS mode. Set it to either 'neg' or 'pos'.
max.results: If set, it is used to limit the number of matches found for each M/Z value.
precursor: If set to TRUE, then restrict the search to precursor peaks.
ms.level: The MS level to which you want to restrict your search. 0 means that you want to search in all levels.

Returns

A character vector of spectra IDs.
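The difference between the two tolerance units can be shown by converting a tolerance into an explicit M/Z window. This base R sketch assumes a symmetric window around each M/Z value and a PPM conversion of `mz * mz.tol * 1e-6`; the function `mz_tol_to_range` is hypothetical and is not a biodb method.

```r
# Illustrative helper (not part of biodb): turns M/Z values and a tolerance
# into minimum and maximum bounds, assuming a symmetric window.
mz_tol_to_range <- function(mz, mz.tol, mz.tol.unit = c("plain", "ppm")) {
  mz.tol.unit <- match.arg(mz.tol.unit)
  delta <- if (mz.tol.unit == "ppm") {
    mz * mz.tol * 1e-6          # tolerance grows with the M/Z value
  } else {
    rep(mz.tol, length(mz))     # same absolute tolerance for every value
  }
  data.frame(mz.min = mz - delta, mz.max = mz + delta)
}

range.ppm <- mz_tol_to_range(c(100, 500), 10, "ppm")
range.plain <- mz_tol_to_range(c(100, 500), 0.01, "plain")
print(range.ppm)
print(range.plain)
```

Under these assumptions, a 10 PPM tolerance gives a window of ±0.001 at M/Z 100 and ±0.005 at M/Z 500, whereas a plain tolerance of 0.01 gives the same window width at both values.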

Method `clone()`

The objects of this class are cloneable with this method.

Usage

BiodbConn$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples

# Create an instance with default settings:
mybiodb <- biodb::newInst()

# Get a compound CSV file database
chebi.tsv <- system.file("extdata", "chebi_extract.tsv", package='biodb')

# Create a connector
conn <- mybiodb$getFactory()$createConn('comp.csv.file', url=chebi.tsv)

# Get 10 identifiers from the database:
ids <- conn$getEntryIds(10)

# Get number of entries contained in the database:
n <- conn$getNbEntries()

# Terminate instance.
mybiodb$terminate()

pkrog/biodb documentation built on Nov. 29, 2022, 4:24 a.m.
