BiodbConn: The mother abstract class of all database connectors.

BiodbConn    R Documentation

The mother abstract class of all database connectors.

Description

The mother abstract class of all database connectors.

Details

This is the super class of all connector classes. All methods defined here are thus common to all connector classes. All connector classes inherit from this abstract class.

See section Fields for a list of the constructor's parameters. Concrete classes may implement direct web service methods or other specific methods, which are described in the documentation of each concrete class. Methods that call a database's web services directly are named "ws.*".

The constructor has the following arguments:

id: The identifier of the connector.

cache.id: The identifier used in the disk cache.

Super class

biodb::BiodbConnBase -> BiodbConn

Methods

Method new()

New instance initializer. Connector objects must not be created directly. Instead, you create new connector instances through the BiodbFactory instance.

Usage
BiodbConn$new(id = NA_character_, cache.id = NA_character_, bdb, ...)
Arguments
id

The ID of the connector instance.

cache.id

The Cache ID of the connector instance.

bdb

The BiodbMain instance.

...

Remaining arguments will be passed to the constructor of the super class.

Returns

Nothing.


Method getBiodb()

Returns the biodb main class instance to which this object is attached.

Usage
BiodbConn$getBiodb()
Returns

The main biodb instance.


Method getId()

Get the identifier of this connector.

Usage
BiodbConn$getId()
Returns

The identifier of this connector.


Method print()

Prints a description of this connector.

Usage
BiodbConn$print()
Returns

Nothing.


Method correctIds()

Correct a vector of IDs by formatting them to the database official format, if required and possible.

Usage
BiodbConn$correctIds(ids)
Arguments
ids

A character vector of IDs.

Returns

The vector of IDs corrected.


Method getEntry()

Return the entry corresponding to this ID. You can pass a vector of IDs, and you will get a list of entries.

Usage
BiodbConn$getEntry(id, drop = TRUE, nulls = TRUE)
Arguments
id

A character vector containing entry identifiers.

drop

If set to TRUE and only one entry is requested, then the returned value will be a single BiodbEntry object, otherwise it will be a list of BiodbEntry objects.

nulls

If set to TRUE, NULL entries are preserved. This ensures that the output list has the same length as the input vector id. Otherwise they are removed from the final list.

Returns

A list of BiodbEntry objects, the same size as the vector of IDs. The list will contain NULL values for invalid IDs. If drop is set to TRUE and only one entry was requested, then a single BiodbEntry is returned instead of a list.


Method getCacheFile()

Get the path to the persistent cache file.

Usage
BiodbConn$getCacheFile(entry.id)
Arguments
entry.id

A character vector of database entry identifiers (e.g.: accession numbers).

Returns

A character vector, the same length as the vector of IDs, containing the paths to the cache files corresponding to the requested entry IDs.


Method getEntryContent()

Get the contents of database entries from IDs (accession numbers).

Usage
BiodbConn$getEntryContent(id)
Arguments
id

A character vector of entry IDs.

Returns

A character vector containing the contents of the requested IDs. If no content is available for an entry ID, then NA will be used.


Method getEntryContentFromDb()

Get the contents of entries directly from the database. A direct request or an access to the database will be made in order to retrieve the contents. No access to the biodb cache system will be made.

Usage
BiodbConn$getEntryContentFromDb(entry.id)
Arguments
entry.id

A character vector with the IDs of entries to retrieve.

Returns

A character vector, the same size as entry.id, with the contents of the requested entries. An NA value will be set for the content of each entry for which the retrieval failed.


Method getEntryContentRequest()

Gets the URL to use in order to get the contents of the specified entries.

Usage
BiodbConn$getEntryContentRequest(entry.id, concatenate = TRUE, max.length = 0)
Arguments
entry.id

A character vector with the IDs of entries to retrieve.

concatenate

If set to TRUE, then try to build as few URLs as possible, sending requests with several identifiers at once.

max.length

The maximum length of the URLs to return, in number of characters.

Returns

A vector of URL strings.
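The interplay between concatenate and max.length can be illustrated with a minimal base R sketch. It is not biodb's implementation: the helper build_urls(), the example.org base URL, the comma separator, and the reading of max.length = 0 as "no limit" are all assumptions made for illustration.

```r
# Hypothetical helper (not part of biodb): packs as many IDs as possible into
# each URL while keeping every URL at or below max.length characters.
# Assumption: max.length = 0 means "no limit".
build_urls <- function(base.url, ids, max.length = 0, sep = ",") {
  urls <- character()
  current <- NULL
  for (id in ids) {
    candidate <- if (is.null(current)) paste0(base.url, id)
                 else paste(current, id, sep = sep)
    too.long <- max.length > 0 && nchar(candidate) > max.length
    if (!is.null(current) && too.long) {
      # Close the current URL and start a new one with this ID.
      urls <- c(urls, current)
      current <- paste0(base.url, id)
    } else {
      current <- candidate
    }
  }
  c(urls, current)
}

base <- "https://example.org/entries?ids="  # hypothetical endpoint
all.in.one <- build_urls(base, c("1", "2", "3"))
limited <- build_urls(base, c("1", "2", "3"), max.length = nchar(base) + 3)
print(all.in.one)
print(limited)
```

With no limit all three IDs end up in one URL. With a limit that only leaves room for two IDs, the third one is moved to a second URL.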


Method getEntryIds()

Get entry identifiers from the database. More arguments can be given, depending on implementation in specific databases. For mass databases the ms.level argument can also be set.

Usage
BiodbConn$getEntryIds(max.results = 0, ...)
Arguments
max.results

The maximum number of elements to return from the method.

...

Arguments specific to connectors.

Returns

A character vector containing entry IDs from the database. An empty vector for a remote database may mean that the database does not support requesting entry accessions.


Method getNbEntries()

Get the number of entries contained in this database.

Usage
BiodbConn$getNbEntries(count = FALSE)
Arguments
count

If set to TRUE and no straightforward way exists to get the number of entries, count the output of getEntryIds().

Returns

The number of entries in the database, as an integer.


Method isEditable()

Tests if this connector is able to edit the database (i.e.: the connector class implements the interface BiodbEditable). If this connector is editable, then you can call allowEditing() to enable editing.

Usage
BiodbConn$isEditable()
Returns

Returns TRUE if the database is editable.


Method editingIsAllowed()

Tests if editing is allowed.

Usage
BiodbConn$editingIsAllowed()
Returns

TRUE if editing is allowed for this database, FALSE otherwise.


Method allowEditing()

Allows editing for this database.

Usage
BiodbConn$allowEditing()
Returns

Nothing.


Method disallowEditing()

Disallows editing for this database.

Usage
BiodbConn$disallowEditing()
Returns

Nothing.


Method setEditingAllowed()

Allow or disallow editing for this database.

Usage
BiodbConn$setEditingAllowed(allow)
Arguments
allow

A logical value.

Returns

Nothing.


Method addNewEntry()

Adds a new entry to the database. The passed entry must have been previously created from scratch using BiodbFactory$createNewEntry() or cloned from an existing entry using BiodbEntry$clone().

Usage
BiodbConn$addNewEntry(entry)
Arguments
entry

The new entry to add. It must be a valid BiodbEntry object.

Returns

Nothing.


Method isWritable()

Tests if this connector is able to write into the database. If this connector is writable, then you can call allowWriting() to enable writing.

Usage
BiodbConn$isWritable()
Returns

Returns TRUE if the database is writable.


Method allowWriting()

Allows the connector to write into this database.

Usage
BiodbConn$allowWriting()
Returns

Nothing.


Method disallowWriting()

Disallows the connector to write into this database.

Usage
BiodbConn$disallowWriting()
Returns

Nothing.


Method setWritingAllowed()

Allows or disallows writing for this database.

Usage
BiodbConn$setWritingAllowed(allow)
Arguments
allow

If set to TRUE, allows writing.

Returns

Nothing.


Method writingIsAllowed()

Tests if the connector is allowed to write into the database.

Usage
BiodbConn$writingIsAllowed()
Returns

TRUE if writing is allowed for this database, FALSE otherwise.


Method write()

Writes into the database. All modifications made to the database since the last time write() was called will be saved.

Usage
BiodbConn$write()
Returns

Nothing.


Method isSearchableByField()

Tests if a field can be used to search entries when using method searchForEntries().

Usage
BiodbConn$isSearchableByField(field = NULL, field.type = NULL)
Arguments
field

The name of the field.

field.type

The field type.

Returns

Returns TRUE if the database is searchable using the specified field or searchable by any field of the specified type, FALSE otherwise.


Method getSearchableFields()

Get the list of all searchable fields.

Usage
BiodbConn$getSearchableFields()
Returns

A character vector containing all searchable fields for this connector.


Method searchForEntries()

Searches the database for entries whose fields match the specified filters. Returns a character vector of entry IDs.

Usage
BiodbConn$searchForEntries(fields = NULL, max.results = 0)
Arguments
fields

A list of fields on which to filter entries. To get a match, all fields must be matched (i.e. logical AND). The keys of the list are the entry field names on which to filter, and the values are the filtering parameters. For character fields, the filter parameter is a character vector in which all strings must be found inside the field's value. For numeric fields, the filter parameter is either a list specifying a min-max range (list(min=1.0, max=2.5)) or a value with a tolerance in delta (list(value=2.0, delta=0.1)) or ppm (list(value=2.0, ppm=1.0)).

max.results

If set, the number of returned IDs is limited to this number.

Returns

A character vector of entry IDs whose fields match the requested filters.
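The three forms of numeric filter accepted in fields (min-max range, value with delta, value with ppm) can be sketched in base R as follows. This is only an illustration of the semantics described for the fields argument, not biodb's code: the helper matches_numeric_filter() is hypothetical, and the inclusive bounds and the ppm tolerance being relative to value are assumptions.

```r
# Hypothetical helper (not part of biodb): tests numeric field values against
# a single numeric filter written in one of the three documented forms.
# Assumptions: bounds are inclusive; ppm is relative to filter$value.
matches_numeric_filter <- function(x, filter) {
  if (!is.null(filter[["min"]]) || !is.null(filter[["max"]])) {
    lo <- if (is.null(filter[["min"]])) -Inf else filter[["min"]]
    hi <- if (is.null(filter[["max"]])) Inf else filter[["max"]]
  } else if (!is.null(filter[["delta"]])) {
    lo <- filter[["value"]] - filter[["delta"]]
    hi <- filter[["value"]] + filter[["delta"]]
  } else if (!is.null(filter[["ppm"]])) {
    tol <- filter[["value"]] * filter[["ppm"]] * 1e-6
    lo <- filter[["value"]] - tol
    hi <- filter[["value"]] + tol
  } else {
    stop("Unrecognised numeric filter.")
  }
  x >= lo & x <= hi
}

masses <- c(0.9, 1.5, 2.05, 2.2)
in.range <- matches_numeric_filter(masses, list(min = 1.0, max = 2.5))
in.delta <- matches_numeric_filter(masses, list(value = 2.0, delta = 0.1))
in.ppm   <- matches_numeric_filter(masses, list(value = 2.0, ppm = 1.0))
print(rbind(in.range, in.delta, in.ppm))
```

Because all fields must match, a search over several fields would combine such per-field results with a logical AND.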


Method searchByName()

DEPRECATED. Use searchForEntries() instead.

Usage
BiodbConn$searchByName(name, max.results = 0)
Arguments
name

A character value to search inside name fields.

max.results

If set, the number of returned IDs is limited to this number.

Returns

A character vector of entry IDs whose name matches the requested name.


Method isDownloadable()

Tests if the connector can download the database.

Usage
BiodbConn$isDownloadable()
Returns

Returns TRUE if the database is downloadable.


Method isDownloaded()

Tests if the database has been downloaded.

Usage
BiodbConn$isDownloaded()
Returns

TRUE if the database content has already been downloaded.


Method requiresDownload()

Tests if the connector requires the download of the database.

Usage
BiodbConn$requiresDownload()
Returns

TRUE if the connector requires download of the database.


Method getDownloadPath()

Gets the path where the downloaded content is written.

Usage
BiodbConn$getDownloadPath()
Returns

The path where the downloaded database is written.


Method setDownloadedFile()

Set the downloaded file into the cache.

Usage
BiodbConn$setDownloadedFile(src, action = c("copy", "move"))
Arguments
src

Path to the downloaded file.

action

Specifies if files have to be moved or copied into the cache.

Returns

Nothing.


Method isExtracted()

Tests if the downloaded database has been extracted (in case the database needs extraction).

Usage
BiodbConn$isExtracted()
Returns

TRUE if the downloaded database content has been extracted, FALSE otherwise.


Method download()

Downloads the database content locally.

Usage
BiodbConn$download()
Returns

Nothing.


Method isRemotedb()

Tests if the connector is connected to a remote database.

Usage
BiodbConn$isRemotedb()
Returns

Returns TRUE if the database is a remote database.


Method isCompounddb()

Tests if the connector's database is a compound database.

Usage
BiodbConn$isCompounddb()
Returns

Returns TRUE if the database is a compound database.


Method searchCompound()

This method is deprecated. Use searchForEntries() instead. Searches for compounds by name and/or by mass. At least one of name or mass must be set.

Usage
BiodbConn$searchCompound(
  name = NULL,
  mass = NULL,
  mass.field = NULL,
  mass.tol = 0.01,
  mass.tol.unit = "plain",
  max.results = 0
)
Arguments
name

The name of a compound to search for.

mass

The searched mass.

mass.field

For searching by mass, you must indicate a mass field to use ('monoisotopic.mass', 'molecular.mass', 'average.mass' or 'nominal.mass').

mass.tol

The tolerance value on the molecular mass.

mass.tol.unit

The type of mass tolerance. Either 'plain' or 'ppm'.

max.results

The maximum number of matches to return.

description

A character vector of words or expressions to search for inside description field. The words will be searched in order. A match will be made only if all words are inside the description field.

Returns

A character vector of entry IDs.


Method annotateMzValues()

Annotates a mass spectrum with the database. For each matching entry the entry field values will be set inside columns appended to the data frame. Names of these columns will use a common prefix in order to distinguish them from other data from the input data frame.

Usage
BiodbConn$annotateMzValues(
  x,
  mz.tol,
  ms.mode,
  mz.tol.unit = c("plain", "ppm"),
  mass.field = "monoisotopic.mass",
  max.results = 3,
  mz.col = "mz",
  fields = NULL,
  prefix = NULL,
  insert.input.values = TRUE,
  fieldsLimit = 0
)
Arguments
x

Either a data frame or a numeric vector containing the M/Z values.

mz.tol

The tolerance on the M/Z values.

ms.mode

The MS mode. Set it to either 'neg' or 'pos'.

mz.tol.unit

The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.

mass.field

The mass field to use for matching M/Z values. One of: 'monoisotopic.mass', 'molecular.mass', 'average.mass', 'nominal.mass'.

max.results

If set, it is used to limit the number of matches found for each M/Z value. To get all the matches, set this parameter to NA_integer_. Default value is 3.

mz.col

The name of the column where to find M/Z values in case x is a data frame.

fields

A character vector containing the additional entry fields you would like to get for each matched entry. Each field will be output in a different column.

prefix

A prefix that will be inserted before the name of each added column in the output. By default it will be set to the name of the database followed by a dot.

insert.input.values

Insert input values at the beginning of the result data frame.

fieldsLimit

The maximum number of values to output for fields with multiple values. Set it to 0 to get all values.

Returns

A data frame containing the input values, and annotation columns appended at the end. The first annotation column contains the IDs of the matched entries. The following columns contain the fields you have requested through the fields parameter.


Method isMassdb()

Tests if the connector's database is a mass spectra database.

Usage
BiodbConn$isMassdb()
Returns

Returns TRUE if the database is a mass database.


Method checkDb()

Checks that the database is correct by trying to retrieve all its entries.

Usage
BiodbConn$checkDb()
Returns

Nothing.


Method getAllVolatileCacheEntries()

Get all entries stored in the memory cache (volatile cache).

Usage
BiodbConn$getAllVolatileCacheEntries()
Returns

A list of BiodbEntry instances.


Method getAllCacheEntries()

This method is deprecated. Use getAllVolatileCacheEntries() instead.

Usage
BiodbConn$getAllCacheEntries()
Returns

All entries cached in memory.


Method deleteAllEntriesFromVolatileCache()

Delete all entries from the volatile cache (memory cache).

Usage
BiodbConn$deleteAllEntriesFromVolatileCache()
Returns

Nothing.


Method deleteAllEntriesFromPersistentCache()

Delete all entries from the persistent cache (disk cache).

Usage
BiodbConn$deleteAllEntriesFromPersistentCache(deleteVolatile = TRUE)
Arguments
deleteVolatile

If TRUE, also deletes all entries from the volatile cache (memory cache).

Returns

Nothing.


Method deleteWholePersistentCache()

Delete all files associated with this connector from the persistent cache (disk cache).

Usage
BiodbConn$deleteWholePersistentCache(deleteVolatile = TRUE)
Arguments
deleteVolatile

If TRUE, also deletes all entries from the volatile cache (memory cache).

Returns

Nothing.


Method deleteAllCacheEntries()

Delete all entries from the memory cache. This method is deprecated, please use deleteAllEntriesFromVolatileCache() instead.

Usage
BiodbConn$deleteAllCacheEntries()
Returns

Nothing.


Method getCacheId()

Gets the ID used by this connector in the disk cache.

Usage
BiodbConn$getCacheId()
Returns

The cache ID of this connector.


Method makesRefToEntry()

Tests if some entry of this database makes reference to another entry of another database.

Usage
BiodbConn$makesRefToEntry(id, db, oid, any = FALSE, recurse = FALSE)
Arguments
id

A character vector of entry IDs from the connector's database.

db

Another database connector.

oid

An entry ID from database db.

any

If set to TRUE, returns a single logical value: TRUE if any entry contains a reference to oid, FALSE otherwise.

recurse

If set to TRUE, the algorithm will follow all references to entries from other databases, to see if it can establish an indirect link to oid.

Returns

A logical vector, the same size as id, with TRUE for each entry making reference to oid, and FALSE otherwise.


Method makeRequest()

Makes a BiodbRequest instance using the passed parameters, and sets itself as the associated connector.

Usage
BiodbConn$makeRequest(...)
Arguments
...

Those parameters are passed to the initializer of BiodbRequest.

Returns

The BiodbRequest instance.


Method getEntryImageUrl()

Gets the URL to a picture of the entry (e.g.: a picture of the molecule in case of a compound entry).

Usage
BiodbConn$getEntryImageUrl(entry.id)
Arguments
entry.id

A character vector containing entry IDs.

Returns

A character vector, the same length as entry.id, containing for each entry ID either a URL or NA if no URL exists.


Method getEntryPageUrl()

Gets the URL to the page of the entry on the database web site.

Usage
BiodbConn$getEntryPageUrl(entry.id)
Arguments
entry.id

A character vector with the IDs of entries to retrieve.

Returns

A list of BiodbUrl objects, the same length as entry.id.


Method getChromCol()

Gets a list of chromatographic columns contained in this database.

Usage
BiodbConn$getChromCol(ids = NULL)
Arguments
ids

A character vector of entry identifiers (i.e.: accession numbers). Used to restrict the set of entries on which to run the algorithm.

Returns

A data.frame with two columns, one for the ID 'id' and another one for the title 'title'.


Method getMatchingMzField()

Gets the field to use for M/Z matching.

Usage
BiodbConn$getMatchingMzField()
Returns

The name of the field (one of peak.mztheo or peak.mzexp).


Method setMatchingMzField()

Sets the field to use for M/Z matching.

Usage
BiodbConn$setMatchingMzField(field = c("peak.mztheo", "peak.mzexp"))
Arguments
field

The field to use for matching.

Returns

Nothing.


Method getMzValues()

Gets a list of M/Z values contained inside the database.

Usage
BiodbConn$getMzValues(
  ms.mode = NULL,
  max.results = 0,
  precursor = FALSE,
  ms.level = 0
)
Arguments
ms.mode

The MS mode. Set it to either 'neg' or 'pos' to limit the output to one mode.

max.results

If set, it is used to limit the size of the output.

precursor

If set to TRUE, then restrict the search to precursor peaks.

ms.level

The MS level to which you want to restrict your search. 0 means that you want to search in all levels.

Returns

A numeric vector containing M/Z values.


Method getNbPeaks()

Gets the number of peaks contained in the database.

Usage
BiodbConn$getNbPeaks(mode = NULL, ids = NULL)
Arguments
mode

The MS mode. Set it to either 'neg' or 'pos' to limit the counting to one mode.

ids

A character vector of entry identifiers (i.e.: accession numbers). Used to restrict the set of entries on which to run the algorithm.

Returns

The number of peaks, as an integer.


Method filterEntriesOnRt()

Filters a list of entries on retention time values.

Usage
BiodbConn$filterEntriesOnRt(
  entry.ids,
  rt,
  rt.unit,
  rt.tol,
  rt.tol.exp,
  chrom.col.ids,
  match.rt
)
Arguments
entry.ids

A character vector of entry IDs.

rt

A vector of retention times to match. Used if input.df is not set. Unit is specified by rt.unit parameter.

rt.unit

The unit for submitted retention times. Either 's' or 'min'.

rt.tol

The plain tolerance (in seconds) for retention times: input.rt - rt.tol <= database.rt <= input.rt + rt.tol.

rt.tol.exp

A special exponent tolerance for retention times: input.rt - input.rt ** rt.tol.exp <= database.rt <= input.rt + input.rt ** rt.tol.exp. This exponent is applied on the RT value in seconds. If both rt.tol and rt.tol.exp are set, the inequality expression becomes input.rt - rt.tol - input.rt ** rt.tol.exp <= database.rt <= input.rt + rt.tol + input.rt ** rt.tol.exp.

chrom.col.ids

IDs of chromatographic columns on which to match the retention time.

match.rt

If set to TRUE, filters on RT values, otherwise does not do any filtering.

Returns

A character vector containing entry IDs after filtering.
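The retention time window defined by rt.tol and rt.tol.exp can be computed with a few lines of base R. The sketch below is not taken from biodb: the helper rt_window() is hypothetical, and converting input values given in minutes to seconds before applying the tolerances is an assumption based on the statement that both tolerances operate on seconds.

```r
# Hypothetical helper (not part of biodb): returns the accepted interval of
# database retention times, in seconds, for each input retention time.
# Assumption: inputs in minutes are converted to seconds first.
rt_window <- function(input.rt, rt.unit = c("s", "min"),
                      rt.tol = NULL, rt.tol.exp = NULL) {
  rt.unit <- match.arg(rt.unit)
  rt.s <- if (rt.unit == "min") input.rt * 60 else input.rt
  half.width <- 0
  # Plain tolerance: a fixed number of seconds on each side.
  if (!is.null(rt.tol)) half.width <- half.width + rt.tol
  # Exponent tolerance: grows with the retention time itself.
  if (!is.null(rt.tol.exp)) half.width <- half.width + rt.s ^ rt.tol.exp
  data.frame(rt.min = rt.s - half.width, rt.max = rt.s + half.width)
}

plain.only <- rt_window(100, "s", rt.tol = 5)
combined <- rt_window(2, "min", rt.tol = 5, rt.tol.exp = 0.5)
print(plain.only)
print(combined)
```

When both tolerances are given, the two half-widths add up, which matches the combined inequality stated for rt.tol.exp.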


Method searchForMassSpectra()

Searches for entries (i.e.: spectra) that contain a peak around the given M/Z value. Entries can also be filtered on RT values. You can input either a list of M/Z values through mz argument and set a tolerance with mz.tol argument, or two lists of minimum and maximum M/Z values through mz.min and mz.max arguments.

Usage
BiodbConn$searchForMassSpectra(
  mz.min = NULL,
  mz.max = NULL,
  mz = NULL,
  mz.tol = NULL,
  mz.tol.unit = c("plain", "ppm"),
  rt = NULL,
  rt.unit = c("s", "min"),
  rt.tol = NULL,
  rt.tol.exp = NULL,
  chrom.col.ids = NULL,
  precursor = FALSE,
  min.rel.int = 0,
  ms.mode = NULL,
  max.results = 0,
  ms.level = 0,
  include.ids = NULL
)
Arguments
mz.min

A vector of minimum M/Z values.

mz.max

A vector of maximum M/Z values. Its length must be the same as mz.min.

mz

A vector of M/Z values.

mz.tol

The M/Z tolerance, whose unit is defined by mz.tol.unit.

mz.tol.unit

The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.

rt

A vector of retention times to match. Used if input.df is not set. Unit is specified by rt.unit parameter.

rt.unit

The unit for submitted retention times. Either 's' or 'min'.

rt.tol

The plain tolerance (in seconds) for retention times: input.rt - rt.tol <= database.rt <= input.rt + rt.tol.

rt.tol.exp

A special exponent tolerance for retention times: input.rt - input.rt ** rt.tol.exp <= database.rt <= input.rt + input.rt ** rt.tol.exp. This exponent is applied on the RT value in seconds. If both rt.tol and rt.tol.exp are set, the inequality expression becomes input.rt - rt.tol - input.rt ** rt.tol.exp <= database.rt <= input.rt + rt.tol + input.rt ** rt.tol.exp.

chrom.col.ids

IDs of chromatographic columns on which to match the retention time.

precursor

If set to TRUE, then restrict the search to precursor peaks.

min.rel.int

The minimum relative intensity, in percentage (i.e.: float number between 0 and 100).

ms.mode

The MS mode. Set it to either 'neg' or 'pos'.

max.results

If set, it is used to limit the number of matches found for each M/Z value.

ms.level

The MS level to which you want to restrict your search. 0 means that you want to search in all levels.

include.ids

A list of IDs to which to restrict the final results. All IDs that are not in this list will be excluded.

Returns

A character vector of spectra IDs.
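The two ways of specifying M/Z values (mz with mz.tol, or mz.min with mz.max) are related by a simple conversion, sketched here in base R. The helper mz_range() is hypothetical and is not biodb's code; the symmetric window and the ppm tolerance being relative to each M/Z value are assumptions.

```r
# Hypothetical helper (not part of biodb): converts M/Z values and a tolerance
# into the equivalent vectors of minimum and maximum M/Z values.
# Assumptions: the window is symmetric; ppm is relative to each M/Z value.
mz_range <- function(mz, mz.tol, mz.tol.unit = c("plain", "ppm")) {
  mz.tol.unit <- match.arg(mz.tol.unit)
  half.width <- if (mz.tol.unit == "ppm") mz * mz.tol * 1e-6 else mz.tol
  list(mz.min = mz - half.width, mz.max = mz + half.width)
}

# 10 ppm at m/z 500 is a half-width of 0.005.
ppm.range <- mz_range(500, 10, "ppm")
plain.range <- mz_range(c(100, 200), 0.01)
print(ppm.range)
print(plain.range)
```

A plain tolerance gives the same absolute window at every M/Z value, whereas a ppm tolerance widens with the M/Z value.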


Method searchMsEntries()

DEPRECATED. Use searchForMassSpectra() instead.

Usage
BiodbConn$searchMsEntries(
  mz.min = NULL,
  mz.max = NULL,
  mz = NULL,
  mz.tol = NULL,
  mz.tol.unit = c("plain", "ppm"),
  rt = NULL,
  rt.unit = c("s", "min"),
  rt.tol = NULL,
  rt.tol.exp = NULL,
  chrom.col.ids = NULL,
  precursor = FALSE,
  min.rel.int = 0,
  ms.mode = NULL,
  max.results = 0,
  ms.level = 0
)
Arguments
mz.min

A vector of minimum M/Z values.

mz.max

A vector of maximum M/Z values. Its length must be the same as mz.min.

mz

A vector of M/Z values.

mz.tol

The M/Z tolerance, whose unit is defined by mz.tol.unit.

mz.tol.unit

The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.

rt

A vector of retention times to match. Used if input.df is not set. Unit is specified by rt.unit parameter.

rt.unit

The unit for submitted retention times. Either 's' or 'min'.

rt.tol

The plain tolerance (in seconds) for retention times: input.rt - rt.tol <= database.rt <= input.rt + rt.tol.

rt.tol.exp

A special exponent tolerance for retention times: input.rt - input.rt ** rt.tol.exp <= database.rt <= input.rt + input.rt ** rt.tol.exp. This exponent is applied on the RT value in seconds. If both rt.tol and rt.tol.exp are set, the inequality expression becomes input.rt - rt.tol - input.rt ** rt.tol.exp <= database.rt <= input.rt + rt.tol + input.rt ** rt.tol.exp.

chrom.col.ids

IDs of chromatographic columns on which to match the retention time.

precursor

If set to TRUE, then restrict the search to precursor peaks.

min.rel.int

The minimum relative intensity, in percentage (i.e.: float number between 0 and 100).

ms.mode

The MS mode. Set it to either 'neg' or 'pos'.

max.results

If set, it is used to limit the number of matches found for each M/Z value.

ms.level

The MS level to which you want to restrict your search. 0 means that you want to search in all levels.

Returns

A character vector of spectra IDs.


Method searchMsPeaks()

For each M/Z value, searches for matching MS spectra and returns the matching peaks.

Usage
BiodbConn$searchMsPeaks(
  input.df = NULL,
  mz = NULL,
  mz.tol = NULL,
  mz.tol.unit = c("plain", "ppm"),
  min.rel.int = 0,
  ms.mode = NULL,
  ms.level = 0,
  max.results = 0,
  chrom.col.ids = NULL,
  rt = NULL,
  rt.unit = c("s", "min"),
  rt.tol = NULL,
  rt.tol.exp = NULL,
  precursor = FALSE,
  precursor.rt.tol = NULL,
  insert.input.values = TRUE,
  prefix = NULL,
  compute = TRUE,
  fields = NULL,
  fieldsLimit = 0,
  input.df.colnames = c(mz = "mz", rt = "rt"),
  match.rt = FALSE
)
Arguments
input.df

A data frame taken as input for searchMsPeaks(). It must contain an 'mz' column, and optionally an 'rt' column.

mz

A vector of M/Z values to match. Used if input.df is not set.

mz.tol

The M/Z tolerance, whose unit is defined by mz.tol.unit.

mz.tol.unit

The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.

min.rel.int

The minimum relative intensity, in percentage (i.e.: float number between 0 and 100).

ms.mode

The MS mode. Set it to either 'neg' or 'pos'.

ms.level

The MS level to which you want to restrict your search. 0 means that you want to search in all levels.

max.results

If set, it is used to limit the number of matches found for each M/Z value.

chrom.col.ids

IDs of chromatographic columns on which to match the retention time.

rt

A vector of retention times to match. Used if input.df is not set. Unit is specified by rt.unit parameter.

rt.unit

The unit for submitted retention times. Either 's' or 'min'.

rt.tol

The plain tolerance (in seconds) for retention times: input.rt - rt.tol <= database.rt <= input.rt + rt.tol.

rt.tol.exp

A special exponent tolerance for retention times: input.rt - input.rt ** rt.tol.exp <= database.rt <= input.rt + input.rt ** rt.tol.exp. This exponent is applied on the RT value in seconds. If both rt.tol and rt.tol.exp are set, the inequality expression becomes input.rt - rt.tol - input.rt ** rt.tol.exp <= database.rt <= input.rt + rt.tol + input.rt ** rt.tol.exp.

precursor

If set to TRUE, then restrict the search to precursor peaks.

precursor.rt.tol

The RT tolerance used when matching the precursor.

insert.input.values

Insert input values at the beginning of the result data frame.

prefix

Add prefix on column names of result data frame.

compute

If set to TRUE, use the computed values when converting found entries to data frame.

fields

A character vector of field names to output. The data frame output will be restricted to this list of fields.

fieldsLimit

The maximum number of values to output for fields with multiple values. Set it to 0 to get all values.

input.df.colnames

Names of the columns in the input data frame.

match.rt

If set to TRUE, match also RT values.

Returns

A data frame with at least the input M/Z and RT columns, and annotation columns prefixed with prefix if set. One row is output for each match found. Thus if n matches are found for M/Z value x, then there will be n rows for x, one per match. The number of matches found for each M/Z value is limited to max.results.


Method msmsSearch()

Searches MSMS spectra matching a template spectrum. The mz.tol parameter is applied on the precursor search.

Usage
BiodbConn$msmsSearch(
  spectrum,
  precursor.mz,
  mz.tol,
  mz.tol.unit = c("plain", "ppm"),
  ms.mode,
  npmin = 2,
  dist.fun = c("wcosine", "cosine", "pkernel", "pbachtttarya"),
  msms.mz.tol = 3,
  msms.mz.tol.min = 0.005,
  max.results = 0
)
Arguments
spectrum

A template spectrum to match inside the database.

precursor.mz

The M/Z value of the precursor peak of the mass spectrum.

mz.tol

The M/Z tolerance, whose unit is defined by mz.tol.unit.

mz.tol.unit

The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.

ms.mode

The MS mode. Set it to either 'neg' or 'pos'.

npmin

The minimum number of matching peaks required to detect a match (2 is recommended).

dist.fun

The distance function used to compute the distance between two mass spectra.

msms.mz.tol

M/Z tolerance to apply while matching MSMS spectra. In PPM.

msms.mz.tol.min

Minimum of the M/Z tolerance (plain unit). If the M/Z tolerance computed with msms.mz.tol is lower than msms.mz.tol.min, then msms.mz.tol.min will be used.

max.results

If set, it is used to limit the number of matches found for each M/Z value.

Returns

A data frame with columns id, score and peak.*. Each peak.* column corresponds to a peak in the input spectrum, in the same order and gives the number of the peak that was matched with it inside the matched spectrum whose ID is inside the id column.
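The relation between msms.mz.tol (in ppm) and msms.mz.tol.min (plain unit) amounts to taking the larger of two tolerances. The base R sketch below illustrates this. The helper msms_tolerance() is hypothetical, and computing the ppm tolerance relative to each peak's M/Z value is an assumption rather than a description of biodb's internals.

```r
# Hypothetical helper (not part of biodb): effective plain M/Z tolerance used
# for each peak when matching MSMS spectra.
# Assumption: the ppm tolerance is relative to the peak's M/Z value.
msms_tolerance <- function(mz, msms.mz.tol = 3, msms.mz.tol.min = 0.005) {
  pmax(mz * msms.mz.tol * 1e-6, msms.mz.tol.min)
}

# At m/z 100, 3 ppm is 0.0003, so the minimum of 0.005 applies.
# At m/z 2000, 3 ppm is 0.006, which exceeds the minimum.
tol <- msms_tolerance(c(100, 2000))
print(tol)
```

The floor keeps the tolerance from becoming unrealistically small for low M/Z peaks.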


Method collapseResultsDataFrame()

Collapse rows of a results data frame, by outputting a data frame with only one row for each MZ/RT value.

Usage
BiodbConn$collapseResultsDataFrame(
  results.df,
  mz.col = "mz",
  rt.col = "rt",
  sep = "|"
)
Arguments
results.df

Results data frame.

mz.col

The name of the M/Z column in the results data frame.

rt.col

The name of the RT column in the results data frame.

sep

The separator used to concatenate values, when collapsing results data frame.

Returns

A data frame with rows collapsed.
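The collapsing operation can be pictured with a small base R sketch. It is an illustration only, not biodb's implementation: the helper collapse_results() is hypothetical, and dropping NA values and duplicates before concatenation, as well as returning the collapsed columns as character, are assumptions.

```r
# Hypothetical helper (not part of biodb): keeps one row per distinct
# (mz, rt) pair and concatenates the values of all other columns with `sep`.
# Assumptions: NA values and duplicated values are dropped before pasting.
collapse_results <- function(results.df, mz.col = "mz", rt.col = "rt",
                             sep = "|") {
  key.cols <- intersect(c(mz.col, rt.col), colnames(results.df))
  value.cols <- setdiff(colnames(results.df), key.cols)
  # One group per distinct combination of key values, in order of appearance.
  keys <- do.call(paste, c(results.df[key.cols], sep = "\r"))
  groups <- split(seq_len(nrow(results.df)),
                  factor(keys, levels = unique(keys)))
  rows <- lapply(groups, function(idx) {
    row <- results.df[idx[1], key.cols, drop = FALSE]
    for (col in value.cols) {
      vals <- unique(results.df[idx, col])
      row[[col]] <- paste(vals[!is.na(vals)], collapse = sep)
    }
    row
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

results <- data.frame(mz = c(100, 100, 200), rt = c(10, 10, 20),
                      id = c("A", "B", "C"), stringsAsFactors = FALSE)
collapsed <- collapse_results(results)
print(collapsed)
```

In this example the two matches found for M/Z 100 are merged into a single row whose id column holds both IDs joined by the separator.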


Method searchMzRange()

Find spectra in the given M/Z range. Returns a list of spectra IDs.

Usage
BiodbConn$searchMzRange(
  mz.min,
  mz.max,
  min.rel.int = 0,
  ms.mode = NULL,
  max.results = 0,
  precursor = FALSE,
  ms.level = 0
)
Arguments
mz.min

A vector of minimum M/Z values.

mz.max

A vector of maximum M/Z values. Its length must be the same as mz.min.

min.rel.int

The minimum relative intensity, in percentage (i.e.: float number between 0 and 100).

ms.mode

The MS mode. Set it to either 'neg' or 'pos'.

max.results

If set, it is used to limit the number of matches found for each M/Z value.

precursor

If set to TRUE, then restrict the search to precursor peaks.

ms.level

The MS level to which you want to restrict your search. 0 means that you want to search in all levels.

Returns

A character vector of spectra IDs.


Method searchMzTol()

Find spectra containing a peak around the given M/Z value. Returns a character vector of spectra IDs.

Usage
BiodbConn$searchMzTol(
  mz,
  mz.tol,
  mz.tol.unit = "plain",
  min.rel.int = 0,
  ms.mode = NULL,
  max.results = 0,
  precursor = FALSE,
  ms.level = 0
)
Arguments
mz

A vector of M/Z values.

mz.tol

The M/Z tolerance, whose unit is defined by mz.tol.unit.

mz.tol.unit

The type of the M/Z tolerance. Set it to either 'ppm' or 'plain'.

min.rel.int

The minimum relative intensity, in percentage (i.e.: float number between 0 and 100).

ms.mode

The MS mode. Set it to either 'neg' or 'pos'.

max.results

If set, it is used to limit the number of matches found for each M/Z value.

precursor

If set to TRUE, then restrict the search to precursor peaks.

ms.level

The MS level to which you want to restrict your search. 0 means that you want to search in all levels.

Returns

A character vector of spectra IDs.


Method clone()

The objects of this class are cloneable with this method.

Usage
BiodbConn$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

See Also

Super class BiodbConnBase, and BiodbFactory class.

Examples

# Create an instance with default settings:
mybiodb <- biodb::newInst()

# Get a compound CSV file database
chebi.tsv <- system.file("extdata", "chebi_extract.tsv", package='biodb')

# Create a connector
conn <- mybiodb$getFactory()$createConn('comp.csv.file', url=chebi.tsv)

# Get 10 identifiers from the database:
ids <- conn$getEntryIds(10)

# Get number of entries contained in the database:
n <- conn$getNbEntries()

# Terminate instance.
mybiodb$terminate()


pkrog/biodb documentation built on Nov. 29, 2022, 4:24 a.m.