filterMS: Filter compounds from mass spectrometry data

Description

Filters mass spectrometry data using a set of criteria, described in Details. Returns an object of classes msDat and filterMS.

Usage

filterMS(msObj, region, border = "all", bord_ratio = 0.05,
  min_inten = 1000, max_chg = 7L)

Arguments

msObj

An object of class msDat. Note that this includes objects created by the functions binMS and msDat.

region

A vector of mode either character or numeric. If numeric, then the entries should provide the indices for the region of interest in the mass spectrometry data provided as the argument for msObj. If character, then the entries should uniquely specify the region of interest through partial string matching (see criteria 1 and 4).
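The partial string matching described for region can be sketched roughly as follows. This is only an illustration, not the package's implementation: the helper resolve_region and the column names are hypothetical, and matching is assumed to mean that each string occurs as a substring of exactly one column name.

```r
# Hypothetical helper (not part of the package): map a region specification
# to column indices of the intensity data.
resolve_region <- function(region, col_names) {
  if (is.numeric(region)) {
    return(as.integer(region))
  }
  # Each string must occur as a substring of exactly one column name
  idx <- vapply(region, function(r) {
    hits <- grep(r, col_names, fixed = TRUE)
    if (length(hits) != 1L) {
      stop("'", r, "' matches ", length(hits), " columns; expected exactly 1")
    }
    hits
  }, integer(1))
  unname(idx)
}

# Made-up fraction names for illustration
col_names <- paste0("20150101_VO_", 11:20, "_pos")
region_idx <- resolve_region(paste0("VO_", 14:16), col_names)
```

Under this assumption a string such as "VO_1" would be rejected, because it appears in more than one column name and so does not uniquely specify a fraction.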

border

Either a character string "all", or a character string "none", or a length-1 or length-2 numeric value specifying the number of fractions to either side of the region of interest to comprise the bordering region. If a single numeric value, then this is the number of fractions to each side of the region of interest; if it is two values, then the first value is the number of fractions to the left, and the second value is the number of fractions to the right. If there are not enough fractions in either direction to completely span the number of specified fractions, then all of the available fractions to the side in question are considered to be part of the bordering region (see criterion 2).
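As a rough sketch of how the border argument translates into a set of bordering fractions, consider the helper below. The function border_indices is hypothetical and is not the package's code; it assumes the region of interest is a contiguous block of fraction indices and that fractions are numbered 1 through n_frac.

```r
# Hypothetical helper (not part of the package): indices of the fractions
# bordering a contiguous region of interest.
border_indices <- function(region_idx, n_frac, border = "all") {
  lo <- min(region_idx)
  hi <- max(region_idx)
  if (identical(border, "none")) {
    return(integer(0))
  }
  if (identical(border, "all")) {
    n_left <- lo - 1L
    n_right <- n_frac - hi
  } else {
    # A single value applies to both sides; truncate to what is available
    border <- rep_len(border, 2L)
    n_left <- min(border[1], lo - 1L)
    n_right <- min(border[2], n_frac - hi)
  }
  left <- if (n_left > 0) seq.int(lo - n_left, lo - 1L) else integer(0)
  right <- if (n_right > 0) seq.int(hi + 1L, hi + n_right) else integer(0)
  as.integer(c(left, right))
}

# Region occupying fractions 4 to 6 out of 10
bord_all <- border_indices(4:6, n_frac = 10, border = "all")
bord_two <- border_indices(4:6, n_frac = 10, border = 2)
bord_trunc <- border_indices(4:6, n_frac = 10, border = c(5, 1))
```

In the last call only three fractions exist to the left of the region, so the request for five is truncated to those three, mirroring the behavior described above.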

bord_ratio

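A single nonnegative numeric value specifying the threshold that the ratio of bordering-region intensity to maximum intensity must fall below. A value of 0 will not admit any compounds, while a value greater than 1 will admit all compounds (see criterion 2).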

min_inten

A single numeric value specifying the intensity that at least one fraction in the region of interest must exceed. A value less than the minimum mass spectrometry value in the data will admit all compounds (see criterion 4).

max_chg

A single numeric value specifying the maximum charge which a compound may exhibit (see criterion 5).

Details

Attempts to filter out candidate compounds via subject-matter knowledge, with the goal of removing spurious noise from downstream models. The criteria for the downstream inclusion of a candidate compound are listed below.

  1. The m/z intensity maximum must fall inside the range of the bioactivity region of interest.

  2. The ratio of the m/z intensity of a species in the areas bordering the region of interest to the species' maximum intensity must be less than bord_ratio. When there is no bordering area, all observations are taken to satisfy this criterion.

  3. For each species, the fraction immediately to the right of its maximum-intensity fraction must have non-zero abundance. In the case of ties for the maximum, it is the fraction immediately to the right of the rightmost maximum fraction which cannot have zero abundance. When the fraction with maximum intensity is the rightmost fraction in the data for an observation, the observation is taken to satisfy this criterion.

  4. At least 1 fraction in the region of interest must have intensity greater than min_inten.

  5. Compound charge state must be less than or equal to max_chg.
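The five criteria can be illustrated on a small made-up intensity matrix using base R only. This is a sketch under stated assumptions rather than the package's implementation: the data are invented, the region and border indices are supplied directly, and criterion 2 is assumed to compare the largest bordering intensity against the overall maximum.

```r
# Toy data: rows are compounds, columns are fractions 1..6 (values invented)
inten <- rbind(
  a = c(0,    0,   5000, 2000, 0, 0),  # intended to pass every criterion
  b = c(9000, 0,   100,  50,   0, 0),  # maximum lies outside the region
  c = c(0,    800, 5000, 2000, 0, 0),  # bordering intensity too large
  d = c(0,    0,   5000, 0,    0, 0),  # zero abundance right of the maximum
  e = c(0,    0,   900,  400,  0, 0),  # never exceeds min_inten in the region
  f = c(0,    0,   5000, 2000, 0, 0)   # charge too high
)
charge <- c(a = 2, b = 2, c = 2, d = 2, e = 2, f = 9)

region <- 3:4            # region of interest
border <- c(1:2, 5:6)    # bordering fractions, as with border = "all"
bord_ratio <- 0.05
min_inten <- 1000
max_chg <- 7

n_frac <- ncol(inten)
# Column of the rightmost maximum in each row
max_idx <- max.col(inten, ties.method = "last")
row_max <- apply(inten, 1, max)

# Criterion 1: the maximum falls inside the region of interest
c1 <- max_idx %in% region
# Criterion 2 (assumed form): largest bordering intensity relative to maximum
c2 <- apply(inten[, border, drop = FALSE], 1, max) / row_max < bord_ratio
# Criterion 3: fraction right of the rightmost maximum is non-zero,
# or the maximum is already in the last fraction
right_idx <- pmin(max_idx + 1L, n_frac)
c3 <- max_idx == n_frac | inten[cbind(seq_len(nrow(inten)), right_idx)] > 0
# Criterion 4: some fraction in the region exceeds min_inten
c4 <- apply(inten[, region, drop = FALSE], 1, function(x) any(x > min_inten))
# Criterion 5: charge does not exceed max_chg
c5 <- charge <= max_chg

keep <- c1 & c2 & c3 & c4 & c5
retained <- rownames(inten)[keep]
```

Each of the compounds b through f is constructed to fail at least one criterion, so only compound a is retained. A compound may fail several criteria at once; b, for instance, also fails criteria 2, 3, and 4.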

Value

Returns an object of class filterMS which inherits from msDat. This object is a list with the elements described below. The class is equipped with print, summary, and extractMS functions.

msDatObj

An object of class msDat whose encapsulated mass spectrometry data corresponds to the candidate compounds that satisfied every criterion. If no compound satisfies all of the criteria, then NULL is returned.

cmp_by_crit

A list containing data.frames, one for each criterion. Each row (if any) in one of the sub-data.frames contains the mass-to-charge and charge information for a candidate compound that satisfies the criterion represented by the data.frame; all of the compounds that satisfied the criterion are included in the data. The data.frames are named c1, ..., c5, corresponding to criterion 1, ..., criterion 5.

summ_info

A list containing information pertaining to the filtering process; for use by the summary function.

Examples

# Load mass spectrometry data
data(mass_spec)

# Convert mass_spec from a data.frame to an msDat object
ms <- msDat(mass_spec = mass_spec,
            mtoz = "m/z",
            charge = "Charge",
            ms_inten = c(paste0("_", 11:43), "_47"))

# Filter out potential candidate compounds
filter_out <- filterMS(msObj = ms,
                       region = paste0("VO_", 17:25),
                       border = "all",
                       bord_ratio = 0.01,
                       min_inten = 1000,
                       max_chg = 7)

# print, summary function
filter_out
summary(filter_out)

# Extract filtered mass spectrometry data as a matrix or msDat object
filter_matr <- extractMS(msObj = filter_out, type = "matrix")
filter_msDat <- extractMS(msObj = filter_out, type = "msDat")

dpritchLibre/Bioactivity documentation built on May 15, 2019, 1:48 p.m.