metabData: Constructor for the metabData object.


View source: R/metabData.R

Description

This is a constructor for objects of type metabData.

Usage

metabData(
  table,
  mz = "mz",
  rt = "rt",
  id = "id",
  adduct = "adduct",
  samples = NULL,
  Q = NULL,
  extra = NULL,
  rtmin = "min",
  rtmax = "max",
  misspc = 50,
  measure = c("median", "mean"),
  zero = FALSE,
  duplicate = c(0.0025, 0.05)
)

Arguments

table

Path to a file containing the feature table, or a data.frame object containing the features.

mz

Character name(s) or regular expression associated with data column containing m/z values. The first column whose name contains this expression will be selected for analysis.

rt

Character name(s) or regular expression associated with data column containing retention time values. The first column whose name contains this expression will be selected for analysis.

id

Character name(s) or regular expression associated with data column containing metabolomics feature identifiers. The first column whose name contains this expression will be selected for analysis.

adduct

Character name(s) or regular expression associated with data column containing adduct or chemical formula annotations. The first column whose name contains this expression will be selected for analysis.

samples

Character name(s) or regular expression associated with sample data columns. All numeric columns whose names contain these keywords are selected for analysis. If no keywords are given, the program searches for the longest stretch of remaining numeric columns.

Q

Character name(s) or regular expression associated with numeric feature abundance quantiles. If NULL, abundance quantiles are calculated from sample intensities.

extra

Character names of columns containing additional feature information, e.g. non-analyzed sample values. All columns whose names contain these keywords are selected and displayed in the final output.

rtmin

Numeric. Minimum retention time for analysis.

rtmax

Numeric. Maximum retention time for analysis.

misspc

Numeric. Threshold missingness percentage for analysis. Features missing in more than this percentage of analyzed samples are removed.

measure

Central quantitation measure, either "median" or "mean".

zero

Logical. Whether to consider zero values as missing.

duplicate

Numeric ordered pair (m/z, rt) of duplicate feature tolerances. Pairs of features within these tolerances are deemed duplicates, and one of the pair is removed (see: findDuplicates).
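
The column-selection rules described for mz, rt, id, adduct and samples can be illustrated with a minimal base R sketch. The helpers first_match and numeric_matches and the toy table are hypothetical and are not part of metabCombiner; they only mimic the "first matching column" and "all matching numeric columns" behavior stated above.

```r
# Hypothetical helper (not a metabCombiner function): return the name of the
# first column whose name matches a keyword or regular expression.
first_match <- function(tbl, pattern) {
  hits <- grep(pattern, names(tbl), value = TRUE)
  if (length(hits) == 0) NA_character_ else hits[1]
}

# Hypothetical helper: return the names of all numeric columns whose names
# match a keyword or regular expression.
numeric_matches <- function(tbl, pattern) {
  is_num <- vapply(tbl, is.numeric, logical(1))
  hits <- grepl(pattern, names(tbl))
  names(tbl)[is_num & hits]
}

# Toy feature table; the column names are illustrative only.
toy <- data.frame(
  identity = c("a", "b"),
  mz = c(100.5, 200.7),
  rt = c(1.2, 3.4),
  CHEAR.1 = c(10, 20),
  CHEAR.2 = c(30, 40),
  Red.1 = c(5, 6)
)

first_match(toy, "identity|id|ID")   # selects the "identity" column
numeric_matches(toy, "CHEAR")        # selects the two CHEAR columns
```

Because matching is by substring or regular expression, a short keyword such as "id" would also select a column named "identity" in this sketch.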

Details

The processed metabolomics feature table must contain columns for m/z, rt, and numeric sample intensities. Optional fields, such as the identity (id) and adduct label columns, may also be supplied. Non-analyzed columns can be included in the final output by specifying their names in the extra argument. All required arguments are checked for validity (e.g. no negative m/z or rt values, each column is used at most once, column types are valid, etc.).
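
The kinds of validity checks listed above might look roughly like the following base R sketch. The function check_table, its messages and the toy tables are hypothetical; the package's actual checks may differ in scope and wording.

```r
# Hypothetical validator (not a metabCombiner function). It returns a
# character vector of problems; an empty vector means no problem was found.
check_table <- function(tbl, mz_col, rt_col, sample_cols) {
  used <- c(mz_col, rt_col, sample_cols)
  if (!all(used %in% names(tbl))) {
    return("column not found")
  }
  problems <- character(0)
  # each column may be used at most once
  if (anyDuplicated(used) > 0) {
    problems <- c(problems, "column used more than once")
  }
  # m/z, rt and sample columns must be numeric
  is_num <- vapply(used, function(nm) is.numeric(tbl[[nm]]), logical(1))
  if (!all(is_num)) {
    problems <- c(problems, "non-numeric column")
  } else if (any(tbl[[mz_col]] < 0 | tbl[[rt_col]] < 0, na.rm = TRUE)) {
    # no negative m/z or rt values
    problems <- c(problems, "negative m/z or rt value")
  }
  problems
}

ok  <- data.frame(mz = c(100.1, 200.2), rt = c(1.5, 2.5), s1 = c(10, 20))
bad <- data.frame(mz = c(100.1, 200.2), rt = c(-1, 2.5), s1 = c(10, 20))

check_table(ok, "mz", "rt", "s1")    # no problems reported
check_table(bad, "mz", "rt", "s1")   # reports the negative rt value
check_table(ok, "mz", "mz", "s1")    # reports the reused column
```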

Following this is a pre-analysis filtering of rows that are: 1) outside the specified retention time range (rtmin, rtmax), 2) missing in more than misspc percent of analyzed samples, or 3) deemed duplicates based on small pairwise (m/z, rt) differences, as specified by the duplicate argument.
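
The three filtering steps can be sketched in base R on a toy table. This is an illustration under stated assumptions, not the package's implementation: the column names are made up, missing values are taken to be NA, and the duplicate step simply keeps the first feature of each close pair, whereas findDuplicates may choose which feature to remove differently.

```r
# Toy feature table; column names are illustrative only.
feats <- data.frame(
  mz = c(100.0000, 100.0010, 250.1000, 300.2000, 410.3000),
  rt = c(1.00, 1.02, 0.20, 5.00, 9.00),
  s1 = c(10, 12, 8, NA, 7),
  s2 = c(11, 13, 9, NA, 6),
  s3 = c(9, 11, 7, 4, NA)
)
samples <- c("s1", "s2", "s3")
rtmin <- 0.5
rtmax <- 17.5
misspc <- 50
tol <- c(0.0025, 0.05)  # (m/z, rt) duplicate tolerances

# 1) keep rows inside the retention time range
keep <- feats$rt >= rtmin & feats$rt <= rtmax

# 2) keep rows missing in at most misspc percent of samples
miss_pct <- 100 * rowMeans(is.na(feats[samples]))
keep <- keep & miss_pct <= misspc
filtered <- feats[keep, ]

# 3) flag a row as a duplicate if an earlier, retained row lies within both
#    tolerances (assumption: the first feature of a pair is the one kept)
is_dup <- logical(nrow(filtered))
for (i in seq_len(nrow(filtered))[-1]) {
  earlier <- which(!is_dup[seq_len(i - 1)])
  close <- abs(filtered$mz[earlier] - filtered$mz[i]) < tol[1] &
    abs(filtered$rt[earlier] - filtered$rt[i]) < tol[2]
  is_dup[i] <- any(close)
}
filtered <- filtered[!is_dup, ]
filtered
```

In this toy table the third row falls outside the retention time range, the fourth is missing in two of three samples, and the second lies within both tolerances of the first, so only the first and last rows remain.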

Remaining features are ranked by abundance quantiles, Q, using a central measure, either "median" or "mean". Alternatively, a column of precomputed abundance quantiles can be specified with the argument Q.
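
One way to picture the quantile ranking is the base R sketch below. The helper central_measure and the rank-based formula for Q are assumptions made for illustration; the exact formula used by the package is not stated on this page.

```r
# Toy sample intensities; column names are illustrative only.
intens <- data.frame(
  s1 = c(10, 200, 3000),
  s2 = c(12, 180, NA),
  s3 = c(11, 220, 2800)
)

# Hypothetical helper: per-feature central measure across samples.
# match.arg() resolves the c("median", "mean") default to its first element,
# the same idiom suggested by the measure argument in Usage.
central_measure <- function(x, measure = c("median", "mean")) {
  measure <- match.arg(measure)
  fun <- if (measure == "median") stats::median else mean
  apply(as.matrix(x), 1, fun, na.rm = TRUE)
}

abund <- central_measure(intens, "median")

# Assumed quantile formula: rank-based values strictly between 0 and 1.
Q <- (rank(abund) - 0.5) / length(abund)
Q
```

Features with larger central intensities receive larger Q values, so Q acts as a relative abundance ranking within the table.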

Value

An object of class metabData containing the information specified by the mz, rt, samples, id, adduct, Q, and extra arguments, adjusted by the pre-processing steps.

Examples

data(plasma30)

# samples: CHEAR columns; RedCross samples are kept as non-analyzed "extra" columns
p30 <- metabData(plasma30, mz = "mz", rt = "rt", id = "identity",
                 adduct = "adduct", samples = "CHEAR", extra = "RedCross")

getSamples(p30)  # should print the names of the 5 CHEAR sample columns
getExtra(p30)    # should print the names of the 5 Red Cross sample columns

#equivalent to above
p30 <- metabData(plasma30, id = "id", samples = "CHEAR", extra = "Red")

#analyzing Red Cross samples with retention time limitations (0.5-17.5min)
p30 <- metabData(plasma30, samples = "Red", rtmin = 0.5, rtmax = 17.5)
data <- getData(p30)
range(data$rt)

# using regular expressions for field searches
# (the dot is escaped so that it matches a literal "." in the column name)
p30.2 <- metabData(plasma30, id = "identity|id|ID", samples = "\\.[3-5]$")
getSamples(p30.2)    # should print all column names ending in .3, .4, .5

metabCombiner documentation built on Dec. 10, 2020, 2 a.m.