metabCombiner: Form a metabCombiner object.

metabCombiner R Documentation

Form a metabCombiner object.

Description

This constructs an object of type metabCombiner from a pair of metabolomics datasets, each formatted as either metabData (single-dataset class) or metabCombiner (combined-dataset class). An initial table of possible feature pair alignments is constructed by grouping features into m/z groups, as controlled by the binGap argument.

Usage

metabCombiner(
  xdata,
  ydata,
  binGap = 0.005,
  xid = NULL,
  yid = NULL,
  means = list(mz = FALSE, rt = FALSE, Q = FALSE),
  rtOrder = TRUE,
  impute = FALSE
)

Arguments

xdata

metabData or metabCombiner object

ydata

metabData or metabCombiner object

binGap

numeric parameter used for grouping features by m/z. See ?mzGroup for more details.

xid

character. If xdata is a metabData, assigns a new identifier for this dataset; if xdata is a metabCombiner, selects one of the existing dataset IDs to represent xdata. See details for more information.

yid

character. If ydata is a metabData, assigns a new identifier for this dataset; if ydata is a metabCombiner, selects one of the existing dataset IDs to represent ydata. See details for more information.

means

logical. Option to take average m/z, rt, and/or Q from a metabCombiner. May be a vector (length = 3), a single value (TRUE/FALSE), or a list with "mz", "rt", "Q" as names.

rtOrder

logical. If set to TRUE, retention order consistency is expected when resolving conflicting alignments for metabCombiner object inputs.

impute

logical. If TRUE, imputes the mean mz/rt/Q values for missing features in metabCombiner object inputs before use in alignment (not recommended for disparate data alignment); if FALSE, features with missing information are dropped.

Details

This function serves as a constructor of the metabCombiner combined dataset class and the entry point to the main workflow for pairwise dataset alignment. Two arguments must be specified, xdata and ydata, which must be either metabData or metabCombiner objects. There are four scenarios listed here:

1) If xdata & ydata are metabData objects, a new metabCombiner object is constructed with an alignment of this pair. New character identifiers are assigned to each dataset (xid & yid, respectively); if these are unassigned, then "1" and "2" will be their respective ids. xdata & ydata will be the active "dataset x" and "dataset y" used for the paired alignment.

2) If xdata is a metabCombiner and ydata is a metabData, then the result is the existing metabCombiner xdata augmented by an additional dataset, ydata. One set of meta-data (id, m/z, rt, Q, adduct labels) from xdata, selected by the xid argument, is used for alignment with the respective information from ydata; see the datasets method for extracting existing dataset ids. A new identifier, yid, is assigned to ydata and must be distinct from the existing dataset identifiers.

3) If xdata is a metabData and ydata is a metabCombiner, then a process similar to #2 occurs, with xdata added to the existing ydata object and the meta-data of one of its constituent datasets accessed, as controlled by the yid argument. One major difference is that the rts of ydata serve as the "reference" or dependent variable in the spline-fitting step.

4) If xdata and ydata are both metabCombiner objects, the resulting metabCombiner object aligns information from both combined datasets. As before, one set of values contained in xdata (specified by xid argument) is used to align to the values from ydata (controlled by yid argument). The samples and extra columns are concatenated from all datasets.

For metabCombiner object inputs, the full workflow (selectAnchors, fit_gam/fit_loess, calcScores, labelRows) must be performed before further alignment. If not completed already, features are pared down to 1-1 alignments via the resolveConflicts approach (see: help(resolveRows)). Features may not be used more than twice and will be removed if they are detected as duplicates.

The mean of the numeric fields (m/z, rt, Q) from all constituent datasets can be used in alignment in place of values from a single dataset. These are controlled by the means argument. By default this is a list value with "mz", "rt" and "Q" as names, but may also accept a single logical or a length-3 logical vector. If set to a single logical value, then all three fields are averaged (TRUE) or not averaged (FALSE). If a three-length argument is supplied (e.g. c(TRUE, FALSE, FALSE)), then the values correspond to m/z, rt, and Q respectively. RT averaging is generally not recommended for disparate data alignment.
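
The accepted forms of the means argument can be illustrated with a small base R sketch. The helper normalize_means below is hypothetical and is not part of metabCombiner; it only shows how a list, a single logical, or a length-3 logical vector could each be expanded into one flag per field. Treating an absent list name as FALSE is an assumption.

```r
# Hypothetical helper (not part of metabCombiner): expand the accepted
# forms of `means` into a named logical vector with entries mz, rt, Q.
normalize_means <- function(means = list(mz = FALSE, rt = FALSE, Q = FALSE)) {
  fields <- c("mz", "rt", "Q")
  if (is.list(means)) {
    # Named list: look up each field by name; an absent name is
    # treated as FALSE (assumption).
    out <- vapply(fields, function(f) isTRUE(means[[f]]), logical(1))
  } else if (is.logical(means) && length(means) == 1L) {
    # Single logical: applies to all three fields.
    out <- rep(means, 3L)
  } else if (is.logical(means) && length(means) == 3L) {
    # Length-3 vector: positions correspond to m/z, rt, Q.
    out <- means
  } else {
    stop("'means' must be a list, a single logical, or a length-3 logical vector")
  }
  names(out) <- fields
  out
}

normalize_means(TRUE)
normalize_means(c(TRUE, FALSE, FALSE))
normalize_means(list(mz = TRUE, rt = FALSE, Q = TRUE))
```
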

If missing features have been incorporated into the metabCombiner, they can be imputed using the average m/z, rt, and Q values for that feature in the datasets in which it is present, by setting impute to TRUE. As with RT averaging, this option is not recommended for disparate data alignment.
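
As a rough illustration of this kind of mean imputation, the base R sketch below fills missing retention times with the mean over the datasets in which each feature is present. The matrix, its values, and the dataset names are made up for the example, and this is not the package's internal code.

```r
# Toy retention times for three features across three constituent datasets.
# NA marks a feature that is missing from a dataset. All values are made up.
rt <- matrix(c(1.0, 1.2, NA,
               5.0, NA,  NA,
               9.0, 9.4, 9.2),
             nrow = 3, byrow = TRUE,
             dimnames = list(paste0("feature", 1:3), c("d1", "d2", "d3")))

# Mean over the datasets in which each feature is present.
rt_mean <- rowMeans(rt, na.rm = TRUE)

# Fill each missing entry with the mean of its own feature (row).
rt_imputed <- rt
missing <- is.na(rt)
rt_imputed[missing] <- rt_mean[row(rt)[missing]]
rt_imputed
```

The same pattern would apply to m/z and Q. With impute = FALSE, the documented behavior is instead to drop features with missing information.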

Value

a metabCombiner object constructed from xdata and ydata, with features grouped by m/z according to the binGap argument.
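
To give a feel for what a gap-based m/z grouping looks like, here is a minimal base R sketch. The function group_by_mz_gap is hypothetical and is not the package's mzGroup; the rule that a new group starts when consecutive sorted m/z values differ by more than binGap, and the rule that a group must contain features from both datasets, are assumptions made for illustration. See ?mzGroup for the actual behavior.

```r
# Hypothetical helper, not the package's mzGroup(): pool two m/z vectors,
# sort them, and start a new group wherever the gap between consecutive
# m/z values exceeds binGap. Assumes at least one m/z value is supplied.
group_by_mz_gap <- function(mz_x, mz_y, binGap = 0.005) {
  pooled <- data.frame(
    mz = c(mz_x, mz_y),
    dataset = rep(c("x", "y"), times = c(length(mz_x), length(mz_y)))
  )
  pooled <- pooled[order(pooled$mz), ]
  pooled$group <- cumsum(c(TRUE, diff(pooled$mz) > binGap))
  # Keep only groups containing features from both datasets (assumption).
  both <- tapply(pooled$dataset, pooled$group,
                 function(d) length(unique(d)) == 2L)
  pooled <- pooled[unname(both[as.character(pooled$group)]), ]
  rownames(pooled) <- NULL
  pooled
}

group_by_mz_gap(mz_x = c(100.001, 150.200, 300.500),
                mz_y = c(100.004, 150.210, 250.000))
```

In this toy input only the pair near m/z 100 falls within the default gap of 0.005; a larger binGap would admit more candidate pairings at the cost of more ambiguous groups.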

Note

If using a metabCombiner object as input, only one row is allowed per feature corresponding to its first appearance. It is strongly recommended to reduce the table to 1-1 paired matches prior to aligning it with a new dataset.

Examples

data(plasma30)
data(plasma20)

p30 <- metabData(plasma30, samples = "CHEAR")
p20 <- metabData(plasma20, samples = "Red", rtmax = 17.25)

p.comb <- metabCombiner(xdata = p30, ydata = p20, binGap = 0.0075,
                        xid = "p30", yid = "p20")


hhabra/Combiner documentation built on Jan. 26, 2024, 10:30 p.m.