met.FilterVariable: Methods for non-specific filtering of variables

met.FilterVariable    R Documentation

Methods for non-specific filtering of variables

Description

met.FilterVariable removes non-informative variables (i.e., features with very small values, near-constant values, or low repeatability) from the dataset, depending on the user-specified filtering method. The function applies the chosen method, ranks the variables within the dataset, and removes variables based on their rank. For efficient computation, the final dataset should contain no more than 5000 variables. If more features are present, the IQR filter is applied to keep only 5000, even if filter = "none". Data filtering is performed as part of the data preparation workflow met.read_data.

Usage

met.FilterVariable(
  mSetObj = NA,
  filter = "none",
  remain.num = NULL,
  qcFilter = "F",
  qc.rsd = 0.25,
  all.rsd = NULL
)

Arguments

mSetObj

Enter the name of the created mSet object (see InitDataObjects and Read.TextData).

filter

(Character) Select an option for non-specific filtering based on one of the following ranking criteria:

  • "none" applies no non-specific filtering.

  • "rsd" filters features with low relative standard deviation across the dataset.

  • "nrsd" filters features with low non-parametric relative standard deviation (MAD/median) across the dataset.

  • "mean" filters features with low mean intensity value across the dataset.

  • "median" filters features with low median intensity value across the dataset.

  • "sd" filters features with low absolute standard deviation across the dataset.

  • "mad" filters features with low median absolute deviation across the dataset.

  • "iqr" filters features with a low inter-quartile range across the dataset.
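The ranking criteria above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helper filter_stat, the toy matrix, and the samples-in-rows layout are assumptions made for illustration, and the "nrsd" statistic is taken to be MAD/median.

```r
# Hypothetical helper (not part of VisomX): one ranking statistic per variable.
# Assumed layout: rows = samples, columns = variables.
filter_stat <- function(x, filter) {
  switch(filter,
    rsd    = apply(x, 2, sd, na.rm = TRUE) / abs(colMeans(x, na.rm = TRUE)),
    nrsd   = apply(x, 2, mad, na.rm = TRUE) /
               abs(apply(x, 2, median, na.rm = TRUE)),
    mean   = colMeans(x, na.rm = TRUE),
    median = apply(x, 2, median, na.rm = TRUE),
    sd     = apply(x, 2, sd, na.rm = TRUE),
    mad    = apply(x, 2, mad, na.rm = TRUE),
    iqr    = apply(x, 2, IQR, na.rm = TRUE),
    stop("unknown filter: ", filter)
  )
}

# Toy data: 4 samples x 3 variables.
x <- cbind(a = c(1, 2, 3, 4),
           b = c(10, 10, 10, 11),
           c = c(5, 50, 500, 5000))

stat <- filter_stat(x, "iqr")

# Rank 1 = largest statistic, i.e. the variable most likely to be kept.
rk <- rank(-stat, ties.method = "first")

# Keep the two top-ranked variables; the near-constant "b" is dropped.
x_filt <- x[, rk <= 2, drop = FALSE]
```

In this sketch a low statistic means a low rank priority, so variables at the bottom of the ranking are the ones removed.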

remain.num

(Numeric) Enter the number of variables to keep in your dataset. If NULL, the following empirical rules are applied when filtering with the method specified in filter:

  • Fewer than 250 variables: 5% will be filtered

  • 250 - 500 variables: 10% will be filtered

  • 500 - 1000 variables: 25% will be filtered

  • More than 1000 variables: 40% will be filtered
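The empirical rules above can be written as a short sketch. The helper names pct_filtered and n_keep are hypothetical, and two details are assumptions rather than documented behaviour: how the exact boundary counts (250, 500, 1000) are assigned, and that the number of removed variables is rounded down.

```r
# Hypothetical helper: percentage of variables removed when remain.num is NULL.
# Assignment of the exact boundary values to a bracket is an assumption.
pct_filtered <- function(var.num) {
  if (var.num < 250) {
    5
  } else if (var.num < 500) {
    10
  } else if (var.num < 1000) {
    25
  } else {
    40
  }
}

# Hypothetical helper: number of variables kept after filtering.
n_keep <- function(var.num, remain.num = NULL) {
  if (!is.null(remain.num)) {
    # An explicit remain.num overrides the empirical rules.
    return(min(remain.num, var.num))
  }
  # Rounding the removed count down is an assumption.
  var.num - floor(var.num * pct_filtered(var.num) / 100)
}

n_keep(200)                     # 190 of 200 kept (5% filtered)
n_keep(800)                     # 600 of 800 kept (25% filtered)
n_keep(2000)                    # 1200 of 2000 kept (40% filtered)
n_keep(2000, remain.num = 300)  # 300 kept
```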

qcFilter

(Logical) Filter the variables based on the relative standard deviation of features in QC samples (TRUE), or not (FALSE). This filter can be applied in addition to the other, non-specific filtering methods.

qc.rsd

(Numeric) Define the relative standard deviation cut-off in %. Variables with an RSD greater than this number will be removed from the dataset. This argument only needs to be specified if qcFilter is TRUE; otherwise, it is ignored.
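A minimal sketch of a QC-based RSD filter, under stated assumptions: the toy matrix, the logical vector is_qc marking QC samples, and the samples-in-rows layout are all hypothetical. The cut-off is written in percent to match the description; note that the default shown under Usage is 0.25, so it is worth checking which scale the function expects.

```r
# Toy data: rows = samples, columns = variables (assumed layout).
x <- rbind(
  qc1 = c(a = 100, b = 100),
  qc2 = c(a = 102, b = 160),
  qc3 = c(a = 98,  b = 40),
  s1  = c(a = 250, b = 300)
)

# Hypothetical flag marking which rows are QC samples.
is_qc <- c(TRUE, TRUE, TRUE, FALSE)

# Cut-off in percent (assumption; see the lead-in).
qc.rsd <- 25

# RSD (%) of each variable, computed over the QC samples only.
qc <- x[is_qc, , drop = FALSE]
rsd_pct <- 100 * apply(qc, 2, sd) / abs(colMeans(qc))

# Keep variables whose QC RSD does not exceed the cut-off.
x_filt <- x[, rsd_pct <= qc.rsd, drop = FALSE]
colnames(x_filt)
```

Here variable "a" varies by about 2% across the QC samples and is kept, while "b" varies by 60% and is removed.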

all.rsd

(Numeric or NULL) Apply a filter based on the in-group relative standard deviation (RSD, in %), or not (NULL). For this filter, the RSD of every feature is calculated within every group in the dataset. If the RSD of a variable exceeds the indicated threshold in any group, the variable is removed from the dataset. This filter can be applied in addition to other filtering methods and is especially useful for data with technical replicates.
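The in-group RSD filter can be sketched in base R as follows. This is an illustration under assumptions, not the package's code: the toy matrix, the group factor, and the samples-in-rows layout are hypothetical.

```r
# Toy data: rows = samples, columns = variables (assumed layout).
x <- rbind(
  c(a = 10, b = 10),
  c(a = 11, b = 30),
  c(a = 20, b = 50),
  c(a = 21, b = 52)
)

# Hypothetical group labels, one per sample (e.g. technical replicates).
group <- factor(c("ctrl", "ctrl", "treat", "treat"))

# Threshold in percent.
all.rsd <- 20

# RSD (%) of every variable within every group:
# rows = groups, columns = variables.
rsd_by_group <- sapply(colnames(x), function(v) {
  tapply(x[, v], group, function(g) 100 * sd(g) / abs(mean(g)))
})

# A variable is removed if its RSD exceeds the threshold in ANY group.
exceeds <- apply(rsd_by_group, 2, function(r) any(r > all.rsd))
x_filt <- x[, !exceeds, drop = FALSE]
colnames(x_filt)
```

Variable "b" is removed because its RSD in the "ctrl" group is far above 20%, even though it is consistent within "treat".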

Value

The input mSet object with filtered data added at mSetObj$dataSet$filt.

Author(s)

Nicolas T. Wirth (mail.nicowirth@gmail.com), Technical University of Denmark. License: GNU GPL (>= 2)

References

Adapted from FilterVariable (https://github.com/xia-lab/MetaboAnalystR).


NicWir/VisomX documentation built on Dec. 8, 2024, 1:27 a.m.