mv_imputation: Missing value imputation using different algorithms
In pmp: Peak Matrix Processing and signal batch correction for metabolomics datasets

Missing values are common in metabolomics data sets and can arise for both technical and biological reasons.
Missing value imputation replaces missing values with estimated values while maintaining the data structure. This function provides a number of different imputation methods.

mv_imputation(
  df,
  method,
  k = 10,
  rowmax = 0.5,
  colmax = 0.5,
  maxp = NULL,
  check_df = TRUE
)

`df`	A matrix-like (e.g. an ordinary matrix, a data frame) or RangedSummarizedExperiment-class object containing peak intensities, areas or other quantitative characteristics, with all values of class `numeric()` or `integer()`.
`method`	`character(1)`, missing value imputation method. Supported methods are `'knn'`, `'rf'`, `'bpca'`, `'sv'`, `'mn'` and `'md'`.
`k`	`numeric(1)`, for a given sample containing a missing value, the number of nearest neighbours to include to calculate a replacement value. Used only for method `knn`.
`rowmax`	`numeric(1)`, the maximum proportion of missing data allowed in any row (the default of 0.5 corresponds to 50%). For any row exceeding this limit, missing values are imputed using the overall mean per sample. Used only for method `knn`.
`colmax`	`numeric(1)`, the maximum proportion of missing data allowed in any column (the default of 0.5 corresponds to 50%). If any column exceeds this limit, the function reports an error. Used only for method `knn`.
`maxp`	`integer(1)`, number of features to run on a single core. If set to `NULL`, the total number of features is used.
`check_df`	`logical(1)`, if set to `TRUE`, checks whether the input data need to be transposed so that features are in rows.
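
The `rowmax` and `colmax` limits can be checked before imputation. The following base R sketch uses a made-up toy matrix and assumes that both limits are fractions between 0 and 1 (as the default of 0.5 suggests) and that features are in rows. It does not call pmp, and the names `peaks`, `rows_over` and `cols_over` are illustrative only.

```r
# Toy peak matrix: features in rows, samples in columns (values are made up).
peaks <- matrix(
  c(10, NA, 12, 11,
    NA, NA, NA,  5,
    20, 22, NA, 21),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("feature", 1:3), paste0("sample", 1:4))
)

# Fraction of missing values in each row and in each column.
row_missing <- rowMeans(is.na(peaks))
col_missing <- colMeans(is.na(peaks))

# Same values as the defaults shown in the usage section.
rowmax <- 0.5
colmax <- 0.5

# Rows above rowmax: per the argument description, method 'knn' would
# impute these using the overall mean per sample.
rows_over <- names(row_missing)[row_missing > rowmax]

# Columns above colmax: per the argument description, method 'knn' would
# report an error for these.
cols_over <- names(col_missing)[col_missing > colmax]

print(rows_over)
print(cols_over)
```

Running a check like this first shows which rows or columns would need to be filtered out before using method `knn`.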

Supported missing value imputation methods are:

knn - K-nearest neighbour. For each feature in each sample, missing values are replaced by the unweighted mean calculated from its k closest neighbours in multivariate space (default distance metric: Euclidean distance);

rf - Random Forest. This method is a wrapper of the missForest function. For each feature, missing values are imputed iteratively until a maximum number of iterations (10) is reached, or until the difference between consecutively imputed matrices increases. Trees per forest are set to 100, and the number of variables included per tree is calculated using the formula sqrt(total number of variables);

bpca - Bayesian principal component analysis. This method is a wrapper of the pca function. Missing values are replaced by the values obtained from principal component analysis regression with a Bayesian method. As a result, no imputed value occurs more than once, either across the samples or across the metabolite features;

sv - Small value. For each feature, replace missing values with half of the lowest value recorded in the entire data matrix;

mn - Mean. For each feature, replace missing values with the unweighted mean of all non-missing values for that variable;

md - Median. For each feature, replace missing values with the median of all non-missing values for that variable.
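
The three simplest methods (sv, mn and md) can be illustrated in a few lines of base R. This is a minimal sketch of the descriptions above on a made-up toy matrix with features in rows, not the package's own implementation; the helper `impute_rows` and all object names are hypothetical.

```r
# Toy peak matrix: features in rows, samples in columns (values are made up).
peaks <- matrix(
  c( 4, NA,  8,  9,
    10, 30, NA, 80,
    NA,  6,  2,  7),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("feature", 1:3), paste0("sample", 1:4))
)

# Hypothetical helper: for each row (feature), fill the missing entries
# with fill_fun() applied to that row's non-missing values.
impute_rows <- function(x, fill_fun) {
  for (i in seq_len(nrow(x))) {
    missing <- is.na(x[i, ])
    x[i, missing] <- fill_fun(x[i, !missing])
  }
  x
}

# 'mn': per-feature mean of the non-missing values.
imputed_mn <- impute_rows(peaks, mean)

# 'md': per-feature median of the non-missing values.
imputed_md <- impute_rows(peaks, median)

# 'sv': half of the lowest value recorded in the entire matrix.
imputed_sv <- peaks
imputed_sv[is.na(imputed_sv)] <- min(peaks, na.rm = TRUE) / 2

print(imputed_mn)
print(imputed_md)
print(imputed_sv)
```

Note that mn and md work feature by feature, whereas sv uses a single replacement value for the whole matrix.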

Object of class SummarizedExperiment. If the input data are matrix-like (e.g. an ordinary matrix or a data frame), the function returns the same R data structure as the input, with all values of data type numeric().

df <- MTBLS79[, MTBLS79$Batch == 1]
out <- mv_imputation(df=df, method='knn')

pmp documentation built on April 1, 2021, 6:01 p.m.
