knn_imputation: Customized K-NN Imputation


View source: R/knn_imputation.r

Description

K-NN imputation adapted to LC-MS proteomics data. The main reason for missing data in LC-MS datasets is the low abundance of the proteins/peptides, so this K-NN imputation algorithm explicitly relies on that assumption.

Usage

knn_imputation(x, K = 10, show.diagnostics = F)

Arguments

x

MSnSet or ExpressionSet object

K

number of nearest neighbors

show.diagnostics

logical indicating whether to plot the imputation results for each feature

Details

The algorithm: for each row of the exprs matrix that contains missing values, perform the following steps:

  1. impute missing values with the lowest values in the row (feature)

  2. find K features (with no missing values) with highest Spearman correlation

  3. scale the K-neighbors, so that median intensity ratio is 1

  4. impute missing values with mean value of scaled K-neighbors
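
The four steps can be sketched on a plain numeric matrix (rows are features, columns are samples). This is only an illustration under stated assumptions, not the package's implementation: knn_impute_sketch is a hypothetical name, the median ratio in step 3 is assumed to be taken between the target feature and each neighbor over the samples where the target was observed, and ties or constant rows are not handled specially.

```r
# Hypothetical helper illustrating the steps above on a plain matrix.
# Assumption: linear-scale (not log-transformed) intensities.
knn_impute_sketch <- function(m, K = 10) {
  complete <- which(rowSums(is.na(m)) == 0)   # features with no missing values
  K <- min(K, length(complete))
  out <- m
  for (i in which(rowSums(is.na(m)) > 0)) {
    row <- m[i, ]
    miss <- is.na(row)
    if (all(miss) || K == 0) next             # nothing to anchor the imputation on
    # Step 1: provisional fill with the lowest observed value in the row
    filled <- row
    filled[miss] <- min(row, na.rm = TRUE)
    # Step 2: K complete features with the highest Spearman correlation
    rho <- apply(m[complete, , drop = FALSE], 1,
                 function(r) cor(filled, r, method = "spearman"))
    nn <- complete[order(rho, decreasing = TRUE)[seq_len(K)]]
    # Step 3: scale each neighbor so its median ratio to the target is 1
    # (assumed: ratio computed over the originally observed samples only)
    ratios <- apply(m[nn, !miss, drop = FALSE], 1,
                    function(r) median(row[!miss] / r))
    scaled <- m[nn, , drop = FALSE] * ratios  # row j is multiplied by ratios[j]
    # Step 4: replace missing values with the mean of the scaled neighbors
    out[i, miss] <- colMeans(scaled[, miss, drop = FALSE])
  }
  out
}

# Toy data: feature "a" tracks feature "b" at twice the intensity
m <- rbind(a = c(2, 4, NA, 8, 10),
           b = c(1, 2, 3, 4, 5),
           c = c(5, 4, 3, 2, 1))
res <- knn_impute_sketch(m, K = 1)
```

With K = 1 the only neighbor selected for feature "a" is "b", so the missing entry is filled from the rescaled values of "b".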

Note

The algorithm assumes that the data is not log-transformed. If the data is log-transformed, exponentiate it before imputation.
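
As a minimal sketch of that back-transformation, assuming the intensities were log2-transformed (the base is an assumption; use whichever base was applied to your data), a toy matrix with hypothetical values can be returned to the linear scale as follows. For an MSnSet or ExpressionSet the same operation would presumably be applied to the exprs matrix, e.g. exprs(x) <- 2^exprs(x).

```r
# Toy log2-scale intensities (hypothetical values) with one missing entry
log_m <- matrix(c(10, 12, NA, 11), nrow = 2)

# Back-transform to the linear scale; missing values stay NA
lin_m <- 2^log_m

# After imputation on the linear scale, log2() returns to the log scale
back <- log2(lin_m)
```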

Examples

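suppressPackageStartupMessages(library("MSnbase"))
library("vp.misc")  # provides knn_imputation()
data(naset)  # example MSnSet with missing values
image(naset[1:50,])  # first 50 features before imputation
x <- knn_imputation(naset)
image(x[1:50,])  # same features after imputation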

vladpetyuk/vp.misc documentation built on June 25, 2021, 6:35 a.m.