MissingValues: Missing value replacement

View source: R/MissingValues.R

Description

Missing value imputation for metabolomics data matrices

Usage

MissingValues(featuredata, sampledata = NULL, metabolitedata = NULL,
  feature.cutoff = 0.8, sample.cutoff = 0.8, method = c("knn", "replace",
  "none"), k = 10, featuremax.knn = 0.8, samplemax.knn = 0.8,
  seed = 100, saveoutput = FALSE, outputname = "nomissing")

Arguments

featuredata

A data frame in the featuredata format: metabolites in columns and samples in rows, with unique sample names provided as row names. See the NormalizeMets vignette for details.

sampledata

A data frame with sample information matching featuredata.

metabolitedata

A data frame with metabolite information matching featuredata.

feature.cutoff

A value between zero and one. Used to exclude features that have a large proportion of missing values. If the proportion of missing values is equal to or more than the feature.cutoff, that feature will be deleted.

sample.cutoff

A value between zero and one. Used to exclude samples that have a large proportion of missing values. If the proportion of missing values is equal to or more than the sample.cutoff in any row, that whole sample will be deleted.
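
The effect of the two cutoffs can be illustrated in base R. This is a minimal sketch, not the package's implementation: the helper name `filter_missing`, the toy matrix `x`, and the choice to compute both proportions on the unfiltered data are assumptions made for illustration.

```r
# Hypothetical helper: drop features (columns) and samples (rows) whose
# proportion of missing values is equal to or more than the given cutoff.
filter_missing <- function(x, feature.cutoff = 0.8, sample.cutoff = 0.8) {
  feature_prop <- colMeans(is.na(x))  # proportion missing per metabolite
  sample_prop  <- rowMeans(is.na(x))  # proportion missing per sample
  x[sample_prop < sample.cutoff, feature_prop < feature.cutoff, drop = FALSE]
}

# Toy data: 5 samples in rows, 3 metabolites in columns (filled column-wise)
x <- matrix(c(1, NA, NA, NA, NA,
              2,  3,  4,  5, NA,
              6,  7,  8,  9, 10),
            nrow = 5,
            dimnames = list(paste0("s", 1:5), c("m1", "m2", "m3")))

# m1 is missing in 4 of 5 samples (0.8), so it reaches the cutoff and is dropped
filtered <- filter_missing(x)
```

Because the comparison is "equal to or more than", a feature sitting exactly on the cutoff is removed, as m1 is here.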

method

Missing value replacement method. Should be one of "knn" (the k-nearest neighbours algorithm), "replace" (replacing by half the minimum detectable signal), or "none".
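
As a rough illustration of the "replace" option, the base R sketch below fills every missing entry with half of a minimum value. It assumes the "minimum detectable signal" is the smallest observed value in the whole matrix; the package may define it differently (for example per feature), and the toy matrix `x` is hypothetical.

```r
# Toy data with two missing values (filled column-wise)
x <- matrix(c(4, NA, 10,
              8,  6, NA),
            nrow = 3,
            dimnames = list(paste0("s", 1:3), c("m1", "m2")))

# Assumption: the minimum detectable signal is the smallest observed value
# in the whole matrix
half_min <- min(x, na.rm = TRUE) / 2

# Replace every missing entry with that value
x_replaced <- x
x_replaced[is.na(x_replaced)] <- half_min
```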

k

The number of nearest neighbours to be used in the knn algorithm.
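
To show what k controls, here is a small base R sketch of nearest-neighbour imputation. It is not the routine the package uses: the helper `knn_impute`, the choice of features as neighbours, and the distance measure are all assumptions for illustration.

```r
# Hypothetical helper: fill each missing entry of a feature with the average
# of the k features closest to it. Distance is the root-mean-square difference
# over the samples in which both features are observed.
knn_impute <- function(x, k = 2) {
  out <- x
  for (j in which(colSums(is.na(x)) > 0)) {
    d <- vapply(seq_len(ncol(x)), function(m) {
      shared <- !is.na(x[, j]) & !is.na(x[, m])
      if (m == j || !any(shared)) return(Inf)
      sqrt(mean((x[shared, j] - x[shared, m])^2))
    }, numeric(1))
    # Keep the k closest features (fewer if not enough are comparable)
    neighbours <- order(d)[seq_len(min(k, sum(is.finite(d))))]
    for (i in which(is.na(x[, j]))) {
      # Neighbours that are themselves missing in sample i are skipped;
      # the result is NaN if none of them has a value there
      out[i, j] <- mean(x[i, neighbours], na.rm = TRUE)
    }
  }
  out
}

# Toy data: m2 and m3 track m1 closely, m4 does not
x <- cbind(m1 = c(1, 2, NA, 4),
           m2 = c(1.1, 2.1, 3.1, 4.1),
           m3 = c(0.9, 1.9, 2.9, 3.9),
           m4 = c(10, 20, 30, 40))
imputed <- knn_impute(x, k = 2)
```

With k = 2 the gap in m1 is filled from m2 and m3 only; a larger k would also pull in the dissimilar m4.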

featuremax.knn

For the knn algorithm. The maximum proportion of missing data allowed in any feature. For any features with more than featuremax.knn proportion missing, missing values are imputed using the overall mean per sample.

samplemax.knn

For the knn algorithm. The maximum proportion of missing data allowed in any sample. If any sample has more than samplemax.knn missing data, the program halts and reports an error.
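
The two knn limits behave differently: one triggers a fallback, the other an error. The base R sketch below mirrors that behaviour as described above; the helper `check_knn_limits` and its error message are hypothetical and do not come from the package.

```r
# Hypothetical pre-check mirroring the two knn limits
check_knn_limits <- function(x, featuremax.knn = 0.8, samplemax.knn = 0.8) {
  sample_prop <- rowMeans(is.na(x))
  if (any(sample_prop > samplemax.knn)) {
    # Too much missing data in a sample: halt with an error
    stop("samples exceeding samplemax.knn: ",
         paste(rownames(x)[sample_prop > samplemax.knn], collapse = ", "))
  }
  # Features above the limit would fall back to mean imputation
  names(which(colMeans(is.na(x)) > featuremax.knn))
}

# Toy data: 10 samples, 3 metabolites; m1 is missing in 9 of 10 samples
x <- matrix(as.numeric(1:30), nrow = 10,
            dimnames = list(paste0("s", 1:10), c("m1", "m2", "m3")))
x[2:10, "m1"] <- NA
flagged <- check_knn_limits(x)

# A sample with every value missing exceeds samplemax.knn and raises an error
x_bad <- x
x_bad["s2", ] <- NA
result <- tryCatch(check_knn_limits(x_bad), error = function(e) "error")
```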

seed

Used by the knn algorithm for very large matrices. An integer that sets the state of the random number generator in R.
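
The purpose of a fixed seed is reproducibility: any step that draws random numbers gives the same result on every run. The base R snippet below demonstrates this with `set.seed()`; how the package passes its seed argument on internally is not shown here.

```r
# Drawing with the same seed twice gives the same random permutation
set.seed(100)
first <- sample(10)
set.seed(100)
second <- sample(10)
identical(first, second)
```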

saveoutput

A logical indicating whether the output should be saved. If TRUE, the results will be saved as a csv file.

outputname

The name of the output file if the output has to be saved.
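
Saving a result by hand looks roughly like the base R sketch below. The file name built from outputname plus a ".csv" extension is an assumption, as is the small data frame `imputed`; the file is written to a temporary directory so that the example leaves nothing behind in the working directory.

```r
# Assumption: the output file is named <outputname>.csv
outputname <- "nomissing"
out_file <- file.path(tempdir(), paste0(outputname, ".csv"))

# Hypothetical imputed feature data with sample names as row names
imputed <- data.frame(m1 = c(1.5, 2.5), m2 = c(3.5, 4.5),
                      row.names = c("s1", "s2"))

# Write the data, then read it back with the sample names restored
write.csv(imputed, file = out_file)
restored <- read.csv(out_file, row.names = 1)
```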

Value

The output is an object of class alldata.

Author(s)

Alysha M De Livera, Gavriel Olshansky

Examples

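    library(NormalizeMets)

    # Load the example data set and extract its three components
    data(alldata_eg)
    featuredata_eg <- alldata_eg$featuredata
    sampledata_eg <- alldata_eg$sampledata
    metabolitedata_eg <- alldata_eg$metabolitedata

    # Log-transform the feature data before imputing
    logdata <- LogTransform(featuredata_eg)

    # Impute missing values with the knn method
    imp <- MissingValues(logdata$featuredata, sampledata_eg, metabolitedata_eg,
                         feature.cutoff = 0.8, sample.cutoff = 0.8, method = "knn")
    imp
    dataview(imp$featuredata)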

NormalizeMets documentation built on May 1, 2019, 10:26 p.m.