imputation: Metabolomic Dataset Imputation
In AurelieGuilbault/VIQCing: Data Processing For Metabolomic Data


View source: R/imputation.R

Imputes the missing values of the given dataset using one of several methods. Produces <filename>_imputed.txt, which contains the imputed dataset. See Details for the available imputation methods.


imputation(file, k = 2, method = "knn", npcs = 3, sigma = 0.1,
  nTree = 30, na.string = "NA", transformation = "None",
  compound = NULL, metabolite = NULL, sampleStart = 3)

`file`	File containing the dataset to impute. It must include a "Compound" column, a "Metabolite" column, and the sample columns, which run from column `sampleStart` to the last column of the file. Any other columns are ignored. The "Compound" and "Metabolite" column names are optional as long as the column positions are given through `compound` and `metabolite`.
`k`	Default 2. The number of neighbors k used by the knn imputation.
`method`	Default "knn". The method used to replace the missing values. Can be "knn", "RF", "QRILC", "SVD", "mean", "median", "HM" or "0". See Details.
`npcs`	Default 3. Number of principal components (npcs) for the SVD method.
`sigma`	Default 0.1. The tuning parameter sigma for the QRILC method.
`nTree`	Default 30. Number of trees for the RF method.
`na.string`	Default "NA". The string to treat as NA in the dataset.
`transformation`	Default "None". Transformation applied to the data; can be "scale" or "log".
`compound`	Default NULL. Position of the compound column, if it is not named "Compound".
`metabolite`	Default NULL. Position of the metabolite column, if it is not named "Metabolite".
`sampleStart`	Default 3. Position of the first sample column, i.e. the first column of the actual data.
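
To make the expected file layout concrete, the following base R sketch builds a tiny tab-separated file and extracts the sample columns from it. It only illustrates how `na.string` and `sampleStart` relate to the layout described above; it assumes a tab-separated file and is not the package's own reading code. The file contents and the variable names `path`, `raw`, and `samples` are hypothetical.

```r
# Hypothetical input: a small tab-separated file with the layout described above.
path <- tempfile(fileext = ".tsv")
writeLines(c(
  "Compound\tMetabolite\tS1\tS2\tS3",
  "C1\tAlanine\t1.2\tNA\t1.6",
  "C2\tGlycine\tNA\t3.1\t2.9"
), path)

sampleStart <- 3     # position of the first sample column (the default)
na.string   <- "NA"  # string treated as a missing value

# check.names = FALSE keeps sample names such as "Sample #1" unchanged.
raw <- read.delim(path, na.strings = na.string, check.names = FALSE,
                  stringsAsFactors = FALSE)

# Numeric matrix of intensities: one row per compound, one column per sample.
samples <- as.matrix(raw[, sampleStart:ncol(raw)])
rownames(samples) <- raw$Compound

# Number of values that an imputation method would have to fill in.
sum(is.na(samples))
```

With a header such as Compound, m/z, Metabolite, RT, Sample #1, ..., the same slicing would use `sampleStart <- 5`.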

Available imputation methods:

"knn": From the impute package, uses the k nearest neighbors to impute the values;
"RF": From the missForest package, uses the random forest algorithm to impute the values;
"QRILC": From the imputeLCMD package, uses quantile regression imputation of left-censored data to impute the values;
"SVD": From the pcaMethods package, uses the SVDimpute algorithm proposed by Troyanskaya et al. (2001) to impute the values;
"mean", "median", "0", "HM": simple value replacement by the mean, the median, or half the minimum of the row, or by 0.

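The simple replacement methods can be sketched in base R as follows. This is a minimal illustration of row-wise replacement under the description above, not the code used by the package; `simple_impute` and the matrix `x` are hypothetical names introduced for the example.

```r
# Replace the missing values of each row with one value derived from that row.
# `simple_impute` is a hypothetical helper, not part of VIQCing.
simple_impute <- function(x, method = c("mean", "median", "HM", "0")) {
  method <- match.arg(method)
  fill <- switch(method,
    mean   = function(v) mean(v, na.rm = TRUE),
    median = function(v) median(v, na.rm = TRUE),
    HM     = function(v) min(v, na.rm = TRUE) / 2,  # half of the row minimum
    "0"    = function(v) 0
  )
  # apply() over rows returns the rows as columns, so transpose back.
  out <- t(apply(x, 1, function(v) {
    v[is.na(v)] <- fill(v)
    v
  }))
  dimnames(out) <- dimnames(x)
  out
}

# Rows are compounds, columns are samples.
x <- rbind(a = c(2, NA, 4), b = c(NA, 10, 6))
simple_impute(x, "mean")  # row a is filled with 3, row b with 8
simple_impute(x, "HM")    # row a is filled with 1, row b with 3
```

A row that contains only missing values has no mean, median, or minimum to draw on, so such rows would need separate handling in this sketch.
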
`df`, the imputed dataset as a data frame.

impute package https://www.rdocumentation.org/packages/impute
missForest package https://www.rdocumentation.org/packages/missForest
imputeLCMD package https://www.rdocumentation.org/packages/imputeLCMD
pcaMethods package https://www.rdocumentation.org/packages/pcaMethods

For a dataset with the following header: Compound, m/z, Metabolite, RT, Sample #1, ...
imputation("dummySet.tsv", method="knn", transformation="log", sampleStart=5)

For a dataset with the following header, where the compound and metabolite columns are identified by position: compound, m/z, metabolite, RT, Sample #1, ...
imputation("dummySet.tsv", method="knn", transformation="log", metabolite = 3, compound = 1, sampleStart=5)