View source: R/create_mzdata.R
create_mzdata() pre-processes the LCMS data used for modelling in gordon01.
create_mzdata(parallel = FALSE, seed = 100, savecsv = FALSE,
  saverda = TRUE)
parallel | Logical indicating if missing values imputation should be run in parallel. If
seed | An integer used for setting seeds of random number generation
savecsv | Logical indicating if output should be saved as a
saverda | Logical indicating if a .rda file should be saved to /data
Initially, the function takes the raw output from xcms and removes unwanted data (e.g. retention times, isotopes and peak counts). Then, it creates new categorical variables based on the sample information. Finally, it replaces true non-detects with noise, removes poorly resolved mass features, and replaces the small number of remaining missing values using random forest imputation. The list below details the logic behind the missing values imputation:
TRUE NON-DETECTS:
Replace values missing in one class, but not others, with a random number
between zero and the minimum of the matrix (i.e. noise). To be considered a
true non-detect, a class should be missing at least 60 percent of its
values. Achieved with
metabolomics::MissingValues(group.cutoff = 0.6)
POORLY RESOLVED MASS FEATURES:
Remove mass features with more than 90 percent missing values. Achieved
with metabolomics::MissingValues(column.cutoff = 0.9)
FALSE NON-DETECTS:
Remaining missing values will be imputed using missForest::missForest().
Achieved with metabolomics::MissingValues(complete.matrix = FALSE)
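The three rules above can be illustrated with a small base R sketch. This is not the code used by create_mzdata() or by metabolomics::MissingValues(); the toy matrix, the class labels and the names x, class_id, group_cutoff, column_cutoff and filled are all made up for illustration. It also assumes that "missing in one class, but not others" means at least one class is below the group cutoff for that feature.

```r
# Toy intensity matrix: 10 samples (rows) x 3 mass features (columns).
# All names and values are hypothetical.
set.seed(100)
x <- matrix(runif(30, min = 50, max = 100), nrow = 10,
            dimnames = list(NULL, c("mz_a", "mz_b", "mz_c")))
class_id <- rep(c("control", "treatment"), each = 5)

x[1:4, "mz_a"] <- NA   # 80 percent missing in "control" only
x[1:10, "mz_b"] <- NA  # missing in every sample
x[7, "mz_c"] <- NA     # one isolated missing value

group_cutoff <- 0.6    # mirrors group.cutoff = 0.6
column_cutoff <- 0.9   # mirrors column.cutoff = 0.9
noise_max <- min(x, na.rm = TRUE)

# Rule 1 (true non-detects): fill with noise between 0 and the matrix minimum
# when a class is mostly missing and at least one other class is not.
filled <- x
for (j in seq_len(ncol(x))) {
  frac_missing <- tapply(is.na(x[, j]), class_id, mean)
  mostly_missing <- names(frac_missing)[frac_missing >= group_cutoff]
  if (length(mostly_missing) > 0 &&
      length(mostly_missing) < length(frac_missing)) {
    idx <- which(is.na(x[, j]) & class_id %in% mostly_missing)
    filled[idx, j] <- runif(length(idx), min = 0, max = noise_max)
  }
}

# Rule 2 (poorly resolved features): drop columns with more than
# 90 percent missing values.
keep <- colMeans(is.na(filled)) <= column_cutoff
filled <- filled[, keep, drop = FALSE]

# Rule 3 (false non-detects): whatever is still NA would be handed to
# random forest imputation; that step is not run in this sketch.
remaining <- sum(is.na(filled))
```

In this toy case the four control values of mz_a are replaced with noise, mz_b is dropped, and the single gap in mz_c is left for the imputation step.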
Returns a data frame of class tbl_df.
Using parallel = TRUE is not reproducible. Future versions of this function
may include support for reproducible RNG seeds when using parallel
processing. Although this function is exported, create_mzdata() was not
intended to be used outside of this package.
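As background to the reproducibility note, the base R sketch below shows why a fixed seed makes sequential random draws repeatable. It does not reproduce the parallel case, and draw_noise() is a hypothetical helper, not part of gordon01.

```r
# Hypothetical helper: seed the RNG, then draw n uniform random numbers.
draw_noise <- function(seed, n = 5) {
  set.seed(seed)
  runif(n)
}

first_run  <- draw_noise(100)
second_run <- draw_noise(100)
other_seed <- draw_noise(101)

# Same seed, same draws when run sequentially.
same_seed_matches <- identical(first_run, second_run)
```

When the work is split across parallel workers, a single set.seed() call in the parent session does not generally control each worker's random number stream, which is consistent with the caveat above.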
Benjamin R. Gordon
missing_values
registerDoMC
missForest