create_mzdata: Create mzdata.
In brgordon17/coralclass: Metabolomics Classification of Stressed Coral From Heron Island 2011


create_mzdata() pre-processes the LCMS data used for modelling in gordon01.

create_mzdata(parallel = FALSE, seed = 100, savecsv = FALSE, saverda = TRUE)

`parallel`	Logical indicating whether missing-value imputation should be run in parallel. If `TRUE`, the number of cores defaults to half of the available cores.
`seed`	An integer used to seed random number generation.
`savecsv`	Logical indicating whether the output should be saved as a `.csv` file in the current working directory.
`saverda`	Logical indicating whether a `.rda` file should be saved to `/data`.
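As a rough illustration of the "half of the available cores" default, the sketch below computes such a value with `parallel::detectCores()`. This is an assumption about how the count could be derived, not code taken from the package; `available` and `n_cores` are hypothetical names.

```r
# Hypothetical sketch: derive "half of the available cores".
# detectCores() can return NA on some platforms, so fall back to 1.
available <- parallel::detectCores()
if (is.na(available)) available <- 1L

# Integer division by two, but never fewer than one core.
n_cores <- max(1L, available %/% 2L)
n_cores
```

A value like `n_cores` is what would typically be passed to a parallel backend such as the one registered by `registerDoMC` (see See Also).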

Initially, the function takes the raw output from xcms and removes unwanted data (e.g. retention times, isotopes and peak counts). It then creates new categorical variables based on the sample information. Finally, it replaces true non-detects with noise, removes poorly resolved mass features, and replaces the small number of remaining missing values using random forest imputation. The list below details the logic behind the missing-value imputation:

TRUE NON-DETECTS: Values missing in one class, but not in others, are replaced with a random number between zero and the minimum of the matrix (i.e. noise). To be considered a true non-detect, a class must be missing at least 60 percent of its values. Achieved with metabolomics::MissingValues(group.cutoff = 0.6).
POORLY RESOLVED MASS FEATURES: Mass features with more than 90 percent missing values are removed. Achieved with metabolomics::MissingValues(column.cutoff = 0.9).
FALSE NON-DETECTS: Remaining missing values are imputed using missForest::missForest(). Achieved with metabolomics::MissingValues(complete.matrix = FALSE).
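The first two rules above can be sketched in base R on a toy matrix. This is an illustration of the described logic only, not the metabolomics::MissingValues() implementation. All names (`sample_class`, `x`, `group_cutoff`, `column_cutoff`) are hypothetical, and the sketch makes two simplifying assumptions: missingness fractions are computed on the original data before any replacement, and the class-wise rule does not check whether the other classes have observed values.

```r
# Illustrative sketch only: NOT the metabolomics::MissingValues() code.
set.seed(100)

# Toy data: 10 samples (rows) x 3 mass features (columns), two classes.
sample_class <- rep(c("control", "treated"), each = 5)
x <- matrix(runif(30, min = 1000, max = 5000), nrow = 10,
            dimnames = list(NULL, c("mz_a", "mz_b", "mz_c")))
x[6:9, "mz_a"] <- NA  # 80% missing in "treated": a true non-detect
x[, "mz_b"]    <- NA  # 100% missing overall: poorly resolved feature
x[2, "mz_c"]   <- NA  # isolated gap: a false non-detect

group_cutoff  <- 0.6
column_cutoff <- 0.9

is_missing <- is.na(x)
noise_max  <- min(x, na.rm = TRUE)  # upper bound for the noise values

# Poorly resolved features: more than 90% missing across all samples.
keep <- colMeans(is_missing) <= column_cutoff

# True non-detects: within a class, at least 60% of values missing.
for (j in seq_len(ncol(x))) {
  for (g in unique(sample_class)) {
    rows <- which(sample_class == g)
    if (mean(is_missing[rows, j]) >= group_cutoff) {
      fill <- rows[is_missing[rows, j]]
      x[fill, j] <- runif(length(fill), min = 0, max = noise_max)
    }
  }
}

# Drop the poorly resolved features.
x <- x[, keep, drop = FALSE]

# What is still NA here would be handed to random forest imputation.
sum(is.na(x))
```

In this toy example the gap left in `mz_c` is the kind of value that the third rule would pass on to missForest::missForest().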

Returns a data frame of class tbl_df.

Results are not reproducible when parallel = TRUE. Future versions of this function may include support for reproducible RNG seeds when using parallel processing. Although this function is exported, create_mzdata() was not intended to be used outside of this package.
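For context, the sketch below shows what a fixed seed buys in the serial case: two runs with the same seed draw identical random numbers. The helper `draw_noise()` is hypothetical and is not part of the package. With parallel workers, each worker usually draws from its own random number stream, which is one common reason a single call to set.seed() does not make results repeatable.

```r
# Hypothetical helper: seed the RNG, then draw n uniform noise values.
draw_noise <- function(seed, n = 5) {
  set.seed(seed)
  runif(n, min = 0, max = 1)
}

a <- draw_noise(100)
b <- draw_noise(100)

# Same seed, same serial code path: the draws match exactly.
identical(a, b)
```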

Benjamin R. Gordon

missing_values, registerDoMC, missForest

brgordon17/coralclass documentation built on June 15, 2020, 9:21 p.m.