lcms_data_analysis: Data analysis
In sipss/NIHSlcms: Signal processing of LC-MS metabolomics


Data analysis on AlpsLCMS can be performed on both lcms_dataset_1D full spectra and lcms_dataset_peak_table peak tables.

lcms_data_analysis(
  dataset,
  y_column,
  identity_column,
  balance_in_train = NULL,
  external_val,
  internal_val,
  data_analysis_method
)

`dataset`	An lcms_dataset_family object
`y_column`	A string with the name of the y column (present in the metadata of the dataset)
`identity_column`	`NULL` or a string with the name of the identity column (present in the metadata of the dataset).
`external_val, internal_val`	Each is a list with two elements: `iterations` and `test_size`. See random_subsampling for further details.
`data_analysis_method`	An lcms_data_analysis_method object
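
As a minimal sketch of the two validation arguments, the lists could be built as shown below. The source only names the elements `iterations` and `test_size`; the values used here, the assumption that `test_size` is a fraction of the samples, the helper `check_val_spec()`, and the `"Group"` column name are all hypothetical and not part of the package. See random_subsampling for the actual semantics.

```r
# Hypothetical helper (not part of the package): checks that a validation
# specification has the two elements named in the argument description.
check_val_spec <- function(spec) {
  stopifnot(is.list(spec))
  stopifnot(all(c("iterations", "test_size") %in% names(spec)))
  stopifnot(is.numeric(spec$iterations), length(spec$iterations) == 1)
  stopifnot(is.numeric(spec$test_size), length(spec$test_size) == 1)
  invisible(spec)
}

# Illustrative values only; test_size is assumed to be a fraction of samples
external_val <- list(iterations = 16, test_size = 0.25)
internal_val <- list(iterations = 10, test_size = 0.25)

check_val_spec(external_val)
check_val_spec(internal_val)

# The lists would then be passed to the call shown in Usage, for example:
# result <- lcms_data_analysis(
#   dataset,                        # an lcms_dataset_family object
#   y_column = "Group",             # assumed metadata column name
#   identity_column = NULL,
#   external_val = external_val,
#   internal_val = internal_val,
#   data_analysis_method = method   # an lcms_data_analysis_method object
# )
```

The call itself is left commented out because it needs a real dataset and a method object, neither of which is defined on this page.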

The workflow uses a double cross-validation strategy, with random subsampling to split the samples into train and test sets. The classification model and the metric used to choose the best model can be customized (see new_lcms_data_analysis_method()). For now, only a PLSDA classification model that selects the best model by area under the ROC curve is implemented (see the examples and plsda_auroc_vip_method).
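
The double cross-validation idea can be illustrated with a short base R sketch. This is not the package's implementation: the helper `subsample_splits()` is hypothetical, `test_size` is assumed to be the fraction of samples placed in the test set, and no model is fitted. It only shows how each outer training set is split again for the inner loop, so that the outer test samples are never used for model selection.

```r
# Hypothetical helper: draw `iterations` random train/test splits.
# test_size is assumed to be the fraction of samples placed in the test set.
subsample_splits <- function(sample_idx, iterations, test_size) {
  n_test <- max(1, round(length(sample_idx) * test_size))
  lapply(seq_len(iterations), function(i) {
    test_pos <- sample.int(length(sample_idx), n_test)
    list(train = sample_idx[-test_pos], test = sample_idx[test_pos])
  })
}

set.seed(1)
all_samples <- seq_len(40)

# Outer loop: held-out test sets for the final performance estimate
outer <- subsample_splits(all_samples, iterations = 3, test_size = 0.25)

# Inner loop: each outer training set is split again for model selection
nested <- lapply(outer, function(o) {
  list(
    outer_train = o$train,
    outer_test = o$test,
    inner = subsample_splits(o$train, iterations = 2, test_size = 0.25)
  )
})

# Check that no inner split contains a sample from its outer test set
leaks <- vapply(nested, function(o) {
  any(vapply(
    o$inner,
    function(s) any(c(s$train, s$test) %in% o$outer_test),
    logical(1)
  ))
}, logical(1))
stopifnot(!any(leaks))
```

In a real run, a model would be trained on each inner `train` set and scored on the matching inner `test` set to pick the best settings, which are then evaluated once on the outer `test` set.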

A list with the following elements:

train_test_partitions: A list with the indices used for train and test in each of the cross-validation iterations.
inner_cv_results: The output returned by train_evaluate_model on each inner cross-validation iteration.
inner_cv_results_digested: The output returned by choose_best_inner.
outer_cv_results: The output returned by train_evaluate_model on each outer cross-validation iteration.
outer_cv_results_digested: The output returned by train_evaluate_model_digest_outer.