lcms_data_analysis: Data analysis


View source: R/data_analysis.R

Description

Data analysis on AlpsLCMS can be performed on both lcms_dataset_1D full spectra and lcms_dataset_peak_table peak tables.

Usage

lcms_data_analysis(
  dataset,
  y_column,
  identity_column,
  balance_in_train = NULL,
  external_val,
  internal_val,
  data_analysis_method
)

Arguments

dataset

An lcms_dataset_family object

y_column

A string with the name of the y column (present in the metadata of the dataset)

identity_column

NULL or a string with the name of the identity column (present in the metadata of the dataset).

external_val, internal_val

A list with two elements: iterations and test_size. See random_subsampling for further details.

data_analysis_method

An lcms_data_analysis_method object
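
As a minimal sketch of the shape expected by external_val and internal_val, both can be written as plain named lists with the two documented elements. The numeric values below are hypothetical, and reading test_size as a fraction of the samples is an assumption; random_subsampling is the authoritative reference.

```r
# Hypothetical settings: 3 random splits for each validation loop.
# ASSUMPTION: test_size is the fraction of samples held out per split.
external_val <- list(iterations = 3, test_size = 0.25)
internal_val <- list(iterations = 3, test_size = 0.25)

# Each list carries exactly the two documented elements.
stopifnot(setequal(names(external_val), c("iterations", "test_size")))
stopifnot(setequal(names(internal_val), c("iterations", "test_size")))
```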

Details

The workflow uses a double cross-validation strategy, with random subsampling to split the data into train and test sets. The classification model and the metric used to choose the best model can be customized (see new_lcms_data_analysis_method()), but for now only a PLSDA classification model with a best area under the ROC curve metric is implemented (see the examples and plsda_auroc_vip_method).
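
The splitting scheme described above can be illustrated in base R. This is only a sketch of double cross-validation by random subsampling, not the package's implementation: the helper subsample() is hypothetical, the sample count and split settings are made up, and no model is fitted.

```r
set.seed(1)  # make the random splits reproducible

# Hypothetical helper (not part of the package): draw `iterations` random
# train/test splits of the indices 1..n, holding out a `test_size` fraction.
subsample <- function(n, iterations, test_size) {
  lapply(seq_len(iterations), function(i) {
    test <- sort(sample.int(n, size = round(n * test_size)))
    list(train = setdiff(seq_len(n), test), test = test)
  })
}

n_samples <- 40  # made-up number of samples
outer_splits <- subsample(n_samples, iterations = 3, test_size = 0.25)

# Inner splits are drawn only from each outer training set, so the outer
# test samples never take part in model selection.
double_cv <- lapply(outer_splits, function(outer) {
  inner <- subsample(length(outer$train), iterations = 2, test_size = 0.25)
  # Map positions within the outer training set back to sample indices.
  inner <- lapply(inner, function(s) {
    list(train = outer$train[s$train], test = outer$train[s$test])
  })
  list(outer = outer, inner = inner)
})
```

In such a scheme the inner splits would typically be used to choose the model settings, and the outer test sets would then give a performance estimate that is not biased by that choice.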

Value

A list with the following elements:


sipss/NIHSlcms documentation built on May 13, 2021, 6:19 p.m.