PomaPLS: Partial Least Squares Methods

PomaPLS    R Documentation

Partial Least Squares Methods

Description

PomaPLS performs three supervised analyses: Partial Least Squares (PLS) regression, Partial Least Squares Discriminant Analysis (PLS-DA) to classify samples, and Sparse Partial Least Squares Discriminant Analysis (sPLS-DA) to classify samples and select variables.

Usage

PomaPLS(
  data,
  method = "pls",
  y = NULL,
  ncomp = 5,
  labels = FALSE,
  ellipse = TRUE,
  cross_validation = FALSE,
  validation = "Mfold",
  folds = 5,
  nrepeat = 10,
  vip = 1,
  num_features = 10,
  theme_params = list()
)

Arguments

data

A SummarizedExperiment object.

method

Character. PLS method. Options include "pls", "plsda", and "splsda".

y

Character. Name of the colData column to be used as the dependent variable. If NULL (the default), the first variable in colData is used as the dependent variable.

ncomp

Numeric. Number of components in the model. Default is 5.

labels

Logical. Indicates whether sample names should be displayed. Default is FALSE.

ellipse

Logical. Indicates whether a 95 percent confidence interval ellipse should be displayed. Default is TRUE.

cross_validation

Logical. Indicates if cross-validation should be performed for PLS-DA ("plsda") and sPLS-DA ("splsda") methods. Default is FALSE.

validation

Character. (Only for "plsda" and "splsda" methods). Indicates the cross-validation method. Options are "Mfold" (default) and "loo" (Leave-One-Out).

folds

Numeric. (Only for "plsda" and "splsda" methods). Number of folds for "Mfold" cross-validation method (default is 5). If the validation method is "loo", this value is set to 1.

nrepeat

Numeric. (Only for "plsda" and "splsda" methods). Number of times the cross-validation process is repeated.

vip

Numeric. (Only for "plsda" method). Indicates the variable importance in the projection (VIP) cutoff.

num_features

Numeric. (Only for "splsda" method). Number of features to discriminate groups.

theme_params

List. Indicates theme_poma parameters.
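
The validation, folds, and nrepeat arguments describe repeated M-fold cross-validation. As a rough illustration of the idea only, the base R sketch below assigns samples to folds and reshuffles them on each repeat. The helper make_mfold_splits, its seed argument, and the sample count of 23 are hypothetical and are not part of POMA; the package's internal procedure may differ.

```r
# Hypothetical helper (not part of POMA): assign each of `n` samples to one of
# `folds` groups, and repeat the assignment `nrepeat` times with a new shuffle.
make_mfold_splits <- function(n, folds = 5, nrepeat = 10, seed = 1) {
  set.seed(seed)
  lapply(seq_len(nrepeat), function(i) {
    # Balanced fold labels 1..folds, randomly permuted across the samples
    sample(rep(seq_len(folds), length.out = n))
  })
}

n_samples <- 23  # assumed sample count, for illustration only
splits <- make_mfold_splits(n = n_samples, folds = 5, nrepeat = 10)

# One vector of fold labels per repeat, one label per sample
length(splits)
table(splits[[1]])

# Fold 1 of repeat 1 is held out as the test set; the rest is used for training
test_idx  <- which(splits[[1]] == 1)
train_idx <- setdiff(seq_len(n_samples), test_idx)

# Leave-one-out holds out a single sample at a time
loo_test_sets <- as.list(seq_len(n_samples))
```

Repeating the split several times averages out the effect of any single random partition on the estimated classification error.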

Value

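A list of results containing tables (tibbles) and plots (ggplot2 objects). The elements returned depend on method and cross_validation; the Examples section lists them for each case.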
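
For the "plsda" method, the returned list includes a table of VIP values, and the vip argument sets a cutoff on those scores. The base R sketch below shows what applying such a cutoff could look like. The data frame, its column names (feature, vip), the scores, and the rule of keeping scores at or above the cutoff are all assumptions made for illustration, not confirmed POMA behaviour.

```r
# Hypothetical VIP table; column names and scores are made up for illustration
vip_values <- data.frame(
  feature = c("m1", "m2", "m3", "m4"),
  vip     = c(1.8, 0.6, 1.0, 1.3),
  stringsAsFactors = FALSE
)

vip_cutoff <- 1  # plays the role of the `vip` argument

# Keep features whose VIP score is at or above the cutoff (assumed rule)
selected <- vip_values[vip_values$vip >= vip_cutoff, ]

# Order the retained features from most to least important
selected <- selected[order(selected$vip, decreasing = TRUE), ]
selected$feature
```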

Author(s)

Pol Castellano-Escuder

Examples

library(POMA)
library(magrittr) # provides the %>% pipe used below

data <- POMA::st000284 %>% # Example SummarizedExperiment object included in POMA
  PomaImpute() %>%
  PomaNorm()

## Output is a list with objects `factors` (tibble), `factors_plot` (ggplot2 object), `loadings` (tibble), and `loadings_plot` (ggplot2 object)
# PLS
data %>% 
  PomaPLS(method = "pls",
          y = NULL,
          ncomp = 5,
          labels = FALSE,
          ellipse = FALSE)

## Output is a list with objects `factors` (tibble), `factors_plot` (ggplot2 object), `vip_values` (tibble), and `vip_plot` (ggplot2 object)
# PLS-DA
data %>%
  PomaPLS(method = "plsda",
          y = NULL,
          ncomp = 5,
          labels = FALSE,
          ellipse = TRUE,
          cross_validation = FALSE,
          vip = 1)

# Alternative outcome (dependent variable)
data %>%
  PomaPLS(method = "plsda",
          y = "gender",
          ncomp = 5,
          labels = FALSE,
          ellipse = TRUE,
          cross_validation = FALSE,
          vip = 1)

## Output is a list with objects `factors` (tibble), `factors_plot` (ggplot2 object), `vip_values` (tibble), `vip_plot` (ggplot2 object), `errors` (tibble), and `errors_plot` (ggplot2 object)
# PLS-DA with Cross-Validation
data %>% 
  PomaPLS(method = "plsda",
          y = NULL,
          ncomp = 5,
          labels = FALSE,
          ellipse = TRUE,
          cross_validation = TRUE,
          validation = "Mfold",
          folds = 5,
          nrepeat = 10,
          vip = 1)

## Output is a list with objects `factors` (tibble), `factors_plot` (ggplot2 object), `selected_features` (tibble), and `selected_features_plot` (ggplot2 object)
# sPLS-DA
data %>% 
  PomaPLS(method = "splsda",
          y = NULL,
          ncomp = 5,
          labels = FALSE,
          ellipse = TRUE,
          cross_validation = FALSE,
          num_features = 10)

## Output is a list with objects `factors` (tibble), `factors_plot` (ggplot2 object), `selected_features` (tibble), `selected_features_plot` (ggplot2 object), `errors` (tibble), `errors_plot` (ggplot2 object), `optimal_components` (numeric value), and `optimal_features` (vector with optimal features per component)
# sPLS-DA with Cross-Validation
data %>% 
  PomaPLS(method = "splsda",
          y = NULL,
          ncomp = 3,
          labels = FALSE,
          ellipse = TRUE,
          cross_validation = TRUE,
          validation = "Mfold",
          folds = 5,
          nrepeat = 10,
          num_features = 10)
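
The calls above print their results directly. As a possible follow-up, the sketch below stores a PLS-DA result and reads individual elements by the names given in the comments above (factors, factors_plot, vip_values). The object names prepared and plsda_res and the flag has_poma are hypothetical, and the block does nothing unless POMA and magrittr are installed.

```r
# Check that the required packages are available before running anything
has_poma <- requireNamespace("POMA", quietly = TRUE) &&
  requireNamespace("magrittr", quietly = TRUE)

if (has_poma) {
  library(POMA)
  library(magrittr)

  # Same preprocessing as in the examples above
  prepared <- POMA::st000284 %>%
    PomaImpute() %>%
    PomaNorm()

  # Store the result instead of printing it
  plsda_res <- PomaPLS(prepared, method = "plsda", y = NULL, ncomp = 5, vip = 1)

  print(names(plsda_res))        # elements available in the returned list
  print(plsda_res$factors)       # tibble, as described in the comments above
  print(plsda_res$vip_values)    # tibble, as described in the comments above
  print(plsda_res$factors_plot)  # ggplot2 object
}
```

Keeping the result in a variable makes it easier to reuse the tables in later analysis steps or to save the plots separately.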

pcastellanoescuder/POMA documentation built on Nov. 18, 2024, 10:41 p.m.