pca_lda: PCA followed by Linear Discriminant Analysis

View source: R/pca_lda.R

pca_lda    R Documentation

PCA followed by Linear Discriminant Analysis

Description

This function applies Principal Component Analysis (PCA) followed by Linear Discriminant Analysis (LDA) to a given dataset. The data is first projected onto dp principal components, then further transformed via a two-step LDA procedure: an intermediate within-class projection of dimension di, followed by a final between-class projection of dimension dl. This sequence of transformations aims to reduce dimensionality while enhancing class separability.

Usage

pca_lda(
  X,
  Y,
  preproc = center(),
  dp = min(dim(X)),
  di = dp - 1,
  dl = length(unique(Y)) - 1
)

Arguments

X

A numeric matrix of size n x d, where n is the number of samples (rows) and d is the number of features (columns).

Y

A factor or numeric vector of length n representing class labels for each sample. If numeric, it will be converted to a factor.

preproc

A preprocessing function from the multivarious package (e.g. center(), scale()) to apply to the data before PCA. Defaults to centering.

dp

Integer. The dimension of the initial PCA projection. Defaults to min(dim(X)), i.e., the smaller of the number of samples or features. Must be at least 2 and at most min(n,d).

di

Integer. The dimension of the intermediate within-class projection. Defaults to dp - 1.

dl

Integer. The dimension of the between-class projection. Defaults to length(unique(Y)) - 1, the maximum number of discriminant axes in classical LDA (one fewer than the number of classes).
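
To see how the defaults resolve on a concrete dataset, consider iris (150 samples, 4 features, 3 classes). The values below simply evaluate the default expressions from the Usage section; they are not produced by pca_lda itself:

X <- as.matrix(iris[, 1:4])
Y <- iris[, 5]

dp <- min(dim(X))            # min(150, 4) = 4
di <- dp - 1                 # 4 - 1 = 3
dl <- length(unique(Y)) - 1  # 3 - 1 = 2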

Details

The function proceeds through the following steps (a schematic R sketch follows the list):

  1. Preprocessing: The data X is preprocessed using the specified preproc function.

  2. PCA Projection: The preprocessed data is projected onto the first dp principal components.

  3. Within-Class Scatter: The within-class scatter matrix Sw is computed in the PCA-transformed space.

  4. Between-Class Scatter: The between-class scatter matrix Sb is computed in the PCA-transformed space.

  5. Within-Class Projection: The eigen-decomposition of Sw is used to derive an intermediate projection of dimension di.

  6. Between-Class Projection: The projected group means are subjected to PCA to derive a final projection of dimension dl.

  7. Final Projection: The data is ultimately projected onto the dl-dimensional subspace that maximizes class separation.
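
For concreteness, here is a minimal base-R sketch of steps 2 through 7. Variable names are illustrative, and details (in particular the use of preproc, eigs_sym, and the exact form of the within-class projection) may differ from the actual implementation in R/pca_lda.R:

# Illustrative setup (iris): dp = 4, di = 3, dl = 2
X  <- as.matrix(iris[, 1:4]); Y <- iris[, 5]
dp <- min(dim(X)); di <- dp - 1; dl <- length(unique(Y)) - 1

Xc <- scale(X, center = TRUE, scale = FALSE)  # step 1: centering
pc <- svd(Xc, nu = 0, nv = dp)                # step 2: PCA basis (d x dp)
Z  <- Xc %*% pc$v                             # PCA scores (n x dp)

idx <- split(seq_len(nrow(Z)), Y)             # row indices per class
mu  <- colMeans(Z)                            # global mean in PCA space
M   <- t(sapply(idx, function(i) colMeans(Z[i, , drop = FALSE])))  # class means

# steps 3-4: within- and between-class scatter in the PCA space
Sw <- Reduce(`+`, lapply(idx, function(i) {
  D <- sweep(Z[i, , drop = FALSE], 2, colMeans(Z[i, , drop = FALSE]))
  crossprod(D)
}))
Sb <- Reduce(`+`, lapply(seq_along(idx), function(k) {
  d <- M[k, ] - mu
  length(idx[[k]]) * tcrossprod(d)
}))

# step 5: di-dimensional within-class projection (whitened eigenvectors of Sw)
ew <- eigen(Sw, symmetric = TRUE)
Wi <- sweep(ew$vectors[, seq_len(di), drop = FALSE], 2,
            sqrt(ew$values[seq_len(di)]), "/")

# step 6: eigen-decomposition of the between-class scatter of the
# projected class means -> dl discriminant axes
eb <- eigen(t(Wi) %*% Sb %*% Wi, symmetric = TRUE)
Wl <- eb$vectors[, seq_len(dl), drop = FALSE]

# step 7: compose the full d x dl projection and project the data
W      <- pc$v %*% Wi %*% Wl
scores <- Xc %*% W   # n x dl, analogous to the s component of the result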

Value

An object of class discriminant_projector (from multivarious) containing the following (a short usage sketch follows the list):

  • rotation: The final projection matrix of size d x dl, mapping from original features to dl-dimensional space.

  • s: The projected data scores of size n x dl, where each row is a sample in the reduced space.

  • sdev: The standard deviations of each dimension in the projected space.

  • labels: The class labels associated with each sample.

  • dp, di, dl: The specified or inferred PCA/LDA dimensions.

  • preproc: The preprocessing object used.
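
A quick sketch of how the result might be inspected, assuming the components listed above are accessible by name on the returned object (accessor generics from multivarious may be preferable in practice):

res <- pca_lda(as.matrix(iris[, 1:4]), iris[, 5], di = 3)

dim(res$rotation)   # 4 x 2: maps original features to the discriminant space
head(res$s)         # first rows of the n x 2 discriminant scores
plot(res$s, col = as.integer(factor(res$labels)),
     xlab = "LD1", ylab = "LD2")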

See Also

pca, eigs_sym

Examples

## Not run: 
data(iris)
X <- as.matrix(iris[, 1:4])
Y <- iris[, 5]
# With iris, dp defaults to min(150, 4) = 4 and dl to 3 - 1 = 2;
# di = 3 is supplied explicitly, giving a 2-dimensional discriminant space
res <- pca_lda(X, Y, di = 3)

## End(Not run)
