pca_lda: PCA followed by Linear Discriminant Analysis

View source: R/pca_lda.R

pca_lda    R Documentation

PCA followed by Linear Discriminant Analysis

Description

This function applies Principal Component Analysis (PCA) followed by Linear Discriminant Analysis (LDA) to a given dataset. The data is first projected onto dp principal components, then further transformed via a two-step LDA procedure: an intermediate within-class projection of dimension di, followed by a final between-class projection of dimension dl. This sequence of transformations aims to reduce dimensionality while enhancing class separability.

Usage

pca_lda(
  X,
  Y,
  preproc = center(),
  dp = min(dim(X)),
  di = dp - 1,
  dl = length(unique(Y)) - 1
)

Arguments

X

A numeric matrix of size n x d, where n is the number of samples (rows) and d is the number of features (columns).

Y

A factor or numeric vector of length n representing class labels for each sample. If numeric, it will be converted to a factor.

preproc

A preprocessing function from the multivarious package (e.g. center(), scale()) to apply to the data before PCA. Defaults to centering.

dp

Integer. The dimension of the initial PCA projection. Defaults to min(dim(X)), i.e., the smaller of the number of samples or features. Must be at least 2 and at most min(n,d).

di

Integer. The dimension of the intermediate within-class projection. Defaults to dp - 1.

dl

Integer. The dimension of the between-class projection. Defaults to length(unique(Y)) - 1, the maximum number of discriminant axes in classical LDA (one fewer than the number of classes).
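
To see how the defaults resolve on a concrete dataset, consider iris (150 samples, 4 features, 3 classes). The values below simply evaluate the default expressions from the Usage section; they are not produced by pca_lda itself:

X <- as.matrix(iris[, 1:4])
Y <- iris[, 5]

dp <- min(dim(X))            # min(150, 4) = 4
di <- dp - 1                 # 4 - 1 = 3
dl <- length(unique(Y)) - 1  # 3 - 1 = 2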

Details

The function proceeds through the following steps (a schematic R sketch follows the list):

  1. Preprocessing: The data X is preprocessed using the specified preproc function.

  2. PCA Projection: The preprocessed data is projected onto the first dp principal components.

  3. Within-Class Scatter: The within-class scatter matrix Sw is computed in the PCA-transformed space.

  4. Between-Class Scatter: The between-class scatter matrix Sb is computed in the PCA-transformed space.

  5. Within-Class Projection: The eigen-decomposition of Sw is used to derive an intermediate projection of dimension di.

  6. Between-Class Projection: The projected group means are subjected to PCA to derive a final projection of dimension dl.

  7. Final Projection: The data is ultimately projected onto the dl-dimensional subspace that maximizes class separation.
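
For concreteness, here is a minimal base-R sketch of steps 2 through 7. Variable names are illustrative, and details (in particular the use of preproc, eigs_sym, and the exact form of the within-class projection) may differ from the actual implementation in R/pca_lda.R:

# Illustrative setup (iris): dp = 4, di = 3, dl = 2
X  <- as.matrix(iris[, 1:4]); Y <- iris[, 5]
dp <- min(dim(X)); di <- dp - 1; dl <- length(unique(Y)) - 1

Xc <- scale(X, center = TRUE, scale = FALSE)  # step 1: centering
pc <- svd(Xc, nu = 0, nv = dp)                # step 2: PCA basis (d x dp)
Z  <- Xc %*% pc$v                             # PCA scores (n x dp)

idx <- split(seq_len(nrow(Z)), Y)             # row indices per class
mu  <- colMeans(Z)                            # global mean in PCA space
M   <- t(sapply(idx, function(i) colMeans(Z[i, , drop = FALSE])))  # class means

# steps 3-4: within- and between-class scatter in the PCA space
Sw <- Reduce(`+`, lapply(idx, function(i) {
  D <- sweep(Z[i, , drop = FALSE], 2, colMeans(Z[i, , drop = FALSE]))
  crossprod(D)
}))
Sb <- Reduce(`+`, lapply(seq_along(idx), function(k) {
  d <- M[k, ] - mu
  length(idx[[k]]) * tcrossprod(d)
}))

# step 5: di-dimensional within-class projection (whitened eigenvectors of Sw)
ew <- eigen(Sw, symmetric = TRUE)
Wi <- sweep(ew$vectors[, seq_len(di), drop = FALSE], 2,
            sqrt(ew$values[seq_len(di)]), "/")

# step 6: eigen-decomposition of the between-class scatter of the
# projected class means -> dl discriminant axes
eb <- eigen(t(Wi) %*% Sb %*% Wi, symmetric = TRUE)
Wl <- eb$vectors[, seq_len(dl), drop = FALSE]

# step 7: compose the full d x dl projection and project the data
W      <- pc$v %*% Wi %*% Wl
scores <- Xc %*% W   # n x dl, analogous to the s component of the result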

Value

An object of class discriminant_projector (from multivarious) containing the following (a short usage sketch follows the list):

  • rotation: The final projection matrix of size d x dl, mapping from original features to dl-dimensional space.

  • s: The projected data scores of size n x dl, where each row is a sample in the reduced space.

  • sdev: The standard deviations of each dimension in the projected space.

  • labels: The class labels associated with each sample.

  • dp, di, dl: The specified or inferred PCA/LDA dimensions.

  • preproc: The preprocessing object used.
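
A quick sketch of how the result might be inspected, assuming the components listed above are accessible by name on the returned object (accessor generics from multivarious may be preferable in practice):

res <- pca_lda(as.matrix(iris[, 1:4]), iris[, 5], di = 3)

dim(res$rotation)   # 4 x 2: maps original features to the discriminant space
head(res$s)         # first rows of the n x 2 discriminant scores
plot(res$s, col = as.integer(factor(res$labels)),
     xlab = "LD1", ylab = "LD2")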

See Also

pca, eigs_sym

Examples

## Not run: 
data(iris)
X <- as.matrix(iris[, 1:4])
Y <- iris[, 5]
# With iris, dp defaults to min(150, 4) = 4 and dl to 3 - 1 = 2;
# di = 3 is supplied explicitly, giving a 2-dimensional discriminant space
res <- pca_lda(X, Y, di = 3)

## End(Not run)
