soft_lda: Partially Supervised LDA with Soft Labels (Robust Version)

View source: R/soft_lda.R, R/em_soft_lda.R


Partially Supervised LDA with Soft Labels (Robust Version)

Description

Fits a Linear Discriminant Analysis (LDA) model using soft labels and the Evidential EM (E²M) algorithm, with improved robustness and optional regularization.

This function implements a soft-label variant of Linear Discriminant Analysis (SL-LDA), following the approach described in Zhao et al. (2014); see Details and References.

Usage

soft_lda(
  X,
  C,
  preproc = pass(),
  dp = min(dim(X)),
  di = dp - 1,
  dl = ncol(C) - 1,
  alpha = 0
)


Arguments

X

A numeric matrix (\(n \times d\)); rows = samples, columns = features.

C

A numeric matrix (\(n \times c\)) of soft memberships. C[i, j] = weight of sample i for class j. Entries must be \(\ge 0\); row sums can be any positive value (if \(\sum_j C_{i,j} = 1\), each row is a probability distribution).
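For illustration, a minimal sketch of constructing such a soft-label matrix from hard labels, blending toward a uniform distribution to encode label uncertainty (the blend weight eps is an illustrative choice, not a package default):

y <- factor(c("a", "a", "b", "c"))
K <- nlevels(y)
C_hard <- model.matrix(~ y - 1)        # n x K one-hot (hard) memberships
eps <- 0.1                             # uncertainty mass (illustrative)
C <- (1 - eps) * C_hard + eps / K      # rows still sum to 1
colnames(C) <- levels(y)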

preproc

A pre_processor from multivarious, e.g. center() or pass(). Defaults to pass() (no centering).

dp

Integer. Number of principal components to keep in the first PCA step. Defaults to min(dim(X)).

di

Integer. Dimension of the within-class subspace. Default dp - 1.

dl

Integer. Dimension of the final subspace for between-class separation. Default ncol(C) - 1.

alpha

A numeric ridge parameter (\(\ge 0\)). If alpha > 0, \(\alpha I\) is added to \(\widetilde{S}_w\) to ensure invertibility. Default 0.

PL

A numeric matrix (n x K) of plausibility values for each instance and class. This and the following arguments (max_iter, tol, n_starts, reg, verbose) apply to the Evidential EM fitting routine (see R/em_soft_lda.R).

max_iter

Integer, maximum number of E²M iterations. Default: 100.

tol

Numeric tolerance for convergence in the log-likelihood. Default: 1e-6.

n_starts

Integer, number of random initializations. The best solution (highest final log-likelihood) is chosen. Default: 5.

reg

Numeric, a small ridge penalty to add to the covariance matrix for numerical stability. Default: 1e-9.

verbose

Logical, if TRUE prints progress messages. Default: FALSE.

Details


Instead of hard (0/1) labels, each sample can have fractional memberships (soft labels) across multiple classes. These memberships are encoded in a matrix C, typically obtained via a label-propagation or fuzzy labeling step. SL-LDA uses these soft memberships to form generalized scatter matrices \(\widetilde{S}_w\) and \(\widetilde{S}_b\), then solves an LDA-like dimension-reduction problem in a PCA subspace via the following steps:

  1. Preprocessing: Apply a preproc function (e.g. center()) to the data X.

  2. PCA: Project the data onto the top dp principal components (to handle rank deficiency).

  3. Compute Soft-Label Scatter in the PCA space:

    • Let \(\mathbf{F} = \mathbf{C}^\top\) (size \(c \times n\)).

    • Let \(\mathbf{E} = \mathrm{diag}(\mathrm{rowSums}(\mathbf{C}))\) (size \(n \times n\)).

    • Let \(\mathbf{G} = \mathrm{diag}(\mathrm{colSums}(\mathbf{C}))\) (size \(c \times c\)).

    • Form \(\widetilde{S}_w = X_p^\top (E - F^\top G^{-1} F) X_p + \alpha I\) (within-class), and \(\widetilde{S}_b = X_p^\top \bigl(F^\top G^{-1} F - \tfrac{E e e^\top E}{e^\top E e}\bigr) X_p\) (between-class), where \(e\) is the all-ones vector.

  4. Within-class projection (di): Partially diagonalize \(\widetilde{S}_w\). In code, we extract di eigenvectors. (Note: some references keep the largest eigenvalues, others the smallest.)

  5. Between-class projection (dl): Project the (soft) class means into the di-dim subspace, then run a small PCA for dimension dl.

  6. Combine: Multiply the \((d \times d_p)\), \((d_p \times d_i)\), and \((d_i \times d_l)\) matrices to get the final \((d \times d_l)\) projection matrix.

In typical references, one might pick the largest eigenvalues of \(\widetilde{S}_w\) for stable inversion, but certain variants (like Null-LDA) use the smallest eigenvalues. Adjust the call to RSpectra::eigs_sym() accordingly if you prefer a different variant, as in the sketch below.
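For example, a small sketch of switching the eigenvalue order with RSpectra, using a toy symmetric matrix as a stand-in for \(\widetilde{S}_w\) (Sw_toy and di here are illustrative names, not package internals):

Sw_toy <- crossprod(matrix(rnorm(50), 10, 5))   # 5 x 5 symmetric PSD stand-in
di <- 3
V_large <- RSpectra::eigs_sym(Sw_toy, k = di, which = "LA")$vectors  # largest eigenvalues
V_small <- RSpectra::eigs_sym(Sw_toy, k = di, which = "SA")$vectors  # smallest (Null-LDA-style)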

If you want to confirm \(\widetilde{S}_t = \widetilde{S}_w + \widetilde{S}_b\) numerically, you can define a helper function for \(\widetilde{S}_t\) and compare it to \(\widetilde{S}_w + \widetilde{S}_b\), as in the sketch below.
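A minimal sketch of these scatter computations in plain R, assuming X_p holds the \(n \times d_p\) PCA scores and C the soft-label matrix (Fm plays the role of \(\mathbf{F}\) above; this is an illustration, not the package's internal code):

soft_scatter <- function(X_p, C, alpha = 0) {
  E  <- diag(rowSums(C))                          # n x n
  G  <- diag(colSums(C))                          # c x c
  Fm <- t(C)                                      # c x n, the matrix F above
  e  <- rep(1, nrow(C))                           # all-ones vector
  M  <- (E %*% e %*% t(e) %*% E) / drop(t(e) %*% E %*% e)
  Sw <- crossprod(X_p, (E - crossprod(Fm, solve(G) %*% Fm)) %*% X_p) +
        alpha * diag(ncol(X_p))
  Sb <- crossprod(X_p, (crossprod(Fm, solve(G) %*% Fm) - M) %*% X_p)
  St <- crossprod(X_p, (E - M) %*% X_p)           # total scatter about weighted mean
  list(Sw = Sw, Sb = Sb, St = St)
}

# Numerical check that S_t = S_w + S_b (with alpha = 0); should be near 0:
# sc <- soft_scatter(X_p, C)
# max(abs(sc$St - (sc$Sw + sc$Sb)))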

Value

For the Evidential EM routine (R/em_soft_lda.R), a list with:

pi

Estimated class priors

mu

Estimated class means

Sigma

Estimated covariance matrix

zeta

Posterior class probabilities (n x K)

loglik

Final log evidential likelihood

iter

Number of iterations performed

For soft_lda, a discriminant_projector object with subclass "soft_lda" containing:

  • v ~ The \((d \times d_l)\) final projection matrix.

  • s ~ The \((n \times d_l)\) projected scores of the training set.

  • sdev ~ The std dev of each dimension in s.

  • labels ~ Currently set to colnames(C) (or NULL).

  • preproc ~ The preprocessing object used.

  • classes ~ A string "soft_lda".

References

Quost, B., Denoeux, T., & Li, S. (2017). "Parametric classification with soft labels using the Evidential EM algorithm." Advances in Data Analysis and Classification, 11(4), 659-690.

Zhao, M., Zhang, Z., Chow, T.W.S., & Li, B. (2014). "A general soft label based Linear Discriminant Analysis for semi-supervised dimensionality reduction." Neurocomputing, 135, 250-264.

Examples

set.seed(123)
n <- 100; d <- 2; K <- 3
X <- rbind(
  MASS::mvrnorm(n, c(1,0), diag(2)),
  MASS::mvrnorm(n, c(-1,1), diag(2)),
  MASS::mvrnorm(n, c(0,-1), diag(2))
)
Y <- c(rep(1,n), rep(2,n), rep(3,n))
# Soft labels: add uncertainty
PL <- matrix(0, 3*n, K)
for (i in 1:(3*n)) {
  if (runif(1)<0.2) {
    alt <- sample(setdiff(1:K, Y[i]),1)
    PL[i,Y[i]] <- 0.5
    PL[i,alt] <- 0.5
  } else {
    PL[i,Y[i]] <- 1
  }
}
res <- soft_lda(X, C = PL)  # plausibilities serve as the soft-label matrix C
str(res)
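
# Hedged follow-up sketch: project held-out data through the fitted
# projector, assuming multivarious's project() generic dispatches on
# discriminant_projector objects (check the multivarious documentation):
X_new <- MASS::mvrnorm(10, c(0, 0), diag(2))
scores_new <- multivarious::project(res, X_new)
dim(scores_new)  # expected: 10 x dl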
