wMLDA: Weighted Multi-label Linear Discriminant Analysis (wMLDA)

View source: R/wMLDA.R

wMLDA R Documentation

Weighted Multi-label Linear Discriminant Analysis (wMLDA)

Description

This function implements the weighted multi-label linear discriminant analysis (wMLDA) framework described in "A weighted linear discriminant analysis framework for multi-label feature extraction" by Jianhua Xu (2017). The framework unifies several weight forms for multi-label LDA: binary, correlation, entropy, fuzzy, and dependence-based weighting. Each weight form determines how much each instance contributes to each label, and these weights in turn define the between-class and within-class scatter matrices.
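To make the role of the weights concrete, the following base R sketch builds weighted scatter matrices from a weight matrix W (n x q). It assumes a common weighted-LDA formulation (weighted label means, Sw summed over labels, Sb built from the label means) and is not taken from the package source, so the actual implementation may differ. All variable names and the toy data are illustrative.

```r
# Toy data: 12 samples, 4 features, 3 labels (all values made up)
set.seed(1)
n <- 12; d <- 4; q <- 3
X <- matrix(rnorm(n * d), n, d)
Y <- diag(q)[rep(seq_len(q), length.out = n), ]  # one label per sample
Y[seq(2, n, by = 4), 1] <- 1                     # some samples get a second label
W <- Y / rowSums(Y)                              # entropy-style weights (assumed here)

n_k <- colSums(W)                    # effective number of samples per label
M   <- (t(W) %*% X) / n_k            # q x d matrix of weighted label means
m   <- colSums(t(W) %*% X) / sum(W)  # weighted global mean

# Between-class scatter: sum_k n_k (m_k - m)(m_k - m)^T
Mc <- sweep(M, 2, m)
Sb <- t(Mc) %*% (Mc * n_k)

# Within-class scatter: sum_k sum_i w_ik (x_i - m_k)(x_i - m_k)^T
Sw <- matrix(0, d, d)
for (k in seq_len(q)) {
  Xc <- sweep(X, 2, M[k, ])
  Sw <- Sw + t(Xc) %*% (Xc * W[, k])
}

# Weighted total scatter, used only as a consistency check (St = Sw + Sb)
Xg <- sweep(X, 2, m)
St <- t(Xg) %*% (Xg * rowSums(W))

# Sb has at most q - 1 non-zero eigenvalues under this formulation
rank_Sb <- sum(eigen(Sb, symmetric = TRUE)$values > 1e-8)
```

Under these assumptions Sb has rank at most q - 1, which is consistent with the upper bound on ncomp.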

Usage

wMLDA(
  X,
  Y,
  weight_method = c("binary", "correlation", "entropy", "fuzzy", "dependence"),
  ncomp = NULL,
  max_iter_fuzzy = 100,
  tol_fuzzy = 1e-06,
  max_iter_dep = 100,
  preproc = multivarious::center(),
  reg = 1e-09,
  seed = NULL
)

Arguments

X

A numeric matrix or data frame of size n x d, where n is the number of samples and d is the number of features.

Y

A binary label matrix of size n x q, where Y[i, k] = 1 if sample i has label k, and 0 otherwise.

weight_method

A character string specifying the weight form to use. One of:

  • "binary": Each relevant label gets weight 1, so an instance with several labels is counted once per label and may be over-counted.

  • "correlation": Uses global label correlation to determine weights, possibly assigning positive weights to irrelevant labels.

  • "entropy": Weights are the reciprocal of the number of relevant labels, distributing weights evenly among relevant labels.

  • "fuzzy": A fuzzy membership approach that uses both label and feature information.

  • "dependence": A dependence-based form using Hilbert-Schmidt independence criterion (HSIC) and random block coordinate descent.
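The two simplest weight forms follow directly from the descriptions above. The base R sketch below builds them for a toy label matrix. The helper names binary_weights and entropy_weights are hypothetical and are not part of the package, whose internals may differ.

```r
# Toy label matrix: 4 samples (rows), 3 labels (columns)
Y <- matrix(c(1, 0, 0,
              1, 1, 0,
              0, 1, 1,
              1, 1, 1), nrow = 4, byrow = TRUE)

# Hypothetical helper: binary form, weight 1 for every relevant label
binary_weights <- function(Y) Y

# Hypothetical helper: entropy form, each relevant label gets
# 1 / (number of relevant labels of that sample)
entropy_weights <- function(Y) {
  n_rel <- rowSums(Y)
  stopifnot(all(n_rel > 0))  # assumes every sample has at least one label
  Y / n_rel                  # row i is divided by n_rel[i]
}

W_bin <- binary_weights(Y)
W_ent <- entropy_weights(Y)

rowSums(W_bin)  # total weight per sample grows with its number of labels
rowSums(W_ent)  # total weight per sample is always 1
```

The row sums show the difference: under the binary form a sample with three labels carries three times the total weight of a single-label sample, while the entropy form gives every sample the same total weight.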

ncomp

The number of components (dimensions) to extract. Must be at most q - 1. Defaults to q - 1.

max_iter_fuzzy

Maximum number of iterations for the fuzzy method. Default 100.

tol_fuzzy

Convergence tolerance for the fuzzy method. Default 1e-6.

max_iter_dep

Maximum number of epochs for the random block coordinate descent method (RBCDM) used by the dependence-based weight form. Default 100.

preproc

A preprocessing step from multivarious, e.g. center() or scale(). Defaults to center().

reg

A small regularization value added to the within-class scatter matrix Sw to ensure invertibility. Default 1e-9.
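As a rough illustration of why this matters, the base R sketch below uses made-up scatter matrices in which Sw is exactly singular. It assumes the regularizer is added to the diagonal (Sw + reg * I) and that the discriminant directions are the leading eigenvectors of the regularized problem. This is a common approach and not necessarily the package's exact procedure.

```r
# Toy scatter matrices (values made up for illustration)
Sw <- diag(c(2, 1, 0))          # singular: no within-class variance on feature 3
Sb <- matrix(c(1.0, 0.5, 0.0,
               0.5, 1.0, 0.0,
               0.0, 0.0, 0.0), nrow = 3, byrow = TRUE)
reg <- 1e-9

# Inverting Sw directly fails because it is exactly singular
singular <- inherits(try(solve(Sw), silent = TRUE), "try-error")

# Adding reg to the diagonal makes the matrix invertible
Sw_reg <- Sw + reg * diag(nrow(Sw))

# Discriminant directions: leading eigenvectors of Sw_reg^{-1} Sb
eig <- eigen(solve(Sw_reg, Sb))
ncomp <- 2
rotation <- Re(eig$vectors[, seq_len(ncomp), drop = FALSE])
```

A larger reg trades some fidelity to the data for numerical stability, which can help when d is large relative to n.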

seed

Random seed for reproducibility. Default NULL (the seed is not set).

Details

The final result is returned as a discriminant_projector object from the multivarious package, which can be integrated into downstream analytical workflows (e.g. applying the projection to new data).
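Conceptually, applying the projection to new data means repeating the training preprocessing and multiplying by the projection matrix. The base R sketch below illustrates this with stand-in objects, assuming centering was the only preprocessing step. The rotation matrix here is random rather than a fitted wMLDA result, and in practice the projection methods provided by multivarious should be preferred over this manual version.

```r
set.seed(42)
X_train <- matrix(rnorm(50 * 5), nrow = 50)  # training data, n x d
X_new   <- matrix(rnorm(10 * 5), nrow = 10)  # new data with the same d features

# Stand-in for a fitted d x ncomp projection matrix (random orthonormal columns)
rotation <- qr.Q(qr(matrix(rnorm(5 * 2), nrow = 5)))

# Centering is learned on the training data and reused for new data
mu <- colMeans(X_train)

scores_train <- sweep(X_train, 2, mu) %*% rotation  # n x ncomp
scores_new   <- sweep(X_new, 2, mu) %*% rotation    # n_new x ncomp

dim(scores_new)
```

The key design point is that the training means, not the means of the new data, are subtracted before projecting.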

Value

A discriminant_projector object containing:

rotation

The projection matrix (d x ncomp) mapping original features into discriminant space.

s

The projected scores of the training data (n x ncomp).

sdev

Standard deviations of the scores.

labels

The class (label) information.

preproc

The preprocessing object.

classes

The string "wMLDA".

References

Xu, J. "A weighted linear discriminant analysis framework for multi-label feature extraction." Knowledge-Based Systems, Volume 131, 2017, Pages 1-13.

Examples

## Not run: 
library(multivarious)
library(discursive)  # assumed to be the package that provides wMLDA()
set.seed(123)
X <- matrix(rnorm(100*5), nrow=100, ncol=5)
# Suppose we have 3 labels:
Y <- matrix(0, nrow=100, ncol=3)
# Assign random labels:
for (i in 1:100) {
  lab_count <- sample(1:3, 1)
  chosen <- sample(1:3, lab_count)
  Y[i, chosen] <- 1
}

res <- wMLDA(X, Y, weight_method="entropy", ncomp=2)
str(res)

## End(Not run)

bbuchsbaum/discursive documentation built on April 14, 2025, 4:57 p.m.