wMLDA: Weighted Multi-label Linear Discriminant Analysis (wMLDA)
In bbuchsbaum/discursive

Description

This function implements the weighted multi-label linear discriminant analysis (wMLDA) framework described in the paper "A weighted linear discriminant analysis framework for multi-label feature extraction" by Jianhua Xu (2017). The framework unifies several weight forms for multi-label LDA: binary, correlation, entropy, fuzzy, and dependence-based weighting. Each weight form determines how much each instance contributes to each label, and those weights in turn define the between-class and within-class scatter matrices.
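To make the role of the weights concrete, the following base-R sketch builds the binary and entropy weight matrices from a small label matrix and forms weighted scatter matrices from them. The helper name `weighted_scatter` and the exact scatter formulas (weighted label means, with each label's total weight acting as its class size) are assumptions made for illustration, not the package's internal code.

```r
# Toy data: 6 samples, 2 features, 3 labels (every sample has >= 1 label)
X <- matrix(c(1, 2, 3, 4, 5, 6,
              2, 1, 4, 3, 6, 5), ncol = 2)
Y <- matrix(c(1, 1, 0, 0, 1, 0,
              0, 1, 1, 0, 0, 1,
              0, 0, 0, 1, 1, 1), ncol = 3)

# Binary form: every relevant label gets weight 1
W_binary <- Y
# Entropy form: each relevant label gets 1 / (number of relevant labels)
W_entropy <- Y / rowSums(Y)

# Hypothetical helper: weighted scatter matrices for a weight matrix W (n x q)
weighted_scatter <- function(X, W) {
  n_k <- colSums(W)                    # total weight per label
  M <- t(W) %*% X / n_k                # q x d matrix of weighted label means
  m <- colSums(n_k * M) / sum(n_k)     # overall weighted mean
  d <- ncol(X)
  Sw <- matrix(0, d, d)
  Sb <- matrix(0, d, d)
  for (k in seq_len(ncol(W))) {
    Xc <- sweep(X, 2, M[k, ])          # deviations from the label-k mean
    Sw <- Sw + crossprod(Xc * sqrt(W[, k]))
    Sb <- Sb + n_k[k] * tcrossprod(M[k, ] - m)
  }
  list(Sw = Sw, Sb = Sb)
}

sc <- weighted_scatter(X, W_entropy)
```

With entropy weights each row of the weight matrix sums to 1, so every sample contributes one unit of total weight. With binary weights a sample carrying several labels contributes once per label, which is the over-counting mentioned for that form.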

Usage

wMLDA(
  X,
  Y,
  weight_method = c("binary", "correlation", "entropy", "fuzzy", "dependence"),
  ncomp = NULL,
  max_iter_fuzzy = 100,
  tol_fuzzy = 1e-06,
  max_iter_dep = 100,
  preproc = multivarious::center(),
  reg = 1e-09,
  seed = NULL
)

Arguments

`X`	A numeric matrix or data frame of size n x d, where n is the number of samples and d is the number of features.
`Y`	A binary label matrix of size n x q, where `Y[i, k] = 1` if sample i has label k, and 0 otherwise.
`weight_method`	A character string specifying the weight form to use. One of:
	"binary": Each relevant label gets weight 1, potentially over-counting for multi-label instances.
	"correlation": Uses global label correlation to determine weights, possibly assigning positive weights to irrelevant labels.
	"entropy": Weights are the reciprocal of the number of relevant labels, distributing weights evenly among relevant labels.
	"fuzzy": A fuzzy membership approach that uses both label and feature information.
	"dependence": A dependence-based form using the Hilbert-Schmidt independence criterion (HSIC) and random block coordinate descent.
`ncomp`	The number of components (dimensions) to extract. Must be at most `q - 1`. Defaults to `q - 1`.
`max_iter_fuzzy`	Maximum number of iterations for the fuzzy method. Default 100.
`tol_fuzzy`	Convergence tolerance for the fuzzy method. Default 1e-6.
`max_iter_dep`	Maximum number of epochs for the random block coordinate descent method (RBCDM) used by the dependence-based form. Default 100.
`preproc`	A preprocessing step from `multivarious`, e.g. `center()` or `scale()`. Defaults to `center()`.
`reg`	A small regularization value added to `Sw` to ensure invertibility. Default 1e-9.
`seed`	Random seed for reproducibility. Default NULL (the seed is not set).
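The `reg` and `ncomp` arguments can be read against the following base-R sketch of a regularized discriminant solve. It assumes the projection comes from the eigenvectors of `solve(Sw + reg * I) %*% Sb`, which is the usual LDA formulation; the function's internals may differ, and `solve_discriminant` is a hypothetical name. Under that assumption, `Sb` is built from `q` label means around a common mean and so has rank at most `q - 1`, which would explain the upper bound on `ncomp`.

```r
# Hypothetical helper: top `ncomp` discriminant directions from Sw and Sb
solve_discriminant <- function(Sw, Sb, ncomp, reg = 1e-9) {
  d <- nrow(Sw)
  Sw_reg <- Sw + reg * diag(d)         # ridge term keeps Sw invertible
  eig <- eigen(solve(Sw_reg, Sb))      # eigenpairs of Sw_reg^{-1} Sb
  ord <- order(Re(eig$values), decreasing = TRUE)[seq_len(ncomp)]
  list(rotation = Re(eig$vectors[, ord, drop = FALSE]),
       values = Re(eig$values[ord]))
}

# Toy scatter matrices: Sw is exactly singular, Sb has rank 1
Sw <- matrix(c(1, 1, 1, 1), 2, 2)
Sb <- tcrossprod(c(1, -1))
fit <- solve_discriminant(Sw, Sb, ncomp = 1)
dim(fit$rotation)
```

Without the `reg` term, `solve()` would fail on this singular `Sw`; with it, the leading direction lies along `(1, -1)`, the only direction in which the toy `Sb` has any spread.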

Details

The final result is returned as a discriminant_projector object from the multivarious package, which can be integrated into downstream analytical workflows (e.g. applying the projection to new data).
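As a rough picture of what applying the projection to new data involves under the default `center()` preprocessing: new rows are shifted by the training column means and then multiplied by `rotation`. The sketch below uses base R only, with a stand-in rotation matrix; all variable names are illustrative, and in practice the methods that `multivarious` provides for projector objects are presumably the intended route.

```r
set.seed(1)
X_train <- matrix(rnorm(20 * 4), nrow = 20)   # 20 training samples, d = 4
X_new   <- matrix(rnorm(3 * 4), nrow = 3)     # 3 new samples

# Stand-in for a fitted rotation (d x ncomp): random orthonormal columns,
# used only to get the shapes right; a real one would come from wMLDA()
rotation <- qr.Q(qr(matrix(rnorm(4 * 2), nrow = 4)))

# Centering is learned on the training data and reused unchanged for new data
mu <- colMeans(X_train)
scores_train <- sweep(X_train, 2, mu) %*% rotation
scores_new   <- sweep(X_new, 2, mu) %*% rotation

dim(scores_new)
```

Reusing the training means, instead of re-centering the new data on its own means, keeps new scores in the same coordinate system as the training scores.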

Value

A discriminant_projector object containing:

`rotation`	The projection matrix (d x ncomp) mapping original features into discriminant space.
`s`	The projected scores of the training data (n x ncomp).
`sdev`	Standard deviations of the scores.
`labels`	The class (label) information.
`preproc`	The preprocessing object.
`classes`	The string "wMLDA".

References

Xu, J. "A weighted linear discriminant analysis framework for multi-label feature extraction." Knowledge-Based Systems, Volume 131, 2017, Pages 1-13.

Examples

## Not run: 
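library(multivarious)
library(discursive)  # wMLDA() is provided by the discursive package
set.seed(123)
n <- 100
X <- matrix(rnorm(n * 5), nrow = n, ncol = 5)
# Suppose we have 3 labels:
Y <- matrix(0, nrow = n, ncol = 3)
# Give each sample between 1 and 3 randomly chosen labels:
for (i in seq_len(n)) {
  lab_count <- sample(1:3, 1)
  chosen <- sample(1:3, lab_count)
  Y[i, chosen] <- 1
}

# Entropy weighting; with q = 3 labels, ncomp can be at most q - 1 = 2
res <- wMLDA(X, Y, weight_method = "entropy", ncomp = 2)
str(res)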

## End(Not run)

bbuchsbaum/discursive documentation built on April 14, 2025, 4:57 p.m.