moda_full: Multimodal Oriented Discriminant Analysis (MODA)
Description

Implements the full Multimodal Oriented Discriminant Analysis (MODA) framework as derived in De la Torre & Kanade (2005). This code:

- Clusters each class into one or more clusters to capture multimodal structure.
- Approximates each cluster's covariance as \Sigma_i \approx \mathbf{U}_i \boldsymbol{\Lambda}_i \mathbf{U}_i^T + \sigma_i^2 \mathbf{I} to handle high-dimensional data (Section 6 of the paper).
- Constructs the majorization function L(\mathbf{B}) that upper-bounds the Kullback–Leibler divergence-based objective G(\mathbf{B}) (Equations (7)–(8)).
- Iterates using the gradient-based solution to minimize E_5(\mathbf{B}) (Equation (10)) with updates from Equation (11), i.e., normalized gradient descent with a line search.

It does not merely provide a starter approach; it faithfully implements the steps described in the paper, including the derivations in Equations (7)–(11).

Usage
moda_full(
X,
y,
k,
numClusters = 1,
pcaFirst = TRUE,
pcaVar = 0.95,
maxIter = 50,
tol = 1e-05,
clusterMethod = "kmeans",
B_init = "random",
verbose = FALSE,
lineSearchIter = 20,
B_init_sd = 0.01
)
Arguments

X: A numeric matrix of size d x n, with one sample per column (d features, n samples).

y: A vector (length n) of class labels, one per column of X.

k: Integer. Dimensionality of the target subspace (number of features to extract).

numClusters: Integer or vector/list specifying the number of clusters per class. If 1, the method reduces to (unimodal) ODA.

pcaFirst: Logical. If TRUE, run PCA first to reduce the dimension when d exceeds n.

pcaVar: Fraction of variance to keep if pcaFirst = TRUE.

maxIter: Maximum number of majorization iterations. Defaults to 50.

tol: Convergence tolerance on the relative change in the objective.

clusterMethod: Clustering method used to split each class; defaults to "kmeans".

B_init: Initialization scheme for \mathbf{B}; defaults to "random".

verbose: If TRUE, prints iteration progress.

lineSearchIter: Number of line search iterations for step size selection (default = 20).

B_init_sd: Standard deviation for the random initialization of \mathbf{B} if B_init = "random".
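As a quick usage sketch (not one of the package's own examples), the multimodal arguments above can be combined as follows; it assumes, per the numClusters description, that a per-class vector is accepted:

set.seed(1)
d <- 50; n <- 90
X <- matrix(rnorm(d * n), nrow = d, ncol = n)  # one sample per column
y <- rep(1:3, each = n / 3)                    # three balanced classes
res <- moda_full(X, y, k = 3,
                 numClusters = c(2, 3, 2),     # clusters per class (assumed vector form)
                 pcaFirst = TRUE, pcaVar = 0.95,
                 verbose = TRUE)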
Details

Key Steps:

- Clustering (Section 4): For each class, optionally split the samples into multiple clusters to model multimodality.
- Approximate Covariances (Section 6): For each cluster, approximate \Sigma_i by \mathbf{U}_i \boldsymbol{\Lambda}_i \mathbf{U}_i^T + \sigma_i^2 \mathbf{I} (a standalone sketch of this factorization follows this list).
- Majorization (Sections 5.1–5.2): Build L(\mathbf{B}) from G(\mathbf{B}) using Equation (7) and sum up to get Equation (8).
- Iterative Minimization of L(\mathbf{B}) \ge G(\mathbf{B}): the partial derivatives (Equation (9)) yield a system of linear equations, solved here by gradient-based updates (Equations (10)–(11)).
- High-Dimensional Data: When d \gg n, it is recommended to set pcaFirst = TRUE so that the dimension is reduced to at most n, avoiding rank deficiency and improving generalization.
- Classification after MODA: Once \mathbf{B} is learned, map a new sample \mathbf{x} to \mathbf{B}^T \mathbf{x} (plus PCA if used) and classify in that lower-dimensional space.
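The covariance step can be made concrete. Below is a minimal sketch (not the package's internal code) of the rank-q factorization via an eigendecomposition, with \sigma_i^2 taken as the mean of the discarded eigenvalues, a common probabilistic-PCA-style choice; the Section 6 derivation may differ in details.

# Approximate a cluster's covariance as U %*% Lambda %*% t(U) + sigma2 * I,
# keeping the q leading eigenvectors; sigma2 absorbs the residual spectrum.
approx_cov <- function(Xc, q) {           # Xc: d x n_c cluster, samples in columns
  S   <- cov(t(Xc))                       # d x d sample covariance
  eig <- eigen(S, symmetric = TRUE)       # eigenvalues in decreasing order
  stopifnot(q < length(eig$values))
  U      <- eig$vectors[, 1:q, drop = FALSE]
  sigma2 <- mean(eig$values[-(1:q)])      # residual (isotropic) variance
  Lambda <- diag(eig$values[1:q] - sigma2, q)  # kept modes then match S exactly
  list(U = U, Lambda = Lambda, sigma2 = sigma2,
       Sigma_hat = U %*% Lambda %*% t(U) + sigma2 * diag(nrow(S)))
}

For a cluster stored as columns of Xc, approx_cov(Xc, q = 3) returns the factors together with the reconstructed Sigma_hat.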
For further details, see De la Torre & Kanade (2005), "Multimodal Oriented Discriminant Analysis":

- Equations (7)–(11) for the majorization steps.
- Section 6 for the covariance factorization in high dimensions.
Value

A list with elements:

- B: A d' \times k matrix (or d \times k if no PCA) with the learned projection.
- objVals: The values of the objective G(\mathbf{B}) at each iteration.
- clusters: The cluster assignments (per class).
- pcaInfo: If PCA was applied, contains the PCA rotation U and mean.
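To make the "Classification after MODA" step concrete, here is a hedged sketch built only on the returned elements documented above; the pcaInfo component names ($U, $mean) and the nearest-class-mean rule are assumptions, not the package's prescribed classifier:

# Project a new sample x (length d) exactly as the training data was projected.
project <- function(x, res) {
  if (!is.null(res$pcaInfo)) {
    x <- t(res$pcaInfo$U) %*% (x - res$pcaInfo$mean)  # same PCA map as training
  }
  drop(t(res$B) %*% x)                                # B^T x, a length-k vector
}

# Assign the label of the nearest class mean in the projected space.
classify_nearest_mean <- function(x, res, X, y) {
  z <- project(x, res)
  Z <- apply(X, 2, project, res = res)      # projected training samples
  Z <- matrix(Z, nrow = length(z))          # keep matrix shape even when k = 1
  dists <- sapply(split(seq_along(y), y),   # squared distance to each class mean
                  function(idx) sum((rowMeans(Z[, idx, drop = FALSE]) - z)^2))
  names(which.min(dists))
}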
Equation reference:

- Equation (7): Inequality used to construct the majorization function.
- Equation (8): Definition of L(\mathbf{B}) that majorizes G(\mathbf{B}).
- Equation (9): Necessary condition for the minimum of L(\mathbf{B}).
- Equation (10): Definition of E_5(\mathbf{B}) to be minimized via gradient methods.
- Equation (11): Normalized gradient-descent update with step size \eta chosen to minimize E_5.
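For illustration only, here is a generic normalized gradient step in the spirit of Equation (11); f stands in for E_5(\mathbf{B}) and grad_f for its gradient (both derived in the package from Equations (9)–(10)), and picking \eta from a small candidate grid is a crude stand-in for the lineSearchIter-controlled line search:

# One normalized gradient step: move along -G/||G|| with the best grid eta.
normalized_gd_step <- function(B, f, grad_f, etas = 2^seq(-10, 2)) {
  G    <- grad_f(B)
  G    <- G / sqrt(sum(G^2))                       # unit-norm gradient direction
  vals <- sapply(etas, function(eta) f(B - eta * G))
  B - etas[which.min(vals)] * G                    # keep the step that most decreases f
}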
Examples

# Synthetic example (small scale):
set.seed(123)
d <- 20; n <- 40
X <- matrix(rnorm(d*n), nrow = d, ncol = n)
y <- rep(1:2, each = n/2)
res <- moda_full(X, y, k = 2, numClusters = 1, pcaFirst = FALSE, maxIter = 15, verbose = TRUE)
# Inspect the learned projection B
str(res)
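As a follow-up grounded in the Value section (here pcaFirst = FALSE, so res$B is d x k and applies to X directly):

# Project the training data into the learned 2-D subspace and plot it:
Z <- t(res$B) %*% X                     # k x n matrix of projected samples
plot(t(Z), col = y, pch = 19, xlab = "MODA dim 1", ylab = "MODA dim 2")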