nnmf: Nonnegative Matrix Factorization

nnmf {matter}    R Documentation

Nonnegative Matrix Factorization

Description

Nonnegative matrix factorization (NMF) decomposes a nonnegative data matrix into a matrix of basis variables and a matrix of activations (or coefficients). The factorization is approximate and may be less accurate than alternative methods such as PCA, but can greatly improve the interpretability of the reduced dimensions.
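
In standard notation (not specific to this implementation), NMF seeks nonnegative factors W and H such that

  X \approx W H, \qquad W \ge 0, \; H \ge 0,

where X is an N x P data matrix, W is N x k (basis variables), and H is k x P (activations), typically found by minimizing a divergence D(X \,\|\, W H) such as the squared Frobenius norm.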

Usage

# Alternating least squares
nnmf_als(x, k = 3L, niter = 100L, transpose = FALSE,
	eps = 1e-9, tol = 1e-5, verbose = NA, ...)

# Multiplicative updates
nnmf_mult(x, k = 3L, niter = 100L, cost = c("euclidean", "KL", "IS"),
	transpose = FALSE, eps = 1e-9, tol = 1e-5, verbose = NA, ...)

## S3 method for class 'nnmf'
predict(object, newdata, ...)

# Nonnegative double SVD
nndsvd(x, k = 3L, ...)

Arguments

x

A nonnegative matrix.

k

The number of NMF components to extract.

niter

The maximum number of iterations.

transpose

A logical value indicating whether x should be considered transposed or not. This can be useful if the input matrix is (P x N) instead of (N x P) and storing the transpose is expensive. This is not necessary for matter_mat and sparse_mat objects, but can be useful for large in-memory (P x N) matrices.

eps

A small regularization parameter to prevent singularities.

tol

The tolerance for convergence, as measured by the Frobenius norm of the differences between the W and H matrices in successive iterations.

verbose

Should progress be printed for each iteration?

cost

The cost function (i.e., error measure between the reconstructed matrix and original x) to optimize, where 'euclidean' is the Frobenius norm, 'KL' is the Kullback-Leibler divergence, and 'IS' is the Itakura-Saito divergence. See Details.

...

Additional options passed to irlba.

object

An object inheriting from nnmf.

newdata

An optional data matrix to use for the prediction.

Details

These functions implement nonnegative matrix factorization (NMF) using either alternating least squares as described by Berry et al. (2007) or multiplicative updates from Lee and Seung (2000) and further described by Burred (2014). The algorithms are initialized using nonnegative double singular value decomposition (NNDSVD) from Boutsidis and Gallopoulos (2008).
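
The NNDSVD initialization used by both algorithms is also exported as nndsvd() and can be called directly; a minimal sketch (the structure of its return value is not described here):

x <- matrix(runif(500), nrow=50, ncol=10)  # nonnegative data
init <- nndsvd(x, k=3)                     # SVD-based nonnegative starting factors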

The algorithm using multiplicative updates (nnmf_mult()) tends to be more stable but converges more slowly. The alternating least squares algorithm (nnmf_als()) tends to converge faster to more accurate results, but can be less numerically stable than the multiplicative updates algorithm.
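
As a rough illustration of this trade-off (using only the documented arguments and the iter component listed under Value), both algorithms can be run on the same data and their iteration counts compared:

x <- matrix(runif(500), nrow=50, ncol=10)     # nonnegative data
fit_als  <- nnmf_als(x, k=3)
fit_mult <- nnmf_mult(x, k=3, cost="euclidean")
c(als=fit_als$iter, mult=fit_mult$iter)       # ALS typically stops in fewer iterations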

Note that for nnmf_mult(), cost = "euclidean" is the only cost function that can handle out-of-memory matter_mat and sparse_mat matrices; for the other cost functions, x will be coerced to an in-memory matrix.
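
For reference, the standard forms of the three cost functions (as in Burred, 2014) are, up to constants and the eps regularization:

  Euclidean:  D(X \| WH) = \sum_{ij} \bigl( X_{ij} - (WH)_{ij} \bigr)^2

  KL:         D(X \| WH) = \sum_{ij} \Bigl( X_{ij} \log \tfrac{X_{ij}}{(WH)_{ij}} - X_{ij} + (WH)_{ij} \Bigr)

  IS:         D(X \| WH) = \sum_{ij} \Bigl( \tfrac{X_{ij}}{(WH)_{ij}} - \log \tfrac{X_{ij}}{(WH)_{ij}} - 1 \Bigr)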

Value

An object of class nnmf, with the following components:

  • activation: The (transposed) coefficient matrix (H).

  • x: The basis variable matrix (W).

  • iter: The number of iterations performed.
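
For example (assuming the layout above, with activation storing the transpose of H), the fitted approximation of x can be reconstructed from these components:

x <- matrix(runif(200), nrow=20, ncol=10)   # nonnegative data
fit <- nnmf_als(x, k=2)
xhat <- fit$x %*% t(fit$activation)         # W %*% H, approximates x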

Author(s)

Kylie A. Bemis

References

M. W. Berry, M. Browne, A. N. Langville, V. P. Pauca, R. J. Plemmons. “Algorithms and applications for approximate nonnegative matrix factorization.” Computational Statistics and Data Analysis, vol. 52, issue 1, pp. 155-173, Sept. 2007.

D. D. Lee and H. S. Seung. “Algorithms for non-negative matrix factorization.” Proceedings of the 13th International Conference on Neural Information Processing Systems (NIPS), pp. 535-541, Jan. 2000.

C. Boutsidis and E. Gallopoulos. “SVD based initialization: A head start for nonnegative matrix factorization.” Pattern Recognition, vol. 41, issue 4, pp. 1350-1362, Apr. 2008.

J. J. Burred. “Detailed derivation of multiplicative update rules for NMF.” Technical report, Paris, March 2014.

See Also

svd, prcomp

Examples

set.seed(1)

a <- matrix(sort(runif(500)), nrow=50, ncol=10)
b <- matrix(rev(sort(runif(500))), nrow=50, ncol=10)
x <- cbind(a, b)

mf <- nnmf_als(x, k=3)
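
# A hedged continuation of the example above: the multiplicative-update
# variant and the predict() method (it is assumed that newdata is
# projected onto the fitted basis)
mf2 <- nnmf_mult(x, k=3, cost="KL")

pred <- predict(mf, newdata=x)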
