nnlm: Non-negative linear model/regression (NNLM)


Description

Solving non-negative linear regression problem as

argmin_{β ≥ 0} L(y - xβ) + α_1 ||β||_2^2 + α_2 ∑_{i < j} β_{\cdot i}^T β_{\cdot j} + α_3 ||β||_1

where L is a loss function of either square error or Kullback-Leibler divergence.

Usage

nnlm(x, y, alpha = rep(0, 3), method = c("scd", "lee"),
  loss = c("mse", "mkl"), init = NULL, mask = NULL, check.x = TRUE,
  max.iter = 10000L, rel.tol = 1e-12, n.threads = 1L,
  show.warning = TRUE)

Arguments

x

Design matrix

y

Vector or matrix of response

alpha

A vector of up to 3 non-negative values, giving the [L2, angle, L1] regularization weights on beta (non-masked entries)

method

Iteration algorithm, either 'scd' for sequential coordinate-wise descent or 'lee' for Lee's multiplicative algorithm

loss

Loss function to use, either 'mse' for mean squared error or 'mkl' for mean KL-divergence. Note that if x or y contains negative values, one should always use 'mse'

init

Initial value of beta for the iteration. Either NULL (default) or a non-negative matrix of the same shape as beta

mask

Either NULL (default) or a logical matrix of the same shape as beta, indicating whether an entry should be fixed to its initial value (if init is specified) or to 0 (if init is not specified)

check.x

Whether to check the condition number of x to ensure a unique solution. Defaults to TRUE, but can be slow

max.iter

Maximum number of iterations

rel.tol

Stopping criterion: the relative change of the error between two successive iterations, computed as 2*|e2 - e1|/(e2 + e1). Specify a negative number to force exactly max.iter iterations, i.e., no early stopping

n.threads

An integer number of threads/CPUs to use. Defaults to 1 (no parallelization). Use 0 or a negative value to use all cores

show.warning

Whether to show warnings, if any. Defaults to TRUE

Details

The linear model is solved column by column, and the columns are processed in parallel. When y_{\cdot j} (the j-th column) contains missing values, only the complete entries are used to solve for β_{\cdot j}. Therefore, when no penalty is used, the number of complete entries in each column of y should be at least the number of columns of x.
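Because of this column-wise handling, nnlm can be called directly on a response matrix containing NAs, as long as each column keeps enough complete entries. A small sketch (the noise scale and the number of NAs are arbitrary choices for illustration):

```r
library(NNLM)

set.seed(123)
x <- matrix(runif(50*20), 50, 20)
beta <- matrix(rexp(20*2), 20, 2)
y <- x %*% beta + 0.1*matrix(runif(50*2), 50, 2)

# knock out a few entries of y; each column still has
# far more than ncol(x) = 20 complete entries
y[sample(length(y), 5)] <- NA

# only the complete entries of each column of y are used
fit <- nnlm(x, y)
```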

method = 'scd' is recommended, especially when the solution is expected to be sparse. Although both "mse" and "mkl" losses are supported for non-negative x and y, only "mse" is proper when either y or x contains negative values. Note that the "mkl" loss is much slower than "mse", which may matter when x and y are extremely large.

mask can be used for hard regularization, i.e., forcing entries to their initial values (if init is specified) or to 0 (if init is not specified). Internally, masking is achieved by skipping the masked entries during the element-wise iteration.
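For instance, a mask can pin selected coefficients to zero when no init is given. A sketch (the masked pattern is arbitrary, and the fitted matrix is assumed to be returned as fit$coefficients, per the Value section of this package's documentation):

```r
library(NNLM)

set.seed(1)
x <- matrix(runif(30*5), 30, 5)
y <- x %*% matrix(rexp(5*2), 5, 2)

# fix beta[1, 1] and beta[3, 2] to 0 (no init specified)
mask <- matrix(FALSE, 5, 2)
mask[1, 1] <- TRUE
mask[3, 2] <- TRUE

fit <- nnlm(x, y, mask = mask)
# the masked entries of fit$coefficients stay at 0
```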

Value

An object of class 'nnlm', which is a list with components

Author(s)

Eric Xihui Lin, xihuil.silence@gmail.com

References

Franc, V. C., Hlavac, V. C., Navara, M. (2005). Sequential Coordinate-Wise Algorithm for the Non-negative Least Squares Problem. Proc. Int'l Conf. Computer Analysis of Images and Patterns. Lecture Notes in Computer Science 3691. p. 407.

Lee, Daniel D., and H. Sebastian Seung. 1999. "Learning the Parts of Objects by Non-Negative Matrix Factorization." Nature 401: 788-91.

Pascual-Montano, Alberto, J.M. Carazo, Kieko Kochi, Dietrich Lehmann, and Roberto D. Pascual-Marqui. 2006. "Nonsmooth Nonnegative Matrix Factorization (NsNMF)." IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (3): 403-14.

Examples

# without negative values
x <- matrix(runif(50*20), 50, 20);
beta <- matrix(rexp(20*2), 20, 2);
y <- x %*% beta + 0.1*matrix(runif(50*2), 50, 2);
beta.hat <- nnlm(x, y, loss = 'mkl');

# with negative values
x2 <- 10*matrix(rnorm(50*20), 50, 20);
y2 <- x2 %*% beta + 0.2*matrix(rnorm(50*2), 50, 2);
beta.hat2 <- nnlm(x2, y2);  # 'mse' (the default loss) is required for negative values

NNLM documentation built on July 3, 2019, 1:03 a.m.