#' @title Fit Non-negative Matrix Factorization to Count Data
#'
#' @description Approximate the input matrix \code{X} by the
#' non-negative matrix factorization \code{tcrossprod(L,F)}, in which
#' the quality of the approximation is measured by a
#' \dQuote{divergence} criterion; equivalently, optimize the
#' likelihood under a Poisson model of the count data, \code{X}, in
#' which the Poisson rates are given by \code{tcrossprod(L,F)}.
#' Function \code{fit_poisson_nmf} runs a specified number of
#' coordinate-wise updates to fit the L and F matrices.
#'
#' @details In Poisson non-negative matrix factorization (Lee & Seung,
#' 2001), counts \eqn{x_{ij}} in the \eqn{n \times m} matrix, \eqn{X},
#' are modeled by the Poisson distribution: \deqn{x_{ij} \sim
#' \mathrm{Poisson}(\lambda_{ij}).} Each Poisson rate,
#' \eqn{\lambda_{ij}}, is a linear combination of parameters
#' \eqn{f_{jk} \geq 0, l_{ik} \geq 0} to be fitted to the data:
#' \deqn{\lambda_{ij} = \sum_{k=1}^K l_{ik} f_{jk},} in which \eqn{K}
#' is a user-specified tuning parameter specifying the rank of the
#' matrix factorization. Function \code{fit_poisson_nmf} computes
#' maximum-likelihood estimates (MLEs) of the parameters. For
#' additional mathematical background, and an explanation of how
#' Poisson NMF is connected to topic modeling, see the vignette:
#' \code{vignette(topic = "relationship",package = "fastTopics")}.
#'
#' Using this function requires some care; only minimal argument
#' checking is performed, and error messages may not be helpful.
#'
#' The EM and multiplicative updates are simple and fast, but can be
#' slow to converge to a stationary point. When \code{control$numiter
#' = 1}, the EM and multiplicative updates are mathematically
#' equivalent to the multiplicative updates, and therefore share the
#' same convergence properties. However, the implementation of the EM
#' updates is quite different; in particular, the EM updates are more
#' suitable for sparse counts matrices. The implementation of the
#' multiplicative updates is adapted from the MATLAB code by Daichi
#' Kitamura \url{http://d-kitamura.net}.
#'
#' Since the multiplicative updates are implemented using standard
#' matrix operations, the speed is heavily dependent on the
#' BLAS/LAPACK numerical libraries used. In particular, using
#' optimized implementations such as OpenBLAS or Intel MKL can result
#' in much improved performance of the multiplcative updates.
#'
#' The cyclic co-ordinate descent (CCD) and sequential co-ordinate
#' descent (SCD) updates adopt the same optimization strategy, but
#' differ in the implementation details. In practice, we have found
#' that the CCD and SCD updates arrive at the same solution when
#' initialized \dQuote{sufficiently close} to a stationary point. The
#' CCD implementation is adapted from the C++ code developed by
#' Cho-Jui Hsieh and Inderjit Dhillon, which is available for download
#' at \url{https://www.cs.utexas.edu/~cjhsieh/nmf/}. The SCD
#' implementation is based on version 0.4-3 of the \sQuote{NNLM}
#' package.
#'
#' An additional re-scaling step is performed after each update to
#' promote numerical stability.
#'
#' We use three measures of progress for the model fitting: (1)
#' improvement in the log-likelihood (or deviance), (2) change in the
#' model parameters, and (3) the residuals of the Karush-Kuhn-Tucker
#' (KKT) first-order conditions. As the iterates approach a stationary
#' point of the loss function, the change in the model parameters
#' should be small, and the residuals of the KKT system should vanish.
#' Use \code{\link{plot_progress}} to plot the improvement in the
#' solution over time.
#'
#' See \code{\link{fit_topic_model}} for additional guidance on model
#' fitting, particularly for large or complex data sets.
#'
#' The \code{control} argument is a list in which any of the
#' following named components will override the default optimization
#' algorithm settings (as they are defined by
#' \code{fit_poisson_nmf_control_default}):
#'
#' \describe{
#'
#' \item{\code{numiter}}{Number of \dQuote{inner loop} iterations to
#' run when performing and update of the factors or loadings. This
#' must be set to 1 for \code{method = "mu"} and \code{method =
#' "ccd"}.}
#'
#' \item{\code{nc}}{Number of RcppParallel threads to use for the
#' updates. When \code{nc} is \code{NA}, the number of threads is
#' determined by calling
#' \code{\link[RcppParallel]{defaultNumThreads}}. This setting is
#' ignored for the multiplicative upates (\code{method = "mu"}).
#' Please note that best multithreading performance is typically
#' achieved when the number of BLAS threads is set to 1, but
#' controlling this in R is not always possible; see the
#' \dQuote{\code{nc.blas}} control option for more information.}
#'
#' \item{\code{nc.blas}}{Number of threads used in the numerical
#' linear algebra library (e.g., OpenBLAS), if available. Typically
#' setting this to 1 (i.e., no multithreading) results in best
#' performance. Note that setting the number of BLAS threads relies on
#' the RhpcBLASctl package, and may not work for all multithreaded
#' BLAS libraries.}
#'
#' \item{\code{min.delta.loglik}}{Stop performing updates if the
#' difference in the Poisson NMF log-likelihood between two successive
#' updates is less than \code{min.delta.loglik}. This should not be
#' kept at zero when \code{control$extrapolate = TRUE} because the
#' extrapolated updates are expected to occasionally keep the
#' likelihood unchanged. Ignored if \code{min.delta.loglik < 0}.}
#'
#' \item{\code{min.res}}{Stop performing updates if the maximum KKT
#' residual is less than \code{min.res}. Ignored if \code{min.res < 0}.}
#'
#' \item{\code{minval}}{A small, positive constant used to safeguard
#' the multiplicative updates. The safeguarded updates are implemented
#' as \code{F <- pmax(F1,minval)} and \code{L <- pmax(L1,minval)},
#' where \code{F1} and \code{L1} are the factors and loadings matrices
#' obtained by applying an update. This is motivated by Theorem 1 of
#' Gillis & Glineur (2012). Setting \code{minval = 0} is allowed, but
#' some methods are not guaranteed to converge to a stationary point
#' without this safeguard, and a warning will be given in this case.}
#'
#' \item{\code{extrapolate}}{When \code{extrapolate = TRUE}, the
#' extrapolation scheme of Ang & Gillis (2019) is used.}
#'
#' \item{\code{extrapolate.reset}}{To promote better numerical
#' stability of the extrapolated updates, they are \dQuote{reset}
#' every so often. This parameter determines the number of iterations
#' to wait before resetting.}
#'
#' \item{\code{beta.increase}}{When the extrapolated update improves
#' the solution, scale the extrapolation parameter by this amount.}
#'
#' \item{\code{beta.reduce}}{When the extrapolaaed update does not
#' improve the solution, scale the extrapolation parameter by this
#' amount.}
#'
#' \item{\code{betamax.increase}}{When the extrapolated update
#' improves the solution, scale the extrapolation parameter by this
#' amount.}
#'
#' \item{\code{eps}}{A small, non-negative number that is added to the
#' terms inside the logarithms to sidestep computing logarithms of
#' zero. This prevents numerical problems at the cost of introducing a
#' small inaccuracy in the solution. Increasing this number may lead
#' to faster convergence but possibly a less accurate solution.}
#'
#' \item{\code{zero.threshold}}{A small, non-negative number used to
#' determine which entries of the solution are exactly zero. Any
#' entries that are less than or equal to \code{zero.threshold} are
#' considered to be exactly zero.}}
#'
#' An additional setting, \code{control$init.numiter}, controls the
#' number of sequential co-ordinate descent (SCD) updates that are
#' performed to initialize the loadings matrix when \code{init.method
#' = "topicscore"}.
#'
#' @param X The n x m matrix of counts; all entries of X should be
#' non-negative. It can be a sparse matrix (class \code{"dgCMatrix"})
#' or dense matrix (class \code{"matrix"}), with some exceptions (see
#' \sQuote{Details}).
#'
#' @param k An integer 2 or greater giving the matrix rank. This
#' argument should only be specified if the initial fit (\code{fit0}
#' or \code{F, L}) is not provided.
#'
#' @param fit0 The initial model fit. It should be an object of class
#' \dQuote{poisson_nmf_fit}, such as an output from
#' \code{init_poisson_nmf}, or from a previous call to
#' \code{fit_poisson_nmf}.
#'
#' @param numiter The maximum number of updates of the factors and
#' loadings to perform.
#'
#' @param update.factors A numeric vector specifying which factors
#' (rows of \code{F}) to update. By default, all factors are
#' updated. Note that the rows that are not updated may still change
#' by rescaling. When \code{NULL}, all factors are fixed. This option
#' is only implemented for \code{method = "em"} and \code{method =
#' "scd"}. If another method is selected, the default setting of
#' \code{update.factors} must be used.
#'
#' @param update.loadings A numeric vector specifying which loadings
#' (rows of \code{L}) to update. By default, all loadings are
#' updated. Note that the rows that are not updated may still change
#' by rescaling. When \code{NULL}, all loadings are fixed. This option
#' is only implemented for \code{method = "em"} and \code{method =
#' "scd"}. If another method is selected, the default setting of
#' \code{update.loadings} must be used.
#'
#' @param method The method to use for updating the factors and
#' loadings. Four methods are implemented: multiplicative updates,
#' \code{method = "mu"}; expectation maximization (EM), \code{method =
#' "em"}; sequential co-ordinate descent (SCD), \code{method = "scd"};
#' and cyclic co-ordinate descent (CCD), \code{method = "ccd"}. See
#' \sQuote{Details} for a detailed description of these methods.
#'
#' @param init.method The method used to initialize the factors and
#' loadings. When \code{init.method = "random"}, the factors and
#' loadings are initialized uniformly at random; when
#' \code{init.method = "topicscore"}, the factors are initialized
#' using the (very fast) Topic SCORE algorithm (Ke & Wang, 2017), and
#' the loadings are initialized by running a small number of SCD
#' updates. This input argument is ignored if initial estimates of the
#' factors and loadings are already provided via input \code{fit0}, or
#' inputs \code{F} and \code{L}.
#'
#' @param control A list of parameters controlling the behaviour of
#' the optimization algorithm (and the Topic SCORE algorithm if it
#' is used to initialize the model parameters). See \sQuote{Details}.
#'
#' @param verbose When \code{verbose = "detailed"}, information about
#' the algorithm's progress is printed to the console at each
#' iteration; when \code{verbose = "progressbar"}, a progress bar is
#' shown; and when \code{verbose = "none"}, no progress information is
#' printed. See the description of the \dQuote{progress} return value
#' for an explanation of \code{verbose = "detailed"} console
#' output. (Note that some columns of the \dQuote{progress} data frame
#' are not shown in the console output.)
#'
#' @return \code{init_poisson_nmf} and \code{fit_poisson_nmf} both
#' return an object capturing the optimization algorithm state (for
#' \code{init_poisson_nmf}, this is the initial state). It is a list
#' with the following elements:
#'
#' \item{F}{A matrix containing the current best estimates of the
#' factors.}
#'
#' \item{L}{A matrix containing the current best estimates of the
#' loadings.}
#'
#' \item{Fn}{A matrix containing the non-extrapolated factor estimates.
#' If extrapolation is not used, \code{Fn} and \code{F} will be the
#' same.}
#'
#' \item{Ln}{A matrix containing the non-extrapolated estimates of the
#' loadings. If extrapolation is not used, \code{Ln} and \code{L} will
#' be the same.}
#'
#' \item{Fy}{A matrix containing the extrapolated factor estimates. If
#' the extrapolation scheme is not used, \code{Fy} and \code{F} will
#' be the same.}
#'
#' \item{Ly}{A matrix containing the extrapolated estimates of the
#' loadings. If extrapolation is not used, \code{Ly} and \code{L} will
#' be the same.}
#'
#' \item{loss}{Value of the objective (\dQuote{loss}) function
#' computed at the current best estimates of the factors and
#' loadings.}
#'
#' \item{loss.fnly}{Value of the objective (\dQuote{loss}) function
#' computed at the extrapolated solution for the loadings (\code{Ly})
#' and the non-extrapolated solution for the factors (\code{Fn}). This
#' is used internally to implement the extrapolated updates.}
#'
#' \item{iter}{The number of the most recently completed iteration.}
#'
#' \item{beta}{The extrapolation parameter, \eqn{beta} in Algorithm 3
#' of Ang & Gillis (2019).}
#'
#' \item{betamax}{Upper bound on the extrapolation parameter. This is
#' \eqn{\bar{\gamma}} in Algorithm 3 of Ang & Gillis (2019).}
#'
#' \item{beta0}{The setting of the extrapolation parameter at the
#' last iteration that improved the solution.}
#'
#' \item{progress}{A data frame containing detailed information about
#' the algorithm's progress. The data frame should have at most
#' \code{numiter}
#' rows. The columns of the data frame are: \dQuote{iter}, the
#' iteration number; \dQuote{loglik}, the Poisson NMF log-likelihood
#' at the current best factor and loading estimates;
#' \dQuote{loglik.multinom}, the multinomial topic model
#' log-likelihood at the current best factor and loading estimates;
#' \dQuote{dev}, the deviance at the current best factor and loading
#' estimates; \dQuote{res}, the maximum residual of the
#' Karush-Kuhn-Tucker (KKT) first-order optimality conditions at the
#' current best factor and loading estimates; \dQuote{delta.f}, the
#' largest change in the factors matrix; \dQuote{delta.l}, the largest
#' change in the loadings matrix; \dQuote{nonzeros.f}, the proportion
#' of entries in the factors matrix that are nonzero;
#' \dQuote{nonzeros.l}, the proportion of entries in the loadings
#' matrix that are nonzero; \dQuote{extrapolate}, which is 1 if
#' extrapolation is used, otherwise it is 0; \dQuote{beta}, the
#' setting of the extrapolation parameter; \dQuote{betamax}, the
#' setting of the extrapolation parameter upper bound; and
#' \dQuote{timing}, the elapsed time in seconds (recorded using
#' \code{\link{proc.time}}).}
#'
#' @references
#' Ang, A. and Gillis, N. (2019). Accelerating nonnegative matrix
#' factorization algorithms using extrapolation. \emph{Neural
#' Computation} \bold{31}, 417–439.
#'
#' Cichocki, A., Cruces, S. and Amari, S. (2011). Generalized
#' alpha-beta divergences and their application to robust nonnegative
#' matrix factorization. \emph{Entropy} \bold{13}, 134–170.
#'
#' Gillis, N. and Glineur, F. (2012). Accelerated multiplicative
#' updates and hierarchical ALS algorithms for nonnegative matrix
#' factorization. \emph{Neural Computation} \code{24}, 1085–1105.
#'
#' Hsieh, C.-J. and Dhillon, I. (2011). Fast coordinate descent
#' methods with variable selection for non-negative matrix
#' factorization. In \emph{Proceedings of the 17th ACM SIGKDD
#' international conference on Knowledge discovery and data mining},
#' p. 1064-1072
#'
#' Lee, D. D. and Seung, H. S. (2001). Algorithms for non-negative
#' matrix factorization. In \emph{Advances in Neural Information
#' Processing Systems} \bold{13}, 556–562.
#'
#' Lin, X. and Boutros, P. C. (2018). Optimization and expansion of
#' non-negative matrix factorization. \emph{BMC Bioinformatics}
#' \bold{21}, 7.
#'
#' Ke, Z. & Wang, M. (2017). A new SVD approach to optimal topic
#' estimation. \emph{arXiv} \url{https://arxiv.org/abs/1704.07016}
#'
#' @seealso \code{\link{fit_topic_model}}, \code{\link{plot_progress}}
#'
#' @examples
#' # Simulate a (sparse) 80 x 100 counts matrix.
#' library(Matrix)
#' set.seed(1)
#' X <- simulate_count_data(80,100,k = 3,sparse = TRUE)$X
#'
#' # Remove columns (words) that do not appear in any row (document).
#' X <- X[,colSums(X > 0) > 0]
#'
#' # Run 10 EM updates to find a good initialization.
#' fit0 <- fit_poisson_nmf(X,k = 3,numiter = 10,method = "em")
#'
#' # Fit the Poisson NMF model by running 50 EM updates.
#' fit_em <- fit_poisson_nmf(X,fit0 = fit0,numiter = 50,method = "em")
#'
#' # Fit the Poisson NMF model by running 50 extrapolated SCD updates.
#' fit_scd <- fit_poisson_nmf(X,fit0 = fit0,numiter = 50,method = "scd",
#' control = list(extrapolate = TRUE))
#'
#' # Compare the two fits.
#' fits <- list(em = fit_em,scd = fit_scd)
#' compare_fits(fits)
#' plot_progress(fits,y = "loglik")
#' plot_progress(fits,y = "res")
#'
#' # Recover the topic model. After this step, the L matrix contains the
#' # mixture proportions ("loadings"), and the F matrix contains the
#' # word frequencies ("factors").
#' fit_multinom <- poisson2multinom(fit_scd)
#'
#' @useDynLib fastTopics
#'
#' @importFrom utils modifyList
#' @importFrom RhpcBLASctl blas_set_num_threads
#' @importFrom RhpcBLASctl blas_get_num_procs
#'
#' @export
#'
fit_poisson_nmf <- function (X, k, fit0, numiter = 100,
update.factors = seq(1,ncol(X)),
update.loadings = seq(1,nrow(X)),
method = c("scd","em","mu","ccd"),
init.method = c("topicscore","random"),
control = list(),
verbose = c("progressbar","detailed","none")) {
# CHECK & PROCESS INPUTS
# ----------------------
# Check and process input argument "X".
verify.count.matrix(X)
if (is.matrix(X) & length(X) > 1e4 & mean(X > 0) < 0.1 & any(method != "mu"))
message(paste("Input matrix \"X\" has less than 10% nonzero entries;",
"consider converting \"X\" to a sparse matrix to reduce",
"computational effort"))
if (is.matrix(X) & is.integer(X))
storage.mode(X) <- "double"
# Get the number of rows (n) and columns (m) of the data matrix,
n <- nrow(X)
m <- ncol(X)
# Check input argument "numiter".
if (any(numiter < 1))
stop("Input argument \"numiter\" must be 1 or greater")
# Process input arguments "update.factors" and "update.loadings".
update.factors <- sort(update.factors)
update.loadings <- sort(update.loadings)
if (length(update.factors) == 0 & length(update.loadings) == 0)
stop("None of the factors or loadings have been selected for updating")
# Check and process input arguments "method" and "init.method".
method <- match.arg(method)
init.method <- match.arg(init.method)
if (method == "mu" & is.sparse.matrix(X)) {
warning("method = \"mu\" is not implemented for sparse counts matrices; ",
"attempting to convert \"X\" to a dense matrix")
X <- as.matrix(X)
}
if (!(length(update.factors) == m & length(update.loadings) == n))
if (method == "mu" | method == "ccd")
stop("All factors and loadings must be updated for method = \"mu\" ",
"and method = \"ccd\"")
# Check and process input argument "verbose".
verbose <- match.arg(verbose)
# Check and process the optimization settings.
control <- modifyList(fit_poisson_nmf_control_default(),
control,keep.null = TRUE)
if (any(control$minval == 0))
warning("Updates may not converge when minval = 0")
if (control$numiter > 1 & (method == "mu" | method == "ccd")) {
warning("multiplicative and CCD updates do not allow ",
"control$numiter > 1; setting control$numiter = 1")
control$numiter <- 1
}
if ((length(update.factors) == 0 | length(update.loadings) == 0) &
control$extrapolate)
stop("control$extrapolate cannot be TRUE when all factors or loadings ",
"are fixed")
control$nc <- initialize.multithreading(control$nc,verbose != "none")
ncb <- blas_get_num_procs()
blas_set_num_threads(control$nc.blas)
# Only one of "k" and "fit0" should be provided. If argument "k" is
# given, generate a random initialization of the factors and
# loadings.
if (!(missing(k) & !missing(fit0) | (!missing(k) & missing(fit0))))
stop("Provide a rank, \"k\", or an initialization, \"fit0\", but not both")
if (missing(fit0))
fit0 <- init_poisson_nmf(X,k = k,init.method = init.method,
control = control,
verbose = ifelse(verbose == "none",
"none","detailed"))
# Check input argument "fit0".
if (!inherits(fit0,"poisson_nmf_fit"))
stop("Input argument \"fit0\" should be an object of class ",
"\"poisson_nmf_fit\", such as an output of init_poisson_nmf")
verify.fit.and.count.matrix(X,fit0)
fit <- fit0
k <- ncol(fit$F)
# Give an overview of the optimization settings.
if (verbose != "none") {
cat(sprintf("Fitting rank-%d Poisson NMF to %d x %d %s matrix.\n",k,n,m,
ifelse(is.matrix(X),"dense","sparse")))
if (method == "mu")
method.text <- "multiplicative"
else if (method == "em")
method.text <- "EM"
else if (method == "scd")
method.text <- "SCD"
else if (method == "ccd")
method.text <- "CCD"
cat(sprintf("Running at most %d %s updates, %s extrapolation ",
numiter,method.text,
ifelse(control$extrapolate,"with","without")))
cat("(fastTopics 0.7-24).\n")
}
# INITIALIZE ESTIMATES
# --------------------
# Re-scale the initial estimates of the factors and loadings so that
# they are on same scale on average. This is intended to improve
# numerical stability of the updates.
fit <- rescale.fit(fit)
# Initialize the estimates of the factors and loadings. To prevent
# the updates from getting "stuck", force the initial estimates to
# be positive.
fit <- safeguard.fit(fit,control$minval)
# RUN UPDATES
# -----------
if (verbose == "detailed")
cat("iter loglik(PoisNMF) loglik(multinom) res(KKT) |F-F'| |L-L'|",
"nz(F) nz(L) beta\n")
fit <- fit_poisson_nmf_main_loop(X,fit,numiter,update.factors,
update.loadings,method,control,
verbose)
# Restore the BLAS settings.
blas_set_num_threads(ncb)
# Output the updated "fit".
fit$progress <- rbind(fit0$progress,fit$progress)
dimnames(fit$F) <- dimnames(fit0$F)
dimnames(fit$L) <- dimnames(fit0$L)
dimnames(fit$Fn) <- dimnames(fit0$Fn)
dimnames(fit$Ln) <- dimnames(fit0$Ln)
dimnames(fit$Fy) <- dimnames(fit0$Fy)
dimnames(fit$Ly) <- dimnames(fit0$Ly)
class(fit) <- c("poisson_nmf_fit","list")
return(fit)
}
# This implements the core part of fit_poisson_nmf.
#
#' @importFrom progress progress_bar
fit_poisson_nmf_main_loop <- function (X, fit, numiter, update.factors,
update.loadings, method, control,
verbose) {
if (verbose == "progressbar")
pb <- progress_bar$new(total = numiter)
# Pre-compute quantities and set up data structures used in the
# loop below.
loglik.const <- sum(loglik_poisson_const(X))
dev.const <- sum(deviance_poisson_const(X))
progress <- as.data.frame(matrix(0,numiter,13))
names(progress) <- c("iter","loglik","loglik.multinom","dev","res",
"delta.f","delta.l","nonzeros.f","nonzeros.l",
"extrapolate","beta","betamax","timing")
# Iterate the updates of the factors and loadings.
converged <- FALSE
for (i in 1:numiter) {
fit0 <- fit
t1 <- proc.time()
if (verbose == "progressbar")
pb$tick()
# Update the factors and loadings.
if (control$extrapolate &
fit$beta > 0 &
i %% control$extrapolate.reset != 0) {
# Perform an "extrapolated" update of the factors and loadings.
extrapolate <- TRUE
fit <- update_poisson_nmf_extrapolated(X,fit,update.factors,
update.loadings,method,
control)
} else {
# Perform a basic coordinate-wise update of the factors and
# loadings.
extrapolate <- FALSE
fit <- update_poisson_nmf(X,fit,update.factors,update.loadings,
method,control)
}
t2 <- proc.time()
# Update the iteration number.
fit$iter <- fit$iter + 1
loglik.diff <- fit0$loss - fit$loss
# Update the "progress" data frame with the log-likelihood,
# deviance, and other quantities, and report the algorithm's
# progress to the console if requested. In all cases, the "current
# best" estimates of the factors and loadings are used to report
# progress.
progress[i,"iter"] <- fit$iter
progress[i,"loglik"] <- loglik.const - fit$loss
progress[i,"loglik.multinom"] <-
loglik.const - fit$loss - sum(loglik_size_factors(X,fit$F,fit$L))
progress[i,"dev"] <- dev.const + 2*fit$loss
res <- with(poisson_nmf_kkt(X,fit$F,fit$L),
max(abs(rbind(F[update.factors,],
L[update.loadings,]))))
progress[i,"res"] <- res
progress[i,"delta.f"] <- max(abs(fit$F - fit0$F))
progress[i,"delta.l"] <- max(abs(fit$L - fit0$L))
progress[i,"beta"] <- fit$beta
progress[i,"betamax"] <- fit$betamax
progress[i,"timing"] <- t2["elapsed"] - t1["elapsed"]
progress[i,"nonzeros.f"] <- mean(fit$F > control$zero.threshold)
progress[i,"nonzeros.l"] <- mean(fit$L > control$zero.threshold)
progress[i,"extrapolate"] <- extrapolate
if (verbose == "detailed")
cat(sprintf("%4d %+0.8e %+0.9e %0.2e %0.1e %0.1e %0.3f %0.3f %0.2f\n",
fit$iter,progress[i,"loglik"],progress[i,"loglik.multinom"],
progress[i,"res"],progress[i,"delta.f"],
progress[i,"delta.l"],progress[i,"nonzeros.f"],
progress[i,"nonzeros.l"],extrapolate * progress[i,"beta"]))
if (control$min.res >= 0 & res < control$min.res) {
if (verbose == "progressbar")
pb$terminate()
cat(sprintf("Stopping condition is met at iteration %d: ",i))
cat(sprintf("max. KKT residual < %0.2e\n",control$min.res))
converged <- TRUE
break
} else if (control$min.delta.loglik >= 0 &
loglik.diff < control$min.delta.loglik) {
if (verbose == "progressbar")
pb$terminate()
cat(sprintf("Stopping condition is met at iteration %d: ",i))
cat(sprintf("change in NMF loglik < %0.2e\n",
control$min.delta.loglik))
converged <- TRUE
break
}
}
# Print a warning if one or more of the stopping criteria were not
# satisfied.
if (!converged & with(control,min.res >= 0 | min.delta.loglik >= 0))
warning(sprintf(paste("One or both stopping conditions were not met",
"within %d iterations"),numiter))
# Output the updated "fit".
progress <- progress[1:i,]
fit$progress <- progress
return(fit)
}
# This implements a single (non-extrapolated) update of the factors
# and loadings.
#
# The output is the updated "fit". Because no extrapolation scheme is
# used, the extrapolated and non-extrapolated estimates are the same
# in the return value; that is, Fy and Fn are the same as F, and Ly
# and Ln are the same as L. The value of the loss function ("loss",
# "loss.fnly") is also updated.
#
# Note that "update.factors" and "update.loadings" is ignored for
# method = "mu" and method = "ccd".
update_poisson_nmf <- function (X, fit, update.factors, update.loadings,
method, control) {
# Update the loadings ("activations"). The factors are forced to be
# non-negative, or positive; the latter can promote better
# convergence for some algorithms.
if (length(update.loadings) > 0) {
fit$L <- update_loadings_poisson_nmf(X,fit$F,fit$L,update.loadings,
method,control)
fit$L <- pmax(fit$L,control$minval)
}
# Update the factors ("basis vectors"). The loadings are forced to
# be non-negative, or positive; the latter can promote better
# convergence for some algorithms.
if (length(update.factors) > 0) {
fit$F <- update_factors_poisson_nmf(X,fit$F,fit$L,update.factors,
method,control)
fit$F <- pmax(fit$F,control$minval)
}
# The extrapolated and "current best" estimates are the same as the
# non-extrapolated estimates.
fit$Fy <- fit$F
fit$Ly <- fit$L
fit$Fn <- fit$F
fit$Ln <- fit$L
# Re-scale the factors and loadings.
fit <- rescale.fit(fit)
# Compute the value of the objective ("loss") function at the updated
# estimates.
fit$loss <- sum(cost(X,fit$L,t(fit$F),control$eps))
fit$loss.fnly <- fit$loss
# Output the updated "fit".
return(fit)
}
# This implements an extrapolated update of the factors and loadings.
#
# The output is the updated "fit". The updated parts of the "fit" are:
#
# F "best current" factor estimates
# Fn non-extrapolated factor estimates
# Fy extrapolated factor estimates
# L "best current" loadings estimates
# Ln non-extrapolated loadings estimates
# Ly extrapolated loadings estimates
# loss value of the objective at the "best current" estimates
# loss.fnly value of the objective at (Fn, Ly)
# beta extrapolation parameter
# betamax upper bound on the extrapolation parameter
# beta0 extrapolation parameter setting from last improvement
#
# Note that "update.factors" and "update.loadings" is ignored for
# method = "mu" and method = "ccd".
update_poisson_nmf_extrapolated <- function (X, fit, update.factors,
update.loadings, method,
control) {
# Store the value of the objective (loss) function at the current
# iterate (Fn, Ly).
loss0.fnly <- fit$loss.fnly
# Compute the extrapolated update for the loadings ("activations").
# Note that when beta is zero, Ly = Ln.
Ln <- update_loadings_poisson_nmf(X,fit$Fy,fit$Ly,update.loadings,
method,control)
Ln <- pmax(Ln,control$minval)
fit$Ly <- pmax(Ln + fit$beta*(Ln - fit$Ln),control$minval)
# Compute the extrapolated update for the factors ("basis vectors").
# Note that when beta = 0, Fy = Fn.
Fn <- update_factors_poisson_nmf(X,fit$Fy,fit$Ly,update.factors,
method,control)
Fn <- pmax(Fn,control$minval)
fit$Fy <- pmax(Fn + fit$beta*(Fn - fit$Fn),control$minval)
# Compute the value of the objective (loss) function at the
# extrapolated solution for the loadings (Ly) and the
# non-extrapolated solution for the factors (Fn).
fit$loss.fnly <- sum(cost(X,fit$Ly,t(Fn),control$eps))
# Update the extrapolation parameters following Algorithm 3 of
# Ang & Gillis (2019).
if (fit$loss.fnly >= loss0.fnly) {
# The solution did not improve, so restart the extrapolation
# scheme.
fit$Fy <- fit$Fn
fit$Ly <- fit$Ln
fit$betamax <- fit$beta0
fit$beta <- control$beta.reduce * fit$beta
} else {
# The solution improved; retain the basic co-ordinate ascent
# update as well.
fit$Fn <- Fn
fit$Ln <- Ln
fit$beta <- min(fit$betamax,control$beta.increase * fit$beta)
fit$beta0 <- fit$beta
fit$betamax <- min(0.99,control$betamax.increase * fit$betamax)
}
# If the solution improves the "current best" estimate, update the
# current best estimate using the non-extrapolated estimates of the
# factors (Fn) and the extrapolated estimates of the loadings (Ly).
if (fit$loss.fnly < fit$loss) {
fit$F <- Fn
fit$L <- fit$Ly
fit$loss <- fit$loss.fnly
}
# Re-scale the factors and loadings.
fit <- rescale.fit(fit)
# Output the updated "fit".
return(fit)
}
# Implements a single update of the factors matrix. Note that input
# argument "j", specifying which columns of F to update, is ignored for
# method = "mu" and method = "ccd".
update_factors_poisson_nmf <- function (X, F, L, j, method, control) {
numiter <- control$numiter
nc <- control$nc
eps <- control$eps
if (method == "mu")
F <- t(betanmf_update_factors(X,L,t(F)))
else if (method == "em")
F <- pnmfem_update_factors(X,F,L,j,numiter,nc)
else if (method == "ccd")
F <- t(ccd_update_factors(X,L,t(F),nc,eps))
else if (method == "scd")
F <- t(scd_update_factors(X,L,t(F),j,numiter,nc,eps))
return(F)
}
# Implements a single update of the loadings matrix. Note that input
# argument "i", specifying which rows of L to update, is ignored for
# method = "mu" and method = "ccd".
update_loadings_poisson_nmf <- function (X, F, L, i, method, control) {
numiter <- control$numiter
nc <- control$nc
eps <- control$eps
if (method == "mu")
L <- betanmf_update_loadings(X,L,t(F))
else if (method == "em")
L <- pnmfem_update_loadings(X,F,L,i,numiter,nc)
else if (method == "ccd")
L <- ccd_update_loadings(X,L,t(F),nc,eps)
else if (method == "scd")
L <- scd_update_loadings(X,L,t(F),i,numiter,nc,eps)
return(L)
}
# Rescale the extrapolated, non-extrapolated, and "current best"
# factors and loadings.
rescale.fit <- function (fit) {
# Rescale the "current best" factors and loadings.
out <- rescale.factors(fit$F,fit$L)
fit$F <- out$F
fit$L <- out$L
# Rescale the non-extrapolated factors and loadings.
out <- rescale.factors(fit$Fn,fit$Ln)
fit$Fn <- out$F
fit$Ln <- out$L
# Rescale the extrapolated factors and loadings.
out <- rescale.factors(fit$Fy,fit$Ly)
fit$Fy <- out$F
fit$Ly <- out$L
return(fit)
}
# Force the factors and loadings to be positive.
safeguard.fit <- function (fit, minval) {
fit$F <- pmax(fit$F,minval)
fit$L <- pmax(fit$L,minval)
fit$Fn <- pmax(fit$Fn,minval)
fit$Ln <- pmax(fit$Ln,minval)
fit$Fy <- pmax(fit$Fy,minval)
fit$Ly <- pmax(fit$Ly,minval)
return(fit)
}
#' @rdname fit_poisson_nmf
#'
#' @export
#'
fit_poisson_nmf_control_default <- function()
list(numiter = 4,
init.numiter = 10,
min.delta.loglik = -1,
min.res = -1,
minval = 1e-10,
eps = 1e-8,
zero.threshold = 1e-6,
nc = 1,
nc.blas = 1,
extrapolate = FALSE,
extrapolate.reset = 20,
beta.increase = 1.1,
beta.reduce = 0.75,
betamax.increase = 1.05)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.