R/fit_poisson_nmf.R

Defines functions fit_poisson_nmf_control_default

Documented in fit_poisson_nmf_control_default

#' @title Fit Non-negative Matrix Factorization to Count Data
#'
#' @description Approximate the input matrix \code{X} by the
#'   non-negative matrix factorization \code{tcrossprod(L,F)}, in which
#'   the quality of the approximation is measured by a
#'   \dQuote{divergence} criterion; equivalently, optimize the
#'   likelihood under a Poisson model of the count data, \code{X}, in
#'   which the Poisson rates are given by \code{tcrossprod(L,F)}.
#'   Function \code{fit_poisson_nmf} runs a specified number of
#'   coordinate-wise updates to fit the L and F matrices.
#'
#' @details In Poisson non-negative matrix factorization (Lee & Seung,
#' 2001), counts \eqn{x_{ij}} in the \eqn{n \times m} matrix, \eqn{X},
#' are modeled by the Poisson distribution: \deqn{x_{ij} \sim
#' \mathrm{Poisson}(\lambda_{ij}).} Each Poisson rate,
#' \eqn{\lambda_{ij}}, is a linear combination of parameters
#' \eqn{f_{jk} \geq 0, l_{ik} \geq 0} to be fitted to the data:
#' \deqn{\lambda_{ij} = \sum_{k=1}^K l_{ik} f_{jk},} in which \eqn{K}
#' is a user-specified tuning parameter specifying the rank of the
#' matrix factorization. Function \code{fit_poisson_nmf} computes
#' maximum-likelihood estimates (MLEs) of the parameters. For
#' additional mathematical background, and an explanation of how
#' Poisson NMF is connected to topic modeling, see the vignette:
#' \code{vignette(topic = "relationship",package = "fastTopics")}.
#'
#' Using this function requires some care; only minimal argument
#' checking is performed, and error messages may not be helpful.
#'
#' The EM and multiplicative updates are simple and fast, but can be
#' slow to converge to a stationary point. When \code{control$numiter
#' = 1}, the EM and multiplicative updates are mathematically
#' equivalent to the multiplicative updates, and therefore share the
#' same convergence properties. However, the implementation of the EM
#' updates is quite different; in particular, the EM updates are more
#' suitable for sparse counts matrices. The implementation of the
#' multiplicative updates is adapted from the MATLAB code by Daichi
#' Kitamura \url{http://d-kitamura.net}.
#'
#' Since the multiplicative updates are implemented using standard
#' matrix operations, the speed is heavily dependent on the
#' BLAS/LAPACK numerical libraries used. In particular, using
#' optimized implementations such as OpenBLAS or Intel MKL can result
#' in much improved performance of the multiplcative updates.
#'
#' The cyclic co-ordinate descent (CCD) and sequential co-ordinate
#' descent (SCD) updates adopt the same optimization strategy, but
#' differ in the implementation details. In practice, we have found
#' that the CCD and SCD updates arrive at the same solution when
#' initialized \dQuote{sufficiently close} to a stationary point. The
#' CCD implementation is adapted from the C++ code developed by
#' Cho-Jui Hsieh and Inderjit Dhillon, which is available for download
#' at \url{https://www.cs.utexas.edu/~cjhsieh/nmf/}. The SCD
#' implementation is based on version 0.4-3 of the \sQuote{NNLM}
#' package.
#'
#' An additional re-scaling step is performed after each update to
#' promote numerical stability.
#'
#' We use three measures of progress for the model fitting: (1)
#' improvement in the log-likelihood (or deviance), (2) change in the
#' model parameters, and (3) the residuals of the Karush-Kuhn-Tucker
#' (KKT) first-order conditions. As the iterates approach a stationary
#' point of the loss function, the change in the model parameters
#' should be small, and the residuals of the KKT system should vanish.
#' Use \code{\link{plot_progress}} to plot the improvement in the
#' solution over time.
#'
#' See \code{\link{fit_topic_model}} for additional guidance on model
#' fitting, particularly for large or complex data sets.
#' 
#' The \code{control} argument is a list in which any of the
#' following named components will override the default optimization
#' algorithm settings (as they are defined by
#' \code{fit_poisson_nmf_control_default}):
#' 
#' \describe{
#'
#' \item{\code{numiter}}{Number of \dQuote{inner loop} iterations to
#'   run when performing and update of the factors or loadings. This
#'   must be set to 1 for \code{method = "mu"} and \code{method =
#'   "ccd"}.}
#'
#' \item{\code{nc}}{Number of RcppParallel threads to use for the
#'   updates. When \code{nc} is \code{NA}, the number of threads is
#'   determined by calling
#'   \code{\link[RcppParallel]{defaultNumThreads}}. This setting is
#'   ignored for the multiplicative upates (\code{method = "mu"}).}
#'
#' \item{\code{nc.blas}}{Number of threads used in the numerical
#'   linear algebra library (e.g., OpenBLAS), if available. For best
#'   performance, we recommend setting this to 1 (i.e., no
#'   multithreading).}
#' 
#' \item{\code{min.delta.loglik}}{Stop performing updates if the
#'   difference in the Poisson NMF log-likelihood between two successive
#'   updates is less than \code{min.delta.loglik}. This should not be
#'   kept at zero when \code{control$extrapolate = TRUE} because the
#'   extrapolated updates are expected to occasionally keep the
#'   likelihood unchanged. Ignored if \code{min.delta.loglik < 0}.}
#'
#' \item{\code{min.res}}{Stop performing updates if the maximum KKT
#'   residual is less than \code{min.res}. Ignored if \code{min.res < 0}.}
#' 
#' \item{\code{minval}}{A small, positive constant used to safeguard
#'   the multiplicative updates. The safeguarded updates are implemented
#'   as \code{F <- pmax(F1,minval)} and \code{L <- pmax(L1,minval)},
#'   where \code{F1} and \code{L1} are the factors and loadings matrices
#'   obtained by applying an update. This is motivated by Theorem 1 of
#'   Gillis & Glineur (2012). Setting \code{minval = 0} is allowed, but
#'   some methods are not guaranteed to converge to a stationary point
#'   without this safeguard, and a warning will be given in this case.}
#'
#' \item{\code{extrapolate}}{When \code{extrapolate = TRUE}, the
#'   extrapolation scheme of Ang & Gillis (2019) is used.}
#'
#' \item{\code{extrapolate.reset}}{To promote better numerical
#'   stability of the extrapolated updates, they are \dQuote{reset}
#'   every so often. This parameter determines the number of iterations
#'   to wait before resetting.}
#'
#' \item{\code{beta.increase}}{When the extrapolated update improves
#'   the solution, scale the extrapolation parameter by this amount.}
#'
#' \item{\code{beta.reduce}}{When the extrapolaaed update does not
#'   improve the solution, scale the extrapolation parameter by this
#'   amount.}
#'
#' \item{\code{betamax.increase}}{When the extrapolated update
#'   improves the solution, scale the extrapolation parameter by this
#'   amount.}
#'
#' \item{\code{eps}}{A small, non-negative number that is added to the
#'   terms inside the logarithms to sidestep computing logarithms of
#'   zero. This prevents numerical problems at the cost of introducing a
#'   small inaccuracy in the solution. Increasing this number may lead
#'   to faster convergence but possibly a less accurate solution.}
#'
#' \item{\code{zero.threshold}}{A small, non-negative number used to
#'   determine which entries of the solution are exactly zero. Any
#'   entries that are less than or equal to \code{zero.threshold} are
#'   considered to be exactly zero.}}
#'
#' An additional setting, \code{control$init.numiter}, controls the
#' number of sequential co-ordinate descent (SCD) updates that are
#' performed to initialize the loadings matrix when \code{init.method
#' = "topicscore"}.
#'
#' @param X The n x m matrix of counts; all entries of X should be
#'   non-negative. It can be a sparse matrix (class \code{"dgCMatrix"})
#'   or dense matrix (class \code{"matrix"}), with some exceptions (see
#'   \sQuote{Details}).
#'
#' @param k An integer 2 or greater giving the matrix rank. This
#'   argument should only be specified if the initial fit (\code{fit0}
#'   or \code{F, L}) is not provided.
#' 
#' @param fit0 The initial model fit. It should be an object of class
#'   \dQuote{poisson_nmf_fit}, such as an output from
#'   \code{init_poisson_nmf}, or from a previous call to
#'   \code{fit_poisson_nmf}.
#'
#' @param numiter The maximum number of updates of the factors and
#'   loadings to perform.
#'
#' @param update.factors A numeric vector specifying which factors
#'   (rows of \code{F}) to update. By default, all factors are
#'   updated. Note that the rows that are not updated may still change
#'   by rescaling. When \code{NULL}, all factors are fixed. This option
#'   is only implemented for \code{method = "em"} and \code{method =
#'   "scd"}. If another method is selected, the default setting of
#'   \code{update.factors} must be used.
#'
#' @param update.loadings A numeric vector specifying which loadings
#'   (rows of \code{L}) to update. By default, all loadings are
#'   updated. Note that the rows that are not updated may still change
#'   by rescaling. When \code{NULL}, all loadings are fixed. This option
#'   is only implemented for \code{method = "em"} and \code{method =
#'   "scd"}. If another method is selected, the default setting of
#'   \code{update.loadings} must be used.
#' 
#' @param method The method to use for updating the factors and
#'   loadings. Four methods are implemented: multiplicative updates,
#'   \code{method = "mu"}; expectation maximization (EM), \code{method =
#'   "em"}; sequential co-ordinate descent (SCD), \code{method = "scd"};
#'   and cyclic co-ordinate descent (CCD), \code{method = "ccd"}. See
#'   \sQuote{Details} for a detailed description of these methods.
#'
#' @param init.method The method used to initialize the factors and
#'   loadings. When \code{init.method = "random"}, the factors and
#'   loadings are initialized uniformly at random; when
#'   \code{init.method = "topicscore"}, the factors are initialized
#'   using the (very fast) Topic SCORE algorithm (Ke & Wang, 2017), and
#'   the loadings are initialized by running a small number of SCD
#'   updates. This input argument is ignored if initial estimates of the
#'   factors and loadings are already provided via input \code{fit0}, or
#'   inputs \code{F} and \code{L}.
#' 
#' @param control A list of parameters controlling the behaviour of
#'   the optimization algorithm (and the Topic SCORE algorithm if it
#'   is used to initialize the model parameters). See \sQuote{Details}.
#' 
#' @param verbose When \code{verbose = "detailed"}, information about
#'   the algorithm's progress is printed to the console at each
#'   iteration; when \code{verbose = "progressbar"}, a progress bar is
#'   shown; and when \code{verbose = "none"}, no progress information is
#'   printed. See the description of the \dQuote{progress} return value
#'   for an explanation of \code{verbose = "detailed"} console
#'   output. (Note that some columns of the \dQuote{progress} data frame
#'   are not shown in the console output.)
#'
#' @return \code{init_poisson_nmf} and \code{fit_poisson_nmf} both
#' return an object capturing the optimization algorithm state (for
#' \code{init_poisson_nmf}, this is the initial state). It is a list
#' with the following elements:
#'
#' \item{F}{A matrix containing the current best estimates of the
#'   factors.}
#' 
#' \item{L}{A matrix containing the current best estimates of the
#'   loadings.}
#' 
#' \item{Fn}{A matrix containing the non-extrapolated factor estimates.
#'   If extrapolation is not used, \code{Fn} and \code{F} will be the
#'   same.}
#'
#' \item{Ln}{A matrix containing the non-extrapolated estimates of the
#'   loadings. If extrapolation is not used, \code{Ln} and \code{L} will
#'   be the same.}
#'
#' \item{Fy}{A matrix containing the extrapolated factor estimates. If
#'   the extrapolation scheme is not used, \code{Fy} and \code{F} will
#'   be the same.}
#'
#' \item{Ly}{A matrix containing the extrapolated estimates of the
#'   loadings. If extrapolation is not used, \code{Ly} and \code{L} will
#'   be the same.}
#'
#' \item{loss}{Value of the objective (\dQuote{loss}) function
#'   computed at the current best estimates of the factors and
#'   loadings.}
#' 
#' \item{loss.fnly}{Value of the objective (\dQuote{loss}) function
#'   computed at the extrapolated solution for the loadings (\code{Ly})
#'   and the non-extrapolated solution for the factors (\code{Fn}). This
#'   is used internally to implement the extrapolated updates.}
#'
#' \item{iter}{The number of the most recently completed iteration.}
#' 
#' \item{beta}{The extrapolation parameter, \eqn{beta} in Algorithm 3
#'   of Ang & Gillis (2019).}
#'
#' \item{betamax}{Upper bound on the extrapolation parameter. This is
#'   \eqn{\bar{\gamma}} in Algorithm 3 of Ang & Gillis (2019).}
#'
#' \item{beta0}{The setting of the extrapolation parameter at the
#'   last iteration that improved the solution.}
#' 
#' \item{progress}{A data frame containing detailed information about
#'   the algorithm's progress. The data frame should have at most
#'   \code{numiter}
#'   rows. The columns of the data frame are: \dQuote{iter}, the
#'   iteration number; \dQuote{loglik}, the Poisson NMF log-likelihood
#'   at the current best factor and loading estimates;
#'   \dQuote{loglik.multinom}, the multinomial topic model
#'   log-likelihood at the current best factor and loading estimates;
#'   \dQuote{dev}, the deviance at the current best factor and loading
#'   estimates; \dQuote{res}, the maximum residual of the
#'   Karush-Kuhn-Tucker (KKT) first-order optimality conditions at the
#'   current best factor and loading estimates; \dQuote{delta.f}, the
#'   largest change in the factors matrix; \dQuote{delta.l}, the largest
#'   change in the loadings matrix; \dQuote{nonzeros.f}, the proportion
#'   of entries in the factors matrix that are nonzero;
#'   \dQuote{nonzeros.l}, the proportion of entries in the loadings
#'   matrix that are nonzero; \dQuote{extrapolate}, which is 1 if
#'   extrapolation is used, otherwise it is 0; \dQuote{beta}, the
#'   setting of the extrapolation parameter; \dQuote{betamax}, the
#'   setting of the extrapolation parameter upper bound; and
#'   \dQuote{timing}, the elapsed time in seconds (recorded using
#'   \code{\link{proc.time}}).}
#' 
#' @references
#'   Ang, A. and Gillis, N. (2019). Accelerating nonnegative matrix
#'   factorization algorithms using extrapolation. \emph{Neural
#'   Computation} \bold{31}, 417–439.
#' 
#'   Cichocki, A., Cruces, S. and Amari, S. (2011). Generalized
#'   alpha-beta divergences and their application to robust nonnegative
#'   matrix factorization. \emph{Entropy} \bold{13}, 134–170.
#' 
#'   Gillis, N. and Glineur, F. (2012). Accelerated multiplicative
#'   updates and hierarchical ALS algorithms for nonnegative matrix
#'   factorization. \emph{Neural Computation} \code{24}, 1085–1105.
#'
#'   Hsieh, C.-J. and Dhillon, I. (2011). Fast coordinate descent
#'   methods with variable selection for non-negative matrix
#'   factorization. In \emph{Proceedings of the 17th ACM SIGKDD
#'   international conference on Knowledge discovery and data mining},
#'   p. 1064-1072
#'
#'   Lee, D. D. and Seung, H. S. (2001). Algorithms for non-negative
#'   matrix factorization. In \emph{Advances in Neural Information
#'   Processing Systems} \bold{13}, 556–562.
#'
#'   Lin, X. and Boutros, P. C. (2018). Optimization and expansion of
#'   non-negative matrix factorization. \emph{BMC Bioinformatics}
#'   \bold{21}, 7.
#'
#'   Ke, Z. & Wang, M. (2017). A new SVD approach to optimal topic
#'   estimation. \emph{arXiv} \url{https://arxiv.org/abs/1704.07016}
#'
#' @seealso \code{\link{fit_topic_model}}, \code{\link{plot_progress}}
#' 
#' @examples
#' # Simulate a (sparse) 80 x 100 counts matrix.
#' library(Matrix)
#' set.seed(1)
#' X <- simulate_count_data(80,100,k = 3,sparse = TRUE)$X
#' 
#' # Remove columns (words) that do not appear in any row (document).
#' X <- X[,colSums(X > 0) > 0]
#' 
#' # Run 10 EM updates to find a good initialization.
#' fit0 <- fit_poisson_nmf(X,k = 3,numiter = 10,method = "em")
#' 
#' # Fit the Poisson NMF model by running 50 EM updates.
#' fit_em <- fit_poisson_nmf(X,fit0 = fit0,numiter = 50,method = "em")
#' 
#' # Fit the Poisson NMF model by running 50 extrapolated SCD updates.
#' fit_scd <- fit_poisson_nmf(X,fit0 = fit0,numiter = 50,method = "scd",
#'                            control = list(extrapolate = TRUE))
#' 
#' # Compare the two fits.
#' fits <- list(em = fit_em,scd = fit_scd)
#' compare_fits(fits)
#' plot_progress(fits,y = "loglik")
#' plot_progress(fits,y = "res")
#' 
#' # Recover the topic model. After this step, the L matrix contains the
#' # mixture proportions ("loadings"), and the F matrix contains the
#' # word frequencies ("factors").
#' fit_multinom <- poisson2multinom(fit_scd)
#'
#' @useDynLib fastTopics
#'
#' @importFrom utils modifyList
#' @importFrom RhpcBLASctl blas_set_num_threads
#' @importFrom RhpcBLASctl blas_get_num_procs
#'
#' @export
#'
fit_poisson_nmf <- function (X, k, fit0, numiter = 100,
                             update.factors = seq(1,ncol(X)),
                             update.loadings = seq(1,nrow(X)),
                             method = c("scd","em","mu","ccd"),
                             init.method = c("topicscore","random"),
                             control = list(),
                             verbose = c("progressbar","detailed","none")) {

  # CHECK & PROCESS INPUTS
  # ----------------------
  # Check and process input argument "X".
  verify.count.matrix(X)
  if (is.matrix(X) & length(X) > 1e4 & mean(X > 0) < 0.1 & any(method != "mu"))
    message(paste("Input matrix \"X\" has less than 10% nonzero entries;",
                  "consider converting \"X\" to a sparse matrix to reduce",
                  "computational effort"))
  if (is.matrix(X) & is.integer(X))
    storage.mode(X) <- "double"
  
  # Get the number of rows (n) and columns (m) of the data matrix,
  n <- nrow(X)
  m <- ncol(X)

  # Check input argument "numiter".
  if (any(numiter < 1))
    stop("Input argument \"numiter\" must be 1 or greater")

  # Process input arguments "update.factors" and "update.loadings".
  update.factors  <- sort(update.factors)
  update.loadings <- sort(update.loadings)
  if (length(update.factors) == 0 & length(update.loadings) == 0)
    stop("None of the factors or loadings have been selected for updating")
  
  # Check and process input arguments "method" and "init.method".
  method      <- match.arg(method)
  init.method <- match.arg(init.method)
  if (method == "mu" & is.sparse.matrix(X)) {
    warning("method = \"mu\" is not implemented for sparse counts matrices; ",
            "attempting to convert \"X\" to a dense matrix")
    X <- as.matrix(X)
  }
  if (!(length(update.factors) == m & length(update.loadings) == n))
    if (method == "mu" | method == "ccd")
      stop("All factors and loadings must be updated for method = \"mu\" ",
           "and method = \"ccd\"")

  # Check and process input argument "verbose".
  verbose <- match.arg(verbose)
  
  # Check and process the optimization settings.
  control <- modifyList(fit_poisson_nmf_control_default(),
                        control,keep.null = TRUE)
  if (any(control$minval == 0))
    warning("Updates may not converge when minval = 0")
  if (control$numiter > 1 & (method == "mu" | method == "ccd")) {
    warning("multiplicative and CCD updates do not allow ",
            "control$numiter > 1; setting control$numiter = 1")
    control$numiter <- 1
  }
  if ((length(update.factors) == 0 | length(update.loadings) == 0) &
      control$extrapolate)
    stop("control$extrapolate cannot be TRUE when all factors or loadings ",
         "are fixed")
  control$nc <- initialize.multithreading(control$nc,verbose != "none")
  ncb <- blas_get_num_procs()
  blas_set_num_threads(control$nc.blas)
  
  # Only one of "k" and "fit0" should be provided. If argument "k" is
  # given, generate a random initialization of the factors and
  # loadings.
  if (!(missing(k) & !missing(fit0) | (!missing(k) & missing(fit0))))
    stop("Provide a rank, \"k\", or an initialization, \"fit0\", but not both")
  if (missing(fit0))
    fit0 <- init_poisson_nmf(X,k = k,init.method = init.method,
                             control = control,
                             verbose = ifelse(verbose == "none",
                                              "none","detailed"))

  # Check input argument "fit0".
  if (!inherits(fit0,"poisson_nmf_fit"))
    stop("Input argument \"fit0\" should be an object of class ",
         "\"poisson_nmf_fit\", such as an output of init_poisson_nmf")
  verify.fit.and.count.matrix(X,fit0)
  fit <- fit0
  k   <- ncol(fit$F)
  
  # Give an overview of the optimization settings.
  if (verbose != "none") {
    cat(sprintf("Fitting rank-%d Poisson NMF to %d x %d %s matrix.\n",k,n,m,
                ifelse(is.matrix(X),"dense","sparse")))
    if (method == "mu")
      method.text <- "multiplicative"
    else if (method == "em")
      method.text <- "EM"
    else if (method == "scd")
      method.text <- "SCD"
    else if (method == "ccd")
      method.text <- "CCD"
    cat(sprintf("Running at most %d %s updates, %s extrapolation ",
                numiter,method.text,
                ifelse(control$extrapolate,"with","without")))
    cat("(fastTopics 0.6-192).\n")
  }
  
  # INITIALIZE ESTIMATES
  # --------------------
  # Re-scale the initial estimates of the factors and loadings so that
  # they are on same scale on average. This is intended to improve
  # numerical stability of the updates.
  fit <- rescale.fit(fit)

  # Initialize the estimates of the factors and loadings. To prevent
  # the updates from getting "stuck", force the initial estimates to
  # be positive.
  fit <- safeguard.fit(fit,control$minval)

  # RUN UPDATES
  # -----------
  if (verbose == "detailed")
    cat("iter loglik(PoisNMF) loglik(multinom) res(KKT)  |F-F'|  |L-L'|",
        "nz(F) nz(L) beta\n")
  fit <- fit_poisson_nmf_main_loop(X,fit,numiter,update.factors,
                                   update.loadings,method,control,
                                   verbose)

  # Restore the BLAS settings.
  blas_set_num_threads(ncb)
  
  # Output the updated "fit".
  fit$progress     <- rbind(fit0$progress,fit$progress)
  dimnames(fit$F)  <- dimnames(fit0$F)
  dimnames(fit$L)  <- dimnames(fit0$L)
  dimnames(fit$Fn) <- dimnames(fit0$Fn)
  dimnames(fit$Ln) <- dimnames(fit0$Ln)
  dimnames(fit$Fy) <- dimnames(fit0$Fy)
  dimnames(fit$Ly) <- dimnames(fit0$Ly)
  class(fit) <- c("poisson_nmf_fit","list")
  return(fit)
}

# This implements the core part of fit_poisson_nmf.
#
#' @importFrom progress progress_bar
fit_poisson_nmf_main_loop <- function (X, fit, numiter, update.factors,
                                       update.loadings, method, control,
                                       verbose) {
  if (verbose == "progressbar")
    pb <- progress_bar$new(total = numiter)
    
  # Pre-compute quantities and set up data structures used in the
  # loop below.
  loglik.const    <- sum(loglik_poisson_const(X))
  dev.const       <- sum(deviance_poisson_const(X))
  progress        <- as.data.frame(matrix(0,numiter,13))
  names(progress) <- c("iter","loglik","loglik.multinom","dev","res",
                       "delta.f","delta.l","nonzeros.f","nonzeros.l",
                       "extrapolate","beta","betamax","timing")

  # Iterate the updates of the factors and loadings.
  converged <- FALSE
  for (i in 1:numiter) {
    fit0 <- fit
    t1   <- proc.time()
    if (verbose == "progressbar")
      pb$tick()

    # Update the factors and loadings.
    if (control$extrapolate &
        fit$beta > 0 &
        i %% control$extrapolate.reset != 0) {
        
      # Perform an "extrapolated" update of the factors and loadings.
      extrapolate <- TRUE
      fit <- update_poisson_nmf_extrapolated(X,fit,update.factors,
                                             update.loadings,method,
                                             control)
    } else {

      # Perform a basic coordinate-wise update of the factors and
      # loadings.
      extrapolate <- FALSE
      fit <- update_poisson_nmf(X,fit,update.factors,update.loadings,
                                method,control)
    }
    t2 <- proc.time()

    # Update the iteration number.
    fit$iter    <- fit$iter + 1
    loglik.diff <- fit0$loss - fit$loss
    
    # Update the "progress" data frame with the log-likelihood,
    # deviance, and other quantities, and report the algorithm's
    # progress to the console if requested. In all cases, the "current
    # best" estimates of the factors and loadings are used to report
    # progress.
    progress[i,"iter"]        <- fit$iter
    progress[i,"loglik"]      <- loglik.const - fit$loss
    progress[i,"loglik.multinom"] <-
      loglik.const - fit$loss - sum(loglik_size_factors(X,fit$F,fit$L))
    progress[i,"dev"]         <- dev.const + 2*fit$loss
    res <- with(poisson_nmf_kkt(X,fit$F,fit$L),
                max(abs(rbind(F[update.factors,],
                              L[update.loadings,]))))
    progress[i,"res"]         <- res
    progress[i,"delta.f"]     <- max(abs(fit$F - fit0$F))
    progress[i,"delta.l"]     <- max(abs(fit$L - fit0$L))
    progress[i,"beta"]        <- fit$beta
    progress[i,"betamax"]     <- fit$betamax
    progress[i,"timing"]      <- t2["elapsed"] - t1["elapsed"]
    progress[i,"nonzeros.f"]  <- mean(fit$F > control$zero.threshold)
    progress[i,"nonzeros.l"]  <- mean(fit$L > control$zero.threshold)
    progress[i,"extrapolate"] <- extrapolate
    if (verbose == "detailed")
      cat(sprintf("%4d %+0.8e %+0.9e %0.2e %0.1e %0.1e %0.3f %0.3f %0.2f\n",
                  fit$iter,progress[i,"loglik"],progress[i,"loglik.multinom"],
                  progress[i,"res"],progress[i,"delta.f"],
                  progress[i,"delta.l"],progress[i,"nonzeros.f"],
                  progress[i,"nonzeros.l"],extrapolate * progress[i,"beta"]))
    if (control$min.res >= 0 & res < control$min.res) {
      if (verbose == "progressbar")
        pb$terminate()
      cat(sprintf("Stopping condition is met at iteration %d: ",i))
      cat(sprintf("max. KKT residual < %0.2e\n",control$min.res))
      converged <- TRUE
      break
    } else if (control$min.delta.loglik >= 0 &
               loglik.diff < control$min.delta.loglik) {
      if (verbose == "progressbar")
        pb$terminate()
      cat(sprintf("Stopping condition is met at iteration %d: ",i))
      cat(sprintf("change in NMF loglik < %0.2e\n",
                  control$min.delta.loglik))
      converged <- TRUE
      break
    }
  }

  # Print a warning if one or more of the stopping criteria were not
  # satisfied.
  if (!converged & with(control,min.res >= 0 | min.delta.loglik >= 0))
    warning(sprintf(paste("One or both stopping conditions were not met",
                          "within %d iterations"),numiter))
  
  # Output the updated "fit".
  progress <- progress[1:i,]    
  fit$progress <- progress
  return(fit)
}

# This implements a single (non-extrapolated) update of the factors
# and loadings.
#
# The output is the updated "fit". Because no extrapolation scheme is
# used, the extrapolated and non-extrapolated estimates are the same
# in the return value; that is, Fy and Fn are the same as F, and Ly
# and Ln are the same as L. The value of the loss function ("loss",
# "loss.fnly") is also updated.
#
# Note that "update.factors" and "update.loadings" is ignored for
# method = "mu" and method = "ccd".
update_poisson_nmf <- function (X, fit, update.factors, update.loadings,
                                method, control) {

  # Update the loadings ("activations"). The factors are forced to be
  # non-negative, or positive; the latter can promote better
  # convergence for some algorithms.
  if (length(update.loadings) > 0) {
    fit$L <- update_loadings_poisson_nmf(X,fit$F,fit$L,update.loadings,
                                         method,control)
    fit$L <- pmax(fit$L,control$minval)
  }

  # Update the factors ("basis vectors"). The loadings are forced to
  # be non-negative, or positive; the latter can promote better
  # convergence for some algorithms.
  if (length(update.factors) > 0) {
    fit$F <- update_factors_poisson_nmf(X,fit$F,fit$L,update.factors,
                                        method,control)
    fit$F <- pmax(fit$F,control$minval)
  }
  
  # The extrapolated and "current best" estimates are the same as the
  # non-extrapolated estimates.
  fit$Fy <- fit$F
  fit$Ly <- fit$L
  fit$Fn <- fit$F
  fit$Ln <- fit$L

  # Re-scale the factors and loadings.
  fit <- rescale.fit(fit)

  # Compute the value of the objective ("loss") function at the updated
  # estimates.
  fit$loss      <- sum(cost(X,fit$L,t(fit$F),control$eps))
  fit$loss.fnly <- fit$loss
  
  # Output the updated "fit".
  return(fit)
}

# This implements an extrapolated update of the factors and loadings.
#
# The output is the updated "fit". The updated parts of the "fit" are:
#
#   F          "best current" factor estimates
#   Fn         non-extrapolated factor estimates
#   Fy         extrapolated factor estimates
#   L          "best current" loadings estimates
#   Ln         non-extrapolated loadings estimates
#   Ly         extrapolated loadings estimates
#   loss       value of the objective at the "best current" estimates 
#   loss.fnly  value of the objective at (Fn, Ly)
#   beta       extrapolation parameter
#   betamax    upper bound on the extrapolation parameter
#   beta0      extrapolation parameter setting from last improvement
#
# Note that "update.factors" and "update.loadings" is ignored for
# method = "mu" and method = "ccd".
update_poisson_nmf_extrapolated <- function (X, fit, update.factors,
                                             update.loadings, method,
                                             control) {

  # Store the value of the objective (loss) function at the current
  # iterate (Fn, Ly).
  loss0.fnly <- fit$loss.fnly

  # Compute the extrapolated update for the loadings ("activations").
  # Note that when beta is zero, Ly = Ln.
  Ln     <- update_loadings_poisson_nmf(X,fit$Fy,fit$Ly,update.loadings,
                                        method,control)
  Ln     <- pmax(Ln,control$minval)
  fit$Ly <- pmax(Ln + fit$beta*(Ln - fit$Ln),control$minval)

  # Compute the extrapolated update for the factors ("basis vectors").
  # Note that when beta = 0, Fy = Fn.
  Fn     <- update_factors_poisson_nmf(X,fit$Fy,fit$Ly,update.factors,
                                       method,control)
  Fn     <- pmax(Fn,control$minval)
  fit$Fy <- pmax(Fn + fit$beta*(Fn - fit$Fn),control$minval)
  
  # Compute the value of the objective (loss) function at the
  # extrapolated solution for the loadings (Ly) and the
  # non-extrapolated solution for the factors (Fn).
  fit$loss.fnly <- sum(cost(X,fit$Ly,t(Fn),control$eps))

  # Update the extrapolation parameters following Algorithm 3 of
  # Ang & Gillis (2019).
  if (fit$loss.fnly >= loss0.fnly) {

    # The solution did not improve, so restart the extrapolation
    # scheme.
    fit$Fy      <- fit$Fn
    fit$Ly      <- fit$Ln
    fit$betamax <- fit$beta0
    fit$beta    <- control$beta.reduce * fit$beta
  } else {
        
    # The solution improved; retain the basic co-ordinate ascent
    # update as well.
    fit$Fn      <- Fn
    fit$Ln      <- Ln
    fit$beta    <- min(fit$betamax,control$beta.increase * fit$beta)
    fit$beta0   <- fit$beta
    fit$betamax <- min(0.99,control$betamax.increase * fit$betamax)
  }
 
  # If the solution improves the "current best" estimate, update the
  # current best estimate using the non-extrapolated estimates of the
  # factors (Fn) and the extrapolated estimates of the loadings (Ly).
  if (fit$loss.fnly < fit$loss) {
    fit$F    <- Fn
    fit$L    <- fit$Ly 
    fit$loss <- fit$loss.fnly
  }

  # Re-scale the factors and loadings.
  fit <- rescale.fit(fit)
  
  # Output the updated "fit".
  return(fit)
}

# Implements a single update of the factors matrix. Note that input
# argument "j", specifying which columns of F to update, is ignored for
# method = "mu" and method = "ccd".
update_factors_poisson_nmf <- function (X, F, L, j, method, control) {
  numiter <- control$numiter
  nc      <- control$nc
  eps     <- control$eps
  if (method == "mu")
    F <- t(betanmf_update_factors(X,L,t(F)))
  else if (method == "em")
    F <- pnmfem_update_factors(X,F,L,j,numiter,nc)
  else if (method == "ccd")
    F <- t(ccd_update_factors(X,L,t(F),nc,eps))
  else if (method == "scd")
    F <- t(scd_update_factors(X,L,t(F),j,numiter,nc,eps))
  return(F)
}
  
# Implements a single update of the loadings matrix. Note that input
# argument "i", specifying which rows of L to update, is ignored for
# method = "mu" and method = "ccd".
update_loadings_poisson_nmf <- function (X, F, L, i, method, control) {
  numiter <- control$numiter
  nc      <- control$nc
  eps     <- control$eps
  if (method == "mu")
    L <- betanmf_update_loadings(X,L,t(F))
  else if (method == "em")
    L <- pnmfem_update_loadings(X,F,L,i,numiter,nc)
  else if (method == "ccd")
    L <- ccd_update_loadings(X,L,t(F),nc,eps)
  else if (method == "scd")
    L <- scd_update_loadings(X,L,t(F),i,numiter,nc,eps)
  return(L)
}

# Rescale the extrapolated, non-extrapolated, and "current best"
# factors and loadings.
rescale.fit <- function (fit) {

  # Rescale the "current best" factors and loadings.
  out    <- rescale.factors(fit$F,fit$L)
  fit$F  <- out$F
  fit$L  <- out$L

  # Rescale the non-extrapolated factors and loadings.
  out    <- rescale.factors(fit$Fn,fit$Ln)
  fit$Fn <- out$F
  fit$Ln <- out$L

  # Rescale the extrapolated factors and loadings.
  out    <- rescale.factors(fit$Fy,fit$Ly)
  fit$Fy <- out$F
  fit$Ly <- out$L

  return(fit)
}

# Force the factors and loadings to be positive.
safeguard.fit <- function (fit, minval) {
  fit$F  <- pmax(fit$F,minval)
  fit$L  <- pmax(fit$L,minval)
  fit$Fn <- pmax(fit$Fn,minval)
  fit$Ln <- pmax(fit$Ln,minval)
  fit$Fy <- pmax(fit$Fy,minval)
  fit$Ly <- pmax(fit$Ly,minval)
  return(fit)
}

#' @rdname fit_poisson_nmf
#'
#' @export
#' 
fit_poisson_nmf_control_default <- function()
  list(numiter           = 4,
       init.numiter      = 10,
       min.delta.loglik  = -1,
       min.res           = -1,
       minval            = 1e-10,
       eps               = 1e-8,
       zero.threshold    = 1e-6,
       nc                = 1,
       nc.blas           = 1,
       extrapolate       = FALSE,
       extrapolate.reset = 20,
       beta.increase     = 1.1,
       beta.reduce       = 0.75,
       betamax.increase  = 1.05)

Try the fastTopics package in your browser

Any scripts or data that you put into this service are public.

fastTopics documentation built on Sept. 11, 2024, 5:14 p.m.