Home

/

CRAN

/

RcppML

/

Introduction to the RcppML package"
In RcppML: Rcpp Machine Learning Library

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

The 'RcppML' package provides high-performance machine learning algorithms using Rcpp with a focus on matrix factorization.

Installation

Install the latest development version of RcppML from github:

```{R, eval = FALSE} library(devtools) install_github("zdebruine/RcppML")

```{R}
library(RcppML)
library(Matrix)

Non-Negative Least Squares

RcppML contains extremely fast NNLS solvers. Use the nnls function to solve systems of equations subject to non-negativity constraints.

The RcppML::solve function solves the equation \eqn{ax = b} for \eqn{x} where \eqn{a} is symmetric positive definite matrix of dimensions \eqn{m x m} and \eqn{b} is a vector of length \eqn{m} or a matrix of dimensions \eqn{m x n}.

# construct a system of equations
X <- matrix(rnorm(2000),100,20)
btrue <- runif(20)
y <- X %*% btrue + rnorm(100)
a <- crossprod(X)
b <- crossprod(X, y)

# solve the system of equations
x <- RcppML::nnls(a, b)

# use only coordinate descent
x <- RcppML::nnls(a, b, fast_nnls = FALSE, cd_maxit = 1000, cd_tol = 1e-8)

RcppML::solve implements a new and fastest-in-class algorithm for non-negative least squares:

initialization is done by solving for the unconstrained least squares solution.
forward active set tuning (FAST) provides a near-exact solution (often exact for well-conditioned systems) by setting all negative values in the unconstrained solution to zero, re-solving the system for only positive values, and repeating the process until the solution for values not constrained to zero is strictly positive. Set cd_maxit = 0 to use only the FAST solver.
Coordinate descent refines the FAST solution and finds the best solution discoverable by gradient descent. The coordinate descent solution is only used if it gives a better error than the FAST solution. Generally, coordinate descent re-introduces variables constrained to zero by FAST back into the feasible set, but does not dramatically change the solution.

Projecting Linear Models

Project dense linear factor models onto real-valued sparse matrices (or any matrix coercible to Matrix::dgCMatrix) using RcppML::project.

RcppML::project solves the equation \eqn{A = WH} for \eqn{H}.

# simulate a sparse matrix
A <- rsparsematrix(1000, 100, 0.1)

# simulate a linear factor model
w <- matrix(runif(1000 * 10), 1000, 10)

# project the model
h <- RcppML::project(A, w)

Non-negative Matrix Factorization

RcppML::nmf finds a non-negative matrix factorization by alternating least squares (alternating projections of linear models \eqn{w} and \eqn{h}).

There are several ways in which the NMF algorithm differs from other currently available methods:

Diagonalized scaling of factors to sum to 1, permitting convex L1 regularization along the entire solution path
Fast stopping criteria, based on correlation between models across consecutive iterations
Extremely fast algorithms using the Eigen C++ library, optimized for matrices that are >90% sparse
Support for NMF or unconstrained matrix factorization
Parallelized using OpenMP multithreading

The following example runs rank-10 NMF on a random 1000 x 1000 matrix that is 90% sparse:

A <- rsparsematrix(100, 100, 0.1)
model <- RcppML::nmf(A, 10, verbose = F)

w <- model$w
d <- model$d
h <- model$h
model_tolerance <- tail(model$tol, 1)

Tolerance is simply a measure of the average correlation between \eqn{w_{i-1} and \eqn{w_i} and \eqn{h_{i-1}} and \eqn{h_i} for a given iteration \eqn{i}.

For symmetric factorizations (when \code{symmetric = TRUE}), tolerance becomes a measure of the correlation between \eqn{w_i} and \eqn{h_i}, and diagonalization is automatically performed to enforce symmetry:

A_sym <- as(crossprod(A), "dgCMatrix")

model <- RcppML::nmf(A_sym, 10, verbose = F)

Mean squared error of a factorization can be calculated for a given model using the RcppML::mse function:

RcppML::mse(A_sym, model$w, model$d, model$h)

Any scripts or data that you put into this service are public.

RcppML documentation built on Sept. 22, 2021, 1:06 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

RcppML
Rcpp Machine Learning Library

Introduction to the RcppML package"
In RcppML: Rcpp Machine Learning Library

Installation

Non-Negative Least Squares

Projecting Linear Models

Non-negative Matrix Factorization

Try the RcppML package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

RcppML Rcpp Machine Learning Library

Introduction to the RcppML package" In RcppML: Rcpp Machine Learning Library

Installation

Non-Negative Least Squares

Projecting Linear Models

Non-negative Matrix Factorization

Try the RcppML package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

RcppML
Rcpp Machine Learning Library

Introduction to the RcppML package"
In RcppML: Rcpp Machine Learning Library