In dirkschumacher/ftrl: FTRL Optimization For Logistic Regression

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
set.seed(1)

Dense FTRL-Proximal online learning algorithm for logistic regression

The goal of ftrl is to implement the FTRL-Proximal online training algorithm in R. It is a just-for-fun project and not intended for production use. It only supports logistic regression and probably contains some bugs.

The algorithm is described in McMahan et al. (2013), listed under References.
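As a rough illustration of what one online update looks like, here is a minimal base-R sketch of the per-coordinate FTRL-Proximal step for logistic regression, following Algorithm 1 of the paper. The function names `ftrl_weights` and `ftrl_step`, the `state` list, and the default parameter values are hypothetical and chosen for this sketch only. They are not part of the ftrl package, and the package's internals may differ.

```r
# Sketch only: hypothetical helpers, not the ftrl package API.

# Closed-form weights from the accumulated state (z, n).
# Coordinates with |z| <= lambda1 are set to exactly zero.
ftrl_weights <- function(z, n, alpha, beta, lambda1, lambda2) {
  w <- -(z - sign(z) * lambda1) / ((beta + sqrt(n)) / alpha + lambda2)
  w[abs(z) <= lambda1] <- 0
  w
}

# One online update for a single example (x, y), with y in {0, 1}.
ftrl_step <- function(state, x, y,
                      alpha = 0.1, beta = 1, lambda1 = 1, lambda2 = 1) {
  w <- ftrl_weights(state$z, state$n, alpha, beta, lambda1, lambda2)
  p <- 1 / (1 + exp(-sum(w * x)))   # predicted probability
  g <- (p - y) * x                  # gradient of the log loss
  sigma <- (sqrt(state$n + g^2) - sqrt(state$n)) / alpha
  list(
    z = state$z + g - sigma * w,    # accumulated adjusted gradients
    n = state$n + g^2               # accumulated squared gradients
  )
}

# Start from an all-zero state and process one example.
state <- list(z = numeric(3), n = numeric(3))
state <- ftrl_step(state, x = c(1, 0, -1), y = 1)
```

The state consists of just two vectors with one entry per weight, which is why memory use does not depend on how many examples have been seen.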

Installation

You can install ftrl from GitHub:

remotes::install_github("dirkschumacher/ftrl")

Example

This basic example creates an optimizer for a logistic regression model with 100 features:

library(ftrl)
optimizer <- FTRLDenseOptimizer(
  n_weights = 100, # number of features
  # ... plus some learning rate and regularization parameters
  lambda1 = 20
)

You can then stream examples to the optimizer one at a time. The optimizer keeps only the current weights and some intermediate results, so its memory use does not grow with the number of training examples.

# fit 100k samples
system.time(
  for (i in seq_len(100000)) {
    # 99 uninformative features; the label is TRUE when the first element is negative
    x <- rnorm(100)
    optimizer$fit(x, x[1] < 0)
  }
)

optimizer$predict(c(2, rnorm(99)))
optimizer$predict(c(1, rnorm(99)))
optimizer$predict(c(0.1, rnorm(99)))
optimizer$predict(c(-0.1, rnorm(99)))
optimizer$predict(c(-1, rnorm(99)))
optimizer$predict(c(-2, rnorm(99)))

Since we use L1 regularization, many of the weights for the useless features are exactly zero, which makes the resulting model sparse and smaller to store.

sum(optimizer$weights() == 0) / length(optimizer$weights())
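To see why the zeros are exact rather than merely small, the following base-R sketch applies the thresholding rule from the paper's closed-form weight update to a few made-up values. The vectors `z` and `n` and all parameter values except `lambda1 = 20` are invented for illustration, and the snippet does not call the ftrl package. It assumes the package follows the same rule.

```r
# Illustrative values only; not taken from a fitted optimizer.
z <- c(-30, -5, 0, 12, 25)  # accumulated adjusted gradients per weight
n <- rep(4, 5)              # accumulated squared gradients per weight
alpha <- 0.1
beta <- 1
lambda1 <- 20               # L1 strength, as in the example above
lambda2 <- 0

# Weights whose |z| does not exceed lambda1 are clamped to exactly 0;
# the others are shrunk towards 0 by lambda1.
w <- ifelse(
  abs(z) <= lambda1,
  0,
  -(z - sign(z) * lambda1) / ((beta + sqrt(n)) / alpha + lambda2)
)

# Share of weights that are exactly zero
sum(w == 0) / length(w)
```

Here three of the five coordinates have `abs(z)` at or below `lambda1`, so three weights are exactly zero. A larger `lambda1` widens that band and yields a sparser model.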

References

McMahan, H. Brendan, et al. "Ad click prediction: a view from the trenches." Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2013.

dirkschumacher/ftrl documentation built on July 7, 2019, 12:43 a.m.
