The goal of ftrl is to implement the FTRL online training algorithm in R.
It is not meant for production, just for fun. It only supports logistic
regression and probably has some bugs.
The algorithm is described in McMahan et al. (2013); see the reference below.
You can install the released version of ftrl from GitHub:
remotes::install_github("dirkschumacher/ftrl")
This basic example shows how to create an optimizer:
library(ftrl)
optimizer <- FTRLDenseOptimizer(
  n_weights = 100, # number of features
  # ... plus some learning rate and regularization parameters
  lambda1 = 20
)
You can then stream examples to the optimizer. The optimizer keeps only the current weights and some intermediate results, so its memory use does not grow with the number of training examples.
# fit 100k samples
system.time(
  for (i in seq_len(100000)) {
    # 99 useless features, learn if the first element is negative
    x <- rnorm(100)
    optimizer$fit(x, x[1] < 0)
  }
)
#> user system elapsed
#> 5.060 0.757 6.486
optimizer$predict(c(2, rnorm(99)))
#> [1] 1.464633e-11
optimizer$predict(c(1, rnorm(99)))
#> [1] 3.941826e-06
optimizer$predict(c(0.1, rnorm(99)))
#> [1] 0.2369947
optimizer$predict(c(-0.1, rnorm(99)))
#> [1] 0.7867933
optimizer$predict(c(-1, rnorm(99)))
#> [1] 0.9999964
optimizer$predict(c(-2, rnorm(99)))
#> [1] 1
Since we use L1 regularization, many of the weights for the useless features are exactly zero, which makes the resulting model sparse and smaller to store.
sum(optimizer$weights() == 0) / length(optimizer$weights())
#> [1] 0.72
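To illustrate why the optimizer needs only per-weight state and why L1 regularization produces exact zeros, here is a minimal sketch of the per-coordinate FTRL-Proximal update from McMahan et al. (2013), written in plain R. It is not the package's internal code: the function names `ftrl_weights` and `ftrl_step`, the `state` list, and the parameter values are all assumptions made for this illustration.

```r
# Hypothetical helpers for illustration; not part of the ftrl package API.

# Closed-form weights from the accumulated state (z, n).
# Any coordinate with |z_i| <= lambda1 is set to exactly zero.
ftrl_weights <- function(z, n, alpha, beta, lambda1, lambda2) {
  w <- -(z - sign(z) * lambda1) / ((beta + sqrt(n)) / alpha + lambda2)
  w[abs(z) <= lambda1] <- 0
  w
}

# One online update for logistic regression with label y in {0, 1}.
# The state holds two vectors of length n_weights and nothing else.
ftrl_step <- function(state, x, y, alpha = 0.1, beta = 1,
                      lambda1 = 1, lambda2 = 0) {
  w <- ftrl_weights(state$z, state$n, alpha, beta, lambda1, lambda2)
  p <- 1 / (1 + exp(-sum(w * x)))  # predicted probability
  g <- (p - y) * x                 # gradient of the log loss
  sigma <- (sqrt(state$n + g^2) - sqrt(state$n)) / alpha
  list(z = state$z + g - sigma * w, n = state$n + g^2)
}

set.seed(1)
state <- list(z = numeric(10), n = numeric(10))
for (i in seq_len(5000)) {
  x <- rnorm(10)
  # same toy task as above: learn if the first element is negative
  state <- ftrl_step(state, x, as.numeric(x[1] < 0))
}
w <- ftrl_weights(state$z, state$n, alpha = 0.1, beta = 1,
                  lambda1 = 1, lambda2 = 0)
```

In this sketch the state is two numeric vectors, `z` and `n`, each as long as the weight vector, regardless of how many examples are streamed. The thresholding step in `ftrl_weights` is what turns weakly supported coordinates into exact zeros rather than merely small values.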
McMahan, H. Brendan, et al. “Ad click prediction: a view from the trenches.” Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2013.