FTRL: Logistic regression model with FTRL proximal SGD solver.

FTRL    R Documentation

Logistic regression model with FTRL proximal SGD solver.

Description

Creates a 'Follow the Regularized Leader' (FTRL) model. Only logistic regression is implemented at the moment.

Methods

Public methods


Method new()

creates a model

Usage
FTRL$new(
  learning_rate = 0.1,
  learning_rate_decay = 0.5,
  lambda = 0,
  l1_ratio = 1,
  dropout = 0,
  family = c("binomial")
)
Arguments
learning_rate

learning rate

learning_rate_decay

parameter that controls the decay of the learning rate. Please refer to the FTRL-Proximal paper for details. Convergence usually does not depend heavily on this parameter, so the default value of 0.5 is safe.

lambda

regularization parameter

l1_ratio

controls the mix of L1 and L2 penalties. 1 = Lasso regression, 0 = Ridge regression. Values in between give an elastic net.

dropout

proportion of random features to exclude from each sample. Acts as regularization.

family

a description of the error distribution and link function to be used in the model. Only binomial (logistic regression) is implemented at the moment.
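As a minimal sketch of how the constructor arguments above combine, the following creates three models that differ only in their penalty. The specific values of lambda, l1_ratio and dropout are illustrative assumptions, not recommended settings.

```r
library(rsparse)

# Pure L1 (lasso-style) penalty: tends to drive many coefficients to exactly zero.
ftrl_l1 = FTRL$new(learning_rate = 0.1, lambda = 1, l1_ratio = 1)

# Pure L2 (ridge-style) penalty: shrinks coefficients without zeroing them.
ftrl_l2 = FTRL$new(learning_rate = 0.1, lambda = 1, l1_ratio = 0)

# Elastic net: an even mix of both penalties, plus dropout on 10% of the features
# (illustrative values only).
ftrl_en = FTRL$new(learning_rate = 0.1, lambda = 1, l1_ratio = 0.5, dropout = 0.1)
```

With the default lambda = 0 no penalty is applied, so l1_ratio only matters once lambda is positive.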


Method partial_fit()

fits model to the data

Usage
FTRL$partial_fit(x, y, weights = rep(1, length(y)), ...)
Arguments
x

input sparse matrix. The native format is Matrix::RsparseMatrix. If x is in a different format, the model will try to convert it to RsparseMatrix with as(x, "RsparseMatrix"). Dimensions should be (n_samples, n_features).

y

vector of targets

weights

numeric vector of length 'n_samples'. Defines how to amplify SGD updates for each sample. May be useful for highly unbalanced problems.

...

not used at the moment
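Because partial_fit() updates the model with whatever rows it is given, it can be called repeatedly on successive chunks of data. The sketch below is one way to do that, assuming synthetic data from Matrix::rsparsematrix() and an arbitrary batch size of 250 rows; the names x_csc, batches and x_batch are hypothetical.

```r
library(rsparse)
library(Matrix)

set.seed(1)
n = 1000
p = 50

# Synthetic data (assumption for illustration): a random sparse matrix and a
# binary target that depends on the sign of the first column.
x_csc = rsparsematrix(n, p, density = 0.1)
y = as.numeric(x_csc[, 1] > 0)

# Split the row indices into 4 consecutive batches of 250 rows.
batches = split(seq_len(n), ceiling(seq_len(n) / 250))

ftrl = FTRL$new(learning_rate = 0.1, lambda = 0.01)
for (idx in batches) {
  # Convert each chunk to the model's native row-compressed format.
  x_batch = as(x_csc[idx, , drop = FALSE], "RsparseMatrix")
  ftrl$partial_fit(x_batch, y[idx])
}

w = ftrl$coef()
```

Every batch must have the same number of columns, since the model is sized by n_features.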


Method fit()

shorthand for applying 'partial_fit' 'n_iter' times

Usage
FTRL$fit(x, y, weights = rep(1, length(y)), n_iter = 1L, ...)
Arguments
x

input sparse matrix. The native format is Matrix::RsparseMatrix. If x is in a different format, the model will try to convert it to RsparseMatrix with as(x, "RsparseMatrix"). Dimensions should be (n_samples, n_features).

y

vector of targets

weights

numeric vector of length 'n_samples'. Defines how to amplify SGD updates for each sample. May be useful for highly unbalanced problems.

n_iter

number of SGD epochs

...

not used at the moment
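The following sketch shows fit() with several epochs and per-sample weights on an unbalanced target. The weighting scheme (ratio of negatives to positives) is a common heuristic chosen here as an assumption, and the data are synthetic; neither comes from the package itself.

```r
library(rsparse)
library(Matrix)

set.seed(1)
n = 1000
p = 50

# Synthetic, unbalanced data (assumption for illustration): positives are rows
# whose first feature exceeds 0.5, which is a small minority of all rows.
x_csc = rsparsematrix(n, p, density = 0.1)
y = as.numeric(x_csc[, 1] > 0.5)
stopifnot(sum(y == 1) > 0)
x = as(x_csc, "RsparseMatrix")

# Hypothetical weighting scheme: amplify updates for the rarer positive class.
w_obs = ifelse(y == 1, sum(y == 0) / sum(y == 1), 1)

ftrl = FTRL$new(learning_rate = 0.1)
ftrl$fit(x, y, weights = w_obs, n_iter = 5L)  # 5 passes over the data
```

Calling partial_fit() five times on the same x, y and weights should be equivalent, since fit() is described as a shorthand for exactly that.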


Method predict()

makes predictions based on fitted model

Usage
FTRL$predict(x, ...)
Arguments
x

input matrix

...

not used at the moment
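A short sketch of evaluating predictions on held-out rows follows. It assumes that, for the binomial family, predict() returns one probability per row; the train/test split, the 0.5 threshold and the synthetic data are illustrative choices.

```r
library(rsparse)
library(Matrix)

set.seed(42)
n = 2000
p = 30

# Synthetic data (assumption for illustration): the target depends on the
# first two features.
x_csc = rsparsematrix(n, p, density = 0.2)
y = as.numeric(x_csc[, 1] + x_csc[, 2] > 0)

train = 1:1500
test = 1501:2000
x_train = as(x_csc[train, , drop = FALSE], "RsparseMatrix")
x_test = as(x_csc[test, , drop = FALSE], "RsparseMatrix")

ftrl = FTRL$new(learning_rate = 0.1)
ftrl$fit(x_train, y[train], n_iter = 5L)

# Assumed to be probabilities of y == 1, one per row of x_test.
p_hat = ftrl$predict(x_test)

# Share of held-out rows classified correctly at a 0.5 threshold.
accuracy = mean((p_hat > 0.5) == (y[test] == 1))
```

The matrix passed to predict() needs the same number of columns as the data the model was fitted on.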


Method coef()

returns coefficients of the regression model

Usage
FTRL$coef()

Method clone()

The objects of this class are cloneable with this method.

Usage
FTRL$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Examples

library(rsparse)
library(Matrix)
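# random row and column indices for 100,000 entries of a 1000 x 1000 matrix
i = sample(1000, 1000 * 100, TRUE)
j = sample(1000, 1000 * 100, TRUE)
# binary target, one value per row
y = sample(c(0, 1), 1000, TRUE)
x = sample(c(-1, 1), 1000 * 100, TRUE)
# make odd-numbered columns below 100 informative:
# set those entries to 1 for rows with y == 1
odd = seq(1, 99, 2)
x[i %in% which(y == 1) & j %in% odd] = 1
# build the matrix in row-compressed (RsparseMatrix) form
x = sparseMatrix(i = i, j = j, x = x, dims = c(1000, 1000), repr = "R")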

ftrl = FTRL$new(learning_rate = 0.01, learning_rate_decay = 0.1,
                lambda = 10, l1_ratio = 1, dropout = 0)
ftrl$partial_fit(x, y)

w = ftrl$coef()
head(w)
sum(w != 0)
p = ftrl$predict(x)

rsparse documentation built on Sept. 12, 2022, 1:06 a.m.