BaselearnerPolynomial: Polynomial base learner

BaselearnerPolynomial R Documentation

Polynomial base learner

Description

BaselearnerPolynomial creates a polynomial base learner object. The base learner takes one feature x and calculates the polynomial terms (with intercept) 1, x, x^2, ..., x^d for a given degree d.
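To make the transformation concrete, the following base R sketch builds such a polynomial design matrix by hand. The helper poly_design() is hypothetical and not part of compboost, and its default intercept = TRUE is an assumption made for illustration; it only mirrors the description above.

```r
# Hypothetical helper (not part of compboost): build the polynomial terms
# 1, x, x^2, ..., x^d as the columns of a matrix.
poly_design = function(x, degree, intercept = TRUE) {
  # Start at power 0 (the constant column) only if an intercept is requested.
  powers = if (intercept) 0:degree else seq_len(degree)
  X = outer(x, powers, "^")
  colnames(X) = paste0("x^", powers)
  X
}

x = c(0.5, 1, 2)

# Four columns: x^0 (intercept), x^1, x^2, x^3.
poly_design(x, degree = 3)

# Three columns: the intercept column is dropped.
poly_design(x, degree = 3, intercept = FALSE)
```

Whether the columns returned by $getData() are arranged exactly like this is not stated here, so treat the sketch as an illustration of the idea rather than a description of the internal layout.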

Arguments

data_source

(InMemoryData)
Data object which contains the raw data (see ?InMemoryData).

blearner_type

(character(1))
Type of the base learner (if not specified, blearner_type = paste0("poly", d) is used). The unique id of the base learner is defined by appending blearner_type to the feature name: paste0(data_source$getIdentifier(), "_", blearner_type).

degree

(integer(1))
Polynomial degree.

intercept

(logical(1))
Indicates whether the polynomial basis includes an intercept (constant) term.

bin_root

(integer(1))
The binning root to reduce the data to n^(1/bin_root) data points (default bin_root = 1, which means no binning is applied). A value of bin_root = 2 is suggested for the best approximation error (cf. Wood et al. (2017), "Generalized additive models for gigadata: modeling the UK black smoke network daily data").
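Two of the conventions described in the arguments above, the default base learner id and the number of data points after binning, can be sketched in base R. Both helpers below are hypothetical and not part of compboost; they only restate the formulas given in the argument descriptions. How non-integer values of n^(1/bin_root) are rounded is not stated here, so the sketch leaves the value unrounded.

```r
# Hypothetical helpers (not part of compboost), following the argument
# descriptions above.

# Default id: the feature name, an underscore, then blearner_type, which
# falls back to paste0("poly", degree) when it is not specified.
default_blearner_id = function(feature_name, degree,
  blearner_type = paste0("poly", degree)) {
  paste0(feature_name, "_", blearner_type)
}

# Number of data points kept after binning: n^(1 / bin_root).
binned_size = function(n, bin_root = 1) {
  n^(1 / bin_root)
}

default_blearner_id("x", degree = 3)                            # "x_poly3"
default_blearner_id("x", degree = 3, blearner_type = "cubic")   # "x_cubic"

binned_size(100, bin_root = 1)   # 100, no binning
binned_size(100, bin_root = 2)   # 10
```

With 100 observations and bin_root = 2, this gives the 10 data points mentioned in the Examples section.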

Format

S4 object.

Usage

BaselearnerPolynomial$new(data_source, list(degree, intercept, bin_root))
BaselearnerPolynomial$new(data_source, blearner_type, list(degree, intercept, bin_root))

Fields

This class doesn't contain public fields.

Methods

  • $summarizeFactory(): () -> ()

  • $transformData(newdata): list(InMemoryData) -> matrix()

  • $getMeta(): () -> list()

Inherited methods from Baselearner

  • $getData(): () -> matrix()

  • $getDF(): () -> integer()

  • $getPenalty(): () -> numeric()

  • $getPenaltyMat(): () -> matrix()

  • $getFeatureName(): () -> character()

  • $getModelName(): () -> character()

  • $getBaselearnerId(): () -> character()

Examples

# Sample data:
x = runif(100)
y = 1 + 2*x + rnorm(100, 0, 0.2)
dat = data.frame(x, y)

# S4 wrapper

# Create new data object, a matrix is required as input:
data_mat = cbind(x)
data_source = InMemoryData$new(data_mat, "my_data_name")

# Create new linear and cubic base learner factories:
bl_lin = BaselearnerPolynomial$new(data_source,
  list(degree = 1))
bl_cub = BaselearnerPolynomial$new(data_source,
  list(intercept = FALSE, degree = 3, bin_root = 2))

# Get the transformed data:
head(bl_lin$getData())
head(bl_cub$getData())

# Summarize factory:
bl_lin$summarizeFactory()

# Transform "new data":
newdata = list(InMemoryData$new(cbind(rnorm(5)), "my_data_name"))
bl_lin$transformData(newdata)
bl_cub$transformData(newdata)

# R6 wrapper

cboost_lin = Compboost$new(dat, "y")
cboost_lin$addBaselearner("x", "lin", BaselearnerPolynomial, degree = 1)
cboost_lin$train(100, 0)

cboost_cub = Compboost$new(dat, "y")
cboost_cub$addBaselearner("x", "cubic", BaselearnerPolynomial,
  intercept = FALSE, degree = 3, bin_root = 2)
cboost_cub$train(100, 0)

# Access base learner directly from the API (n = sqrt(100) = 10 with binning):
head(cboost_lin$baselearner_list$x_lin$factory$getData())
cboost_cub$baselearner_list$x_cubic$factory$getData()

gg_lin = plotPEUni(cboost_lin, "x")
gg_cub = plotPEUni(cboost_cub, "x")

library(ggplot2)
library(patchwork)

(gg_lin | gg_cub) &
  geom_point(data = dat, aes(x = x, y = y - c(cboost_lin$offset)), alpha = 0.2)

schalkdaniel/compboost documentation built on April 15, 2023, 9:03 p.m.