BaselearnerPolynomial: Polynomial base learner
In schalkdaniel/compboost: Efficient Component-Wise Boosting Implementation

Description

`BaselearnerPolynomial` creates a polynomial base learner object. The base learner takes a single feature x and computes the polynomial terms 1 + x + x^2 + \dots + x^d for a given degree d, where the constant term 1 is the intercept.

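As a rough illustration of the basis described above, the following base R sketch builds the columns 1, x, ..., x^d for a small vector. The helper `poly_design()` is hypothetical and not part of compboost; the column layout produced by the package itself may differ, so compare against `$getData()` before relying on it.

```r
# Hypothetical helper (not part of compboost): one column per power of x.
poly_design = function(x, degree, intercept = TRUE) {
  # Power 0 gives the intercept column of ones; drop it if intercept = FALSE.
  powers = if (intercept) 0:degree else seq_len(degree)
  mat = outer(x, powers, "^")
  colnames(mat) = paste0("x^", powers)
  mat
}

x = c(0.5, 1, 2)
poly_design(x, degree = 3)                     # columns x^0, x^1, x^2, x^3
poly_design(x, degree = 3, intercept = FALSE)  # columns x^1, x^2, x^3
```
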
Arguments

`data_source`	(`InMemoryData`) Data object which contains the raw data (see `?InMemoryData`).
`blearner_type`	(`character(1)`) Type of the base learner (if not specified, `blearner_type = paste0("poly", d)` is used). The unique id of the base learner is defined by appending `blearner_type` to the feature name: `paste0(data_source$getIdentifier(), "_", blearner_type)`.
`degree`	(`integer(1)`) Polynomial degree.
`intercept`	(`logical(1)`) Whether an intercept column is included in the polynomial basis (set `intercept = FALSE` to omit it).
`bin_root`	(`integer(1)`) The binning root, which reduces the data to `n^(1/bin_root)` data points (default `bin_root = 1`, which means no binning is applied). A value of `bin_root = 2` is suggested for the best approximation error (cf. Wood et al. (2017), Generalized additive models for gigadata: modeling the UK black smoke network daily data).
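
To make the effect of `bin_root` concrete, here is a minimal base R sketch of binning 100 observations down to 100^(1/2) = 10 representative points. It assumes an equally spaced grid over the range of x, nearest-grid-point assignment, and rounding of the grid size; these are assumptions for illustration, and the package's internal binning may differ in detail.

```r
set.seed(1)
x = runif(100)
bin_root = 2

# Number of representative points: n^(1 / bin_root), rounded (assumed rule).
n_bins = round(length(x)^(1 / bin_root))

# Equally spaced grid over the observed range (assumption).
grid = seq(min(x), max(x), length.out = n_bins)

# Assign each observation to its nearest grid point via the bin midpoints.
mids = (head(grid, -1) + tail(grid, -1)) / 2
idx = findInterval(x, mids) + 1
x_binned = grid[idx]

# Each grid point carries a weight: the number of observations mapped to it.
weights = tabulate(idx, nbins = n_bins)

n_bins
max(abs(x - x_binned))  # at most half the grid spacing
```

The design matrix then only needs to be built on `grid`, with `weights` accounting for how many original observations each grid point represents.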

Format

S4 object.

Usage

BaselearnerPolynomial$new(data_source, list(degree, intercept, bin_root))
BaselearnerPolynomial$new(data_source, blearner_type, list(degree, intercept, bin_root))
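
The two call forms differ only in whether `blearner_type` is supplied. The sketch below restates the id rule from the Arguments section in plain R; `make_bl_id()` is a hypothetical helper written for illustration and is not exported by compboost.

```r
# Hypothetical helper mirroring the id rule given in the Arguments section.
make_bl_id = function(feature, degree, blearner_type = NULL) {
  # Without an explicit type, the default "poly<degree>" is used.
  if (is.null(blearner_type)) blearner_type = paste0("poly", degree)
  paste0(feature, "_", blearner_type)
}

make_bl_id("x", degree = 3)                           # "x_poly3"
make_bl_id("x", degree = 3, blearner_type = "cubic")  # "x_cubic"
```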

Fields

This class doesn't contain public fields.

Methods

`$summarizeFactory()`: `() -> ()`
`$transformData(newdata)`: `list(InMemoryData) -> matrix()`
`$getMeta()`: `() -> list()`

Inherited methods from Baselearner

`$getData()`: `() -> matrix()`
`$getDF()`: `() -> integer()`
`$getPenalty()`: `() -> numeric()`
`$getPenaltyMat()`: `() -> matrix()`
`$getFeatureName()`: `() -> character()`
`$getModelName()`: `() -> character()`
`$getBaselearnerId()`: `() -> character()`

Examples

# Sample data:
x = runif(100)
y = 1 + 2*x + rnorm(100, 0, 0.2)
dat = data.frame(x, y)

# S4 wrapper

# Create new data object, a matrix is required as input:
data_mat = cbind(x)
data_source = InMemoryData$new(data_mat, "my_data_name")

# Create a linear and a cubic base learner factory:
bl_lin = BaselearnerPolynomial$new(data_source,
  list(degree = 1))
bl_cub = BaselearnerPolynomial$new(data_source,
  list(intercept = FALSE, degree = 3, bin_root = 2))

# Get the transformed data:
head(bl_lin$getData())
head(bl_cub$getData())

# Summarize factory:
bl_lin$summarizeFactory()

# Transform "new data":
newdata = list(InMemoryData$new(cbind(rnorm(5)), "my_data_name"))
bl_lin$transformData(newdata)
bl_cub$transformData(newdata)

# R6 wrapper

cboost_lin = Compboost$new(dat, "y")
cboost_lin$addBaselearner("x", "lin", BaselearnerPolynomial, degree = 1)
cboost_lin$train(100, 0)

cboost_cub = Compboost$new(dat, "y")
cboost_cub$addBaselearner("x", "cubic", BaselearnerPolynomial,
  intercept = FALSE, degree = 3, bin_root = 2)
cboost_cub$train(100, 0)

# Access base learner directly from the API (n = sqrt(100) = 10 with binning):
head(cboost_lin$baselearner_list$x_lin$factory$getData())
cboost_cub$baselearner_list$x_cubic$factory$getData()

gg_lin = plotPEUni(cboost_lin, "x")
gg_cub = plotPEUni(cboost_cub, "x")

library(ggplot2)
library(patchwork)

(gg_lin | gg_cub) &
  geom_point(data = dat, aes(x = x, y = y - c(cboost_lin$offset)), alpha = 0.2)

schalkdaniel/compboost documentation built on April 15, 2023, 9:03 p.m.