boostSplines: Wrapper to boost P-spline models for each feature.

View source: R/boost_splines.R

Description

This wrapper function automatically initializes the model by adding every numerical feature of a dataset as a spline base-learner. Categorical features are dummy encoded and added as linear base-learners without intercept. After initializing the model, boostSplines also fits the number of iterations given by the user through iterations.
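The feature handling described above can be illustrated with base R alone. This is a minimal sketch, not compboost's internal code: it assumes the split is made with is.numeric and that the dummy encoding resembles what model.matrix produces without an intercept. The variable names (num.features, cat.features, dummies) are made up for this illustration.

```r
# Split the features of iris into numerical and categorical ones.
target = "Sepal.Length"
features = setdiff(names(iris), target)
is.num = vapply(iris[features], is.numeric, logical(1))

num.features = features[is.num]   # candidates for spline base-learners
cat.features = features[!is.num]  # candidates for dummy encoding

# Dummy-encode the categorical feature without an intercept column,
# so every factor level gets its own 0/1 indicator column.
dummies = model.matrix(~ Species - 1, data = iris)

num.features
cat.features
colnames(dummies)
head(dummies, 3)
```

Dropping the intercept keeps one indicator column per level, so each level gets its own coefficient instead of being measured against a reference level.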

Usage

boostSplines(data, target, optimizer = OptimizerCoordinateDescent$new(),
  loss, learning.rate = 0.05, iterations = 100, trace = -1,
  degree = 3, n.knots = 20, penalty = 2, differences = 2,
  data.source = InMemoryData, data.target = InMemoryData)

Arguments

data

[data.frame]
A data frame containing the data on which the model should be built.

target

[character(1)]
Character indicating the target variable. Note that the loss must match the data type of the target.

optimizer

[S4 Optimizer]
Optimizer to select features. This should be an initialized S4 Optimizer object exposed by Rcpp (for instance OptimizerCoordinateDescent$new()).

loss

[S4 Loss]
Loss used to calculate the risk and pseudo residuals. This object must be an initialized S4 Loss object exposed by Rcpp (for instance LossQuadratic$new()).

learning.rate

[numeric(1)]
Learning rate used to shrink the parameter update in each step.

iterations

[integer(1)]
Number of iterations that are trained.

trace

[integer(1)]
Integer indicating how often a trace should be printed. With trace = 10, every 10th iteration is printed. To print no trace, set trace = 0. The default of -1 sets trace to a value such that 40 iterations are printed.

degree

[integer(1)]
Polynomial degree of the splines used for modeling. Note that the number of parameters increases with the degree.

n.knots

[integer(1)]
Number of equidistant "inner knots". The actual number of knots used also depends on the polynomial degree.

penalty

[numeric(1)]
Penalty term for P-splines. If penalty equals 0, ordinary B-splines are fitted. The higher the penalty, the smoother the fitted curve.

differences

[integer(1)]
Number of differences used for penalization. The higher this value, the more the function values at neighboring knots are forced to be similar, which results in a smoother curve.

data.source

[S4 Data]
Uninitialized S4 Data object which is used to store the data. At the moment, only in-memory training is supported.

data.target

[S4 Data]
Uninitialized S4 Data object which is used to store the data. At the moment, only in-memory training is supported.
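The interplay of degree, n.knots, penalty, and differences can be made concrete with a small P-spline fit in base R and the splines package. This is a sketch of the standard textbook construction, not compboost's implementation: the knot placement, the scaling of the penalty, and the helper names fitCoef and roughness are assumptions made for illustration.

```r
library(splines)

# Simulated data on (0, 1).
set.seed(1)
n = 200
x = seq(0.005, 0.995, length.out = n)
y = sin(2 * pi * x) + rnorm(n, sd = 0.3)

degree = 3
n.knots = 20
differences = 2

# Equidistant knots on [0, 1]: n.knots inner knots, two boundary knots,
# and `degree` additional knots on each side.
h = 1 / (n.knots + 1)
knots = h * seq(-degree, n.knots + 1 + degree)

# B-spline design matrix (order = degree + 1).
B = splineDesign(knots, x, ord = degree + 1)
ncol(B)  # n.knots + degree + 1 basis functions

# Difference matrix acting on neighboring coefficients and the
# resulting penalty matrix.
D = diff(diag(ncol(B)), differences = differences)
K = crossprod(D)

# Penalized least squares: (B'B + penalty * K) coef = B'y.
fitCoef = function(penalty) {
  drop(solve(crossprod(B) + penalty * K, crossprod(B, y)))
}

# Sum of squared differences of the coefficients as a roughness measure.
roughness = function(coef) sum(diff(coef, differences = differences)^2)

coef.bspline = fitCoef(0)  # penalty = 0: ordinary B-spline fit
coef.pspline = fitCoef(2)  # penalty = 2: P-spline fit

c(bspline = roughness(coef.bspline), pspline = roughness(coef.pspline))
```

In this construction the number of coefficients grows with both n.knots and degree, and a larger penalty shrinks the differences between neighboring coefficients, which is what produces the smoother curve.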

Details

The returned object is an object of the Compboost class which then can be used for further analyses (see ?Compboost for details).
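To make the roles of loss, optimizer, learning.rate, and iterations more tangible, the following base R sketch runs a bare-bones component-wise boosting loop. It is an illustration under simplifying assumptions, not what compboost does internally: it uses quadratic loss, linear base-learners on standardized features instead of splines, and selects in each iteration the base-learner with the smallest sum of squared errors on the pseudo residuals.

```r
# Standardized numerical features and the target.
X = scale(as.matrix(iris[, c("Sepal.Width", "Petal.Length", "Petal.Width")]))
y = iris$Sepal.Length

learning.rate = 0.05
iterations = 100

# Start from a constant model: the mean minimizes the quadratic loss.
pred = rep(mean(y), length(y))
coefs = setNames(numeric(ncol(X)), colnames(X))
selected = character(iterations)
risk = numeric(iterations)

for (m in seq_len(iterations)) {
  # Pseudo residuals: negative gradient of 0.5 * (y - pred)^2 w.r.t. pred.
  res = y - pred

  # Fit every base-learner (one feature, no intercept) to the residuals.
  beta = drop(crossprod(X, res)) / colSums(X^2)
  sse = vapply(seq_len(ncol(X)), function(j) {
    sum((res - X[, j] * beta[j])^2)
  }, numeric(1))

  # Select the best-fitting base-learner in this iteration.
  best = which.min(sse)

  # Add the selected fit, shrunk by the learning rate.
  coefs[best] = coefs[best] + learning.rate * beta[best]
  pred = pred + learning.rate * X[, best] * beta[best]

  selected[m] = colnames(X)[best]
  risk[m] = mean(0.5 * (y - pred)^2)
}

table(selected)  # how often each feature was selected
coefs            # accumulated, shrunken coefficients
range(risk)      # empirical risk over the iterations
```

A small learning rate means each iteration only moves a fraction of the way toward the least-squares fit of the selected base-learner, so more iterations are needed but the feature selection is spread out more evenly.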

Value

Usually a model of class Compboost. This model is an R6 object which can be used for retraining, predicting, plotting, and anything described in ?Compboost.

Examples

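library(compboost)
# Boost one spline base-learner per numerical feature using quadratic loss
mod = boostSplines(data = iris, target = "Sepal.Length", loss = LossQuadratic$new())
# Inspect base-learner names, estimated coefficients, and selection frequencies
mod$getBaselearnerNames()
mod$getEstimatedCoef()
table(mod$getSelectedBaselearner())
# Predict and plot the estimated effect of Sepal.Width
mod$predict()
mod$plot("Sepal.Width_spline")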

compboost documentation built on May 2, 2019, 6:40 a.m.