Compboost_internal: Main Compboost Class

Description

This class collects all parts, such as the factory list or the used loggers, and passes them to C++, where the main algorithm runs.

Format

S4 object.

Usage

Compboost_internal$new(response, learning_rate, stop_if_all_stopper_fulfilled,
  factory_list, loss, logger_list, optimizer)

Arguments

response [numeric]

Vector of the true values which should be modeled.

learning_rate [numeric(1)]

The learning rate which is used to shrink the parameter in each iteration.

stop_if_all_stopper_fulfilled [logical(1)]

Boolean indicating which stopping strategy is used. If TRUE, the algorithm stops only if all registered logger stoppers are fulfilled; otherwise it stops as soon as the first stopper is fulfilled.

factory_list [BlearnerFactoryList object]

List of base-learner factories from which one base-learner is selected in each iteration by using the optimizer.

loss [Loss object]

The loss which should be used to calculate the pseudo residuals in each iteration.

logger_list [LoggerList object]

The list of all registered loggers which are used to track the algorithm.

optimizer [Optimizer object]

The optimizer used to select the best-fitting base-learner in each iteration.

Details

This class is a wrapper around the pure C++ implementation. To see the functionality of the C++ class visit https://schalkdaniel.github.io/compboost/cpp_man/html/classcboost_1_1_compboost.html.

Fields

This class doesn't contain public fields.

Methods

train(trace)

Initial training of the model. The integer argument trace indicates whether the logger progress should be printed and, if so, after how many iterations the progress is printed.

continueTraining(trace, logger_list)

Continue the training using an additional logger_list. The retraining stops as soon as the first logger indicates that the algorithm should stop.
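
A minimal sketch of a retraining call, assuming the trained cboost object from the Examples below; the logger name "iteration.retrain" and the number of additional iterations are illustrative only:

# Fresh logger list with an iteration stopper set to 1000:
retrain.logger = LoggerList$new()
retrain.logger$registerLogger("iteration.retrain", LoggerIteration$new(TRUE, 1000))

# Continue training; the trace argument behaves as in train():
cboost$continueTraining(50, retrain.logger)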

getPrediction()

Get the inbag prediction which is done during the fitting process.

getSelectedBaselearner()

Returns a character vector with the names of the base-learners in the order in which they were selected.

getLoggerData()

Returns a list of all logged data. If the algorithm is retrained, the list contains one element per training.

getEstimatedParameter()

Returns a list with the estimated parameters for every base-learner that was selected at least once.

getParameterAtIteration(k)

Calculates the parameter at iteration k.

getParameterMatrix()

Calculates a matrix in which row i contains the parameters at iteration i. The matrix has one row per iteration performed.
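
A minimal sketch of getParameterAtIteration() and getParameterMatrix(), assuming the trained cboost object from the Examples below; the iteration number 100 is illustrative only:

# Parameters of the model after 100 iterations:
cboost$getParameterAtIteration(100)

# Parameter values for every iteration, one row per iteration:
cboost$getParameterMatrix()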

isTrained()

Returns a logical value indicating whether the initial training has already been done.

predict(newdata)

Prediction on new data organized within a list of source data objects. It is important that the names of the source data objects match the ones that were used to define the factories.

predictAtIteration(newdata, k)

Prediction on new data using the model at iteration k.
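
A minimal sketch of both prediction methods, assuming the trained cboost object and the test.data list of data sources from the Examples below; the iteration number 200 is illustrative only:

# Prediction on new data organized as a list of data source objects:
cboost$predict(test.data)

# Prediction on the same data using the model at iteration 200:
cboost$predictAtIteration(test.data, 200)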

setToIteration(k)

Set the whole model to iteration k. After calling this function, all other elements such as the parameters or the prediction correspond to iteration k.

summarizeCompboost()

Summarize the Compboost object.

Examples

# Some data:
df = mtcars
df$mpg.cat = ifelse(df$mpg > 20, 1, -1)

# # Create new variable to check the polynomial base-learner with degree 2:
# df$hp2 = df[["hp"]]^2

# Data for the baselearner are matrices:
X.hp = as.matrix(df[["hp"]])
X.wt = as.matrix(df[["wt"]])

# Target variable:
y = df[["mpg.cat"]]

data.source.hp = InMemoryData$new(X.hp, "hp")
data.source.wt = InMemoryData$new(X.wt, "wt")

data.target.hp1 = InMemoryData$new()
data.target.hp2 = InMemoryData$new()
data.target.wt1 = InMemoryData$new()
data.target.wt2 = InMemoryData$new()

# List for oob logging:
oob.data = list(data.source.hp, data.source.wt)

# List to test prediction on newdata:
test.data = oob.data

# Factories:
linear.factory.hp = BaselearnerPolynomial$new(data.source.hp, data.target.hp1, 1, TRUE)
linear.factory.wt = BaselearnerPolynomial$new(data.source.wt, data.target.wt1, 1, TRUE)
quadratic.factory.hp = BaselearnerPolynomial$new(data.source.hp, data.target.hp2, 2, TRUE)
spline.factory.wt = BaselearnerPSpline$new(data.source.wt, data.target.wt2, 3, 10, 2, 2)

# Create new factory list:
factory.list = BlearnerFactoryList$new()

# Register factories:
factory.list$registerFactory(linear.factory.hp)
factory.list$registerFactory(linear.factory.wt)
factory.list$registerFactory(quadratic.factory.hp)
factory.list$registerFactory(spline.factory.wt)

# Define loss:
loss.bin = LossBinomial$new()

# Define optimizer:
optimizer = OptimizerCoordinateDescent$new()

## Logger

# Define logger. We want just the iterations as stopper but also track the
# time, inbag risk and oob risk:
log.iterations  = LoggerIteration$new(TRUE, 500)
log.time        = LoggerTime$new(FALSE, 500, "microseconds")
log.inbag       = LoggerInbagRisk$new(FALSE, loss.bin, 0.05)
log.oob         = LoggerOobRisk$new(FALSE, loss.bin, 0.05, oob.data, y)

# Define new logger list:
logger.list = LoggerList$new()

# Register the logger:
logger.list$registerLogger("iteration.logger", log.iterations)
logger.list$registerLogger("time.logger", log.time)
logger.list$registerLogger("inbag.binomial", log.inbag)
logger.list$registerLogger("oob.binomial", log.oob)

# Run compboost:
# --------------

# Initialize object:
cboost = Compboost_internal$new(
  response      = y,
  learning_rate = 0.05,
  stop_if_all_stopper_fulfilled = FALSE,
  factory_list = factory.list,
  loss         = loss.bin,
  logger_list  = logger.list,
  optimizer    = optimizer
)

# Train the model (we want to print the trace):
cboost$train(trace = 50)
cboost

# Get estimated parameter:
cboost$getEstimatedParameter()

# Get trace of selected base-learner:
cboost$getSelectedBaselearner()

# Set to iteration 200:
cboost$setToIteration(200)

# Get new parameter values:
cboost$getEstimatedParameter()
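
# Further inspection (an illustrative continuation of the example above,
# using only the methods documented on this page):

# Get all logged data (one element per training):
cboost$getLoggerData()

# Check whether the initial training was done and print a summary:
cboost$isTrained()
cboost$summarizeCompboost()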
