Description

Optimize spLearner by fine-tuning parameters and running feature selection.
Usage

## S4 method for signature 'spLearner'
tune.spLearner(
object,
num.trees = 85,
blocking,
discrete_ps,
rdesc = mlr::makeResampleDesc("CV", iters = 2L),
inner = mlr::makeResampleDesc("Holdout"),
maxit = 20,
xg.model_Params,
xg.skip = FALSE,
parallel = "multicore",
hzn_depth = FALSE,
...
)
Arguments
object           spLearner object (unoptimized)

num.trees        number of random forest trees

blocking         blocking columns

discrete_ps      parameter set for fine-tuning the random forest (see the
                 sketch after this list)

rdesc            resampling method for fine-tuning

inner            resampling method for feature selection

maxit            maximum number of iterations for feature selection

xg.model_Params  xgboost parameter set (see the sketch after this list)

xg.skip          logical; should tuning of the XGBoost learner be skipped?

parallel         initiate parallel processing

hzn_depth        logical; specify whether horizon depth is available in the
                 training data frame

...              other arguments that can be passed on
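Neither discrete_ps nor xg.model_Params is illustrated in the examples
below; a minimal sketch of how such search spaces could be built with
ParamHelpers follows. The parameter names refer to regr.ranger and
regr.xgboost hyperparameters, but the value grids are illustrative
assumptions, not package defaults:

library(ParamHelpers)
## hypothetical search space for the random forest base learner;
## the mtry grid is an illustrative assumption:
discrete_ps <- makeParamSet(
  makeDiscreteParam("mtry", values = seq(1, 4, by = 1))
)
## hypothetical search space for the xgboost base learner;
## the value grids are illustrative assumptions:
xg.model_Params <- makeParamSet(
  makeDiscreteParam("nrounds", values = c(20, 50, 100)),
  makeDiscreteParam("max_depth", values = c(2, 4, 6)),
  makeDiscreteParam("eta", values = c(0.3, 0.4, 0.5))
)

These objects could then be passed on via e.g.
tune.spLearner(m, discrete_ps = discrete_ps, xg.model_Params = xg.model_Params).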
Value

An optimized object of type spLearner.
Note

Currently requires that two of the base learners are regr.ranger and
regr.xgboost, and that there are at least 3 base learners in total.
Fine-tuning and feature selection can be computationally intensive, so it is
highly recommended to start with smaller subsets of data and measure the
processing time. The function mlr::makeFeatSelWrapper can result in errors
if the covariates have low variance or follow a zero-inflated distribution.
Reducing the number of features via feature selection, together with
fine-tuning of the random forest mtry and XGBoost parameters, can however
result in significantly higher prediction speed and accuracy.
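As a precaution against the low-variance issue mentioned above, covariates
can be screened before model training. A minimal base-R sketch, assuming the
covariates sit in a SpatialPixelsDataFrame as in the examples below (the
1e-6 threshold is an arbitrary illustrative choice):

## drop covariates with (near-)zero variance before model training:
covs <- meuse.grid[, c("dist", "ffreq", "soil")]
v <- sapply(covs@data, function(x) var(as.numeric(x), na.rm = TRUE))
covs <- covs[, names(v)[v > 1e-6]]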
Examples

library(mlr)
library(ParamHelpers)
library(geoR)
library(xgboost)
library(kernlab)
library(ranger)
library(glmnet)
library(raster)
demo(meuse, echo=FALSE)
## Regression:
sl = c("regr.ranger", "regr.xgboost", "regr.ksvm", "regr.cvglmnet")
m <- train.spLearner(meuse["lead"], covariates=meuse.grid[,c("dist","ffreq")],
lambda=0, parallel=FALSE, SL.library=sl)
summary(m@spModel$learner.model$super.model$learner.model)
## Optimize model:
t <- try( m0 <- tune.spLearner(m, xg.skip = TRUE, parallel=FALSE), silent=TRUE)
if(!inherits(t, "try-error")) summary(m0@spModel$learner.model$super.model$learner.model)