tune_svm | R Documentation |
Fit and validate Support Vector Machine models with exploration of hyper-parameters that optimize performance
tune_svm(
data,
response,
predictors,
predictors_f = NULL,
fit_formula = NULL,
partition,
grid = NULL,
thr = NULL,
metric = "TSS",
n_cores = 1
)
data |
data.frame. Database with response (0,1) and predictors values. |
response |
character. Column name with species absence-presence data (0,1). |
predictors |
character. Vector with the column names of quantitative predictor variables (i.e. continuous variables). Usage predictors = c("aet", "cwd", "tmin") |
predictors_f |
character. Vector with the column names of qualitative predictor variables (i.e. ordinal or nominal variables type). Usage predictors_f = c("landform") |
fit_formula |
formula. A formula object with response and predictor variables (e.g. formula(pr_ab ~ aet + ppt_jja + pH + awc + depth + landform)). Note that the variable names used here must be consistent with those used in response, predictors, and predictors_f arguments. Default NULL |
partition |
character. Column name with training and validation partition groups. |
grid |
data.frame. Provide a data frame object with algorithm hyper-parameters values to be tested. It Is recommended to generate this data.frame with grid() function. Hyper-parameters needed for tuning are 'size' and 'decay'. |
thr |
character. Threshold used to get binary suitability values (i.e. 0,1). It is useful for threshold-dependent performance metrics. It is possible to use more than one threshold type. It is necessary to provide a vector for this argument. The next threshold area available:
In the case of use more than one threshold type it is necessary concatenate threshold types, e.g., thr=c('lpt', 'max_sens_spec', 'max_jaccard'), or thr=c('lpt', 'max_sens_spec', 'sensitivity', sens='0.8'), or thr=c('lpt', 'max_sens_spec', 'sensitivity'). Function will use all thresholds if no threshold is specified |
metric |
character. Performance metric used for selecting the best combination of hyper-parameter values. One of the following metrics can be used: SORENSEN, JACCARD, FPB, TSS, KAPPA, AUC, and BOYCE. TSS is used as default. |
n_cores |
numeric. Number of cores use for parallelization. Default 1 |
A list object with:
model: A "ksvm" class object from kernlab package. This object can be used for predicting.
predictors: A tibble with quantitative (c column names) and qualitative (f column names) variables use for modeling.
performance: Hyper-parameters values and performance metric (see sdm_eval
)
for the best hyper-parameters combination.
hyper_performance: Performance metrics (see sdm_eval
) for each combination
of the hyper-parameters.
data_ens: Predicted suitability for each test partition based on the best model. This database is used in fit_ensemble
tune_gbm
, tune_max
, tune_net
,
and tune_raf
.
## Not run:
data(abies)
abies
# Partition the data with the k-fold method
abies2 <- part_random(
data = abies,
pr_ab = "pr_ab",
method = c(method = "kfold", folds = 5)
)
# pr_ab column is species presence and absences (i.e. the response variable)
# from aet to landform are the predictors variables (landform is a qualitative variable)
# Hyper-parameter values for tuning
tune_grid <-
expand.grid(
C = c(2, 4, 8, 16, 20),
sigma = c(0.01, 0.1, 0.2, 0.3, 0.4)
)
svm_t <-
tune_svm(
data = abies2,
response = "pr_ab",
predictors = c(
"aet", "cwd", "tmin", "ppt_djf",
"ppt_jja", "pH", "awc", "depth"
),
predictors_f = c("landform"),
partition = ".part",
grid = tune_grid,
thr = "max_sens_spec",
metric = "TSS",
n_cores = 1
)
# Outputs
svm_t$model
svm_t$predictors
svm_t$performance
svm_t$hyper_performance
svm_t$data_ens
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.