benchmark_sdm: Benchmark regular models

View source: R/benchmark_sdm.r

Description

A function to benchmark a collection of regular machine learning models.

Usage

benchmark_sdm(benchmarking_data, learners, dataset_type = "default",
  sample = FALSE)

Arguments

benchmarking_data

A data frame taken from the output of the get_benchmarking_data function (the Examples pass its df_data element). It contains species occurrence coordinates together with the associated environmental data.

learners

A list of mlr learner objects that specify which models to use (e.g. Random Forests). The following learners are supported: "classif.logreg", "classif.gbm", "classif.multinom", "classif.naiveBayes", "classif.xgboost", "classif.ksvm".
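
As a minimal sketch, the learner list can be built from learner names with lapply. The helper make_prob_learners is hypothetical and not part of sdmbench; the vector of names is copied from the description above, and since the Examples also use "classif.randomForest", the helper only warns about names outside that list. mlr::makeLearner and its predict.type argument are the calls used in the Examples. Each learner also needs its backing package installed.

```r
# Learner names copied from the argument description above.
supported <- c("classif.logreg", "classif.gbm", "classif.multinom",
               "classif.naiveBayes", "classif.xgboost", "classif.ksvm")

# Hypothetical helper (not part of sdmbench): warn about names outside the
# documented list, then create one probability-predicting learner per name.
make_prob_learners <- function(ids) {
  unknown <- setdiff(ids, supported)
  if (length(unknown) > 0) {
    warning("Not in the documented list: ", paste(unknown, collapse = ", "))
  }
  lapply(ids, function(id) mlr::makeLearner(id, predict.type = "prob"))
}

# Only call mlr if it is installed; "classif.logreg" relies on base R's stats.
if (requireNamespace("mlr", quietly = TRUE)) {
  learners <- make_prob_learners("classif.logreg")
}
```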

dataset_type

A character string indicating the spatial partitioning method applied to the data, if any. Partitioning is used to avoid spatial autocorrelation issues. Defaults to "default".
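
To illustrate the idea behind spatial partitioning, the sketch below assigns points to two folds in a checkerboard pattern, so that training and test points come from different grid cells rather than being mixed at random. This is a generic illustration in base R, not necessarily the method sdmbench uses; the function checkerboard_fold, the column names and the cell size are all assumptions.

```r
# Hypothetical occurrence coordinates; column names are assumptions.
set.seed(1)
pts <- data.frame(
  longitude = runif(100, 0, 10),
  latitude  = runif(100, 40, 50)
)

# Map each point to a cell of a coarse grid and alternate the cells
# between fold 1 and fold 2 in a checkerboard pattern.
checkerboard_fold <- function(lon, lat, cell_size = 2) {
  col <- floor(lon / cell_size)
  row <- floor(lat / cell_size)
  (col + row) %% 2 + 1
}

pts$fold <- checkerboard_fold(pts$longitude, pts$latitude)

# Number of points per fold
table(pts$fold)
```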

sample

Logical. Indicates whether benchmarking should be done on an undersampled dataset. This is useful for testing model efficiency with an imbalanced dataset, e.g. few occurrence observations and many background (pseudo-absence) points. Defaults to FALSE.
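
A rough sketch of what undersampling means here: the majority class (typically background points) is randomly reduced to the size of the minority class. The function undersample, the toy data and the column name label are assumptions made for illustration; the procedure benchmark_sdm applies internally may differ.

```r
# Hypothetical toy data: 20 presences (label = 1) and 200 background
# points (label = 0). The column name `label` is an assumption.
set.seed(42)
toy <- data.frame(
  label = c(rep(1, 20), rep(0, 200)),
  bio1  = rnorm(220)
)

# Randomly undersample every class down to the size of the smallest class.
undersample <- function(df, target = "label") {
  counts <- table(df[[target]])
  n_min <- min(counts)
  # For each class, keep n_min randomly chosen row indices
  keep <- unlist(lapply(names(counts), function(cl) {
    idx <- which(df[[target]] == cl)
    idx[sample.int(length(idx), n_min)]
  }))
  df[sort(keep), , drop = FALSE]
}

balanced <- undersample(toy)

# Class counts after undersampling
table(balanced$label)
```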

Value

A benchmarking object (class bmr). Other functions can take this object as input to retrieve the benchmark results.

Examples

## Not run: 
# download benchmarking data
benchmarking_data <- get_benchmarking_data("Lynx lynx",
                                           limit = 1500)

# create a list of algorithms to compare
# here it is important to specify predict.type as "prob"
learners <- list(mlr::makeLearner("classif.randomForest",
                                  predict.type = "prob"),
                 mlr::makeLearner("classif.logreg",
                                  predict.type = "prob"))

# run the model benchmarking process
# if you have previously used a partitioning method you should specify it here
bmr <- benchmark_sdm(benchmarking_data$df_data,
                     learners = learners,
                     dataset_type = "default")

# for benchmarking an imbalanced dataset you can undersample
bmr <- benchmark_sdm(benchmarking_data$df_data,
                     learners = learners,
                     dataset_type = "default",
                     sample = TRUE)

# inspect the benchmark results
bmr

## End(Not run)

boyanangelov/sdmbench documentation built on Dec. 14, 2020, 1:08 a.m.