reduceVar: Reduce Variables

Description

Remove variables whose importance is lower than the given threshold. The function removes one variable at a time and then trains a new model to obtain the updated variable importance ranking. If use_jk is TRUE, the function also checks whether removing the variable decreases the model performance (according to the given metric and relative to the starting model). If it does, the function does not remove the variable, even if its importance is lower than the given threshold.
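The removal loop described above can be sketched in plain R without any package. This is a simplified illustration under stated assumptions, not the SDMtune implementation: reduce_sketch, importance_fun, score_fun and the toy weights are hypothetical names introduced here, retraining is replaced by recomputing toy importances, and the metric is assumed to be one where higher is better (as with AUC or TSS; for AICc the comparison would be reversed).

```r
# Simplified sketch of threshold-based backward elimination.
# `importance_fun` and `score_fun` are hypothetical stand-ins for
# "retrain and rank the variables" and "evaluate the model".
reduce_sketch <- function(vars, importance_fun, th,
                          use_jk = FALSE, score_fun = NULL) {
  # Score of the starting model, the reference when use_jk = TRUE
  start_score <- if (use_jk) score_fun(vars) else NA_real_

  repeat {
    if (length(vars) <= 1) break
    imp <- importance_fun(vars)           # named numeric vector, in percent
    worst <- names(imp)[which.min(imp)]   # least important variable
    if (imp[[worst]] >= th) break         # nothing left below the threshold

    remaining <- setdiff(vars, worst)
    # With use_jk, stop if dropping the variable lowers the score
    # (assumes a higher-is-better metric)
    if (use_jk && score_fun(remaining) < start_score) break

    vars <- remaining                     # drop one variable, then re-rank
  }
  vars
}

# Toy importances: fixed weights renormalised to percent after each removal
weights <- c(bio1 = 50, bio5 = 30, bio12 = 15, bio16 = 4, biome = 1)
toy_importance <- function(vars) 100 * weights[vars] / sum(weights[vars])

# Toy score: performance drops as soon as "bio16" is removed
toy_score <- function(vars) if ("bio16" %in% vars) 0.90 else 0.85

kept <- reduce_sketch(names(weights), toy_importance, th = 5)
kept_jk <- reduce_sketch(names(weights), toy_importance, th = 5,
                         use_jk = TRUE, score_fun = toy_score)
print(kept)
print(kept_jk)
```

In this toy run the importance-only call drops the two variables that fall below 5%, while the use_jk call keeps "bio16" because removing it would lower the toy score.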

Usage

reduceVar(
  model,
  th,
  metric,
  test = NULL,
  env = NULL,
  use_jk = FALSE,
  permut = 10,
  use_pc = FALSE
)

Arguments

model

SDMmodel or SDMmodelCV object.

th

numeric. The contribution threshold used to remove variables.

metric

character. The metric used to evaluate the models. Possible values are "auc", "tss" and "aicc". Used only if use_jk is TRUE.

test

SWD object containing the test dataset used to evaluate the model. Not used with "aicc" or if use_jk = FALSE. Default is NULL.

env

stack containing the environmental variables. Used only with "aicc". Default is NULL.

use_jk

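logical. Flag to use the Jackknife test during variable selection, evaluated with the given metric. If FALSE, the function relies only on the variable importance (permutation importance, or percent contribution if use_pc is TRUE). Default is FALSE.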

permut

integer. Number of permutations used to compute the permutation importance. Used only if use_pc = FALSE. Default is 10.

use_pc

logical. Use percent contribution. If TRUE and the model is trained using the Maxent method, the algorithm uses the percent contribution computed by the Maxent software to score variable importance. Default is FALSE.

Value

The model trained using the selected variables.

Author(s)

Sergio Vignali

Examples

# Acquire environmental variables
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd", full.names = TRUE)
predictors <- raster::stack(files)

# Prepare presence and background locations
p_coords <- virtualSp$presence
bg_coords <- virtualSp$background

# Create SWD object
data <- prepareSWD(species = "Virtual species", p = p_coords, a = bg_coords,
                   env = predictors, categorical = "biome")

# Split presence locations into training (80%) and testing (20%) datasets
datasets <- trainValTest(data, test = 0.2, only_presence = TRUE)
train <- datasets[[1]]
test <- datasets[[2]]

# Train a Maxnet model
model <- train(method = "Maxnet", data = train, fc = "lq")

# Remove all variables with permutation importance lower than 2%
output <- reduceVar(model, th = 2, metric = "auc", test = test, permut = 1)

# Remove variables with permutation importance lower than 2% only if testing
# TSS doesn't decrease
## Not run: 
output <- reduceVar(model, th = 2, metric = "tss", test = test, permut = 1,
                    use_jk = TRUE)

# Remove variables with permutation importance lower than 2% only if AICc
# doesn't increase
output <- reduceVar(model, th = 2, metric = "aicc", permut = 1,
                    use_jk = TRUE, env = predictors)

# Train a Maxent model
# The next line checks if Maxent is correctly configured but you don't need
# to run it in your script
if (dismo::maxent(silent = TRUE)) {
  model <- train(method = "Maxent", data = train, fc = "lq")

  # Remove all variables with percent contribution lower than 2%
  output <- reduceVar(model, th = 2, metric = "auc", test = test,
                      use_pc = TRUE)
}

## End(Not run)

SDMtune documentation built on July 17, 2021, 9:06 a.m.