isotree_po | R Documentation |
Call isolation forest and its variations to do species distribution modeling, and optionally call a collection of other functions to explain the model.
isotree_po(
  obs_mode = "imperfect_presence",
  obs,
  obs_ind_eval = NULL,
  variables,
  categ_vars = NULL,
  contamination = 0.1,
  ntrees = 100L,
  sample_size = 1,
  ndim = 1L,
  seed = 10L,
  ...,
  offset = 0,
  response = TRUE,
  spatial_response = TRUE,
  check_variable = TRUE,
  visualize = FALSE
)
obs_mode |
( |
obs |
( |
obs_ind_eval |
( |
variables |
( |
categ_vars |
( |
contamination |
( |
ntrees |
( |
sample_size |
( |
ndim |
( |
seed |
( |
... |
Other arguments that |
offset |
( |
response |
( |
spatial_response |
( |
check_variable |
( |
visualize |
( |
For "perfect_presence", a user-defined number (contamination) of samples will be taken from the background so that iForest can function normally.
For "imperfect_presence", no further action is required.
For "presence_absence", a contamination percent of absences will be randomly selected and used together with all presences to train the model.
NOTE: obs_mode and mode only apply to obs. obs_ind_eval will follow its own structure.
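The three modes differ in how the training table is assembled before the forest is fitted. The sketch below illustrates that idea in base R on toy data; it is not the package's implementation. The helper build_training_set() is hypothetical, and the rule that the number of extra rows equals contamination times the number of presences is an assumption made only for illustration.

```r
# Hypothetical helper, not part of itsdm: assemble a training table per obs_mode.
# ASSUMPTION: the number of extra rows is ceiling(contamination * n_presence).
build_training_set <- function(presence, absence = NULL, background = NULL,
                               obs_mode = "imperfect_presence",
                               contamination = 0.1, seed = 10L) {
  set.seed(seed)
  n_extra <- ceiling(contamination * nrow(presence))
  switch(
    obs_mode,
    # Presences are trusted, so background rows are added to act as outliers.
    perfect_presence = {
      idx <- sample(nrow(background), min(n_extra, nrow(background)))
      rbind(presence, background[idx, , drop = FALSE])
    },
    # Presences already contain noise, so they are used unchanged.
    imperfect_presence = presence,
    # Keep all presences and add a random subset of the absences.
    presence_absence = {
      idx <- sample(nrow(absence), min(n_extra, nrow(absence)))
      rbind(presence, absence[idx, , drop = FALSE])
    },
    stop("Unknown obs_mode: ", obs_mode)
  )
}

# Toy environmental values at 50 presences, 40 absences and 200 background points.
set.seed(1)
presence   <- data.frame(bio1 = rnorm(50, 25), bio12 = rnorm(50, 900, 50))
absence    <- data.frame(bio1 = rnorm(40, 18), bio12 = rnorm(40, 400, 50))
background <- data.frame(bio1 = runif(200, 10, 30),
                         bio12 = runif(200, 200, 1200))

train_perfect   <- build_training_set(presence, background = background,
                                      obs_mode = "perfect_presence")
train_imperfect <- build_training_set(presence)
train_pa        <- build_training_set(presence, absence = absence,
                                      obs_mode = "presence_absence")

# Compare the sizes of the three training tables.
c(perfect = nrow(train_perfect),
  imperfect = nrow(train_imperfect),
  presence_absence = nrow(train_pa))
```

In this sketch only "imperfect_presence" leaves the observations untouched; the other two modes append rows that the forest is expected to treat as anomalies.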
Please read the details of the isolation.forest algorithm at https://github.com/david-cortes/isotree, and the R documentation of the function isolation.forest.
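For orientation, ntrees, ndim and seed in the usage above share their names with arguments of isotree::isolation.forest. A minimal sketch of calling that function directly on toy data follows. The data are invented, sample_size is given here as a row count, and how isotree_po turns the resulting scores into suitability is not shown.

```r
library(isotree)

set.seed(42)
# 200 typical points plus one obvious outlier in a two-variable space.
toy <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
toy <- rbind(toy, data.frame(x1 = 8, x2 = 8))

# ndim = 1 uses single-variable splits; larger values split on
# combinations of variables (the extended variants).
forest <- isolation.forest(toy, ntrees = 100L, sample_size = 64L,
                           ndim = 1L, seed = 10L, nthreads = 1)

# Standardized outlier scores: higher means easier to isolate.
scores <- predict(forest, toy)
tail(scores, 1)   # score of the appended outlier
median(scores)    # score of a typical point
```

Points that are isolated in few splits receive high scores, so the appended outlier should score well above the median of the typical points.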
(POIsotree) A list of
model (isolation.forest) The fitted isolation forest model
variables (stars) The formatted image stack of environmental variables
observation (sf) An sf of the training occurrence dataset
background_samples (sf) An sf of background points for training dataset evaluation or SHAP dependence plot
independent_test (sf or NULL) An sf of the test occurrence dataset
background_samples_test (sf or NULL) An sf of background points for test dataset evaluation or SHAP dependence plot
vars_train (data.frame) A data.frame with the values of each environmental variable for training occurrences
pred_train (data.frame) A data.frame with the predicted values for training occurrences
eval_train (POEvaluation) A list of presence-only evaluation metrics based on the training dataset. See details of POEvaluation in evaluate_po
var_test (data.frame or NULL) A data.frame with the values of each environmental variable for test occurrences
pred_test (data.frame or NULL) A data.frame with the predicted values for test occurrences
eval_test (POEvaluation or NULL) A list of presence-only evaluation metrics based on the test dataset. See details of POEvaluation in evaluate_po
prediction (stars) The predicted environmental suitability
marginal_responses (MarginalResponse or NULL) A list of marginal response values of each environmental variable. See details in marginal_response
offset (numeric) The offset value set in the function inputs.
independent_responses (IndependentResponse or NULL) A list of independent response values of each environmental variable. See details in independent_response
shap_dependences (ShapDependence or NULL) A list of variable dependence values of each environmental variable. See details in shap_dependence
spatial_responses (SpatialResponse or NULL) A list of spatial variable dependence values of each environmental variable. See details in spatial_response
variable_analysis (VariableAnalysis or NULL) A list of variable importance analysis based on multiple metrics. See details in variable_analysis
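Several elements of the returned list are documented as possibly NULL. A small helper can report which of them are populated before plotting or printing. The function summarize_optional() below is hypothetical and not part of itsdm, and the mock list only stands in for a real POIsotree result.

```r
# Names of the elements documented above as "or NULL".
optional_elements <- c(
  "independent_test", "background_samples_test", "var_test", "pred_test",
  "eval_test", "marginal_responses", "independent_responses",
  "shap_dependences", "spatial_responses", "variable_analysis"
)

# Hypothetical helper: flag which optional elements of a result are filled.
summarize_optional <- function(mod, elements = optional_elements) {
  # [[ ]] returns NULL for absent names, so absent and NULL are treated alike.
  present <- vapply(elements, function(nm) !is.null(mod[[nm]]), logical(1))
  data.frame(element = elements, present = unname(present))
}

# Mock object standing in for the output of isotree_po(); values are placeholders.
mock_mod <- list(
  offset = 0,
  eval_test = NULL,
  marginal_responses = list(bio1 = data.frame(x = 1:3, y = c(0.2, 0.5, 0.4)))
)

status <- summarize_optional(mock_mod)
status[status$present, ]
```

With a real result, the same call would be summarize_optional(mod), where mod is the object returned by isotree_po.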
Liu, Fei Tony, Kai Ming Ting, and Zhi-Hua Zhou. "Isolation forest." 2008 Eighth IEEE International Conference on Data Mining. IEEE, 2008. doi:10.1109/ICDM.2008.17
Liu, Fei Tony, Kai Ming Ting, and Zhi-Hua Zhou. "Isolation-based anomaly detection." ACM Transactions on Knowledge Discovery from Data (TKDD) 6.1 (2012): 1-39. doi:10.1145/2133360.2133363
Liu, Fei Tony, Kai Ming Ting, and Zhi-Hua Zhou. "On detecting clustered anomalies using SCiForest." Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, Berlin, Heidelberg, 2010. doi:10.1007/978-3-642-15883-4_18
Hariri, Sahand, Matias Carrasco Kind, and Robert J. Brunner. "Extended isolation forest." IEEE Transactions on Knowledge and Data Engineering (2019). doi:10.1109/TKDE.2019.2947676
References for related features, such as response curves and variable importance, are listed under their own functions.
evaluate_po, marginal_response, independent_response, shap_dependence, spatial_response, variable_analysis, isolation.forest
########### Presence-absence mode #################
library(dplyr)
library(sf)
library(stars)
library(itsdm)
# Load example dataset
data("occ_virtual_species")
obs_df <- occ_virtual_species %>% filter(usage == "train")
eval_df <- occ_virtual_species %>% filter(usage == "eval")
x_col <- "x"
y_col <- "y"
obs_col <- "observation"
obs_type <- "presence_absence"
# Format the observations
obs_train_eval <- format_observation(
  obs_df = obs_df, eval_df = eval_df,
  x_col = x_col, y_col = y_col, obs_col = obs_col,
  obs_type = obs_type)
# Load variables
env_vars <- system.file(
  'extdata/bioclim_tanzania_10min.tif',
  package = 'itsdm') %>% read_stars() %>%
  slice('band', c(1, 5, 12))
# Modeling
mod_virtual_species <- isotree_po(
  obs_mode = "presence_absence",
  obs = obs_train_eval$obs,
  obs_ind_eval = obs_train_eval$eval,
  variables = env_vars, ntrees = 10,
  sample_size = 0.6, ndim = 1L,
  seed = 123L, nthreads = 1)
# Check results
## Evaluation based on training dataset
print(mod_virtual_species$eval_train)
plot(mod_virtual_species$eval_train)
## Response curves
plot(mod_virtual_species$marginal_responses)
plot(mod_virtual_species$independent_responses,
     target_var = c('bio1', 'bio5'))
plot(mod_virtual_species$shap_dependences)
## Relationships between target var and related var
plot(mod_virtual_species$shap_dependences,
     target_var = c('bio1', 'bio5'),
     related_var = 'bio12', smooth_span = 0)
# Variable importance
mod_virtual_species$variable_analysis
plot(mod_virtual_species$variable_analysis)
########### Presence-only mode ##################
# Load example dataset
data("occ_virtual_species")
obs_df <- occ_virtual_species %>% filter(usage == "train")
eval_df <- occ_virtual_species %>% filter(usage == "eval")
x_col <- "x"
y_col <- "y"
obs_col <- "observation"
# Format the observations
obs_train_eval <- format_observation(
  obs_df = obs_df, eval_df = eval_df,
  x_col = x_col, y_col = y_col, obs_col = obs_col,
  obs_type = "presence_only")
# Modeling with perfect_presence mode
mod_perfect_pres <- isotree_po(
  obs_mode = "perfect_presence",
  obs = obs_train_eval$obs,
  obs_ind_eval = obs_train_eval$eval,
  variables = env_vars, ntrees = 10,
  sample_size = 0.6, ndim = 1L,
  seed = 123L, nthreads = 1)
# Modeling with imperfect_presence mode
mod_imperfect_pres <- isotree_po(
  obs_mode = "imperfect_presence",
  obs = obs_train_eval$obs,
  obs_ind_eval = obs_train_eval$eval,
  variables = env_vars, ntrees = 10,
  sample_size = 0.6, ndim = 1L,
  seed = 123L, nthreads = 1)