correct_colinvar | R Documentation |
Collinearity reduction of predictor variables
correct_colinvar(
env_layer,
method,
proj = NULL,
save_proj = NULL,
restric_to_region = NULL,
restric_pca_proj = FALSE,
maxcell = NULL,
based_on_points = FALSE,
data = NULL,
x = NULL,
y = NULL
)
env_layer |
SpatRaster. An object of class SpatRaster containing the predictors. This function does not allow categorical variables. |
method |
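character. Collinearity reduction method. This argument must be supplied as a character vector. The following methods are implemented: pearson, vif, pca, and fa. For pearson and vif, a threshold can be passed in the same vector, e.g., method = c("pearson", th = "0.7") or method = c("vif", th = "8").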
|
proj |
character. Only used for the pca method. Path to a folder that contains sub-folders for the different projection scenarios. Variable names must match those in the raster used in the env_layer argument. Usage: proj = "C:/User/Desktop/Projections" (see Details for more about this argument). |
save_proj |
character. Directory to save PCA projection. Default NULL. |
restric_to_region |
SpatVector. Area used to restrict the cells of env_layer when performing collinearity reduction. Default: NULL. |
restric_pca_proj |
logical. Whether to restrict the PCA projection geographically to the SpatVector used in restric_to_region. Only used for the pca method. Default: FALSE. |
maxcell |
numeric. Number of raster cells to be randomly sampled. Taking a sample could be useful to reduce memory usage for large rasters. If NULL, the function will use all raster cells. Default NULL. Usage maxcell = 50000. |
based_on_points |
logical. If TRUE, the collinearity reduction method will be based on species points data (i.e., presences plus absences, pseudo-absences, or background points). If TRUE, the data, x, and y arguments must be provided. Default FALSE. |
data |
tibble or data.frame. Database with species data used to model (i.e., presence + absence, or presence + pseudo-absence + background points) with x and y coordinates |
x |
character. Column name with spatial x coordinates |
y |
character. Column name with spatial y coordinates |
When environmental variables are available for the current conditions and for other time periods (future or past), it is recommended to perform the PCA with the current environmental conditions and project it onto the other time periods. To do so, use the "proj" argument: the path to a folder (e.g., projections) that contains sub-folders for the different projection scenarios (e.g., years and emissions). Each sub-folder must contain single-band or multiband rasters with the environmental variables.
For example:
C:/Users/my_pc/projections/
├── MRIESM_2050_ssp126
│   ├── var1.tif
│   ├── var2.tif
│   └── var3.tif
├── MRIESM_2080_ssp585
│   ├── var1.tif
│   ├── var2.tif
│   └── var3.tif
└── UKESM_2050_ssp370
    ├── var1.tif
    ├── var2.tif
    └── var3.tif
If the pca method is run with time projections, correct_colinvar will create the Projection_PCA folder (its exact path is given in the object returned by the function) with the same system of sub-folders, each holding a multiband raster with the principal components (pcs.tif):
C:/Users/my_pc/Projection_PCA/
├── MRIESM_2050_ssp126
│   └── pcs.tif # a multiband tif with principal components
├── MRIESM_2080_ssp585
│   └── pcs.tif
└── UKESM_2050_ssp370
    └── pcs.tif
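As a rough illustration of the input layout that proj expects, the sketch below mocks up the folder structure with base R and checks that every scenario sub-folder holds the same file names. The scenario and file names are copied from the example tree; the root folder name projections_demo and the empty placeholder files are assumptions made for illustration only, and real use requires actual rasters.

```r
# Hypothetical root folder inside the session's temporary directory
root <- file.path(tempdir(), "projections_demo")
scenarios <- c("MRIESM_2050_ssp126", "MRIESM_2080_ssp585", "UKESM_2050_ssp370")
vars <- c("var1.tif", "var2.tif", "var3.tif")

for (sc in scenarios) {
  dir.create(file.path(root, sc), recursive = TRUE, showWarnings = FALSE)
  # Empty placeholder files standing in for real rasters
  file.create(file.path(root, sc, vars))
}

# List scenario sub-folders and the files inside each one
found <- list.dirs(root, full.names = FALSE, recursive = FALSE)
files_per_scenario <- lapply(file.path(root, found), list.files)
names(files_per_scenario) <- found

# TRUE when every scenario contains exactly the expected file names
same_names <- all(vapply(
  files_per_scenario,
  function(f) identical(sort(f), sort(vars)),
  logical(1)
))
```

A check like this before calling correct_colinvar may help catch a scenario folder with a missing or misnamed variable.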
Perform collinearity reduction based on points
Evaluating collinearity based on all environmental conditions of a calibration area or study area could yield different results than evaluating collinearity based on points used to construct the models. If you want to perform collinearity reduction based on species points data, it is strongly recommended to use all the point data used for modeling (i.e., presence + absence or presence + pseudo-absence/background points).
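The difference between area-based and point-based collinearity can be illustrated with simulated data. The sketch below uses base R only; the variables temp and precip, their relationship, and the placement of the points are all hypothetical and are not taken from flexsdm or its example data.

```r
set.seed(7)
n_cells <- 5000

# Simulated "study area": one row per raster cell
temp <- runif(n_cells, 0, 30)
# Hypothetical relationship: precip tracks temp only in warm cells
precip <- ifelse(temp > 20, 50 * temp, 1000) + rnorm(n_cells, sd = 100)
area <- data.frame(temp, precip)

# Hypothetical modeling points concentrated in the warm part of the area
pts <- area[sample(which(area$temp > 20), 200), ]

# Pairwise Pearson correlation over all cells vs. over the points only
cor_area <- cor(area$temp, area$precip)
cor_points <- cor(pts$temp, pts$precip)
```

Because the points sample only part of the environmental space, the two correlations differ, which is why the choice between the two approaches can change which variables are flagged.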
If 'pearson' method, returns a list with the following elements:
cor_table: a matrix object with pairwise correlation values of the environmental variables
cor_variables: a list with one element per environmental variable, containing the pairwise relations that exceeded the correlation threshold for that variable
If 'vif' method, returns a list with the following elements:
env_layer: a SpatRaster object with selected environmental variables
removed_variables: a character vector with removed environmental variables
vif_table: a data frame with VIF values for all environmental variables
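For readers unfamiliar with VIF, the following base R sketch shows one common way such a selection can work: the VIF of a variable is 1 / (1 - R²) from regressing it on the remaining variables, and the variable with the highest VIF is dropped until all values fall below the threshold. The stepwise removal rule, the helper name vif, and the simulated data are assumptions for illustration and may not match the internal implementation of correct_colinvar.

```r
set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1) # nearly a copy of x1, so strongly collinear
x3 <- rnorm(n)
env <- data.frame(x1, x2, x3)

# Hypothetical helper: VIF of each column, from regressing it on the others
vif <- function(df) {
  vapply(names(df), function(v) {
    f <- reformulate(setdiff(names(df), v), response = v)
    r2 <- summary(lm(f, data = df))$r.squared
    1 / (1 - r2)
  }, numeric(1))
}

th <- 8 # threshold, as in method = c("vif", th = "8")
removed <- character(0)
repeat {
  v <- vif(env)
  # Stop when all VIFs are acceptable or only two variables remain
  if (max(v) <= th || ncol(env) <= 2) break
  worst <- names(which.max(v))
  removed <- c(removed, worst)
  env <- env[, names(env) != worst, drop = FALSE]
}

vif_table <- data.frame(variable = names(env), vif = unname(vif(env)))
```

Here one of the two near-duplicate variables is removed and the remaining VIF values fall below the threshold.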
If 'pca' method, returns a list with the following elements:
env_layer: SpatRaster with scores of the selected principal components (PCs) that together account for 95% of the total variation of the original environmental variables
coefficients: a matrix with the coefficients of the principal components (PCs) for the predictors
cumulative_variance: a tibble with the cumulative variance explained by the selected principal components (PCs)
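The 95% rule can be sketched with base R's prcomp. This is a minimal illustration on simulated data, assuming the variables are centered and scaled before the PCA; whether correct_colinvar uses exactly these settings is not stated here, and the object names are hypothetical.

```r
set.seed(42)
n <- 500
z <- rnorm(n)
# Simulated predictors: v1 and v2 are strongly correlated, v3 and v4 are not
env <- data.frame(
  v1 = z + rnorm(n, sd = 0.2),
  v2 = z + rnorm(n, sd = 0.2),
  v3 = rnorm(n),
  v4 = rnorm(n)
)

# PCA on centered and scaled variables (an assumption of this sketch)
pca <- prcomp(env, center = TRUE, scale. = TRUE)

# Cumulative proportion of variance explained by successive components
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Smallest number of components reaching at least 95% of the variation
n_pc <- which(cum_var >= 0.95)[1]

scores <- pca$x[, seq_len(n_pc), drop = FALSE]              # analogous to env_layer
coefficients <- pca$rotation[, seq_len(n_pc), drop = FALSE] # analogous to coefficients
```

Because two of the four simulated variables carry nearly the same information, fewer components than variables are needed to reach the 95% level.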
If 'fa' method, returns a list with the following elements:
env_layer: SpatRaster with the variables selected because of their correlation with the factors,
number_factors: number of factors selected according to the Broken-Stick criterion,
removed_variables: removed variables,
uniqueness: uniqueness of each environmental variable according to the factorial analysis,
loadings: environmental variables loadings in each of the chosen factors
## Not run:
require(terra)
require(dplyr)
somevar <- system.file("external/somevar.tif", package = "flexsdm")
somevar <- terra::rast(somevar)
# Perform pearson collinearity control
var <- correct_colinvar(env_layer = somevar, method = c("pearson", th = "0.7"))
var$cor_table
var$cor_variables
# For all correct_colinvar methods it is possible to take a sample of raster cells to reduce memory usage
var <- correct_colinvar(env_layer = somevar, method = c("pearson", th = "0.7"), maxcell = 10000)
var$cor_table
var$cor_variables
# Perform vif collinearity control
var <- correct_colinvar(env_layer = somevar, method = c("vif", th = "8"))
var$env_layer
var$removed_variables
var$vif_table
# Perform pca collinearity control
var <- correct_colinvar(env_layer = somevar, method = c("pca"))
plot(var$env_layer)
var$env_layer
var$coefficients
var$cumulative_variance
# Perform pca collinearity control with different projections
## Below will be created a set of folders to simulate the structure of the
## directory where environmental variables are stored for different scenarios
dir_sc <- file.path(tempdir(), "projections")
dir.create(dir_sc)
dir_sc <- file.path(dir_sc, c("scenario_1", "scenario_2"))
sapply(dir_sc, dir.create)
somevar <-
system.file("external/somevar.tif", package = "flexsdm")
somevar <- terra::rast(somevar)
terra::writeRaster(somevar, file.path(dir_sc[1], "somevar.tif"), overwrite = TRUE)
terra::writeRaster(somevar, file.path(dir_sc[2], "somevar.tif"), overwrite = TRUE)
## Perform pca with projections
dir_w_proj <- dirname(dir_sc[1])
dir_w_proj
var <- correct_colinvar(env_layer = somevar, method = "pca", proj = dir_w_proj)
var$env_layer
var$coefficients
var$cumulative_variance
var$proj
# Perform fa collinearity control
var <- correct_colinvar(env_layer = somevar, method = c("fa"))
var$env_layer
var$number_factors
var$removed_variables
var$uniqueness
var$loadings
## %######################################################%##
# #
#### Other option to perform PCA ####
#### considering cell restricted to a region ####
# #
## %######################################################%##
data("abies")
# Define a calibration area
abies2 <- abies %>%
dplyr::select(x, y, pr_ab) %>%
dplyr::filter(pr_ab == 1)
plot(somevar[[1]])
points(abies2[-3])
ca <- calib_area(abies2, x = "x", y = "y", method = c("mcp"), crs = crs(somevar))
plot(ca, add = TRUE)
# Full geographical range to perform PCA
pca_fr <- correct_colinvar(
env_layer = somevar,
method = c("pca"),
maxcell = NULL,
restric_to_region = NULL,
restric_pca_proj = FALSE
)
# Perform PCA only with cells delimited by the polygon used in restric_to_region
pca_rr <- correct_colinvar(
env_layer = somevar,
method = c("pca"),
maxcell = NULL,
restric_to_region = ca,
restric_pca_proj = FALSE
)
# Perform and predict PCA only with cells delimited by the polygon used in restric_to_region
pca_rrp <- correct_colinvar(
env_layer = somevar,
method = c("pca"),
maxcell = NULL,
restric_to_region = ca,
restric_pca_proj = TRUE
)
plot(pca_fr$env_layer) # PCA with all cells
plot(pca_rr$env_layer) # PCA with calibration area cells but predicted for the entire region
plot(pca_rrp$env_layer) # PCA performed and predicted for cells within calibration area (ca)
##%######################################################%##
# #
#### Use correct_colinvar with points data ####
# #
##%######################################################%##
data("abies")
# Presence-absence database
abies2 <- abies %>%
dplyr::select(x, y, pr_ab)
# Perform collinearity control
# Pearson
correct_colinvar(
env_layer = somevar,
method = c("pearson", th = "0.6"),
based_on_points = TRUE,
data = abies2,
x = "x",
y = "y"
)
# VIF
correct_colinvar(
env_layer = somevar,
method = c("vif", th = "8"),
based_on_points = TRUE,
data = abies2,
x = "x",
y = "y"
)
# PCA
correct_colinvar(
env_layer = somevar,
method = c("pca"),
based_on_points = TRUE,
data = abies2,
x = "x",
y = "y"
)
# FA
correct_colinvar(
env_layer = somevar,
method = "fa",
based_on_points = TRUE,
data = abies2,
x = "x",
y = "y"
)
## End(Not run)