vif_filter: Filter SpatRaster layers based on Variance Inflation Factor...

vif_filter    R Documentation

Filter SpatRaster layers based on Variance Inflation Factor (VIF)

Description

This function iteratively filters the layers of a SpatRaster object. In each iteration it removes the layer with the highest Variance Inflation Factor (VIF), provided that value exceeds a specified threshold (th).

Usage

vif_filter(x, th = 5)

Arguments

x

A SpatRaster object containing the layers (variables) to filter. Must contain two or more layers.

th

A numeric value specifying the Variance Inflation Factor (VIF) threshold. Layers whose VIF exceeds this threshold are candidates for removal in each iteration (default: 5).

Details

This function implements a common iterative procedure to reduce multicollinearity among raster layers by removing variables with a high Variance Inflation Factor (VIF). The VIF for a specific predictor indicates how much the variance of its estimated coefficient is inflated due to its linear relationships with all other predictors in the model. A high VIF value suggests a high degree of collinearity with other predictors (values exceeding 5 or 10 are often considered problematic; see O'Brien, 2007; Legendre & Legendre, 2012).
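
As a rough illustration of the definition above, the VIF of a predictor can be written as 1 / (1 - R^2), where R^2 comes from regressing that predictor on all the others. The sketch below uses base R on simulated data and checks that this regression-based value agrees with the diagonal of the inverted correlation matrix. The data frame `df` and its columns `x1`, `x2`, `x3` are invented for this illustration and are not part of ClimaRep.

```r
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.3)  # strongly correlated with x1
x3 <- rnorm(n)                 # independent of the other two
df <- data.frame(x1 = x1, x2 = x2, x3 = x3)

# VIF from auxiliary regressions: regress each column on the rest
vif_by_regression <- sapply(names(df), function(v) {
  fit <- lm(reformulate(setdiff(names(df), v), response = v), data = df)
  1 / (1 - summary(fit)$r.squared)
})

# VIF from the diagonal of the inverse correlation matrix
vif_by_inversion <- diag(solve(cor(df)))

print(round(vif_by_regression, 3))
print(round(vif_by_inversion, 3))
```

Both vectors should match up to floating-point error, with `x1` and `x2` showing inflated values and `x3` staying close to 1.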

The filtering process requires no user intervention and proceeds as follows:

  1. Validates the input and converts the SpatRaster to a data.frame for calculations.

  2. In each step, the function attempts to calculate VIF efficiently using matrix inversion. If perfect collinearity is detected (resulting in a singular matrix that cannot be inverted), the function automatically switches to a more robust method based on linear regressions to handle the situation without an error.

  3. The function identifies the variable with the highest VIF among the remaining variables. If its VIF is greater than the threshold (th), that variable is removed. The process repeats until all remaining variables are below the threshold or until only one variable remains.
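
The steps above can be sketched on a plain data.frame as follows. This is a minimal illustration under stated assumptions, not the package's actual source: the helper names `compute_vif` and `vif_filter_sketch`, the tolerances, and the simulated columns are all hypothetical. It tries matrix inversion first, falls back to per-variable regressions when inversion fails or looks numerically unstable, and drops the variable with the highest VIF while that value exceeds th.

```r
# Hypothetical helper: VIF for every column of a numeric data.frame.
compute_vif <- function(df) {
  tryCatch({
    v <- diag(solve(cor(df)))
    # A valid VIF is finite and >= 1; anything else signals instability.
    if (any(!is.finite(v)) || any(v < 1 - 1e-8)) stop("unstable inversion")
    v
  }, error = function(e) {
    # Fallback: one regression per variable; an R^2 of ~1 maps to Inf.
    sapply(names(df), function(v) {
      fit <- lm(reformulate(setdiff(names(df), v), response = v), data = df)
      y   <- df[[v]]
      r2  <- 1 - sum(residuals(fit)^2) / sum((y - mean(y))^2)
      if (r2 >= 1 - 1e-10) Inf else 1 / (1 - r2)
    })
  })
}

# Hypothetical helper: drop the highest-VIF column until none exceeds th.
vif_filter_sketch <- function(df, th = 5) {
  excluded <- character(0)
  while (ncol(df) > 1) {
    v     <- compute_vif(df)
    worst <- which.max(v)
    if (v[worst] <= th) break
    excluded <- c(excluded, names(df)[worst])
    df <- df[, -worst, drop = FALSE]
  }
  list(kept = names(df), excluded = excluded)
}

set.seed(42)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2   # exact linear combination of x1 and x2
x4 <- rnorm(n)
res <- vif_filter_sketch(data.frame(x1, x2, x3, x4), th = 5)
print(res)
```

Because `x3` is an exact linear combination of `x1` and `x2`, one member of that trio has to be removed before the remaining VIFs become finite; `x4` is unrelated to the others and should be kept.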

The output is a list with two components (accessed as filtered_raster and summary in the Examples):

  • A SpatRaster object with the variables that were retained after the filtering process.

  • A list with a detailed summary of the process, including the names of the kept and excluded variables, the original Pearson's correlation matrix, and the final VIF values for the retained variables.

The internal VIF calculation includes checks to handle potential numerical instability, such as columns with zero or near-zero variance and cases of perfect collinearity among variables, which could otherwise lead to errors (e.g., infinite VIFs). Variables identified as having infinite VIF due to perfect collinearity are prioritized for removal.
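
One way such a variance check might look is sketched below. The helper name `near_zero_var` and the tolerance `1e-8` are assumptions made for this illustration; the package's internal check may differ. A constant column is worth screening out early because `cor()` returns NA for it, which would break the matrix inversion.

```r
# Hypothetical pre-check: names of columns with zero or near-zero variance.
near_zero_var <- function(df, tol = 1e-8) {
  v <- vapply(df, stats::var, numeric(1), na.rm = TRUE)
  names(v)[is.na(v) | v < tol]
}

df <- data.frame(
  elev  = c(10, 20, 30, 40, 50),
  const = rep(3, 5),                  # zero variance
  temp  = c(5.1, 4.2, 3.9, 2.5, 1.0)
)

flagged <- near_zero_var(df)
# Keep only the columns that passed the check
df_ok <- df[, setdiff(names(df), flagged), drop = FALSE]
print(flagged)
print(names(df_ok))
```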

References:

O’Brien (2007) A caution regarding rules of thumb for variance inflation factors. Quality & Quantity, 41(5), 673–690. https://doi.org/10.1007/s11135-006-9018-6

Legendre & Legendre (2012) Interpretation of ecological structures. In P. Legendre & L. Legendre (Eds.), Developments in Environmental Modelling (Vol. 24, pp. 521–624). Elsevier. https://doi.org/10.1016/B978-0-444-53868-0.50010-1

Value

A list with two elements: the filtered SpatRaster and a summary of the filtering process (accessed as $filtered_raster and $summary in the Examples).

Examples

library(terra)

# Build a 100 x 100 raster with seven synthetic, partly collinear layers
set.seed(2458)
n_cells <- 100 * 100
r_clim <- terra::rast(ncols = 100, nrows = 100, nlyrs = 7)
values(r_clim) <- c(
  (rowFromCell(r_clim, 1:n_cells) * 0.2 + rnorm(n_cells, 0, 3)),
  (rowFromCell(r_clim, 1:n_cells) * 0.9 + rnorm(n_cells, 0, 0.2)),
  (colFromCell(r_clim, 1:n_cells) * 0.15 + rnorm(n_cells, 0, 2.5)),
  (colFromCell(r_clim, 1:n_cells) +
    (rowFromCell(r_clim, 1:n_cells)) * 0.1 + rnorm(n_cells, 0, 4)),
  (colFromCell(r_clim, 1:n_cells) /
    (rowFromCell(r_clim, 1:n_cells)) * 0.1 + rnorm(n_cells, 0, 4)),
  (colFromCell(r_clim, 1:n_cells) *
    (rowFromCell(r_clim, 1:n_cells) + 0.1 + rnorm(n_cells, 0, 4))),
  (colFromCell(r_clim, 1:n_cells) *
    (colFromCell(r_clim, 1:n_cells) + 0.1 + rnorm(n_cells, 0, 4))))
names(r_clim) <- c("varA", "varB", "varC", "varD", "varE", "varF", "varG")
terra::crs(r_clim) <- "EPSG:4326"
terra::plot(r_clim)

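# Remove layers one at a time until no remaining VIF exceeds 5
vif_result <- ClimaRep::vif_filter(r_clim, th = 5)
# Summary of the filtering process
print(vif_result$summary)
# Retained layers as a SpatRaster
r_clim_filtered <- vif_result$filtered_raster
terra::plot(r_clim_filtered)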

ClimaRep documentation built on Aug. 24, 2025, 5:08 p.m.