oss.seqVIF: Perform Sequential Variance Inflation Factor Analysis

View source: R/oss.seqVIF.R

oss.seqVIF    R Documentation

Perform Sequential Variance Inflation Factor Analysis

Description

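Performs sequential variance inflation factor (VIF) analysis on a data frame of predictor variables, where each column holds the values of one predictor. The covariate with the highest VIF is removed and the analysis is repeated until every remaining covariate has a VIF score below the threshold.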
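The remove-and-repeat procedure described above can be sketched in base R as follows. This is only an illustration of the general technique, not the source of oss.seqVIF: the helper names vif_all and seq_vif_sketch, the toy data, and the shape of the returned list are all assumptions made for this example.

```r
# Hypothetical helper (not part of onsoilsurvey): VIF of every column,
# obtained by regressing each column on all of the other columns.
vif_all <- function(df) {
  sapply(names(df), function(v) {
    fit <- lm(reformulate(setdiff(names(df), v), response = v), data = df)
    1 / (1 - summary(fit)$r.squared)
  })
}

# Hypothetical sketch of the sequential loop: drop the covariate with the
# highest VIF, recompute, and stop once every VIF is below thresh.
seq_vif_sketch <- function(df, thresh) {
  removed <- character(0)
  while (ncol(df) > 1) {
    vifs <- vif_all(df)
    if (max(vifs) < thresh) break
    worst <- names(vifs)[which.max(vifs)]
    removed <- c(removed, worst)
    df <- df[, names(df) != worst, drop = FALSE]
  }
  list(retained = names(df), removed = removed)
}

# Toy data: a and b are nearly collinear, c is independent noise.
set.seed(1)
x <- rnorm(200)
toy <- data.frame(a = x + rnorm(200, sd = 0.1),
                  b = x + rnorm(200, sd = 0.1),
                  c = rnorm(200))
res <- seq_vif_sketch(toy, thresh = 5)
```

In this toy case one of the two collinear columns is dropped and the independent column is kept. The actual function additionally reports the VIF scores from each run, which this sketch omits.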

Usage

oss.seqVIF(cov_df, thresh, trace = FALSE, show.R2.vals = FALSE)

Arguments

cov_df

data.frame object in which each column represents the values of one covariate.

thresh

numeric object. Sequential VIF is performed until all covariates have VIF scores lower than this value.

trace

Boolean object (default = FALSE). Should the results of each VIF run be shown in the console?

show.R2.vals

Boolean object (default = FALSE). Should each VIF score also be reported as an R-squared value? (VIF = 1/(1 - R2))
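The relationship between a VIF score and its R-squared value can be checked with a short conversion. The function names vif_to_r2 and r2_to_vif are hypothetical and exist only for this illustration; they are not part of the package.

```r
# Hypothetical helpers: convert between VIF and the R-squared of the
# regression of one covariate on all the others.
r2_to_vif <- function(r2) 1 / (1 - r2)
vif_to_r2 <- function(vif) 1 - 1 / vif

# Commonly quoted VIF cut-offs expressed as R-squared values
r2_vals <- vif_to_r2(c(2, 5, 10))   # 0.5, 0.8, 0.9
```

Read this way, a threshold of thresh = 5 corresponds to removing covariates for which more than 80% of the variance is explained by the other covariates.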

Value

Returns a list of 4 objects:

1. Covariates_retained: the column names of the covariates that were kept (i.e., have VIF scores lower than the threshold).
2. Covariates_removed: the column names of the covariates that were removed (i.e., had VIF scores higher than the threshold).
3. VIF_removed: a summary of the VIF scores of each covariate that was removed. NA indicates that scores fell below the threshold and removal was not performed.
4. VIF_all: a complete report of the VIF score of each covariate for each time VIF was run.
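A typical next step is to subset the original data frame to the retained covariates. The sketch below uses a mock list in place of real oss.seqVIF output; the covariate names and values are invented, and it assumes that Covariates_retained and Covariates_removed are character vectors of column names, as the descriptions above suggest.

```r
# Invented covariate table for illustration only
covs <- data.frame(elev = 1:5,
                   slope = c(2, 4, 1, 3, 5),
                   elev_copy = 1:5 + 0.1)

# Mock object standing in for the output of oss.seqVIF(); only the two
# name elements are mimicked, and their exact types are an assumption.
mock_results <- list(
  Covariates_retained = c("elev", "slope"),
  Covariates_removed  = "elev_copy"
)

# Keep only the covariates that passed the VIF threshold
covs_reduced <- covs[, mock_results$Covariates_retained, drop = FALSE]
```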

Examples

#Perform sequential VIF on an environmental raster stack
library(onsoilsurvey) #provides oss.seqVIF and the keene dataset
library(terra)
library(fmsb)

#Generate autocorrelated raster layers from the Keene study area DEM
data(keene)
keene <- rast(keene)

#Original DEM values
orig_pts <- terra::spatSample(x=keene, size=1000, na.rm=TRUE, method="random", as.df=FALSE)

#Create values correlated with original DEM values
ras1 <- orig_pts + rnorm(n=length(orig_pts), mean=0, sd=5)
ras2 <- orig_pts + rnorm(n=length(orig_pts), mean=0, sd=5)
ras3 <- orig_pts + rnorm(n=length(orig_pts), mean=0, sd=5)
ras4 <- orig_pts + rnorm(n=length(orig_pts), mean=0, sd=10)
ras5 <- orig_pts + rnorm(n=length(orig_pts), mean=0, sd=10)

df <- data.frame(orig_pts, ras1, ras2, ras3, ras4, ras5)

#Create values with same mean as DEM values but not correlated
random_rasters <- NULL
for(i in 1:10){
  rand_ras <- rnorm(n=length(orig_pts), mean=mean(orig_pts), sd=10)
  random_rasters <- cbind(random_rasters, rand_ras)
}

df <- cbind(df, random_rasters)

#Run sequential VIF analysis:
# remove the covariate with the highest VIF value,
# then repeat until all VIF scores fall below the threshold

vif_results <- oss.seqVIF(df, thresh=5, trace=FALSE, show.R2.vals=TRUE)


newdale/onsoilsurvey documentation built on Jan. 5, 2024, 1:35 a.m.