spat_strat: Preparing data for spatially stratified cross‐validation...

View source: R/spat_strat.R

spat_strat R Documentation

Preparing data for spatially stratified cross‐validation schemes

Description

This function sets up the variables used for spatially stratified sampling in the bootstrapped modeling approach in oneimpact. It defines the hierarchical level H0, used for block cross-validation, as well as the levels H1 and H2, which set the spatial structure of the sampling for model fitting and tuning. Levels H1 and H2 are organized in hierarchical spatial blocks. H1 blocks are larger square blocks, each of which is subdivided into four smaller H2 blocks (with side equal to block_size). H2 blocks are spatially separated, so that three of them are used for model fitting and the fourth is used for model tuning. The level H0 comes from the data set and should represent some other level for block cross-validation, such as population ID, animal ID, year, or study area. H0 blocks are used for drawing validation sets, which allows a thorough evaluation of the fitted models in all H0 blocks.
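The nesting of H2 blocks inside H1 blocks can be illustrated with a few lines of base R. This is only a minimal sketch of the idea, assuming a regular grid anchored at the lower-left corner of the buffered extent; `assign_blocks()` is a hypothetical helper, not a oneimpact function, and the block numbering used by spat_strat() itself may differ.

```r
# Hypothetical helper (not part of oneimpact): assign H1/H2 block ids
# to points on a regular square grid.
assign_blocks <- function(x, y, block_size = 10000, k = 4, buffer = 1000) {
  n_side <- sqrt(k)  # number of H2 blocks along one side of an H1 block
  stopifnot(n_side == round(n_side), n_side > 1)

  # Assumed grid origin: lower-left corner of the buffered extent
  x0 <- min(x) - buffer
  y0 <- min(y) - buffer

  # 0-based column/row index of the H2 cell containing each point
  col2 <- floor((x - x0) / block_size)
  row2 <- floor((y - y0) / block_size)

  # Each H1 cell groups n_side x n_side neighbouring H2 cells
  col1 <- col2 %/% n_side
  row1 <- row2 %/% n_side
  n_col1 <- max(col1) + 1
  blockH1 <- row1 * n_col1 + col1 + 1

  # Position of the H2 cell inside its H1 block, numbered 1..k
  blockH2 <- (row2 %% n_side) * n_side + (col2 %% n_side) + 1

  data.frame(x = x, y = y, blockH1 = blockH1, blockH2 = blockH2)
}

set.seed(1)
pts <- assign_blocks(runif(200, 0, 50000), runif(200, 0, 50000))
table(pts$blockH1, pts$blockH2)
```

Integer division keeps the sketch short: dividing the H2 cell index by sqrt(k) gives the enclosing H1 cell, and the remainder gives the position inside it.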

Usage

spat_strat(
  x,
  colH0 = NULL,
  colID = NULL,
  H1_as_H0 = FALSE,
  k = 4,
  block_size = 10000,
  buffer = 1000,
  coords = NULL,
  all_cols = FALSE,
  crs = "",
  plot_grid = TRUE,
  save_grid = c(NA_character_, "raster", "vector")[1]
)

Arguments

x

⁠[data.frame,sf,SpatVector]⁠
Points to be spatially stratified, as an sf or a terra::SpatVector object. If x is a data.frame, the columns corresponding to the (x,y) coordinates must be given in coords.

colH0

⁠[numeric,character=NULL]⁠
Column number or name to define the ids of the H0 level, i.e. the one with ecological meaning (e.g. individuals, populations, or study areas), used for block cross-validating the predictions of the fitted models. Default is NULL, in which case there is no block H0 defined.

H1_as_H0

⁠[logical(1)=FALSE]⁠
Whether the spatial blocks of level H1 should be used as the block H0, in case no block H0 is provided (if colH0 = NULL). This parameter is ignored if colH0 is provided.

k

⁠[numeric(1)=4]⁠
Number of H2 blocks within each H1 block. Should be a perfect square (4, 9, 16, ...), i.e. k = n^2 where n is an integer > 1. Default is k = 4. TO BE IMPLEMENTED: number of parts for k-fold cross-validation within the H1 hierarchical level, for tuning (setting the penalty parameter). Could be used for nested cross-validation.

block_size

⁠[numeric(1)=10000]⁠
Size (side of a square) of the blocks for the H2 level, in map units (generally meters). The size of the H1 level blocks is defined as sqrt(k)*block_size. Default is block_size = 10000.

buffer

⁠[numeric(1)=1000]⁠
Buffer added around the points before creating the blocks, to make sure all points are included in the samples. Default is buffer = 1000.

coords

[string,vector=NULL]
Vector with the names of the columns with the (x,y) coordinates of the locations from the data set. Default is NULL, in which case x should be an sf or a terra::SpatVector object. If x is a data.frame, coords must be provided.

all_cols

⁠[logical=FALSE]⁠
If TRUE, and if x is a data.frame, the spatial strata blocks are appended as columns in the input data x.

crs

⁠[string=""]⁠
Coordinate reference system (CRS) of the observations, if x is a data.frame.

plot_grid

⁠[logical=TRUE]⁠
If TRUE (default), the grid with spatial blocks and observations is plotted.

save_grid

⁠[character=NA]{NA, "raster", "vector"}⁠
Should the grid which defines the H1 and H2 blocks be saved? NOT IMPLEMENTED.

colID

⁠[numeric,character=NULL]⁠
Column number or name with the ID of the rows of the data observations. In step-selection analysis, this should be the column holding the stratum number of each step. For resource selection analysis and environmental niche modeling, this might be the row id, for instance.
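The relationship between k, block_size, and the side of the H1 blocks described in the arguments above can be checked with simple arithmetic. The sketch below uses a hypothetical helper, `h1_block_side()`, which is not a oneimpact function; it only restates the formula sqrt(k)*block_size and the requirement that k be a perfect square.

```r
# Hypothetical helper (not part of oneimpact): side length of an H1 block,
# in the same map units as block_size.
h1_block_side <- function(k = 4, block_size = 10000) {
  n_side <- sqrt(k)
  # k must be a perfect square larger than 1 (4, 9, 16, ...)
  if (n_side != round(n_side) || n_side <= 1) {
    stop("k must be n^2 for an integer n > 1")
  }
  n_side * block_size
}

h1_block_side()                      # default: 2 * 10000 = 20000
h1_block_side(k = 9, block_size = 5000)  # 3 * 5000 = 15000
```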

Details

The function returns a data.frame with the blocks for the hierarchical levels H0, H1, and H2 for each observation in the input data set. This output is used by the create_samples() function to create samples for the bootstrapped modeling approach when spatially stratified samples are desired.

Value

A data.frame with the blocks at hierarchical levels H0, H1, and H2 corresponding to each of the observations in the input data set x.
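To make the roles of the three levels concrete, the sketch below splits a toy table into validation, tuning, and fitting sets. It is only an illustration of the logic described above and does not reproduce create_samples(). The column names blockH0, blockH1, and blockH2 follow the Examples section; the values are made up, and numbering H2 blocks 1 to 4 within each H1 block is an assumption.

```r
set.seed(42)

# Toy stand-in for the output of spat_strat(); all values are made up
strata <- data.frame(
  blockH0 = sample(c("a", "b", "c"), 300, replace = TRUE),
  blockH1 = sample(1:5, 300, replace = TRUE),
  blockH2 = sample(1:4, 300, replace = TRUE)  # assumed numbering within H1
)

# Validation set: all observations from one H0 level
val_level <- "a"
is_val <- strata$blockH0 == val_level

# For each H1 block, draw one H2 block to be reserved for tuning
h1_ids <- sort(unique(strata$blockH1))
tune_h2 <- setNames(sample(1:4, length(h1_ids), replace = TRUE), h1_ids)
is_tune <- !is_val &
  strata$blockH2 == unname(tune_h2[as.character(strata$blockH1)])

# Everything else is available for model fitting
is_fit <- !is_val & !is_tune

table(set = ifelse(is_val, "validate", ifelse(is_tune, "tune", "fit")))
```

Each observation ends up in exactly one of the three sets, which is the property a spatially stratified scheme relies on.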

Note

To be implemented for input = data.frame: terra::vect(data[data$case == 1, ], geom = c("x", "y")).
To be implemented for track objects, which already have a crs.
Put H0 here as well.

Examples

library(oneimpact)
library(terra)
library(amt)
data(reindeer)

# no block H0, spatial blocks
spat_strat(reindeer, block_size = 5000, coords = c("x", "y"))

# with block H0
spst <- spat_strat(reindeer, coords = c("x", "y"), colH0 = "original_animal_id",
                   all_cols = TRUE)
# Visualize level H0 - individuals
spst_vect <- terra::vect(spst, geom = c("x", "y"))
terra::plot(spst_vect, "blockH0")
# Visualize level H1
terra::plot(spst_vect, "blockH1", type = "classes")
# Visualize level H2 for selected H1 blocks
terra::plot(spst_vect, col = grey(0.7))
terra::plot(spst_vect[spst$blockH1 == 6], "blockH2", type = "classes", add = TRUE) # only block 6
terra::plot(spst_vect[spst$blockH1 %in% c(6,7,10,11)], "blockH2", type = "classes", add = TRUE) # blocks 6, 7, 10, and 11
terra::plot(spst_vect[spst$blockH1 == 6], "blockH2", type = "classes") # zoom to block 6


NINAnor/oneimpact documentation built on June 14, 2025, 12:27 a.m.