train_mark_model: Train a flexible model for the mark distribution

View source: R/train_mark_model.R

train_mark_model    R Documentation

Train a flexible model for the mark distribution

Description

Trains a predictive model for the mark distribution of a spatio-temporal process. Allows the user to incorporate location-specific information and competition indices as covariates in the mark model.

Usage

train_mark_model(
  data,
  raster_list = NULL,
  scaled_rasters = FALSE,
  model_type = "xgboost",
  xy_bounds = NULL,
  save_model = FALSE,
  save_path = NULL,
  parallel = TRUE,
  include_comp_inds = FALSE,
  competition_radius = 15,
  correction = "none",
  selection_metric = "rmse",
  cv_folds = 5,
  tuning_grid_size = 200,
  verbose = TRUE
)

Arguments

data

a data frame containing the named columns x, y, size, and time.

raster_list

a list of raster objects (for example, rasters loaded with terra::rast, as in the Examples).

scaled_rasters

'TRUE' or 'FALSE' indicating whether the rasters have been scaled.

model_type

the machine learning model type ("xgboost" or "random_forest").

xy_bounds

a numeric vector of four domain bounds: two for x followed by two for y, e.g. c(0, 25, 0, 25).

save_model

'TRUE' or 'FALSE' indicating whether to save the generated model.

save_path

path for saving the generated model.

parallel

'TRUE' or 'FALSE' indicating whether to use parallelization in model training.

include_comp_inds

'TRUE' or 'FALSE' indicating whether to generate and use competition indices as covariates.

competition_radius

radius (a distance) within which competition indices are computed; only used if include_comp_inds is 'TRUE'.

correction

type of correction to apply ("none", "toroidal", or "truncation").

selection_metric

metric to use for identifying the optimal model ("rmse" or "mae").

cv_folds

number of cross-validation folds to use in model training.

tuning_grid_size

size of the tuning grid for hyperparameter tuning.

verbose

'TRUE' or 'FALSE' indicating whether to show progress of model training.
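
The data and xy_bounds arguments can be checked before training. The following is a minimal sketch in base R, not part of ldmppr: toy_data, required_cols, and inside are hypothetical names, the time column is an arbitrary placeholder, and the bound ordering c(x_min, x_max, y_min, y_max) is an assumption inferred from the c(0, 25, 0, 25) used in the Examples.

```r
# Hypothetical toy data; only the column names come from the `data` argument.
set.seed(1)
n <- 50
toy_data <- data.frame(
  x = runif(n, min = 0, max = 25),
  y = runif(n, min = 0, max = 25),
  size = rexp(n, rate = 0.1)
)

# Placeholder times in (0, 1]: the largest mark gets the earliest time.
# This is an arbitrary stand-in, not the package's size-to-time mapping.
toy_data$time <- rank(-toy_data$size) / n

# Confirm that the columns named in the `data` argument are all present.
required_cols <- c("x", "y", "size", "time")
missing_cols <- setdiff(required_cols, names(toy_data))
stopifnot(length(missing_cols) == 0)

# Assumed ordering: c(x_min, x_max, y_min, y_max).
xy_bounds <- c(0, 25, 0, 25)

# Confirm that every location falls inside the stated domain.
inside <- toy_data$x >= xy_bounds[1] & toy_data$x <= xy_bounds[2] &
  toy_data$y >= xy_bounds[3] & toy_data$y <= xy_bounds[4]
stopifnot(all(inside))
```

Running checks like these first means a missing column or an out-of-domain point is reported immediately instead of partway through hyperparameter tuning.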

Value

a list containing the raw trained model and a bundled model object.

Examples

# Load example raster data
raster_paths <- list.files(system.file("extdata", package = "ldmppr"),
  pattern = "\\.tif$", full.names = TRUE
)
raster_paths <- raster_paths[!grepl("_med\\.tif$", raster_paths)]
rasters <- lapply(raster_paths, terra::rast)

# Scale the rasters
scaled_raster_list <- scale_rasters(rasters)

# Load example locations and map sizes to times
# (called directly so that the pipe operator does not need to be attached)
locations <- dplyr::mutate(
  small_example_data,
  time = power_law_mapping(size, .5)
)

# Train the model
train_mark_model(
  data = locations,
  raster_list = scaled_raster_list,
  scaled_rasters = TRUE,
  model_type = "xgboost",
  xy_bounds = c(0, 25, 0, 25),
  parallel = FALSE,
  include_comp_inds = FALSE,
  competition_radius = 10,
  correction = "none",
  selection_metric = "rmse",
  cv_folds = 3,
  tuning_grid_size = 2,
  verbose = TRUE
)


ldmppr documentation built on April 4, 2025, 12:45 a.m.