lrren: Ecological niche model using a log relative risk surface

View source: R/lrren.R

lrrenR Documentation

Ecological niche model using a log relative risk surface

Description

Estimate the ecological niche of a single species with presence/absence data and two covariates. Predict the ecological niche in geographic space.

Usage

lrren(
  obs_locs,
  predict = FALSE,
  predict_locs = NULL,
  conserve = TRUE,
  alpha = 0.05,
  p_correct = "none",
  cv = FALSE,
  kfold = 10,
  balance = FALSE,
  parallel = FALSE,
  n_core = 2,
  poly_buffer = NULL,
  obs_window = NULL,
  verbose = FALSE,
  ...
)

Arguments

obs_locs

Input data frame of presence and absence observations with six (6) features (columns): 1) ID, 2) longitude, 3) latitude, 4) presence/absence binary variable, 5) covariate 1 as x-coordinate, 6) covariate 2 as y-coordinate.

predict

Logical. If TRUE, will predict the ecological niche in geographic space. If FALSE (the default), will not predict.

predict_locs

Input data frame of prediction locations with 4 features (columns): 1) longitude, 2) latitude, 3) covariate 1 as x-coordinate, 4) covariate 2 as y-coordinate. The covariates must be the same as those included in obs_locs.

conserve

Logical. If TRUE (the default), the ecological niche will be estimated within a concave hull around the locations in obs_locs. If FALSE, the ecological niche will be estimated within a concave hull around the locations in predict_locs.

alpha

Numeric. The two-tailed alpha level for the significance threshold (the default is 0.05).

p_correct

Optional. Character string specifying whether to apply a correction for multiple comparisons including a False Discovery Rate p_correct = "FDR", a Sidak correction p_correct = "Sidak", and a Bonferroni correction p_correct = "Bonferroni". If p_correct = "none" (the default), then no correction is applied.

cv

Logical. If TRUE, will calculate prediction diagnostics using internal k-fold cross-validation. If FALSE (the default), will not.

kfold

Integer. Specify the number of folds used in the internal cross-validation. The default is 10.

balance

Logical. If TRUE, the prevalence within each k-fold will be 0.50 by undersampling absence locations (assumes absence data are more frequent). If FALSE (the default), the prevalence within each k-fold will match the prevalence in obs_locs.

parallel

Logical. If TRUE, will execute the function in parallel. If FALSE (the default), will not execute the function in parallel.

n_core

Optional. Integer specifying the number of CPU cores on the current host for parallelization (the default is 2 cores).

poly_buffer

Optional. Specify a custom distance (in the same units as covariates) to add to the window within which the ecological niche is estimated. The default is 1/100th of the smallest range among the two covariates.

obs_window

Optional. Specify a custom window of class 'owin' within which to estimate the ecological niche. The default computes a concave hull around the data specified in conserve.

verbose

Logical. If TRUE (the default), will print function progress during execution. If FALSE, will not print.

...

Arguments passed to risk to select bandwidth, edge correction, and resolution.

Details

This function estimates the ecological niche of a single species (presence/absence data), or the presence of one species relative to another, using two covariates, will predict the ecological niche into a geographic area and prepare k-fold cross-validation data sets for prediction diagnostics.

The function uses the risk function to estimate the spatial relative risk function and forces risk(tolerate == TRUE) in order to calculate asymptotic p-values. The estimated ecological niche can be visualized using the plot_obs function.

If predict = TRUE, this function will predict ecological niche at every location specified with predict_locs with best performance if predict_locs are gridded locations in the same study area as the observations in obs_locs - a version of environmental interpolation. The predicted spatial distribution of the estimated ecological niche can be visualized using the plot_predict function.

If cv = TRUE, this function will prepare k-fold cross-validation data sets for prediction diagnostics. The sample size of each fold depends on the number of folds set with kfold. If balance = TRUE, the sample size of each fold will be the frequency of presence locations divided by the number of folds times two. If balance = FALSE, the sample size of each fold will be the frequency of all observed locations divided by the number of folds. The cross-validation can be performed in parallel if parallel = TRUE using the future, doFuture, doRNG, and foreach packages. Two diagnostics (area under the receiver operating characteristic curve and precision-recall curve) can be visualized using the plot_cv function.

The obs_window argument may be useful to specify a 'known' window for the ecological niche (e.g., a convex hull around observed locations).

This function has functionality for a correction for multiple testing. If p_correct = "FDR", calculates a False Discovery Rate by Benjamini and Hochberg. If p_correct = "Sidak", calculates a Sidak correction. If p_correct = "Bonferroni", calculates a Bonferroni correction. If p_correct = "none" (the default), then the function does not account for multiple testing and uses the uncorrected alpha level. See the internal pval_correct function documentation for more details.

Value

An object of class 'list'. This is a named list with the following components:

out

An object of class 'list' for the estimated ecological niche.

dat

An object of class 'data.frame', returns obs_locs that are used in the accompanying plotting functions.

p_critical

A numeric value for the critical p-value used for significance tests.

The returned out is a named list with the following components:

obs

An object of class 'rrs' for the spatial relative risk.

presence

An object of class 'ppp' for the presence locations.

absence

An object of class 'ppp' for the absence locations.

outer_poly

An object of class 'matrix' for the coordinates of the concave hull around the observation locations.

inner_poly

An object of class 'matrix' for the coordinates of the concave hull around the observation locations. Same as outer_poly.

If predict = TRUE, the returned out has additional components:

outer_poly

An object of class 'matrix' for the coordinates of the concave hull around the prediction locations.

prediction

An object of class 'matrix' for the coordinates of the concave hull around the prediction locations.

If cv = TRUE, the returned object of class 'list' has an additional named list cv with the following components:

cv_predictions_rr

A list of length kfold with values of the log relative risk surface at each point randomly selected in a cross-validation fold.

cv_labels

A list of length kfold with a binary value of presence (1) or absence (0) for each point randomly selected in a cross-validation fold.

Examples

if (interactive()) {
  set.seed(1234) # for reproducibility

# Using the 'bei' and 'bei.extra' data within {spatstat.data}

# Covariate data (centered and scaled)
  elev <- spatstat.data::bei.extra[[1]]
  grad <- spatstat.data::bei.extra[[2]]
  elev$v <- scale(elev)
  grad$v <- scale(grad)
  elev_raster <- terra::rast(elev)
  grad_raster <- terra::rast(grad)

# Presence data
  presence <- spatstat.data::bei
  spatstat.geom::marks(presence) <- data.frame("presence" = rep(1, presence$n),
                                               "lon" = presence$x,
                                               "lat" = presence$y)
  spatstat.geom::marks(presence)$elev <- elev[presence]
  spatstat.geom::marks(presence)$grad <- grad[presence]

# (Pseudo-)Absence data
  absence <- spatstat.random::rpoispp(0.008, win = elev)
  spatstat.geom::marks(absence) <- data.frame("presence" = rep(0, absence$n),
                                              "lon" = absence$x,
                                              "lat" = absence$y)
  spatstat.geom::marks(absence)$elev <- elev[absence]
  spatstat.geom::marks(absence)$grad <- grad[absence]

# Combine into readable format
  obs_locs <- spatstat.geom::superimpose(presence, absence, check = FALSE)
  obs_locs <- spatstat.geom::marks(obs_locs)
  obs_locs$id <- seq(1, nrow(obs_locs), 1)
  obs_locs <- obs_locs[ , c(6, 2, 3, 1, 4, 5)]
  
# Prediction Data
  predict_xy <- terra::crds(elev_raster)
  predict_locs <- as.data.frame(predict_xy)
  predict_locs$elev <- terra::extract(elev_raster, predict_xy)[ , 1]
  predict_locs$grad <- terra::extract(grad_raster, predict_xy)[ , 1]

# Run lrren
  test_lrren <- lrren(obs_locs = obs_locs,
                      predict_locs = predict_locs,
                      predict = TRUE,
                      cv = TRUE)
}


envi documentation built on Feb. 16, 2023, 7:38 p.m.