ebirdst_ppms: eBird Status and Trends predictive performance metrics (PPMs)

View source: R/ebirdst-ppms.R

ebirdst_ppms    R Documentation

eBird Status and Trends predictive performance metrics (PPMs)

Description

Calculate a suite of predictive performance metrics (PPMs) for the eBird Status and Trends model of a given species within a spatiotemporal extent.

Usage

ebirdst_ppms(path, ext, es_cutoff, pat_cutoff)

## S3 method for class 'ebirdst_ppms'
plot(x, ...)

Arguments

path

character; directory that the Status and Trends data for a given species was downloaded to. This path is returned by ebirdst_download() or get_species_path().

ext

ebirdst_extent object (optional); the spatiotemporal extent over which to calculate the PPMs.

es_cutoff

numeric between 0 and 1; the ensemble support cutoff to use in distinguishing zero and non-zero predictions. Optimal ensemble support cutoff values are calculated for each week during the modeling process and stored in the data package for each species. In general, you should not specify a value for es_cutoff and instead allow the function to use the species-specific model-based values.

pat_cutoff

numeric between 0 and 1; the percent above threshold (PAT) cutoff. Optimal PAT cutoff values are calculated for each week during the modeling process and stored in the data package for each species. In general, you should not specify a value for pat_cutoff and instead allow the function to use the species-specific model-based values.

x

ebirdst_ppms object; PPMs as calculated by ebirdst_ppms().

...

ignored.

Details

During the eBird Status and Trends modeling process, a subset of observations (the "test data") is held out from model fitting to be used for evaluating model performance. Model predictions are made for each of these observations, and this function calculates a suite of predictive performance metrics (PPMs) by comparing each prediction with the count observed on the corresponding eBird checklist.

Three types of PPMs are calculated: binary or range-based PPMs assess the ability of the model to predict range boundaries, occurrence PPMs assess the occurrence probability predictions, and abundance PPMs assess the predicted abundance. Both the occurrence and abundance PPMs are within-range metrics, meaning the comparison between observations and predictions is only made within the range where the species occurs.
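The distinction between the three PPM types can be illustrated with a minimal base R sketch on simulated data. Everything in it is an assumption made for illustration: the column names (obs_count, pred_es, pred_occ, pred_abd), the fixed cutoff of 0.5, and the particular metrics chosen (sensitivity, specificity, Brier score, Spearman correlation) are hypothetical stand-ins and are not necessarily the metrics or internals that ebirdst_ppms() uses.

```r
# Simulated held-out test data; all column names are hypothetical.
set.seed(42)
n <- 500
p <- runif(n)                      # latent occurrence probability
detected <- rbinom(n, 1, p)        # 1 if the species was reported on the checklist
test <- data.frame(
  obs_count = detected * (1 + rpois(n, 2 * p)),           # observed count
  pred_es   = pmin(1, pmax(0, p + rnorm(n, sd = 0.2))),   # ensemble support
  pred_occ  = p,                                          # predicted occurrence probability
  pred_abd  = 3 * p                                       # predicted abundance
)

# Binary (range-based) PPMs: is the checklist predicted to be inside the range?
es_cutoff <- 0.5                   # illustrative value only
in_range <- test$pred_es >= es_cutoff
observed <- test$obs_count > 0
binary_ppms <- c(
  sensitivity = sum(in_range & observed) / sum(observed),
  specificity = sum(!in_range & !observed) / sum(!observed)
)

# Occurrence and abundance PPMs are within-range: keep only in-range rows.
in_rng <- test[in_range, ]
occ_ppms <- c(brier = mean((in_rng$pred_occ - (in_rng$obs_count > 0))^2))
abd_ppms <- c(
  spearman = cor(in_rng$pred_abd, in_rng$obs_count, method = "spearman")
)

print(binary_ppms)
print(occ_ppms)
print(abd_ppms)
```

The binary metrics use every test observation, while the occurrence and abundance metrics are computed only on the rows predicted to fall inside the range.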

Prior to calculating PPMs, the test dataset is subsampled spatiotemporally using ebirdst_subset(). This process is performed for 25 Monte Carlo iterations, resulting in 25 estimates of each PPM.
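The repeated-subsampling idea can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: a simple random subsample stands in for the spatiotemporal subsampling done by ebirdst_subset(), and the data, the subsample size of 500, and the helper one_iteration() are hypothetical.

```r
# Simulated test data with a binary observation and a binary prediction.
set.seed(123)
n <- 2000
p <- runif(n)
test <- data.frame(
  observed  = rbinom(n, 1, p) == 1,
  predicted = p >= 0.5
)

n_iter <- 25  # number of Monte Carlo iterations, as described above

# Hypothetical helper: draw one subsample and estimate two metrics from it.
one_iteration <- function(i) {
  # Simple random subsample (assumption: stands in for spatiotemporal subsampling)
  s <- test[sample(nrow(test), size = 500), ]
  data.frame(
    iteration   = i,
    sensitivity = sum(s$predicted & s$observed) / sum(s$observed),
    specificity = sum(!s$predicted & !s$observed) / sum(!s$observed)
  )
}

# One row per iteration, one column per metric.
mc_ppms <- do.call(rbind, lapply(seq_len(n_iter), one_iteration))
print(head(mc_ppms))
```

Because each iteration sees a different subsample, the spread of a metric across the 25 rows gives a rough sense of how sensitive the estimate is to the particular test observations used.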

Value

An ebirdst_ppms object containing a list of three data frames: binary_ppms, occ_ppms, and abd_ppms. Each data frame has 25 rows, one per Monte Carlo iteration, with each iteration estimating the PPMs from a spatiotemporal subsample of the test data. Columns correspond to the different PPMs. binary_ppms contains binary or range-based PPMs, occ_ppms contains within-range occurrence probability PPMs, and abd_ppms contains within-range abundance PPMs. In some cases, PPMs may be missing, either because there isn't a large enough test set within the spatiotemporal extent or because average occurrence or abundance is too low. In these cases, try increasing the size of the ebirdst_extent object.

plot() can be called on the returned ebirdst_ppms object to produce a boxplot of PPMs in all three categories: Binary Occurrence, Occurrence Probability, and Abundance.
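Beyond plotting, the three data frames can be summarized numerically. The sketch below builds a mock list with the same three element names and 25 rows each, then takes the median of every PPM across iterations. The mock values and the column names (binary_auc, binary_kappa, occ_auc, abd_spearman) are hypothetical; inspect names() on a real result to see the columns actually returned.

```r
# Mock object mimicking the shape of the return value (values are simulated).
set.seed(7)
mock_ppms <- list(
  binary_ppms = data.frame(
    binary_auc   = runif(25, 0.7, 0.9),
    binary_kappa = runif(25, 0.3, 0.6)
  ),
  occ_ppms = data.frame(occ_auc = runif(25, 0.6, 0.8)),
  # One missing value, since PPMs can be missing for some iterations
  abd_ppms = data.frame(abd_spearman = c(NA, runif(24, 0.2, 0.5)))
)

# Median of each PPM across the Monte Carlo iterations, ignoring missing values.
ppm_medians <- lapply(mock_ppms, function(df) sapply(df, median, na.rm = TRUE))

# Number of missing estimates in each category.
n_missing <- sapply(mock_ppms, function(df) sum(is.na(df)))

print(ppm_medians)
print(n_missing)
```

A large count in n_missing for a category is a hint that the extent may be too small for that group of metrics.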

Examples

## Not run: 
# download example data
path <- ebirdst_download("example_data", tifs_only = FALSE)
# or get the path if you already have the data downloaded
path <- get_species_path("example_data")

# define a spatiotemporal extent to calculate ppms over
bb_vec <- c(xmin = -90, xmax = -82, ymin = 41, ymax = 48)
e <- ebirdst_extent(bb_vec, t = c("05-01", "07-31"))

# compute predictive performance metrics
ppms <- ebirdst_ppms(path = path, ext = e)
plot(ppms)

## End(Not run)

CornellLabofOrnithology/stemhelper documentation built on Feb. 5, 2023, 9:59 a.m.