s2bak | R Documentation |
The fit.s2bak.so function fits SDMs for each provided species within the same system, using a specified SDM approach (by default, GAMs from the mgcv package). Each SDM can be fitted in parallel; the default is 1 core.
The fit.s2bak.s2 function fits SDMs using species sightings, background sites and survey sites, differentiating between them using a binary survey_var predictor, denoting sightings-only (1) or survey (0).
Saving SDMs to the output may be computationally intensive, particularly with large datasets and many species. To reduce memory issues, readout and version = "short" may be used together: the fitted models are not returned in the output but are instead saved to the directory specified in readout.
fit.s2bak.bak fits a bias-adjustment kernel (BaK) for a fitted sightings-only SDM. It provides three models: location bias, species bias and the final bias-adjustment kernel. The user can specify the modelling function for the species and location biases, while the final bias-adjustment kernel is a generalized linear model (glm) that combines model predictions with the output from the other two models.
fit.s2bak builds S2BaK from top to bottom. It supports parallelization, but the default is 1 core.
Fits SO models for all species, S2 models for species with survey data and a BaK model for adjusted predictions.
Assumes that all columns/variables in 'data' are relevant for the location bias model.
fit.s2bak.s2(
  formula,
  data_obs,
  data_surv = NA,
  obs,
  surv = NA,
  sdm.fun,
  background = NA,
  nbackground = 10000,
  overlapBackground = TRUE,
  survey_var = "so",
  addSurvey = FALSE,
  index = NA,
  ncores = 1,
  readout = NA,
  version = c("full", "short")[1],
  ...
)

fit.s2bak.so(
  formula,
  data_obs,
  obs,
  sdm.fun,
  background = NA,
  nbackground = 10000,
  overlapBackground = TRUE,
  index = NA,
  ncores = 1,
  readout = NA,
  version = c("full", "short")[1],
  ...
)

fit.s2bak.bak(
  formula_site,
  formula_species,
  predictions,
  data_surv,
  surv,
  trait,
  bak.fun,
  predict.bak.fun,
  truncate = c(1e-04, 0.9999),
  index = NA,
  bak.arg = list()
)

fit.s2bak(
  formula,
  formula_survey = NA,
  formula_site,
  formula_species,
  data_obs,
  data_surv = NA,
  obs,
  surv = NA,
  trait,
  sdm.fun,
  predict.fun,
  bak.fun,
  predict.bak.fun,
  truncate = c(1e-04, 0.9999),
  background = NA,
  nbackground = 10000,
  overlapBackground = TRUE,
  bak.arg = list(),
  addSurvey = TRUE,
  index = NA,
  ncores = 1,
  readout = NA,
  version = c("full", "short")[1],
  ...
)
formula |
Formula for the model functions. Assumes the structure follows "Y ~ X". Alternatively, a named list of formulas can be provided, with names corresponding to species names. In this case, each species will be fit using its corresponding formula. The response variable can have any name, as the function names the column accordingly. For |
data_obs |
A data.frame containing the covariates used for fitting
|
data_surv |
A data.frame containing the covariates used for fitting
|
obs |
A data.frame of species observations, with a column for species name (must be labelled 'species') and a column of observation indices to reflect presences. If the index column name is not found in 'data', it assumes row number. |
surv |
A data.frame of species presences for the survey data used to
fit |
sdm.fun |
Model (as function) used for fitting. The function must take the formula as its first argument, and 'data' as the parameter for the dataset (including presences and background sites within the data.frame). |
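As an illustration of the calling convention described for sdm.fun, here is a minimal sketch of a wrapper that takes the formula first and the dataset as 'data'. The wrapper name glm_sdm and the toy data are hypothetical and not part of the package; base R's glm is used only to keep the sketch self-contained.

```r
# Hypothetical wrapper (not part of s2bak): formula is the first argument,
# the dataset is passed as `data`, and extra arguments are forwarded via `...`.
glm_sdm <- function(formula, data, ...) {
  stats::glm(formula, data = data, family = stats::binomial(), ...)
}

# Toy presence (1) / background (0) data, invented for illustration.
set.seed(1)
toy <- data.frame(
  pa   = rep(c(1, 0), each = 50),
  temp = c(rnorm(50, mean = 20, sd = 3), rnorm(50, mean = 15, sd = 3))
)

fit <- glm_sdm(pa ~ temp, data = toy)
class(fit)
```

A function of this shape could then be supplied as sdm.fun, with any further arguments passed through '...'.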
background |
Background sites (pseudo-absences) used to fit the presence-only model, provided as a vector of indices of data (following the same column name as observations). If the index column name is not found in 'data', it assumes row number within 'data'. If left as NA, it will randomly sample 'nbackground' sites, with or without overlap ('overlapBackground'). Currently, only one set of background sites can be used. |
nbackground |
Number of background sites to sample. Only applies if background = NA. |
overlapBackground |
Whether sampled background sites that overlap with observations should be included. By default, it allows overlap. If FALSE, number of background sites may be less than specified or provided. |
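The interaction between nbackground and overlapBackground can be sketched as follows. This is an assumption about the behaviour described above, not the package's actual code: sample_background is a hypothetical helper that samples site indices and, when overlap is disallowed, drops any that coincide with presences (which is why fewer sites than requested may remain).

```r
# Hypothetical helper (not part of s2bak), assuming sites are indexed by row number.
sample_background <- function(n_sites, presences, nbackground = 10000,
                              overlapBackground = TRUE) {
  # Never request more background sites than there are sites available.
  bg <- sample.int(n_sites, size = min(nbackground, n_sites))
  if (!overlapBackground) {
    # Drop sampled sites that coincide with observed presences.
    bg <- setdiff(bg, presences)
  }
  bg
}

set.seed(42)
presences  <- c(3, 8, 15)
bg_overlap <- sample_background(100, presences, nbackground = 50)
bg_strict  <- sample_background(100, presences, nbackground = 50,
                                overlapBackground = FALSE)

length(bg_overlap)          # 50
any(bg_strict %in% presences)  # FALSE
```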
survey_var |
Character name for the predictor variable indicating whether a site is sightings-only (1) or survey data (0); the default is "so". The column is automatically created within the function, and is used with the formulas. |
addSurvey |
Whether the binary variable survey_var should be added to formula(s). If survey data is not provided or if survey_var is already in the formula, then survey_var will not be added to the formula(s) regardless of addSurvey = TRUE. If there is survey data and addSurvey = FALSE, then 'so' will not be added, and it will throw a warning. |
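To make the role of survey_var concrete, the sketch below builds the binary column by hand and appends it to a formula only when it is not already present. The data frames and variable names (sightings, survey, pa, temp) are invented for illustration, and the steps are an assumption about what the function does internally.

```r
survey_var <- "so"

# Toy data: sightings-only (presence-background) and survey (presence-absence).
sightings <- data.frame(pa = c(1, 1, 0, 0), temp = c(18, 21, 12, 14))
survey    <- data.frame(pa = c(1, 0, 0),    temp = c(19, 11, 13))

# Flag the data source: 1 = sightings-only, 0 = survey.
sightings[[survey_var]] <- 1
survey[[survey_var]]    <- 0
combined <- rbind(sightings, survey)

# Add the survey variable to the formula unless it is already there.
base_formula <- pa ~ temp
if (survey_var %in% all.vars(base_formula)) {
  s2_formula <- base_formula
} else {
  s2_formula <- update(base_formula, as.formula(paste("~ . +", survey_var)))
}
s2_formula
```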
index |
Name of the column used to index the environmental data.frame against the species sightings/survey data. If left as index = NA, then it will assume row number. |
ncores |
Number of cores to fit the SDMs, default is 1 core but can be automatically set if ncores=NA. If ncores > number of available cores - 1, set to the latter. |
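The capping rule for ncores can be sketched with the parallel package that ships with R. resolve_ncores is a hypothetical helper, and the choice of "available cores - 1" when ncores = NA is an assumption; the text above only states that the value is set automatically.

```r
# Hypothetical helper (not part of s2bak) illustrating the capping rule.
resolve_ncores <- function(ncores = 1, available = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to a single core.
  if (is.na(available)) available <- 1
  limit <- max(1, available - 1)
  # Assumption: ncores = NA means "use the limit".
  if (is.na(ncores)) return(limit)
  min(ncores, limit)
}

resolve_ncores(1,  available = 8)  # 1
resolve_ncores(NA, available = 8)  # 7
resolve_ncores(16, available = 8)  # 7
```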
readout |
Directory to save fitted SDMs and background sites. If NA, it will not save any SDMs. Provides an additional output that shows where the SDM is saved (with file name). The output in this directory can later be used in other functions such as predict.s2bak.s2. |
version |
Whether the SDMs should be included in the output. With "short", the fitted SDMs are not provided. Setting to "full" (default) will output the list with all SDMs. Setting to "short" in combination with readout can considerably reduce RAM usage while saving the progress so far, which is useful when dealing with many species or large datasets. |
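The memory-saving idea behind readout with version = "short" can be illustrated in base R: each model is written to disk as soon as it is fitted and only its file path is kept. The file naming, the use of saveRDS, and the toy data are assumptions made for this sketch; the package's actual file format is not documented here.

```r
# Hypothetical output directory and species names, for illustration only.
out_dir <- file.path(tempdir(), "sdm_readout")
dir.create(out_dir, showWarnings = FALSE)
species <- c("sp_a", "sp_b")

set.seed(1)
toy <- data.frame(pa = rbinom(40, 1, 0.5), temp = rnorm(40))

paths <- vapply(species, function(sp) {
  fit  <- glm(pa ~ temp, data = toy, family = binomial())
  path <- file.path(out_dir, paste0(sp, ".rds"))
  saveRDS(fit, path)
  path  # return only the path, so the fitted model is not kept in memory
}, character(1))

# A model can be reloaded later when it is needed, e.g. for prediction.
reloaded <- readRDS(paths[["sp_a"]])
```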
... |
Other arguments that are passed to the SDM function (sdm.fun). |
formula_site |
Formula for fitting survey site bias, with locational bias as a function of spatial predictions. The response variable, bias, is generated and therefore its variable name can be anything. |
formula_species |
Formula for fitting species bias, with species bias as a function of species traits. The response variable is generated and therefore its name can be anything. |
predictions |
Sightings-only (SO) model predictions over the survey sites for all species, beyond those found in the survey data, as a matrix with columns for each species and rows for each site. |
trait |
Full trait data for the species predictions, as a data.frame with 'species' as a column and relevant traits for the remainder. Like with the predictions, the species in the dataset do not necessarily have to possess survey data, but will be used in the final adjustment model as final output. |
bak.fun |
Model function for fitting bias adjustment model (e.g., glm). |
predict.bak.fun |
Model function for predicting bias adjustment model
(e.g., predict.glm). Needs to match |
truncate |
Numeric minimum and maximum range of predicted values. Values very close to zero or one cannot be meaningfully distinguished; however, these extreme values may have disproportionately large consequences on likelihoods due to the logit transformation. |
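The effect of truncation before a logit transformation can be seen with a few values in base R. This is a sketch of the general idea using the default bounds; the exact point at which the package applies truncation is not shown here, and the vector p is invented for illustration.

```r
truncate <- c(1e-04, 0.9999)

# Example predictions, including values at and very near the boundaries.
p <- c(0, 1e-10, 0.5, 1)

# Clamp predictions into [truncate[1], truncate[2]].
p_trunc <- pmin(pmax(p, truncate[1]), truncate[2])

qlogis(p)        # contains -Inf and Inf at the boundaries
qlogis(p_trunc)  # all values are finite after truncation
```

Without truncation, predictions of exactly 0 or 1 map to infinite logits, and values such as 1e-10 map to very large negative logits that can dominate a likelihood.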
bak.arg |
Additional arguments for |
formula_survey |
For |
predict.fun |
Prediction function for the SDM, which must match the model function used for the s2bak.s2 and s2bak.so models.
An object of class "s2bak.s2", providing fitted SDMs for each species based on the provided SDM modelling approach. The primary difference between SO and S2 models is the additional data points from the survey data, and an additional binary predictor 'so' which denotes whether the data is from presence-background (1) or presence-absence data (0).
Bias-adjustment models: the kernels (location and species) and a second-order GLM.
An S2BaK class object containing S2, SO and BaK.