README.md

Looking for collaborators: I would like this R package to become a collaborative effort, and I am looking for collaborators and/or testers (everyone is welcome!). The idea would be to couple the package development with at least two collaborative papers: a software note, and an application case highlighting the importance of variable selection and modeling paradigm when building species distribution models. If you are interested, please send me an email at blasbenito at gmail.com or a message on Twitter to @blasbenito.

sdmflow

Work in progress!!

This R package is intended to facilitate the design and execution of scientific workflows for modeling species distributions across space and past time. It particularly focuses on:

Modeling grammar

The package aims to facilitate the design of species distribution modeling (SDM) workflows by providing a consistent modeling grammar, as easy to remember as possible, in order to reduce the cognitive load imposed by packages with large numbers of functions. This grammar is based on the idea that an SDM workflow is composed of a limited set of conceptual steps:

Note: most of the functions mentioned below are still a work in progress.

Each stage will also be represented by a single function that performs the most important steps of that stage at once, so a complete modeling workflow could be written as follows once the pertinent parameters are filled in:

v_auto(...) %>%     #variable preparation
o_auto(...) %>%     #occurrence preparation
s_auto(...) %>%     #automatic variable selection
m_auto(...) %>%     #automatic modeling and evaluation
r_auto(...)         #report generation
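
To illustrate how such a chain can hang together, here is a minimal sketch in base R. It is not the package's implementation: the stub functions only borrow the names of the planned functions, and their arguments (variables, occurrences, max_variables) and return values are assumptions made for the example. The native pipe |> (R >= 4.1) stands in for %>% so that the sketch needs no extra packages. Each stub receives a named list, adds one element to it, and returns it, which is what allows the stages to be piped.

```r
# Hypothetical stubs: the names mirror the planned functions,
# the arguments and bodies are placeholders invented for this sketch.
v_auto <- function(variables) {
  list(variables = variables)
}

o_auto <- function(x, occurrences) {
  x$occurrences <- occurrences
  x
}

s_auto <- function(x, max_variables = 2) {
  # placeholder "selection": keep the first max_variables names
  x$selected <- head(x$variables, max_variables)
  x
}

m_auto <- function(x) {
  # placeholder "model": only the formula is built, nothing is fitted
  x$model <- paste("response ~", paste(x$selected, collapse = " + "))
  x
}

r_auto <- function(x) {
  x$report <- sprintf("%d occurrences, model: %s", nrow(x$occurrences), x$model)
  x
}

workflow <- v_auto(variables = c("bio1", "bio12", "ndvi")) |>
  o_auto(occurrences = data.frame(x = c(1, 2, 3), y = c(4, 5, 6))) |>
  s_auto(max_variables = 2) |>
  m_auto() |>
  r_auto()

print(workflow$report)
```

Because every stage returns the same growing list, the whole workflow ends up in a single object rather than in many loose variables.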

Object classes

Another important goal of "sdmflow" is to reduce the cognitive load produced by a cluttered environment. To solve this issue, the package will rely on three main object classes (well, named lists with a particular structure):
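
The classes themselves are not listed here. As an illustration of what "a named list with a particular structure" can look like in R, the sketch below builds one with a constructor and checks it with a validator. The class name example_occurrences, the function names, and the required elements (presence, background) are all hypothetical and are not taken from the package.

```r
# Hypothetical constructor: the class name and the required elements
# are assumptions made for this sketch.
new_occurrences <- function(presence, background) {
  stopifnot(is.data.frame(presence), is.data.frame(background))
  structure(
    list(presence = presence, background = background),
    class = c("example_occurrences", "list")
  )
}

# Validator: checks the class tag and the expected element names.
is_valid_occurrences <- function(x) {
  inherits(x, "example_occurrences") &&
    all(c("presence", "background") %in% names(x))
}

set.seed(1)
occ <- new_occurrences(
  presence   = data.frame(x = c(0.1, 0.4), y = c(1.2, 1.5)),
  background = data.frame(x = runif(10), y = runif(10))
)

print(is_valid_occurrences(occ))
```

Keeping related data frames inside one classed list means a workflow only has to carry a handful of objects, and each function can check that it received the structure it expects.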

Modeling paradigm

The package sdmflow is based on the "use versus availability" modeling paradigm, which assumes that the presence records somehow reflect how a species uses the available habitat, which is represented by the background data. This modeling method gives higher habitat suitability to environmental values that are rare but disproportionately used by the species, following the idea that an accumulation of presence records over abundant environmental values can be the result of random processes, while the accumulation of presence records on rare environmental values is a clear signal of habitat selection.
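
The reasoning above can be made concrete with a small numeric sketch. It uses simulated data and a simple ratio of used to available proportions, which is one common way to express use versus availability; it is an assumption for illustration and not necessarily what sdmflow computes. The variable names, the temperature-like values, and the cut point at 15 are all invented.

```r
# Simulated data (assumption): one environmental variable.
# 90% of the study area is "cool" (around 10), 10% is "warm" (around 20).
set.seed(1)
background <- c(rnorm(9000, mean = 10, sd = 2), rnorm(1000, mean = 20, sd = 1))

# 60% of the presence records fall in cool conditions, 40% in warm ones.
presence <- c(rnorm(90, mean = 10, sd = 2), rnorm(60, mean = 20, sd = 1))

# Split the variable into two classes and compute proportions.
breaks <- c(-Inf, 15, Inf)
labels <- c("common_cool", "rare_warm")
available <- prop.table(table(cut(background, breaks, labels = labels)))
used      <- prop.table(table(cut(presence,   breaks, labels = labels)))

# Use versus availability: values above 1 mean the class is used
# more than expected from its availability.
selection_ratio <- setNames(as.numeric(used) / as.numeric(available), labels)
print(round(selection_ratio, 2))
```

Most presence records still fall in the common cool class, yet its ratio is below 1 because that class is even more abundant in the background. The rare warm class holds fewer records but a ratio well above 1, which is the signal of habitat selection described above.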

The main advantage of this method comes from its reliance on background data. Unlike absence or pseudo-absence data, background data does not pose any interpretation problems, and simply represents a comprehensive sampling of the environmental conditions of the study area. This package also includes an option to work with "restricted background", generally taken from the area that is accessible to the species by dispersal.
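
As a rough illustration of the restricted background idea, the sketch below keeps only the background points that lie within a fixed distance of at least one presence record. This buffer rule is a simplifying assumption standing in for "accessible by dispersal". The coordinates, the max_distance value, and the use of plain Euclidean distance are invented for the example and do not describe the package's own option.

```r
# Simulated coordinates (assumption): a 100 x 100 study area.
set.seed(1)
background <- data.frame(x = runif(500, 0, 100), y = runif(500, 0, 100))
presence   <- data.frame(x = c(20, 25, 30), y = c(20, 30, 25))

# Hypothetical dispersal distance, in the same units as the coordinates.
max_distance <- 15

# Euclidean distance from every background point to its nearest presence.
nearest <- apply(background, 1, function(p) {
  min(sqrt((presence$x - p[["x"]])^2 + (presence$y - p[["y"]])^2))
})

# Restricted background: only the points the species could plausibly reach.
restricted_background <- background[nearest <= max_distance, ]

print(nrow(background))
print(nrow(restricted_background))
```

With real data, geographic distances or a dispersal model would replace the Euclidean buffer, but the filtering step keeps the same shape.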

Comprehensive documentation

This package was born after years of teaching species distribution models for GBIF.es, and I would like its documentation and vignettes to be as comprehensive as possible, so the package can become a source of tools and knowledge at the same time.



BlasBenito/sdmflow documentation built on April 10, 2020, 2:31 a.m.