pseudoAbsences: Pseudo-absence data generation
In mopa: Species Distribution MOdeling with Pseudo-Absences


Generates pseudo-absence data at random or by k-means clustering, inside a single background or a group of backgrounds (e.g. backgrounds of different extent, as produced by backgroundRadius).

pseudoAbsences(xy, background, realizations = 1, exclusion.buffer = 0.0166, prevalence = 0.5, kmeans = FALSE, varstack = NULL)

`xy`	Data frame or list of data frames with coordinates (each row is a point), that is, the presence data.
`background`	Matrix, or list(s) of matrices, with background coordinates in columns. Object derived from the functions `backgroundGrid`, `OCSVMprofiling` or `backgroundRadius`.
`realizations`	Integer. Number of realizations (default = 1).
`exclusion.buffer`	Minimum distance to be kept between presence data and pseudo-absence data. Default is 0.0166.
`prevalence`	Proportion of presences against absences. Default is 0.5 (equal number of presences and absences).
`kmeans`	Logical. If FALSE (default), pseudo-absences are generated at random. If TRUE, k-means clustering of the background is performed and the centroids are extracted as pseudo-absences.
`varstack`	RasterStack of variables used to compute the k-means clustering. Only used if `kmeans = TRUE`.
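
The way `exclusion.buffer` and `prevalence` interact under random sampling can be pictured with a small base-R sketch. This is not the mopa implementation: `sample_pseudo_absences` is a hypothetical helper, distances are assumed to be plain Euclidean distances in coordinate units, and `prevalence` is assumed to mean presences / (presences + absences), which matches the documented default of 0.5 giving equal numbers.

```r
# Illustrative base-R sketch; NOT the mopa implementation.
# sample_pseudo_absences() is a hypothetical helper.
sample_pseudo_absences <- function(presences, background,
                                   exclusion.buffer = 0.0166,
                                   prevalence = 0.5) {
  presences  <- as.matrix(presences)
  background <- as.matrix(background)

  # Distance from every background point to its nearest presence
  # (Euclidean distance in coordinate units, an assumption of this sketch).
  nearest <- apply(background, 1, function(p) {
    min(sqrt((presences[, 1] - p[1])^2 + (presences[, 2] - p[2])^2))
  })

  # Keep only background points at least `exclusion.buffer` away.
  candidates <- background[nearest >= exclusion.buffer, , drop = FALSE]

  # Assumed reading of `prevalence`: presences / (presences + absences).
  n.abs <- round(nrow(presences) * (1 - prevalence) / prevalence)
  n.abs <- min(n.abs, nrow(candidates))

  # Draw the pseudo-absences at random from the remaining candidates.
  candidates[sample.int(nrow(candidates), n.abs), , drop = FALSE]
}

set.seed(1)
# Made-up background grid and presence points, for illustration only.
bg  <- as.matrix(expand.grid(x = seq(0, 10, by = 0.5), y = seq(0, 10, by = 0.5)))
occ <- cbind(x = c(2, 5, 8), y = c(2, 5, 8))
pa  <- sample_pseudo_absences(occ, bg, exclusion.buffer = 1, prevalence = 0.5)
nrow(pa)
```

With three presences and `prevalence = 0.5`, the sketch draws three pseudo-absences, none of them closer than the buffer to any presence.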

Details. This function can be preceded by OCSVMprofiling and/or backgroundRadius in order to consider alternative methods for pseudo-absence data generation (see references).

Value. A data frame or list(s) of data frames.

Author(s). M. Iturbide

Iturbide, M., Bedia, J., Herrera, S., del Hierro, O., Pinto, M., Gutierrez, J.M., 2015. A framework for species distribution modelling with improved pseudo-absence generation. Ecological Modelling. DOI:10.1016/j.ecolmodel.2015.05.018.

See Also. mopaTrain

# SHORT EXAMPLE
## Load the packages used below
library(mopa)
library(raster)

## Load and prepare presence data
data(Q_pubescens)
presences <- Q_pubescens[sample(1:300, size = 100),]

## Define the spatial characteristics of the study area
r <- raster(nrows=50, ncols=50, xmn=-10, xmx=20, ymn=35, ymx=65, vals = rep(1, 50*50))

## Background of the whole study area
bg <- backgroundGrid(r)

## Generate pseudo-absences considering a single background extent
RS_random <- pseudoAbsences(xy = presences, background = bg$xy, 
                            exclusion.buffer = 0.083*5, 
                            prevalence = 0.5, kmeans = FALSE)



# FULL WORKED EXAMPLE
## Load presence data
data(Oak_phylo2)

## Load climate data 
destfile <- tempfile()
data.url <- "https://raw.githubusercontent.com/SantanderMetGroup/mopa/master/data/biostack.rda"
download.file(data.url, destfile)
load(destfile, verbose = TRUE)

projection(biostack$baseline) <- CRS("+proj=longlat +init=epsg:4326")
r <- biostack$baseline[[1]]
## Background of the whole study area
bg <- backgroundGrid(r)

## Environmental profiling of the background
bg.profiled <- OCSVMprofiling(xy = Oak_phylo2, varstack = biostack$baseline, 
                              background = bg$xy)

## Generate pseudo-absences considering a single background extent
RS_random <- pseudoAbsences(xy = Oak_phylo2, background = bg$xy, 
                            exclusion.buffer = 0.083*5, 
                            prevalence = 0.5, kmeans = FALSE)
RSEP_random <- pseudoAbsences(xy = Oak_phylo2, background = bg.profiled$absence, 
                              exclusion.buffer = 0.083*5, 
                              prevalence = 0.5, kmeans = FALSE)

## Background partition into different extents
bg.extents <- backgroundRadius(xy = Oak_phylo2, background = bg$xy, 
                               start = 0.166, by = 0.083*20, 
                               unit = "decimal degrees")

## Generate pseudo-absences considering different background extents
TS_random <- pseudoAbsences(xy = Oak_phylo2, background = bg.extents, 
                            exclusion.buffer = 0.083*10, 
                            prevalence = 0.5, kmeans = FALSE)


## with k-means clustering
TS_kmeans <- pseudoAbsences(xy = Oak_phylo2, background = bg.extents, 
                            exclusion.buffer = 0.083*5, 
                            prevalence = 0.5, kmeans = TRUE, 
                            varstack = biostack$baseline)
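
The idea behind `kmeans = TRUE` (cluster the background and take the centroids as pseudo-absences) can be sketched with base R alone. This is not the mopa implementation: the background grid and the two environmental variables are made up, and clustering on coordinates together with the variables is an assumption of this sketch.

```r
# Illustrative base-R sketch; NOT the mopa implementation.
set.seed(42)

# Hypothetical background: grid coordinates plus two made-up variables.
bg.xy  <- as.matrix(expand.grid(x = seq(-10, 20, by = 1),
                                y = seq(35, 65, by = 1)))
bg.env <- cbind(temp = -0.5 * bg.xy[, "y"] + rnorm(nrow(bg.xy)),
                prec =  2.0 * bg.xy[, "x"] + rnorm(nrow(bg.xy)))

# Number of pseudo-absences wanted (one centroid per pseudo-absence).
n.abs <- 20

# Cluster the background points; iter.max is raised to help convergence.
km <- stats::kmeans(cbind(bg.xy, bg.env), centers = n.abs, iter.max = 50)

# The coordinate columns of each centroid serve as a pseudo-absence.
centroids.xy <- km$centers[, c("x", "y")]
dim(centroids.xy)
```

Because centroids are averages of cluster members, they generally do not coincide with background grid cells; they spread the pseudo-absences across the sampled space instead of leaving their placement to chance.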