minosse.data: Creates model's predictors

minosse.data    R Documentation

Description

This function creates both the response variable and the predictor variables to be used with the minosse.target function.

Usage

minosse.data(obj,species_name,domain,time.overlap=0.95,coc.by="locality",
min.occs=3,abiotic.covs=NULL,combine.covs=TRUE,reduce_covs_by="pca",covs_th=0.95,
c.size="mean",bkg.predictors="presence",min.bkg=NULL,sampling.by.distance=TRUE,
prediction.ground=NULL,crop.by.mcp=FALSE,constrain.predictors=FALSE,
temporal.tolerance=NULL,projection=NULL,lon_0=NULL,lat_0=NULL,n.clusters=NULL,seed=NULL)

Arguments

obj

An n x m data frame where the n rows are the single occurrences and the m columns are the following: spec (the species name), x and y (longitude and latitude in decimal degrees, respectively) and loc_id (an id identifying the fossil locality).

species_name

Character. The name of the species whose geographic range is to be estimated.

domain

Character or NULL. Only used if no prediction ground is provided. If set to "land", present-day mainland portions are selected according to the spatial distribution of the fossil data; if set to "sea", the corresponding portion of the marine domain is used as prediction ground. Default is NULL.

time.overlap

Numeric. The proportion of temporal intersection between the target and the predictors' time span. Default is 0.95.

coc.by

Character. Either "locality" or "cell", setting the level at which the cooccurrence analysis is performed. See details below.

min.occs

Either a single number or a numeric vector of length 2. The number of occurrences below which a species is discarded as either a valid predictor or a target. If only one value is provided, the threshold is the same for both target and predictors.

abiotic.covs

The raster or raster stack of additional environmental predictors.

combine.covs

Logical. Should minosse.data collate species and abiotic predictors when reducing the number of variables? Default is TRUE. See details.

reduce_covs_by

Character. The method used to reduce the number of predictors. Available strategies are "pca", "variance" or "corr". See details.

covs_th

Numeric. The threshold value used by the strategy chosen in reduce_covs_by to reduce the number of predictors. See details.

c.size

Numeric. The (square) cell resolution in meters for spatial interpolations. Some character values are also possible: "mean", "semimean" and "max" (see details for further explanations). If prediction.ground is not NULL and is a raster, the raster resolution can be used by setting c.size to "raster".

bkg.predictors

The number of pseudo absences to be simulated for each predictor species. If "presence", the number of pseudo absences equals the number of presences of each species.

min.bkg

Numeric. If bkg.predictors is set to "presence", this is the minimum number of pseudo absences to simulate when the number of occurrences of a species is below this value.

sampling.by.distance

Logical. TRUE for a distance-based density simulation of pseudo absences. FALSE for a purely random spatial distribution of pseudo absences.

prediction.ground

Either a raster or a SpatialPolygons class object over which all the spatial interpolations and the target species prediction are performed.

crop.by.mcp

Logical. If TRUE, interpolations and prediction are restricted to the prediction.ground area delimited by the minimum convex polygon (MCP) enclosing the fossil occurrences of the whole dataset. Default is FALSE.

constrain.predictors

Logical. Should all the localities not complying with the spatial and temporal restrictions be removed from the predictors' record? Default is FALSE. See details.

temporal.tolerance

Numeric. If constrain.predictors is TRUE, this is the maximum difference (expressed in millions of years, e.g. 0.1 = 100,000 years) allowed between the estimated ages of the target and predictor species' localities. See details.

projection

Character. This argument works only if prediction.ground is NULL. This is the equal-area projection used for spatial interpolations: either a character string in the proj4 format or one of "moll" (Mollweide) and "laea" (Lambert azimuthal equal-area). See details.

lon_0

Numeric. Only if prediction.ground is NULL. The longitude of the projection centre used when setting either "moll" or "laea" projections. If NULL the mean longitude of the whole fossil record is used. Default NULL.

lat_0

Numeric. Only if prediction.ground is NULL. The latitude of the projection centre used when setting the "laea" projection. If NULL, the mean latitude of the whole fossil record is used. Default is NULL.

n.clusters

Numeric, "automatic" or NULL. The number of cores to use during spatial interpolations. If "automatic", the number of cores used equals the number of predictors; if the number of predictors exceeds the available cores, all cores minus one are used. Default is NULL.

seed

Numeric. The seed number for experiment replication.
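
As a rough illustration of the input described above, the sketch below builds a toy occurrence table with the four documented columns and shows one way the min.occs, bkg.predictors and min.bkg settings could be resolved. The species names, coordinates, the helper resolve_min_occs and the assumed (target, predictors) order of a length-2 min.occs are all hypothetical; this is not the package's internal code.

```r
# Toy occurrence table with the four columns listed under `obj`.
# Species names, coordinates and locality ids are invented for illustration.
obj <- data.frame(
  spec   = c("Sp_A", "Sp_A", "Sp_A", "Sp_B", "Sp_B",
             "Sp_C", "Sp_C", "Sp_C", "Sp_C"),
  x      = c(10.1, 11.3, 12.0, 10.1, 14.2, 11.3, 12.0, 14.2, 15.5),
  y      = c(45.0, 44.2, 43.8, 45.0, 42.1, 44.2, 43.8, 42.1, 41.0),
  loc_id = c("L1", "L2", "L3", "L1", "L4", "L2", "L3", "L4", "L5"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: expand min.occs to a pair of thresholds.
# ASSUMPTION: a length-2 vector is ordered as (target, predictors).
resolve_min_occs <- function(min.occs) {
  stopifnot(is.numeric(min.occs), length(min.occs) %in% c(1, 2))
  th <- rep(min.occs, length.out = 2)
  names(th) <- c("target", "predictors")
  th
}

th <- resolve_min_occs(3)

# Occurrences per species (named integer vector, sorted by species name)
n_occs <- vapply(split(obj$loc_id, obj$spec), length, integer(1))

# Species below the threshold are discarded as predictors
valid_predictors <- names(n_occs)[n_occs >= th[["predictors"]]]

# With bkg.predictors = "presence", pseudo absences match the presences,
# but never fall below min.bkg
min.bkg <- 4
n_bkg <- pmax(n_occs[valid_predictors], min.bkg)
```

Here Sp_B has only two occurrences and is dropped, while Sp_A (three presences) is raised to the min.bkg floor of four pseudo absences.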

Details

minosse.data offers different strategies for reducing the number of predictor species (covariates). The first one retains only the species that are significantly related (positively or negatively) to the target species and discards all the others. This first strategy relies on a cooccurrence analysis that can be performed either at the locality level, i.e. by seeking patterns of cooccurrence within the species list of each single fossil locality, or at the cell level, i.e. by considering the lists of unique species occurring inside the square cells of the prediction ground. A cell-based analysis is useful when there are many low-richness fossil localities. If fewer than 4 significant relationships are found, all the species are considered.
Other strategies can be used to reduce the predictors' dimensionality. These additional strategies are performed on the predictors' maps and can employ one of the following methods: Principal Component Analysis ("pca"), Variance Inflation Factor ("variance") and correlation ("corr"). Each of them needs a threshold value (covs_th) in order to select the predictors to retain. If the strategy is "pca", covs_th is the percentage (from 1 to 100) of variance to be explained by the PCA axes. If the strategy is "corr", covs_th is any number between 0 and 1 indicating the correlation between predictors below which predictor species can be retained. If the strategy is "variance", covs_th is any number higher than one indicating the maximum variance inflation allowed for a predictor. See the details of the vif function in the usdm package for further explanations.
For c.size some automatic values are available: by setting "mean", the algorithm uses the mean nearest neighbour distance between fossil localities as cell resolution; by setting "semimean", it uses half of the mean nearest neighbour distance; by setting "max", it uses the maximum nearest neighbour distance.
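
The threshold logic of the "pca" and "corr" strategies described above can be sketched in base R. This is only an illustration on invented data, assuming that predictor maps have already been sampled into a table with one column per predictor; the column names, the thresholds and the pairwise dropping rule are assumptions, and the package may implement the reduction differently. The "variance" strategy is not shown.

```r
set.seed(1)  # reproducible toy data

# Columns stand in for predictor maps sampled at 200 cells (invented values)
p1 <- rnorm(200)
preds <- data.frame(
  p1 = p1,
  p2 = p1 + rnorm(200, sd = 0.05),  # nearly a copy of p1
  p3 = rnorm(200)                   # unrelated predictor
)

# "pca": keep the fewest axes whose cumulative explained variance
# reaches the threshold, here expressed as a percentage
covs_th_pca <- 95
pca <- prcomp(preds, center = TRUE, scale. = TRUE)
cum_var <- 100 * cumsum(pca$sdev^2) / sum(pca$sdev^2)
n_axes <- which(cum_var >= covs_th_pca)[1]

# "corr": retain predictors whose correlation with the ones already kept
# stays below the threshold (ASSUMPTION: the later member of a pair is dropped)
covs_th_corr <- 0.7
cm <- abs(cor(preds))
cm[upper.tri(cm, diag = TRUE)] <- 0   # count each pair once
dropped <- names(which(apply(cm, 1, max) > covs_th_corr))
kept <- setdiff(names(preds), dropped)
```

With these toy data two PCA axes are enough to pass the 95% threshold, and the correlation filter drops p2 because it is almost identical to p1.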
If abiotic.covs is not NULL, the combine.covs argument indicates whether the reduction of the predictors' maps is performed by including (TRUE) or excluding (FALSE) the abiotic covariates. If FALSE, abiotic covariates are always included in the final dataset of predictors.
Spatial interpolations always require an equal-area coordinate reference system. The user can specify their own projected CRS (in the proj4 format, see https://proj4.org/operations/projections/index.html) or can use predefined choices like "laea" (for Lambert azimuthal equal-area) or "moll" (for Mollweide) projections. When setting predefined projections, the user can specify the projection centre's coordinates in decimal degrees through the lon_0 and lat_0 arguments. If both lon_0 and lat_0 are NULL, the mean longitude and latitude of the whole fossil record are used. Warning: if not NULL, the prediction.ground's coordinate reference system has priority over all the other projection settings.
time.overlap indicates the proportion of temporal overlap between the target and predictor species. Each predictor that does not sufficiently overlap the target species' time span is automatically ruled out from prediction.
The argument constrain.predictors enables spatial and temporal restrictions on the predictors' fossil localities, so that they can be considered synchronous and syntopic to the target occurrences. The spatial restriction is set as the average nearest neighbour distance between the target species' fossil sites and cannot be changed. The temporal restriction can be set through the argument temporal.tolerance, which is the maximum temporal difference allowed between the estimated ages of the target and predictors' fossil localities. All the sites exceeding this value are ruled out from any analysis. See the reference paper and related Supporting Information for further details.
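
The spatial and temporal restrictions described above can be pictured with a small base R sketch. It assumes coordinates already projected in metres and ages in millions of years; the data, the helper nn_dist and the rule "keep a predictor locality if at least one target site is both close enough and similar enough in age" are hypothetical choices made for illustration, not the package's actual implementation. The same nearest neighbour distances are the quantity behind the automatic c.size values.

```r
# Invented target-species sites: projected coordinates (m) and ages (Myr)
target <- data.frame(
  x   = c(0, 3000, 3000, 10000),
  y   = c(0, 0, 4000, 4000),
  age = c(0.020, 0.021, 0.022, 0.020)
)

# Hypothetical helper: nearest neighbour distance of each point
nn_dist <- function(xy) {
  d <- as.matrix(dist(xy))
  diag(d) <- Inf            # ignore self-distances
  apply(d, 1, min)
}

# Spatial restriction: average nearest neighbour distance among target sites
spatial_limit <- mean(nn_dist(target[, c("x", "y")]))

# Invented predictor localities
pred <- data.frame(
  loc_id = c("P1", "P2", "P3"),
  x      = c(1000, 1000, 50000),
  y      = c(0, 0, 50000),
  age    = c(0.021, 0.050, 0.020)
)

temporal.tolerance <- 0.005  # 5,000 years, expressed in Myr

# ASSUMED rule: a locality is kept if some target site is within both limits
keep <- vapply(seq_len(nrow(pred)), function(i) {
  dist_i <- sqrt((target$x - pred$x[i])^2 + (target$y - pred$y[i])^2)
  dt_i   <- abs(target$age - pred$age[i])
  any(dist_i <= spatial_limit & dt_i <= temporal.tolerance)
}, logical(1))

constrained <- pred[keep, ]
```

In this toy case the spatial limit is 4250 m; P1 passes both checks, P2 is too different in age and P3 is too far away.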

Value

A list of three objects to be used with the minosse.target function. The first element of the list is the dataset of target species occurrences. The second element is the raster stack of predictor species. The third element, if present, is the result of the cooccurrence analysis.

Author(s)

Francesco Carotenuto, francesco.carotenuto@unina.it

Examples

  ## Not run:
library(raster)
library(EcoPast)

# Example fossil occurrence data
data(lgm)

# Example prediction ground raster stored among the package files
prediction_ground <- raster(system.file("exdata/prediction_ground.gri", package = "EcoPast"))

# Build the target and predictor datasets for the target species
minosse_dat <- minosse.data(
  obj = lgm, species_name = "Mammuthus_primigenius",
  domain = NULL, time.overlap = 0.95, coc.by = "locality", min.occs = 3,
  abiotic.covs = NULL, combine.covs = TRUE, reduce_covs_by = NULL, covs_th = 0.95,
  c.size = "mean", bkg.predictors = "presence", min.bkg = 100,
  sampling.by.distance = TRUE, prediction.ground = prediction_ground,
  crop.by.mcp = FALSE, constrain.predictors = FALSE, temporal.tolerance = NULL,
  projection = NULL, lon_0 = NULL, lat_0 = NULL, n.clusters = 3, seed = 625
)

## End(Not run)

francesco-carotenuto/EcoPast documentation built on April 16, 2023, 5:48 p.m.