```{r setup, include = FALSE}
library(knitr)
knit_engines$set(asis = function(options) {
  if (options$echo && options$eval) knit_child(text = options$code)
})
knitr::opts_chunk$set(message = FALSE, warning = FALSE, eval = FALSE)
```
```{r, echo = FALSE}
# comp1
occs.db <- rvs$comp1 == 'db'
occs.csv <- rvs$comp1 == 'csv'
occs.any <- occs.db | occs.csv
# comp2
poccs.rem <- 'rem' %in% rvs$comp2
poccs.sel <- 'sel' %in% rvs$comp2
poccs.thin <- 'thin' %in% rvs$comp2
poccs.any <- poccs.rem | poccs.sel | poccs.thin
# comp3
env.bc <- rvs$comp3 == 'bc'
env.user <- rvs$comp3 == 'user'
env.any <- env.bc | env.user
# comp4
bg.bb <- rvs$comp4.shp == 'bb'
bg.mcp <- rvs$comp4.shp == 'mcp'
bg.ptbuf <- rvs$comp4.shp == 'ptbuf'
bg.userCSV <- rvs$comp4.shp == 'csv'
bg.userShp <- rvs$comp4.shp == 'shp'
bg.user <- bg.userShp | bg.userCSV
bg.any <- rvs$comp4.shp != ''
bg.bufWid <- rvs$comp4.buf > 0 & bg.ptbuf == FALSE
# comp5
part.block <- rvs$comp5 == 'block'
part.cb1 <- rvs$comp5 == 'cb1'
part.cb2 <- rvs$comp5 == 'cb2'
part.jack <- rvs$comp5 == 'jack'
part.rand <- rvs$comp5 == 'rand'
part.any <- rvs$comp5 != ''
# comp6
mod.bioclim <- rvs$comp6 == 'bioclim'
mod.maxent <- rvs$comp6 == 'maxent'
mod.any <- mod.bioclim | mod.maxent
alg.maxent <- if (mod.maxent) {rvs$algMaxent == 'maxent.jar'} else {FALSE}
alg.maxnet <- if (mod.maxent) {rvs$algMaxent == 'maxnet'} else {FALSE}
# comp7
viz.pred.thr <- if (!is.null(rvs$comp7.thr)) {rvs$comp7.thr != 'noThresh'} else {FALSE}
viz.pred.p10 <- rvs$comp7.thr == 'p10'
viz.pred.raw <- rvs$comp7.type == 'raw'
viz.pred.log <- rvs$comp7.type == 'logistic'
viz.pred.cll <- rvs$comp7.type == 'cloglog'
viz.pred <- 'map' %in% rvs$comp7
viz.mxEval <- 'mxEval' %in% rvs$comp7
viz.bcPlot <- 'bcPlot' %in% rvs$comp7
viz.resp <- 'resp' %in% rvs$comp7
viz.any <- rvs$comp7 != ''
# comp8
pj.area <- rvs$comp8.pj == 'area'
pj.time <- rvs$comp8.pj == 'time'
pj.mess <- rvs$comp8.esim == 'mess'
pj.thr <- if (!is.null(rvs$comp8.thr)) {rvs$comp8.thr != 'noThresh'} else {FALSE}
pj.any <- pj.area | pj.time | pj.mess
```
Please find below the R code history from your Wallace v1.0.6 session.
You can reproduce your session results by running this R Markdown file in RStudio.
Each code block is called a "chunk", and you can run them either one-by-one or all at once by choosing an option in the "Run" menu at the top-right corner of the "Source" pane in RStudio.
For more detailed information, see http://rmarkdown.rstudio.com.
Wallace uses the following R packages that must be installed and loaded before starting.
```r
library(spocc)
library(spThin)
library(dismo)
library(rgeos)
library(ENMeval)
library(dplyr)
```
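If any of these packages are missing, they need to be installed first. A minimal one-time setup along these lines should work (this step is an addition for convenience, not part of the recorded session history):

```r
# install any of the required packages that are not yet present
pkgs <- c("spocc", "spThin", "dismo", "rgeos", "ENMeval", "dplyr")
install.packages(pkgs[!pkgs %in% installed.packages()[, "Package"]])
```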
Wallace also includes several functions developed to help integrate different packages and some additional functionality. For this reason, it is necessary to load the file `functions.R`. The function `system.file()` finds this script, and `source()` loads it.
```r
source(system.file('shiny/funcs', 'functions.R', package = 'wallace'))
```
Record of analysis for *`r "{{spName}}"`*.
```{asis, echo = occs.any, eval = occs.any, include = occs.any}
### Obtain Occurrence Data
```
```{asis, echo = occs.db, eval = occs.db, include = occs.db}
Obtain occurrence records of the selected species from the `r "{{dbName}}"` database. The search for occurrences was limited to `r {{occNum}}` records.
```
```r
# query selected database for occurrence records
results <- spocc::occ(query = "{{spName}}", from = "{{dbName}}",
                      limit = {{occNum}}, has_coords = TRUE)
# retrieve data table from spocc object
results.data <- results[["{{dbName}}"]]$data[[formatSpName("{{spName}}")]]
# remove rows with duplicate coordinates
occs.dups <- duplicated(results.data[c('longitude', 'latitude')])
occs <- results.data[!occs.dups,]
# make sure latitude and longitude are numeric (sometimes they are characters)
occs$latitude <- as.numeric(occs$latitude)
occs$longitude <- as.numeric(occs$longitude)
# give all records a unique ID
occs$occID <- row.names(occs)
```
```{asis, echo = occs.csv, eval = occs.csv, include = occs.csv}
User CSV path with occurrence data. If the CSV file is not in the current workspace, change to the correct file path (e.g. "/Users/darwin/Documents/occs.csv").
```
```r
# NOTE: provide the path to the folder that contains the CSV file
d.occs <- ''
# create path to user occurrences csv file
userOccs.path <- file.path(d.occs, "{{occsCSV}}")
# read in csv
userOccs.csv <- read.csv(userOccs.path, header = TRUE)
# remove rows with duplicate coordinates
occs.dups <- duplicated(userOccs.csv[c('longitude', 'latitude')])
occs <- userOccs.csv[!occs.dups,]
# remove NAs
occs <- occs[complete.cases(occs$longitude, occs$latitude), ]
# give all records a unique ID
occs$occID <- row.names(occs)
```
```{asis, echo = poccs.any, eval = poccs.any, include = poccs.any}
### Process Occurrence Data
```
```{asis, echo = poccs.rem, eval = poccs.rem, include = poccs.rem}
Remove the occurrence localities with the following IDs: `r {{occsRemoved}}`.
```
```r
# remove the rows that match the occIDs selected
occs <- occs %>% filter(!(occID %in% {{occsRemoved}}))
```
```{asis, echo = poccs.sel, eval = poccs.sel, include = poccs.sel}
The following code recreates the polygon used to select occurrences to keep in the analysis.
```
```r
selCoords <- data.frame(x = {{occsSelX}}, y = {{occsSelY}})
selPoly <- sp::SpatialPolygons(list(sp::Polygons(list(sp::Polygon(selCoords)), ID = 1)))
occs.xy <- occs[c('longitude', 'latitude')]
sp::coordinates(occs.xy) <- ~ longitude + latitude
intersect <- sp::over(occs.xy, selPoly)
intersect.rowNums <- as.numeric(which(!(is.na(intersect))))
occs <- occs[intersect.rowNums, ]
```
```{asis, echo = poccs.thin, eval = poccs.thin, include = poccs.thin}
Spatial thinning selected. Thin distance selected is `r {{thinDist}}` km.
```
```r
output <- spThin::thin(occs, 'latitude', 'longitude', 'name', thin.par = {{thinDist}},
                       reps = 100, locs.thinned.list.return = TRUE,
                       write.files = FALSE, verbose = FALSE)
```
```{asis, echo = poccs.thin, eval = poccs.thin, include = poccs.thin}
Since spThin ran 100 iterations, there are 100 different variations of how it thinned your occurrence localities. As there is a stochastic element in the algorithm, some iterations may include more localities than others, and we need to make sure we maximize the number of localities we proceed with.
```
```r
# find the iteration that returns the max number of occurrences
maxThin <- which(sapply(output, nrow) == max(sapply(output, nrow)))
# if there's more than one max, pick the first one
maxThin <- output[[ifelse(length(maxThin) > 1, maxThin[1], maxThin)]]
# subset occs to match only thinned occs
occs <- occs[as.numeric(rownames(maxThin)),]
```
```{asis, echo = env.any, eval = env.any, include = env.any}
### Obtain Environmental Data
```
```{asis, echo = env.bc, eval = env.bc, include = env.bc}
Using the WorldClim (http://www.worldclim.org/) bioclimatic dataset at a resolution of `r {{bcRes}}` arcmin.
```
```r
# get WorldClim bioclimatic variable rasters
envs <- raster::getData(name = "worldclim", var = "bio", res = {{bcRes}},
                        lat = {{bcLat}}, lon = {{bcLon}})
# change names of the raster variables
envRes <- {{bcRes}}
if (envRes == 0.5) {
  i <- grep('_', names(envs))
  editNames <- sapply(strsplit(names(envs)[i], '_'), function(x) x[1])
  names(envs)[i] <- editNames
}
i <- grep('bio[0-9]$', names(envs))
editNames <- paste('bio', sapply(strsplit(names(envs)[i], 'bio'), function(x) x[2]), sep = '0')
names(envs)[i] <- editNames
# subset by those variables selected
envs <- envs[[{{bcSels}}]]
# extract environmental values at occ grid cells
locs.vals <- raster::extract(envs[[1]], occs[, c('longitude', 'latitude')])
# remove occs without environmental values
occs <- occs[!is.na(locs.vals), ]
```
```r
# NOTE: provide the path to the folder that contains the rasters
d.envs <- ''
# create paths to the raster files
userRas.paths <- file.path(d.envs, {{userEnvs}})
# make a RasterStack out of the raster files
envs <- raster::stack(userRas.paths)
```
```{r, echo = FALSE}
bgType <- switch("{{bgSel}}",
                 'bb' = 'Bounding Box',
                 'mcp' = 'Minimum Convex Polygon',
                 'ptsbuf' = 'Buffered Points',
                 'user' = 'User-defined')
```
```{asis, echo = bg.any, eval = bg.any, include = bg.any}
### Process Environmental Data
Background selection technique chosen as `r bgType`.
```
```r
xmin <- min(occs$longitude)
xmax <- max(occs$longitude)
ymin <- min(occs$latitude)
ymax <- max(occs$latitude)
bb <- matrix(c(xmin, xmin, xmax, xmax, xmin, ymin, ymax, ymax, ymin, ymin), ncol = 2)
bgExt <- sp::SpatialPolygons(list(sp::Polygons(list(sp::Polygon(bb)), 1)))
```
```r
occs.xy <- occs[c('longitude', 'latitude')]
sp::coordinates(occs.xy) <- ~ longitude + latitude
bgExt <- mcp(occs.xy)
```
```r
# make SpatialPoints object for buffering
occs.xy <- occs[c('longitude', 'latitude')]
sp::coordinates(occs.xy) <- ~ longitude + latitude
bgExt <- rgeos::gBuffer(occs.xy, width = {{bgBuf}})
```
```{asis, echo = bg.userCSV, eval = bg.userCSV, include = bg.userCSV}
Read a .csv file and generate a Spatial Polygon object.
```
```r
# NOTE: provide the full path to the CSV file
csvPath <- ''
# read csv with coordinates for polygon
shp <- read.csv(csvPath, header = TRUE)
bgExt <- sp::SpatialPolygons(list(sp::Polygons(list(sp::Polygon(shp)), 1)))
```
```{asis, echo = bg.userShp, eval = bg.userShp, include = bg.userShp}
User study extent name is `r "{{bgUserShpName}}"`. User study extent path is `r "{{bgUserShpPath}}"`. Read in the .shp file and generate a Spatial Polygon object.
```
```r
# NOTE: provide the path to the folder that contains the shapefile
d.bg <- ''
# read in the shapefile for the study extent polygon
bgExt <- rgdal::readOGR(d.bg, "{{bgUserShpName}}")
```
```{asis, echo = bg.bufWid, eval = bg.bufWid, include = bg.bufWid}
Buffer size of the study extent polygon defined as `r {{bgBuf}}` degrees.
```
```r
bgExt <- rgeos::gBuffer(bgExt, width = {{bgBuf}})
```
```{asis, echo = bg.any, eval = bg.any, include = bg.any}
Mask environmental variables by `r bgType`, and take a random sample of background values from the study extent. As the sample is random, your results may differ from those in the session. If there seems to be too much variability in these background samples, try increasing the sample size from 10,000 to something higher (e.g. 50,000 or 100,000). The better your background sample, the less variability you'll have between runs.
```
```r
# crop the environmental rasters by the background extent shape
envsBgCrop <- raster::crop(envs, bgExt)
# mask the background extent shape from the cropped raster
envsBgMsk <- raster::mask(envsBgCrop, bgExt)
# sample random background points
bg.xy <- dismo::randomPoints(envsBgMsk, {{bgPtsNum}})
# convert matrix output to data frame
bg.xy <- as.data.frame(bg.xy)
```
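Because `dismo::randomPoints()` draws a random sample, each run of the chunk above produces a different background set. If you want the sample to be repeatable across runs, one option (not part of the original session code; the seed value here is arbitrary) is to fix the random seed before sampling:

```r
# fix the RNG state so the background sample is identical on every run
set.seed(48)
bg.xy <- as.data.frame(dismo::randomPoints(envsBgMsk, {{bgPtsNum}}))
```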
```{r, echo = FALSE}
partSwitch <- switch("{{partSel}}",
                     'block' = 'Block',
                     'cb1' = 'Checkerboard 1',
                     'cb2' = 'Checkerboard 2',
                     'jack' = 'Jackknife',
                     'random' = 'Random')
```
```{asis, echo = part.any, eval = part.any, include = part.any}
### Partition Occurrence Data
Occurrence data is now partitioned for cross-validation, a method that iteratively builds a model on all but one group and evaluates that model on the left-out group.

For example, if the data is partitioned into 3 groups A, B, and C, a model is first built with groups A and B and is evaluated on C. This is repeated by building a model with B and C and evaluating on A, and so on until all combinations are done.

Cross-validation operates under the assumption that the groups are independent of each other, which may or may not be a safe assumption for your dataset. Spatial partitioning is one way to ensure more independence between groups.

You selected to partition your occurrence data by the `r partSwitch` method.
```
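The partitioning functions below only assign group numbers; the build-and-evaluate loop itself is run internally by `ENMeval`. For intuition, the iteration described above looks schematically like this (an illustrative sketch using the `occs.grp` vector produced below, not code from the session):

```r
# schematic cross-validation loop: train on all groups but one, test on the hold-out
for (g in unique(occs.grp)) {
  train.occs <- occs.xy[occs.grp != g, ]  # occurrences used to build the model
  test.occs  <- occs.xy[occs.grp == g, ]  # left-out occurrences used to evaluate it
  # ... fit a model with train.occs, then compute evaluation statistics on test.occs ...
}
```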
```r occs.xy <- occs[c('longitude', 'latitude')] group.data <- ENMeval::get.block(occ=occs.xy, bg.coords=bg.xy)
```r
occs.xy <- occs[c('longitude', 'latitude')]
group.data <- ENMeval::get.checkerboard1(occ = occs.xy, env = envsBgMsk,
                                         bg.coords = bg.xy,
                                         aggregation.factor = {{aggFact}})
```
```r
occs.xy <- occs[c('longitude', 'latitude')]
group.data <- ENMeval::get.checkerboard2(occ = occs.xy, env = envsBgMsk,
                                         bg.coords = bg.xy,
                                         aggregation.factor = {{aggFact}})
```
```r
occs.xy <- occs[c('longitude', 'latitude')]
group.data <- ENMeval::get.jackknife(occ = occs.xy, bg.coords = bg.xy)
```
```r
occs.xy <- occs[c('longitude', 'latitude')]
group.data <- ENMeval::get.randomkfold(occ = occs.xy, bg.coords = bg.xy, kfolds = {{kfolds}})
```
```r
# pull out the occurrence and background partition group numbers from the list
occs.grp <- group.data[[1]]
bg.grp <- group.data[[2]]
```
```{asis, echo = mod.any, eval = mod.any, include = mod.any}
### Build and Evaluate Niche Model
You selected the `r "{{enmSel}}"` model.
```
```r
e <- BioClim_eval(as.data.frame(occs.xy), bg.xy, occs.grp, bg.grp, envsBgMsk)
# unpack the results data frame and list of models
evalTbl <- e$results
evalMods <- e$models
names(e$predictions) <- "BIOCLIM"
evalPreds <- e$predictions
```
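`BioClim_eval()` is one of the helper functions loaded from Wallace's `functions.R`. Its exact internals are not shown here, but the core fit-and-predict steps it wraps can be sketched with `dismo` directly (an illustrative approximation, not the helper's actual code):

```r
# fit a BIOCLIM envelope to the occurrence points and predict over the study extent
bc <- dismo::bioclim(envsBgMsk, as.data.frame(occs.xy))
bc.pred <- dismo::predict(bc, envsBgMsk)
```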
```r
# define the vector of regularization multipliers to test
rms <- seq({{rms1}}, {{rms2}}, {{rmsStep}})
# iterate model building over all chosen parameter settings
e <- ENMeval::ENMevaluate(occs.xy, envsBgMsk, bg.coords = bg.xy, RMvalues = rms,
                          fc = {{fcs}}, method = 'user', occ.grp = occs.grp,
                          bg.grp = bg.grp, clamp = {{clamp}}, algorithm = "{{algMaxent}}")
# unpack the results data frame, the list of models, and the RasterStack of raw predictions
evalTbl <- e@results
evalMods <- e@models
names(evalMods) <- e@results$settings
evalPreds <- e@predictions
```
```{asis, echo = viz.any, eval = viz.any, include = viz.any}
### Visualize Model Results
Below are some visualizations of your modeling analysis results.
```
```{r, eval = '{{algMaxent}}' == "maxent.jar", include = viz.resp}
# view response curves for environmental variables with non-zero coefficients
dismo::response(evalMods[["{{modSel}}"]], var = {{mxNonZeroCoefs}})
```
```{r, eval = '{{algMaxent}}' == "maxnet", include = viz.resp, warning = FALSE, message = FALSE}
# view response curves for environmental variables with non-zero coefficients
plot(evalMods[["{{modSel}}"]], vars = {{mxNonZeroCoefs}}, type = "cloglog")
```
```r
# view BIOCLIM envelope plot
plot(evalMods[['BIOCLIM']], a = {{bcPlot1}}, b = {{bcPlot2}}, p = {{bcPlotP}})
```
```r
# view ENMeval results
ENMeval::eval.plot(evalTbl, value = "{{mxEvalSel}}")
```
```r
# select your model from the models list
mod <- evalMods[["{{modSel}}"]]
```
```r
# generate BIOCLIM prediction
pred <- evalPreds[["{{modSel}}"]]
```
```r
# generate raw prediction
pred <- evalPreds[["{{modSel}}"]]
```
```r
# generate logistic prediction
pred <- ENMeval::maxnet.predictRaster(mod, envsBgMsk, type = 'logistic', clamp = {{clamp}})
```
```r
# generate cloglog prediction
pred <- ENMeval::maxnet.predictRaster(mod, envsBgMsk, type = 'cloglog', clamp = {{clamp}})
```
```r
# generate logistic prediction
pred <- dismo::predict(mod, envsBgMsk, args = c("outputformat=logistic"))
```
```r
# generate cloglog prediction
pred <- dismo::predict(mod, envsBgMsk, args = c("outputformat=cloglog"))
```
```r
# get predicted values for occurrence grid cells
occPredVals <- raster::extract(pred, occs.xy)
# define minimum training presence threshold
thr <- thresh(occPredVals, "{{comp7.thresh}}")
# threshold model prediction
pred <- pred > thr
```
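`thresh()` is another helper from `functions.R`. Conceptually, the common thresholding rules reduce to simple summaries of the predicted values at the occurrence points; a minimal sketch of that logic (the helper's exact implementation and option names may differ) is:

```r
# hypothetical sketch: minimum training presence ('mtp') uses the lowest occurrence
# prediction; 'p10' drops the lowest 10% of occurrence predictions
threshSketch <- function(vals, type) {
  if (type == "mtp") {
    min(vals, na.rm = TRUE)
  } else if (type == "p10") {
    quantile(vals, probs = 0.1, na.rm = TRUE, names = FALSE)
  }
}
```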
```r
# plot the model prediction
plot(pred)
```
```{asis, echo = pj.any, eval = pj.any, include = pj.any}
### Project Niche Model
You selected to project your model. First define a polygon with the coordinates you chose, then crop and mask your predictor rasters. Finally, predict suitability values for these new raster cells based on the model you selected.
```
```r
projCoords <- data.frame(x = {{occsPjX}}, y = {{occsPjY}})
projPoly <- sp::SpatialPolygons(list(sp::Polygons(list(sp::Polygon(projCoords)), ID = 1)))
```
```{asis, echo = pj.area & alg.maxnet, eval = pj.area, include = pj.area}
Now crop and mask the predictor variables by projPoly, and use the maxnet.predictRaster() function to predict the values for the new extent based on the model selected.
```
```r
predsProj <- raster::crop(envs, projPoly)
predsProj <- raster::mask(predsProj, projPoly)
proj <- ENMeval::maxnet.predictRaster(mod, predsProj, type = 'exponential', clamp = {{clamp}})
```
```r
predsProj <- raster::crop(envs, projPoly)
predsProj <- raster::mask(predsProj, projPoly)
proj <- ENMeval::maxnet.predictRaster(mod, predsProj, type = 'logistic', clamp = {{clamp}})
```
```r
predsProj <- raster::crop(envs, projPoly)
predsProj <- raster::mask(predsProj, projPoly)
proj <- ENMeval::maxnet.predictRaster(mod, predsProj, type = 'cloglog', clamp = {{clamp}})
```
```{asis, echo = pj.area & alg.maxent, eval = pj.area, include = pj.area}
Now crop and mask the predictor variables by projPoly, and use the dismo::predict() function to predict the values for the new extent based on the model selected.
```
```r
predsProj <- raster::crop(envs, projPoly)
predsProj <- raster::mask(predsProj, projPoly)
proj <- dismo::predict(mod, predsProj, args = c("outputformat=raw"))
```
```r
predsProj <- raster::crop(envs, projPoly)
predsProj <- raster::mask(predsProj, projPoly)
proj <- dismo::predict(mod, predsProj, args = c("outputformat=logistic"))
```
```r
predsProj <- raster::crop(envs, projPoly)
predsProj <- raster::mask(predsProj, projPoly)
proj <- dismo::predict(mod, predsProj, args = c("outputformat=cloglog"))
```
```{asis, echo = pj.area & mod.bioclim, eval = pj.area, include = pj.area}
Now crop and mask the predictor variables by projPoly, and use the dismo::predict() function to predict the values for the new extent based on the model selected.
```
```r
predsProj <- raster::crop(envs, projPoly)
predsProj <- raster::mask(predsProj, projPoly)
proj <- dismo::predict(mod, predsProj)
```
```{asis, echo = pj.time & alg.maxnet, eval = pj.time, include = pj.time}
### Project Niche Model to New Time
Now download the future climate variables chosen with *Wallace*, crop and mask them by projPoly, and use the maxnet.predictRaster() function to predict the values for the new time based on the model selected.
```
```{asis, echo = pj.time & alg.maxent, eval = pj.time, include = pj.time}
### Project Niche Model to New Time
Now download the future climate variables chosen with *Wallace*, crop and mask them by projPoly, and use the dismo::predict() function to predict the values for the new time based on the model selected.
```
```{asis, echo = pj.time & mod.bioclim, eval = pj.time, include = pj.time}
### Project Niche Model to New Time
Now download the future climate variables chosen with *Wallace*, crop and mask them by projPoly, and use the dismo::predict() function to predict the values for the new time based on the model selected.
```
```r
envsFuture <- raster::getData("CMIP5", var = "bio", res = {{bcRes}},
                              rcp = {{pjRCP}}, model = "{{pjGCM}}", year = {{pjYear}})
predsProj <- raster::crop(envsFuture, projPoly)
predsProj <- raster::mask(predsProj, projPoly)
# rename future climate variable names
names(predsProj) <- paste0('bio', sprintf("%02d", 1:19))
# select climate variables
predsProj <- raster::subset(predsProj, names(envs))
```
```r
# predict model
proj <- ENMeval::maxnet.predictRaster(mod, predsProj, type = 'exponential', clamp = {{clamp}})
```
```r
# predict model
proj <- ENMeval::maxnet.predictRaster(mod, predsProj, type = 'logistic', clamp = {{clamp}})
```
```r
# predict model
proj <- ENMeval::maxnet.predictRaster(mod, predsProj, type = 'cloglog', clamp = {{clamp}})
```
```r
# predict model
proj <- dismo::predict(mod, predsProj, args = c('outputformat=raw'))
```
```r
# predict model
proj <- dismo::predict(mod, predsProj, args = c('outputformat=logistic'))
```
```r
# predict model
proj <- dismo::predict(mod, predsProj, args = c('outputformat=cloglog'))
```
```r
# predict model
proj <- dismo::predict(mod, predsProj)
```
```r
# get predicted values for occurrence grid cells
occPredVals <- raster::extract(pred, occs.xy)
# define minimum training presence threshold
thr <- thresh(occPredVals, "{{comp8.thresh}}")
# threshold model prediction
proj <- proj > thr
```
```r
# plot the model prediction
plot(proj)
```
```{asis, echo = pj.mess, eval = pj.mess, include = pj.mess}
To visualize the environmental difference between the occurrence localities and your selected projection extent, calculate a multidimensional environmental similarity surface (MESS). High negative values mean great difference, whereas high positive values mean great similarity. Interpreting the projected suitability for areas with high negative values should be done with extreme caution, as they are outside the environmental range of the occurrence localities.
```
```r
# extract environmental values from occurrence localities and background -- these were the values that went into the model
names(bg.xy) <- names(occs.xy)
all.xy <- rbind(occs.xy, bg.xy)
occEnvVals <- raster::extract(envs, all.xy)
# compare these values with the projection extent (predsProj)
proj.mess <- dismo::mess(predsProj, occEnvVals)
plot(proj.mess)
```
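A common follow-up to a MESS map (not part of the original session code) is to restrict the projection to analog environments, i.e. to blank out cells with negative MESS values before interpreting suitability there:

```r
# keep projected suitability only where MESS is non-negative (analog conditions)
proj.analog <- raster::mask(proj, proj.mess >= 0, maskvalue = 0)
plot(proj.analog)
```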