# All-in-one model-based custom predictions for species in Alberta
```r
if (!require("remotes")) install.packages("remotes")
remotes::install_github("ABbiodiversity/allinone")
```
You will need data and coefficients from the ABbiodiversity/allinone-coefs repository. Clone or download the contents as a zip archive and extract into a folder (the `dir` variable in the example).
```r
dir <- "~/repos/allinone-coefs"
```
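If you prefer to do this from R, here is a minimal sketch of the download-and-extract step (the `main` branch name is an assumption; GitHub zip archives extract to a `<repo>-<branch>` folder, so adjust `dir` accordingly):

```r
## download the coefs repository as a zip archive and extract it
## (branch name and paths below are assumptions, adjust as needed)
zipfile <- tempfile(fileext = ".zip")
download.file(
    "https://github.com/ABbiodiversity/allinone-coefs/archive/refs/heads/main.zip",
    zipfile)
unzip(zipfile, exdir = "~/repos")
dir <- "~/repos/allinone-coefs-main"
```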
If you don't need the spatial raster files from the ABbiodiversity/allinone-coefs repository, the `ai_download_coefs()` function will grab the coefficients for you and you'll be ready to roll:
```r
library(allinone)
# ai_download_coefs()
ai_load_coefs()
```
See all the species that we have coefficients for (`nrow(ai_species())` gives the count):
```r
tab <- ai_species()
str(tab)
```
Here is the number of species by group:
```r
data.frame(table(tab$Group))
```
We use an example data set that shows you how to organize the data:
```r
## example data to see what is needed and how it is formatted
load(system.file("extdata/example.RData", package="allinone"))

## space climate data frame + veg/soil classes
str(spclim)

## veg+HF composition data matrix
colnames(p_veghf)

## soil+HF composition data matrix
colnames(p_soilhf)
```
You need to define the species ID (use the `tab` object to find it) and the bootstrap ID (`i`). The bootstrap ID can be between 1 and 100 (only 1 for mammals and habitat elements).
```r
## define species and bootstrap id
spp <- "AlderFlycatcher"
i <- 1
```
You can use composition data, i.e. the areas or proportions of different landcover types (columns) in spatial units (rows). The corresponding relative abundance values will be returned in matrix format:
```r
## use composition
z1 <- ai_predict(spp, spclim=spclim, veghf=p_veghf, soilhf=p_soilhf, i=i)
str(z1)
```
Having such a matrix format is ideal when further aggregation is to be performed on the output, e.g. when calculating sector effects. We use only the current landscape in this example, and show how to use the model weights (`wN`) to average the north and south results in the overlap zone:
```r
## sector effects
library(mefa4)
lt <- ai_classes()
ltn <- nonDuplicated(lt$north, Label, TRUE)
lts <- nonDuplicated(lt$south, Label, TRUE)
Nn <- groupSums(z1$north, 2, ltn[colnames(z1$north), "Sector"])
Ns <- groupSums(z1$south, 2, lts[colnames(z1$south), "Sector"])
Ns <- cbind(Ns, Forestry=0)
N <- spclim$wN * Nn + (1-spclim$wN) * Ns
colSums(Nn)
colSums(Ns)
colSums(N)
```
When we make predictions for single polygons (which are aggregated in the composition data case), we have classified landcover data. We can provide `veghf` and `soilhf` as vectors of these classes.
Make sure that the class names are consistent with column names in the example data matrices for the north and south, respectively.
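A quick way to check this, using the example objects loaded above (a minimal sketch):

```r
## every class label should match a column of the composition matrices
all(spclim$veghf %in% colnames(p_veghf))
all(spclim$soilhf %in% colnames(p_soilhf))
```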
The function now returns a list of vectors:
```r
## use land cover classes
z2 <- ai_predict(spp, spclim=spclim, veghf=spclim$veghf, soilhf=spclim$soilhf, i=i)
str(z2)

## averaging predictions
avg2 <- spclim$wN * z2$north + (1-spclim$wN) * z2$south
str(avg2)
```
We can use the bootstrap distribution to calculate uncertainty (i.e. confidence intervals, CI):
```r
v <- NULL
for (i in 1:20) {
    zz <- ai_predict(spp, spclim=spclim, veghf=spclim$veghf, soilhf=spclim$soilhf, i=i)
    v <- cbind(v, spclim$wN * zz$north + (1-spclim$wN) * zz$south)
}
t(apply(v[25:30,], 1, quantile, c(0.5, 0.05, 0.95)))
##           50%         5%        95%
## 25 0.31590764 0.26543390 0.35751442
## 26 0.05219450 0.03893699 0.07041478
## 27 0.09576693 0.07895616 0.11126043
## 28 0.04139684 0.03238818 0.05872012
## 29 0.12563301 0.09836205 0.15970154
## 30 0.06754963 0.05384576 0.08798115
```
This gives the median and the 90% CI. Bootstrap-based uncertainty is currently not available for mammals and habitat elements.
Once the predictors are organized, loop over the species IDs from `tab` and store the results in an organized fashion, as sketched below.
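Here is a minimal sketch of such a loop (it assumes the species IDs are the row names of `tab`; check `str(tab)` to see where the IDs live):

```r
## loop over a handful of species and store the averaged
## predictions in a named list (row names as IDs is an assumption)
res <- list()
for (s in rownames(tab)[1:5]) {
    z <- ai_predict(s, spclim=spclim, veghf=spclim$veghf,
        soilhf=spclim$soilhf, i=1)
    res[[s]] <- spclim$wN * z$north + (1 - spclim$wN) * z$south
}
str(res)
```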
The variables in the `spclim` object can be extracted from the raster layers stored in the ABbiodiversity/allinone-coefs repository. Clone or download the contents as a zip archive and extract into a folder (the `dir` variable used here).
```r
library(sf)
library(raster)

## you got some coordinates (degree long/lat)
XY <- spclim[,c("POINT_X", "POINT_Y")]
head(XY)

## make a sf data frame
xy <- sf::st_as_sf(XY, coords = c("POINT_X", "POINT_Y"))

## set CRS
xy <- st_set_crs(xy, "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0")

## variable names
vn <- c("AHM", "FFP", "MAP", "MAT", "MCMT", "MWMT",
    "NSR1CentralMixedwood", "NSR1DryMixedwood", "NSR1Foothills",
    "NSR1Mountain", "NSR1North", "NSR1Parkland", "NSR1Shield",
    "pAspen", "PeaceRiver", "PET", "pWater", "wN")

## link to the rasters & create a stack
rl <- list()
for (j in vn)
    rl[[j]] <- suppressWarnings(raster(
        file.path(dir, "spatial", paste0(j, ".tif"))))
rl <- stack(rl)

## transform xy to Alberta TM
xy <- st_transform(xy, proj4string(rl))
plot(rl[["pAspen"]], axes=FALSE, box=FALSE)
plot(xy$geometry, add=TRUE, pch=3)

## extract info
e <- extract(rl, xy)

## absolute difference should be tiny
max(colSums(abs(spclim[,vn] - e[,vn])))

## we need long/lat as in XY too
spclim2 <- data.frame(XY, e)

## let's see what we get
str(ai_predict(spp, spclim=spclim2, veghf=spclim$veghf, soilhf=spclim$soilhf, i=i))
```
For batch processing, look into the `inst/docker` folder. The API can be deployed using Docker or Docker Compose.
Pass a JSON object (content type: `application/json`) as the request body with longitude, latitude, veg/soil/HF classes, the species ID, and the bootstrap ID (all of these as arrays inside an object):
```json
{
"long":[-112.6178,-110.9309,-112.8359,-113.3896,-113.8632,-112.299,
-110.5805,-111.7806,-112.3541,-113.4144,-110.2329,-114.6634,-111.5857,
-114.2688,-114.4383,-110.4444,-113.6355,-113.5874,-113.7964,-113.9445,
-119.9851,-112.7504,-113.5144,-112.7433,-118.0259,-116.2997,-114.9535,
-111.387,-111.8138,-111.1588],
"lat":[49.5851,51.3653,49.3464,51.344,
49.5951,51.3239,50.2184,50.9328,49.5974,49.3468,53.4814,53.8818,53.1679,
51.1549,51.3357,52.3465,51.7699,53.748,52.7971,52.1959,59.2693,56.3071,
58.269,56.1541,56.3535,59.8075,53.046,58.2175,58.7773,55.0714],
"veghf":["Crop","RoughP","Crop","Crop","TameP","GrassHerb","GrassHerb",
"GrassHerb","Crop","Crop","Crop","Crop","Crop","Rural","Crop","Crop","TameP",
"Crop","Crop","Crop","TreedFenR","TreedBog4","TreedSwamp","TreedBog5",
"CCDeciduousR","TreedFen8","CCDeciduous2","PineR","ShrubbyFen","Mixedwood2"],
"soilhf":["Crop","RoughP","Crop","Crop","TameP","Blowout","ThinBreak",
"RapidDrain","Crop","Crop","Crop","Crop","Crop","Rural","Crop","Crop",
"TameP","Crop","Crop","Crop","RapidDrain","RapidDrain","RapidDrain",
"RapidDrain","RapidDrain","RapidDrain","RoughP","RapidDrain","RapidDrain",
"RapidDrain"],
"spp":["AlderFlycatcher"],
"i":[1]
}
```
The response is a JSON (`application/json`) array with the model-averaged relative abundance values:
```json
[0,0.0001,0,0.0001,0.0002,0,0,0,0,0.0001,0.0044,0.0105,0.0063,
0.0045,0.005,0.0055,0.0059,0.0128,0.0116,0.0082,0.1826,0.0326,
0.072,0.0269,0.3291,0.0538,0.1034,0.0447,0.1266,0.0681]
```
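The same request can also be made from R, e.g. with the httr and jsonlite packages (a minimal sketch assuming the API is running locally on port 8080, as set up below; `toJSON()` keeps length-1 vectors as JSON arrays by default, matching the request format above):

```r
library(httr)
library(jsonlite)
## request body as a list; toJSON() turns these into JSON arrays
body <- list(
    long = c(-112.6178, -111.1588),
    lat = c(49.5851, 55.0714),
    veghf = c("Crop", "Mixedwood2"),
    soilhf = c("Crop", "RapidDrain"),
    spp = "AlderFlycatcher",
    i = 1)
## POST the JSON body and parse the JSON response
r <- POST("http://localhost:8080/", content_type_json(),
    body = toJSON(body))
fromJSON(content(r, as = "text", encoding = "UTF-8"))
```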
A Docker workflow is ideal because the size of the application with all the required spatial layers and coefficients is relatively small.
Change directory to where the Dockerfile is, then build and push the image:
```bash
cd inst/docker
export REGISTRY="psolymos"
export TAG="allinone:latest"

## build the image
docker build -t $REGISTRY/$TAG .
docker push $REGISTRY/$TAG
```
The Docker image can be deployed anywhere; the machine only needs to have the Docker Engine installed. Pull the Docker image to the machine you want to run the API on:
```bash
docker pull $REGISTRY/$TAG
```
In production, run the image with port mapping (`-p host:container`) in the background (`-d`):
```bash
docker run -d --name aiapi -p 8080:8080 $REGISTRY/$TAG
```
Now use curl to make a request and test the API. It can take a single value or a vector of values for `long`, `lat`, `veghf`, and `soilhf`. It is best to provide both `veghf` and `soilhf` and let the model averaging do its job, i.e. use the `"UNK"` soil class in the north. The weights are extracted from a raster (`wN`) too.
```bash
## vector input with model averaging
curl http://localhost:8080/ -d \
  '{"long":[-112.6178,-111.1588],"lat":[49.5851,55.0714],"veghf":["Crop","Mixedwood2"],"soilhf":["Crop","RapidDrain"],"spp":["AlderFlycatcher"],"i":[1]}'
# [0,0.0681]

## single valued input with averaging
curl http://localhost:8080/ -d \
  '{"long":[-111.1588],"lat":[55.0714],"veghf":["Mixedwood2"],"soilhf":["RapidDrain"],"spp":["AlderFlycatcher"],"i":[2]}'
# [0.0605]
```
See the logs:
```bash
docker logs aiapi
# [INFO] This is the All-in-one API
# [INFO] Packages loaded
# [INFO] Raster stack loaded
# [INFO] Loading coefs
# [INFO] 2021-07-24 06:36:14 Starting server: http://0.0.0.0:8080
# [INFO] Making predictions for species AlderFlycatcher (birds) i=1
```
Kill and remove the container:
```bash
docker kill aiapi && docker rm aiapi
```
In production, it is advised to add the Docker background process to systemd or to use Docker Compose. Docker Compose handles container restarts better and can run multiple replicas (simple round-robin load balancing, which requires a proxy server like Nginx or Caddy).
See the `inst/docker/docker-compose.yml` file for details. Deploy with `docker-compose up -d`, which will publish the app on port 8080.
When the containers are already running and the configuration or images have changed after the containers' creation, `docker-compose up` picks up the changes. Tear down with `docker-compose down`.
This Docker image is not very lean because the parent image is huge. If size is a concern, use a smaller parent image and install only the necessary dependencies.