MultiLL: Set up modelling multiple species from NBN atlas using...


View source: R/FunctionMultiLL.R

Description

This function models multiple species in a single run using the JNCCSdms package.

Usage

MultiLL(sp_list = sp_list, out_flder = "Outputs/", dat_flder,
  bkgd_flder, vars, max_tries = 1, minyear = 0, maxyear = 0,
  mindata = 5000, covarRes = 300, models = c("MaxEnt", "BioClim",
  "SVM", "RF", "GLM", "GAM", "BRT"), prop_test_data = 0.25,
  mult_prssr = FALSE, rndm_occ = TRUE, GBonly = TRUE,
  xCol = "Latitude (WGS84)", yCol = "Longitude (WGS84)",
  precisionCol = "Coordinate uncertainty (m)", yearCol = "Year")

Arguments

sp_list

List of unique species names which you wish to model.

out_flder

The location of the output folder for your models.

dat_flder

The location of the folder containing your species occurrence data, as txt or csv files exported from NBN Gateway or NBN Atlas. Each file should contain data for a single species, and each file name must match the corresponding entry in your species list in order to be recognised, e.g. 'Triturus cristatus' in sp_list should have a corresponding data file named 'Triturus cristatus.csv' in dat_flder.
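
Before running the function, it can help to confirm that every name in sp_list has a matching file. The following is a minimal sketch in base R; the helper name missing_species_files and the demonstration folder are hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): return the species in
# sp_list that have no matching .csv or .txt file in dat_flder.
missing_species_files <- function(sp_list, dat_flder) {
  has_file <- vapply(sp_list, function(sp) {
    any(file.exists(file.path(dat_flder, paste0(sp, c(".csv", ".txt")))))
  }, logical(1))
  sp_list[!has_file]
}

# Demonstration in a temporary folder holding one species file
demo_dir <- file.path(tempdir(), "Inputs_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "Triturus cristatus.csv"))
not_found <- missing_species_files(c("Triturus cristatus", "Bufo bufo"), demo_dir)
```

Any species returned in not_found would need a data file adding, or removing from sp_list, before modelling.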

bkgd_flder

The location of the folder containing your background masks. These should be raster files showing the background area in which pseudo-absence points will be placed. Cells from which background points should be taken should have a value of 1 and excluded cells should be NA. Each mask should be named after the taxon group, e.g. 'amphibian'. If the taxon group cannot be found in the data through a 'taxonGroup' variable, pseudo-absences will be generated from the variables layer.

vars

A RasterStack of the environmental parameters to be used as predictor variables for the species range.

max_tries

The number of times the model is run.

minyear

Numeric, the earliest year from which data should be selected. Year inclusive, data older than this will be discarded.

maxyear

Numeric, the latest year from which data should be used. Year inclusive, data newer than this will be discarded.
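
As an illustration of the inclusive year bounds described for minyear and maxyear, here is a minimal sketch in base R. The records data frame is made up for the example, and the sketch assumes both bounds are set to real years; it is not the package's own filtering code.

```r
# Made-up records; the column name matches the yearCol default.
records <- data.frame(Year = c(1995, 2000, 2003, 2007, 2010))
minyear <- 2000
maxyear <- 2007

# Inclusive at both ends: records from 2000 and 2007 are kept,
# while older (1995) and newer (2010) records are discarded.
kept <- records[records$Year >= minyear & records$Year <= maxyear, , drop = FALSE]
```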

mindata

The target minimum number of data points to return. If this is specified, the lowest resolution data will be discarded if there are enough higher resolution data points available to reach this target.

covarRes

The resolution of the environmental covariate data layers, in metres. Data will not be discarded if it is of higher resolution than the environmental covariate layers.

models

A character vector of the models to run and evaluate. This should be at least one of 'MaxEnt', 'BioClim', 'SVM', 'RF', 'GLM', 'GAM', 'BRT'. Default is to run all models.

prop_test_data

Numeric, the proportion of data to hold back as testing data for evaluating the models. Default is 0.25 (25%).
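
To show what holding back a proportion of the data means in practice, the sketch below splits row indices into training and testing sets in base R. It is an assumption-laden illustration only: the package may partition the data differently, and the record count n is made up.

```r
set.seed(1)               # for a reproducible illustration only
n <- 200                  # made-up number of occurrence records
prop_test_data <- 0.25

# Draw 25% of the row indices as testing data; the rest are training data.
test_idx <- sample(n, size = round(prop_test_data * n))
train_idx <- setdiff(seq_len(n), test_idx)
```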

mult_prssr

Logical, TRUE to set up a parallel backend so that multiple processors are used. Default is FALSE. The suggested packages must be loaded for this to run.

rndm_occ

Logical, TRUE to randomise the locations of presence points where the species occurrence data is low resolution, by calling the randomOcc function. Default is TRUE.

GBonly

Logical, TRUE if you wish to remove Northern Ireland from the records, FALSE if you wish to retain all records.

xCol

The column name of the column in speciesdf giving the species record location as a decimal latitude.

yCol

The column name of the column in speciesdf giving the species record location as a decimal longitude.

precisionCol

The column name of the column in speciesdf denoting the precision of the species record locations. For NBN Atlas this is the "Coordinate uncertainty (m)" column. Where values are denoted with km or m, they will be converted into metres.
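
The unit conversion described above can be pictured with the following base R sketch. The helper precision_to_metres is hypothetical, and the input formats it accepts (such as "10km" or "100m") are assumptions for illustration rather than the package's actual parsing rules.

```r
# Hypothetical helper: convert precision values such as "10km" or "100m"
# to a numeric value in metres. Values without a unit are taken as metres.
precision_to_metres <- function(x) {
  x <- tolower(trimws(as.character(x)))
  value <- as.numeric(sub("\\s*(km|m)$", "", x))  # strip the unit suffix
  ifelse(grepl("km$", x), value * 1000, value)    # scale kilometres only
}

precision_m <- precision_to_metres(c("10km", "100m", "2 km", "50"))
```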

yearCol

The column name in speciesdf giving the year of the record.

datafrom

Whether it is data from the 'NBNgateway' or 'NBNatlas'.

Value

Lists containing predictions from the best models (as a raster layer showing probability of species occurrence), the best model evaluations and the best model itself for each species in a given species list.

Examples

#Provide a list of species you wish to model
sp_list <- c("Notonecta_glauca", "Sigara_dorsalis")

#Organise an Input folder containing your input species files as .csv
dir.create("Inputs")
data("ng_data")
data("sd_data")
names(ng_data)[26] <- "Decimal longitude (WGS84)"
names(sd_data)[26] <- "Decimal longitude (WGS84)"
utils::write.csv(ng_data, file = "./Inputs/Notonecta_glauca.csv")
utils::write.csv(sd_data, file = "./Inputs/Sigara_dorsalis.csv")

#Organise a folder containing your background masks where your pseudo absences will be generated from.
dir.create("BGmasks")
data("background")
latlong = "+init=epsg:4326"
ukgrid = "+init=epsg:27700"
proj4string(background) <- sp::CRS(ukgrid)
background = projectRaster(background, crs = latlong)
save(background, file = "./BGmasks/Hemiptera")

#Create outputs folder
dir.create("Outputs")

#Get variables data
data(vars)
proj4string(vars) <- sp::CRS(ukgrid)
vars = projectRaster(vars, crs = latlong)

#run the function
output <- MultiLL(sp_list = sp_list, vars, out_flder = "Outputs/",
  dat_flder = "Inputs/", bkgd_flder = "BGmasks/", max_tries = 1,
  covarRes = 100, models = c("BioClim", "RF"), prop_test_data = 0.25,
  mult_prssr = FALSE, rndm_occ = TRUE, minyear = 2000, maxyear = 2007,
  GBonly = TRUE, xCol = "Decimal latitude (WGS84)",
  yCol = "Decimal longitude (WGS84)",
  precisionCol = "Coordinate uncertainty in metres", yearCol = "Year")

jncc/sdms documentation built on Aug. 13, 2021, 4:21 a.m.