sdmSetting: creating sdmSetting object


Description

Creates an sdmSetting object that holds the settings used to fit and evaluate the models. The object can be used to reproduce a study.

Usage

sdmSetting(formula,data,methods,interaction.depth=1,n=1,replication=NULL,cv.folds=NULL,
     test.percent=NULL,bg=NULL,bg.n=NULL,var.importance=NULL,response.curve=TRUE,
     var.selection=FALSE,ncore=1L,modelSettings=NULL,seed=NULL,parallelSettings=NULL,...)

Arguments

formula

specifies the structure of the model

data

sdm data object or data.frame including species and feature data

methods

character, name of the algorithms

interaction.depth

level of interactions between predictors

n

number of replicates (runs)

replication

replication method (e.g., 'subsampling', 'bootstrapping', 'cv')

cv.folds

number of folds if cv (cross-validation) is in the selected replication methods

test.percent

test percentage if subsampling is in the selected replication methods

bg

method to generate background

bg.n

number of background records

var.importance

logical, whether variable importance should be calculated

response.curve

logical, whether response curves should be calculated

var.selection

logical, whether variable selection should be considered

ncore

number of cores to parallelize processing

modelSettings

optional list; settings for modelling methods can be specified by users

seed

default is NULL; either a logical value specifying whether a seed for the random number generator should be set, or a number specifying the exact seed

parallelSettings

default is NULL; a list of settings for parallel processing. The items include ncore, method, type, hosts, doParallel, and fork; see Details for more information.

...

additional arguments

Details

Using sdmSetting, the feature types, interaction.depth, and all other settings of the model can be defined. The function generates an sdmSetting object, which is particularly helpful for reproducibility: a user can share the object so that it can be reused in other studies.

To reproduce the same results every time the code is run with the same data and settings, a seed should be specified. The seed argument accepts three kinds of value: NULL means that no seed is set (if random sampling is part of the modelling procedure, the results will differ between runs); TRUE means that a seed is set (the seed number is selected randomly and reused every time the same settings are used); a number means that the seed is set to the number specified by the user.
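The three seed options can be sketched as follows. This is a minimal illustration, not the package's internal logic: the first part uses only base R to show why a fixed seed makes random sampling repeatable, and the second part passes each documented form of seed to sdmSetting. The seed value 42, the object names, and the choice of 'glm' with two replicates are assumptions made for the example; the sdm calls run only if the package is installed.

```r
# Base-R illustration: resetting the same seed reproduces the same random draw.
set.seed(42)
first_draw <- sample(100, 5)
set.seed(42)
second_draw <- sample(100, 5)
same_draws <- identical(first_draw, second_draw)  # TRUE: same seed, same draw

# The three documented forms of the seed argument (skipped if sdm is missing).
if (requireNamespace("sdm", quietly = TRUE)) {
  library(sdm)
  df <- read.csv(system.file("external/pa_df.csv", package = "sdm"))
  d <- sdmData(sp ~ b15 + NDVI, train = df)

  # seed = NULL (the default): no seed is set, so runs may differ
  s_none <- sdmSetting(sp ~ ., data = d, methods = "glm",
                       replication = "sub", test.percent = 30, n = 2)

  # seed = TRUE: a seed number is selected randomly and reused with these settings
  s_auto <- sdmSetting(sp ~ ., data = d, methods = "glm",
                       replication = "sub", test.percent = 30, n = 2,
                       seed = TRUE)

  # seed = 42 (illustrative value): the seed is set to the given number
  s_fixed <- sdmSetting(sp ~ ., data = d, methods = "glm",
                        replication = "sub", test.percent = 30, n = 2,
                        seed = 42)
}
```

A numeric seed is the easiest option to quote in a report, while seed = TRUE relies on the settings object itself being shared.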

For parallel processing, a list of items can be passed to parallelSettings, including:

ncore: defines the number of cores (it can also be specified outside this list through the ncore argument, but that option will be removed in the future)

method: defines the platform (set of functions) used to run the parallelisation. Currently two options, 'parallel' and 'foreach', are implemented; the default is 'parallel'

doParallel: optional; a definition that registers a backend for parallel processing (currently used when method='foreach'). It should be provided as an R expression.

cluster: optional; if a cluster has already been created and started, it can be passed through this item to be used as the parallel processing platform (currently used when method='parallel')

hosts: a list of addresses of the accessible hosts (remote clusters) to be registered and used in parallel processing (may not work properly, as it is still under development!)

fork: logical; available on non-Windows operating systems only; specifies whether forking should be used for the parallelisation. Default is TRUE.
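A parallelSettings list built from the items above might look like the following sketch. The values (two cores, four replicates, 'glm') and the names can_fork and ps are assumptions chosen for illustration; only the item names ncore, method, and fork come from this page. The sdmSetting call runs only if the sdm package is installed.

```r
# Forking is documented as unavailable on Windows, so enable it only elsewhere.
can_fork <- .Platform$OS.type != "windows"

# Items taken from the documented list; the values are illustrative.
ps <- list(
  ncore  = 2,           # number of cores
  method = "parallel",  # documented default; 'foreach' is the other option
  fork   = can_fork     # logical, whether forking should be used
)

if (requireNamespace("sdm", quietly = TRUE)) {
  library(sdm)
  df <- read.csv(system.file("external/pa_df.csv", package = "sdm"))
  d <- sdmData(sp ~ b15 + NDVI, train = df)

  # Pass the list through the parallelSettings argument.
  s <- sdmSetting(sp ~ ., data = d, methods = "glm",
                  replication = "sub", test.percent = 30, n = 4,
                  parallelSettings = ps)
  print(s)
}
```

Keeping ncore inside the list, rather than in the separate ncore argument, follows the note above that the outside form will be removed in the future.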

Value

an object of class sdmSettings

Author(s)

Babak Naimi naimi.b@gmail.com

http://r-gis.net

http://biogeoinformatics.org

References

Naimi, B., Araujo, M.B. (2016) sdm: a reproducible and extensible R platform for species distribution modelling, Ecography, DOI: 10.1111/ecog.01881

Examples

## Not run: 
file <- system.file("external/pa_df.csv", package="sdm")

df <- read.csv(file)

head(df) 

d <- sdmData(sp~b15+NDVI,train=df)

# generate an sdmSetting object (the sdmData object d is passed as data):
s <- sdmSetting(sp~., data=d, methods=c('glm','gam','brt','svm','rf'),
          replication='sub',test.percent=30,n=10,modelSettings=list(brt=list(n.trees=500)))

s

## End(Not run)

sdm documentation built on April 30, 2020, 1:04 a.m.