sdmData: creating sdm Data object

Description

Creates an sdmdata object that holds species data (for a single species or multiple species) and explanatory variables. Additional information, such as spatial coordinates, time, grouping variables, and metadata (e.g., author, date, reference), can also be included.

Usage

sdmData(formula, train, test, predictors, bg, filename, crs, ...)

Arguments

formula

Specifies which species and explanatory variables should be taken from the input data. Other information (e.g., spatial coordinates, grouping variables, time) can be specified as well.

train

Training data containing species observations, as a data.frame, SpatialPoints, or SpatialPointsDataFrame. It may contain predictor variables as well.

test

Independent test data with the same structure as the train data

predictors

Explanatory variables (predictors), defined as a raster object (RasterStack or RasterBrick). Required if the train data contain only species records, or if background records (pseudo-absences) should be generated.

bg

Background data (pseudo-absences), as a data.frame. It can also be a list containing the settings used to generate background data (in which case a Raster object is required in the predictors argument).

filename

Name of the file in which the sdmdata object is stored on disk.

crs

Optional; the coordinate reference system.

...

Additional arguments (optional) that are used to create a metadata object. See details
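
The train and test arguments described above can be illustrated with a simple random split. The following is a minimal sketch and not part of the package examples: the data are synthetic, the column names (sp, b15, NDVI) only mirror the bundled example data, and the 70/30 split is an arbitrary assumption. The sdmData call is attempted only if the sdm package is installed.

```r
# Synthetic presence-absence data (illustrative only; values are random)
set.seed(42)
n <- 100
df <- data.frame(
  sp   = rbinom(n, 1, 0.5),   # 0/1 species records
  b15  = runif(n, 10, 80),    # hypothetical predictor values
  NDVI = runif(n, 30, 200)
)

# Hold out 30 of the 100 rows as independent test data (arbitrary proportion)
train_idx <- sample(n, size = 70)
train_df  <- df[train_idx, ]
test_df   <- df[-train_idx, ]

# test should have the same structure (columns) as train
stopifnot(identical(names(train_df), names(test_df)))

if (requireNamespace("sdm", quietly = TRUE)) {
  library(sdm)
  d <- sdmData(sp ~ b15 + NDVI, train = train_df, test = test_df)
  print(d)
}
```

If the call succeeds, the printed summary should report that independent test data are present.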

Details

sdmData creates a data object for a single species or for multiple species. It can automatically detect the variables containing species data (if a data.frame is provided in train), but it is recommended to use a formula, in which the species are given on the left-hand side (e.g., sp1+sp2+sp3 ~ .) and the explanatory variables on the right-hand side. Additional information, such as spatial coordinates, time, or variables by which the observations can be grouped, can also be specified on the right-hand side of the formula in a flexible way, e.g., ~ . + coords(x+y) + g(var). This right-hand side specifies all variables (.), x and y as spatial coordinates, and grouping of the observations based on the variable var. For grouping, the variable (var in this example) should be categorical, i.e., a factor.
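
The formula interface described above can be sketched as follows. This is an illustration under stated assumptions rather than a package example: the data.frame is synthetic, the column name var and its levels are hypothetical, and the formula combines explicitly named predictors with the coords() and g() terms mentioned in the text. The sdmData call is attempted only if the sdm package is installed.

```r
# Synthetic data with coordinates and a categorical grouping variable
# (all values and the name 'var' are made up for illustration)
set.seed(1)
n <- 100
df <- data.frame(
  sp   = rbinom(n, 1, 0.5),
  b15  = runif(n, 10, 80),
  NDVI = runif(n, 30, 200),
  x    = runif(n, 0, 1e6),
  y    = runif(n, 4.0e6, 4.8e6),
  var  = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)

# Species on the left-hand side; predictors, coordinates, and the
# grouping variable on the right-hand side
f <- sp ~ b15 + NDVI + coords(x + y) + g(var)

# Base R view of the column names the formula refers to
print(all.vars(f))

# The grouping variable should be a factor
stopifnot(is.factor(df$var))

if (requireNamespace("sdm", quietly = TRUE)) {
  library(sdm)
  d <- sdmData(f, train = df)
  print(d)
}
```

Naming the predictors explicitly, instead of using ".", avoids the situation noted in the Examples where unlisted columns are taken as covariates.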

Additional arguments can be provided to set metadata, including author, website, citation, help, description, date, and license.

Value

an object of class sdmdata

Author(s)

Babak Naimi naimi.b@gmail.com

https://www.r-gis.net/

https://www.biogeoinformatics.org

References

Naimi, B., Araujo, M.B. (2016) sdm: a reproducible and extensible R platform for species distribution modelling, Ecography, 39:368-375, DOI: 10.1111/ecog.01881

Examples

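## Not run: 
library(sdm)     # provides sdmData()
library(raster)  # provides shapefile() and brick(), used in Examples 4 and 5

# Example 1: a data.frame containing records for a species (sp) and two predictors (b15 & NDVI):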

file <- system.file("external/pa_df.csv", package="sdm")

df <- read.csv(file)

head(df) 

d <- sdmData(sp~b15+NDVI,train=df)

d

# or simply:
d <- sdmData(sp~.,train=df)

d

#--------
# if formula is not specified, the function tries to detect species and covariates; this works well
# only if the dataset contains no columns other than species and covariates!

d <- sdmData(train=df)

d

# only the right-hand side of the formula is specified (one covariate), so the function detects the species:
d <- sdmData(~NDVI,train=df) 

d 

#----------
###########
# Example 2: a data.frame containing presence-absence records for 1 species, 4 covariates, and 
# x, y coordinates:

file <- system.file("external/pa_df_with_xy.csv", package="sdm")

df <- read.csv(file)

head(df) 

d <- sdmData(sp~b15+NDVI+categoric1+categoric2+coords(x+y),train=df) 

d
#----
# categoric1 and categoric2 are categorical variables (factors); if you are not sure the data.frame
# stores them as factors, this can be specified in the formula:
d <- sdmData(sp~b15+NDVI+f(categoric1)+f(categoric2)+coords(x+y),train=df) 

d
# simpler forms of the formula:
d <- sdmData(sp~.+coords(x+y),train=df) 

d

d <- sdmData(~.+coords(x+y),train=df)  # function detects the species

d
##############
# Example 3: a data.frame containing presence-absence records for 10 species:

file <- system.file("external/multi_pa_df.csv", package="sdm")

df <- read.csv(file)

head(df) 

# in the following formula, spatial coordinates columns are specified, and the rest is asked to
# be detected by the function:
d <- sdmData(~.+coords(x+y),train=df)  

d

#--- or you can specify which species and which covariates are needed:
d <- sdmData(sp1+sp2+sp3~b15+NDVI+f(categoric1) + coords(x+y),train=df) 

d # 3 species, 3 covariates, and coordinates
# just be careful that if you put "." in the right hand side, while not all species columns or 
# additional columns (e.g., coordinates, time) are specified in the formula, then it takes those
# columns as covariates which is NOT right!

#########
# Example 4: Spatial data:

file <- system.file("external/pa_spatial_points.shp", package="sdm") # path to a shapefile

# use a package like rgdal, or maptools, or shapefile function in package raster to read shapefile:
p <- shapefile(file)
class(p) # a "SpatialPointsDataFrame"

plot(p)

head(p) # it contains data for 3 species

# presence-absence plot for the first species (i.e., sp1)
plot(p[p@data$sp1 == 1,],col='blue',pch=16, main='Presence-Absence for sp1')

points(p[p@data$sp1 == 0,],col='red',pch=16)


# Let's read raster dataset containing predictor variables for this study area:

file <- system.file("external/predictors.grd", package="sdm") # path to a raster object

r <- brick(file)

r # a RasterBrick object including 2 rasters (covariates)

plot(r)

# now, we can use the species points and predictor rasters in sdmData function:
d <- sdmData(sp1+sp2+sp3~b15+NDVI,train=p,predictors = r)

d

##################
# Example 5: presence-only records:


file <- system.file("external/po_spatial_points.shp", package="sdm") # path to a shapefile

# use an appropriate function to read the shapefile (e.g., readOGR in rgdal, readShapeSpatial in 
# maptools, or shapefile in raster):

po <- shapefile(file)
class(po) # a "SpatialPointsDataFrame"


head(po) # it contains data for one species (sp4) and the column has only presence records!


d <- sdmData(sp4~b15+NDVI,train=po,predictors = r)

d # as you see in the type, the data is Presence-Only

### we can add another argument (i.e., bg) to generate background (pseudo-absence) records:

#------ in bg, we are going to provide a list containing the setting to generate background
#------ the setting includes n (number of background records), method (the method used for 
#------ background generation; gRandom refers to random in geographic space), and remove (whether 
#------ points located in presence sites should be removed).

d <- sdmData(sp4~b15+NDVI,train=po,predictors = r,bg=list(n=1000,method='gRandom',remove=TRUE))

d       # as you see in the type, the data is Presence-Background

# alternatively, you can put a data.frame including background records in bg!

## End(Not run)

Example output

Loading required package: sp
sdm 1.0-89 (2020-04-22)
       b15     NDVI sp
1 25.99306 194.8277  0
2 54.52778 109.7022  0
3 26.97917 108.4545  1
4 32.83333  92.9647  0
5 42.79167 178.8910  1
6 31.42361 100.5012  1
Loading required package: dismo
Loading required package: raster
Loading required package: gbm
Loaded gbm 2.1.8
Loading required package: tree
Loading required package: mda
Loading required package: class
Loaded mda 0.5-2

Loading required package: mgcv
Loading required package: nlme

Attaching package: 'nlme'

The following object is masked from 'package:raster':

    getData

This is mgcv 1.8-33. For overview type 'help("mgcv-package")'.
Loading required package: glmnet
Loading required package: Matrix
Loaded glmnet 4.0-2
Loading required package: earth
Loading required package: Formula
Loading required package: plotmo
Loading required package: plotrix
Loading required package: TeachingDemos
Loading required package: rJava
Loading required package: RSNNS
Loading required package: Rcpp
Loading required package: randomForest
randomForest 4.6-14
Type rfNews() to see new features/changes/bug fixes.
Loading required package: rpart
Loading required package: kernlab

Attaching package: 'kernlab'

The following objects are masked from 'package:raster':

    buffer, rotated

class                                 : sdmdata 
=========================================================== 
number of species                     :  1 
species names                         :  sp 
number of features                    :  2 
feature names                         :  b15, NDVI 
type                                  :  Presence-Absence 
has independet test data?             :  FALSE 
number of records                     :  149 
has Coordinates?                      :  FALSE 
class                                 : sdmdata 
=========================================================== 
number of species                     :  1 
species names                         :  sp 
number of features                    :  2 
feature names                         :  b15, NDVI 
type                                  :  Presence-Absence 
has independet test data?             :  FALSE 
number of records                     :  149 
has Coordinates?                      :  FALSE 
class                                 : sdmdata 
=========================================================== 
number of species                     :  1 
species names                         :  sp 
number of features                    :  2 
feature names                         :  b15, NDVI 
type                                  :  Presence-Absence 
has independet test data?             :  FALSE 
number of records                     :  149 
has Coordinates?                      :  FALSE 
class                                 : sdmdata 
=========================================================== 
number of species                     :  1 
species names                         :  sp 
number of features                    :  1 
feature names                         :  NDVI 
type                                  :  Presence-Absence 
has independet test data?             :  FALSE 
number of records                     :  149 
has Coordinates?                      :  FALSE 
       b15     NDVI categoric1 categoric2      x       y sp
1 25.99306 194.8277          A         ff 494375 4774936  0
2 54.52778 109.7022          B         cc 214375 4244936  0
3 26.97917 108.4545          B         bb 364375 4674936  1
4 32.83333  92.9647          E         cc 564375 4344936  0
5 42.79167 178.8910          B         ff  94375 4744936  1
6 31.42361 100.5012          D         cc 634375 4474936  1
class                                 : sdmdata 
=========================================================== 
number of species                     :  1 
species names                         :  sp 
number of features                    :  4 
feature names                         :  b15, NDVI, categoric1, ... 
which feature is categorical (factor) :  categoric1, categoric2 
type                                  :  Presence-Absence 
has independet test data?             :  FALSE 
number of records                     :  150 
has Coordinates?                      :  TRUE 
class                                 : sdmdata 
=========================================================== 
number of species                     :  1 
species names                         :  sp 
number of features                    :  4 
feature names                         :  b15, NDVI, categoric1, ... 
which feature is categorical (factor) :  categoric1, categoric2 
type                                  :  Presence-Absence 
has independet test data?             :  FALSE 
number of records                     :  150 
has Coordinates?                      :  TRUE 
class                                 : sdmdata 
=========================================================== 
number of species                     :  1 
species names                         :  sp 
number of features                    :  4 
feature names                         :  b15, NDVI, categoric1, ... 
which feature is categorical (factor) :  categoric1, categoric2 
type                                  :  Presence-Absence 
has independet test data?             :  FALSE 
number of records                     :  150 
has Coordinates?                      :  TRUE 
class                                 : sdmdata 
=========================================================== 
number of species                     :  1 
species names                         :  sp 
number of features                    :  4 
feature names                         :  b15, NDVI, categoric1, ... 
which feature is categorical (factor) :  categoric1, categoric2 
type                                  :  Presence-Absence 
has independet test data?             :  FALSE 
number of records                     :  150 
has Coordinates?                      :  TRUE 
       b15     NDVI categoric1 categoric2      x       y sp1 sp2 sp3 sp4 sp5
1 25.99306 194.8277          A         ff 494375 4774936   1   0   0   0   1
2 54.52778 109.7022          B         cc 214375 4244936   0   0   0   0   1
3 26.97917 108.4545          B         bb 364375 4674936   1   0   0   0   1
4 32.83333  92.9647          E         cc 564375 4344936   1   0   0   0   1
5 42.79167 178.8910          B         ff  94375 4744936   1   0   1   1   0
6 31.42361 100.5012          D         cc 634375 4474936   1   0   1   0   1
  sp6 sp7 sp8 sp9 sp10
1   0   0   0   1    0
2   0   0   1   1    0
3   0   0   0   1    1
4   0   0   0   1    1
5   0   0   1   0    1
6   0   0   0   1    1
class                                 : sdmdata 
=========================================================== 
number of species                     :  10 
species names                         :  sp1, sp2, sp3, ... 
number of features                    :  4 
feature names                         :  b15, NDVI, categoric1, ... 
which feature is categorical (factor) :  categoric1, categoric2 
type                                  :  Presence-Absence 
has independet test data?             :  FALSE 
number of records                     :  150 
has Coordinates?                      :  TRUE 
class                                 : sdmdata 
=========================================================== 
number of species                     :  3 
species names                         :  sp1, sp2, sp3 
number of features                    :  3 
feature names                         :  b15, NDVI, categoric1 
which feature is categorical (factor) :  categoric1 
type                                  :  Presence-Absence 
has independet test data?             :  FALSE 
number of records                     :  150 
has Coordinates?                      :  TRUE 
Warning message:
In .local(x, ...) : .prj file is missing
[1] "SpatialPointsDataFrame"
attr(,"package")
[1] "sp"
  sp1 sp2 sp3 coords_x1 coords_x2
1   1   0   0    494375   4774936
2   0   0   0    214375   4244936
3   1   0   0    364375   4674936
4   1   0   0    564375   4344936
5   1   0   1     94375   4744936
6   1   0   1    634375   4474936
class      : RasterBrick 
dimensions : 89, 105, 9345, 2  (nrow, ncol, ncell, nlayers)
resolution : 10000, 10000  (x, y)
extent     : -20625, 1029375, 3979936, 4869936  (xmin, xmax, ymin, ymax)
crs        : NA 
source     : /usr/lib/R/site-library/sdm/external/predictors.grd 
names      :       b15,      NDVI 
min values :  11.50000,  27.59722 
max values :  77.28571, 203.31061 

class                                 : sdmdata 
=========================================================== 
number of species                     :  3 
species names                         :  sp1, sp2, sp3 
number of features                    :  2 
feature names                         :  b15, NDVI 
type                                  :  Presence-Absence 
has independet test data?             :  FALSE 
number of records                     :  150 
has Coordinates?                      :  TRUE 
Warning message:
In .local(x, ...) : .prj file is missing
[1] "SpatialPointsDataFrame"
attr(,"package")
[1] "sp"
  sp4 coords_x1 coords_x2
1   1    494375   4774936
2   1    214375   4244936
3   1    364375   4674936
4   1    564375   4344936
5   1    634375   4474936
6   1    624375   4314936
class                                 : sdmdata 
=========================================================== 
number of species                     :  1 
species names                         :  sp4 
number of features                    :  2 
feature names                         :  b15, NDVI 
type                                  :  Presence-Only 
has independet test data?             :  FALSE 
number of records                     :  79 
has Coordinates?                      :  TRUE 
class                                 : sdmdata 
=========================================================== 
number of species                     :  1 
species names                         :  sp4 
number of features                    :  2 
feature names                         :  b15, NDVI 
type                                  :  Presence-Background 
has independet test data?             :  FALSE 
number of records                     :  1068 
has Coordinates?                      :  TRUE 

sdm documentation built on Nov. 12, 2021, 9:06 a.m.
