sdmData: creating an sdmdata object


Description

Creates an sdmdata object that holds species data (for a single species or multiple species) and explanatory variables. Additional information such as spatial coordinates, time, grouping variables, and metadata (e.g., author, date, reference) can also be included.

Usage

sdmData(formula, train, test, predictors, bg, filename, crs, ...)

Arguments

formula

Specifies which species and explanatory variables should be taken from the input data. Other information (e.g., spatial coordinates, grouping variables, time) can be specified as well.

train

Training data containing species observations, as a data.frame, SpatialPoints, or SpatialPointsDataFrame. It may contain predictor variables as well.

test

Independent test data with the same structure as the train data

predictors

Explanatory variables (predictors), defined as a raster object (RasterStack or RasterBrick). Required if the train data contain only species records, or if background records (pseudo-absences) should be generated.

bg

Background data (pseudo-absences), as a data.frame. It can also be a list containing the settings used to generate background data (in that case a Raster object is required in the predictors argument).

filename

Name of the file in which the sdmdata object is stored on disk.

crs

Optional; the coordinate reference system.

...

Additional arguments (optional) that are used to create a metadata object. See Details.

Details

sdmData creates a data object for a single species or for multiple species. It can automatically detect the variables containing species data (if a data.frame is provided in train), but it is recommended to use formula, through which all species (on the left-hand side, e.g., sp1+sp2+sp3 ~ .) and the explanatory variables (on the right-hand side) can be specified. If there is additional information such as spatial coordinates, time, or variables by which the observations can be grouped, it can be specified on the right-hand side of the formula in a flexible way, e.g., ~ . + coords(x+y) + g(var). This right-hand side selects all variables (.), uses x and y as spatial coordinates, and groups observations by the variable var. For grouping, the variable (var in this example) should be categorical, i.e., a factor.
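The formula layout described above can be sketched on a small synthetic table. Everything in the toy data.frame (its name, column names, and values) is hypothetical and chosen only to mirror the example in the text; the call to sdmData is skipped when the sdm package is not installed, and its behaviour on this toy input is an assumption rather than a documented result.

```r
# Hypothetical toy data: two species, two predictors, coordinates,
# and a categorical grouping variable (all names are illustrative)
set.seed(1)
n <- 40
toy <- data.frame(
  sp1  = rbinom(n, 1, 0.5),
  sp2  = rbinom(n, 1, 0.5),
  b15  = rnorm(n),
  NDVI = runif(n),
  x    = runif(n, 0, 10),
  y    = runif(n, 0, 10),
  var  = factor(sample(c("a", "b"), n, replace = TRUE))
)

# Species on the left-hand side; predictors, coordinates, and the
# grouping variable on the right-hand side
f <- sp1 + sp2 ~ b15 + NDVI + coords(x + y) + g(var)

# Variable names referenced by the formula (function names such as
# coords and g are not included)
print(all.vars(f))

# Build the data object only if the sdm package is available
if (requireNamespace("sdm", quietly = TRUE)) {
  d <- sdm::sdmData(f, train = toy)
  print(d)
}
```

Storing var as a factor up front matches the requirement that a grouping variable be categorical.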

Additional arguments can be provided to specify metadata, including: author, website, citation, help, description, date, and license.

Value

An object of class sdmdata.

Author(s)

Babak Naimi naimi.b@gmail.com

http://r-gis.net

http://biogeoinformatics.org

References

Naimi, B., Araujo, M.B. (2016) sdm: a reproducible and extensible R platform for species distribution modelling, Ecography, 39:368-375, DOI: 10.1111/ecog.01881

Examples

## Not run: 
library(sdm)

# Example 1: a data.frame containing records for a species (sp) and two predictors (b15 & NDVI):

file <- system.file("external/pa_df.csv", package="sdm")

df <- read.csv(file)

head(df) 

d <- sdmData(sp~b15+NDVI,train=df)

d

# or simply (all remaining columns are used as predictors):
d <- sdmData(sp~.,train=df)

d

#--------
# if formula is not specified, the function tries to detect species and covariates; this works
# well only if the dataset contains no columns other than species and covariates!

d <- sdmData(train=df)

d

# only the right-hand side of the formula is specified (one covariate), so the function detects the species:
d <- sdmData(~NDVI,train=df) 

d 

#----------
###########
# Example 2: a data.frame containing presence-absence records for 1 species, 4 covariates, and 
# x, y coordinates:

file <- system.file("external/pa_df_with_xy.csv", package="sdm")

df <- read.csv(file)

head(df) 

d <- sdmData(sp~b15+NDVI+categoric1+categoric2+coords(x+y),train=df) 

d
#----
# categoric1 and categoric2 are categorical variables (factors); if you are not sure that the
# data.frame stores them as factors, this can be specified in the formula with f():
d <- sdmData(sp~b15+NDVI+f(categoric1)+f(categoric2)+coords(x+y),train=df) 

d
# simpler forms of the formula:
d <- sdmData(sp~.+coords(x+y),train=df) 

d

d <- sdmData(~.+coords(x+y),train=df)  # function detects the species

d
##############
# Example 3: a data.frame containing presence-absence records for 10 species:

file <- system.file("external/multi_pa_df.csv", package="sdm")

df <- read.csv(file)

head(df) 

# in the following formula, spatial coordinates columns are specified, and the rest is asked to
# be detected by the function:
d <- sdmData(~.+coords(x+y),train=df)  

d

#--- or you can specify which species and which covariates are needed:
d <- sdmData(sp1+sp2+sp3~b15+NDVI+f(categoric1) + coords(x+y),train=df) 

d # 3 species, 3 covariates, and coordinates
# just be careful that if you put "." in the right hand side, while not all species columns or 
# additional columns (e.g., coordinates, time) are specified in the formula, then it takes those
# columns as covariates which is NOT right!

#########
# Example 4: Spatial data:

file <- system.file("external/pa_spatial_points.shp", package="sdm") # path to a shapefile

# use a package like rgdal or maptools, or the shapefile function in package raster, to read
# the shapefile:
library(raster) # provides shapefile() and brick(), used below
p <- shapefile(file)
class(p) # a "SpatialPointsDataFrame"

plot(p)

head(p) # it contains data for 3 species

# presence-absence plot for the first species (i.e., sp1)
plot(p[p@data$sp1 == 1,],col='blue',pch=16, main='Presence-Absence for sp1')

points(p[p@data$sp1 == 0,],col='red',pch=16)


# Let's read the raster dataset containing predictor variables for this study area:

file <- system.file("external/predictors.grd", package="sdm") # path to a raster object

r <- brick(file)

r # a RasterBrick object including 2 rasters (covariates)

plot(r)

# now, we can use the species points and predictor rasters in sdmData function:
d <- sdmData(sp1+sp2+sp3~b15+NDVI,train=p,predictors = r)

d

##################
# Example 5: presence-only records:


file <- system.file("external/po_spatial_points.shp", package="sdm") # path to a shapefile

# use an appropriate function to read the shapefile (e.g., readOGR in rgdal, readShapeSpatial in 
# maptools, or shapefile in raster):

po <- shapefile(file)
class(po) # a "SpatialPointsDataFrame"


head(po) # it contains data for one species (sp4) and the column has only presence records!


d <- sdmData(sp4~b15+NDVI,train=po,predictors = r)

d # as you see in the type, the data is Presence-Only

### we can add another argument (i.e., bg) to generate background (pseudo-absence) records:

#------ in bg, we provide a list containing the settings used to generate the background;
#------ the settings include n (number of background records), method (the method used for
#------ background generation; gRandom refers to random in geographic space), and remove (whether
#------ points located in presence sites should be removed).

d <- sdmData(sp4~b15+NDVI,train=po,predictors = r,bg=list(n=1000,method='gRandom',remove=TRUE))

d       # as you see in the type, the data is Presence-Background

# alternatively, you can put a data.frame including background records in bg!

## End(Not run)
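The caution in Example 3 about using "." on the right-hand side can be illustrated with base R alone. The sketch below uses stats::terms, not sdm's own formula parser, so it only shows which columns a plain R formula pulls in through "."; the toy table and its values are hypothetical.

```r
# Hypothetical toy table: two species, two predictors, and x/y coordinates
toy <- data.frame(
  sp1  = c(1, 0, 1, 0),
  sp2  = c(0, 0, 1, 1),
  b15  = c(10.2, 11.5, 9.8, 12.1),
  NDVI = c(0.31, 0.42, 0.28, 0.55),
  x    = c(1, 2, 3, 4),
  y    = c(4, 3, 2, 1)
)

# Base R expands "." to every column that is not on the left-hand side,
# so the second species and the coordinates end up among the predictors
dot_terms <- attr(terms(sp1 ~ ., data = toy), "term.labels")
print(dot_terms)

# Naming the covariates explicitly keeps sp2, x, and y out of the predictor set
explicit_terms <- attr(terms(sp1 ~ b15 + NDVI, data = toy), "term.labels")
print(explicit_terms)
```

This is why the examples either list every species and coordinate column in the formula or name the covariates explicitly.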

sdm documentation built on May 2, 2019, 6:32 p.m.
