exclude: Excluding variables specified in a VIF object

Description

Physically exclude the collinear variables identified by vifcor or vifstep from a set of variables.

Usage

exclude(x, vif, ...)

Arguments

x

explanatory variables (predictors), defined as a raster object (RasterStack or RasterBrick), or as a matrix, or as a data.frame.

vif

an object of class VIF, as returned by the vifcor or vifstep function.

...

additional arguments, as in vifstep

Details

Before using this function, run either vifstep or vifcor, which detect collinearity by calculating variance inflation factor (VIF) statistics. If vif is missing, vifstep is called first.
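As background for the VIF statistic mentioned above, here is a minimal base-R sketch of the textbook definition, VIF_j = 1 / (1 - R2_j), where R2_j is the R-squared from regressing variable j on all the other variables. This is an illustration only, not the usdm source code; the helper name vif_manual and the simulated data are hypothetical.

```r
# Illustration only: a textbook VIF calculation in base R, not usdm's implementation.
set.seed(1)
n <- 200
a <- rnorm(n)
b <- rnorm(n)
# 'ab' is almost exactly a + b, so the three columns are strongly collinear
d <- data.frame(a = a, b = b, ab = a + b + rnorm(n, sd = 0.05))

# vif_manual is a hypothetical helper: regress each column on all the others
# and convert the resulting R-squared into a VIF.
vif_manual <- function(df) {
  sapply(names(df), function(v) {
    fit <- lm(reformulate(setdiff(names(df), v), response = v), data = df)
    1 / (1 - summary(fit)$r.squared)
  })
}

vif_all <- vif_manual(d)                  # large values for all three columns
vif_kept <- vif_manual(d[, c("a", "b")])  # values near 1 once 'ab' is dropped
vif_all
vif_kept
```

Dropping the offending column and recomputing is the same idea that vifstep applies stepwise before exclude removes the flagged variables.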

Value

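An object of the same class as x (i.e. RasterStack, RasterBrick, data.frame, or matrix). Note that in the example output below, exclude(r, v1) applied to a RasterBrick returns a RasterStack.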
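The following sketch shows the return value for a data.frame input rather than a raster object. It assumes the usdm package is installed; the simulated columns x1, x2 and x3 are made up for illustration, and which column gets dropped depends on the VIF values computed by vifstep.

```r
library(usdm)  # assumption: usdm is installed

set.seed(42)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
# x3 is nearly x1 + x2, so the three columns are collinear (made-up data)
d <- data.frame(x1 = x1, x2 = x2, x3 = x1 + x2 + rnorm(n, sd = 0.01))

v <- vifstep(d, th = 10)  # flag variables whose VIF exceeds the threshold
kept <- exclude(d, v)     # drop the flagged columns from d

class(kept)                      # class of the returned object
setdiff(names(d), names(kept))   # name(s) of the excluded column(s)
```

Calling exclude(d) without a VIF object should give an equivalent result, since vifstep is then run internally.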

Author(s)

Babak Naimi naimi.b@gmail.com

http://r-gis.net

References

If you use this method, please cite the following article, for which this package was developed:

Naimi, B., Hamm, N.A.S., Groen, T.A., Skidmore, A.K., and Toxopeus, A.G. 2014. Where is positional uncertainty a problem for species distribution modelling? Ecography 37(2): 191-203.

See Also

vif

Examples

## Not run: 
library(usdm) # also loads the raster package, which provides brick()

file <- system.file("external/spain.grd", package="usdm")
r <- brick(file) # read a RasterBrick object with 10 raster layers in Spain
r

vif(r) # calculate the VIF for each variable in r

v1 <- vifcor(r, th=0.9) # identify collinear variables (correlation threshold 0.9)
v1
re1 <- exclude(r, v1) # exclude the collinear variables identified by vifcor
re1

v2 <- vifstep(r, th=10) # identify collinear variables (VIF threshold 10)
v2
re2 <- exclude(r, v2) # exclude the collinear variables identified by vifstep
re2

re3 <- exclude(r) # vif is missing, so vifstep is called first
re3
## End(Not run)

Example output

Loading required package: sp
Loading required package: raster
class       : RasterBrick 
dimensions  : 30, 29, 870, 10  (nrow, ncol, ncell, nlayers)
resolution  : 10000, 10000  (x, y)
extent      : 319375, 609375, 4449936, 4749936  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
data source : /usr/local/lib/R/site-library/usdm/external/spain.grd 
names       :       Bio1,       Bio2,       Bio3,       Bio4,       Bio5,       Bio6,       Bio7,       Bio8,       Bio9,      Bio10 
min values  :   65.40278,   83.90278,   34.09028, 4884.11816,  228.18750,  -47.90972,  221.13889,   36.33333,   31.68056,  144.34723 
max values  :  145.16667,  120.17361,   39.94444, 6740.22900,  320.09723,   21.56944,  310.95834,  156.18750,  234.34723,  234.34723 

   Variables          VIF
1       Bio1 7.767314e+02
2       Bio2 2.458951e+02
3       Bio3 5.511014e+01
4       Bio4 1.759985e+02
5       Bio5 2.558863e+12
6       Bio6 1.381049e+12
7       Bio7 2.316071e+12
8       Bio8 1.581807e+00
9       Bio9 3.009865e+00
10     Bio10 1.520138e+03
2 variables from the 10 input variables have collinearity problem: 
 
Bio5 Bio10 

After excluding the collinear variables, the linear correlation coefficients ranges between: 
min correlation ( Bio2 ~ Bio1 ):  0.03838531 
max correlation ( Bio7 ~ Bio4 ):  0.8909937 

---------- VIFs of the remained variables -------- 
  Variables        VIF
1      Bio1  46.440583
2      Bio2 236.664027
3      Bio3  54.930047
4      Bio4  13.868554
5      Bio6  58.667824
6      Bio7 316.648968
7      Bio8   1.472454
8      Bio9   3.002529
class       : RasterStack 
dimensions  : 30, 29, 870, 8  (nrow, ncol, ncell, nlayers)
resolution  : 10000, 10000  (x, y)
extent      : 319375, 609375, 4449936, 4749936  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
names       :       Bio1,       Bio2,       Bio3,       Bio4,       Bio6,       Bio7,       Bio8,       Bio9 
min values  :   65.40278,   83.90278,   34.09028, 4884.11816,  -47.90972,  221.13889,   36.33333,   31.68056 
max values  :  145.16667,  120.17361,   39.94444, 6740.22900,   21.56944,  310.95834,  156.18750,  234.34723 

5 variables from the 10 input variables have collinearity problem: 
 
Bio5 Bio10 Bio7 Bio6 Bio4 

After excluding the collinear variables, the linear correlation coefficients ranges between: 
min correlation ( Bio2 ~ Bio1 ):  0.03838531 
max correlation ( Bio9 ~ Bio1 ):  0.7101681 

---------- VIFs of the remained variables -------- 
  Variables      VIF
1      Bio1 2.086186
2      Bio2 1.370264
3      Bio3 1.253408
4      Bio8 1.267217
5      Bio9 2.309479
class       : RasterStack 
dimensions  : 30, 29, 870, 5  (nrow, ncol, ncell, nlayers)
resolution  : 10000, 10000  (x, y)
extent      : 319375, 609375, 4449936, 4749936  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
names       :      Bio1,      Bio2,      Bio3,      Bio8,      Bio9 
min values  :  65.40278,  83.90278,  34.09028,  36.33333,  31.68056 
max values  : 145.16667, 120.17361,  39.94444, 156.18750, 234.34723 

5 variables from the 10 input variables have collinearity problem: 
 
Bio5 Bio10 Bio7 Bio6 Bio4 

After excluding the collinear variables, the linear correlation coefficients ranges between: 
min correlation ( Bio2 ~ Bio1 ):  0.03838531 
max correlation ( Bio9 ~ Bio1 ):  0.7101681 

---------- VIFs of the remained variables -------- 
  Variables      VIF
1      Bio1 2.086186
2      Bio2 1.370264
3      Bio3 1.253408
4      Bio8 1.267217
5      Bio9 2.309479
class       : RasterBrick 
dimensions  : 30, 29, 870, 5  (nrow, ncol, ncell, nlayers)
resolution  : 10000, 10000  (x, y)
extent      : 319375, 609375, 4449936, 4749936  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
data source : in memory
names       :      Bio1,      Bio2,      Bio3,      Bio8,      Bio9 
min values  :  65.40278,  83.90278,  34.09028,  36.33333,  31.68056 
max values  : 145.16667, 120.17361,  39.94444, 156.18750, 234.34723 

usdm documentation built on May 2, 2019, 5:50 p.m.