ngnn: NGNN, Nonlinear Gradient Nearest Neighbors


Description

Predict community composition based on individualistic but possibly coordinated species responses

Usage

ngnn(spe, idi, ido, nm, nmulti = 5, pa = FALSE, pr = FALSE,
  method = "bray", thresh = 0.9, neighb = 5, maxits = 999, k = 1,
  ...)

gnn(obj, k, ...)

## S3 method for class 'ngnn'
summary(obj, ...)

ngnn_plot_nco(obj, type = "points", ocol = 2, cexn = NULL, ...)

ngnn_get_spp(obj, ...)

ngnn_plot_spp(obj, pick = NULL, zlim, nm, ...)

Arguments

spe

species dataframe, rows = sample units and columns = species

idi

in-sample predictor dataframe, rows must match 'spe'

ido

out-of-sample predictor dataframe, where rows = new sample units

nm

string vector specifying predictors to include (max 2)

nmulti

number of random starts in nonparametric regression

pa

logical, convert to presence/absence?

pr

logical, use 'beals' for probs of joint occurrence?

method

distance measure for all ordinations

thresh

numeric threshold for stepacross dissimilarities

neighb

number of adjacent distances considered in NCOpredict

maxits

number of NCOpredict iterations

k

the maximum number of nearest neighbors to find in NCO gradient space

...

additional arguments passed to function

obj

object of class 'ngnn' from call to ngnn

type

either 'points' or 'text' for plotting

ocol

color value or vector for out-of-sample points

cexn

expansion factor for points and text

pick

variable to query

zlim

vector of length 2, giving vertical limits for plots

Details

When given a set of sample units where species abundances and corresponding predictor values are both known, how does one infer which species should appear in 'new' sample units where only the predictors are known? NGNN (nonlinear gradient nearest neighbors) approaches the problem of species imputation in the following way:

Regress species individualistically on predictors ->
Feed fitted values to NMS ordination ->
Find nearest neighbors in ordination space, and assign species.

A more detailed description:
First, define an in-sample set of sample units where species abundances and corresponding predictor values are both known, as well as an out-of-sample set where only predictor values are known.
Second, use npmr to perform nonparametric multiplicative regression (NPMR; McCune 2006) for both the in-sample and out-of-sample sample units.
Third, use nco to feed the NPMR fitted values to NMS ordination (Kruskal 1964); this is nonparametric constrained ordination (NCO; McCune and Root 2012; McCune and Root 2017).
Fourth, use nco_predict to calculate predicted NCO scores for the out-of-sample set, even though its species composition is not known.
Finally, use gnn to identify the in-sample Euclidean nearest neighbor of each out-of-sample point in the NCO ordination space, and assign the (possibly averaged) species composition of that neighbor to the point in question. This retains realistic communities of co-occurring species, since they have already been observed in at least one other sample unit.
The entire process is wrapped in the function ngnn.

Function ngnn finds the k nearest neighbors in the original ordination space; higher values of k probably work better when there are many original points, and when those points are more evenly distributed in ordination space.
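
The nearest-neighbor step and the role of k can be illustrated with a minimal base-R sketch. This is not the package's gnn code: the helper knn_impute, the toy score matrices, and the species values below are all hypothetical, and the sketch simply assumes that in-sample and out-of-sample points already have coordinates in a shared ordination space.

```r
# Hypothetical helper (not part of ngnn): impute species composition for
# out-of-sample points from their k nearest in-sample neighbors.
knn_impute <- function(scr_in, scr_out, spe_in, k = 1) {
  scr_in  <- as.matrix(scr_in)    # in-sample ordination scores
  scr_out <- as.matrix(scr_out)   # out-of-sample ordination scores
  spe_in  <- as.matrix(spe_in)    # in-sample species matrix
  stopifnot(nrow(scr_in) == nrow(spe_in),
            ncol(scr_in) == ncol(scr_out),
            k >= 1, k <= nrow(scr_in))
  out <- matrix(NA_real_, nrow(scr_out), ncol(spe_in),
                dimnames = list(rownames(scr_out), colnames(spe_in)))
  for (i in seq_len(nrow(scr_out))) {
    # Euclidean distance from this point to every in-sample point
    d  <- sqrt(rowSums(sweep(scr_in, 2, scr_out[i, ])^2))
    nn <- order(d)[seq_len(k)]                 # the k nearest neighbors
    # average the neighbors' composition (k = 1 copies a single neighbor)
    out[i, ] <- colMeans(spe_in[nn, , drop = FALSE])
  }
  out
}

# Toy data: four in-sample units, two out-of-sample units, two species
scr_in  <- matrix(c(0, 0,  1, 0,  0, 1,  5, 5), ncol = 2, byrow = TRUE,
                  dimnames = list(c("a", "b", "c", "d"), c("NCO1", "NCO2")))
scr_out <- matrix(c(0.1, 0.1,  4, 4), ncol = 2, byrow = TRUE,
                  dimnames = list(c("new1", "new2"), c("NCO1", "NCO2")))
spe_in  <- matrix(c(10, 0,  4, 2,  6, 4,  0, 8), ncol = 2, byrow = TRUE,
                  dimnames = list(c("a", "b", "c", "d"), c("sp1", "sp2")))

imp1 <- knn_impute(scr_in, scr_out, spe_in, k = 1)  # copy nearest neighbor
imp3 <- knn_impute(scr_in, scr_out, spe_in, k = 3)  # average 3 neighbors
```

With k = 1 each new point inherits an observed community unchanged; with larger k the imputed composition is an average, which may smooth noise but can also blend communities that never co-occurred in exactly that form.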

Value

List of class 'ngnn' with elements:

References

Kruskal, J. B. 1964. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29: 1-27.

McCune, B. 2006. Non-parametric habitat models with automatic interactions. Journal of Vegetation Science 17(6):819-830.

McCune, B., and H. T. Root. 2012. Nonparametric constrained ordination. 97th ESA Annual Meeting. Ecological Society of America, Portland, OR.

McCune, B., and H. T. Root. 2017. Nonparametric constrained ordination to describe community and species relationships to environment. Unpublished ms.

Ohmann, J.L., and M.J. Gregory. 2002. Predictive mapping of forest composition and structure with direct gradient analysis and nearest-neighbor imputation in coastal Oregon, U.S.A. Canadian Journal of Forest Research 32:725-741.

See Also

npmr for NPMR, nco for NCO, nco_predict for predictive NCO, and gnn for the core function of NGNN.

Examples

# set up
set.seed(978)
require(vegan)
data(varespec, varechem)
spe <- varespec ; id  <- varechem
i   <- sample(1:nrow(spe), size=floor(0.75*nrow(spe))) # sample
spe <- spe[i, ]          # in-sample species
idi <- id[i, ]           # in-sample predictors
ido <- id[-i, ]          # out-of-sample predictors
nm  <- c('Al', 'K')      # select 1 or 2 gradients of interest

# basic usage
res <- ngnn(spe, idi, ido, nm, nmulti=5, method='bray',
            thresh=0.90, neighb=5, maxits=999, k=1)
summary(res)
str(res, 1)

# plot the species response curves
ngnn_plot_spp(res, pick=1:9, nm=nm)

# plot the NCO gradient space
ngnn_plot_nco(res)

# predicted (imputed) species composition for out-of-sample sites
ngnn_get_spp(res)

# how close was the predicted species composition to the 'true' values?
spe_append <- rbind(spe, res$spp_imputed)   # append to existing
heatmap(t(as.matrix(spe_append)), Rowv=NA, Colv=NA)

# check composition of 'hold-out' data
heatmap(t(as.matrix(varespec[-i,])), Rowv=NA, Colv=NA)
# ... vs new species from NGNN
heatmap(t(as.matrix(res$spp_imputed)), Rowv=NA, Colv=NA)

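# Prediction error: Root Mean Square Error
# (coerce to matrices first, since mean() of a data frame returns NA)
`rmse` <- function(y, ypred, ...){
     y     <- as.matrix(y)
     ypred <- as.matrix(ypred)
     sqrt(mean((y-ypred)^2, ...))
}
rmse(varespec[-i,], res$spp_imputed)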


## can do entire process manually, avoiding the wrapper function:
# NPMR
res_npmr <- npmr(spe, idi, ido, nm, nmulti=5)
# NCO (NMS)
res_nco  <- nco(res_npmr, method='bray', thresh=0.90)
# NCOpredict (NMSpredict)
res_nmsp <- nco_predict(res_nco, method='bray', neighb=5,
                        maxits=999)
# GNN
res_gnn  <- gnn(obj=res_nmsp, k=1)
summary(res_gnn)

phytomosaic/ngnn documentation built on May 9, 2019, 5:57 a.m.