Nothing
#' Selection of the best subset of spatial eigenvectors (MEM, Moran's
#' Eigenvector Maps)
#'
#' \code{mem.select} computes the spatial eigenvectors (MEM) of the spatial
#' weighting matrix (SWM) provided (\code{listw}) and optimizes the selection of
#' a subset of MEM variables relative to response variable(s) stored in
#' \code{x}. The optimization is done either by maximizing the adjusted
#' R-squared (R2) of all (\code{method = "global"}) or a subset (\code{method =
#' "FWD"}) of MEM variables or by minimizing the residual spatial
#' autocorrelation (\code{method = "MIR"}) (see details in Bauman et al. 2018a).
#'
#' @param x A vector, matrix, or dataframe of response variable(s). The
#' \code{method = "MIR"} is only implemented for a vector response. Note that
#' \code{x} can also contain the residuals of a model when other (e.g.,
#' environmental) variables should be considered in the model.
#' @param listw A spatial weighting matrix of class \code{listw}; can be created
#' with functions of the package \code{spdep}, or with the user-friendly
#' function \code{listw.candidates}. Note that, the function \code{listw.candidates}
#' returns a \code{list} of \code{listw} and subselection by \code{[[]]} should be
#' performed in this case (see \code{Example})
#' @param MEM.autocor Sign of the spatial eigenvectors to generate; "positive",
#' "negative", or "all", for positively, negatively autocorrelated
#' eigenvectors, or both, respectively; default is "positive"
#' @param method Criterion to select the best subset of MEM variables. Either
#' \code{"FWD"} (default option), \code{"MIR"} (for univariate \code{x}
#' only), or \code{"global"} (see \code{Details})
#' @param MEM.all A logical indicating if the complete set of MEM variables
#' should be returned
#' @param nperm Number of permutations to perform the tests in the selection
#' procedure; Default is 999
#' @param nperm.global Number of permutations to perform the tests in the global test;
#' Default is 9999
#' @param alpha Significance threshold value for the tests; Default is 0.05
#' @param verbose If 'TRUE' more diagnostics are printed. The default setting is
#' FALSE
#' @param \dots Other parameters (for internal use with \code{listw.select})
#'
#' @return The function returns a list with: \describe{ \item{global.test}{An
#' object of class \code{randtest} containing the result of the global test
#' associated to all MEM (adjusted R2 and p-value). If \code{MEM.autocor = "all"},
#' a list with two elements (\code{positive} and \code{negative}) corresponding to
#' the results of the global tests performed on positive and negative MEM respectively.}
#' \item{MEM.all}{An object of class \code{orthobasisSp} containing the
#' complete set of generated MEM variables (generated by
#' \code{\link{scores.listw}}). Only returned if \code{MEM.all = TRUE}.}
#' \item{summary}{A dataframe summarizing the results of the selection
#' procedure} \item{MEM.select}{An object of class \code{orthobasisSp}
#' containing the subset of significant MEM variables.}}
#'
#' @details The function provides three different methods to select a subset of
#' MEM variables. For all methods, a global test is firstly performed. If
#' \code{MEM.autocor = "all"}, two global tests are performed and p-values are
#' corrected for multiple comparison (Sidak correction).
#'
#' If the MEM variables are to be further used in a model including actual
#' predictors (e.g. environmental), then a subset of spatial eigenvectors
#' needs to be selected before proceeding to further analyses to avoid model
#' overfitting and/or a loss of statistical power to detect the contribution of
#' the environment to the variability of the response data (Griffith 2003, Dray
#' et al. 2006, Blanchet et al. 2008, Peres-Neto and Legendre 2010, Diniz-Filho
#' et al. 2012). Although several eigenvector selection approaches have been
#' proposed to select a best subset of eigenvectors, Bauman et al. (2018b) showed
#' that two main procedures should be preferred, depending on the underlying
#' objective: the forward selection with double stopping criterion (Blanchet et
#' al. 2008; \code{method = "FWD"}) or the minimization of the residual spatial
#' autocorrelation (Griffith and Peres-Neto 2006; MIR selection in Bauman et al.
#' 2018a,b, \code{method = "MIR"}).
#' The most powerful and accurate selection method, in terms of R2 estimation,
#' is the forward selection. This method should be preferred when the objective
#' is to capture as accurately as possible the spatial patterns of \code{x}. If
#' the objective is to optimize the detection of the spatial patterns in the
#' residuals of a model of the response variable(s) against a set of environmental
#' predictors, for instance, then \code{x} should be the model residuals, and
#' \code{method = "FWD"}. This allows optimizing the detection of residual
#' spatial patterns once the effect of the environmental predictors has been
#' removed.
#' If however the objective is only to remove the spatial autocorrelation from the
#' residuals of a model of \code{x} against a set of actual predictors (e.g.
#' environmental) with a small number of spatial predictors, then accuracy is
#' not as important and one should focus mainly on the number of spatial
#' predictors (Bauman et al. 2018b). In this case, \code{method = "MIR"} is more
#' adapted, as it has the advantage to maintain the standard errors of the actual
#' predictor coefficients as low as possible. Note that \code{method = "MIR"} can
#' only be used for a univariate \code{x}, as the Moran's I is a univariate index.
#' If \code{x} is multivariate, then the best criterion is the forward selection
#' (see Bauman et al. 2018b).
#' A third option is to not perform any selection of MEM variables
#' (\code{method = "global"}). This option may be interesting when the complete set
#' of MEM variables will be used, like in Moran spectral randomizations (Wagner and
#' Dray 2015, Bauman et al. 2018c) or when using smoothed MEM (Munoz 2009).
#'
#' For \code{method = "MIR"}, the global test consists in computing the
#' Moran's I of \code{x} (e.g. residuals of the model of the response variable
#' against environmental variables) and tests it by permutation (results
#' stored in \code{global.test}). If the Moran's I is significant, the
#' function performs a selection procedure that searches among the set of
#' generated spatial predictors the one that best minimizes the value of the
#' Moran's I. A model of \code{x} against the selected eigenvector is built,
#' and the significance of the Moran's I of the model residuals is tested
#' again. The procedure goes on until the Moran's I of the model residuals is
#' not significant anymore, hence the name of Minimization of moran's I in the
#' Residuals (MIR).
#'
#' For \code{method = "global"} and \code{method = "FWD"}, the global test
#' consists in computing the adjusted global R2, that is, the R2 of the model
#' of \code{x} against the whole set of generated MEM variables and tests it
#' by permutation (results stored in \code{global.test}).
#'
#' For \code{method = "global"}, if the adjusted global R2 is significant, the
#' functions returns the whole set of generated MEM variables in \code{MEM.select}.
#'
#' For \code{method = "FWD"}, if the adjusted global R2 is significant, the
#' function performs a forward selection with double stopping criterion that
#' searches among the set of generated spatial predictors the one that best maximizes
#' the R2 of the model. The procedure is repeated untill one of the two stopping
#' criterion is reached (see Blanchet et al. 2008). Note that in a few cases, the forward
#' selection does not select any variable even though the global model is significant. This
#' can happen for example when a single variable has a strong relation with the response
#' variable(s), because the integration of the variable alone yields an adjusted R2
#' slightly higher than the global adjusted R2. In this case, we recommend checking that this
#' is indeed the reason why the first selected variable was rejected, and rerun the analysis
#' with a second stopping criterion equal to the global adjusted R2 plus a small amount
#' allowing avoiding this issue (e.g. 5% of the global adjusted R2). For now, this can be
#' done through the argument \code{adjR2thresh} of function \code{forward.sel}, until the
#' solution is implemented in \code{mem.select}.
#'
#' For the \code{method = "FWD"} and \code{method = "MIR"}, the MEM selected
#' by the procedure are returned in \code{MEM.select} and a summary of the
#' results is provided in \code{summary}. If no MEM are selected, then
#' \code{MEM.select} and \code{summary} are not returned.
#'
#' @author David Bauman (\email{dbauman@@ulb.ac.be} or
#' \email{davbauman@@gmail.com}) and Stéphane Dray
#'
#' @seealso \code{\link{listw.candidates}}, \code{\link{listw.select}},
#' \code{\link{scores.listw}}
#'
#' @references Bauman D., Fortin M-J, Drouet T. and Dray S. (2018a) Optimizing the choice
#' of a spatial weighting matrix in eigenvector-based methods. Ecology, 99, 2159-2166
#'
#' Bauman D., Drouet T., Dray S. and Vleminckx J. (2018b) Disentangling good from
#' bad practices in the selection of spatial or phylogenetic eigenvectors.
#' Ecography, 41, 1--12
#'
#' Bauman D., Vleminckx J., Hardy O., Drouet T. (2018c) Testing and interpreting the
#' shared space-environment fraction in variation partitioning analyses of ecological data.
#' Oikos
#'
#' Blanchet G., Legendre P. and Borcard D. (2008) Forward selection of
#' explanatory variables. Ecology, 89(9), 2623--2632
#'
#' Diniz-Filho J.A.F., Bini L.M., Rangel T.F., Morales-Castilla I. et al.
#' (2012) On the selection of phylogenetic eigenvectors for ecological
#' analyses. Ecography, 35, 239--249
#'
#' Dray S., Legendre P. and Peres-Neto P. R. (2006) Spatial modeling: a
#' comprehensive framework for principal coordinate analysis of neighbor
#' matrices (PCNM). Ecological Modelling, 196, 483--493
#'
#' Griffith D. (2003) Spatial autocorrelation and spatial filtering: gaining
#' understanding through theory and scientific visualization. Springer, Berlin
#'
#' Griffith D. and Peres-Neto P. (2006) Spatial modeling in Ecology: the
#' flexibility of eigenfunction spatial analyses. Ecology, 87, 2603--2613
#'
#' Munoz, F. 2009. Distance-based eigenvector maps (DBEM) to analyse
#' metapopulation structure with irregular sampling. Ecological Modelling, 220,
#' 2683--2689
#'
#' Peres-Neto P. and Legendre P. (2010) Estimating and controlling for spatial
#' structure in the study of ecological communities. Global Ecology and
#' Biogeography, 19, 174--184
#'
#' Wagner H., Dray S. (2015). Generating spatially constrained null models for
#' irregularly spaced data using Moran spectral randomization methods. Methods
#' in Ecology and Evolution, 6, 1169--1178
#'
#' @keywords spatial
#'
#' @examples if(require(vegan)){
#'# Illustration of the MIR selection on the oribatid mite data
#'# (Borcard et al. 1992, 1994 for details on the dataset):
#'# *******************************************************
#'# Community data (response matrix):
#'data(mite)
#'# We will compute the example on a single species:
#'spe <- mite[, 2]
#'# Environmental explanatory dataset:
#'data(mite.env)
#'# We only use two numerical explanatory variables:
#'env <- mite.env[, 1:2]
#'dim(env)
#'# Coordinates of the 70 sites:
#'data(mite.xy)
#'coord <- mite.xy
#'# We build the model we are interested in:
#'mod <- lm(spe ~ ., data = env)
#'
#'
#'# In order to avoid possible type I error rate inflation issues, we check
#'# whether the model residuals are independent, and if they are spatially
#'# autocorrelated, we select a small subset of MEM variables to add to the
#'# model as covariables with the MIR selection:
#'
#'# 1) We build a spatial weighting matrix based on Gabriel graph with a
#'# weighting function decreasing linearly with the distance:
#'w <- listw.candidates(coord, nb = "gab", weights = "flin")
#'
#'
#'# 2) We test the spatial autocorrelation of the model residuals and, if
#'# necessary, select a subset of spatial predictors:
#'y <- residuals(mod)
#'MEM <- mem.select(x = y, listw = w[[1]], method = "MIR", MEM.autocor = "positive",
#' nperm = 999, alpha = 0.05)
#'dim(MEM$MEM.select)
#'# The residuals of the model presented spatial autocorrelation. The selection
#'# of MEM variables is thus performed to remove residual autocorrelation.
#'
#'# 3) We can reconstruct our model adding the selected MEM variable as covariables:
#'env2 <- cbind(env, MEM$MEM.select)
#'mod_complete <- lm(spe ~ ., data = env2)
#'summary(mod_complete)$coefficient[, 1] # Coefficient estimates
#'summary(mod_complete)$coefficient[, 2] # Standard errors
#'}
#'
#'
#' @importFrom spdep moran
#' @importFrom stats lm residuals
#' @importFrom methods hasArg
#' @importFrom vegan rda RsquareAdj anova.cca
#' @export mem.select
mem.select <- function(x, listw, MEM.autocor = c("positive", "negative", "all"),
method = c("FWD", "MIR", "global"), MEM.all = FALSE, nperm = 999, nperm.global = 9999, alpha = 0.05,
verbose = FALSE, ...){
method <- match.arg(method)
MEM.autocor <- match.arg(MEM.autocor)
if (any(is.na(x)))
stop("NA entries in x")
nsites <- NROW(x)
MEM <- scores.listw(listw, MEM.autocor = MEM.autocor)
## alpha.global is hidden as only used internally
if (hasArg(ntest))
ntest <- list(...)$ntest
else
ntest <- 1
if (MEM.autocor == "all") {
ntest <- ntest * 2
res.pos <- mem.select(x = x, listw = listw, MEM.autocor = "positive",
method = method, MEM.all = MEM.all, nperm = nperm, nperm.global = nperm.global, alpha = alpha, ntest = ntest)
res.neg <- mem.select(x = x, listw = listw, MEM.autocor = "negative",
method = method, MEM.all = MEM.all, nperm = nperm, nperm.global = nperm.global, alpha = alpha, ntest = ntest)
# combine results for positive and negative autocorrelations
res <- list(global.test = list(positive = res.pos$global.test, negative = res.neg$global.test))
if (MEM.all)
res$MEM.all <- MEM
if (is.null(res.neg$MEM.select)) {
if (!is.null(res.pos$MEM.select)) {
res$MEM.select <- res.pos$MEM.select
res$summary <- res.pos$summary
}
} else {
## renumber negative MEMs
## by identifying the column in MEM.all
idx.1 <- which.min(colSums(abs(MEM - res.neg$MEM.select[, 1])))
new.order <- idx.1 - res.neg$summary$order[1]
new.order <- new.order + res.neg$summary$order
names(res.neg$MEM.select) <- names(MEM)[new.order]
res.neg$summary$order <- new.order
res.neg$summary$variables <- names(MEM)[new.order]
if (!is.null(res.pos$MEM.select)) {
res$MEM.select <- cbind(res.pos$MEM.select, res.neg$MEM.select)
res$summary <- rbind(res.pos$summary, res.neg$summary)
## udpate summary in this case for the values of statistics
if (method == "FWD") {
res$summary$R2Cum <- cumsum(res$summary$R2)
res$summary$AdjR2Cum <- sapply(1:nrow(res$summary), function(x) 1 - (nsites - 1) / (nsites - x - 1) * (1 - sum(res$summary$R2[1:x])))
}
if (method == "MIR") {
x2 <- x
for (j in 1:ncol(res$MEM.select)) {
x2 <- residuals(lm(x2 ~ res$MEM.select[,j]))
res$summary$Iresid[j] <- moran(x2, listw, nsites, Szero(listw))$I
}
}
} else{
res$MEM.select <- res.neg$MEM.select
res$summary <- res.neg$summary
}
}
return(res)
}
if (method == "MIR") {
if (NCOL(x) > 1)
stop("MIR can only be used for a univariate x")
if (MEM.autocor == "positive")
alter <- "greater"
else if (MEM.autocor == "negative")
alter <- "less"
testI <- moran.randtest(x, listw, nperm.global, alter = alter)
##pvalue correction
testI$pvalue <- 1 - (1 - testI$pvalue)^ntest
if (MEM.all)
res <- list(global.test = testI, MEM.all = MEM)
else
res <- list(global.test = testI)
if (testI$pvalue >= alpha) {
if (verbose)
message(paste("No significant", MEM.autocor, "spatial structure"))
return(res)
}
p <- testI$pvalue
MEM.sel <- idx.min <- min.moran <- p.vector <- c()
while (p < alpha) {
# Selection of MEM best minimizing the Moran's I value of the residuals
I.vector <- apply(MEM, 2, function(z) moran(residuals(lm(x ~ z)), listw, nsites, Szero(listw))$I)
I.vector[MEM.sel] <- NA ## MEM cannot be selected two times
idx.min <- which.min(abs(I.vector))
if (verbose)
message(paste("Testing variable", length(MEM.sel) + 1))
x <- residuals(lm(x ~ MEM[, idx.min]))
testI <- moran.randtest(x, listw, nperm, alter = alter)
p <- testI$pvalue
MEM.sel <- c(MEM.sel, idx.min)
min.moran <- c(min.moran, testI$obs)
p.vector <- c(p.vector, p)
}
if (verbose)
message(paste("Procedure stopped (alpha criteria): pvalue for variable", length(MEM.sel) + 1, "is", p, "(>", alpha, ")"))
res <- c(res, list(MEM.select = MEM[, MEM.sel, drop = FALSE],
summary = data.frame(variables = names(MEM)[MEM.sel],
order = MEM.sel, Iresid = min.moran, pvalue = p.vector,
row.names = 1:length(MEM.sel))))
return(res)
} else {
x <- as.data.frame(x)
testF <- .testglobal(x, as.matrix(MEM), nperm.global)
##pvalue correction
testF$pvalue <- 1 - (1 - testF$pvalue)^ntest
if (MEM.all)
res <- list(global.test = testF, MEM.all = MEM)
else
res <- list(global.test = testF)
if (testF$pvalue >= alpha) {
if (verbose)
message(paste("No significant", MEM.autocor, "spatial structure"))
return(res)
}
if (method == "global") {
res <- c(res, list(MEM.select = MEM))
} else if (method == "FWD") {
class <- class(try(fsel <- forward.sel(x, MEM, adjR2thresh = testF$obs, nperm = nperm, verbose = verbose),
TRUE))
if (class != "try-error") {
res <- c(res, list(MEM.select = MEM[, fsel$order, drop = FALSE], summary = as.data.frame(fsel[, -6])))
} else {
if (verbose)
message("No MEM variable was selected by the forward selection. See help document of mem.select.")
}
}
return(res)
}
}
## global test of adjusted R2
## should use ade4 instead (to return a randtest)
.testglobal <- function(x, mem.vectors, nperm = 999){
rda.res <- rda(x, mem.vectors)
res <- list(obs = RsquareAdj(rda.res)$adj.r.squared,
pvalue = anova.cca(rda.res, permutations = nperm)$Pr[1])
return(res)
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.