bvStep | R Documentation |
The bvStep
function performs Clarke and Ainsworth's (1993) "BVSTEP" routine, an algorithm that searches for the
highest correlation (Mantel test) between the dissimilarities of a fixed and a variable multivariate dataset.
The test is the same as that performed by the bioEnv
function, but the routine provides a more efficient
search of combinations when the number of variables is large.
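To make the quantity being searched for concrete, the following minimal sketch computes a Spearman rank correlation between two sets of dissimilarities for one hand-picked subset of variables. It is an illustration only, not the internals of bvStep: the object names (fix.d, var.d, rho) and the choice of the three varechem columns are assumptions made for the example.

```r
library(vegan)
data(varespec)
data(varechem)

# Dissimilarities among samples in the "fixed" data set (Bray-Curtis)
fix.d <- vegdist(wisconsin(varespec), method = "bray")

# Dissimilarities among the same samples for one candidate subset of the
# "variable" data set (columns chosen arbitrarily for illustration),
# centered and scaled before computing Euclidean distances
var.d <- vegdist(scale(varechem[, c("N", "P", "K")]), method = "euclidean")

# Spearman rank correlation ("rho") between the two sets of dissimilarities
rho <- cor(as.vector(fix.d), as.vector(var.d), method = "spearman")
rho
```

A search routine repeats this calculation for many candidate subsets and keeps those with the highest rho.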
bvStep(
  fix.mat,
  var.mat,
  fix.dist.method = "bray",
  var.dist.method = "euclidean",
  scale.fix = FALSE,
  scale.var = TRUE,
  max.rho = 0.95,
  min.delta.rho = 0.001,
  random.selection = TRUE,
  prop.selected.var = 0.2,
  num.restarts = 10,
  var.always.include = NULL,
  var.exclude = NULL,
  output.best = 10
)
fix.mat |
The "fixed" matrix of community or environmental sample by variable values |
var.mat |
A "variable" matrix of community or environmental sample by variable values |
fix.dist.method |
The method of calculating dissimilarity indices between samples in the fixed
matrix (Uses the |
var.dist.method |
The method of calculating dissimilarity indices between samples in the variable
matrix. Defaults to Euclidean dissimilarity. |
scale.fix |
Logical. Should fixed matrix be centered and scaled (Defaults to |
scale.var |
Logical. Should variable matrix be centered and scaled (Defaults to |
max.rho |
Numeric value between 0 and 1. Provides a maximum Spearman rank correlation ("rho") by which
to stop the searching process. This is especially important when conducting a "BIOBIO" or "ENVENV" type
setup where rho will be equal to 1 with the full set of variables
(see |
min.delta.rho |
Numeric value. Defines a minimum change in the improvement of Spearman rank
correlation ("rho"). When not satisfied, |
random.selection |
Logical. When |
prop.selected.var |
Numeric. Value between 0 and 1 indicating the proportion of variables to include at each restart. |
num.restarts |
Numeric. Number of restarts (Default: |
var.always.include |
Numeric vector. A vector of column numbers from the variable dataset to include at each restart. |
var.exclude |
Numeric vector. A vector of column numbers from the variable dataset to always exclude at each restart and during the search process. |
output.best |
Numeric value. Number of best combinations to return in the results object (Default=10). |
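The roles of max.rho and min.delta.rho as stopping rules can be sketched with a simplified forward-only selection loop in base R. This is an assumption-laden illustration, not the implementation used by bvStep: it omits restarts, random starting subsets and any backward elimination, uses simulated data, and the helper rho_for() is hypothetical.

```r
set.seed(1)

# Simulated data (illustration only): 15 samples, 4 "fixed" variables and
# 5 "variable" columns, the first two of which track the fixed data closely
fix <- matrix(rnorm(60), nrow = 15)
var <- cbind(fix[, 1:2] + rnorm(30, sd = 0.1), matrix(rnorm(45), nrow = 15))

# Hypothetical helper: Spearman rho between fixed and variable dissimilarities
# for a given set of variable columns
rho_for <- function(cols) {
  cor(as.vector(dist(fix)),
      as.vector(dist(var[, cols, drop = FALSE])),
      method = "spearman")
}

max.rho <- 0.95
min.delta.rho <- 0.001

selected <- integer(0)
best <- -Inf
repeat {
  candidates <- setdiff(seq_len(ncol(var)), selected)
  if (length(candidates) == 0) break
  # rho obtained by adding each remaining variable to the current selection
  rhos <- vapply(candidates, function(j) rho_for(c(selected, j)), numeric(1))
  # stop when the best addition improves rho by less than min.delta.rho
  if (max(rhos) - best < min.delta.rho) break
  selected <- c(selected, candidates[which.max(rhos)])
  best <- max(rhos)
  # stop once rho reaches max.rho
  if (best >= max.rho) break
}
selected
best
```

Without the max.rho rule, a search in which the variable data contain the fixed data would keep adding columns until rho reaches 1 with the full set, which is why the threshold matters for "BIOBIO" and "ENVENV" setups.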
The variable multivariate data set has 2^n-1 possible combinations to test, where n is the
number of variables. Testing all variable combinations is thus unrealistic, computationally,
when the number of variables is high (e.g. 20 variables yield >1e6 combinations).
This is often the case when conducting a BIOBIO type analysis, where
the number of species combinations to search can be quite large
(see bioEnv
for an explanation of other types of analyses
beyond the typical "BIOENV").
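The growth in the number of combinations described above can be checked directly. The short sketch below evaluates 2^n-1 for a few values of n; the object names are chosen for the example, and 44 is the number of variables in the varespec data used in the examples.

```r
# Number of non-empty variable subsets for n variables: 2^n - 1
n <- c(5, 10, 20, 44)
n.comb <- 2^n - 1
data.frame(n.variables = n, n.combinations = n.comb)
```

For n = 20 this gives 1048575 combinations, consistent with the ">1e6" figure quoted above.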
Below is an example of a two-step search refinement for finding
subsets of variables that best correlate with a fixed multivariate set.
Clarke, K. R. & Ainsworth, M. 1993. A method of linking multivariate community structure to environmental variables. Marine Ecology Progress Series, 92, 205-219.
library(vegan)
data(varespec)
data(varechem)

# Example of a 2-round BIO-BIO search. Uses the most frequently included variables
# in the first round at the beginning of each restart in the second round

# first round
set.seed(1)
res.biobio1 <- bvStep(wisconsin(varespec), wisconsin(varespec),
  fix.dist.method="bray", var.dist.method="bray",
  scale.fix=FALSE, scale.var=FALSE,
  max.rho=0.95, min.delta.rho=0.001,
  random.selection=TRUE,
  prop.selected.var=0.3,
  num.restarts=50,
  output.best=10,
  var.always.include=NULL
)
res.biobio1 # Best rho equals 0.833 (10 of 44 variables)

# second round - always includes variables 23, 26, and 29 ("Cla.ran" "Cla.coc" "Cla.fim")
set.seed(1)
res.biobio2 <- bvStep(wisconsin(varespec), wisconsin(varespec),
  fix.dist.method="bray", var.dist.method="bray",
  scale.fix=FALSE, scale.var=FALSE,
  max.rho=0.95, min.delta.rho=0.001,
  random.selection=TRUE,
  prop.selected.var=0.3,
  num.restarts=50,
  output.best=10,
  var.always.include=c(23,26,29)
)
res.biobio2 # Best rho equals 0.895 (15 of 44 variables)

# A plot of best variables
MDS_res <- metaMDS(wisconsin(varespec), distance = "bray", k = 2, trymax = 50)
# column numbers of the best combination, parsed from a comma-separated string
bio.keep <- as.numeric(unlist(strsplit(res.biobio2$order.by.best$var.incl[1], ",")))
bio.fit <- envfit(MDS_res, varespec[,bio.keep], perm=999)
bio.fit

# Display all fitted vectors
plot(MDS_res$points, type="n", xlab="NMDS1", ylab="NMDS2")
plot(bio.fit, col="gray50", cex=0.8, font=4)
text(MDS_res$points, as.character(1:length(MDS_res$points[,1])), cex=0.7)
mtext(paste("Stress =",round(MDS_res$stress, 2)), side=3, adj=1, line=0.5)

# Display only those with envfit p <= 0.1
plot(MDS_res$points, type="n", xlab="NMDS1", ylab="NMDS2")
plot(bio.fit, col="gray50", p.max=0.1, cex=0.8, font=4) # p.max=0.1
text(MDS_res$points, as.character(1:length(MDS_res$points[,1])), cex=0.7)
mtext(paste("Stress =",round(MDS_res$stress, 2)), side=3, adj=1, line=0.5)