bvStep: Clarke and Ainsworth's BVSTEP routine
In marchtaylor/sinkr: Collection of functions with emphasis on multivariate data analysis

bvStep

R Documentation

Clarke and Ainsworth's BVSTEP routine

Description

The bvStep function performs Clarke and Ainsworth's (1993) "BVSTEP" routine, an algorithm that searches for the subset of variables giving the highest correlation (Mantel test) between the dissimilarities of a fixed multivariate dataset and those of a variable multivariate dataset. The test is the same as that performed by the bioEnv function, but the routine provides a more efficient search of combinations when the number of variables is large.
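The quantity being maximised can be illustrated with a minimal sketch. It assumes base R `dist()` as a stand-in for `vegdist`, uses randomly generated matrices rather than real data, and the helper name `subset_rho` is hypothetical (it is not part of sinkr).

```r
# Spearman rank correlation between two dissimilarity matrices built on the same samples.
# Assumption: base R dist() replaces vegan::vegdist() so the block has no dependencies.
set.seed(42)
fix.mat <- matrix(runif(10 * 4), nrow = 10)  # hypothetical "fixed" data: 10 samples x 4 variables
var.mat <- matrix(rnorm(10 * 6), nrow = 10)  # hypothetical "variable" data: 10 samples x 6 variables

# Hypothetical helper: rho for one chosen subset of columns of the variable matrix
subset_rho <- function(fix.mat, var.mat, cols, scale.var = TRUE) {
  fix.dist <- dist(fix.mat, method = "euclidean")
  sub <- var.mat[, cols, drop = FALSE]
  if (scale.var) sub <- scale(sub)           # center and scale, as scale.var = TRUE would
  var.dist <- dist(sub, method = "euclidean")
  cor(as.vector(fix.dist), as.vector(var.dist), method = "spearman")
}

# rho for the subset made of variables 1, 2 and 3
rho_123 <- subset_rho(fix.mat, var.mat, cols = c(1, 2, 3))

# Comparing a matrix with the full set of its own variables gives rho = 1,
# which is why a stopping threshold matters in "BIOBIO" or "ENVENV" setups
rho_self <- subset_rho(fix.mat, fix.mat, cols = 1:4, scale.var = FALSE)
```

The search routine evaluates this kind of correlation repeatedly for different column subsets and keeps the subsets with the highest values.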

Usage

bvStep(
  fix.mat,
  var.mat,
  fix.dist.method = "bray",
  var.dist.method = "euclidean",
  scale.fix = FALSE,
  scale.var = TRUE,
  max.rho = 0.95,
  min.delta.rho = 0.001,
  random.selection = TRUE,
  prop.selected.var = 0.2,
  num.restarts = 10,
  var.always.include = NULL,
  var.exclude = NULL,
  output.best = 10
)

Arguments

`fix.mat`	The "fixed" matrix of community or environmental sample by variable values
`var.mat`	A "variable" matrix of community or environmental sample by variable values
`fix.dist.method`	The method of calculating dissimilarity indices between samples in the fixed matrix. Uses the `vegdist` function from the vegan package to calculate distance matrices; see its documentation for available methods. Defaults to Bray-Curtis dissimilarity, `"bray"`.
`var.dist.method`	The method of calculating dissimilarity indices between samples in the variable matrix. Defaults to Euclidean dissimilarity, `"euclidean"`.
`scale.fix`	Logical. Should the fixed matrix be centered and scaled? Defaults to `FALSE` (recommended for biological data).
`scale.var`	Logical. Should the variable matrix be centered and scaled? Defaults to `TRUE` (recommended for environmental data to correct for differing units between variables).
`max.rho`	Numeric value between 0 and 1. Provides a maximum Spearman rank correlation ("rho") at which to stop the search. This is especially important in a "BIOBIO" or "ENVENV" type setup, where rho will equal 1 with the full set of variables (see `bioEnv` for an explanation of these types of setups). Defaults to `max.rho=0.95`.
`min.delta.rho`	Numeric value. Defines a minimum change in the improvement of Spearman rank correlation ("rho"). When not satisfied, `bvStep` will terminate the search process and return results of the best variable correlations.
`random.selection`	Logical. When `random.selection=TRUE` (Default), the algorithm will begin each restart with a random selection of variables from the variable dataset (see `prop.selected.var`). When `random.selection=FALSE`, a single search is conducted starting with all variables.
`prop.selected.var`	Numeric. Value between 0 and 1 indicating the proportion of variables to include at each restart.
`num.restarts`	Numeric. Number of restarts (Default: `num.restarts=10`)
`var.always.include`	Numeric vector. A vector of column numbers from the variable dataset to include at each restart.
`var.exclude`	Numeric vector. A vector of column numbers from the variable dataset to always exclude at each restart and during the search process.
`output.best`	Numeric value. Number of best combinations to return in the results object (Default=10).
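To show how `max.rho`, `min.delta.rho`, `var.always.include` and `var.exclude` could interact, here is a simplified forward/backward stepwise search. It is an illustrative sketch only and not the code used by `bvStep`: the function names `rho_for` and `stepwise_search` are hypothetical, base R `dist()` stands in for `vegdist`, the data are randomly generated, and random restarts are left out.

```r
# Spearman rho between a fixed distance matrix and the distances of a column subset
rho_for <- function(fix.dist, var.mat, cols) {
  if (length(cols) == 0) return(-Inf)        # an empty subset can never be the best
  var.dist <- dist(scale(var.mat[, cols, drop = FALSE]))
  cor(as.vector(fix.dist), as.vector(var.dist), method = "spearman")
}

stepwise_search <- function(fix.dist, var.mat, always = integer(0), exclude = integer(0),
                            max.rho = 0.95, min.delta.rho = 0.001) {
  candidates <- setdiff(seq_len(ncol(var.mat)), exclude)
  current <- always                          # forced-in variables are present from the start
  best <- rho_for(fix.dist, var.mat, current)
  repeat {
    improved <- FALSE
    # Forward step: add the single variable that raises rho the most
    add.pool <- setdiff(candidates, current)
    if (length(add.pool) > 0) {
      rhos <- sapply(add.pool, function(v) rho_for(fix.dist, var.mat, c(current, v)))
      if (max(rhos) - best >= min.delta.rho) {
        current <- c(current, add.pool[which.max(rhos)])
        best <- max(rhos)
        improved <- TRUE
      }
    }
    # Backward step: drop one variable (never a forced-in one) if that raises rho
    drop.pool <- setdiff(current, always)
    if (length(drop.pool) > 0 && length(current) > 1) {
      rhos <- sapply(drop.pool, function(v) rho_for(fix.dist, var.mat, setdiff(current, v)))
      if (max(rhos) - best >= min.delta.rho) {
        current <- setdiff(current, drop.pool[which.max(rhos)])
        best <- max(rhos)
        improved <- TRUE
      }
    }
    # Stop when rho is high enough or no step improved it by at least min.delta.rho
    if (!improved || best >= max.rho) break
  }
  list(rho = best, var.incl = sort(current))
}

# Hypothetical data: columns 1-3 of var.mat reproduce fix.mat, columns 4-8 are noise
set.seed(1)
fix.mat <- matrix(rnorm(12 * 3), nrow = 12)
var.mat <- cbind(fix.mat, matrix(rnorm(12 * 5), nrow = 12))
fix.dist <- dist(scale(fix.mat))

res.free   <- stepwise_search(fix.dist, var.mat, exclude = 8)
res.forced <- stepwise_search(fix.dist, var.mat, always = 5, exclude = 8)
```

In this sketch, every accepted step must raise rho by at least `min.delta.rho`, so the loop always terminates; a single pass like this can still stop at a local optimum, which is the motivation for repeating the search from several random starting subsets.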

Details

The variable multivariate dataset has 2^n - 1 possible combinations to test, where n is the number of variables. Testing all variable combinations is thus computationally unrealistic when the number of variables is high (e.g. 20 variables yield >1e6 combinations). This is often the case in a BIOBIO type analysis, where the number of species combinations to search can be quite large (see bioEnv for an explanation of other types of analyses beyond the typical "BIOENV"). Below is an example of a two-step search refinement for finding subsets of variables that best correlate with a fixed multivariate set.
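The growth in the number of non-empty subsets can be checked directly. The values of n below are chosen for illustration; 44 is the number of species columns in the `varespec` dataset used in the examples.

```r
# Number of non-empty variable subsets, 2^n - 1, for a few values of n
n <- c(5, 10, 20, 44)
n.comb <- 2^n - 1

# Format without scientific notation so the magnitudes are easy to compare
comb.table <- data.frame(
  n.variables    = n,
  n.combinations = format(n.comb, big.mark = ",", scientific = FALSE)
)
comb.table
```

With n = 20 the count is 1,048,575, which is the ">1e6" figure quoted above.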

References

Clarke, K. R. & Ainsworth, M. 1993. A method of linking multivariate community structure to environmental variables. Marine Ecology Progress Series, 92, 205-219.

Examples



library(vegan)
data(varespec)
data(varechem)

# Example of a 2-round BIO-BIO search. Uses the most frequently included variables
# in the first round at the beginning of each restart in the second round
# first round
set.seed(1)
res.biobio1 <- bvStep(wisconsin(varespec), wisconsin(varespec), 
 fix.dist.method="bray", var.dist.method="bray",
 scale.fix=FALSE, scale.var=FALSE, 
 max.rho=0.95, min.delta.rho=0.001,
 random.selection=TRUE,
 prop.selected.var=0.3,
 num.restarts=50,
 output.best=10,
 var.always.include=NULL
)
res.biobio1 # Best rho equals 0.833 (10 of 44 variables)

# second round - always includes variables 23, 26, and 29 ("Cla.ran" "Cla.coc" "Cla.fim")
set.seed(1)
res.biobio2  <- bvStep(wisconsin(varespec), wisconsin(varespec), 
 fix.dist.method="bray", var.dist.method="bray",
 scale.fix=FALSE, scale.var=FALSE, 
 max.rho=0.95, min.delta.rho=0.001,
 random.selection=TRUE,
 prop.selected.var=0.3,
 num.restarts=50,
 output.best=10,
 var.always.include=c(23,26,29)
)
res.biobio2 # Best rho equals 0.895 (15 of 44 variables)

# A plot of best variables
MDS_res <- metaMDS(wisconsin(varespec), distance = "bray", k = 2, trymax = 50)
# as.character() guards against var.incl being stored as a factor
bio.keep <- as.numeric(unlist(strsplit(as.character(res.biobio2$order.by.best$var.incl[1]), ",")))
bio.fit <- envfit(MDS_res, varespec[,bio.keep], perm=999)
bio.fit 

plot(MDS_res$points, t="n",xlab="NMDS1", ylab="NMDS2")
plot(bio.fit, col="gray50", cex=0.8, font=4) # display all fitted vectors
text(MDS_res$points, as.character(1:length(MDS_res$points[,1])), cex=0.7)
mtext(paste("Stress =",round(MDS_res$stress, 2)), side=3, adj=1, line=0.5)

# Display only those with envfit p <= 0.1
plot(MDS_res$points, t="n",xlab="NMDS1", ylab="NMDS2")
plot(bio.fit, col="gray50", p.max=0.1, cex=0.8, font=4) # p.max=0.1
text(MDS_res$points, as.character(1:length(MDS_res$points[,1])), cex=0.7)
mtext(paste("Stress =",round(MDS_res$stress, 2)), side=3, adj=1, line=0.5)
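The two-round approach relies on knowing which variables were included most often in the first round. A minimal way to tally them is sketched here, assuming the comma-separated string format of `order.by.best$var.incl` used above; the `var.incl` vector below is hypothetical and is not output from the example.

```r
# Hypothetical var.incl strings, one per retained combination.
# With real results this might be as.character(res.biobio1$order.by.best$var.incl),
# which is an assumption about the structure of the results object.
var.incl <- c("23,26,29", "5,23,26", "23,29,40", "2,26,29")

# Split each string into column numbers and count how often each one appears
incl.list <- lapply(strsplit(var.incl, ","), as.numeric)
freq <- sort(table(unlist(incl.list)), decreasing = TRUE)

# Column numbers of the three most frequently included variables
top3 <- as.numeric(names(freq)[1:3])
```

A vector like `top3` could then be passed to `var.always.include` in a second round.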

marchtaylor/sinkr documentation built on June 15, 2025, 1:17 a.m.
