oecosimu: Null Models for Biological Communities

Null models generate random communities with different criteria to study the significance of nestedness or other community patterns. The function only simulates binary (presence/absence) models with constraints on the total number of presences, and optionally on the numbers of species and/or species frequencies.

oecosimu(comm, nestfun, method, nsimul = 99, burnin = 0, thin = 1,
   statistic = "statistic", alternative = c("two.sided", "less", "greater"),
   ...)
commsimulator(x, method, thin=1)
## S3 method for class 'oecosimu'
as.ts(x, ...)
## S3 method for class 'oecosimu'
as.mcmc(x)
## S3 method for class 'oecosimu'
density(x, ...)
## S3 method for class 'oecosimu'
densityplot(x, data, xlab = "Simulated", ...)

`comm`	Community data.
`x`	Community data for `commsimulator`, or an `oecosimu` result object for `as.ts`, `as.mcmc`, `density` and `densityplot`.
`nestfun`	Function to analyse nestedness. Some functions are provided in vegan, but any function can be used if it accepts the community as the first argument, and returns either a plain number or the result in list item with the name defined in argument `statistic`. See Examples for defining your own functions.
`method`	Null model method. See details.
`nsimul`	Number of simulated null communities.
`burnin`	Number of null communities discarded before proper analysis in sequential methods `"swap"` and `"tswap"`.
`thin`	Number of discarded null communities between two evaluations of nestedness statistic in sequential methods `"swap"` and `"tswap"`.
`statistic`	The name of the statistic returned by `nestfun`.
`alternative`	A character string specifying the alternative hypothesis; must be one of `"two.sided"` (default), `"greater"` or `"less"`. Please note that the p-value of the two-sided test is approximately two times higher than in the corresponding one-sided test (`"greater"` or `"less"` depending on the sign of the difference).
`data`	Ignored argument of the generic function.
`xlab`	Label of the x-axis.
`...`	Other arguments to functions.

Function oecosimu is a wrapper that evaluates a nestedness statistic using the function given by nestfun, then simulates a series of null models using commsimulator or other functions (depending on the method argument), and evaluates the statistic on these null models. The vegan package contains some nestedness functions that are described separately (nestedchecker, nesteddisc, nestedn0, nestedtemp), but many other functions can be used as long as they are meaningful with binary or quantitative community models. An applicable function must return either the statistic as a plain number, or as a list element "statistic" (like chisq.test), or in an item whose name is given in the argument statistic. The statistic can be a single number (as is typical for a nestedness index), or it can be a vector. The vector indices can be used to analyse site (row) or species (column) properties; see treedive for an example. The Raup-Crick index (raupcrick) gives an example of using a dissimilarity index.
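The return-value contract described above can be sketched in base R. Both functions below are hypothetical illustrations and not part of vegan: the first returns a plain number, the second returns a list whose item name (here "checkerboards", an arbitrary choice) would be passed in the statistic argument. The statistic itself is a simple count of checkerboard units over all species pairs.

```r
## Hypothetical statistic functions illustrating the two accepted return forms.
## Assumes comm is a sites x species matrix; values > 0 count as presences.

## Form 1: return the statistic as a plain number
checker_count <- function(comm) {
  m <- (as.matrix(comm) > 0) * 1   # coerce to 0/1
  shared <- crossprod(m)           # species x species co-occurrence counts
  occ <- diag(shared)              # species frequencies
  only_i <- occ - shared           # [i, j]: sites with species i but not j
  units <- only_i * t(only_i)      # [i, j]: checkerboard units for pair (i, j)
  sum(units[upper.tri(units)])     # count each pair once
}

## Form 2: return a list; the name of the item holding the statistic
## is what would be given in the 'statistic' argument
checker_list <- function(comm) {
  list(checkerboards = checker_count(comm), nspecies = ncol(comm))
}

toy <- rbind(c(1, 0, 1),
             c(0, 1, 1),
             c(1, 0, 0))
checker_count(toy)
checker_list(toy)$checkerboards
```

Under this contract, a call such as oecosimu(toy, checker_list, "r0", statistic = "checkerboards") would be expected to pick the named item, while checker_count could be passed without a statistic argument.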

Function commsimulator implements binary (presence/absence) null models for community composition. The implemented models are r00 which maintains the number of presences but fills these anywhere so that neither species (column) nor site (row) totals are preserved. Methods r0, r1 and r2 maintain the site (row) frequencies. Method r0 fills presences anywhere on the row with no respect to species (column) frequencies, r1 uses column marginal frequencies as probabilities, and r2 uses squared column sums. Methods r1 and r2 try to simulate original species frequencies, but they are not strictly constrained. All these methods are reviewed by Wright et al. (1998). Method c0 maintains species frequencies, but does not honour site (row) frequencies (Jonsson 2001).
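The non-sequential fills described above can be illustrated with a few lines of base R. This is a minimal sketch of the ideas, not vegan's implementation; the function names and the power argument are invented for the illustration, and r1 and r2 are approximated by weighted sampling without replacement.

```r
## Base R sketches of the non-sequential fills; not vegan's implementation.
## m is a 0/1 matrix with sites as rows and species as columns.

r00_sketch <- function(m) {        # keeps only the total number of presences
  out <- m
  out[] <- sample(as.vector(m))
  out
}

r0_sketch <- function(m) {         # keeps row (site) totals
  t(apply(m, 1, sample))
}

c0_sketch <- function(m) {         # keeps column (species) totals
  apply(m, 2, sample)
}

## r1-like fill (power = 1) or r2-like fill (power = 2): row totals are kept,
## columns are picked with probability proportional to colSums(m)^power
r1_sketch <- function(m, power = 1) {
  p <- colSums(m)^power
  out <- m
  out[] <- 0
  for (i in seq_len(nrow(m))) {
    k <- sum(m[i, ])
    picked <- sample.int(ncol(m), k, prob = p)
    out[i, picked] <- 1
  }
  out
}

set.seed(1)
m <- rbind(c(1, 1, 0, 0),
           c(1, 0, 1, 0),
           c(1, 1, 1, 0),
           c(0, 0, 0, 1))
rowSums(r0_sketch(m))   # same as rowSums(m)
colSums(c0_sketch(m))   # same as colSums(m)
```

Only r00, r0 and c0 give exact constraints here; the r1-like and r2-like fills keep row totals exactly but reproduce species frequencies only on average.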

The other methods maintain both row and column frequencies. Methods swap and tswap implement sequential methods, where the matrix is changed only a little in one step, but the changed matrix is used as the input of the next step. Methods swap and tswap inspect random 2x2 submatrices, and if they are checkerboard units, the order of columns is swapped. This changes the matrix structure, but does not influence marginal sums (Gotelli & Entsminger 2003). Method swap inspects submatrices until a swap can be done. Miklós & Podani (2004) suggest that this may lead to biased sequences, since some columns or rows may be more easily swapped, and they suggest trying a fixed number of times and doing zero to many swaps at one step. This approach is implemented by method tswap, or trial swap. Function commsimulator makes only one trial swap at a time (which probably does nothing), but oecosimu estimates how many submatrices are expected to be inspected before finding a swappable checkerboard, and uses that ratio to thin the results, so that on average one swap will be found per step of tswap. However, the checkerboard frequency probably changes during swaps, and this is not taken into account in estimating the thin. One swap still changes the matrix only a little, and it may be useful to thin the results so that the statistic is only evaluated after burnin steps (and thinned).
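The difference between a trial swap and a swap can be sketched in base R as follows. The helper names and the max_tries guard are assumptions made for this illustration; vegan's own code differs and is not reproduced here.

```r
## Sketch of the swap and trial-swap steps in base R (not vegan's code).
## m must be a 0/1 matrix with at least two rows and two columns.

is_checkerboard <- function(sub) {
  # TRUE for the 2x2 patterns (1 0 / 0 1) and (0 1 / 1 0)
  sub[1, 1] == sub[2, 2] && sub[1, 2] == sub[2, 1] && sub[1, 1] != sub[1, 2]
}

## One trial: inspect a single random 2x2 submatrix, swap only if possible
trial_swap <- function(m) {
  i <- sample.int(nrow(m), 2)
  j <- sample.int(ncol(m), 2)
  sub <- m[i, j]
  swapped <- is_checkerboard(sub)
  if (swapped) m[i, j] <- sub[, 2:1]   # swap the order of the two columns
  list(m = m, swapped = swapped)
}

## "swap": keep inspecting submatrices until one swap has been done.
## max_tries is a safety guard added for this sketch.
swap_step <- function(m, max_tries = 10000) {
  for (k in seq_len(max_tries)) {
    res <- trial_swap(m)
    if (res$swapped) return(res$m)
  }
  warning("no checkerboard found; matrix returned unchanged")
  m
}

set.seed(42)
m <- rbind(c(1, 0, 1, 0),
           c(0, 1, 1, 0),
           c(1, 1, 0, 1),
           c(0, 0, 1, 1))
chain <- m
for (step in 1:50) chain <- swap_step(chain)   # each step feeds the next
rowSums(chain)   # unchanged marginal totals
colSums(chain)
```

Because each matrix is derived from the previous one, successive matrices are correlated, which is why burnin and thin matter for these methods.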

Methods quasiswap and backtracking are not sequential, but each call produces a matrix that is independent of previous matrices, and has the same marginal totals as the original data. The recommended method is quasiswap which is much faster because it is implemented in C. Method backtracking is provided for comparison, but it is so slow that it may be dropped from future releases of vegan (or also implemented in C). Method quasiswap (Miklós & Podani 2004) implements a method where the matrix is first filled honouring row and column totals, but with integers that may be larger than one. Then the method inspects random 2x2 matrices and performs a quasiswap on them. Quasiswap is similar to ordinary swap, but it can also reduce numbers above one to ones while maintaining marginal totals. Method backtracking implements a filling method with constraints both for row and column frequencies (Gotelli & Entsminger 2001). The matrix is first filled randomly using row and column frequencies as probabilities. Typically row and column sums are reached before all incidences are filled in. After that begins “backtracking”, where some of the points are removed, and then filling is started again, and this backtracking is done so many times that all incidences will be filled into the matrix. The quasiswap method is not sequential, but it produces a random incidence matrix with given marginal totals.
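The two stages of quasiswap (an initial fill with possible entries above one, followed by quasiswaps that remove them) can be sketched in base R. This is an illustration only: using r2dtable for the initial fill, the acceptance rule and the max_iter guard are all assumptions of this sketch, and vegan's C implementation may differ in detail.

```r
## Sketch of the quasiswap idea in base R; the acceptance rule below is an
## assumption made for this illustration, not taken from vegan's C code.
quasiswap_sketch <- function(m, max_iter = 1e6) {
  # Stage 1: fill with fixed row and column totals; entries may exceed one
  out <- r2dtable(1, rowSums(m), colSums(m))[[1]]
  iter <- 0
  # Stage 2: quasiswaps on random 2x2 submatrices until all entries are 0 or 1
  while (any(out > 1) && iter < max_iter) {
    iter <- iter + 1
    i <- sample.int(nrow(out), 2)   # ordered pairs, so both diagonals get tried
    j <- sample.int(ncol(out), 2)
    a  <- out[i[1], j[1]]; b <- out[i[1], j[2]]
    cc <- out[i[2], j[1]]; d <- out[i[2], j[2]]
    # Move one unit from each diagonal cell to the off-diagonal cells when
    # that does not increase the sum of squares; margins are unchanged
    if (a > 0 && d > 0 && a + d - b - cc >= 2) {
      out[i, j] <- out[i, j] + matrix(c(-1, 1, 1, -1), 2, 2)
    }
  }
  out
}

set.seed(7)
m <- rbind(c(1, 1, 1, 0, 0, 0),
           c(1, 1, 0, 1, 0, 0),
           c(1, 0, 0, 0, 1, 0),
           c(0, 1, 1, 1, 0, 1),
           c(1, 0, 0, 0, 0, 0))
q <- quasiswap_sketch(m)
range(q)      # entries reduced to 0 and 1
rowSums(q)    # same as rowSums(m)
```

Each call starts from a fresh fill, so successive results do not depend on each other, in contrast to swap and tswap.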

Function as.ts transforms the simulated results of sequential methods into a time series or a ts object. This allows using analytic tools for time series in studying the sequences (see examples). Function as.mcmc transforms the simulated results of sequential methods into an mcmc object of the coda package. The coda package provides functions for the analysis of stationarity, adequacy of sample size, autocorrelation, need of burn-in and much more for sequential methods. Please consult the documentation of the coda package.

Function density provides an interface to the standard density function for the simulated values. Function densityplot is an interface to the densityplot function of the lattice package. The density function can be used meaningfully only for single statistics, and its result must be plotted separately. The densityplot function can handle multiple statistics, and it plots the results directly. In addition to the density, densityplot also shows the observed value of the statistic (provided it is within the graph limits). The densityplot function is defined as a generic function in the lattice package, and you must either load the lattice library before calling densityplot, or use the longer form densityplot.oecosimu the first time you call the function.

With method = "r2dtable" in oecosimu, quantitative community null models are used to evaluate the statistic. This setting uses the r2dtable function to generate random matrices with fixed row and column totals (hypergeometric distribution). This null model is used in the diversity partitioning function (see adipart).
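The behaviour of r2dtable (from the stats package) can be checked directly. The small count matrix below is made up for the illustration.

```r
## Quantitative null matrices with fixed margins via stats::r2dtable.
## 'counts' is an invented example matrix (sites x species abundances).
counts <- rbind(c(5, 0, 2),
                c(1, 3, 0),
                c(0, 4, 6))
set.seed(10)
nulls <- r2dtable(3, rowSums(counts), colSums(counts))   # list of 3 matrices

## every simulated matrix keeps the observed row and column totals
keeps_margins <- sapply(nulls, function(z) {
  all(rowSums(z) == rowSums(counts)) && all(colSums(z) == colSums(counts))
})
keeps_margins
```

Note that the simulated cells are counts, not presences, so zeros in the observed data are not preserved.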

The method argument can be a function whose first argument takes the community matrix, optionally with burnin and thin arguments. The function must return a matrix-like object with the same dimensions. Be careful, however: blindly applying permuted matrices for null model testing can be dangerous.

Function oecosimu returns the result of nestfun with an added component called oecosimu. The oecosimu component contains the simulated values of the statistic (item simulated), the name of the method, the P value (with the given alternative), the z-value of the statistic based on simulation (also known as standardized effect size), and the mean of simulations.
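How a z-value, a mean and a P value can be derived from a set of simulated values is sketched below. The function null_summary is hypothetical, and the P value formula is a common permutation-test convention that is assumed here; the exact computation in oecosimu should be checked in its source.

```r
## Sketch of the summary items derived from simulated values.
## The P-value convention is an assumption of this sketch.
null_summary <- function(observed, simulated,
                         alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  nsimul <- length(simulated)
  # standardized effect size: distance from the null mean in null SD units
  z <- (observed - mean(simulated)) / sd(simulated)
  n_less <- sum(simulated <= observed)
  n_greater <- sum(simulated >= observed)
  count <- switch(alternative,
                  two.sided = 2 * min(n_less, n_greater),
                  less = n_less,
                  greater = n_greater)
  # the observed value is counted as one of the simulations
  p <- min(1, (count + 1) / (nsimul + 1))
  list(z = z, mean = mean(simulated), p = p)
}

set.seed(3)
simulated <- rnorm(99, mean = 10, sd = 2)   # stand-in for null statistics
res <- null_summary(25, simulated, "greater")
res$z
res$p
```

With nsimul = 99 the smallest attainable P value under this convention is 1/100, which is one reason the number of simulations is often chosen as 99, 999 and so on.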

Functions commsimulator and oecosimu do not have a default nestfun nor a default method, because there is no clear natural choice. If you use these methods, you must be able to choose your own strategy. The choice of nestedness index is difficult because the functions seem to imply very different concepts of structure and randomness. The choice of swapping method is also problematic. Method r00 has some heuristic value of being really random. However, it produces null models which are different from observed communities in most respects, and a “significant” result may simply mean that not all species are equally common (r0 is similar in this respect). It is also difficult to find justification for r2. The methods maintaining both row and column totals only study the community relations, but they can be very slow. Moreover, they regard marginal totals as constraints instead of results of occurrence patterns. You should evaluate timings in small trials (one cycle) before launching an extensive simulation. One swap is fast, but it changes the data only a little, and you may need long burnin and strong thinning in large matrices. You should plot the simulated values to see that they are more or less stationary and there is no trend. Method quasiswap is implemented in C and it is much faster than backtrack. Method backtrack may be removed from later releases of vegan because it is slow, but it is still included for comparison.

If you wonder about the name of oecosimu, look at journal names in the References (and more in nestedtemp).

Jari Oksanen

Gotelli, N.J. & Entsminger, N.J. (2001). Swap and fill algorithms in null model analysis: rethinking the knight's tour. Oecologia 129, 281–291.

Gotelli, N.J. & Entsminger, N.J. (2003). Swap algorithms in null model analysis. Ecology 84, 532–535.

Jonsson, B.G. (2001). A null model for randomization tests of nestedness in species assemblages. Oecologia 127, 309–313.

Miklós, I. & Podani, J. (2004). Randomization of presence-absence matrices: comments and new algorithms. Ecology 85, 86–92.

Wright, D.H., Patterson, B.D., Mikkelson, G.M., Cutler, A. & Atmar, W. (1998). A comparative analysis of nested subset patterns of species composition. Oecologia 113, 1–20.

r2dtable generates tables with given marginals but with entries above one. Functions permatfull and permatswap generate null models for count data. Function rndtaxa (labdsv package) randomizes a community table. See also nestedtemp (which also discusses other nestedness functions) and treedive for another application.

library(vegan)
data(sipoo)
## Traditional nestedness statistics (number of checkerboard units)
oecosimu(sipoo, nestedchecker, "r0")
## Use the eigenvalues of correspondence analysis (decorana) as an index
## of structure: a model for making your own functions.
## Sequential model, one-sided test, a vector statistic
out <- oecosimu(sipoo, decorana, "swap", burnin = 100, thin = 10,
   statistic = "evals", alternative = "greater")
out
## Inspect the swap sequence as a time series object
plot(as.ts(out))
lag.plot(as.ts(out))
acf(as.ts(out))
## Density plot (densityplot is a lattice generic, so load lattice first)
library(lattice)
densityplot(out, as.table = TRUE)
## Use quantitative null models to compare
## mean Bray-Curtis dissimilarities
data(dune)
meandist <- function(x) mean(vegdist(x, "bray"))
mbc1 <- oecosimu(dune, meandist, "r2dtable")
mbc1
## Define a custom function that shuffles
## cells within each column (species)
f <- function(x) {
    apply(x, 2, function(z) sample(z, length(z)))
}
mbc2 <- oecosimu(as.matrix(dune), meandist, f)
mbc2