In SSP: Simulated Sampling Procedure for Community Ecology

sampsd

R Documentation

Sampling Simulated Data and Estimation of Multivariate Standard Errors

Description

For each simulated data set, this function performs repeated sampling across a range of effort levels and estimates the corresponding MultSE (pseudo-multivariate standard error) using dissimilarity-based methods.

Usage

sampsd(dat.sim, Par, transformation, method, n, m, k)

Arguments

`dat.sim`	A list of simulated data sets generated by `simdata`.
`Par`	A list of parameters estimated by `assempar`.
`transformation`	Mathematical transformation to reduce the influence of dominant species: one of "square root", "fourth root", "Log (X+1)", "P/A", or "none".
`method`	Dissimilarity metric to use, passed to `vegdist` (e.g., "bray", "jaccard", "gower").
`n`	Maximum number of sampling units per site (must be <= total units available).
`m`	Maximum number of sites to sample per data set (must be <= total number of sites).
`k`	Number of repetitions of each sampling configuration (samples × sites) for each data set.
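As a rough illustration of what the `transformation` options do to an abundance matrix, the sketch below maps each option to a base R expression. The helper name `transform_counts` is hypothetical and is not part of SSP; the mapping (in particular, the natural logarithm for "Log (X+1)") is an assumption based on the usual meaning of these transformations, not a copy of the package's internals.

```r
# Hypothetical helper, not an SSP function: one base R expression per option
# accepted by the `transformation` argument (assumed meanings).
transform_counts <- function(x, transformation = "none") {
  switch(transformation,
    "square root" = sqrt(x),
    "fourth root" = x^0.25,
    "Log (X+1)"   = log(x + 1),     # natural log assumed
    "P/A"         = 1 * (x > 0),    # presence/absence as 0/1
    "none"        = x,
    stop("unknown transformation: ", transformation)
  )
}

# Toy abundance matrix (2 sampling units x 2 species)
counts <- matrix(c(0, 1, 16, 81), nrow = 2)
fourth <- transform_counts(counts, "fourth root")
pa     <- transform_counts(counts, "P/A")
```

Stronger transformations (fourth root, presence/absence) compress large counts more, so dominant species contribute less to the resulting dissimilarities.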

Details

For multi-site simulations, the function selects subsets of sites (from 2 to m) and then draws sampling units within each selected site (from 2 to n) using a two-stage sampling method with inclusion probabilities (Tillé, 2006). For single-site simulations, repeated samples of size 2 to n are taken without replacement.
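The single-site scheme can be sketched in base R as follows. This is only an illustration of "repeated samples of size 2 to n without replacement", not the package's code: the `draws` data frame and its column names are hypothetical, and the values of `N`, `n`, and `k` are borrowed from the Examples section. The two-stage, inclusion-probability sampling used for multiple sites is not reproduced here.

```r
set.seed(42)

N <- 20   # sampling units available in one simulated data set
n <- 10   # maximum sample size to evaluate
k <- 3    # repetitions per sample size

# One row per (repetition, sample size) combination
draws <- expand.grid(rep = seq_len(k), size = 2:n)

# For each row, draw `size` unit indices from 1..N without replacement
draws$units <- lapply(draws$size, function(s) sample(N, s, replace = FALSE))

n_draws <- nrow(draws)   # (n - 1) * k samples per simulated data set
```

Because the number of samples per data set grows as (n - 1) * k, and each sample requires its own dissimilarity matrix, the cost of a run scales directly with both arguments.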

Each sample undergoes the selected transformation and a dissimilarity matrix is computed. MultSE is estimated using:

Single site: pseudo-variance V, with MultSE = \sqrt{V/n}
Multiple sites: mean squares from a PERMANOVA model (residual and site effects)
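For the single-site case, the estimator can be written in a few lines of base R, following the definition in Anderson & Santana-Garcon (2015): the pseudo-variance V is the sum of squared pairwise dissimilarities divided by n(n - 1). The function name `multse_single` is hypothetical, and the sketch uses Euclidean distances from `dist` only to stay free of dependencies; `sampsd` itself takes its dissimilarities from `vegdist`. The multi-site (PERMANOVA mean squares) case is not shown.

```r
# Hypothetical helper, not an SSP function.
# d: a "dist" object among the n sampling units of one sample
multse_single <- function(d) {
  n  <- attr(d, "Size")
  ss <- sum(as.vector(d)^2) / n   # pseudo sum of squares from the pairs i < j
  v  <- ss / (n - 1)              # pseudo-variance V
  sqrt(v / n)                     # MultSE = sqrt(V / n)
}

set.seed(1)
x <- matrix(rpois(10 * 5, lambda = 3), nrow = 10)   # 10 units x 5 species
d <- dist(x)                                        # Euclidean, for illustration
mse <- multse_single(d)
```

With Euclidean distances, V reduces to the sum of the per-species variances, which gives a convenient sanity check on the formula; with Bray-Curtis or Jaccard there is no such shortcut, but the computation is identical.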

This procedure is computationally intensive, especially with large k. Start with low values for exploration.

Value

A matrix containing the estimated MultSE values for each simulated data set, sampling effort combination, and repetition. This matrix is used by summary_ssp.

Note

For quick exploratory analysis, use a small k. Once a suitable range of sampling effort has been identified, rerun with a larger k (e.g., 100) for more stable estimates; computation time increases accordingly.

References

Anderson, M. J., & Santana-Garcon, J. (2015). Measures of precision for dissimilarity-based multivariate analysis of ecological communities. Ecology Letters, 18(1), 66–73.

Guerra-Castro, E. J., Cajas, J. C., Simoes, N., Cruz-Motta, J. J., & Mascaro, M. (2021). SSP: An R package to estimate sampling effort in studies of ecological communities. Ecography, 44(4), 561–573. doi:10.1111/ecog.05284

Tillé, Y. (2006). Sampling Algorithms. Springer, New York.

Examples

## Single site example
data(micromollusk)
par.mic <- assempar(data = micromollusk, type = "P/A", Sest.method = "average")
sim.mic <- simdata(par.mic, cases = 3, N = 20, sites = 1)
sam.mic <- sampsd(dat.sim = sim.mic, Par = par.mic, transformation = "P/A",
                  method = "jaccard", n = 10, m = 1, k = 3)

## Multiple site example
data(sponges)
par.spo <- assempar(data = sponges, type = "counts", Sest.method = "average")
sim.spo <- simdata(par.spo, cases = 3, N = 20, sites = 3)
sam.spo <- sampsd(dat.sim = sim.spo, Par = par.spo, transformation = "square root",
                  method = "bray", n = 10, m = 3, k = 3)

SSP documentation built on June 8, 2025, 11:41 a.m.