datquality: Diversity Metrics of Simulated and Original Data

datquality R Documentation

Diversity Metrics of Simulated and Original Data

Description

Estimates the average number of species and the Simpson diversity index per sampling unit, as well as the total multivariate dispersion of both the original (pilot) and simulated datasets.

Usage

datquality(data, dat.sim, Par, transformation, method)

Arguments

data

Data frame with species as columns and samples as rows. The first column must indicate the site to which each sample belongs, even if only a single site was sampled.

dat.sim

List of simulated data sets generated by simdata.

Par

List of parameters generated by assempar.

transformation

Mathematical transformation to reduce the weight of dominant species: one of "square root", "fourth root", "Log (X+1)", "P/A", or "none".

method

Dissimilarity metric used for multivariate dispersion, passed to vegdist.
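
As an illustration of what the transformation options do to an abundance matrix, here is a minimal sketch in base R. The helper apply_transformation is hypothetical and is not part of SSP; the use of the natural logarithm for "Log (X+1)" is likewise an assumption.

```r
# Hypothetical helper: NOT part of SSP, it only illustrates the five options.
apply_transformation <- function(x, transformation = "none") {
  switch(transformation,
    "square root" = sqrt(x),
    "fourth root" = x^0.25,
    "Log (X+1)"   = log(x + 1),    # natural log assumed here
    "P/A"         = (x > 0) * 1,   # presence/absence coded as 1/0
    "none"        = x,
    stop("unknown transformation: ", transformation)
  )
}

# Toy counts: 2 sampling units (rows) x 2 species (columns).
counts <- matrix(c(0, 1, 16, 81), nrow = 2,
                 dimnames = list(c("s1", "s2"), c("sp1", "sp2")))

apply_transformation(counts, "fourth root")  # compresses 16 and 81 to 2 and 3
apply_transformation(counts, "P/A")          # keeps only presence or absence
```

The stronger the transformation, the less a dominant species contributes to the dissimilarity between sampling units; "P/A" removes abundance information entirely.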

Details

The quality of the simulated data sets is evaluated by statistical similarity to the pilot data. This includes: (i) the average number of species per sampling unit, (ii) the average Simpson diversity index, and (iii) the multivariate dispersion (MVD), defined as the average dissimilarity of each sampling unit to the group centroid in the dissimilarity space (Anderson 2006). For simulated datasets, mean and standard deviation are reported for (i) and (ii), and the 0.95 quantile of the MVD distribution is used to describe its variability.
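
The three quantities above can be sketched for a single toy data set in base R. This is an illustration under stated assumptions, not necessarily how datquality computes them: the matrix x is made up, Bray-Curtis is only one possible choice of method, and all sampling units are treated as one group. In practice, the vegan functions specnumber, diversity, vegdist and betadisper (with type = "centroid") provide tested implementations of the same ideas.

```r
# Toy abundance matrix: 4 sampling units (rows) x 3 species (columns).
x <- matrix(c(2, 2, 0,
              1, 0, 0,
              1, 1, 2,
              0, 3, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("unit", 1:4), c("sp1", "sp2", "sp3")))

# (i) Richness: number of species present in each sampling unit.
richness <- rowSums(x > 0)

# (ii) Simpson diversity, 1 - sum(p^2), with p the relative abundances.
p <- x / rowSums(x)
simpson <- 1 - rowSums(p^2)

# Bray-Curtis dissimilarity between all pairs of sampling units.
n <- nrow(x)
d <- matrix(0, n, n)
for (i in seq_len(n)) for (j in seq_len(n)) {
  d[i, j] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
}

# (iii) MVD: for a single group, the squared distance of unit i to the
# centroid in principal coordinate space equals the i-th diagonal element
# of the Gower-centred matrix G = -0.5 * H %*% d^2 %*% H.
H <- diag(n) - matrix(1 / n, n, n)
G <- -0.5 * H %*% (d^2) %*% H
dist_to_centroid <- sqrt(pmax(diag(G), 0))  # negative values are set to 0
mvd <- mean(dist_to_centroid)

c(mean_richness = mean(richness), sd_richness = sd(richness),
  mean_simpson = mean(simpson), sd_simpson = sd(simpson), MVD = mvd)
```

Working from the diagonal of G avoids an explicit eigendecomposition, which is convenient because dissimilarities such as Bray-Curtis are not Euclidean and the centroid cannot be obtained by averaging the raw species columns.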

Value

A data frame containing the mean and standard deviation of species richness and Simpson diversity per sampling unit, the MVD of the original data, and the 0.95 quantile of the MVD of the simulated data.
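
To make the 0.95 quantile concrete, the sketch below summarises the MVD over a set of simulated matrices. Everything in it is an assumption for illustration: the list sims is a hypothetical stand-in for simulated data (random Poisson counts, not output of simdata), and Euclidean distance is used only because the centroid is then the vector of column means.

```r
set.seed(42)  # reproducible toy data

# Hypothetical stand-ins for simulated data sets: 20 matrices of Poisson
# counts, each with 10 sampling units (rows) and 5 species (columns).
sims <- replicate(20, matrix(rpois(50, lambda = 2), nrow = 10),
                  simplify = FALSE)

# Mean Euclidean distance of the sampling units to their centroid.
mvd_euclidean <- function(m) {
  centred <- sweep(m, 2, colMeans(m))
  mean(sqrt(rowSums(centred^2)))
}

# One MVD value per simulated data set.
mvd_sim <- vapply(sims, mvd_euclidean, numeric(1))

# 0.95 quantile of the MVD distribution across simulations.
mvd_q95 <- quantile(mvd_sim, probs = 0.95)
mvd_q95
```

Comparing the MVD of the pilot data with this quantile gives a rough indication of whether the simulations spread the sampling units about as widely as the observed data.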

Note

It is desirable that simulated data resemble observed data in species richness and diversity per sampling unit.

References

Anderson, M. J. (2006). Distance-based tests for homogeneity of multivariate dispersions. Biometrics, 62, 245–253.

Guerra-Castro, E. J., Cajas, J. C., Simões, N., Cruz-Motta, J. J., & Mascaró, M. (2021). SSP: an R package to estimate sampling effort in studies of ecological communities. Ecography, 44(4), 561–573. doi: 10.1111/ecog.05284

See Also

vegdist, diversity

Examples

## Single site: micromollusks from Cayo Nuevo (Yucatan, Mexico)
data(micromollusk)
par.mic <- assempar(data = micromollusk, type = "P/A", Sest.method = "average")
sim.mic <- simdata(par.mic, cases = 3, N = 10, sites = 1)
qua.mic <- datquality(data = micromollusk, dat.sim = sim.mic, Par = par.mic,
                      transformation = "none", method = "jaccard")
qua.mic

## Multiple sites: Sponges from Alacranes National Park (Yucatan, Mexico)
data(sponges)
par.spo <- assempar(data = sponges, type = "counts", Sest.method = "average")
sim.spo <- simdata(par.spo, cases = 3, N = 10, sites = 3)
qua.spo <- datquality(data = sponges, dat.sim = sim.spo, Par = par.spo,
                      transformation = "square root", method = "bray")
qua.spo


SSP documentation built on June 8, 2025, 11:41 a.m.