rarefaction: Rarefaction Analysis

rarefaction    R Documentation

Rarefaction Analysis

Description

Performs a rarefaction analysis, a method widely used in ecology to compare species richness among samples at a standardized sample size. The function computes the expected number of species for increasing numbers of individuals, along with confidence intervals, following the classical approaches of Hurlbert (1971) and Heck et al. (1975) and related developments.

Usage

rarefaction(
  formula,
  data,
  x,
  step = 1,
  points = NULL,
  prob = 0.95,
  xlab,
  ylab,
  plot = TRUE,
  theme = "theme_classic"
)

Arguments

formula

An optional formula specifying the relationship between taxa and sample units (e.g., Taxon ~ Sample). If provided, the variables are taken from data. A third term, subtracted in the formula, excludes dead individuals (e.g., Taxon ~ Sample - Dead; the Examples use - Morta, the same label that the matrix examples exclude).

data

A data frame in 'long format' containing the variables specified in formula. It must contain one column with the sample unit labels (e.g., quadrats or points) and one column with the taxon name of each individual plant. The data frame passed as argument x to phytoparam can be used here.

x

An optional contingency table of species (rows) by samples (columns). If not provided, it is calculated from formula and data. Alternatively, it can be a vector representing the number of individuals per species (see Examples).

step

Step size for the sequence of sample sizes in the rarefaction curve. Default is 1.

points

Optional vector of specific sample sizes (breakpoints) for which to calculate rarefaction. If NULL, a sequence from 1 to the total number of individuals is used.

prob

The confidence level for the confidence intervals. Default is 0.95.

xlab

Label for the x-axis of the plot (defaults to "Number of individuals").

ylab

Label for the y-axis of the plot (defaults to "Number of species").

plot

Logical; if TRUE, a rarefaction curve is plotted. Default is TRUE.

theme

Character string with the name of a ggplot2 theme to be applied to the plot (e.g., "theme_light", "theme_bw", "theme_minimal"). Default is "theme_classic".

Details

Rarefaction analysis provides a standardized way to compare species richness among samples of different sizes. It is based on probabilistic resampling without replacement and produces an expected species accumulation curve. Confidence intervals are calculated following variance estimators proposed by Heck et al. (1975) and Tipper (1979).
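As an illustration of the calculation described above, the following is a minimal sketch of the expected richness of Hurlbert (1971) and of the corresponding variance, written as the variance of a sum of presence indicators (algebraically the form given by Heck et al. 1975). The helper names expected_richness and richness_variance and the toy abundances are hypothetical; this is not the package's own code.

```r
# Sketch only: these helpers are hypothetical and are not part of PhytoIn.
# counts: individuals per species; n: subsample size (0 <= n <= sum(counts)).

# Expected number of species among n individuals drawn without replacement:
# sum over species of 1 - P(species absent from the subsample).
expected_richness <- function(counts, n) {
  counts <- counts[counts > 0]
  N <- sum(counts)
  stopifnot(length(n) == 1, n >= 0, n <= N)
  # P(species i absent) = choose(N - N_i, n) / choose(N, n), on the log scale
  p_absent <- exp(lchoose(N - counts, n) - lchoose(N, n))
  sum(1 - p_absent)
}

# Variance of the species count: sum of the indicator variances plus
# all pairwise covariances between species.
richness_variance <- function(counts, n) {
  counts <- counts[counts > 0]
  N <- sum(counts)
  stopifnot(length(n) == 1, n >= 0, n <= N)
  p <- exp(lchoose(N - counts, n) - lchoose(N, n))
  # P(species i and j both absent); the diagonal is overwritten below
  p_both <- exp(lchoose(N - outer(counts, counts, "+"), n) - lchoose(N, n))
  covar <- p_both - outer(p, p)
  diag(covar) <- p * (1 - p)
  sum(covar)
}

counts <- c(5, 3, 2)          # toy abundances, not real data
expected_richness(counts, 2)  # 76/45, about 1.69
richness_variance(counts, 2)  # 434/2025, about 0.21
```

One common way to turn the variance into a band is a normal approximation, expected richness plus or minus qnorm(0.975) * sqrt(variance) for a 95% level. Whether rarefaction builds its intervals exactly this way is not stated on this page, so treat that step as an assumption.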

The function accepts data in three formats:

  • long format (formula + data arguments),

  • contingency matrix,

  • vector of individuals per species.

Dead individuals can be excluded by specifying an additional term in the formula.
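The relationship between the three input formats can be sketched with base R alone. The data frame toy below is hypothetical and only mirrors the column names used in the Examples (Plot, Species, with "Morta" marking dead individuals). The table is built with sample units in rows and taxa in columns, as in the Examples.

```r
# Toy long-format data (hypothetical values, for illustration only):
# one row per individual, with "Morta" marking dead individuals.
toy <- data.frame(
  Plot    = c("P1", "P1", "P1", "P2", "P2", "P3", "P3", "P3"),
  Species = c("A", "B", "Morta", "A", "A", "B", "C", "Morta"),
  stringsAsFactors = FALSE
)

# Drop dead individuals, then cross-tabulate sample units by taxa
alive <- toy[toy$Species != "Morta", ]
tab <- table(alive$Plot, alive$Species)  # contingency table

# Collapse over sample units: number of individuals per species
counts <- sort(colSums(tab), decreasing = TRUE)
counts
```

Under these assumptions, the long-format data, tab and counts all describe the same six living individuals.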

Value

A data frame with the following components:

  • n: sample size (number of individuals).

  • s: expected number of species.

  • lower: lower confidence interval bound.

  • upper: upper confidence interval bound.

If plot = TRUE, a rarefaction curve with confidence ribbons is produced using ggplot2.

Author(s)

Rodrigo Augusto Santinelo Pereira <raspereira@usp.br>

References

Colwell, R. K., Mao, C. X., & Chang, J. (2004). Interpolating, extrapolating, and comparing incidence-based species accumulation curves. Ecology, 85(10), 2717–2727. doi:10.1890/03-0557

Heck, K. L., Van Belle, G., & Simberloff, D. (1975). Explicit calculation of the rarefaction diversity measurement and the determination of sufficient sample size. Ecology, 56(6), 1459–1461. doi:10.2307/1934716

Hurlbert, S. H. (1971). The nonconcept of species diversity: A critique and alternative parameters. Ecology, 52(4), 577–586. doi:10.2307/1934145

Tipper, J. C. (1979). Rarefaction and rarefiction—The use and abuse of a method in paleoecology. Paleobiology, 5(4), 423–434. doi:10.1017/S0094837300016924

Examples

## Using 'formula' (long format)
rarefaction(
  formula = Species ~ Plot - Morta,
  data = quadrat.df,
  plot = TRUE
)


## Using different plot themes
rarefaction(
  formula = Species ~ Plot - Morta,
  data = quadrat.df,
  plot = TRUE,
  theme = "theme_light"
)
rarefaction(
  formula = Species ~ Plot - Morta,
  data = quadrat.df,
  plot = TRUE,
  theme = "theme_bw"
)
rarefaction(
  formula = Species ~ Plot - Morta,
  data = quadrat.df,
  plot = TRUE,
  theme = "theme_minimal"
)

## Using a matrix (wide format)
data.matrix <- with(
  quadrat.df,
  table(Plot, Species, exclude = "Morta")
)
rarefaction(x = data.matrix, plot = TRUE)

data.matrix <- as.matrix(
  xtabs(~ Plot + Species, data = quadrat.df, exclude = "Morta")
)
rarefaction(x = data.matrix, plot = TRUE)

## Using a vector
data.vector <- sort(
  as.vector(apply(data.matrix, 2, sum)),
  decreasing = TRUE
)
rarefaction(x = data.vector, plot = TRUE)

## Using breakpoints
pts <- c(1, 10, 30, 50, 80)
rarefaction(
  formula = Species ~ Plot - Morta,
  data = quadrat.df,
  points = pts,
  plot = TRUE
)
rarefaction(x = data.matrix, points = pts, plot = TRUE)
rarefaction(
  x = data.vector,
  points = pts,
  plot = TRUE,
  theme = "theme_light"
)
rarefaction(x = data.vector, points = 50, plot = FALSE)


PhytoIn documentation built on Nov. 5, 2025, 5:47 p.m.