diversity_ci: Perform bootstrap statistics, calculate, and plot confidence...
In poppr: Genetic Analysis of Populations with Mixed Reproduction

diversity_ci

R Documentation

Perform bootstrap statistics, calculate, and plot confidence intervals.

Description

This function is for calculating bootstrap statistics and their confidence intervals. It is important to note that the calculation of confidence intervals is not perfect (See Details). Please be cautious when interpreting the results.

Usage

diversity_ci(
  tab,
  n = 1000,
  n.boot = 1L,
  ci = 95,
  total = TRUE,
  rarefy = FALSE,
  n.rare = 10,
  plot = TRUE,
  raw = TRUE,
  center = TRUE,
  ...
)

Arguments

`tab`	an `adegenet::genind()`, `genclone()`, or `snpclone()` object, or a matrix produced from `mlg.table()`.
`n`	an integer defining the number of bootstrap replicates (defaults to 1000).
`n.boot`	an integer specifying the number of samples to be drawn in each bootstrap replicate. If `n.boot` < 2 (default), the number of samples drawn for each bootstrap replicate will be equal to the number of samples in the data set. See Details.
`ci`	the percent for confidence interval.
`total`	argument to be passed on to `mlg.table()` if `tab` is a genind object.
`rarefy`	if `TRUE`, bootstrapping will be performed on the smallest population size or the value of `n.rare`, whichever is larger. Defaults to `FALSE`, indicating that bootstrapping will be performed respective to each population size.
`n.rare`	an integer specifying the smallest size at which to resample data. This is only used if `rarefy = TRUE`.
`plot`	If `TRUE` (default), boxplots will be produced for each population, grouped by statistic. Colored dots will indicate the observed value. This plot can be retrieved by using `p <- last_plot()` from the ggplot2 package.
`raw`	if `TRUE` (default), a list containing four elements (`obs`, `est`, `CI`, and `boot`) will be returned. If `FALSE`, a data frame will be returned instead. See Value.
`center`	if `TRUE` (default), the confidence interval will be centered around the observed statistic. Otherwise, if `FALSE`, the confidence interval will be the bias-corrected normal CI as reported by `boot::boot.ci()`.
`...`	parameters to be passed on to `boot::boot()` and `diversity_stats()`

Details

Bootstrapping

For details on the bootstrapping procedures, see diversity_boot(). Default bootstrapping is performed by sampling N samples from a multinomial distribution weighted by the relative multilocus genotype abundance per population, where N is equal to the number of samples in the data set. If n.boot is 2 or greater, then n.boot samples are taken at each bootstrap replicate. When rarefy = TRUE, samples are taken at the smallest population size without replacement. This will provide confidence intervals for all but the smallest population.
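As a rough illustration of the default scheme described above, the following base R sketch draws bootstrap replicates for a single population. It is not poppr's internal code: the counts in mlg_counts are invented, and shannon() is a hypothetical helper written only for this example.

```r
# Hypothetical MLG counts for one population (not from a real data set)
mlg_counts <- c(MLG.1 = 12, MLG.2 = 5, MLG.3 = 2, MLG.4 = 1)

# Hypothetical helper: Shannon-Wiener index from a vector of counts
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

set.seed(1)
N      <- sum(mlg_counts)   # number of samples in the population
n_reps <- 1000              # number of bootstrap replicates

# Each column is one replicate: N samples drawn from a multinomial
# distribution weighted by the relative abundance of each MLG
reps <- stats::rmultinom(n_reps, size = N, prob = mlg_counts / N)

boot_H <- apply(reps, 2, shannon)   # statistic for every replicate
obs_H  <- shannon(mlg_counts)       # observed statistic
```

Changing size to a value other than N would mimic drawing a fixed number of samples per replicate, which is the role the n.boot argument plays.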

Confidence intervals

Confidence intervals are derived from the function boot::norm.ci(). This function will attempt to correct for bias between the observed value and the bootstrapped estimate. When center = TRUE (default), the confidence interval is calculated from the bootstrapped distribution and centered around the bias-corrected estimate as prescribed in Marcon (2012). This method can lead to undesirable properties, such as the confidence interval lying outside of the maximum possible value. For rarefaction, the confidence interval is simply determined by calculating the percentiles from the bootstrapped distribution. If you want to calculate your own confidence intervals, you can use the results of the permutations stored in the $boot element of the output.
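The three kinds of interval mentioned above differ only in a few lines of arithmetic. The sketch below works on simulated numbers (obs and boot_t are invented) and is meant to show the general idea, not to reproduce poppr's exact output. The bias-corrected version is intended to mirror the calculation in boot::norm.ci() when the variance is taken from the replicates.

```r
set.seed(42)
# Hypothetical inputs: an observed statistic and 1000 bootstrap replicates
# that sit below it, as is typical for diversity indices
obs    <- 1.05
boot_t <- rnorm(1000, mean = 0.95, sd = 0.08)

ci    <- 95
alpha <- (1 + ci / 100) / 2          # 0.975 for a 95% interval
merr  <- qnorm(alpha) * sd(boot_t)   # half-width of the normal interval
bias  <- mean(boot_t) - obs          # negative when replicates sit below obs

# Normal interval centered on the observed statistic
centered <- c(lower = obs - merr, upper = obs + merr)

# Bias-corrected normal interval: shifted by the estimated bias
bias_corrected <- c(lower = obs - bias - merr, upper = obs - bias + merr)

# Percentile interval taken directly from the replicates
percentile <- quantile(boot_t, probs = c(1 - alpha, alpha), names = FALSE)
names(percentile) <- c("lower", "upper")

rbind(centered, bias_corrected, percentile)
```

Because the replicates are biased downward, the percentile interval can exclude the observed value entirely, while the bias-corrected interval is pushed upward and may exceed the maximum the statistic can take.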

Rarefaction

Rarefaction in the sense of this function is simply sampling a subset of the data at size n.rare. The estimates derived from this method have straightforward interpretations and allow you to compare diversity across populations since you are controlling for sample size.
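A minimal base R sketch of this kind of subsampling is shown below. The populations in pops are invented, rarefy_once() is a hypothetical helper, and the statistic (number of MLGs observed) was chosen only because it is easy to read.

```r
# Hypothetical MLG counts for two populations of unequal size
pops <- list(
  large = c(MLG.1 = 30, MLG.2 = 10, MLG.3 = 6, MLG.4 = 4),
  small = c(MLG.1 = 6, MLG.2 = 3, MLG.3 = 1)
)

# Draw n_rare individuals without replacement and re-tabulate the counts
rarefy_once <- function(counts, n_rare) {
  individuals <- rep(seq_along(counts), times = counts)
  drawn <- individuals[sample.int(length(individuals), n_rare)]
  tabulate(drawn, nbins = length(counts))
}

set.seed(7)
n_rare <- min(sapply(pops, sum))   # smallest population size (10 here)

# Number of MLGs observed in each of 500 subsamples per population
richness <- lapply(pops, function(counts) {
  replicate(500, sum(rarefy_once(counts, n_rare) > 0))
})

sapply(richness, range)
```

Every subsample of the small population is the whole population, so its statistic never varies. This is why rarefaction at the smallest population size yields no confidence interval for that population.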

Plotting

Results are plotted as boxplots with point estimates. If there is no rarefaction applied, confidence intervals are displayed around the point estimates. The boxplots represent the actual values from the bootstrapping and will often appear below the estimates and confidence intervals.

Value

raw = TRUE

`obs`	a matrix with observed statistics in columns, populations in rows.
`est`	a matrix with estimated statistics in columns, populations in rows.
`CI`	an array of 3 dimensions giving the lower and upper bound, the index measured, and the population.
`boot`	a list containing the output of `boot::boot()` for each population.

raw = FALSE

a data frame with the statistic observations, estimates, and confidence intervals in columns, and populations in rows. Note that the confidence intervals are converted to characters and rounded to three decimal places.
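If the raw list is kept, the boot element can be post-processed to build intervals of your own. The helper below is a hypothetical sketch, not part of poppr. It assumes each entry behaves like the output of boot::boot(), with replicates in the rows of t and a named vector of observed values in t0, and it is demonstrated on mock objects so that it runs without poppr installed.

```r
# Hypothetical helper: percentile intervals for every statistic in a list of
# boot-like objects (each needs `t`, replicates in rows, and a named `t0`)
percentile_ci <- function(boot_list, ci = 95) {
  tail_prob <- (100 - ci) / 200
  probs <- c(tail_prob, 1 - tail_prob)
  lapply(boot_list, function(b) {
    out <- apply(b$t, 2, stats::quantile, probs = probs, na.rm = TRUE)
    dimnames(out) <- list(c("lower", "upper"), names(b$t0))
    out
  })
}

# Mock objects standing in for the per-population output of boot::boot()
set.seed(3)
mock <- list(
  pop1 = list(t0 = c(H = 1.2, G = 2.9),
              t  = cbind(rnorm(200, 1.1, 0.1), rnorm(200, 2.7, 0.3))),
  pop2 = list(t0 = c(H = 0.8, G = 1.9),
              t  = cbind(rnorm(200, 0.7, 0.1), rnorm(200, 1.8, 0.2)))
)
res <- percentile_ci(mock)

# With poppr installed, the same helper could be tried on real output:
# res <- percentile_ci(diversity_ci(Pinf, n = 100L, plot = FALSE)$boot)
```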

Note

Confidence interval calculation

Almost all of the statistics supplied here have a maximum when all genotypes are equally represented. This means that estimates from bootstrapped samples will always be downwardly biased. In many cases, the confidence intervals from the bootstrapped distribution will not contain the observed statistic. The confidence intervals reported here are calculated by assuming that the variance of the bootstrapped distribution is the same as the variance around the observed statistic. As different statistics have different properties, there will not always be one clear method for calculating confidence intervals. A suggestion for correction in Shannon's index is to center the CI around the observed statistic (Marcon, 2012), but there are theoretical limitations to this. For details, see https://stats.stackexchange.com/q/156235/49413.
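The downward bias is easiest to see in the extreme case where the observed sample is perfectly even. The following base R simulation uses invented counts and a hypothetical shannon() helper. Resampling can only keep or reduce evenness, so no replicate can exceed the observed value.

```r
set.seed(11)
even_counts <- rep(5, 8)   # hypothetical: 8 MLGs with 5 samples each
N <- sum(even_counts)

# Hypothetical helper: Shannon-Wiener index from a vector of counts
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

obs_H  <- shannon(even_counts)   # equals log(8), the maximum for 8 MLGs
reps   <- rmultinom(2000, size = N, prob = even_counts / N)
boot_H <- apply(reps, 2, shannon)

c(observed = obs_H, bootstrap_mean = mean(boot_H), bootstrap_max = max(boot_H))
```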

User-defined functions

While it is possible to use custom functions with this function, there are three important things to remember when using them:

1. The function must return a single value.
2. The function must allow for both matrix and vector inputs.
3. The function name cannot match or partially match any arguments from boot::boot().

Anonymous functions are okay (e.g. function(x) vegan::rarefy(t(as.matrix(x)), 10)).
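A function that follows these rules might look like the sketch below. The statistic, called dominance here, is hypothetical and is not supplied by poppr. It was picked because it is simple to check by hand, and its name is not a prefix of any boot::boot() argument.

```r
# Hypothetical custom statistic: proportion of samples belonging to the most
# abundant MLG. Accepts a vector of counts (one population) or a matrix with
# populations in rows and MLGs in columns.
dominance <- function(x) {
  x <- drop(as.matrix(x))
  if (length(dim(x)) > 1) {
    apply(x, 1, max) / rowSums(x)   # one value per population
  } else {
    max(x) / sum(x)                 # a single value
  }
}

counts_vec <- c(12, 5, 2, 1)
counts_mat <- rbind(pop1 = c(12, 5, 2, 1), pop2 = c(4, 4, 4, 4))

dominance(counts_vec)
dominance(counts_mat)

# With poppr loaded, it would be supplied as a named argument, for example:
# diversity_ci(Pinf, n = 100L, dominance = dominance, raw = FALSE)
```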

Author(s)

Zhian N. Kamvar

References

Marcon, E., Herault, B., Baraloto, C. and Lang, G. (2012). The Decomposition of Shannon’s Entropy and a Confidence Interval for Beta Diversity. Oikos 121(4): 516-522.

Examples

library(poppr)
data(Pinf)
diversity_ci(Pinf, n = 100L)
## Not run: 
# With pretty results
diversity_ci(Pinf, n = 100L, raw = FALSE)

# This can be done in a parallel fashion (OSX uses "multicore", Windows uses "snow")
system.time(diversity_ci(Pinf, 10000L, parallel = "multicore", ncpus = 4L))
system.time(diversity_ci(Pinf, 10000L))

# We often get many requests for a clonal fraction statistic. As this is
# simply the number of observed MLGs over the number of samples, we
# recommend that people calculate it themselves. With this function, you
# can add it in:

CF <- function(x){
 x <- drop(as.matrix(x))
 if (length(dim(x)) > 1){
   res <- rowSums(x > 0)/rowSums(x)
 } else {
   res <- sum(x > 0)/sum(x)
 }
 return(res)
}
# Show pretty results

diversity_ci(Pinf, 1000L, CF = CF, center = TRUE, raw = FALSE)
diversity_ci(Pinf, 1000L, CF = CF, rarefy = TRUE, raw = FALSE)

## End(Not run)

poppr documentation built on June 19, 2025, 1:08 a.m.
