estimateDominance: Estimate dominance measures
In microbiome/mia: Microbiome analysis

estimateDominance

R Documentation

Estimate dominance measures

Description

This function calculates community dominance indices. This includes the ‘Absolute’, ‘Berger-Parker’, ‘Core abundance’, ‘Gini’, ‘McNaughton’s’, ‘Relative’, and ‘Simpson's’ indices.

Usage

estimateDominance(
  x,
  assay.type = assay_name,
  assay_name = "counts",
  index = c("absolute", "dbp", "core_abundance", "gini", "dmn", "relative",
    "simpson_lambda"),
  ntaxa = 1,
  aggregate = TRUE,
  name = index,
  ...,
  BPPARAM = SerialParam()
)

## S4 method for signature 'SummarizedExperiment'
estimateDominance(
  x,
  assay.type = assay_name,
  assay_name = "counts",
  index = c("absolute", "dbp", "core_abundance", "gini", "dmn", "relative",
    "simpson_lambda"),
  ntaxa = 1,
  aggregate = TRUE,
  name = index,
  ...,
  BPPARAM = SerialParam()
)

Arguments

`x`	a `SummarizedExperiment` object
`assay.type`	A single character value for selecting the `assay` to calculate the sample-wise estimates.
`assay_name`	a single `character` value for specifying which assay to use for calculation. (Please use `assay.type` instead. At some point `assay_name` will be disabled.)
`index`	a `character` vector, specifying the indices to be calculated.
`ntaxa`	Optional and only used for the `Absolute` and `Relative` dominance indices: The n-th position of the dominant taxa to consider (default: `ntaxa = 1`). Disregarded for the indices “dbp”, “core_abundance”, “Gini”, “dmn”, and “Simpson”.
`aggregate`	Optional and only used for the `Absolute`, `dbp`, `Relative`, and `dmn` dominance indices: Aggregate the values for top members selected by `ntaxa` or not. If `TRUE`, then the sum of relative abundances is returned. Otherwise the relative abundance is returned for the single taxa with the indicated rank (default: `aggregate = TRUE`). Disregarded for the indices “core_abundance”, “gini”, “dmn”, and “simpson”.
`name`	A name for the column(s) of the colData where the calculated Dominance indices should be stored in.
`...`	additional arguments currently not used.
`BPPARAM`	A `BiocParallelParam` object specifying whether calculation of estimates should be parallelized. (Currently not used)

Details

A dominance index quantifies the dominance of one or few species in a community. Greater values indicate higher dominance.

Dominance indices are in general negatively correlated with alpha diversity indices (species richness, evenness, diversity, rarity). More dominant communities are less diverse.

estimateDominance calculates the following community dominance indices:

'absolute' Absolute index equals to the absolute abundance of the most dominant n species of the sample (specify the number with the argument ntaxa). Index gives positive integer values.
'dbp' Berger-Parker index (See Berger & Parker 1970) calculation is a special case of the 'relative' index. dbp is the relative abundance of the most abundant species of the sample. Index gives values in interval 0 to 1, where bigger value represent greater dominance.

dbp = \frac{N_1}{N_{tot}}

where N_1 is the absolute abundance of the most dominant species and N_{tot} is the sum of absolute abundances of all species.
'core_abundance' Core abundance index is related to core species. Core species are species that are most abundant in all samples, i.e., in whole data set. Core species are defined as those species that have prevalence over 50\ species must be prevalent in 50\ calculate the core abundance index. Core abundance index is sum of relative abundances of core species in the sample. Index gives values in interval 0 to 1, where bigger value represent greater dominance.

core_abundance = \frac{N_{core}}{N_{tot}}

where N_{core} is the sum of absolute abundance of the core species and N_{tot} is the sum of absolute abundances of all species.
'gini' Gini index is probably best-known from socio-economic contexts (Gini 1921). In economics, it is used to measure, for example, how unevenly income is distributed among population. Here, Gini index is used similarly, but income is replaced with abundance.

If there is small group of species that represent large portion of total abundance of microbes, the inequality is large and Gini index closer to 1. If all species has equally large abundances, the equality is perfect and Gini index equals 0. This index should not be confused with Gini-Simpson index, which quantifies diversity.
'dmn' McNaughton’s index is the sum of relative abundances of the two most abundant species of the sample (McNaughton & Wolf, 1970). Index gives values in the unit interval:

dmn = (N_1 + N_2)/N_tot

where N_1 and N_2 are the absolute abundances of the two most dominant species and N_{tot} is the sum of absolute abundances of all species.
'relative' Relative index equals to the relative abundance of the most dominant n species of the sample (specify the number with the argument ntaxa). This index gives values in interval 0 to 1.

relative = N_1/N_tot

where N_1 is the absolute abundance of the most dominant species and N_{tot} is the sum of absolute abundances of all species.
'simpson_lambda' Simpson's (dominance) index or Simpson's lambda is the sum of squared relative abundances. This index gives values in the unit interval. This value equals the probability that two randomly chosen individuals belongs to the same species. The higher the probability, the greater the dominance (See e.g. Simpson 1949).

lambda = \sum(p^2)

where p refers to relative abundances.

There is also a more advanced Simpson dominance index (Simpson 1949). However, this is not provided and the simpler squared sum of relative abundances is used instead as the alternative index is not in the unit interval and it is highly correlated with the simpler variant implemented here.

Value

x with additional colData named *name*

Author(s)

Leo Lahti and Tuomas Borman. Contact: microbiome.github.io

References

Berger WH & Parker FL (1970) Diversity of Planktonic Foraminifera in Deep-Sea Sediments. Science 168(3937):1345-1347. doi: 10.1126/science.168.3937.1345

Gini C (1921) Measurement of Inequality of Incomes. The Economic Journal 31(121): 124-126. doi: 10.2307/2223319

McNaughton, SJ and Wolf LL. (1970). Dominance and the niche in ecological systems. Science 167:13, 1–139

Simpson EH (1949) Measurement of Diversity. Nature 163(688). doi: 10.1038/163688a0

Examples

data(esophagus)

# Calculates Simpson's lambda (can be used as a dominance index)
esophagus <- estimateDominance(esophagus, index="simpson_lambda")

# Shows all indices
colData(esophagus)

# Indices must be written correctly (e.g. dbp, not dbp), otherwise an error
# gets thrown
esophagus <- estimateDominance(esophagus, index="dbp")
# Calculates dbp and Core Abundance indices
esophagus <- estimateDominance(esophagus, index=c("dbp", "core_abundance"))
# Shows all indices
colData(esophagus)
# Shows dbp index
colData(esophagus)$dbp
# Deletes dbp index
colData(esophagus)$dbp <- NULL
# Shows all indices, dbp is deleted
colData(esophagus)
# Deletes all indices
colData(esophagus) <- NULL

# Calculates all indices
esophagus <- estimateDominance(esophagus)
# Shows all indices
colData(esophagus)
# Deletes all indices
colData(esophagus) <- NULL

# Calculates all indices with explicitly specified names
esophagus <- estimateDominance(esophagus,
    index = c("dbp", "dmn", "absolute", "relative",
              "simpson_lambda", "core_abundance", "gini"),
    name  = c("BergerParker", "McNaughton", "Absolute", "Relative",
              "SimpsonLambda", "CoreAbundance", "Gini")
)
# Shows all indices
colData(esophagus)

microbiome/mia documentation built on April 27, 2024, 4:04 a.m.