View source: R/site_species_metrics.R
| site_species_metrics | R Documentation |
This function computes metrics that quantify how species and sites relate to clusters (bioregions or chorotypes). Depending on the type of clustering, metrics can measure how species are distributed across bioregions (site clusters), how sites relate to chorotypes (species clusters), or both.
site_species_metrics(
  bioregionalization,
  bioregion_metrics = c("Specificity", "NSpecificity", "Fidelity", "IndVal", "NIndVal",
    "Rho"),
  bioregionalization_metrics = "P",
  data_type = "auto",
  cluster_on = "site",
  comat,
  similarity = NULL,
  include_cluster = FALSE,
  index = names(similarity)[3],
  verbose = TRUE
)
bioregionalization |
A |
bioregion_metrics |
A
Use |
bioregionalization_metrics |
A
Use |
data_type |
A
|
cluster_on |
A
|
comat |
A site-species |
similarity |
A site-by-site similarity object from |
include_cluster |
A |
index |
The name or number of the column to use as similarity.
By default, the third column name of |
verbose |
A |
This function computes metrics that characterize the relationship between species, sites, and clusters. The available metrics depend on whether you clustered sites (into bioregions) or species (into chorotypes).
Bioregions are clusters of sites with similar species composition.
Chorotypes are clusters of species with similar distributions.
In general, the package is designed to cluster sites into bioregions. However, it is possible to group species into clusters. We call these species clusters 'chorotypes', following conceptual definitions in the biogeographical literature, to avoid any confusion in the calculation of metrics.
In some cases, such as bipartite network clustering, species and sites are
assigned to the same clusters. The name distinction is maintained in the
calculation of metrics, but in this case bioregion IDs and chorotype IDs
are identical.
The cluster_on argument determines
which perspective to use.
When sites are clustered (cluster_on = "site" or cluster_on = "both"):
Species-per-bioregion metrics quantify how each species is distributed across bioregions.
These metrics are derived from three core terms (see the online vignette for a visual diagram):
n_sb: Number of sites in bioregion b where species s is present.
n_s: Total number of sites in which species s is present.
n_b: Total number of sites in bioregion b.
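As a minimal sketch of how these three core terms can be derived, the base R snippet below uses a small hypothetical site-by-species matrix (toy_comat) and a hypothetical site-to-bioregion assignment (toy_bioregion). It does not call the package and is not necessarily how site_species_metrics() computes the terms internally.

```r
# Hypothetical toy data: 5 sites x 4 species, presence/absence
toy_comat <- matrix(
  c(1, 1, 0, 0,
    1, 0, 0, 0,
    1, 1, 1, 0,
    0, 0, 1, 1,
    0, 1, 1, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("site", 1:5), paste0("sp", 1:4))
)
# Hypothetical assignment of each site to a bioregion
toy_bioregion <- c(site1 = "A", site2 = "A", site3 = "A",
                   site4 = "B", site5 = "B")

pres   <- (toy_comat > 0) * 1              # force 0/1 presence
groups <- toy_bioregion[rownames(pres)]    # align assignment with matrix rows

n_sb <- t(rowsum(pres, group = groups))    # species x bioregion: occupied sites
n_s  <- colSums(pres)                      # sites occupied by each species
n_b  <- c(table(groups))                   # number of sites in each bioregion
```

Here n_sb is a species x bioregion matrix, while n_s and n_b are named vectors indexed by species and by bioregion respectively.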
Abundance versions of these core terms can also be calculated when
data_type = "abundance" (or when data_type = "auto" and
bioregionalization was based on abundance):
w_sb: Sum of abundances of species s in sites of bioregion b.
w_s: Total abundance of species s.
w_b: Total abundance of all species present in sites of bioregion b.
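The abundance-based terms follow the same pattern with sums of abundances instead of counts of occupied sites. The sketch below is only an illustration in base R; toy_abund and toy_bioregion are hypothetical objects, not package data.

```r
# Hypothetical toy data: 4 sites x 3 species, abundances
toy_abund <- matrix(
  c(5, 2, 0,
    3, 0, 0,
    0, 4, 1,
    0, 1, 6),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("site", 1:4), paste0("sp", 1:3))
)
# Hypothetical assignment of each site to a bioregion
toy_bioregion <- c(site1 = "A", site2 = "A", site3 = "B", site4 = "B")
groups <- toy_bioregion[rownames(toy_abund)]

w_sb <- t(rowsum(toy_abund, group = groups))  # species x bioregion abundance
w_s  <- colSums(toy_abund)                    # total abundance per species
w_b  <- colSums(w_sb)                         # total abundance per bioregion
```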
The species-per-bioregion metrics are:
Specificity: Fraction of a species' occurrences found in a given bioregion (De Cáceres & Legendre 2009). A value of 1 means the species occurs only in that bioregion.
NSpecificity: Normalized specificity that accounts for differences in bioregion size (De Cáceres & Legendre 2009).
Fidelity: Fraction of sites in a bioregion where the species occurs (De Cáceres & Legendre 2009). A value of 1 means the species is present in all sites of that bioregion.
IndVal: Indicator Value = Specificity × Fidelity (De Cáceres & Legendre 2009). High values identify species that are both restricted to and frequent within a bioregion.
NIndVal: Normalized IndVal accounting for bioregion size (De Cáceres & Legendre 2009).
Rho: Standardized contribution index comparing observed vs. expected co-occurrence under random association (Lenormand et al. 2019).
CoreTerms: Raw counts (n, n_b, n_s, n_sb) for custom calculations.
These metrics can be found in the output slot species_bioregions.
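To make the verbal definitions of Specificity, Fidelity and IndVal concrete, the following base R sketch computes them on hypothetical toy data (toy_comat, toy_bioregion). It only implements the occurrence-based definitions as worded above; the normalized variants and Rho are left out because their formulas are not given here, and the package's own implementation may differ in details.

```r
# Hypothetical toy data: 5 sites x 4 species, presence/absence
toy_comat <- matrix(
  c(1, 1, 0, 0,
    1, 0, 0, 0,
    1, 1, 1, 0,
    0, 0, 1, 1,
    0, 1, 1, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("site", 1:5), paste0("sp", 1:4))
)
toy_bioregion <- c(site1 = "A", site2 = "A", site3 = "A",
                   site4 = "B", site5 = "B")

pres   <- (toy_comat > 0) * 1
groups <- toy_bioregion[rownames(pres)]

# Core terms
n_sb <- t(rowsum(pres, group = groups))    # species x bioregion
n_s  <- colSums(pres)                      # per species
n_b  <- c(table(groups))                   # per bioregion

# Specificity: share of a species' occurrences located in each bioregion
specificity <- sweep(n_sb, 1, n_s[rownames(n_sb)], "/")
# Fidelity: share of a bioregion's sites occupied by the species
fidelity <- sweep(n_sb, 2, n_b[colnames(n_sb)], "/")
# IndVal: product of the two
indval <- specificity * fidelity
```

In this toy case sp1 occurs in every site of bioregion A and nowhere else, so its Specificity, Fidelity and IndVal for A all equal 1.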
Site-per-bioregion metrics characterize sites relative to bioregions:
Richness: Number of species in the site.
Rich_Endemics: Number of species in the site that are endemic to one bioregion.
Prop_Endemics: Proportion of endemic species in the site.
MeanSim: Mean similarity of a site to all sites in each bioregion.
SdSim: Standard deviation of similarity values.
These metrics can be found in the output slot site_bioregions.
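The site-level quantities can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: toy_comat and toy_bioregion are hypothetical, a species is treated as endemic to a bioregion when all of its occurrences fall in that bioregion, similarity is a hand-computed Jaccard index, and a site's similarity to itself is excluded from the mean (the package may handle this differently).

```r
# Hypothetical toy data: 5 sites x 4 species, presence/absence
toy_comat <- matrix(
  c(1, 1, 0, 0,
    1, 0, 0, 0,
    1, 1, 1, 0,
    0, 0, 1, 1,
    0, 1, 1, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("site", 1:5), paste0("sp", 1:4))
)
toy_bioregion <- c(site1 = "A", site2 = "A", site3 = "A",
                   site4 = "B", site5 = "B")

pres   <- (toy_comat > 0) * 1
groups <- toy_bioregion[rownames(pres)]
n_sb   <- t(rowsum(pres, group = groups))          # species x bioregion

# Richness: number of species per site
richness <- rowSums(pres)

# Species x bioregion indicator: 1 if the species occurs only in that bioregion
endemic_to <- (n_sb > 0 & rowSums(n_sb > 0) == 1) * 1

# Site x bioregion counts and proportions of endemic species
rich_endemics <- pres %*% endemic_to
prop_endemics <- rich_endemics / richness

# Jaccard similarity between sites: shared species / pooled species
shared <- pres %*% t(pres)
jac    <- shared / (outer(richness, richness, "+") - shared)
diag(jac) <- NA                                    # assumption: drop self-similarity

# Mean and standard deviation of similarity of each site to each bioregion
by_region <- split(seq_along(groups), groups)
mean_sim <- sapply(by_region, function(idx)
  rowMeans(jac[, idx, drop = FALSE], na.rm = TRUE))
sd_sim <- sapply(by_region, function(idx)
  apply(jac[, idx, drop = FALSE], 1, sd, na.rm = TRUE))
```

Each result is arranged as a site x bioregion matrix, mirroring the one-row-per-pair layout described for the output.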
Summary metrics across the whole bioregionalization:
These metrics summarize how an entity (species or site) is distributed across all clusters, rather than in relation to each individual cluster.
Species-level summary metric:
P (Participation): Evenness of species distribution across bioregions (Denelle et al. 2020). Found in output slot species_bioregionalization.
Site-level summary metric:
Silhouette: How well a site fits its assigned bioregion vs. the nearest alternative (Rousseeuw 1987). Found in output slot site_bioregionalization.
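As an illustration of a summary metric of this kind, the sketch below computes a participation coefficient per species in base R. It assumes the common form of the coefficient, one minus the sum over bioregions of the squared share of the species' occurrences; this formula is an assumption, not taken from this page, and should be checked against the vignette. toy_comat and toy_bioregion are hypothetical.

```r
# Hypothetical toy data: 5 sites x 4 species, presence/absence
toy_comat <- matrix(
  c(1, 1, 0, 0,
    1, 0, 0, 0,
    1, 1, 1, 0,
    0, 0, 1, 1,
    0, 1, 1, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("site", 1:5), paste0("sp", 1:4))
)
toy_bioregion <- c(site1 = "A", site2 = "A", site3 = "A",
                   site4 = "B", site5 = "B")

pres   <- (toy_comat > 0) * 1
groups <- toy_bioregion[rownames(pres)]
n_sb   <- t(rowsum(pres, group = groups))   # species x bioregion
n_s    <- colSums(pres)                     # per species

# Share of each species' occurrences falling in each bioregion
share <- sweep(n_sb, 1, n_s[rownames(n_sb)], "/")

# Assumed participation form: 0 when confined to one bioregion,
# larger when occurrences are spread more evenly
participation <- 1 - rowSums(share^2)
```

With this form, a species restricted to a single bioregion gets a value of 0, and a species split evenly between two bioregions gets 0.5.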
When species are clustered (cluster_on = "species" or cluster_on = "both"):
Site-per-chorotype metrics quantify how each site relates to species clusters (chorotypes).
The same metrics as above (Specificity, Fidelity, IndVal, etc.) can be computed, but their interpretation is inverted. These metrics are based on the following core terms:
n_gc: Number of species belonging to chorotype c that are present in site g.
n_g: Total number of species present in site g.
n_c: Total number of species belonging to chorotype c.
Abundance versions of these core terms can also be calculated when
data_type = "abundance" (or when data_type = "auto" and
bioregionalization was based on abundance).
Their interpretation changes, for example:
Specificity: Fraction of a site's species belonging to a chorotype.
Fidelity: Fraction of a chorotype's species present in the site.
IndVal: Indicator value for site-chorotype associations.
P: Evenness of sites across chorotypes.
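The inverted perspective can be sketched the same way, grouping columns (species) instead of rows (sites). The base R snippet below is a hedged illustration on hypothetical data (toy_comat, toy_chorotype) that follows the verbal definitions above; it is not the package's implementation.

```r
# Hypothetical toy data: 5 sites x 4 species, presence/absence
toy_comat <- matrix(
  c(1, 1, 0, 0,
    1, 0, 0, 0,
    1, 1, 1, 0,
    0, 0, 1, 1,
    0, 1, 1, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("site", 1:5), paste0("sp", 1:4))
)
# Hypothetical assignment of each species to a chorotype
toy_chorotype <- c(sp1 = "c1", sp2 = "c1", sp3 = "c2", sp4 = "c2")

pres   <- (toy_comat > 0) * 1
groups <- toy_chorotype[colnames(pres)]

# Core terms
n_gc <- t(rowsum(t(pres), group = groups))  # site x chorotype species counts
n_g  <- rowSums(pres)                       # species present per site
n_c  <- c(table(groups))                    # species per chorotype

# Specificity: share of a site's species that belong to each chorotype
specificity <- sweep(n_gc, 1, n_g[rownames(n_gc)], "/")
# Fidelity: share of a chorotype's species that are present in the site
fidelity <- sweep(n_gc, 2, n_c[colnames(n_gc)], "/")
# IndVal for site-chorotype associations
indval <- specificity * fidelity
```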
A list containing one or more data.frame elements, depending on the
selected metrics and clustering type:
When sites are clustered (cluster_on = "site"):
species_bioregions: Metrics for each species x bioregion combination (e.g., Specificity, IndVal). One row per species x bioregion pair.
species_bioregionalization: Summary metrics for each species across all bioregions (e.g., Participation coefficient). One row per species.
site_bioregions: Metrics for each site x bioregion combination (e.g., MeanSim, Richness). One row per site x bioregion pair.
site_bioregionalization: Summary metrics for each site (e.g., Silhouette). One row per site.
When species are clustered (cluster_on = "species"):
site_chorotypes: Metrics for each site x chorotype combination (e.g., Specificity, IndVal). One row per site x chorotype pair.
site_chorological: Summary metrics for each site across all chorotypes (e.g., Participation coefficient). One row per site.
Note that if bioregionalization contains multiple partitions
(i.e., if ncol(bioregionalization$clusters) > 2), a nested list will be
returned, with one sublist per partition.
If data_type = "auto", the choice between occurrence-based and
abundance-based metrics is determined automatically from the input data,
and a message explains the choice made.
Strict matching between entity IDs (site and species IDs) in
bioregionalization and in comat / similarity is required.
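A quick way to check this requirement before calling the function is to compare the two sets of IDs. The sketch below uses two hypothetical character vectors; in practice they would be taken from the clustering object and from the row or column names of comat, and where exactly the IDs are stored in the clustering object is an assumption to verify on your own data.

```r
# Hypothetical ID vectors; replace with the IDs from your own objects
cluster_ids <- c("site1", "site2", "site3")   # e.g. IDs stored with the clusters
comat_ids   <- c("site1", "site2", "site4")   # e.g. rownames(comat)

# IDs present on one side but not on the other
missing_in_comat    <- setdiff(cluster_ids, comat_ids)
missing_in_clusters <- setdiff(comat_ids, cluster_ids)

# TRUE only when both sets contain exactly the same IDs
ids_match <- length(missing_in_comat) == 0 && length(missing_in_clusters) == 0

if (!ids_match) {
  message("IDs only in clusters: ", paste(missing_in_comat, collapse = ", "))
  message("IDs only in comat: ", paste(missing_in_clusters, collapse = ", "))
}
```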
Maxime Lenormand (maxime.lenormand@inrae.fr)
Boris Leroy (leroy.boris@gmail.com)
Pierre Denelle (pierre.denelle@gmail.com)
De Cáceres M & Legendre P (2009) Associations between species and groups of sites: indices and statistical inference. Ecology 90, 3566–3574.
Denelle P, Violle C & Munoz F (2020) Generalist plants are more competitive and more functionally similar to each other than specialist plants: insights from network analyses. Journal of Biogeography 47, 1922–1933.
Lenormand M, Papuga G, Argagnon O, Soubeyrand M, Alleaume S & Luque S (2019) Biogeographical network analysis of plant species distribution in the Mediterranean region. Ecology and Evolution 9, 237–250.
Rousseeuw PJ (1987) Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics 20, 53–65.
For more details illustrated with a practical example, see the vignette: https://biorgeo.github.io/bioregion/articles/a5_2_summary_metrics.html.
Associated functions: bioregion_metrics, bioregionalization_metrics
library(bioregion)

# Example co-occurrence matrix and pairwise Jaccard similarity between sites
data(fishmat)
fishsim <- similarity(fishmat, metric = "Jaccard")

# Hierarchical clustering of sites, cut into 1, 2 and 3 clusters
bioregionalization <- hclu_hierarclust(similarity_to_dissimilarity(fishsim),
                                       index = "Jaccard",
                                       method = "average",
                                       randomize = TRUE,
                                       optimal_tree_method = "best",
                                       n_clust = c(1, 2, 3),
                                       verbose = FALSE)

# Compute all available metrics from the site (bioregion) perspective
ind <- site_species_metrics(bioregionalization = bioregionalization,
                            bioregion_metrics = "all",
                            bioregionalization_metrics = "all",
                            data_type = "auto",
                            cluster_on = "site",
                            comat = fishmat,
                            similarity = fishsim,
                            include_cluster = TRUE,
                            index = 3,
                            verbose = TRUE)