infoclust: Information Clusterings of Ecological Communities

infoclust R Documentation

Information Clusterings of Ecological Communities

Description

Function performs hierarchical clustering of binary ecological communities based on information analysis as defined by Williams et al. (1966) and Lance & Williams (1966).

Usage

infoclust(x, delta = TRUE)

Arguments

x

Community data: a binary (presence/absence) matrix with sampling units as rows and species as columns.

delta

Use increase in information (\Delta I) instead of the value of information (I) when merging clusters as recommended by Williams et al. (1966) and Lance & Williams (1966).

Details

Function performs information analysis of binary ecological communities (Williams et al. 1966, Lance & Williams 1966). The current implementation is based on Legendre & Legendre (2012).

The information I of a collection of N sampling units with S species is defined as

I = S N \log N - \sum_{i=1}^S [a_i \log a_i + (N-a_i) \log (N-a_i)]

where a_i is the frequency count of species i in the collection. At each step the method merges either the pair of units that gives the lowest increase in information (\Delta I, when delta = TRUE) or the pair whose merged cluster is most homogeneous (lowest I, when delta = FALSE). After a merge, the community data matrix is updated by actually pooling the merged units, and the information distance from the new cluster to all other units is re-evaluated. The information content of a single, non-merged sampling unit is I = 0, and in clusters of several sampling units the constant species (completely absent or always present) do not contribute to the information. A species contributes most to the information when its relative frequency is 0.5, so the analysis tends to build clusters in which each species is either always present or always absent. This often gives easily interpretable clusters.
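
The formula above can be checked by hand with a few lines of base R. The following is only a minimal sketch, not the code used by infoclust: the helpers xlogx, info and delta_info and the toy matrix are hypothetical names and data introduced for illustration. It assumes the usual convention 0 log 0 = 0, and that the increase in information for merging groups A and B is the information of the pooled group minus the information of A and of B.

```r
## Illustrative helpers (hypothetical, not part of the package).
## x * log(x) with the convention 0 * log(0) = 0
xlogx <- function(v) ifelse(v > 0, v * log(v), 0)

## Information content I of a group of sampling units (rows of a 0/1 matrix)
info <- function(m) {
    m <- as.matrix(m)
    N <- nrow(m)           # number of sampling units in the group
    S <- ncol(m)           # number of species
    a <- colSums(m > 0)    # frequency count a_i of each species
    S * xlogx(N) - sum(xlogx(a) + xlogx(N - a))
}

## Assumed definition of the increase in information when merging
## the groups of rows A and B: I(A + B) - I(A) - I(B)
delta_info <- function(m, A, B) {
    info(m[c(A, B), , drop = FALSE]) -
        info(m[A, , drop = FALSE]) -
        info(m[B, , drop = FALSE])
}

## Made-up data: 4 sampling units, 3 species.
## sp2 is constant; sp1 and sp3 have relative frequency 0.5.
toy <- matrix(c(1, 1, 0,
                1, 1, 0,
                0, 1, 1,
                0, 1, 1),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("u", 1:4), paste0("sp", 1:3)))

info(toy[1, , drop = FALSE])   # a single sampling unit: I = 0
info(toy[1:2, ])               # two identical units: I = 0 (up to rounding)
info(toy)                      # all units pooled: 8 * log(2)
delta_info(toy, 1:2, 3:4)      # merging the two homogeneous pairs
```

In the toy data only sp1 and sp3 contribute to the pooled information, each adding 4 log 2, while the constant species sp2 adds nothing. This matches the behaviour described above for constant species and for species with 0.5 relative frequency.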

Value

Function returns an object of class "infoclust" that inherits from "hclust". It can use all "hclust" methods, but some may fail or work in unexpected ways because the analysis is based on a binary data matrix and not on dissimilarities.

References

Williams, W.T., Lambert, J.M. & Lance, G.N. (1966). Multivariate methods in plant ecology. V. Similarity analyses and information-analysis. J. Ecol. 54, 427–445.

Lance, G.N. & Williams, W.T. (1966). Computer programs for hierarchical polythetic classification (“similarity analyses”). Comp. J. 9, 60–64.

Legendre, P. & Legendre, L. (2012). Numerical Ecology. 3rd English Ed., Elsevier.

Examples

## example used to demonstrate the calculation of
## information analysis by Legendre & Legendre (2012, p. 372).
data(pond)
cl <- infoclust(pond)
plot(cl, hang = -1)
## Lance & Williams suggest a limit below which clustering is
## insignificant and should not be interpreted
abline(h=qchisq(0.95, ncol(pond)), col=2)
## Spurn Point Scrub data
data(spurn)
cl <- infoclust(spurn)
plot(cl, hang = -1)
if (require(vegan)) {
    tabasco(spurn, cl)
    ## apply information clustering on species
    tabasco(spurn, cl, infoclust(t(spurn)))
}


jarioksa/natto documentation built on March 28, 2024, 12:45 a.m.