dissimilarity_to_distance: Transform (non-metric) dissimilarity matrix to a weighted...

dissimilarity_to_distance    R Documentation

Transform (non-metric) dissimilarity matrix to a weighted Euclidean distance (metric).

Description

This function constructs a weighted Euclidean distance that optimally approximates the dissimilarities. Advantages of the Euclidean approach include the neat decomposition of variance and the ordination’s optimal biplot display (Greenacre, 2017).
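For orientation, the following is a minimal base-R sketch of what a weighted Euclidean distance looks like under its standard definition, d_ij = sqrt(sum_k w_k (x_ik - x_jk)^2). The toy abundance values and the weights 'w' are made up for illustration. The sketch does not reproduce how this function estimates the weights or pre-processes the data.

```r
# Toy abundance table: 3 samples (rows) x 4 species (columns); values are made up
x <- matrix(c(5, 0, 2, 1,
              3, 1, 0, 4,
              0, 6, 2, 2),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("s", 1:3), paste0("sp", 1:4)))

# Hypothetical non-negative species weights (NOT estimated from the data here)
w <- c(sp1 = 0.5, sp2 = 1.0, sp3 = 2.0, sp4 = 0.25)

# Scaling each species column by sqrt(w_k) turns the weighted distance
# into an ordinary Euclidean distance on the rescaled table
x_scaled <- sweep(x, 2, sqrt(w), "*")
we_dist <- dist(x_scaled, method = "euclidean")

# The same distance for samples s1 and s2, computed directly from the definition
d12 <- sqrt(sum(w * (x["s1", ] - x["s2", ])^2))

print(we_dist)
print(d12)
```

Because the result is an ordinary Euclidean distance on rescaled columns, it can be passed to standard ordination methods that expect Euclidean input.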

Usage

dissimilarity_to_distance(
  datt,
  dist_type = "bray",
  dst = NULL,
  drop_species = F,
  importance_percentile = 0.02,
  show_plot = T,
  ndim = NULL,
  ...
)

Arguments

datt

Data frame with species abundance (species = columns, samples = rows)

dist_type

Dissimilarity index (for the supported methods see vegdist) or "other"

dst

Distance matrix (of class 'dist'); used when dist_type == "other"

drop_species

Logical; if TRUE, "unimportant" species that do not contribute to sample differentiation are removed

importance_percentile

Percentile value for importance below which species are considered unimportant

show_plot

Logical; if TRUE, a plot of the original dissimilarities vs. the obtained weighted Euclidean distances is shown

ndim

Number of dimensions; if NULL (default), the number of dimensions is determined automatically

...

Additional arguments passed to vegdist

Details

The code of the function is based on the work of Prof. Michael Greenacre (2017).

It is possible to eliminate species that have little or no value in measuring differences between the samples (with 'drop_species = TRUE'). To do this, specify a threshold value for species importance ('importance_percentile'). By default, 'importance_percentile = 0.02', which means that, when species removal is enabled, all species with importance below the 2nd percentile of the species importance distribution are removed.
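The percentile rule can be illustrated with a small base-R sketch. The importance scores below are simulated, and both the use of 'quantile' and the strict "below" comparison are assumptions made for illustration. The way this function actually computes species importance is not shown here.

```r
# Hypothetical importance scores for 50 species (simulated, illustration only)
set.seed(1)
importance <- setNames(rexp(50), paste0("sp", 1:50))

# Same default value as the 'importance_percentile' argument
importance_percentile <- 0.02

# Importance value at the 2nd percentile of the distribution
threshold <- quantile(importance, probs = importance_percentile)

# Species below the threshold are flagged as "unimportant" (assumed rule)
unimportant <- names(importance)[importance < threshold]
kept <- setdiff(names(importance), unimportant)

print(threshold)
print(unimportant)
```

Raising 'importance_percentile' moves the threshold up and flags more species, so it is worth checking how many species remain before interpreting the result.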

A pre-calculated distance matrix can be passed to this function via the 'dst' argument (with 'dist_type' set to "other"). However, species removal ('drop_species') is not possible in this case.
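A possible call with a pre-calculated matrix is sketched below. It assumes that the metagMisc, vegan and smacof packages are installed and that the abundance table still has to be supplied as 'datt' alongside 'dst'. The Jaccard index is an arbitrary choice used only to produce a 'dist' object.

```r
res <- NULL  # stays NULL if the required packages are not installed

if (requireNamespace("metagMisc", quietly = TRUE) &&
    requireNamespace("vegan", quietly = TRUE) &&
    requireNamespace("smacof", quietly = TRUE)) {

  library(vegan); library(smacof)
  data(varespec)

  # Any object of class 'dist' can serve as input; Jaccard is just an example
  jac <- vegdist(varespec, method = "jaccard")

  res <- metagMisc::dissimilarity_to_distance(
    varespec,
    dist_type = "other",  # use the matrix supplied in 'dst'
    dst = jac,
    show_plot = FALSE     # skip the diagnostic plot
  )

  # Components listed in the Value section
  print(class(res$WEdist))
  print(head(res$sp_weights))
}
```

Here 'drop_species' is left at its default, since species removal is not available for pre-calculated matrices.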

Value

Function 'dissimilarity_to_distance' returns a list with the following components:

  • WEdist. Weighted Euclidean distance (class 'dist');

  • sp_weights. Data frame with species weights;

  • stress. Stress-1 measure, which corresponds to the loss of variance due to the distance transformation (see stress0);

  • dissim_plot. (if 'show_plot = TRUE') ggplot object with the corresponding plot.

References

Greenacre, M. (2017), Ordination with any dissimilarity measure: a weighted Euclidean solution. Ecology, 98: 2293–2300. doi:10.1002/ecy.1937

See Also

smacofConstraint, vegdist, dissimilarity_to_distance_importance_plot

Examples

library(vegan); library(smacof)
data(varespec)

## Estimate weighted Euclidean distance that optimally approximates Bray-Curtis dissimilarity
bc_to_eucl <- dissimilarity_to_distance(varespec, dist_type = "bray")
# ade4::is.euclid(bc_to_eucl$WEdist)


vmikk/metagMisc documentation built on Feb. 14, 2024, 2:29 a.m.