dissvar: Dissimilarity based discrepancy

dissvar R Documentation

Description

Compute the discrepancy from the pairwise dissimilarities between objects. The discrepancy is a measure of dispersion of the set of objects.

Usage

dissvar(diss, weights=NULL, squared = FALSE)

Arguments

diss

A dissimilarity matrix or a dist object (see dist)

weights

Optional numerical vector of weights, one per object.

squared

Logical. If TRUE, the dissimilarities in diss are squared before the discrepancy is computed.
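This page does not state how weights enter the computation. The sketch below assumes one natural weighted form, \frac{1}{2W^2}\sum_{i}\sum_{j}w_i w_j d_{ij} with W=\sum_i w_i, and checks that integer weights then behave like replicated objects. The helper weighted_discrepancy_sketch is hypothetical and is not part of TraMineR; it illustrates the assumption and is not a description of the package internals.

```r
# Hypothetical helper (not a TraMineR function): weighted discrepancy under
# the ASSUMED form sum_i sum_j w_i * w_j * d_ij / (2 * W^2), with W = sum(w).
weighted_discrepancy_sketch <- function(diss, weights) {
  d <- as.matrix(diss)                 # full symmetric n x n matrix
  W <- sum(weights)
  sum(outer(weights, weights) * d) / (2 * W^2)
}

y <- c(1, 3, 8)                        # three toy objects
w <- c(2, 1, 3)                        # integer weights

s2_w <- weighted_discrepancy_sketch(dist(y), w)

# Unweighted discrepancy of the data with each object repeated w times
y_rep  <- rep(y, times = w)
d_rep  <- as.matrix(dist(y_rep))
s2_rep <- sum(d_rep) / (2 * length(y_rep)^2)

all.equal(s2_w, s2_rep)
```

Under this assumption, giving an object a weight of 2 has the same effect as listing it twice in the dissimilarity matrix.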

Details

The discrepancy extends the concept of variance to any kind of object for which pairwise dissimilarities can be computed. The discrepancy s^2 is defined as:

s^2=\frac{1}{2n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}d_{ij}

Mathematical ground: In the Euclidean case, the sum of squares can be expressed as:

SS=\sum_{i=1}^{n}(y_i-\bar{y})^2=\frac{1}{2n}\sum_{i=1}^{n}\sum_{j=1}^{n}(y_i-y_j)^2
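The second equality can be checked by expanding the squared difference and using \sum_{i=1}^{n}y_i=n\bar{y}:

\sum_{i=1}^{n}\sum_{j=1}^{n}(y_i-y_j)^2=\sum_{i=1}^{n}\sum_{j=1}^{n}(y_i^2-2y_iy_j+y_j^2)=2n\sum_{i=1}^{n}y_i^2-2\left(\sum_{i=1}^{n}y_i\right)^2

2n\sum_{i=1}^{n}y_i^2-2n^2\bar{y}^2=2n\left(\sum_{i=1}^{n}y_i^2-n\bar{y}^2\right)=2n\sum_{i=1}^{n}(y_i-\bar{y})^2

Dividing both sides by 2n gives the expression for SS.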

The discrepancy generalizes this equation by replacing the (y_i - y_j)^2 term with any dissimilarity measure d_{ij}. Dividing the resulting sum of squares by n gives s^2, just as dividing SS by n gives the variance.
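As a minimal sketch of the definition, the base R code below computes s^2 directly from a dist object and checks that, with squared Euclidean distances, it matches the variance computed with divisor n. The helper discrepancy_sketch is a hypothetical name used for illustration only; it is not the dissvar implementation and ignores weights.

```r
# Hypothetical helper (not a TraMineR function): discrepancy from a dist
# object or square dissimilarity matrix, s^2 = sum(d_ij) / (2 * n^2).
discrepancy_sketch <- function(diss, squared = FALSE) {
  d <- as.matrix(diss)                 # full symmetric n x n matrix
  if (squared) d <- d^2
  n <- nrow(d)
  sum(d) / (2 * n^2)
}

y <- c(2, 4, 4, 5, 7, 9)
d <- dist(y)                           # Euclidean distances |y_i - y_j|

# Squaring the distances recovers the (y_i - y_j)^2 terms of the Euclidean case
s2 <- discrepancy_sketch(d, squared = TRUE)

# Variance with divisor n (note that var() in R uses n - 1 instead)
pop_var <- mean((y - mean(y))^2)

all.equal(s2, pop_var)
```

The comparison uses the divisor n because the definition of s^2 divides by n^2; comparing against var(y) would differ by the factor (n - 1)/n.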

Value

The discrepancy.

Author(s)

Matthias Studer (with Gilbert Ritschard for the help page)

References

Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2011). Discrepancy analysis of state sequences, Sociological Methods and Research, Vol. 40(3), 471-510, doi:10.1177/0049124111415372.

Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2010) Discrepancy analysis of complex objects using dissimilarities. In F. Guillet, G. Ritschard, D. A. Zighed and H. Briand (Eds.), Advances in Knowledge Discovery and Management, Studies in Computational Intelligence, Volume 292, pp. 3-19. Berlin: Springer.

Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2009) Analyse de dissimilarités par arbre d'induction. In EGC 2009, Revue des Nouvelles Technologies de l'Information, Vol. E-15, pp. 7-18.

Anderson, M. J. (2001) A new method for non-parametric multivariate analysis of variance. Austral Ecology 26, 32-46.

Batagelj, V. (1988) Generalized Ward and related clustering problems. In H. Bock (Ed.), Classification and related methods of data analysis, Amsterdam: North-Holland, pp. 67-74.

See Also

dissassoc to test association between objects represented by their dissimilarities and a covariate.
disstree for an induction tree analysis of objects characterized by a dissimilarity matrix.
disscenter to compute the distance of each object to its group center from pairwise dissimilarities.
dissmfacw to perform multi-factor analysis of variance from pairwise dissimilarities.

Examples

## Load TraMineR, which provides mvad, seqdef, seqdist and dissvar
library(TraMineR)

## Defining a state sequence object
data(mvad)
mvad.seq <- seqdef(mvad[, 17:86])

## Building dissimilarities (any dissimilarity measure can be used)
mvad.ham <- seqdist(mvad.seq, method="HAM")

## Discrepancy (pseudo-variance) of the sequences
print(dissvar(mvad.ham))

TraMineR documentation built on Sept. 19, 2023, 1:07 a.m.