CluMix-package: Clustering and Visualization of Mixed-Type Data

Description

Provides utilities for clustering subjects and variables of mixed data types (Hummel, Edelmann, and Kopp-Schneider (2017) <doi: 10.1371/journal.pone.0188274>). Similarities between subjects are measured by Gower's general similarity coefficient, with Podani's extension for ordinal variables. Similarities between variables can be assessed i) by a combination of appropriate association measures for different pairs of data types or ii) based on distance correlation. Alternatively, variables can be clustered by the 'ClustOfVar' approach. The main feature of the package is the generation of a mixed-data heatmap. To visualize similarities between either subjects or variables, a heatmap of the corresponding distance matrix can be drawn. Associations between variables can be explored with a 'confounder plot', which allows visual detection of possible confounding, collinear, or surrogate factors for some variables of primary interest. Distance matrices and dendrograms for subjects and variables can be derived and used for further visualizations and applications. This work was supported by BMBF grant 01ZX1609B, Germany.
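
To make the subject similarity concrete, the following is a minimal, language-neutral sketch of Gower's general similarity coefficient for records that mix numeric and categorical variables. It is written in plain Python for illustration only and is not the package's R implementation. The names gower_similarity and numeric_ranges, the "num"/"cat" labels, and the toy records are all hypothetical, and Podani's extension for ordinal variables is deliberately left out.

```python
def numeric_ranges(rows, kinds):
    """Observed range (max - min) of each numeric variable; 0.0 for the others."""
    ranges = []
    for k, kind in enumerate(kinds):
        if kind == "num":
            values = [row[k] for row in rows if row[k] is not None]
            ranges.append(max(values) - min(values))
        else:
            ranges.append(0.0)
    return ranges


def gower_similarity(a, b, kinds, ranges):
    """Gower similarity of records a and b: mean of per-variable scores in [0, 1].

    Numeric variables score 1 - |x - y| / range, categorical variables score
    1 on a match and 0 otherwise. Variables with a missing value (None) in
    either record are skipped, as in Gower's original definition.
    """
    total, count = 0.0, 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue
        if kind == "num":
            score = 1.0 - abs(x - y) / rng if rng > 0 else 1.0
        else:
            score = 1.0 if x == y else 0.0
        total += score
        count += 1
    if count == 0:
        raise ValueError("records share no observed variables")
    return total / count


# Hypothetical toy data: (age, sex, smoker)
subjects = [(30, "f", "yes"), (50, "m", "yes"), (40, "f", "no")]
kinds = ("num", "cat", "cat")
ranges = numeric_ranges(subjects, kinds)

s01 = gower_similarity(subjects[0], subjects[1], kinds, ranges)  # (0 + 0 + 1) / 3
s02 = gower_similarity(subjects[0], subjects[2], kinds, ranges)  # (0.5 + 1 + 0) / 3
```

A similarity s can be turned into a dissimilarity for hierarchical clustering, for example as 1 - s; which transformation the package applies internally is not stated here.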

Details

The main function of the package, mix.heatmap, generates a mixed-data heatmap. To visualize similarities between either subjects or variables, distmap draws a heatmap of the corresponding distance matrix. Associations between variables can be explored with confounderPlot, which allows visual detection of possible confounding, collinear, or surrogate factors for some variables of primary interest. Distance matrices and dendrograms for subjects and variables can be derived with the functions dist.subjects, dist.variables, dendro.subjects, and dendro.variables. Subjects are clustered based on Gower's general similarity coefficient. Variables can be clustered by i) a combination of association measures, ii) distance correlation, or iii) the ClustOfVar approach.
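
As an illustration of option ii), the sketch below computes the sample distance correlation of Szekely, Rizzo, and Bakirov (2007) for two numeric vectors, using double-centered pairwise distance matrices. It is plain Python for exposition only, not the package's R code; the function names and the toy vectors are hypothetical, and how mixed-type variables are encoded before this step is not covered.

```python
import math


def _centered_distances(v):
    """Pairwise |v_i - v_j| matrix, double-centered by row, column, and grand means."""
    n = len(v)
    d = [[abs(v[i] - v[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d]
    grand_mean = sum(row_mean) / n
    # d is symmetric, so column means equal row means.
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation in [0, 1]; returns 0.0 if either vector is constant."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    dcov2 = sum(a[i][j] * b[i][j] for i, j in pairs) / n ** 2
    dvar_x = sum(a[i][j] ** 2 for i, j in pairs) / n ** 2
    dvar_y = sum(b[i][j] ** 2 for i, j in pairs) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / denom)


x = [1, 2, 3, 4, 5]
linear = [2 * v + 1 for v in x]        # perfectly linear in x
parabola = [(v - 3) ** 2 for v in x]   # Pearson correlation with x is 0

dcor_linear = distance_correlation(x, linear)
dcor_parabola = distance_correlation(x, parabola)
```

For the linear vector the distance correlation is 1 up to rounding. For the parabola it stays clearly above zero even though the Pearson correlation vanishes, which is why distance correlation can pick up nonlinear associations between variables.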

Author(s)

M. Hummel, D. Edelmann, A. Kopp-Schneider

Maintainer: Manuela Hummel <manuela.hummel@web.de>

References

Hummel M, Edelmann D, Kopp-Schneider A (2017). Clustering of samples and variables with mixed-type data. PLOS ONE, 12(11):e0188274.

Gower J (1971). A general coefficient of similarity and some of its properties. Biometrics, 27:857-871.

Chavent M, Kuentz-Simonet V, Liquet B, Saracco J (2012). ClustOfVar: An R Package for the Clustering of Variables. Journal of Statistical Software, 50:1-16.

Szekely GJ, Rizzo ML, Bakirov NK (2007). Measuring and testing dependence by correlation of distances. The Annals of Statistics, 35(6):2769-2794.

Lyons R (2013). Distance covariance in metric spaces. The Annals of Probability, 41(5):3284-3305.

See Also

mix.heatmap

Examples

CluMix documentation built on Jan. 21, 2019, 5:05 p.m.