Using ClusVis with RMixtComp Output for Visualization
In RMixtComp: Mixture Models with Heterogeneous and (Partially) Missing Data

knitr::opts_chunk$set(echo = TRUE, fig.align = "center", fig.width = 7, fig.height = 5)

Sys.setenv(MC_DETERMINISTIC = 2)

ClusVis

ClusVis is an R package that performs a Gaussian-based visualization of Gaussian and non-Gaussian model-based clustering. The visualization is based on the probabilities of classification and represents the clusters as bivariate spherical Gaussians. See this preprint for more details about the method.

ClusVis and RMixtComp

First, we load the required packages.

library(RMixtComp)
library(ClusVis)

To illustrate the use of ClusVis with RMixtComp output, we use the iris dataset and the congress dataset.

Example 1: iris dataset

The iris dataset gives the measurements in centimeters of the variables sepal length and width and petal length and width, respectively, for 50 flowers from each of 3 species of iris.

data("iris")
head(iris)

First, we learn a mixture model with 3 classes for the 4 measurement variables.

res <- mixtCompLearn(iris[, -5], nClass = 3, criterion = "BIC", nRun = 3, nCore = 1, verbose = FALSE)

Then, we apply the clusvis function. This function requires 2 parameters: the logarithm of the probabilities of classification of each individual and the proportions of the mixture.

logTik <- getTik(res, log = TRUE)
prop <- getProportion(res)
resVisu <- clusvis(logTik, prop)
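
To make these two inputs concrete, the following base R sketch builds them by hand for a toy two-component univariate Gaussian mixture. All parameter values and the names `toyProp`, `toyMean`, `toySd` and `toyLogTik` are illustrative assumptions, not RMixtComp output. The sketch only shows the expected shape: a matrix of log classification probabilities with one row per individual and one column per class, and a vector of proportions with one entry per class.

```r
# Toy two-component univariate Gaussian mixture (all values are hypothetical)
set.seed(1)
toyProp <- c(0.4, 0.6)            # mixture proportions, sum to 1
toyMean <- c(0, 3)
toySd   <- c(1, 1)
x <- c(rnorm(40, toyMean[1], toySd[1]), rnorm(60, toyMean[2], toySd[2]))

# log(pi_k) + log f_k(x_i) for each observation i and class k
logJoint <- sapply(seq_along(toyProp), function(k) {
  log(toyProp[k]) + dnorm(x, toyMean[k], toySd[k], log = TRUE)
})

# Normalize each row on the log scale (log-sum-exp) to get log t_ik
rowMax <- apply(logJoint, 1, max)
logNorm <- rowMax + log(rowSums(exp(logJoint - rowMax)))
toyLogTik <- logJoint - logNorm

dim(toyLogTik)                    # n rows, K columns
range(rowSums(exp(toyLogTik)))    # each row sums to 1 up to rounding
```

Working on the log scale avoids underflow when some classification probabilities are very small.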

The results can be displayed using the plotDensityClusVisu function. The first graph is generated with the parameter add.obs = TRUE. On the most discriminative map, it overlays the iso-probability curves of classification and the cloud of observations.

plotDensityClusVisu(resVisu, add.obs = TRUE)

With add.obs = FALSE, the goal of the plot is to represent the overlap between the clusters. Each cluster is represented by its center and a 95% confidence level border. The difference between entropies displayed in the title measures the accuracy of the representation. A difference close to 0 means that the representation is accurate.

plotDensityClusVisu(resVisu, add.obs = FALSE)
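
As a rough illustration of the quantity involved, the entropy of a matrix of classification probabilities can be computed in base R as sketched below. The helper `classifEntropy` and the two toy matrices are hypothetical, and the exact normalization ClusVis applies to the value shown in the plot title may differ from this sketch.

```r
# Hypothetical helper: entropy of classification probabilities,
# -sum_i sum_k t_ik * log(t_ik), with the convention 0 * log(0) = 0
classifEntropy <- function(tik) {
  terms <- ifelse(tik > 0, tik * log(tik), 0)
  -sum(terms)
}

# Well-separated clusters: every observation is assigned with certainty
sharp <- rbind(c(1, 0), c(0, 1), c(1, 0))
# Overlapping clusters: assignments are ambiguous
fuzzy <- rbind(c(0.5, 0.5), c(0.6, 0.4), c(0.5, 0.5))

classifEntropy(sharp)   # zero: no classification uncertainty
classifEntropy(fuzzy)   # strictly positive: the clusters overlap
```

The larger this entropy, the more the clusters overlap; comparing the entropy of the fitted model with that of its two-dimensional representation indicates how much information the map loses.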

Here, we note that two clusters are close to each other, so they contain flowers with similar measurements, whereas the third cluster contains flowers whose measurements are very different from those of the other two.

Example 2: congress dataset

This data set includes votes for each of the U.S. House of Representatives Congressmen on the 16 key votes identified by the CQA in 1984.

data("congress")
head(congress)

First, we change the format of the data. The vote "n" is refactored as 1 and "y" as 2. "democrat" is refactored as 1 and "republican" as 2.

## MixtComp Format
congress$V1 = refactorCategorical(congress$V1, c("democrat", "republican", "?"), c(1, 2, "?"))
for(i in 2:ncol(congress))
  congress[, i] = refactorCategorical(congress[, i], c("n", "y", "?"), c(1, 2, "?"))

head(congress)
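
For readers unfamiliar with refactorCategorical, the recoding performed here is, in spirit, a label lookup. The base R sketch below mimics that behavior on a hypothetical toy vector. The helper `recode` and the vector `toyVotes` are illustrative only; `recode` is not part of RMixtComp and may differ from the package function in edge cases.

```r
# Hypothetical base R equivalent of a label recoding
recode <- function(x, oldLabels, newLabels) {
  idx <- match(as.character(x), oldLabels)
  out <- as.character(newLabels)[idx]
  # In this sketch, values not listed in oldLabels are left unchanged
  out[is.na(idx)] <- as.character(x)[is.na(idx)]
  out
}

toyVotes <- factor(c("y", "n", "?", "y"))
recode(toyVotes, c("n", "y", "?"), c(1, 2, "?"))
```

Because the new labels mix numbers and "?", R stores them as character strings, so the recoded column holds "1", "2" and "?".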

We run MixtComp with a Multinomial model for each variable.

model <- rep("Multinomial", ncol(congress))
names(model) = colnames(congress)

res <- mixtCompLearn(congress, model = model, nClass = 4, criterion = "BIC", nRun = 3, nCore = 1)

As before, we extract the required parameters.

logTik <- getTik(res, log = TRUE)
prop <- getProportion(res)
head(logTik)

It is important to notice that logTik contains many -Inf values, because some probabilities of belonging to a cluster are exactly 0. Too many infinite values cause problems for the clusvis function. One way to avoid this is to replace the infinite values with the logarithm of a small epsilon.

logTik[is.infinite(logTik)] = log(1e-20)
head(logTik)
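
The effect of this replacement can be checked on a small hand-made matrix. The sketch below uses hypothetical probabilities (`toyTik`) rather than MixtComp output: it counts the infinite entries, applies the same floor, and confirms that the rows still sum to 1 up to a negligible error.

```r
# Hypothetical classification probabilities with exact zeros
toyTik <- rbind(c(1, 0, 0),
                c(0.7, 0.3, 0),
                c(0, 0, 1))
logTikToy <- log(toyTik)

sum(is.infinite(logTikToy))       # number of -Inf entries before the fix

# Same floor as in the text: replace log(0) = -Inf by log(1e-20)
logTikToy[is.infinite(logTikToy)] <- log(1e-20)

all(is.finite(logTikToy))                 # no infinite values remain
max(abs(rowSums(exp(logTikToy)) - 1))     # deviation of row sums from 1
```

An epsilon of 1e-20 is far below the precision of the other probabilities, so the rows remain numerically normalized.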

Now, the clusvis function can be run.

resVisu <- clusvis(logTik, prop)

The two associated plots can then be generated.

plotDensityClusVisu(resVisu, add.obs = TRUE)

plotDensityClusVisu(resVisu, add.obs = FALSE)

Sys.unsetenv("MC_DETERMINISTIC")

RMixtComp documentation built on July 9, 2023, 6:06 p.m.
