plot.otrimle: Plot Methods for OTRIMLE Objects
In otrimle: Robust Model-Based Clustering


Plot robust model-based clustering results: scatter plot with clustering information, optimization profiling, and cluster fit.


## S3 method for class 'otrimle'
plot(x, what=c("criterion","iloglik", "fit", "clustering"),
     data=NULL, margins=NULL, cluster=NULL, ...)

`x`	Output from `otrimle`
`what`	The type of graph. It can be one of the following: `"criterion"` (default), `"iloglik"`, `"fit"`, `"clustering"`. See Value.
`data`	The data vector, matrix or data.frame (or some transformation of them), used for obtaining the `'otrimle'` object. This is only relevant if `what="clustering"`.
`margins`	A vector of integers denoting the variables (numbers of columns of `data`) to be used for a `pairs`-plot if `what="clustering"`. When `margins=NULL` it is set to `1:ncol(data)` (default).
`cluster`	An integer denoting the cluster for which the fit plot is returned. This is only relevant if `what="fit"`.
`...`	further arguments passed to or from other methods.
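The way `margins` and the noise symbol interact can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the data matrix `dat` and the label vector `cl` are simulated stand-ins, and the convention that label `0` marks the noise/outliers component is an assumption made for this sketch.

```r
## Illustrative sketch only: simulated stand-ins, not otrimle internals.
set.seed(2)
dat <- matrix(rnorm(60 * 4), ncol = 4,
              dimnames = list(NULL, paste0("V", 1:4)))
cl  <- sample(0:2, 60, replace = TRUE)  # hypothetical labels; 0 = noise (assumed)

## When margins is NULL, fall back to all columns, i.e. 1:ncol(data)
margins <- NULL
if (is.null(margins)) margins <- seq_len(ncol(dat))

## Keep only the selected variables for the pairwise scatterplot
sel <- dat[, margins, drop = FALSE]

## Noise points are drawn with '+' (pch = 3), cluster members with circles
pch <- ifelse(cl == 0, 3, 1)
col <- cl + 1

pairs(sel, pch = pch, col = col)
```

Passing, say, `margins <- 2:3` instead of `NULL` would restrict the display to the second and third columns.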

If what="criterion": A plot profiling the optimization of the OTRIMLE criterion. The criterion value at log(icd)=-Inf is always shown.
If what="iloglik": A plot with the profiling of the improper log-likelihood function along the search path for the OTRIMLE optimization.
If what="fit": The P-P plot (probability-probability plot) of the weighted empirical distribution function of the Mahalanobis distances of observations from the cluster centers against the target distribution. The target distribution is the Chi-square distribution with degrees of freedom equal to ncol(data). The weights are given by the improper posterior probabilities. If cluster=NULL, P-P plots are produced for all clusters; otherwise cluster selects the single cluster whose P-P plot is drawn.
If what="clustering": A pairwise scatterplot with cluster memberships. Points assigned to the noise/outliers component are denoted by '+'.
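The construction behind the "fit" plot can be approximated in base R as follows. This is a rough sketch, not the package's implementation: the data, the cluster center `mu`, the covariance `sigma`, and the weights `w` are simulated placeholders, and it assumes the squared Mahalanobis distances returned by `stats::mahalanobis` are the quantities compared with the Chi-square distribution.

```r
## Illustrative sketch only: all inputs below are simulated placeholders.
set.seed(1)
n <- 200
p <- 3
dat   <- matrix(rnorm(n * p), n, p)
mu    <- colMeans(dat)   # stand-in for one cluster's center
sigma <- cov(dat)        # stand-in for that cluster's covariance matrix
w     <- runif(n)        # stand-in for improper posterior probabilities

## Squared Mahalanobis distances of observations from the center
d2 <- mahalanobis(dat, center = mu, cov = sigma)

## Weighted empirical distribution function at the sorted distances
ord   <- order(d2)
wecdf <- cumsum(w[ord]) / sum(w)

## Target probabilities: Chi-square with ncol(data) degrees of freedom
target <- pchisq(d2[ord], df = p)

## P-P plot: points near the diagonal suggest a good fit
plot(target, wecdf, xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Chi-square probabilities",
     ylab = "Weighted empirical probabilities")
abline(0, 1, lty = 2)
```

With an actual `otrimle` fit, the placeholders would be replaced by the estimated parameters and posterior weights of the cluster of interest.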

Coretto, P. and C. Hennig (2016). Robust improper maximum likelihood: tuning, computation, and a comparison with other methods for robust Gaussian clustering. Journal of the American Statistical Association, Vol. 111(516), pp. 1648-1659. doi: 10.1080/01621459.2015.1100996

P. Coretto and C. Hennig (2017). Consistency, breakdown robustness, and algorithms for robust improper maximum likelihood clustering. Journal of Machine Learning Research, Vol. 18(142), pp. 1-39. https://jmlr.org/papers/v18/16-382.html

Pietro Coretto pcoretto@unisa.it https://pietro-coretto.github.io

plot.otrimle

## Load Swiss banknotes data
data(banknote)
x <- banknote[,-1]

## Perform otrimle clustering on a small grid of logicd values
a <- otrimle(data = x, G = 2, logicd = c(-Inf, -50, -10), ncores = 1)
print(a)

## Plot clustering
plot(a, data = x, what = "clustering")

## Plot clustering on selected margins
plot(a, data = x, what = "clustering", margins = 4:6)

## Plot clustering on the first two principal components
z <- scale(x) %*%   eigen(cor(x), symmetric = TRUE)$vectors
colnames(z) <- paste("PC", 1:ncol(z), sep = "")
plot(a, data = z, what = "clustering", margins = 1:2)

## Plot OTRIMLE criterion profiling
plot(a, what = "criterion")

## Plot Improper log-likelihood profiling
plot(a, what = "iloglik")

## Fit plot for all clusters
plot(a, what = "fit")

## Fit plot for cluster 1
plot(a, what = "fit", cluster = 1)



## Not run: 
## Perform the same example using the finer default grid of logicd
## values and multiple cores
##
a <- otrimle(data = x, G = 2)

## Inspect the otrimle criterion-vs-logicd
plot(a, what = 'criterion')

## The minimum occurs at a$logicd=-9, and exploring a$optimization it
## can be seen that the interval [-12.5, -4] brackets the optimal
## solution. We search with a finer grid located around the minimum
##
b <- otrimle(data = x, G = 2, logicd = seq(-12.5, -4, length.out = 25))

## Inspect the otrimle criterion-vs-logicd
plot(b, what = 'criterion')

## Check the difference between the two clusterings
table(A = a$cluster, B = b$cluster)

## Check differences in estimated parameters
##
colSums(abs(a$mean - b$mean))               ## L1 distance for mean vectors
apply({a$cov-b$cov}, 3, norm, type = "F")   ## Frobenius distance for covariances
c(Noise=abs(a$npr-b$npr), abs(a$cpr-b$cpr)) ## Absolute difference for proportions

## End(Not run)