overlap: Overlap
In MixSim: Simulating Data to Study Performance of Clustering Algorithms

Description

Computes misclassification probabilities and pairwise overlaps for finite mixture models with Gaussian components. The overlap between two components is defined as the sum of their two misclassification probabilities.
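For intuition, the definition can be checked by hand in the simplest case. The sketch below is an illustration only and does not call `overlap()`: it assumes two univariate Gaussian components with equal mixing proportions and a common standard deviation, so that the classification boundary is the midpoint between the two means. The names `mu1`, `mu2`, `sigma`, and `miss_*` are hypothetical.

```r
# Two univariate Gaussian components with equal weights and a common
# standard deviation (illustrative values, not taken from MixSim).
mu1 <- 0
mu2 <- 2
sigma <- 1

# Under these assumptions the decision boundary is the midpoint of the means.
boundary <- (mu1 + mu2) / 2

# Probability that a draw from component 1 lands on component 2's side,
# and the probability that a draw from component 2 lands on component 1's side.
miss_1_to_2 <- pnorm(boundary, mean = mu1, sd = sigma, lower.tail = FALSE)
miss_2_to_1 <- pnorm(boundary, mean = mu2, sd = sigma)

# Pairwise overlap: the sum of the two misclassification probabilities.
omega <- miss_1_to_2 + miss_2_to_1
omega  # about 0.317, i.e. 2 * pnorm(-1)
```

With unequal weights or covariances the boundary is no longer the midpoint, which is the general case the function is meant to handle.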

Usage

overlap(Pi, Mu, S, eps = 1e-06, lim = 1e06)

Arguments

`Pi`	vector of mixing proportions (length K).
`Mu`	matrix consisting of components' mean vectors (K * p).
`S`	set of components' covariance matrices (p * p * K).
`eps`	error bound for overlap computation.
`lim`	maximum number of integration terms (Davies, 1980).
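The expected shapes of `Pi`, `Mu`, and `S` can be illustrated with a small hand-built input. This is a minimal base-R sketch with made-up values (K = 3 components in p = 2 dimensions); the checks at the end are a suggested sanity test before calling the function, not something the package is documented to require.

```r
K <- 3  # number of components (illustrative)
p <- 2  # number of dimensions (illustrative)

# Mixing proportions: length K, summing to 1.
Pi <- rep(1 / K, K)

# Mean vectors: one row per component, giving a K x p matrix.
Mu <- matrix(c(0, 0,
               3, 0,
               0, 3), nrow = K, ncol = p, byrow = TRUE)

# Covariance matrices: a p x p x K array, here an identity matrix per component.
S <- array(diag(p), dim = c(p, p, K))

# Sanity checks on the shapes described in the argument list.
stopifnot(length(Pi) == K, isTRUE(all.equal(sum(Pi), 1)))
stopifnot(all(dim(Mu) == c(K, p)))
stopifnot(all(dim(S) == c(p, p, K)))

# Each covariance slice should be symmetric with positive eigenvalues.
for (k in seq_len(K)) {
  stopifnot(isSymmetric(S[, , k]))
  stopifnot(all(eigen(S[, , k], symmetric = TRUE, only.values = TRUE)$values > 0))
}
```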

Value

`OmegaMap`	matrix of misclassification probabilities (K * K); OmegaMap[i,j] is the probability that X coming from the i-th component is classified to the j-th component.
`BarOmega`	value of average overlap.
`MaxOmega`	value of maximum overlap.
`rcMax`	row and column numbers for the pair of components producing maximum overlap 'MaxOmega'.
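The relationship between these components can be sketched in base R. The example below assumes that the average and maximum are taken over the K(K-1)/2 pairwise overlaps, each being `OmegaMap[i, j] + OmegaMap[j, i]`; that reading follows from the descriptions above but is not confirmed here against the package source. The matrix values are invented, and the diagonal is left as `NA` because this sketch does not use it.

```r
# Hypothetical 3 x 3 matrix of misclassification probabilities (made-up values).
OmegaMap <- matrix(c(  NA, 0.07, 0.03,
                     0.05,   NA, 0.10,
                     0.02, 0.08,   NA), nrow = 3, byrow = TRUE)

# Pairwise overlap of components i and j: sum of the two off-diagonal entries.
pair_overlap <- OmegaMap + t(OmegaMap)

# Each unordered pair appears once in the upper triangle.
upper <- upper.tri(pair_overlap)

bar_omega <- mean(pair_overlap[upper])  # average overlap
max_omega <- max(pair_overlap[upper])   # maximum overlap

# Row and column of the pair attaining the maximum.
rc_max <- which(pair_overlap == max_omega & upper, arr.ind = TRUE)[1, ]

bar_omega
max_omega
rc_max
```

For this made-up matrix the three pairwise overlaps are 0.12, 0.05, and 0.18, so the largest overlap belongs to components 2 and 3.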

Author(s)

Volodymyr Melnykov, Wei-Chen Chen, and Ranjan Maitra.

References

Maitra, R. and Melnykov, V. (2010) “Simulating data to study performance of finite mixture modeling and clustering algorithms”, Journal of Computational and Graphical Statistics, 19:2, 354-376.

Melnykov, V., Chen, W.-C., and Maitra, R. (2012) “MixSim: An R Package for Simulating Data to Study Performance of Clustering Algorithms”, Journal of Statistical Software, 51:12, 1-25.

Davies, R. (1980) “The distribution of a linear combination of chi-square random variables”, Applied Statistics, 29, 323-333.

Examples


library(MixSim)

data("iris", package = "datasets")
p <- ncol(iris) - 1
id <- as.integer(iris[, 5])
K <- max(id)

# estimate mixture parameters
Pi <- prop.table(tabulate(id))
Mu <- t(sapply(1:K, function(k){ colMeans(iris[id == k, -5]) }))
S <- sapply(1:K, function(k){ var(iris[id == k, -5]) })
dim(S) <- c(p, p, K)

overlap(Pi = Pi, Mu = Mu, S = S)

MixSim documentation built on Sept. 11, 2024, 9:08 p.m.