DiscrFact: Discriminant Factor Analysis for tclust Objects

Description

Analyzes a tclust object by calculating discriminant factors, which compare the quality of each observation's actual cluster assignment with that of its second best possible assignment. Discriminant factors measuring the strength of the "trimming" decision may also be defined. Cluster assignments of observations with large discriminant factors are considered "doubtful" decisions. Silhouette plots give a graphical overview of the distribution of the discriminant factors (see plot.DiscrFact). More details can be found in García-Escudero et al. (2010).

Usage

DiscrFact(x, threshold = 1/10)

Arguments

x

A tclust object.

threshold

A cluster assignment or a trimming decision for an observation with a discriminant factor larger than log(threshold) is considered as a "doubtful" decision.

Details

This function compares the actual (best) assignment of each observation with its second best possible assignment. The comparison is based on the discriminant factor of each observation, which is calculated here. If the discriminant factor of an observation is larger than a given level, log(threshold), the observation is considered "doubtfully" assigned to its cluster. More information is shown when plotting the returned DiscrFact object.
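The comparison described above can be illustrated without tclust. The following is a minimal sketch in base R, not the package's implementation. It assumes a hypothetical matrix `lik` of weighted likelihoods (one row per observation, one column per cluster) and takes the discriminant factor to be the log ratio of the second best to the best likelihood, following García-Escudero et al. (2010), so that nearly tied assignments give values close to 0 and exceed log(threshold).

```r
# Hypothetical weighted likelihoods: 4 observations (rows), 3 clusters (columns).
lik <- matrix(c(0.90, 0.05, 0.01,
                0.30, 0.28, 0.02,
                0.02, 0.60, 0.001,
                0.10, 0.11, 0.09),
              ncol = 3, byrow = TRUE)

threshold <- 1/10

# Rank the clusters of each observation by decreasing likelihood.
ord  <- t(apply(lik, 1, order, decreasing = TRUE))
ind  <- ord[, 1]   # best assignment
ind2 <- ord[, 2]   # second best assignment

# Likelihoods of the best and second best assignments.
n     <- nrow(lik)
disc  <- lik[cbind(seq_len(n), ind)]
disc2 <- lik[cbind(seq_len(n), ind2)]

# Discriminant factor (assumed sign convention): never positive,
# and close to 0 when the two candidates are nearly tied.
assignfact <- log(disc2 / disc)

# Flag assignments whose factor exceeds log(threshold).
doubtful <- assignfact > log(threshold)
data.frame(ind, ind2, assignfact = round(assignfact, 2), doubtful)
```

With these made-up values, the second and fourth observations are flagged, because their two largest likelihoods are almost equal.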

Value

The function returns an S3 object of class DiscrFact containing the following components:

x

A tclust object.

ylimmin

A minimum y-limit calculated for plotting purposes.

ind

The actual cluster assignment.

ind2

The second most likely cluster assignment for each observation.

disc

The (weighted) likelihood of the actual cluster assignment of each observation.

disc2

The (weighted) likelihood of the second best cluster assignment of each observation.

assignfact

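The discriminant factor log(disc2/disc) of each observation. It is never positive, and values close to 0 (larger than log(threshold)) indicate a "doubtful" assignment.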

threshold

The threshold used for deciding whether assignfact indicates a "doubtful" assignment.

mean.DiscrFact

A vector of length k + 1 containing the mean discriminant factors for each cluster (including the outliers).
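As an illustration of the last component, per-group means of discriminant factors can be computed with `tapply`. This is only a sketch on made-up values: the vectors `ind` and `assignfact` below are hypothetical stand-ins for the components of the same name, and labelling the trimmed observations with 0 is an assumption made for this example.

```r
# Hypothetical assignments for 6 observations; 0 marks a trimmed observation (assumed label).
ind        <- c(0, 1, 1, 2, 2, 2)
assignfact <- c(-0.5, -4.0, -2.0, -6.0, -1.0, -5.0)
k <- 2

# One mean per group: the trimmed observations first, then clusters 1..k.
mean_df <- tapply(assignfact, factor(ind, levels = 0:k), mean)
mean_df
```

Using a factor with explicit levels `0:k` keeps the result at length k + 1 even if one group happens to be empty (its mean is then `NA`).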

Author(s)

Agustin Mayo-Iscar, Luis Angel García-Escudero, Heinrich Fritz

References

García-Escudero, L.A.; Gordaliza, A.; Matrán, C. and Mayo-Iscar, A. (2010), "Exploring the number of groups in robust model-based clustering." Statistics and Computing, (Forthcoming).
Preprint available at www.eio.uva.es/infor/personas/langel.html.

See Also

plot.DiscrFact

Examples

library (tclust)
library (mvtnorm)   ##  provides rmvnorm

##  simulate two clusters plus widely scattered background noise
sig <- diag (2)
cen <- rep (1, 2)
x <- rbind (
	rmvnorm (360, cen * 0,   sig),
	rmvnorm (540, cen * 5,   sig * 6 - 2),
	rmvnorm (100, cen * 2.5, sig * 50)
)

clus.1 <- tclust (x, k = 2, alpha = 0.1, restr.fact = 12)

##  restr.fact and k are deliberately chosen improperly here, to point out
##    the difference in the plot of DiscrFact
clus.2 <- tclust (x, k = 3, alpha = 0.1, restr.fact = 1)

dsc.1 <- DiscrFact (clus.1)
plot (dsc.1)

dsc.2 <- DiscrFact (clus.2)
plot (dsc.2)
