INCAindex: INCA index
In ICGE: Estimation of Number of Clusters and Identification of Atypical Units

INCAindex

R Documentation

INCA index

Description

INCAindex helps to estimate the number of clusters in a dataset.

Usage

INCAindex(d, pert_clus)

Arguments

`d`	a distance matrix or a `dist` object with distance information between units.
`pert_clus`	an n-vector that indicates which group each unit belongs to. The expected values of `pert_clus` are integers greater than or equal to 1 (for instance 1, 2, 3, ..., k). The default value indicates the presence of only one group in the data.
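
As an illustration of the two arguments, the following base R sketch prepares inputs in the expected form. The label values and coordinates are made up for the example, and recoding with `factor` is one possible approach, not something the package requires.

```r
# Hypothetical labels as another tool might return them (0-based, with gaps).
raw_labels <- c(0, 0, 2, 2, 5, 5)

# Recode to consecutive integers 1..k; units keep their group membership.
pert_clus <- as.integer(factor(raw_labels))
pert_clus
# [1] 1 1 2 2 3 3

# Made-up 2-dimensional coordinates for six units.
x <- matrix(c(0, 0,  1, 0,  5, 5,  6, 5,  10, 0,  11, 0),
            ncol = 2, byrow = TRUE)

# The distance information can be held as a `dist` object or a full matrix.
d_dist <- dist(x)            # object of class "dist"
d_mat  <- as.matrix(d_dist)  # symmetric 6 x 6 matrix with zero diagonal
```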

Value

Returns an object of class `incaix`, which is a list containing the following components:

`well_class`	a vector indicating the number of well-classified units in each cluster.
`Ni_cluster`	a vector indicating the size of each cluster.
`Total`	percentage of well-classified objects in the partition defined by `pert_clus`.

Note

For a correct geometrical interpretation, it is advisable to verify that the distance matrix d is Euclidean. The returned object admits the associated methods summary and plot: summary returns the percentage of well-classified units, and plot draws a bar chart with the percentage of well-classified units for each group in the given partition.
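
One way to carry out the Euclidean check is the classical multidimensional scaling criterion: a distance matrix is Euclidean exactly when its double-centered matrix of squared distances is positive semidefinite. The sketch below uses base R only; the helper name `is_euclidean` and the tolerance `tol` are assumptions made for this example and are not part of ICGE.

```r
# Hypothetical helper (not part of ICGE): TRUE if `d` is Euclidean
# up to a relative numerical tolerance.
is_euclidean <- function(d, tol = 1e-8) {
  D2 <- as.matrix(d)^2                    # squared distances
  n  <- nrow(D2)
  J  <- diag(n) - matrix(1 / n, n, n)     # centering matrix
  B  <- -0.5 * J %*% D2 %*% J             # double-centered Gram matrix
  B  <- (B + t(B)) / 2                    # remove rounding asymmetry
  ev <- eigen(B, symmetric = TRUE, only.values = TRUE)$values
  # Euclidean if no eigenvalue is meaningfully negative.
  min(ev) >= -tol * max(abs(ev))
}

# Distances computed from real coordinates are Euclidean.
set.seed(1)  # arbitrary seed, for reproducibility only
x <- matrix(rnorm(50), ncol = 5)
is_euclidean(dist(x))

# A matrix that violates the triangle inequality is not.
bad <- matrix(c(0, 1, 5,
                1, 0, 1,
                5, 1, 0), nrow = 3, byrow = TRUE)
is_euclidean(bad)
```

The relative tolerance guards against tiny negative eigenvalues caused by floating-point rounding; a stricter or looser value may suit other data.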

Author(s)

Itziar Irigoien itziar.irigoien@ehu.eus; Konputazio Zientziak eta Adimen Artifiziala, Euskal Herriko Unibertsitatea (UPV/EHU), Donostia, Spain.

Conchita Arenas carenas@ub.edu; Departament d'Estadistica, Universitat de Barcelona, Barcelona, Spain.

References

Arenas, C. and Cuadras, C.M. (2002). Some recent statistical methods based on distances. Contributions to Science, 2, 183–191.

Irigoien, I. and Arenas, C. (2008). INCA: New statistic for estimating the number of clusters and identifying atypical units. Statistics in Medicine, 27(15), 2948–2973.

Examples

library(ICGE)

# Generate 3 clusters of 20 objects each in dimension 5.
mu1 <- sample(1:10, 5, replace=TRUE)
x1 <- matrix(rnorm(20*5, mean = mu1, sd = 1),ncol=5, byrow=TRUE)
mu2 <- sample(1:10, 5, replace=TRUE)
x2 <- matrix(rnorm(20*5, mean = mu2, sd = 1),ncol=5, byrow=TRUE)
mu3 <- sample(1:10, 5, replace=TRUE)
x3 <- matrix(rnorm(20*5, mean = mu3, sd = 1),ncol=5, byrow=TRUE)
x <- rbind(x1,x2,x3)

# Euclidean distance between units.
d <- dist(x)

# Given the right partition, calculate the percentage of well-classified objects.
partition <- c(rep(1,20), rep(2,20), rep(3,20))
INCAindex(d, partition)


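# To estimate the number of clusters in the data, try several
# partitions and compare the results.
library(cluster)
# inca_total[k] holds the index for k clusters (the name T is avoided
# because it masks the shorthand for TRUE).
inca_total <- rep(NA, 5)
for (k in 2:5) {
  part <- pam(d, k)$clustering
  inca_total[k] <- INCAindex(d, part)$Total
}

plot(inca_total, type = "b", xlab = "Number of clusters", ylab = "INCA", xlim = c(1.5, 5.5))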
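
The components listed under Value and the methods mentioned in the Note can be inspected as in the following sketch. It assumes the ICGE package is installed and skips the calls otherwise; the cluster means and the seed are arbitrary choices made for this example.

```r
# Three well-separated simulated clusters of 20 units each in dimension 5.
set.seed(42)  # arbitrary seed, for reproducibility only
x <- rbind(
  matrix(rnorm(20 * 5, mean = 0),  ncol = 5),
  matrix(rnorm(20 * 5, mean = 5),  ncol = 5),
  matrix(rnorm(20 * 5, mean = 10), ncol = 5)
)
d <- dist(x)
partition <- rep(1:3, each = 20)

# Run only when ICGE is available.
if (requireNamespace("ICGE", quietly = TRUE)) {
  res <- ICGE::INCAindex(d, partition)
  print(res$well_class)  # well-classified units
  print(res$Ni_cluster)  # cluster sizes
  print(res$Total)       # percentage of well-classified units
  summary(res)           # summary method described in the Note
  plot(res)              # bar chart described in the Note
}
```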

ICGE documentation built on Oct. 17, 2022, 5:10 p.m.
