# estW: INCA Statistic In ICGE: Estimation of Number of Clusters and Identification of Atypical Units

## INCA Statistic

### Description

Assume that n units are divided into k clusters C1,...,Ck, and consider a fixed unit x0. The function `estW` calculates the INCA statistic W(x0) and the related U_i statistics.

### Usage

```r
estW(d, dx0, pert = "onegroup")
```

### Arguments

| Argument | Description |
| --- | --- |
| `d` | a distance matrix or a `dist` object with distance information between units. |
| `dx0` | an n-vector containing the distances d0j between x0 and unit j. |
| `pert` | an n-vector that indicates which group each unit belongs to. The expected values of `pert` are consecutive integers greater than or equal to 1 (for instance 1, 2, 3, ..., k). The default value, `"onegroup"`, indicates that the data contain only one group. |
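As a rough illustration of how `pert` and `dx0` might be prepared, the sketch below uses base R only. The `labels` vector is made up for the example, and the distance to x0 is assumed to be Euclidean, matching the default of `dist`; neither choice is prescribed by the package.

```r
# Hypothetical group labels; any vector accepted by factor() would do.
labels <- c("b", "a", "c", "a", "b")

# Recode the labels as consecutive integers 1..k (levels in sorted order).
pert <- as.integer(factor(labels))

# Distances from a new unit x0 to every row of a numeric matrix X,
# assuming the Euclidean distance.
X <- as.matrix(iris[, 1:4])
x0 <- c(5.3, 3.6, 1.1, 0.1)
dx0 <- sqrt(rowSums(sweep(X, 2, x0)^2))
```

Whatever distance is used for `dx0` should be the same one used to build `d`; otherwise the two inputs describe different geometries.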

### Value

The function returns an object of class `incaest` which is a list containing the following components:

| Component | Description |
| --- | --- |
| `Wvalue` | the INCA statistic W(x0). |
| `Uvalue` | a vector containing the statistics U_i. |

### Note

For a correct geometrical interpretation, it is advisable to verify that the distance matrix `d` is Euclidean.
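One possible way to run such a check in base R is the classical multidimensional scaling criterion: a distance matrix is Euclidean exactly when the doubly centred matrix of squared distances has no negative eigenvalues. The helper `is_euclidean_dist` and its tolerance `tol` are hypothetical names introduced for this sketch and are not part of the package.

```r
# Hypothetical helper (not part of ICGE): TRUE when the distances in `d`
# admit an exact Euclidean representation, up to numerical tolerance.
is_euclidean_dist <- function(d, tol = 1e-8) {
  D2 <- as.matrix(d)^2                   # squared distances
  n <- nrow(D2)
  J <- diag(n) - matrix(1 / n, n, n)     # centring matrix
  B <- -0.5 * J %*% D2 %*% J             # doubly centred inner-product matrix
  # Symmetrise to guard against rounding, then take the eigenvalues.
  ev <- eigen((B + t(B)) / 2, symmetric = TRUE, only.values = TRUE)$values
  min(ev) >= -tol * max(abs(ev))         # no eigenvalue is meaningfully negative
}

# Euclidean by construction
ok <- is_euclidean_dist(dist(iris[, 1:4]))

# Three points violating the triangle inequality (1 + 1 < 3): not Euclidean
bad <- is_euclidean_dist(as.dist(matrix(c(0, 1, 3,
                                          1, 0, 1,
                                          3, 1, 0), nrow = 3)))
```

The relative tolerance is a judgment call: eigenvalues that should be zero typically come out as tiny negative numbers in floating point, so a strict `>= 0` test would reject many genuinely Euclidean matrices.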

### Author(s)

Itziar Irigoien itziar.irigoien@ehu.eus; Konputazio Zientziak eta Adimen Artifiziala, Euskal Herriko Unibertsitatea (UPV/EHU), Donostia, Spain.

Conchita Arenas carenas@ub.edu; Departament d'Estadistica, Universitat de Barcelona, Barcelona, Spain.

### References

Arenas, C. and Cuadras, C.M. (2002). Some recent statistical methods based on distances. Contributions to Science, 2, 183–191.

Irigoien, I. and Arenas, C. (2008). INCA: New statistic for estimating the number of clusters and identifying atypical units. Statistics in Medicine, 27(15), 2948–2973.

### Examples

```r
library(ICGE)

data(iris)
d <- dist(iris[, 1:4])

# characteristics of a specific flower (likely group 1)
x0 <- c(5.3, 3.6, 1.1, 0.1)

# Euclidean distances between flower x0 and every flower in iris
dx0 <- rep(0, 150)
for (i in 1:150) {
  dif <- x0 - iris[i, 1:4]
  dx0[i] <- sqrt(sum(dif * dif))
}

# INCA statistic for x0, with species as the grouping
estW(d, dx0, iris[, 5])
```

