# fanny: Fuzzy Analysis Clustering

In package cluster: "Finding Groups in Data": Cluster Analysis Extended Rousseeuw et al.

## Description

Computes a fuzzy clustering of the data into `k` clusters.

## Usage

```
fanny(x, k, diss = inherits(x, "dist"), memb.exp = 2,
      metric = c("euclidean", "manhattan", "SqEuclidean"),
      stand = FALSE, iniMem.p = NULL, cluster.only = FALSE,
      keep.diss = !diss && !cluster.only && n < 100,
      keep.data = !diss && !cluster.only,
      maxit = 500, tol = 1e-15, trace.lev = 0)
```

## Arguments

`x` data matrix or data frame, or dissimilarity matrix, depending on the value of the `diss` argument.

In case of a matrix or data frame, each row corresponds to an observation, and each column corresponds to a variable. All variables must be numeric. Missing values (NAs) are allowed.

In case of a dissimilarity matrix, `x` is typically the output of `daisy` or `dist`. Also a vector of length n*(n-1)/2 is allowed (where n is the number of observations), and will be interpreted in the same way as the output of the above-mentioned functions. Missing values (NAs) are not allowed.

`k` integer giving the desired number of clusters. It is required that 0 < k < n/2 where n is the number of observations.

`diss` logical flag: if TRUE (default for `dist` or `dissimilarity` objects), then `x` is assumed to be a dissimilarity matrix. If FALSE, then `x` is treated as a matrix of observations by variables.

`memb.exp` number r strictly larger than 1 specifying the membership exponent used in the fit criterion; see the ‘Details’ below. Default: `2`, which used to be hardwired inside FANNY.

`metric` character string specifying the metric to be used for calculating dissimilarities between observations. Options are `"euclidean"` (default), `"manhattan"`, and `"SqEuclidean"`. Euclidean distances are the root sum-of-squares of differences, manhattan distances are the sum of absolute differences, and `"SqEuclidean"` distances, the squared euclidean distances, are the sum-of-squares of differences. Using this last option is equivalent (but somewhat slower) to computing so-called “fuzzy C-means”. If `x` is already a dissimilarity matrix, then this argument will be ignored.

`stand` logical; if true, the measurements in `x` are standardized before calculating the dissimilarities. Measurements are standardized for each variable (column), by subtracting the variable's mean value and dividing by the variable's mean absolute deviation. If `x` is already a dissimilarity matrix, then this argument will be ignored.

`iniMem.p` numeric n x k matrix or `NULL` (by default); can be used to specify a starting `membership` matrix, i.e., a matrix of non-negative numbers, each row summing to one.

`cluster.only` logical; if true, no silhouette information will be computed and returned, see details.

`keep.diss, keep.data` logicals indicating if the dissimilarities and/or input data `x` should be kept in the result. Setting these to `FALSE` can give smaller results and hence also save memory allocation time.

`maxit, tol` maximal number of iterations and default tolerance for convergence (relative convergence of the fit criterion) for the FANNY algorithm. The defaults `maxit = 500` and `tol = 1e-15` used to be hardwired inside the algorithm.

`trace.lev` integer specifying a trace level for printing diagnostics during the C-internal algorithm. Default `0` does not print anything; higher values print increasingly more.
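As a rough illustration of the `diss` and `iniMem.p` arguments, the sketch below clusters the same simulated data once from the raw matrix and once from `dist(x)`, then supplies a starting membership matrix. The data, the seed, and the names `m0`, `fit_raw`, `fit_dis` and `fit_ini` are made up for this example; the two fits are only expected to agree up to numerical tolerance.

```r
library(cluster)

set.seed(1)  # arbitrary seed, only to make the sketch reproducible
## two well-separated groups of 10 and 15 points (simulated for illustration)
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))
n <- nrow(x)
k <- 2

## observations-by-variables input: diss defaults to FALSE for a plain matrix
fit_raw <- fanny(x, k)

## dissimilarity input: diss defaults to TRUE because dist(x) inherits from "dist"
fit_dis <- fanny(dist(x), k)

## a starting membership matrix: n x k, non-negative, each row summing to one
m0 <- matrix(runif(n * k), n, k)
m0 <- m0 / rowSums(m0)
fit_ini <- fanny(x, k, iniMem.p = m0)

## cross-tabulate the hard clusterings obtained from the two kinds of input
table(raw = fit_raw$clustering, dis = fit_dis$clustering)
```

Passing `dist(x)` is mainly useful when the dissimilarities come from somewhere other than a Euclidean or Manhattan metric, for example from `daisy` on mixed-type data.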

## Details

In a fuzzy clustering, each observation is “spread out” over the various clusters. Denote by u(i,v) the membership of observation i to cluster v.

The memberships are nonnegative, and for a fixed observation i they sum to 1. The particular method `fanny` stems from chapter 4 of Kaufman and Rousseeuw (1990) (see the references in `daisy`) and has been extended by Martin Maechler to allow user specified `memb.exp`, `iniMem.p`, `maxit`, `tol`, etc.
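The two constraints on the memberships can be checked directly on a fitted object. The sketch below uses simulated data (an assumption, as are the seed and the variable names) and reads the `membership` and `clustering` components of the result.

```r
library(cluster)

set.seed(2)  # arbitrary seed for reproducibility
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))
fit <- fanny(x, 2)

u <- fit$membership      # n x k matrix; u[i, v] is the membership of obs. i in cluster v
all(u >= 0)              # memberships are non-negative
range(rowSums(u))        # each row sums to 1, up to rounding error

## hardening the fuzzy memberships: pick the cluster with the largest u[i, v];
## this would normally coincide with the reported fit$clustering
hard <- apply(u, 1, which.max)
table(hard, reported = fit$clustering)
```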

Fanny aims to minimize the objective function

SUM_[v=1..k] (SUM_(i,j) u(i,v)^r u(j,v)^r d(i,j)) / (2 SUM_j u(j,v)^r)

where n is the number of observations, k is the number of clusters, r is the membership exponent `memb.exp` and d(i,j) is the dissimilarity between observations i and j.
Note that r -> 1 gives increasingly crisper clusterings whereas r -> Inf leads to complete fuzziness. Kaufman and Rousseeuw (1990), p. 191, note that values too close to 1 can lead to slow convergence. Further note that even the default, r = 2, can lead to complete fuzziness, i.e., memberships u(i,v) == 1/k. In that case a warning is signalled and the user is advised to choose a smaller `memb.exp` (= r).
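To make the criterion concrete, here is a minimal sketch that evaluates the formula above from a membership matrix and a dissimilarity object, then refits simulated data for a few values of `memb.exp`. The helper `fanny_objective()` is hypothetical (it is not part of the cluster package), and the data and seed are invented. The recomputed value is only expected to be close to the first element of the `objective` component reported by `fanny`.

```r
library(cluster)

## Hypothetical helper implementing
##   SUM_v (SUM_(i,j) u(i,v)^r u(j,v)^r d(i,j)) / (2 SUM_j u(j,v)^r)
fanny_objective <- function(u, d, r = 2) {
  d   <- as.matrix(d)                # full symmetric n x n matrix with zero diagonal
  ur  <- u^r                         # memberships raised to the exponent r
  num <- colSums(ur * (d %*% ur))    # per cluster: sum over all ordered pairs (i, j)
  den <- 2 * colSums(ur)
  sum(num / den)
}

set.seed(3)  # arbitrary seed
## two groups plus three points lying between them (simulated)
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)),
           cbind(rnorm( 3, 3.2, 0.5), rnorm( 3, 3.2, 0.5)))
d <- dist(x)

## refit for several membership exponents and collect a few summaries
res <- t(sapply(c(1.5, 2, 3), function(r) {
  fit <- fanny(d, 2, memb.exp = r)
  c(memb.exp   = r,
    recomputed = fanny_objective(fit$membership, d, r),
    reported   = unname(fit$objective[1]),   # first element: criterion value
    dunn       = unname(fit$coeff["dunn_coeff"]))
}))
res
```

The `dunn` column is Dunn's partition coefficient; it should shrink as `memb.exp` grows, which is one way to see the move from crisp towards fuzzy memberships.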

Compared to other fuzzy clustering methods, `fanny` has the following features: (a) it also accepts a dissimilarity matrix; (b) it is more robust to the `spherical cluster` assumption; (c) it provides a novel graphical display, the silhouette plot (see `plot.partition`).

## Value

an object of class `"fanny"` representing the clustering. See `fanny.object` for details.
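A short sketch of how the returned object might be inspected, using simulated data and an arbitrary seed (both assumptions). The component names are those listed under "Available components" in the printed output; the layout of `silinfo` is described in `partition.object`.

```r
library(cluster)

set.seed(4)  # arbitrary seed
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))
fit <- fanny(x, 2)

class(fit)                   # should include "fanny"
head(fit$membership, 3)      # fuzzy memberships, one row per observation
fit$clustering               # closest hard clustering
fit$k.crisp                  # number of clusters in that hard clustering
fit$coeff                    # Dunn's partition coefficient and its normalized version
fit$convergence              # iterations, converged flag, maxit
fit$silinfo$avg.width        # average silhouette width of the hard clustering
```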

## See Also

`agnes` for background and references; `fanny.object`, `partition.object`, `plot.partition`, `daisy`, `dist`.

## Examples

```
library(cluster)

## generate 10+15 objects in two clusters, plus 3 objects lying
## between those clusters.
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)),
           cbind(rnorm( 3,3.2,0.5), rnorm( 3,3.2,0.5)))
fannyx <- fanny(x, 2)
## Note that observations 26:28 are "fuzzy" (closer to # 2):
fannyx
summary(fannyx)
plot(fannyx)

(fan.x.15 <- fanny(x, 2, memb.exp = 1.5)) # 'crispier' for obs. 26:28
(fanny(x, 2, memb.exp = 3))               # more fuzzy in general

data(ruspini)
f4 <- fanny(ruspini, 4)
stopifnot(rle(f4$clustering)$lengths == c(20,23,17,15))
plot(f4, which = 1)
## Plot similar to Figure 6 in Struyf et al (1996)
plot(fanny(ruspini, 5))
```

### Example output

```
Fuzzy Clustering object of class 'fanny' :
m.ship.expon.        2
objective     12.64452
tolerance        1e-15
iterations           9
converged            1
maxit              500
n                   28
Membership coefficients (in %, rounded):
[,1] [,2]
[1,]   97    3
[2,]   95    5
[3,]   98    2
[4,]   97    3
[5,]   94    6
[6,]   96    4
[7,]   95    5
[8,]   96    4
[9,]   98    2
[10,]   96    4
[11,]    3   97
[12,]    4   96
[13,]    3   97
[14,]   12   88
[15,]    9   91
[16,]    3   97
[17,]   11   89
[18,]    4   96
[19,]    6   94
[20,]    5   95
[21,]    3   97
[22,]    3   97
[23,]    2   98
[24,]    3   97
[25,]   10   90
[26,]   27   73
[27,]   30   70
[28,]   39   61
Fuzzyness coefficients:
dunn_coeff normalized
0.8751449  0.7502897
Closest hard clustering:
 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

Available components:
 "membership"  "coeff"       "memb.exp"    "clustering"  "k.crisp"
 "objective"   "convergence" "diss"        "call"        "silinfo"
 "data"
Fuzzy Clustering object of class 'fanny' :
m.ship.expon.        2
objective     12.64452
tolerance        1e-15
iterations           9
converged            1
maxit              500
n                   28
Membership coefficients (in %, rounded):
[,1] [,2]
[1,]   97    3
[2,]   95    5
[3,]   98    2
[4,]   97    3
[5,]   94    6
[6,]   96    4
[7,]   95    5
[8,]   96    4
[9,]   98    2
[10,]   96    4
[11,]    3   97
[12,]    4   96
[13,]    3   97
[14,]   12   88
[15,]    9   91
[16,]    3   97
[17,]   11   89
[18,]    4   96
[19,]    6   94
[20,]    5   95
[21,]    3   97
[22,]    3   97
[23,]    2   98
[24,]    3   97
[25,]   10   90
[26,]   27   73
[27,]   30   70
[28,]   39   61
Fuzzyness coefficients:
dunn_coeff normalized
0.8751449  0.7502897
Closest hard clustering:
 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

Silhouette plot information:
cluster neighbor sil_width
9        1        2 0.9395819
3        1        2 0.9381881
4        1        2 0.9306578
1        1        2 0.9235375
10       1        2 0.9180549
6        1        2 0.9168881
8        1        2 0.9161457
2        1        2 0.9119693
7        1        2 0.9004525
5        1        2 0.8948673
23       2        1 0.8720496
21       2        1 0.8716156
24       2        1 0.8715160
22       2        1 0.8700107
13       2        1 0.8669201
16       2        1 0.8647070
11       2        1 0.8627588
18       2        1 0.8594370
12       2        1 0.8566746
20       2        1 0.8541398
19       2        1 0.8391494
15       2        1 0.8080395
25       2        1 0.8051226
17       2        1 0.7922377
14       2        1 0.7770089
26       2        1 0.5609001
27       2        1 0.5207050
28       2        1 0.3310012
Average silhouette width per cluster:
 0.9190343 0.7824441
Average silhouette width of total data set:
 0.8312263

378 dissimilarities, summarized :
Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.03481 0.91004 3.50410 4.13540 7.48930 9.27200
Metric :  euclidean
Number of objects : 28

Available components:
 "membership"  "coeff"       "memb.exp"    "clustering"  "k.crisp"
 "objective"   "convergence" "diss"        "call"        "silinfo"
 "data"
Fuzzy Clustering object of class 'fanny' :
m.ship.expon.      1.5
objective     14.50784
tolerance        1e-15
iterations          10
converged            1
maxit              500
n                   28
Membership coefficients (in %, rounded):
[,1] [,2]
[1,]  100    0
[2,]  100    0
[3,]  100    0
[4,]  100    0
[5,]  100    0
[6,]  100    0
[7,]  100    0
[8,]  100    0
[9,]  100    0
[10,]  100    0
[11,]    0  100
[12,]    0  100
[13,]    0  100
[14,]    2   98
[15,]    1   99
[16,]    0  100
[17,]    1   99
[18,]    0  100
[19,]    0  100
[20,]    0  100
[21,]    0  100
[22,]    0  100
[23,]    0  100
[24,]    0  100
[25,]    1   99
[26,]   10   90
[27,]   12   88
[28,]   24   76
Fuzzyness coefficients:
dunn_coeff normalized
0.9666511  0.9333022
Closest hard clustering:
 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

Available components:
 "membership"  "coeff"       "memb.exp"    "clustering"  "k.crisp"
 "objective"   "convergence" "diss"        "call"        "silinfo"
 "data"
Fuzzy Clustering object of class 'fanny' :
m.ship.expon.        3
objective     7.937143
tolerance        1e-15
iterations          12
converged            1
maxit              500
n                   28
Membership coefficients (in %, rounded):
[,1] [,2]
[1,]   84   16
[2,]   82   18
[3,]   87   13
[4,]   85   15
[5,]   79   21
[6,]   84   16
[7,]   81   19
[8,]   82   18
[9,]   88   12
[10,]   84   16
[11,]   15   85
[12,]   17   83
[13,]   14   86
[14,]   28   72
[15,]   24   76
[16,]   16   84
[17,]   26   74
[18,]   16   84
[19,]   21   79
[20,]   18   82
[21,]   14   86
[22,]   14   86
[23,]   13   87
[24,]   14   86
[25,]   25   75
[26,]   39   61
[27,]   40   60
[28,]   45   55
Fuzzyness coefficients:
dunn_coeff normalized
0.6933146  0.3866293
Closest hard clustering:
 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

Available components:
 "membership"  "coeff"       "memb.exp"    "clustering"  "k.crisp"
 "objective"   "convergence" "diss"        "call"        "silinfo"
 "data"
```

cluster documentation built on April 17, 2021, 9:07 a.m.