
This function implements Factorial K-means (Vichi and Kiers, 2001) and Reduced K-means (De Soete and Carroll, 1994), as well as a compromise version of these two methods. The methods combine Principal Component Analysis for dimension reduction with K-means for clustering.
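To make the idea of combining dimension reduction with clustering concrete, here is a minimal base R sketch of a Reduced K-means style alternating scheme. It is an illustration under stated assumptions, not the code used by `cluspca`: the function name `rkm_sketch` and its helper `subspace` are hypothetical, the start is a plain K-means partition, and the variables are always centered and standardized.

```r
# Illustrative Reduced K-means sketch in base R (hypothetical helper,
# not clustrd's implementation).
rkm_sketch <- function(X, nclus, ndim, max_iter = 100, seed = 1) {
  set.seed(seed)
  X <- scale(as.matrix(X), center = TRUE, scale = TRUE)

  # For a given partition, compute loadings, object scores and centroids.
  subspace <- function(cl) {
    sizes <- tabulate(cl, nbins = nclus)
    M <- rowsum(X, cl) / sizes                 # cluster means, nclus x p
    # Loadings: leading eigenvectors of the weighted between-cluster
    # cross-product matrix t(M) %*% diag(sizes) %*% M.
    B <- crossprod(M * sqrt(sizes))
    A <- eigen(B, symmetric = TRUE)$vectors[, seq_len(ndim), drop = FALSE]
    list(A = A, Y = X %*% A, G = M %*% A)
  }

  # Start from an ordinary K-means partition in the full variable space.
  cl <- kmeans(X, centers = nclus, nstart = 5)$cluster

  for (iter in seq_len(max_iter)) {
    s <- subspace(cl)
    # Squared distance from every object to every centroid in the subspace.
    D2 <- outer(rowSums(s$Y^2), rowSums(s$G^2), "+") - 2 * s$Y %*% t(s$G)
    new_cl <- max.col(-D2, ties.method = "first")
    if (any(tabulate(new_cl, nbins = nclus) == 0)) {
      warning("an update emptied a cluster; keeping the previous partition")
      break
    }
    if (all(new_cl == cl)) break               # partition is stable
    cl <- new_cl
  }

  s <- subspace(cl)
  # Loss: squared distance between the data and the centroids mapped back
  # to the original variable space.
  criterion <- sum((X - s$G[cl, , drop = FALSE] %*% t(s$A))^2)
  list(cluster = cl, loadings = s$A, scores = s$Y,
       centroid = s$G, criterion = criterion)
}

fit <- rkm_sketch(iris[, 1:4], nclus = 3, ndim = 2)
table(fit$cluster, iris$Species)
```

The sketch alternates between updating the loadings for a fixed partition and reassigning objects to the nearest centroid in the reduced space, which is the general shape of such algorithms; the package itself also handles multiple starts, rotation and the `alpha` compromise.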

```
cluspca(data, nclus, ndim, alpha = NULL, method = c("RKM","FKM"),
        center = TRUE, scale = TRUE, rotation = "none", nstart = 100,
        smartStart = NULL, seed = NULL)

## S3 method for class 'cluspca'
print(x, ...)

## S3 method for class 'cluspca'
summary(object, ...)

## S3 method for class 'cluspca'
fitted(object, mth = c("centers", "classes"), ...)
```

`data` |
Dataset with metric variables |

`nclus` |
Number of clusters (nclus = 1 returns the PCA solution) |

`ndim` |
Dimensionality of the solution |

`method` |
Specifies the method. Options are `"RKM"` for reduced K-means and `"FKM"` for factorial K-means (default = `"RKM"`) |

`alpha` |
Adjusts for the relative importance of RKM and FKM in the objective function; |

`center` |
A logical value indicating whether the variables should be shifted to be zero centered (default = `TRUE`) |

`scale` |
A logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place (default = `TRUE`) |

`rotation` |
Specifies the method used to rotate the factors. Options are |

`nstart` |
Number of starts (default = 100) |

`smartStart` |
If |

`seed` |
An integer that is used as argument by |

`x` |
For the |

`object` |
For the |

`mth` |
For the |

`...` |
Not used |
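As a rough illustration of what the `center` and `scale` arguments ask for, the sketch below applies the corresponding base R transformation to a built-in dataset. That `cluspca` standardizes in exactly this way (for example, with the n - 1 divisor used by `scale()`) is an assumption, not something stated on this page.

```r
# Base R analogue of center = TRUE / scale = TRUE (assumed, not verified,
# to match the preprocessing done inside cluspca).
X <- as.matrix(USArrests)

Xc <- scale(X, center = TRUE, scale = FALSE)  # mean centered only
Xs <- scale(X, center = TRUE, scale = TRUE)   # centered and unit variance

round(colMeans(Xs), 10)  # column means after standardizing
apply(Xs, 2, sd)         # column standard deviations after standardizing
```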

For the K-means part, the algorithm of Hartigan-Wong is used by default.
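For reference, base R's `kmeans()` also uses the Hartigan-Wong algorithm by default. The following sketch runs a simple tandem analysis (PCA followed by K-means on the component scores) with several random starts. It uses only base R and the built-in `USArrests` data, and it is not meant to reproduce the internals of `cluspca`.

```r
# Tandem sketch in base R: PCA first, then Hartigan-Wong K-means on the
# first two principal component scores.
set.seed(1234)
X <- scale(USArrests)

pca <- prcomp(X)
scores <- pca$x[, 1:2]

km <- kmeans(scores, centers = 3, nstart = 10, algorithm = "Hartigan-Wong")

km$size          # number of objects in each cluster
km$tot.withinss  # total within-cluster sum of squares in the reduced space
```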

The hidden `print` and `summary` methods print out some key components of an object of class `cluspca`.

The hidden `fitted` method returns cluster fitted values. If `mth` is `"classes"`, this is a vector of cluster memberships (the `cluster` component of the `cluspca` object). If `mth` is `"centers"`, this is a matrix in which each row is the center of the cluster to which that observation belongs. The row names of the matrix are the cluster membership values.
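The same two kinds of fitted values exist for base R `kmeans` objects, which makes for a self-contained illustration. Note that the base R method names its argument `method`, whereas the `cluspca` method uses `mth`; the example below shows only the base R analogue.

```r
# Base R analogue: fitted values of a kmeans object.
set.seed(42)
X <- scale(USArrests)
km <- kmeans(X, centers = 3, nstart = 10)

classes <- fitted(km, method = "classes")  # vector of cluster memberships
centers <- fitted(km, method = "centers")  # one cluster center per observation

head(classes)
head(centers)

# Residuals from the fitted centers reproduce the within-cluster sum of squares.
resid_ss <- sum((X - centers)^2)
all.equal(resid_ss, km$tot.withinss)
```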

When `nclus` = 1, the function returns the PCA solution and `plot(object)` shows the corresponding biplot.
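For comparison, a plain PCA solution and its biplot can be obtained in base R as sketched below. The scores and loadings returned by `cluspca` with `nclus = 1` may differ from these in sign, scaling or rotation, so this is an analogy rather than a reproduction.

```r
# Base R PCA on standardized data: object scores, variable loadings, biplot.
X <- scale(USArrests)
pca <- prcomp(X, center = FALSE, scale. = FALSE)  # X is already standardized

obj_scores <- pca$x[, 1:2]           # object scores on dimensions 1 and 2
var_loadings <- pca$rotation[, 1:2]  # variable loadings on dimensions 1 and 2

# Draw the biplot only in an interactive session.
if (interactive()) biplot(pca, choices = 1:2)
```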

`obscoord` |
Object scores |

`attcoord` |
Variable scores |

`centroid` |
Cluster centroids |

`cluster` |
Cluster membership |

`criterion` |
Optimal value of the objective function |

`size` |
The number of objects in each cluster |

`scale` |
A copy of |

`center` |
A copy of |

`nstart` |
A copy of |

`odata` |
A copy of |

De Soete, G., and Carroll, J. D. (1994). K-means clustering in a low-dimensional Euclidean space. In Diday E. et al. (Eds.), *New Approaches in Classification and Data Analysis*, Heidelberg: Springer, 212-219.

Vichi, M., and Kiers, H.A.L. (2001). Factorial K-means analysis for two-way data. *Computational Statistics and Data Analysis*, 37, 49-64.

```
library(clustrd)

# Reduced K-means with 3 clusters in 2 dimensions after 10 random starts
data(macro)
outRKM = cluspca(macro, 3, 2, method = "RKM", rotation = "varimax",
                 scale = FALSE, nstart = 10)
summary(outRKM)
# Scatterplot (dimensions 1 and 2) and cluster description plot
plot(outRKM, cludesc = TRUE)

# Factorial K-means with 3 clusters in 2 dimensions,
# using the Reduced K-means partition as the starting solution
outFKM = cluspca(macro, 3, 2, method = "FKM", rotation = "varimax",
                 scale = FALSE, smartStart = outRKM$cluster)
outFKM
# Scatterplot (dimensions 1 and 2) and cluster description plot
plot(outFKM, cludesc = TRUE)

# To get the Tandem approach (PCA(SVD) + K-means)
outTandem = cluspca(macro, 3, 2, alpha = 1, seed = 1234)
plot(outTandem)

# nclus = 1 just gives the PCA solution
# outPCA = cluspca(macro, 1, 2)
# outPCA
# Scatterplot (dimensions 1 and 2)
# plot(outPCA)
```

```
Loading required package: ggplot2
Loading required package: dummies
dummies-1.5.6 provided by Decision Patterns
Loading required package: grid
Solution with 3 clusters of sizes 12 (60%), 5 (25%), 3 (15%) in 2 dimensions. Variables were mean centered and unstandardized.
Cluster centroids:
Dim.1 Dim.2
Cluster 1 -1.1627 -2.9713
Cluster 2 -3.5997 5.9900
Cluster 3 10.6502 1.9020
Variable scores:
Dim.1 Dim.2
GDP 0.0638 -0.1169
LI -0.1734 -0.0140
UR -0.0610 -0.4849
IR 0.6662 -0.0344
TB -0.7179 0.0678
NNS 0.0544 0.8633
Within cluster sum of squares by cluster:
[1] 113.4856 23.2023 45.8149
(between_SS / total_SS = 79.72 %)
Clustering vector:
Australia Canada Finland France Spain Sweden
1 1 1 1 1 1
USA Netherlands Greece Mexico Portugal Austria
1 2 3 3 3 1
Belgium Denmark Germany Italy Japan Norway
2 1 1 1 2 2
Switzerland UK
2 1
Objective criterion value: 431.7131
Available output:
[1] "obscoord" "attcoord" "centroid" "cluster" "criterion" "size"
[7] "odata" "scale" "center" "nstart"
$map
$parcoord
Solution with 3 clusters of sizes 12 (60%), 5 (25%), 3 (15%) in 2 dimensions. Variables were mean centered and unstandardized.
Cluster centroids:
Dim.1 Dim.2
Cluster 1 -0.2945 -0.8344
Cluster 2 -3.9747 1.7404
Cluster 3 7.8024 0.4367
Variable scores:
Dim.1 Dim.2
GDP 0.2272 0.9209
LI -0.6554 0.1850
UR 0.0504 -0.1255
IR 0.6648 -0.1139
TB -0.2666 -0.0412
NNS -0.0574 0.2956
Within cluster sum of squares by cluster:
[1] 26.6997 12.7474 1.0522
(between_SS / total_SS = 87.62 %)
Objective criterion value: 40.4992
Available output:
[1] "obscoord" "attcoord" "centroid" "cluster" "criterion" "size"
[7] "odata" "scale" "center" "nstart"
$map
$parcoord
Warning messages:
1: Removed 1 rows containing missing values (geom_segment).
2: Removed 1 rows containing missing values (geom_text_repel).
```

clustrd documentation built on May 8, 2019, 5:03 p.m.
