KMeans  R Documentation 
KMeans
KMeans Cluster Analysis.
KMeans(
data = NULL,
centers = 2,
centers.names = NULL,
subset = NULL,
weights = NULL,
missing = "Use partial data",
iter.max = 100,
n.starts = 10,
algorithm = "Batch",
output = "Means",
profile.var = NULL,
seed = 1223,
binary = FALSE,
show.labels = FALSE,
max.nchar.subtitle = 200,
verbose = FALSE,
...
)
data 
A 
centers 
Either the number of clusters (e.g., 2), or a set of initial cluster centers. Where the number of clusters is specified, or the algorithm is 'Bagging', a random selection of rows of data is chosen as the initial start points. 
centers.names 
An optional comma-separated list that will be used to name the predicted clusters. 
subset 
An optional vector specifying a subset of observations to be
used in the fitting process, or, the name of a variable in 
weights 
An optional vector of sampling weights, or, the
name of a variable in 
missing 
How missing data is to be treated in the cluster analysis. Options:

iter.max 
The maximum number of iterations of the algorithm to run. 
n.starts 
The number of times the algorithm should be run, each time with a different set of start points. 
algorithm 
One of 
output 
The default is 
profile.var 
An optional list of variables which will be compared against the KMeans predicted cluster. 
seed 
The random number seed used in imputation. 
binary 
Makes categorical variables into indicator variables (otherwise their values are used). 
show.labels 
Shows the variable labels, as opposed to the names, in the outputs, where a variable's label is an attribute (e.g., attr(foo, "label")). 
max.nchar.subtitle 
Maximum number of characters in the subtitle. This is used to determine the number of significant profiling variables to show. 
verbose 
Whether or not to show the verbose outputs to 
... 
Additional arguments to 
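
As a rough illustration of how centers, iter.max, n.starts and seed interact, the sketch below uses base R's stats::kmeans rather than KMeans itself. It assumes these arguments play the same roles as centers, iter.max and nstart in stats::kmeans, which this page does not confirm. The data are simulated purely for illustration.

```r
# Illustrative sketch using stats::kmeans, not KMeans() itself.
# Assumption: centers, iter.max and n.starts behave like
# centers, iter.max and nstart in stats::kmeans.
set.seed(1223)  # fixed seed so the random start points are reproducible

# Simulated data: two well-separated groups of 50 cases on two variables.
x <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(100, mean = 5, sd = 0.5), ncol = 2)
)
colnames(x) <- c("v1", "v2")

# centers as a number: start points are random rows of the data, and the
# best of 10 random starts (lowest total within-cluster SS) is kept.
fit.random <- kmeans(x, centers = 2, iter.max = 100, nstart = 10)

# centers as a matrix: the supplied rows are used as the initial centers.
initial <- rbind(c(0, 0), c(5, 5))
fit.fixed <- kmeans(x, centers = initial, iter.max = 100)

fit.random$centers         # fitted cluster means
table(fit.random$cluster)  # cluster sizes
```

Supplying a matrix of initial centers removes the randomness of the start points, so repeated starts add nothing in that case.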
"Bagging"
uses bagging in an attempt to find replicable clusters. By default, 10 bootstrap samples are created (using weights if provided), k-means cluster analysis is used to find 20 clusters in each of these samples, and the complete-link hierarchical clustering algorithm is then used to form the final clusters (Leisch 1999). See bclust for the names and descriptions of additional parameters. After running bclust, cases are assigned to the most similar cluster.
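
The bagging steps described above can be sketched in base R as follows. This is a minimal illustration of the general idea in Leisch (1999), not the bclust implementation or the code used by KMeans. It ignores weights, uses simulated data, and all object names (n.samples, base.k, final.k, pooled.centers) are hypothetical.

```r
# Illustrative base-R sketch of bagged clustering (after Leisch 1999).
# Not the bclust implementation; weights are ignored in this sketch.
set.seed(1223)

# Simulated data: two well-separated groups of 100 cases each.
x <- rbind(
  matrix(rnorm(200, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(200, mean = 10, sd = 0.5), ncol = 2)
)

n.samples <- 10  # number of bootstrap samples
base.k <- 20     # clusters found in each bootstrap sample
final.k <- 2     # number of final clusters

# Step 1: run k-means on each bootstrap sample and pool all the centers.
pooled.centers <- do.call(rbind, lapply(seq_len(n.samples), function(i) {
  rows <- sample(nrow(x), replace = TRUE)
  kmeans(x[rows, ], centers = base.k, iter.max = 100)$centers
}))

# Step 2: complete-link hierarchical clustering of the pooled centers,
# cut into the requested number of final clusters.
tree <- hclust(dist(pooled.centers), method = "complete")
center.cluster <- cutree(tree, k = final.k)

# Step 3: assign each case to the final cluster of its nearest pooled center.
nearest <- apply(x, 1, function(case) {
  which.min(colSums((t(pooled.centers) - case)^2))
})
cluster <- center.cluster[nearest]

table(cluster)
```

Pooling centers across bootstrap samples means the final clusters depend less on any one sample or set of start points, which is the sense in which bagging aims at replicable clusters.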
Forgy, E. W. (1965) Cluster analysis of multivariate data: efficiency vs interpretability of classifications. Biometrics 21, 768–769.
Hartigan, J. A. and Wong, M. A. (1979) A K-means clustering algorithm. Applied Statistics 28, 100–108.
Leisch, Friedrich (1999) Bagged clustering. Working Paper 51, SFB "Adaptive Information Systems and Modeling in Economics and Management Science", August 1999. http://epub.wu.ac.at/1272/1/document.pdf
Lloyd, S. P. (1957, 1982) Least squares quantization in PCM. Technical Note, Bell Laboratories. Published in 1982 in IEEE Transactions on Information Theory 28, 128–137.
MacQueen, J. (1967) Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, eds L. M. Le Cam & J. Neyman, 1, pp. 281–297. Berkeley, CA: University of California Press.