The trimmed k-means clustering method by Cuesta-Albertos, Gordaliza and Matran (1997). It optimizes the k-means criterion after trimming a given proportion of the points.
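As a rough illustration of what "the k-means criterion under trimming" means, the following base R sketch evaluates the mean squared distance to the closest center over the retained points only. The helper `trimmed_criterion` is hypothetical and not part of the package, and the rule for how many points are retained (`ceiling((1 - trim) * n)`) is an assumption made for this sketch.

```r
# Hypothetical helper (not part of the package): evaluates a trimmed
# k-means criterion for a fixed set of centers.
trimmed_criterion <- function(X, centers, trim) {
  X <- as.matrix(X)
  centers <- as.matrix(centers)
  n <- nrow(X)
  # Squared Euclidean distance from every point to every center (n x k).
  d2 <- matrix(sapply(seq_len(nrow(centers)), function(j) {
    rowSums(sweep(X, 2, centers[j, ], "-")^2)
  }), nrow = n)
  # Squared distance of each point to its closest center.
  dmin <- apply(d2, 1, min)
  # Assumption of this sketch: keep the ceiling((1 - trim) * n) closest points.
  n_keep <- ceiling((1 - trim) * n)
  mean(sort(dmin)[seq_len(n_keep)])
}

# Two tight pairs of points plus one gross outlier.
X_demo <- rbind(c(0, 0), c(0, 1), c(10, 10), c(10, 11), c(100, 100))
centers_demo <- rbind(c(0, 0.5), c(10, 10.5))
crit_untrimmed <- trimmed_criterion(X_demo, centers_demo, trim = 0)    # 3222.25
crit_trimmed   <- trimmed_criterion(X_demo, centers_demo, trim = 0.2)  # 0.25
```

With no trimming the single outlier dominates the criterion; trimming one point in five removes its influence entirely.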
data 
matrix or data.frame with raw data 
k 
integer. Number of clusters. 
trim 
numeric between 0 and 1. Proportion of points to be trimmed. 
scaling 
logical. If 
runs 
integer. Number of algorithm runs from initial means (randomly chosen from the data points). 
points 

countmode 
optional positive integer. Every 
printcrit 
logical. If 
maxit 
integer. Maximum number of iterations within an algorithm
run. Each iteration determines all points which
are closer to a different cluster center than the one to which they are
currently assigned. The algorithm terminates if no more points have
to be reassigned, or if 
x 
object of class 
... 
further arguments to be transferred to 
plot.tkm calls plotcluster if the dimensionality of the data p is 1, shows a scatterplot with non-trimmed regions if p=2, and shows discriminant coordinates computed from the clusters (ignoring the trimmed points) if p>2.
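The iteration described under maxit (reassign each point to its closest center until no assignment changes or the iteration limit is reached) can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the package's implementation: the function name `trimmed_kmeans_sketch`, the component names of its result, the coding of trimmed points as 0, and the rounding rule for the number of retained points are all choices made here for illustration.

```r
# Hypothetical sketch of a single trimmed k-means run (not the package code).
trimmed_kmeans_sketch <- function(X, k, trim = 0.1, maxit = 100) {
  X <- as.matrix(X)
  n <- nrow(X)
  n_keep <- ceiling((1 - trim) * n)   # assumption: rounding rule of this sketch
  # Initial means: k data points chosen at random.
  centers <- X[sample(n, k), , drop = FALSE]
  cluster <- rep(0L, n)
  for (it in seq_len(maxit)) {
    # Squared distances of all points to all current centers (n x k).
    d2 <- matrix(sapply(seq_len(k), function(j) {
      rowSums(sweep(X, 2, centers[j, ], "-")^2)
    }), nrow = n)
    nearest <- max.col(-d2, ties.method = "first")
    sqdist <- d2[cbind(seq_len(n), nearest)]
    # Keep the n_keep points closest to a center; trimmed points get code 0
    # (an arbitrary choice of this sketch).
    keep <- order(sqdist)[seq_len(n_keep)]
    new_cluster <- rep(0L, n)
    new_cluster[keep] <- nearest[keep]
    # Terminate when no point has to be reassigned.
    if (all(new_cluster == cluster)) break
    cluster <- new_cluster
    # Recompute each center from its non-trimmed members.
    for (j in seq_len(k)) {
      members <- cluster == j
      if (any(members)) centers[j, ] <- colMeans(X[members, , drop = FALSE])
    }
  }
  list(cluster = cluster, centers = centers, sqdist = sqdist)
}

set.seed(1)
X_sim <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
               matrix(rnorm(40, mean = 10), ncol = 2),
               c(100, 100), c(-100, 100))
fit <- trimmed_kmeans_sketch(X_sim, k = 2, trim = 0.05)
table(fit$cluster)
```

A single run like this can stop in a poor local optimum, for example when an outlier is drawn as an initial mean, which is presumably why the runs argument repeats the algorithm from several random starts.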
An object of class 'tkm', which is a list with components:
classification 
integer vector coding cluster membership with trimmed
observations coded as 
means 
numerical matrix giving the mean vectors of the k classes. 
disttom 
vector of squared Euclidean distances of all points to the closest mean. 
ropt 
maximum value of 
k 
see above. 
trim 
see above. 
runs 
see above. 
scaling 
see above. 
Christian Hennig chrish@stats.ucl.ac.uk http://www.homepages.ucl.ac.uk/~ucakche/
Cuesta-Albertos, J. A., Gordaliza, A., and Matran, C. (1997) Trimmed k-Means: An Attempt to Robustify Quantizers, Annals of Statistics, 25, 553-576.
library(trimcluster)

set.seed(10001)
n1 <- 60
n2 <- 60
n3 <- 70
n0 <- 10
nn <- n1+n2+n3+n0
pp <- 2
X <- matrix(rep(0,nn*pp),nrow=nn)
ii <- 0
# Three Gaussian clusters with distinct centers. The minus signs in the
# centers were lost in extraction and are restored here.
for (i in 1:n1){
  ii <- ii+1
  X[ii,] <- c(5,-5)+rnorm(2)
}
for (i in 1:n2){
  ii <- ii+1
  X[ii,] <- c(5,5)+rnorm(2)*0.75
}
for (i in 1:n3){
  ii <- ii+1
  X[ii,] <- c(-5,-5)+rnorm(2)*0.75
}
# Widely scattered noise points.
for (i in 1:n0){
  ii <- ii+1
  X[ii,] <- rnorm(2)*8
}
tkm1 <- trimkmeans(X,k=3,trim=0.1,runs=3)
# runs=3 is used to save computing time.
print(tkm1)
plot(tkm1,X)
