Probabilistic distance clustering (PD-clustering) is an iterative, distribution-free, probabilistic clustering method. PD-clustering assigns units to clusters according to their probability of membership, under the constraint that, for each point, the product of the membership probability and the distance to a cluster centre is the same for every cluster, a constant that depends only on the point.
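The constraint above can be written out as a short derivation. This is a sketch in notation introduced here for illustration, not taken from this help page: $d_k(x)$ is the distance from point $x$ to the centre of cluster $k$, $p_k(x)$ is the membership probability, $K$ is the number of clusters, and $F(x)$ is the point-dependent constant. All distances are assumed to be strictly positive.

$$
p_k(x)\, d_k(x) = F(x), \qquad \sum_{k=1}^{K} p_k(x) = 1
\quad\Longrightarrow\quad
p_k(x) = \frac{\prod_{j \neq k} d_j(x)}{\sum_{i=1}^{K} \prod_{j \neq i} d_j(x)} .
$$

Equivalently, $p_k(x)$ is proportional to $1 / d_k(x)$, so a point is more likely to belong to the cluster whose centre is closest.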


| Argument | Description |
|---|---|
| `data` | A matrix or data frame in which rows correspond to observations and columns correspond to variables. |
| `k` | A numeric value giving the number of clusters. |

A list with the following components:

| Component | Description |
|---|---|
| `label` | A vector of integers indicating the cluster membership of each unit. |
| `centers` | A matrix of cluster centers. |
| `probability` | A matrix giving the probability that each point belongs to each cluster. |
| `JDF` | The value of the joint distance function. |
| `iter` | The number of iterations. |
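To make the `probability` component more concrete, here is a minimal base-R sketch of how membership probabilities follow from point-to-centre distances under the constant-product constraint. It is not the package's implementation: the helper `pd_probabilities`, the toy points, the toy centres, and the choice of Euclidean distance are all assumptions made for illustration.

```r
# Hypothetical helper (not part of the package): membership probabilities
# from an n x k matrix of point-to-centre distances.
pd_probabilities <- function(d) {
  d <- as.matrix(d)
  inv <- 1 / d            # assumes every distance is strictly positive
  inv / rowSums(inv)      # p_ik = (1 / d_ik) / sum_j (1 / d_ij)
}

# Toy data (assumed): 3 points and 2 centres in the plane
x <- rbind(c(0, 0), c(1, 0), c(4, 0))
centers <- rbind(c(0.5, 0), c(4, 1))

# Euclidean distance from every point to every centre (n x k matrix)
d <- sapply(seq_len(nrow(centers)), function(k) {
  sqrt(rowSums(sweep(x, 2, centers[k, ])^2))
})

p <- pd_probabilities(d)
rowSums(p)   # each row of probabilities sums to 1
p * d        # within each row, p_ik * d_ik takes the same value for every k
```

The last line checks the constraint directly: the product of probability and distance is identical across clusters for a given point, while it may differ from point to point.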

Cristina Tortora and Paul D. McNicholas

Ben-Israel A, Iyigun C. Probabilistic D-Clustering. *Journal of Classification*, **25**(1), 5–26, 2008.

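```
# Assumption: PDclust is provided by the FPDclustering package
library(FPDclustering)

# Three normally generated clusters in four dimensions
c1 <- c(2, 2, 2, 2)
c2 <- c(-2, -2, -2, -2)
c3 <- c(-3, 3, -3, 3)
n <- 200  # observations per cluster

# Each column is drawn from a unit-variance normal centred at the cluster mean
x1 <- cbind(rnorm(n, c1[1]), rnorm(n, c1[2]), rnorm(n, c1[3]), rnorm(n, c1[4]))
x2 <- cbind(rnorm(n, c2[1]), rnorm(n, c2[2]), rnorm(n, c2[3]), rnorm(n, c2[4]))
x3 <- cbind(rnorm(n, c3[1]), rnorm(n, c3[2]), rnorm(n, c3[3]), rnorm(n, c3[4]))
x <- rbind(x1, x2, x3)

# Cluster into k = 3 groups
pdn <- PDclust(x, 3)

# Plot pairs of variables, coloured by the assigned cluster label
plot(x[, 1:2], col = pdn$label)
plot(x[, 3:4], col = pdn$label)
```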
