PDC: Probabilistic Distance Clustering

PDC R Documentation

Probabilistic Distance Clustering

Description

Probabilistic distance clustering (PD-clustering) is an iterative, distribution-free, probabilistic clustering method. It is based on the constraint that, for each point, the product of the probability of belonging to a cluster and the distance to that cluster's centre is the same for every cluster (the constant depends only on the point).
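As a minimal sketch of what this constraint implies (not the package's internal code), the membership probabilities of a point can be recovered from its distances to the centres: if the product of probability and distance is the same for every cluster and the probabilities sum to one, each probability is proportional to the inverse of the corresponding distance. The helper name pd_probabilities and the toy distance matrix d are hypothetical, introduced only for illustration.

```r
# Hypothetical helper (not part of FPDclustering): membership probabilities
# implied by the constraint p_k * d_k = constant for each point.
# d is an n x k matrix of strictly positive distances from each point (rows)
# to each cluster centre (columns).
pd_probabilities <- function(d) {
  inv <- 1 / d            # probabilities are proportional to 1 / distance
  inv / rowSums(inv)      # normalise so that each row sums to 1
}

# Toy distances for two points and two clusters (illustrative values only)
d <- matrix(c(1, 3,
              2, 2), nrow = 2, byrow = TRUE)
p <- pd_probabilities(d)

p        # membership probabilities, one row per point
p * d    # within each row, the product is the same for every cluster
```

A point lying exactly on a centre has distance zero, so this sketch assumes strictly positive distances; a real implementation has to treat that case separately.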

Usage

PDC(data = NULL, k = 2)

Arguments

data

A matrix or data frame such that rows correspond to observations and columns correspond to variables.

k

A number giving the number of clusters. The default is 2.

Value

A list of class FPDclustering with the following components:

label

A vector of integers indicating the cluster membership of each unit.

centers

A matrix of cluster centers.

probability

A matrix giving the probability that each point belongs to each cluster.

JDF

The value of the joint distance function (JDF).

iter

The number of iterations.

data

The data set.
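The components listed above can be read from the returned list with the usual $ operator. The following is a small sketch that assumes the FPDclustering package is installed; the simulated two-cluster data, the seed, and the object name fit are illustrative choices, not part of the package.

```r
library(FPDclustering)

# Simulated data: two well-separated groups of 50 points in two dimensions
set.seed(42)
x <- rbind(matrix(rnorm(100, mean =  2), ncol = 2),
           matrix(rnorm(100, mean = -2), ncol = 2))

fit <- PDC(x, k = 2)

table(fit$label)        # number of units assigned to each cluster
fit$centers             # estimated cluster centers
head(fit$probability)   # membership probabilities of the first units
fit$JDF                 # value of the joint distance function
fit$iter                # number of iterations performed
```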

Author(s)

Cristina Tortora and Paul D. McNicholas

References

Ben-Israel, A. and Iyigun, C. (2008). Probabilistic D-clustering. Journal of Classification, 25(1), 5-26.

Examples


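# Three normally distributed clusters in four dimensions
library(FPDclustering)
set.seed(1)  # arbitrary seed, so that the simulated data are reproducible
c1 <- c(2, 2, 2, 2)      # mean of cluster 1
c2 <- c(-2, -2, -2, -2)  # mean of cluster 2
c3 <- c(-3, 3, -3, 3)    # mean of cluster 3
n <- 200                 # observations per cluster
x1 <- cbind(rnorm(n, c1[1]), rnorm(n, c1[2]), rnorm(n, c1[3]), rnorm(n, c1[4]))
x2 <- cbind(rnorm(n, c2[1]), rnorm(n, c2[2]), rnorm(n, c2[3]), rnorm(n, c2[4]))
x3 <- cbind(rnorm(n, c3[1]), rnorm(n, c3[2]), rnorm(n, c3[3]), rnorm(n, c3[4]))
x <- rbind(x1, x2, x3)

# Clustering with k = 3
pdn <- PDC(x, 3)

# Results
plot(pdn)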


FPDclustering documentation built on Aug. 31, 2022, 5:09 p.m.