
This function performs ensemble hierarchical clustering of high-dimensional categorical data, where the number of features p far exceeds the number of observations n (p >> n).
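To make the idea of a single ensemble member concrete, the following base-R sketch clusters a small categorical matrix on a bootstrapped subset of its columns. It is an illustration under stated assumptions, not the package's implementation: the toy data, the helper `row_hamming`, the simple-matching dissimilarity, and the choice of three clusters are all hypothetical.

```r
set.seed(42)
n <- 8    # hypothetical number of observations
p <- 40   # hypothetical number of categorical features (p >> n)
x <- matrix(sample(c("A", "C", "G", "T"), n * p, replace = TRUE), n, p)

# Simple-matching dissimilarity between rows: the share of columns
# in which two observations take different categories.
row_hamming <- function(x) {
  d <- matrix(0, nrow(x), nrow(x))
  for (j in seq_len(ncol(x))) {
    d <- d + outer(x[, j], x[, j], FUN = "!=")
  }
  as.dist(d / ncol(x))
}

# Bootstrap the column indices and keep the distinct ones.
cols <- unique(sample.int(p, p, replace = TRUE))

# One ensemble member: average-linkage tree cut into 3 clusters.
tree <- hclust(row_hamming(x[, cols, drop = FALSE]), method = "average")
member <- cutree(tree, k = 3)
```

Repeating the last three statements with fresh column subsets and cluster counts would give the collection of clusterings that an ensemble method then combines.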


| Argument | Description |
| --- | --- |
| `data` | An n x p data matrix or data frame; n is the number of observations and p is the number of features (dimensions). |
| `En` | Number of clusterings to include in the ensemble, i.e., the cardinality of the ensemble. |
| `len` | Range of clustering sizes (i.e., numbers of clusters) to run and include in the ensemble. |
| `type` | Numeric indicator of a single bootstrap (`type=1`) or a double bootstrap (`type=2`) for selecting the subset of variables used in each clustering within the ensemble. The default is `type=2`. |
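The roles of `En`, `len`, and `type` can be sketched in base R as follows. This is a guess at the general mechanism, not the function's actual code: the helper `boot_once`, the value of `p`, and in particular the form of the double bootstrap (a second bootstrap drawn from the first subset) are assumptions made for illustration.

```r
set.seed(1)
p   <- 50         # hypothetical number of features
En  <- 100        # ensemble size
len <- c(2, 10)   # range of cluster counts

# One cluster count per ensemble member, drawn uniformly from len[1]..len[2].
k <- sample(len[1]:len[2], En, replace = TRUE)

# Bootstrap a vector of column indices and keep the distinct values.
# Indexing through sample.int() avoids the sample() pitfall for length-1 input.
boot_once <- function(idx) {
  unique(idx[sample.int(length(idx), length(idx), replace = TRUE)])
}

# Single bootstrap (as with type = 1): one resampling of the columns.
vars1 <- boot_once(seq_len(p))

# Double bootstrap (assumed form for type = 2): resample the first subset.
vars2 <- boot_once(vars1)
```

Under this assumed form, the double bootstrap yields a subset of the single-bootstrap columns, so each clustering would see fewer variables on average.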

Amiri, S., Clarke, B., and Clarke, J. (2015). Clustering categorical data via ensembling dissimilarity matrices. arXiv preprint arXiv:1506.07930.

```
data("rhabdodata")
### The following code generates the dissimilarity matrix of the sequence data stored in rhabdodata.
### The ensemble has 100 member clusterings, and the number of clusters in each clustering
### is generated randomly from a discrete uniform on (2,10). A double bootstrap procedure is
### used to select a subset of variables for each clustering.
ens <- enhcHi(rhabdodata$dat, En = 100, len = c(2, 10), type = 2)
### Calculate the Hamming distance
dis0 <- hammingD(ens)
### Save in distance ("dist") format
REDIST <- as.dist(dis0)
hc0 <- hclust(REDIST, method = "average")
plot(hc0, labels = rhabdodata$lab, hang = -1)
```
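The Hamming-distance step in the example can be illustrated with a small base-R sketch. It assumes the ensemble is stored as a matrix of cluster labels with one column per clustering, and it normalizes by the number of clusterings; both are assumptions, and `hamming_sketch` is a hypothetical helper, so the result need not match what `hammingD` returns.

```r
# Hypothetical ensemble: 5 observations (rows) x 3 clusterings (columns).
labels <- cbind(c(1, 1, 2, 2, 3),
                c(1, 1, 1, 2, 2),
                c(1, 2, 2, 3, 3))

# Share of clusterings in which observations i and j fall in different clusters.
hamming_sketch <- function(lab) {
  d <- matrix(0, nrow(lab), nrow(lab))
  for (m in seq_len(ncol(lab))) {
    d <- d + outer(lab[, m], lab[, m], FUN = "!=")
  }
  d / ncol(lab)
}

d <- hamming_sketch(labels)

# The combined dissimilarity can be fed to hclust() as in the example.
hc <- hclust(as.dist(d), method = "average")
```

Observations 1 and 2 are separated in only one of the three clusterings, so their dissimilarity is 1/3, while observations 1 and 5 are separated in all three and get the maximum value of 1.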
