UnsupRF

Unsupervised Random Forest Clustering

Cluster data using randomForest proximities. A random forest classifier is trained to distinguish the true data, labeled as class "True.Data", from synthetic data, labeled as class "Synthetic.Data". The synthetic data are generated either by random sampling from the empirical distribution of the true data or by permuting the true data. The proximities between observations in the true data are converted to a dissimilarity matrix, which can be used by any clustering algorithm that accepts a dissimilarity matrix. Several routines for cluster validation and for determining the optimal number of clusters are also implemented.
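The idea can be sketched directly with the CRAN randomForest package. This is a minimal illustration, not UnsupRF's own interface: the `iris` example data, the variable names, the `1 - proximity` transformation, and the choice of average-linkage hierarchical clustering with three clusters are all assumptions made for the example.

```r
library(randomForest)  # assumes the CRAN randomForest package is installed

set.seed(1)
x <- iris[, 1:4]  # example data (assumption); any numeric data frame would do
n <- nrow(x)

# Synthetic data: sample each column independently, with replacement,
# from its empirical distribution. This breaks the dependence between columns.
synth <- as.data.frame(lapply(x, function(col) sample(col, length(col), replace = TRUE)))

# Stack true and synthetic rows and label them with the two classes.
dat <- rbind(x, synth)
y <- factor(rep(c("True.Data", "Synthetic.Data"), each = n))

# Train the classifier and keep the proximity matrix.
rf <- randomForest(x = dat, y = y, ntree = 500, proximity = TRUE)

# Keep only the proximities among the true observations (the first n rows).
prox <- rf$proximity[seq_len(n), seq_len(n)]

# Convert proximities to dissimilarities (1 - proximity is an assumed choice).
d <- as.dist(1 - prox)

# Any method that accepts a dissimilarity matrix can be used from here on.
hc <- hclust(d, method = "average")
clusters <- cutree(hc, k = 3)
table(clusters)
```

Replacing `sample(col, length(col), replace = TRUE)` with `sample(col)` gives the permutation variant of the synthetic data.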

Get Started

Install from GitHub via devtools:

devtools::install_github("nguforche/UnsupRF")



nguforche/UnsupRF documentation built on May 5, 2019, 4:51 p.m.