mixdir: Cluster High Dimensional Categorical Datasets

Scalable Bayesian clustering of categorical datasets. The package implements a hierarchical Dirichlet (Process) mixture of multinomial distributions. It is thus a probabilistic latent class model (LCM) and can be used to reduce the dimensionality of hierarchical data and cluster individuals into latent classes. It can automatically infer an appropriate number of latent classes or find k classes, as defined by the user. The model is based on a paper by Dunson and Xing (2009) <doi:10.1198/jasa.2009.tm08439>, but implements a scalable variational inference algorithm so that it is applicable to large datasets. It is described and tested in the accompanying paper by Ahlmann-Eltze and Yau (2018) <doi:10.1109/DSAA.2018.00068>.

Getting started
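
As a starting point, the two modes mentioned in the description (a fixed number of classes versus automatic selection) can be sketched as follows. This is a minimal sketch, not taken from this page: it assumes the package's main function is `mixdir()` with the arguments `n_latent` and `select_latent`, and that the result carries a `pred_class` element, as described in the package's reference manual. The toy data frame and its column names are hypothetical.

```r
library(mixdir)

set.seed(1)

# Hypothetical toy data: 200 individuals from two hidden groups,
# each described by three categorical features.
n <- 200
group <- sample(c(1, 2), n, replace = TRUE)
X <- data.frame(
  color = ifelse(group == 1,
                 sample(c("red", "blue"), n, replace = TRUE, prob = c(0.9, 0.1)),
                 sample(c("red", "blue"), n, replace = TRUE, prob = c(0.1, 0.9))),
  shape = ifelse(group == 1,
                 sample(c("round", "flat"), n, replace = TRUE, prob = c(0.8, 0.2)),
                 sample(c("round", "flat"), n, replace = TRUE, prob = c(0.2, 0.8))),
  size  = sample(c("small", "large"), n, replace = TRUE),
  stringsAsFactors = TRUE
)

# Fit a user-specified number of latent classes (k = 2).
res_k <- mixdir(X, n_latent = 2)

# Assumed interface: with select_latent = TRUE, n_latent acts as an upper
# bound and the model decides how many classes are actually used.
res_auto <- mixdir(X, n_latent = 10, select_latent = TRUE)

# Most likely class for each individual (assumed result element).
table(res_k$pred_class)
```

Check `?mixdir` for the exact argument names and return values before relying on this.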

Package details

Author: Constantin Ahlmann-Eltze [aut, cre] (<https://orcid.org/0000-0002-3762-068X>), Christopher Yau [ths] (<https://orcid.org/0000-0001-7615-8523>)
Maintainer: Constantin Ahlmann-Eltze <artjom31415@googlemail.com>
URL: https://github.com/const-ae/mixdir
Package repository: CRAN
Installation: Install the latest version of this package by entering the following in R:
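
The command itself is missing from this page. Since the package is hosted on CRAN, the standard installation call should apply:

```r
# Install the released version of mixdir from CRAN
install.packages("mixdir")
```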


mixdir documentation built on Sept. 20, 2019, 5:04 p.m.