bigdpclust

bigdpclust clusters tall data (many observations relative to the number of variables) using a Bayesian nonparametric Dirichlet process mixture of Gaussians.

Installation

You can install the development version of bigdpclust from GitHub with:

# install.packages("devtools")
devtools::install_github("borishejblum/bigdpclust")

bigdpclust depends on the weightedobs branch of the NPflow package, which can be installed with:

devtools::install_github(repo = "borishejblum/NPflow", ref = "weightedobs")

Example

library(ggplot2)
library(bigdpclust)

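# Simulate two well-separated 2D Gaussian clusters of very unequal size
n1 <- 100000  # large cluster centered at (0, 0)
n2 <- 100     # small cluster centered at (10, 10)
mydata <- rbind(cbind(rnorm(n1), rnorm(n = n1)),
                cbind(rnorm(n2, mean = 10), rnorm(n = n2, mean = 10)))
plot(mydata)

res <- bigdpclust(mydata, nclumps = 100,
                  Nmcmc = 1000, plotevery = 2000, burnin = 500)
# Cluster labels assigned to the n1 points of the large cluster
table(res$cluster[1:n1])
# Cluster labels assigned to the n2 points of the small cluster
table(res$cluster[(n1 + 1):(n1 + n2)])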
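
The two table() calls above can also be combined into a single cross-tabulation of the simulated group against the estimated cluster label. The sketch below uses base R only and does not call bigdpclust: the labels vector is a hypothetical stand-in for res$cluster (with a few deliberately mislabelled points), and truth and confusion are illustrative names, not part of the package.

```r
# Smaller sizes than in the example above, to keep the sketch fast
n1 <- 1000  # size of the large simulated group
n2 <- 10    # size of the small simulated group

# Known simulation group for each row, in the same row order as the data
truth <- rep(c("bulk", "rare"), times = c(n1, n2))

# Hypothetical stand-in for res$cluster: one integer label per observation
labels <- ifelse(truth == "bulk", 1L, 2L)

# Mislabel 5 points of the large group to mimic imperfect clustering
set.seed(1)
flip <- sample(n1, 5)
labels[flip] <- 2L

# Rows are the simulated groups, columns are the estimated clusters
confusion <- table(truth = truth, cluster = labels)
print(confusion)

# Share of the small group that ends up in its most frequent cluster
rare_recovered <- max(confusion["rare", ]) / n2
```

With real output, replacing labels by res$cluster (and using the n1 and n2 from the simulation) would show at a glance whether the small group is recovered as a cluster of its own or absorbed into the large one.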

-- Boris Hejblum & Paul Kirk



borishejblum/bigdpclust documentation built on Dec. 18, 2019, 3:39 a.m.