Load the data

library(camcos2017)
library(magrittr)

data(newsgroups)
sub0 <- news_subset(newsgroups, filter = c(2,8,12), binary = TRUE, vocab = FALSE)
sub0.data <- sub0[[1]]
sub0.labels <- sub0[[2]]

Layout of the subset data:

pander::pandoc.table(sub0.data[1:10, ])

Run the clustering

x <- colweights(data = sub0.data[,1:3], binary = TRUE, weightfunction = "IDF", sparse = TRUE) %>%
  similarity("correlation", sparse = TRUE) %>%
  clustering("DiffusionMap", k = 3, t = 0.5) %>%
  clustercheck(sub0.labels, k = 3)
x


CAMCOS/camcos2017 documentation built on May 6, 2019, 9:23 a.m.