mmpca_clust_modelselect: Model selection for MMPCA


View source: R/mmpca_clust_modelselection.R

Description

A wrapper around mmpca_clust() that performs model selection using the Integrated Classification Likelihood (ICL) criterion.

Usage

mmpca_clust_modelselect(
  dtm,
  Qs,
  Ks,
  Yinit = "random",
  method = "BBCVEM",
  init.beta = "lda",
  keep = 1L,
  max.epochs = 10L,
  verbose = 1L,
  nruns = 5L,
  mc.cores = (detectCores() - 1)
)

Arguments

dtm

An N x V DocumentTermMatrix with term-frequency weighting.

Qs

The vector of numbers of clusters to be tested.

Ks

The vector of numbers of topics to be tested.
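Taken together, Qs and Ks describe a grid of candidate models. The following is a minimal base R sketch of how such a grid can be enumerated; it assumes every (K, Q) combination is a candidate, as the description of nruns suggests, and the name `grid` is illustrative rather than part of the package.

```r
# Candidate numbers of clusters and topics (same values as in the Examples section)
Qs <- 5:6
Ks <- 3:4

# Enumerate every (Q, K) pair with base R; `grid` is an illustrative name,
# not an object returned by the package
grid <- expand.grid(Q = Qs, K = Ks)
print(grid)

# Number of candidate models: length(Qs) * length(Ks)
nrow(grid)
```

With nruns runs per pair, the total number of fits would be nrow(grid) * nruns under this assumption.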

Yinit

Parameter for the initialization of Y. It can be either:

  • a string or a function specifying the initialization procedure. If a string, it should be one of ('random', 'kmeans_lda'). See benchmarks-functions for more details.

  • (Only when Qs is a singleton) A vector of length N with Q modalities, specifying the initialization clustering.
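For the second form of Yinit, a hand-built clustering vector might look like the sketch below. It assumes the Q modalities can be encoded as the integers 1 to Q, which this page does not state explicitly; the values of N and Q are illustrative.

```r
set.seed(1)   # for reproducibility of this illustration only
N <- 100L     # number of documents (rows of dtm); illustrative value
Q <- 5L       # single candidate number of clusters, i.e. Qs = 5

# Repeat the labels 1..Q up to length N so that every cluster appears,
# then shuffle them to obtain a random balanced partition
Yinit <- sample(rep_len(seq_len(Q), N))

length(Yinit) == N            # one label per document
length(unique(Yinit)) == Q    # exactly Q distinct modalities
```

Building the labels with rep_len() before shuffling guarantees that no cluster starts empty, which a plain sample(Q, N, replace = TRUE) would not.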

method

The clustering algorithm to be used. Only "BBCVEM" is available; it corresponds to the branch and bound C-VEM algorithm of the original article.

init.beta

Parameter for the initialization of the matrix beta. It can be either:

  • a string specifying the initialization procedure. It should be one of ('random', 'lda'). See initializeBeta() for more details.

  • (Only when Ks is a singleton) A KxV matrix with each row summing to 1.
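For the second form of init.beta, one way to obtain a valid K x V matrix is to normalise the rows of a positive random matrix. This is only a sketch in base R: the dimensions are illustrative, and the exponential draws are an arbitrary choice, not the package's own initialization.

```r
set.seed(1)   # for reproducibility of this illustration only
K <- 3L       # single candidate number of topics, i.e. Ks = 3
V <- 50L      # vocabulary size (number of columns of dtm); illustrative value

# Draw strictly positive entries, then divide each row by its sum
raw  <- matrix(rexp(K * V), nrow = K, ncol = V)
beta <- raw / rowSums(raw)

dim(beta)                              # K rows, V columns
all(abs(rowSums(beta) - 1) < 1e-12)    # each row sums to 1 up to rounding
```

The division works row-wise because rowSums(raw) has length K and R recycles it down the columns of the matrix.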

keep

The value of the bound is recorded once every keep iterations.

max.epochs

Specifies the maximum number of passes allowed over the whole dataset.

verbose

Verbosity level.

nruns

Number of runs of the algorithm for each (K, Q) pair (defaults to 5 in the signature above); the run achieving the best evidence lower bound is selected.

mc.cores

The number of CPUs to use when fitting the different models in parallel. Default is the number of available cores minus 1.
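Note that parallel::detectCores() can return NA on some platforms, and the default expression evaluates to 0 on a single-core machine. A defensive way to compute a value to pass as mc.cores might look like this; the variable names are illustrative.

```r
library(parallel)  # ships with base R

n_detected <- detectCores()  # may be NA when detection fails

# Fall back to a single core when detection fails or only one core exists
mc_cores <- if (is.na(n_detected)) 1L else max(1L, n_detected - 1L)
mc_cores
```

The result can then be supplied explicitly, for example mc.cores = mc_cores, instead of relying on the default.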

Value

Examples

## generate data with the BBCmsg
simu = simulate_BBC(N = 100, L = 250)
## Define a grid
Qs = 5:6
Ks = 3:4
## Run model selection with MoMPCA
res <- mmpca_clust_modelselect(simu$dtm.full, Qs = Qs, Ks = Ks,
                               Yinit = 'kmeans_lda',
                               init.beta = 'lda',
                               method = 'BBCVEM',
                               max.epochs = 7,
                               nruns = 2,
                               verbose = 1,
                               mc.cores = 2)

nicolasJouvin/MMPCA documentation built on Jan. 23, 2021, 3 a.m.