nmfem_mult: NMF-EM algorithm for mixture of multinomials


View source: R/nmfem_mult.R

Description

Runs the NMF-EM algorithm on a dataset drawn from a mixture of multinomials. Compared with the classical EM algorithm, fewer parameters have to be estimated. For details, see the pre-print by Carel and Alquier (2017) <arXiv:1709.03346>.
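
To make the parameter saving concrete, the sketch below counts free parameters for both models. It assumes (this is not stated on this page) that every column of Theta and of Lambda is a probability vector, so each column loses one degree of freedom. M = 100 is a hypothetical number of variables, and both helper functions are illustrative only; they are not part of the package.

```r
# Free parameters of a classical mixture of K multinomials over M categories:
# K profiles with M - 1 free entries each, plus K - 1 mixing proportions.
n_params_em <- function(M, K) K * (M - 1) + (K - 1)

# Free parameters under the factorisation Theta (M x H) %*% Lambda (H x K),
# ASSUMING every column of Theta and of Lambda sums to 1.
n_params_nmfem <- function(M, H, K) H * (M - 1) + K * (H - 1) + (K - 1)

M <- 100  # hypothetical number of variables
c(em = n_params_em(M, K = 7), nmfem = n_params_nmfem(M, H = 4, K = 7))
```

Under these assumptions the saving grows with M whenever H is smaller than K, since the M-dependent term scales with H instead of K.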

Usage

nmfem_mult(
  X,
  H,
  K,
  path = NULL,
  eps_init = 0.001,
  eps_M = 1e-08,
  eps_llh = 1e-05
)

Arguments

X

a matrix of multinomial observations, of dimension N (number of observations) x M (number of variables).

H

number of words, i.e. the number of components (columns) in the dictionary Theta.

K

number of clusters.

path

path to the directory in which to save the initialization, or from which to load it. NULL by default, in which case the initialization is neither saved nor loaded.

eps_init

convergence criterion on the initialization. Default value is 1e-3.

eps_M

convergence criterion on the Maximization step. Default value is 1e-8.

eps_llh

convergence criterion on the log-likelihood. Default value is 1e-5.
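
As an illustration of the expected shape of X, the following sketch simulates count data from a small mixture of multinomials using base R only. All sizes, cluster profiles and mixing proportions are hypothetical values chosen for the example.

```r
set.seed(1)
N <- 200; M <- 6; K <- 2   # hypothetical sizes

# One probability vector over the M variables per cluster (columns sum to 1).
profiles <- cbind(c(0.40, 0.30, 0.10, 0.10, 0.05, 0.05),
                  c(0.05, 0.05, 0.10, 0.10, 0.30, 0.40))
z     <- sample(K, N, replace = TRUE, prob = c(0.6, 0.4))  # latent cluster labels
sizes <- sample(20:50, N, replace = TRUE)                   # total count per row

# Draw one multinomial count vector per observation and stack them as rows.
X <- t(vapply(seq_len(N),
              function(i) rmultinom(1, size = sizes[i], prob = profiles[, z[i]])[, 1],
              numeric(M)))
dim(X)  # N rows (observations) by M columns (variables)
```

A count matrix of this shape, one row per observation and one column per variable, is what the X argument describes.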

Value

A list with the elements:

Theta

matrix of dimension M x H. Contains a dictionary of redundant components.

Lambda

matrix of dimension H x K. Contains the expression of the K clusters in the dictionary.

llh

log-likelihood of the model.

p

vector containing the proportions of each cluster.

posterior

matrix containing, for each observation, the posterior probability of belonging to each cluster.
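
The returned elements can be combined as sketched below. The objects here are small mock matrices with the documented shapes, not real output of nmfem_mult, and the sketch assumes that posterior has one row per observation and one column per cluster (the orientation is not stated on this page).

```r
# Mock outputs: M = 3 variables, H = 2 words, K = 2 clusters, N = 3 observations.
Theta     <- cbind(c(0.7, 0.2, 0.1), c(0.1, 0.3, 0.6))             # M x H
Lambda    <- cbind(c(0.9, 0.1), c(0.2, 0.8))                       # H x K
posterior <- rbind(c(0.95, 0.05), c(0.30, 0.70), c(0.10, 0.90))    # assumed N x K

# Cluster profiles over the original variables: one column per cluster.
profiles <- Theta %*% Lambda   # M x K
colSums(profiles)              # sums to 1 per column for these mock inputs

# Hard assignment: index of the most probable cluster for each observation.
hard <- max.col(posterior, ties.method = "first")
table(hard)
```

The product Theta %*% Lambda is the same quantity whose first column is displayed as the first cluster profile in the examples.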

Examples

# Example on a data sample
x <- dplyr::sample_n(travelers[,-1],900)
out <- nmfem_mult(x, H = 4, K = 7)
# Display first cluster profile
display_profile(t((out$Theta %*% out$Lambda)[ ,1]))
# Display first word profile
display_profile(t(out$Theta[ ,1]), color = "Greens")

# Example on the complete data - it needs a few minutes to run
## Not run: 
nmfem_travelers <- nmfem_mult(travelers[ ,-1], H = 5, K = 10)
Theta <- nmfem_travelers$Theta
Lambda <- nmfem_travelers$Lambda

# Display first cluster profile
display_profile(t((Theta %*% Lambda)[ ,1]))

# Display first word profile
display_profile(t(Theta[ ,1]), color = "Greens")
## End(Not run)

nmfem documentation built on March 26, 2020, 7:42 p.m.