fast_E_M: EM clustering
In svs: Tools for Semantic Vector Spaces

Description

A fast procedure for Expectation-Maximization clustering.

Usage

fast_E_M(dat, k, tol = 1e-08)

fast_EM(dat, k, tol = 1e-08)

Arguments

`dat`	Input data: either a table or a data frame. A data frame must have exactly two columns.
`k`	Numeric specification of the number of latent classes to compute.
`tol`	Numeric specification of the convergence criterion (default `1e-08`).
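As a rough illustration of the two input forms accepted by `dat`, the base R sketch below builds a two-column data frame of observations and the corresponding frequency table. The data and the names `obs`, `word`, `context` and `freq` are invented for this example, and whether `fast_E_M` treats the two forms identically is an assumption rather than something stated here.

```r
# Hypothetical toy data: one row per observation, exactly two columns.
obs <- data.frame(
  word    = c("a", "a", "b", "b", "b", "c"),
  context = c("x", "y", "x", "x", "y", "y"),
  stringsAsFactors = FALSE
)

# Cross-tabulating the two columns gives a frequency table:
# rows are the levels of the first column, columns those of the second.
freq <- table(obs)
freq

# Either object could then be passed as `dat` (not run here):
# fast_E_M(obs, k = 2)
# fast_E_M(freq, k = 2)
```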

Details

This function assumes that the rows of a frequency table come from a multinomial distribution. The prior probabilities of the latent classes are initialized by drawing from a Dirichlet distribution (using `rdirichlet` from the package gtools), with alpha set to the total frequency counts of every level. Because this initialization is random, results can differ between runs unless the random seed is fixed with `set.seed`.
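To make the idea concrete, here is a minimal base R sketch of EM for a latent class model on a two-way frequency table. It is not the svs implementation: the function name `em_sketch`, the `max_iter` argument, the uniform random starting values (instead of `rdirichlet`) and the log-likelihood stopping rule are all assumptions made for illustration.

```r
# Illustrative EM for a two-way latent class model; NOT the svs source code.
em_sketch <- function(tab, k, tol = 1e-08, max_iter = 1000) {
  N  <- as.matrix(tab)
  nr <- nrow(N)
  nc <- ncol(N)
  # Starting values: equal class weights, normalized uniform draws
  # (an assumption of this sketch; svs is documented to use a Dirichlet draw).
  p0 <- rep(1 / k, k)
  p1 <- matrix(runif(k * nr), k, nr); p1 <- p1 / rowSums(p1)
  p2 <- matrix(runif(k * nc), k, nc); p2 <- p2 / rowSums(p2)
  loglik_old <- -Inf
  for (iter in seq_len(max_iter)) {
    # E-step: joint probability of class, row level and column level
    joint <- array(0, dim = c(k, nr, nc))
    for (cl in seq_len(k)) joint[cl, , ] <- p0[cl] * outer(p1[cl, ], p2[cl, ])
    mix <- apply(joint, c(2, 3), sum)          # fitted cell probabilities
    loglik <- sum(N[N > 0] * log(mix[N > 0]))
    # M-step: re-estimate parameters from expected counts per class
    for (cl in seq_len(k)) {
      w <- N * joint[cl, , ] / mix             # expected counts for class cl
      w[N == 0] <- 0                           # guard against 0/0 in empty cells
      p0[cl]   <- sum(w) / sum(N)
      p1[cl, ] <- rowSums(w) / sum(w)
      p2[cl, ] <- colSums(w) / sum(w)
    }
    # Stop once the log-likelihood no longer changes by more than tol
    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik
  }
  list(prob0 = p0, prob1 = p1, prob2 = p2, loglik = loglik, iterations = iter)
}

# Usage on a small invented table
set.seed(1)
toy <- matrix(c(10, 2, 1, 3, 12, 2, 1, 2, 9), nrow = 3)
fit <- em_sketch(toy, k = 2)
fit$prob0
```

The sketch returns components named like those of `fast_E_M` only to ease comparison; the extra `loglik` and `iterations` entries are additions of this example.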

Value

A list with components:

`prob0`	The probabilities of the latent classes.
`prob1`	The probabilities for the first set of levels (viz. the row levels of a frequency table). The rows of `prob1` sum to 1.
`prob2`	The probabilities for the second set of levels (viz. the column levels of a frequency table). The rows of `prob2` sum to 1.
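One possible way to use these components is to turn them into class memberships for the row levels via Bayes' rule. The sketch below works on a hand-made list with the same component names; the numbers are invented, and it assumes that each row of `prob1` corresponds to one latent class (which is consistent with the rows summing to 1, but not confirmed here).

```r
# Hypothetical result with k = 2 classes, 3 row levels and 2 column levels.
res <- list(
  prob0 = c(0.6, 0.4),
  prob1 = rbind(c(0.7, 0.2, 0.1),
                c(0.1, 0.2, 0.7)),
  prob2 = rbind(c(0.5, 0.5),
                c(0.9, 0.1))
)

# Joint weight of class and row level: prob0[class] * prob1[class, level].
# The vector prob0 is recycled down the rows, so row `class` is scaled by prob0[class].
joint <- res$prob0 * res$prob1

# Posterior probability of each class given a row level (one row per level).
posterior <- t(joint) / colSums(joint)

# Hard assignment: the most probable class for each row level.
hard <- apply(posterior, 1, which.max)
posterior
hard
```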

References

Dempster, A. P., N. M. Laird and D. B. Rubin (1977) Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B 39 (1), 1–38.

Examples

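# Read the example data set shipped with the svs package
SndT_Fra <- read.table(system.file("extdata", "SndT_Fra.txt", package = "svs"),
   header = TRUE, sep = "\t", quote = "\"", encoding = "UTF-8",
   stringsAsFactors = FALSE)
# Optional: fix the random seed, since the initialization is random
set.seed(123)
# Fit an EM clustering with 7 latent classes
E_M_SndT_Fra <- fast_E_M(SndT_Fra, k = 7)
# Print the list with components prob0, prob1 and prob2
E_M_SndT_Fra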

svs documentation built on June 24, 2024, 5:07 p.m.