multiEM: EM algorithm
In activetext/activeR: a semi-supervised active learning algorithm for text classification.

multiEM

R Documentation

EM algorithm

Description

Uses the EM algorithm to maximize the marginal posterior, i.e. the probability of the parameters given both the labeled and unlabeled documents and the labels of the labeled documents.
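To make the idea concrete, here is a minimal base-R sketch of EM for a semi-supervised multinomial naive Bayes model. It is an illustration under stated assumptions, not the code multiEM runs: `toy_em` and its helpers are hypothetical names, `lambda` is treated as a single weight applied to every unlabeled document, and add-one smoothing stands in for the prior.

```r
# Toy semi-supervised multinomial naive Bayes fitted by EM (base R only).
# Hypothetical illustration; it does not reproduce multiEM's internals.
toy_em <- function(D_lab, y_lab, D_unl, n_class = 2, lambda = 0.1,
                   max_iter = 50, tol = 1e-6) {
  # Labeled documents keep fixed one-hot class memberships.
  R_lab <- diag(n_class)[y_lab, , drop = FALSE]

  # M step: add-one smoothed class and word probabilities, on the log scale.
  m_step <- function(D, R, w) {
    class_count <- colSums(w * R) + 1
    word_count  <- t(D) %*% (w * R) + 1            # words x classes
    list(log_pi  = log(class_count / sum(class_count)),
         log_eta = log(sweep(word_count, 2, colSums(word_count), "/")))
  }
  # Log joint score of every document under every class (docs x classes).
  score <- function(D, p) sweep(D %*% p$log_eta, 2, p$log_pi, "+")
  # Row-wise log-sum-exp.
  lse <- function(S) { m <- apply(S, 1, max); m + log(rowSums(exp(S - m))) }

  # Start from the labeled documents only.
  par   <- m_step(D_lab, R_lab, rep(1, nrow(D_lab)))
  D_all <- rbind(D_lab, D_unl)
  w_all <- c(rep(1, nrow(D_lab)), rep(lambda, nrow(D_unl)))
  trace <- numeric(0)

  for (iter in seq_len(max_iter)) {
    # E step: class membership probabilities of the unlabeled documents.
    S_unl <- score(D_unl, par)
    R_unl <- exp(S_unl - lse(S_unl))
    # Objective: labeled log likelihood + weighted unlabeled marginal log
    # likelihood + log prior implied by the add-one smoothing.
    obj <- sum(score(D_lab, par) * R_lab) + lambda * sum(lse(S_unl)) +
      sum(par$log_pi) + sum(par$log_eta)
    trace <- c(trace, obj)
    if (iter > 1 && abs(obj - trace[iter - 1]) < tol) break
    par <- m_step(D_all, rbind(R_lab, R_unl), w_all)
  }

  # Class probabilities of the unlabeled documents under the final parameters.
  S_unl <- score(D_unl, par)
  list(log_pi = par$log_pi, log_eta = par$log_eta, objective = trace,
       unlabeled_prob = exp(S_unl - lse(S_unl)))
}

# Made-up counts: 2 labeled and 3 unlabeled documents over 3 words.
D_lab <- matrix(c(3, 0, 1,
                  0, 4, 1), nrow = 2, byrow = TRUE)
D_unl <- matrix(c(2, 0, 0,
                  0, 3, 1,
                  4, 1, 0), nrow = 3, byrow = TRUE)
fit <- toy_em(D_lab, y_lab = c(1, 2), D_unl)
round(fit$unlabeled_prob, 2)
```

In this sketch the labeled documents contribute fixed class memberships, while the unlabeled ones contribute soft memberships that are re-estimated at every E step and down-weighted by `lambda` in the M step.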

Usage

multiEM(
  .D_train = NULL,
  .C_train = NULL,
  .D_test,
  .D_test_out = NULL,
  .n_class = 2,
  .n_cluster = 2,
  .lambda = 0.1,
  .max_iter = 100,
  .alpha = 0.1,
  .labeled_docs = NULL,
  .counter_on = T,
  .active_iter = NULL,
  .maxactive_iter = NULL,
  .fixed_words = NULL,
  .export_all = F,
  .supervise = T,
  .choose_NB_init = FALSE,
  .prev_word_prob = NULL,
  .prev_class_prob = NULL,
  .prev_mu = NA,
  .prev_psi = NA,
  .beta = NULL,
  .binary_metadata_varnames = NA,
  .cont_metadata_varnames = NA
)

Arguments

`.D_train`	document term matrix of the labeled documents
`.C_train`	vector of class labels for the labeled documents
`.D_test`	document term matrix of the unlabeled documents
`.D_test_out`	document term matrix for out of sample validation
`.n_class`	number of classes
`.n_cluster`	number of clusters
`.lambda`	vector of document weights
`.max_iter`	maximum number of iterations of the EM algorithm
`.alpha`	convergence threshold. Iteration stops once the increase in the maximand falls below `.alpha`.
`.labeled_docs`	Optional vector of index values for labeled documents. Used if '.choose_NB_init == FALSE'
`.counter_on`	boolean object. If `.counter_on == T`, displays the progress of the EM algorithm.
`.active_iter`	integer value that tells the EM algorithm which iteration of the active loop it is in.
`.maxactive_iter`	integer value that tells the EM algorithm the maximum allowed active iterations.
`.fixed_words`	matrix of fixed words with class probabilities, where ncol is the number of classes.
`.export_all`	If T, model parameters from each iteration of the EM algorithm are returned. If F, only model results from the last iteration are returned.
`.supervise`	T if supervised; F if unsupervised.
`.choose_NB_init`	boolean object. If true, EM starts with a Naive Bayes step. If false (the default shown in Usage), and if an appropriate '.C_train' is provided, the initial M step is performed with document class probabilities from both labeled and unlabeled documents, weighted by the chosen '.lambda' value.
`.binary_metadata_varnames`	vector of strings indicating variable names of binary metadata
`.cont_metadata_varnames`	vector of strings indicating variable names of continuous metadata
`.lazy_eval`	boolean object. If `.lazy_eval == T`, convergence is measured by comparing changes in log likelihood across model iterations rather than by directly computing the maximand.
`.class_prob`	required if .supervise == T. Starting value of class probability (logged)
`.word_prob`	required if .supervise == T. Starting value of word probability (logged)
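
The stopping rule described for `.max_iter` and `.alpha` can be sketched in base R as follows. This is a hypothetical helper, not an activeR function: `run_until_converged`, `step`, and `toy_step` are made-up names, and the toy maximand sequence exists only to show when the loop stops.

```r
# Hypothetical helper, not part of activeR: runs `step` until the maximand
# improves by less than `alpha` or `max_iter` iterations have been done.
run_until_converged <- function(step, state, max_iter = 100, alpha = 0.1) {
  maximands <- numeric(0)
  for (iter in seq_len(max_iter)) {
    state <- step(state)
    maximands <- c(maximands, state$maximand)
    # Stop once the gain over the previous iteration drops below alpha.
    if (iter > 1 && (maximands[iter] - maximands[iter - 1]) < alpha) break
  }
  list(state = state, maximands = maximands, n_iter = length(maximands))
}

# Toy step: the maximand approaches 0 from below, halving the gap each time.
toy_step <- function(state) list(maximand = state$maximand / 2)
fit <- run_until_converged(toy_step, list(maximand = -8), alpha = 0.1)
fit$maximands
```

A larger `alpha` stops earlier at the cost of a less refined solution; `max_iter` only acts as a safety cap when the gains stay above the threshold.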

Details

The inputs must conform to the following specifications:

D_train: a matrix with dimensions (number of labeled documents) x (number of unique words).
D_test: a matrix with dimensions (number of unlabeled documents) x (number of unique words).
C_train: a vector of labels for the labeled documents. Its length must equal the number of rows of D_train.

D_train and D_test must have the same number of columns. Their elements are integers: the number of times each unique word appears in each document.
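
These constraints can be checked before calling the function. The sketch below uses a hypothetical helper, `check_em_inputs`, which is not part of activeR, and it assumes the label vector is meant to have one entry per row of the labeled matrix.

```r
# Hypothetical helper (not part of activeR): checks the input constraints.
check_em_inputs <- function(D_train, C_train, D_test) {
  stopifnot(is.matrix(D_train), is.matrix(D_test))
  # Both matrices must share the same vocabulary (same number of columns).
  if (ncol(D_train) != ncol(D_test)) {
    stop("D_train and D_test must have the same number of columns")
  }
  # Entries are word counts: non-negative whole numbers.
  is_count <- function(m) all(m >= 0 & m == round(m))
  if (!is_count(D_train) || !is_count(D_test)) {
    stop("document term matrices must contain non-negative integer counts")
  }
  # Assumption: one label per labeled document.
  if (length(C_train) != nrow(D_train)) {
    stop("length(C_train) must equal nrow(D_train)")
  }
  invisible(TRUE)
}

# Made-up example: 2 labeled and 3 unlabeled documents over 3 unique words.
D_train <- matrix(c(2, 0, 1,
                    0, 3, 1), nrow = 2, byrow = TRUE)
D_test  <- matrix(c(1, 1, 0,
                    0, 2, 2,
                    3, 0, 0), nrow = 3, byrow = TRUE)
C_train <- c(1, 0)
check_em_inputs(D_train, C_train, D_test)
```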

Value

maximands: a vector with the log maximand at each iteration.
pi: a vector of log class probabilities (length = 2).
eta: a matrix of log word probabilities (nrow = the number of all documents, ncol = 2).
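
Because the returned quantities are on the log scale, they usually need to be exponentiated or normalized before being read as probabilities. The sketch below uses made-up values and a hypothetical `log_sum_exp` helper; it does not call multiEM.

```r
# Hypothetical helper: log(sum(exp(x))) computed without underflow.
log_sum_exp <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

# Made-up log class probabilities for two classes.
log_pi <- log(c(0.3, 0.7))
class_prob <- exp(log_pi)            # back to the probability scale

# Very negative log scores underflow if exponentiated directly ...
log_scores <- c(-1000, -1001)
naive  <- exp(log_scores) / sum(exp(log_scores))
# ... so subtract the log-sum-exp before exponentiating.
stable <- exp(log_scores - log_sum_exp(log_scores))
stable
```

Here `naive` evaluates to NaN for both entries, because both exponentials underflow to zero, while `stable` sums to one.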

References

Active EM in overleaf

activetext/activeR documentation built on May 31, 2024, 10:21 a.m.
