active_label: Active Learning EM Algorithm

View source: R/functions_active.R

active_label    R Documentation

Active Learning EM Algorithm

Description

Active learning for the weighted EM algorithm. After the initial EM run converges, an oracle is queried for labels of the documents the EM algorithm is least certain about. This process repeats until the maximum number of active learning iterations is reached or no documents remain in the window of uncertainty.
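
The query loop described above can be sketched in base R. This is an illustration only, not the package's implementation: the helper names `select_uncertain` and `active_loop_sketch`, the window bounds (0.4 to 0.6), and the simulated oracle are all assumptions made for the example, and the EM refit between rounds is omitted.

```r
# Hypothetical helper: indices of unlabeled documents whose posterior
# P(class = 1) falls inside an assumed uncertainty window around 0.5.
select_uncertain <- function(post, labeled, max_query = 10,
                             lower = 0.4, upper = 0.6) {
  candidates <- which(!labeled & post > lower & post < upper)
  # Most uncertain first: smallest distance from 0.5
  candidates <- candidates[order(abs(post[candidates] - 0.5))]
  head(candidates, max_query)
}

# Hypothetical loop: query the oracle until max_active rounds have run
# or the uncertainty window is empty.
active_loop_sketch <- function(post, labeled, truth,
                               max_active = 5, max_query = 10) {
  queried <- integer(0)
  for (iter in seq_len(max_active)) {
    to_query <- select_uncertain(post, labeled, max_query)
    if (length(to_query) == 0) break      # window of uncertainty is empty
    # Simulated oracle (assumption): look up the true label
    post[to_query] <- truth[to_query]
    labeled[to_query] <- TRUE
    queried <- c(queried, to_query)
    # A real run would refit EM here; this sketch leaves the other
    # posteriors unchanged.
  }
  list(post = post, labeled = labeled, queried = queried)
}

# Toy posteriors standing in for the output of a converged EM run
post    <- c(0.05, 0.48, 0.93, 0.55, 0.41, 0.70)
labeled <- c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)
truth   <- c(0, 0, 1, 1, 0, 1)

res <- active_loop_sketch(post, labeled, truth, max_active = 5, max_query = 2)
res$queried
```

With these toy values the first round queries documents 2 and 4, the second queries document 5, and the third finds the window empty and stops before `max_active` is reached.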

Usage

active_label(
  docs,
  labels = c(0, 1),
  lambda = 1,
  n_class = 2,
  n_cluster = 2,
  max_active = 5,
  init_size = 10,
  max_query = 10,
  save_file_name = NA,
  save_directory = NA
)

Arguments

docs

[matrix] Matrix of labeled and unlabeled documents, where each row has index values and a nested Matrix of word tokens.

labels

[vector] Vector of character strings indicating classification options for labeling.

lambda

[numeric] Numeric value between 0 and 1. Used to weight unlabeled documents.

n_class

[numeric] Number of classes to be considered.

max_active

[numeric] Maximum number of active learning iterations.

init_size

[numeric] Value of maximum allowed iterations within the EM algorithm.

max_query

[numeric] Maximum number of documents queried in each EM iteration.

doc_name

[character] Character string indicating the variable in 'docs' that denotes the text of the documents to be classified.
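
To illustrate what weighting unlabeled documents with `lambda` can look like, here is a minimal sketch of a weighted M-step for word likelihoods in a multinomial naive Bayes model. It assumes the common scheme in which each unlabeled document's expected word counts are multiplied by `lambda`; whether `active_label` uses exactly this form is not confirmed by this page. The function name `weighted_word_probs`, its arguments, and the add-one smoothing are hypothetical.

```r
# Hypothetical weighted M-step (illustration only).
# dtm:        documents x words count matrix
# resp:       documents x classes matrix of class responsibilities
# is_labeled: logical vector, TRUE for labeled documents
# lambda:     weight in [0, 1] applied to unlabeled documents
weighted_word_probs <- function(dtm, resp, is_labeled, lambda = 1) {
  w <- ifelse(is_labeled, 1, lambda)   # per-document weight
  counts <- t(resp * w) %*% dtm        # classes x words weighted counts
  smoothed <- counts + 1               # add-one smoothing (assumption)
  smoothed / rowSums(smoothed)         # each class row sums to 1
}

# Toy data: two labeled documents and one unlabeled document
dtm <- matrix(c(2, 0,
                0, 2,
                5, 5), nrow = 3, byrow = TRUE)
resp <- matrix(c(1.0, 0.0,
                 0.0, 1.0,
                 0.5, 0.5), nrow = 3, byrow = TRUE)
is_labeled <- c(TRUE, TRUE, FALSE)

p0 <- weighted_word_probs(dtm, resp, is_labeled, lambda = 0)  # ignore unlabeled
p1 <- weighted_word_probs(dtm, resp, is_labeled, lambda = 1)  # full weight
```

With `lambda = 0` the unlabeled document contributes nothing to the estimates. With `lambda = 1` it counts as much as a labeled document and, in this toy data, pulls both classes toward a more even word distribution.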

Value

[list] List containing the labeled document matrix, prior weights, word likelihoods, and a vector of user-labeled document IDs.


activetext/activeR documentation built on May 31, 2024, 10:21 a.m.