clean_data: Structure Data
In activetext/activeR: a semi-supervised active learning algorithm for text classification.

View source: R/functions_active_helper.R

Description

Structures data in preparation for the Active-EM implementation. Optionally filters out documents that match chosen character strings (regular expressions) and adds an index value for each document.

Usage

clean_data(
  docs,
  n_class,
  doc_name,
  index_name,
  labels_name = NULL,
  filters = NULL,
  add_index = T,
  add_filter = T,
  keep_labels = F
)

Arguments

`docs`	[matrix] Matrix of labeled and/or unlabeled documents.
`n_class`	[numeric] Number of classes to be considered.
`doc_name`	[character] Character string indicating the variable in `docs` that denotes the text of the documents to be classified.
`index_name`	[character] Character string indicating the variable in `docs` that denotes the index value of each document to be classified.
`labels_name`	[character] Character string indicating the variable in `docs` that denotes the already known labels of the documents. Defaults to `NULL`.
`filters`	[character] A vector of regular expressions used to filter out unwanted documents. Defaults to `NULL`.
`add_index`	[logical] Whether to add an index in the restructuring process. Defaults to `TRUE`.
`add_filter`	[logical] Whether to filter documents in the restructuring process. Defaults to `TRUE`.
`keep_labels`	[logical] Whether to keep an existing column of labels in the dataset. Defaults to `FALSE`.

Value

[matrix] Structured matrix of labeled and unlabeled documents, updated with labels for the documents in 'toLabel'.
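
The filtering and indexing steps described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the toy corpus, the column names `text` and `id`, and the choice to combine the patterns with `|` are all assumptions made for illustration only.

```r
# Hypothetical toy corpus; the column name `text` is an illustrative assumption.
docs <- data.frame(
  text = c(
    "Economy grows in Q2",
    "RT: click here",
    "Senate passes bill",
    "click to subscribe"
  ),
  stringsAsFactors = FALSE
)

# Regular expressions describing unwanted documents (compare `filters`).
filters <- c("^RT:", "subscribe")

# Combine the patterns so a document is flagged if it matches any of them.
pattern <- paste(filters, collapse = "|")
unwanted <- grepl(pattern, docs$text)

# Drop the flagged documents (compare `add_filter`).
kept <- docs[!unwanted, , drop = FALSE]

# Add a sequential index column (compare `add_index`); the name `id` is assumed.
kept$id <- seq_len(nrow(kept))
```

How `clean_data` itself names the index column or applies `filters` internally is not stated in this page, so the sketch should be read only as a picture of the kind of restructuring the arguments describe.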

activetext/activeR documentation built on May 31, 2024, 10:21 a.m.
