classifier_selection_description: Classifies new documents on a labeled training set...

View source: R/text_analysis.R

classifier_selection_description    R Documentation

Classifies new documents using a labeled training set (description).

Description

Trains a classifier on a labeled training set and uses it to classify new documents, based on the text in the description field.

Usage

classifier_selection_description(
  train,
  new_docs,
  text_field = "description",
  class_to_keep = 1,
  training_classify_var = "EV_article",
  prior = "uniform",
  classifier_type = "xgboost",
  stem_dfm = FALSE,
  return_logical = FALSE,
  logical_to_prob = FALSE,
  ...
)
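The call below is a minimal sketch of how the signature above might be used, not an example taken from the package. The toy data frames and their contents are invented for illustration; only the column names `description` and `EV_article` come from the defaults shown in Usage. A real training set would need to be far larger, and `train` may need columns that this page does not list, so the call is guarded and wrapped in `try()`.

```r
# Toy data (hypothetical): column names follow the defaults in Usage.
train <- data.frame(
  description = c("election riot in the town",
                  "candidate nominated at the hustings",
                  "cattle market prices",
                  "weather report for the county"),
  EV_article = c(1, 1, 0, 0),
  stringsAsFactors = FALSE
)
new_docs <- data.frame(
  description = c("disturbance at the polling booth", "shipping news"),
  stringsAsFactors = FALSE
)

# Only attempt the call if the package is installed; with such a small
# sample the classifier may fail, hence try().
if (requireNamespace("durhamevp", quietly = TRUE)) {
  kept <- try(
    durhamevp::classifier_selection_description(
      train,
      new_docs,
      text_field = "description",
      class_to_keep = 1,
      training_classify_var = "EV_article",
      classifier_type = "nb",   # naive Bayes, so `prior` applies
      prior = "uniform"
    ),
    silent = TRUE
  )
}
```

The naive Bayes option is chosen here only because `prior` is documented as applying to that classifier; the default is `"xgboost"`.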

Arguments

train

a data frame with the training documents.

new_docs

the documents to classify.

text_field

the text field (must be the same in the training documents and the documents to classify).

class_to_keep

the class (0 or 1) to keep.

training_classify_var

the variable containing the labels in the training set.

prior

the prior distribution over classes; used by the naive Bayes classifier only.

classifier_type

which classifier to use: "xgboost" or "nb" (naive Bayes).

return_logical

if TRUE, return a logical vector indicating the selected subset of documents; if FALSE, return the subset of documents itself.

...

other arguments to be passed to preprocess_corpus.

stem_dfm

whether to stem the document-feature matrix during preprocessing.
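
The relationship between `class_to_keep` and `return_logical` can be sketched in base R. This is an assumption about the behaviour described in the argument list, not the package's own code: the predicted labels below are made up by hand and `predicted_class`, `keep` and `kept_docs` are hypothetical names.

```r
# Hypothetical new documents and hand-written predicted labels
# (not produced by the package).
new_docs <- data.frame(
  description = c("riot at the poll", "corn prices", "election disturbance"),
  stringsAsFactors = FALSE
)
predicted_class <- c(1, 0, 1)
class_to_keep <- 1

# Presumed result with return_logical = TRUE:
# one TRUE/FALSE per document, TRUE where the class matches.
keep <- predicted_class == class_to_keep

# Presumed result with return_logical = FALSE:
# the matching rows of new_docs themselves.
kept_docs <- new_docs[keep, , drop = FALSE]
```

A logical vector is convenient when the selection has to be combined with other filters before subsetting; the subset form is convenient when the kept documents are used directly.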


gidonc/durhamevp documentation built on April 8, 2022, 10:31 a.m.