get_dfm: Get Document Feature Matrix
In activetext/activeR: a semi-supervised active learning algorithm for text classification.

get_dfm

R Documentation

Get Document Feature Matrix

Description

Builds a document-feature matrix using the quanteda package.

Usage

get_dfm(
  docs,
  doc_name = "text",
  index_name = "id",
  stem = T,
  ngrams = 1,
  trimPct = 1e-04,
  min_doc_freq = 2,
  idfWeight = F,
  removeStopWords = T,
  minChar = 4
)
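
A call might look like the following minimal sketch. It assumes the package installs under the name `activeR`, that `get_dfm` is exported, and that `docs` can be a data frame with `id` and `text` columns (the documentation describes it as a matrix with named variables); the toy corpus is purely illustrative. The call is guarded so the snippet still runs when the package is not installed.

```r
# Hypothetical toy corpus; the column names match the defaults of
# `index_name` ("id") and `doc_name` ("text").
docs <- data.frame(
  id = 1:4,
  text = c(
    "Senators debated the budget proposal",
    "The budget proposal passed the senate",
    "Local teams played football yesterday",
    "Football fans celebrated the victory"
  ),
  stringsAsFactors = FALSE
)

# Assumption: the package is called activeR and exports get_dfm().
# The call is skipped when the package is not available.
if (requireNamespace("activeR", quietly = TRUE)) {
  dfm <- activeR::get_dfm(
    docs,
    doc_name = "text",
    index_name = "id",
    stem = TRUE,
    ngrams = 1,
    trimPct = 1e-04,
    min_doc_freq = 2,
    idfWeight = FALSE
  )
  print(dim(dfm))
}
```

The arguments left out of the call (`removeStopWords`, `minChar`) keep the defaults shown in the usage.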

Arguments

`docs`	[matrix] Matrix of labeled and unlabeled documents.
`doc_name`	[character] Character string indicating the variable in 'docs' that denotes the text of the documents to be classified.
`index_name`	[character] Character string indicating the variable in 'docs' that denotes the index value of the document to be classified.
`stem`	[logical] Switch indicating whether or not to stem terms.
`ngrams`	[integer] Integer value indicating the size of the ngram to use to build the dfm.
`trimPct`	[numeric] Numeric value indicating the minimum proportion of documents in which a term must appear to be kept in the document-term matrix. E.g., if `trimPct = .5`, then all terms that appear in fewer than 50 percent of the documents will be removed.
`min_doc_freq`	[integer] Minimum number of documents a term must be in to stay in the document term matrix.
`idfWeight`	[logical] Switch indicating whether to apply inverse document frequency (IDF) weighting to the document-term matrix. Only works if `dfmType = "quanteda"`.
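
To make the trimming arguments concrete, the base-R sketch below applies a document-count threshold and a document-proportion threshold to a toy count matrix, then applies an IDF weight. It is an illustration under assumptions, not the package's code: it assumes both thresholds are applied together, and it assumes the weighting is count times `log10(N / document frequency)`. The matrix and all variable names are hypothetical.

```r
# Toy document-term count matrix: 4 documents (rows) x 4 terms (columns).
# matrix() fills column by column, so each row of numbers below is one term.
counts <- matrix(
  c(1, 1, 0, 0,   # budget:  doc1, doc2
    1, 1, 0, 0,   # propos:  doc1, doc2
    0, 0, 1, 1,   # footbal: doc3, doc4
    0, 0, 0, 1),  # victori: doc4 only
  nrow = 4,
  dimnames = list(
    paste0("doc", 1:4),
    c("budget", "propos", "footbal", "victori")
  )
)

n_docs <- nrow(counts)
doc_freq <- colSums(counts > 0)  # number of documents containing each term

# Illustrative thresholds, named after the documented arguments.
min_doc_freq <- 2
trimPct <- 0.5

# Assumption: a term is kept only if it passes both thresholds.
keep <- doc_freq >= min_doc_freq & (doc_freq / n_docs) >= trimPct
trimmed <- counts[, keep, drop = FALSE]

# Assumption: IDF weighting multiplies each count by log10(N / df).
idf <- log10(n_docs / doc_freq[keep])
weighted <- sweep(trimmed, 2, idf, `*`)

print(colnames(trimmed))
```

Here `victori` appears in one of four documents, so it fails both thresholds and is dropped; the three remaining terms each appear in exactly half of the documents.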

Value

[matrix] Document term matrix.
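
For orientation, a comparable matrix can be built directly with quanteda, as in the sketch below. This is not the implementation of `get_dfm`; it only strings together standard quanteda steps that correspond loosely to the documented arguments (stop word removal, stemming, n-grams, document-frequency trimming, IDF weighting). The toy corpus and object names are hypothetical, and the quanteda calls are skipped when the package is not installed.

```r
# Hypothetical toy corpus with character document ids.
docs <- data.frame(
  id = c("d1", "d2", "d3", "d4"),
  text = c(
    "Senators debated the budget proposal",
    "The budget proposal passed the senate",
    "Local teams played football yesterday",
    "Football fans celebrated the victory"
  ),
  stringsAsFactors = FALSE
)

if (requireNamespace("quanteda", quietly = TRUE)) {
  corp <- quanteda::corpus(docs$text, docnames = docs$id)

  toks <- quanteda::tokens(corp, remove_punct = TRUE)
  toks <- quanteda::tokens_remove(toks, quanteda::stopwords("en"))  # cf. removeStopWords
  toks <- quanteda::tokens_wordstem(toks)                           # cf. stem
  toks <- quanteda::tokens_ngrams(toks, n = 1)                      # cf. ngrams

  dfm_counts <- quanteda::dfm(toks)

  # Keep terms that occur in at least 2 documents (cf. min_doc_freq).
  dfm_trimmed <- quanteda::dfm_trim(
    dfm_counts,
    min_docfreq = 2,
    docfreq_type = "count"
  )

  # Optional tf-idf weighting (cf. idfWeight).
  dfm_weighted <- quanteda::dfm_tfidf(dfm_trimmed)

  print(dim(dfm_trimmed))
}
```

A proportion-based cutoff in the spirit of `trimPct` could be expressed with `docfreq_type = "prop"` in `dfm_trim`.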

activetext/activeR documentation built on May 31, 2024, 10:21 a.m.
