ESRC/AHRC Victorian Election Violence Project R Code (Durham University)

split_corpus

R Documentation

Description

Preprocesses a text corpus and divides it into training and testing sets according to the number of training items (n_train). Note: preprocessing and splitting as separate steps is more efficient, especially when the split is repeated inside a loop.

Usage

split_corpus(
  the_corpus,
  n_train,
  min_termfreq = 2,
  min_docfreq = 2,
  remove_punct = TRUE,
  remove_numbers = TRUE,
  remove_hyphens = TRUE,
  dfm_tfidf = FALSE,
  stem = TRUE
)
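
The efficiency note in the Description can be illustrated with a minimal base R sketch: preprocess once, then repeat only the cheap splitting step inside the loop. This is not the package's implementation. The toy matrix doc_features stands in for an already preprocessed document-feature matrix, split_by_n_train is a hypothetical helper, and random selection of the training rows is an assumption, since this page does not say how split_corpus chooses them.

```r
# Toy stand-in for a preprocessed document-feature matrix (hypothetical data).
doc_features <- matrix(
  c(1, 0, 2,
    0, 1, 1,
    3, 0, 0,
    0, 2, 1,
    1, 1, 0),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("doc", 1:5), c("riot", "poll", "crowd"))
)

# Hypothetical helper (not part of durhamevp): draw n_train rows at random
# for training and keep the remaining rows for testing.
split_by_n_train <- function(x, n_train) {
  stopifnot(n_train >= 1, n_train < nrow(x))
  train_idx <- sample(nrow(x), n_train)
  list(
    train = x[train_idx, , drop = FALSE],
    test = x[-train_idx, , drop = FALSE]
  )
}

set.seed(1)
# The preprocessing result is reused; only the split is repeated in the loop.
splits <- lapply(1:3, function(i) split_by_n_train(doc_features, n_train = 3))
```

Each element of splits holds a train matrix with n_train rows and a test matrix with the remaining rows, so a model can be fitted and evaluated per iteration without repeating the preprocessing.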

gidonc/durhamevp documentation built on April 8, 2022, 10:31 a.m.
