ESRC/AHRC Victorian Election Violence Project R Code (Durham University)

preprocess_sgrams

R Documentation

Preprocess a text corpus, including the creation of n-grams for specific words, and return a document-feature matrix (a wrapper around quanteda functions).

Description

Preprocess a text corpus, including the creation of n-grams for specific words, and return a document-feature matrix (a wrapper around quanteda functions).

Usage

preprocess_sgrams(
  the_corpus,
  wseq,
  stem = TRUE,
  min_termfreq = 2,
  min_docfreq = 2,
  max_termfreq = NULL,
  max_docfreq = NULL,
  remove_punct = TRUE,
  remove_numbers = TRUE,
  remove_hyphens = TRUE,
  termfreq_type = "count",
  docfreq_type = "count",
  dfm_tfidf = FALSE
)

Arguments

`the_corpus`	The text corpus to be pre-processed.
`wseq`	Pre-specified word sequences, given as a list, for which n-grams are identified and added to the dfm.
`stem`	default TRUE
`min_termfreq`	default 2
`min_docfreq`	default 2
`max_termfreq`	default NULL
`max_docfreq`	default NULL
`remove_punct`	default TRUE
`remove_numbers`	default TRUE
`remove_hyphens`	default TRUE
`termfreq_type`	default "count"
`docfreq_type`	default "count"
`dfm_tfidf`	default FALSE
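
Example (sketch)

This page does not show the function body, so the following is only a sketch of how comparable steps might be carried out with quanteda directly. It is not the package's implementation. The toy texts, the shape of `wseq` (a list of character vectors, one per word sequence) and the order of the steps are assumptions made for illustration. Stemming is left out so the resulting feature names stay easy to read.

```r
library(quanteda)

# Toy corpus, invented for illustration (not project data)
txts <- c(
  d1 = "The election riot began at noon. Election riots spread quickly.",
  d2 = "Police stopped the election riot near the polling booth.",
  d3 = "A quiet polling day, with 2 arrests reported."
)
the_corpus <- corpus(txts)

# Tokenise, dropping punctuation and numbers
# (compare the remove_punct and remove_numbers arguments)
toks <- tokens(the_corpus, remove_punct = TRUE, remove_numbers = TRUE)

# Assumed format for wseq: each list element is one word sequence
wseq <- list(c("election", "riot"), c("polling", "booth"))

# Join each matched sequence into a single n-gram token, e.g. "election_riot"
toks <- tokens_compound(toks, pattern = wseq, concatenator = "_")

# Build the document-feature matrix (dfm() lowercases by default)
dfmat <- dfm(toks)

# Keep features meeting both frequency thresholds
# (compare min_termfreq, min_docfreq, termfreq_type, docfreq_type)
dfmat <- dfm_trim(
  dfmat,
  min_termfreq = 2,
  min_docfreq = 2,
  termfreq_type = "count",
  docfreq_type = "count"
)

# Optional tf-idf weighting (compare the dfm_tfidf argument)
dfmat_weighted <- dfm_tfidf(dfmat)

featnames(dfmat)
```

In this toy corpus only "the" and the compound "election_riot" occur at least twice overall and in at least two documents, so they are the only features left after trimming. "Election riots" in the first document is not compounded because "riots" does not match "riot" exactly.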

gidonc/durhamevp documentation built on April 8, 2022, 10:31 a.m.
