ngram: n-Gram creators
In UBESP-DCTV/costumer: COmprehensive Searches ThroUgh Machine learning for systEmatic Reviews

ngram

R Documentation

n-Gram creators

Description

Creates the n-gram tokens for each document in a corpus.

`bigram` is a shortcut for `ngram` with `n_min = n_max = 2`.

`trigram` is a shortcut for `ngram` with `n_min = n_max = 3`.

Usage

ngram(corpus, n_min = 1, n_max = 2, ..., parallel = FALSE,
  ncores = parallel::detectCores() - 1)

## S3 method for class 'list'
ngram(corpus, n_min = 1, n_max = 2, ..., parallel = FALSE,
  ncores = parallel::detectCores() - 1)

## S3 method for class 'VCorpus'
ngram(corpus, n_min = 1, n_max = 2, ...,
  parallel = FALSE, ncores = parallel::detectCores() - 1)

## S3 method for class 'character'
ngram(corpus, n_min = 1, n_max = 2, ...,
  parallel = FALSE, ncores = parallel::detectCores() - 1,
  docs_or_tokens = c("docs", "tokens"))

## Default S3 method:
ngram(corpus, n_min = 1, n_max = 2, ...,
  parallel = FALSE, ncores = parallel::detectCores() - 1)

bigram(corpus, ..., parallel = FALSE, ncores = parallel::detectCores() - 1)

trigram(corpus, ..., parallel = FALSE, ncores = parallel::detectCores() - 1)
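The default `ncores = parallel::detectCores() - 1` relies on the core count being known and greater than one. `parallel::detectCores()` can return `NA` when the count cannot be determined, and on a single-core machine the default evaluates to zero. The following is a minimal sketch of a guard a caller might apply before passing `ncores` explicitly; the name `safe_cores` is hypothetical and not part of the package.

```r
# parallel ships with base R; detectCores() may return NA when the count is unknown.
detected <- parallel::detectCores()

# Hypothetical guard (not part of costumer): fall back to one worker when the
# count is unknown or the machine has a single core.
safe_cores <- if (is.na(detected)) 1L else max(1L, detected - 1L)
```

The resulting value could then be supplied as `ncores = safe_cores` together with `parallel = TRUE`.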

Arguments

`corpus`	a compatible object storing documents (currently: a list or corpus-list of (tokenized) documents, a character vector, or a `VCorpus`)
`n_min`	(num) minimum number of words to include in the grams
`n_max`	(num) maximum number of words to include in the grams
`...`	further options passed to the methods
`parallel`	(lgl) if `TRUE`, perform the computation in parallel using the `parallel` package. Default is `FALSE`.
`ncores`	(int) number of cores to use in the parallel computation (default is the number of machine cores minus one)
`docs_or_tokens`	character string indicating whether the input is a vector of documents (`"docs"`, to be tokenized) or already a vector of tokens of a single document (`"tokens"`)
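To illustrate what `n_min` and `n_max` control, here is a minimal base-R sketch that builds every n-gram of a token vector for each n between the two bounds. It is not the package's implementation: the helper name `ngram_sketch`, the space separator, and the ordering of the output (all grams of one size before the next) are assumptions made for illustration, and the output of `ngram` itself may differ.

```r
# Hypothetical helper, not part of costumer: returns all n-grams of a token
# vector for every n from n_min to n_max, joined with `sep`.
ngram_sketch <- function(tokens, n_min = 1, n_max = 2, sep = " ") {
  stopifnot(is.character(tokens), n_min >= 1, n_max >= n_min)
  grams <- lapply(seq(n_min, n_max), function(n) {
    # A document shorter than n tokens has no n-grams of that size.
    if (length(tokens) < n) return(character(0))
    starts <- seq_len(length(tokens) - n + 1)
    vapply(
      starts,
      function(i) paste(tokens[i:(i + n - 1)], collapse = sep),
      character(1)
    )
  })
  unlist(grams, use.names = FALSE)
}

tokens <- c("systematic", "review", "search")

# n_min = 1, n_max = 2: single words plus adjacent word pairs.
unigrams_and_bigrams <- ngram_sketch(tokens, n_min = 1, n_max = 2)

# n_min = n_max = 2: the setting that the bigram shortcut fixes.
bigrams_only <- ngram_sketch(tokens, n_min = 2, n_max = 2)
```

Setting `n_min = n_max` restricts the result to grams of a single size, which is the configuration the `bigram` and `trigram` shortcuts use.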

Value

An object of the same class as the input (except for character vector input, for which the output is a list), with documents tokenized into n-grams.

(list) of character vectors containing the n-grammed documents

UBESP-DCTV/costumer documentation built on Feb. 1, 2023, 4:52 a.m.
