pack: Pack a data.frame of tokens
In paithiov909/audubon: Japanese Text Processing Tools

View source: R/pack.R

pack	R Documentation

Pack a data.frame of tokens

Description

Packs a data.frame of tokens into a new data.frame representing a corpus, which is compatible with the Text Interchange Formats (TIF).

Usage

pack(tbl, pull = "token", n = 1L, sep = "-", .collapse = " ")

Arguments

`tbl`	A data.frame of tokens.
`pull`	<`data-masked`> Column to be packed into the text or n-grams body. Default value is `token`.
`n`	Integer passed internally to the n-gram tokenizer function created by `audubon::ngram_tokenizer()`.
`sep`	Character scalar used internally to join the tokens within each n-gram.
`.collapse`	This argument is passed to `stringi::stri_c()`.

Value

A tibble.
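
To make the roles of `n`, `sep`, and `.collapse` concrete, here is a minimal base R sketch of the packing idea. It is not the package's implementation: `make_ngrams()` and `pack_sketch()` are hypothetical helpers written for illustration, the output column names `doc_id` and `text` are an assumption based on the TIF corpus format, and edge cases (for example, documents shorter than `n`, or document ordering) may be handled differently by `pack()` itself.

```r
# Toy tokens data.frame: one row per token, keyed by doc_id
tokens <- data.frame(
  doc_id = c("d1", "d1", "d1", "d2", "d2"),
  token  = c("a", "b", "c", "x", "y"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build n-grams from a character vector,
# joining the tokens inside each n-gram with `sep`
make_ngrams <- function(x, n = 1L, sep = "-") {
  if (length(x) < n) return(character(0))
  starts <- seq_len(length(x) - n + 1L)
  vapply(
    starts,
    function(i) paste(x[i:(i + n - 1L)], collapse = sep),
    character(1)
  )
}

# Hypothetical stand-in for pack(): one row per document, with the
# n-grams of that document joined by `collapse`.
# Note: split() orders documents by sorted doc_id.
pack_sketch <- function(tbl, n = 1L, sep = "-", collapse = " ") {
  pieces <- split(tbl$token, tbl$doc_id)
  text <- vapply(
    pieces,
    function(x) paste(make_ngrams(x, n, sep), collapse = collapse),
    character(1)
  )
  data.frame(doc_id = names(text), text = unname(text), stringsAsFactors = FALSE)
}

unigrams <- pack_sketch(tokens)          # text: "a b c", "x y"
bigrams  <- pack_sketch(tokens, n = 2L)  # text: "a-b b-c", "x-y"
```

In this sketch, `sep` only matters when `n` is greater than 1, while `collapse` always separates the units (tokens or n-grams) within a document's text.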

Text Interchange Formats (TIF)

The Text Interchange Formats (TIF) are a set of standards that allow R text analysis packages to target defined inputs and outputs for corpora, tokens, and document-term matrices.

Valid data.frame of tokens

The data.frame of tokens here is a data.frame object compatible with the TIF.

A TIF-valid data.frame of tokens is expected to have one key column (named `doc_id`) that uniquely identifies each text, plus several feature columns describing each token. The feature columns must contain at least `token` itself.
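
As an illustration of that shape, the following base R sketch builds a small tokens data.frame by hand and checks the two requirements described above. The `pos` column and the `has_required_cols` check are hypothetical and only for illustration; this is not an official TIF validator.

```r
# One row per token; doc_id identifies the text each token belongs to.
# `pos` is a hypothetical extra feature column.
tokens <- data.frame(
  doc_id = c("doc1", "doc1", "doc1", "doc2", "doc2"),
  token  = c("the", "cat", "sat", "dogs", "bark"),
  pos    = c("DET", "NOUN", "VERB", "NOUN", "VERB"),
  stringsAsFactors = FALSE
)

# Minimal checks mirroring the description: a doc_id key column
# and at least a token feature column, both character vectors
has_required_cols <- all(c("doc_id", "token") %in% names(tokens)) &&
  is.character(tokens$doc_id) &&
  is.character(tokens$token)

# Number of tokens per document
tokens_per_doc <- table(tokens$doc_id)
```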

Examples

pack(strj_tokenize(polano[1:5], format = "data.frame"))

paithiov909/audubon documentation built on June 2, 2025, 1:15 a.m.
