split_text: Split texts into segments
In zumbov2/deeplr: Interface to the 'DeepL' Translation API

split_text

R Documentation

Split texts into segments

Description

split_text splits texts into segments that do not exceed a maximum size in bytes.

Usage

split_text(text, max_size_bytes = 29000, tokenize = "sentences")

Arguments

`text`	character vector to be split.
`max_size_bytes`	maximum size of a single text segment in bytes.
`tokenize`	level of tokenization. Either "sentences" or "words".
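
Because `max_size_bytes` is measured in bytes rather than characters, text containing non-ASCII characters reaches the limit sooner than its character count suggests. A minimal base R illustration (the variable names are only for this example):

```r
# "Z\u00fcrich" is "Zürich"; the \u escape guarantees UTF-8 encoding.
x <- "Z\u00fcrich"

n_chars <- nchar(x, type = "chars")  # number of characters
n_bytes <- nchar(x, type = "bytes")  # number of bytes (the u-umlaut takes two in UTF-8)

print(c(chars = n_chars, bytes = n_bytes))
```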

Details

With tokenize = "sentences" (the default), the function uses tokenizers::tokenize_sentences to split texts.
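
The idea of combining tokens into size-limited segments can be sketched as follows. This is an illustrative sketch only, not the package's actual implementation: `pack_tokens` is a hypothetical helper, joining tokens with a single space is an assumption, and the token vector is written by hand where a tokenizer would normally supply it.

```r
# Hypothetical helper (not part of deeplr): greedily pack tokens into
# segments of at most max_size_bytes bytes each.
pack_tokens <- function(tokens, max_size_bytes) {
  segments <- character(0)
  current <- ""
  for (tok in tokens) {
    # Assumption: tokens are joined with a single space.
    candidate <- if (nzchar(current)) paste(current, tok) else tok
    if (nchar(candidate, type = "bytes") <= max_size_bytes) {
      current <- candidate
    } else {
      # Close the current segment and start a new one with this token.
      if (nzchar(current)) segments <- c(segments, current)
      current <- tok  # a single oversized token still becomes its own segment
    }
  }
  if (nzchar(current)) segments <- c(segments, current)
  segments
}

# Hand-written sentence tokens standing in for tokenizer output.
tokens <- c("One.", "Two is longer.", "Three.")
segments <- pack_tokens(tokens, max_size_bytes = 20)
print(segments)
```

In this sketch a token that is by itself larger than the limit is kept whole, so the limit can only be guaranteed when every token fits; splitting at a finer level, such as words, makes that more likely.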

Value

Returns a tibble with the following columns:

`text_id`	position of the text in the character vector.
`segment_id`	ID of a text segment.
`segment_text`	text segment that is smaller than `max_size_bytes`.

Examples

## Not run: 
# Split long text
text <- paste0(rep("This is a very long text.", 10000), collapse = " ")
split_text(text)

## End(Not run)

zumbov2/deeplr documentation built on March 30, 2024, 12:17 p.m.
