stri_extract_boundaries: Extract Data Between Text Boundaries
In stringi: Fast and Portable Character String Processing Facilities

stri_extract_all_boundaries

R Documentation

Extract Data Between Text Boundaries

Description

These functions extract data between text boundaries.

Usage

stri_extract_all_boundaries(
  str,
  simplify = FALSE,
  omit_no_match = FALSE,
  ...,
  opts_brkiter = NULL
)

stri_extract_last_boundaries(str, ..., opts_brkiter = NULL)

stri_extract_first_boundaries(str, ..., opts_brkiter = NULL)

stri_extract_all_words(
  str,
  simplify = FALSE,
  omit_no_match = FALSE,
  locale = NULL
)

stri_extract_first_words(str, locale = NULL)

stri_extract_last_words(str, locale = NULL)

Arguments

`str`	character vector or an object coercible to one
`simplify`	single logical value; if `TRUE` or `NA`, then a character matrix is returned; otherwise (the default), a list of character vectors is given, see Value
`omit_no_match`	single logical value; if `FALSE`, then a missing value will indicate that there are no words
`...`	additional settings for `opts_brkiter`
`opts_brkiter`	a named list with ICU BreakIterator's settings, see `stri_opts_brkiter`; `NULL` for the default break iterator, i.e., `line_break`
`locale`	`NULL` or `''` for text boundary analysis following the conventions of the default locale, or a single string with locale identifier, see stringi-locale
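The following is a minimal sketch of the two ways to pass break iterator settings described above: through `...` or through an explicit `opts_brkiter` list. The input string and the variable names `via_dots` and `via_opts` are illustrative, and `type = "sentence"` is just one possible setting accepted by `stri_opts_brkiter`.

```r
library(stringi)

# Illustrative input, not taken from the package documentation
x <- "Hello there. How are you?"

# Break iterator settings passed through `...`
via_dots <- stri_extract_all_boundaries(x, type = "sentence")

# The same settings passed as an explicit list built by stri_opts_brkiter()
via_opts <- stri_extract_all_boundaries(
  x,
  opts_brkiter = stri_opts_brkiter(type = "sentence")
)

# Both calls are expected to give the same result
print(identical(via_dots, via_opts))

# With no skip_* settings, the extracted pieces should cover the whole input
print(paste(via_dots[[1]], collapse = "") == x)
```

Passing settings through `...` is shorter for one-off calls; building the list once with `stri_opts_brkiter` may be more convenient when the same settings are reused.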

Details

Vectorized over str.

For more information on text boundary analysis performed by ICU's BreakIterator, see stringi-search-boundaries.

For stri_extract_*_words, just like in stri_count_words, ICU's word BreakIterator is used to locate the word boundaries, and all non-word characters (UBRK_WORD_NONE rule status) are ignored.
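As a rough illustration of the relationship between word extraction and word counting, the sketch below compares the two on the string from the Examples section. The last comparison rests on an assumption: that the word functions behave like `stri_extract_all_boundaries` with `type = "word"` and `skip_word_none = TRUE`, which matches the description above but is not stated explicitly on this page.

```r
library(stringi)

x <- "stringi: THE string processing package 123.48..."

# Words found by ICU's word break iterator; punctuation and spaces are dropped
words <- stri_extract_all_words(x)[[1]]
print(words)

# The number of extracted words should agree with stri_count_words()
print(length(words) == stri_count_words(x))

# Assumption: a word break iterator that skips UBRK_WORD_NONE segments
# yields the same pieces as stri_extract_all_words()
same <- stri_extract_all_boundaries(x, type = "word", skip_word_none = TRUE)[[1]]
print(identical(words, same))
```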

Value

For stri_extract_all_*, if simplify=FALSE (the default), then a list of character vectors is returned, one vector per element of str. Each string in such a vector is a separate word. If omit_no_match=FALSE, a single NA is returned for an input string that contains no words or is missing.

Otherwise, stri_list2matrix with byrow=TRUE argument is called on the resulting object. In such a case, a character matrix with length(str) rows is returned. Note that stri_list2matrix's fill argument is set to an empty string and NA, for simplify TRUE and NA, respectively.
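The return shapes described above can be compared side by side. This is a small sketch with made-up inputs; the variable names are illustrative only.

```r
library(stringi)

# Illustrative inputs: the empty string contains no words
x <- c("one two three", "four", "")

# Default: a list; the no-match case is marked with a single NA
as_list <- stri_extract_all_words(x)

# omit_no_match = TRUE: the no-match case becomes an empty character vector
as_list_omit <- stri_extract_all_words(x, omit_no_match = TRUE)

y <- c("one two three", "four")

# simplify = TRUE: a matrix with length(y) rows, short rows padded with ""
m_empty <- stri_extract_all_words(y, simplify = TRUE)

# simplify = NA: the same matrix shape, short rows padded with NA
m_na <- stri_extract_all_words(y, simplify = NA)

print(as_list)
print(as_list_omit)
print(m_empty)
print(m_na)
```

Padding with NA makes it easier to tell padding apart from genuine empty strings in later processing, whereas padding with an empty string is convenient when the matrix is pasted or printed directly.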

For stri_extract_first_* and stri_extract_last_*, a character vector is returned. An NA element indicates that there was no match.
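A short sketch of the first and last variants on illustrative inputs, including a string with no words and a missing value:

```r
library(stringi)

# Illustrative inputs: a regular string, a string with no words, a missing value
x <- c("stringi: THE string", "...", NA)

# One result per input string; NA where nothing was found or the input is missing
first <- stri_extract_first_words(x)
last <- stri_extract_last_words(x)

print(first)
print(last)
```

Because the result always has the same length as `str`, it can be stored directly as a new column alongside the input.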

Author(s)

Marek Gagolewski and other contributors

Examples

stri_extract_all_words('stringi: THE string processing package 123.48...')

stringi documentation built on May 29, 2024, 8:16 a.m.
