utf8_substr: Substring of an UTF-8 string

View source: R/utf8.R

utf8_substrR Documentation

Substring of an UTF-8 string

Description

This function uses grapheme clusters instead of Unicode code points in UTF-8 strings.

Usage

utf8_substr(x, start, stop)

Arguments

x

Character vector.

start

Starting index or indices, recycled to match the length of x.

stop

Ending index or indices, recycled to match the length of x.

Value

Character vector of the same length as x, containing the requested substrings.

See Also

Other UTF-8 string manipulation: utf8_graphemes(), utf8_nchar()

Examples

# Five grapheme clusters, select the middle three
str <- paste0(
  "\U0001f477\U0001f3ff\u200d\u2640\ufe0f",
  "\U0001f477\U0001f3ff",
  "\U0001f477\u200d\u2640\ufe0f",
  "\U0001f477\U0001f3fb",
  "\U0001f477\U0001f3ff")
cat(str)
str24 <- utf8_substr(str, 2, 4)
cat(str24)

cli documentation built on June 22, 2024, 10:57 a.m.