stri_width: Determine the Width of Code Points
In stringi: Fast and Portable Character String Processing Facilities

stri_width

R Documentation

Determine the Width of Code Points

Description

Approximates the number of text columns the 'cat()' function might use to print a string using a mono-spaced font.

Usage

stri_width(str)

Arguments

str

a character vector, or an object coercible to one

Details

The Unicode standard does not formalize the notion of a character width. The procedure below is roughly based on http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c, https://github.com/nodejs/node/blob/master/src/node_i18n.cc, and UAX #11. The following code points are of width 0:

code points with general category Me, Mn, or Cf (see stringi-search-charclass),
C0 and C1 control codes (general category Cc), for compatibility with the nchar function,
Hangul Jamo medial vowels and final consonants (code points with enumerable property UCHAR_HANGUL_SYLLABLE_TYPE equal to U_HST_VOWEL_JAMO or U_HST_TRAILING_JAMO; note that applying the NFC normalization with stri_trans_nfc is encouraged),
ZERO WIDTH SPACE (U+200B).

Characters with the UCHAR_EAST_ASIAN_WIDTH enumerable property equal to U_EA_FULLWIDTH or U_EA_WIDE are of width 2.

Most emojis and characters with general category So (other symbols) are of width 2.

SOFT HYPHEN (U+00AD), for compatibility with nchar, and all remaining characters are of width 1.
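The rules above can be tried on single code points. The following is a minimal sketch, assuming the stringi package is installed; the code points and the variable name widths are chosen here for illustration only, and the comments restate the documented rule rather than a verified output.

```r
library(stringi)

# One representative code point per rule (illustrative choices):
widths <- c(
   ascii     = stri_width("a"),       # ordinary character: width 1
   combining = stri_width("\u0301"),  # COMBINING ACUTE ACCENT, category Mn: width 0
   zwsp      = stri_width("\u200b"),  # ZERO WIDTH SPACE: width 0
   fullwidth = stri_width("\uff21")   # FULLWIDTH LATIN CAPITAL LETTER A: width 2
)
print(widths)

# Cases the text singles out for compatibility with nchar:
stri_width("\t")      # C0 control code (category Cc), documented as width 0
stri_width("\u00ad")  # SOFT HYPHEN, documented as width 1

# For a simple string like this one, the per-code-point widths add up:
combined <- stri_width("a\u0301\uff21")  # 1 + 0 + 2
print(combined)
```

Comparing the results with nchar(x, type = "width") on the same inputs is one way to see where the two functions agree.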

Value

Returns an integer vector of the same length as str.
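As a small illustration of the return value, the sketch below (assuming stringi is installed; the input vector x is made up for this example) checks that one integer width is produced per element of str, with missing values remaining missing.

```r
library(stringi)

# ASCII text, an empty string, a missing value, and two full-width letters
x <- c("abc", "", NA, "\uff21\uff22")
w <- stri_width(x)

is.integer(w)           # the result is an integer vector
length(w) == length(x)  # one width per input element
print(w)
```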

Author(s)

Marek Gagolewski and other contributors

References

East Asian Width – Unicode Standard Annex #11, https://www.unicode.org/reports/tr11/

Examples

library(stringi)
stri_width(LETTERS[1:5]) # ASCII letters
stri_width(stri_trans_nfkd('\u0105')) # 'a' followed by a combining ogonek (Mn)
stri_width(stri_trans_nfkd('\U0001F606')) # an emoji
stri_width( # Full-width equivalents of ASCII characters:
   stri_enc_fromutf32(as.list(c(0x3000, 0xFF01:0xFF5E)))
)
stri_width(stri_trans_nfkd('\ubc1f')) # includes Hangul Jamo medial vowels and final consonants

stringi documentation built on May 29, 2024, 8:16 a.m.

