stri_enc_isutf8: Check If a Data Stream Is Possibly in UTF-8
In stringi: Fast and Portable Character String Processing Facilities

stri_enc_isutf8

R Documentation

Check If a Data Stream Is Possibly in UTF-8

Description

The function checks whether each given sequence of bytes forms a proper UTF-8 string.

Usage

stri_enc_isutf8(str)

Arguments

str

character vector, a raw vector, or a list of raw vectors

Details

A result of FALSE means that a string is certainly not valid UTF-8. A result of TRUE, however, may be a false positive: the bytes are well-formed UTF-8, but the text may have been written in another encoding. For instance, the byte pair (c4, 85) represents 'a with ogonek' in UTF-8 as well as the two characters 'A umlaut' and 'Ellipsis' in WINDOWS-1250. Also note that UTF-8, like most 8-bit encodings, extends ASCII, so whenever stri_enc_isascii returns TRUE, stri_enc_isutf8 returns TRUE as well.
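The ambiguity of the (c4, 85) byte pair can be reproduced directly. The following is a minimal sketch, assuming that the ICU build behind stringi recognises the converter name "windows-1250"; the variable names are illustrative only.

```r
library(stringi)

# The byte pair discussed above
bytes <- as.raw(c(0xc4, 0x85))

# Well-formed as UTF-8, so the check cannot rule UTF-8 out
is_utf8 <- stri_enc_isutf8(bytes)

# Decode the same two bytes under each candidate source encoding
from_utf8   <- stri_encode(bytes, from = "UTF-8", to = "UTF-8")
from_cp1250 <- stri_encode(bytes, from = "windows-1250", to = "UTF-8")

stri_length(from_utf8)    # one character: a with ogonek
stri_length(from_cp1250)  # two characters: A umlaut, ellipsis

# Pure ASCII input passes both checks
ascii_ok <- stri_enc_isascii("abc") && stri_enc_isutf8("abc")
```

Both decodings succeed, which is why a TRUE result on its own does not identify the encoding the text was written in.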

However, the longer the sequence, the greater the probability that a TRUE result really does indicate UTF-8 text. This is because not all sequences of bytes are valid UTF-8.
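To illustrate why longer inputs are less likely to pass by accident, the sketch below converts a Polish phrase to WINDOWS-1250 and checks the resulting bytes. The phrase and variable names are chosen for illustration, and the converter name "windows-1250" is again assumed to be available.

```r
library(stringi)

# A phrase containing several non-ASCII letters
text <- "za\u017c\u00f3\u0142\u0107 g\u0119\u015bl\u0105 ja\u017a\u0144"

# Re-encode to WINDOWS-1250; to_raw = TRUE yields a list of raw vectors
cp1250 <- stri_encode(text, to = "windows-1250", to_raw = TRUE)[[1]]

# In this encoding every letter occupies exactly one byte
length(cp1250) == stri_length(text)

# The original string validates, the re-encoded bytes do not:
# the byte for 'z with dot above' (0xbf) follows an ASCII letter,
# but in UTF-8 such a byte may only continue a multi-byte sequence
ok_utf8   <- stri_enc_isutf8(text)
ok_cp1250 <- stri_enc_isutf8(cp1250)
```

With several non-ASCII characters in a row, at least one byte typically violates the UTF-8 lead/continuation pattern, so the check returns FALSE.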

This function is independent of the way R marks encodings in character strings (see Encoding and stringi-encoding).
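A short sketch of this independence: changing the declared encoding with base R's `Encoding<-` alters only the mark, not the stored bytes, so the outcome of the check should stay the same. The variable names are illustrative.

```r
library(stringi)

# A string whose bytes are valid UTF-8 (c4 85)
x <- "\u0105"
y <- x
Encoding(y) <- "latin1"   # change the mark only; the bytes stay c4 85

same_bytes <- identical(charToRaw(x), charToRaw(y))
marked_latin1_ok <- stri_enc_isutf8(y)

# A single byte that is not valid UTF-8, deliberately mis-marked as UTF-8
z <- rawToChar(as.raw(0xe4))
Encoding(z) <- "UTF-8"
marked_utf8_ok <- stri_enc_isutf8(z)
```

Here `marked_latin1_ok` is expected to be TRUE and `marked_utf8_ok` FALSE, since only the bytes are examined.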

Value

Returns a logical vector. Its i-th element indicates whether the i-th string corresponds to a valid UTF-8 byte sequence.
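The shape of the result for each accepted input type can be sketched as follows. A raw vector is treated as a single string, whereas a character vector or a list of raw vectors yields one result per element; the inputs are made up for illustration.

```r
library(stringi)

chr      <- c("abc", "\u0105")                      # character vector
raw_one  <- charToRaw("abc")                        # single raw vector
raw_list <- list(charToRaw("abc"), as.raw(0xff))    # list of raw vectors

res_chr  <- stri_enc_isutf8(chr)       # one value per string
res_raw  <- stri_enc_isutf8(raw_one)   # one value for the whole vector
res_list <- stri_enc_isutf8(raw_list)  # one value per list element

# The byte 0xff never occurs in well-formed UTF-8,
# so the second list element is reported as FALSE
```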

Author(s)

Marek Gagolewski and other contributors

Examples

# ASCII letters are also valid UTF-8
stri_enc_isutf8(letters[1:3])
# strings built from \u escapes are stored as UTF-8
stri_enc_isutf8('\u0105\u0104')
stri_enc_isutf8('\u1234\u0222')

stringi documentation built on May 29, 2024, 8:16 a.m.
