These functions convert strings between encodings.
They aim to serve as a more portable and faster replacement
for R's own iconv.
stri_encode(str, from = NULL, to = NULL, to_raw = FALSE)
stri_conv(str, from = NULL, to = NULL, to_raw = FALSE)
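str: a character vector, a raw vector, or a list of raw vectors to be converted
from: the input encoding; if it is missing, the encoding is determined as described below
to: the target encoding; NULL (the default) selects the current default encoding
to_raw: a single logical value; indicates whether a list of raw vectors rather than a character vector should be returned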
stri_conv is an alias for stri_encode.
Refer to stri_enc_list for the list of supported encodings and to stringi-encoding for a general discussion.
If from is either missing, "" or NULL, and if str is a character vector, then the marked encodings are used (see stri_enc_mark); in such a case, bytes-declared strings are disallowed.
Otherwise, i.e., if str is a raw vector or a list of raw vectors, we assume that the input encoding is the current default encoding as given by stri_enc_get.
However, if from is given explicitly, the internal encoding declarations are always ignored.
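The two ways of determining the input encoding can be illustrated with a minimal sketch. The sample word and the variable names are made up for this illustration, and "ISO-8859-1" is assumed to be available among the names reported by stri_enc_list.

```r
library(stringi)

# The bytes of "caf<e-acute>" in ISO-8859-1 (0xe9 is e-acute there).
latin1_bytes <- as.raw(c(0x63, 0x61, 0x66, 0xe9))

# Raw input carries no encoding mark, so `from` is given explicitly.
from_raw <- stri_encode(latin1_bytes, from = "ISO-8859-1", to = "UTF-8")

# A character vector with a declared encoding: `from` is left as NULL,
# so the mark reported by stri_enc_mark() decides the input encoding.
marked <- "caf\u00e9"
stri_enc_mark(marked)
from_marked <- stri_encode(marked, to = "UTF-8")

# Both routes should yield the same UTF-8 string.
identical(from_raw, from_marked)
```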
For to_raw=FALSE, the output strings always have the encodings marked according to the target converter used (as specified by to) and the current default Encoding (ASCII, latin1, UTF-8, native, or bytes in all other cases).
Note that some issues might occur if to indicates, e.g., UTF-16 or UTF-32, as the output strings may have embedded NULs.
In such cases, please use to_raw=TRUE and consider specifying a byte order marker (BOM) for portability reasons (e.g., set UTF-16 or UTF-32, which automatically adds the BOMs).
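As a rough sketch of the advice above, the following converts a one-character string to UTF-16 as raw bytes. The variable names are illustrative only, and the byte order that follows the BOM may depend on the platform, so it is not shown here.

```r
library(stringi)

# "UTF-16" is expected to prepend a BOM; "UTF-16LE" fixes the byte
# order explicitly and writes no BOM.
with_bom <- stri_encode("a", to = "UTF-16", to_raw = TRUE)
little_endian <- stri_encode("a", to = "UTF-16LE", to_raw = TRUE)

# With to_raw = TRUE, each result is a list holding one raw vector per
# input string, so the embedded NUL bytes are harmless.
with_bom[[1]]
little_endian[[1]]

# A list of raw vectors can be fed back in to recover the text.
back <- stri_encode(with_bom, from = "UTF-16", to = "UTF-8")
back
```

Requesting the same conversion with to_raw=FALSE would have to squeeze those NUL bytes into an R string, which is the problem the paragraph above warns about.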
Note that stri_encode(as.raw(data), "encodingname") is a clever substitute for rawToChar.
In the current version of stringi, if an incorrect code point is found
on input, it is replaced with the default (for that target encoding)
'missing/erroneous' character (with a warning), e.g.,
the SUBSTITUTE character (U+001A) or the REPLACEMENT one (U+FFFD).
Occurrences thereof can be located in the output string to diagnose
the problematic sequences, e.g., by calling:
stri_locate_all_regex(converted_string, '[\\ufffd\\u001a]').
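A small sketch of this diagnostic step is given below, under the assumption that a lone 0xff byte is treated as an erroneous sequence when the input is declared as UTF-8. The input bytes and variable names are invented for the illustration.

```r
library(stringi)

# 0xff can never appear in well-formed UTF-8, so this input is invalid.
bad_bytes <- as.raw(c(0x61, 0xff, 0x62))

# The conversion is expected to emit a warning; it is silenced here only
# to keep the example output short.
converted <- suppressWarnings(
  stri_encode(bad_bytes, from = "UTF-8", to = "UTF-8")
)

# Locate the SUBSTITUTE (U+001A) or REPLACEMENT (U+FFFD) characters.
# The result is a list with one start/end matrix per input string.
hits <- stri_locate_all_regex(converted, "[\\ufffd\\u001a]")
hits[[1]]
```

The reported positions are code point indexes into the converted string, which can then be matched against the original input to find the offending bytes.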
Because of the way this function is currently implemented, the maximal size of a single string to be converted cannot exceed ~0.67 GB.
If to_raw is FALSE, then a character vector with encoded strings (and appropriate
encoding marks) is returned.
Otherwise, a list of vectors of type raw is produced.
Marek Gagolewski and other contributors
Conversion – ICU User Guide, https://unicode-org.github.io/icu/userguide/conversion/
The official online manual of stringi at https://stringi.gagolewski.com/
Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, doi: 10.18637/jss.v103.i02