Convert Strings Between Given Encodings
These functions convert a character vector between encodings.
str: a character vector, a raw vector, or a list of raw vectors
to_raw: a single logical value; indicates whether a list of raw vectors should be returned rather than a character vector
stri_conv is an alias for stri_encode.
These two functions aim to replace R's iconv.
They are not only faster, but also work in the same manner on all platforms.
Please refer to stri_enc_list for the list of supported encodings and stringi-encoding for a general discussion.
If str is a character vector and from is either missing, "" or NULL, then the declared encodings are used (see stri_enc_mark); in such a case bytes-declared strings are disallowed.
Otherwise, the internal encoding declarations are ignored and a converter selected with from is used.
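As a minimal sketch of the two cases above (the sample string and the encoding name "ISO-8859-1" are illustrative choices, not taken from this page), the following converts a Latin-1 string to UTF-8, once relying on its declared encoding and once naming the source encoding explicitly:

```r
library(stringi)

# Build "caf" + e-acute from raw Latin-1 bytes (0xe9 is e-acute in Latin-1)
x <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9)))
Encoding(x) <- "latin1"   # declare (mark) the encoding of x

# from omitted: the declared encoding of x is used
y1 <- stri_encode(x, to = "UTF-8")

# from given: the declaration is ignored, the named converter is used
y2 <- stri_encode(x, from = "ISO-8859-1", to = "UTF-8")

charToRaw(y1)                             # inspect the resulting bytes
identical(charToRaw(y1), charToRaw(y2))   # both routes should agree here
```

Both calls should yield the same bytes in this case, because the declared encoding and the explicitly named one describe the same character set.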
On the other hand, if str is a raw vector or a list of raw vectors and from is not given, we assume that the input encoding is the current default encoding as given by stri_enc_get.
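The raw-input case might look like the sketch below. The byte values and the choice of "ISO-8859-1" are assumptions made for illustration; stri_enc_get() is used only to show which default encoding would apply if from were omitted.

```r
library(stringi)

# "caf" + e-acute encoded in Latin-1 (illustrative bytes)
bytes <- as.raw(c(0x63, 0x61, 0x66, 0xe9))

# The default encoding that would be assumed if from were omitted
stri_enc_get()

# Naming the source encoding explicitly avoids relying on that default
s <- stri_encode(bytes, from = "ISO-8859-1", to = "UTF-8")

# A list of raw vectors is converted element by element
v <- stri_encode(list(bytes, charToRaw("abc")),
                 from = "ISO-8859-1", to = "UTF-8")
length(v)
```

Passing from explicitly is usually the safer choice for raw input, since the default encoding can differ between machines.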
For to_raw=FALSE, the output strings always have their encodings marked according to the target converter used (as specified by to) and the current default Encoding (ASCII, latin1, UTF-8, native, or bytes in all other cases).
Note that problems may occur if to indicates, e.g., UTF-16 or UTF-32, as the output strings may have embedded NULs.
In such cases use to_raw=TRUE and consider specifying a byte order mark (BOM) for portability reasons
(e.g., set UTF-16 or UTF-32, which automatically adds BOMs).
Note that stri_encode(as.raw(data), "encodingname") is a wise substitute for rawToChar.
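The following sketch illustrates why to_raw=TRUE is the safer route for UTF-16 output. The input "abc" and the converter names "UTF-16" and "UTF-16LE" are illustrative assumptions; the byte order chosen by the plain "UTF-16" converter is deliberately not asserted here.

```r
library(stringi)

# UTF-16 output contains NUL bytes, so ask for raw vectors
r <- stri_encode("abc", to = "UTF-16", to_raw = TRUE)
r[[1]]           # BOM followed by three 2-byte code units
length(r[[1]])   # 2 bytes of BOM + 3 * 2 bytes of text

# An endianness-specific converter writes no BOM
le <- stri_encode("abc", to = "UTF-16LE", to_raw = TRUE)[[1]]
le

# Round trip: a raw vector can be decoded back to a character string
back <- stri_encode(r[[1]], from = "UTF-16", to = "UTF-8")
back
```

With the BOM present, a reader on another platform can detect the byte order without being told, which is the portability benefit mentioned above.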
In the current version of stringi, if an incorrect code point is found on input, it is replaced by the default (for that target encoding) substitute character and a warning is generated.
If to_raw is FALSE, then a character vector with encoded strings (and sensible encoding marks) is returned.
Otherwise, a list of raw vectors is produced.
Conversion – ICU User Guide, http://userguide.icu-project.org/conversion
Converters – ICU User Guide, http://userguide.icu-project.org/conversion/converters (technical details)