These functions convert a character vector between encodings.
str | character vector, a raw vector, or a list of raw vectors
from | input encoding
to | target encoding
to_raw | single logical value; indicates whether a list of raw vectors shall be returned rather than a character vector
These two functions aim to replace R's iconv: they are slightly faster and work in the same manner on all platforms. stri_conv is an alias for stri_encode.
Refer to stri_enc_list for the list of supported encodings and to stringi-encoding for a general discussion.
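As a minimal sketch of a basic conversion, the snippet below encodes a UTF-8 string as ISO-8859-1 bytes and checks that the stri_conv alias gives the same result. The sample string and the choice of ISO-8859-1 are illustrative assumptions, not part of the original documentation.

```r
library(stringi)

# Illustrative input: "cafe" with an acute accent, written with an escape so
# that R marks the string as UTF-8.
x <- "caf\u00e9"

# Convert to ISO-8859-1 and ask for raw bytes instead of a character vector.
via_encode <- stri_encode(x, from = "UTF-8", to = "ISO-8859-1", to_raw = TRUE)

# stri_conv is documented as an alias, so it should behave identically.
via_alias <- stri_conv(x, from = "UTF-8", to = "ISO-8859-1", to_raw = TRUE)

# The result is a list with one raw vector per input string.
print(via_encode[[1]])
```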
If from is missing, "", or NULL and str is an atomic vector, then the input strings' encoding marks are used (as in almost all stringi functions, bytes marks are disallowed). In other words, each input string is converted from ASCII, UTF-8, or the current default encoding (see stri_enc_get). Otherwise, the internal encoding marks are overridden by the given encoding. If, on the other hand, str is a list of raw vectors, the input encoding is assumed to be the current default encoding.
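The difference between relying on encoding marks and overriding them can be sketched as follows. The input string is an illustrative assumption; the second call deliberately names the wrong input encoding to show that the UTF-8 mark is ignored once from is given.

```r
library(stringi)

# Illustrative input, marked as UTF-8 by R because of the \u escape.
x <- "caf\u00e9"

# from = NULL: the string's own encoding mark (UTF-8) is used.
by_mark <- stri_encode(x, from = NULL, to = "UTF-8")

# from given: the mark is overridden, so the two UTF-8 bytes of the accented
# letter are read as two separate ISO-8859-1 characters.
by_override <- stri_encode(x, from = "ISO-8859-1", to = "UTF-8")

print(by_mark)
print(by_override)
```

Overriding the mark is useful when a string is known to be mislabeled; applied to a correctly marked string, as here, it produces garbled text.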
For to_raw=FALSE, the output strings always have marked encodings according to the target converter used (as specified by to) and the current default encoding (ASCII, latin1, UTF-8, native, or bytes in all other cases).
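A short sketch of inspecting the output with to_raw=FALSE, using base R's Encoding and charToRaw. The input string and target encodings are illustrative assumptions; the mark reported for the ISO-8859-1 result is printed rather than asserted, since it depends on the rules described above.

```r
library(stringi)

x <- "caf\u00e9"   # illustrative UTF-8 input

# Same text, two target encodings, both returned as character vectors.
out_utf8   <- stri_encode(x, from = "UTF-8", to = "UTF-8")
out_latin1 <- stri_encode(x, from = "UTF-8", to = "ISO-8859-1")

# Encoding marks attached to the results.
print(Encoding(out_utf8))
print(Encoding(out_latin1))

# The underlying bytes differ: two bytes for the accented letter in UTF-8,
# one byte in ISO-8859-1.
print(charToRaw(out_utf8))
print(charToRaw(out_latin1))
```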
Note that problems may occur when to is set to, e.g., UTF-16 or UTF-32, as the output strings may have embedded NULs. In such cases, use to_raw=TRUE and consider specifying a byte order marker (BOM) for portability reasons (e.g., set UTF-16 or UTF-32, which automatically adds BOMs).
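The following sketch shows why to_raw=TRUE matters for UTF-16: even a plain ASCII input yields zero bytes, which an R character string cannot hold. The input "abc" is an illustrative assumption, and the byte order of the output is not assumed here.

```r
library(stringi)

x <- "abc"   # illustrative ASCII input

# Encode to UTF-16 as raw bytes; the generic "UTF-16" name is expected to
# prepend a BOM, as described above.
u16 <- stri_encode(x, from = "UTF-8", to = "UTF-16", to_raw = TRUE)[[1]]
print(u16)

# Each ASCII character occupies two bytes, one of which is 00.
has_nul <- any(u16 == as.raw(0))
print(has_nul)

# Decoding the raw vector again recovers the original text.
back <- stri_encode(u16, from = "UTF-16", to = "UTF-8")
print(back)
```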
Note that stri_encode(as.raw(data), "8bitencodingname") is a wise substitute for rawToChar.
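A minimal sketch of that substitution, assuming a hypothetical byte vector data that holds ISO-8859-1 text. Unlike the two-argument call quoted above, to = "UTF-8" is passed explicitly so that the result does not depend on the platform's default encoding.

```r
library(stringi)

# Hypothetical input: the bytes of "cafe" with an acute accent in ISO-8859-1.
data <- c(0x63, 0x61, 0x66, 0xe9)

# rawToChar only glues the bytes together; the result carries no encoding mark.
plain <- rawToChar(as.raw(data))
print(Encoding(plain))

# stri_encode decodes the bytes using the named 8-bit encoding.
decoded <- stri_encode(as.raw(data), from = "ISO-8859-1", to = "UTF-8")
print(decoded)
```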
Currently, if an incorrect code point is found on input, it is replaced by the default (for that target encoding) substitute character and a warning is generated.
If to_raw is FALSE, then a character vector with encoded strings (and sensible encoding marks) is returned. Otherwise, you get a list of raw vectors.
Conversion – ICU User Guide, http://userguide.icu-project.org/conversion
Converters – ICU User Guide, http://userguide.icu-project.org/conversion/converters (technical details)
Other encoding_conversion: stri_enc_fromutf32; stri_enc_toascii; stri_enc_toutf32; stri_enc_toutf8; stringi-encoding