Functions for testing and adapting the (declared) encoding
of the components of a vector of mode character.
x: a vector (of character).
recursive: option to process list components.
internal: option to use internal translation.
latin1: option to assume "latin1" for strings that are not valid UTF-8 strings.
is.utf8 tests if the components of a vector of character
are true UTF-8 strings, i.e. contain one or more valid UTF-8
multi-byte sequences (see FIXME).
is.locale tests if the components of a vector of character
are in the encoding of the current locale.
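The distinction between a "true" UTF-8 string and a string that is merely valid as UTF-8 can be sketched with base R alone. The helper has_utf8_multibyte below is hypothetical and is not one of the functions documented here; it assumes that "true UTF-8" means valid UTF-8 that contains at least one non-ASCII byte, which is an interpretation of the description above rather than the actual implementation.

```r
# Hypothetical helper (an assumption, not the documented is.utf8):
# TRUE where a string is valid UTF-8 AND contains a multi-byte sequence.
has_utf8_multibyte <- function(x) {
  # TRUE where any byte lies outside the 7-bit ASCII range
  non_ascii <- vapply(
    x,
    function(s) any(as.integer(charToRaw(s)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )
  # validUTF8() also accepts pure ASCII, so require a non-ASCII byte too
  validUTF8(x) & non_ascii
}

# pure ASCII, UTF-8 bytes for a-umlaut, latin1 byte for a-umlaut
x <- c("aa", "a\xc3\xa4", "a\xe4")
res <- has_utf8_multibyte(x)
res
```

Only the second string qualifies: the first has no multi-byte sequence, and the third contains a byte that is not valid UTF-8.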
translate encodes the components of a vector of character
in the encoding of the current locale. This includes the names
attribute of vectors of arbitrary mode. If recursive = TRUE
the components of a list are processed. If internal = TRUE
multi-byte sequences that are invalid in the encoding of the current
locale are changed to literal hex numbers (see FIXME).
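Replacing untranslatable bytes with literal hex numbers has a close analogue in base R, shown here only as an illustration. The sketch uses iconv with sub = "byte" and a fixed target of "ASCII"; it is not the internal translation selected by internal = TRUE, and the exact replacement format used by translate may differ.

```r
# Base R analogue (an assumption, not the package's internal translation):
# bytes that cannot be converted are written out as <xx> hex codes.
x <- "a\xe4"
Encoding(x) <- "latin1"

hex <- iconv(x, from = "latin1", to = "ASCII", sub = "byte")
hex
```

Here the single latin1 byte e4 has no ASCII counterpart, so it is emitted as a literal hex code instead of being dropped or turning the whole result into NA.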
fixEncoding sets the declared encoding of the components of
a vector of character to their correct or preferred values. If
latin1 = TRUE strings that are not valid UTF-8 strings are
declared to be in "latin1". On the other hand, strings that
are true UTF-8 strings are declared to be in "UTF-8".
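The rule described for fixEncoding can be approximated in base R as follows. The function fix_declared is hypothetical, and its handling of edge cases (NA values, pure ASCII strings, strings already declared otherwise) is an assumption; it is a minimal sketch of the stated rule, not the documented implementation.

```r
# Hypothetical re-implementation of the rule described above.
fix_declared <- function(x, latin1 = FALSE) {
  # TRUE where any byte lies outside the 7-bit ASCII range
  non_ascii <- vapply(
    x,
    function(s) any(as.integer(charToRaw(s)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )
  valid <- validUTF8(x)
  # true UTF-8 strings are declared to be in "UTF-8"
  Encoding(x)[valid & non_ascii] <- "UTF-8"
  # optionally declare everything else that is non-ASCII as "latin1"
  if (latin1) Encoding(x)[!valid & non_ascii] <- "latin1"
  x
}

x <- c("aa", "a\xc3\xa4", "a\xe4")
fixed <- fix_declared(x, latin1 = TRUE)
Encoding(fixed)
```

Only the declared encoding changes; the bytes of each string are left untouched, and pure ASCII strings keep the declaration "unknown".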
The same type of object as
x with the (declared) encoding possibly changed.
translate uses iconv and therefore is not
guaranteed to work on all platforms.
FIXME PCRE, RFC 3629
## Note that we assume R runs in an UTF-8 locale
text <- c("aa", "a\xe4")
Encoding(text) <- c("unknown", "latin1")
is.utf8(text)
is.ascii(text)
is.locale(text)
## implicit translation
text
##
t1 <- iconv(text, from = "latin1", to = "UTF-8")
Encoding(t1)
## oops
t2 <- iconv(text, from = "latin1", to = "utf-8")
Encoding(t2)
t2
is.locale(t2)
##
t2 <- fixEncoding(t2)
Encoding(t2)
## explicit translation
t3 <- translate(text)
Encoding(t3)