Conversion of UTF-8 encoded character vectors to and from integer vectors.
x: object to be converted.
multiple: logical: should the conversion be to a single character string or multiple individual characters?
These will work in any locale, including on platforms that do not otherwise support multi-byte character sets.
Unicode defines a name and a number for each of the glyphs it
encompasses: the numbers are called code points, and they run from
0 to 0x10FFFF.
utf8ToInt converts a length-one character string encoded in
UTF-8 to an integer vector of Unicode code points. As of R 3.2.1
it checks the validity of the input and returns
NA if it is invalid.
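As an illustration of the behaviour described above, the following is a minimal sketch, assuming R 3.2.1 or later for the validity check. The variable names x and bad are illustrative only.

```r
# Code points of a plain ASCII string
utf8ToInt("abc")                 # 97 98 99

# A non-ASCII character, written as an escape so the source stays ASCII
x <- "\u00e9"                    # e with acute accent, U+00E9
utf8ToInt(x)                     # 233

# Invalid UTF-8: a lone continuation byte (0x80) cannot start a character
bad <- rawToChar(as.raw(0x80))
utf8ToInt(bad)                   # NA
```

Note that the input is treated as UTF-8 whatever the current locale, which is why the escaped form is used here rather than a literal typed in the session's native encoding.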
intToUtf8 converts a numeric vector of Unicode code points
either to a single character string or to a character vector of single
characters. (For a single character string
0 is silently dropped: otherwise
0 is mapped to "".
Non-integral numeric values are truncated to integers.) The Encoding
is declared as "UTF-8".
NA inputs are mapped to NA output.
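The two output modes of intToUtf8 can be sketched as follows. This is only an illustration under the assumption of a reasonably recent R; cp is a hypothetical variable name, and the comments show the results one would expect from the description above.

```r
cp <- c(72, 105)                            # code points for "H" and "i"

intToUtf8(cp)                               # "Hi"
intToUtf8(cp, multiple = TRUE)              # "H" "i"

# Zeros are dropped from a single string but become "" with multiple = TRUE
intToUtf8(c(72, 0, 105))                    # "Hi"
intToUtf8(c(72, 0, 105), multiple = TRUE)   # "H" ""  "i"

# Non-integral values are truncated to integers before conversion
intToUtf8(72.9)                             # "H"

# A non-ASCII result is marked as UTF-8
Encoding(intToUtf8(233))                    # "UTF-8"

# NA input gives NA output
intToUtf8(NA_integer_)                      # NA

# Round trip back to code points
utf8ToInt(intToUtf8(cp))                    # 72 105
```

Pure ASCII results carry no encoding mark in R, so Encoding() reports "unknown" for a string such as "Hi" even though it is valid UTF-8.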