| stri_enc_info | R Documentation |
Gets basic information on a character encoding.
stri_enc_info(enc = NULL)
enc | NULL or '' for the default encoding, or a single string with an encoding name |
An error is raised if the provided encoding is unknown to ICU
(see stri_enc_list for more details).
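The error described above can be caught with base R's tryCatch. The following is a minimal sketch, assuming stringi is installed; the name "no-such-encoding" is a made-up placeholder that is assumed to be unknown to ICU.

```r
library(stringi)

# "no-such-encoding" is a hypothetical name, assumed to be unknown to ICU
result <- tryCatch(
  stri_enc_info("no-such-encoding"),
  error = function(e) e   # return the condition object instead of stopping
)

if (inherits(result, "error")) {
  message("Encoding lookup failed: ", conditionMessage(result))
}
```

Checking a name against stri_enc_list() beforehand is an alternative when an error is not desirable.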
Returns a list with the following components:
Name.friendly – friendly encoding name:
MIME Name, JAVA Name, or ICU Canonical Name
(the first one that is available is selected, see below);
Name.ICU – encoding name as identified by ICU;
Name.* – other standardized encoding names,
e.g., Name.UTR22, Name.IBM, Name.WINDOWS,
Name.JAVA, Name.IANA, Name.MIME (not all of them
are available for every encoding);
ASCII.subset – is ASCII a subset of the given encoding?;
Unicode.1to1 – for 8-bit encodings only: are all characters
translated to exactly one Unicode code point and is the translation
scheme reversible?;
CharSize.8bit – is this an 8-bit encoding, i.e., do we have
CharSize.min == CharSize.max and CharSize.min == 1?;
CharSize.min – minimum number of bytes used
to represent a UChar (a UTF-16 code unit, which is not the same as a UChar32);
CharSize.max – maximum number of bytes used
to represent a UChar (a UTF-16 code unit, which is not the same as a UChar32,
i.e., this value does not reflect the maximum size of a code point's representation).
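The components listed above are ordinary list elements. The following is a minimal sketch of inspecting them, assuming stringi is installed and using "UTF-8" as an example encoding name; the exact values depend on the ICU version in use, so none are shown here.

```r
library(stringi)

info <- stri_enc_info("UTF-8")

# Names of all returned components (Name.*, ASCII.subset, CharSize.*, ...)
names(info)

# Individual components are accessed like any other list element
info$Name.friendly
info$ASCII.subset
info$CharSize.8bit
c(min = info$CharSize.min, max = info$CharSize.max)

# With the default enc = NULL, the current default encoding is queried
default_info <- stri_enc_info()
default_info$Name.ICU
```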
Marek Gagolewski and other contributors
The official online manual of stringi at https://stringi.gagolewski.com/
Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, doi:10.18637/jss.v103.i02
Other encoding_management:
about_encoding,
stri_enc_list(),
stri_enc_mark(),
stri_enc_set()