AsciiToInt: Character to and from Integer Codes Conversion

AsciiToInt    R Documentation

Character to and from Integer Codes Conversion

Description

AsciiToInt returns integer codes in 0:255 for each (one byte) character in strings. ichar is an alias for it, for old S compatibility.

strcodes implements in R the basic engine for translating characters to corresponding integer codes.

chars8bit() is the inverse function of AsciiToInt, producing “one byte” characters from integer codes. Note that it (and hence strcodes()) depends on the locale, see Sys.getlocale().
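For the ASCII range, the same character/code round trip can be sketched with base R alone. This is only an illustration, not the sfsmisc implementation: utf8ToInt() and intToUtf8() work on Unicode code points, which coincide with the one-byte codes described here only for 1:127.

```r
## Base R only; no sfsmisc functions are used in this sketch.
codes <- utf8ToInt("ABC")                    # characters -> codes: 65 66 67
chars <- intToUtf8(65:70, multiple = TRUE)   # codes -> characters: "A" "B" "C" "D" "E" "F"

## Round trip: characters -> codes -> characters
back <- intToUtf8(utf8ToInt("Hello"))        # "Hello"
```

Above 127 the two approaches may disagree, since code points are locale independent while one-byte characters are not.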

Usage

AsciiToInt(strings)
     ichar(strings)
chars8bit(i = 1:255)
strcodes(x, table = chars8bit(1:255))

Arguments

strings, x

character vector.

i

numeric (integer) vector of values in 1:255.

table

a vector of (unique) character strings, typically of one character each.

Details

Only codes in 1:127 make up the ASCII encoding, which should be identical for all R versions, whereas the ‘upper’ half (128:255) is often determined by the ISO-8859-1 (aka “ISO-Latin 1”) encoding, but may well differ depending on the locale setting; see also Sys.setlocale.

Note that 0 is no longer allowed, since R no longer allows \0 (aka nul) characters in a string.
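The three cases above (ASCII codes, the nul byte, and the ‘upper’ half) can be illustrated with base R's raw-vector functions. This is a minimal sketch under the assumption that byte 252 is meant as Latin-1; the variable names are made up for the illustration.

```r
## Codes 1:127 are plain ASCII bytes and behave the same in every locale.
ascii <- rawToChar(as.raw(65:70), multiple = TRUE)   # "A" "B" "C" "D" "E" "F"
bytes <- as.integer(charToRaw("ABC"))                # 65 66 67

## An embedded nul byte (code 0) cannot be part of an R string: this errors.
nul <- try(rawToChar(as.raw(c(65L, 0L, 66L))), silent = TRUE)
nul_fails <- inherits(nul, "try-error")              # TRUE

## Byte 252 has no fixed meaning until an encoding is declared for it.
## Assumption for this sketch: we declare it to be Latin-1 ("u-umlaut").
ue <- rawToChar(as.raw(252L))
Encoding(ue) <- "latin1"
ue_code <- utf8ToInt(enc2utf8(ue))                   # 252
```

Declaring the encoding explicitly avoids depending on the session locale, which is what makes the upper half differ between systems.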

Value

AsciiToInt (and hence ichar) and chars8bit return a vector of the same length as their argument.

strcodes(x, table) returns a list of the same length and names as x, whose components are integer vectors with codes in 1:255.
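The shape of that return value can be reproduced with a simple table lookup. The helper below, strcodes_sketch, is hypothetical and not the sfsmisc source; it assumes an ASCII-only table built with intToUtf8(), so any character outside 1:127 comes back as NA.

```r
## Hypothetical helper: split each string into characters and look each
## one up in 'table'; the position in the table is the integer code.
strcodes_sketch <- function(x, table = intToUtf8(1:127, multiple = TRUE)) {
    lapply(strsplit(x, ""), match, table = table)
}

res <- strcodes_sketch(c(a = "ABC", ch = "1234"))
str(res)   # a named list with one integer vector per input string
```

Because the table is an argument, passing a different set of characters changes which codes are returned, which mirrors the role of the table argument documented above.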

Author(s)

Martin Maechler, partly in 1991 for S-plus

Examples

chars8bit(65:70)#-> "A" "B" .. "F"
stopifnot(identical(LETTERS,   chars8bit(65:90)),
          identical(AsciiToInt(LETTERS), 65:90))


## may only work in ISO-latin1 locale (not in UTF-8):
try( strcodes(c(a= "ABC", ch="1234", place = "Zürich")) )
## in "latin-1" gives  {otherwise should give NA instead of 252}:
## Not run: 
## $a
## [1] 65 66 67
##
## $ch
## [1] 49 50 51 52
##
## $place
## [1]  90 252 114 105  99 104
##
## End(Not run)

myloc <- Sys.getlocale()

if(.Platform$OS.type == "unix") withAutoprint({ # ''should work'' here
  try( Sys.setlocale(locale = "de_CH") )# "try": just in case
  strcodes(c(a= "ABC", ch="1234", place = "Zürich")) # no NA hopefully
  AsciiToInt(chars8bit()) # -> 1:255  {if setting latin1 succeeded above}

  chars8bit(97:140)
  try( Sys.setlocale(locale = "de_CH.utf-8") )# "try": just in case
  chars8bit(97:140) ## typically looks different than above
})

## Resetting to original locale .. works "mostly":
lapply(strsplit(strsplit(myloc, ";")[[1]], "="),
       function(cc) try(Sys.setlocale(cc[1], cc[2]))) -> .scratch

Sys.getlocale() == myloc # TRUE if resetting the locale succeeded

mmaechler/sfsmisc documentation built on Feb. 28, 2024, 4:18 a.m.