unicode_code_points: Get Unicode code points
In trevorld/bittermelon: Bitmap Tools

hex2ucp

R Documentation

Get Unicode code points

Description

hex2ucp(), int2ucp(), name2ucp(), str2ucp(), block2ucp(), and range2ucp() return Unicode code points as character vectors. is_ucp() returns TRUE if x is a valid Unicode code point.

Usage

hex2ucp(x)

int2ucp(x)

str2ucp(x)

name2ucp(x, type = c("exact", "grep"), ...)

is_ucp(x)

block2ucp(x, omit_unnamed = TRUE)

range2ucp(x, omit_unnamed = TRUE)

Arguments

`x`	R objects coercible to the respective Unicode character data types. See `Unicode::as.u_char()` for `hex2ucp()` and `int2ucp()`, `base::utf8ToInt()` for `str2ucp()`, `Unicode::u_char_from_name()` for `name2ucp()`, `Unicode::as.u_char_range()` for `range2ucp()`, and `Unicode::u_blocks()` for `block2ucp()`.
`type`	one of `"exact"` or `"grep"`, or an abbreviation thereof.
`...`	arguments to be passed to `grepl()` when `type = "grep"` is used for pattern matching.
`omit_unnamed`	whether to omit control codes or unassigned code points.

Details

hex2ucp(x) is a wrapper for as.character(Unicode::as.u_char(toupper(x))). int2ucp(x) is a wrapper for as.character(Unicode::as.u_char(as.integer(x))). str2ucp(x) is a wrapper for as.character(Unicode::as.u_char(utf8ToInt(x))). name2ucp(x) is a wrapper for as.character(Unicode::u_char_from_name(x)). Unlike the wrapped calls, however, these functions coerce missing values to NA_character_ instead of "<NA>". Note that the names of bm_font() objects must be character vectors as returned by these functions, not Unicode::u_char objects.
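
To make the returned string format concrete, the following base R sketch mimics the str2ucp() conversion without the Unicode package. The helper name str2ucp_sketch() is hypothetical and is not part of bittermelon. It only approximates the "U+XXXX" output described above and does not reproduce the package's implementation or input handling.

```r
# Hypothetical helper (not part of bittermelon): approximates str2ucp()
# with base R only, assuming a single string as input.
str2ucp_sketch <- function(x) {
  stopifnot(is.character(x), length(x) == 1L)
  # Mirror the documented behaviour: missing values become NA_character_
  if (is.na(x)) return(NA_character_)
  # utf8ToInt() yields one integer per character in the string;
  # "%04X" pads to at least four uppercase hexadecimal digits
  sprintf("U+%04X", utf8ToInt(x))
}

str2ucp_sketch("R")            # "U+0052"
str2ucp_sketch(NA_character_)  # NA
```

The sketch returns a plain character vector rather than a Unicode::u_char object, which is the form the Details above say bm_font() names require.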

Value

A character vector of Unicode code points. is_ucp() instead returns a logical value.

Examples

  # These are all different ways to get the same 'R' code point
  hex2ucp("52")
  hex2ucp(as.hexmode("52"))
  hex2ucp("0052")
  hex2ucp("U+0052")
  hex2ucp("0x0052")
  int2ucp(82) # 82 == as.hexmode("52")
  int2ucp("82") # 82 == as.hexmode("52")
  int2ucp(utf8ToInt("R"))
  ucp2label("U+0052")
  name2ucp("LATIN CAPITAL LETTER R")
  str2ucp("R")

  block2ucp("Basic Latin")
  block2ucp("Basic Latin", omit_unnamed = FALSE)
  range2ucp("U+0020..U+0030")

trevorld/bittermelon documentation built on Jan. 16, 2025, 4:11 a.m.