Unicode Character Properties

Description

Get the properties of Unicode characters.

Usage

u_char_info(x)
u_char_properties(x, which)
u_char_property(x, which)

Arguments

x

an R object which can be coerced to a u_char vector of Unicode characters via as.u_char.

which

a character vector (for u_char_properties) or a single string (for u_char_property) giving the names, possibly abbreviated, of Unicode properties.

Value

For u_char_info, a data frame with variables giving the code (Code) and the ‘basic’ Unicode variables Name, General Category, Canonical Combining Class, Bidi Class, Decomposition, Numeric Value Decimal Digit, Numeric Value Digit, Numeric Value, Bidi Mirrored, Unicode 1 Name, ISO Comment, Simple Uppercase Mapping, Simple Lowercase Mapping, and Simple Titlecase Mapping. Variable names are obtained by replacing spaces with underscores (e.g., Bidi_Class).
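
As a minimal sketch of what this data frame looks like in use, the following builds a small u_char vector and selects a few of the variables named above. It assumes the Unicode package is installed and that as.u_char accepts strings of the form "U+XXXX"; the two code points are chosen only for illustration.

```r
library(Unicode)

## Two illustrative characters: LATIN CAPITAL LETTER A and EURO SIGN.
## (Assumes as.u_char() accepts "U+XXXX" strings.)
x <- as.u_char(c("U+0041", "U+20AC"))

## One row per character, one column per 'basic' variable.
info <- u_char_info(x)

## Column names use underscores in place of spaces.
info[, c("Code", "Name", "General_Category")]
```

Each row corresponds to one element of x, so the result can be subset and merged like any other data frame.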

For u_char_properties, a data frame with the values of the specified properties, or, if no arguments were given, a character vector with the names of all currently available Unicode character properties.
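
The sketch below illustrates both calling forms of u_char_properties. Only "Age" is taken from the examples on this page; "Script" and "Block" are assumed property names, so the code keeps only the ones that the installed version actually reports as available.

```r
library(Unicode)

## Called without arguments: names of the available properties.
available <- u_char_properties()

## "Script" and "Block" are assumptions for illustration; intersect()
## drops any name that is not reported as available.
wanted <- intersect(c("Age", "Script", "Block"), available)

## Called with characters and property names: a data frame of values.
x <- u_char_from_name("EURO SIGN")
props <- u_char_properties(x, wanted)
props
```

Checking requested names against the available set first avoids asking for a property whose values cannot currently be obtained.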

For u_char_property, the values of the specified property.

Note

Currently, values can be obtained for only a subset of all Unicode character properties.

References

Unicode Character Database (http://www.unicode.org/ucd/)

Examples

## When was the Euro sign added to Unicode?
x <- u_char_from_name("EURO SIGN")
u_char_property(x, "Age")

## List the currently available Unicode character properties.
u_char_properties()