char_map: Split a Character Vector into its Unique Elements and a...

View source: R/compress_chars.R

char_map    R Documentation

Split a Character Vector into its Unique Elements and a Mapping on These

Description

char_map is implemented using a radix sort on the CHARSXPs directly, i.e. on the addresses of the strings in the global string cache. Hence, in contrast to unique, this function treats two strings that differ only in their encoding as distinct. The order of the unique elements is also undefined.
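The encoding caveat can be illustrated with a short sketch. The example string and variable names are illustrative only. The char_map part runs only if bettermc is installed, and the result it should give (two unique elements) is inferred from the description above, not taken from a verified run.

```r
# Two representations of the same text with different marked encodings.
x_latin1 <- "fa\xE7ile"
Encoding(x_latin1) <- "latin1"
x_utf8 <- enc2utf8(x_latin1)

x <- c(x_latin1, x_utf8)

# Base R regards the two strings as equal once translated to UTF-8,
# so unique() keeps only one of them.
n_unique_base <- length(unique(x))

# char_map compares CHARSXP addresses, so according to the description
# it should keep both strings as separate unique elements (assumption).
if (requireNamespace("bettermc", quietly = TRUE)) {
  map <- bettermc::char_map(x)
  n_unique_map <- length(map$chars)
}
```

If the distinction matters for downstream code, converting x to a single encoding first (e.g. with enc2utf8) is one way to make both approaches agree.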

Usage

char_map(x)

map2char(map)

Arguments

x

a character vector. Long vectors are supported.

map

an object as returned by char_map.

Value

char_map returns an S3 object of class "char_map", which is a list with the following elements:

chars

the unique set of strings in x, in undefined order.

idx

an integer (or, for long vectors, double) vector such that map$chars[map$idx] is identical to x, except possibly for attributes.

attributes

the attributes of x, as a shallow copy of the corresponding pairlist.

map2char returns a character vector identical to x, including attributes.
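The structure of the returned map can be approximated in base R. This is a conceptual sketch only and not bettermc's implementation: it uses unique and match instead of a radix sort on CHARSXP addresses, so it yields the unique strings in first-seen order and treats strings differing only in encoding as equal. The variable names are illustrative.

```r
# Conceptual base-R analogue of the char_map decomposition.
x <- c(a = "foo", b = "bar", c = "foo", d = "baz", e = "bar")

chars <- unique(x)        # unique strings; unique() drops the names
idx   <- match(x, chars)  # integer positions of each element in chars
attrs <- attributes(x)    # attributes of x, stored separately

# Analogue of map2char: index into chars, then restore the attributes.
y <- chars[idx]
attributes(y) <- attrs

stopifnot(identical(x, y))
```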

Windows Support

Fully supported on Windows.

Lifecycle

Stable

Examples

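# Split a vector with many repeated strings into unique elements and indices
x <- sample(letters, 100, replace = TRUE)
map <- char_map(x)

# Indexing the unique strings by idx reproduces x
stopifnot(identical(x, map$chars[map$idx]))

# Attributes (here: names) survive the round trip through map2char
names(x) <- 1:100
stopifnot(identical(x, map2char(char_map(x))))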


gfkse/bettermc documentation built on April 23, 2023, 6:51 a.m.