Hash R objects to 32-bit integers

Description

Hash R objects to 32-bit integers.

Usage

hash(x, ...)

## Default S3 method:
hash(x, ...)

## S3 method for class 'character'
hash(x, recursive = TRUE,
  nthread = getOption("hashr_num_thread"), ...)

## S3 method for class 'list'
hash(x, recursive = TRUE,
  nthread = getOption("hashr_num_thread"), ...)

Arguments

x

Object to hash

...

Arguments to be passed to other methods. In particular, for the default method, these arguments are passed to serialize.

recursive

Logical. If TRUE, each element is hashed separately (nested lists are recursed over); if FALSE, the object is hashed as a whole.

nthread

Maximum number of threads to use. Defaults to the value of the "hashr_num_thread" option.
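Because the default method passes extra arguments on to serialize, those arguments change the bytes that end up being hashed. The following is a minimal base R sketch of that effect; it does not call hash itself, and the variable names are illustrative only.

```r
m <- lm(height ~ weight, data = women)

# Raw vectors produced by serialize() under two different settings.
bytes_binary <- serialize(m, connection = NULL)
bytes_ascii  <- serialize(m, connection = NULL, ascii = TRUE)

# The two byte sequences differ, so hashes computed from them
# should not be expected to match.
identical(bytes_binary, bytes_ascii)
```

Assuming the arguments are forwarded unchanged, hashes are only comparable between calls that used the same serialize settings.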

Details

The default method serializes the input to a single raw vector which is then hashed to a single signed integer. This is also true for character vectors when recursive=FALSE. When recursive=TRUE each element of a character vector is hashed separately, based on the underlying char representation in C.
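The difference between the two modes shows up in the length of the result. The sketch below assumes the hashr package is installed and skips the calls otherwise; the names per_element and whole are illustrative.

```r
x <- c("Call any vegetable", "and the chances are good")

if (requireNamespace("hashr", quietly = TRUE)) {
  # recursive = TRUE (the default): one hash per string
  per_element <- hashr::hash(x)

  # recursive = FALSE: the vector is serialized and hashed as one object
  whole <- hashr::hash(x, recursive = FALSE)

  print(length(per_element))
  print(length(whole))
}
```

Based on the description above, the first result should have one entry per element of x and the second should be a single integer.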

Parallelization

On systems supporting OpenMP, this function can use multiple cores. By default, a sensible number of cores is chosen. See the entry on OpenMP support in the Writing R Extensions manual to check whether your system supports it.
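The thread count can be set either for the whole session, through the "hashr_num_thread" option, or per call, through the nthread argument. The sketch below assumes the hashr package is installed and skips the calls otherwise; the value 2 is an arbitrary illustration, not a recommendation.

```r
n_threads <- 2L  # illustrative value

if (requireNamespace("hashr", quietly = TRUE)) {
  x <- c("Call any vegetable", "and the chances are good")

  # Session-wide: change the option and keep the old value for restoring.
  old <- options(hashr_num_thread = n_threads)
  h_option <- hashr::hash(x)   # uses getOption("hashr_num_thread")
  options(old)                 # restore the previous setting

  # Per call: request a single thread explicitly.
  h_single <- hashr::hash(x, nthread = 1L)
}
```

On a system without OpenMP, the setting is expected to have no effect on how many cores are used.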

Hash function

The hash function used is Paul Hsieh's SuperFastHash function, which is described on his website. As the name of the algorithm suggests, it is designed for speed, not as a secure hash, and it is probably a bad idea to use it for that purpose.

Examples

# hash some complicated R object (not a list).
m <- lm(height ~ weight, data=women)
hash(m)

# hash a character vector element by element:
x <- c("Call any vegetable"
     , "and the chances are good"
     , "that the vegetable will respond to you")
hash(x)

# hash a character vector as one object:
hash(x, recursive=FALSE)

# hash a list recursively
L <- strsplit(x," ")
hash(L)

# recursive really means recursive, so nested lists are recursed over:
L <- list(
  x = 10
  , y = list(
    foo = "bob"
    , bar = lm(Sepal.Width ~ Sepal.Length, data=iris)
  )
)

hash(L)
hash(L,recursive=FALSE)