hash_names: Anonymise data using scrypt


Description

This function uses the scrypt algorithm from libsodium to anonymise data, based on user-indicated data fields. Inputs are first cleaned using clean_labels, then the data fields are concatenated (using "_" as a separator) to form labels, and each resulting label is hashed. The function can return either a full detailed output, or short labels ready to use as 'anonymised data'.
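The steps described above can be sketched roughly as follows. This is an illustration only, not the package's actual code: the helper sketch_hash and its output column names are hypothetical, the cleaning step is a crude stand-in for clean_labels, and the sodium package is assumed to be installed.

```r
library(sodium)  # provides scrypt() and bin2hex()

# Hypothetical helper: illustrates the idea only, not epitrix's implementation.
sketch_hash <- function(..., size = 6) {
  fields <- list(...)
  # Rough stand-in for clean_labels(): lower-case, keep only letters and digits.
  fields <- lapply(fields, function(x) {
    gsub("[^a-z0-9]+", "", tolower(as.character(x)))
  })
  # Concatenate the fields entry by entry, using "_" as a separator.
  labels <- do.call(paste, c(fields, sep = "_"))
  # Hash each label with scrypt and render the raw bytes as a hex string.
  full <- vapply(labels,
                 function(l) bin2hex(scrypt(charToRaw(l))),
                 character(1), USE.NAMES = FALSE)
  # Column names here are assumptions chosen for the sketch.
  data.frame(label = labels,
             hash_short = substr(full, 1, size),
             hash = full,
             stringsAsFactors = FALSE)
}

out <- sketch_hash(c("Jane", "Joe"), c("Doe", "Smith"), c(25, 69))
```

Here each row of out holds one concatenated label together with its full hash and the first size characters of that hash.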

Usage

hash_names(..., size = 6, full = TRUE, salt = NULL)

Arguments

...

Data fields to be hashed.

size

The number of characters retained in the hash.

full

A logical indicating if a full output should be returned as a data.frame, including original labels, shortened hash, and full hash.

salt

An optional object that can be coerced to a character to be used to 'salt' the hashing algorithm (see details). Ignored if NULL (default).

Details

The argument salt should be used for salting the algorithm, i.e. adding an extra input to the input fields (the 'salt') to change the resulting hash and prevent identification of individuals via pre-computed hash tables.

It is highly recommended to choose a secret, random salt in order to make it harder for an attacker to decode the hash.
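A minimal sketch of the effect of salting, under the assumption that the salt is simply appended to the label before hashing (the documentation above only says it is added as an extra input). The helper hash_label is hypothetical, and the sodium package is assumed to be installed.

```r
library(sodium)  # provides scrypt(), bin2hex() and random()

# Hypothetical helper; assumes the salt is appended to the label with "_".
hash_label <- function(label, salt = NULL) {
  if (!is.null(salt)) {
    label <- paste(label, salt, sep = "_")
  }
  bin2hex(scrypt(charToRaw(label)))
}

# A random salt: 16 random bytes rendered as 32 hexadecimal characters.
# Keep it secret, and store it if the same hashes must be reproduced later.
secret_salt <- bin2hex(random(16))

unsalted <- hash_label("jane_doe")
salted   <- hash_label("jane_doe", salt = secret_salt)

# The salted hash differs from the unsalted one, so a table of hashes
# pre-computed without the salt no longer matches.
unsalted != salted
```

A salt generated this way could then be passed to hash_names through its salt argument, since any object that can be coerced to a character is accepted.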

Author(s)

Thibaut Jombart, Dirk Schumacher

See Also

clean_labels, used to clean labels prior to hashing.

Examples

first_name <- c("Jane", "Joe", "Raoul")
last_name <- c("Doe", "Smith", "Dupont")
age <- c(25, 69, 36)

hash_names(first_name, last_name, age)

hash_names(first_name, last_name, age,
           size = 8, full = FALSE)


## salting the hashing (more secure!)
hash_names(first_name, last_name) # unsalted - less secure
hash_names(first_name, last_name, salt = 123) # salted with an integer
hash_names(first_name, last_name, salt = "foobar") # salted with a character string

epitrix documentation built on May 2, 2019, 6:35 a.m.