pseudonymize: Pseudonymize columns in data containing PINs

View source: R/pseudonymize.R

Description

Pseudonymize columns in data containing PINs

Usage

pseudonymize(data, key, ..., guess = FALSE, replace = TRUE,
  rename = !replace, quiet = FALSE)

pseudonymise(data, key, ..., guess = FALSE, replace = TRUE,
  rename = !replace, quiet = FALSE)

Arguments

data

A data frame containing PINs to be pseudonymized.

key

Named vector or data frame, used as a lookup table for pids. If a data frame, the first column is assumed to contain PINs and the second column the corresponding pids.

...

Manually selected columns to be pseudonymized. These are automatically quoted and evaluated in the context of the data. Uses 'tidyselect' semantics for selection.

guess

Logical. Attempt to automatically identify and pseudonymize columns that contain PINs?

replace

Logical. Should PIN columns be replaced with the pseudonymized versions?

rename

Logical or function. If 'FALSE', pseudonymized columns will not be automatically renamed; if 'TRUE', they will be suffixed with '"_pid"'; if a function, it will be called on the PIN column names to generate new names for the pseudonymized columns. Manually specified new names are always used, regardless of this setting.

quiet

Logical. Suppress additional messages? Currently controls whether a warning is shown when pseudonymization results in all 'NA' values in some column.
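
The two accepted forms of 'key' can be illustrated in base R. This is only a sketch of the lookup that the argument description implies, not the package's own code. The PIN strings and pids are made up, and it is an assumption here that the names of the vector form are the PINs and its values are the pids.

```r
# Hypothetical PINs and pids, for illustration only.
key_df <- data.frame(
  pin = c("010101-123N", "020202-234P"),
  pid = c(1001L, 1002L),
  stringsAsFactors = FALSE
)

# Data frame form -> named vector form.
# Assumption: names are PINs, values are pids.
key_vec <- setNames(key_df[[2]], key_df[[1]])

# PINs to look up; the last one is not present in the key.
pins <- c("020202-234P", "010101-123N", "030303-345R")

# Indexing a named vector by character performs the lookup.
# PINs missing from the key come back as NA.
pids <- unname(key_vec[pins])
print(pids)
```

A PIN that is absent from the key yields 'NA', which is the situation the 'quiet' argument's warning is about when it affects a whole column.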

Value

A data frame in which PINs have been linked to pids. If 'replace = TRUE', values in the columns selected or guessed to contain PINs are replaced with the matching pids from 'key'.
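
How 'replace' and 'rename' shape the returned data frame can be sketched with a small base R helper. The function pseudonymize_sketch below is hypothetical and is not part of pinr; it handles a single column, assumes a named-vector key whose names are PINs, and only mimics the documented defaults. The data values are made up.

```r
# Hypothetical helper, NOT part of pinr: mimics the documented
# replace/rename behaviour for one column using base R only.
pseudonymize_sketch <- function(data, key, col,
                                replace = TRUE, rename = !replace) {
  # Look up each PIN in the key; unmatched PINs become NA.
  pid <- unname(key[as.character(data[[col]])])

  # rename: FALSE keeps the name, TRUE appends "_pid",
  # a function maps the old name to a new one.
  new_name <- if (is.function(rename)) {
    rename(col)
  } else if (isTRUE(rename)) {
    paste0(col, "_pid")
  } else {
    col
  }

  if (replace) {
    # Overwrite the PIN column, then apply the new name (if any).
    data[[col]] <- pid
    names(data)[names(data) == col] <- new_name
  } else {
    # Keep the PIN column and add the pids under the new name.
    # (With rename = FALSE this would overwrite the PIN column;
    # the sketch does not try to resolve that edge case.)
    data[[new_name]] <- pid
  }
  data
}

# Made-up data and key.
d <- data.frame(id = c("010101-123N", "030303-345R"), x = 1:2,
                stringsAsFactors = FALSE)
key <- c("010101-123N" = 1001L, "020202-234P" = 1002L)

replaced <- pseudonymize_sketch(d, key, "id")                   # default: replace in place
kept     <- pseudonymize_sketch(d, key, "id", replace = FALSE)  # adds an "id_pid" column
custom   <- pseudonymize_sketch(d, key, "id", rename = toupper) # function-generated name
```

The default 'rename = !replace' means a replaced column keeps its name, while a column added alongside the original gets the '"_pid"' suffix so the two do not collide.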

See Also

is_probably_pin, which is used to guess whether columns contain PINs


fbc-studies/pinr documentation built on May 17, 2019, 7:35 p.m.