pseudonymize: Pseudonymize columns in data containing PINs
In fbc-studies/pinr: Tools for working with Finnish personal identity codes


View source: R/pseudonymize.R

Pseudonymize columns in data containing PINs

pseudonymize(data, key, ..., guess = FALSE, replace = TRUE,
  rename = !replace, quiet = FALSE)

pseudonymise(data, key, ..., guess = FALSE, replace = TRUE,
  rename = !replace, quiet = FALSE)

`data`	A data frame containing PINs to be pseudonymized.
`key`	Named vector or data frame used as a lookup table for pids. If a data frame, the first column is assumed to contain PINs and the second column the corresponding pids.
`...`	Manually selected columns to be pseudonymized. These are automatically quoted and evaluated in the context of the data. Uses 'tidyselect' semantics for selection.
`guess`	Logical. Attempt to automatically identify and pseudonymize columns that contain PINs?
`replace`	Logical. Should PIN columns be replaced with the pseudonymized versions?
`rename`	Logical or function. If `FALSE`, pseudonymized columns are not automatically renamed; if `TRUE`, they are suffixed with `"_pid"`; if a function, it is called on the PIN column names to generate new names for the pseudonymized columns. Manually specified new names are always used, regardless of this setting.
`quiet`	Logical. Suppress additional messages? Currently controls whether a warning is shown when pseudonymization results in all `NA` values in some column.
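
As a rough illustration of the two `key` forms described above, the base-R sketch below builds the same lookup table as a data frame and as a named vector. It assumes that, in the named-vector form, the names are the PINs and the values are the pids, which the description implies but does not state outright. The PIN-like strings and pids are made up, and the code does not call pinr itself.

```r
# Made-up example data: PIN-like strings and arbitrary pids.
key_df <- data.frame(
  pin = c("010101-900R", "020202-901F"),
  pid = c("P001", "P002"),
  stringsAsFactors = FALSE
)

# Assumption: in the named-vector form, names are PINs and values are pids.
key_vec <- setNames(key_df$pid, key_df$pin)

# Both forms answer the same question: which pid belongs to this PIN?
pid_from_vec <- unname(key_vec["020202-901F"])
pid_from_df  <- key_df$pid[match("020202-901F", key_df$pin)]
```

Either object could then be passed as `key`; which form is more convenient probably depends on where the lookup table comes from.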

A data frame where PINs have probably been linked to pids. If `replace = TRUE`, values in columns guessed to contain PINs are replaced with the matching pids from `key`.
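
The effect of `replace` and the default `rename` on the returned data frame can be sketched in base R. This is not pinr's implementation, only an approximation of the documented behaviour. The data, the column names `pin` and `visit`, and the idea that PINs missing from the key become `NA` are assumptions made for illustration, and the position of the new column in the real output may differ.

```r
# Made-up lookup table: names are PIN-like strings, values are pids.
key <- c("010101-900R" = "P001", "020202-901F" = "P002")

# Made-up data; the third PIN is deliberately absent from the key.
visits <- data.frame(
  pin   = c("020202-901F", "010101-900R", "030303-902K"),
  visit = c(1L, 2L, 3L),
  stringsAsFactors = FALSE
)

# Look up each PIN in the key; unmatched PINs come back as NA.
pids <- unname(key[visits$pin])

# Roughly what replace = TRUE implies: the PIN column is overwritten
# and keeps its name, since rename defaults to !replace (FALSE).
replaced <- visits
replaced$pin <- pids

# Roughly what replace = FALSE implies: PINs are kept, and because
# rename defaults to TRUE the pids go into a "_pid"-suffixed column.
kept <- visits
kept[["pin_pid"]] <- pids

# With pinr installed, the corresponding calls would presumably be:
# pinr::pseudonymize(visits, key, pin)
# pinr::pseudonymize(visits, key, pin, replace = FALSE)
```

Keeping the original column with `replace = FALSE` is mainly useful for checking the linkage; the PIN column would still need to be dropped before the data are shared.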

is_probably_pin, which is used to guess whether columns contain PINs.

fbc-studies/pinr documentation built on May 17, 2019, 7:35 p.m.