'deidentify()' generates unique IDs from personally identifying information. Because the IDs are generated with the SHA-256 algorithm, (a) people with different identifying information are very unlikely to receive the same ID, and (b) it is nearly impossible to recover the identifying information from an ID.
data | A data frame (or tibble).
... | The columns in 'data' that contain personally identifying information, from which the unique IDs will be generated.
salt | An optional salt (see Details).
key | The name of the new column that will contain the unique IDs, "id" by default.
drop | A logical value, TRUE by default, indicating whether to remove the personally identifying columns after the IDs are created.
warn_duplicates | A logical value, TRUE by default, indicating whether to emit a warning if there are duplicate input rows or duplicate generated IDs.
This function uses non-standard evaluation for column names in 'data', so there's no need to surround them with quotation marks.
Optionally, a salt can be added to the personally identifying information. A salt is an extra piece of text, usually kept secret, that changes the resulting IDs. This makes it harder for somebody to re-identify people in the data set by generating IDs from a list of potential inputs and comparing them with the IDs in the data. However, you must use the same salt every time you deidentify data sets from the same cohort if you want to be able to cross-reference people by ID.
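The effect of a salt can be illustrated with a minimal sketch. It is written in Python rather than R and is not the package's implementation: the helper 'make_id', the field separator, the way the salt is combined with the fields, and the sample values are all assumptions made for illustration. It only shows the general idea of hashing identifying fields with SHA-256, with and without a salt.

```python
import hashlib


def make_id(fields, salt=""):
    """Hash identifying fields (plus an optional salt) into a hex ID.

    Hypothetical helper: how deidentify() actually combines the
    columns and the salt is not specified in this documentation.
    """
    # Join with a separator unlikely to occur in the data, so that
    # ("ab", "c") and ("a", "bc") do not collapse to the same input.
    text = "\x1f".join([salt, *fields])
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Made-up identifying information for one person.
person = ("Ada", "Lovelace", "1815-12-10")

unsalted = make_id(person)
salted_a = make_id(person, salt="cohort-secret")
salted_b = make_id(person, salt="cohort-secret")
other_salt = make_id(person, salt="different-secret")

# Same salt: the IDs match, so records can be cross-referenced.
print(salted_a == salted_b)
# Different (or no) salt: the IDs differ for the same person.
print(salted_a == other_salt, salted_a == unsalted)
```

Someone who knows only the identifying fields, but not the salt, cannot reproduce 'salted_a' by hashing a list of candidate names and birth dates, which is why the salt is usually kept secret.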