detector makes detecting data containing Personally Identifiable Information (PII) quick, easy, and scalable. It provides high-level functions that take vectors and data.frames and return summary statistics in a convenient data.frame. Once complete, detector will be able to detect a wider range of PII types; the output shown in this README covers email addresses, phone numbers, and national identification numbers.
You can install the latest released version from CRAN:
install.packages("detector")
Or install the development version from GitHub with:
# Install devtools first if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("paulhendricks/detector")
If you encounter a clear bug, please file a minimal reproducible example on GitHub.
library(dplyr, warn.conflicts = FALSE)
library(generator)
n <- 6
set.seed(1)
ashley_madison <-
  data.frame(name = r_full_names(n),
             snn = r_national_identification_numbers(n),
             dob = r_date_of_births(n),
             email = r_email_addresses(n),
             ip = r_ipv4_addresses(n),
             phone = r_phone_numbers(n),
             credit_card = r_credit_card_numbers(n),
             lat = r_latitudes(n),
             lon = r_longitudes(n),
             stringsAsFactors = FALSE)
knitr::kable(ashley_madison, format = "markdown")
|name                  |snn         |dob        |email                  |ip              |phone      |credit_card         |         lat|         lon|
|:---------------------|:-----------|:----------|:----------------------|:---------------|:----------|:-------------------|-----------:|-----------:|
|Eldridge Pfannerstill |442-34-5338 |1993-04-28 |ntakqojv@lgbcyk.rkv    |45.84.71.225    |6794976958 |4125-7204-9193-5140 |  -2.7018575|    8.634988|
|Augustine Homenick    |799-44-6396 |1912-09-08 |iqg@mtcuh.viy          |191.116.55.106  |3275827694 |2182-5994-2283-9486 | -70.4148630|  -65.827918|
|Jennie Runte          |941-11-5441 |1985-01-12 |wjszy@sjhreocvt.gbp    |27.128.73.17    |7419351735 |4370-4866-4735-7857 | -45.4091701|  -79.932229|
|Araceli Kunde         |290-44-2675 |1948-04-28 |uljsnvhfr@qfdkumtn.jkd |221.47.229.86   |3243246285 |6682-5074-2898-9396 |  -0.2673845|  103.514583|
|Josue Rau             |686-88-8446 |1996-06-14 |c@lqxzkdpi.nfy         |157.136.114.185 |9169736873 |4510-3757-4858-5236 | -22.8839925|   72.886505|
|Elnora Zemlak         |212-40-7016 |1976-01-09 |capvnl@nympzf.gsk      |143.20.199.87   |3295843196 |7206-6205-2194-6432 |  78.2444466| -120.590050|
library(detector)
ashley_madison %>%
  detect %>%
  knitr::kable(format = "markdown")
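|column_name |has_email_addresses |has_phone_numbers |has_national_identification_numbers |
|:-----------|:-------------------|:-----------------|:-----------------------------------|
|name        |FALSE               |FALSE             |FALSE                               |
|snn         |FALSE               |FALSE             |TRUE                                |
|dob         |FALSE               |FALSE             |FALSE                               |
|email       |TRUE                |FALSE             |FALSE                               |
|ip          |FALSE               |FALSE             |FALSE                               |
|phone       |FALSE               |TRUE              |FALSE                               |
|credit_card |FALSE               |FALSE             |FALSE                               |
|lat         |FALSE               |TRUE              |FALSE                               |
|lon         |FALSE               |TRUE              |FALSE                               |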
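To make the shape of this summary concrete, here is a minimal base R sketch of a column-wise check that returns one row per column. It is an illustration only and not detector's implementation: the helper names (looks_like_email, looks_like_ssn, sketch_detect) and the simplified regular expressions are assumptions made for this example.

```r
# Hypothetical helpers with deliberately simple patterns.
# These are NOT detector's actual rules.
looks_like_email <- function(x) {
  grepl("^[[:alnum:]._-]+@[[:alnum:].-]+\\.[[:alpha:]]+$", x)
}

looks_like_ssn <- function(x) {
  grepl("^[0-9]{3}-[0-9]{2}-[0-9]{4}$", x)
}

# Flag a column when at least one of its values matches a pattern.
sketch_detect <- function(df) {
  flag <- function(check) {
    unname(vapply(df, function(col) any(check(as.character(col))), logical(1)))
  }
  data.frame(column_name = names(df),
             has_email = flag(looks_like_email),
             has_ssn = flag(looks_like_ssn),
             stringsAsFactors = FALSE)
}

# Toy data reusing a few values from the generated table above.
toy <- data.frame(id = c("442-34-5338", "799-44-6396"),
                  contact = c("iqg@mtcuh.viy", "c@lqxzkdpi.nfy"),
                  lat = c(-2.7018575, -70.4148630),
                  stringsAsFactors = FALSE)

result <- sketch_detect(toy)
print(result)
```

Anchored patterns like these only match values that consist entirely of the identifier, so a real detector would need looser matching and more formats; the sketch is meant only to show how per-column flags can be collected into a data.frame.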
To cite package ‘detector’ in publications use:
Paul Hendricks (2015). detector: Detect Data Containing Personally Identifiable Information. R package version 0.1.0. https://CRAN.R-project.org/package=detector
A BibTeX entry for LaTeX users is
@Manual{,
  title = {detector: Detect Data Containing Personally Identifiable Information},
  author = {Paul Hendricks},
  year = {2015},
  note = {R package version 0.1.0},
  url = {https://CRAN.R-project.org/package=detector},
}