deidentify_choices_table: Make a data.frame showing either the outcome (k-score) of all...

View source: R/tables.R

Description

Make a data.frame showing the outcome (k-score) of either all possible de-identification choices or only those that meet a certain k-score threshold.

Usage

deidentify_choices_table(
  data,
  date_cols = NULL,
  group_rare_values_cols,
  k_score_columns = NULL,
  preferred_k_score = NULL
)

Arguments

data

A data.frame with the data you want to de-identify.

date_cols

A vector of strings with the names of the date columns that you want to be aggregated. If NULL, all date columns in the data are used.

group_rare_values_cols

A string or vector of strings with the names of the columns in which rare values (below k%, where k is 1-99) should be turned into NA.

k_score_columns

A string or vector of strings with the names of the columns to generate the k-score from. If NULL (default), the columns given in date_cols and group_rare_values_cols are used. Note that if you pass columns to those parameters but leave them out of k_score_columns, de-identifying those columns won't affect the k-score.

preferred_k_score

A number or vector of numbers setting the minimum (and, if a vector, the maximum) k-score you want from the possible choices.
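
To make the rare-value grouping described for group_rare_values_cols more concrete, here is a minimal base R sketch. It is not the package's implementation: the helper name group_rare_values_sketch and its k_percent argument are hypothetical, and it assumes "below k%" means a value's share of rows is strictly less than k percent.

```r
# Hypothetical helper for illustration only; not part of the deidentify package.
group_rare_values_sketch <- function(x, k_percent) {
  # Percentage of (non-missing) rows taken up by each distinct value.
  shares <- prop.table(table(x)) * 100
  # Values whose share falls strictly below the threshold (an assumption).
  rare <- names(shares)[shares < k_percent]
  # Replace every occurrence of a rare value with NA.
  x[as.character(x) %in% rare] <- NA
  x
}

# mtcars$carb has two values (6 and 8) that each appear in only 1 of 32 rows.
carb_grouped <- group_rare_values_sketch(mtcars$carb, k_percent = 5)
table(carb_grouped, useNA = "ifany")
```

With a 5% threshold, only the values that each cover about 3% of rows are replaced; more common values are left untouched.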

Value

Returns a data.frame listing the possible de-identification choices and the k-score each one produces. Each row is one possible decision when using the deidentify_data() function and includes summary statistics for the k-score of that decision. If preferred_k_score is set, only the choices that meet it are returned. If no choices meet this k-score requirement, an empty data.frame is returned.
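
As a rough illustration of what a k-score summarizes, the base R sketch below counts, for each row, how many rows share the same combination of values in the chosen columns. This assumes the k-score follows the usual k-anonymity idea; the page does not spell out the package's exact computation, and the helper k_score_sketch is hypothetical.

```r
# Hypothetical helper for illustration only; not part of the deidentify package.
k_score_sketch <- function(data, columns) {
  # Build one key per row by pasting together the values of the chosen columns.
  key <- do.call(paste, c(data[columns], sep = "|"))
  # For each row, count how many rows share its key.
  as.integer(ave(seq_along(key), key, FUN = length))
}

k <- k_score_sketch(mtcars, c("cyl", "vs"))

# Summary statistics of the per-row group sizes.
summary(k)

# Rows whose group size falls in a preferred range, analogous in spirit
# to passing preferred_k_score = 5:15.
sum(k >= 5 & k <= 15)
```

A row with a group size of 1 is unique on the chosen columns, which is the situation de-identification tries to remove.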

Examples

deidentify_choices_table(
  mtcars,
  group_rare_values_cols = c("mpg", "vs"),
  k_score_columns = c("mpg", "vs")
)

deidentify_choices_table(
  mtcars,
  group_rare_values_cols = c("mpg", "vs"),
  preferred_k_score = 5:15
)

## Not run: 
deidentify_choices_table(
  deidentify::initiations,
  date_cols = c("arrest_date", "felony_review_date"),
  group_rare_values_cols = c("race", "primary_charge_flag"),
  k_score_columns = c("primary_charge_flag", "gender", "race",
                      "arrest_date", "felony_review_date")
)

## End(Not run)

phillydao/deidentify documentation built on Feb. 4, 2021, 2:31 p.m.