IdentificationRisk: An Identification Risk Function


View source: R/IdentificationRiskContinuous.R

Description

This function computes the identification risk for a dataset with synthetic categorical variables. It assumes that categorical variables are stored as factors.

Usage

IdentificationRisk(
  origdata,
  syndata,
  known,
  syn,
  r,
  percentage = TRUE,
  euclideanDist = FALSE
)
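
As a rough illustration of the call shape, the sketch below builds toy inputs and passes them by name. The data, the helper make_synthetic, and the choice of r = 0.1 (read here as a 10% radius) are assumptions made for this example; the structure of the return value is not documented on this page, so nothing is assumed about it. The call is guarded so the snippet still runs when the package is not installed.

```r
# Toy data invented for illustration; not taken from the package.
set.seed(1)
origdata <- data.frame(
  sex    = factor(c("F", "M", "F", "M")),   # categorical columns stored as factors
  region = factor(c("N", "S", "S", "N")),
  income = c(41000, 52000, 38000, 61000)
)

# Hypothetical helper: keeps the known columns and perturbs income by up to 10%.
make_synthetic <- function(d) {
  d$income <- d$income * runif(nrow(d), 0.9, 1.1)
  d
}

# syndata is a list of synthetic data frames.
syndata <- list(make_synthetic(origdata), make_synthetic(origdata))

# Only attempt the call if the package is available.
if (requireNamespace("IdentificationRisk", quietly = TRUE)) {
  risk <- IdentificationRisk::IdentificationRisk(
    origdata      = origdata,
    syndata       = syndata,
    known         = c("sex", "region"),  # columns assumed known to an intruder
    syn           = "income",            # synthesized column
    r             = 0.1,                 # assumed to mean a 10% radius
    percentage    = TRUE,
    euclideanDist = FALSE
  )
}
```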

Arguments

origdata

dataframe of the original data

syndata

list of synthetic data frames

known

vector of the names of the columns in the dataset assumed to be known

syn

vector of the names of the columns in the dataset that are synthetic

r

radius within which continuous values are compared. The radius is either a percentage (default) or a fixed value, as set by percentage. Supply a single value to use the same radius for all continuous variables, or a vector to give each variable its own radius, ordered as those columns appear in the dataset.

percentage

TRUE (default) for a percentage radius, FALSE for a constant radius

euclideanDist

TRUE for a Euclidean distance radius, FALSE (default) otherwise
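
The difference between a percentage and a constant radius, and between a single r and a per-column vector, can be sketched in base R. The helper within_radius below is hypothetical and is not the package's implementation. It assumes a percentage radius is given as a fraction of the original value (0.05 for 5%) and that a record matches only when every continuous column falls inside its own interval. The Euclidean option is not covered.

```r
# Hypothetical helper, NOT part of the IdentificationRisk package.
# orig: numeric vector holding one record's continuous values
# cand: numeric matrix, one candidate record per row, same column order as orig
# r:    a single radius, or one radius per column
within_radius <- function(orig, cand, r, percentage = TRUE) {
  r <- rep_len(r, length(orig))                  # recycle a scalar r to every column
  # A percentage radius scales with the original value; a constant one is used as is.
  width <- if (percentage) abs(orig) * r else r
  diffs <- abs(sweep(cand, 2, orig))             # |candidate - original| per column
  inside <- sweep(diffs, 2, width, "<=")         # compare each column with its width
  apply(inside, 1, all)                          # TRUE if all columns are inside
}

orig <- c(income = 50000, age = 40)
cand <- rbind(c(52000, 41), c(60000, 40), c(50500, 47))

# 5% radius on every column: widths are 2500 and 2
pct_match <- within_radius(orig, cand, r = 0.05)
print(pct_match)   # TRUE FALSE FALSE

# Constant radii, one per column, in column order: 1000 for income, 10 for age
fixed_match <- within_radius(orig, cand, r = c(1000, 10), percentage = FALSE)
print(fixed_match) # FALSE FALSE TRUE
```

Under these assumptions the same candidate can match with one kind of radius and not the other, which is why the meaning of r should be read together with percentage.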


RyanHornby/IdentificationRisk documentation built on May 8, 2021, 5:23 a.m.