View source: R/multi.disclosure.R
multi.disclosure | R Documentation |
Calculates, prints and plots tables of disclosure measures for a set of
target variables, using a fixed set of keys that together form a quasi-identifier.
The disclosure measures are calculated for each target by the function disclosure.
This function can also be used with synthetic data NOT created by syn(),
or even with data made anonymous by other methods, such as sampling.
More details of the measures calculated can be found in the package vignette
"Disclosure measures for Synthetic Data".
## S3 method for class 'synds'
multi.disclosure(object, data,
  keys, targets = NULL, print.flag = TRUE,
  denom_lim = 5, exclude_ov_denom_lim = FALSE,
  not.targetslev = NULL,
  usetargetsNA = TRUE, usekeysNA = TRUE,
  exclude.keys = NULL, exclude.keylevs = NULL, exclude.targetlevs = NULL,
  ngroups_targets = NULL, ngroups_keys = NULL,
  ident.meas = "repU", attrib.meas = "DiSCO",
  thresh_1way = c(50, 90), thresh_2way = c(4, 80),
  digits = 2, plot = TRUE, ...)
## S3 method for class 'data.frame'
multi.disclosure(object, data, cont.na = NULL,
  keys, targets = NULL, print.flag = TRUE,
  denom_lim = 5, exclude_ov_denom_lim = FALSE,
  not.targetslev = NULL,
  usetargetsNA = TRUE, usekeysNA = TRUE,
  exclude.keys = NULL, exclude.keylevs = NULL, exclude.targetlevs = NULL,
  ngroups_targets = NULL, ngroups_keys = NULL,
  ident.meas = "repU", attrib.meas = "DiSCO",
  thresh_1way = c(50, 90), thresh_2way = c(4, 80),
  digits = 2, plot = TRUE, compare.synorig = TRUE, ...)
## S3 method for class 'list'
multi.disclosure(object, data, cont.na = NULL,
  keys, targets = NULL, print.flag = TRUE,
  denom_lim = 5, exclude_ov_denom_lim = FALSE,
  not.targetslev = NULL,
  usetargetsNA = TRUE, usekeysNA = TRUE,
  exclude.keys = NULL, exclude.keylevs = NULL, exclude.targetlevs = NULL,
  ngroups_targets = NULL, ngroups_keys = NULL,
  ident.meas = "repU", attrib.meas = "DiSCO",
  thresh_1way = c(50, 90), thresh_2way = c(4, 80),
  digits = 2, plot = TRUE, compare.synorig = TRUE, ...)
## S3 method for class 'multi.disclosure'
print(x, digits = NULL, plot = NULL, to.print = c("ident", "attrib"),
  ...)
object |
an object of class |
data |
the original (observed) data set. |
cont.na |
For data NOT supplied as a synthetic data object created by
|
keys |
a vector of strings with the names of the variables to be used in combination to form a quasi-identifier. |
targets |
a vector of strings with the names of variables to be used as
targets for the disclosure measures. Defaults to all variables in both original
and synthetic data that are not in |
denom_lim |
an integer that sets the limit above which a warning is given to check the two-way relationships for potential prior disclosure information. |
exclude_ov_denom_lim |
TRUE/FALSE according to whether disclosive groups with denominators > denom_lim should be excluded from disclosure measures. |
not.targetslev |
A vector of the same length as targets giving the level of each target to be excluded when calculating disclosure measures. Set the elements for unaffected targets to blanks. |
print.flag |
TRUE/FALSE according to whether a line is printed as the disclosure for each member of targets is calculated. |
usetargetsNA |
A logical vector of the same length as |
usekeysNA |
A logical vector of the same length as |
exclude.keys |
A list of same length as |
exclude.keylevs |
A list of same length as |
exclude.targetlevs |
A list of same length as |
ngroups_targets |
Unless set to NULL (the default) numeric target variables
will be grouped into |
ngroups_keys |
Unless set to NULL (the default) any numeric variable
will be grouped into categories If |
ident.meas |
Choice of statistics to use as a measure of identity disclosure.
Must be a selection from: |
attrib.meas |
Choice of statistics to use as a measure of attribute disclosure.
Must be a selection from: |
thresh_1way |
A vector of two numeric values, both of which need to be exceeded for a warning about a level of the target that may be dominating the results. The first is the count of all disclosive records, and the second is the percentage of all records for this level of the target. The default is c(50, 90), meaning a group of more than 50 disclosive records for this level of the target where they make up over 90% of all disclosive records. |
thresh_2way |
A vector of two numeric values, both of which need to be exceeded for a warning about a level of the target that may be dominating the results. The first is the count of all disclosive records for this key-target combination, and the second is the percentage of all disclosive records for this combination. The default shown in the usage above is c(4, 80), meaning a group of more than 4 records where over 80% of all the original values with this key have this level of the target. |
digits |
number of digits to print for the disclosure measures. |
plot |
determines if plot will be produced when the result is printed. |
print |
logical value that determines if a summary of results is to be printed. |
compare.synorig |
a logical value to determine if the functions
|
to.print |
Vector of items to be printed: "ident", "attrib", both, or NULL. |
... |
additional parameters |
x |
an object of class |
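The two-part rule described for thresh_1way (a warning only when both values are exceeded) can be illustrated in base R. This is only a sketch of the rule as worded above, not the code used inside the package: the counts and the level names are made up for illustration, and the way the package actually tabulates disclosive records is not shown here.

```r
# Hypothetical counts of disclosive records by level of one target
# (illustrative numbers only, not output from synthpop).
disclosive <- c(level_a = 120, level_b = 5, level_c = 3)

# Default taken from the usage section.
thresh_1way <- c(50, 90)

# Percentage of all disclosive records contributed by each level.
pct <- 100 * disclosive / sum(disclosive)

# A level is flagged only when BOTH thresholds are exceeded:
# the count threshold and the percentage threshold.
flagged <- disclosive > thresh_1way[1] & pct > thresh_1way[2]

# Names of the levels that would trigger a warning under this reading.
names(disclosive)[flagged]
```

A level with many disclosive records but a modest share of the total, or a large share of a very small total, is not flagged under this reading.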
Calculates measures of identity and attribute disclosure from the keys
specified in keys with the function disclosure. For attribute
disclosure a table with one line for each target can be printed or plotted.
Details are in the help file for disclosure.
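As a minimal sketch of the printing behaviour described above, the call below restricts the calculation to two targets and then prints only the attribute disclosure table. It assumes the synthpop package and its SD2011 data are available; the seed value and the object name t2 are arbitrary choices for illustration, and the arguments used are those listed in the usage section.

```r
library(synthpop)

# Original data: a subset of the SD2011 data supplied with synthpop.
ods <- SD2011[, c("sex", "age", "edu", "marital", "region", "income")]

# Synthesise quietly; the seed is an arbitrary value for reproducibility.
s1 <- syn(ods, seed = 2024, print.flag = FALSE)

# Restrict the calculation to two targets and suppress progress lines.
t2 <- multi.disclosure(s1, ods,
                       keys = c("sex", "age", "edu"),
                       targets = c("marital", "region"),
                       print.flag = FALSE, plot = FALSE)

# Print only the attribute disclosure table, without producing a plot.
print(t2, to.print = "attrib", plot = FALSE)
```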
An object of class multi.disclosure, which is a list with the following
components:
attrib.table |
a table with the selected attribute disclosure measure
( |
attrib.plot |
plot of attrib.table with labels indicating where large denominators suggest checking. |
keys |
see above. |
ident.orig |
value of identity disclosure |
ident.syn |
value of identity disclosure |
Norig |
Number of records in data. |
denom_lim |
see above. |
exclude_ov_denom_lim |
see above. |
digits |
see above. |
usetargetsNA |
see above. |
usekeysNA |
see above. |
ident.meas |
see above. |
attrib.meas |
see above. |
m |
see above. |
plot |
see above. |
output.list |
A named list with a component for each target
where each component is the output from the function
|
call |
R call used to create the object |
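The components listed above can be inspected directly from the returned list. The following is a sketch assuming synthpop and SD2011 are available; the seed and the object name t3 are arbitrary, and the exact layout of each component is not documented here, so the code only displays them.

```r
library(synthpop)

ods <- SD2011[, c("sex", "age", "edu", "marital", "region", "income")]
s1 <- syn(ods, seed = 2024, print.flag = FALSE)

t3 <- multi.disclosure(s1, ods,
                       keys = c("sex", "age", "edu"),
                       print.flag = FALSE)

# Names of the components held in the returned list.
names(t3)

# Attribute disclosure table, one line per target.
t3$attrib.table

# Identity disclosure values.
t3$ident.orig
t3$ident.syn

# Per-target output from disclosure(), stored as a named list.
names(t3$output.list)
```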
to follow link to vignette
disclosure
library(synthpop)

### original data: a subset of the SD2011 data supplied with synthpop
ods <- SD2011[, c("sex", "age", "edu", "marital", "region", "income")]
s1 <- syn(ods)

### synthetic data provided as a 'data.frame' object
t1 <- multi.disclosure(s1$syn, ods,
                       keys = c("sex", "age", "edu"))

### synthetic data provided as a 'synds' object
t1 <- multi.disclosure(s1, ods,
                       keys = c("sex", "age", "edu"))