localSuppression | R Documentation |
Algorithm to achieve k-anonymity by performing local suppression.
localSuppression(obj, k = 2, importance = NULL, combs = NULL, ...)
kAnon(obj, k = 2, importance = NULL, combs = NULL, ...)
obj | a
k | Threshold for k-anonymity
importance | Numeric vector of values between 1 and n (
combs | Numeric vector. If specified, the algorithm provides k-anonymity for each combination of n key variables (with n being the value of the ith element of this parameter). For example,
... | see additional arguments below:
The algorithm produces a k-anonymized data set by suppressing values in key variables. It is a heuristic that tries to suppress as few values as possible while respecting the specified importance vector. If no importance vector is specified, one is constructed so that key variables with many distinct categories are considered less important than key variables with few distinct categories.
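As a rough illustration of the ideas above, the following base-R sketch counts how often each key combination occurs, flags the records that violate k-anonymity, and derives a default importance ranking from the number of categories. It does not call sdcMicro. The toy data, the column names, and the assumption that importance 1 marks the most important variable are illustrative only, and the package's actual heuristic may differ.

```r
# Toy data; column names are hypothetical.
dat <- data.frame(
  sex    = c("m", "m", "f", "f", "f", "m"),
  region = c("A", "A", "A", "B", "B", "C"),
  stringsAsFactors = FALSE
)
key_vars <- c("sex", "region")
k <- 2

# Build one string key per record and look up how often it occurs.
key <- do.call(paste, c(unname(as.list(dat[key_vars])), sep = "|"))
fk <- as.integer(table(key)[key])

# Records whose key combination occurs fewer than k times violate k-anonymity.
violating <- which(fk < k)
print(fk)         # 2 2 1 2 2 1
print(violating)  # 3 6

# Assumed default ranking: fewer categories -> smaller rank (more important).
n_cat <- vapply(dat[key_vars], function(x) length(unique(x)), integer(1))
importance <- rank(n_cat, ties.method = "first")
print(importance) # sex = 1, region = 2
```

Local suppression would then set selected key values in the violating records to NA, preferring variables with a higher importance rank, until no record falls below k.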
The implementation provides k-anonymity per stratum if slot strataVar has been set in the sdcMicroObj-class object, or if parameter strataVars is used when applying the data.frame method. For details, see the examples provided.
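To see why stratification matters, the base-R sketch below compares key frequencies computed over the whole data set with frequencies computed within each stratum. It is only a conceptual illustration with hypothetical column names, not the package's implementation.

```r
# Toy data; "stratum" plays the role of a stratification variable.
dat <- data.frame(
  sex     = c("m", "m", "f", "f"),
  region  = c("A", "A", "B", "B"),
  stratum = c("s1", "s2", "s1", "s1"),
  stringsAsFactors = FALSE
)

# Frequency of each record's combination of the given variables.
fk_for <- function(df, vars) {
  key <- do.call(paste, c(unname(as.list(df[vars])), sep = "|"))
  as.integer(table(key)[key])
}

fk_overall <- fk_for(dat, c("sex", "region"))
fk_strata  <- fk_for(dat, c("stratum", "sex", "region"))

print(fk_overall) # 2 2 2 2  -> 2-anonymous over the whole data set
print(fk_strata)  # 1 1 2 2  -> records 1 and 2 are unique within their stratum
```

A data set that is 2-anonymous overall can therefore still need suppressions once k-anonymity is required within each stratum.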
For the parameter alpha:
alpha = 1 counts all wildcard matches (i.e. NAs match everything).
alpha = 0 assumes missing values form their own categories.
These are two extremes. With alpha = 0, frequencies are likely underestimated when NAs are present. If combs is used with alpha = 0, the heuristic nature of kAnon() may lead to technically correct, but not always intuitively understandable frequency evaluations.
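The two extremes can be sketched in base R as follows. The helper count_matches() is hypothetical and is not part of sdcMicro. It only mimics the two counting rules described above and ignores intermediate values of alpha.

```r
# Toy data with one missing value; column names are hypothetical.
dat <- data.frame(
  sex    = c("m", "m", NA, "f"),
  region = c("A", "A", "A", "B"),
  stringsAsFactors = FALSE
)

# For each record, count the records that match it on all variables.
# na_wildcard = TRUE : an NA matches any value (the alpha = 1 extreme).
# na_wildcard = FALSE: an NA matches only another NA (the alpha = 0 extreme).
count_matches <- function(df, na_wildcard) {
  n <- nrow(df)
  vapply(seq_len(n), function(i) {
    hit <- rep(TRUE, n)
    for (v in names(df)) {
      a <- df[[v]][i]
      b <- df[[v]]
      same <- !is.na(a) & !is.na(b) & b == a
      wild <- if (na_wildcard) is.na(a) | is.na(b) else is.na(a) & is.na(b)
      hit <- hit & (same | wild)
    }
    sum(hit)
  }, integer(1))
}

fk_wildcard <- count_matches(dat, na_wildcard = TRUE)
fk_own_cat  <- count_matches(dat, na_wildcard = FALSE)
print(fk_wildcard) # 3 3 3 1
print(fk_own_cat)  # 2 2 1 1
```

In this toy example the record with the missing value counts as a match for the first two records only under the wildcard rule, which is why treating NA as its own category yields lower frequencies.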
A modified dataset with suppressions that meets k-anonymity based on the specified key variables, or the modified sdcMicroObj-class object.
Deprecated methods localSupp2 and localSupp2Wrapper are no longer available in sdcMicro versions > 4.5.0.
kAnon() is a more intuitive term for local suppression, since the goal is to achieve k-anonymity.
Bernhard Meindl, Matthias Templ
Templ, M. Statistical Disclosure Control for Microdata: Methods and Applications in R. Springer International Publishing, 287 pages, 2017. ISBN: 978-3-319-50272-4. doi:10.1007/978-3-319-50272-4
Templ, M., Kowarik, A., Meindl, B. Statistical Disclosure Control for Micro-Data Using the R Package sdcMicro. Journal of Statistical Software, 67(4), 1–36, 2015. doi:10.18637/jss.v067.i04
data(francdat)
## Local Suppression
localS <- localSuppression(francdat, keyVars = c(4, 5, 6))
localS
plot(localS)
## for objects of class sdcMicro, no stratification
data(testdata2)
kv <- c("urbrur", "roof", "walls", "water", "electcon", "relat", "sex")
sdc <- createSdcObj(testdata2, keyVars = kv, w = "sampling_weight")
sdc <- localSuppression(sdc)
## for objects of class sdcMicro, with stratification
testdata2$ageG <- cut(testdata2$age, 5, labels = paste0("AG", 1:5))
sdc <- createSdcObj(
  dat = testdata2,
  keyVars = kv,
  w = "sampling_weight",
  strataVar = "ageG"
)
sdc <- localSuppression(sdc, nc = 1)
## it is also possible to provide k-anonymity for subsets of key-variables
## with different parameter k!
## in this case we want to provide 10-anonymity for all combinations
## of 5 key variables, 20-anonymity for all combinations with 4 key variables
## and 30-anonymity for all combinations of 3 key variables.
sdc <- createSdcObj(testdata2, keyVars = kv, w = "sampling_weight")
combs <- 5:3
k <- c(10, 20, 30)
sdc <- localSuppression(sdc, k = k, combs = combs)
## data.frame method (no stratification)
inp <- testdata2[, c(kv, "ageG")]
ls <- localSuppression(inp, keyVars = 1:7)
print(ls)
plot(ls)
## data.frame method (with stratification)
ls <- kAnon(inp, keyVars = 1:7, strataVars = 8)
print(ls)
plot(ls)
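The idea behind pairing combs with a vector of k values can be sketched in base R: for each subset size, every combination of that many key variables is checked against the corresponding threshold. The helper names and the toy data are hypothetical, and the sketch only checks the condition. It performs no suppression and does not call sdcMicro.

```r
# Toy data with three key variables; names are hypothetical.
dat <- data.frame(
  a = c("x", "x", "y", "y"),
  b = c("p", "p", "q", "q"),
  c = c(1, 2, 1, 2),
  stringsAsFactors = FALSE
)
key_vars <- c("a", "b", "c")

# Frequency of each record's combination of the given variables.
fk_for <- function(df, vars) {
  key <- do.call(paste, c(unname(as.list(df[vars])), sep = "|"))
  as.integer(table(key)[key])
}

# Smallest frequency over all subsets of `size` key variables.
min_fk_by_size <- function(df, key_vars, size) {
  subsets <- combn(key_vars, size, simplify = FALSE)
  min(vapply(subsets, function(v) min(fk_for(df, v)), integer(1)))
}

combs <- c(3, 2)  # subset sizes to check
k     <- c(1, 2)  # threshold for each subset size
min_fk <- vapply(combs, function(s) min_fk_by_size(dat, key_vars, s), integer(1))
satisfied <- min_fk >= k
print(min_fk)     # 1 1
print(satisfied)  # TRUE FALSE
```

Here the data meet 1-anonymity on all three key variables but fail 2-anonymity on pairs, because the pair (a, c) identifies every record uniquely.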