The function measures the disclosure risk for weighted or unweighted data. It computes the individual risk (and household risk if reasonable) and the global risk. It also computes a risk threshold based on a global risk value.
Prints a 'measure_risk' object
Prints an 'ldiversity' object
measure_risk(obj, ...)
ldiversity(obj, ldiv_index = NULL, l_recurs_c = 2, missing = 999, ...)
## S3 method for class 'measure_risk'
print(x, ...)
## S3 method for class 'ldiversity'
print(x, ...)

obj 
Object of class 
... 
see arguments below

ldiv_index 
indices (or names) of the variables used for l-diversity 
l_recurs_c 
l-Diversity constant 
missing 
an integer value to be used as missing value in the C++ routine 
x 
Output of measure_risk() or ldiversity() 
To be used when the risk of disclosure for individuals within a family is considered to be statistically independent.
Internally, the functions freqCalc() and indivRisk() are used for estimation.
Measuring individual risk: The individual risk approach is based on so-called superpopulation models. In such models, population frequency counts are modeled given a certain distribution. The estimation procedure of sample frequency counts given the population frequency counts is modeled by assuming a negative binomial distribution. This is used for the estimation of the individual risk. The extensive theory can be found in Skinner (1998); the approximation formulas used for the individual risk are described in Franconi and Polettini (2004).
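The frequency counts that the risk estimation builds on can be illustrated in base R. This is a minimal sketch on made-up data, not the package's freqCalc() implementation: the data frame `toy` and the column names `fk` (sample frequency) and `Fk` (weighted, grossed-up frequency) are assumptions chosen for illustration.

```r
# Toy data: two key variables and a sampling weight (all values are made up).
toy <- data.frame(
  urbrur = c(1, 1, 2, 2, 2),
  sex    = c(1, 1, 1, 2, 2),
  w      = c(10, 20, 5, 15, 25)
)

# One pattern id per observed combination of the key variables.
pattern <- interaction(toy$urbrur, toy$sex, drop = TRUE)

# fk: number of sample records that share the pattern.
toy$fk <- ave(rep(1, nrow(toy)), pattern, FUN = sum)

# Fk: sum of the sampling weights within the pattern (grossed-up count).
toy$Fk <- ave(toy$w, pattern, FUN = sum)

print(toy)
```

Records with a small `fk` and a small `Fk` are the ones for which the model-based individual risk tends to be high.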
Measuring hierarchical risk: If “hid” (the index of the variable holding information on the hierarchical cluster structure, e.g., individuals that are clustered in households) is provided, the hierarchical risk is additionally estimated. Note that the risk of re-identifying an individual within a household may also affect the probability of disclosure of other members of the same household. Thus, the household or cluster structure of the data must be taken into account when estimating disclosure risks. It is commonly assumed that the risk of re-identification of a household is the risk that at least one member of the household can be disclosed. This probability can therefore be estimated from the individual risks as 1 minus the probability that no member of the household can be identified.
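The household rule described above can be sketched in a few lines of base R. The vectors `risk` and `hid` hold hypothetical values, and assigning the household-level value back to every member is an assumption of this sketch rather than a statement about the package internals.

```r
# Hypothetical individual risks for five people in two households.
risk <- c(0.10, 0.20, 0.05, 0.50, 0.50)
hid  <- c(1, 1, 1, 2, 2)

# Household risk = 1 - P(no member is re-identified),
# treating the members' risks as independent.
hh_risk <- tapply(risk, hid, function(r) 1 - prod(1 - r))

# Assumption of this sketch: each member carries the household-level risk.
hier_risk <- as.numeric(hh_risk[as.character(hid)])

print(hier_risk)
```

The household risk is never smaller than the largest individual risk in the household, which is why ignoring the cluster structure understates the risk.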
Global risk: The sum of the individual risks in the dataset gives the expected number of re-identifications, which serves as a measure of the global risk.
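As a small worked example with hypothetical risk values, the expected number of re-identifications is simply the sum of the individual risks. Expressing it relative to the number of records, as done in the last two lines, is an assumption of this sketch about how a percentage could be derived.

```r
# Hypothetical individual risks for five records.
risk <- c(0.10, 0.20, 0.05, 0.50, 0.50)

# Expected number of re-identifications: sum of the individual risks.
expected_reid <- sum(risk)

# Assumption: relate the expected count to the number of records
# to obtain a share and a percentage.
share     <- expected_reid / length(risk)
share_pct <- 100 * share

print(c(expected = expected_reid, share = share, pct = share_pct))
```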
l-Diversity: If “ldiv_index” is not NULL, i.e. if the indices of sensitive variables are specified, various measures of l-diversity are calculated. l-diversity is an extension of the well-known k-anonymity approach in which the uniqueness of the sensitive variables within each pattern spanned by the key variables is also evaluated.
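To make the idea concrete, the following base R sketch computes two of the measures on made-up data. It follows the definitions in Machanavajjhala et al. (2007), where a group is entropy l-diverse if the entropy of its sensitive values is at least log(l); whether the package's C++ routine reports exactly these quantities is not confirmed here. The data frame `toy` and the helper `entropy_l()` are hypothetical.

```r
# Toy data: two key variables and one sensitive variable (values are made up).
toy <- data.frame(
  urbrur   = c(1, 1, 1, 2, 2),
  sex      = c(1, 1, 1, 2, 2),
  electcon = c("a", "b", "b", "c", "c")
)

# One pattern id per observed combination of the key variables.
pattern <- interaction(toy$urbrur, toy$sex, drop = TRUE)

# Distinct l-diversity: number of distinct sensitive values per pattern.
distinct <- tapply(toy$electcon, pattern, function(v) length(unique(v)))
toy$distinct_l <- as.numeric(distinct[as.character(pattern)])

# Entropy l-diversity: exp() of the Shannon entropy of the sensitive values.
entropy_l <- function(v) {
  p <- as.numeric(table(v)) / length(v)
  exp(-sum(p * log(p)))
}
ent <- tapply(toy$electcon, pattern, entropy_l)
toy$entropy_l <- as.numeric(ent[as.character(pattern)])

print(toy)
```

In this toy data the second pattern satisfies 2-anonymity, yet both of its records share the same sensitive value, so its l-diversity is 1 and the sensitive attribute is disclosed.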
A modified sdcMicroObj-class object or a list with the following elements:
global_risk_ER: expected number of re-identifications.
global_risk: global risk (sum of individual risks).
global_risk_pct: global risk in percent.
Res: matrix with the risk, the frequency in the sample and the grossed-up frequency in the population (and the hierarchical risk) for each observation.
global_threshold: for a given max_global_risk the threshold for the risk of observations.
max_global_risk: the input max_global_risk of the function.
hier_risk_ER: expected number of re-identifications with household structure.
hier_risk: global risk with household structure (sum of individual risks).
hier_risk_pct: global risk with household structure in percent.
ldiverstiy: Matrix with Distinct_Ldiversity, Entropy_Ldiversity and Recursive_Ldiversity for each sensitive variable.
Prints risk information to the console
Prints information on l-diversity measures to the console
Alexander Kowarik, Bernhard Meindl, Matthias Templ, Bernd Prantner, minor parts of IHSN C++ source
Franconi, L. and Polettini, S. (2004) Individual risk estimation in mu-Argus: a review. Privacy in Statistical Databases, Lecture Notes in Computer Science, 262–272. Springer
Machanavajjhala, A. and Kifer, D. and Gehrke, J. and Venkitasubramaniam, M. (2007) l-Diversity: Privacy Beyond k-Anonymity. ACM Trans. Knowl. Discov. Data, 1(1)
Templ, M. Statistical Disclosure Control for Microdata: Methods and Applications in R. Springer International Publishing, 287 pages, 2017. ISBN 978-3-319-50272-4. doi: 10.1007/978-3-319-50272-4.
Templ, M. and Kowarik, A. and Meindl, B. Statistical Disclosure Control for Micro-Data Using the R Package sdcMicro. Journal of Statistical Software, 67 (4), 1–36, 2015. doi: 10.18637/jss.v067.i04
## measure_risk with sdcMicro objects:
data(testdata)
sdc <- createSdcObj(testdata,
  keyVars=c('urbrur','roof','walls','water','electcon'),
  numVars=c('expend','income','savings'), w='sampling_weight')
## risk is already estimated and available in...
names(sdc@risk)
## measure risk on data frames or matrices:
res <- measure_risk(testdata,
  keyVars=c("urbrur","roof","walls","water","sex"))
print(res)
head(res$Res)
resw <- measure_risk(testdata,
  keyVars=c("urbrur","roof","walls","water","sex"), w="sampling_weight")
print(resw)
head(resw$Res)
res1 <- ldiversity(testdata,
  keyVars=c("urbrur","roof","walls","water","sex"), ldiv_index="electcon")
print(res1)
head(res1)
res2 <- ldiversity(testdata,
  keyVars=c("urbrur","roof","walls","water","sex"), ldiv_index=c("electcon","relat"))
print(res2)
head(res2)
# measure risk with household risk
resh <- measure_risk(testdata,
  keyVars=c("urbrur","roof","walls","water","sex"), w="sampling_weight", hid="ori_hid")
print(resh)
# change max_global_risk
rest <- measure_risk(testdata,
  keyVars=c("urbrur","roof","walls","water","sex"),
  w="sampling_weight", max_global_risk=0.0001)
print(rest)
## for objects of class sdcMicro:
data(testdata2)
sdc <- createSdcObj(testdata2,
  keyVars=c('urbrur','roof','walls','water','electcon','relat','sex'),
  numVars=c('expend','income','savings'), w='sampling_weight')
## already internally applied and available in object sdc:
## sdc <- measure_risk(sdc)
