protectTable: protecting 'sdcProblem-class' objects


Description

Function protectTable protects primary sensitive table cells (which have usually been identified and set using primarySuppression) according to the chosen method and its parameters. Additional parameters that control the protection algorithm are passed through the ... argument.

Usage

protectTable(object, method, ...)

Arguments

object

an sdcProblem-class object that has been created using makeProblem and modified by primarySuppression

method

a character vector of length 1 specifying the algorithm that should be used to protect the primary sensitive table cells. Allowed values are:

  • OPT: protect the complete problem at once using a cut and branch algorithm. The optimal algorithm should be used for small problem-instances only.

  • HITAS: split the overall problem into smaller problems. These problems are protected using a top-down approach.

  • HYPERCUBE: protect the complete problem by protecting sub-tables with a fast heuristic that is based on finding and suppressing geometric structures (n-dimensional cubes) that are required to protect primary sensitive table cells.

  • SIMPLEHEURISTIC: a quick heuristic procedure that can be applied to very large problem instances.
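
Since method must be exactly one of the four strings listed above, a caller may want to validate it before handing the object to protectTable. The following is a minimal sketch in base R; checkMethod and allowedMethods are hypothetical names that are not part of sdcTable, and the exact (case-sensitive) matching is a choice made for this sketch.

```r
# Allowed values for 'method', copied from the list above
allowedMethods <- c("OPT", "HITAS", "HYPERCUBE", "SIMPLEHEURISTIC")

# checkMethod() is a hypothetical helper, not an sdcTable function.
# It returns 'method' unchanged or stops with an informative error.
checkMethod <- function(method) {
  if (!is.character(method) || length(method) != 1L) {
    stop("'method' must be a character vector of length 1")
  }
  if (!method %in% allowedMethods) {
    stop("'method' must be one of: ", paste(allowedMethods, collapse = ", "))
  }
  method
}

checkMethod("HITAS")
```

Checking up front keeps a typo in method from surfacing only after a large problem has been passed on.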

...

parameters used in the protection algorithm that has been selected. Parameters that can be changed are:

  • general parameters include:

    • verbose: logical vector of length 1 defining whether verbose output should be produced. Parameter verbose defaults to 'FALSE'

    • save: logical vector of length 1 defining if temporary results should be saved in the current working directory (TRUE) or not (FALSE). Parameter save defaults to 'FALSE'

  • parameters used for HITAS|OPT procedures:

    • solver: character vector of length 1 defining the solver to be used. Currently available choices are limited to 'glpk'.

    • timeLimit: numeric vector of length 1 (or NULL) defining a time limit in minutes after which the cut and branch algorithm should stop and return a possibly non-optimal solution. Parameter timeLimit has a default value of 'NULL'

    • maxVars: a numeric vector of length 1 (or NULL) defining the maximum problem size in terms of decision variables for which an optimization should be tried. If the number of decision variables in the current problem is larger than parameter maxVars, only a possibly non-optimal, heuristic solution is calculated. Parameter maxVars has a default value of 'NULL'

    • fastSolution: logical vector of length 1 defining whether the cut and branch algorithm is started or whether the possibly non-optimal heuristic solution is returned regardless of parameter maxVars. Parameter fastSolution has a default value of 'FALSE'

    • fixVariables: logical vector of length 1 defining whether the algorithm should try to fix some variables to zero or one based on reduced costs early in the cut and branch algorithm. Parameter fixVariables has a default value of 'TRUE'

    • approxPerc: numeric vector of length 1 that defines a percentage within which an integer solution of the cut and branch algorithm is accepted as optimal with respect to the upper bound given by the (relaxed) solution of the master problem. Its default value is set to '10'

    • useC: logical vector of length 1 defining whether the C++ implementation of the secondary cell suppression problem should be used. Defaults to FALSE

  • parameters used for HYPERCUBE procedure:

    • protectionLevel: numeric vector of length 1 specifying the required protection level for the HYPERCUBE-procedure. Its default value is 80

    • suppMethod: character vector of length 1 defining the rule for selecting the 'optimal' cube to protect a single sensitive cell. Possible choices are:

      • minSupps: minimize the number of additional secondary suppressions (this is also the default setting).

      • minSum: minimize the sum of counts of additional suppressed cells

      • minSumLogs: minimize the log of the sum of additional suppressed cells

    • suppAdditionalQuader: logical vector of length 1 specifying whether additional cubes should be suppressed if any secondary suppressions in the 'optimal' cube are 'singletons'. Parameter suppAdditionalQuader has a default value of 'FALSE'

  • parameter used for protectLinkedTables():

    • maxIter: numeric vector of length 1 specifying the maximal number of iterations that should be made while trying to protect common cells of two different tables. The default value of parameter maxIter is 10

  • parameters used for SIMPLEHEURISTIC procedure:

    • detectSingletons: logical defining whether a singleton-detection procedure should be run before protecting the data. Defaults to FALSE.
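
The parameters above are passed by name through the ... argument. One way to keep a set of them together is a named list that is handed to do.call. The sketch below assumes that an object named problem (an sdcProblem-class object that has already gone through primarySuppression) exists in the workspace. The parameter names are taken from the list above, the values are purely illustrative, and buildArgs is a hypothetical helper that is not part of sdcTable.

```r
# Parameters for the HITAS/OPT procedures, using the names documented above.
# The values are illustrative choices, not recommendations.
hitasPars <- list(
  verbose    = TRUE,   # print progress information
  timeLimit  = 10,     # stop the cut and branch algorithm after 10 minutes
  approxPerc = 10,     # documented default
  useC       = FALSE   # documented default
)

# buildArgs() is a hypothetical helper, not part of sdcTable.
# It puts 'object' and 'method' first and appends the named parameters.
buildArgs <- function(object, method, pars) {
  stopifnot(is.list(pars), !is.null(names(pars)), all(nzchar(names(pars))))
  c(list(object = object, method = method), pars)
}

# The call itself only runs if sdcTable is installed and 'problem' exists
# (both are assumptions of this sketch).
if (requireNamespace("sdcTable", quietly = TRUE) && exists("problem")) {
  callArgs <- buildArgs(problem, "HITAS", hitasPars)
  protectedData <- do.call(sdcTable::protectTable, callArgs)
}
```

Keeping the parameters in a list makes it easy to reuse the same settings for several tables or to record them next to the results.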

Details

The implemented methods may contain bugs that result in tables that are not fully protected. In particular, using OPT, HITAS and HYPERCUBE in production is not recommended, as these methods may eventually be removed completely. If you encounter any problems, please report them or use Tau-Argus (http://research.cbs.nl/casc/tau.htm).

Value

a safeObj-class object

Author(s)

Bernhard Meindl bernhard.meindl@statistik.gv.at

Examples

# attach the package so that it appears on the search path used below
library(sdcTable)

# load problem (as it was created after performing primary suppression
# in the example of primarySuppression)
sp <- searchpaths()
fn <- paste(sp[grep("sdcTable", sp)], "/data/problemWithSupps.RData", sep="")
problem <- get(load(fn))

# protect the table using the 'HITAS' algorithm with verbose output
protectedData <- protectTable(problem, method='HITAS', verbose=TRUE, useC=TRUE)

# showing a summary
summary(protectedData)

# looking at the final table with the resulting suppression pattern
print(getInfo(protectedData, type='finalData'))
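
As a variation on the call above, the same problem object could be protected with the HYPERCUBE heuristic. The sketch below assumes that problem was loaded as shown in the example. protectWithHypercube is a hypothetical wrapper that is not part of sdcTable; it only checks the two documented HYPERCUBE parameters protectionLevel and suppMethod before delegating to protectTable.

```r
# Hypothetical wrapper, not part of sdcTable: validates the documented
# HYPERCUBE parameters before delegating to protectTable().
protectWithHypercube <- function(object, protectionLevel = 80,
                                 suppMethod = "minSupps") {
  stopifnot(is.numeric(protectionLevel), length(protectionLevel) == 1L)
  # only the three documented rules are accepted
  suppMethod <- match.arg(suppMethod, c("minSupps", "minSum", "minSumLogs"))
  sdcTable::protectTable(object, method = "HYPERCUBE",
                         protectionLevel = protectionLevel,
                         suppMethod = suppMethod)
}

# Usage, assuming 'problem' exists as in the example above:
# protectedCube <- protectWithHypercube(problem, suppMethod = "minSum")
# summary(protectedCube)
```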

bernhard-da/sdcTable documentation built on June 10, 2019, 4:54 a.m.