View source: R/SuppressLinkedTables.R
SuppressLinkedTables | R Documentation |
Provides alternatives to global protection for linked tables through methods that may reduce the computational burden.
SuppressLinkedTables(
  data = NULL,
  fun,
  ...,
  withinArg = NULL,
  linkedGauss = "super-consistent",
  linkedIntervals = ifelse(linkedGauss == "local-bdiag", "local-bdiag",
    "super-consistent"),
  lpPackage = NULL,
  recordAware = TRUE,
  iterBackTracking = Inf,
  whenEmptyUnsuppressed = NULL
)
data |
The |
fun |
A function: |
... |
Arguments to |
withinArg |
A list of named lists. Arguments to |
linkedGauss |
Specifies the strategy for protecting linked tables.
The
|
linkedIntervals |
This parameter controls how interval calculations,
triggered by the
|
lpPackage |
See |
recordAware |
If |
iterBackTracking |
Maximum number of back-tracking iterations. |
whenEmptyUnsuppressed |
Parameter to |
The reason for introducing the new method "consistent", which has not yet been extensively tested in practice, is to provide something that works better than "back-tracking" while still offering equally strong protection.
Note that for singleton methods of the elimination type (see SSBtools::NumSingleton()), "back-tracking" may lead to the creation of a large number of redundant secondary cells. This is because, during the method's iterations, all secondary cells are eventually treated as primary. As a result, protection is applied to prevent a singleton contributor from inferring a secondary cell that was only included to protect that same contributor.
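As a rough way to observe the effect described above, the following sketch runs the first documented example under both "consistent" and "back-tracking" and counts the suppressed cells in each table. It assumes that every returned data frame has a logical suppressed column, as is usual for output from this package. The helper count_suppressed() and the object within_arg are hypothetical names introduced only for this illustration.

```r
library(GaussSuppression)

# Same three linked tables as in the first documented example
within_arg <- list(
  list(formula = ~(geo + eu) * sector2 + sector4 - sector4),
  list(formula = ~eu:sector4 - 1 + geo - geo),
  list(formula = ~geo + eu + sector4 - 1))

# Hypothetical helper: number of suppressed cells per table for a given method
# (assumes each output data frame has a logical 'suppressed' column)
count_suppressed <- function(method) {
  out <- SuppressLinkedTables(data = SSBtools::SSBtoolsData("magnitude1"),
                              fun = SuppressDominantCells,
                              withinArg = within_arg,
                              dominanceVar = "value",
                              pPercent = 10,
                              contributorVar = "company",
                              linkedGauss = method)
  sapply(out, function(tab) sum(tab$suppressed))
}

n_consistent <- count_suppressed("consistent")
n_backtracking <- count_suppressed("back-tracking")

# Side-by-side comparison, one row per method and one column per table
print(rbind(consistent = n_consistent, back_tracking = n_backtracking))
```

Whether "back-tracking" actually produces more secondary cells than "consistent" depends on the data and on the singleton method in use, so the counts should be read as a diagnostic rather than a guaranteed ordering.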
Note that the frequency singleton methods "subSpace", "anySum0", and "anySumNOTprimary" are currently not implemented and will result in an error. As a result, the singletonZeros parameter in the SuppressDominantCells() function cannot be set to TRUE, and the SuppressKDisclosure() function is not available for use.
Also note that automatic forcing of "anySumNOTprimary" is disabled. That is, SSBtools::GaussSuppression() is called with auto_anySumNOTprimary = FALSE. See the parameter documentation for an explanation of why FALSE is required.
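The restriction on singletonZeros mentioned above can be probed with a small sketch. It assumes that singletonZeros is passed on to SuppressDominantCells() through the ... argument, and it wraps the call in tryCatch() so that the code runs whether or not an error is signalled in the installed package version.

```r
library(GaussSuppression)

# Try a configuration that, according to the text above, is not supported.
# The error (if any) is captured instead of stopping the script.
res <- tryCatch(
  SuppressLinkedTables(data = SSBtools::SSBtoolsData("magnitude1"),
                       fun = SuppressDominantCells,
                       withinArg = list(list(dimVar = c("geo", "sector2")),
                                        list(dimVar = c("eu", "sector4"))),
                       dominanceVar = "value",
                       pPercent = 10,
                       contributorVar = "company",
                       singletonZeros = TRUE),
  error = function(e) e)

if (inherits(res, "error")) {
  message("Error signalled: ", conditionMessage(res))
} else {
  message("No error was signalled; the restriction may differ in this package version.")
}
```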
A list of data frames, or, if withinArg is NULL, the ordinary output from fun.
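A minimal sketch of inspecting the returned value, reusing the input of the second documented example. It assumes that the list has one data frame per element of withinArg and that each data frame has a logical suppressed column.

```r
library(GaussSuppression)

z1 <- SSBtools::SSBtoolsData("z1")
z2 <- SSBtools::SSBtoolsData("z2")
z2b <- z2[3:5]
names(z2b)[1] <- "region"

b <- SuppressLinkedTables(fun = SuppressSmallCounts,
                          linkedGauss = "consistent",
                          recordAware = FALSE,
                          withinArg = list(
                            list(data = z1, dimVar = 1:2, freqVar = 3, maxN = 5),
                            list(data = z2b, dimVar = 1:2, freqVar = 3, maxN = 5),
                            list(data = z2, dimVar = 1:4, freqVar = 5, maxN = 1)))

# Check the assumed structure: a list with one data frame per withinArg element
stopifnot(is.list(b), length(b) == 3,
          all(vapply(b, is.data.frame, logical(1))))

# Show only the suppressed cells of the first table
print(subset(b[[1]], suppressed))
```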
Note on differences between SuppressLinkedTables() and alternative approaches. By alternatives, we refer to using the linkedGauss parameter via GaussSuppressionFromData(), its wrappers, or through tables_by_formulas(), as shown in the examples below. Alternatives can be used when only the formula parameter varies between the linked tables.
SuppressLinkedTables() creates several smaller model matrices, which may be combined into a single block-diagonal matrix. A large overall matrix is never created. With the alternatives, a large overall matrix is created first. Smaller matrices are then derived from it. If the size of the full matrix is a bottleneck, SuppressLinkedTables() is the better choice.
The "global" method is available with the alternatives, but not with SuppressLinkedTables().
The collapseAware parameter is supported by the alternatives, but not by SuppressLinkedTables(). This option may improve coordination across tables. See GaussSuppressionFromData().
Due to differences in candidate ordering, the two methods may not always produce identical results. With the alternatives, candidate order is constructed globally across all cells (as with the global method). In contrast, SuppressLinkedTables() uses a locally determined candidate order within each table. The ordering across tables is coordinated to ensure the method works, but it is not based on a strictly defined global order. This may lead to some differences.
With the alternatives, linkedIntervals may also contain "global". See the documentation of the linkedIntervals parameter above and in GaussSuppressionFromData().
### The first example can be performed in three ways
### Alternatives are possible since only the formula parameter varies between the linked tables
a <- SuppressLinkedTables(data = SSBtoolsData("magnitude1"),  # With trick "sector4 - sector4" and
                          fun = SuppressDominantCells,        # "geo - geo" to ensure same names in output
                          withinArg = list(list(formula = ~(geo + eu) * sector2 + sector4 - sector4),
                                           list(formula = ~eu:sector4 - 1 + geo - geo),
                                           list(formula = ~geo + eu + sector4 - 1)),
                          dominanceVar = "value",
                          pPercent = 10,
                          contributorVar = "company",
                          linkedGauss = "consistent")
print(a)
# Alternatively, SuppressDominantCells() can be run directly using the linkedGauss parameter
a1 <- SuppressDominantCells(SSBtoolsData("magnitude1"),
                            formula = list(table_1 = ~(geo + eu) * sector2,
                                           table_2 = ~eu:sector4 - 1,
                                           table_3 = ~(geo + eu) + sector4 - 1),
                            dominanceVar = "value",
                            pPercent = 10,
                            contributorVar = "company",
                            linkedGauss = "consistent")
print(a1)
# In fact, tables_by_formulas() is also a possibility
a2 <- tables_by_formulas(SSBtoolsData("magnitude1"),
                         table_fun = SuppressDominantCells,
                         table_formulas = list(table_1 = ~region * sector2,
                                               table_2 = ~region1:sector4 - 1,
                                               table_3 = ~region + sector4 - 1),
                         substitute_vars = list(region = c("geo", "eu"), region1 = "eu"),
                         collapse_vars = list(sector = c("sector2", "sector4")),
                         dominanceVar = "value",
                         pPercent = 10,
                         contributorVar = "company",
                         linkedGauss = "consistent")
print(a2)
#### The second example cannot be handled using the alternative methods.
#### This is similar to the (old) LazyLinkedTables() example.
z1 <- SSBtoolsData("z1")
z2 <- SSBtoolsData("z2")
z2b <- z2[3:5] # As in ChainedSuppression example
names(z2b)[1] <- "region"
# As 'f' and 'e' in ChainedSuppression example.
# 'A' 'annet'/'arbeid' suppressed in b[[1]], since suppressed in b[[3]].
b <- SuppressLinkedTables(fun = SuppressSmallCounts,
                          linkedGauss = "consistent",
                          recordAware = FALSE,
                          withinArg = list(
                            list(data = z1, dimVar = 1:2, freqVar = 3, maxN = 5),
                            list(data = z2b, dimVar = 1:2, freqVar = 3, maxN = 5),
                            list(data = z2, dimVar = 1:4, freqVar = 5, maxN = 1)))
print(b)
##################################
#### Examples with intervals
##################################
lpPackage <- "highs"
if (requireNamespace(lpPackage, quietly = TRUE)) {
  # Common cells occur because the default for recordAware is TRUE
  out1 <- SuppressLinkedTables(data = SSBtoolsData("magnitude1"),
                               fun = SuppressDominantCells,
                               withinArg = list(table_1 = list(dimVar = c("geo", "sector2")),
                                                table_2 = list(dimVar = c("eu", "sector4"))),
                               dominanceVar = "value", k = 90, contributorVar = "company",
                               lpPackage = lpPackage, rangeMin = 50)
  print(out1)
  # In the algorithm, common cells occur because recordAware is TRUE,
  # although this is not reflected in the output variables table_1 and table_2
  out2 <- tables_by_formulas(data = SSBtoolsData("magnitude1"),
                             table_fun = SuppressDominantCells,
                             table_formulas = list(table_1 = ~geo * sector2,
                                                   table_2 = ~eu * sector4),
                             substitute_vars = list(region = c("geo", "eu"),
                                                    sector = c("sector2", "sector4")),
                             dominanceVar = "value", k = 90, contributorVar = "company",
                             linkedGauss = "super-consistent",
                             lpPackage = lpPackage, rangeMin = 50,
                             linkedIntervals = c("super-consistent", "local-bdiag", "global"))
  print(out2)
} else {
  message(paste0("Examples skipped because the '", lpPackage, "' package is not installed."))
}