MagnitudeRule: Dominance (n,k) or p% rule for magnitude tables


MagnitudeRule    R Documentation

Dominance (n,k) or p% rule for magnitude tables

Description

Flags cells in magnitude tables that breach the (n,k) dominance rule or the p% rule. Several values of n and k can be applied at once. Magnitude tables containing negative cell values are handled by calculating contributions from absolute values.

Usage

MagnitudeRule(
  data,
  x,
  numVar,
  n = NULL,
  k = NULL,
  pPercent = NULL,
  protectZeros = FALSE,
  charVar = NULL,
  removeCodes = character(0),
  removeCodesFraction = 1,
  sWeightVar = NULL,
  domWeightMethod = "default",
  allDominance = FALSE,
  outputWeightedNum = !is.null(sWeightVar),
  dominanceVar = NULL,
  structuralEmpty = FALSE,
  apply_abs_directly = FALSE,
  max_contribution_output = NULL,
  num,
  ...
)

DominanceRule(data, n, k, protectZeros = FALSE, ...)

PPercentRule(data, pPercent, protectZeros = FALSE, ...)

Arguments

data

the dataset

x

ModelMatrix generated by parent function

numVar

vector containing numeric values in the data set

n

Parameter n in dominance rule.

k

Parameter k in dominance rule.
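As a rough illustration of what n and k mean, the sketch below applies the usual textbook form of the (n,k) rule to a single cell: the cell breaches if its n largest contributors hold more than k percent of the cell total. The helper top_n_share is hypothetical and not part of the package, and the strict comparison is an assumption.

```r
# Hypothetical helper (not part of GaussSuppression): share of a cell's
# absolute total held by its n largest contributors.
top_n_share <- function(values, n) {
  contrib <- sort(abs(values), decreasing = TRUE)
  sum(head(contrib, n)) / sum(contrib)
}

cell <- c(70, 30)  # contributions to one cell
n <- c(1, 2)       # paired with k element by element
k <- c(80, 70)

# Share held by the largest contributor and by the two largest together
shares <- sapply(n, function(ni) top_n_share(cell, ni))

# Assumption: a rule is breached when the share strictly exceeds k percent
breaches <- shares > k / 100

# The cell is flagged if any (n, k) pair is breached
any(breaches)
```

Here the single largest contributor stays below 80 percent, but the two largest together exceed 70 percent, so the cell would be flagged under this reading.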

pPercent

Parameter in the p% rule, when non-NULL. Parameters n and k will then be ignored. Technically, calculations are performed internally as if n = 1:2. The results of these intermediate calculations can be viewed by setting allDominance = TRUE.
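The reason only n = 1:2 is needed can be seen from the textbook form of the p% rule, sketched below for one cell: the cell is sensitive when everything except the two largest contributions is less than p percent of the largest. The helper p_percent_breach is hypothetical, and the package's internal expression may differ in detail.

```r
# Hypothetical helper (not part of GaussSuppression): textbook p% rule.
p_percent_breach <- function(values, p) {
  contrib <- sort(abs(values), decreasing = TRUE)
  largest <- contrib[1]                  # what the n = 1 calculation provides
  top_two <- sum(head(contrib, 2))       # what the n = 2 calculation provides
  remainder <- sum(contrib) - top_two    # everything the second largest cannot subtract
  remainder < p / 100 * largest
}

safe_cell <- p_percent_breach(c(50, 25, 25), p = 10)   # remainder 25, threshold 5
risky_cell <- p_percent_breach(c(90, 8, 2), p = 10)    # remainder 2, threshold 9
```

In this sketch safe_cell is FALSE and risky_cell is TRUE.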

protectZeros

Logical. Determines whether cells with value 0 should be suppressed. Unless structuralEmpty is TRUE (see below), cells whose value becomes 0 because removeCodes contributions were removed are also suppressed.

charVar

Variable in data holding grouping information. Dominance will be calculated after aggregation within these groups.
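A small base R sketch of why the grouping matters: two rows belonging to the same holder look harmless individually but dominate once aggregated. The data and column names are made up for illustration, and the package performs this aggregation internally rather than through the calls shown here.

```r
# Made-up microdata for one cell; holder plays the role of charVar
d <- data.frame(holder = c("A", "A", "B", "C"),
                value  = c(30, 30, 25, 15))

# Treating every row as its own contributor
share_by_row <- max(d$value) / sum(d$value)

# Aggregating within holder first, as described for charVar
by_holder <- tapply(d$value, d$holder, sum)
share_by_holder <- max(by_holder) / sum(by_holder)
```

The largest share rises from 0.3 to 0.6 once the two rows of holder A are combined.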

removeCodes

A vector of charVar codes that are to be excluded when calculating dominance percentages. Essentially, the corresponding numeric values from dominanceVar or numVar are set to zero before proceeding with the dominance calculations. When charVar is empty, removeCodes is taken to refer to row indices and is converted to integer. See also removeCodesFraction below.

removeCodesFraction

Numeric value(s) in the range [0, 1]. This can be either a single value or a vector with the same length as removeCodes. A value of 1 represents the default behavior, as described above. A value of 0 indicates that dominance percentages are calculated as if removeCodes were not removed, but percentages associated with removeCodes are still excluded when identifying major contributions. Values between 0 and 1 modify the contributions of removeCodes proportionally in the calculation of percentages.
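One possible reading of this parameter, sketched in base R: removed codes never count as major contributors, while their values enter the denominator scaled by (1 - fraction). The helper dominance_share and the scaling formula are assumptions made for illustration; the package's actual computation may differ.

```r
# Hypothetical helper (not part of GaussSuppression).
# Assumption: removed codes are scaled by (1 - fraction) in the total
# and are never candidates for the largest contribution.
dominance_share <- function(value, remove_codes, fraction) {
  removed <- names(value) %in% remove_codes
  scaled <- value
  scaled[removed] <- value[removed] * (1 - fraction)
  max(value[!removed]) / sum(scaled)
}

value <- c(A = 60, B = 30, C = 10)

share_default <- dominance_share(value, "A", fraction = 1)    # A set to zero: 30 / 40
share_keep    <- dominance_share(value, "A", fraction = 0)    # A kept in total: 30 / 100
share_half    <- dominance_share(value, "A", fraction = 0.5)  # A halved in total: 30 / 70
```

Under this reading, lowering the fraction makes the remaining contributors look less dominant, because more of the removed value stays in the total.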

sWeightVar

variable with sampling weights to be used in dominance rule

domWeightMethod

character representing how weights should be treated in the dominance rule. See Details.

allDominance

Logical. If TRUE, additional information is included in the output. When n = 2, the following variables are added:

  • "dominant2": The fraction associated with the dominance rule.

  • "max2contributor": IDs associated with the second largest contribution. These IDs are taken from charVar if provided, or the row indices if charVar is not supplied.

  • "n_contr" and "n_non0_contr": Outputs from max_contribution. If removeCodes is used as input, "n_contr_all" and "n_non0_contr_all" are also included. The parameter max_contribution_output can be used to specify custom outputs from max_contribution. Note that if max_contribution_output is provided, only the specified outputs will be included, and the default outputs ("n_contr" and "n_non0_contr") will not be added unless explicitly listed.

outputWeightedNum

Logical. Determines whether the weighted numerical value should be included in the output. Default is TRUE if sWeightVar is provided.

dominanceVar

When specified, dominanceVar is used in place of numVar. Specifying dominanceVar is beneficial for avoiding warnings when there are multiple numVar variables. Typically, dominanceVar will be one of the variables already included in numVar.

structuralEmpty

Same parameter as in GaussSuppressionFromData. It is also needed here to handle structural zeros caused by removeCodes.

apply_abs_directly

Logical. Determines how negative values are treated in the rules. When apply_abs_directly = FALSE (default), absolute values are taken after summing contributions, as performed by max_contribution. When apply_abs_directly = TRUE, absolute values are computed directly on the input values, prior to any summation. This corresponds to the old behavior of the function.
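The difference between the two settings can be illustrated in base R with one contributor whose positive and negative rows nearly cancel. The data are made up, and the shares are computed with plain tapply rather than with max_contribution, so this is only a sketch of the order of operations.

```r
id <- c("A", "A", "B")   # contributor identifiers
x  <- c(100, -90, 20)    # contributions, including a negative value

# apply_abs_directly = FALSE: sum within contributor, then take absolute values
after_sum <- abs(tapply(x, id, sum))       # A = 10, B = 20

# apply_abs_directly = TRUE: take absolute values first, then sum
direct <- tapply(abs(x), id, sum)          # A = 190, B = 20

share_after_sum <- max(after_sum) / sum(after_sum)
share_direct <- max(direct) / sum(direct)
```

In this sketch B is the largest contributor under the default, whereas A dominates when absolute values are applied directly.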

max_contribution_output

See the description of the allDominance parameter.

num

Output numeric data generated by parent function. This parameter is needed when protectZeros is TRUE.

...

unused parameters

Details

This method only supports suppressing a single numeric variable. There are multiple ways of handling sampling weights in the dominance rule. The default method implemented here compares unweighted sample values with the corresponding weighted cell totals. If domWeightMethod is set to "tauargus", the method implemented in tauArgus is used. For more information on this method, see "Statistical Disclosure Control" by Hundepool et al. (2012, p. 151).
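The default weighting described above can be sketched in base R for a single cell and n = 1: the largest unweighted value is compared with the weighted cell total. The values and weights are made up, and the "tauargus" method is not reproduced here.

```r
x <- c(80, 20)   # sample values contributing to one cell
w <- c(1, 2)     # sampling weights
k <- 70          # dominance threshold in percent

# Without weights: largest value relative to the plain total
share_unweighted <- max(abs(x)) / sum(abs(x))

# Default weighting as described: unweighted value against weighted total
weighted_total <- sum(w * x)
share_weighted <- max(abs(x)) / abs(weighted_total)

breach_unweighted <- share_unweighted > k / 100
breach_weighted <- share_weighted > k / 100
```

Because the weighted total is larger than the sample total, the same contributor breaches the rule without weights but not with them in this example.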

Value

logical vector that is TRUE in positions corresponding to cells breaching the dominance rules.

Note

The wrappers DominanceRule and PPercentRule declare protectZeros explicitly because GaussSuppressionFromData needs its default value.

Author(s)

Daniel Lupp and Øyvind Langsrud

Examples

library(GaussSuppression)  # provides GaussSuppressionFromData, DominanceRule, CandidatesNum

set.seed(123)
z <- SSBtools::MakeMicro(SSBtoolsData("z2"), "ant")
z$value <- sample(1:1000, nrow(z), replace = TRUE)

# several (n, k) pairs at once, with extra dominance output
GaussSuppressionFromData(z, dimVar = c("region", "fylke", "kostragr", "hovedint"),
                         numVar = "value", candidates = CandidatesNum,
                         primary = DominanceRule, preAggregate = FALSE,
                         singletonMethod = "sub2Sum",
                         n = c(1, 2), k = c(65, 85), allDominance = TRUE)


num <- c(100,
         90, 10,
         80, 20,
         70, 30,
         50, 25, 25,
         40, 20, 20, 20,
         25, 25, 25, 25)
v1 <- c("v1",
        rep(c("v2", "v3", "v4"), each = 2),
        rep("v5", 3),
        rep(c("v6", "v7"), each = 4))
sw <- c(1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1)
d <- data.frame(v1 = v1, num = num, sw = sw)

# without weights
GaussSuppressionFromData(d, formula = ~v1 - 1,
                         numVar = "num", n = c(1, 2), k = c(80, 70),
                         preAggregate = FALSE, allDominance = TRUE,
                         candidates = CandidatesNum, primary = DominanceRule)

# with weights, standard method
GaussSuppressionFromData(d, formula = ~v1 - 1,
                         numVar = "num", n = c(1, 2), k = c(80, 70),
                         sWeightVar = "sw",
                         preAggregate = FALSE, allDominance = TRUE,
                         candidates = CandidatesNum, primary = DominanceRule)

# with weights, tauargus method
GaussSuppressionFromData(d, formula = ~v1 - 1,
                         numVar = "num", n = c(1, 2), k = c(80, 70),
                         sWeightVar = "sw",
                         preAggregate = FALSE, allDominance = TRUE,
                         candidates = CandidatesNum, primary = DominanceRule,
                         domWeightMethod = "tauargus")


GaussSuppression documentation built on April 3, 2025, 7:54 p.m.