mask_table: Apply Threshold-Based Masking to a Data Frame


mask_table    R Documentation

Apply Threshold-Based Masking to a Data Frame

Description

mask_table applies threshold-based masking to specified columns of a data frame. It calls mask_counts to mask counts that fall below a given threshold, in line with data privacy requirements. It can handle grouped data and, if requested, calculate percentages. Masking is applied iteratively, and convergence criteria are checked after each iteration.
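As a rough illustration of the idea, the base R sketch below applies primary suppression to a single vector of counts. It does not call countmaskr and is not the package's implementation. The "<11" label and the choice to leave zeros unmasked are assumptions made for the example.

```r
# Illustrative only: primary suppression on one vector of counts.
counts <- c(A = 5, B = 20, C = 40, D = 12)
threshold <- 11

# Cells that are non-zero and below the threshold are suppressed (assumption:
# zeros are left as they are at this stage).
primary <- counts > 0 & counts < threshold

# Replace suppressed cells with a label; the "<11" form is an assumed convention.
masked <- ifelse(primary, paste0("<", threshold), as.character(counts))
print(masked)

# If only one cell is suppressed and the total is published, the hidden value
# can be recovered by subtraction. This is why a second (secondary) cell is
# usually suppressed as well.
recovered <- sum(counts) - sum(counts[!primary])
print(recovered)
```

The last step shows why masking only the small cell is often not enough, which is the motivation for the secondary_cell and zero_masking arguments.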

Usage

mask_table(
  data,
  threshold = 11,
  col_groups,
  group_by = NULL,
  overwrite_columns = TRUE,
  percentages = FALSE,
  perc_decimal = 0,
  zero_masking = FALSE,
  secondary_cell = "min",
  .verbose = FALSE
)

Arguments

data

A data frame containing the counts to be masked.

threshold

A positive numeric value specifying the threshold below which values must be suppressed. Default is 11.

col_groups

A character vector or a list of character vectors, where each character vector specifies columns in data to which masking should be applied.

group_by

An optional character string naming a column in data to group by before masking. Default is NULL.

overwrite_columns

Logical; if TRUE, the original columns are overwritten with masked counts. If FALSE, new columns are added with masked counts. Default is TRUE.

percentages

Logical; if TRUE, percentages are calculated and masked accordingly. Default is FALSE.

perc_decimal

A non-negative numeric value specifying the number of decimal places for percentages. Default is 0.

zero_masking

Logical; if TRUE, zeros can be masked as secondary cells when present. Passed to mask_counts. Default is FALSE.

secondary_cell

Character string specifying the method for selecting secondary cells when necessary. Options are "min", "max", or "random". Passed to mask_counts. Default is "min".

.verbose

Logical; if TRUE, progress messages are printed during masking. Default is FALSE.
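To make the secondary_cell and zero_masking options more concrete, here is a minimal base R sketch of how a secondary cell might be chosen among the cells that are not already masked. The helper pick_secondary and its allow_zero argument are hypothetical names introduced for this example. They are not part of countmaskr, and the actual selection logic in mask_counts may differ.

```r
# Hypothetical helper, not part of countmaskr: return the index of one
# additional (secondary) cell to suppress.
pick_secondary <- function(counts, already_masked,
                           method = c("min", "max", "random"),
                           allow_zero = FALSE) {
  method <- match.arg(method)
  # Candidates are unmasked cells; zeros are eligible only if allow_zero is TRUE.
  candidates <- which(!already_masked & (allow_zero | counts > 0))
  if (length(candidates) == 0) return(integer(0))
  switch(method,
    min = candidates[which.min(counts[candidates])],
    max = candidates[which.max(counts[candidates])],
    # sample.int avoids the surprise of sample(x, 1) when x has length one
    random = candidates[sample.int(length(candidates), 1)]
  )
}

counts <- c(5, 20, 40, 12)
already_masked <- counts > 0 & counts < 11

pick_secondary(counts, already_masked, "min")  # index of the smallest unmasked count
pick_secondary(counts, already_masked, "max")  # index of the largest unmasked count
```

Under this sketch, "min" hides the least information beyond the primary cell, while "random" makes the choice less predictable.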

Value

A data frame with masked counts in specified columns. If percentages = TRUE, additional columns with percentages are added. The structure of the returned data frame depends on the overwrite_columns parameter.
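The sketch below illustrates, in base R, why percentages need to be masked together with the counts: a published percentage and total would otherwise reveal the suppressed count. It is an assumption-laden example, not the package's code. The column name N_perc, the use of the column sum as denominator, and NA as the masked value are all hypothetical, and secondary suppression is left out for brevity.

```r
# Illustrative only: percentages rounded to perc_decimal places, then
# suppressed wherever the underlying count is suppressed.
n <- c(12, 25, 44, 4)
threshold <- 11
perc_decimal <- 1

masked <- n > 0 & n < threshold

# Assumed denominator: the sum of the column.
perc <- round(100 * n / sum(n), perc_decimal)
perc[masked] <- NA

# Hypothetical layout when new columns are added instead of overwriting.
result <- data.frame(N = ifelse(masked, NA, n), N_perc = perc)
print(result)
```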

See Also

mask_counts

Examples

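library(countmaskr)
library(dplyr) # %>%, select, group_by, summarise, count, mutate, across
library(tidyr) # gather, pivot_wider

data("countmaskr_data")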

aggregate_table <- countmaskr_data %>%
  select(-c(id, age)) %>%
  gather(block, Characteristics) %>%
  group_by(block, Characteristics) %>%
  summarise(N = n()) %>%
  ungroup()

# Mask the counts in N within each block, overwriting the original column
mask_table(aggregate_table,
  group_by = "block",
  col_groups = list("N")
)

# Keep the original N column, add masked counts and percentages as new columns
mask_table(aggregate_table,
  group_by = "block",
  col_groups = list("N"),
  overwrite_columns = FALSE,
  percentages = TRUE
)

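# Wide table: counts by race with one column per gender plus an Overall total,
# masked together as a single column group
countmaskr_data %>%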
  count(race, gender) %>%
  pivot_wider(names_from = gender, values_from = n) %>%
  mutate(across(all_of(c("Male", "Other")), ~ ifelse(is.na(.), 0, .)),
    Overall = Female + Male + Other, .after = 1
  ) %>%
  countmaskr::mask_table(.,
    col_groups = list(c("Overall", "Female", "Male", "Other")),
    overwrite_columns = TRUE,
    percentages = FALSE
  )


countmaskr documentation built on April 10, 2026, 5:07 p.m.