ProtectKostra: Table suppression according to a frequency rule following the...

ProtectKostra    R Documentation

Table suppression according to a frequency rule following the standards in the Kostra project.

Description

Table suppression according to a frequency rule following the standards in the Kostra project.

Usage

ProtectKostra(
  data,
  idVar = 1,
  strataVar = NULL,
  freqVar = 2,
  freqVarGroup = NULL,
  protectZeros = TRUE,
  maxN = 3,
  method = "Gauss",
  output = "suppressed",
  total = "Total",
  split = "_",
  singleTotal = FALSE,
  ...
)

Arguments

data

Input data set of class data.frame

idVar

Id-variable (name or number)

strataVar

Strata-variable(s) (name or number)

freqVar

Variable(s) holding counts (name or number)

freqVarGroup

NULL (default) or integer representing groups of variables (see details)

protectZeros

When TRUE, empty cells (count=0) are considered sensitive

maxN

All cells having counts <= maxN are set as primary suppressed

method

Parameter "method" in ProtectTable. Only "Gauss" is possible (this is an only-Gauss replacement function).

output

One of "suppressed" (default), "freq", "sdcStatus" or "extraWide" (only when freqVarGroup is NULL)

total

String used to name totals.

split

Parameter to AutoSplit - see varNames and rowData above. When NULL, automatic splitting is done without needing a split string.

singleTotal

When TRUE, identical rowsums are needed in all freqVarGroups. When FALSE, totals for each freqVarGroup will be in output.

...

Additional variables that will be included in output (name or number).
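
How maxN and protectZeros interact can be illustrated in a few lines of base R. This is only a sketch of the frequency rule as described in the argument list, not the package's code: the helper name is_primary is hypothetical, and the assumption that protectZeros=FALSE exempts zero counts from primary suppression is inferred from the argument descriptions.

```r
# Hypothetical helper (not part of the Kostra package): flags the cells that
# the frequency rule would mark as primary suppressed.
is_primary <- function(freq, maxN = 3, protectZeros = TRUE) {
  small <- freq <= maxN            # counts at or below the threshold
  if (!protectZeros) {
    small <- small & freq > 0      # assumption: zeros are exempt when not protected
  }
  small
}

counts <- c(0, 1, 3, 4, 10)
is_primary(counts)                         # zeros treated as sensitive
is_primary(counts, protectZeros = FALSE)   # zeros left unsuppressed
```

Secondary suppression, which the Gauss method adds on top of these primary cells, is not covered by this sketch.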

Details

When freqVarGroup is NULL:

This function is a wrapper to ProtectTable with dimVar=c(idVar, strataVar). The function GetData is used.

Note that the names of output variables are strange when a single freqVar variable is input. This can be fixed by using freqVarGroup=1 instead of NULL.

When freqVarGroup is NOT NULL:

The suppression function (as when freqVarGroup is NULL) is run several times according to the groups with freqVarGroup>0. There are two types of groups: single variables and several variables. All groups of several variables must have identical rowsums.

Variables with freqVarGroup<1 will be included in output sorted as input.
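
The grouping logic can be checked on input data before calling the function. The following base R sketch uses made-up toy data and variable names; it only mimics how the group codes partition the freq-variables and is not how ProtectKostra is implemented internally.

```r
# Toy data (hypothetical): a1+a2 and s1+s2 are groups of several variables,
# b is a single-variable group, extra has group code 0 and is passed through.
d <- data.frame(a1 = c(2, 5), a2 = c(4, 1),
                s1 = c(1, 3), s2 = c(5, 3),
                b = c(7, 8), extra = c(9, 9))
freqVarGroup <- c(1, 1, 2, 2, 3, 0)

# Variable names per group, keeping only groups with a positive code
groups <- split(names(d), freqVarGroup)
groups <- groups[as.numeric(names(groups)) > 0]

# Distinguish groups of several variables from single-variable groups
is_multi <- lengths(groups) > 1

# Row sums of every multi-variable group; these are required to be identical
sums <- lapply(groups[is_multi], function(v) rowSums(d[, v, drop = FALSE]))
all_equal <- all(vapply(sums, function(s) isTRUE(all.equal(s, sums[[1]])),
                        logical(1)))

# Variables that are not suppressed, only carried to the output
passthrough <- names(d)[freqVarGroup < 1]
```

If all_equal is FALSE, the input does not satisfy the requirement on identical rowsums.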

A warning is produced if the generated total-output is not unique. Only the first result is then returned. In the case of output="suppressed" this means that the suppression of the total has been different. In the case of output="sdcStatus" only the coding may have been different.

Value

A data.frame with as many rows as input

Note

Even if freq-variables with freqVarGroup<1 are not used, they will be read by GetData together with the other freq-variables into a matrix. Use a common numeric type for all these variables to prevent a change of data type.
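
The effect of mixing numeric types can be seen with base R alone. The sketch below uses as.matrix as a stand-in for the matrix conversion; how GetData builds its matrix is not shown in this documentation, so treat this as an illustration of R's general coercion behaviour rather than of GetData itself. The column names are made up.

```r
# Hypothetical freq-variables: one integer column and one double column
d <- data.frame(n_int = c(1L, 2L), n_dbl = c(1.5, 2))
types_before <- vapply(d, typeof, character(1))

# A matrix holds a single type, so the integer column is promoted
m <- as.matrix(d)
matrix_type <- typeof(m)

# Converting all columns to a common numeric type up front avoids surprises
d[] <- lapply(d, as.numeric)
types_after <- vapply(d, typeof, character(1))
```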

All codes in idVar and strataVar must be unique. If not, automatic re-coding will be done with a warning. Using addName=TRUE in input will prevent this warning. In any case, when output="extraWide", non-unique codes produce problematic output.
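
Shared codes can be detected before calling the function. The sketch below assumes that "unique" means no code value occurs in more than one of the idVar and strataVar columns; that reading, the toy data and the column values are assumptions made for illustration.

```r
# Toy data (hypothetical): the code "1" occurs in both fylke and kostragr
d <- data.frame(region   = c("A", "B", "C"),
                fylke    = c("1", "1", "2"),
                kostragr = c("1", "3", "3"),
                stringsAsFactors = FALSE)

# Distinct codes within each variable
code_sets <- lapply(d[c("region", "fylke", "kostragr")],
                    function(x) unique(as.character(x)))

# Codes that appear in more than one variable
all_codes <- unlist(code_sets, use.names = FALSE)
shared <- unique(all_codes[duplicated(all_codes)])
```

A non-empty shared vector indicates codes that would trigger the automatic re-coding.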

Normally a value is only safe if sdcStatus="s". When using tau-argus, sdcStatus="z" is also safe when protectZeros=FALSE, but tau-argus methods are currently not allowed in ProtectKostra. Use a simpler (binary) coding of "sdcStatus" in a future version? When the underlying function ProtectTable results in an error, sdcStatus="e".
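
Until a binary coding exists, a safe/unsafe flag can be derived from the status codes by hand. The helper below is hypothetical and only encodes the rules stated in this note; the tauArgus argument is an illustrative switch, not a ProtectKostra parameter.

```r
# Hypothetical helper: TRUE where a cell may be published
is_safe <- function(status, tauArgus = FALSE, protectZeros = TRUE) {
  safe <- status == "s"                  # "s" is the normal safe code
  if (tauArgus && !protectZeros) {
    safe <- safe | status == "z"         # "z" is also safe in this setting only
  }
  safe                                   # "e" (error) is never treated as safe
}

status <- c("s", "z", "e", "s")
is_safe(status)
is_safe(status, tauArgus = TRUE, protectZeros = FALSE)
```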

Author(s)

Øyvind Langsrud

Examples


 # ==================================
 #    Examples without freqVarGroup
 # ==================================

 # ==== Example 1 , 8 regions ====
 z1w <- KostraData("z1w")
 ProtectKostra(z1w,idVar="region",freqVar=2:5)

 # ==== Example 2 , 11 regions ====
 z2w <- KostraData("z2w")
 ProtectKostra(z2w,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:7)

 # ==== Example 3 , 36 regions ====
 z3w <- KostraData("z3w")
 ProtectKostra(z3w,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:15)

 #  ==== Example 3b , 36 regions == with three level column name coding
 z3wb <- KostraData("z3wb")
 ProtectKostra(z3wb,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:15)

 #  ==== Example 4 , 437 regions ====
 z4w <- KostraData("z4w")
 ProtectKostra(z4w,idVar="region",strataVar="fylke",freqVar=4:15)

 # =====================================================================
 #    Examples with extra variables in output and several id variables
 # =====================================================================

 ProtectKostra(z3wb,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:15,fylke="fylke",kostragr="kostragr")

 # Same using DotWrap
 DotWrap("ProtectKostra",dots=c("fylke","kostragr"),z3wb,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:15)

 # Several id variables
 ProtectKostra(z3wb,idVar=c("region","fylke","kostragr"),strataVar=c("fylke","kostragr"),freqVar=4:15,region="region")

 # ==================================
 #    Examples with freqVarGroup
 # ==================================

 # Generate example data for this function
 exData   <- KostraData("z3w")[,c(1:15,15,4:6)]
 names(exData)[12:19]=c("s1","s2","s3","s4","A","B","C","D")
 exData[,"s4"] <- rowSums(exData[,4:11]) - rowSums(exData[,12:14])

 # Create input parameter
 freqVarGroup <- c(1,1,1,1,1,1,1,1,2,2,2,2,3,4,-1,5) # Same as c(rep(1,8),rep(2,4),3,4,-1,5)

 a <- ProtectKostra(exData ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:19,freqVarGroup=freqVarGroup)
 #  Now a$C in the output is just missing since its freqVarGroup is -1

 names(exData)[18] <- "arbeid" #  Rename from "C" to "arbeid"

 b <- ProtectKostra(exData ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:19,freqVarGroup=freqVarGroup)
 # Now "arbeid" in output is still between "B" and "D" as in input. And b$arbeid is NOT just missing

 # singleTotal=TRUE
 ProtectKostra(exData ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:19,freqVarGroup=freqVarGroup, singleTotal=TRUE)

 exData[4,4] <- 3  # Warning will be produced
 ProtectKostra(exData ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:19,freqVarGroup=freqVarGroup)

 freqVarGroup <- c(11,11,11,11,11,11,11,11,2,2,2,2,3,4,0,5)  # Using this instead gives the same result in a different order


 # ========================================
 #    Examples with a single freq-variable
 # ========================================

 z1w <- KostraData("z1w")
 ProtectKostra(z1w,idVar="region",freqVar=2)  # wrong "name"
 ProtectKostra(z1w,idVar="region",freqVar=2, freqVarGroup=1) # same name as input


statisticsnorway/Kostra documentation built on Nov. 2, 2024, 6:40 p.m.