View source: R/ProtectKostra.R
ProtectKostra | R Documentation |
Table suppression according to a frequency rule following the standards in the Kostra project.
ProtectKostra(
data,
idVar = 1,
strataVar = NULL,
freqVar = 2,
freqVarGroup = NULL,
protectZeros = TRUE,
maxN = 3,
method = "Gauss",
output = "suppressed",
total = "Total",
split = "_",
singleTotal = FALSE,
...
)
data |
Input data set of class data.frame |
idVar |
Id-variable (name or number) |
strataVar |
Strata-variable(s) (name or number) |
freqVar |
Variable(s) holding counts (name or number) |
freqVarGroup |
NULL (default) or integer representing groups of variables (see details) |
protectZeros |
When TRUE, empty cells (count=0) are considered sensitive |
maxN |
All cells having counts <= maxN are set as primary suppressed |
method |
Parameter "method" in ProtectTable. Only "Gauss" is possible (Gauss-only replacement function) |
output |
One of "suppressed" (default), "freq", "sdcStatus" or "extraWide" (only when freqVarGroup is NULL) |
total |
String used to name totals. |
split |
Parameter to |
singleTotal |
When TRUE, identical rowsums are required in all freqVarGroups. When FALSE, totals for each freqVarGroup will be in the output. |
... |
Additional variables that will be included in output (name or number). |
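The maxN and protectZeros arguments together decide which cells are primary suppressed. The base R sketch below illustrates that rule on a plain vector of counts. The helper is_primary is hypothetical and not part of easySdcTable, and leaving zeros unsuppressed when protectZeros=FALSE is an assumption drawn from the argument descriptions above, not from the package source.

```r
# Hypothetical helper (not part of easySdcTable): flags counts that would be
# primary suppressed under the frequency rule described in the arguments.
is_primary <- function(counts, maxN = 3, protectZeros = TRUE) {
  small <- counts <= maxN
  if (!protectZeros) {
    # Assumption: with protectZeros = FALSE, zero cells are not sensitive
    small <- small & counts > 0
  }
  small
}

counts <- c(0, 1, 3, 4, 10)
is_primary(counts)                        # TRUE TRUE TRUE FALSE FALSE
is_primary(counts, protectZeros = FALSE)  # FALSE TRUE TRUE FALSE FALSE
```

This only covers primary suppression. The secondary suppression needed to protect the primary cells is what ProtectTable handles and is not sketched here.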
When freqVarGroup is NULL:
This function is a wrapper to ProtectTable with dimVar=c(idVar, strataVar).
The function GetData is used.
Note that the names of output variables are strange when a single freqVar variable is input.
This can be fixed by using freqVarGroup=1 instead of NULL.
When freqVarGroup is NOT NULL:
The suppression function (as when freqVarGroup is NULL) is run several times, once for each group with freqVarGroup>0.
There are two types of groups: single variables and several variables.
All groups of several variables must have identical rowsums.
Variables with freqVarGroup<1 will be included in the output, sorted as in the input.
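The requirement that all groups of several variables have identical rowsums can be checked before calling ProtectKostra. The following base R sketch shows one way to do so. The helper group_rowsums_identical and the toy data are hypothetical and written for illustration only; they are not part of easySdcTable.

```r
# Hypothetical helper (not part of easySdcTable): TRUE when every group of
# several variables (freqVarGroup > 0, more than one member) has the same
# rowsums as the first such group.
group_rowsums_identical <- function(data, freqVar, freqVarGroup) {
  stopifnot(length(freqVar) == length(freqVarGroup))
  positive <- unique(freqVarGroup[freqVarGroup > 0])
  # Keep only groups that consist of several variables
  is_multi <- vapply(positive, function(g) sum(freqVarGroup == g) > 1, logical(1))
  multi <- positive[is_multi]
  if (length(multi) < 2) return(TRUE)
  sums <- lapply(multi, function(g) {
    unname(rowSums(data[, freqVar[freqVarGroup == g], drop = FALSE]))
  })
  all(vapply(sums[-1], function(s) isTRUE(all.equal(s, sums[[1]])), logical(1)))
}

# Toy data: groups 1 (x1, x2) and 2 (y1, y2) both sum to 4 and 6
toy <- data.frame(region = c("a", "b"),
                  x1 = c(1, 2), x2 = c(3, 4),
                  y1 = c(2, 5), y2 = c(2, 1),
                  z = c(9, 9))
group_rowsums_identical(toy, freqVar = 2:6, freqVarGroup = c(1, 1, 2, 2, 3))  # TRUE
```

Single-variable groups (here z in group 3) and variables with freqVarGroup<1 are ignored by the check, in line with the rule stated above.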
A warning is produced if the generated total-output is not unique. Only the first result is then returned.
In the case of output="suppressed" this means that the suppression of the total has been different.
In the case of output="sdcStatus" only the coding may have been different.
A data.frame with as many rows as the input.
Even if freq-variables with freqVarGroup<1 are not used, they will be read by GetData together with the other freq-variables into a matrix. Use a common numeric type for all these variables to prevent a change of data type.
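The effect of mixing numeric types can be seen with plain base R. The sketch below uses as.matrix as a stand-in for the matrix conversion. That GetData behaves the same way internally is an assumption, and the data frames are made up for illustration.

```r
# Two toy data frames: one with integer columns only, one mixing integer and double
ints  <- data.frame(a = 1:3, b = 4:6)
mixed <- data.frame(a = 1:3, b = c(4.5, 5, 6))

typeof(as.matrix(ints))   # "integer"
typeof(as.matrix(mixed))  # "double": the integer column a is converted

# Inspect column types before calling ProtectKostra
col_types <- vapply(mixed, typeof, character(1))
col_types

# Converting every freq-variable to a common type up front avoids surprises
mixed[] <- lapply(mixed, as.numeric)
unique(vapply(mixed, typeof, character(1)))  # "double"
```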
All codes in idVar and strataVar must be unique. If not, automatic re-coding will be done with a warning.
Using addName=TRUE in input will prevent this warning.
In any case, when output="extraWide", non-unique codes produce problematic output.
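Code collisions can be looked for before running the function. The base R sketch below assumes that "unique" means no code value is shared between the idVar and strataVar columns; this reading is an interpretation of the note above, not something confirmed by the package source. The helper shared_codes and the toy data are hypothetical.

```r
# Hypothetical helper (not part of easySdcTable): returns code values that
# occur in more than one of the given columns.
shared_codes <- function(data, vars) {
  codes <- lapply(data[vars], function(x) unique(as.character(x)))
  all_codes <- unlist(codes, use.names = FALSE)
  unique(all_codes[duplicated(all_codes)])
}

# Toy data where the code "1" is used both as a region and as a fylke
toy_codes <- data.frame(region = c("1", "2", "3"),
                        fylke  = c("A", "A", "1"),
                        stringsAsFactors = FALSE)
shared_codes(toy_codes, c("region", "fylke"))  # "1"
```

An empty result suggests that no automatic re-coding should be triggered under this reading of the rule.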
Normally a value is only safe if sdcStatus="s". When using tau-argus, sdcStatus="z" is also safe when protectZeros=FALSE.
But currently tau-argus methods are not allowed in ProtectKostra. Use a simpler (binary) coding of "sdcStatus" in a future version?
When the underlying function ProtectTable results in an error: sdcStatus="e".
Øyvind Langsrud
# ==================================
# Examples without freqVarGroup
# ==================================
# ==== Example 1 , 8 regions ====
z1w <- KostraData("z1w")
ProtectKostra(z1w,idVar="region",freqVar=2:5)
# ==== Example 2 , 11 regions ====
z2w <- KostraData("z2w")
ProtectKostra(z2w,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:7)
# ==== Example 3 , 36 regions ====
z3w <- KostraData("z3w")
ProtectKostra(z3w,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:15)
# ==== Example 3b , 36 regions == with three level column name coding
z3wb <- KostraData("z3wb")
ProtectKostra(z3wb,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:15)
# ==== Example 4 , 437 regions ====
z4w <- KostraData("z4w")
ProtectKostra(z4w,idVar="region",strataVar="fylke",freqVar=4:15)
# =====================================================================
# Examples with extra variables in output and several id variables
# =====================================================================
ProtectKostra(z3wb,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:15,fylke="fylke",kostragr="kostragr")
# Same using DotWrap
DotWrap("ProtectKostra",dots=c("fylke","kostragr"),z3wb,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:15)
# Several id variables
ProtectKostra(z3wb,idVar=c("region","fylke","kostragr"),strataVar=c("fylke","kostragr"),freqVar=4:15,region="region")
# ==================================
# Examples with freqVarGroup
# ==================================
# Generate example data for this function
exData <- KostraData("z3w")[,c(1:15,15,4:6)]
names(exData)[12:19] <- c("s1","s2","s3","s4","A","B","C","D")
exData[,"s4"] <- rowSums(exData[,4:11]) - rowSums(exData[,12:14])
# Create input parameter
freqVarGroup <- c(1,1,1,1,1,1,1,1,2,2,2,2,3,4,-1,5) # Same as c(rep(1,8),rep(2,4),3,4,-1,5)
a <- ProtectKostra(exData ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:19,freqVarGroup=freqVarGroup)
# Now output of a$C is just missing since "-1"
names(exData)[18] <- "arbeid" # Rename from "C" to "arbeid"
b <- ProtectKostra(exData ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:19,freqVarGroup=freqVarGroup)
# Now "arbeid" in output is still between "B" and "D" as in input. And b$arbeid is NOT just missing
# singleTotal=TRUE
ProtectKostra(exData ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:19,freqVarGroup=freqVarGroup, singleTotal=TRUE)
exData[4,4] <- 3 # Warning will be produced
ProtectKostra(exData ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:19,freqVarGroup=freqVarGroup)
freqVarGroup <- c(11,11,11,11,11,11,11,11,2,2,2,2,3,4,0,5) # Using this instead gives the same result in a different order
# ========================================
# Examples with a single freq-variable
# ========================================
z1w <- KostraData("z1w")
ProtectKostra(z1w,idVar="region",freqVar=2) # wrong "name"
ProtectKostra(z1w,idVar="region",freqVar=2, freqVarGroup=1) # same name as input