RoundKostra: Rounding à la Heldal following the standards in the Kostra...

RoundKostra    R Documentation

Rounding à la Heldal following the standards in the Kostra project.

Description

Rounding à la Heldal following the standards in the Kostra project.

Usage

RoundKostra(
  data,
  idVar,
  strataVar = NULL,
  freqVar,
  freqVarGroup = NULL,
  roundBase = 3,
  method = "pls",
  formula = NULL,
  level = 0,
  allSmall = TRUE,
  singleTotal = TRUE,
  makeSums = TRUE,
  output = "rounded",
  total = "Total",
  split = "_",
  extraOutput = FALSE,
  seed = 12345,
  ...
)

Arguments

data

Input data set of class data.frame

idVar

Id-variable (name or number)

strataVar

Strata-variable(s) (name or number)

freqVar

Variable(s) holding counts (name or number)

freqVarGroup

NULL (default) or integer representing groups of variables (see details)
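As a rough illustration of how a freqVarGroup vector partitions the freq-variables, the base R sketch below groups hypothetical variable names by their group code. The names and codes are invented for illustration, and treating a group with exactly one variable as a "single-group" is an assumption based on the examples further down, not a statement about the internals of RoundKostra.

```r
# Hypothetical freq-variable names and group codes (illustration only)
freqNames    <- c("a1", "a2", "a3", "b1", "b2", "s1", "unused")
freqVarGroup <- c(1, 1, 1, 2, 2, 3, -1)

# Codes below 1 mark variables that are not used (see Note)
used <- freqVarGroup >= 1

# Variables belonging to each group
groups <- split(freqNames[used], freqVarGroup[used])

# Assumption: a group holding exactly one variable is a "single-group"
isSingle <- lengths(groups) == 1
groups
isSingle
```

Here groups 1 and 2 hold several variables each, while group 3 holds only `s1`.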

roundBase

Basis for rounding

method

Algorithm for the rounding calculations. Currently "pls" or "singleRandom".

formula

Model formula as a string defining cells to be published (additional to automation)

level

Interaction level or 0 (all levels) defining complexity of model component created from strata

allSmall

When TRUE, all small values will be rounded (applies only with a single freqVar)

singleTotal

When TRUE, identical row sums are required in all freqVarGroups. When FALSE, totals for each freqVarGroup will be included in the output.

makeSums

When TRUE, totals will be made similar to ProtectKostra (in fact this is done by calling ProtectKostra)

output

One of "rounded" (default), "original", "difference" or "status". Unrounded results are returned by "original" (same as roundBase=0), and "difference" = "rounded" - "original". With output="status", zero differences are coded as "o" (original) and all others as "r" (rounded).
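The relationship between the four output types can be sketched in base R. The vectors below are hypothetical illustration values, not actual RoundKostra output; only the coding rules are taken from the description above.

```r
# Hypothetical counts before and after rounding (NOT produced by RoundKostra)
original <- c(1L, 2L, 7L, 0L, 12L)
rounded  <- c(0L, 3L, 7L, 0L, 12L)

# output = "difference": rounded minus original
difference <- rounded - original

# output = "status": "o" where nothing changed, "r" where the value was rounded
status <- ifelse(difference == 0L, "o", "r")

data.frame(original, rounded, difference, status)
```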

total

String used to name totals.

split

Parameter to AutoSplit - see varNames and rowData above. When NULL, splitting is automatic and no split string is needed.

extraOutput

When TRUE output is a list of several elements (makeSums ignored)

seed

NULL or seed for random number generator (set.seed(seed) will be run at the beginning of the function)
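The effect of fixing the seed at the start of a function can be sketched with a small stand-in. The function `draw` below is hypothetical and only mimics the documented pattern (call set.seed(seed) unless seed is NULL); it is not part of the Kostra package.

```r
# Hypothetical stand-in for a function that sets the seed on entry
draw <- function(n, seed = 12345) {
  if (!is.null(seed)) set.seed(seed)  # mirrors the documented behaviour
  sample.int(100, n)
}

a <- draw(5)
b <- draw(5)
identical(a, b)  # TRUE: same seed, same random draws
```

With seed = NULL the stand-in leaves the random number generator untouched, so repeated calls would generally differ. Note that set.seed changes the global random number state of the R session.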

...

Variables for formula and additional variables that will be included in output (name or number).

Details

A single freq variable and formula: A formula defines all the publishable cells. Rounding is performed so that all the publishable cells are safe. When allSmall=TRUE all small cells are rounded. All possible totals are then safe, but totals not defined by the formula can be far from the original values.

A single freq variable and strata: Instead of a formula it is assumed that the cells to be published are obtained by crossing all strata variables. The parameter "level" may be used.

Several freq variables without freqVarGroup: The data is in an unstacked form and stack/unstack will be performed in the background similar to ProtectKostra. The original id-var will be considered as a strata-var when stacked and rounding is performed similar to "A single freq variable and strata".
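To make the idea of stacking concrete, here is a minimal base R sketch that turns a small unstacked data set into stacked form with a single count column. The data, the column names `variable` and `freq`, and the stacking code are assumptions for illustration; RoundKostra does this internally with its own helpers.

```r
# Hypothetical unstacked input: one row per region, one column per freq-variable
wide <- data.frame(
  region = c("A", "B", "C"),
  fylke  = c(1L, 1L, 2L),
  m      = c(2L, 5L, 0L),
  f      = c(1L, 3L, 4L)
)
freqCols <- c("m", "f")

# Stacked form: one row per (region, freq-variable) pair and a single count column
stacked <- data.frame(
  region   = rep(wide$region, times = length(freqCols)),
  fylke    = rep(wide$fylke,  times = length(freqCols)),
  variable = rep(freqCols, each = nrow(wide)),
  freq     = unlist(wide[freqCols], use.names = FALSE)
)
stacked
```

In the stacked data, `region` can be crossed with `variable` like any other strata variable, which is the sense in which the original id-var is treated as a strata-var.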

With freqVarGroup without single-groups: Each group can be stacked to form a separate data set, but a common data set is needed. An ad hoc data set is created to match all the single data sets and this data set will be used in the rounding process. This method will not work in complicated cases. Use with care. Try extraOutput=TRUE to see what is going on.

With freqVarGroup with single-groups: The single-groups are assumed to be two-category groups (yes and no, but only yes is reported). The remaining category will be computed in order to create the ad hoc data set.
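One plausible reading of "the remaining category will be computed" is that the unreported "no" count is the row total minus the reported "yes" count. The sketch below shows that arithmetic on hypothetical numbers; both the data and the subtraction rule are assumptions, not confirmed details of the implementation.

```r
# Hypothetical row totals (e.g. taken from a multi-variable group)
total <- c(10L, 8L, 5L)

# Reported "yes" counts of a single-group
yes <- c(3L, 8L, 0L)

# Assumption: the unreported "no" category is the remainder
no <- total - yes

# A negative remainder would indicate inconsistent totals in the input
anyInconsistent <- any(no < 0L)
data.frame(total, yes, no)
```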

Value

A data.frame unless extraOutput = TRUE

Note

Even though freq-variables with freqVarGroup<1 are not used, they will be read by GetData together with the other freq-variables into a matrix. Use a common numeric type for all these variables to prevent a change of data type.

Parameter namesAsInput in ProtectTable() is not yet available in RoundKostra and therefore very advanced variable name coding will give insufficient results. Example:

RoundKostra(KostraData("z3wb") ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:15)

But this will work:

RoundKostra(KostraData("z3wb") ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:15,split=NULL)

See similar example in ProtectTable.

NOTE: Be sure to spell the input parameters correctly. Because of the "..."-input, misspelled parameters may give strange results instead of an error.
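Why a misspelled parameter does not raise an error can be shown with a toy function. The function `f` below is hypothetical and only shares the "..." mechanism with RoundKostra: an argument name that matches no formal parameter is silently collected by "...", and the intended parameter keeps its default.

```r
# Toy function with a "..." argument (NOT RoundKostra itself)
f <- function(x, roundBase = 3, ...) {
  extra <- list(...)
  list(roundBase = roundBase, extra = names(extra))
}

res <- f(1:3, roundbase = 5)  # note the lower-case "b"
res$roundBase  # 3: the default is used, no error is raised
res$extra      # "roundbase": the misspelled argument ended up in "..."
```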

See Also

ProtectKostra, RoundViaDummy, makeroundtabs, Round2, FormulaSums, ModelMatrix

Examples


 # ========================================================
 #    Examples:  A single freq variable
 # =======================================================

 z2w <- KostraData("z2w")

 # ==== Without strataVar and  without formula ====
 RoundKostra(z2w ,idVar="region", freqVar="arbeid", roundBase=5)
 RoundKostra(z2w ,idVar="region", freqVar="arbeid", makeSums=FALSE) # Without total

 # ==== With strataVar and  without formula ====
 RoundKostra(z2w ,idVar="region", freqVar="arbeid", strataVar=c("fylke","kostragr"), roundBase=5)
 RoundKostra(z2w ,idVar="region", freqVar="arbeid", strataVar=c("fylke","kostragr"), makeSums=FALSE)
 RoundKostra(z2w ,idVar="region", freqVar="arbeid", strataVar= "fylke", allSmall=FALSE) # Warning when makeSums=TRUE
 RoundKostra(z2w ,idVar="region", freqVar="arbeid", strataVar= "fylke", allSmall=FALSE, makeSums=FALSE)
 RoundKostra(z2w ,idVar="region", freqVar="arbeid", strataVar=c("fylke","kostragr"), extraOutput=TRUE)$formula

 # ==== Without strataVar and  with formula ( makeSums ignored without warning)  ====
 RoundKostra(z2w ,idVar="region", freqVar="arbeid", formula="fylke", fylke="fylke")
 RoundKostra(z2w ,idVar="region", freqVar="arbeid", formula="fylke", fylke="fylke",allSmall = FALSE)
 RoundKostra(z2w ,idVar="region", freqVar="arbeid", formula="A+B", A="fylke",B="kostragr")
 RoundKostra(z2w ,idVar="region", freqVar="arbeid", formula="A*B+C", A="fylke",B="kostragr",C="annet")

 # =============================================================================
 #   Examples:  Several freq variables without freqVarGroup (allSmall ignored)
 # =============================================================================
 RoundKostra(z2w ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:7)
 RoundKostra(z2w ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:7,makeSums=FALSE)
 RoundKostra(z2w ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:7,extraOutput=TRUE)$input
 RoundKostra(z2w ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:7,extraOutput=TRUE)$formula

 # ===============================================================
 #   Examples: With freqVarGroup
 # ==========================================================================

 # ==========  With no single-groups  ================
 ex1 <- Kostra:::exData1()   #  hack, to be changed later
 freqVarGroup <- c(1,1,1,1,1,1,1,1,2,2,2,2)
 RoundKostra(ex1, idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:15,freqVarGroup=freqVarGroup)
 RoundKostra(ex1, idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:15,freqVarGroup=freqVarGroup, makeSums=FALSE)
 a1 <- RoundKostra(ex1, idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:15,freqVarGroup=freqVarGroup, extraOutput=TRUE)
 head(a1$input) # ad hoc created data
 a1$formula     # The formula used


 # ==========  With some single-groups  ================
 freqVarGroup <- c(1,1,1,1,1,1,1,1,2,2,2,2,3,4,-1,5)
 RoundKostra(ex1, idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:19,freqVarGroup=freqVarGroup)
 RoundKostra(ex1, idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:19,freqVarGroup=freqVarGroup, singleTotal = FALSE)

 # ====== With incorrect totals
 ex1b <- ex1
 ex1b$s1[1]=2L
 ex1b$arb_A[2]=5L
 RoundKostra(ex1b, idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:19,freqVarGroup=freqVarGroup, singleTotal = FALSE)
 a2 <- RoundKostra(ex1b, idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:19,freqVarGroup=freqVarGroup, singleTotal = FALSE,extraOutput=TRUE)
 table(a2$input$s1_s2_s3_s4,  useNA ="always") # Missing values in ad hoc created data when incorrect totals

 # ===============================================================
 #   Examples  With parameter output
 # ===============================================================

 RoundKostra(z2w ,idVar="region", freqVar="arbeid",output="status")
 RoundKostra(z2w ,idVar="region", freqVar="arbeid",output="difference")
 RoundKostra(z2w ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:7,output="status")
 RoundKostra(z2w ,idVar="region",strataVar=c("fylke","kostragr"),freqVar=4:7,output="difference")

 # ===============================================================
 #   Micro data example  (":::" since functions not exported yet)
 # ===============================================================

 microData <- Kostra:::microEx1()                 # A micro data set
 freqData  <- MakeFreq(microData,"freq") # Make cross-classified data
 freqData$id <- 1:NROW(freqData)                  # Add id-variable

 #  Rounding with makeSums=FALSE
 freqRound <- RoundKostra(freqData, idVar="id", strataVar=c("region", "fylke", "kostragr", "hovedint"), freqVar="freq", makeSums=FALSE)

 microRound  <- MakeMicro(freqRound,"freq")  # Create micro data set from output
 microRound  <- microRound[,-c(1,6)]                        # Remove some variables

 # Alternative where only region sums and the cross-classifications fylke*hovedint and kostragr*hovedint are to be published
 freqRound2 <- RoundKostra(freqData, idVar="id", formula=("region +fylke*hovedint + kostragr*hovedint"), freqVar="freq",
                   makeSums=FALSE, allSmall=FALSE, region="region",fylke="fylke",kostragr="kostragr",hovedint="hovedint")
 microRound2  <- MakeMicro(freqRound2,"freq")
 microRound2  <- microRound2[,-c(1,2)]


statisticsnorway/Kostra documentation built on Nov. 2, 2024, 6:40 p.m.