View source: R/HierarchyCompute.R
HierarchyCompute | R Documentation |
This function computes aggregates by crossing several hierarchical specifications and factorial variables.
HierarchyCompute(
data,
hierarchies,
valueVar,
colVar = NULL,
rowSelect = NULL,
colSelect = NULL,
select = NULL,
inputInOutput = FALSE,
output = "data.frame",
autoLevel = TRUE,
unionComplement = FALSE,
constantsInOutput = NULL,
hierarchyVarNames = c(mapsFrom = "mapsFrom", mapsTo = "mapsTo", sign = "sign", level =
"level"),
selectionByMultiplicationLimit = 10^7,
colNotInDataWarning = TRUE,
useMatrixToDataFrame = TRUE,
handleDuplicated = "sum",
asInput = FALSE,
verbose = FALSE,
reOrder = FALSE,
reduceData = TRUE,
makeRownames = NULL
)
data |
The input data frame |
hierarchies |
A named (names in |
valueVar |
Name of the variable(s) to be aggregated. |
colVar |
When non-NULL, the function |
rowSelect |
Data frame specifying variable combinations for output. The colFactor variable is not included.
In addition |
colSelect |
Vector specifying categories of the colFactor variable for output. |
select |
Data frame specifying variable combinations for output. The colFactor variable is included. |
inputInOutput |
Logical vector (possibly recycled) for each element of hierarchies.
TRUE means that codes from input are included in output. Values corresponding to |
output |
One of "data.frame" (default), "dummyHierarchies", "outputMatrix", "dataDummyHierarchy", "valueMatrix", "fromCrossCode",
"toCrossCode", "crossCode" (as toCrossCode), "outputMatrixWithCrossCode", "matrixComponents",
"dataDummyHierarchyWithCodeFrame", "dataDummyHierarchyQuick".
The latter two do not require |
autoLevel |
Logical vector (possibly recycled) for each element of hierarchies.
When TRUE, level is computed by automatic method as in |
unionComplement |
Logical vector (possibly recycled) for each element of hierarchies.
When TRUE, sign means union and complement instead of addition or subtraction as in |
constantsInOutput |
A single-row data frame to be combined with the other output. |
hierarchyVarNames |
Variable names in the hierarchy tables as in |
selectionByMultiplicationLimit |
With non-NULL |
colNotInDataWarning |
When TRUE, warning produced when elements of |
useMatrixToDataFrame |
When TRUE (default) special functionality for saving time and memory is used. |
handleDuplicated |
Handling of duplicated code rows in data. One of: "sum" (default), "sumByAggregate", "sumWithWarning", "stop" (error), "single" or "singleWithWarning". With no colFactor, "sum" differs from "sumByAggregate" and "sumWithWarning" (original values or aggregates in "valueMatrix"). With "single", only one of the values is used (by matrix subsetting). |
asInput |
When TRUE (FALSE is default) output matrices match input data. Thus
|
verbose |
Whether to print information during calculations. FALSE is default. |
reOrder |
When TRUE (FALSE is default) output codes are ordered differently, more similar to a usual model matrix ordering. |
reduceData |
When TRUE (default) unnecessary (for the aggregated result) rows of |
makeRownames |
When TRUE |
A key element of this function is the matrix multiplication: outputMatrix = dataDummyHierarchy %*% valueMatrix.
The matrix valueMatrix is a re-organized version of the valueVar vector from input. In particular, if a variable is selected as colFactor, there is one column for each level of that variable.
The matrix dataDummyHierarchy is constructed by crossing dummy coding of hierarchies (DummyHierarchy) and factorial variables in a way that matches valueMatrix. The code combinations corresponding to rows and columns of dataDummyHierarchy can be obtained as toCrossCode and fromCrossCode. In the default data frame output, the outputMatrix is stacked to one column and combined with the code combinations of all variables.
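The multiplication described above can be illustrated with a small base R sketch. It does not call the package: the toy data, the hand-written dummy matrix and the names d, EU, Europe and stacked are assumptions made for illustration, and the package may build and store these matrices differently internally.

```r
# Toy input: one value per (geo, year) combination (illustrative data only)
d <- data.frame(
  geo   = c("Spain", "Portugal", "Iceland", "Spain", "Portugal", "Iceland"),
  year  = c("2014", "2014", "2014", "2015", "2015", "2015"),
  value = c(10, 4, 1, 12, 5, 2)
)

# valueMatrix: rows are input geo codes, one column per level of the
# column factor (year)
valueMatrix <- tapply(d$value, list(d$geo, d$year), sum)

# dataDummyHierarchy written by hand: rows are output codes, columns are
# input codes. Assumed hierarchy: EU = Spain + Portugal, Europe = EU + Iceland
dataDummyHierarchy <- rbind(
  EU     = c(Iceland = 0, Portugal = 1, Spain = 1),
  Europe = c(Iceland = 1, Portugal = 1, Spain = 1)
)

# Align the rows of valueMatrix with the columns of the dummy matrix, then multiply
outputMatrix <- dataDummyHierarchy %*% valueMatrix[colnames(dataDummyHierarchy), ]

# Stack the output matrix to one column and attach the code combinations
stacked <- data.frame(
  geo   = rep(rownames(outputMatrix), times = ncol(outputMatrix)),
  year  = rep(colnames(outputMatrix), each = nrow(outputMatrix)),
  value = as.vector(outputMatrix)
)
print(stacked)
```

Here the row names of the dummy matrix play the role of the output code combinations and its column names the role of the input code combinations.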
As specified by the parameter output
Øyvind Langsrud
Hierarchies2ModelMatrix, AutoHierarchies.
# Data and hierarchies used in the examples
x <- SSBtoolsData("sprt_emp") # Employment in sport in thousand persons from Eurostat database
geoHier <- SSBtoolsData("sprt_emp_geoHier")
ageHier <- SSBtoolsData("sprt_emp_ageHier")
# Two hierarchies and year as rowFactor
HierarchyCompute(x, list(age = ageHier, geo = geoHier, year = "rowFactor"), "ths_per")
# Same result with year as colFactor (but columns ordered differently)
HierarchyCompute(x, list(age = ageHier, geo = geoHier, year = "colFactor"), "ths_per")
# Internally the computations are different as seen when output='matrixComponents'
HierarchyCompute(x, list(age = ageHier, geo = geoHier, year = "rowFactor"), "ths_per",
output = "matrixComponents")
HierarchyCompute(x, list(age = ageHier, geo = geoHier, year = "colFactor"), "ths_per",
output = "matrixComponents")
# Include input age groups by setting inputInOutput = TRUE for this variable
HierarchyCompute(x, list(age = ageHier, geo = geoHier, year = "colFactor"), "ths_per",
inputInOutput = c(TRUE, FALSE))
# Only input age groups by switching to rowFactor
HierarchyCompute(x, list(age = "rowFactor", geo = geoHier, year = "colFactor"), "ths_per")
# Select some years (colFactor) including a year not in input data (zeros produced)
HierarchyCompute(x, list(age = ageHier, geo = geoHier, year = "colFactor"), "ths_per",
colSelect = c("2014", "2016", "2018"))
# Select combinations of geo and age including a code not in data or hierarchy (zeros produced)
HierarchyCompute(x, list(age = ageHier, geo = geoHier, year = "colFactor"), "ths_per",
rowSelect = data.frame(geo = "EU", age = c("Y0-100", "Y15-64", "Y15-29")))
# Select combinations of geo, age and year
HierarchyCompute(x, list(age = ageHier, geo = geoHier, year = "colFactor"), "ths_per",
select = data.frame(geo = c("EU", "Spain"), age = c("Y15-64", "Y15-29"), year = 2015))
# Extend the hierarchy table to illustrate the effect of unionComplement
# Omit level since this is handled by autoLevel
geoHier2 <- rbind(data.frame(mapsFrom = c("EU", "Spain"), mapsTo = "EUandSpain", sign = 1),
geoHier[, -4])
# Spain is counted twice
HierarchyCompute(x, list(age = ageHier, geo = geoHier2, year = "colFactor"), "ths_per")
# Can be seen in the dataDummyHierarchy matrix
HierarchyCompute(x, list(age = ageHier, geo = geoHier2, year = "colFactor"), "ths_per",
output = "matrixComponents")
# With unionComplement=TRUE Spain is not counted twice
HierarchyCompute(x, list(age = ageHier, geo = geoHier2, year = "colFactor"), "ths_per",
unionComplement = TRUE)
# With constantsInOutput
HierarchyCompute(x, list(age = ageHier, geo = geoHier, year = "colFactor"), "ths_per",
constantsInOutput = data.frame(c1 = "AB", c2 = "CD"))
# More than one valueVar
x$y <- 10*x$ths_per
HierarchyCompute(x, list(age = ageHier, geo = geoHier), c("y", "ths_per"))
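The difference between addition and union in the unionComplement example can be sketched in base R with hand-written membership vectors. This is only an illustration of the idea: the vectors, the values and the use of pmin are assumptions made here, not the package's implementation.

```r
# Membership of input codes in two overlapping groups (illustrative)
EU    <- c(Portugal = 1, Spain = 1, Iceland = 0)
Spain <- c(Portugal = 0, Spain = 1, Iceland = 0)

# sign = 1 read as addition: Spain enters the combined group twice
byAddition <- EU + Spain

# sign = 1 read as union: membership is capped at one
byUnion <- pmin(EU + Spain, 1)

# Illustrative input values
values <- c(Portugal = 4, Spain = 10, Iceland = 1)

totalAddition <- sum(byAddition * values)  # Spain counted twice
totalUnion    <- sum(byUnion * values)     # Spain counted once
print(c(addition = totalAddition, union = totalUnion))
```

With these toy values the two totals differ by exactly the value for Spain.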