model_aggregate: Hierarchical aggregation via model specification
In SSBtools: Algorithms and Tools for Tabular Statistics and Hierarchical Computations

model_aggregate

R Documentation

Description

Internally a dummy/model matrix is created according to the model specification. This model matrix is used in the aggregation process via matrix multiplication and/or the function aggregate_multiple_fun.
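
The matrix-multiplication step can be illustrated with base R alone. The following is a minimal sketch, not the package's implementation: the toy data frame `d` is hypothetical, and base R's `model.matrix` stands in for `ModelMatrix` as an assumption made for illustration.

```r
# Toy data (hypothetical, for illustration only)
d <- data.frame(
  age = c("young", "young", "old", "old"),
  geo = c("A", "B", "A", "B"),
  y   = c(1, 2, 3, 4)
)

# Dummy matrix: a total column plus one 0/1 column per age group.
# model.matrix() is used here in place of SSBtools::ModelMatrix().
x <- cbind(Total = 1, model.matrix(~ age - 1, data = d))

# Aggregation by matrix multiplication: each column of x marks the
# input rows that contribute to one output cell.
y_agg <- drop(crossprod(x, d$y))
y_agg
```

Each element of `y_agg` is the sum of `y` over the rows flagged by the corresponding column of `x`, which is the idea behind summing `sum_vars`.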

Usage

model_aggregate(
  data,
  sum_vars = NULL,
  fun_vars = NULL,
  fun = NULL,
  hierarchies = NULL,
  formula = NULL,
  dim_var = NULL,
  total = NULL,
  input_in_output = NULL,
  remove_empty = NULL,
  avoid_hierarchical = NULL,
  preagg_var = NULL,
  dummy = TRUE,
  pre_aggregate = dummy,
  aggregate_pkg = "base",
  aggregate_na = TRUE,
  aggregate_base_order = FALSE,
  list_return = FALSE,
  pre_return = FALSE,
  verbose = TRUE,
  mm_args = NULL,
  ...
)

Arguments

`data`	Data to be aggregated, typically a data frame, tibble, or data.table. If `data` is not a classic data frame, it is coerced to one internally.
`sum_vars`	Variables to be summed. This will be done via matrix multiplication.
`fun_vars`	Variables to be aggregated by supplied functions. This is done via `aggregate_multiple_fun` and `dummy_aggregate`, with `fun_vars` passed as the parameter `vars`.
`fun`	The `fun` parameter to `aggregate_multiple_fun`
`hierarchies`	The `hierarchies` parameter to `ModelMatrix`
`formula`	The `formula` parameter to `ModelMatrix`
`dim_var`	The `dimVar` parameter to `ModelMatrix`
`total`	When non-NULL, the `total` parameter to `ModelMatrix`. Thus, the actual default value is `"Total"`.
`input_in_output`	When non-NULL, the `inputInOutput` parameter to `ModelMatrix`. Thus, the actual default value is `TRUE`.
`remove_empty`	When non-NULL, the `removeEmpty` parameter to `ModelMatrix`. Thus, the actual default value is `TRUE` with formula input without hierarchy and otherwise `FALSE` (see `ModelMatrix`).
`avoid_hierarchical`	When non-NULL, the `avoidHierarchical` parameter to `Formula2ModelMatrix`, which is an underlying function of `ModelMatrix`.
`preagg_var`	Extra variables to be used as grouping elements in the pre-aggregate step
`dummy`	The `dummy` parameter to `dummy_aggregate`. When `TRUE`, only 0s and 1s are assumed in the generated model matrix. When `FALSE`, non-0s in this matrix are passed as an additional first input parameter to the `fun` functions.
`pre_aggregate`	Whether to pre-aggregate data to reduce the dimension of the model matrix. Note that all original `fun_vars` observations are retained in the aggregated dataset and `pre_aggregate` does not affect the final result. However, `pre_aggregate` must be set to `FALSE` when the `dummy_aggregate` parameter `dummy` is set to `FALSE` since then `unlist` will not be run. An exception to this is if the `fun` functions are written to handle list data.
`aggregate_pkg`	Package used to pre-aggregate. Parameter `pkg` to `aggregate_by_pkg`.
`aggregate_na`	Whether to include NAs in the grouping variables during pre-aggregation. Parameter `include_na` to `aggregate_by_pkg`.
`aggregate_base_order`	Parameter `base_order` to `aggregate_by_pkg`, used during pre-aggregation. The default is `FALSE` to avoid unnecessary sorting operations. When `TRUE`, an attempt is made to return the same result with `data.table` as with base R. This cannot be guaranteed due to potential variations in sorting behavior across different systems.
`list_return`	Whether to return a list of separate components including the model matrix `x`.
`pre_return`	Whether to return the pre-aggregate data as a two-component list. Can also be combined with `list_return` (see examples).
`verbose`	Whether to print information during calculations.
`mm_args`	List of further arguments passed to `ModelMatrix`.
`...`	Further arguments passed to `dummy_aggregate`.
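
The claim that `pre_aggregate` shrinks the model matrix without changing the result can be sketched in base R for the summed-variable case. This is an illustration under stated assumptions, not the package's code: the data frame `d` is hypothetical, and `aggregate` and `model.matrix` stand in for `aggregate_by_pkg` and `ModelMatrix`. How `fun_vars` observations are retained during pre-aggregation is not covered here.

```r
# Toy data (hypothetical): three "young" rows and two "old" rows
d <- data.frame(
  age = c("young", "young", "young", "old", "old"),
  y   = c(1, 2, 3, 4, 5)
)

# Without pre-aggregation: one model-matrix row per input row
x_full <- cbind(Total = 1, model.matrix(~ age - 1, data = d))
direct <- drop(crossprod(x_full, d$y))

# With pre-aggregation: first sum y within each unique age value ...
pre <- aggregate(y ~ age, data = d, FUN = sum)
# ... then build a smaller model matrix on the reduced data
x_pre <- cbind(Total = 1, model.matrix(~ age - 1, data = pre))
via_pre <- drop(crossprod(x_pre, pre$y))

nrow(x_full)  # 5
nrow(x_pre)   # 2
all.equal(direct, via_pre)
```

The smaller matrix has one row per unique combination of grouping values, so the saving grows with the amount of duplication in the dimensional variables.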

Details

With formula input, the output can be limited to selected model terms with formula_selection (see example). An attribute called startCol is added to the output data frame to make this functionality work.

Value

A data frame, or a list when list_return or pre_return is TRUE.

Examples

z <- SSBtoolsData("sprt_emp_withEU")
z$age[z$age == "Y15-29"] <- "young"
z$age[z$age == "Y30-64"] <- "old"
names(z)[names(z) == "ths_per"] <- "ths"
z$y <- 1:18

my_range <- function(x) c(min = min(x), max = max(x))

out <- model_aggregate(z, 
   formula = ~age:year + geo, 
   sum_vars = c("y", "ths"), 
   fun_vars = c(sum = "ths", mean = "y", med = "y", ra = "ths"), 
   fun = c(sum = sum, mean = mean, med = median, ra = my_range))

out

# Limited output can be achieved by formula_selection
formula_selection(out, ~geo)


# Using the single unnamed variable feature.
model_aggregate(z, formula = ~age, fun_vars = "y", 
                fun = c(sum = sum, mean = mean, med = median, n = length))


# To illustrate list_return and pre_return 
for (pre_return in c(FALSE, TRUE)) for (list_return in c(FALSE, TRUE)) {
  cat("\n=======================================\n")
  cat("list_return =", list_return, ", pre_return =", pre_return, "\n\n")
  out <- model_aggregate(z, formula = ~age:year, 
                         sum_vars = c("ths", "y"), 
                         fun_vars = c(mean = "y", ra = "y"), 
                         fun = c(mean = mean, ra = my_range), 
                         list_return = list_return,
                         pre_return = pre_return)
  cat("\n")
  print(out)
}


# To illustrate preagg_var 
model_aggregate(z, formula = ~age:year, 
                sum_vars = c("ths", "y"), 
                fun_vars = c(mean = "y", ra = "y"), 
                fun = c(mean = mean, ra = my_range), 
                preagg_var = "eu",
                pre_return = TRUE)[["pre_data"]]


# To illustrate hierarchies 
geo_hier <- SSBtoolsData("sprt_emp_geoHier")
model_aggregate(z, hierarchies = list(age = "All", geo = geo_hier), 
                sum_vars = "y", 
                fun_vars = c(sum = "y"))

####  Special non-dummy cases illustrated below  ####

# Extend the hierarchy to make non-dummy model matrix  
geo_hier2 <- rbind(data.frame(mapsFrom = c("EU", "Spain"), 
                              mapsTo = "EUandSpain", sign = 1), geo_hier[, -4])

# Warning since non-dummy
# y and y_sum are different 
model_aggregate(z, hierarchies = list(age = "All", geo = geo_hier2), 
                sum_vars = "y", 
                fun_vars = c(sum = "y"))

# No warning: the model matrix is dummy because unionComplement = TRUE (see ?HierarchyCompute)
# y and y_sum are equal   
model_aggregate(z, hierarchies = list(age = "All", geo = geo_hier2), 
                sum_vars = "y", 
                fun_vars = c(sum = "y"),
                mm_args = list(unionComplement = TRUE))

# Non-dummy again, but no warning since dummy = FALSE
# Then pre_aggregate is by default set to FALSE (error when TRUE) 
# fun with extra argument needed (see ?dummy_aggregate)
# y and y_sum2 are equal
model_aggregate(z, hierarchies = list(age = "All", geo = geo_hier2), 
                sum_vars = "y", 
                fun_vars = c(sum2 = "y"),
                fun = c(sum2 = function(x, y) sum(x * y)),
                dummy = FALSE)
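
The non-dummy cases above can be mimicked in base R to show why a plain `sum` differs from matrix multiplication and why a two-argument function restores agreement. This is a rough sketch of the idea only: the matrix `x`, the vector `y`, and the helper `weighted_sum` are hypothetical, and the loop is not how `dummy_aggregate` is implemented.

```r
# Hypothetical non-dummy model matrix: row 2 counts twice in column "AB"
x <- matrix(c(1, 2, 0,
              0, 1, 1),
            nrow = 3,
            dimnames = list(NULL, c("AB", "BC")))
y <- c(10, 20, 40)

# Result of matrix multiplication (what sum_vars would give)
by_matmul <- drop(crossprod(x, y))

# Dummy-style aggregation: non-zero entries are treated as 1
plain_sum <- vapply(colnames(x), function(j) {
  sum(y[x[, j] != 0])
}, numeric(1))

# Non-dummy-style aggregation: the non-zero matrix entries are passed
# as an extra first argument to the function
weighted_sum <- function(w, v) sum(w * v)
with_weights <- vapply(colnames(x), function(j) {
  nz <- x[, j] != 0
  weighted_sum(x[nz, j], y[nz])
}, numeric(1))

by_matmul
plain_sum
with_weights
```

Here `plain_sum` disagrees with `by_matmul` in the column containing a 2, while `with_weights` matches it, mirroring the `y_sum` and `y_sum2` comparisons in the examples above.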

SSBtools documentation built on June 19, 2025, 5:07 p.m.
