View source: R/generate_population_totals.R
generate_population_totals | R Documentation |
Build a fixed model matrix on a population frame and return the
column totals needed for calibration (optionally weighted). The function
freezes the dummy/interaction structure on the population by constructing
a terms object, so downstream use on respondent data can reuse
the exact same encoding.
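The idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data frame, the weights `w`, and all object names are hypothetical, and the real function may differ in how it handles contrasts, sparsity, and missing values.

```r
# Toy population frame (hypothetical data, for illustration only)
pop <- data.frame(
  region = factor(c("N", "S", "S", "E", "N", "E")),
  urban  = factor(c("yes", "no", "yes", "yes", "no", "no"))
)
w <- c(10, 20, 15, 5, 30, 20)  # hypothetical frame weights, one per row

# One-sided formula with main effects and an interaction
f <- ~ region + urban + region:urban

# Build the model frame and take its terms object; the terms object
# records the variables and interaction structure of the formula
mf <- stats::model.frame(f, data = pop, na.action = stats::na.pass)
tt <- stats::terms(mf)

# Model matrix built from the terms object
X <- stats::model.matrix(tt, data = mf)

# Column totals: unweighted, and weighted (each row scaled by its weight)
totals_unweighted <- colSums(X)
totals_weighted   <- colSums(X * w)
```

With an intercept in the formula, the `(Intercept)` total is the number of rows in the unweighted case and the sum of the weights in the weighted case, which is a quick sanity check on any set of calibration totals.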
generate_population_totals(
  population_df,
  calibration_formula,
  weights = NULL,
  contrasts = NULL,
  include_intercept = TRUE,
  sparse = FALSE,
  na_action = stats::na.pass,
  drop_zero_cols = FALSE
)
population_df |
A data frame containing the calibration population. |
calibration_formula |
A one-sided formula specifying main effects and interactions
(e.g., |
weights |
Optional numeric vector of population weights (length |
contrasts |
Optional named list of contrasts to pass to |
include_intercept |
Logical; if |
sparse |
Logical; if |
na_action |
NA handling passed to |
drop_zero_cols |
Logical; if |
An object of class "calib_totals": a list with
population_totals: named numeric vector of column totals
levels: list of factor levels observed in the population (for reproducibility)
terms: the terms object built on population_df
contrasts: the contrasts actually used (from the model matrix)
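To show why the stored terms, levels, and contrasts matter, here is a minimal base R sketch of reusing them on respondent data. It assumes the stored components behave like the objects produced by `stats::terms()`, `stats::.getXlevels()`, and the `"contrasts"` attribute of a model matrix; the toy data and the names `lev`, `ctr`, and `resp` are hypothetical and stand in for the corresponding fields of the returned object.

```r
# Toy population frame (hypothetical data, for illustration only)
pop <- data.frame(
  region = factor(c("N", "S", "S", "E", "N", "E")),
  urban  = factor(c("yes", "no", "yes", "yes", "no", "no"))
)
f <- ~ region + urban

# Encoding fixed on the population
mf_pop <- stats::model.frame(f, data = pop)
tt     <- stats::terms(mf_pop)
X_pop  <- stats::model.matrix(tt, data = mf_pop)
lev    <- stats::.getXlevels(tt, mf_pop)  # factor levels seen in the population
ctr    <- attr(X_pop, "contrasts")        # contrasts actually used

# Respondent sample in which region "E" never appears
resp <- data.frame(
  region = factor(c("N", "S", "S")),
  urban  = factor(c("yes", "no", "no"))
)

# Naive encoding drops the unseen level, so the columns no longer match
X_naive <- stats::model.matrix(f, data = resp)

# Reusing terms, levels, and contrasts restores the population encoding
mf_resp <- stats::model.frame(tt, data = resp, xlev = lev)
X_resp  <- stats::model.matrix(tt, data = mf_resp, contrasts.arg = ctr)

stopifnot(identical(colnames(X_resp), colnames(X_pop)))
```

Passing `xlev` re-levels each respondent factor to the population's levels, so columns for categories absent from the sample are still created (as all-zero columns) and line up with the names of the population totals.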
# Example using the API data from the survey package
library(survey)
data(api) # loads apipop, apisrs, apistrat, etc.
# Build a population frame and create some binary fields used in a formula
pop <- apipop
pop$api00_bin <- as.factor(ifelse(pop$api00 >= 700, "700plus", "lt700"))
pop$growth_bin <- as.factor(ifelse(pop$growth >= 0, "nonneg", "neg"))
pop$ell_bin <- as.factor(ifelse(pop$ell >= 10, "highELL", "lowELL"))
pop$comp.imp_bin <- as.factor(ifelse(pop$comp.imp >= 50, "highComp", "lowComp"))
pop$hsg_bin <- as.factor(ifelse(pop$hsg >= 60, "highHSG", "lowHSG"))
# A calibration formula with main effects + a few interactions
cal_formula <- ~ stype + growth_bin + api00_bin + ell_bin + comp.imp_bin + hsg_bin +
  api00_bin:stype + hsg_bin:stype + comp.imp_bin:stype + api00_bin:growth_bin
# (Optional) frame weights if available; here we use unweighted totals
gp <- generate_population_totals(
  population_df = pop,
  calibration_formula = cal_formula,
  include_intercept = TRUE
)
# Named totals ready for calibration:
head(gp$population_totals)
# If you later build a respondent model matrix, reuse gp$terms to ensure alignment.
# (apisrs must first be given the same *_bin factors that were created on pop above.)
# X_resp <- model.matrix(gp$terms, data = apisrs)
# stopifnot(identical(colnames(X_resp), names(gp$population_totals)))