post_stratification: Apply post-stratification to classifiers.

View source: R/post_stratification.R

post_stratification    R Documentation

Apply post-stratification to classifiers.

Description

Apply post-stratification to classifiers.
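
As a rough illustration of what post-stratification means here, the sketch below weights cell-level predictions by census proportions and aggregates them to the geographic unit. It is a generic base R example, not the package's implementation; the column names state and pred and all values are hypothetical.

```r
# Hypothetical census cells: one row per (unit, demographic cell).
census <- data.frame(
  state = c("A", "A", "B", "B"),
  bin.proportion = c(0.25, 0.75, 0.5, 0.5),
  pred = c(0.2, 0.6, 0.4, 0.8)  # assumed classifier predictions per cell
)

# Weighted average of the cell predictions within each geographic unit.
estimates <- vapply(
  split(census, census$state),
  function(d) sum(d$pred * d$bin.proportion) / sum(d$bin.proportion),
  numeric(1)
)

print(estimates)
```

Dividing by the sum of the weights keeps the estimate a proper average even if the proportions within a unit do not sum exactly to one.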

Usage

post_stratification(
  y,
  L1.x,
  L2.x,
  L2.unit,
  L2.reg,
  best.subset.opt,
  lasso.opt,
  lasso.L2.x,
  pca.opt,
  gb.opt,
  svm.opt,
  svm.L2.reg,
  svm.L2.unit,
  svm.L2.x,
  mrp.include,
  n.minobsinnode,
  L2.unit.include,
  L2.reg.include,
  kernel,
  mrp.L2.x,
  data,
  ebma.fold,
  census,
  verbose,
  deep.mrp,
  deep.L2.x,
  deep.L2.reg,
  deep.splines
)

Arguments

y

Outcome variable. A character scalar containing the column name of the outcome variable in survey.

L1.x

Individual-level covariates. A character vector containing the column names of the individual-level variables in survey and census used to predict outcome y. Note that geographic unit is specified in argument L2.unit.

L2.x

Context-level covariates. A character vector containing the column names of the context-level variables in survey and census used to predict outcome y. To exclude context-level variables, set L2.x = NULL.

L2.unit

Geographic unit. A character scalar containing the column name of the geographic unit in survey and census at which outcomes should be aggregated.

L2.reg

Geographic region. A character scalar containing the column name of the geographic region in survey and census by which geographic units are grouped (L2.unit must be nested within L2.reg). Default is NULL.
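
The nesting requirement can be checked before calling the function. The following is a minimal base R sketch, assuming a survey data frame with hypothetical columns state (the L2.unit) and region (the L2.reg); it only verifies that every unit maps to exactly one region.

```r
# Hypothetical survey data; column names are illustrative only.
survey <- data.frame(
  state = c("CA", "OR", "NY", "NY"),
  region = c("West", "West", "East", "East")
)

# Count the distinct regions observed for each geographic unit.
regions_per_unit <- tapply(
  survey$region, survey$state,
  function(r) length(unique(r))
)

# Units are nested within regions if each unit has exactly one region.
is_nested <- all(regions_per_unit == 1)
print(is_nested)
```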

best.subset.opt

Optimal tuning parameters from best subset selection classifier. A list returned by run_best_subset().

lasso.opt

Optimal tuning parameters from lasso classifier. A list returned by run_lasso().

lasso.L2.x

Lasso context-level covariates. A character vector containing the column names of the context-level variables in survey and census to be used by the lasso classifier. If NULL and lasso is set to TRUE, then lasso uses the variables specified in L2.x. Default is NULL.

pca.opt

Optimal tuning parameters from best subset selection with principal components classifier. A list returned by run_pca().

gb.opt

Optimal tuning parameters from gradient tree boosting classifier. A list returned by run_gb().

svm.opt

Optimal tuning parameters from support vector machine classifier. A list returned by run_svm().

svm.L2.reg

SVM L2.reg. A logical argument indicating whether L2.reg should be included in the SVM classifier. Default is FALSE.

svm.L2.unit

SVM L2.unit. A logical argument indicating whether L2.unit should be included in the SVM classifier. Default is FALSE.

svm.L2.x

SVM context-level covariates. A character vector containing the column names of the context-level variables in survey and census to be used by the SVM classifier. If NULL and svm is set to TRUE, then SVM uses the variables specified in L2.x. Default is NULL.

mrp.include

Whether to run MRP classifier. A logical argument indicating whether the standard MRP classifier should be used for predicting outcome y. Passed from autoMrP() argument mrp.

n.minobsinnode

GB minimum number of observations in the terminal nodes. An integer-valued scalar specifying the minimum number of observations that each terminal node of the trees must contain. Passed from autoMrP() argument gb.n.minobsinnode.

L2.unit.include

GB L2.unit. A logical argument indicating whether L2.unit should be included in the GB classifier. Passed from autoMrP() argument gb.L2.unit.

L2.reg.include

GB L2.reg. A logical argument indicating whether L2.reg should be included in the GB classifier. Passed from autoMrP() argument gb.L2.reg.

kernel

SVM kernel. A character-valued scalar specifying the kernel to be used by SVM. The possible values are linear, polynomial, radial, and sigmoid. Passed from autoMrP() argument svm.kernel.
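
One way to guard against an unsupported kernel name is to validate it against the four values listed above. The helper check_kernel below is hypothetical and not part of autoMrP; it relies on base R's match.arg(), which also accepts unambiguous partial matches.

```r
# Hypothetical helper: returns the kernel name if it is one of the
# documented values and raises an error otherwise.
check_kernel <- function(kernel) {
  match.arg(kernel, c("linear", "polynomial", "radial", "sigmoid"))
}

kernel <- check_kernel("radial")
print(kernel)
```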

mrp.L2.x

MRP context-level covariates. A character vector containing the column names of the context-level variables in survey and census to be used by the MRP classifier. The character vector is empty if no context-level variables should be used by the MRP classifier. If NULL and mrp is set to TRUE, then MRP uses the variables specified in L2.x. Default is NULL. Note: For the empty MrP model, set L2.x = NULL and mrp.L2.x = "".

data

A data.frame containing the survey data used in classifier training.

ebma.fold

A data.frame containing the data not used in classifier training.

census

Census data. A data.frame whose column names include L1.x, L2.x, L2.unit, L2.reg and pcs (the latter two only if specified), and either bin.proportion or bin.size.
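
The sketch below shows one possible shape for such a census data frame in base R. It assumes that bin.proportion is the share of each cell within its geographic unit, which this page does not state; the column names age and state and all counts are hypothetical.

```r
# Hypothetical census with one row per demographic cell and unit.
census <- data.frame(
  age = c(1, 2, 1, 2),
  state = c("A", "A", "B", "B"),
  bin.size = c(10, 30, 20, 20)
)

# Assumption: bin.proportion is the share of each cell within its unit.
census$bin.proportion <- census$bin.size /
  ave(census$bin.size, census$state, FUN = sum)

# Check that the expected columns are present (names are illustrative).
L1.x <- "age"
L2.unit <- "state"
missing_cols <- setdiff(c(L1.x, L2.unit, "bin.proportion"), names(census))
print(missing_cols)
```

An empty missing_cols vector indicates that every listed column exists in census.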

verbose

Verbose output. A logical argument indicating whether or not verbose output should be printed. Default is FALSE.

deep.mrp

Deep MRP classifier. A logical argument indicating whether the deep MRP classifier should be used for predicting outcome y. Default is FALSE.

deep.L2.x

Deep MRP context-level covariates. A character vector containing the column names of the context-level variables in survey and census to be used by the deep MRP classifier. If NULL and deep.mrp is set to TRUE, then deep MRP uses the variables specified in L2.x. Default is NULL.

deep.L2.reg

Deep MRP L2.reg. A logical argument indicating whether L2.reg should be included in the deep MRP classifier. Default is TRUE.

deep.splines

Deep MRP splines. A logical argument indicating whether splines should be used in the deep MRP classifier. Default is TRUE.


autoMrP documentation built on May 29, 2024, 6:40 a.m.