ebma: Bayesian Ensemble Model Averaging (EBMA)

View source: R/ebma.R

Description

ebma tunes EBMA and generates weights for classifier averaging.
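
As a rough illustration of what classifier averaging means, the sketch below combines predictions from several classifiers using a set of weights. The prediction values, the weights, and the object names (`preds`, `w`, `ebma_pred`) are made up for this example and are not output from autoMrP; the package's internal computation may differ.

```r
# Hypothetical unit-level predictions from three classifiers (rows = units).
preds <- cbind(
  best_subset = c(0.42, 0.55, 0.61),
  lasso       = c(0.40, 0.58, 0.65),
  gb          = c(0.47, 0.50, 0.60)
)

# Hypothetical ensemble weights; they are non-negative and sum to 1.
w <- c(best_subset = 0.5, lasso = 0.3, gb = 0.2)

# The averaged prediction for each unit is the weighted sum across classifiers.
ebma_pred <- as.vector(preds %*% w)
```

Because the weights sum to 1, each averaged prediction lies between the smallest and largest classifier prediction for that unit.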

Usage

ebma(
  ebma.fold,
  y,
  L1.x,
  L2.x,
  L2.unit,
  L2.reg,
  pc.names,
  post.strat,
  n.draws,
  tol,
  best.subset.opt,
  pca.opt,
  lasso.opt,
  gb.opt,
  svm.opt,
  verbose,
  cores
)

Arguments

ebma.fold

New data for EBMA tuning. A list containing the data that must not have been used in classifier training.

y

Outcome variable. A character scalar containing the column name of the outcome variable in survey.

L1.x

Individual-level covariates. A character vector containing the column names of the individual-level variables in survey and census used to predict outcome y. Note that geographic unit is specified in argument L2.unit.

L2.x

Context-level covariates. A character vector containing the column names of the context-level variables in survey and census used to predict outcome y.

L2.unit

Geographic unit. A character scalar containing the column name of the geographic unit in survey and census at which outcomes should be aggregated.

L2.reg

Geographic region. A character scalar containing the column name of the geographic region in survey and census by which geographic units are grouped (L2.unit must be nested within L2.reg). Default is NULL.

pc.names

Principal component variable names. A character vector containing the names of the context-level principal component variables.

post.strat

Post-stratification results. A list containing the best models for each of the tuned classifiers, the individual-level predictions on the classifier training data, and the post-stratified context-level predictions.

n.draws

EBMA number of samples. An integer-valued scalar specifying the number of bootstrapped samples to be drawn from the EBMA fold and used for tuning EBMA. Default is 100. Passed on from ebma.n.draws.

tol

EBMA tolerance. A numeric vector containing the tolerance values for improvements in the log-likelihood before the EM algorithm stops optimization. Values should range at least from 0.01 to 0.001. Default is c(0.01, 0.005, 0.001, 0.0005, 0.0001, 0.00005, 0.00001). Passed on from ebma.tol.

best.subset.opt

Tuned best subset parameters. A list returned from run_best_subset().

pca.opt

Tuned parameters for best subset with principal components. A list returned from run_pca().

lasso.opt

Tuned lasso parameters. A list returned from run_lasso().

gb.opt

Tuned gradient tree boosting parameters. A list returned from run_gb().

svm.opt

Tuned support vector machine parameters. A list returned from run_svm().

verbose

Verbose output. A logical argument indicating whether or not verbose output should be printed. Default is FALSE.

cores

The number of cores to be used. An integer indicating the number of processor cores used for parallel computing. Default is 1.
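
The roles of n.draws and tol can be sketched in base R. The example below is a minimal sketch, not autoMrP's implementation: `em_weights()` is a hypothetical helper that runs a textbook EM algorithm for mixture weights over fixed classifier predictions of a binary outcome, and stops once the log-likelihood improves by less than `tol`. The simulated fold, the classifier columns (`good`, `noisy`, `flat`), and plain resampling with replacement are all assumptions made for illustration. How autoMrP draws its samples and chooses among tolerance values is not described on this page.

```r
# Hypothetical helper (not part of autoMrP): EM for ensemble weights.
# preds: n x K matrix of predicted probabilities; y: 0/1 outcome vector.
em_weights <- function(preds, y, tol = 0.001, max_iter = 1000) {
  K <- ncol(preds)
  w <- rep(1 / K, K)                        # start from equal weights
  # Bernoulli likelihood of each observed outcome under each classifier
  lik <- preds * y + (1 - preds) * (1 - y)
  loglik_old <- -Inf
  for (iter in seq_len(max_iter)) {
    mix <- as.vector(lik %*% w)             # mixture likelihood per observation
    loglik <- sum(log(mix))
    if (loglik - loglik_old < tol) break    # improvement below tolerance: stop
    loglik_old <- loglik
    z <- sweep(lik, 2, w, `*`) / mix        # E-step: responsibilities
    w <- colMeans(z)                        # M-step: updated weights
  }
  list(weights = w, loglik = loglik, iterations = iter)
}

# Simulated stand-in for an EBMA fold with three classifiers' predictions.
set.seed(1)
n <- 300
p_true <- runif(n, 0.1, 0.9)
fold <- data.frame(
  y     = rbinom(n, 1, p_true),
  good  = p_true,
  noisy = pmin(pmax(p_true + rnorm(n, 0, 0.3), 0.01), 0.99),
  flat  = 0.5
)

n_draws  <- 20                              # kept small so the sketch runs fast
tol_grid <- c(0.01, 0.001, 0.0001)
clf      <- c("good", "noisy", "flat")

# For each tolerance, fit weights on n_draws bootstrap samples and average them.
avg_weights <- sapply(tol_grid, function(tol) {
  w <- sapply(seq_len(n_draws), function(i) {
    d <- fold[sample(nrow(fold), replace = TRUE), ]
    em_weights(as.matrix(d[, clf]), d$y, tol = tol)$weights
  })
  rowMeans(w)
})
colnames(avg_weights) <- tol_grid
```

Each column of `avg_weights` holds the averaged classifier weights for one tolerance value. Smaller tolerances let the EM loop run longer before stopping, which is why a grid of values is worth comparing.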


autoMrP documentation built on Jan. 21, 2021, 5:07 p.m.