View source: R/run_classifiers.R
run_classifiers | R Documentation |
run_classifiers
Tunes classifiers, post-stratifies, and carries out ensemble Bayesian
model averaging (EBMA).
run_classifiers(
y,
L1.x,
L2.x,
mrp.L2.x,
L2.unit,
L2.reg,
L2.x.scale,
pcs,
pc.names,
folds,
bin.proportion,
bin.size,
cv.folds,
cv.data,
ebma.fold,
census,
ebma.size,
ebma.n.draws,
k.folds,
cv.sampling,
loss.unit,
loss.fun,
best.subset,
lasso,
pca,
gb,
svm,
mrp,
deep.mrp,
best.subset.L2.x,
lasso.L2.x,
pca.L2.x,
gb.L2.x,
svm.L2.x,
gb.L2.unit,
gb.L2.reg,
svm.L2.unit,
svm.L2.reg,
deep.L2.x,
deep.L2.reg,
deep.splines,
lasso.lambda,
lasso.n.iter,
gb.interaction.depth,
gb.shrinkage,
gb.n.trees.init,
gb.n.trees.increase,
gb.n.trees.max,
gb.n.minobsinnode,
svm.kernel,
svm.gamma,
svm.cost,
ebma.tol,
cores,
verbose
)
y |
Outcome variable. A character scalar containing the column name of
the outcome variable in |
L1.x |
Individual-level covariates. A character vector containing the
column names of the individual-level variables in |
L2.x |
Context-level covariates. A character vector containing the
column names of the context-level variables in |
mrp.L2.x |
MRP context-level covariates. A character vector containing
the column names of the context-level variables in |
L2.unit |
Geographic unit. A character scalar containing the column
name of the geographic unit in |
L2.reg |
Geographic region. A character scalar containing the column
name of the geographic region in |
L2.x.scale |
Scale context-level covariates. A logical argument
indicating whether the context-level covariates should be normalized.
Default is |
pcs |
Principal components. A character vector containing the column
names of the principal components of the context-level variables in
|
pc.names |
A character vector of the principal component variable names in the data. |
folds |
EBMA and cross-validation folds. A character scalar containing
the column name of the variable in |
bin.proportion |
Proportion of ideal types. A character scalar
containing the column name of the variable in |
bin.size |
Bin size of ideal types. A character scalar containing the
column name of the variable in |
cv.folds |
Data for cross-validation. A |
cv.data |
A data.frame containing the survey data used in classifier training. |
ebma.fold |
A data.frame containing the data not used in classifier training. |
census |
Census data. A |
ebma.size |
EBMA fold size. A number in the open unit interval
indicating the proportion of respondents to be allocated to the EBMA fold.
Default is |
ebma.n.draws |
EBMA number of samples. An integer-valued scalar
specifying the number of bootstrapped samples to be drawn from the EBMA
fold and used for tuning EBMA. Default is |
k.folds |
Number of cross-validation folds. An integer-valued scalar
indicating the number of folds to be used in cross-validation. Default is
|
cv.sampling |
Cross-validation sampling method. A character-valued
scalar indicating whether cross-validation folds should be created by
sampling individual respondents ( |
loss.unit |
Loss function unit. A character-valued scalar indicating
whether performance loss should be evaluated at the level of individual
respondents ( |
loss.fun |
Loss function. A character-valued scalar indicating whether
prediction loss should be measured by the mean squared error ( |
best.subset |
Best subset classifier. A logical argument indicating
whether the best subset classifier should be used for predicting outcome
|
lasso |
Lasso classifier. A logical argument indicating whether the
lasso classifier should be used for predicting outcome |
pca |
PCA classifier. A logical argument indicating whether the PCA
classifier should be used for predicting outcome |
gb |
GB classifier. A logical argument indicating whether the GB
classifier should be used for predicting outcome |
svm |
SVM classifier. A logical argument indicating whether the SVM
classifier should be used for predicting outcome |
mrp |
MRP classifier. A logical argument indicating whether the standard
MRP classifier should be used for predicting outcome |
deep.mrp |
Deep MRP classifier. A logical argument indicating whether
the deep MRP classifier should be used for predicting outcome |
best.subset.L2.x |
Best subset context-level covariates. A character
vector containing the column names of the context-level variables in
|
lasso.L2.x |
Lasso context-level covariates. A character vector
containing the column names of the context-level variables in
|
pca.L2.x |
PCA context-level covariates. A character vector containing
the column names of the context-level variables in |
gb.L2.x |
GB context-level covariates. A character vector containing the
column names of the context-level variables in |
svm.L2.x |
SVM context-level covariates. A character vector containing
the column names of the context-level variables in |
gb.L2.unit |
GB L2.unit. A logical argument indicating whether
|
gb.L2.reg |
GB L2.reg. A logical argument indicating whether
|
svm.L2.unit |
SVM L2.unit. A logical argument indicating whether
|
svm.L2.reg |
SVM L2.reg. A logical argument indicating whether
|
deep.L2.x |
Deep MRP context-level covariates. A character vector
containing the column names of the context-level variables in |
deep.L2.reg |
Deep MRP L2.reg. A logical argument indicating whether
|
deep.splines |
Deep MRP splines. A logical argument indicating whether
splines should be used in the deep MRP classifier. Default is |
lasso.lambda |
Lasso penalty parameter. A numeric |
lasso.n.iter |
Lasso number of lambda values. An integer-valued scalar
specifying the number of lambda values to search over. Default is
|
gb.interaction.depth |
GB interaction depth. An integer-valued vector
whose values specify the interaction depth of GB. The interaction depth
defines the maximum depth of each tree grown (i.e., the maximum level of
variable interactions). Default is |
gb.shrinkage |
GB learning rate. A numeric vector whose values specify
the learning rate or step-size reduction of GB. Values between |
gb.n.trees.init |
GB initial total number of trees. An integer-valued
scalar specifying the initial number of total trees to fit by GB. Default
is |
gb.n.trees.increase |
GB increase in total number of trees. An
integer-valued scalar specifying by how many trees the total number of
trees to fit should be increased (until |
gb.n.trees.max |
GB maximum number of trees. An integer-valued scalar
specifying the maximum number of trees to fit by GB. Default is |
gb.n.minobsinnode |
GB minimum number of observations in the terminal
nodes. An integer-valued scalar specifying the minimum number of
observations that each terminal node of the trees must contain. Default is
|
svm.kernel |
SVM kernel. A character-valued scalar specifying the kernel
to be used by SVM. The possible values are |
svm.gamma |
SVM kernel parameter. A numeric vector whose values specify the gamma parameter in the SVM kernel. This parameter is needed for all kernel types except linear. Default is a sequence of 20 values from 1e-5 to 1e-1, equally spaced on the log scale. |
svm.cost |
SVM cost parameter. A numeric vector whose values specify the cost of constraint violations in SVM. Default is a sequence of 5 values from 0.5 to 10, equally spaced on the log scale. |
ebma.tol |
EBMA tolerance. A numeric vector containing the
tolerance values for improvements in the log-likelihood before the EM
algorithm stops optimization. Values should range at least from |
cores |
The number of cores to be used. An integer indicating the number of processor cores used for parallel computing. Default is 1. |
verbose |
Verbose output. A logical argument indicating whether or not
verbose output should be printed. Default is |
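The tuning-grid arguments above (svm.gamma, svm.cost, and the gb.n.trees.* family) describe sequences of candidate values. The following is a minimal base R sketch of how such grids could be built. It is not the package's own code: the helper log_seq and all object names are hypothetical, the SVM endpoints and lengths are taken from the argument descriptions, and the tree counts are illustrative values only, since the documented defaults are not reproduced on this page. It also assumes the tree schedule is a regular sequence from the initial count up to the maximum.

```r
# Hypothetical helper: n values from lo to hi, equally spaced on the log scale.
log_seq <- function(lo, hi, n) {
  exp(seq(log(lo), log(hi), length.out = n))
}

# Grids using the endpoints and lengths stated for svm.gamma and svm.cost.
svm_gamma_grid <- log_seq(1e-5, 1e-1, 20)
svm_cost_grid  <- log_seq(0.5, 10, 5)

# Illustrative values only (assumptions, not the package defaults).
n_trees_init     <- 50   # stands in for gb.n.trees.init
n_trees_increase <- 50   # stands in for gb.n.trees.increase
n_trees_max      <- 500  # stands in for gb.n.trees.max

# Assumed schedule: start at the initial count and add the increase
# until the maximum is reached.
n_trees_grid <- seq(from = n_trees_init, to = n_trees_max, by = n_trees_increase)
```

Vectors built this way could be passed as svm.gamma and svm.cost. Equal spacing on the log scale means the ratio between neighbouring values is constant, so the grid covers several orders of magnitude with few candidates.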