| abclass | R Documentation |
Multi-category angle-based large-margin classifiers with regularization by the elastic-net or groupwise penalty.
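To make "angle-based" concrete, the following is a minimal sketch of the simplex class vertices described in Zhang and Liu (2014): each of the k classes is represented by a unit vector in k - 1 dimensions, and all pairs of vectors are separated by the same angle. The helper name simplex_vertices is hypothetical and is not part of abclass; the package may compute these vertices differently internally.

```r
## hypothetical helper (not an abclass function): columns are the k vertices
simplex_vertices <- function(k) {
    stopifnot(k >= 2)
    w <- matrix(0, nrow = k - 1, ncol = k)
    ## first vertex: (k - 1)^(-1/2) times the all-ones vector
    w[, 1] <- (k - 1) ^ (-1 / 2)
    for (j in seq(2, k)) {
        e <- numeric(k - 1)
        e[j - 1] <- 1
        ## remaining vertices: shifted and scaled unit vectors
        w[, j] <- - (1 + sqrt(k)) / (k - 1) ^ (3 / 2) + sqrt(k / (k - 1)) * e
    }
    w
}

w <- simplex_vertices(3)
colSums(w ^ 2)   # squared norms of the vertices
crossprod(w)     # pairwise inner products
```

Under this construction every vertex has unit length, the vertices sum to zero, and each pair has inner product -1 / (k - 1), so a new observation can be assigned to the class whose vertex forms the smallest angle with its fitted decision vector.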
abclass(
x,
y,
loss = c("logistic", "boost", "hinge.boost", "lum"),
penalty = c("glasso", "lasso"),
weights = NULL,
offset = NULL,
intercept = TRUE,
control = list(),
...
)
abclass.control(
lum_a = 1,
lum_c = 0,
boost_umin = -5,
alpha = 1,
lambda = NULL,
nlambda = 50L,
lambda_min_ratio = NULL,
lambda_max_alpha_min = 0.01,
penalty_factor = NULL,
ncv_kappa = 0.1,
gel_tau = 0.33,
mellowmax_omega = 1,
lower_limit = -Inf,
upper_limit = Inf,
epsilon = 1e-07,
maxit = 100000L,
standardize = TRUE,
varying_active_set = TRUE,
adjust_mm = FALSE,
save_call = FALSE,
verbose = 0L
)
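As a minimal sketch of how the two functions in the usage fit together, control parameters can be bundled with abclass.control() and handed to abclass() through the control argument. The iris data and the particular values of nlambda and alpha are illustrative choices only, not recommendations from the package.

```r
library(abclass)

## illustrative data: four numeric predictors and a three-level factor
x <- as.matrix(iris[, 1:4])
y <- iris$Species

## bundle a few of the control parameters named in the usage
ctrl <- abclass.control(nlambda = 10L, alpha = 0.5)

## fit with the logistic loss and the bundled control parameters
fit <- abclass(x = x, y = y, loss = "logistic", control = ctrl)

class(ctrl)
class(fit)
```

Keeping the control parameters in one object makes it easy to reuse the same settings across several fits.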
x |
A numeric matrix representing the design matrix. No missing values
are allowed. The coefficient estimates for constant columns will be
zero. Thus, one should set the argument |
y |
An integer vector, a character vector, or a factor vector representing the response label. |
loss |
A character value specifying the loss function. The available
options are "logistic", "boost", "hinge.boost", and "lum". |
penalty |
A character vector specifying the name of the penalty. |
weights |
A numeric vector for nonnegative observation weights. Equal observation weights are used by default. |
offset |
An optional numeric matrix for offsets of the decision functions. |
intercept |
A logical value indicating if an intercept should be
considered in the model. The default value is TRUE. |
control |
A list of control parameters. See abclass.control(). |
... |
Other control parameters passed to abclass.control(). |
lum_a |
A positive number greater than one representing the parameter
a in LUM, which will be used only if loss = "lum". |
lum_c |
A nonnegative number specifying the parameter c in LUM,
which will be used only if loss = "lum". |
boost_umin |
A negative number for adjusting the boosting loss for the internal majorization procedure. |
alpha |
A numeric value in $[0,1]$ representing the mixing parameter
alpha. The default value is 1. |
lambda |
A numeric vector specifying the tuning parameter
lambda. A data-driven lambda sequence will be generated
and used according to specified |
nlambda |
A positive integer specifying the length of the internally
generated lambda sequence. This argument will be ignored if a
valid |
lambda_min_ratio |
A positive number specifying the ratio of the
smallest lambda parameter to the largest lambda parameter. The default
value is set to |
lambda_max_alpha_min |
A positive number specifying the minimum
denominator when the function determines the largest lambda. If the
|
penalty_factor |
A numerical vector with nonnegative values specifying the adaptive penalty factors for individual predictors (excluding intercept). |
ncv_kappa |
A positive number within $(0,1)$ specifying the ratio of
reciprocal gamma parameter for group SCAD or group MCP. A close-to-zero
|
gel_tau |
A positive parameter tau for group exponential lasso penalty. |
mellowmax_omega |
A positive parameter omega for the Mellowmax penalty. It is experimental and may be removed in the future. |
lower_limit, upper_limit |
Numeric matrices representing the desired lower and upper limits for the coefficient estimates, respectively. |
epsilon |
A positive number specifying the relative tolerance that determines convergence. |
maxit |
A positive integer specifying the maximum number of iterations. |
standardize |
A logical value indicating if each column of the design
matrix should be standardized internally to have mean zero and standard
deviation equal to the sample size. The default value is TRUE. |
varying_active_set |
A logical value indicating if the active set
should be updated after each cycle of the coordinate-descent algorithm. The
default value is TRUE. |
adjust_mm |
An experimental logical value specifying if the estimation procedure should track the loss function and adjust the MM lower bound if needed. |
save_call |
A logical value indicating if the function call of the
model fitting should be saved. If |
verbose |
A nonnegative integer specifying if the estimation procedure
is allowed to print out intermediate steps/results. The default value
is 0. |
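The roles of nlambda and lambda_min_ratio can be illustrated with a small sketch. It assumes a log-spaced grid running from a largest value down to lambda_min_ratio times that value, which is a common convention for regularization paths; whether abclass spaces its sequence exactly this way is an assumption here. The helper make_lambda_grid and the inputs lambda_max = 2 and lambda_min_ratio = 0.01 are hypothetical and purely illustrative.

```r
## hypothetical helper (not an abclass function): a decreasing log-spaced grid
make_lambda_grid <- function(lambda_max, nlambda = 50L, lambda_min_ratio = 0.01) {
    stopifnot(lambda_max > 0, nlambda >= 1L,
              lambda_min_ratio > 0, lambda_min_ratio <= 1)
    ## equally spaced on the log scale, from lambda_max down to the smallest value
    exp(seq(log(lambda_max),
            log(lambda_max * lambda_min_ratio),
            length.out = nlambda))
}

grid <- make_lambda_grid(lambda_max = 2, nlambda = 5L, lambda_min_ratio = 0.01)
grid
```

In this sketch nlambda fixes the length of the grid and lambda_min_ratio fixes the ratio of its last element to its first, which matches the descriptions of those two arguments above.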
The function abclass() returns an object of class
abclass representing a trained classifier. The function
abclass.control() returns an object of class
abclass.control representing a list of control parameters.
Zhang, C., & Liu, Y. (2014). Multicategory Angle-Based Large-Margin Classification. Biometrika, 101(3), 625–640.
Liu, Y., Zhang, H. H., & Wu, Y. (2011). Hard or Soft Classification? Large-Margin Unified Machines. Journal of the American Statistical Association, 106(493), 166–177.
library(abclass)
set.seed(123)
## toy examples for demonstration purpose
## reference: example 1 in Zhang and Liu (2014)
ntrain <- 100 # size of training set
ntest <- 1000 # size of testing set
p0 <- 2 # number of actual predictors
p1 <- 2 # number of random predictors
k <- 3 # number of categories
n <- ntrain + ntest; p <- p0 + p1
train_idx <- seq_len(ntrain)
y <- sample(k, size = n, replace = TRUE) # response
mu <- matrix(rnorm(p0 * k), nrow = k, ncol = p0) # mean vector
## normalize the mean vector so that they are distributed on the unit circle
mu <- mu / apply(mu, 1, function(a) sqrt(sum(a ^ 2)))
x0 <- t(sapply(y, function(i) rnorm(p0, mean = mu[i, ], sd = 0.25)))
x1 <- matrix(rnorm(p1 * n, sd = 0.3), nrow = n, ncol = p1)
x <- cbind(x0, x1)
train_x <- x[train_idx, ]
test_x <- x[- train_idx, ]
y <- factor(paste0("label_", y))
train_y <- y[train_idx]
test_y <- y[- train_idx]
## regularization through group lasso penalty
model <- abclass(
x = train_x,
y = train_y,
loss = "logistic",
penalty = "glasso"
)
pred <- predict(model, test_x, s = 5)
mean(test_y == pred) # accuracy
table(test_y, pred)