abclass | R Documentation

Description

Multi-category angle-based large-margin classifiers with regularization by the elastic-net or groupwise penalty.
Usage

abclass(
  x,
  y,
  intercept = TRUE,
  weight = NULL,
  loss = c("logistic", "boost", "hinge-boost", "lum"),
  control = list(),
  ...
)

abclass.control(
  lambda = NULL,
  alpha = 1,
  nlambda = 50L,
  lambda_min_ratio = NULL,
  grouped = TRUE,
  group_weight = NULL,
  group_penalty = c("lasso", "scad", "mcp"),
  dgamma = 1,
  lum_a = 1,
  lum_c = 1,
  boost_umin = -5,
  maxit = 100000L,
  epsilon = 1e-04,
  standardize = TRUE,
  varying_active_set = TRUE,
  verbose = 0L,
  ...
)
Arguments

x
    A numeric matrix representing the design matrix. No missing values
    are allowed. The coefficient estimates for constant columns will be
    zero; thus, one should set the argument intercept to TRUE instead
    of adding an all-one column to x.
y
    An integer vector, a character vector, or a factor vector
    representing the response label.
intercept
    A logical value indicating if an intercept should be considered in
    the model. The default value is TRUE.
weight
    A numeric vector of nonnegative observation weights. Equal
    observation weights are used by default.
loss
    A character value specifying the loss function. The available
    options are "logistic", "boost", "hinge-boost", and "lum".
control
    A list of control parameters. See abclass.control() for details.
...
    Other control parameters passed to abclass.control().
lambda
    A numeric vector specifying the tuning parameter lambda. A
    data-driven lambda sequence will be generated and used according to
    the specified alpha, nlambda, and lambda_min_ratio if this argument
    is left as NULL (the default).
alpha
    A numeric value in [0, 1] representing the mixing parameter alpha.
    The default value is 1.
nlambda
    A positive integer specifying the length of the internally
    generated lambda sequence. This argument will be ignored if a valid
    lambda is specified. The default value is 50.
lambda_min_ratio
    A positive number specifying the ratio of the smallest lambda to
    the largest lambda in the generated sequence. The default value is
    determined internally when this argument is left as NULL.
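A minimal sketch of how the tuning arguments above (lambda, alpha, nlambda, lambda_min_ratio) interact, assuming the abclass package is installed; the particular values here are illustrative, not recommendations:

```r
library(abclass)

## let abclass() generate a 20-value lambda sequence internally ...
ctl_auto <- abclass.control(nlambda = 20, lambda_min_ratio = 1e-3,
                            alpha = 0.5)

## ... or supply an explicit sequence, in which case nlambda and
## lambda_min_ratio are not needed
ctl_user <- abclass.control(lambda = 10 ^ seq(0, - 3, length.out = 20))
```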
grouped
    A logical value. Experimental flag to apply group penalties. The
    default value is TRUE.
group_weight
    A numeric vector of nonnegative values representing the adaptive
    penalty factors for the specified group penalty.
group_penalty
    A character vector specifying the name of the group penalty. The
    available options are "lasso", "scad", and "mcp".
dgamma
    A positive number specifying the increment to the minimal gamma
    parameter for group SCAD or group MCP. The default value is 1.
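As an illustration of the group-penalty arguments above, here is a hedged sketch assuming the abclass package is installed; the simulated data are purely hypothetical:

```r
library(abclass)
set.seed(1)
x <- matrix(rnorm(400), nrow = 100, ncol = 4)
y <- factor(sample(c("a", "b", "c"), 100, replace = TRUE))

## group MCP with explicit (here, uniform) adaptive penalty factors,
## one weight per predictor group
ctl <- abclass.control(group_penalty = "mcp", dgamma = 1,
                       group_weight = rep(1, ncol(x)), nlambda = 5)
mod <- abclass(x, y, loss = "logistic", control = ctl)
```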
lum_a
    A positive number greater than one representing the parameter a in
    LUM, which will be used only if loss = "lum".
lum_c
    A nonnegative number specifying the parameter c in LUM, which will
    be used only if loss = "hinge-boost" or loss = "lum".
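A sketch of a fit where the two LUM parameters above take effect, assuming the abclass package is installed; the data are simulated only for illustration:

```r
library(abclass)
set.seed(2)
x <- matrix(rnorm(300), nrow = 100, ncol = 3)
y <- factor(sample(c("u", "v", "w"), 100, replace = TRUE))

## lum_a and lum_c are consulted here because loss = "lum"
mod <- abclass(x, y, loss = "lum",
               control = abclass.control(lum_a = 1, lum_c = 1,
                                         nlambda = 5))
```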
boost_umin
    A negative number for adjusting the boosting loss for the internal
    majorization procedure. The default value is -5.
maxit
    A positive integer specifying the maximum number of iterations. The
    default value is 100000.
epsilon
    A positive number specifying the relative tolerance that determines
    convergence. The default value is 1e-04.
standardize
    A logical value indicating if each column of the design matrix
    should be standardized internally to have mean zero and standard
    deviation equal to one. The default value is TRUE.
varying_active_set
    A logical value indicating if the active set should be updated
    after each cycle of the coordinate-majorization-descent algorithm.
    The default value is TRUE.
verbose
    A nonnegative integer specifying if the estimation procedure is
    allowed to print out intermediate steps/results. The default value
    is 0.
Value

The function abclass() returns an object of class abclass representing
a trained classifier. The function abclass.control() returns an object
of class abclass.control representing a list of control parameters.
References

Zhang, C., & Liu, Y. (2014). Multicategory Angle-Based Large-Margin
Classification. Biometrika, 101(3), 625–640.

Liu, Y., Zhang, H. H., & Wu, Y. (2011). Hard or Soft Classification?
Large-Margin Unified Machines. Journal of the American Statistical
Association, 106(493), 166–177.
Examples

library(abclass)
set.seed(123)

## toy examples for demonstration purposes
## reference: example 1 in Zhang and Liu (2014)
ntrain <- 100 # size of training set
ntest <- 100  # size of testing set
p0 <- 5       # number of actual predictors
p1 <- 5       # number of random predictors
k <- 5        # number of categories

n <- ntrain + ntest
p <- p0 + p1
train_idx <- seq_len(ntrain)
y <- sample(k, size = n, replace = TRUE)         # response
mu <- matrix(rnorm(p0 * k), nrow = k, ncol = p0) # mean vectors
## normalize the mean vectors so that they lie on the unit circle
mu <- mu / apply(mu, 1, function(a) sqrt(sum(a ^ 2)))
x0 <- t(sapply(y, function(i) rnorm(p0, mean = mu[i, ], sd = 0.25)))
x1 <- matrix(rnorm(p1 * n, sd = 0.3), nrow = n, ncol = p1)
x <- cbind(x0, x1)
train_x <- x[train_idx, ]
test_x <- x[- train_idx, ]
y <- factor(paste0("label_", y))
train_y <- y[train_idx]
test_y <- y[- train_idx]

## regularization through the elastic-net penalty
## (equivalent to the lasso here, since alpha = 1)
control1 <- abclass.control(nlambda = 5, lambda_min_ratio = 1e-3,
                            alpha = 1, grouped = FALSE)
model1 <- abclass(train_x, train_y, loss = "logistic",
                  control = control1)
pred1 <- predict(model1, test_x, s = 5)
table(test_y, pred1)
mean(test_y == pred1) # accuracy

## groupwise regularization via the group lasso
model2 <- abclass(train_x, train_y, loss = "boost",
                  grouped = TRUE, nlambda = 5)
pred2 <- predict(model2, test_x, s = 5)
table(test_y, pred2)
mean(test_y == pred2) # accuracy