alphasvm: Support Vector Machines taking initial alpha values
In SwarmSVM: Ensemble Learning Algorithms Based on Support Vector Machines

alphasvm

R Documentation

Description

alphasvm is used to train a support vector machine. It can be used to carry out general regression and classification (of nu and epsilon-type), as well as density-estimation. A formula interface is provided.

Usage

alphasvm(x, ...)

## S3 method for class 'formula'
alphasvm(
  formula,
  data = NULL,
  ...,
  subset,
  na.action = stats::na.omit,
  scale = FALSE
)

## Default S3 method:
alphasvm(
  x,
  y = NULL,
  scale = FALSE,
  type = NULL,
  kernel = "radial",
  degree = 3,
  gamma = if (is.vector(x)) 1 else 1/ncol(x),
  coef0 = 0,
  cost = 1,
  nu = 0.5,
  class.weights = NULL,
  cachesize = 40,
  tolerance = 0.001,
  epsilon = 0.1,
  shrinking = TRUE,
  cross = 0,
  probability = FALSE,
  fitted = TRUE,
  alpha = NULL,
  mute = TRUE,
  nclass = NULL,
  ...,
  subset,
  na.action = stats::na.omit
)

## S3 method for class 'alphasvm'
print(x, ...)

## S3 method for class 'alphasvm'
summary(object, ...)

## S3 method for class 'summary.alphasvm'
print(x, ...)

Arguments

`x`	a data matrix, a vector, or a sparse matrix (object of class `Matrix` provided by the `Matrix` package, or of class `matrix.csr` provided by the `SparseM` package, or of class `simple_triplet_matrix` provided by the `slam` package).
`...`	additional parameters for the low-level fitting function `alphasvm.default`
`formula`	a symbolic description of the model to be fit.
`data`	an optional data frame containing the variables in the model. By default the variables are taken from the environment from which `alphasvm` is called.
`subset`	An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)
`na.action`	A function to specify the action to be taken if `NA`s are found. The default action is `stats::na.omit`, which leads to rejection of cases with missing values on any required variable. An alternative is `stats::na.fail`, which causes an error if `NA` cases are found. (NOTE: If given, this argument must be named.)
`scale`	A logical vector indicating the variables to be scaled. If `scale` is of length 1, the value is recycled as many times as needed. The default is `FALSE` (no scaling). When scaling is enabled, data are scaled internally (both x and y variables) to zero mean and unit variance. The center and scale values are returned and used for later predictions.
`y`	a response vector with one label for each row/component of x. Can be either a factor (for classification tasks) or a numeric vector (for regression).
`type`	`alphasvm` can be used as a classification machine. The default setting for `type` is `C-classification`, but it may be set to `nu-classification` as well.
`kernel`	the kernel used in training and predicting. You might consider changing some of the following parameters, depending on the kernel type. linear: u'*v; polynomial: (gamma*u'*v + coef0)^degree; radial basis: exp(-gamma*|u-v|^2); sigmoid: tanh(gamma*u'*v + coef0)
`degree`	parameter needed for kernel of type `polynomial` (default: 3)
`gamma`	parameter needed for all kernels except `linear` (default: 1/(data dimension))
`coef0`	parameter needed for kernels of type `polynomial` and `sigmoid` (default: 0)
`cost`	cost of constraints violation (default: 1); it is the ‘C’-constant of the regularization term in the Lagrange formulation.
`nu`	parameter needed for `nu-classification`
`class.weights`	a named vector of weights for the different classes, used for asymmetric class sizes. Not all factor levels have to be supplied (default weight: 1). All components have to be named.
`cachesize`	cache memory in MB (default 40)
`tolerance`	tolerance of termination criterion (default: 0.001)
`epsilon`	epsilon in the insensitive-loss function (default: 0.1)
`shrinking`	option whether to use the shrinking-heuristics (default: `TRUE`)
`cross`	if an integer value k>0 is specified, a k-fold cross validation on the training data is performed to assess the quality of the model: the accuracy rate for classification and the Mean Squared Error for regression
`probability`	logical indicating whether the model should allow for probability predictions.
`fitted`	logical indicating whether the fitted values should be computed and included in the model or not (default: `TRUE`)
`alpha`	Initial values for the coefficients (default: `NULL`). A numerical vector for binary classification or an n x (k-1) matrix for a k-class classification problem, where n is the number of training cases.
`mute`	a logical value indicating whether to print training information from svm.
`nclass`	the number of classes in total.
`object`	An object of class `alphasvm`
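
The kernel formulas listed under `kernel` can be written out directly in base R. The following is only an illustrative sketch: the function names are hypothetical and are not part of SwarmSVM or libsvm, which evaluate the kernels internally in C/C++.

```r
# Illustrative base-R versions of the four kernels (hypothetical helper names).
# u and v are numeric vectors of equal length.
linear_kernel <- function(u, v) sum(u * v)

polynomial_kernel <- function(u, v, gamma, coef0 = 0, degree = 3) {
  (gamma * sum(u * v) + coef0)^degree
}

radial_kernel <- function(u, v, gamma) exp(-gamma * sum((u - v)^2))

sigmoid_kernel <- function(u, v, gamma, coef0 = 0) {
  tanh(gamma * sum(u * v) + coef0)
}

u <- c(1, 2)
v <- c(3, 4)
gamma <- 1 / length(u)  # mirrors the documented default 1/(data dimension)

k_lin  <- linear_kernel(u, v)
k_poly <- polynomial_kernel(u, v, gamma)
k_rbf  <- radial_kernel(u, v, gamma)
k_sig  <- sigmoid_kernel(u, v, gamma)
```

The sketch makes visible that `gamma` enters every kernel except the linear one, while `coef0` only affects the polynomial and sigmoid kernels and `degree` only the polynomial kernel.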

Details

For multiclass-classification with k levels, k>2, libsvm uses the ‘one-against-one’-approach, in which k(k-1)/2 binary classifiers are trained; the appropriate class is found by a voting scheme.
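
As a rough illustration of the one-against-one scheme, the sketch below enumerates the class pairs for the three `iris` species and applies a simple majority vote. The votes are made up for illustration, and the tie-breaking shown (first maximum) is an assumption of this sketch, not a statement about libsvm's behaviour.

```r
# Enumerate the k(k-1)/2 binary problems for a k-class task.
classes <- levels(iris$Species)   # k = 3 classes
pairs <- combn(classes, 2)        # one column per binary classifier
n_classifiers <- ncol(pairs)      # k(k-1)/2

# Hypothetical winners of the three pairwise classifiers for one observation.
votes <- c("setosa", "virginica", "virginica")

# Majority vote; which.max() returns the first maximum if there is a tie.
tally <- table(factor(votes, levels = classes))
predicted <- names(which.max(tally))
```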

libsvm internally uses a sparse data representation, which is also supported at the R level by the package SparseM.

If the predictor variables include factors, the formula interface must be used to get a correct model matrix.
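
The reason can be seen with base R alone. In the sketch below (the toy data frame is made up for illustration), converting a data frame with a factor column via `as.matrix` yields a character matrix, whereas `model.matrix` expands the factor into numeric dummy columns. The exact model matrix built by the formula method may differ in detail, for example in how the intercept is handled.

```r
# Toy data with one numeric and one factor predictor (made-up values).
df <- data.frame(
  x1 = c(1.5, 2.0, 3.5, 4.0),
  colour = factor(c("red", "green", "blue", "red"))
)

# Naive conversion: the factor forces the whole matrix to character.
naive <- as.matrix(df)

# Formula-based conversion: the factor becomes numeric indicator columns.
mm <- model.matrix(~ ., data = df)
colnames(mm)
```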

plot.svm allows a simple graphical visualization of classification models.

The probability model for classification fits a logistic distribution using maximum likelihood to the decision values of all binary classifiers, and computes the a-posteriori class probabilities for the multi-class problem using quadratic optimization. The probabilistic regression model assumes (zero-mean) Laplace-distributed errors for the predictions, and estimates the scale parameter using maximum likelihood.
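
A minimal sketch of both ideas, under stated assumptions: the decision values and residuals below are simulated or made up, and `glm` is used as a stand-in for the logistic fit (libsvm uses its own fitting routine). For a zero-mean Laplace distribution, the maximum likelihood estimate of the scale is the mean absolute residual.

```r
set.seed(1)

# Simulated decision values for one binary classifier (not real SVM output).
dec <- c(rnorm(50, mean = -1), rnorm(50, mean = 1))
y <- rep(c(0, 1), each = 50)

# Logistic fit mapping decision values to class probabilities.
fit <- glm(y ~ dec, family = binomial())
prob <- predict(fit, type = "response")

# Made-up regression residuals; ML scale estimate of a zero-mean Laplace.
res <- c(-0.4, 0.1, 0.3, -0.2)
laplace_scale <- mean(abs(res))
```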

Author(s)

Tong He (based on package e1071 by David Meyer and C/C++ code by Cho-Jui Hsieh in Divide-and-Conquer kernel SVM (DC-SVM))

References

Chang, Chih-Chung and Lin, Chih-Jen:
LIBSVM: a library for Support Vector Machines
https://www.csie.ntu.edu.tw/~cjlin/libsvm/
Exact formulations of models, algorithms, etc. can be found in the document:
Chang, Chih-Chung and Lin, Chih-Jen:
LIBSVM: a library for Support Vector Machines
https://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.ps.gz
More implementation details and speed benchmarks can be found in:
Rong-En Fan and Pai-Hsuen Chen and Chih-Jen Lin:
Working Set Selection Using the Second Order Information for Training SVM
https://www.csie.ntu.edu.tw/~cjlin/papers/quadworkset.pdf

Examples


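library(SwarmSVM)

# svmguide1 is a list: element 1 is the training set, element 2 the test set
data(svmguide1)
svmguide1.t <- svmguide1[[2]]
svmguide1 <- svmguide1[[1]]

# column 1 holds the labels, the remaining columns the features
model <- alphasvm(x = svmguide1[, -1], y = svmguide1[, 1], scale = TRUE)
preds <- predict(model, svmguide1.t[, -1])
table(preds, svmguide1.t[, 1])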

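data(iris)

# default with factor response:
model <- alphasvm(Species ~ ., data = iris)

# build initial alpha values from the fitted model:
# one row per training case, k - 1 = 2 columns for the 3 classes;
# rows of non-support vectors stay at zero
new.alpha <- matrix(0, nrow(iris), 2)
new.alpha[model$index, ] <- model$coefs

# refit, starting from the previous solution
model2 <- alphasvm(Species ~ ., data = iris, alpha = new.alpha)
preds <- predict(model2, as.matrix(iris[, -5]))
table(preds, iris[, 5])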
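
The step that expands the support-vector coefficients into a full `alpha` matrix can be wrapped in a small helper. This is only a sketch: `expand_alpha` is a hypothetical name, not a SwarmSVM function, and the index and coefficient values below are mock data standing in for `model$index` and `model$coefs`.

```r
# Hypothetical helper: place support-vector coefficients into an
# n x (k-1) matrix, leaving rows of non-support vectors at zero.
expand_alpha <- function(index, coefs, n) {
  coefs <- as.matrix(coefs)
  stopifnot(length(index) == nrow(coefs), all(index >= 1), all(index <= n))
  alpha <- matrix(0, nrow = n, ncol = ncol(coefs))
  alpha[index, ] <- coefs
  alpha
}

# Mock values: 6 training cases, 2 support vectors, k - 1 = 2 columns.
idx <- c(2, 5)
cf <- matrix(c(0.5, -0.5, 1, -1), nrow = 2)
a <- expand_alpha(idx, cf, n = 6)
```

With a fitted model, the call would presumably look like `expand_alpha(model$index, model$coefs, nrow(iris))`, which matches the manual construction of `new.alpha` in the example.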

SwarmSVM documentation built on Dec. 28, 2022, 1:24 a.m.