gensvm-package: GenSVM: A Generalized Multiclass Support Vector Machine

gensvm-package    R Documentation

GenSVM: A Generalized Multiclass Support Vector Machine

Description

The GenSVM classifier is a generalized multiclass support vector machine (SVM). This classifier aims to find decision boundaries that separate the classes with as wide a margin as possible. In GenSVM, the loss function that measures how misclassifications are counted is very flexible. This allows the user to tune the classifier to the dataset at hand and potentially obtain higher classification accuracy. Moreover, this flexibility means that GenSVM has a number of alternative multiclass SVMs as special cases. Another advantage of GenSVM is that it is trained in the primal space, which allows the use of warm starts during optimization. As a result, for common tasks such as cross-validation or repeated model fitting, GenSVM can be trained very quickly.

Details

This package provides functions for training the GenSVM model either as a separate model or through a cross-validated parameter grid search. In both cases the GenSVM C library is used for speed. Auxiliary functions for evaluating and using the model are also provided.

GenSVM functions

The main GenSVM functions are:

gensvm

Fit a GenSVM model for specific model parameters.

gensvm.grid

Run a cross-validated grid search for GenSVM.

For the GenSVM and GenSVMGrid models the following two functions are available. When applied to a GenSVMGrid object, the function is applied to the best GenSVM model.

plot

Plot the low-dimensional simplex space where the decision boundaries are fixed (for problems with 3 classes).

predict

Predict the class labels of new data using the GenSVM model.
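
As a minimal sketch of how fitting and prediction fit together, the snippet below trains a model on the built-in iris data and predicts the training labels. It assumes the gensvm package is installed and that gensvm(x, y) with default parameters is acceptable for this dataset; the guard skips the fit otherwise, and the variable names are only illustrative.

```r
# Illustrative data: four numeric features and a three-class label.
x <- iris[, -5]
y <- iris[, 5]

acc <- NULL
if (requireNamespace("gensvm", quietly = TRUE)) {
  # Fit a GenSVM model with the default model parameters.
  fit <- gensvm::gensvm(x, y)

  # Predict class labels for the same observations.
  y.pred <- predict(fit, x)

  # Share of correctly predicted labels, computed in base R.
  acc <- mean(as.character(y.pred) == as.character(y))
  print(acc)
}
```

Evaluating on the training data only shows the mechanics; a held-out sample gives a more honest estimate of performance.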

Moreover, for the GenSVM and GenSVMGrid models a coef function is defined:

coef.gensvm

Get the coefficients of the fitted GenSVM model.

coef.gensvm.grid

Get the parameter grid of the GenSVM grid search.

The following utility functions are also included:

gensvm.accuracy

Compute the accuracy score between true and predicted class labels.

gensvm.maxabs.scale

Scale each column of the dataset by its maximum absolute value, preserving sparsity and mapping the data to [-1, 1].

gensvm.train.test.split

Split a dataset into a training and testing sample.

gensvm.refit

Refit a fitted GenSVM model with slightly different parameters or on a different dataset.
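
To make two of these utilities concrete, here is a base R sketch of what an accuracy score and max-abs scaling compute. The functions accuracy_sketch and maxabs_scale_sketch are hypothetical stand-ins written for illustration, not the package's implementation; in particular, leaving all-zero columns unchanged is an assumption of this sketch.

```r
# Fraction of labels that match (illustrative stand-in, not gensvm.accuracy).
accuracy_sketch <- function(y.true, y.pred) {
  mean(as.character(y.true) == as.character(y.pred))
}

# Divide each column by its maximum absolute value.
# Assumption: all-zero columns are left unchanged to avoid division by zero.
maxabs_scale_sketch <- function(x) {
  x <- as.matrix(x)
  col.max <- apply(abs(x), 2, max)
  col.max[col.max == 0] <- 1
  sweep(x, 2, col.max, "/")
}

# Small example with one all-zero column.
x <- matrix(c(1, -4, 2, 0, 0, 0, 3, 6, -12), nrow = 3)
x.scaled <- maxabs_scale_sketch(x)
print(x.scaled)
```

Because each column is only divided by a positive constant, zero entries stay zero, which is why this kind of scaling preserves sparsity.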

Kernels in GenSVM

GenSVM can be used for both linear and nonlinear multiclass support vector machine classification. In general, linear classification will be faster, but depending on the dataset, higher classification performance can be achieved using a nonlinear kernel.

The following nonlinear kernels are implemented in the GenSVM package:

RBF

The Radial Basis Function kernel is a well-known kernel function based on the Euclidean distance between objects. It is defined as

k(x_i, x_j) = exp( -γ || x_i - x_j ||^2 )
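
The formula above can be evaluated directly in base R. The sketch below computes the full kernel matrix for a few rows of iris; rbf_kernel_sketch is a hypothetical helper and gamma = 0.5 is an arbitrary value chosen for illustration, not a package default.

```r
# RBF kernel matrix: exp(-gamma * squared Euclidean distance) for all row pairs.
rbf_kernel_sketch <- function(x, gamma) {
  d2 <- as.matrix(dist(x))^2
  exp(-gamma * d2)
}

x.rbf <- as.matrix(iris[1:5, 1:4])
K.rbf <- rbf_kernel_sketch(x.rbf, gamma = 0.5)
print(round(K.rbf, 3))
```

Each object has kernel value 1 with itself, and the value decays toward 0 as two objects move apart; larger gamma makes the decay faster.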

Polynomial

A polynomial kernel can also be used in GenSVM. This kernel function is implemented very generally and therefore takes three parameters (coef, gamma, and degree). It is defined as:

k(x_i, x_j) = ( γ x_i' x_j + coef)^{degree}
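
As a small worked example of this definition, the sketch below evaluates the polynomial kernel on two points. The helper poly_kernel_sketch and the parameter values are illustrative assumptions, not package defaults.

```r
# Polynomial kernel matrix: (gamma * <x_i, x_j> + coef)^degree for all row pairs.
poly_kernel_sketch <- function(x, gamma, coef, degree) {
  (gamma * tcrossprod(x) + coef)^degree
}

# Two objects: (1, 2) and (0, 1).
x.poly <- matrix(c(1, 2,
                   0, 1), nrow = 2, byrow = TRUE)
K.poly <- poly_kernel_sketch(x.poly, gamma = 1, coef = 1, degree = 2)
print(K.poly)
```

For these two points the inner products are 5, 2, and 1, so the kernel values are (5 + 1)^2 = 36, (2 + 1)^2 = 9, and (1 + 1)^2 = 4.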

Sigmoid

The sigmoid kernel is the final kernel implemented in GenSVM. This kernel has two parameters (gamma and coef) and is implemented as follows:

k(x_i, x_j) = tanh( γ x_i' x_j + coef)
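
The sigmoid kernel differs from the polynomial kernel only in the function applied to the scaled inner product. A base R sketch, again with a hypothetical helper name and arbitrary parameter values:

```r
# Sigmoid kernel matrix: tanh(gamma * <x_i, x_j> + coef) for all row pairs.
sigmoid_kernel_sketch <- function(x, gamma, coef) {
  tanh(gamma * tcrossprod(x) + coef)
}

x.sig <- matrix(c(0.5, -1.0,
                  1.0,  0.5,
                  0.0,  2.0), nrow = 3, byrow = TRUE)
K.sig <- sigmoid_kernel_sketch(x.sig, gamma = 0.5, coef = 0)
print(round(K.sig, 3))
```

All values lie strictly between -1 and 1 because of the tanh transformation.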

Author(s)

Gerrit J.J. van den Burg, Patrick J.F. Groenen
Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com>

References

Van den Burg, G.J.J. and Groenen, P.J.F. (2016). GenSVM: A Generalized Multiclass Support Vector Machine, Journal of Machine Learning Research, 17(225):1–42. URL https://jmlr.org/papers/v17/14-526.html.

See Also

gensvm, gensvm.grid


gensvm documentation built on Feb. 16, 2023, 5:58 p.m.