wsvm: Weighted Support Vector Machines


Description

This is a modification of the svm function in package e1071 that can deal with observation weights.

Usage

wsvm(x, ...)

## S3 method for class 'formula'
wsvm(formula, data = NULL, case.weights = rep(1,
  nrow(data)), ..., subset, na.action = na.omit, scale = TRUE)

## Default S3 method:
wsvm(x, y = NULL, scale = TRUE, type = NULL,
  kernel = "radial", degree = 3, gamma = if (is.vector(x)) 1 else
  1/ncol(x), coef0 = 0, cost = 1, nu = 0.5, class.weights = NULL,
  case.weights = rep(1, nrow(x)), cachesize = 40, tolerance = 0.001,
  epsilon = 0.1, shrinking = TRUE, cross = 0, probability = FALSE,
  fitted = TRUE, seed = 1L, ..., subset = NULL, na.action = na.omit)

Arguments

x

(Required if no formula is given as principal argument.) A data matrix, a vector, or a sparse matrix (object of class Matrix provided by the Matrix package, or of class matrix.csr provided by the SparseM package, or of class simple_triplet_matrix provided by the slam package).

formula

A symbolic description of the model to be fit.

data

An optional data frame containing the variables in the model. By default the variables are taken from the environment which wsvm is called from.

case.weights

A vector of observation weights (default: a vector of 1s).

subset

An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)

na.action

A function to specify the action to be taken if NAs are found. The default action is na.omit, which leads to rejection of cases with missing values on any required variable. An alternative is na.fail, which causes an error if NA cases are found. (NOTE: If given, this argument must be named.)

scale

A logical vector indicating the variables to be scaled. If scale is of length 1, the value is recycled as many times as needed. By default, data are scaled internally (both x and y variables) to zero mean and unit variance. The center and scale values are returned and used for later predictions.

y

(Only if no formula is given as principal argument.) A response vector with one label for each row/component of x. Can be either a factor (for classification tasks) or a numeric vector (for regression).

type

wsvm can be used as a classification machine, as a regression machine, or for novelty detection. Depending on whether y is a factor or not, the default setting for type is C-classification or eps-regression, respectively, but it may be overridden by setting an explicit value.
Valid options are:

  • C-classification

  • nu-classification

  • one-classification (for novelty detection)

  • eps-regression

  • nu-regression

kernel

The kernel used in training and predicting. You might consider changing some of the following parameters, depending on the kernel type.

linear:

u'*v

polynomial:

(gamma*u'*v + coef0)^degree

radial basis:

exp(-gamma*|u-v|^2)

sigmoid:

tanh(gamma*u'*v + coef0)

degree

Parameter needed for kernel of type polynomial (default: 3).

gamma

Parameter needed for all kernels except linear (default: 1/(data dimension)).

coef0

Parameter needed for kernels of type polynomial and sigmoid (default: 0).

cost

Cost of constraints violation (default: 1) — it is the ‘C’-constant of the regularization term in the Lagrange formulation.

nu

Parameter needed for nu-classification, nu-regression, and one-classification.

class.weights

A named vector of weights for the different classes, used for asymmetric class sizes. Not all factor levels have to be supplied (default weight: 1). All components have to be named.

cachesize

Cache memory in MB (default: 40).

tolerance

Tolerance of termination criterion (default: 0.001).

epsilon

epsilon in the insensitive-loss function (default: 0.1).

shrinking

Option whether to use the shrinking-heuristics (default: TRUE).

cross

If an integer value k>0 is specified, a k-fold cross validation on the training data is performed to assess the quality of the model: the accuracy rate for classification and the Mean Squared Error for regression.

probability

Logical indicating whether the model should allow for probability predictions (default: FALSE).

fitted

Logical indicating whether the fitted values should be computed and included in the model or not (default: TRUE).

seed

Integer seed for libsvm (used for cross-validation and probability prediction models).

...

Additional parameters for the low level fitting function wsvm.default.

Details

wsvm is used to train a support vector machine with case weights. It can be used to carry out general regression and classification (of nu and epsilon-type), as well as density-estimation. A formula interface is provided.
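For instance, observation weights are passed via case.weights. A minimal sketch on the iris data, assuming that predict.wsvm, like e1071's predict.svm, returns the fitted values when newdata is omitted:

  data(iris)
  w <- rep(c(0.5, 1), 75)                       # alternating case weights for the 150 rows
  fit <- wsvm(Species ~ ., data = iris, case.weights = w)
  table(predict(fit), iris$Species)             # confusion matrix on the training data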

This function is a modification of the svm function in package e1071 written by David Meyer (based on C/C++ code by Chih-Chung Chang and Chih-Jen Lin). It uses an extension of LIBSVM that can deal with case weights, written by Ming-Wei Chang, Hsuan-Tien Lin, Ming-Hen Tsai, Chia-Hua Ho and Hsiang-Fu Yu, available at http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/#weights_for_data_instances.

For multiclass-classification with k levels, k>2, libsvm uses the ‘one-against-one’-approach, in which k(k-1)/2 binary classifiers are trained; the appropriate class is found by a voting scheme.
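As a quick illustration of the size of this decomposition (a generic sketch, not specific to wsvm):

  k <- 3               # e.g. the three iris species
  k * (k - 1) / 2      # 3 pairwise binary classifiers, combined by a voting scheme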

libsvm internally uses a sparse data representation, which is also supported at a high level by the package SparseM.

If the predictor variables include factors, the formula interface must be used to get a correct model matrix.
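For example, with a factor predictor the formula interface expands it into dummy variables automatically. A minimal sketch with a made-up data frame (the names df, x1 and grp are illustrative only):

  df <- data.frame(y   = factor(rep(c("a", "b"), each = 50)),
                   x1  = rnorm(100),
                   grp = factor(sample(c("low", "high"), 100, replace = TRUE)))
  fit <- wsvm(y ~ ., data = df)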

plot.svm allows a simple graphical visualization of classification models.

The probability model for classification fits a logistic distribution using maximum likelihood to the decision values of all binary classifiers, and computes the a-posteriori class probabilities for the multi-class problem using quadratic optimization. The probabilistic regression model assumes (zero-mean) Laplace-distributed errors for the predictions, and estimates the scale parameter using maximum likelihood.
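Probability estimates must be requested both at training and at prediction time. The following sketch assumes that predict.wsvm mirrors e1071's predict.svm in how it handles the probability argument:

  model <- wsvm(Species ~ ., data = iris, probability = TRUE)
  pred  <- predict(model, iris[, -5], probability = TRUE)
  head(attr(pred, "probabilities"))   # per-class a-posteriori probabilities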

Data are scaled internally, usually yielding better results. Parameters of SVM-models usually must be tuned to yield sensible results!
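A simple way to tune cost and gamma is a manual grid search over the built-in cross-validation accuracy. The sketch below assumes that, as for svm, a classification model fitted with cross > 0 stores the overall cross-validation accuracy in the component tot.accuracy:

  grid <- expand.grid(cost = c(0.1, 1, 10), gamma = c(0.01, 0.1, 1))
  acc  <- apply(grid, 1, function(p)
    wsvm(Species ~ ., data = iris, cost = p["cost"], gamma = p["gamma"],
         cross = 5)$tot.accuracy)
  grid[which.max(acc), ]              # parameter combination with the best CV accuracy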

Value

An object of class "wsvm", inheriting from "svm", containing the fitted model, including the following components (a short access sketch follows the list):

SV

The resulting support vectors (possibly scaled).

index

The index of the resulting support vectors in the data matrix. Note that this index refers to the preprocessed data (after the possible effect of na.omit and subset).

coefs

The corresponding coefficients times the training labels.

rho

The negative intercept.

obj

The value(s) of the objective function.

sigma

In case of a probabilistic regression model, the scale parameter of the hypothesized (zero-mean) Laplace distribution estimated by maximum likelihood.

probA, probB

Numeric vectors of length k(k-1)/2, where k is the number of classes, containing the parameters of the logistic distributions fitted to the decision values of the binary classifiers (1 / (1 + exp(a x + b))).
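These components can be inspected directly on the returned object (a sketch):

  m <- wsvm(Species ~ ., data = iris)
  length(m$index)   # number of support vectors
  head(m$coefs)     # coefficients times training labels
  m$rho             # negative intercept(s) of the binary classifiers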

Note

This modification is not well-tested.

References

Chang, Chih-Chung and Lin, Chih-Jen:
LIBSVM: a library for Support Vector Machines
http://www.csie.ntu.edu.tw/~cjlin/libsvm

Exact formulations of models, algorithms, etc. can be found in the document:
Chang, Chih-Chung and Lin, Chih-Jen:
LIBSVM: a library for Support Vector Machines
http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.ps.gz

More implementation details and speed benchmarks can be found in:
Rong-En Fan, Pai-Hsuen Chen and Chih-Jen Lin:
Working Set Selection Using Second Order Information for Training Support Vector Machines
http://www.csie.ntu.edu.tw/~cjlin/papers/quadworkset.pdf

See Also

predict.wsvm, plot.svm in package e1071, matrix.csr (in package SparseM).

Other svm: predict.wsvm

Examples

data(iris)
attach(iris)

## classification mode
## default with factor response:
model <- wsvm(Species ~ ., data = iris)

# alternatively the traditional interface:
x <- subset(iris, select = -Species)
y <- Species
model <- wsvm(x, y)

print(model)
summary(model)

# test with train data
pred <- predict(model, x)
# (same as:)
pred <- fitted(model)

# Check accuracy:
table(pred, y)

# compute decision values and probabilities:
pred <- predict(model, x, decision.values = TRUE)
attr(pred, "decision.values")[1:4,]

## visualize (classes by color, SV by crosses):
plot(cmdscale(dist(iris[,-5])),
  col = as.integer(iris[,5]),
  pch = c("o","+")[1:150 %in% model$index + 1])

## density-estimation

# create 2-dim. normal with rho=0:
X <- data.frame(a = rnorm(1000), b = rnorm(1000))
attach(X)

# traditional way:
m <- wsvm(X, gamma = 0.1)

# formula interface:
m <- wsvm(~., data = X, gamma = 0.1)

# test:
newdata <- data.frame(a = c(0, 4), b = c(0, 4))
predict(m, newdata)

## visualize:
plot(X, col = 1:1000 %in% m$index + 1, xlim = c(-5,5), ylim = c(-5,5))
points(newdata, pch = "+", col = 2, cex = 5)

## weights: (example not particularly sensible)
i2 <- iris
levels(i2$Species)[3] <- "versicolor"
summary(i2$Species)
wts <- 100 / table(i2$Species)
wts
m <- wsvm(Species ~ ., data = i2, class.weights = wts)

## case.weights:
fit <- wsvm(Species ~ ., data = iris, wf = "gaussian", bw = 0.5,
  case.weights = rep(c(0.5,1),75))
pred <- predict(fit)
mean(pred != iris$Species)
