dasvm: Discriminant Adaptive Support Vector Machine
In schiffner/locClass: Collection of Local Classification Methods

Description Usage Arguments Details Value References See Also Examples

A local version of Support Vector Machines for classification that puts increased emphasis on a good model fit near the decision boundary.

dasvm(x, ...)

## S3 method for class 'formula'
dasvm(formula, data = NULL, case.weights = rep(1,
  nrow(data)), ..., subset, na.action = na.omit, scale = TRUE)

## Default S3 method:
dasvm(x, y = NULL, wf = c("biweight", "cauchy", "cosine",
  "epanechnikov", "exponential", "gaussian", "optcosine", "rectangular",
  "triangular"), bw, k, nn.only, itr = 3, method = c("prob", "decision"),
  scale = TRUE, type = NULL, case.weights = rep(1, nrow(x)), ...,
  subset = NULL, na.action = na.omit)

`x`	(Required if no `formula` is given as principal argument.) A data matrix, a vector, or a sparse matrix (object of class `Matrix` provided by the Matrix package, or of class `matrix.csr` provided by the SparseM package, or of class `simple_triplet_matrix` provided by the slam package).
`formula`	A symbolic description of the model to be fit.
`data`	An optional data frame containing the variables in the model. By default the variables are taken from the environment which `dasvm` is called from.
`case.weights`	Initial observation weights (defaults to a vector of 1s).
`subset`	An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)
`na.action`	A function to specify the action to be taken if `NA`s are found. The default action is `na.omit`, which leads to rejection of cases with missing values on any required variable. An alternative is `na.fail`, which causes an error if `NA` cases are found. (NOTE: If given, this argument must be named.)
`scale`	A logical vector indicating the variables to be scaled. If `scale` is of length 1, the value is recycled as many times as needed. Per default, data are scaled internally (both `x` and `y` variables) to zero mean and unit variance. The center and scale values are returned and used for later predictions.
`y`	(Only if no `formula` is given as principal argument.) A response vector with one label for each row/component of x. Should be a factor, otherwise it is coerced to a factor with a warning.
`wf`	A window function which is used to calculate weights that are introduced into the fitting process. Either a character string or a function, e.g. `wf = function(x) exp(-x)`. For details see the documentation for `wfs`.
`bw`	(Required only if `wf` is a string.) The bandwidth parameter of the window function. (See `wfs`.)
`k`	(Required only if `wf` is a string.) The number of nearest neighbors of the decision boundary to be used in the fitting process. (See `wfs`.)
`nn.only`	(Required only if `wf` is a string indicating a window function with infinite support and if `k` is specified.) Should only the `k` nearest neighbors or all observations receive positive weights? (See `wfs`.)
`itr`	Number of iterations for model fitting, defaults to 3. See also the Details section.
`method`	The method for adaptation to the decision boundary, either `"prob"` or `"decision"`. Defaults to "prob".
`type`	`dasvm` can be used only as a classification machine, hence valid options are: `C-classification` `nu-classification` `C-classification` is default.
`...`	Further parameters that are passed to `wsvm.default`, e.g. `kernel`, `degree`, `gamma`, `coef0`, `cost`, `nu`, `class.weights`, `case.weights`, `cachesize`, `tolerance`, `shrinking`, `cross`, `probability`, `fitted`, and `seed`. Note that `epsilon` is a parameter that is only needed in the regression case and thus will have no effect if specified.

The idea of Hand and Vinciotti (2003) to put increased weight on observations near the decision boundary is generalized to the multiclass case and applied to Support Vector Machines (SVM).

Two different methods are implemented to achieve this. The first one is based on the decision values. In order to deal with multiclass problems with k classes, k>2, ‘libsvm’ uses the ‘one-against-one’-approach, in which k(k-1)/2 binary classifiers are trained; the appropriate class is found by a voting scheme. Hence, there are decision values for every binary classification problem. The absolute decision values are proportional to the distance between the training observations and the decision boundary. A window function is applied to these distances in order to get observation weights.

The second method is based on posterior probabilities. The probability model for classification fits a logistic distribution using maximum likelihood to the decision values of all binary classifiers, and computes the a-posteriori class probabilities for the multi-class problem using quadratic optimization. The probabilistic regression model assumes (zero-mean) laplace-distributed errors for the predictions, and estimates the scale parameter using maximum likelihood. Observation weights are calculated based on the differences between the two largest estimated posterior probabilities.

Since the decision boundary is not known in advance an iterative procedure is required. First, an unweighted SVM is fitted to the data. Then either based on the estimated decision values or the estimated posterior probabilities observation weights are calculated. Then a weighted SVM (see wsvm) is fitted using these weights. Calculation of weights and model fitting is done several times in turn. The number of iterations is determined by the itr-argument that defaults to 3.

In order to calculate the weights a window function is applied to the decision values or posterior probabilities. The name of the window function (wf) can be specified as a character string. In this case the window function is generated internally in dasvm. Currently supported are "biweight", "cauchy", "cosine", "epanechnikov", "exponential", "gaussian", "optcosine", "rectangular" and "triangular".

Moreover, it is possible to generate the window functions mentioned above in advance (see wfs) and pass them to dasvm.

Any other function implementing a window function can also be used as wf argument. This allows the user to try own window functions. See help on wfs for details.

dasvm calls wsvm internally which is a version of Support Vector Machines that can deal with case weights and which is a modified version of the svm function in package e1071 written by David Meyer (based on C/C++-code by Chih-Chung Chang and Chih-Jen Lin). An extension of LIBSVM that can deal with case weights written by Ming-Wei Chang, Hsuan-Tien Lin, Ming-Hen Tsai, Chia-Hua Ho and Hsiang-Fu Yu is used. It is available at http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/#weights_for_data_instances.

libsvm internally uses a sparse data representation, which is also high-level supported by the package SparseM.

If the predictor variables include factors, the formula interface must be used to get a correct model matrix.

Data are scaled internally, usually yielding better results. Parameters of SVM-models usually must be tuned to yield sensible results!

An object of class "dasvm.formula" or "dasvm" inheriting from "wsvm.formula" or "wsvm" and "svm", a list containing the following components:

`case.weights`	A list of length `itr + 1`. The initial observation weights (a vector of 1s if none were given) and the observation weights calculated in the individual iterations.
`itr`	The number of iterations used.
`wf`	The window function used. Always a function, even if the input was a string.
`bw`	(Only if `wf` is a string or was generated by means of one of the functions documented in `wfs`.) The bandwidth used, `NULL` if `bw` was not specified.
`k`	(Only if `wf` is a string or was generated by means of one of the functions documented in `wfs`.) The number of nearest neighbors used, `NULL` if `k` was not specified.
`nn.only`	(Logical. Only if `wf` is a string or was generated by means of one of the functions documented in `wfs` and if `k` was specified.) `TRUE` if only the `k` nearest neighbors recieve a positive weight, `FALSE` otherwise.
`adaptive`	(Logical.) `TRUE` if the bandwidth of `wf` is adaptive to the local density of data points, `FALSE` if the bandwidth is fixed.
`call`	The (matched) function call.

Hand, D. J., Vinciotti, V. (2003), Local versus global models for classification problems: Fitting models where it matters, The American Statistician, 57(2) 124–130.

predict.dasvm, wsvm for a weighted version of Support Vector Machines.