dcSVM: Divide-and-Conquer kernel SVM (DC-SVM)
In SwarmSVM: Ensemble Learning Algorithms Based on Support Vector Machines

dcSVM

R Documentation

Divide-and-Conquer kernel SVM (DC-SVM)

Description

Implementation of the Divide-and-Conquer kernel SVM (DC-SVM) algorithm by Cho-Jui Hsieh, Si Si, and Inderjit S. Dhillon.

Usage

dcSVM(
  x,
  y,
  k = 4,
  m,
  kernel = 3,
  max.levels,
  early = 0,
  final.training = FALSE,
  pre.scale = FALSE,
  seed = NULL,
  verbose = TRUE,
  valid.x = NULL,
  valid.y = NULL,
  valid.metric = NULL,
  cluster.method = "kmeans",
  cluster.fun = NULL,
  cluster.predict = NULL,
  ...
)

Arguments

`x`	the n x p training data matrix. Can be a dense matrix or a sparse matrix object.
`y`	a response vector for prediction tasks, with one value for each of the n rows of `x`. For classification, the values correspond to class labels and can be a 1 x n matrix, a simple vector, or a factor.
`k`	the number of sub-problems the data is divided into.
`m`	the number of samples used for kernel k-means.
`kernel`	the kernel type: 1 for linear, 2 for polynomial, 3 for Gaussian.
`max.levels`	the maximum number of levels.
`early`	whether to use early prediction.
`final.training`	whether to train the SVM over the entire data again. Usually not needed.
`pre.scale`	either a logical value indicating whether to scale the data, or an integer vector specifying the columns to scale. The data are not scaled separately inside the SVM.
`seed`	the random seed. Set it to `NULL` to randomize the model.
`verbose`	a logical value indicating whether to print training information.
`valid.x`	the m x p validation data matrix.
`valid.y`	if provided, it is used to calculate the validation score with `valid.metric`.
`valid.metric`	the metric function for the validation result. By default it is the accuracy for classification. A custom metric function is also accepted.
`cluster.method`	the clustering algorithm to use. Possible choices are "kmeans" (algorithm from `stats::kmeans`), "mlKmeans" (algorithm from `RcppMLPACK::mlKmeans`), and "kernkmeans" (algorithm from `kernlab::kkmeans`). If `cluster.fun` and `cluster.predict` are provided, `cluster.method` is ignored.
`cluster.fun`	the function used to train cluster labels for the data, given the number of centers. A custom function is accepted as long as the resulting list contains two fields named `cluster` and `centers`.
`cluster.predict`	the function used to predict cluster labels for the data from the trained clustering object. A custom function is accepted as long as the resulting list contains two fields named `cluster` and `centers`.
`...`	other parameters passed to `e1071::svm`
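
The contract described for `cluster.fun` and `cluster.predict` can be illustrated with a minimal sketch. The function names `my.cluster.fun` and `my.cluster.predict`, and their argument names (`x`, `centers`, `cluster.object`), are hypothetical: this page does not state the signatures that `dcSVM` uses when calling them, so check the package source before relying on this. The sketch only shows a pair of functions that each return a list with the fields `cluster` and `centers`.

```r
# Hypothetical training function: the argument names are assumptions,
# not the documented dcSVM calling convention.
my.cluster.fun <- function(x, centers, ...) {
  fit <- stats::kmeans(as.matrix(x), centers = centers, ...)
  # Return only the two fields named in the documentation
  list(cluster = fit$cluster, centers = fit$centers)
}

# Hypothetical prediction function: assigns each row to its nearest center.
my.cluster.predict <- function(x, cluster.object) {
  x <- as.matrix(x)
  ctr <- cluster.object$centers
  # Squared Euclidean distance from every row of x to every center
  d2 <- outer(rowSums(x^2), rowSums(ctr^2), "+") - 2 * x %*% t(ctr)
  list(cluster = max.col(-d2, ties.method = "first"), centers = ctr)
}

# Toy data: two well-separated groups of 20 points in 2 dimensions
set.seed(1)
toy <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 5), ncol = 2))
fit <- my.cluster.fun(toy, centers = 2)
pred <- my.cluster.predict(toy, fit)
```

If the signatures match what `dcSVM` expects, such a pair would be passed as `cluster.fun = my.cluster.fun, cluster.predict = my.cluster.predict`, in which case `cluster.method` is not used.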

Value

`svm`	a list of SVM models if early prediction is used, or a single SVM object otherwise.
`early`	whether the early prediction strategy is used.
`cluster.tree`	a matrix containing the clustering labels at each level.
`cluster.fun`	the clustering training function.
`cluster.predict`	the clustering prediction function.
`scale`	a list containing scaling information.
`valid.pred`	the validation predictions.
`valid.score`	the validation score.
`valid.metric`	the validation metric.
`time`	a list recording the time taken by each step.

Examples

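# Load the package and the bundled svmguide1 data set
library(SwarmSVM)
data(svmguide1)
# svmguide1 is a list: element 1 is used for training, element 2 for validation
svmguide1.t = as.matrix(svmguide1[[2]])
svmguide1 = as.matrix(svmguide1[[1]])
# Column 1 holds the labels; the remaining columns are the features
dcsvm.model = dcSVM(x = svmguide1[,-1], y = svmguide1[,1],
                    k = 4, max.levels = 4, seed = 0, cost = 32, gamma = 2,
                    kernel = 3, early = 0, m = 800,
                    valid.x = svmguide1.t[,-1], valid.y = svmguide1.t[,1])
# Compare the validation predictions with the true labels
preds = dcsvm.model$valid.pred
table(preds, svmguide1.t[,1])
# Validation score computed with the default metric
dcsvm.model$valid.score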
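
As a rough illustration of how a confusion table like the one above relates to an accuracy score, the following sketch computes accuracy by hand on made-up label vectors. The vectors `truth` and `preds` are invented for illustration and are not output from `dcSVM`; it is also an assumption that the default `valid.score` is plain accuracy computed this way.

```r
# Made-up labels for illustration only (not real dcSVM output)
truth <- c(1, 0, 1, 1, 0, 0)
preds <- c(1, 0, 0, 1, 0, 1)

# Confusion table: rows are predictions, columns are true labels
conf <- table(preds, truth)

# Accuracy is the share of cases on the diagonal of the table
accuracy <- sum(diag(conf)) / sum(conf)

# The same value computed directly from the two vectors
accuracy.direct <- mean(preds == truth)
```

The diagonal shortcut only works when both vectors contain the same set of labels, so that the table is square with matching row and column order.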

SwarmSVM documentation built on Dec. 28, 2022, 1:24 a.m.