parallelSVM: Parallel-voting version of Support-Vector-Machine


By sampling your data, fitting Support-Vector-Machine models on these samples in parallel on your own machine, and letting the models vote on a prediction, this function can return predictions much faster than the regular Support-Vector-Machine, and possibly even more accurate ones.
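The sample, fit, and vote idea described above can be sketched in a few lines of plain R. This is only an illustration built on `e1071::svm` and the base `parallel` package, not the package's actual implementation; the names `n_models`, `frac`, `samples`, `fits`, `votes` and `voted` are made up for this sketch, and sampling without replacement is an assumption.

```r
library(e1071)     # provides svm()
library(parallel)  # ships with base R

set.seed(42)
n_models <- 4      # hypothetical stand-in for numberCores
frac     <- 0.2    # hypothetical stand-in for samplingSize

# Draw one random row subset per model (assumed: without replacement).
samples <- lapply(seq_len(n_models), function(i) {
  iris[sample(nrow(iris), round(frac * nrow(iris))), ]
})

# Fit one SVM per subset on a small local cluster.
cl <- makeCluster(2)
clusterEvalQ(cl, library(e1071))
fits <- parLapply(cl, samples, function(d) svm(Species ~ ., data = d))
stopCluster(cl)

# One column of predicted labels per model, then a majority vote per row.
# Ties go to the label that sorts first in table().
votes <- sapply(fits, function(m) as.character(predict(m, iris)))
voted <- apply(votes, 1, function(v) names(which.max(table(v))))

# Compare the voted labels with the true species.
table(voted, iris$Species)
```

Each model only sees a fraction of the rows, which is where the speed-up comes from; the vote is what compensates for the smaller training sets.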

## S3 method for class 'formula'
parallelSVM(formula, data= NULL, numberCores = detectCores(),
			samplingSize = 0.2, ..., 
			subset, na.action = na.omit, scale = TRUE)
## Default S3 method
parallelSVM(x, y = NULL, numberCores = detectCores(), 
			samplingSize = 0.2, scale = TRUE, type = NULL, 
			kernel = "radial", degree = 3, 
			gamma = if (is.vector(x)) 1 else 1/ncol(x), 
			coef0 = 0, cost = 1, nu = 0.5, class.weights = NULL, 
			cachesize = 40, tolerance = 0.001, epsilon = 0.1, 
			shrinking = TRUE, cross = 0, probability = FALSE, 
			fitted = TRUE, seed = 1L, ..., subset, na.action = na.omit)

`formula`	a symbolic description of the model to be fit
`data`	An optional data frame containing the variables in the model. By default the variables are taken from the environment which 'svm' is called from.
`x`	A data matrix, a vector or a sparse matrix.
`y`	A response vector with one label for each row/component of x. Can be either a factor (for classification tasks) or a numeric vector (for regression).
`numberCores`	Number of cores on your machine to use. This also sets the number of samples drawn, and hence the number of models fitted.
`samplingSize`	Size of each sample, given as a fraction of data or of x (default: 0.2).
`scale`	A logical vector indicating the variables to be scaled. If scale is of length 1, the value is recycled as many times as needed. Per default, data are scaled internally (both x and y variables) to zero mean and unit variance. The center and scale values are returned and used for later predictions.
`type`	Support-Vector-Machine can be used as a classification machine, as a regression machine, or for novelty detection. Depending on whether y is a factor or not, the default setting for type is C-classification or eps-regression, respectively, but may be overwritten by setting an explicit value. Valid options are: C-classification, nu-classification, one-classification (for novelty detection), eps-regression, nu-regression.
`kernel`	The kernel used in training and predicting. Depending on the kernel type, you might consider changing some of the parameters below. Valid options are: linear, polynomial, radial basis, sigmoid.
`degree`	parameter needed for kernel of type polynomial (default: 3)
`gamma`	parameter needed for all kernels except linear (default: 1/(data dimension))
`coef0`	parameter needed for kernels of type polynomial and sigmoid (default: 0)
`cost`	cost of constraints violation (default: 1); it is the ‘C’-constant of the regularization term in the Lagrange formulation.
`nu`	parameter needed for nu-classification, nu-regression, and one-classification
`class.weights`	a named vector of weights for the different classes, used for asymmetric class sizes. Not all factor levels have to be supplied (default weight: 1). All components have to be named.
`cachesize`	cache memory in MB (default 40)
`tolerance`	tolerance of termination criterion (default: 0.001)
`epsilon`	epsilon in the insensitive-loss function (default: 0.1)
`shrinking`	option whether to use the shrinking-heuristics (default: TRUE)
`cross`	if an integer value k>0 is specified, a k-fold cross validation on the training data is performed to assess the quality of the model: the accuracy rate for classification and the Mean Squared Error for regression
`probability`	logical indicating whether the model should allow for probability predictions.
`fitted`	logical indicating whether the fitted values should be computed and included in the model or not (default: TRUE)
`seed`	integer seed for libsvm (used for cross-validation and probability prediction models).
`...`	additional parameters for the low level fitting function svm.default
`subset`	An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)
`na.action`	A function to specify the action to be taken if NAs are found. The default action is na.omit, which leads to rejection of cases with missing values on any required variable. An alternative is na.fail, which causes an error if NA cases are found. (NOTE: If given, this argument must be named.)
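The default for `gamma` in the usage above is the expression `if (is.vector(x)) 1 else 1/ncol(x)`. A small sketch of how it evaluates for the two input shapes; the helper `default_gamma` is hypothetical and exists only to make the expression easy to call here.

```r
# Hypothetical helper wrapping the default expression from the usage section.
default_gamma <- function(x) if (is.vector(x)) 1 else 1 / ncol(x)

x_matrix <- as.matrix(iris[, 1:4])  # four predictor columns
x_vector <- iris$Sepal.Length       # a single numeric predictor

default_gamma(x_matrix)  # 0.25, i.e. 1 / (number of columns)
default_gamma(x_vector)  # 1
```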

A list containing numberCores Support-Vector-Machine models.

Usage is just like svm; the only differences are numberCores (the number of cores to use, equal to the number of models you build) and samplingSize (the size of the sample used to create each model).
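As a rough illustration of how the two extra arguments interact, the arithmetic below assumes that `samplingSize` is read as a fraction of the rows and that the result is rounded; neither detail is confirmed by this page. The variable names are made up for the sketch.

```r
n_rows        <- nrow(iris)  # 150 observations
number_cores  <- 4           # hypothetical choice for numberCores
sampling_size <- 0.2         # the documented default for samplingSize

# Assumed: each model is trained on this many rows.
rows_per_model <- round(sampling_size * n_rows)
rows_per_model               # 30

# One model per core, so this many rows are fitted in total.
number_cores * rows_per_model  # 120
```

Under these assumptions, raising numberCores adds models without changing how much data each one sees, while raising samplingSize makes every model slower to fit.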

Wannes Rosiers

This package can be regarded as a parallel extension of svm.

## Not run: 
# Load parallelSVM and the normal svm function
library(parallelSVM)
library(e1071)

# Example with formula
# load trainData and testData
data(magicData)

# Calculate the model
# Here we use it on bigger data
system.time(serialSvm   <- svm(V11 ~ ., trainData[,-1], 
						probability=TRUE, cost=10, gamma=0.1))
system.time(parallelSvm <- parallelSVM(V11 ~ ., data = trainData[,-1],
						numberCores = 8, samplingSize = 0.2, 
						probability = TRUE, gamma=0.1, cost = 10))
                                       
# Calculate predictions
system.time(serialPredictions <- predict(serialSvm, testData))
system.time(parallelPredictions <- predict(parallelSvm, testData))

# Check for quality
table(serialPredictions, testData[,"V11"])
table(parallelPredictions, testData[,"V11"])

# Example without formula
# load data
data(iris)
x <- subset(iris, select = -Species)
y <- iris$Species

# estimate model and predict input values
system.time(model       <- parallelSVM(x, y))
system.time(serialmodel <- svm(x, y))

fitted(model)
fitted(serialmodel)

# Calculate predictions
system.time(serialPredictions <- predict(serialmodel, x))
system.time(parallelPredictions <- predict(model, x))

# Check for quality
table(serialPredictions, y)
table(parallelPredictions, y)

## End(Not run)