predictML: Parallel-Voting prediction of multiple machine learning...


Description

predictML works on models built by sampling your data and running the machine learning algorithm on these samples in parallel on your own machine. The resulting models then vote on a prediction. This can return predictions much faster than the regular machine learning algorithm, and possibly even more accurate ones.
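The idea can be sketched with base R alone. The snippet below is a minimal illustration under stated assumptions, not the package's internal implementation: it uses the `parallel` package and `glm` from `stats`, and the names `n_models`, `samples`, `models`, `probs` and `votes` are hypothetical. It draws several 80% subsamples of `mtcars`, fits one model per subsample on a small local cluster, and lets the models vote.

```r
library(parallel)

set.seed(1)
data(mtcars)

# Draw 5 random subsamples, each holding 80% of the rows
# (an odd number of models avoids tied votes)
n_models <- 5
samples <- lapply(seq_len(n_models), function(i) {
  mtcars[sample(nrow(mtcars), size = floor(0.8 * nrow(mtcars))), ]
})

# Fit one model per subsample on a local cluster of 2 workers
cl <- makeCluster(2)
clusterEvalQ(cl, library(stats))  # every worker loads the modelling package
models <- parLapply(cl, samples, function(d) {
  glm(am ~ wt, data = d, family = binomial)
})
stopCluster(cl)

# Each model predicts on the full data; the majority decides the class
probs <- sapply(models, predict, newdata = mtcars, type = "response")
votes <- rowMeans(probs > 0.5) > 0.5

# Compare the voted class with the observed one
table(predicted = votes, actual = mtcars$am)
```

Each worker only sees a subsample, which is where the speed-up for expensive algorithms would come from; the vote then combines the partial models into one prediction.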

Usage

predictML(predictCall, MLPackage, combine = "raw")

Arguments

predictCall

Your call to a predict function, given as a character string. The object passed to predict must be of class parallelML.

MLPackage

A character string of the package which provides your machine learning algorithm. This is needed since all cores should load the package.

combine

A character string: "raw" when you want to return a list of predictions, "vote" when you want to return the class upon which most models agree and "avg" when you want to return the average of numeric predictions (probabilities).
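What the three options mean can be illustrated on made-up predictions. This is a sketch in base R, not the code predictML runs internally; `raw_preds`, `raw_probs`, `vote` and `avg` are hypothetical names, and the values are invented for illustration.

```r
# Hypothetical class predictions from three models for four observations
raw_preds <- list(
  c("setosa", "versicolor", "virginica", "virginica"),
  c("setosa", "virginica",  "virginica", "versicolor"),
  c("setosa", "versicolor", "virginica", "virginica")
)

# "raw": the list of per-model predictions is returned unchanged
raw_preds

# "vote": for each observation, keep the label most models agree on
pred_matrix <- do.call(cbind, raw_preds)  # one column per model
vote <- apply(pred_matrix, 1, function(labels) {
  names(which.max(table(labels)))
})

# Hypothetical numeric predictions (probabilities) from three models
raw_probs <- list(
  c(0.9, 0.2, 0.4),
  c(0.7, 0.4, 0.5),
  c(0.8, 0.3, 0.6)
)

# "avg": element-wise mean of the numeric predictions
avg <- Reduce(`+`, raw_probs) / length(raw_probs)
```

Note that `which.max` picks the first label in case of a tie, so this sketch breaks ties alphabetically; how the package itself resolves ties is not stated here.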

Value

Either a list of prediction vectors, one per model (combine = "raw"), or a vector of the classes on which most models agree (combine = "vote"). With combine = "avg", the numeric predictions are averaged.

Note

Although it can cope with numeric probability predictions, this package is designed for classification labeling.

Author(s)

Wannes Rosiers

See Also

This package can be regarded as a parallel extension of machine learning algorithms; therefore, check the documentation of the package that provides the machine learning algorithm you want to use.

Examples

## Not run: 
# Load the library which provides svm
library(e1071)

# Create your data
data(iris)

# Create a model
parSvmModel <- parallelML("svm(formula = Species ~ ., data = iris)",
                     "e1071",samplingSize = 0.8)
                     
# Get prediction
parSvmPred   <- predictML("predict(parSvmModel,newdata=iris)",
                          "e1071","vote")

# Check the quality
table(parSvmPred,iris$Species)

## End(Not run)
## Not run: 
# Load the library which provides rpart
library(rpart)

# Create your data
data("magicData")

# Create a model
parTreeModel  <- parallelML("rpart(formula = V11 ~ ., data = trainData[,-1])",
                            "rpart",samplingSize = 0.8)

# Get prediction
parTrainTreePred  <- predictML("predict(parTreeModel,newdata=trainData[,-1],type='class')",
                               "rpart","vote")
parTestTreePred  <- predictML("predict(parTreeModel,newdata=testData[,-1],type='class')",
                              "rpart","vote")

# Check the quality
table(parTrainTreePred,trainData$V11)
table(parTestTreePred,testData$V11)

## End(Not run)
## Not run: 
# Load the library which provides svm
library(e1071)

# Create your data
data(iris)
data("magicData")

# Get numeric predictions of a Support Vector Machine
parsvmmodel   <- parallelML("svm(formula = Species ~ ., data = iris,probability=TRUE)",
                            "e1071",samplingSize = 0.8,
                            underSample = TRUE, underSampleTarget = "versicolor")
parsvmpred    <- predictML("predict(parsvmmodel,newdata=iris,probability=TRUE)",
                           "e1071","avg")

# Get numeric predictions of a generalized linear model
parglmmodel   <- parallelML("glm(formula = V11 ~ ., data = trainData[,-1], 
                                 family = binomial(link='logit'))","stats",samplingSize = 0.8)

parglmpred   <- predictML("predict(parglmmodel,newdata=trainData[,-1],type='response')",
                          "stats","avg")

## End(Not run)

parallelML documentation built on May 2, 2019, 2:44 a.m.