predict.dt.madlib: Compute the predictions of the model produced by madlib.rpart


View source: R/madlib-rpart.R

Description

This is a wrapper for MADlib's decision tree prediction function. It accepts the result of madlib.rpart, which represents a decision tree, and computes predictions for new data sets.

Usage

## S3 method for class 'dt.madlib'
predict(object, newdata, type = c("response", "prob"), ...)

Arguments

object

A dt.madlib object, which is the result of madlib.rpart.

newdata

A db.obj object containing the data used for prediction. If it is not given, the data set used to train the model is used.

type

A string, default is "response". For regression trees, this returns the fitted values; for classification trees, it returns the predicted classes. The additional option "prob" applies only to classification trees and computes the probability of each class.

...

Other arguments. Not implemented yet.

Value

A db.obj object, which wraps a table that contains the predicted values and a valid ID column. For type='response', the predicted column holds the fitted values (regression tree) or the predicted classes (classification tree). For type='prob', there is one column for each class, containing the probability of that class.
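To make the two return shapes concrete, the following is a minimal sketch that requests both prediction types and previews the resulting tables. It assumes a running database with MADlib installed; .port and .dbname are placeholders for the port number and database name, as in the Examples section, and the variable names pred.class and pred.prob are chosen here for illustration only.

```r
library(PivotalR)

## Assumption: .port and .dbname hold the port number and database name
## of a running database that has MADlib installed
cid <- db.connect(port = .port, dbname = .dbname, verbose = FALSE)

x <- as.db.data.frame(abalone, conn.id = cid, verbose = FALSE)
key(x) <- "id"

## Classification tree, same model as in the Examples section
fit <- madlib.rpart(rings < 10 ~ length + diameter + height + whole + shell,
                    data = x, parms = list(split = 'gini'),
                    control = list(cp = 0.005))

## Predicted class for each row
pred.class <- predict(fit, x, type = "response")

## One probability column per class
pred.prob <- predict(fit, x, type = "prob")

## Both results are db.obj objects that wrap database tables;
## lk() pulls the first rows into R for inspection
lk(pred.class, 10)
lk(pred.prob, 10)

db.disconnect(cid)
```

Because the predictions stay in the database, only the rows requested through lk() are transferred into the R session.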

Author(s)

Author: Predictive Analytics Team at Pivotal Inc.

Maintainer: Frank McQuillan, Pivotal Inc.

References

[1] Documentation of decision tree in MADlib 1.6, http://doc.madlib.net/latest/

See Also

madlib.lm, madlib.glm, madlib.rpart, madlib.summary, madlib.arima, madlib.elnet are all MADlib wrapper functions.

predict.lm.madlib, predict.logregr.madlib, predict.elnet.madlib, predict.arima.css.madlib are all predict functions related to MADlib wrapper functions.

Examples

## Not run: 


## set up the database connection
## Assume that .port is port number and .dbname is the database name
cid <- db.connect(port = .port, dbname = .dbname, verbose = FALSE)

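## Copy the abalone data set into a database table
x <- as.db.data.frame(abalone, conn.id = cid, verbose = FALSE)

## Set the ID column of the table
key(x) <- "id"

## Fit a classification tree; the response is the logical value rings < 10
fit <- madlib.rpart(rings < 10 ~ length + diameter + height + whole + shell,
                    data = x, parms = list(split = 'gini'),
                    control = list(cp = 0.005))

## Predicted classes for the training data
predict(fit, x, type = "response")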

db.disconnect(cid)

## End(Not run)

PivotalR documentation built on May 30, 2017, 8:18 a.m.