The function COBRA delivers prediction outcomes for a testing sample on the basis of a training sample and a collection of basic regression machines. By default, those machines are wrappers to the R packages lars, ridge, tree and randomForest, covering a fairly wide spectrum of contemporary prediction methods for regression. However, the most interesting way to use COBRA is to supply any regression method suggested by the context (see argument machines). COBRA may natively parallelize the computations (use option parallel).
train.design | Mandatory. The design matrix for the training sample.
train.responses | Mandatory. The responses vector for the training sample.
split | Optional. How should COBRA cut the training sample?
test | Mandatory. The design matrix of the testing sample.
machines | Optional. Basic regression machines provided by the user. This should be a matrix whose number of rows is the length of the training sample (ntrain) plus the length of the testing sample (ntest), with as many columns as machines. Element (i,j) of this matrix is assumed to be r_j(X_i), the (scalar) prediction of machine j for query point X_i, where i runs from 1 to ntrain+ntest.
machines.names | Optional. If
logGrid | Optional. If
grid | Optional. How many points should be used in the discretization scheme for calibrating the parameter epsilon.
alpha.machines | Optional. Coerce COBRA to use exactly
parallel | Optional. If
nb.cpus | Optional. If
plots | Optional. If
savePlots | Optional. If
logs | Optional. If
progress | Optional. If
path | Optional. If
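The layout expected for the machines argument can be sketched as follows. This is only an illustration under stated assumptions: the two machines (a linear model fitted with lm and a constant predictor) and the simulated data are chosen for the example, and fitting the machines on the whole training sample is a simplification that the description above does not prescribe. The result is a matrix with ntrain+ntest rows, training points first, and one column per machine.

```r
set.seed(1)  # only to make this illustration reproducible

# Simulated data (assumption: any design matrix and response would do).
n <- 200
ntrain <- 150
d <- 3
X <- matrix(runif(n * d, min = -1, max = 1), nrow = n, ncol = d)
Y <- X[, 1]^2 + X[, 2] + rnorm(n, sd = 0.1)

train.design    <- X[1:ntrain, , drop = FALSE]
train.responses <- Y[1:ntrain]
test            <- X[-(1:ntrain), , drop = FALSE]

# Stack the query points: rows 1..ntrain are training points,
# the remaining ntest rows are testing points.
all.points <- as.data.frame(rbind(train.design, test))
train.df   <- data.frame(y = train.responses,
                         all.points[1:ntrain, , drop = FALSE])

# Machine 1: ordinary least squares (illustrative choice).
fit.lm <- lm(y ~ ., data = train.df)

# Column j holds the predictions of machine j at every query point.
machines <- cbind(
  unname(predict(fit.lm, newdata = all.points)),  # machine 1
  rep(mean(train.responses), n)                   # machine 2: constant
)
machines.names <- c("LinearModel", "TrainMean")

dim(machines)  # ntrain + ntest rows, one column per machine

# The matrix could then be passed on, as in the package's own example:
# COBRA(train.design = train.design, train.responses = train.responses,
#       test = test, machines = machines, machines.names = machines.names)
```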
For most users, options grid and split should be set to their default values.
Returns a list including only
predict | The vector of predicted values.
Caution: If your data is ordered, you should shuffle the observations before calling COBRA since the algorithm assumes all data points are independent and identically distributed.
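One way to shuffle the observations, sketched here with base R only, is to draw a single random permutation of the row indices and apply it to both the design matrix and the responses so that each row stays paired with its response. The toy data and variable names below are illustrative assumptions.

```r
set.seed(42)  # only to make this illustration reproducible

# Toy ordered data: the first column of X equals the response,
# which makes it easy to check that the pairing is preserved.
n <- 10
X <- matrix(seq_len(n * 2), nrow = n, ncol = 2)
Y <- seq_len(n)

# One permutation, applied to rows of X and entries of Y alike.
perm <- sample(n)
X.shuffled <- X[perm, , drop = FALSE]
Y.shuffled <- Y[perm]

# Each shuffled row is still matched with its own response.
all(X.shuffled[, 1] == Y.shuffled)
```

The shuffled objects can then be split into training and testing samples as in the examples.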
Benjamin Guedj <benjamin.guedj@upmc.fr>
http://www.lsta.upmc.fr/doct/guedj/index.html
G. Biau, A. Fischer, B. Guedj and J. D. Malley (2013), COBRA: A Nonlinear Aggregation Strategy. http://arxiv.org/abs/1303.2236 and http://hal.archives-ouvertes.fr/hal-00798579
COBRA-package
n <- 500
d <- 30
ntrain <- 400
X <- replicate(d,2*runif(n = n)-1)
Y <- X[,1]^2 + X[,3]^3 + exp(X[,10]) + rnorm(n = n, sd = .1)
train.design <- as.matrix(X[1:ntrain,])
train.responses <- Y[1:ntrain]
test <- as.matrix(X[-(1:ntrain),])
test.responses <- Y[-(1:ntrain)]
## using the default machines
if (require(lars) && require(tree) && require(ridge) &&
    require(randomForest)) {
  res <- COBRA(train.design = train.design,
               train.responses = train.responses,
               test = test)
  print(cbind(res$predict, test.responses))
  plot(test.responses, res$predict, xlab = "Responses", ylab = "Predictions", pch = 3, col = 2)
  abline(0, 1, lty = 2)
}
## using own machines
machines.names <- c("Soothsayer","Dummy")
machines <- matrix(data = 0, nrow = n, ncol = 2)
machines[,1] <- Y+rnorm(n = n, sd=.1) ## soothsayer
machines[,2] <- mean(train.responses) ## dummy prediction, averaging train.responses
res2 <- COBRA(train.design = train.design,
              train.responses = train.responses,
              test = test,
              machines = machines,
              machines.names = machines.names)
print(cbind(res2$predict,test.responses))
plot(test.responses,res2$predict,xlab="Responses",ylab="Predictions",pch=3,col=2)
abline(0,1,lty=2)
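Beyond the scatter plots, a numeric summary of the predictions can be handy. The sketch below computes a mean squared error with a small helper; mse is a hypothetical function written for this illustration, not part of the COBRA package, and the vectors are stand-in values rather than actual COBRA output.

```r
# Hypothetical helper (not provided by the COBRA package):
# mean squared error between predictions and observed responses.
mse <- function(predictions, responses) {
  stopifnot(length(predictions) == length(responses))
  mean((predictions - responses)^2)
}

# Stand-in values for illustration. With the objects from the examples,
# one would call mse(res$predict, test.responses) or
# mse(res2$predict, test.responses) instead.
responses   <- c(1.0, 2.0, 3.0)
predictions <- c(1.5, 2.0, 2.0)
mse(predictions, responses)
```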