This function applies Penalized Regression (Lasso and Ridge) to predictions from regression base learners to produce an ensemble prediction. Shrinkage parameter (lambda) is determined by minimizing the cross-validation error. The data partition for the integration phase does not have to be the same as the partition(s) used to generate the base learners. Functions from EnsembleBase are used for training and prediction of base learners. Also, base classes and generic methods of the same package are extended to support PenReg integration.
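To make the integration idea concrete, here is a minimal base-R sketch of ridge-penalized stacking, with lambda chosen by minimizing K-fold cross-validation error. It is an illustration under stated assumptions, not the code used by epenreg: the simulated data, the helper names (ridge_fit, ridge_predict, cv_error), the lambda grid, and the 5-fold split are all hypothetical, and only the ridge penalty is shown because it has a closed-form solution.

```r
# Illustrative sketch only: ridge-penalized stacking of base-learner predictions,
# with lambda chosen by K-fold cross-validation. Not the package's implementation.
set.seed(1)

n <- 200
y <- rnorm(n)
# Hypothetical base-learner predictions: the response plus learner-specific noise
P <- cbind(bl1 = y + rnorm(n, sd = 0.5),
           bl2 = y + rnorm(n, sd = 0.8),
           bl3 = y + rnorm(n, sd = 1.2))

# Closed-form ridge fit; the intercept is left unpenalized by centering
ridge_fit <- function(X, y, lambda) {
  xm <- colMeans(X)
  ym <- mean(y)
  Xc <- sweep(X, 2, xm)
  beta <- solve(crossprod(Xc) + lambda * diag(ncol(X)), crossprod(Xc, y - ym))
  list(beta = drop(beta), intercept = ym - sum(xm * beta))
}

ridge_predict <- function(fit, X) drop(fit$intercept + X %*% fit$beta)

# Mean squared error over held-out folds for a single lambda
cv_error <- function(X, y, lambda, folds) {
  errs <- sapply(sort(unique(folds)), function(k) {
    test <- folds == k
    fit <- ridge_fit(X[!test, , drop = FALSE], y[!test], lambda)
    mean((y[test] - ridge_predict(fit, X[test, , drop = FALSE]))^2)
  })
  mean(errs)
}

lambdas <- 10^seq(-3, 3, length.out = 25)   # hypothetical lambda grid
folds <- sample(rep(1:5, length.out = n))   # random 5-fold assignment
cv_mse <- sapply(lambdas, function(l) cv_error(P, y, l, folds))
lambda_min <- lambdas[which.min(cv_mse)]    # lambda with the smallest CV error

# Refit on all rows at the selected lambda and form the ensemble prediction
final_fit <- ridge_fit(P, y, lambda_min)
ensemble_pred <- ridge_predict(final_fit, P)
```

The columns of P stand in for the predictions that the base learners would supply; in practice those predictions and the fold assignment would come from the package's own data partitioning.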
epenreg(formula, data
  , baselearner.control=epenreg.baselearner.control()
  , integrator.control=epenreg.integrator.control()
  , ncores=1, filemethod=FALSE, print.level=1
  , preschedule = TRUE
  , schedule.method = c("random", "as.is", "task.length")
  , task.length
)
formula |
Formula expressing response variable and covariates. |
data |
Data frame containing the response variable and covariates. |
baselearner.control |
Control structure determining the base learners, their configurations, and data partitioning details. See |
integrator.control |
Control structure governing integrator behavior. See |
ncores |
Number of cores used for parallel training of base learners. |
filemethod |
Boolean flag indicating whether to save estimation objects to disk. Using |
print.level |
Controls the verbosity level. |
preschedule |
Boolean flag, indicating whether base learner training jobs must be scheduled statically ( |
schedule.method |
Method used for scheduling tasks on threads. In "as.is" tasks are assigned to threads in a round-robin fashion for static scheduling. In dynamic scheduling, tasks form a queue without any re-ordering. In "random", tasks are first randomly shuffled, and the rest is similar to "as.is". In "task.length", a heuristic algorithm is used in static scheduling for assigning tasks to threads to minimize load imbalance, i.e. make total task lengths in threads roughly equal. In dynamic scheduling, tasks are sorted in descending order of expected length to form the task queue. |
task.length |
Vector of estimated task lengths, to be used in the "task.length" method of scheduling. |
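The static scheduling options described for schedule.method can be sketched in base R as follows. This is a hedged illustration, not the scheduler used by the package: the function names, the example task lengths, and the thread count are hypothetical, and the greedy longest-first rule is only one plausible heuristic for balancing load, since the exact algorithm behind "task.length" is not specified here.

```r
# Illustrative sketch only; not the scheduler implemented in the package.
# Hypothetical estimated lengths for 7 base-learner training tasks
task_len <- c(8, 3, 5, 2, 7, 4, 1)
nthreads <- 3

# "as.is" (static): assign tasks to threads round-robin, in the given order
assign_round_robin <- function(ntasks, nthreads) {
  rep(seq_len(nthreads), length.out = ntasks)
}

# "random" (static): shuffle the tasks first, then proceed as in "as.is"
assign_random <- function(ntasks, nthreads) {
  shuffled <- sample(ntasks)
  thread <- integer(ntasks)
  thread[shuffled] <- assign_round_robin(ntasks, nthreads)
  thread
}

# "task.length"-style (static): greedy longest-first heuristic (an assumption),
# placing each task on the thread with the smallest total load so far
assign_by_length <- function(task_len, nthreads) {
  load <- numeric(nthreads)
  thread <- integer(length(task_len))
  for (i in order(task_len, decreasing = TRUE)) {
    k <- which.min(load)
    thread[i] <- k
    load[k] <- load[k] + task_len[i]
  }
  thread
}

# Dynamic "task.length": the queue is the tasks sorted by descending length
task_queue <- order(task_len, decreasing = TRUE)

# Compare total load per thread under the two static schemes
rr <- assign_round_robin(length(task_len), nthreads)
lpt <- assign_by_length(task_len, nthreads)
load_rr <- tapply(task_len, rr, sum)
load_lpt <- tapply(task_len, lpt, sum)
```

With these hypothetical lengths, round-robin leaves the busiest thread with a total load of 11, while the greedy rule reaches the ideal of 10 per thread.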
An object of class epenreg (if filemethod==TRUE, also has class epenreg.file), a list with the following elements:
call |
Copy of function call. |
formula |
Copy of formula argument in function call. |
instance.list |
An object of class |
integrator.config |
Copy of configuration object passed to the integrator. Object of class |
method |
Integration method. Currently, only "default" is supported. |
est |
A list with these elements: 1) |
y |
Copy of response variable vector. |
pred |
Within-sample prediction of the ensemble model. |
filemethod |
Copy of passed-in |
Mansour T.A. Sharabiani, Alireza S. Mahani
epenreg.baselearner.control, epenreg.integrator.control, Instance.List, Regression.Integrator.PenReg.SelMin.Config, Regression.CV.Batch.FitObj, Regression.Batch.FitObj, Regression.Integrator.PenReg.SelMin.FitObj
library(EnsemblePenReg)  # package providing epenreg(); loads EnsembleBase as a dependency
data(servo)
myformula <- class~motor+screw+pgain+vgain
perc.train <- 0.7
index.train <- sample(1:nrow(servo), size = round(perc.train*nrow(servo)))
data.train <- servo[index.train,]
data.predict <- servo[-index.train,]
## to run longer test using all 5 default regression base learners
## try: est <- epenreg(myformula, data.train, ncores=2)
est <- epenreg(myformula, data.train, ncores=2
, baselearner.control=epenreg.baselearner.control(baselearners="knn"))
newpred <- predict(est, data.predict)