View source: R/FcRandomForest.R
FcRandomForest | R Documentation |
Random forest for forecasting using multivariate regression as published in [Breiman, 2001].
This function was successfully used in [Thrun et al., 2019].
FcRandomForest(Time, DF, formula = NULL, Horizon,
  Package = 'randomForest', AutoCorrelation, NoOfTree = 200,
  PlotIt = TRUE, Holidays, SimilarPoints = TRUE, ...)
Time |
Time [1:n] vector of objects of |
DF |
Dataframe [1:n,1:d] with |
formula |
Either a formula describing predictors and indicators or NULL. Usually set to |
Horizon |
Forecast horizon as a number of days. The test set is defined by |
Package |
Either 'ranger' or 'randomForest' |
AutoCorrelation |
If not missing, the name of a variable stored in DF can be given; it should be the predictor and also be defined in |
NoOfTree |
Number of trees to grow. [Hastie et al., 2013] suggests that random forests often stabilize around 200 trees; unlike boosting, larger numbers bring no improvement. |
PlotIt |
Plots MAE results, but if |
Holidays |
If missing, German holidays are used; otherwise a data frame or vector of will only be used if |
SimilarPoints |
Highly experimental; please set to FALSE if you want to publish your results |
... |
Further parameters of random forest such as |
mtry: Number of variables randomly sampled as candidates at each split, usually d/3 or higher but lower than d.
nodesize ('randomForest') or min.node.size ('ranger'): Default 5. Setting this number larger causes smaller trees to be grown. Trees are grown to the maximum size possible under this setting. [Hastie et al., 2013] recommends growing trees as large as possible.
maxnodes ('randomForest' only): Maximum number of terminal nodes that trees in the forest can have. If not given, trees are grown to the maximum possible size (subject to limits by nodesize). [Hastie et al., 2013] recommends growing trees as large as possible.
If NULL, then the autocorrelation defined by Horizon is used as predictor.
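One plausible reading of "autocorrelation defined by Horizon" is a copy of the target series lagged by Horizon steps, with the last Horizon observations held out as test set. The base R sketch below illustrates that reading only; it is not taken from the package source, and the names `y`, `horizon`, `y_lag`, `train` and `test` are hypothetical.

```r
# Toy daily series; all names here are illustrative and are not objects
# created by FcRandomForest.
set.seed(1)
n <- 100
horizon <- 14
time <- seq(as.Date("2019-01-01"), by = "day", length.out = n)
y <- 50 + 10 * sin(2 * pi * seq_len(n) / 7) + rnorm(n)

# Lag the series by 'horizon' steps: the value observed 'horizon' days
# earlier serves as predictor for the current value.
y_lag <- c(rep(NA_real_, horizon), y[seq_len(n - horizon)])
df <- data.frame(Time = time, y = y, y_lag = y_lag)
df <- df[!is.na(df$y_lag), ]   # drop rows without a lagged value

# Assumption: the test set is the last 'horizon' rows, the rest is training.
test_idx <- seq(nrow(df) - horizon + 1, nrow(df))
train <- df[-test_idx, ]
test <- df[test_idx, ]
```

Because the lag equals the forecast horizon, every lagged value needed for the test rows is already observed at the end of the training period.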
List with
Forecast |
Vector [1:Horizon] of predicted forecast values of the test data, names if |
TestDataPredictor |
See also |
FeatureImportance |
Importance of Features for Forecast, see |
Accuracy |
Output of |
Model |
Output of either |
TestDataIndicators |
data.frame[1:(d-1),1:Horizon], in the multivariate case all variables except predictor, in the other case NULL. See also |
TrainData |
data.frame[1:d,1:k], see |
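The mean absolute error (MAE) mentioned under PlotIt can be computed from a forecast vector and the held-out observations. A minimal base R sketch with made-up numbers follows; `forecast` and `observed` are hypothetical stand-ins for the Forecast element and the test data, and the sketch makes no claim about how the Accuracy element itself is produced.

```r
# Hypothetical values; in practice these would be the returned Forecast
# vector and the corresponding held-out observations.
forecast <- c(102, 98, 110, 95)
observed <- c(100, 101, 108, 99)

# Mean absolute error: average magnitude of the forecast errors.
mae <- mean(abs(observed - forecast))
print(mae)
```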
In a single (n=1) forecasting example [Thrun et al., 2019], the data scientist observed that randomForest extremely outperformed ranger, even with the same parameters and data. The reason is unknown and this observation remains unpublished.
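A comparison of this kind can be set up by fitting both packages on the same data with matching settings. The sketch below uses synthetic data and calls randomForest::randomForest and ranger::ranger directly rather than FcRandomForest; the data, the `mae` helper and the parameter values are assumptions for illustration, and the outcome on other data may differ from the observation above.

```r
# Synthetic regression data (hypothetical, not the call-center data).
set.seed(42)
n <- 200
d <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
d$y <- 3 * d$x1 + sin(6 * d$x2) + rnorm(n, sd = 0.1)
train <- d[1:150, ]
test <- d[151:200, ]

mae <- function(obs, pred) mean(abs(obs - pred))
results <- c(randomForest = NA_real_, ranger = NA_real_)

# Each fit runs only if the package is installed.
if (requireNamespace("randomForest", quietly = TRUE)) {
  fit_rf <- randomForest::randomForest(
    y ~ ., data = train,
    ntree = 200,    # number of trees
    mtry = 2,       # variables tried at each split (d = 3 here)
    nodesize = 5    # minimum size of terminal nodes
  )
  results["randomForest"] <- mae(test$y, predict(fit_rf, newdata = test))
}

if (requireNamespace("ranger", quietly = TRUE)) {
  fit_rg <- ranger::ranger(
    y ~ ., data = train,
    num.trees = 200,     # counterpart of ntree
    mtry = 2,
    min.node.size = 5    # counterpart of nodesize
  )
  results["ranger"] <- mae(test$y, predict(fit_rg, data = test)$predictions)
}

print(results)   # test-set MAE per package, NA if a package is missing
```

Note that maxnodes has no counterpart in this ranger call, in line with the "'randomForest' only" remark in the parameter list.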
Michael Thrun
[Breiman, 2001] Breiman, L., Random Forests, Machine Learning 45(1), 5-32, 2001.
[Hastie et al., 2013] Hastie, T., Tibshirani, R., Friedman, J. H.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, pages 587ff, 2013.
[Thrun et al., 2019] Thrun, M., Maerte, J., Boehme, P., and Gehlert, T.: Applying Two Theorems of Machine Learning to the Forecasting of Biweekly Arrivals at a Call Center, Proceedings of ECDA, accepted, Bayreuth, 2019.
randomForest, ranger
##ToDo