honestRF-class: honestRF Constructor

Description

'honestRF' object implementing the most basic version of a random forest.

Slots

An external pointer to a C++ honestRF object.


An external pointer to a C++ DataFrame object.


A vector of all training responses.


A list of the column indices of all categorical features. Trees use it to detect categorical columns.


A list of encoding details for each categorical column, including all unique factor values and their corresponding numeric representation.
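
The two categorical slots above can be illustrated with a minimal base R sketch. The data frame `x` and the names `categorical_cols`, `encoding`, `unique_values`, and `numeric_values` are hypothetical and chosen only for illustration; the package's internal representation may differ.

```r
# Hypothetical training data with one categorical (factor) column.
x <- data.frame(
  age = c(23, 35, 41, 52),
  color = factor(c("red", "blue", "red", "green"))
)

# Column indices of the categorical features.
categorical_cols <- which(vapply(x, is.factor, logical(1)))

# For each categorical column, record its unique factor values and the
# numeric codes that stand in for them (assumed layout, not the package's).
encoding <- lapply(categorical_cols, function(j) {
  values <- levels(x[[j]])
  list(col = j, unique_values = values, numeric_values = seq_along(values))
})
```

Here the index list holds the position of `color`, and the encoding pairs each of its levels with an integer code.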


The number of trees to grow in the forest. The default value is 500.


An indicator of whether sampling of training data is with replacement. The default value is TRUE.


The total number of samples to draw from the training data. If sampling with replacement, the default value is the length of the training data. If sampling without replacement, the default value is two-thirds of the length of the training data.


The number of variables randomly selected at each split point. The default value is one third of the total number of features of the training data.
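
The default rules for the sample size and the number of variables per split can be sketched as follows. The function names `default_sampsize` and `default_mtry` are hypothetical, and the rounding choices (`ceiling`, `floor`, and a lower bound of 1) are assumptions, since the text above does not say how fractional values are handled.

```r
# Default number of samples to draw: all n observations when sampling with
# replacement, otherwise two-thirds of n (rounding up is an assumption).
default_sampsize <- function(n, replace = TRUE) {
  if (replace) n else ceiling(2 * n / 3)
}

# Default number of variables tried at each split: one third of the p
# features (rounding down and the minimum of 1 are assumptions).
default_mtry <- function(p) {
  max(1, floor(p / 3))
}

n_with <- default_sampsize(300, replace = TRUE)
n_without <- default_sampsize(300, replace = FALSE)
m <- default_mtry(10)
```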


The minimum number of observations contained in a terminal node. The default value is 3.


The minimum size of terminal nodes for the averaging dataset. The default value is 3.


The proportion of the training data used as the splitting dataset. It is a ratio between 0 and 1. If the ratio is 1, the splitting dataset is the entire sampled set and the averaging dataset is empty. If the ratio is 0, the splitting dataset is empty and all the data is used for the averaging dataset (this is not recommended, since no data is available for splitting).
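
A minimal sketch of how such a ratio could partition a sampled set into splitting and averaging parts is shown below. The helper `partition_sample` is hypothetical, and assigning observations at random with a rounded count is an assumption, not the package's documented behavior.

```r
# Split the sampled indices into a splitting set and an averaging set.
# `ratio` is the share of the sample used for splitting (between 0 and 1).
partition_sample <- function(idx, ratio) {
  stopifnot(ratio >= 0, ratio <= 1)
  n_split <- round(ratio * length(idx))
  # sample.int avoids the surprising behavior of sample() on length-1 input.
  split_idx <- idx[sample.int(length(idx), n_split)]
  list(splitting = split_idx, averaging = setdiff(idx, split_idx))
}

set.seed(1)
half <- partition_sample(1:10, 0.5)  # 5 splitting, 5 averaging
full <- partition_sample(1:10, 1)    # averaging set is empty
none <- partition_sample(1:10, 0)    # splitting set is empty
```

The two boundary cases match the description above: a ratio of 1 leaves nothing for averaging, and a ratio of 0 leaves nothing for splitting.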


Whether the split value is taken as the average of two feature values. If FALSE, the split value is drawn from a uniform distribution between the two feature values. The default value is FALSE.
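
The two ways of choosing a split value can be sketched in base R. The function `split_value` and its argument `middle_split` are hypothetical names used for illustration only.

```r
# Choose a split value between two adjacent feature values a and b.
# middle_split = TRUE  : use the midpoint of a and b.
# middle_split = FALSE : draw one point uniformly between a and b.
split_value <- function(a, b, middle_split = FALSE) {
  if (middle_split) {
    (a + b) / 2
  } else {
    runif(1, min = a, max = b)
  }
}

set.seed(42)
mid_point <- split_value(2, 4, middle_split = TRUE)
random_point <- split_value(2, 4, middle_split = FALSE)
```

The midpoint variant is deterministic, while the uniform variant adds randomness to the split location.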

soerenkuenzel/hte documentation built on June 12, 2018, 4:26 p.m.