'RF' object implementing the most basic version of a random forest.
x
A data frame of all training predictors.
y
A vector of all training responses.
se
A list containing the standard deviation for each y value. This is useful, for example, if y is aggregated from repeated measurements or if the measurements are noisy.
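As a rough illustration of where such values might come from, the sketch below collapses repeated measurements into one response per observation plus its standard deviation. It is written in Python rather than the package's own R, and the helper name `aggregate_measurements` and the sample data are hypothetical, not part of the package.

```python
from statistics import mean, stdev

def aggregate_measurements(repeats):
    """Collapse repeated measurements into (y, se) lists.

    `repeats` is a list of lists; each inner list holds the repeated
    measurements for one training observation (at least two values).
    """
    y = [mean(r) for r in repeats]    # one aggregated response per observation
    se = [stdev(r) for r in repeats]  # sample standard deviation per observation
    return y, se

# Hypothetical data: two observations, measured two and three times.
y, se = aggregate_measurements([[1.0, 3.0], [2.0, 2.0, 2.0]])
```

The resulting `y` and `se` have the same length, one entry per training observation.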
ntree
The number of trees to grow in the forest. The default value is 500.
replace
An indicator of whether the training data is sampled with replacement. The default value is TRUE.
sampsize
The total number of samples to draw from the training data. If sampling with replacement, the default value is the number of training observations. If sampling without replacement, the default value is two-thirds of the number of training observations.
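The interaction between replace and sampsize can be sketched as follows. This is a Python illustration of the rule described above, not the package's code; the function names are hypothetical, and rounding the two-thirds value to the nearest integer is an assumption.

```python
import random

def default_sampsize(n_train, replace=True):
    """Default number of rows drawn per tree, following the documented rule."""
    if replace:
        return n_train
    # Assumption: two-thirds is rounded to the nearest integer, minimum 1.
    return max(1, round(2 * n_train / 3))

def draw_sample(n_train, replace=True, sampsize=None, seed=None):
    """Return the row indices (0-based) used to train one tree."""
    rng = random.Random(seed)
    if sampsize is None:
        sampsize = default_sampsize(n_train, replace)
    if replace:
        # Bootstrap sample: the same row may appear more than once.
        return rng.choices(range(n_train), k=sampsize)
    # Subsample: every drawn row is distinct.
    return rng.sample(range(n_train), k=sampsize)
```

With 150 training rows, for instance, the defaults give 150 draws with replacement and 100 distinct rows without replacement.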
mtry
The number of variables randomly selected at each split point. The default value is one third of the total number of features in the training data.
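A minimal sketch of how a candidate feature subset might be drawn at a split point. It is written in Python for illustration; the function names are hypothetical, and rounding one third down with a minimum of 1 is an assumption, since the documentation does not state the rounding rule.

```python
import random

def default_mtry(n_features):
    """One third of the feature count (assumed: rounded down, at least 1)."""
    return max(1, n_features // 3)

def candidate_features(n_features, mtry=None, seed=None):
    """Pick the feature indices (0-based) considered at one split point."""
    rng = random.Random(seed)
    if mtry is None:
        mtry = default_mtry(n_features)
    # Features are drawn without replacement, so each appears at most once.
    return sorted(rng.sample(range(n_features), k=mtry))
```

A fresh subset would be drawn at every split, which is what decorrelates the trees in the forest.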
nodesize
The minimum number of observations in a terminal node. The default value is 5.
splitrule
A string to specify how to find the best split among all candidate feature values. The current version only supports 'variance' which minimizes the overall MSE after splitting. The default value is 'variance'.
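To make the 'variance' rule concrete, the sketch below searches one numeric feature for the threshold that minimizes the summed squared error of the two child nodes, which is equivalent to minimizing the overall MSE because the number of observations is fixed. It is a Python illustration under stated assumptions, not the package's implementation: thresholds are taken at observed feature values, and `best_variance_split` and its `nodesize` argument are hypothetical names.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_variance_split(x, y, nodesize=1):
    """Return (threshold, total_sse) for one numeric feature, or None.

    Rows with x <= threshold go left, the rest go right. Splits that
    leave fewer than `nodesize` observations on either side are skipped.
    """
    best = None
    # The largest value is excluded: it would leave the right child empty.
    for t in sorted(set(x))[:-1]:
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        if len(left) < nodesize or len(right) < nodesize:
            continue
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (t, total)
    return best

# Hypothetical data: the responses separate cleanly at x <= 2.
split = best_variance_split([1, 2, 3, 4], [1.0, 1.0, 5.0, 5.0])
```

In a full tree this search would be repeated over each of the mtry candidate features, keeping the feature and threshold with the lowest total.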
avgfunc
An averaging function applied to the observations in a node to produce its prediction. The function takes a data frame of predictors 'x' and a vector of outcomes 'y' as input and returns a scalar. The default function takes the mean of the vector 'y'.
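The contract described above (predictors and outcomes in, one scalar out) can be sketched as follows. This is a Python stand-in in which the data frame 'x' is represented as a list of rows; `mean_avgfunc`, `median_avgfunc`, and `predict_leaf` are hypothetical names used only for illustration.

```python
from statistics import median

def mean_avgfunc(x, y):
    """Mirror of the documented default: ignore x, return the mean of y."""
    return sum(y) / len(y)

def median_avgfunc(x, y):
    """An alternative that is less sensitive to outlying responses."""
    return median(y)

def predict_leaf(x, y, avgfunc=mean_avgfunc):
    """Prediction for a terminal node holding predictors x and outcomes y."""
    return avgfunc(x, y)

# Hypothetical node contents: three observations with one predictor each.
leaf_x = [[0.1], [0.4], [0.9]]
leaf_y = [1.0, 2.0, 6.0]
```

Because the function also receives 'x', a custom version could weight or model the outcomes using the predictors in the node rather than ignoring them.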
forest
A list of 'RFTree' objects in the forest. If the class is extended, the list may contain the corresponding extended 'RFTree' object.
categoricalFeatureCols
A list of column indices for all categorical features. Used by the trees to detect categorical columns.
categoricalFeatureMapping
A list of encoding details for each categorical column, including all unique factor values and their corresponding numeric representation.
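One way to picture the relationship between categoricalFeatureCols and categoricalFeatureMapping is the sketch below, which replaces each categorical value with a numeric code and records the mapping. It is a Python illustration only: the function name, the dictionary layout, and the choice of 1-based codes in sorted order are assumptions, and the package's actual structure may differ.

```python
def encode_categorical(rows, categorical_cols):
    """Encode categorical columns as integers and return the mapping.

    `rows` is a list of row lists; `categorical_cols` holds the 0-based
    indices of the categorical columns. The input rows are not modified.
    """
    mapping = {}
    encoded = [list(r) for r in rows]
    for col in categorical_cols:
        levels = sorted({r[col] for r in rows})  # all unique values
        # Assumption: codes start at 1, in sorted order of the values.
        codes = {level: i + 1 for i, level in enumerate(levels)}
        mapping[col] = codes
        for r in encoded:
            r[col] = codes[r[col]]
    return encoded, mapping

# Hypothetical data: column 1 is categorical.
rows = [[1.5, "red"], [2.0, "blue"], [0.5, "red"]]
encoded, mapping = encode_categorical(rows, [1])
```

Keeping the mapping alongside the forest lets new data be encoded the same way at prediction time.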