This function builds a random forest classifier, predicts class values for the unknown data set, and returns error rates and confusion matrices for the known and unknown data sets.
known | A data set with known classes, used to classify the unknown data set. Defaults to NULL.
unknown | A data set whose classes are considered unknown. Classes will be predicted for this data set. Defaults to NULL.
ctrl | A trainControl object from the caret package. Defaults to NULL.
grid | A tuning grid passed to the tuneGrid argument of caret's train function. Defaults to NULL.
keeps | A vector of feature names to consider in the model (must include 'class'). Defaults to NULL.
samps | A vector of sample sizes, one per class, passed to the random forest sampsize argument.
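The grid and samps arguments are the least self-explanatory, so here is a minimal sketch of how they might be built. It assumes the model is fit with caret's "rf" method, whose only tuning parameter is mtry, and it uses a made-up toy data frame in place of a real known data set. The balanced-sample choice for samps is one possible convention, not something this page prescribes.

```r
# Toy stand-in for the known data set (made-up values, illustration only).
known <- data.frame(
  class    = factor(c("a", "a", "a", "a", "b", "b")),
  feature1 = c(1.0, 1.2, 0.9, 1.1, 3.0, 3.2),
  feature2 = c(0.5, 0.4, 0.6, 0.5, 1.5, 1.4),
  feature3 = c(10, 11, 9, 10, 20, 21)
)

# Tuning grid: assumes caret's "rf" method, which tunes mtry
# (the number of features tried at each split).
grid <- expand.grid(mtry = c(1, 2, 3))

# One sample size per class, in factor-level order. Using the smallest
# class count for every class gives each tree a balanced sample.
class_counts <- table(known$class)
samps <- rep(min(class_counts), length(class_counts))
```

With these objects in hand, grid and samps could be passed alongside known, unknown, ctrl, and keeps.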
A list containing the following components:
x$model = Random forest model object (created using the caret package).
x$classPred = Predicted class values for the unknown data set.
x$conf_matrix_known = Confusion matrix for cross-validated model (on training set).
x$result = Accuracy for training model.
x$unknown.error = Error rate for applying model to unknown data.
x$conf_matrix_unknown = Confusion matrix for applying model to unknown data.
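To make the relationship between a confusion matrix and an error rate concrete, the sketch below computes both from made-up labels in base R. It assumes the error rate is the plain misclassification rate (one minus accuracy); this page does not state how x$unknown.error is actually computed, so treat this as an illustration rather than the function's own method.

```r
# Made-up true and predicted labels (illustration only).
truth <- factor(c("a", "a", "b", "b", "b", "a"), levels = c("a", "b"))
pred  <- factor(c("a", "b", "b", "b", "a", "a"), levels = c("a", "b"))

# Confusion matrix: rows are predictions, columns are true classes.
conf_matrix <- table(predicted = pred, actual = truth)

# Accuracy is the share of cases on the diagonal;
# the error rate is its complement.
accuracy   <- sum(diag(conf_matrix)) / sum(conf_matrix)
error_rate <- 1 - accuracy
```

Off-diagonal cells show which classes are confused with each other, which a single error rate cannot reveal.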
Jennifer Starling
## trainControl comes from the caret package.
library(caret)
## Define ctrl object (named ctrl to avoid masking base::c).
ctrl <- trainControl(method = 'cv', number = 5, classProbs = FALSE)
## Define list of features to keep, including 'class' as the first feature.
features <- c('class', 'feature1', 'feature2', 'feature3')
## Known and Unknown data sets must contain a 'class' column.
model <- myRF(known = labeled_data_set, unknown = unlabeled_data_set, ctrl = ctrl, keeps = features)