rf.blind | R Documentation |
Grows multiple random forests with non-random cross-validation: the algorithm is trained on a user-specified part of the dataset, and predictions are made on the remaining samples.
rf.blind(
tab,
treat,
train.id,
mtry = NULL,
n.tree = 500,
n.forest = 10,
importance_p = F,
seed = NULL
)
tab |
An abundance or presence/absence table containing samples in columns and OTUs/ASVs in rows. |
treat |
A boolean vector containing the class identity of each sample, i.e. the treatment to predict. This means that you should pick a class as a reference for the calculation of precision and sensitivity. |
train.id |
A character string to be searched for in the sample names; matching samples are used for training. Can be a regular expression. Can alternatively be a boolean vector stating whether or not each sample is part of the training dataset (TRUE for training samples, FALSE for testing samples), or a character vector containing the training sample names. |
mtry |
The mtry parameter to be passed to the |
n.tree |
The number of trees to grow. The default is 500. |
n.forest |
The number of forests to grow. The default is 10. |
importance_p |
A boolean defining whether a p-value should be computed for the variable importance. For now, the importance is the Gini index, and the p-value is estimated by permutation with the Altmann method. See the ranger documentation for details. |
seed |
A number to set the seed before growing the forest. Only meaningful if n.forest == 1. The default is NULL. |
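The three accepted forms of train.id can be illustrated with base R. The sketch below uses a toy table and made-up sample names (all hypothetical) and shows a pattern, a boolean vector, and a vector of sample names that describe the same training set. It illustrates the argument as documented above, not the function's internal code. The final call is guarded because it assumes the package providing rf.blind is already loaded and that a plain matrix is an acceptable input for tab.

```r
# Toy abundance table: 4 OTUs (rows) x 10 samples (columns); values are made up.
set.seed(1)
sample_names <- c(paste0("siteA_", 1:5), paste0("siteB_", 1:5))
tab <- matrix(rpois(40, lambda = 5), nrow = 4,
              dimnames = list(paste0("OTU", 1:4), sample_names))

# Class to predict, one boolean per sample.
treat <- rep(c(TRUE, FALSE), times = 5)

# Form 1: a pattern (regular expression) searched in the sample names.
train_pattern <- "^siteA"
# Form 2: a boolean vector, TRUE for training samples, FALSE for testing samples.
train_logical <- grepl(train_pattern, colnames(tab))
# Form 3: a character vector of training sample names.
train_names <- colnames(tab)[train_logical]

# The three forms select the same samples: siteA trains, siteB tests.
same_split <- identical(colnames(tab) %in% train_names, train_logical)

# Guarded call: runs only if rf.blind is available in the session.
if (exists("rf.blind", mode = "function")) {
  res <- rf.blind(tab, treat, train.id = train_pattern,
                  n.tree = 500, n.forest = 10)
}
```

Splitting by site rather than at random is what makes the validation "blind": the forest never sees samples from the group it is asked to predict.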
A list object containing:
a summary table with the number of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN), the error rate, the sensitivity TP/(TP + FN), and the precision TP/(TP + FP)
The confusion matrix
n.forest tables containing the Gini index for each variable in each of the n.forest grown forests.
This index gives the variable importance for classification.
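The summary statistics listed above can be reproduced by hand from observed and predicted classes. The following base R sketch uses hypothetical vectors and takes TRUE as the reference class. The error rate is computed as (FP + FN) / total, which is the usual definition and an assumption here, since this page gives no formula for it.

```r
# Hypothetical observed and predicted classes for 8 test samples.
observed  <- c(TRUE, TRUE, TRUE,  FALSE, FALSE, FALSE, TRUE, FALSE)
predicted <- c(TRUE, TRUE, FALSE, FALSE, TRUE,  FALSE, TRUE, FALSE)

# Confusion matrix: predictions in rows, observations in columns.
conf_mat <- table(predicted = predicted, observed = observed)

# Counts with TRUE as the reference (positive) class.
TP <- sum(predicted & observed)
TN <- sum(!predicted & !observed)
FP <- sum(predicted & !observed)
FN <- sum(!predicted & observed)

error_rate  <- (FP + FN) / length(observed)  # assumed definition
sensitivity <- TP / (TP + FN)                # as defined above
precision   <- TP / (TP + FP)                # as defined above

print(conf_mat)
print(c(error = error_rate, sensitivity = sensitivity, precision = precision))
```

Swapping the reference class swaps TP with TN and FP with FN, so sensitivity and precision change while the error rate does not.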