predict_train_funbarRF: Prediction of species labels for the out-of-bag (OOB)...


Description

A training (reference) dataset is generally used to train the model, not for prediction. However, because the Random Forest method is used here, predictions can be made for the OOB instances, that is, the observations that did not participate in constructing a given tree-based classifier.
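To make the notion of an OOB instance concrete, here is a minimal base-R sketch of how the OOB set of a single tree arises from bootstrap sampling. It is an illustration only, not funbarRF code; the seed and the number of sequences `n` are hypothetical values.

```r
# Illustration only: how one tree's out-of-bag (OOB) set arises.
set.seed(1)   # hypothetical seed, used only for reproducibility
n <- 10       # hypothetical number of reference sequences

# Bootstrap sample (drawn with replacement) used to grow one tree
in_bag <- sample(n, size = n, replace = TRUE)

# Instances never drawn for this tree are its OOB instances
oob <- setdiff(seq_len(n), in_bag)
oob
```

Each tree draws its own bootstrap sample, so the OOB set differs from tree to tree.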

Usage

predict_train_funbarRF (object, m_try = 10, n_tree = 500)

Arguments

object

An object created by the function seq_funbarRF or seq_funbarRF_manual.

m_try

A parameter of randomForest: the number of variables randomly sampled as candidates at each split. The default is 10.

n_tree

This is also a parameter for randomForest. It denotes the number of tree-based classifiers to be built. This should not be set to too small a number, to ensure that every instance gets predicted at least a few times. Default is 500.
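The advice on n_tree can be checked with a rough calculation. The sketch below assumes that each tree is grown on a bootstrap sample of the same size as the reference set and that trees are sampled independently, so a given instance is OOB for one tree with probability close to exp(-1). These are assumptions about the underlying random forest, not statements taken from the funbarRF source.

```r
# Rough estimate of how often an instance is out-of-bag (OOB),
# assuming bootstrap samples of size n drawn with replacement.
n_tree <- c(20, 500)

# For large n, (1 - 1/n)^n approaches exp(-1), about 0.368
p_oob <- exp(-1)

# Expected number of trees for which a given instance is OOB
expected_oob <- n_tree * p_oob

# Probability that an instance is OOB for fewer than 3 trees
p_fewer_than_3 <- pbinom(2, size = n_tree, prob = p_oob)

data.frame(n_tree, expected_oob, p_fewer_than_3)
```

Under these assumptions an instance is OOB for roughly 7 of 20 trees but roughly 184 of 500, which is why a small n_tree leaves some instances with very few votes.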

Details

The user supplies the reference sequence dataset so that the accuracy of the prediction approach can be assessed. Each OOB instance is assigned a species label by every classifier for which it is out-of-bag, and these predictions are then aggregated over all such classifiers by majority voting to give the final prediction.
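The majority-voting step described above can be sketched in base R. The vote matrix, the labels "A" and "B", and the helper `majority_vote` are hypothetical and serve only to illustrate the aggregation; the package's internal implementation may differ.

```r
# Hypothetical votes: rows are instances, columns are trees.
# NA marks trees for which the instance was in-bag (no OOB vote).
votes <- matrix(
  c("A", NA,  "A", "B",
    NA,  "B", "B", NA,
    "A", "A", NA,  "B"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("seq1", "seq2", "seq3"), paste0("tree", 1:4))
)

# Hypothetical helper: most frequent label among the OOB votes
majority_vote <- function(v) {
  v <- v[!is.na(v)]
  if (length(v) == 0) return(NA_character_)  # never OOB: no prediction
  names(which.max(table(v)))                 # ties go to the first label in sorted order
}

oob_pred <- apply(votes, 1, majority_vote)
oob_pred
```

An instance that is never OOB receives no prediction in this sketch, which is the situation a sufficiently large n_tree is meant to avoid.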

Value

result_train

A data frame listing each species label, the number of instances observed with that label, and the number of those instances correctly predicted.
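A summary of this kind can be built from observed and predicted labels as sketched below. The labels and the column names `species`, `observed`, `correct` and `accuracy` are hypothetical; the actual column names of result_train are not documented here and may differ.

```r
# Hypothetical observed and OOB-predicted species labels
observed  <- factor(c("sp1", "sp1", "sp2", "sp2", "sp2", "sp3"))
predicted <- factor(c("sp1", "sp2", "sp2", "sp2", "sp2", "sp1"),
                    levels = levels(observed))

# Per-species counts of observed and correctly predicted instances
summary_tab <- data.frame(
  species  = levels(observed),
  observed = as.vector(table(observed)),
  correct  = as.vector(table(observed[observed == predicted]))
)

# Per-species accuracy derived from the two counts
summary_tab$accuracy <- summary_tab$correct / summary_tab$observed
summary_tab
```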

Author(s)

Prabina Kumar Meher, Division of Statistical Genetics, Indian Agricultural Statistics Research Institute, New Delhi-110012, INDIA

References

  1. Liaw A., and Wiener M. (2002). Classification and Regression by randomForest. R News, 2(3), 18-22.

  2. Meher P.K., Sahu T.K., and Rao A.R. (2016). Identification of species based on DNA barcode using k-mer feature vector and Random forest classifier. Gene, 592(2), 316-324.

See Also

randomForest, predict_test_funbarRF

Examples

library(funbarRF)

# Example 1: train on the first five sequences of the fun_dat dataset
data(fun_dat)
kk <- read_seq_txt(fun_dat$seq)[1:5]
zz <- as.factor(as.character(fun_dat$seq_name)[1:5])
train <- seq_funbarRF(reference_seq = kk, seq_id = zz)
# n_tree = 20 only keeps this example fast; use a larger value in practice
res <- predict_train_funbarRF(object = train, m_try = 10, n_tree = 20)
print(res)

# Example 2: train on the first 100 entries of data_barcode$Fish$train
data(data_barcode)
tr_ss <- seq_funbarRF_manual(manual_seq = data_barcode$Fish$train[1:100])
prd1 <- predict_train_funbarRF(object = tr_ss, m_try = 10, n_tree = 500)
print(prd1)

funbarRF documentation built on May 27, 2019, 5:03 p.m.