predictRF: Run classification using random forests

Description

Given an input data frame allPairwise of pairwise similarity metrics, fit a random forest classifier. The outcome variable is 'PGPmatched', a binary column that can be 0, 1 or NA. Rows with a non-NA outcome form the cross-validation set. This set is split into 10 folds, and predictions for each fold are made by a model trained on the remaining 9 folds. A final model is then trained on the entire cross-validation set and used to predict on the test set. Predictions on the cross-validation set are returned as 'out', predictions on the test set as 'outTest', and the model fit on the entire cross-validation set as 'fit'.
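The procedure described above can be sketched roughly as follows. This is an illustration only, not the package's source: the helper name cvSketch, the random fold assignment, the use of randomForest::randomForest as the default fitter, the choice to return class-1 probabilities, and the treatment of rows with an NA outcome as the test set are all assumptions.

```r
# Hypothetical helper illustrating the described procedure; not predictRF itself.
# fitFun and predFun are illustrative hooks so the model can be swapped out.
cvSketch <- function(allPairwise, similarityCols, nFolds = 10,
                     fitFun = function(x, y) randomForest::randomForest(x = x, y = y),
                     predFun = function(fit, x) predict(fit, x, type = "prob")[, "1"]) {
  # Assumption: rows with a known outcome form the cross-validation set,
  # rows with an NA outcome form the test set.
  isCV <- !is.na(allPairwise$PGPmatched)
  cvSet <- allPairwise[isCV, , drop = FALSE]
  testSet <- allPairwise[!isCV, , drop = FALSE]
  y <- factor(cvSet$PGPmatched, levels = c(0, 1))

  # Assumption: folds are assigned at random and are as equal in size as possible.
  fold <- sample(rep(seq_len(nFolds), length.out = nrow(cvSet)))

  # Predict each fold with a model trained on the other folds.
  out <- rep(NA_real_, nrow(cvSet))
  for (k in seq_len(nFolds)) {
    held <- fold == k
    fitK <- fitFun(cvSet[!held, similarityCols, drop = FALSE], y[!held])
    out[held] <- predFun(fitK, cvSet[held, similarityCols, drop = FALSE])
  }

  # Refit on the whole cross-validation set, then predict on the test set.
  fit <- fitFun(cvSet[, similarityCols, drop = FALSE], y)
  outTest <- predFun(fit, testSet[, similarityCols, drop = FALSE])

  list(out = out, outTest = outTest, fit = fit)
}
```

Because every value in out comes from a model that never saw that row, out can be compared against PGPmatched to estimate out-of-sample performance, whereas predictions from fit on its own training rows would be optimistic.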

Usage

predictRF(allPairwise, similarityCols)

Arguments

allPairwise

Data frame containing pairwise similarity metrics. It must include a 'PGPmatched' column, which is the outcome variable.

similarityCols

Indices of the columns of allPairwise that should be included in the model.
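As a rough illustration of how the two arguments might be prepared, the sketch below builds a toy input. The column names id1, id2, simA and simB and all values are made up for the example; only 'PGPmatched' and the call signature come from this page. The call is guarded so the snippet still runs when the package is not installed, and the structure of the returned object is inspected rather than assumed.

```r
set.seed(42)
n <- 60

# Toy pairwise table: two identifier columns, two similarity metrics and the
# outcome. The last 20 rows have an unknown outcome (NA).
allPairwise <- data.frame(
  id1 = seq_len(n),
  id2 = seq_len(n) + n,
  simA = runif(n),
  simB = runif(n),
  PGPmatched = c(rep(c(0, 1), times = 20), rep(NA, 20))
)

# Select the similarity columns by position, as the argument expects indices.
similarityCols <- which(names(allPairwise) %in% c("simA", "simB"))

if (requireNamespace("heisenbrgr", quietly = TRUE)) {
  res <- heisenbrgr::predictRF(allPairwise, similarityCols)
  str(res, max.level = 1)  # inspect what was returned
}
```

Looking the indices up by column name with which() keeps the call correct if columns are later reordered.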

Value

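'out' (predictions on the cross-validation set, made fold by fold), 'outTest' (predictions on the test set) and 'fit' (the model fit on the entire cross-validation set).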


xhtai/heisenbrgr documentation built on June 8, 2019, 9:30 a.m.