baseRandForest: Prediction by random forest

View source: R/BioMM.R

baseRandForest    R Documentation

Description

Trains a random forest on the training data and predicts the test data in one of three modes: 'classification', 'probability' or 'regression'.

Usage

baseRandForest(
  trainData,
  testData,
  predMode = c("classification", "probability", "regression"),
  paramlist = list(ntree = 2000, nthreads = 20)
)

Arguments

trainData

The input training dataset. The first column holds the label (the output variable). For binary classes, 0 and 1 indicate class membership.

testData

The input test dataset. The first column holds the label (the output variable). For binary classes, 0 and 1 indicate class membership.

predMode

The prediction mode. Available options are 'classification', 'probability' and 'regression'.

paramlist

A set of model parameters defined in an R list object. The valid entries are list(ntree, nthreads). 'ntree' is the number of trees used; the default is 2000. 'nthreads' is the number of threads used for computation; the default is 20.
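The input layout described above can be sketched with a small simulated dataset. This is only an illustration: the toy data, the feature names (cpg1 to cpg5) and the split sizes are made up, and the label column is named 'label' simply to match the example further down this page. The page does not say whether a missing entry of paramlist falls back to its default, so the sketch supplies both entries explicitly.

```r
## Hypothetical toy data: 20 samples, 5 features, binary label in column 1.
set.seed(1)
toy <- data.frame(label = rep(c(0, 1), each = 10),
                  matrix(rnorm(20 * 5), nrow = 20,
                         dimnames = list(NULL, paste0("cpg", 1:5))))

## Split into training and test sets; both keep the label as the first column.
trainIndex <- sample(nrow(toy), 14)
trainData <- toy[trainIndex, ]
testData <- toy[-trainIndex, ]

## Supply both entries of paramlist (values smaller than the defaults).
paramlist <- list(ntree = 300, nthreads = 2)

## Sanity checks on the layout before passing the data on.
stopifnot(names(trainData)[1] == "label",
          all(trainData$label %in% c(0, 1)),
          identical(names(trainData), names(testData)))
```

Objects built this way have the shape expected by the trainData, testData and paramlist arguments.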

Value

The predicted output for the test data.
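To give a feel for what the three prediction modes could produce, here is a minimal sketch that calls the ranger package directly. It is not the implementation of baseRandForest; it only assumes that 'classification' yields class labels, 'probability' yields class probabilities and 'regression' yields numeric values. The toy data and all object names are hypothetical.

```r
library(ranger)

## Hypothetical toy data with one informative feature (x1).
set.seed(1)
toy <- data.frame(label = rep(c(0, 1), each = 20),
                  x1 = rnorm(40), x2 = rnorm(40))
toy$x1 <- toy$x1 + 2 * toy$label
idx <- sample(nrow(toy), 30)
train <- toy[idx, ]
test <- toy[-idx, ]

## Classification and probability forests need a factor response.
trainF <- transform(train, label = factor(label, levels = c(0, 1)))

## Classification: one predicted class label per test sample.
fitClass <- ranger(label ~ ., data = trainF, num.trees = 100, num.threads = 1)
predClass <- predict(fitClass, data = test)$predictions

## Probability: a matrix with one column per class; keep the column for class 1.
fitProb <- ranger(label ~ ., data = trainF, num.trees = 100, num.threads = 1,
                  probability = TRUE)
predProb <- predict(fitProb, data = test)$predictions[, "1"]

## Regression: numeric response, numeric predictions.
fitReg <- ranger(label ~ ., data = train, num.trees = 100, num.threads = 1)
predReg <- predict(fitReg, data = test)$predictions
```

In this sketch num.trees and num.threads play the roles that 'ntree' and 'nthreads' are described as playing in paramlist.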

Author(s)

Junfang Chen

Examples

 
## Load packages and data
library(BioMM)
library(ranger)
methylfile <- system.file('extdata', 'methylData.rds', package='BioMM')
methylData <- readRDS(methylfile)
dataY <- methylData[,1]
## Use a subset of the genome-wide methylation data (the first 2000 features)
methylSub <- data.frame(label=dataY, methylData[,2:2001])
## Randomly pick 12 samples for training; the remaining samples form the test set
trainIndex <- sample(nrow(methylSub), 12)
trainData <- methylSub[trainIndex,]
testData <- methylSub[-trainIndex,]
## Fit the random forest on trainData and predict class labels for testData
predY <- baseRandForest(trainData, testData,
                        predMode='classification',
                        paramlist=list(ntree=300, nthreads=20))
## Compare the predicted labels with the observed test labels
testY <- testData[,1]
accuracy <- classifiACC(dataY=testY, predY=predY)
print(accuracy)
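
The evaluation step can also be sketched in base R. The snippet below assumes that accuracy means the fraction of test samples whose predicted label matches the observed label; this page does not define what classifiACC computes, so treat that as an assumption. The label and probability vectors are hypothetical, and the 0.5 cutoff is an arbitrary choice for turning probabilities into labels.

```r
## Hypothetical observed labels and predicted probabilities of class 1.
testY <- c(0, 1, 1, 0, 1, 0)
predProb <- c(0.2, 0.8, 0.4, 0.1, 0.9, 0.6)

## Turn probabilities into 0/1 labels with an assumed cutoff of 0.5.
predY <- as.integer(predProb >= 0.5)

## Fraction of matching labels, plus a confusion table for inspection.
accuracy <- mean(predY == testY)
confusion <- table(observed = testY, predicted = predY)
print(accuracy)
print(confusion)
```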

transbioZI/BioMMex documentation built on Jan. 27, 2023, 4:14 a.m.