baseRandForest: Prediction by random forest

View source: R/BioMM.R

Description

Prediction by random forest with different settings: 'probability', 'classification' and 'regression'.

Usage

baseRandForest(
  trainData,
  testData,
  predMode = c("classification", "probability", "regression"),
  paramlist = list(ntree = 2000, nthreads = 20)
)

Arguments

trainData

The input training dataset. The first column is the label (the output variable). For binary classification, the classes are coded as 0 and 1.

testData

The input test dataset. The first column is the label (the output variable). For binary classification, the classes are coded as 0 and 1.

predMode

The prediction mode. One of 'classification', 'probability' or 'regression'.

paramlist

A set of model parameters defined in an R list object. Valid entries are 'ntree' and 'nthreads'. 'ntree' is the number of trees used; the default is 2000. 'nthreads' is the number of threads used for computation; the default is 20.

Value

The predicted output for the test data.
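
The following is a minimal sketch of how the three prediction modes could be realised with the ranger package, which the example on this page loads. It is not the BioMM source: the function name sketchRandForest, the toy data, and the mapping of 'ntree' and 'nthreads' onto ranger's num.trees and num.threads arguments are assumptions made for illustration only.

```r
# Hypothetical sketch, NOT the BioMM implementation.
library(ranger)

sketchRandForest <- function(trainData, testData,
                             predMode = c("classification", "probability", "regression"),
                             paramlist = list(ntree = 2000, nthreads = 20)) {
  predMode <- match.arg(predMode)
  # The first column holds the label; name it so the formula below applies.
  colnames(trainData)[1] <- "label"
  colnames(testData)[1] <- "label"
  # Classification and probability forests need a factor response.
  if (predMode != "regression") {
    trainData$label <- factor(trainData$label, levels = c(0, 1))
  }
  model <- ranger(label ~ ., data = trainData,
                  num.trees = paramlist$ntree,      # assumed mapping of 'ntree'
                  num.threads = paramlist$nthreads, # assumed mapping of 'nthreads'
                  probability = (predMode == "probability"))
  pred <- predict(model, data = testData)$predictions
  if (predMode == "probability") {
    pred[, "1"]                     # estimated probability of class 1
  } else if (predMode == "classification") {
    as.numeric(as.character(pred))  # predicted class as 0/1
  } else {
    pred                            # numeric regression output
  }
}

# Toy data (hypothetical): label in the first column, two numeric features.
set.seed(1)
n <- 40
toy <- data.frame(label = rep(c(0, 1), each = n / 2),
                  x1 = rnorm(n), x2 = rnorm(n))
toy$x1 <- toy$x1 + 2 * toy$label    # make x1 informative about the label
trainIdx <- sample(n, 30)
predY <- sketchRandForest(toy[trainIdx, ], toy[-trainIdx, ],
                          predMode = "probability",
                          paramlist = list(ntree = 100, nthreads = 1))
```

In this sketch, 'probability' mode returns one value in [0, 1] per test row, 'classification' mode returns 0/1 labels, and 'regression' mode returns numeric predictions. The exact shape and rounding of the value returned by baseRandForest may differ.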

Author(s)

Junfang Chen

Examples

## Load data
methylfile <- system.file('extdata', 'methylData.rds', package='BioMM')
methylData <- readRDS(methylfile)
## The first column holds the label
dataY <- methylData[,1]
## Use a subset of the genome-wide methylation data (columns 2 to 2001)
methylSub <- data.frame(label=dataY, methylData[,2:2001])
## Draw 12 samples at random for training; the remaining samples form the test set
trainIndex <- sample(nrow(methylSub), 12)
trainData <- methylSub[trainIndex,]
testData <- methylSub[-trainIndex,]
library(ranger)
predY <- baseRandForest(trainData, testData,
                        predMode='classification',
                        paramlist=list(ntree=300, nthreads=20))
## Compare the predictions with the true test labels
testY <- testData[,1]
accuracy <- classifiACC(dataY=testY, predY=predY)
print(accuracy)
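
When predMode='probability' is used instead, the output would need to be converted to class labels before it can be compared with the true labels. The sketch below shows one way to do that in base R. The probability values, the 0.5 cut-off and the accuracy formula are hypothetical stand-ins chosen for illustration; whether classifiACC computes exactly this quantity is not stated on this page.

```r
# Hypothetical values standing in for probability-mode predictions.
probY <- c(0.91, 0.12, 0.55, 0.40, 0.78)
# Hypothetical true 0/1 labels for the same five test samples.
testY <- c(1, 0, 0, 0, 1)

# Threshold at 0.5 (an assumed cut-off) to obtain 0/1 class labels.
predY <- as.integer(probY >= 0.5)

# Accuracy as the fraction of predictions that match the true labels.
accuracy <- mean(predY == testY)
print(accuracy)  # 0.8
```

A different cut-off may be more appropriate when the two classes are unbalanced.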

BioMM documentation built on Nov. 8, 2020, 11:04 p.m.