baseModel: Base supervised machine learning models for prediction

View source: R/BioMM.R

baseModel    R Documentation

Base supervised machine learning models for prediction

Description

Trains one of several supervised machine learning models on the training data and predicts the outcome for the test data.

Usage

baseModel(
  trainData,
  testData,
  classifier = c("randForest", "SVM", "glmnet"),
  predMode = c("classification", "probability", "regression"),
  paramlist
)

Arguments

trainData

The input training dataset. The first column holds the label (the output variable). For binary classification, class membership is coded as 0 and 1.

testData

The input test dataset. The first column holds the label (the output variable). For binary classification, class membership is coded as 0 and 1.
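The layout expected for trainData and testData can be sketched with base R. The toy data frame below is hypothetical (the column names f1, f2 and status, and the mapping of "case" to 1, are assumptions made for illustration); it only shows how to recode a binary outcome as 0/1 and place it in the first column.

```r
# Hypothetical toy data: 'status' is the outcome, f1 and f2 are features.
set.seed(1)
raw <- data.frame(
  f1 = rnorm(6),
  f2 = rnorm(6),
  status = c("case", "control", "case", "control", "case", "control")
)

# Recode the binary outcome as 0/1 (assumed mapping: case = 1, control = 0).
label <- ifelse(raw$status == "case", 1, 0)

# Put the label in the first column, followed by the feature columns.
features <- raw[, setdiff(names(raw), "status"), drop = FALSE]
dat <- data.frame(label = label, features)

# Split into training and test sets that share the same column layout.
trainIndex <- c(1, 2, 3, 4)
trainData <- dat[trainIndex, ]
testData <- dat[-trainIndex, ]
```

Both resulting data frames keep the label in column 1, which is the arrangement described for the two arguments above.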

classifier

The machine learning classifier to use. Available options are c('randForest', 'SVM', 'glmnet').

predMode

The prediction mode. Available options are c('classification', 'probability', 'regression'). 'probability' is currently available only for 'randForest'.

paramlist

A set of model parameters defined in an R list object. See the documentation of each individual model for the parameters it accepts.
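The Usage signature lists every option as the default value of classifier and predMode. In R this pattern is commonly resolved with match.arg(), which selects the first option when the argument is omitted. Whether baseModel does this internally is an assumption, so passing both arguments explicitly is the safer choice. The helper resolveOptions below is hypothetical and not part of the package; it only illustrates the idiom together with the documented restriction on 'probability'.

```r
# Hypothetical helper, not part of BioMM: shows how option vectors given
# as defaults are typically resolved with match.arg().
resolveOptions <- function(classifier = c("randForest", "SVM", "glmnet"),
                           predMode = c("classification", "probability",
                                        "regression")) {
  classifier <- match.arg(classifier)  # first option if omitted
  predMode <- match.arg(predMode)      # error on an unknown option
  # Documented restriction: 'probability' is only for 'randForest'.
  if (predMode == "probability" && classifier != "randForest") {
    stop("'probability' is currently available only for 'randForest'")
  }
  list(classifier = classifier, predMode = predMode)
}

defaults <- resolveOptions()
svmReg <- resolveOptions(classifier = "SVM", predMode = "regression")
print(defaults)
print(svmReg)
```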

Value

The predicted score or output for the test data, estimated with the chosen machine learning model.
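For classification, the returned predictions can be compared with the true labels in the first column of the test data. The following is a minimal base R sketch with hypothetical vectors (testY and predY are made up here, and this is not the implementation of classifiACC); it tabulates the predictions and computes the fraction of correct ones.

```r
# Hypothetical true labels and predictions, coded 0/1.
testY <- c(0, 1, 1, 0, 1, 0)
predY <- c(0, 1, 0, 0, 1, 1)

# Cross-tabulate observed against predicted classes.
confusion <- table(observed = testY, predicted = predY)
print(confusion)

# Accuracy: proportion of test samples predicted correctly.
accuracy <- mean(predY == testY)
print(accuracy)
```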

Author(s)

Junfang Chen

Examples

## Load the random forest backend and the example data
library(ranger)
methylfile <- system.file('extdata', 'methylData.rds', package='BioMM')
methylData <- readRDS(methylfile)
dataY <- methylData[,1]
## Take a subset of the genome-wide methylation data (the first 2000 features)
methylSub <- data.frame(label=dataY, methylData[,c(2:2001)])
## Set the seed before sampling so that the train/test split is reproducible
set.seed(123)
trainIndex <- sample(nrow(methylSub), 16)
trainData <- methylSub[trainIndex,]
testData <- methylSub[-trainIndex,]
## Fit a random forest on the training data and predict the test labels
predY <- baseModel(trainData, testData,
                   classifier='randForest',
                   predMode='classification',
                   paramlist=list(ntree=300, nthreads=20))
print(table(predY))
## Compare the predictions with the true test labels
testY <- testData[,1]
accuracy <- classifiACC(dataY=testY, predY=predY)
print(accuracy)

transbioZI/BioMMex documentation built on Jan. 27, 2023, 4:14 a.m.