Prediction via supervised machine learning using bootstrap resampling along with feature selection methods.
predByBS(
  trainData,
  testData,
  dataMode,
  repeats,
  FSmethod,
  cutP,
  fdr,
  FScore = MulticoreParam(),
  classifier,
  predMode,
  paramlist,
  innerCore = MulticoreParam()
)
trainData |
The input training dataset. The first column is the label or the output. For binary classes, 0 and 1 are used to indicate class membership. |
testData |
The input test dataset. The first column is the label or the output. For binary classes, 0 and 1 are used to indicate class membership. |
dataMode |
The mode of the input training data used for model training. It is used only if 'testData' is present. The model can be trained on a subset of the training data or on the entire training data: use 'subTrain' for a subset and 'allTrain' for the entire training dataset. |
repeats |
The number of repeats used for bootstrapping. |
FSmethod |
Feature selection methods. Available options are c(NULL, 'positive', 'wilcox.test', 'cor.test', 'chisq.test', 'posWilcox', 'top10pCor'). |
cutP |
The cutoff used for p value thresholding. Commonly used cutoffs are c(0.5, 0.1, 0.05, 0.01, etc). The default is 0.05. |
fdr |
Multiple testing correction method. Available options are c(NULL, 'fdr', 'BH', 'holm', etc).
See also |
FScore |
The number of cores used for feature selection, if parallel computing is needed. |
classifier |
Machine learning classifiers. |
predMode |
The prediction mode. Available options are c('probability', 'classification', 'regression'). |
paramlist |
A set of model parameters defined in an R list object. |
innerCore |
The number of cores used for computation. |
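
The interplay of 'FSmethod', 'cutP', and 'fdr' described above can be illustrated with a minimal base R sketch. This is not BioMM's internal code: the helper `selectByWilcox` and the toy data are hypothetical, and the sketch only assumes that features are filtered by a per-feature Wilcoxon test whose p values are optionally adjusted with `p.adjust` before being compared to the cutoff.

```r
# Hypothetical helper, not part of BioMM: keep features whose (optionally
# adjusted) Wilcoxon p value falls below the cutoff.
selectByWilcox <- function(data, cutP = 0.05, fdr = NULL) {
  y <- data[, 1]                     # first column holds the 0/1 label
  x <- data[, -1, drop = FALSE]      # remaining columns are features
  pvals <- vapply(x, function(feature) {
    wilcox.test(feature[y == 1], feature[y == 0], exact = FALSE)$p.value
  }, numeric(1))
  if (!is.null(fdr)) {
    pvals <- p.adjust(pvals, method = fdr)   # e.g. 'fdr', 'BH', 'holm'
  }
  names(pvals)[!is.na(pvals) & pvals < cutP]
}

# Toy data (assumption): one informative feature and two noise features.
set.seed(1)
n <- 40
toy <- data.frame(
  label = rep(c(0, 1), each = n / 2),
  informative = c(rnorm(n / 2, mean = 0), rnorm(n / 2, mean = 3)),
  noise1 = rnorm(n),
  noise2 = rnorm(n)
)
selected <- selectByWilcox(toy, cutP = 0.05, fdr = "BH")
print(selected)
```

Passing `fdr = NULL` in this sketch thresholds the raw p values, which mirrors the NULL option listed for 'fdr'.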
The predicted output for the test data.
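
As a rough illustration of how bootstrap resampling can produce predictions for a test set, the sketch below resamples the training rows with replacement, fits a model on each resample, and averages the predicted probabilities. It is an assumption-laden sketch rather than the package's implementation: `bootstrapPredict` is a hypothetical helper, logistic regression stands in for the classifier, the label column is assumed to be named `label`, and the 0.5 threshold is an arbitrary choice.

```r
# Hypothetical sketch, not BioMM's implementation: bootstrap the training
# rows, fit a logistic regression per resample, and average the predicted
# probabilities on the test data.
bootstrapPredict <- function(trainData, testData, repeats = 50) {
  probs <- vapply(seq_len(repeats), function(i) {
    idx <- sample(nrow(trainData), replace = TRUE)  # draw rows with replacement
    fit <- suppressWarnings(
      glm(label ~ ., data = trainData[idx, ], family = binomial())
    )
    suppressWarnings(predict(fit, newdata = testData, type = "response"))
  }, numeric(nrow(testData)))
  probs <- matrix(probs, nrow = nrow(testData))     # one column per bootstrap fit
  meanProb <- rowMeans(probs)                       # aggregate across fits
  as.integer(meanProb > 0.5)                        # threshold to a 0/1 label
}

# Toy data (assumption): the label sits in the first column, as described
# for 'trainData' and 'testData'.
set.seed(42)
n <- 60
toy <- data.frame(
  label = rep(c(0, 1), each = n / 2),
  x1 = c(rnorm(n / 2, mean = 0), rnorm(n / 2, mean = 2)),
  x2 = rnorm(n)
)
trainIdx <- sample(n, 40)
predY <- bootstrapPredict(toy[trainIdx, ], toy[-trainIdx, ], repeats = 25)
print(mean(predY == toy$label[-trainIdx]))  # proportion of correct predictions
```

Averaging probabilities before thresholding is one of several possible aggregation rules; a majority vote over per-resample class labels would be an equally reasonable choice for this sketch.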
## Load data
methylfile <- system.file('extdata', 'methylData.rds', package='BioMM')
methylData <- readRDS(methylfile)
dataY <- methylData[,1]
## select a subset of genome-wide methylation data at random
methylSub <- data.frame(label=dataY, methylData[,c(2:2001)])
trainIndex <- sample(nrow(methylSub), 16)
trainData <- methylSub[trainIndex,]
testData <- methylSub[-trainIndex,]
library(BioMM)
library(ranger)
library(BiocParallel)
param1 <- MulticoreParam(workers = 1)
param2 <- MulticoreParam(workers = 20)
predY <- predByBS(trainData, testData,
                  dataMode='allTrain', repeats=50,
                  FSmethod=NULL, cutP=0.1,
                  fdr=NULL, FScore=param1,
                  classifier='randForest',
                  predMode='classification',
                  paramlist=list(ntree=300, nthreads=10),
                  innerCore=param2)
testY <- testData[,1]
accuracy <- classifiACC(dataY=testY, predY=predY)
print(accuracy)