reconBySupervised: Reconstruct stage-2 data by supervised machine learning...
In BioMM: BioMM: Biological-informed Multi-stage Machine learning framework for phenotype prediction using omics data


Reconstruct stage-2 data by supervised machine learning prediction.

reconBySupervised(
  trainDataList,
  testDataList,
  resample = "BS",
  dataMode,
  repeatA,
  repeatB,
  nfolds,
  FSmethod,
  cutP,
  fdr,
  FScore = MulticoreParam(),
  classifier,
  predMode,
  paramlist,
  innerCore = MulticoreParam(),
  outFileA = NULL,
  outFileB = NULL
)

`trainDataList`	The input training data list containing ordered collections of matrices.
`testDataList`	The input test data list containing ordered collections of matrices.
`resample`	The resampling method. Valid options are 'CV' (cross validation) and 'BS' (bootstrapping). The default is 'BS'.
`dataMode`	The mode of data used: 'subTrain' or 'allTrain'.
`repeatA`	The number of repeats N used during the resampling procedure. Repeated cross validation or multiple bootstrapping is performed if N >= 2. One can choose 10 repeats for 'CV' and 100 repeats for 'BS'.
`repeatB`	The number of repeats N used for generating test data prediction scores.
`nfolds`	The number of folds used for cross validation.
`FSmethod`	Feature selection method. Available options are NULL, 'positive', 'wilcox.test', 'cor.test', 'chisq.test', 'posWilcox', and 'top10pCor'.
`cutP`	The cutoff used for p value thresholding. Commonly used cutoffs include 0.5, 0.1, 0.05, and 0.01. The default is 0.05.
`fdr`	Multiple testing correction method. Available options include NULL, 'fdr', 'BH', and 'holm'; see `p.adjust` for the full list. The default is NULL.
`FScore`	The parallel back-end used for feature selection, given as a BiocParallel parameter object such as `MulticoreParam()`; its number of workers sets the number of cores, if parallel computing is needed.
`classifier`	The machine learning classifier to use, for example 'randForest'.
`predMode`	The prediction mode. Available options are c('probability', 'classification', 'regression').
`paramlist`	A set of model parameters defined in an R list object.
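`innerCore`	The parallel back-end used for computation, given as a BiocParallel parameter object such as `MulticoreParam()`; its number of workers sets the number of cores.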
`outFileA`	The file name for the stage-2 training data, with the '.rds' file extension. If provided, the result is saved to this file. The default is NULL.
`outFileB`	The file name for the stage-2 test data, with the '.rds' file extension. If provided, the result is saved to this file. The default is NULL.
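
As a rough illustration of how `cutP` and `fdr` interact, the sketch below runs a per-feature Wilcoxon test on simulated data, optionally adjusts the p values with `p.adjust`, and keeps the features that fall below the cutoff. The helper `selectByP`, the feature names, and the simulated data are hypothetical; they only mirror the argument semantics described above, and BioMM's internal feature selection may differ.

```r
## Hypothetical helper: per-feature Wilcoxon test followed by p value thresholding.
## Assumes the first column of 'dat' is a 0/1 label and the rest are features.
selectByP <- function(dat, cutP = 0.05, fdr = NULL) {
  label <- dat[, 1]
  feats <- dat[, -1, drop = FALSE]
  pvals <- apply(feats, 2, function(x) {
    wilcox.test(x[label == 0], x[label == 1])$p.value
  })
  ## Optional multiple testing correction, e.g. fdr = 'BH' or 'holm'
  if (!is.null(fdr)) pvals <- p.adjust(pvals, method = fdr)
  names(pvals)[pvals < cutP]
}

set.seed(1)
n <- 40
label <- rep(c(0, 1), each = n / 2)
## Five pure-noise features and one feature that separates the two classes
noise <- matrix(rnorm(n * 5), nrow = n, dimnames = list(NULL, paste0("cg", 1:5)))
signal <- label * 3 + rnorm(n)
dat <- cbind(label = label, noise, cgSignal = signal)

rawSel <- selectByP(dat, cutP = 0.05, fdr = NULL)   # unadjusted p values
adjSel <- selectByP(dat, cutP = 0.05, fdr = "BH")   # Benjamini-Hochberg adjusted
print(rawSel)
print(adjSel)
```

Because adjusted p values are never smaller than the raw ones, the set selected with a correction method is always a subset of the set selected without one.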

Stage-2 training data can be learned using either bootstrapping or cross validation resampling. Stage-2 test data are learned via independent test set prediction.
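
The cross validation route can be pictured with a small base R sketch: for each block of features (one matrix per pathway), every sample receives a prediction from a model that never saw it during fitting, and these out-of-fold scores become one stage-2 column. This is a simplified sketch under stated assumptions, not BioMM's implementation: `oofScores`, `makeBlock`, and the simulated blocks are hypothetical, and `glm` stands in for the package's classifiers only to keep the example free of extra dependencies.

```r
## Hypothetical sketch: build stage-2 features from out-of-fold predictions.
## Each block is a matrix whose first column is a 0/1 label and whose
## remaining columns are the features of one pathway.
oofScores <- function(block, nfolds = 5) {
  df <- as.data.frame(block)
  ## Randomly assign every sample to one of 'nfolds' folds
  fold <- sample(rep(seq_len(nfolds), length.out = nrow(df)))
  score <- numeric(nrow(df))
  for (k in seq_len(nfolds)) {
    test <- fold == k
    ## glm is a stand-in classifier (assumption), fitted on the other folds only
    fit <- glm(label ~ ., data = df[!test, , drop = FALSE], family = binomial())
    score[test] <- predict(fit, newdata = df[test, , drop = FALSE],
                           type = "response")
  }
  score
}

set.seed(42)
n <- 60
label <- rep(c(0, 1), each = n / 2)
## Simulate one pathway block with p weakly informative features
makeBlock <- function(p) {
  x <- matrix(rnorm(n * p, mean = label * 0.8), nrow = n,
              dimnames = list(NULL, paste0("f", seq_len(p))))
  cbind(label = label, x)
}
blocks <- list(pathA = makeBlock(3), pathB = makeBlock(4))

## One column of predicted scores per pathway, plus the label
stage2 <- data.frame(label = label, sapply(blocks, oofScores))
print(dim(stage2))
print(head(stage2))
```

The resulting data frame has one row per sample and one column per pathway, which is the shape a second-stage model would then be trained on.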

The predicted stage-2 training data and, if 'testDataList' is provided, the predicted stage-2 test data. If outFileA and outFileB are provided, the results are also stored in those files.
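
For readers unfamiliar with the '.rds' format, the following sketch shows the round trip with base R's `saveRDS` and `readRDS`. The small data frame and the temporary file name are hypothetical stand-ins for a real stage-2 result and a user-chosen `outFileA`.

```r
## Hypothetical stage-2 object; a small data frame stands in for the real result
stage2train <- data.frame(label = c(0, 1, 1), pathA = c(0.2, 0.7, 0.9))

## Assumed file name, for illustration only
outFileA <- tempfile(fileext = ".rds")

## Write the object to disk and read it back in a later session
saveRDS(stage2train, file = outFileA)
reloaded <- readRDS(outFileA)
print(dim(reloaded))
```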

Junfang Chen

 
## Load data  
methylfile <- system.file('extdata', 'methylData.rds', package='BioMM')  
methylData <- readRDS(methylfile)  
## Annotation file
probeAnnoFile <- system.file('extdata', 'cpgAnno.rds', package='BioMM')
featureAnno <- readRDS(file=probeAnnoFile)
head(featureAnno)
## Mapping CpGs into pathways
pathlistDB <- readRDS(system.file("extdata", "goDB.rds", package="BioMM"))
dataList <- omics2pathlist(data=methylData, pathlistDB, featureAnno, 
                           restrictUp=100, restrictDown=10, minPathSize=10) 
length(dataList)
library(ranger) 
library(BiocParallel)
param1 <- MulticoreParam(workers = 1)
param2 <- MulticoreParam(workers = 20)
## Not run: this takes a while
## stage2data <- reconBySupervised(trainDataList=dataList, testDataList=NULL, 
##                             resample='CV', dataMode='allTrain', 
##                             repeatA=50, repeatB=20, nfolds=10, 
##                             FSmethod=NULL, cutP=0.1, 
##                             fdr=NULL, FScore=param1, 
##                             classifier='randForest',
##                             predMode='classification', 
##                             paramlist=list(ntree=500, nthreads=20),
##                             innerCore=param2, outFileA=NULL, outFileB=NULL) 
## print(dim(stage2data))
## print(head(stage2data[,1:5]))

BioMM documentation built on Nov. 8, 2020, 11:04 p.m.
