metricMCB: Calculation of the metric matrix for Methylation Correlation...

metricMCB R Documentation

Calculation of the metric matrix for Methylation Correlation Block

Description

To enable quantitative analysis of the methylation patterns
within individual Methylation Correlation Blocks across many samples, a single metric is needed to
define the methylation pattern of the multiple CpG sites within each block.
Compound scores are calculated from all CpGs within each Methylation Correlation Block using a linear, SVM or elastic-net model.
The predicted values are used as the compound methylation values of the Methylation Correlation Blocks.
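
The idea of collapsing several CpG sites into one compound value per sample can be sketched in base R. This is only an illustration under stated assumptions: the beta-value matrix is simulated, the coefficients are hypothetical stand-ins for a fitted model, and the helper compound_score is not part of EnMCB.

```r
# Simulated methylation beta values: 3 CpG sites (rows) x 5 samples (columns).
# All numbers are made up for illustration.
set.seed(1)
block <- matrix(runif(15), nrow = 3,
                dimnames = list(paste0("cg", 1:3), paste0("S", 1:5)))

# Hypothetical coefficients standing in for a fitted linear model.
beta <- c(cg1 = 0.8, cg2 = -0.5, cg3 = 1.2)

# Hypothetical helper (not an EnMCB function): one score per sample,
# computed as the weighted sum of the CpG values in the block.
compound_score <- function(block, beta) {
  drop(crossprod(block[names(beta), , drop = FALSE], beta))
}

scores <- compound_score(block, beta)
print(scores)  # named numeric vector with one value per sample
```

In the package itself the weights come from the fitted model chosen with Method; the weighted sum here only shows how many CpG values reduce to a single number per sample.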

Usage

metricMCB(MCBset, training_set, Surv, testing_set,
          Surv.new, Method, predict_time, ci, silent, alpha, n_mstop, n_nu, theta)

Arguments

MCBset

Methylation Correlation Block information returned by the IdentifyMCB function.

training_set

methylation matrix used for training the model in the analysis.

Surv

Survival object containing the survival information for the training set.

testing_set

methylation matrix used for testing the model in the analysis. If missing, the training set itself is used as the testing set.

Surv.new

Survival object containing the survival information for the testing set.

Method

model used to calculate the compound values for multiple Methylation Correlation Blocks.
Options are "svm", "cox", "mboost" and "enet". The default is "svm".

predict_time

time point of the ROC curve used in the AUC calculations. Default is 5 years.

ci

if TRUE, confidence intervals for the area under the receiver operating characteristic curve (AUC) are calculated. This is time consuming. Default is FALSE.

silent

if TRUE, processing information and the progress bar are suppressed.

alpha

The elastic-net mixing parameter, with 0 <= alpha <= 1. alpha = 1 gives the lasso penalty and alpha = 0 the ridge penalty.
Used only when Method is "enet".

n_mstop

an integer giving the number of initial boosting iterations. If n_mstop = 0, the offset model is returned.
Used only when Method is "mboost".

n_nu

a double (between 0 and 1) defining the step size or shrinkage parameter in the mboost model.
Used only when Method is "mboost".

theta

penalty used in the penalized coxph model, which is theta/2 times the sum of squared coefficients. Default is 1.
Used only when Method is "cox".
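
The two penalty parameters can be made concrete with a small base R sketch. It assumes the elastic-net penalty takes the form used by the glmnet package, lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta))), and that the ridge penalty is theta / 2 times the sum of squared coefficients, as described for theta. The helper functions and coefficient values are hypothetical and are not part of EnMCB.

```r
# Hypothetical coefficient vector used only for illustration.
beta <- c(0.5, -1.0, 2.0)

# Elastic-net penalty in the glmnet-style parameterisation (assumed here).
enet_penalty <- function(beta, alpha, lambda = 1) {
  stopifnot(alpha >= 0, alpha <= 1)
  lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
}

# Ridge penalty as described for theta: theta / 2 * sum of squared coefficients.
ridge_penalty <- function(beta, theta = 1) {
  theta / 2 * sum(beta^2)
}

enet_penalty(beta, alpha = 1)    # pure lasso term: sum(abs(beta))
enet_penalty(beta, alpha = 0)    # pure ridge term: sum(beta^2) / 2
enet_penalty(beta, alpha = 0.5)  # equal mix of the two terms
ridge_penalty(beta, theta = 1)   # larger theta shrinks coefficients more
```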

Value

Object of class list with elements (XXX will be replaced with the model name you choose):

MCB_XXX_matrix_training   Prediction results of the model for the training set.
MCB_XXX_matrix_test_set   Prediction results of the model for the test set.
XXX_auc_results           AUC results for each model.
best_XXX_model            Model object for the model with the best AUC.
maximum_auc               Maximum AUC across all generated models.
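
Because the element names depend on the chosen model, it can help to build them programmatically. The sketch below assumes that XXX is replaced by the Method string exactly as passed (for example "cox"); the spelling actually used by the package should be checked with names() on a real result. The helper result_names is hypothetical and not part of EnMCB.

```r
# Hypothetical helper (not part of EnMCB): builds the expected element names
# by substituting the model name for XXX in the patterns listed above.
result_names <- function(method) {
  patterns <- c("MCB_XXX_matrix_training", "MCB_XXX_matrix_test_set",
                "XXX_auc_results", "best_XXX_model")
  c(sub("XXX", method, patterns, fixed = TRUE), "maximum_auc")
}

result_names("cox")
# With a real result object `res`, compare against names(res) before indexing,
# e.g. res[[result_names("cox")[1]]].
```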

Author(s)

Xin Yu

References

Xin Yu et al. (2019). Predicting disease progression in lung adenocarcinoma patients based on methylation correlated blocks using ensemble machine learning classifiers (under review).

Examples

# Import the demo datasets.
data(demo_survival_data)
datamatrix <- create_demo()
data(demo_MCBinformation)

# Select MCBs with at least 3 CpGs.
demo_MCBinformation <- demo_MCBinformation[demo_MCBinformation[, "CpGs_num"] > 2, ]

# Randomly assign 60% of the samples to the training set;
# the remaining samples form the testing set.
trainingset <- colnames(datamatrix) %in% sample(colnames(datamatrix), 0.6 * length(colnames(datamatrix)))
testingset <- !trainingset

# Calculate the compound values using Cox regression.
mcb_cox_res <- metricMCB(MCBset = demo_MCBinformation,
                         training_set = datamatrix[, trainingset],
                         Surv = demo_survival_data[trainingset],
                         testing_set = datamatrix[, testingset],
                         Surv.new = demo_survival_data[testingset],
                         Method = "cox")
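
The random 60/40 split in the example changes from run to run unless the seed is fixed. The sketch below shows a reproducible version of the same masking idiom on a mock matrix with made-up sample names, so it runs without EnMCB; mock_matrix, its dimensions, and the seed value are assumptions for illustration.

```r
# Mock methylation matrix: 4 CpGs x 10 samples (values are random placeholders).
set.seed(42)  # fix the seed so the same samples are drawn on every run
mock_matrix <- matrix(runif(40), nrow = 4,
                      dimnames = list(paste0("cg", 1:4), paste0("S", 1:10)))

# Logical mask marking 60% of the columns as training samples.
samples <- colnames(mock_matrix)
in_training <- samples %in% sample(samples, round(0.6 * length(samples)))
in_testing <- !in_training

sum(in_training)  # 6 training samples
sum(in_testing)   # 4 testing samples

# The masks select complementary column subsets of the matrix.
dim(mock_matrix[, in_training])
dim(mock_matrix[, in_testing])
```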


whirlsyu/EnMCB documentation built on Jan. 25, 2023, 4:33 a.m.