mlhighHet: mlhighHet

View source: R/mlhighHet.R

mlhighHet R Documentation



This function extracts features using a machine learning (ML) method, finds optimal cutoff values for those features using sequential Cox PH models, and obtains the most consistent level according to the cutoffs.


mlhighHet(cols, idSurv, idEvent, idFrail, num, fold = 3, data)



cols: A numeric vector of column numbers indicating the features for which the log-loss functions are to be computed

idSurv: The name of the survival time variable

idEvent: The name of the survival event variable

idFrail: The name of the frailty variable

num: Number of features to be selected

fold: An integer denoting the number of folds in cross-validation; the default value is 3

data: A data frame that contains the survival and covariate information for the subjects


Performs heterogeneity analysis in gene expression

This function extracts features by minimizing the log-loss function, using the Cox proportional hazards (PH) model as the learner on high-dimensional survival data. For the selected genes, optimal cutoff values are obtained using the minimum p-value in a Cox PH model. The Cox PH model is fitted sequentially for each combination of genes, and all possible gene combinations are tested to find the combination with the minimum BIC value. The subjects are then classified according to the different levels of those genes. Using a Cox PH frailty model, the most consistent level is identified as the one for which the frailty variance is minimum.

The data are split using cross-validation. The performance measure is the logarithmic loss function. It is defined as,
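Assuming the measure follows the survival log loss used in mlr3proba (the first reference), a sketch of the definition is the negative logarithm of the predicted density evaluated at the observed time. The symbols f (predicted probability density function) and t (observed survival time) are introduced here for illustration only:

```latex
L(f, t) = -\log\bigl(f(t)\bigr)
```

Under this reading, smaller values indicate that the model assigns higher density to the observed survival times.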


The Cox PH frailty model is defined as,

λ(t) = λ₀(t) ν exp{X'β}

where λ₀(t) is the baseline hazard, X is the covariate vector with coefficients β, and ν is called the frailty. The variance of the frailty term is taken as a measure of the heterogeneity among the subjects or patients. A Gaussian distribution with mean 0 is assumed for the frailty component.
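A minimal sketch of fitting such a frailty model, assuming the survival package and simulated data; it is not taken from the mlhighHet source. The variable names (id, x, stime, event) and the simulation settings are hypothetical. In survival, the Gaussian random effect enters the linear predictor, so it corresponds to log ν in the notation used here.

```r
library(survival)

set.seed(42)
n_id  <- 40   # hypothetical number of subjects (clusters)
n_rep <- 5    # hypothetical number of event times per subject
id    <- rep(seq_len(n_id), each = n_rep)

# Simulated subject-level random effect on the log-hazard scale
b <- rnorm(n_id, mean = 0, sd = 0.7)
x <- rbinom(n_id * n_rep, 1, 0.5)  # hypothetical binary gene level

stime <- rexp(n_id * n_rep, rate = exp(0.5 * x + b[id]))
event <- rbinom(n_id * n_rep, 1, 0.85)
dat   <- data.frame(id, x, stime, event)

# Cox PH model with a Gaussian frailty term for each subject
fit <- coxph(Surv(stime, event) ~ x + frailty(id, distribution = "gaussian"),
             data = dat)

# Estimated variance of the random effect (the heterogeneity measure)
frailty_var <- fit$history[[1]]$theta
frailty_var
```

Repeating such a fit within each gene level and comparing the estimated variances would be one way to look for the level with the smallest heterogeneity.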


Returns data frames containing the optimal gene cutoff values and the most consistent level according to those cutoffs, together with the frailty variance.
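To illustrate how cutoff values of this kind might be derived, the following is a hedged sketch of a minimum p-value cutoff search followed by BIC-based selection over gene combinations. It assumes the survival package and simulated data, and it is not the package's own implementation. The gene names (g1, g2, g3), the quantile grid of candidate cutoffs, and the helper best_cutoff are all hypothetical.

```r
library(survival)

set.seed(1)
n <- 300
# Hypothetical expression values for three genes
genes <- data.frame(g1 = rnorm(n), g2 = rnorm(n), g3 = rnorm(n))
# Simulated survival data: only g1 above 0.3 raises the hazard
stime <- rexp(n, rate = exp(0.9 * (genes$g1 > 0.3)))
event <- rbinom(n, 1, 0.8)

# Step 1: for one gene, pick the cutoff with the smallest Cox PH p-value
best_cutoff <- function(expr) {
  cuts <- quantile(expr, probs = seq(0.2, 0.8, by = 0.05))
  pvals <- sapply(cuts, function(cc) {
    grp <- as.integer(expr > cc)
    fit <- coxph(Surv(stime, event) ~ grp)
    summary(fit)$coefficients[1, "Pr(>|z|)"]
  })
  unname(cuts[which.min(pvals)])
}
cutoffs <- sapply(genes, best_cutoff)

# Dichotomise each gene at its own cutoff (1 = above, 0 = at or below)
levels_df <- as.data.frame(
  Map(function(expr, cc) as.integer(expr > cc), genes, cutoffs)
)
dat <- cbind(levels_df, stime = stime, event = event)

# Step 2: fit a Cox PH model for every non-empty gene combination
combos <- unlist(
  lapply(seq_along(levels_df),
         function(k) combn(names(levels_df), k, simplify = FALSE)),
  recursive = FALSE
)
bic <- sapply(combos, function(vars) {
  f   <- as.formula(paste("Surv(stime, event) ~", paste(vars, collapse = " + ")))
  fit <- coxph(f, data = dat)
  # BIC based on the partial likelihood, using the number of events
  -2 * fit$loglik[2] + log(fit$nevent) * length(coef(fit))
})
best_combo <- combos[[which.min(bic)]]

cutoffs
best_combo
```

Restricting candidate cutoffs to the inner quantiles is a design choice made here so that neither group becomes too small; the grid used by mlhighHet may differ.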


Atanu Bhattacharjee, Gajendra K. Vishwakarma & Souvik Banerjee


Sonabend, R., Király, F. J., Bender, A., Bischl, B. and Lang, M. mlr3proba: An R Package for Machine Learning in Survival Analysis, 2021, Bioinformatics.

Bhattacharjee, A., Vishwakarma, G. K. and Banerjee, S. A modified risk detection approach of biomarkers by frailty effect on multiple time to event data, 2020, <arXiv:2012.02102>.

See Also

mlhighCox, mlhighFrail


## Not run: 
# Load the package that provides mlhighHet
library(highMLR)
mlhighHet(cols = 27:32, idSurv = "OS", idEvent = "Death", idFrail = "ID",
          num = 2, fold = 3, data = hnscc)

## End(Not run)

highMLR documentation built on July 18, 2022, 9:06 a.m.
