DriverGeneAnalysis: DrGA: driver gene analysis in an automatic manner

View source: R/DrGA.R

DriverGeneAnalysis    R Documentation

DrGA: driver gene analysis in an automatic manner

Description

DrGA is an R package built on our recent driver gene analysis scheme. It aims to fully automate driver gene analysis and to reduce the time the process requires by combining state-of-the-art statistical tools into a single workflow.

Usage

DriverGeneAnalysis(organism, sources, methodCC,
       exp, clinicalEXP, timeEXP, statusEXP,
       datMODULE4, cliMODULE4, timeMODULE4, statusMODULE4,
       minClusterSize, verbose,
       NetworkType, hm_row_names)

Arguments

organism

organism name. Organism names are constructed by concatenating the first letter of the genus name and the species name, in lowercase. Example: human - hsapiens, mouse - mmusculus. Default is organism = "hsapiens".

sources

data sources for the biological mechanisms to be tested (e.g., Gene Ontology - GO:BP, GO:MF, GO:CC; KEGG; REAC; TF; MIRNA; CORUM; HP; HPA; WP). Please see the g:GOSt web tool for the comprehensive list and details of the incorporated data sources. Default is sources = c("GO:BP", "KEGG").

methodCC

correlation method. Allowed values are "spearman" (default), "pearson", and "kendall".

exp

a data frame or matrix whose rows are samples and whose columns are genes. It is the input data for the second and third modules.

clinicalEXP

a data frame or matrix whose rows are samples and whose columns are clinical features of your choice. Note that to perform survival analysis, clinicalEXP must include two columns: overall survival time (continuous variable) and overall survival status (binary variable; usually coded 1 for death and 0 for alive) for all subjects.

timeEXP

a vector of overall survival time. It is a column vector of clinicalEXP.

statusEXP

a vector of overall survival status. It is a column vector of clinicalEXP.

datMODULE4

a data frame or matrix whose rows are samples and whose columns are genes. It is the input data for the fourth module.

cliMODULE4

a data frame or matrix whose rows are samples and whose columns are clinical features of your choice. Note that to perform survival analysis, cliMODULE4 must include two columns: overall survival time (continuous variable) and overall survival status (binary variable; usually coded 1 for death and 0 for alive) for all subjects.

timeMODULE4

a vector of overall survival time. It is a column vector of cliMODULE4.

statusMODULE4

a vector of overall survival status. It is a column vector of cliMODULE4.

minClusterSize

Minimum cluster size. minClusterSize = 10 is default.

verbose

logical specifying whether to print details of the analysis process. Default is TRUE.

NetworkType

network type. Allowed values are (unique abbreviations of) "unsigned", "signed", and "signed hybrid". Default is "signed".

hm_row_names

logical. If hm_row_names = TRUE (default), gene names appear in the rows of the heatmap. If there are too many driver genes for their names to be displayed legibly, users can turn them off with hm_row_names = FALSE.
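
To make the methodCC and NetworkType options more concrete, the sketch below computes a gene-by-gene correlation matrix with base R's cor() and turns it into the three adjacency types as they are defined in WGCNA. That DrGA applies exactly these transformations internally is an assumption. The helper toy_adjacency(), the soft-threshold power of 6, and the simulated data are hypothetical and serve only as an illustration.

```r
# Illustrative sketch only: toy_adjacency() is a hypothetical helper,
# not a function exported by DrGA.
set.seed(1)

# Toy expression matrix: rows are samples, columns are genes.
expr <- matrix(rnorm(20 * 4), nrow = 20,
               dimnames = list(paste0("S", 1:20), paste0("gene", 1:4)))

toy_adjacency <- function(dat, methodCC = "spearman",
                          NetworkType = "signed", power = 6) {
  # match.arg() accepts unique abbreviations of the allowed values.
  methodCC <- match.arg(methodCC, c("spearman", "pearson", "kendall"))
  NetworkType <- match.arg(NetworkType,
                           c("unsigned", "signed", "signed hybrid"))

  r <- cor(dat, method = methodCC)  # gene-by-gene correlation matrix

  switch(NetworkType,
         "unsigned"      = abs(r)^power,          # sign of correlation ignored
         "signed"        = ((1 + r) / 2)^power,   # negative correlations near 0
         "signed hybrid" = pmax(r, 0)^power)      # negative correlations set to 0
}

adj_signed   <- toy_adjacency(expr, "spearman", "signed")
adj_unsigned <- toy_adjacency(expr, "pearson", "unsigned")
adj_hybrid   <- toy_adjacency(expr, "kendall", "signed hybrid")
```

In all three cases the result is a symmetric genes-by-genes matrix with values between 0 and 1. The types differ only in how negative correlations are treated.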

Author(s)

Quang-Huy Nguyen

References

Quang-Huy Nguyen, Duc-Hau Le. (2022). DrGA: cancer driver gene analysis in a simpler manner. BMC Genomics, 23(1):86.

Examples

DriverGeneAnalysis(exp = exp, clinicalEXP = clinicalEXP,
       timeEXP = clinicalEXP$time, statusEXP = clinicalEXP$status,
       datMODULE4 = cna, cliMODULE4 = clinicalCNA,
       timeMODULE4 = clinicalCNA$time, statusMODULE4 = clinicalCNA$status)
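
The call above relies on objects (exp, clinicalEXP, cna, clinicalCNA) that are not defined on this page. As a minimal sketch of the layout described under Arguments, the code below builds small simulated stand-ins for exp and clinicalEXP and extracts the time and status vectors. All values, the sample and gene names, and the column names age, time, and status are made up for illustration. The row-name check is a precaution assumed here, not a documented requirement.

```r
# Simulated stand-ins for the expected input layout; not real data.
set.seed(42)
samples <- paste0("sample", 1:6)
genes   <- c("geneA", "geneB", "geneC")

# Expression data: rows are samples, columns are genes.
# (The name `exp` follows the argument name and masks base::exp locally.)
exp <- matrix(rnorm(length(samples) * length(genes)),
              nrow = length(samples),
              dimnames = list(samples, genes))

# Clinical data: rows are samples, columns are clinical features,
# including overall survival time and status (1 = death, 0 = alive).
clinicalEXP <- data.frame(
  age    = c(54, 61, 47, 70, 66, 58),
  time   = c(12.5, 30.1, 8.2, 45.0, 22.7, 60.3),
  status = c(1, 0, 1, 0, 1, 0),
  row.names = samples
)

# Precautionary checks (assumed, not required by the documentation).
stopifnot(identical(rownames(exp), rownames(clinicalEXP)))
stopifnot(is.numeric(clinicalEXP$time),
          all(clinicalEXP$status %in% c(0, 1)))

# Vectors to pass as timeEXP and statusEXP.
timeEXP   <- clinicalEXP$time
statusEXP <- clinicalEXP$status
```

The inputs for the fourth module (datMODULE4, cliMODULE4, timeMODULE4, statusMODULE4) would follow the same layout.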

huynguyen250896/DrGA documentation built on Oct. 18, 2023, 5:47 a.m.