View source: R/PRODIGY_cohort.R
PRODIGY_cohort | R Documentation |
This is a wrapper that runs PRODIGY for a cohort of patients. It could take a long time to finish a group of patints, hence it is advised to output PRODIGY's results into files using the write_results parameter. It is also advised to use as much cores as possible using the num_of_cores parameter
PRODIGY_cohort( snv_matrix = NULL, expression_matrix, network = NULL, samples = NULL, DEGs = NULL, alpha = 0.05, pathway_list = NULL, num_of_cores = 1, sample_origins = NULL, write_results = F, results_folder = "./", beta = 2, gamma = 0.05, delta = 0.05, mutation_list = NULL )
snv_matrix |
A binary matrix with genes in rows and patients on columns. 1=mutation. All genes must be contained in the global PPI network. |
expression_matrix |
A read count matrix with genes in rows and patients on columns. All genes must be contained in the global PPI network. |
network |
The global PPI network. Columns describe the source protein, destination protein and interaction score respectively. The network is considered as undirected. |
DEGs |
Named list of differentially expressed genes for every sample. All genes must be contained in the global PPI network. |
alpha |
The penalty exponent. |
pathway_list |
A list where each object is a 3 column data.table (src,dest,weight). Names correspond to pathway names |
num_of_cores |
The number of CPU cores to be used by the influence scores calculation step. |
sample_origins |
A vector that contains two optional values ("tumor","normal") corresponds to the tissues from which each column in expression_matrix was derived. This vector is utilized for differential expression analysis. If no vector is specified, the sample names of expression_matrix are assumed to be in TCGA format where last two digits correspond to sample type: "01"= solid tumor and "11"= normal. |
write_results |
Should the results be written to text files? |
results_folder |
Location for resulting influence matrices storage (if write_results = T) |
beta |
Minimal fold-change threshold for declering gene as differentially expressed by DESeq (default = 0.2) |
gamma |
FDR threshold for declering gene as differentially expressed by DESeq (default = 0.05) |
delta |
FDR threshold for declering a pathway as statistically enriched for differentially expressed genes (default = 0.05) |
mutation_list |
An alternative to snv_matrix, one can simply provide a list of named vectors with genes to be analyzed by PRODIGY |
sample |
The sample labels as appears in the SNV and expression matrices. |
A list of influence scores matrices.
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 1-21 (2014). Gabriele Sales, Enrica Calura and Chiara Romualdi, graphite: GRAPH Interaction from pathway Topological Environment (2017).
data(COAD_SNV) data(COAD_Expression) # Load STRING network data data(STRING_network) network = STRING_network # Take samples for which SNV and expression is available samples = intersect(colnames(expression_matrix),colnames(snv_matrix))[1:5] # Get differentially expressed genes (DEGs) for all samples expression_matrix = expression_matrix[which(rownames(expression_matrix) %in% unique(c(network[,1],network[,2]))),] DEGs = get_DEGs(expression_matrix,samples,sample_origins=NULL) # Identify sample origins (tumor or normal) sample_origins = rep("tumor",ncol(expression_matrix)) sample_origins[substr(colnames(expression_matrix),nchar(colnames(expression_matrix)[1])-1,nchar(colnames(expression_matrix)[1]))=="11"] = "normal" # Run PRODIGY all_patients_scores = PRODIGY_cohort(snv_matrix,expression_matrix,network=network,samples=samples,DEGs=DEGs,alpha=0.05,pathwayDB="reactome",num_of_cores=1,sample_origins=sample_origins, write_results = F, results_folder = "./",beta=2,gama=0.05,delta=0.05)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.