preprocess_COSMOS_signaling_to_metabolism: Preprocess COSMOS Inputs For Signaling to Metabolism

View source: R/preprocess_COSMOS_signaling_to_metabolism.R

preprocess_COSMOS_signaling_to_metabolismR Documentation

Preprocess COSMOS Inputs For Signaling to Metabolism

Description

Runs checks on the input data and simplifies the prior knowledge network. Simplification includes the removal of (1) nodes that are not reachable from signaling nodes and (2) interactions between transcription factors and target genes if the target gene does not respond or the response is contradictory with the change in the transcription factor activity. Optionally, further TF activities are estimated via network optimization via CARNIVAL and the interactions between TF and genes are filtered again.

Usage

preprocess_COSMOS_signaling_to_metabolism(
  meta_network = meta_network,
  tf_regulon = load_tf_regulon_dorothea(),
  signaling_data,
  metabolic_data,
  diff_expression_data = NULL,
  diff_exp_threshold = 1,
  maximum_network_depth = 8,
  expressed_genes = NULL,
  remove_unexpressed_nodes = TRUE,
  filter_tf_gene_interaction_by_optimization = TRUE,
  CARNIVAL_options = default_CARNIVAL_options("lpSolve")
)

Arguments

meta_network

prior knowledge network (PKN). A PKN released with COSMOS and derived from Omnipath, STITCHdb and Recon3D can be used. See details on the data meta_network.

tf_regulon

collection of transcription factor - target interactions. A default collection from dorothea can be obtained by the load_tf_regulon_dorothea function.

signaling_data

numerical vector, where names are signaling nodes in the PKN and values are from {1, 0, -1}. Continuous data will be discretized using the sign function.

metabolic_data

numerical vector, where names are metabolic nodes in the PKN and values are continuous values that represents log2 fold change or t-values from a differential analysis. These values are compared to the simulation results (simulated nodes can take value -1, 0 or 1)

diff_expression_data

(optional) numerical vector that represents the results of a differential gene expression analysis. Names are gene names using gene symbole and values are log fold change or t-values. We use the “diff_exp_threshold” parameter to decide which genes changed significantly. Genes with NA values are considered none expressed and they will be removed from the TF-gene expression interactions.

diff_exp_threshold

threshold parameter (default 1) used to binarize the values of “diff_expression_data”.

maximum_network_depth

integer > 0 (default: 8). Nodes that are further than “maximum_network_depth” steps from the signaling nodes on the directed graph of the PKN are considered non-reachable and are removed.

expressed_genes

character vector. Names of nodes that are expressed. By default we consider all the nodes that appear in diff_expression_data with a numeric value (i.e. nodes with NA are removed)

remove_unexpressed_nodes

if TRUE (default) removes nodes from the PKN that are not expressed, see input “expressed_genes”.

filter_tf_gene_interaction_by_optimization

(default:TRUE), if TRUE then runs a network optimization that estimates TF activity not included in the inputs and checks the consistency between the estimated activity and change in gene expression. Removes interactions where TF and gene expression are inconsistent

CARNIVAL_options

list that controls the options of CARNIVAL. See details in default_CARNIVAL_options.

Value

cosmos_data object with the following fields:

meta_network

Filtered PKN

tf_regulon

TF - target regulatory network

signaling_data_bin

Binarised signaling data

metabolic_data

Metabolomics data

diff_expression_data_bin

Binarized gene expression data

optimized_network

Initial optimized network if filter_tf_gene_interaction_by_optimization is TRUE

See Also

meta_network for meta PKN, load_tf_regulon_dorothea for tf regulon, runCARNIVAL.

Examples

data(toy_network)
data(toy_signaling_input)
data(toy_metabolic_input)
data(toy_RNA)
test_for <- preprocess_COSMOS_signaling_to_metabolism(meta_network = toy_network,
     signaling_data = toy_signaling_input,
     metabolic_data = toy_metabolic_input,
     diff_expression_data = toy_RNA,
     maximum_network_depth = 15,
     remove_unexpressed_nodes = TRUE,
     CARNIVAL_options = default_CARNIVAL_options("lpSolve"))

saezlab/COSMOS documentation built on Sept. 17, 2023, 1:22 p.m.