score_candidate_genes_from_PPI | R Documentation |
The wppi package implements a prioritization of genes according to their potential relevance in a disease or other experimental or physiological condition. For this it uses a protein-protein interaction (PPI) network and functional annotations. The PPIs in the neighborhood of the genes of interest are weighted according to the number of common neighbors of the interacting partners and the similarity of their functional annotations. The PPI networks are obtained from the OmniPath (https://omnipathdb.org/) resource, and functionality is deduced from the Gene Ontology (GO, http://geneontology.org/) and Human Phenotype Ontology (HPO, https://hpo.jax.org/app/) ontology databases. To score the candidate genes, a Random Walk with Restart algorithm is applied on the weighted network.
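To illustrate the "number of common neighbors" idea mentioned above, here is a minimal base R sketch on a toy network. The protein names and edges are hypothetical, and this is not the weighting code used inside wppi (which also combines annotation similarity); it only shows how shared neighbors of interacting partners can be counted from an adjacency matrix.

```r
# Toy undirected PPI network as a symmetric 0/1 adjacency matrix.
# Protein names A-E and the edges are hypothetical.
proteins <- c("A", "B", "C", "D", "E")
adj <- matrix(0, nrow = 5, ncol = 5, dimnames = list(proteins, proteins))
edges <- rbind(
    c("A", "B"), c("A", "C"), c("B", "C"),
    c("B", "D"), c("C", "D"), c("D", "E")
)
for (i in seq_len(nrow(edges))) {
    adj[edges[i, 1], edges[i, 2]] <- 1
    adj[edges[i, 2], edges[i, 1]] <- 1
}

# Entry [i, j] of the matrix product counts proteins adjacent to both i and j.
common_neighbors <- adj %*% adj

# Keep the counts only for pairs that actually interact.
edge_common <- common_neighbors * adj
edge_common
```

In this toy network the interaction B-C has two shared neighbors (A and D), while D-E has none, so a weighting based on shared neighbors would favor B-C.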
score_candidate_genes_from_PPI(
  genes_interest,
  HPO_interest = NULL,
  percentage_output_genes = 100,
  graph_order = 1,
  GO_annot = TRUE,
  GO_slim = NULL,
  GO_aspects = c("C", "F", "P"),
  GO_organism = "human",
  HPO_annot = TRUE,
  restart_prob_rw = 0.4,
  threshold_rw = 1e-05,
  databases = NULL,
  ...
)
genes_interest |
Character vector of gene symbols with genes known to be related to the investigated disease or condition. |
HPO_interest |
Character vector with Human Phenotype Ontology (HPO) annotations of interest from which to construct the functionality (for a list of available annotations see the 'Name' column in the data frame provided by |
percentage_output_genes |
Positive integer (between 0 and 100) specifying the percentage (%) of the total candidate genes in the network to return in the output. If not specified, the scores of all candidate genes are returned. |
graph_order |
Integer larger than zero: the neighborhood range, counted as steps from the genes of interest. The genes within this range, also called candidate genes, together with the given genes of interest define the Protein-Protein Interaction (PPI) network used in the analysis. If not specified, the first order neighbors are used. |
GO_annot |
Logical: use the Gene Ontology (GO) annotation database to weight the PPI network. The default is to use it. |
GO_slim |
Character: use a GO subset (slim). If |
GO_aspects |
Character vector with the single letter codes of the Gene Ontology aspects to use. By default all three aspects are used. The aspects are "C": cellular component, "F": molecular function and "P": biological process. |
GO_organism |
Character: name of the organism for GO annotations. |
HPO_annot |
Logical: use the Human Phenotype Ontology (HPO) annotation database to weight the PPI network. The default is to use it. |
restart_prob_rw |
Numeric: between 0 and 1, defines the restart probability parameter used in the Random Walk with Restart algorithm. The default value is 0.4. |
threshold_rw |
Numeric: the threshold parameter in the Random Walk with Restart algorithm. When the error between probabilities is smaller than the threshold, the algorithm stops. The default is 1e-5. |
databases |
Database knowledge as produced by |
... |
Passed to
|
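The roles of restart_prob_rw and threshold_rw can be sketched with a generic Random Walk with Restart in base R. This is an illustrative sketch only, not the package's own implementation: the function name random_walk_restart, the toy weight matrix w and the use of the summed absolute difference as the "error between probabilities" are all assumptions.

```r
# Generic Random Walk with Restart on a weighted adjacency matrix.
# Hypothetical helper for illustration; not part of wppi.
random_walk_restart <- function(adj, seeds,
                                restart_prob = 0.4, threshold = 1e-5) {
    # Column-normalise so that each column holds transition probabilities.
    col_sums <- colSums(adj)
    col_sums[col_sums == 0] <- 1
    W <- sweep(adj, 2, col_sums, "/")
    # Start vector: probability spread evenly over the seed nodes.
    p0 <- as.numeric(rownames(adj) %in% seeds)
    p0 <- p0 / sum(p0)
    p <- p0
    repeat {
        # Walk one step with probability 1 - restart_prob,
        # jump back to the seeds with probability restart_prob.
        p_next <- (1 - restart_prob) * as.vector(W %*% p) + restart_prob * p0
        # Assumed error measure: summed absolute change between iterations.
        delta <- sum(abs(p_next - p))
        p <- p_next
        if (delta < threshold) break
    }
    setNames(p, rownames(adj))
}

# Toy weighted network with hypothetical proteins A-D.
proteins <- c("A", "B", "C", "D")
w <- matrix(
    c(0, 2, 1, 0,
      2, 0, 1, 0,
      1, 1, 0, 1,
      0, 0, 1, 0),
    nrow = 4, byrow = TRUE, dimnames = list(proteins, proteins)
)
scores <- random_walk_restart(w, seeds = "A")
sort(scores, decreasing = TRUE)
```

A higher restart probability keeps the scores concentrated near the seed genes, while a smaller threshold requires more iterations before the loop stops.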
If you use a GO subset (slim), building it for the first time might take around 20 minutes. The result is saved to the cache, so loading the data the next time is quick.
Gene Ontology annotations are available for a few other organisms apart from human. The currently supported organisms are "chicken", "cow", "dog", "human", "pig" and "uniprot_all". If you disable HPO_annot, you can use wppi to score PPI networks of organisms other than human.
Data frame with the ranked candidate genes based on the functional score inferred from given ontology terms, PPI and Random Walk with Restart parameters.
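As a rough illustration of how a percentage cut such as percentage_output_genes relates to a ranked table, the sketch below keeps the top 30% of a hypothetical score table. The data, the gene names and the rounding rule (ceiling) are assumptions, not the behavior documented for wppi.

```r
# Hypothetical scores for ten candidate genes (made-up values).
ranked <- data.frame(
    score = c(0.05, 0.31, 0.12, 0.22, 0.08, 0.02, 0.11, 0.04, 0.03, 0.02),
    gene_symbol = paste0("GENE", 1:10),
    stringsAsFactors = FALSE
)

# Rank by decreasing score.
ranked <- ranked[order(ranked$score, decreasing = TRUE), ]

# Keep the requested percentage of rows; rounding up is an assumption.
percentage <- 30
n_keep <- ceiling(nrow(ranked) * percentage / 100)
top <- head(ranked, n_keep)
top
```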
wppi_data
weighted_adj
random_walk
prioritization_genes
# example gene set
genes_interest <- c(
    "ERCC8", "AKT3", "NOL3", "GFI1B", "CDC25A", "TPX2", "SHE"
)
# example HPO annotations set
hpo <- wppi_hpo_data()
HPO_interest <- unique(
    dplyr::filter(hpo, grepl("Diabetes", .data$Name))$Name
)
# Score 1st-order candidate genes
new_genes_diabetes <- score_candidate_genes_from_PPI(
    genes_interest = genes_interest,
    HPO_interest = HPO_interest,
    percentage_output_genes = 10,
    graph_order = 1
)
new_genes_diabetes
# # A tibble: 30 x 3
#    score gene_symbol uniprot
#    <dbl> <chr>       <chr>
#  1 0.247 KNL1        Q8NG31
#  2 0.247 HTRA2       O43464
#  3 0.247 KAT6A       Q92794
#  4 0.247 BABAM1      Q9NWV8
#  5 0.247 SKI         P12755
# # … with 25 more rows