View source: R/prioritise_targets.R
prioritise_targets | R Documentation |
Prioritise target genes based on a procedure:
Disease-level: keep_deaths
    Keep only diseases with a certain age of death.
Disease-level: severity_threshold_max
    Keep only diseases annotated as a certain degree of severity or greater (filters on maximum severity per disease).
Phenotype-level: prune_ancestors
    Remove redundant ancestral phenotypes when at least one of their descendants already exists.
Phenotype-level: keep_descendants
    Keep only phenotypes belonging to a certain branch of the HPO, as defined by an ancestor term.
Phenotype-level: keep_ont_levels
    Keep only phenotypes at certain absolute ontology levels within the HPO.
Phenotype-level: pheno_ndiseases_threshold
    The maximum number of diseases each phenotype can be associated with.
Phenotype-level: keep_tiers
    Keep only phenotypes with high severity Tiers.
Phenotype-level: severity_threshold
    Keep only phenotypes with mean Severity equal to or below the threshold.
Phenotype-level: gpt_filters
    Keep only phenotypes with certain GPT annotations in specific severity metrics.
Phenotype-level: severity_score_gpt_threshold
    Keep only phenotypes with a minimum GPT severity score.
Phenotype-level: info_content_threshold
    Keep only phenotypes with a minimum information content score (computed from the HPO).
Symptom-level: pheno_frequency_threshold
    Keep only phenotypes with mean frequency equal to or above the threshold (i.e. how frequently a phenotype is associated with any diseases in which it occurs).
Symptom-level: keep_onsets
    Keep only symptoms with a certain age of onset.
Symptom-level: symptom_p_threshold
    Uncorrected p-value threshold to filter cell type-symptom associations by.
Symptom-level: symptom_intersection_threshold
    Minimum proportion of genes overlapping between a symptom gene list (phenotype-associated genes in the context of a particular disease) and the phenotype-cell type association driver genes.
Cell type-level: q_threshold
    Keep only cell type-phenotype association results at q<=0.05.
Cell type-level: effect_threshold
    Keep only cell type-phenotype association results at effect size>=1.
Cell type-level: keep_celltypes
    Keep only terminally differentiated cell types.
Gene-level: keep_chr
    Remove genes on non-standard chromosomes.
Gene-level: evidence_score_threshold
    Remove genes that are below an aggregate phenotype-gene evidence score threshold.
Gene-level: gene_size
    Keep only genes <4.3kb in length.
Gene-level: add_driver_genes
    Keep only genes that are driving the association with a given phenotype (inferred by the intersection of phenotype-associated genes and genes with high-specificity quantiles in the target cell type).
Gene-level: keep_biotypes
    Keep only genes belonging to certain biotypes.
Gene-level: gene_frequency_threshold
    Keep only genes at or above a certain mean frequency threshold (i.e. how frequently a gene is associated with a given phenotype when observed within a disease).
Gene-level: keep_specificity_quantiles
    Keep only genes in top specificity quantiles from the cell type dataset (CTD).
Gene-level: keep_mean_exp_quantiles
    Keep only genes in top mean expression quantiles from the cell type dataset (CTD).
Gene-level: symptom_gene_overlap
    Ensure that genes nominated at the phenotype-level also appear in the genes overlapping at the cell type-specific symptom-level.
All levels: sort_cols
    Sort candidate targets by one or more columns (e.g. "severity_score_gpt", "q").
All levels: top_n
    Only return the top N targets per variable group (specified with the "group_vars" argument). For example, setting "group_vars" to "hpo_id" and "top_n" to 1 would only return one target (row) per phenotype ID after sorting.
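The final sort_cols and top_n steps can be illustrated on a toy table. This is a minimal sketch, not the package's internal code: it assumes that the names of sort_cols are column names and the values are sort directions (-1 descending, 1 ascending), as the default value suggests, and that the data.table package is installed. The table contents and the gene_symbol column are invented for illustration.

```r
library(data.table)

# Toy candidate table; IDs, genes and scores are made up.
targets <- data.table(
  hpo_id = c("HP:toy1", "HP:toy1", "HP:toy2", "HP:toy2", "HP:toy2"),
  gene_symbol = c("A", "B", "C", "D", "E"),
  severity_score_gpt = c(30, 30, 50, 50, 10),
  q = c(0.01, 0.001, 0.04, 0.02, 0.001)
)

# Assumed convention: -1 = descending, 1 = ascending.
sort_cols <- c(severity_score_gpt = -1, q = 1)
top_n <- 1
group_vars <- "hpo_id"

# Sort a copy by reference so the original table is left untouched.
sorted <- copy(targets)
setorderv(sorted, cols = names(sort_cols), order = as.integer(sort_cols))

# Keep the first top_n rows within each group after sorting.
top_targets <- sorted[, head(.SD, top_n), by = group_vars]
print(top_targets)
```

With top_n set to 1 and group_vars set to "hpo_id", each phenotype ID contributes only its best-ranked row, which matches the behaviour described for top_n above.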
prioritise_targets(
  results = load_example_results(),
  ctd_list = load_example_ctd(c("ctd_DescartesHuman.rds", "ctd_HumanCellLandscape.rds"),
    multi_dataset = TRUE),
  phenotype_to_genes = HPOExplorer::load_phenotype_to_genes(),
  hpo = HPOExplorer::get_hpo(),
  keep_deaths = HPOExplorer::list_deaths(exclude = c("Miscarriage", "Stillbirth",
    "Prenatal death"), include_na = TRUE),
  keep_descendants = c("Phenotypic abnormality"),
  keep_ont_levels = NULL,
  pheno_ndiseases_threshold = NULL,
  gpt_filters = NULL,
  severity_score_gpt_threshold = 20,
  keep_tiers = NULL,
  severity_threshold_max = NULL,
  info_content_threshold = 8,
  run_prune_ancestors = TRUE,
  severity_threshold = NULL,
  pheno_frequency_threshold = NULL,
  keep_onsets = HPOExplorer::list_onsets(include_na = TRUE),
  effect_var = "logFC",
  q_threshold = 0.05,
  effect_threshold = 1,
  symptom_intersection_threshold = 0.25,
  keep_celltypes = NULL,
  evidence_score_threshold = 15,
  keep_chr = c(seq(22), "X", "Y"),
  gene_size = list(min = 0, max = Inf),
  gene_frequency_threshold = NULL,
  keep_biotypes = NULL,
  keep_specificity_quantiles = seq(30, 40),
  keep_mean_exp_quantiles = seq(30, 40),
  sort_cols = c(severity_score_gpt = -1, q = 1, logFC = -1, specificity = -1,
    mean_exp = -1, pheno_freq_mean = -1, gene_freq_mean = -1, width = 1),
  top_n = NULL,
  group_vars = c("hpo_id"),
  return_report = TRUE,
  verbose = TRUE
)
results |
The cell type-phenotype enrichment results generated by gen_results and merged together with merge_results |
ctd_list |
A named list of CellTypeDataset objects each created with generate_celltype_data. |
phenotype_to_genes |
Output of load_phenotype_to_genes mapping phenotypes to gene annotations. |
hpo |
Human Phenotype Ontology object, loaded from get_ontology. |
keep_deaths |
The age of death associated with each HPO ID to keep. If >1 age of death is associated with the term, only the earliest age is considered. See add_death for details. |
keep_descendants |
Terms whose descendants should be kept
(including themselves).
Set to |
keep_ont_levels |
Only keep phenotypes at certain absolute ontology levels. See add_ont_lvl for details. |
pheno_ndiseases_threshold |
Filter phenotypes by the maximum number of diseases they are associated with. |
gpt_filters |
A named list of filters to apply to the GPT annotations. |
severity_score_gpt_threshold |
The minimum GPT severity score that a phenotype can have across any disease. |
keep_tiers |
Tiers from hpo_tiers to keep.
Include |
severity_threshold_max |
The max severity score that a phenotype can have across any disease. |
info_content_threshold |
Minimum phenotype information content threshold. |
run_prune_ancestors |
Prune redundant ancestral terms if any of their descendants are present. Passes to prune_ancestors. |
severity_threshold |
Only keep phenotypes with a mean
severity score (averaged across multiple associated diseases) below the
set threshold. The severity score ranges from 1-4 where 1 is the MOST severe.
Include |
pheno_frequency_threshold |
Only keep phenotypes with frequency
above the set threshold. Frequency ranges from 0-100 where 100 is
a phenotype that occurs 100% of the time in all associated diseases.
Include |
keep_onsets |
The age of onset associated with each HPO ID to keep. If >1 age of onset is associated with the term, only the earliest age is considered. See add_onset for details. |
effect_var |
Name of the effect size column in the results table. |
q_threshold |
The q value threshold to subset the results by. |
effect_threshold |
The minimum fold change in specific expression to subset the results by. |
symptom_intersection_threshold |
Minimum proportion of genes overlapping between a symptom gene list (phenotype-associated genes in the context of a particular disease) and the phenotype-cell type association driver genes. |
keep_celltypes |
Cell types to keep. |
evidence_score_threshold |
The minimum threshold of mean evidence scores of each gene-phenotype association to keep. |
keep_chr |
Chromosomes to keep. |
gene_size |
Min/max gene size (important for therapeutics design). |
gene_frequency_threshold |
Only keep genes with frequency
above the set threshold. Frequency ranges from 0-100 where 100 is
a gene that occurs 100% of the time in a given phenotype.
Include |
keep_biotypes |
Which gene biotypes to keep (e.g. "protein_coding", "processed_transcript", "snRNA", "lincRNA", "snoRNA", "IG_C_gene"). |
keep_specificity_quantiles |
Which cell type specificity quantiles to keep (max quantile is 40). |
keep_mean_exp_quantiles |
Which cell type mean expression quantiles to keep (max quantile is 40). |
sort_cols |
How to sort the rows using setorderv.
|
top_n |
Top N genes to keep when grouping by group_vars. |
group_vars |
Columns to group by when selecting top_n genes. |
return_report |
If |
verbose |
Print messages. |
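To make the quantile arguments concrete, the sketch below bins a vector of specificity scores into 40 quantiles and keeps bins 30 to 40, mirroring the default keep_specificity_quantiles = seq(30, 40). How the CTD quantiles are actually computed is not described on this page, so the rank-based binning here is an assumption, and the gene names and scores are simulated.

```r
set.seed(1)

# Simulated specificity scores for 200 hypothetical genes.
n_genes <- 200
specificity <- runif(n_genes)
names(specificity) <- paste0("gene", seq_len(n_genes))

# Assumption: quantile bins are assigned by rank, with 40 equally sized bins.
n_quantiles <- 40
gene_rank <- rank(specificity, ties.method = "first")
quantile_bin <- ceiling(gene_rank * n_quantiles / n_genes)

# Keep only genes falling in the top bins (30 through 40).
keep_specificity_quantiles <- seq(30, 40)
kept <- specificity[quantile_bin %in% keep_specificity_quantiles]

length(kept)
range(quantile_bin)
```

Narrowing the range (for example seq(38, 40)) keeps fewer, more cell type-specific genes; the same idea would apply to keep_mean_exp_quantiles.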
Term key:
Disease:
A disease defined in the OMIM, DECIPHER and/or Orphanet databases.
Phenotype:
A clinical feature associated with one or more diseases.
Symptom:
A phenotype within the context of a particular disease. Within a given phenotype, there may be multiple symptoms with partially overlapping genetic mechanisms.
Association:
A cell type-specific enrichment test result conducted at the disease level, phenotype level, or symptom level.
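Because a symptom is a phenotype within one disease, its gene list can differ from the driver genes of the phenotype-cell type association. The sketch below shows one way the overlap behind symptom_intersection_threshold could be computed. It is an illustration only: the gene names are invented, and using the symptom gene list as the denominator is an assumption that this page does not confirm.

```r
# Hypothetical gene sets for a single symptom and cell type.
symptom_genes <- c("G1", "G2", "G3", "G4")  # phenotype genes within one disease
driver_genes <- c("G2", "G4", "G9")         # phenotype-cell type driver genes

symptom_intersection_threshold <- 0.25

# Assumption: proportion is taken relative to the symptom gene list.
overlap <- intersect(symptom_genes, driver_genes)
intersection_prop <- length(overlap) / length(symptom_genes)

# The symptom-cell type pair is retained if the proportion meets the threshold.
keep_symptom <- intersection_prop >= symptom_intersection_threshold
keep_symptom
```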
A data.table of the prioritised phenotype- and cell type-specific gene targets.
# Subset the example results to significant associations (q < 0.05)
results <- load_example_results()[q < 0.05]
out <- prioritise_targets(results = results)