View source: R/predict_target_genes.R
predict_target_genes | R Documentation |
The master, user-facing function of this package.
predict_target_genes(
  trait = NULL,
  out_dir = NULL,
  variants_file = NULL,
  known_genes_file = NULL,
  reference_panels_dir = NULL,
  celltype_of_interest = NULL,
  tissue_of_interest = NULL,
  celltypes = "enriched_tissues",
  variant_to_gene_max_distance = 2e+06,
  max_n_known_genes_per_CS = Inf,
  do_scoring = T,
  do_performance = T,
  do_XGBoost = F,
  do_timestamp = F,
  HiChIP = NULL,
  H3K27ac = NULL
)
trait |
Optional. The name of the trait of interest. |
out_dir |
The output directory in which to save the predictions. Default is "./out/trait/celltypes/". |
variants_file |
A BED file of trait-associated variants grouped by association signal, for example SNPs correlated with an index variant, or credible sets of fine-mapped variants. |
known_genes_file |
Optional. A file listing the known gene symbols for the trait. Required if do_performance is TRUE. |
reference_panels_dir |
The directory containing the external, accompanying reference panels data. |
celltype_of_interest |
Optional. The celltype(s) of interest for the trait. Only annotations in these celltypes will be used to make predictions. Argument(s) must match the names of celltypes in the metadata. Make sure the celltype of interest has coverage across all annotations (TADs, HiChIP, expression, H3K27ac) in the metadata table. |
tissue_of_interest |
Optional. The tissue(s) of interest for the trait. Only annotations in these tissues will be used to make predictions. Argument(s) must match the names of tissues in the metadata. |
celltypes |
Dictates which celltypes' annotations are used. Must be one of c("enriched_celltypes", "enriched_tissues", "all_celltypes"). If "enriched_celltypes", annotations from only the enriched celltype(s) will be used. The enriched celltype(s) must have coverage across all annotations (TADs, HiChIP, expression, H3K27ac) in the metadata table for this to work. If "enriched_tissues", all annotations from the tissue of the enriched celltype(s) will be used. If "all_celltypes", the enrichment analysis is skipped and annotations from all available cell types will be used. Default is "enriched_tissues". |
variant_to_gene_max_distance |
The maximum absolute distance (bp) across which variant-gene pairs are considered. Default is 2Mb. The HiChIP data is also already filtered to 2Mb. |
max_n_known_genes_per_CS |
In the performance analysis, the maximum number of known genes allowed within variant_to_gene_max_distance of a credible set. Default is Inf (no limit). |
do_scoring |
If TRUE, runs the scoring chunk of the script, which combines all of the constituent MAE annotations into one score per transcript-variant pair. Default is TRUE. |
do_performance |
If TRUE, runs the performance chunk of the script, which measures the performance of the score and each of its constituent annotations in predicting known genes as the targets of nearby variants. Requires a known_genes_file. Default is TRUE. |
do_XGBoost |
If TRUE, runs the XGBoost chunk of the script, which generates a model to predict the targets of variants from all available annotations and rates the importance of each annotation. Default is FALSE. |
do_timestamp |
If TRUE, saves output into a subdirectory timestamped with the date/time of the run. Default is FALSE. |
HiChIP |
If you are repeatedly running predict_target_genes, you can load the HiChIP object from the reference_panels_dir into the global environment and pass it to the function to prevent redundant re-loading with each call to predict_target_genes. |
H3K27ac |
If you are repeatedly running predict_target_genes, you can load the H3K27ac object from the reference_panels_dir into the global environment and pass it to the function to prevent redundant re-loading with each call to predict_target_genes. |
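The input arguments described above can be illustrated with a minimal base-R sketch. It writes a small variants BED file and a known-genes file, then applies a distance cut-off in the spirit of variant_to_gene_max_distance. The column layout (chrom, start, end, variant, cs), the one-symbol-per-line known-genes format, the gene table and the use of TSS distance are all assumptions made for illustration; they are not taken from the package, and the variant and gene names are placeholders.

```r
# Sketch only: column layout and file formats are assumptions,
# not the package's documented input specification.

# Hypothetical variants grouped by credible set (cs). Integer coordinates
# avoid scientific notation (e.g. 1e+06) in the written file.
variants <- data.frame(
  chrom   = c("chr1", "chr1", "chr1"),
  start   = c(999999L, 1000499L, 5999999L),
  end     = c(1000000L, 1000500L, 6000000L),
  variant = c("var1", "var2", "var3"),
  cs      = c("CS1", "CS1", "CS2"),
  stringsAsFactors = FALSE
)
variants_file <- file.path(tempdir(), "example_variants.bed")
write.table(variants, variants_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Hypothetical known genes, assumed to be stored one symbol per line.
known_genes_file <- file.path(tempdir(), "example_known_genes.txt")
writeLines(c("GENE_A", "GENE_B"), known_genes_file)

# Hypothetical gene table with transcription start sites (TSS).
genes <- data.frame(
  symbol = c("GENE_A", "GENE_B", "GENE_C"),
  chrom  = c("chr1", "chr1", "chr2"),
  tss    = c(2500000L, 4500000L, 1000000L),
  stringsAsFactors = FALSE
)

# Pair every variant with every gene on the same chromosome, then keep
# only pairs within the maximum absolute distance (2 Mb, as in the default).
variant_to_gene_max_distance <- 2e6
pairs <- merge(variants, genes, by = "chrom")
pairs$distance <- abs(pairs$end - pairs$tss)
pairs <- pairs[pairs$distance <= variant_to_gene_max_distance, ]
print(pairs[, c("variant", "cs", "symbol", "distance")])
```

In this toy data, GENE_C is dropped because it lies on a different chromosome, and any pair more than 2 Mb apart is filtered out.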
A MultiAssayExperiment object with one assay object per annotation, one row per variant-transcript pair and one column per cell type (or 'value' if it is a non-cell-type-specific annotation).
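A call might look like the sketch below. All paths and the trait name are placeholders, and the call is guarded so that the snippet does nothing when the package is not loaded. Listing the assays with MultiAssayExperiment::experiments() assumes the returned object is a standard MultiAssayExperiment, as the description above states.

```r
# Sketch only: every path and the trait name are hypothetical placeholders.
call_args <- list(
  trait                = "example_trait",
  variants_file        = "path/to/variants.bed",
  known_genes_file     = "path/to/known_genes.txt",
  reference_panels_dir = "path/to/reference_panels",
  celltypes            = "all_celltypes",  # skips the enrichment analysis
  do_scoring           = TRUE,
  do_performance       = TRUE,
  do_XGBoost           = FALSE
)

# Only run if the function is available in the current session.
if (exists("predict_target_genes")) {
  MAE <- do.call(predict_target_genes, call_args)
  # One assay per annotation; list their names.
  print(names(MultiAssayExperiment::experiments(MAE)))
}
```

When calling the function repeatedly, the HiChIP and H3K27ac objects can be added to the argument list once they are loaded, so they are not re-read on each call.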