vargen_pipeline: Main vargen function, to get the list of variants
In MCorentin/vargen: Fetching Variants Related to Phenotypes Using Public Databases

Description

Retrieves a list of variants related to the given OMIM morbid IDs. Be aware that some of these variants will not necessarily be associated with the phenotype. We advise filtering the results by annotation ("CADD phred score", "snpEff impact", etc.). If you want a smaller list of variants that are all associated with the disease, run get_variants_from_phenotypes instead. The variants are fetched from the following sources:

OMIM: get variants on the genes related to the disease
FANTOM5: get the variants on the enhancers of the OMIM genes
GTEx: get variants impacting the expression of the OMIM genes in specific tissues
GWAS: get variants related to the phenotype of interest from the GWAS Catalog

The pipeline will also annotate the variants using getVariants

Usage

vargen_pipeline(
  vargen_dir,
  omim_morbid_ids,
  fantom_corr = 0.25,
  outdir = "./",
  gtex_tissues,
  gwas_traits,
  gene_mart,
  snp_mart,
  verbose = FALSE
)

Arguments

`vargen_dir`	directory containing the data files required by vargen (can be generated with `vargen_install`)
`omim_morbid_ids`	a vector containing the OMIM morbid ID(s) of the phenotype(s) of interest. You can search for them on the Online Mendelian Inheritance in Man website (https://www.omim.org/) or use `list_omim_accessions`
`fantom_corr`	the minimum correlation (z-score) for a FANTOM5 enhancer/gene association to be considered valid (default: 0.25). A z-score greater than 0 is above the mean, meaning that the association is more strongly correlated than random motifs.
`outdir`	the output directory; some files will be written there while this function runs
`gtex_tissues`	a vector containing the names of the "signif_variant_gene_pairs.txt.gz" files. The output of `select_gtex_tissues` can be used.
`gwas_traits`	a character vector with the trait(s) of interest. The list of available traits can be obtained with `list_gwas_traits`
`gene_mart`	optional, a connection to the Ensembl gene mart. It can be created using `connect_to_gene_ensembl` (if missing, that function will be used to create the connection).
`snp_mart`	optional, a connection to the Ensembl SNP mart. It can be created using `connect_to_snp_ensembl` (if missing, that function will be used to create the connection).
`verbose`	if TRUE, will print progress messages (default: FALSE)
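
To illustrate what a z-score cutoff such as `fantom_corr` means, here is a minimal sketch in base R. It does not use vargen or FANTOM5 data: the `assoc` table, its column names and its values are invented for illustration, and the inclusive comparison (`>=`) is an assumption rather than a statement about how vargen applies the cutoff internally.

```r
# Hypothetical enhancer/gene correlation values (invented, not FANTOM5 data)
assoc <- data.frame(
  enhancer = c("enh1", "enh2", "enh3", "enh4", "enh5"),
  raw_corr = c(0.10, 0.35, 0.20, 0.60, 0.25),
  stringsAsFactors = FALSE
)

# z-score: number of standard deviations each value lies from the mean.
# Values above the mean get a positive z-score, values below a negative one.
assoc$z <- (assoc$raw_corr - mean(assoc$raw_corr)) / sd(assoc$raw_corr)

# Keep the associations whose z-score reaches the cutoff
# (assumption: the comparison is inclusive)
fantom_corr <- 0.25
kept <- assoc[assoc$z >= fantom_corr, ]
print(kept)
```

Raising the cutoff keeps fewer, more strongly correlated associations; lowering it towards 0 keeps every association that is above the mean.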

Value

A data.frame with the variants fetched from OMIM, FANTOM5, GTEx and GWAS. It contains the following columns:

chr (chromosome)
pos (position of the variant)
rsid (variant ID)
ensembl_gene_id ("gene id" of the gene associated with the variant)
hgnc_symbol ("hgnc symbol" of the gene associated with the variant)
source ("omim", "fantom5", "gtex" or "gwas")
trait (the "omim ids" separated by ';' for omim, fantom5 and gtex variants, and the gwas trait for gwas variants)

Examples

vargen_install("./vargen_data/")

# Simple query
DM1_simple <- vargen_pipeline(vargen_dir = "./vargen_data/", omim_morbid_ids = "222100",
                              fantom_corr = 0.25, outdir = "./", verbose = TRUE)


# Query with gtex and gwas
pancreas_tissues <- select_gtex_tissues(gtex_dir = "./vargen_data/GTEx_Analysis_v8_eQTL/",
                                        tissues_query = "pancreas")

# list_gwas_traits("diabetes")

DM1 <- vargen_pipeline(vargen_dir = "./vargen_data/", omim_morbid_ids = "222100",
                       fantom_corr = 0.25, outdir = "./",
                       gtex_tissues = pancreas_tissues,
                       gwas_traits = "Type 1 diabetes", verbose = TRUE)
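
The returned data.frame can be summarised with base R. The sketch below runs on a small mock table that only mimics the documented columns; every value in it (coordinates, rsids, gene identifiers, the "000000" ID) is an invented placeholder, not pipeline output.

```r
# Mock result with the documented columns (all values are placeholders)
variants <- data.frame(
  chr = c("chr1", "chr1", "chr6", "chr11"),
  pos = c(1000L, 2000L, 3000L, 4000L),
  rsid = c("rs1", "rs2", "rs3", "rs4"),
  ensembl_gene_id = c("ENSG_A", "ENSG_A", "ENSG_B", "ENSG_C"),
  hgnc_symbol = c("GENE_A", "GENE_A", "GENE_B", "GENE_C"),
  source = c("omim", "fantom5", "gtex", "gwas"),
  trait = c("222100", "222100;000000", "222100", "Type 1 diabetes"),
  stringsAsFactors = FALSE
)

# Number of variants contributed by each source
per_source <- table(variants$source)
print(per_source)

# Keep only the variants that come from the GWAS source
gwas_only <- variants[variants$source == "gwas", ]

# For the other sources, 'trait' holds OMIM IDs separated by ';':
# split them and list the distinct IDs
omim_rows <- variants[variants$source != "gwas", ]
omim_ids <- strsplit(omim_rows$trait, ";", fixed = TRUE)
unique_ids <- sort(unique(unlist(omim_ids)))
print(unique_ids)
```

With a real result, replacing `variants` by the object returned by `vargen_pipeline` (for example `DM1` above) should work the same way, provided the columns match those listed under Value.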

MCorentin/vargen documentation built on Feb. 6, 2024, 2:32 p.m.