vargen_pipeline: Main vargen function, to get the list of variants

View source: R/VarGen.R

vargen_pipeline    R Documentation

Main vargen function, to get the list of variants

Description

Gets a list of variants related to the given OMIM morbid IDs. Be aware that some of these variants are not necessarily associated with the phenotype. We advise filtering the results by annotation ("CADD phred score", "snpEff impact", etc.). If you want a smaller list of variants that are all associated with the disease, run get_variants_from_phenotypes instead. The variants are fetched from the following sources:

  • OMIM: get variants on the genes related to the disease

  • FANTOM5: get the variants on the enhancers of the OMIM genes

  • GTEx: get variants impacting the expression of the OMIM genes in specific tissues.

  • GWAS: get variants related to the phenotype of interest from the GWAS Catalog

The pipeline also annotates the variants using getVariants.
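As a rough illustration of the filtering advice above, the sketch below keeps only variants at or above a CADD phred cutoff. The toy data.frame, the column names `cadd_phred` and `snpeff_impact`, and the cutoff of 20 are all hypothetical; check the column names of your own annotated output before adapting it.

```r
# Toy annotated variants. Column names and values are hypothetical,
# not the actual output of the annotation step.
annotated <- data.frame(
  rsid = c("rs1", "rs2", "rs3", "rs4"),
  cadd_phred = c(25.1, 3.2, NA, 20.0),
  snpeff_impact = c("HIGH", "LOW", "MODIFIER", "MODERATE"),
  stringsAsFactors = FALSE
)

# Keep variants with a CADD phred score at or above the cutoff.
# Variants without a score (NA) are dropped.
cadd_cutoff <- 20
keep <- !is.na(annotated$cadd_phred) & annotated$cadd_phred >= cadd_cutoff
filtered <- annotated[keep, ]
print(filtered$rsid)
```

Dropping variants with a missing score is a design choice in this sketch; depending on the analysis, it may be preferable to keep them and filter on another annotation.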

Usage

vargen_pipeline(
  vargen_dir,
  omim_morbid_ids,
  fantom_corr = 0.25,
  outdir = "./",
  gtex_tissues,
  gwas_traits,
  gene_mart,
  snp_mart,
  verbose = FALSE
)

Arguments

vargen_dir

directory containing the files required by VarGen (can be generated with vargen_install)

omim_morbid_ids

a vector containing the OMIM morbid ID(s) of the phenotype(s) of interest. You can search the Online Mendelian Inheritance in Man website (https://www.omim.org/) or use list_omim_accessions.

fantom_corr

the minimum correlation (z-score) for a FANTOM5 enhancer/gene association to be considered valid (default: 0.25). A z-score greater than 0 is above the mean, meaning that the association is more strongly correlated than random motifs.

outdir

the output directory; some files are written there while the function runs

gtex_tissues

a vector containing the names of the "signif_variant_gene_pairs.txt.gz" files. The output of select_gtex_tissues can be used.

gwas_traits

a character vector with the trait(s) of interest. The list of available traits can be obtained with list_gwas_traits.

gene_mart

optional, a connection to the Ensembl gene mart. It can be created with connect_to_gene_ensembl (if missing, that function is used to create the connection).

snp_mart

optional, a connection to the Ensembl SNP mart. It can be created with connect_to_snp_ensembl (if missing, that function is used to create the connection).

verbose

if TRUE, will print progress messages (default: FALSE)
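To make the role of fantom_corr concrete, here is a minimal sketch of a correlation threshold applied to enhancer/gene associations. The toy data.frame and its column names (`enhancer`, `gene`, `corr`) are hypothetical, and the sketch assumes the threshold is inclusive; it does not reproduce the package's internal filtering.

```r
# Toy enhancer/gene associations. Names and values are hypothetical.
associations <- data.frame(
  enhancer = c("enh1", "enh2", "enh3"),
  gene = c("GENE1", "GENE1", "GENE2"),
  corr = c(0.40, 0.10, 0.25),
  stringsAsFactors = FALSE
)

# Keep associations whose correlation reaches the minimum
# (assumed inclusive here).
fantom_corr <- 0.25
valid <- associations[associations$corr >= fantom_corr, ]
print(valid$enhancer)
```

Raising fantom_corr in this sketch leaves fewer associations, and therefore fewer enhancers from which variants would be collected.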

Value

a data.frame with the variants fetched from OMIM, FANTOM5, GTEx and GWAS. The data.frame will contain the following columns:

  • chr (chromosome)

  • pos (position of the variant)

  • rsid (variant ID)

  • ensembl_gene_id ("gene id" of the gene associated with the variant)

  • hgnc_symbol ("hgnc symbol" of the gene associated with the variant)

  • source ("omim", "fantom5", "gtex" or "gwas")

  • trait (the "omim ids" separated by ';' for omim, fantom5 and gtex variants, and the gwas trait for the gwas variants).
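The sketch below shows one way to summarise a result with the columns listed above: counting variants per source and splitting the ';'-separated OMIM IDs. The data.frame is a hand-made stand-in with placeholder values, not real output of vargen_pipeline.

```r
# Toy result with the documented columns. All values are placeholders.
variants <- data.frame(
  chr = c("6", "6", "11", "1"),
  pos = c(1000L, 2000L, 3000L, 4000L),
  rsid = c("rs1", "rs2", "rs3", "rs4"),
  ensembl_gene_id = c("ENSG_A", "ENSG_A", "ENSG_B", "ENSG_C"),
  hgnc_symbol = c("GENE_A", "GENE_A", "GENE_B", "GENE_C"),
  source = c("omim", "fantom5", "gtex", "gwas"),
  trait = c("222100", "222100;125853", "222100", "Type 1 diabetes"),
  stringsAsFactors = FALSE
)

# Number of variants contributed by each source.
per_source <- table(variants$source)
print(per_source)

# For non-GWAS rows, the trait column holds OMIM IDs separated by ';'.
non_gwas <- variants[variants$source != "gwas", ]
omim_ids <- unique(unlist(strsplit(non_gwas$trait, ";", fixed = TRUE)))
print(omim_ids)
```

GWAS rows are excluded before splitting because their trait column holds a trait name rather than OMIM IDs.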

Examples

vargen_install("./vargen_data/")

# Simple query
DM1_simple <- vargen_pipeline(vargen_dir = "./vargen_data/", omim_morbid_ids = "222100",
                              fantom_corr = 0.25, outdir = "./", verbose = TRUE)


# Query with gtex and gwas
pancreas_tissues <- select_gtex_tissues(gtex_dir = "./vargen_data/GTEx_Analysis_v8_eQTL/",
                                        tissues_query = "pancreas")

# list_gwas_traits("diabetes")

DM1 <- vargen_pipeline(vargen_dir = "./vargen_data/", omim_morbid_ids = "222100",
                       fantom_corr = 0.25, outdir = "./",
                       gtex_tissues = pancreas_tissues,
                       gwas_traits = "Type 1 diabetes", verbose = TRUE)

MCorentin/vargen documentation built on Feb. 6, 2024, 2:32 p.m.