collect_gnomad_controls: Automatically get controls for BIN-test and ClusterBurden...

View source: R/collect_gnomad_controls.r

Usage

collect_gnomad_controls(
  genenames = NULL,
  dataset = "exome",
  switch_dataset_threshold = 0,
  inframes = T,
  max_inframe_size = 3,
  filtertype = "strict",
  maxmaf = 1e-04,
  messages = T
)

Arguments

genenames

genenames to collect controls for (NULL = all genes).

dataset

GnomAD exomes or genomes? ("exome", "genome")

switch_dataset_threshold

coverage threshold below which to switch control group for that gene (see Details)

inframes

include inframes as missense variants? (default=TRUE)

max_inframe_size

maximum size of inframe variants to include; cannot exceed 3 (default=3)

filtertype

type of allele frequency used for MAF filtering: "global", "popmax" or "strict". "global" is the total allele frequency in controls; "popmax" is the maximum allele frequency in any ancestry group in gnomAD, excluding ASJ, FIN and OTH; "strict" is the maximum allele frequency in any ancestry group in gnomAD or globally across gnomAD genomes.

maxmaf

MAF threshold used for frequency filtering; default is 0.0001 (i.e. 0.01%)
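To make the three filtertype options concrete, the sketch below applies each one to a small made-up table of allele frequencies. Everything in it is hypothetical: the column names (af_global, af_afr, af_nfe, af_fin), the helper filter_by_maf, the frequency values, and the choice to keep variants with frequency less than or equal to maxmaf. In particular, "strict" is read here as the maximum over the global frequency and every ancestry group, which is one interpretation of the description above and not necessarily what the package does internally.

```r
# Hypothetical per-variant allele frequencies; values are invented for illustration.
variants <- data.frame(
  variant   = c("v1", "v2", "v3"),
  af_global = c(0.00002, 0.00005, 0.00020),
  af_afr    = c(0.00001, 0.00030, 0.00010),
  af_nfe    = c(0.00003, 0.00002, 0.00025),
  af_fin    = c(0.00050, 0.00001, 0.00010),  # FIN does not count towards "popmax"
  stringsAsFactors = FALSE
)

# Assumed reading of the three filter types (not the package's internal code).
filter_by_maf <- function(df, filtertype = "strict", maxmaf = 1e-04) {
  popmax_cols <- c("af_afr", "af_nfe")
  strict_cols <- c("af_global", "af_afr", "af_nfe", "af_fin")
  af <- switch(
    filtertype,
    global = df$af_global,
    popmax = do.call(pmax, unname(as.list(df[popmax_cols]))),
    strict = do.call(pmax, unname(as.list(df[strict_cols]))),
    stop("unknown filtertype: ", filtertype)
  )
  # Keep only variants at or below the threshold.
  df[af <= maxmaf, , drop = FALSE]
}

kept_global <- filter_by_maf(variants, "global")$variant
kept_popmax <- filter_by_maf(variants, "popmax")$variant
kept_strict <- filter_by_maf(variants, "strict")$variant
```

With these invented values, each option is at least as restrictive as the one before it: "global" keeps the most variants and "strict" the fewest.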

Value

a data.table of gnomAD controls

Description

Automatically get controls for BIN-test and ClusterBurden: gnomAD exomes or genomes v2 (missense only).

Details

Retrieve missense variants for gnomAD exomes or genomes v2 to use in an association analysis. Columns in the resulting dataset include genename (symbol), protein position and allele count.

For switch_dataset_threshold, the coverage for each gene in gnomAD exomes and genomes is calculated as the mean 10X coverage across all bases in the exonic regions for that gene. If this input is not 0, every gene whose coverage is below the input value switches to the other control group, provided that group has better coverage. For example, with 0.9 (90%), a gene with less than 90% coverage in the selected dataset switches to the other dataset if its coverage there is better.
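The switching rule can be sketched on a toy coverage table. This is a minimal illustration of the rule as worded above, not the package's implementation: the gene labels, the coverage values, the column names and the helper choose_dataset are all hypothetical.

```r
# Hypothetical mean 10X coverage per gene in each dataset (invented values).
coverage <- data.frame(
  genename = c("GENE_A", "GENE_B", "GENE_C"),
  exome    = c(0.98, 0.75, 0.60),
  genome   = c(0.95, 0.92, 0.55),
  stringsAsFactors = FALSE
)

# Assumed rule: switch only when a threshold is set, the gene falls below it
# in the requested dataset, and the other dataset has better coverage.
choose_dataset <- function(cov, dataset = "exome", switch_dataset_threshold = 0) {
  other <- if (dataset == "exome") "genome" else "exome"
  switch_gene <- switch_dataset_threshold != 0 &
    cov[[dataset]] < switch_dataset_threshold &
    cov[[other]] > cov[[dataset]]
  ifelse(switch_gene, other, dataset)
}

coverage$chosen <- choose_dataset(coverage, "exome", 0.9)
```

In this toy table GENE_A stays with exomes because it is above the threshold, GENE_B switches to genomes, and GENE_C stays with exomes because the genome coverage is no better.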

The sample sizes for each gene are calculated as the mean fraction of samples with at least 10X coverage across each base in the exonic regions for that gene, multiplied by the complete cohort size of gnomAD exomes (125748) or gnomAD genomes (15708). These sample sizes are attached as an attribute to the control group, which can be accessed with attributes(controls)$ss.

Examples

# Filtering for cases must match filtering for controls
# for the simplest scenario using all filtering defaults
# Then for the genes of interest e.g. MYH7 and TNNI3
controls = collect_gnomad_controls(c("MYH7", "TNNI3"))

Author(s)

Adam Waring - adam.waring@msdtc.ox.ac.uk

RVAT, case-control, cluster, distribution, gene genetics


adamwaring/ClusterBurden documentation built on July 29, 2020, 9:50 p.m.