View source: R/collect_gnomad_controls.r
collect_gnomad_controls(
  genenames = NULL,
  dataset = "exome",
  switch_dataset_threshold = 0,
  inframes = T,
  max_inframe_size = 3,
  filtertype = "strict",
  maxmaf = 1e-04,
  messages = T
)
genenames
    Gene names to collect controls for (NULL = all genes).
dataset
    Which gnomAD dataset to use: "exome" or "genome".
switch_dataset_threshold
    Coverage threshold below which to switch control group for that gene (see Details).
inframes
    Include inframes as missense variants? (default=TRUE)
max_inframe_size
    Maximum inframe size to include, up to a maximum of 3 (default=3).
filtertype
    Type of allele frequency used to filter on MAF: "global", "popmax" or "strict". "global" is the total allele frequency in controls; "popmax" is the maximum allele frequency in any ancestry group in gnomAD, excluding ASJ, FIN and OTH; "strict" is the maximum allele frequency in any ancestry group in gnomAD or globally across gnomAD genomes.
maxmaf
    MAF threshold used for frequency filtering; default is 0.0001 (i.e. 0.01%).
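The three filtertype options can be read as three ways of choosing the frequency that is compared against maxmaf. The sketch below is only an illustration of that reading, not the package's implementation: the frequencies are made up, filter_af is a hypothetical helper, and both the "less than or equal" comparison and the inclusion of every ancestry group under "strict" are assumptions.

```r
# Made-up frequencies for a single variant (illustration only).
af_global         <- 5e-05    # total allele frequency in the chosen controls
af_genomes_global <- 1.2e-04  # global allele frequency in gnomAD genomes
af_pop <- c(AFR = 2e-05, AMR = 8e-05, ASJ = 3e-04, EAS = 0,
            FIN = 1e-05, NFE = 6e-05, OTH = 0, SAS = 4e-05)

# Hypothetical helper: frequency compared against maxmaf for each filtertype.
filter_af <- function(filtertype) {
  switch(filtertype,
    global = af_global,
    # popmax ignores the ASJ, FIN and OTH groups
    popmax = max(af_pop[!names(af_pop) %in% c("ASJ", "FIN", "OTH")]),
    # strict: highest frequency in any group, or globally in gnomAD genomes
    strict = max(af_pop, af_genomes_global),
    stop("unknown filtertype")
  )
}

maxmaf <- 1e-04
# Assumption: a variant is kept when its frequency is <= maxmaf.
keep <- vapply(c("global", "popmax", "strict"),
               function(ft) filter_af(ft) <= maxmaf,
               logical(1))
print(keep)
```

With these made-up numbers the variant passes under "global" and "popmax" but is removed under "strict", because the ASJ frequency exceeds maxmaf.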
Returns a data.table of gnomAD controls.
Automatically get controls for BIN-test and ClusterBurden
gnomAD exomes or genomes v2 (missense only)
Retrieve missense variants from gnomAD exomes or genomes v2 to use in an association analysis. Columns in the resulting dataset include genename (symbol), protein position and allele count.
For switch_dataset_threshold, the coverage of each gene in gnomAD exomes and genomes is calculated as the mean 10X coverage across all bases in the exonic regions of that gene. If this input is not 0, any gene whose coverage is below the input value switches to the other control group if that group has better coverage. For example, with 0.9 (90%), genes with less than 90% coverage switch to the other control group if it has better coverage.
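The switching rule can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the package's code: the coverage table and its values are made up, and choose_dataset is a hypothetical helper.

```r
# Made-up per-gene mean 10X coverage (fraction of exonic bases); illustration only.
coverage <- data.frame(
  genename = c("GENE_A", "GENE_B", "GENE_C"),
  exome    = c(0.98, 0.85, 0.70),
  genome   = c(0.95, 0.92, 0.60),
  stringsAsFactors = FALSE
)

# Hypothetical helper: choose the control group for each gene.
choose_dataset <- function(coverage, dataset = "exome", switch_dataset_threshold = 0) {
  other  <- setdiff(c("exome", "genome"), dataset)
  chosen <- rep(dataset, nrow(coverage))
  if (switch_dataset_threshold != 0) {
    low    <- coverage[[dataset]] < switch_dataset_threshold  # below the threshold
    better <- coverage[[other]] > coverage[[dataset]]         # other group covers better
    chosen[low & better] <- other
  }
  chosen
}

chosen <- choose_dataset(coverage, dataset = "exome", switch_dataset_threshold = 0.9)
print(data.frame(genename = coverage$genename, dataset = chosen))
```

Here GENE_B switches to genomes (85% exome coverage, 92% genome coverage), while GENE_C stays on exomes because genomes do not cover it any better.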
The sample size for each gene is calculated as the mean proportion of samples with at least 10X coverage across the bases in the exonic regions of that gene, multiplied by the complete cohort size of gnomAD exomes (125748) or gnomAD genomes (15708). These sample sizes are attached as an attribute to the control group, which can be accessed with: attributes(controls)$ss.
# Filtering for cases must match filtering for controls
# for the simplest scenario using all filtering defaults
# Then for the genes of interest e.g. MYH7 and TNNI3
controls = collect_gnomad_controls(c("MYH7", "TNNI3"))
Adam Waring - adam.waring@msdtc.ox.ac.uk
RVAT, case-control, cluster, distribution, gene genetics,
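As a worked illustration of the per-gene sample size described in the Details, the sketch below multiplies a mean coverage proportion by the cohort size and stores the result in an ss attribute. The per-base proportions are made up, and the shape of the attribute (a named numeric vector) is an assumption rather than something the documentation states.

```r
# Cohort sizes as given in the documentation.
cohort_size <- c(exome = 125748, genome = 15708)

# Made-up proportion of samples with >= 10X coverage at four exonic bases.
prop_over_10x <- c(0.99, 0.97, 0.90, 0.94)

# Per-gene sample size: mean proportion times the cohort size.
ss <- mean(prop_over_10x) * cohort_size[["exome"]]

# Attach it to a stand-in control table the way the real result exposes it.
controls_demo <- data.frame(genename = "GENE_A", stringsAsFactors = FALSE)
attr(controls_demo, "ss") <- c(GENE_A = ss)  # assumed shape: named numeric vector

print(attributes(controls_demo)$ss)
```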