remove_high_frequency_vars: remove high frequency variants from the dataset


View source: R/cumulative_frequency.R

Description

Removes variants whose frequency reaches the given threshold. Sometimes a variant is above the frequency threshold in one population but below it in another. In these cases the variant is excluded from both populations.
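The cross-population rule above can be illustrated with a minimal sketch. This is not the package's implementation: `drop_high_freq_sketch` is a hypothetical helper, variants are assumed to be identified by CHROM, POS, REF and ALT, and the frequency is assumed to be simply AC / AN.

```r
# Hypothetical helper, not part of recessiveStats. Assumes each data frame
# has CHROM, POS, REF, ALT, AC and AN columns, and that the frequency of a
# variant is AC / AN.
drop_high_freq_sketch <- function(var_list, threshold) {
    # Build a per-variant key so the same variant can be matched across populations
    key <- function(df) paste(df$CHROM, df$POS, df$REF, df$ALT, sep = "-")

    # Collect the keys of variants at or above the threshold in any population
    high <- unique(unlist(lapply(var_list, function(df) {
        key(df)[df$AC / df$AN >= threshold]
    })))

    # Drop those variants from every population
    lapply(var_list, function(df) df[!(key(df) %in% high), , drop = FALSE])
}

# The variant at position 2 is common in pop_a (20 / 1000) but rare in pop_b
pop_a <- data.frame(CHROM = "1", POS = c(1, 2), REF = c("A", "G"),
    ALT = c("G", "C"), AC = c(1, 20), AN = c(1000, 1000))
pop_b <- data.frame(CHROM = "1", POS = c(1, 2), REF = c("A", "G"),
    ALT = c("G", "C"), AC = c(1, 1), AN = c(1000, 1000))

filtered <- drop_high_freq_sketch(list(first = pop_a, second = pop_b), 0.005)
```

In this sketch the variant at position 2 is dropped from `filtered$second` as well, even though it is rare in that population, because it reaches the threshold in `pop_a`.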

Usage

remove_high_frequency_vars(vars, threshold)

Arguments

vars

Data frame of variants (one row per allele), which includes the number of times each allele was observed within the population and the total number of alleles in the population. Alternatively, a list of data frames, one for each population.

threshold

Minor allele frequency (MAF) threshold. Variants with MAF values greater than or equal to this threshold are excluded. This must match the threshold used when identifying the biallelically inherited genotypes.
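The threshold is inclusive: a variant whose frequency exactly equals the threshold is excluded. The following sketch shows that boundary on a toy data frame. It assumes the frequency is computed as AC / AN, which may differ from how the package derives MAF internally.

```r
# Toy data: three alleles observed 4, 5 and 6 times out of 1000
vars <- data.frame(AC = c(4, 5, 6), AN = c(1000, 1000, 1000))
threshold <- 0.005

# Assumed frequency calculation (AC / AN), for illustration only
maf <- vars$AC / vars$AN

# Keep only variants strictly below the threshold
kept <- vars[maf < threshold, , drop = FALSE]
```

Only the first row (AC = 4) is kept here. The row with AC = 5 sits exactly on the threshold and is removed along with the row above it.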

Value

The input object with high-frequency variants excluded.

Examples

vars = read.table(header = TRUE, text = "
    CHROM  POS  REF  ALT  AC  AN    CQ
    1      1    A    G    1   1000  missense_variant
    1      2    G    C    1   1000  stop_gained
    1      3    T    A    1   1000  stop_lost
    1      4    G    T    1   1000  synonymous_variant")
get_cumulative_frequencies(vars)

vars2 = read.table(header = TRUE, text = "
    CHROM  POS  REF  ALT  AC  AN    CQ
    1      1    A    G    1   1000  missense_variant
    1      2    G    C    1   1000  stop_gained
    1      3    T    A    1   1000  stop_lost
    1      4    G    T    1   1000  synonymous_variant")
var_list = list("first"=vars, "second"=vars2)
threshold = 0.005
remove_high_frequency_vars(var_list, threshold)

jeremymcrae/recessiveStats documentation built on May 19, 2019, 5:08 a.m.