filter_maf | R Documentation |
Parses a data table of genotypes/allele frequencies and returns a list of loci that conform to a desired MAF threshold.
filter_maf(
dat,
maf = 0.05,
type = "genos",
method = "mean",
sampCol = "SAMPLE",
locusCol = "LOCUS",
popCol = "POP",
genoCol = "GT",
freqCol = "FREQ"
)
dat |
A data table of genotypes or allele frequencies. Genotypes are recorded either as '/' separated alleles (0/0, 0/1, 1/1), or as counts of the Alt allele (0, 1, 2). Allele frequencies can be for either the Ref or the Alt allele, so long as the choice is consistent across samples, populations, loci, etc. Expects the columns:
|
maf |
Numeric: The minor allele frequency threshold. E.g. 0.05 filters at 5%, so a locus is removed if its allele frequency is < 0.05 or > 0.95. Default is 0.05, and the value must be <= 0.5. |
type |
Character: Is |
method |
Character: The method by which MAF filtering is performed.
One of |
sampCol |
Character: The column name with the sampled individual information.
Default = "SAMPLE". |
locusCol |
Character: The column name with the locus information.
Default = "LOCUS". |
popCol |
Character: The column name with population information.
Default = "POP". |
genoCol |
Character: The column name with the genotype information.
Default = "GT". |
freqCol |
Character: The column name with the allele frequency information.
Default = "FREQ". |
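The two genotype encodings described for dat carry the same information. As a minimal sketch in base R, assuming biallelic loci with alleles coded 0 (Ref) and 1 (Alt), '/' separated genotypes could be converted to Alt allele counts as below. The helper name geno_to_counts is hypothetical and is not part of this package.

```r
# Hypothetical helper, not part of the package: convert '/' separated
# genotypes ("0/0", "0/1", "1/1") into counts of the Alt allele (0, 1, 2).
# Assumes biallelic loci with alleles coded "0" (Ref) and "1" (Alt).
geno_to_counts <- function(gt) {
  vapply(
    strsplit(gt, "/", fixed = TRUE),
    function(alleles) sum(as.integer(alleles)),
    integer(1)
  )
}

gt_sep <- c("0/0", "0/1", "1/1", "1/0")
gt_counts <- geno_to_counts(gt_sep)
```

Summing the allele codes means the order of alleles within a genotype ("0/1" versus "1/0") does not matter.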
Returns a character vector of locus names in dat[[locusCol]] that conform to the MAF threshold (>= value of maf).
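The threshold rule itself can be illustrated in a few lines of base R. This is only a sketch of the comparison described for maf, not the function's implementation: the helper passes_maf and the locus names are hypothetical, and the sketch ignores the method argument and any per-population handling.

```r
# Hypothetical helper illustrating the threshold rule only; it does not
# reproduce the package's 'method' options or per-population logic.
passes_maf <- function(freq, maf = 0.05) {
  stopifnot(maf <= 0.5)
  # The minor allele frequency is the smaller of p and 1 - p, so the
  # check works whether 'freq' refers to the Ref or the Alt allele.
  pmin(freq, 1 - freq) >= maf
}

# Invented allele frequencies for four loci.
freqs <- c(locA = 0.02, locB = 0.30, locC = 0.97, locD = 0.50)
keep <- names(freqs)[passes_maf(freqs, maf = 0.05)]
```

Here locA (0.02) and locC (0.97) fall outside the 0.05 to 0.95 range, so only locB and locD are retained.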
# LONG TABLE OF GENOTYPES
data(data_Genos)
# Filter for MAF=0.20
loci.genos <- filter_maf(data_Genos, maf=0.20, type='genos')
# Subset the genotypes to the loci that passed the filter
data_Genos[LOCUS %in% loci.genos]
# LONG TABLE OF ALLELE FREQUENCIES
freqs_4pops <- data_Genos %>%
.[, .(FREQ=sum(GT)/(length(GT)*2)), by=c('LOCUS','POP')]
loci.freqs <- filter_maf(freqs_4pops, maf=0.20, type='freqs')
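The frequency calculation in the example above, sum(GT)/(length(GT)*2), assumes that GT holds Alt allele counts for diploid individuals. A base R sketch of the same arithmetic on a small invented data set (the object names toy and toy_freqs are hypothetical) might look like this:

```r
# Toy data, invented for illustration: Alt allele counts (0, 1, 2)
# for diploid samples at two loci in two populations.
toy <- data.frame(
  LOCUS = rep(c("loc1", "loc2"), each = 4),
  POP   = rep(c("popA", "popA", "popB", "popB"), times = 2),
  GT    = c(0, 1, 2, 2, 0, 0, 1, 0)
)

# Alt allele frequency = Alt allele copies / total allele copies,
# where each diploid sample contributes 2 allele copies.
toy_freqs <- aggregate(
  GT ~ LOCUS + POP,
  data = toy,
  FUN = function(g) sum(g) / (2 * length(g))
)
names(toy_freqs)[names(toy_freqs) == "GT"] <- "FREQ"
```

The result has one row per locus and population, with the columns LOCUS, POP and FREQ used by the type='freqs' example.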