filter_supporting_reads: Filter samples by the minimum supporting reads for alleles.
In j-a-thia/genomalicious: A smorgasbord of R functions for population genomic analyses

View source: R/filter_supporting_reads.R

filter_supporting_reads

R Documentation

Filter samples by the minimum supporting reads for alleles.

Description

This function can be thought of as a minor allele frequency or count filter applied at the level of the sample (instead of the whole dataset). It marks a sample as having insufficient supporting reads if the allele with lower coverage falls below a given read threshold. This might be useful, for example, when using pooled allele frequencies, or when genotyped individuals are sequenced at low-to-moderate coverage.

Usage

filter_supporting_reads(
  dat,
  sampCol = "SAMPLE",
  locusCol = "LOCUS",
  dpCol = "DP",
  aoCol = "AO",
  suppReads = 3
)

Arguments

`dat`	Data.table: Contains the information of samples, loci, the total depth of coverage, and the read count of the alternate allele. The reference allele read count is assumed to be the total read depth minus the alternate allele read count. Must contain the columns: (1) the sample ID (see param `sampCol`); (2) the locus ID (see param `locusCol`); (3) the total read depth (see param `dpCol`); (4) the alternate allele read counts (see param `aoCol`).
`sampCol`	Character: The column with the sample information. Default = `'SAMPLE'`.
`locusCol`	Character: The column with the locus information. Default = `'LOCUS'`.
`dpCol`	Character: The column with the total read depth information. Default = `'DP'`.
`aoCol`	Character: The column with the alternate allele read count information. Default = `'AO'`.
`suppReads`	Integer: The minimum number of supporting reads for the allele that is least well covered by reads within a sample. Default = `3`.

Details

Note, this function will only evaluate sites for which there are reads supporting both alleles. It will not evaluate sites that only have reads for the reference allele, or only have reads for the alternate allele.
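
The rule described above can be sketched with plain data.table operations. This is an illustration only and not the package's actual implementation. It assumes that the reference read count is the total depth minus the alternate read count, and that sites which are not evaluated (reads for only one allele) are marked as kept. The toy values and the helper columns RO and BOTH are hypothetical.

```r
library(data.table)

# Toy data: hypothetical values, using the documented default column names.
toy <- data.table(
  SAMPLE = c("s1", "s1", "s2", "s2"),
  LOCUS  = c("loc1", "loc2", "loc1", "loc2"),
  DP     = c(20L, 15L, 12L, 30L),
  AO     = c(2L, 0L, 6L, 30L)
)

suppReads <- 3L

# Assumption: reference allele reads = total depth - alternate allele reads.
toy[, RO := DP - AO]

# A site is evaluated only if both alleles have at least one read.
toy[, BOTH := AO > 0 & RO > 0]

# Assumption: sites that are not evaluated are kept. Evaluated sites are kept
# only if the less-covered allele has at least suppReads reads.
toy[, KEEP := !BOTH | pmin(AO, RO) >= suppReads]

toy[, .(SAMPLE, LOCUS, KEEP)]
```

In this toy table only s1 at loc1 fails, because its alternate allele has 2 reads against a threshold of 3. The two sites with reads for a single allele are never evaluated.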

Value

Returns a data.table with the columns $SAMPLE and $LOCUS (the sample and locus information) and $KEEP, a logical column with TRUE or FALSE indicating whether a sample + locus observation should be kept based on uncertainty in the supporting reads. Note, all sample + locus observations are returned, such that they will match `dat`. This facilitates merging of the original data and results.
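
Because the returned table shares its sample and locus columns with `dat`, the two can be joined directly. The sketch below uses a hand-built stand-in for the returned table rather than a real call to the function. The object names `dat`, `res`, `merged` and `kept`, and all values, are hypothetical.

```r
library(data.table)

# Hypothetical input with the documented default columns.
dat <- data.table(
  SAMPLE = c("s1", "s1", "s2", "s2"),
  LOCUS  = c("loc1", "loc2", "loc1", "loc2"),
  DP     = c(20L, 15L, 12L, 30L),
  AO     = c(2L, 0L, 6L, 30L)
)

# Hand-built stand-in for the documented return value (SAMPLE, LOCUS, KEEP).
res <- data.table(
  SAMPLE = c("s1", "s1", "s2", "s2"),
  LOCUS  = c("loc1", "loc2", "loc1", "loc2"),
  KEEP   = c(FALSE, TRUE, TRUE, TRUE)
)

# Join on the sample and locus columns; every observation has a match.
merged <- merge(dat, res, by = c("SAMPLE", "LOCUS"))

# Retain only the sample + locus observations that passed the filter.
kept <- merged[KEEP == TRUE]
```

Filtering on the merged table drops individual observations, which is a different choice from removing whole loci.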

Examples

library(genomalicious)
data(data_Genos)

# Take a look at the read distribution for alternate alleles
hist(data_Genos$AO, xlab='Alt allele read counts', main='')

# Let's find those sample + loci observations where there are not
# at least 5 reads supporting each allele
suppTest <- filter_supporting_reads(data_Genos, suppReads=5)

head(suppTest)

suppTest[KEEP==FALSE]

# You could use this information to filter loci. For example, removing
# a locus if any sample does not meet the supporting read threshold for
# both alleles.
uniq_bad_loci <- unique(suppTest[KEEP==FALSE]$LOCUS)

uniq_bad_loci

data_Genos[!LOCUS %in% uniq_bad_loci]

j-a-thia/genomalicious documentation built on April 13, 2025, 9:41 a.m.
