filter_depth: Filter loci by sequencing depth

View source: R/filter_depth.R

filter_depth    R Documentation

Filter loci by sequencing depth

Description

Parses a data table of genotypes or allele frequencies and returns the names of the loci whose read depth falls within the desired threshold(s).

Usage

filter_depth(dat, minDP = NULL, maxDP = NULL, locusCol = "LOCUS", dpCol = "DP")

Arguments

dat

Data table: The sequencing read information. Must contain the columns:

  1. The locus ID (see param locusCol).

  2. The read depth (see param dpCol).

minDP

Integer: The minimum sequencing depth. Loci with a read depth below this value are flagged as 'bad' loci. Default is NULL.

maxDP

Integer: The maximum sequencing depth. Loci with a read depth above this value are flagged as 'bad' loci. Default is NULL.

locusCol

Character: The name of the column with the locus information. Default is 'LOCUS'.

dpCol

Character: The name of the column with the read depth information. Default is 'DP'.

Value

Returns a character vector of the locus names in dat[[locusCol]] that conform to the read depth threshold, i.e. read depth >= minDP and <= maxDP, for whichever of the two thresholds are supplied.
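
The thresholding described above can be illustrated with a small base R sketch. This is not the genomalicious implementation: sketch_filter_depth and the toy data are hypothetical, and the sketch assumes that a NULL threshold means that bound is not applied and that a locus is dropped if any of its rows falls outside the range.

```r
# Hypothetical helper for illustration only; NOT the genomalicious source.
sketch_filter_depth <- function(dat, minDP = NULL, maxDP = NULL,
                                locusCol = "LOCUS", dpCol = "DP") {
  dp <- dat[[dpCol]]
  loci <- as.character(dat[[locusCol]])
  bad <- rep(FALSE, length(dp))
  # Assumption: a NULL threshold means that bound is skipped.
  if (!is.null(minDP)) bad <- bad | dp < minDP
  if (!is.null(maxDP)) bad <- bad | dp > maxDP
  # Assumption: a locus is 'bad' if ANY of its rows is out of range.
  bad_loci <- unique(loci[which(bad)])
  # Keep the remaining loci, in order of first appearance.
  setdiff(unique(loci), bad_loci)
}

# Toy input with the two required columns (two rows per locus).
toy <- data.frame(
  LOCUS = c("L1", "L1", "L2", "L2", "L3", "L3"),
  DP = c(12, 30, 8, 25, 40, 150)
)

keep_min <- sketch_filter_depth(toy, minDP = 10)
keep_min   # "L1" "L3"  (L2 has a row with DP = 8)

keep_both <- sketch_filter_depth(toy, minDP = 10, maxDP = 100)
keep_both  # "L1"  (L3 has a row with DP = 150)
```

Under these assumptions the result is a character vector of locus names, which can then be used to subset the original table with %in%, as in the Examples.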

Examples

library(genomalicious)

data(data_Genos)

# Exclude loci with coverage < 10 reads
min10 <- filter_depth(data_Genos, minDP=10)
min10
# Summarise the read depth of the retained loci
summary(data_Genos[LOCUS %in% min10]$DP)

# Exclude loci with coverage < 10 or > 100 reads
min10max100 <- filter_depth(data_Genos, minDP=10, maxDP=100)
min10max100
summary(data_Genos[LOCUS %in% min10max100]$DP)

# Alternatively, subset the data.table to keep only the
# bad loci (those that failed the minDP = 10 filter)
summary(data_Genos[!(LOCUS %in% min10)]$DP)


j-a-thia/genomalicious documentation built on Oct. 19, 2024, 7:51 p.m.