read_depth_plot: Generate a figure with the read depth groups


View source: R/read_depth_plot.R

Description

This function reads the fastq file of an individual and generates a figure of the read coverage groups.

Usage

read_depth_plot(
  fq.file,
  min.coverage.fig = 7L,
  parallel.core = parallel::detectCores() - 1
)

Arguments

fq.file

(character, path). The path to the individual fastq file to check. Example: fq.file = "my-sample.fq.gz".

min.coverage.fig

(integer). Minimum coverage used to color the groups on the figure; it is the lower threshold of the target coverage group. Default: min.coverage.fig = 7L.

parallel.core

(integer). Number of threads to use for parallel execution. Default: parallel.core = parallel::detectCores() - 1.
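The default above can be reproduced by hand. This is a minimal sketch, not part of the package: it assumes you want to guard against parallel::detectCores() returning NA, or returning 1 on a single-core machine, before passing the value on. The variable name n.cores is illustrative only.

```r
# Number of cores reported by the system; detectCores() may return NA
# when the count cannot be determined.
n.cores <- parallel::detectCores()
if (is.na(n.cores)) n.cores <- 1L

# Keep one core free, as the documented default does, but never go below 1.
parallel.core <- max(1L, as.integer(n.cores) - 1L)
```

The resulting value can then be supplied through the parallel.core argument.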

Details

Four read coverage groups are shown:

  1. distinct reads with low coverage (in red): these reads are likely sequencing errors or uninformative polymorphisms (shared only by a few samples).

  2. distinct reads for a target coverage (in green):

    • Usually represent around 80

    • It’s a safe coverage range to start exploring your data (open for discussion).

    • Lower threshold (default = 7): you can’t escape it, it’s your tolerance for calling a heterozygote a true heterozygote. You want a minimum coverage for both the reference and the alternative allele. You can use population information or a Bayesian algorithm to lower this threshold.

    • Upper threshold: a lot more open for discussion. Here it’s the lower limit of another group (the orange one, described below), minus 1 bp.

  3. distinct reads with high coverage > 1 read depth (in yellow): those are legitimate alleles with high coverage.

  4. distinct and unique reads with high coverage (in orange): when assembled into loci, these repetitive elements are usually paralogs, retrotransposons, transposable elements, etc.
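The idea of depth per distinct read and the lower threshold can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy reads, the object names, and the two-way split are assumptions made for illustration, and only the lower threshold (7, matching the default of min.coverage.fig) comes from the text above. The upper groups are not reproduced here.

```r
# Toy sequences standing in for the reads of a fastq file (hypothetical data).
reads <- c(rep("ACGT", 9), rep("ACGA", 8), rep("TTGC", 2), "GGCA")

# Depth of each distinct read: how many times the same sequence occurs.
counts <- table(reads)
depth <- setNames(as.integer(counts), names(counts))

# Lower threshold, mirroring the default of min.coverage.fig.
min.coverage <- 7L

# Flag each distinct read as below, or at/above, the lower threshold.
group <- ifelse(depth < min.coverage, "low coverage", "at or above threshold")

# Number of distinct reads observed at each depth value.
distinct.per.depth <- table(depth)
```

In this toy data, the two sequences seen fewer than 7 times fall in the low coverage group, while the two sequences seen 8 and 9 times pass the threshold.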

Value

The function returns the read depth groups plot.

References

Ilut, D., Nydam, M., Hare, M. (2014). Defining Loci in Restriction-Based Reduced Representation Genomic Data from Nonmodel Species: Sources of Bias and Diagnostics for Optimal Clustering. BioMed Research International 2014. https://dx.doi.org/10.1155/2014/675158

Examples

## Not run: 
require(vroom)
# Draw the read depth groups of one sample, keeping the default arguments
check.reads.depth.groups <- read_depth_plot(fq.file = "my-sample.fq.gz")

## End(Not run)

thierrygosselin/stackr documentation built on Nov. 11, 2020, 11 a.m.