PoolQCbyPos: Evaluate QC by position

View source: R/PoolQCbyPos.R

PoolQCbyPos    R Documentation

Evaluate QC by position


Description

This function evaluates fastq files before and after running the FLASH program to extend paired-end reads, and returns Quality Control (QC) by position plots in PDF format.

It can also be applied after filtering the FLASH fastq files by Phred score.


Usage

PoolQCbyPos(flashfiles, samples, primers, runfiles, ncores = 1)



Arguments

flashfiles
Vector with the paths of the FLASH-processed or filtered files, with fastq extension.

samples
Data frame with the information needed to identify the samples of the sequencing experiment, including the Patient.ID, MID, Primer.ID, Region, RefSeq.ID, and Pool.Nm columns.

primers
Data frame with information about the primers used in the experiment, including the Ampl.Nm, Region, Primer.FW, Primer.RV, FW.pos, RV.pos, FW.tpos, RV.tpos, Aa.ipos, and Aa.lpos columns.

runfiles
Vector with the paths of the Illumina MiSeq raw data files, often with fastq.gz extension. If the function is applied to filtered fastq files, this argument must be NA or missing.

ncores
Number of cores to use for parallelization with mclapply.hack.
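Since both data frames must carry a specific set of columns, it can help to check them before calling the function. The following is a minimal sketch, not part of the package: `missing_cols` is a hypothetical helper, and the toy data frame holds made-up values used only for illustration. The column names are the ones listed in the argument descriptions above.

```r
# Columns named in the argument descriptions for samples and primers.
samples_cols <- c("Patient.ID", "MID", "Primer.ID", "Region",
                  "RefSeq.ID", "Pool.Nm")
primers_cols <- c("Ampl.Nm", "Region", "Primer.FW", "Primer.RV", "FW.pos",
                  "RV.pos", "FW.tpos", "RV.tpos", "Aa.ipos", "Aa.lpos")

# Hypothetical helper (not part of QApckg): returns the required column
# names that are absent from the data frame.
missing_cols <- function(df, required) setdiff(required, names(df))

# Toy data frame with made-up values, for illustration only.
toy_samples <- data.frame(Patient.ID = "P01", MID = "MID1", Primer.ID = "A1",
                          Region = "R1", RefSeq.ID = 1, Pool.Nm = "PoolA",
                          stringsAsFactors = FALSE)

# All required columns present: an empty character vector is returned.
missing_cols(toy_samples, samples_cols)

# Only the MID column kept: the other five names are reported as missing.
missing_cols(toy_samples["MID"], samples_cols)
```

The same helper can be applied to the primers data frame with `primers_cols`.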


Value

After execution, one PDF file per pool used in the experiment is saved in a reports folder (the function creates this folder if it does not already exist), and a message confirming that the files were generated is printed to the console.

If the function is applied after the execution of FLASH, the PDF file(s) are named PoolQCbyPos.PoolName.pdf, where PoolName is extracted from the samples data frame. Each file contains a QC plot for both the raw data and the extended fastq files, as well as the read length distribution for the evaluated pool.

In contrast, if the function is applied after Phred score filtering, the generated PDF file(s) are named PoolFiltQCbyPos.PoolName.pdf and include a QC plot for the filtered data and a second plot with the read length distribution.
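The naming scheme described above can be sketched as follows. This is an illustration, not the package's code: `expected_reports` is a hypothetical helper, and it assumes that pool names are the unique values of the Pool.Nm column and that the reports folder is `./reports` relative to the working directory.

```r
# Hypothetical helper (not part of QApckg): builds the report paths that the
# naming scheme implies, one per pool.
expected_reports <- function(samples, filtered = FALSE, repDir = "./reports") {
  # The prefix depends on whether the input is filtered data.
  prefix <- if (filtered) "PoolFiltQCbyPos" else "PoolQCbyPos"
  # Assumption: pool names are the unique values of Pool.Nm.
  pools <- unique(as.character(samples$Pool.Nm))
  file.path(repDir, paste(prefix, pools, "pdf", sep = "."))
}

# Toy samples table with made-up pool names, for illustration only.
toy <- data.frame(Pool.Nm = c("PoolA", "PoolA", "PoolB"))

expected_reports(toy)                   # paths for the post-FLASH reports
expected_reports(toy, filtered = TRUE)  # paths for the post-filtering reports
```

A check such as `file.exists(expected_reports(samples))` could then confirm that every expected report was written.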


Author(s)

Alicia Aranda

See Also

R1R2toFLASH, FiltbyQ30, QCscores, QCplot


Examples

runDir <- "./run"
flashDir <- "./flash"
repDir <- "./reports"
# Save the file names with complete path
runfiles <- list.files(runDir, recursive = TRUE, full.names = TRUE, include.dirs = TRUE)
flashfiles <- list.files(flashDir, recursive = TRUE, full.names = TRUE, include.dirs = TRUE)
# Get data (tab-separated files with a header row)
samples <- read.table("./data/samples.csv", sep = "\t", header = TRUE)
primers <- read.table("./data/primers.csv", sep = "\t", header = TRUE)
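The example stops before calling the function, so the two documented modes of use are sketched here. This assumes the QApckg package is installed and exports PoolQCbyPos, and that the input folders and files from the example exist; `./filt` is a hypothetical location for Phred-filtered fastq files and `have_inputs` is an illustrative guard, neither taken from the package.

```r
# Sketch only: assumes QApckg is installed and that the ./run, ./flash and
# ./data inputs used in the example exist. "./filt" is a hypothetical folder.
runfiles   <- list.files("./run", recursive = TRUE, full.names = TRUE)
flashfiles <- list.files("./flash", recursive = TRUE, full.names = TRUE)
filtfiles  <- list.files("./filt", recursive = TRUE, full.names = TRUE)

# Run only when the package and the input files are actually available.
have_inputs <- requireNamespace("QApckg", quietly = TRUE) &&
  length(flashfiles) > 0 &&
  all(file.exists(c("./data/samples.csv", "./data/primers.csv")))

if (have_inputs) {
  samples <- read.table("./data/samples.csv", sep = "\t", header = TRUE)
  primers <- read.table("./data/primers.csv", sep = "\t", header = TRUE)

  # After FLASH: the raw MiSeq files are passed through runfiles.
  QApckg::PoolQCbyPos(flashfiles, samples, primers, runfiles, ncores = 1)

  # After Phred score filtering: runfiles must be NA or missing.
  if (length(filtfiles) > 0) {
    QApckg::PoolQCbyPos(filtfiles, samples, primers, runfiles = NA, ncores = 1)
  }
}
```

Per the description of the return value, the first call would be expected to write PoolQCbyPos.PoolName.pdf files and the second PoolFiltQCbyPos.PoolName.pdf files into the reports folder.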

aliafdz/QApckg documentation built on June 2, 2022, 10:29 a.m.