GblYield: Compute global yield by step
In aliafdz/QApckg: Quality assessment for Miseq data derived from viral sequencing

Description

Generates global yield reports for each evaluated pool from the results of the previous pipeline steps.

Usage

GblYield(samples, filtres, pm.res, int.res)

Arguments

`samples`	Data frame with the information needed to identify the samples of the sequencing experiment, including the `Patient.ID`, `MID`, `Primer.ID`, `Region`, `RefSeq.ID` and `Pool.Nm` columns.
`filtres`	The data frame returned by the `FiltbyQ30` function.
`pm.res`	The list returned by `demultiplexPrimer`, including the `fileTable` and `poolTable` data frames.
`int.res`	The data frame returned by the `ConsHaplotypes` function.
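
As an illustration of the `samples` argument, the following minimal sketch builds a sample sheet with the six columns listed above and checks that none is missing. All values are made up, and the check itself is an assumption about how one might validate the input; it is not part of QApckg.

```r
# Hypothetical sample sheet: every value below is invented for illustration.
samples <- data.frame(
  Patient.ID = c("P01", "P02"),
  MID        = c("MID1", "MID2"),
  Primer.ID  = c("1", "1"),
  Region     = c("RegA", "RegA"),
  RefSeq.ID  = c("1", "1"),
  Pool.Nm    = c("Pool1", "Pool1"),
  stringsAsFactors = FALSE
)

# Columns named in the argument description.
required <- c("Patient.ID", "MID", "Primer.ID", "Region", "RefSeq.ID", "Pool.Nm")

# Names of required columns that are absent from the sample sheet.
missing.cols <- setdiff(required, colnames(samples))
if (length(missing.cols) > 0) {
  stop("Missing columns in samples: ", paste(missing.cols, collapse = ", "))
}
```

Running a check like this before the pipeline starts makes a malformed sample sheet fail early with a readable message.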

Value

After execution, two report files will be saved in the reports folder:

GlobalYieldBarplots.pdf: Barplots of the yield (in number of reads and as a percentage) at each step of the quality assessment pipeline, drawn for each pool included in the analysis and for the global results.
GlobalYield-SumRprt.txt: Summary report of the global yield at each analysis step, given in number of reads, as a percentage by step and as a percentage of the raw reads.
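
The three yield measures named above can be sketched with a few lines of arithmetic. This is not the code used by `GblYield`. The step names and read counts are hypothetical, and "percentage by step" is assumed here to mean the reads retained relative to the preceding step.

```r
# Hypothetical read counts remaining after each pipeline step (not real data).
reads <- c(raw = 100000, flash = 90000, q30 = 81000, mid = 72900, primer = 65610)

# Yield relative to the preceding step; the first step is compared with itself.
previous <- c(reads[1], head(reads, -1))
pct.by.step <- round(100 * reads / previous, 1)

# Yield relative to the raw reads.
pct.of.raw <- round(100 * reads / reads[["raw"]], 1)

# One row per step, mirroring the three measures in the summary report.
yield <- data.frame(reads = reads,
                    pct.by.step = pct.by.step,
                    pct.of.raw = pct.of.raw)
print(yield)
```

With these counts every step keeps 90% of the reads of the step before it, while the percentage of raw reads falls from 100 to 65.6.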

Note

This function is designed to be applied at the end of the quality assessment analysis and requires the previous execution of the FiltbyQ30, demultiplexPrimer and ConsHaplotypes functions from the same package.
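
Because the function depends on results from earlier steps, it can help to confirm that they have the documented shape before calling it. The helper below is a hypothetical sketch, not part of QApckg. It relies only on the types stated in the Arguments section: data frames for `filtres` and `int.res`, and a list holding `fileTable` and `poolTable` for `pm.res`.

```r
# Hypothetical helper (not provided by QApckg): stops with an error if any
# prerequisite result does not match the shape described in Arguments.
check_inputs <- function(filtres, pm.res, int.res) {
  stopifnot(
    is.data.frame(filtres),
    is.list(pm.res),
    all(c("fileTable", "poolTable") %in% names(pm.res)),
    is.data.frame(pm.res$fileTable),
    is.data.frame(pm.res$poolTable),
    is.data.frame(int.res)
  )
  invisible(TRUE)
}
```

A call such as `check_inputs(filtres, pm.res, int.res)` placed just before `GblYield(samples, filtres, pm.res, int.res)` would stop early if one of the previous steps was skipped.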

Author(s)

Alicia Aranda

Examples

## Execute FLASH extension
runDir <- "./run"
runfiles <- list.files(runDir,recursive=TRUE,full.names=TRUE,include.dirs=TRUE)
flash <- "./FLASH/flash.exe"
flashres <- R1R2toFLASH(runfiles,flash)

## Execute Q30 filtering
flashDir <- "./flash"
flashfiles <- list.files(flashDir,recursive=TRUE,full.names=TRUE,include.dirs=TRUE)
filtres <- FiltbyQ30(max.pct=0.05,flashfiles,flashres)

## Execute demultiplexing by MID with default parameters
flashFiltDir <- "./flashFilt"
flashffiles <- list.files(flashFiltDir,recursive=TRUE,full.names=TRUE,include.dirs=TRUE)
# Get data
samples <- read.table("./data/samples.csv", sep="\t", header=TRUE,
                      colClasses="character", stringsAsFactors=FALSE)
mids <- read.table("./data/mids.csv", sep="\t", header=TRUE,
                   stringsAsFactors=FALSE)
dem.res <- demultiplexMID(flashffiles, samples, mids)

## Execute demultiplexing by primer
splitDir <- "./splits"
splitfiles <- list.files(splitDir,recursive=TRUE,full.names=TRUE,include.dirs=TRUE)
# NOTE: `primers` is not defined in this example and must be loaded beforehand
pm.res <- demultiplexPrimer(splitfiles,samples,primers)

## Obtain consensus haplotypes
trimDir <- "./trim"
trimfiles <- list.files(trimDir,recursive=TRUE,full.names=TRUE,include.dirs=TRUE)
# NOTE: `thr` and `min.seq.len` are not defined in this example and must be set beforehand
int.res <- ConsHaplotypes(trimfiles, pm.res, thr, min.seq.len)

## Apply function
GblYield(samples, filtres, pm.res, int.res)

aliafdz/QApckg documentation built on June 2, 2022, 10:29 a.m.