ConsHaplotypes | R Documentation |
Computes the intersection of forward and reverse strand haplotypes and generates some report files.
ConsHaplotypes(trimfiles, pm.res, thr = 0.2, min.seq.len = 150, max.difs = 250)
trimfiles |
Vector including the paths of demultiplexed files by specific primer, with fna extension. |
pm.res |
The list returned by demultiplexPrimer. |
thr |
Threshold to filter haplotypes at minimum abundance before multiple alignment. |
min.seq.len |
Threshold to filter haplotypes at minimum length before intersection. |
max.difs |
Maximum number of mismatches allowed in resulting consensus haplotypes with respect to the dominant one. |
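As a rough illustration of what the thr and min.seq.len filters do, the following base R sketch applies a length filter and then an abundance filter to a toy haplotype table. The data frame hpl, its columns, the reading of thr as a percentage of total reads, and the order of the two filters are all assumptions made for this example; none of it is taken from the package source.

```r
# Hypothetical toy data: haplotype sequences with their read counts
# (illustration only, not produced by the package)
hpl <- data.frame(
  seq   = c(strrep("A", 200), strrep("C", 180), strrep("G", 100), strrep("T", 160)),
  reads = c(900, 95, 4, 1),
  stringsAsFactors = FALSE
)

min.seq.len <- 150
thr <- 0.2  # assumed here to be a percentage of the total reads

# Length filter: keep haplotypes at least min.seq.len bases long
hpl <- hpl[nchar(hpl$seq) >= min.seq.len, ]

# Abundance filter: keep haplotypes whose share of reads reaches thr percent
pct <- 100 * hpl$reads / sum(hpl$reads)
hpl.flt <- hpl[pct >= thr, ]

print(hpl.flt$reads)
```

In this toy case the 100-base haplotype is dropped by the length filter and the single-read haplotype by the abundance filter.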
This function is designed to be used after running the demultiplexPrimer function from the same package. Once the FASTA files containing forward and reverse strand reads for the evaluated samples have been generated, ConsHaplotypes runs a multiple alignment with muscle and returns the consensus haplotypes using IntersectStrandHpls, which are then saved with the helper function SaveHaplotypes.
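The idea of intersecting strands can be sketched in base R as follows. This is a simplified illustration, not the IntersectStrandHpls implementation: it assumes haplotypes are compared by exact sequence match, whereas the package works on aligned sequences. The vectors fw and rv and the yield calculation are hypothetical.

```r
# Hypothetical read counts per haplotype sequence on each strand
fw <- c(ACGTAC = 500, ACGTTC = 40, ACGAAC = 10)  # forward strand
rv <- c(ACGTAC = 450, ACGTTC = 35, TCGTAC = 15)  # reverse strand

# Haplotypes present on both strands
common <- intersect(names(fw), names(rv))

# Reads supporting the common haplotypes, summed over both strands
int.reads <- fw[common] + rv[common]

# Share of all reads retained after the intersection (assumed definition)
total <- sum(fw) + sum(rv)
yield <- 100 * sum(int.reads) / total

print(common)
print(int.reads)
```

Haplotypes seen on only one strand (ACGAAC and TCGTAC here) are discarded, which matches the "unique to a single strand" category of filtered-out reads described for the return value.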
The function returns a data.frame object containing the intersection results for each combination of patient and amplicon region, including the initial number of reads, filtered out reads (for being below a given frequency threshold or unique to a single strand), overlapping frequency between both strands and the common reads (in percentage and number of reads).
After execution, two FASTA files for each combination of sample and pool are saved in a newly created MACH folder: the first contains the multiple alignment of forward and reverse strand haplotypes, and the second contains the intersected forward and reverse strands. Additionally, several report files are generated in the reports folder:
MA.Intersects-SummRprt.txt: Includes the summary results by number of reads after the abundance filter and strand intersection.
MA.Intersects.plots.pdf: Includes barplots for each sample representing the frequency of forward, reverse and intersected strand haplotypes.
IntersectBarplots.pdf: Includes barplots for all combinations of patient and pool, representing the number of intersected and filtered out reads, the intersection yield and the global yield.
A new file named muscle.log containing the muscle options will be generated and saved in a folder named "tmp".
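After a run, it can be useful to check that the report and log files listed above were written. The sketch below assumes the reports and tmp folders sit in the current working directory, which the documentation does not state; adjust the paths to your project layout.

```r
# File names taken from the documentation; folder locations are assumed
# to be relative to the current working directory
expected <- c(
  file.path("reports", c("MA.Intersects-SummRprt.txt",
                         "MA.Intersects.plots.pdf",
                         "IntersectBarplots.pdf")),
  file.path("tmp", "muscle.log")
)

# One row per expected file, with a flag telling whether it exists
status <- data.frame(file = expected,
                     exists = file.exists(expected),
                     stringsAsFactors = FALSE)
print(status)
```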
Alicia Aranda
muscle, IntersectStrandHpls, demultiplexPrimer, SaveHaplotypes
splitDir <- "./splits"
# Save the file names with complete path
splitfiles <- list.files(splitDir, recursive = TRUE, full.names = TRUE, include.dirs = TRUE)

# Get data
samples <- read.table("./data/samples.csv", sep = "\t", header = TRUE,
                      colClasses = "character", stringsAsFactors = FALSE)
mids <- read.table("./data/mids.csv", sep = "\t", header = TRUE, stringsAsFactors = FALSE)

# Apply previous function from QA analysis
# NOTE: 'primers' is not defined in this example and must be loaded beforehand
pm.res <- demultiplexPrimer(splitfiles, samples, primers)

# Save the files generated by previous function
trimDir <- "./trim"
trimfiles <- list.files(trimDir, recursive = TRUE, full.names = TRUE, include.dirs = TRUE)

# Define necessary parameters
min.seq.len <- 150
thr <- 0.2

int.res <- ConsHaplotypes(trimfiles, pm.res, thr, min.seq.len)