demultiplexPrimer | R Documentation |
Demultiplexes reads by identifying template-specific primer sequences within windows of expected positions in the sequenced reads. Note that MID and template-specific primer sequences are trimmed from the reads once the primers have been identified, and that amplicon length is not predetermined.
demultiplexPrimer( splitfiles, samples, primers, prmm = 3, min.len = 180, target.st = 1, target.end = 100 )
splitfiles |
Vector with the paths of the files already demultiplexed by MID, with fna extension. |
samples |
Data frame with relevant information to identify the samples of the sequencing experiment, including
|
primers |
Data frame with information about the template specific primers used in the experiment, including
|
prmm |
Number of mismatches allowed between the primers and read sequences. |
min.len |
Minimum length desired for haplotypes. Any sequence below this length will be discarded. |
target.st, target.end |
Start and end positions between which template-specific primer sequences will be searched. |
After demultiplexing reads by MID with the demultiplexMID function, template-specific primer sequences are identified in both strands. First, forward strands are recognized by searching for the FW primer sequence at the 5' end and the reverse complement of the RV primer sequence at the 3' end. Then, reverse strands are recognized by searching for the RV primer sequence at the 5' end and the FW primer sequence at the 3' end, after obtaining the reverse complement of all reads identified as reverse strands. Both strands are thus obtained in an orientation that facilitates their intersection.
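The strand recognition described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helpers `rev_comp`, `find_primer` and `classify_read` are hypothetical names introduced here, the toy primers and reads are invented, mismatches are counted as substitutions only, and the whole primer is assumed to lie inside the search window. The 3' end check is done by searching the 5' window of the reverse-complemented read, which is one possible reading of the description.

```r
# Reverse complement of a DNA string (base R only; plain A/C/G/T assumed).
rev_comp <- function(x) {
  paste(rev(strsplit(chartr("ACGT", "TGCA", x), "")[[1]]), collapse = "")
}

# Hypothetical helper: first position in [st, end] at which `primer` matches
# `read` with at most `max.mm` substitutions (no indels), or NA if none.
find_primer <- function(read, primer, max.mm, st, end) {
  k <- nchar(primer)
  last <- min(end, nchar(read)) - k + 1   # primer must fit inside the window
  if (last < st) return(NA_integer_)
  p <- strsplit(primer, "")[[1]]
  for (i in st:last) {
    w <- strsplit(substr(read, i, i + k - 1), "")[[1]]
    if (sum(w != p) <= max.mm) return(i)
  }
  NA_integer_
}

# Hypothetical helper: label a read as "forward", "reverse" or "unassigned".
classify_read <- function(read, fw, rv, max.mm = 3, st = 1, end = 100) {
  rc <- rev_comp(read)
  # Forward strand: FW primer near the 5' end, and the reverse complement of
  # the RV primer near the 3' end (checked as RV in the 5' window of `rc`).
  if (!is.na(find_primer(read, fw, max.mm, st, end)) &&
      !is.na(find_primer(rc, rv, max.mm, st, end))) return("forward")
  # Reverse strand: the same test with the primer roles swapped.
  if (!is.na(find_primer(read, rv, max.mm, st, end)) &&
      !is.na(find_primer(rc, fw, max.mm, st, end))) return("reverse")
  "unassigned"
}

# Toy data (invented for illustration only).
fw <- "ACGTAC"
rv <- "TTGCAG"
fwd_read <- paste0(fw, "GATTACAGATTACA", rev_comp(rv))
rev_read <- rev_comp(fwd_read)

classify_read(fwd_read, fw, rv, max.mm = 1, st = 1, end = 8)   # "forward"
classify_read(rev_read, fw, rv, max.mm = 1, st = 1, end = 8)   # "reverse"
classify_read(strrep("G", 20), fw, rv, max.mm = 1, st = 1, end = 8)   # "unassigned"
```

In this sketch, `max.mm`, `st` and `end` play the roles of `prmm`, `target.st` and `target.end`. With primers as short as the toy ones, a narrow window and a low mismatch allowance are needed to avoid spurious matches inside the amplicon.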
A list containing the following:
fileTable |
A table with relevant data for each FASTA file generated during execution, including its associated strand, mean read length, total reads and total haplotypes obtained. |
poolTable |
A table with the number of total trimmed reads and the yield of the process by pool. |
After execution, a FASTA file for each combination of strand, MID and pool will be saved in a newly created trim folder. Additionally, some report files will be generated in a reports folder:
AmpliconLengthsRprt.txt: Includes the amplicon lengths of both strands for each sample (with their corresponding MID identifier).
AmpliconLengthsPlot.pdf: Includes a barplot for each sample representing the amplicon lengths of both strands.
SplitByPrimersOnFlash.txt: Includes a table of reads identified by primer, total reads identified by patient and the yield by pool.
SplitByPrimersOnFlash.pdf, SplitByPrimersOnFlash-hz.pdf: Include some plots representing primer matches by patient (in nº of reads) and the coverage of forward/reverse matches by pool.
SplittedReadsFileTable.txt: A file containing the same information as fileTable.
Alicia Aranda
demultiplexMID, primermatch
# Set parameters
prmm <- 3
min.len <- 180
# The expected window for template specific primer sequences will depend on the presence of
# adapters, MID sequences and/or M13 primer.
target.st <- 1
target.end <- 100

splitDir <- "./splits"
# Save the file names with complete path
splitfiles <- list.files(splitDir, recursive = TRUE, full.names = TRUE, include.dirs = TRUE)

# Get data
samples <- read.table("./data/samples.csv", sep = "\t", header = TRUE,
                      colClasses = "character", stringsAsFactors = FALSE)
primers <- read.table("./data/primers.csv", sep = "\t", header = TRUE,
                      stringsAsFactors = FALSE)

pm.res <- demultiplexPrimer(splitfiles, samples, primers, prmm, min.len, target.st, target.end)