sortAmplicons: sortAmplicons

sortAmplicons    R Documentation

sortAmplicons

Description

Sort different amplicons into a fully stratified samples x amplicons structure based on primer matches.

Usage

sortAmplicons(
  MA,
  filedir = "stratified_files",
  n = 1e+06,
  countOnly = FALSE,
  rmPrimer = TRUE,
  ...
)

## S4 method for signature 'MultiAmplicon'
sortAmplicons(
  MA,
  filedir = "stratified_files",
  n = 1e+06,
  countOnly = FALSE,
  rmPrimer = TRUE,
  ...
)

Arguments

MA

MultiAmplicon-class object containing a set of paired-end read files and a set of primer pairs.

filedir

path to a folder on your computer, either existing or to be newly created. An existing folder must be empty.

n

parameter passed to the yield functions of package ShortRead. This controls memory consumption during streaming. Lower values reduce memory requirements but might result in longer processing time, because the sequence files are read in more, smaller I/O operations.

countOnly

logical; if set to TRUE, only a matrix of read counts is returned

rmPrimer

logical, indicating whether primer sequences should be removed during sorting

...

additional parameters to be passed to Biostrings::isMatchingStartingAt. Be careful when using multiple starting positions or allowing mismatches: this could lead to read pairs being assigned to multiple amplicons.
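The caveat about allowing mismatches can be illustrated directly with Biostrings, outside of sortAmplicons. The following is a minimal sketch, not the package's internal code: the two primers and the read are hypothetical sequences made up for illustration, and max.mismatch is an argument of isMatchingStartingAt itself.

```r
library(Biostrings)

# Hypothetical primers for two amplicons that differ only in their last base
primers <- DNAStringSet(c(ampA = "ACGTACGTAC", ampB = "ACGTACGTAT"))

# Hypothetical read that starts with the ampA primer
read <- DNAString("ACGTACGTACGGGTTT")

# Exact matching at position 1: the read is assigned to ampA only
exact <- vapply(names(primers), function(nm) {
  isMatchingStartingAt(primers[[nm]], read, starting.at = 1)
}, logical(1))

# Allowing one mismatch: the same read now matches both primers
relaxed <- vapply(names(primers), function(nm) {
  isMatchingStartingAt(primers[[nm]], read, starting.at = 1, max.mismatch = 1)
}, logical(1))

print(exact)
print(relaxed)
```

With closely related primers, a relaxed setting of this kind is where a read could end up counted for more than one amplicon.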

Details

This function uses isMatchingStartingAt to match primer sequences at the first position of forward and reverse sequences. These primer sequences can be removed. The remaining sequences of interest are written to files to allow processing via standard metabarcoding pipelines.
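The matching-and-trimming step described above can be sketched on a few in-memory sequences. This is an illustrative sketch only, not the implementation used by sortAmplicons: the reads are hypothetical, only the forward primer is handled, and fixed = FALSE (so that IUPAC ambiguity codes such as Y and R in the primer are honoured) is an assumption about how one would typically match degenerate primers.

```r
library(Biostrings)

# One of the degenerate forward primers from the Examples section
primer <- DNAString("YGGTGRTGCATGGCCGYT")

# Hypothetical reads: the first starts with the primer, the second does not
reads <- DNAStringSet(c(
  hit  = "CGGTGATGCATGGCCGTTACGTACGT",
  miss = "AAAAAAAAAAAAAAAAAAAAAAAAAA"
))

# Test whether the primer matches each read starting at position 1
matches <- isMatchingStartingAt(primer, reads, starting.at = 1, fixed = FALSE)

# Keep only matching reads and remove the primer from their start
trimmed <- subseq(reads[matches], start = length(primer) + 1)

print(matches)
print(trimmed)
```

In the real function the same test is applied to both the forward and the reverse read of each pair, and the trimmed sequences are written to files rather than kept in memory.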

Value

MultiAmplicon: By default (countOnly = FALSE), a MultiAmplicon-class object is returned with the stratifiedFiles slot populated. Stratified file names are constructed using a unique string created by tempfile, and the files are stored in the given filedir (the Usage section lists "stratified_files" as the default). If countOnly is TRUE, only a numeric matrix of read counts is returned.

Author(s)

Emanuel Heitlinger

Examples


primerF <- c("AGAGTTTGATCCTGGCTCAG", "ACTCCTACGGGAGGCAGC",
            "GAATTGACGGAAGGGCACC", "YGGTGRTGCATGGCCGYT")
primerR <- c("CTGCWGCCNCCCGTAGG", "GACTACHVGGGTATCTAATCC",
             "AAGGGCATCACAGACCTGTTAT", "TCCTTCTGCAGGTTCACCTAC")

PPS <- PrimerPairsSet(primerF, primerR)

fastq.dir <- system.file("extdata", "fastq", package = "MultiAmplicon")
fastq.files <- list.files(fastq.dir, full.names=TRUE)
Ffastq.file <- fastq.files[grepl("F_filt", fastq.files)]
Rfastq.file <- fastq.files[grepl("R_filt", fastq.files)]

PRF <- PairedReadFileSet(Ffastq.file, Rfastq.file)

MA <- MultiAmplicon(PPS, PRF)

## sort into amplicons
MA1 <- sortAmplicons(MA)


derele/MultiAmplicon documentation built on Dec. 11, 2024, 12:09 p.m.