MultiAmplicon-class: The central data structure of the MultiAmplicon package
In derele/MultiAmplicon: Metabarcoding and microbiome analysis using multiple amplicons

MultiAmplicon-class

R Documentation

The central data structure of the MultiAmplicon package

Description

The MultiAmplicon class is a container that stores at least primer pairs, read files and progressively processed data in an 'amplicon x samples' format. The slots in this object are incrementally filled with by running wrappers functions (mostly around functions from the dada2 package). The object is treated (subsetted etc.) like a (pseudo) matrix, colums are samples, rows are different amplicons.

Subsetting for MultiAmplicon objects should conveniently subset all (potentially) filled slots

Usage

MultiAmplicon(
  PrimerPairsSet = PrimerPairsSet(),
  PairedReadFileSet = PairedReadFileSet(),
  sampleData = new("sample_data", data.frame(row.names = names(PairedReadFileSet),
    readsF = PairedReadFileSet@readsF, readsR = PairedReadFileSet@readsF)),
  .Data = matrix(seq(1, length(PrimerPairsSet) * length(PairedReadFileSet)), nrow =
    length(PrimerPairsSet), ncol = length(PairedReadFileSet), dimnames =
    list(names(PrimerPairsSet), names(PairedReadFileSet))),
  stratifiedFilesF = matrix(nrow = 0, ncol = 0),
  stratifiedFilesR = matrix(nrow = 0, ncol = 0),
  rawCounts = matrix(nrow = 0, ncol = 0),
  derepF = matrix(nrow = 0, ncol = 0),
  derepR = matrix(nrow = 0, ncol = 0),
  dadaF = matrix(nrow = 0, ncol = 0),
  dadaR = matrix(nrow = 0, ncol = 0),
  mergers = matrix(nrow = 0, ncol = 0),
  sequenceTable = list(),
  sequenceTableNoChime = list(),
  taxonTable = list()
)

getPrimerPairsSet(MA)

getPairedReadFileSet(MA)

getRawCounts(MA)

getSampleData(MA)

getStratifiedFilesF(MA, ...)

getStratifiedFilesR(MA, ...)

getDerepF(MA, ...)

getDerepR(MA, ...)

getDadaF(MA, ...)

getDadaR(MA, ...)

getMergers(MA, ...)

getSequenceTable(MA, dropEmpty = TRUE, simplify = TRUE)

getSequenceTableNoChime(MA, dropEmpty = TRUE, fill = FALSE)

getTaxonTable(MA, simplify = TRUE)

getSequencesFromTable(MA)

## S4 method for signature 'MultiAmplicon'
getSequencesFromTable(MA)

## S4 method for signature 'MultiAmplicon,numeric,'function''
apply(X, MARGIN, FUN, ..., simplify = TRUE)

## S4 method for signature 'MultiAmplicon'
show(object)

## S4 method for signature 'MultiAmplicon,ANY,ANY,ANY'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'MultiAmplicon,index,missing,ANY'
x[i, j, ..., drop = FALSE]

## S4 method for signature 'MultiAmplicon,missing,index,ANY'
x[i, j, ..., drop = FALSE]

Arguments

`PrimerPairsSet`	a set of primer pairs specifiying your amplicons see `PrimerPairsSet-class`
`PairedReadFileSet`	a set of paired end sequencing data files `PairedReadFileSet-class`
`sampleData`	Users should not supply this parameter. It's filled with a sample_data object from `phyloseq`. The slot is created from sample names (same as `colnames(MA)`) and more data can be added by `addSampleData`.
`.Data`	Users should not supply this parameter, the slot is created by `sortAmplicons`.
`mergers`	Users should not supply this parameter, the slot is created by `mergeMulti`
`sequenceTable`	Users should not supply this parameter, the slot is created by `makeSequenceTableMulti`
`sequenceTableNoChime`	Users should not supply this parameter, the slot is created by `removeChimeraMulti`
`taxonTable`	Users should not supply this parameter, the slot is created by `blastTaxAnnot`. It's filled with a list of taxonomyTable objects from `phyloseq`.
`MA`	MultiAmplicon-class object
`...`	not used
`dropEmpty`	Should empty files be returned
`object`	A `MultiAmplicon-class` object.
`x`	MultiAmplicon-class object
`i`	numeric, logical or names vector for subsetting rows (== amplicons)
`j`	numeric, logical or names vector for subsetting columns (== read files, corresponding usually to samples)
`drop`	should not be used
`stratifiedFiles`	Users should not supply this parameter, the slot is created by `sortAmplicons`.
`derep`	Users should not supply this parameter, the slot is created by `derepMulti`
`dada`	Users should not supply this parameter, the slot is created by `dadaMulti`

Functions

MultiAmplicon: Constructor for MultiAmplicon-class

Slots

PrimerPairsSet

The primer pairs used in your experiment to specify amplicons stored in a PrimerPairsSet-class object.

PairedReadFileSet

The (quality filtered) fastq files (one file pair for each sample) that store your sequencing data.

.Data

A numeric matrix of sequencing read counts per amplicon and sample. Created by the function sortAmplicons in the MultiAmplicon pipeline.

sampleData

A sample_data object from phyloseq. The slot is created from sample names (names of the PrimerPairsSet, which have tto be the same as colnames(MA)). More data can be added by addSampleData.

stratifiedFilesF

temporary files as a result of stratifying into amplicons and samples using the MultiAmplicon pipeline function sortAmplicons. Forward (sometimes called R1) and reverse (sometimes called R2) files are stored as a (amplicons x samples) matrix objects.

stratifiedFilesR

derep

A list of PairedDerep-class objects containing pairs of derep-class objects created by dada2’s derepFastq function or withing the MultiAmplicon pipeline by derepMulti.

dada

A list of PairedDada-class object containing pairs of dada-class objects created by dada2’s dada function. Within the MultiAmplicon pipeline this slot is filled by dadaMulti.

mergers

A list of objects containing merged pairs of forward and reverse reads as created by by dada2’s mergePairs function. Within the MultiAmplicon pipeline this slot is filled by mergeMulti.

sequenceTable

A list of matrix objects created by dada2’s makeSequenceTable. Samples (in rows) and amplified sequence variants (ASVs) in columns. Within the MultiAmplicon pipeline this slot is filled by makeSequenceTableMulti.

sequenceTableNoChime

A list of matrix objects created by dada2’s removeBimeraDenovo. Samples (in rows) and ASVs screened for PCR chimeras in columns. Within the MultiAmplicon pipeline this slot is filled by removeChimeraMulti.

taxonTable

A list of matrix objects created by a function for taxonomical annotation (for example blastTaxAnnot. ASVs are in rows and taxnomical ranks are in columns.

MultiAmplicon(PrimerPairsSet, PairedReadFileSet)

Author(s)

Emanuel Heitlinger

Examples


primerF <- c("AGAGTTTGATCCTGGCTCAG", "ACTCCTACGGGAGGCAGC",
            "GAATTGACGGAAGGGCACC", "YGGTGRTGCATGGCCGYT")
primerR <- c("CTGCWGCCNCCCGTAGG", "GACTACHVGGGTATCTAATCC",
             "AAGGGCATCACAGACCTGTTAT", "TCCTTCTGCAGGTTCACCTAC")

PPS <- PrimerPairsSet(primerF, primerR)

fastq.dir <- system.file("extdata", "fastq", package = "MultiAmplicon")
fastq.files <- list.files(fastq.dir, full.names=TRUE)
Ffastq.file <- fastq.files[grepl("F_filt", fastq.files)]
Rfastq.file <- fastq.files[grepl("R_filt", fastq.files)]

PRF <- PairedReadFileSet(Ffastq.file, Rfastq.file)

MA <- MultiAmplicon(PPS, PRF)

## sort into amplicons
MA1 <- sortAmplicons(MA, filedir=tempfile(pattern = "dir"))

## Only after sorting the MultiAmplicon object is really poplated
## with sensible data, now matrix-like access to different 
## amplicons (primer pairs) and different sequencing read files
## (usually samples) is implemented.

## the number of amplicons (primer pairs)
nrow(MA)

## the number of samples (sequencing read file pairs)
ncol(MA)

## dereplication is currently not supported
## MA2 <- derepMulti(MA1)

### use dada directly after sorting
MA3 <- dadaMulti(MA1, selfConsist = TRUE)

MA4 <- mergeMulti(MA3, justConcatenate=TRUE)

MA5 <- makeSequenceTableMulti(MA4)

MA6 <- removeChimeraMulti(MA5, mc.cores=1)

derele/MultiAmplicon documentation built on Dec. 11, 2024, 12:09 p.m.