MultiAmplicon-class: The central data structure of the MultiAmplicon package

MultiAmplicon-class R Documentation

The central data structure of the MultiAmplicon package

Description

The MultiAmplicon class is a container that stores at least primer pairs, read files and progressively processed data in an 'amplicons x samples' format. The slots of this object are filled incrementally by running wrapper functions (mostly around functions from the dada2 package). The object is treated (subsetted etc.) like a (pseudo) matrix: columns are samples, rows are different amplicons.

Subsetting a MultiAmplicon object should conveniently subset all (potentially) filled slots.
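
The matrix-like layout can be illustrated with base R alone. The following is a minimal sketch, not the package's implementation: the amplicon and sample names are hypothetical, a plain character matrix stands in for a slot such as stratifiedFilesF, and subset_all is a helper invented for this illustration. It shows how one pair of row and column indices can be applied to the count matrix and to a slot of the same shape so that both stay aligned.

```r
# Hypothetical amplicon (row) and sample (column) names, for illustration only
amplicons <- c("ampA", "ampB", "ampC")
samples <- c("S01", "S02")

# Count-like matrix built the same way as the default .Data argument
counts <- matrix(seq(1, length(amplicons) * length(samples)),
                 nrow = length(amplicons), ncol = length(samples),
                 dimnames = list(amplicons, samples))

# Stand-in for a per-cell slot: one made-up file name per amplicon and sample
files <- matrix(paste0(rep(amplicons, times = length(samples)), "_",
                       rep(samples, each = length(amplicons)), ".fastq"),
                nrow = length(amplicons), ncol = length(samples),
                dimnames = list(amplicons, samples))

# Hypothetical helper: apply the same indices to every filled component
subset_all <- function(i, j) {
  list(counts = counts[i, j, drop = FALSE],
       files = files[i, j, drop = FALSE])
}

sub <- subset_all(c("ampA", "ampC"), "S02")
sub$counts
sub$files
```

Using drop = FALSE keeps both components two-dimensional, so the 'amplicons x samples' shape survives even when a single sample is selected.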

Usage

MultiAmplicon(
  PrimerPairsSet = PrimerPairsSet(),
  PairedReadFileSet = PairedReadFileSet(),
  sampleData = new("sample_data", data.frame(row.names = names(PairedReadFileSet),
    readsF = PairedReadFileSet@readsF, readsR = PairedReadFileSet@readsF)),
  .Data = matrix(seq(1, length(PrimerPairsSet) * length(PairedReadFileSet)), nrow =
    length(PrimerPairsSet), ncol = length(PairedReadFileSet), dimnames =
    list(names(PrimerPairsSet), names(PairedReadFileSet))),
  stratifiedFilesF = matrix(nrow = 0, ncol = 0),
  stratifiedFilesR = matrix(nrow = 0, ncol = 0),
  rawCounts = matrix(nrow = 0, ncol = 0),
  derepF = matrix(nrow = 0, ncol = 0),
  derepR = matrix(nrow = 0, ncol = 0),
  dadaF = matrix(nrow = 0, ncol = 0),
  dadaR = matrix(nrow = 0, ncol = 0),
  mergers = matrix(nrow = 0, ncol = 0),
  sequenceTable = list(),
  sequenceTableNoChime = list(),
  taxonTable = list()
)

getPrimerPairsSet(MA)

getPairedReadFileSet(MA)

getRawCounts(MA)

getSampleData(MA)

getStratifiedFilesF(MA, ...)

getStratifiedFilesR(MA, ...)

getDerepF(MA, ...)

getDerepR(MA, ...)

getDadaF(MA, ...)

getDadaR(MA, ...)

getMergers(MA, ...)

getSequenceTable(MA, dropEmpty = TRUE, simplify = TRUE)

getSequenceTableNoChime(MA, dropEmpty = TRUE, fill = FALSE)

getTaxonTable(MA, simplify = TRUE)

getSequencesFromTable(MA)

## S4 method for signature 'MultiAmplicon'
getSequencesFromTable(MA)

## S4 method for signature 'MultiAmplicon,numeric,'function''
apply(X, MARGIN, FUN, ..., simplify = TRUE)

## S4 method for signature 'MultiAmplicon'
show(object)

## S4 method for signature 'MultiAmplicon,ANY,ANY,ANY'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'MultiAmplicon,index,missing,ANY'
x[i, j, ..., drop = FALSE]

## S4 method for signature 'MultiAmplicon,missing,index,ANY'
x[i, j, ..., drop = FALSE]

Arguments

PrimerPairsSet

a set of primer pairs specifying your amplicons; see PrimerPairsSet-class

PairedReadFileSet

a set of paired-end sequencing data files; see PairedReadFileSet-class

sampleData

Users should not supply this parameter. It's filled with a sample_data object from phyloseq. The slot is created from sample names (same as colnames(MA)) and more data can be added by addSampleData.

.Data

Users should not supply this parameter; the slot is created by sortAmplicons.

mergers

Users should not supply this parameter; the slot is created by mergeMulti.

sequenceTable

Users should not supply this parameter; the slot is created by makeSequenceTableMulti.

sequenceTableNoChime

Users should not supply this parameter; the slot is created by removeChimeraMulti.

taxonTable

Users should not supply this parameter; the slot is created by blastTaxAnnot. It's filled with a list of taxonomyTable objects from phyloseq.

MA

MultiAmplicon-class object

...

not used

dropEmpty

Should empty files be returned

object

A MultiAmplicon-class object.

x

MultiAmplicon-class object

i

numeric, logical or names vector for subsetting rows (== amplicons)

j

numeric, logical or names vector for subsetting columns (== read files, corresponding usually to samples)

drop

should not be used

stratifiedFiles

Users should not supply this parameter; the slot is created by sortAmplicons.

derep

Users should not supply this parameter; the slot is created by derepMulti.

dada

Users should not supply this parameter; the slot is created by dadaMulti.
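
The index types listed for i and j (numeric, logical or names) correspond to the usual forms of matrix subsetting in R. As a base R sketch, using a hypothetical plain matrix in place of a MultiAmplicon object, the three forms below select the same rows:

```r
# Hypothetical (amplicons x samples) matrix standing in for a MultiAmplicon object
m <- matrix(1:6, nrow = 3, ncol = 2,
            dimnames = list(c("ampA", "ampB", "ampC"), c("S01", "S02")))

by_number <- m[c(1, 3), , drop = FALSE]                # numeric index
by_logical <- m[c(TRUE, FALSE, TRUE), , drop = FALSE]  # logical index
by_name <- m[c("ampA", "ampC"), , drop = FALSE]        # names index

# Check that all three forms select amplicons "ampA" and "ampC" for every sample
identical(by_number, by_logical) && identical(by_number, by_name)
```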

Functions

  • MultiAmplicon: Constructor for MultiAmplicon-class

Slots

PrimerPairsSet

The primer pairs used in your experiment to specify amplicons, stored in a PrimerPairsSet-class object.

PairedReadFileSet

The (quality filtered) fastq files (one file pair for each sample) that store your sequencing data.

.Data

A numeric matrix of sequencing read counts per amplicon and sample. Created by the function sortAmplicons in the MultiAmplicon pipeline.

sampleData

A sample_data object from phyloseq. The slot is created from sample names (names of the PairedReadFileSet, which have to be the same as colnames(MA)). More data can be added by addSampleData.

stratifiedFilesF

Temporary files resulting from stratifying reads into amplicons and samples with the MultiAmplicon pipeline function sortAmplicons. Forward (sometimes called R1) and reverse (sometimes called R2) files are each stored as an (amplicons x samples) matrix object.

stratifiedFilesR

Temporary files resulting from stratifying reads into amplicons and samples with the MultiAmplicon pipeline function sortAmplicons. Forward (sometimes called R1) and reverse (sometimes called R2) files are each stored as an (amplicons x samples) matrix object.

derep

A list of PairedDerep-class objects containing pairs of derep-class objects created by dada2’s derepFastq function or, within the MultiAmplicon pipeline, by derepMulti.

dada

A list of PairedDada-class objects containing pairs of dada-class objects created by dada2’s dada function. Within the MultiAmplicon pipeline this slot is filled by dadaMulti.

mergers

A list of objects containing merged pairs of forward and reverse reads as created by dada2’s mergePairs function. Within the MultiAmplicon pipeline this slot is filled by mergeMulti.

sequenceTable

A list of matrix objects created by dada2’s makeSequenceTable. Samples (in rows) and amplified sequence variants (ASVs) in columns. Within the MultiAmplicon pipeline this slot is filled by makeSequenceTableMulti.

sequenceTableNoChime

A list of matrix objects created by dada2’s removeBimeraDenovo. Samples (in rows) and ASVs screened for PCR chimeras in columns. Within the MultiAmplicon pipeline this slot is filled by removeChimeraMulti.

taxonTable

A list of matrix objects created by a function for taxonomic annotation (for example blastTaxAnnot). ASVs are in rows and taxonomic ranks are in columns.

MultiAmplicon(PrimerPairsSet, PairedReadFileSet)
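
The orientation of the list slots described above can be sketched in base R. The example below is an illustration under stated assumptions, not output of the pipeline: the amplicon names, sample names, sequences and counts are all made up. Each list element is one amplicon's table with samples in rows and ASV sequences as column names, matching the layout described for sequenceTable.

```r
# Made-up per-amplicon sequence tables: samples in rows, ASVs in columns
seq_tables <- list(
  ampA = matrix(c(10L, 0L, 3L, 7L), nrow = 2,
                dimnames = list(c("S01", "S02"), c("ACGTAC", "ACGTTC"))),
  ampB = matrix(c(5L, 2L), nrow = 2,
                dimnames = list(c("S01", "S02"), "GGCATA"))
)

# The ASV sequences of each amplicon are the column names of its table
asv_seqs <- lapply(seq_tables, colnames)

# Total reads per amplicon and sample, arranged as (amplicons x samples)
reads <- t(sapply(seq_tables, rowSums))
reads
```

Transposing the summary brings it back to the 'amplicons x samples' orientation used by the object itself, whereas the individual tables keep samples in rows.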

Author(s)

Emanuel Heitlinger

See Also

derepFastq, dada

Examples


primerF <- c("AGAGTTTGATCCTGGCTCAG", "ACTCCTACGGGAGGCAGC",
            "GAATTGACGGAAGGGCACC", "YGGTGRTGCATGGCCGYT")
primerR <- c("CTGCWGCCNCCCGTAGG", "GACTACHVGGGTATCTAATCC",
             "AAGGGCATCACAGACCTGTTAT", "TCCTTCTGCAGGTTCACCTAC")

PPS <- PrimerPairsSet(primerF, primerR)

fastq.dir <- system.file("extdata", "fastq", package = "MultiAmplicon")
fastq.files <- list.files(fastq.dir, full.names=TRUE)
Ffastq.file <- fastq.files[grepl("F_filt", fastq.files)]
Rfastq.file <- fastq.files[grepl("R_filt", fastq.files)]

PRF <- PairedReadFileSet(Ffastq.file, Rfastq.file)

MA <- MultiAmplicon(PPS, PRF)

## sort into amplicons
MA1 <- sortAmplicons(MA, filedir=tempfile(pattern = "dir"))

## Only after sorting is the MultiAmplicon object really populated
## with sensible data. Matrix-like access to different
## amplicons (primer pairs) and different sequencing read files
## (usually samples) is implemented.

## the number of amplicons (primer pairs)
nrow(MA)

## the number of samples (sequencing read file pairs)
ncol(MA)

## dereplication is currently not supported
## MA2 <- derepMulti(MA1)

### use dada directly after sorting
MA3 <- dadaMulti(MA1, selfConsist = TRUE)

MA4 <- mergeMulti(MA3, justConcatenate=TRUE)

MA5 <- makeSequenceTableMulti(MA4)

MA6 <- removeChimeraMulti(MA5, mc.cores=1)


derele/MultiAmplicon documentation built on Dec. 11, 2024, 12:09 p.m.