This algorithm assumes that the sequences "should be" identical except for amplification and sequencing errors. Its main purpose is to calculate a consensus sequence for an amplicon that is too long to use in DADA2 directly, but which has been clustered based on sequence variant identity in one subregion.
cluster_consensus(seq, nread = 1, ..., ncpus = 1, simplify = TRUE)
## S3 method for class 'character'
cluster_consensus(
  seq,
  nread = 1,
  names = base::names(seq),
  dna2rna = TRUE,
  ...,
  ncpus = 1,
  simplify = TRUE
)
## S3 method for class 'XStringSet'
cluster_consensus(seq, nread = 1, ..., ncpus = 1, simplify = TRUE)
seq | (
nread | (
... | passed to methods
ncpus | (
simplify | (
names | (
dna2rna | (logical) whether to convert
The sequences are first aligned using AlignSeqs (from the DECIPHER package). Sequences that are "outliers" in the alignment are then removed by odseq (from the odseq package). If the input sequences were clustered based on DADA2 sequence variants of a variable region, and the sequences were appropriately quality filtered prior to running dada, then outliers should mostly be chimeras.
After outlier removal, sites with greater than 50% gaps are removed, and
the most frequent letter (ignoring gaps) is chosen at all other sites. If no
letter has greater than 50% representation at a position, then an IUPAC
ambiguous base representing at least 50% of the reads at that position is
chosen for nucleotide sequences, or "X" for amino acids.
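
The consensus-calling step described above can be illustrated with a minimal base R sketch. This is not the package's implementation: `toy_consensus` and the `iupac` lookup are hypothetical names introduced here, the input is assumed to be already aligned with outliers removed, only A, C, G and T are handled, and letter frequencies are assumed to be computed over non-gap characters. The way the ambiguity code is built (adding the most frequent bases until they cover at least half of the reads) is also an assumption about how "representing at least 50% of the reads" might be satisfied.

```r
# IUPAC codes keyed by the alphabetically sorted set of bases they stand for.
iupac <- c(A = "A", C = "C", G = "G", T = "T",
           AC = "M", AG = "R", AT = "W", CG = "S", CT = "Y", GT = "K",
           ACG = "V", ACT = "H", AGT = "D", CGT = "B", ACGT = "N")

# Hypothetical helper: consensus of equal-length, already aligned DNA strings.
toy_consensus <- function(aln, gap = "-") {
  stopifnot(length(unique(nchar(aln))) == 1)
  # One row per read, one column per alignment site.
  m <- do.call(rbind, strsplit(aln, ""))
  calls <- apply(m, 2, function(site) {
    # Drop sites where more than 50% of the reads have a gap.
    if (mean(site == gap) > 0.5) return("")
    # Count letters, ignoring gaps, most frequent first.
    counts <- sort(table(site[site != gap]), decreasing = TRUE)
    total <- sum(counts)
    # Use the most frequent letter if it has a strict majority.
    if (2 * counts[[1]] > total) return(names(counts)[1])
    # Assumption: otherwise add bases, most frequent first, until they
    # cover at least 50% of the non-gap reads, then look up the IUPAC code.
    k <- which(2 * cumsum(counts) >= total)[1]
    unname(iupac[paste(sort(names(counts)[seq_len(k)]), collapse = "")])
  })
  paste(calls, collapse = "")
}

aln <- c("ACGA-T",
         "ACGC-T",
         "ACGGAT",
         "ATGA-T",
         "AT-C-T")
consensus <- toy_consensus(aln)
print(consensus)  # "ACGMT"
```

In this toy alignment the fifth site is dropped because four of five reads have a gap, and the fourth site has no majority letter, so A and C (two reads each) are combined into the ambiguity code M.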
Returns an XStringSet-class object representing the consensus sequence.