assignSpecies_mod: Return long-form species assignments with DADA2's...

View source: R/assignSpecies_mod.R

assignSpecies_mod    R Documentation

Return long-form species assignments with DADA2's assignSpecies

Description

This function is a modification of DADA2's assignSpecies, which uses exact matching against a reference fasta to identify the genus-species binomial classification of the input sequences. The original function, when run with allowMultiple = TRUE, returns a concatenated string of all exactly matched species; this version instead returns each match as a distinct row.
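To illustrate the difference in output shape, the base-R sketch below expands concatenated species strings into one row per match. It does not call the package. The toy sequences, the "/" separator, and the column names asv and species are assumptions made for the example.

```r
# Toy concatenated-style result: one string per sequence (illustrative data only)
concatenated <- c(
  ACGT = "coli/fergusonii",
  GGCA = "aureus",
  TTAG = NA
)

# Split each string on the assumed "/" separator
parts <- strsplit(concatenated, "/", fixed = TRUE)

# Build the long form: one row per (sequence, match) pair
long <- data.frame(
  asv = rep(names(concatenated), lengths(parts)),
  species = unlist(parts, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Drop sequences that had no match
long <- long[!is.na(long$species), ]
```

In this toy case the sequence with two matches ends up on two rows, and the unmatched sequence is dropped.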

Usage

assignSpecies_mod(seqs, refFasta, tryRC = FALSE, n = 2000)

Arguments

seqs

A character vector of the sequences to be assigned

refFasta

The path to the reference fasta file, or an R connection. Can be compressed. This reference fasta file should be formatted so that the ID lines correspond to the genus-species of the associated sequence:

>SeqID genus species
ACGAATGTGAAGTAA......
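As a rough check that a reference follows this layout, the sketch below writes a tiny fasta file and splits its ID lines into fields using base R only. The sequence IDs, taxa, and sequences are made up for illustration, and the parsing assumes each ID line has exactly three whitespace-separated fields.

```r
# Write a tiny reference in the expected layout to a temporary file
ref_path <- tempfile(fileext = ".fasta")
writeLines(c(
  ">Seq1 Escherichia coli",
  "ACGAATGTGAAGTAA",
  ">Seq2 Staphylococcus aureus",
  "TTGACCGTAGGCTAA"
), ref_path)

# Read it back and keep only the ID lines, without the leading ">"
lines <- readLines(ref_path)
ids <- sub("^>", "", lines[startsWith(lines, ">")])

# Split each ID line on whitespace: field 1 = ID, 2 = genus, 3 = species
fields <- strsplit(ids, "\\s+")
ref_taxa <- data.frame(
  id = vapply(fields, `[`, character(1), 1),
  genus = vapply(fields, `[`, character(1), 2),
  species = vapply(fields, `[`, character(1), 3),
  stringsAsFactors = FALSE
)
```

A missing genus or species field shows up as NA in ref_taxa, which makes malformed ID lines easy to spot before running an assignment.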

tryRC

(Optional). Default FALSE. If TRUE, the reverse-complement of each sequence will also be tested for exact matching to the reference sequences.
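The sketch below shows the idea behind reverse-complement testing in base R. The helper rev_comp and the toy sequences are hypothetical, and comparing whole strings with %in% is a simplification for illustration, not a description of how the package matches internally.

```r
# Reverse-complement DNA strings: complement each base, then reverse the order
rev_comp <- function(x) {
  complemented <- chartr("ACGT", "TGCA", x)
  vapply(
    strsplit(complemented, ""),
    function(bases) paste(rev(bases), collapse = ""),
    character(1)
  )
}

ref <- c("TGACC", "AAAAC")  # toy reference sequences
query <- "GGTCA"            # toy query, stored in the opposite orientation

# The forward orientation does not match, but the reverse complement does
forward_hit <- query %in% ref
rc_hit <- rev_comp(query) %in% ref
```

A query that only matches in the reverse-complement orientation would be missed with tryRC = FALSE, which is the case the option is meant to cover.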

n

(Optional). Default 2000. The number of sequences to perform assignment on at one time. This controls the peak memory requirement so that large numbers of sequences are supported.
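Batching of this kind can be sketched in base R as follows. The toy input, the small batch size, and the placeholder per-batch step are assumptions for illustration. The real function performs species assignment on each batch.

```r
seqs <- c("ACGT", "GGCA", "TTAG", "CCGA", "ATAT")  # toy input
n <- 2                                             # small batch size for illustration

# Label each sequence with its batch number (1, 1, 2, 2, 3), then split
batch_id <- ceiling(seq_along(seqs) / n)
batches <- split(seqs, batch_id)

# Process one batch at a time, so only n sequences are handled at once
results <- lapply(batches, function(batch) {
  # Placeholder work standing in for the per-batch assignment step
  data.frame(asv = batch, n_chars = nchar(batch), stringsAsFactors = FALSE)
})

# Combine the per-batch results into a single data frame
combined <- do.call(rbind, results)
```

A smaller n lowers peak memory use at the cost of more iterations.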

Value

A two-column data frame matching each ASV to its exact matches in the reference. An ASV with multiple matches appears in more than one row.
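One way to find ambiguous ASVs in a result of this shape is to count rows per ASV. In the sketch below, the data frame is hand-built and the column names asv and species are hypothetical. The actual column names and the format of the species values are not specified here.

```r
# Hand-built stand-in for a long-form result (hypothetical column names)
assignments <- data.frame(
  asv = c("ACGT", "ACGT", "GGCA"),
  species = c("Escherichia coli", "Escherichia fergusonii", "Staphylococcus aureus"),
  stringsAsFactors = FALSE
)

# Count rows per ASV; more than one row means more than one exact match
matches_per_asv <- table(assignments$asv)
ambiguous <- names(matches_per_asv)[as.vector(matches_per_asv) > 1]

# Keep only the unambiguous assignments
unambiguous <- assignments[!assignments$asv %in% ambiguous, ]
```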


ammararuby/MButils documentation built on Jan. 29, 2023, 11:13 a.m.