assign_blastn | R Documentation |
Assign taxonomy to the taxa of a phyloseq object using the blastn program of the BLAST software.
assign_blastn(
physeq,
ref_fasta = NULL,
database = NULL,
blastpath = NULL,
behavior = c("return_matrix", "add_to_phyloseq"),
method = c("vote", "top-hit"),
suffix = "_blastn",
min_id = 95,
min_bit_score = 50,
min_cover = 95,
min_e_value = 1e-30,
nb_voting = NULL,
column_names = c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"),
vote_algorithm = c("consensus", "rel_majority", "abs_majority", "unanimity"),
strict = FALSE,
nb_agree_threshold = 1,
preference_index = NULL,
collapse_string = "/",
replace_collapsed_rank_by_NA = TRUE,
simplify_taxo = TRUE,
keep_blast_metrics = FALSE,
...
)
physeq |
(required): a |
ref_fasta |
Either a DNAStringSet object or a path to a fasta
file to make the blast database. It must be in sintax format.
See |
database |
path to a blast database. Only used if ref_fasta is not set. |
blastpath |
path to blast program. |
behavior |
Either "return_matrix" (default), or "add_to_phyloseq":
|
method |
(One of "vote" or "top-hit"). If "top-hit", only the
best match is used to assign taxonomy. If "vote", the algorithm
takes all (or |
suffix |
(character, default "_blastn") The suffix appended to the names of the new columns. If set to "", the taxonomic rank names are used without a suffix. |
min_id |
(default: 95) the minimum percent identity for a reference taxon to be taken into account |
min_bit_score |
(default: 50) the minimum bit score for a reference taxon to be taken into account |
min_cover |
(default: 95) cutoff in query cover (%) to keep a result |
min_e_value |
(default: 1e-30) cutoff in e-value to keep a result. The BLAST E-value is the number of expected hits of similar quality (score) that could be found just by chance. |
nb_voting |
(Int, default NULL). The number of taxa to keep before applying a vote to resolve conflicts. If NULL, all taxa passing the filters (min_id, min_bit_score, min_cover and min_e_value) are selected. |
column_names |
A vector of names for taxonomic ranks. Must correspond to names in the ref_fasta files. |
vote_algorithm |
the method to vote among "consensus", "rel_majority",
"abs_majority" and "unanimity". See |
strict |
(Logical, default FALSE). See |
nb_agree_threshold |
See |
preference_index |
See |
collapse_string |
See |
replace_collapsed_rank_by_NA |
(Logical, default TRUE) See |
simplify_taxo |
(logical default TRUE). Do we apply the
function |
keep_blast_metrics |
(Logical, default FALSE). If TRUE, the blast metrics ("Query seq. length", "Taxa seq. length", "Alignment length", "% id. match", "e-value", "bit score" and "Query cover") are stored in the tax_table. |
... |
Additional arguments passed on to |
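The four filter arguments (min_id, min_bit_score, min_cover and min_e_value) can be pictured as row-wise thresholds on a table of BLAST hits. The sketch below uses base R on a toy data frame. The column names are hypothetical, and the direction and inclusiveness of each comparison are assumptions made for illustration; this is not the package's internal code.

```r
# Toy BLAST hit table. Column names are hypothetical, chosen for this
# illustration only; they are not the names used inside the package.
hits <- data.frame(
  query     = c("ASV1", "ASV1", "ASV2", "ASV3"),
  pct_id    = c(99.1, 96.0, 88.5, 97.2),
  bit_score = c(410, 380, 120, 45),
  cover     = c(100, 97, 99, 96),
  e_value   = c(1e-100, 1e-80, 1e-20, 1e-35)
)

# Thresholds mirroring the documented defaults.
min_id <- 95
min_bit_score <- 50
min_cover <- 95
min_e_value <- 1e-30

# Assumption: a hit is kept when identity, bit score and query cover are at
# least their thresholds and the e-value is at most min_e_value.
keep <- hits$pct_id >= min_id &
  hits$bit_score >= min_bit_score &
  hits$cover >= min_cover &
  hits$e_value <= min_e_value

filtered <- hits[keep, ]
print(filtered)
```

With strict defaults and a small reference database, such a filter can leave some queries with no hits at all, which is why the examples on this page lower the thresholds.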
If behavior == "return_matrix":
If method = "top-hit", a matrix of taxonomic assignments.
If method = "vote", a list of two matrices: the first is the raw taxonomic assignment (before the vote); the second is the taxonomic assignment in which conflicts are resolved by vote.
If behavior == "add_to_phyloseq", a new phyloseq object.
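To give an intuition for how a vote might resolve conflicting hits at one taxonomic rank, here is a minimal sketch in base R. The helper toy_vote() is hypothetical and not part of the package. The meanings it gives to "consensus", "rel_majority", "abs_majority" and "unanimity" are plausible readings of those names, stated as assumptions; the package's own rules (including strict, nb_agree_threshold and preference_index) may differ.

```r
# Hypothetical helper for illustration only; not the package's implementation.
toy_vote <- function(x,
                     algorithm = c("consensus", "rel_majority",
                                   "abs_majority", "unanimity"),
                     collapse_string = "/") {
  algorithm <- match.arg(algorithm)
  x <- x[!is.na(x)]
  if (length(x) == 0) {
    return(NA_character_)
  }
  # Count each distinct value, most frequent first (ties broken arbitrarily).
  counts <- sort(table(x), decreasing = TRUE)
  top <- names(counts)[1]
  switch(algorithm,
    # Assumption: conflicting values are pasted together with collapse_string.
    consensus = paste(unique(x), collapse = collapse_string),
    # Assumption: the most frequent value wins.
    rel_majority = top,
    # Assumption: the top value wins only with more than half of the votes.
    abs_majority = if (counts[[1]] > length(x) / 2) top else NA_character_,
    # Assumption: a value is returned only if all votes agree.
    unanimity = if (length(counts) == 1) top else NA_character_
  )
}

# Genus-level assignments from five hypothetical hits for one query.
genus_hits <- c("Russula", "Russula", "Lactarius", "Russula", "Lactifluus")

toy_vote(genus_hits, "consensus")
toy_vote(genus_hits, "rel_majority")
toy_vote(genus_hits, "abs_majority")
toy_vote(genus_hits, "unanimity")
```

In this toy setting, a collapsed value such as the one produced by "consensus" is the kind of entry that collapse_string and replace_collapsed_rank_by_NA are documented to control.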
Adrien Taudière
## Not run:
ref_fasta <- Biostrings::readDNAStringSet(system.file("extdata",
"mini_UNITE_fungi.fasta.gz",
package = "MiscMetabar", mustWork = TRUE
))
# assign_blastn(data_fungi_mini, ref_fasta = ref_fasta) # error because there are
# not enough sequences in the db, so no blast query passes the filters.
# We therefore use lower score filters hereafter.
mat <- assign_blastn(data_fungi_mini,
ref_fasta = ref_fasta,
method = "top-hit", min_id = 70, min_e_value = 1e-3, min_cover = 50,
min_bit_score = 20
)
head(mat)
assign_blastn(data_fungi_mini,
ref_fasta = ref_fasta, method = "vote",
vote_algorithm = "rel_majority", min_id = 90, min_cover = 50,
behavior = "add_to_phyloseq"
)@tax_table
assign_blastn(data_fungi_mini,
ref_fasta = ref_fasta, method = "vote",
vote_algorithm = "consensus", replace_collapsed_rank_by_NA = FALSE,
min_id = 90, min_cover = 50, behavior = "add_to_phyloseq"
)@tax_table
## End(Not run)