View source: R/dada_phyloseq.R
| verify_tax_table | R Documentation |
Check taxonomy table for common issues and send warnings/messages accordingly.
This function is called by verify_pq() when check_taxonomy = TRUE.
verify_tax_table(
  physeq,
  verbose = TRUE,
  replace_to_NA = unwanted_tax_patterns,
  min_char = 4,
  redundant_suffix = "_sp",
  taxonomic_ranks = c("Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species"),
  modify_phyloseq = FALSE,
  remove_border_spaces = TRUE,
  remove_all_space = FALSE,
  replace_space_with = "_",
  detect_invisible_chars = TRUE,
  replace_invisible_chars = FALSE,
  invisible_chars_replacement = ""
)
physeq |
(required) a |
verbose |
(logical, default TRUE) If TRUE, print warnings and messages about potential taxonomy issues. |
replace_to_NA |
(character vector) A vector of regex patterns to identify values that should be considered as NA. Defaults to unwanted_tax_patterns, a named character vector of common placeholders like "unclassified", "unknown", "uncultured", "incertae_sedis", "metagenome", empty QIIME-style ranks, etc. |
min_char |
(integer, default 4) Minimum number of characters for a taxonomic value to be considered valid. Values with fewer characters (excluding NA) will trigger a warning when verbose = TRUE. |
redundant_suffix |
(character, default "_sp") Suffix pattern to detect redundant taxonomic information. For example, "Russula_sp" in Species column is redundant if "Russula" is already present in the Genus column. Set to NULL to disable this check. Other examples: "_var", "_ssp", "_cf". |
taxonomic_ranks |
(character vector, default c("Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species")) Names of taxonomic ranks in hierarchical order from highest to lowest (e.g., c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species")). If NULL, uses the column names of the taxonomy table in their existing order. Used to determine parent-child relationships for redundant suffix detection. |
modify_phyloseq |
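(logical, default FALSE) If TRUE, replace problematic values with NA in the taxonomy table and return the modified phyloseq object. The following types of values are replaced: values matching the replace_to_NA patterns, values shorter than min_char, and values carrying a redundant suffix (see redundant_suffix).
Messages will indicate the number of values replaced for each type. |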
remove_border_spaces |
(logical, default TRUE) If TRUE and
|
remove_all_space |
(logical, default FALSE) If TRUE and
|
replace_space_with |
(character, default "_") Character to use when
replacing internal spaces. Only used when |
detect_invisible_chars |
(logical, default TRUE) If TRUE, scan
taxonomic values for invisible / unusual characters: anything in
Unicode category |
replace_invisible_chars |
(logical, default FALSE) If TRUE and
|
invisible_chars_replacement |
(character, default |
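The three replacement checks described in the arguments above (replace_to_NA patterns, min_char, and redundant_suffix) can be illustrated with a minimal base R sketch on a plain character matrix. This is not the implementation used by verify_tax_table(): the toy table, the two patterns in na_patterns, and the exact matching rule for the redundant suffix are assumptions made for illustration only, and the real unwanted_tax_patterns vector is not reproduced here.

```r
# Toy taxonomy table as a plain character matrix (not a phyloseq object)
tax <- matrix(
  c("Fungi", "Russula",      "Russula_sp",
    "Fungi", "unclassified", NA,
    "Fungi", "Eutypa",       "sp1"),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("Kingdom", "Genus", "Species"))
)

# Hypothetical settings; these two patterns stand in for unwanted_tax_patterns
na_patterns <- c("^unclassified$", "^unknown$")
min_char <- 4
redundant_suffix <- "_sp"

# 1. Values matching any NA-like regex pattern (NA entries never match)
is_na_like <- matrix(
  grepl(paste(na_patterns, collapse = "|"), tax, ignore.case = TRUE),
  nrow = nrow(tax), dimnames = dimnames(tax)
)

# 2. Non-missing values with fewer than min_char characters
is_short <- !is.na(tax) & nchar(tax) < min_char

# 3. Species values that are just the parent Genus plus the suffix
is_redundant <- !is.na(tax[, "Species"]) &
  tax[, "Species"] == paste0(tax[, "Genus"], redundant_suffix)

# Replace every flagged value with NA
cleaned <- tax
cleaned[is_na_like | is_short] <- NA
cleaned[is_redundant, "Species"] <- NA

# Number of values replaced by each check
c(na_like = sum(is_na_like), short = sum(is_short), redundant = sum(is_redundant))
```

In this toy table each check flags exactly one value: "unclassified" in Genus, "sp1" in Species, and "Russula_sp" in Species.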
If modify_phyloseq = FALSE (default): Nothing (invisible NULL).
Warnings/messages only if verbose = TRUE and issues are found.
If modify_phyloseq = TRUE: The modified phyloseq object with problematic
values replaced by NA, along with messages summarizing the changes.
Adrien Taudière
verify_tax_table(data_fungi_mini)
verify_tax_table(data_fungi_mini, verbose = TRUE)
# Check for redundant "_sp" patterns (default)
data_fungi2 <- data_fungi_mini
data_fungi2@tax_table[1, "Species"] <- "Eutypa_sp"
verify_tax_table(data_fungi2, verbose = TRUE, redundant_suffix = "_sp")
# Automatically replace problematic values with NA
# This replaces: NA-like patterns, short values, and redundant suffixes
data_fungi2_cleaned <- verify_tax_table(data_fungi2,
  modify_phyloseq = TRUE
)
# Check that the redundant value was replaced
data_fungi2@tax_table[1, "Species"] # "Eutypa_sp"
data_fungi2_cleaned@tax_table[1, "Species"] # NA
# Combine verbose mode with modifications to see all issues
data_fungi2_cleaned <- verify_tax_table(data_fungi2,
  verbose = TRUE,
  modify_phyloseq = TRUE
)
# Check for other patterns like "_var" or "_cf"
verify_tax_table(data_fungi_mini, verbose = TRUE, redundant_suffix = "_var")
# Disable redundant suffix check
verify_tax_table(data_fungi_mini, verbose = TRUE, redundant_suffix = NULL)
# Specify custom taxonomic rank order
verify_tax_table(data_fungi_mini,
  verbose = TRUE,
  taxonomic_ranks = c("Class", "Order", "Family", "Genus")
)
# Handle whitespace in taxonomic values
# Create example with spaces
data_fungi3 <- data_fungi_mini
data_fungi3@tax_table[1, "Genus"] <- " Russula "
data_fungi3@tax_table[2, "Species"] <- "Russula emetica"
# Check for spaces (verbose mode)
verify_tax_table(data_fungi3, verbose = TRUE)
# Remove leading/trailing whitespace (enabled by default)
data_fungi3_trimmed <- verify_tax_table(data_fungi3, modify_phyloseq = TRUE)
data_fungi3_trimmed@tax_table[1, "Genus"] # "Russula" (trimmed)
# Also replace internal spaces with underscores
data_fungi3_cleaned <- verify_tax_table(data_fungi3,
  modify_phyloseq = TRUE,
  remove_all_space = TRUE,
  replace_space_with = "_"
)
data_fungi3_cleaned@tax_table[2, "Species"] # "Russula_emetica"
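For readers who want to see what the whitespace options in the examples above amount to, here is a minimal base R sketch on a plain character vector. It assumes that trimming border spaces behaves like trimws() and that internal spaces are replaced literally; the function's actual internals may differ.

```r
# Toy values mirroring the whitespace examples above
vals <- c(" Russula ", "Russula emetica")

# Border spaces: strip leading and trailing whitespace only
trimmed <- trimws(vals)

# All spaces: after trimming, replace each remaining internal space
replace_space_with <- "_"
underscored <- gsub(" ", replace_space_with, trimmed, fixed = TRUE)

trimmed
underscored
```

Trimming alone leaves the internal space in "Russula emetica" untouched; only the second step turns it into "Russula_emetica".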