taxonkit_reformat: Reformat Taxonomic Lineage using taxonkit
In pctax: Professional Comprehensive Omics Data Analysis

taxonkit_reformat

R Documentation

Reformat Taxonomic Lineage using taxonkit

Description

Reformats taxonomic lineages (or taxids) into a fixed set of ranks by calling the taxonkit `reformat` command.

Usage

taxonkit_reformat(
  file_path,
  delimiter = NULL,
  add_prefix = FALSE,
  prefix_kingdom = "K__",
  prefix_phylum = "p__",
  prefix_class = "c__",
  prefix_order = "o__",
  prefix_family = "f__",
  prefix_genus = "g__",
  prefix_species = "s__",
  prefix_subspecies = "t__",
  prefix_strain = "T__",
  fill_miss_rank = FALSE,
  format_string = "",
  miss_rank_repl_prefix = "unclassified ",
  miss_rank_repl = "",
  miss_taxid_repl = "",
  output_ambiguous_result = FALSE,
  lineage_field = 2,
  taxid_field = NULL,
  pseudo_strain = FALSE,
  trim = FALSE,
  text = FALSE,
  data_dir = NULL
)

Arguments

`file_path`	The path to the input file with taxonomic lineages, or the input text itself when `text = TRUE`.
`delimiter`	The field delimiter in the input lineage (default: NULL; taxonkit itself defaults to ";").
`add_prefix`	Logical, indicating whether to add prefixes for all ranks (default: FALSE).
`prefix_kingdom`	The prefix for kingdom, used along with --add-prefix (default: "K__").
`prefix_phylum`	The prefix for phylum, used along with --add-prefix (default: "p__").
`prefix_class`	The prefix for class, used along with --add-prefix (default: "c__").
`prefix_order`	The prefix for order, used along with --add-prefix (default: "o__").
`prefix_family`	The prefix for family, used along with --add-prefix (default: "f__").
`prefix_genus`	The prefix for genus, used along with --add-prefix (default: "g__").
`prefix_species`	The prefix for species, used along with --add-prefix (default: "s__").
`prefix_subspecies`	The prefix for subspecies, used along with --add-prefix (default: "t__").
`prefix_strain`	The prefix for strain, used along with --add-prefix (default: "T__").
`fill_miss_rank`	Logical, indicating whether to fill missing rank with lineage information of the next higher rank (default: FALSE).
`format_string`	The output format string with placeholders for each rank, such as {p} for phylum or {s} for species (default: "").
`miss_rank_repl_prefix`	The prefix for estimated taxon level for missing rank (default: "unclassified ").
`miss_rank_repl`	The replacement string for missing rank.
`miss_taxid_repl`	The replacement string for missing taxid.
`output_ambiguous_result`	Logical, indicating whether to output one of the ambiguous results (default: FALSE).
`lineage_field`	The field index of lineage. Input data should be tab-separated (default: 2).
`taxid_field`	The field index of taxid. Input data should be tab-separated. It overrides -i/--lineage-field (default: NULL).
`pseudo_strain`	Logical, indicating whether to use the node with lowest rank as strain name (default: FALSE).
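`trim`	Logical, indicating whether to skip filling missing ranks lower than the current rank (default: FALSE).
`text`	Logical, indicating whether `file_path` holds the input text itself rather than a path to a file (default: FALSE).
`data_dir`	Directory containing nodes.dmp and names.dmp (default: NULL; taxonkit itself defaults to the ".taxonkit" directory in the user's home, e.g. "/Users/asa/.taxonkit" on the machine that built this page).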
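
To make the relationship between these arguments and the taxonkit command line more concrete, here is a minimal sketch of how a few of them could be translated into flags for `taxonkit reformat`. The helper name `build_reformat_args` is hypothetical and this is not the package's actual implementation; only the flag names themselves (`--add-prefix`, `--fill-miss-rank`, `--trim`, `--format`, `--delimiter`, `--taxid-field`, `--lineage-field`) come from taxonkit.

```r
# Hypothetical helper: maps a subset of the R arguments to taxonkit flags.
# It only builds the argument vector and never runs taxonkit.
build_reformat_args <- function(add_prefix = FALSE,
                                fill_miss_rank = FALSE,
                                format_string = "",
                                lineage_field = 2,
                                taxid_field = NULL,
                                delimiter = NULL,
                                trim = FALSE) {
  args <- "reformat"
  # Logical arguments become bare switches.
  if (add_prefix) args <- c(args, "--add-prefix")
  if (fill_miss_rank) args <- c(args, "--fill-miss-rank")
  if (trim) args <- c(args, "--trim")
  # Empty or NULL values are skipped so taxonkit keeps its own defaults.
  if (nzchar(format_string)) args <- c(args, "--format", format_string)
  if (!is.null(delimiter)) args <- c(args, "--delimiter", delimiter)
  # The taxid field takes precedence over the lineage field when it is set.
  if (!is.null(taxid_field)) {
    args <- c(args, "--taxid-field", as.character(taxid_field))
  } else {
    args <- c(args, "--lineage-field", as.character(lineage_field))
  }
  args
}

cli_args <- build_reformat_args(
  add_prefix = TRUE, taxid_field = 1, fill_miss_rank = TRUE
)
print(cli_args)
```

A vector like `cli_args` could then be passed to `system2()` together with the path to a taxonkit binary, assuming one is installed.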

Value

A character vector containing the reformatted taxonomic lineages.
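
Because the return value is a plain character vector, it usually needs to be split before further analysis. The sketch below uses base R only and assumes each element has the form "taxid<TAB>rank;rank;...", as in the taxid-based example. The two input lines and the column names are illustrative assumptions, not output captured from a real run.

```r
# Illustrative lines in the assumed "taxid<TAB>lineage" layout.
reformatted <- c(
  "9606\tEukaryota;Chordata;Mammalia;Primates;Hominidae;Homo;Homo sapiens",
  "63221\tEukaryota;Chordata;Mammalia;Primates;Hominidae;Homo;Homo sapiens"
)

# Split each line into its tab-separated fields.
fields <- strsplit(reformatted, "\t", fixed = TRUE)
taxid <- vapply(fields, `[`, character(1), 1)
lineage <- vapply(fields, `[`, character(1), 2)

# Split the lineage into one column per rank.
# Caveat: strsplit() drops a trailing empty field, so lineages that end
# with an empty rank would need padding before rbind().
ranks <- do.call(rbind, strsplit(lineage, ";", fixed = TRUE))
colnames(ranks) <- c(
  "superkingdom", "phylum", "class", "order", "family", "genus", "species"
)

taxonomy <- data.frame(taxid = taxid, ranks, stringsAsFactors = FALSE)
print(taxonomy)
```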

Examples

## Not run: 
# Use taxid
taxids2 <- system.file("extdata/taxids2.txt", package = "pctax")
reformatted_lineages <- taxonkit_reformat(taxids2,
  add_prefix = TRUE, taxid_field = 1, fill_miss_rank = TRUE
)
reformatted_lineages
# Split each line into its tab-separated columns, then split the lineage column into ranks
taxonomy <- strsplit2(reformatted_lineages, "\t")
taxonomy <- strsplit2(taxonomy$V2, ";")

# Use the lineage output of taxonkit_lineage() as text input
taxonkit_lineage("9606\n63221", show_name = TRUE, show_rank = TRUE, text = TRUE) %>%
  taxonkit_reformat(text = TRUE)

## End(Not run)
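
When the input does not ship with the package, it can be prepared by hand. The following sketch shows the two input forms described under Arguments, a file path and a text string, for a taxid-based call. The calls to `taxonkit_reformat()` are left commented out because they need pctax, a taxonkit binary, and the taxonomy dump files to be available.

```r
# Taxids taken from the lineage example (9606 and 63221).
taxids <- c("9606", "63221")

# Option 1: a file with one taxid per line, so the taxid is in field 1.
input_path <- tempfile(fileext = ".txt")
writeLines(taxids, input_path)

# Option 2: the same content as a single newline-separated string.
input_text <- paste(taxids, collapse = "\n")

# Not run here:
# taxonkit_reformat(input_path, taxid_field = 1)
# taxonkit_reformat(input_text, taxid_field = 1, text = TRUE)
```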

pctax documentation built on April 4, 2025, 2:26 a.m.