download_ncbi: Download the NCBI taxonomy

View source: R/load_methods.R

download_ncbi    R Documentation

Download the NCBI taxonomy

Description

Download the NCBI taxonomy

Usage

download_ncbi(taxonkitpath = NA)

Arguments

taxonkitpath

A string containing the full path to where Taxonkit is installed (optional).

Details

This function downloads an NCBI taxonomy archive file to a temporary directory, extracts four files (nodes.dmp, names.dmp, merged.dmp and delnodes.dmp) from the archive, and then removes the archive file.

These four files must then be parsed with Taxonkit (https://bioinf.shenwei.me/taxonkit/download/), either automatically or manually. If the path to a Taxonkit installation is supplied, Taxonkit is called with the location of the four files as an argument, and its output is saved in the same temporary directory in a file called All.lineages.tsv.gz.

If the path to Taxonkit is not supplied, parsing should be carried out manually using the command:

taxonkit list --data-dir=path/to/downloaded/files --ids 1 | taxonkit lineage --show-lineage-taxids --show-lineage-ranks --show-rank --show-name --data-dir=path/to/downloaded/files | taxonkit reformat --taxid-field 1 --data-dir=path/to/downloaded/files -o All.lineages.tsv.gz
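The manual parsing route can be sketched as a small shell script. This is a minimal sketch, not part of moultdbtools: the DATA_DIR variable and the two function names are hypothetical, and the Taxonkit flags are copied from the command given in this section. It assumes the four dump files already sit in DATA_DIR and that taxonkit is on the PATH. If either condition fails, it prints a message and does nothing.

```shell
#!/bin/sh
# Sketch only: DATA_DIR and both function names are hypothetical helpers,
# not part of moultdbtools or Taxonkit.

# Return 0 if all four dump files named in the Details text exist in $1.
have_dump_files() {
    for f in nodes.dmp names.dmp merged.dmp delnodes.dmp; do
        [ -f "$1/$f" ] || return 1
    done
    return 0
}

# Run the manual pipeline from the Details text against directory $1.
# All.lineages.tsv.gz is written to the current working directory.
run_lineage_pipeline() {
    taxonkit list --data-dir="$1" --ids 1 |
        taxonkit lineage --show-lineage-taxids --show-lineage-ranks \
            --show-rank --show-name --data-dir="$1" |
        taxonkit reformat --taxid-field 1 --data-dir="$1" -o All.lineages.tsv.gz
}

# Placeholder path from the documentation; override via the environment.
DATA_DIR=${DATA_DIR:-path/to/downloaded/files}

if ! have_dump_files "$DATA_DIR"; then
    echo "dump files not found in $DATA_DIR; nothing to do" >&2
elif ! command -v taxonkit >/dev/null 2>&1; then
    echo "taxonkit not found on PATH; nothing to do" >&2
else
    run_lineage_pipeline "$DATA_DIR"
fi
```

Setting DATA_DIR to the directory returned by download_ncbi() (called without taxonkitpath) before running the script should reproduce the manual step. Because the output file name is relative, run the script from the directory where All.lineages.tsv.gz should end up.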

Value

A character vector containing the paths to the downloaded and extracted NCBI data dump files or, if the taxonkitpath argument was set, the path to All.lineages.tsv.gz.

Examples

## Not run: download_ncbi()
## Not run: download_ncbi(taxonkitpath = "/home/usr/bin/taxonkit")

MoultDB/moultdbtools documentation built on Feb. 2, 2024, 5:21 p.m.